LMS 3.2 CSDiscovery stalled

mburrell5
Level 1

I am using LMS 3.2 and Device Discovery never finishes. It discovers the majority of devices and then hangs. The only way to stop it is to run pdterm CSDiscovery from the server command line. I had exactly this problem under LMS 3.1 and was advised that it was a bug that 3.2 would fix. Discovery ran fine under 3.2 for 6 months but is no longer working.

Accepted Solutions

It appears you're using a custom group to place discovered devices.  Try disabling this, and see if Discovery works.

mburrell5
Level 1

Update: Discovery works when limited to very small subnets. With more than approximately 100 devices it stalls.

I have turned on debugging for CSDiscovery and am seeing a Java OutOfMemoryError. See the extract from csdiscovery.log below:

[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : java.lang.OutOfMemoryError: Java heap space
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:144)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at java.io.OutputStreamWriter.write(OutputStreamWriter.java:204)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at java.io.Writer.write(Writer.java:126)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at org.apache.log4j.helpers.QuietWriter.write(QuietWriter.java:39)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:292)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at org.apache.log4j.WriterAppender.append(WriterAppender.java:150)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:221)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at org.apache.log4j.Category.callAppenders(Category.java:187)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at org.apache.log4j.Category.forcedLog(Category.java:372)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at org.apache.log4j.Category.fatal(Category.java:346)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at com.cisco.nm.csdiscovery.adaptor.CSDiscoveryAdaptor.updateOGSGroup(CSDiscoveryAdaptor.java:2054)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at com.cisco.nm.csdiscovery.adaptor.CSDiscoveryAdaptor.updateDevices(CSDiscoveryAdaptor.java:1666)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at com.cisco.nm.csdiscovery.adaptor.CSDiscoveryAdaptor.addDevices(CSDiscoveryAdaptor.java:1637)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at com.cisco.nm.csdiscovery.adaptor.CSDiscoveryAdaptor.putAllProcessed(CSDiscoveryAdaptor.java:759)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at com.cisco.nm.discovery.framework.DiscoveryController.updateProcessedNodes(DiscoveryController.java:688)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at com.cisco.nm.discovery.framework.DiscoveryController.finishDiscovery(DiscoveryController.java:637)
[Jul 14  22:26:03] INFO  [LoggingOutputStream : Thread-2 flush]  : at com.cisco.nm.discovery.framework.DiscoveryController.run(DiscoveryController.java:1304)

Can I increase the heap size? If so, where do I adjust it?

It appears you're using a custom group to place discovered devices.  Try disabling this, and see if Discovery works.

That fixed it!

Why would having a custom group cause a problem? There is nothing in the documentation to suggest this.

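The problem appears to be caused by a bug. When devices are placed in a custom group, a message containing the group attributes is written to the log. That message grows with the number of discovered devices, and once enough devices are discovered it becomes too large to write. This is consistent with the OutOfMemoryError raised inside the log4j write path under updateOGSGroup in the trace you posted. The message appears to have been left in from pre-release debugging, and it could be removed from the code to prevent the overflow.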
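
To illustrate the failure mode described in the reply above, here is a minimal Java sketch. It is not Cisco's code: the class GroupLogSketch, the methods describeGroup and boundedSummary, the device names, and the 200-character cap are all hypothetical and chosen only for illustration. It shows how a log message that lists every group member grows with the device count, and one possible way to keep such a message bounded.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: every name here is hypothetical, not taken from LMS.
public class GroupLogSketch {

    // Builds one message listing every member; its size grows with the device count.
    public static String describeGroup(List<String> devices) {
        StringBuilder sb = new StringBuilder("group size=")
                .append(devices.size()).append(" members: ");
        for (String d : devices) {
            sb.append(d).append(';');
        }
        return sb.toString();
    }

    // Same message, but stops adding members once maxChars would be exceeded.
    public static String boundedSummary(List<String> devices, int maxChars) {
        StringBuilder sb = new StringBuilder("group size=")
                .append(devices.size()).append(" members: ");
        for (String d : devices) {
            if (sb.length() + d.length() + 1 > maxChars) {
                sb.append("..."); // mark the truncation and stop
                break;
            }
            sb.append(d).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> devices = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            devices.add("device-" + i + ".example.com"); // made-up names
        }
        System.out.println("unbounded length: " + describeGroup(devices).length());
        System.out.println("bounded length:   " + boundedSummary(devices, 200).length());
    }
}
```

With a large device list the unbounded message keeps growing, while the bounded one stays near the cap regardless of how many devices are discovered. Whether the actual fix in LMS takes this form is not known from this thread.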
