Never-ending discovery

Unanswered Question
Jan 20th, 2009
User Badges:
  • Bronze, 100 points or more

Hello,


We use LMS 3.1 with the latest patches installed.

The problem is that the discovery doesn't end after a few runs.

That means all devices are discovered (or are unreachable), but the process doesn't end.

We have about 6000 devices to manage, and I think about 100 of them are unreachable.


I read somewhere in the forum that SNMP timeouts could be the cause, so I checked the settings in the config files.


----


\ROOT\CSCOpx\conf\csdiscovery\system-config.xml


<DiscoveryEngine PurgeDeviceObject="true" ExportDiscardedDevices="true" ExportUnreachableDevices="true" TotalThreadAllowed="20" PreferredMgmtIPMethod=""
SNMPTimeout="180000" SNMPRetry="3" ICMPTimeout="1000" ICMPRetry="1" SNMPFallback="false" ResolveName="false"/>



I think SNMPTimeout="180000" is in milliseconds, so the timeout would be 3 minutes?


----


\ROOT\CSCOpx\conf\csdiscovery\CSDiscovery-config.xml


<DiscoveryEngine PurgeDeviceObject="false" ExportDiscardedDevices="true" ExportUnreachableDevices="true" TotalThreadAllowed="20"
PreferredMgmtIPMethod="UseLoopBack" SNMPTimeout="10000" SNMPRetry="3" ICMPTimeout="1000" ICMPInterPacketTimeout="20" ICMPRetry="1"
SNMPFallback="true" NextDeviceInterval="75" ThreadTimeout="300" ResolveName="true" NeighbourData="true" SNMPMaxThread="45" SNMPMinThread="30"
SNMPBulkSize="20" DiscoverNonSnmpAccessibles="false"/>
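
To compare the two files side by side, here is a minimal read-only Python sketch that pulls SNMPTimeout and SNMPRetry out of each DiscoveryEngine element. It assumes the values are milliseconds, which is only the guess made above and is not confirmed anywhere in this thread. The helper name snmp_settings is made up for illustration, and nothing here writes to the files.

```python
import xml.etree.ElementTree as ET

# The two DiscoveryEngine elements quoted above, pasted in as strings.
SYSTEM_CONFIG = (
    '<DiscoveryEngine PurgeDeviceObject="true" ExportDiscardedDevices="true" '
    'ExportUnreachableDevices="true" TotalThreadAllowed="20" PreferredMgmtIPMethod="" '
    'SNMPTimeout="180000" SNMPRetry="3" ICMPTimeout="1000" ICMPRetry="1" '
    'SNMPFallback="false" ResolveName="false"/>'
)
CSDISCOVERY_CONFIG = (
    '<DiscoveryEngine PurgeDeviceObject="false" ExportDiscardedDevices="true" '
    'ExportUnreachableDevices="true" TotalThreadAllowed="20" '
    'PreferredMgmtIPMethod="UseLoopBack" SNMPTimeout="10000" SNMPRetry="3" '
    'ICMPTimeout="1000" ICMPInterPacketTimeout="20" ICMPRetry="1" '
    'SNMPFallback="true" NextDeviceInterval="75" ThreadTimeout="300" '
    'ResolveName="true" NeighbourData="true" SNMPMaxThread="45" SNMPMinThread="30" '
    'SNMPBulkSize="20" DiscoverNonSnmpAccessibles="false"/>'
)


def snmp_settings(xml_text):
    """Return the SNMP timeout and retry attributes of a DiscoveryEngine element."""
    elem = ET.fromstring(xml_text)
    timeout_ms = int(elem.get("SNMPTimeout"))  # ASSUMPTION: unit is milliseconds
    return {
        "timeout_ms": timeout_ms,
        "timeout_s": timeout_ms / 1000,
        "retries": int(elem.get("SNMPRetry")),
    }


for name, text in (("system-config.xml", SYSTEM_CONFIG),
                   ("CSDiscovery-config.xml", CSDISCOVERY_CONFIG)):
    s = snmp_settings(text)
    print(f"{name}: SNMPTimeout={s['timeout_s']:.0f} s, SNMPRetry={s['retries']}")
```

Under the millisecond assumption, the first file reads as 180 seconds and the second as 10 seconds. The sketch only shows what each file contains; it does not tell you which value the discovery process actually uses.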


----


Now my question is: which SNMPTimeout takes effect? The one in system-config.xml or the one in CSDiscovery-config.xml?


And my next question: how can I change the settings to "normal" values? Directly in the XML files?


Thanks a lot!


Sven

Joe Clarke Tue, 01/20/2009 - 08:55
User Badges:
  • Cisco Employee,
  • Hall of Fame,

    Founding Member

You should not be looking in these files. The SNMP timeout and retries are tied to credentials, and can be configured in the GUI. If you post the CSDiscovery.log and the CSDiscovery-config.xml, I can analyze them. We have seen some problems with large discoveries, and this might be memory-related.

Sven Hruza Tue, 01/20/2009 - 09:04
User Badges:
  • Bronze, 100 points or more

The problem is that I don't have a log file for the whole discovery, because that would take a week or so...

A discovery with 1500 devices produces a 300 MB log file.

Smaller discoveries finished fine over the last 3 days, so I don't think those log files are very helpful.


Where can I configure the SNMP timeout in the GUI? I only found the settings for ICMP...



Joe Clarke Tue, 01/20/2009 - 09:24
User Badges:
  • Cisco Employee,
  • Hall of Fame,

    Founding Member

Your timeouts are all the default. These are configured at the same place you added your SNMP community strings for Discovery. The default is 2 retries with a 3 second timeout. One issue I did notice was that you have non-UTF-8 characters in one of your community string comments (the one for 172.*.*.*). This is breaking the file. Please change this comment.


You may also have a timeout issue. Even with the default values, a timeout will take 21 seconds (per string per device). The only way to debug this is with the ngdiscovery log with debugging enabled. But as you say, that will be big. Therefore, I recommend you open a TAC service request. You can then upload the logs via FTP for further analysis.
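
The 21-second figure can be reproduced with a small Python sketch. It assumes the wait doubles on every retry, which is not stated in this thread; it is simply one model that matches the number quoted for a 3 second timeout with 2 retries. The function name worst_case_timeout and the backoff parameter are hypothetical.

```python
def worst_case_timeout(initial_timeout_s, retries, backoff=2):
    """Total seconds spent on one unanswered SNMP query.

    ASSUMPTION: one initial attempt plus `retries` further attempts,
    with the wait multiplied by `backoff` after each attempt.
    """
    total = 0
    wait = initial_timeout_s
    for _ in range(retries + 1):
        total += wait
        wait *= backoff
    return total


# Defaults quoted above: 3 second timeout, 2 retries.
print(worst_case_timeout(3, 2))  # 3 + 6 + 12 = 21
```

Under the same assumption, that cost is paid once per community string tried against each unreachable device, so it grows quickly with the number of strings and unreachable devices.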

Sven Hruza Wed, 01/21/2009 - 09:40
User Badges:
  • Bronze, 100 points or more

Hi Joe,


I reworked all of the SNMP parameters and set the retries to 1 and the timeout to 1 second.


Then I checked the more than 100 unreachable devices and configured a filter to exclude the known unreachable ones.


Now a discovery with only the CDP module takes about 15 minutes for the 5800 devices :)


I will monitor the processes over the next few days, but I think it works!

Sven Hruza Thu, 01/22/2009 - 08:55
User Badges:
  • Bronze, 100 points or more

It is not as easy as I thought...


Today I tested the discovery again and found that the process doesn't end if there are seed devices configured in the CDP module.


At first it seemed that an old CatOS 6509 with IOS Sup2 was the problematic device.

But a later discovery without that device still didn't work, even though only four 6509s with Sup720 were configured as seed devices.


Does anybody have information about this behavior?

Joe Clarke Thu, 01/22/2009 - 08:59
User Badges:
  • Cisco Employee,
  • Hall of Fame,

    Founding Member

As I said, it would probably be a good idea for you to open a TAC service request at this point so that the full CSDiscovery and ngdiscovery logs can be collected and analyzed.

Sven Hruza Fri, 03/06/2009 - 09:35
User Badges:
  • Bronze, 100 points or more

There is a CaseID CSCsv42110 for this problem with a bugfix.


On my system it works!

Joe Clarke Fri, 03/06/2009 - 10:10
User Badges:
  • Cisco Employee,
  • Hall of Fame,

    Founding Member

This is a bug ID. Yes, there was some recent work on this issue as development was able to get the thread dumps and logs they needed. Patches are available.
