Long Device Discovery in LMS 3.0.1

Jan 9th, 2008

I recently installed the Dec 2007 update on my Windows 2003 based LMS 3.0 system, which upgrades it to LMS 3.0.1. Device Discovery was changed in this update and is now part of CS instead of CM. Since the upgrade, my device discovery times have gone from about 15 minutes to 7 hours. I am using the discovery method created by the Dec 2007 update, which just migrates the settings from CM to CS, so the module being used is CDP. My existing seed devices and IP address range filter were migrated correctly.

I performed a sniffer trace and found that all the extra time is apparently due to discovery treating Cisco IP Phones as valid Cisco devices that should be discovered via SNMP. I see discovery try SNMP queries on each IP Phone, and the phone responds with an ICMP port unreachable because IP phones don't run SNMP. Discovery ignores this and goes through the configured timeouts and retries before giving up and moving on to the next phone, only to waste the same time again. My SNMP timeout is 3 seconds with 2 retries. This adds up when you have 1500 IP phones.

Is this a bug, or is there some way to stop this? I never had this problem with LMS 3.0 or LMS 2.X.
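To put rough numbers on the timeout cost described above, here is a back-of-the-envelope sketch. It assumes each phone gets one initial SNMP query plus the configured retries, that every attempt waits out the full timeout, and that phones are probed one at a time; none of that is confirmed for the discovery engine, and the function name is made up for illustration.

```python
def wasted_seconds(devices: int, timeout_s: float, retries: int) -> float:
    """Time spent waiting on devices that never answer SNMP.

    Assumption: one initial attempt plus `retries` more, each waiting the
    full timeout, with devices probed sequentially.
    """
    attempts = 1 + retries
    return devices * attempts * timeout_s


# Figures from the post: 1500 phones, 3 second timeout, 2 retries.
total = wasted_seconds(devices=1500, timeout_s=3, retries=2)
print(total)         # 13500
print(total / 3600)  # 3.75
```

Under these assumptions the phones alone account for several hours. The observed run was longer still, so the real process presumably does more work per phone than this simple model captures.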

Joe Clarke Wed, 01/09/2008 - 09:13
User Badges:
  • Cisco Employee
  • Hall of Fame
  • Founding Member

Good catch. I'm looking into this. In the meantime, using filters can help avoid this problem.

sfenderson Wed, 01/09/2008 - 09:55

I have IP Phones and some voice gateways in the same voice-only subnets, so if I exclude the entire IP range I will miss the voice gateways. I thought about a DNS filter to filter out SEP*, but I am already using an IP address filter and you can't use both. I have a TAC case open but I haven't heard much from them, so I thought I would ask here. Thanks.
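The filtering that would help here combines both criteria: keep the IP address range filter, and additionally drop hosts whose names look like phones. The following is a minimal sketch of that logic only, not of anything LMS offers. The subnet, the host names, and the function name are made up for illustration, and it assumes the phones have DNS names beginning with SEP.

```python
import ipaddress
from fnmatch import fnmatchcase

# Hypothetical values, for illustration only.
ALLOWED_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]
EXCLUDED_NAME_PATTERNS = ["SEP*"]


def should_discover(ip: str, hostname: str) -> bool:
    """Return True if the host passes the IP range filter and is not a phone."""
    addr = ipaddress.ip_address(ip)
    in_range = any(addr in net for net in ALLOWED_NETWORKS)
    looks_like_phone = any(
        fnmatchcase(hostname.upper(), pattern) for pattern in EXCLUDED_NAME_PATTERNS
    )
    return in_range and not looks_like_phone


print(should_discover("10.20.5.10", "SEP001122334455"))  # False
print(should_discover("10.20.5.11", "voice-gw-1"))       # True
```

With both checks applied, a voice gateway in the same subnet as the phones is still discovered.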

Joe Clarke Wed, 01/09/2008 - 09:59

I will be filing a new bug on this. I'll post the bug ID when I have it.

Joe Clarke Wed, 01/09/2008 - 11:30

I have filed CSCsm13114 to track this issue. I have written an untested patch as well to fix it. If you would like to try the patch, please have your TAC engineer contact me directly to get it.

sfenderson Tue, 01/15/2008 - 06:38

Joe, I installed the 2 files you provided to TAC. Now when I try to configure the discovery settings I get an HTTP Status 500 error with a java.lang.NullPointerException. When I look at CSDiscovery.log I see this error:


[ Tue Jan 15 09:31:36 EST 2008 ] FATAL [DiscoverySettingsSummaryAction : perform] : Exception in reading CSDiscovery-config.xml.

Reason :unable to find FieldDescriptor for 'CDPID' in ClassDescriptor of SystemFilters

com.cisco.nm.csdiscovery.CSDiscoveryException: Exception in reading CSDiscovery-config.xml.

Reason :unable to find FieldDescriptor for 'CDPID' in ClassDescriptor of SystemFilters


Joe Clarke Tue, 01/15/2008 - 12:35

I have a new patch ready. Please have your engineer contact me again.

sfenderson Wed, 01/23/2008 - 10:35

Joe, I received the new patch with 3 files from TAC. I applied these files, updated my discovery settings, and started a discovery. Something is still wrong. Once I started discovery, the CS home page didn't show the discovery status as "running". However, a job was submitted and is running. I let it run for several hours and it still had not finished, so I cancelled it. I then enabled debug for CSDiscovery and started it again. The CS home page still didn't show the correct discovery status. I let this run for about 15 minutes and then cancelled the job. I will attach the csdiscovery.log file. I looked at it and see lots of this message group:


[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryUtil : getNMSROOT] : NMSROOT: E:\PROGRA~1\CSCOpx

[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryJobUtil : processStatus] : [processStatus] Called!!

[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryJobUtil : processStatus] : Executing jobCmd: E:\PROGRA~1\CSCOpx\bin\pdshow.cmd CSDiscovery

[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryJobUtil : processStatus] : Job Command Response:

[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryJobUtil : processStatus] : Exit


These groups of messages are in the log about every 2 seconds.
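One way to see which messages dominate a log like this is to count entries per level, class, and method. The sketch below assumes every entry follows the `[ timestamp ] LEVEL [Class : method] : message` layout seen in the excerpt above; the function name is made up, and the sample lines are copied from that excerpt.

```python
import re
from collections import Counter

# Matches: [ <timestamp> ] LEVEL [Class : method] : message
LINE_RE = re.compile(
    r"^\[ (?P<ts>.+?) \] (?P<level>[A-Z]+) "
    r"\[(?P<cls>\w+) : (?P<method>\w+)\] : (?P<msg>.*)$"
)


def summarize(lines):
    """Count log entries per (level, class, method); skip non-matching lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match["level"], match["cls"], match["method"])] += 1
    return counts


sample = [
    r"[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryUtil : getNMSROOT] : NMSROOT: E:\PROGRA~1\CSCOpx",
    r"[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryJobUtil : processStatus] : [processStatus] Called!!",
    r"[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryJobUtil : processStatus] : Executing jobCmd: E:\PROGRA~1\CSCOpx\bin\pdshow.cmd CSDiscovery",
    r"[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryJobUtil : processStatus] : Job Command Response:",
    r"[ Wed Jan 23 13:20:53 EST 2008 ] DEBUG [DiscoveryJobUtil : processStatus] : Exit",
]

for key, count in summarize(sample).most_common():
    print(count, key)
```

Run against the full file, a summary like this makes a repeating status-polling loop stand out from actual discovery activity.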


I also sent this info to the TAC engineer I am working with.


Thanks



Joe Clarke Wed, 01/23/2008 - 10:49

You need to manually adjust the two XML files I sent you. My development platform is Solaris, and thus the paths are set accordingly. You need to adjust them for your NMSROOT. Replace all instances of /opt/CSCOpx in both files with wherever CiscoWorks is installed (e.g. C:\PROGRA~1\CSCOpx). Then, replace all '/' in those paths with '\'. Once the files have been properly adjusted, place them into NMSROOT\conf\csdiscovery, and restart dmgtd. That should solve this problem.



sfenderson Wed, 01/23/2008 - 13:00

Joe, I updated the 2 files to correct the path information. It still didn't work, so I compared the old and new files and found the last statement in each file (Value=) was delimited differently. The ones you sent me had file paths delimited with ":" and my old one used ";". So I changed this and then discovery started working. However, I am still seeing the original problem. The discovery runs for a very long time and appears to still be trying to discover IP Phones. The Device Discovery Summary shows the "Unreachable Devices" count slowly incrementing. When I click on this count to show the devices, they are IP phones.
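The manual adjustments described in the last two posts (swap the Solaris root for the local NMSROOT, flip the slashes, and change the path-list delimiter from ":" to ";") could be scripted roughly as follows. This is a sketch under assumptions: the sample `Value=` content and the function name are invented, and it assumes every entry in a path list starts with /opt/CSCOpx.

```python
import re

SOLARIS_ROOT = "/opt/CSCOpx"


def adapt_for_windows(text: str, nmsroot: str) -> str:
    """Rewrite Solaris-style CiscoWorks paths in `text` for a Windows NMSROOT."""
    # Switch the list delimiter first, while ':' is still unambiguous
    # (after conversion every path contains a drive-letter colon).
    text = text.replace(":" + SOLARIS_ROOT, ";" + SOLARIS_ROOT)

    # A path is the Solaris root followed by typical file-name characters.
    path_re = re.compile(re.escape(SOLARIS_ROOT) + r"[\w./~-]*")

    def to_windows(match):
        tail = match.group(0)[len(SOLARIS_ROOT):]
        return nmsroot + tail.replace("/", "\\")

    return path_re.sub(to_windows, text)


# Invented sample line, for illustration only.
sample = 'Value="/opt/CSCOpx/lib/a.jar:/opt/CSCOpx/lib/b.jar"'
print(adapt_for_windows(sample, r"C:\PROGRA~1\CSCOpx"))
```

Only slashes inside matched paths are converted. A blanket replacement of every '/' in an XML file would also alter the slashes in closing tags.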

Joe Clarke Wed, 01/23/2008 - 13:26

Please tell me the byte size of the discovery.jar file you were given. It sounds like you still have an early bug that should have been fixed in the latest version.

sfenderson Wed, 01/23/2008 - 13:35

My discovery.jar file shows as:


Size: 581,018 bytes


Size on disk: 581,632

Joe Clarke Wed, 01/23/2008 - 13:49

Hmm, this should be 581019. Looks like you have the wrong version. I might have attached the wrong one by accident. I'll send the right one out.
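The size comparison above can be automated with a one-line check. In this sketch the function name is made up, and the demonstration uses a throwaway temporary file rather than a real discovery.jar.

```python
import os
import tempfile

EXPECTED_SIZE = 581019  # bytes, the size quoted above for the corrected jar


def has_expected_size(path: str, expected: int = EXPECTED_SIZE) -> bool:
    """Return True if the file at `path` is exactly `expected` bytes long."""
    return os.path.getsize(path) == expected


# Demonstration with a temporary file one byte short, like the wrong jar.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 581018)
print(has_expected_size(tmp.name))  # False
os.remove(tmp.name)
```

A matching byte count only rules out the known-wrong build. It does not prove the contents are identical, so a checksum comparison would be a stronger test if one were published.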

sfenderson Thu, 01/24/2008 - 09:13

Thanks Joe. I got the updated discovery.jar file and this works much better. One remaining issue is that the patch doesn't skip the following 3 device types:


ATA-186

7936 conference phone

PCs running IP Communicator


These currently make up only a small number (15) of the devices in my network, so they didn't slow discovery down too much (it ran in about 7 minutes). I would think you would want the patch to skip these as well, however.


Any idea when this patch will become an official fix?



Joe Clarke Thu, 01/24/2008 - 09:52

The device types were taken from LMS 3.0's ANIDevice.properties file which is the last set of non-SNMP CDP devices Campus supported. The conference phone can certainly be added (given its CDP platform ID), but the others have not been evaluated.


This should be fixed in the next Campus Manager release due out this spring.
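Skipping devices by CDP platform ID, as described above, amounts to checking each neighbor's platform string against a list of known non-SNMP types before issuing any SNMP query. The sketch below is not the patch's code. The patterns and the sample platform strings are hypothetical, and real values would have to come from the devices' own CDP advertisements.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for non-SNMP device types.
NON_SNMP_PLATFORM_PATTERNS = ["*IP Phone*", "*7936*", "*ATA*186*"]


def skip_snmp(cdp_platform_id: str) -> bool:
    """Return True if the platform ID matches a known non-SNMP device type."""
    return any(
        fnmatchcase(cdp_platform_id, pattern)
        for pattern in NON_SNMP_PLATFORM_PATTERNS
    )


# Made-up CDP neighbors: address -> advertised platform ID.
neighbors = {
    "10.20.5.10": "Cisco IP Phone 7941",
    "10.20.5.11": "cisco WS-C3750-24P",
}
to_query = [ip for ip, platform in neighbors.items() if not skip_snmp(platform)]
print(to_query)  # ['10.20.5.11']
```

A list like this is only as complete as its patterns, so any device type missing from it still costs the full timeout and retry cycle.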

sfenderson Mon, 03/10/2008 - 10:38

Joe, I am having a problem with discovery again. I recently changed my configuration to allow discovery of my entire network; previously I was discovering about one third of it. I quickly found that the patch you gave me to skip IP phones doesn't skip 7941s and 7961s, and the new part of the network had a couple thousand of these. So I excluded the entire IP range we use for phone addressing until this problem is fixed. This made things OK for a while. I then noticed that sometimes discovery seemed to hang, but if I restarted CiscoWorks it would work again. Now, even after a restart or reboot, discovery will not complete. This is not because of the IP Phone problem, as I am excluding the phones by IP address range. Also, the counts in the discovery summary stop incrementing. Discovery never completes unless I stop it. I will attach my CSDiscovery.log file.



Joan Pelser Mon, 05/05/2008 - 01:20

Hi There,

Have you guys found some kind of solution to the problem discussed above? I am experiencing exactly the same thing.


Joan

sfenderson Mon, 05/05/2008 - 04:45

Joan, I do not have a resolution to this problem yet. I have a case open with TAC and they gave me an updated file that collects more debugging info. So far I can't recreate the problem with the extra debugging enabled. I will update you when this gets resolved.

sfenderson Fri, 05/09/2008 - 10:49

Joan, the device discovery hang was resolved. TAC gave me an updated version of the file ngd-log4j.properties with some extra debugging enabled. The problem would not happen when using this file. It created a large log file every time discovery ran, so I couldn't use it on an everyday basis. So TAC had me edit the file and change the debugging level so it would not normally generate a log file. Since then I have not had the hang problem.
