LMS 3.2 CS Discovery finding "0" devices

Aug 20th, 2010

All of a sudden, after my "friendly" Windows server admin rebooted my LMS 3.2 box (without stopping the daemon!), my Discovery jobs are returning zero devices discovered. Prior to this, I was discovering 135 devices, which is correct for our environment. I tried deleting all the jobs and configuring new ones, but I get the same results. I also tried stopping and restarting the daemon service, with no luck. We use UTT to feed a NAM process, which is still showing entries on a daily basis, so I assume the DCR is still functioning, albeit off older network scans. Discovery is just not updating, so I'm missing recent network adds and moves. Jobs are completing as "successful" but only run for 1 or 2 seconds. Has anybody seen anything like this before? Thought I'd try here for a "quick fix" before opening a TAC case.....

All LMS modules are the latest. Recently updated to CS 3.3.0 and CM 5.2.1. Also updated RME to 4.3.1. Everything ran great after these updates. Box is an HP/Compaq DL360 G4 running Windows Server 2003 R2 64-bit; dual 3gig processors; 8gig RAM; dual mirrored 72.8gig SCSI Ultra 320 drives.

Here's a screen shot of the discovery job report:

Discovery.jpg

Thanks,

Rick

Joel Monge Fri, 08/20/2010 - 07:00

Seems like something might have gotten corrupted during the reboot.  Enable discovery debugging for Discovery Framework, Discovery Util, and System Module under CS > Device and Credentials > Device Discovery > Discovery Logging Configuration.  Run a new discovery and post these:

1. CSCOpx\log\CSDiscovery.log.

2. CSCOpx\conf\csdiscovery\CSDiscovery-config.xml.

3. CSCOpx\campus\etc\cwsi\DeviceDiscovery.properties.

rick684_2 Fri, 08/20/2010 - 11:34

Thanks for your response Joel....

I can't post the xml file here until I clean it up. Too much info in there. What specifically are you looking for in that one? Here's the other two:

Looks like I've had issues with this for quite awhile (early June?). Basically, the most recent run says the DCRServer isn't running but Ciscoworks says it is. Must be hosed somehow. No, haven't changed the name recently. Actually, this is a new install that's only been running since June. I do however have Symantec Endpoint (ver. 11.0.4202.75) running AV & Spyware modules only on it. Any known issues with this? Saw these two related issues in other posts......

Rick  

Joel Monge Fri, 08/20/2010 - 11:42

Run the command below from a DOS prompt:

pdshow > C:\pdshow.txt

Then post the pdshow.txt from your C drive.  I wanted to see your general discovery settings and verify the integrity of the file.  I cannot do that without having the actual file.

Exactly where do you see the DCRServer error, can you post a screenshot?

rick684_2 Fri, 08/20/2010 - 12:41

Sorry, I can't post the actual file. I've edited out IPs, names, & email addys and posted it below. Also see the pdshow output. Here's a screen showing the CM status. I WAS getting the error (of course I can't remember it verbatim, but it's the one telling you to check the status of DCRServer back in CS) whenever I would click on the number below "Best Practices..." or "Discrepancies" in the CM Status window below. Now, it's actually pulling up the window it's supposed to for either one.

  

Also, notice the date of the "Last Completion Time" for Discovery above. Below is my current Discovery Schedule.

So, I'm not getting that error now, but my Discovery is still showing "0" devices discovered and isn't updating the CM Discovery window although it falsely thinks it's completing discovery in a normal fashion.

Another thing: If I manually remove a device entry from the DCR, and that device is later connected in a different location on the network (under the same device name), shouldn't Discovery "re-find" that device?

Thanks,

Rick

Joel Monge Fri, 08/20/2010 - 12:57

The pdshow looks fine and it seems the discovery ran fine today:

    Process= CSDiscovery
    State  = Transient terminated
    Pid    = 0
    RC     = 0
    Signo  = 0
    Start  = 8/20/2010 12:22:55 PM
    Stop   = 8/20/2010 12:24:00 PM

Campus Manager Data Collection and the discovery process are separate and should not be mixed up.  The discovery log showed this error:

[ Thu Aug 19 13:32:43 CDT 2010 ] FATAL  [DiscoveryUtil : isDCRServerRunning]  : DCRServer is not running.

This is consistent with the pdshow output, which shows DCRServer starting at 1:34:04 PM on 8/19, about a minute after the error was logged:

    Process= DCRServer
    State  = Running normally
    Pid    = 12136
    RC     = 0
    Signo  = 0
    Start  = 8/19/2010 1:34:04 PM
    Stop   = Not applicable
    Core   = Not applicable
    Info   = DCRServer is up and running

So it seems the last discovery captured in the logs could not run because the DCRServer was indeed down at that time.
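The comparison between the log error and the DCRServer start time can also be done mechanically. The following is a minimal Python sketch, not a Cisco tool: the helper names (parse_pdshow, parse_log_time, parse_pdshow_time) are made up for illustration, and it assumes both timestamps come from the same server clock, so the zone abbreviation in the log line is simply dropped.

```python
import re
from datetime import datetime

def parse_pdshow(text):
    """Split pdshow output into one dict per process, keyed by field name."""
    processes = []
    current = None
    for line in text.splitlines():
        match = re.match(r"\s*(\w+)\s*=\s*(.*)$", line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "Process":  # a "Process=" line starts a new record
            current = {}
            processes.append(current)
        if current is not None:
            current[key] = value
    return processes

def parse_log_time(stamp):
    """Parse 'Thu Aug 19 13:32:43 CDT 2010', ignoring the zone abbreviation."""
    parts = stamp.split()
    del parts[4]  # drop 'CDT' (assumption: same clock as pdshow)
    return datetime.strptime(" ".join(parts), "%a %b %d %H:%M:%S %Y")

def parse_pdshow_time(stamp):
    """Parse a pdshow timestamp such as '8/19/2010 1:34:04 PM'."""
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")

# Values copied from the log line and pdshow output quoted in this post.
sample = """\
    Process= DCRServer
    State  = Running normally
    Pid    = 12136
    Start  = 8/19/2010 1:34:04 PM
    Stop   = Not applicable
"""
dcr = parse_pdshow(sample)[0]
error_time = parse_log_time("Thu Aug 19 13:32:43 CDT 2010")
start_time = parse_pdshow_time(dcr["Start"])
print(start_time - error_time)  # prints 0:01:21
```

A positive difference means DCRServer came up after the error was logged, which matches the reading that it was down when that discovery ran.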

Please post a screenshot of Common Services > Device and Credentials > Device Discovery.  Do you see 0 devices discovered there?  Check under Common Services > Device and Credentials > Device Management, if you expand "All Devices" do you see devices there?

rick684_2 Mon, 08/23/2010 - 05:30

Discovery is showing "0" devices. See below:

Device Summary shows all objects as it should. If I select one, I can open "Edit Credentials" or "Edit Identity" normally. The objects are all there.....

mbilgrav Mon, 08/23/2010 - 07:03

Hello,

Since the server had been running fine, the reboot might have corrupted some files.

Did you try restoring a backup onto the server?

If config files are corrupted you will get weird behavior...

Normally:

Device and Credentials > Device Discovery > Discovery Schedule

What settings do you see in the CSDiscovery schedule?

You can run several discovery jobs with different credentials etc., so that might be a good place to look.

Click on the radio button and verify the settings and seed devices.

Also verify that the seed devices are reachable with the credentials that the CSDiscovery job uses.

HTH

Regards

Martin

rick684_2 Mon, 08/23/2010 - 08:10

I've run different Discoveries Martin. No change. I've deleted the schedule and started all over. No change.

I have not attempted to restore any database yet.....

Rick

Joel Monge Mon, 08/23/2010 - 08:17

On the last discovery log, there did not seem to be any debugs enabled so I could not infer much from it.  Post these:

1. Output of DOS command:

pdreg -l CSDiscovery

2. Clean the discovery logs:

logrot_trunc C:\Progra~2\CSCOpx\log\CSDiscovery.log

logrot_trunc C:\Progra~2\CSCOpx\log\ngdiscovery.log

3. Make sure these modules are enabled under CS > Device and Credentials > Device Discovery > Discovery Logging Configuration:

Discovery Framework

CSDiscovery Adaptor

Discovery Util

Credential Module

Discovery DeviceInfo

System Module

4. Post a screenshot of all contents under C:\Progra~2\CSCOpx\conf\csdiscovery.

5. Run a new manual discovery under CS > Device and Credentials > Device Discovery.  Refresh a couple of times immediately after and make sure it shows running.

6. Once this discovery shows completed on the summary page, post both logs from step 2.
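Once the logs from step 6 are in hand, a quick first pass is to pull out only the lines that look like problems. The Python sketch below is an illustration, not part of LMS. It assumes the line layout seen in the CSDiscovery.log excerpts quoted in this thread ("[ timestamp ] LEVEL [Class : method] : message"). The "ERROR" level name is an assumption, since only FATAL and DEBUG appear in the excerpts, and the function name find_problems is made up.

```python
import re

# Layout assumed from the excerpts in this thread:
# [ Thu Aug 19 13:32:43 CDT 2010 ] FATAL  [DiscoveryUtil : isDCRServerRunning]  : message
LOG_LINE = re.compile(
    r"^\[\s*(?P<time>[^\]]+?)\s*\]\s+(?P<level>[A-Z]+)\s+"
    r"\[(?P<cls>[^:\]]+?)\s*:\s*(?P<method>[^\]]+?)\]\s*:\s*(?P<msg>.*)$"
)

def find_problems(lines, levels=("FATAL", "ERROR")):
    """Return parsed lines whose level is in `levels` or whose text mentions 'error'."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and (m.group("level") in levels or "error" in m.group("msg").lower()):
            hits.append(m.groupdict())
    return hits

# Two lines copied from the log excerpts quoted in this thread.
sample_lines = [
    "[ Thu Aug 19 13:32:43 CDT 2010 ] FATAL  [DiscoveryUtil : isDCRServerRunning]  : DCRServer is not running.",
    "[ Mon Aug 23 11:36:18 CDT 2010 ] DEBUG  [DiscoveryJobUtil : processStatus]  :  Output from JNI call buf[value] :    Process= CSDiscovery",
]
problems = find_problems(sample_lines)
for p in problems:
    print(p["time"], p["level"], p["method"], p["msg"])
```

Matching on the word "error" in the message as well as on the level is deliberate, because debug-level lines can carry error text too.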

Joel Monge Mon, 08/23/2010 - 11:57

Seems the job did run fine:

[ Mon Aug 23 11:36:18 CDT 2010 ] DEBUG  [DiscoveryJobUtil : processStatus]  :  Output from JNI call buf[value] :    Process= CSDiscovery
    State  = Transient terminated
    Pid    = 0
    RC     = 0
    Signo  = 0
    Start  = 8/23/2010 11:35:07 AM
    Stop   = 8/23/2010 11:36:15 AM
    Core   = Not applicable
    Info   = Server started by admin request

I can see the below error in the logs this time:

[ Mon Aug 23 11:36:19 CDT 2010 ] DEBUG  [DiscoveryUtil : changePermission]  : [Mon Aug 23 11:36:19 CDT 2010]Error while changing the permission of the file C:\PROGRA~2\CSCOpx\objects\csdiscovery\1517\DiscoveryStatusObj

This object is responsible for updating the discovery status in the GUI.  Seems some permissions might have been changed and this could be the problem.  Does "casuser" have full rights over C:\PROGRA~2\CSCOpx\objects\csdiscovery\1517\DiscoveryStatusObj?  Casuser needs to have full permissions over CSCOpx and everything under it.  Is it possible your Windows admin changed any security settings or a domain policy is overwriting local security policies?
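To see every file affected by this kind of error, the paths can be pulled out of the log in one pass. This is a rough Python sketch that assumes the message wording shown in the log line above; the function name is hypothetical. It only lists the paths and does not inspect Windows ACLs, so the rights for casuser still have to be checked by hand on each file it reports.

```python
import re
from pathlib import PureWindowsPath

# Message wording taken from the changePermission line quoted above.
PERMISSION_ERROR = re.compile(
    r"Error while changing the permission of the file\s+(?P<path>.+?)\s*$"
)

def files_with_permission_errors(lines):
    """Return the distinct file paths named in permission errors, in log order."""
    seen = []
    for line in lines:
        m = PERMISSION_ERROR.search(line)
        if m:
            path = PureWindowsPath(m.group("path"))
            if path not in seen:
                seen.append(path)
    return seen

log_lines = [
    r"[ Mon Aug 23 11:36:19 CDT 2010 ] DEBUG  [DiscoveryUtil : changePermission]  : "
    r"[Mon Aug 23 11:36:19 CDT 2010]Error while changing the permission of the file "
    r"C:\PROGRA~2\CSCOpx\objects\csdiscovery\1517\DiscoveryStatusObj",
]
paths = files_with_permission_errors(log_lines)
for path in paths:
    print(path.parent.name, path.name)  # prints: 1517 DiscoveryStatusObj
```

PureWindowsPath is used so the Windows-style paths are split correctly even if the log is examined on another operating system.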

rick684_2 Mon, 08/23/2010 - 12:45

Ya know, I think he may have changed domain membership a week or two ago Joel while troubleshooting a script.

I just looked at rights to CSCOpx & casuser has full rights; but my admin guy just said that he's seen this sort of thing before so he's going to "push" rights down for casuser. I'll let you know if it cures it....

Update:

The only file he couldn't push rights down to was "pidm.log". On this file, casuser has everything except "full". In checking rights on some other log files, casuser is showing full rights to them. I'm told that pidm.log shouldn't make any difference. I'll let you know if I see numbers other than zero in the discovery the next time it runs....

Rick

rick684_2 Thu, 09/09/2010 - 07:54

As it turns out, this issue had nothing to do with the server. I had "Excluded" four IPs from Device Discovery. When I cleared the excluded list, everything started to behave normally.
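For anyone hitting the same symptom, one cheap check is to compare the exclude list against the seed devices. The thread does not say why the four excluded addresses led to zero devices, so the idea that an excluded seed device could stop discovery from spreading is only an assumption here. The Python sketch below uses made-up documentation addresses (192.0.2.x), a hypothetical function name, and treats exclude entries as plain IP addresses only.

```python
import ipaddress

def excluded_seeds(seed_devices, exclude_list):
    """Return the seed addresses that also appear in the exclude list."""
    excluded = {ipaddress.ip_address(addr) for addr in exclude_list}
    return [seed for seed in seed_devices if ipaddress.ip_address(seed) in excluded]

# Hypothetical values; substitute the real seed and exclude entries.
seeds = ["192.0.2.1", "192.0.2.254"]
excludes = ["192.0.2.1", "192.0.2.10", "192.0.2.11", "192.0.2.12"]

blocked = excluded_seeds(seeds, excludes)
print(blocked)  # prints ['192.0.2.1']
```

An empty result does not rule out the exclude list as the cause; it only rules out the one overlap this sketch looks for.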

Thanks for your help Joel......

Rick
