Cisco Support Community
Blue

System Inventory collection and Inventory Collection jobs failed

Just noticed Inventory Changes being 0, which never happens. It looks like the two inventory collection jobs haven't been working. What's wrong with them?

System Inventory Collection

/var/adm/CSCOpx/files/rme/jobs/ICServer/1391/

Inventory Collection

/var/adm/CSCOpx/files/rme/jobs/ICServer/1594/

10 REPLIES
Cisco Employee

Re: System Inventory collection and Inventory Collection jobs failed

Looks like some of your RME daemons may have crashed. The daemons to check are RMECSTMServer and ICServer.
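
A quick way to spot a crashed daemon is to parse the pdshow output directly. The following is a minimal Python sketch, not an official tool; the field names (Process=, State =) are taken from the pdshow excerpt quoted later in this thread, so adjust the parsing if your output differs.

import subprocess

# Hypothetical helper: run pdshow and flag RME daemons whose State is not
# "Running normally". Field names follow the pdshow excerpt in this thread.
DAEMONS = {"RMECSTMServer", "ICServer"}

def check_daemons():
    out = subprocess.run(["pdshow"], capture_output=True, text=True).stdout
    current = None
    for raw in out.splitlines():
        line = raw.strip()
        if line.startswith("Process"):
            current = line.split("=", 1)[1].strip()
        elif line.startswith("State") and current in DAEMONS:
            state = line.split("=", 1)[1].strip()
            flag = "OK" if state == "Running normally" else "CHECK"
            print(f"{flag}: {current} -> {state}")

if __name__ == "__main__":
    check_daemons()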

Blue

Re: System Inventory collection and Inventory Collection jobs failed

Seems fine in pdshow. Should I restart ICServer?

Process= RMECSTMServer
State = Running normally
Pid = 20654
RC = 0
Signo = 0
Start = 01/25/08 15:26:26
Stop = Not applicable
Core = Not applicable
Info = RMECSTMServer started.

Process= ICServer
State = Administrator has shut down this server
Pid = 0
RC = 1
Signo = 0
Start = 01/25/08 15:26:30
Stop = 01/26/08 01:23:52
Core = Not applicable
Info = ICServer started.

Cisco Employee

Re: System Inventory collection and Inventory Collection jobs failed

Yes, but you should check the daemons.log (ICServer.log on Windows) for any indication of why it crashed in the first place. Note: a pdexec might not fix this. You may have to restart dmgtd.
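
As a rough aid for combing through a large daemons.log, a small scan for FATAL and OutOfMemoryError entries can save some scrolling. This is only a sketch; the default log path below is an assumption based on the /var/adm/CSCOpx install tree seen in this thread, so pass the real path as an argument.

import re
import sys

# Sketch: print log lines that mention FATAL or OutOfMemoryError.
# The default path is an assumption; pass your actual log file as argv[1].
log_path = sys.argv[1] if len(sys.argv) > 1 else "/var/adm/CSCOpx/log/daemons.log"
pattern = re.compile(r"FATAL|OutOfMemoryError")

with open(log_path, errors="replace") as fh:
    for lineno, line in enumerate(fh, 1):
        if pattern.search(line):
            print(f"{lineno}: {line.rstrip()}")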

Blue

Re: System Inventory collection and Inventory Collection jobs failed

Seems a number of tables were locked.

Cisco Employee

Re: System Inventory collection and Inventory Collection jobs failed

While these errors would prevent inventory from being successfully collected, they would not crash ICServer. Additionally, if you have a process which is stuck, and holding the locks on these tables, you will definitely need to restart dmgtd to recover.

Re: System Inventory collection and Inventory Collection jobs failed

There is a 'java.lang.OutOfMemoryError' at the very end, at line 7987, which I think forced ICServer to exit:

[ Sat Jan 26 01:23:49 EST 2008 ],FATAL,[Thread-18],com.cisco.nm.rmeng.inventory.ics.server.InvDataProcessor,481,Fatal Error has Occured, exiting ICServer java.lang.OutOfMemoryError

But why did it occur? Could it be related to the process locking the tables?

Cisco Employee

Re: System Inventory collection and Inventory Collection jobs failed

Yeah, one of the threads hit that error, then it exited. I doubt the locks caused this. If you look, the thread that encountered the OOME did not encounter the lock problem. But there does appear to be an issue with the 192.168.8.44 device. It takes 355 seconds to process this device, and there could be a problem in the CISCO-STACK-MIB implementation. It would be beneficial to look at a sniffer trace of the inventory collection for this device to rule out any bugs on the device side.
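
Alongside the sniffer trace, a rough cross-check is to time a plain SNMP walk of the CISCO-STACK-MIB subtree from the RME server. The sketch below uses pysnmp; the community string, timeout values, and the subtree OID (ciscoStackMIB is normally rooted at 1.3.6.1.4.1.9.5.1) are assumptions, so adjust them for your environment. It only measures how long the device takes to answer a walk; it says nothing about what ICServer then does with the data.

import time
from pysnmp.hlapi import (
    SnmpEngine, CommunityData, UdpTransportTarget, ContextData,
    ObjectType, ObjectIdentity, nextCmd,
)

# Sketch: time an SNMP walk of the (assumed) CISCO-STACK-MIB subtree on
# the slow device. Community string and OID are placeholders.
TARGET = "192.168.8.44"
COMMUNITY = "public"            # assumption: replace with your RO community
STACK_MIB_OID = "1.3.6.1.4.1.9.5.1"

start = time.time()
count = 0
for err_ind, err_status, err_index, var_binds in nextCmd(
        SnmpEngine(),
        CommunityData(COMMUNITY),
        UdpTransportTarget((TARGET, 161), timeout=2, retries=1),
        ContextData(),
        ObjectType(ObjectIdentity(STACK_MIB_OID)),
        lexicographicMode=False):
    if err_ind or err_status:
        print("SNMP error:", err_ind or err_status.prettyPrint())
        break
    count += len(var_binds)

print(f"Walked {count} varbinds in {time.time() - start:.1f} s")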

Re: System Inventory collection and Inventory Collection jobs failed

Yes, the thread that encountered the OOME did not encounter the lock problem, but if I interpret the log correctly, it had already finished processing and was just reporting the last information about its runtime.

Perhaps it is a more widespread problem... :-(

If you say that 355 seconds is a long time for processing a device, there are several devices that take even longer (up to 876 seconds). But as far as I can see, they all finished processing (except the one with the OOME), and a few show the lock problem as well. Could it be that for some of these devices the memory does not get properly freed?

It could be of interest whether they are all of the same device type (see the sketch after the list below)...

yjdabear, perhaps this list is somewhat useful for you....

It contains the IPs with a processing time > 300 s:

172.19.10.1

172.19.10.74

172.19.20.102

172.19.20.111

172.19.20.212

172.19.20.232

172.19.25.1 (842s)

172.19.26.2

172.19.29.1

172.19.3.1

172.19.32.1

172.19.42.3

192.168.11.28

192.168.110.71

192.168.116.28

192.168.254.29

192.168.254.30

192.168.26.20

192.168.26.36

192.168.26.44

192.168.28.4 (DP time:863s, Total time: 876s)

192.168.29.12

192.168.29.36

192.168.3.36

192.168.32.44

192.168.37.36 (DP time: 793, Total time: 854s)

192.168.4.28

192.168.5.12

192.168.52.36

192.168.53.76

192.168.8.36

192.168.8.44
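
To check whether the slow devices share a device type, one option is to query sysObjectID (1.3.6.1.2.1.1.2.0) for each IP in the list above and group the results by the returned product OID. The sketch below is only illustrative; the community string is an assumption, and the device list is shortened, so extend it with the full list above.

from collections import defaultdict
from pysnmp.hlapi import (
    SnmpEngine, CommunityData, UdpTransportTarget, ContextData,
    ObjectType, ObjectIdentity, getCmd,
)

# Sketch: fetch sysObjectID for each slow device and group IPs by the
# returned product OID, to see whether they are the same device type.
SLOW_DEVICES = ["172.19.10.1", "172.19.25.1", "192.168.8.44"]  # extend with the full list
COMMUNITY = "public"   # assumption: replace with your RO community

by_type = defaultdict(list)
for ip in SLOW_DEVICES:
    err_ind, err_status, err_index, var_binds = next(getCmd(
        SnmpEngine(),
        CommunityData(COMMUNITY),
        UdpTransportTarget((ip, 161), timeout=2, retries=1),
        ContextData(),
        ObjectType(ObjectIdentity("1.3.6.1.2.1.1.2.0")),  # sysObjectID.0
    ))
    key = "unreachable" if err_ind or err_status else var_binds[0][1].prettyPrint()
    by_type[key].append(ip)

for oid, ips in by_type.items():
    print(oid, "->", ", ".join(ips))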

Cisco Employee

Re: System Inventory collection and Inventory Collection jobs failed

The reason this device was interesting to me is that it also had an SNMP access error in it. However, given network latency, the size of the device, etc., 355 seconds may not be that long. That's why I suggested a sniffer trace to rule out a problem with the device instrumentation.

All that said, it could be that there is a memory leak that is encountered by this thread. This would not be the first time that we've seen an ICServer leak. Profiling ICServer is not an easy task, though, so it would be good to rule out obvious problems first.
