01-06-2010 05:43 PM
Hello,
I've run into an interesting problem: after the last restart of LMS 3.2, the IPM component will not work:
- IPMProcess is shut down, and attempts to start it with pdexec log DB connection refused messages
- the IpmDbEngine is running according to pdshow and Solaris 10 ps -ef:
Process: | IpmDbEngine |
Path: | /opt/CSCOpx/objects/db/bin64s/dbsrv10 |
Flags: | -x tcpip{HOST=localhost,DOBROADCAST=NO,ServerPort=43820} -m -q -ti 0 -gm 100 -gn 50 -ch 50P -s local0 -c 16M -n ipmEng /opt/CSCOpx/databases/ipm/ipm.db -n ipm |
Startup: | Started automatically at boot. |
Dependencies: | Not applicable |
I checked for the DB port number (43820) via netstat and lsof, and nothing is listening on it.
I also checked for any port that IpmDbEngine might be listening on (via its PID) and found nothing.
One suspicion I have is the size of the IPM database, which is just over 3 GB. Could this be the reason?
Thanks
Dmitry
01-06-2010 08:23 PM
Post the daemons.log.
01-07-2010 07:08 AM
The daemons.log is attached. The strange thing is that overnight the IPM database engine opened its TCP port:
bash-3.00# /usr/local/bin/lsof -i :43820
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
dbsrv10 3331 casuser 13u IPv4 0x600144d52c0 0t0 UDP *:43820
dbsrv10 3331 casuser 14u IPv6 0x600144d4ec0 0t0 UDP *:43820
dbsrv10 3331 casuser 15u IPv4 0x60018d33280 0t0 TCP *:43820 (LISTEN)
dbsrv10 3331 casuser 16u IPv6 0x600181cb740 0t0 TCP *:43820 (LISTEN)
dbsrv10 3331 casuser 20u IPv4 0x60015cd7b40 0t0 TCP *:43820 (BOUND)
dbsrv10 3331 casuser 21u IPv4 0x60018bc4a80 0t0 TCP localhost:43820->localhost:58528 (ESTABLISHED)
cwjava 3702 casuser 13u IPv4 0x60018352000 0t33798 TCP localhost:58528->localhost:43820 (ESTABLISHED)
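The listing above mixes UDP sockets, TCP listeners, a bound socket, and an established connection, so a plain grep for the port number cannot tell "listening" apart from the rest. A minimal sketch of a filter follows. The function name has_tcp_listener is hypothetical, and the field positions assume the lsof -i column layout shown above. On Solaris 10 the legacy /usr/bin/awk probably needs to be swapped for nawk or /usr/xpg4/bin/awk, since the function relies on the -v option.

```shell
# Hypothetical helper (not part of LMS): reads lsof -i output on stdin and
# succeeds only if a TCP socket in LISTEN state exists on the given port.
has_tcp_listener() {
    port="$1"
    awk -v port="$port" '
        # Assumed columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME (STATE)
        # $8 is the protocol, $9 the address:port, the last field the state.
        $8 == "TCP" && $NF == "(LISTEN)" && $9 ~ (":" port "$") { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `/usr/local/bin/lsof -i :43820 | has_tcp_listener 43820 && echo "listening"` prints the message only when a listener line is present.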
I also tried to start the IPMProcess (the attached daemons1.log.gz was taken before the attempt and daemons2.log.gz after it), and it started fine:
bash-3.00# /opt/CSCOpx/bin/pdshow IPMProcess
Process= IPMProcess
State = Program started - No mgt msgs received
Pid = 12414
RC = 0
Signo = 0
Start = 01/07/10 10:02:16
Stop = Not applicable
Core = Not applicable
Info = Application started by administrator request.
In the daemons.log I noticed some messages for the IPM database with warnings about the DB indexes. Maybe the DB check took that long (3 GB), and the engine would not open its TCP port until the check finished?
Thanks
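When a service may open its port only after a long startup phase, polling is less error-prone than a single check. A minimal sketch of such a loop follows. The function name wait_for and the retry counts are hypothetical, and bash (as in the prompts above) is assumed for the arithmetic expansion. It does not confirm that the engine behaves this way; it only waits for whatever check command it is given.

```shell
# Hypothetical helper (not part of LMS): run a check command up to TRIES times,
# sleeping INTERVAL seconds between attempts.
# Returns 0 as soon as the check succeeds, 1 if every attempt fails.
wait_for() {
    tries="$1"
    interval="$2"
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        # Skip the sleep after the final failed attempt.
        if [ "$tries" -gt 0 ]; then
            sleep "$interval"
        fi
    done
    return 1
}

# Example (assumed values): check every 30 seconds, up to 60 times.
# wait_for 60 30 sh -c '/usr/local/bin/lsof -i :43820 | grep LISTEN >/dev/null'
```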
01-07-2010 08:42 AM
Sorry, the logs were attached as .gz; they are now in .zip.
Also, even though everything looks good from the process point of view, I still cannot use IPM. The CW CUI shows:
Error in communicating with IPMProcess or IpmDbEngine.
It may be down or not yet up. Please make sure processes are up and running
bash-3.00# ps -ef | grep -i ipm
casuser 12414 3275 0 10:02:17 ? 0:27 CSCO.IPMProcess -cw /opt/CSCOpx -cw:jre lib/jre -server -Xms64m -Xmx192m -cp:p
casuser 3702 3275 0 20:07:33 ? 1:04 CSCO.IPMOGSServer -cw /opt/CSCOpx -cw:jre lib/jre -server -Xms64m -Xmx512m -cp:
01-07-2010 09:31 AM
Yep, the IPM database is corrupt. If you have a good backup of LMS, you can restore it, and hopefully that will provide you with a good IPM DB. If not, you will need to reinitialize the IPM database with the following command:
/opt/CSCOpx/bin/dbRestoreOrig.pl dsn=ipm dmprefix=Ipm
01-07-2010 11:31 AM
Thanks for checking it. I have monthly IPM exports, so the data loss is not a problem; I just need to recreate the collectors or import them from the "shadow" router.
Is there an upper threshold for the IPM database size that Cisco would not recommend crossing?
01-07-2010 11:43 AM
No, there is no limit. The problem here is not the size per se. It's the fact that the database became corrupt and is now larger than the engine expects it to be. It is perfectly normal to see multi-gigabyte databases if a lot of data retention is done.