IPM 4.2 won't start

Jan 6th, 2010

Hello,

I encountered an interesting problem: after the last restart of LMS 3.2, the IPM component will not work:

- IPMProcess is shut down, and attempts to start it with pdexec log DB "connection refused" messages

- the IpmDbEngine is running according to pdshow and Solaris 10 ps -ef:

    

Process:IpmDbEngine
Path:/opt/CSCOpx/objects/db/bin64s/dbsrv10
Flags:-x tcpip{HOST=localhost,DOBROADCAST=NO,ServerPort=43820} -m -q -ti 0 -gm 100 -gn 50 -ch 50P -s local0 -c 16M -n ipmEng /opt/CSCOpx/databases/ipm/ipm.db -n ipm
Startup:Started automatically at boot.
Dependencies:Not applicable

I checked for the DB port number (43820) with netstat and lsof; nothing is listening on port 43820.

I also checked for any ports the IpmDbEngine might be listening on (via the ipmDB PID) and found nothing.
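
For anyone who wants to repeat this check without netstat or lsof, here is a minimal sketch that simply tries a TCP connection to the port. It assumes Python 3 is available on the host; the function name `port_is_listening` is made up for illustration, and 43820 is the ServerPort taken from the IpmDbEngine flags above.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 43820 is the ServerPort shown in the IpmDbEngine flags.
    print("IpmDbEngine port open:", port_is_listening("localhost", 43820))
```

A refused connection only shows that nothing is accepting on that port at that moment; it does not say why.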

One suspicion I have is the size of the ipmDB, which is just over 3 GB. Could this be the reason?

Thanks

Dmitry

dmitry Thu, 01/07/2010 - 07:08

The daemons.log is attached. The strange thing is that overnight the ipmDB opened its TCP port:

bash-3.00# /usr/local/bin/lsof -i :43820
COMMAND  PID    USER   FD   TYPE        DEVICE SIZE/OFF NODE NAME
dbsrv10 3331 casuser   13u  IPv4 0x600144d52c0      0t0  UDP *:43820
dbsrv10 3331 casuser   14u  IPv6 0x600144d4ec0      0t0  UDP *:43820
dbsrv10 3331 casuser   15u  IPv4 0x60018d33280      0t0  TCP *:43820 (LISTEN)
dbsrv10 3331 casuser   16u  IPv6 0x600181cb740      0t0  TCP *:43820 (LISTEN)
dbsrv10 3331 casuser   20u  IPv4 0x60015cd7b40      0t0  TCP *:43820 (BOUND)
dbsrv10 3331 casuser   21u  IPv4 0x60018bc4a80      0t0  TCP localhost:43820->localhost:58528 (ESTABLISHED)
cwjava  3702 casuser   13u  IPv4 0x60018352000  0t33798  TCP localhost:58528->localhost:43820 (ESTABLISHED)

I also tried to start the IPMProcess (the attached daemons1.log.gz was taken before the attempt and daemons2.log.gz after it), and it started fine:

bash-3.00# /opt/CSCOpx/bin/pdshow IPMProcess
        Process= IPMProcess
        State  = Program started - No mgt msgs received
        Pid    = 12414
        RC     = 0
        Signo  = 0
        Start  = 01/07/10 10:02:16
        Stop   = Not applicable
        Core   = Not applicable
        Info   = Application started by administrator request.
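
If the pdshow output needs to be checked from a script, the "Key = Value" lines above can be split into fields. This is only a sketch: `parse_pdshow` is a hypothetical helper, not part of LMS, and the sample text is the output pasted above.

```python
def parse_pdshow(text: str) -> dict:
    """Parse 'Key = Value' lines from pdshow output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Sample copied from the pdshow IPMProcess output in this post.
SAMPLE = """\
        Process= IPMProcess
        State  = Program started - No mgt msgs received
        Pid    = 12414
        RC     = 0
        Signo  = 0
        Start  = 01/07/10 10:02:16
        Stop   = Not applicable
        Core   = Not applicable
        Info   = Application started by administrator request.
"""

info = parse_pdshow(SAMPLE)
print(info["Process"], "->", info["State"])
```

The State field only reports what the daemon manager sees; it does not by itself confirm that the process can reach the database.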

In daemons.log I noticed some messages for the ipmDB warning about the DB indexes. Could the DB check have taken that long (the file is about 3 GB), with the engine not opening its TCP port until the check finished?
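
One way to test that guess after the next restart would be to poll the port and record how long it takes to open. A rough sketch, assuming Python 3 on the host; `wait_for_port` and its default timings are made up for illustration:

```python
import socket
import time


def wait_for_port(host, port, timeout=600.0, interval=5.0):
    """Poll host:port until it accepts a TCP connection.

    Returns the seconds waited, or None if the timeout expired first.
    """
    start = time.monotonic()
    while True:
        try:
            with socket.create_connection((host, port), timeout=2.0):
                return time.monotonic() - start
        except OSError:
            pass  # not accepting connections yet
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)


# Example (not run here): wait up to an hour for the ipmDB port.
# waited = wait_for_port("localhost", 43820, timeout=3600.0)
```

A long wait would be consistent with the engine doing work before it opens the port, but it would not prove that an index check is the cause.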

Thanks

dmitry Thu, 01/07/2010 - 08:42

Sorry, the logs were attached as .gz; they are now in .zip.

Also, even though everything looks good from the process side, I still cannot use IPM. The CW CUI shows:

Error in communicating with IPMProcess or IpmDbEngine.
It may be down or not yet up. Please make sure processes are up and running

bash-3.00# ps -ef | grep -i ipm
casuser 12414  3275   0 10:02:17 ?           0:27 CSCO.IPMProcess -cw /opt/CSCOpx -cw:jre lib/jre -server -Xms64m -Xmx192m -cp:p
casuser  3702  3275   0 20:07:33 ?           1:04 CSCO.IPMOGSServer -cw /opt/CSCOpx -cw:jre lib/jre -server -Xms64m -Xmx512m -cp:

Joe Clarke Thu, 01/07/2010 - 09:31

Yep, the IPM database is corrupt.  If you have a good backup of LMS, you can restore it, and hopefully that will provide you with a good IPM DB.  If not, you will need to reinitialize the IPM database with the following command:

/opt/CSCOpx/bin/dbRestoreOrig.pl dsn=ipm dmprefix=Ipm

dmitry Thu, 01/07/2010 - 11:31

Thanks for checking. I have monthly IPM exports, so the data loss is not a problem; I just need to recreate the collectors or import them from the "shadow" router.

Is there an upper threshold for the ipmDB size that Cisco would not recommend crossing?

Joe Clarke Thu, 01/07/2010 - 11:43

No, there is no limit.  The problem here is not the size per se.  It's the fact that the database became corrupt, and is now larger than the engine expects it to be.  It is perfectly normal to see multi-gigabyte databases if a lot of data retention is done.
