IPM - unable to add source in WebClient

Mar 21st, 2007

I have an LMS 2.5.1 installation (no DFM) on Solaris 9.

I open IPM > Client > WebClient Edit > Configure

and go to 'Source'.

When I add the information for a new source, it is displayed in the GUI, but the status button stays yellow forever.

Closing the config GUI and reopening it makes the source disappear from the table. Trying to re-add it gives a message that the source already exists, but it is not there. Even if I look directly into the IPM DB, I cannot find it.

Exporting the DB to a fresh install of IPM does not show this behaviour.

Adding a source from the CLI (ipm addsrc) gives the message 'Server not responding'.

How can I troubleshoot this? I cannot find any log file that gives me information on the problem.

Martin

Joe Clarke Wed, 03/21/2007 - 13:21

Bring up the standalone client, and launch the Message Log Client under File. Enable debug for ProcessManager, DataCollectionServer, and ConfigServer, then reproduce the problem. The ipm tshoot should have some details about what is going on.

Martin Ermel Thu, 03/22/2007 - 17:59

I enabled the debugs but cannot find a hint as to why the device is not added, and I can't find anything in the IPM troubleshooting section.

The source is already defined with a hostname and IP address (DNS resolution is OK) and is used for a collection.

The customer wants to add the same device with a different IP address (not in DNS). As this does not work on the production server, I tried to do exactly the same thing on a test machine (LMS 2.6, IPM with empty DB). I added the device as a source with its IP and DNS name, and added it a second time with another IP interface. No problem: both showed status green within one minute. So the device cannot be the problem (I also checked the ACL on the device and did an snmpwalk from the server).

Now, what I see in the debug is a time gap of 5 hours between the first occurrence of the IP in question and the second ...
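
A gap like this is easier to spot mechanically than by scrolling. The following is only a rough sketch: the function name find_gaps is made up for illustration, and it assumes that the first field of each log line is a time of day in HH:MM:SS form within a single day, which may not match the real IPM log layout. On Solaris, nawk or /usr/xpg4/bin/awk may be needed in place of awk.

```shell
# find_gaps MIN_SECONDS: read log lines on stdin and report every pause of
# at least MIN_SECONDS between consecutive timestamped lines.
# ASSUMPTION: field 1 of each line is a time of day such as 08:00:05.
# Lines that do not start with such a timestamp are skipped.
find_gaps() {
    awk -v min="$1" '
    $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
        split($1, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
        if (seen && now - prev >= min)
            printf "gap of %d s before line %d: %s\n", now - prev, NR, $0
        prev = now
        seen = 1
    }'
}
```

It could be used as `find_gaps 3600 < ipm_debug.txt`, where ipm_debug.txt is a hypothetical file holding the saved debug output.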

Hardware is a Sun Fire 440 with 2 CPUs and 7 GB RAM, running LMS 2.5.1 (no DFM) with 900 devices and HP NNM 7.5 with 1200 devices (currently I do not know the number of polled objects).

But the source still does not appear (meanwhile I closed the client and reopened it to verify whether it was added or not).

I put some lines from the log in the attachment.

Martin Ermel Fri, 03/23/2007 - 10:14

ipm.env is identical on both servers (verified with the diff command) and has the following content:

*************

#! /bin/sh

#****************************************************

# Copyright (c) 2004 Cisco Systems, Inc.

# All rights reserved.

#****************************************************

# If you want to tune your SNMP polling parameters

# comment in the following lines and edit

# Max value is 60, default is 5, min is 1

#IPM_SNMP_TIMEOUT=5

#export IPM_SNMP_TIMEOUT

# Max value is 5, default is 3, min is 1

#IPM_SNMP_RETRIES=3

#export IPM_SNMP_RETRIES

# Max value is 60, default is 5, min is 1

#IPM_SNMP_TIMEOUT_INCREMENT=5

#export IPM_SNMP_TIMEOUT_INCREMENT

# Set to 1 if the SAA probes in the router should

# start appearing in the running config and get Syslog messages

IPM_NVRAM_ENABLE=0

export IPM_NVRAM_ENABLE

# Set to 1 if SAA probes should set the managed interface as

# the source interface

IPM_USE_MANAGED_SRC_INTF_ADDR=0

export IPM_USE_MANAGED_SRC_INTF_ADDR

*************
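
For reference, this is roughly how the commented-out tuning lines above would look once activated. The values are illustrative only, picked from inside the ranges that the file's own comments state; nothing in this thread shows that SNMP timeouts are the cause of the problem.

```shell
# Illustrative values only, within the ranges stated in the ipm.env comments.
IPM_SNMP_TIMEOUT=10        # allowed range 1-60, default 5
export IPM_SNMP_TIMEOUT
IPM_SNMP_RETRIES=3         # allowed range 1-5, default 3
export IPM_SNMP_RETRIES
```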

I also tried to get more detailed info by setting the trace level in the Java Console to '5', but this is all I get when adding the source, just some pushes and pops:

network: Connecting socket://10.1.3.2:1784 with proxy=DIRECT

network: Connecting http://10.1.3.2:9088/gatekeeper.ior with proxy=DIRECT

network: Connecting http://10.1.3.2:9088/gatekeeper.ior with cookie "jsessionid=B9252D3D337C15EC7423C4B8B151F257; MICEcookie=B9252D3D337C15EC7423C4B8B151F257"

basic: Modality pushed

basic: Modality popped

basic: Modality pushed

basic: Modality popped

######

The customer has now told me that when he noticed the problem he uninstalled and reinstalled IPM and re-added the DB with the 'restore' command.

Currently I think it is a problem with other software on the server (NNM is currently stopped but NeDi is running), or something OS-related or file-permission-related, rather than an IPM problem.

Also, Solaris 'IP Network Multipathing' is configured and I am not sure whether this can cause any problems (gatekeeper.cfg has the IP entry of the primary NIC).

I also tried to truss 'ipm addsrc' but cannot find any hint of a problem (the output is not that easy to read).

Any suggestions?

Joe Clarke Fri, 03/23/2007 - 10:28

I see nothing of interest in the log, and the ipm.env is fine (they may want to eventually set IPM_USE_MANAGED_SRC_INTF_ADDR=1, though). Of course, the log is filtered, so there may be other information I'm missing. At this point, you should have this customer open a TAC SR as remote access may shed some more light on this problem.

Martin Ermel Fri, 03/23/2007 - 10:51

Yes, I snipped the log because of sensitive information from some routers.

I currently only have remote access to the server (VPN), and the native client forces me to meditate after each click...

Is there a CLI way to configure debug levels and read the log?

After I enabled debugging in the Java GUI, I used 'ipmviewlog' on the CLI to view the log, but I am not sure if the output is really the same as in the MsgLogView.

Can you confirm that they are identical?

When does the log get reinitialized (backed out): after changing log levels or after restarting IPM?

This would be of interest for a last test. If this fails, a TAC case will be opened or I will ask the customer to do a fresh install with LMS 2.6.

Joe Clarke Fri, 03/23/2007 - 10:56

Actually, I am now able to reproduce this. I am analyzing the problem now.

Joe Clarke Wed, 03/28/2007 - 07:38

We're still researching this problem. My server is still seeing this problem.

Joe Clarke Thu, 03/29/2007 - 08:45

This suggestion is going to suck, but it worked for me. My Solaris server had been up for 197 days, so I rebooted it just to see if it would help. It did. I am once again able to add collectors to IPM. Have your customer schedule some downtime to reboot the server, and see if the same happens for them.

On the plus side, IPM 4.0 is a complete rewrite to make it fit in the dmgtd umbrella. This will allow it to share more parts of Common Services, thus lowering its resource overhead, and making things easier to troubleshoot.

Martin Ermel Fri, 03/30/2007 - 08:52

I will let the customer reboot the server, although he told me he had already done so (uptime is 35 days). But as he runs NNM and NeDi along with LMS on this server, perhaps this is an issue.

What I am wondering about is this: using the interactive 'ipm addsrc' CLI to add a new source I get 'Server not responding', while trying to add an existing source I get 'Source already exists'.

If I try to add the new source again (after receiving 'Server not responding' on the first try), I then also get 'Source already exists'. So this entry must be somewhere in temporary data storage (file or RAM), and the IPM processes know of its existence and have access to this information, but it is *NOT* in the database.

Restarting the IPM processes wipes out this information, and trying to re-add the new source then gives 'Server not responding' once more ... strange ...

Joe Clarke Fri, 03/30/2007 - 09:08

I got the same thing. The problem is the source is wedged in ConfigServer, and thus the IPM processes have to be restarted before the same source can be attempted again. Only after the source is properly polled will it be added to the database.

Martin Ermel Wed, 07/18/2007 - 04:38

I hope I can bring this to an end ...

A new installation of LMS 2.6 with an import of the old DB shows the same behaviour. Further investigation seems to trace the source of the problem to a certain custom operation: it is a copy of 'DefaultIpEcho' with the following customized Packet Settings:

IP Qos Type: IP Precedence

IP Qos Setting: 1 (Low)

The operation is run several times with the same target but different sources (67 collectors are configured with this operation).

When the previously described problem exists, deleting all of these collectors and restarting IPM solves the issue.

Adding a few of these collectors lets everything run well, and currently it seems that when the customer additionally adds 2 certain sources the problem arises again within a day.

The sources are

Cisco 1721 with IOS 12.3(20)

Cisco 7206 VXR with IOS 12.3(9)

The target is

Cisco 4500 with IOS 12.2(24a), SAA Version 2.2.0; Responder is set to ON

It seems there is an issue with data collection on these collectors which puts some other IPM operations (of the IPM config server?) into a queue, including the SNMP poller that is necessary to add a new source and the logging process as well (it stops adding any new entries to the log).

If everything hangs, restarting IPM suddenly flushes some (pending?) output to the log file and afterwards sends a shutdown signal to the processes.

Are there any known problems with this config?

Martin Ermel Fri, 07/20/2007 - 07:23

Are there any known problems with the IPM snmpd on Solaris, or is there a way to debug this daemon?

If we've got the aforementioned collectors active, it seems that it is eating up RAM continuously. It looks like the data collection for these collectors is the source of the problem...
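
To put numbers on the memory growth, the resident set size of the suspect process can be sampled at intervals. This is a minimal sketch and not an IPM tool: the function name sample_rss is made up, the PID has to be looked up by hand because the exact process name of the IPM SNMP daemon is not given here, and it assumes a POSIX shell (on Solaris 9 that would be /usr/xpg4/bin/sh or ksh rather than /bin/sh).

```shell
# sample_rss PID COUNT INTERVAL: print COUNT lines of "HH:MM:SS rss",
# sleeping INTERVAL seconds between samples. The rss value is whatever
# ps reports for the process (kilobytes on Solaris and Linux).
sample_rss() {
    pid=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # "rss=" suppresses the column header; tr strips the padding.
        rss=$(ps -o rss= -p "$pid" | tr -d ' ')
        # Empty output means the process is gone (or ps failed).
        [ -n "$rss" ] || return 1
        echo "$(date '+%H:%M:%S') $rss"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

For example, `sample_rss 1234 60 60 > rss.log` would record one sample per minute for an hour, with 1234 standing in for the real PID. A value that only ever climbs while the collectors are active would support the suspicion above.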
