I have installed LMS 2.6 and must say it is a huge improvement over 2.2. The only thing I am having trouble with is the Alerts in DFM. They are not clearing even after the issue has been resolved, so I have to clear them manually. For example, if a switch is unplugged from the network, the next polling cycle sees that the IP of the switch is offline and raises an alert. When the switch is back online and the polling cycle runs again, the alert is not cleared. Any ideas?
An Alert will not clear until all of the associated Events have cleared. If you drill into the Alert, what is the state of the enclosed Events?
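As a rough illustration of that rule (a toy model only, not DFM's actual implementation; the `Event` and `Alert` classes and their fields are hypothetical names), an Alert can be thought of as a container that counts as cleared only once every Event inside it has cleared:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    # Hypothetical stand-in for a DFM Event; only the cleared flag matters here.
    description: str
    cleared: bool = False


@dataclass
class Alert:
    # Hypothetical stand-in for a DFM Alert that groups Events for one device.
    device: str
    events: List[Event] = field(default_factory=list)

    def is_cleared(self) -> bool:
        # The Alert is considered cleared only if every enclosed Event is cleared.
        # Note: all() on an empty list is True, so an Alert with no Events
        # counts as cleared in this toy model.
        return all(event.cleared for event in self.events)


alert = Alert("switch-1", [Event("Unresponsive"), Event("HighUtilization", cleared=True)])
print(alert.is_cleared())  # one Event is still active
alert.events[0].cleared = True
print(alert.is_cleared())  # every Event has now cleared
```

So a single Event that is stuck in the active state is enough to keep the whole Alert open.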
The event_description said unresponsive. If I tried to ping the switch, sure enough, it was unresponsive. The IPStatus was TIMEDOUT. After we got the switch back up and could ping it, I expected that event to clear, but several days went by and it never did.
And you are now able to ping the IP address listed in the component field from the CiscoWorks server? For example, if you see the following as the Event Component:
The 10.1.1.1 address is the one DFM cannot ping.
If you can ping that address from the server, you might want to get a sniffer trace for all traffic to that switch for one of DFM's polling cycles to see if it's actually trying to talk to it.
Check the EPM log under NMSROOT/log/dfmLogs/EPM as well as the DFM.log under NMSROOT/objects/smarts/local/logs to see if there are any errors corresponding to that switch. We have seen unclearable events in the past, which is why we added the manual clear capability to 2.0.6, but I do not know of any pervasive bug that would prevent an unreachable event from clearing.
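If the logs are large, a small script can pull out the lines that mention the switch. This is only a sketch: the function name `find_device_lines` is made up for illustration, the log format is not assumed at all (it does a plain case-insensitive substring match), and the file paths you pass in are whatever the EPM log files and DFM.log are called on your server.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def find_device_lines(log_paths: Iterable[str], device: str) -> List[Tuple[str, int, str]]:
    """Return (file, line number, line text) for every line mentioning the device.

    `device` can be a hostname or an IP address; matching is a simple
    case-insensitive substring search, so no log format is assumed.
    """
    needle = device.lower()
    matches: List[Tuple[str, int, str]] = []
    for raw_path in log_paths:
        path = Path(raw_path)
        if not path.is_file():
            # Skip paths that do not exist rather than failing the whole scan.
            continue
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with path.open(errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if needle in line.lower():
                    matches.append((str(path), line_number, line.rstrip("\n")))
    return matches
```

Call it with the list of log files and the address from the Event's component field, for example `find_device_lines(["DFM.log"], "10.1.1.1")`, then look through the returned lines for errors or timeouts.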
Assuming the device in question is 160-gw.mtview.k6.slcsd.net, it appears the SNMP credentials in DCR are incorrect for this device. DFM cannot poll the device to clear the event.
I don't see any other errors that could explain why these events are not clearing on their own. You should open a TAC Service Request so additional debugging can be done.
You see these unresponsive events only for interfaces that are UP with unreachable IPs. The status of the event shows the IP status as timed out. I would ask you to bring down the interface.
There was a bug which was fixed in LMS 2.6.
There is no point in keeping an interface up when it does not have a reachable IP address.
So for the unresponsive event to clear, you would need to bring the interface down.
These unresponsive events are generated by ICMP ping. LMS 3.0 will bring a feature to deactivate ICMP ping on IP addresses according to user needs.
We are having the very same problem. Any news regarding those unresponsive alarms that don't clear?
One thing we did as a workaround was to just tick off informational alarms. But that's just a workaround...
About bringing the interfaces down: they already are, but CW still tries to ping them (in our case the ping doesn't work because there's no routing for those IP addresses).
How can I deactivate reachability checks for interfaces that CW doesn't have routing to?
This is not easily done in LMS 2.6. While this will be fixed in the upcoming LMS 3.0, for now you will need to open a TAC service request, and refer to the bug CSCsb48643.