We have two LMS deployments that we recently updated to LMS 3.0.1 and in particular updated DFM to DFM 3.0.3.
The two LMS deployments manage different segments of our network. On one, all the devices in DFM are in a Known state, but when we select a device, DFM displays its objects (e.g. power supply, fan, interface) and their instance names but says the status is not available. The same device types in our other instance of DFM are fine.
We knew right away we had a problem as soon as we realized we were getting no DFM alerts from the deployment with this problem. Has anyone seen this before?
The problem is on all devices in this one instance of DFM. Our other instance has the same types of devices with normal monitoring. We have two deployments of LMS, LMS-A and LMS-B, that were maintained in the past by numerous folks on an as-time-is-available basis, so I don't know when this issue started. But it is obvious right away that there is a problem when you see DFM with no alerts except some duplicate address alerts.
I ran the utility scripts and see that all IP interfaces are in a managed state. We also intend to uninstall DFM from our LMS servers and move it to separate servers, so I haven't opened a TAC case on this yet and may not unless it occurs on the new DFM box.
I am just curious whether it is a known problem. Anyway, here are some screenshots of it.
There is an internal bug for this, CSCsg51571, but it is not fully understood, and it is still open. It is believed to affect all versions of DFM 3.0, and there is currently no workaround.
That said, this shouldn't be too serious. Even though DDV is not reporting the correct values, events should still be generated for the devices. If you want to open a TAC case, they can walk you through checking the dmctl output to make sure the objects are properly managed.
You are amazing in your knowledge of this stuff! But now I am very nervous about this and whether it will occur on the standalone DFM servers that we are in the midst of deploying.
I also had a few node-down and interface-down occurrences that I know happened and that DFM should have caught, so I don't know if the bug report is correct when it says that the nodes are properly managed.
Thanks for the info.
Then you should have TAC walk you through using dmctl. If the devices are in a Known state, then DFM should be processing events for all managed objects (those with a managed state of TRUE). That said, there is one DFM bug that affects both Windows and Solaris. In this bug, devices may hang in a learning state after a rediscover. After about three hours, the devices will eventually become Questioned. Once a device is Questioned, events may not be generated for it.
There is a patch for this on Cisco.com at http://www.cisco.com/cgi-bin/tablebuild.pl/cw2000-dfm (bug is CSCsi01966).
I know how to use dmctl and asl, and you are correct. All the object property values are available but are not being displayed properly.
The node that I know was down the other day was a c3560 running version 12.2. In DDV its Device Type is listed as Switches and Hubs, but via dmctl I cannot find it under any of the classes Switch, uncertified, Hub, or RelayDevice. I am still looking for it. So my issue of not getting events on this node going down is probably independent of the CSCsg51571 issue. Thanks.
FYI, TAC resolved both of my problems. Unknown to me, someone had gone into Polling and Thresholds and disabled monitoring on just about all groups. Once we set them back to the defaults, we got two pages of alerts, and in addition the object property values were now visible in DDV.