Cisco Support Community

New Member

LMS 4.0 incorrectly clearing faults

I have a really annoying problem on LMS 4.0.

I monitor the interfaces on the switches.  When an interface goes down, LMS correctly reports the fault and it appears on the Monitor>Monitoring Tools>Fault Monitor screen.  However it will then seemingly randomly clear these faults (usually days later) even though the fault is still there.  If I go to the Fault Device Details screen, it shows Admin Status UP and Operational Status DOWN.

By changing the Managed State on that interface to false, applying the changes, changing it back to true and re-applying, the fault is picked back up again.

The exact same problem happens with failed power supplies (in this case the Fault Device Details screen shows the status as CRITICAL even after LMS has 'cleared' the fault from the Fault Monitor screen).

The switches in question are all 3750Xs.

Has anyone ever come across this problem and, if so, is there a fix?

4 REPLIES
Cisco Employee

LMS 4.0 incorrectly clearing faults

Ideally, once an interface fault is displayed as an alarm, it stays displayed until it clears, though it may take a back seat. If the interface recovers, it shows as alarm cleared, and the alarm recurs once the issue is back.

Unmanaging and re-managing an interface may bring the alert back. Alerts are polled every 30 seconds, and the data is refreshed if a change has occurred. The information is always updated every 6 minutes, regardless of whether changes are detected.

But an alert with only a single occurrence may move to old alerts and still be viewable via the history report.

All alerts in Fault Manager are kept for 31 days and then purged. I am not clear on exactly what problem you're facing.

Is it that the alert never comes back even though the fault is still present on the device?

-Thanks
Vinod
**Rating Encourages contributors, and it's really free. **

New Member

LMS 4.0 incorrectly clearing faults

Hi Vinod,

Correct, the alert will appear, and then at a later date disappear altogether even though the fault is still there.

The number of faults on the bar at the top of the screen also decrements when it incorrectly clears the alert.

I haven't kept notes of exact dates and times, but I can definitely say that some alerts will disappear within a week (while others remain constant).

Steve

Cisco Employee

LMS 4.0 incorrectly clearing faults

Did you try generating a history report from:

Reports > Fault and Events > Device Fault History

When LMS polling determines that the alarm has been in the Cleared state for 30 minutes or more (from the time of polling), the alarm expires and is removed from the Alerts and Activities display.
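As a rough illustration of the expiry rule described above, here is a minimal Python sketch. Only the 30-minute window comes from this post; the function name, the state strings, and the timestamp arguments are hypothetical and are not part of LMS.

```python
from datetime import datetime, timedelta
from typing import Optional

# The 30-minute window is the figure quoted in this post.
# Everything else here (names, state strings) is an assumption for illustration.
CLEARED_EXPIRY = timedelta(minutes=30)


def alarm_expired(state: str, cleared_at: Optional[datetime], polled_at: datetime) -> bool:
    """Return True if an alarm would drop off the display under the 30-minute rule."""
    if state != "Cleared" or cleared_at is None:
        # An alarm that is still Active is never expired by this rule.
        return False
    # Expire once the alarm has been Cleared for 30 minutes or more at poll time.
    return polled_at - cleared_at >= CLEARED_EXPIRY
```

Under this rule an alarm that is still Active should never leave the display, so an Active alarm that vanishes points at something other than normal expiry.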

Please make sure that no other users are clearing the alerts. It would help if you could provide a specific example, so that it can be analyzed for root cause.

Also, please note whether this behaviour is specific to any IOS version, device platform, line card, kind of interface, etc.

-Thanks
Vinod
**Rating Encourages contributors, and it's really free. **

New Member

Re: LMS 4.0 incorrectly clearing faults

Hi Vinod,

No-one else is clearing the faults as I am the only person that uses LMS.

The faults still exist on the switches, it's just that LMS clears the alarm from the 'Monitor> Monitoring Tools> Fault Monitor' screen.

Switches exhibiting the problem are WS-C3750X-24P-S and WS-C3750X-48PF-S.

I have attached a screenshot which shows that LMS recognises the port as DOWN even though it has cleared from the faults screen.  If I change the port's managed state to False, apply the changes, and then change it back to True and re-apply the changes, it re-appears on the faults screen and sends an email alert.

I have run a Device Fault History report as you suggested for a specific switch, covering the period since 24th Feb. I see the Active alerts from when I forced it on 24th Feb, and there is no Cleared entry. The next entry is an Active one from when I re-forced the alert this morning, as it had disappeared from the alerts screen.
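To make the pattern in the history report easier to spot, here is a minimal Python sketch that flags consecutive Active entries with no Cleared entry between them. It assumes the report has been copied out by hand into a chronological list of (timestamp, state) pairs; that format and the function name are hypothetical, not something LMS produces.

```python
from typing import List, Tuple


def find_unexplained_reactivations(entries: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (earlier_active, later_active) timestamp pairs with no Cleared entry between them.

    `entries` is assumed to be in chronological order, with each state being
    either "Active" or "Cleared" (a hypothetical, hand-built format).
    """
    gaps = []
    last_active = None
    for timestamp, state in entries:
        if state == "Active":
            if last_active is not None:
                # A second Active with no Cleared in between: the alarm left the
                # display without the history recording why.
                gaps.append((last_active, timestamp))
            last_active = timestamp
        elif state == "Cleared":
            last_active = None
    return gaps
```

Any pair it returns corresponds to the situation described above: the alert had to be forced back even though the history never recorded it as Cleared.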

IOS on this specific switch is c3750e-universalk9-mz.150-1.SE3.bin

Steve
