On monitored devices I occasionally get a message like this:
WMIDiskDriveStatus Degraded Pred Fail Stressed Unknown Error Service NonRecover No Contact Lost Comm on host <machine name> (<IP>) at 2012-08-17 8:52:53 -0600 - Unknown access error occured. Verify that remote WMI access is configured and enabled on this device.
Then I will get a message like this:
WMIDiskDriveStatus Degraded Pred Fail Stressed Unknown Error Service NonRecover No Contact Lost Comm on host <machine name> (<IP>) at 2012-08-17 8:57:52 -0600 - WMI: Error, Service, NonRecover, No Contact, Lost Comm not present. Degraded, Pred Fail, Stressed, Unknown not present.
So I was wondering if we could have a filter applied to a WMI error: if the first event happens, it polls a second time (like it does now), and if the second event happens, it doesn't send an alert. If the second event doesn't happen, then we send the error from the first event.
My assumption is that this is caused by a machine starting up from a boot and the WMI service not yet responding properly.
I wrote this monitor, so I guess I'll chime in here. The WMI monitor works like most of the other monitors: once a problem is detected, the item being checked is re-checked after 60 seconds, and if it is still found bad, an event is generated. When I created the monitor, I only envisioned people using it to check critical infrastructure, not PCs that were likely to reboot frequently. Shows you what I know as far as how people will ultimately use the software.
I've committed a change to the next release that will delay the re-check of WMI-based monitors for 20 minutes. This should give plenty of time for reboots, even with the occasional kernel/software update or service pack being applied (hope I'm not being too optimistic there). For critical infrastructure I think you would want to have a separate ICMP monitor anyway, which would catch an offline host within just a few minutes.
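For anyone following along, the re-check behaviour described above can be sketched roughly as below. This is only an illustration, not the actual monitor code: the function name `check_with_recheck`, the `probe` and `wait` callables, and the `recheck_delay_s` parameter are all hypothetical, and the choice to report the error from the re-check rather than the first poll is an assumption.

```python
from typing import Callable, Optional


def check_with_recheck(
    probe: Callable[[], Optional[str]],
    wait: Callable[[float], None],
    recheck_delay_s: float = 20 * 60,
) -> Optional[str]:
    """Return an error message to alert on, or None if no alert is needed.

    probe: hypothetical callable that polls the device and returns an
           error string when something is wrong, or None when healthy.
    wait:  callable that pauses for the given number of seconds
           (time.sleep in real use, a stub in tests).
    """
    first_error = probe()
    if first_error is None:
        return None  # healthy on the first poll, nothing to do

    # Give the host time to finish rebooting before judging it.
    wait(recheck_delay_s)

    second_error = probe()
    if second_error is None:
        return None  # recovered in the meantime, suppress the alert

    # Still bad after the delay: report the most recent error (assumption).
    return second_error
```

In real use you would pass `time.sleep` as `wait`. A shorter delay, such as 5 minutes, would just be `recheck_delay_s=5 * 60`.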
I appreciate the response. For us, I like to use the WMI monitors on desktops and laptops as well, because we can have performance issues that originate at the desktop level. Even 5 minutes instead of 20 might be enough. I currently get about 5-10 false reports a day on this.
Also, I had included a couple of other WMI queries that would be good to have (and might help with IP address changes). Are they anywhere on the radar?