I am using the Composite Device Health EEM script, which sends an email notification once my interface or CRC errors reach a particular threshold. The problem is that I have to manually do a 'clear count' to get these emails to stop once the script has executed. I would like the script to execute, send the email notification, and then clear the counters automatically.
Is this possible to do with an action command? If so, I do not know where I would put it in the script. The script I am using can be found here: http://www.cisco.com/en/US/prod/iosswrel/ps6537/ps6555/ps9421/networking_solutions_products_genericcontent0900aecd80719ee6.html
Any help with this would be appreciated. I posted this in another thread, but it became locked. I have attached the .tar file to this post.
Edit: Well, the file didn't attach.
I don't understand what you mean by "clear count". Do you mean execute the command "clear counters"? The script will only run when a certain syslog message is seen (%SYS-4-FREEMEMLOW). Are you saying that this message is coming frequently, and you only want to run the script once per device boot?
Correct, I would like to clear the counters once the email has been sent. I currently have these event environments set:
event manager environment _email_server 67.XX.XX.XX
event manager environment _email_from firstname.lastname@example.org
event manager environment _email_to email@example.com
event manager environment input_errors_threshold 2000
event manager environment output_errors_threshold 2000
event manager environment crc_threshold 2000
event manager environment interface_reset_threshold 20
event manager environment interface_health_period 60
When, say, the input errors on an interface reach 2000, the email will be sent out in 60 seconds. I would like to be able to automatically clear the counters after this email is sent out.
action 1.0 cli command "clear counters" pattern "\[confirm\]"
action 1.5 cli command "confirm"
action 2.0 ...
I had downloaded the wrong script. Just so there is no more confusion, can you provide me the exact link you used to download the script you're using?
You should only get notifications when the delta value (i.e. the difference between the current poll and the previous poll) exceeds the configured threshold. So, if after three polls, the change in output errors between poll 2 and poll 3 is above your threshold, you will get a notification. However, if the delta between polls 3 and 4 is not above the threshold, no notification will go out. When you do a "show int", you will still see the total number of output errors.
OK, so if I set the output error threshold to 1000, then it will only send me an email when those errors reach 1000, then again at 2000, then at 3000, and so on?
For testing purposes I have the output threshold set to 2, but the output errors have reached 3 and I have not received an email yet. However, I did get one when it reached 5, and noticed this in the email:
Output errors delta: 5
Right, it's all about the change between polling cycles. It sounds like the first poll saw 0 errors, then the second poll saw 5 (even though you saw it grow to 2 then 3).
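The polling behavior described above can be sketched in Python. This is only an illustration of delta-based thresholding, not the code of the actual Tcl script; the function name polls_that_notify and the sample counter values are made up for the example.

```python
def polls_that_notify(samples, threshold):
    """Return (poll_index, delta) for each poll whose delta exceeds threshold.

    samples: cumulative error counter values, one per poll (hypothetical data).
    """
    notifications = []
    for i in range(1, len(samples)):
        # Change since the previous poll, not the running total.
        delta = samples[i] - samples[i - 1]
        if delta > threshold:
            notifications.append((i, delta))
    return notifications


# The counter was 0 at the first poll and 5 at the second; the intermediate
# values 2 and 3 fell between polls and were never sampled.
print(polls_that_notify([0, 5], threshold=2))      # [(1, 5)]

# A later poll that sees only one new error stays quiet.
print(polls_that_notify([0, 5, 6], threshold=2))   # [(1, 5)]
```

With a threshold of 2, the only notification comes from the poll that saw the counter jump from 0 to 5, which matches the "Output errors delta: 5" line in the email.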
I am still having issues with this. It may be because of something that I am doing wrong, but once the threshold is reached, I still get emails at whatever interval the interface health period is set to (event manager environment interface_health_period 600). Checking against the deltas doesn't seem to be working. I would just like to get one email when the threshold is reached, then be able to clear the counters before it happens again.
The problem I am having is this: if the threshold for input errors is set to 2000, I get an email once the threshold is reached, which is good, but then I get one every 600 seconds until the count is manually cleared back to 0. If the threshold is reached at night or on the weekend, I come back to work with 50 or more emails.
Am I just out of luck on this? Thank you for your previous help, Joe!
The only time you should be getting an email is when the delta value between the current poll and the previous poll is greater than the threshold. In this case, that would be if the delta value is greater than 2000 input errors. Are you seeing something else, or are you really getting 2000 new input errors every 600 seconds?
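To make the distinction concrete, here is a small Python sketch comparing a delta check with a check against the running total. It is not taken from the script and is not a diagnosis of it; the function names and counter values are hypothetical, and only the threshold of 2000 comes from this thread.

```python
THRESHOLD = 2000  # input_errors_threshold value used in this thread


def delta_check(previous, current, threshold=THRESHOLD):
    """Notify only if the counter grew by more than threshold since the last poll."""
    return (current - previous) > threshold


def total_check(current, threshold=THRESHOLD):
    """Notify whenever the running total is above threshold (hypothetical behavior)."""
    return current > threshold


# Hypothetical cumulative input-error counts at four consecutive polls.
polls = [0, 2001, 2002, 2003]
for previous, current in zip(polls, polls[1:]):
    print(current, delta_check(previous, current), total_check(current))
# 2001 True True
# 2002 False True
# 2003 False True
```

A delta check fires once, at the poll where the counter jumped, while a total check keeps firing on every poll until the counters are cleared.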
Correct. I understand how it is supposed to work, but what seems to be happening is that once I hit the 2000 threshold of input errors, I still get the email every 600 seconds, even if the errors have only made it to 2001 in those 600 seconds. So maybe the issue is in this statement: event manager environment interface_health_period 600.
If the interval at which the email goes out could be specified in the script code itself, then maybe I wouldn't even need the health period command?
No, the 600 seconds isn't at fault here. I'm not seeing the problem. Can you enable "debug event manager tcl commands" and "debug event manager tcl cli", then reproduce the problem? That will help me track this down.