Best practices for SNMP and Syslog

Unanswered Question
Nov 8th, 2009

Good Morning,

What are some best practices for getting started with SNMP and Syslog?

Is it bad to have both enabled on the switch, or does it not matter?

Thanks

Leo Laohoo Sun, 11/08/2009 - 13:40

Enabling SNMP and Syslog is a very good practice. But you know what is even better? DAILY review of the logs!

In my experience, I have never come across a NOC that practices daily review of the syslogs. They just look at the SNMP alarms and that's it. I have caught a number of major issues by reviewing syslogs daily.

Joseph Clarke Mon, 11/09/2009 - 20:31

Actually, when dealing with best and leading practices, the opposite is true in the long run. That is, one should manage by exception rather than try to comb through pages of logs each day. Eventually your mind will become numb, and you may miss important events.

Instead, start by building a baseline of normal messages. You may choose to do this over a typical two-week period (i.e. two weeks devoid of holidays). This way, you get an idea of the types of messages you see, and especially the message severities. Then, you build exception rules. Messages that fall outside of the established norm are flagged, and reported to your operators. Of course, one should always pay close attention to sev 0, 1, and 2 messages as those are usually quite severe.

Your baseline should always adapt, too. That is, as time goes on, you may notice a new pattern emerge. After careful auditing, you find that the new messages showing up (or the lack of old messages you used to see) are the new norm. Your baseline should change to accommodate this.
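The baseline-and-exception approach described above could be sketched roughly as follows. This is only an illustration, assuming Cisco-style `%FACILITY-SEVERITY-MNEMONIC` messages. The function names and the `%EXAMPLE-4-NEWTYPE` message are made up for the example and do not come from any particular product.

```python
import re
from collections import Counter

# Cisco-style message code, e.g. %LINK-3-UPDOWN:
# facility LINK, severity 3, mnemonic UPDOWN.
MSG_RE = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+)")


def build_baseline(lines):
    """Count each (facility, mnemonic) pair seen during the baseline period."""
    counts = Counter()
    for line in lines:
        match = MSG_RE.search(line)
        if match:
            counts[(match.group(1), match.group(3))] += 1
    return counts


def find_exceptions(lines, baseline, always_flag_max_sev=2):
    """Return lines that are severity 0-2 or whose type is not in the baseline."""
    flagged = []
    for line in lines:
        match = MSG_RE.search(line)
        if not match:
            continue
        severity = int(match.group(2))
        key = (match.group(1), match.group(3))
        if severity <= always_flag_max_sev or key not in baseline:
            flagged.append(line)
    return flagged


# Sample data only; a real baseline would cover about two weeks of logs.
baseline_lines = [
    "%LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up",
    "%SYS-5-CONFIG_I: Configured from console by admin on vty0",
]
today_lines = [
    "%SYS-5-CONFIG_I: Configured from console by admin on vty0",
    "%SYS-2-MALLOCFAIL: Memory allocation of 1024 bytes failed",
    "%EXAMPLE-4-NEWTYPE: made-up message that is not in the baseline",
]

baseline = build_baseline(baseline_lines)
flagged = find_exceptions(today_lines, baseline)
for line in flagged:
    print(line)
```

Adapting the baseline then amounts to rebuilding `baseline` from a newer period once the new messages have been audited and accepted as normal.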

As for SNMP, the suggestion to use SNMPv3 is a good one. However, even today it is not always possible to use v3 as not every management platform supports it. If you have to go with community string-based SNMP, choose a hard-to-guess community string, and use views and access-lists to limit the polling to certain required MIB branches, and from certain SNMP managers.

A good article on securing SNMP can be found at:

http://www.cisco.com/en/US/tech/tk648/tk362/technologies_tech_note09186a0080094489.shtml

par13@psu.edu Tue, 11/10/2009 - 04:52

Hi,

I entered these basic commands, which at least get me some information from the switch. The syslog server is getting some notifications. However, they do not tell me what the error is or what caused it.

logging trap alerts

logging 10.1.1.2

snmp-server location KC-218B

snmp-server enable traps snmp authen

snmp-server enable traps envmon

snmp-server enable traps syslog

snmp-server host 10.1.1.2 SW-AD

Joseph Clarke Tue, 11/10/2009 - 08:57

You're only sending very high severity messages to your syslog server, which may not give you a complete picture of what is going on on the device. However, if you're seeing a specific message, and you're experiencing a specific problem, what is the message, and what are the symptoms?
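To illustrate why `logging trap alerts` is so restrictive: the trap level is a threshold, and only messages at that severity number or lower are forwarded. The sketch below models that rule with the standard syslog severity names. The helper functions are hypothetical and exist only for this example.

```python
# Syslog severity keywords in order; the list index is the numeric severity.
SEVERITIES = [
    "emergencies",    # 0
    "alerts",         # 1
    "critical",       # 2
    "errors",         # 3
    "warnings",       # 4
    "notifications",  # 5
    "informational",  # 6
    "debugging",      # 7
]


def forwarded_levels(trap_level):
    """Return the severity names forwarded for a given 'logging trap' level."""
    threshold = SEVERITIES.index(trap_level)
    return SEVERITIES[: threshold + 1]


def is_forwarded(message_severity, trap_level):
    """True if a message of this numeric severity passes the threshold."""
    return message_severity <= SEVERITIES.index(trap_level)


print(forwarded_levels("alerts"))
print(is_forwarded(3, "alerts"))         # e.g. %LINK-3-UPDOWN
print(is_forwarded(3, "informational"))
```

With the level set to `alerts`, a severity 3 message such as `%LINK-3-UPDOWN` never reaches the syslog server. A level such as `informational` would let it through.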

par13@psu.edu Wed, 11/11/2009 - 12:46

One of the issues is that I keep getting a message about a port going up and down. I checked the device connected to the port, which is a network printer.

I have set the port to 100FD, 100HD, and Auto. The SNMP alarm keeps coming in, reporting the port going up and down.

Another message is this one, which does not make any sense to me.

Local0 2009-11-11 172.31.13.90 community=public, enterprise=1.3.6.1.4.1.11.2.14.12.1, uptime=8366117, agent_ip=172.31.13.90, generic_num=6, specificTrap_num=5, specificTrap_name=hpicfCommonTraps.5, version=Ver1, hpicfFfLogFaultType.2=3, hpicfFfLogAction.2=2, hpicfFfLogSeverity.2=medium, hpicfFfFaultInfoURL.0.2=http://172.31.13.90/cgi/fDetail?index=2
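A trap line like this is easier to read once it is split into fields. The sketch below assumes the `key=value, key=value` layout shown in the message. The function name is made up for the example, and other trap receivers may format their output differently.

```python
def parse_trap_fields(text):
    """Split 'key=value, key=value' text into a dict.

    Only the first '=' separates key from value, so URLs that
    contain '=' stay intact.
    """
    fields = {}
    for part in text.split(", "):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


trap_text = (
    "community=public, enterprise=1.3.6.1.4.1.11.2.14.12.1, "
    "uptime=8366117, agent_ip=172.31.13.90, generic_num=6, "
    "specificTrap_num=5, specificTrap_name=hpicfCommonTraps.5, "
    "version=Ver1, hpicfFfLogFaultType.2=3, hpicfFfLogAction.2=2, "
    "hpicfFfLogSeverity.2=medium, "
    "hpicfFfFaultInfoURL.0.2=http://172.31.13.90/cgi/fDetail?index=2"
)

fields = parse_trap_fields(trap_text)

# Generic trap 6 in SNMPv1 means "enterprise specific", so the
# enterprise OID and specific trap number identify the event.
print(fields["agent_ip"])
print(fields["enterprise"])
print(fields["hpicfFfLogSeverity.2"])
print(fields["hpicfFfFaultInfoURL.0.2"])
```

The `enterprise` value starts with 1.3.6.1.4.1.11, the private enterprise number registered to Hewlett-Packard, and the last field is a URL on the sending device that points to the fault details.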

Joseph Clarke Wed, 11/11/2009 - 13:38

It could be that it is the printer side which is experiencing the problem, and not the switch. The trap details you have here come from an HP device, not a Cisco device.

par13@psu.edu Wed, 11/11/2009 - 14:04

Is there a way to filter these messages? I'm only concerned with switch hardware failures, etc.

Joseph Clarke Thu, 11/12/2009 - 11:07

I don't know what management software you're using. I'm also not sure if there is a way to disable these traps on the HP device (there probably is). Typically, all trap managers support a way of filtering certain traps.
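How the filter is configured depends on the trap manager, but the underlying idea can be sketched as a match on the enterprise OID. The example below keeps only traps under the Cisco enterprise branch (1.3.6.1.4.1.9). The function names, the second and third sample traps, and their addresses are made up for illustration.

```python
CISCO_ENTERPRISE = "1.3.6.1.4.1.9"


def under_oid(oid, prefix):
    """True if oid equals prefix or sits below it, compared on whole arcs."""
    return oid == prefix or oid.startswith(prefix + ".")


def keep_trap(trap, allowed_prefixes=(CISCO_ENTERPRISE,)):
    """Keep a trap only if its enterprise OID is under an allowed branch."""
    return any(under_oid(trap["enterprise"], p) for p in allowed_prefixes)


traps = [
    # The HP trap from the thread.
    {"agent_ip": "172.31.13.90", "enterprise": "1.3.6.1.4.1.11.2.14.12.1"},
    # Illustrative values only.
    {"agent_ip": "10.1.1.1", "enterprise": "1.3.6.1.4.1.9.9.13.3"},
    {"agent_ip": "10.1.1.5", "enterprise": "1.3.6.1.4.1.99.1"},
]

kept = [t for t in traps if keep_trap(t)]
for t in kept:
    print(t["agent_ip"], t["enterprise"])
```

Comparing on whole arcs matters: a plain string prefix test would wrongly treat 1.3.6.1.4.1.99 as part of the 1.3.6.1.4.1.9 branch.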

ohassairi Sun, 11/08/2009 - 21:09

Try to use SNMPv3 for security reasons.

Some syslog messages are critical. You should not wait 24 hours to review them. Try to have them sent immediately by mail to your inbox.

Some syslog servers and devices can do this.
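For a syslog server without built-in mail alerts, the idea could be sketched with Python's standard `email` and `smtplib` modules. The addresses, the SMTP host name, and the severity cutoff of 2 are placeholders chosen for this example, not values from the thread.

```python
import re
import smtplib
from email.message import EmailMessage

# Pulls the severity digit out of a Cisco-style %FACILITY-SEV-MNEMONIC code.
SEVERITY_RE = re.compile(r"%[A-Z0-9_]+-([0-7])-[A-Z0-9_]+")


def build_alert(line, max_severity=2,
                sender="syslog@example.com",
                recipient="noc@example.com"):
    """Return an EmailMessage for severe lines, or None for everything else."""
    match = SEVERITY_RE.search(line)
    if not match or int(match.group(1)) > max_severity:
        return None
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Syslog severity {match.group(1)} alert"
    msg.set_content(line)
    return msg


def send_alert(msg, smtp_host="mail.example.com"):
    """Hand the message to an SMTP relay (placeholder host; not called here)."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)


alert = build_alert("%SYS-2-MALLOCFAIL: Memory allocation of 1024 bytes failed")
ignored = build_alert("%SYS-5-CONFIG_I: Configured from console by admin on vty0")
print(alert["Subject"])
print(ignored)
```

Building the message is kept separate from sending it, so the severity rule can be checked without a mail server.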

hobbe Thu, 11/12/2009 - 09:14

Excellent question!

Here is my view:

Log as much as you possibly can.

Log everything if possible. It is not a bad idea to have lots and lots of logs on the day something needs to be fixed or audited.

You can never have too many logs!

Some things in the logs are important to know almost immediately, such as break-ins reported in the firewall logs or a server misbehaving. Other things are not needed right now. But maybe there was a problem a couple of days ago and you only hear about it today from a complaining user. It is good to have something to go back to, so you can find out whether the user was right or wrong and, if there is a problem such as a hacker attack, actually find out where it originated and what was targeted.

tip 1: Protect your syslog server (maybe with an ASA in transparent mode?). It is a prime target for hackers! Never forget that.

tip 2: A good syslog server will be able to filter alarms to you at different levels. Try Kiwi Syslog Server, it's a nice one.

tip 3: Syslog compresses very well, i.e. gzip/zip shrinks the syslog file to one-tenth of its original size or less.

tip 4: Grep is your friend!
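Tips 3 and 4 can be combined: compressed logs can still be searched. The sketch below uses Python's standard `gzip` and `re` modules on synthetic, highly repetitive sample data, so the ratio it prints says nothing about real logs. The function names are made up for the example.

```python
import gzip
import re


def compress_log(text):
    """Compress log text in memory with gzip."""
    return gzip.compress(text.encode("utf-8"))


def grep_compressed(blob, pattern):
    """Return the lines of a gzip-compressed log that match a regex."""
    regex = re.compile(pattern)
    text = gzip.decompress(blob).decode("utf-8")
    return [line for line in text.splitlines() if regex.search(line)]


# Synthetic sample: 2000 link up/down messages spread over 24 ports.
sample = "\n".join(
    f"Nov 11 12:{i % 60:02d}:00 10.1.1.1 %LINK-3-UPDOWN: "
    f"Interface FastEthernet0/{i % 24 + 1}, "
    f"changed state to {'up' if i % 2 else 'down'}"
    for i in range(2000)
)

blob = compress_log(sample)
ratio = len(blob) / len(sample.encode("utf-8"))
down_events = grep_compressed(blob, r"FastEthernet0/7, changed state to down")

print(f"compressed to {ratio:.1%} of the original size")
print(len(down_events), "matching lines")
```

For log files on disk, `gzip.open` reads a `.gz` file directly, so old logs can stay compressed and still be searched.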

SNMP? YES! Manage your switches and learn the patterns and how they work. You will start out looking at several errors, dropped packets and so on, but that is just stuff you have missed before. Just start sorting things out, and with a little luck you will have a quick and happy network that runs smoothly.

You know the user who comes in complaining that the internet is slow, or that server x is slow, and so on.

Isn't it great to slap him over the face with the graphs, telling him that at the time he supposedly had the problem his computer was not on, the network had an average response time of 0.7 ms with a peak of 1.2 ms, and server x had an average network load of 4 Mbit transmit and 2 Mbit receive?

Hmm, I wonder if this could be the reason we don't hear "oh, that has to be a network problem" anymore from the guys who used it as their favorite explanation for why their software didn't work.

Leo Laohoo Sun, 11/15/2009 - 17:44

I used to work for EDS, and when I had a stint (aka punishment) at the NOC, I was assigned to trawl through the syslogs every morning. Someone is always responsible for this task, and that sorry bugger had better come up with a very good explanation if something is playing up and wasn't picked up in the morning trawl of the syslog.

Posted November 8, 2009 at 4:46 AM