C3750, SNMP, MRTG, Vlan Interface Counters..

Jul 17th, 2010

This question HAS to have been asked and answered a thousand times by now, but I've tried for the last half hour to find that info and can't.

For years now I've just accepted that I can't get correct traffic counts on Vlan interfaces on C3750 switches by snmp polling with MRTG.

Has anyone out there either figured out how to do this or tracked down the reason why it's not possible?  I read one post that said the C3750 didn't support this.  But then I started thinking.  If it didn't support it then why is there an OID for it I can successfully poll?  I just get wrong information, not no information.  The count that it does give me seems to amount to the behavior of some kind of minimum traffic flow or keep alive activity, and the pattern doesn't seem to be affected much or at all by how much or little traffic is being carried by the Vlan.

Anyone out there that's already pursued an explanation/resolution to this issue? 

Thanks!

-John Jackson

Joe Clarke Sun, 07/18/2010 - 10:18

The VLAN SVI counters were never an accurate measurement of traffic on a particular VLAN on a particular switch.  Those counters correspond to the traffic hitting the L3 interface (e.g. management traffic).  If you want an accurate count of L2 VLAN traffic, add up the counters for all ports on that switch in a given VLAN.
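As a rough illustration of the "add up the port counters" suggestion, here is a minimal Python sketch. It assumes you have already polled two things into dictionaries: a port-to-VLAN map (for example from vmVlan in CISCO-VLAN-MEMBERSHIP-MIB) and the per-port octet counters (for example ifHCInOctets and ifHCOutOctets from IF-MIB). The function name, the dictionary layout, and the sample values are all hypothetical. It only makes sense for access ports, since a trunk port's counters cannot be split per VLAN this way.

```python
from collections import defaultdict

def vlan_octet_totals(port_vlan, port_octets):
    """Sum per-port octet counters by VLAN.

    port_vlan:   {ifIndex: vlan_id} for access ports (hypothetical input)
    port_octets: {ifIndex: (in_octets, out_octets)} from one polling cycle
    """
    totals = defaultdict(lambda: [0, 0])
    for if_index, vlan_id in port_vlan.items():
        if if_index not in port_octets:
            continue  # no counter sample for this port in this poll; skip it
        in_octets, out_octets = port_octets[if_index]
        totals[vlan_id][0] += in_octets
        totals[vlan_id][1] += out_octets
    return {vlan: tuple(pair) for vlan, pair in totals.items()}

# Made-up sample values, for illustration only
port_vlan = {10101: 10, 10102: 10, 10103: 20}
port_octets = {10101: (1000, 4000), 10102: (2500, 500), 10103: (700, 300)}
totals = vlan_octet_totals(port_vlan, port_octets)
print(totals)
```

Note that a frame switched between two ports in the same VLAN shows up once as input on one port and once as output on another, so the in and out totals are kept separate here rather than added together.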

johnpjackson Sun, 07/18/2010 - 10:35

Joe, thanks for your reply.  I'm starting to understand a little better now, from what you said.  I agree with you that what I must be getting a count of is the L3 management traffic addressed directly to the switch itself.  Is that correct?  Here's the thing though - I'm also using this switch as an L3 router, not just for L2 switching.  Why would the switch be returning a count of the L3 management traffic hitting the switch, and not the L3 user traffic being received for forwarding on to its next hop?  The switch HAS to be looking at the Vlan ID of each frame it's handling, and there is an L3 interface in the switch that the frame has to belong to.  I'm trying to understand if the problem is that Cisco had intended IF-MIB to work properly, and that's why all the OIDs are present and return data.  Perhaps they overlooked something though, didn't code it correctly, and/or the hardware they used turned out not to be able to do the job the way they expected, and they subsequently just walked away from it?  It's just so frustrating, because we have no way to count traffic for individual Vlans through a trunk.  It's L3 traffic, running through the trunk to/from an L3 SVI in the switch, and it seems deficient that we can't get counters for that.  Counters don't seem to be a technically challenging feature to provide.  They seem fundamental and basic, actually.  Sigh.

Joe Clarke Sun, 07/18/2010 - 20:10

On the 3750, the packets are switched in hardware and this is not counted in the VLAN SVI counters.  You will only see management and CPU-switched traffic there.  There is no way to get at the L3 counters on this platform.  The 6500 and 7600s support the CISCO-SWITCH-ENGINE-MIB which does expose these counters, however.

johnpjackson Sun, 07/18/2010 - 21:43

So, does anyone have any idea why, if the IF-MIB counters don't supply the correct count of the traffic that they're supposed to, Cisco has provided working OIDs for them at all?  What keeps getting me about this issue is that I keep hearing from everyone that this is simply a 'feature that is not supported' on this platform.  What I don't hear along with that, which I would expect, is an acknowledgement attributed to Cisco that yes, someone made a mistake, and that's why it doesn't work properly.  For Cisco to respond that way though seems like it would be opening itself up to the logical next thought - if it's broken, then fix it.  If Cisco knew the hardware wouldn't support this, why have they implemented the OIDs for it at all?  If, as Joe is saying, the problem is not that the counters don't exist, it's just that you can't get at them, why is that?  If they exist, what would be the reason for making it so you couldn't get at them?  This seems like such a small issue, so why am I making such a fuss about it?  Well, I'm just tired of accepting a vague explanation about the issue, which I've been hearing from people for years now.  I'd really like for someone to indulge my curiosity and hit me with the full, detailed explanation of how we got to this point of having these switches give essentially wrong information while Cisco's explanation has just been to say that's acceptable.  I don't think it's acceptable.  I just can't imagine I can really possibly bring about a change in that.

-John

Joe Clarke Sun, 07/18/2010 - 21:54

The problem is a hardware limitation of the platform.  It will not be fixed per se as this is how the architecture of this line of switches works.  Essentially, the first packet in a routed flow will be switched by the CPU (thus counted on the SVI).  Subsequent packets will be switched by CEF at the ASIC level, and thus will not be counted.
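To make the effect of that behavior concrete, here is a toy Python model. It is only a simplified sketch of the description above, not a representation of the actual 3750 forwarding path: it assumes exactly one CPU-switched (and therefore counted) packet per routed flow, with every later packet forwarded in hardware and left uncounted. The function name and the flow sizes are made up for illustration.

```python
def svi_vs_actual(flows):
    """flows: list of packet counts, one entry per routed flow.

    Returns (packets_counted_on_svi, packets_actually_routed) under the
    simplified assumption that only the first packet of each flow is
    CPU-switched and everything after it is forwarded in hardware.
    """
    counted = sum(1 for packets in flows if packets > 0)
    actual = sum(flows)
    return counted, actual

# Three light flows versus three heavy flows (made-up numbers)
light = svi_vs_actual([5, 8, 3])
heavy = svi_vs_actual([50000, 80000, 30000])
print(light)
print(heavy)
```

Under this model the counted value depends on the number of flows, not on how much traffic each one carries, which would be consistent with an SVI graph that barely moves as VLAN load changes.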

A similar symptom is documented in CSCec35100.  This bug was marked as not-to-be-fixed given the limitation in 3750 hardware.  Higher-end platforms do offer these counters.

johnpjackson Sun, 07/18/2010 - 22:48

I would say the best thing that Cisco could do about this bug would be to prune their MIB of the OIDs that return incorrect information.  I know, technically, it's arguable that the information is not incorrect.  I think when you look at what information is supposed to be returned, i.e., a count of all bits for the interface, then it's hard to argue this is correct behavior.  I think returning the wrong information, like this does, is even worse than returning no information.  I also wish Cisco would spell out in their documentation that this is a bug, not simply that they don't support it.  They obviously intended to support it or the code wouldn't have been put in there for it in the first place.  Simply saying it's unsupported is spin, making it sound like they chose up front not to provide the functionality.  What really happened is that they made a mistake, and then decided it wasn't worth their while to fix it, even though the customers who've bought this equipment from them would reasonably expect this feature to be available.  Yup, I'm totally frustrated and ticked off about this and having a pity party!
