I have a Cisco 7204VXR router with an NPE-G1 that has been acting weird for the past 24 hours. This router is dual-attached to two 6k5 routers using multimode fiber and SX GBICs.
The router is used as an LNS termination point for PPPoVPDN sessions; we have a bunch of them.
Here is the show ver output:
Cisco IOS Software, 7200 Software (C7200-ADVENTERPRISEK9-M), Version 12.2(33)SRD, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2008 by Cisco Systems, Inc.
Compiled Thu 23-Oct-08 12:58 by prod_rel_team
ROM: System Bootstrap, Version 12.3(4r)T1, RELEASE SOFTWARE (fc1)
BOOTLDR: 7200 Software (C7200-KBOOT-M), Version 12.3(5a), RELEASE SOFTWARE (fc1)
LNS1.IX1 uptime is 21 weeks, 23 hours, 22 minutes
System returned to ROM by reload at 13:36:53 UTC Wed Nov 25 2009
System restarted at 13:39:45 UTC Wed Nov 25 2009
System image file is "disk0:c7200-adventerprisek9-
>> CPU utilization for five seconds: 32%/28%; one minute: 37%; five minutes: 36%
Most of the CPU is used at interrupt level (28% out of 32% total), which means most of the CPU time goes into switching packets in software rather than into processes. This could be caused by the specific role of the device, or it can be a problem.
use the following link as a reference for troubleshooting this:
About output drops:
they are few in comparison with the total number of output packets.
Because you have CBWFQ, you can use sh policy-map interface gi0/2 to see which traffic class suffers drops.
>> Output queue: 311/1000/0 (size/max total/drops)
This is interesting: there are a lot of packets sitting in the queue for the traffic offered, but no packets are dropped here (drops=0). They may be dropped by CBWFQ instead, so again sh policy-map int gi0/2 can be helpful.
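As a rough sketch of where I would start looking (standard IOS show commands; gi0/2 is taken from your output, so adjust it if another interface is involved):

```
! Which processes, if any, are using the CPU (the rest is interrupt level)
show processes cpu sorted
! Packets per switching path (Processor vs Route cache) on the interface
show interfaces GigabitEthernet0/2 stats
! IP-level counters, including packets received with options or bad headers
show ip traffic
! Per-class queue depth and drops, if a policy is attached
show policy-map interface GigabitEthernet0/2
```

Taking each output twice, a minute or so apart, and comparing the counters should show which path is growing.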
Hope to help
As I have said before, the equipment in question is used as an LNS in order to terminate DSL links (PPPoVPDN).
The interesting thing is that the problem stopped on this equipment and moved to another identical one.
As for your diagnosis, I have come to the same conclusion. I think one of my clients' links is sending huge amounts of malformed packets to the router, probably from zombie PCs infected with a virus or something like that.
It's not a matter of DATA volume because the gigabit link is far from being overloaded. It's not a matter of number of packets because my graphs show smooth curves. It's more a problem of crappy packets.
I have to try to pinpoint the source link that is causing the packets to be process switched instead of being fast switched.
Is there any magical command to protect my CPU from this? Or something that could help me pinpoint the source of the packets? Some said NetFlow...
Another thing is that my interface changed its queueing strategy from FIFO to CBWFQ by itself !?! Is there any magical command to go back to FIFO? I think no fair-queue under the interface does the trick.
Thanks for the feedback.
>> Another thing is that my interface changed its queueing strategy from FIFO to CBWFQ by itself !?
Is there really no service-policy output configured on gi0/2? That would be strange.
I agree with your analysis: malformed packets may be the cause of the problem.
If it is supported on your devices, you could think of using control plane policing (CoPP) to rate-limit IP packets with options in the header, in order to protect the main CPU.
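To double-check, something along these lines should show what is actually configured on the interface and which queueing strategy it reports. The last part is only the no fair-queue you suggested, written out; I have not verified that it brings your box back to FIFO:

```
show running-config interface GigabitEthernet0/2
show interfaces GigabitEthernet0/2 | include Queueing
!
configure terminal
 interface GigabitEthernet0/2
  no fair-queue
 end
```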
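To pinpoint the source, NetFlow as you mention is a reasonable option. A minimal sketch, assuming gi0/2 is the interface where the suspect traffic enters (on an LNS you may need it on the subscriber-facing side instead) and keeping in mind that NetFlow itself costs some CPU:

```
configure terminal
 interface GigabitEthernet0/2
  ip flow ingress
 exit
 ! Top talkers: 10 entries sorted by packets are arbitrary example values
 ip flow-top-talkers
  top 10
  sort-by packets
 end
!
show ip cache flow
show ip flow top-talkers
```

A source with a very high packet count and small packets would be the first candidate to look at.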
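A rough sketch of what that could look like, not a tested configuration. The names ACL-IP-OPTIONS, CM-IP-OPTIONS and PM-COPP are made up, the 8000 bps rate is only a placeholder, and whether the option keyword in extended ACLs and the control-plane command are available depends on your IOS release:

```
! Match any IP packet that carries IP options
ip access-list extended ACL-IP-OPTIONS
 permit ip any any option any-options
!
class-map match-all CM-IP-OPTIONS
 match access-group name ACL-IP-OPTIONS
!
! Placeholder rate: tune it after observing normal traffic
policy-map PM-COPP
 class CM-IP-OPTIONS
  police 8000 conform-action transmit exceed-action drop
!
control-plane
 service-policy input PM-COPP
```

You can first set both actions to transmit and watch the counters with show policy-map control-plane before you start dropping anything.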
Hope to help
No, no service policies configured or applied on my interfaces !?! Weird huh !!!
As for CoPP, I will have to do some research about it... This needs to be studied before being implemented...