We are having a problem where the CPU is being driven to over 90% at times and we can't even get at this box; the high CPU process is "ip input". This box has all DFC3 cards in it. Under what circumstances does IP traffic get forwarded to the CPU when you have DFC cards installed? Does anyone have ideas on how to track something like this down? I have checked all links for errors and everything is clean. I looked at spanning tree and don't see an issue there. It is hard to get any info off the box when this is happening. If you look at any of the interfaces, the traffic is not that high; these are gig links down to access-layer boxes that are trunked. Frankly, I am running out of ideas on what to do with this, so any suggestions are appreciated.
There are quite a few reasons traffic may be punted to the MSFC for process switching. Some include features not supported in hardware, IP options, TTL = 1, and ICMP unreachables/redirects. Here is a link with a more complete list:
The best way to determine what is causing the process switching is to dump the packet buffers on the interface(s) seeing the large volume of process-level traffic. You can do this one of two ways:
1. "show interfaces switching" and look for the interfaces with an increasing IP Process counter.
2. "show interfaces" - this is probably easier. Look at the input queue for drops or for packets actually sitting in the queue.
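For reference, the first option looks roughly like this (a trimmed, illustrative sample; the interface name and counter values here are made up). The Process row under the IP protocol section is the one to watch:

Router# show interfaces gigabitethernet 1/1 switching
GigabitEthernet1/1
          Throttle count          0
        Protocol  IP
          Switching path    Pkts In   Chars In   Pkts Out  Chars Out
               Process      1507609  210345021          0          0
          Cache misses            0          -          -          -
                  Fast            0          0          0          0

Run it a couple of times; an IP Process counter that climbs quickly between runs points at the interface feeding the punted traffic.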
Vlan10 is up, line protocol is up
Hardware is EtherSVI, address is 00d0.0061.040a (bia 00d0.0061.040a)
Description: VLAN10: Uplink
Internet address is xxx.xxx.xxx.xxx/xx
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters never
Input queue: 33/75/3385/3367 (size/max/drops/flushes); Total output drops: 0
In this case we have 33 packets in the queue, and we are seeing drops and flushes due to the amount of process-level traffic. Once I have this information, I can dump those packets by entering "show buffers input-interface vlan10 dump". This dumps the packets from the queue so you can take a look at what's in there. It should look like this:
Router# show buffers input-interface vlan10 dump
Buffer information for Small buffer at 0x437874D4
data_area 0x8060F04, refcount 1, next 0x5006D400, flags 0x280
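The rest of the dump decodes the packet headers, which is where you identify the offending traffic. A trimmed, illustrative continuation (the addresses and values below are made up for the example):

linktype 7 (IP), enctype 1 (ARPA), encsize 14, rxtype 1
if_input 0x61855B14 (Vlan10), if_output 0x0 (None)
datagramstart 0x8060F8A, datagramsize 60, maximum size 308
source: 10.1.1.23, destination: 10.2.2.5, id: 0x0000, ttl: 1, prot: 17

A ttl of 1 like this, for instance, would explain the punt: the hardware cannot generate the ICMP time-exceeded message, so the packet has to go to the MSFC.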
Hi Anthony, great info on the buffer dumping. What I need is a way to get to the box when the CPU gets above 90%; it is all but unusable via telnet when that happens. I need to be able to dump those buffers while it is busy. When the problem is not occurring, the Sup720 runs at less than 5% all the time. The code level is 12.2(18)SXD4. I did try some of the commands you described and didn't see too much there. If there is a way to allocate more cycles to the vty sessions, let me know. Thanks a lot...
You can use "remote command" or "remote login switch" to get onto the supervisor. Is the switch remote from your current location? I would recommend logging in through the console port to capture the information instead of telnet. However, you can also try the "scheduler allocate" command. Try "scheduler allocate 3000 1000", where 3000 is the interrupt time and 1000 is the process time.
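A minimal sketch of both suggestions (the 3000/1000 values are the ones mentioned above, not a recommendation; check the defaults for your release before changing them):

Router# configure terminal
Router(config)# scheduler allocate 3000 1000
Router(config)# end
Router# remote login switch
Switch-sp#

"remote login switch" attaches you to the switch processor, where you can run "show proc cpu" and the buffer dumps even when vty access through the route processor is sluggish.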
I'm seeing the exact same symptom on our two new Cat6513s with Sup720 running 12.2(17d)SXB10. Both 6513s are trunked to a 6509 with Sup2/MSFC2. CPU utilization on both 6513s is at 99%, the 6509 at 1%. Network utilization across the GB trunks is ~50%. The interesting part is that after I disabled all ports except the trunk ports on all three switches, the CPU and network utilization remained the same, and when I span/sniff the trunk ports on the 6509 I see 'phantom' WINS host announcement broadcast traffic from devices no longer connected to the switch. Fortunately, this issue reared its head in the testing phase before going to production. I have a case opened with our reseller, but it's not too reassuring that you're seeing the same issue with a later IOS release.
Just because you are seeing high CPU does not make this an IOS issue. Was your issue at the interrupt or process level? In this case it's not IOS; it's the fact that a type of traffic is being sent to the device that is not supported in hardware. There are a lot of reasons for this: it could be a TTL with a value of 0 or 1 where we need to send an ICMP unreachable/time-exceeded in response, it could be a feature that the Sup720 doesn't support in hardware, or other traffic-related issues. The fact is, without seeing the "show proc cpu" from your case, the type of traffic being punted, and the rest of the information, you are blindly comparing all issues to the one you are seeing and assuming it's due to a defect. I would be happy to look into your case if you are still seeing it. Shoot me an email and I will request some information from you, and we will try to narrow down why the packet is being punted.
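If the capture does show punts like TTL failures or no-route unreachables, the Sup720 can rate-limit those punts in hardware so the MSFC stays reachable while you investigate. A sketch, assuming 12.2SX "mls rate-limit" syntax (the 100 pps rate and 10-packet burst values are illustrative, not recommendations):

Router(config)# mls rate-limit unicast ip ttl-failure 100 10
Router(config)# mls rate-limit unicast ip icmp unreachable no-route 100 10

This caps how many of those packets per second reach the CPU. It does not fix the root cause, so treat it as a safety valve while you track down the traffic source.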