4500: High CPU

Feb 5th, 2010

Hi all,

We have a C4500 with two Sup V supervisors that sometimes shows high CPU values.

When this occurs, the switch shows 95-100% total CPU for a few hours at a time.

show processes cpu shows the process Cat4k Mgmt LoPri with an unusually high CPU value of 65%.

The show platform health command gave a high reading for K2FibFC DelFlow: about 50% at the peak, where it is normally around 1%.

The peak seems to show up randomly and stays for a few hours.

I've read the document http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml

But none of the common causes mentioned there seems to apply.

Does anyone know what the process K2FibFC DelFlow means and what could be the cause of this peak?

Thanks in advance.

sachinraja Fri, 02/05/2010 - 08:46

Hi Dennis

Can you please post the "show proc cpu", "show platform health" and "show log" outputs from your switch?

Have you enabled anything new on your switch? When did this start happening? Did you change the topology of your network? K2FibFC could have something to do with the Forwarding Information Base (CEF etc.). Any pointers on that? Does your "show log" show any suspicious activity?

Raj

DennisV99_2 Sat, 02/06/2010 - 06:05

Hi Raj,

See the attachments for the sh platform health output at the moment of the CPU peak. I haven't got an output from sh processes cpu at the time of the peak. The BPDU guard block in the logging at line 1675 was 'expected'.

The logging output is from a few minutes ago. The last CPU peak occurred yesterday (5 Feb); it started around 1 pm and dropped around 5 pm.

Yes, we made some minor topology changes, but it doesn't make sense to me that those would cause high CPU for only a few hours and not all the time.

I don't know when this started. I first noticed it a few weeks ago, when I turned on SNMP traps for high CPU. The one remarkable thing I remember now is that the last peak before yesterday was also on a Friday afternoon, at the same hours.

I did some basic troubleshooting, like checking STP for unusual things, but it seems that besides K2FibFC every process has a normal CPU load.

In what direction should I look for the Forwarding Information Base? Does it have something to do with IP routing?

Giuseppe Larosa Sat, 02/06/2010 - 08:11

Hello Dennis,

Have a look at the following document about high CPU on C4500 switches:

http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml

K2Fib: FIB (Forwarding Information Base) management

>> K2FibFC DelFlow revi   2.00  51.43     10      8  100  500   80  74   53  552:56

I will take a look at your log files.

Edit:

Sorry, I didn't realize you had already found the doc about high CPU usage.

The issue is related to maintenance activity on the CEF tables.

How many routes are in the IP routing table? You can check with sh ip route summary.

Also, do you have NetFlow enabled?

The name of the process suggests that it is trying to delete flow entries from a table.

Hope to help

Giuseppe

DennisV99_2 Sat, 02/06/2010 - 09:22

Hi Giuseppe,

I don't think NetFlow is enabled. I tried show ip cache ? and the command show ip cache flow doesn't exist, so I think it is not enabled.

I'm new to the term CEF, and right now I'm reading my way through CEF troubleshooting documents and articles so I can pinpoint the problem the next time this CPU peak occurs.

Is there a way to see what causes the activity in the CEF tables?

Giuseppe Larosa Wed, 02/10/2010 - 04:47

Hello Dennis,

I would recommend that you open a Cisco TAC service request, because the existing documentation does not help in this case.

Hope to help

Giuseppe

rmedvedev Tue, 03/02/2010 - 01:41

In my case the same thing happens when I enable PBR on one of the interfaces:

K2FibFC DelFlow revi   2.00  63.77     10      8  100  500   78  75   10  2034:06

I have not found any additional information on this issue
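
For anyone who wants to spot this kind of row automatically, here is a minimal Python sketch that scans saved "show platform health" text for processes running well above their target. It assumes that the first two numeric columns after the process name are the target and actual %CPU, which is how the rows quoted in this thread read; the function name and the 2x threshold are hypothetical choices, not anything from Cisco.

```python
import re

# Assumption: the first two decimal columns after the process name are
# target %CPU and actual %CPU. An optional ">>" quote prefix is skipped.
LINE_RE = re.compile(
    r"^\s*(?:>>\s*)?(?P<name>.+?)\s+(?P<target>\d+\.\d+)\s+(?P<actual>\d+\.\d+)(?:\s|$)"
)

def over_target(output, factor=2.0):
    """Return (name, target, actual) for rows whose actual %CPU
    exceeds factor * target %CPU. The factor is an arbitrary choice."""
    hits = []
    for line in output.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # header lines or anything that is not a process row
        target = float(m.group("target"))
        actual = float(m.group("actual"))
        if actual > factor * target:
            hits.append((m.group("name"), target, actual))
    return hits

# Row copied from the post above.
sample = "K2FibFC DelFlow revi   2.00  63.77     10      8  100  500   78  75   10  2034:06\n"

for name, target, actual in over_target(sample):
    print(f"{name}: target {target}%, actual {actual}%")
```

How the text gets off the switch (console log, scheduled SSH capture, and so on) is left out on purpose.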
DennisV99_2 Fri, 03/05/2010 - 11:28

I think I'm a step closer to the solution.

Last week the CPU on our 4500 was about 15% higher than usual for some days (not 99% for a few hours, like the last 3 times).

This time, too, the same K2FibFC DelFlow process caused the higher CPU.

While the CPU was high, I found that the ARP table showed some unusual behaviour: lots of ARP requests (300 per second), and lots of entries marked "Incomplete".

After searching for the source of the ARP requests with Wireshark, I found a host in one of the VLANs that was generating a lot of packets with spoofed source IP addresses, which caused this high number of ARP requests.

After I disconnected the host by disabling its switchport, the problem seems to be solved.

Does anyone know of a monitoring tool to watch an ARP table for unusual activity, or e.g. a VB script to count the entries in an ARP table?
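
One possible approach, sketched in Python rather than VB: count the total and "Incomplete" entries in saved "show ip arp" output and flag the table when too many are incomplete. This is only a sketch under assumptions. The sample rows are made up for illustration, the function names and both thresholds are hypothetical, and collecting the output from the switch is not covered.

```python
def summarize_arp(output):
    """Count total and incomplete entries in 'show ip arp' style text."""
    total = 0
    incomplete = 0
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with "Internet"; the header row starts with "Protocol".
        if len(fields) < 4 or fields[0] != "Internet":
            continue
        total += 1
        # The fourth column holds the MAC address, or "Incomplete".
        if fields[3].lower() == "incomplete":
            incomplete += 1
    return total, incomplete

def looks_suspicious(total, incomplete, max_ratio=0.2, min_incomplete=50):
    """Flag the table when incomplete entries are both numerous and a
    large share of the table. Both thresholds are arbitrary guesses."""
    if total == 0:
        return False
    return incomplete >= min_incomplete and incomplete / total > max_ratio

# Made-up sample rows, for illustration only.
sample = """\
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  10.1.1.1                -   0000.0c07.ac01  ARPA   Vlan10
Internet  10.1.1.5                0   Incomplete      ARPA
Internet  10.1.1.6                0   Incomplete      ARPA
"""

total, incomplete = summarize_arp(sample)
print(f"{incomplete} of {total} ARP entries are incomplete")
```

Running it periodically and comparing the counts over time would show whether the incomplete entries spike together with the CPU.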

rmedvedev Mon, 03/15/2010 - 02:46

Hi!

I checked the CPU-bound traffic with an analyzer, but did not find any suspicious activity.
I found that with PBR the switch uses the flow switching model, and the K2FibFC DelFlow revi process
simply removes expired flow cache entries. I currently have about 900 MAC addresses on this switch.

According to Cisco documentation:
The Catalyst 4500 switching engine supports matching a “set next-hop” route-map action with a packet on a permit ACL.
All other route-map actions, as well as matches of deny ACLs, are supported by a flow switching model.
In this model, the first packet on a flow that matches a route-map is delivered to the software for forwarding.
Software determines the correct destination for the packet and installs an entry into the TCAM so that future packets on that flow are switched in hardware.
The Catalyst 4500 switching engine supports a maximum of 4096 flows.

sh platform software ip flow summary
IrmFlows in use: 4094 free: 2 max: 4096
aging timeout: 300.000000 seconds.
adjs in use: 4094 out of 4096
cam entries in use: 4094

In any case, this has no effect on performance, and I do not know whether to open a TAC case.
It seems that this is normal, bearing in mind the well-known document at http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml.
Or am I wrong? What do the experts say?
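
To keep an eye on how close the flow table is to its limit, the summary output above can be parsed with a few lines of Python. This is a minimal sketch that assumes the "IrmFlows in use: ... free: ... max: ..." line keeps the format shown in this post; the function name is hypothetical.

```python
import re

# Matches the first line of "sh platform software ip flow summary"
# as pasted in this thread.
FLOW_RE = re.compile(r"IrmFlows in use:\s*(\d+)\s+free:\s*(\d+)\s+max:\s*(\d+)")

def flow_usage(output):
    """Return (in_use, maximum, fraction_used), or None if the line is missing."""
    m = FLOW_RE.search(output)
    if m is None:
        return None
    in_use, _free, maximum = (int(g) for g in m.groups())
    return in_use, maximum, in_use / maximum

# Output copied from the post above.
sample = """\
IrmFlows in use: 4094 free: 2 max: 4096
aging timeout: 300.000000 seconds.
adjs in use: 4094 out of 4096
cam entries in use: 4094
"""

in_use, maximum, fraction = flow_usage(sample)
print(f"{in_use}/{maximum} flows in use ({fraction:.1%})")
```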

JevgenijSabaliauskas Mon, 03/05/2012 - 03:29

Hi

I have Cat4500.

Avg CPU was about 20-25%

After I enabled PBR on all VLANs, CPU rose to about 50-55%:

K2FibFC DelFlow revi   2.00  33.04 - this process became HIGH

Very strange, because I have lots of free TCAM masks:

  sh platform software ip flow summary
  IrmFlows in use: 295 free: 3 max: 4096
  aging timeout: 300.000000 seconds.
  adjs in use: 295 out of 4096
  cam entries in use: 295

Not sure what could be done on this...

Jevgenij

nkarpysh Mon, 03/05/2012 - 05:10

Just my 2 cents about PBR on the 4500. Hope this helps and clarifies:

On the cat4k, most of the route-map actions, especially the deny statements, are supported by a flow-switching model. In this model, the first packet on a flow that matches a route-map will be delivered to the CPU for forwarding. Software determines the correct destination for the packet and installs an entry into the TCAM so that future packets on that flow are switched in hardware. The Catalyst 4500 switching engine supports a maximum of 4096 flows. When the number of entries increases or decreases, you will see an increase in the process "K2FibFC DelFlow", as this process creates/deletes the flow entries.

Another thing is that using "set ip default next-hop" can also send packets to software.

So check these details against your config: if you have a number of deny statements, those can be the reason for the higher CPU, as that process needs to install and remove those entries.
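
To make the flow-switching model described above a bit more concrete, here is a toy Python simulation. It is not how the switch is implemented. It assumes that entries age out after being idle for the timeout (300 seconds in the outputs posted earlier) and that a new flow arriving when the table is full simply stays in software; both are guesses for illustration, and the class and counter names are hypothetical.

```python
class FlowCache:
    """Toy model: the first packet of a flow goes to software, which then
    installs an entry so later packets of that flow are hardware-switched."""

    def __init__(self, max_flows=4096, aging_timeout=300.0):
        self.max_flows = max_flows
        self.aging_timeout = aging_timeout
        self.last_seen = {}  # flow key -> time of the last packet
        self.punts = 0       # packets handled in software (CPU work)
        self.deletes = 0     # expired entries removed (DelFlow-like work)

    def expire(self, now):
        # Assumption: an entry expires after being idle for aging_timeout.
        stale = [k for k, t in self.last_seen.items()
                 if now - t >= self.aging_timeout]
        for key in stale:
            del self.last_seen[key]
        self.deletes += len(stale)

    def packet(self, flow, now):
        self.expire(now)
        if flow in self.last_seen:
            self.last_seen[flow] = now
            return "hardware"
        self.punts += 1
        # Assumption: when the table is full, the flow is not installed
        # and every packet of it keeps going to software.
        if len(self.last_seen) < self.max_flows:
            self.last_seen[flow] = now
        return "software"

cache = FlowCache(max_flows=2, aging_timeout=300.0)
path = [cache.packet(f, t) for f, t in
        [("a", 0), ("a", 1), ("b", 2), ("c", 3), ("c", 4), ("a", 400)]]
print(path, "punts:", cache.punts, "deletes:", cache.deletes)
```

In this model, many short-lived flows (or a full table) mean constant install and delete work, which is the pattern the posts above describe for K2FibFC DelFlow.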

Nik
