High CPU on 6513 with Sup720-3B

Jun 25th, 2008

Hello,


I have two 6513s with dual Sup720s.

One of the switches has been at 95-96% CPU for more than 3 days.


I have taken a show proc cpu and a show proc cpu history, and worked through the diagnostics in

http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00804916e0.shtml

but I can't find anything.


Here is the show proc cpu history:

1111111111111111 11111111111111111

9957382228226922942294299999900000000000000009999999900000000000000000

8994087697789187136762949989900000000000000009999999900000000000000000

100 ** * ******##############**********################

90 ** * * * * * **#*##################***##***#################

80 #* * * * * * **#############################################

70 ## * * * ** * * **#############################################

60 ##** * * ** * * *##############################################

50 ##*# * * ** * * *##############################################

40 ##*# * * ** ** ** *##############################################

30 ##*#********************##############################################

20 ##*#********************##############################################

10 #######*****##***#**##################################################

0....5....1....1....2....2....3....3....4....4....5....5....6....6....7.

0 5 0 5 0 5 0 5 0 5 0 5 0


CPU% per hour (last 72 hours)

* = maximum CPU% # = average CPU%


And here is the show proc cpu sorted:

C6513-LTP1SL7-01#show proc cpu sorted

CPU utilization for five seconds: 82%/77%; one minute: 86%; five minutes: 87%

PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process

9 20000572 28477236 702 2.39% 0.53% 0.51% 0 ARP Input

118 80264904 754410308 106 0.79% 0.78% 0.72% 0 IP Input

171 3799940 5226005 727 0.31% 0.23% 0.22% 0 CEF process

122 5001564 685722 7293 0.23% 0.17% 0.16% 0 Adj Manager

333 431608 26387383 16 0.23% 0.04% 0.00% 0 CEF RP IPC Backg

162 136288752 412771070 330 0.23% 0.30% 0.35% 0 DHCPD Receive

119 35436 319760 110 0.15% 0.01% 0.00% 0 MOP Protocols

114 576 297 1939 0.15% 0.11% 0.04% 1 SSH Process

113 1504580 4068144 369 0.15% 0.04% 0.04% 0 CDP Protocol

314 11376536 383070773 29 0.15% 0.24% 0.31% 0 Standby (HSRP)

124 1804984 3655920 493 0.15% 0.05% 0.05% 0 ARP HA

312 6214852 7147511 869 0.15% 0.20% 0.23% 0 MLSM Process

309 1194832 10882599 109 0.07% 0.05% 0.03% 0 IGMP Input

174 249384 4166130 59 0.07% 0.01% 0.00% 0 L3 Manager
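
A note on reading the first line of this output: in IOS the two figures in "82%/77%" are total CPU and the share spent at interrupt level. Only about 5% is therefore process-level work, which fits with no single process standing out in the list. A minimal Python sketch of that arithmetic follows; the helper name split_cpu is made up for illustration.

```python
import re

def split_cpu(line):
    """Return (total, interrupt, process) percentages from the header line
    of 'show processes cpu'. Hypothetical helper, for illustration only."""
    m = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if m is None:
        raise ValueError("not a 'show processes cpu' header line")
    total, interrupt = int(m.group(1)), int(m.group(2))
    # Process-level load is whatever is left after interrupt-level work.
    return total, interrupt, total - interrupt

header = ("CPU utilization for five seconds: 82%/77%; "
          "one minute: 86%; five minutes: 87%")
total, interrupt, process = split_cpu(header)
print(f"total={total}% interrupt={interrupt}% process={process}%")
```

A high interrupt share like this usually points at traffic being switched by the CPU rather than at a misbehaving process.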


Version is s72033-ipservicesk9_wan-mz.122-18.SXF10a.bin


Why is the CPU loaded at 95-96%?


Thank you in advance.

a.alekseev Wed, 06/25/2008 - 07:57

Maybe you have an STP loop...


Use

sh int | i rate

and look for interfaces with a high packets-per-second (PPS) rate.
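
If the output is long, the check can be scripted. The sketch below is only an illustration: it assumes the output was saved from "sh int | i line protocol|rate" (so that interface names are kept), the 50000 pps threshold is arbitrary, and the interface names and counters in the sample are made up.

```python
import re

# Matches e.g. "GigabitEthernet1/1 is up, line protocol is up (connected)"
IFACE_RE = re.compile(r"^(\S+) is .*line protocol is")
# Matches e.g. "  5 minute input rate 1000 bits/sec, 2 packets/sec"
RATE_RE = re.compile(r"(input|output) rate (\d+) bits/sec, (\d+) packets/sec")

def high_pps(text, threshold=50_000):
    """Return (interface, direction, pps) for every rate line at or above
    the threshold. The threshold is an assumption; tune it for your network."""
    current = None
    hits = []
    for line in text.splitlines():
        m = IFACE_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = RATE_RE.search(line)
        if m and int(m.group(3)) >= threshold:
            hits.append((current, m.group(1), int(m.group(3))))
    return hits

# Made-up sample, not taken from the switch in this thread.
sample = """\
GigabitEthernet1/1 is up, line protocol is up (connected)
  5 minute input rate 412000000 bits/sec, 96000 packets/sec
  5 minute output rate 3000 bits/sec, 4 packets/sec
GigabitEthernet1/2 is up, line protocol is up (connected)
  5 minute input rate 1000 bits/sec, 2 packets/sec
  5 minute output rate 2000 bits/sec, 3 packets/sec
"""
print(high_pps(sample))
```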

DWAM_2 Wed, 06/25/2008 - 22:07

Thank you for your answer.

I will do this in the morning.

DWAM_2 Wed, 06/25/2008 - 23:50

Hello,


There have been no topology changes on the network.


Regards.

a.alekseev Thu, 06/26/2008 - 10:55

Do you have BGP running on the router?

Could you describe your router, what services are running?

DWAM_2 Fri, 06/27/2008 - 05:36

Hello,


Thank you for your answer.

There is no BGP on the router.

I opened a case with Cisco and the engineer resolved my problem (very quickly!).

I am posting the important information below, with the IP addresses anonymized.


Action 1

From current show tech I can see that no interface is dropping traffic and also that lots of traffic is sent to the CPU to be software switched.

First of all we need to understand the nature of the traffic sent to the CPU.

Regarding the trigger, can you confirm that this time, too, the high CPU occurred after mls qos was disabled?

Action Plan:

Please get me the following outputs:

Show clock

Show mls qos

Show process cpu sorted | e 0.00%

Sh ibc

Sh ip traffic (taken 3 times with ~30 seconds interval)

Show interface switching (taken 3 times with ~30 seconds interval)

Show interfaces

If the CPU was still high during these captures (ONLY IF THE CPU WAS HIGH), please also perform a netdr capture:

Debug netdr capture rx

After 30 seconds: undebug netdr capture

show netdr captured-packets - Please send me all the packets you capture this way.


Action 2

It seems that some traffic in vlan 5 (quite a bit) is being routed to the same interface it was received from.

Before giving you a possible solution I would like to verify my findings.

Please send me following outputs:

Sh ip route A.B.C.D

Sh ip cef A.B.C.D det

Sho mls cef lookup A.B.C.D

Sh ip route A.B.C.D

Sh ip cef A.B.C.D det

Sho mls cef lookup A.B.C.D

Sh ip route A.B.C.D

Sh ip cef A.B.C.D det

Sho mls cef lookup A.B.C.D

In any case traffic is always either sourced or destined from/to subnet A.B.C.D/24.

It would be great if you could also get me another CPU capture, this time in the other direction (the previous capture was of traffic going TO the CPU; now I would like to see the traffic coming back FROM the CPU).

deb netdr clear-capture (to clear the capture buffer)

Debug netdr capture tx

And again after a few seconds “show netdr captured-packets”

NOTE: Please perform this netdr capture only after you have verified that the CPU is still high, by running again:

SWITCH6513#Show process cpu sorted | e 0.00%


Action 3

Actually all the prefixes are in the routing table.

You have a static route for X.Y.Z.A/24 and a default route for the other prefixes.

All of them point to next hop A.B.C.D in same vlan.

Therefore packets enter the router in vlan5 through GiX/Y and exit in vlan5 through Gi5/16. High CPU is expected under this condition.

All the captures you sent me confirmed my findings.
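
To illustrate the condition described here: a packet that arrives in vlan5 and is routed to a next hop that is also in vlan5 leaves through the same VLAN it came in on, which is the case where a router would normally consider sending an ICMP redirect. Below is a rough Python sketch of that check. All addresses are hypothetical documentation prefixes (the real ones were anonymized in this thread), and the route list is only a guess at the shape of the configuration.

```python
import ipaddress

# Hypothetical stand-ins for the anonymized addresses in this thread.
VLAN5_SUBNET = ipaddress.ip_network("192.0.2.0/24")
NEXT_HOP = ipaddress.ip_address("192.0.2.254")
ROUTES = [
    (ipaddress.ip_network("198.51.100.0/24"), NEXT_HOP),  # static route
    (ipaddress.ip_network("0.0.0.0/0"), NEXT_HOP),        # default route
]

def next_hop_for(dst):
    """Longest-prefix match over ROUTES; returns the next hop or None."""
    dst = ipaddress.ip_address(dst)
    matches = [(net, nh) for net, nh in ROUTES if dst in net]
    if not matches:
        return None
    return max(matches, key=lambda route: route[0].prefixlen)[1]

def exits_on_ingress_vlan(dst, ingress_subnet=VLAN5_SUBNET):
    """True when the next hop lies in the subnet the packet arrived on."""
    nh = next_hop_for(dst)
    return nh is not None and nh in ingress_subnet

for dst in ("198.51.100.10", "203.0.113.5"):
    print(dst, "->", next_hop_for(dst),
          "same VLAN:", exits_on_ingress_vlan(dst))
```

With every route pointing at a next hop inside the ingress subnet, every destination takes this in-and-out path, which matches what the engineer found.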

You can lower CPU by issuing the command: mls rate-limit unicast ip icmp redirect 0.

That should solve this High CPU condition.

Anyway the question would be “why does the traffic follow this path”? Is this expected? Is it your design that requires that?

As you can see this kind of routing is not optimal.

Action plan:

Configure “mls rate-limit unicast ip icmp redirect 0” and check CPU utilization afterwards.


I want to congratulate the Cisco engineer who found the solution.


Best regards.

rpratapa Thu, 06/26/2008 - 11:50

Hi DWAM,

1. The CPU can run high if you have a long list of ACLs configured on the switch. Check whether you can reduce the ACLs.


2. Try configuring "ip cef" globally and on the interface carrying high traffic. This can help.


HTH


-rk

DWAM_2 Fri, 06/27/2008 - 05:37

Hello rpratapa,


Thank you for your answer. You will find the solution in my previous reply.


Best regards.
