Ask the Expert: Troubleshooting Tools to Analyze High CPU Utilization Issues on Cisco Catalyst 6500 Series Switches

Jan 17th, 2012

With Souvik Ghosh

Welcome to the Cisco Support Community Ask the Expert conversation. This is an opportunity to learn and ask questions about advanced troubleshooting tools for debugging high CPU utilization on Cisco Catalyst 6500 Series Switches with Cisco expert Souvik Ghosh. You can ask questions about troubleshooting a Cisco Catalyst 6500 running in native mode with a Supervisor Engine 720 or Supervisor Engine 32, running in native mode with a Supervisor Engine 2, or running in hybrid mode.

Souvik Ghosh is a customer support engineer at the Cisco Technical Assistance Center in Bangalore, India. He has three and a half years of experience in LAN switching technologies, and his areas of expertise include LAN switching products such as the Cisco Catalyst 6500, 4500, 3750, and 2960 Series Switches. He has been involved in various escalation requests from India, Singapore, and Australia and is currently working as a technical lead for the LAN switching team in Bangalore. He holds CCNP and CCIP certifications.

Remember to use the rating system to let Souvik know if you have received an adequate response. 

Souvik might not be able to answer each question due to the volume expected during this event. Remember that you can continue the conversation on the Network Infrastructure sub-community discussion forum shortly after the event. This event lasts through January 27, 2012. Visit this forum often to view responses to your questions and the questions of other community members.

You can also read the questions he answered during the live event in this FAQ Document. You can also review the Live Webcast video with Souvik who gave a presentation on this topic.

ivan.petrov51 Thu, 01/19/2012 - 13:47

Hi Souvik,

My 7600 router is showing high CPU due to IP Input. Can you point me to a document that would help me troubleshoot this? I am running IOS 12.4.

Also, can you tell me under which circumstances IP Input would show a high percentage of CPU usage?

Thanks,

Ivan

souvghos Thu, 01/19/2012 - 17:55

Hi Ivan,

High CPU due to IP Input is most likely caused by normal data packets hitting the CPU. You can collect the output of "debug netdr cap rx" followed by "sh netdr cap" (both commands are safe to run in a production network) to find out which packets are hitting the CPU. Look for a trend in those packets. If you need further assistance, please attach the "show netdr cap" output along with the "show tech" output from the switch.
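Looking for a trend in the captured packets amounts to counting which header values repeat most often. Below is a minimal Python sketch of that idea. It assumes the interesting fields have already been copied out of the capture by hand into simple records. The record layout, the field names (src_ip, dst_ip, src_vlan) and the sample values are hypothetical and are not the netdr output format.

```python
from collections import Counter

def top_values(packets, field, n=3):
    """Return the n most common values of one header field."""
    return Counter(p[field] for p in packets).most_common(n)

# Hypothetical records, as if already extracted by hand from the capture.
packets = [
    {"src_ip": "10.1.1.5", "dst_ip": "10.2.2.1", "src_vlan": 10},
    {"src_ip": "10.1.1.5", "dst_ip": "10.2.2.1", "src_vlan": 10},
    {"src_ip": "10.1.1.5", "dst_ip": "10.2.2.9", "src_vlan": 10},
    {"src_ip": "10.3.3.7", "dst_ip": "10.2.2.1", "src_vlan": 30},
]

for field in ("src_ip", "dst_ip", "src_vlan"):
    print(field, top_values(packets, field))
```

A single source, destination, or VLAN that dominates the counts is usually the first thing worth investigating.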

BTW, are you sure you are running 12.4 on a 7600?

regards,

Souvik

ivan.petrov51 Fri, 01/20/2012 - 11:34

Souvik,

Thank you for your advice. I will try this. Indeed I meant to say 12.2SRB version.

Thanks,

Ivan

burleyman Fri, 01/20/2012 - 12:32

Souvik,

I have 4500s and 6500s in my environment and would like to ask a few things.

What would you consider a normal range for the CPU utilization to run in?

At what level of utilization should you start to take some preventive action to prevent further degradation?

Now let's say I have a switch that all of a sudden starts running at 80% to 100% for an extended period. What are some of the first commands I can run to help find the problem? What are some debug commands I can run that will not harm the flow of data on production switches? And what are some commands I should NOT run during production hours?

Thanks Mike

souvghos Fri, 01/20/2012 - 18:37

Hi Mike,

The normal average CPU utilization of 4500 switches is around 30-40%, and that of the 6500 is between 0% and 15%. However, CPU utilization depends on the nature of the applications your network supports. Some application traffic cannot be forwarded in the supervisor's hardware and needs software forwarding. If the average CPU utilization is above 15% on a 6500 switch, find out which traffic or process is consuming CPU cycles. If that traffic is required in the network, then that level is your benchmark CPU utilization.

If you find that the average CPU utilization is well above your benchmark, that is the time to troubleshoot the cause. The tools available for troubleshooting a CPU utilization problem depend on the SUP and the OS you are running. The tools are discussed in the PDF available at the following link.

https://supportforums.cisco.com/docs/DOC-21945
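The benchmark idea above can be sketched in a few lines of Python. This is only an illustration: the function name, the sample readings and the 10-point margin are assumptions chosen for the example, not values recommended in this thread.

```python
from statistics import mean

def above_baseline(samples, current, margin=10):
    """True if `current` CPU % is more than `margin` points above the
    mean of the earlier `samples` (the benchmark for this switch)."""
    baseline = mean(samples)
    return current > baseline + margin

# Hypothetical five-minute averages collected while the switch was healthy.
healthy = [12, 15, 14, 13]

print(above_baseline(healthy, 45))  # well above the benchmark
print(above_baseline(healthy, 18))  # within the margin
```

The point is that the threshold comes from your own switch's normal behavior rather than from a fixed number.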

Regards,

Souvik

intercommerz2 Sat, 01/21/2012 - 11:29

Hi Souvik,

Please explain what I did wrong. I need to set up a remote-access connection. For those who do not know, much has changed in 8.4.

This configuration is from the docs on cisco.com:

hostname(config)# interface ethernet0
hostname(config-if)# ip address 10.10.4.200 255.255.0.0
hostname(config-if)# nameif outside
hostname(config-if)# no shutdown
hostname(config)# crypto ikev1 policy 1
hostname(config-ikev1-policy)# authentication pre-share
hostname(config-ikev1-policy)# encryption 3des
hostname(config-ikev1-policy)# hash sha
hostname(config-ikev1-policy)# group 2
hostname(config-ikev1-policy)# lifetime 43200
hostname(config)# crypto ikev1 outside
hostname(config)# ip local pool testpool 192.168.0.10-192.168.0.15
hostname(config)# username testuser password 12345678
hostname(config)# crypto ipsec ikev1 transform-set FirstSet esp-3des esp-md5-hmac
hostname(config)# tunnel-group testgroup type remote-access
hostname(config)# tunnel-group testgroup general-attributes
hostname(config-general)# address-pool testpool
hostname(config)# tunnel-group testgroup ipsec-attributes
hostname(config-ipsec)# ikev1 pre-shared-key 44kkaol59636jnfx
hostname(config)# crypto dynamic-map dyn1 1 set ikev1 transform-set FirstSet
hostname(config)# crypto dynamic-map dyn1 1 set reverse-route
hostname(config)# crypto map mymap 1 ipsec-isakmp dynamic dyn1
hostname(config)# crypto map mymap interface outside

hostname(config)# nat (inside,outside) source static any any destination static 192.168.0.0 192.168.0.0 route-lookup
hostname(config)# write memory


With this config a client connects and is assigned an address from the pool, but it cannot see local resources. Please tell me what is missing.

souvghos Sat, 01/21/2012 - 18:01

Hi Slava,

I am not sure this is the right forum for your question. Please post it in the "Security" forum.

Regards,

Souvik

burleyman Mon, 01/23/2012 - 05:48

Souvik,

Here is the info for my SUP.

Supervisor Engine 720 10GE (Active)    VS-S720-10G

MSFC3 Daughterboard         VS-F6K-MSFC3

MSFC3 Daughterboard         VS-F6K-MSFC3

I looked through the document and it was good, but could you explain what you would do first when you see the CPU go above the baseline?

What commands would you run, and what would I look for to help find the problem?

What debug commands would be helpful, and what do I look for in the output?

What debug commands can be run during production, and which should not be run until after hours?

If I SPAN the CPU, what should I look for to find the problem?

Mike

souvghos Mon, 01/23/2012 - 22:36

Hi Mike,

Since you have a SUP720 in your 6500 chassis, you have more options for troubleshooting a high CPU utilization issue than with older SUPs. Here are the steps you can try in order to start troubleshooting the problem.

-> Issue the command "sh proc cpu history" to find the average CPU utilization and how long the CPU utilization has been high. Try to correlate this with any recent changes in the network.

-> If the average CPU utilization is above the benchmark, issue the command "show proc cpu sort | e 0.00" and look at the line that reports CPU utilization. You will see something like this:

Switch#show proc cpu sort | e 0.00

CPU utilization for five seconds: 17%/10%; one minute: 18%; five minutes: 18%

Here 17% is the total CPU utilization and 10% is the utilization due to interrupt switching. In the above output, CPU utilization due to process switching is 17 - 10 = 7%. Here is the difference between process switching and interrupt switching:

Process switching - CPU used by IOS processes such as "eigrp process", "ospf process", etc.

Interrupt switching - CPU used to forward normal data packets.
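The arithmetic above (total minus interrupt) can be applied to any saved copy of that summary line. Here is a small Python sketch that does so. The function and key names are made up for this example, and the sample line is the one quoted above.

```python
import re

# Matches the summary line printed at the top of "show processes cpu".
CPU_LINE = re.compile(
    r"CPU utilization for five seconds:\s*(\d+)%/(\d+)%;"
    r"\s*one minute:\s*(\d+)%;\s*five minutes:\s*(\d+)%"
)

def split_cpu(line):
    """Split the five-second figure into total, interrupt and process CPU."""
    m = CPU_LINE.search(line)
    if m is None:
        raise ValueError("not a CPU utilization summary line")
    total, interrupt, one_min, five_min = (int(g) for g in m.groups())
    return {
        "total": total,
        "interrupt": interrupt,
        "process": total - interrupt,  # e.g. 17 - 10 = 7
        "one_minute": one_min,
        "five_minutes": five_min,
    }

sample = "CPU utilization for five seconds: 17%/10%; one minute: 18%; five minutes: 18%"
print(split_cpu(sample))
```

Whichever of the two parts is larger decides whether to look at a specific IOS process or to capture the packets hitting the CPU.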

If the CPU utilization is due to an IOS process, then the troubleshooting is specific to that process. For example, if the EIGRP process is consuming the CPU cycles, check the routing protocol and see whether there is a routing loop, an EIGRP neighbor in the SIA state, and so on. If the CPU utilization is due to interrupt switching, then we need to capture the packets that are hitting the CPU and look for a trend in them, such as src IP, dst IP, src interface, or src MAC. The following steps will help you capture the packets hitting the CPU.

-> debug netdr cap rx << this is safe to run in a production network from 12.2(18)SXF code onward.

-> show netdr cap << to see the packets punted to the CPU.

-> You can also take an inband CPU SPAN to find out which packets are punted to the CPU. This is also safe to run in the network.

-> Issue the command "show interface | i line|drops" and find out whether there are any "input queue" drops on any interface. Interfaces with input queue drops are the ones sending packets toward the CPU. The input queue is the software queue where a packet waits before it can be processed by the CPU. If more packets are waiting in the queue than it can hold, the switch starts tail-dropping packets.

You can find more detail regarding your question in the webcast video recording, which will be uploaded shortly to our support forum.
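On a switch with many interfaces, the input queue check from the steps above is easier to do offline. The Python sketch below lists the interfaces with non-zero input queue drops from saved "show interface | i line|drops" output. It assumes the usual "Input queue: size/max/drops/flushes" layout. The sample text is illustrative and was not taken from any switch in this thread.

```python
import re

# "Input queue: size/max/drops/flushes" as printed by "show interfaces".
INPUT_QUEUE = re.compile(r"Input queue:\s*(\d+)/(\d+)/(\d+)/(\d+)")

def interfaces_with_input_drops(output):
    """Map interface name -> input queue drop count (non-zero counts only)."""
    drops = {}
    current = None
    for line in output.splitlines():
        if "line protocol" in line:
            # The interface name is the first word of the status line.
            current = line.split()[0]
            continue
        m = INPUT_QUEUE.search(line)
        if m and current is not None:
            count = int(m.group(3))  # third field is the drop counter
            if count > 0:
                drops[current] = count
    return drops

# Illustrative sample output.
sample = """\
Vlan10 is up, line protocol is up
  Input queue: 0/75/1532/0 (size/max/drops/flushes); Total output drops: 0
GigabitEthernet1/1 is up, line protocol is up (connected)
  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0
"""
print(interfaces_with_input_drops(sample))
```

Running it twice a few minutes apart and comparing the counts shows which interfaces are still dropping, as opposed to carrying old counters.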

Regards,

Souvik

Franmacias Wed, 01/25/2012 - 09:53

Hi Souvik,

I wonder what can cause the Per-minute Jobs counters to go up?

I have the following from my 6500:

switch#sh proc cpu sorted

CPU utilization for five seconds:  50%/12%; one minute: 28%; five minutes: 30%

PID Runtime(ms)   Invoked      uSecs    5Sec   1Min   5Min TTY Process

  38   374422416   1732046     219083 29.47%   3.53%  2.31%   0 Per-minute Jobs 

123  29738384121937605086       9234   4.39%  5.37%  5.48%   0 IP Input        

- Paco

DCASW01#sh proc cpu sorted

CPU utilization for five seconds: 53%/15%;  one minute: 30%; five minutes: 27%

PID Runtime(ms)   Invoked      uSecs   5Sec   1Min    5Min TTY Process

  38   374549292   1732206     216240 29.12%   3.54%  2.38%   0 Per-minute Jobs 

souvghos Wed, 01/25/2012 - 22:53

Hi Francisco,

"Per-minute Jobs" is a background process that runs on Cisco switches and routers and performs the following tasks once a minute:

analyzes stack usage

announces low stacks

executes registered one_minute jobs

Do you see constantly high CPU utilization for the "Per-minute Jobs" process? Could you please provide the output of the "show proc cpu hist" command? What version of code are you running? What is the memory utilization? How many routes do you have on this switch?
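One rough way to tell a short burst from constantly high utilization is to compare a process's 5Sec column with its 5Min column. The Python sketch below does this for lines copied from "show proc cpu sorted". The 3x ratio and the "burst"/"steady" labels are assumptions made for this example, not Cisco guidance, and the "show proc cpu hist" output remains the better evidence.

```python
import re

# Three percentage columns, the TTY number, then the process name.
# Anchoring on the percentages avoids depending on the counter columns.
PROC_LINE = re.compile(
    r"(\d+\.\d+)%\s+(\d+\.\d+)%\s+(\d+\.\d+)%\s+\d+\s+(.+?)\s*$"
)

def classify(line, ratio=3.0):
    """Return (process name, 'burst' or 'steady'), or None for other lines."""
    m = PROC_LINE.search(line)
    if m is None:
        return None
    five_sec, one_min, five_min = (float(g) for g in m.groups()[:3])
    name = m.group(4)
    # "burst": the 5-second value is far above the 5-minute average.
    kind = "burst" if five_sec > ratio * max(five_min, 0.01) else "steady"
    return name, kind

lines = [
    "  38   374422416   1732046     219083 29.47%   3.53%  2.31%   0 Per-minute Jobs ",
    "PID Runtime(ms)   Invoked      uSecs    5Sec   1Min   5Min TTY Process",
]
for line in lines:
    print(classify(line))
```

With the figures posted above, Per-minute Jobs falls in the "burst" category (29.47% over five seconds against 2.31% over five minutes), which would be consistent with a job that runs once a minute.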

regards,

souvik

parulpatel6 Thu, 01/26/2012 - 13:46

Hi,

I need to know the recommended uptimes for all router and switch models. What is the manufacturer-recommended reboot period? Please provide an answer or point me to the right resource for these details.

Thank you

Posted January 17, 2012 at 9:13 AM