
My VSS switch goes to 90% CPU utilization

dorsia
Level 1

How can I fix the high CPU utilization on my VSS, based on the attached show proc cpu log?

11 Replies

julijime
Cisco Employee

Hi Dorsia,

 

Per the attached output, it seems that you're getting a lot of interrupt-level traffic. The best way to troubleshoot this is to take a netdr capture and check which type of traffic is hitting your CPU. The following links should be useful for this purpose:

 

https://supportforums.cisco.com/document/59956/troubleshooting-netdr-capture-sup7206500

http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/116475-technote-product-00.html

 

Hope this helps!

Pavol Golis
Cisco Employee

As Jule said, run "debug netdr capture rx" and analyze the traffic in "show netdr capture". Look especially at the source VLAN/interface and the source/destination IP. Also check "show ibc brief", which tells you the rate of traffic hitting the CPU. "show mls statistics" is useful too: see whether any error counters are growing.

Dear All,

I have the output for my VSS after the debug netdr capture rx.

If you can help me analyze the captured information, I would be very grateful.

Please see the attachment.

Thanks,

dorsia

Hello,

Please see the complete sh proc cpu output.

Thanks,

dorsia

This seems to be the traffic hitting CPU the most:

 

------- dump of incoming inband packet -------
interface Po201, routine mistral_process_rx_packet_inlin, timestamp 04:53:58.835
dbus info: src_vlan 0x3F8(1016), src_indx 0xB73(2931), len 0xA2(162)
  bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
  48020401 03F80000 0B730100 A2000000 00110468 0E000008 00000010 03803371 
mistral hdr: req_token 0x0(0), src_index 0xB73(2931), rx_offset 0x76(118)
  requeue 0, obl_pkt 0, vlan 0x3F8(1016)
destmac 00.08.E3.FF.FD.90, srcmac 00.23.34.56.C8.00, protocol 0800
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 144, identifier 39354
  df 0, mf 0, fo 0, ttl 254, src 10.135.193.254, dst 10.135.52.158
    udp src 514, dst 514 len 124 checksum 0xA365

 

Incoming on Port-channel201, from 10.135.193.254 destined to 10.135.52.158; the UDP source and destination ports are both 514 (syslog). Now it's your turn to figure out why that happens. The TTL is 254, so the host is not far away. (Is 10.135.52.158 an IP on a router?) Also check "show vlan inter usa | i 1016".

    
kkmc-vss#show ibc brief
Interface information:
        Interface IBC0/0(idb 0x4756117C)
        Hardware is Mistral IBC (revision 5)
        5 minute rx rate 1914000 bits/sec, 1118 packets/sec <<< 68% (760pps) of this is below:
        5 minute tx rate 326000 bits/sec, 346 packets/sec


------- dump of incoming inband packet -------
interface Po201, routine mistral_process_rx_packet_inlin, timestamp 04:53:58.835
dbus info: src_vlan 0x3F8(1016), src_indx 0xB73(2931), len 0xA2(162)
  bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
  48020401 03F80000 0B730100 A2000000 00110468 0E000008 00000010 03803371 
mistral hdr: req_token 0x0(0), src_index 0xB73(2931), rx_offset 0x76(118)
  requeue 0, obl_pkt 0, vlan 0x3F8(1016)
destmac 00.08.E3.FF.FD.90, srcmac 00.23.34.56.C8.00, protocol 0800
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 144, identifier 39354
  df 0, mf 0, fo 0, ttl 254, src 10.135.193.254, dst 10.135.52.158
    udp src 514, dst 514 len 124 checksum 0xA365
    
Questions:
- Why is 10.135.193.254 logging so much to syslog? (Is it a FW logging sessions?)
- Is 10.135.52.158 on node kkmc-vss?
  If yes, why is 10.135.193.254 sending its syslog to it? (Misconfigured syslog?)
- Judging by the MAC addresses, both systems are Cisco.
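
When the capture holds many packets, the per-packet reading above can be automated. The following is a minimal sketch, not an official tool: it assumes the "show netdr capture" text has been saved locally and that every packet follows the layout of the dump in this thread. The names `parse_netdr` and `top_talkers` are hypothetical, and only the `udp` line format is confirmed by the thread; accepting `tcp` in the same layout is an assumption.

```python
import re
from collections import Counter

# Patterns modelled on the dump shown in this thread.
IFACE_RE = re.compile(r"^interface ([^,\s]+),")
VLAN_RE = re.compile(r"src_vlan 0x[0-9A-Fa-f]+\((\d+)\)")
IP_RE = re.compile(r"src (\d+\.\d+\.\d+\.\d+), dst (\d+\.\d+\.\d+\.\d+)")
# Only "udp" appears in the thread; "tcp" in the same layout is an assumption.
L4_RE = re.compile(r"\b(udp|tcp) src (\d+), dst (\d+)")


def parse_netdr(text):
    """Return one dict per packet found in a saved netdr capture dump."""
    packets = []
    current = None
    for line in text.splitlines():
        if "dump of incoming inband packet" in line:
            current = {}
            packets.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first packet header
        m = IFACE_RE.match(line)
        if m:
            current["interface"] = m.group(1)
        m = VLAN_RE.search(line)
        if m:
            current["vlan"] = int(m.group(1))
        m = IP_RE.search(line)
        if m:
            current["src"], current["dst"] = m.group(1), m.group(2)
        m = L4_RE.search(line)
        if m:
            current["proto"] = m.group(1)
            current["sport"] = int(m.group(2))
            current["dport"] = int(m.group(3))
    return packets


def top_talkers(packets, n=5):
    """Count packets per (interface, vlan, src, dst, proto, dport) flow."""
    keys = ("interface", "vlan", "src", "dst", "proto", "dport")
    flows = Counter(tuple(p.get(k) for k in keys) for p in packets)
    return flows.most_common(n)


# Lines copied from the packet dump in this thread.
SAMPLE = """\
------- dump of incoming inband packet -------
interface Po201, routine mistral_process_rx_packet_inlin, timestamp 04:53:58.835
dbus info: src_vlan 0x3F8(1016), src_indx 0xB73(2931), len 0xA2(162)
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 144, identifier 39354
  df 0, mf 0, fo 0, ttl 254, src 10.135.193.254, dst 10.135.52.158
    udp src 514, dst 514 len 124 checksum 0xA365
"""

if __name__ == "__main__":
    for flow, count in top_talkers(parse_netdr(SAMPLE)):
        print(count, flow)
```

Run against a full capture, the flow with the highest count is the first candidate to investigate, which is how the syslog stream on VLAN 1016 stood out here.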

Hi Pavol,

Now I see it. Yes, 10.135.193.254 is a firewall doing the logging, and 10.135.52.158 is probably a syslog server.

I will ask our system administrator to move his syslog server off the VSS and into the network management VLAN, which is inside the firewall.

Thanks,

Dani

 

Well, it shouldn't go to the CPU anyway. Can you check "show ip route 10.135.52.158 detail" on kkmc-vss, and the routing for the same destination on the firewall? This often happens when traffic goes out the same SVI (VLAN interface) it came in on.

 

Hello Pavol,

We have already removed the firewall logging to 10.135.52.158; the firewall is back on default console logging.

Today 10.135.52.158 no longer appears in the capture, but remote command switch sh proc cpu still shows 99%.

From the debug, it looks like the traffic now comes from 10.135.122.10. Is that right?

I will send attachment with this.

Thanks,

 

dorsia

Your Route Processor CPU (L3) traffic rate went down a lot, and so did its CPU load:

kkmc-vss#sh proc cpu
CPU utilization for five seconds: 15%/5%; one minute: 18%; five minutes: 18%
kkmc-vss#show ibc brief
Interface information:
        Interface IBC0/0(idb 0x4756117C)
        Hardware is Mistral IBC (revision 5)
        5 minute rx rate 591000 bits/sec, 228 packets/sec
        5 minute tx rate 231000 bits/sec, 312 packets/sec
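
To put a number on "went down a lot", the two "show ibc brief" outputs in this thread can be compared. This is a small sketch under the assumption that both outputs were saved as text; `ibc_rates` and `pct_change` are hypothetical helper names, and the rate values are copied from the outputs posted here.

```python
import re

# Matches the "5 minute rx/tx rate" lines of "show ibc brief".
RATE_RE = re.compile(r"5 minute (rx|tx) rate (\d+) bits/sec, (\d+) packets/sec")


def ibc_rates(text):
    """Return {'rx': (bps, pps), 'tx': (bps, pps)} from 'show ibc brief' text."""
    return {m.group(1): (int(m.group(2)), int(m.group(3)))
            for m in RATE_RE.finditer(text)}


def pct_change(before, after):
    """Relative change in percent; negative means the rate dropped."""
    return (after - before) / before * 100.0


# Rate lines copied from the two outputs in this thread.
BEFORE = """\
        5 minute rx rate 1914000 bits/sec, 1118 packets/sec
        5 minute tx rate 326000 bits/sec, 346 packets/sec
"""
AFTER = """\
        5 minute rx rate 591000 bits/sec, 228 packets/sec
        5 minute tx rate 231000 bits/sec, 312 packets/sec
"""

if __name__ == "__main__":
    rx_before = ibc_rates(BEFORE)["rx"][1]
    rx_after = ibc_rates(AFTER)["rx"][1]
    print(f"rx pps change: {pct_change(rx_before, rx_after):.1f}%")
```

With these figures the inbound packet rate to the RP fell from 1118 pps to 228 pps, a drop of roughly 80%, which is in line with the 760 pps attributed earlier to the syslog flow.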

 

The command you ran:

remote command switch sh proc cpu

measures the Switch Processor CPU (L2) load. The SP is also busy processing traffic (likely a different cause for its high load). Now do "attach X", where X is the slot of the supervisor, then "enable", and then "debug netdr capture rx" and "show netdr capture".

Apart from traffic processing in interrupts, your SP CPU is kept quite busy by the "MLS-MLD Process". That's MLD snooping for IPv6 multicast, so there is likely some IPv6 multicast traffic. (http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst6500/ios/12-2SX/configuration/guide/book/snoopmld.html)

 453   344579848 303171872       1136 27.59% 27.53% 27.25%   0 mls-mld Process  
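
For a long "show proc cpu" listing, the same reading can be scripted. The sketch below is illustrative only: it assumes the output was saved as text and that process lines use the column order PID, Runtime, Invoked, uSecs, 5Sec, 1Min, 5Min, TTY, Process, as in the line above. In the header, the first figure is total utilization and the figure after the slash is the interrupt-level share. `split_cpu` and `busy_processes` are hypothetical names.

```python
import re

HEADER_RE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)
# PID Runtime Invoked uSecs 5Sec 1Min 5Min TTY Process
PROC_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+(\d+)\s+(.+?)\s*$"
)


def split_cpu(text):
    """Split the five-second figure into interrupt and process shares."""
    m = HEADER_RE.search(text)
    if m is None:
        raise ValueError("no 'CPU utilization' header found")
    total, interrupt = int(m.group(1)), int(m.group(2))
    return {"total": total, "interrupt": interrupt, "process": total - interrupt}


def busy_processes(text, threshold=5.0):
    """Return (name, five_sec_pct) for processes at or above threshold, busiest first."""
    found = []
    for line in text.splitlines():
        m = PROC_RE.match(line)
        if m and float(m.group(5)) >= threshold:
            found.append((m.group(9), float(m.group(5))))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Lines copied from the outputs in this thread (the header is from the RP,
# the process line is from the SP).
RP_HEADER = "CPU utilization for five seconds: 15%/5%; one minute: 18%; five minutes: 18%"
SP_LINE = " 453   344579848 303171872       1136 27.59% 27.53% 27.25%   0 mls-mld Process  "

if __name__ == "__main__":
    print(split_cpu(RP_HEADER))
    print(busy_processes(SP_LINE))
```

A high interrupt share points toward punted traffic, which is what netdr captures, while a high process share points at a specific process such as the MLD snooping one here.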

 

Did you manage to figure out whether the syslog traffic from the FW went out the same SVI (VLAN interface) it came in on?

 

So provide:

- netdr capture from SP

- The routing for that syslog traffic, to check whether the ingress and egress SVI were the same

Hi Pavol,

Regarding the firewall logging, it has been returned to console logging and the syslog server has been removed. It was only temporary message logging.

The attached file is the show netdr capture output taken after issuing attach 5.

 

Please note that this problem usually occurs in the morning. Today it happened again, but after about 30 minutes it was gone.

Thanks,

Dani
