6500 HIGH CPU, Standby HSRP state changes continuously

sign2anup
Level 1

Hi All,

We have two 6500 user distribution switches connected to each other by a port channel, with the WLC connected to Dist-2. Both 6500s connect to a back-end 7609 core, with OSPF running between them. Dist-1 is both the spanning-tree root and the HSRP active router.

We are suddenly seeing high CPU utilization on the switches; we could neither ping them nor log in. However, I was able to observe the following through the console.

Switch 1 is stable and all its VLANs are active, but on the standby switch all the VLANs keep changing state. I suspect it is not receiving hellos in time, which causes it to move between active and standby.

%HSRP-5-STATECHANGE: Vlan208 Grp 208 state Standby -> Active
%HSRP-5-STATECHANGE: Vlan160 Grp 160 state Speak -> Standby
%HSRP-5-STATECHANGE: Vlan115 Grp 115 state Standby -> Active
%HSRP-5-STATECHANGE: Vlan156 Grp 156 state Speak -> Standby
%HSRP-5-STATECHANGE: Vlan208 Grp 208 state Speak -> Standby
%HSRP-5-STATECHANGE: Vlan209 Grp 209 state Speak -> Standby
%HSRP-5-STATECHANGE: Vlan196 Grp 196 state Speak -> Standby
%HSRP-5-STATECHANGE: Vlan112 Grp 112 state Speak -> Standby
%HSRP-5-STATECHANGE: Vlan196 Grp 196 state Standby -> Active

Configuration of the wireless VLAN on both switches:

interface Vlan198
 ip address 10.X.X.X
 ip helper-address
 standby 198 ip 10.X.X.X
 standby 198 timers msec 250 msec 750
 standby 198 priority 90
 standby 198 preempt
end

All the VLANs have the same timers and preempt configured on both switches.

We are not tracking any interface. Can I remove preempt on the secondary switch?

 

I have captured packets with netdr:

Dist-2

interface Vl198, routine mistral_process_rx_packet_inlin, timestamp 23:53:02.823
dbus info: src_vlan 0xC6(198), src_indx 0x2(2), len 0x40(64)
  bpdu 0, index_dir 0, flood 1, dont_lrn 0, dest_indx 0x40C6(16582)
  F8020400 00C60000 00020000 40080168
 S000 E0000560 8E0FFFF8 00000008 40C60000
mistral hdr: req_token 0x0(0), src_index 0x2(2), rx_offset 0x76(118)
  requeue 0, obl_pkt 0, vlan 0xC6(198)
destmac FF.FF.FF.FF.FF.FF, srcmac 00.00.0C.07.AC.C6, protocol 0806
layer 3 data: 00010800 06040002 00000C07 ACC60A19 C601FFFF FFFFFFFF
              0A19C601 00000000 00000000 00000000 00000000 0000C601
              8300A369 00000000 0000FFFF
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 48, identifier 0
  df 0, mf 0, fo 0, ttl 1, src 10.X.X.X, dst 224.0.0.2
    udp src 1985, dst 1985 len 28 checksum 0x53BE

Dist-1

interface Vl198, routine mistral_process_rx_packet_inlin, timestamp 23:48:26.830
dbus info: src_vlan 0xC6(198), src_indx 0x341(833), len 0x42(66)
  bpdu 0, index_dir 0, flood 1, dont_lrn 0, dest_indx 0x40C6(16582)
  60020401 00C60400 03410400 42080000 00110448 0E087C7C 00000008 40C60000
mistral hdr: req_token 0x0(0), src_index 0x341(833), rx_offset 0x76(118)
  requeue 0, obl_pkt 0, vlan 0xC6(198)
destmac 01.00.5E.00.00.02, srcmac 00.00.0C.07.AC.C6, protocol 0800
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 48, identifier 0
  df 0, mf 0, fo 0, ttl 1, src 10.X.X.X, dst 224.0.0.2
    udp src 1985, dst 1985 len 28 checksum 0x53BF

The issue is not continuous; it triggers intermittently, sometimes within a couple of hours and sometimes only after days.

7 Replies

Hi Anup,

The HSRP flaps may simply be a victim of the high CPU.

The netdr capture above looks like a normal HSRP packet (destination 224.0.0.2, source MAC 00.00.0C.07.AC.C6) with the flood bit set to 1.

I don't think we can conclude from that what is causing the high CPU. Do you have the complete netdr output from during the issue? You can also check basic outputs such as

sh proces cpu sorted | ex 0.00%

Also, to confirm Layer 2 stability, you could turn on MAC move notification to see whether there is a loop.

 

(config)# mac address-table notification mac-move

 

Kindly share the outputs.

 

Hope this helps.

 

Thanks,

Madhu.

 

Attached is the netdr output from both distribution switches.

 

srcmac 00.23.EA.7A.FC.00 is the MAC address of the VLAN 198 interface on Dist-2.

 

These two switches connect to 20 floor switches, and each floor is a stack. The WLC is connected to Dist-2, and multiple APs are connected to the floor switches.

Hi,

 

The netdr capture does not provide much information; most of the packets are HSRP. The timers are also quite aggressive, so I think a large number of packets is expected.
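
For comparison, here is a minimal sketch of the same SVI with the HSRP default timers (3-second hello, 10-second hold), in case relaxing them is worth testing. Whether slower failover is acceptable in this network is an assumption, not something established in this thread.

```
interface Vlan198
 ! Assumption: slower failover is acceptable on this VLAN.
 ! 3 s hello / 10 s hold are the HSRP defaults, replacing 250 ms / 750 ms.
 standby 198 timers 3 10
```

With a 750 ms hold time, a CPU that is busy for under a second can already cause the standby to miss hellos and go active.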

 

Do you see input queue drops on the interfaces between these two switches, or on any other interfaces?

Also, have you enabled MAC move notification? It is not a service-impacting command.

 

Thanks,

Madhu.

 

 

 

Also share 

 

sh proc cpu sorted | ex 0.00%

MAC move notification has been enabled. I can see drops on the interface connected to the WLC.

As I said, the issue is not continuous; if it occurs again I will share the sorted CPU output.

 

Do I need to enable MAC move notification on the floor switches as well? Should I also harden the trunk interfaces on the floor switches?

After some digging, I found the following:

 

VLAN0195 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 165 last change occurred 00:31:14 ago
          from GigabitEthernet6/14
 VLAN0196 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 122 last change occurred 00:31:14 ago
          from GigabitEthernet6/14
 VLAN0197 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 291 last change occurred 00:31:14 ago
          from GigabitEthernet6/14
 VLAN0198 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 961 last change occurred 00:31:14 ago
          from GigabitEthernet6/14
 VLAN0199 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 1165 last change occurred 00:31:25 ago
          from GigabitEthernet6/14
 VLAN0200 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 888 last change occurred 00:31:25 ago
          from GigabitEthernet6/14

Port Gi6/14 configuration:

interface GigabitEthernet6/14
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 200
 switchport trunk allowed vlan 195-200
 switchport mode trunk

This port on Dist-2 connects to the WLC.

195  Wireless-Voice                   active    
196  GTC-VIDEO                        active    
197  IP_Camera                        active    
198  Wireless-client                  active    
199  Cisco-AP-MGMT                    active    
200  NW-MGMT                          active  

All the remaining VLANs:

VLAN0110 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 824 last change occurred 00:51:49 ago
          from TenGigabitEthernet1/3
 VLAN0111 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 680 last change occurred 00:51:49 ago
          from TenGigabitEthernet1/3

I can also see some ports in the root-inconsistent state on the secondary switch, which is configured as the secondary root.

TenGigabitEthernet1/3 is connected to a floor switch.

 

VLAN0110 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 297 last change occurred 00:37:47 ago
          from StackPort1
 VLAN0111 is executing the rstp compatible Spanning Tree protocol
  Number of topology changes 164 last change occurred 00:37:47 ago
          from StackPort1

 

 

 

So you are saying you saw high CPU during these spanning-tree changes?

Keep the MAC move notification turned on. It will help identify any flapping ports.

 

You can also try setting up an EEM script to capture show processes cpu sorted | ex 0.00%, show spanning-tree, netdr, etc. Please let me know if you need help with EEM.
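
As a starting point, here is a minimal sketch of an EEM applet for this kind of capture. The applet name, the 80%/50% thresholds, the disk0: file name, and the SNMP OID index (.1 on the 5-second CPU object from CISCO-PROCESS-MIB) are assumptions to adapt to your supervisor and IOS release; check that the OID returns the route processor CPU on your box before relying on it.

```
event manager applet HIGH_CPU_CAPTURE
 ! Assumption: OID instance .1 is the CPU of interest; poll every 5 seconds.
 ! Trigger at >= 80% and re-arm only after CPU falls below 50%.
 event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.3.1 get-type exact entry-op ge entry-val 80 exit-op lt exit-val 50 poll-interval 5
 action 1.0 syslog msg "High CPU detected: $_snmp_oid_val%"
 action 2.0 cli command "enable"
 ! Assumption: disk0: is present and writable; each output is appended to one file.
 action 3.0 cli command "show clock | append disk0:high_cpu.txt"
 action 4.0 cli command "show processes cpu sorted | append disk0:high_cpu.txt"
 action 5.0 cli command "show spanning-tree detail | append disk0:high_cpu.txt"
 action 6.0 cli command "show standby brief | append disk0:high_cpu.txt"
```

Each command uses a single output modifier so that the append is not swallowed by a filter expression. Further show commands, including the netdr output, can be added as extra actions in the same way.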

 

Thanks,

Madhu.
