Cat6500 - HSRP Failing / High CPU Load

Unanswered Question
Dec 31st, 2008

'show processes cpu' shows utilization of up to 98% on both redundant Cat6500 switches.

The process consuming highest CPU is 'IP Input'.

Is there any way I can identify the host that is causing the broadcast traffic?

show cdp nei shows other switches only.

Please assist.

MATTHEW BECK Wed, 12/31/2008 - 12:34

I wouldn't expect broadcast traffic to do that to a 6509 CPU. Multicast, perhaps? Is there a multicast stream with a TTL of 1 expiring on the router? Something is getting process switched instead of hardware switched. Has anything changed in the config recently? You can start here:

for more help.

Good luck!


One quick way to find out might be to enable storm-control on the host ports (I'm not sure what the CatOS equivalent is, but here's IOS). Be aware that this will be service-impacting when the offending port is shut down. Also, you will need to re-enable the port manually unless you globally define errdisable recovery for storm-control:

storm-control broadcast level 10

storm-control multicast level 15

storm-control unicast level 10

storm-control action shutdown



glen.grant Wed, 12/31/2008 - 13:09

Look at all your SVIs and you will find one or more with much higher traffic than normal. These counters should normally be very low, because most traffic is switched in hardware and should not even hit the CPU. Something is causing a large amount of traffic to be sent to the CPU. If you find one SVI much higher than the rest, you can turn on NetFlow for that SVI and probably get a pretty good idea of who is sending the traffic. You can also look at the counters on that SVI and see what traffic is really driving it. Someone sending a single multicast stream, even on a 720, can bring the CPU to its knees if the switch is not configured for multicast. HSRP will fail and start to flap when the CPU gets that high.
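The "find the SVI that is much higher than the rest" step can be sketched in Python as a small parser over pasted 'show interfaces' text. This is only a sketch: the Vlan12 rate is the one reported later in this thread, the Vlan13 line is a hypothetical placeholder added for contrast, and the function name is an assumption.

```python
import re

# Pasted "show interfaces" text. The Vlan12 rate is the one reported later
# in this thread; the Vlan13 line is a HYPOTHETICAL placeholder for contrast.
SAMPLE = """\
Vlan12 is up, line protocol is up
  5 minute input rate 4233000 bits/sec, 8035 packets/sec
Vlan13 is up, line protocol is up
  5 minute input rate 2000 bits/sec, 3 packets/sec
"""

def svi_input_rates(text):
    """Return (svi, packets_per_sec, bits_per_sec) tuples, busiest first."""
    rates = []
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*(Vlan\d+) is ", line)
        if header:
            current = header.group(1)
            continue
        rate = re.search(r"input rate (\d+) bits/sec, (\d+) packets/sec", line)
        if rate and current:
            rates.append((current, int(rate.group(2)), int(rate.group(1))))
    return sorted(rates, key=lambda r: r[1], reverse=True)

for name, pps, bps in svi_input_rates(SAMPLE):
    print(f"{name}: {pps} packets/sec, {bps} bits/sec")
```

The SVI at the top of the list is the one to inspect first.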

cisco_lite Wed, 12/31/2008 - 16:58


Could you please explain how I can check the traffic on the SVIs, as mentioned in your post?


cisco_lite Thu, 01/01/2009 - 01:02

OK. I have noticed that when I shut down the trunk (4 Gigabit Ethernet links in an EtherChannel) between the two Cat6500 switches, the CPU utilization goes down from 99% to 0%.

Does this give any clue?


cisco_lite Thu, 01/01/2009 - 01:23

I ran debug ip packet detail (buffered) and found excessive multicast flooding such as below. Every msec I can see 10 of these.

Jan 1 09:13:45.918: IP: s= (Vlan12), d=, len 48, rcvd 0

*Jan 1 09:13:45.918: UDP src=1985, dst=1985

*Jan 1 09:13:45.918: IP: s= (Vlan12), d=, len 48, rcvd 0

*Jan 1 09:13:45.918: UDP src=1985, dst=1985

The two source addresses (removed from the output above) are the SVI IPs (HSRP) of Vlan12 on the two Cat6500 switches.

Could you please tell me why an SVI is generating the multicast? Could some of the hosts have joined the multicast session? If so, why did they do so, and how could I identify those hosts? In our network, multicast is not an application requirement. How can I check it and minimize it?

Please assist. Thanks.

Giuseppe Larosa Thu, 01/01/2009 - 01:30

Hello Cisco_lite,

Don't worry about these messages; they are just HSRP hellos.

They are sent to UDP port 1985, destination all routers in the subnet. With default timers, each router sends a hello message every 3 seconds.

Your issue is a bridging loop; see my other post.

Turn off all debugging and inspect the log messages for events reported by UDLD or STP.

Hope to help
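A quick arithmetic check shows why the hellos themselves are harmless but the rate seen in the debug is not. The sketch below compares the hello rate expected from two routers on default timers with the rough "10 per msec" figure from the debug output. The function names are assumptions, and the observed figure is only the original poster's estimate.

```python
# Expected HSRP hello rate on one VLAN with default timers versus the
# rough rate reported from the debug output in this thread.

HELLO_INTERVAL_S = 3.0   # default HSRP hello timer, as stated in the reply
ROUTERS = 2              # the two Cat6500 SVIs in Vlan12

def expected_hello_pps(routers: int, hello_interval_s: float) -> float:
    """Packets per second if each router sends one hello per interval."""
    return routers / hello_interval_s

def observed_pps(packets_per_ms: float) -> float:
    """Convert a per-millisecond count into packets per second."""
    return packets_per_ms * 1000.0

expected = expected_hello_pps(ROUTERS, HELLO_INTERVAL_S)
observed = observed_pps(10)  # "every msec I can see 10 of these"

print(f"expected ~{expected:.2f} pps, observed ~{observed:.0f} pps")
print(f"ratio: ~{observed / expected:.0f}x")
```

A gap of that size suggests the same hellos are being replicated many times over, which fits a bridging loop better than an HSRP misconfiguration.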


Giuseppe Larosa Thu, 01/01/2009 - 01:26


This means you are experiencing a bridging loop in at least one L2 VLAN.

The problem may also be on the uplink of an access-layer switch, not only on the EtherChannel between the two devices.

Look in the Cat6500 logs for UDLD messages or STP messages about inconsistent ports.

Note that we have seen the following: if a new VLAN is added to the configuration of a single member link instead of to the port-channel interface, a bridging loop forms.

This happened twice, in two different campus networks.

Other cases were caused by the uplinks of an access-layer switch: in one case, for example, UDLD tried unsuccessfully to tear down a port.

We later changed the GBIC.

Hope to help


cisco_lite Thu, 01/01/2009 - 01:58

I have removed all the uplinks/connected switches. Now I have only two Cat6500 connected to each other via etherchannel.

I was actually doing new vlan configuration when this problem happened. I have deleted the vlan from both switches and it is not allocated to any of the service modules anymore.

The port channel interface has all the vlans as 'allowed' by default.

After undoing debug, clearing the log, I don't see any messages in the log. logging buffered has been enabled.

Please assist.


Giuseppe Larosa Thu, 01/01/2009 - 02:08

Hello Cisco_Lite,

You have the two C6500s with the EtherChannel enabled between them, allowing all possible VLANs.

The next step I suggest is:

Choose one C6500 and re-enable only that device's uplinks to all the access-layer switches.

This shouldn't cause any problem as long as only one uplink is up on each access-layer switch.


Now that the access-layer switches are reachable, telnet to each of them and run:

sh proc cpu hist

sh log

Check whether any device shows messages related to UDLD or STP events.

If you find something meaningful, you have found the problem.

An alternative is:

Re-enable the second uplinks on the second C6500 one at a time, then wait two minutes and look at the CPU usage and log messages.

This second method exposes you to the risk of the loop happening again, but this time you can identify the troubled link.

By clearing the log you have lost the previous messages; if you have a syslog server, you can look at the messages there.

Hope to help


cisco_lite Thu, 01/01/2009 - 02:21

Hello Giuseppe,

Just want to correct one thing. There wasn't any uplink to a switch before. It was the ASA firewall: the ASA inside interface is connected to the Cat6500. I didn't find anything unusual in the ASA log, though.

Since the two redundant Cat6500s are now not connected to any other network device except servers on the Ethernet module, I don't think there is any need to connect uplinks for troubleshooting. The Cat6500 shows high CPU with the trunk up and functions well without it.

Is there any debug command for a bridging loop? Strange that I don't see anything in the logs.

Please assist.


cisco_lite Thu, 01/01/2009 - 02:34

Hi Giuseppe,

Do you see any abnormal values in the SVI interface output below? At the moment, I have brought down all the SVIs except Vlan12 on both Cat6500s.

Vlan12 is up, line protocol is up

Hardware is EtherSVI, address is 0023.3457.0e00 (bia 0023.3457.0e00)

Internet address is

MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,

reliability 255/255, txload 1/255, rxload 1/255

Encapsulation ARPA, loopback not set

Keepalive not supported

ARP type: ARPA, ARP Timeout 04:00:00

Last input 00:00:00, output 00:00:01, output hang never

Last clearing of "show interface" counters never

Input queue: 1086/75/1702325563/4707 (size/max/drops/flushes); Total output drops: 0

Queueing strategy: fifo

Output queue: 0/40 (size/max)

5 minute input rate 4233000 bits/sec, 8035 packets/sec

5 minute output rate 0 bits/sec, 0 packets/sec

L2 Switched: ucast: 134646434 pkt, 12547593600 bytes - mcast: 22281592735 pkt, 1729637921319 bytes

L3 in Switched: ucast: 17598303 pkt, 9025600495 bytes - mcast: 0 pkt, 0 bytes


L3 out Switched: ucast: 18650247 pkt, 13911482101 bytes mcast: 0 pkt, 0 bytes

522294062 packets input, 42388663968 bytes, 0 no buffer

Received 504675200 broadcasts (12 IP multicasts)

0 runts, 0 giants, 1113898 throttles

0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

19203008 packets output, 13873692330 bytes, 0 underruns

0 output errors, 0 interface resets

0 output buffer failures, 0 output buffers swapped out

Giuseppe Larosa Thu, 01/01/2009 - 03:50

Hello Cisco_lite,

this SVI is receiving

5 minute input rate 4233000 bits/sec, 8035 packets/sec

Input queue: 1086/75/1702325563/4707 (size/max/drops/flushes); Total output drops: 0

You have had a lot of drops in the input queue and, more seriously, the current queue size is 1086 when the maximum size is 75.

most of rx traffic is broadcast:

522294062 packets input, 42388663968 bytes, 0 no buffer

Received 504675200 broadcasts (12 IP multicasts)

0 runts, 0 giants, 1113898 throttles

From the latest details you have given, you are facing a very high volume of broadcast traffic.

As other colleagues explained, this traffic hits the CPU and causes the high CPU usage.

I would try IP accounting to find the source of the traffic.

The broadcast traffic can also be the result of a loop.

Check STP in VLAN 12 on both devices with

show spanning-tree vlan 12 detail

Verify that both devices agree on the root bridge ID and that one of them is the root.

Avoid the STP debugs; they are very heavy.

Hope to help
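The two observations above (an input queue far over its limit, and input that is almost entirely broadcast) can be checked mechanically. Below is a minimal Python sketch that parses the relevant lines from the Vlan12 'show interface' output posted in this thread. The function names are assumptions, and the wrapped "Total output drops" line is rejoined by hand.

```python
import re

# Lines copied from the Vlan12 "show interface" output in this thread
# (the wrapped "Total output drops" line is rejoined).
SHOW_INT = """\
Input queue: 1086/75/1702325563/4707 (size/max/drops/flushes); Total output drops: 0
522294062 packets input, 42388663968 bytes, 0 no buffer
Received 504675200 broadcasts (12 IP multicasts)
"""

def parse_input_queue(text: str) -> dict:
    """Extract size/max/drops/flushes from the 'Input queue' line."""
    m = re.search(r"Input queue: (\d+)/(\d+)/(\d+)/(\d+)", text)
    if m is None:
        raise ValueError("no 'Input queue' line found")
    size, maximum, drops, flushes = map(int, m.groups())
    return {"size": size, "max": maximum, "drops": drops, "flushes": flushes}

def broadcast_share(text: str) -> float:
    """Fraction of input packets that were counted as broadcasts."""
    packets = re.search(r"(\d+) packets input", text)
    bcasts = re.search(r"Received (\d+) broadcasts", text)
    if packets is None or bcasts is None:
        raise ValueError("packet counters not found")
    return int(bcasts.group(1)) / int(packets.group(1))

queue = parse_input_queue(SHOW_INT)
share = broadcast_share(SHOW_INT)
print(f"input queue {queue['size']} of max {queue['max']}, {queue['drops']} drops")
print(f"broadcast share of input: {share:.1%}")
```

With these counters, roughly 97 of every 100 packets received on the SVI are broadcasts, which is why the search moves to the source of the flooding rather than to any single unicast flow.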


cisco_lite Thu, 01/01/2009 - 04:27

Hello Giuseppe,

The root id is the same on both devices for vlan12

I noticed something, not sure if its an issue.

show spanning-tree vlan 12 detail

gives 'VLAN0012 is executing the ieee compatible Spanning Tree protocol' on the standby switch and 'VLAN0012 is executing the rstp compatible Spanning Tree protocol' on the primary switch. Is that OK?

I have enabled IP accounting on the interface, but nothing is shown by 'show ip accounting'. IP accounting was enabled with

interface vlan 12

ip accounting

Another observation: I was trying to go through the SVIs one by one. So I shut down Vlan12 (to which the front-end servers are connected) and brought up another VLAN, Vlan13, which is connected to the FWSM outside. On the primary switch, 'no shut' on Vlan13 works fine. But on the standby switch it says

'Forcing SVI13 to stay shutdown (SVI 20 tied to line card in slot 1.)'

Any clues?


cisco_lite Thu, 01/01/2009 - 04:58

Please ignore the 'Forcing SVI to stay shutdown' error. It occurred because of another duplicate VLAN, in shutdown state, that was shared with the FWSM. I have removed the duplicate VLAN.

With Vlan12 down, I brought up Vlan13 and did not see any rise in CPU. So the culprit is Vlan12, to which the front-end servers are connected. Please let me know how I can identify the source of the broadcast traffic in Vlan12. Why wasn't I able to see the broadcasts in 'debug ip packet detail' on the Cat6500?

Please advise.


cisco_lite Thu, 01/01/2009 - 05:12

Additional info,

vlan12 is assigned to the ACE module configured in bridged mode. vlan14 is the server vlan and vlan12 is the client vlan(SVI).

The physical ports on ethernet module are connected to vlan14 (i.e. Server vlan). Vlan12 which shows a high broadcast traffic does not have any physical ports in the vlan.


Giuseppe Larosa Thu, 01/01/2009 - 05:47

Hello Cisco_lite,

gives ' VLAN0012 is executing the ieee compatible Spanning Tree protocol' on standby switch and gives 'VLAN0012 is executing the rstp compatible Spanning Tree protocol' on primary switch'. Is that ok ?

It looks like the two switches are running two different STP modes.

Rapid PVST (RPVST) should fall back to PVST on detecting a legacy neighbor.

I would suggest configuring RPVST on the standby as well:

config t

spanning-tree mode rapid-pvst

Also, if the ACE is bridging, you need to think of it as a bridge/switch too.

The FWSM can convert PVST BPDUs from one VLAN to the other.

I will look into how the ACE behaves.

Which VLANs are bridged by the ACE?

Are they Vlan12 and Vlan14?

Hope to help


Giuseppe Larosa Thu, 01/01/2009 - 06:15

Hello Cisco_Lite,

The ACE can convert PVST BPDUs from VLAN x to VLAN y.

Be aware that by bridging, the ACE joins two broadcast domains into a single one.

By default, the supervisor's STP BPDUs are not allowed through. If you want to allow them, note the following:

Note: If you use failover, you must permit BPDUs on both interfaces with an EtherType ACL to avoid bridging loops.


Another important note:

Note: Before allowing or blocking BPDUs on the ACE, you must disable the spanning-tree loopguard default IOS command (if configured) on the Catalyst 6500 supervisor. Otherwise, if you allow and then block BPDUs on the ACE, the ACE port enters the blocking state, resulting in a complete outage. To recover, you must reboot the ACE.

So there is a chance that the two ACEs, one in each C6500 and both in bridging mode, are creating a loop:

C6500_1 --- ACE_1

VLx Vly

C6500_2 --- ACE_2

please give a look at the following chapter of ACE conf. guide

Hope to help


cisco_lite Thu, 01/01/2009 - 06:22

Yes, the bridged vlans on ACE are 12 & 14.

Please find below the output of 'show int stats', show ip traffic, and show interfaces switching. I would like to identify which specific host in vlan 12 (all hosts are connected to vlan 14-bridged) is generating high traffic. On show interfaces switching I can't see anyone generating huge traffic. Please let me know the difference between Pkts Out and Pkts In in show interfaces switching output.

CORE-SW#sh int stats | b Vlan12


Switching path       Pkts In     Chars In  Pkts Out    Chars Out
Processor          576571532  38111277318    563901     37537634
Route cache             2449       150205        25         2211
Distributed cache   17598303   9025600495  18650247  13836881113
Total              594172284  47137028018  19214173  13874420958

CORE-SW#sh ip traffic

IP statistics:

Rcvd: 705186107 total, 579197716 local destination

0 format errors, 0 checksum errors, 1544 bad hop count

0 unknown protocol, 0 not a gateway

0 security failures, 0 bad options, 11 with options

Opts: 0 end, 0 nop, 0 basic security, 0 loose source route

0 timestamp, 0 extended security, 0 record route

0 stream ID, 0 strict source route, 11 alert, 0 cipso, 0 ump

0 other

Frags: 0 reassembled, 0 timeouts, 0 couldn't reassemble

0 fragmented, 0 couldn't fragment

Bcast: 440082 received, 0 sent

Mcast: 578301381 received, 2309157 sent

Sent: 2660882 generated, 125989596 forwarded

Drop: 28 encapsulation failed, 0 unresolved, 0 no adjacency

353 no route, 0 unicast RPF, 0 forced drop

0 options denied, 0 source IP address zero

CORE-SW#sh interfaces switching


Throttle count 0

Drops RP 0 SP 0

SPD Flushes Fast 0 SSE 0

SPD Aggress Fast 0

SPD Priority Inputs 0 Drops 0

Protocol Path Pkts In Chars In Pkts Out Chars Out
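The 'sh int stats' rows above already answer part of the question: they show how much of the Vlan12 input is handled by the route processor rather than in hardware. The Python sketch below computes each switching path's share of Pkts In. It assumes that Pkts In counts packets received on the interface and Pkts Out packets transmitted, broken down per switching path; the function name is also an assumption.

```python
# Rows copied from "sh int stats | b Vlan12" in the post above:
# switching path -> (pkts in, chars in, pkts out, chars out)
STATS = {
    "Processor": (576571532, 38111277318, 563901, 37537634),
    "Route cache": (2449, 150205, 25, 2211),
    "Distributed cache": (17598303, 9025600495, 18650247, 13836881113),
}

def pkts_in_share(stats: dict) -> dict:
    """Share of received packets handled by each switching path."""
    total = sum(row[0] for row in stats.values())
    return {path: row[0] / total for path, row in stats.items()}

shares = pkts_in_share(STATS)
for path, share in shares.items():
    print(f"{path:<18}{share:7.1%} of Pkts In")
```

A Processor share this close to 100% is consistent with the 'IP Input' process dominating the CPU: nearly everything arriving on Vlan12 is being punted to the CPU instead of being switched in hardware.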

Giuseppe Larosa Thu, 01/01/2009 - 06:41

Hello Cisco_lite,

I would suggest disabling one ACE module and checking whether, with one ACE disabled and the EtherChannel trunk up, you still see high CPU usage and broadcast traffic.

for ip accounting

int vlan 12

ip accounting mac-address input

then after some minutes


sh ip accounting

Hope to help


cisco_lite Thu, 01/01/2009 - 06:50

Excellent!!! Thanks, Giuseppe, for pointing me in the right direction.

I shut down the BVI on the ACE module and the CPU usage came down. I recently added a new context to the ACE module and the FT group went out of sync. Hence, I could see 'Received ARP collision message' on both ACE modules.

Could you please let me know how I can configure FT for a new context? The FT group for the Admin context was working perfectly fine. Can the same FT group be used for a non-admin context, or is there a separate procedure to configure FT for a non-admin context?


Giuseppe Larosa Thu, 01/01/2009 - 09:47

Hello Cisco_lite,

So something is wrong with the ACE.


If you use failover, you must permit BPDUs on both interfaces with an EtherType ACL to avoid bridging loops.

The following is an example of an EtherType ACL that permits BPDUs:

host1/Admin(config)# access-list NONIP ethertype permit bpdu

Before thinking about the FT link: the two bridged VLANs have to allow the supervisor's BPDU frames, so that the redundant links can be blocked and the bridging loop no longer appears.

Notice that by default the ACE doesn't bridge the BPDUs, and this creates the problem.

Hope to help


Giuseppe Larosa Thu, 01/01/2009 - 23:26

Hello Cisco_lite,

For FT VLANs, groups and contexts, you need to associate the context with an FT group.


Associating a Context with an FT Group

An FT group consists of two members (contexts) with the same name, each residing on a different ACE. To associate a context with an FT group, use the associate-context command in FT group configuration mode. You need to make this association for both redundant contexts in an FT group. The syntax of this command is:

associate-context name

For the name argument, enter the unique identifier of the context that you want to associate with the FT group.

For example, enter:

host1/Admin(config-ft-group)# associate-context C1

Hope to help


