High CPU utilization on Cisco 4500 catalyst switches when using NLB

andrewswanson
Level 7

hello

we're having problems with Microsoft NLB causing high cpu usage (peaks of 95%) on catalyst 4500s (IOS Version 12.2(31)SGA1, RELEASE SOFTWARE (fc3)). please see attached powerpoint slide for network diagram and configs (ip addresses aren't the real ones).

the 2003 servers are configured for IGMP multicast.
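For reference, the cluster MAC address that NLB presents to the network depends on its operating mode, and that MAC is what any static ARP or CAM entry on the switches has to match. The sketch below derives it from a cluster IP using the commonly documented NLB conventions (a 02-bf prefix for unicast, 03-bf for multicast, and 01-00-5e-7f plus the last two IP octets for IGMP multicast, with group 239.255.x.y). The function names are hypothetical, and 1.1.1.13 is only a placeholder cluster IP, since the addresses in this thread are not the real ones.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip: str, mode: str = "igmp") -> str:
    """Return the NLB cluster MAC for a cluster IP (hypothetical helper).

    Assumes the commonly documented NLB addressing conventions.
    """
    octets = ipaddress.IPv4Address(cluster_ip).packed  # 4 raw bytes of the IP
    if mode == "unicast":
        raw = bytes([0x02, 0xBF]) + octets
    elif mode == "multicast":
        raw = bytes([0x03, 0xBF]) + octets
    elif mode == "igmp":
        # IGMP multicast mode uses only the last two octets of the cluster IP
        raw = bytes([0x01, 0x00, 0x5E, 0x7F]) + octets[2:]
    else:
        raise ValueError(f"unknown NLB mode: {mode!r}")
    return "-".join(f"{b:02x}" for b in raw)


def nlb_igmp_group(cluster_ip: str) -> str:
    """Return the multicast group NLB joins in IGMP multicast mode."""
    octets = ipaddress.IPv4Address(cluster_ip).packed
    return f"239.255.{octets[2]}.{octets[3]}"


if __name__ == "__main__":
    ip = "1.1.1.13"  # placeholder cluster IP, not a real address
    for mode in ("unicast", "multicast", "igmp"):
        print(mode, nlb_cluster_mac(ip, mode))
    print("igmp group", nlb_igmp_group(ip))
```

Comparing the derived MAC against the switch's ARP and MAC address tables is one way to confirm that the entries match the mode the servers are actually running in.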

i've followed the http://www.cisco.com/warp/public/473/cat4500_high_cpu.pdf document to troubleshoot.

Found:

• K2CpuMan Review was taking the most cpu

• CPU queue L3 Fwd Low was getting a high volume of traffic so the document recommended a cpu monitor session

CPU monitor session showed:

• High volume of traffic between NLB server (1.1.1.13) and proxy (1.1.1.20)

• High volume of Client (both external/internal) traffic to proxy

Tried

switchport block multicast

switchport block unicast

on VLAN 222 but this made no difference to 4500 cpu

Tried

enabling cgmp on vlan 222 on the 6500, but its cpu jumped to 99% (from 1%) and the 4500s also went to 99%, so cgmp was switched off again on 6500 vlan 222

the traffic we're seeing in the capture isn't unexpected, but what can we do to stop the high cpu on the 4500s?

thanks

andy

8 Replies

aghaznavi
Level 5

Packets that are processed by the IGMP snooping feature are sent to the CPU and can cause the high utilization. Try to keep this traffic out of the ACL fwd (snooping) queue.

http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml

thanks for the feedback. the 4500 cpu queue causing the problem is L3 Fwd Low. the documentation states that:-

"If the ARP is unresolved for the destination IP address, packets are sent to this queue."

i have static arp entries for the destination ip address (the web server) on all the L3 switches but still getting very high cpu usage because of hits on the L3 Fwd Low queue.

cheers

andy

We had exactly the same issue using 6513s with NLB. We solved the problem using an ACL on the VLAN interface; the switch CPU went down to between 10-15% from 95%.

could you post an example of your ACL please?

thanks

andy

Hi, did you find a solution for your case?

I'm trying to implement IGMP in a similar network.

Hi - couldn't find a solution for this. MS NLB worked fine when all load balanced servers were on the same site/campus - as soon as these servers were geo-dispersed we started getting the high cpu problems. we're currently looking at a hardware solution with either Cisco ACE or F5 BigIP.

cheers

andy

This sounds like a reasonably simple case of process switching. The key here will be to find out why the packets are getting punted to the CPU rather than forwarded in hardware. Is there a route to the destination? Is an ACL logging the packets? Are the packets being sent with IP options? You mentioned you have static routes - do they point to an interface or to an IP address? If they point to an IP address, can you resolve ARP for that address? Are there TTL failures, or ICMP redirects being sent out the interface (try a SPAN session to find out)?

For anyone who might run into this issue in the future, we experienced the same problem, but the Cisco documentation is actually backwards. Entering static MAC addresses for the CAM table CAUSES the high CPU instead of preventing it. See my blog post:

http://www.orcsweb.com/blog/jeff/high-cpu-on-cisco-ios-with-msft-nlb-multicast-cluster/

Jeff
