Huge number of entries in MAC address table causing high CPU?

hiepnguyenho
Level 1

Hello all,


We have had a problem with a 6509 with SUP 720 (VSS system) for a week. During working hours, CPU utilization is about 80-90%; after working hours, it drops to about 30-40%.

I have checked the CPU processes, and the output shows high CPU utilization caused by ios-base (please see the attachment).
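
For reference, a typical way to run that check from the CLI (this is generic IOS, not taken from the attachment):

Sw6509-VSS#show processes cpu sorted | exclude 0.00%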


This is really strange because we have not changed anything in the configuration. I'm thinking about some flooding/broadcast on a user VLAN. Here is the result of show mac-address-table:


There are a huge number of static multicast MAC addresses (please see the attachment for details). I did not configure anything like this, so is this the main cause of the high CPU utilization?


Sw6509-VSS#sh mac-address
Legend: * - primary entry
        age - seconds since last seen
        n/a - not available


  vlan   mac address     type    learn     age              ports
------+----------------+--------+-----+----------+--------------------------
*   43  0100.5e42.359d    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   43  0100.5e0c.8b73    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   42  0100.5e7f.c0e1    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   20  0100.5e11.86e2    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   42  0100.5e4f.863c    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   75  0100.5e1d.7dfb    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2


Thank you very much.

Hiep Nguyen.


11 Replies

Mahesh Gohil
Level 7

Hi Hiep Nguyen,

I have faced a similar problem, and the solution given was to configure static routes that point to both an interface and a next-hop IP, instead of only one of the two.
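
Something along these lines (the prefix, VLAN, and next hop below are placeholders, not values from your network):

ip route 10.10.10.0 255.255.255.0 Vlan20 10.1.1.1
! rather than the next-hop-only form:
! ip route 10.10.10.0 255.255.255.0 10.1.1.1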

I don't know the exact reason in your case; let me try some more in my lab and I will get back to you.

Regards

Mahesh

Dear Mahesh,

Thank you for your support. I will try what you said. I am also thinking about the MAC address table. Do you think a huge number of entries in the MAC address table could cause high CPU utilization?

Hello,

I faced a similar issue in my VSS system, and it was caused by ICMP redirects being switched through the CPU instead of the hardware ASICs.

I configured the non-intrusive command "mls rate-limit unicast ip icmp redirect 0" and the CPU load returned to normal values.

Maybe you can give it a try, as it is not disruptive.
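
If you try it, a minimal sketch of applying and verifying the limiter ("show mls rate-limit" lists the programmed rate limiters):

Sw6509-VSS(config)#mls rate-limit unicast ip icmp redirect 0
Sw6509-VSS(config)#end
Sw6509-VSS#show mls rate-limit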

Hope this helps.

jacyu2
Cisco Employee

Hi Hiep,

There are several features configured on the VSS: WCCP, NAT, NetFlow, etc. The high CPU might be caused by flow mask conflicts. Please change the flow mask mode to interface-full.

For more detail, please refer to

Table 57-1     Feature Requirements for Flow Masks

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/configuration/guide/netflow.html
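
A rough sketch of that change (check the current mask first, and verify per the table above that interface-full is compatible with your other features; the verification command may vary by release):

Sw6509-VSS#show mls netflow flowmask
Sw6509-VSS(config)#mls flow ip interface-full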

You could also sniff the traffic punted to the CPU, following the procedure in this link:

http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00804916e0.shtml#span

Regards,

Jack

Thanks Jorge and Jack

I will try what you suggested. However, when I run "show mac-address-table count" and "show mac-address-table", I can see that there are approximately 17,000 static multicast MAC addresses.

I did not statically configure any multicast MAC addresses on interfaces; these 17,000 static MAC addresses are created by IGMP snooping. (I have also checked "show mac-address-table multicast igmp snooping"; the output showed all 17,000 MAC addresses in the CAM table.)

Is this the main cause of my problem? In your experience with 6500 systems as the campus core of a large enterprise, how many MAC addresses in the CAM table are acceptable? Thank you very much.

Hiep Nguyen.

Hello,

In my VSS system I only have 18 dynamic MAC addresses and no multicast MAC addresses, because I do not run IGMP snooping. However, the 6500 series CAM limit is around 98,300 MAC entries, so that should not be a problem.

From you "show proc CPU detail 16407" PID, I see that 25% is used by IOS base and 8% by CPU interruptions. I think you have many packets punted to the CPU, probably multicast traffic.

Please issue the following commands:

- show mls ip multicast

- show mls ip multicast statistics

- show mls ip multicast summary

- show cef not-cef-switched

If the CPU load increases due to IGMP snooping traffic being punted to the CPU, I would configure the command "mls rate-limit multicast ipv4 igmp" with a number of packets per second depending on your needs. You can adjust that value until you reach a normal CPU load.
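
For illustration, a sketch with placeholder numbers (1000 packets per second with a burst of 10; tune both values for your environment):

Sw6509-VSS(config)#mls rate-limit multicast ipv4 igmp 1000 10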

Hope this helps.

During working hours, it seems critical at 90-100%:

Sw6509-VSS(config)#do sh process cpu de 16407
CPU utilization for five seconds: 97%/76%; one minute: 96%; five minutes: 92%

PID/TID   5Sec    1Min     5Min Process             Prio  STATE        CPU
16407    90.7%   89.0%    84.2% ios-base                               6d22h
      1   3.1%    2.2%     1.4%                       10  Receive      2m25s
      2   0.1%    0.2%     0.4%                        5  Ready        6h38m
      3   0.0%    0.1%     0.9%                       10  Receive      3m26s
      4   0.2%    1.6%     1.6%                       10  Receive     30.298
      5   0.0%    0.0%     0.0%                       11  Nanosleep   93.871
      6  70.1%   67.5%    62.4%                       21  Intr         2d19h
      7   2.5%    3.4%     3.4%                       22  Intr        18h39m
      8   0.8%    0.9%     0.9%                       23  Intr         3h28m
      9   0.0%    0.0%     0.0%                       25  Intr         0.000
     10   1.3%    1.0%     1.4%                       10  Reply       18m55s
     11   0.2%    0.2%     0.2%                       10  Receive      1h10m
     12   0.0%    0.0%     0.0%                       10  Condvar      8.858
     14   0.0%    0.0%     0.0%                       20  Sigwaitin    0.030
     15   0.5%    2.1%     2.0%                       10  Reply       21m22s
     16   1.3%    1.2%     1.5%                       10  Receive      6h37m
     17   0.0%    0.0%     0.0%                       10  Reply        3.031
     18   0.7%    0.5%     1.2%                       10  Receive     17m46s
     19   0.7%    2.6%     1.4%                       10  Receive     82.311
     20   3.9%    1.6%     1.7%                       10  Reply        3m03s
     21   4.0%    1.8%     1.8%                       10  Receive      1h38m
     23   1.3%    2.1%     2.0%                       10  Receive     17m15s


Process sbin/ios-base, type IOS, PID = 16407
CPU utilization for five seconds: 13%/76%; one minute: 15%; five minutes: 14%

Task  Runtime(ms)  Invoked  uSecs    5Sec   1Min   5Min TTY Prio Task Name
   1         441      4419     99   0.00%  0.00%  0.00%   0    P Hot Service Task
   2         576      4429    130   0.00%  0.00%  0.00%   0    P Hot Service Task
   3         435      4436     98   0.00%  0.00%  0.00%   0    P Hot Service Task
   4     8488835  54001326    157   1.07%  1.15%  1.20%   0    M Service Task
   5       95490   1517613     62   0.09%  0.06%  0.04%   0    M Service Task
   6       13576    379918     35   0.00%  0.01%  0.00%   0    M Service Task
   7          13        50    260   0.00%  0.00%  0.00%   0    C Chunk Manager
   8           0         1      0   0.00%  0.00%  0.00%   0    H Connection Mgr
   9         501       390   1284   0.33%  0.19%  0.10%   1    M Virtual Exec
  10           0         1      0   0.00%  0.00%  0.00%   0    M PF Redun ICC Req
  11           0         1      0   0.00%  0.00%  0.00%   0    M PF Redun ICC Req
  12           0       138      0   0.00%  0.00%  0.00%   0    M Retransmission o
  13           1         8    125   0.00%  0.00%  0.00%   0    M IPC ISSU Dispatc
  14       55036    499954    110   0.00%  0.00%  0.00%   0    C Load Meter
  15     8587666    581580  14766   0.11%  0.30%  0.33%   0    L Check heaps
  16          65       521    124   0.00%  0.00%  0.00%   0    C Pool Manager
  17           0         1      0   0.00%  0.00%  0.00%   0    M ext_log_pak Svc
  18           0         2      0   0.00%  0.00%  0.00%   0    M Timers
  19      419214   2008720    208   0.00%  0.03%  0.00%   0    M EnvMon
  20           0         1      0   0.00%  0.00%  0.00%   0    L AAA_SERVER_DEADT

In the detail for process 16407, the 76% shown at interrupt level is a big number, so there is a lot of traffic punted to my CPU.

I have tried your commands; there is no multicast traffic.

Sw6509-VSS(config)#do sh cef not
% Command accepted but obsolete, see 'show (ip|ipv6) cef switching statistics [feature]'

IPv4 CEF Packets passed on to next switching layer
Slot  No_adj No_encap Unsupp'ted Redirect  Receive  Options   Access     Frag
RP         0       0    11043448   237743 39760960       10        0        0
21/0       0       0           0        0        0        0        0        0
18/0       0       0           0        0        0        0        0        0
17/0       0       0           0        0        0        0        0        0
37/0       0       0           0        0        0        0        0        0
37/1       0       0           0        0        0        0        0        0
33/0       0       0           0        0        0        0        0        0
34/0       0       0           0        0        0        0        0        0

I will try to sniff the traffic on my switch.

Thank you very much.

Hello,

I suggest typing the command "show ip traffic" and saving the output. Then clear the statistics with "clear ip traffic" and check the statistics again after a couple of minutes. If the ICMP traffic is too high, the command "mls rate-limit unicast ip icmp redirect 0" will be needed.
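
A minimal sketch of that workflow:

Sw6509-VSS#show ip traffic
Sw6509-VSS#clear ip traffic
! wait a couple of minutes, then compare
Sw6509-VSS#show ip traffic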

Regards.

Hi Jorge,

It's so great!! When I ran "show ip traffic", there was a lot of ICMP redirect and unreachable traffic, so I rate-limited those two types of traffic, and my CPU went down to 40%.

Is it caused by a routing protocol or something else? Much appreciated!!
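
(For reference, the unreachable limiter has a similar form to the redirect one shown earlier; a sketch with the no-route variant, where 0 pps drops those punts entirely:)

Sw6509-VSS(config)#mls rate-limit unicast ip icmp unreachable no-route 0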

The cause of these ICMP redirects is as follows:

When a packet is routed out the same interface it arrived on, an ICMP redirect message is generated by default, to inform the sender that there is a more direct next hop able to route the traffic.

For example, when a host sends packets through a non-optimal router, the MSFC sends ICMP redirect messages to the host to correct its sending path. If this traffic occurs continuously and is not rate-limited, the MSFC will continuously generate ICMP redirect messages. On this platform, even if you disable ICMP redirect generation by configuring "no ip redirects" under the relevant interface, at the hardware level the traffic that is supposed to be routed back out the same interface is still sent to the RP engine, which then forwards it.
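
A sketch of the per-interface knob mentioned above (Vlan43 is just an example SVI):

Sw6509-VSS(config)#interface Vlan43
Sw6509-VSS(config-if)#no ip redirects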

Hope this helps.

You're so great, thank you very much, Jorge.

Regards,

Hiep Nguyen.
