Huge number of entries in MAC address table causing high CPU?

hiepnguyenho
Level 1

Hello all,


We have had a problem with a 6509 with SUP 720 (VSS system) for a week. During working hours, CPU utilization is about 80-90%; after working hours, it drops to about 30-40%.

I have checked the CPU processes, and the output shows high CPU utilization caused by ios-base (please see the attachment).
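
For reference, a typical way to run that check from the CLI (this is generic IOS, not taken from the attachment):

Sw6509-VSS#show processes cpu sorted | exclude 0.00%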


This is really strange because we have not changed anything in the configuration. I'm thinking about some flooding/broadcast on a user VLAN. Here is the result of show mac-address-table:


There are a huge number of static multicast MAC addresses (please see the attachment for details). I did not configure anything like this, so is this the main cause of the high CPU utilization?


Sw6509-VSS#sh mac-address
Legend: * - primary entry
        age - seconds since last seen
        n/a - not available


  vlan   mac address     type    learn     age              ports
------+----------------+--------+-----+----------+--------------------------
*   43  0100.5e42.359d    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   43  0100.5e0c.8b73    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   42  0100.5e7f.c0e1    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   20  0100.5e11.86e2    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   42  0100.5e4f.863c    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2
                                                   Po3,Po30
*   75  0100.5e1d.7dfb    static  Yes          -   Gi1/2/3,Gi1/2/4,Gi1/2/8
                                                   Gi2/2/1,Gi2/2/2,Gi2/2/3
                                                   Gi2/2/4,Gi2/2/5,Po1,Po2


Thank you very much.

Hiep Nguyen.


11 Replies

Mahesh Gohil
Level 7

Hi Hiep Nguyen,

I have faced a similar problem, and the solution given was to configure static routes that point to both an interface and a next-hop IP, instead of only one of the two.
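
Something along these lines (the prefix, VLAN, and next hop below are placeholders, not values from your network):

ip route 10.10.10.0 255.255.255.0 Vlan20 10.1.1.1
! rather than the next-hop-only form:
! ip route 10.10.10.0 255.255.255.0 10.1.1.1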

I don't know the exact reason in your case; let me try some more in my lab and I will get back to you.

Regards

Mahesh

Dear Mahesh,

Thank you for your support. I will try what you said. I am also thinking about the MAC address table. Do you think a huge number of entries in the MAC address table could cause high CPU utilization?

Hello,

I faced a similar issue in my VSS system, and it was caused by ICMP redirects being switched through the CPU instead of the hardware ASICs.

I configured the non-intrusive command "mls rate-limit unicast ip icmp redirect 0" and the CPU load returned to normal values.

Maybe you can give it a try, as it is not disruptive.
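
If you try it, a minimal sketch of applying and verifying the limiter ("show mls rate-limit" lists the programmed rate limiters):

Sw6509-VSS(config)#mls rate-limit unicast ip icmp redirect 0
Sw6509-VSS(config)#end
Sw6509-VSS#show mls rate-limit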

Hope this helps.

jacyu2
Cisco Employee

Hi Hiep,

There are several features configured on the VSS: WCCP, NAT, NetFlow, etc. The high CPU might be caused by flow mask conflicts. Please change the flow mask mode to interface-full.

For more detail, please refer to

Table 57-1     Feature Requirements for Flow Masks

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/configuration/guide/netflow.html
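
A rough sketch of that change (check the current mask first, and verify per the table above that interface-full is compatible with your other features; the verification command may vary by release):

Sw6509-VSS#show mls netflow flowmask
Sw6509-VSS(config)#mls flow ip interface-full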

You could also sniff the traffic punted to the CPU, following the procedure in this link:

http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00804916e0.shtml#span

Regards,

Jack

Thanks Jorge and Jack

I will try what you suggested. However, when I run "show mac-address-table count" and "show mac-address-table", I can see that there are approximately 17,000 static multicast MAC addresses.

I did not statically configure any multicast MAC addresses on interfaces; these 17,000 static MAC addresses are created by IGMP snooping. (I have also checked "show mac-address-table multicast igmp snooping"; the output showed all 17,000 MAC addresses in the CAM table.)

Is this the main cause of my problem? In your experience with 6500 systems as the campus core of a large enterprise, how many MAC addresses in the CAM table are acceptable? Thank you very much.

Hiep Nguyen.

Hello,

In my VSS system I only have 18 dynamic MAC addresses and no multicast MAC addresses, because I do not run IGMP snooping. However, the 6500 series CAM limit is around 98,300 MAC entries, so that should not be a problem.

From you "show proc CPU detail 16407" PID, I see that 25% is used by IOS base and 8% by CPU interruptions. I think you have many packets punted to the CPU, probably multicast traffic.

Please issue the following commands:

- show mls ip multicast

- show mls ip multicast statistics

- show mls ip multicast summary

- show cef not-cef-switched

If the CPU load increases due to IGMP snooping traffic being punted to the CPU, I would configure the command "mls rate-limit multicast ipv4 igmp" with a number of packets per second depending on your needs. You can adjust that value until you reach a normal CPU load.
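
For illustration, a sketch with placeholder numbers (1000 packets per second with a burst of 10; tune both values for your environment):

Sw6509-VSS(config)#mls rate-limit multicast ipv4 igmp 1000 10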

Hope this helps.

During working hours, it seems critical at 90-100%:

Sw6509-VSS(config)#do sh process cpu de 16407
CPU utilization for five seconds: 97%/76%; one minute: 96%; five minutes: 92%

PID/TID   5Sec    1Min     5Min Process             Prio  STATE        CPU
16407    90.7%   89.0%    84.2% ios-base                               6d22h
      1   3.1%    2.2%     1.4%                       10  Receive      2m25s
      2   0.1%    0.2%     0.4%                        5  Ready        6h38m
      3   0.0%    0.1%     0.9%                       10  Receive      3m26s
      4   0.2%    1.6%     1.6%                       10  Receive     30.298
      5   0.0%    0.0%     0.0%                       11  Nanosleep   93.871
      6  70.1%   67.5%    62.4%                       21  Intr         2d19h
      7   2.5%    3.4%     3.4%                       22  Intr        18h39m
      8   0.8%    0.9%     0.9%                       23  Intr         3h28m
      9   0.0%    0.0%     0.0%                       25  Intr         0.000
     10   1.3%    1.0%     1.4%                       10  Reply       18m55s
     11   0.2%    0.2%     0.2%                       10  Receive      1h10m
     12   0.0%    0.0%     0.0%                       10  Condvar      8.858
     14   0.0%    0.0%     0.0%                       20  Sigwaitin    0.030
     15   0.5%    2.1%     2.0%                       10  Reply       21m22s
     16   1.3%    1.2%     1.5%                       10  Receive      6h37m
     17   0.0%    0.0%     0.0%                       10  Reply        3.031
     18   0.7%    0.5%     1.2%                       10  Receive     17m46s
     19   0.7%    2.6%     1.4%                       10  Receive     82.311
     20   3.9%    1.6%     1.7%                       10  Reply        3m03s
     21   4.0%    1.8%     1.8%                       10  Receive      1h38m
     23   1.3%    2.1%     2.0%                       10  Receive     17m15s


Process sbin/ios-base, type IOS, PID = 16407
CPU utilization for five seconds: 13%/76%; one minute: 15%; five minutes: 14%

Task  Runtime(ms)  Invoked  uSecs    5Sec   1Min   5Min TTY Prio Task Name
   1         441      4419     99   0.00%  0.00%  0.00%   0    P Hot Service Task
   2         576      4429    130   0.00%  0.00%  0.00%   0    P Hot Service Task
   3         435      4436     98   0.00%  0.00%  0.00%   0    P Hot Service Task
   4     8488835  54001326    157   1.07%  1.15%  1.20%   0    M Service Task
   5       95490   1517613     62   0.09%  0.06%  0.04%   0    M Service Task
   6       13576    379918     35   0.00%  0.01%  0.00%   0    M Service Task
   7          13        50    260   0.00%  0.00%  0.00%   0    C Chunk Manager
   8           0         1      0   0.00%  0.00%  0.00%   0    H Connection Mgr
   9         501       390   1284   0.33%  0.19%  0.10%   1    M Virtual Exec
  10           0         1      0   0.00%  0.00%  0.00%   0    M PF Redun ICC Req
  11           0         1      0   0.00%  0.00%  0.00%   0    M PF Redun ICC Req
  12           0       138      0   0.00%  0.00%  0.00%   0    M Retransmission o
  13           1         8    125   0.00%  0.00%  0.00%   0    M IPC ISSU Dispatc
  14       55036    499954    110   0.00%  0.00%  0.00%   0    C Load Meter
  15     8587666    581580  14766   0.11%  0.30%  0.33%   0    L Check heaps
  16          65       521    124   0.00%  0.00%  0.00%   0    C Pool Manager
  17           0         1      0   0.00%  0.00%  0.00%   0    M ext_log_pak Svc
  18           0         2      0   0.00%  0.00%  0.00%   0    M Timers
  19      419214   2008720    208   0.00%  0.03%  0.00%   0    M EnvMon
  20           0         1      0   0.00%  0.00%  0.00%   0    L AAA_SERVER_DEADT

In the detail for process 16407, the 76% shown at interrupt level is a big number, so there is a lot of traffic punted to my CPU.

I have tried your commands; there is no multicast traffic.

Sw6509-VSS(config)#do sh cef not
% Command accepted but obsolete, see 'show (ip|ipv6) cef switching statistics [feature]'

IPv4 CEF Packets passed on to next switching layer
Slot  No_adj No_encap Unsupp'ted Redirect  Receive  Options   Access     Frag
RP         0       0    11043448   237743 39760960       10        0        0
21/0       0       0           0        0        0        0        0        0
18/0       0       0           0        0        0        0        0        0
17/0       0       0           0        0        0        0        0        0
37/0       0       0           0        0        0        0        0        0
37/1       0       0           0        0        0        0        0        0
33/0       0       0           0        0        0        0        0        0
34/0       0       0           0        0        0        0        0        0

I will try to sniff the traffic on my switch.

Thank you very much.

Hello,

I suggest typing the command "show ip traffic" and saving the output. Then clear the statistics with "clear ip traffic" and check the statistics again after a couple of minutes. If the ICMP traffic is too high, the command "mls rate-limit unicast ip icmp redirect 0" will be needed.
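
A minimal sketch of that workflow:

Sw6509-VSS#show ip traffic
Sw6509-VSS#clear ip traffic
! wait a couple of minutes, then compare
Sw6509-VSS#show ip traffic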

Regards.

Hi Jorge,

It's so great!! When I ran "show ip traffic", there was a lot of ICMP redirect and unreachable traffic, so I rate-limited those two types of traffic, and my CPU went down to 40%.

Is it caused by a routing protocol or something else? Much appreciated!!
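
(For reference, the unreachable limiter has a similar form to the redirect one shown earlier; a sketch with the no-route variant, where 0 pps drops those punts entirely:)

Sw6509-VSS(config)#mls rate-limit unicast ip icmp unreachable no-route 0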

The cause of these ICMP redirects is as follows:

When a packet is routed out the same interface it arrived on, an ICMP redirect message is generated by default, to inform the sender that there is a more direct next hop able to route the traffic.

For example, when a host sends packets through a non-optimal router, the MSFC sends ICMP redirect messages to the host to correct its sending path. If this traffic occurs continuously and is not rate-limited, the MSFC will continuously generate ICMP redirect messages. On this platform, even if you disable ICMP redirect generation by configuring "no ip redirects" under the relevant interface, at the hardware level the traffic that is supposed to be routed back out the same interface is still sent to the RP engine, which then forwards it.
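
A sketch of the per-interface knob mentioned above (Vlan43 is just an example SVI):

Sw6509-VSS(config)#interface Vlan43
Sw6509-VSS(config-if)#no ip redirects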

Hope this helps.

You're so great, thank you very much, Jorge.

Regards,

Hiep Nguyen.
