BGP on a 4510 which has high CPU utilization

Unanswered Question
Aug 19th, 2010

Hi,

I am planning to extend the BGP routes of our private network cloud to the core switch, which is a Cisco 4510.
Before that, I just want to make sure the 4510 acting as core is capable of handling the present 4300 BGP routes of the private cloud, and also future growth.


I see the following challenges on the 4500 switch:

1. CPU utilization is constantly spiking. Here is the graph:


    8888888888888888888888888888884444422222222223333322222222
    3333344444333335555555555444442222222222555551111144444555
100
90                **********
80 ******************************
70 ******************************
60 ******************************
50 ******************************
40 ***********************************
30 ***********************************     **********     ***
20 **********************************************************
10 **********************************************************
   0....5....1....1....2....2....3....3....4....4....5....5....
             0    5    0    5    0    5    0    5    0    5
               CPU% per second (last 60 seconds)

    8528882589452885258945288525893528852589352885258925388536
    4466748563059675976185667675619966658551369576666176967611
100
90    *#   *#   **   *#   **   **   **   **   **   **   **
80 *  *#*  *#   *#   *#   *#   *#   *#   *#   *#   *#   *#
70 *  *#*  *#   *#   *#   *#   *#   *#   *#   *#   *#   *#
60 *  ##* *## * ##* *## * ##* *## * ##* *## * ##* *## * ##* *
50 ** ##* *## * ##* *##** ##* *## * ##* *## * ##* *## * ##* *
40 ** ##* *##** ##* *##** ##* *##** ##* *## * ##* *## **##* *
30 ##*###*#####*###*#####*###*#####*###*#####*###*###*#######
20 ##########################################################
10 ##########################################################
   0....5....1....1....2....2....3....3....4....4....5....5....
             0    5    0    5    0    5    0    5    0    5
               CPU% per minute (last 60 minutes)
              * = maximum CPU%   # = average CPU%

    9999999999999999999999999999999889999999999999999999999999999999999999
    1212222100011112210200113514231981011111211030322212121111000100112132
100                          *
90 **********************************************************************
80 **********************************************************************
70 **********************************************************************
60 **********************************************************************
50 **********************************************************************
40 ######################################################################
30 ######################################################################
20 ######################################################################
10 ######################################################################
   0....5....1....1....2....2....3....3....4....4....5....5....6....6....7.
             0    5    0    5    0    5    0    5    0    5    0    5    0
                   CPU% per hour (last 72 hours)
                  * = maximum CPU%   # = average CPU%


CTSINMUMWINCORS1-3#sh processes cpu | exclude 0.00%
CPU utilization for five seconds: 84%/6%; one minute: 67%; five minutes: 48%
PID      Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 14         21386956     47536404       449  0.15%  0.15%  0.15%   0 ARP Input
 36        251641280     29355777      8572  1.19%  1.15%  1.14%   0 IDB Work
 48       1723658056   1236754938      1393 18.79% 16.54% 14.14%   0 Cat4k Mgmt HiPri
 49       1391474800   3235177802       430 13.19% 11.45%  9.68%   0 Cat4k Mgmt LoPri
 96          8274860     76071420       108  0.23%  0.21%  0.21%   0 CDP Protocol
103        657189836   2666893213       246  7.19%  5.26%  3.11%   0 IP Input
106         10031404     26977475       371  0.07%  0.09%  0.08%   0 ADJ resolve proc
109        398618552    213545634      1866  2.07%  2.26%  2.26%   0 Spanning Tree
144          7241736     32973568       219  0.07%  0.10%  0.08%   0 CEF: IPv4 proces
194              124         2400        51  0.31%  0.08%  0.02%   1 Virtual Exec
196          6956496   1275581072         5  0.31%  0.35%  0.34%   0 HSRP Common
200       1355102708    535068069      2532 13.91%  9.83%  5.46%   0 IP SNMP
201        322639364   2386101047       135  4.00%  2.82%  1.54%   0 PDU DISPATCHER
202       1194635416   2416539742       494 15.75% 11.20%  6.21%   0 SNMP ENGINE


2. We observed that Cat4k Mgmt HiPri and Cat4k Mgmt LoPri are always high. The CPU utilization looks worrying, doesn't it?

3. Flash memory status is as follows:

CTSINMUMWINCORS1-3#dir
Directory of bootflash:/

    1  -rwx    12975808  Nov 29 2009 00:21:59 +00:00  cat4500-entservices-mz.122-31.SG.bin
    3  -rwx    16630624  Nov 29 2009 03:18:27 +00:00  cat4500-entservices-mz.122-50.SG4.bin

59244544 bytes total (13007104 bytes free)


Is it capable of handling the present 4300 BGP routes and future growth?


yogesh.suryawanshi Thu, 08/19/2010 - 04:03

Hi,

Thanks for the lightning-fast response!

What if I just accept a default route from the iBGP peer and advertise segments from the 4510?

Will that cause issues on the 4510 switch?

Looking at the 4510, what do you think? The CPU utilization is worrying; however, we don't see any performance issues.

Yogesh



paolo bevilacqua Thu, 08/19/2010 - 04:06

A default route, or just a few routes with little update exchange, is OK.
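For the default-route-only approach, a minimal sketch of the inbound filter on the 4510 (the neighbor address and AS number here are hypothetical placeholders; substitute your own):

```
! Accept only the default route from the iBGP peer
ip prefix-list DEFAULT-ONLY seq 5 permit 0.0.0.0/0
!
router bgp 65001
 neighbor 192.0.2.1 remote-as 65001
 neighbor 192.0.2.1 prefix-list DEFAULT-ONLY in
```

After applying the filter, a soft inbound clear ("clear ip bgp 192.0.2.1 soft in") refreshes the table without tearing down the session.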


You do not see performance degradation because that is a layer 3 switch, and CPU usage does not impact forwarding performance done in hardware.


Notwithstanding, you should always limit CPU usage to reasonable values.


Please remember to rate useful posts by clicking on the stars below.

paolo bevilacqua Thu, 08/19/2010 - 03:56

Campus switches are ALWAYS a bad choice for protocol-intensive applications like BGP, flapping networks, or many prefixes.


That is because their slow CPU is not designed for that.


Either find a way not to send all these prefixes to the switch, or upgrade to a truly WAN-capable device.

Giuseppe Larosa Thu, 08/19/2010 - 11:54

Hello Yogesh,

what Paolo says is 100% true: if you can live with just a single iBGP default route, there is no need to receive 4300 routes, since in any case the core switch has to send the packets to the WAN-facing router(s). You just need a direct link between the WAN routers.


>> 2.  we observed Cat4k Mgmt HiPri & Cat4k Mgmt LoPri is always high.....CPU Utlization looks worrying,...?


About what you see in the C4500 CPU usage, there is a specific document that can be of help; these processes are C4500-specific:


http://www.cisco.com/en/US/products/hw/switches/ps663/products_tech_note09186a00804cef15.shtml
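Besides the platform processes, your first output also shows IP SNMP and SNMP ENGINE near 30% combined. If your NMS is walking the routing table, you can exclude it from polling with an SNMP view; a minimal sketch (the view and community names are hypothetical):

```
! Walking a large IP route table over SNMP is a classic cause of
! high IP SNMP / SNMP ENGINE CPU. Exclude ipRouteTable (ip.21).
snmp-server view CPUSAFE iso included
snmp-server view CPUSAFE ip.21 excluded
snmp-server community MYCOMMUNITY view CPUSAFE ro
```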


Hope to help

Giuseppe

Chad Peterson Thu, 08/19/2010 - 17:25

As Giuseppe said above, check out that doc...that is the BEST high CPU troubleshooting doc.



Also, how many routes do you have on there ('show ip route summary')? I know you said 4300 BGP routes; I am just curious what the total on the box is.


What supervisor are you using?



yogesh.suryawanshi Fri, 08/20/2010 - 02:07

Hello,

Here goes the output of sh ip route summary

CTSINMUMWINCORS1-3#sh ip route summary

IP routing table name is Default-IP-Routing-Table(0)
IP routing table maximum-paths is 8
Route Source    Networks    Subnets     Overhead    Memory (bytes)
connected       1           18          1216        2888
static          2           4           384         912
ospf 400        0           3           192         456
  Intra-area: 2 Inter-area: 1 External-1: 0 External-2: 0
  NSSA External-1: 0 NSSA External-2: 0
internal        1                                   1172
Total           4           25          1792        5428

I was just going through the document and trying to see which processes are hitting the CPU.

Following is the output:

CTSINMUMWINCORS1-3#sh processes cpu sorted | exclude 0.00%

CPU utilization for five seconds: 27%/1%; one minute: 32%; five minutes: 42%

PID      Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 48       1730299944   1257770854      1375 11.43% 11.68% 13.22%   0 Cat4k Mgmt HiPri
 49       1397065112   3247705765         0  7.11%  8.78%  8.82%   0 Cat4k Mgmt LoPri
  7         45279984      3931314     11517  2.23%  0.29%  0.20%   0 Check heaps
109        400145452    214338174      1866  2.00%  2.22%  2.25%   0 Spanning Tree
 36        252606240     29463177      8573  1.11%  1.15%  1.14%   0 IDB Work
103        659581256   2676841939         0  0.39%  1.08%  2.40%   0 IP Input
195               44           80       550  0.39%  0.05%  0.01%   2 Virtual Exec
196          6984900   1280450538         5  0.31%  0.35%  0.34%   0 HSRP Common
 96          8310080     76380603       108  0.15%  0.21%  0.21%   0 CDP Protocol
 14         21496992     47768286       450  0.15%  0.15%  0.15%   0 ARP Input
202       1199225120   2425787568         0  0.07%  1.66%  4.66%   0 SNMP ENGINE
 56          2628460      6284609       418  0.07%  0.04%  0.05%   0 Compute load avg
106         10085196     27123570       371  0.07%  0.15%  0.15%   0 ADJ resolve proc
144          7273800     33098733       219  0.07%  0.10%  0.09%   0 CEF: IPv4 proces
200       1360324872    553587573      2457  0.07%  1.46%  4.09%   0 IP SNMP
163          3665196    216560293        16  0.07%  0.07%  0.07%   0 PM Callback

Under 'show platform health' I see the following process with high stats:

K2CpuMan Review       30.00   6.72     30     95  100  500    7   9    9  33187:28

CTSINMUMWINCORS1-3#show platform cpu packet statistics

Packets Received by Packet Queue
Queue                  Total           5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Esmp                        3371834142       191       198       161        152
L2/L3Control                 307722052        23        17        12         11
Host Learning                 12031432         2         1         0          0
L3 Fwd Low                    19253951         1         0         0          0
L2 Fwd Low                    99099662         8         3         4          2
L3 Rx Low                   2531021968       373       161       106        113

Packets Dropped by Packet Queue
Queue                  Total           5 sec avg 1 min avg 5 min avg 1 hour avg
---------------------- --------------- --------- --------- --------- ----------
Host Learning                    20606         0         0         0          0

I can see hits in the L2/L3Control and L3 Rx Low queues. I am not experienced enough to dig out the reasons behind the high CPU utilization. Please help!
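From the troubleshooting document, the next step I plan to try is capturing the packets that are punted to the CPU, roughly like this (run only briefly, since the debug itself adds load):

CTSINMUMWINCORS1-3#debug platform packet all receive buffer
CTSINMUMWINCORS1-3#show platform cpu packet buffered
CTSINMUMWINCORS1-3#undebug all

If I understand the document correctly, the buffered output shows the source/destination of the punted frames, which should point at whatever flow is behind the L3 Rx Low hits.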
