BGP with Sup720-3B

Unanswered Question
Oct 22nd, 2009

hello,


I have a Catalyst 6500 with a Sup720-3B and a 6704-10GE. I am running full BGP. Traffic comes in on the gigabit ports on the sup and goes out on the 10-gigabit port. The problem is that CPU load rises as traffic rises. At the rate it is rising, it looks like I will reach 100 percent CPU long before I reach 10 Gbps.


I don't have any crazy access lists. How can I find out why the CPU load is getting so high?


Thanks!





Giuseppe Larosa Sat, 10/24/2009 - 11:50

Hello Doron,

The output of show module would help to understand whether all your modules use DFCs or whether you have, for example, GE ports on a module with a CFC.

This can make a difference and could explain why CPU use grows as traffic load increases.


I see you have only 160,000, which is under the limits for your supervisor (256,000 IPv4 prefixes, 128,000 IPv6 prefixes).


However, I worry about the following line:


External: 295127 Internal: 6 Local: 0


295,000 prefixes is a full Internet table nowadays.


What does sh ip bgp sum show?

How many BGP prefixes is the box receiving?


see

http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/product_data_sheet09186a0080159856.html


If the number of BGP prefixes that you receive and accept is higher than what the CEF table can hold, part of the traffic is process switched, which loads the main CPU heavily.
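To make the overflow concrete, here is a minimal sketch that compares the received prefix count with the hardware table size. The 256,000 figure and the 295,127 count come from this thread; the 1,000,000 figure for the PFC3BXL is the commonly quoted data sheet value and is an assumption here. The names `FIB_CAPACITY_IPV4` and `fib_headroom` are made up for illustration, and real capacity also depends on how the table is shared with IPv6 and other entries.

```python
# IPv4 route capacities per PFC variant. The PFC3B value is quoted in this
# thread; the PFC3BXL value is an assumption taken from the data sheet.
FIB_CAPACITY_IPV4 = {"PFC3B": 256_000, "PFC3BXL": 1_000_000}


def fib_headroom(prefixes: int, pfc: str) -> int:
    """Return the remaining FIB entries; a negative value means overflow."""
    return FIB_CAPACITY_IPV4[pfc] - prefixes


received = 295_127  # the "External" count quoted above

for pfc in FIB_CAPACITY_IPV4:
    room = fib_headroom(received, pfc)
    status = "fits" if room >= 0 else "overflows"
    print(f"{pfc}: {status} ({room:+d} entries)")
```

A negative headroom is the case described above: the routes that do not fit in hardware are the ones whose traffic ends up on the main CPU.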


Hint:

you can use

sh proc cpu sorted 1min


to see processes in order of cpu usage


Hope this helps

Giuseppe


sridsdale Sat, 10/24/2009 - 11:59

The SUP720-3B is only recommended for up to 256K routes. As you can see from your summary, the full Internet routing table exceeds that nowadays.

It could well be that the PFC is struggling and has to call on the main board CPU for assistance.

I believe the SUP720-3BXL supports 1M routes, as it has a better PFC installed.

vishwancc Sun, 10/25/2009 - 22:45

Hi Dhalevi,


CPU utilization for five seconds: 18%/15%; one minute: 20%; five minutes: 21%.


Your high CPU is due to interrupts, not processes. Could you check whether CEF is working properly on the device?
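For anyone reading that line for the first time: in the five-second field, the number before the slash is total CPU and the number after it is the part spent at interrupt level, which on IOS is mostly packets being switched in software. A minimal sketch of the split, with `CPU_LINE` and `split_cpu` as made-up names for illustration:

```python
import re

# Matches the five-second field of the "show processes cpu" header line.
# The first number is total CPU, the second is the interrupt-level share.
CPU_LINE = re.compile(r"five seconds:\s*(\d+)%/(\d+)%")


def split_cpu(line: str) -> tuple[int, int]:
    """Return (interrupt_pct, process_pct) for the five-second sample."""
    match = CPU_LINE.search(line)
    if match is None:
        raise ValueError("not a 'show processes cpu' header line")
    total, interrupt = int(match.group(1)), int(match.group(2))
    return interrupt, total - interrupt


line = "CPU utilization for five seconds: 18%/15%; one minute: 20%; five minutes: 21%."
interrupt, process = split_cpu(line)
print(f"interrupt: {interrupt}%, process: {process}%")
```

With the values quoted above, 15 of the 18 percent is interrupt level and only 3 percent is processes, which is why the question is about CEF rather than about a busy process such as BGP.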


Regards

Vishwa

dhalevi Wed, 10/28/2009 - 06:55

hi,


Thank you for your input. I got rid of BGP and put in a default route. The CPU usage was still high. I thought there might be a problem with the sup, so I forced the hot standby to take over. The problem remained. Then I forced a switchover again, and now the problem has gone away.


R_BGP_3#show platform hardware cap cpu
CPU Resources
  CPU utilization: Module    5 seconds    1 minute    5 minutes
                   1         0% / 0%      0%          0%
                   2         0% / 0%      0%          0%
                   5 RP      0% / 0%      0%          0%
                   5 SP      5% / 0%      7%          7%
                   6 RP      0% / 0%      0%          0%
                   6 SP      3% / 0%      4%          4%
  Processor memory: Module   Bytes: Total     Used        %Used
                   1         219712512        53998320    25%
                   2         219712512        53998104    25%
                   5 RP      395027248        61022104    15%
                   5 SP      369174164        96653808    26%
                   6 RP      395057616        61854320    16%
                   6 SP      369152132        96154512    26%
  I/O memory: Module         Bytes: Total     Used        %Used
                   5 RP      67108864         10418792    16%
                   5 SP      67108864         10418736    16%
                   6 RP      67108864         10418792    16%
                   6 SP      67108864         10418736    16%
R_BGP_3#


Does anyone have an explanation of what is going on?


thanks!

Giuseppe Larosa Sun, 11/01/2009 - 01:01

Hello Doron,


>> Does anyone have an explanation of what is going on?


No, there is no clear explanation of what happened.

What we know for sure: in the initial scenario there were too many BGP prefixes to fit in the CEF tables, so part of the traffic was process switched and you saw CPU usage increase as traffic volume increased.

Then you removed BGP and put in a simple static route.

But you still saw high CPU.

The additional aspect to consider is the two supervisors: could the CPU have been very high because of problems in communication between the supervisors after such a big topology change (1 prefix instead of 300,000)?

After reloading the active sup, the problem appears to be solved.


It was stuck in some activity, but we cannot say which process it was.


Hope this helps

Giuseppe

