Strange LAN slowness response

May 12th, 2008

I've set up a Catalyst 3750 to do L3 routing and connect directly to an MPLS network via its interfaces g1/0/24 and g2/0/24. The Cat3750 runs BGP and peers with the MPLS cloud. Attached is the configuration I've done.


There is NO problem running the switch until I carry out a link fail-over test: I pull the cable from int g1/0/24 (the primary link) while running a continuous ping to a remote site from a PC on the local LAN, with a console session open to the switch at the same time. From the console I can see that BGP took < 10 sec to converge, after which I was able to ping the remote site from the switch. The PC, however, timed out with several ping failures, and connectivity was not restored until about 70 seconds later. What I don't understand is why it takes 70 seconds for the PC to reach the remote site instead of 10 seconds. I have a feeling that this might have something to do with the command "bgp dampening", but I'd like a second opinion. Thanks in advance for your comments.

mheusing Tue, 05/13/2008 - 01:47
User Badges: Cisco Employee

Hi,


First, "bgp dampening" might delay convergence in other cases, and I would recommend removing it unless there is an important reason to keep it. BGP dampening suppresses a BGP path if it flaps several times. With default settings this only happens if there are three flaps within a short time. Failing an interface once will not trigger dampening with default settings, so it does not explain the observed behaviour. BGP dampening was introduced to reduce CPU usage in internet routers caused by excessive flapping, but even in the internet it is now discouraged (http://www.ripe.net/ripe/docs/ripe-378.html). So in your environment I would consider it more harmful than helpful, which is why I recommend removing it.
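
To illustrate why a single cable pull should not trigger dampening, here is a minimal Python sketch of the penalty arithmetic. It assumes the commonly documented Cisco IOS defaults (penalty of 1000 per flap, half-life of 15 minutes, suppress limit 2000, reuse limit 750); check these against your own release. The function names are made up for this example, and the model is simplified: it ignores the maximum suppress time and only evaluates the reuse limit at flap times.

```python
# Assumed Cisco IOS "bgp dampening" defaults; verify against your release.
FLAP_PENALTY = 1000        # penalty added per flap (route withdrawal)
HALF_LIFE_S = 15 * 60      # penalty halves every 15 minutes
SUPPRESS_LIMIT = 2000      # path is suppressed once the penalty exceeds this
REUSE_LIMIT = 750          # path is reused once the penalty decays below this


def decay(penalty, elapsed_s):
    """Exponentially decay a penalty over elapsed_s seconds."""
    return penalty * 0.5 ** (elapsed_s / HALF_LIFE_S)


def penalty_history(flap_times_s):
    """Return (time, penalty, suppressed) right after each flap.

    Simplification: the reuse check is only evaluated at flap times.
    """
    penalty, last, suppressed = 0.0, None, False
    history = []
    for t in sorted(flap_times_s):
        if last is not None:
            penalty = decay(penalty, t - last)
            if suppressed and penalty < REUSE_LIMIT:
                suppressed = False
        penalty += FLAP_PENALTY
        if penalty > SUPPRESS_LIMIT:
            suppressed = True
        history.append((t, penalty, suppressed))
        last = t
    return history


if __name__ == "__main__":
    # One cable pull, as in the fail-over test: penalty 1000, not suppressed.
    print(penalty_history([0]))
    # Hypothetical: three flaps one minute apart push the penalty past 2000.
    print(penalty_history([0, 60, 120]))
```

Under these assumptions, one flap leaves the penalty at 1000, well below the suppress limit, while a third flap within a couple of minutes crosses it.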


Second, this observation imho has to do with the networks involved. When you ping from the switch, it sources the packets from the outgoing link interface by default. The two "link networks" on g1/0/24 and g2/0/24 are routed in the provider network to the respective PE routers where the interfaces connect, so I assume convergence in that case depends mainly on local BGP convergence. When the PC sends the pings, a different source network (your LAN) is used. Traffic to this network is normally routed through the MPLS network to your primary link. Usually, depending on the exact provider config of your MPLS VPN, the remote PE only knows the one route via your primary link. The PE with the failed link has to withdraw this route, and the other path has to be announced in the MPLS provider network through BGP. So the 70 sec can be explained by normal BGP update behaviour and default timers.


You could confirm this with an extended ping from your switch, using a LAN IP as the source. You will likely see a convergence time of around 70 sec.
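
To compare the two tests with numbers rather than by eye, the outage length can be worked out from timestamped ping results. The following is a small Python sketch, not a finished tool: the function name and the sample data are hypothetical, and it assumes you have already collected (timestamp, success) pairs from whatever ping utility you use.

```python
from typing import Iterable, List, Tuple


def outages(samples: Iterable[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return (start, duration) for each run of failed pings.

    start is the timestamp of the first failed probe; duration is the time
    until the next successful probe. An outage still in progress at the end
    of the samples is not reported.
    """
    result = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                       # first lost probe of an outage
        elif ok and start is not None:
            result.append((start, ts - start))
            start = None                     # connectivity is back
    return result


if __name__ == "__main__":
    # Made-up data: one probe per second, probes at t=2 and t=3 are lost.
    demo = [(0, True), (1, True), (2, False), (3, False), (4, True)]
    print(outages(demo))
```

Running it once on results sourced from the link interface and once on results sourced from a LAN address would show whether the two convergence times really differ as described.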


One way of reducing the convergence time is to reduce the eBGP update interval (the per-neighbor advertisement interval) on your 3750; the provider will also have to configure his PE accordingly to get faster convergence. I suggest talking to your provider about the options he can offer for faster convergence in case 70 sec is unacceptable.


Hope this helps! Please use the rating system.


Regards, Martin

vincent-n Wed, 05/14/2008 - 18:30

Thank you very much for your explanation. Hopefully things will work the way you've suggested.
