BGP and backup VPN


I have a minor issue I need help with.

We have installed DSL and cable modems at various remote sites and are running GRE IPsec tunnels over those connections back to our HQ. These are always-up connections running EIGRP that take over if the primary T1s ever fail. On the primary we are using BGP over an MPLS network.

Overall this works great. On almost all of the backup VPNs, failover loses only a few packets. A few sites, however, take as long as 3 minutes to pick up the backup connection. Once they are on the backup they work fine, but does anyone know the reason for the discrepancy at this handful of sites? The configs are the same and the vendors are different (so it is not a vendor issue). Is there a BGP or routing issue that would delay the backup from taking over at some sites but not at others?

Any help would be appreciated.

scottlivingston Fri, 05/30/2008 - 12:12

As I'm sure you know, 3 minutes (180 seconds) is the default BGP hold time. I would verify the timers on both sides of the peering.

Also, are you mimicking the outage the same way for all connections?


scottlivingston Mon, 06/02/2008 - 10:25

Wish I had a quick answer for you, man. If it were me (and I'm sure you already took a similar approach) I would find the common factors between the handful of sites that are not failing over immediately and compare that to the others that are.

My understanding is that if the timers differ on the two sides of the peering, the lower hold time wins. But we are dealing with the 180-second hold, which is the default, so it looks like neither side has lowered it.
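
To make the "lowest timer wins" rule concrete, here is a minimal sketch of hold-time negotiation as described in RFC 4271: each side offers its configured hold time in the OPEN message and both use the smaller value. The function names are made up for illustration, and the one-third keepalive ratio is the commonly suggested value, not something confirmed for the routers in this thread.

```python
def negotiate_hold_time(local_hold_s: int, peer_hold_s: int) -> int:
    """Return the hold time both peers end up using (the smaller offer)."""
    for value in (local_hold_s, peer_hold_s):
        # RFC 4271 allows 0 (keepalives disabled) or at least 3 seconds.
        if value != 0 and value < 3:
            raise ValueError(f"invalid hold time: {value}")
    return min(local_hold_s, peer_hold_s)


def keepalive_interval(hold_s: int) -> int:
    """Assumed keepalive: one third of the negotiated hold time."""
    return hold_s // 3


if __name__ == "__main__":
    # Hypothetical timer pairs: both default, peer lowered, local lowered.
    for local, peer in [(180, 180), (180, 30), (15, 180)]:
        hold = negotiate_hold_time(local, peer)
        print(f"local={local}s peer={peer}s -> "
              f"hold={hold}s keepalive={keepalive_interval(hold)}s")
```

If both sides are left at 180 seconds, the negotiated value is also 180 seconds, which would match the 3-minute delay seen at the slow sites. Lowering the timer on either side alone should be enough to shorten it.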

I'll be watching to see the winning post for this one.


Yeah, that is the problem: there is no one common denominator among the sites that are not failing over right away (at least none that I have found yet).

I have a few more tests scheduled in the next few weeks, and I am going to run a debug during them to get more detailed info, but I thought I would see if anyone had suggestions in the meantime.

avillalva Mon, 06/02/2008 - 22:17

I wonder if it's related to the peering interface on the other end. For example, if the peering interface is a loopback (eBGP multihop + update-source) and the physical interface dropped, the session would stay up until the hold timer expired.

If the peering were on a physical interface that dropped, the session would drop immediately.
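
The difference between the two cases can be sketched as a rough model. This is a simplification under stated assumptions: the function and parameter names are hypothetical, the 180-second default is taken from earlier in the thread, and any faster failure-detection mechanism that might be configured is ignored.

```python
def worst_case_detection_s(peer_on_loopback: bool,
                           local_link_goes_down: bool,
                           hold_s: int = 180) -> int:
    """Rough upper bound on how long a dead BGP session is kept.

    Assumption: a session on a directly connected physical interface is
    reset as soon as that interface goes down. In every other case the
    router only notices once the hold timer expires.
    """
    if not peer_on_loopback and local_link_goes_down:
        return 0  # interface-down event tears the session down at once
    return hold_s  # wait for missed keepalives to exhaust the hold timer


if __name__ == "__main__":
    print(worst_case_detection_s(peer_on_loopback=False, local_link_goes_down=True))
    print(worst_case_detection_s(peer_on_loopback=True, local_link_goes_down=True))
    print(worst_case_detection_s(peer_on_loopback=False, local_link_goes_down=False))
```

Note the third case: even with physical-interface peering, a failure that leaves the local interface up would still fall back to the hold timer in this model, which is why simulating the outage the same way at every site matters.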

Hope that helps,


