Very Technical BGP Question

cbeswick · ‎10-21-2005

Hi,

Imagine if you will a head office and a remote office. Each site uses the OSPF routing protocol.

Two ISP networks interconnect the two sites, with BGP running on each provider network. At each site the ISP has provided two Customer Edge (CE) routers that redistribute OSPF into BGP, and vice versa at the remote office, e.g.:

Head Office       Head Office CE Router          Remote Office CE Router        Remote Office
-----------       ---------------------          -----------------------        -------------
OSPF Routing  ->  Redistribute OSPF to BGP  ->   Redistribute BGP to OSPF  ->   OSPF Routing

Routes from the head office are advertised down to the remote office over both ISP networks. Routes from the remote office are advertised up to the head office over both ISP networks.

OSPF timers between our company's routers and the service provider CE routers they connect to have been reduced, so that if one of the provider networks fails, OSPF will quickly flush out routes learnt from the failed network and force all traffic to pass over the working provider "cloud".

The problem is this:

OSPF is flushing out the entries in about 3-4 seconds, but it is taking the failed provider network about 30 seconds to flush out the routes being advertised back up to head office.

Is there any way to configure a triggered update, or to modify the timers on each ISP CE, so that the routes learnt over the failed network are flushed out of the routing tables more quickly?

The ISP cannot modify the backbone BGP timers on the PE routers, as this could affect the design and convergence of the whole backbone, on which many customers operate.

I am sure there is something we can do on the CE routers at each site, where redistribution takes place, to trigger a flush of the lost routes or to speed up their aging out, but I am unsure exactly what :)

Thanks in advance

baldy · ‎10-25-2005

The delay that you see is a function of a number of BGP nodes all taking their sweet time about dealing with the changes. It's what BGP does. Think of BGP as an old man: it's very stately and dignified.

There is a function on the PE which will allow it to withdraw routes immediately based on loss of a connected interface, rather than waiting for a peer timeout. This helps when the tail circuit goes down, which is after all the most likely scenario. The only thing you can really do to speed up reconvergence here is to reduce the peer advertisement interval, but this requires changes to your ISP's networks. Alternatively, move to Cable and Wireless's IPVPN, as we have already "tweaked" our network and can reconverge in about 10 seconds ;-)
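To put rough numbers on why several BGP nodes in a row add up, here is a minimal Python sketch. It assumes a simplified model in which each BGP hop may hold a withdrawal for up to one full advertisement interval before passing it on, so the worst case is the failure detection time plus the sum of the per-hop intervals. The hop names and timer values are hypothetical illustrations, not figures taken from any real provider network.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BgpHop:
    """One BGP speaker on the path the withdrawal has to travel (hypothetical)."""
    name: str
    advertisement_interval_s: float  # minimum gap between updates sent to the next peer


def worst_case_withdraw_delay(detection_s: float, hops: Sequence[BgpHop]) -> float:
    """Upper bound under the simplified model: detection time plus one
    full advertisement interval at every hop along the path."""
    return detection_s + sum(h.advertisement_interval_s for h in hops)


# Illustrative values only: failure detected immediately (interface down),
# then two hops each holding the withdrawal for up to their interval.
untuned = [BgpHop("hop-A", 5.0), BgpHop("hop-B", 30.0)]
tuned = [BgpHop("hop-A", 1.0), BgpHop("hop-B", 5.0)]

print(worst_case_withdraw_delay(0.0, untuned))  # 35.0
print(worst_case_withdraw_delay(0.0, tuned))    # 6.0
```

In this model the hop with the largest advertisement interval dominates the total, which is why shortening the OSPF timers at the edge alone does not shorten the overall reconvergence.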

cbeswick · ‎10-26-2005

Thanks - this has helped a lot.