
BGP fallover timers

Hi

I have two ISPs providing MPLS VPN service to my sites. We want to use BGP as the PE-CE routing protocol (an EBGP session with each ISP) to exchange my LAN routes, but I have some questions about convergence time: I need to switch to the secondary ISP as fast as possible in my situation.

There are a lot of limiting timers in BGP, here are my thoughts:

1. The BGP import scan timer, which defaults to 15 seconds on PE routers. Please correct me if I am mistaken, but my understanding is that this timer is configured globally in the PE's BGP process and cannot be configured per VRF instance, so my ISP will not lower it for me because of the risk that the PE will consume many more resources.

Because of this limitation, my worst-case convergence time already includes up to 15 seconds for this step alone.

2. The BGP advertisement interval, which defaults to 30 seconds. I want to ask my ISP to lower it to 15 seconds. This can be done per neighbour, and with about a hundred prefixes in total from all my sites it should not be a problem for my ISP (in terms of resource consumption on its PE). My understanding is that when my CE loses its PE neighbour, the information will then be delivered to all other CEs twice as fast in the best case.

3. BGP keepalive and hold time intervals. I am not able to use BFD (the ISP refused to set it up) or fast-external-fallover (there is an L2 device between our routers) for fast session deactivation, so the only way I see is to lower the BGP timers. My idea is to set BGP timers 4 12 on my CE. PE routers will accept

4. The BGP scan-time interval on my CEs, which defaults to 60 seconds. My understanding is that this timer is responsible for checking next-hop reachability for the prefixes and (I am not sure about this) for installing routes from the BGP table into the global routing table. My suggestion is to set the BGP scan time to 10 seconds.

I suppose that all these timers work independently of each other, but in my case I expect my average convergence time to be around 15 seconds. Am I missing something?
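For the keepalive and hold time in point 3, here is a rough sketch of how the timers interact. It relies on RFC 4271, which has each peer use the smaller of its own hold time and the one received in the OPEN message, and requires the hold time to be either zero or at least three seconds. The function names are hypothetical, the 180-second PE hold time is the Cisco default and is only assumed here, and keepalives are assumed to arrive exactly on schedule with no jitter.

```python
def negotiated_hold_time(local_hold, peer_hold):
    """Hold time used on the session: the smaller of the two proposals (RFC 4271)."""
    for value in (local_hold, peer_hold):
        # RFC 4271: the hold time must be zero or at least three seconds.
        if value != 0 and value < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")
    return min(local_hold, peer_hold)


def detection_window(keepalive, hold):
    """(best, worst) seconds between a silent link failure and hold-timer expiry.

    Assumption: the last keepalive arrived between 0 and `keepalive` seconds
    before the failure, and the hold timer restarted when it arrived.
    """
    return (hold - keepalive, hold)


# CE configured with "timers 4 12"; the PE is assumed to keep the default 180.
hold = negotiated_hold_time(12, 180)
print(hold)                       # 12
print(detection_window(4, hold))  # (8, 12)
```

So under these assumptions a dead session is detected between 8 and 12 seconds after the link fails silently.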


Mahesh Gohil
Level 7

Hello,

Well, I think you are talking about fallback from primary to secondary and vice versa.

Convergence is basically bringing all routing tables to a state of consistency; the more routers in the network, the longer it takes to converge.

So the scan time verifies what the router already has, and the advertisement interval controls when new information is advertised to a peer.

I think you are more concerned here with fallback only.

For that I can see only the hold time (after how much time traffic will switch over).

BGP is a path-vector protocol; it is built to handle a huge number of routes, not for fast response.

If your requirement is to shift traffic immediately, use another routing protocol (OSPF/EIGRP), but I don't think your provider (or you) will agree to that.

Your provider must support BFD; that is the only option left.

You have already noticed that every routing protocol has timers in seconds, while any voice application needs a response in milliseconds. Only BFD can help you with this.

hope this helps

Regards

Mahesh

Thank you for your reply

You are right, my goal is to switch from the main provider to the second one as fast as possible in my situation.

BFD is supported by my ISPs, but they have some internal company rules that do not allow it to be set up with customers.

I'd like to understand the whole process and predict/optimize the average fallover time. Assume I have this simple setup:

CE1--PE1--P1--PE2--CE2

The link between CE1 and PE1 goes down and the process starts (I will use my corrected timers, which are described above).

In the worst case:

1. PE1-CE1 hold time expires (12 seconds)

2. Scan timer for importing routes from the VRF into MP-BGP expires on PE1 (+15 seconds)

3. Routing information travels across the ISP network, passes the route reflectors and arrives on PE2 (+5 seconds?)

4. Advertisement interval timer expires on PE2 and the prefix is withdrawn from CE2 (+15 seconds)

5. Scan timer expires on CE2 (+10 seconds). Not sure about this one. After the route has been withdrawn from the BGP table, the backup route (through ISP2, not shown here) should be chosen as best and installed in the global routing table. I believe the scan timer is responsible for this (please correct me if I am mistaken).

So, in the worst case I will get 12+15+5+15+10=57 seconds of fallover. But because these timers work independently of each other, I expect the average fallover time to be around 15 seconds.
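To check this arithmetic, here is a small sketch that adds up the five stages. The maximum values are the ones listed above. The minimum values are my own assumptions: each periodic timer may fire immediately (0 seconds), the hold timer cannot expire sooner than hold time minus one keepalive (12 - 4 = 8 seconds), and the 5-second propagation guess is treated as fixed. Each delay is also assumed to be uniformly distributed between its minimum and maximum.

```python
# (stage, minimum delay, maximum delay) in seconds.
# Maximums come from the list above; minimums are illustrative assumptions.
STAGES = [
    ("PE1-CE1 hold time expiry", 8.0, 12.0),
    ("PE1 import scan", 0.0, 15.0),
    ("propagation across the ISP", 5.0, 5.0),
    ("PE2 advertisement interval", 0.0, 15.0),
    ("CE2 scan", 0.0, 10.0),
]


def worst_case(stages):
    """Every stage takes its maximum delay."""
    return sum(high for _, _, high in stages)


def expected(stages):
    """Mean total delay: the mean of a uniform delay is (low + high) / 2,
    and the means of the stages simply add up."""
    return sum((low + high) / 2 for _, low, high in stages)


print(worst_case(STAGES))  # 57.0
print(expected(STAGES))    # 35.0
```

Under these assumptions the mean comes out closer to 35 seconds than to 15, so an average near 15 seconds would only hold if some of the stages do not actually wait for a timer.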

I've edited thread name because last one was a bit confusing. Your comments are greatly appreciated

Hello,

I have performed some tests between R1-R2-R3 (loopback100):

> On R3 I shut down the interface going toward R2

> BGP goes down after the hold time on R2

> R2 performs a general scan and declares loopback100 unreachable

> R2 immediately sends updates to R1 (the advertisement timer may come into the picture here)

> R1 immediately discards the route, respecting the updates received from R2

I got a supporting statement from RIPE, as below:

Update coming into IOS router is processed immediately, and sent onwards if appropriate:

Time required depends on number of prefixes, speed of processor, etc

General case dependent on “advertisement interval”

Locally configured routes await the BGP scanner…

…when BGP network statement exists,

It means that not every router waits for the scan time; a router processes routes immediately when it receives updates. In your query the advertisement interval is mentioned as 5 sec. By default it is 0 sec unless it is configured explicitly.

I will dig more into this and update you once I have concrete details.

Regards

Mahesh

Hi, Mahesh

I've read the document from RIPE too, and now I have a better understanding of how the advertisement timer works.

The whole process is described there:

Walk BGP table & Detect Changes

Generate Update & send the packets (here your R2 router sends the first and only update and triggers the advertisement timer)

Start minimum adv interval timer when finished

Walk the table again

If any changes detected, DO NOTHING!

We wait until minimum adv interval timer is finished before sending any more updates
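The steps above can be sketched as a small simulation. This is only a simplified model of my reading of that process, not the actual IOS implementation: the function name `send_times` is hypothetical, table walks are treated as instantaneous, and the timer is assumed to restart every time an update run goes out.

```python
def send_times(change_times, interval):
    """Return, for each route change, the time at which it is advertised.

    A change seen while the timer is idle is sent at once and starts the
    timer. Changes seen while the timer is running are held back and go
    out together when it expires, which starts the timer again.
    """
    sent = []
    expiry = float("-inf")  # time at which the running timer finishes
    pending = False         # True if held-back changes wait for `expiry`
    for t in sorted(change_times):
        if pending and t >= expiry:
            # The held-back batch went out at `expiry` and restarted the timer.
            expiry = expiry + interval
            pending = False
        if t >= expiry:
            send_at = t              # timer idle: advertise immediately
            expiry = t + interval
        else:
            send_at = expiry         # timer running: wait until it finishes
            pending = True
        sent.append(send_at)
    return sent


print(send_times([0, 3, 7, 20, 50], 15))  # [0, 15, 15, 30, 50]
print(send_times([0, 3, 7, 20, 50], 0))   # [0, 3, 7, 20, 50]
```

With a 15-second interval the first change goes out immediately and only the later ones are delayed, which matches what R2 did in the test. With an interval of 0 nothing is ever held back.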

I've got another question now. RFC 4271 states that:

The suggested default value for the MinRouteAdvertisementIntervalTimer on EBGP connections is 30 seconds

But my Cisco router with IOS 12.4(24)T shows that:

Default minimum time between advertisement runs is 0 seconds (for eBGP session)

So my understanding is that the Cisco default is a zero advertisement interval, and I am trying to find out when (from which IOS version) this change was made.

Hello Dmitry,

It looks like you are correct, with a small clarification below.

Defaults for 12.4 mainline:

eBGP sessions not in a VRF: 30 seconds
eBGP sessions in a VRF: 0 seconds
iBGP sessions: 0 seconds

http://www.cisco.com/en/US/partner/docs/ios/iproute_bgp/command/reference/irg_bgp3.html#wp1094820

Hope this helps

Regards

Mahesh
