OSPF Fast Convergence on specific links in a network.

Answered Question
Sep 10th, 2013

Hi,


I have a couple of questions about speeding up OSPF convergence, to help me understand it better.

In this particular scenario I want to speed up OSPF convergence should a fault occur between two specific locations, without causing issues for other routers on the network. These two locations are part of the backbone area. The two locations have two point-to-point connections between them, terminating on two different routers at either end.


            R1-----------------(Serv Provider1)------------------------R3

        SiteA                                                             Site B

            R2-----------------(Serv Provider 2)------------------------R4 


As far as I can see, there are two main factors controlling OSPF fast convergence.

1. Failure detection time.

2. Propagation of the fault / SPF recalculation time.


The first factor - failure detection time can be reduced by decreasing the OSPF hello/dead interval timers or by using BFD to detect the failure.  Which is the best option?


From what I see if using OSPF hello/dead timers I would only need to match timer values on router interfaces either side of my point to point links and could leave other interfaces as they are. Is this correct?



Regarding the second part: by throttling the SPF timers, OSPF SPF calculation time is reduced. Again considering my two point-to-point links, can I adjust the timers only on the routers at either side of those links, or do I need to set them the same on all routers in the OSPF network?

(The "timers throttle spf" command.)

shajakhan85 Tue, 09/10/2013 - 08:15

1. From what I see if using OSPF hello/dead timers I would only need to match timer values on router interfaces either side of my point to point links and could leave other interfaces as they are. Is this correct?


Yes. The timer values should match on the router interfaces at either side of each point-to-point link.



2. Can I adjust the timers on the routers either side of the point-to-point link, or do I need to set them the same on all routers in the OSPF network?


They are not required to be the same on both sides. See:


http://www.cisco.com/en/US/docs/ios/12_2s/feature/guide/fs_spftrl.html

n_schloemer Tue, 09/10/2013 - 13:13

Hi Pat,


I personally prefer to manage the hello and dead timers. If your SPF calculations are causing a significant delay during convergence, you may want to consider implementing summary routes at the ABRs or adding new areas.


The timers are configured with interface subcommands:


interface fa0/0

 ip ospf hello-interval 1

 ip ospf dead-interval 4


Both values are in seconds. Depending on link reliability I usually use a 1-second hello, and I make the dead interval 4x the hello interval (4 seconds here).


The hello and dead timers must be the same on both OSPF neighbors. They are among the attributes checked in the Hello packets when the neighbors first discover each other, and an adjacency will not form if they differ.
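
As a rough illustration of the matching rule above, the following Python sketch models the two timer fields that neighbors compare. The class and function names are hypothetical and exist only for this example, and the 10 s / 40 s values for the untuned interface are an assumption; this is not router code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OspfInterfaceTimers:
    """Hello and dead intervals, in seconds, configured on one interface."""
    hello_interval: float
    dead_interval: float


def timers_match(local: OspfInterfaceTimers, remote: OspfInterfaceTimers) -> bool:
    """Return True when both ends use identical hello and dead intervals."""
    return (local.hello_interval == remote.hello_interval
            and local.dead_interval == remote.dead_interval)


# Both ends of one point-to-point link tuned to 1 s hello / 4 s dead.
r1_to_r3 = OspfInterfaceTimers(hello_interval=1, dead_interval=4)
r3_to_r1 = OspfInterfaceTimers(hello_interval=1, dead_interval=4)

# An interface that was left alone (10 s / 40 s assumed for illustration).
untuned = OspfInterfaceTimers(hello_interval=10, dead_interval=40)

print(timers_match(r1_to_r3, r3_to_r1))  # True: the adjacency can form
print(timers_match(r1_to_r3, untuned))   # False: the neighbors would not come up
```

The check is per link, which is why tuning only the interfaces at either end of the two point-to-point links leaves every other adjacency untouched.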

pat.meade Wed, 09/11/2013 - 05:10

Thanks for the replies.


Regarding the SPF hold timers: to confirm, is there any issue with having different values on different routers in the network? I presume not, as the value is random anyway?


thanks

Pat.

Correct Answer
Rolf Fischer Wed, 09/11/2013 - 05:41

Hi Pat,



Regarding the SPF hold timers: to confirm, is there any issue with having different values on different routers in the network? I presume not, as the value is random anyway?



Choosing the values for deferring the SPF calculation is always a compromise: on the one hand you don't want continual recalculation caused by a flapping link, on the other you expect fast convergence. So you have to find values which meet the requirements for convergence and stability at the same time.

I think the main issue with different values within an area is micro-loops. Since recalculation (and the subsequent updating of the routing tables) is never done at exactly the same time on all routers, link-state routing protocols can produce short periods with micro-loops after topology changes, because some forwarding tables are updated sooner than others (we're talking about tens to hundreds of milliseconds).

If you change the spf-start timer to, let's say, 1 second or even less, and other routers in the area still have the default value (which is 5 seconds), the period of time in which micro-loops can occur could be considerably longer and there could be some undesired impact, depending on topology/design.


Hope that helps

Rolf
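
To put rough numbers on the timer mismatch described in this post, here is a small Python sketch. It assumes a deliberately simplified model in which every router learns of the change at the same instant and updates its forwarding table exactly when its initial SPF delay expires; flooding, SPF run time and FIB programming are ignored. The function name is made up for this example.

```python
def inconsistency_window(spf_start_delays_s):
    """Seconds between the first and last router finishing its initial SPF wait.

    During this window some routers already forward on the new topology while
    others still use the old one, which is when micro-loops can form.
    """
    return max(spf_start_delays_s) - min(spf_start_delays_s)


# All routers left at the 5-second value mentioned in the post.
print(inconsistency_window([5.0, 5.0, 5.0]))  # 0.0

# One router tuned down to 1 second, the rest still at 5 seconds.
print(inconsistency_window([1.0, 5.0, 5.0]))  # 4.0
```

In this model the window grows from effectively nothing to several seconds as soon as one router is tuned and its neighbors are not, which is the argument for changing the timers consistently across an area.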

Joseph W. Doherty Wed, 09/11/2013 - 07:43

Disclaimer


The  Author of this posting offers the information contained within this  posting without consideration and with the reader's understanding that  there's no implied or expressed suitability or fitness for any purpose.  Information provided is for informational purposes only and should not  be construed as rendering professional advice of any kind. Usage of this  posting's information is solely at reader's own risk.


Liability Disclaimer


In  no event shall Author be liable for any damages whatsoever (including,  without limitation, damages for loss of use, data or profit) arising out  of the use or inability to use the posting's information even if Author  has been advised of the possibility of such damage.


Posting


The first factor - failure detection time can be reduced by decreasing the OSPF hello/dead interval timers or by using BFD to detect the failure.  Which is the best option?

BFD, if supported, especially if there are lots of (routable) interfaces, as it's supposed to decrease the workload.


From what I see if using OSPF hello/dead timers I would only need to match timer values on router interfaces either side of my point to point links and could leave other interfaces as they are. Is this correct?

Yes, correct. OSPF hello timers are per link, so you can have different settings.

BTW, if the link is a physical p2p, OSPF will get (very fast) notification when the link drops, which reduces the need to reduce hello timers.

Reduced hello timers are most useful when the link is "lost" but the interface stays up.


Regarding the second part: by throttling the SPF timers, OSPF SPF calculation time is reduced. Again considering my two point-to-point links, can I adjust the timers only on the routers at either side of those links, or do I need to set them the same on all routers in the OSPF network?

(The "timers throttle spf" command.)

Timers are global settings.


Actual (individual device) SPF calculation time is not reduced with timer adjustments.  (NB: if your devices support iSPF, that can reduce SPF calculation time.)


However, what can be reduced with timer settings is how quickly OSPF passes on a link state change to a neighbor and how OSPF deals with multiple topology changes and how long OSPF delays the SPF computation.  Basically, timer adjustments can impact how fast the OSPF topology converges.


The risk of speeding up convergence is routers spending too much of their time doing SPF (I've seen brand X go into OSPF meltdown both with a flapping link and with a single link change that ping-ponged the best path when there were many multiple paths). The later Cisco implementation can be configured to pass along and recompute SPF for a single link change very rapidly, yet queue up multiple link-state changes and/or exponentially slow down multiple back-to-back SPF computations (in theory, the best of both worlds, i.e. fast convergence without OSPF meltdown).
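
The exponential slow-down mentioned above can be sketched in a few lines of Python. This is a simplified model, not Cisco's implementation: it assumes the first SPF run waits for a start delay, the next waits for a hold time, and each later wait doubles until it reaches a cap. The function name and the 50/200/5000 ms values are hypothetical, and real behavior depends on platform and software release.

```python
def spf_wait_intervals(spf_start_ms, spf_hold_ms, spf_max_wait_ms, count):
    """Return the wait before each of `count` back-to-back SPF runs, in ms.

    Simplified model: the first run waits spf_start_ms, the second waits
    spf_hold_ms, and each later wait doubles until capped at spf_max_wait_ms.
    """
    waits = []
    hold = spf_hold_ms
    for run in range(count):
        if run == 0:
            waits.append(spf_start_ms)
        else:
            waits.append(min(hold, spf_max_wait_ms))
            hold *= 2
    return waits


# Illustrative values only: 50 ms start, 200 ms hold, 5000 ms cap.
print(spf_wait_intervals(50, 200, 5000, 7))
# [50, 200, 400, 800, 1600, 3200, 5000]
```

A single link change is handled after the short start delay, while a flapping link quickly pushes the router toward the capped wait, so it cannot spend all of its time recomputing.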


As far as I know, SPF timers processing is not RFC'ed, so behavior can be quite different between vendors that support OSPF.


Although Cisco allows different settings between devices, there are some expectations between devices when changing some of the timer values, i.e. how fast one device might send LSAs vs. how fast the other device expects to receive them.


Cisco does have a paper or two on recommendations for adjusting timers; just be sure you're dealing with the timer settings that apply to your devices (I believe later IOS versions have improved timer functioning).

pat.meade Wed, 09/11/2013 - 08:36

Again, thanks for the replies.


The issue I am trying to address relates to a specific site which is connected via OSPF P2P links over two different provider networks to a central site. The RTT is approx. 10 msec. I need to get the OSPF failover time to this specific site down below 5 seconds. Using BFD for detection I can achieve approx. 10 seconds. I was looking into throttling the SPF hold timers to speed up convergence and reduce the time further. My concern is the effect on other routers in the network. Tests in GNS look good, but I think "Proceed with Extreme Caution" seems to be the best approach.

Correct Answer
Joseph W. Doherty Wed, 09/11/2013 - 11:12

5-second convergence should be achievable if your OSPF supports fast hellos or BFD. I would try a dead time of 2 seconds with a half-second hello.


To deal with the other variables related to fast OSPF network convergence, after a link change has been noted, you might find the following a good read:

http://blog.ine.com/2010/06/02/ospf-fast-convergenc/
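
For a back-of-the-envelope check against the 5-second target, the Python sketch below adds failure detection time to the initial SPF delay. It is only an estimate under stated assumptions: LSA flooding, the SPF run itself and forwarding-table updates are ignored, the function names are hypothetical, and the 300 ms x 3 BFD setting is an illustrative value rather than a recommendation.

```python
def bfd_detection_time_s(interval_ms, multiplier):
    """Approximate BFD detection time: interval times the miss multiplier."""
    return interval_ms * multiplier / 1000.0


def estimated_failover_s(detection_s, spf_start_s):
    """Detection time plus the initial SPF delay; all other delays ignored."""
    return detection_s + spf_start_s


BUDGET_S = 5.0

# 2-second dead time combined with a 5-second initial SPF delay.
slow = estimated_failover_s(2.0, 5.0)
print(slow, slow <= BUDGET_S)  # 7.0 False

# Same detection time, initial SPF delay lowered to 1 second.
fast = estimated_failover_s(2.0, 1.0)
print(fast, fast <= BUDGET_S)  # 3.0 True

# BFD at an assumed 300 ms interval with a multiplier of 3.
print(bfd_detection_time_s(300, 3))  # 0.9
```

Under these assumptions, fast detection alone does not meet the target while the initial SPF delay is still 5 seconds, which is why both the detection timers and the SPF throttle timers come up in this thread.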

pat.meade Wed, 09/11/2013 - 13:15

Thanks for the help with this. Really great support.
