I have a couple of questions about speeding up OSPF convergence, to help me understand it better.
In this particular scenario I want to speed up OSPF convergence should a fault occur between two specific locations, without causing issues for other routers on the network. These two locations are part of the backbone area. The two locations have two point-to-point connections between them, on two different routers at either end.
Site A                                                  Site B
R2-----------------(Serv Provider 2)------------------------R4
As far as I can see there are two main factors controlling OSPF fast convergence.
1. Failure Detection Time.
2. Propagation of fault / SPF recalculation time.
The first factor, failure detection time, can be reduced by decreasing the OSPF hello/dead interval timers or by using BFD to detect the failure. Which is the better option?
From what I can see, if I use OSPF hello/dead timers I would only need to match the timer values on the router interfaces at either side of my point-to-point links, and could leave other interfaces as they are. Is this correct?
Regarding the second factor: tuning the SPF throttle timers reduces the delay before OSPF runs the SPF calculation. Again considering my two point-to-point links, can I adjust the timers only on the routers at either side of the links, or do I need to set them the same on all routers in the OSPF network?
("Timers throttle SPF" command).
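To make the throttle question concrete, here is a rough Python sketch of how the three throttle values (initial delay, hold time, maximum wait) are commonly described as interacting: the first SPF waits the initial delay, and each later wait doubles from the hold time until it reaches the maximum. The function name `spf_wait_sequence` and the tuned values 50/200/5000 ms are purely illustrative assumptions, and the model ignores the quiet period after which the timers reset.

```python
from typing import List


def spf_wait_sequence(start_ms: int, hold_ms: int, max_ms: int, runs: int) -> List[int]:
    """Simplified model of SPF throttling during continuous topology changes.

    Returns the wait (in ms) before each of the first `runs` SPF calculations.
    Assumption: hold time doubles after every run and is capped at max_ms.
    """
    waits = []
    wait = start_ms        # the first SPF waits only the initial delay
    next_hold = hold_ms    # later SPFs are separated by the (growing) hold time
    for _ in range(runs):
        waits.append(wait)
        wait = min(next_hold, max_ms)
        next_hold = min(next_hold * 2, max_ms)
    return waits


# Hypothetical tuned values versus 5000/10000/10000 ms.
print(spf_wait_sequence(50, 200, 5000, 8))
print(spf_wait_sequence(5000, 10000, 10000, 4))
```

With the tuned values the first recalculation starts almost immediately, while repeated changes still back off towards the maximum wait.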
5 second convergence should be achievable if your OSPF implementation supports fast hellos or BFD. I would try a dead time of 2 seconds with a half-second hello.
To deal with the other variables related to fast OSPF network convergence, after a link change has been noted, you might find the following a good read:
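As a rough illustration of what those timer values mean for detection, the Python sketch below works out the window between a failure and the dead timer expiring, plus the usual interval-times-multiplier estimate for BFD. The function names and the BFD values (50 ms, multiplier 3) are assumptions for the example only, not recommendations.

```python
from typing import Tuple


def hello_detection_window(hello_s: float, dead_s: float) -> Tuple[float, float]:
    """Return (best, worst) seconds from link failure to dead-timer expiry."""
    if hello_s <= 0 or dead_s <= hello_s:
        raise ValueError("dead interval must exceed a positive hello interval")
    # The dead timer restarts on every hello received. If the link fails just
    # before the next hello was due, only (dead - hello) remains; if it fails
    # just after a hello arrived, the full dead interval remains.
    return (dead_s - hello_s, dead_s)


def bfd_detection_time_ms(interval_ms: int, multiplier: int) -> int:
    """Approximate BFD detection time: receive interval times multiplier."""
    return interval_ms * multiplier


# Half-second hello with a 2 second dead time, as suggested above.
print(hello_detection_window(0.5, 2.0))
# Hypothetical BFD settings for comparison.
print(bfd_detection_time_ms(50, 3))
```

This only covers detection; flooding, SPF scheduling and forwarding-table updates add to the total convergence time.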
Regarding the SPF hold timers: to confirm, is there any issue with having different values on different routers in the network? I presume not, as the value is random anyway?
The values for deferring the SPF calculation are always a compromise. On the one hand you don't want continuous recalculation caused by a flapping link; on the other you expect fast convergence. So you have to find values that meet your requirements for convergence and stability at the same time.
I think the main issue with different values within an area is micro-loops. Since recalculation (and the subsequent updating of the routing tables) is never done at exactly the same time on all routers, link-state routing protocols can produce short-lived micro-loops after topology changes, because some forwarding tables are updated sooner than others (we're talking about tens to hundreds of milliseconds).
If you change the spf-start timer to, let's say, 1 second or even less, and other routers in the area still have the default value (which is 5 seconds), the period of time during which micro-loops can occur could be considerably longer, and there could be some undesired impact, depending on topology and design.
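To put a number on that, here is a very rough Python sketch that treats the potential micro-loop window as the spread between the smallest and largest initial SPF delay in the area. The function name and router labels are hypothetical, and the model deliberately ignores flooding delay, SPF run time and forwarding-table update time, so treat the result as an order-of-magnitude estimate only.

```python
from typing import Dict


def spf_start_skew_ms(spf_start_by_router: Dict[str, int]) -> int:
    """Spread between the earliest and latest initial SPF delay, in ms.

    Assumption: every router learns of the change at about the same moment,
    so the skew in start timers approximates the inconsistency window.
    """
    values = list(spf_start_by_router.values())
    if not values:
        raise ValueError("need at least one router")
    return max(values) - min(values)


# One tuned router (1 second) among routers left at 5 seconds.
timers = {"tuned-router": 1000, "default-router-a": 5000, "default-router-b": 5000}
print(spf_start_skew_ms(timers))
```

Setting the same values on every router in the area brings this spread back to zero, leaving only the normal tens to hundreds of milliseconds of variation.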
Hope that helps