Here is the scenario: a 3560 is our CE equipment, connected via fiber to PE equipment in a VPLS network. We are running EIGRP for all sites. The 3560 is also connected to a 6500 in our local network. The EIGRP neighbors for the 3560 are the CEs at the VPLS remote sites plus the 6500. If we lose connectivity to the VPLS cloud on the other side of the PE, so that the interface on the 3560 is still up/up toward the VPLS cloud, how long before the 3560 realizes there is a problem and tells the 6500 to drop the routes through it to the remote sites? (All EIGRP timers are at defaults.)
Thanks for the quick response. 15 seconds is what I thought. Other than an increase in traffic, what are the "gotchas" for decreasing the hold timer to 3 seconds? Are there other avenues to explore to decrease the time until the lost neighbor is discovered in the above scenario?
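To make the 15-second figure concrete, here is a minimal Python sketch of the detection window when the interface stays up/up and only the hold timer can notice the loss. The function name `detection_window` is hypothetical, and the model ignores hello jitter and the time needed to propagate the route withdrawal afterwards.

```python
from typing import Tuple


def detection_window(hello_s: float, hold_s: float) -> Tuple[float, float]:
    """Return (best_case, worst_case) seconds from a silent failure to hold-timer expiry.

    The hold timer restarts each time a hello is received, so it expires
    hold_s after the last hello that got through. If the path fails right
    after a hello arrived, the full hold time must run out (worst case).
    If it fails just before the next hello was due, hello_s of the hold
    time has already elapsed (best case).
    """
    if hold_s <= hello_s:
        raise ValueError("hold time should be longer than the hello interval")
    return (hold_s - hello_s, hold_s)


# Defaults discussed in this thread: hello 5 s, hold 15 s.
print(detection_window(5, 15))
# The tuning asked about: hold 3 s, assuming a 1 s hello to go with it.
print(detection_window(1, 3))
```

With the defaults the neighbor is declared down between 10 and 15 seconds after the failure; with a 1-second hello and 3-second hold the window shrinks to between 2 and 3 seconds, at the cost of five times as many hellos.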
I do not have much experience with tuning the EIGRP timers (a couple of times I have changed them but mostly I just leave them at default). As far as I am aware the main aspect of tuning the timers lower would be an increase in traffic. I am not clear whether there might also be some increased exposure to Stuck In Active problems with short timers.
I believe that I have read several things about tuning the EIGRP timers and I do not remember any strong negatives about doing it.
Please note that this is a very interesting question. The point of decreasing timers for IGPs is to make convergence faster. While this may work well in some scenarios, it may cause what we call "positive routing feedback", and this has to be taken into account. I currently do not have the answer to your questions, just things to watch out for.
The IGP timers are chosen the way they are to stop route flapping from causing positive feedback, and things such as link bandwidth utilization, CPU load and so on may come into play when changing timers.
There are certain other techniques for faster convergence in the marketplace today, and while these IGP timers seem OK for a data-only network, I am not sure they are OK for a fully converged network.
If you consider that EIGRP is 5 and 15 and OSPF is 10 and 40 (hello and hold/dead intervals, in seconds), and then you factor in such protocols as TCP, pls refer to post
you will see that, with some TCP timers in some networks, this may break your users' TCP sessions.
Some of the new technologies for convergence include:
SSO (Stateful switchover) with NSF (Non-Stop Forwarding) in redundant platform linecards separating the control-plane from the data-plane in distributed platform architectures.
Graceful Restarts of routing protocol adjacencies, a method for not always reporting network failures should the failure be short lived and recovering the adjacencies routing information without having to reset the neighbor as traditional routing protocols do.
Fast Hellos. This is where you simply cut down the timers between adjacencies in the network to milliseconds, rather than seconds, but this must be used cautiously.
MARP (Multiaccess Reachability Protocol). This is a useful technique for allowing Layer 3 routing protocols to use Layer 2 switch MAC address information to signal whether a routing adjacency is reachable.
Link State Exponential Backoff. This is where certain routing protocols can delay sending of updates if they see that a link in the network is flapping.
IP Event Dampening. This is very similar to the previous bullet, with the difference that it will stop advertising network reachability information for periods of calculated time depending on how often the link flaps.
EIGRP Feasible Successor tweaking. If a design has two paths to a network, and the reported distance of the second path is equal to (or greater than) the feasible distance of the best path, the second path is not a feasible successor and convergence has to take place when the best path fails. Using metrics to make feasible successors is an option.
BFD (Bidirectional Forwarding Detection). This is a routing protocol independent hello mechanism for detecting link failure in the sub-second range, which can enhance the routing protocol convergence significantly.
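The feasible successor point in the list above can be sketched in Python. This is a simplified illustration, not the real EIGRP metric: the `Path` class and `pick_paths` function are hypothetical names, distances are plain integers that simply add up, and the feasible distance is taken to be the current best total.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Path:
    next_hop: str
    reported_distance: int  # metric the neighbor advertises for the destination
    link_cost: int          # our own cost to reach that neighbor

    @property
    def total(self) -> int:
        return self.reported_distance + self.link_cost


def pick_paths(paths: List[Path]) -> Tuple[Path, List[Path]]:
    """Return the successor and any feasible successors.

    Feasibility condition: a backup path qualifies only if its
    reported distance is strictly less than the feasible distance.
    """
    successor = min(paths, key=lambda p: p.total)
    feasible_distance = successor.total
    backups = [
        p for p in paths
        if p is not successor and p.reported_distance < feasible_distance
    ]
    return successor, backups


# Reported distance of B (20) equals the feasible distance (20): no backup.
_, backups = pick_paths([Path("A", 10, 10), Path("B", 20, 10)])
print([p.next_hop for p in backups])

# Raising the cost toward A lifts the feasible distance to 25, so B now qualifies.
_, backups = pick_paths([Path("A", 10, 15), Path("B", 20, 10)])
print([p.next_hop for p in backups])
```

In the first case the router has no pre-computed backup and would have to query its neighbors when A fails. In the second case B is already installed as a feasible successor, so the switch to B is local and immediate.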
When I say positive routing feedback, I mean this: a link is flapping, or there is some other event that constantly generates updates. The routing protocol is, quite rightly, sending accurate information about the status of the network. The link is up, so it sends an update that the link is up and usable; then the link is down, so it informs everyone not to use this path; and this happens constantly (feedback). The network goes into a state of turmoil, or as the books say "meltdown", as CPUs are going crazy, update information is flying everywhere, and no node knows exactly what to do. The network does not converge. So you have to isolate the said link.
That's what I meant, and it is in some good books written by Russ White and the guys. The Optimal Routing Design book is excellent in explaining this. Well worth a purchase.
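One way to picture how a flapping link gets isolated is a penalty-based dampener of the kind IP Event Dampening uses: each flap adds a penalty, the penalty decays over time, and the link is suppressed while the penalty is too high. The Python sketch below is only a model. The class name `FlapDampener` and all numeric values are assumptions chosen for illustration, not settings taken from this thread.

```python
class FlapDampener:
    """Toy model of penalty-based flap dampening with exponential decay."""

    def __init__(self, half_life_s=5.0, penalty_per_flap=1000.0,
                 suppress_at=2000.0, reuse_at=1000.0):
        self.half_life_s = half_life_s
        self.penalty_per_flap = penalty_per_flap
        self.suppress_at = suppress_at
        self.reuse_at = reuse_at
        self.penalty = 0.0
        self.last_t = 0.0
        self.suppressed = False

    def _decay(self, now):
        # The accumulated penalty halves every half_life_s seconds.
        elapsed = now - self.last_t
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.last_t = now

    def flap(self, now):
        """Record one link flap at time `now` (seconds)."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_at:
            self.suppressed = True

    def is_suppressed(self, now):
        """Return True while the link should be hidden from the routing protocol."""
        self._decay(now)
        if self.suppressed and self.penalty <= self.reuse_at:
            self.suppressed = False
        return self.suppressed


link = FlapDampener()
for t in (0, 1, 2):          # three flaps, one second apart
    link.flap(t)
print(link.is_suppressed(2))   # suppressed right after the third flap
print(link.is_suppressed(10))  # penalty has decayed below the reuse threshold
```

While the link is suppressed, the routing protocol never sees the up/down transitions, which breaks the feedback loop described above.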