I have a question regarding SIA, as explained on pages 122-124 of the "Authorized Self-Study Guide for BSCI", 3rd edition. Luckily, this can be found directly on ciscopress.com as a sample chapter (http://ptgmedia.pearsoncmg.com/images/9781587052231/samplechapter/2237_SampleChapter_03.pdf).
Basically, they give a scenario where RTR B has a network that goes down (10.1.8.0/24), so B queries A, C, D and E.
CDE answer that they have a good route via A because A was their FS.
A, upon receiving the query from B, queries CDE.
CDE, upon receiving the query from A, purge A as the new successor for that route, and query B.
B can't reply to CDE, because it is still waiting for a reply from A.
So, B is waiting for A before it can answer CDE.
A is waiting for CDE before it can answer B.
CDE are waiting for B before they can answer A.
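To make the circular dependency described above concrete, here is a small Python sketch that treats the three "is waiting for" statements as a wait-for graph and looks for a cycle. It only encodes the scenario as the book describes it; it says nothing about whether real EIGRP routers actually wait this way. The names `waits_for` and `find_cycle` are made up for the illustration.

```python
# Wait-for relation exactly as stated in the post: each router maps to the
# routers whose reply it is (supposedly) waiting for.
waits_for = {
    "B": ["A"],
    "A": ["C", "D", "E"],
    "C": ["B"],
    "D": ["B"],
    "E": ["B"],
}


def find_cycle(graph, start):
    """Return one cycle reachable from start as a list of nodes, or None."""
    path = []          # current DFS path
    on_path = set()    # nodes currently on the path
    visited = set()    # nodes fully explored without finding a cycle

    def dfs(node):
        path.append(node)
        on_path.add(node)
        for nxt in graph.get(node, []):
            if nxt in on_path:
                # Back edge: the cycle is the path from nxt to here, closed.
                return path[path.index(nxt):] + [nxt]
            if nxt not in visited:
                found = dfs(nxt)
                if found:
                    return found
        on_path.discard(node)
        visited.add(node)
        path.pop()
        return None

    return dfs(start)


cycle = find_cycle(waits_for, "B")
print(" -> ".join(cycle))
```

If the wait-for relation really were as stated, the loop B, A, C, B (and likewise via D and E) is what would keep every router waiting.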
The book doesn't involve the SIA-Query/SIA-Reply packets in this scenario, but ultimately this will delay the resetting of the peer between A and B by a maximum of 6 minutes (assuming the default 180 second active timer).
I find it hard to believe that resetting the peer between A and B (which impacts all networks learned between them both) is the optimal behavior for EIGRP to resolve this deadlock situation.
Why can't RTR B, upon reaching 3 minutes (or 6 if you include the SIA-Query/SIA-Reply packets), simply purge the 10.1.8.0/24 route instead of resetting the peer?
This seems much more efficient.
This topology is not too out of the ordinary: you have one subnet hanging off only one of the distribution routers because of cost limitations, etc., where it isn't feasible for the subnet to have redundant connections. I just don't think that if this prefix goes down, it should end in a neighbor reset to clear it all away.
All other examples imply a bad connection or a faulty link that leads to a reply not being returned. This scenario, however, is a deadlock situation where there are no faulty links (other than the original network going down).
I understand summarization and stub can fix all of this, but I just see those as band-aid solutions for what seems to be a flaw in EIGRP. Ultimately there should be RTR-IDs included in the SIA-Reply packet (who you are waiting to get a reply from), and with this extra bit of info, RTR B would realize that A is waiting for B, which means the network is "unreachable" from A.
Did I miss something?
Thanks for any and all advice!
You are welcome. Let's go over your individual comments.
I'm not sure why I always see 3 minutes as the limit to go to SIA. Yes, 3 minutes is the active timer, but this is reset at 90 second intervals with the SIA-Query and SIA-Reply packets (up to three times). So, assuming there are no bad links in the network, or faulty devices, 6 minutes will elapse prior to going to SIA.
Yes, you are correct - the same fact is explained in Jeff Doyle's book Routing TCP/IP, Volume 1.
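As a quick check of the arithmetic, here is a minimal Python sketch of the timeline. It assumes the figures quoted in this thread (180 second active timer, an SIA-Query every 90 seconds, at most three of them) and that every SIA-Query is answered by an SIA-Reply. The function name `sia_timeline` and the constants are made up for the illustration.

```python
ACTIVE_TIMER_S = 180                   # default active timer quoted in the thread
SIA_INTERVAL_S = ACTIVE_TIMER_S // 2   # 90 s between SIA-Query checks, per the thread
MAX_SIA_QUERIES = 3                    # "up to three times", per the thread


def sia_timeline(interval=SIA_INTERVAL_S, max_queries=MAX_SIA_QUERIES):
    """Return (times at which SIA-Queries go out, time at which SIA is declared).

    Assumes each SIA-Query is answered by an SIA-Reply, so only the retry
    limit ends the wait.
    """
    query_times = [interval * k for k in range(1, max_queries + 1)]
    declared_at = interval * (max_queries + 1)
    return query_times, declared_at


query_times, declared_at = sia_timeline()
print(query_times, declared_at, declared_at / 60)
```

Under those assumptions the SIA-Queries go out at 90, 180 and 270 seconds and the route is declared SIA at 360 seconds, which is the 6 minutes mentioned above.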
I follow Peter's logic, and it makes sense (other than not following split horizon rules).
Yep, I completely skipped the split horizon rules - they only optimize the query process somewhat but do not influence it in any significant way. Also, I completely forgot that EIGRP uses split horizon with queries. I owe you one!
The only question I had was in step 3, and the point you make of CDE answering first being a coincidence. Let's say that A responds prior to C, D, E. What would A's reply be? B was his successor, and he had no feasible successor -- so would A have responded with infinity, or would it have had to ask C, D, E first?
A very good question. Router A cannot send a reply immediately because B is its current successor, and B's query indicates that B itself has problems reaching that network. More precisely, A will receive the query from B carrying B's current distance to the network 10.1.8.0/24, which is infinity. For router A, router B was the successor, and there are no feasible successors because C, D and E also chose A as their next hop towards 10.1.8.0/24. Therefore, upon receiving the query from B, router A will lose its successor and will need to forward the query itself before it can provide a reply to B. The remaining process will follow approximately from step 5.
Also, in step 8, B will reply with infinity to CDE rather than the new metric it learned from CDE in step 3? Is this because those new metrics were never used due to the network never transitioning to passive, or because the metrics from CDE were changed to infinity after A queried CDE?
In step 8, router B must reply with infinity to C, D and E. The reason is that in the DUAL algorithm, once a router transitions to the active state, it must not change its current successor, feasible distance or reported distance until it has received all replies and selected a new successor. This is one of the most important principles of DUAL: a router can change its idea about distances and next hops only after it receives all replies, because only then can it be sure that the rest of the network has already updated its routing tables and distances correctly (by virtue of repeating the same principle). So, in this case, while router B has received the replies from C, D and E, it has not received all the replies it needs to make a qualified decision, and it cannot be sure that the neighbors are already aware of the most recent correct route to the destination.
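The "frozen while active" rule can be sketched from router B's point of view in a few lines of Python. This is only a toy model, not EIGRP code: the class `ActiveRoute`, the link costs and the metric 3 carried in the early replies are all invented for the illustration.

```python
INFINITY = float("inf")


class ActiveRoute:
    """Hypothetical model of one destination while a router is active for it."""

    def __init__(self, frozen_distance, link_cost):
        self.frozen_distance = frozen_distance  # fixed at the passive -> active transition
        self.link_cost = dict(link_cost)        # neighbor -> cost of the link to it
        self.pending = set(link_cost)           # neighbors that still owe a reply
        self.reported = {}                      # neighbor -> latest distance heard

    def on_reply(self, neighbor, distance):
        self.reported[neighbor] = distance
        self.pending.discard(neighbor)

    def on_query(self, neighbor, distance):
        # A query also carries the sender's distance, so record it, but
        # answer with the frozen distance, not with anything learned since.
        self.reported[neighbor] = distance
        return self.frozen_distance

    def decide(self):
        """Return the new distance, or None while replies are outstanding."""
        if self.pending:
            return None
        return min(self.reported[n] + c for n, c in self.link_cost.items())


# Router B, active for 10.1.8.0/24 (made-up link costs).
b = ActiveRoute(INFINITY, {"A": 1, "C": 10, "D": 10, "E": 10})
for n in "CDE":
    b.on_reply(n, 3)                 # step 3: replies still carrying the route via A
early = b.decide()                   # A has not replied yet, so no decision
answers = [b.on_query(n, INFINITY) for n in "CDE"]   # step 8: queries from C, D, E
b.on_reply("A", INFINITY)            # the last outstanding reply
final = b.decide()
print(early, answers, final)
```

In this toy run B answers C, D and E with infinity even though it heard a finite metric from them earlier, and by the time A's reply arrives every recorded distance is infinity, so B concludes the network is unreachable.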
I strongly suggest reading the following document:
Regarding the behavior of receiving and sending updates, see this section:
You are welcome to ask further!
Is the same explanation of the exhibit, as you have quoted it here, also present in the Cisco Press book? If so, then the book is patently wrong on this subject.
The explanation in your post is not correct because it is based on a false assumption: It wrongly assumes that if a router is in active state for a destination, it cannot send a reply to another query that it receives during the active state.
In reality, if a router is in active state for a destination and subsequently receives yet another query, it immediately responds with its current metric that was set at the moment of transition to the active state (note that this metric must be higher than what it was when the router was passive because a transition into the active state is always a result of distance increase). This way, no deadlocks can occur, contrary to what was described in your original post.
So the sequence of events would be as follows:
1. B loses connection to the 10.1.8.0/24 network, which will result in the metric of this route going to infinity. No other neighbor is identified as a feasible successor, so router B has to enter the active state for the network.
2. B starts a diffusing computation by sending a query to its directly connected neighbors. The query contains the current distance of B to the network 10.1.8.0/24, which is infinity.
3. Assuming that routers C, D and E send their answers sooner than router A can process the query (which is a matter of coincidence, not a rule at all), B will receive replies from C, D and E indicating their current (unchanged) distance to the network 10.1.8.0/24 via router A. Routers C, D and E will not proceed to active mode because their metric was not influenced by this query.
4. Router A will receive the query from B, which was its successor. The query contains the new metric, which is infinity. Router A therefore loses its successor, and because no other neighbor is identified as a feasible successor, router A will have to enter the active state.
5. Router A will send queries to all its neighbors indicating its current distance to the network 10.1.8.0/24, which is also infinity. This query will reach routers B, C, D and E.
6. Router B is currently in the active state for the same network and its current distance to it is infinity. Upon receiving a query from A, it will immediately send a reply to A with its current metric of infinity.
7. Upon receiving the query from A, routers C, D and E will lose their successor. In turn, they will also enter the active state and send queries to their neighbors.
8. Routers A and B receive queries from C, D and E. They will immediately send a reply with their current metric of infinity to C, D and E. Moreover, router B will note that the current distance of C, D and E as indicated in their query is infinity and no longer the shorter distance it received in replies in step 3.
9. In steps 7 and 8, routers C, D and E went into active state and received replies to all their queries. After receiving all replies, routers C, D and E conclude that there is no alternate route to the network 10.1.8.0/24. They will therefore change to passive state, send their replies to router A indicating the metric of infinity, and remove the network 10.1.8.0/24 from their topology and routing tables.
10. Upon receiving answers from routers C, D and E, router A will also conclude that the network is no longer reachable. It will therefore send a reply back to router B with the metric of infinity and remove the network from its topology and routing table.
11. Router B now has replies to all queries it sent out. However, all routers have indicated that the network is no longer reachable. As a result, router B will also remove this network from its topology and routing table, and the diffusing computation is thereby completely concluded.
In particular, the SIA state will not appear in this network.
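The sequence of events above can be replayed with a toy message-passing simulation in Python. This is a heavily simplified sketch, not an EIGRP implementation, and it rests on several assumptions that are not in the book: C, D and E connect only to A and B, the link costs are invented, messages are delivered in FIFO order, split horizon is ignored (as in the steps above), and a single `distance` value stands in for both the current and the feasible distance.

```python
from collections import deque

INF = float("inf")

# Hypothetical link costs (not from the book): C, D and E attach only to
# A and B, and their link to B is costly enough that A is their successor.
LINKS = {("B", r): 10 for r in "CDE"}
LINKS[("A", "B")] = 1
LINKS.update({("A", r): 1 for r in "CDE"})


class Router:
    def __init__(self, name):
        self.name = name
        self.cost = {}         # neighbor -> link cost
        self.reported = {}     # neighbor -> distance that neighbor last told us
        self.distance = INF    # own distance, frozen while active
        self.successor = None
        self.active = False
        self.pending = set()   # neighbors that still owe us a reply
        self.owed = None       # neighbor we must answer once we go passive

    def go_active(self, distance, net, owed=None):
        self.active, self.distance, self.successor = True, distance, None
        self.pending, self.owed = set(self.cost), owed
        for n in self.cost:    # no split horizon, matching the steps above
            net.append(("query", self.name, n, distance))

    def on_query(self, src, dist, net):
        self.reported[src] = dist
        if self.active or src != self.successor:
            # Answer at once with the current (frozen, if active) distance.
            net.append(("reply", self.name, src, self.distance))
            return
        # The successor itself is asking: look for a feasible successor.
        feasible = [(self.reported[n] + c, n) for n, c in self.cost.items()
                    if self.reported[n] < self.distance]
        if feasible:
            self.distance, self.successor = min(feasible)
            net.append(("reply", self.name, src, self.distance))
        else:
            self.go_active(dist + self.cost[src], net, owed=src)

    def on_reply(self, src, dist, net):
        self.reported[src] = dist
        self.pending.discard(src)
        if self.active and not self.pending:
            # Every reply is in, so a new path may finally be chosen.
            self.active = False
            self.distance, best = min(
                (self.reported[n] + c, n) for n, c in self.cost.items())
            self.successor = best if self.distance < INF else None
            if self.owed is not None:
                net.append(("reply", self.name, self.owed, self.distance))
                self.owed = None


def run():
    routers = {n: Router(n) for n in "ABCDE"}
    for (x, y), c in LINKS.items():
        routers[x].cost[y] = routers[y].cost[x] = c
    # Converged state before the failure (B reaches 10.1.8.0/24 at cost 1).
    routers["B"].distance = 1
    routers["A"].distance, routers["A"].successor = 2, "B"
    for n in "CDE":
        routers[n].distance, routers[n].successor = 3, "A"
    for r in routers.values():
        r.reported = {n: routers[n].distance for n in r.cost}

    net, log = deque(), []
    routers["B"].go_active(INF, net)   # B loses the connected network
    while net:                         # FIFO delivery, one message at a time
        kind, src, dst, dist = net.popleft()
        log.append((kind, src, dst, dist))
        getattr(routers[dst], "on_" + kind)(src, dist, net)
    return routers, log


routers, log = run()
for kind, src, dst, dist in log:
    print(f"{kind:5} {src} -> {dst}  metric={dist}")
print({n: (r.active, r.distance) for n, r in routers.items()})
```

With these assumptions the message queue drains on its own: every query gets exactly one reply, and all five routers end up passive with a distance of infinity, so nobody is left waiting long enough for the active timer to matter.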