We have a pair of VXR7204s connected to separate peers. They are linked to each other and also trunked to a pair of Cat 3550s. One of the 3550 switches had a PSU failure, which caused HSRP to fail over the internal VLAN IPs (trunked with 802.1Q) onto the second VXR. Now here is the strange bit.
The primary VXR on our main internet pipe showed that the BGP routing table was instructing packets for the local network to be routed via VXR2 on their private link, but the IP routing table on VXR1 was routing all traffic for our network range to Null0. It was as though it was ignoring the BGP routes from VXR2 for some reason. Even an admin shutdown on interfaces 0/1 and 0/1.1-1.5 had no effect on this.
Sorry for the ramble but if anyone requires more info then just let me know.
Are you running CEF on your VXRs? On every interface, or only a subset perhaps? What version of IOS are you running? A search in the Bug DB might yield something there. I'm also assuming that the 3550s are trunked together and passing all the appropriate VLANs. Are the VXRs connected to each other via dedicated links? Can you share some sanitized configs?
My guess is a CEF inconsistency, but that's just a guess at this point. CEF is on by default in later versions of IOS, but in earlier versions you needed to enable it manually, both globally and on each interface. If you enable it on one interface but not another, you might end up with some inconsistencies.
When you say the BGP routing table was directing the traffic in a specific direction, do you mean the routes in the BGP table (from show ip bgp) give a specific next hop? And when you say the router is actually directing the packets to Null0, are you saying the output of show ip route shows a route to Null0 as the best route?
Could you post a show ip bgp x.x.x.x and a show ip route x.x.x.x to show a specific instance of what you are talking about?
The aggregate is the only route that shows up when you do a show ip route 188.8.131.52 or a show ip bgp 184.108.40.206, so, yes, your packets are being /dev/null'd. You need a more specific route. I see you have network statements covering more specific routes, but it doesn't look like those routes are in your local routing table.
Where would you normally learn those more specific routes from? Do they normally exist, but drop out of the table when HSRP fails over?
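To illustrate why the aggregate ends up swallowing the traffic, here is a minimal Python sketch of longest-prefix matching. It is only a model of the lookup, not of IOS itself: the prefixes are documentation ranges and the `lookup` helper and next-hop labels are hypothetical, not taken from the configs in this thread.

```python
import ipaddress

def lookup(table, destination):
    """Return the next hop of the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if dest in net]
    if not matches:
        return None
    # The most specific route (longest prefix length) wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Hypothetical table in normal operation: a /24 aggregate pointing at
# Null0 plus one connected /25 that is more specific.
normal = {
    ipaddress.ip_network("192.0.2.0/24"): "Null0",
    ipaddress.ip_network("192.0.2.0/25"): "GigabitEthernet0/1.1",
}

# Assumed state after the interface drops: the connected /25 has left
# the table and only the aggregate remains.
failed = {
    ipaddress.ip_network("192.0.2.0/24"): "Null0",
}

print(lookup(normal, "192.0.2.10"))  # GigabitEthernet0/1.1
print(lookup(failed, "192.0.2.10"))  # Null0
```

With the more specific route present, the /25 wins and the Null0 entry is harmless. Once the /25 is gone, the same destination matches only the aggregate and is discarded.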
They do exist, as they are directly connected networks or static routes. Initially when I set this up I did not have the aggregate command in place, and I found that the router did not publish either the static routes or the directly connected ones unless I used the commands
The aggregate was suggested by the Cisco partner who provides our support.
To answer your question about the HSRP failure: yes, they do exist while the router is in normal operation, but when the internal switch failed and dropped the interface, everything went to null. This happened even though another router had taken over the routing for those subnets and was connected using a crossover on a different interface.
If you need any other info or clarification then let me know.
Quick net diagram
Output from a different HSRP subnet (the 196 range was only available on VXR2)
scocore1#sh ip route 220.127.116.11
Routing entry for 18.104.22.168/25
Known via "connected", distance 0, metric 0 (connected, via interface)
Advertised by bgp 20860
Routing Descriptor Blocks:
* directly connected, via GigabitEthernet0/1.1
Route metric is 0, traffic share count is 1
scocore1#sh ip bgp 22.214.171.124
BGP routing table entry for 126.96.36.199/25, version 252880
Paths: (1 available, best #1, table Default-IP-Routing-Table, Advertisements suppressed by an aggregate.)
That makes sense, then... Is BGP still peering with the other router when the interfaces are down? I would guess not... If it is, then you probably aren't seeing the routes from the iBGP peer correctly, and you might want to try turning off synchronization to see if that helps. Normally, you'd solve this problem by making certain the two border routers are running iBGP with each other over some link that doesn't fail all that often and isn't involved in HSRP, etc. If they are in the same room, then just putting a serial or Ethernet link between them....
If it doesn't, then you need to get this router to stop advertising the aggregate when the interfaces are down. You probably have some other route in the routing table that's making the router advertise the aggregate (?) even when the interfaces in question are down. The aggregate is normally a good thing, but since it's causing you problems, you might not want to run it in this case, and instead let your upstream aggregate for you.
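As a rough model of that last point: a BGP aggregate is generally originated for as long as at least one more specific prefix inside it is still in the BGP table, so one surviving component is enough to keep it advertised. A small Python sketch of that condition follows; the function name and the prefixes are hypothetical (documentation ranges), not the ones from this thread.

```python
import ipaddress

def aggregate_is_originated(aggregate, bgp_prefixes):
    """True if any strictly more specific prefix falls inside the aggregate."""
    agg = ipaddress.ip_network(aggregate)
    for prefix in bgp_prefixes:
        net = ipaddress.ip_network(prefix)
        # A contributing route must sit inside the aggregate and must
        # not be the aggregate itself.
        if net != agg and net.subnet_of(agg):
            return True
    return False

# Assumed normal operation: two more specific /25s are in the BGP table.
before = ["192.0.2.0/25", "192.0.2.128/25"]

# Assumed failure: one /25 is withdrawn, the other is still present.
after = ["192.0.2.128/25"]

print(aggregate_is_originated("192.0.2.0/24", before))  # True
print(aggregate_is_originated("192.0.2.0/24", after))   # True
print(aggregate_is_originated("192.0.2.0/24", []))      # False
```

In this model the aggregate only disappears once every component route is gone, which is consistent with it still being advertised after just some of the interfaces went down.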