Yesterday we encountered a very strange problem with a small multihomed AS of one of our customers:
The AS includes two of our routers:
Cisco 7206VXR NPE-400, c7200-ik9s-mz.124-25c.bin
Cisco 2811, c2800nm-spservicesk9-mz.124-18.bin
Each router is connected to a different upstream in a different AS, with a corresponding eBGP session of course. The two upstreams are completely separate and do not share any infrastructure.
Internally, the two routers have an iBGP session configured between them and are connected via two switches to the customer's LAN, with HSRP configured.
The whole setup handles the loss of the connection to either upstream, as well as hardware failures, very well.
But yesterday we had the following situation:
Both routers lost their eBGP sessions almost simultaneously and never re-established them.
Further investigation showed that both upstream links had lost layer 3 connectivity completely. While layer 2 was perfectly OK and CDP showed all the neighbors correctly, we were not able to get a single ping across the upstream links.
Apart from this, both routers appeared to be in a perfectly normal state, with no CPU or memory issues and no clues in the logs.
Shutting down and re-enabling the affected interfaces did nothing to resolve the situation.
Only after a reboot of both machines did the situation return to normal.
Do any of the experts here have an idea what might have been going on?
Thanks in advance for any hint or idea.