We have a DC and branch offices connected over an ISP's MPLS VPN cloud. The branches use 2911 routers, which connect to the ISP's PE routers over BGP (our ISP only supports BGP). Our DC ASR 1002 routers also connect over BGP. We advertise a default route from the DC; the branches receive only that default route and advertise their local networks.
Problem: when the ISP has problems in its MPLS cloud, our branches stop working, because they still receive the default route and, in most cases, our DC still receives the branch prefixes. So routing works but traffic does not.
Moreover, we run GETVPN on our branch and DC routers.
Has anyone found a workaround for this issue, or does anyone have other ideas? Tracking reachability and triggering EEM is our second choice, so I would appreciate answers that do not rely on EEM.
If I have understood correctly, you would like to handle failures in the service provider's forwarding plane.
As you have correctly noted, one way to do this is to use tracking + EEM to react to a loss of connectivity over the MPLS VPN service.
An alternative is to change the routing so that you have "end-to-end" BGP sessions between CE devices. This idea was proposed by Cisco expert Edison Ortiz some time ago on the forums.
The current BGP sessions should be used only to propagate the CE IP addresses that are needed to build p2p GRE tunnels. Over these GRE tunnels you should be able to configure iBGP sessions.
The DC CE nodes (ASR 1002) should advertise the default route only on these new iBGP sessions over the GRE tunnels.
On the other side, the remote CE nodes should advertise the internal LAN subnets of each site only over the iBGP sessions over GRE.
In this case, if a failure happens in the MPLS SP forwarding plane, these end-to-end iBGP sessions over GRE will fail, and each CE node can fall back to the backup GETVPN path to reach the DC instead of being stuck on the PE-CE eBGP sessions as happens now.
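To make this more concrete, here is a rough sketch of what the branch CE side could look like. Every address, AS number, interface and prefix-list name below is made up for illustration and is not taken from your setup. It also assumes that the DC and branch CEs can share one AS number for the iBGP session, and that the branch now learns the DC loopback (instead of the default route) from the PE, so that the tunnel destination never resolves through the tunnel itself. Treat it as a starting point to adapt and test in a lab, not as a validated design.

```
! Branch CE (2911) - hypothetical addressing throughout
interface Loopback0
 ip address 10.255.1.1 255.255.255.255
!
! p2p GRE tunnel to the DC CE loopback (assumed to be 10.255.0.1)
interface Tunnel0
 ip address 172.16.0.2 255.255.255.252
 tunnel source Loopback0
 tunnel destination 10.255.0.1
 ! GRE keepalives bring the tunnel down if the SP data plane fails
 keepalive 5 3
!
ip prefix-list LOOPBACK-ONLY seq 5 permit 10.255.1.1/32
ip prefix-list LAN-ONLY seq 5 permit 10.1.1.0/24
!
router bgp 65001
 network 10.255.1.1 mask 255.255.255.255
 network 10.1.1.0 mask 255.255.255.0
 ! Existing eBGP session to the PE (assumed 192.0.2.1, AS 64512):
 ! advertise only the loopback used as tunnel source
 neighbor 192.0.2.1 remote-as 64512
 neighbor 192.0.2.1 prefix-list LOOPBACK-ONLY out
 ! New iBGP session to the DC CE over the GRE tunnel:
 ! advertise only the site LAN, receive the default route
 neighbor 172.16.0.1 remote-as 65001
 neighbor 172.16.0.1 timers 5 15
 neighbor 172.16.0.1 prefix-list LAN-ONLY out
```

The DC CE would mirror this: advertise its loopback on the PE-CE eBGP session and the default route only on the iBGP sessions over GRE. The short BGP timers and the GRE keepalives are what determine how quickly a forwarding-plane failure is detected, so they are the values to tune.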
To be honest, the tracking + EEM solution might be faster than the proposed routing scheme.
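For comparison, the tracking + EEM approach could start from something like the fragment below. The probe target (10.255.0.1, standing in for a DC address), the object numbers and the applet name are assumptions, and the applet only writes a syslog message, because the real action depends on which backup path you want to activate.

```
! Probe an address in the DC across the MPLS VPN (hypothetical target)
ip sla 10
 icmp-echo 10.255.0.1 source-interface Loopback0
 frequency 5
ip sla schedule 10 life forever start-time now
!
! Track object follows the reachability of the probe
track 10 ip sla 10 reachability
!
! EEM applet fires when the tracked object goes down
event manager applet MPLS-DATAPLANE-DOWN
 event track 10 state down
 action 1.0 syslog msg "DC unreachable over MPLS VPN, routing still up"
```

A second applet matching "state up" would normally undo whatever the first one changed once the probe recovers.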