Yet another MPLS failover & load balancing question
I'll try to explain the diagram (attached below) and then get to my question. The post is a bit long because I tried to break things down as simply as possible, so PLEASE don't be scared by the length of it. ;)
Until now, we had a single MPLS VPN provider connecting all our offices, and life was simple with static routes and no IGP or EGP configured (except maybe within the provider's MPLS cloud).
Now we have brought in a second MPLS service provider for redundancy, and we need to architect around the new scenario of connecting offices on separate MPLS clouds while making the best use of the investment.
So we'll soon have two MPLS circuit providers, labelled in the diagram as ISP1 and ISP2. The two routers attached to the ISP1 and ISP2 clouds are not under our administration but will reside on our premises at Sites 1, 2 and 3.
The expanded view of Site 1 shows that both MPLS circuits terminate in our Site 1 datacenter. Different sets of offices sit on the ISP1 and ISP2 MPLS clouds, and sometimes an office connected to ISP1 needs to talk directly to an office on ISP2 without involving our Site 1 office or entering our internal LAN. The internal LAN has a pair of Cisco core switches running HSRP, with one of them active and forwarding traffic. The MPLS link bandwidth varies between 4 and 8 Mbps.
So the first thing we decided is to optimise traffic by placing a WAN optimiser in the path (could be Riverbed, Cisco, Blue Coat, etc.; not yet decided). WAN optimisers cannot route traffic and are not meant to, so routing has to be handled elsewhere.
1. Automatic failover of links with some sort of active load balancing;
Solution 1: Have ISP1 and ISP2 run BGP with us: configure BGP on the Cisco core switches (acting as CE) towards the PE routers (ISP1 Router1 and ISP2 Router1) located on our premises.
Concern 1: Would this mechanism give us automatic failover if connectivity to one of the ISPs goes down?
Concern 2: Would ISP1 and ISP2 need some sort of arrangement between them to get this BGP setup working? The two are competitors and might not collaborate, but if the BGP implementation does not depend on them interacting directly with each other, then that's fine.
Concern 3: Anything else that you can think of??
2. Load balancing across two links
Solution 2: Configure static routes to Sites 2 and 3, which are reachable over both links. Assign equal costs to those routes so that they behave like ECMP. Sites that do not have both links yet (Sites 4 and 5) will have only a single route with no ECMP.
Concern 1: Related to Concern 1 of Solution 1 above. When ISP1 Router 1 fails, or the whole ISP1 link fails, would the traffic destined for that path be lost and throw the whole network into a tizzy?
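To make this concern concrete, here is a toy model in Python (not router configuration) of what I am worried about. It assumes per-flow hashing across two equal-cost static routes, and it assumes that a static route stays installed when the far side of the path fails, which is my understanding of plain static routing without any tracking. The next-hop names, addresses and helper functions are made up for illustration.

```python
import zlib

# Hypothetical next hops: the on-premises routers of the two providers.
NEXT_HOPS = ["ISP1-Router1", "ISP2-Router1"]


def pick_next_hop(src, dst, installed):
    """Per-flow ECMP: hash the src/dst pair onto one of the installed routes."""
    if not installed:
        return None
    key = f"{src}->{dst}".encode()
    return installed[zlib.crc32(key) % len(installed)]


def forward(flows, installed, alive):
    """Split flows into delivered and lost, given installed routes and live next hops."""
    delivered, lost = [], []
    for src, dst in flows:
        hop = pick_next_hop(src, dst, installed)
        (delivered if hop in alive else lost).append((src, dst))
    return delivered, lost


# Twenty made-up flows from Site 1 to Site 2.
flows = [(f"10.1.0.{i}", f"10.2.0.{i}") for i in range(1, 21)]

# Case A: ISP1 path is dead, but both static routes are still installed.
delivered_static, lost_static = forward(flows, NEXT_HOPS, alive={"ISP2-Router1"})

# Case B: the dead route has been withdrawn (e.g. by a routing protocol or tracking).
delivered_withdrawn, lost_withdrawn = forward(flows, ["ISP2-Router1"], alive={"ISP2-Router1"})

print("lost with stale static route:", len(lost_static))
print("lost after route withdrawal:", len(lost_withdrawn))
```

In Case A every flow that hashes onto the ISP1 route is silently dropped, while in Case B all flows move to ISP2. If real routers behave like Case A, static ECMP on its own would not give me failover.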
3. Protection against malware
The links provided by the two ISPs are pure pipes. Traffic passing through the two MPLS VPN links is trusted (non-internet), but it still comes from geographically dispersed locations with varying degrees of internal security. The only requirement is to check the traffic for malware (viruses, Trojans, etc.).
Solution 3: Ask both ISPs to run some sort of Cisco IPS service on their routers, perhaps Cisco IOS IPS or an IPS AIM module.
Or have my own inline device, such as a Fortinet or SonicWall UTM, take care of this.
Concern: Costs?! A UTM might also take care of link load balancing and automatic failover, in which case I could do away with configuring both BGP and ECMP. But most UTM manufacturers talk about load balancing internet links rather than MPLS VPN links.
Now the question is: am I missing something here? Would these concepts work in practice? Has anyone been there and done this before?