Yet another - MPLS Failover & Load Balancing question

fahim · 06-11-2009

I'll try to explain the diagram (attached below) and then relate my question. The question looks a bit long because I tried to simplify it as much as possible for you guys to understand, so PLEASE don't be scared by the length of it. ;)

Until now, we had a single MPLS VPN provider connecting all our offices, and life was simple with static routes and no IGP or EGP configured (except maybe within the provider's MPLS cloud).

Now we have brought in a second MPLS service provider for redundancy, and we need to architect around the new scenario of connecting offices on separate MPLS clouds while making the best use of the investment.

Hence, soon we'll have two MPLS circuit providers, labelled in the diagram as ISP1 and ISP2. The two routers attached to the ISP1 and ISP2 clouds are not under our administration but will reside on our premises at each of our Sites 1, 2 & 3.

An expanded Site 1 shows that both MPLS circuits terminate in our datacenter at Site 1. On the ISP1 and ISP2 MPLS clouds are different sets of offices, and sometimes the need of an office connected to ISP1 is to directly talk to another office on ISP2 without having to do anything with our Site 1 office or enter our internal LAN. The internal LAN has a pair of Cisco core switches running HSRP, with one of them active and forwarding traffic. The MPLS link bandwidth varies between 4 and 8 Mbps.

So what we decided is to first optimise traffic by placing a WAN optimiser (could be Riverbed, Cisco, Blue Coat, etc.; not yet decided). WAN optimisers do not yet have the capability to route traffic, nor are they meant to.

Design needs:

1. Automatic failover of links with some sort of active load balancing;

Solution 1: Bring ISP1 and ISP2 into our side of BGP and configure BGP on the Cisco switches (emulating the CE) towards the PE routers (ISP1 Router1 and ISP2 Router1) located on our premises.

Concern 1: Would this mechanism bring about auto redundancy in case connectivity to one of the ISPs goes down?

Concern 2: Would there be some sort of arrangement required amongst ISP1 and ISP2 to get this BGP thing working? The two are competitors and might not collaborate with each other but if BGP implementation is independent of these two interacting directly with each other, then it's fine.

Concern 3: Anything else that you can think of??

2. Load balancing across two links

Solution 2: Configure static routes to Sites 2 and 3, which have both links. Assign equal costs to those routes to emulate ECMP. Sites that do not have both links yet (Sites 4 and 5) will have only a single route with no ECMP.

Concern 1: Related to Concern 1 of Solution 1 above. When ISP1 Router 1 fails, or the whole ISP1 link fails, would the traffic destined for that path be lost and throw the whole network into a tizzy?
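
To make this concern concrete, here is a minimal Python sketch of how plain equal-cost static routes behave. It rests on one stated assumption: a static route stays installed as long as the local interface towards its next hop is up, whatever happens further along the path. The prefix, next-hop addresses, interface names and the `StaticRoute` type are all made up for illustration and are not taken from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticRoute:
    prefix: str
    next_hop: str    # hypothetical provider-router address
    interface: str   # local interface facing that router


# Two equal-cost static routes to a dual-homed site (addresses are made up).
ROUTES = [
    StaticRoute("10.2.0.0/16", "192.0.2.1", "to-ISP1"),
    StaticRoute("10.2.0.0/16", "198.51.100.1", "to-ISP2"),
]


def installed(routes, local_if_up):
    """Assumption: a static route is removed only when its local interface is down."""
    return [r for r in routes if local_if_up[r.interface]]


def blackholed_share(routes, local_if_up, path_ok):
    """Fraction of installed equal-cost routes whose end-to-end path is broken."""
    active = installed(routes, local_if_up)
    if not active:
        return 1.0
    dead = [r for r in active if not path_ok[r.interface]]
    return len(dead) / len(active)


# Case A: the local link to ISP1 Router 1 goes down, so its route is removed.
case_a = blackholed_share(
    ROUTES,
    local_if_up={"to-ISP1": False, "to-ISP2": True},
    path_ok={"to-ISP1": False, "to-ISP2": True},
)

# Case B: ISP1 fails somewhere inside its cloud while the local link stays up.
case_b = blackholed_share(
    ROUTES,
    local_if_up={"to-ISP1": True, "to-ISP2": True},
    path_ok={"to-ISP1": False, "to-ISP2": True},
)

print(case_a, case_b)  # 0.0 0.5
```

In this model a local link failure is harmless, but a failure inside the provider cloud leaves half of the equal-cost paths pointing at a dead link. That is the case where static ECMP on its own would need some form of reachability tracking, or a routing protocol, to recover.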

3. Security from malware

The links provided by the two ISPs are pure pipes, and the traffic passing through the two MPLS VPN links, though trusted (non-Internet), still comes from geographically dispersed locations with varying degrees of internal security. The need is only to check the traffic for malware (viruses, Trojans, etc.).

Solution 3: Request both ISPs to run some sort of Cisco IPS service on their side of the routers, maybe Cisco IOS IPS or an IPS AIM module.

Or have my own inline device, in the form of a Fortinet/SonicWall UTM, take care of this aspect.

Concern: Costs??!!! A UTM might as well take care of link load balancing and automatic failover, so I might do away with configuring both BGP and ECMP. But most UTM manufacturers talk about load balancing Internet links rather than MPLS VPN links.

Now the question is, am I missing something here? Would these concepts work in practice? Has anyone been there and done this before?

Pls advise!

Giuseppe Larosa · 06-13-2009

Hello Fahim,

Note 1: redundancy is limited because not all sites are multihomed.

If I've understood you correctly, only the main sites will be multihomed to both ISP1 and ISP2.

This gives you limited redundancy, which may still be acceptable for your needs.

Note 2: requirement for direct communication between branch offices served by different providers

>> the need of an office connected to ISP1 is to directly talk to another office on ISP2 without having to do anything with our Site 1 office or enter our internal LAN

You can achieve this only if ISP1 and ISP2 are both involved and you set up an inter-AS MPLS VPN with them.

So this cannot coexist with the other requirement of keeping the two providers uninvolved.

This is basic IP routing: to avoid going through the main sites, somebody else has to provide an alternate path and the routing information about it.

Note 3:

An eBGP solution is the only one that can give you real fault tolerance.

An alternative would be OER (Optimized Edge Routing) / Performance Routing, but I have no experience with it.
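
A rough way to see why eBGP helps with fault tolerance: each provider advertises the remote prefixes it can reach over its own session with the CE, and a prefix drops out of the CE's table when that provider withdraws it or the session goes down. The Python sketch below models only that behaviour. It is not a BGP implementation, and the prefixes, the `CeRib` class and the choice of ISP1 for the single-homed site are assumptions for illustration.

```python
from collections import defaultdict


class CeRib:
    """Toy model of routes a CE learns over two independent eBGP sessions."""

    def __init__(self):
        # prefix -> set of providers currently advertising it
        self._paths = defaultdict(set)

    def advertise(self, provider, prefix):
        self._paths[prefix].add(provider)

    def withdraw(self, provider, prefix):
        self._paths[prefix].discard(provider)

    def session_down(self, provider):
        # Losing a session removes every route learned from that neighbour.
        for providers in self._paths.values():
            providers.discard(provider)

    def next_hops(self, prefix):
        return sorted(self._paths.get(prefix, ()))


rib = CeRib()
for prefix in ("10.2.0.0/16", "10.3.0.0/16"):   # dual-homed sites (made-up prefixes)
    rib.advertise("ISP1", prefix)
    rib.advertise("ISP2", prefix)
rib.advertise("ISP1", "10.4.0.0/16")            # single-homed site (assumed on ISP1)

rib.withdraw("ISP1", "10.2.0.0/16")             # failure somewhere inside ISP1
after_withdraw = rib.next_hops("10.2.0.0/16")   # ['ISP2']

rib.session_down("ISP1")                        # ISP1 router or link lost
after_down_dual = rib.next_hops("10.3.0.0/16")    # ['ISP2']
after_down_single = rib.next_hops("10.4.0.0/16")  # []
```

In this model the two sessions never interact, and the single-homed site ends up with no path at all, which is the limited redundancy mentioned in Note 1.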

Note 4:

You can achieve flow-based load balancing without using load balancers or WAN optimizers, but that may be your call.
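
By flow-based load balancing I mean choosing the outgoing link from a hash of the flow's addresses and ports, so that every packet of one conversation follows the same link while different conversations spread across both. A small Python sketch of the idea follows. The SHA-256 hash, the link names and the `pick_link` function are assumptions for illustration; real routers use their own hashing.

```python
import hashlib

LINKS = ["ISP1", "ISP2"]


def pick_link(src_ip, dst_ip, src_port, dst_port, proto, links=LINKS):
    """Map a flow's 5-tuple to one link; the same flow always gets the same link."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(links)
    return links[index]


# Every packet of this (made-up) flow hashes to the same link.
first = pick_link("10.1.0.5", "10.2.0.9", 40000, 443, "tcp")
again = pick_link("10.1.0.5", "10.2.0.9", 40000, 443, "tcp")

# Many different flows spread over the available links.
used = {pick_link("10.1.0.5", "10.2.0.9", port, 443, "tcp")
        for port in range(40000, 40200)}
```

Because the decision is made per flow and not per packet, packets within one session are not reordered across the two links. The trade-off is that a single large flow can never use more than one link's bandwidth.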

Hope to help

Giuseppe