I suppose my problem is a classic one.
We have a number of VLANs extended between our two DCs (the interconnection consists of two DWDM links, running STP).
On each VLAN, HSRP is deployed with the active interface in DC#1 or DC#2.
This introduces a traffic-optimization problem. Here is an example:
- there is an application server (let's say "A") in DC#1, on a VLAN whose active gateway is in DC#2
- there is a database server (let's say "D") in DC#1, on a VLAN whose active gateway is in DC#2
A transaction between A and D takes the following path:
- request: DC1 -> DC2 -> DC1
- answer: DC1 -> DC2 -> DC1
Of course, we do not have this problem with non-extended VLANs.
My question is: are there newer protocols/techniques that would allow an active/active redundant IP gateway on both sides, with the clients of each DC able to use the closest gateway?
Thanks in advance for your insights!
You will need to isolate HSRP to allow active/active L3 forwarding. However, this can be difficult with STP stretched over the two DCs (requiring VACL filters, etc.).
This can be done much more easily with Cisco OTV technology running on Nexus 7000s. Below is a good white paper for your reference; you should take a look at the FHRP isolation section:
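As a rough sketch of what the FHRP isolation in that white paper boils down to (names, ports, and the VLAN list here are illustrative, not taken from the paper): drop HSRP hellos before they cross the DCI, so each site elects its own active gateway. HSRPv1 and v2 hellos are UDP port 1985 to 224.0.0.2 and 224.0.0.102 respectively:

```
ip access-list extended HSRP_HELLO
 permit udp any host 224.0.0.2 eq 1985
 permit udp any host 224.0.0.102 eq 1985
!
vlan access-map HSRP_FILTER 10
 match ip address HSRP_HELLO
 action drop
vlan access-map HSRP_FILTER 20
 action forward
!
vlan filter HSRP_FILTER vlan-list 100-110
```

Note that a VACL applies to the whole VLAN on the box where it is configured, so in practice the filter has to live at the DCI boundary (e.g., on the OTV edge device, where it only affects frames headed for the overlay), not on the switches carrying the local HSRP exchange.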
Thanks, it is indeed interesting.
We intend to migrate the two data centers to VSS, then interconnect the two VSS blocks through an EtherChannel, which will let us stop relying on STP for L2 redundancy (keeping it only as a safeguard at the access layer).
Following the white paper, I imagine activating HSRP on each VSS block (same vIP and same HSRP group at each site) with HSRP message filtering between the two sites. Locally (I mean inside a data center), HSRP would not be needed thanks to VSS, but it would allow the same vMAC on both sides.
I don't know OTV (only the general concept of L2 extension), so I'm not sure: does this protocol bring any benefit in a topology with only two data centers to interconnect (with DWDM links between them)? Our chassis with Sup720s don't support OTV, but I am still interested in knowing the conditions of use.
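For illustration, the idea sketched above might look like this on each VSS pair (addresses and group number are hypothetical). Using the same group number on both sides yields the same vMAC (0000.0c07.acXX for HSRP group XX), and the hellos still have to be filtered on the DCI so each site's group goes active on its own:

```
interface Vlan100
 ip address 10.1.100.2 255.255.255.0   ! the DC#2 VSS would use e.g. 10.1.100.3
 standby 100 ip 10.1.100.1
 standby 100 preempt
```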
OTV is not just a general concept; it is a Cisco technology which is also an IETF draft. To answer your questions:
1) You can run OTV over your traditional L3 network; in your case, the DWDM is just L1 transport. You can put any L3 devices on top of it and run OTV for L2 extension.
2) OTV is not available on the Sup720; it is currently available only on the Nexus 7000 and ASR 1000.
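For reference, a minimal OTV configuration on a Nexus 7000 is quite short (interface name, site VLAN, VLAN range, and multicast groups below are hypothetical; the join interface is the routed port toward the other site):

```
feature otv
otv site-vlan 99
otv site-identifier 0x1
!
interface Overlay1
  otv join-interface Ethernet1/1
  otv control-group 239.1.1.1
  otv data-group 232.1.1.0/26
  otv extend-vlan 100-110
  no shutdown
```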
OK, but I wanted to know whether OTV is useful in the case of interconnecting only two data centers.
If each site uses VSS with Catalyst 6500s, or vPC with Nexus, the interconnection between the two sites can be an EtherChannel with an 802.1Q trunk for L2 extension. Can OTV give advantages (in the Nexus case)? Or is it only useful when more than two sites need L2 extension?
OTV is useful even if you have only two data centers, whenever you need L2 interconnection. It is extremely easy to configure. In your case, since you have DWDM, you can argue that an 802.1Q trunk is easier to configure. But please keep in mind that with an 802.1Q trunk, you extend spanning tree between the two sites (of course, you can also argue that you can block BPDUs between your two sites over the DWDM, but you can then potentially create a loop if you have more than one link). OTV automatically blocks BPDUs, unknown unicast, etc. between the OTV sites.
If you have more than two sites, then OTV will definitely be preferred (once again, you don't need to deal with avoiding STP loops, etc.).
If this is a data center environment and you isolate HSRP, it will work... but you will run into problems if the gateway becomes unavailable at either DC, and this could become a big problem. Access to anything above the core that holds the VIP, e.g. any service such as the internet, will be inaccessible. You could instead deploy MHSRP, giving DC1 a .1 VIP and DC2 a .254 VIP. This provides you with optimization as well as resiliency.
The gateway for your hosts in DC1 would then be .1, and in DC2 the hosts would have a default gateway of .254.
Isolating HSRP would, in my opinion, be satisfactory depending on the requirements, but if the requirement is to provide resiliency across DCs, then containment of HSRP won't do.
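A sketch of the MHSRP idea above, with hypothetical addresses: two HSRP groups on the same VLAN, DC1 active for .1 and DC2 active for .254, each backing up the other:

```
! DC1 core:
interface Vlan100
 ip address 10.1.100.2 255.255.255.0
 standby 1 ip 10.1.100.1
 standby 1 priority 110
 standby 1 preempt
 standby 2 ip 10.1.100.254
 standby 2 priority 90
 standby 2 preempt
!
! DC2 core mirrors the priorities:
interface Vlan100
 ip address 10.1.100.3 255.255.255.0
 standby 1 ip 10.1.100.1
 standby 1 priority 90
 standby 1 preempt
 standby 2 ip 10.1.100.254
 standby 2 priority 110
 standby 2 preempt
```

If a site's core fails, its hosts' traffic simply fails over to the surviving site's router for the same VIP, at the cost of crossing the DCI.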
The aim is to have the same vIP in both DCs, to avoid having to consider a server's location when configuring its gateway.
The question is why you would want servers in DC1 when its local gateway is down (most likely a critical network failure). All traffic would need to traverse your L2 extension link between the DCs. Realistically, you would want to vMotion all services to DC2 when DC1's gateway is down.
Then we'd be assuming that everything is virtualised. So my question is: does this still provide resiliency for desktop/laptop users, appliances, file and print services, web access, mail access, etc., which aren't virtualised and can't really be moved to another data center with the click of a button? With VM technology, is it also possible to create "site profiles"?
Without virtualization, what is the benefit of extending L2 across sites? Just because it is cool?
You can definitely do that with OTV, but when both local HSRP gateways are out of service, you need to remove the VLAN and gateway filters from the overlay. Once that is done, traffic should be allowed through. It will not be automatic.
If you have enough bandwidth over the dark-fibre DCI, then you may not need HSRP localization at all, even though default gateway services will be handled by one DC. The assumption is that you have abundant bandwidth; in that case, if the entire set of L3 gateways is down, services can fall back to DC2/the DR DC.
I guess the benefit of HSRP localization is not only bandwidth but latency too, specifically for applications generating a lot of small packet exchanges sequentially (one transaction needing to finish before moving to the next one). I am thinking, for example, of database read/write transactions. Even if the latency between the two DCs is only 2 ms, a large number of round trips can have an impact.
If you have a DCI over dark fibre of a certain distance, using a big bandwidth pipe, you can get it down to 1 to 2 ms.
That's why it depends on the underlying DCI link.
I am not sure I understand.
What I imagined is an application working like this between server A and server B, both placed in the same DC but with their gateway in the other DC.
Suppose a latency of 2 ms between the two DCs.
A packet going from A to B and back crosses the DCI four times: 4 x 2 = 8 ms.
Now imagine the application consists of a lot of small transactions (over TCP), each one needing to complete before launching the next: the gateway placement can introduce a lot of delay for such an application (TCP setup + data exchange - imagine one packet - + TCP teardown => each transaction can take roughly 25 ms). With 10,000 transactions => 250 s.
With a local gateway, this time should be greatly reduced.
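The arithmetic above can be checked with a tiny model (all figures are taken straight from the example; this is a sketch, not a measurement):

```python
# Back-of-the-envelope model of the remote-gateway latency example.
DCI_LATENCY_MS = 2.0  # one-way latency between the two DCs

def exchange_latency_ms(dci_crossings: int) -> float:
    """Added latency for one request/answer exchange that crosses
    the DCI the given number of times."""
    return dci_crossings * DCI_LATENCY_MS

# Gateway in the remote DC: the request crosses the DCI twice
# (A -> DC2 gateway -> B), and so does the answer => 4 crossings.
remote_gw_ms = exchange_latency_ms(4)  # 8.0 ms

# Serial transactions (TCP setup + one data packet + teardown),
# estimated in the thread at ~25 ms each with a remote gateway.
PER_TRANSACTION_MS = 25.0
N_TRANSACTIONS = 10_000
total_seconds = N_TRANSACTIONS * PER_TRANSACTION_MS / 1000.0

print(remote_gw_ms, total_seconds)  # -> 8.0 250.0
```

With a local gateway, the 8 ms per exchange drops out entirely, which is the whole point of HSRP localization for chatty, serialized workloads.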