We have a single 10Gb fiber connection between two datacenters. All production servers are in our main datacenter, but recently we moved some production application server workloads to the remote datacenter. We perform failover testing from time to time in the remote datacenter, which requires shutting down the 10Gb interface between the sites on our Nexus. Now that production workloads are running in the remote datacenter we cannot shut down the 10Gb interface. How can we perform testing without affecting the production workloads running in the remote datacenter? Currently all servers in the remote datacenter are on a single VLAN with EIGRP routing between sites. I am looking for input on how to allow the production application traffic from the remote datacenter to reach our main site while still being able to perform our failover testing in the remote datacenter. Any suggestions would be appreciated.
We perform failover testing from time to time in the remote datacenter which requires us shutting down the 10Gb interface between the sites on our Nexus.
What exactly were you testing before moving some production servers to the secondary DC? You say failover, but failover from what to what? ie. it cannot be failover between the DCs, because you were shutting the DC interconnect down.
Perhaps you could clarify?
The discussion subject may have been a bit misleading. I will edit it and change it to a more appropriate subject.
We are replicating our production virtual infrastructure to the remote datacenter. We bring these servers up for testing, so we can't have the same server names existing in both sites.
When they have the same server names they presumably resolve to the same IPs, so you are saying that one of the servers in the backup DC could accidentally be used instead of the one that should be the production server.
Is that right ?
If so, when you bring up the servers do they need to communicate with anything in the production environment or not ?
During testing the servers that are brought online do not need to communicate with anything in production. However, the server workloads running in the remote datacenter need to be able to communicate with the main datacenter.
These are separate servers though yes ?
Do you need just one vlan ie is it that you need L2 adjacency between the DCs on the interconnect ?
The easiest thing would then be to have the servers you want to bring up in their own vlan(s) and use acls to restrict traffic to and from them.
A further step down the line would be to use VRFs (switch dependent), so even if there was a mistake in the acls they would not have any routes to any production servers and vice versa.
And of course there is always a firewall although this would need careful placement to ensure it only stopped traffic to and from those servers and not anything else.
Edit - regarding the firewall, you definitely wouldn't want it facing the 10Gbps interconnect, ie. in the direct path from one DC to the other, as even if you allowed everything through it could have a serious impact on the production traffic.
I have some additional questions, in addition to those already asked by Jon:
1. Do these remote servers have different IPs in the remote DC? Are they in the same IP range or a different IP range altogether?
2. If these are virtual workloads, are the servers shut down and brought up only for testing, or are they active at the remote site at all times?
3. When you say application failover testing, what exactly do you mean by that? Server failover, application failover, database failover, or the whole application instance only? There might be some dependencies for testing a specific application; that is what I am trying to understand.
4. If you have the same server active on the remote DC, do you have users connected to it? How does your storage traffic get synced at the back end for active-active application traffic?
I might have more questions on this, but I will write them as they come to mind. We need to understand what exactly you want to achieve so that we can look at some newer technologies that might help.
Haven't seen you around for ages. Mind you, I haven't been around for a while until recently.
Hope everything's good with you.
How have you been?
I am very well, mate. I know, it's been ages; I have just started picking things up a bit more on the support forum. It's just work, mate, always keeping me busy.
Let's connect on email (email@example.com). I believe I still have your email address.
In response to your questions:
1. The servers have different IPs when they are brought up. However, since we have DNS replicating between the datacenters, there will be name resolution conflicts.
2. The virtual servers are replicating to the remote datacenter and are shut down at all times.
3. The virtual servers are brought online for testing while the 10Gb link is shut down.
4. We have storage traffic that is replicating at all times. The LUNs are failed over to the DR site for testing, and any changes on the DR side are deleted after testing is completed.
I was thinking the same as far as creating a separate VLAN for the production workload servers running in the remote site and restricting via ACL. The VRF option seems achievable, but would that complicate things a bit? Would you mind sharing some insight on how this would be accomplished?
Happy to help, but I think it may be better to wait on Amit as he has more experience than me with DC interconnects/VMware etc.
As a general guideline you could have a VRF for the servers. You don't need to use just one vlan, you can have multiple vlans and they can all be in the same VRF. The only routes those servers could use would be the ones for their actual vlans.
So they couldn't route anywhere else nor could anything route to them. In fact they are not visible to anything outside of the VRF.
It would provide complete separation on the same physical infrastructure at L3, the separation at L2 is the vlan. It's not full MPLS VPNs, it is more VRF-Lite which is just not as full featured.
So you could have the server vlans all allocated into one VRF; they could route to each other but not anywhere else. It does depend to an extent on the switch interconnects you have, ie. if some of the servers were on one switch and some on another, and that switch was shared by production servers, then an L3 uplink between the switches would be a problem, ie. it would need to be a L2 trunk.
But it sounds like you only have one vlan anyway, so I doubt you are using L3 uplinks between switches.
VRFs can be a useful solution, but vlans with acls might be enough, and if you need to start allowing access to shared resources it can get more complex with VRFs, because you need to leak routes and you may well end up having to use acls anyway.
So it's an option, but I think Amit will probably have some more, and probably better, suggestions than mine.
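As a very rough sketch of the VRF-Lite idea on NX-OS (the VRF name, vlan number and addressing here are made up purely for illustration):

feature interface-vlan
!
vrf context TEST-VRF <-- hypothetical VRF name
!
vlan 20
  name test-servers
!
interface Vlan20
  vrf member TEST-VRF
  ip address 10.10.10.1/24
  no shutdown

The Vlan20 SVI only has routes inside TEST-VRF, so the test servers can route to each other but have no path to, and are invisible from, the global routing table.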
Sorry for a late response, had to step out. What I was thinking is to use Private Vlans for this scenario.
We can use private vlans in the DC, which will restrict inter-vlan communication. This is easier than using VACLs or IP ACLs.
If it's a virtual environment we can also use a more advanced solution to carry the private vlans back to the main DC using OTV, if possible, or VXLAN if the Nexus 1000v is being used.
I back up Jon's idea of using a different vlan and then using an ACL or VRF to stop the inter-vlan communication. This can be a good start.
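For illustration only (the vlan numbers and interface are made up), a basic private vlan setup on NX-OS might look something like this, using a community vlan so the test servers can still talk to each other:

feature private-vlan
!
vlan 200
  private-vlan primary
  private-vlan association 201
vlan 201
  private-vlan community
!
interface Ethernet1/10 <-- test server port
  switchport mode private-vlan host
  switchport private-vlan host-association 200 201

Hosts in the community vlan can reach each other and any promiscuous port, but nothing else within the primary vlan.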
Appreciate the response to my question. So just to start, I have decided to go down the ACL path to permit access only to the subnet in the remote datacenter hosting the production workload when we are performing a test using the same server names. The PVLAN and VRF options do make sense, but I would need to do some route leaking to provide web access to the VLAN in that particular VRF. So my question is where to place the ACLs, and do I need both an ingress and an egress access-group on the interface to block traffic both ways? Here are the ACLs I came up with. Let me know if this makes sense.
ip access-list dr-acl-in
remark Permit Ingress ATL subnet 10.x.x.x/22 to DevQa 11.x.x.x/23 and Mgmt 11.x.x.x/24
permit ip 10.x.x.x/16 11.x.x.x/23
permit ip 10.x.x.x/16 11.x.x.x/24
ip access-list dr-acl-out
remark Permit Egress DevQa 11.x.x.x/23 and Mgmt 11.x.x.x/24 to ATL subnet 10.x.x.x/22
permit ip 11.x.x.x/23 10.x.x.x/16
permit ip 11.x.x.x/24 10.x.x.x/16
ip access-group dr-acl-out out
ip access-group dr-acl-in in
Where is int e4/41 in relation to the DCs, the server vlans etc ?
How many test server vlans are there going to be ?
If there are multiple test server vlans do you want these to be able to communicate with each other ?
You talk about web access to/from the test servers. Is this web access internet or specific production web servers ?
The address ranges used in your acls, are these production ranges ?
It's not possible to say whether what you have would work without knowing the answers to the above questions.
Could you provide answers and then we can say whether we think it will work or not.
In response to your questions.
1. The e4/41 interface is the 10GbE on the N7K in the remote datacenter.
2. I just need communication between a single subnet, DevQA/Mgmt 11.x.x.x (remote datacenter), and our production subnet ATL 10.x.x.x (ATL datacenter) during our testing.
3. Web access is to the internet.
4. These ranges are used just as an example.
Are these vlans in the remote DC (both test and production) routed on SVIs ?
If they are then applying the acl to the 10Gbps interface will control which traffic can come and go between DCs but it wouldn't control test servers talking to production servers within your remote DC.
Is this an issue ?
The acl doesn't include anything for web access. Is internet access via the main DC? Because if it is, your acls are going to get more complicated, ie. you need to deny everything and then "permit ip any", unless you know the specific IPs of the web servers on the internet.
I'm not trying to complicate it; I just want to be sure that you don't have any unforeseen issues.
The VLANs are both production, and I just need the one subnet in the remote datacenter communicating with the main datacenter during our monthly tests. I know it won't control the inter-vlan communication in the remote datacenter; that will be a work in progress to figure out. However, if you have any ideas on how to block inter-vlan communication while still allowing web access out to the internet from the remote datacenter, that would be helpful.
I deleted my last reply because I read that you only had one vlan, but I was assuming you were going to create new vlans for the test servers.
Are you going to do that or not?
If everything is in the same vlan is this vlan extended across the interconnect or is it only in the remote DC ?
Apologies, but if you only have one vlan then yes, the 10Gbps link is the place to apply the acl. If you can, though, I would strongly recommend having a different vlan for the test servers.
I assume the one vlan is routed in the remote DC ?
Can you give the test servers new IPs ?
If so the easiest thing to do is put them in their own vlan, create an SVI on the remote DC L3 switch and apply the acl there.
Yes, I can put the test servers in a separate VLAN with different IPs and apply the ACLs there. If that removes some of the complexity then I'm good with that.
I think that would be a good way to do it, as it removes the danger of a misconfiguration of the acl on the 10Gbps link affecting production servers.
So if you create a new vlan and an SVI for it, your acl becomes relatively easy. I assume you are using private addressing internally, so you don't need to specify each subnet; you can simply summarise, even if you are not using all the addresses within that range, eg.
production addressing = 10.x.x.x
test servers = 10.10.10.x/24
ip access-list from_test_servers
deny ip 10.10.10.0 0.0.0.255 10.0.0.0 0.255.255.255
permit ip 10.10.10.0 0.0.0.255 any
int vlan 10 <-- new test server vlan
ip access-group from_test_servers in
What the above does is -
1) deny any traffic from the test servers to any device in production
2) allow internet access
If there were some reason you needed to access any production servers, just include host entries before the first deny line. The above does assume, as I say, that all production is using 10.x.x.x addressing.
This only blocks traffic going out from the test servers, but it would also block return traffic: if a production device tried to connect to a test server the packet would get to the test server but the return packet would be dropped. However, if you want to be doubly sure you can also apply an acl outbound, ie. for traffic to the test servers -
ip access-list to_test_servers
deny ip 10.0.0.0 0.255.255.255 10.10.10.0 0.0.0.255
permit ip any 10.10.10.0 0.0.0.255
int vlan 10
ip access-group to_test_servers out
With both acls the test servers can't send anything to production and no production devices can send anything to the test servers.
Obviously if you use other private addressing in production you need to add that as deny lines as well otherwise they will be allowed by the permit lines for internet access.
Finally if you are using HSRP and the switches are not running VSS don't forget to apply the acl(s) to both SVIs, one on each switch.
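For example, with HSRP on NX-OS that would mean something like the following on each switch (addressing follows the example above; only the real interface IP differs between the two switches):

feature hsrp
!
interface Vlan10
  ip access-group from_test_servers in
  ip address 10.10.10.2/24 <-- 10.10.10.3/24 on the second switch
  hsrp 10
    ip 10.10.10.1 <-- shared virtual gateway
  no shutdown

Applying the acl identically on both SVIs means the filtering behaves the same regardless of which switch is the active HSRP gateway.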
Any questions please come back.