Data Centre and Main Office Infrastructure Design

Answered Question
Apr 26th, 2010

Tuesday last week I was on my way to work when I got a call on my mobile from somebody asking where I was, as the whole network was down. I was already making my way to our data centre building, so I decided that was probably the best place to look at it. Sure enough, when I got there it wasn’t happy. I plugged into the console on one of the core switches and noticed MAC addresses flapping around. I tracked the problem down to a common work area where somebody had plugged an unmanaged 3Com switch into the network. That in itself hadn’t caused the problem, but they had then plugged a cable from port 3 on the switch to port 4 on the same switch, creating a nice loop. So I pulled it out and, hey presto, it was all good again. In all, the outage lasted around 30 to 60 minutes.

Our business offers a lot of services to clients, including some website hosting, and this outage seemed to affect the whole network including these services, which are located in different VLANs.

Our data centre building also has office space as some of our employees work from there.

When I started here around a year ago I didn’t like how the data centre building was linked with the main office, which also hosts some redundant, clustered and load-balanced systems shared with the DC. Between the two sites we have two separate BT LES 100 circuits, but only one of them is ever used. The other is only plugged in at one end and needs a manual switchover if the active one fails. I’ve been planning to set up an EtherChannel but to date haven’t had the time. At each end the LES circuits terminate on switches and plug into trunk ports, so ALL the VLANs are trunked between both sites.

Now, during a review of what happened, management have come up with the idea of having a physically separate network infrastructure for our users at each site, so they would need to pass through a firewall to connect to anything.

I don’t really like this idea, as we’re going to have to manage multiple infrastructures, and there must be a way to keep one main infrastructure while still protecting the data centre from this kind of outage.

The reason they plugged the LES 100s into trunk ports was that they required Layer 2 connectivity between the sites, as they run server clusters, both Microsoft and Oracle.

I was hoping somebody may have had experience with this type of setup and could help me out or make some suggestions so we can secure the data centre subnets/VLANs from our user VLANs/subnets. We obviously have firewalls to separate the subnets/VLANs, but the switching infrastructure is shared.

We have been told we can spend money to resolve this, as it is really important that we can show our customers we have a sound infrastructure with built-in resilience.

Also we are a Cisco house when it comes to switching and routing but our firewalls are not Cisco.

There is a diagram included if it helps.

Thank you for taking the time to read this and I hope I have included enough information.

Ganesh Hariharan Mon, 04/26/2010 - 03:08

Hi,

Comparing the current and proposed setups, the only difference I see is the introduction of a firewall for the user subnets, which I don't think you are using for that purpose in the current setup.

It is always good practice to restrict user subnet traffic with a set of policies on a firewall, but in your current setup you can start with 802.1X authentication with ACS and the switchport security features on the switches in order to avoid the problem described above. Layer 2 authentication is triggered whenever a cable is plugged into a switch port; once authenticated, the user can access the LAN. To avoid loops, the BPDU guard and loop guard features can help: if a BPDU is received on a protected port, BPDU guard puts that port into the err-disabled state rather than letting it loop in the switching environment.
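
As a rough illustration of the BPDU guard suggestion, here is a minimal Cisco IOS sketch for a user-facing access port. The interface name, VLAN number and recovery timer are placeholders, not values from this thread, and CatOS-based switches use a different command set. Note that BPDU guard only trips if the attached device actually passes BPDUs back to the port.

```
! Hypothetical access port: interface name and VLAN are placeholders
interface FastEthernet0/1
 switchport mode access
 switchport access vlan 10
 ! PortFast skips the listening/learning delay on host ports
 spanning-tree portfast
 ! Err-disable the port if any BPDU is received on it
 spanning-tree bpduguard enable
!
! Optional: bring err-disabled ports back automatically after 300 seconds
errdisable recovery cause bpduguard
errdisable recovery interval 300
```

Ports shut down this way should show up in "show interfaces status err-disabled".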

Hope this helps!

Ganesh.H

Remember to rate the helpful post

markoldhamuk Mon, 04/26/2010 - 03:38

There are currently firewalls restricting traffic between the user subnets and the data centre subnets. I've not included them on the topology as I was focusing on the switching infrastructure.

I included the firewalls on the proposed topology basically to try and show that the two user subnets would be physically separated from the switching infrastructure.

BPDU guard didn't stop the problem we had. These 3Com switches are invisible, and you don't see them even when you look at the MAC address table.

With the above in mind regarding the switch being invisible, would 802.1X still work? I've used VMPS in the past, which I believe is a similar process, and this worked via the MAC addresses of the devices; with 802.1X it can use the username, if I remember correctly. What sort of problems does this cause with machines being able to see DHCP and login servers such as Active Directory when they boot up?

I think the end goal is to have some form of segmentation between the two environments, but I personally didn't want to separate the switching infrastructure if I could help it.

Jon Marshall Mon, 04/26/2010 - 05:09

Mark

If you are sharing the switching infrastructure then a firewall isn't really going to make that much difference to the problem you faced.

Have you looked at port security, which is a good solution to people connecting hubs into ports they shouldn't?
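
For reference, a minimal Cisco IOS sketch of port security on an access port might look like the following. The interface name, VLAN number, MAC limit and aging values are illustrative assumptions only; the right limit depends on what legitimately sits behind each port.

```
! Hypothetical user port: all names and numbers are placeholders
interface FastEthernet0/1
 switchport mode access
 switchport access vlan 10
 switchport port-security
 ! Assumed limit: 1 for a single host, higher if a phone and a PC share the port
 switchport port-security maximum 2
 ! Shut the port (err-disable) when the limit is exceeded
 switchport port-security violation shutdown
 ! Age out learned addresses after 5 minutes of inactivity
 switchport port-security aging time 5
 switchport port-security aging type inactivity
```

With a hub or unmanaged switch attached, additional source MAC addresses would typically appear on the port and exceed the limit. "show port-security interface FastEthernet0/1" shows the current count and the violation state.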

Edit - I also have to say that if you are a company hosting services for other companies and your reputation depends on availability etc., then you should not be allowing users to do this. It should be made very clear that any action of this sort would lead to reprimand or dismissal. It's your company's reputation, after all.

Jon

lamav Mon, 04/26/2010 - 07:04

Mark, I agree with Jon on both counts. Port security is a good solution and the person who did this should probably be fired to boot.

That having been said, I can't see why your whole network was shut down because of this loop. Are you running PVST+/rPVST+? If so, the loop created by this genius should only have affected the vlan that the access port in his cube is a part of.
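
To check which spanning-tree mode a switch is running, something like the following IOS commands could be used. This is only a sketch; the VLAN number is a placeholder and older CatOS-based switches use different syntax.

```
! Exec mode: show the STP mode in use and a per-VLAN port summary
show spanning-tree summary
! Exec mode: show root bridge and port roles for one VLAN (10 is a placeholder)
show spanning-tree vlan 10
!
! Global configuration: move to Rapid PVST+ if the platform supports it
spanning-tree mode rapid-pvst
```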

Some good rules to follow with regard to mitigating loops:

1. Implement STP according to Cisco's best practices. There are several very informative documents from Cisco.

2. Segment your network. Create smaller vlans and divide them in a sensible manner. For example, instead of one user vlan for the 3rd and 4th floors, create one for each.

3. Keep the size of the layer 2 domain as small as possible for all vlans. Look into using a routed access layer - at least in the campus network, if not in the server farm.

4. Keep L2 off the core and implement a distribution layer to create L3 isolation between the two layers, thereby mitigating the spread of the bridging loop contagion. Or you may look into VSS, in which case you can leverage the benefits of virtualization to mitigate loops in L2 domains.

5. (EDIT) Separate user vlans from the server vlans.

I'm not sure what the purpose of the firewalls is, except perhaps to create L3 isolation between the subnets. But you have that already just by implementing vlans and inter-vlan routing. Not sure I get their solution...

HTH

Victor

markoldhamuk Mon, 04/26/2010 - 11:58

First of all, Jon and Victor, thank you for your feedback.

I don't have the power to enforce action against employees. I wish I did, but I don't.

I started here just over a year ago and I'm obviously working with an existing setup. The current firewalls are being used to segment the VLANs and are also doing all the internal routing.

One thing I forgot to mention in my first message is that users are on a separate VLAN from the hosting stuff, but they are on VLAN1.

The users only share a VLAN with business-related servers such as Active Directory, Exchange, file servers, etc. Again, I'm hoping to move away from this, although I'm treating it as a much lower priority than the other stuff.

We have a project lined up to migrate all the servers and the internal business servers to a new VLAN, but I'm not able to do this yet: I would need to bridge VLAN1 with my new VLAN4 to do a staged migration, and the current core switches won't cope with it as they are already running high on memory and CPU.

Another project in line before this happened was to swap the 6 current Cisco 2948Gs, which are used as the core at present, for 4 nice new Cisco 4507Rs, but since this issue came about I'm holding off on these in case I need something different to solve my issue.

Now I'm wondering if the use of VLAN1 is what caused such a big outage, as it will have been VLAN1 that the user created a loop in.

We are using VoIP here, so most of the PCs and laptops are linked via the Cisco phones. With port security, if I just allow a maximum of 2 MAC addresses, will this still help against the addition of switches and hubs?

Places I've worked in the past have always used L3 routing between sites and buildings, but we have usually had clusters in the same VLAN on the same site, not spread across multiple sites. If I were to run L3 in the core, would I still be able to extend the VLANs for the hosting subnets over the two sites, so we can still run the clustered server environments? They need to be in the same subnet to work.

I know I've got a lot to do, but it's something I can get stuck into once I know my options. If I sort it now while I have the chance, I'm hoping it will make life a lot easier in the future and also be a lot better for the company.

Once again, thank you for your comments and feedback. I'm really grateful you are able to find the time to help me.

Kind Regards

Mark.

Correct Answer
Jon Marshall Mon, 04/26/2010 - 12:18

Mark

Apart from port security here are some other things to consider -

1) only allow each vlan on the trunks to switches that actually need it.

2) stop using vlan 1. Vlan 1 by default exists on every switch and is allowed on every trunk, so it spans your entire network and any STP issue in it will affect all switches.

3) You may want to consider doing this -

You only need L2 adjacency for the cluster. So leave the links as an L2 trunk between the 2 sites but only allow 3 vlans on it -

1) the first vlan would be the server vlan for clustering. If this is vlan 1 you really do need to segregate the servers.

2) route the rest of the vlans between sites. To do this you can use the other 2 vlans for L3 peering, i.e. OSPF/EIGRP. All the other user vlans are now routed across the LES links and not switched, which limits how far an STP failure can spread. And obviously if you have a server vlan between the 2 sites you can control which ports are allocated into this vlan, i.e. no user should be able to connect devices into the server vlan. If they can then you do have real problems.

You can only do the above if you set up separate vlans for users and servers. If you don't then port security is your main fallback.
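
A rough sketch of how option 3 might look on one site's L3-capable switch is shown here. Everything specific in it is a hypothetical placeholder: the VLAN numbers (100 for the stretched cluster vlan, 901 and 902 for the two transit vlans), the interface name, the addressing and the choice of OSPF. It also assumes the switches, rather than the firewalls, do the inter-site routing.

```
! All VLAN IDs, names, interfaces and addresses are illustrative placeholders
ip routing
!
vlan 100
 name CLUSTER
vlan 901
 name TRANSIT-1
vlan 902
 name TRANSIT-2
!
interface GigabitEthernet1/1
 description LES 100 circuit to the other site
 ! Omit the encapsulation line on platforms that only support 802.1Q
 switchport trunk encapsulation dot1q
 switchport mode trunk
 ! Only the cluster vlan and the two transit vlans cross the link
 switchport trunk allowed vlan 100,901,902
!
interface Vlan901
 ip address 10.255.1.1 255.255.255.252
!
interface Vlan902
 ip address 10.255.2.1 255.255.255.252
!
router ospf 1
 network 10.255.1.0 0.0.0.3 area 0
 network 10.255.2.0 0.0.0.3 area 0
```

The other site would mirror this with the opposite addresses (10.255.1.2 and 10.255.2.2 in this sketch), and each site's user subnets would be advertised into the routing protocol so they are reached over the transit vlans rather than being trunked.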

Other than that you could look at L2TPv3 or EoMPLS to extend an L2 vlan across routed links, which would make the links between the 2 sites purely routed.

Jon

markoldhamuk Mon, 04/26/2010 - 12:28

Thanks Jon.

Point 3 is probably best for us, I think. I've looked at OTV and I think it's a bit too much for us. In an ideal world maybe, but I really don't think we could do that kind of implementation.

I think one of the big issues here is VLAN1, and none of the hosting stuff is on VLAN1, so I'm hoping we can sort this out using your idea of only trunking a few VLANs and routing everything else.

Thank you again for your help.

Regards

Mark
