Hello. I'm sure this topic has been beaten to death, but one more kick won't hurt. We have a data center with approximately 140 servers. Most servers connect to 2950T switches, which connect to the core switch (a 4507R) via Gigabit Ethernet uplinks. The core 4507R has a 48-port Gigabit Ethernet line card; routing is done using inter-VLAN routing, with OSPF to the WAN. Approximately 35 servers are also directly connected to the core switch. This was done because of the gig capability of the line card, so the servers with the most data could be backed up faster. We have approximately 10 subnets (not including the WAN), 3 of which are user subnets and 6 of which are for servers.
Our manager has asked us to investigate redundancy for the 4507R, so I have recommended implementing a 4506 with the same line card configuration. I have also proposed that we remove the 2950s that uplink the servers and replace them with a stack of 3750G-24 switches. This way we could VLAN the stack into the 6 existing subnets and spread the VLANs out across the different switches in the stack, so we don't have to worry that losing one switch will take down an entire subnet. I have also recommended removing the servers that are plugged directly into the core and plugging them into the stack of 3750s.
Also, our manager would like to see the critical servers dual-homed, and I don't like the idea of plugging one NIC from serverX into the 4507R and the other into the 4506. I've got some pics; which looks like the better design? As a rule, should servers be plugged directly into the core switch, or should they be plugged into the dist/access layer?
If we take a look at the AVVID architecture, specifically the Infrastructure module, and zoom in further, we see the Enterprise Composite Model. Taking just the Enterprise Campus module from that model, we find the Server Farm module, which can be organized using three layers: access/distribution/core. The core here is the core of your network, but the access and distribution layers are owned by the Server Farm module. You build that with hardware redundancy and dual NICs on the servers. Once this is done, connect the distribution switches of the server farm directly to the core of your network.
OK, thanks. So in the design with the stack of 3750's, they become the distribution layer, and the wiring closets are the access layer, and the 4507 is still the core.
Am I right in assuming that servers should be plugged into the distribution layer 3750's, and not plugged directly into the core switch?
Yes, you are correct about that. The Network Design Best Practices document says that you should configure the Core Layer, which contains all the high-end routers. It also requires a highly redundant configuration, since this is the heart of the network: core down means no business.
The Distribution Layer is where you connect the servers and low-end routers. As the name says, this layer holds the application servers in your organization. Then you have the Access Layer, which is the end-user environment where the workstations are connected.
I hope it helps.
Yes, you are right. For a good design the core must be free of any business traffic and any policy; its function is fast switching, so plugging servers into the core is not the way to go.
I would like to add something here:
First, the stack by itself does not provide redundancy; you still have an SPF (single point of failure). A stack is mostly used to provide more port density.
The model in both of your diagrams is not a pure hierarchical access/distribution/core model; you are using what we call a collapsed backbone, where the core and distribution layers are merged into one device, your 4507/4506. This model is fine for small to medium enterprise size.
As for your servers: if you use just one NIC, you still have a single point of failure. If you use two NICs, each connected to a different device, you have full redundancy.
If your budget allows, go with clustering of your servers; that is the best way, and it is required if you want to achieve five 9s (99.999%) availability.
Thanks for all of the great info.
If we dual-home the servers into different members of the stack of 3750's, and then create ether-channels using ports from different members of the stack, would this not eliminate the SPF?
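For what it's worth, a cross-stack EtherChannel along those lines might look something like this (a sketch only; interface numbers, the VLAN, and the channel-group number are made up, and note that on early 3750 IOS a cross-stack channel must use mode "on", since PAgP doesn't negotiate across stack members):

```
! Gi1/0/1 lives on stack member 1, Gi2/0/1 on stack member 2,
! so the bundle survives the loss of either member
interface range GigabitEthernet1/0/1 , GigabitEthernet2/0/1
 switchport mode access
 switchport access vlan 10
 channel-group 10 mode on
```

The server's teamed NICs would form the other end of the bundle, one NIC cabled to each stack member.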
Also, my manager wants to dual-home servers and plug one NIC into the 4507 and the other NIC into the 4506. I'm not sure if this would even work, so I'm definitely trying to talk him out of this idea. This is why I'm proposing the 3750s.
Another thing the 3750s should provide is jumbo frame support, which might help with the backup times we are struggling with.
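On the 3750 that would be something along these lines (a sketch; the exact MTU value is an assumption, and the change only takes effect after a reload):

```
! global config; applies to all Gigabit ports on the stack
system mtu jumbo 9000
! verify after the reload:
show system mtu
```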
Hi, thanks for the comments.
Let's take one server, and you treat the others the same way. If your server is connected to one 3750, the SPF is still there: if that 3750 fails, the server cannot communicate with the rest of the network. If you connect your server to two different 3750s, you eliminate the SPF, but this needs two NICs in your server (or a transceiver). Some operating systems support two NICs and present both as a single interface to the OS (especially Unix), and you can even use them to load-balance your server's traffic.
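As an aside, on a Linux server the two-NICs-as-one-interface setup described above is typically done with the kernel bonding driver; a modern iproute2 sketch (interface names are illustrative, and active-backup mode needs no special switch-side configuration) might be:

```
# create an active-backup bond; only one NIC is active at a time,
# the other takes over if the active link fails (miimon = link poll ms)
ip link add bond0 type bond mode active-backup miimon 100
ip link set eth0 down
ip link set eth0 master bond0
ip link set eth1 down
ip link set eth1 master bond0
ip link set bond0 up
```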
With EtherChannel you can bundle up to 8 ports. If you have more than one uplink toward your 4506/4507, spanning tree will block most of them, leaving just one in the forwarding state. But if you use EtherChannel, STP treats the bundle as one link, so you get fault tolerance on the link: if one fails, the other links in the bundle carry the traffic transparently.
You also get load balancing. A hash determines which port in the bundle is used for a given flow, so you may end up using just one port while the others sit there not helping at all. You have to tune the load-balancing configuration of the bundle (use src-dst-mac, src-dst-ip, ...) to make all, or most, of the links do useful work.
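Tuning that hash is a single global command on the 3750/4500 (which keywords are available varies by platform; src-dst-ip is a common choice for server traffic):

```
! global config
port-channel load-balance src-dst-ip
! check the current hash method:
show etherchannel load-balance
```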
Dual-homing your server into both the 4506 and 4507 is good to go: you have full redundancy for the device and for the link between the server and the device, so there is no SPF.
The struggle with backup times is real; to overcome it you need budget (a SAN with Fibre Channel; Cisco has solutions in that field).
Please rate if this helps clarify.
So you are saying it's OK to plug the servers directly into the core switch? When teaming the NICs (HP ProLiants), don't the corresponding switch ports have to be set up as an EtherChannel to accommodate the two NICs? If so, I'm unclear how this could be done on two separate devices (the 4507 and 4506).
Anybody else have an opinion on the two designs? Also, does anyone else know if it is possible to dual-home a server into two different L3 devices?
Server dual-homing capability depends upon the operating system / TCP/IP stack. You mentioned Compaq/HP teaming, so I assume you are talking Windows. If you connect a server to two separate devices using HP teaming you can get resilience but not increased throughput: only one of the interfaces will be able to receive traffic, whilst both may transmit.
The exception to this rule is the 3750 stack; whilst the members are separate physical devices, they act as a single logical unit.
Thanks Mark. Which design would you go with: the one where the server farm module consists of a stack of 3750s, or the design where the servers that need to be dual-homed are plugged into the 4507 and 4506, and the rest of the servers are plugged into the 2950s, which uplink into the cores?
I like the idea of the stack for redundancy in the server farm module, and if a server needs to be dual-homed or not, it still plugs into the stack. It seems much cleaner.
You might want to think of redundancy from the perspective of the machine. If you take a user desktop and trace its physical path all the way to the endpoint, you will find the possible points of failure along the way.
So if you take what your boss wants..
If you are running HSRP for, let's say, VLAN 2 (user workstations) and the primary path is the cat4507, then the user desktop will use that link to forward its request. If that link is down, it will then forward to the backup L3 switch (cat4506). That redundancy piece is covered.
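The HSRP piece for a VLAN like that could be sketched as follows (addresses, priorities, and the group number are all illustrative):

```
! on the cat4507 (preferred gateway for VLAN 2)
interface Vlan2
 ip address 10.1.2.2 255.255.255.0
 standby 2 ip 10.1.2.1
 standby 2 priority 110
 standby 2 preempt
!
! on the cat4506 (standby gateway)
interface Vlan2
 ip address 10.1.2.3 255.255.255.0
 standby 2 ip 10.1.2.1
 standby 2 priority 100
 standby 2 preempt
```

Hosts in VLAN 2 point at the shared virtual address (10.1.2.1 here) rather than either switch's own interface address.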
Step 2. Now the packet needs to get to the DB server. From your diagram it seems it should make it there with no problem, but what happens if the link to the server is down? The packet then has no way of getting to that server; it will never know where the server is, because your cat4507 said the path to the server was here. You should have a trunk or some sort of EtherChannel between the Catalysts to achieve full redundancy.
Your question is hard to answer because there are many reasons why you should or should not do something. What is your boss's budget? How large is the environment? Do you need Cisco's 3-layer model, or does the collapsed backbone fit your needs now and a year from now? We network engineers tend to think that building the 3-layer model at the beginning will make our lives much easier later on, but at the same time it might not be feasible.
You have to remember though that you should be looking at the logical aspect of the design. Think about certain disaster type scenarios that might prevent the servers and host from talking to each other and then figure out what workarounds there are to prevent them. Keeping your design simple will help you achieve more.
Your design is good, but it is missing the connection between switches. Your boss's design can work too, but it also needs the links between the two switches.
Thanks for the insight, it is very much appreciated.
When you say my design is good but it is missing the connection between switches, what connection are you referring to?
If each riser closet in the building had two uplinks coming back to the server room, from two different switches, and each uplinked into each 4500, redundancy from the wiring closet is achieved.
The stack of 3750's allows us to dual-home servers (potentially using ether-channel), and make sure each member of the stack uplinks into each 4500, giving us redundancy at the server farm level as well.
This is a very interesting discussion and I'm glad someone else out there is struggling with making "the right choice."
In my organization we have very high availability requirements, and we have tried to engineer redundancy into every aspect of the operation. We have redundant applications on redundant hardware, connected to the network via redundant links, those links connected to redundant switches, all backed up entirely with a redundant datacenter.
In our switching design we have each server dual-homed into our "collapsed backbone," which consists of two 4507Rs with a 2-member EtherChannel between the switches. This has worked fine for us, although I must admit we have yet to experience a failure of a major component (switch, line card, fiber transceiver, etc.). The only issue I've ever been able to find is the incessant log messages on the switches reporting hosts flapping between the EtherChannel and the local port (although I have wondered if that might be causing inefficient switching and therefore a performance hit -- any ideas?).
So, for what it's worth, we plug directly into our core switches, dual-homed, and haven't ever had any infrastructure related outages. (Knock on wood!!).
Another reason I would like to move servers out of the core switch and into a stack of 3750's is the oversubscription properties of the 48 port gb line card. We are seeing a lot of buffer errors on switchports when our backups go at night, and Cisco TAC tells me that they are showing up because we are pushing too much data through each of the 8 port groups.
The switch just doesn't seem to be built to have a lot of servers plugged directly into it when they are moving a lot of data.
We used to have our servers directly plugged into core 6500's in our datacentre until we ran out of ports and then we had to use access-layer 6500's for the servers. They both will work but a more scalable model is certainly to connect servers into access-layer switches.
Another factor to take into account is blade systems (HP/IBM etc.) where each chassis has 2 integrated Cisco switches. You would want to connect these into your core switches, not your access-layer switches.
One of the key decisions in designing is to design something that isn't just fit for purpose now but can grow without a major change to the network infrastructure. Of course, as a previous poster mentioned, budget is also an issue.
One advantage of your design is that because you are not connecting your 4500 switches directly to each other (correct me if you are, but your diagram doesn't show it), you have no loops, which means that STP does not kick in. But this also means that your HSRP traffic will need to go via the access-layer switch links.
A more serious problem could be with your traffic flows, depending on how you set your trunks up. For argument's sake, let's say you set up the trunks to each switch to only allow the VLANs in use on that switch, so for example your trunks to your 3750 switches only allow server VLANs on them.
Now the HSRP active gateway for one of your client vlans is the 4507. A client in that vlan wants to talk to a server. The server vlan HSRP active gateway is the 4506.
Now lets assume that the link from the client switch to the 4506 switch dies.
The client on that switch sends a packet to the server. The packet reaches the server via the 4507 switch. The server responds and sends the packet to its active gateway on the 4506 switch.
When the packet reaches the 4506, the switch has no way of sending it back to the client, as its link to the client VLAN has gone down.
If you have a layer 2 trunk between your 2 4500 switches that carries all vlan traffic this would not be an issue. The 4506 would send the traffic down the L2 trunk and the 4507 would send it back to the client vlan.
That's why a lot of designs have a layer 2 EtherChannel trunk between the core switches. HSRP would also flow across this link.
The downside is that now you have loops in your network and STP has to come into the picture. However with RPVST+ the failover times can be reduced to seconds on the failure of a link.
It's always a trade-off in design, but I would go with your solution and add a layer 2 EtherChannel trunk between your 4500 switches.
If you do this, make sure you set the 4500s to be spanning-tree root primary and secondary for all VLANs to ensure optimal traffic flows.
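Putting those last two suggestions together, the inter-core link might be configured roughly like this (port numbers and the VLAN range are illustrative, not from your diagrams):

```
! two-port layer 2 EtherChannel trunk between the 4507 and 4506
interface range GigabitEthernet1/1 - 2
 switchport
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode desirable
!
! fast convergence when a link or switch fails
spanning-tree mode rapid-pvst
!
! on the 4507:
spanning-tree vlan 1-10 root primary
! on the 4506:
spanning-tree vlan 1-10 root secondary
```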
I agree with Jon. This was also a comment/suggestion I made in my previous post. The EtherChannel is still needed regardless of your access-layer design. Jon did a wonderful job of explaining why the EtherChannel is ideal. I think with the suggestions made here you should have a pretty decent design to present to your boss.
Thanks for taking the time to write up that fantastic post. Looks like I'll add the ether-channel between core switches, and maybe I'll investigate buying a used 6509E instead of a new 4506. It seems like you can pick up used equipment much cheaper, and just add it to a Smartnet contract.