3750 Design and Stack Redundancy

Unanswered Question


We recently had an issue with stack port flapping between two of our switches in a six-switch stack. Unfortunately, we did not have the ring completed because the longest stack cable available wasn't quite long enough. This of course caused major problems, and it took a while to get our environment back up. The episode has led to a redesign project. I am of the opinion (opinion, because I don't know it for a fact) that if we had had the stack interconnected in a ring, the problem wouldn't have been as severe, but I can't say for sure.

Because of this, we are considering partitioning our switches into two separate stacks interconnected by cross-stack EtherChannel, and then splitting our two-node server clusters across them (one member on one stack, the other member on the other stack). Under normal operating conditions, all active nodes in the clusters would be on the same stack, taking advantage of the 32-Gbps backplane; on failover, traffic would traverse the EtherChannel. The thinking is that if one stack has major problems, the clusters will either fail over automatically to the member node connected to the other stack, or we can give them a little help failing over.

I understand that a single stack properly connected in a ring with stack cables has quite a bit of redundancy built in. We could easily connect each member of the clusters to separate switches in the stack and still reap the benefits of failover if a switch fails. I think the big question mark is the master switch. We are concerned that the master switch could potentially bring the whole stack to its knees if, for some reason, it got into a state where it wasn't down or locked up but was instead operating erratically. Any thoughts?
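For reference, the cross-stack EtherChannel described above might be configured along these lines. This is a minimal sketch; the interface numbers and trunking details are assumptions, not from the post. Note that early 3750 IOS releases supported only static (`mode on`) EtherChannel when the bundled ports spanned different stack members; cross-stack LACP was added in later releases.

```
! Hypothetical sketch on stack 1: bundle one uplink from each of two
! different stack members so the link to stack 2 survives the loss of
! any single member switch. Interface numbers are illustrative only.
interface Port-channel1
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface range GigabitEthernet1/0/49 , GigabitEthernet4/0/49
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode on
```

A mirror-image configuration would go on the second stack; putting the two physical links on different member switches in each stack is what makes the EtherChannel "cross-stack" and removes any single switch as a point of failure for inter-stack traffic.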

adavenport Tue, 05/22/2007 - 19:58

Your best bet is to complete the ring correctly. I suspect you have not cabled it correctly if you need anything longer than the available cables; i.e., you don't have to run a cable all the way from the bottom switch to the top. At most you need to skip one switch. Look at the cabling examples carefully.

We have something like 300+ switches in around 75 stacks and have never had the master switch go 'crazy' and take down the whole stack. If the master dies, the stack elects a new one. I don't think this is a valid argument. The benefits of the high-bandwidth backplane and single point of management far outweigh any other issues.

HTH, Roger

adavenport Thu, 05/24/2007 - 21:32

Your job, should you choose to accept it, is to convince the others of the error in their ways. :)

By their logic (such as it is), they should attach the cluster nodes with four interconnects, two to each stack on different member switches. Why not three stacks? That would be even better, huh?

Good luck.
