Not sure if this is a spanning-tree or VTP domain question, but I wanted some input regarding our campus. We have approx. 100 access layer switches, mostly 2900XL and 3500XL, throughout 5 buildings, all interconnected via fiber. We have 2 6500 core switches with Sup 720s, running HSRP between them. We have approx. 30 VLANs and 1 VTP domain. I have been advised by a consultant that I should probably break up my network by adding Layer 3 routed links to at least one of the buildings, to create a couple of smaller VTP domains. I wanted to get some feedback here on this suggestion and understand why it would be necessary. We recently had a strange network problem where we lost access from all VLANs to our server VLAN, possibly because of a network loop somewhere, and the recommendation was made as a result of that issue. Please let me know what you think.
The consultant suggested going to Layer 3 because then you don't have spanning-tree recalcs to deal with. With Layer 3 and EIGRP, convergence is sub-second, as opposed to having to go through a spanning-tree recalc.
What circumstances would cause a spanning-tree recalculation? Also, do recalculations typically take 45 seconds, or are they sometimes longer?
If you're doing L3 on the 3500XLs (don't know if the 2900XLs support that), try a topology redesign where you use L3 uplinks. They heal much faster and don't have that nasty STP wait period. On the non-L3 switches, I'd suggest RPVST+, which has the equivalent of backbonefast and uplinkfast built in.
Network design best practice is that you keep your Layer 2 as local and as small as possible. You should have as much of your network as possible working at Layer 3.
This is because Layer 3 has far fewer barriers and problems to overcome: IP routing takes care of it all and is a lot faster at detecting and fixing failures, so you don't have to worry as much about Layer 2 problems. Also, if there is a Layer 2 problem, less of your network will be affected.
STP recalculations can take up to 50 seconds with the default timers on the default STP type, 802.1D, known as PVST+ mode in Cisco speak.
This is because when a link goes down, the max age timer has to expire, which takes 20 seconds, and then the port spends 15 seconds each in listening and learning before it goes into forwarding mode.
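To put numbers on that, here is a quick sketch of the arithmetic in Python, assuming the default 802.1D timers quoted above. The function name is made up for illustration. The 30 second case reflects standard 802.1D behaviour when a switch sees its own link fail and so does not have to wait for max age to expire; that detail is my addition, not something stated earlier in this thread.

```python
# Default 802.1D timers in seconds (the values quoted above).
MAX_AGE = 20
FORWARD_DELAY = 15  # spent once in listening and once in learning


def convergence_seconds(indirect_failure: bool) -> int:
    """Worst-case wait before a blocked port starts forwarding.

    indirect_failure=True: the switch only learns of the failure when
    BPDUs stop arriving, so max age has to expire first.
    indirect_failure=False: the switch's own link went down, so it skips
    the max age wait (assumption: standard 802.1D behaviour).
    """
    wait = MAX_AGE if indirect_failure else 0
    return wait + 2 * FORWARD_DELAY


print(convergence_seconds(indirect_failure=True))   # 50
print(convergence_seconds(indirect_failure=False))  # 30
```

So the 50 second figure is the worst case, and 30 seconds is what you would typically see when the failed link is directly attached.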
If you can switch to RSTP (keyword: spanning-tree mode rapid-pvst), you should. It is backward compatible with 802.1D, so even if only some of your switches support it, at least enable it on those.
Not sure how you would do that anyway if they are all 2900XLs and 3500XLs, as they are Layer 2 only switches, so routed links to those devices are impossible. Unless you have some L3 distribution switches below the 6500s, the point is probably moot. Those L2 switches are really old at this point and probably don't have some of the spanning-tree protection features like bpduguard; if they do, then maybe you should think about implementing them.
Layer 3 is ideal.
I recommend 'udld port aggressive' on all switch interconnections, both fibre and copper. I know UDLD is usually only talked about for fibre, but I have seen unidirectional links caused by faulty copper interfaces lead to a spanning-tree meltdown.
I should clarify that our main building has the 2 6500s, with about 30 3548 switches off of them, and the second building has another single 6500 with about 70 switches off of it. My idea was to create the L3 routed links between the 2 buildings (that is, between the 6500s) and break up the VTP domain of about 100 switches into 2 separate VTP domains, one with about 70 switches and the other with about 30, which would also create 2 smaller spanning-tree domains.
That sounds like a good plan with the kit you've got. Personally I would switch VTP into transparent mode as well and keep the VLAN database local to each switch. You have more control and less risk that way!