I am having problems with 2960 & 3560 switches detecting a change in the network topology that 2950 switches do not see. When I view spanning-tree details on any of my 2960/3560 switches, this is what I see:
Root port is 2 (GigabitEthernet0/2), cost of root path is 4
Topology change flag not set, detected flag not set
Number of topology changes 4903 last change occurred 00:03:43 ago
I do not get the same results on my 2950 switches:
Root port is 24 (FastEthernet0/24), cost of root path is 23
Topology change flag not set, detected flag not set
Number of topology changes 21 last change occurred 06:02:38 ago
The 2950 switch is connected upstream of the 3560 switch on port 24. The flush message is therefore coming from or through the 2950 switch. This is consistent across my network. The 3560 switches are repeatedly flushing their CAM tables whereas the 2950 switches are not. Debugging STP on the 3560 switch gave me this:
105813: Aug 14 13:23:22.890: STP: VLAN0010 rx BPDU: config protocol = rstp, packet from FastEthernet0/24 , linktype SSTP , enctype 3, encsize 22
105814: Aug 14 13:23:22.890: STP: enc 01 00 0C CC CC CD 00 0D 28 C0 89 58 00 32 AA AA 03 00 00 0C 01 0B
105815: Aug 14 13:23:22.890: STP: Data 0000020239600A001C0EADFB0000000017A00A000D28C0894080180100140002000F00
105816: Aug 14 13:23:22.890: STP: VLAN0010 Fa0/24:0000 02 02 39 600A001C0EADFB00 00000017 A00A000D28C08940 8018 0100 1400 0200 0F
105817: Aug 14 13:23:22.890: RSTP(10): Fa0/24 other msg
105818: Aug 14 13:23:22.890: STP SW: VLAN10: topology change over - this bridge is not root
105819: Aug 14 13:23:22.890: STP SW: Gi0/2 new flush req for 1 vlans
105820: Aug 14 13:23:22.899: STP SW: Fa0/13 new flush req for 1 vlans
105821: Aug 14 13:23:22.899: STP SW: Fa0/14 new flush req for 1 vlans
105822: Aug 14 13:23:22.899: STP SW: Fa0/15 new flush req for 1 vlans
105823: Aug 14 13:23:22.899: STP SW: Fa0/16 new flush req for 1 vlans
105824: Aug 14 13:23:22.899: STP SW: Fa0/17 new flush req for 1 vlans
105825: Aug 14 13:23:22.899: STP SW: Fa0/20 new flush req for 1 vlans
105826: Aug 14 13:23:22.899: STP SW: Fa0/23 new flush req for 1 vlans
105827: Aug 14 13:23:22.899: STP SW: flushed Gi0/2 1 vlans agg 1 qtime 8 dur 0ms
105828: Aug 14 13:23:22.899: STP SW: flushed Fa0/13 1 vlans agg 1 qtime 8 dur 0ms
105829: Aug 14 13:23:22.899: STP SW: flushed Fa0/14 1 vlans agg 1 qtime 0 dur 0ms
105830: Aug 14 13:23:22.899: STP SW: flushed Fa0/15 1 vlans agg 1 qtime 0 dur 0ms
105831: Aug 14 13:23:22.899: STP SW: flushed Fa0/16 1 vlans agg 1 qtime 0 dur 0ms
105832: Aug 14 13:23:22.899: STP SW: flushed Fa0/17 1 vlans agg 1 qtime 0 dur 0ms
105833: Aug 14 13:23:22.899: STP SW: flushed Fa0/20 1 vlans agg 1 qtime 0 dur 0ms
105834: Aug 14 13:23:22.899: STP SW: flushed Fa0/23 1 vlans agg 1 qtime 0 dur 0ms
105835: Aug 14 13:23:22.899: STP SW: flush task will sleep, processed 8, yield 0ms
105836: Aug 14 13:23:22.949: RSTP(10): sending BPDU out Gi0/2
105837: Aug 14 13:23:22.949: STP SW: TX: 0100.0ccc.cccd<-001f.27de.a682 type/len 0032
105838: Aug 14 13:23:22.949: encap SNAP linktype sstp vlan 10 len 64 on v10 Gi0/2
105839: Aug 14 13:23:22.949: AA AA 03 00000C 010B SSTP
105840: Aug 14 13:23:22.949: CFG P:0000 V:02 T:02 F:79 R:600A 001c.0ead.fb00 00000004
105841: Aug 14 13:23:22.949: B:800A 001f.27de.a680 80.02 A:0000 M:1400 H:0200 F:0F00
105842: Aug 14 13:23:22.949: T:0000 L:0002 D:000A
105843: Aug 14 13:23:22.949: RSTP(10): sending BPDU out Fa0/13
105844: Aug 14 13:23:22.949: STP SW: TX: 0100.0ccc.cccd<-001f.27de.a68f type/len 0032
105845: Aug 14 13:23:22.949: encap SNAP linktype sstp vlan 10 len 64 on v10 Fa0/13
105846: Aug 14 13:23:22.949: AA AA 03 00000C 010B SSTP
2960/3560/3750 switches - IOS 12.2(46)SE
2950 switches - IOS 12.1(22)EA12
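The "Data" line in the debug output above is the raw BPDU payload, and decoding it makes the flags easier to read. The following is a minimal Python sketch, assuming the standard IEEE 802.1D-2004 RST BPDU field layout and that the hex string holds the first 35 octets of the BPDU. The function and key names are illustrative only and do not come from any Cisco tool.

```python
import struct

# Port role encoding in bits 2-3 of the RST BPDU flags octet (IEEE 802.1D-2004).
PORT_ROLES = {0: "unknown", 1: "alternate/backup", 2: "root", 3: "designated"}


def decode_flags(flags):
    """Split the one-octet RST BPDU flags field into named bits."""
    return {
        "topology_change": bool(flags & 0x01),
        "proposal": bool(flags & 0x02),
        "port_role": PORT_ROLES[(flags >> 2) & 0x03],
        "learning": bool(flags & 0x10),
        "forwarding": bool(flags & 0x20),
        "agreement": bool(flags & 0x40),
        "tc_ack": bool(flags & 0x80),
    }


def format_mac(raw):
    """Render six raw octets as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def decode_bpdu(hex_data):
    """Decode the first 35 octets of a BPDU given as a hex string."""
    raw = bytes.fromhex(hex_data)
    (proto, version, bpdu_type, flags,
     root_prio, root_mac, root_cost,
     bridge_prio, bridge_mac, port_id,
     msg_age, max_age, hello, fwd_delay) = struct.unpack(
        "!HBBBH6sIH6sHHHHH", raw[:35])
    return {
        "protocol_id": proto,
        "version": version,
        "type": bpdu_type,
        "flags": decode_flags(flags),
        # The low 12 bits of the priority field carry the extended system ID (the VLAN).
        "root_priority": root_prio & 0xF000,
        "vlan": root_prio & 0x0FFF,
        "root_mac": format_mac(root_mac),
        "root_path_cost": root_cost,
        "bridge_priority": bridge_prio & 0xF000,
        "bridge_mac": format_mac(bridge_mac),
        "port_id": port_id,
        # Timer fields are encoded in units of 1/256 second.
        "message_age_s": msg_age / 256,
        "max_age_s": max_age / 256,
        "hello_s": hello / 256,
        "forward_delay_s": fwd_delay / 256,
    }


if __name__ == "__main__":
    # Payload copied from the "STP: Data" debug line (received on Fa0/24).
    sample = ("0000020239600A001C0EADFB0000000017"
              "A00A000D28C0894080180100140002000F00")
    for key, value in decode_bpdu(sample).items():
        print(f"{key}: {value}")
```

Under this reading, the BPDU received on Fa0/24 has flags 0x39, which means the topology change bit is set and the sending port reports the root role, with a root path cost of 23. That matches the root path cost shown on the 2950 and would be consistent with the 2950 relaying a topology change toward the root rather than originating a new root claim.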
Are all switches consistently configured for Rapid Spanning Tree or 802.1D spanning tree? What is your standard? (From the debugging it seems to be rapid, but are all switches configured like this?)
All switches are configured for Rapid Spanning Tree. I have a few 1410 bridges that are configured for protocol IEEE. All the Motorola Canopy units function as transparent bridges.
One issue I would pursue right now follows from the fact that your switches (or at least some of them) run Rapid STP. The RSTP flushes the CAM table when a topology change is detected, rather than just shortening the timeout value.
Ordinary switchports connected to workstations going up are a source of numerous topology changes in RSTP networks. However, generating a topology change when a normal workstation comes up is useless and only causes temporary flooding in the network. Therefore, the ports towards workstations should all be configured with the spanning-tree portfast command, which designates them as RSTP Edge ports. An RSTP Edge port does not generate a topology change, and it also transitions rapidly from the Discarding to the Forwarding state.
Therefore, if possible, all switches in your network should run RSTP and all switchports leading to workstations should be configured as portfast ports.
Is it possible for you to verify that your network is set up in this way and make changes if necessary?
Last week I transitioned all the switches on my network to Rapid Spanning Tree. The reason for this was increasing instances of mac-flap errors being detected on certain switches. I am certain the two problems are connected, and that rapid spanning tree is simply allowing me to debug it more thoroughly. I can tell you that all access ports on every switch have spanning-tree portfast enabled. Also, the 2950 switch I mentioned has a Canopy Cluster Management Module connected to it. The Canopy system contains 6 APs that function as transparent point-to-multipoint bridges. When you list CDP neighbors you will see 20 - 30 switches on that one trunk port. For this reason I specify that port as "spanning-tree link-type shared".
Is it possible that some switch, possibly behind the Canopy devices, is trying to become the root bridge? Does the debug spanning-tree root give any interesting information regarding this?
Also, even if you are sure that RSTP is activated on all switches, use show spanning-tree to check whether all switches really see each other as RSTP neighbors. I have sometimes come across a situation where two switches configured for RSTP took each other for legacy STP neighbors because of a race condition while they were being configured. Issuing the command clear spanning-tree detected-protocols is helpful here.
Also, are the switches behind the Canopy running STP or RSTP? Given how many of them there are, it is possible that they are still running legacy STP, which generates a topology change whenever a port goes up or down. It might be helpful to protect the switchport to the Canopy using Root Guard, or even BPDU Filter if it is guaranteed that there is no Layer 2 loop via the Canopy part back to your network.
Apart from this, the best I can suggest right now is to use debug to trace where the topology change originates and possibly why.
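One simple way to narrow down where the topology changes originate is to collect the topology change counters from each switch and compare them. The following is a rough Python sketch, assuming the counter line has exactly the format quoted earlier in this thread ("Number of topology changes N last change occurred HH:MM:SS ago"). The switch labels are placeholders, and other elapsed-time formats are not handled and are simply skipped.

```python
import re

# Matches the counter line exactly as quoted in this thread.
TC_LINE = re.compile(
    r"Number of topology changes (\d+) "
    r"last change occurred (\d+):(\d+):(\d+) ago")


def parse_tc_counter(text):
    """Return (change_count, seconds_since_last_change), or None if no match."""
    match = TC_LINE.search(text)
    if match is None:
        return None
    count, hours, minutes, seconds = (int(g) for g in match.groups())
    return count, hours * 3600 + minutes * 60 + seconds


def rank_by_recent_change(outputs):
    """Sort switch labels so the most recent topology change comes first.

    `outputs` maps a switch label (placeholder names) to captured text.
    Switches whose text has no parsable counter line are left out.
    """
    parsed = {}
    for name, text in outputs.items():
        result = parse_tc_counter(text)
        if result is not None:
            parsed[name] = result
    return sorted(parsed, key=lambda name: parsed[name][1])


if __name__ == "__main__":
    captured = {
        "3560": "Number of topology changes 4903 last change occurred 00:03:43 ago",
        "2950": "Number of topology changes 21 last change occurred 06:02:38 ago",
    }
    for name in rank_by_recent_change(captured):
        count, age = parse_tc_counter(captured[name])
        print(f"{name}: {count} changes, last one {age} s ago")
```

Walking hop by hop toward the switch with the most recent change and the highest count is one way to follow the trail, but this only ranks counters and does not identify the port the change arrived on.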
I have not tried clear spanning-tree detected-protocols but will give it a try. All switches except for one are globally configured with BPDU Guard and BPDU Filter. VTP on all switches except the core is running in Transparent mode. The core switch is the only VTP server on the network and is configured with priority 24576. Do you have any insight into why the 2950 switches are not flushing their CAM tables along with the other switches?
I am really not quite sure why the 2950s behave differently from the 3560s. It might well be caused by differences in the IOS. Also, if the 2950s happen to run legacy STP towards at least some of their neighbors, it is conceivable that they use the legacy style of MAC flushing (shortening the aging time instead of flushing the MAC addresses immediately). I must admit I have never tried to debug how a switch running RSTP and having a couple of legacy STP neighbors deals with topology changes.
You say the 2950 is connected upstream of the 3560 switch, so the 2950 should be closer to the root switch than the 3560. However, this does not match your debug. Something behind port Gi0/2 on the 3560 is becoming the root switch, and Gi0/2 is the 3560's root port.
Second, do you have wireless bridges to other networks connected behind the Canopy Cluster? Check whether the wireless connection is OK: if it loses BPDUs for 6 seconds, the remote sites may become root switches for a few seconds. A topology drawing might also help :-)
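The 6-second figure follows from RSTP timing: a port discards the information learned from a neighbor after three consecutive hellos are missed, and the hello time in the debug output above is H:0200, which is 2 seconds. A tiny Python sketch of that arithmetic, where the function name and the default of three missed hellos are stated assumptions based on standard RSTP behavior:

```python
def rstp_info_timeout_s(hello_field, missed_hellos=3):
    """Seconds until received BPDU information expires on an RSTP port.

    `hello_field` is the raw hello-time value from the BPDU, which is
    encoded in units of 1/256 second.
    """
    return missed_hellos * hello_field / 256


if __name__ == "__main__":
    # H:0200 is taken from the transmitted BPDU in the debug output.
    print(rstp_info_timeout_s(0x0200))
```

For comparison, a legacy STP neighbor would instead wait for max age (M:1400 in the same debug output, which is 20 seconds) before aging out the information.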
Core(3750)->3560->PTP600->PTP600->(2950T-24)->Canopy Cluster->(41 switches: 2950T-24/2960-8TC)
The 2960/3560 switches will flush the tables and see 5000+ topology changes. A 2950 switch will see 1 - 10 topology changes and will not flush. It seems that the switches running 12.2(46)SE are seeing a change to the root that the 2950 switches do not. The core will see a couple of hundred topology changes but does not flush the tables.