The layout for this problem is:
At location #1 there are 4 switches A, B, C and D connected to each other in a small mesh.
At location #2 there also are 4 switches E, F, G and H connected to each other in a small mesh.
There are several vlans trunked between the switches, and 2 fiber connections between the two locations.
The problem that occurred the other week is a broadcast storm. The storm disappears when I disable one of the two fiber connections between the two locations.
I found that the root selection works fine for vlan 2-999 but not for vlan 1. One switch at location #1, let's say A, is root for all vlans. But another switch at location #2, let's say E, is root for vlan 1 and only vlan 1.
I can move the root for vlan 1 between switches E, F, G and H by changing their priorities. But I am unable to make A the root for all vlans.
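For reference, a minimal sketch of the kind of priority change described above, assuming PVST+ is running. The value 4096 is only an illustration, not the value actually configured on these switches.

```
! On switch A: lower the bridge priority so A wins the root election
! for every vlan, including vlan 1 (4096 is an assumed example value)
configure terminal
 spanning-tree vlan 1-999 priority 4096
end

! On any switch: check which bridge is currently seen as root for vlan 1
show spanning-tree vlan 1 root
```

If A's BPDUs for vlan 1 never reach location #2, the switches there will keep electing a root among themselves no matter what priority A has.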
Seems like BPDUs are dropped for vlan 1 over the fiber connection. But then again why do they come through for vlan 2-999?
The switches are 2 x WS-C3560G-24TS and 2 x WS-C2950T-24 at location #1 and 4 x WS-C3560G-24TS at location #2.
What could be causing this problem?
Thanks in advance!
You're definitely not receiving vlan 1's bpdu on E. A does not look to be the root either btw, but that might be a cut and paste error.
On the fiber link, do a "show spanning-tree vlan 1 interface" for the fiber port.
Also, you can set a debug condition for vlan 1 and the fiber port and trace the BPDU sent/received.
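A sketch of both steps, assuming the fiber uplink is GigabitEthernet0/25 (a placeholder, substitute your own port). The vlan condition may not be available on every platform or release, and debugs do not always honor conditions. If it is rejected, the interface condition alone still helps narrow the output.

```
! STP state and BPDU sent/received counters for vlan 1 on the fiber port
show spanning-tree vlan 1 interface gigabitEthernet 0/25 detail

! Restrict debug output to the fiber port (and vlan 1 where supported)
debug condition interface gigabitEthernet 0/25
debug condition vlan 1

! Trace BPDUs in both directions
debug spanning-tree bpdu receive
debug spanning-tree bpdu transmit

! Clean up when done
undebug all
no debug condition all
```

In the detail output, compare the BPDU sent and received counters on both ends of the link. A counter that increases on the sending side while the receive counter on the far end stays flat points at something in between dropping the frames.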
The only difference between vlan 1's BPDUs and the others is that vlan 1's are sent untagged to the IEEE address (so they are a little shorter than the PVST BPDUs). Do you have some configuration forcing all trunk traffic to be tagged?
Nothing out of the ordinary in the portion of the config you've posted. If you have spanning-tree running globally on all switches and have a looped topology, one port will go into blocking while traffic flows through the remaining port toward the root.
From the portion you posted, I don't see two switches claiming to be root for Vlan 1.
Switch E is the root of Vlan 1 while Switch A is pointing towards G0/20 for the root of all Vlans.
I believe I need more information from your network to determine what the problem really is.
Please post the spanning-tree output from all switches.
Oh, sorry Edison.
I have pasted the output from switch D and not A. Please find the correct version attached.
I think Francois is on to something, too. But I am not sure how to debug the BPDU transmission between D and E. Is it possible to do it using "switch-commands", or do I need to use some kind of tool, e.g. ethereal?
Thanks for the reply!
It's weird that this kind of problem should occur all of a sudden if this is a configuration issue. Did you change anything before running into trouble? Or at least, were there any significant network events (ports added/removed/failing)?
Vlan 1's BPDUs, in the PVST+ implementation, are transmitted untagged on the wire, in the "native vlan". Is the native vlan allowed on both sides of the fiber links?
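A quick way to check this, as a sketch. GigabitEthernet0/25 stands in for the fiber port (an assumption), and the last command only exists on platforms that support native vlan tagging.

```
! Native vlan, allowed vlans and forwarding vlans for every trunk
show interfaces trunk

! Full switchport view of the fiber port, including trunking mode
! and native vlan
show interfaces gigabitEthernet 0/25 switchport

! Whether the switch is configured to tag native-vlan frames
show vlan dot1q tag native
```

Run the same commands on both ends of each fiber link. The native vlan and the allowed-vlan list should match on the two sides.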
Thanks and regards,
Thanks for the reply!
I am not aware of any changes on my part that could have triggered the broadcast storm. That is why I blame the fiber provider. One thing that I forgot to mention is that at each end of the fiber there is a "box" which converts the light to 1000 Mbit/s ethernet. And my fiber provider tells me that they did not change anything around the time the broadcast storm began. Actually it happened at 3 am on a Sunday morning. And I am sure I was sleeping! :-)
So I am very confused, and my boss is not very happy that we don't have redundancy anymore.
Maybe my attached txt-document in the previous post helps?
I would definitely look into what this fiber provider is doing. Maybe they have started some kind of filtering. That's why it's important that you show that you are sending the bpdu on one side and not receiving them on the other. Considering that the IEEE BPDUs are a little bit shorter than the PVST BPDUs, that could be a physical problem. But I think it's unlikely if it affects two independent fibers (well, they might not be that independent).
A feature like loopguard would have prevented (temporarily at least) the loop by blocking the fiber ports on E.
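As a sketch, loop guard can be enabled either globally or per port. The interface name below is a placeholder for the fiber uplink, not taken from the posted configuration.

```
! Option 1: enable loop guard by default on all point-to-point ports
configure terminal
 spanning-tree loopguard default
end

! Option 2: enable it only on the fiber uplink
configure terminal
 interface gigabitEthernet 0/25
  spanning-tree guard loop
end
```

With loop guard on, a port that stops receiving BPDUs goes into a loop-inconsistent (blocking) state instead of moving to forwarding, which is what would have kept the second fiber link from closing the loop.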
Thank you very much Francois!
Now I know for sure that BPDUs are not transmitted across the fiber connection for vlan 1, although they are for the other vlans. Please look at the attached txt-document.
I have three options now (at least):
#1: Tell my current fiber provider to fix the problem.
#2: Find a new fiber provider.
#3: Try to find a way to make BPDUs go through for vlan 1. Remember that the setup is ethernet->fiber->ethernet. I am not sure what the "box" that converts light into ethernet and vice versa is doing (besides the obvious part). Maybe I should investigate that part.
Thank you very much for the exact details on my problem!
Well, maybe not so fast!
There is something really striking in this output: "Bpdu filter is enabled internally"!!
Can you post the full configuration of your ports? Did you ever configure (and maybe remove) some feature like Q-in-Q?
I don't know how this could end up in your configuration. This filtering mechanism is meant to be used by features other than STP that want to get rid of BPDUs. It can also be activated by some test commands. In the latter case, you could get rid of it by doing a shut/no shut on the interface. I don't know if you can afford that, but it would be nice if you could flap the interface as a test.
I'm extremely surprised that it could happen with no user intervention. I need to go now, but I'll try to get more information on this.
Hmmm, some of the switches have "Bpdu filter is enabled internally" and some don't.
I don't see a logical pattern on this though. I tried to "shutdown" + "no shutdown" some of the interfaces, but no luck. And there is nothing in the running conf that says anything about an internal bpdu filter. I found the following note, which says why: 'With Cisco IOS Release 12.2(20)EW, the BPDU filtering configuration for both dot1q and Layer 2 protocol tunneling is no longer visible in the running configuration as "spanning-tree bpdufilter enable." Instead, it is visible in the output of the show spanning tree int detail command as shown below.'
I am not 100% sure but I don't think that "Q-in-Q" has ever been configured on the switches.
I am not able to guess what could have caused this "bad behavior". A couple of days before the incident my colleague installed a vmware server. And I installed a couple of new SUN Solaris servers. But these servers are only connected using one interface per server. The vmware server uses vlans though.
I still believe that talking to my fiber provider is the next step.
Thanks for your help.
I have now tried to connect Wireshark (Ethereal) to the "open end" of my second fiber connection. And it does show that bpdu packets are received from the switch in the other end of the fiber connection. It shows bpdu packets for every vlan (also vlan 1).
So now I am stuck with the "Bpdu filter is enabled internally" problem. I have tried adding "spanning-tree bpdufilter disable" to the interfaces, but no luck.
Any ideas how to disable the "Bpdu filter is enabled internally"?
Sorry, I forgot about you.
I have no clue how this could be enabled. That looks like a bug. As a workaround, please, try the following to clear the flag:
test spanning-tree set int-bpdufilter int
I thought that doing a shut/no shut would have been enough, let me know if this works.