I ran into a problem today that puzzles me a lot. We are building redundancy in a data center and adding L2 links to two Catalyst 4506 switches instead of one. Most of our switches were connected to the first 4506 via LACP portchannels of two physical links each. We painlessly removed that configuration and moved one of the two links to the second 4506.
Now we have two CB3120 switches in an HP enclosure. They are not stacked, and each is connected to the first 4506 by four links bundled in an LACP portchannel.
I shut down two of the four ports on both sides, removed the portchannel configuration from them, and physically connected those links to the second 4506. Then I created a new portchannel configuration on the second 4506 and on one of the CB3120s, added the relevant interfaces and brought them up. The second 4506 blocked the new portchannel by STP, which is the expected behavior since the root bridge is the first 4506. The setup worked for about 40 seconds and then boom! All traffic to and through the CB3120 stopped. I quickly shut down the physical interfaces and the portchannel interface on the second 4506, but that did not help. I then consoled into the CB3120 and issued shutdown on both physical interfaces and the portchannel interface.
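For reference, the new portchannel configuration on each side was roughly of the following form. This is only a sketch: the interface names and the channel-group number are placeholders I am assuming for illustration, not the exact values from our devices, and the switchport/trunk settings are left out.

```
! Placeholder interface names and channel-group number (assumed, not the real ones)
interface range GigabitEthernet1/1 - 2
 ! "mode active" makes the ports negotiate the bundle via LACP
 channel-group 2 mode active
 no shutdown
!
interface Port-channel2
 ! switchport/trunk settings omitted here
 no shutdown
```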
What baffles me is that this did not help at all! While I was reconfiguring the interfaces and they were shut down, everything was OK. But once I brought them up, the traffic flow stopped and did not recover even after I shut them back down. I noticed that the traffic did not stop completely: for example, about 8 out of 40 pings passed. I re-checked whether STP was still blocking the links to the second 4506 and whether they had actually gone down. They had: the interfaces on both sides were administratively down and no traffic was passing over them. Only the two interfaces and the portchannel to the first 4506 were in use. As the downtime became critical, I reconnected the two links to the first 4506 and rolled back to the original configuration with four interfaces in one portchannel to one 4506.
Immediately everything came back to normal and traffic started passing.
The logs on both the 4506s and the CB3120 showed only interfaces going up and down: no errors, no err-disable events or anything of that kind. Logging is set to debug level.
I gave a log of all my actions and command outputs to another person, and he does not see any error or problem either.
Does anyone have any idea why this could have happened?
The network is rather simple: a "core" made of the 4506s and fewer than 10 switches, each connected to the core and never to one another. Even so, I'm not sure which statistics and logs to look at.