SG300 - possible Bug regarding Vlan-Interfaces in Release 220.127.116.11
Today we had a strange problem in one of our customers' production environments. They have 6 SMB switches, of which 3 were suddenly unreachable. All 3 (B, C, D) are daisy-chained behind switch A. The trunk from A to B was up and OK, VLANs active, L2 status OK.
"show cdp neigh det" showed the correct output, including the management IP of B, but B was unreachable. After a reboot of B, same issue. I then rebooted switch A, and after the reboot it had the same issue. The trunk from our core to A was OK, VLANs up and active, CDP OK, but A was not reachable.
Established connection to A via console, all status ok, but still not working. After 2 more reboots, switch A came up again.
Console to B -> ping to its own interface not working (sh ip int vlan shows UP/UP). Reboot doesn't help. Shut/no shut -> switch working again.
Same procedure on C and D, both showing interface as UP/UP, but only reachable after a shut/no shut.
The topology is plain - only one VLAN in use for Mgmt and Clients.
I can't really explain why a reboot of the whole switch didn't have any impact, but a reset of the interface did.
Thanks for your time and advice. Regarding your points:
- Yes, the switches of course use different IP addresses (192.168.0.x). Everything was working properly before.
- Physical layer - from my point of view not the cause, because L2 seemed completely OK. I could see the CDP neighbour, uplinks, traversing VLANs, spanning tree - everything OK.
- Cabling - same as above.
- Regarding MAC address table, seemed also ok. On Switch A I could see MAC-addresses on the downlink to B.
- Since Monday, when the problem was solved, the network has been stable without any problems. So a packet capture won't show anything now, until it happens again. But if it happens again, we will be in serious trouble with the customer, because it's a production environment. :)
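
Since a capture only helps while the fault is present, one option is to watch the management IPs so a recurrence is noticed right away. Below is a minimal sketch of that idea, not something we have in place: the addresses are placeholders (not the real 192.168.0.x ones), the names `ping_once` and `find_unreachable` are made up for illustration, and the ping flags assume a Linux `ping`.

```python
import subprocess
from typing import Callable, Iterable, List


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Return True if a single ICMP echo to `host` is answered.

    Assumption: Linux iputils ping (-c = count, -W = reply timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_unreachable(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = ping_once,
) -> List[str]:
    """Return the hosts for which `probe` reports no answer, in input order."""
    return [host for host in hosts if not probe(host)]


if __name__ == "__main__":
    # Placeholder management IPs for switches A-D (hypothetical values).
    switches = ["192.0.2.11", "192.0.2.12", "192.0.2.13", "192.0.2.14"]
    for host in find_unreachable(switches):
        print(f"management IP {host} does not answer ping")
```

Run from cron or a monitoring host, this would at least tell us the moment a management interface stops answering, so a capture can be started while the problem is still there.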
P.S.: Dot1x is also running on the switches. Since the IP interfaces of the switches had problems, the connection to the RADIUS server failed as well. I think this was the reason for the failed user connectivity. Otherwise the users wouldn't have had any problems; only the switches would have been unmanageable.
But the main question remains - why did the VLAN interfaces go down while still being shown as up?
As mentioned, they weren't even pingable by themselves before the shut/noshut.
As a first step I am thinking of upgrading all devices to 18.104.22.168 this weekend. Any recommendations?
February 12, 2017