Connect a non-stack switch to router with ether channel caused network down
We just experienced a problem that took our production network down.
Here is the scenario:
Two 3750G/48 switches are stacked together, with a port channel that includes port 49 (fiber) on each switch. These connect to ports 2/19 and 2/20 on the 6509 core router, and both of those ports are also in a port channel. This has been running fine, with about 100 machines connected to the stacked switches.
A new 3750G/48 switch was powered on and connected to port 2/14 on the same router. This new switch has not been stacked with the other two, which means it is a standalone switch with no machines connected to it. The router port was not turned on, but switch port 49 was on.
After we turned on router port 2/14 and configured it as part of the port channel with ports 2/19 and 2/20, the whole network (everything that core router supports) went down immediately. We pulled the fiber out of the switch right away to disconnect it from the network, and traffic came back.
On the existing stacked switch, we saw many MAC flapping messages in the log:
Jun 17 14:58:25.933: %SW_MATM-4-MACFLAP_NOTIF: Host 0024.e834.9723 in vlan 110 is flapping between port Po1(note: this is switch's ether channel) and port Gi1/0/27 (note: this is port connected to the machine)
Jun 17 14:58:27.024: %SW_MATM-4-MACFLAP_NOTIF: Host 0024.e841.ad31 in vlan 111 is flapping between port Gi2/0/11 and port Po1
On the other switch, I saw:
Jun 17 14:58:53.830: %SW_MATM-4-MACFLAP_NOTIF: Host 0015.2b68.bf80 in vlan 113 is flapping between port Po2(note: this is to standby router 6509) and port Po1 (note: this is to active router 6509).
Jun 17 14:58:53.830: %SW_MATM-4-MACFLAP_NOTIF: Host 0018.8b31.e7cc in vlan 49 is flapping between port Po2 and port Po1
Jun 17 14:58:53.830: %SW_MATM-4-MACFLAP_NOTIF: Host 0001.d740.4c85 in vlan 921 is flapping between port Po1 and port Po2
Note (again): here Po1 (EtherChannel) goes to the active core router and Po2 (EtherChannel) goes to the standby core router.
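To get a quick picture of which port pairs are involved, the flap messages can be tallied with a short script. This is only a rough sketch: the regular expression is written against the message layout quoted above, and the function name `summarize_flaps` and the `SAMPLE` list are made up for illustration.

```python
import re
from collections import Counter

# Matches the MACFLAP_NOTIF layout quoted above. Anything between the first
# port name and " and port " (such as an inline "(note: ...)") is skipped.
MACFLAP_RE = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host (?P<mac>[0-9a-f.]+) in vlan (?P<vlan>\d+) "
    r"is flapping between port (?P<a>[A-Za-z]+[\d/]+).*? and port (?P<b>[A-Za-z]+[\d/]+)"
)

def summarize_flaps(lines):
    """Count flap messages per unordered pair of ports."""
    counts = Counter()
    for line in lines:
        m = MACFLAP_RE.search(line)
        if m:
            pair = tuple(sorted((m.group("a"), m.group("b"))))
            counts[pair] += 1
    return counts

# Three of the log lines from this post, used as sample input.
SAMPLE = [
    "Jun 17 14:58:27.024: %SW_MATM-4-MACFLAP_NOTIF: Host 0024.e841.ad31 in vlan 111 is flapping between port Gi2/0/11 and port Po1",
    "Jun 17 14:58:53.830: %SW_MATM-4-MACFLAP_NOTIF: Host 0018.8b31.e7cc in vlan 49 is flapping between port Po2 and port Po1",
    "Jun 17 14:58:53.830: %SW_MATM-4-MACFLAP_NOTIF: Host 0001.d740.4c85 in vlan 921 is flapping between port Po1 and port Po2",
]

flaps = summarize_flaps(SAMPLE)
for pair, count in flaps.items():
    print(pair, count)
```

Sorting each pair means "Po1 and Po2" and "Po2 and Po1" are counted together, which makes it easier to see whether the flapping is between the two uplinks or between an uplink and an access port.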
It looks like some kind of loop formed during this procedure.
Here are my questions:
a. What caused the core router to jam so that it could not handle traffic?
b. What is the correct procedure for adding a new 3750G to an existing stack?
c. Is the StackWise cable hot swappable? What issues do we need to pay attention to when adding a new 3750G to the existing stack? Some coworkers suggested configuring the EtherChannel first and then connecting the physical fiber. Is that correct?
d. Last question: if we have three stacked 3750Gs and, for some reason such as a broken StackWise cable, one switch is disconnected from the others in the stack, can this scenario cause the kind of network jam I experienced?
There was no machine connected to the new standalone 3750. After we configured core port 2/14 as part of the core port channel with 2/19-20, a loop occurred.
From the stacked switch log, I can see packets looping from the core back to the stacked switch: the learned MAC address flaps between a physical port and the port channel (error message: Host 0024.e834.9723 in vlan 110 is flapping between port Po1 and port Gi1/0/27).
From discussion with other engineers, they believe that Gi2/14 on the core never sent BPDUs to the new standalone switch, so port 1/0/49 on the standalone switch was in the forwarding state. Gi2/14 on core1 also believed it was connected to an end machine (not a switch), so it was forwarding too. When a PC sends a frame to the stacked switch and then to core1, core1 sends it out all ports in the forwarding state, so the new standalone switch receives it. Since it has no machines connected, it forwards the frame to all ports in the forwarding state, which means it sends it back to core1. Core1 then forwards it to the stacked switch through Po1 (EtherChannel 1), which causes the stacked switch to learn the PC's MAC address on Po1 (the flap). The next packet for the PC that arrives at the stacked switch is forwarded back out Po1, and core1 sends it back again. The loop starts from here.
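The forwarding step this theory depends on can be compared against a textbook learning-bridge model. The sketch below is only that: it assumes standard transparent-bridge behavior (learn the source, flood unknown destinations out every port except the ingress port) and does not model spanning tree, EtherChannel, or any 3750-specific behavior. The class name and the second uplink `Gi1/0/50` are invented for illustration.

```python
class LearningSwitch:
    """Textbook transparent bridge: learn source MACs, flood unknown destinations."""

    def __init__(self, name, ports):
        self.name = name
        self.ports = list(ports)   # only ports that are connected and forwarding
        self.mac_table = {}        # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port   # learn (or re-learn) the source
        out = self.mac_table.get(dst_mac)
        if out is None:
            # Broadcast or unknown unicast: flood, but never out the ingress port.
            return [p for p in self.ports if p != in_port]
        if out == in_port:
            return []                       # destination is behind the ingress port
        return [out]


# Standalone switch with a single connected port: nothing to flood to.
standalone = LearningSwitch("standalone", ["Gi1/0/49"])
single_link = standalone.receive("Gi1/0/49", "0024.e834.9723", "ffff.ffff.ffff")
print(single_link)

# Hypothetical variant with a second connected uplink: the frame does leave again.
two_links = LearningSwitch("two-uplinks", ["Gi1/0/49", "Gi1/0/50"])
dual_link = two_links.receive("Gi1/0/49", "0024.e834.9723", "ffff.ffff.ffff")
print(dual_link)
```

Under this model, a switch with only one connected port has nowhere to send a flooded frame, so it cannot reflect it back to core1 by itself. That does not rule out a loop through some other path, but it suggests the "sent back to core1" step needs a closer look.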
Here is my confusion:
The standalone switch has only one physical port connected, and only to core1. If it receives a frame, will it drop it or send it back?
Also, we use src-dst-ip load balancing for the port channel. Will that cause core1 to send frames out 2/14 to the standalone switch?
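To make the load-balancing question concrete, here is a deliberately simplified model of hash-based member selection. It is an assumption for illustration only: the real hash on the 6509 is platform specific and is not reproduced here, and the IP addresses are made up. The only point is that any per-flow hash over a three-member bundle will place some share of flows on each member, including the one that leads to the standalone switch.

```python
import ipaddress
from collections import Counter

def pick_member(src_ip, dst_ip, members):
    """Pick a member link from a simplified src-dst-ip hash (XOR, then modulo)."""
    key = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return members[key % len(members)]

# Core-side bundle after 2/14 was added to the channel with 2/19 and 2/20.
members = ["Gi2/14", "Gi2/19", "Gi2/20"]

# Hypothetical flows from one server to 30 hosts behind the stack.
usage = Counter(
    pick_member("10.0.0.1", f"10.0.110.{host}", members) for host in range(1, 31)
)
for link, flows in sorted(usage.items()):
    print(link, flows)
```

With mode "on" there is no negotiation, so the core has no way to learn that Gi2/14 ends on a different device than the other two members. Flows that hash onto it would arrive at the standalone switch, which has no hosts behind it.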
All of the ports in our port channels are in mode “ON” (not LACP or PAgP). That is not good. If we want to change the mode to “desirable”, what is the process? Change the switch side first? Or change the core side first? Or do we have to remove the mode from all ports in the port channel and then change them together?
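For planning the order of changes, it may help to lay out which mode pairings bring a bundle up. The sketch below encodes the usual pairing rules for the `on`, `desirable`/`auto` (PAgP) and `active`/`passive` (LACP) channel-group modes. The function name is invented, and the table should be checked against the configuration guide for the software release in use.

```python
PAGP_MODES = {"desirable", "auto"}
LACP_MODES = {"active", "passive"}

def channel_forms(local, remote):
    """Return True if a bundle comes up for this pair of channel-group modes."""
    if local == "on" or remote == "on":
        # Static mode does not negotiate, so it only pairs with static.
        return local == remote
    if {local, remote} <= PAGP_MODES:
        return "desirable" in (local, remote)   # at least one side must initiate
    if {local, remote} <= LACP_MODES:
        return "active" in (local, remote)      # at least one side must initiate
    return False                                # PAgP on one side, LACP on the other

for pair in [("on", "on"), ("on", "desirable"), ("desirable", "on"),
             ("desirable", "desirable"), ("desirable", "auto"), ("auto", "auto")]:
    print(pair, channel_forms(*pair))
```

By these rules, whichever side is changed first, there is an interval where one end is `on` and the other is `desirable`, and that pairing does not form a channel. So the change looks like one to make in a maintenance window rather than live.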
Last question: if these three switches are already stacked together and the port channel is running fine, and one day, due to some accident, one of the switches gets disconnected from the other two (StackWise cable disconnected), will this cause the problem I experienced? Why or why not?
Regarding adding one more 3750 to the existing stack and making its fiber port part of the EtherChannel group (on both the router and this 3750 stack), here is the step-by-step procedure as I see it. Please let me know if I need to change anything:
Two 3750G/48 switches are connected with StackWise cables and already running in the production network.
SW1’s StackA is connected to SW2’s StackB and
SW1’s StackB is connected to SW2’s StackA.
1. Power down the new switch SW3.
2. Disconnect SW1’s StackA from SW2’s StackB.
3. Connect SW2’s StackB to SW3 (new switch) StackA.
4. Connect SW1’s StackA to SW3’s StackB.
5. Do not configure SW3’s fiber port as part of the EtherChannel.
6. Do not configure the core router port that will connect to SW3’s fiber port as part of the EtherChannel on the core side.
7. Disconnect the fiber from the core to SW3.
8. Power on SW3.
9. Connect the fiber between SW3 and the core (without EtherChannel configured) to make sure the connection comes up.
11. Make sure SW3 is part of the stack and is not the master.
12. Configure SW3’s fiber port as part of the EtherChannel.
13. Configure the router’s fiber port to SW3 as a member of the EtherChannel.
Re: Connect a non-stack switch to router with ether channel caus
Your answer seems to indicate you had a standalone 3750 that hooks back to a core switch. You then attempted to put that connection into the same port channel that was already going to the two stacked 3750s. Is this correct? If so, you cannot do this, as the standalone 3750 is not part of the stack, and I can see where that would cause all sorts of issues. Why would you try to put the standalone switch into the same port channel? If I have misunderstood, please clarify. If you want a port channel for the standalone 3750, then you would have to create a whole new port channel on both sides...