10-11-2005 06:11 AM - edited 03-03-2019 12:20 AM
My question concerns how Cisco switches use keepalives at layer 2 to detect loopbacks. All the information on keepalives that I could find dealt with their use at higher layers (e.g. tunnels, routing protocols).
We had an incident where a mismatch in switch configurations caused a loop that wasn't blocked by spanning tree. This caused a number of our switches to err-disable their uplinks with "%ETHCNTR-3-LOOP_BACK_DETECTED: Keepalive packet loop-back detected". Some of these switches were fairly deep down a chain of switches with only a single uplink path.
We have also had a situation (which eventually went away) where the single uplink on a particular switch was err-disabled with "loopback detected", although we could find no evidence of a loop. I suspect that keepalive packets are not being dealt with properly, especially if VLAN 1 is removed from trunk links (which we have to do in some cases) or EtherChannels are involved. Bugs CSCeg58877 and CSCdt82690 describe such problems, but neither matches our circumstances.
Because of the above, I am considering disabling keepalives on the layer 2 links of my Cisco switches, especially the uplinks. Is this a good or a bad idea?
Alex McLaren
10-11-2005 06:38 AM
Hi Alex-
Loopback detection is a mechanism that was put in place to detect loops in the network
caused by Type 1A or Type 2 copper cabling. While it is a good mechanism for those special
situations, there seems to be no need to have it enabled on the Gig ports, where you
can use features like UDLD to detect a fiber loop or unidirectional fiber.
But in 12.1EA releases, loopback detection is enabled by default on fiber ports as
well as copper ports. On some fiber ports, the neighboring switch may simply switch the
keepalive packet back without changing it, so the switch that sent the keepalive
receives its own packet back. That triggers the mechanism and error-disables the port.
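As a rough illustration of the check described above, here is a minimal Python model. It assumes the commonly described behaviour that the keepalive is a frame with EtherType 0x9000 carrying the sending port's own MAC address; the `Frame` and `Port` classes and their methods are hypothetical names for this sketch, not anything taken from IOS.

```python
from dataclasses import dataclass

LOOPBACK_ETHERTYPE = 0x9000  # assumption: EtherType used for the keepalive frame


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    ethertype: int


@dataclass
class Port:
    name: str
    mac: str
    keepalive_enabled: bool = True
    err_disabled: bool = False

    def build_keepalive(self) -> Frame:
        # The port addresses the keepalive to itself, so no neighbor should echo it.
        return Frame(src_mac=self.mac, dst_mac=self.mac, ethertype=LOOPBACK_ETHERTYPE)

    def receive(self, frame: Frame) -> None:
        # Seeing our own keepalive arrive means something looped it back to us.
        is_own_keepalive = (
            frame.ethertype == LOOPBACK_ETHERTYPE and frame.src_mac == self.mac
        )
        if self.keepalive_enabled and is_own_keepalive:
            self.err_disabled = True


# A neighbor that switches the frame straight back trips the check.
uplink = Port(name="GigabitEthernet0/2", mac="00:11:22:33:44:55")
uplink.receive(uplink.build_keepalive())
print(uplink.name, "err-disabled:", uplink.err_disabled)

# With keepalive processing turned off, the same frame is ignored by the check.
quiet = Port(name="GigabitEthernet0/1", mac="00:11:22:33:44:66", keepalive_enabled=False)
quiet.receive(quiet.build_keepalive())
print(quiet.name, "err-disabled:", quiet.err_disabled)
```

The model only shows why an unmodified echo is enough to shut the port; it says nothing about where in the network the frame was turned around.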
What you can do is disable keepalives on the uplink Gig ports of the 2950, and that
should take care of the problem.
On 2950
int gig
no keepalive
But make sure you have UDLD enabled. Once you have disabled the keepalives, please check
that CPU utilization is low on the 2950s that were showing the problem, and that the
MAC aging timer stays at 300 sec.
show proc cpu
show mac-address-table aging-time
UDLD should however be enabled on fiber Gig ports to detect spanning tree loop problems.
The keepalive detection mechanism is specifically there for Type 1A and Type 2 cabling and has
no significance on fiber ports.
This is also documented in the release notes under the following Cisco DDTS ID:
CSCea46385.
An interface on a Catalyst switch is errordisabled after detecting a loopback.
Mar 7 03:20:40: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet0/2. The port is forced to linkdown.
Mar 7 03:20:42: %LINK-5-CHANGED: Interface GigabitEthernet0/2, changed state to administratively down
Mar 7 03:20:43: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/2, changed state to down
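To see which ports are hitting this, the messages above can be picked out of collected syslog output. The following is a minimal Python sketch; the function name `loopback_interfaces` is made up for illustration, and the third sample line is the shorter form of the message as quoted in the original question, which carries no interface name.

```python
import re
from collections import Counter

MNEMONIC = "%ETHCNTR-3-LOOP_BACK_DETECTED"
INTERFACE_RE = re.compile(r"detected on (\S+)")


def loopback_interfaces(lines):
    """Count loop-back detections per interface in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        if MNEMONIC not in line:
            continue
        match = INTERFACE_RE.search(line)
        # Some quoted forms of the message carry no interface name; bucket those.
        name = match.group(1).rstrip(".") if match else "unknown"
        counts[name] += 1
    return counts


sample = [
    "Mar 7 03:20:40: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet0/2.",
    "Mar 7 03:20:42: %LINK-5-CHANGED: Interface GigabitEthernet0/2, changed state to administratively down",
    "%ETHCNTR-3-LOOP_BACK_DETECTED: Keepalive packet loop-back detected",
]
result = loopback_interfaces(sample)
for interface, hits in result.items():
    print(interface, hits)
```

Counting per interface makes it easier to tell a single misbehaving uplink from a loop that is hitting many switches at once.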
Conditions:
This might be seen on a Catalyst 2940, 2950, 2950-LRE, 2955, 2970, 3550, 3560 or
3750 switch running 12.1EA or 12.2SE based code.
Workaround:
Disable keepalives by using the "no keepalive" interface command. This will
prevent the port from being errdisabled, but it does not resolve the root cause of
the problem. Please see section below for more information.
Additional Information:
The problem occurs because the keepalive packet is looped back to the port that sent
the keepalive. There is a loop in the network. Although disabling the keepalive
will prevent the interface from being errdisabled, it will not remove the loop.
The problem is aggravated if there are a large number of Topology Change Notifications
on the network. When a switch receives a BPDU with the Topology Change bit set,
the switch will fast age the MAC Address table. When this happens, the number of
flooded packets increases because the MAC Address table is empty.
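The effect of an emptied MAC address table can be shown with a toy learning switch in Python. This is a deliberately simplified sketch: the `LearningSwitch` class is hypothetical, and fast aging is modelled as an immediate flush of the table rather than a shortened aging timer.

```python
class LearningSwitch:
    """Toy model of a layer 2 switch's MAC address table (not detailed IOS behaviour)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was learned on

    def forward(self, in_port, src_mac, dst_mac):
        """Learn the source address, then return the ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        # Unknown destination: flood out of every port except the ingress port.
        return [p for p in self.ports if p != in_port]

    def topology_change(self):
        # Simplification: treat fast aging as an immediate flush of the table.
        self.mac_table.clear()


sw = LearningSwitch(ports=["Fa0/1", "Fa0/2", "Fa0/3", "Gi0/1"])
sw.forward("Fa0/1", "aa", "bb")           # learns aa on Fa0/1
sw.forward("Fa0/2", "bb", "aa")           # learns bb on Fa0/2
before = sw.forward("Fa0/1", "aa", "bb")  # destination known: one egress port
sw.topology_change()
after = sw.forward("Fa0/1", "aa", "bb")   # table emptied: frame is flooded
print(len(before), len(after))
```

Every flooded frame is another chance for a looped path to carry traffic, keepalives included, back to its origin.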
Keepalives are sent on the Catalyst 2940, 2950, 2950-LRE, 2955, 2970, 3550, 3560
or 3750 switch to prevent loops in the network. The primary reason for the keepalives
is to prevent loops as a result of Type 2 cabling. For more information, see:
http://www.cisco.com/en/US/netsol/ns340/ns394/ns74/ns149/networking_solutions_white_paper09186a00800b4249.shtml
Keepalives are sent on ALL interfaces by default in 12.1EA based software. Starting
in 12.2SE based releases, keepalives are NO longer sent by default on fiber and uplink
interfaces. Since there is no 12.2SE release for the 2950, you will have to manually disable
the keepalives on the uplink Gig ports.
Hope this helps.
thanks
Salman Zahid
10-12-2005 07:23 AM
Hi Salman
Thanks for your reply, it was very helpful.
I can see that it is a good idea to disable keepalives on fibre links, but I am not clear whether this also applies to copper links (our problems with err-disable hit both fibre and copper uplinks).
I tried to look at the reference you provided but couldn't access it for some reason, so I am wondering about the consequences of disabling keepalives on copper links.
Regards
Alex McLaren
10-12-2005 08:14 AM
Keepalives should not be disabled on copper links, only on fiber links. If you are seeing copper ports getting error-disabled, there may be a spanning tree loop in the network that you need to track down. The recommendation is to disable keepalives on the fiber ports only.
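If you have many switches to touch, the fiber-only rule can be applied from a port inventory. Below is a minimal Python sketch that prints the interface commands for fiber ports and leaves copper ports alone; the function name, the inventory dictionary, and the "fiber"/"copper" labels are all hypothetical, and the only IOS command it emits is the "no keepalive" interface command mentioned earlier in the thread.

```python
def keepalive_overrides(ports):
    """Return config lines that disable keepalive on fiber ports only.

    `ports` maps an interface name to its media type ("fiber" or "copper").
    """
    lines = []
    for name, media in sorted(ports.items()):
        if media == "fiber":
            lines.append(f"interface {name}")
            lines.append(" no keepalive")
        # Copper ports are skipped so loopback detection stays active on them.
    return lines


inventory = {  # hypothetical example inventory
    "GigabitEthernet0/1": "fiber",
    "GigabitEthernet0/2": "copper",
    "FastEthernet0/24": "copper",
}
config = keepalive_overrides(inventory)
print("\n".join(config))
```

Review the generated lines before pasting them into a switch, and confirm UDLD is enabled on the same fiber ports.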
HTH.
Salman Z.
10-13-2005 08:49 AM
Hi Salman
Thanks for the clarification.
Regards
Alex McLaren