
Is UDLD Recovery on GECs Good?

jstewart
Level 1

Recently we had a situation where ports 1-4 on an 8-port blade in a 6509 (cat6000-supk9_6-3-5/(C6MSFC2-PK2SV-M), Version 12.1(8b)E9) failed. Each of those ports was the first port in a GEC to one of four other 6509s running the same software. On the other boxes, UDLD saw the neighbor disappear and disabled the port, and the errdisable timeout then re-enabled it:

%UDLD-3-AGGRDISABLE:Neighbor(s) of port 3/6 disappeared on bidirectional link. Port disabled

%MGMT-5-ERRDISPORTENABLED:Port 3/6 err-disabled by udld enabled by errdisable timeout

This recovery looks correct, but the four GECs were now unusable: CDP couldn't see the neighbor port, and OSPF on the four connected boxes looped in different ways depending on the neighbor. Either:

*Jul 15 09:49:35: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan26 from EXSTART to DOWN, Neighbor Down: Dead timer expired

*Jul 15 09:49:48: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan26 from EXSTART to DOWN, Neighbor Down: Dead timer expired

*Jul 15 09:50:02: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan26 from EXSTART to DOWN, Neighbor Down: Dead timer expired

or:

Jul 15 09:35:03: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan33 from LOADING to FULL, Loading Done

*Jul 15 09:35:08: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan33 from FULL to DOWN, Neighbor Down: Dead timer expired

*Jul 15 09:35:14: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan33 from LOADING to FULL, Loading Done

*Jul 15 09:35:30: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan33 from FULL to DOWN, Neighbor Down: Dead timer expired
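
To compare the two loop patterns at a glance, the OSPF messages above can be tallied by interface and state transition. This is only a minimal Python sketch and was not part of the original troubleshooting. The names `ADJCHG` and `summarize_adjchg` are hypothetical, and the regular expression assumes the `%OSPF-5-ADJCHG` message layout exactly as it appears in the log lines quoted here.

```python
import re
from collections import Counter

# Assumed layout, taken from the messages quoted in this post:
#   %OSPF-5-ADJCHG: Process <n>, Nbr <id> on <intf> from <OLD> to <NEW>, <reason>
ADJCHG = re.compile(
    r"%OSPF-5-ADJCHG: Process \d+, Nbr (?P<nbr>\S+) on (?P<intf>\S+) "
    r"from (?P<old>\w+) to (?P<new>\w+),"
)


def summarize_adjchg(lines):
    """Count (interface, old_state, new_state) transitions in log lines."""
    counts = Counter()
    for line in lines:
        match = ADJCHG.search(line)
        if match:  # lines that are not ADJCHG messages are skipped
            counts[(match["intf"], match["old"], match["new"])] += 1
    return counts


# The seven messages quoted in this post, copied as-is.
LOG = """\
*Jul 15 09:49:35: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan26 from EXSTART to DOWN, Neighbor Down: Dead timer expired
*Jul 15 09:49:48: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan26 from EXSTART to DOWN, Neighbor Down: Dead timer expired
*Jul 15 09:50:02: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan26 from EXSTART to DOWN, Neighbor Down: Dead timer expired
Jul 15 09:35:03: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan33 from LOADING to FULL, Loading Done
*Jul 15 09:35:08: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan33 from FULL to DOWN, Neighbor Down: Dead timer expired
*Jul 15 09:35:14: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan33 from LOADING to FULL, Loading Done
*Jul 15 09:35:30: %OSPF-5-ADJCHG: Process 1, Nbr 1.111.11.111 on Vlan33 from FULL to DOWN, Neighbor Down: Dead timer expired
"""

summary = summarize_adjchg(LOG.splitlines())

if __name__ == "__main__":
    for (intf, old, new), n in sorted(summary.items()):
        print(f"{intf}: {old} -> {new} x{n}")
```

On these lines the tally separates the two behaviours: the Vlan26 adjacency never gets past EXSTART, while the Vlan33 adjacency reaches FULL and is then dropped when the dead timer expires.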

Aside from the fact that it looks like we ran into something the software couldn't handle, it seems to me that it would be better in future to turn UDLD recovery off on GECs and hope that the channel will recover if a failing link stays down.

I'd like to hear some best practices or opinions on this.

Thanks.

1 Reply

rfroom
Cisco Employee

Interesting. Open a TAC case: this could be a software bug, since UDLD should work across a channel and recover correctly.
