Protection Against UDLD Failure in a VPLS (Dot1q Tunnel) Setup

Sep 10th, 2009

Hi,


I am facing the following problem.


I have a switched network built over a VPLS cloud.


Switch1---VPLS-CLOUD-----Switch2


Switch 1 is the root switch, and multiple switches, including Switch 2, connect to the VPLS cloud.


Everything was working fine until one of the amplifiers used in the VPLS provider's cloud failed, resulting in a unidirectional link. This caused a traffic storm that brought down my whole network.


We were able to reproduce this problem in our lab network, and all Cisco documentation pointed to UDLD and the spanning-tree Loop Guard feature as the remedy.


UDLD failed as a potential workaround for the following reasons:


1) Aggressive-mode UDLD did not prevent the spanning-tree loop because I have a point-to-multipoint network segment.


2) Normal-mode UDLD did not help either.


I then tested the Loop Guard feature. This worked, but I ran into a limitation.


Loop Guard helped when the transmit side of the root switch went down.


However, it did not work when the receive side of the root switch went down.


Please note: with Loop Guard enabled, I was able to suppress the traffic storm completely in all scenarios. However, the network did not reconverge when the receive side of the root switch went down.
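
As a rough illustration of the behaviour described above, the following minimal Python sketch models how a port reacts when BPDUs stop arriving. It is a deliberate simplification, not Cisco's implementation: the `Port` class and `on_bpdu_timeout` function are hypothetical names, and an unguarded non-designated port is assumed to simply become designated and forwarding. The point it tries to show is that designated ports (which is what a root switch has) do not expect BPDUs, so losing the receive direction on the root triggers nothing.

```python
from dataclasses import dataclass


@dataclass
class Port:
    """Hypothetical, simplified model of a spanning-tree port."""
    role: str                   # "root", "alternate", or "designated"
    loop_guard: bool = False
    state: str = "forwarding"   # "forwarding", "blocking", "loop-inconsistent"


def on_bpdu_timeout(port: Port) -> str:
    """Return the port state after BPDUs stop arriving (max-age expiry)."""
    if port.role == "designated":
        # Designated ports send BPDUs and do not expect any in return,
        # so a lost receive direction goes unnoticed here.
        return port.state
    if port.loop_guard:
        # Loop Guard keeps the port from forwarding: no loop, no storm.
        port.state = "loop-inconsistent"
    else:
        # Simplifying assumption: the port takes over as designated and
        # starts forwarding, which is what can create the loop.
        port.role = "designated"
        port.state = "forwarding"
    return port.state


root_switch_port = Port(role="designated", loop_guard=True)
guarded_blocked_port = Port(role="alternate", loop_guard=True, state="blocking")
unguarded_blocked_port = Port(role="alternate", state="blocking")

for p in (root_switch_port, guarded_blocked_port, unguarded_blocked_port):
    print(p.role, "->", on_bpdu_timeout(p))
```

In this model the guarded non-root port stays out of forwarding, which matches the suppressed storm, while the root switch's port never changes state, which matches the missing reconvergence.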


Is there any workaround for this problem?


It would be great if you can help me solve this issue.


Thanks and Regards

Arun

Giuseppe Larosa Fri, 09/11/2009 - 04:22

Hello Arun,

What spanning-tree mode are you using on the customer-side switches?


If it is PVST+, I wonder whether moving to Rapid PVST+ could provide benefits.


With PVST+, non-root bridges send BPDUs toward the root only when a topology change happens (TCN BPDUs).


Unfortunately, Rapid STP shares the same concept, with improvements only in convergence time (no need to wait for max_age to expire).


http://www.cisco.com/en/US/tech/tk389/tk621/technologies_white_paper09186a0080094cfa.shtml#topic4


LACP might help: if you run LACP on the link from the root bridge to the service provider, then when the rx side fails the root bridge will stop receiving LACP messages and should, after some time, put the port in the err-disable state.


The LACP neighbor should be the service provider switch.

This requires cooperation with them.
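
To make the "after some time" part concrete, here is a minimal Python sketch of the LACP receive timeout, assuming the standard timer values (PDUs every 1 s with a 3 s timeout in fast mode, every 30 s with a 90 s timeout in slow mode). The function names are hypothetical, and what the switch actually does with the port once the partner expires depends on the platform and configuration.

```python
FAST_PERIOD_S = 1        # LACP fast periodic time
SLOW_PERIOD_S = 30       # LACP slow periodic time
TIMEOUT_MULTIPLIER = 3   # partner info expires after three missed PDUs


def lacp_timeout(fast: bool) -> int:
    """Seconds without LACPDUs before the partner information expires."""
    period = FAST_PERIOD_S if fast else SLOW_PERIOD_S
    return period * TIMEOUT_MULTIPLIER


def partner_expired(last_rx_s: float, now_s: float, fast: bool = False) -> bool:
    """True once no LACPDU has been received for the full timeout."""
    return (now_s - last_rx_s) >= lacp_timeout(fast)


# The rx side fails at t=0; check when the root bridge would notice.
for t in (2, 3, 89, 90):
    print(t, "fast:", partner_expired(0, t, fast=True),
          "slow:", partner_expired(0, t, fast=False))
```

With the slow rate, a failed receive direction would go unnoticed for up to 90 seconds, so the fast rate is the more interesting setting for this kind of failure.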


Hope to help

Giuseppe


aarumuganainar Tue, 09/15/2009 - 21:32

Hi Giuseppe,


Thanks very much for the reply.


The problem with my setup is that my switches do not detect the topology change.


For example:


I have two ports connected via a QinQ facility (offered by an external provider) between the root and non-root bridges. When everything works fine, I have absolutely no problem: one of the ports goes into the root port role and the other becomes alternate and blocking.


Now a unidirectional link occurs on the link connected to the root port. The root cause of the failure lies inside the service provider's cloud, and transmission fails on the send port. I have no mechanism to detect that, so the port continues to operate in its current STP state and starts black-holing traffic.


LACP is not an option for me because the provider uses Extreme switches and has a weird way of establishing a QinQ-like VPLS network. Also, the fault occurred inside the cloud and not at the edge.


Thanks and Regards

Arun
