Loop guard blocking Ethernet interface connected to a provider link 1Gb/20Mb

Unanswered Question
Jan 26th, 2010

Hi all,


We operate a dot1q trunk over a provider's Ethernet service. The line terminates on LX GBICs in our switch environment.

The bandwidth offered by the provider is 20 Mbit/s.

Now to the problem we have:

For some reason, the following messages occur from time to time on the C3560 switch:


Jan 25 21:06:59: %SPANTREE-6-PORT_STATE: Port Gi0/49 instance 1 moving from forwarding to blocking
Jan 25 21:06:59: %SPANTREE-2-LOOPGUARD_BLOCK: Loop guard blocking port GigabitEthernet0/49 on VLAN0001.
Jan 25 21:06:59: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan1, changed state to down
Jan 25 21:07:03: %SPANTREE-2-LOOPGUARD_UNBLOCK: Loop guard unblocking port GigabitEthernet0/49 on VLAN0001.
Jan 25 21:07:03: %SPANTREE-6-PORT_STATE: Port Gi0/49 instance 1 moving from blocking to blocking
Jan 25 21:07:03: %SPANTREE-6-PORT_STATE: Port Gi0/49 instance 1 moving from blocking to forwarding
Jan 25 21:07:03: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan1, changed state to up


As I understand it, this is caused by missing BPDUs from the other end. To localize the reason for the missing BPDUs I replaced all of the hardware owned by us: fibre cables as well as GBICs.
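Since the outages are short and mostly happen at night, it can help to pull their duration out of the syslog afterwards. The following is only a minimal sketch: it assumes the log lines have been collected as text with wrapped lines rejoined, and the function name `loopguard_outages` and the `year` parameter are made up for illustration (the syslog timestamps carry no year).

```python
import re
from datetime import datetime

# Two sample lines copied from the log in this post (wrapped lines rejoined).
LOG = """\
Jan 25 21:06:59: %SPANTREE-2-LOOPGUARD_BLOCK: Loop guard blocking port GigabitEthernet0/49 on VLAN0001.
Jan 25 21:07:03: %SPANTREE-2-LOOPGUARD_UNBLOCK: Loop guard unblocking port GigabitEthernet0/49 on VLAN0001.
"""

PATTERN = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}): "
    r"%SPANTREE-2-LOOPGUARD_(?P<event>BLOCK|UNBLOCK): "
    r"Loop guard (?:un)?blocking port (?P<port>\S+) on (?P<vlan>\S+?)\.?$"
)


def loopguard_outages(log_text, year=2010):
    """Return (port, vlan, seconds) for each BLOCK/UNBLOCK pair found."""
    blocked = {}   # (port, vlan) -> time the block started
    outages = []
    for line in log_text.splitlines():
        m = PATTERN.match(line.strip())
        if not m:
            continue
        # The year is an assumption supplied by the caller.
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (m["port"], m["vlan"])
        if m["event"] == "BLOCK":
            blocked[key] = ts
        elif key in blocked:
            seconds = (ts - blocked.pop(key)).total_seconds()
            outages.append((key[0], key[1], seconds))
    return outages


for port, vlan, seconds in loopguard_outages(LOG):
    print(f"{port} {vlan}: blocked for {seconds:.0f} s")
```

Run over a longer log export, the same function would list every block/unblock pair per port and VLAN, which makes it easier to line the outages up with the Cacti and NetFlow data.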

In addition, I reduced the VLANs on the trunk to those actually needed.

I monitor the link utilization with Cacti. In addition, NetFlow is collecting data for this link.

As far as we can see in these tools, there is no overload situation. However, I know that a 20 Mbit/s service delivered over a Gigabit Ethernet connection is relatively poor.

So far I have had no way to verify things on the switches during these interruptions. (The problem lasts only a few seconds, mostly during the night.)


The provider gave me the insufficient information that everything is OK on their end, and that they need to do an end-to-end measurement to analyze it in detail.

Of course the line needs to be deactivated for this measurement.

Now to my questions:

1. Is it possible to limit the bandwidth on an L2 interface to 20 Mbit/s?

2. How does a provider shape this 1 Gbit/s link to 20 Mbit/s, and which counters are they able to read out?

3. Does anybody have an idea how we can fix this problem?

4. How could I ask the provider to assist me in finding the cause?




Additional information:


Cat4506 with SUPII+ on one side and CAT3560G 48 PS-S on the other side.


The interface configuration is as follows (C3560 side):

int gi 0/49

switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,4,6,7
switchport mode dynamic desirable
logging event trunk-status
logging event spanning-tree
logging event status
keepalive 10
udld port aggressive


sh int status:

Gi0/49    US_RLC-US Fibre li connected    trunk      a-full a-1000 1000BaseLX SF

Mohamed Sobair Tue, 01/26/2010 - 07:42

Hi,





1- The provider might be using QoS boxes that can limit the bandwidth based on Layer 2 MAC address and/or VLAN IDs, so the answer is yes.

2- Some QoS boxes provide graphs for the current period, the day before, the day after, or on an hourly/daily basis.

3- The problem is that loop guard makes an additional check during its continuous operation: if the port stops receiving BPDUs, spanning tree would move the port from blocking to forwarding, so loop guard kicks in and blocks the port (loop-inconsistent state) until BPDUs are received again. If the port is the only uplink port, then I suggest you disable loop guard, as there is no possibility of a loop. Another point: why is the port configured to rely on DTP to negotiate trunking rather than being set manually? I am also wondering whether you are running the same VTP domain as the provider. So the first action is to set the port to trunking mode manually and disable negotiation with the provider.


4- This problem is due to link flaps or instability of the fiber links, so I would first check what is stated above and then let them check the connectivity from their side to make sure there is no duplex/speed mismatch or single-ended fiber connectivity problem caused by the type of connector. Connectivity has to be checked properly on both sides.



HTH

Mohamed

joepena2012 Fri, 01/29/2010 - 00:04

Hello Mohamed,


thx for your feedback.

Please see my comments:


1- The provider might be using QoS boxes that can limit the bandwidth based on Layer 2 MAC address and/or VLAN IDs, so the answer is yes.

     --> In the meantime I got the following output from the provider:

*************************************************************************
Connection 44 - HPPGNYKX0AW


STATISTICS
Ingress                                        Egress
Fwd packets              : 317839518           Fwd packets         : 235804543
Discard packets          : 19939343            Discard packets     : 0
Fwd Unmarked octets      : 315918439926        Fwd Unmarked octets : 132693137917
Fwd Marked octets        : 0                   Fwd Marked octets   : 0
Discard octets over EIR  : 28730402144         <--- Discarded packets!
Discard octets CIR to EIR: 0
Discard octets Below CIR : 0

SLA REPORT
               Iterator 1   Iterator 2
Packet Length  NA           128
RTD Period     NA           1
Jitter Period  NA           1
RTD Average    NA           0
Jitter Average NA           0
**************************************************************************


2- Some QoS boxes provide graphs for the current period, the day before, the day after, or on an hourly/daily basis.

--> I'm still waiting for time-based figures.

3- The problem is that loop guard makes an additional check during its continuous operation: if the port stops receiving BPDUs, spanning tree would move the port from blocking to forwarding, so loop guard kicks in and blocks the port (loop-inconsistent state) until BPDUs are received again. If the port is the only uplink port, then I suggest you disable loop guard, as there is no possibility of a loop.

--> In the current environment I need to support 4 VLANs, and one of the IP ranges (VLANs) must be provided on the far end. This is a historical issue with a Unix server that has 2 adapters in two different VLANs. One of these VLANs is on the near end of the trunk and the other on the far end, so I need to provide one of the VLANs on the far end.

I know for sure this is a disastrous design, but I need to operate it until the application on the Unix server can run on one NIC.

Another point: why is the port configured to rely on DTP to negotiate trunking rather than being set manually? I am also wondering whether you are running the same VTP domain as the provider. So the first action is to set the port to trunking mode manually and disable negotiation with the provider.

--> We operate DTP trunks because of our good experience with DTP over the last ten years on hundreds of trunks. However, if there is any indication of a problem with DTP, I will change to manually configured trunks.

4- This problem is due to link flaps or instability of the fiber links, so I would first check what is stated above and then let them check the connectivity from their side to make sure there is no duplex/speed mismatch or single-ended fiber connectivity problem caused by the type of connector. Connectivity has to be checked properly on both sides.

--> The connectivity has been checked on both sides. No errors occur on the interfaces on either side. On the provider side I have to trust the information I got from them, and they say that no errors occur on the Ethernet interfaces.



Summary: Given the high rate of discarded frames, this may have something to do with an overload situation on the link. However, so far I have no evidence for this. To find out, I placed a sniffer to monitor the link. But to be honest, there is only a spark of hope of finding the cause of the missing BPDU frames with this measure.
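To put the discard counters into proportion, the provider's ingress figures can be turned into shares. This is just a rough sketch using the numbers from the output above. It assumes that "Fwd packets" plus "Discard packets" is everything that was offered at ingress, and that the octet counters are counted the same way for forwarded and discarded frames; neither is confirmed by the provider.

```python
# Ingress counters copied from the provider output for connection 44.
FWD_PACKETS = 317_839_518
DISCARD_PACKETS = 19_939_343
FWD_OCTETS = 315_918_439_926        # "Fwd Unmarked octets"
DISCARD_OCTETS = 28_730_402_144     # "Discard octets over EIR"


def share(dropped, forwarded):
    """Fraction of the offered traffic that was dropped."""
    return dropped / (dropped + forwarded)


packet_loss = share(DISCARD_PACKETS, FWD_PACKETS)
octet_loss = share(DISCARD_OCTETS, FWD_OCTETS)
avg_dropped_size = DISCARD_OCTETS / DISCARD_PACKETS
avg_forwarded_size = FWD_OCTETS / FWD_PACKETS

print(f"packets dropped at ingress : {packet_loss:.1%}")
print(f"octets dropped at ingress  : {octet_loss:.1%}")
print(f"average dropped frame size : {avg_dropped_size:.0f} bytes")
print(f"average forwarded frame size: {avg_forwarded_size:.0f} bytes")
```

With these counters roughly 6% of the packets and a bit over 8% of the octets were discarded, and the discarded frames are on average noticeably larger than the forwarded ones. That would fit bursts of large frames exceeding the contracted rate, but it is not proof, and small frames such as BPDUs can still be caught in the same bursts.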


One other question:

Is it possible to configure a trunk port with spanning-tree portfast?



Thanks for holding my hands. :-) (typical German phrase, means: thanks for being with me)


Dieter

Giuseppe Larosa Sat, 01/30/2010 - 09:08

Hello Dieter,

this is an interesting issue.


I wonder whether what you see on site A is the effect of the provider dropping STP BPDUs in the site A to site B direction.

Is STP loop guard configured on site B too?

Have you tried to correlate the logs of the devices on the two sides of the trunk link, to see whether something happens on the other side that could cause it to stop sending BPDUs in the site B to site A direction?

The provider's show output reports many packets dropped at ingress, that is, in the site A to site B direction.


Ingress                                        Egress
Fwd packets              : 317839518           Fwd packets         : 235804543
Discard packets          : 19939343            Discard packets     : 0


Also, this may be the show output for the port connected to site A.

They should give you a similar show output for the other side, towards site B.

Frames sent in the site B to site A direction may be dropped on the other side. Here we see egress discard packets = 0, but if this is a point-to-point service, discarding at ingress on both sides is enough to comply with the SLA.

That is, it would be useless to carry a frame from site B to site A just to drop it at egress. Ingress policing lets the service provider avoid wasting its own core bandwidth.


I would ask them for the show output of both sides.


Hope to help

Giuseppe

joepena2012 Mon, 02/01/2010 - 08:21

Hi Giuseppe,


thx for the feedback.



To review:


The counters I got from the provider are from A to B.

The overload situation is also from A to B.

The missing RSTP packets I see on B.


Loop guard is active on both sides.

On the switch at site A I see no errors and not even any log entries.


Please see my post at the bottom; I posted an update because of news regarding the overload.

Mohamed Sobair Sat, 01/30/2010 - 00:49

Hi Dieter,


Please see my edit:


You can't enable the portfast feature on trunk ports. (I hadn't had my coffee when I answered previously.)


I still believe it's a connectivity issue between both ends. Have you double-checked the fiber connectors and the duplex/speed at both ends?


What is the output of the show interface command? Do you see CRC errors, collisions, late collisions, or packet drops in the output?




HTH

Mohamed

joepena2012 Mon, 02/01/2010 - 08:14

NUSEX003#sh int gi 4/4
GigabitEthernet4/4 is up, line protocol is up (connected)
  Hardware is Gigabit Ethernet Port, address is 000c.3006.8749 (bia 000c.3006.8749)
  Description: US-US_RLC NY-NY_RLC
  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, link type is auto, media type is 1000BaseLH
  input flow-control is off, output flow-control is off
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:02, output never, output hang never
  Last clearing of "show interface" counters 6d23h
  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 1748000 bits/sec, 284 packets/sec
  5 minute output rate 1448000 bits/sec, 338 packets/sec
     359605097 packets input, 148940316376 bytes, 0 no buffer
     Received 303426 broadcasts (302886 multicast)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 input packets with dribble condition detected
     527994523 packets output, 576887006086 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier
     0 output buffer failures, 0 output buffers swapped out
NUSEX003#




No errors that would indicate a problem at the physical level.
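The byte counters in this output also give a long-term average rate, which helps explain why averaged graphs show no overload. A small sketch with the values from above follows. It assumes "6d23h" is exact, although the counter age is only shown to the hour, and it uses the 20 Mbit/s contract figure from the first post.

```python
# Values copied from the "show interface" output above (Gi4/4).
OUTPUT_BYTES = 576_887_006_086
INPUT_BYTES = 148_940_316_376
# "Last clearing ... 6d23h" is only hour-accurate, so this is approximate.
SECONDS_SINCE_CLEAR = (6 * 24 + 23) * 3600
CONTRACT_BPS = 20_000_000   # provider bandwidth from the first post


def average_bps(byte_count, seconds):
    """Average bit rate for a byte counter accumulated over `seconds`."""
    return byte_count * 8 / seconds


avg_out = average_bps(OUTPUT_BYTES, SECONDS_SINCE_CLEAR)
avg_in = average_bps(INPUT_BYTES, SECONDS_SINCE_CLEAR)

print(f"average output: {avg_out / 1e6:.2f} Mbit/s")
print(f"average input : {avg_in / 1e6:.2f} Mbit/s")
print(f"output as share of contract: {avg_out / CONTRACT_BPS:.0%}")
```

Averaged over nearly a week the output direction stays well below 20 Mbit/s, so short bursts above the contracted rate would not show up in counters or graphs averaged over minutes or days.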

In the meantime I got a trace file, and it shows 300000 packets (1414 bytes each) per minute on the link at the time the BPDUs were lost.

--> which would be roughly 56 Mbit/s
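The arithmetic behind that estimate, as a quick check. The packet count and size are the figures from the trace file. The 20 Mbit/s limit is the provider bandwidth from the first post, and treating the 1414 bytes as the size the provider counts against that limit is an assumption.

```python
PACKETS_PER_MINUTE = 300_000
PACKET_BYTES = 1414
CONTRACT_BPS = 20_000_000   # provider bandwidth from the first post


def rate_bps(packets, size_bytes, interval_s):
    """Bit rate of `packets` frames of `size_bytes` sent within `interval_s`."""
    return packets * size_bytes * 8 / interval_s


burst = rate_bps(PACKETS_PER_MINUTE, PACKET_BYTES, 60)
over_contract = burst / CONTRACT_BPS
excess_share = 1 - CONTRACT_BPS / burst

print(f"burst rate: {burst / 1e6:.2f} Mbit/s")
print(f"that is {over_contract:.1f}x the contracted rate")
print(f"share of the burst above the contract: {excess_share:.0%}")
```

So during such a minute the link is offered close to three times what the provider forwards, and anything above the limit, BPDUs included, is a candidate for being discarded at the provider's ingress.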


---> So the next topic is to prevent the switch on site A from flooding this trunk.

Giuseppe Larosa Mon, 02/01/2010 - 14:28

Hello Dieter,

If side A is the one with the C4500 with SupII, I'm afraid there is little you can do.


C3560 can use SRR in shaped mode to actually shape outbound.


correction:


Per Port Per VLAN QoS

Per-port per-VLAN QoS (PVQoS) offers differentiated quality-of-services to individual VLANs on a trunk port. It enables service providers to rate limit individual VLAN-based services on each trunk port to a business or a residence. In an enterprise Voice-over-IP environment, it can be used to rate limit voice VLAN even if an attacker impersonates an IP phone. A per-port per-VLAN service policy can be separately applied to either ingress or egress traffic.



see

http://www.cisco.com/en/US/docs/switches/lan/catalyst4500/12.2/46sg/configuration/guide/qos.html#wp1463346


this can be a good starting point.


Hope to help

Giuseppe

joepena2012 Wed, 02/03/2010 - 23:31

Hi Giuseppe,


Thanks for the per-port per-VLAN QoS suggestion.

I will try to set this up on the affected link.

To be honest, I'm not much of an expert in this, so it will be a lot of work for me to set up.

For this reason I won't do it in the next few days.

Yesterday I changed the configuration on the trunk as follows:



interface GigabitEthernet4/4
description US-US_RLC NY-NY_RLC
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 1,4,6,7
switchport mode dynamic desirable
switchport block unicast
logging event link-status
logging event trunk-status
spanning-tree portfast trunk



This should prevent unicast flooding on the interface. In addition, the "spanning-tree portfast trunk" command will affect the spanning-tree process.


In addition, I'm working very hard on a way to migrate this L2 link to an L3 point-to-point connection.


I'll let you know the further steps.


thanks so far

Mohamed Sobair Sun, 01/31/2010 - 07:38

Hi Dieter,


Please have a look at my message update.



HTH

Mohamed

Mohamed Sobair Thu, 02/04/2010 - 00:50

Hi Dieter,


When I said earlier that (spanning-tree portfast) can't be enabled on trunk ports, I based my post on earlier IOS releases. I must say it has since been added and implemented. I have never done it on a trunk port before, and I remember that when I tried it on a trunk on some platform, it wouldn't allow me unless the port was an access port.


However, if your trunk port is the only uplink port to the provider, then enabling this command shouldn't cause any problems, as a loop is not possible here. I also suggest you remove the (loop guard) command along with it.


Any reason why you have (switchport block unicast) here?




HTH

Mohamed

joepena2012 Thu, 02/04/2010 - 02:38

Hi Mohamed,


I didn't know either, but I saw it when I studied the release notes. :-)


The block unicast is there because of a unicast flood I saw on the monitor port.

This is the first time I have used this feature, and I'm anxious to see its effects. :-)


In addition, I will deactivate the device that made it necessary to extend a network from site A to site B.

We start at 12 CET. Keep your fingers crossed.

If this works, I will migrate the L2 link to an L3 link. This will simplify the design on site B. I also expect a better chance of restricting bandwidth to the available 20 Mbit/s on my equipment.



Thanks again


Dieter

joepena2012 Thu, 07/22/2010 - 02:03

Hi all,


In the meantime the issue has been fixed, because I migrated from an L2 to an L3 design.

Since then we have had no more flooding on the link.

And so we have had no more outages.


--> So we couldn't fix the problem itself, but because of the redesign of the LAN (moving the WAN link to an L3 connection) the problem no longer exists.



thanks to all for your help
