Strange problem with 4500X, 3750 stacks and multifabric PortChannels/LACP

Unanswered Question
Dec 11th, 2013

Hi community,


We migrated a customer from his old 65xx core switches to new 4500X switches in VSS.

As a replacement for the copper line cards in the chassis, he got two stacks of the new 3750X, with three switches each.

The other access switches are untouched apart from configuration changes. They include some older 3750 stacks with 2 or 3 switches.


Software versions

4500X:          cat4500e-universalk9.SPA.03.04.00.SG.151-2.SG.bin

3750X:          c3750e-universalk9-mz.150-2.SE4.bin

Old 3750:      c3750-ipbase-mz.122-35.SE5.bin


Apparently everything worked fine, but the customer now complains about loads of these messages on his syslog server, spread over the whole campus and all stacks:


4-Warning  Dec 07 17:15:54
Host   0021.9b5e.7176 in vlan 1 is flapping between port Po1 and port Fa1/0/11


What looks like a classic loop isn't in fact a loop. This station is directly attached to Fa1/0/11 of this switch, and there's no other link to the rest of the infrastructure, just the channel Po1 with its two member interfaces.
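To see how widespread the flapping is, the syslog lines can be tallied per host and port pair. The following is only a minimal Python sketch: the pattern is modeled on the single message quoted above (other IOS releases may word it differently), the function name `count_flaps` is hypothetical, and the second sample line is invented for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the one message quoted in this post (assumption:
# all flap messages on the syslog server follow the same wording).
FLAP_RE = re.compile(
    r"Host\s+(?P<mac>[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4})\s+"
    r"in vlan\s+(?P<vlan>\d+)\s+is flapping between port\s+"
    r"(?P<port_a>\S+)\s+and port\s+(?P<port_b>\S+)"
)


def count_flaps(lines):
    """Count flap messages per (MAC, VLAN, unordered port pair)."""
    counts = Counter()
    for line in lines:
        m = FLAP_RE.search(line)
        if m:
            # Sort the two ports so "Po1 and Fa1/0/11" and the reverse
            # order are counted as the same pair.
            ports = tuple(sorted((m["port_a"], m["port_b"])))
            counts[(m["mac"].lower(), int(m["vlan"]), ports)] += 1
    return counts


sample = [
    # Message as quoted in the post.
    "Host   0021.9b5e.7176 in vlan 1 is flapping between port Po1 and port Fa1/0/11",
    # Hypothetical second line with the ports reversed, for illustration only.
    "Host   0021.9b5e.7176 in vlan 1 is flapping between port Fa1/0/11 and port Po1",
]

flaps = count_flaps(sample)
for (mac, vlan, ports), n in flaps.most_common():
    print(mac, vlan, ports, n)
```

If the same few MAC addresses dominate the counts on every stack, that points at specific hosts rather than at the uplinks.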


It looks like this for all of the stacks:



Config on the core switches:


interface Port-channel113

description PORTCHANNEL 2GBit

switchport

switchport mode trunk

switchport nonegotiate

spanning-tree portfast trunk

interface TenGigabitEthernet1/1/3

description MEMBER-IF CU PO113 1GBit

switchport mode trunk

switchport nonegotiate

channel-protocol lacp

channel-group 113 mode active

spanning-tree portfast trunk


interface TenGigabitEthernet2/1/3

description MEMBER-IF LWL PO113 1Gbit

switchport mode trunk

switchport nonegotiate

channel-protocol lacp

channel-group 113 mode active

spanning-tree portfast trunk



(4500X-1 and 4500X-2)

       VSS-Cluster

Te1/1/3      Te2/1/3


    /   Po113    \

   /                   \

  /                     \

/       Po1          \

3750-1                \

         \                  \

           3750-2       \     Stack 3x 3750 via stacking cables

                      \       \

                       3750-3



Config on the stack switches:


interface Port-channel1

description PORTCHANNEL TO VSS-CORE 2GBit -> PO1

switchport trunk encapsulation dot1q

switchport mode trunk

switchport nonegotiate

spanning-tree portfast trunk


interface GigabitEthernet1/0/1

description LWL MEMBERLINK PO1 -> 4500X

switchport trunk encapsulation dot1q

switchport mode trunk

switchport nonegotiate

load-interval 30

channel-group 1 mode active

spanning-tree portfast trunk


interface GigabitEthernet3/0/1

description LWL MEMBERLINK PO1 -> 4500X

switchport trunk encapsulation dot1q

switchport mode trunk

switchport nonegotiate

load-interval 30

channel-group 1 mode active

spanning-tree portfast trunk




These error messages actually occur everywhere, also on the new stacks with the 3750X.


I have already checked and found:

- no messages regarding loops on the core

- no messages or other indication that one of the member links left the channel at any time

- no real problems noticed by users or management systems


I'm actually running out of ideas for how to solve this.

All I could find was about real loops or ports leaving the channel... we don't have THOSE problems here.


And even more interesting is:

We have one more switch, which is standalone (no separate stack members) and is connected with two interfaces to the VSS core. This switch experiences exactly the same problem: a MAC address is learned over Po1 instead of the port the station is actually patched to. So it apparently has nothing to do with the port channels spanning different stack members.


Many thanks in advance for the input,

Andreas

Richard Primm Thu, 12/12/2013 - 17:12
User Badges:
  • Cisco Employee,

Hi Andreas,


Have you checked your stack port status and counters? Make sure you don't have a bad or loose cable. I've seen this issue before, and it is usually the result of a poorly tightened cable. Run the summary command below and look at the number of changes.





3750X#show switch stack-ports summ



Switch#/  Stack   Neighbor   Cable    Link   Link   Sync      #         In

Port#     Port              Length    OK   Active   OK    Changes   Loopback

          Status                                          To LinkOK

--------  ------  --------  --------  ----  ------  ----  ---------  --------

  2/1     OK         3      50 cm     Yes    Yes    Yes       19        No

  2/2     Down      None    50 cm     No     No     No         2        No

  3/1     Down      None    50 cm     No     No     No         1        No

  3/2     OK         2      50 cm     Yes    Yes    Yes        1        No
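With many stacks spread over a campus, the "# Changes To LinkOK" column can be checked by script instead of by eye. This is a rough Python sketch, assuming the column layout of the output shown above. The names `parse_stack_ports` and `suspicious_ports` and the threshold of 5 are hypothetical choices, not Cisco guidance.

```python
import re

# One data row of "show switch stack-ports summary", based on the
# layout of the sample output in this post (assumption: other
# releases use the same columns).
ROW_RE = re.compile(
    r"^\s*(?P<switch>\d+)/(?P<port>\d+)\s+(?P<status>\S+)\s+(?P<neighbor>\S+)\s+"
    r"(?P<cable>\d+\s+\w+)\s+(?P<link_ok>Yes|No)\s+(?P<link_active>Yes|No)\s+"
    r"(?P<sync_ok>Yes|No)\s+(?P<changes>\d+)\s+(?P<loopback>Yes|No)\s*$"
)


def parse_stack_ports(text):
    """Return one dict per stack-port row; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            row = m.groupdict()
            row["changes"] = int(row["changes"])
            rows.append(row)
    return rows


def suspicious_ports(rows, max_changes=5):
    """List ports whose change counter exceeds an arbitrary threshold."""
    return [f'{r["switch"]}/{r["port"]}' for r in rows if r["changes"] > max_changes]


SAMPLE = """
  2/1     OK         3      50 cm     Yes    Yes    Yes       19        No
  2/2     Down      None    50 cm     No     No     No         2        No
  3/1     Down      None    50 cm     No     No     No         1        No
  3/2     OK         2      50 cm     Yes    Yes    Yes        1        No
"""

rows = parse_stack_ports(SAMPLE)
print(suspicious_ports(rows))
```

In the sample above only port 2/1, with 19 changes, exceeds the threshold. The counter is cumulative, so what matters is whether it keeps increasing between two runs.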




Second question:  Do you have NAC enabled?


-LP

Andreas Wittemann Mon, 12/16/2013 - 03:45

Hi Richard,


Thanks for the input. I had already checked the stacking and found no inconsistencies at all. Given the error message and the local infrastructure, it was one of my first suspects. But it then turned out that the problem is NOW happening on all the stacks, over the whole campus, and we didn't have a similar problem before.


And no, there's no NAC.

Elton Babcock Thu, 12/12/2013 - 19:30
User Badges:
  • Bronze, 100 points or more

Also check to see if there is a Microsoft wake up proxy running on the machines. It's a wake on LAN solution that will produce MAC flap messages on the access switches.


Andreas Wittemann Mon, 12/16/2013 - 03:52

Hi Elton,


Thanks for the input. I checked with the customer... no WOL is deployed there at all.
