
Port-channel issue in Nexus 7k

lukaszkhalil
Level 1

Hello

I was testing how reloading one Nexus 7k (the secondary vPC peer) affects the stability of the network, and I found that from time to time the Nexus brings up the physical interface before it brings up the whole vPC port-channel.

As a result, traffic from the access switch arriving on this link is black-holed on the Nexus 7K until the vPC is brought online.

I am running NX-OS 5.2.1 and have the following vPC configuration:

vpc domain 100
  peer-switch
  role priority 100
  system-priority 4000
  peer-keepalive destination 2.2.2.2
  delay restore 60
  peer-gateway
  auto-recovery reload-delay 300
  ip arp synchronize

I would appreciate it if somebody could explain to me why this is happening and what I can do to mitigate it.

Regards

Lucas


rsimoni
Cisco Employee

Hi Lucas,

Do you have LACP enabled on the vPC towards the access switch?

Which switch is your access switch?

How often do you see the problem (once in how many reloads of the secondary)?

Riccardo

Hi Riccardo

I do have LACP enabled on both sides. I use a 6500 as the access switch (12.2(SXI)).

Here is my config:

Nexus side

interface port-channel211
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 10
  spanning-tree port type normal
  spanning-tree guard root
  vpc 211

interface Ethernet3/9
  switchport mode trunk
  switchport trunk allowed vlan 10
  channel-group 211 mode active
  no shutdown

interface Ethernet2/9
  switchport mode trunk
  switchport trunk allowed vlan 10
  channel-group 211 mode active
  no shutdown

Cat6500 side

interface Port-channel211
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10
 switchport mode trunk
 logging event link-status
 logging event bundle-status
 spanning-tree guard loop

interface GigabitEthernet1/4
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10
 switchport mode trunk
 logging event bundle-status
 channel-protocol lacp
 channel-group 211 mode active

interface GigabitEthernet1/5
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10
 switchport mode trunk
 logging event link-status
 logging event bundle-status
 channel-protocol lacp
 channel-group 211 mode active

So far I have seen it in two out of three reloads.

Regards

Lucas

Hi Lucas,

What you describe sounds similar to an outstanding issue:

CSCtn05804    VPC LACP comes up after reload of chassis before VPC delay timer has run.

The public info on this bug is still limited, however the description and the internal info seem to match.

In a nutshell, the bug describes a condition in which LACP declares the vPC channel ready without waiting for the vPC delay timer to complete (the timer is needed so that all the hardware is correctly programmed first). This, as you noticed, causes a temporary blackhole for traffic coming from a downstream device through a vPC. Also, the release you run is within the range of affected versions.
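To make the timing easier to picture, here is a minimal Python sketch of the race described above. It is only an illustrative model, not how NX-OS is implemented: the names `ReloadTimeline` and `blackhole_window` are hypothetical, the 20-second link-up time is an invented example value, and the model assumes the vPC is ready to forward exactly when the `delay restore` timer (60 seconds in the posted config) expires.

```python
from dataclasses import dataclass


@dataclass
class ReloadTimeline:
    """Times in seconds, measured from the start of the delay restore timer.

    Both fields are illustrative; real timings depend on the platform.
    """
    link_up_at: float      # when the member port is physically up and LACP can bundle it
    delay_restore: float   # 'delay restore' value from the vpc domain configuration


def blackhole_window(t: ReloadTimeline, lacp_waits_for_timer: bool) -> float:
    """Return how long the access switch may send traffic onto a link
    whose vPC is not yet ready to forward (assumed model, see lead-in)."""
    vpc_ready_at = t.delay_restore
    if lacp_waits_for_timer:
        # Expected behaviour: the port is bundled only once the vPC is ready.
        bundled_at = max(t.link_up_at, vpc_ready_at)
    else:
        # Behaviour described for the bug: the port is bundled as soon as the link is up.
        bundled_at = t.link_up_at
    return max(0.0, vpc_ready_at - bundled_at)


if __name__ == "__main__":
    timeline = ReloadTimeline(link_up_at=20.0, delay_restore=60.0)
    print("expected behaviour:", blackhole_window(timeline, lacp_waits_for_timer=True))
    print("bug behaviour:     ", blackhole_window(timeline, lacp_waits_for_timer=False))
```

Under these assumptions, the window is zero when LACP waits for the timer, and otherwise equals the gap between link-up and timer expiry, which is the interval during which the access switch can hash traffic onto the not-yet-ready link.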

Let me ask the people who are working on this and I will let you know.

Riccardo

Hi

Thanks. It seems that it might be that.

I will be waiting for any news regarding the fixed software release.

Thank you in advance

Regards

Lucas

Hi Lucas,

I had a chat with the people who worked on this, and indeed it seems that you are hitting this bug.

The fix is being tested as we speak, so a CCO release containing it will come shortly.

I suggest you monitor the bug I gave you. As soon as the issue is fixed, the bug will be readable and will list the releases that contain the fix.

Please rate and close the thread if helpful 

regards,

Riccardo

Hi

Ok, thank you very much for your help.

Regards

Lucas

Hi all.

It seems like we have the same problem with Nexus 5548UP and SW 5.0.3.N2.2b.

Can it be the same bug in Nexus5000 SW?

thx,

regards, Juergen

Hi Juergen,

Yes, there is a similar issue on the N5k:

CSCtk52637    vPC peer-link PC failure cause 20 sec. traffic loss

It is apparently fixed in 5.0(3)N1(1) or later. There is another similar issue, but I could not find the bug that addresses it.

I suggest you go for the latest release and, if you still have the issue, open a TAC case, as unfortunately I don't have time to do a deep search these days.

Riccardo

Hi Riccardo.

Thx for your quick answer.

As I wrote in my first post, we already use a higher version: 5.0.3.N2.2b.

The next higher version would be 5.1.3.N1.1a. We have also tested with this software, with the same result.

With the newest software, 5.1.3.N1.1a, we also had problems with FCoE (NPV). The same config works with 5.0.3.N2.2b, but with 5.1.3.N1.1a no data is transported through the FC uplink and vfc interfaces to the ESX servers.

thx

Regards, Juergen

Hi Juergen,

Yes, I noticed that you have a more recent version. That is why I mentioned the other bug, whose details I don't remember, and also suggested opening a TAC case, where you will get better support, as I have limited time now.

Riccardo

Hello Riccardo

We are monitoring the bug you gave us regarding the 7k, but nothing has changed so far. The bug is still not accessible to non-Cisco employees.

Could you please do me a favour and check whether it is fixed and, if so, in which 7k release?

It has been 6 months since the fix was being tested, so I hope it is available now.

Thank you in advance

Regards

Lucas
