01-10-2012 01:31 AM - edited 03-07-2019 04:15 AM
Hello
I was testing how reloading one Nexus 7k (the secondary vPC peer) affects the stability of the network, and I found that from time to time the Nexus brings up the physical interface before it brings up the whole vPC port-channel.
As a result, traffic from the access switch arriving on this link is black-holed on the Nexus 7K until the vPC is brought online.
I am running NX-OS 5.2.1 and have the following vpc configuration
vpc domain 100
  peer-switch
  role priority 100
  system-priority 4000
  peer-keepalive destination 2.2.2.2
  delay restore 60
  peer-gateway
  auto-recovery reload-delay 300
  ip arp synchronize
I would appreciate it if somebody could explain to me why this is happening and what I can do to mitigate it.
Regards
Lucas
01-10-2012 02:31 AM
Hi Lucas,
Do you have LACP enabled on the vPC towards the access switch?
Which switch is your access switch?
How often do you see the problem (once in how many reloads of the secondary)?
Riccardo
01-10-2012 02:53 AM
Hi Riccardo
I do have LACP enabled on both sides. I use a 6500 as the access switch (12.2(SXI)).
Here is my config
Nexus side
interface port-channel211
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 10
  spanning-tree port type normal
  spanning-tree guard root
  vpc 211
interface Ethernet3/9
  switchport mode trunk
  switchport trunk allowed vlan 10
  channel-group 211 mode active
  no shutdown
interface Ethernet2/9
  switchport mode trunk
  switchport trunk allowed vlan 10
  channel-group 211 mode active
  no shutdown
Cat6500 side
interface Port-channel211
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10
 switchport mode trunk
 logging event link-status
 logging event bundle-status
 spanning-tree guard loop
interface GigabitEthernet1/4
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10
 switchport mode trunk
 logging event bundle-status
 channel-protocol lacp
 channel-group 211 mode active
interface GigabitEthernet1/5
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10
 switchport mode trunk
 logging event link-status
 logging event bundle-status
 channel-protocol lacp
 channel-group 211 mode active
So far I have noticed it in two out of three reloads.
Regards
Lucas
01-10-2012 06:16 AM
Hi Lucas,
What you describe sounds similar to an outstanding issue:
CSCtn05804 VPC LACP comes up after reload of chassis before VPC delay timer has run.
The public info on this bug is still limited, however the description and the internal info seem to match.
In a nutshell, the bug describes a condition in which LACP declares the vPC channel ready without waiting for the vPC delay timer to complete (the timer is needed so that all the hardware is correctly programmed first). This, as you noticed, causes a temporary blackhole for traffic coming from the downstream device through a vPC. Also, the release you run is within the range of affected versions.
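To illustrate the timing only, here is a minimal Python sketch. It is not how NX-OS implements anything. It assumes the vPC leg is ready to forward exactly when the delay restore timer (60 s in the posted config) expires, and the 5 s early-bundle time is an invented figure, as are the names `ReloadTimeline` and `blackhole_window`.

```python
from dataclasses import dataclass


@dataclass
class ReloadTimeline:
    """Times in seconds after the reloaded peer boots (hypothetical values)."""
    lacp_bundled_at: float  # access switch starts hashing traffic onto this link
    vpc_ready_at: float     # vPC leg is programmed and forwarding


def blackhole_window(t: ReloadTimeline) -> float:
    """Seconds during which traffic sent on the link may be dropped."""
    return max(0.0, t.vpc_ready_at - t.lacp_bundled_at)


DELAY_RESTORE = 60.0  # from "delay restore 60" in the vpc domain config

# Expected behaviour: LACP bundles only once the delay timer has run.
expected = ReloadTimeline(lacp_bundled_at=DELAY_RESTORE, vpc_ready_at=DELAY_RESTORE)

# Behaviour described in the bug: LACP bundles early (5 s is an assumption).
buggy = ReloadTimeline(lacp_bundled_at=5.0, vpc_ready_at=DELAY_RESTORE)

print(blackhole_window(expected))  # 0.0
print(blackhole_window(buggy))     # 55.0
```

The point of the sketch is that the loss window is the gap between the access switch seeing the link as bundled and the vPC actually forwarding, so a longer delay restore value would only widen it while the bug is present.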
Let me ask the people who are working on this and I will let you know.
Riccardo
01-10-2012 06:28 AM
Hi
Thanks. It looks like that might be it.
I will be waiting for any news regarding the fixed software release.
Thank you in advance
Regards
Lucas
01-10-2012 06:39 AM
Hi Lucas,
I had a chat with the people who worked on this, and it does seem that you are hitting this bug.
The fix is being tested as we speak, so a CCO release containing it will come shortly.
I suggest you monitor the bug I gave you. As soon as the issue is fixed, the bug will become readable and will list the releases that contain the fix.
Please rate and close the thread if helpful.
regards,
Riccardo
01-10-2012 06:41 AM
Hi
Ok, thank you very much for your help.
Regards
Lucas
02-29-2012 09:27 AM
Hi all.
It seems we have the same problem with a Nexus 5548UP running SW 5.0.3.N2.2b.
Could it be the same bug in the Nexus 5000 software?
thx,
regards, Juergen
03-01-2012 01:37 AM
Hi Juergen,
Yes, there is a similar issue on the N5k:
CSCtk52637 vPC peer-link PC failure cause 20 sec. traffic loss
It is apparently fixed in 5.0(3)N1(1) or later. There is also another similar issue, but I could not find the bug that addresses it.
I suggest you go for the latest release and, if you still have the issue, open a TAC case, as unfortunately I don't have time to do a deep search these days.
Riccardo
03-01-2012 01:49 AM
Hi Riccardo.
Thx for your quick answer.
As I wrote in my first post, we already use a higher version: 5.0.3.N2.2b.
The next higher version would be 5.1.3.N1.1a. We have also tested with this software, with the same result.
With the newest software, 5.1.3.N1.1a, we also had problems with FCoE (NPV). The same config works with 5.0.3.N2.2b, but with 5.1.3.N1.1a no data is transported through the FC uplink and the vfc interfaces to the ESX servers.
thx
Regards, Juergen
03-01-2012 01:53 AM
Hi Juergen,
Yes, I noticed that you have a more recent version. That is why I mentioned the other bug, whose details I don't remember, and also suggested opening a TAC case, where you will get better support, as I have limited time now.
Riccardo
07-03-2012 04:06 AM
Hello Riccardo
We are monitoring the bug you gave us regarding the 7k, but nothing has changed so far. The bug is still not accessible to non-Cisco employees.
Could you please do me a favour and check whether it is fixed and, if so, in which 7k release?
It has been 6 months since the fix was being tested, so I hope it is available now.
Thank you in advance
Regards
Lucas