10-01-2014 10:34 AM
Dear Community,
We experienced an unusual behavior of two Nexus 7000 switches within a vPC domain.
According to the attached sketch, we have four N7Ks in two data centers - two Nexus 7Ks are in a vPC domain for each data center.
Both data centers are connected via a Multilayer-vPC.
We had to reload one of these switches and I expected the other N7K in this vPC domain to continue forwarding over its vPC-Member-ports.
Actually, all vPC ports have been disabled on the secondary switch until the reload of the first N7K (vPC-Role: primary) finished.
Logging on Switch B:
20:11:51 <Switch B> %VPC-2-VPC_SUSP_ALL_VPC: Peer-link going down, suspending all vPCs on secondary
20:12:01 <Switch B> %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 1, VPC peer keep-alive receive has failed
In case of a Peer-link failure, I would expect this behavior if the other switch is still reachable via the Peer-Keepalive-Link (via the Mgmt-Port), but since we reloaded the whole switch, the vPCs should continue forwarding.
Could this be a bug or are there any timers to be tuned?
All N7K switches are running on NX-OS 6.2(8)
Switch A:
vpc domain 1
peer-switch
role priority 2048
system-priority 1024
peer-keepalive destination <Mgmt-IP-Switch-B>
delay restore 360
peer-gateway
auto-recovery reload-delay 360
ip arp synchronize
interface port-channel1
switchport mode trunk
switchport trunk allowed vlan <x-y>
spanning-tree port type network
vpc peer-link
Switch B:
vpc domain 1
peer-switch
role priority 1024
system-priority 1024
peer-keepalive destination <Mgmt-IP-Switch-A>
delay restore 360
peer-gateway
auto-recovery reload-delay 360
ip arp synchronize
interface port-channel1
switchport mode trunk
switchport trunk allowed vlan <x-y>
spanning-tree port type network
vpc peer-link
Best regards
10-19-2014 08:12 AM
Problem solved:
During the reload of the Nexus 7K, the linecards were powerd off a short time earlier than the Mgmt-Interface. As a result of this behavior, the secondary Nexus 7K received at least one vPC-Peer-Keepalive Message while its peer-link was already powerd off. To avoid a split brain scenario, the VPC-member-ports have been shut down.
Now we are using dedicated interfaces on the linecards for the VPC-Peer-Keepalive-Link and a reload of one N7K won't result in a total network outage any more.
Find answers to your questions by entering keywords or phrases in the Search bar above. New here? Use these resources to familiarize yourself with the community: