Traffic blackholing on overlapping STP and vPC (CAM table related)

fmamedov
Level 1

Hello everyone,

I have run into a very strange problem while doing pre-deployment vPC/STP testing in the lab with a pair of Nexus 7000s. Having searched all over the web, I couldn't find an answer, so I am hoping someone has encountered a similar issue before.

The basic configuration is as follows:

2x Nexus 7000 VDCs (ver 6.0(4)) are configured as vPC peers and connected with a vPC peer-link (redundant, on different 10G blades) and a vPC peer-keepalive link. The switches also act as HSRP and EIGRP routers. N7K-A is nominally configured as the STP root and HSRP active router for all VLANs; N7K-B is the STP backup root and HSRP standby router. The STP mode is Rapid PVST+. As it stands now, the STP root and the vPC primary are on different switches: the STP root is N7K-A and the vPC primary is N7K-B.

3x Layer-2 switches (3750-1, 3750-2, 3560-1) are configured as access switches and connected to the Nexus 7Ks with 1G uplinks in a V pattern.

3750-1 and 3560-1 are configured for vPC as Port-Channel 10 and Port-Channel 12 respectively. 3750-2 is configured for STP only. VLAN 35 is shared between all three switches and is enabled on the vPC peer-link (overlapping vPC and STP domains). The downlink ports to the STP-only 3750-2 on the N7Ks are configured with "vpc orphan-port suspend".

Everything seems to work fine, and pings on VLAN 35 between the access switches (which have their mgmt interfaces in VLAN 35) recover rapidly after failures. However, if I break the vPC peer-link, the ping between the two vPC switches, 3750-1 and 3560-1, stops. Moreover, the problem is sporadic: some peer-link failure attempts recreate it and some do not. Sometimes it shows up when the peer-link is brought back up rather than when it is taken down.

After some troubleshooting, I have isolated the problem to MAC address blackholing. When the peer-link is taken down, the MAC address table on the vPC primary switch, N7K-B, ends up (I believe during vPC convergence) forwarding traffic from 3750-1 to 3560-1 through the STP-only switch 3750-2, which apparently completes RSTP convergence and unblocks its alternate link to N7K-B before vPC has finished converging. Once vPC convergence is finished, the path through 3750-2 no longer exists, because vPC takes down all vPC ports and suspends the orphan ports on the vPC secondary switch (N7K-A). However, the MAC address table on N7K-B still points toward 3750-2 instead of directly at Port-Channel 12 on N7K-B, which creates a traffic blackhole. Issuing a ping or bouncing the SVI interfaces on N7K-B fixes the problem.
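
For anyone trying to reproduce this, the stale entry described above can probably be confirmed with standard NX-OS show commands along the following lines. This is only a sketch: VLAN 35 matches the setup described here, the exact output differs between releases, and the comments describe what one would expect to see based on the description above rather than captured output.

```
! On N7K-B (vPC primary), after the peer-link has been broken:

! Check which interface the MAC address of 3560-1 is learned on in VLAN 35.
! Expected when healthy: the port-channel toward 3560-1 (Port-Channel 12).
! Expected during the blackhole: the downlink toward 3750-2.
show mac address-table vlan 35

! Check the STP role and state of the downlink toward 3750-2 for VLAN 35.
show spanning-tree vlan 35

! List the ports that vPC treats as orphan ports on this peer.
show vpc orphan-ports
```

Running the same commands on N7K-A should show whether its vPC member ports and orphan ports were actually suspended at the moment the ping stops.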

I am puzzled as to why I am seeing this behavior, why it's sporadic, and what can be done to fix it (unless it's a bug in 6.0). Any insight would be greatly appreciated.

Thanks!

1 Reply

fmamedov
Level 1

UPDATE: I may have found a workaround. Enabling STP Root Guard on all downlinks effectively prevents the N7K switches from unblocking the path through an access switch during vPC reconvergence. I would still like to hear any opinions on why the problem happens in the first place.
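
A minimal sketch of what the Root Guard workaround looks like on one downlink is shown below. The interface name Ethernet1/1 is a placeholder and not taken from the lab configuration; the same line would be applied to each downlink toward the access switches on both N7Ks.

```
! Placeholder interface name; substitute the real downlink toward an access switch.
interface Ethernet1/1
  ! Enable STP Root Guard on this downlink.
  spanning-tree guard root
```

A port that receives a superior BPDU while Root Guard is enabled is placed in a root-inconsistent (blocked) state until the superior BPDUs stop, so the state of the downlinks is worth checking with "show spanning-tree vlan 35" after each failover test.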

Thanks,

Fakhri
