We had two 3120 blade switches fail on us in our production environment and when we plugged in the RMA replacements, it caused a loop that took down our VM environment:
Apr 15 17:33:11: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.56a4.0090 in vlan 2224 is flapping between port Po14 and port Po30
Apr 15 17:33:17: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.56a4.0095 in vlan 2224 is flapping between port Po16 and port Po32
Apr 15 17:33:19: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.56ad.16b6 in vlan 2054 is flapping between port Po14 and port Po30
Apr 15 17:33:19: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.56a4.008c in vlan 2224 is flapping between port Po15 and port Po31
Apr 15 17:33:21: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.56ad.6264 in vlan 2054 is flapping between port Po16 and port Po32
Apr 15 17:33:23: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.56ad.522e in vlan 2055 is flapping between port Po42 and port Po26
Has anyone else experienced this? In vSphere the NICs are set up for 'IP hash', and the OS team claims nothing has changed in their configuration. Our blade switches are plugged into an older HP C7000 chassis that has had a number of blade failures of late, both server and switch. I suspect this is a chassis issue.
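For anyone triaging similar output, here is a minimal Python sketch that counts the flap messages per VLAN and port pair, which makes it easier to see which port-channels are involved. The function name `summarize_flaps` and the sample text are just for illustration; the regular expression assumes the message format shown in the log lines above.

```python
import re
from collections import Counter

# Matches the MACFLAP_NOTIF message body as it appears in the log excerpt above.
# "Host\s*" tolerates a missing space after "Host".
FLAP_RE = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host\s*(?P<mac>\S+) in vlan (?P<vlan>\d+) "
    r"is flapping between port (?P<a>\S+) and port (?P<b>\S+)"
)

def summarize_flaps(log_text):
    """Count flap messages per (vlan, sorted port pair)."""
    pairs = Counter()
    for m in FLAP_RE.finditer(log_text):
        # Sort the two ports so that Po42/Po26 and Po26/Po42 count as one pair.
        pair = tuple(sorted((m["a"], m["b"])))
        pairs[(int(m["vlan"]), pair)] += 1
    return pairs

sample = """\
Apr 15 17:33:11: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.56a4.0090 in vlan 2224 is flapping between port Po14 and port Po30
Apr 15 17:33:23: %SW_MATM-4-MACFLAP_NOTIF: Host xxxx.56ad.522e in vlan 2055 is flapping between port Po42 and port Po26
"""

summary = summarize_flaps(sample)
for (vlan, ports), count in sorted(summary.items()):
    print(f"vlan {vlan}: {ports[0]} <-> {ports[1]}: {count} flap message(s)")
```

If the same pairs keep appearing (for example Po14/Po30, Po15/Po31, Po16/Po32), that points to how those specific port-channels are bundled rather than to random hardware faults.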
I suspect the loop you saw may have been caused by the fact that on a new switch, i.e. one with no configuration, all switchports are part of the same default VLAN. If your VMware servers are using route based on IP hash, i.e. port-channels, then they will balance traffic from a single VM across both their NICs. If your Cisco switches had the default configuration, the connections to the servers would not have been configured as port-channels, and the switches would therefore have seen the same MAC flapping between two ports.
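To illustrate the mechanism, here is a rough Python sketch. It is a simplified model, not VMware's or Cisco's actual implementation: it assumes the uplink is chosen by XOR-ing the source and destination IPv4 addresses modulo the number of uplinks, and it models MAC learning as a plain table. The IP addresses, the MAC label and the port names are all made up for the example.

```python
import ipaddress

def ip_hash_uplink(src_ip, dst_ip, n_uplinks):
    """Simplified IP-hash model: XOR both IPv4 addresses, modulo uplink count."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_uplinks

def count_mac_moves(frames, port_to_logical):
    """Count how often a MAC's learned port changes.

    frames: list of (mac, physical_port) in arrival order.
    port_to_logical: maps bundled physical ports to their port-channel;
    ports not in the map are learned as themselves.
    """
    table = {}
    moves = 0
    for mac, port in frames:
        logical = port_to_logical.get(port, port)
        if mac in table and table[mac] != logical:
            moves += 1
        table[mac] = logical
    return moves

# Hypothetical VM talking to four peers over a host with two uplinks.
vm_ip = "10.0.0.10"
peers = ["10.0.0.20", "10.0.0.21", "10.0.0.22", "10.0.0.23"]
uplink_ports = ["Gi1/0/1", "Gi1/0/2"]  # hypothetical switch ports

frames = [("vm-mac", uplink_ports[ip_hash_uplink(vm_ip, p, 2)]) for p in peers]

# Default switch config: no bundle, so each physical port is learned separately.
unbundled_moves = count_mac_moves(frames, {})

# Both ports in one port-channel: the MAC is always learned on the same logical port.
bundled_moves = count_mac_moves(frames, {"Gi1/0/1": "Po1", "Gi1/0/2": "Po1"})

print("moves without port-channel:", unbundled_moves)
print("moves with port-channel:", bundled_moves)
```

In this model the same VM MAC alternates between the two physical ports when they are not bundled, which is what the switch reports as flapping; with both ports in one port-channel the MAC never moves.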
Are you able to supply some additional information as to how things are connected? For example:
- Confirm it is the VMware ESX hosts in the blade chassis that are configured with IP hash?
- Are the Catalyst 3120 switches configured as a Virtual Blade Switch?
- The connectivity from the blade switches to the external network and the configuration of that network
Also, the two switches that were replaced:
- Were these both part of the same chassis?
- What was the process to replace them e.g., plugged in, all external cables connected and then configured?
We have not scheduled an outage to investigate this further; however, we believe the issue is that we are not using LACP on the trunks to the VMs. When the environment was set up, LACP wasn't an option, and we had to set the port channels to 'on' instead of 'active'. Now that we've upgraded our non-prod ESXi environment from 4.x to 5.5, we are able to use LACP on the trunks, and that should resolve the loop.