I have to clear ARP all the time on a 3750. We have two VMware hosts. Each host has three NICs: one for the service console and two for failover. The hosts are set up in a cluster. The problem is that when we move VMs from one host to another, we can't ping them anymore until I clear the ARP table. All ports that the hosts connect to are configured with portfast, so it's not a learning issue that I can see.
What I did the other day is:
Moved the VM to the other server. Lost ping from my workstation to the server IP.
I COULD ping the server name from the 3750, so that kind of tells me that the switch knows where the server is and has updated its table.
I COULD ping from the edge switch that uplinks to the core.
At first I thought it was because we had the subnet that we use for our workstations as a secondary address on VLAN1. I connected directly into the core from my workstation, and I couldn't ping it from a different VLAN, but I could ping it from the same subnet on a different server. I tried to clear arp on my workstation using netsh, and it didn't make a difference UNTIL I cleared ARP on the switch, and then it works fine.
It doesn't work to clear just the ARP entries for the interfaces that are associated with the VMware server. I have to clear ARP for the VLAN that it's associated with.
Does anyone have any experience in what to do with this? I'm not even sure how to go about troubleshooting this any further.
Happy New Year
If the VMware MAC address changes and the IP address stays the same, there is no other way than to clear ARP to delete the old entry and make the multilayer switch create a new, correct entry.
If the MAC address and the IP address both stayed the same, MAC address learning would be enough to associate the MAC with the new port/instance.
So the suggestion is to replicate the events:
a) collect the MAC and IP address information
b) perform the VM switchover
c) collect the MAC and IP address information again
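To compare the information from steps a) and c), something like the following minimal Python sketch could help. It assumes the usual column layout of "show ip arp" output (Protocol, Address, Age, Hardware Addr, Type, Interface); the function names and the sample addresses are only illustrative and not taken from your network.

```python
def parse_arp_table(text):
    """Parse 'show ip arp'-style lines into a dict of IP address -> MAC address.

    Assumed line layout: Internet  <ip>  <age>  <mac>  ARPA  <interface>
    """
    bindings = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not an ARP entry.
        if len(fields) >= 4 and fields[0] == "Internet":
            bindings[fields[1]] = fields[3].lower()
    return bindings


def changed_bindings(before, after):
    """Return {ip: (old_mac, new_mac)} for IPs whose MAC differs between snapshots."""
    return {
        ip: (before[ip], after[ip])
        for ip in before
        if ip in after and before[ip] != after[ip]
    }


# Hypothetical snapshots taken before and after the VM move.
snapshot_a = """Protocol  Address     Age (min)  Hardware Addr   Type  Interface
Internet  10.1.1.20   5          0050.56aa.0001  ARPA  Vlan1
Internet  10.1.1.30   12         0050.56bb.0001  ARPA  Vlan1
"""
snapshot_c = """Protocol  Address     Age (min)  Hardware Addr   Type  Interface
Internet  10.1.1.20   0          0050.56aa.0002  ARPA  Vlan1
Internet  10.1.1.30   13         0050.56bb.0001  ARPA  Vlan1
"""

diff = changed_bindings(parse_arp_table(snapshot_a), parse_arp_table(snapshot_c))
print(diff)
```

If the moved VM's IP shows up in the result, the IP-to-MAC binding changed; if the result is empty while the VM is unreachable, the switch is still holding the old entry.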
Hope to help
Happy New Year Giuseppe!
We've done that, and the MAC address does change. The problem is that we've left it alone overnight to see if the entry would refresh, and it doesn't. I don't know if it's a configuration problem with the VMware server.
Under the vSwitch's configuration in VMware, make sure you enable "Notify Switches" under the NIC teaming tab. This will fire off a GARP message to the switch when a VM is VMotioned to another server.
Also, if you have Dynamic ARP Inspection enabled, make sure you set the VMware VMotion ports to "trusted" ports (the "ip arp inspection trust" interface command).
I think the issue is related more to the cluster or VM config and less to the switch.
As you move the "physical" resources (like the VMs), the cluster executive must make the changes to the physical/logical map, which then should advertise the change to the outside world.
It sounds like when you move the VM within the cluster, the cluster is (still) presenting the same logical MAC to the switch (layer two only cares about MAC), so the switch sees no need to change the ARP table.
Are the VMs configured in bridge mode to the physical Ethernet port or are they assigned each to their own port (whether the port is physical or logically related through the cluster)?
If you have a diagram, it might be helpful.
Are your VMware servers located in the same L2 broadcast domain? I believe this is the only supported configuration for VMotion. I was mistaken about the GARP; it is actually a RARP that is sent when you VMotion a server, which is only going to update the CAM table.
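To show why a RARP alone would leave the layer 3 entry stale, here is a toy Python model. It is only a sketch of the idea and not how IOS is implemented; the class name, port names, and addresses are made up for the example.

```python
class MultilayerSwitch:
    """Toy model: a CAM table (layer 2) and an ARP table (layer 3)."""

    def __init__(self):
        self.cam = {}  # MAC address -> switch port
        self.arp = {}  # IP address -> MAC address

    def receive(self, src_mac, port, arp_binding=None):
        # Any frame teaches the switch which port the source MAC is on.
        self.cam[src_mac] = port
        # Only an ARP payload carries an IP-to-MAC binding.
        if arp_binding is not None:
            ip, mac = arp_binding
            self.arp[ip] = mac

    def port_for_ip(self, ip):
        # Routed traffic resolves IP -> MAC via ARP, then MAC -> port via CAM.
        return self.cam.get(self.arp.get(ip))


switch = MultilayerSwitch()

# Before the move: the VM answers ARP from the first host (hypothetical values).
switch.receive("0050.56aa.0001", "Gi1/0/1", arp_binding=("10.1.1.20", "0050.56aa.0001"))

# After the move: a RARP-style frame from the new MAC updates the CAM only.
switch.receive("0050.56aa.0002", "Gi1/0/2")
stale_port = switch.port_for_ip("10.1.1.20")

# A gratuitous ARP carries the binding, so the ARP table is updated as well.
switch.receive("0050.56aa.0002", "Gi1/0/2", arp_binding=("10.1.1.20", "0050.56aa.0002"))
fresh_port = switch.port_for_ip("10.1.1.20")

print(stale_port, fresh_port)
```

In this model the routed path keeps pointing at the old port until something rewrites the ARP entry, which matches the symptom of having to clear ARP by hand.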
Here's more information as to what I have so far.
I've connected a laptop directly to the core switch. The core switch is configured like this:
ip address 10.1.1.5 255.255.255.0
ip address 10.2.1.5 255.255.255.0 sec
ip address 10.3.1.5 255.255.255.0 sec
If I have my laptop get an address from our DHCP server, it will pull an address in the 10.2.1.0 subnet. The VM server is in the 10.1.1.0 subnet, and after the move, all devices in the 10.1.1.0 subnet can still see it. Now I've tried changing my laptop's gateway from .5 to .1 (router). I can see the VMware server after doing this. When I change the gateway back to .5, it still doesn't work.
I've also created a new VLAN and added my laptop port to that VLAN, and it's still the same thing. If I statically assign my address in the same 10.1.1.0 subnet and attach to either the core or the edge switch, I can see the VMware server. By changing the gateway to .1, I think I've effectively eliminated VMware as the problem, and it looks more like a switch issue. If I clear the ARP table, everything works fine until the VM is moved over again. The MAC address matches the VMware host, and it matches the ARP entry in the switch.
I wanted to let you guys know in case you had other suggestions, but I'm about to call Cisco on this one; it makes no sense.
You answered that the MAC address changes when moving the VMware instance.
Is the IP address still the same?
If so, you are facing an ARP problem and, as explained by other colleagues, VMware could help by sending a RARP message or gratuitous ARP to update the ARP tables.
For example, when the active HSRP router changes, the new active router sends out a gratuitous ARP that allows LAN switches to learn that the VIP and the VIP MAC address are now reachable through another port.
But in the case of HSRP (if not using standby use-bia) there is a virtual IP and a virtual MAC, so no changes are needed in the ARP table, only in the CAM table (L2 only).
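For reference, this is roughly what a gratuitous ARP looks like on the wire. The Python sketch below only builds the bytes and does not send anything; it uses the request form (gratuitous ARPs can also be sent as replies), and the MAC and IP values are placeholders.

```python
import socket
import struct


def build_gratuitous_arp(mac, ip):
    """Return the raw bytes of a broadcast gratuitous ARP request for (ip, mac)."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet = struct.pack("!6s6sH", broadcast, mac_bytes, 0x0806)

    # ARP payload: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # address lengths 6 and 4, opcode 1 (request).
    # What makes it gratuitous: sender IP and target IP are the same address.
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1, 0x0800, 6, 4, 1,
        mac_bytes, ip_bytes,    # sender MAC, sender IP
        b"\x00" * 6, ip_bytes,  # target MAC (unset), target IP
    )
    return ethernet + arp


# Placeholder values, not from the thread.
frame = build_gratuitous_arp("00:50:56:aa:00:02", "10.1.1.20")
print(len(frame), frame.hex())
```

Any device that already holds an ARP entry for that IP and receives this broadcast can overwrite the entry with the sender MAC carried in the payload.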
If the MAC address changes but the IP is still the same, all devices that are not informed of the change will keep pointing to the old pair (IP address, old MAC), and something must be done to update this information.
Traffic between a secondary IP subnet and the primary IP subnet is like inter-VLAN routing: a PC with an address in a secondary subnet needs to go through its default gateway, even if all devices are in the same broadcast domain.
For this reason you see the following:
The VM server is in the 10.1.1.0 subnet, and after the move, all devices in the 10.1.1.0 subnet can still see it. Now I've tried to change my laptops gateway from being the .5 to .1 (router). I can see the vmware server after doing this. I change the gateway back to .5, and it still doesn't work.
Until the multilayer switch updates its entry, the moved VM cannot be reached from another subnet.
Devices on the same subnet receive the message sent by VMware (RARP or gratuitous ARP) after the move, so they can reach the moved server because they have updated their ARP tables.
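The forwarding decision a host makes can be sketched in a few lines of Python. This is only an illustration; the function name is made up, and the .50 and .20 host addresses are assumed values (only the subnets and the .5 gateway come from your post).

```python
import ipaddress


def next_hop(src_ip, prefix_len, dst_ip, gateway):
    """Return the address the host must resolve with ARP to reach dst_ip."""
    src_net = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    if ipaddress.ip_address(dst_ip) in src_net:
        return dst_ip  # same subnet: ARP for the destination itself
    return gateway     # different subnet: ARP for the default gateway


# Laptop with a DHCP address in the secondary subnet (host part assumed).
via_gateway = next_hop("10.2.1.50", 24, "10.1.1.20", "10.2.1.5")

# Laptop statically addressed in the VM's own subnet (host part assumed).
direct = next_hop("10.1.1.50", 24, "10.1.1.20", "10.1.1.5")

print(via_gateway, direct)
```

In the first case the laptop depends entirely on the ARP entry held by the gateway, which is why a stale entry on the multilayer switch breaks reachability only for hosts in the other subnets.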
Have you enabled ARP inspection?
This could prevent the switch from accepting the RARP message.
Hope to help
It's not a problem with the switch (I don't think). I don't have DAI configured on the switch. The switch gets the update fine, but the workstation that's in another subnet is what can't reach the VM. Well, for whatever reason it's working fine now and we can't get it to fail. I haven't changed anything except enabling portfast on the VM service console ports. We'll just have to wait and see what happens.