Windows 2008 Gratuitous ARP being ignored

Dec 8th, 2009

MS has changed the way they send out a gratuitous ARP with Windows 2008.  Some of our routers appear to be ignoring this new method and others recognise it.

If you add a second IP address to a NIC with Windows 2003, the routers will update the MAC address.  If you do the same thing with a Windows 2008 machine some routers will update the ARP table and others seem to ignore this change.

This is very important in a High Availability scenario where you may move an IP address between two servers on the same subnet.

We have two Cisco 6509e routers and one Cisco 6509 in the same datacenter. One of these routers appears to ignore the move of the IP address from one node to another.

What can cause some routers to update their ARP cache and others to ignore the new type of gratuitous ARP broadcast?

schrader_john Tue, 12/08/2009 - 16:29

There are three routers in this datacenter and I am confirming which one is the next hop.  The three routers and their IOS versions are:

Cisco 6509e running IOS version 12.2(18)SXF9

Cisco 6509e running IOS version 12.2(18)SXF9

Cisco 6509 running IOS 12.1(12c)E5

I believe the one that is the next hop is the top one but I'm not sure at this time.

Mike Pavlovich Tue, 01/26/2010 - 14:20

The new Windows Vista / Server 2008 gratuitous ARP behavior, which results in Cisco routers not creating an ARP entry, seems to be by design. Microsoft did this so that the original purpose of gratuitous ARP (duplicate IP address detection) does not leave invalid ARP entries in the router, which was a problem with the older Windows XP / Server 2003 behavior:

http://blogs.technet.com/networking/archive/2009/03/30/tcp-ip-networking-from-the-wire-up.aspx

--SNIP--

Additionally, when a gratuitous ARP is sent by a Windows Vista or Windows Server 2008, the following change has been made: the SPA field in the initial request is set to 0.0.0.0. This way the ARP or neighbor caches of systems receiving this request are not updated. So, if there is a duplicate IP address, the receivers do not need to have their cache corrected.

--SNIP--
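
To make the SPA difference concrete, here is a minimal sketch in Python that builds the 28-byte ARP request payload both ways. It only packs bytes and sends nothing on the wire. The MAC address is made up, 10.10.0.10 is borrowed from an address mentioned later in this thread, and the helper names are hypothetical. The first payload has SPA 0.0.0.0 as described in the quote; the second has SPA equal to the target IP, which is the form that lets receivers update their caches.

```python
import socket
import struct

ARP_REQUEST = 1  # ARP opcode for a request


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Return a 28-byte ARP request payload (Ethernet header not included)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        ARP_REQUEST,                  # opcode
        sender_mac,                   # SHA: sender hardware address
        socket.inet_aton(sender_ip),  # SPA: sender protocol address
        b"\x00" * 6,                  # THA: unknown in a request
        socket.inet_aton(target_ip),  # TPA: target protocol address
    )


def sender_protocol_address(arp_payload: bytes) -> str:
    """Read the SPA field back out of an ARP payload (bytes 14..17)."""
    return socket.inet_ntoa(arp_payload[14:18])


mac = bytes.fromhex("001122334455")  # hypothetical NIC address
vip = "10.10.0.10"                   # illustrative IP being claimed

# Duplicate-detection style request: SPA is 0.0.0.0, so caches are not updated.
probe = build_arp_request(mac, "0.0.0.0", vip)

# Older-style gratuitous ARP: SPA equals TPA, so receivers can refresh their entry.
announce = build_arp_request(mac, vip, vip)

print(sender_protocol_address(probe))
print(sender_protocol_address(announce))
```

Both payloads ask about the same target IP; the only field that differs is the SPA, which is what a receiving router uses when it decides whether there is a sender mapping to cache.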

The problem as stated in this thread is that this change in Windows Vista / Server 2008 has caused trouble for applications that rely on gratuitous ARPs for other purposes, such as high availability virtual IP failover. I did some poking around and found the following thread, which suggests that application problems like these are being corrected via updated application drivers:


http://social.technet.microsoft.com/Forums/en-US/winserverPN/thread/c6cb9f57-7d5d-4b75-a79a-ff0806300fbe

--SNIP--
In pursuing this via other channels, I did receive an answer from Microsoft
Clustering via our account representative. There is a mechanism present in
MS Clustering which allows a GARP to be sent without the SPA set to 0.0.0.0,
which would update a router's cache, however this is not exposed to other
applications. The recommendation we were given was to craft an NDIS driver
and push out our own GARP with the SPA set whenever we completed asserting
an IP address. We implemented this solution using a Microsoft reference
driver as our template and successfully tested it.

In speaking with other companies working on high-availability / failover
systems, this appears to be the same response they're receiving.
--SNIP--

I hope this helps,

Mike

lbgaus-outerhost Mon, 11/22/2010 - 11:30

I found a way to send gratuitous ARP packets in Windows 2008 and above that seems to work for me. Search for and download arping for Windows (Thomas Habets version), copy it to the server you wish to update ARP for, and run arping ipaddr -S ipaddr. (Replace ipaddr with your IP). This will broadcast gratuitous ARP packets since Windows no longer will.

Windows 2008 R2 and Windows 7 both extended the functionality of the PING command, adding a new option called "-S" (note the uppercase letter S). I am wondering whether this provides functionality similar to the arping tool by Thomas Habets mentioned in this thread (see for instance http://www.habets.pp.se/synscan/programs.php?prog=arping).

The documentation from Microsoft on the new "-S" option of PING refers to its use in IPv6 and makes no mention that it also sends a gratuitous ARP in an IPv4 network.

We are a customer of a high availability solution, and that vendor is working to develop code to work around the change in GARP behavior from Win2K3 to Win2K8. I want to verify that adding a "PING -S 10.10.0.10 10.10.0.1" will allow the MAC (neighbor) cache to be updated in Win2K8 and in the attached network switches.

This thread has been stale a while, seems like the above is an important footnote if accurate.

Thank you.

Mike Pavlovich Tue, 07/26/2011 - 18:18

No, I do not think that "ping -S" in Windows 7 has the same functionality as the Thomas Habets app referenced above.

What the Thomas Habets app seems to do is generate a regular "ARP" request (not a "gratuitous ARP") on demand. A regular "ARP" does not have the "SPA field set to 0.0.0.0", which is the problem with "gratuitous ARP" in the newer versions of Windows being discussed on this thread. Instead the SPA field is set to the IP address of the host generating the regular "ARP" request. Please see the following link for more details on "gratuitous ARP" vs regular "ARP": http://blogs.technet.com/b/networking/archive/2009/03/30/tcp-ip-networking-from-the-wire-up.aspx

What the "ping -S" option on a Windows 7 system does is let you specify the source IP address of the ping, for example when you have both a wireless and an Ethernet NIC with a separate IP address assigned to each and you want to control which one sources the ping. Ping is independent of ARP. A "ping -S" will only result in a regular "ARP" request (and so simulate the Thomas Habets app) if no ARP entry exists in the ARP table for the destination IP of the device being pinged (or for the default gateway if the destination IP is not on the local IP subnet). "ping -S" will *not* generate a regular "ARP" request if we already have an ARP entry for the device being pinged (or for the default gateway), since it will use the MAC address in the existing entry to build the ping request packet. In that case there is no reason to send a new ARP request, because the destination's (or default gateway's) MAC address is already known.

Mike

Thanks Mike for replying. Your response seems to contradict what our HA application vendor is suggesting in their workaround for Microsoft's change of GARP/ARP behavior going from Windows Server 2003 to 2008. Have you observed a layer-2 trace of "ping -S" on Windows 7 or Windows Server 2008 R2 to know that the "-S" option doesn't force the Windows XP and Server 2003 GARP behavior ("SPA field set to source IP")? I have no way of knowing, except that the HA vendor we're working with claims that adding an in-line "ping -S" command to their Windows Server 2008 R2 installation of the HA application "fixes" their issues when the application needs to fail over and inherit an IP address (refresh the ARP cache) from the down system.

Mike Pavlovich Wed, 07/27/2011 - 09:39

Doing a "ping -S" to a destination device that does not have an entry in the ARP table (neighbor cache) will generate a regular ARP request to that device with the SPA field set to the source IP. This will occur even with a regular ping without the "-S" option, since this option simply allows you to specify the source IP address for the ping.  I see this when I run Wireshark on a Windows 7 device and do a "ping -S" to a device that is not in the ARP table.

When an entry already exists in the ARP table for the device being pinged, however, no ARP request will go out since we already have the information cached. In this case doing a "ping -S" will not generate the ARP request you desire.

Since you say in the scenario above that the "ping -S" only goes out in a failure scenario it is probably the case that the device does not have an ARP entry cached prior to the failover and so an ARP request is sent out with the "ping -S" which is why it works. If this is the case (no prior ARP table entry) then it seems like a good workaround to the problem.
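
The cache dependency described above can be modelled in a few lines of Python. This is only a toy model of the decision, not how the Windows stack is implemented; the /24 prefix, the cache contents and the function names are assumptions made for illustration.

```python
import ipaddress


def next_hop(dest_ip: str, local_net: str, gateway_ip: str) -> str:
    """On-link destinations are resolved directly; anything else goes via the gateway."""
    if ipaddress.ip_address(dest_ip) in ipaddress.ip_network(local_net):
        return dest_ip
    return gateway_ip


def ping_triggers_arp(arp_cache: dict, dest_ip: str, local_net: str, gateway_ip: str) -> bool:
    """An ARP request is only needed when the next hop is missing from the cache."""
    return next_hop(dest_ip, local_net, gateway_ip) not in arp_cache


# Hypothetical state: the gateway is already cached, nothing else is.
cache = {"10.10.0.1": "00:aa:bb:cc:dd:01"}
net, gw = "10.10.0.0/24", "10.10.0.1"

print(ping_triggers_arp(cache, "10.10.0.1", net, gw))   # gateway already cached
print(ping_triggers_arp(cache, "10.10.0.20", net, gw))  # on-link host, not cached
print(ping_triggers_arp(cache, "192.0.2.5", net, gw))   # off-subnet, uses cached gateway
```

In this model the workaround only produces an ARP request in the second case, which matches the caveat above: it helps only when the pinged device (or gateway) has no existing entry.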

Mike

Isn't the issue with HA solutions not only the host's ARP cache (which I think defaults to 10 minutes in Windows?) but also the network switch's MAC table cache? In other words, wouldn't it potentially take up to 10 minutes for a host to recognize that an IP has moved to a new network port, or longer in the case of a network switch?

Mike Pavlovich Wed, 07/27/2011 - 12:20

When the ARP request is sent out as discussed above, the source MAC address of the ARP request will be either the burned-in address of the NIC that sent it or a virtual MAC, depending on the vendor's HA mechanism. The destination MAC address of the ARP request will be the broadcast address, so the ARP packet will flood the VLAN in question. All devices in the VLAN should receive the broadcast ARP request and either learn or update their MAC table entry for the ARP packet's source MAC, pointing it at the new path toward the device that sent the request.
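
The learning step can be sketched with a toy switch in Python. This is a simplified model with hypothetical port numbers and a made-up virtual MAC, not any vendor's implementation: the switch records the ingress port for every source MAC it sees and floods broadcasts out of all other ports.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy layer-2 switch: learns source MACs, floods broadcasts and unknown unicasts."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, ingress_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        # Learn (or re-learn) where the source MAC lives.
        self.mac_table[src_mac] = ingress_port
        if dst_mac == BROADCAST or dst_mac not in self.mac_table:
            # Flood to every port except the one the frame arrived on.
            return [p for p in self.ports if p != ingress_port]
        return [self.mac_table[dst_mac]]


switch = LearningSwitch(ports=[1, 2, 3])
virtual_mac = "02:00:00:00:00:10"  # hypothetical HA virtual MAC

# The active node on port 1 sends a broadcast ARP request.
switch.receive(1, virtual_mac, BROADCAST)
print(switch.mac_table[virtual_mac])

# After failover the standby node on port 2 sends one with the same source MAC.
switch.receive(2, virtual_mac, BROADCAST)
print(switch.mac_table[virtual_mac])
```

Any broadcast frame from the new node is enough to move the MAC table entry in this model; whether routers also update their ARP cache depends on the SPA field, which is a separate question from the switch's MAC learning.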

Mike

schrader_john Thu, 07/28/2011 - 07:46

When I initially posted this on the Cisco board a couple years back it was because we were seeing that some routers would update and some would not.  I was hoping there was some switch in the router code that would help with this.

Since then I have found that Microsoft introduced a new "feature" in Vista and Windows 2008 that was supposed to control whether the new behavior was in effect or the old Windows 2003 behavior would be used.  Unfortunately we finally got them to admit that this feature did not work. 

What I am referring to is the netsh.exe configuration feature NUD (Neighbor Unreachability Detection).  If you used the following syntax, it was supposed to revert to the old way of sending out GARP updates:

netsh interface ipv4>set interface "Public Interface" nud=disabled store=persistent

The problem was that when you issued that command and it responded with an OK, nothing actually changed.

In the meantime we were forced to find an alternate solution and did so by writing a device driver to approximate the old process.  That is still in use in our HA product.

As far as I know Microsoft has still not resolved this issue as they promised us that they would.  However, since we had a workaround I have not followed up on whether this has been fixed or not.

Anyone who is having trouble with this issue may want to try this approach and report back to this message board for the benefit of others.  If it still does not work, I hope Cisco has better success than I did at getting this fixed in Windows.
