arp timeout

Unanswered Question
Sep 11th, 2007

Per SRND

Per Design guide below, page 44-45

http://www.cisco.com/application/pdf/en/us/guest/netsol/ns432/c649/cdccont_0900aecd801a8a2d.pdf

and other links

http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00807347ab.shtml

http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a00801d0808.shtml

I want to set "arp timeout 200", but I am not sure whether to put it on the VLAN interface, the port-channel interface, or the physical interface.

Also do I have to set this command on all switches in the network or just the distribution switches?

Thx

lgijssel Tue, 09/11/2007 - 22:01

Only Layer 3 devices keep ARP tables. That means VLAN interfaces and physical interfaces on routers and L3 switches.

The default on Cisco devices is 14400 seconds (4 hours). Making the ARP timeout shorter is not recommended.

The reason is as follows: IP devices keep track of MAC-to-IP mappings in the ARP table. When an entry times out, a new ARP request is sent to refresh the data. This is broadcast traffic.

PCs and servers do this on a regular basis, roughly every 3 to 5 minutes. This is not a big problem because they have relatively few entries in the ARP table: the hosts they communicate with and, very often, the default gateway.

For a router or L3 switch, the situation is completely different. It has ARP entries for all (active) hosts on all directly attached subnets. For an L3 switch at the distribution level, it is not uncommon to have well over 1000 entries. When you make a quick calculation of the number of ARP requests needed to keep the ARP table up to date with such a short refresh interval, you will understand why this is not a good idea.
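To put rough numbers on that quick calculation, here is a minimal Python sketch. It assumes each entry is refreshed exactly once per timeout period and that the refreshes are spread evenly over time; the function name and the 1000-entry figure are illustrative only and do not come from any Cisco tool.

```python
def arp_requests_per_second(entries: int, timeout_s: int) -> float:
    """Average ARP refresh rate, assuming one request per entry per timeout period."""
    if timeout_s <= 0:
        raise ValueError("timeout_s must be positive")
    return entries / timeout_s


if __name__ == "__main__":
    # Compare the 4-hour default with the 200-second value from the question.
    for timeout_s in (14400, 200):
        rate = arp_requests_per_second(1000, timeout_s)
        print(f"{timeout_s:>5} s timeout: {rate:.2f} ARP requests/s")
```

Under these assumptions, 1000 entries cost about 0.07 requests per second at the default and 5 per second at 200 seconds. Whether that is a meaningful load depends on the site.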

The point of a short ARP age is to avoid sending traffic to the wrong MAC address. However, this is only a problem on the client side, and HSRP and similar protocols are designed to address exactly this issue. Hence there is no need to tweak the maximum ARP age on any network device.

Hope this clarifies your questions.

regards,

Leo

Richard Burts Wed, 09/12/2007 - 03:13

Leo

You make a good point about why the default ARP timeout on routers is long. But there is another side to this issue that sometimes does make it recommended to decrease the ARP timeout. The issue is the ageing timer for the CAM or mac-address-table (the layer 2 forwarding table for switches). This timer is short. This can lead to an issue of unicast flooding.

The issue is that a router learns a MAC via ARP. The same frame is used by the switch to learn the source MAC and put it into the layer 2 forwarding table. Then the ageing timer expires and the MAC is removed from the layer 2 forwarding table. The router still has the MAC in the ARP table so when a packet arrives for that destination address the router forwards to the switch. But the switch no longer has that MAC in the layer 2 forwarding table so the switch floods the frame to every port in the VLAN.
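A rough way to size that flooding window is sketched below in Python. The model is deliberately simplified: it assumes a host that receives traffic but sends nothing between ARP refreshes, so the switch re-learns its MAC only when the host answers the router's ARP request. The 300-second CAM aging value is the usual Catalyst default and is an assumption here; check the value on your own switches.

```python
def flooding_fraction(arp_timeout_s: float, cam_aging_s: float) -> float:
    """Share of each ARP cycle during which the switch has no CAM entry.

    Simplified model: the CAM entry is learned at the start of the ARP cycle
    and never refreshed until the next ARP exchange.
    """
    if arp_timeout_s <= 0 or cam_aging_s < 0:
        raise ValueError("arp_timeout_s must be positive and cam_aging_s non-negative")
    return max(0.0, arp_timeout_s - cam_aging_s) / arp_timeout_s


if __name__ == "__main__":
    cam_aging_s = 300  # assumed CAM aging time in seconds
    for arp_timeout_s in (14400, 200):
        share = flooding_fraction(arp_timeout_s, cam_aging_s)
        print(f"ARP timeout {arp_timeout_s} s: flooded {share:.1%} of the time")
```

In this model, any ARP timeout at or below the CAM aging time closes the window entirely. A host that sends traffic regularly keeps its CAM entry alive on its own and never hits the window.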

Which problem has more impact: increased traffic from ARP or increased traffic from unicast flooding? Many people prefer the solution of a shorter ARP timeout.

HTH

Rick

lgijssel Wed, 09/12/2007 - 03:36

That's a good one as well Rick,

However, I disagree with you on this for the reasons below:

1: A router does not learn a MAC via ARP, at least not exclusively. The MAC or CAM table will use any packet to renew an entry, not just ARP responses. As long as a host is active and regularly transmits unicasts with its source address, the CAM entry remains in memory.

2: This unicast flooding you refer to is limited to the first packet after the flushing of the cam-entry. After that, the switch will have re-learned the source interface. Most networks should be able to deal with this.

3: I can see no direct relation between arp aging and cam-aging. They are two different processes serving a different purpose.

4: The aging of the CAM table is a local process which occurs asynchronously on all switches in the network. When a neighbor switch still has the entry, there will be no network-wide flooding, just flooding on one switch. With broadcast traffic, the packet is always flooded to every port in the LAN.

best regards,

Leo

Kevin Dorrell Wed, 09/12/2007 - 04:11

I would like to add my own 2 Euro-cents-worth here.

There are so many variables here that the decision surely has to be made on a site-by-site basis. Some things I have observed:

- If you have a site where host machines are changed frequently, or where DHCP is managed not from the router but has a short lease, then the ARP cache needs to age more rapidly. I find that 4 hours would be a bit long for a user to wait for access through the router just because he has re-used an IP address that was recently in use by another machine. Or does the router listen to a gratuitous ARP as the host powers up? (And if so, is this a security risk?)

- If your VLANs are many and small, then the overhead from all those ARP broadcasts is also small. Granted that the total number of ARP broadcasts is the same, but each host has to process only the broadcasts within its own VLAN.

- On a (layer-3) distribution switch, how many active IP addresses are you going to have? It depends really on how many VLANs it is routing and how many access IPs are on each one.

- On a layer-3 core, there may not be the problem. It may be routing only down to the distribution switches, which then further route down to the VLAN access layer. In that case the core switch will only have one entry per distribution switch (per VLAN), and not one per access IP.

I agree there is only a tenuous connection between ARP aging and CAM aging. CAM aging is reset when a host sources a packet, and any packet at that. The router's ARP entry is created when the router has a packet to send to the host, and then stays in the ARP cache for the full aging time, and doesn't get refreshed by traffic. Therefore the ARP requests are almost guaranteed to repeat every 4 hours.

I take the view that it is safe to reduce the ARP aging time to n seconds, where n is the typical (or maximum) number of entries in the ARP cache. That would result in an average of 1 ARP request per second. Does anyone have 14400 entries in their ARP cache? If so, I would argue that it is time to adopt a more hierarchical design.
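That rule of thumb can be written as a small Python helper. This is only a sketch of the arithmetic: the function name and the `target_rate_per_s` parameter are hypothetical, and it assumes refreshes are spread evenly across the aging period.

```python
import math


def timeout_for_target_rate(entries: int, target_rate_per_s: float = 1.0) -> int:
    """Smallest ARP aging time (seconds) that keeps the average refresh rate at or below the target."""
    if entries < 0 or target_rate_per_s <= 0:
        raise ValueError("entries must be non-negative and target_rate_per_s positive")
    return math.ceil(entries / target_rate_per_s)


if __name__ == "__main__":
    for entries in (200, 1000, 14400):
        print(f"{entries} entries -> {timeout_for_target_rate(entries)} s")
```

With the default target of 1 request per second, the result is simply the entry count, which is why 14400 entries lines up with the 4-hour default.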

What do you all think?

Oh .... and the answer to the original question ... you put it on the VLAN interface.

Kevin Dorrell

Luxembourg

Richard Burts Wed, 09/12/2007 - 06:24

Leo

Disagree with me if you wish. But take a look at this tech note prepared by Cisco TAC which discusses the issue of unicast flooding and has as one of its common solutions to reduce the ARP timeout to be close to the CAM ageing timer:

http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a00801d0808.shtml

Also take a look at this troubleshooting tech note from the TAC:

http://www.cisco.com/en/US/products/hw/switches/ps672/products_tech_note09186a0080093fff.shtml

which specifically suggests changing the ARP timer as a solution to the problem:

There are two workarounds for this problem:

* Redesign the routing topology so that traffic for a given remote IP subnet follows the same route into and out of the Catalyst 2948G-L3 switch.

* Reduce the ARP aging time on router interfaces connected to the Catalyst 2948G-L3 switch to 5 minutes (using the arp timeout interface configuration command).

My colleagues and I have seen this problem for real and I can assure you that there are circumstances where reducing the ARP timer is a very effective solution.

HTH

Rick

lgijssel Wed, 09/12/2007 - 10:20

Rick,

To disagree is what a discussion forum is for. I have read the note that you provided; I must admit that I did not know it.

In this document, several situations are discussed where unicast flooding may occur.

This may very well be true but we are talking exceptions here.

In a well-designed environment without asymmetric routing, with a stable, loop-free topology, I cannot see any reason to modify the default ARP age.

Perhaps the confusion is about making any modifications whatsoever? I totally agree with you (and everyone else) that there may be situations where this becomes necessary.

From the question as posted, I got the impression that the intention was to increase the availability and/or stability of the network and I am convinced that, in the situation as described, it would do no good at all.

regards,

Leo

paul.matthews Wed, 09/12/2007 - 07:16

I would take the unicast flooding. The reason being that the unicast will probably be dropped in hardware by end systems, where the broadcast will probably be processed in software.

Should a device change where it uses an IP address, e.g. a server with two LAN cards decides to start using the other, one would expect a gratuitous ARP from the server on the new address. That will neatly update the ARP table on interested devices, and it will be the first packet with the MAC address as source, which updates the CAM tables.

Kevin Dorrell Wed, 09/12/2007 - 07:30

Paul, so you are saying that the router will believe a gratuitous ARP? That would indeed update the ARP cache nicely, but isn't it a security risk? What is to stop me from crafting a load of gratuitous ARPs (and maybe address them to the unicast MAC of the router to avoid detection) and so becoming man-in-the-middle?

Kevin Dorrell

Luxembourg

aamercado Sun, 09/30/2007 - 22:51

Thanks for all the info, here's my 2 cents -

The design guides

1. Recommends 300 but seems to advise 200 on the arp timeout

2. Recommends using arp timeout when vlans are spanned. If vlans are not spanned, not necessary to use arp timeout.

paul.matthews Mon, 10/01/2007 - 00:22

Hi Kevin - I missed your question.

I do believe the router will accept a gratuitous ARP. The routers use them themselves; indeed it is normal on things like HSRP failover, as it is a way of informing the switch of the change of location of the MAC address as well.

You are right that it is a security risk as well, and it could potentially be exploited in the manner you suggest. There is always going to be a trade-off between security and functionality.

I suspect IP Source Guard may be of use here.

Paul.
