It's my understanding that unless there's a point-to-point link, you're supposed to specify an IP as next hop when doing static routing.
There's a static route on our VPN3000 concentrator that points to the private interface as next hop, instead of the next hop IP.
It's been working until now.
Can anyone explain how it was working, and why it ceased working?
The basic point about next-hop and interface-based static routes is that, by default, static routes have an administrative distance of one, which gives them precedence over routes from dynamic routing protocols.
If you point a static route to a broadcast interface, the route is inserted into the routing table only when the broadcast interface is up. This configuration is not recommended because when the next hop of a static route points to an interface, the router considers each of the hosts within the range of the route to be directly connected through that interface. For example, ip route 0.0.0.0 0.0.0.0 Ethernet0.
With this type of configuration, a router performs Address Resolution Protocol (ARP) on the Ethernet for every destination the router finds through the default route because the router considers all of these destinations as directly connected to Ethernet 0.
This kind of default route, especially if it is used by a lot of packets to many different destination subnets, can cause high processor utilization and a very large ARP cache (along with attendant memory allocation failures).
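To make the ARP behavior described above concrete, here is a minimal Python sketch. It is a toy model, not IOS or VPN 3000 code; the function name `arp_targets` and the sample addresses are made up for illustration. It only shows which addresses a router would need to resolve under each style of static route.

```python
from ipaddress import ip_address

def arp_targets(destinations, next_hop=None):
    """Return the set of addresses a router must ARP for.

    Toy model (an assumption for illustration, not vendor code):
    - interface-only route (next_hop is None): every destination is treated
      as directly connected, so each one needs its own ARP entry.
    - next-hop route: only the next-hop address is resolved.
    """
    if next_hop is None:
        return {ip_address(d) for d in destinations}
    return {ip_address(next_hop)}

# Hypothetical traffic; the first and last packets go to the same host.
dests = ["192.168.1.10", "192.168.2.20", "192.168.3.30", "192.168.1.10"]

interface_route = arp_targets(dests)                      # like: ip route ... Ethernet0
next_hop_route = arp_targets(dests, next_hop="10.1.1.1")  # like: ip route ... 10.1.1.1

print(len(interface_route))  # 3 distinct destinations, so 3 ARP entries
print(len(next_hop_route))   # 1 entry, no matter how many destinations
```

With a default route and many remote destinations, the first set grows with every new host contacted, which is where the large ARP cache comes from.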
A better approach is to specify a numerical next hop on a directly connected interface, which prevents the router from performing ARP for each destination address. However, if the interface with the next hop goes down and the numerical next hop is reachable through a recursive route, you should specify both the next hop IP address and the interface through which the next hop should be found. For example, ip route 0.0.0.0 0.0.0.0 Serial 3/3 192.168.20.1.
Your response is formulated in terms of IOS functionality, and is quite correct for that environment. But the original poster asked a question about the VPN 3000 concentrator, which is quite different from IOS.
To the original poster: It is not quite clear from your post but I believe that you are telling us that the static route on the VPN concentrator which just specified the private interface for a static route used to work and now does not work. Is that correct?
It may not be evident from the description of a static route specifying only the outbound interface, but an essential requirement for it to work is that a layer 3 device on the connected interface must support proxy ARP. Based on what I think I understand about your symptoms, I would guess that the router (or other layer 3 device) connected to the private interface of your VPN 3000 concentrator used to support Proxy ARP and then its configuration was changed to not support Proxy ARP (which some people regard as a security risk). When Proxy ARP is not supported, the static route specifying only the outbound interface will no longer work.
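As a rough illustration of why the interface-only route depends on proxy ARP, here is a simplified Python sketch of the decision the neighboring layer 3 device makes when it sees an ARP request. The function name `proxy_arp_reply`, the subnet, and the route list are hypothetical. Real implementations apply further checks (for example, that the route to the target does not lead back out the interface the request arrived on), so treat this as a sketch under those assumptions.

```python
from ipaddress import ip_address, ip_network

def proxy_arp_reply(target, local_subnet, routes, proxy_arp_enabled):
    """Decide whether a neighboring router answers an ARP request for `target`.

    Simplified model: reply only if proxy ARP is enabled, the target is NOT
    on the subnet the request arrived on, and the router has a route to it.
    """
    addr = ip_address(target)
    if not proxy_arp_enabled:
        return False
    if addr in ip_network(local_subnet):
        return False  # the real host on the segment should answer for itself
    return any(addr in ip_network(r) for r in routes)

# Hypothetical values: the concentrator ARPs for a 192.168.x.x host
# on a private segment numbered 10.0.0.0/24.
routes = ["192.168.0.0/16"]
print(proxy_arp_reply("192.168.5.5", "10.0.0.0/24", routes, True))   # True
print(proxy_arp_reply("192.168.5.5", "10.0.0.0/24", routes, False))  # False
```

In the second case nobody answers the ARP request, so the sender never learns a MAC address for the destination and the interface-only route stops forwarding.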
For some odd reason, a former colleague specified the private interface (toward the internal LAN) as next hop only for 192.168.0.0/16, but numeric IP as next hop for the other two, 10.0.0.0/8 and 172.16.0.0/12.
Users only complain about not being able to reach 192.168.0.0/16 destinations, which is why I suspected that the way the static route is configured is the cause.
This issue surfaced right after we moved the redundant VPN3000's from a pair of 6509's to a pair of 3750's.
All the SVI's and VLAN configurations are replicated over; essentially it was a copy & paste.
I just checked, and proxy arp is not disabled.
What also puzzles me is that this issue is sporadic.
Sometimes 192.168.0.0/16 is reachable via VPN, but sometimes it's not.
I checked all redundant nodes, and they're all configured the same way, so it's not like when you get load balanced to a different VPN concentrator, the behavior would be different.
What I've seen so far is that most of the successful attempts were made in the morning and evening hours.
During peak hours in the afternoon the subnet would be unreachable.
I haven't tried it long enough to be absolutely positive that it's definitely the trend.
If it is though I'd be really curious to know the reason behind it.
I wonder if what Ganesh said about exhausting memory due to large ARP cache would be the cause.
Anyone know the difference in terms of arp cache size supported between 3750 & 6509 (w/ SUP720-3B)?
Or is it purely based on the physical RAM? (128M vs 512M)
Unfortunately we may never be able to find out, because it's affecting production and I need to change that static route from pointing to an interface to pointing to the next hop IP.