I have an 1811 router that is doing a site to site IPSEC VPN back to an ASA5510. The primary connection is through Fa0/0 which connects to a telco provided DSL modem. For backup the internal modem dials a local ISP and sets up a VPN to the same ASA5510. A GRE tunnel rides inside the VPN tunnel to support OSPF. The GRE tunnel is sourced from loopback1 on the 1811 and ends on a router behind the ASA. Dial backup is triggered by an IP SLA monitor that pings the ASA.
Failover to dial backup works well. It takes about 30 seconds to get the backup connection up and routing traffic.
Failback to the primary connection works, but as soon as the IP SLA monitor target reachability comes back "up", traffic stops flowing for nearly 2 minutes. The 1811 has 2 sets of IPSEC SAs: one set for Fa0/0 and the other for Async0/1/0. I suspect that the ASA is confused about which tunnel to use to send packets back to the 1811. How do I remedy this?
I tried doing the command "crypto map mymap local-address Loopback1" but then none of the tunnels came up. I suspect it is because the Loopback1 address is a private address and the ASA doesn't know how to get there. I can do "crypto map mymap local-address Fa0/0" and the primary tunnel works but dial backup never gets a tunnel established.
I suspect that I'll have to either NAT the 1811 Loopback1 address to the public interfaces (how do you overload to two different interfaces?) or do something else.
How do I get this to work without a 2 minute outage during failback?
Not very well apparently... The ASA just sees another IPSEC session from the router's ASYNC interface. (This is a lab router and it's actually an 1841; I'll use a mix of 1841 and 1811 routers in production.) It's handled by the same dynamic crypto map as the IPSEC tunnel from the 1841's Fa0/0. The ASA sees both tunnels handling GRE traffic for the same two source/destination addresses.
If the ASA itself fails, it is part of a failover pair. The standby ASA takes over in about 6 seconds.
I think the solution lies in using "crypto map mymap local-address" on the 1841, so that only a single IPSEC session gets established with the ASA. The tricky bit, which is missing from all the docs I found about the "local-address" command, is how to get the ASA to route the IPSEC packets to an internal address on the remote router, through either a static address many hops away or a dynamically addressed IP on the ASYNC interface, before the tunnels are up.
Just to clarify your setup, you have another router at your central site that's behind the ASA? That router forms the other end of two GRE tunnels that extend to the remote 1841? Two VPN tunnels extend from the ASA to the remote 1841? You are using the ASYNC and FastEthernet interfaces on the ASA to form two VPN tunnel endpoints?
That's correct. The router behind the ASA handles GRE termination at HQ. GRE termination at the 1841 is via loopback1. Fa0/0 of the 1841 goes to a DSL modem and ASYNC0/1/0 dials a local ISP number. Normally, there is only one IPSEC tunnel between the 1841 and the ASA. The only time there are two tunnels is when the DSL comes back up after an outage, and before the ASYNC interface times out. For about two minutes after the DSL comes back, the production traffic fails to get through. I have shortened the idle-timeout to 30 seconds (from 10 minutes) and the duration of the "failback outage" isn't affected. I can end the failback outage sooner if I manually clear the IPSEC SAs on the 1841, but that's not useful in production.
How do you get the async line to go down? Do you use DDR? If that interface went down immediately, the VPN path would disappear immediately.
The ASA also supports SLA monitoring. Could you monitor an IP associated with the remote 1841's async interface to force routing over the DSL path?
The Async line goes down with an idle-timeout. Once the IP SLA monitor sees the ASA as "reachable" via Fa0/0, it re-installs two static routes: one for the ASA address (because I don't have or want a default route to the Internet), and one for the GRE tunnel destination (needed so GRE traffic doesn't loop through its own tunnel once OSPF converges). These routes have a lower administrative distance than the floating static routes that point to the async interface, so no more GRE traffic flows out the async interface and the idle timer increments until it times out.
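To make the route selection described above concrete, here is a minimal Python sketch of it, not the router's actual implementation. The addresses are documentation placeholders, and the distances (1 for the tracked routes, 200 for the floating statics) are assumed values; the post only says the tracked routes have the lower administrative distance.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StaticRoute:
    prefix: str            # destination prefix (placeholder values below)
    out_interface: str     # interface the route points to
    distance: int          # administrative distance; the lowest one wins
    tracked: bool = False  # True if the route depends on IP SLA reachability


def installed_route(routes: List[StaticRoute], prefix: str,
                    sla_reachable: bool) -> Optional[StaticRoute]:
    """Return the route that would be installed for a prefix.

    Tracked routes are only candidates while the SLA target is reachable;
    among the remaining candidates the lowest distance is chosen.
    """
    candidates = [r for r in routes
                  if r.prefix == prefix and (sla_reachable or not r.tracked)]
    return min(candidates, key=lambda r: r.distance, default=None)


# Hypothetical addresses: the ASA outside address and the GRE tunnel destination.
ASA = "198.51.100.1/32"
GRE_DST = "10.0.0.1/32"

ROUTES = [
    StaticRoute(ASA, "Fa0/0", 1, tracked=True),
    StaticRoute(GRE_DST, "Fa0/0", 1, tracked=True),
    StaticRoute(ASA, "Async0/1/0", 200),      # floating static (assumed distance)
    StaticRoute(GRE_DST, "Async0/1/0", 200),  # floating static (assumed distance)
]

for reachable in (True, False):
    for prefix in (ASA, GRE_DST):
        route = installed_route(ROUTES, prefix, reachable)
        print(f"sla_reachable={reachable} {prefix} -> {route.out_interface}")
```

While the SLA target is reachable both prefixes resolve to Fa0/0, so nothing is sent out the async interface and its idle timer can run down. When the target is unreachable, only the floating statics remain.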
Sorry, I didn't reply to the second bit (doing too many things at once today).
The async interface of the 1841 is dynamically assigned so I can't monitor that IP from the ASA. I suppose I could monitor the statically addressed interface of the 1841 at the ASA but the ASA is not aware that the tunnel from the async interface of the 1841 is coming from the same router as the tunnel from the DSL side. I don't think there's a way to tell the ASA to kill the right tunnel as soon as a tunnel from the "better" interface comes up.
I haven't tried killing the async interface as soon as "reachability" through fa0/0 returns, but killing it after 30 seconds doesn't shorten the 2 minute outage at all.
Do you have GRE keepalives set on the two routers? You could kill the GRE tunnel as soon as the central site router loses contact with the remote router's GRE tunnel endpoint.
Nope. Not unless it's on by default. I could see that helping if I had a problem with the initial failover to the async sourced VPN tunnel, but that works fine.
It's during failback, when the DSL comes back up, that I have the problem. At that point I have two VPN tunnels between the remote router and the ASA and one GRE tunnel that the router is pushing out Fa0/0, and the ASA is doing something yet to be determined.
Know of any good debugs to run on the ASA that might shed some light? I'm fairly new to the ASA. I've been doing 3000 series concentrators for the last few years.
Oops... I should clarify: There is only one GRE tunnel between HQ and the 1841. During failback, when the remote DSL circuit comes back up, and before the ASYNC interface times out, there are two IPSEC tunnels between the 1841 and the ASA. I think the problem is that the ASA is confused about which IPSEC tunnel to route the GRE tunnel through since the dynamic crypto map on the ASA identifies the GRE tunnel as appropriate for both of the IPSEC tunnels.
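The suspected ambiguity can be illustrated with a small Python model. This is only a sketch of the suspicion stated above, not confirmed ASA behaviour: it assumes each IPSEC tunnel is identified by its peer address plus the protected source, destination, and protocol, and all addresses are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IpsecSa:
    peer: str      # public address the remote router built the tunnel from
    src: str       # protected source (HQ GRE endpoint)
    dst: str       # protected destination (remote Loopback1)
    protocol: str  # protected protocol


def matching_sas(sas: List[IpsecSa], src: str, dst: str,
                 protocol: str) -> List[IpsecSa]:
    """Return every SA whose protected traffic matches the packet."""
    return [sa for sa in sas if (sa.src, sa.dst, sa.protocol) == (src, dst, protocol)]


HQ_GRE = "10.0.0.1"       # hypothetical GRE endpoint behind the ASA
REMOTE_LO1 = "10.1.1.1"   # hypothetical Loopback1 address on the remote router

# During failback both tunnels exist and protect the same GRE flow.
during_failback = [
    IpsecSa("203.0.113.10", HQ_GRE, REMOTE_LO1, "gre"),  # via Fa0/0 (DSL)
    IpsecSa("192.0.2.77", HQ_GRE, REMOTE_LO1, "gre"),    # via Async0/1/0 (dial-up)
]

matches = matching_sas(during_failback, HQ_GRE, REMOTE_LO1, "gre")
print(len(matches), "SAs match the return GRE traffic")

# Once the async-sourced SA is gone, only one candidate is left.
after_cleanup = [sa for sa in during_failback if sa.peer != "192.0.2.77"]
print(len(matching_sas(after_cleanup, HQ_GRE, REMOTE_LO1, "gre")), "SA matches after cleanup")
```

In this model nothing in the protected-traffic definition distinguishes the two tunnels, which fits the observation that manually clearing the IPSEC SAs ends the outage sooner.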
If you could monitor the static (DSL) IP from the ASA, you could determine whether it was up or down, and send traffic over the async interface only if the DSL connection was down.
I could monitor the static DSL IP from the ASA, but I'm not sure what "action" the ASA should take based on the reachability state. Would I configure a static route that tracks the IP SLA target (the 1841's Fa0/0 address)? Is it possible to give that a lower cost than what the ASA's reverse-route injection is now assigning to both tunnels?
Any idea how this will affect scalability? I could have up to 700 remotes like this going to a failover pair of ASA5520 appliances once this goes into production. That's a lot of added complexity on a per site basis.
My suggestion is definitely not scalable for 700 sites, but I'm wondering how you will use one dial-up line as a backup for that many sites. Do you have another type of broadband in mind as a backup?
This is what I was thinking of as a backup scenario. Your backup line could be VLAN'd into the ASA and you could use the backup interface command along with SLA monitor to achieve failover. The benefit of this failover is that it moves the VPN rules to the backup interface.
We wouldn't be using one line for all the sites to use for backup. Each site would be dialing in to an ISP line that is local for them. I'm prototyping the config with async dialup because that's what I have available in my lab. My expectation is that most of the sites (spread across the northern half of North America) will use wireless broadband for backup, but the design/configuration issues will be the same as dialup to an ISP, and some sites will be stuck with analog dialup as their only option for backup.
We do have one prospective vendor that would aggregate the backup connections into a single VPN or ATM PVC, but that's far from a done deal.
Right now I have fewer than 100 VPN sites and they all do analog or ISDN dial backup directly to an AS5350 at HQ. This works fine as long as there are no more than 50 sites down at a time (like when AT&T has a major cable cut), but it's not economically feasible to pay for enough phone lines to handle all the sites at once.
I plan to get about 400 more sites that are now frame-relay onto VPNs.
I was thinking more in terms of the central site backup connection. Is it feasible to set up a backup interface as described in the URL or are you going to use an analog or ISDN line to dial out at the central site?
The central site will likely be a pair of fractional DS-3s to the Internet, connecting to a single Ethernet segment where the ASA failover pair would live. BGP would provide load-balancing and fail-over for the Internet connection.