I think adding a "local-address" parameter to the crypto map on my remote router will solve the issue described below. I tried setting it to my loopback1 interface but that just resulted in packets with a source address of my privately addressed (192.168.x.x) loopback1 heading to certain death on the Internet. I kind of expected this but I tried it anyway as this little detail wasn't mentioned in any of the docs that I've found so far.
Is this parameter only useful in a private network where you have full control of routing tables, or is there some way to get the response packets from the ASA across the Internet, to the loopback address of the remote router via the correct interface (the one that is "up") of the remote router?
Here's the setup:
I have a remote router doing an IPSEC VPN to an ASA5510. Normally the remote router connects via Fa0/0 which is connected to a telco supplied DSL modem. Fa0/0 has a static IP. If the DSL link fails the router uses an internal analog modem to dial a local ISP, and then sets up an IPSEC VPN to the same ASA5510. DDR is triggered by an IP SLA monitor that pings the ASA. The IPSEC tunnel carries a GRE tunnel to support dynamic routing (OSPF). The remote router terminates the GRE tunnel on Loopback1 and the other end of the GRE tunnel terminates on a router behind the ASA at the central site.
Failover to dial backup works great. It takes 30-45 seconds to get traffic flowing. Not bad for analog dialup.
My problem is with failback. When the DSL line comes back up the IP SLA monitor detects it and re-installs the tracked routes correctly. The router pushes the GRE packets out the correct interface so the idle-timer on the async interface times out as it should. The bad part is that as soon as the IP SLA monitor sees that the DSL line is up, production traffic stops working for about 105-120 seconds.
During this "outage" the router sees the GRE tunnel interface as "up" and it has two VPN tunnels to the ASA established. Killing the async connection manually does not end the "outage" any sooner.
I suspect that the ASA is either not forwarding the GRE packets back to the remote router because it has two tunnels and doesn't know which to use, or it's doing round-robin distribution between the tunnels and the packets are getting to the remote router too far out of sequence to be useful. If either of these is true it should be solved by the "crypto map mymap local-address" config on the remote router which would prevent the duplicate IPSEC tunnels.
"crypto map local-address" specifies which IP address is used as the source IP for the IPSEC packets. It must be an IP that is routable between the two sites.
Can you check the following after the primary link is up?
1. How long does it take for the route entry to be added back to the routing table?
2. How long does it take for both VPN phase 1 (show crypto isa sa) and phase 2 (show crypto ipsec sa) to finish negotiating?
3. If the first two steps are both very fast, can you check whether the encrypt and decrypt counters in "show crypto ipsec sa" are incrementing accordingly?
By the way, can you try enabling keepalives on both ISAKMP and the GRE tunnel interface?
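Those three checks map onto a handful of show commands on the remote router. A rough sketch, assuming the tracked routes hang off track object 1 and the ASA's public address is 203.0.113.10 (both are placeholders, not values from the actual config):

```
! 1. Has the track object changed state, and is the tracked route back in the table?
show track 1
show ip route 203.0.113.10
! 2. Phase 1 and phase 2 state
show crypto isakmp sa
show crypto ipsec sa
! 3. Run this twice a few seconds apart and compare the encaps/decaps counters
show crypto ipsec sa | include pkts
```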
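For reference, a minimal sketch of what enabling both kinds of keepalive could look like on the remote router. The interface name Tunnel0 and the timer values are assumptions for illustration only:

```
! IKE keepalives: send every 10 s, retry every 3 s once the peer stops answering
crypto isakmp keepalive 10 3
!
! Tunnel0 is a placeholder for the GRE tunnel interface
interface Tunnel0
 ! GRE keepalives: send every 5 s, mark the tunnel down after 3 misses
 keepalive 5 3
```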
I understand what the command does. The issue with doing this on the Internet is getting "the Internet" to route to my "routable" loopback address which would seem to require that "the Internet" knows which of my router's two public interfaces is "up" and able to forward the packets to loopback1. I think I'd have a hard time convincing ISPs to let my loopbacks peer with their BGP routers... I did notice that "ip source-route" was mysteriously enabled on the remote router when I tried this. Would anyone really expect that to work across the Internet? Disabling source-route is a pretty basic, and common security precaution.
1. The IP SLA monitor re-installs the static routes (one to the ASA public address and one to the GRE destination address) as soon as it detects that the DSL link is up. This is within 5 seconds of the "Internet" LED lighting up on the DSL modem.
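The tracked-route arrangement described in point 1 might look roughly like the following. This is a sketch, not the poster's configuration: 203.0.113.10 (ASA public address), 10.0.0.1 (GRE destination), 198.51.100.1 (DSL next hop), and the probe and track numbers are all placeholders. It uses the older "ip sla monitor" syntax that matches the wording in the thread.

```
! Probe the ASA through the DSL-facing interface every 5 seconds
ip sla monitor 1
 type echo protocol ipIcmpEcho 203.0.113.10 source-interface FastEthernet0/0
 frequency 5
ip sla monitor schedule 1 life forever start-time now
!
! Newer releases write this as "track 1 ip sla 1 reachability"
track 1 rtr 1 reachability
!
! Both static routes are withdrawn while the probe fails
! and re-installed when it succeeds again
ip route 203.0.113.10 255.255.255.255 198.51.100.1 track 1
ip route 10.0.0.1 255.255.255.255 198.51.100.1 track 1
```

The probe itself has to keep leaving via Fa0/0 while the tracked route is withdrawn, which is what the PBR route map mentioned later in the thread takes care of.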
2. The IPSEC SAs for Fa0/0 (the DSL side) are fully up within a few seconds of the tracked route being re-installed by IP SLA. The "outage" occurs as soon as there are two IPSEC tunnels between the router and the ASA. I can clear up the outage sooner by manually clearing the IPSEC SAs on the ASA. The router will re-establish a VPN through the DSL link and production traffic will flow again about five seconds after I clear the SAs.
I have IKE keepalives running on both ends. I don't see much point in doing GRE keepalives at this point. There are OSPF "hellos" every 10 seconds through the GRE tunnel anyway. There's only one GRE tunnel and it has no awareness of which (or how many) IPSEC tunnel(s) it's riding in.
On the DSL side, once the primary route is added back, the VPN packets should go out via the primary link. Did you notice whether both the decrypt and encrypt counters in "show crypto ipsec sa" are incrementing?
I did a:
access-list 130 permit gre any any
debug ip packet 130 detail
During the failback "outage" it shows all the inbound GRE packets coming in the Async interface, and all the outbound GRE packets going out the Fa0/0 to the DSL device.
Debug output is attached.
So the DSL site is OK. The problem is at the other end. I just wonder how it was implemented at the other end.
The other end is a new ASA5510 that is not yet in production. The ASA has a single dynamic crypto map that handles the crypto tunnel from the DSL side of the remote router as well as the crypto tunnel from the Async side of the remote router. IKE keepalives are enabled at 30 second intervals for the DefaultLan2LanGroup on the ASA. Nothing else is really configured on the ASA other than a partially configured webVPN instance and the usual AAA, DNS, NTP, and logging stuff.
If you have any suggestions for debugs or whatnot to run on the ASA, I'll do that when I get back to work tomorrow.
I believe the ASA cannot differentiate the two VPN sessions in this setup. My suggestion is to set up two tunnel interfaces on the router behind the ASA, one using the Fa0/0 IP of the remote router as its destination IP and the other using the dial-up interface. Then you can use a routing protocol to force the traffic onto the right tunnel interface after the primary link is up.
If the above is no good for you, and if the router at the DSL site supports EEM and clearing the VPN session restores service after the primary link is up, you can try using EEM to clear the VPN session once the primary route is added back to the routing table. Search CCO for EEM and you'll find a lot of reference links.
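A minimal sketch of that EEM idea, assuming the primary path is tied to track object 1 and that clearing every crypto session on the remote router is acceptable. The applet name and the track number are hypothetical, and this is not the applet the original poster ended up using:

```
event manager applet CLEAR-VPN-ON-FAILBACK
 ! Fire when the tracked primary path transitions back to up
 event track 1 state up
 action 1.0 cli command "enable"
 ! Tear down all IKE/IPSEC sessions so only the DSL-side tunnel is rebuilt
 action 2.0 cli command "clear crypto session"
 action 3.0 syslog msg "Primary path restored, crypto sessions cleared"
```

Clearing everything is the bluntest option. "clear crypto session" also accepts local and remote address filters if only the backup-side session should be removed.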
Thanks, I'll search on EEM.
I can't set up a GRE tunnel to the Async interface of the remote router because that is dynamically addressed. Once I go into production with 400 plus remote sites I might also end up with a few dynamically addressed Fa0/0 interfaces on the DSL side too. That is, if I can figure out how to "track" static routes that use a DHCP assigned gateway. I know how to track the DHCP assigned default route, but I need specific static routes for the GRE tunnel and to the ASA. It seems static route commands only take the "dhcp" OR "track" keywords, not both...
I've reduced the duration of the failback outage with an EEM applet. Now it lasts 4 to 14 seconds instead of 100 to 120 or so.
I'm going to try creating a second loopback and sourcing a second GRE tunnel interface from it. The idea is to use EEM etc. to route one GRE tunnel out of the DSL side only and the second out of the Async side only, and see if that works any better.
I am still very curious about the answer to my original question though... I do have a TAC case open for that but the guy working on it typically takes 3-4 days to respond, and I'm going on vacation in a couple of days.
Cool, great job.
What's your question? Is it about crypto map local-address?
The TAC engineer might be busy. In general, a P3 case should get an update every five days if I remember correctly.
I'm still interested in whether or not the crypto map local-address parameter is really practical or even possible with VPNs across the Internet. I can see the loopback address being routable via either physical interface in a private network where the router can influence the core routes via a dynamic routing protocol, but on the Internet I can't do that. Unless there's some sort of BGP trick with non-adjacent peers and something else I don't know about (yet?) I don't think it'll work on the Internet.
Right now I'm going to do two loopbacks on the remote router, and source two GRE tunnels from there. I'll add to the PBR route map so the second GRE tunnel only goes out the backup interface. I'll edit the ACL for the route map that sends the IP SLA pings out the primary interface and add the first GRE tunnel traffic definition to that. Then I'll do the EEM applet to clear the crypto map for the backup interface and shut down the backup interface upon restoration of IP SLA reachability, and another EEM applet to "no shut" the backup interface when SLA reachability is lost. I may not get that far today though... I'm heading out on vacation tomorrow and I'm experiencing a severe case of pre-vacation lack of motivation.
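The second-loopback part of that plan could be sketched along these lines. Everything named here is a placeholder: Loopback2, Tunnel2, Async1, the route-map name BACKUP-GRE, ACL 131, and all addresses. It also assumes the router-generated GRE packets are subject to local policy routing, which is worth verifying on the actual IOS release.

```
interface Loopback2
 ip address 192.168.255.2 255.255.255.255
!
interface Tunnel2
 ip address 10.255.2.2 255.255.255.252
 tunnel source Loopback2
 tunnel destination 10.0.0.2
!
! Match only the GRE packets that belong to the second tunnel
access-list 131 permit gre host 192.168.255.2 host 10.0.0.2
!
route-map BACKUP-GRE permit 10
 match ip address 131
 ! Force the second tunnel out the backup interface only
 set interface Async1
!
! Apply policy routing to packets the router itself originates
ip local policy route-map BACKUP-GRE
```

Only one local policy route map can be active at a time, so in practice this would be an extra sequence in the existing PBR route map rather than a new map.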
Thanks for sharing your solution here.
As you know, "crypto map local-address" changes the source address of the IPSEC packets, so if you would like to use it on the Internet, it must be a public IP. But in your case, even if you put a valid public IP on your loopback interface and use that address in "crypto map local-address", you still have the issue that this IP is only reachable via the ISP that provided it, since I don't think the other ISP would let you advertise this IP into their network.
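For completeness, this is roughly what the command looks like in the private-network case, where the loopback is reachable through either uplink. The map name mymap and Loopback1 come from the thread. The second interface name is a placeholder.

```
! Use Loopback1's address as the local identity and source of IKE/IPSEC packets
crypto map mymap local-address Loopback1
!
interface FastEthernet0/0
 crypto map mymap
!
! Async1 is a placeholder for the dial backup interface
interface Async1
 crypto map mymap
```

With the same crypto map applied to both interfaces and a shared local address, the router keeps a single SA database for the peer instead of one per interface, which is why it avoids the duplicate tunnels. The catch described above still applies: the peer must be able to route back to that loopback address.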