Problems with Phase I of IPSEC communication - Main Mode failure

mitchen
Level 2

We have a number of ADSL connected remote sites with IPSEC tunnels terminating on the PIX firewall at our central office.

This morning, we lost connectivity with one of the ADSL connected sites.

I've checked the router at the remote site: it's up and running, its interfaces are all up, etc.

I asked the telco to check the ADSL line and they report there are no problems with the line.

From running some debugs, the router doesn't seem to be establishing the IPSEC communications with the PIX firewall properly - Phase 1 Main Mode fails. The question is why and what can I do to resolve it?

Here's an example of the "debug crypto isakmp" output on the router:

*Mar 1 01:36:54.383: ISAKMP: received ke message (1/1)

*Mar 1 01:36:54.383: ISAKMP (0:0): SA request profile is (NULL)

*Mar 1 01:36:54.383: ISAKMP: local port 500, remote port 500

*Mar 1 01:36:54.387: ISAKMP: set new node 0 to QM_IDLE

*Mar 1 01:36:54.387: ISAKMP: insert sa successfully sa = 817BFFEC

*Mar 1 01:36:54.387: ISAKMP (0:1): Can not start Aggressive mode, trying Main mode.

*Mar 1 01:36:54.387: ISAKMP: Looking for a matching key for xxx.xxx.xxx.xxx in default : success

*Mar 1 01:36:54.387: ISAKMP (0:1): found peer pre-shared key matching xxx.xxx.xxx.xxx

*Mar 1 01:36:54.387: ISAKMP (0:1): constructed NAT-T vendor-03 ID

*Mar 1 01:36:54.387: ISAKMP (0:1): constructed NAT-T vendor-02 ID

*Mar 1 01:36:54.387: ISAKMP (0:1): Input = IKE_MESG_FROM_IPSEC, IKE_SA_REQ_MM

*Mar 1 01:36:54.391: ISAKMP (0:1): Old State = IKE_READY New State = IKE_I_MM1

*Mar 1 01:36:54.391: ISAKMP (0:1): beginning Main Mode exchange

*Mar 1 01:36:54.391: ISAKMP (0:1): sending packet to xxx.xxx.xxx.xxx my_port 500 peer_port 500 (I) MM_NO_STATE.....

Success rate is 0 percent (0/5)

*Mar 1 01:37:04.391: ISAKMP (0:1): retransmitting phase 1 MM_NO_STATE...

*Mar 1 01:37:04.391: ISAKMP (0:1): incrementing error counter on sa: retransmit phase 1

*Mar 1 01:37:04.391: ISAKMP (0:1): retransmitting phase 1 MM_NO_STATE

*Mar 1 01:37:04.391: ISAKMP (0:1): sending packet to xxx.xxx.xxx.xxx my_port 500 peer_port 500 (I) MM_NO_STATE

*Mar 1 01:37:14.391: ISAKMP (0:1): retransmitting phase 1 MM_NO_STATE...

*Mar 1 01:37:14.391: ISAKMP (0:1): incrementing error counter on sa: retransmit phase 1

*Mar 1 01:37:14.391: ISAKMP (0:1): retransmitting phase 1 MM_NO_STATE

*Mar 1 01:37:14.391: ISAKMP (0:1): sending packet to xxx.xxx.xxx.xxx my_port 500 peer_port 500 (I) MM_NO_STATE

*Mar 1 01:37:24.383: ISAKMP: received ke message (1/1)

*Mar 1 01:37:24.383: ISAKMP: set new node 0 to QM_IDLE

*Mar 1 01:37:24.383: ISAKMP (0:1): SA is still budding. Attached new ipsec request to it. (local yyy.yyy.yyy.yyy, remote xxx.xxx.xxx.xxx)

*Mar 1 01:37:24.391: ISAKMP (0:1): retransmitting phase 1 MM_NO_STATE...

*Mar 1 01:37:24.391: ISAKMP (0:1): incrementing error counter on sa: retransmit phase 1

*Mar 1 01:37:24.391: ISAKMP (0:1): retransmitting phase 1 MM_NO_STATE

*Mar 1 01:37:24.391: ISAKMP (0:1): sending packet to xxx.xxx.xxx.xxx my_port 500 peer_port 500 (I) MM_NO_STATE
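For anyone reading a long capture like the one above, here is a minimal Python sketch that tallies what the debug shows: how many ISAKMP packets were sent, how many phase 1 retransmissions occurred, and whether anything came back from the peer. The function name `summarize_isakmp_debug` is made up for illustration, and the `received packet from` wording is an assumption about how IOS logs inbound ISAKMP packets; no such line appears in the output above.

```python
import re

# Matches the "sending packet to <peer>" lines seen in the debug output above.
SENT = re.compile(r"ISAKMP \(\d+:\d+\): sending packet to \S+")
# Logged once per retransmission in the output above.
RETRANSMIT = "incrementing error counter on sa: retransmit phase 1"
# ASSUMPTION: wording IOS uses for inbound ISAKMP packets (not in the capture).
RECEIVED = "received packet from"

SAMPLE = """\
*Mar 1 01:36:54.391: ISAKMP (0:1): sending packet to xxx.xxx.xxx.xxx my_port 500 peer_port 500 (I) MM_NO_STATE
*Mar 1 01:37:04.391: ISAKMP (0:1): retransmitting phase 1 MM_NO_STATE...
*Mar 1 01:37:04.391: ISAKMP (0:1): incrementing error counter on sa: retransmit phase 1
*Mar 1 01:37:04.391: ISAKMP (0:1): retransmitting phase 1 MM_NO_STATE
*Mar 1 01:37:04.391: ISAKMP (0:1): sending packet to xxx.xxx.xxx.xxx my_port 500 peer_port 500 (I) MM_NO_STATE
"""


def summarize_isakmp_debug(text):
    """Count sent, retransmitted and received ISAKMP packets in debug text."""
    summary = {"sent": 0, "retransmits": 0, "received": 0}
    for line in text.splitlines():
        if SENT.search(line):
            summary["sent"] += 1
        elif RETRANSMIT in line:
            summary["retransmits"] += 1
        elif RECEIVED in line:
            summary["received"] += 1
    # Packets went out but nothing was logged as coming back from the peer.
    summary["peer_silent"] = summary["sent"] > 0 and summary["received"] == 0
    return summary


if __name__ == "__main__":
    print(summarize_isakmp_debug(SAMPLE))
```

Run against the excerpt, this reports packets going out with no reply logged. That only says the exchange is not completing; it cannot tell whether the outbound packet or the peer's answer is the one being lost.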

As I say, this site has had connectivity for some time without any problems - until today. There have been no configuration changes so I don't think any problems are config related.

Any suggestions as to what the problem could be and how to resolve it would be greatly appreciated!

19 Replies

Richard Burts
Hall of Fame

Neil

From the messages that you posted I get the impression that the router is receiving ISAKMP messages from the PIX and guess that the problem may be that the PIX is not receiving ISAKMP from the router. If there have not been any config changes on the router or on the PIX then I would wonder if the provider has possibly changed something and might be blocking the ISAKMP packets from the router.

Can you tell from the PIX side whether it is receiving any ISAKMP messages from this router?

HTH

Rick

Rick,

thanks for the help.

As it happens, as well as the IPSEC tunnels terminating on the PIX, we also establish IPSEC communications between 6 other sites, all using Cisco 837 routers.

Now, none of the other sites are experiencing any problems EXCEPT when they try to establish IPSEC communications with the problem site - we get MM_NO_STATE and Phase 1 Main Mode fails.

I turned on debug crypto isakmp on the problem router and on one of the other routers and watched the transactions. I could not see any ISAKMP traffic from the problem router, which would correspond completely with what you say, i.e. it looks like something on the provider network is somehow blocking the ISAKMP packets!

The problem is, the provider is insisting that they have not made any changes and there are no problems at their end so I'm kind of stuck at the moment!

Thanks for the help all the same and any more suggestions that would help pinpoint the cause of this issue for certain, would be welcomed!

Neil

Can I assume that you have remote access to the router (can telnet or SSH to it)? If so, that would rule out a basic IP connectivity issue. If not, can you establish that there is good IP connectivity (ping or traceroute to it from your central site, or ping your central site from the remote router)?

If we are trying to determine if ISAKMP is going through it might work to debug ip packet on the remote router and look for receipt and generation of UDP 500 packets. Depending on the amount of activity on the remote router the debug might generate lots of output. You can reduce the amount of output by using an access list with the debug. It would look something like this:

choose an extended access list number that is not in use on the router (in my example 199)

config t

access-list 199 permit udp any eq 500 any

access-list 199 permit udp any any eq 500

end

debug ip packet 199

If you are connected to the router by telnet (or SSH), be sure to do terminal monitor so that you see the debug output. Alternatively, you can enable logging buffered debug and do show log to see what is in the logging buffer, which would include the debug output.

That should show whether the router is receiving and sending ISAKMP. If you can show that the router is generating the traffic and receiving the traffic maybe the provider will accept that as proof. You might need to find a way to examine traffic coming into your central site to demonstrate that you are receiving ISAKMP from other sites but not from this one.
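Once the debug output has been copied off the router, a small script can summarize it. The following is a minimal Python sketch, not a definitive parser: the function name `tally_debug_ip_packet` is hypothetical, the line layout is modeled on typical debug ip packet output, the `sending` and `rcvd` action keywords are assumptions, and the sample addresses are documentation placeholders.

```python
import re
from collections import Counter

# ASSUMED layout: "IP: s=<src> (<if>), d=<dst> (<if>), [g=<gw>, ]len <n>, <action>"
# The interface in parentheses is treated as optional on both addresses.
LINE = re.compile(
    r"IP: s=(?P<src>\S+)(?: \((?P<src_if>[^)]+)\))?, "
    r"d=(?P<dst>\S+?)(?: \((?P<dst_if>[^)]+)\))?, "
    r"(?:g=\S+, )?len (?P<len>\d+), (?P<action>.+)$"
)

# Placeholder sample; real output would come from "show log" on the router.
SAMPLE = """\
IP: s=10.0.0.1 (local), d=192.0.2.1 (Dialer0), len 152, sending
IP: s=192.0.2.1 (Dialer0), d=10.0.0.1, len 152, rcvd 3
IP: s=10.0.0.1 (local), d=192.0.2.1 (Dialer0), len 152, encapsulation failed
"""


def classify(action):
    """Map the trailing action text of a debug line to a coarse category."""
    if action == "sending":
        return "sent"
    if action.startswith("rcvd"):
        return "received"
    if action == "encapsulation failed":
        return "encapsulation failed"
    return "other"


def tally_debug_ip_packet(text):
    """Count debug ip packet lines by category; unparsable lines are skipped."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE.search(line.strip())
        if match:
            counts[classify(match.group("action"))] += 1
    return counts


if __name__ == "__main__":
    for category, count in tally_debug_ip_packet(SAMPLE).items():
        print(f"{category}: {count}")
```

Because the debug is filtered by access list 199, every line counted should be UDP 500 traffic. Packets counted as sent with none received would back up the claim that replies are not reaching the router.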

HTH

Rick

Rick,

The remote site has an ISDN router too, so as a workaround they are currently using the ISDN link. I can connect to the ADSL router for troubleshooting purposes by telnetting from the ISDN router (as the ADSL router is just connected to it via Ethernet).

As you have described, I will set up an ACL and debug the IP packets to determine what is happening with ISAKMP traffic. Hopefully that will yield some more info!

Thanks again for your help - will keep you posted on any progress!

Neil

replace the router to isolate the possibility of hardware failure.

also, just wondering if all adsl sites are provided by the same isp.

Replacing the router may be something we have to try - though it will probably take me a few days to get hold of another one, get it configured and shipped to the remote site.

Shouldn't there at least be some indications of a hardware problem from the router itself though? I've examined the show tech output and no signs of any problems?

The sites are all provided by the same ISP. However, they are quite separate geographically, so it's still a possibility that the ISP may have changed something in a part of the network that only affects the problem site. (Well, that's what I think/hope anyway!)

To make matters worse, the ISP we use is geared more towards home users than businesses, and particularly not businesses using the kind of set-up that we have, so it's kind of difficult to get any decent assistance from their technical support!

(I'm relatively new to this Company so all this is a legacy of the old regime!)

Thanks for all the help so far, guys.

Neil

given that all the adsl services are provided by the same isp, it is much less likely that the isp is blocking or restricting ipsec just on that particular site.

another quick question, i was wondering if all the routers are running the same ios or not. i agree with you that at least there should be some indications on the router, unfortunately it's not always the case.

Yes, all the routers are Cisco 837s running the same IOS - 12.3(2)XC2.

These have all been running pretty much without problems, certainly for the last 3 months or so that I have been here.

The fact that all of a sudden this site has stopped working (without any config changes at our end) is what makes me so suspicious of the provider network (and the fact there doesn't seem to be any indication of any hardware issues with the router itself).

I will try some of the troubleshooting Rick has suggested and will also send out a replacement router to see if that makes any difference, at least then I can eliminate a hardware problem.

Thanks,

Neil

Rick,

I configured the ACL as you suggested on the problem remote router and on one of the remote routers that is working.

When I ping from the "working" router to the "problem" router, on the "working" router I see packets being sent but nothing returning.

When I ping from the "problem" router to the "working" router, on the "problem" router I see messages such as:

IP: s=xxx.xxx.xxx.xxx (local), d=yyy.yyy.yyy.yyy (Dialer0), len 152, encapsulation failed

But presumably, this is just because the IPSEC communications have not been established successfully i.e. the IPSEC tunnel has not been set up so the packets cannot be encrypted and sent?

I should hopefully be able to get a replacement router with the exact same configuration to the remote site tomorrow so we can see if that makes any difference.

Neil

It is interesting that the ping fails. We have not yet really ruled out the possibility of IP connectivity problems. The encapsulation failed error message indicates that the router was not able to map the layer 3 destination address to an appropriate layer 2 address.

Can you post the output of show interface (and perhaps the output of show ip interface)? And perhaps you might post the configuration of the interface (if you have reservations about posting the entire config)?

I am wondering if the provider has made any kind of changes about your connection (even if they claim that they have not). Can the problem router ping (or any other way access) the next hop router (the provider router)?

HTH

Rick

we had issues with ios version 12.3(2)XC2 before. on that particular project, we were deploying 15 adsl routers with this ios. during the implementation, 4 out of 15 were having issues with both ssh and ipsec.

with ssh, we were able to establish ssh to the router. however, it drops out every few minutes.

with ipsec, the lan-to-lan tunnel back to the central site was established. unfortunately, no traffic traversed the tunnel, and eventually the tunnel was dropped after a minute or so.

had a discussion with cisco tac, no luck. since all 14 sites share identical config except the ip scheme, and all routers were running the same ios; we didn't include the ios as a possibility. we even replaced the router as we thought that could be hardware failure, still no luck.

anyhow, we did one ios upgrade due to the fact that we've got nothing to lose, and surprisingly it started working. we then did an upgrade on all "naughty" routers and it resolved the issue 100%.

Ok, we have tried replacing the router with a new one but still see the same thing - MM_NO_STATE message, Phase 1 Main Mode failing.

jackko - that's interesting about the IOS and is something we could maybe try. I've certainly had strange issues before that were only ever rectified by an IOS upgrade, but no one knew how or why!

Rick - attached is the output from show int, show ip int, and the interface config itself. (I have blanked out username/password and public address info and changed private address info)

The things that STILL make me suspicious of the provider, though, are that a) this has been working happily for several months until now, b) there have been no config changes from our end, and c) we have replaced the router itself but this hasn't made any difference!

Please let me know if you spot anything from the output or if you have any more suggestions.

Thanks,

Neil

Sorry, might help if I actually provide the attachment!

Guys, I'm not sure if you're actually getting the points I've assigned to you for all your help on this so far?

For some reason, it doesn't seem to be showing any of the points I am allocating (or trying to allocate!) to you.

Not sure if there's just a glitch in the system at the moment. If you are not getting any points, let me know and I will try to assign them again later. Rest assured though your help IS appreciated!

Cheers,

Neil
