
GRE tunnel issue

glyle
Level 1

I am experiencing a strange problem with the GRE tunnels we use to connect a remote site. There are two routers, RT1 & RT2, running HSRP, and each router has its own GRE tunnel over the internet to a router in our network hub. I am running EIGRP over the tunnels. The problem occurs after the tunnels go down due to an internet outage: when they come back up everything is OK, except that I cannot ping an NMS server at our hub. I can ping devices in the same subnet as the server, but not the server itself. I then need to shut down the inside interfaces to switch HSRP over to router 2, which can ping the server. If the tunnels go down again, router 2 then cannot ping the server either, just as with router 1. I have noticed that after roughly 4 hours the routers are able to ping the server again. I have checked the routing tables after each outage & all the correct routes are there. The only traffic going over the active tunnel when it goes down is to & from the server, so I am not sure whether this has anything to do with it.

I am puzzled as to why this is happening, has anyone out there seen this issue before?


Graeme,

OK here is what I see:-

1) from rt1 192.168.0.16 is visible from tunnel 0 - great

2) from rt2 192.168.0.16 is visible from VLAN190 - great, with a feasible successor of tunnel 0 - great

3) from the hub 192.168.10.0 is visible from tunnel 2 (which I presume is the tunnel to rt1?) with a feasible successor of tunnel 3 (which I presume is the tunnel to rt2?)

From all that it should be OK. But let's check some more things:-

1) On the HUB router, what is the delay and bandwidth for tunnel 2 and 3?

2) On rt1, what is the delay and bandwidth for tunnel 0?

3) On rt2, what is the delay and bandwidth for tunnel 0?

The above can be found by typing "show int tun #"

4) On rt1 & rt2, what is the delay and bandwidth of the vlan190 interface?

The above can be found by typing "show int vlan 190"
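For reference, a minimal way to pull just those two values, assuming the interface numbers mentioned in this thread (the output filter is only a convenience and not something the original question asked for):

```
! Exec-mode commands. The matching output line carries both BW and DLY.
! Tunnel0 / Vlan190 are the rt1 and rt2 interfaces named in this thread.
show interfaces Tunnel0 | include BW
show interfaces Vlan190 | include BW
```

On the hub the same commands would be run against Tunnel2 and Tunnel3.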

I have pasted traces to the switch connected to the server & to the server itself. As you can see, the trace to the server never gets past the tunnel. Do you think this may have something to do with the tunnel rather than EIGRP? As I said previously, I can ping the server roughly 4 hours after the tunnel has come back up, without any intervention on my part.

NWBB_T12-60-RT1#trace 192.168.0.18

Type escape sequence to abort.

Tracing the route to 192.168.0.18

1 172.20.20.13 88 msec 88 msec 92 msec

2 172.20.21.2 88 msec 88 msec 88 msec

3 192.168.0.18 88 msec 88 msec *

NWBB_T12-60-RT1#trace 192.168.0.19

Type escape sequence to abort.

Tracing the route to 192.168.0.19

1 * * *

2 * * *

3 * * *

4 * * *

5 * * *

6 * *

NWBB_T12-60-RT1#

Mmmm, if there was an issue with the tunnel you would not really get an EIGRP neighbour, and it would not pass traffic at all! Are you using loopbacks as the source and destination for the tunnels?

Also, have you "tweaked" the EIGRP timers on the tunnels? By default, on low-speed (T1 or slower) NBMA interfaces the EIGRP hello is 60 seconds, with a hold/dead time of 180 seconds. On other interfaces the default hello is 5 seconds and the hold/dead time is 15 seconds. Your posts did not include the output from "show ip eigrp nei". This will indicate the dead timers - you might have an issue there.

One thing - do you have any other issues with traffic over the tunnels? Also, why are you running RIP over VLAN190?
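A short sketch of how the timers in question could be checked, assuming Tunnel0 is the tunnel interface on the remote routers as described earlier in the thread:

```
! Exec-mode commands.
! The Hold column shows how many seconds remain before the neighbour is declared dead.
show ip eigrp neighbors
! Shows the hello interval and hold time configured on the interface.
show ip eigrp interfaces detail Tunnel0
```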

Andrew

3) Correct.

1) Hub: tu2 BW 9 Kbit, DLY 500000 usec

tu3 BW 9 Kbit, DLY 10000000 usec

2) rt1 tu0 BW 9 Kbit, DLY 500000 usec

3) rt2 tu0 BW 9 Kbit, DLY 10000000 usec

4) rt1 vlan190 BW 100000 Kbit, DLY 100 usec

rt2 vlan190 BW 100000 Kbit, DLY 100 usec

The remote network concentrates remote sites that use satellite communication. The VSAT modems used for this only support RIP, hence the redistribution on the routers, as I don't want to run RIP over the hub. I have not changed the EIGRP timers on any interface.

I agree that a tunnel issue would prevent any EIGRP updates & not pass any traffic, but everything else is working as it should. Very strange.

OK - what I would try is changing the EIGRP timers. With the defaults, it takes either 15 or 180 seconds for the neighbour and its routes to be removed from the routing table if the tunnels go down.

I would change the timers to the following:-

ip hello-interval eigrp <> 1

ip hold-time eigrp <> 3
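As a sketch only, this is how those two commands might look on rt1's tunnel. The AS number 65000 is taken from a later post in this thread and Tunnel0 from the earlier ones; both are assumptions about the actual config:

```
! Assumed: EIGRP AS 65000, tunnel interface Tunnel0 (rt1 or rt2)
interface Tunnel0
 ! send a hello every second
 ip hello-interval eigrp 65000 1
 ! advertise a 3 second hold time to the neighbour
 ip hold-time eigrp 65000 3
```

The hold time is advertised to the neighbour, so the same change would be needed on the hub's Tunnel2 and Tunnel3 for both ends to detect a failure quickly.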

I would also change the bandwidth and delay values to ensure the correct paths are chosen for the routing table entry and the feasible successor. Even though it's OK right now, I would change them to make sure. I would try something like:-

From the HUB - tunnel to rt1

Bandwidth 2048

Delay 1000000

From the HUB - tunnel to rt2

Bandwidth 2048

Delay 2000000

And the same numbers on rt1 and rt2 back to the hub.
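A hedged sketch of the hub side of that suggestion. It assumes the delay figures above are in microseconds, as "show int" reports them; the IOS "delay" command takes tens of microseconds, so the values are divided by 10 here. If the figures were meant as literal command values, use them unchanged:

```
! Hub router. Tunnel2 -> rt1 (preferred), Tunnel3 -> rt2 (backup), per this thread.
interface Tunnel2
 bandwidth 2048
 ! 1000000 usec, entered in tens of microseconds (assumption, see above)
 delay 100000
interface Tunnel3
 bandwidth 2048
 ! 2000000 usec, so this path has the higher metric
 delay 200000
```

With equal bandwidth on both tunnels, the lower cumulative delay is what makes the rt1 path the successor and the rt2 path the feasible successor.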

Andrew

I tried adding a static route on rt1 to the server's subnet, but I was still unable to ping it. However, when I added a static route to the server's IP address I was able to ping the server. I have since shut down & re-enabled the tunnel several times & I can now ping the server every time. I have now added this route to rt2. Do you know why this seems to have resolved the issue?

You must have a routing loop issue somewhere, or overlapping IP addressing.

HTH

I don't have any duplicate IP addresses, & if there is a loop, why would it only affect one address?

I am going to leave this overnight, because the tunnels always seem to go down at night, & check when I get in tomorrow morning.

The fact that adding a static route to the server fixes the problem indicates either an IP address overlap or a loop.

The static route is a more specific match for the destination in the routing table, so that path will always be taken. The fact that it works once the static route is configured on rt1 indicates that the issue is in the direction from rt1 to the hub, as you did not define a static route from the hub to rt1 - is this assumption correct?
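To illustrate the "more specific route" point, a minimal sketch of a host route on rt1. It assumes the server is 192.168.0.19 (the address whose trace failed earlier in the thread) and that the route points out of Tunnel0; the actual route used was not posted:

```
! Config mode on rt1. A /32 host route beats any shorter EIGRP-learned prefix.
! 192.168.0.19 and Tunnel0 are assumptions, not taken from a posted config.
ip route 192.168.0.19 255.255.255.255 Tunnel0
! Exec mode: confirm which routing table entry is now used for the server.
show ip route 192.168.0.19
```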

If there is a loop, a good test would be to make tunnel 3 passive on the hub and tunnel 0 passive on rt2, then remove any configured static routes. See if it works. If it does, re-enable the tunnels, then make tunnel 2 passive on the hub and tunnel 0 passive on rt1 (so that rt2 is the best path to the server). If this works, re-enable the other tunnel and re-test. If this fails without a static route, then there is some kind of loop issue.
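The first step of that test could be sketched as follows, assuming EIGRP AS 65000 (mentioned later in this thread) on all three routers:

```
! On the hub: stop EIGRP hellos on the tunnel to rt2
router eigrp 65000
 passive-interface Tunnel3
! On rt2: do the same on its tunnel to the hub
router eigrp 65000
 passive-interface Tunnel0
! To re-enable afterwards, on each router:
!  no passive-interface <tunnel>
```

For the second step the same commands would be applied to Tunnel2 on the hub and Tunnel0 on rt1 instead.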

Then I would suggest you post the relevant routing config from the hub, rt1 & rt2 for review.

HTH

I have only defined a static route at rt1.

I will test & get back to you.

Andrew

I have tried the test you suggested and all pings were answered. When I initially added the static route & was then able to ping the server, I removed the static route again but found I was still able to ping the server. If there is a loop, do you think enabling the static route could have cleared the loop, & that it remained loop-free after the route was removed? And that the loop only returns when the tunnel goes down & comes back up?

That is a possibility - I have seen in the past that EIGRP has an issue that can affect routing...it's called "stuck in active".

I have also seen cases where a specific EIGRP-learned route had to be cleared from the routing table.

I could not see an SIA condition in your posts, nor a persistent route in the routing tables.
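For anyone wanting to check for an SIA condition themselves, a small sketch of the usual starting points (nothing here is specific to this network):

```
! Exec-mode commands.
! Lists only routes currently in the active state; normally this is empty.
show ip eigrp topology active
! Shows every learned path, including those that are not feasible successors.
show ip eigrp topology all-links
```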

If you performed the test and all is OK, and you currently do not have a static route configured and everything is working, I would review my configs. If you are using loopbacks for the source and destination of the tunnels, I would check that the static routes to the loopbacks are correct.

I never saw SIA either. I am planning on keeping the static routes in place, and I would like to keep an eye on it over the next few days to make sure this does not return. The tunnels usually go down in the early morning, so I should know tomorrow morning.

I am not using loopbacks for the tunnels, only the outside FastEthernet interface.

Hi Andrew

I see the tunnels have gone down, but I can still ping the server & the remote VSAT network is stable. I have checked the EIGRP topology & all routes are passive. I looked at RT1's & RT2's logs & have attached the output. It seems to suggest that the IP address in VLAN 190 is not in VLAN 12 & vice versa, which is true. Do you think there is an issue with this?

They are just trying to form neighbours - and it could have something to do with your issue.

Whichever link is the transit link between rt1 & rt2 should be in the EIGRP process. If that is VLAN190, then make VLAN12 passive in process 65000, or vice versa. Both VLANs do not need to be in EIGRP unless you want failover.
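As a sketch, assuming VLAN190 is kept as the transit link, this would go on both rt1 and rt2:

```
! Stop EIGRP from sending hellos and forming neighbours on VLAN 12
router eigrp 65000
 passive-interface Vlan12
```

If VLAN12 is the transit link instead, make Vlan190 passive rather than Vlan12.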
