Cisco Support Community
Community Member

Question about OSPF failover

Good day!

I have a switch in my server room

I also have a switch at a secondary server room at a site 5 miles away

I have two separate LANEXs (1Gb each) connecting the switches. This is for fault tolerance if someone pulls down the fibre (as happened a couple of years ago).

I am using OSPF to route between the two switches. It seems to be balancing the traffic between the two links (not a bad thing)

My default gateway for the switch in my server room is both the endpoints at the other location (since that is where our internet connection is)

ip route 0.0.0.0 0.0.0.0 172.18.10.2 5

ip route 0.0.0.0 0.0.0.0 172.18.11.2 5

(10.1 and 11.1 are here ..)

When I unplug one of the links to test, I lose connectivity to the other site. I only left it unplugged for maybe 5-10 seconds (an OH S#it! moment) since I thought it would fail over to the second route.

Shouldn't OSPF detect the other route to be unavailable fairly quickly and increase the cost? Or is it the fact that I am specifying the cost that screws up the failover? Or am I just a newb and impatient?


Accepted Solutions
Purple

Question about OSPF failover

Hi,

A neighbour is declared down only after the dead interval expires, which by default is 4 times the hello interval, so with the default 10-second hello it can take up to 40 seconds for the neighbour to go down.

You can either configure sub-second hellos with the ip ospf dead-interval minimal hello-multiplier command under the interfaces

or you can use BFD along with OSPF   http://www.cisco.com/en/US/docs/ios/12_0s/feature/guide/fs_bfd.html

It should reduce the convergence time needed for the traffic to be switched to the only remaining interface.
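
As a rough illustration of the first option, a minimal sketch for the two inter-site SVIs seen in this thread (Vlan101 and Vlan102) might look like the following. The multiplier of 4 is an assumed value, not one from the thread, and fast-hello support depends on the platform and IOS release. The 1-second dead interval this sets has to match on both ends of each link, so the same lines would also go on the neighbouring switch.

```
! Sketch only: OSPF fast hellos with a 1-second dead interval.
! hello-multiplier 4 (assumed value) sends a hello roughly every 250 ms.
interface Vlan101
 ip ospf dead-interval minimal hello-multiplier 4
!
interface Vlan102
 ip ospf dead-interval minimal hello-multiplier 4
```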

Regards.

Alain

Don't forget to rate helpful posts.

Community Member

Re: Question about OSPF failover

Correct, you can do that with the statics still in place and check the OSPF database for the new defaults.  If they are there, you can remove your statics.
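
The check described here could be done with something like the commands below, run on the switch that still holds the static defaults. This is only a sketch. While the statics with distance 5 are in place they will keep winning over an OSPF-learned default (distance 110), so the new default would be expected to show up in the database but not yet in the routing table.

```
! Is there an external (type 5) LSA for the default route?
show ip ospf database external 0.0.0.0
! Which default is currently installed, and from which source?
show ip route 0.0.0.0
```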

27 REPLIES

Re:Question about OSPF failover

It all depends on the timers - what timers have you configured?

Sent from Cisco Technical Support Android App

Community Member

Re:Question about OSPF failover

You have specified the administrative distance of static routes (the trailing 5), not the cost of OSPF routes.  OSPF has nothing to do with those routes whatsoever, unless you have redistributed the static routes you posted into OSPF, in which case they would be on the internet side of the link, pointing to your edge device, and advertised via OSPF to the site you are at.

Do a show ip route from the device in your server room, and post the output.

Community Member

Re:Question about OSPF failover

Thanks !!   Here it is

                  

MTL-STACK3750-12-1#sho ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is 172.18.11.2 to network 0.0.0.0


S    192.168.15.0/24 [1/0] via 10.30.10.177
     64.0.0.0/32 is subnetted, 1 subnets
S       64.230.170.178 [1/0] via 10.30.10.12
     209.167.212.0/32 is subnetted, 1 subnets
O E2    209.167.212.154 [110/1000] via 172.18.11.2, 03:48:54, Vlan101
                        [110/1000] via 172.18.10.2, 03:48:54, Vlan102
     172.17.0.0/24 is subnetted, 4 subnets
C       172.17.4.0 is directly connected, Vlan34
C       172.17.2.0 is directly connected, Vlan32
     172.16.0.0/24 is subnetted, 21 subnets
     172.18.0.0/24 is subnetted, 2 subnets
C       172.18.11.0 is directly connected, Vlan101
C       172.18.10.0 is directly connected, Vlan102
     172.22.0.0/32 is subnetted, 1 subnets
S       172.22.21.166 [1/0] via 10.30.10.12
     10.0.0.0/8 is variably subnetted, 9 subnets, 3 masks
O E2    10.10.10.0/24 [110/1000] via 172.18.11.2, 03:48:56, Vlan101
                      [110/1000] via 172.18.10.2, 03:48:56, Vlan102
C       10.30.0.0/16 is directly connected, Vlan1
O       10.31.0.0/16 [110/20] via 172.18.11.2, 03:48:56, Vlan101
                     [110/20] via 172.18.10.2, 03:48:56, Vlan102
O E2    10.35.0.0/16 [110/1000] via 172.18.11.2, 03:48:56, Vlan101
                     [110/1000] via 172.18.10.2, 03:48:56, Vlan102
O E2    10.38.1.0/24 [110/1000] via 172.18.11.2, 03:48:56, Vlan101
                     [110/1000] via 172.18.10.2, 03:48:56, Vlan102
S       10.37.2.0/24 [1/0] via 10.30.10.12
O E2    10.37.1.0/24 [110/1000] via 172.18.11.2, 03:48:56, Vlan101
                     [110/1000] via 172.18.10.2, 03:48:56, Vlan102
O E2    10.36.0.0/16 [110/1000] via 172.18.11.2, 03:48:56, Vlan101
                     [110/1000] via 172.18.10.2, 03:48:56, Vlan102
O E2    10.10.10.61/32 [110/1000] via 172.18.11.2, 03:48:56, Vlan101
                       [110/1000] via 172.18.10.2, 03:48:56, Vlan102

S    192.168.1.0/24 [1/0] via 10.30.10.66
S*   0.0.0.0/0 [5/0] via 172.18.11.2
               [5/0] via 172.18.10.2

Community Member

Re:Question about OSPF failover

Maybe this is better?

MTL-STACK3750-12-1#sho ip route ospf

     10.0.0.0/8 is variably subnetted, 9 subnets, 3 masks
O E2    10.10.10.0/24 [110/1000] via 172.18.11.2, 03:54:26, Vlan101
                      [110/1000] via 172.18.10.2, 03:54:26, Vlan102
O       10.31.0.0/16 [110/20] via 172.18.11.2, 03:54:26, Vlan101
                     [110/20] via 172.18.10.2, 03:54:26, Vlan102
O E2    10.35.0.0/16 [110/1000] via 172.18.11.2, 03:54:26, Vlan101
                     [110/1000] via 172.18.10.2, 03:54:26, Vlan102
O E2    10.38.1.0/24 [110/1000] via 172.18.11.2, 03:54:26, Vlan101
                     [110/1000] via 172.18.10.2, 03:54:26, Vlan102
O E2    10.37.1.0/24 [110/1000] via 172.18.11.2, 03:54:26, Vlan101
                     [110/1000] via 172.18.10.2, 03:54:26, Vlan102
O E2    10.36.0.0/16 [110/1000] via 172.18.11.2, 03:54:26, Vlan101
                     [110/1000] via 172.18.10.2, 03:54:26, Vlan102
O E2    10.10.10.61/32 [110/1000] via 172.18.11.2, 03:54:26, Vlan101
                       [110/1000] via 172.18.10.2, 03:54:26, Vlan102

      

from running config

!
router ospf 1
 log-adjacency-changes
 auto-cost reference-bandwidth 10000
 redistribute static metric 1000 subnets
 network 10.0.0.0 0.255.255.255 area 0
 network 172.16.0.0 0.0.255.255 area 0
 network 172.17.0.0 0.0.255.255 area 0
 network 172.18.10.1 0.0.0.0 area 0
 network 172.18.11.1 0.0.0.0 area 0
!

Community Member

Re:Question about OSPF failover

Could you post the other side's routing table as well?

Also post show ip protocol and show ip ospf nei from both devices.

Community Member

Re:Question about OSPF failover

I am cleaning out a few irrelevant entries for security/brevity, but the important stuff should still be here.

     172.17.0.0/24 is subnetted, 5 subnets
C       172.17.16.0 is directly connected, Vlan46
O       172.17.4.0 [110/20] via 172.18.11.1, 04:11:06, Vlan101
                   [110/20] via 172.18.10.1, 04:11:06, Vlan102
O       172.17.1.0 [110/20] via 172.18.11.1, 04:11:06, Vlan101
                   [110/20] via 172.18.10.1, 04:11:06, Vlan102
O       172.17.3.0 [110/20] via 172.18.11.1, 04:11:06, Vlan101
                   [110/20] via 172.18.10.1, 04:11:06, Vlan102
O       172.17.2.0 [110/20] via 172.18.11.1, 04:11:06, Vlan101
                   [110/20] via 172.18.10.1, 04:11:06, Vlan102
     172.16.0.0/24 is subnetted, 21 subnets

O       172.16.15.0 [110/20] via 172.18.11.1, 04:11:07, Vlan101
                    [110/20] via 172.18.10.1, 04:11:07, Vlan102
O       172.16.9.0 [110/20] via 172.18.11.1, 04:11:07, Vlan101
                   [110/20] via 172.18.10.1, 04:11:07, Vlan102
O       172.16.10.0 [110/20] via 172.18.11.1, 04:11:07, Vlan101
                    [110/20] via 172.18.10.1, 04:11:07, Vlan102
O       172.16.11.0 [110/20] via 172.18.11.1, 04:11:07, Vlan101
                    [110/20] via 172.18.10.1, 04:11:07, Vlan102

     172.18.0.0/24 is subnetted, 2 subnets
C       172.18.11.0 is directly connected, Vlan101
C       172.18.10.0 is directly connected, Vlan102
     172.22.0.0/32 is subnetted, 1 subnets
O E2    172.22.21.166 [110/1000] via 172.18.11.1, 04:11:07, Vlan101
                      [110/1000] via 172.18.10.1, 04:11:07, Vlan102
     10.0.0.0/8 is variably subnetted, 9 subnets, 3 masks
S       10.10.10.0/24 [1/0] via 10.31.0.1
O       10.30.0.0/16 [110/20] via 172.18.11.1, 04:11:07, Vlan101
                     [110/20] via 172.18.10.1, 04:11:07, Vlan102
C       10.31.0.0/16 is directly connected, Vlan1
S       10.35.0.0/16 [1/0] via 10.30.10.12
S       10.38.1.0/24 [1/0] via 10.31.0.1
O E2    10.37.2.0/24 [110/1000] via 172.18.11.1, 04:11:07, Vlan101
                     [110/1000] via 172.18.10.1, 04:11:07, Vlan102
S       10.37.1.0/24 [1/0] via 10.31.0.1
S       10.36.0.0/16 [1/0] via 10.31.0.1
O E2 192.168.1.0/24 [110/1000] via 172.18.11.1, 04:11:07, Vlan101
                    [110/1000] via 172.18.10.1, 04:11:07, Vlan102
S*   0.0.0.0/0 [1/0] via 10.31.0.1

Community Member

Re:Question about OSPF failover

What about the output of show ip protocols and show ip ospf nei?

Community Member

Re:Question about OSPF failover

But if he is ECMPing, as he is, he wouldn't have totally lost connection to the site as he described.

Community Member

Re:Question about OSPF failover

Here is the info you are looking for, Chris. I am fully game to wait 40 seconds, but I cannot do it until the weekend (even then I have to do a CMR). I should mention that I was not able to ping servers at the other site when I unplugged the redundant link (in the test it was VL101's link). Oddly, no one complained (400 users all connecting to Exchange and VoIP at the other site), but as I mentioned, it was a 10-second blip, so...

MTL-CISCO3750MCI-1#show ip ospf nei

Neighbor ID     Pri   State           Dead Time   Address         Interface
172.18.11.1        1   FULL/BDR        00:00:35    172.18.11.1      Vlan101
172.18.11.1        1   FULL/BDR        00:00:35    172.18.10.1      Vlan102
172.17.254.5      1   FULL/BDR        00:00:35    172.17.254.5    Vlan104
172.17.254.5      1   FULL/BDR        00:00:35    172.17.254.1    Vlan103

Routing Protocol is "ospf 1"
  Outgoing update filter list for all interfaces is not set
  Incoming update filter list for all interfaces is not set
  Router ID 172.18.11.2
  It is an autonomous system boundary router
  Redistributing External Routes from,
    static with metric mapped to 1000, includes subnets in redistribution
  Number of areas in this router is 1. 1 normal 0 stub 0 nssa
  Maximum path: 4
  Routing for Networks:
    10.0.0.0 0.255.255.255 area 0
    172.18.10.2 0.0.0.0 area 0
    172.18.11.2 0.0.0.0 area 0
  Routing Information Sources:
    Gateway         Distance      Last Update
    172.18.11.1           110      04:48:57
    172.18.10.1           110      1y35w
  Distance: (default is 110)

Community Member

Question about OSPF failover

And along the same lines here is the other info for the timers

Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
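
These are the defaults for a broadcast network, with the dead interval at four times the hello interval. As a hedged sketch, the timers in use on each inter-site SVI can be confirmed per interface, and a more modest alternative to sub-second hellos is a 1-second hello with a 4-second dead interval. The 1/4 values are assumptions chosen for illustration, and both neighbours on a link need identical timers or the adjacency will not form.

```
! Exec mode: show the timers actually in use on each inter-site SVI
show ip ospf interface Vlan101
show ip ospf interface Vlan102
!
! Interface configuration (sketch, assumed values): 1 s hello, 4 s dead
interface Vlan101
 ip ospf hello-interval 1
 ip ospf dead-interval 4
!
interface Vlan102
 ip ospf hello-interval 1
 ip ospf dead-interval 4
```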

Community Member

Re:Question about OSPF failover

Well I don't see anything wrong with your setup, unless those static routes in your local server room are messing with ECMP for some reason.  Remove the static default routes, as you are advertising a static default route into OSPF from the remote server room and try your test again before tuning your timers. 
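
A hedged sketch of that change follows. One caveat: on IOS, redistribute static does not by itself inject 0.0.0.0/0 into OSPF, and the usual way to advertise a default is default-information originate on the router that holds it. The routing table posted earlier shows the remote switch with a static default via 10.31.0.1, so the sketch assumes the default would be originated there. That is an assumption about the intended design, not something the thread confirms.

```
! On the remote switch (the one showing S* 0.0.0.0/0 via 10.31.0.1):
router ospf 1
 default-information originate
!
! On the local switch, only after the OSPF default is visible
! in the OSPF database:
no ip route 0.0.0.0 0.0.0.0 172.18.10.2 5
no ip route 0.0.0.0 0.0.0.0 172.18.11.2 5
```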

Community Member

Re:Question about OSPF failover

I disregarded the most basic questions...

Was what you were pinging from (to the servers) also a part of VLAN 101?  If so, is your default gateway local to your site, or is it at the remote site?

If your testing node was part of VLAN 101 and your default gateway was on the remote side of the link, when you unplugged the link you wouldn't be able to ping anything on the remote side, but members of VLAN 102 would have no problems.  This wouldn't be a matter of routing convergence, but a layer 2 issue.
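
A related layer 2 point may be worth checking during the next test. On Catalyst switches an SVI normally stays up as long as at least one port in its VLAN is still up and forwarding, so pulling one fibre does not necessarily bring Vlan101 down. If the SVI stays up, the connected 172.18.11.0/24 route remains and a static default via 172.18.11.2 stays installed. Whether that applies here depends on which other ports carry VLAN 101, which the thread does not show. A sketch of what to look at while the link is unplugged:

```
! Run while the link is unplugged
! Did the SVI actually go down?
show interfaces Vlan101
! Which ports are still forwarding in the VLAN?
show spanning-tree vlan 101
! Are both static next hops still installed for the default route?
show ip route 0.0.0.0
```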

Community Member

Re:Question about OSPF failover

Thanks Chris!   I have scheduled a maintenance window for 6am tomorrow and will know more then.

VLAN 101 and 102 are only used to route traffic between the sites. My test was pinging from another VLAN here (where the users reside) to a different VLAN at the other site (actually pinging the mail server).

Before I do anything else I would like to understand the timers better and digest what has been said above. Thanks for all your help!

Drew
