Goofy drops in BGP/OSPF route configuration for DIA

gregwoodson · 09-23-2010

RELEVANT CONFIG:

router ospf XXXXX
 log-adjacency-changes
 redistribute connected subnets
 redistribute static subnets
 passive-interface default
 no passive-interface GigabitEthernet0/0.54
 network XX.XX.XX.XX 0.0.0.0 area 0
 default-information originate
!
router ospf 65501
 log-adjacency-changes
 redistribute connected subnets
 redistribute static subnets
 redistribute bgp XXXXX subnets
 passive-interface default
 no passive-interface GigabitEthernet0/0.50
 network XX.XX.XX.XX 0.0.0.0 area 0
 default-information originate
!
router bgp XXXXX
 no synchronization
 bgp log-neighbor-changes
 network XX.XX.XX.XX mask 255.255.255.255
 network XX.XX.XX.XX
 network XX.XX.XX.XX mask 255.255.224.0
 network XX.XX.XX.XX mask 255.255.255.0
 aggregate-address XX.XX.XX.XX 255.255.224.0 summary-only
 neighbor XX.XX.XX.XX remote-as XXXX
 neighbor XX.XX.XX.XX description DIA B-Peer (Receive from this node via loopback 10)
 neighbor XX.XX.XX.XX password 7 X
 neighbor XX.XX.XX.XX ebgp-multihop 5
 neighbor XX.XX.XX.XX update-source Loopback10
 neighbor XX.XX.XX.XX soft-reconfiguration inbound
 neighbor XX.XX.XX.XX filter-list 1 in
 neighbor XX.XX.XX.XX remote-as XXXX
 neighbor XX.XX.XX.XX description DIA A-Peer (Announce to this node)
 neighbor XX.XX.XX.XX password 7 X
 neighbor XX.XX.XX.XX filter-list 2 in
 maximum-paths 6
 no auto-summary

When I ping one of the nodes in this lab environment from the DIA edge router itself, the pings go through without any pauses or issues. If I ping from int gi0/0.54, it starts, drops out for 3 seconds, starts, drops out for 14 seconds, starts, drops out for 3 seconds, then runs for a while. There is obviously something goofy going on in the routing table, but I can't seem to find out what. I've run BGP and OSPF debugs with no success. The routing table seems to be solid and not changing. Recommendations?

Giuseppe Larosa · 09-23-2010

Hello Greg,

You have two OSPF processes and one BGP process.

>> If I ping simply from the DIA edge router, pings go through without any pauses or issues. If I ping from int gi0/0.54- then it will start, then go out 3 seconds, start, go out 14 seconds, start, go out 3 seconds-

The timing may be meaningful; I would say it may be related to BGP.

You can use debug ip routing combined with an ACL to monitor a specific IP prefix, such as the ping destination.
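As a rough sketch of that suggestion, the debug can be limited to a single prefix with a standard ACL. The ACL number 10 and the address 192.0.2.10 are placeholders standing in for the real ping destination; neither comes from the posted config.

```
! ACL 10 and 192.0.2.10 are placeholders, not taken from the posted config
configure terminal
 access-list 10 permit host 192.0.2.10
 end
! Log routing-table changes only for prefixes matched by ACL 10
debug ip routing 10
! Reproduce the ping drops, watch the output, then turn debugging off
undebug all
```

If the route to the destination really is being withdrawn and reinstalled, the debug output should line up with the 3 and 14 second gaps.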

How do you reach the BGP next-hop?

This is a key point: the BGP next-hop may be considered valid at first and later dismissed if BGP then receives a better (more specific) route for the next-hop address that itself uses that same BGP next-hop.

You can use a /32 static route to fix this, if this is the trouble.

It is a consistency check: the BGP next-hop cannot be validated by a BGP route received on that same eBGP session.

After the BGP next-hop is considered not valid, the BGP route is damped and the process repeats over time: the BGP next-hop is validated again and the route accepted, but then the test fails at the next BGP next-hop validation check.
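To see how the next-hop is currently resolved, and to pin it with a host route, something like the following could be used. The addresses are placeholders only: 198.51.100.1 stands for the multihop BGP peer (the next-hop) and 203.0.113.1 for the directly connected gateway toward it.

```
! Is the next-hop resolved via a static/connected route, or via BGP itself?
show ip route 198.51.100.1
! Does the BGP table hold a prefix that covers the next-hop address?
show ip bgp 198.51.100.1
! Pin the next-hop with a /32 static route so BGP cannot override it
configure terminal
 ip route 198.51.100.1 255.255.255.255 203.0.113.1
 end
```

If the first command shows the next-hop being reached through a BGP-learned prefix, that would match the recursive validation failure described above.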

Hope to help

Giuseppe

gregwoodson · 09-23-2010

I have already adjusted the BGP timers, with no change in the result. The BGP next-hop is reached via a static host route. I figured that the next-hop might be dropping out of the routing table, so the last thing I did was enter that static route. It produced no change.

tbulliard · 01-20-2011

I am having almost the exact same thing happen. I have a remote site connected to our corporate office via AT&T MPLS. At the remote site, I have a 3750 for layer 3. Coming off of the 3750, I have a 2801 router connected to the AT&T managed WAN router.

I am running OSPF on the 3750 and running OSPF and iBGP on the 2801. From the 3750, I can ping multiple networks at the corporate office all day. As soon as I try to access any application (it can be on different networks at the corporate office), the network drops from the routing table on the 2801 and the 3750.

When the application times out, the route is immediately put back into the routing table on both devices and pinging resumes. I know a better route isn't being found, since I currently have only one egress point at this site.

Giuseppe Larosa · 01-20-2011

Hello Tbulliard,

So in the remote office you have:

C3750 ---- C2801 ----- AT&T device ----- MPLS cloud ----- HQ --- application servers

In your case it looks like, at some point in the path, the application traffic fills the pipe and BGP messages are dropped, causing the routing failure.

Can you check on the C2801, using

show ip bgp sum

the uptime of the BGP session?

Is this an L3 VPN or a different service?

If the uptime is short, this might be part of the problem.

Also,

show log

should show BGP-related messages.
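For example, the two checks could be run as follows. The `include` filter is an optional addition to narrow the log output and was not part of the original suggestion.

```
! The Up/Down column shows how long each BGP session has been established
show ip bgp summary
! Show only BGP-related log lines, such as neighbor up/down events
show logging | include BGP
```

An Up/Down timer that keeps resetting around the time the application is used would point to the session itself flapping.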

Hope to help

Giuseppe

Varun Uniyal · 01-20-2011

If there is no recurring pattern to the drops, I suggest you check the interfaces on the local and intermediate devices between the source and the destination.
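A minimal sketch of that interface check, assuming GigabitEthernet0/0 is one of the interfaces in the path (substitute the real ones on each device):

```
! Look for input/output errors, CRC errors, and drops on the interface
show interfaces GigabitEthernet0/0
! Optionally reset the counters, rerun the test, and compare
clear counters GigabitEthernet0/0
```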