BGP Issues ---- 2 sites, 2 providers --- Failover Broken. :)

This is my first post. Thanks for all of the great assistance and guidance on this forum. :)

BGP ROUTING ISSUE:

I had a routing problem this morning between SITE A and SITE B with the following setup:

SITE A -------

BGP NEIGHBOR #1 - 192.168.200.1 (BELLSOUTH)

BGP NEIGHBOR #2 - 172.31.250.125 (AT&T)

SITE B -------

BGP NEIGHBOR #1 - 10.2.250.9 (AT&T ONLY)

SITE A is weighted to prefer BGP NEIGHBOR #1 (BELLSOUTH) as primary, as follows:

router bgp 64702
 no synchronization
 bgp log-neighbor-changes
 timers bgp 20 60
 redistribute connected
 redistribute static
 redistribute eigrp 1
 neighbor 172.31.250.125 remote-as 13979
 neighbor 172.31.250.125 weight 100
 neighbor 172.31.250.125 route-map localonly out
 neighbor 192.168.200.1 remote-as 6389
 neighbor 192.168.200.1 password xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 neighbor 192.168.200.1 soft-reconfiguration inbound
 neighbor 192.168.200.1 weight 400
 neighbor 192.168.200.1 distribute-list 55 in
 neighbor 192.168.200.1 route-map localonly out
 no auto-summary

ACL:

access-list 55 remark <==SITE B --- AT&T ROUTES==>
access-list 55 deny 172.22.100.0 0.0.0.255
access-list 55 deny 172.22.101.0 0.0.0.255
access-list 55 deny 172.22.102.0 0.0.0.255
access-list 55 deny 172.22.103.0 0.0.0.255
access-list 55 deny 172.22.104.0 0.0.0.255
access-list 55 deny 172.22.107.0 0.0.0.255
access-list 55 deny 172.16.9.0 0.0.0.255
access-list 55 permit any

I placed an ACL (distribute-list 55) on the inbound side of BGP NEIGHBOR #1 (BELLSOUTH) to keep the SITE B routes it advertises out of the table, so that SITE A would only have AT&T routes for SITE B.

This gives traffic from SITE A a faster path to SITE B: since SITE B only has AT&T, it keeps the traffic from having to traverse our remote datacenter core and then a transit AS before finally reaching SITE B.

To protect this scenario and provide failover in case I lost AT&T at SITE A, I set floating statics with an AD of 200 for all subnets at SITE B, forcing traffic to use the Bellsouth network to reach SITE B through our datacenter. (These floating statics were set up on the router at SITE A.)

EVENT:

AT&T had a brief BGP neighbor adjacency flap last night. The routes learned from AT&T were lost, and the floating statics were installed in the local routing table. The problem is that when AT&T was restored, its lower-AD (eBGP) routes never replaced the statics in the routing table, and for some reason I was unable to reach SITE B during this time. A trace toward SITE B showed a routing loop between the external BGP neighbor interfaces: it bounced repeatedly between 192.168.200.1 and 192.168.200.2.

When I finally removed the statics, everything started working properly.

Can you help me understand the cause for this & what is a better practice for everything I want to accomplish here including failover?

Thank you all so very much! :)

Andy

Correct Answer
Edison Ortiz Mon, 06/08/2009 - 06:37

Andy,

Per your configuration, SITE A is also advertising 172.x.x.x in BGP since you have a redistribute static without a route-map.

I'm not sure what you have under the route-map 'localonly', but if you aren't blocking 172.x.x.x, then your peering ISP was learning those routes from SITE A, and perhaps that was the reason for this loop.

You can modify your configuration several different ways. Since you are using WEIGHT for path selection, you can create a route-map to match 172.x.x.x and set a higher WEIGHT when those routes come in via AT&T, while leaving the rest of the AT&T routes at WEIGHT 100.

For instance:

route-map NET172 permit 10
 match ip address NET172
 set weight 500
route-map NET172 permit 20
 set weight 100

neighbor 172.31.250.125 route-map NET172 in

and remove the static routes and distribute-list 55 from the router.
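As a rough sketch of how these pieces fit together on the SITE A router, the change and a quick check might look like the following. It assumes the NET172 route-map and a matching ACL named NET172 are already defined, and that the AT&T peer supports route refresh so a soft inbound clear works. The floating statics are not shown in this thread, so their removal is left out.

```
! Config mode: attach the inbound route-map to the AT&T neighbor and
! stop filtering SITE B prefixes learned from BellSouth.
router bgp 64702
 neighbor 172.31.250.125 route-map NET172 in
 no neighbor 192.168.200.1 distribute-list 55 in
!
! Exec mode: re-apply inbound policy on the AT&T session without
! tearing it down, then inspect the paths for one SITE B prefix.
clear ip bgp 172.31.250.125 soft in
show ip bgp 172.22.100.0
```

Once the statics are gone, the SITE B prefixes should carry weight 500 via AT&T and weight 400 via BellSouth (from the existing neighbor weight statement), so the BellSouth path is used only when the AT&T path is withdrawn.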

HTH,

__

Edison.

Thanks for your insight, Edison.

Can you look over this?

My local-only map:

ip as-path access-list 10 permit ^$
!
route-map localonly permit 10
 match as-path 10
!

(It was supposed to advertise only locally originated subnets, so that sites with 2 BGP connections do not become active transit AS sites.)

---------------------------------

I think I found another issue with my logic. The routes were modified since I am redistributing statics into BGP. I set the AD on the floating statics to 200, but I believe that when the BGP link failed the routes were now seen as a new locally originated route and they were set with a weight of 32768 into the routing table. Needless to say, this route wouldn't have been removed when BGP#2 (AT&T) was restored.

I forgot about this. Is this accurate?

-------------

I still don't know why the Bellsouth routing loop was created........

The route-maps sure look like the way to go.... Why do you show two different weights for the NET172 (500 & 100) --- Is that for all of the other AT&T routes that are not specifically covered in the "match ip address NET172" statement?

500 for the preferred SITE B subnet routes only and 100 for the rest?

Thanks for your expert insight and time.

It is truly appreciated. :)

Edison Ortiz Mon, 06/08/2009 - 08:02

Based on the as-path ACL, your statics were being advertised. BellSouth was accepting those routes as 'best routes' from you instead of AT&T because your routes had a shorter AS path.

However, you wanted to get those routes from AT&T, see the situation? - That's why the loop was created - you wanted to exit via BellSouth once the link was restored but BellSouth had your router as the best path (you announced the statics to them in the first place).

I show 2 different weights so you will prefer the AT&T link for NET172, while leaving the rest of the routes from AT&T less preferred than the BellSouth routes.
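If the floating statics were ever kept rather than removed, one way to stop them from being re-advertised would be to put a route-map on the static redistribution. This is only a sketch: the names SITEB-STATICS and STATIC-TO-BGP are made up for illustration, and it assumes the statics were configured as the same /24 subnets listed in access-list 55.

```
! Hypothetical prefix-list naming the SITE B subnets exactly.
ip prefix-list SITEB-STATICS seq 5 permit 172.22.100.0/24
ip prefix-list SITEB-STATICS seq 10 permit 172.22.101.0/24
ip prefix-list SITEB-STATICS seq 15 permit 172.22.102.0/24
ip prefix-list SITEB-STATICS seq 20 permit 172.22.103.0/24
ip prefix-list SITEB-STATICS seq 25 permit 172.22.104.0/24
ip prefix-list SITEB-STATICS seq 30 permit 172.22.107.0/24
ip prefix-list SITEB-STATICS seq 35 permit 172.16.9.0/24
!
! Drop the SITE B statics at redistribution; let other statics through.
route-map STATIC-TO-BGP deny 10
 match ip address prefix-list SITEB-STATICS
route-map STATIC-TO-BGP permit 20
!
router bgp 64702
 redistribute static route-map STATIC-TO-BGP
```

With the statics filtered at redistribution they never enter the BGP table as locally originated routes, so they would not pass the `^$` as-path filter toward either provider.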

HTH,

__

Edison.


Thanks again, Edison. I understand the reason for the routing loop that was present due to my error in judgement & configuration.

I recall reading that, in BGP, route-maps using the "match ip address" command can only filter outbound updates, not inbound ones. Please clarify this for me.

Can you please advise and help with the complete ACL and prefix-list syntax to ensure that I have it all correct?

Below are the subnets from SITE B that I want to ensure get marked with a higher weight when learned from AT&T on the SITE A router.

172.22.100.0 0.0.0.255

172.22.101.0 0.0.0.255

172.22.102.0 0.0.0.255

172.22.103.0 0.0.0.255

172.22.104.0 0.0.0.255

172.22.107.0 0.0.0.255

172.16.9.0 0.0.0.255

I truly appreciate your time & effort on helping me with this resolution.

Edison Ortiz Mon, 06/08/2009 - 09:53

1. You can modify BGP attributes on incoming routes with a 'match ip address' clause.

2.

ip access-list standard NET172
 permit 172.22.100.0 0.0.3.255
 permit 172.22.104.0 0.0.3.255
 permit 172.16.9.0 0.0.0.255
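Since a prefix-list was also asked about, here is a hedged alternative that matches only the seven /24s listed in the question. The wildcard 0.0.3.255 on 172.22.104.0 in the ACL also covers 172.22.105.0 and 172.22.106.0, which were not in that list. The prefix-list name NET172-PFX is made up for illustration, and the sketch assumes SITE B advertises these subnets as /24s.

```
! Hypothetical prefix-list: exact /24 matches only.
ip prefix-list NET172-PFX seq 5 permit 172.22.100.0/24
ip prefix-list NET172-PFX seq 10 permit 172.22.101.0/24
ip prefix-list NET172-PFX seq 15 permit 172.22.102.0/24
ip prefix-list NET172-PFX seq 20 permit 172.22.103.0/24
ip prefix-list NET172-PFX seq 25 permit 172.22.104.0/24
ip prefix-list NET172-PFX seq 30 permit 172.22.107.0/24
ip prefix-list NET172-PFX seq 35 permit 172.16.9.0/24
!
! Same route-map as before, matching the prefix-list instead of the ACL.
route-map NET172 permit 10
 match ip address prefix-list NET172-PFX
 set weight 500
route-map NET172 permit 20
 set weight 100
```

Either form should work with the inbound route-map. The prefix-list is stricter because it checks the prefix length as well as the network address.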
