This is my first post. Thanks for all of the great assistance and guidance on this forum. :)
BGP ROUTING ISSUE:
I had a routing problem this morning between SITE A and SITE B with the following issue:
SITE A -------
BGP NEIGHBOR #1 - 192.168.200.1 (BELLSOUTH)
BGP NEIGHBOR #2 - 172.31.250.125 (AT&T)
SITE B -------
BGP NEIGHBOR #1 - 10.2.250.9 (AT&T ONLY)
SITE A is weighted to prefer BGP NEIGHBOR #1 (BELLSOUTH) as primary, as follows:
router bgp 64702
 timers bgp 20 60
 redistribute eigrp 1
 neighbor 172.31.250.125 remote-as 13979
 neighbor 172.31.250.125 weight 100
 neighbor 172.31.250.125 route-map localonly out
 neighbor 192.168.200.1 remote-as 6389
 neighbor 192.168.200.1 password xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 neighbor 192.168.200.1 soft-reconfiguration inbound
 neighbor 192.168.200.1 weight 400
 neighbor 192.168.200.1 distribute-list 55 in
 neighbor 192.168.200.1 route-map localonly out
access-list 55 remark <==SITE B --- AT&T ROUTES==>
access-list 55 deny 172.22.100.0 0.0.0.255
access-list 55 deny 172.22.101.0 0.0.0.255
access-list 55 deny 172.22.102.0 0.0.0.255
access-list 55 deny 172.22.103.0 0.0.0.255
access-list 55 deny 172.22.104.0 0.0.0.255
access-list 55 deny 172.22.107.0 0.0.0.255
access-list 55 deny 172.16.9.0 0.0.0.255
access-list 55 permit any
I placed an ACL (distribute-list 55) inbound on BGP NEIGHBOR #1 (BellSouth) to keep the SITE B routes learned from that neighbor out of the table, so that SITE A would hold only the AT&T routes for SITE B.
This gives traffic from SITE A a faster path to SITE B, since SITE B only has AT&T, and keeps that traffic from having to traverse our remote datacenter core, hit a transit AS, and only then reach SITE B.
To protect this scenario and provide failover in case I lost AT&T at SITE A, I set up floating statics with an AD of 200 for all subnets at SITE B, to force traffic onto the BellSouth network and reach SITE B through our datacenter. (These floating statics were set up on the router at SITE A.)
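As a sketch, one of these floating statics would look something like the line below. The post does not show the actual statements, so the next hop (assumed here to be the BellSouth neighbor, 192.168.200.1) and the choice of 172.22.100.0/24 (taken from access-list 55) are assumptions. The trailing 200 is the administrative distance; because it is higher than the eBGP default of 20, the static should only be installed when no eBGP route for the prefix is in the routing table.

```
! Hypothetical floating static on the SITE A router (next hop is an assumption)
ip route 172.22.100.0 255.255.255.0 192.168.200.1 200
```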
AT&T had a brief BGP neighbor adjacency flap last night; the routes learned from AT&T were lost and the floating statics were installed in the local routing table. The problem is that after AT&T was restored, its lower-AD (eBGP) routes never replaced the statics in the routing table, and for some reason I was unable to reach SITE B during this time. A traceroute toward SITE B showed a routing loop between the external BGP neighbor interfaces, bouncing steadily between 192.168.200.1 and 192.168.200.2.
When I finally removed the statics, everything started working properly.
Can you help me understand the cause of this, and what a better practice would be for everything I want to accomplish here, including failover?
Thank you all so very much! :)
Per your configuration, SITE A is also advertising 172.x.x.x in BGP since you have a redistribute static without a route-map.
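One way to check this on the SITE A router is to look at a SITE B prefix in both the routing table and the BGP table, and at what is being sent to the BellSouth peer. The commands below are standard IOS show commands; 172.22.100.0 is simply one of the prefixes from access-list 55, used here as an example. A path in the BGP table with weight 32768 and an empty AS path would indicate a locally originated (redistributed) route, which IOS prefers over any neighbor-learned path because of its higher default weight.

```
! Which source currently owns the route (static with distance 200, or BGP)?
show ip route 172.22.100.0
! Which BGP path is marked best, and with what weight?
show ip bgp 172.22.100.0
! Is SITE A advertising the SITE B prefixes to BellSouth?
show ip bgp neighbors 192.168.200.1 advertised-routes
```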
I'm not sure what you have under the route-map 'localonly', but if you aren't blocking 172.x.x.x, then your peering ISP was learning those routes from SITE A, and perhaps that was the reason for this loop.
You can modify your configuration several different ways. Since you are using WEIGHT for path selection, you can create a route-map that matches 172.x.x.x and sets a higher WEIGHT when those routes come in via AT&T, while leaving the rest of the routes at WEIGHT 100.
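If localonly does turn out to let those prefixes through, a deny clause ahead of its existing entries would be one way to stop advertising them. This is only a sketch under assumptions: the prefix-list name SITEB-NETS and the sequence number 5 are made up for illustration, the prefixes are copied from access-list 55, and the existing clauses of localonly (not shown in the post) are assumed to start at a higher sequence number.

```
! Hypothetical prefix list covering the SITE B subnets from access-list 55
ip prefix-list SITEB-NETS seq 5 permit 172.22.100.0/24
ip prefix-list SITEB-NETS seq 10 permit 172.22.101.0/24
ip prefix-list SITEB-NETS seq 15 permit 172.22.102.0/24
ip prefix-list SITEB-NETS seq 20 permit 172.22.103.0/24
ip prefix-list SITEB-NETS seq 25 permit 172.22.104.0/24
ip prefix-list SITEB-NETS seq 30 permit 172.22.107.0/24
ip prefix-list SITEB-NETS seq 35 permit 172.16.9.0/24
!
! Deny those prefixes before any existing clauses of localonly are evaluated
route-map localonly deny 5
 match ip address prefix-list SITEB-NETS
```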
route-map NET172 permit 10
 match ip address NET172
 set weight 500
route-map NET172 permit 20
 set weight 100
router bgp 64702
 neighbor 172.31.250.125 route-map NET172 in
Then remove the static routes and distribute-list 55 from the router.
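The route-map above matches an access list named NET172 that is not defined in the post. A minimal sketch of it, assuming it should cover the same SITE B subnets listed in access-list 55, could be a named standard ACL like the one below. The soft clear at the end is one way to apply the new inbound policy without resetting the session; it assumes the AT&T neighbor supports route refresh.

```
! Hypothetical definition of NET172, using the prefixes from access-list 55
ip access-list standard NET172
 permit 172.22.100.0 0.0.0.255
 permit 172.22.101.0 0.0.0.255
 permit 172.22.102.0 0.0.0.255
 permit 172.22.103.0 0.0.0.255
 permit 172.22.104.0 0.0.0.255
 permit 172.22.107.0 0.0.0.255
 permit 172.16.9.0 0.0.0.255
!
! From exec mode: re-apply inbound policy to the AT&T neighbor
clear ip bgp 172.31.250.125 soft in
```

With this in place, the SITE B prefixes learned from AT&T carry weight 500, which beats the weight 400 applied to BellSouth, and the BellSouth path remains in the BGP table as the fallback if the AT&T session drops.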