
BGP - 3 isps, 1 router doesn't want to play nicely..

gskhanna

I have 3 local routers connecting to 3 different providers. All 3 are connected to each other with iBGP just fine, sharing all the routes. Routers 1 and 2 are also connected to their ISPs with eBGP and get routes just fine.

Now when Router 3 connects to its ISP, the session establishes and it gets all the routes from its ISP, but then the iBGP sessions to that router go all weird. All the iBGP routers send withdrawal updates and remove all the routes, then resend them all again. Over and over: send, withdraw.

However, the BGP sessions between it and the iBGP peers never go down, and the session between Router 3 and its ISP peer doesn't go down either; the routes learned from the ISP stay.

Ideas?

I am using basic config of:

neighbor <ibgprouter> remote-as 1
neighbor <ibgprouter> remote-as 1
neighbor <isprouter> remote-as 2

Please any suggestions?

Accepted Solution

Second Question:

Can you point me to a sample config on how to have routes going to a certain AS take the path out a certain router?

--

There aren't any that I know of, or that I could find by poking around the CCO docs.... What you want to do is use an AS path access-list to match the routes learned from a provider by path length, and then set the local preference on a match. So, say your two providers are AS65000 and AS65001, and you want to push some of the traffic over to AS65000. You can make it so AS65000 is preferred for paths of up to 3 hops, even if the connection through AS65001 would normally be preferred:

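! Match AS paths of one, two, or three AS hops
ip as-path access-list 100 permit ^[0-9]+$
ip as-path access-list 100 permit ^[0-9]+_[0-9]+$
ip as-path access-list 100 permit ^[0-9]+_[0-9]+_[0-9]+$
!
! Raise local preference for matching routes (the default is 100)
route-map prefer65000 permit 10
 match as-path 100
 set local-preference 110
! Accept everything else unchanged; without this clause the route-map's
! implicit deny would drop all non-matching routes
route-map prefer65000 permit 20
!
router bgp x
 neighbor x.x.x.x remote-as 65000
 neighbor x.x.x.x route-map prefer65000 in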

(from memory, not on a router, so I could have something wrong here)

This should match any route learned from AS65000 with an AS path of up to three hops, and force that traffic to go through AS65000, even if you have a two hop route going through AS65001. You can adjust the outbound traffic by adjusting how many path lengths the as path access list covers: stop at the two hop statement, and only one and two hop paths are preferred through AS65000; stop at three, and three hop paths are included, etc. So, for instance, if you want to influence more distant networks, where there's less likelihood of suboptimal routing, then add statements matching four and five hop paths.
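A rough sketch of how the policy could be applied and checked once it is configured. The neighbor address x.x.x.x and the prefix y.y.y.y are placeholders, and the soft clear assumes the peer supports route refresh (or that soft-reconfiguration inbound is enabled for it):

```
! Re-run inbound policy for the AS65000 neighbor without resetting the session
clear ip bgp x.x.x.x soft in
!
! List the BGP routes whose AS path is permitted by as-path access-list 100
show ip bgp filter-list 100
!
! Inspect one prefix and look for "localpref 110" on the path via AS65000
show ip bgp y.y.y.y
```

If the path through AS65000 shows localpref 110 and is marked best, the route-map is doing what was intended for that prefix.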

I hope this makes sense....

:-)

Russ.W

33 Replies

aretana

Turn synchronization off: "no sync" under router bgp.

By default, the BGP process tries to find a matching route in the routing table (from an IGP) to validate iBGP-learned routes...if that doesn't happen then the routes are not used. It is common for an ISP to announce a block that covers the next-hop...this contributes to the behavior you're seeing: the routes get installed (using the BGP route to resolve the next-hop), but then they get withdrawn because the next-hop is no longer reachable (outside BGP), etc...

'no sync' should solve the problem. ;-)

Alvaro.

Actually sorry I forgot to put that here.

router bgp 1
 neighbor <ibgprouter> remote-as 1
 neighbor <ibgprouter> remote-as 1
 neighbor <isprouter> remote-as 2
 no synchronization
 bgp log-neighbor-changes
 no auto-summary

I have that set already also.

I do not have the neighbor command with next-hop-self, but I'd think that should not affect this config.

If 'no sync' is already there (in all routers), then...

Setting the next-hop to self assures that the next-hop is known via the IGP. If the next-hop is not known via the IGP *and* (as I mentioned in my last post) a route from this ISP covers the next-hop, then you can get the bouncing behavior as well.

So...the next step is to either make sure that the next-hop is reachable through the IGP OR use next-hop self.
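A minimal sketch of the next-hop-self option, reusing AS 1 and the `<ibgprouter>` placeholders from the config posted above. It would go on each router that has an eBGP session, toward each of its iBGP peers:

```
router bgp 1
 ! Advertise eBGP-learned routes to iBGP peers with this router as the next hop
 neighbor <ibgprouter> next-hop-self
 neighbor <ibgprouter> next-hop-self
```

The alternative is to carry the ISP-facing link subnet in the IGP so the original next hop resolves without BGP; either way the goal is that the next hop never depends on a BGP-learned route.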

After this, I'll probably need to take a look at the actual routes...

Alvaro.

Hmmm.... It's most likely the same problem, but for a different reason. Is it the iBGP learned routes that are flapping, or the eBGP learned routes? If it's the iBGP learned routes, then somehow you are learning, through iBGP itself, a route to the iBGP peering address that's better than the IGP learned route to that address. I would check the routing table on the 3rd router for the route to each of the iBGP peers every 60 seconds or so, to see if it points to an iBGP route intermittently, and where it's pointing when the routes are up.
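One way to run that check from the 3rd router, sketched with x.x.x.x standing in for an iBGP peer's address:

```
! Which source supplies the route to the iBGP peer? Check the "Known via" line
show ip route x.x.x.x
!
! Is there also a BGP path covering the peer's address?
show ip bgp x.x.x.x
```

If the first command ever reports the route as known via "bgp" rather than connected or the IGP, the session is being resolved through a route it carries itself, which is the kind of recursion to look for.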

I'm almost certain it's a recursion problem, in any case, we just have to figure out where the recursion is, and stop it.

Russ.W

Okay, I will try turning on next-hop-self and see if that fixes it. Only the iBGP routes are affected, so it seems the eBGP routes might have a better metric than the internal iBGP ones, and that's why it's flapping. I think next-hop-self will help, as all the routers are connected through an internal LAN.

One question: currently router 1 has a 500 Mb/s link to Yipes, router 2 has 300 Mb/s to Cogent, and router 3 has 100 Mb/s to XO. How can I tell it to use more of link 1 vs. link 2, etc.? Is there an easy way to load share in this environment?

I was told the only way to do it, crudely, is to make certain bigger ASes appear closer to one router or another to help shape the traffic that way.

Ideally I just don't want one link to be maxed while the other two are idle.

Thanks for your help btw.

-GK

"Easy way"...?? No, not currently. The as-path prepend mechanism is pretty much it, but it requires constant tuning as traffic characteristics change.

The "easy way" is called OER. Take a look at the slides in my Networkers' presentation: ftp://ftp-eng.cisco.com/alvaro/4004.pdf

We're currently targeting an official release for sometime next year...
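For reference, the as-path prepend mechanism mentioned above can be sketched as follows. This assumes the local AS is 1 (as in the config posted earlier); the route-map name `prepend-out` and the neighbor address are placeholders. Note that prepending on outbound announcements makes this link look longer to the rest of the Internet, so it steers inbound traffic away from the link rather than outbound traffic:

```
route-map prepend-out permit 10
 ! Add the local AS twice to everything announced to this provider
 set as-path prepend 1 1
!
router bgp 1
 neighbor x.x.x.x remote-as 2
 neighbor x.x.x.x route-map prepend-out out
```

The number of prepends is the knob that needs retuning as traffic patterns change.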

Alvaro.

Well, still no go.

Did notice a few things though.

First off, I have no sync on both sides, no auto-summary, and barebones configs.

If I set next-hop-self on both sides, they connect fine over iBGP. But then when I add the XO (3rd router) eBGP session to its ISP, it gets all the routes and for about 1 min all looks good, and then boom: withdraw statements. I was able to capture the debug for it:

*Sep 30 01:17:20: BGP(0): 66.90.64.50 NEXT_HOP part 1 net 193.110.159.0/24, next 66.237.108.29
*Sep 30 01:17:20: BGP(0): 66.90.64.50 send UPDATE (format) 193.110.159.0/24, next 66.237.108.29, metric 3, path 2828 701 702 8513 9154 24770
*Sep 30 01:17:20: BGP(0): 66.90.64.50 NEXT_HOP part 1 net 198.63.206.0/24, next 66.237.108.29
*Sep 30 01:17:20: BGP(0): 66.90.64.50 send UPDATE (prepend, chgflags: 0x208) 198.63.206.0/24, next 66.237.108.29, metric 3, path 2828 2914 27369
*Sep 30 01:17:20: BGP(0): 66.90.64.50 5 updates enqueued (average=66, maximum=69)
*Sep 30 01:17:20: BGP(0): 66.90.64.50 update run completed, afi 0, ran for 0ms, neighbor version 267773, start version 267779, throttled to 267779
*Sep 30 01:17:21: BGP(0): 66.90.64.50 rcv UPDATE about 81.161.240.0/20 -- withdrawn
*Sep 30 01:17:21: BGP(0): 66.90.64.50 rcv UPDATE about 81.161.248.0/21 -- withdrawn
*Sep 30 01:17:21: BGP(0): 66.90.64.50 rcv UPDATE about 82.146.16.0/21 -- withdrawn
*Sep 30 01:17:21: BGP(0): 66.90.64.50 rcv UPDATE about 82.146.24.0/24 -- withdrawn

66.90.64.50 = ibgp peer (2nd router that's working fine :) )

66.237.108.29 = ebgp to xo isp router.

this 3rd router is 66.90.64.51.

I am not sure if you can tell anything from that, but you never know.

Also, what I noticed was that after about 5-10 min of this, it seemed to just stop and was sharing the link load, but the 3rd router was only showing 28K routes from iBGP, whereas it should have been showing the full table of ~110K like it was from the eBGP ISP side.

Also, one odd thing i saw, was that when I did:

XO#sh ip bgp 66.90.64.0
BGP routing table entry for 66.90.64.0/19, version 17797
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  66.237.108.29
  Local
    0.0.0.0 from 0.0.0.0 (66.237.109.41)
      Origin IGP, metric 0, localpref 100, weight 32768, valid, sourced, local, best

That's from router 3 (the 109.41 address is on the LAN interface along with 66.90.64.51).

Currently I have BGP off; otherwise it would also show the proper route for 66.90.64.0/19 through iBGP, but it does not show that as the best path, nor does it show it with any weight!

66.90.64.0/19 is our block we are advertising everywhere.

My other router, #2 (66.90.64.50), the iBGP peer, shows this:

Kingcomp1#sh ip bgp 66.90.64.0
BGP routing table entry for 66.90.64.0/24, version 18069
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Not advertised to any peer
  Local
    66.90.64.49 from 66.90.64.49 (209.120.155.14)
      Origin IGP, metric 0, localpref 100, valid, internal, best

Now, I know that XO is advertising 66.90.0.0/18 through their router. Can that be conflicting and causing this problem, as our block is 66.90.64.0/19?

Thanks for all your help in this guys. This is really frustrating and I don't have anyone left to ask help from :(

-GK

anyone please?

It is possible that what you are seeing is normal BGP behaviour, especially given that it stabilises after about 10 minutes.

Withdraws are very common on IBGP, since you usually compete with the IBGP routes of the other routers. If the route you advertised over IBGP turns out to be inferior to theirs, then you send an IBGP withdrawal for the route that you sent.

Therefore, before you bring up the 3rd ISP, with all 3 IBGP peers in place, the 3rd router will have a full IBGP table. When the 3rd ISP EBGP peer is added, the third router will now be competing with the other two routers on IBGP. It will win on some IBGP routes usually through AS PATH, and lose on others, thereby ending up with an IBGP routing table that is smaller than its EBGP table.

Regards

Ian

After it stabilizes, how many of the XO router's peers are using routes through the XO router, rather than through their own eBGP connection? I would guess that you are seeing the normal settling, though it's taking longer than I would have expected, and then the other routers are choosing the XO router as their exit point for all but about 28k of their routes, so they stop advertising the full table to the XO router (split horizon: a best path learned over iBGP is not advertised back to iBGP peers).

You could check this by checking some of the routes for which the XO router has eBGP routes, and not iBGP routes. If you find a couple of these, then check them in the iBGP peers of the XO router. If they are using the XO router as their best path towards those destinations, then they won't advertise the route to the XO router.
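A sketch of that check using addresses already posted in this thread: 66.90.64.51 is the XO router, 66.90.64.50 is router 2, and 193.110.159.0 is one of the prefixes from the debug output (any prefix the XO router holds only via eBGP would do):

```
! On the XO router: are there both eBGP and iBGP paths, or only the eBGP one?
show ip bgp 193.110.159.0
!
! On router 2 (66.90.64.50): is the best path the one through the XO router?
show ip bgp 193.110.159.0
!
! On router 2: what is it currently advertising to the XO router?
show ip bgp neighbors 66.90.64.51 advertised-routes
```

If router 2's best path for the prefix is the internal one learned from 66.90.64.51 and the prefix is missing from the advertised-routes list, the smaller iBGP table is expected behavior.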

Russ.W

Hmm I see. I will give it a try tonight and check the routes. I just figured that it would actually keep all the routes from each peer in memory and the routing table would just be "best" path table.

Also, I didn't realize XO had that many better routes than Yipes and Cogent (our 2 other providers). Just noticing they are 1 hop from the AT&T AS. That might be why :)

As was said earlier, there is no 'easy' way to shape traffic on unequal links. What is the proper way to do it?

I was told to put bigger ASes on my bigger links, so more traffic would go out there. Can you point me to a sample config on how to do this please?

Thanks guys. I really thought something was messed here, but seems it is working like it's supposed to :/

It could be that XO is closer to some of the major providers than your other providers, and this is skewing routing towards them... If this is the case, there are several things you can do to route more traffic out towards the other two providers. For instance, you could set the local pref on some of the routes you're learning from the other providers higher, based on a range of addresses, or a given as path length.
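A minimal sketch of the address-range variant, to be applied on the router facing one of the other two providers. The prefix 198.51.100.0/24 is a documentation placeholder, the names `prefer-here` and `raise-pref` are made up for illustration, and AS 1 is taken from the config posted earlier:

```
! Destinations that should leave through this router's provider
ip prefix-list prefer-here seq 5 permit 198.51.100.0/24
!
route-map raise-pref permit 10
 match ip address prefix-list prefer-here
 set local-preference 120
! Accept all other routes unchanged
route-map raise-pref permit 20
!
router bgp 1
 neighbor x.x.x.x route-map raise-pref in
```

Because local preference is carried to all iBGP peers, the other two routers would then also send traffic for the listed prefixes toward this router. Only one inbound route-map can be attached per neighbor, so this would need to be merged with any existing inbound policy for that peer.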

But you'll have to do some manual tuning if you want the traffic to be loosely balanced, or wait until something like Optimized Edge Routing is available. :-)

Russ.W

I believe you are absolutely correct. XO seems to have much better peering with AT&T/Sprint and some other major backbones when compared to my other providers.

OER sounds great, but I need something till then :)

Two other questions on this topic:

Why does it take so long to converge? I.e., I see the router get the entire routing table from the iBGP peer, then drop down to 30k routes, then go all the way back up to 110k, then back down, 2-4 times before it settles at 30k.

Can you point me to a sample config on how to have routes going to a certain AS take the path out a certain router?

You asked:

Why does it take so long to converge? I.e., I see the router get the entire routing table from the iBGP peer, then drop down to 30k routes, then go all the way back up to 110k, then back down, 2-4 times before it settles at 30k.

--

I'd guess it's an issue with queueing, rather than the number of routes you're sending. Two things you could try to speed up convergence: increase your input queues, and turn on path MTU discovery for the TCP implementation on the router.

http://www.cisco.com/en/US/tech/tk365/tk80/technologies_tech_note09186a00801c4f48.shtml#iptcpmtudisc

Isn't directly related, but it's in the right area anyway (and the other things listed on this page might actually help, as well). The second thing would be to reduce the minadvinterval, especially since you aren't transiting any traffic.

http://www.cisco.com/en/US/products/sw/iosswrel/ps1828/products_command_summary_chapter09186a00800f0ab2.html#5335

I'd feel pretty safe setting it down to 1 second in this network. See if these make any difference, along with the other issues you're having with the write queue getting stuck.
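Put together, the three suggestions might look roughly like this. The interface name and peer addresses are placeholders, AS 1 comes from the config posted earlier, and the hold-queue depth of 1000 is only an illustrative value, not something measured on this network:

```
! Let TCP sessions originated by the router discover the path MTU
ip tcp path-mtu-discovery
!
! Deeper input queue on the LAN interface that carries the iBGP sessions
interface <lan-interface>
 hold-queue 1000 in
!
router bgp 1
 ! Send updates to the iBGP peers at most one second apart
 neighbor <ibgprouter> advertisement-interval 1
 neighbor <ibgprouter> advertisement-interval 1
```

Path MTU discovery only takes effect for sessions established after it is enabled, so the BGP sessions would need to be reset to pick it up.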

:-)

Russ.W
