Re: HEADS UP -- BGP Issue - Page 2

Mohamed Sobair · ‎02-19-2009

Hi,

Here is the scenario,

The ISP has two uplinks toward upstream providers. The first link is the Internet gateway link (main Internet GW); the second is the national link (to the national exchange).

The ISP has its own registered public AS number from RIPE. A private AS is used on the peering between the upstream national provider and the ISP, and it is removed when the ISP's network is announced to the national peers. The upstream national provider also has its own Internet link (used for its own traffic, not to connect the ISP to the outside world). The ISP uses the second upstream, the Internet GW, to announce its network and connect to the Internet.

The Problem:

The upstream national provider announced the ISP's network to the Internet by mistake, over the link that is meant for its own use. The ISP's public AS was removed when the network was announced to the Internet.

Now the outside world has two possible paths to reach the ISP's network; however, one of them announces the network without its registered public AS.

The other path (the main Internet GW link) announces the ISP's network with its registered public AS number.

What happened is that the ISP's network was down for a while. Why? The Internet preferred the national link path to reach the ISP's network because it has the shortest AS path, even though the ISP's AS is not announced on it.
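To make that selection step concrete, here is a rough Python sketch. It is only an illustration under assumptions: the AS numbers are made up (taken from the range reserved for documentation, not the real ones in this case), and real BGP compares other attributes such as local preference before it looks at AS path length.

```python
# Hypothetical AS numbers from the documentation range; not the real ones.
ISP_AS = 64496
IGW_UPSTREAM_AS = 64497
NATIONAL_PUBLIC_AS = 64498

# Two announcements for the same ISP prefix, as a remote router might see them.
candidates = [
    # Legitimate path: the ISP's registered AS is the origin (last entry).
    {"via": "main IGW link", "as_path": [IGW_UPSTREAM_AS, ISP_AS]},
    # Leaked path: the ISP's AS is missing, so the path is one hop shorter.
    {"via": "national provider leak", "as_path": [NATIONAL_PUBLIC_AS]},
]


def prefer_shortest_as_path(routes):
    """Pick the route with the fewest ASes in its path (simplified BGP step)."""
    return min(routes, key=lambda route: len(route["as_path"]))


best = prefer_shortest_as_path(candidates)
print(best["via"], best["as_path"])
```

In this sketch the leaked announcement wins purely on length; nothing in the comparison looks at which AS the prefix is registered to.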

My question: since a border router (BGP router) uses AS attributes to reach a particular destination, how does it make sure that a particular network belongs to a particular registered AS number? Why is the second link not considered, although it announces the ISP's public AS?

Regards,

Mohamed

marikakis · ‎02-20-2009

OK, then. I just was not sure which side you were standing on. Since it's other people's job to decide this, we cannot do much, I suppose. That was a nice case you posted. Thanks. I totally enjoyed it :-)

Edit: A typical case for a provider is this: a customer initially requires national connectivity only. At some point the customer changes its mind and asks for international connectivity. The ISP goes to the router(s) connected to its IGW and allows the customer's routes to pass to the upstream.

In your scenario, a case like this could cause the meltdown. The ISP removes the private AS towards the IGW to accomplish the announcement for a particular customer, and all customers are affected. This is a wild guess, but certainly possible. This is not just poor design. It is also not flexible enough to handle various common customer scenarios.

Mohamed Sobair · ‎02-20-2009

Maria,

The ISP has one physical link but two separate PVCs, one for the national link and one for the IGW link. So the national link should be isolated from the IGW!

Mohamed

marikakis · ‎02-20-2009

I think we have a misunderstanding. I used the word "ISP" in my previous post, while I was referring to the national provider. Sorry about that. I am referring to the design of the national provider.

Edit: Ignore that post. I am re-posting it corrected.

A typical case for a provider is this: a customer initially requires national connectivity only. At some point the customer changes its mind and asks for international connectivity. The national provider goes to the router(s) connected to its IGW and allows the customer's routes to pass to the upstream.

In your scenario, a case like this could cause the meltdown. The national provider removes the private AS towards its IGW to accomplish the announcement for a particular customer, and all customers are affected. This is a wild guess, but certainly possible. This is not just poor design. It is also not flexible enough to handle various common customer scenarios without being error-prone.

ralwarrag · ‎02-21-2009

Hi everybody

The only thing I couldn't understand is how the RIPE database is checked and who checks it, I mean which router on the Internet. I believe that for every prefix you announce to the Internet you have to create a route object for it in the RIPE database (webupdates) and tie it to your AS number. But what is the mechanism for checking that each prefix is announced from its corresponding AS, to avoid what happened with Pakistan Telecom and the YouTube website last year?

Thanks a lot

Giuseppe Larosa · ‎02-21-2009

Hello Rashed,

the problem is actually this:

The BGP protocol by itself performs some checks, but a router running BGP doesn't verify with RIPE or any other RIR whether an advertisement is correct, or whether a given prefix is advertised with its legitimate origin AS.

A BGP router by itself just checks that its own AS is not present in the AS path (loop avoidance) and that it can reach the BGP next hop.
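A minimal Python sketch of those two built-in checks follows. The function name, the AS numbers and the addresses are hypothetical (documentation ranges), and a real router does considerably more when it processes an update.

```python
import ipaddress


def accepts_update(local_as, as_path, next_hop, reachable_networks):
    """Hypothetical helper mirroring the two checks described above."""
    # Loop avoidance: reject the update if our own AS is already in the path.
    if local_as in as_path:
        return False
    # Next-hop check: the next hop must fall inside a network we can reach.
    hop = ipaddress.ip_address(next_hop)
    return any(hop in ipaddress.ip_network(net) for net in reachable_networks)


reachable = ["198.51.100.0/24"]

# Accepted: no loop and the next hop is reachable, whoever the origin AS is.
print(accepts_update(64500, [64498], "198.51.100.1", reachable))
# Rejected: our own AS (64500) already appears in the AS path.
print(accepts_update(64500, [64498, 64500], "198.51.100.1", reachable))
# Rejected: the next hop is not in any reachable network.
print(accepts_update(64500, [64498], "203.0.113.1", reachable))
```

Note that neither check asks whether the origin AS is the one the prefix is registered to, which is the gap discussed in this thread.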

Additional sanity checks should be performed:

It is easier to apply them near the edge, close to the leaf ASes, the ones that are just multihomed but don't provide transit service to any other AS.

So the provider of a customer can and should verify that it is receiving only the prefixes associated with that customer.
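As a sketch of what such a per-customer check amounts to, here is an exact-match prefix filter in Python. The prefix list and the function name are assumptions for illustration (documentation prefixes); on a real router this would be a prefix-list or route policy applied to the customer session.

```python
import ipaddress

# Hypothetical list of prefixes the provider agreed to accept from one customer.
CUSTOMER_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]


def accept_from_customer(prefix, allowed=CUSTOMER_PREFIXES):
    """Exact-match filter: accept only prefixes on the agreed list."""
    return ipaddress.ip_network(prefix) in allowed


print(accept_from_customer("192.0.2.0/24"))    # the agreed prefix
print(accept_from_customer("192.0.2.0/25"))    # a more-specific, not on the list
print(accept_from_customer("203.0.113.0/24"))  # somebody else's prefix
```

Only the first announcement passes; the more-specific and the foreign prefix are both dropped at the edge before they can spread further.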

But then that ISP needs to interconnect with others.

Hope to help

Giuseppe

marikakis · ‎02-21-2009

Giuseppe is right. The basic question here is why everybody doesn't check the RIR databases. Answers have been provided previously. I will try to sum up.

1. In the transit core of the Internet, where big providers interconnect, there is mutual trust. This is a practical issue. There are so many networks and ASes, and their policies and connectivity change often. A big provider cannot check everybody at every single moment, even with automated systems, and adjust accordingly.

2. We go to the leaves. Typically we have a customer AS connecting to an ISP, with that ISP in turn connecting to an upstream (bigger) provider. The ISP and the upstream have to check, but not all of them use prefix filters everywhere, because they consider it an administrative burden. PCCW was not using prefix filters, so it could not catch the illegitimate prefix.

3. Suppose that we have a perfect world and ISP and upstream are using tight as-path filters and prefix-filters. Even in this case, there are customer scenarios (especially failover ones) that cannot be verified against the RIR database and are up to leaf systems to decide. I will try to help you understand this third point with a couple of examples.

The first example would be the primary case in this thread you are reading. The ISP connects to national provider using private AS. The national provider advertises the ISP networks towards the national peers as being their own and they agreed with the ISP customer on this. They did not agree for those networks to go further to the internet, just to the national peers. If the national peers check the database, this scenario will not work. The national peers will say to the national provider: hey, those networks are assigned to the ISP, not you! I am not saying that the scenario cannot work in any other (even better) way, but sometimes various cost factors lead to weird setups.
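To show why a database check would break this setup, here is a small Python sketch of an origin check. The dictionary is only a stand-in for registry route objects, and the prefix and AS numbers are made-up documentation values, not the real ones.

```python
import ipaddress

# Hypothetical stand-in for registry route objects: prefix -> registered origin AS.
ROUTE_OBJECTS = {ipaddress.ip_network("192.0.2.0/24"): 64496}  # registered to the ISP


def origin_matches_registry(prefix, as_path, registry=ROUTE_OBJECTS):
    """True if the origin (last AS in the path) is the registered origin AS."""
    registered = registry.get(ipaddress.ip_network(prefix))
    return registered is not None and as_path[-1] == registered


# Announcement via the IGW upstream: the ISP's own AS (64496) is the origin.
print(origin_matches_registry("192.0.2.0/24", [64497, 64496]))
# Announcement by the national provider (64498) as if the network were its own.
print(origin_matches_registry("192.0.2.0/24", [64498]))
```

The second announcement fails the check even though the ISP agreed to it, which is exactly the kind of arrangement that cannot be verified against the database.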

And now another story that I was trying to forget, because it caused me humiliation in my country's national exchange (I actually remember my mistakes better than anything else :-). A customer comes and says they have a network for us to announce to national peers only. I check the network and I see it's not theirs, but is rather part of a block of one of our national peers. I start arguing with the TAC. They contact the customer and in the end the TAC says: the customer has arranged this with their primary provider (the national peer). Our announcement will be used in the national exchange only. Now, this is a weird setup, but since it won't go further to the Internet, we could just do it and tell the national peers that we have an exit point to somebody else's network, assuming that this somebody else agreed. The customer said so, but that was a case of sales not talking to their engineers. When I sent the e-mail to the national peers, the national peer that owned the block gave me a shock: This is one of our blocks! Don't do this!

Ah, the world is far from perfect. This weird setup could work. They just did not accept it. They were right in general. We should not be doing things just because a customer says so. Still, it's sometimes hard to resist customer requests. This makes scenarios possible that cannot be checked against the databases (assuming that the databases are current, which is another story).

p.s. The humiliation story had to do with the fact that not every customer has its own block assigned. This has to do with block assignment policies of RIRs. Anyone interested in IPv6? ;-)