BGP route-reflectors and MPLS - suboptimal path.

Hello everybody,

I'm quite lost and need some good advice about my network topology.

Please have a look at the picture in the attachment.

We have 4 routers physically connected in a ring; three of them have an eBGP session with an upstream ISP.

The two RC-RR routers are route reflectors, and all other routers have BGP sessions with them using their loopback IPs as the source.

Because of speed and price, the connection RC-E001 <--> RC-RR1 is a backup, and the OSPF and BGP metrics are set accordingly.

Internal routing works as expected. All routers are MPLS "P" routers, but only loopback IPs are label-switched, meaning that only traffic to a loopback follows a label-switched path; other traffic should use the normal routing table.

The problem is the following: traffic to the Internet from router RC-E001 follows the path RC-E002 ---> RC-RR2 ---> RC-RR1,

but it should just go to router RC-E002 and then directly to the Internet. All external prefixes on RC-E001 have RC-RR1 as the next hop (higher local preference).

Traceroute on RC-E001 shows the following:

RC-E001#traceroute 8.8.8.8    

  1 RC-E002 [MPLS: Label 202 Exp 0] 16 msec 20 msec 60 msec

  2 RC-RR2 [MPLS: Label 79 Exp 0] 20 msec 16 msec 20 msec

  3 RC-RR1 [AS UPSTREAM] 20 msec 16 msec 20 msec

  4 UPSTREAM [AS UPSTREAM] 20 msec 16 msec 20 msec

  5 ....

I understand that RC-E001 tries to reach the BGP next hop via the MPLS label-switched path, because all loopbacks are label-switched, but I don't want the traffic to take such a suboptimal path.

What have I configured wrong, and what should I do to force traffic from RC-E001 to go out directly via RC-E002?

Best regards,

Konstantin


Vaibhava Varma
Level 4

Hi Konstantin

The attachment does not show a complete diagram, just a partial arrow. Can you please provide the updated diagram so we can better understand the network topology?

Are you using NHS on the RRs? By default NHS does not apply to routes an RR reflects, even when it is configured. If RC-E002 also has an upstream ISP peering, like RC-RR1 and RC-RR2, then for RC-E001 to choose RC-E002, the BGP attributes of the routes injected by RC-E002 have to be better than those of RC-RR1/RR2, so that RC-RR1/RR2 choose RC-E002's routes as best and do not announce their own.

Regards

Varma

Oh, sorry, only the selected objects were saved. Here is the full diagram.

What is "NHS"?

Hi Konstantin

Thanks for providing the diagram. By NHS I meant Next-Hop-Self.

If I understand correctly, looking at the topology depicted above, RC-E002 is also an RR client. Am I correct? RC-E001 only peers with the RRs RC-RR1 and RC-RR2, right?

In order for RC-E001 to prefer RC-E002 as the exit point for Internet traffic, we need to make RC-E002's BGP routes more preferred than RC-RR1's, so that RC-RR1 reflects RC-E002's routes to RC-E001 and does not advertise its own. This can be achieved by increasing the LP of RC-E002's routes.

Hope this helps with your query.

Regards

Varma

Hi Varma,

The problem is not the BGP attributes; it's the LSP that causes the problem for me. See Matthew's answer.

Hi Konstantin

The problem here is that RC-E002 is not the best candidate for Internet traffic; instead, RC-RR1 is advertising the best routes. The LSP only comes into play after the best route has been selected, as the way to reach the peer that advertised it. Even peering RC-E002 directly with RC-E001 will not solve the issue: the default LP for routes learnt from RC-E002 will be 100, whereas the LP of routes learnt from RC-RR1 will be 150, which is better, so RC-RR1's routes are still selected for Internet traffic.

Regards

Varma

Hmm, I can't agree with that.

If there were no MPLS in the network, traffic would be routed hop by hop.

That means RC-E001 would still see RC-RR1 as the next hop for all external prefixes, but on the way there the traffic would be routed by RC-E002, and RC-E002 would simply send the external traffic directly to the upstream. The LP doesn't play any role here, because RC-E001 has no BGP session with RC-E002 and doesn't get any prefixes from it.

But you're right: if I set up a BGP session between RC-E001 and RC-E002, then I can set the LP accordingly. I should think about it, because it means "small" changes in the design. RC-E002 was not supposed to be an RR, but maybe it's a good idea to create a second level of RR sessions.

Hi Konstantin

I agree with you and Matthew that MPLS contributes to this issue. Actually, this is the basis for deploying a BGP-free core. I just overlooked that.

We have two options here:

1. Make RC-E002 an RR client and set the LP to 200 for RC-E002's routes, so that RC-RR1 reflects RC-E002's path as the best path with the next hop set to RC-E002.

2. Peer RC-E002 and RC-E001 directly and again set the LP to 200 for RC-E002's routes.
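
To illustrate why both options rely on raising the LP, here is a minimal Python sketch of the local-preference comparison only. It is a toy model, not router configuration: the `Path` class and `best_path` function are made up for illustration, the LP values 100, 150 and 200 are the ones quoted in this thread, and all later tie-breakers of the real BGP decision process are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One candidate BGP path for a prefix (hypothetical, simplified)."""
    next_hop: str
    local_pref: int


def best_path(paths):
    """Pick the path with the highest local preference.

    Only this one step of BGP best-path selection is modelled;
    AS-path length, origin, MED and so on are ignored.
    """
    return max(paths, key=lambda p: p.local_pref)


# Today: RC-RR1's routes carry LP 150, RC-E002's routes the default 100.
current = [Path("RC-RR1", 150), Path("RC-E002", 100)]

# Proposed: RC-E002's routes are raised to LP 200.
proposed = [Path("RC-RR1", 150), Path("RC-E002", 200)]

print(best_path(current).next_hop)
print(best_path(proposed).next_hop)
```

With the current values the next hop stays RC-RR1; only once RC-E002's routes carry a higher LP does RC-E002 become the BGP next hop, and with it the tail end of the LSP.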

Regards

Varma

Ok, I see your point.

A couple of minutes ago I posted the idea for my new design with 2 levels of RR. Do you think it makes sense?

Hi Konstantin

In my opinion, in any hierarchical topology the top level serves the bottom level, and the clients connect to the bottom level.

So I think the best way would be to keep RC-E002 at the topmost level, make RC-RR1 and RC-RR2 its clients, and keep RC-E001 as before, as a client of RC-RR1 and RC-RR2.

In my personal opinion, making a client peer with RRs at different levels won't add anything from a traffic transport perspective.

Regards

Varma

You've misunderstood it a little bit.

Just to clarify: currently in our network RC-RR1 and RC-RR2 are the route reflectors, and all other routers are route-reflector clients of both of them. Not the other way around.

Hi Konstantin

NHS stands for next-hop-self. If you are using a transit/border router as an RR, you might notice some of your next hops get trampled; you can fix this manually with a route-map that sets the next hop if you wish.

I can't see the diagram (using the app), but I would give good odds that this is it:

When the ingress LSR (the router you are tracing from) does a lookup, it sees that the next hop recurses to a label-switched path (LSP).

With MPLS forwarding, the packet is not routed hop by hop. Instead, the packet is put onto a predetermined path to the next-hop router.

When the packet arrives there, the router sees a plain IP packet (because of PHP, penultimate hop popping) and sends it on a new predetermined path towards the upstream.

Have a look at the next-hop values; I'm confident you'll see this is causing the issue. As mentioned, you can manipulate the next hop the RR is advertising to correct it.
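
The difference between the two forwarding modes can be sketched with a small Python toy model. Everything in it is an assumption for illustration: the router names and hop order come from the traceroute in the original post, `PREFERS_OWN_UPSTREAM` assumes that each router with an eBGP session would use its own exit if it did the IP lookup itself, and the function names are made up.

```python
# Toy model of the ring from the original post (not router configuration).
# Assumption: the OSPF next hops toward RC-RR1's loopback follow the
# traceroute, because the direct RC-E001 <--> RC-RR1 link is a backup.
IGP_NEXT_HOP_TO_RR1 = {
    "RC-E001": "RC-E002",
    "RC-E002": "RC-RR2",
    "RC-RR2": "RC-RR1",
}

# Assumption: these routers would choose their own eBGP exit if they
# performed the IP lookup for the Internet destination themselves.
PREFERS_OWN_UPSTREAM = {"RC-E002", "RC-RR2", "RC-RR1"}


def forward_ip(src):
    """Plain IP: every router on the way does its own route lookup."""
    path = [src]
    node = src
    while node not in PREFERS_OWN_UPSTREAM:
        node = IGP_NEXT_HOP_TO_RR1[node]
        path.append(node)
    path.append("UPSTREAM")
    return path


def forward_mpls(src, bgp_next_hop):
    """Label switching: transit routers only swap labels, so the next
    IP lookup happens at the BGP next hop (the tail end of the LSP)."""
    path = [src]
    node = src
    while node != bgp_next_hop:
        node = IGP_NEXT_HOP_TO_RR1[node]
        path.append(node)
    path.append("UPSTREAM")
    return path


print(forward_ip("RC-E001"))
print(forward_mpls("RC-E001", "RC-RR1"))
```

Under these assumptions the IP case leaves the network at RC-E002, while the label-switched case reproduces the RC-E002, RC-RR2, RC-RR1 sequence seen in the traceroute.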

Hi Matthew,

You're right about the LSP: RC-E001 uses the LSP to reach the next hop, and that is why the external traffic goes to RC-RR1.

But how should I change the next-hop value? Just set the next-hop IP to RC-E002? I don't think that's the best solution, because RC-E001 doesn't have a BGP session with RC-E002. It would work for sure, but it breaks the design rules.

I was thinking of configuring RC-E002 as a route reflector for RC-E001 and so creating an additional route-reflector level because, as I remember, according to BGP best practice traffic from one route-reflector client should not pass through another route-reflector client.

If I create a second level of route-reflector sessions, may I configure a route-reflector client so that it connects to RRs from different levels?

In my case it would look like this: RC-E001 is a route-reflector client of RC-RR1 and RC-E002, and RC-E002 is a route-reflector client of RC-RR1 and RC-RR2?

Hi Konstantin,

I'm at home now on a PC and can see the diagrams. I think the issue is just that the best route is via RR1 (and not RR2), hence the traffic gets there based on the best OSPF path to the Loop0 (and, as already mentioned, MPLS sends the packet on a predetermined path).

You're asking how to make traffic leave your network to the Internet on RR2. You have two options:

- don't use label switching (this may not be an option for you)

- change the BGP attributes so that the best route in the BGP table on E001 is the path via RR2

I think your current local preference scheme is what is causing the route via RR1 to be selected. You could raise the local preference above 150 for just the default route (assuming that's the route you're using) to resolve this.

Note: Varma already suggested the second solution earlier.

You also asked about hierarchical route reflectors; I don't think this would help you solve the problem. You never mentioned whether there is an iBGP session between RR1 and RR2. Is there? As long as it is plain iBGP with no RR-client configuration, that should be fine, and each RR will learn the routes that the other RR learnt through eBGP (that sounds confusing, but I'm sure you know what I mean).

If you don't think that's correct, or it doesn't solve the problem, can you post the output of "show ip bgp 0.0.0.0" (or whatever the route is, if you're not using a default) from all the routers so we can see what's happening?
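
To make the point about the RR1 and RR2 session concrete, here is a rough Python sketch of the standard route-reflection advertisement rules as I understand them. The peer table for RR1 is an assumption (in particular the non-client session to RC-RR2, which has not been confirmed in this thread), and `ibgp_targets` is a made-up helper, not anything from a real BGP implementation.

```python
# Assumed iBGP peers of RC-RR1: RC-RR2 as a plain (non-client) iBGP peer,
# the other two routers as route-reflector clients.
RR1_IBGP_PEERS = {
    "RC-RR2": "non_client",
    "RC-E001": "client",
    "RC-E002": "client",
}


def ibgp_targets(source_kind, ibgp_peers, source_name=None):
    """Return the iBGP peers a route reflector passes a best path on to.

    source_kind is where the path was learned from:
      "ebgp"       -> sent to clients and non-clients
      "client"     -> reflected to other clients and to non-clients
      "non_client" -> reflected to clients only
    """
    if source_kind == "non_client":
        allowed = {"client"}
    else:
        allowed = {"client", "non_client"}
    return sorted(
        name
        for name, kind in ibgp_peers.items()
        if kind in allowed and name != source_name
    )


# A path RR1 learned over its own eBGP session reaches RR2 as well.
print(ibgp_targets("ebgp", RR1_IBGP_PEERS))

# A path learned from RR2 (non-client) goes to the clients only.
print(ibgp_targets("non_client", RR1_IBGP_PEERS, "RC-RR2"))
```

This matches the statement above: with a plain iBGP session between the two RRs, each one still receives the other's eBGP-learned routes, and whatever it learns from the other RR is passed on only to its clients.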
