iBGP multipath & MPLS label forwarding

Unanswered Question
Oct 25th, 2008
User Badges:
  • Bronze, 100 points or more

Hi all,


I have set up the scenario shown in the attached diagram. On R1 I would like to load-share traffic to the destination networks 60.x.x.x/24 & 160.x.x.x/24 (advertised from R9) across the border routers R4 & R5. iBGP multipath is working on R1: it installs, for example, 60.40.9.0/24 via both 1.1.4.4 & 1.1.5.5 (please refer to "bgp multipath1.txt"). However, the actual path taken is always via 1.1.5.5 (R5), as verified with "sh ip cef 60.40.9.0 255.255.255.0 det"; the outgoing label used is 18, the label for 1.1.5.5. I noticed that this seems to depend on which router (R4 or R5) was the last to send its BGP updates, which determines the next hop that gets used:


Rack1R1#sh ip route 60.40.9.0

Routing entry for 60.40.9.0/24

Known via "bgp 9000", distance 200, metric 0

Tag 64512, type internal

Last update from 1.1.5.5 00:11:13 ago

Routing Descriptor Blocks:

* 1.1.4.4, from 1.1.7.7, 00:16:25 ago

Route metric is 0, traffic share count is 1

AS Hops 1, BGP network version 0

Route tag 64512

1.1.5.5, from 1.1.8.8, 00:11:13 ago

Route metric is 0, traffic share count is 1

AS Hops 1, BGP network version 0

Route tag 64512


Rack1R1#sh ip cef 60.40.9.0 255.255.255.0 det

60.40.9.0/24, version 93, epoch 0, per-destination sharing

0 packets, 0 bytes

tag information from 1.1.5.5/32, shared, all rewrites owned

local tag: 18

via 1.1.4.4, 0 dependencies, recursive

traffic share 1

next hop 1.1.13.3, FastEthernet0/1 via 1.1.4.4/32 (Default)

valid adjacency

tag rewrite with Fa0/0, 1.1.12.2, tags imposed {18}

via 1.1.5.5, 0 dependencies, recursive

traffic share 1

next hop 1.1.13.3, FastEthernet0/1 via 1.1.5.5/32 (Default)

valid adjacency

tag rewrite with Fa0/1, 1.1.13.3, tags imposed {18}

0 packets, 0 bytes switched through the prefix

tmstats: external 0 packets, 0 bytes

internal 0 packets, 0 bytes

Rack1R1#sh mpls for 1.1.5.5

Local  Outgoing    Prefix            Bytes tag  Outgoing   Next Hop
tag    tag or VC   or Tunnel Id      switched   interface
18     18          1.1.5.5/32        0          Fa0/0      1.1.12.2
       18          1.1.5.5/32        0          Fa0/1      1.1.13.3

Rack1R1#


(to be continued on next post)

frenzeus Sat, 10/25/2008 - 06:48

When I forced R4 to re-advertise the BGP routes learnt from R9 (60.x.x.x/24 & 160.x.x.x/24), the next hop used on R1 for 60.40.9.0/24 changed to 1.1.4.4 (R4):


Rack1R1#sh ip route 60.40.9.0

Routing entry for 60.40.9.0/24

Known via "bgp 9000", distance 200, metric 0

Tag 64512, type internal

Last update from 1.1.4.4 00:00:57 ago

Routing Descriptor Blocks:

* 1.1.5.5, from 1.1.8.8, 00:00:57 ago

Route metric is 0, traffic share count is 1

AS Hops 1, BGP network version 0

Route tag 64512

1.1.4.4, from 1.1.7.7, 00:00:57 ago

Route metric is 0, traffic share count is 1

AS Hops 1, BGP network version 0

Route tag 64512


Rack1R1#sh ip cef 60.40.9.0 255.255.255.0 det

60.40.9.0/24, version 151, epoch 0, per-destination sharing

0 packets, 0 bytes

tag information from 1.1.4.4/32, shared, all rewrites owned

local tag: 19

via 1.1.5.5, 0 dependencies, recursive

traffic share 1

next hop 1.1.12.2, FastEthernet0/0 via 1.1.5.5/32 (Default)

valid adjacency

tag rewrite with Fa0/0, 1.1.12.2, tags imposed {19}

via 1.1.4.4, 0 dependencies, recursive

traffic share 1

next hop 1.1.13.3, FastEthernet0/1 via 1.1.4.4/32 (Default)

valid adjacency

tag rewrite with Fa0/1, 1.1.13.3, tags imposed {19}

0 packets, 0 bytes switched through the prefix

tmstats: external 0 packets, 0 bytes

internal 0 packets, 0 bytes

Rack1R1#

Rack1R1#sh mpls for 1.1.4.4

Local  Outgoing    Prefix            Bytes tag  Outgoing   Next Hop
tag    tag or VC   or Tunnel Id      switched   interface
19     19          1.1.4.4/32        0          Fa0/0      1.1.12.2
       19          1.1.4.4/32        0          Fa0/1      1.1.13.3

Rack1R1#


I would appreciate any pointers as to why R1 is not using both R4 & R5 when sending traffic to networks 60.x.x.x/24 & 160.x.x.x/24 behind R9: even though iBGP multipath is in effect, CEF says otherwise. Any useful links would be greatly appreciated as well. (I know of other ways of achieving load sharing, but I am confused by this scenario.)


Note: The IP addresses & AS numbers are arbitrary values that I'm using for testing before deployment in the actual environment. R7 & R8 are IPv4 route reflectors, and R1, R4 & R5 are route-reflector clients.



frenzeus Sat, 10/25/2008 - 09:01

Thanks for sharing, but this is not what I'm referring to. I'm aware of BGP multipath load sharing for both eBGP and iBGP in MPLS VPN.

Giuseppe Larosa Sat, 10/25/2008 - 11:16
User Badges:
  • Super Silver, 17500 points or more
  • Hall of Fame,

    Founding Member

Hello Hon,

the complex lab setup that you have built shows a limitation of iBGP multipath support over MPLS forwarding.


It works well only without the use of Route Reflector Servers.


In the Cisco implementation of Route Reflector servers (at least some time ago), routes coming from clients are not influenced by the iBGP multipath feature.

What normally happens is that both Route Reflectors make the same best-path choice, determined by the lowest BGP router-id, but here they are both propagating both paths.


In your case R1 actually receives four copies of each advertisement, as the debug output in the second text file shows:

one from RRS 7 with next hop 1.1.4.4

one from RRS 7 with next hop 1.1.5.5

one from RRS 8 with next hop 1.1.4.4

one from RRS 8 with next hop 1.1.5.5


And this happens for each prefix.

Then on R1 you probably have maximum-paths 2 and not 4.

So for each prefix, four route candidates compete to be installed in the routing table.

Because the IGP metric to the BGP next hop is the same (3) for both 1.1.4.4 and 1.1.5.5, and because of maximum-paths 2 for iBGP, two of the four are chosen for each prefix.

Looking at the forwarding plane, for some prefixes you can see two paths using two different interfaces (the desired result) and with different BGP next hops (both 1.1.4.4 and 1.1.5.5 represented).


What is strange here is that only one MPLS label is shown for each prefix, and it is the label of a single BGP next hop:

sometimes that of 1.1.4.4, sometimes that of 1.1.5.5.

So MPLS forwarding uses different paths than the ones declared at the IP routing level.


I would suggest increasing the maximum paths to 4 on R1 to see whether installing all received routes for each prefix leads to load balancing at the MPLS level.


May I ask what devices the two RRS R7 and R8 are, and what IOS release they are running?



Hope to help

Giuseppe
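
A minimal sketch of the change suggested above, for readers who want to try it. It assumes R1 runs BGP in AS 9000 (taken from the 'Known via "bgp 9000"' line in the output earlier in the thread) and that the prefixes live in the global IPv4 table. Whether a higher value changes the label behaviour is exactly what the test is meant to find out.

```text
! On R1 (sketch only; AS number taken from the show output in this thread)
router bgp 9000
 ! allow up to four equal-cost iBGP paths per prefix to be installed
 maximum-paths ibgp 4
```

After the change, comparing "sh ip route 60.40.9.0" with "sh ip cef 60.40.9.0 255.255.255.0 det" again should show whether the number of installed paths and the imposed labels have changed.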




frenzeus Sat, 10/25/2008 - 17:16

Hi Giuseppe,


Thanks for the prompt response. R1 is actually already configured with multipath 4. Left alone, both RRs (7206 on 12.0(32)S11) would prefer the same border router for the BGP prefixes learnt from R4/R5 (in this case R4, which has the lowest BGP router-id, since both have the same metric to the next hop). I therefore tweaked things so that the R8 RR prefers the BGP prefixes from R5, while the R7 RR prefers the BGP prefixes from R4 (the default, lowest BGP router-id). This way, two copies of the same BGP prefix with different next hops (R4 & R5) are sent to R1, and R1, configured with multipath, installs both in its RIB. However, as we see on R1, prefix 60.40.9.0/24 has two next hops, 1.1.4.4 & 1.1.5.5, but only 1.1.5.5 or 1.1.4.4 is used in MPLS forwarding, depending on which was the last to send out its BGP updates. I would appreciate your feedback on this.

Harold Ritter Sun, 10/26/2008 - 18:15
User Badges:
  • Cisco Employee,

Hon,


This is a limitation that is inherent to the 12.0S code. It is described in bug ID CSCsb52253. One workaround would be to enable labeled IPv4 on the iBGP sessions. This will let you see both labels, to R4 and to R5, on R1.


Regards
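
A hedged sketch of what the labeled IPv4 workaround mentioned above might look like on R1. The AS number 9000 and the peer addresses 1.1.7.7 and 1.1.8.8 are assumptions read off the "sh ip route" output earlier in the thread (the "from" addresses of the two route reflectors); the actual neighbor statements in the lab may differ.

```text
! On R1 (sketch only): exchange MPLS labels together with IPv4 unicast
! routes on the iBGP sessions to the two route reflectors.
! Peer addresses are assumed from the "from 1.1.7.7" / "from 1.1.8.8"
! lines in the show output; adjust to the real neighbor statements.
router bgp 9000
 address-family ipv4
  neighbor 1.1.7.7 send-label
  neighbor 1.1.8.8 send-label
```

The same capability would have to be enabled on the other end of each session (R7 and R8 towards their clients, and R4 and R5 towards the route reflectors). Because it is negotiated when a session is established, expect the affected BGP sessions to be reset. "sh ip bgp labels" on R1 is one way to check which labels were received afterwards.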

frenzeus Tue, 10/28/2008 - 22:31

Hi Ritter,


Thanks for the info. I did a quick read-up on the bug and looked up the affected versions, though I must say 12.0(32)S11 was not listed as one of them.


However, since this is closely related to the MPLS LFIB, I would like to check whether 7600s running 12.2(33)SRB/SRC code would be affected by this bug, as my understanding is that they use the newer forwarding infrastructure known as MFI. I'm trying to avoid changing the core infrastructure by introducing another mode of label advertisement (BGP, as you've advised).


Also, would you be able to provide links or good reading on how IOS performs MPLS load sharing for label stacks, EoMPLS, L3VPN, etc.?


Very much appreciated.


Tx.

Harold Ritter Wed, 10/29/2008 - 07:05

Hi Hon,


This limitation affects 12.0(32)S11 as well. On the other hand, you will not see this issue on any MPLS Forwarding Infrastructure (MFI) based code, so it will not be seen in the 12.2SR code (SRA, SRB, SRC, etc.).


Here is a summary of MPLS load balancing. I will forward you a link if I find one:


Native IP over MPLS: Load-balancing is performed based on source and destination IPv4 addresses.


L3VPN: Load-balancing is performed based on source and destination IPv4 addresses.


L2VPN: Load-balancing is performed based on the VC label (inner label).


6PE: Load-balancing is performed based on the inner label.


6VPE: Load-balancing is performed based on the inner label.


Regards
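
For the native IP case in the summary above, per-destination CEF sharing picks a path per source/destination address pair, so one pair always follows the same path while different pairs may be spread across the available paths. A hedged way to observe this on R1, assuming the running release supports "show ip cef exact-route"; both source addresses and the destination host below are hypothetical and are not taken from the lab.

```text
! On R1 (sketch only): ask CEF which path a specific flow would take.
! 10.0.0.1 and 10.0.0.2 are hypothetical source addresses;
! 60.40.9.1 is a hypothetical host inside 60.40.9.0/24.
show ip cef exact-route 10.0.0.1 60.40.9.1
show ip cef exact-route 10.0.0.2 60.40.9.1
```

If every source/destination pair resolves to the same outgoing interface and next hop, that would point to the single-label behaviour discussed in this thread rather than to an unlucky hash result.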

frenzeus Wed, 10/29/2008 - 09:19

Hi Ritter,


Thanks for the info. I would greatly appreciate it if you could post the link if you happen to come across it.
