6pe

diego.macedo · ‎07-05-2007

Hi, we are configuring Cisco 6PE in our network as described in RFC 4798 (http://www.rfc-editor.org/rfc/rfc4798.txt).

In our configuration the CE is in our own IGP domain and there is an IPv6 BGP session between the CE and the 6PE, which acts as a route reflector (RR) server. Routes from the CE are reflected by the 6PE to the rest of the network.

Because we are using 6PE, prefixes MUST have a label stack of two labels: the first label identifies the 6PE exit router and the second label identifies the IPv6 AFI.

The problem is that routes from one CE arrive at the 6PE and are reflected to the rest of the network. As the ingress 6PE does not have a label for the CE, it advertises the prefix with second label = 0. I have seen that the receiving 6PE does not understand the message and installs a bogus prefix in its routing table.

The problem can be worked around by setting "next-hop-self" on the 6PE for the routes announced by the CE, or by removing "send-label" from the IPv6 BGP configuration.

I believe that Cisco should not be sending label = 0, as it is a reserved label for "IPv4 Explicit NULL".
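For reference, the reserved label values defined in RFC 3032 can be written down as a small lookup. This is only an illustrative sketch: the helper name describe_label is made up for this example and is not part of any Cisco or BGP tooling.

```python
# Reserved MPLS label values from RFC 3032 (labels 0-15 are reserved).
RESERVED_LABELS = {
    0: "IPv4 Explicit NULL",
    1: "Router Alert",
    2: "IPv6 Explicit NULL",
    3: "Implicit NULL",
}


def describe_label(label: int) -> str:
    """Hypothetical helper: classify a 20-bit MPLS label value."""
    if not 0 <= label < 2**20:
        raise ValueError("MPLS labels are 20-bit values")
    if label in RESERVED_LABELS:
        return RESERVED_LABELS[label]
    if label < 16:
        return "reserved (other special-purpose value)"
    return "ordinary label"


print(describe_label(0))   # IPv4 Explicit NULL
print(describe_label(2))   # IPv6 Explicit NULL
```

Note that the value reserved for IPv6 traffic is 2, not 0, which is part of why a second label of 0 on an IPv6 prefix looks suspicious.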

Here is some text extracted from the RFC:

Interconnecting IPv6 islands over an IPv4 MPLS cloud takes place through the following steps:

1. Exchange IPv6 reachability information among 6PE routers with MP-BGP [RFC2545]:

The 6PE routers MUST exchange the IPv6 prefixes over MP-BGP sessions as per [RFC2545] running over IPv4. The MP-BGP Address Family Identifier (AFI) used MUST be IPv6 (value 2). In doing so, the 6PE routers convey their IPv4 address as the BGP Next Hop for the advertised IPv6 prefixes. The IPv4 address of the egress 6PE router MUST be encoded as an IPv4-mapped IPv6 address in the BGP Next Hop field. This encoding is consistent with the definition of an IPv4-mapped IPv6 address in [RFC4291] as an "address type used to represent the address of IPv4 nodes as IPv6 addresses". In addition, the 6PE MUST bind a label to the IPv6 prefix as per [RFC3107]. The Subsequence Address Family Identifier (SAFI) used in MP-BGP MUST be the "label" SAFI (value 4) as defined in [RFC3107]. Rationale for this and label allocation policies are discussed in Section 3.
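The next-hop encoding in the quoted text can be illustrated with Python's standard ipaddress module. This is a minimal sketch: the function name and the 10.0.0.2 address are chosen only for illustration, and the AFI/SAFI constants simply restate the values quoted above.

```python
import ipaddress

AFI_IPV6 = 2     # "MUST be IPv6 (value 2)"
SAFI_LABEL = 4   # the "label" SAFI (value 4)


def ipv4_mapped_next_hop(ipv4: str) -> ipaddress.IPv6Address:
    """Encode a 6PE's IPv4 address as an IPv4-mapped IPv6 address (::ffff:a.b.c.d)."""
    v4 = ipaddress.IPv4Address(ipv4)
    # 80 zero bits, 16 one bits, then the 32-bit IPv4 address.
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))


nh = ipv4_mapped_next_hop("10.0.0.2")
print(nh.packed.hex())   # 00000000000000000000ffff0a000002
print(nh.ipv4_mapped)    # 10.0.0.2
```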

Any help will be appreciated.

Regards

Diego

Harold Ritter · ‎07-05-2007

Diego,

Running iBGP between the PE and the CE and having the PE configured as a route reflector is not supported in this context.

Regards,

Harold Ritter
Sr Technical Leader
CCIE 4168 (R&S, SP)

r.gagliano · ‎07-05-2007

Not sure the RFC text is clear about this issue.

Here is an example of the route 2001:50::/64 being correctly advertised by the RR (ibb2), and of the bogus route 5000::/40 that the 6PE RR client (ibb1) generates "by itself". Where does this prefix come from?

Finally, here is the output of the debug bgp command when the session is cleared. You can see that there is a warning message and that the "reserved" label 0 is used. Why does this happen?

ibb2#sh bgp ipv6 unicast neighbors 10.0.0.1 advertised-routes

BGP table version is 50, local router ID is 10.0.0.2

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

              r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path

*> 2001:33::/64     ::                       0         32768 i

*>i2001:50::/64     2001::5                  0    100      0 i

Total number of prefixes 2

% NOTE: This command is deprecated. Please use 'show bgp ipv6 unicast'

ibb2#

ibb1#sh bgp ipv6 unicast

BGP table version is 29, local router ID is 10.0.0.1

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

              r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path

*>i2001:33::/64     ::FFFF:10.0.0.2          0    100      0 i

*> 2001:34::/64     ::                       0         32768 i

* i5000::/40        2001::5                  0    100      0 i

ibb1#

ibb1#deb bgp ipv6 unicast 10.0.0.2 updates

BGP updates debugging is on for neighbor 10.0.0.2 for address family: IPv6 Unicast

ibb1#clear bgp ipv6 uni

ibb1#clear bgp ipv6 unicast 10.0.0.2

ibb1#

*May 14 08:28:27.333: BGP(1): no valid path for 2001:33::/64

*May 14 08:28:27.333: %BGP-5-ADJCHANGE: neighbor 10.0.0.2 Down User reset

*May 14 08:28:27.333: BGP(1): nettable_walker 2001:33::/64 no best path

*May 14 08:28:27.641: %BGP-5-ADJCHANGE: neighbor 10.0.0.2 Up

*May 14 08:28:27.641: BGP(1): 10.0.0.2 send UPDATE (format) 2001:34::/64, next ::FFFF:10.0.0.1, metric 0, path Local, mpls label 26

*May 14 08:28:27.645: BGP(1): 10.0.0.2 rcvd UPDATE w/ attr: nexthop 2001::5, origin i, localpref 100, metric 0, originator 10.0.0.5, clusterlist 10.0.0.2

*May 14 08:28:27.649: BGP(1): 10.0.0.2 rcvd 5000::/40

*May 14 08:28:27.649: BGP(1): no valid path for 5000::/40

*May 14 08:28:27.649: BGP(1): 10.0.0.2 rcvd UPDATE w/ attr: nexthop ::FFFF:10.0.0.2, origin i, localpref 100, metric 0

*May 14 08:28:27.649: BGP(1): 10.0.0.2 rcvd 2001:33::/64

*May 14 08:28:27.649: BGP(1): Revise route installing 2001:33::/64 -> ::FFFF:10.0.0.2 (::) to main IPv6 table

ibb1#

ibb2#clear bgp ipv6 unicast 10.0.0.1

ibb2#

Jul 6 01:33:56.909: BGP(1): no valid path for 2001:34::/64

Jul 6 01:33:56.909: %BGP-5-ADJCHANGE: neighbor 10.0.0.1 Down User reset

ibb2#

Jul 6 01:33:56.913: BGP(1): nettable_walker 2001:34::/64 no best path

Jul 6 01:33:56.917: BGP(1): 2001::5 send unreachable 2001:34::/64

Jul 6 01:33:56.917: BGP(1): 2001::5 send UPDATE 2001:34::/64 -- unreachable

Jul 6 01:33:57.925: %BGP-5-ADJCHANGE: neighbor 10.0.0.1 Up

Jul 6 01:33:57.925: BGP(1): next-hop unchanged for 2001:50::/64 but no label is available to forward

Jul 6 01:33:57.925: BGP(1): 10.0.0.1 send UPDATE (format) 2001:50::/64, next 2001::5, metric 0, path , mpls label 0

Jul 6 01:33:57.925: BGP(1): 10.0.0.1 send UPDATE (format) 2001:33::/64, next ::FFFF:10.0.0.2, metric 0, path , mpls label 0

Jul 6 01:33:57.929: BGP(1): 10.0.0.1 rcvd UPDATE w/ attr: nexthop ::FFFF:10.0.0.1, origin i, localpref 100, metric 0

Jul 6 01:33:57.929: BGP(1): 10.0.0.1 rcvd 2001:34::/64

Jul 6 01:33:57.929: BGP(1): Revise route installing 2001:34::/64 -> ::FFFF:10.0.0.1 (::) to main IPv6 table

ibb2#

Jul 6 01:34:02.825: BGP(1): 2001::5 send UPDATE (format) 2001:34::/64, next 2001::2, metric 0, path

ibb2#

mheusing · ‎07-06-2007

Hi,

I think Harold has a point, and it is in line with the RFC mentioned above. From RFC 4798:

"A routing protocol (IGP or EGP) may run between the CE router and the 6PE router for the distribution of IPv6 reachability information."

Nowhere does the RFC mention your setup, and "EGP" here refers to eBGP; unfortunately, the RFC does not state that as explicitly as I do here.

Also from RFC 4798:

"The ingress 6PE router MUST forward IPv6 data over the IPv4-signaled LSP towards the egress 6PE router identified by the IPv4 address advertised in the IPv4-mapped IPv6 address of the BGP Next Hop for the corresponding IPv6 prefix."

Now in your case the PE, which is an IPv6 route reflector, does not insert itself as next hop; at least, an RR should not do so. And there is the problem: the LSP between the ingress and egress 6PE routers is established using the IPv4 MP-BGP session addresses, and those two addresses MUST be used. The RFC is absolutely clear about this.

So use eBGP or an IGP for IPv6 routing between CE and 6PE, use MP-iBGP with IPv4 addresses for peering between 6PE routers, and establish LSPs between the IPv4 peering addresses of all 6PEs. Then it works like a charm.
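The requirement described above can be sketched in a few lines of Python: a next hop is only usable for 6PE forwarding if it is an IPv4-mapped IPv6 address, because the embedded IPv4 address is what selects the LSP. The prefixes and next hops are copied from the ibb1 output earlier in the thread; the function name egress_6pe is hypothetical.

```python
import ipaddress
from typing import Optional


def egress_6pe(next_hop: str) -> Optional[ipaddress.IPv4Address]:
    """Return the egress 6PE's IPv4 address, or None for a native IPv6 next hop."""
    return ipaddress.IPv6Address(next_hop).ipv4_mapped


# Next hops as seen in "sh bgp ipv6 unicast" on ibb1.
routes = {
    "2001:33::/64": "::FFFF:10.0.0.2",
    "5000::/40": "2001::5",
}

for prefix, nh in routes.items():
    pe = egress_6pe(nh)
    if pe is not None:
        status = f"LSP towards {pe}"
    else:
        status = "no IPv4-signaled LSP can be selected"
    print(f"{prefix} via {nh}: {status}")
```

The reflected route keeps the CE's native next hop 2001::5, so the receiving 6PE has no egress 6PE address from which to pick an LSP.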

Hope this helps!

Regards, Martin

swaroop.potdar · ‎07-06-2007

Looking at the RFC, the topology used for testing is not quite aligned with it.

1) The RFC says an IPv4 source address is to be used for advertising the routes; if we are receiving native IPv6 routes over a native IPv6 iBGP connection, the source address won't be IPv4.

2) It also says the main purpose of the document is to connect native IPv6 islands to each other, which definitely won't be the case when we run iBGP with the 6PE, as it would then be an extension of the network.

RFCs are generally the most technically and politically correct documents :-).

But for the problem you pointed out, this appears to be unwanted behaviour in the code: it should not advertise the native IPv6 routes at all if, from a BGP point of view, the routes were not originated by the 6PE itself.

This point is also made clear in the RFC, which terms it "distribution of routes between CE and 6PE".

PS: Also, the bogus routes are actually the native IPv6 prefixes received and advertised by the ingress PE to the egress PE.

I came to this conclusion by seeing this pattern: it always takes the second field of the address, appends 00 to it, and uses a mask of /40.

This is pretty consistent. For example, if you have a prefix received over iBGP from your CE as 2001:25::/64, then on the other end you will see it as 2500::/40. I am not sure why it does this conversion, as it is unwanted behaviour; it should rather discard these native iBGP routes over an IPv6-MPLS MP-iBGP session.
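A possible explanation for this pattern, not confirmed anywhere in this thread, is an NLRI parsing mismatch. In labeled NLRI (RFC 3107) the length octet counts a 3-octet label field plus the prefix, whereas in plain MP-BGP NLRI it counts only the prefix. If the bytes of a plain /64 NLRI are parsed as labeled NLRI, the first three prefix octets are consumed as the label field and 40 bits remain. The Python sketch below only demonstrates that arithmetic; the function names are made up, and whether IOS actually behaves this way is an assumption.

```python
import ipaddress


def encode_unlabeled(prefix: str) -> bytes:
    """Plain MP-BGP NLRI: 1-octet length in bits, then the prefix octets."""
    net = ipaddress.IPv6Network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_as_labeled(nlri: bytes) -> ipaddress.IPv6Network:
    """Parse NLRI as RFC 3107: length covers a 3-octet label field plus the prefix."""
    prefix_bits = nlri[0] - 24          # 24 bits are taken as the label field
    nbytes = (prefix_bits + 7) // 8
    prefix = nlri[4:4 + nbytes].ljust(16, b"\x00")
    address = ipaddress.IPv6Address(prefix)
    return ipaddress.IPv6Network((address, prefix_bits), strict=False)


for p in ("2001:50::/64", "2001:25::/64"):
    print(p, "->", decode_as_labeled(encode_unlabeled(p)))
# 2001:50::/64 -> 5000::/40
# 2001:25::/64 -> 2500::/40
```

Both results match the prefixes reported in the thread, which is consistent with (but does not prove) this reading.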

Alternatively, if for some reason you want to run iBGP only between the two native IPv6 islands, you can consider extending your IGP across the IPv6-MPLS domain and running iBGP across it. You can achieve this by injecting your IGP at the 6PE router, so that the native IPv6 route is advertised with a label attached to the remote 6PEs. This works as well. This way forwarding will be done by MPLS and you can still retain the routing control you need within your AS using iBGP.

HTH-Cheers,

Swaroop

r.gagliano · ‎07-06-2007

Thanks for the replies. Roque