PE Failure convergence time

csco10387876
Level 1

Hi,

Does anyone know of a way to get sub-1s convergence for MP-BGP in case of a PE failure?

Right now I get a convergence time of around 4-5s.

The customer LAN has 2 connections to the MPLS network.

The delay comes from the fact that the VRF routing table is not updated with the backup route as soon as the route to the failed PE is removed from the core IGP.

Thanks for any input.

27 Replies

Hi Chintan,

In your case, a unique RD doesn't resolve your issue because PE2 will prefer its iBGP route from PE1 over the eBGP one received from CE2.

As a result, PE2 will not send its eBGP route to the RR, so the remote PE will receive only one route even if you are using a unique RD.

To be useful, PE1 and PE2 must both prefer their eBGP route, so that the RR receives both routes and forwards both of them (thanks to the unique RD, which makes each VPNv4 prefix unique).

On the remote PE, it will increase the VRF memory as you store more routes, but the 12K scales well, so unless you want the full routing table in all your VRFs, you should be fine.

Regarding your question about BGP local convergence, it will be supported in XR via the BGP PIC edge feature. Please contact your Cisco account team for more details about the road-map.

HTH

Laurent.

Hi Laurent,

In my case, if we use a unique RD, shouldn't PE2 select both the local route (learned via eBGP) and the route learned from PE1 (via iBGP), as they are still two different VPNv4 prefixes (due to the unique RD)?

And regarding the remote PE, it increases VRF memory on the RP but not the FIB memory on the line cards, for example on the 12K? Is that correct?

Thanks for the clarification on BGP local convergence; I will chase the account team on it.

Regards,

Chintan

The RD is not involved in BGP best path selection. If you're doing AS-path prepending on CE2, PE2 will prefer its iBGP routes.

Using a unique RD is only useful for the RR, which will then forward all the VPNv4 prefixes as they are unique.

The RIB and FIB contain only the best path, so you're correct. It's the BGP table which will increase.
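
To illustrate the point about the RR, here is a minimal Python sketch. It is a toy model, not router behaviour in detail: it assumes the RR keeps one best path per VPNv4 NLRI (RD + prefix), that both PEs actually advertise their eBGP route, and it uses a simplified tie-break. The RD values, the prefix and the function names are hypothetical.

```python
from collections import defaultdict

def reflect(routes):
    """Advertise one best path per VPNv4 NLRI, where the NLRI is (RD, prefix)."""
    by_nlri = defaultdict(list)
    for route in routes:
        by_nlri[(route["rd"], route["prefix"])].append(route)
    # Simplified tie-break (assumption): highest local preference, then lowest next hop.
    return [
        min(paths, key=lambda r: (-r["local_pref"], r["next_hop"]))
        for paths in by_nlri.values()
    ]

def site_routes(rd_pe1, rd_pe2):
    """The same customer prefix advertised by PE1 and PE2 with the given RDs."""
    return [
        {"rd": rd_pe1, "prefix": "10.1.0.0/16", "next_hop": "PE1", "local_pref": 100},
        {"rd": rd_pe2, "prefix": "10.1.0.0/16", "next_hop": "PE2", "local_pref": 100},
    ]

shared = reflect(site_routes("65000:1", "65000:1"))   # same RD on both PEs
unique = reflect(site_routes("65000:1", "65000:2"))   # one RD per PE

print(len(shared), [r["next_hop"] for r in shared])          # 1 ['PE1']
print(len(unique), sorted(r["next_hop"] for r in unique))    # 2 ['PE1', 'PE2']
```

With a shared RD the two advertisements collapse into one NLRI and the remote PE only ever hears about one exit; with one RD per PE both reach the remote PE.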

HTH

Laurent.

Thanks for the clarification. So in a VPN, if I want one side as primary and the other as backup, what is the best way? I should not lose the advantage of the unique RD.

And as you said, the RIB and FIB contain only the best path and it is the BGP table that increases. That is on the RP only, right, with no impact on line card memory? Just to make sure.

Regards,

Chintan

Hi,

What you should do is to define two standard BGP communities:

C1 for nominal routes and C2 for backup routes.

You configure PE1 and PE2 to add one of those communities to the routes they are exporting. Then you configure the remote PE to set a different local preference (LP) based on the community via an import-map.
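
As a rough illustration of that logic (a Python model, not router configuration), the sketch below tags each route with C1 or C2, maps the community to a local preference on import, and picks the best path. The community values, local preference numbers and function names are assumptions made up for the example.

```python
# Hypothetical community values; the post only names them C1 and C2.
C1_NOMINAL = "65000:100"
C2_BACKUP = "65000:200"

# What the import-map on the remote PE would express: community -> local preference.
IMPORT_MAP = {C1_NOMINAL: 200, C2_BACKUP: 100}
DEFAULT_LOCAL_PREF = 100

def import_route(route):
    """Return a copy of the route with local_pref set from its first matching community."""
    lp = next(
        (IMPORT_MAP[c] for c in route["communities"] if c in IMPORT_MAP),
        DEFAULT_LOCAL_PREF,
    )
    return {**route, "local_pref": lp}

def best_path(paths):
    """Simplified best path: highest local preference wins."""
    return max(paths, key=lambda r: r["local_pref"]) if paths else None

received = [
    {"prefix": "10.1.0.0/16", "next_hop": "PE1", "communities": [C1_NOMINAL]},
    {"prefix": "10.1.0.0/16", "next_hop": "PE2", "communities": [C2_BACKUP]},
]
vrf_bgp_table = [import_route(r) for r in received]
print(best_path(vrf_bgp_table)["next_hop"])    # PE1

# If PE1 goes away, the backup path is already in the table.
after_failure = [r for r in vrf_bgp_table if r["next_hop"] != "PE1"]
print(best_path(after_failure)["next_hop"])    # PE2
```

Because both paths are already present in the VRF BGP table (thanks to the unique RD), the switch to PE2 is only a best path re-selection.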

HTH

Laurent.

MPLS to the CE means that, as we manage the CPE, we propagate MPLS up to the CE, and it runs BGP as well.

Hi All,

Is there any other technique on the BGP timer side, besides the hold timer and keepalive, that can improve convergence in an MPLS L3 VPN network?

I know about BFD, and I understand some work is under way on the BGP timer side to improve convergence.

Jaweed

Hi Jaweed,

It's not just a question of timers, and it's not recommended to configure aggressive BGP timers as it will increase CPU processing on the RR if you have many PEs. It could also lead to session flapping.

The mechanism you can use to improve MPLS-VPN convergence time depends on the type of failure:

1- Core link failure: Tune your IGP for fast failure detection and fast convergence. MPLS-TE FRR is also a solution. BGP is not involved.

Use the LDP-IGP sync feature to avoid blackholing MPLS-VPN traffic during a down/up event.
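
A back-of-the-envelope Python sketch of why aggressive timers hurt the RR: it assumes each peer sends one keepalive per keepalive interval and ignores update traffic. The peer count and the function name are made up for illustration.

```python
def rr_keepalives_per_second(num_peers, keepalive_s):
    """Approximate keepalives the RR must process per second (one per peer per interval)."""
    return num_peers / keepalive_s

# 600 PE clients (hypothetical) with a 60s keepalive versus a 1s keepalive.
print(rr_keepalives_per_second(600, 60))   # 10.0
print(rr_keepalives_per_second(600, 1))    # 600.0
```

The RR also sends roughly the same number of keepalives back, so the processing load grows linearly as the interval shrinks.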

2- PE failure: We no longer use the global scanner (which runs every 60s) for BGP peer failure detection. There is a new event-driven mechanism called BGP Next-Hop Tracking.

http://www.cisco.com/en/US/docs/ios/12_3t/12_3t14/feature/guide/gt_bnht.html

So now your detection time is the time for your IGP to converge plus the BGP NHT initial delay (5s by default).
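
The arithmetic can be sketched in a few lines of Python. The 5s default comes from the post above; the 0.5s IGP convergence figure and the 1s delay are assumptions chosen only to show the effect.

```python
def pe_failure_detection_time(igp_convergence_s, nht_delay_s=5.0):
    """Detection time = IGP convergence time + BGP NHT initial delay (seconds)."""
    return igp_convergence_s + nht_delay_s

# Assumed 0.5s IGP convergence with the default 5s NHT delay.
print(pe_failure_detection_time(0.5))        # 5.5
# Same IGP figure with a hypothetical 1s NHT delay.
print(pe_failure_detection_time(0.5, 1.0))   # 1.5
```

With the default delay, the NHT term dominates, so IGP tuning alone cannot bring PE failure detection under a second.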

3- PE-CE link failure or CE failure for a dual-attached site.

Convergence is based on fast link failure detection and on BGP convergence in the backbone. To improve the BGP part:

- Use one RD per VRF and per PE so the remote PEs will receive both routes and install them in their local VRF BGP RIB (this way you bypass the import scanner).

If your PE is a 7600, you can use the following BGP local convergence feature:

http://www.cisco.com/en/US/docs/ios/mpls/configuration/guide/mp_vpn_pece_lnk_prot_ps6922_TSD_Products_Configuration_Guide_Chapter.html

Fast convergence is a big subject involving many techniques, so you should not focus on one feature only.

HTH

Laurent.

Hi,

Have you heard of VPN FRR? Huawei devices already support this feature. They use it to achieve <200 ms convergence time in PE node failure scenarios.

We are designing an MPLS backbone for GSM voice traffic. We are implementing MPLS TE FRR for core link and P node failures to achieve a convergence time of around 200 ms. But I am concerned about PE node failure scenarios. As I understand it, convergence for a PE node failure takes seconds, not milliseconds. To address this, Huawei came up with the VPN FRR solution. I wanted to know whether VPN FRR is on Cisco's roadmap, or whether Cisco already supports it in some IOS release.

FRR has been supported in IOS for a long time; it is a Traffic Engineering feature.

FRR is not a solution for a PE crash; it is used more for link failures and intermediate P/PE router failures.

I think endpoint PE failure is not a good case for fast reroute.

More info here:

http://www.cisco.com/en/US/docs/ios/12_0s/feature/guide/fslinkpt.html

On a side note, FRR can also be triggered by BFD.

Hope it helps

TE FRR is different from VPN FRR. VPN FRR is specifically designed for PE failures, but it has not yet been developed for Cisco.

Only Huawei supports VPN FRR, I believe.

Hi Laurent,

Is the BGP local convergence feature available on the 12K (IOS/IOS-XR)?

I see that it uses something like one label per CPE rather than one label per prefix for the VPN.

Regards,

Chintan

Hi Jaweed,

I'm not sure how applicable this is to your scenario, but if the links are POS you can use loss of signal (LOS) to detect the failure, and it can trigger IGP convergence more rapidly.

rgds