iBGP flaps when routes are propagated from eBGP peer

We recently encountered a problem on our distribution routers: the iBGP session flaps when routes learned from the ISP are propagated from the IGW routers.

[06:34:658]000534: Jan 25 06:34:56.323 PH: %BGP-3-NOTIFICATION: received from neighbor x.x.x.1 4/0 (hold time expired) 0 bytes

[06:34:658]000535: Jan 25 06:34:56.323 PH: %BGP-5-ADJCHANGE: neighbor x.x.x.1 Down BGP Notification received

[06:34:658]000536: Jan 25 06:34:56.323 PH: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.1 VPNv4 Unicast topology base removed from session  BGP Notification received

[06:34:658]000537: Jan 25 06:34:56.323 PH: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.1 IPv4 Unicast topology base removed from session  BGP Notification received

[06:34:661]000538: Jan 25 06:35:01.411 PH: %BGP-3-NOTIFICATION: received from neighbor x.x.x.2 4/0 (hold time expired) 0 bytes

[06:34:661]000539: Jan 25 06:35:01.411 PH: %BGP-5-ADJCHANGE: neighbor x.x.x.2 Down BGP Notification received

[06:34:661]000540: Jan 25 06:35:01.411 PH: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.2 VPNv4 Unicast topology base removed from session  BGP Notification received

[06:34:666]000541: Jan 25 06:35:01.411 PH: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.2 IPv4 Unicast topology base removed from session  BGP Notification received

[06:34:656]000542: Jan 25 06:35:04.519 PH: %BGP-5-ADJCHANGE: neighbor x.x.x.1 Up

[06:35:657]000543: Jan 25 06:35:07.591 PH: %BGP-5-ADJCHANGE: neighbor x.x.x.2 Up

Has anyone else encountered this problem?


12 Replies

rsimoni
Cisco Employee

It would seem that the flapping BGP peers don't hear from each other while prefixes are being propagated.

[06:34:658]000534: Jan 25 06:34:56.323 PH: %BGP-3-NOTIFICATION: received from neighbor x.x.x.1 4/0 (hold time expired) 0 bytes

[06:34:658]000535: Jan 25 06:34:56.323 PH: %BGP-5-ADJCHANGE: neighbor x.x.x.1 Down BGP Notification received

Have you checked whether CPU was high while those prefixes were being exchanged?

Have you also checked for MTU-related issues on the iBGP link?

Can you ping the iBGP peer with a full-MTU packet?

ping x.x.x.1 size 1500 df-bit

Riccardo

Hi Bryan,

Riccardo is right. I had the same issue a few days ago.

Things to check:

MTU values

Traffic shaping

Rate-limiting parameters

This looks like a Layer 2 problem; at this point BGP itself does not seem to be at fault.

Also, try lowering the BGP hold-down and keepalive timers for the neighbor.
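For example, a minimal IOS sketch (the AS number and neighbor address here are placeholders; adjust to your setup):

router bgp 65000
 ! 10-second keepalive, 30-second hold time for this neighbor
 neighbor x.x.x.1 timers 10 30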

HTH,

Hi Riccardo,

Just tried what you recommended.

The ping works with a datagram of 1000, and I can ping the iBGP neighbor:

routerB#ping x.x.x.1 size 1500 repeat 100

Type escape sequence to abort.

Sending 100, 1500-byte ICMP Echos to x.x.x.1, timeout is 2 seconds:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Success rate is 100 percent (100/100), round-trip min/avg/max = 1/2/4 ms

routerB#

I checked the MTU on both routers; see the output below:

routerA#sh ip bgp nei

routerA#sh ip bgp neighbors x.x.x.2 | inc max data segment

Datagrams (max data segment is 7436 bytes):

routerB#sh ip bgp nei x.x.x.1 | inc max data segment

Datagrams (max data segment is 7436 bytes):

routerB#

Is 1500 the recommended MTU to be used for testing? Can I use 7500?

Thanks,

Hi Bryan,

routerB#ping x.x.x.1 size 1500 repeat 100

Type escape sequence to abort.

Sending 100, 1500-byte ICMP Echos to x.x.x.1, timeout is 2 seconds:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Success rate is 100 percent (100/100), round-trip min/avg/max = 1/2/4 ms

routerB#


You missed the df-bit in your ping test. You need to set the DF (don't fragment) bit to verify that a 1500-byte MTU works between the peers. Without it, the test doesn't prove the end-to-end MTU is 1500 bytes, because the packets could be fragmented along the way.

The problem could be that BGP packs routes into updates sized to the interface MTU, which is normally 1500 bytes. If the MTU somewhere along the path is smaller, those packets get fragmented, and the receiving peer has to reassemble them all, which drives the CPU up. You get what I mean?

Is 1500 the recommended MTU to be used for testing? Can I use 7500?

Well, if your interface supports it, then yes, you can use it, but it had better be supported along the entire path: if your router generates L3 packets of 7500 bytes, every interface in between must be able to handle and forward them without issue. If the path includes L2 switches whose interface MTU is smaller, they will drop the frame.

Essentially, Ethernet has a default MTU of 1500 bytes. Gigabit Ethernet supports baby giants (1600 bytes) and jumbo frames up to 9216 bytes. The higher values are normally used in the SP space, where multiple technologies add a variety of overheads; in a medium enterprise it depends on the requirements.

Edit: Sorry, I just saw your output now.

routerA#sh ip bgp nei

routerA#sh ip bgp neighbors x.x.x.2 | inc max data segment

Datagrams (max data segment is 7436 bytes):

routerB#sh ip bgp nei x.x.x.1 | inc max data segment

Datagrams (max data segment is 7436 bytes):

routerB#


The max data segment is the MSS that has been negotiated between the peers. Please see the link below for more information on this:

http://nagendrakumar-nagendra.blogspot.com/2010/03/bgp-path-mtu-discovery.html

HTH

Kishore


Hi Kishore,

Please see output below:

routerA#ping x.x.x.1 size 1500 df-bit repeat 1000

Type escape sequence to abort.

Sending 1000, 1500-byte ICMP Echos to x.x.x.1, timeout is 2 seconds:

Packet sent with the DF bit set

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!

Success rate is 100 percent (1000/1000), round-trip min/avg/max = 1/1/4 ms

routerB#ping x.x.x.1 size 1500 df-bit repeat 1000

Type escape sequence to abort.

Sending 1000, 1500-byte ICMP Echos to x.x.x.1, timeout is 2 seconds:

Packet sent with the DF bit set

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!

Success rate is 100 percent (1000/1000), round-trip min/avg/max = 1/1/4 ms

routerB#

Thanks,

Bryan

Hi Bryan,

I took it for granted that you were using an MTU of 1500, but since you configured 7500 you need to make sure a frame of that size makes it through. Same logic as the earlier test, but with the correct value.

As you saw, the TCP session negotiated an MSS of 7436, so you need to check whether packets that large are making it through.

Can you ping this way?

ping x.x.x.1 size 7476 df-bit repeat 1000
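(For clarity: 7476 = the 7436-byte MSS + 20 bytes of TCP header + 20 bytes of IP header, i.e. the on-the-wire size of a full BGP update; the size parameter of the IOS ping is the total IP datagram size, headers included.)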

Riccardo

Hi Riccardo,

I tried 7476 and 7436 as you advised, and I cannot ping the designated IP. Is this the possible cause?

routerA#ping x.x.x.2 size 7436 df-bit repeat 50

Type escape sequence to abort.

Sending 50, 7436-byte ICMP Echos to x.x.x.2, timeout is 2 seconds:

Packet sent with the DF bit set

..................................................

Success rate is 0 percent (0/50)

routerA#

routerA#ping x.x.x.2 size 7476 df-bit repeat 50

Type escape sequence to abort.

Sending 50, 7476-byte ICMP Echos to x.x.x.2, timeout is 2 seconds:

Packet sent with the DF bit set

..................................................

Success rate is 0 percent (0/50)

routerA#

If that is the case, I have another question: eBGP routes from another ISP are already propagated/advertised from the IGW, and they do not cause flapping on the iBGP links. The BGP configuration is the same for ISP1 and ISP2. Is it possible that the problem is isolated to the ISP?

Thanks,

Bryan

The fact that the ping fails can indeed be the issue.

How many physical links do you have between the iBGP routers? You need to make sure the MTU is 7500 on all of them; it would seem that some intermediate link is configured with a lower value.

Likely there is 'no ip unreachables' configured somewhere along the path, so no ICMP type 3 code 4 (fragmentation needed) messages are sent back (or a firewall is blocking them). That means PMTUD cannot be used to adjust the MSS.
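If unreachables turn out to be disabled somewhere, re-enabling them on the intermediate L3 interfaces lets PMTUD work again; a sketch, with the interface name as a placeholder:

interface GigabitEthernet0/1
 ! allow ICMP type 3 code 4 (fragmentation needed) to be generated again
 ip unreachables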

Your BGP routers use an MSS of 7436, but as we saw, packets that big are not getting through.

There is a workaround from the BGP/IOS point of view, but we can talk about it after verifying the end-to-end MTU.
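For reference, a common workaround in that spirit (a sketch only; the AS number and neighbor address are placeholders) is to disable PMTUD for the session, so TCP falls back to a small default MSS (typically 536 bytes) that fits through any reasonable path:

router bgp 65000
 ! stop deriving this session's MSS from the local interface MTU
 neighbor x.x.x.1 transport path-mtu-discovery disable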

About the other question: if we are dealing with an MTU issue, the fact that some other ISP is fine does not matter. This problem shows up when large BGP updates are sent. If your BGP peering is stable and not many changes are happening, the updates will be smaller and likely will not be dropped.

Riccardo

Hey Bryan,

Can you do a ping sweep with the record option set? This will show you where the bottleneck is and the largest packet size that travels without being fragmented.

See the link below for how to do a ping sweep:

http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a0080093f22.shtml
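A minimal sketch of the extended ping dialog with DF set, record route, and a size sweep (values are illustrative, and the prompts vary slightly between IOS versions):

routerA#ping
Protocol [ip]:
Target IP address: x.x.x.2
Repeat count [5]: 1
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: yes
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]: record
Number of hops [ 9 ]:
Loose, Strict, Record, Timestamp, Verbose[RV]:
Sweep range of sizes [n]: y
Sweep min size [36]: 1500
Sweep max size [18024]: 7500
Sweep interval [1]: 500

The largest size that still comes back all '!' with DF set is your usable end-to-end MTU.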

HTH

Kishore

Any update on this, Bryan?

Hi Riccardo, Kishore, and Arnold,

You are all correct; it seems the problem is the MTU. When we change the MTU on the in-between links to a lower value, iBGP no longer flaps while propagating the eBGP routes. We suspect that the in-between (transmission) link does not support an MTU of 7500. We are still checking with the transmission provider whether they can lower the MTU on their end to verify this.
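For the record, the interim change looks roughly like this (a sketch; the interface name is a placeholder):

interface GigabitEthernet0/0
 ! lower the L3 MTU so the BGP TCP session negotiates an MSS the path can carry
 mtu 1500
 ip mtu 1500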

I appreciate all your responses. You're all a great help.

Thanks,

Bryan

That's very good news!
