BGP flaps between IOS and XR

Hi

I am seeing random BGP session flaps between a router running IOS and a router running IOS XR. It is a VPNv4 session; the two routers have different uplink MTUs and reach each other through the IGP.

Some errors that were logged during the flap are:

BGP-5-ADJCHANGE

BGP_SESSION-5-ADJCHANGE

BGP-3-NOTIFICATION

Has anyone experienced similar issues on their network, and is there a known solution?

Accepted Solutions

Hi Ramya,

It is a bit challenging to analyze the information, as it comes in parts with timestamps that do not quite match and some details substituted.

From the output we can see:

  Last reset 02:06:08, due to Peer closing down the session

  Peer reset reason: Remote closed the session (Connection timed out)

So that was 2 hours 6 minutes before the output was taken, which puts it at around 16:22, but we have no logs for that time in the provided details. The session was closed by the peer ("Foreign host: x.x.x.x").

We also sent a notification 6 hours 6 minutes before the output was taken:

  Time since last notification sent to neighbor: 06:06:22

  Error Code: hold time expired

Matching log:

RP/0/5/CPU0::Dec 12 12:21:54.039 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xDown - BGP Notification sent, hold time expired (VRF: default)
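The two times above can be cross-checked by subtracting the elapsed counters from the timestamp printed at the top of the neighbor output (18:28:16). This is only a small sketch of that arithmetic; the year is a placeholder, since the pasted output does not show one, and the helper name `elapsed` is made up for illustration.

```python
from datetime import datetime, timedelta

def elapsed(hms: str) -> timedelta:
    """Convert an HH:MM:SS counter from the show output into a timedelta."""
    h, m, s = (int(part) for part in hms.split(":"))
    return timedelta(hours=h, minutes=m, seconds=s)

# Timestamp printed above the neighbor output; the year is a placeholder.
shown_at = datetime(2000, 12, 12, 18, 28, 16)

last_reset = shown_at - elapsed("02:06:08")         # "Last reset 02:06:08"
last_notification = shown_at - elapsed("06:06:22")  # "Time since last notification sent"

print(last_reset.strftime("%H:%M:%S"))         # 16:22:08
print(last_notification.strftime("%H:%M:%S"))  # 12:21:54
```

The second value lines up with the 12:21:54 ADJCHANGE log entry, while the first falls in a period for which no logs were posted.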

So that session went down due to missed BGP keepalives; the hold time is 180 seconds (3 minutes). What caught my attention is that the Invalid MD5 digest messages start about 3 minutes before the first reset (12:15:43 versus 12:18:42). So the XR did not accept the MD5 digest coming from IOS, but after a flap the session was restored.
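To illustrate why rejected segments lead to a reset one hold time later, here is a rough simulation of a hold timer. It assumes that a segment failing the MD5 check is dropped and therefore never restarts the timer; the function name, the event list, and the point at which segments start being rejected (t = 240 s) are hypothetical, and only the 180/60 second timers come from the neighbor output.

```python
HOLD_TIME = 180  # seconds, "Hold time is 180" in the neighbor output
KEEPALIVE = 60   # seconds, "keepalive interval is 60 seconds"

def hold_timer_expiry(events, hold_time=HOLD_TIME, start=0):
    """Return the time at which the hold timer fires, or None if it is
    still running after the last event.

    events: iterable of (time_s, accepted) pairs sorted by time, where
    accepted is False for a segment dropped by the MD5 check.
    """
    deadline = start + hold_time
    for t, accepted in events:
        if t >= deadline:
            return deadline          # nothing valid arrived in time
        if accepted:
            deadline = t + hold_time  # a valid message restarts the timer
    return None

# Hypothetical timeline: keepalives every 60 s, rejected after t = 240 s.
events = [(t, t <= 240) for t in range(KEEPALIVE, 601, KEEPALIVE)]
expiry = hold_timer_expiry(events)
print(expiry)        # 420
print(expiry - 240)  # 180, i.e. one hold time after the last accepted keepalive
```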

There is a bug on IOS, “CSCsx33622   Fix MSS calcuation issue in TCP”, that makes IOS send segments with an incorrect MD5 digest under some conditions. You may want to look into it.

/A

Alexei Kiritchenko
Cisco Employee

Hi Ramya,

There could be many different reasons for the session to close.

The complete ‘show log’ output and ‘show bgp vpnv4 unicast neighbors x.x.x.x detail’ may help to find out why the session was terminated.

Regards,

/A

Hi

Some logs during the flap are

001359: Dec 12 12:18:43.229 IST: %BGP-5-ADJCHANGE: neighbor x.x.x.x Down BGP protocol initialization

001360: Dec 12 12:18:43.229 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x VPNv4 Unicast topology base removed from session  BGP protocol initialization

001361: Dec 12 12:18:43.229 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 MDT topology base removed from session  BGP protocol initialization

001362: Dec 12 12:18:43.733 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x VPNv4 Unicast topology base removed from session  Unknown path error

001363: Dec 12 12:18:43.733 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 MDT topology base removed from session  Unknown path error

001364: Dec 12 12:18:43.733 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 Unicast topology base removed from session  Unknown path error

001365: Dec 12 12:18:51.981 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 Unicast topology base removed from session  Capability changed

001366: Dec 12 12:18:51.981 IST: %BGP-5-ADJCHANGE: neighbor x.x.x.x Up

001367: Dec 12 12:21:54.183 IST: %BGP-3-NOTIFICATION: received from neighbor x.x.x.x 4/0 (hold time expired) 0 bytes

001368: Dec 12 12:21:54.183 IST: %BGP-5-ADJCHANGE: neighbor x.x.x.x Down BGP protocol initialization

001369: Dec 12 12:21:54.347 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x VPNv4 Unicast topology base removed from session  Peer closed the session

001370: Dec 12 12:21:54.347 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 MDT topology base removed from session  Peer closed the session

001371: Dec 12 12:21:55.363 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x VPNv4 Unicast topology base removed from session  Unknown path error

001372: Dec 12 12:21:55.363 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 MDT topology base removed from session  Unknown path error

001373: Dec 12 12:21:55.363 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 Unicast topology base removed from session  Unknown path error

001374: Dec 12 12:22:07.415 IST: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 Unicast topology base removed from session  Capability changed

001375: Dec 12 12:22:07.415 IST: %BGP-5-ADJCHANGE: neighbor x.x.x.x Up

Could you also get the ‘show log’ output and ‘show bgp vpnv4 unicast neighbors x.x.x.x detail’ from the XR side?

/A

RP/0/5/CPU0::Dec 12 12:15:43.496 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:15:59.965 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:16:18.789 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:16:37.619 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:16:56.410 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:17:15.227 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:17:34.054 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:17:52.879 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:18:11.683 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:18:30.499 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:18:42.915 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xDown - BGP Notification sent, hold time expired (VRF: default)

RP/0/5/CPU0::Dec 12 12:18:49.364 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:18:52.272 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xUp (VRF: default)

RP/0/5/CPU0::Dec 12 12:19:01.433 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x38341 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:19:21.680 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x38341 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:19:45.766 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:20:04.584 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:20:23.402 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:20:42.215 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:21:01.029 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:21:19.863 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:21:36.635 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x38341 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:21:54.039 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xDown - BGP Notification sent, hold time expired (VRF: default)

RP/0/5/CPU0::Dec 12 12:21:57.519 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:22:07.728 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xUp (VRF: default)

RP/0/5/CPU0::Dec 12 12:22:16.309 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x37416 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:22:30.616 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x38341 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:23:24.615 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x38341 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:24:18.581 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x38341 to 192.168.207.229:179

RP/0/5/CPU0::Dec 12 12:25:12.566 IST: tcp[443]: %IP-TCP-3-BADAUTH : Invalid MD5 digest from x.x.x.x38341 to 192.168.207.229:179

RP/0/5/CPU0:hostname#   sh bgp vpnv4 unicast neighbors x.x.x.x detai$

Mon Dec 12 18:28:16.091 IST

BGP neighbor is x.x.x.x

Remote AS aaaa, local AS aaaa, internal link

Description:

Remote router ID x.x.x.x

Cluster ID zz.zz.zz.zz

  BGP state = Established, up for 02:05:52

  Last read 00:00:18, Last read before reset 02:06:47

  Hold time is 180, keepalive interval is 60 seconds

  Configured hold time: 180, keepalive: 60, min acceptable hold time: 3

  Last write 00:00:00, attempted 110, written 110

  Second last write 00:00:00, attempted 110, written 110

  Last write before reset 02:06:08, attempted 124, written 124

  Second last write before reset 02:06:08, attempted 187, written 187

  Last write pulse rcvd  Dec 12 18:28:16.034 last full Dec 12 16:25:34.364 pulse count 13423732

  Last write pulse rcvd before reset 02:06:08

  Socket not armed for io, armed for read, armed for write

  Last write thread event before reset 02:06:08, second last 02:06:08

  Last KA expiry before reset 00:00:00, second last 00:00:00

  Last KA error before reset 00:00:00, KA not sent 00:00:00

  Last KA start before reset 02:06:08, second last 02:06:08

  Precedence: internet

  Neighbor capabilities:            Adv         Rcvd

    Route refresh:                  Yes         Yes

    4-byte AS:                      Yes         No

    Address family VPNv4 Unicast:   Yes         Yes

    Address family IPv4 MDT:        Yes         Yes

  Message stats:

    InQ depth: 0, OutQ depth: 0

                    Last_Sent               Sent  Last_Rcvd               Rcvd

    Open:           Dec 12 16:22:23.625       66  Dec 12 16:22:23.624       66

    Notification:   Dec 12 12:21:54.648       58  ---                        0

    Update:         Dec 12 18:28:16.369 56505846  Dec 12 18:14:26.334   653688

    Keepalive:      Dec 12 16:22:29.953      192  Dec 12 18:27:57.836    46465

    Route_Refresh:  ---                        0  Dec  8 11:57:08.021       22

    Total:                              56506162                        700241

  Minimum time between advertisement runs is 0 secs

For Address Family: VPNv4 Unicast

  BGP neighbor version 129938542

  Update group: 0.1

  Route-Reflector Client

  Route refresh request: received 22, sent 0

  2871 accepted prefixes, 2871 are bestpaths

  Cumulative no. of prefixes denied: 0.

  Prefix advertised 376981, suppressed 0, withdrawn 39560

  Maximum prefixes allowed 524288

  Threshold for warning message 75%, restart interval 0 min

  An EoR was received during read-only mode

  Last ack version 129938526, Last synced ack version 0

  Outstanding version objects: current 463, max 476

For Address Family: IPv4 MDT

  BGP neighbor version 10353

  Update group: 0.1

  Route-Reflector Client

  Route refresh request: received 0, sent 0

  0 accepted prefixes, 0 are bestpaths

  Cumulative no. of prefixes denied: 0.

  Prefix advertised 811, suppressed 0, withdrawn 0

  Maximum prefixes allowed 131072

  Threshold for warning message 75%, restart interval 0 min

  An EoR was received during read-only mode

  Last ack version 10353, Last synced ack version 0

  Outstanding version objects: current 0, max 6

  Connections established 66; dropped 65

  Local host: zz.zz.zz.zz, Local port: 179

  Foreign host: x.x.x.x, Foreign port: 25768

  Last reset 02:06:08, due to Peer closing down the session

  Peer reset reason: Remote closed the session (Connection timed out)

  Time since last notification sent to neighbor: 06:06:22

  Error Code: hold time expired

  Notification data sent:

    None

RP/0/5/CPU0:hostname#

Hi Ramya,

It seems you configured MD5 protection for the TCP session, but the password is not the same on both peers. Please remove it on both sides first to check that the session comes up correctly and is stable. Then add the password back on both sides, being careful with the syntax. Make sure, for example, that there is no trailing space.
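As a simplified illustration of why a single trailing space matters: the TCP MD5 option (RFC 2385) hashes the segment together with the shared key, so any difference in the key gives a different digest and the receiver drops the segment. The sketch below only hashes placeholder bytes plus the key; the real computation also covers the TCP pseudo-header and the fixed TCP header, and the key and segment contents here are made up.

```python
import hashlib

def md5_digest_sketch(segment: bytes, key: str) -> bytes:
    """Simplified stand-in for the RFC 2385 digest: MD5 over the segment
    bytes followed by the key (the pseudo-header and TCP header are omitted)."""
    return hashlib.md5(segment + key.encode("ascii")).digest()

segment = b"hypothetical BGP KEEPALIVE bytes"  # placeholder payload
sender = md5_digest_sketch(segment, "s3cret ")  # key typed with a trailing space
receiver = md5_digest_sketch(segment, "s3cret")

print(sender == receiver)  # False: the receiver would log an invalid digest
```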

HTH

Laurent.

Laurent

I don't think it is an MD5 issue, because the same password is set between my other routers and this XR router, and those sessions don't flap or show bad MD5 digests.

akiritch

That was bang on the bug. All the symptoms and the workaround behaved exactly as described on my routers. Thank you so much.

Hi Ramya,

A few things to look at:

The flaps seem to follow a pattern of one every 3 minutes, i.e. the hold time:

RP/0/5/CPU0::Dec 12 12:18:52.272 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xUp (VRF: default)

RP/0/5/CPU0::Dec 12 12:21:54.039 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xDown - BGP Notification sent, hold time expired (VRF: default)

So I believe the MD5 hash is indeed correct; otherwise the session would not establish in the first place.
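The interval between the two entries quoted above can be measured directly from the log text. This is a minimal sketch; the regular expression is written for the XR line format as pasted in this thread (including the missing space after the redacted address) and may need adjusting for other outputs.

```python
import re
from datetime import datetime

LOG_LINES = [
    "RP/0/5/CPU0::Dec 12 12:18:52.272 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xUp (VRF: default)",
    "RP/0/5/CPU0::Dec 12 12:21:54.039 IST: bgp[139]: %ROUTING-BGP-5-ADJCHANGE : neighbor x.x.x.xDown - BGP Notification sent, hold time expired (VRF: default)",
]

# Capture the time of day and the Up/Down state of each ADJCHANGE entry.
PATTERN = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) IST: bgp\[\d+\]: "
    r"%ROUTING-BGP-5-ADJCHANGE : neighbor \S+?(Up|Down)"
)

def parse(line):
    """Return (time, state) for an ADJCHANGE line, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%H:%M:%S.%f"), m.group(2)

(up_time, up_state), (down_time, down_state) = (parse(l) for l in LOG_LINES)
uptime_s = (down_time - up_time).total_seconds()
print(up_state, down_state, uptime_s)  # Up Down 181.767
```

The session stayed up for just over the 180 second hold time, which is what one would expect if no valid message was accepted after the session came up.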

What does the other side of this peering show? I would need to have a look at the same outputs that Alex requested, but from the other side.

Also, you mentioned at the very beginning that the MTU size is different. I recommend either equalizing it on both sides or setting the TCP MSS to a common value that fits through the smaller MTU.
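As a rough worked example of picking a common value: for IPv4 the MSS is the MTU minus the 20-byte IP header and the 20-byte TCP header, and the MD5 option (18 bytes, assumed here to be padded to 20) further reduces the data that fits in each segment. The MTU values below are made up, since the thread does not state them, and a smaller MTU elsewhere along the path would lower the result.

```python
IP_HEADER = 20       # bytes, IPv4 header without options
TCP_HEADER = 20      # bytes, TCP header without options
MD5_OPTION = 18 + 2  # bytes, TCP MD5 option assumed padded to a 4-byte boundary

def mss_for_mtu(mtu: int) -> int:
    """Largest MSS whose segments fit in one IPv4 packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER

# Hypothetical uplink MTUs for the two routers.
mtus = {"ios_router": 1500, "xr_router": 4470}

common_mss = min(mss_for_mtu(mtu) for mtu in mtus.values())
payload_with_md5 = common_mss - MD5_OPTION

print(common_mss)        # 1460
print(payload_with_md5)  # 1440
```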

HTH,

Ivan.

vinit shah
Level 1

Dear All,

Can anyone help to resolve the issue below?

Oct 26 07:55:49: %BGP-5-ADJCHANGE: neighbor x.x.x.x Down BFD adjacency down

Oct 26 07:55:49: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.xIPv4 Unicast topology base removed from session  BFD adjacency down

Oct 26 07:55:49: %BGP-5-ADJCHANGE: neighbor x.x.x.x Down BFD adjacency down

Oct 26 07:55:49: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 Unicast topology base removed from session  BFD adjacency down

Oct 26 07:55:49: %BGP-5-ADJCHANGE: neighbor x.x.x.x Down BFD adjacency down

Oct 26 07:55:49: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 Unicast topology base removed from session  BFD adjacency down

Oct 26 07:55:49: %BGP-5-ADJCHANGE: neighbor x.x.x.x Down BFD adjacency down

Oct 26 07:55:49: %BGP_SESSION-5-ADJCHANGE: neighbor x.x.x.x IPv4 Unicast topology base removed from session  BFD adjacency down

Oct 26 07:55:49: %BGP-5-ADJCHANGE: neighbor x.x.x.xDown BFD adjacency down

What needs to be done to resolve the above issue?