andybonyx · ‎07-20-2010

Hi all,

We've recently introduced two 7606-S routers on our core that are handling peering sessions and we've experienced some problems with sessions not coming up/flapping.

The behaviour matches a known bug, but as we're not using BGP dampening we've discounted this. The router is a 7606-S running IOS 12.2(33)SRD4, RELEASE SOFTWARE (fc2) (disk0:/c7600s72033-advipservices-mz.122-33.SRD4.bin)

And after enabling debugging on the affected peer (there are between 10 and 15 peers showing this behaviour) we see:

2d11h: BGP: 195.xx.xx.xx active went from Idle to Active
2d11h: BGP: 195.xx.xx.xx open active, local address 195.66.226.35
2d11h: BGP: ses global 195.xx.xx.xx (0) act read request no-op
2d11h: BGP: ses global 195.xx.xx.xx (0) act Adding topology IPv4 Unicast:base
2d11h: BGP: ses global 195.xx.xx.xx (0) act Send OPEN
2d11h: BGP: ses global 195.xx.xx.xx (0) act NSF Building GR capability.
2d11h: BGP: 195.xx.xx.xx active went from Active to OpenSent
2d11h: BGP: 195.xx.xx.xx active sending OPEN, version 4, my as: 6067, holdtime 180 seconds, ID C361C032
2d11h: BGP: 195.xx.xx.xx active send message type 1, length (incl. header) 60
2d11h: BGP: ses global 195.xx.xx.xx (0) act Remote close.
2d11h: BGP: nbr_topo global 195.xx.xx.xx IPv4 Unicast:base (0) NSF delete stale NSF not active
2d11h: BGP: nbr_topo global 195.xx.xx.xx IPv4 Unicast:base (0) NSF no stale paths state is NSF not active
2d11h: BGP: nbr_topo global 195.xx.xx.xx IPv4 Unicast:base (0) Resetting ALL counters.
2d11h: BGP: 195.xx.xx.xx active closing
2d11h: BGP: nbr_topo global 195.xx.xx.xx IPv4 Unicast:base (0) Resetting ALL counters.
2d11h: BGP: 195.xx.xx.xx active went from OpenSent to Idle
*Jul 20 10:26:11: %BGP_SESSION-5-ADJCHANGE: neighbor 195.xx.xx.xx IPv4 Unicast topology base removed from session Unknown path error
2d11h: BGP: ses global 195.xx.xx.xx (0) act Removed topology IPv4 Unicast:base
2d11h: BGP: nbr global 195.xx.xx.xx Open active delayed 9216ms (35000ms max, 60% jitter)

This cycle repeats constantly for the affected peer. It doesn't happen to all sessions and we're now trying to pinpoint the problem. The config is fairly straightforward for one of the affected peers:

neighbor 195.xx.xx.xx remote-as 1234

neighbor 195.xx.xx.xx peer-group ix-linx-peers

neighbor 195.xx.xx.xx update-source GigabitEthernet4/2

neighbor 195.xx.xx.xx activate

neighbor 195.xx.xx.xx maximum-prefix 100

The peer-group has nothing exciting in it either, so we're a little stumped as it seems to be the bug relating to path dampening, but that isn't enabled here.

Can anyone advise or suggest next troubleshooting steps?

Thanks!
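
For anyone reading a similar debug capture, a minimal sketch (not from the thread) that pulls the BGP state transitions out of the text can make the loop easier to see. It assumes the "went from X to Y" wording shown in the output above and that a working session would eventually log a transition to "Established"; the function names are hypothetical.

```python
import re

# Matches lines such as "BGP: 195.xx.xx.xx active went from Idle to Active".
TRANSITION = re.compile(r"went from (\w+) to (\w+)")


def fsm_transitions(debug_text):
    """Return the (old_state, new_state) pairs found in the text, in order."""
    return TRANSITION.findall(debug_text)


def reached_established(debug_text):
    """True if any transition ends in Established (assumed state name)."""
    return any(new == "Established" for _, new in fsm_transitions(debug_text))


# A shortened copy of the debug lines quoted in the post above.
sample = (
    "2d11h: BGP: 195.xx.xx.xx active went from Idle to Active\n"
    "2d11h: BGP: 195.xx.xx.xx active went from Active to OpenSent\n"
    "2d11h: BGP: ses global 195.xx.xx.xx (0) act Remote close.\n"
    "2d11h: BGP: 195.xx.xx.xx active went from OpenSent to Idle\n"
)

print(fsm_transitions(sample))
print(reached_established(sample))
```

In the capture above the session goes Idle, Active, OpenSent and back to Idle after the remote side closes, so it never gets as far as OpenConfirm.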

west33637 · ‎07-20-2010

Hello. I see you have the BGP update source as gi4/2. Check to see if the remote neighbor can ping this router's gi4/2 interface. This would be a good test to see if there is connectivity to form the BGP adjacency. If the remote neighbor can't ping this gi4/2 interface then they cannot form a TCP connection over port 179. I'm assuming this is an IBGP connection.

Most commonly in IBGP, I have seen the update source configured as a loopback interface that is reachable via IGP. This ensures that in case of a hardware failure on Gi4/2 the BGP connection will still remain up as long as there is an IGP path to get to the 'always up' loopback interface.

Also, seeing the peer-group configuration and the configuration on the other end would help. Thx

milan.kulik · ‎07-20-2010

Hi,

I guess there is some problem with routing to reach your BGP neighbor (or for him to reach your router)?

Are you both using the IP addresses within a directly connected subnet for peering?

If not, is there ebgp-multihop configured (if eBGP is used)?

And is there a static route configured on both sides for the neighbor address?

I noticed similar behaviour in the past when routing to a peer was based on a default route only. The peering was established but disconnected after a minute.

HTH,

Milan

andybonyx · ‎07-20-2010

Thanks for the replies guys. Yes, the peers can ping each other, and it is intentional to use the interface IP address as this is a directly connected peering setup (over a peering exchange).

So I can do a sh ip route 195.xx.xx.xx for the IP and see it as:

Known via "connected", distance 0, metric 0 (connected, via interface)

..

* directly connected, via GigabitEthernet4/2

So that looks fine, we can ping constantly over the link, and prior to the IOS upgrade on this device the peer was working fine.

milan.kulik · ‎07-20-2010

Hi,

what about the maximum-prefix limit on the neighbor side?

Aren't you exceeding it with the number of prefixes you are advertising?

BR,

Milan

west33637 · ‎07-20-2010

Hello Andy. Try telnetting to the neighbors on port 179 from both sides. See if a connection is opened.

Telnet 195.xx.xx.xx 179

debug ip tcp transactions --- where your access-list is narrowed down to permit communication on port 179

That might show you some detail as to what's going on in the TCP establishment.
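
If you would rather script the port 179 test than telnet by hand, here is a rough sketch using Python's standard socket module. It is only an illustration: the function name is made up, it has to be run from a host that can reach the peer (so the source address will not be the router's), and a True result only says the TCP handshake completed, not that the peer will accept the BGP OPEN.

```python
import socket


def tcp_port_open(host, port=179, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection performs the full three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Calling tcp_port_open with the neighbor address in place of the masked 195.xx.xx.xx would be the equivalent of the telnet test. Note that in the debug output earlier in the thread the router does send an OPEN before the "Remote close", which suggests the TCP session itself was coming up.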

west33637 · ‎07-20-2010

Also, try removing the update-source command on both ends since they are directly connected neighbors. You shouldn't need the update-source, although I honestly don't see why that should cause the neighbor adjacency not to form.

If that doesn't work, remove the neighbors from the peer-group. Bring it back to the very basic BGP configuration between the two connected neighbors and test it then.

osoerensen · ‎01-31-2017

I know this is an old thread, but somebody may find it useful someday. I experienced the same behavior today... it turned out that one of the peers had a wrong (wider) subnet mask :)
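
A quick way to spot that kind of mismatch is to compare the interface address and mask configured on each side. The sketch below uses Python's ipaddress module; the addresses come from the documentation range and are purely hypothetical, and the helper name is invented. It only detects that the two ends disagree about the connected network, it does not explain why the session then fails.

```python
import ipaddress


def masks_agree(local_if, remote_if):
    """True if both interface addresses describe the same connected network."""
    a = ipaddress.ip_interface(local_if)
    b = ipaddress.ip_interface(remote_if)
    # Networks are equal only if both the base address and the mask match.
    return a.network == b.network


# Hypothetical addresses, not taken from the thread.
print(masks_agree("192.0.2.10/24", "192.0.2.20/24"))
print(masks_agree("192.0.2.10/24", "192.0.2.20/23"))
```

The second call models one peer with a wider /23 mask on what should be a /24 peering LAN, which is the situation described above.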

BGP Peers not coming up 7606-S unknown path error - not using dampening!