
LDP session flap on a 7600 router

prdpvaghela
Level 1

Hi guys,

Please find below a brief introduction to the problem:

The LDP neighbor sessions went down for a few seconds (as seen in the logs below) between the PE device shown in the topology and its two uplink P devices.

Please refer to the topology below:

PE [Te x/0/0] -------- TE Tunnel -------- P device at location A

   [Te y/0/0] -------- TE Tunnel -------- P device at location B

#show logs

917597: Mar  8 08:10:09 SAST: %LDP-5-NBRCHG: LDP Neighbor a.b.c.d:0 (5) is DOWN (TCP

connection closed by peer)

917664: Mar  8 08:10:19 SAST: %LDP-5-NBRCHG: LDP Neighbor e.f.g.h:0 (1) is DOWN

(Session KeepAlive Timer expired)

917701: Mar  8 08:10:21 SAST: %LDP-5-NBRCHG: LDP Neighbor a.b.c.d:0 (5) is UP

917771: Mar  8 08:10:23 SAST: %LDP-5-NBRCHG: LDP Neighbor e.f.g.h:0 (7) is UP

Can anybody tell me the reason behind the LDP flap?

Regards

Pradip

11 Replies

xuhu_commverge
Level 1

Your problem is that the keepalive timer expired, not the hello timer. First, check the timers you have already configured.

The command show mpls ldp parameters can be used to review the locally configured timers, and show mpls ldp neighbor detail can be used to show the final negotiated timers.

If the timer is 180 s, that is quite a long time; in that case I think you need to look for a circuit problem rather than a configuration problem.
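For reference, here is a minimal command sequence for reviewing and, if necessary, adjusting the session timers; the 180-second holdtime is only an illustrative value, and exact output formats vary by IOS release:

! Review the locally configured LDP session timers
show mpls ldp parameters
! Review the timers actually negotiated with each peer
show mpls ldp neighbor detail
! If needed, adjust the session holdtime globally (illustrative value)
configure terminal
 mpls ldp holdtime 180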

Hi Xu Hu,

Thank you for the reply.

Please find the output of the LDP parameters:

#show mpls ldp parameters

Protocol version: 1

Session hold time: 180 sec; keep alive interval: 60 sec

Discovery hello: holdtime: 15 sec; interval: 5 sec

Discovery targeted hello: holdtime: 90 sec; interval: 10 sec

Downstream on Demand max hop count: 255

LDP for targeted sessions

LDP initial/maximum backoff: 15/120 sec

LDP loop detection: off

The timer seems to be 180 sec.

There was no physical problem, as the other protocols, i.e. IS-IS (the IGP), were stable on the same links at the time of the problem. We observed a flap in the LDP protocol only.

Regards

Pradip

Does anyone have any clue, please?

Cheers

Pradip

Hello Pradip

Can you please check the following:

- show ibc

- show interface (output of the connected interfaces on both routers)

Has the targeted LDP session flapped, or the LDP session on the physical interface?

Since this was a TCP connection closure, you can also check the TCP MSS value. It is possible that for some reason it was negotiated to a lower value, which caused LDP to flap; a sketch of how to check it follows below.
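As a rough sketch of how to do that on IOS (the TCB address below is a placeholder; take the real one from the show tcp brief output for the session using LDP port 646):

! List the router's TCP sessions and note the TCB address of the LDP session (port 646)
show tcp brief
! Display that session's details; the output reports the negotiated
! maximum data segment size for the connection
show tcp tcb 489A2E14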

Please provide the above information so I can look further into this.

Regards

Vinit


Hi Vinit,

Please find below the output of both show ibc and show interface:

# show ibc

Interface information:

        Interface IBC0/0(idb 0x1D224BF0)

        5 minute rx rate 390000 bits/sec, 470 packets/sec

        5 minute tx rate 724000 bits/sec, 696 packets/sec

        1227785718 packets input, 132277835813 bytes

        531848764 broadcasts received

        703054986 packets output, 89510912001 bytes

        53000709 broadcasts sent

        0 Bridge Packet loopback drops

        386955797 Packets CEF Switched, 58076 Packets Fast Switched

        0 Packets SLB Switched, 0 Packets CWAN Switched

        Label switched pkts dropped: 0    Pkts dropped during dma: 219471401

        Invalid pkts dropped: 57811    Pkts dropped(not cwan consumed): 8925

        Xconnect pkts processed: 0, dropped: 1111508

        Xconnect pkt reflection drops: 0

        Total paks copied for process level 0

        Total short paks sent in route cache 78654011

        Total throttle drops 218301133    Input queue drops 773327

        total spd packets classified (456244739 low, 359912505 medium, 20316995 high)

        total spd packets dropped (153503515 low, 65948095 medium, 41 high)

        spd prio pkts allowed in due to selective throttling (0 med, 0 high)

        IBC resets   = 1; last at 05:12:26.831 SAST Wed Feb 8 2012

Driver Level Counters: (Cumulative, Zeroed only at Reset)

          Frames          Bytes

  Rx(0)   368004372       1762711303

  Rx(1)   913412478       2159603712

  Tx(0)   728948380       3037780905

Input Drop Frame Count

     Rx0 = 22141            Rx1 = 996745

Per Queue Receive Errors:

     FRME   OFLW   BUFE   NOENP  DISCRD DISABLE BADCOUNT

Rx0 0      0      0      0      0        0    0

Rx1 0      0      0      68     0        0    0

  Tx Errors/State:

   One Collision Error   = 0            More Collisions       = 0

   No Encap Error        = 0            Deferred Error        = 0

   Loss Carrier Error    = 0            Late Collision Error  = 0

   Excessive Collisions  = 0            Buffer Error          = 0

   Tx Freeze Count       = 0            Tx Intrpt Serv timeout= 1

  Counters collected at Idb:

   Is input throttled    = 0            Throttle Count        = 0

   Rx Resource Errors    = 0            Input Drops           = 1104309

   Input Errors           = 610494

   Output Drops          = 0            Giants/Runts          = 0/0

   Dma Mem Error         = 0            Input Overrun         = 0

#show int te X/0/0

TenGigabitEthernetX/0/0 is up, line protocol is up (connected)

  MTU 4470 bytes, BW 10000000 Kbit, DLY 10 usec,

     reliability 255/255, txload 20/255, rxload 43/255

  Encapsulation ARPA, loopback not set

  Keepalive not supported

  Carrier delay is 0 msec

  Full-duplex, 10Gb/s

  Transport mode LAN (10GBASE-R, 10.3125Gb/s)

  input flow-control is on, output flow-control is on

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:00, output 00:00:00, output hang never

  Last clearing of "show interface" counters 2w1d

  Input queue: 0/75/38/33 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: Class-based queueing

  Output queue: 0/40 (size/max)

  5 minute input rate 1691089000 bits/sec, 390075 packets/sec

  5 minute output rate 818470000 bits/sec, 413745 packets/sec

  L2 Switched: ucast: 5460795 pkt, 851382131 bytes - mcast: 557617 pkt, 63052504 bytes

  L3 in Switched: ucast: 72659051788 pkt, 57114890177182 bytes - mcast: 0 pkt, 0 bytes mcast

  L3 out Switched: ucast: 237095519242 pkt, 42944537492139 bytes mcast: 0 pkt, 0 bytes

     704163207990 packets input, 315321595661874 bytes, 1 no buffer

     Received 1125 broadcasts (0 IP multicasts)

     0 runts, 0 giants, 3 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 567310 multicast, 0 pause input

     0 input packets with dribble condition detected

     786519825664 packets output, 210404496698710 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 pause output

     0 output buffer failures, 0 output buffers swapped out

#show int te Y/0/0

TenGigabitEthernetY/0/0 is up, line protocol is up (connected)

  MTU 4470 bytes, BW 10000000 Kbit, DLY 10 usec,

     reliability 255/255, txload 5/255, rxload 2/255

  Encapsulation ARPA, loopback not set

  Keepalive not supported

  Carrier delay is 0 msec

  Full-duplex, 10Gb/s

  Transport mode LAN (10GBASE-R, 10.3125Gb/s)

  input flow-control is on, output flow-control is on

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:00, output 00:00:00, output hang never

  Last clearing of "show interface" counters 2w1d

  Input queue: 0/75/292/19 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: Class-based queueing

  Output queue: 0/40 (size/max)

  5 minute input rate 112466000 bits/sec, 66131 packets/sec

  5 minute output rate 221669000 bits/sec, 69520 packets/sec

  L2 Switched: ucast: 2161150 pkt, 440372316 bytes - mcast: 514981 pkt, 65137030 bytes

  L3 in Switched: ucast: 5460878448 pkt, 1060741666543 bytes - mcast: 0 pkt, 0 bytes mcast

  L3 out Switched: ucast: 3518851937 pkt, 2138635500086 bytes mcast: 0 pkt, 0 bytes

     136383893748 packets input, 34454049909804 bytes, 1 no buffer

     Received 1125 broadcasts (0 IP multicasts)

     0 runts, 0 giants, 3 throttles

     267 input errors, 220 CRC, 47 frame, 0 overrun, 0 ignored

     0 watchdog, 536405 multicast, 0 pause input

     0 input packets with dribble condition detected

     124870339660 packets output, 57889219631351 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 pause output

     0 output buffer failures, 0 output buffers swapped out

There was no flap observed on the physical interfaces. The IGP is also running over the same physical links, and it was up and stable.

Please tell me how we can check the TCP MSS value for these LDP sessions.

- Pradip

Hello Pradip

I noticed a few things in the above outputs:

1. In the show ibc output, I noticed that there are input queue drops; this means that packets have been dropped by the RP of the router.

2. On the first TenGig interface I see input queue drops, and on the other one I notice a much larger number of input queue drops, which means that the interface has run out of buffers and has started dropping packets. This can sometimes lead to control-plane packets being dropped. For this I recommend you increase the input hold-queue size on the interface with the interface command "hold-queue 2000 in" (see the sketch after this list).

3. I also notice input errors and CRC errors on the 2nd TenGig interface. This points to an issue with the interface or the media on that link. Can you please get the fiber checked, along with the SFP?

4. I also recommend you configure MPLS LDP-IGP synchronization (mpls ldp sync); a configuration sketch follows below.
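Here is a minimal configuration sketch of both recommendations, assuming the IGP is IS-IS as mentioned earlier in the thread (the interface name is the placeholder used above):

configure terminal
 interface TenGigabitEthernetX/0/0
  ! Enlarge the input hold queue so traffic bursts are less likely
  ! to drop control-plane packets such as LDP keepalives
  hold-queue 2000 in
 !
 router isis
  ! LDP-IGP sync: do not prefer this link in the IGP until LDP is up on it
  mpls ldp sync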

Regards

Vinit


Thanks Vinit,

Point #2: Will increasing the hold-queue impact the operation of the network, including voice traffic?

Point #3: May I know the purpose of that command, please?

-Regards

Pradip

Hello Pradip

Point #2: No, increasing the hold queue will not have any impact on network operation.

Point #3: There are many benefits to this command; in short, LDP-IGP synchronization holds the IGP back from preferring a link until the LDP session over it is established, so traffic is not black-holed on a link that cannot yet switch labels. You may want to refer to the link below; it will answer your question.

http://www.cisco.com/en/US/docs/ios/12_0s/feature/guide/fsldpsyn.html#wp1053817

Regards

Vinit


Thank you for the prompt reply.

-Pradip

The issue was fixed by using these 2 commands, right?

mpls ldp router-id loopback 0 force

router x

  mpls ldp sync

If the 2 commands above are configured and LDP still goes down unexpectedly, what should I do?

Sep 14 01:31:21.091 bkk: %SYS-5-CONFIG_I: Configured from console by am on vty0 (x.x.3.37)
Sep 14 04:22:46.586 bkk: %SYS-5-CONFIG_I: Configured from console by am on vty0 (x.x.3.37)
Sep 14 04:22:46.626 bkk: %LDP-5-NBRCHG: LDP Neighbor x.x.0.73:0 (14) is UP
Sep 15 11:15:48.842 bkk: %LDP-5-NBRCHG: LDP Neighbor x.x.0.163:0 (10) is DOWN (TCP connection closed by peer)
Sep 15 14:10:55.397 bkk: %SYS-5-CONFIG_I: Configured from console by ps on vty1 (x.x.12.16)

Thank you very much.

jmauleonf
Level 1

I had the same flapping problem, but the root cause was that the LDP router ID of one router was not reachable by the neighbor:

RouterA#show mpls ldp discovery

Local LDP Identifier:

    10.10.10.1:0

   Discovery Sources:

    Interfaces:

        Ethernet0/0 (ldp): xmit/recv

RouterB#sh ip ro 10.10.10.1

% Subnet not in table

This was resolved by forcing RouterA to use a router ID that was known to the other router, with the command:

mpls ldp router-id loopback 0 force

RouterA#show mpls ldp discovery

Local LDP Identifier:

    1.1.1.1:0

   Discovery Sources:

    Interfaces:

        Ethernet0/0 (ldp): xmit/recv

       

RouterB#sh ip ro 1.1.1.1

Routing entry for 1.1.1.1/32

  Known via "ospf 10", distance 110, metric 11, type intra area

 

And the LDP session became stable.
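For completeness, here is a minimal sketch of the configuration that makes this fix work, assuming Loopback0 carries the 1.1.1.1/32 address and is advertised into OSPF process 10 (the area number is an assumption):

configure terminal
 interface Loopback0
  ip address 1.1.1.1 255.255.255.255
 !
 router ospf 10
  ! Advertise the loopback so the neighbor can reach the LDP router ID
  network 1.1.1.1 0.0.0.0 area 0
 !
 ! Source the LDP router ID (and transport address) from Loopback0
 mpls ldp router-id loopback 0 force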
