IPv6 Tunnel Input Wedged on 15.1(4)M4/M5

Answered Question
Nov 23rd, 2012

Hi,


I have a problem with an IPv6 tunnel (ipv6ip) on a Cisco 1841 running 15.1(4)M4 or 15.1(4)M5.

It appears that a bug related to IPv6 tunnels and IP SLA was introduced in 15.1(4)M4.


interface Tunnel64
 description IPv6 Tunnel to x.x.x.x
 ipv6 address 2001:XXXX:XXXX:XXXX::2/64
 tunnel source ATM0/1/0.1
 tunnel mode ipv6ip
 tunnel destination x.x.x.x
!


After reloading the router, I can see the size of the input queue slowly increasing ("Input queue: 30/75/0/0"). It appears that specific packets are getting stuck in the input queue while the router still processes the majority of IPv6 packets. After a short period of time the input queue becomes wedged ("Input queue: 76/75/0/0") and the tunnel stops passing IPv6 traffic until I reload the router.


Tunnel64 is up, line protocol is up
  Hardware is Tunnel
  Description: IPv6 Tunnel to x.x.x.x
  MTU 17920 bytes, BW 100 Kbit/sec, DLY 50000 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation TUNNEL, loopback not set
  Keepalive not set
  Tunnel source x.x.x.x (ATM0/1/0.1), destination x.x.x.x
   Tunnel Subblocks:
      src-track:
         Tunnel64 source tracking subblock associated with ATM0/1/0.1
          Set of tunnels with source ATM0/1/0.1, 1 member (includes iterators), on interface <OK>
  Tunnel protocol/transport IPv6/IP
  Tunnel TTL 255
  Tunnel transport MTU 1480 bytes
  Tunnel transmit bandwidth 8000 (kbps)
  Tunnel receive bandwidth 8000 (kbps)
  Last input 00:00:15, output 00:00:15, output hang never
  Last clearing of "show interface" counters never
  Input queue: 76/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/0 (size/max)
  30 second input rate 0 bits/sec, 0 packets/sec
  30 second output rate 0 bits/sec, 0 packets/sec
     2253 packets input, 1691254 bytes, 0 no buffer
     Received 0 broadcasts (0 IP multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     1844 packets output, 730645 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
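
In the output above, the wedged state is visible as an input queue size that exceeds the configured maximum (76/75). As a rough illustration, the "Input queue:" line could be checked with a short Python sketch like the one below. The function name and the "size greater than max" test are assumptions made for this example, not an official Cisco health check.

```python
import re

# Matches e.g. "Input queue: 76/75/0/0 (size/max/drops/flushes)"
INPUT_QUEUE_RE = re.compile(r"Input queue:\s*(\d+)/(\d+)/(\d+)/(\d+)")


def parse_input_queue(show_interface_output):
    """Return the input queue counters from 'show interface' text, or None.

    'wedged' is a heuristic for this example: the queue size is larger
    than the configured maximum, as in the 76/75 case in this thread.
    """
    match = INPUT_QUEUE_RE.search(show_interface_output)
    if match is None:
        return None
    size, maximum, drops, flushes = (int(value) for value in match.groups())
    return {
        "size": size,
        "max": maximum,
        "drops": drops,
        "flushes": flushes,
        "wedged": size > maximum,
    }


if __name__ == "__main__":
    line = "  Input queue: 76/75/0/0 (size/max/drops/flushes); Total output drops: 0"
    print(parse_input_queue(line))
```

Feeding it the earlier "30/75/0/0" reading reports the queue as not yet wedged, so the same check could be run periodically to watch the leak grow.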


I also have an IP SLA probe on the router to verify if connectivity is working over the IPv6 tunnel:


ip sla 10
 icmp-echo 2001:XXXX:XXXX:XXXX::1
!
ip sla schedule 10 life forever start-time now


It appears that the IP SLA return packets are getting stuck in the input queue, because the input queue size increments every time I receive a response to my IP SLA probe (every 60 seconds). I have tried changing the values in the probe (packet size, ToS, etc.) without any luck. I am able to ping the same IPv6 address normally from the command line without seeing this behaviour.


Can I deduce that this is a potential buffer leak? I can't find anything in Bug Toolkit relating to this.


Has anyone come across this issue before, and does anyone know of a workaround?


Thanks in advance,

Chris

Marcin Latosiewicz (Cisco Employee) Sat, 11/24/2012 - 02:58

Chris,


I ran a few queries against our internal DB and could not find any bug that matches what you're describing.


At the same time, I think it would make sense to dump the buffers and check exactly what is in there. It should also be quite easily reproducible in our labs.


Can I ask you to open a TAC case?


M.

Chris Mason Sat, 11/24/2012 - 11:57

Hi,


Unfortunately I don't have TAC support for this device. I have a workaround by running older code, but I was curious to see if anyone else was hitting this issue, as it seems a fairly common use case.


I have run some debugs and can see the packets mount up in the "show buffers old" output:


r1#show buffers old

  Header DataArea  Pool Rcnt  Size Link  Enc    Flags          Input      Output
6615260C F5C0EE84 Middl    1    96   79   31  2000200           Tu64        Tu64
661538FC F5C0FB84 Middl    1    96   79   31  2000200           Tu64        Tu64
66154BEC F5C10884 Middl    1    96   79   31  2000200           Tu64        Tu64
66155A20 F5C11244 Middl    1    96   79   31  2000200           Tu64        Tu64
66E38C50 F60B88E4 Middl    1    96   79   31  2000200           Tu64        Tu64
66F661B0 F60B8264 Middl    1    96   79   31  2000200           Tu64        Tu64
671EA9AC F60B95E4 Middl    1    96   79   31  2000200           Tu64        Tu64
673F7DD8 F60B7F24 Middl    1    96   79   31  2000200           Tu64        Tu64
674BD860 F60B85A4 Middl    1    96   79   31  2000200           Tu64        Tu64
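
To see at a glance which interface the old buffers belong to, the table can be tallied by input interface, size, and link type. The Python sketch below is only an illustration: it assumes the ten-column layout shown in this thread, and the function name is made up for the example.

```python
from collections import Counter

# First three rows of the "show buffers old" output above, header included.
SAMPLE = """\
  Header DataArea  Pool Rcnt  Size Link  Enc    Flags          Input      Output
6615260C F5C0EE84 Middl    1    96   79   31  2000200           Tu64        Tu64
661538FC F5C0FB84 Middl    1    96   79   31  2000200           Tu64        Tu64
66154BEC F5C10884 Middl    1    96   79   31  2000200           Tu64        Tu64
"""


def count_old_buffers(output):
    """Count old buffers per (input interface, size, link type)."""
    counts = Counter()
    for line in output.splitlines():
        fields = line.split()
        # Data rows have ten columns and begin with a hex header address;
        # the column-title row fails the hex check and is skipped.
        if len(fields) != 10:
            continue
        try:
            int(fields[0], 16)
        except ValueError:
            continue
        input_interface = fields[8]
        size = int(fields[4])
        link_type = int(fields[5])
        counts[(input_interface, size, link_type)] += 1
    return counts


if __name__ == "__main__":
    for key, count in count_old_buffers(SAMPLE).items():
        print(key, count)
```

If every old buffer lands in a single bucket, as it does here (Tu64, 96 bytes, link type 79), that points to one kind of packet being leaked rather than general congestion.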


IP SLA appears to be padding the packets with random data as, although they are all the same size, they contain random payload information:


Buffer information for Middle buffer at 0x673F7DD8
  data_area 0xF60B7F24, refcount 1, next 0x0, flags 0x2000200
  linktype 79 (IPV6), enctype 31 (TUNNEL), encsize 0, rxtype 26
  if_input 0x682D4FE0 (Tunnel64), if_output 0x682D4FE0 (Tunnel64)
  inputtime 00:04:35.044 (elapsed 00:12:16.016)
  outputtime 00:04:15.536 (elapsed 00:12:35.524), oqnumber 65535
  datagramstart 0xF60B7F64, datagramsize 96, maximum size 756
  mac_start 0xF60B7F56, addr_start 0x0, info_start 0xF60B7F4C
  network_start 0xF60B7F78, transport_start 0xF60B7FA0, caller_pc 0x6005D410

F60B7F20:                   AFACEFAD 0A303030          /,o-.000
F60B7F28: 3031393A 204E6F76 20323420 31393A33  019: Nov 24 19:3
F60B7F38: 393A3531 2E373139 20474D54 3A20254C  9:51.719 GMT: %L
F60B7F48: 494E4B2D 332D5550 444F574E 3A20496E  INK-3-UPDOWN: In
F60B7F58: 74657266 00010000 AAAA0300 45000060  terf....**..E..`
F60B7F68: 85BB0000 F829C1C1 C2F66D70 BCDF8DB1  .;..x)AABvmp<_.1
F60B7F78: 68800000 00243A40 2001067C 204CFFFE  h....$:@ ..| L.~
F60B7F88: 00000000 00000001 2001067C 204CFFFE  ........ ..| L.~
F60B7F98: 00000000 00000002 81002C36 00040001  ..........,6....
F60B7FA8: 01020304 05060708 090A0B0C 0D0E0F10  ................
F60B7FB8: 11121314 15161718 191A1B1C 090A0B0C  ................
F60B7FC8: 0D0E0F10 11121314 15161718 191A1B1C  ................
F60B7FD8: DFABC9A8 8011246D 1E7F0000 0101080A  _+I(..$m........
F60B7FE8: D0999C89 42D790D3 8D86D89B 4B4D431F  P...BW.S..X.KMC.
F60B7FF8: FD21727E 5F23DFB0 2B2E4625 36B4F7BC  }!r~_#_0+.F%64w<
F60B8008: B06CB1B8 4D2686C2 E1EFF74C 1D763883  0l18M&.BaowL.v8.
F60B8018: E9958265 03505E66 09360727 38482306  i..e.P^f.6.'8H#.
F60B8028: 98C1B8A5 AE02C409 A06FF610 7EF25DDF  .A8%..D. ov.~r]_
F60B8038: ACC16983 C4D8E476 A879E327 714D4ECF  ,Ai.DXdv(yc'qMNO
F60B8048: 0DDD0597 59311E36 64046A3F 81FD0042  .]..Y1.6d.j?.}.B
F60B8058: 1A807A57 B8938A03 034AF37D AB923EE6  ..zW8....Js}+.>f
F60B8068: 2676EF08 383E1B5F 1C0B2F77 4883B60B  &vo.8>._../wH.6.
F60B8078: 64D22FBA E9C01D6B 247AE37D 17F8A1E9  dR/:i@.k$zc}.x!i
F60B8088: 71BB5C7F E6CE02FF 00000000 00000000  q;\.fN..........
F60B8098: 00000000 00000000 00000000 00000000  ................
F60B80A8: 00000000 00000000 00000000 00000000  ................
F60B80B8: 00000000 00000000 00000000 00000000  ................
F60B80C8: 00000000 00000000 00000000 00000000  ................
F60B80D8: 00000000 00000000 00000000 00000000  ................
F60B80E8: 00000000 00000000 00000000 00000000  ................
F60B80F8: 00000000 00000000 00000000 00000000  ................
F60B8108: 00000000 00000000 00000000 00000000  ................
F60B8118: 00000000 00000000 00000000 00000000  ................
F60B8128: 00000000 00000000 00000000 00000000  ................
F60B8138: 00000000 00000000 00000000 00000000  ................
F60B8148: 00000000 00000000 00000000 00000000  ................
F60B8158: 00000000 00000000 00000000 00000000  ................
F60B8168: 00000000 00000000 00000000 00000000  ................
F60B8178: 00000000 00000000 00000000 00000000  ................
F60B8188: 00000000 00000000 00000000 00000000  ................
F60B8198: 00000000 00000000 00000000 00000000  ................
F60B81A8: 00000000 00000000 00000000 00000000  ................
F60B81B8: 00000000 00000000 00000000 00000000  ................
F60B81C8: 00000000 00000000 00000000 00000000  ................
F60B81D8: 00000000 00000000 00000000 00000000  ................
F60B81E8: 00000000 00000000 00000000 00000000  ................
F60B81F8: 00000000 00000000 00000000 00000000  ................
F60B8208: 00000000 00000000 00000000 00        .............
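
Decoding the 96 bytes that start at datagramstart (0xF60B7F64) may help interpret the dump. The Python sketch below assumes an IPv4 header followed directly by a fixed 40-byte IPv6 header with no extension headers; the function and field names are invented for this example. Under those assumptions the buffer holds an ICMPv6 echo reply (type 129) carried in IP protocol 41, and because datagramsize is 96, the bytes shown after that point would be leftover buffer contents rather than part of the packet.

```python
import struct

# The 96 bytes starting at datagramstart (0xF60B7F64), copied from the dump.
DATAGRAM = bytes.fromhex(
    "45000060"
    "85BB0000F829C1C1C2F66D70BCDF8DB1"
    "6880000000243A402001067C204CFFFE"
    "00000000000000012001067C204CFFFE"
    "000000000000000281002C3600040001"
    "0102030405060708090A0B0C0D0E0F10"
    "1112131415161718191A1B1C"
)


def decode_tunnel_datagram(datagram):
    """Pull a few header fields out of an IPv6-in-IPv4 ICMPv6 packet."""
    ipv4_header_len = (datagram[0] & 0x0F) * 4  # IHL counts 32-bit words
    (ipv4_total_len,) = struct.unpack("!H", datagram[2:4])
    ipv4_protocol = datagram[9]  # 41 means IPv6 encapsulated in IPv4

    ipv6 = datagram[ipv4_header_len:ipv4_header_len + 40]
    ipv6_version = ipv6[0] >> 4
    (ipv6_payload_len,) = struct.unpack("!H", ipv6[4:6])
    ipv6_next_header = ipv6[6]  # 58 means ICMPv6

    icmpv6 = datagram[ipv4_header_len + 40:]
    icmp_type, icmp_code = icmpv6[0], icmpv6[1]  # type 129 is an echo reply
    identifier, sequence = struct.unpack("!HH", icmpv6[4:8])

    return {
        "ipv4_total_len": ipv4_total_len,
        "ipv4_protocol": ipv4_protocol,
        "ipv6_version": ipv6_version,
        "ipv6_payload_len": ipv6_payload_len,
        "ipv6_next_header": ipv6_next_header,
        "icmp_type": icmp_type,
        "icmp_code": icmp_code,
        "identifier": identifier,
        "sequence": sequence,
        "echo_data": icmpv6[8:],
    }


if __name__ == "__main__":
    for name, value in decode_tunnel_datagram(DATAGRAM).items():
        print(name, value)
```

The echo data decodes to the ordered byte pattern 0x01 through 0x1C, which is consistent with a probe reply and not with random padding inside the packet itself.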


Thanks,

Chris

Andrea Florio Thu, 02/14/2013 - 02:23

I have exactly the same issue. How did you solve it (if you did)? By running M3?


Gateway#sh int tun 0 | i queue
  Input queue: 76/75/100/0 (size/max/drops/flushes); Total output drops: 0
  Output queue: 0/0 (size/max)

Gateway#sh buffers old

  Header DataArea  Pool Rcnt  Size Link  Enc    Flags          Input      Output

664C15C0 EEA06EA4 Middl    1    96   79   31      200            Tu0         Tu0
664C1A7C EEA071E4 Middl    1    96   79   31      200            Tu0         Tu0
664C1F38 EEA07524 Middl    1    96   79   31      200            Tu0         Tu0
664C23F4 EEA07864 Middl    1    96   79   31      200            Tu0         Tu0
664C28B0 EEA07BA4 Middl    1    96   79   31      200            Tu0         Tu0
664C2D6C EEA07EE4 Middl    1    96   79   31      200            Tu0         Tu0
664C3228 EEA08224 Middl    1    96   79   31      200            Tu0         Tu0
664C36E4 EEA08564 Middl    1    96   79   31      200            Tu0         Tu0
664C3BA0 EEA088A4 Middl    1    96   79   31      200            Tu0         Tu0
664C405C EEA08BE4 Middl    1    96   79   31      200            Tu0         Tu0
664C4518 EEA08F24 Middl    1    96   79   31      200            Tu0         Tu0
664C49D4 EEA09264 Middl    1    96   79   31      200            Tu0         Tu0
664C4E90 EEA095A4 Middl    1    96   79   31      200            Tu0         Tu0
664C534C EEA098E4 Middl    1    96   79   31      200            Tu0         Tu0
664C5808 EEA09C24 Middl    1    96   79   31      200            Tu0         Tu0
66F2BECC EEE92304 Middl    1    96   79   31      200            Tu0         Tu0
66F2C388 EEE92644 Middl    1    96   79   31      200            Tu0         Tu0
66F2D530 EEE90C44 Middl    1    96   79   31      200            Tu0         Tu0
66F40880 EEE8F8C4 Middl    1    96   79   31      200            Tu0         Tu0
6758A5A0 EEE26C64 Middl    1    96   79   31      200            Tu0         Tu0
6758AA5C EEE26FA4 Middl    1    96   79   31      200            Tu0         Tu0
6758AF18 EEE272E4 Middl    1    96   79   31      200            Tu0         Tu0
6758B3D4 EEE27624 Middl    1    96   79   31      200            Tu0         Tu0
6758B890 EEE27964 Middl    1    96   79   31      200            Tu0         Tu0
6758BD4C EEE27CA4 Middl    1    96   79   31      200            Tu0         Tu0
6758C6C4 EEE28324 Middl    1    96   79   31      200            Tu0         Tu0
6758CB80 EEE28664 Middl    1    96   79   31      200            Tu0         Tu0
6758D03C EEE289A4 Middl    1    96   79   31      200            Tu0         Tu0
676597C4 EEE8CB44 Middl    1    96   79   31      200            Tu0         Tu0
6765A13C EEE8D1C4 Middl    1    96   79   31      200            Tu0         Tu0
6765A5F8 EEE8D504 Middl    1    96   79   31      200            Tu0         Tu0
6784118C EEE94A04 Middl    1    96   79   31      200            Tu0         Tu0
67841648 EEE97444 Middl    1    96   79   31      200            Tu0         Tu0
679D2250 EEE8C804 Middl    1    96   79   31      200            Tu0         Tu0
679D2BC8 EEE8DB84 Middl    1    96   79   31      200            Tu0         Tu0
679D3084 EEE8DEC4 Middl    1    96   79   31      200            Tu0         Tu0
679D3540 EEE8E204 Middl    1    96   79   31      200            Tu0         Tu0
68194A08 EEE91C84 Middl    1    96   79   31      200            Tu0         Tu0
6851CBB8 EEE905C4 Middl    1    96   79   31      200            Tu0         Tu0
68520AC0 EEE91944 Middl    1    96   79   31      200            Tu0         Tu0
68526180 EEE91FC4 Middl    1    96   79   31      200            Tu0         Tu0
68528034 EEEAE644 Middl    1    96   79   31      200            Tu0         Tu0
68529800 EEE90F84 Middl    1    96   79   31      200            Tu0         Tu0
6856A69C EEE97784 Middl    1    96   79   31      200            Tu0         Tu0
6856AB58 EEE98B04 Middl    1    96   79   31      200            Tu0         Tu0
685B4A7C EEEAF344 Middl    1    96   79   31      200            Tu0         Tu0
685B53F4 EEEAF9C4 Middl    1    96   79   31      200            Tu0         Tu0
685B6834 EEEB0A04 Middl    1    96   79   31      200            Tu0         Tu0
685B83AC EEE960C4 Middl    1    96   79   31      200            Tu0         Tu0
685B8868 EEE96404 Middl    1    96   79   31      200            Tu0         Tu0
685B8D24 EEE96744 Middl    1    96   79   31      200            Tu0         Tu0
685B969C EEE96DC4 Middl    1    96   79   31      200            Tu0         Tu0
685BA7D4 EEEAFD04 Middl    1    96   79   31      200            Tu0         Tu0
685BC61C EEE92CC4 Middl    1    96   79   31      200            Tu0         Tu0
685BCAD8 EEE93004 Middl    1    96   79   31      200            Tu0         Tu0
685BCF94 EEE93344 Middl    1    96   79   31      200            Tu0         Tu0
685BD450 EEE93684 Middl    1    96   79   31      200            Tu0         Tu0
685C6D74 EEE953C4 Middl    1    96   79   31      200            Tu0         Tu0
685C7230 EEE95704 Middl    1    96   79   31      200            Tu0         Tu0
685C7BA8 EEE95D84 Middl    1    96   79   31      200            Tu0         Tu0
687C2104 EEE92984 Middl    1    96   79   31      200            Tu0         Tu0
687C2A7C EEE97AC4 Middl    1    96   79   31      200            Tu0         Tu0
687C2F38 EEE97E04 Middl    1    96   79   31      200            Tu0         Tu0
687C33F4 EEE98144 Middl    1    96   79   31      200            Tu0         Tu0
6888076C EEEAE984 Middl    1    96   79   31      200            Tu0         Tu0
688E3164 EEE8F244 Middl    1    96   79   31      200            Tu0         Tu0
689C4684 EEE939C4 Middl    1    96   79   31      200            Tu0         Tu0
689C4B40 EEE93D04 Middl    1    96   79   31      200            Tu0         Tu0
689C54B8 EEE94384 Middl    1    96   79   31      200            Tu0         Tu0
689C5974 EEE946C4 Middl    1    96   79   31      200            Tu0         Tu0
689DAA24 EEE8E544 Middl    1    96   79   31      200            Tu0         Tu0
689DAEE0 EEE8E884 Middl    1    96   79   31      200            Tu0         Tu0
689DB39C EEE8EBC4 Middl    1    96   79   31      200            Tu0         Tu0
689DB858 EEE8EF04 Middl    1    96   79   31      200            Tu0         Tu0
68AE11F4 EEE8F584 Middl    1    96   79   31      200            Tu0         Tu0
68AE2358 EEE8FF44 Middl    1    96   79   31      200            Tu0         Tu0

  Header DataArea  Pool           Rcnt  Size  Original   Flags   caller_pc

Public particle pools:

Phillip Remaker (Cisco Employee) Fri, 02/15/2013 - 15:09

Your fastest course of action here is to open a TAC case. 


Can you reproduce the wedged queue on demand?


Seems like a regression in 15.1(4)M4, then? Falling back to 15.1(4)M3 fixes it? Is it also present in the latest 15.1(4)M?

Andrea Florio Tue, 02/19/2013 - 10:21

Yes, it's incredibly easy to reproduce and it takes just a few seconds. And yes, correct: 15.1(4)M3 is not affected by this issue. I didn't test with 15.1(4)M, but 15.1(4)M4/M5 are affected, while the T train is not.


The other problem is that I don't have a valid support contract for my device, so I can't open a TAC case directly. Any suggestion is highly welcome.


Andrea

Correct Answer
Brett Sitomer (Cisco Employee) Wed, 02/20/2013 - 16:08

Hi Chris and Andrea,


Thanks for testing the various versions.


I was able to reproduce the issue in order to investigate further. Here's what I have been able to figure out:

The input queue leak was introduced by the fix for

CSCtn36227    Alignment correction at ipv6_checksum with IPv6 ping sweep

and it is fixed by

CSCto56317    Backward compatibility regarding pak release strategy in ipv6_ping_send


CSCto56317 was committed into 15.2(1)T, but was never committed into the 15.1(4)M throttle.


I have put in a request to get CSCto56317 fixed in the 15.1(4)M throttle. The next potential release that can get the fix is 15.1(4)M7, which is due out in October.


Please note that CSCto56317 is currently an internal defect. I will be making it external, but it may take a day or two for that change to propagate.


Unfortunately, I don't see an easy workaround to prevent the input queue leak in the meantime. Given that you decided to move back to 15.1(4)M3, I would assume IP SLA is a required feature for you both. Another option would be to reduce the frequency of the IP SLA ICMP probes and/or increase the size of the input hold queue (hold-queue 24000 in) to allow more time before the input queue fills up. Changing the probe interval from 1 minute to 5 minutes and increasing the hold queue to 24000 should allow the device to run for ~83 days before it needs a reload to clear the input queue.
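
The ~83 day figure follows from simple arithmetic: 24000 queue slots at one leaked buffer every 300 seconds. A minimal Python sketch of that estimate is shown below. It assumes exactly one buffer is leaked per probe reply, as Chris observed, and that nothing else occupies the input queue; the function name is made up for this example.

```python
SECONDS_PER_DAY = 86400


def days_until_queue_full(hold_queue_size, probe_interval_s, leaked_per_probe=1):
    """Estimate how long a leaking input queue takes to fill, in days.

    Assumes a steady leak of `leaked_per_probe` buffers per probe reply
    and no other traffic holding queue slots.
    """
    probes_needed = hold_queue_size / leaked_per_probe
    return probes_needed * probe_interval_s / SECONDS_PER_DAY


if __name__ == "__main__":
    # Suggested workaround: hold-queue 24000 in, probe every 5 minutes.
    print(round(days_until_queue_full(24000, 300), 1))
    # Default 75-slot queue with the original 60-second probe, in minutes.
    print(round(days_until_queue_full(75, 60) * 24 * 60))
```

With the default 75-slot queue and a 60-second probe, the same estimate gives roughly 75 minutes, which is why the unmodified configuration wedges so soon after a reload.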

Phillip Remaker (Cisco Employee) Wed, 02/20/2013 - 16:14

Fantastic work Brett, thanks!  And thanks for handling the administrative work to get the bug in the right place.  I presume the release note of CSCto56317 will be updated to mark the tunnel interface symptom observed here?

Brett Sitomer (Cisco Employee) Wed, 02/20/2013 - 16:23

Yes, I have updated the release notes of CSCto56317 to describe the input queue leak/wedge issue. Your comment about moving to a version where CSCto56317 is already fixed is also good, but unfortunately the 1841 can't run 15.2(1)T or later, where the fix for CSCto56317 is already present. If Andrea is seeing this on an ISR-G2 instead of an ISR like Chris is, it is an option.

Brett Sitomer (Cisco Employee) Wed, 11/20/2013 - 12:45

The fix has been committed and will be available in 15.1(4)M8 due out late March.

Brett Sitomer (Cisco Employee) Mon, 11/11/2013 - 13:41

It looks like the commit to the 15.1(4)M throttle wasn't done. If you need it fixed for M8, it would be best to open a TAC case so the TAC engineer can follow up with development to make sure the fix gets committed into 15.1(4)M8.

Chris Mason Thu, 02/21/2013 - 05:03

Thanks Brett, much appreciated. I will stick with 15.1(4)M3 until M7 as I need IP SLA to check my IPv6 tunnel is still working.
