Aleksandar Vidakovic
Cisco Employee

 


Input Drops Troubleshooting on ASR9k

In IOS XR release 5.3.3, new serviceability enhancements were introduced to help troubleshoot packet drops in Network Processor (NP) microcode. This document explains the enhancements in detail and touches briefly on other input drop troubleshooting techniques.

We recommend using the new troubleshooting features "monitor np interface" and "show controllers np capture". Use the old "monitor np counter" only as a last resort, under direct guidance from TAC.


Per-Interface Input Drops

Input drops were previously quite challenging to troubleshoot. Starting with IOS XR release 5.3.3, per-interface packet drops in NP microcode can be investigated quite easily.

Architecture

There are currently close to 1200 NP statistics counters that provide great insight into what actions the NP microcode is performing. These counters are stored in memory that must be very fast and very close to the NP cores, which limits the size of that memory. As a consequence, NP statistics counters are global. To achieve per-interface drop counting, we have carved out a portion of the statistics memory for per-interface drop counters. In NP microcode terminology, these are per-uidb drop counters. UIDB (or μIDB) stands for Microcode Interface Descriptor Block; in other words, it is the NP's view of a (sub)interface.

Characteristics/Restrictions/Limitations

  • Monitoring is enabled on-demand.
  • Only one uidb can be monitored per location (i.e. line card) at a time.
  • If enabled on a bundle (sub)interface, monitoring is activated on all NPs that host a bundle member.
  • Drop counters that are updated for the selected uidb are not updated in the global stats memory.
  • By default, monitoring runs as a single iteration lasting 5 seconds.
  • If a short iteration period is specified, the drop rate in the first iteration may be inaccurate.
  • Counters are reported with a '_MONITOR' suffix.
  • No observable performance impact. Packet disposition is not changed by this feature.

Workflow

Your starting point for this kind of troubleshooting would probably be when you observe input drops on an interface, e.g.:

GigabitEthernet0/0/1/6.1 is up, line protocol is up
<..output omitted..>
307793 packets input, 313561308 bytes, 227987 total input drops

The following command monitors the drops on this particular sub-interface. In this example two iterations are executed, each lasting one second.

RP/0/RSP0/CPU0:our9001#monitor np interface g0/0/1/6.1 count 2 time 1 location 0/0/CPU0
Monitor NP counters of GigabitEthernet0_0_1_6.1 for 2 sec

<..output omitted..>
		****  Sun Jan 31 22:14:32 2016 ****

Monitor 2 non-zero NP1 counters:  GigabitEthernet0_0_1_6.1
Offset  Counter                                         FrameValue   Rate (pps)
-------------------------------------------------------------------------------
 262 RSV_DROP_MPLS_LEAF_NO_MATCH_MONITOR                       101          49
1307 PARSE_DROP_IPV4_CHECKSUM_ERROR_MONITOR                    101          50

(Count 2 of 2)
RP/0/RSP0/CPU0:our9001#
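When this output is collected by a script, the counter rows can be turned into records that are easy to compare between runs. The following is a minimal Python sketch, assuming the column layout shown above (offset, counter name, frame count, rate in pps). The function name `parse_np_monitor` is hypothetical and is not part of any Cisco tooling.

```python
import re

# One counter row: offset, counter name, frame count, rate in pps.
# The layout is an assumption based on the sample output above.
ROW = re.compile(r"^\s*(\d+)\s+([A-Z][A-Z0-9_]*)\s+(\d+)\s+(\d+)\s*$")
SUFFIX = "_MONITOR"

def parse_np_monitor(text):
    """Return the per-uidb drop counters found in the CLI output."""
    counters = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # banner, header, separator or prompt line
        offset, name, frames, rate = match.groups()
        if name.endswith(SUFFIX):
            name = name[:-len(SUFFIX)]  # strip the per-uidb suffix
        counters.append({"offset": int(offset), "counter": name,
                         "frames": int(frames), "rate_pps": int(rate)})
    return counters

sample = """\
Offset  Counter                                         FrameValue   Rate (pps)
-------------------------------------------------------------------------------
 262 RSV_DROP_MPLS_LEAF_NO_MATCH_MONITOR                       101          49
1307 PARSE_DROP_IPV4_CHECKSUM_ERROR_MONITOR                    101          50
"""

for row in parse_np_monitor(sample):
    print(row["counter"], row["frames"], row["rate_pps"])
```

Stripping the suffix gives the counter name in the form used by the capture output later in this document.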

Now you have an idea of the drop reason. Some drops are self-explanatory, like the IPv4 checksum error. For the remaining ones you need to look into the dropped packet header to investigate further. This is where the next new feature comes to the rescue.

 


Pervasive Capture of Dropped Packets

Starting with IOS XR release 5.3.3, packets recently dropped by NP microcode are saved for further troubleshooting.

Architecture

The NP microcode saves the headers of the most recently dropped packets in a circular buffer. The buffer holds the most recent 128 dropped packets on the Tomahawk line card family and the most recent 32 on the Typhoon line card family.
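The circular-buffer behaviour can be illustrated with a short Python sketch. This is only a model of the semantics described above, not the microcode implementation. The class name `DropCapture` and its fields are hypothetical; the buffer depths are the ones quoted in this section.

```python
from collections import deque

# Buffer depths as stated in this document.
CAPTURE_DEPTH = {"Tomahawk": 128, "Typhoon": 32}

class DropCapture:
    """Toy model of a drop-capture ring: keeps only the newest entries."""

    def __init__(self, family):
        self.buffer = deque(maxlen=CAPTURE_DEPTH[family])
        self.seen = 0  # total drops observed, including overwritten ones

    def record(self, reason, header):
        self.seen += 1
        # When the deque is full, appending discards the oldest entry.
        self.buffer.append((reason, header))

capture = DropCapture("Typhoon")
for i in range(40):
    capture.record("RSV_DROP_MPLS_LEAF_NO_MATCH", i)

print(f"buffer has seen {capture.seen} packets - holding {len(capture.buffer)}")
```

In this model the total keeps growing while the number of retained entries stays fixed, which is one way to read the "has seen ... displaying 32" line in the CLI output.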

Characteristics/Restrictions/Limitations

  • This feature is enabled by default - no configuration required.
  • No performance impact. Packet disposition is not changed by this feature.
  • Works at port-level (as opposed to sub-interface level).
  • To figure out the sub-interface you have to decode the L2 encapsulation.
  • In case of packets spanning more than one RFD buffer, only the first buffer is captured.
  • Further filtering is supported - you can select which drop reasons not to capture.

Workflow

You can view the recently dropped packets using the show controllers np capture command.

RP/0/RSP0/CPU0:our9001#sh controllers np capture np1 location 0/0/CPU0

NP1 capture buffer has seen 426268 packets - displaying 32

Sun Jan 31 22:55:13.935 : RSV_DROP_MPLS_LEAF_NO_MATCH
 From GigabitEthernet0_0_1_6: 1222 byte packet on NP1
0000: 84 78 ac 78 ca 3e 30 f7 0d f8 af 81 81 00 03 85
0010: 88 47 05 dc 11 ff 45 00 00 64 01 ae 00 00 ff 01
0020: 62 c3 ac 12 00 02 ac 10 ff 02 00 00 02 3a 00 0a
<..output omitted..>

In the snapshot shown above you can see that the dropped packet was received on port GigabitEthernet0/0/1/6. Because the pervasive packet drop capture works at port level, you have to look into the L2 encapsulation header to figure out the subinterface on which the packet was received. In the above snapshot the encapsulation was 802.1Q, indicated by EtherType 0x8100. The two octets that follow contain the 3-bit PCP, 1-bit DEI and 12-bit VLAN ID. The VLAN ID happened to be 0x385, which is 901 in decimal.
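The same VLAN lookup can be scripted. Below is a minimal Python sketch that converts the hex dump lines from the capture output into bytes and decodes the 802.1Q tag. It assumes a single VLAN tag directly after the MAC addresses; the helper names `hexdump_to_bytes` and `decode_dot1q` are hypothetical.

```python
import struct

DUMP = """\
0000: 84 78 ac 78 ca 3e 30 f7 0d f8 af 81 81 00 03 85
0010: 88 47 05 dc 11 ff 45 00 00 64 01 ae 00 00 ff 01
0020: 62 c3 ac 12 00 02 ac 10 ff 02 00 00 02 3a 00 0a
"""

def hexdump_to_bytes(dump):
    """Drop the 'offset:' prefix of each line and join the hex octets."""
    data = bytearray()
    for line in dump.splitlines():
        _, _, hexpart = line.partition(":")
        data += bytes.fromhex(hexpart)
    return bytes(data)

def decode_dot1q(frame):
    """Return PCP, DEI and VLAN ID, or None if the frame is untagged."""
    # Octets 12-13 hold the EtherType, octets 14-15 the tag control info.
    ethertype, tci = struct.unpack("!HH", frame[12:16])
    if ethertype != 0x8100:
        return None
    return {"pcp": tci >> 13, "dei": (tci >> 12) & 0x1, "vlan": tci & 0x0FFF}

frame = hexdump_to_bytes(DUMP)
print(decode_dot1q(frame))
```

For the captured frame this yields VLAN ID 901 with PCP 0 and DEI 0.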

Using any off-line packet decoder tool, the above captured frame is easily decoded to:

Ethernet II, Src: 30:f7:0d:f8:af:81, Dst: 84:78:ac:78:ca:3e
    Type: 802.1Q Virtual LAN (0x8100)
802.1Q Virtual LAN, PRI: 0, CFI: 0, ID: 901
    Type: MPLS label switched packet (0x8847)
MultiProtocol Label Switching Header, Label: 24001, Exp: 0, S: 1, TTL: 255
    MPLS Label: 24001
    MPLS Experimental Bits: 0
    MPLS Bottom Of Label Stack: 1
    MPLS TTL: 255
Internet Protocol, Src: 172.18.0.2 (172.18.0.2), Dst: 172.16.255.2 (172.16.255.2)
Internet Control Message Protocol
    Type: 0 (Echo (ping) reply)
    Code: 0 ()
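If no packet decoder is at hand, the fields of interest can also be extracted by hand. The following Python sketch decodes the MPLS label stack entry and the IPv4 addresses from the same frame. It assumes exactly one 802.1Q tag and one MPLS label (bottom-of-stack bit set), as in this capture; the fixed offsets and the helper name `decode_mpls_entry` are illustrative only.

```python
import ipaddress
import struct

# First 48 octets of the captured frame shown above.
FRAME = bytes.fromhex(
    "8478ac78ca3e30f70df8af8181000385"
    "884705dc11ff4500006401ae0000ff01"
    "62c3ac120002ac10ff020000023a000a"
)

def decode_mpls_entry(entry):
    """Split a 4-octet label stack entry into label, Exp, S and TTL."""
    (word,) = struct.unpack("!I", entry)
    return {
        "label": word >> 12,        # upper 20 bits
        "exp": (word >> 9) & 0x7,   # 3 experimental bits
        "bos": (word >> 8) & 0x1,   # bottom-of-stack flag
        "ttl": word & 0xFF,         # lower 8 bits
    }

# 14 octets Ethernet + 4 octets 802.1Q tag = MPLS entry at offset 18.
mpls = decode_mpls_entry(FRAME[18:22])

# With a single label, the IPv4 header starts right after the entry.
ip_header = FRAME[22:42]
src = ipaddress.IPv4Address(ip_header[12:16])
dst = ipaddress.IPv4Address(ip_header[16:20])

print(mpls, src, dst)
```

The label value obtained this way (24001) is the one to look up in the MPLS forwarding table.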

 

The next step would be to check the MPLS forwarding table for the label 24001.

 

You can disable/enable the capturing of specific drop reasons by using the filter option:

RP/0/RSP0/CPU0:our9001#sh controllers np capture np1 filter RSV_DROP_MPLS_LEAF_NO_MATCH disable location 0/0/CPU0

 Disable NP1 packet capture for: RSV_DROP_MPLS_LEAF_NO_MATCH

You can see which drop counters are eligible for capture by using the help option:

RP/0/RSP0/CPU0:our9001#sh controllers np capture np1 help location 0/0/CPU0

 NP1 Status              Capture Counter Name
 ---------------------+------------------------------
 Capturing             PARSE_UNKNOWN_DIR_DROP
 Capturing             PARSE_UNKNOWN_DIR_1
<..output omitted..>

 


Capturing Packets Against Any NP Counter

Starting with XR release 4.3.x, the content of a packet processed by the NP can be dumped using the monitor np counter CLI. This method is explained in detail in ASR9000/XR: How to capture dropped or lost packets.

Use this only as a last resort, when the two new troubleshooting features can't help.

The drawback of this approach, compared to the two new methods described in this document, is that all captured packets are dropped. In addition, an NP reset is required upon capture completion (~50ms traffic outage on Typhoon, ~150ms on Tomahawk).

Important Note: In some XR releases the NP reset after the execution of "monitor np counter" is optional. We strongly recommend always selecting the reset option after running the monitoring. The NP reset will be unconditional starting with XR releases 6.1.4 and 6.2.2.


Conclusion

We're working on further drop troubleshooting enhancements. In the meantime, we hope the two new tools already make packet drop troubleshooting much easier.

Let us know your comments/questions.
