Debugging Input drops
Steps to debug Input Packet drops:
In this article, we discuss the common reasons for input packet drops and how the drops can be captured in general.
This document is an ongoing effort to understand and capture input drops. If anything is still unclear after reading it, please comment on the article and answers will be provided.
Why do input drops occur?
A packet can be dropped at the ingress of an interface for several reasons. There are about 500-800 drop counters, depending on the line card type and NP generation. The command show drops verbose location <> lists all the relevant drops. In general, any NP input drop can be captured with “show drops verbose location <> | include <counter name>”. The counter name can be determined by identifying the type of drop at the time it occurred.
Some of the common drop reasons at ingress are:
1) Out of buffers: There are several buffers within the NP itself that can overflow or underflow. Take the case of the Traffic Manager (TM) ingress buffer and an egress queue buffer: if either of them runs out of buffers, the TM complains, which might lead to NP issues and result in packet drops.
E.g. show drops verbose location 0/1/CPU0 | include BUF_EXCD
Mon Feb 10 18:32:50.249 EDT
PUNT_DIAGS_RX_BUFF_EXCD 0
PUNT_DIAGS_RX_BUFF_EXCD 0
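When a capture of this output is saved to a text file, the non-zero counters can be picked out with a small script. The following is only a minimal sketch: it assumes each counter line has the shape "NAME value" as in the example above (real output may carry more columns), and the function names parse_drop_counters and nonzero_counters are made up for illustration.

```python
import re

# Assumed line shape: a counter name in capitals followed by one integer.
_COUNTER_LINE = re.compile(r"^\s*([A-Z][A-Z0-9_]+)\s+(\d+)\s*$")

def parse_drop_counters(text):
    """Return a list of (counter_name, value) pairs found in the captured text."""
    pairs = []
    for line in text.splitlines():
        match = _COUNTER_LINE.match(line)
        if match:
            pairs.append((match.group(1), int(match.group(2))))
    return pairs

def nonzero_counters(pairs):
    """Keep only the counters whose value is greater than zero."""
    return [(name, value) for name, value in pairs if value > 0]

# Captured text copied from the example above; the timestamp line is skipped.
captured = """Mon Feb 10 18:32:50.249 EDT
PUNT_DIAGS_RX_BUFF_EXCD 0
PUNT_DIAGS_RX_BUFF_EXCD 0
"""
pairs = parse_drop_counters(captured)
print(pairs)
print(nonzero_counters(pairs))
```

Duplicate counter names are kept as separate entries rather than summed, since the script cannot tell whether they are the same counter or not.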
2) Link issue between PHY & NP: This could be due to SERDES errors. The root cause could be either a hardware or a software issue.
The NP MAC is the first entry point for data. The following CLI provides MAC-level ingress and egress statistics.
show controller tengig <> stat
Further ingress drops may happen in NP Ucode.
The following CLI may provide more details. Since many ports are mapped to one NP, it may not give per-port ucode drop statistics.
show controller np counters all location <>
3) NP back pressure: This is typically an oversubscription case. E.g. if the ingress LC sends 30 Gbps of traffic toward a 10 Gbps egress, the FIA might backpressure the ingress NP, and hence we see ingress drops.
e.g. show drops verbose location <> | include <counter_name>
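As a rough illustration of the oversubscription arithmetic, the sketch below computes how much traffic exceeds the egress capacity. It is an idealized estimate only: it assumes a steady offered load, and the function name oversubscription is hypothetical.

```python
def oversubscription(offered_gbps, egress_gbps):
    """Return (excess_gbps, fraction_of_offered_load_in_excess)."""
    # Traffic beyond the egress capacity cannot be forwarded.
    excess = max(0.0, offered_gbps - egress_gbps)
    fraction = excess / offered_gbps if offered_gbps > 0 else 0.0
    return excess, fraction

# The case from the text: 30 Gbps offered toward a 10 Gbps egress.
excess, fraction = oversubscription(30.0, 10.0)
print(excess, round(fraction, 3))
```

In that case 20 Gbps, or two thirds of the offered load, has nowhere to go, which is why the ingress counters increase.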
4) NP lockup: NP lockups can occur in many situations. An NP lockup happens when the microcode has a bug, or when a low-level resource specific to the NP, such as the TM, ICFD, or NP engine, gets blocked. When the HW detects that the NP has stopped forwarding, the condition is termed an NP lockup.
In general, when an NP lockup happens, a syslog message containing NP-DIAG is raised. Soon after this, the local ping fails if the NP is locked up, and the LC automatically reloads, which results in heavy traffic loss.
LC/0/3/CPU0: Mar 14: 09:07:22:122 : prm_server_ty: NP-DIAG health monitoring failure on NP3
There is no CLI as such to detect an NP lockup; the user has to monitor for and observe the situation described above.
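Since detection relies on watching the logs, a collected syslog file can be scanned for the NP-DIAG message. This is a minimal sketch that assumes the message text matches the sample line above exactly; the function name find_np_diag is hypothetical, and other releases may word the message differently.

```python
import re

# Pattern modeled on the sample syslog line above (an assumption, not a
# documented format): line card prefix, then the NP-DIAG failure text.
_NP_DIAG = re.compile(r"^(LC/\S+?):.*NP-DIAG health monitoring failure on NP(\d+)")

def find_np_diag(log_text):
    """Return a list of (line_card, np_number) for every NP-DIAG failure line."""
    hits = []
    for line in log_text.splitlines():
        match = _NP_DIAG.search(line.strip())
        if match:
            hits.append((match.group(1), int(match.group(2))))
    return hits

sample = "LC/0/3/CPU0: Mar 14: 09:07:22:122 : prm_server_ty: NP-DIAG health monitoring failure on NP3"
print(find_np_diag(sample))
```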
5) Unrecognized upper-level protocol: This can also be termed “Input drops other”. For example, when a router receives a new LSP, it floods this LSP to its neighbors, except the neighbor that sent the new LSP. On point-to-point links, the neighbors acknowledge the new LSP with a PSNP, which holds the LSP ID, sequence number, checksum, and remaining lifetime. When the acknowledgment PSNP is received from a neighbor, the originating router stops sending the new LSP to that particular neighbor, although it may continue to send the new LSP to other neighbors that have not yet acknowledged it.
Symptoms: Layer 3 packets, e.g. 64-byte ISIS PSNP packets, are counted as "Input drop other", though they are not really dropped.
Conditions: ISIS router with links configured as point-to-point.
Workaround: This problem is merely cosmetic; the PSNPs are still processed, just reported incorrectly due to an accumulation discrepancy in the software.
6) RESOLVE_VPLS_REFLECTION_FILTER_DROP_CNT should not be in total drops: The counter RESOLVE_VPLS_REFLECTION_FILTER_DROP_CNT should not be included in the calculation for total drops seen under an interface.
RP/0/RSP0/CPU0:BEL1MN#sh int te 0/6/0/0
Tue Jan 31 10:39:39.047 ARG
TenGigE0/6/0/0 is up, line protocol is up
Last clearing of "show interface" counters never
30 second input rate 962000 bits/sec, 702 packets/sec
30 second output rate 10000 bits/sec, 19 packets/sec
519481709 packets input, 587232457343 bytes, 19412 total input drops
There are 19411 drops for RESOLVE_VPLS_REFLECTION_FILTER_DROP_CNT and 19412
total drops for the interface.
This was reproduced in the lab, and there is a one-to-one increase in the NP counter and the total drops counter.
This confuses the customer, who may think they have a traffic loss issue, and it impedes troubleshooting when trying to identify the source of the drops. The steps to identify the issue are as follows:
0) Clear both the interface counters and the NP counters.
1) Run show interface xxx. If the generic input drop counter is non-zero, continue with the next steps.
2) Run show controller np ports all location yyy. This displays which NP the interface in question is on.
3) Run show controller np counters npz location yyy, where z is the NP number shown in step 2.
In the output, find RESOLVE_VPLS_REFLECTION_FILTER_DROP_CNT, RESOLVE_INGRESS_DROP_CNT and RESOLVE_EGRESS_DROP_CNT.
If the generic input drop count matches the delta (RESOLVE_VPLS_REFLECTION_FILTER_DROP_CNT - RESOLVE_EGRESS_DROP_CNT), there is an ingress drop due to a loop; find the cause of the loop.
If the input drop count is more than the delta, other input drops also exist; look for other DROP counters under the NP.
If the delta is 0, the input drop is NOT due to the loop condition; look for other DROP counters under the NP.
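The comparison in the steps above can be written down as a small decision function. This is only a sketch of that logic: the function name classify_input_drops and the returned labels are hypothetical, the counter values are expected to be read manually from the CLI output, and the "inconclusive" branch (input drops below the delta) and the handling of a negative delta are assumptions, since the steps do not cover those cases.

```python
def classify_input_drops(input_drops, reflection_cnt, egress_cnt):
    """Classify interface input drops using the delta described in the steps.

    input_drops    -- generic input drop counter from show interface
    reflection_cnt -- RESOLVE_VPLS_REFLECTION_FILTER_DROP_CNT
    egress_cnt     -- RESOLVE_EGRESS_DROP_CNT
    """
    delta = reflection_cnt - egress_cnt
    if delta <= 0:
        # Delta of 0: not a loop condition. A negative delta is treated
        # the same way here, which is an assumption.
        return "not loop"
    if input_drops == delta:
        return "loop"
    if input_drops > delta:
        return "loop plus other drops"
    # Not covered by the steps in the text.
    return "inconclusive"

# Hypothetical values for illustration only.
print(classify_input_drops(100, 100, 0))
print(classify_input_drops(150, 100, 0))
print(classify_input_drops(150, 40, 40))
```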
After clearing the counters, start the traffic pattern that caused the drop.
Check the counters at the input interface: This is the first place to check. The user can observe the packet count increment each time the command is executed, and can identify whether drops are occurring by comparing the input and output packet rates.
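To see whether the counters are incrementing between two runs of the command, the two captures can be compared with a short script. This is a sketch under assumptions: it relies on the "packets input, ... bytes, ... total input drops" line having the same shape as in the show interface output quoted earlier, the names input_counters and counter_delta are hypothetical, and the second capture below is invented for illustration.

```python
import re

# Shape taken from the show interface output quoted earlier in this article.
_INPUT_LINE = re.compile(r"(\d+) packets input, (\d+) bytes, (\d+) total input drops")

def input_counters(show_interface_text):
    """Extract packets, bytes and total input drops from a captured output."""
    match = _INPUT_LINE.search(show_interface_text)
    if match is None:
        raise ValueError("no 'packets input' line found")
    packets, nbytes, drops = (int(group) for group in match.groups())
    return {"packets": packets, "bytes": nbytes, "drops": drops}

def counter_delta(before, after):
    """Return how much each counter grew between two captures."""
    return {key: after[key] - before[key] for key in before}

first = "519481709 packets input, 587232457343 bytes, 19412 total input drops"
# Invented second capture, for illustration only.
second = "519482709 packets input, 587233457343 bytes, 19420 total input drops"
print(counter_delta(input_counters(first), input_counters(second)))
```

A non-zero growth in the drops value between the two captures indicates that drops are still occurring.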