Cisco 2811 router, multilink Frame Relay for Internet. 100 miles to the CO for frame termination.
I'm working with a Cisco engineer on the MFR aggregation virtual interface, which is dropping inbound packets. Six T1s with full CIR. All links clean. The other side of the router is Ethernet, 100/full.
Any clues among my fellow engineers?
Cisco recommends only two T1s/E1s (see table 1 in http://www.cisco.com/en/US/prod/collateral/routers/ps5854/prod_qas0900aecd80169bd6.html), and although my experience has been that the 2800s seem capable of more than Cisco recommends, six T1s might be a bit much, especially when supporting some kind of software link aggregation.
What does the CPU history look like, and what are the input queue stats?
You might also try running the LAN interface at 10/full. (This keeps the router from having to absorb 100 Mbps ingress bursts.)
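For reference, the checks and the 10/full change might look something like the following (interface names here are assumptions — substitute your actual LAN port):

```
! Look for short-term CPU spikes (60s/60m/72h bar graphs) and per-interface drops
show processes cpu history
show interfaces MFR0 | include Input queue

! Pin the LAN side at 10/full (assuming FastEthernet0/0 is the LAN port)
configure terminal
 interface FastEthernet0/0
  speed 10
  duplex full
 end
```

Note that hard-setting speed/duplex on one end disables autonegotiation, so the attached switch port should be set to match.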
CPU: 14% max, 10%+ during the day. We are a 40-hour/week site that doesn't host a web site.
Input queue: 0/75/1918/0 (size/max/drops/flushes);
Received 0 broadcasts, 0 runts, 0 giants, 0 throttles, 0 input errors, 0 CRC, 0 frame, 0 overrun, 918 ignored, 0 abort
No other interface is logging errors. I get a few output drops: since this is the choke point outbound, I'm not surprised.
I'm an old software engineer originally - I understand how software can get overloaded without being a CPU hog.
2,000? What's the total input packet count this goes with?
From Cisco papers on this issue, it seems the common cause is that the main processor cannot keep up and/or there are insufficient buffers. Even though your CPU looks low, you might have some very short-term CPU bursts that don't show up in the averages.
Below are some papers on dealing with input queue drops (none specific to the 2800):
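To catch short CPU bursts and buffer shortages that the averages hide, the usual checks are along these lines (a sketch; exact output varies by IOS version):

```
show processes cpu sorted   ! which processes eat cycles when it spikes
show buffers                ! look for "misses" and "failures" in the pools
```

In `show buffers`, non-zero and climbing misses/failures in a pool line up well with "ignored" counts on interfaces.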
MFR0 is up, line protocol is up
Hardware is Multilink Frame Relay bundle interface
Description: Multilink Frame-Relay AT&T: 44.YHGP.000654..SUV
MTU 1500 bytes, BW 9216 Kbit, DLY 20000 usec,
reliability 255/255, txload 17/255, rxload 108/255
Encapsulation FRAME-RELAY IETF, loopback not set
Keepalive set (10 sec)
DTR is pulsed for 2 seconds on reset
LMI enq sent 1854, LMI stat recvd 1854, LMI upd recvd 0, DTE LMI up
LMI enq recvd 0, LMI stat sent 0, LMI upd sent 0
LMI DLCI 0 LMI type is ANSI Annex D frame relay DTE
Broadcast queue 0/64, broadcasts sent/dropped 0/0, interface broadcasts 0
Last input 00:00:00, output never, output hang never
Last clearing of "show interface" counters 05:09:02
Input queue: 0/75/2787/0 (size/max/drops/flushes); Total output drops: 370
Queueing strategy: fifo
Output queue: 0/120 (size/max)
30 second input rate 3937000 bits/sec, 550 packets/sec
30 second output rate 615000 bits/sec, 405 packets/sec
11286581 packets input, 1915301507 bytes, 0 no buffer
Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 1345 ignored, 0 abort
8673768 packets output, 2228164132 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 output buffer failures, 0 output buffers swapped out
0 carrier transitions
2787 / 11286581 = (about) .025%
One might argue that's not enough to worry much about. However, you might try increasing your input queue a bit (to 100?).
Likewise, for the "1345 ignored", you might also check your buffer stats.
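Assuming the bundle interface is MFR0, bumping the input hold queue is a one-liner (the default is 75):

```
configure terminal
 interface MFR0
  hold-queue 100 in
 end
```

Afterward, `clear counters MFR0` and watch whether the drop rate changes; since "ignored" usually means no buffer was available, `show buffers` misses are worth watching at the same time.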
BTW, I've been around the horn looking for processes running hot, etc. This interface just aggregates packets and has no IP address, so the processes show no problems.
The Cisco engineer has had me try CEF and fair queueing. I'm really starting to believe this all points to hardware that lacks the ability to absorb bursts at the full rate of all the pipes.
Can't say for sure that alone will cure the issue, since the input queue drops might include the "ignored" drops, and SPD, if active on the 2800s, seems to use its own drop strategy.
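If SPD is in play, its state and thresholds are separate from the interface hold queue. On IOS versions that support it, you can inspect and (globally) tune it along these lines — command availability and defaults vary by version, so treat this as an assumption to verify:

```
show ip spd                      ! current SPD mode, headroom, thresholds
!
configure terminal
 ip spd queue min-threshold 73   ! illustrative values, not recommendations
 ip spd queue max-threshold 74
end
```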
(NB: Although not directly related to the input queue drop stats on the router, excessive queue depths can have an adverse impact on some traffic types, including TCP. Drops are often the primary way to signal TCP flows that they are overrunning available bandwidth. I.e., sometimes it's better to manage drops than to always avoid them.)
I don't know anything more on this issue than what's published by Cisco. Perhaps someone else will have another suggestion or additional information.