Input Drops - 8 hours later.. still no resolution

Unanswered Question
Sep 9th, 2010

Hello,

I have been troubleshooting this problem for 8 hours and have not figured it out yet. 

Basically, I have a 2811 whose Fa0/0 interface drops input packets now and then,
seemingly when traffic bursts; once traffic goes above about 5 Mbit/s, the errors seem
to increase.  The issue is not severe enough to affect users yet, but that could change
as more users generate more traffic.

Here is the interface configuration:
- 100 Mbit/s full duplex
- MTU is 1375
- GRE Tunnels are used at this interface
- No access-lists
- Proxy Arp is enabled
- Local Proxy Arp is disabled
- Fast switching is enabled
- IP CEF is enabled
- No CRC errors
- Ignored errors increment at the same time as throttles
- Input queue:  drops increment
- The input queue seems to climb to 55 or higher, and that's when we see the drops
happen.
Example:  55/75/4983/0 (size/max/drops/flushes)

The CPU also skyrockets at this time, hitting 99%.  I had a thorough look at the
processes with:
show proc cpu | exclude 0.00

The processes that increase dramatically are:
- IP Input
- Pool Manager

When the CPU spikes, IP INPUT is about 60% and Pool Manager around 25 - 35% plus the rest
of the processes.
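
A minimal sketch of commands that could help narrow this down, assuming standard IOS show commands on the 2811. High IP Input usually points at packets being handled at process level, and Pool Manager activity is tied to buffer pools being grown and trimmed, so the buffer counters are worth a look alongside the CPU figures:

```
! CPU per process, busiest first, plus a history graph to line spikes up with the drops
show processes cpu sorted
show processes cpu history
! Buffer pools: look for misses, failures and a high number of creates/trims
show buffers
! Input queue, throttle and ignored counters for the interface
show interfaces FastEthernet0/0 | include Input queue|throttles|ignored
```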

If you suspect it's some type of bursty traffic, that's fine, but all SNMP stats show a
maximum rate of 6 Mbit/s - a far cry from what this router can handle.

Is it possible that there is some type of traffic affecting the router?  Applications that
are running:  E-mail, voice, video (OCS), ftp, mapped drives, www

The IOS is:

c2800nm-adventerprisek9-mz.124-24.T.bin

I reviewed all release notes / bugs and can't find anything that could be the cause.

Is it possible that the router is process-switching and the CPU is slowing to a crawl?
But even if it were process-switching, we're talking about a max of 6 Mbit/s.  Besides, I
see CEF working properly (I think so, anyway).
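
One way to check which switching path the traffic is actually taking, as a sketch under the assumption that these show commands are available on this release (12.4(24)T should have them, but treat that as unverified for your image):

```
! Packet counts per switching path: "Processor" is process-switched, "Route cache" is fast/CEF-switched
show interfaces FastEthernet0/0 stats
! Which switching methods are enabled on the interface
show ip interface FastEthernet0/0 | include switching
! CEF state as seen by the interface
show cef interface FastEthernet0/0
! Reasons packets were punted from CEF to the process level
show ip cef switching statistics
```

If the "Processor" counters climb quickly while the input drops increment, a large share of the traffic is being process-switched even though CEF is enabled.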

Thanks,

Martin


Nagaraja Thanthry Thu, 09/09/2010 - 06:51
User Badges:
  • Cisco Employee,

Hello,


Can you post your running configuration here? Also, post the output of the "show
interface <interface id>" command.


Regards,


NT

sectel123 Thu, 09/09/2010 - 07:00

I can, but it's a pain as it's a classified system on a closed network - the config has to be vetted, IP addresses changed, etc. before submitting.

burleyman Thu, 09/09/2010 - 07:25
User Badges:
  • Blue, 1500 points or more

Why is your MTU set so low? Could this be the reason for the input drops?



Mike

sectel123 Thu, 09/09/2010 - 08:11

Well, the MTU is low because there are actually 2 tunnels: there's the GRE tunnel, but inside the GRE tunnel is another military-grade tunnel. We had issues with fragmentation (ICMP discovery).


I thought about that...  but why would ... hmmm... maybe if there's a packet with DF set that's over 1375?  Possible...

j-marenda Thu, 09/09/2010 - 10:40

There is no reason to lower the MTU on the "outer" interface just because
it is the source/destination for a tunnel interface (tunnel mode gre).


If the traffic is not VPN or UDP,
I strongly recommend using "ip tcp adjust-mss 1024" or lower
so that the routers can catch the TCP handshake
and inject MSS values that do not lead to fragmentation.
Configure this on the incoming, outgoing and tunnel interface(s).
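
A minimal sketch of that suggestion, assuming the tunnel interface is called Tunnel0 (a hypothetical name; the real configuration was not posted). For reference, a 1375-byte MTU leaves 1375 - 40 = 1335 bytes of TCP payload before any tunnel overhead, so 1024 keeps a wide margin for the GRE header and the inner tunnel:

```
! Physical interface from the original post
interface FastEthernet0/0
 ip tcp adjust-mss 1024
!
! Hypothetical tunnel interface name; use the actual GRE tunnel interface(s)
interface Tunnel0
 ip tcp adjust-mss 1024
```

This only affects TCP sessions whose SYN packets pass through the router; UDP and already-encrypted traffic are not changed by it.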


You may also want to clear the DF bit on the tunnelled traffic so that it can be
transported (even if it gets fragmented) in spite of the DF bit. This is normally
done with a (CPU-intensive) route-map.
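
As a sketch of the route-map approach, assuming the traffic to be tunnelled arrives on FastEthernet0/1 and using CLEAR-DF as the route-map name (both are hypothetical, chosen only for illustration):

```
! No match clause, so every packet hitting the policy gets its DF bit cleared
route-map CLEAR-DF permit 10
 set ip df 0
!
! Apply as policy routing on the interface where the traffic enters the router
interface FastEthernet0/1
 ip policy route-map CLEAR-DF
```

Adding a match clause with an access list would limit the policy to the flows that actually need it, which may help given the CPU cost mentioned above.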


BTW, are you sure CEF is running?
(sh ip cef)
Sometimes it gets switched off silently by an out-of-memory condition.
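
A short sketch of that check, assuming standard IOS show commands; the second command is only there to see whether free memory is low enough to make an out-of-memory condition plausible:

```
! Summary of the CEF table; an error here instead of a summary suggests CEF is not running
show ip cef summary
! Free and lowest-ever free memory per pool
show memory statistics
```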


Juergen.

sectel123 Thu, 09/09/2010 - 12:11

Hi Juergen,


Yes, it's not a duplex / speed issue, and I have the statistics for CPU processes in my original post.


We may look at adjusting the MSS to 1024 or lower.  And I do agree, clearing the DF bit may be a good idea.


CEF is running.


Thank you for the suggestion.

j-marenda Thu, 09/09/2010 - 12:31

Have you tried configuring "ip flow ingress" and "ip flow egress"
on the Fa0/0 interface, so you can look at the incoming traffic with "sh ip cache flow"?
That lets you detect unwanted IP traffic
and also gives some nice statistics about protocol and packet-size distribution.


"show ip traffic" also gives some hints on why packets got dropped.
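
Put together, the steps above might look like this (a sketch, assuming FastEthernet0/0 is the interface in question as in the original post):

```
! In configuration mode: enable NetFlow accounting in both directions
interface FastEthernet0/0
 ip flow ingress
 ip flow egress
!
! In exec mode, once traffic has built up: flow cache with protocol and packet-size distribution
show ip cache flow
! IP, ICMP, TCP and UDP counters, including drop and fragmentation statistics
show ip traffic
```

Comparing two "show ip traffic" snapshots taken before and after a burst makes it easier to see which counters actually move.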


Hope this helps locate the problem.


Juergen.

sectel123 Thu, 09/09/2010 - 13:17

I'll add those right now and try .. Thanks.


It's towards the end of the day here, so traffic has gone down and the errors are not happening anymore.  However, I decided to look at debug ip icmp and I am getting:


ICMP: dst (1.2.3.4) frag. needed and DF set unreachable sent to 6.7.8.9


Hmmm..


Maybe I get a lot of those during the day and they get dropped incrementing those counters.

j-marenda Thu, 09/09/2010 - 10:46

Have you found out which process the CPU load is related to
(show proc cpu sorted)?


Have you also checked whether the switch port also
believes in 100-full?


Juergen.
