packets dropped, why?!

yurypakhomenko
Level 1

Hello everyone,

I noticed that some ports of an access switch drop packets (0-10 per minute)

and some other ports on the same switch don't. I cannot see any difference in the

configuration of the ports. The end systems connected to the ports are the same

hardware running Debian GNU/Linux 6.0 and the same application software.

Load is no issue... and is very similar on all ports.

I would like to find out which packets are being dropped and why,

your ideas || suggestions are highly appreciated.

All ports are configured like:

interface GigabitEthernet1/0/28

description srv10001.prod

switchport access vlan 157

switchport mode access

speed 1000

duplex full

spanning-tree portfast

end

All ports are directly connected to HP blades

(each blade has 2 NICs connected to different switches:

bond0     Link encap:Ethernet  HWaddr 1c:c1:de:73:b4:b2 

          inet addr:10.215.157.127  Bcast:10.215.157.255  Mask:255.255.255.0

          inet6 addr: fe80::1ec1:deff:fe73:b4b2/64 Scope:Link

          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1

          RX packets:17448344359 errors:0 dropped:6 overruns:0 frame:0

          TX packets:10135407025 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:9973091124766 (9.0 TiB)  TX bytes:5301644097389 (4.8 TiB)

....

eth4      Link encap:Ethernet  HWaddr 1c:c1:de:73:b4:b2 

          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1

          RX packets:14472774476 errors:0 dropped:6 overruns:0 frame:0

          TX packets:10135155964 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:9781204401459 (8.8 TiB)  TX bytes:5301519982409 (4.8 TiB)

          Interrupt:42

eth5      Link encap:Ethernet  HWaddr 1c:c1:de:73:b4:b2 

          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1

          RX packets:2975569883 errors:0 dropped:0 overruns:0 frame:0

          TX packets:251061 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:191886723307 (178.7 GiB)  TX bytes:124114980 (118.3 MiB)

          Interrupt:47

)

------------------------------------------

GigabitEthernet1/0/28 is up, line protocol is up (connected)

  Hardware is Gigabit Ethernet, address is 04fe.7f5f.889c (bia 04fe.7f5f.889c)

  Description: xomi100.prod

  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX

  input flow-control is off, output flow-control is unsupported

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input never, output 00:00:01, output hang never

  Last clearing of "show interface" counters 2y36w

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 452289

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 3077000 bits/sec, 922 packets/sec

  5 minute output rate 7419000 bits/sec, 1229 packets/sec

     26828800323 packets input, 12824245603611 bytes, 0 no buffer

     Received 81534410 broadcasts (381 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 381 multicast, 0 pause input

     0 input packets with dribble condition detected

     39995982660 packets output, 24268168094858 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 PAUSE output

     0 output buffer failures, 0 output buffers swapped out

--------------------------

sh ver

Cisco IOS Software, C3750E Software (C3750E-UNIVERSALK9-M), Version 12.2(55)SE, RELEASE SOFTWARE (fc2)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2010 by Cisco Systems, Inc.

Compiled Sat 07-Aug-10 21:43 by prod_rel_team

Image text-base: 0x00003000, data-base: 0x02800000

ROM: Bootstrap program is C3750E boot loader

BOOTLDR: C3750E Boot Loader (C3750E-HBOOT-M) Version 12.2(44r)SE3, RELEASE SOFTWARE (fc3)

c3750-3-4-1 uptime is 2 years, 36 weeks, 5 days, 6 hours, 16 minutes

System returned to ROM by power-on

System restarted at 16:25:47 MET+1 Mon Feb 14 2011

System image file is "flash:/c3750e-universalk9-mz.122-55.SE.bin"

This product contains cryptographic features and is subject to United

States and local country laws governing import, export, transfer and

use. Delivery of Cisco cryptographic products does not imply

third-party authority to import, export, distribute or use encryption.

Importers, exporters, distributors and users are responsible for

compliance with U.S. and local country laws. By using this product you

agree to comply with applicable laws and regulations. If you are unable

to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:

http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to

export@cisco.com.

License Level: ipbase

License Type: Permanent

Next reload license Level: ipbase

cisco WS-C3750E-48TD (PowerPC405) processor (revision F0) with 262144K bytes of memory.

Processor board ID FDO1353R0JY

Last reset from power-on

5 Virtual Ethernet interfaces

1 FastEthernet interface

52 Gigabit Ethernet interfaces

2 Ten Gigabit Ethernet interfaces

The password-recovery mechanism is enabled.

512K bytes of flash-simulated non-volatile configuration memory.

Base ethernet MAC Address       : 04:FE:7F:5F:88:80

Motherboard assembly number     : 73-11175-13

Motherboard serial number       : FDO13530L1H

Model revision number           : F0

Motherboard revision number     : A0

Model number                    : WS-C3750E-48TD-S

Daughterboard assembly number   : 800-29737-01

Daughterboard serial number     : FDO135305CL

System serial number            : FDO1353R0JY

Top Assembly Part Number        : 800-28920-01

Top Assembly Revision Number    : C0

Version ID                      : V03

CLEI Code Number                : COM9V10ARB

Hardware Board Revision Number  : 0x00

Switch Ports Model              SW Version            SW Image                

------ ----- -----              ----------            ----------              

*    1 54    WS-C3750E-48TD     12.2(55)SE            C3750E-UNIVERSALK9-M    

Configuration register is 0xF

-----------------------

sh proc cpu

CPU utilization for five seconds: 17%/2%; one minute: 13%; five minutes: 13%

Thank you and best regards,

Yura

10 Replies

Joseph W. Doherty
Hall of Fame

Disclaimer

The  Author of this posting offers the information contained within this  posting without consideration and with the reader's understanding that  there's no implied or expressed suitability or fitness for any purpose.  Information provided is for informational purposes only and should not  be construed as rendering professional advice of any kind. Usage of this  posting's information is solely at reader's own risk.

Liability Disclaimer

In  no event shall Author be liable for any damages whatsoever (including,  without limitation, damages for loss of use, data or profit) arising out  of the use or inability to use the posting's information even if Author  has been advised of the possibility of such damage.

Posting

> Load is no issue... and is very similar on all ports.

> I would like to find out which packets are being dropped and why,

Load generally is the issue. If the overload comes from microbursts, you normally don't "see" it with routine load monitoring, because typical load monitoring reports averages across minutes, whereas queue bursts last milliseconds.
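
To illustrate how an average can hide a microburst, here is a minimal Python sketch of a tail-drop FIFO egress queue stepped in 1 ms slots. All numbers in it (a drain rate of 80 packets per ms, a 100-packet queue, one 300-packet burst) are made-up assumptions for illustration; they are not measurements from this switch and do not reflect its real buffer sizes.

```python
def simulate_egress_queue(arrivals_per_ms, drain_per_ms, queue_limit):
    """Step a tail-drop FIFO queue in 1 ms slots.

    Returns (dropped_packets, average_load), where average_load is the
    offered traffic divided by what the port could have sent in total.
    """
    queue = 0
    dropped = 0
    for arriving in arrivals_per_ms:
        queue += arriving
        if queue > queue_limit:
            # Anything that does not fit in the buffer is tail-dropped.
            dropped += queue - queue_limit
            queue = queue_limit
        # The port transmits up to drain_per_ms packets in this slot.
        queue = max(0, queue - drain_per_ms)
    capacity = len(arrivals_per_ms) * drain_per_ms
    return dropped, sum(arrivals_per_ms) / capacity


if __name__ == "__main__":
    # Hypothetical traffic: 1 packet per ms for one second,
    # plus a single 300-packet burst in one slot.
    arrivals = [1] * 1000
    arrivals[500] = 300
    dropped, load = simulate_egress_queue(arrivals, drain_per_ms=80, queue_limit=100)
    print(f"dropped={dropped} average_load={load:.1%}")
```

With these assumed values the port looks less than 2% loaded over the whole second, yet the single one-millisecond burst still overflows the queue and packets are dropped.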

Is QoS globally enabled?  If so, default buffer settings generally support less queuing per egress queue.

Hi,

As far as I can see, QoS is disabled:

#sh mls qos

QoS is disabled

QoS ip packet dscp rewrite is enabled

#sh run | i qos

mls qos map cos-dscp 0 8 16 24 32 46 46 56

Anyway, I increased the buffer and it did not help.

Input queue: 0/500/0/0 (size/max/drops/flushes); Total output drops: 455798

  Queueing strategy: fifo

  Output queue: 0/500 (size/max)
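
The two `Total output drops` readings quoted in this thread for GigabitEthernet1/0/28 (452289 earlier, 455798 here) can be compared with a small script. This is only a sketch: the function names are invented for illustration, and it assumes the counter is printed exactly as in the output above.

```python
import re

# Matches the counter as it is printed in the "show interface" output above.
DROPS_RE = re.compile(r"Total output drops:\s*(\d+)")


def output_drops(show_interface_text):
    """Extract the 'Total output drops' counter from show-interface text."""
    match = DROPS_RE.search(show_interface_text)
    if match is None:
        raise ValueError("no 'Total output drops' counter found")
    return int(match.group(1))


def drops_between(earlier_text, later_text):
    """Number of output drops added between two snapshots of the same port."""
    return output_drops(later_text) - output_drops(earlier_text)


# The two lines quoted in this thread.
EARLIER = "Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 452289"
LATER = "Input queue: 0/500/0/0 (size/max/drops/flushes); Total output drops: 455798"

if __name__ == "__main__":
    print(drops_between(EARLIER, LATER))  # 3509
```

The thread does not say how much time passed between the two readings, so this gives a count, not a rate. Recording a timestamp with each snapshot would be needed to turn the difference into drops per minute.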

How can I monitor queue bursts in milliseconds?

Shouldn't I see something in the log if a packet is dropped due to the queue limitation?

#sh log

Syslog logging: enabled (0 messages dropped, 1 messages rate-limited, 8 flushes, 0 overruns, xml disabled, filtering disabled)

No Active Message Discriminator.

No Inactive Message Discriminator.

    Console logging: level debugging, 127895 messages logged, xml disabled,

                     filtering disabled

    Monitor logging: level debugging, 3 messages logged, xml disabled,

                     filtering disabled

        Logging to: vty1(3)

    Buffer logging:  level debugging, 127871 messages logged, xml disabled,

                     filtering disabled

    Exception Logging: size (4096 bytes)

    Count and timestamp logging messages: disabled

    File logging: disabled

    Persistent logging: disabled

No active filter modules.

    Trap logging: level informational, 127872 message lines logged

        Logging to 10.215.249.101  (udp port 514,  audit disabled,

              authentication disabled, encryption disabled, link up),

              127872 message lines logged,

              0 message lines rate-limited,

              0 message lines dropped-by-MD,

              xml disabled, sequence number disabled

              filtering disabled

Log Buffer (4096 bytes):

...

very old messages

...

Posting

Ok, with QoS disabled you should have maximum buffers for the egress queue.

The problem with the 3560/3750 series is that their buffer resources are undersized for really bursty traffic. The 4800/4900 switches are much better.

Since different ASICs control different groups of ports, you might try listing your ports by their total usage and then "rotating" them across your ASICs.
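
One way to read the "rotate" suggestion is to rank ports by usage and deal them out round-robin, so the busiest ports do not end up on the same ASIC. The sketch below shows only that idea. The port names, usage figures and the number of ASICs are hypothetical, and the real port-to-ASIC mapping would have to be looked up on the switch itself.

```python
def spread_ports(usage_by_port, asic_count):
    """Deal ports out to ASIC groups round-robin, busiest first.

    usage_by_port maps a port name to any usage figure (for example
    total bytes output). Returns one list of port names per ASIC.
    """
    groups = [[] for _ in range(asic_count)]
    ranked = sorted(usage_by_port, key=usage_by_port.get, reverse=True)
    for index, port in enumerate(ranked):
        groups[index % asic_count].append(port)
    return groups


if __name__ == "__main__":
    # Hypothetical usage figures, not taken from the thread.
    usage = {"Gi1/0/1": 900, "Gi1/0/2": 850, "Gi1/0/3": 40, "Gi1/0/4": 30}
    for asic, ports in enumerate(spread_ports(usage, asic_count=2)):
        print(asic, ports)
```

Each returned group is the set of servers that would be cabled to ports served by one ASIC. Moving cables means a short outage per server, so this is a planning aid rather than something to apply blindly.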

I believe some of the 3560/3750 series might have some tweaks in their later IOS releases; since you're running an old IOS, you might consider moving to newer code.

I also believe some of the 3560/3750 series provided more buffers for their "uplink" ports. So you might try moving a very busy port to an "uplink" port (using copper "optic"). (E.g.: "The total available common pool for egress buffers varies from one platform to the other. They are more limited in 2960-S: 2MB for the whole system (downlink ports + uplink ports), while 3750-X has 2MB for each set of 24 downlink ports and 2MB for uplinks.")

Thank you for your suggestion, but before I do anything that will or may cause

even a short downtime, I must find out exactly which packets are dropped and why.

Short bursts and buffer overflow are a probable cause, but not a proven one.

Are there any debug || log options which may help to catch the dropped packets?

Posting

> Are there any debug || log options which may help to catch the dropped packets?

Not that I'm aware of.

Leo Laohoo
Hall of Fame

Hmmmm ... You've hard-coded speed and duplex.  What happens if you set both to "auto"?

Thanks for your suggestion, but we have had some bad experiences trying to auto-negotiate with HP blades.

I believe as long as both ports (switch port and server’s NIC) are configured the same way, it is ok...

Fair enough ... This is on a 3750E?

Can you then please post the output to the command "sh controller e "?

> This is on a 3750E?

I think so, sh ver says:

Cisco IOS Software, C3750E Software (C3750E-UNIVERSALK9-M), Version 12.2(55)SE, RELEASE SOFTWARE (fc2)

> Can you then please post the output to the command "sh controller e "?

here you are:

sh controller ethernet-controller Gi1/0/28        

     Transmit GigabitEthernet1/0/28           Receive

   3527988499 Bytes                       1447264996 Bytes                   

   3868991458 Unicast frames              1117294191 Unicast frames          

     89627015 Multicast frames                   393 Multicast frames        

   1857077005 Broadcast frames              81703508 Broadcast frames        

            0 Too old frames               482090686 Unicast bytes           

            0 Deferred frames                  35310 Multicast bytes         

            0 MTU exceeded frames          965139000 Broadcast bytes         

            0 1 collision frames                   0 Alignment errors        

            0 2 collision frames                   0 FCS errors              

            0 3 collision frames                   0 Oversize frames         

            0 4 collision frames                   0 Undersize frames        

            0 5 collision frames                   0 Collision fragments     

            0 6 collision frames      

            0 7 collision frames            78873669 Minimum size frames     

            0 8 collision frames            90088428 65 to 127 byte frames   

            0 9 collision frames           818497755 128 to 255 byte frames  

            0 10 collision frames          736987613 256 to 511 byte frames  

            0 11 collision frames          987968502 512 to 1023 byte frames 

            0 12 collision frames         2781549421 1024 to 1518 byte frames

            0 13 collision frames                  0 Overrun frames          

            0 14 collision frames                  0 Pause frames            

            0 15 collision frames     

            0 Excessive collisions                 0 Symbol error frames     

            0 Late collisions                      0 Invalid frames, too large

            0 VLAN discard frames                  0 Valid frames, too large 

            0 Excess defer frames                  0 Invalid frames, too small

   1141069000 64 byte frames                       0 Valid frames, too small 

   3048750263 127 byte frames         

   1214220437 255 byte frames                      0 Too old frames          

   1693994121 511 byte frames                      0 Valid oversize frames   

   1018955364 1023 byte frames                     0 System FCS error frames 

   1993673589 1518 byte frames                     0 RxPortFifoFull drop frame

            0 Too large frames        

            0 Good (1 coll) frames    

            0 Good (>1 coll) frames   

2nd output is fine.

First output, about your IOS, is not.  I'm not a big fan of version "0".  Try 12.2(55)SE8 and see if the problem re-occurs.
