10-28-2013 01:21 PM - edited 03-07-2019 04:17 PM
Hello everyone,
I noticed that some ports of an access switch drop packets (0-10 per minute)
and some other ports on the same switch don't. I cannot see any difference in the
configuration of the ports. The end systems connected to the ports are the same
hardware running Debian GNU/Linux 6.0 and the same application software.
Load is no issue... and is very similar on all ports.
I would like to find out which packets are being dropped, and why.
Your ideas || suggestions are highly appreciated.
All ports are configured like:
interface GigabitEthernet1/0/28
description srv10001.prod
switchport access vlan 157
switchport mode access
speed 1000
duplex full
spanning-tree portfast
end
All ports are directly connected to HP blades
(each blade has 2 NICs connected to different switches:
bond0 Link encap:Ethernet HWaddr 1c:c1:de:73:b4:b2
inet addr:10.215.157.127 Bcast:10.215.157.255 Mask:255.255.255.0
inet6 addr: fe80::1ec1:deff:fe73:b4b2/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:17448344359 errors:0 dropped:6 overruns:0 frame:0
TX packets:10135407025 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:9973091124766 (9.0 TiB) TX bytes:5301644097389 (4.8 TiB)
....
eth4 Link encap:Ethernet HWaddr 1c:c1:de:73:b4:b2
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:14472774476 errors:0 dropped:6 overruns:0 frame:0
TX packets:10135155964 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:9781204401459 (8.8 TiB) TX bytes:5301519982409 (4.8 TiB)
Interrupt:42
eth5 Link encap:Ethernet HWaddr 1c:c1:de:73:b4:b2
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:2975569883 errors:0 dropped:0 overruns:0 frame:0
TX packets:251061 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:191886723307 (178.7 GiB) TX bytes:124114980 (118.3 MiB)
Interrupt:47
)
------------------------------------------
GigabitEthernet1/0/28 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is 04fe.7f5f.889c (bia 04fe.7f5f.889c)
Description: xomi100.prod
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input never, output 00:00:01, output hang never
Last clearing of "show interface" counters 2y36w
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 452289
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 3077000 bits/sec, 922 packets/sec
5 minute output rate 7419000 bits/sec, 1229 packets/sec
26828800323 packets input, 12824245603611 bytes, 0 no buffer
Received 81534410 broadcasts (381 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 381 multicast, 0 pause input
0 input packets with dribble condition detected
39995982660 packets output, 24268168094858 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
--------------------------
sh ver
Cisco IOS Software, C3750E Software (C3750E-UNIVERSALK9-M), Version 12.2(55)SE, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2010 by Cisco Systems, Inc.
Compiled Sat 07-Aug-10 21:43 by prod_rel_team
Image text-base: 0x00003000, data-base: 0x02800000
ROM: Bootstrap program is C3750E boot loader
BOOTLDR: C3750E Boot Loader (C3750E-HBOOT-M) Version 12.2(44r)SE3, RELEASE SOFTWARE (fc3)
c3750-3-4-1 uptime is 2 years, 36 weeks, 5 days, 6 hours, 16 minutes
System returned to ROM by power-on
System restarted at 16:25:47 MET+1 Mon Feb 14 2011
System image file is "flash:/c3750e-universalk9-mz.122-55.SE.bin"
This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.
A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html
If you require further assistance please contact us by sending email to
License Level: ipbase
License Type: Permanent
Next reload license Level: ipbase
cisco WS-C3750E-48TD (PowerPC405) processor (revision F0) with 262144K bytes of memory.
Processor board ID FDO1353R0JY
Last reset from power-on
5 Virtual Ethernet interfaces
1 FastEthernet interface
52 Gigabit Ethernet interfaces
2 Ten Gigabit Ethernet interfaces
The password-recovery mechanism is enabled.
512K bytes of flash-simulated non-volatile configuration memory.
Base ethernet MAC Address : 04:FE:7F:5F:88:80
Motherboard assembly number : 73-11175-13
Motherboard serial number : FDO13530L1H
Model revision number : F0
Motherboard revision number : A0
Model number : WS-C3750E-48TD-S
Daughterboard assembly number : 800-29737-01
Daughterboard serial number : FDO135305CL
System serial number : FDO1353R0JY
Top Assembly Part Number : 800-28920-01
Top Assembly Revision Number : C0
Version ID : V03
CLEI Code Number : COM9V10ARB
Hardware Board Revision Number : 0x00
Switch Ports Model SW Version SW Image
------ ----- ----- ---------- ----------
* 1 54 WS-C3750E-48TD 12.2(55)SE C3750E-UNIVERSALK9-M
Configuration register is 0xF
-----------------------
sh proc cpu
CPU utilization for five seconds: 17%/2%; one minute: 13%; five minutes: 13%
Thank you and best regards,
Yura
10-28-2013 06:16 PM
Disclaimer
The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.
Liability Disclaimer
In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.
Posting
> Load is no issue... and is very similar on all ports. I would like to find out which packets and why are being dropped,
Load generally is the issue. If the overload comes from microbursts, you normally don't "see" it with routine load monitoring. (This is because typical load monitoring averages across minutes, whereas queue bursts last milliseconds.)
Is QoS globally enabled? If so, default buffer settings generally support less queuing per egress queue.
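To illustrate the averaging point, here is a toy simulation, not a model of the 3750E's real buffering. The buffer size, burst rate, burst length and burst period are all invented for the example; only the 1 Gbit/s line rate and the 5-minute window come from the thread. For comparison, the port shown earlier reports a 5-minute output rate of 7419000 bits/sec on a 1000000 Kbit link, which is roughly 0.74% average utilization.

```python
# Toy model: a 5-minute average can look idle while millisecond bursts overflow a buffer.
LINE_RATE_BPS = 1_000_000_000   # 1 Gbit/s egress port, as in the thread
BUFFER_BYTES = 100_000          # ASSUMPTION: hypothetical egress buffer share for one port
WINDOW_S = 300                  # 5-minute averaging window, as in "show interface"


def simulate(burst_bps, burst_ms, period_ms):
    """Offer burst_bps for burst_ms out of every period_ms; simulate in 1 ms ticks."""
    drain_per_tick = LINE_RATE_BPS // 8 // 1000   # bytes the port can send per ms
    burst_per_tick = burst_bps // 8 // 1000       # bytes arriving per ms during a burst
    queue = offered = dropped = 0
    for tick in range(WINDOW_S * 1000):
        arriving = burst_per_tick if tick % period_ms < burst_ms else 0
        offered += arriving
        queue += arriving
        queue -= min(queue, drain_per_tick)       # transmit at line rate
        if queue > BUFFER_BYTES:                  # tail drop whatever does not fit
            dropped += queue - BUFFER_BYTES
            queue = BUFFER_BYTES
    avg_util = offered * 8 / WINDOW_S / LINE_RATE_BPS
    return avg_util, dropped


# ASSUMPTION: three senders hit the port at line rate for 2 ms, once per second.
avg_util, dropped = simulate(burst_bps=3_000_000_000, burst_ms=2, period_ms=1000)
print(f"average utilization over 5 min: {avg_util:.2%}")
print(f"bytes dropped in the window:    {dropped}")
```

With these made-up numbers the averaged utilization stays below 1% while the queue overflows during every burst, which is the pattern a per-minute load graph cannot show.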
10-29-2013 11:33 AM
Hi,
As far as I can see, QoS is disabled:
#sh mls qos
QoS is disabled
QoS ip packet dscp rewrite is enabled
#sh run | i qos
mls qos map cos-dscp 0 8 16 24 32 46 46 56
Anyway, I increased the buffer and it did not help.
Input queue: 0/500/0/0 (size/max/drops/flushes); Total output drops: 455798
Queueing strategy: fifo
Output queue: 0/500 (size/max)
How can I monitor queue bursts in milliseconds?
Shouldn't I see something in the log if a packet is dropped due to the queue limitation?
#sh log
Syslog logging: enabled (0 messages dropped, 1 messages rate-limited, 8 flushes, 0 overruns, xml disabled, filtering disabled)
No Active Message Discriminator.
No Inactive Message Discriminator.
Console logging: level debugging, 127895 messages logged, xml disabled,
filtering disabled
Monitor logging: level debugging, 3 messages logged, xml disabled,
filtering disabled
Logging to: vty1(3)
Buffer logging: level debugging, 127871 messages logged, xml disabled,
filtering disabled
Exception Logging: size (4096 bytes)
Count and timestamp logging messages: disabled
File logging: disabled
Persistent logging: disabled
No active filter modules.
Trap logging: level informational, 127872 message lines logged
Logging to 10.215.249.101 (udp port 514, audit disabled,
authentication disabled, encryption disabled, link up),
127872 message lines logged,
0 message lines rate-limited,
0 message lines dropped-by-MD,
xml disabled, sequence number disabled
filtering disabled
Log Buffer (4096 bytes):
...
very old messages
...
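One cheap thing to do with the counters already posted is to turn two readings of "Total output drops" into a rate. The sketch below pairs the two readings quoted in this thread with the post timestamps, which is only an approximation, since the real capture times are not given. Polling the same counter at short intervals would narrow down when drops occur, but not which packets they were.

```python
import re
from datetime import datetime

# Two readings quoted in this thread. The timestamps are the post times,
# used here as a stand-in for the unknown capture times (ASSUMPTION).
SAMPLES = [
    ("10-28-2013 01:21 PM",
     "Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 452289"),
    ("10-29-2013 11:33 AM",
     "Input queue: 0/500/0/0 (size/max/drops/flushes); Total output drops: 455798"),
]

DROPS_RE = re.compile(r"Total output drops:\s*(\d+)")


def parse_sample(stamp, line):
    """Return (datetime, drop counter) for one 'show interface' line."""
    when = datetime.strptime(stamp, "%m-%d-%Y %I:%M %p")
    match = DROPS_RE.search(line)
    if match is None:
        raise ValueError("no output-drop counter in line")
    return when, int(match.group(1))


def drop_rate_per_minute(first, second):
    """Return (counter delta, elapsed minutes, drops per minute)."""
    (t0, d0), (t1, d1) = parse_sample(*first), parse_sample(*second)
    minutes = (t1 - t0).total_seconds() / 60
    return d1 - d0, minutes, (d1 - d0) / minutes


delta, minutes, rate = drop_rate_per_minute(*SAMPLES)
print(f"{delta} drops in {minutes:.0f} min -> {rate:.1f} drops/min")
```

This works out to 3509 drops over about 1332 minutes, or roughly 2.6 drops per minute, which fits the 0-10 per minute reported in the first post.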
10-29-2013 12:22 PM
Ok, with QoS disabled you should have maximum buffers for the egress queue.
The problem with the 3560/3750 series is that their buffer resources are undersized for really bursty traffic. The 4800/4900 switches are much better.
What you might try, as different ASICs control different groups of ports, you might list your ports by their total usage and then "rotate" them across your ASICs.
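The "rotate" idea can be sketched as a simple balancing exercise. Everything in the example is hypothetical: the port names, the load figures and the number of ASICs are invented, and the real port-to-ASIC grouping depends on the platform and would have to be looked up on the switch. The sketch also ignores that each ASIC serves a fixed number of physical ports.

```python
import heapq


def spread_ports(port_load, asic_count):
    """Greedy balance: take the busiest ports first, put each on the least-loaded ASIC."""
    heap = [(0, asic, []) for asic in range(asic_count)]   # (total load, ASIC id, ports)
    heapq.heapify(heap)
    by_load = sorted(port_load.items(), key=lambda kv: kv[1], reverse=True)
    for port, load in by_load:
        total, asic, ports = heapq.heappop(heap)           # currently least-loaded ASIC
        ports.append(port)
        heapq.heappush(heap, (total + load, asic, ports))
    return {asic: (total, ports) for total, asic, ports in heap}


# HYPOTHETICAL per-port totals (any consistent unit, e.g. output Mbit/s).
port_load = {
    "Gi1/0/1": 900, "Gi1/0/2": 850, "Gi1/0/3": 100,
    "Gi1/0/4": 120, "Gi1/0/5": 40, "Gi1/0/6": 700,
}

plan = spread_ports(port_load, asic_count=2)
for asic in sorted(plan):
    total, ports = plan[asic]
    print(f"ASIC {asic}: load {total}, ports {ports}")
```

A greedy split like this is not optimal, but it avoids the worst case of all the busiest ports sharing one buffer pool.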
I believe some of the 3560/3750 series might have some tweaks in their later IOSs; since you're running an old IOS, you might consider moving to newer code.
I also believe some of the 3560/3750 series provided more buffers for their "uplink" ports. So you might try a very busy port on an "uplink" port (using copper "optic"). (E.g.: "The total available common pool for egress buffers varies from one platform to the other. They are more limited in 2960-S: 2MB for the whole system (downlink ports + uplink ports), while 3750-X has 2MB for each set of 24 downlink ports and 2MB for uplinks.")
10-30-2013 11:58 AM
Thank you for your suggestion, but before I do anything that will or may cause
even a short downtime, I must find out exactly which packets are dropped and why.
Short bursts and buffer overflow are a probable cause, but not a proven one.
Are there any debug || log options which may help to catch the dropped packets?
10-30-2013 05:25 PM
> Are there any debug || log options which may help to catch the dropped packets?
Not that I'm aware of.
10-28-2013 06:27 PM
Hmmmm ... You've hard-coded speed and duplex. What happens if you set both to "auto"?
10-29-2013 11:27 AM
Thanks for your suggestion, but we have had bad experiences trying to auto-negotiate with HP blades.
I believe as long as both ports (switch port and server’s NIC) are configured the same way, it is ok...
10-29-2013 01:58 PM
Fair enough ... This is on a 3750E?
Can you then please post the output to the command "sh controller e
10-30-2013 11:50 AM
> This is on a 3750E?
I think so, sh ver says:
Cisco IOS Software, C3750E Software (C3750E-UNIVERSALK9-M), Version 12.2(55)SE, RELEASE SOFTWARE (fc2)
> Can you then please post the output to the command "sh controller e
here you are:
sh controller ethernet-controller Gi1/0/28
Transmit GigabitEthernet1/0/28  Receive
3527988499 Bytes                1447264996 Bytes
3868991458 Unicast frames       1117294191 Unicast frames
89627015 Multicast frames       393 Multicast frames
1857077005 Broadcast frames     81703508 Broadcast frames
0 Too old frames                482090686 Unicast bytes
0 Deferred frames               35310 Multicast bytes
0 MTU exceeded frames           965139000 Broadcast bytes
0 1 collision frames            0 Alignment errors
0 2 collision frames            0 FCS errors
0 3 collision frames            0 Oversize frames
0 4 collision frames            0 Undersize frames
0 5 collision frames            0 Collision fragments
0 6 collision frames
0 7 collision frames            78873669 Minimum size frames
0 8 collision frames            90088428 65 to 127 byte frames
0 9 collision frames            818497755 128 to 255 byte frames
0 10 collision frames           736987613 256 to 511 byte frames
0 11 collision frames           987968502 512 to 1023 byte frames
0 12 collision frames           2781549421 1024 to 1518 byte frames
0 13 collision frames           0 Overrun frames
0 14 collision frames           0 Pause frames
0 15 collision frames
0 Excessive collisions          0 Symbol error frames
0 Late collisions               0 Invalid frames, too large
0 VLAN discard frames           0 Valid frames, too large
0 Excess defer frames           0 Invalid frames, too small
1141069000 64 byte frames       0 Valid frames, too small
3048750263 127 byte frames
1214220437 255 byte frames      0 Too old frames
1693994121 511 byte frames      0 Valid oversize frames
1018955364 1023 byte frames     0 System FCS error frames
1993673589 1518 byte frames     0 RxPortFifoFull drop frame
0 Too large frames
0 Good (1 coll) frames
0 Good (>1 coll) frames
10-30-2013 01:53 PM
2nd output is fine.
First output, about your IOS, is not. I'm not a big fan of version "0". Try 12.2(55)SE8 and see if the problem re-occurs.