Hi, I'm getting a lot of overruns on a gigabit ethernet interface of a Cisco 6509. The interface is connected to a server which has a gigabit network card. I believe overruns happen when data comes in from the server at a rate the switch port can't handle.
My first question is, if the NIC in the server is 1Gbps then how come it's transmitting at a higher rate than that (if that's what's happening)? The 6509 is a beefy box and I'd have thought it could cope with whatever was chucked at it. It's not overworked by any stretch of the imagination.
The overruns aren't counted as input errors in the show interface output though, which puzzles me.
Do the overruns mean that the packets are actually dropped by the switch? If so, won't that mean there will be a lot of TCP retransmissions?
I've monitored the traffic on the switchport and there's generally about 100-300 Mbps coming from the server (an MS Exchange server).
The server NIC and the switch port are both hard-coded to 1000/full duplex. I was wondering if they should be set to auto instead. I seem to remember reading a while ago that Cisco recommends this, because autonegotiation on a GigabitEthernet interface negotiates other parameters besides speed and duplex (such as flow control).
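For reference, moving the switch side back to autonegotiation would look something like the sketch below. This is an assumption about the exact syntax (it varies by IOS version), and the interface number is just the one from the output further down; the server NIC would need to be set to auto at the same time to avoid a duplex mismatch.

```
RDC-PRI(config)# interface GigabitEthernet8/10
RDC-PRI(config-if)# speed auto
RDC-PRI(config-if)# duplex auto
RDC-PRI(config-if)# end
```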
Here's the show interface output. By the way, it states the counters have never been cleared, but the overruns are ticking up pretty fast all the same. I cleared the counters on another interface connected to another Exchange server a couple of days ago and it has clocked up a load of overruns since then.
RDC-PRI#sh int gi8/10
GigabitEthernet8/10 is up, line protocol is up (connected)
Hardware is C6k 1000Mb 802.3, address is 000e.83b9.2e01 (bia 000e.83b9.2e01)
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 2/255, rxload 46/255
Encapsulation ARPA, loopback not set
input flow-control is off, output flow-control is desired
Clock mode is auto
ARP type: ARPA, ARP Timeout 04:00:00
Last input never, output 00:00:26, output hang never
Last clearing of "show interface" counters 3d05h
Input queue: 0/2000/8127861/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 183552000 bits/sec, 38613 packets/sec
5 minute output rate 10998000 bits/sec, 19970 packets/sec
3197934213 packets input, 1904525434503 bytes, 0 no buffer
Received 7419 broadcasts (679 multicast)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 8127861 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
0 input packets with dribble condition detected
1682151323 packets output, 152965398501 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
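To put those counters in perspective, here's a quick back-of-the-envelope calculation using the numbers from the output above (note the overrun count matches the input queue drop count exactly):

```python
# Rough overrun rate from the "show interface" counters above.
packets_input = 3_197_934_213   # total packets input on gi8/10
overruns = 8_127_861            # overruns, also shown as input queue drops

drop_rate = overruns / packets_input
print(f"{drop_rate:.4%} of input packets dropped as overruns")
```

That's roughly a quarter of a percent of all received packets, which is easily enough loss to cause noticeable TCP retransmissions.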
Could it have anything to do with:
Server NIC drivers/card incompatibility?
Switch IOS/firmware update needed?
Thanks in advance
Overruns indicate that the linecard cannot handle the amount of traffic coming into its ports. Before I even guess at the cause, could you give us the output of show module and show version?
Thanks. Here's the show module output:
RDC-PRI#sh module 8
Mod Ports Card Type Model Serial No.
--- ----- -------------------------------------- ------------------ -----------
8 48 48 port 10/100/1000mb EtherModule WS-X6148-GE-TX SAD080205ZA
Mod MAC addresses Hw Fw Sw Status
--- ---------------------------------- ------ ------------ ------------ -------
8 000e.83b9.2df8 to 000e.83b9.2e27 5.0 7.2(1) 8.3(0.156)RO Ok
And here's the show version output:
Cisco Internetwork Operating System Software
IOS (tm) c6sup2_rp Software (c6sup2_rp-JK9O3SV-M), Version 12.2(18)SXD5, RELEASE
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2005 by cisco Systems, Inc.
Compiled Fri 13-May-05 20:04 by ssearch
Image text-base: 0x4002100C, data-base: 0x42638000
ROM: System Bootstrap, Version 12.2(17r)S1, RELEASE SOFTWARE (fc1)
BOOTLDR: c6sup2_rp Software (c6sup2_rp-JK9O3SV-M), Version 12.2(18)SXD5, RELEASE
RDC-PRI uptime is 4 weeks, 2 days, 20 hours, 26 minutes
Time since RDC-PRI switched to active is 4 weeks, 2 days, 20 hours, 27 minutes
System returned to ROM by power-on (SP by power-on)
System restarted at 18:32:51 UTC Sat May 19 2007
System image file is "disk0:c6k222-jk9o3sv-mz.122-18.SXD5.bin"
This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.
A summary of U.S. laws governing Cisco cryptographic products may be found at:
If you require further assistance please contact us by sending email to
cisco WS-C6509 (R7000) processor (revision 3.0) with 458752K/65536K bytes of mem
Processor board ID TBP07050099
R7000 CPU at 300Mhz, Implementation 0x27, Rev 3.3, 256KB L2, 1024KB L3 Cache
Last reset from power-on
X.25 software, Version 3.0.0.
SuperLAT software (copyright 1990 by Meridian Technology Corp).
TN3270 Emulation software.
9 Virtual Ethernet/IEEE 802.3 interface(s)
96 FastEthernet/IEEE 802.3 interface(s)
68 Gigabit Ethernet/IEEE 802.3 interface(s)
381K bytes of non-volatile configuration memory.
32768K bytes of Flash internal SIMM (Sector size 512K).
Configuration register is 0x2102
I was expecting that. This module was designed to extend gigabit to the desktop; it's not meant for a server farm.
You need to replace this module with one that connects to the switch fabric as well as the bus, not just the bus.
More information on the available modules:
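To make the oversubscription concrete: the WS-X6148-GE-TX is a classic bus-only card, and as I understand it each group of 8 ports shares a single 1 Gbps connection to the forwarding ASIC, i.e. 8:1 oversubscribed. Treat the exact ratio as an assumption and check the data sheet for your hardware revision, but the rough worst-case maths looks like this:

```python
# Worst-case sketch for an oversubscribed port group. The 8:1 ratio is
# an assumption based on the commonly cited figure for this module.
ports_per_group = 8          # ports sharing one ASIC uplink on the WS-X6148-GE-TX
uplink_mbps = 1000           # shared bandwidth per port group, in Mbps
per_port_demand_mbps = 300   # peak rate seen from one Exchange server above

# If every port in the group pushed this much traffic at once:
aggregate_demand = ports_per_group * per_port_demand_mbps
print(f"group demand {aggregate_demand} Mbps vs {uplink_mbps} Mbps available")
```

So even a couple of busy servers landing in the same port group can exceed the shared 1 Gbps, which is exactly when the card starts counting overruns.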
Thanks for that, Mr E. It's given me something to think about. I thought a gigabit switch module would be able to cope with gigabit servers, but as you've pointed out, not all gigabit modules are equal!
Wouldn't the output errors indicate that the NIC on the server isn't keeping up with the packets from the line card?
I am having the same problem, and I just told the Data Center team that I thought the output overruns happened when the NIC on the server couldn't remove the packets from its hardware buffer fast enough. Geesh....
Could you please give me some more detail about the output overruns?