Massive output queue drops 6509

Apr 29th, 2009

Dear all,


We replaced our existing 3750 stack (four 3750s with GE connections) with a 6509.

Since then I have been seeing massive output packet drops on interfaces. I checked the forum, and many people point to the line cards (WS-X6548-GE-TX) as a possible cause. But the load rarely goes over 100 Mbit/s on those interfaces, and I still see drops in the range of 10% of the overall traffic.

For testing I removed all QoS config from the interfaces, but it had no positive effect. I am really looking for help on this.


Please find below the needed information:


Mod Ports Card Type                              Model
--- ----- -------------------------------------- ------------------
  1    24 CEF720 24 port 1000mb SFP              WS-X6724-SFP
  2    48 SFM-capable 48 port 10/100/1000mb RJ45 WS-X6548-GE-TX
  3    48 SFM-capable 48 port 10/100/1000mb RJ45 WS-X6548-GE-TX
  4    48 SFM-capable 48 port 10/100/1000mb RJ45 WS-X6548-GE-TX
  5     2 Supervisor Engine 720 (Active)         WS-SUP720-3B



  Interface            IHQ  IQD  OHQ        OQD     RXBS  RXPS     TXBS  TXPS  TRTL
-----------------------------------------------------------------------------------
* GigabitEthernet3/24    0    0    0     277914   250000    63    81000    83     0
* GigabitEthernet3/25    0    0    0   50710155        0     0        0     0     0
* GigabitEthernet3/26    0    0    0   50710155        0     0     1000     2     0
* GigabitEthernet3/27    0    0    0   50710155        0     0     1000     2     0
* GigabitEthernet3/28    0    0    0   50710155     2000     2     3000     4     0
* GigabitEthernet3/29    0    0    0   50710155        0     0        0     0     0
* GigabitEthernet3/30    0    0    0   50710155     1000     1     2000     3     0
* GigabitEthernet3/31    0    0    0   50710155        0     0        0     0     0
* GigabitEthernet3/32    0    0    0   50710155   380000   158   535000   164     0
* GigabitEthernet3/33    0    0    0  223739947     2000     0    54000     4     0
* GigabitEthernet3/34    0    0    0  223739947        0     0     1000     2     0
* GigabitEthernet3/35    0    0    0  223739947  3976000   513  3250000   467     0
* GigabitEthernet3/36    0    0    0  223739947    15000    13   390000    31     0
* GigabitEthernet3/37    0    0    0  223739947   131000    58    82000    67     0
* GigabitEthernet3/38    0    0    0  223739947        0     0    13000    21     0
  GigabitEthernet3/39    0    0    0  223739947        0     0        0     0     0
  GigabitEthernet3/40    0    0    0  223739947        0     0        0     0     0


interface GigabitEthernet3/35
 switchport
 switchport access vlan XXX
 switchport mode access
 switchport voice vlan XXX
 no ip address
 wrr-queue bandwidth 30 70
 wrr-queue queue-limit 40 30
 wrr-queue random-detect min-threshold 1 40 80
 wrr-queue random-detect min-threshold 2 70 80
 wrr-queue random-detect max-threshold 1 80 100
 wrr-queue random-detect max-threshold 2 80 100
 wrr-queue cos-map 1 1 1
 wrr-queue cos-map 1 2 0
 wrr-queue cos-map 2 1 2 3 4
 wrr-queue cos-map 2 2 6 7
 mls qos trust cos
!
interface GigabitEthernet3/36
 switchport
 switchport access vlan XXX
 switchport mode access
 switchport voice vlan XXX
 no ip address
 wrr-queue bandwidth 30 70
 wrr-queue queue-limit 40 30
 wrr-queue random-detect min-threshold 1 40 80
 wrr-queue random-detect min-threshold 2 70 80
 wrr-queue random-detect max-threshold 1 80 100
 wrr-queue random-detect max-threshold 2 80 100
 wrr-queue cos-map 1 1 1
 wrr-queue cos-map 1 2 0
 wrr-queue cos-map 2 1 2 3 4
 wrr-queue cos-map 2 2 6 7
 mls qos trust cos
 spanning-tree portfast
!

Giuseppe Larosa Fri, 05/01/2009 - 07:03

Hello Andreas,

Have you checked whether these drops are real and affecting actual traffic, or are they simply figures in the show output?


This is to understand whether there is a big impact or whether it can be seen as cosmetic. For example, I see that several ports share the exact same counter value, which is unlikely.
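
A quick way to check this is to group the ports by their OQD value. Here is a minimal Python sketch, with the counters copied from the output in the original post (the function name `group_by_oqd` is only for illustration):

```python
from collections import defaultdict

# OQD (output queue drops) per port on module 3, copied from the posted output.
oqd = {24: 277914}
oqd.update({port: 50710155 for port in range(25, 33)})   # Gi3/25 - Gi3/32
oqd.update({port: 223739947 for port in range(33, 41)})  # Gi3/33 - Gi3/40

def group_by_oqd(counters):
    """Return {counter_value: [ports]} so that shared values stand out."""
    groups = defaultdict(list)
    for port, drops in sorted(counters.items()):
        groups[drops].append(port)
    return dict(groups)

for drops, ports in group_by_oqd(oqd).items():
    print(f"OQD {drops:>10}: Gi3/{ports[0]} to Gi3/{ports[-1]} ({len(ports)} ports)")
```

With the posted numbers, ports 25 to 32 fall into one bucket and ports 33 to 40 into another, which suggests the counter is kept per group of ports rather than per port.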


Hope to help

Giuseppe


stephenshaw Fri, 05/01/2009 - 10:35

Hi Andreas,


this may be what you are experiencing ...


The WS-X6548-GE-TX is designed to "share" 1 Gbit/s among each group of 8 ports; this is referred to as 8:1 oversubscription.


For example, if you load up ports 1 to 8 with heavily utilized servers, this will cause drops.
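
The arithmetic can be sketched roughly as follows in Python. I am assuming the groups are consecutive blocks of eight ports starting at port 1 (1-8, 9-16, and so on) and that each group shares 1 Gbit/s; the helper names are my own, not anything from Cisco:

```python
PORTS_PER_GROUP = 8       # ports sharing one path (assumed blocks: 1-8, 9-16, ...)
PORT_SPEED_GBPS = 1.0     # each front-panel port is 1 Gbit/s
GROUP_SHARE_GBPS = 1.0    # bandwidth shared by the whole group

def port_group(port):
    """Return the 1-based group index for a 1-based port number."""
    return (port - 1) // PORTS_PER_GROUP + 1

def oversubscription():
    """Total front-panel bandwidth of a group divided by its shared bandwidth."""
    return PORTS_PER_GROUP * PORT_SPEED_GBPS / GROUP_SHARE_GBPS

print(f"oversubscription: {oversubscription():.0f}:1")
for port in (25, 32, 33, 40):
    print(f"Gi3/{port} -> group {port_group(port)}")
```

Under that assumption, Gi3/25 through Gi3/32 land in one group and Gi3/33 through Gi3/40 in the next, which lines up with the two sets of identical OQD values in the posted output.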


Cisco recommends using the newer WS-X6748-GE-TX modules, which use a matrix to achieve far better oversubscription ratios depending on specific placement within the chassis. This, of course, means extra cost and requires specific IOS code on the Sup 720.


Now, if this route is not cost-effective for your company, you can try configuring flow control on individual ports/servers and/or ensure that the heavy-hitting servers are not on a common ASIC (which controls each individual group of 8 ports).
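
To see whether heavy hitters sit on a common ASIC, something like the following Python sketch could be used. It assumes the groups are consecutive blocks of eight ports (1-8, 9-16, and so on); `shared_groups` is a hypothetical helper, and the example ports are simply the three with the highest TXBS in the posted output:

```python
from collections import defaultdict

PORTS_PER_GROUP = 8  # assumption: one ASIC per block of 8 ports (1-8, 9-16, ...)

def shared_groups(heavy_ports):
    """Return {group: [ports]} for every group holding more than one heavy port."""
    by_group = defaultdict(list)
    for port in sorted(heavy_ports):
        by_group[(port - 1) // PORTS_PER_GROUP + 1].append(port)
    return {group: ports for group, ports in by_group.items() if len(ports) > 1}

# Busiest ports by TXBS in the posted output: Gi3/32, Gi3/35, Gi3/36.
print(shared_groups([32, 35, 36]))
```

Any group reported here has two or more busy ports competing for the same shared bandwidth, so those would be the first candidates to move to a different block of ports.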


Many companies, including mine, still use the WS-X6548-GE-TX for server connections and we found using flow control for specific servers helped prevent the drops.


HTH,


Steve

Giuseppe Larosa Fri, 05/01/2009 - 10:39

Hello Steve,

I see that the drops are equal for ports 33-40 (and likewise for ports 25-32) in the output. This is caused by these ports sharing a single ASIC chip!


We have similar behaviour in C4500 4548 linecards.


Thanks for your info


Best Regards

Giuseppe


stephenshaw Fri, 05/01/2009 - 10:48

Hi Giuseppe,


The matrix design on the newer line card can give up to 40G per slot (depending on placement in the 6509-E chassis). However, nothing less than 20G is provisioned, which takes the 8:1 oversubscription down to around 2.1:1. That is far better, but it is chassis-specific and IOS-specific, i.e. probably worth the investment for a net new Data Centre switch but perhaps not worth it for upgrading an existing one.


I'm on the hunt for the Cisco link and will post it if I can find it.


regards,


Steve

stephenshaw Fri, 05/01/2009 - 10:54

Hi,


I found the reference link:


http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a00801751d7.shtml#topic3


The details on the new line card and its use of a fabric matrix came from a direct technical session with Cisco, and I'm not aware of any links that describe how the new line card functions.


I would recommend that anyone considering new 6509-E switches for a Data Centre environment arrange a technical session with their Cisco reps to determine the specific requirements and cost.


Steve
