6509 - Output Queue Drops - Uncongested Interfaces - 6748-GE-TX

Unanswered Question
May 23rd, 2010

I have an interesting issue where I am seeing high output queue drops between two servers on the same VLAN.

Servers are 1Gb attached and speed/duplex has been verified. The servers run around 40-50Mb/s and then burst up to 150-300Mb/s. It is during these bursts that I am seeing very high numbers (in the thousands to tens of thousands) of output queue drops.

I see these bursts both on the port connected to the server itself, as well as on the Port-Channel interfaces between the two 6509's.

mls qos is enabled, and TAC recommended disabling it. Because of other services on the switch this is not an option, and it is not something I think should be 'encouraged' as the way to solve an issue.

The two servers are on two separate 6509's, both connected via 6748-GE-TX. The traffic itself is traversing a 4Gb/s port-channel. Both the port-channel and the destination server port show very high rates of output drops during bursts of traffic.

Are there any suggestions, or any other information I could provide, that would help diagnose this issue?

Everything from Cisco seems to suggest this is a simple link congestion issue, but we've seen time and time again that the links are simply not congested, and are only running at 20-30% of line rate.
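For what it's worth, an averaged utilisation figure and tail drops are not mutually exclusive. The toy model below is only a sketch: it assumes a burst that momentarily arrives faster than the egress port can drain (for example, several ingress ports converging on one egress port), and the buffer size, burst rate and timings are made-up values, not measurements from a 6748-GE-TX. If the only sender really is a single 1Gb/s-attached server, this model on its own would not explain the drops.

```python
def simulate(arrivals_bits, drain_bits_per_tick, buffer_bits):
    """Feed per-tick arrivals through a finite FIFO; return (sent, dropped) in bits."""
    queued = sent = dropped = 0
    for arriving in arrivals_bits:
        # Accept what fits in the buffer; tail-drop the rest.
        accepted = min(arriving, buffer_bits - queued)
        dropped += arriving - accepted
        queued += accepted
        # Drain at line rate for one tick.
        out = min(queued, drain_bits_per_tick)
        queued -= out
        sent += out
    return sent, dropped


# Hypothetical numbers: 1 ms ticks, 1 Gb/s egress port, 500 KB of buffer.
DRAIN = 1_000_000        # bits drained per 1 ms tick at 1 Gb/s
BUFFER = 4_000_000       # assumed buffer size in bits (not a hardware spec)
# A 10 ms burst at 2 Gb/s followed by 90 ms of silence.
arrivals = [2_000_000] * 10 + [0] * 90

offered = sum(arrivals)
avg_load = offered / (DRAIN * len(arrivals))
sent, dropped = simulate(arrivals, DRAIN, BUFFER)
print(f"average load {avg_load:.0%}, dropped {dropped} of {offered} bits")
```

With these assumed values the link averages 20% of line rate over the 100 ms window, yet a third of the burst is dropped, because the drops depend on the queue depth during the burst and not on the long-term average.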

Acennami

I have the exact same problem, but on a different platform: a 9-member 3750G stack.

I too asked TAC what the issue was and they told me to check into bugs in IOS.

I am running IOS 12.2(40)SE, and Cisco reports bugs in 12.2(44)SE that cause these output drops. I am presently waiting to see whether my issue is a previously undocumented bug in the IOS version I am running.

My case is still pending.

Have you checked whether a bug could be the cause?

acennami Mon, 05/24/2010 - 11:40

The issue appears to be related to CoS 0 traffic being dropped by WRR on the interface.

This only occurs during a large burst of traffic on the interface (which occurs regularly due to a scheduled batch job).

There is no other traffic going to the server on the particular port (or port channels) that would be congesting the network to 1Gb/s, so it is definitely not a line saturation issue.

I understand the traffic is in the best effort queue, but I don't understand why traffic would be dropped when there is no other traffic in any other queues (either on the port channel interface or the interface attached directly to the destination server).
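One possible explanation, offered here as an assumption and not something confirmed in this thread, is that with QoS enabled each transmit queue only gets a share of the port's buffer, so a burst can overflow the best-effort queue's share even while the other queues sit empty. The back-of-the-envelope sketch below shows how the absorbable burst length shrinks with the queue's share; the buffer size, share and rates are hypothetical values, not 6748-GE-TX figures.

```python
def burst_absorb_ms(buffer_bytes, queue_share, arrival_bps, drain_bps):
    """Milliseconds a queue can absorb a burst before it starts tail-dropping."""
    if arrival_bps <= drain_bps:
        return float("inf")  # the queue never builds up
    queue_bits = buffer_bytes * 8 * queue_share
    return queue_bits * 1000 / (arrival_bps - drain_bps)


# Hypothetical port: 1,000,000 bytes of transmit buffer, 1 Gb/s drain rate,
# burst arriving at 2 Gb/s.
whole_buffer = burst_absorb_ms(1_000_000, 1.0, 2_000_000_000, 1_000_000_000)
half_buffer = burst_absorb_ms(1_000_000, 0.5, 2_000_000_000, 1_000_000_000)
print(f"whole buffer: {whole_buffer} ms, 50% share: {half_buffer} ms")
```

Under these assumptions, halving the queue's share halves the burst it can ride out (8 ms down to 4 ms), which would be consistent with drops disappearing when QoS is disabled, and with raising the queue limit for the affected queue being an alternative worth testing.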

I believe Cisco when they suggest that disabling "mls qos" on the switch (and inherently the ports) would resolve the issue, but I don't think that is a valid fix for one particular system.

Rajeev Sharma Mon, 05/24/2010 - 11:42

Hi Acennami,

Have you tried changing the load-balancing algorithm on the port channel? How many ports are in the port-channel?
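The reason the algorithm matters is that a port channel does not spread a single flow across its members: a hash of selected header fields picks one member link per flow, so one server-to-server conversation can end up pinned to a single 1Gb/s link of the 4Gb/s bundle. The sketch below illustrates the idea only; the CRC-based hash, the function name and the addresses are illustrative assumptions, not the hash the switch actually uses.

```python
import zlib


def pick_member(src_ip, dst_ip, n_links):
    """Map a source/destination pair to a member link index (illustrative hash only)."""
    key = f"{src_ip}->{dst_ip}".encode()
    return zlib.crc32(key) % n_links


# Hypothetical addresses from the documentation range, 4-link bundle.
first = pick_member("192.0.2.10", "192.0.2.20", 4)
again = pick_member("192.0.2.10", "192.0.2.20", 4)
print(f"flow uses member link {first} every time: {first == again}")
```

Because the mapping is deterministic, every packet of the same flow lands on the same member, and changing the hash inputs (for example, adding port numbers where the platform supports it) only helps if the burst consists of more than one flow.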

Moreover, what QoS marking is the traffic arriving with, and in which queue are you seeing the drops? We can tweak the thresholds of that queue.

Regards,

Raj.
