Cisco Support Community

Sudden egress performance problem with QoS on C3560

We went live with a set of 3560G-48s a few weeks back. We have a basic policy map applied to some of the ports to DSCP-mark select traffic, but otherwise a pretty stock setup.

This week we noticed that large file transfers from our Gig connected EMC Celerra NAS to any of our NT systems set to 100 meg were horribly slow. Packet trace shows lots of TCP retransmits. Similar NT systems set to Gig have no problems talking to the EMC or anything else at any combination of port speeds. Non-NT systems on Gig or 100 have no problems with the EMC either.

Yes, "the EMC is at fault" was my first guess too, but here is the thing: removing the service policy on the client port resolves the problem. No more retransmit storms.

"sh platform port-asic stats drop CLIENTINTERFACE" shows increasing dropped frames for queue 1, weight 0 as the problem is occurring.

Has anybody had this happen before, where mls qos works fine for weeks and then suddenly has a problem with egress queues?
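One way to confirm that the drop counters are actually climbing during a transfer, rather than showing old drops, is to take two samples and compare them. The following is only a minimal Python sketch of that comparison. The counter values are made up for illustration and typed in by hand; nothing here parses real switch output, and the function name `drop_deltas` is hypothetical.

```python
# Hypothetical drop-counter samples keyed by (queue, weight).
# The numbers are invented for illustration; they are not real switch output.
def drop_deltas(before, after):
    """Return {(queue, weight): increase} for counters that grew between samples."""
    deltas = {}
    for key, after_count in after.items():
        increase = after_count - before.get(key, 0)
        if increase > 0:
            deltas[key] = increase
    return deltas


sample_before = {(0, 0): 0, (1, 0): 1200, (1, 1): 0}
sample_after = {(0, 0): 0, (1, 0): 4700, (1, 1): 0}

for (queue, weight), increase in sorted(drop_deltas(sample_before, sample_after).items()):
    print(f"queue {queue}, weight {weight}: +{increase} dropped frames")
```

A counter that only grows while the slow transfer is running points at that egress queue on the client port.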

Re: Sudden egress performance problem with QoS on C3560

Hi Paul,

With some IOS versions of the 3560s and 3750s, the default queue buffer allocation is incorrect and results in drops as you saw. This is documented in bug CSCeg29704 FYI.

A fix has been integrated into IOS versions 12.2(25)EX, 12.2(25)SEB, 12.2(25)SEC, and later.

If you don't want to upgrade, you can instead tune the default queue buffers using "mls qos queue-set output".

Hope this helps,

Michael.

Re: Sudden egress performance problem with QoS on C3560

Thanks for the info, Michael! We are running 12.2(25)SEE1, and sh mls qos queue-set output shows that the defaults match the recommendations in the bug you referenced.

You did get me on the right track, though.

We found another server with the same issue. (2003 server.) Eventually, as I went to try retuning the output queues again, I noticed something interesting: For drop threshold1, drop threshold2, and maximum threshold, the value range showed "<1-3200>" instead of "<1-400>". The docs and defaults on the switches all showed 400 as the max.

I scaled the output queues to match the new ranges, and voila! No more drops, and great performance everywhere!

Questions:

1) Is this new range documented somewhere, and I just missed it?

2) There should be no nasty side effects to this change, correct?

-Paul

Re: Sudden egress performance problem with QoS on C3560

Just a followup:

I opened a TAC case on this, and indeed the ranges were changed in SEE1. Updated documentation is apparently in the works. For anyone who runs into this issue, try the following queue tuning:

mls qos queue-set output 1 threshold 1 800 800 50 3200
mls qos queue-set output 1 threshold 2 1600 1600 50 3200
mls qos queue-set output 1 threshold 3 800 800 50 3200
mls qos queue-set output 1 threshold 4 800 800 50 3200
mls qos queue-set output 2 threshold 1 800 800 50 3200
mls qos queue-set output 2 threshold 2 1600 1600 50 3200
mls qos queue-set output 2 threshold 3 800 800 50 3200
mls qos queue-set output 2 threshold 4 800 800 50 3200

This appears to match the default behaviour of earlier revisions. (Caveat emptor: Check release notes and config docs for updates related to this.)
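The tuning values posted above are consistent with taking values on the old 1-400 scale and multiplying the two drop thresholds and the maximum threshold by 8 (3200 / 400), while leaving the third value (50) alone, since only the first two thresholds and the maximum showed the new range. Here is a rough Python sketch of that arithmetic. The old-scale values in `OLD_SCALE_VALUES` are an assumption obtained by dividing the posted numbers by 8, not something confirmed by the TAC case, and all names in the sketch are illustrative.

```python
OLD_MAX = 400    # upper bound of the range shown in the older docs
NEW_MAX = 3200   # upper bound of the range shown on SEE1
SCALE = NEW_MAX // OLD_MAX  # 8

# ASSUMPTION: old-scale values per queue, inferred by dividing the posted
# tuning values by 8: (threshold1, threshold2, reserved, maximum).
OLD_SCALE_VALUES = {
    1: (100, 100, 50, 400),
    2: (200, 200, 50, 400),
    3: (100, 100, 50, 400),
    4: (100, 100, 50, 400),
}


def scaled_commands(queue_sets=(1, 2)):
    """Build queue-set threshold commands on the new 1-3200 scale."""
    commands = []
    for qset in queue_sets:
        for queue, (t1, t2, reserved, maximum) in sorted(OLD_SCALE_VALUES.items()):
            # Scale the two drop thresholds and the maximum; the reserved
            # value did not show a changed range, so it is left as-is.
            commands.append(
                f"mls qos queue-set output {qset} threshold {queue} "
                f"{t1 * SCALE} {t2 * SCALE} {reserved} {maximum * SCALE}"
            )
    return commands


for line in scaled_commands():
    print(line)
```

If your switch had been tuned away from the defaults, substitute your own old-scale values before scaling, and check them against what "sh mls qos queue-set" reports afterwards.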

Re: Sudden egress performance problem with QoS on C3560

I went through similar grief with 3750 Metro switches.

For the interest or use of anyone else who has similar problems, these packet drops can be seen with the command:

show platform port-asic stat drop [asic n]

The port numbers shown in that output can be mapped to physical interfaces by checking the output from:

show platform pm platform-block

The number in the hw-i column matches the port number shown in the drop stat output.

Re: Sudden egress performance problem with QoS on C3560

Great point! To get the same stats with one command, you can try:

sh platform port-asic stats drop giX/Y

(Replace X and Y with the slot and port.)

I am not sure of the exact revision in which this feature appeared, so if it fails, Brad's tip will get you the info.

-Paul
