CBWFQ not working (I presume)

laloperez · ‎08-14-2007

Hi, I have an issue trying to get CBWFQ to work on some 2960s and 2950s we have.

We need to limit the bandwidth used by the NFS traffic we use for backups, so that it doesn't eat all the available bandwidth. For that, I created this configuration:

ip access-list extended ACL-INTERACTIVO

permit tcp any eq 22 any

permit tcp any eq telnet any

permit tcp any eq 3389 any

exit

ip access-list extended ACL-CRITICO

permit udp any eq domain any

permit udp any eq ntp any

permit udp any eq snmp any

permit tcp any eq syslog any

permit udp any eq syslog any

exit

ip access-list extended ACL-HOSTING

permit tcp any eq www any

permit tcp any eq 443 any

permit tcp any eq pop3 any

permit tcp any eq smtp any

permit tcp any eq ftp any

permit tcp any eq ftp-data any

permit tcp any eq 143 any

exit

ip access-list extended ACL-BACKUP

permit udp any eq sunrpc any

permit tcp any eq sunrpc any

permit udp any eq 2049 any

permit tcp any eq 2049 any

permit tcp any eq 139 any

permit tcp any eq 445 any

exit

class-map INTERACTIVO

match access-group name ACL-INTERACTIVO

exit

class-map CRITICO

match access-group name ACL-CRITICO

exit

class-map HOSTING

match access-group name ACL-HOSTING

exit

class-map BACKUP

match access-group name ACL-BACKUP

exit

policy-map QoS

class INTERACTIVO

set ip dscp ef

police 1000000 65536 exceed-action drop

class CRITICO

set ip dscp af11

police 5000000 65536 exceed-action drop

class HOSTING

set ip dscp af21

police 25000000 65536 exceed-action drop

class BACKUP

set ip dscp af41

police 35000000 65536 exceed-action drop

class class-default

set ip dscp default

police 9000000 65536 exceed-action drop

exit

interface FastEthernet0/X

service-policy input QoS

Well, when I issue sh policy-map interface, I can see the class maps and the policy assigned to the interface, but all the counters remain at zero. Nada. Even when we copy 1GB+ of data via NFS to a machine, the counters don't move at all. Is this normal? Am I missing something?

Thank you in advance.

a.cruea1980 · ‎08-15-2007

I think you might have forgotten something on your class-map statements; try adding match-all (if you want to match everything under the class-map) or match-any (if you want to match at least one thing under the class-map). Granted, I don't know the defaults, but that might be the reason: you're not telling the class-map how to match up the traffic.

jwdoherty · ‎08-15-2007

Haven't worked with 2900 series, but on 3750s, you don't see stats using sh policy-map interface either. You need something similar to "sh mls qos interface g1/0/1 statistics".

laloperez · ‎08-16-2007

Thank you. Indeed, when I issue sh mls qos I get some stats. I'm going to check them to see whether they show what I expect.

By the way, is my config correct for my purposes? I'm not sure if I need to do something with the queues or the cos-to-dscp.

jwdoherty · ‎08-17-2007

Is your config correct? At just a glance, it appears to be, unless you want to discuss QoS philosophy.

In general, I avoid policers. They do work, but so did amputation in Civil War surgery. Both have their place, but they are often not the only or best way to obtain the desired end.
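To make the trade-off concrete, here is a minimal Python sketch of how a single-rate policer such as "police 35000000 65536 exceed-action drop" behaves, assuming a plain token bucket (rate in bits per second, burst in bytes). The function name, the bucket starting full, and the 100 Mb/s test load are illustrative assumptions, not the switch's actual implementation.

```python
def police(packets, rate_bps, burst_bytes):
    """Single-rate policer sketch: return the packets that conform.

    packets: iterable of (arrival_time_seconds, size_bytes), in time order.
    Tokens are counted in bytes; the bucket is assumed to start full.
    """
    tokens = float(burst_bytes)
    last = 0.0
    passed = []
    for t, size in packets:
        # Refill at rate_bps / 8 bytes per second, capped at the burst size.
        tokens = min(burst_bytes, tokens + (t - last) * rate_bps / 8)
        last = t
        if size <= tokens:
            tokens -= size              # conforming: forward the packet
            passed.append((t, size))
        # otherwise: exceed-action drop (packet discarded, tokens untouched)
    return passed


# Hypothetical load: 1500-byte packets arriving at about 100 Mb/s for 1 second.
interval = 1500 * 8 / 100e6             # seconds between packets at line rate
offered = [(i * interval, 1500) for i in range(int(1 / interval))]
kept = police(offered, rate_bps=35_000_000, burst_bytes=65536)
print(f"forwarded {len(kept) / len(offered):.0%} of packets")
```

In this model roughly a third of the offered packets get through and the rest are dropped, even if nothing else is using the link at the time.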

What you're concerned about is backups interfering with normal traffic. Rightly so, too. However, instead of capping such traffic at some fixed rate, I prefer to configure it to defer to all other traffic, apart from a very minimal bandwidth floor to avoid starvation, while still letting it use all otherwise available bandwidth.

I believe this can be accomplished on the switches you have, but I don't have a configuration example since I don't work with them. If full CBWFQ were supported, it would be something like this:

policy-map CBWFQ

class foreground

bandwidth percent 99

class background

bandwidth percent 1

You would place the backup traffic in the background class.
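As a rough illustration of the difference, the Python sketch below compares backup throughput under a hard 35 Mb/s cap with backup throughput when it defers to foreground traffic but keeps a 1% floor. It is a deliberately simplified two-class model: the 100 Mb/s link speed, the function names and the demand figures are assumptions for the example, and real CBWFQ scheduling is more involved.

```python
LINK_MBPS = 100.0  # assumed FastEthernet link speed


def backup_with_policer(backup_demand, cap=35.0):
    """Backup throughput under a hard cap, regardless of other traffic."""
    return min(backup_demand, cap)


def backup_with_deferral(backup_demand, foreground_demand, floor_pct=1.0):
    """Backup throughput when it yields to foreground traffic.

    Foreground is served first, but backup always keeps a small floor
    (floor_pct of the link) so it is never starved completely.
    """
    floor = LINK_MBPS * floor_pct / 100
    leftover = LINK_MBPS - min(foreground_demand, LINK_MBPS - floor)
    return min(backup_demand, leftover)


# Backup wants the whole link; vary how busy the foreground traffic is.
for fg in (0, 20, 80, 100):
    print(fg, backup_with_policer(100), backup_with_deferral(100, fg))
```

With an idle link the capped backup is stuck at 35 Mb/s while the deferring backup can use all 100; with a saturated foreground the deferring backup shrinks to its 1 Mb/s floor.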

Ideally you mark COS and/or DSCP on the edge and then just sort the traffic via those markings as it passes through your other devices.

laloperez · ‎08-17-2007

Well, the problem is that on the 2950/2960 it isn't possible to use a percentage, only the nominal bandwidth of an interface. What do you think would be an alternative (and better) solution? I'm new to QoS and not very sure about the possibilities.

Thank you again.

jwdoherty · ‎08-17-2007

From reading this: http://www.cisco.com/en/US/docs/switches/lan/catalyst2950/software/release/12.1_22_ea2/configuration/guide/swqos.html#wp1025461 it appears you can set outbound percentages as long as you work within the 4 queues provided by the platform. In particular, look at the "wrr-queue bandwidth" command.

laloperez · ‎08-19-2007

But, if I'm not wrong, that is for egress traffic. I need to limit the input bandwidth used on each interface, because most of the traffic goes from the connected server to the switch port, not the other way. I don't think I gain anything by tweaking the egress queues.

jwdoherty · ‎08-20-2007

Yes, I was thinking mostly of egress, where congestion more often occurs in a many-to-one situation, but if you expect congestion on ingress, you can prioritize traffic there too. Unfortunately, only two input queues are supported, but you can also prioritize which traffic gets dropped first (up to four levels).

You're correct that the policer only works on ingress, but if you de-prioritize the less important traffic, or have it dropped first, you either minimize the impact of such excess traffic on your more important traffic or prevent it entirely, yet still use all available bandwidth. When a queue fills, ingress or egress, you have drops, but they will be due to real congestion, not artificial policing.

laloperez · ‎08-20-2007

Well, maybe I have to think the other way round for this case. I thought the congestion would be on ingress, because backup traffic and ordinary traffic would contend when entering each switch interface. Take into account that the ratio could be 100:1. If backup traffic eats the bandwidth available to each host, it didn't make any sense to me to manipulate the egress queues, where most of the traffic would already be backup traffic. Am I wrong?

With respect to your comment on de-prioritizing the less important traffic, I don't quite understand what I'd have to do. How do I do what you suggest?

Thank you very much for your comments :)

jwdoherty · ‎08-20-2007

Ah, if I understand your comment correctly, you're also concerned about the link from the server to the switch, i.e. server egress. You're trying to avoid something like backup traffic filling that link. If that's correct, policing might help somewhat if the sender reduces its send rate due to drops. A better solution would be to prioritize the traffic on the server or to use a separate port, on both server and edge switch, and dedicate it for high volume traffic like backups. (Sometimes just getting a backup task to run at low priority on the server can help.)

As to your question about how to de-prioritize, it's just a matter of marking traffic at different importance levels and then treating such traffic differently. For instance, in your original post you were marking traffic with different DSCP settings; you would then need to define the special handling the switch will perform against those markings for ingress and/or egress. (BTW: I would only recommend using DSCP EF for real-time traffic, e.g. VoIP. Your other markings are opposite to the normal convention, i.e. AF4x is usually treated better than AF2x. One recent exception, though, is that AF11 has been proposed to be treated worse than BE.)
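For reference, the markings used in the original policy map can be turned into numbers with a short Python sketch. The AFxy formula (8 * class + 2 * drop precedence) and EF = 46 come from the DiffServ RFCs; the helper name and the table layout are only illustrative.

```python
def af(af_class, drop_precedence):
    """Decimal DSCP for AFxy (RFC 2597): 8 * class + 2 * drop precedence."""
    return 8 * af_class + 2 * drop_precedence


# The markings set in the policy map from the original post.
DSCP = {
    "default": 0,
    "af11": af(1, 1),
    "af21": af(2, 1),
    "af41": af(4, 1),
    "ef": 46,           # expedited forwarding (RFC 3246)
}

for name, value in sorted(DSCP.items(), key=lambda kv: kv[1]):
    # The top three bits of the DSCP are the legacy IP precedence.
    print(f"{name:8} dscp={value:2} precedence={value >> 3}")
```

Seen this way, AF41 (used for BACKUP in that policy) carries a higher precedence than AF21 (HOSTING) and AF11 (CRITICO), which is the reverse of the intended importance.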

laloperez · ‎08-24-2007

Sorry for the delay, I was eating some Cisco QoS books :)

Unfortunately, controlling QoS on the servers is not feasible, and using separate networks for backups is an idea in progress, waiting for budget :)

After some reading, I've decided that the better option would be to define just three classes: one for interactive and critical network traffic (DNS and such), another for the standard traffic (HTTP, mail, etc.) and a third for the "bulk" default traffic, including backups and any traffic not covered by the other classes. After that, I'll define the bandwidth associated in the policy map, plus the WRR queues and cos-to-dscp maps for the four queues. I'm not sure about the default class, because the 2950s are somewhat limited in that respect.

I'll post the solution as soon as it is ready.

Joseph W. Doherty · ‎08-24-2007

Three classes for egress (reserving the 4th for real-time traffic, for example), where one class is for important traffic, one for normal traffic and one for bulk traffic, is what I would do too. However, I would place unknown traffic in the normal class. Effectively, you only need to identify traffic that gets better handling (e.g. interactive, DNS queries, etc.) and traffic that gets worse handling (e.g. backup); everything else is best effort or routine.

As to percentages, not positive about the 2950 switch, but often it's the ratios that are important. So a percentage of 50% for best, 10% for normal, and 1% for scavenger doesn't necessarily mean best always reserves 50% of the bandwidth but means best gets 5x the bandwidth when competing with normal, or 50x the bandwidth when competing with scavenger for all the bandwidth. Yet, scavenger can have all the bandwidth not being used by the better classes.
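The ratio behaviour can be sketched in Python as a work-conserving weighted split. This is an idealized model, not the switch's actual WRR scheduler: the function name, the 100 Mb/s link and the demand figures are assumptions for the example.

```python
def weighted_share(link, weights, demands):
    """Split `link` among classes by weight, handing unused share to others.

    weights: class name -> relative weight.
    demands: class name -> offered load (same unit as `link`).
    """
    alloc = {name: 0.0 for name in weights}
    active = {n for n in weights if demands.get(n, 0) > 0}
    remaining = float(link)
    while active:
        total_w = sum(weights[n] for n in active)
        # Classes that need no more than their weighted share are fully served.
        done = {n for n in active
                if demands[n] <= remaining * weights[n] / total_w}
        if not done:
            # Everyone wants more than their share: split by weight and stop.
            for n in active:
                alloc[n] = remaining * weights[n] / total_w
            break
        for n in done:
            alloc[n] = float(demands[n])
            remaining -= demands[n]
        active -= done
    return alloc


WEIGHTS = {"best": 50, "normal": 10, "scavenger": 1}

print(weighted_share(100, WEIGHTS, {"best": 100, "normal": 100}))
print(weighted_share(100, WEIGHTS, {"best": 100, "scavenger": 100}))
print(weighted_share(100, WEIGHTS, {"best": 10, "scavenger": 100}))
print(weighted_share(100, WEIGHTS, {"scavenger": 100}))
```

When best and normal both saturate the link they split it 5:1, best against scavenger splits 50:1, and scavenger on its own (or alongside a lightly loaded best class) takes whatever is left over.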

For the ingress queues, you might try placing the scavenger (e.g. backup) in one queue with a low percentage; everything else in the other queue with a high percentage.