Re: General question about shaping

Peter Paluch · 02-17-2012

Dear friends,

This discussion relates to IOS-based software routers.

It has been my implicit assumption that a configured shaping mechanism always works in conjunction with the current queueing policy on an egress interface, i.e. it merely permits or forbids dequeueing a packet from whatever software queue it is in (FIFO, PQ queues, CQ queues, WFQ conversation queues, CBWFQ conversation queues, etc.), depending on whether the shaper's token bucket has been exhausted.
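The model described above, in which the shaper only gates dequeueing from the existing queue, can be sketched in a few lines of Python. This is an illustrative simplification and not IOS behaviour: the function name `shaped_dequeue`, the packet sizes and the Bc value are made up for the example, the bucket is refilled to exactly Bc at the start of each interval (no Be), and a packet is released only if enough tokens remain.

```python
from collections import deque

def shaped_dequeue(fifo, bc_bits, intervals):
    """Release packets from an existing FIFO, at most bc_bits per interval.

    fifo      -- deque of packet sizes in bits (the interface's own queue)
    bc_bits   -- tokens available at the start of every interval (Bc)
    intervals -- number of Tc intervals to simulate
    Returns one list per interval holding the packet sizes that were sent.
    """
    sent_per_interval = []
    for _ in range(intervals):
        tokens = bc_bits  # hypothetical refill rule: reset to Bc, no carry-over
        sent = []
        # The shaper only decides whether the head packet may leave;
        # it never moves packets into a queue of its own.
        while fifo and fifo[0] <= tokens:
            size = fifo.popleft()
            tokens -= size
            sent.append(size)
        sent_per_interval.append(sent)
    return sent_per_interval

queue = deque([4000, 4000, 4000, 2000, 6000])
result = shaped_dequeue(queue, bc_bits=8000, intervals=3)
print(result)
```

In this sketch the packets never leave the original FIFO until they are transmitted, which is exactly the assumption being questioned in this thread.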

However, a colleague of mine is adamant that if a shaper is configured on an interface or in a policy-map, it creates its own separate queue in which it stores the delayed packets. At first, I considered this idea ridiculous because of the obvious scheduling problems if yet another queue (or set of queues) popped up in the system. However, the existing documentation on the topic is so vague that I am no longer sure.

Can anyone shed more light on this? Any help is very much appreciated! Thank you!

Best regards,

Peter

Edison Ortiz · 02-17-2012

Hello Peter, hope you are doing well.

I agree with your assessment that a shaper uses the software queues - not an individual queue as your colleague stated.

However, it actually uses a WFQ at all times even if FIFO, PQ or CQ is configured.

FRTS uses CQ or PQ if configured.

http://www.cisco.com/en/US/docs/ios/12_2/qos/configuration/guide/qcfpolsh.html#wp1001072

"If a packet is deferred, GTS and Class-Based Shaping use a weighted fair queue to hold the delayed traffic. FRTS uses either a custom queue or a priority queue for the same, depending on what you have configured."

Peter Paluch · 02-17-2012

Hello Edison,

Hey, thanks for joining this thread! I am fine - and I hope you're doing well, too!

However, it actually uses a WFQ at all times even if FIFO, PQ or CQ is configured.

This is quite interesting. It implies that if, say, PQ is configured and a packet is delayed by a shaper, it needs to get dequeued from the particular priority queue and instead be placed into a WFQ conversation queue, in essence creating another set of queues parallel to (or next to?) the PQ queue set. Is that correct? If so, what is the next scheduling policy - how does a router decide which queue set (WFQ or PQ) it is going to serve?

I guess what I am looking for is a sequential description of the operations performed when a packet is delayed by a shaper, and when multiple packets are waiting both in the PQ (or any other queueing mechanism configured) and in the WFQ queues that hold the deferred packets.

Thank you very much! And oh, please, would you mind checking your private messages? I've sent you a short message.

Best regards,

Peter

Edison Ortiz · 02-18-2012

I want to clarify that when I said 'uses WFQ at all times', I meant that this behavior occurs while the shaper is active.

If the shaper isn't active, it will use whatever queuing is defined in the interface or the child policy.

With that said, the child policy defines how packets will be de-queued.

The shaper in the parent policy simply informs the child policy how much egress bandwidth is available as well as handling the flow buffering process with WFQ.

Airport analogy: The information below represents your child policy

At the gate 4 lanes are formed:

- First-Class lane (represents the PQ)

- Frequent-Flyer lane (represents CBWFQ at 50%)

- Children & Elderly lane (represents CBWFQ at 40%)

- Remaining passengers lane (represents class class-default)

The gate attendant is responsible for the de-queuing process.

The tunnel from the gate to the plane represents the shaper.

The gate attendant will allow all First-Class (PQ) passengers to board the plane first.

They will arrive at the tunnel and proportionally board the plane (WFQ).

Keep in mind, these flows have the same weight, so they will board the plane 'evenly'.

Once the First-Class lane is empty, the attendant will service the other 3 lanes per their % ratio.

If passengers are boarding the plane in increments of 10:

- 5 Frequent flyers will approach the tunnel

- 4 Children or Elderly will approach the tunnel

- 1 general passenger will approach the tunnel.

At the tunnel, these passengers will board the plane proportionally to their weights. (WFQ)
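The gate attendant's part of the analogy can be sketched in Python as follows. This is only a toy model of the analogy, not of the real scheduler: the function name `gate_order` and the passenger labels are invented, the First-Class lane is treated as already full when boarding starts, and the 5/4/1 split is served in whole batches, whereas a real CBWFQ scheduler interleaves packets.

```python
from collections import deque

def gate_order(first, frequent, family, general):
    """Return the order in which passengers approach the tunnel.

    first    -- First-Class lane, served until empty (strict priority)
    frequent -- Frequent-Flyer lane, 5 slots per round of 10
    family   -- Children & Elderly lane, 4 slots per round of 10
    general  -- remaining passengers, 1 slot per round of 10
    """
    order = list(first)  # the priority lane always goes first
    lanes = [(deque(frequent), 5), (deque(family), 4), (deque(general), 1)]
    # Keep running rounds until every non-priority lane is empty.
    while any(lane for lane, _ in lanes):
        for lane, share in lanes:
            for _ in range(share):
                if lane:
                    order.append(lane.popleft())
    return order

order = gate_order(
    first=["F1", "F2"],
    frequent=["Q1", "Q2", "Q3", "Q4", "Q5", "Q6"],
    family=["C1", "C2", "C3", "C4", "C5"],
    general=["G1", "G2"],
)
print(order)
```

The tunnel (the shaper and its own queueing) is deliberately left out; the sketch only shows which passengers the child policy hands over and in what proportion.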

Makes sense?

Joseph W. Doherty · 02-18-2012

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

A shaper will create its own queues when configured in policy-map classes and as interface/subinterface traffic shaping, but I can't say for certain that it does likewise alongside all the other queuing methods you describe. Shapers used to use WFQ, but those on IOS versions supporting HQF policy maps might now use only FQ.

On some IOS platforms, and with policy maps (7500 VIPs ?), I recall seeing stats that implied even when the shaper had a subordinate child policy, it appeared to still maintain its own queue.

Like you, I never found this clearly documented, and if shapers do maintain some queue while there is a subordinate child policy, I haven't found documentation on whether FQ or WFQ is still used and/or how deep the shaper's queues are.

Peter Paluch · 02-18-2012

Edison, Joseph,

I need some time to digest all of this. However, from what you indicated so far, it really seems that a configured shaper indeed in some way creates its own queues in addition to the queues managed by the current queueing strategy of the interface.

Let's not talk about child policies or nested policy-maps for now, as that hugely complicates things. Let's take the simplest MQC config (I guess talking about Legacy QoS CLI is largely irrelevant), which is, say:

policy-map Example
 class class-default
  no fair-queue
  shape average 100000

This config places everything in a FIFO queue and shapes it subsequently. Now:

When does the shaper's WFQ take place: before or after the FIFO? In other words, is the shaper's queue placed before or after the configured queueing policy?
Why was the WFQ chosen? Would it not be simpler and equally effective to use a simpler FIFO-style queueing for each original shaped queue? Is the WFQ not going to interfere with the configured queueing policy, resulting in possible priority inversions and other ill-behaved scenarios?
Why is there actually a separate queueing mechanism in effect with shaping? What would be incorrect with an implementation where no additional queues would be created, just the dequeueing of existing queues would be delayed accordingly?

Best regards,

Peter

Joseph W. Doherty · 02-19-2012

Posting

Ok, I see your policy with "no FQ" in class-default should set the class to use FIFO, but I believe the shaper within the class will maintain its own queue(s) for traffic it has shaped (which I believe use WFQ [or FQ]).

Then the question is, if this is true, is there any FIFO queue for the class (which "overflows" into the shaper) or do the shaper's queue(s) totally replace the class FIFO queue? I believe the latter.

Why was WFQ used? Only Cisco could answer that, although I agree a FIFO queue would have been simpler. My guess would be that since they are using the same code base for any shaper (Edison's reference: "Differences Between Shaping Mechanisms"), and since shaping has been around a while, they might have decided shaped traffic should be WFQ just like the default for serial T1/E1 interfaces.

On most platforms, something like this GTS:

interface x
 traffic-shape rate 100000

will use WFQ/FQ even on interfaces with FQ disabled.

Recently, I've been using this command on a FastEthernet interface without FQ enabled. (This in lieu of a CBWFQ policy - because of a possible bug - have a TAC case open on it - but that's a different story.) When there's congestion with multiple flows, you can easily see them with the show traffic-shape queue command.

Regarding your question about ". . . possible priority inversions and other ill-behaved scenarios?" If a queue originally used FIFO and the shaper does WFQ/FQ, it's possible individual packets from different flows will arrive in a different sequence from their global "FIFO" ingress order, but so what? If some flows have issues with this, then likely global FIFO egress should not have been used to begin with. (Again, to be clear, individual flow sequence will be preserved.)
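The point about sequence can be illustrated with a small Python sketch. It assumes plain round-robin across per-flow queues with equal-sized packets, which is a stand-in for fair queueing in general and not the actual WFQ algorithm (no weights or sequence numbers); the function name `fair_dequeue` and the flow labels are invented for the example.

```python
from collections import deque

def fair_dequeue(arrivals):
    """Serve per-flow queues round-robin.

    arrivals -- list of (flow_id, seq) tuples in global arrival order
    Returns the departure order as a list of (flow_id, seq) tuples.
    """
    flows = {}  # flow_id -> deque of packets, in first-seen flow order
    for flow_id, seq in arrivals:
        flows.setdefault(flow_id, deque()).append((flow_id, seq))

    departures = []
    while flows:
        # Iterate over a copy of the keys so empty flows can be removed.
        for flow_id in list(flows):
            departures.append(flows[flow_id].popleft())
            if not flows[flow_id]:
                del flows[flow_id]
    return departures

arrivals = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("A", 4), ("B", 2)]
departures = fair_dequeue(arrivals)
print(departures)
```

Flow B's packets overtake some of flow A's, so the global order differs from the arrival order, yet the sequence numbers within each flow still leave in ascending order.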

Personally, the biggest issue I've had with shapers using their own WFQ is that you normally can't set the number of queues, individual queue depths and/or aggregate queue depth. I also think I may have had issues with multiple queues at different levels of a hierarchy, because of unexpected delay/jitter issues.

As an example of the latter:

policy-map parent
 class class-default
  shape average 10000000
  service-policy child

policy-map child
 class voip
  priority percent 30

I wonder whether, when the parent's shaper determines that packets need to be queued or dequeued, there are intermediate queue(s) other than the child's. If so, would they impact the voip class's actual performance? I.e. are all shaped packets immediately pushed to the child policy, or do they first need to overflow from the parent shaper's queue(s)? Additionally, if there are parent shaper queues, are dequeued child voip packets then only scheduled "fairly" with the other traffic queued by the parent shaper?

I don't know the answer. Further, it might be slightly different between interfaces, platforms and/or IOS versions.

Edison Ortiz · 02-20-2012

Joseph/Peter,

Let me research internally and hopefully I get an answer for you guys.

Edison Ortiz · 02-23-2012

Guys,

Got some info from the developer, here are my notes:

Shaper does use WFQ on legacy IOS releases.

Starting with 12.4(20)T, HQF was introduced and the shaper now uses FIFO for queuing.

http://www.cisco.com/en/US/docs/ios/qos/configuration/guide/qos_frhqf_support.html#wp1089229

The process in general works by the shaper causing the backpressure which forces the queuing to start.

Once the queuing starts, the queue mechanism either in the class the shaper is configured or the child class the flow is classified is used at enqueue.

At dequeue, packets will be using WFQ. With HQF, the dequeue is flow based.

Peter Paluch · 02-23-2012

Hello Edison,

Awesome! Still, please let me elaborate a little.

The process in general works by the shaper causing the backpressure which forces the queuing to start.

A backpressure is usually generated from the hardware transmit queue of an interface (TxQ). This would mean that the shaper is monitoring the amount of packets as they are entering the TxQ, possibly after the queueing mechanism and before the TxQ, in the sequence

software queues ---> shaper ---> TxQ

Is my assumption correct here?

Once the queuing starts, the queue mechanism either in the class the shaper is configured or the child class the flow is classified is used at enqueue.

Alright, so in other words, the packets that are delayed by the shaper are placed into the same queueing system that would normally kick in during periods of congestion (as if true TxQ backpressure was signalled). Am I correct?

I seem to have a problem with this, though. Let's assume that a CBWFQ is configured on the egress interface. A packet that got delayed by the shaper is put into its appropriate queue for the class it is classified into within the CBWFQ queueing system applied at this interface, just as if the TxQ signalled a backpressure. After the Tc expires and new tokens are replenished, does this particular queue get any special treatment to send the delayed packets, or do they just wait until the CBWFQ decides to visit the queue once again on its own?

And another question. In the pre-HQF IOSes, it is stated that the shaper uses WFQ queueing. Does that mean that as soon as I activated the shape command in a policy-map for a particular class, the queueing within this class changed from the usual FIFO to WFQ? Recall that in CBWFQ, only the class-default can use WFQ queueing; every other class in a policy-map internally uses just a single conversation queue, hence behaving as a FIFO.

At dequeue, packets will be using WFQ. With HQF, the dequeue is flow based.

I do not understand this one. Let's take just the HQF. If the shaper uses FIFO for queueing, how can the dequeue be flow based? Or does this statement apply to the entire queueing mechanism (i.e. the entire set of classes and their conversation queues as defined by the policy-map) and not just to the shaping queue?

Thank you very, very much!

Best regards,

Peter

Edison Ortiz · 02-23-2012

This would mean that the shaper is monitoring the amount of packets as they are entering the TxQ, possibly after the queueing mechanism and before the TxQ, in the sequence

Correct

Alright, so in other words, the packets that are delayed by the shaper are placed into the same queueing system that would normally kick in during periods of congestion (as if true TxQ backpressure was signalled). Am I correct?

Yes

After the Tc expires and new tokens are replenished, does this particular queue get any special treatment to send the delayed packets, or do they just wait until the CBWFQ decides to visit the queue once again on its own?

Token replenishment is controlled by the queuing for each class, in this case CBWFQ.

And another question. In the pre-HQF IOSes, it is stated that the shaper uses WFQ queueing. Does that mean that as soon as I activated the shape command in a policy-map for a particular class, the queueing within this class changed from the usual FIFO to WFQ?

Changing the queuing for a class won't change the shaper queuing, nor will the shaper queuing change the queuing for a class.

You can have the queuing configured for FIFO under a class, and during back-pressure from the shaping, FIFO will be the queuing for the class during enqueue, but during dequeue the shaper will use its own default queuing method (WFQ).

Or does this statement apply to the entire queueing mechanism (i.e. the entire set of classes and their conversation queues as defined by the policy-map) and not just to the shaping queue?

Yes

Peter Paluch · 02-23-2012

Hi Edison,

I still have doubts. I am sorry if this is getting tedious - if it is, please let me know.

Token replenishment is controlled by the queuing for each class, in this case CBWFQ.

Can you elaborate more on this statement please? Token replenishment has, to my knowledge, always been fully determined solely by the relation between CIR and Bc (every Tc=Bc/CIR seconds, add Bc worth of tokens). How can the CBWFQ queueing participate in the token replenishment here?
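The replenishment rule stated above (every Tc=Bc/CIR seconds, add Bc worth of tokens) works out as in the following Python sketch. The CIR of 100000 bps matches the `shape average 100000` example earlier in the thread; the Bc of 8000 bits is a purely hypothetical value chosen for round numbers and is not claimed to be an IOS default, and the function names are invented.

```python
def tc_ms(cir_bps, bc_bits):
    """Shaping interval Tc in milliseconds: Tc = Bc / CIR."""
    return 1000 * bc_bits / cir_bps

def tokens_added(cir_bps, bc_bits, elapsed_ms):
    """Tokens (bits) credited after elapsed_ms, Bc per completed Tc.

    Note that the inputs are only CIR, Bc and time: no queueing
    state takes part in this calculation.
    """
    full_intervals = (elapsed_ms * cir_bps) // (1000 * bc_bits)
    return full_intervals * bc_bits

interval = tc_ms(cir_bps=100000, bc_bits=8000)
credited = tokens_added(cir_bps=100000, bc_bits=8000, elapsed_ms=1000)
print(interval, credited)
```

With these numbers Tc is 80 ms, and 12 full intervals complete within one second, crediting 96000 bits; the 13th refill arrives at 1040 ms.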

Changing the queuing for a class won't change the shaper queuing, nor will the shaper queuing change the queuing for a class.
You can have the queuing configured for FIFO under a class, and during back-pressure from the shaping, FIFO will be the queuing for the class during enqueue, but during dequeue the shaper will use its own default queuing method (WFQ).

I am having trouble visualizing this. Assume that the entire policy-map consists only of a single class configured for FIFO operation and with shaping in place. Hence, the entire software queueing mechanism applied to the interface is a FIFO. Now, you say that during dequeue the shaper is going to use WFQ. But how can this happen? WFQ is based on creating a set of conversation queues by using hashing on packets' src/dst IP + L4 proto + src/dst L4 port and scheduling between these queues using a specific kind of max-min fairness algorithm. However, in order to create these conversation queues, the existing FIFO would first need to be "mass-drained" and the packets "sprayed" between the WFQ conversation queues to actually have something for WFQ to act upon. You can't perform WFQ on a single queue per se.
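The "spraying" step described above can be pictured with the following Python sketch. It is only an illustration of hashing a 5-tuple into conversation queues: CRC32 is a stand-in because the hash IOS actually uses is not given in this thread, the queue count of 256 is an assumed value, and the names `conversation_queue` and `spray` are invented.

```python
import zlib
from collections import defaultdict

FLOW_FIELDS = ("src", "dst", "proto", "sport", "dport")

def conversation_queue(pkt, num_queues=256):
    """Map a packet's 5-tuple to a conversation queue index."""
    key = "|".join(str(pkt[field]) for field in FLOW_FIELDS)
    return zlib.crc32(key.encode()) % num_queues

def spray(fifo, num_queues=256):
    """Drain a FIFO (a list in arrival order) into conversation queues."""
    queues = defaultdict(list)
    for pkt in fifo:
        queues[conversation_queue(pkt, num_queues)].append(pkt)
    return queues

web1 = {"src": "10.0.0.1", "dst": "10.0.0.2", "proto": 6,
        "sport": 1025, "dport": 80, "id": 1}
rtp1 = {"src": "10.0.0.3", "dst": "10.0.0.4", "proto": 17,
        "sport": 5004, "dport": 5004, "id": 2}
web2 = dict(web1, id=3)  # second packet of the same TCP flow

queues = spray([web1, rtp1, web2])
for index, packets in sorted(queues.items()):
    print(index, [p["id"] for p in packets])
```

Packets of the same flow always land in the same queue and keep their relative order there; different flows usually land in different queues, although a hash collision can place two flows in one queue.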

Thank you!

Best regards,

Peter

Edison Ortiz · 02-24-2012

I meant, during enqueue, each class will have their guaranteed rate under congestion (shaper or not).

When having the shaper configured, there is an additional queuing performed (WFQ) before hitting the transmit queue.

If you have trouble visualizing it, I posted a link in the initial thread which contains a figure for the entire process:

http://www.cisco.com/en/US/docs/ios/12_2/qos/configuration/guide/qcfpolsh.html#wp1001194

Joseph W. Doherty · 02-24-2012

Posting

Peter, as an aside, if you remember an earlier thread on shape vs peak, Edison's reference has:

Distributed Traffic Shaping

.

When shape average is configured, the interface sends no more than the Bc size for each interval, achieving an average rate no higher than the CIR. When the shape peak command is configured, the interface sends Bc plus Be bits in each interval.

This conforms with your test results, although it's at odds with the other Cisco documentation I referenced at that time.
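Taking the quoted passage at face value, the difference between the two commands comes down to simple arithmetic, sketched below in Python. The values of CIR, Bc and Be are hypothetical and chosen only for round numbers, and the function names are invented; the sketch models the quoted description and nothing more.

```python
def bits_per_interval(bc_bits, be_bits, mode):
    """Bits sent per Tc according to the quoted description."""
    if mode == "average":
        return bc_bits
    if mode == "peak":
        return bc_bits + be_bits
    raise ValueError("mode must be 'average' or 'peak'")

def sustained_rate_bps(cir_bps, bc_bits, be_bits, mode):
    """Resulting rate in bps.

    Tc = Bc / CIR, so rate = bits_per_interval / Tc
                           = bits_per_interval * CIR / Bc.
    """
    return bits_per_interval(bc_bits, be_bits, mode) * cir_bps // bc_bits

avg = sustained_rate_bps(100000, 8000, 8000, "average")
peak = sustained_rate_bps(100000, 8000, 8000, "peak")
print(avg, peak)
```

With Be equal to Bc, the peak variant sends twice as many bits per interval, so under this reading it sustains twice the CIR.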

Peter Paluch · 02-24-2012

Hello Edison,

If you have trouble visualizing it, I posted a link in the initial thread which contains a figure for the entire process:

I saw that figure - and it was not helpful. We are talking about this exhibit:

I find this figure insufficiently descriptive because it is relevant only to GTS (i.e. the legacy QoS CLI) configured directly on an interface, without explaining how it fits together with the existing queueing configured on the interface (priority-list, custom-list, CBWFQ). If the WFQ at the right side is the shaper's own independent and separate queueing system, where are the queues of the queueing policy that is applied to the interface supposed to be placed in this exhibit?

I meant, during enqueue, each class will have their guaranteed rate under congestion (shaper or not).

Do you mean that if I configure both bandwidth 500 and shape average 400000 in a class, then if the interface gets congested (possibly by other classes), the shaping will be deactivated and my class gets guaranteed 500 kbps instead of at most 400 kbps shaped?

When having the shaper configured, there is an additional queuing performed (WFQ) before hitting the transmit queue.

Hmmm... I assume that with the additional queueing, by default WFQ for pre-HQF or FIFO for HQF images, this additional queueing is performed for each shaped class in a policy-map independently, with each shaped class having its own "normal unshaped queue or set of queues" and a separate "shaping queue or a set of queues" where the delayed packets are stored after being extracted from the "normal" queues, correct?

This still makes my head spin... I should probably leave this to settle down for a couple of days and then try to create a diagram to the best of my understanding and let you guys voice your opinions.

Thank you so much for all the help so far!

Best regards,

Peter