
WRED Tail Drops

Craig Budrodeen
Level 1

We have a POS 155 Mbps circuit.

There are tail drops occurring on the default class, which is allocated 25% (38.75 Mbps). There is some voice traffic, but that is priority queued and is not a problem. The other control classes are also OK.

But when I look at the overall utilization of the physical interface (155 Mbps) it is not congested. It peaks at 40% during business hours - most of which is default class.

It was my understanding that WRED should not "kick in" unless there is congestion - as defined by the WRED algorithm.

Does "congestion" here apply to the physical interface, or does it apply to the 25% allocated to default traffic?

If the tail drops are occurring because of "congestion" on the default class, what parameters should I tweak - queue lengths or the bandwidth allocated to default?

policy-map eth-service-policy
 class voice
  priority percent 35
 class voice-control
  bandwidth percent 2
 class network-control
  bandwidth percent 5
 class class-default
  bandwidth percent 25
  random-detect

policy-map frame-relay-shape-policy
 class class-default
  shape average 154000000
  service-policy eth-service-policy

interface POS1/0
 description "..."
 no ip address
 encapsulation frame-relay
 no keepalive
 no arp frame-relay
 no frame-relay inverse-arp
 service-policy output frame-relay-shape-policy
 hold-queue 1000 out

show policy-map interface POS1/0

 POS1/0

  Service-policy output: frame-relay-shape-policy

    Class-map: class-default (match-any)
      1802709602 packets, 720560400569 bytes
      5 minute offered rate 26898000 bps, drop rate 0 bps
      Match: any
      Traffic Shaping
           Target/Average   Byte   Sustain   Excess    Interval  Increment
             Rate           Limit  bits/int  bits/int  (ms)      (bytes) 
        154000000/154000000 962500 3850000   3850000   25        481250  

        Adapt  Queue     Packets   Bytes     Packets   Bytes     Shaping
        Active Depth                         Delayed   Delayed   Active
        -      0         1802673602 3259356776 79971     71426633  no

      Service-policy : eth-service-policy

<omitted>

Class-map: class-default (match-any)
          1405375819 packets, 644742811187 bytes
          5 minute offered rate 24889000 bps, drop rate 0 bps
          Match: any
          Queueing
            Output Queue: Conversation 267
            Bandwidth 25 (%)
            Bandwidth 38500 (kbps)
            (pkts matched/bytes matched) 94915/108733008
        (depth/total drops/no-buffer drops) 0/36015/0
             exponential weight: 9
             mean queue depth: 0

  class    Transmitted      Random drop      Tail drop    Minimum Maximum  Mark
           pkts/bytes       pkts/bytes       pkts/bytes    thresh  thresh  prob
      0 1405364393/644719714663   4024/4793657    31991/36714190    20      40  1/10
      1   25378/1116693         0/0              0/0           22      40  1/10
      2     175/7736            0/0              0/0           24      40  1/10
      3       0/0               0/0              0/0           26      40  1/10
      4       0/0               0/0              0/0           28      40  1/10
      5       0/0               0/0              0/0           30      40  1/10
      6       0/0               0/0              0/0           32      40  1/10
      7       0/0               0/0              0/0           34      40  1/10
   rsvp       0/0               0/0              0/0           36      40  1/10

 

The drop packet counters increment when I check them overnight, and we have graphs that show random and tail drops.

WRED - Random and Tail Drops - log scale

Thank you (in anticipation).

 


10 Replies

Joseph W. Doherty
Hall of Fame

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

But when I look at the overall utilization of the physical interface (155 Mbps) it is not congested. It peaks at 40% during business hours - most of which is default class.

You shouldn't be getting drops if there's no congestion.  Percent utilization often doesn't really indicate whether you have congestion or not.  (NB: there's congestion whenever a packet is queued.  If ingress bandwidth to this device exceeds egress bandwidth to your PoS, you'll likely have packets being queued.)

 

It was my understanding that WRED should not "kick in" unless there is congestion - as defined by the WRED algorithm.

Correct.

 

Does "congestion" here apply to the physical interface, or does it apply to the 25% allocated to default traffic?

Neither in this case.  It applies when traffic is being queued in the default class, and that is determined by your shaper.  (Incidentally, are you sure you need a shaper?  I.e. is your POS physically also 155 Mbps?)

 

If the tail drops are occurring because of "congestion" on the default class, what parameters should I tweak - queue lengths or the bandwidth allocated to default?

Depends on what you want to accomplish, i.e. your service level goals.  When managing congestion, when you push here, something else often pops out there.  I.e. it's all about trade-offs.

That noted, assuming you'll have about 39 Mbps for this class, the WRED defaults are probably too shallow (as Cisco's common defaults seem to have been selected for "typical" WAN T1s).

IMO, WRED is difficult to tune unless you're an expert in QoS.  I would recommend you remove it altogether.  You might consider using FQ, though.

Hello, Craig.

The issue here is the min and max thresholds you have on class-default WRED. As you run a pretty fast link with a lot of users, traffic must be bursty, and min/max thresholds of 20/40 packets are not enough!

For a 100M link I would suggest thresholds of at least 200 and 300 packets (or even more) respectively, and a queue size of about 400-500.

Best regards.

PS: queue length constraints come from queuing delay and buffers available; I guess in your case only queuing delay would come into play... so 300 packets * 500 bytes * 10 / 150M = 1/100 sec = 10 ms (in worst case).
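For what it's worth, that back-of-the-envelope figure checks out under the stated assumptions (a 500-byte average packet size and ~10 bits per byte as a rough factor, both approximations rather than measured values):

```python
# Reproducing the estimate: 300 packets * 500 bytes * ~10 bits/byte / 150 Mbps.
packets = 300
avg_packet_bytes = 500      # assumed average packet size
bits_per_byte = 10          # rounded up from 8 for a quick estimate
link_bps = 150_000_000

delay_s = packets * avg_packet_bytes * bits_per_byte / link_bps
print(f"worst-case queuing delay: {delay_s * 1000:.0f} ms")  # prints 10 ms
```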

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

As I noted in my original post, "the WRED defaults are probably too shallow", but I didn't suggest new values because, as I also noted, "WRED is difficult to tune unless you're an expert in QoS".

Vasilii is suggesting an initial threshold of 200 to 300 packets (or more) and a total queue size of 400 to 500 packets.  He notes "so 300 packets*500 bytes*10/150M = 1/100 sec = 10 ms (in worst case)", but is this really true?

He assumes your average packet size is about 500 bytes, but for "worst case" calculations it might be better to assume we'll have to deal with maximum-size packets, so let's assume 1500.  If we do, though, we just tripled the delay.

Next, the 300 was actually suggested as the initial drop point, not max queue size, which was suggested as 400 to 500.  So, 500 packets * 1500 bytes * 10 / 150 Mbps = 50 ms.  That's not too bad, but WRED uses averages, so a much larger fast burst could go beyond 500 packets while WRED sees the average queue depth as much less.

How much larger?  I don't know, as that would depend on ingress to this device and WRED's settings for how fast it adjusts its moving average.  But it could go much higher, resulting in additional delay.

Vasilii also assumes you have 150 Mbps, but you've only defined 25% for class-default.  So, again assuming worst case, i.e. you've only guaranteed 1/4 the bandwidth, your latency would increase by a factor of 4.  In other words, not even accounting for WRED's moving average allowing total queue depth beyond the defined max, worst case latency for this class might be 200 ms.

So, I wouldn't count on your worst case delay only being 10 ms.
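The revised estimates above can be reproduced with the same formula (a sketch, assuming 1500-byte packets, a full 500-packet queue, and, for the worst case, only a quarter of the link available to the class):

```python
# Same back-of-the-envelope formula, worst-case inputs.
bits_per_byte = 10  # rough approximation, as in the earlier 10 ms estimate

def queuing_delay_ms(packets, packet_bytes, available_bps):
    return packets * packet_bytes * bits_per_byte / available_bps * 1000

full_link = queuing_delay_ms(500, 1500, 150_000_000)           # ~50 ms
quarter_share = queuing_delay_ms(500, 1500, 150_000_000 // 4)  # ~200 ms
print(full_link, quarter_share)
```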

However, even if Vasilii is correct about worst case delay being only 10 ms, increasing the min/max thresholds as he suggests might result in an increase in drops!

This is counter to what we normally expect, but with newer TCP stacks with very large RWINs running across LFNs, slow start can create some very large bursts, which can result in massive tail drops.

One of the reasons for WRED is to drop packets before they burst into a huge tail drop situation; remember, WRED tail drop is just as bad as ordinary FIFO tail drop.  Ideally we want WRED's early drop to avoid hitting any tail drop.  Getting this right has additional issues; one of the most serious is that Cisco's WRED early-drops packets as they are added to the egress queue, which can mean a delay before the sender realizes there's a drop.  There's also the issue that not all flow types slow down when there are individual drops.  Also, Cisco's WRED might drop packets from flows not even causing the congestion, while the congestion-causing flow's packets are not dropped (NB: Cisco FRED addresses this, but few Cisco platforms support it).

I could go on (and on), but I'm hoping the above will show that using WRED to obtain its full potential can be a little more involved than just increasing its min and max settings.  Just doing that may, or may not, decrease the drops you're seeing, but QoS is really about meeting certain service objectives.  Often it's thought "no drops" is the ideal, but managed drops might better meet certain service objectives.

In my original post, I suggested using FQ (if supported), because it's not nearly as complicated to leverage properly.  It helps ensure light-bandwidth flows aren't as adversely impacted by heavy-bandwidth flows, and it also tends to drop packets first from the flows causing the congestion.

Hello.

You might have missed that I suggested running min/max thresholds of about 200/300 and a queue size of 400-500, so the class-default queue would never grow larger than 400-500 packets.

It's hard to believe that all the packets will be 1500 bytes in size, but in theory it definitely might be the case.

From the diagram we see that traffic is being dropped even when link utilization is below 100M; that is why I wrote about WRED thresholds. The author also stated that:

But when I look at the overall utilization of the physical interface (155 Mbps) it is not congested. It peaks at 40% during business hours - most of which is default class.

So, what is the point of calculating only 25% of 150M for class-default and talking about a queuing delay of 200 ms?

 

PS: if you run fair-queue, you will face a limitation on the number of flows the router can track... I guess you might track 2000 flows, but it's unrealistic to have such a low flow count on a 150M link (unless it's backup traffic only).

PS2: the 200/300 thresholds are the values we run on POS 150M links; they are not just hypothetical.


You might have missed that I suggested running min/max thresholds of about 200/300 and a queue size of 400-500, so the class-default queue would never grow larger than 400-500 packets.

No, I didn't miss it, but my understanding is otherwise.  Max queue size, with WRED, is based on the current average, not the actual queue depth.  I.e. tail drops will happen when the average queue depth exceeds max; the actual max can be more or even less!  (I.e. you can have tail drops, in theory, even when the actual current queue depth is zero.)
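To illustrate the lag between the average and the instantaneous depth, here's a sketch (not platform code) of the classic RED moving average, avg += (current - avg) / 2^n, where n is the configured exponential weight (the "exponential weight: 9" in the show output earlier in this thread):

```python
EXP_WEIGHT = 9  # "exponential weight: 9" from the show output

def update_avg(avg, current_depth, n=EXP_WEIGHT):
    # Classic RED moving average: avg += (current - avg) / 2^n
    return avg + (current_depth - avg) / (2 ** n)

avg = 0.0
for _ in range(50):                # a sudden, sustained 100-packet queue
    avg = update_avg(avg, 100)
print(f"average after 50 samples at depth 100: {avg:.1f}")
# The average is still under 10 packets while the real queue sits at 100,
# so bursts slip past the thresholds; symmetrically, the average stays
# high while the queue drains, which is how tail drops can occur even
# when the instantaneous depth is low.
```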

 

It's hard to believe that all the packets will be 1500 bytes in size, but in theory it definitely might be the case.

Agreed, but if you're going to plan for worst case, plan for worst case.  Server-to-server bulk transfers, like various kinds of replication, tend toward this.

 

From the diagram we see that traffic is being dropped even when link utilization is below 100M; that is why I wrote about WRED thresholds. The author also stated that:

"But when I look at the overall utilization of the physical interface (155 Mbps) it is not congested. It peaks at 40% during business hours - most of which is default class."

So, what is the point of calculating only 25% of 150M for class-default and talking about a queuing delay of 200 ms?

Again, planning for worst case, but I "simplified" the calculation.  Craig also wrote "allocated 25% (38.75 Mbps)" (1/4 of 155), but that's also incorrect, as Craig didn't actually allocate 100%, and as 35% of his allocation is for LLQ, class-default is really guaranteed about 50.8% (25/(25+5+2)*65) of the shaped 154.  I.e. not quite as dire as 200 ms, but if he did keep the 25% for class-default and allocate the remaining portion of the 100%, it could be.
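Evaluating the expression 25/(25+5+2)*65 literally gives (a sketch; whether a given IOS release redistributes unallocated bandwidth proportionally like this is a separate question):

```python
# Percentages from the posted eth-service-policy.
llq_pct = 35                       # class voice, priority percent
bw_pcts = {"voice-control": 2, "network-control": 5, "class-default": 25}
shaped_mbps = 154

remaining_pct = 100 - llq_pct      # what the priority class leaves behind
share = bw_pcts["class-default"] / sum(bw_pcts.values()) * remaining_pct
print(f"class-default share: {share:.1f}% -> "
      f"{share / 100 * shaped_mbps:.1f} Mbps of the shaped rate")
```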

Craig also wondered how you can have drops when peak is only 40%, i.e. believing "it is not congested".  Many, I believe, don't fully appreciate that this is 40% of bandwidth capacity over the measured time interval, often 5 minutes.  It may, or may not, provide any clues about what might be happening down at the ms level.  You can have cases of high drops at 1% utilization and zero drops at 100% utilization.  Cisco did (still?) have a technology to give visibility into bandwidth needs; see http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/qos_bndwth/configuration/15-mt/qos-bndwth-15-mt-book/qos-bndwth.pdf.
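A toy calculation (hypothetical numbers) of why a 5-minute average says little about ms-scale congestion: at the reported 40% utilization, the same byte volume could have arrived as full line-rate bursts for 40% of the interval and silence otherwise.

```python
# A 155 Mbps link reporting 40% average utilization over a 5-minute window.
interval_s = 300
avg_util = 0.40
line_rate_bps = 155_000_000

bits_sent = avg_util * line_rate_bps * interval_s
burst_seconds = bits_sent / line_rate_bps
print(f"equivalent time at full line rate: {burst_seconds:.0f} of {interval_s} s")
# The counter cannot distinguish a smooth 40% load from line-rate bursts
# totalling 120 s -- bursts that queue (and drop) at the millisecond scale.
```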

 

PS: if you run fair-queue, you will face a limitation on the number of flows the router can track... I guess you might track 2000 flows, but it's unrealistic to have such a low flow count on a 150M link (unless it's backup traffic only).

You sure do, but a) many flows are short-lived, and b) routers actually hash them into (relatively few) flow queues.

 

PS2: the 200/300 thresholds are the values we run on POS 150M links; they are not just hypothetical.

I don't doubt it.  But a) are you really, really sure they're optimal for you, and b) how do you know they will be optimal for Craig?  Basically, I don't think we know enough to suggest appropriate settings for Craig; ditto for me to suggest any revision to your current settings.  For both you and Craig, we don't know your QoS service goals and don't know anything else about your topology; heck, we don't even know Craig's BDP.

 

For a long time on these forums, I've recommended against using WRED, as I believe it's much more complex to use to full advantage than it at first appears.  (Oh, and like your 200/300 thresholds, my experience isn't just hypothetical.)

Recently I was reading this Wiki article, http://en.wikipedia.org/wiki/CoDel, and noticed this "One of the weaknesses in the RED algorithm (according to Jacobson) is that it is too difficult to configure (and too difficult to configure correctly, especially in an environment with dynamic link rates).", so I'm no longer alone in my opinion.

 

Anyway, that's IMNSHO (in my not so humble opinion).

Thank you both Vasilii and Joseph.

 

These are really thoughtful answers and just what I was after.

 

Cheers,

 

Craig

Hi Craig,

Is the problem solved? If yes, what exactly did you do?

Hello, Joseph.

You wrote:

Max queue size, with WRED, is based on the current average, not the actual queue depth.  I.e. tail drops will happen when the average queue depth exceeds max; the actual max can be more or even less!  (I.e. you can have tail drops, in theory, even when the actual current queue depth is zero.)

You also wrote that you don't consider yourself a QoS expert, so please provide a proof link that "max queue size is based on the current average" and that it's not configurable.

I don't doubt it.  But a) you're really, really sure they're optimal for you, and b) how do you know they will be optimal for Craig?

Yes, I'm sure they are optimal for us, as they were designed by QoS experts. Not sure for Craig, but I suggested values to try, as, per my understanding, they are more suitable for a 150M link than 20 packets.

In the original post we see:

Class-map: class-default (match-any)
      1802709602 packets, 720560400569 bytes

        Adapt  Queue     Packets    Bytes      Packets   Bytes     Shaping
        Active Depth                           Delayed   Delayed   Active
        -      0         1802673602 3259356776 79971     71426633  no

The packets delayed are a really small value compared to the overall packet count (this means that shaping is almost never involved - really no congestion on the link). And the drops under class-default are primarily "tail drops". So, the drops are definitely caused by bursts!
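The shaper counters quoted above back this up; a quick check (values copied from the show output):

```python
# Shaper counters from the show output quoted above.
total_packets = 1_802_673_602
delayed_packets = 79_971

delay_ratio = delayed_packets / total_packets
print(f"fraction of packets the shaper ever delayed: {delay_ratio:.2e}")
# Roughly 4 in 100,000: the shaper almost never queues, so sustained
# congestion is absent and the drops must come from short bursts.
```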

Also, there is dangerous behavior (on some IOS) with FQ assigned to class-default: it might starve, as all the other classes (defined with "priority" and "bandwidth") may borrow all the bandwidth, leaving nothing for class-default!

 

PS: guaranteed bandwidth for class-default in this scenario is 25% / (25+2+5)% * 65% ≈ 51% => ~75M.

PS2: one more flaw in the configuration - class "voice" may take more than 35% if the link is not congested (as it has no policer configured), and, I would guess, that's not the expected behavior.


You also wrote that you don't consider yourself a QoS expert, so please provide a proof link that "max queue size is based on the current average" and that it's not configurable.

Where did I write that I don't consider myself a QoS expert?  I may have, though, as I don't consider myself an expert at anything.

I don't believe I wrote max isn't configurable.

It might take some time to find a "proof" link that you would accept, but why don't you view a WRED interface with congestion in action?  Note the difference between the actual queue depth and the WRED-computed queue depth.  It uses the latter to decide when to apply the min/max thresholds.  This is because one of the design features of WRED is to allow transient bursts.  I.e. the actual queue size can exceed the max threshold for a short while.

 

Yes, I'm sure they are optimal for us, as they were designed by QoS experts. Not sure for Craig, but I suggested values to try, as, per my understanding, they are more suitable for a 150M link than 20 packets.

I never questioned whether your settings are optimal for you; I just don't know.  [edit: on reflection, I guess I did when I asked whether you were sure they were optimal.]  But since QoS experts set yours, then I guess they must be optimal.  However, until the same QoS experts review Craig's needs and environment, I wouldn't assume your settings will be optimal for Craig.  (IMNSHO, if your QoS experts believe one size always fits all, I would question how expert they are.  Optimal queue settings can very much be impacted by BDP, i.e. same bandwidth but different latency changes the value.)

 

The packets delayed are a really small value compared to the overall packet count (this means that shaping is almost never involved - really no congestion on the link). And the drops under class-default are primarily "tail drops". So, the drops are definitely caused by bursts!

Most likely.  However, that alone doesn't warrant increasing the min/max settings, as how you drop and why you drop can be an important component of meeting your QoS policy.  One could even argue the overall drops are so low you need not change anything.

 

Also, there is dangerous behavior (on some IOS) with FQ assigned to class-default: it might starve, as all the other classes (defined with "priority" and "bandwidth") may borrow all the bandwidth, leaving nothing for class-default!

Yes, you're partially correct.  The behavior applies to pre-HQF.  It does not apply to LLQ.  It also doesn't take all the bandwidth.  (What actually happens: with FQ in pre-HQF, active flow queues are not limited by just the class bandwidth reservation.)

 

PS2: one more flaw in the configuration - class "voice" may take more than 35% if the link is not congested (as it has no policer configured), and, I would guess, that's not the expected behavior.

That's also partially correct.  The implicit LLQ policer doesn't engage until packets would queue.  However, if there is congestion, i.e. queuing, the LLQ policer engages.


You also wrote that you don't consider yourself a QoS expert, so please provide a proof link that "max queue size is based on the current average" and that it's not configurable.

Found http://www.cisco.com/c/en/us/support/docs/routers/12000-series-routers/18841-mdrr-wred-18841.html?referring_site=smartnavRD#topic2

Although that's for a 12000, as far as I know basic operation is similar across platforms.

Much of this, I believe, confirms what I've already written.

For example:

 

When the weighted queue average is below the minimum threshold, no packets are dropped. When the weighted queue average is above the maximum queue threshold, all packets are dropped until the average drops below the maximum threshold. When the average is between the minimum and the maximum thresholds, the probability that the packet will be dropped can be calculated by a straight line from the minimum threshold (probability of drop will be 0) to the maximum threshold (probability of drop is equal to the 1/mark probability denominator).
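That drop rule can be sketched directly as a function of the weighted average (a toy model, using the class-default values from the earlier show output: min 20, max 40, mark probability denominator 10):

```python
def red_drop_probability(avg_depth, min_th=20, max_th=40, mark_prob_denom=10):
    """Drop probability as a function of the weighted average queue depth."""
    if avg_depth < min_th:
        return 0.0                             # below min: never drop
    if avg_depth >= max_th:
        return 1.0                             # above max: tail-drop everything
    # Linear ramp from 0 at min_th up to 1/mark_prob_denom at max_th.
    return (avg_depth - min_th) / (max_th - min_th) / mark_prob_denom

print(red_drop_probability(10))   # 0.0
print(red_drop_probability(30))   # 0.05 (halfway up the ramp)
print(red_drop_probability(45))   # 1.0
```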

 

It is a complex challenge to tune WRED parameters to manage the queue depth, and depends on many factors, which include:

 

  • Offered traffic load and profile.

  • Ratio of load to available capacity.

  • Behavior of traffic in the presence of congestion.

 

These factors vary network by network and, in turn, depend on the offered services and on the customers who use those services. Thus, we cannot make recommendations that apply to specific customer environments. However, table 4 describes generally recommended values based on the bandwidth of the link. In that case, we do not differentiate the dropping characteristics between the different classes of service.

Table 4 – Recommended Values Based on the Bandwidth of the Link

 

Bandwidth   Theoretical BW (kbps)   Physical BW (kbps)   Minimum Threshold   Maximum Threshold
OC3         155000                  149760               94                  606
OC12        622000                  599040               375                 2423
OC48        2400000                 2396160              1498                9690
OC192       10000000                9584640              5991                38759

There are several constraints taken into account to compute the above threshold values: for instance, maximizing link utilization while minimizing average queue depth, and the difference between the maximum and minimum must be a power of two (due to a hardware limitation). Based on experience and simulation, the maximum instantaneous depth of a queue controlled by RED is less than 2 MaxTh. For OC48 and above, 1 MaxTh, and so on. However, the exact determination of these values is beyond the scope of this document.
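The power-of-two constraint can be verified against the Table 4 values as I read them (min, max pairs per link speed):

```python
# (min, max) threshold pairs from Table 4.
thresholds = {"OC3": (94, 606), "OC12": (375, 2423),
              "OC48": (1498, 9690), "OC192": (5991, 38759)}

for link, (lo, hi) in thresholds.items():
    diff = hi - lo
    assert diff & (diff - 1) == 0    # bit trick: true only for powers of two
    print(f"{link}: max - min = {diff} = 2^{diff.bit_length() - 1}")
```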

 

In the above generic recommendation, the min is smaller and the max larger than Vasilii's settings.  Again, though, optimal settings depend on your goals, traffic, and environment.

I will say, my personal experience has been max should be much more than 2x min, which appears to be the case with the above generic recommendation.

So, again, optimizing WRED requires an expert.  That said, I will agree with Vasilii that increasing your min/max settings will likely decrease the drops you're seeing now.  You might try Vasilii's suggested values, or those that Cisco recommends above.  Of the two, my general experience is that the larger ratio between min and max that Cisco suggests works better.
