Problem - Hierarchical QoS Policy and DSCP Default Marking

stewartdrs
Level 1

Hi there,

I have inherited a QoS configuration I am trying to troubleshoot. The rough setup is as follows. We have an access layer router for network A (A1) connected to a distribution router (Site1), which meets the WAN edge. Site1 is connected to a corresponding peer (Site2) using a DMVPN tunnel. A1 connects to its peer (A2) using another DMVPN tunnel (a tunnel-in-a-tunnel setup). This gives the required segregation - we have multiple networks (i.e. A, B, C etc.) which utilise the backbone DMVPN tunnel.

Traffic is marked at the access layer egress using a particular DSCP value. To test the setup, all FTP traffic (data and control plane) is marked using dscp af11. This enters the encrypted tunnel, bound for A2, with the tunnel setup using the qos pre-classify command. I can verify this marking is preserved by snooping traffic on the A2 target host.

On the edge router (Site1) we want to match inbound traffic from the various networks in a given priority, then each application within that network is remarked for queuing. This allows us to allocate bandwidth for certain traffic, even though it is encrypted at this stage. We have multiple policy maps with different bandwidth allocations, which are used interchangeably on the WAN egress. For example, we have one policy that allocates bandwidth equally to all connected networks, another that gives one network greater bandwidth and so on.

The last step in our setup is to reset the DSCP field to 0 on egress - and here's the problem. I sniff traffic on the outbound WAN interface (i.e. between Site1 and Site2) using Wireshark, where I can see all the encrypted packets, but the DSCP field has not been reset to 0! Instead, it still carries the value set by the last marking statement before the set dscp default command.
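For reference when reading the captures, the DSCP names used in this thread map to numeric codepoints as follows. This is a minimal sketch, assuming the standard Assured Forwarding encoding (DSCP = 8 x class + 2 x drop precedence); the helper names are made up for illustration and are not part of any tool mentioned here.

```python
def af_to_dscp(af_class: int, drop_precedence: int) -> int:
    """Return the 6-bit DSCP value for AF<class><drop>, e.g. AF43."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF classes are 1-4, drop precedences 1-3")
    return 8 * af_class + 2 * drop_precedence


def dscp_to_tos_byte(dscp: int) -> int:
    """Place the DSCP in the upper six bits of the IPv4 ToS byte (ECN bits zero)."""
    return dscp << 2


# The AF names that appear in the class-maps of this thread.
names = {"af11": (1, 1), "af41": (4, 1), "af42": (4, 2), "af43": (4, 3)}
for name, (af_class, drop) in names.items():
    dscp = af_to_dscp(af_class, drop)
    print(f"{name}: dscp={dscp} tos=0x{dscp_to_tos_byte(dscp):02x}")
print("default: dscp=0 tos=0x00")
```

So a packet still carrying af43 shows DSCP 38 in the capture, while a packet that has been reset shows 0.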

Here are the sample configs. Not all details are listed (such as the multiple network entries); I am just trying to get the expected behaviour right for one network before examining the wider setup.

Site1:

class-map match-all CLASS7
 match dscp af11

class-map match-any NETWORK-A
 match dscp af41
 match dscp af42
 match dscp af43

class-map match-any NETWORK-A-CLASS3
 match dscp af43

policy-map NETWORK-A-IN
 class CLASS7
  set dscp af43

policy-map NETWORK-A-SERVICE
 class NETWORK-A-CLASS3
  bandwidth percent 50
  random-detect dscp-based
  set dscp default
 class class-default
  fair-queue
  random-detect dscp-based
  set dscp default

policy-map EQUAL
 class NETWORK-A
  shape average percent 15
  service-policy NETWORK-A-SERVICE
 class NETWORK-B
  shape average percent 15
  service-policy NETWORK-B-SERVICE

interface FastEthernet2/1
 description NETWORK-A LAN
 ip address X.X.X.X Y.Y.Y.Y
 service-policy input NETWORK-A-IN

interface FastEthernet2/8
 description WAN DMVPN (physical)
 ip address X.X.X.X Y.Y.Y.Y
 bandwidth 512
 service-policy output EQUAL

My understanding of the above config is that incoming traffic from Network A with a dscp of af11 is matched to CLASS7, which then gets translated to af43. At egress on the physical interface to the tunnel, this is matched to class NETWORK-A and NETWORK-A-CLASS3. The hierarchical policy map then sets bandwidth allocation for Network A traffic to 15% on average and the previously matched traffic gets 50% of that value. The last step is to reset the dscp field to 0. The encrypted packet then leaves the fa2/8 interface and travels over the wire to where I am sniffing the traffic using Wireshark.
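To put numbers on that allocation, here is a rough worked example. It assumes the parent percentage is taken from the configured interface bandwidth (bandwidth 512 on fa2/8) and the child percentage from the parent shape rate; the variable names are illustrative only.

```python
INTERFACE_KBPS = 512          # "bandwidth 512" on FastEthernet2/8
PARENT_SHAPE_PERCENT = 15     # "shape average percent 15" in policy EQUAL
CHILD_BANDWIDTH_PERCENT = 50  # "bandwidth percent 50" in NETWORK-A-SERVICE

# Assumption: parent percent is relative to the interface bandwidth.
shape_kbps = INTERFACE_KBPS * PARENT_SHAPE_PERCENT / 100
# Assumption: child percent is relative to the parent shape rate.
class3_kbps = shape_kbps * CHILD_BANDWIDTH_PERCENT / 100

print(f"NETWORK-A shaped to        {shape_kbps:.1f} kbps")
print(f"NETWORK-A-CLASS3 guarantee {class3_kbps:.1f} kbps")
```

Under those assumptions Network A is shaped to about 76.8 kbps, of which the af43 class is guaranteed about 38.4 kbps.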

In this case, I only see af43, not 0 (default) like I am expecting. Examining counters using show policy-map interface fa2/8 doesn't show much either. I was expecting to see lots of packets for my FTP transfer, but instead only see about 70 or so. I read somewhere that this is because the packet mangling counters only apply to those done in software, not in hardware - is that correct? That I can live with, but I'd still like to know why I do not see the dscp field reset.

Any pointers?

This is against IOS 12.4-7h running on a 3825.

Regards,

Damien.

8 Replies

Raphael Wouters
Cisco Employee

Hi Damien,

I think the DSCP will only be reset for packets that are queued while the 15% shaper is active; otherwise the actions defined in the NETWORK-A-SERVICE policy are never touched.

I think it would be interesting to run "show policy-map interface fa2/8" twice, once at the beginning of a transfer and again a while into the transfer, to see the counters increase. You can also change the load-interval to 30 seconds under the fa2/8 configuration in case that has not been done yet.

You should check if the shaper is active, if the 30 second offered rate for the NETWORK-A-CLASS3 and class-default is showing something, as well as their respective packet marked counters:

          QoS Set

            dscp default

              Packets marked 0 <---

We need to see whether that counter increased by the number of packets you sent. If yes, then you have probably hit a bug where the packets are not reset at all; if not, then the packets are not reaching this class for some reason.

rawouter wrote:

Hi Damien,

I think the DSCP will only be reset for packets that are queued while the 15% shaper is active; otherwise the actions defined in the NETWORK-A-SERVICE policy are never touched.

Hi Raphael, thanks for the response! Is the above definitely true? From what I have read in the Cisco documentation, any packet classified should have the appropriate QoS operations performed on it. Is there a link you know of that can clear this aspect up for me either way?

rawouter wrote:

I think it would be interesting to run "show policy-map interface fa2/8" twice, once at the beginning of a transfer and again a while into the transfer, to see the counters increase. You can also change the load-interval to 30 seconds under the fa2/8 configuration in case that has not been done yet.

Since I posted originally, this is what I have done several times. I haven't played around with the load-interval command, however. The expected number of packets for a ~6MB file transfer (i.e. in the thousands) simply isn't getting counted. If I remove the set dscp default commands I do get stats for the last packet operation; that is, I can observe that the field remains at af43.
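As a rough sanity check on "in the thousands", the sketch below estimates the number of data segments for the transfer. The 1460-byte segment size is an assumption (a typical TCP payload on Ethernet, and likely smaller inside a tunnel), and the 76.8 kbps figure assumes the 15% shaper applies to the 512 kbps interface bandwidth; headers, ACKs and retransmissions are ignored.

```python
import math

FILE_BYTES = 6 * 1000 * 1000   # the "~6MB" test file from the post
ASSUMED_MSS = 1460             # assumption: TCP payload bytes per segment
ASSUMED_SHAPE_BPS = 76_800     # assumption: 15% of the 512 kbps interface bandwidth

# Number of full-size data segments needed to carry the file.
data_packets = math.ceil(FILE_BYTES / ASSUMED_MSS)

# Time the payload alone would need if every packet were held to the shape rate.
seconds_if_shaped = FILE_BYTES * 8 / ASSUMED_SHAPE_BPS

print(f"about {data_packets} data segments")
print(f"about {seconds_if_shaped:.0f} s if fully shaped")
```

If the transfer finishes far quicker than the second figure, that is a hint that most packets are not passing through the NETWORK-A shaper at all.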

rawouter wrote:

We need to see whether that counter increased by the number of packets you sent. If yes, then you have probably hit a bug where the packets are not reset at all; if not, then the packets are not reaching this class for some reason.

I suspected a bug too. Tried it again with a 12.4-25 image to see if that changed things. Didn't make a difference. Might try other 12.4 images in the 12.4T train and see what happens there. Might also try a 15.0 image.

It is a fairly complex QoS config, namely to support the interchangeable aspect - does the order of class-maps and policy-maps (as evident in sh run, for example) make a difference? I will try defining a separate policy as per my original post here, just to isolate any potential issue with the existing config and see if I get the same behaviour.

Any other idea of things to try?

Damien.

Hi Damien,

Is it possible to run a test by changing the policy EQUAL to:

policy-map EQUAL
 class NETWORK-A
  shape average percent 15
  service-policy NETWORK-A-SERVICE
 class NETWORK-B
  shape average percent 15
  service-policy NETWORK-B-SERVICE
 class class-default
  set dscp default
!

and see what the resulting packet marking is?

I am wondering whether traffic going into the backbone DMVPN tunnel ignores the input packet marking for classification.

Marc

Hi Marc,

So move the class class-default section out of NETWORK-A-SERVICE and place it as the tail of the EQUAL policy? Or leave it in NETWORK-A-SERVICE and include it as the tail under the EQUAL policy? Interesting. I'll try both and see what happens.

I'm back onsite tomorrow and will post the results when I get them.

Thanks for your response!

Damien.

The short answer is to leave it in NETWORK-A-SERVICE and include it as the tail under the EQUAL policy. (just for test purposes you understand)

All policies contain class-default even if it is only implicit and one thing is for sure, the packets going out the interface must be matching one class. Since all the classes within the child service-policy under class NETWORK-A have a set dscp statement, I suspect the traffic is not matching this class at all (why is a question we can deal with later).

Assuming that the traffic is not matching class NETWORK-A, then there are only two other possibilities, class NETWORK-B or class-default.

Run the test and let me know what you find.

Marc

Hi Marc,

Managed to play around with the config a bit since your last post.

The general issue of remarking all packets bound for the DMVPN tunnel is resolved by adding a class class-default section under the EQUAL policy. When I perform my file transfer without the set dscp default command for traffic matching NETWORK-A-CLASS3, I get about 50 packets that feature the af43 marking (i.e. the marking set at ingress).

So it would seem that these packets are indeed the shaped ones - everything else falls through to the newly specified class-default and THEN gets remarked. This still seems counter-intuitive to me. I still reckon shaping applies (the 15% clause), should it be required, then NETWORK-A-SERVICE gets called - like a sub-routine in a way - and all previously matched traffic is processed. So now I'm a little lost as to what the class-default class actually does under NETWORK-A-SERVICE. It seems redundant now? The only thing I can see it as is a bit of a safety catch-all - any traffic not explicitly matched by the access router and not matched at the distribution router is reset. I couldn't see any stats being logged against it, but I think I'll keep it in there for now.

So to recap, policy-map EQUAL now has a class class-default section, and that resets all packets to 0 for my entire transfer test. Also, I need to keep set dscp default under NETWORK-A-CLASS3 - if I remove it, a small bunch of packets gets logged as af43.

So thanks Marc for fixing the resetting issue. Now, I'd like to understand why this additional clause is needed and what the class-default could possibly match as it exists under NETWORK-A-SERVICE. Is there any Cisco doco out there that clears these aspects up?

Thanks again,

Damien.


Hi Damien,

Thanks for testing this hypothesis; at least now we know that the problem is related to classification. There are still a couple of avenues to explore, if you are up for it.

Policy Background
====================
Firstly, you asked for some background on the QoS policies, in particular the matching criteria. I am a software guy and so I think of policies like C switch statements. All policies have the following semantics:

1. The classes are matched from the top down. If the matching criteria overlap and a packet could fall into multiple classes, it is assigned to the first class it matches, reading from the start of the policy.

2. All policies have a default class. This class may be implicit, or it may be explicit. This catches any traffic that does not match the configured classes. It is possible that no packets match this class if the user-defined classes cover the complete traffic match space.

3. In hierarchical policies, the order of traversal for classification is from the parent policy to the child policy.

Looking at your specific policy maps, and just looking at the classes without the actions:

=================

policy-map EQUAL
class NETWORK-A
    This matches packets with dscp af41, af42, af43

    service-policy NETWORK-A-SERVICE
        class NETWORK-A-CLASS3
           This matches packets with dscp af43
        class class-default
            This matches packets with dscp af41, af42
     !

class NETWORK-B
    service-policy NETWORK-B-SERVICE
    ...

class class-default
    This matches everything else.

=================
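The three rules and the class tree above can be sketched in a few lines of Python. This is only a toy model of the matching order, not of how IOS implements it; the TrafficClass and classify names are invented for the illustration, and NETWORK-B is left out because its match criteria are not shown in this thread.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TrafficClass:
    name: str
    matches: Callable[[str], bool]                # predicate on the packet's DSCP name
    child: Optional[List["TrafficClass"]] = None  # nested service-policy, if any


def classify(policy: List[TrafficClass], dscp: str) -> List[str]:
    """Return the path of class names a packet takes, parent first."""
    for cls in policy:                  # rule 1: top down, first match wins
        if cls.matches(dscp):
            path = [cls.name]
            if cls.child:               # rule 3: parent first, then child
                path += classify(cls.child, dscp)
            return path
    return ["class-default"]            # rule 2: implicit default class


network_a_service = [
    TrafficClass("NETWORK-A-CLASS3", lambda d: d == "af43"),
    TrafficClass("class-default", lambda d: True),
]
equal = [
    TrafficClass("NETWORK-A", lambda d: d in {"af41", "af42", "af43"}, network_a_service),
    TrafficClass("class-default", lambda d: True),
]

print(classify(equal, "af43"))     # ['NETWORK-A', 'NETWORK-A-CLASS3']
print(classify(equal, "af41"))     # ['NETWORK-A', 'class-default']
print(classify(equal, "default"))  # ['class-default']
```

In this model a packet whose outer header is not af41, af42 or af43 never reaches NETWORK-A-SERVICE, so neither of the child policy's set dscp default actions can apply to it.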

Some interesting points to note. Firstly, a packet matches one class and one class only. The fact that the set action in class-default is what resets the marking shows that the FTP packets are NOT matching class NETWORK-A. Therefore, the shaping will not take effect. I don't understand the 50 packets that end up in class NETWORK-A. I even wonder whether they are part of the FTP transfer at all or belong to some other traffic.

Classification
===============
Tunnelled environments present a bit of a challenge to the policy language. Typical matching criteria are specified using IP characteristics: address, port, ToS etc. For tunnelled environments, it becomes a question of which IP information is used. Is it the inner IP header, the outer IP header, etc.? The way this is solved is to interpret the policy with respect to the interface it is attached to.

In your case, you have the following environment:


Access1    --------  site1  --------- site2 --------- Access 2
  |--------------------- DMVPN tunnel -------------------|
                       |--- DMVPN  ----|


Now, just looking at Site1, the DMVPN tunnel from A1 -> A2 is irrelevant. The packets arrive as IP packets marked af11.

The packet transformation that is expected to occur at Site1 is:

                                      +-------------+
                                      |   ip        |
                                      | dest = S2   |
                                      | src  = S1   |
                                      | dscp = 0    |
                                      |             |
+-------------+                       +-------------+
|   ip        |              ===>     |             |
| dest = A2   |                       | encrypted   |
| src  = A1   |                       | payload     |
| dscp = af11 |                       |             |
|             |                       |             |
+-------------+                       +-------------+

The input service policy on interface FastEthernet2/1 should set the inner DSCP to be af43. The fact that it is not, and is actually being applied to the outer IP header sounds like a bug to me.

Anyway, my suggestion to fix this would be to apply qos pre-classify to the tunnel on Site1, and then move the service policy NETWORK-A-IN to be an output service policy on the tunnel interface.

i.e.
   int tunnel x/y
       description "this is the tunnel from S1 to S2"
       qos pre-classify
       service-policy output NETWORK-A-IN


Thanks,
Marc

Hi Marc,

I've been looking into this a little more extensively last week, hence the lack of replies here. I wanted to be sure I was across the behaviour I was observing. First, to your last post.

mfaggion wrote:

Some interesting points to note. Firstly, a packet matches one class and one class only. The fact that the set action in class-default is what resets the marking shows that the FTP packets are NOT matching class NETWORK-A. Therefore, the shaping will not take effect. I don't understand the 50 packets that end up in class NETWORK-A. I even wonder whether they are part of the FTP transfer at all or belong to some other traffic.

Congestion management occurs only when there's congestion, which applies to shaping. At the beginning of the FTP transfer, you see TCP doing its thing before "settling down" to a sustained rate. With my test file, I repeatedly saw ~50 odd packets at the beginning of the transfer get remarked. The flow of a packet through the IOS is a little tricky to track down. The best description I saw was here: http://lh3.googleusercontent.com/_pYguWUnPyho/S8g3bZWvDCI/AAAAAAAAADo/zTqaSSgb1ZA/IOS-OOO.PNG

My understanding now is that if there's no congestion or shaping to apply to a packet (and given we're trying to shape the overall average throughput, this is quite possible with low utilisation), packets are sent straight to the hardware queue.

mfaggion wrote:

The input service policy on interface FastEthernet2/1 should set the inner DSCP to be af43. The fact that it is not, and is actually being applied to the outer IP header sounds like a bug to me.

No, it does set it to af43. Also, I didn't specify - the tunnel mode isn't transport mode. The packet from the access router is already encrypted and this process copies the DSCP field to the newly added header. It is this header that is matched against and this works.
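The header-copy behaviour described here can be pictured with a small model. It is a sketch under stated assumptions only: the Header and TunnelPacket types and both functions are hypothetical, and the only behaviour modelled is "copy the inner DSCP to the new outer header, then remark the outer header".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Header:
    src: str
    dst: str
    dscp: str


@dataclass(frozen=True)
class TunnelPacket:
    outer: Header   # the header that routers along the path can classify on
    inner: Header   # encrypted original header, opaque to routers in the path


def encapsulate(inner: Header, tunnel_src: str, tunnel_dst: str) -> TunnelPacket:
    """Add an outer header and copy the inner DSCP into it."""
    return TunnelPacket(outer=Header(tunnel_src, tunnel_dst, inner.dscp), inner=inner)


def remark_outer(pkt: TunnelPacket, dscp: str) -> TunnelPacket:
    """Model a 'set dscp' applied by a router that only sees the outer header."""
    return replace(pkt, outer=replace(pkt.outer, dscp=dscp))


# A1 marks FTP as af11 and tunnels it to A2; Site1 then remarks what it can see.
pkt = encapsulate(Header("host-a1", "host-a2", "af11"), "A1", "A2")
at_site1 = remark_outer(pkt, "af43")
print(at_site1.outer.dscp, at_site1.inner.dscp)
```

The inner marking is untouched by the remark, which is consistent with af11 still being visible on the A2 target host.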

Another post on this site (https://supportforums.cisco.com/message/3206546#3206546) has a graphic that shows the setup closest to how I myself visualise it. Bandwidth is guaranteed to the percentages specified only in times of congestion, the rest is shaped.

mfaggion wrote:

Anyway, my suggestion to fix this would be to apply qos pre-classify to the tunnel on Site1, and then move the service policy NETWORK-A-IN to be an output service policy on the tunnel interface.

i.e.
   int tunnel x/y
       description "this is the tunnel from S1 to S2"
       qos pre-classify
       service-policy output NETWORK-A-IN


Thanks,
Marc

The whole qos pre-classify command is still a little unclear for me. A page I found that tries to clear up the matter is here: http://ccnpont.blogspot.com/2008/04/end-to-end-qos-pre-classify.html. Based on my reading of that, my current setup of using pre-classification on the tunnel interface only I believe is correct.

I have also read up on the Hierarchical Queuing Framework (HQF) that has been introduced in newer IOS revisions. The changes are described here: http://www.cisco.com/en/US/prod/collateral/iosswrel/ps6537/ps6558/white_paper_c11-481499.html - note that for this, the shaping has to apply to either the physical or logical interface only. Unfortunately, I could not try an IOS with HQF as our router doesn't have enough RAM, so I'm not sure what the behaviour would be with our current configuration, but I assume it would be subtly different.

So in conclusion, the behaviour I saw was that the DSCP is reset for packets sent to the software queue, which makes sense as these are sent as "best effort". Everything else is sent through as remarked (after ingress) so that routers downstream can honour it appropriately too. This is a long, roundabout way of saying that I don't think I need to change the original config.

However, should I need to reset the DSCP field for ALL packets, then at least I have a solution to this - thanks again for your help.

Regards,

Damien.
