on 07-23-2015 08:47 AM
Change of Authorization (CoA) is a very flexible mechanism for seamlessly changing the configuration of an already active subscriber session. That flexibility comes at a cost in performance. If, for example, a certain configuration item must be changed on all active subscriber sessions, it may be worth exploring an approach that executes the change in a shorter time and with less impact on the control plane.
A typical example of such a scenario is when all subscriber sessions are assigned a higher QoS policer or shaper rate during the night and a lower QoS policer/shaper rate during the day. Using CoA, the rate at which such a change can be reliably executed on an ASR9k is around 80 CoA/second. With 128k sessions, if only a single CoA is required per subscriber session, such a change would take around half an hour. In addition, running CoA at the highest sustainable rate for an extended period of time may lead to CoA drops and retransmits, further extending the time required to complete the change.
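The half-hour estimate follows directly from the session count and the CoA rate. A minimal Python sketch of that arithmetic, assuming "128k" means 128,000 sessions and using a hypothetical helper name `coa_rollout_seconds`:

```python
def coa_rollout_seconds(sessions: int, coa_per_session: int = 1,
                        coa_rate: float = 80.0) -> float:
    """Seconds needed to send coa_per_session CoA requests to every session."""
    return sessions * coa_per_session / coa_rate


# Assumption: "128k" is read as 128,000 sessions; 80 CoA/second is the
# sustainable rate quoted in the text.
seconds = coa_rollout_seconds(128_000)
print(f"{seconds:.0f} s = {seconds / 60:.1f} min")  # prints "1600 s = 26.7 min"
```

This is a best case: it ignores the CoA drops and retransmits mentioned above, which would only lengthen the rollout.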
This document describes one alternative to CoA for the case where all subscriber sessions have their QoS policer rate changed twice a day.
Instead of changing the policer rate in the QoS policy-map, the solution presented in this document never changes the QoS policy associated with the subscriber session. Instead, it changes the classification criteria so that all data traffic matches one QoS class-map during the night and another QoS class-map during the day.
Restrictions of this solution are:
Since the solution is entirely local to the router, there is no direct mechanism to inform the RADIUS server of this configuration change. Implicitly, the RADIUS server could detect that all traffic is hitting a different QoS class. This document doesn't cover how this could be done on the RADIUS side. As there are no direct changes to the QoS policy, accounting records sent to RADIUS are not disrupted.
In this particular test all subscriber sessions have the QoS policy-map "TIMED" applied on egress. All sessions are dual stack.
dynamic-template
 type ipsubscriber SUB_DT_1
  service-policy output TIMED
  ipv4 verify unicast source reachable-via rx
  ipv4 unnumbered Loopback4
  ipv6 verify unicast source reachable-via rx
  ipv6 enable
!
policy-map type control subscriber IPoE_PMAP_1
 event session-start match-first
  class type control subscriber DHCP_CMAP do-until-failure
   1 activate dynamic-template SUB_DT_1
   2 authorize aaa list default identifier source-address-mac password shootme
!
interface Bundle-Ether1.110
 ipv4 point-to-point
 ipv4 unnumbered Loopback4
 ipv6 address 67::1/64
 ipv6 enable
 service-policy type control subscriber IPoE_PMAP_1
 encapsulation dot1q 110
 ipsubscriber ipv4 l2-connected
  initiator dhcp
 !
 ipsubscriber ipv6 l2-connected
  initiator dhcp
 !
Policy-map "TIMED" has one class that all IPv4 and IPv6 traffic should hit during the night and another class that all IPv4 and IPv6 traffic should hit during the day:
policy-map TIMED
 class TIMED_NIGHT
  police rate 5 mbps
 !
 class TIMED_DAY
  police rate 1 mbps
 !
 class class-default
 !
 end-policy-map
!
Classification is configured using access lists (ACLs):
class-map match-any TIMED_NIGHT
 match access-group ipv4 TIMED_NIGHT_IPv4
 match access-group ipv6 TIMED_NIGHT_IPv6
 end-class-map
!
class-map match-any TIMED_DAY
 match access-group ipv4 TIMED_DAY_IPv4
 match access-group ipv6 TIMED_DAY_IPv6
 end-class-map
It's the ACLs that are manipulated to change classification at different times of the day. Since the class-map "TIMED_NIGHT" is the first one in the configuration of the policy-map "TIMED", all IPv4 and IPv6 traffic should match ACLs "TIMED_NIGHT_IPv4" and "TIMED_NIGHT_IPv6" respectively during the night. During the day, no traffic should match these two access lists, so the QoS classification falls through to class-map "TIMED_DAY". ACLs "TIMED_DAY_IPv4" and "TIMED_DAY_IPv6" should match all IPv4 and IPv6 traffic respectively at any time of the day.
ACL | Night | Day |
---|---|---|
TIMED_NIGHT_IPv4 | Matches all IPv4 traffic | Doesn't match any IPv4 traffic |
TIMED_NIGHT_IPv6 | Matches all IPv6 traffic | Doesn't match any IPv6 traffic |
TIMED_DAY_IPv4 | Matches all IPv4 traffic | Matches all IPv4 traffic |
TIMED_DAY_IPv6 | Matches all IPv6 traffic | Matches all IPv6 traffic |
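The first-match behaviour summarised in the table can be modelled in a few lines of Python. This is only an illustrative sketch of the classification logic, not of how the router implements it; the function names (`build_acls`, `classify`) and the test addresses are hypothetical, and the loopback ACEs are the dummy entries that the ACL configurations in this document keep in the night ACLs.

```python
import ipaddress

ANY4, ANY6 = "0.0.0.0/0", "::/0"
DUMMY4 = ("127.0.0.1/32", "127.0.0.1/32")  # dummy ACE kept in TIMED_NIGHT_IPv4
DUMMY6 = ("::1/128", "::1/128")            # dummy ACE kept in TIMED_NIGHT_IPv6

# Classes in the order they appear in policy-map TIMED (first match wins).
POLICY = [
    ("TIMED_NIGHT", ["TIMED_NIGHT_IPv4", "TIMED_NIGHT_IPv6"]),
    ("TIMED_DAY", ["TIMED_DAY_IPv4", "TIMED_DAY_IPv6"]),
]


def build_acls(night: bool) -> dict:
    """Permit ACEs as (source, destination) networks for the night or day state."""
    return {
        "TIMED_NIGHT_IPv4": ([(ANY4, ANY4)] if night else []) + [DUMMY4],
        "TIMED_NIGHT_IPv6": ([(ANY6, ANY6)] if night else []) + [DUMMY6],
        "TIMED_DAY_IPv4": [(ANY4, ANY4)],
        "TIMED_DAY_IPv6": [(ANY6, ANY6)],
    }


def ace_matches(ace, src, dst) -> bool:
    """True if the packet's source and destination fall inside the ACE networks."""
    src_net, dst_net = (ipaddress.ip_network(n) for n in ace)
    return src.version == src_net.version and src in src_net and dst in dst_net


def classify(src: str, dst: str, night: bool) -> str:
    """Return the first class of policy TIMED whose class-map (match-any) matches."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    acl_table = build_acls(night)
    for class_name, acl_names in POLICY:
        if any(ace_matches(ace, s, d)
               for name in acl_names for ace in acl_table[name]):
            return class_name
    return "class-default"


for night in (True, False):
    print("night" if night else "day",
          classify("192.0.2.1", "198.51.100.1", night),
          classify("2001:db8::1", "2001:db8::2", night))
```

In this model ordinary subscriber traffic lands in TIMED_NIGHT at night and falls through to TIMED_DAY during the day, while only loopback-to-loopback packets would still match the night class in the day state.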
To achieve this, the ACL configuration during the night looks like this:
ipv4 access-list TIMED_NIGHT_IPv4
 10 permit ipv4 any any
 20 permit ipv4 host 127.0.0.1 host 127.0.0.1
!
ipv6 access-list TIMED_NIGHT_IPv6
 10 permit ipv6 any any
 20 permit ipv6 host ::1 host ::1
!
ipv4 access-list TIMED_DAY_IPv4
 10 permit ipv4 any any
!
ipv6 access-list TIMED_DAY_IPv6
 10 permit ipv6 any any
The following command shows that with this ACL configuration all IPv4 and IPv6 traffic can only hit the QoS class "TIMED_NIGHT":
RP/0/RSP0/CPU0:GBNG1#sh qos-ea km policy TIMED vmr interface BE1.110.ip12 member Gi0/0/0/0 hw verbose
================================================================================
policy name TIMED and format type 4
Total Ingress TCAM entries: 3
====
CLASS_NAME : TIMED_NIGHT
CLASS_NUM : 0x0
VM_SEQ : 0x0
APP_ID : 0x1/0x0
QOS_ID : 0x20/0x0
PKT_TYPE : ipv4
====
CLASS_NAME : TIMED_NIGHT
CLASS_NUM : 0x0
VM_SEQ : 0x1
APP_ID : 0x1/0x0
QOS_ID : 0x20/0x0
PKT_TYPE : ipv6
====
CLASS_NAME : class-default
CLASS_NUM : 0x2
VM_SEQ : 0x2
APP_ID : 0x1/0x0
QOS_ID : 0x20/0x0
PKT_TYPE : any
================================================================================
Total Ingress and Egress TCAM entries: 3
Configuration of the same ACLs during the day:
ipv4 access-list TIMED_NIGHT_IPv4
 20 permit ipv4 host 127.0.0.1 host 127.0.0.1
!
ipv6 access-list TIMED_NIGHT_IPv6
 20 permit ipv6 host ::1 host ::1
!
ipv4 access-list TIMED_DAY_IPv4
 10 permit ipv4 any any
!
ipv6 access-list TIMED_DAY_IPv6
 10 permit ipv6 any any
The dummy entry in each ACL was kept to prevent the ACL from being deleted. This makes the automated config change simpler, with minimal impact on TCAM space (only two more entries are used during the day):
RP/0/RSP0/CPU0:GBNG1#sh qos-ea km policy TIMED vmr interface BE1.110.ip12 member Gi0/0/0/0 hw verbose
================================================================================
policy name TIMED and format type 4
Total Ingress TCAM entries: 5
====
CLASS_NAME : TIMED_NIGHT
CLASS_NUM : 0x0
VM_SEQ : 0x0
APP_ID : 0x1/0x0
QOS_ID : 0x22/0x0
PKT_TYPE : ipv4
SRC_IP_ADDRESS : 127.0.0.1/0.0.0.0
DEST_IP_ADDRESS : 127.0.0.1/0.0.0.0
====
CLASS_NAME : TIMED_NIGHT
CLASS_NUM : 0x0
VM_SEQ : 0x1
APP_ID : 0x1/0x0
QOS_ID : 0x22/0x0
PKT_TYPE : ipv6
SRC_IP_ADDRESS : ::1/::
DEST_IP_ADDRESS : ::1/::
====
CLASS_NAME : TIMED_DAY
CLASS_NUM : 0x1
VM_SEQ : 0x2
APP_ID : 0x1/0x0
QOS_ID : 0x22/0x0
PKT_TYPE : ipv4
====
CLASS_NAME : TIMED_DAY
CLASS_NUM : 0x1
VM_SEQ : 0x3
APP_ID : 0x1/0x0
QOS_ID : 0x22/0x0
PKT_TYPE : ipv6
====
CLASS_NAME : class-default
CLASS_NUM : 0x2
VM_SEQ : 0x4
APP_ID : 0x1/0x0
QOS_ID : 0x22/0x0
PKT_TYPE : any
================================================================================
Total Ingress and Egress TCAM entries: 5
This test was performed on the following setup:
In this setup, it took around 4 minutes and 40 seconds for a change of an ACL entry (ACE) to be applied to all subscriber interfaces. The change was committed as follows:
RP/0/RSP0/CPU0:GBNG1(config)#ipv4 access-list TIMED_NIGHT_IPv4
RP/0/RSP0/CPU0:GBNG1(config-ipv4-acl)#no 10
RP/0/RSP0/CPU0:GBNG1(config-ipv4-acl)#commit
Regardless of whether the subscribers are RP-based or LC-based, QoS is always performed on the Network Processor (NP) on the line card and the CPU tasked with programming the NP is the LC CPU. Hence the performance monitoring should be done on the LC where the QoS is applied.
A continuous trace was started in another window to verify when the reconfiguration started and completed:
RP/0/RSP0/CPU0:GBNG1#sh qos-ea trace policy-modification tailf location 0/0/CPU0
Jul 19 03:21:16.481 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTPENTRY Entering ipv4_acl_res_check
Jul 19 03:21:16.481 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTPEXIT Exiting ipv4_acl_res_check. Error is No error
Jul 19 03:21:17.498 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTPENTRY Entering acl_mod_create_mod_db
Jul 19 03:21:17.498 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTP6 ACL name to modify is TIMED_DAY. State 2. EA Ready? 1
Jul 19 03:21:17.498 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTP2 Policy TIMED is to be modified
Jul 19 03:21:29.709 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTP7 One target is ifh 0x0, ingress 0, egress 0
Jul 19 03:21:29.709 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTP7 One target is ifh 0x0, ingress 1, egress 0
<. . .>
Jul 19 03:26:04.340 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTPEXIT Exiting pl_mod_func. Error is No error
Jul 19 03:26:04.340 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTPENTRY Entering pl_mod_func
Jul 19 03:26:04.340 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTP4 Tid: 0, Action: 4, plmgr_updates:0,best_effort: 1, flags:0, ea_ready:1
Jul 19 03:26:04.340 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTP5 Curr state: MDF - All servers up, batch_num: 1, batch_size: 1
Jul 19 03:26:04.461 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTPEXIT Exiting pl_mod_func. Error is No error
Jul 19 03:26:04.461 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTPEXIT Exiting async_acl_mod_plcy_tgt_handler. Error is No Error
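The duration of the reconfiguration can be read off the trace by subtracting the first timestamp from the last. The Python sketch below does this for the first and last trace lines shown above. The helper name `trace_time` is hypothetical, and the year 2015 is an assumption taken from the posting date, because the trace itself carries no year. The result depends on which trace lines are treated as the start and end of the change, so it should only be expected to land in the same range as the figure quoted earlier.

```python
from datetime import datetime

ASSUMED_YEAR = 2015  # assumption: the trace lines do not include a year


def trace_time(line: str) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS.mmm' timestamp of a trace line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{ASSUMED_YEAR} {stamp}", "%Y %b %d %H:%M:%S.%f")


first = ("Jul 19 03:21:16.481 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 "
         "PMOD_LTPENTRY Entering ipv4_acl_res_check")
last = ("Jul 19 03:26:04.461 qos_ea/qos_ma_ea_pmod 0/0/CPU0 t1 PMOD_LTPEXIT "
        "Exiting async_acl_mod_plcy_tgt_handler. Error is No Error")

elapsed = trace_time(last) - trace_time(first)
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"elapsed: {int(minutes)} min {seconds:.1f} s")
```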
While the change is applied in HW, the "sh qos-ea km policy <policy> vmr interface ..." command is suppressed, so it may time out with the following message:
LC/0/0/CPU0:Jul 20 03:24:30.058 : sysdb_svr_local[362]: %SYSDB-SYSDB-6-TIMEOUT_EDM : EDM request for 'oper/km/qos_ea/node/821/policy/TIMED/info' from 'km_show' (jid 65911, node 0/RSP0/CPU0). No response from 'qos_ma_ea' (jid 320, node 0/0/CPU0) within the timeout period (100 seconds)
The above error should not be a concern if the QoS classification reconfiguration is in progress.
While the qos_ea is in full swing updating the ACL, new subscribers may come up with a delay.
Baseline LC CPU utilisation with this scale was around 30%. During the configuration change the overall CPU utilisation was fluctuating between 30% and 70%. The highest contributor during the configuration change was the qos_ma_ea process, with CPU utilisation around 25% in the first minute, later dropping to around 10% until the configuration change was completed.
One way to monitor the CPU during the reconfiguration is to open a new terminal window to the router and run the following commands:
run attach <location>
top -d -2
Automation was achieved using EEM/Tcl. Execution time was specified using a cron timer. Changes are applied in the morning just after 8am and in the evening just after 11pm. IPv4 and IPv6 ACL modifications were spread 15 minutes apart.
event manager environment _cron_entry_match_daily_IPv4 0 8 * * *
event manager environment _cron_entry_match_daily_IPv6 15 8 * * *
event manager environment _cron_entry_match_nightly_IPv4 0 23 * * *
event manager environment _cron_entry_match_nightly_IPv6 15 23 * * *
The generic EEM/Tcl configuration required for this specifies the script location, the authorization method, and the username under which the scripts are executed:
event manager directory user policy harddisk:/scripts
aaa authorization eventmanager default local
!(user is configured in admin mode!)
username eem_user
 group root-system
 group cisco-support
 secret <password>
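As a quick sanity check of the schedule, the cron entries can be decoded with a few lines of Python. This sketch only handles the simple "minute hour * * *" form used here, and `minute_of_day` is a hypothetical helper name. With the entries as configured, the IPv6 run starts 15 minutes after the IPv4 run, which leaves room for the roughly five-minute IPv4 reconfiguration measured earlier to finish first.

```python
CRON_ENTRIES = {
    "_cron_entry_match_daily_IPv4": "0 8 * * *",
    "_cron_entry_match_daily_IPv6": "15 8 * * *",
    "_cron_entry_match_nightly_IPv4": "0 23 * * *",
    "_cron_entry_match_nightly_IPv6": "15 23 * * *",
}


def minute_of_day(entry: str) -> int:
    """Minutes after midnight for a cron entry of the form 'minute hour * * *'."""
    minute, hour, *_ = entry.split()
    return int(hour) * 60 + int(minute)


for name, entry in CRON_ENTRIES.items():
    start = minute_of_day(entry)
    print(f"{name}: {start // 60:02d}:{start % 60:02d}")

gap_daily = (minute_of_day(CRON_ENTRIES["_cron_entry_match_daily_IPv6"])
             - minute_of_day(CRON_ENTRIES["_cron_entry_match_daily_IPv4"]))
gap_nightly = (minute_of_day(CRON_ENTRIES["_cron_entry_match_nightly_IPv6"])
               - minute_of_day(CRON_ENTRIES["_cron_entry_match_nightly_IPv4"]))
print(f"IPv4 to IPv6 gap: {gap_daily} min (day), {gap_nightly} min (night)")
```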
The last step is to register the four scripts:
event manager policy update_acl_match_daily_IPv4.tcl username eem_user persist-time 3600 type user
event manager policy update_acl_match_daily_IPv6.tcl username eem_user persist-time 3600 type user
event manager policy update_acl_match_nightly_IPv4.tcl username eem_user persist-time 3600 type user
event manager policy update_acl_match_nightly_IPv6.tcl username eem_user persist-time 3600 type user
The four scripts modify the TIMED_NIGHT_IPv4 and TIMED_NIGHT_IPv6 ACL configuration as explained earlier in the document. They are available for download in the attachments section of this document ("update_acl_match.zip").
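The attached Tcl scripts are not reproduced here. As a rough illustration of what each one has to push to the router, the Python sketch below builds the CLI line sequence for a given address family and time of day, based on the manual ACL change and the night/day ACL contents shown earlier. The function name `acl_update_commands` is hypothetical and the exact command sequence used by the real scripts is an assumption.

```python
def acl_update_commands(family: str, night: bool) -> list[str]:
    """CLI lines that switch TIMED_NIGHT_IPv4/IPv6 to the night or day state.

    Assumption: ACE 10 is the 'permit any any' entry that is added at night
    and removed during the day; dummy ACE 20 is never touched.
    """
    if family not in ("ipv4", "ipv6"):
        raise ValueError(f"unsupported address family: {family}")
    acl_name = f"TIMED_NIGHT_IPv{family[-1]}"
    ace_line = f"10 permit {family} any any" if night else "no 10"
    return ["configure", f"{family} access-list {acl_name}", ace_line,
            "commit", "end"]


for family in ("ipv4", "ipv6"):
    for night in (False, True):
        label = "nightly" if night else "daily"
        print(f"{label} {family}: " + " ; ".join(acl_update_commands(family, night)))
```

Keeping the dummy ACE 20 in place means the script never has to create or delete the ACL itself, only add or remove a single entry.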