Ask the Expert: Configuration and Troubleshooting the Cisco Application Control Engine (ACE) load balancer

ciscomoderator
Community Manager

With Ajay Kumar and Telmo Pereira 

 

Welcome to the Cisco Support Community Ask the Expert conversation. This is an opportunity to learn and ask questions about configuring and troubleshooting the Cisco Application Control Engine (ACE) load balancer with Cisco experts Ajay Kumar and Telmo Pereira. The Cisco ACE Application Control Engine Module for Cisco Catalyst 6500 Series Switches and Cisco 7600 Series Routers is a next-generation load-balancing and application-delivery solution. A member of the Cisco family of Data Center 3.0 solutions, the module:

• Helps ensure business continuity by increasing application availability
• Improves business productivity by accelerating application and server performance
• Reduces data center power, space, and cooling needs through a virtualized architecture
• Helps lower operational costs associated with application provisioning and scaling

 

Ajay Kumar  is a customer support engineer in the Cisco Technical Assistance Center in Brussels, covering content delivery network technologies including Cisco Application Control Engine, Cisco Wide Area Application Services, Cisco Content Switching Module, Cisco Content Services Switches, and others. He has been with Cisco for more than four years, working with major customers to help resolve their issues related to content products. He holds DCASI and VCP certifications. 

 

Telmo Pereira is a customer support engineer in the Cisco Technical Assistance Center in Brussels, where he covers all Cisco content delivery network technologies including Cisco Application Control Engine (ACE), Cisco Wide Area Application Services (WAAS), and Digital Media Suite. He has worked with multiple customers around the globe, helping them solve interesting and often highly complex issues. Pereira has worked in the networking field for more than 7 years. He holds a computer science degree as well as multiple certifications including CCNP, DCASI, DCUCI, and VCP.

 

Remember to use the rating system to let Ajay know if you have received an adequate response.

 

Ajay and Telmo might not be able to answer each question due to the volume expected during this event. Remember that you can continue the conversation on the Data Center sub-community discussion forum Application Networking shortly after the event.

 

This event lasts through July 26, 2013. Visit this forum often to view responses to your questions and the questions of other community members.

 

 

Hello Ajay.

Thanks for the reply.

My network diagram is the following :

Server1 ----> ACE1 ----> ASA1 ----> ACE2 ----> Server2
                |                     ^
                +------> ASA2 --------+

     "DMZ"               |               "INSIDE"

Two servers are communicating with each other via specific tcp port numbers.

My problem is how to load-balance the traffic from server1 to server2 and vice versa through both ASA1 and ASA2.

Currently, only ASA1 carries the connections from server1 to server2.

Thank you for reading.

Hello Jeongdae,

I understand it now. You can refer to the following link for firewall load balancing.

http://www.cisco.com/en/US/docs/app_ntwk_services/data_center_app_services/ace_appliances/vA3_1_0/configuration/slb/guide/fwldbal.html

The mac-sticky enable command is what does the trick here: it keeps each session on the same firewall.
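As a minimal sketch of where that command lives, assuming VLAN 100 is the interface on which the ACE receives traffic from the firewalls (the VLAN number and description are illustrative and not taken from this setup):

```
interface vlan 100
  description Hypothetical firewall-facing VLAN
  mac-sticky enable
  no shutdown
```

With mac-sticky enabled, the ACE sends return traffic for a connection to the MAC address the connection arrived from instead of doing a fresh route lookup, which is what keeps both directions of a flow on one firewall.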

Let me know if you have any specific questions related to this setup.

regards,

Ajay Kumar

Krzysztof Obara
Level 1

Dear Experts,

I'd like to ask you about the configuration changes needed to use a sorry serverfarm (a sorry page shown when the primary sfarm fails), with regard to this specific example:

rserver host REAL_PRIMARY_1
  ip address 10.0.0.1
  inservice
rserver host REAL_PRIMARY_2
  ip address 10.0.0.2
  inservice
rserver host REAL_BACKUP_1
  ip address 10.0.0.11
  inservice
rserver host REAL_BACKUP_2
  ip address 10.0.0.12
  inservice

serverfarm host SFARM_PRIMARY
  failaction reassign
  predictor leastconns slowstart 30
  probe PROBE_HTTP
  rserver REAL_PRIMARY_1 80
    conn-limit max 10000 min 9900
    inservice
  rserver REAL_PRIMARY_2 80
    conn-limit max 10000 min 9900
    inservice
serverfarm host SFARM_BACKUP
  failaction reassign
  predictor leastconns slowstart 30
  probe PROBE_HTTP
  rserver REAL_BACKUP_1 80
    inservice
  rserver REAL_BACKUP_2 80
    inservice

sticky ip-netmask 255.255.255.255 address source STICKY_PRIMARY
  timeout 120
  serverfarm SFARM_PRIMARY backup SFARM_BACKUP
sticky http-cookie BACKUP_ID STICKY_BACKUP
  cookie insert
  timeout 60
  serverfarm SFARM_BACKUP

action-list type modify http ACT_LIST_RW
  ssl url rewrite location ".*"

policy-map type loadbalance first-match PM_L7_PRIMARY
  class class-default
    sticky-serverfarm STICKY_PRIMARY
    action ACT_LIST_RW
policy-map type loadbalance first-match PM_L7_BACKUP
  class class-default
    sticky-serverfarm STICKY_BACKUP

class-map match-all CM_L3L4_PRIMARY
  10 match virtual-address 10.0.0.100 tcp eq https
class-map match-all CM_L3L4_BACKUP
  10 match virtual-address 10.0.0.200 tcp eq www

parameter-map type connection PARA_MAP_CONN_TIMEOUT_120
  set timeout inactivity 120
parameter-map type http PARA_MAP_PERSIST_REBAL
  persistence-rebalance
  header modify per-request

policy-map multi-match PM_L3L4_PRIMARY
  class CM_L3L4_PRIMARY
    loadbalance vip inservice
    loadbalance policy PM_L7_PRIMARY
    loadbalance vip icmp-reply active
    appl-parameter http advanced-options PARA_MAP_PERSIST_REBAL
    ssl-proxy server SSL_PROXY_PRIMARY
    connection advanced-options PARA_MAP_CONN_TIMEOUT_120
policy-map multi-match PM_L3L4_BACKUP
  class CM_L3L4_BACKUP
    loadbalance vip inservice
    loadbalance policy PM_L7_BACKUP
    loadbalance vip icmp-reply active
    appl-parameter http advanced-options PARA_MAP_PERSIST_REBAL

interface vlan 100
  description Client-side VLAN
  bridge-group 100
  mac-sticky enable
  access-group input ACL_ALLOW_BPDU
  access-group input ACL_ALL_IP
  access-group output ACL_ALL_IP
  service-policy input PM_L3L4_PRIMARY
  service-policy input PM_L3L4_BACKUP
  no shutdown
interface vlan 200
  description Server-side VLAN
  bridge-group 100
  mac-sticky enable
  access-group input ACL_ALLOW_BPDU
  access-group input ACL_ALL_IP
  access-group output ACL_ALL_IP
  no shutdown

interface bvi 100
  ip address 10.0.0.252 255.255.255.0
  alias 10.0.0.254 255.255.255.0
  peer ip address 10.0.0.253 255.255.255.0
  no shutdown

My questions for the above example:

1) After the first test, users who connect while the primary page is down see the sorry page, as expected. However, after the primary sfarm comes back up, they still see the sorry page. What can cause that behaviour? Should I remove the http-cookie sticky for the backup sfarm?

2) How will failaction reassign work when the primary sfarm goes down because it has reached the MAXCONN state?

3) How will the ACE behave when the backup sfarm is configured under the sticky group of the primary sfarm? Will only connections to the backup sfarm be added to the sticky database?

4) Is there anything else to configure for the sorry sfarm if the requirement is to configure the backup sfarm as a host serverfarm rather than a redirect serverfarm (the example on cisco.com configures the sorry sfarm as a redirect)?

Kind regards,

Krzysztof

Krzysztof,

Here are the answers to your queries:

1) This is expected behavior. Once the connections are on the backup serverfarm, they will stay there until they complete. But as you suspect, stickiness also plays a role here. So, assuming you have sticky configured for the backup serverfarm, this is the behavior you should see:

– All new sticky connections that match existing sticky table entries for the real servers in the backup server farm are stuck to the same real servers in the backup server farm.

– All new non-sticky connections and those sticky connections that do not have an entry in the sticky table are load balanced to the real servers in the primary server farm.

Hopefully this is what you are seeing.

2) When you reach the MAXCONN threshold, you can expect new connections to be sent to the backup farm. Depending on the platform you are using, this threshold value may be divided by the number of IXPs (network processors).

3) That is normally what we see customers doing. So instead of having:

sticky ip-netmask 255.255.255.255 address source STICKY_PRIMARY
  timeout 120
  serverfarm SFARM_PRIMARY backup SFARM_BACKUP
sticky http-cookie BACKUP_ID STICKY_BACKUP
  cookie insert
  timeout 60
  serverfarm SFARM_BACKUP

you would simply do (note the sticky keyword after the backup farm):

sticky ip-netmask 255.255.255.255 address source STICKY_PRIMARY
  timeout 120
  serverfarm SFARM_PRIMARY backup SFARM_BACKUP sticky

And the behavior is exactly the same. If all the servers in the primary server farm go down, the ACE sends all new requests to the backup server farm. When the primary server farm comes back up (at least one server becomes active):

• If the sticky option is enabled, then:
  – All new sticky connections that match existing sticky table entries for the real servers in the backup server farm are stuck to the same real servers in the backup server farm.
  – All new non-sticky connections and those sticky connections that do not have an entry in the sticky table are load balanced to the real servers in the primary server farm.
• If the sticky option is not enabled, then the ACE load balances all new connections to the real servers in the primary server farm.
• Existing non-sticky connections to the servers in the backup server farm are allowed to complete in the backup server farm.

This has been documented here:

http://www.cisco.com/en/US/docs/interfaces_modules/services_modules/ace/vA2_3_0/configuration/slb/guide/sticky.html#wp1137791

4) No, at first sight your configuration looks fine. However, if I understood you correctly, you would need to remove sticky for the backup farm to meet your requirements, or at least to get behavior close to what you are expecting.

HTH,

Telmo

Hello Telmo,

Thank you very much for your answers.

Here are my conclusions:

1) & 3) As suspected, stickiness caused the problem for users who had triggered the sorry page. I am about to remove the sticky http-cookie from the backup sfarm, but what about this sentence from the link that you provided (the last paragraph under that section):

If you want to configure sorry servers and you want existing connections to revert to the primary server farm after it comes back up, do not use stickiness.

Does it mean not to use stickiness for both the primary and backup sfarm, or only for the backup sfarm?

2) I'm using an ACE module in a Cisco 6500 switch, software version A2(3.5), and there are 2 NPs. When using a sorry sfarm, should I also remove failaction reassign? That action is mostly used for passing traffic to stateful firewalls (as backups). Please correct me if I'm wrong.

4) Regardless of how the above sentence from the ACE config guide is read (I guess they had the backup sfarm only in mind), removing sticky from the backup sfarm will solve the problem.

Your help is much appreciated,

Krzysztof

1) & 3) That sentence is only in reference to the backup farm.

2) You are correct. Failaction reassign will simply begin sending packets to the remaining/backup rserver. There must be some logic to sync the state on the OS/app across all of the rservers for this to work properly; otherwise the new server will simply drop the connection, since it never saw the handshake. Typically you see this feature used with firewall load balancing, because firewalls replicate their connection state tables.

4) Correct.
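Putting the conclusions of this exchange together, the change might look like the sketch below. It reuses the object names from the configuration posted earlier in the thread and assumes the cookie sticky group is referenced only by PM_L7_BACKUP; treat it as an outline to verify on your software version rather than a tested change:

```
serverfarm host SFARM_PRIMARY
  no failaction
serverfarm host SFARM_BACKUP
  no failaction

policy-map type loadbalance first-match PM_L7_BACKUP
  class class-default
    no sticky-serverfarm STICKY_BACKUP
    serverfarm SFARM_BACKUP

no sticky http-cookie BACKUP_ID STICKY_BACKUP
```

The sticky group is detached from the policy before it is deleted, since an object that is still referenced by a policy generally cannot be removed. The STICKY_PRIMARY group is left as it was, with the backup farm listed without the sticky keyword.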

Pleasure is all mine, hope this helps.

Regards,

Telmo

Thank you, Telmo.

It really helped me.

kgilbert1975
Level 1

Hi Mr. Ajay, I am desperately in need of your help configuring a Cisco 3620 series router for remote-access VPN with the IOS image flash:c3620-jk9s-mz.122-29.bin. Please help, my boss is on me for this.

Hi Gilbert,

I am sorry, but this is not my area of expertise. I deal with Cisco ACE. You can post the same question in the security support forum, but try to provide more details and requirements so that they can help you out.

regards,

Ajay Kumar

Krzysztof Obara
Level 1

Hello Experts,

I'd like to ask how to interpret the values from the resource usage output in more detail:

ACE/Context# show resource usage
                                                     Allocation
        Resource         Current       Peak        Min        Max       Denied
-------------------------------------------------------------------------------
-- outputs omitted for brevity --
  proxy-connections             0      16358      16358      16358      17872
  ssl-connections rate          0        626        626        626      23204

Some tests were done with many SSL connections per second (more than the limit).

1) What would the ACE normally do with denied SSL connections? Is it possible that the ACE sends TCP resets to clients? (TCP resets were captured during the tests.)

2) What are the proxy-connections shown in the resources? And the same question as above: what does the ACE do with denied proxy connections? What TCP flags will be sent to clients?

I guess that to overcome the high number of denied connections, the values under the resource class will need to be increased.

3) Suppose sticky is configured for an sfarm with conn-limit set, and a client initiates traffic to the sfarm while it is UP (conn-limit not yet reached). Sticky will then add an entry for the client source IP and the associated rserver (per the configured sticky method).

What will happen if the client loses its connection but still has an existing sticky entry (the sticky timeout is set to 2 hours), and then tries to send traffic to the sfarm, which is now DOWN because all rservers are in the MAXCONN state?

Does sticky have no relation to the current connections on the rservers?

Best regards,

Krzysztof

Hello Krzysztof,

Another set of good/interesting questions posted. Thanks! 

I will try to clarify your doubts.

In the output below, both resources (proxy-connections and ssl-connections rate) are configured with a minimum percentage of resources (column Min), while Max is set equal to the minimum.

ACE/Context# show resource usage
                                                     Allocation
        Resource         Current       Peak        Min        Max       Denied
-------------------------------------------------------------------------------
-- outputs omitted for brevity --
  proxy-connections             0      16358      16358      16358      17872
  ssl-connections rate          0        626        626        626      23204

Most columns are self-explanatory: 'Current' is the current usage, 'Peak' is the maximum value reached, and 'Denied', the most important counter to monitor, represents the number of packets denied/dropped due to exceeding the configured limits.

On the resources themselves, Proxy-connections is simply the amount of proxied connections, in other words all connections handled at layer 7 (SSL connections are proxied, as are any connections with layer 7 load balance policies, or inspection).

So in this particular case for the proxy-connections we see that Peak is equal to the Max allocated, and as we have denies we can conclude that you have surpassed the limits for this resource. We see there were 17872 connections dropped due to that.

ssl-connections rate should be read in the same manner. However, all values for this resource are in connections per second, except for the Denied counter, which is simply the number of packets that were dropped due to exceeding this resource.

For your particular tests, you have allocated a minimum percentage and set max equal to min. This way you make sure that this context will not use any additional resources.

If you had set the max to unlimited during resource allocation, ACE would be allowed to use additional resources on top of those guaranteed, if those resources were available.

This might sound like a great idea, but resource planning on ACE should be done carefully to avoid any sort of oversubscription, especially if you have business-critical contexts.
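For reference, resource limits are set in a resource class in the Admin context and applied by making the context a member of that class. A minimal sketch, where the class name, context name, and percentages are all made up for illustration and would need sizing against the other contexts on the box:

```
resource-class RC_EXAMPLE
  limit-resource all minimum 10.00 maximum equal-to-min
  limit-resource proxy-connections minimum 10.00 maximum unlimited
  limit-resource rate ssl-connections minimum 10.00 maximum unlimited

context EXAMPLE_CTX
  member RC_EXAMPLE
```

The first limit-resource line mirrors the "max equal to min" allocation discussed above, while the two resource-specific lines let proxy connections and the SSL connection rate borrow unallocated capacity when it is available.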

We have a good reference for ACE resource planning that also contains a description of all resources (this will help you understand the output better):

http://www.cisco.com/en/US/docs/interfaces_modules/services_modules/ace/v3.00_A2/configuration/virtualization/guide/config.html#wp1008224

1) When a resource is utilized to its maximum limit, the ACE denies additional requests made by any context for that resource. In other words, the action is to drop. The ACE should, in theory, drop silently (no RST is sent back to the client). So unless we changed something in the code, this is what you should see.

To give more context, seeing resets with SSL connections is not necessarily synonymous with drops, as it is common to see them during normal transactions.

For instance, Microsoft servers usually terminate SSL connections ungracefully with a RESET. You may also see RESETs when there is a renegotiation during an SSL transaction, but these pass unnoticed by end users.

2) The ACE will simply drop/ignore new connections once the maximum number of proxied connections for that context is reached. Existing connections will continue.

As the ACE doesn't respond, the client would simply retransmit, and with luck a later attempt will succeed in establishing the connection.

To overcome the denies, you will definitely have to increase the resource allocation. This of course, assuming you are not reaching any physical limit of the box.

As mentioned, setting max to unlimited might work for you, assuming there are a lot of unused resources on the box.

3) If a new connection comes in with a sticky value that matches the sticky entry of a real server which is already in the MAXCONNS state, then both the ACE module and the appliance should reject the connection and remove that sticky entry.

The client would at that point reestablish a new connection and ACE would associate a new sticky entry with the flow for a new RSERVER after the loadbalancing decision.
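To observe this on a live box, the sticky table and the real server state can be checked side by side. A short sketch using the object names from the configuration posted earlier in the thread and a placeholder client address (192.0.2.10 is a documentation address, not one from this thread); the exact output fields may differ between software versions:

```
show sticky database client 192.0.2.10
show sticky database group STICKY_PRIMARY
show serverfarm SFARM_PRIMARY detail
```

If the client's entry points at a real server that the serverfarm output reports as MAXCONNS, the expectation from the explanation above is that the next connection is rejected, the entry is removed, and a later connection is load balanced to another rserver with a fresh sticky entry.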

I hope this makes things clearer! Uff...

Regards,

Telmo