Cisco Support Community
Ask the Expert: Understanding and Troubleshooting ACE Loadbalancer

With Sivakumar Sukumar


Welcome to the Cisco Support Community Ask the Expert conversation. This is an opportunity to learn and ask questions about configuring and troubleshooting the Cisco Application Control Engine (ACE) load balancer with Sivakumar Sukumar. The Cisco ACE Application Control Engine Module for Cisco Catalyst 6500 Series Switches and Cisco 7600 Series Routers is a next-generation load-balancing and application-delivery solution. A member of the Cisco family of Data Center 3.0 solutions, the module:

  • Helps ensure business continuity by increasing application availability
  • Improves business productivity by accelerating application and server performance
  • Reduces data center power, space, and cooling needs through a virtualized architecture
  • Helps lower operational costs associated with application provisioning and scaling

Sivakumar Sukumar is an experienced support engineer with the High Touch Technical Support content team, covering all Cisco content delivery network technologies including Cisco Application Control Engine (ACE), Cisco Wide Area Application Services (WAAS), Cisco Content Switching Module, Cisco Content Services Switches, and other content products. He has been with Cisco for more than 2 years, working with major customers to help resolve their issues related to content products. He holds CCNP and DCASI certifications.

Remember to use the rating system to let Sivakumar know if you have received an adequate response.

Sivakumar might not be able to answer each question due to the volume expected during this event. Remember that you can continue the conversation on the Data Center sub-community discussion forum shortly after the event. This event lasts through August 24, 2012. Visit this forum often to view responses to your questions and the questions of other community members.

51 REPLIES

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi,

What are the best steps/commands for reviewing the ACE configuration when troubleshooting load-balancing issues?

For example, if I know the VIP address, how can I find out which real servers and policies are associated with it, so that I can check everything configured for that one VIP?

Thanks

Ajay

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Ajay,

There is a handy CLI command that pulls out the parts of the ACE running-config relevant to a class-map. It appears to be available in some 3.x versions and in 4.x versions and above. In some code versions it is hidden.

show running-config filter [policy-map name] [class-map name]

This will parse the running config for the configuration that is applicable to the policy-map and class-map that is specified.
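On code versions where the filter command is not available, the same VIP-to-real-server chain can be followed offline in a saved copy of the running-config. The following is a minimal Python sketch of that idea, not a Cisco tool: the function name `trace_vip` and the sample text are made up for illustration, and the sketch assumes the config keeps its usual indentation (top-level lines start in column 0) and that each class and policy references a single load-balance policy and serverfarm.

```python
from collections import defaultdict

# Hypothetical sample in the style of the configs posted in this thread.
SAMPLE_CONFIG = """\
rserver host SERVER_01
  ip address 10.10.50.3
  inservice
rserver host SERVER_02
  ip address 10.10.50.4
  inservice
serverfarm host REAL_SERVERS
  rserver SERVER_01 80
    inservice
  rserver SERVER_02 80
    inservice
class-map match-all VIP
  2 match virtual-address 10.10.40.20 any
policy-map type loadbalance first-match SLB_LOGIC
  class class-default
    serverfarm REAL_SERVERS
policy-map multi-match CLIENT_VIPS
  class VIP
    loadbalance vip inservice
    loadbalance policy SLB_LOGIC
"""


def trace_vip(config, vip):
    """Map each class-map matching `vip` to its policy, serverfarm and rservers."""
    rserver_ip = {}                   # rserver name -> IP address
    farm_members = defaultdict(list)  # serverfarm name -> rserver names
    vip_classes = []                  # class-maps whose virtual-address is `vip`
    policy_farm = {}                  # loadbalance policy -> serverfarm
    class_policy = {}                 # class under a multi-match policy -> LB policy
    section = None                    # (kind, name) of the current top-level block
    current_class = None

    for raw in config.splitlines():
        words = raw.split()
        if not words:
            continue  # tolerate blank lines between config lines
        if not raw[0].isspace():
            # A line starting in column 0 opens a new top-level block.
            current_class = None
            if words[0] in ("rserver", "serverfarm", "class-map"):
                section = (words[0], words[-1])
            elif words[:3] == ["policy-map", "type", "loadbalance"]:
                section = ("lb-policy", words[-1])
            elif words[:2] == ["policy-map", "multi-match"]:
                section = ("multi-match", words[-1])
            else:
                section = None
            continue
        if section is None:
            continue
        kind, name = section
        if kind == "rserver" and words[:2] == ["ip", "address"]:
            rserver_ip[name] = words[2]
        elif kind == "serverfarm" and words[0] == "rserver":
            farm_members[name].append(words[1])
        elif kind == "class-map" and "virtual-address" in words:
            if words[words.index("virtual-address") + 1] == vip:
                vip_classes.append(name)
        elif kind == "lb-policy" and words[0] == "serverfarm":
            policy_farm[name] = words[1]
        elif kind == "multi-match":
            if words[0] == "class":
                current_class = words[1]
            elif words[:2] == ["loadbalance", "policy"] and current_class:
                class_policy[current_class] = words[2]

    result = {}
    for cls in vip_classes:
        policy = class_policy.get(cls)
        farm = policy_farm.get(policy)
        members = farm_members.get(farm, [])
        result[cls] = {
            "policy": policy,
            "serverfarm": farm,
            "rservers": [(r, rserver_ip.get(r)) for r in members],
        }
    return result


if __name__ == "__main__":
    for cls, info in trace_vip(SAMPLE_CONFIG, "10.10.40.20").items():
        print(cls, info)
```

Calling `trace_vip` with the text of a saved "show running-config" and the VIP address returns one entry per matching class-map, which is enough to see at a glance which serverfarm and real servers sit behind that VIP.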

Regards,

Siva

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Thanks Siva.. It works for me

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva,

I have a Cisco 6500 and Cisco ACE 4710 with the following configuration/connection.

Cisco 6500's 6/29 (vlan 694 for mgmt) connects to ACE 4710's 1/1

Cisco 6500's 6/30 (vlan 697 for client-side) connects to ACE 4710's 1/2

Cisco 6500's 6/31 (vlan 698 for server-side) connects to ACE 4710's 1/3

***********************Cisco 6500***********************

interface GigabitEthernet6/29
 description ACE4710 (Mgmt/Int 1/1)
 switchport
 switchport access vlan 694
 no ip address
 no cdp enable
!
interface GigabitEthernet6/30
 description ACE4710 (Int 1/2)
 switchport
 switchport access vlan 697
 no ip address
 no cdp enable
!
interface GigabitEthernet6/31
 description ACE4710 (Int 1/3)
 switchport
 switchport access vlan 698
 no ip address
 no cdp enable
!
interface Vlan694
 ip address 10.78.2.1 255.255.255.248
interface Vlan697
 ip address 10.10.40.1 255.255.255.0
!
interface Vlan698
 ip address 10.10.50.1 255.255.255.0

***********************ACE 4710***********************

ACE4710/Admin# show run
Generating configuration....

boot system image:c4710ace-mz.A3_2_1.bin
boot system image:c4710ace-mz.A1_8_0a.bin

hostname ACE4710
interface gigabitEthernet 1/1
   switchport access vlan 694
   no shutdown
interface gigabitEthernet 1/2
   description Client-side
   switchport access vlan 697
   no shutdown
interface gigabitEthernet 1/3
   description Server-side
   switchport access vlan 698
   no shutdown
interface gigabitEthernet 1/4
   shutdown

access-list ALL line 8 extended permit ip any any

class-map type management match-any remote_access
   2 match protocol xml-https any
   3 match protocol icmp any
   4 match protocol telnet any
   5 match protocol ssh any
   6 match protocol http any
   7 match protocol https any
   8 match protocol snmp any

policy-map type management first-match remote_mgmt_allow_policy
   class remote_access
     permit

interface vlan 694
   ip address 10.78.2.2 255.255.255.248
   access-group input ALL
   service-policy input remote_mgmt_allow_policy
   no shutdown
interface vlan 697
   ip address 10.10.40.2 255.255.255.0
   fragment chain 112
   access-group input ALL
   no shutdown
interface vlan 698
   ip address 10.10.50.2 255.255.255.0
   fragment chain 112
   access-group input ALL
   no shutdown

ip route 0.0.0.0 0.0.0.0 10.78.2.1

I will assign other ports to the associated VLANs for clients and servers on the Cisco 6500. Is this a valid setup/configuration?

If not, what should I change? How can I make sure that the client traffic and server traffic are handled by the ACE 4710? Any suggestions for the configuration?

Thanks a lot.

Philip

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Philip,

Thanks for your question.

The configuration looks good for a basic management setup.

Here is a sample configuration for client-to-server communication through the ACE.

rserver host SERVER_01
  ip address 10.10.50.x
  inservice
rserver host SERVER_02
  ip address 10.10.50.x
  inservice

serverfarm host REAL_SERVERS
  rserver SERVER_01
    inservice
  rserver SERVER_02
    inservice

class-map match-all VIP
  2 match virtual-address 10.10.40.x any

policy-map type loadbalance first-match SLB_LOGIC
  class class-default
    serverfarm REAL_SERVERS
policy-map multi-match CLIENT_VIPS
  class VIP
    loadbalance vip inservice
    loadbalance policy SLB_LOGIC
    loadbalance vip icmp-reply active

interface vlan 697
  service-policy input CLIENT_VIPS
  no shutdown

Also, here is a guide to setting up basic server load balancing, with step-by-step configuration.

http://docwiki.cisco.com/wiki/Cisco_ACE_4700_Series_Appliance_Quick_Start_Guide,_Release_A3(1.0)_--_Configuring_Server_Load_Balancing

Let me know if you have any questions.

Regards,

Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

First, thank you for the quick note, Siva.

However, I am not able to browse to or ping the VIP web server from the client now.

VIP: 10.10.40.20

Real Web Server IP: 10.10.50.3

Client IP on vlan 697: 10.10.40.3

Here is what I have now.

***********************ACE 4710***********************

hostname ACE4710
interface gigabitEthernet 1/1
  switchport access vlan 694
  no shutdown
interface gigabitEthernet 1/2
  description Client-side
  switchport access vlan 697
  no shutdown
interface gigabitEthernet 1/3
  description Server-side
  switchport access vlan 698
  no shutdown
interface gigabitEthernet 1/4
  shutdown

access-list ALL line 8 extended permit ip any any

probe http 1
  interval 15
  passdetect interval 60
  request method get url http://10.10.50.3
  open 10

rserver host SERVER_01
  ip address 10.10.50.3
  conn-limit max 4000000 min 4000000
  inservice
rserver host SERVER_02
  ip address 10.10.50.4
  conn-limit max 4000000 min 4000000
  inservice

serverfarm host REAL_SERVERS
  probe 1
  rserver SERVER_01 80
    conn-limit max 4000000 min 4000000
    inservice
  rserver SERVER_02 80
    conn-limit max 4000000 min 4000000
    inservice

class-map match-all VIP
  2 match virtual-address 10.10.40.20 any
class-map match-all VIP2
  2 match virtual-address 10.10.40.20 tcp eq www
class-map type management match-any remote_access
  2 match protocol xml-https any
  3 match protocol icmp any
  4 match protocol telnet any
  5 match protocol ssh any
  6 match protocol http any
  7 match protocol https any
  8 match protocol snmp any

policy-map type management first-match remote_mgmt_allow_policy
  class remote_access
    permit
policy-map type loadbalance first-match SLB_LOGIC
  class class-default
    serverfarm REAL_SERVERS
policy-map type loadbalance first-match VIP2-l7slb
  class class-default
    serverfarm REAL_SERVERS
policy-map multi-match CLIENT_VIPS
  class VIP
    loadbalance vip inservice
    loadbalance policy SLB_LOGIC
    loadbalance vip icmp-reply active
  class VIP2
    loadbalance vip inservice
    loadbalance policy VIP2-l7slb

interface vlan 694
  ip address 10.78.2.2 255.255.255.248
  access-group input ALL
  service-policy input remote_mgmt_allow_policy
  no shutdown
interface vlan 697
  ip address 10.10.40.2 255.255.255.0
  fragment chain 112
  access-group input ALL
  service-policy input CLIENT_VIPS
  no shutdown
interface vlan 698
  ip address 10.10.50.2 255.255.255.0
  fragment chain 112
  access-group input ALL
  no shutdown

ip route 0.0.0.0 0.0.0.0 10.78.2.1

Thanks.
Philip

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Philip,

Are you able to ping the gateway, the servers, and the 6500 SVIs from the ACE? Can you send me the output of "show service-policy detail" and "show arp"?

Regards,

Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva,

With the above configuration, I can ping the server-side gateway (IP: 10.10.50.1) and the client-side gateway (IP: 10.10.40.1) from the ACE. However, I can't ping the ACE interfaces (IP: 10.10.50.2 and IP: 10.10.40.2) from the 6500. I thought ICMP was allowed by the above configuration.

ACE4710/Admin# show service-policy detail

Policy-map : CLIENT_VIPS

Status     : ACTIVE

Description: -

-----------------------------------------

Interface: vlan 1 697

  service-policy: CLIENT_VIPS

    class: VIP

     VIP Address:    Protocol:  Port:

     10.10.40.20     any

      loadbalance:

        L7 loadbalance policy: SLB_LOGIC

        VIP ICMP Reply       : ENABLED-WHEN-ACTIVE

        VIP state: OUTOFSERVICE

        Persistence Rebalance: DISABLED

        curr conns       : 0         , hit count        : 24       

        dropped conns    : 24       

        client pkt count : 36        , client byte count: 1728               

        server pkt count : 0         , server byte count: 0                  

        conn-rate-limit      : 0         , drop-count : 0        

        bandwidth-rate-limit : 0         , drop-count : 0        

        L7 Loadbalance policy : SLB_LOGIC

          class/match : class-default

            LB action :

               primary serverfarm: REAL_SERVERS

                    state: DOWN

                backup serverfarm : -

            hit count        : 24       

            dropped conns    : 0        

            compression      : off

      compression:

        bytes_in  : 0                  

        bytes_out : 0                  

        Compression ratio : 0.00%

    class: VIP2

     VIP Address:    Protocol:  Port:

     10.10.40.20     tcp        eq    80  

      loadbalance:

        L7 loadbalance policy: VIP2-l7slb

        VIP ICMP Reply       : ENABLED-WHEN-ACTIVE

        VIP state: OUTOFSERVICE

        Persistence Rebalance: DISABLED

        curr conns       : 0         , hit count        : 0        

        dropped conns    : 0        

        client pkt count : 0         , client byte count: 0                  

        server pkt count : 0         , server byte count: 0                  

        conn-rate-limit      : 0         , drop-count : 0        

        bandwidth-rate-limit : 0         , drop-count : 0        

        L7 Loadbalance policy : VIP2-l7slb

          class/match : class-default

            LB action :

               primary serverfarm: REAL_SERVERS

                    state: DOWN

                backup serverfarm : -

            hit count        : 0        

            dropped conns    : 0        

            compression      : off

      compression:

        bytes_in  : 0                  

        bytes_out : 0                  

        Compression ratio : 0.00%

ACE4710/Admin# show arp

Context Admin
================================================================================
IP ADDRESS      MAC-ADDRESS        Interface  Type      Encap  NextArp(s) Status
================================================================================
10.78.2.1       00.12.da.10.3c.0a  vlan694    GATEWAY   5      38 sec     up
10.78.2.2       00.1b.24.3d.bc.8c  vlan694    INTERFACE LOCAL  _          up
10.10.40.1      00.12.da.10.3c.0a  vlan697    LEARNED   8      10629 sec  up
10.10.40.2      00.1b.24.3d.bc.8c  vlan697    INTERFACE LOCAL  _          up
10.10.40.3      00.50.56.94.2f.ba  vlan697    LEARNED   7      10589 sec  up
10.10.40.20     00.1b.24.3d.bc.8c  vlan697    VSERVER   LOCAL  _          up
10.10.50.1      00.12.da.10.3c.0a  vlan698    LEARNED   9      4601 sec   up
10.10.50.2      00.1b.24.3d.bc.8c  vlan698    INTERFACE LOCAL  _          up
10.10.50.3      00.50.56.93.00.db  vlan698    RSERVER   10     235 sec    up
10.10.50.4      00.00.00.00.00.00  vlan698    RSERVER   -      * 1 req    dn
================================================================================
Total arp entries 10

Thanks.
Philip

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Philip,

The VIP state is OUTOFSERVICE.

Can you remove the probe from serverfarm and check if the VIP changes to INSERVICE and serverfarm comes UP?

serverfarm host REAL_SERVERS
  probe 1      <<<<<<<<<<<<<< REMOVE>>>>>>>>>>>>>>

VIP state: OUTOFSERVICE   <<<<<<<<<<<<<<<
Persistence Rebalance: DISABLED
curr conns : 0 , hit count : 24
dropped conns : 24
client pkt count : 36 , client byte count: 1728
server pkt count : 0 , server byte count: 0
conn-rate-limit : 0 , drop-count : 0
bandwidth-rate-limit : 0 , drop-count : 0
L7 Loadbalance policy : SLB_LOGIC
  class/match : class-default
    LB action :
      primary serverfarm: REAL_SERVERS
        state: DOWN  <<<<<<<<<<<<<<<<<<<<

Once the VIP is INSERVICE, check whether you can ping the VIP (10.10.40.20) from the 6500.

If it still shows OUTOFSERVICE after you remove the probe, send me the output of "show serverfarm detail" and "show rserver detail".

Also, can you check whether you are able to ping 10.78.2.2 from the 6500?
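The fields highlighted above (VIP state, hit count, dropped conns, serverfarm state) can also be pulled out of a saved "show service-policy detail" capture with a short script. This is only a rough Python sketch under assumptions: the function name `summarize` is made up, the sample text is an abbreviated copy of the output in this thread, and the field labels may differ on other ACE releases. It keeps the first value seen per class, because "hit count" and "dropped conns" appear again under the L7 policy section.

```python
import re

# Abbreviated sample based on the output posted in this thread.
SAMPLE_OUTPUT = """\
    class: VIP
        VIP state: OUTOFSERVICE
        curr conns       : 0         , hit count        : 24
        dropped conns    : 24
          class/match : class-default
               primary serverfarm: REAL_SERVERS
                    state: DOWN
            hit count        : 24
            dropped conns    : 0
    class: VIP2
        VIP State: INSERVICE
        curr conns       : 0         , hit count        : 0
        dropped conns    : 0
               primary serverfarm: REAL_SERVERS
                    state: UP
"""


def summarize(output):
    """Return {class name: {vip_state, hit_count, dropped_conns, serverfarm_state}}."""
    classes = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        m = re.match(r"class:\s*(\S+)", line)
        if m:
            current = classes.setdefault(m.group(1), {})
            continue
        if current is None:
            continue
        # setdefault keeps the first (class-level) value and ignores repeats.
        m = re.match(r"VIP state:\s*(\S+)", line, re.IGNORECASE)
        if m:
            current.setdefault("vip_state", m.group(1))
        m = re.search(r"hit count\s*:\s*(\d+)", line)
        if m:
            current.setdefault("hit_count", int(m.group(1)))
        m = re.match(r"dropped conns\s*:\s*(\d+)", line)
        if m:
            current.setdefault("dropped_conns", int(m.group(1)))
        m = re.match(r"state:\s*(\S+)", line)
        if m:
            current.setdefault("serverfarm_state", m.group(1))
    return classes


def out_of_service(summary):
    """List the classes whose VIP is not INSERVICE."""
    return [name for name, s in summary.items() if s.get("vip_state") != "INSERVICE"]


if __name__ == "__main__":
    summary = summarize(SAMPLE_OUTPUT)
    print(summary)
    print("Needs attention:", out_of_service(summary))
```

A class where the hit count and dropped conns are equal and the serverfarm state is DOWN, as for class VIP here, suggests that every connection is being dropped because no real server is available.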

Regards,
Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva,

I can ping the VIP (IP: 10.10.40.20) from the 6500 and from the client side (IP: 10.10.40.3), but I still can't reach either web server through the VIP when I browse to http://10.10.40.20 from the client. I even see a reset in the pcap.

I can browse to both web servers directly without an issue, though.

ACE4710/Admin# show service-policy detail

Policy-map : CLIENT_VIPS

Status     : ACTIVE

Description: -

-----------------------------------------

Interface: vlan 1 697

  service-policy: CLIENT_VIPS

    class: VIP

     VIP Address:    Protocol:  Port:

     10.10.40.20     any

      loadbalance:

        L7 loadbalance policy: SLB_LOGIC

        VIP ICMP Reply       : ENABLED-WHEN-ACTIVE

        VIP State: INSERVICE

        Persistence Rebalance: DISABLED

        curr conns       : 0         , hit count        : 28       

        dropped conns    : 28       

        client pkt count : 42        , client byte count: 2016               

        server pkt count : 0         , server byte count: 0                  

        conn-rate-limit      : 0         , drop-count : 0        

        bandwidth-rate-limit : 0         , drop-count : 0        

        L7 Loadbalance policy : SLB_LOGIC

          class/match : class-default

            LB action :

               primary serverfarm: REAL_SERVERS

                    state: UP

                backup serverfarm : -

            hit count        : 28       

            dropped conns    : 0        

            compression      : off

      compression:

        bytes_in  : 0                  

        bytes_out : 0                  

        Compression ratio : 0.00%

    class: VIP2

     VIP Address:    Protocol:  Port:

     10.10.40.20     tcp        eq    80  

      loadbalance:

        L7 loadbalance policy: VIP2-l7slb

        VIP ICMP Reply       : ENABLED-WHEN-ACTIVE

        VIP State: INSERVICE

        Persistence Rebalance: DISABLED

        curr conns       : 0         , hit count        : 0        

        dropped conns    : 0        

        client pkt count : 0         , client byte count: 0                  

        server pkt count : 0         , server byte count: 0                  

        conn-rate-limit      : 0         , drop-count : 0        

        bandwidth-rate-limit : 0         , drop-count : 0        

        L7 Loadbalance policy : VIP2-l7slb

          class/match : class-default

            LB action :

               primary serverfarm: REAL_SERVERS

                    state: UP

                backup serverfarm : -

            hit count        : 0        

            dropped conns    : 0        

            compression      : off

      compression:

        bytes_in  : 0                  

        bytes_out : 0                  

        Compression ratio : 0.00%

ACE4710/Admin# show serverfarm detail

serverfarm     : REAL_SERVERS, type: HOST

total rservers : 2

active rservers: 2

description    : -

state          : ACTIVE

predictor      : ROUNDROBIN

failaction     : -

back-inservice    : 0

partial-threshold : 0

num times failover       : 5

num times back inservice : 7

total conn-dropcount : 0

---------------------------------

                                                ----------connections-----------

       real                  weight state        current    total      failures

   ---+---------------------+------+------------+----------+----------+---------

   rserver: SERVER_01

       10.10.50.3:80         8      OPERATIONAL  0          0          18

         max-conns            : 4000000   , out-of-rotation count : 0

         min-conns            : 4000000  

         conn-rate-limit      : -         , out-of-rotation count : -

         bandwidth-rate-limit : -         , out-of-rotation count : -

         retcode out-of-rotation count : -

   rserver: SERVER_02

       10.10.50.4:80         8      OPERATIONAL  0          0          0

         max-conns            : 4000000   , out-of-rotation count : 0

         min-conns            : 4000000  

         conn-rate-limit      : -         , out-of-rotation count : -

         bandwidth-rate-limit : -         , out-of-rotation count : -

         retcode out-of-rotation count : -

ACE4710/Admin# show rserver detail

rserver              : SERVER_01, type: HOST

state                : OPERATIONAL (verified by arp response)

description          : -

max-conns            : 4000000   ,  out-of-rotation count  : 0

min-conns            : 4000000  

conn-rate-limit      : -         ,  out-of-rotation count  : -

bandwidth-rate-limit : -         ,  out-of-rotation count  : -

weight               : 8

---------------------------------

                                                ----------connections-----------

       real                  weight state        current    total              

   ---+---------------------+------+------------+----------+--------------------

   serverfarm: REAL_SERVERS

       10.10.50.3:80         8      OPERATIONAL  0          0                  

         max-conns            : 4000000   ,  out-of-rotation count  : 0

         min-conns            : 4000000  

         conn-rate-limit      : -         ,  out-of-rotation count  : -

         bandwidth-rate-limit : -         ,  out-of-rotation count  : -

         total conn-failures  : 18

rserver              : SERVER_02, type: HOST

state                : OPERATIONAL (verified by arp response)

description          : -

max-conns            : 4000000   ,  out-of-rotation count  : 0

min-conns            : 4000000  

conn-rate-limit      : -         ,  out-of-rotation count  : -

bandwidth-rate-limit : -         ,  out-of-rotation count  : -

weight               : 8

---------------------------------

                                                ----------connections-----------

       real                  weight state        current    total              

   ---+---------------------+------+------------+----------+--------------------

   serverfarm: REAL_SERVERS

       10.10.50.4:80         8      OPERATIONAL  0          0                  

         max-conns            : 4000000   ,  out-of-rotation count  : 0

         min-conns            : 4000000  

         conn-rate-limit      : -         ,  out-of-rotation count  : -

         bandwidth-rate-limit : -         ,  out-of-rotation count  : -

         total conn-failures  : 0

Thanks.
Philip

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Philip,

Was this capture taken from the client? If so, can you apply NAT on the ACE and see if it works?


policy-map multi-match CLIENT_VIPS
  class VIP
    loadbalance vip inservice
    loadbalance policy SLB_LOGIC
    loadbalance vip icmp-reply active
    nat dynamic 1 vlan 698     <<<<<<<<<<< ADD >>>>>>>>>>>

interface vlan 698
  nat-pool 1 10.10.50.10 10.10.50.10 netmask 255.255.255.255 pat   <<<<<<<<<<< ADD >>>>>>>>>>>

Regards,
Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva,

Yes, the pcap is from the client. It works now. Could you explain why we need the NAT here?

The client (IP: 10.10.40.3) sends a GET request to the ACE's VIP (IP: 10.10.40.20). The ACE forwards it to the real servers (IP: 10.10.50.3, 10.10.50.4). The real servers reply to the NAT address (IP: 10.10.50.10), and the ACE maps that back to the VIP and on to the client?

Where does the NAT address (IP: 10.10.50.10) come into play? What's the logic?

Thanks.
Philip

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Philip,

That's correct.

The problem was asymmetric routing: the server replied directly to the client, bypassing the ACE.

The trick is to get return traffic from the real server to go back through the ACE; this is achieved with source NAT. We create a NAT pool on the ACE, and when a user hits the VIP, the user's address is translated to one in the pool. The real server sees a source address from the pool and knows that this address resides on the ACE, so it replies to the ACE. The ACE then translates the address back to the user's real address and forwards the response.

Another option is to change the routing on the server so that it always responds through the ACE instead of replying directly to the client.
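The return-path decision described above can be sketched in a few lines of Python. This is an illustration only, using the addresses from this thread; it assumes the real server's default gateway is the 6500 SVI (10.10.50.1), which the thread implies but does not state, and the function names are made up.

```python
import ipaddress

SERVER_NET = ipaddress.ip_network("10.10.50.0/24")  # server-side VLAN 698
SVI_GATEWAY = "10.10.50.1"                          # 6500 SVI (assumed default gateway)
# Addresses on VLAN 698 that the ACE answers for: its interface and the NAT pool.
ACE_ADDRESSES = {
    ipaddress.ip_address("10.10.50.2"),
    ipaddress.ip_address("10.10.50.10"),
}


def reply_next_hop(source_seen_by_server, default_gw=SVI_GATEWAY):
    """Where the server sends its reply: on-link if local, else the default gateway."""
    destination = ipaddress.ip_address(source_seen_by_server)
    if destination in SERVER_NET:
        return destination
    return ipaddress.ip_address(default_gw)


def returns_via_ace(source_seen_by_server, default_gw=SVI_GATEWAY):
    """True if the server's reply is handed to the ACE rather than the 6500."""
    return reply_next_hop(source_seen_by_server, default_gw) in ACE_ADDRESSES


if __name__ == "__main__":
    # Without source NAT the server sees the real client address.
    print("no NAT:        ", returns_via_ace("10.10.40.3"))
    # With source NAT the server sees the pool address, which is on-link.
    print("source NAT:    ", returns_via_ace("10.10.50.10"))
    # Alternative: keep the client address but point the server's gateway at the ACE.
    print("gateway = ACE: ", returns_via_ace("10.10.40.3", default_gw="10.10.50.2"))
```

In the first case the reply goes to the 6500, which routes it straight to the client. The client then receives a packet from 10.10.50.3 instead of from the VIP it connected to, which is consistent with the reset seen in the capture.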

Regards,
Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Thank you for the information, Siva.

So does it mean putting ip route 10.10.50.0 255.255.255.0 10.10.50.2 on the 6500 will take care of it then?

Thanks.
Philip

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Philip,

It has to be done on the server itself. Changing the server's default gateway to the ACE's IP address should work here.

Regards,
Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Is the ACE load balancer able, or will it be able, to handle the WebSocket protocol as defined in RFC 6455?

How should stickiness be handled in this area? My own lab experiments show that IP-based stickiness works with ACE software version A4(1.0), but session-ID-based stickiness is not possible.

Cisco Employee

Re: Ask the Expert: Understanding and Troubleshooting ACE Loadba

Hi,

Thanks for your question.

There are no immediate plans to support WebSocket on the ACE, and no roadmap is available yet. From previously documented cases and from cases I have handled personally, I can tell you that one requirement seems to be very important for WebSocket traffic: stickiness.

WebSocket requires stickiness so that all connections from a single user stick to one server. This is particularly effective (and sometimes strictly necessary) when the application requires user authentication, because otherwise traffic would bounce between two or more servers.

The type of stickiness that you implement depends entirely on your network requirements.

Since the ACE has no specific knowledge of the WebSocket protocol, it cannot do deeper protocol inspection, but it seems to work for generic connection-based Layer 3 and Layer 4 load balancing, which I believe you have already tested in your lab.

You can also get in touch with your Cisco contact and share the use case and further details so that they can assist with your requirement.

Regards,

Siva

New Member

Re: Ask the Expert: Understanding and Troubleshooting ACE Loadba

Hi Siva,

Good to see the ACE discussion in the Experts Corner. My question is whether there is any permanent fix for CSCsz65679, which causes the ACE-20 to crash a couple of times a year. I have noticed that neither an RMA nor an image upgrade fixes the problem. One of our customers had tens of ACE-20s, and neither RMAs nor upgrades fixed the 'NP Control Store Parity Error'; so far they have observed around 10 ACE-20 crashes in total, on different modules, over 3 years. The upgrades only reduce the crash frequency, probably because the explicit reload during an upgrade refreshes all the buffers.

I believe this might be an issue with the ACE-20 architecture. Similar issues have not been observed on the ACE-30.

Regards,

Akhtar

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Akhtar,

Thanks for your question.

Sorry that your customer had to experience so many crashes due to the parity issue.

First, let me explain a few things about SRAM. The SRAM parity error reported in the core file is not due to a software issue. It is the result of a "bit-flip" within the SRAM itself, which can occur as a result of environmental conditions. The bit-flip is cleared by a simple reboot of the system, which occurs when the core file is generated. Our testing has shown that these issues occur with very low frequency. If a particular module experiences a significantly higher failure rate and you are running a version that has all the available workarounds for CSCsz65679, then a proactive RMA could be in order.

The ACE20 is susceptible to this because it uses SRAM to store control information and packet data, as opposed to scratch-pad storage. Almost any 1-bit flip will be detected as a parity error.
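To illustrate why a single flipped bit shows up as a parity error, here is a generic Python sketch of single-bit even parity. It is not a description of the ACE20's actual hardware parity scheme, which this thread does not detail; the function names and the 8-bit sample word are made up for illustration.

```python
def parity_bit(word):
    """Even-parity bit: 1 if the word has an odd number of 1 bits, else 0."""
    return bin(word).count("1") % 2


def store(word):
    """Model a memory cell that keeps a parity bit alongside the data word."""
    return (word, parity_bit(word))


def check(cell):
    """Return True if the stored data still agrees with its parity bit."""
    word, parity = cell
    return parity_bit(word) == parity


if __name__ == "__main__":
    cell = store(0b10110010)

    # One bit flips in the data word: the parity no longer matches.
    one_flip = (cell[0] ^ (1 << 3), cell[1])

    # Two bits flip: simple parity cannot see it, because the count of
    # 1 bits changes by an even amount.
    two_flips = (cell[0] ^ 0b11, cell[1])

    print("intact:    ", check(cell))
    print("one flip:  ", check(one_flip))
    print("two flips: ", check(two_flips))
```

The check only detects the error; it cannot say which bit flipped or restore the value, which is why the recovery described here is a reload rather than an in-place correction.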

Unfortunately, SRAMs are very sensitive to light, dust, radiation, shock, temperature, and so on, so it is possible to get an SRAM parity error on a healthy ACE.

You are right about the ACE30: neither the ACE4710 nor the ACE30 is affected by these issues, as their design does not use SRAM or Nitrox.

Also note that we have EOL notice for ACE10/20:

http://www.cisco.com/en/US/prod/collateral/modules/ps2706/end_of_life_c51-674430.html

Regards,

Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva

I have two ACE modules, and the standby one reloaded suddenly. How can I find out the cause?

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi,

Thanks for your question.

I understand the standby ACE had an unexpected reload. Do you see any crash info generated under "dir core:" after the reload? If so, please send those files to me so that I can determine the reason for the reload.

Otherwise, you can raise a TAC case and attach the following information so that we can determine the root cause.

1- 'show tech' on the switch

2- 'show tech' on the Admin context on the ACE

3- Logs on the switch covering the period when the reload happened.

4- Crash files from the ACE located under "dir core:"

Let me know if you have any questions.

Regards,

Siva

New Member

Re: Ask the Expert: Understanding and Troubleshooting ACE Loadba

Hi Siva

I attached the required files, but regarding the crash info, I didn't find any for the reload date (18 Aug 2012 11:41 PM).

Thanks Siva

Best Regards

Mohamed Abd EL Razik

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi,

Thanks for providing the data.

This looks like a silent reboot in which the SUP initiated the reload.

However, the information doesn't really explain why it happened. Silent reboots are tricky because they don't leave much data to work with.

Here is the defect that we logged to track the silent reboot. A software upgrade will most likely be necessary, as a few bugs related to silent reloads have been fixed in A2(3.3) (the current version is A2(3.5)), followed by monitoring the device.

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCsy91540

There is an action plan to determine whether the reload was related to L7 traffic or to management traffic (ANM, XML, SNMP, and so on) filling up resources on the ACE.

I can send you the detailed action plan via PM if required.

Let me know if you have any questions.

Regards,
Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva,

Just a note on the versions available. We recently appear to have run into the following bug and had to downgrade to version A2(3.3), as removing our HTTP health probes did not seem like a workable solution for us.

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtz47825

Once downgraded, the paired modules stabilised (they no longer reloaded continuously). Both modules had been in this state.

Just thought I would provide some input.

Thanks.

Paul

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Paul,

Thanks for your question.

It's good to know that the devices are stable after downgrading to A2(3.3). I was able to track down the TAC case you recently opened for this issue.

Looking into the bug, this issue has mainly been reported on version A2(3.5), and we are working on reproducing it on different code versions to find the reason for the memory corruption.

We will have the fix once we successfully reproduce the problem; the bug has been updated with A2(3.7) as the fixed version.

Let me know if you have any questions.

Regards,
Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva

Thanks for the information.

Kindly send me the detailed action plan to determine whether this was related to L7 traffic or to management traffic (ANM, XML, SNMP).

Regards,

Mohamed

Cisco Employee

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Mohamed,

Sent you the information via PM. Please check.

Regards,
Siva

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Thanks Siva for your support

Regards

Mohamed

New Member

Ask the Expert: Understanding and Troubleshooting ACE Loadbalanc

Hi Siva,

I am attaching the running-config of the ACE that is currently under test in the lab.

As you can see, VLAN 20 is configured on the client side and VLAN 30 is configured on the server side.

I am not able to ping the ACE interface IP address 2092:dead:beef:cafe::3 from the Cisco switch (7k) whose interface is connected to the ACE on VLAN 20.

Is this normal behavior, or is there a configuration mistake?

Thanks !!

hostname ACE-4710

interface gigabitEthernet 1/1

  description *** Interface connecting to the UUT-Switch-7k (WS-C7206X) ***

  switchport access vlan 20

  no shutdown

interface gigabitEthernet 1/2

  description *** Interface connecting to the serverfarm ***

  switchport access vlan 30

  no shutdown

interface gigabitEthernet 1/3

  description *** UNUSED***

  no shutdown

interface gigabitEthernet 1/4

  description *** UNUSED***

  no shutdown

access-list everyone  extended permit ip any any

access-list everyone  extended permit pim any any

access-list everyone  extended permit icmp any any

rserver host CNR

ip address 2092:dead:beef:cafe::90

inservice

rserver host CNR-IPv4

ip address 172.27.167.13

inservice

rserver host NMS

ip address 2092:dead:beef:cafe::999

inservice

serverfarm host LABSERVERS

rserver CNR

inservice

rserver CNR-IPv4

inservice

rserver NMS

inservice

! Layer-3 Traffic

class-map type management match-any MGMT

match protocol telnet any

match protocol https any

match protocol http any

match protocol xml-https any

match protocol ssh any

match protocol icmp any

! Layer-4 Traffic

class-map match-all slb-vip-LABSERVERS

match virtual-address 2092:dead:beef:cafe::1 any

! Layer-3 Class-Map defining source traffic. This traffic matches server initiated

policy-map type management first-match MGMT_POLICY

class MGMT

permit

policy-map type loadbalance first-match LB_POLICY_LABSERVERS

class class-default

serverfarm LABSERVERS

policy-map multi-match CLIENT-VIPS_LABSERVERS

class slb-vip-LABSERVERS

loadbalance vip inservice

loadbalance policy LB_POLICY_LABSERVERS

loadbalance vip icmp-reply active

loadbalance vip advertise active

interface vlan 20

  description "Client Interface"

  bridge-group 1

  access-group input everyone

  service-policy input CLIENT-VIPS_LABSERVERS

  service-policy input MGMT_POLICY

  no shutdown

interface vlan 30

  description "Server Farm"

  bridge-group 1

  service-policy input CLIENT-VIPS_LABSERVERS

  service-policy input MGMT_POLICY

  no shutdown

interface bvi 1

  ipv6 enable

  ip address 2092:dead:beef:cafe::3/64

  description "Client-Server Bridge Group"

  no shutdown

ip route ::/0 2092:dead:beef:cafe::2

username admin password 5 $1$Hh4K/EuN$J9mu8qUJbebWixnC5Wxpo1  role Admin domain default-domain

username www password 5 $1$9yHPLof8$RZrtAsMV26WtOp/q8Ou8L.  role Admin domain default-domain

*******************************************************************

On the 7200 switch which is connecting to the ACE :

!
interface GigabitEthernet0/3
 description Connected to ACE-E1
 no ip address
 ip pim sparse-mode
 ip igmp version 3
 ip ospf 1 area 0
 shutdown
 duplex auto
 speed auto
 media-type rj45
 negotiation auto
 ipv6 enable
 ipv6 ospf 1 area 0
!
interface GigabitEthernet0/3.20
 encapsulation dot1Q 20 native
 ipv6 address 2093:DEAD:BEEF:CAFE::2/64
!
ipv6 route 2092:DEAD:BEEF:CAFE::/64 2092:DEAD:BEEF:CAFE::1

*********************************************************************************************************

I am setting this up for basic management first and will later enable more functionality on the ACE.

Please let me know if there are any mistakes or corrections that I need to make in the configuration.

Thanks !
