ACE: SharePoint trouble with sticky sessions

Apr 14th, 2011

Hello,

We are setting up our new SharePoint environment on our new ACE appliance. We have set up sticky sessions and traffic seems to be flowing properly, but we run into a problem when we take one of the nodes out of service: traffic doesn't fail over to the other server unless we restart IE (or whatever browser we are using). The session seems to be stuck to that one server and won't transition to the other. I'm not sure if that is by design, but it starts working again as soon as the server is put back in service. We have waited up to five minutes, but it still doesn't work. As a test we also decreased the sticky timeout to 1 minute, but that didn't help either. We have also configured persistence-rebalance.

parameter-map type http HTTP-PARAM
  case-insensitive
  persistence-rebalance
  set header-maxparse-length 8092
  length-exceed continue

!

sticky http-cookie serverfarm1 serverfarm1_cookie
  cookie insert browser-expire
  timeout 20
  replicate sticky
  serverfarm serverfarm1_sf

Any help would be greatly appreciated.

Thanks in advance.

yushimaz Thu, 04/14/2011 - 18:44

Have you configured 'failaction [purge|reassign]' on the serverfarm?

If not, please configure 'failaction [purge|reassign]' and check the behavior.

By default, the ACE doesn't purge entries from the connection table even when a server goes down, which means the ACE will continue to send packets to that server.

It's probably not a sticky issue, since the static sticky entries for cookie insert are removed when the rserver goes down.

Regards,

Yuji

Darren.sasso Fri, 04/15/2011 - 16:16

Thanks, Yuji. I tried both of those commands, but it's still not working. Any other suggestions?

Thanks.

yushimaz Sun, 04/17/2011 - 19:08

Hi Darren

Do you mean the ACE continued to send packets to the failed server even after you added the failaction config?

If so, please check the 'show conn' output to see whether the ACE keeps entries for the failed server. Check 'show conn' when you press F5 (page refresh) after the server goes down.

If the entries in 'show conn' are purged or reassigned but the problem still occurs, it may be an application-layer issue.

Also, please check 'show rserver'; the current connections counter should be 0 when the server goes down.
Regards,

Yuji

Darren.sasso Mon, 04/18/2011 - 16:00

Hi Yuji,

I can see that the current connections are dropped immediately when I bring the server offline. However, I don't see the connections pick up on the other server. I keep refreshing the page and still nothing. I did a show conn filtered on my source address and I don't see any connections, even when I refresh the page. If I close the browser it works as expected, but if I keep the browser open and hit refresh, it only works once I bring the server back in service.

Thanks.

yushimaz Tue, 04/19/2011 - 03:37

Hi Darren

If possible, please get capture trace between client and ACE.

When you type F5 key, does ACE increment the 'hit count' in service-policy?

How about total connections in 'show rserver'?
ACE4710a-yushimaz/c3# show service-policy client-vips
Status     : ACTIVE
-----------------------------------------
Interface: vlan 1 777
  service-policy: client-vips
    class: vip-http
      loadbalance:
        L7 loadbalance policy: sticky
        Regex dnld status    : QUEUED
        VIP ICMP Reply       : ENABLED
        VIP State: INSERVICE
        Persistence Rebalance: ENABLED
        curr conns       : 0         , hit count        : 2 <<==
        dropped conns    : 0
[snipped]
ACE4710a-yushimaz/c3# show rserver
rserver              : sv1, type: HOST
state                : OPERATIONAL (verified by arp response)
---------------------------------
                                                ----------connections-----------
       real                  weight state        current    total
   ---+---------------------+------+------------+----------+--------------------
   serverfarm: sf
       192.168.78.1:80       8      OPERATIONAL  0          1
rserver              : sv2, type: HOST
state                : OUTOFSERVICE
---------------------------------
                                                ----------connections-----------
       real                  weight state        current    total
   ---+---------------------+------+------------+----------+--------------------
   serverfarm: sf
       192.168.78.2:80       8      OUTOFSERVICE 0          1

When the browser is refreshed, the client sends an HTTP request to the ACE with the cookie. If the rserver associated with this cookie is down, the ACE should rebalance.

I confirmed this behavior with A3(2.7), and it worked as expected.

If you can telnet from the client device, please try the telnet access shown below. Your server may require additional headers; if so, these steps may fail.

## telnet to vip
client:/# telnet 192.168.77.100 80
Trying 192.168.77.100...
Connected to 192.168.77.100.
Escape character is '^]'.

## send get request with cookie
GET / HTTP/1.1
Host:
Cookie:ace=R3987003477

HTTP/1.1 200 OK
Set-Cookie: ace=R3987003477; path=/
Date: Tue, 19 Apr 2011 01:45:03 GMT
Server: Apache/1.3.34 (Debian)
Last-Modified: Tue, 01 Mar 2011 04:28:00 GMT
ETag: "2689f0-4-4d6c75d0"
Accept-Ranges: bytes
Content-Length: 4
Content-Type: text/html; charset=iso-8859-1

sv2

## force down rserver(sv2) and then send the same request
GET / HTTP/1.1
Host:
Cookie:ace=R3987003477

HTTP/1.1 200 OK
Set-Cookie: ace=R3986967540; path=/
Date: Tue, 19 Apr 2011 01:45:19 GMT
Server: Apache/1.3.34 (Debian)
Last-Modified: Thu, 24 Feb 2011 19:20:47 GMT
ETag: "2a27ba-4-4d66af8f"
Accept-Ranges: bytes
Content-Length: 4
Content-Type: text/html; charset=iso-8859-1

sv1

## ACE rebalanced and replied set-cookie for sv1.

Cookies associated with rserver are as below.
ACE4710a-yushimaz/c3# sh sticky cookie-insert group cookie
     Cookie   |        HashKey       |           rserver-instance
  ------------+----------------------+----------------------------------------+
  R3986967540 | 6899333988591634416  | sf/sv1:80
  R3987003477 | 981065466532684257   | sf/sv2:80
ACE4710a-yushimaz/c3#
Regards,
Yuji
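
Editor's note: the telnet test in this post can also be scripted. The following is a minimal Python sketch, not something posted in the thread. It sends a GET carrying the ACE cookie and reports whether the Set-Cookie value that comes back differs from the one sent, which is what a rebalance looks like in the transcript. The VIP address and the cookie name `ace` come from the transcript; the `Host` value and all function names are hypothetical, since the transcript leaves `Host:` empty.

```python
import http.client
from http.cookies import SimpleCookie


def extract_cookie(set_cookie_headers, name):
    """Return the value of cookie `name` from a list of Set-Cookie headers, or None."""
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        morsel = jar.get(name)
        if morsel is not None:
            return morsel.value
    return None


def rebalanced(sent_value, returned_value):
    """True when the load balancer answered with a different cookie value."""
    return returned_value is not None and returned_value != sent_value


def probe(vip, sent_value, name="ace", host_header="sharepoint.example.com", port=80):
    """Send GET / with the sticky cookie; return (status, returned cookie value).

    host_header is a placeholder: use whatever name the application expects.
    """
    conn = http.client.HTTPConnection(vip, port, timeout=5)
    try:
        conn.request(
            "GET",
            "/",
            headers={"Host": host_header, "Cookie": f"{name}={sent_value}"},
        )
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        # A server may send several Set-Cookie headers; read them individually.
        headers = resp.headers.get_all("Set-Cookie") or []
        return resp.status, extract_cookie(headers, name)
    finally:
        conn.close()


# Example use against the lab VIP from the transcript (not run here):
#   status, returned = probe("192.168.77.100", "R3987003477")
#   print(status, returned, rebalanced("R3987003477", returned))
```

Running `probe` once with the rserver in service and once after forcing it down should show the same pair of cookie values as the two telnet exchanges, if the ACE is rebalancing.
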
Darren.sasso Tue, 04/19/2011 - 11:57

Hi Yuji, thanks for your help thus far.

I was able to determine that the hit count and dropped conns were increasing together while the rserver was out of service. I also took a sniffer trace: I'm receiving duplicate acknowledgements, and I noticed that the cookie didn't change over the course of the flow. The cookie was the same while the rserver was in service and after I took it out of service.

I also tried to telnet to port 80 and send the values you specified, but I received a 500 Internal Server Error. I also tried the request below, which returned a 200, but I wasn't able to set the cookie.

GET /_layouts/images/QuickTagILikeIt_24.png HTTP/1.1

One thing to note: I am doing an rserver redirection, and I'm not sure whether that would have an effect.

rserver redirect REDIRECT
  webhost-redirection http://server1-2010%p 301
  inservice

Thanks.

Alvaro Perez Unzueta Tue, 04/19/2011 - 14:32

Hello,

I have a similar problem with load balancing. My ACE balances OK, but when I am connected to the application on a real server and the application fails, the session stops responding, so I have to start a new session. The idea is for the session to be persistent, but the ACE sends the session ID to the other real server. How can I solve that?

Here is the configuration:

access-list anyone line 8 extended permit ip any any


probe tcp WEBLOGIC-TCP
  port 7293
  interval 4
  faildetect 2
  passdetect interval 10
  passdetect count 2
  receive 2
  open 2
probe http http-get-index
  interval 4
  faildetect 2
  passdetect interval 10
  receive 2
  expect status 200 200
  open 2

rserver host intra1
  ip address 10.200.254.3
  inservice
rserver host intra2
  ip address 10.200.254.4
  inservice

serverfarm host intrafarm
  rserver intra1
    probe WEBLOGIC-TCP
    probe http-get-index
    inservice
  rserver intra2
    probe WEBLOGIC-TCP
    probe http-get-index
    inservice

sticky ip-netmask 255.255.255.255 address source src-ip-sticky
  timeout 6
  timeout activeconns
  serverfarm intrafarm

class-map type management match-any Mgt
  2 match protocol http any
  3 match protocol telnet any
  4 match protocol ssh any
  5 match protocol icmp any
class-map match-any VIP-srvintranet
  2 match virtual-address 172.10.254.3 any

policy-map type management first-match Management
  class Mgt
    permit

policy-map type loadbalance first-match lb-vip
  class class-default
    sticky-serverfarm src-ip-sticky

policy-map multi-match client-vips
  class VIP-srvintranet
    loadbalance vip inservice
    loadbalance policy lb-vip
    loadbalance vip icmp-reply

interface vlan 501
  description SIDE-SERVERS
  ip address 10.200.254.1 255.255.255.248
  access-group input anyone
  access-group output anyone
  service-policy input Management
  no shutdown
interface vlan 502
  description SIDE-CLIENTS
  ip address 172.10.254.2 255.255.255.248
  access-group input anyone
  access-group output anyone
  service-policy input Management
  service-policy input client-vips
  no shutdown

ip route 0.0.0.0 0.0.0.0 172.10.254.1

ace/intranet# sh sticky database
sticky group : src-ip-sticky
type         : IP
timeout      : 6             timeout-activeconns : TRUE
  sticky-entry          rserver-instance                 time-to-expire flags
  ---------------------+--------------------------------+--------------+-------+
  2887737384            intra1:0                         308            -
ace/intranet#

ace/intranet# sh serverfarm intrafarm
serverfarm     : intrafarm, type: HOST
total rservers : 2
---------------------------------
                                                ----------connections-----------
       real                  weight state        current    total      failures
   ---+---------------------+------+------------+----------+----------+---------
   rserver: intra1
       10.200.254.3:0        8      OPERATIONAL  10         28         0
   rserver: intra2
       10.200.254.4:0        8      PROBE-FAILED 3          58         0

ace/intranet#
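
Editor's note: the `sticky-entry` value in the `sh sticky database` output above looks like a source IP address printed as a decimal integer. Assuming that reading is right (the thread does not confirm it), the entry can be converted back to dotted-quad form with the Python standard library. The function name is hypothetical.

```python
import ipaddress


def sticky_entry_to_ip(entry):
    """Interpret a decimal sticky entry as a 32-bit IPv4 address (assumption)."""
    return str(ipaddress.IPv4Address(int(entry)))


print(sticky_entry_to_ip("2887737384"))  # prints 172.31.96.40
```

This makes it easier to match a sticky entry against the client address seen in a capture.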

Surya ARBY Tue, 04/19/2011 - 14:48

Your issue is not related to the ACE but to the way your application server uses cookies. Unless you use a database shared between all your application servers to synchronize session state, a user will always have to close the browser to reinitialize an application session.

Surya ARBY Tue, 04/19/2011 - 14:55

Hi.

Can you put "failaction purge" in your config again and delete the timeout parameter from your stickiness config (while keeping the browser-expire flag)?

Can you post your config and the software version you're using?

Also, a Wireshark trace and a live HTTP header log would help to troubleshoot this issue.

Alvaro Perez Unzueta Tue, 04/19/2011 - 15:16

Hello,

I haven't configured the "failaction purge" command. Sticky is necessary in this environment, but I will add it. The output from the ACE Admin context is:

ace/Admin# show version
Cisco Application Control Software (ACSW)
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2009, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained herein are owned by
other third parties and are used and distributed under license.
Some parts of this software are covered under the GNU Public
License. A copy of the license is available at
http://www.gnu.org/licenses/gpl.html.

Software
  loader:    Version 12.2[123]
  system:    Version A2(1.6a) [build 3.0(0)A2(1.6a) adbuild_08:46:04-2009/10/16_
/auto/adbu-rel4/rel_a2_1_6_throttle/REL_3_0_0_A2_1_6A]
  system image file: [LCP] disk0:c6ace-t1k9-mz.A2_1_6a.bin
  installed license: no feature license is installed

Hardware
  Cisco ACE (slot: 8)
  cpu info:
    number of cpu(s): 2
    cpu type: SiByte
    cpu: 0, model: SiByte SB1 V0.2, speed: 700 MHz
    cpu: 1, model: SiByte SB1 V0.2, speed: 700 MHz
  memory info:
    total: 827108 kB, free: 272628 kB
    shared: 0 kB, buffers: 2800 kB, cached 0 kB
  cf info:
    filesystem: /dev/cf
    total: 1014624 kB, used: 363472 kB, available: 651152 kB

last boot reason:  SUP request
configuration register:  0x1
ace kernel uptime is 0 days 0 hour 27 minute(s) 32 second(s)

ace/Admin#

Surya ARBY Tue, 04/19/2011 - 23:18

Hi.

For browser-based applications, browser-expire is the best practice. Can you use cookie insertion based persistence ?

Alvaro Perez Unzueta Wed, 04/20/2011 - 09:42

Hello,

I configured cookie insert and failaction purge, but the result is the same: the server does not fail over and the failures counter does not increment.

Before probe fail (probe WEBLOGIC-TCP)

ace-pnp/intranet# sh serverfarm intrafarm
serverfarm     : intrafarm, type: HOST
total rservers : 2
---------------------------------
                                                ----------connections-----------
       real                  weight state        current    total      failures
   ---+---------------------+------+------------+----------+----------+---------
   rserver: intra1
       10.200.254.3:0        8      OPERATIONAL  10         28         0
   rserver: intra2
       10.200.254.4:0        8      OPERATIONAL   3          58         0

After probe fail (probe WEBLOGIC-TCP)

ace-pnp/intranet# sh serverfarm intrafarm
serverfarm     : intrafarm, type: HOST
total rservers : 2
---------------------------------
                                                ----------connections-----------
       real                  weight state        current    total      failures
   ---+---------------------+------+------------+----------+----------+---------
   rserver: intra1
       10.200.254.3:0        8      OPERATIONAL  10         28         0
   rserver: intra2
       10.200.254.4:0        8      PROBE-FAILED 3          58         0

Regards

Surya ARBY Wed, 04/20/2011 - 09:55

Can you send live http header traces before / after the failure of the probe ?

Darren.sasso Wed, 04/20/2011 - 09:59

Surya,

How can I configure the browser-expire flag? Is that something on the ACE, or is it configured on the server itself?

Thanks.

Surya ARBY Wed, 04/20/2011 - 10:02

You already have it in your config :

sticky http-cookie serverfarm1 serverfarm1_cookie
  cookie insert browser-expire
  timeout 20

Just drop the "timeout 20", which conflicts with a browser-expire config (timeout=0 / session-based cookie).
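
Editor's note: the distinction drawn here is between a session cookie, which carries no expiry and is discarded when the browser closes, and a persistent cookie, which carries an `Expires` or `Max-Age` attribute. The Python sketch below illustrates that difference at the HTTP header level only. It is an assumption for illustration and does not show how the ACE itself encodes its timeout; the function names are hypothetical.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, max_age=None):
    """Build a Set-Cookie header value; omit max_age for a session cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    if max_age is not None:
        cookie[name]["max-age"] = max_age
    return cookie[name].OutputString()


def is_session_cookie(set_cookie_header):
    """True when the header carries neither Expires nor Max-Age."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    return all(not m["expires"] and not m["max-age"] for m in cookie.values())


# The header seen in the telnet test earlier in the thread has no expiry.
print(is_session_cookie("ace=R3987003477; path=/"))
# A cookie with a 20-minute lifetime (1200 seconds) is persistent.
print(is_session_cookie(build_set_cookie("ace", "R3987003477", max_age=1200)))
```

A cookie of the first kind survives a page refresh but not a browser restart, which matches the symptom that only closing the browser clears the stuck session.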

Alvaro Perez Unzueta Wed, 04/20/2011 - 10:04

Hello, I pressed F5 on my page and the result is an error. I haven't configured the parameter-map; is it necessary?

parameter-map type http HTTP-PARAM
  case-insensitive
  persistence-rebalance
  set header-maxparse-length 8092
  length-exceed continue

!

Thank you

Surya ARBY Wed, 04/20/2011 - 10:07

Hi.

I already gave you the answer a few posts above. You can't do anything on the ACE; this is related to the way your application manages sessions. Unless you use a shared database for synchronizing sessions between all the servers, when a specific front end fails, the user will face an error and will have to log in again.

The user will have to close his browser to flush the cookies and then establish a new session.

Darren.sasso Wed, 04/20/2011 - 10:20

OK, I removed the timeout and it still didn't work. We have a common database on the back end, but I don't believe it keeps any session info, so basically this will never work, is that what you're saying? IE will have to be closed and reopened for it to work successfully?

Just an FYI: when I use Fiddler to check the HTTP connections, I get a 504 error.

Thanks.

Surya ARBY Wed, 04/20/2011 - 10:40

Hello Darren.

This thread has become mixed up with the previous poster's question.

Can you send a pcap file so we can check a full session establishment?

Surya ARBY Wed, 04/20/2011 - 10:45

Darren, for the tests I need you to:

- make sure that "failaction purge" is enabled
- start a trace (Wireshark or live HTTP headers)
- open a SharePoint page
- take the rserver the client is stuck to out of service
- make sure the ACE marks the server as failed
- try to refresh the page in the browser

Then send the capture file here.

For information: with web-based applications, once a user is logged in and an application server fails, the browser usually must be closed and reopened, unless you have a database for synchronizing session state. If the application session is identified by a cookie, closing and reopening the browser will flush that cookie (provided it's correctly configured as a "session cookie", i.e. timeout 0 in the application). If the ID is carried as a token embedded in the URL, it will not work and the user will also have to close the browser.

Surya ARBY Wed, 04/20/2011 - 10:51

Maybe the HTTP 504 error comes from the application server itself: it's not able to recognize the session, you fall into an undefined case not handled by the application, and so it doesn't send any response to the web front end, which returns the 504.

Can you retry the test with Firefox, but before refreshing the page, delete all the cookies except the one used by the ACE for persistence?

Surya ARBY Wed, 04/20/2011 - 10:57

Thanks.

Can you try flushing all the cookies first? It seems that the cookie sent by the application is named "WSS_KeepSessionAuthenticated". Try your test both with and without deleting this cookie when you take your rserver out of service.

Darren.sasso Wed, 04/20/2011 - 11:01

OK, so I did a "clear sticky database all", but that didn't seem to help. The connection was still stuck.

Surya ARBY Wed, 04/20/2011 - 11:04

Yes, this is because with cookie insertion there is a 1-to-1 mapping between the cookie value and the service of the rserver.

But when the server is in out-of-service, the number of current connections should not increase.
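
Editor's note: the 1-to-1 mapping described here can be pictured as a fixed lookup table, which is why clearing the sticky database changes nothing. The Python sketch below is a toy model under that assumption, not the ACE's actual algorithm: the cookie values are the ones from Yuji's `sh sticky cookie-insert` output earlier in the thread, the function name is hypothetical, and "first in-service server" stands in for whatever predictor the serverfarm really uses.

```python
# Static cookie-to-rserver table, as in the sticky cookie-insert output
# earlier in the thread. Nothing here is learned from traffic.
COOKIE_TO_RSERVER = {
    "R3986967540": "sv1",
    "R3987003477": "sv2",
}


def pick_rserver(cookie, in_service, table=COOKIE_TO_RSERVER):
    """Return (rserver, cookie to send back) for a request carrying `cookie`.

    If the cookie maps to an in-service rserver, stick to it. Otherwise
    rebalance to another in-service rserver and hand back its cookie.
    """
    target = table.get(cookie)
    if target in in_service:
        return target, cookie
    # Rebalance: sorted() only makes this toy model deterministic.
    for value, rserver in sorted(table.items()):
        if rserver in in_service:
            return rserver, value
    return None, None  # no rserver available


print(pick_rserver("R3987003477", {"sv1", "sv2"}))  # sticks to sv2
print(pick_rserver("R3987003477", {"sv1"}))         # sv2 down, moves to sv1
```

The second call mirrors the telnet test: the request still carries the sv2 cookie, but the reply comes from sv1 with sv1's cookie value.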

Darren.sasso Wed, 04/20/2011 - 12:45

Just an FYI: I upgraded the code version today from 3.2.0 to 3.2.7 and it fixed the issue.

Thanks for everyone's help!

andrew.jewell Tue, 05/01/2012 - 15:03

Hi Darren,

I am experiencing a similar issue on a set of WebLogic servers. I tried purge on fail, etc., but looking at Wireshark traces, the application is leaving cookies embedded in the browser. Delete these specific cookies and you connect every time.

Was the code upgrade to 3.2.7 for your ACE? That version seems unfamiliar to me, as I am looking to upgrade from A3 to A4 code.

We are running an ACE 4710 appliance.

Thanks Andy

This Discussion

Posted April 14, 2011 at 6:32 PM