How to get the CSM to reassign a failed connection?

Aug 5th, 2007

We are upgrading our LocalDirectors to CSM blades, but functional testing shows that the CSMs fail to redirect connections to a healthy real server when one of the real servers has failed and the failed_interval timer has expired. Instead of being redirected to a healthy real server, the client is sent a TCP RST. We are using inband health monitoring to detect server failures.


Is it possible to configure the equivalent of the LocalDirector reassign parameter on the CSMs so that the client doesn't experience any loss of service when a real server fails?


Usage Guidelines

LocalDirector counts the number of TCP SYN packets per connection. When the number of allowable retries is exceeded, LocalDirector reassigns the next TCP SYN packet for the connection to another real server.
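
To make the reassign behavior quoted above concrete, here is a minimal Python sketch of SYN-count-based reassignment. It is a toy model, not LocalDirector code: the `SynReassigner` class, the round-robin server choice, and the threshold of three SYNs are assumptions for illustration.

```python
class SynReassigner:
    """Toy model: count SYNs per connection and move the connection
    to another real server once the allowed retries are used up."""

    def __init__(self, servers, max_syns=3):
        self.servers = list(servers)
        self.max_syns = max_syns   # assumed threshold, not a documented value
        self.assigned = {}         # connection key -> index into self.servers
        self.syn_count = {}        # connection key -> SYNs sent to current server
        self.next_index = 0        # round-robin pointer for new connections

    def on_syn(self, conn):
        """Return the server that should receive this SYN for `conn`."""
        if conn not in self.assigned:
            # First SYN for this connection: pick a server round-robin.
            self.assigned[conn] = self.next_index
            self.next_index = (self.next_index + 1) % len(self.servers)
            self.syn_count[conn] = 0
        elif self.syn_count[conn] >= self.max_syns:
            # Retries exhausted: reassign to the next server and restart the count.
            self.assigned[conn] = (self.assigned[conn] + 1) % len(self.servers)
            self.syn_count[conn] = 0
        self.syn_count[conn] += 1
        return self.servers[self.assigned[conn]]


lb = SynReassigner(["real-A", "real-B"])
conn = ("10.0.0.1", 40000)  # hypothetical client address and port
targets = [lb.on_syn(conn) for _ in range(4)]
print(targets)
```

In this sketch the first three SYNs go to the originally chosen server and the fourth is reassigned, which is the transparency the question is asking the CSM to provide.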


Gilles Dufour Mon, 08/06/2007 - 04:28
User Badges:
  • Cisco Employee

The CSM does not have a similar feature.

You should use probes to detect failed servers faster. When the CSM detects through a probe that a server is down, it stops load balancing traffic to it.


Gilles.

nickarbiter Mon, 08/06/2007 - 07:45

With probes, your client will still experience loss of service between probe times. Even if you run the probe every second, a client can still receive a TCP RST when a real server is down but hasn't yet been tested by the active probe. The LocalDirector uses active traffic to monitor the real server and rebalances when a real server fails to respond to (by default) three SYN attempts. Do you know of any way to achieve this with the CSM?


Many thanks for your support.




Gilles Dufour Tue, 08/07/2007 - 00:36
User Badges:
  • Cisco Employee

There is no way to achieve this with a CSM.

However, since most client TCP stacks will retransmit the SYN anyway, your probe should detect that the server is down, and when the retransmitted SYN from the client comes in, it will be load balanced to an active server.


Gilles.
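
The timing argument in this reply can be sketched in Python. This is an illustration only: it assumes the failed server stays silent (sends no RST), and the 3-second initial retransmission timeout with doubling backoff is an assumed client stack behavior, not something stated in the thread. The function names are hypothetical.

```python
def syn_schedule(initial_rto=3.0, retries=2, start=0.0):
    """Times at which a client sends its SYN and each retransmission,
    assuming the timeout doubles after every unanswered attempt."""
    times = [start]
    rto = initial_rto
    for _ in range(retries):
        times.append(times[-1] + rto)
        rto *= 2
    return times


def first_balanced_syn(syn_times, marked_down_at):
    """Return the time of the first SYN sent at or after the moment the
    probe marked the server down, or None if the client gives up first."""
    for t in syn_times:
        if t >= marked_down_at:
            return t
    return None


schedule = syn_schedule()                      # SYN at 0 s, then two retransmissions
recovered_at = first_balanced_syn(schedule, marked_down_at=2.0)
print(schedule, recovered_at)
```

Under these assumptions, a server marked down 2 seconds after the first SYN costs the client one retransmission rather than the whole connection.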

nickarbiter Tue, 08/07/2007 - 01:52

I've tested this in the lab and it doesn't work. Correct me if I'm wrong, but the minimum probe interval is 2 seconds, which means it can take up to 2 seconds before the probe marks a real server as failed. During that 2-second window, client connections get a TCP RST. I just can't make sense of this. I would expect the CSM to be able to make failures of individual servers transparent to the client. I've checked the IOS SLB documentation, and this is possible there with the re-assign command, so I'm a little confused about why it is not available on the CSM.

Sean Merrow Sun, 08/19/2007 - 06:35
User Badges:
  • Silver, 250 points or more

Under the serverfarm configuration, you can configure 'failaction purge', 'failaction reassign' or no failaction at all.

By default, no failaction is configured, and any existing connections to a real server remain on that real server even when the real fails health probes. If the real is really down, the session times out and the client has to initiate a new session to be load balanced to a real server. 'failaction reassign' is typically used for firewall load balancing, where the firewalls are stateful and share their state tables. That is the only scenario in which 'failaction reassign' works. If your real servers share state tables for all their client connections, 'failaction reassign' under the CSM serverfarm config should work fine: an existing session can be remapped to a healthy real server mid-session, transparently to the client. However, if the real server the client is newly mapped to does not have that session's state information, that real server will send a RST.


If your real servers do not share state information, and you don't want the client to get the RST (failaction purge), then you might want to just leave it at the default. This way, if the client sends a SYN, it will always be sent to a real server.


Sean
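
As a rough illustration of the three options described in this post, the following Python sketch models what happens to the existing connection table when a real fails. It is not CSM code: the `FailAction` enum, the `on_real_failed` function, and the dictionary-based connection table are hypothetical, and dropping a connection when no healthy real is left is a simplification of mine.

```python
from enum import Enum


class FailAction(Enum):
    NONE = "none"          # default: leave connections on the failed real
    PURGE = "purge"        # remove connections to the failed real
    REASSIGN = "reassign"  # remap connections to a healthy real


def on_real_failed(connections, failed_real, healthy_reals, action):
    """Return the connection table (conn -> real) after `failed_real` fails."""
    healthy = list(healthy_reals)
    result = {}
    i = 0
    for conn, real in connections.items():
        if real != failed_real or action is FailAction.NONE:
            # Unaffected connection, or default behavior: entry stays as it is.
            result[conn] = real
        elif action is FailAction.REASSIGN and healthy:
            # Remap mid-session; per the post this only helps if reals share state.
            result[conn] = healthy[i % len(healthy)]
            i += 1
        # PURGE (or REASSIGN with no healthy real) drops the entry.
    return result


conns = {"client-1": "real-A", "client-2": "real-B"}
for action in FailAction:
    print(action.value, on_real_failed(conns, "real-A", ["real-B"], action))
```

The model only tracks where the connection entry ends up; whether the remapped session survives depends on the shared state the post describes.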

Gilles Dufour Mon, 08/20/2007 - 02:10
User Badges:
  • Cisco Employee

So, what you're saying is that you disabled the application on the server, but the server itself was still available, and you sent a SYN to the server within the 2 seconds during which the CSM had not yet detected that the application was down.

And the server, having no application running on that specific port, sent a RESET to the client, which prevented the client from retransmitting the SYN.

Indeed, this is a case where it will fail.

Nothing can be done about that.


Gilles.
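
The distinction drawn in this reply can be summarized in a small Python sketch. It is a simplified model under stated assumptions: the `syn_outcome` function and the failure-mode labels `"port_closed"` and `"host_silent"` are hypothetical names, and the model ignores everything except whether the SYN arrives before or after the server is marked down.

```python
def syn_outcome(syn_time, marked_down_at, failure_mode):
    """Classify what happens to a client SYN sent at `syn_time` (seconds).

    failure_mode:
      "port_closed" - host is up but nothing listens on the port, so it answers with RST
      "host_silent" - host does not answer at all
    """
    if syn_time >= marked_down_at:
        # The server is already out of rotation, so the SYN goes elsewhere.
        return "balanced to a healthy real"
    if failure_mode == "port_closed":
        # The RST reaches the client, which aborts instead of retransmitting.
        return "RST to client, no retransmission"
    # No answer: the client's TCP stack retransmits the SYN later.
    return "no reply, client retransmits"


for mode in ("port_closed", "host_silent"):
    print(mode, "->", syn_outcome(syn_time=1.0, marked_down_at=2.0, failure_mode=mode))
```

Only the silent-host case gives the probe a chance to take the server out of rotation before the client's next attempt.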
