I need to find out how the CSS 11500 makes its load-balancing decisions. We currently monitor the services using TCP keepalives, and we have set the keepalive frequency to 2, the max failure to 2 and the retry period to 2.
If the CSS polls the server and does not see a response, it waits 2 seconds before polling again. If it receives no response the second time, the CSS starts taking down the service. This works well. However, what happens to any requests that hit the VIP during that 4 second period? The CSS still sees the service as alive, so can it still direct requests to that server? The client would then see no response, as the server is not responsive. Would TCP recover from this, or is there a script we can implement to avoid this scenario? The application seems to be very sensitive to requests that are sent to the server when the application is down. Is there any detailed explanation of the CSS behaviour in this scenario? Ultimately we want to guarantee 100% service to the client, so if a service is dying, can we make sure the request is directed to another service that is available?
Any advice is much appreciated.
As long as the service has not been marked down, the CSS considers it usable. So yes, if a connection comes in during those 4 seconds, before the CSS has detected that the server is down, the traffic will be forwarded to that server and no response will come back.
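To put a number on that window, here is a rough back-of-the-envelope sketch in Python. The function name and the timing model are my own assumptions, not documented CSS behaviour: it assumes the server dies right after a successful keepalive, that the first failed probe comes one frequency interval later, that each further probe follows one retry period later, and it ignores the time each probe spends waiting for an answer.

```python
def worst_case_detection_window(frequency, max_failure, retry_period):
    """Rough upper bound, in seconds, on how long a dead service still looks alive.

    Assumed model (not taken from CSS documentation):
    - the server fails right after a successful keepalive,
    - the first failed keepalive happens `frequency` seconds later,
    - each further failed keepalive follows `retry_period` seconds later,
    - the service is marked down on failure number `max_failure`,
    - the time a probe waits for its answer is ignored.
    """
    if max_failure < 1:
        raise ValueError("max_failure must be at least 1")
    return frequency + (max_failure - 1) * retry_period


# Settings from the question: frequency 2, max failure 2, retry period 2.
print(worst_case_detection_window(2, 2, 2))  # prints 4
```

With the settings from the question this gives the same 4 seconds. Under this model, lowering any of the three values shrinks the window but it never reaches zero, so some requests can always land on a dead server.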
Existing connections that are already bound to the failed server also stay bound to that server.
A feature was added very recently that forces a TCP RESET to be sent to the client for existing connections when the server fails. It should be available in the very latest version.
What happens next is totally dependent on the application and the platform being used.
The TCP stack can retransmit packets for a while and then decide to RESET and open a new connection.
Or the application could just exit with an error.
There is not much we can do about it at the CSS level. The application programmer should take this into account (CSS or no CSS) and make the application more robust to network events.
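As an illustration of what "more robust" could look like on the client side, here is a minimal Python sketch, not a definitive implementation. The function name, the defaults and the injectable `connect` parameter are all hypothetical choices of mine. The idea is simply to put a short timeout on every attempt and open a fresh connection on failure, which may give the load balancer a chance to pick a different service. It is only safe for requests that can be repeated without side effects.

```python
import socket
import time


def request_with_retry(host, port, payload, attempts=3, timeout=2.0, pause=0.5,
                       connect=socket.create_connection):
    """Send `payload` and return the first chunk of the reply, retrying on failure.

    Hypothetical helper: each attempt opens a NEW connection with a short
    timeout, so a dead server costs at most `timeout` seconds per attempt.
    `connect` is injectable only so the function can be tried without a network.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            with connect((host, port), timeout=timeout) as conn:
                conn.settimeout(timeout)   # also bound the send and receive
                conn.sendall(payload)
                reply = conn.recv(4096)
                if reply:
                    return reply
                last_error = ConnectionError("connection closed without a reply")
        except OSError as exc:             # timeouts, resets, refused connections
            last_error = exc
        if attempt < attempts - 1:
            time.sleep(pause)              # brief pause before the next attempt
    raise ConnectionError(f"no reply after {attempts} attempts") from last_error
```

A call would look like `request_with_retry("vip.example.com", 80, b"...")`, where the host name is a placeholder for your VIP. Whether a retry actually lands on a different server depends on how the content rule is configured, for example on any stickiness settings.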