CSM: Sorry server and stickiness when reals are overloaded

Unanswered Question
Mar 7th, 2007

Hi,

I have a portal of eight real servers and one sorry server, which should redirect new users to another portal when all eight real servers are overloaded. Load is measured on each real server by a custom-developed agent that essentially measures the actual CPU load. When a real server becomes overloaded, its local agent uses the CSM XML interface to set the maxconns value on the CSM so that the server stops accepting new connections. However, I want to keep accepting sticky connections (requests carrying a valid cookie). In practice, the CSM does not create new connections to a real server that has reached maxconns, even if a cookie exists.

This is a problem because we want to redirect NEW users to another portal in case of overload but keep EXISTING users in the server farm, even if the number of connections then rises slightly above maxconns.

How can I solve this problem?

Thank you

Yves Haemmerli

thomas.chen Tue, 03/13/2007 - 10:27

The idea of maxconns is to set the maximum number of connections a server can handle. When this maximum is reached, we assume the server cannot handle any new connections. So maxconns should be used as a safeguard to ensure that new connections are not dispatched to a real server that is overloaded and could not handle them anyway.

The maxconns limit takes precedence over sticky: when a real server reaches maxconns, the next connection from the client is dispatched to another server according to the vserver load-balancing rule, and the sticky entry is updated to point to this new real server, so subsequent connections from the same client are dispatched to the new server.

This may help you.
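
The precedence rule described above can be pictured with a small simulation. This is only a sketch of the behaviour as stated in this reply, not CSM code or configuration: the `Real` class, the `dispatch` function, and the least-connections choice standing in for the vserver load-balancing rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Real:
    """Hypothetical model of one real server in the farm."""
    name: str
    maxconns: int
    conns: int = 0

    def has_capacity(self) -> bool:
        return self.conns < self.maxconns


def dispatch(client, reals, sticky):
    """Pick a real server for a new connection from `client`.

    maxconns is checked before the sticky entry is honoured, so a
    full server loses its sticky clients to another real.
    """
    preferred = sticky.get(client)
    if preferred is not None and preferred.has_capacity():
        chosen = preferred
    else:
        candidates = [r for r in reals if r.has_capacity()]
        if not candidates:
            return None  # whole farm is full; a sorry server would take over
        # Assumption: least connections stands in for the vserver LB rule.
        chosen = min(candidates, key=lambda r: r.conns)
        sticky[client] = chosen  # sticky entry is re-pointed to the new real
    chosen.conns += 1
    return chosen


reals = [Real("real1", maxconns=1), Real("real2", maxconns=2)]
sticky = {}
first = dispatch("alice", reals, sticky)   # lands on real1
second = dispatch("alice", reals, sticky)  # real1 is full, so real2 is used
```

After the second call the sticky entry for "alice" points at real2, which is why later connections from the same client no longer return to the original server.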

yves.haemmerli Wed, 03/14/2007 - 02:13

Hi Thomas,

Thank you for your comment. I understand this behaviour the same way you do; however, it can have a devastating effect in a global portal environment. Imagine you have three portals distributed around the world, each with, let's say, 8 real servers. In real life, data is seldom replicated in real time between data centers because of the distance. However, user roles, customized bookmarks, and other user-specific settings are replicated, which makes it possible to offer users a global portal. But once a user connects to one particular regional portal, he has to stay on that portal for the whole browser session. Do you follow me?

OK, now imagine that all 8 real servers in a portal reach maxconns because 10,000 users are connected to the portal. New users (users with no sticky cookie) should be sent to another regional portal; this is achieved, for example, with the global site selection provided by the GSS. Existing users already connected to the overloaded portal, however, must be KEPT on that portal! Otherwise, because the user's browser continuously opens and closes TCP sessions, all 10,000 users will immediately be transferred to the other regional portal. That portal will then become overloaded as well, while the load on the first portal quickly drops to zero. So we not only create a situation where users lose their data by being transferred to another portal, we also create oscillations in the portal load!

I really don't know if there is a way to solve this problem... Do you have any idea?

Regards,

Yves Haemmerli
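
The difference between the behaviour observed on the CSM and the behaviour asked for in this post can be made concrete with a toy model. It is a sketch only: the `route` and `moved_users` functions, the `sticky_beats_maxconns` switch, and the figure of 500 new users are hypothetical and do not correspond to any CSM or GSS setting.

```python
def route(has_cookie: bool, farm_full: bool, sticky_beats_maxconns: bool) -> str:
    """Return 'local' or 'remote' for one new TCP connection."""
    if not farm_full:
        return "local"
    if has_cookie and sticky_beats_maxconns:
        return "local"   # desired: existing sessions stay on their portal
    return "remote"      # sent to another regional portal


def moved_users(existing: int, new: int, sticky_beats_maxconns: bool) -> int:
    """Count users sent to the remote portal once the local farm is full."""
    # True marks a request that carries a valid sticky cookie.
    requests = [True] * existing + [False] * new
    return sum(route(c, True, sticky_beats_maxconns) == "remote" for c in requests)


# 10,000 existing users as in the post; 500 new users is an assumed figure.
observed = moved_users(10_000, 500, sticky_beats_maxconns=False)
desired = moved_users(10_000, 500, sticky_beats_maxconns=True)
print(observed, desired)
```

With maxconns taking precedence, every existing user is moved as soon as the browser opens its next TCP connection; with the desired policy, only the new users leave the overloaded portal.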
