03-30-2007 03:34 AM
Hi all,
I have one CSS 11503 at the main datacentre and another at the standby datacentre (i.e. a D/R scenario). I have configured the standby datacentre CSS so that if any DNS queries hit this site while the main datacentre service is up, the main datacentre is preferred (as opposed to the standby).
When testing, I suspended the service on the CSS in the main datacentre, but the standby datacentre CSS still saw this service as 'alive' and therefore would not take over responsibility for the service.
I placed a sniffer on the standby datacentre CSS customer-facing (APP port) VLAN and could see keepalives being sent from the standby CSS to the main CSS service, and the remote service still responding (even though I could not ping the main CSS service, because I had suspended it).
I then suspended the content on the main datacentre CSS and still the backup CSS saw this as alive and still got responses back from the main CSS service.
I have attached a config subset of both CSSs (i.e. one at each datacentre). Please note: I have configured VRRP because at some stage we may have two at each datacentre.
It looks like a bug to me, but I am really struggling, so I would appreciate some help if anybody has any ideas.
Thanks in advance
regards
Mark
03-30-2007 07:14 AM
Hi Mark,
In my CSS cluster, when changes are made to the primary CSS, they do not take effect on the secondary until you run
commit_VipRedundantConfig "local
Is this how yours works?
Joe
03-31-2007 12:03 AM
Hi Mark,
Is there any other way to reach this site (http://webgeneral.nffs.ea.gov/general_webserver/images/home.gif) besides going through the content rule in the main site?
Can you change the configuration of the service General_hacked_redirect to make it look like this and try again?
service General_hacked_redirect
ip address 10.8.149.31
protocol tcp
port 7003
keepalive type http
keepalive method head
keepalive port 7003
active
Let me know how it goes. Thanks!
Regards,
Jose Quesada.
04-02-2007 06:36 AM
Hi Jose and Joe/Rich, thanks for the responses - much appreciated.
I have tried your config and it worked fine.
However, we have two services running off the same web server (i.e. one server at the main datacentre and another server at the standby site - D/R).
At each site, both services are accessed via the same real server IP address using ports 7002 and 7003 (i.e. one IP address per site).
The above solution worked fine for port 7003, but when using a similar config for port 7002, the service 'Forecaster_hacked_redirect' at the backup site was 'down' (when the service was up at the primary site).
I tried with and without a URI, and the sniffer showed the server returning error codes of 404 or 500.
When I used the keepalive method below, the service at the standby site is always 'alive', even when I suspend the service at the primary site.
I don't understand what's happening!
The relevant config at the backup site is:
service Forecaster_hacked_redirect
ip address 10.8.149.41
protocol tcp
keepalive port 7002
port 7002
keepalive method get
keepalive type http
keepalive uri "webserver/loginpage.do"
active
The relevant config at the primary site is:
service NFFS_Webforecaster
ip address w.x.y.z
protocol tcp
port 7002
keepalive type http
keepalive port 7002
keepalive uri "webserver/loginpage.do"
keepalive method get
active
owner NFFS_Forecaster
dns both
content NFFS_Webforecaster
vip address 10.8.149.41
add service NFFS_Webforecaster
dnsbalance preferlocal
protocol tcp
port 7002
add dns webforecaster.nffs.ea.gov 5
active
To recap, when using a similar config to the other 'General' service, the service Forecaster_hacked_redirect always shows as 'down'. But when I use the config above, the service Forecaster_hacked_redirect always shows as 'alive'. In this state, I placed a sniffer on the standby site client port (APP port). After suspending the service on the primary site, I could no longer ping the VIP address (which is what I would expect), but somehow a conversation still takes place between the CSSs using a source address of the content rule that I can no longer ping (i.e. 10.8.149.41).
The sniffer shows that the keepalive responses from the primary web server are current (because it responds with the date/time) and that the source address at the primary site is the VIP address belonging to the content rule (which is down when I suspend the only real server associated with this rule), so why/how this happens I have no idea.
Have you any ideas why?
regards
mark
03-31-2007 08:47 AM
Sorry, I misunderstood what you were asking, and it was my first post to the board too.
I agree with the last poster that you should try setting the keepalive port to match the service port. You can look in your packet sniff to verify whether keepalives are hitting port 80 or 7003.
Not all servers support "head", so maybe make the other change first and see if it works.
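If a sniffer trace is not to hand, a similar check can be made from any host that can reach the real server. The sketch below is only an illustration and not a CSS feature: `probe` is a hypothetical helper that sends a single HTTP request to a given port and returns the status code, so the answers on port 80 and port 7003 can be compared and "head" support confirmed.

```python
import http.client


def probe(host, port, method="HEAD", path="/", timeout=3.0):
    """Send one HTTP request and return the status code.

    Returns None if the connection fails or the reply is not valid HTTP.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(method, path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout, or a malformed response.
        return None
    finally:
        conn.close()
```

For example, comparing `probe("10.8.149.31", 80)` with `probe("10.8.149.31", 7003)` would show whether something also answers on the default HTTP port, and `probe("10.8.149.31", 7003, method="GET")` gives a comparison if the server rejects "head".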
04-02-2007 03:11 AM
Mark,
I have run into similar problems. It's not a bug in the CSS; it's more that the default web site on the web server is responding. As the previous post said, you need to be more specific about the keepalive ports.
thx
-Rich
04-03-2007 03:31 PM
Hi Mark,
For the server at the backup site running on port 7002, you might want to try using the head method in the keepalive. The URI you are requesting seems to be a dynamic web page, and with the get method the CSS creates a hash of the page it receives; if the page changes, the hash changes too and the keepalive fails.
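To illustrate the point, here is a simplified Python model of the two checks. It is not the CSS implementation: the function names, the MD5 digest and the fake page are all assumptions made for the example.

```python
import hashlib
from itertools import count

_requests = count(1)


def dynamic_page():
    """Hypothetical dynamic page: the body differs on every request."""
    return 200, f"<html>login page, request #{next(_requests)}</html>"


def get_keepalive(fetch, reference_hash=None):
    """Model of a 'get' keepalive that hashes the returned body.

    The first call (no reference yet) stores the hash; later calls
    require both a 200 status and an unchanged hash.
    Returns (alive, hash_to_remember).
    """
    status, body = fetch()
    digest = hashlib.md5(body.encode()).hexdigest()
    if reference_hash is None:
        return status == 200, digest
    return status == 200 and digest == reference_hash, reference_hash


def head_keepalive(fetch):
    """Model of a 'head' keepalive: only the status code is checked."""
    status, _body = fetch()
    return status == 200
```

In this model the first `get_keepalive(dynamic_page)` call succeeds and records a reference hash, the next call with that reference fails because the body has changed, and `head_keepalive(dynamic_page)` keeps succeeding as long as the server answers 200.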
Thanks & Regards,
Jose.