I recently moved the backend of the CSS and its associated servers to a new VLAN. Two days before that, there was a database upgrade for the same application.
Ever since the VLAN change, customers have been complaining that they are forced out of the application automatically.
SSL termination is done on the CSS, and I am using advanced-balance sticky-srcip for stickiness. The packet captures I took show users sticking to one server, then suddenly doing an SSL handshake with the second server, after which I get a reset from the client. So it looks like stickiness based on the source IP is not working.
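For anyone following along, the idea behind source-IP stickiness can be sketched in a few lines of Python. This is only a toy illustration of the concept, not how the CSS implements it internally; the `StickyTable` class, the server names, and the client addresses are all made up for the example.

```python
from itertools import cycle


class StickyTable:
    """Toy source-IP sticky table (illustration only, not CSS internals)."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # round-robin for new clients
        self._table = {}                    # source IP -> chosen server

    def pick(self, src_ip):
        # A known source IP always goes back to the server it was first given.
        if src_ip not in self._table:
            self._table[src_ip] = next(self._next_server)
        return self._table[src_ip]


# Hypothetical clients: the first one returns and lands on the same server.
table = StickyTable(["server1", "server2"])
first = table.pick("192.0.2.10")
other = table.pick("192.0.2.20")
again = table.pick("192.0.2.10")
```

If stickiness works like this, a client that suddenly shows up on the second server either had its sticky entry lost or reached that server without going through the sticky lookup at all.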
I do not know why it worked for 7 months and then stopped working when I made the VLAN change. When we bypassed the CSS and users accessed the server directly, they were not forced out. I have the following config
Looks like it is working. I ran a packet capture for more than an hour and customers never had a problem; after 2 retries I decided that everything is back to normal.
Like you said, I think stickiness was never the problem. It was just the application DB upgrade in combination with the dev instance that caused the whole thing. Because the CSS and servers had been moved to a new VLAN, it was initially thought to be a CSS problem.
Packet captures on both sides of the CSS showed that the client sticks to server1 (x.x.235.11) and then initiates a connection to server2 (x.x.235.12), which I see only on the backend and not on the front end. It looks like the client is bypassing the CSS.
The client-side packet capture did not show any traffic bypassing the CSS, but users were still forced out. However, 2 packet captures showed more traffic on the backend that I could not match up on the front end.
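One way to line up the two captures is to reduce each to a list of flows and compare them. The Python below is a rough sketch under some assumptions: the flows have already been exported from the captures as (client IP, destination) pairs, the CSS preserves the client source IP toward the servers (no source NAT), and the function names and addresses are hypothetical.

```python
from collections import defaultdict


def backend_only_flows(frontend, backend):
    """Backend flows whose client never appears in the frontend capture."""
    front_clients = {client for client, _ in frontend}
    return [flow for flow in backend if flow[0] not in front_clients]


def clients_on_multiple_servers(flows):
    """Clients that reached more than one server, with the servers they hit."""
    servers = defaultdict(set)
    for client, server in flows:
        servers[client].add(server)
    return {c: sorted(s) for c, s in servers.items() if len(s) > 1}


# Made-up example data standing in for the exported capture summaries.
frontend = [("192.0.2.10", "vip")]
backend = [
    ("192.0.2.10", "198.51.100.11"),
    ("192.0.2.10", "198.51.100.12"),
    ("192.0.2.30", "198.51.100.12"),
]

unmatched = backend_only_flows(frontend, backend)
split_clients = clients_on_multiple_servers(backend)
```

Here `unmatched` would list traffic that reached a server without showing up on the front end, and `split_clients` would list the clients that ended up on both servers.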
I am not sure if source group NAT would work here since the client is bypassing the CSS and not the server.
I can't explain the reason for the asymmetric routing, though.
When the IP addresses on the servers were changed, the dual NICs on the servers were both made active and configured for load balancing. Before that, only one NIC was active at any time. I am pretty sure that the dual NICs with load balancing are causing the issue with the upstream CSS.
The servers were moved out from behind the CSS so that customers won't complain, so I am in the process of deploying a couple of test servers behind the CSS to test this theory. Just curious to see if anyone else has experienced this before.