One of our customers is having a major problem with their Oracle ERP system, which we host for them. They have a high-priority service request open with Oracle, who believe that our Cisco CSS 11501 load balancers have a problem with session persistence that causes timeouts for the clients.
The customer is using HTTPS from the clients to the servers. The load balancers do not terminate the SSL; they only load balance the requests across four different servers. At Oracle's request we have increased the flow-timeout-multiplier from our initial configuration of "6" to "1800", which corresponds to 8 hours.
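As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic. It assumes the CSS multiplies the flow-timeout-multiplier by 16 seconds, which is the unit implied by "1800 corresponds to 8 hours"; the function name is only illustrative.

```python
# Assumption: one multiplier unit equals 16 seconds of idle-flow timeout,
# as implied by 1800 -> 8 hours (28800 s / 1800 = 16 s).
SECONDS_PER_UNIT = 16


def flow_timeout_seconds(multiplier: int) -> int:
    """Return the idle-flow timeout in seconds for a given multiplier."""
    if multiplier < 0:
        raise ValueError("multiplier must be non-negative")
    return multiplier * SECONDS_PER_UNIT


for m in (6, 1800):
    secs = flow_timeout_seconds(m)
    print(f"multiplier {m}: {secs} s ({secs / 3600:.2f} h)")
```

Under that assumption the original value of 6 works out to 96 seconds, and 1800 to 28800 seconds, i.e. 8 hours.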
I have listed what I believe is the relevant part of the configuration below. I would very much appreciate it if someone could confirm whether this is a good setup or whether any changes are required. We are running version 08.10.4.01 on the CSS 11501s.
service server1.example.com-8100
  ip address 192.168.11.21
  protocol tcp
  port 8100
  keepalive type tcp
  active

service server2.example.com-8100
  ip address 192.168.11.22
  protocol tcp
  port 8100
  keepalive type tcp
  active

service server3.example.com-8100
  ip address 192.168.11.23
  protocol tcp
  port 8100
  keepalive type tcp
  active

service server4.example.com-8100
  ip address 192.168.11.24
  protocol tcp
  port 8100
  keepalive type tcp
  active

owner Example
  content www.example.com-8100
    primarySorryServer server5.example.com-8101
    vip address 192.168.14.20
    add service server1.example.com-8100
    add service server2.example.com-8100
    add service server3.example.com-8100
    add service server4.example.com-8100
    application ssl
    protocol tcp
    port 8100
    advanced-balance ssl
    flow-timeout-multiplier 1800
    active

group www.example.com-nat
  vip address 192.168.21.30
  add destination service server1.example.com-8100
  add destination service server2.example.com-8100
  add destination service server3.example.com-8100
  add destination service server4.example.com-8100
  active
The configuration looks good for SSL balancing. Now, regarding the session timeouts: keep in mind that SSL stickiness is based on the SSL session ID, so you need to make sure that you have no SSLv2 clients (where the SSL ID is not in clear text). More importantly, some browsers/applications renegotiate the SSL ID many times during a single session. If that is the case, stickiness will break and the flow will be balanced using the round-robin algorithm, which will probably send the flow to a different server than the one already handling the session.
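To illustrate why a renegotiated SSL ID breaks stickiness, here is a toy model of a sticky table keyed on the session ID with a round-robin fallback. It is only a sketch of the idea, not how the CSS is implemented internally; the class name, server names and session IDs are made up for the example.

```python
from itertools import cycle


class StickyBalancer:
    """Toy model of SSL-session-ID stickiness with round-robin fallback."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # round-robin iterator
        self._sticky = {}                   # session ID -> chosen server

    def pick(self, session_id: bytes) -> str:
        # Known session ID: stay on the server already handling it.
        if session_id in self._sticky:
            return self._sticky[session_id]
        # Unknown (new or renegotiated) ID: fall back to round robin.
        server = next(self._next_server)
        self._sticky[session_id] = server
        return server


lb = StickyBalancer(["server1", "server2", "server3", "server4"])
first = lb.pick(b"\x01" * 32)         # initial handshake
again = lb.pick(b"\x01" * 32)         # same ID, so same server
renegotiated = lb.pick(b"\x02" * 32)  # same user, but a new SSL ID
print(first, again, renegotiated)
```

In this model the first two picks land on the same server, while the renegotiated ID is treated as a brand-new flow and goes to the next server in the rotation, even though it belongs to the same application session.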
You may want to take frontend/backend sniffer traces to find out exactly what is going on with those flows. Also, did you see any difference when changing the flow-timeout-multiplier value? Oracle is expected to leave idle flows, which will be garbage collected.
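When going through the traces, it can help to pull the session ID out of each ServerHello and compare it across connections from the same client. The following is a rough sketch of that step, assuming you already have the raw bytes of a single, unfragmented ServerHello record in the SSLv3 to TLS 1.2 layout; the function name and the synthetic test record are illustrative only.

```python
def server_hello_session_id(record: bytes) -> bytes:
    """Return the session ID carried in a raw TLS ServerHello record."""
    # Record header: type (1 byte, 0x16 = handshake), version (2), length (2).
    if len(record) < 5 or record[0] != 0x16:
        raise ValueError("not a TLS handshake record")
    body = record[5:]
    # Handshake header: message type (1 byte, 0x02 = ServerHello), length (3).
    if len(body) < 4 or body[0] != 0x02:
        raise ValueError("not a ServerHello")
    # Skip handshake header (4), server version (2) and random (32).
    id_len_offset = 4 + 2 + 32
    if len(body) <= id_len_offset:
        raise ValueError("truncated ServerHello")
    id_len = body[id_len_offset]
    start = id_len_offset + 1
    session_id = body[start:start + id_len]
    if len(session_id) != id_len:
        raise ValueError("truncated session ID")
    return session_id


# Build a synthetic ServerHello to exercise the parser.
sid = bytes(range(32))
hello = b"\x03\x01" + bytes(32) + bytes([len(sid)]) + sid + b"\x00\x2f" + b"\x00"
handshake = b"\x02" + len(hello).to_bytes(3, "big") + hello
record = b"\x16\x03\x01" + len(handshake).to_bytes(2, "big") + handshake
print(server_hello_session_id(record).hex())
```

If the IDs extracted from consecutive connections of one client differ, that client is getting new SSL IDs and will not stay stuck to the same server.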