Alright, so I have toiled long and hard to get this right. I think I have the config down, but I am unsure how to verify that the load balancing is working.
Here is the Content Config that I am speaking of:
content cad-rule
  add service wls1-e0
  add service wls1-e1
  add service wls2-e0
  add service wls2-e1
  add service wls3-e0
  add service wls3-e1
  add service wls4-e0
  add service wls4-e1
  add service wls5-e0
  add service wls5-e1
  add service wls6-e0
  add service wls6-e1
  arrowpoint-cookie expiration 00:00:15:00
  advanced-balance arrowpoint-cookie
  redundant-index 2
  vip address 172.30.194.195 range 2
  arrowpoint-cookie name TOQ
  protocol tcp
  port 8001
  url "/*"
  active
Each service in the rule above is configured as follows:
service wls1-e1
  port 8001
  protocol tcp
  strin ags001-e1
  ip address 172.30.193.81
  keepalive type http
  keepalive uri "/cad/index.html"
  redundant-index 12
  keepalive frequency 20
  keepalive maxfailure 10
  keepalive retryperiod 2
  active
I am using the advanced arrowpoint cookies because I need some stickiness here. Straight round-robin would not have done what I needed it to do.
Now, when I go to my show summary, this is what I see for this rule:
The far right column indicates the service hits. I originally had the E1's suspended and activated them later on. So if this was true round robin, all the E0's should have the same number of service hits and all the E1's should have the same number of service hits. But as you can see, the wls5 server is getting hit the most while the wls6 server is sitting there twiddling its thumbs.
Now, understanding how the arrowpoint cookies do their load balancing (inserting a cookie into the flow and then timing out after 15 mins as configured above), I would not expect a 1:1 ratio of load balancing between servers. But the distribution above seems rather extreme.
Does anyone have any suggestions on how to both A) verify that this is the right config and B) show my boss that this is working the way it should be working?
There are several reasons for the uneven load balancing that you are seeing (based on the show summary). First of all, the CSS is configured to do stickiness (advanced-balance).
With the arrowpoint-cookie method of stickiness (HTTP only), only requests that arrive with the same cookie are stuck to the same server. Because the cookie is lost when the browser is closed (or when it expires), the stickiness is session based: if the same client opens a new session, it is load balanced again.
It is important to understand that when using stickiness, no truly even load balancing is going to happen, since we are sticking new flows to the same server, although Layer 5 stickiness permits more even balancing than Layer 3 (source IP based) stickiness.
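To make that concrete, here is a minimal sketch of sticky balancing. It is a toy model, not the CSS algorithm: the function name, the server names s1..s3 and the client ids are all made up for illustration, and a dictionary stands in for the cookie each browser would carry.

```python
from itertools import cycle

def sticky_balance(requests, servers):
    """Toy model: new clients are assigned round robin, returning clients stay put.

    `requests` is a sequence of client ids, one entry per request.
    Returns the number of hits each server received.
    """
    rr = cycle(servers)
    cookie = {}                      # client id -> server (stands in for the cookie)
    hits = {s: 0 for s in servers}
    for client in requests:
        if client not in cookie:     # no cookie yet: pick the next server in turn
            cookie[client] = next(rr)
        hits[cookie[client]] += 1    # cookie present: same server every time
    return hits

# Hypothetical traffic: three clients, one of which ("b") is very busy.
servers = ["s1", "s2", "s3"]
requests = ["a", "b", "c"] + ["b"] * 97
hits = sticky_balance(requests, servers)
print(hits)
```

Each client was assigned to a different server, so the assignment itself is perfectly even, yet almost all of the hits land on the server that the busy client is stuck to.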
Also consider that "show summary" shows the hits (requests) balanced to a specific server. It is a good command for watching the load balancing, but the CSS balances connections (flows), and a persistent connection can carry many requests. All of those requests go to the same server and increment its hit counter, while a non-persistent connection carries just one request (refer to HTTP persistence).
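A small sketch of the difference between counting connections and counting hits, under the assumption stated above that the counter increments per request. The function name and the request counts are invented for the example; only the service names come from the thread.

```python
def balance_connections(requests_per_connection, servers):
    """Toy model: connections are handed out round robin, hits are counted per request.

    `requests_per_connection` lists how many requests each connection carries.
    Returns (connections per server, hits per server).
    """
    conns = {s: 0 for s in servers}
    hits = {s: 0 for s in servers}
    for i, n_requests in enumerate(requests_per_connection):
        server = servers[i % len(servers)]   # plain round robin over connections
        conns[server] += 1
        hits[server] += n_requests           # every request on the connection counts
    return conns, hits

# Hypothetical mix: persistent connections (50 requests) alternate with one-shot ones.
servers = ["wls5-e0", "wls6-e0"]
conns, hits = balance_connections([50, 1, 50, 1, 50, 1], servers)
print(conns)
print(hits)
```

Both servers receive three connections, but the hit counters end up at 150 and 3, so a lopsided hit column does not by itself show that connections were balanced unevenly.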
Also keep in mind that if a service is taken out for maintenance, is added to the load balancing later than the others, or goes down for a period of time, the CSS balances among the remaining alive servers. When you add the server back, the other servers already have connections established. Because the CSS is doing round robin, the server added last will never reach the same number of connections (or hits) as the others: if one server has 55 connections, for example, the new one gets its first connection, and when the first server gets its 56th, the new one gets its second, and so on.
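The 55-versus-1 example can be worked through with a short simulation. This is only a sketch of plain round robin with a late joiner; the function name and the connection counts are assumptions chosen to reproduce the numbers in the example.

```python
from itertools import cycle

def round_robin_counts(initial, late, before, after):
    """Count connections per server when the `late` servers join after `before` connections."""
    counts = {name: 0 for name in initial + late}
    rr = cycle(initial)                  # phase 1: only the initial servers are active
    for _ in range(before):
        counts[next(rr)] += 1
    rr = cycle(initial + late)           # phase 2: the late servers are activated
    for _ in range(after):
        counts[next(rr)] += 1
    return counts

# Two E0 services take 110 connections, then one E1 service is activated
# and 30 more connections arrive.
counts = round_robin_counts(["wls1-e0", "wls2-e0"], ["wls1-e1"], before=110, after=30)
gap = counts["wls1-e0"] - counts["wls1-e1"]
print(counts, gap)
```

The E0 services go from 55 to 65 connections each while the E1 service goes from 0 to 10, so the gap of 55 never closes no matter how long the counters run.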
I think we are both on the right path. Cookie-based load balancing will never be equal. It could be that the user that is currently "stuck" to server 5 will simply move over to another server and crush it when the cookie expires.
Such is the dilemma with having users "stuck".
I will continue to monitor things. Even if I remove the timeout and reset the cookie each time a browser closes, whatever user is causing the tremendous load will just reconnect to another server and do the same thing there when his browser reopens.
Maybe the question is more about why the user is causing more havoc than others.