CSS arrowpoint cookies: TCP window abruptly reduced

Unanswered Question
Feb 16th, 2010

Hi,

I have an HTTP rule configured with "arrowpoint-cookies" and "url /*". (11503 with 8.20.4.02)

Performance is quite low. Debugging the traffic on the client side, I see that the server initially proposes a TCP window of 8760, but after the first GET is completed the window is reduced to about 257 B!

The client then waits a few seconds before issuing a second GET for a different page object, and Wireshark reports a "TCP Window Full" message:

No.     Time            Source                Destination           Protocol Info
      1 11:23:43.291837 10.39.68.19           SERVER_ADDR       TCP      59916 > http [SYN] Seq=0 Win=65535 Len=0 MSS=1460 WS=0 TSV=949257685 TSER=0
      2 11:23:43.304036 SERVER_ADDR       10.39.68.19           TCP      http > 59916 [SYN, ACK] Seq=0 Ack=1 Win=8760 Len=0 MSS=1380 WS=0
      3 11:23:43.304106 10.39.68.19           SERVER_ADDR       TCP      59916 > http [ACK] Seq=1 Ack=1 Win=65535 Len=0
      4 11:23:43.304258 10.39.68.19           SERVER_ADDR       HTTP     GET /investor/investor.php HTTP/1.1
      5 11:23:43.510833 SERVER_ADDR       10.39.68.19           TCP      http > 59916 [ACK] Seq=1 Ack=639 Win=258 Len=0
      6 11:23:43.723631 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
      7 11:23:43.723735 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
      8 11:23:43.723777 10.39.68.19           SERVER_ADDR       TCP      59916 > http [ACK] Seq=639 Ack=2761 Win=64860 Len=0
      9 11:23:43.735611 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     10 11:23:43.735727 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     11 11:23:43.735781 10.39.68.19           SERVER_ADDR       TCP      59916 > http [ACK] Seq=639 Ack=5521 Win=65535 Len=0
     12 11:23:43.735843 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     13 11:23:43.735963 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     14 11:23:43.736012 10.39.68.19           SERVER_ADDR       TCP      59916 > http [ACK] Seq=639 Ack=8281 Win=65535 Len=0
     15 11:23:43.747598 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     16 11:23:43.747715 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     17 11:23:43.747761 SERVER_ADDR       10.39.68.19           HTTP     HTTP/1.1 200 OK (text/html)
     18 11:23:43.747846 10.39.68.19           SERVER_ADDR       TCP      59916 > http [ACK] Seq=639 Ack=11709 Win=65491 Len=0
     19 11:23:48.978137 10.39.68.19           SERVER_ADDR       TCP      [TCP Window Full] [TCP segment of a reassembled PDU]
     20 11:23:49.205760 SERVER_ADDR       10.39.68.19           TCP      http > 59916 [ACK] Seq=11709 Ack=897 Win=8502 Len=0
     21 11:23:49.205826 10.39.68.19           SERVER_ADDR       HTTP     GET /investor/inc/dates.xml?_=1266315824078 HTTP/1.1
     22 11:23:49.218316 SERVER_ADDR       10.39.68.19           TCP      http > 59916 [ACK] Seq=11709 Ack=1368 Win=255 Len=0
     23 11:23:49.219144 SERVER_ADDR       10.39.68.19           HTTP     HTTP/1.1 404 Not Found (text/html)
     24 11:23:49.219183 10.39.68.19           SERVER_ADDR       TCP      59916 > http [ACK] Seq=1368 Ack=12155 Win=65535 Len=0
     25 11:23:54.727719 SERVER_ADDR       10.39.68.19           TCP      http > 59916 [FIN, ACK] Seq=12155 Ack=1368 Win=255 Len=0
     26 11:23:54.727796 10.39.68.19           SERVER_ADDR       TCP      59916 > http [ACK] Seq=1368 Ack=12156 Win=65535 Len=0
     27 11:24:06.408394 10.39.68.19           SERVER_ADDR       TCP      59916 > http [FIN, ACK] Seq=1368 Ack=12156 Win=65535 Len=0
     28 11:24:06.419822 SERVER_ADDR       10.39.68.19           TCP      http > 59916 [ACK] Seq=12156 Ack=1369 Win=255 Len=0

Since "arrowpoint-cookies" and "url /*" are configured, the rule should be an L5 rule, so I think the CSS is spoofing the connection in order to insert the cookie and recalculate the CRC.

If I remove "arrowpoint-cookies" and "url /*" (at which point the rule should be an L4 rule), the behaviour is different: the server's receive window grows from 8192 to 65K.

No.     Time            Source                Destination           Protocol Info
      1 11:35:32.168413 10.39.68.19           SERVER_ADDR       TCP      59999 > http [SYN] Seq=0 Win=65535 Len=0 MSS=1460 WS=0 TSV=949259103 TSER=0
      2 11:35:32.180893 SERVER_ADDR       10.39.68.19           TCP      http > 59999 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1380 WS=8 TSV=120784078 TSER=949259103
      3 11:35:32.180969 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=1 Ack=1 Win=65535 Len=0 TSV=949259103 TSER=120784078
      4 11:35:32.181101 10.39.68.19           SERVER_ADDR       HTTP     GET /investor/investor.php HTTP/1.1
      5 11:35:32.394954 SERVER_ADDR       10.39.68.19           TCP      http > 59999 [ACK] Seq=1 Ack=398 Win=65536 Len=0 TSV=120784099 TSER=949259103
      6 11:35:33.008270 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
      7 11:35:33.008372 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
      8 11:35:33.008413 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=398 Ack=2737 Win=62928 Len=0 TSV=949259104 TSER=120784160
      9 11:35:33.020326 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     10 11:35:33.020423 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=398 Ack=4105 Win=65535 Len=0 TSV=949259104 TSER=120784162
     11 11:35:33.020440 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     12 11:35:33.020550 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     13 11:35:33.020599 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=398 Ack=6841 Win=65535 Len=0 TSV=949259104 TSER=120784162
     14 11:35:33.020665 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     15 11:35:33.032308 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     16 11:35:33.032414 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=398 Ack=9577 Win=65535 Len=0 TSV=949259104 TSER=120784162
     17 11:35:33.032421 SERVER_ADDR       10.39.68.19           TCP      [TCP segment of a reassembled PDU]
     18 11:35:33.032492 SERVER_ADDR       10.39.68.19           HTTP     HTTP/1.1 200 OK (text/html)
     19 11:35:33.101509 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=398 Ack=11773 Win=65535 Len=0 TSV=949259105 TSER=120784163
     20 11:35:33.170523 10.39.68.19           SERVER_ADDR       HTTP     GET /css/all.css HTTP/1.1
     21 11:35:33.182355 SERVER_ADDR       10.39.68.19           HTTP     HTTP/1.1 304 Not Modified
     22 11:35:33.301595 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=949 Ack=11979 Win=65535 Len=0 TSV=949259105 TSER=120784177
     23 11:35:33.782539 10.39.68.19           SERVER_ADDR       HTTP     GET /investor/inc/dates.xml?_=1266316533781 HTTP/1.1
     24 11:35:33.794652 SERVER_ADDR       10.39.68.19           HTTP     HTTP/1.1 404 Not Found (text/html)
     25 11:35:33.794734 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=1651 Ack=12425 Win=65218 Len=0 TSV=949259106 TSER=120784238
     26 11:35:39.305776 SERVER_ADDR       10.39.68.19           TCP      http > 59999 [FIN, ACK] Seq=12425 Ack=1651 Win=64256 Len=0 TSV=120784791 TSER=949259106
     27 11:35:39.305835 10.39.68.19           SERVER_ADDR       TCP      59999 > http [ACK] Seq=1651 Ack=12426 Win=65535 Len=0 TSV=949259117 TSER=120784791
     28 11:35:51.437686 10.39.68.19           SERVER_ADDR       TCP      59999 > http [FIN, ACK] Seq=1651 Ack=12426 Win=65535 Len=0 TSV=949259141 TSER=120784791
     29 11:35:51.449472 SERVER_ADDR       10.39.68.19           TCP      http > 59999 [ACK] Seq=12426 Ack=1652 Win=64256 Len=0 TSV=120786005 TSER=949259141

Now the problem seems to be solved with the "flow tcp-window-scale 8" command so that the window is now scaled to 258 * 2^8
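As a rough check of that arithmetic, here is a minimal Python sketch (the function name is made up for illustration, it is not a CSS or Wireshark feature). It turns the raw 16-bit window field seen in the first trace into the window the receiver actually means for a given shift count:

```python
def effective_window(raw_window: int, shift_count: int) -> int:
    """Return the window in bytes implied by a raw 16-bit window field."""
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("the TCP window field is 16 bits wide")
    if not 0 <= shift_count <= 14:
        raise ValueError("the shift count must be between 0 and 14")
    return raw_window << shift_count


# Raw value from frame 5 of the first trace (Win=258).
print(effective_window(258, 0))  # read without scaling: 258 bytes
print(effective_window(258, 8))  # read with a shift of 8: 66048 bytes
```

If the client reads the field with a shift of 0 while the sender meant a shift of 8, it believes only 258 bytes are available, which would be consistent with the "TCP Window Full" stall in the trace.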

CSCsv12580 in 8.20.x only refers to a WS propagation problem in the client-side SYN packet, whereas I noticed the problem in the SYN-ACK from the server.

Is this the same issue?

Thank you

Kind Regards

Fulvio

Gilles Dufour Wed, 02/17/2010 - 01:06

Is your client or server using window scaling?

If yes, you indeed need to enable window scaling.

CSCsv12580 The command "flow tcp-window-scale [enabled|disabled]" was added to the Command Line
Interface (CLI) to allow the user to configure the ability to propagate the TCP Window Scale (WS) option to
the backend server.

Sniff on the client as well as on the server and see which parameters are modified by the CSS.

Gilles.

egl.davidfarrell Thu, 04/15/2010 - 02:27

Hi folks,

We are also seeing issues after upgrading to 8.20.4.02 with window scaling, where clients attempt to use window scaling and the servers do not support it. With the new feature enabled by default, we also saw the TCP window being abruptly reduced. Opening a TAC case resulted in us applying a scale factor of 14 with the 'flow tcp-window-scale 14' command. However, this did not seem to resolve things, so we failed over to an alternate CSS still running our previous release.

Revisiting this, it looks like it was more of a cosmetic issue. In our testing, IE and Firefox on Windows don't set the window scale option, but Chrome does (it looks like it can be set selectively when the TCP socket is opened). Our performance monitoring systems are Linux based, where it appears to be a kernel-level setting, so they were seeing the window scale issue and reporting problems. However, we decided to try to resolve it anyway.

So, we looked at disabling window scaling on the CSS altogether. Using the command 'flow tcp-window-scale disabled' seemed to resolve the issue completely, so that the CSS didn't allow the window scaling option. However, later in the day we experienced a lock-up of the CSS and an outage. This is currently being investigated.

I must admit I am struggling to find good information about the behaviour of this window scaling feature.

Thanks,

David.

sachinga.hcl Thu, 04/15/2010 - 04:04

Hi Egl,

You can read my reply on the same situation, but for the ACE rather than the CSS, which I posted a few minutes ago.

The reply is at the following URL. It covers the ACE, which I assume behaves similarly, being a more developed kind of CSS.

https://supportforums.cisco.com/message/3054854#3054854

For your convenience, I am adding the contents of that reply here as well:

++++++++++++++++++++++++++++++++++++++++

As you know, allowing the ACE to use a window scale factor essentially increases the size of the TCP send and receive buffers. The sender specifies a window scale factor in a SYN segment that determines the send and receive window size for the duration of the connection.

You have defined "tcp-options window-scale allow" in the parameter map but have not specified which scale factor it should scale to.


The TCP window scaling feature adds support for the Window Scaling option in RFC 1323. Increasing the window size is recommended to improve TCP performance in network paths with large-bandwidth, long-delay characteristics. This type of network is called a long fat network (LFN).

But you have not configured a TCP window-scale factor, which is essential for network paths with high-bandwidth, long-delay characteristics, using the "set tcp window-scale" command.

The window scaling extension expands the definition of the TCP window to 32 bits and then uses a scale factor to carry this 32-bit value in the 16-bit window field of the TCP header.

You can increase the window size to a maximum scale factor of 14.

Typical applications use a scale factor of 3 when deployed in LFNs, but in your case, if this is an LFN, you are using the default, which is 0.

So in your case you can choose the scale factor you need from the range 0-14. Otherwise, even though you have defined "tcp-options window-scale allow", no factor has been defined, so the default scale factor of 0 is always used, which means no scaling at all.

To set the TCP window-scale factor to 3, add the second command to your parameter map as well:


host1/Admin(config-parammap-conn)# tcp-options window-scale allow        ---> Defined already

host1/Admin(config-parammap-conn)# set tcp window-scale 3                      ----> Needed to be added else "default scale factor 0" will be taken


You can check in the following URL at cisco docs wiki for how ACE handles Connection at Layer 4 (L4) and Layer 7 (L7):

http://docwiki.cisco.com/wiki/Cisco_Application_Control_Engine_(ACE)_Module_Troubleshooting_Guide,_Release_A2(x)_--_Troubleshooting_Connectivity

Here are the details of how a tcp-options scale windows works:

The window scale extension expands the definition of the TCP
window to 32 bits and then uses a scale factor to carry this 32-bit
value in the 16-bit Window field of the TCP header (SEG.WND in RFC-793).
The scale factor is carried in a new TCP option, Window Scale.
This option is sent only in a SYN segment (a segment with the SYN bit on),
hence the window scale is fixed in each direction when a connection is opened.
(Another design choice would be to specify the window scale in every TCP segment.
It would be incorrect to send a window scale option only when the scale factor changed,
since a TCP option in an acknowledgement segment will not be delivered reliably
(unless the ACK happens to be piggy-backed on data in the other direction).
Fixing the scale when the connection is opened has the advantage of lower
overhead but the disadvantage that the scale factor cannot be changed
during the connection.) The maximum receive window, and therefore the
scale factor, is determined by the maximum receive buffer space.
In a typical modern implementation, this maximum buffer space is set
by default but can be overridden by a user program before a
TCP connection is opened. This determines the scale factor,
and therefore no new user interface is needed for window scaling.

Window Scale Option

The three-byte Window Scale option
may be sent in a SYN segment by a TCP.
It has two purposes:
(1) indicate that the TCP is prepared to do both send and receive window scaling, and
(2) communicate a scale factor to be applied to its receive window.
Thus, a TCP that is prepared to scale windows should send the option,
even if its own scale factor is 1. The scale factor is limited to
a power of two and encoded logarithmically,
so it may be implemented by binary shift operations.

TCP Window Scale Option (WSopt):
Kind: 3 Length: 3 bytes
+---------+---------+---------+
| Kind=3  |Length=3 |shift.cnt|
+---------+---------+---------+
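To make the three-byte layout concrete, here is a small Python sketch that packs and unpacks the option. The helper names are hypothetical and only mirror the diagram above; this is not code from the CSS, the ACE, or any particular TCP stack.

```python
import struct

WSOPT_KIND = 3    # option kind, from the layout above
WSOPT_LENGTH = 3  # total option length in bytes


def encode_wsopt(shift_cnt: int) -> bytes:
    """Build the three bytes Kind, Length, shift.cnt."""
    if not 0 <= shift_cnt <= 255:
        raise ValueError("shift.cnt must fit in one byte")
    return struct.pack("!BBB", WSOPT_KIND, WSOPT_LENGTH, shift_cnt)


def decode_wsopt(option: bytes) -> int:
    """Return shift.cnt from a three-byte Window Scale option."""
    kind, length, shift_cnt = struct.unpack("!BBB", option)
    if kind != WSOPT_KIND or length != WSOPT_LENGTH:
        raise ValueError("not a Window Scale option")
    return shift_cnt


print(encode_wsopt(8).hex())            # 030308
print(decode_wsopt(b"\x03\x03\x08"))    # 8
```

A shift.cnt of 8 corresponds to the "WS=8" that Wireshark shows in the SYN-ACK of the second trace.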

This option is an offer, not a promise; both sides must send
Window Scale options in their SYN segments to enable window
scaling in either direction. If window scaling is enabled,
then the TCP that sent this option will right-shift its true
receive-window values by 'shift.cnt' bits for transmission in
SEG.WND. The value 'shift.cnt' may be zero (offering to scale,
while applying a scale factor of 1 to the receive window).

This option may be sent in an initial SYN segment (i.e., a
segment with the SYN bit on and the ACK bit off).
It may also be sent in a SYN-ACK segment, but only if a Window
Scale option was received in the initial SYN segment.
A Window Scale option in a segment without a SYN bit should be ignored.


The Window field in a SYN (i.e., a SYN or SYN-ACK) segment itself is never scaled.

Using the Window Scale Option

A model implementation of window scaling is as follows:

* All windows are treated as 32-bit quantities for storage in
the connection control block and for local calculations.
This includes the send-window (SND.WND) and the receive-window
(RCV.WND) values, as well as the congestion window.

* The connection state is augmented by two window shift counts,
Snd.Wind.Scale and Rcv.Wind.Scale, to be applied to the
incoming and outgoing window fields, respectively.

* If a TCP receives a SYN segment containing a Window Scale
option, it sends its own Window Scale option in the SYN-ACK segment.


* The Window Scale option is sent with shift.cnt = R, where R
is the value that the TCP would like to use for its receive window.


* Upon receiving a SYN segment with a Window Scale option containing
shift.cnt = S, a TCP sets Snd.Wind.Scale to S and sets Rcv.Wind.Scale
to R; otherwise, it sets both Snd.Wind.Scale and Rcv.Wind.Scale to zero.

* The window field (SEG.WND) in the header of every incoming segment,
with the exception of SYN segments, is left-shifted by Snd.Wind.Scale
bits before updating SND.WND: SND.WND = SEG.WND << Snd.Wind.Scale
(assuming the other conditions of RFC793 are met,
and using the "C" notation "<<" for left-shift).


* The window field (SEG.WND) of every outgoing segment, with the
exception of SYN segments, is right-shifted by
Rcv.Wind.Scale bits: SEG.WND = RCV.WND >> Rcv.Wind.Scale.
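The model above can be sketched in a few lines of Python. This is only an illustration of the bullet points: the class and function names are hypothetical, and it assumes the local TCP sent its own Window Scale option with shift count R.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScaleState:
    snd_wind_scale: int = 0  # shift applied to incoming window fields
    rcv_wind_scale: int = 0  # shift applied to outgoing window fields


def negotiate(peer_shift: Optional[int], my_shift: int) -> ScaleState:
    """peer_shift is None when the peer's SYN carried no Window Scale option."""
    if peer_shift is None:
        # No option from the peer: both shift counts are zero.
        return ScaleState(0, 0)
    return ScaleState(snd_wind_scale=peer_shift, rcv_wind_scale=my_shift)


def snd_wnd_from_segment(seg_wnd: int, state: ScaleState) -> int:
    """SND.WND = SEG.WND << Snd.Wind.Scale (non-SYN segments only)."""
    return seg_wnd << state.snd_wind_scale


def seg_wnd_for_segment(rcv_wnd: int, state: ScaleState) -> int:
    """SEG.WND = RCV.WND >> Rcv.Wind.Scale (non-SYN segments only)."""
    return rcv_wnd >> state.rcv_wind_scale


both_scale = negotiate(peer_shift=8, my_shift=0)
print(snd_wnd_from_segment(258, both_scale))   # 66048
no_option = negotiate(peer_shift=None, my_shift=8)
print(snd_wnd_from_segment(258, no_option))    # 258
```

The second call shows the case discussed in this thread: if one side never sees the option, it treats the window field as unscaled.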

TCP determines if a data segment is "old" or "new" by testing
whether its sequence number is within 2**31 bytes of the left
edge of the window, and if it is not, discarding the data as "old".

To insure that new data is never mistakenly considered old and vice-versa,
the left edge of the sender's window has to be at most 2**31 away
from the right edge of the receiver's window. Similarly with the
sender's right edge and receiver's left edge. Since the right and
left edges of either the sender's or receiver's window differ by
the window size, and since the sender and receiver windows can be
out of phase by at most the window size, the above constraints imply
that 2 * the max window size must be less than 2**31,
or max window < 2**30. Since the max window is 2**S
(where S is the scaling shift count) times at most 2**16 - 1
(the maximum unscaled window),
the maximum window is guaranteed to be < 2**30 if S <= 14.

Thus, the shift count must be limited to 14 (which allows windows of 2**30 = 1 Gbyte).
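The bound is easy to verify numerically. A short Python check of the arithmetic above (the variable names are mine):

```python
MAX_UNSCALED_WINDOW = 2**16 - 1  # largest value of the 16-bit window field
MAX_SHIFT = 14

largest_window = MAX_UNSCALED_WINDOW << MAX_SHIFT
print(largest_window)             # 1073725440
print(largest_window < 2**30)     # True: a shift of 14 stays under the limit

one_shift_more = MAX_UNSCALED_WINDOW << (MAX_SHIFT + 1)
print(one_shift_more < 2**30)     # False: a shift of 15 would exceed it
```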

If a Window Scale option is received with a shift.cnt value exceeding 14,
the TCP should log the error but use 14 instead of the specified value.
The scale factor applies only to the Window field as transmitted
in the TCP header; each TCP using extended windows will maintain
the window values locally as 32-bit numbers. For example,
the "congestion window" computed by Slow Start and Congestion Avoidance
is not affected by the scale factor, so window scaling will not introduce
quantization into the congestion window.
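The "log the error but use 14" rule could look like this in Python. This is a sketch only: the function and logger names are hypothetical, not taken from any real stack.

```python
import logging

logger = logging.getLogger("wsopt-example")  # hypothetical logger name
MAX_SHIFT = 14


def accept_shift_count(shift_cnt: int) -> int:
    """Return the shift count to use for a received Window Scale option."""
    if shift_cnt > MAX_SHIFT:
        # Log the error, then fall back to the maximum allowed shift.
        logger.error("received shift.cnt %d, using %d instead", shift_cnt, MAX_SHIFT)
        return MAX_SHIFT
    return shift_cnt


print(accept_shift_count(8))    # 8
print(accept_shift_count(20))   # 14
```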

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

I hope this helps you understand the mechanism and the way the tcp-options window-scale option works.

In the meantime, I can give you a better answer for your situation, with the specific CSS commands, if you need it.

Please let me know whether you would like this feature described specifically for the CSS, or whether this description from the ACE is clear enough.

Best regards,

Sachin Garg

Re: ACE default handling of the TCP Window Scale option

egl.davidfarrell Fri, 04/16/2010 - 04:10

Hi,

I suspect, though I can't confirm (I have no packet captures), that the CSS was initially supporting window scaling on the spoofed client-to-CSS connection but that this was not being replicated on the backend. As such, the server was sending a normal receive window, the client was reading it as a scaled receive window, and the connection was grinding to a halt because the values were skewed by the perceived scale factor. If I get the opportunity I will get packet captures for this.

Running through older CSS code where window scaling is not supported, the initial SYN/SYN-ACK spoofing does not seem to support the window scale option from the server (CSS) side for L5 connections.

David.

Gilles Dufour Fri, 04/16/2010 - 06:22

See also

CSCsw25443 The command "flow tcp-window-scale disabled" was broken and did not fully disable the propagation of the TCP Window Scale (WS) option to the server and/or the SSL Module. Also make sure the TCP WS option is not incorrectly set in a spoofed TCP SYN to the server if it was not present in the original TCP SYN from the client.

First fix in 8.10 B505s

Gilles.

d-fillmore Mon, 05/17/2010 - 06:50

Thanks to everyone for the great information in this post.

My customer has a very busy E-commerce website running through a pair of CSS11501s and they noticed that the performance was occasionally very poor through the CSS after we upgraded to the latest 8.20. Performance was fine if they accessed the server directly.

After looking at the traces, it looks like they are being affected by this, so I am going to suggest they turn it off to emulate the behaviour prior to the upgrade.

Many Thanks, Dom

kevin-shaw Fri, 05/20/2011 - 14:02

I too have been having issues with window scaling in the newer versions of CSS code. I am seriously considering disabling the feature, as the CSS did not appear to support it prior to 8.00 and all of my load-balanced applications worked fine. With that said, I was concerned to see you mention that, after making that change, you saw a CSS lock-up.

Using the command 'flow tcp-window-scale disabled' which seemed to resolve the issue completely so that the CSS didn't allow the Window scaling option. However, later in the day we experienced a lock up of the CSS and an outage. This is currently being investigated.

I have not been able to find a lot of info on this feature either.  Which is scary.  What other feature changes are looming....

I must admit I am struggling to find good information about the behaviour of this window scaling features.

farrell.da Mon, 05/23/2011 - 01:10

Hello Kevin,

Just to update, it appears the lock-up was unrelated and probably due to a worn-out flash card. These boxes have had new flash cards and have been completely rebuilt with the latest safe harbour release. They are now rock solid again.

Thanks,

David Farrell.
