frame retransmission and low throughput

abopche · ‎11-27-2001

Hi,

I have 2 locations connected to each other by 2 x 64 kbps permanent satellite circuits (600 ms latency). When I try to FTP between them, I get a throughput of only about 90 kbps. In the LAN traces (sniffer) I can see a lot of duplicate frames causing retransmissions, which is probably what is pulling the throughput down. Occasionally, when there are no duplications or retransmissions, I am able to achieve good throughput (say 120+ kbps).

Any suggestions? What could be the cause of these duplications?

MickPhelps · ‎11-27-2001

If I had to guess, I would say that the latency is causing TCP to believe that a packet was lost when it really wasn't. In your traces, what is the delta in ms between the original and the duplicate packet? You may need to change your TCP timeout values to something above 600 ms (1200 ms if 600 ms is the one-way latency).
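For context on what "timeout" means here: many TCP stacks do not use one fixed retransmission timeout but derive it from measured round-trip times. The sketch below follows the standard estimator described in RFC 6298 (and RFC 2988 before it), ignoring clock granularity. Whether a given stack follows it, and what minimum it applies, is an assumption to verify; the function name and the 1000 ms default floor are only illustrative.

```python
def rto_after_samples(rtt_samples_ms, min_rto_ms=1000.0):
    """Return the retransmission timeout (ms) after a series of RTT samples.

    Simplified RFC 6298 estimator: clock granularity is ignored and
    min_rto_ms is a tunable floor (an assumption, stacks differ).
    """
    srtt = None    # smoothed round-trip time
    rttvar = None  # round-trip time variation
    rto = None
    for r in rtt_samples_ms:
        if srtt is None:
            # First measurement: SRTT = R, RTTVAR = R / 2
            srtt = float(r)
            rttvar = r / 2.0
        else:
            # Later measurements: update RTTVAR first, then SRTT
            rttvar = 0.75 * rttvar + 0.25 * abs(srtt - r)
            srtt = 0.875 * srtt + 0.125 * r
        rto = max(min_rto_ms, srtt + 4.0 * rttvar)
    return rto


print(rto_after_samples([600]))       # 1800.0
print(rto_after_samples([600, 600]))  # 1500.0
```

Under these assumptions, a stack that sees a steady 600 ms round trip would already be using a timeout well above 600 ms, so the delta between original and duplicate in the trace is worth checking before changing any timers.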

Mick.

abopche · ‎11-28-2001

Thanks Mick. 600 ms is the round-trip ping response time. Also, when I run the transfer over a single 64K link, the problem does not occur and I get close to 62-63 kbps. I would not expect adding a second link to have any impact on TCP packet loss, so I think it has something to do with registry settings. My send and receive window size is 64K, with an MTU of 1500 bytes. All of these settings are clearly visible in the traces. Any ideas as to how to increase the TCP timeout value?
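For reference, a quick bandwidth*delay calculation using the figures quoted in this thread (2 x 64 kbps, 600 ms round trip, 64K window). This is only a sketch: it assumes 64 kbps means 64,000 bit/s and that the 64K window is 65,535 bytes, and the function names are made up for illustration.

```python
def bdp_bytes(link_bps, rtt_ms):
    """Bytes that must be in flight to keep the link full."""
    return link_bps * rtt_ms / 8000  # bits * ms -> bytes


def window_limited_bps(window_bytes, rtt_ms):
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes * 8000 / rtt_ms  # bytes per ms -> bits per second


LINK_BPS = 2 * 64_000   # assumption: 64 kbps = 64,000 bit/s per circuit
RTT_MS = 600            # round-trip time quoted above
WINDOW_BYTES = 65_535   # assumption: "64K" window without window scaling

print(bdp_bytes(LINK_BPS, RTT_MS))               # 9600.0
print(window_limited_bps(WINDOW_BYTES, RTT_MS))  # 873800.0
```

If those assumptions hold, the two circuits together hold about 9,600 bytes in flight, well under the 64K window, so the window size by itself should not be what limits throughput here.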

MickPhelps · ‎11-28-2001

Does it matter which 64K path you use when you use just one path? Do the retransmissions go away with either path?

If this is the case, one path may be slightly faster than the other and if you're load balancing over both, you may have packets showing up out of order. A sniffer should be able to tell you if this is happening.
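A toy model of that idea: packets are sent round-robin over two links with slightly different delays, and we count how many arrive after a higher-numbered packet. Everything here (function names, the 10 ms send gap, the 300/325 ms delays) is a hypothetical illustration and ignores queueing and serialization delay. It is not a measurement of these circuits.

```python
def arrival_order(num_packets, send_gap_ms, delay_ms_by_link):
    """Return sequence numbers in arrival order under per-packet round-robin."""
    arrivals = []
    for seq in range(num_packets):
        link = seq % len(delay_ms_by_link)  # alternate links per packet
        arrival_ms = seq * send_gap_ms + delay_ms_by_link[link]
        arrivals.append((arrival_ms, seq))
    arrivals.sort()
    return [seq for _, seq in arrivals]


def count_out_of_order(seqs):
    """Count packets that arrive after a higher sequence number was seen."""
    highest = -1
    late = 0
    for seq in seqs:
        if seq < highest:
            late += 1
        else:
            highest = seq
    return late


equal = arrival_order(6, 10, [300, 300])
skewed = arrival_order(6, 10, [300, 325])
print(equal, count_out_of_order(equal))    # [0, 1, 2, 3, 4, 5] 0
print(skewed, count_out_of_order(skewed))  # [0, 2, 1, 4, 3, 5] 2
```

In a real trace the same check can be done by eye: look for TCP sequence numbers that go backwards at the receiver, followed by duplicate ACKs.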

Not sure about how to change TCP timers... hopefully, it isn't application specific.

Mick.

svermill · ‎11-28-2001

It usually is application specific (and thus often can't be changed). Of course, TCP only "executes" on the router in very rare instances, so each TCP stack outside of the router has to be tweaked (mostly on the transmit side). I haven't been up against it in a while, but I think it is often necessary to work with the timers, the window size (especially the initial value of the congestion window, "cwnd"), and the MTU. With the window too small, throughput goes down while the sender waits for an ACK. With timers set too low, as you indicated, timeouts occur, and congestion avoidance then slows things down considerably, so timer values that are too low are self-defeating. The MTU matters because of the way TCP congestion control works: once a packet is "dropped" (read: timed out), TCP collapses the transmit window and slowly builds it back up (slow start is the terminology, I believe) toward the max value (the max according to the receiver on the other end). The window grows in units of segments, so a larger MTU means more outstanding bytes per outstanding segment/ACK.

Here is an excellent RFC (RFC 2760, "Ongoing TCP Research Related to Satellites") that summarizes numerous ideas on how to properly run TCP over high bandwidth*delay product links:
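To put a rough number on how long that rebuild takes at a 600 ms round trip, here is a deliberately simplified model: after a timeout the window restarts at one segment, doubles each round trip up to a threshold, then grows by one segment per round trip. The function name and the example thresholds are hypothetical, and real stacks differ in the details.

```python
def rtts_to_rebuild(target_segments, ssthresh_segments):
    """Round trips needed to grow cwnd from 1 segment to target_segments.

    Simplified model: doubling below ssthresh (slow start), then
    +1 segment per round trip (congestion avoidance).
    """
    cwnd = 1
    rtts = 0
    while cwnd < target_segments:
        if cwnd < ssthresh_segments:
            cwnd = min(cwnd * 2, ssthresh_segments)
        else:
            cwnd += 1
        rtts += 1
    return rtts


RTT_MS = 600  # round-trip time quoted earlier in the thread

# Hypothetical case: 7 segments needed to fill the pipe, threshold at 4
print(rtts_to_rebuild(7, 4) * RTT_MS)    # 3000 (ms)

# Hypothetical case: growing all the way to 44 segments, threshold at 22
print(rtts_to_rebuild(44, 22) * RTT_MS)  # 16200 (ms)
```

Since the window is counted in segments, a larger segment size means each step in this model carries more bytes, which is the point about the MTU above.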

http://masaka.cs.ohiou.edu/papers/rfc2760.txt

There are apparently a lot of new ideas out there since I studied the subject a few years back. Of course, the old standby is TCP header compression.

Any thoughts as to why splitting into two paths might aggravate the situation? I can see that out-of-order packets could get worse if even just one link was in slow start: the receiving end would have to wait for long periods of time on both links even if just one was slowed. Of course, that would only occur if per-packet load balancing were in effect, right?