WAN Link issue

Unanswered Question
Apr 19th, 2012

Hi Experts

I have a 1G fiber link between 2 sites. As per the MRTG output, the link is underutilized, but the strange thing is that when I copy 100G from site 1 to site 2, the download averages just 16 megabytes per second.

Please clarify.

jamil

edwin.summers Thu, 04/19/2012 - 11:19

TCP doesn't dump traffic on a link to fill the pipe.  It transmits a piece at a time and waits for the remote end to acknowledge receipt prior to sending more information.  This allows the protocol to ramp up its transmit rate as long as things are going well, or throttle it down if it doesn't receive timely acknowledgements (such as if the link exhibits packet loss or is becoming congested).

Think of it as feeding a baby.  It would be tempting to think that the size of your link equates to the size of the baby's mouth, and we can just start dumping food in the mouth to keep it full.  However, that's a good way to ensure that little food makes it to baby's stomach.  Instead, we send a spoonful of food at a time (the amount of food kinda like the TCP window - the amount of data that the sender will transmit before waiting on an acknowledgement), allow the baby to chew, then open her mouth to show it is empty and ready for another bite (the acknowledgement).

Sometimes baby is going to hold her mouth closed (not ready, still chewing: latency/delay), or spit out the food (packet loss).  In this case we must scrape up the food and "re-send" it.  As you can see, depending on the packet loss, delay, and speed of acknowledgements, the amount of time to transmit the file (feed the baby a jar of food) can vary based on link conditions.  Even if one baby's mouth is twice as large as another's, if its latency (chew time) is longer, it's going to be a longer feeding session than for the baby with the smaller mouth.

vmiller's link provides a good explanation of the window size (size of a spoonful of food) and theoretical calculation of expected speed.  Note that a bigger spoon is not always better.
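The spoon-and-chew picture can be put into rough numbers. The following is a minimal sketch, assuming a simplified model in which at most one full window is in flight per round trip and there is no packet loss; the function name and the two example links are illustrative and not taken from this thread.

```python
def max_tcp_throughput_bps(window_bytes, rtt_seconds, link_bps):
    """Upper bound on single-stream TCP throughput, in bits per second.

    Simplified model: at most one window of data is in flight per round
    trip, so throughput cannot exceed window / RTT, nor the link rate.
    """
    window_limit_bps = window_bytes * 8 / rtt_seconds
    return min(window_limit_bps, link_bps)


# Hypothetical comparison: a "big mouth, slow chew" link versus a
# "small mouth, fast chew" link, both using the same 64 KiB window.
window = 64 * 1024
fat_slow = max_tcp_throughput_bps(window, rtt_seconds=0.020, link_bps=1_000_000_000)
thin_fast = max_tcp_throughput_bps(window, rtt_seconds=0.001, link_bps=100_000_000)

print(f"1 Gbps link, 20 ms RTT:  {fat_slow / 1e6:.1f} Mbps")
print(f"100 Mbps link, 1 ms RTT: {thin_fast / 1e6:.1f} Mbps")
```

In this model the slower link with the short round trip finishes first, because the faster link spends most of its time waiting for acknowledgements.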

Hope that's not too silly of an explanation.

vmiller Thu, 04/19/2012 - 11:37

Well played sir. Much better than what I could have written.

vmiller Fri, 04/20/2012 - 08:24

No matter. The end stations are still the ones that set the transmit and receive windows.

ibrahim.jamil Fri, 04/20/2012 - 08:35

Hi vmiller

Since I have a 1 Gig pipe, why shouldn't I get an average transfer rate of 128 MBps, using the formula of dividing 1024 by 8? Please advise.

thanks

jamil
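As a rough check on the 1024/8 figure: Gigabit Ethernet carries 10^9 bits per second, so the raw ceiling is 1000/8 = 125 MB/s, and framing overhead lowers the usable TCP payload rate further. The sketch below assumes a standard 1500-byte MTU, IPv4 and TCP headers without options, and standard Ethernet framing; the function name and its parameters are illustrative.

```python
LINK_BPS = 1_000_000_000  # Gigabit Ethernet: 10**9 bits per second


def tcp_goodput_bytes_per_sec(link_bps=LINK_BPS, mtu=1500, vlan_tag=False):
    """Best-case TCP payload rate on Ethernet, ignoring ACK traffic and loss."""
    ip_header = 20    # IPv4, no options (assumption)
    tcp_header = 20   # TCP, no options (assumption)
    payload = mtu - ip_header - tcp_header

    # Bytes that occupy the wire for every frame:
    # preamble + SFD (8), Ethernet header (14), FCS (4), inter-frame gap (12).
    on_wire = mtu + 8 + 14 + 4 + 12
    if vlan_tag:
        on_wire += 4  # 802.1Q tag

    return (link_bps / 8) * payload / on_wire


print(f"Raw line rate:       {LINK_BPS / 8 / 1e6:.0f} MB/s")
print(f"Best-case TCP rate:  {tcp_goodput_bytes_per_sec() / 1e6:.1f} MB/s")
print(f"With an 802.1Q tag:  {tcp_goodput_bytes_per_sec(vlan_tag=True) / 1e6:.1f} MB/s")
```

So even a perfectly tuned single transfer tops out at roughly 118 MB/s rather than 128 MB/s.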

vmiller Fri, 04/20/2012 - 08:58

You would have to go through the TCP tuning drill based on your workstation operating system.

http://www.psc.edu/networking/projects/tcptune/

There are some references there to a Microsoft tech page for a how-to (I avoid workstation issues like the plague).

There could also be some application parameters that need adjusting.

JosephDoherty Fri, 04/20/2012 - 09:53

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

Ibrahim Jamil wrote:

Hi vmiller

Since I have a 1 Gig pipe, why shouldn't I get an average transfer rate of 128 MBps, using the formula of dividing 1024 by 8? Please advise.

thanks

jamil

Google BDP, "bandwidth delay product", LFN, "long fat network", and/or TCP RWIN.

PS:

There are other issues too, but the foregoing is usually the biggest one.

JosephDoherty Fri, 04/20/2012 - 17:19

Then you need a receive window of about 244 KB (the bandwidth delay product of 1 Gbps at a 2 ms RTT). Older TCP stacks often don't use TCP window scaling by default; some don't even default to 64 KB.

Or, working the equation the other way, 16 MBps (128 Mbps) corresponds to an RWIN of 32 KB.
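The two window figures above follow from the bandwidth delay product. Here is a minimal sketch of that arithmetic, assuming the 2 ms round-trip time reported earlier in the thread and treating 16 MBps as 16 × 10^6 bytes per second; the function name is illustrative.

```python
def bdp_bytes(rate_bps, rtt_seconds):
    """Bandwidth delay product: bytes that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_seconds / 8


RTT = 0.002  # 2 ms ping time reported in this thread

# Window needed to fill a 1 Gbps link at this RTT.
needed = bdp_bytes(1_000_000_000, RTT)

# Window implied by the observed 16 MB/s (128 Mbps) transfer rate.
implied = bdp_bytes(16 * 8 * 1_000_000, RTT)

print(f"Window to fill 1 Gbps:     {needed:.0f} bytes (~{needed / 1024:.0f} KiB)")
print(f"Window implied by 16 MB/s: {implied:.0f} bytes (~{implied / 1000:.0f} KB)")
```

An observed rate that matches a small power-of-two window this closely is a hint, not proof, that the receive window is the limit; a packet capture is still needed to confirm it.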

JosephDoherty Sat, 04/21/2012 - 03:36

First, confirm this is the issue. The best way would be to sniff the traffic and look at the RWIN being advertised.  If it is too small, the easy, although expensive, fix is to drop in a WAN accelerator on both ends.  Many will spoof the connection and do things like increase RWIN across the WAN link.  (They often do other tricks too.)
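When reading a capture, keep in mind that the 16-bit window field in each TCP header is only the raw value. If window scaling was negotiated, the shift count appears only in the SYN segments, and the effective window on later segments is the raw value shifted left by that count. The sketch below parses a raw TCP header with Python's standard `struct` module; the function name and the sample header bytes (ports, window values) are made up for illustration and do not come from a real capture.

```python
import struct


def advertised_window(tcp_header: bytes):
    """Return (raw_window, window_scale_shift or None) from a raw TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 bytes")

    # Bytes 12-13 hold data offset + flags, bytes 14-15 hold the window.
    offset_flags, raw_window = struct.unpack_from("!HH", tcp_header, 12)
    header_len = (offset_flags >> 12) * 4
    options = tcp_header[20:header_len]

    shift = None
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # end of option list
            break
        if kind == 1:            # NOP is a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option
        length = options[i + 1]
        if length < 2:
            break                # malformed option
        if kind == 3 and length == 3 and i + 2 < len(options):
            shift = options[i + 2]   # window scale option
        i += length
    return raw_window, shift


# Hypothetical SYN: 20-byte base header plus MSS=1460, NOP, window scale=8.
syn = struct.pack("!HHIIHHHH", 50000, 445, 0, 0, (7 << 12) | 0x002, 8192, 0, 0)
syn += bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 8])

raw, shift = advertised_window(syn)
later_raw_window = 1000  # raw window on a later, non-SYN segment (made up)
print("scale shift from SYN:", shift)
print("effective window:", later_raw_window << (shift or 0), "bytes")
```

If the SYN carries no window scale option, the receiver cannot advertise more than 65,535 bytes, which caps throughput at that window per round trip.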

The more difficult, though possibly less expensive, way is to determine how to adjust RWIN on the receiving host and then reconfigure it; this often requires a host reload.  How the configuration should be modified varies per OS and OS version; it is generally documented on the Internet for many OSs.

PS:

Later OSs shouldn't normally need manual configuration as their defaults are much better.

Later Windows hosts have different defaults depending on whether they are workstations or servers; there's also a patch for earlier Windows servers that adds later TCP stack features.

PPS:

Another approach is to use a transfer app that slices the file up so it can transfer multiple concurrent streams.  Such an app also deals better with packet loss; the aggregate rate generally both ramps up faster and recovers faster.
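The slicing idea can be sketched as follows. This is only an illustration of how a file might be divided into byte ranges, one per stream, plus a rough estimate of how many streams it would take to fill the link. It assumes each stream achieves the 16 MB/s seen in this thread and that streams scale linearly, which real networks do not guarantee; the function name is hypothetical.

```python
import math


def split_into_ranges(total_bytes, streams):
    """Split [0, total_bytes) into contiguous (start, end) ranges, one per stream."""
    if streams < 1:
        raise ValueError("need at least one stream")
    base, extra = divmod(total_bytes, streams)
    ranges = []
    start = 0
    for i in range(streams):
        # The first `extra` streams each carry one additional byte.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Rough estimate: streams needed to approach the 125 MB/s raw line rate
# if every stream is window-limited to about 16 MB/s.
per_stream = 16_000_000
line_rate = 125_000_000
streams_needed = math.ceil(line_rate / per_stream)

print("streams needed:", streams_needed)
for start, end in split_into_ranges(100 * 10**9, streams_needed):
    print(f"bytes {start} to {end - 1}")
```

Each range would then be sent over its own TCP connection and reassembled at the far end by offset.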

edwin.summers Fri, 04/20/2012 - 09:06

Just to clear up some mis-assumptions I've probably made, can you share additional information regarding your setup?  A diagram and any configuration information would be helpful.  Good things to know:

1) What devices you are connecting

2) End-to-end latency

3) What protocol you are using to transfer the file

4) Whether or not any other traffic is traversing the link, and if so, what kind and how much (even if it's just "three guys and the Queen playing Farmville")

5) Have you performed any traffic sniffing, and turned up anything odd (unusual number of duplicate ACKs, etc)

6) Distance between the sites

7) Type of fiber used

Best,

Ed

ibrahim.jamil Fri, 04/20/2012 - 09:56

Hi Ed

thanks for reply

1) What devices you are connecting

A)1 X  6500 at each site

2) End-to-end latency

When I ping the end host, it is only 2 ms.

3) What protocol you are using to transfer the file

Normal copy of 100G at site 1 and paste at the 2nd site

4) Whether or not any other traffic is traversing the link, and if so, what kind and how much (even if it's just "three guys and the Queen playing Farmville")

According to the MRTG output, the consumption of the link is 70M.

5) Have you performed any traffic sniffing, and turned up anything odd (unusual number of duplicate ACKs, etc)

?????

6) Distance between the sites

20 Km

7) Type of fiber used

pair of fiber

I have connected two Server 2008 machines back to back using 1G NICs for testing, and I copied 25 GByte within 6 minutes. Now where does buffering and windowing come into it? Please explain.


jamil
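For comparison, the back-to-back result can be converted into a rate. A minimal sketch of the arithmetic, assuming "25 GByte" means 25 × 10^9 bytes, the copy took exactly 6 minutes, and 16 MB/s is the rate seen across the WAN:

```python
copied_bytes = 25 * 10**9      # assumption: decimal gigabytes
elapsed_seconds = 6 * 60

back_to_back_MBps = copied_bytes / elapsed_seconds / 1e6
back_to_back_Mbps = back_to_back_MBps * 8
wan_MBps = 16                  # rate reported for the site-to-site copy

print(f"Back to back: {back_to_back_MBps:.1f} MB/s ({back_to_back_Mbps:.0f} Mbps)")
print(f"Across WAN:   {wan_MBps} MB/s ({wan_MBps * 8} Mbps)")
print(f"Ratio:        {back_to_back_MBps / wan_MBps:.1f}x")
```

Under these assumptions the back-to-back copy runs at roughly 69 MB/s, about four times the WAN rate but still below gigabit line rate, which suggests the hosts themselves are also a limit.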

edwin.summers Fri, 04/20/2012 - 10:43

Okay, and based on your earlier message, there is an 802.1q trunk on the link between the sites, correct?  Just wanted to check a couple of other things:

1) I'm assuming that each end station is attached directly to the 6509.  Each end station is connected via a 1Gig Ethernet link, correct?

2) Copy/paste of 100G file...this is a Windows copy/paste? (CIFS)

3) Have you confirmed there are no speed/duplex mismatches between any of the devices?

4) What do the interface counters (errors, specifically) show when you view each of the interfaces (both ends of the trunk, and the two end-station interfaces)?

5) How many stations are on the vlan with the two stations in question, and how many vlans are traversing the trunk?

The RTT seems very good (2ms b/w end stations).  Quickest thing that pops out in my mind in that case would be port mismatch somewhere (someone's going half-duplex).

ibrahim.jamil Fri, 04/20/2012 - 11:09

Hi Ed

1) I'm assuming that each end station is attached directly to the 6509.  Each end station is connected via a 1Gig Ethernet link, correct?

a) Correct


2) Copy/paste of 100G file...this is a Windows copy/paste? (CIFS)

A) Correct copy from Server 1 at site (1) and paste on server 2 at site (2)

3) Have you confirmed there are no speed/duplex mismatches between any of the devices?

A) sh int status shows: duplex full

4) What do the interface counters (errors, specifically) show when you view each of the interfaces (both ends of the trunk, and the two end-station interfaces)?

A) No errors on either interface where the trunk link is terminated


5) How many stations are on the vlan with the two stations in question, and how many vlans are traversing the trunk?

A) Only 3 VLANs are allowed to cross the trunk

The RTT seems very good (2ms b/w end stations).  Quickest thing that pops out in my mind in that case would be port mismatch somewhere (someone's going half-duplex).

A) sh int status shows all interfaces in full duplex

Thanks Ed, I do appreciate your time in answering me.

Jamil

edwin.summers Sat, 04/21/2012 - 09:38

Sorry, it's getting difficult to determine remotely based on what you're seeing.  JosephDoherty has made great suggestions in his last post, and probably best to focus on checking those out rather than me confusing the issue with additional changes.  I'll keep an eye out in case something springs out at me.

Best of luck! -Ed

Talha Ansari Sun, 04/22/2012 - 22:28

@ Ibrahim.

There is a tool named iperf which can be used for testing bandwidth between the two sites, and I think it's free. Try it if possible and let us know the results.

@ Edwin.... Even in my rarest of rare thoughts I could never have imagined that TCP communication could be compared to feeding a baby. It was amazing, and I'm glad that I came across it.
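To show what such a test measures, here is a minimal sketch of a memory-to-memory TCP throughput test using only Python's standard library. It is not iperf and does not reproduce its options; it runs sender and receiver on the same machine over loopback so it is self-contained. For a real site-to-site test the receiving half would have to run on the remote server. All names are illustrative.

```python
import socket
import threading
import time


def _sink(server_sock, result):
    """Accept one connection and count the bytes received until it closes."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["bytes"] = total


def measure_throughput(total_bytes=20 * 1024 * 1024, host="127.0.0.1"):
    """Send total_bytes over one TCP stream; return (bytes_received, seconds)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))          # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    payload = b"\0" * 65536
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            n = min(len(payload), total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()                 # wait until the receiver has seen EOF
    elapsed = time.perf_counter() - start
    server.close()
    return result["bytes"], elapsed


received, seconds = measure_throughput()
print(f"{received} bytes in {seconds:.3f} s = {received / seconds / 1e6:.1f} MB/s")
```

Because no disk or file-sharing protocol is involved, a test like this separates network limits from storage or CIFS limits.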

Regards,

Talha

This Discussion

Posted April 19, 2012 at 9:38 AM