All 3 devices are connected to a Cisco Catalyst 6509 with a Sup32 and
copper Gigabit Ethernet interfaces. All 3 devices are Dell servers.
lab1 is a Dell 2550 with dual 3.0GHz processors and 2GB RAM.
lab2 and Win2k3-1 are Dell quad-processor 3.1GHz servers with 4GB RAM.
Everything on the switch and all the server interfaces are
hard-coded to 1000/full.
I have an FTP server and Iperf running on LinuxES-lab2. When I
tested iperf from lab1, I got about 856Mbps throughput:
[root@LinuxES-lab1 tmp]# iperf -c 192.168.15.100 -t 10
Client connecting to 192.168.15.100, TCP port 5001
TCP window size: 16.0 KByte (default)
[ 3] local 192.168.15.110 port 32877 connected with 192.168.15.100 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1020 MBytes 856 Mbits/sec
When I tested from Win2k3-1, I got about 600Mbps throughput.
However, when I download a 2GB file over FTP from lab2 to lab1, I get only
about 325Mbps. If I use Secure Copy (scp), I get only about 72Mbps.
If I use Secure FTP (sFTP), I only get about 24Mbps.
Is there a way to improve the download speed for FTP, scp and sFTP?
One thing you could try is jumbo frames. Set the MTU on these devices to a value larger than the default 1500 bytes. This will increase the data throughput: since you can squeeze more data into each packet with the increased MTU, the IP overhead is reduced because fewer packets need to be sent over the line. Refer to attached for further details... Good luck.
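For example, on the Linux hosts (assuming the NICs and drivers actually support jumbo frames - eth0 here is just a placeholder for your GigE interface), the change is something like:

ifconfig eth0 mtu 9000
(or with iproute2: ip link set dev eth0 mtu 9000)

Keep in mind both servers and every switch port in the path have to agree on the larger MTU, otherwise you trade the speedup for fragmentation or dropped frames.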
Unfortunately, the blade I have on the 6509 does NOT support jumbo frames. It is a 10/100/1000 PoE blade. I am aware of jumbo frames but could not implement them on my 6509.
Any other ideas? Thanks.
Your board should support jumbo frames. Go to the Gig interface and type "mtu 9198" for the interfaces you are concerned about. Also make sure that your NIC cards support jumbo frames as well...
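In IOS the per-interface config would look roughly like this - the interface number is only a placeholder for whichever port your server is patched into, and whether the line card actually accepts the command is exactly what you'll find out:

conf t
interface GigabitEthernet1/1
 mtu 9198
end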
If you haven't already, you might review the articles under "Here are some additional links you may find useful" within http://dast.nlanr.net/Projects/Iperf/ Also, ensure you have the latest NIC drivers on your hosts.
"f you haven't already, you might review the articles under "Here are some additional links you may find useful" within http://dast.nlanr.net/Projects/Iperf/ Also, insure you have the latest NIC drivers on your hosts."
I have the latest NIC drivers because I compile my Linux kernel myself. The drivers are good; otherwise, I would not get 856Mbps throughput with Iperf.
Take a look below: my Catalyst 6506 gig blade does NOT support jumbo frames. See below:
Enter configuration commands, one per line. End with CNTL/Z.
% Unrecognized command
Cisco Internetwork Operating System Software
IOS (tm) s3223_rp Software (s3223_rp-ENTSERVICESK9_WAN-M), Version 12.2(18)SXF4, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2006 by cisco Systems, Inc.
Compiled Thu 23-Mar-06 18:30 by tinhuang
Image text-base: 0x40101040, data-base: 0x42D20000
ROM: System Bootstrap, Version 12.2(17r)SX3, RELEASE SOFTWARE (fc1)
BOOTLDR: s3223_rp Software (s3223_rp-ENTSERVICESK9_WAN-M), Version 12.2(18)SXF4, RELEASE SOFTWARE (fc1)
S65-1 uptime is 19 weeks, 19 hours, 27 minutes
Time since S65-1 switched to active is 19 weeks, 19 hours, 26 minutes
System returned to ROM by power-on (SP by power-on)
System image file is "bootdisk:s3223-entservicesk9_wan-mz.122-18.SXF4.bin"
This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.
A summary of U.S. laws governing Cisco cryptographic products may be found at:
If you require further assistance please contact us by sending email to
cisco WS-C6506-E (R7000) processor (revision 1.0) with 983040K/65536K bytes of memory.
Processor board ID SAL1003AFTB
R7000 CPU at 300Mhz, Implementation 0x27, Rev 3.3, 256KB L2, 1024KB L3 Cache
Last reset from power-on
SuperLAT software (copyright 1990 by Meridian Technology Corp).
X.25 software, Version 3.0.0.
TN3270 Emulation software.
16 Virtual Ethernet/IEEE 802.3 interfaces
96 FastEthernet/IEEE 802.3 interfaces
57 Gigabit Ethernet/IEEE 802.3 interfaces
1915K bytes of non-volatile configuration memory.
65536K bytes of Flash internal SIMM (Sector size 512K).
Configuration register is 0x2102
S65-1# sh mod
Mod Ports Card Type Model Serial No.
--- ----- -------------------------------------- ------------------ -----------
1 48 48 port 10/100/1000mb EtherModule WS-X6148-GE-TX SAL1003ASWV
5 9 Supervisor Engine 32 8GE (Active) WS-SUP32-GE-3B SAL1016KKWC
Mod MAC addresses Hw Fw Sw Status
--- ---------------------------------- ------ ------------ ------------ -------
1 0016.c7ed.7d98 to 0016.c7ed.7dc7 1.1 7.2(1) 8.5(0.46)RFW Ok
5 0016.c7ae.2f82 to 0016.c7ae.2f8d 4.2 12.2(18r)SX2 12.2(18)SXF4 Ok
Mod Sub-Module Model Serial Hw Status
---- --------------------------- ------------------ ----------- ------- -------
5 Policy Feature Card 3 WS-F6K-PFC3B SAL1016KP28 2.1 Ok
5 Cat6k MSFC 2A daughterboard WS-F6K-MSFC2A SAL1016KG48 3.0 Ok
Mod Online Diag Status
Any more ideas, folks? Thanks.
You might want to consider changing the TCP window size to force TCP to put more data in flight, since jumbo frames are not an option. I try to steer clear of this in production, especially in a multi-OS environment, because flows that must go over the internet usually take a performance hit compared with internal flows. But! If this is an internal server and you have the time to tweak it, take a look:
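On Linux that usually means raising the kernel's TCP buffer limits with sysctl. The values below are only illustrative, not tuned recommendations:

sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"

You can sanity-check the effect with iperf before touching the real applications by forcing a larger window, e.g. "iperf -c 192.168.15.100 -w 512K -t 10".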
"window size - and buffer sizes - sorry that wasn't clear, lol."
I already made modifications to these settings in the /proc directory according to the Linux documentation. I still can NOT scale sFTP past 24Mbps or scp past 80Mbps throughput.
I would imagine that if I had issues with the window size and buffer sizes, then Iperf would NOT have shown 856Mbps throughput.
Any more ideas? Thanks.
A possible consideration: you're using Iperf with the implication that all other software using TCP should obtain similar performance. It could be as simple as the applications themselves being the root of the poorer performance, especially two that are performing on-the-fly(?) encryption.
Re: my earlier mention of NIC drivers, I also had in mind the lower Iperf(?) performance you saw with Win2k3-1.
I totally agree. Ultimately, if it's not the TCP stack and not the networking devices, we have to look at the remaining culprit - the software.
You might want to look at a time/sequence graph and see if the flows are steady - if not, you may have an implementation/encryption problem.
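One quick way to test the encryption theory is to rerun the same copy with a different ssh cipher and compare the throughput. Something like the following, assuming an OpenSSH build that offers these ciphers (the file path is just a placeholder for your 2GB test file):

scp -c arcfour /tmp/bigfile root@192.168.15.100:/tmp/
scp -c aes128-cbc /tmp/bigfile root@192.168.15.100:/tmp/

If the arcfour run is dramatically faster, the bottleneck is the crypto (or scp's own internal buffering), not the network.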
"especially two that are performing on-the-fly(?) encryption. "
That could very well be, but it is highly unlikely, because I ran "vmstat 1" on both Linux boxes while the transfer was taking place and the CPU is running at 90% IDLE. In other words, the box has plenty of CPU horsepower left. One other thing: I am using ssh with aes-128 encryption, so the encryption algorithm is very efficient.
Any more ideas? Thanks.
Low CPU utilization might rule that out as a bottleneck. That's assuming all CPU consumption is correctly accounted.
There are other system interactions that can lead to poor application performance: how the application reads and writes to disk, how the application bounces around within its working set, how the application reads and writes to the network. There's also how the system supports such. (An example of how extreme this can get: some computer architectures deliver different performance depending on how reads/writes to RAM are aligned on byte and/or word boundaries.)
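One thing worth double-checking on a multi-processor box: vmstat reports utilization averaged across all CPUs, so a single-threaded ssh/scp process can be pegging one CPU while the overall figure still looks mostly idle. If the sysstat package is installed you can break it out per CPU while the transfer is running, for example:

mpstat -P ALL 1

(or run top and press "1" to show each CPU separately).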
"That's assuming all CPU consumption is correctly accounted".
I used "vmstat 1" to measure cpu utilization.
This low CPU utilization is consistent with
what I see nagios and solarwinds.
The Linux Server is running on a 10k RPM 100GB
RAID 5 drive so I dont' think reading/writing to
the disk is an issue.
I appreciate what you said about the iperf results, but where are your servers patched into the WS-X6148-GE-TX?
The WS-X6148-GE-TX is a heavily oversubscribed blade, i.e. it has an oversubscription rate of 8:1, so for every 8 ports there is a maximum throughput of 1Gbps.
The port groupings are
1 - 8
9 - 16
so if you haven't already, you may want to ensure that each of your servers has a port group to itself.
As I say, I appreciate your iperf results, but this may be worth a try.
Linux_1 is connected to port 1
Linux_2 is connected to port 11
Win2k-1 is connected to port 21
Same result: Iperf shows 856Mbps throughput while scp and sFTP show very poor performance.
Any more ideas? Thanks.
Well, if you interconnect your two Linux servers directly, see what results you get then.
I suspect they'll be what you've seen so far. That would then point at the hosts and/or their applications.
BTW, RAID 5 slows writes. It's great for the "I" portion of the acronym, but not for write performance. Have you benched the drive standalone? It might account for a major portion of the 325 Mbps you've documented.
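A rough standalone bench of the array can be done straight from the shell. The device and paths below are placeholders, and the write test creates a ~2GB file, so point it somewhere with room:

hdparm -t /dev/sda
dd if=/dev/zero of=/data/ddtest bs=1M count=2048 conv=fdatasync
rm /data/ddtest

If the dd write rate comes out around 40 MB/s (roughly your 325 Mbps), the RAID 5 write penalty rather than the switch is what's capping the plain FTP transfer.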
"I appreciate what you said about iperf results but where are your servers patched into the WS-X6148-GE-TX.
The WS-X6148-GE-TX is a heavily oversubscribed blade ie. it has an oversubscription rate of 8:1 so for every 8 ports there is maximum throughput of 1Gbps."
Is this documented anywhere? Can you provide
the link for this? Thanks.
You had to ask :-). I can never find the doc that explains all this, but Edison Ortiz seems to know where they are whenever we get into these sorts of discussions, so I've requested he post the link if he has it.
OK, I replaced the Catalyst 6506 with an Extreme switch. I am now able to push about 600Mbps FTP, 350Mbps scp and 300Mbps sFTP. At 350Mbps scp throughput, CPU on the Linux boxes is at 90% utilization, which is now the bottleneck.
It seems to me like the Catalyst 6506 can not scale past 90-100Mbps with scp traffic on the gig blade.
Any ideas anyone? Thanks.
Those are interesting results! :P
I know one solution:
Out of curiosity, which Extreme switch did you use?
It's part of the Release Notes.
As for your previous post, the 6148-GE-TX is as follows:
Number of ports: 48
Number of port groups: 2
Port ranges per port group: 1-24, 25-48
I'm getting confused now
When you use either the WS-X6548-GE-TX or WS-X6148-GE-TX modules, there is a possibility that individual port utilization can lead to connectivity problems or packet loss on the surrounding interfaces. Especially when you use EtherChannel and Remote Switched Port Analyzer (RSPAN) in these line cards, you can potentially see the slow response due to packet loss. These line cards are oversubscription cards that are designed to extend gigabit to the desktop and might not be ideal for server farm connectivity. On these modules there is a single 1-Gigabit Ethernet uplink from the port ASIC that supports eight ports.
---> These cards share a 1 Mb buffer between a group of ports (1-8, 9-16, 17-24, 25-32, 33-40, and 41-48) since each block of eight ports is 8:1 oversubscribed. The aggregate throughput of each block of eight ports cannot exceed 1 Gbps. <---
Table 4 in the Cisco Catalyst 6500 Series 10/100- & 10/100/1000-Mbps Ethernet Interface Modules shows the different types of Ethernet interface modules and the supported buffer size per port.
I'm sure you posted diagrams that showed the port groupings of these modules?
Don't be confused. Those Release Notes are wrong.
I did some digging now and found some internal documents which I can't publish.
The WS-X6148-GE-TX has 2 Pinnacles that connect to the ASIC, but these 2 Pinnacles are broken down into 3 Port Groups each. Each Port Group has 8 ports.
Pinnacle 1:
Port Group 1 = Ports 1-8
Port Group 2 = Ports 9-16
Port Group 3 = Ports 17-24
Pinnacle 2:
Port Group 1 = Ports 25-32
Port Group 2 = Ports 33-40
Port Group 3 = Ports 41-48
"These line cards are oversubscription cards that are designed to extend gigabit to the desktop and might not be ideal for server farm connectivity. On these modules there is a single 1-Gigabit Ethernet uplink from the port ASIC that supports eight ports."
So let me see if I understand this correctly, since I am a firewall/security person and not a routing/switching person. Cisco is selling me a Gigabit line card, but the line card can NOT do gig throughput with my servers. Is that a correct statement?
Maybe it is time for me to look at Extreme.
Your understanding is correct. However, if you see where Cisco positions this module, it is in the wiring closet and not as a server farm blade. So chances are it is unlikely that you will be oversubscribing too much at any one time.
Obviously it is also cheaper than a module that supports full gigabit throughput on each port, although even with the 6748 module there is a little oversubscription, i.e. 48 gigabit ports with a 40Gbps connection to the switch fabric.
Many people get all wound up about gigabit throughput being just that, but this module was primarily designed for clients, not servers, hence the oversubscription.
Edit - to be more precise, the line card can do gigabit throughput on a port, but if more than one port in the group of 8 is in use at the same time, those ports cannot all have the full gigabit throughput simultaneously.
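To put rough numbers on it (purely illustrative): with two servers in the same group of 8, the group's aggregate ceiling is still 1Gbps, so two simultaneous line-rate flows can only average about 500Mbps each, and in practice less, because the group's shared 1 Mb buffer drops bursts.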
As a suggestion - try to sniff the traffic that is being sent between the Linux boxes during the iperf test, FTP and scp.
I'm pretty sure that in the scp test you will see a lot of retransmissions and the TCP window size will not go up to the limit.
But if you start additional scp sessions, you will see that the sum of the scp connections increases proportionally to the number of scp sessions.
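For example, you could capture the scp run on the server side with tcpdump and then look at the time/sequence graph and retransmission counts in Wireshark or tcptrace (the interface name and file path are just placeholders):

tcpdump -i eth0 -s 96 -w /tmp/scp-test.pcap host 192.168.15.110 and port 22

Capturing only the headers (-s 96) keeps the file small; repeat the same capture during the iperf run on port 5001 so the two traces can be compared side by side.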
The Extreme chassis modules, at least the last time I looked, are also oversubscribed. The G48Te, for example, has an oversubscription of 4:1, or 2:1 with 2 MSMs. I am by no means an expert on any of this, but I do have Extreme gear in our core. If you are a CLI guy, as I assume you are, I don't think you will enjoy the interface Extreme has to offer.
Just my two cents.
(Indentation was getting a bit much.)
There's still something odd about this. Regardless of the oversubscription capacity of the card, why such differences in traffic rates between Iperf, FTP, and scp/sFTP on the 6500?
Yes, the last set of stats, on the Extreme switch, shows scp/sFTP at half the rate of FTP with the server being CPU constrained, but not the same proportions across the 6500. I.e., it makes sense that Iperf would be the fastest, likely NIC limited; followed by straight FTP, perhaps disk limited; followed by scp/sFTP, CPU limited. What doesn't make sense is why, if the 6500 could handle 856 Mbps Iperf and 600 Mbps from Windows, it couldn't also handle similar bandwidths for the other traffic as the Extreme switch did.