Slow network problems--revisited

Unanswered Question
Mar 25th, 2009

Greetings, I posted about a month ago (Feb 18) about issues with our network running slow (file copying and printing, as a couple of examples). At this point, we're still having issues. How is untagged versus tagged traffic treated? If frames aren't tagged properly, what would happen? I still see our switches running free of errors, so I'm not sure what else to look at.

Yudong Wu Wed, 03/25/2009 - 13:41

Hi Chris,

Could you provide a bit more info so that others can help you here?

1. Did the issue happen within the same VLAN or across VLANs?

2. What kind of file transfer protocol did you test on?

3. Did the issue happen in both directions or just one direction?

4. What kind of platform are you using, and their IOS version?

5. Do you have QoS enabled?

....

christopher_hal... Wed, 03/25/2009 - 14:08

Let me start by posting a link to the original:


http://forums.cisco.com/eforum/servlet/NetProf?page=netprof&forum=Network%20Infrastructure&topic=LAN%2C%20Switching%20and%20Routing&CommCmd=MB%3Fcmd%3Dpass_through%26location%3Doutline%40^1%40%40.2cd27f4d



1. Did the issue happen within the same VLAN or across VLANs?


We have two vlans running on our switches (plus vlan1 for mgt)--one for data and one for voice. While the file copy has been tested across wan links (MPLS), the vlans are the same. We ran the following tests:

a) From a remote PC to its local server and back.

b) From one remote PC to another PC on the same LAN and back.

c) From a remote PC to a PC on the host and back.


The results were mixed. At times it was very quick moving, say, a 20Mb file from the remote PC to its server (within a minute), yet it would be slow returning that same file to the PC. And yet again, the file would quickly copy to a remote PC. Let me specify that we are copying files from one Windows platform to another--XP on the desktop and 2003 on the server. We do run DFS on the servers and have somewhat eliminated that as being the issue by copying files to/from folders outside of that structure.


2. What kind of file transfer protocol did you test on?


See above.


3. Did the issue happen in both directions or just one direction?


Again, it was random. We ran the above tests at multiple locations with the same mixed results.


4. What kind of platform are you using, and their IOS version?


We are running 3560s with 12.2(35)SE5. On the routers, 2811 and IOS version 12.4(3H).



5. Do you have QoS enabled?

We are running QoS.

....


Thanks for the quick reply.

Chris



Jason Fraioli Wed, 03/25/2009 - 14:12

I would check for duplex/speed mismatches.


#show interfaces status


If you have any "a-half" or "a-10" entries listed, that could be the source of your problems.
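On a switch with many ports, the check above can be scripted. The following is only a rough sketch: the sample output is invented for illustration (port names and values are hypothetical), and it assumes the usual column order of `show interfaces status`, where VLAN, duplex, and speed follow the status word.

```python
# Hypothetical sample of `show interfaces status` output; all values are invented.
SAMPLE = """\
Port      Name               Status       Vlan       Duplex  Speed Type
Fa0/1                        connected    10         a-full  a-100 10/100BaseTX
Fa0/2                        connected    10         a-half   a-10 10/100BaseTX
Fa0/3                        notconnect   10           auto   auto 10/100BaseTX
Gi0/1                        connected    trunk      a-full a-1000 10/100/1000BaseTX
"""


def suspect_ports(output):
    """Return (port, duplex, speed) for connected ports at half duplex or 10 Mb/s."""
    suspects = []
    for line in output.splitlines():
        fields = line.split()
        if "connected" not in fields:
            continue  # header line, or a port that is not up
        i = fields.index("connected")
        if len(fields) < i + 4:
            continue  # not enough columns after the status word
        duplex, speed = fields[i + 2], fields[i + 3]
        if duplex in ("half", "a-half") or speed in ("10", "a-10"):
            suspects.append((fields[0], duplex, speed))
    return suspects


if __name__ == "__main__":
    for port, duplex, speed in suspect_ports(SAMPLE):
        print(port, duplex, speed)
```

A port description that itself contains the word "connected" would confuse this simple parser, so treat the result as a hint and confirm on the switch.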

christopher_hal... Wed, 03/25/2009 - 14:16

Thanks for the suggestion, but we have reviewed this and all switches are running clean.

Joseph W. Doherty Wed, 03/25/2009 - 16:59

I didn't have any success with your Feb. 18th link.


A network diagram would help.

Yudong Wu Wed, 03/25/2009 - 19:41

Hi Chris,

Yes, if you could provide a diagram of your network, it would be helpful.

As I understand your post, the testing was between two different sites -- the remote site and the central site -- and the slowness happened randomly.

If we think of the route path as three parts (the LAN at the central site, the carrier MPLS cloud, and the LAN at the remote site), have you done any tests to narrow down which part is causing the slowness?

Also, did you run a packet sniffer on the PC/server at both ends when the issue happened and when it was normal? I would like to know if there were a lot of retransmissions due to packet loss.

By the way, I would suggest using a different protocol, such as FTP, to run the same test and see if you experience the same issue.


christopher_hal... Thu, 03/26/2009 - 05:12

I will get a diagram this afternoon. I'm out of the office this morning (EDT). Here's a copy of MY original post.....


Hi group,


I'm having some slow network issues. We have Catalyst 3560 FastEthernet switches with IOS version 12.2(35)SE5. We also have a 3560 Gigabit switch. On the LAN, I can copy a file (109 Mb) from an XP PC to our Windows Server 2003 in about a minute, but it takes about 15 minutes to copy that file from the server to the PC. I've tried these same tests with several PCs on the LAN and got the same results. We've even tried the same tests between a PC on another LAN and the server (going across the WAN) with the same results--server to PC, slow; PC to server, fast. On the LAN, I've been able to transfer the same file between two PCs with no problems--fast either way.

Now, we've moved the server from the Gigabit switch to the FastEthernet switch and have the same issues on either switch. We've also replaced the server cable--it's plugged directly into the switch. At this point, you may be thinking we have a server problem, but the slow data transfer is only one example. We've had slow printing problems and sporadic network connectivity with some clients. All sporadic, of course. This building was newly refurbished, so the cable is new--Cat 6.

During the data transfers, I'm watching the server performance and there are no memory, processor, or network performance issues. No errors on the switch ports. I've turned off anti-virus on servers and workstations during the transfers. Ports show they're running full/100 and are set to autonegotiate, and again, there are no tx/rx errors. I am going to look at some Microsoft docs, but we keep our MS boxes fully patched, and what I've seen thus far just talks about this patch and that.


I thought maybe it was a switch configuration problem, as some ports on switches are different than others. Here's an example of how most are configured:



interface FastEthernet0/6
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 10
 switchport mode trunk
 switchport voice vlan 172
 mls qos trust dscp
 spanning-tree portfast



I changed the switchport mode to access instead of trunk and changed 'switchport trunk native vlan 10' to 'switchport access vlan 10', but it made no difference. I know a little about trunk vs. access mode, but I'm still a little cloudy on it. A discussion of tagging frames would be a good topic for a separate post.
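As a rough illustration of tagged versus untagged traffic on a dot1q trunk like the one above: a frame with no 802.1Q tag is normally placed in the port's native VLAN, while a tagged frame carries its VLAN ID and 802.1p priority in a 4-byte tag after the source MAC address. The sketch below models only that classification step. The function names and the sample MAC addresses are made up for the example, and it does not model allowed-VLAN lists or any QoS trust behavior on the switch.

```python
import struct

TPID_DOT1Q = 0x8100  # EtherType value that marks an 802.1Q tag


def classify(frame, native_vlan=10):
    """Return (vlan, priority) for a raw Ethernet frame arriving on a dot1q trunk.

    priority is None for untagged frames, which carry no 802.1p bits.
    """
    if len(frame) < 14:
        raise ValueError("too short for an Ethernet header")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != TPID_DOT1Q:
        return native_vlan, None  # untagged: belongs to the native VLAN
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q tag")
    (tci,) = struct.unpack("!H", frame[14:16])
    priority = tci >> 13  # top 3 bits: 802.1p priority
    vid = tci & 0x0FFF    # low 12 bits: VLAN ID
    # VID 0 means "priority-tagged only", so the frame stays in the native VLAN.
    return (vid or native_vlan), priority


def build_frame(payload, vlan=None, priority=0):
    """Build a minimal IPv4-typed Ethernet frame, optionally with an 802.1Q tag."""
    macs = b"\xff" * 6 + b"\x00\x11\x22\x33\x44\x55"  # hypothetical addresses
    tag = b""
    if vlan is not None:
        tag = struct.pack("!HH", TPID_DOT1Q, (priority << 13) | vlan)
    return macs + tag + struct.pack("!H", 0x0800) + payload


if __name__ == "__main__":
    print(classify(build_frame(b"pc data")))                     # untagged PC traffic
    print(classify(build_frame(b"voice", vlan=172, priority=5)))  # tagged phone traffic
```

With native VLAN 10 and voice VLAN 172 as in the config above, untagged PC frames land in VLAN 10 and frames tagged 172 land in the voice VLAN, which is why switching the port between trunk and access mode would not be expected to change how the PC's untagged data is handled.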


Anyone have any ideas?




I have attached a switch config.

Joseph W. Doherty Thu, 03/26/2009 - 16:40

Is the VPN L2 or L3?


Are your performance issues between hosts within a LAN or across the VPN WAN?

christopher_hal... Fri, 03/27/2009 - 04:59

That's the odd thing....it's both. We've experienced slow data copy between machines on the LAN level.

Joseph W. Doherty Fri, 03/27/2009 - 05:23

Usually LAN vs. WAN have different issues but not always.


When running TCP applications, many things can impact their performance, most common being if TCP "believes" packets are being lost.


You might want to consider doing some packet analysis and hopefully capture a poor performing TCP flow.


If you can identify why TCP runs slow, then you can further hunt what on the network is causing the issue.


BTW, since you're running both L2 across the "WAN", issues such as broadcast storms might impact both LAN and WAN traffic. Here too, if happening, packet analysis might "see" it.

christopher_hal... Fri, 03/27/2009 - 05:31

Thanks for the info....I've used wireshark before, but the one thing with packet captures is going through all those packets. Any suggestions on filtering and what to look for? I do have some understanding of the basic sequence....


Thanks,

Chris

Joseph W. Doherty Fri, 03/27/2009 - 17:54

I'm far from an expert with Wireshark. In principle, you would want to see if there's a high rate of broadcasts, a long time between TCP packet transmissions and their ACKs, and/or TCP retransmissions.
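For the three symptoms listed above, Wireshark's display filters can do most of the sifting. The sketch below just assembles `tshark` command lines (`-r` reads a capture file, `-Y` applies a display filter). The capture file name is hypothetical, and the 0.2-second ACK threshold is an arbitrary assumption to tune for your own links.

```python
import shlex

# Wireshark display filters for the symptoms mentioned above.
FILTERS = {
    "broadcasts": "eth.dst == ff:ff:ff:ff:ff:ff",
    "retransmissions": "tcp.analysis.retransmission",
    "out_of_order": "tcp.analysis.out_of_order",
    "slow_acks": "tcp.analysis.ack_rtt > 0.2",  # threshold in seconds is an assumption
}


def tshark_command(capture_path, symptom):
    """Build the tshark argument list that shows only packets matching one symptom."""
    return ["tshark", "-r", capture_path, "-Y", FILTERS[symptom]]


if __name__ == "__main__":
    # "slow_copy.pcap" is a placeholder name for a capture taken during a slow transfer.
    for name in FILTERS:
        print(" ".join(shlex.quote(part) for part in tshark_command("slow_copy.pcap", name)))
```

The same filter strings can be typed straight into the Wireshark filter bar. Comparing the match counts between a capture of a slow transfer and a capture of a fast one is usually more telling than either number on its own.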

christopher_hal... Mon, 03/30/2009 - 05:08

Thanks....keep it simple, right ;O. In fact, on Friday afternoon I browsed through YouTube and actually found some videos on Wireshark and filtering.


I'll post back.

ksjonak217 Thu, 03/26/2009 - 07:43

Chris,


I would suggest you try another copy application and see what happens. Find one that allows you to set the TCP window size. You might want to start by downloading iperf and running it across your WAN link. I have encountered so many problems with file copies over WAN links that can be resolved by (usually) upping the TCP window size.
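The reason window size matters: a single TCP flow can have at most one window of unacknowledged data in flight, so its throughput is capped at roughly window size divided by round-trip time, whatever the link speed. A minimal sketch of that arithmetic follows; the RTT and link speed in the example are assumed values, not measurements from this network.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on one TCP flow's throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps, rtt_seconds):
    """Bandwidth-delay product: the window required to keep the link full."""
    return link_bps * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.05       # assumed 50 ms WAN round-trip time
    window = 65535   # largest window possible without TCP window scaling
    print("ceiling (Mb/s):", max_throughput_bps(window, rtt) / 1e6)
    print("window to fill 100 Mb/s (bytes):", window_needed_bytes(100e6, rtt))
```

With iperf, the `-w` option sets the window (socket buffer) size, so the same test can be repeated with a few different values to see whether throughput follows this ceiling.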


Cheers!


Keith

suthomas1 Thu, 03/26/2009 - 19:14

Sorry to barge in, guys, but we have a similar issue: some applications over a WAN MPLS link are slow, especially when users get inside the app and try moving through the different options or performing actions. The network between the users' site and the server's destination network is perfectly fine, with no errors on switches or nodes, and bandwidth is also fine.

I did find retransmissions using one of our tools, but I can't dig in any further, as no sniffing is allowed and the tool only shows the retransmissions.

Any clues on this?

thanks!

Joseph W. Doherty Fri, 03/27/2009 - 04:29

TCP retransmissions often indicate lost packets (they could also indicate timed-out or out-of-sequence packets). If you're using a WAN MPLS cloud, one place where drops commonly happen is on a congested MPLS egress link (PE to CE). Common causes for this are asymmetrical bandwidths between sites (e.g. HQ having more bandwidth than the remote site) or multiple sites sending to one site. Since the drops are within the MPLS cloud, you won't see them registered as drops on your equipment as you might for traffic entering the cloud (CE to PE).


Detecting TCP retransmissions can be one of your best clues that this might be happening. Another might be a high inbound load rate, but if the drops are bad enough, you might only see a low average utilization (caused by TCP flows being driven into slow start).


If this is happening, "fixes" include better usage of the MPLS vendor's QoS, shaping on CE egress (if there's no multiple site-to-site traffic), and/or increasing available bandwidth.
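To put rough numbers on the egress congestion described above, the sketch below adds up what several sites offer toward one destination and compares it with that destination's PE-to-CE rate. All site names and rates are hypothetical, and real TCP senders back off when they see loss, so this shows only the momentary overload.

```python
def egress_excess_bps(offered_bps_by_site, egress_bps):
    """Traffic offered toward one site beyond what its PE-to-CE link can carry."""
    total = sum(offered_bps_by_site.values())
    return max(0, total - egress_bps)


if __name__ == "__main__":
    # Hypothetical example: a 10 Mb/s HQ bursting toward a remote site with a 2 Mb/s link.
    offered = {"hq": 10_000_000}
    remote_egress = 2_000_000
    excess = egress_excess_bps(offered, remote_egress)
    print("excess to be queued or dropped in the cloud (b/s):", excess)
```

In the single-sender case, shaping the sender's CE egress to the receiver's rate moves the queue onto equipment you control, which is where drops and queue depth become visible.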

christopher_hal... Thu, 04/02/2009 - 14:26

I'm coming back with a trace. I do see some ARP requests (hard to tell how many.....in one minute, I grabbed 35000 packets). I also see [TCP Out-Of-Order], and this happens with several protocols...HTTP/Telnet/TCP. Are these retransmissions?

Joseph W. Doherty Sat, 04/04/2009 - 04:35

re: TCP Out-Of-Order


If you don't have multiple paths, then it likely only indicates lost packets.

sdoremus33 Sat, 04/04/2009 - 12:47

Yes, these are related to SACKs (Selective Acknowledgments). In basic terms, let's say Host X sends some data to Host Y in 8-byte chunks, and Host Y receives frames 1-2-4-5-6-7. Host Y then sends an ACK to Host X that says, in effect, "I didn't receive 3 and 8," so Host X knows that only 3 and 8 need to be retransmitted. You will see this in Wireshark as a SACK option on the ACK. This definitely can cause latency issues. One thing you can do is adjust the TCP window size.
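The exchange described above can be sketched in a few lines. This is a simplified model that numbers whole segments instead of bytes (real TCP acknowledges byte sequence numbers), and the function names are made up for the example. One detail worth noting: a receiver can only report gaps below data it has actually seen, so in this model segment 3 shows up as a hole, while a missing segment 8 would only be noticed once something after it arrives or the sender's timer expires.

```python
def ack_with_sack(received):
    """Return (next_expected, sack_blocks) for the segment numbers received so far."""
    segments = sorted(set(received))
    next_expected = 1
    for seg in segments:
        if seg == next_expected:
            next_expected += 1  # still contiguous from the start
        elif seg > next_expected:
            break               # first gap found
    blocks = []
    for seg in segments:
        if seg < next_expected:
            continue            # already covered by the cumulative ACK
        if blocks and seg == blocks[-1][1] + 1:
            blocks[-1] = (blocks[-1][0], seg)  # extend the current block
        else:
            blocks.append((seg, seg))          # start a new block
    return next_expected, blocks


def holes(next_expected, blocks):
    """Segments the sender can infer are missing from the ACK and SACK blocks."""
    missing, cursor = [], next_expected
    for low, high in blocks:
        missing.extend(range(cursor, low))
        cursor = high + 1
    return missing


if __name__ == "__main__":
    ack, blocks = ack_with_sack([1, 2, 4, 5, 6, 7])
    print("cumulative ACK expects segment", ack, "with SACK blocks", blocks)
    print("retransmit:", holes(ack, blocks))
```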

suthomas1 Sun, 04/05/2009 - 03:46

It's known that even MTU issues can cause slowness. How do we check whether MTU is the cause of the issue, and what are the ways to correct it?

Please suggest.

christopher_hal... Fri, 03/27/2009 - 05:02

Hey Keith,


What about copying via DOS? Not sure that allows the resizing of windows, which is really too large (ha--pun!). I think one of my peers has a Linux box, so I'll look into that.
