DS3 point to point congestion, performance question

Nov 19th, 2007

I have a point to point DS3 that is our DR site link.


There are times that the link seems congested and there is poor performance, even though the bandwidth has not reached full capacity on the link.


Sometimes it does reach full capacity with the same result.


Is it possible there is a bottleneck somewhere in Verizon's network that could cause this?


I mentioned there was congestion to my manager and Verizon's Engineer during a meeting and they both said it was impossible because we are the only users of this point to point link.


My thought is that there are not two wires spanning from our site 80 miles to the DR site.


The traffic has to pass through equipment where it could be getting congested in a bottleneck somewhere.


Is this possible?

JORGE RODRIGUEZ Mon, 11/19/2007 - 21:35

Hi, you need to establish a baseline of traffic utilization for that link over at least 24 to 48 hours. First, look at the router interfaces on both ends of the link and check for input errors, CRC errors, giants, runts, interface resets, or any other indication of physical issues. If you can, monitor the link for 48 hours with PRTG software:

http://www.paessler.com/prtg/download

This will give you a baseline of utilization on the link. With this information you can determine and pinpoint where the issue may be, or whether the link is over-utilized.


PRTG is not freeware, but the demo allows two sensors (two interfaces) to be monitored without buying the software.
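
If you would rather compute the baseline yourself from interface octet counters (for example, values polled over SNMP), here is a minimal Python sketch of the arithmetic. The function name and parameters are hypothetical, and it assumes the nominal DS3 line rate of 44.736 Mbps and a 32-bit counter that wraps at most once between two samples:

```python
DS3_BPS = 44_736_000  # nominal DS3 line rate; usable payload is somewhat lower


def link_utilization(octets_start, octets_end, interval_s,
                     link_bps=DS3_BPS, counter_bits=32):
    """Return link utilization (0.0 to 1.0) between two octet-counter samples.

    Assumes the counter wrapped at most once during the interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction gives the right delta even after a single wrap.
    delta_octets = (octets_end - octets_start) % modulus
    return delta_octets * 8 / (interval_s * link_bps)


if __name__ == "__main__":
    # Two samples taken 60 seconds apart (made-up illustrative values).
    util = link_utilization(1_000_000, 101_000_000, 60)
    print(f"utilization: {util:.1%}")
```

At full DS3 line rate a 32-bit octet counter wraps roughly every 13 minutes, so the polling interval should stay well below that, or 64-bit counters should be used where available.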



Rate any helpful post

HTH

Jorge



Richard Burts Tue, 11/20/2007 - 04:33


Yes, it is possible that there is an issue with something in the provider network which is impacting your point to point circuit. I have recently gone through this with a customer who has a partial T3 and was experiencing performance problems. At first the provider responded that it could not be a circuit problem because the circuit load remained low according to their statistics and there was no congestion. Additional testing did convince the provider that there was an issue with the circuit. I am not clear what they did or what they changed, but suddenly the performance problem cleared up. It might have been a flaky card somewhere or some piece of circuit switching equipment with a problem. The provider should not dismiss the possibility that there is a problem with the circuit just because it is point to point and no one else uses it.


HTH


Rick

mvsheik123 Tue, 11/20/2007 - 07:28

hi,


You have probably already got a solution from other members, but here are my 2 cents.

1. Clear the counters on the routers at both ends and see whether any errors pop up in that time frame.

2. Run multiple 'extended' ping tests and check the average response time.

3. If you see any errors, ask the carrier to place monitoring on the circuit and see if they can determine which direction the errors are coming from.

4. If need be, schedule a window and do 'loopback' tests with the carrier to make sure the hardware and in-house cabling are good.
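
For step 2, a small Python sketch like the one below can summarize a batch of ping results so that runs from good and bad periods can be compared. The function name, the `None`-for-timeout convention, and the "three times baseline" threshold are assumptions made for illustration; the round-trip times would be copied from the extended ping output:

```python
from statistics import mean


def summarize_rtts(rtts_ms, baseline_ms, factor=3.0):
    """Summarize ping round-trip times given in milliseconds.

    rtts_ms: one entry per probe; None marks a probe that timed out.
    baseline_ms: RTT considered normal for the link.
    factor: replies slower than baseline_ms * factor are counted as slow.
    """
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    slow = [r for r in replies if r > baseline_ms * factor]
    return {
        "sent": len(rtts_ms),
        "loss_pct": 100.0 * lost / len(rtts_ms) if rtts_ms else 0.0,
        "min_ms": min(replies) if replies else None,
        "avg_ms": mean(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
        "slow": len(slow),
    }


if __name__ == "__main__":
    # Made-up samples: mostly normal replies, one slow reply, one timeout.
    samples = [3.0, 3.1, 2.9, 50.0, None, 3.0]
    print(summarize_rtts(samples, baseline_ms=3.0))
```

Looking at the maximum and the count of slow replies, not only the average, helps with an intermittent problem, since a few very slow replies can hide inside a reasonable-looking average.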


Hope this helps.


Thanks

MS

wilson_1234_2 Tue, 11/20/2007 - 08:17

Thanks for all of the great replies,


Here is what I know:


This DS3 is the DR connection, with an MDS switch and an FCIP link connecting the two sites.


There are two VLANs bridged to the DR site and an FCIP link to the MDS switches, along with some routed subnets. So it is hard to determine the exact cause of large amounts of traffic.


This problem occurs very seldom. It can go away for several months, and then there will be a rash of it for several days.


The only way I know that there is a congestion problem is that the FCIP link starts flapping with "TCP Retransmit" errors and the remote end shows "remote end disconnected by peer" in the MDS switch. Ping response times are also longer than they should be.


There are no errors accumulating on any of the interfaces. I do see a small amount of errors over time, but not when I am logged into the router and clear the counters during this problem.


At the time of this issue, I have seen response times of 50 ms when pinging the remote side (normally it should be 3 ms), while the serial interface showed tx load 58/255 and rx load 5/255. I have also seen the interface show almost total utilization with the same symptoms.
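
To put those figures in perspective, here is a rough Python sketch of the arithmetic. It assumes the interface bandwidth is the nominal DS3 rate of 44.736 Mbps (the bandwidth configured on the router may differ), and it assumes that all of the extra ping delay is queuing in front of a single hop draining at that rate. Both function names are hypothetical:

```python
DS3_BPS = 44_736_000  # assumed interface bandwidth: nominal DS3 line rate


def load_to_bps(load, bandwidth_bps=DS3_BPS):
    """Convert an interface load value (0 to 255) to bits per second."""
    if not 0 <= load <= 255:
        raise ValueError("load must be between 0 and 255")
    return load / 255 * bandwidth_bps


def queued_bytes(observed_rtt_ms, baseline_rtt_ms, drain_bps=DS3_BPS):
    """Estimate bytes queued ahead of a packet from the extra delay.

    Assumes the whole RTT increase is queuing at one hop that drains
    at drain_bps, which is a simplification.
    """
    extra_s = max(observed_rtt_ms - baseline_rtt_ms, 0) / 1000
    return extra_s * drain_bps / 8


if __name__ == "__main__":
    print(f"tx load 58/255 = {load_to_bps(58) / 1e6:.1f} Mbps")
    print(f"extra delay implies {queued_bytes(50, 3) / 1000:.0f} KB queued")
```

Under these assumptions, tx load 58/255 is about 10.2 Mbps (roughly 23% of the link), while an extra 47 ms of delay corresponds to roughly 260 KB waiting in a queue somewhere along the path. Keep in mind that the load value is, by default, an average over about five minutes, so short bursts that fill a queue may not show up in it.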


I have tried to determine the source of large amounts of traffic during these times by using IP Accounting on the serial interface.


I do see the FCIP link with a lot of packets going across, but I cannot determine which IP address on the MDS switch is involved.


I had Cisco TAC on the phone with me during one extended session of experiencing this problem and he was in the MDS switches as well.


He could not see any reason that we would be seeing it either and suggested a carrier problem.


Verizon is saying it is a Cisco problem.



Richard Burts Tue, 11/20/2007 - 08:29


The fact that the problem occurs seldom makes it more difficult to troubleshoot. And unfortunately it is not uncommon for the telco to blame Cisco and for Cisco to blame the telco. What you may need to do is to wait for the problem to start again and then try to get both Cisco and the telco involved in testing.


HTH


Rick
