WAN congestion and packet drops

Unanswered Question
Oct 21st, 2008

At site A, I have a Cisco 3845 with two T1 (WAN) links. Attached are the end point show interface and show queue outputs.

On the MPLS site there is just massive latency and packet drops. How can I fix this problem? Could it be a telco problem?



Joseph W. Doherty Tue, 10/21/2008 - 09:07

Looks like ordinary congestion. (I.e. don't think it's a telco problem.)


WFQ often does a very good job handling outbound congestion, but with MPLS, you might also need to consider inbound congestion. The latter is often addressed by working within the framework of the MPLS vendor's QoS model.


I see in some of the outbound flows, there are different ToS settings. WFQ will treat flows differently because of this. Are the markings what you intend?


For outbound, if basic WFQ isn't granular enough, you can often use CBWFQ to better manage your congestion.


Although there's much one can often do with QoS to better manage bandwidth, sometimes you really do need additional bandwidth. I.e., a T-1 might be insufficient.


Not enough information to make better suggestions.

allan.thomas Tue, 10/21/2008 - 11:45

What is evident is that the show interface output for MPLS-RT1 s6/0/16:0 and s6/0/15:0 shows a high number of CRCs and errors.


These would certainly contribute to the high number of packet drops. The snapshot of the show interfaces also indicates extremely high utilisation, and I suspect that congestion is more than likely causing the packet drops and the massive latency due to buffering.


What is interesting is that on the MPLS-RT1 gateway the serial interfaces are configured with CRC-16. Is this correct? Framing errors could certainly contribute to the errors, so I would check with your telco and ensure that this is correct.


The fundamental problem, it would seem, is congestion. Ideally you need to baseline the utilisation across these links and determine the average. Is this normal?


Unfortunately it is not possible to alleviate congestion if your applications or traffic demand the bandwidth. The alternative is to upgrade your link capacity to compensate, but only if absolutely warranted. Hence I would determine a baseline, and discover which applications are saturating the bandwidth.
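As a rough illustration of the baselining step, here is a minimal Python sketch that turns sampled interface rates into average and peak utilisation. It assumes a T1 line rate of 1.544 Mbps; the sample values are hypothetical placeholders, not figures from the attachments, and the function name is made up for this example.

```python
from statistics import mean

T1_BPS = 1_544_000  # T1 line rate in bits per second


def utilisation_percent(rate_bps: float, link_bps: float = T1_BPS) -> float:
    """Return the given rate as a percentage of the link rate."""
    return 100.0 * rate_bps / link_bps


# Hypothetical placeholder samples (bits/sec), e.g. rates polled at intervals.
samples_bps = [620_000, 1_480_000, 1_510_000, 930_000]

utilisation = [utilisation_percent(s) for s in samples_bps]
average = mean(utilisation)
peak = max(utilisation)
print(f"average {average:.1f}%, peak {peak:.1f}%")
```

Samples collected over several days, at busy and quiet times, would give a more useful picture than a single snapshot.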


Hope this helps.

Allan.

Joseph W. Doherty Wed, 10/22/2008 - 03:15

Great catch on the input errors, especially on Serial6/0/15:0, which has 2283774 / 276862116 = 0.82%, which, I think, is too high. There could be a similar loss rate outbound too.


Serial6/0/16:0, which has 9314 / 714944073 = 0.0013%, shouldn't, I think, be too adverse to performance.
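The two percentages above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the counter values are the ones quoted in this post, and the function and parameter names are made up for illustration.

```python
def error_rate_percent(errors: int, total: int) -> float:
    """Return errors as a percentage of the total counter."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * errors / total


# Counter pairs (errors, total) as quoted in the post.
counters = {
    "Serial6/0/15:0": (2283774, 276862116),
    "Serial6/0/16:0": (9314, 714944073),
}

for name, (errors, total) in counters.items():
    print(f"{name}: {error_rate_percent(errors, total):.4f}%")
```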


PS:

Something you might also try is reducing the WFQ parameters from their defaults. Calculate the BDP (bandwidth-delay product, in 1500-byte packets), set that, if possible, as the individual flow queue size, and set the overall queue size to about 4x that.
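To make the BDP suggestion concrete, here is a minimal Python sketch of the calculation. The 100 ms round-trip time is an assumed placeholder (measure your own), the T1 rate is taken as 1.544 Mbps, rounding up to a whole packet is a choice made for this example, and the function name is hypothetical.

```python
import math


def bdp_packets(link_bps: float, rtt_seconds: float, packet_bytes: int = 1500) -> int:
    """Bandwidth-delay product expressed in whole packets, rounded up."""
    bdp_bytes = link_bps * rtt_seconds / 8  # bits in flight converted to bytes
    return max(1, math.ceil(bdp_bytes / packet_bytes))


T1_BPS = 1_544_000      # T1 line rate in bits per second
ASSUMED_RTT = 0.100     # assumed round-trip time in seconds (placeholder)

per_flow_queue = bdp_packets(T1_BPS, ASSUMED_RTT)
overall_queue = 4 * per_flow_queue  # "about 4x" per the suggestion above
print(f"per-flow queue: {per_flow_queue} packets, overall queue: {overall_queue} packets")
```

With a shorter or longer measured RTT the queue sizes scale proportionally, so it is worth plugging in the real figure before changing anything on the router.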
