Cisco Support Community
Community Member

Cisco LAN with lync poor quality (loss and delay)

Here is the scenario:

 

2x Cisco 3750E in a stack (core)

which connects to another stack of 2x Cisco 3750E via a 4x 1 Gig port channel (dist)

 

and multiple 3560 48-port switches connecting off the dist stack. There are multiple different VLANs, and they are routed on the core switch.

 

Between two users in different VLANs, the Lync calls are intermittently bad. Usually one caller cannot hear the other side for a while, then the audio comes back again.

 

The Lync report shows up to 30% packet loss and up to 4-second round-trip time.

 

I cannot replicate the issue; pinging the hosts, I see only around 0.2% loss at most.

 

The links are well underutilized: the port channel carries less than 200 Mbps and the 3560 uplink trunks around 10 Mbps.

 

CPUs and resources look OK on all the devices. QoS is disabled across all the switches.

 

The only thing I see on the 3560s is output drops. (I cleared these counters 2 hours ago; you can see quite a bit of loss has already incremented.) Is there any way to find out what is causing this?

 


FastEthernet0/11 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 127
FastEthernet0/15 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 579
FastEthernet0/16 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 290
FastEthernet0/17 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 418
FastEthernet0/18 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 36538
FastEthernet0/19 is down, line protocol is down (notconnect)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
FastEthernet0/20 is down, line protocol is down (notconnect)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 271
FastEthernet0/31 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 286
FastEthernet0/32 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 351
FastEthernet0/33 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 450
FastEthernet0/34 is down, line protocol is down (notconnect)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 1494
FastEthernet0/35 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 1746
FastEthernet0/36 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 567
FastEthernet0/37 is down, line protocol is down (notconnect)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
FastEthernet0/38 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 227
FastEthernet0/39 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 736
FastEthernet0/40 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 31662
FastEthernet0/43 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 507
FastEthernet0/44 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 73
FastEthernet0/45 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 78
FastEthernet0/46 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 499
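When the output spans many ports like this, it can help to sort the counters rather than eyeball them. A minimal sketch (the parsing helper and sample text are mine, not from the thread) that pulls the worst droppers out of pasted "show interface" output:

```python
import re

# Hypothetical helper: summarise "Total output drops" per interface from
# pasted "show interface" output, worst offenders first.
SAMPLE = """\
FastEthernet0/18 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 36538
FastEthernet0/40 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 31662
"""

def worst_droppers(text):
    """Return (interface, drops) pairs sorted by drop count, descending."""
    pairs = []
    current = None
    for line in text.splitlines():
        m = re.match(r"^(\S+) is ", line)
        if m:
            current = m.group(1)   # remember which interface we are under
            continue
        m = re.search(r"Total output drops: (\d+)", line)
        if m and current:
            pairs.append((current, int(m.group(1))))
    return sorted(pairs, key=lambda p: p[1], reverse=True)

print(worst_droppers(SAMPLE))
```

Fed the full output above, Fa0/18 and Fa0/40 stand out by two orders of magnitude, which is usually where to start looking.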

 

 

 

16 REPLIES


Hello.

I would suggest investigating which connections (traffic flows) Lync uses for the calls.

>The Lync report shows up to 30% packet loss and up to 4-second round-trip time.

Looks weird. Try to capture traffic and check the source/destination IP addresses and QoS markings.

Do you have any storm control or edge port protection applied?

Try to figure out why you have drops on 3560.

Community Member


There is no storm control or port protection of any kind. They are simple access ports.

 

The problem is very intermittent. I am trying to find the cause of the packet drops but can't find anything. Within 3 days I have 20,000 drops on a 100 Mb port doing at most 7 Mbps.

QoS is disabled at the application level for Lync as well. It shouldn't be needed, as the calls are just across the LAN.

 

How can I trace the cause of the packets being dropped?

 


Hello.

Could you please provide an interconnectivity diagram, and note the STP role/status for both VLANs (the first and second Lync clients') and the IP addressing?

Could you try to capture traffic (on the client) with Wireshark during the call?

PS: "show interface" output would be helpful as well.

Community Member


Attached is the network setup. I have just used arbitrary VLANs, but the scenario is the same. From what we have deduced, if a user calls another user within the same VLAN, there is no issue (although this rarely happens).

 

If a user calls another user in a different VLAN, sometimes the call quality is good. Other times the call quality is bad, and Lync reports one or more of the following issues:

max jitter of over 50ms

high packet loss (30% +)

max round trip time of over 4000ms

 

 

There seems to be no clear trend in when or why a call is bad quality; it cannot be reproduced under any particular circumstances.

 

Doing simple ping tests, I sometimes see high latency (over 2000 ms) and a few drops; however, I don't see any other issues on the network.

Community Member


I'm sure this is what is causing all the intermittent issues. Pinging from one desktop to another is perfect 99.9% of the time, but I can't explain this "blip":

 

Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=888ms TTL=126
Reply from 192.168.0.150: bytes=20 time=194ms TTL=126
Reply from 192.168.0.150: bytes=20 time=864ms TTL=126
Reply from 192.168.0.150: bytes=20 time=176ms TTL=126
Reply from 192.168.0.150: bytes=20 time=608ms TTL=126
Reply from 192.168.0.150: bytes=20 time=506ms TTL=126
Reply from 192.168.0.150: bytes=20 time=642ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=2ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=2ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
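When blips like this are rare, parsing the ping log programmatically makes them easier to quantify than scanning by eye. A rough sketch (the log excerpt and thresholds here are illustrative assumptions, not the poster's actual data):

```python
import re
import statistics

# Pull the time= values out of a pasted Windows ping log and flag
# "blips" far above the baseline, as seen in the capture above.
PING_LOG = """\
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=888ms TTL=126
Reply from 192.168.0.150: bytes=20 time=194ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
Reply from 192.168.0.150: bytes=20 time=1ms TTL=126
"""

times = [int(t) for t in re.findall(r"time[=<](\d+)ms", PING_LOG)]
baseline = statistics.median(times)                 # typical RTT
blips = [t for t in times if t > 100 * baseline]    # outliers worth graphing

print(f"median={baseline}ms max={max(times)}ms blips={blips}")
```

Running something like this continuously (timestamping each blip) lets you correlate the spikes with events elsewhere, e.g. a file transfer starting on a suspect port.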

 

Super Bronze


Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

The cause could be as simple as a server with a GigE interface sending to a client with an FE interface. GigE can easily overrun FE.

 

What's somewhat surprising is that the edge switches have enough buffering to support such high latencies before packets are dropped. Your earlier stats, of course, show that drops occur too.

 

If, in fact, GigE-to-FE is causing transient congestion, and that congestion is hurting some traffic, then the solution is either to stop the source of the congestion (for example, run your servers at FE too), or to manage the congestion with QoS.
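The speed mismatch is easy to quantify. A back-of-the-envelope sketch (the burst size is an illustrative assumption, not a measurement from this network):

```python
# Why a GigE sender can overrun an FE edge port: a burst arrives ten
# times faster than the FE port can drain it, so the difference queues
# up in the switch -- or is tail-dropped once the port buffers fill.
GE_BPS = 1_000_000_000
FE_BPS = 100_000_000
BURST_BYTES = 256 * 1024      # a modest server-side TCP burst (assumed)

arrive_ms = BURST_BYTES * 8 / GE_BPS * 1000   # burst arrives at GigE speed
drain_ms = BURST_BYTES * 8 / FE_BPS * 1000    # but drains at FE speed
peak_backlog_kb = BURST_BYTES * (1 - FE_BPS / GE_BPS) / 1024

print(f"arrives in {arrive_ms:.1f} ms, drains in {drain_ms:.1f} ms, "
      f"peak backlog ~{peak_backlog_kb:.0f} KB")
```

A ~230 KB peak backlog from a single 256 KB burst is far more than the per-port buffering on small Catalyst access switches, which is consistent with the output-drop counters climbing on otherwise "underutilized" FE ports.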

 

Something I forgot to ask: what IOS version is being used? Besides feature improvements, I think I recall one of the IOS releases also "improved" buffer management.

Community Member


Hi,

 

3750e: c3750e-universalk9-mz.122-52.SE.bin

3750g: 12.2(53)SE1

3560:  Version 12.2(50)SE

 

 

thanks

Super Bronze


I think all those versions are late enough that they've got the buffer-improvement revision I have in mind, although generally I recommend using later patch releases in any IOS train. Also, (55)SE7..8 (and 9?) seem to be very stable. I know Leo likes (55)SE8.

 

I don't think, though, that upgrading your IOS versions would mitigate the drop issues you're experiencing; again, I believe QoS is what you need to protect Lync traffic. But I do like having a solid/stable IOS version, to minimize it being a source of any issues.

 


Hello.

Really interesting "blip".

Could you provide ping results to the default gateway during the issue?

I can't believe the issue is related to single-link overutilization, unless it's a flood (like Orbit Downloader generates)... I would bet on a 3750E performance issue (or maybe STP). During the ping, the packet traverses only 4 links (2 in each direction), so an overall 800 ms RTT would imply about 100 ms of delay in a single queue; that is not possible on a GigE switch.
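That claim is easy to sanity-check with arithmetic: sustaining 100 ms of queueing delay at line rate on a GigE link would require a standing backlog far beyond what any Catalyst access switch can buffer.

```python
# Backlog needed to produce 100 ms of queueing delay on a GigE link:
# delay = backlog / drain_rate, so backlog = drain_rate * delay.
GE_BPS = 1_000_000_000
delay_s = 0.100

backlog_mb = GE_BPS / 8 * delay_s / 1_000_000
print(f"~{backlog_mb:.1f} MB of standing backlog per queue")
```

At 12.5 MB per queue, the delay cannot be plain egress queueing on a GigE port; it points to something else, e.g. a forwarding/CPU problem or STP reconvergence, as suggested above.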

Could you share your switch configuration?

Could you share "sh proc cpu hist" from the switches and "sh span int ..." for all the transit interfaces?

---

Btw, why do you use three (not four) interfaces in the EtherChannel?

Hall of Fame Super Gold


show interface | i Ethernet[1-9]| reliability [^255]/[^255], txload 1/[^255], rxload 1/[^255]|Last input never|output never|Total output drops:_[^0]|[^0]_CRC|[^0]_unknown protocol drops|[^0]_late collision|[^0]_lost carrier|[^0]_output buffer

Post the output of the above command.

Community Member



  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 101
  Last input never, output 00:00:02, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 724
  Last input never, output 14w0d, output hang never
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 821
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 468
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 2349
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 836
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 3847
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 1462
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 11334
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 235
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 404
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 796
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 103
  Last input never, output 00:00:00, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 267
  Last input never, output 08:31:01, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 5
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 3846
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 312
     22 input errors, 22 CRC, 0 frame, 0 overrun, 0 ignored
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 400
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 288
  Last input never, output 00:00:01, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 373
  Last input never, output 8w5d, output hang never
  Last input never, output 00:00:01, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 291
  Last input never, output 00:00:01, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 519
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 734
  Last input never, output 14w0d, output hang never
  Last input never, output 00:00:01, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 170
  Last input never, output 14w0d, output hang never
  Last input never, output 14w0d, output hang never
  Last input never, output 00:00:01, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 1033
  Last input never, output 00:00:01, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 963
  Last input never, output 13w6d, output hang never
  Last input never, output 00:20:17, output hang never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 1067
     26 input errors, 26 CRC, 0 frame, 0 overrun, 0 ignored
  Last input never, output never, output hang never
  Last input never, output never, output hang never

 

Hall of Fame Super Gold


Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 11334

I would be keen to know what client is connected to this.  


Another thing: the output shows a lot of one-way traffic. You may want to check the logs for any suspicious entries.

Community Member


It's an HP laptop. There is nothing in any of the logs, unfortunately. I am graphing the errors; they seem to pop up in bunches, so perhaps the user just starts a big file transfer or something.

Super Bronze


When you see drops, links are not always "underutilized".

 

Your solution choices are: ensure there's enough bandwidth and/or buffering to avoid drops (although extensive buffering can cause latency issues), or use QoS features to protect/prefer your more important traffic, i.e. your Lync traffic.

 

PS:

BTW, some "gotchas"

 

If you're not intentionally using QoS features on your 3K switches, are you sure QoS is actually disabled? If it isn't, its default settings can often cause premature packet drops.

 

For your port-channel(s), are you using the "optimal" load balancing algorithm?

Community Member


Yes, it's definitely disabled:

sh mls qos
QoS is disabled
QoS ip packet dscp rewrite is enabled

 

You have raised an excellent point in your last statement. My load-balancing mechanisms don't even match on the connected switches:

s2# sh etherchannel load-balance
EtherChannel Load-Balancing Configuration:
        src-mac

EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source MAC address
  IPv4: Source MAC address
  IPv6: Source MAC address

 

 

s1#sh etherchannel load-balance
EtherChannel Load-Balancing Configuration:
        src-dst-ip

EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
  IPv4: Source XOR Destination IP address
  IPv6: Source XOR Destination IP address

 

Switch 1 has "port-channel load-balance src-dst-ip" set globally, but s2 is left at the default.

 

I will get that changed to match. Is that the best algorithm, or is there a better option for real-time traffic?

 

I have these options:


  dst-ip       Dst IP Addr
  dst-mac      Dst Mac Addr
  src-dst-ip   Src XOR Dst IP Addr
  src-dst-mac  Src XOR Dst Mac Addr
  src-ip       Src IP Addr
  src-mac      Src Mac Addr

 

 

Super Bronze


The "best" load-balancing choice is an "it depends" answer, but most of the time src-dst-ip works well. (BTW, it's sometimes fine for the two directions to use different choices.)

 

OK, you've confirmed QoS is disabled, so your FE edge ports are likely dropping due to transient bursts. What you could do is enable QoS and configure it to protect your Lync traffic. You'll still see port drops, hopefully not much higher than what you have now, but the protected Lync traffic shouldn't be dropped or delayed.
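A minimal starting point on the 3560s might look like the following. This is a sketch only: it assumes the Lync endpoints mark their voice traffic with DSCP EF (46), and you should verify the DSCP-to-output-queue maps on your IOS version before relying on it; real deployments usually also tune the queue buffers and thresholds.

```
! Sketch only -- assumes Lync endpoints mark voice as DSCP EF (46)
mls qos
!
interface range FastEthernet0/1 - 48
 mls qos trust dscp        ! honour the endpoint's DSCP marking
 priority-queue out        ! expedite egress queue for voice
```

Note that globally enabling "mls qos" changes the default queueing behaviour on every port, so test on one access switch during a quiet period and watch the drop counters before rolling it out.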
