I'm seeing a high number of output drops when we run backups from site to site over a wireless bridge. Our setup is as follows:
backup server - 3560G - 3560 - 70mbit wireless bridge - 3560G - san
When the backups are running and sending data to the SAN at the remote site, the output drops show up on the port connected to the wireless bridge and don't stop until the backups complete or fail. From what I have read, this is because the wireless bridge is slower than the 100 Mbit or 1 Gbit ports on the 3560s and becomes a bottleneck. That is fine, but is there a way to decrease or eliminate the output drops using QoS?
Traffic is placed in the different egress queues depending on its markings (the DSCP value).
If in your tests you send traffic with no markings, or all with the same DSCP value, you are using only one queue.
In shaped mode the following applies:
>> In shaped mode, the egress queues are guaranteed a percentage of the bandwidth, and they are rate-limited to that amount. Shaped traffic does not use more than the allocated bandwidth even if the link is idle. Shaping provides a more even flow of traffic over time and reduces the peaks and valleys of bursty traffic. With shaping, the absolute value of each weight is used to compute the bandwidth available for the queues.
So you can get 50 Mbps on that queue and no more.
I may be wrong, but I think this is what is needed in your scenario, maybe using different ratios to get nearer to 70 Mbps.
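To make the shaped-mode arithmetic concrete, here is a minimal sketch. It assumes the shaped rate of a queue is the port speed divided by the shape weight (the inverse-ratio behavior Cisco documents for shaped weights), so the 50 Mbps figure above would correspond to a weight of 2 on a 100 Mbps port. The function names are hypothetical helpers, not IOS commands.

```python
import math


def shaped_rate_mbps(port_speed_mbps: float, weight: int) -> float:
    """Rate a shaped queue is held to, assuming rate = port speed / weight."""
    if weight <= 0:
        raise ValueError("a weight of 0 means the queue is not shaped")
    return port_speed_mbps / weight


def closest_shape_weight(port_speed_mbps: float, target_mbps: float) -> int:
    """Smallest integer weight whose shaped rate does not exceed the target."""
    return math.ceil(port_speed_mbps / target_mbps)


if __name__ == "__main__":
    for speed in (100, 1000):
        w = closest_shape_weight(speed, 70)
        print(f"{speed} Mbps port: weight {w} -> {shaped_rate_mbps(speed, w):.1f} Mbps")
```

Because the weights are integers, a 100 Mbps port can only step from 100 Mbps (weight 1) to 50 Mbps (weight 2) under this assumption, so shaping a single queue cannot land close to 70 Mbps there. A gigabit port has finer steps around that rate.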
From what you describe, you likely have several possible issues.
As you note, the wireless bridge is 70 Mbps, so assuming the 3560 connection to it is 100 Mbps, what happens to the excess bandwidth utilization? There's a good chance it's dropped on the bridge, where you might not see the drops. This is likely the smallest bottleneck, but you may have another. (Does the bridge have any stats where it can show drops?)
Since you show 3560Gs on both ends, and mention gig, where gig slows to 100 Mbps you likely have drops. (Are these the drops you're noticing?)
Besides the bottlenecks, i.e. 70 and 100 Mbps vs. gig, what protocol does the SAN device use for its backups? If it uses TCP, then TCP, usually by design, detects available bandwidth by ramping up its bandwidth utilization until there are drops. I.e., some drops, for a high bandwidth demanding TCP flow, are quite normal and can't be easily eliminated unless you have a device that can spoof RWIN between devices.
If the SAN device is using some "special" protocol for its backups, it's difficult to predict how the protocol will behave unless the vendor documents it.
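For the TCP case, a toy model shows why some drops are expected. This is only a sketch of additive-increase/multiplicative-decrease behavior, not real TCP: slow start, RTT, and buffering are all ignored, and the step size and starting rate are made-up values for illustration.

```python
def aimd_probe(capacity_mbps: float, rounds: int,
               increase_mbps: float = 1.0, start_mbps: float = 1.0):
    """Toy bandwidth probe: grow the rate each round, halve it after a drop.

    Returns (number_of_drop_events, list_of_rates_after_each_round).
    """
    rate = start_mbps
    drops = 0
    history = []
    for _ in range(rounds):
        if rate > capacity_mbps:
            # The sender overshot the bottleneck: count a drop and back off.
            drops += 1
            rate = rate / 2
        else:
            # No loss seen yet, so keep probing for more bandwidth.
            rate += increase_mbps
        history.append(rate)
    return drops, history


if __name__ == "__main__":
    drops, history = aimd_probe(capacity_mbps=70, rounds=200)
    print(f"drop events: {drops}, peak rate: {max(history):.1f} Mbps")
```

The sender only learns where the 70 Mbps ceiling is by exceeding it, so the drop counter keeps climbing for as long as the transfer runs. That matches drops that continue until the backup finishes.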
With the above background, what might you do?
First, assuming the wireless bridge lacks any effective QoS, we want to keep from sending it more traffic than it can handle. This is so we can manage congestion on the 3560s, again assuming that's not possible on the bridge. This could be accomplished by setting the 3560 interfaces connected to the bridge not to exceed 70 Mbps. (E.g. on a 100 Mbps configured interface, "srr-queue bandwidth limit 70". NB: These values are not exact because the hardware adjusts the line rate in increments of six. I.e., you might need to set it lower than 70 to ensure 70 is not exceeded.)
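As a rough check on which value to give "srr-queue bandwidth limit", the sketch below converts a target rate into a percentage of port speed. It assumes the command accepts 10 to 90 percent (the range I recall from the 3560 command reference, so verify it on your IOS version). The headroom parameter is a hypothetical safety margin for the inexact hardware granularity, not something the switch defines.

```python
import math


def bandwidth_limit_percent(port_speed_mbps: float, target_mbps: float,
                            headroom_mbps: float = 0.0) -> int:
    """Percent of port speed that keeps egress at or below the target rate."""
    # Multiply before dividing so whole-number inputs stay exact.
    percent = math.floor((target_mbps - headroom_mbps) * 100 / port_speed_mbps)
    if not 10 <= percent <= 90:
        # Assumed valid range of the limit command; outside it, change port speed.
        raise ValueError(f"{percent}% is outside the assumed 10-90 range")
    return percent


if __name__ == "__main__":
    print(bandwidth_limit_percent(100, 70))                   # no margin
    print(bandwidth_limit_percent(100, 70, headroom_mbps=4))  # with margin
```

Under that assumed range, a port running at gigabit speed cannot be limited to 70 Mbps this way (it would need 7 percent), which is one more reason to run the bridge-facing port at 100 Mbps.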
Second, you'll want to configure QoS such that the backup traffic is identified (hopefully this can be done vs. "normal" SAN traffic). Using two (of the four) egress queues, place all routine traffic in one queue and backups in another. Use SRR in shared mode and give the backup queue minimal or little bandwidth and other traffic maximum or most bandwidth. (This will allow backups to use as much of the available bandwidth as possible, but cause little adverse impact to other traffic.)
With most bandwidth reserved for non-backups, that traffic, assuming there's sufficient bandwidth, should see almost no drops; backups, however, likely will.
Buffers should be sized for the BDP (bandwidth-delay product). If they are, there's not much more that can be done on a 3560 to reduce drops for bandwidth probing traffic.
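To see what a shared-mode split might guarantee, here is a small sketch. It assumes each queue's minimum under congestion is its weight divided by the sum of all four weights, applied to the 70 Mbps limit. The weights and the queue assignments (routine traffic in queue 2, backups in queue 3) are hypothetical examples, not recommended values.

```python
def shared_minimums_mbps(link_mbps: float, weights: list) -> list:
    """Guaranteed share per queue under contention: weight / sum of weights."""
    total = sum(weights)
    return [link_mbps * w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical weights for queues 1-4; queue 2 = routine, queue 3 = backups.
    weights = [5, 80, 10, 5]
    for queue, rate in enumerate(shared_minimums_mbps(70, weights), start=1):
        print(f"queue {queue}: at least {rate:.1f} Mbps when the link is congested")
```

These figures are floors, not caps. In shared mode a queue can take bandwidth the other queues leave idle, which is why backups can still fill the link when nothing else is sending.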
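For reference, BDP is the bottleneck bandwidth multiplied by the round-trip time. A minimal sketch of the arithmetic follows; the 10 ms RTT is purely an assumed figure, so measure the real RTT across the bridge before sizing anything.

```python
def bdp_bytes(bandwidth_mbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product in bytes.

    Mbps * ms = 1e6 bits/s * 1e-3 s = 1000 bits, then divide by 8 for bytes.
    """
    return bandwidth_mbps * 1000 * rtt_ms / 8


if __name__ == "__main__":
    # 70 Mbps bottleneck with an assumed 10 ms round-trip time.
    print(f"{bdp_bytes(70, 10):.0f} bytes")
```

The result is the amount of data in flight when the path is full. It gives a target for how much queued traffic the bottleneck port would need to absorb before it drops.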