Shaping traffic because a port is overloaded

All,

I'm attaching a diagram for what I'm currently experiencing.

Port 5/0/5 on our 3750 connects to our 3745 router. Port 5/0/5 is constantly going from 35% to >95% utilization from this one server. It's our SAN server, and apparently it's replicating back to our DR site. Is there a way to shape this traffic, and if so, where would I create the policy? On the switch or the router, and which interface would it be applied to? NAT isn't used in this scenario.

Edit: All of our traffic to our branches go out of this port, so whatever I do, I think it needs to be done by an acl so it matches just the traffic from the SAN. Am I correct?

Thanks!

John

HTH, John *** Please rate all useful posts ***
Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

I recommend placing QoS closest to the source; hence, applying it inbound from the SAN device to the switch would be ideal.

HTH,

__

Edison.

Super Bronze

Re: Shaping traffic because a port is overloaded

What you could do on the 3750 is deprioritize the SAN replication traffic so that it only uses bandwidth that is otherwise unused. This would be done by directing this traffic to its own egress queue with minimum weight in shared mode (i.e. srr-queue bandwidth share).

If the overall load is too much for the 3745, use srr-queue bandwidth limit to "shape" the port rate.
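For anyone following along, here is a rough sketch of what that could look like on the 3750. The SAN address (10.1.1.10), the SAN-facing port (Fa5/0/1), the interface type, DSCP 8, egress queue 4, and the weights are all assumptions for illustration; none of them come from this thread. Note that enabling mls qos globally changes how every port treats markings, so check show mls qos maps dscp-output-q and test before relying on it.

```
! Hypothetical values: SAN host 10.1.1.10, SAN port Fa5/0/1, DSCP 8, egress queue 4
mls qos
!
ip access-list extended SAN-REPL
 permit ip host 10.1.1.10 any
!
class-map match-all SAN
 match access-group name SAN-REPL
!
policy-map MARK-SAN
 class SAN
  set dscp 8
!
! Send DSCP 8 to egress queue 4 (on the default map it shares queue 2 with DSCP 0)
mls qos srr-queue output dscp-map queue 4 threshold 1 8
!
interface FastEthernet5/0/1
 ! Mark SAN traffic as it enters the switch
 service-policy input MARK-SAN
!
interface FastEthernet5/0/5
 ! Only the ratio between the weights matters; queue 4 (SAN) gets the smallest share
 srr-queue bandwidth share 25 70 4 1
 ! Optional: cap the whole port at 90% of line rate if the 3745 is overloaded
 srr-queue bandwidth limit 90
```

Because the queues are in shared mode, the SAN queue can still fill the port when nothing else is sending; the low weight only matters while the port is congested.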

Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

I will disagree on this approach.

On the srr-queue bandwidth share command the absolute value of each weight is meaningless, and only the ratio of parameters is used.

As for srr-queue bandwidth limit, it will affect everyone at that location, not just the SAN device.

It's best to use MQC with police inbound on the SAN port. Why take the traffic in just to drop it at egress?
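A minimal sketch of an inbound policer on the 3750, assuming the SAN host is 10.1.1.10, it attaches to Fa5/0/1, and 25 Mbps is the chosen rate. All three values are hypothetical; the right rate depends on what the SAN needs to stay in sync.

```
! Hypothetical values: SAN host 10.1.1.10, SAN port Fa5/0/1, 25 Mbps rate
mls qos
!
ip access-list extended SAN-REPL
 permit ip host 10.1.1.10 any
!
class-map match-all SAN
 match access-group name SAN-REPL
!
policy-map POLICE-SAN
 class SAN
  ! 25,000,000 bps with a 1,000,000-byte burst; traffic over the rate is dropped
  police 25000000 1000000 exceed-action drop
!
interface FastEthernet5/0/1
 service-policy input POLICE-SAN
```

The policer applies the same limit whether or not the uplink is busy, which is the trade-off debated in the replies that follow.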

Super Bronze

Re: Shaping traffic because a port is overloaded

"On the srr-queue bandwidth share command the absolute value of each weight is meaningless, and only the ratio of parameters is used. "

Correct, but that's the idea. If we could use a policy map, it would be something like:

policy-map x
 class besteffort
  bandwidth remaining percent 99
 class SAN
  bandwidth remaining percent 1

(The above is treating SAN replication traffic, more or less, like scavenger class.)

"As for the srr-queue bandwidth limit, it will affect everyone on that location, not just the SANS device. "

Yes and no. It will affect everyone in that it limits the overall rate to what the 3745 can accept, but there's no point in driving the link with more traffic than the 3745 can process regardless whether it's SAN traffic or other traffic. However, within the bandwidth capacity of the 3745, SAN will effectively only have "left over" bandwidth.

"Best to use MQC with police inbound on the SANS port. Why take the traffic in just to drop it at egress? "

Because on egress we're dropping SAN against total aggregate congestion, i.e. drops more if there's other traffic that needs the bandwidth, drops less if other traffic doesn't need the bandwidth. With an inbound policer, you drop all the time and either don't fully utilize excess bandwidth or conversely allow the policed (SAN) traffic to obtain bandwidth you would prefer other traffic to obtain.

Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

You aren't addressing John's concern on shaping or policing the traffic.

John isn't looking to guarantee one type of traffic over the other; he is looking to control the burst traffic the SAN is creating on his network.

I understand that police will limit the traffic to an X value but it's up to John to determine what's the adequate X value the SAN can burst to.

__

Edison.

Super Bronze

Re: Shaping traffic because a port is overloaded

"You aren't addressing John's concern on shaping or policing the traffic.

John isn't looking to guarantee one type of traffic over the other; he is looking to control the burst traffic the SAN is creating on his network.

I understand that police will limit the traffic to an X value but it's up to John to determine what's the adequate X value the SAN can burst to. "

Perhaps, or perhaps not. I suspect his real concern isn't so much just the need to shape or police the SAN traffic, but as you note "control the burst". One must ask, why control the burst? Is it because we don't want a link to hit 95% utilization, could be, or is the concern really what such bursts might do to other traffic sharing the link, which is mentioned in the OP ("All of our traffic to our branches go out of this port"), or performance impact to the 3745 (not explicitly mentioned)? If the former (i.e. we don't want SAN to exceed some %), yes policing the SAN traffic could be used to define "adequate" bandwidth. If the latter (i.e. don't degrade other traffic [and/or the 3745]), bursts are controlled such that there's effectively no impact to other traffic (and/or the 3745), which isn't guaranteed with policing just SAN traffic.

BTW, there's no reason why policing and bandwidth ratio management can't be combined, but usually there's little need to do so if bandwidth allocations between traffic classes can be managed. There are situations where the platform doesn't provide the capability to manage bandwidth and policing is your only option, but with the 3750 this isn't one of them.

[edit]

Just so there's no confusion: what I'm suggesting is bandwidth management between traffic classes first. Port limiting is optional, depending on the load impact to the 3745. Port limiting alone would protect the router's performance but, like policing, wouldn't guarantee traffic performance.

Also, I have a lot of real-world experience with something similar: remote sites that back up both hosts and servers across the WAN. These backup applications drive the WAN link to 100% utilization for hours during normal business hours (mainly laptop hosts that connect only during the day; server backups are scheduled in the "early hours"). On that same link, running at 100% for hours, other business applications, including VoIP, run without problems when the above approach can be used. A few sites have L3 equipment that only supports policing, so backup traffic is policed there, but there's no quality of service, and it's noticeable regardless of the "adequate" bandwidth that the backup is policed to. (In fact, there doesn't have to be any backup traffic for there to be user complaints.)

Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

Our real-world experience isn't up for debate. What you did for your customers may not be what John wants. I have countless designs under my belt and no two designs are ever alike. All designs need to accommodate the customer's needs first; then you use the technology to address them. Not the other way around.

You start by saying:

I suspect his real concern isn't so much just the need to shape or police the SAN traffic,

Yet, he used the word shape several times on his initial message.

Then you said:

is the concern really what such bursts might do to other traffic sharing the link

The way I read it, John wants to perform this QoS only on the SAN device and leave the remaining traffic the way it is now.

You and I are seeing John's request from different angles. We both understand the technology, but it seems only one of us really got his request. I will wait for John's reply and see who is closer to what he wants.

__

Edison.

Super Bronze

Re: Shaping traffic because a port is overloaded

Didn't intend to debate real world experience, nor intend such now, but I have run into the (common) situation that many aren't used to the concept of managing traffic using various QoS techniques.

Policing is an obvious solution for restricting some traffic's link utilization, but I used a real-world example to highlight a somewhat similar case and to demonstrate that what we may want to manage is the SAN traffic's impact, not just link utilization.

Perhaps you're correct, ". . . John wants to perform this QoS only on the SAN device and leave the remaining traffic the way it is now.", but he may not be aware of other possibilities or pitfalls. Again, just policing SAN traffic to some % can still allow the link to burst to full utilization from other traffic (although probably not as often), and it doesn't guarantee that other traffic isn't degraded by SAN bandwidth utilization. For instance, if you limit SAN to 25%, that's 25% unavailable for other traffic.

I agree we're seeing John's request from different angles. Your approach is more of a direct technical answer, i.e. you want to limit SAN traffic, do this. My approach assumes there's an underlying issue, even if not explicitly stated: not just that we don't want SAN to use more than X% bandwidth, but that we don't want SAN bursts to adversely impact other traffic.

In other words, you may indeed have answered John's request, and even given him what he wants. I've tried to provide information to assist John with what he might need, even if he doesn't realize it.

What's somewhat puzzling is why you disagreed with my suggestion or are making such a fuss. Even if you're 100% correct, i.e. your answer is exactly what John desires, so what? He can choose it, give you a 5 and mark your answer as question resolved. Is there some pitfall to my suggestion you see? Some risk to John or others using my suggestion? If there is, I welcome correction, but if what I suggest isn't what John wants, so what? If it doesn't help John, it might be someone else finds it of interest when reading these forums.

Re: Shaping traffic because a port is overloaded

I want to thank you BOTH for such great answers and direction.

Joseph brings up a good point in that I don't think I explained in my OP the detail of the "problem" that I'm experiencing. The 3745 isn't being overloaded, as far as I can tell, but the port that the 3745 connects to in the switch does get up to 95-99% utilization when the SAN bursts.

I didn't figure that policing would be good since it would drop the traffic when the queue size gets full, and I've been told (I'm not a SAN admin) that the concern would be if the SAN can't sync up quick enough, it could cause a problem. (I have no way of verifying this unless I called EMC.)

That's why I used shaping as my other alternative, and I wanted to use shaping outbound. I don't know if I need to apply it on the port that the SAN connects to, or the port that the traffic goes out of the 3750. I would think I would want to apply it inbound on the port that the SAN connects to, but as Joseph probably has seen in my other posts, I have a hard time with the direction these should be placed in. (I try to apply them like I would an ACL.)

Now, overall, I would like the SAN to use little bandwidth during the day and as much as it wants at night. I don't know the first thing about QoS, but I do know I have a real need for it in this situation. I'm kinda doing this blindly, and, I don't want to affect everything else.

I really appreciate both of your suggestions.

Thanks!

John

HTH, John *** Please rate all useful posts ***
Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

Hi John,

Thanks for expanding your requirements.

If the bursty nature of the SAN device is affecting throughput for other services in your network, then Joseph's approach will be ideal in this situation.

If you want to avoid bursts from the SAN device, then you need to control that traffic, and the only solution is policing.

You can't shape inbound, and shaping outbound on the 3750 is very cumbersome.

You would need to allocate the SAN traffic to a queue and shape that queue using SRR.

Keep in mind, just like policing, shaping drops traffic as well. Shaping stores the packet a bit longer in the buffers but in bursty situations and if your shaping % is lower than the demand, traffic will be dropped.
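As a rough illustration of shaping a single queue with SRR, assume the SAN traffic has already been classified into egress queue 4 on the uplink (a hypothetical choice, as is the interface type). Shaped weights are inverse ratios, so a weight of 4 would hold that queue to roughly one quarter of the port rate.

```
! Hypothetical: SAN traffic already lands in egress queue 4 on this port
interface FastEthernet5/0/5
 ! Shaped weight is an inverse ratio: 4 holds queue 4 to about 1/4 of the port rate
 ! A weight of 0 leaves that queue in shared mode; queue 1 keeps its default of 25
 srr-queue bandwidth shape 25 0 0 4
```

Unlike shared mode, a shaped queue cannot borrow idle bandwidth, so the SAN would stay capped even at night.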

Re: Shaping traffic because a port is overloaded

Can I not shape outbound on the port that the SAN is connected to?

Where are the "SRR" commands configured: on the 3750 or the 3745?

Can you point me in the direction of good documentation to do this?

Thanks!

John

HTH, John *** Please rate all useful posts ***
Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

Shaping outbound on the port that is connected to the SAN will control traffic coming from the remote SAN.

You need to see the traffic flow from the switch's perspective.

The SRR commands we are talking about are configured on the switch.

QoS on the 3750 can be found here:

http://www.cisco.com/en/US/docs/switches/lan/catalyst3750/software/release/12.2_46_se/configuration/guide/swqos.html

HTH,

__

Edison.

Re: Shaping traffic because a port is overloaded

Will I need to do anything in the router, or will the traffic stay controlled to the destination?

Thanks,

John

HTH, John *** Please rate all useful posts ***
Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

We've discussed several designs. Which design are you selecting?

You are concerned about the router interface showing 95% utilization but are the users in the location complaining about slowness during that period of time?

___

Edison.

Re: Shaping traffic because a port is overloaded

"...are the users in the location complaining about slowness during that period of time?"

Actually, no they're not. I figured that it would be better than having the high utilization. =)

Thanks!

John

HTH, John *** Please rate all useful posts ***
Super Bronze

Re: Shaping traffic because a port is overloaded

"I didn't figure that policing would be good since it would drop the traffic when the queue size gets full, and I've been told (I'm not a SAN admin) that the concern would be if the SAN can't sync up quick enough, it could cause a problem. (I have no way of verifying this unless I called EMC.) "

Yes, that's a valid concern. If the backup replication can't keep up with the original, the replica SAN device can lose sync with the original. (In fact, from a QoS perspective, there could be a need to guarantee a minimum amount of bandwidth to keep the backup replica current.)

If there isn't any "problem" beyond seeing the link hit high utilization, you really don't need to do anything. But you write, "I do know I have a real need for it in this situation." So, other than seeing the link get busy, what's your concern? If you only want to avoid seeing the link busy, then policing is a simple solution. If you want to keep the SAN replication from adversely impacting non-SAN traffic, policing can help but it wouldn't be as "good" as queue management.

If SAN replication does have a minimum bandwidth requirement, that can be accomplished by how the egress queues are weighted. At the queue level, as Edison mentions, shaped mode (SRR) can be used, but if the bandwidth is available, i.e. not otherwise needed, why not allow SAN to utilize it if it wants, regardless of the time of day?

What I would expect to be important: a) SAN doesn't adversely impact other traffic, b) SAN obtains at least the bandwidth it needs to maintain sync.

PS:

Looking at your diagram, one could also consider traffic flows in/out the 3745, but so far, you've only mentioned replica SAN traffic and a busy link between the 3750 and 3745 being of concern.

Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

Hey Joseph, great post and I agree with everything you said ;)

Super Bronze

Re: Shaping traffic because a port is overloaded

Thank you.

I thought yours with "Ok, I won't disagree with any of your posts in the future... " was even better!

It demonstrates one of the benefits of these forums; how it helps people to improve, to learn . . .

ROFL

Re: Shaping traffic because a port is overloaded

Joseph,

I'm not sure of the bandwidth requirements for the SAN, and I've not heard of any complaints regarding speed. But, for ease of understanding, say that I guaranteed a min bandwidth for the SAN to replicate across the link. Would that help with the bursty nature of the SAN going through that port?

Or, would that just tell the interface "SAN is allowed 10mb on 100mb port, BUT if there's more allow it to have more." I know that I can police the traffic minimums, burst, etc, but if I applied a policy that gave a minimum 10mb, would that drop all of the "available" to others down to a 90mb port or even less if we have to consider the 25% overhead for network control?

Thanks,

John

HTH, John *** Please rate all useful posts ***
Super Bronze

Re: Shaping traffic because a port is overloaded

"I'm not sure of the bandwidth requirements for the SAN, and I've not heard of any complaints regarding speed. But, for ease of understanding, say that I guaranteed a min bandwidth for the SAN to replicate across the link. Would that help with the bursty nature of the SAN going through that port? "

No, it would only help ensure other traffic doesn't adversely impact SAN replication.

"Or, would that just tell the interface "SAN is allowed 10mb on 100mb port, BUT if there's more allow it to have more." I know that I can police the traffic minimums, burst, etc, but if I applied a policy that gave a minimum 10mb, would that drop all of the "available" to others down to a 90mb port or even less if we have to consider the 25% overhead for network control?"

Yes, if you've set a floor of 10 Mbps for SAN out of 100 Mbps, other traffic wouldn't be able to acquire more than 90 Mbps, even if it wanted it, unless SAN used less. Conversely, because it is a floor or minimum, SAN could use more than 10 Mbps if other traffic wasn't using it.

If you don't believe there are any performance issues, you, again, likely don't need to do anything. Only if the bursty SAN traffic is causing other issues would you be compelled to do something.

Although, if SAN replication is as bursty as you note, I would expect at least brief transitory performance issues; but many live with typical best-effort networks without knowing it can often be better. Many assume inconsistent network performance is normal (and it often is in best-effort-only networks that are oversubscribed, especially those that don't use FQ by default [3750s, for example, don't support FQ]).

Shaping or policing can be used for many purposes, one of which is upstream control, especially when you don't have later downstream control. For example, if the link to the backup datacenter was a T-3, one might want to police the SAN replication at the edge to 45 Mbps. You know more bandwidth isn't available later, so why let it congest later? However, assuming other traffic shares the T-3 downstream, does it make sense to further limit SAN replication to less than 45 Mbps? It might if you have no control over downstream congestion, but if we do, and since we don't know upstream what the congestion is downstream, it's better to manage the congestion there, where it forms. This doesn't preclude still limiting the SAN source to send at 45 Mbps, but then we need to manage bandwidth at two points. For the cost of such management, we avoid sending "too much" traffic before it gets to the later congestion management point, at least for one source. If the traffic is something like TCP, it won't go much beyond the downstream congestion point's bandwidth because it will self-regulate its flow rate. Given this, and the issues with managing both upstream and downstream, I've found there's often little benefit to upstream rate limiting if we're going to manage the bandwidth downstream. (Note: there are always exceptions.)

PS:

BTW, the 25% you're likely thinking of is the default bandwidth you can't explicitly allocate to defined CBWFQ classes unless you override the reserved default. Although that 25% is held back from explicit allocation, it can still be used by other traffic. (Also, 3750s don't support CBWFQ the way your 3745 does.)
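To make the floor and the 25% reserve concrete, here is a rough sketch for the router side. The SAN address (10.1.1.10), the class and policy names, and the interface (FastEthernet0/0) are hypothetical and not taken from this thread; the 10 Mbps figure is the one used in the question.

```
! Hypothetical values: SAN host 10.1.1.10, egress interface FastEthernet0/0
ip access-list extended SAN-REPL
 permit ip host 10.1.1.10 any
!
class-map match-all SAN
 match access-group name SAN-REPL
!
policy-map WAN-OUT
 class SAN
  ! Floor of 10 Mbps (value is in kbps) during congestion; SAN may use more if it's free
  bandwidth 10000
 class class-default
  fair-queue
!
interface FastEthernet0/0
 ! Optional: the default is 75, which leaves 25% that can't be assigned to classes
 max-reserved-bandwidth 100
 service-policy output WAN-OUT
```

The bandwidth statement only comes into play while the interface is congested; it is a guarantee, not a cap.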

Re: Shaping traffic because a port is overloaded

Thanks Joseph!

So, do we lose downstream control when we have an inbound connection that we don't own? In other words, is my control lost at an edge router connected to an ISP, but wouldn't be lost if I had router control at both ends of a P2P T1 link? I've been a little confused as to why we can't really shape traffic coming to us, say from the internet. I guess we just police that traffic? If I've got an FTP site on a 20mb connection, but I don't want it to ever use more than 2mb because I also have a game server, would you normally police on that port inbound?

Sorry if it seems like I got off topic. QoS is a really big subject, and I'm trying to get a grasp on it.

Thanks!

John

HTH, John *** Please rate all useful posts ***
Super Bronze

Re: Shaping traffic because a port is overloaded

A shaper requires a queue to store overspeed packets, which is why shapers are configured outbound only. (In theory you could do it on the inbound interface, but since you can already do it on an outbound interface, there's not much point.)

A policer doesn't require a queue, so it works about the same inbound or outbound. (Another purpose for a policer might not be to drop overspeed traffic, but to tag it based on its bandwidth usage. This might be one reason why a policer is supported on input where a shaper isn't.)

Some traffic will regulate its flow rate when it sees drops (or ECN [or even jump in RTT]), some will not. For the latter, neither a shaper nor policer will keep such a flow from sending as much traffic as it desires upstream of the control point (both would control downstream).

The former, i.e. traffic that regulates its bandwidth based on seeing drops (e.g. TCP), will slow its transmission rate, but only after seeing one or more drops. Plus, at least with TCP, it actually sends traffic as quickly as possible because it doesn't manage the actual transmission rate, only how many packets to send back-to-back. Something like TCP can then "burst" into a large share of bandwidth before it knows to slow down, and/or the actual burst can fill a link (for a time) beyond the downstream policer/shaper's bandwidth setting.

What this means is that it's difficult, if not impossible, to regulate transmission rate upstream of our control point, although, again, we can regulate it downstream of our control point.

In your question about Internet, if the WAN link was the primary congestion point to/from the Internet, we would often want to do something outbound on both ends of the link. If one end is controlled by the ISP, and they won't allow us to control their side's outbound, we cannot obtain the same level of control on our side inbound.

For instance, on their side, we can (usually) easily limit FTP to 2 Mbps of the 20. On our side, we can police the inbound FTP to 2 Mbps (or shape 2 Mbps outbound on router heading toward our network), but FTP might still use more than 2 Mbps on the WAN link, when bursting (especially if it's still in "slow start").
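A minimal sketch of the "police inbound on our side" option, assuming an FTP server at 192.0.2.10 and an ISP-facing interface of FastEthernet0/1 (both hypothetical). The ACL simply matches all TCP to that server, which is a simplification since FTP data ports vary.

```
! Hypothetical values: FTP server 192.0.2.10, ISP-facing interface FastEthernet0/1
ip access-list extended FTP-IN
 permit tcp any host 192.0.2.10
!
class-map match-all FTP
 match access-group name FTP-IN
!
policy-map LIMIT-FTP-IN
 class FTP
  ! Drop inbound FTP above 2 Mbps; the WAN link itself can still see bursts above this
  police 2000000 conform-action transmit exceed-action drop
!
interface FastEthernet0/1
 service-policy input LIMIT-FTP-IN
```

As noted above, this only controls what passes the router; the sender slows down only after it sees the drops.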

PS:

BTW, for TCP, I've noticed that if you police inbound at a rate much slower than the "nominal" rate, you might average your nominal rate. E.g. for a 2 Mbps inbound target, policing at somewhere between 10 and 50% of it seems to come close with default burst intervals. (I haven't tried it, but I suspect tuning the burst interval down might allow more precise control.)

For TCP you can also shape outbound ACKs, but since every other packet is normally ACKed, inbound packet sizes vary, and since ACKs can piggyback, very difficult to target a specific inbound rate.

Re: Shaping traffic because a port is overloaded

Thanks Joseph!

John

HTH, John *** Please rate all useful posts ***
Hall of Fame Super Bronze

Re: Shaping traffic because a port is overloaded

What's somewhat puzzling is why you disagreed with my suggestion or are making such a fuss

Ok, I won't disagree with any of your posts in the future...

__

Edison.

New Member

Re: Shaping traffic because a port is overloaded

I have a similar issue to this. We have two SANs that are set up to replicate over an 85 Mbps MAN connection. The main site has a 3745 router, and the remote site has a 2821. If I let the SAN replication traffic go unchecked, it will consume the entire link, causing problems for my users (at the remote site). Right now I am using the "traffic-shape group" command to match traffic from the SAN based on IP and limit it to 25 Mbps in the routers. This seems to work, but I would like to allocate more bandwidth without impacting our users or backups (basically they get priority over SAN traffic). What commands would I need to implement something like this? I've played around with some different policy maps, but can't seem to get them right.

Super Bronze

Re: Shaping traffic because a port is overloaded

Assuming the MAN is your primary bottleneck, you would want CBWFQ policies somewhat like this:

policy-map 85Mbps
 class class-default
  !allow 5 to 15% for Ethernet overhead
  shape average 77000000
  service-policy prioritzetraffic

policy-map prioritzetraffic
 !class that matches your SAN traffic
 class SANtraffic
  !adjust bandwidth as low as possible to meet minimum bandwidth needs
  !remember class can use more if available
  bandwidth remaining percent 1
 class class-default
  fair-queue

On both routers egress

interface ethernet #
 service-policy output 85Mbps

New Member

Re: Shaping traffic because a port is overloaded

I'll give that a shot. I didn't think of using the bandwidth remaining command. I had tried just using the bandwidth command and that filled up the MAN connection causing slowness for the users.

Super Bronze

Re: Shaping traffic because a port is overloaded

Nothing special about the "bandwidth remaining". What's important is class bandwidth ratios. If you try 1%, other traffic should obtain a higher priority vs. the SAN traffic, if such traffic wants the bandwidth. Yet SAN can still use 100% of the link (if the bandwidth is available).

What I'm suggesting shouldn't (much) delay non-SAN traffic if there's SAN traffic, but SAN traffic can be delayed by non-SAN traffic.

You will need to ensure that SAN isn't too starved for bandwidth if you go with a low bandwidth %. Also know that on most small router platforms and IOS versions, FQ is a special case within class-default and you might not see a 1:99 bandwidth ratio.
