etherchannel load balancing

Dennis Olvany
Level 1

Why does Cisco not offer round-robin load balancing for etherchannel?

I understand that round-robin can result in out-of-order packet delivery, but TCP is well equipped to deal with this issue. Per-packet load balancing is achievable at layer 3, so why not at layer 2? Round-robin would be a far better alternative to packet discards in a layer 2 implementation.


Peter Paluch
Cisco Employee

Hello,

Round-robin (or per-frame) load balancing is not supported by Cisco switches because it could result in misordered frames, i.e. frames arriving in a different order than the one in which they were originally sent. Ethernet on its own preserves frame ordering, so deploying EtherChannel must not break that existing behavior: EtherChannel is a transparent technology and must not change fundamental Ethernet behavior in any way other than increasing the available bandwidth. Round-robin is therefore not a viable approach, even though it would indeed distribute the load more evenly.
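To illustrate the difference (a toy sketch, not Cisco's actual hashing algorithm): a flow-based hash pins every frame of a conversation to one member link, which preserves per-flow ordering, while round-robin sprays consecutive frames of one flow across all links.

```python
# Toy model: flow-hash vs. round-robin link selection for a
# 4-link EtherChannel. Not Cisco's real algorithm.

NUM_LINKS = 4

def flow_hash_link(src_mac: str, dst_mac: str) -> int:
    """Pick a member link from the low bits of an XOR of the two MACs."""
    s = int(src_mac.replace(":", ""), 16)
    d = int(dst_mac.replace(":", ""), 16)
    return (s ^ d) % NUM_LINKS

# Every frame of the same conversation lands on the same link...
links = {flow_hash_link("00:11:22:33:44:55", "66:77:88:99:aa:bb")
         for _ in range(1000)}
assert len(links) == 1

# ...while round-robin scatters consecutive frames of one flow,
# so frames of a single conversation can race each other.
rr_links = {frame_no % NUM_LINKS for frame_no in range(1000)}
assert len(rr_links) == NUM_LINKS
```

Because the hash depends only on fields that are constant for a conversation, no frame of that conversation can overtake another on a parallel link.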

It is true that in Layer 3 switching/routing, per-packet load balancing is seen quite often. However, that form of load balancing suffers from the very same issue and is often discouraged, not only because of packets arriving out of order, but also because stateful devices like NAT boxes or firewalls may drop traffic if they do not receive the first packet, or all packets, in order.

Best regards,

Peter


I'm not sure I can agree that "Ethernet technology alone preserves the frame ordering". In an ethernet-only network this may be true, but this certainly does not apply to a larger internetwork. I think you make a good point about firewalls and, if this were the case, I would consider this a serious design limitation in the firewall mechanism.

Hi Dennis,

I'm not sure I can agree that "Ethernet technology alone preserves the frame ordering". In an ethernet-only network this may be true, but this certainly does not apply to a larger internetwork.

But the EtherChannel is an Ethernet-only technology and is always used only inside a single Ethernet network. It does not span throughout a larger internetwork - and if it did, the large internetwork would revert to a large single Ethernet network, wouldn't it?

I think you make a good point about firewalls and, if this were the case, I would consider this a serious design limitation in the firewall mechanism.

A serious design limitation? No, rather a natural result of their stateful nature. Consider a firewall configured with the following rule set:

  1. Allow connections initiated from internal network towards an external web server X.
  2. From the external webserver X, allow only replies to connections initiated from inside to pass into the internal network.

Now, consider that the TCP SYN segment to server X goes over another path, and the reply (TCP SYN/ACK) from server X arrives at this firewall. It will not let this segment pass into the internal network because it did not see the very first TCP segment and therefore cannot verify the validity of this second one.

And the same goes if the firewall received the third TCP segment (TCP ACK) from the internal network going to server X: because it did not see the first TCP SYN segment, it again cannot verify the legitimacy of this segment.

Is the firewall making wrong decisions here? Absolutely not. These segments could well be forged, trying to circumvent naive firewalls that check only the TCP flags.
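The behavior described above can be sketched as follows (a deliberately simplified model, not any real firewall's implementation): state is created only on an outbound SYN, and any "reply" for which no state exists is dropped.

```python
# Minimal model of a stateful firewall implementing the two rules
# above: allow connections initiated from inside, and allow only
# replies to those connections back in.

class StatefulFirewall:
    def __init__(self):
        self.connections = set()  # (client, server) pairs with state

    def outbound(self, client, server, flags):
        if flags == "SYN":
            # First segment seen: create connection state.
            self.connections.add((client, server))
            return "pass"
        # e.g. the third segment (ACK) of a handshake we never saw
        return "pass" if (client, server) in self.connections else "drop"

    def inbound(self, client, server, flags):
        # Only replies to connections initiated from inside may enter.
        return "pass" if (client, server) in self.connections else "drop"

fw = StatefulFirewall()
# The SYN took another path, so the SYN/ACK arrives with no state:
print(fw.inbound("10.0.0.5", "serverX", "SYN/ACK"))   # drop
# Normal case: the SYN was seen first, so the reply passes:
fw.outbound("10.0.0.5", "serverX", "SYN")
print(fw.inbound("10.0.0.5", "serverX", "SYN/ACK"))   # pass
```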

Best regards,

Peter

Peter Paluch
Cisco Employee

Hi,

One more thought. You wrote:

I understand that round-robin can result in out-of-order packet delivery, but tcp is well equipped to deal with this issue.

TCP is equipped for that situation, but there are other protocols used on Ethernet networks that have no way of recognizing reordered frames/packets/segments. Just think of UDP streams of any kind, or pure IP-encapsulated protocols (L2TPv3 is capable of running over IP directly, for example) - there are many examples.

In addition, even though TCP is able to put segments back into the correct order, as soon as it detects misordering it tends to slow down the transmission. In effect, TCP could easily slow down to a crawl, running at speeds far below the throughput of a single link in the EtherChannel, thereby completely negating the advantages of EtherChannel.
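The mechanism behind that slowdown can be sketched with a toy model: a cumulative-ACK receiver re-ACKs the last in-order byte whenever a segment arrives out of order, and three duplicate ACKs trigger fast retransmit plus a congestion-window cut at the sender, even though nothing was actually lost.

```python
# Toy model (not a full TCP implementation) of how reordering alone
# generates duplicate ACKs and makes the sender throttle itself.

def dup_acks(arrival_order):
    """Count duplicate ACKs produced by a cumulative-ACK receiver."""
    expected, dups = 0, 0
    buffered = set()
    for seg in arrival_order:
        if seg == expected:
            expected += 1
            while expected in buffered:   # deliver buffered segments
                expected += 1
        else:
            buffered.add(seg)
            dups += 1  # out-of-order arrival re-ACKs the old number
    return dups

in_order  = dup_acks([0, 1, 2, 3, 4, 5])   # 0 dup ACKs
reordered = dup_acks([0, 3, 4, 5, 1, 2])   # 3 dup ACKs

cwnd = 10
if reordered >= 3:   # classic fast-retransmit threshold
    cwnd //= 2       # sender halves its window: throughput drops
```

So a round-robin spray that routinely delivers segments out of order keeps cutting the congestion window, which is why the aggregate can end up slower than a single member link.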

Best regards,

Peter


It is generally understood that packet ordering is not guaranteed across large internetworks, and protocols that lack such a mechanism should be able to cope with such an environment. Offering options that maintain packet order is a good thing. Your point about the transmission slowing due to out-of-order packets is well taken, but I think it is preferable to discards producing the same result. Using the most granular load-balancing method does not provide adequate packets-per-second to prevent transmit discards on the etherchannel at modest levels of bandwidth utilization.

Hello Dennis,

First of all, let me tell you that you are a fine debater - I enjoy discussing this issue with you very much.

It is generally understood that packet ordering is not guaranteed across large internetworks and protocols which do not have such a mechanism should be able to contend with such an environment.

I believe that the analogy used in this statement regarding large internetworks is not completely relevant to our discussion about the EtherChannel.

If you talk about a large internetwork in which packet reordering may take place, you are implicitly assuming that there are actual mechanisms at work that may cause this reordering, such as multiple paths to the same destination, non-trivial queueing mechanisms, or different link layer technologies. Such a heterogeneous system may indeed cause packets to be delivered in a different order, but note that the EtherChannel we are discussing does not fall among those mechanisms. An EtherChannel bundle is itself limited to a single link layer technology and is bound to a single broadcast domain; in essence, it is a transit element of a single Ethernet network. It would be inappropriate for an Ethernet network to behave differently with respect to frame ordering depending on whether or not an EtherChannel is deployed in it.

I restate my point for better clarity: within a single Ethernet network (I am not talking about large internetworks - I am talking about a single Ethernet domain), the frame ordering is preserved without EtherChannel, and must remain preserved even if the EtherChannel is deployed. Otherwise, the EtherChannel would not be a transparent technology and the protocols would need to be more "paranoid" depending on whether the EtherChannel is in use.

Using the most granular load-balancing method does not provide adequate packets-per-second to prevent transmit discards on the etherchannel at modest levels of bandwidth utilization.

Well, the EtherChannel technology is not a miracle worker and it does have its limitations. It tries to make use of parallel links in a switched topology to provide a greater bandwidth but I do not believe that the EtherChannel ever tried to guarantee a linear throughput increase. Considering its relative simplicity, it is hard to achieve anyway.

Suffice it to say that an EtherChannel cannot have more than 8 active links. Ethernet speeds, on the other hand, usually scale by a factor of 10 (10M/100M/1G/10G), so simply moving to a more recent Ethernet variant gives you more speed than even a fully loaded EtherChannel bundle can.
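A quick back-of-the-envelope check of that argument: a maximally loaded 8-link bundle at one speed tier still tops out below the single-link rate of the next Ethernet generation.

```python
# 8 active links at one Ethernet speed tier vs. one link of the next
# tier (tiers scale by 10x: 10M/100M/1G/10G).

MAX_ACTIVE_LINKS = 8
tiers_mbps = [10, 100, 1_000, 10_000]

for slower, faster in zip(tiers_mbps, tiers_mbps[1:]):
    bundle = MAX_ACTIVE_LINKS * slower
    assert bundle < faster  # the 8x bundle still loses to the next tier
    print(f"8 x {slower} Mb/s = {bundle} Mb/s  <  1 x {faster} Mb/s")
```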

In my opinion, the EtherChannel should not be seen as a technology providing near-to-linear throughput increase. It simply wasn't designed to do that.

Best regards,

Peter

First of all, let me tell you that you are a fine debater - I enjoy discussing this issue with you very much.

Same here.

In a communication flow where everything is designed either to ignore ordering (Ethernet, IP, etc.) or to tolerate reordering (TCP), this load-balancing limitation seems to impose a double standard. It would be nice if the port-based load-balancing method were more pervasive. I agree that 10G would be ideal, but also the most costly. 10G seems like overkill to support the pps required of only ~100Mb of traffic.

Hi Dennis,

In a communication flow where everything is designed to ignore ordering (ethernet, IP, etc) or tolerate reordering (TCP), this load-balancing limitation seems to impose a double standard.

You seem to assume that just because an Ethernet does not have sequence numbers and has no explicit mechanisms to ascertain that the frames are arriving in order, it is inherently a technology that ignores ordering.

But that is too strong an assumption. There may be implicit mechanisms or rules that effectively prevent reordering to the extent that sequence numbers or other explicit tools are not necessary at all, at least on a selected part of the network.

In particular, a single Ethernet link - a link that interconnects two active Ethernet devices - can never cause frame reordering. On a link between two switches, no reordering of frames may take place. Do we agree on this point?

Now, if you replace that link with an EtherChannel (as an EtherChannel only replaces such links), there must be absolutely no change to the behavior of the replaced network segment. It did not reorder frames before so it is not supposed to reorder frames now.

In addition, the IEEE 802.1AX-2008 standard that discusses the EtherChannel technology is very clear on this - Clause 5.2.1, item f)

Frame ordering must be maintained for certain sequences of frame exchanges between MAC Clients (known as conversations, see Clause 3). The Distributor ensures that all frames of a given conversation are passed to a single port. For any given port, the Collector is required to pass frames to the MAC Client in the order that they are received from that port. The Collector is otherwise free to select frames received from the aggregated ports in any order. Since there are no means for frames to be misordered on a single link, this guarantees that frame ordering is maintained for any conversation.

where the term conversation is defined as follows (Clause 3.8):

A set of frames transmitted from one end station to another, where all of the frames form an ordered sequence, and where the communicating end stations require the ordering to be maintained among the set of frames exchanged.

Regarding the greater pervasiveness of the load-balancing methods: it is confined to the extent of the clauses I have just quoted from 802.1AX. The most pervasive methods take the MAC+IP+L4 addressing information into account, but they can hardly go beyond that, because doing so could split a single conversation into subflows that would be subject to possible reordering.

Best regards,

Peter

Come to think of it, Cisco's load-balancing algorithm only seeks to preserve ordering for higher-level protocols. The very nature of etherchannel makes it impossible to preserve ordering at the frame level. The more granular algorithms (e.g. src/dst IP) will split communication flows over different links, but the frames are arbitrarily reordered as they come off the etherchannel.

Regarding the available algorithms, I favor the src/dst tcp/udp port option on the 6500.
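For reference, that preference is set globally on IOS-based platforms; on a Catalyst 6500 it would look roughly like this (keyword availability varies by platform and software release, so verify against your own switch):

```
! Hash on source and destination Layer 4 ports (global configuration)
Switch(config)# port-channel load-balance src-dst-port

! Confirm which algorithm is in effect
Switch# show etherchannel load-balance
```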

Hi Dennis,

The very nature of etherchannel makes it impossible to preserve ordering at the frame level.

It preserves frame ordering in conversations which is exactly what 802.1AX requests, as I indicated earlier.

Best regards,

Peter

The snippet seems to use the term conversation with reference to ethernet frames only. Perhaps there is more to usage of the term which is not obvious.

Dennis Olvany
Level 1

I am considering another possibility for solving transmit discards on the etherchannel. I'm thinking that a jumbo MTU could slow packets-per-second and abate the discards.

Hi Dennis,

I am considering another possibility for solving transmit discards on the etherchannel. I'm thinking that jumbo mtu could slow packets-per-second and abate the discards.

I am not sure this would help. No device will "coalesce" frames or IP packets into bigger units for you once they have been sent. Intermediate Layer3 devices are capable of fragmenting IP packets but never of defragmenting or even coalescing them. Setting a higher MTU would be helpful only if the end hosts generated jumbo IP packets as well and the entire transmission path between the communicating hosts supported the jumbo MTU. Otherwise, you would not make use of the increased MTU, or you could accidentally run into IP fragmentation issues, should a part of the transmission chain not support the increased MTU.

Best regards,

Peter

All hosts and intermediate links on a VLAN would have to migrate to jumbo, which does nothing for pps to/from remote hosts. It does, however, reduce pps within a given VLAN and between jumbo VLANs which will serve to reduce overall pps. Path MTU Discovery should alleviate fragmentation issues.
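Rough pps arithmetic for the jumbo-MTU idea: at a fixed offered load, frame rate scales inversely with frame size, so moving intra-VLAN traffic from a 1500-byte to a 9000-byte MTU cuts its packets-per-second by roughly 6x (ignoring per-frame overhead such as preamble and inter-frame gap).

```python
# Packets-per-second at a given offered load and frame size.
# Simplified: payload-only arithmetic, no Ethernet framing overhead.

def pps(load_mbps: float, frame_bytes: int) -> float:
    return load_mbps * 1_000_000 / (frame_bytes * 8)

standard = pps(100, 1500)   # ~8,333 pps at 100 Mb/s
jumbo    = pps(100, 9000)   # ~1,389 pps for the same load
print(round(standard), round(jumbo), round(standard / jumbo, 1))
```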
