LACP / Loadbalance

mail000015
Level 1

I want to try and understand the error messages listed below.

My setup has two Catalyst 3560 switches working as a pair, each providing failover if a port goes down on the other.  Here are the messages:

402942: *Mar  6 23:30:13.243: %SW_MATM-4-MACFLAP_NOTIF: Host 000c.29d4.3265 in vlan 10 is flapping between port Po12 and port Gi0/1

402946: *Mar  6 23:30:37.427: %SW_MATM-4-MACFLAP_NOTIF: Host 000c.29d4.3265 in vlan 10 is flapping between port Gi0/1 and port Po12

402947: *Mar  6 23:30:39.306: %SW_MATM-4-MACFLAP_NOTIF: Host 000c.29d4.326f in vlan 60 is flapping between port Gi0/1 and port Gi0/2

402937: *Mar  6 23:30:01.776: %SW_MATM-4-MACFLAP_NOTIF: Host 000c.29d4.326f in vlan 60 is flapping between port Gi0/2 and port Gi0/1

From a little research I understand this could be caused by a loop or by incorrectly configured load balancing.  Here are the configurations of both switches:

Port:

interface GigabitEthernet0/1
 description esx1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2-100
 switchport mode trunk
 channel-group 1 mode active
 spanning-tree portfast trunk

Here is the port channel:

interface Port-channel1
 description esx1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2-100
 switchport mode trunk
 spanning-tree portfast trunk

The other switch is connected through an interconnect on ports 47 and 48, using Port-channel 12.

Any advice or ways to debug this would be very helpful.  I must add that I cannot use the debug option in IOS: the switches are remote and in production, and I cannot risk the ports becoming unavailable.
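
Since debug is ruled out, a few read-only show commands may be enough to narrow this down. The list below is a sketch rather than a procedure: the MAC address and VLAN are taken from the log messages above, and the exact syntax is an assumption about the IOS release in use (older images spell the first command "show mac-address-table", with a hyphen).

```
! Where is the flapping MAC learned right now? Run it a few times and
! compare the port shown.
show mac address-table address 000c.29d4.3265

! State of every port channel and its member ports
show etherchannel summary

! LACP partner seen on each member port
show lacp neighbor

! Spanning-tree role and state of each port in the affected VLAN
show spanning-tree vlan 10

! Recent flap notifications still held in the log buffer
show logging | include MACFLAP
```

None of these change configuration or enable debugging, so they should be safe to run on production switches.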

cadet alain
VIP Alumni

Hi,

If the EtherChannel is between your two switches, you should remove this command:

spanning-tree portfast trunk

Regards

Alain

Don't forget to rate helpful posts.


Hi Alain,

Thank you for your reply.  So to clarify, there are 12 port channels.  11 of them provide LACP and trunking to ESX and storage machines.  Po12 is the interconnect.  Are you saying that I should remove spanning-tree portfast trunk on all ports and port channels, or just on Po12?

regards

Chris

Hi,

Only on the trunk between the switches and the corresponding port channel.

Regards

Alain

Don't forget to rate helpful posts.


Hi Alain,

Would you be able to give me some insight into why this is necessary?

I should add that the modification has not fixed the problem.

Regards

Chris

Hi,

This command could lead to STP problems when configured between 2 switches.

I can't tell you much more as I've never implemented etherchannel with VMs.

Hopefully someone more knowledgeable will kick in and find a solution to your problem.

Regards

Alain

Don't forget to rate helpful posts.


Hi,

Based on your advice then, would the above also apply to the following:

Switch 1:  interconnect with Switch 2

interface GigabitEthernet0/11
 description server1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2-100
 switchport mode trunk
 channel-group 6 mode active
 spanning-tree portfast trunk

interface GigabitEthernet0/12
 description server1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2-100
 switchport mode trunk
 channel-group 6 mode active
 spanning-tree portfast trunk

Switch 2:  interconnect with Switch 1

interface GigabitEthernet0/12
 description server1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2-100
 switchport mode trunk
 channel-group 6 mode active
 spanning-tree portfast trunk

Hi,

As long as the trunk is between two switches, you should not use portfast on it.

Regards

Alain

Don't forget to rate helpful posts.


Again, Alain, thanks for the quick reply; you have been very helpful so far.

So:

switch 1:  ports 45-49 are the interconnect trunk port on po12

switch 2:  ports 45-49 are the interconnect trunk port on po12

switch 1:  ports 11-12 are connected trunk port to server 1

switch 2:  ports 11-12 are also connected trunk port to server 1

The server runs LACP over ports 1-4, thus providing failover on the server.

In this scenario, should spanning-tree portfast be removed on all trunks, bearing in mind that these physical switches are on the same network in the same rack?

Hi,

The portfast feature should be removed from these:

switch 1:  ports 45-49 are the interconnect trunk port on po12

switch 2:  ports 45-49 are the interconnect trunk port on po12

But not from the trunks to the servers.
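
A minimal sketch of that change on the interconnect, assuming Po12 is the port channel between the two switches as described above. The member interface numbers are not settled in this thread, so they are left as a comment rather than guessed. The verification command at the end is read-only.

```
configure terminal
 ! Repeat the same "no" command on the physical member interfaces of Po12
 ! (interface numbers not shown here; they are an open point in this thread).
 interface Port-channel12
  no spanning-tree portfast trunk
 end

! Confirm that portfast is no longer active on the interconnect
show spanning-tree interface Port-channel12 portfast
```

The server-facing ports and port channels keep their portfast trunk setting, since they connect to hosts rather than to another switch.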

Regards

Alain

Don't forget to rate helpful posts.


Ahhh, Alain, I think I know what the problem with the MAC flaps is.

The warning that IOS prints when portfast is enabled states:

"portfast should only be enabled on ports connected to a single host. Connecting hubs, concentrators, switches, bridges, etc... to this interface when portfast is enabled, can cause temporary bridging loops. Use with CAUTION"

Let's use this scenario:

On server 1, port 1 on interface card 1 is connected to port 11 on switch one

On server 1, port 2 on interface card 2 is connected to port 12 on switch one

On server 1, port 1 on interface card 2 is connected to port 11 on switch two

On server 1, port 2 on interface card 1 is connected to port 12 on switch two

Hi Chris,

How are the ESX servers configured in this setup? The way I understand it based on what you've explained previously I think you have all four server NICs connected to the same vSwitch in the ESX server and using "Route based on IP hash" as the load balancing mechanism. If that is the case then you effectively have a single "aggregate link" (port channel) from the ESX server and what you have configured is as shown in the diagram below.

If that is indeed the case, then this is not a supported configuration on the Catalyst 3560 switches. When the NICs that form a single aggregate on the ESX server are connected across two physical switches, you need switches that support some kind of Multi-Chassis Link Aggregation (MLAG), e.g., a Catalyst 3750 "stack", Catalyst 6500 with VSS, or Nexus 5000 with vPC.

This would also explain why you're seeing MAC flaps. The ESX server is sending traffic from a single MAC on any of the physical NICs as it's a single aggregate, but as far as the network is concerned the MAC is seen to move from one switch to another.

Unless you have a very specific reason to use port-channels to the ESX server, i.e., you need a single VM to be able to use more bandwidth than is available on a single physical NIC, I would personally remove the port-channels and use the default "Route based on originating virtual port ID" load balancing on the ESX servers. If you have a good number of VMs hosted on the ESX server then you'll still get good load balancing across all four links.
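
On the switch side, removing a server-facing port channel might look like the sketch below. It assumes the Gi0/11-12 and channel-group 6 layout for server1 quoted earlier in this thread, and that the matching teaming change on the ESX vSwitch is made in the same maintenance window, since a mismatch between the two ends would itself disrupt traffic. The same steps would be repeated on the second switch.

```
configure terminal
 ! Take the server-facing ports out of the bundle. Their trunk and
 ! portfast settings stay on the physical interfaces.
 interface range GigabitEthernet0/11 - 12
  no channel-group
 exit
 ! Remove the now-empty logical interface.
 no interface Port-channel6
 end

! Check that the ports are still up and trunking independently
show interfaces status
show interfaces trunk
show etherchannel summary
```

With each NIC acting as an independent uplink, the server should present a given VM's MAC on one physical port at a time, so the switches should stop seeing it move between ports.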

Regards

Hi Steve,

Thank you for your detailed reply to this discussion. 

On every point you have understood the topology of my system exactly.  Your points explain logically why I am receiving MAC flaps.

Thank you.

I just wanted to add, Steve, that in each server there are two NICs.  I am using two ports from each NIC.  Does this change anything?

Also, what software do you use to draw your diagram? It's very helpful.

Regards

Chris

Hi Chris,

The actual NICs and ports used on the server are not going to change anything. If all four ports of the two NICs are configured as part of the same aggregate on the server, but then terminate as two separate port-channels on two separate switches, then you have the setup I described and illustrated above. Reading again your previous reply:

Let's use this scenario:

On server 1, port 1 on interface card 1 is connected to port 11 on switch one

On server 1, port 2 on interface card 2 is connected to port 12 on switch one

On server 1, port 1 on interface card 2 is connected to port 11 on switch two

On server 1, port 2 on interface card 1 is connected to port 12 on switch two

... I understand the actual setup is as follows:

As for the diagram, that's produced with Microsoft Visio.

Regards
