
Etherchannel from Cisco 3012 modules to Procurve 2810

AndreSerrurier
Level 1

Hello,

I am working on an Etherchannel issue between Cisco 3012 switches (Cisco switch modules for IBM BladeCenter) and external Procurve 2810 switches.

At least, as far as I can tell, I *think* it is an Etherchannel issue... but I am all ears.

Sometimes a port (inside a channel-group) goes DOWN and comes back UP for no apparent reason or, worse, a whole Etherchannel group (4 ports) goes down completely, requiring a SHUT/NO SHUT on my part to bring it back from err-disable mode.

The physical interconnection setup is the following:

BladeCenter with 4 Cisco 3012 uplinked to 2 Procurve 2810 via 4Gig LACP Etherchannel

In more detail:

Our BladeCenter has 4 Cisco 3012 in it; our Cisco models are non-stackable modules. Each 3012 has 14 internal ports towards the blades' NICs and 4 external ports towards the Procurves. On each Cisco 3012 I bundled the 4 external ports, gig0/15 through gig0/18, to form channel-group 1 in Active LACP. The upstream switch at the other end of each channel-group is a Procurve 2810.

Both sides of the port channels are set to LACP Active, on the Procurves as well as on the Ciscos.
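
The member-port configuration is not included in the excerpts in this post. As a rough sketch only, assuming standard Catalyst IOS syntax and the port numbers described above, LACP Active bundling of the four external ports would typically look something like this (the trunk settings are assumed to mirror Port-channel1):

```
! Sketch only: assumed member-port config, not copied from the switches
interface range GigabitEthernet0/15 - 18
 switchport trunk native vlan 134
 switchport mode trunk
 ! "active" selects LACP in active mode for channel-group 1
 channel-group 1 mode active
```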

Here is the Procurve side of the Etherchannel config:

Switch Procurve 1

trunk 17-18,35-36 Trk1 LACP
trunk 37-40 Trk2 LACP

vlan 134
   name "DATA"
   untagged 2-16,19-30,32-34,Trk1-Trk2

   tagged 1,41-48
   exit

vlan 136
   name "VOIP"
   untagged 31
   no ip address
   qos priority 7
   tagged 1,41-Trk2
   exit

Switch Procurve 2

trunk 27-30 Trk1 LACP
trunk 37-40 Trk2 LACP

vlan 134
   name "DATA"
   untagged 1-19,21-22,24-26,31-32,35-36,Trk1-Trk2

   tagged 41-48
   exit

vlan 136
   name "VOIP"
   untagged 20,23
   no ip address
   qos priority 7
   tagged 41-Trk2
   exit

And here is the Cisco side of the Etherchannel config:

Switch Cisco 1

#show runn

...

interface Port-channel1
switchport access vlan 134
switchport trunk native vlan 134
switchport mode trunk
link state group 1 upstream

...

Switch Cisco 2

#show runn

...

interface Port-channel1
switchport access vlan 134
switchport trunk native vlan 134
switchport mode trunk
link state group 1 upstream

...

Switch Cisco 3

#show runn

...

interface Port-channel1
switchport access vlan 134
switchport trunk native vlan 134
switchport mode trunk
link state group 1 upstream

...

Switch Cisco 4

#show runn

...

interface Port-channel1
switchport access vlan 134
switchport trunk native vlan 134
switchport mode trunk
link state group 1 upstream

...

Is there anything wrong so far in the above that could explain why the channels work... and then go down after weeks?

Regards,

Andre


Peter Paluch
Cisco Employee

Hello Andre,

I have no suggestion as of now, but I wanted to ask whether any accompanying log messages are displayed on the console (or sent to a syslog) when this interface flap happens. Also, I would be very much interested in seeing the show interfaces output from the Cisco side for the port that flapped.

You are saying that the port that went down was placed into an err-disabled state. An err-disabled state is the result of a fault condition detected on a port, and it should be possible to pinpoint what made the switch decide to put the port into that state. The reason must be logged, so it is worth reviewing the log buffer or the syslog. I would say that this is the starting point: determining what exactly caused the port to go into the err-disabled state.
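
As a sketch of where I would start, the following standard Catalyst IOS show commands usually reveal the err-disable reason and the state of the bundle. I am assuming they are all available on the 3012 image you are running, and channel-group 1 is taken from your description:

```
! Which ports are err-disabled, and for what reason
show interfaces status err-disabled
! Whether automatic recovery is enabled for each err-disable cause
show errdisable recovery
! Bundle state and per-port flags for the channel group
show etherchannel 1 summary
! LACP partner information learned from the Procurve side
show lacp 1 neighbor
! Buffered log messages around the time of the flap
show logging
```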

Best regards,

Peter

Hi Peter,

Thanks for taking the time to read and reply to my first post on the Cisco community, much appreciated.

For the err-disable, I actually know where it's coming from... sorry, I should have explained that in the first post.

We are using link-state tracking in our BladeCenter connectivity setup to the datacenter.

So err-disable is the result of the Link-state doing what it is meant to do...

Our setup is that if all members of the upstream link state group go down, it places the upstream ports into err-disable and shuts the downstream ports towards the blades' NICs. That way, NIC teaming at the OS level switches to another active path/NIC. Sorry, I should have described it that way.
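
For readers unfamiliar with the feature, a link-state tracking setup of this kind is usually configured along the following lines. This is a sketch under assumptions: the global command and the downstream interface range (Gi0/1 - 14 for the 14 internal ports) are assumed; only the upstream line on Port-channel1 comes from the config posted earlier.

```
! Sketch only: enable link-state group 1 globally
link state track 1
!
! Upstream side: the Etherchannel towards the Procurve
interface Port-channel1
 link state group 1 upstream
!
! Downstream side: internal ports towards the blades (range assumed)
interface range GigabitEthernet0/1 - 14
 link state group 1 downstream
```

The state of the group and its member ports can then be checked with show link state group detail.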

With that said about the err-disable, you can understand why I have to do the SHUT/NO SHUT when an Etherchannel goes down.

When that first started to happen, it logged the following:

- 1d05h: %PM-4-ERR_DISABLE: channel-misconfig (STP) error detected on Gi0/15, putting Gi0/15 in err-disable state
- 1d05h: %PM-4-ERR_DISABLE: channel-misconfig (STP) error detected on Gi0/16, putting Gi0/16 in err-disable state
- 1d05h: %PM-4-ERR_DISABLE: channel-misconfig (STP) error detected on Gi0/17, putting Gi0/17 in err-disable state
- 1d05h: %PM-4-ERR_DISABLE: channel-misconfig (STP) error detected on Gi0/18, putting Gi0/18 in err-disable state
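
The channel-misconfig (STP) cause in these messages corresponds to the Etherchannel misconfiguration guard in Catalyst IOS. As a hedged sketch, standard IOS allows automatic recovery from this cause so that a manual SHUT/NO SHUT is not needed each time. The 300-second interval is only an example value, and this does not address whatever triggers the guard in the first place:

```
! Sketch only: automatically re-enable ports err-disabled by channel-misconfig
errdisable recovery cause channel-misconfig
! Recovery timer in seconds (example value; applies to all enabled causes)
errdisable recovery interval 300
```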


From there I worked on the STP side of things, thinking it was the root cause of the Etherchannel going down. I eventually ended up disabling BPDUs altogether on the Procurve side of the Etherchannel, with the following commands:

spanning-tree Trk1 priority 4 bpdu-filter
spanning-tree Trk2 priority 4 bpdu-filter

This worked for a month, until I simply rebooted a blade and that same issue occurred; very odd, a whole channel went down.

I attached the show interfaces output to this post. I'll try to get more details from the events logged on the Cisco side if it occurs again.

Regards,

Andre
