I've run into a strange problem at one of our customers. The setup uses two core 3750 L2/L3 switches, with several VLANs and HSRP groups (on SVIs) all configured on these two 3750s. A 4 Gb EtherChannel connects the two 3750s. Everything was working as designed.
The problem started yesterday when a faulty fibre patch cable caused a link flap on one member link on one side of the EtherChannel, and that port went err-disabled. This did not seem like a major priority because the other three links were still operating normally. Around the same time, however, our customer started complaining about inter-VLAN communication problems.
During troubleshooting I noticed that one of the VLANs had been split: the HSRP state for that VLAN was active on both 3750s. So one HSRP group was not able to communicate across the EtherChannel. All the other HSRP groups were operating normally.
At this point I started to suspect that it had something to do with the err-disabled link, so after swapping the cable I re-enabled the port. As soon as the link came back up, the communication problems were gone and HSRP communication for that group started working again.
As far as I can tell, this behaviour isn't normal. One err-disabled link within a multi-link EtherChannel should not cause communication problems for 15 to 20 minutes. It almost seems as if the switch with the err-disabled link was still trying to use this link within the EtherChannel. I have already consulted the bug database and the release notes but could not find anything related to this problem.
Has anyone seen this kind of problem, or does anyone have an explanation for why this was happening?
Setup details: two 3750-24TS, IOS 12.2(35)SE1
EtherChannel: 4 Gb dot1q trunk (mode: on / load-balance: src-mac), links through CWDM SFPs. All VLANs are allowed on all links.
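
To make my suspicion concrete, here is a toy Python model of the failure I have in mind. It is only a sketch under assumptions, not the real 3750 hashing algorithm: I assume src-mac load balancing picks a member link from the low-order bits of the source MAC, that the HSRP active router sources its hellos from the group's virtual MAC (0000.0c07.acXX for HSRPv1), and that with mode "on" one switch can keep a link in its bundle while the peer has it err-disabled. The function names, link numbering and group numbers are made up for illustration.

```python
def member_link(src_mac: str, link_count: int) -> int:
    """Toy src-mac hash (assumption): low-order bits of the MAC pick the link index."""
    digits = "".join(ch for ch in src_mac if ch not in ".:-")
    return int(digits, 16) % link_count


def delivered(src_mac: str, sender_links: list, receiver_down: set) -> bool:
    """True if the frame leaves on a link the receiving switch still accepts."""
    link = sender_links[member_link(src_mac, len(sender_links))]
    return link not in receiver_down


# The sending switch still bundles all four links (mode "on", no negotiation),
# while the receiving switch has link 2 err-disabled (hypothetical numbering).
sender_links = [0, 1, 2, 3]
receiver_down = {2}

for group in range(1, 5):
    vmac = f"0000.0c07.ac{group:02x}"  # HSRPv1 virtual MAC for this group
    state = "delivered" if delivered(vmac, sender_links, receiver_down) else "dropped"
    print(f"HSRP group {group}: hellos from {vmac} are {state}")
```

In this model, only source MACs that hash to the dead link are black-holed, which would match a single HSRP group going active/active while the others stay healthy. If the sending switch removed the link from its bundle (for example `sender_links = [0, 1, 3]`), every frame would be delivered again.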