Catalyst 3750 stack members removed unexpectedly

Jan 4th, 2010

Hi all,


Today two 3750 switches were removed from a core stack of eight 3750 switches. There was no loss of power and no changes were made. The other switches continued to work. The messages were:


040667: Jan  5 09:59:43.891 AEDT: %STACKMGR-6-SWITCH_REMOVED: Switch 7 has been REMOVED from the stack
040668: Jan  5 09:59:44.168 AEDT: %STACKMGR-6-SWITCH_REMOVED: Switch 8 has been REMOVED from the stack


The version is

Cisco IOS Software, C3750 Software (C3750-IPSERVICES-M), Version 12.2(25)SEC.


I have been unable to find any likely cause. Does anyone have an idea what might have caused this?


Thanks

Peter

Leo Laohoo Mon, 01/04/2010 - 19:15

How long has the entire stack been up?  Have you tried rebooting the two switches?

PETER BLOOMFIELD Mon, 01/04/2010 - 19:22

The master had been up for over two years. The stack has been rebooted and is working normally.

Leo Laohoo Mon, 01/04/2010 - 20:05

I'm not really a big fan of anything on a very old IOS running for more than a year without a reboot.

slhulshizer Wed, 01/06/2010 - 07:48

Peter,


I had the same problem, only with a newer IOS version.  It occurred on Jan 2nd and I have not been able to find a cause either.  I would welcome any thoughts on this.  This stack was installed 28 weeks ago and had no problems until the 2nd of January.  Switch 1 restarted afterwards and has been fine since.



1103: 000037:   Switch 1 has been REMOVED from the stack
1102: 000024:   Switch 1 has been REMOVED from the stack
1118: 001015:   Interface GigabitEthernet1/0/12, changed state to down
1117: 001014:   Interface GigabitEthernet1/0/11, changed state to down
1116: 001013:   Interface GigabitEthernet1/0/10, changed state to down
1115: 001012:   Interface GigabitEthernet1/0/9, changed state to down
1114: 001011:   Interface GigabitEthernet1/0/8, changed state to down
1110: 001007:   Interface GigabitEthernet1/0/5, changed state to down
1099: 001000:   Switch 1 has been REMOVED from the stack
1098: 000999:   Stack Port 1 Switch 6 has changed to state DOWN
1097: 000998:   Stack Port 2 Switch 2 has changed to state DOWN


     1 12    WS-C3750G-12S      12.2(50)SE1           C3750-IPSERVICESK9-M
*    2 12    WS-C3750G-12S      12.2(50)SE1           C3750-IPSERVICESK9-M
     3 12    WS-C3750G-12S      12.2(50)SE1           C3750-IPSERVICESK9-M
     4 12    WS-C3750G-12S      12.2(50)SE1           C3750-IPSERVICESK9-M
     5 12    WS-C3750G-12S      12.2(50)SE1           C3750-IPSERVICESK9-M
     6 12    WS-C3750G-12S      12.2(50)SE1           C3750-IPSERVICESK9-M


Thank you!

Sara

francisco_1 Wed, 01/06/2010 - 09:49

Looks like a bug-related issue.

There was a bug in a recent code release where you could get random reboots if you ran "show cdp neighbors".
Version 12.2(25)SE is actually now deferred because of boot issues and a crash triggered by sh cdp neigh.


Version 12.2(25)SE Reason for Deferral:

CSCeh45368
Headline: Switch powered down/up in 3750 stack stuck at Initializing state


1st Found-In
12.2(25)SEA

Fixed-In
12.2(25)SEB1
12.2(25)SEC



CSCsa78000
Symptom:
A Catalyst switch running Cisco IOS release 12.2(25)SG, 12.2(25)SEA, 12.2(25)SEB, 12.2(25)EZ,
or 12.2(25)EY might reload when show cdp neighbor detail is entered.


1st Found-In
12.2(25)EY
12.2(25)SEA

Fixed-In
12.2(25)EY1
12.2(25)SEB1
12.2(25)SEC
12.2(31)SG


Hope that helps.

PETER BLOOMFIELD Wed, 01/06/2010 - 13:22

The IOS on the switches is 12.2(25)SEC, in which these bugs were fixed.

No one was on the switches at the time, so no commands were being issued. The switches didn't reboot and weren't initialising. They just dropped out of the stack. The stacking configuration is such that communication between switches 1, 2, 3 and 4, 5, 6 had to continue through the stacking ports of the two removed switches.
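
To illustrate the point about the stacking configuration: if switches 7 and 8 sit between the two groups in the ring, the other six can only stay in one stack while traffic keeps passing through the stack ports of 7 and 8. Below is a minimal Python sketch of that reasoning. The cabling orders in it are hypothetical (the thread does not give the actual order), and `stack_segments` is just an illustrative helper, not anything from IOS.

```python
from collections import deque

def stack_segments(ring_order, removed):
    """Return groups of surviving members that can still reach each other.

    ring_order: switch numbers in cabling order (last is cabled back to first).
    removed: switch numbers assumed to have stopped passing stack traffic.
    """
    alive = [s for s in ring_order if s not in removed]
    links = {s: set() for s in alive}
    n = len(ring_order)
    # Keep only the ring links whose two ends are both still alive.
    for i, s in enumerate(ring_order):
        t = ring_order[(i + 1) % n]
        if s in links and t in links and s != t:
            links[s].add(t)
            links[t].add(s)
    segments, seen = [], set()
    # Breadth-first search from each unvisited member finds one segment.
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:
            cur = queue.popleft()
            group.append(cur)
            for nxt in links[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        segments.append(sorted(group))
    return segments

# Hypothetical cabling with 7 and 8 between the two groups (an assumption).
print(stack_segments([1, 2, 3, 7, 4, 5, 6, 8], {7, 8}))  # [[1, 2, 3], [4, 5, 6]]
# Hypothetical sequential cabling, for comparison.
print(stack_segments([1, 2, 3, 4, 5, 6, 7, 8], {7, 8}))  # [[1, 2, 3, 4, 5, 6]]
```

Under the first assumed layout, a complete failure of 7 and 8 would have split the remaining members into two separate groups, which fits the observation that the two switches were still powered and passing stack traffic when they were reported as removed.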

Jerry Ye Wed, 01/06/2010 - 14:56

What did show version say, a software reboot? Is there any crashdump file in the rebooted switches' flash?


Regards,

jerry

PETER BLOOMFIELD Sun, 01/10/2010 - 15:18

Hi,


show version reports "System returned to ROM by power-on". We powered off the entire stack to get it working again. There are crashdump files on the two switches that dropped out of the stack.

francisco_1 Fri, 01/08/2010 - 07:31

Peter,


I thought I saw 12.2(25)SE in your original post.


Are you using 12.2(25)SEC on all the switches in the stack?


Francisco.

PETER BLOOMFIELD Sun, 01/10/2010 - 15:04

Yes, all switches are running Version 12.2(25)SEC:


Switch   Ports  Model              SW Version              SW Image           
------   -----  -----              ----------              ----------         
*    1   28     WS-C3750G-24TS     12.2(25)SEC             C3750-IPSERVICES-M 
     2   28     WS-C3750G-24TS     12.2(25)SEC             C3750-IPSERVICES-M 
     3   28     WS-C3750G-24TS     12.2(25)SEC             C3750-IPSERVICES-M 
     4   28     WS-C3750G-24TS     12.2(25)SEC             C3750-IPSERVICES-M 
     5   28     WS-C3750G-24TS     12.2(25)SEC             C3750-IPSERVICES-M 
     6   28     WS-C3750G-24TS     12.2(25)SEC             C3750-IPSERVICES-M 
     7   28     WS-C3750G-24TS     12.2(25)SEC             C3750-IPSERVICES-M 
     8   28     WS-C3750G-24TS     12.2(25)SEC             C3750-IPSERVICES-M

vdadlaney Sun, 01/10/2010 - 16:21

Hi Peter,


Prior to rebooting the switches, did you happen to notice whether the CPU was high on the switches that dropped out of the stack? Just another thought, as I have quite often seen the CPU on 3750 switches increase for no apparent reason. Out of curiosity, which SDM profile are you running on these switches? Thanks.

Jerry Ye Sun, 01/10/2010 - 18:49

Hi Peter,


If you happen to have the crashdump files, can you post them?


Regards,

jerry

Leo Laohoo Sun, 01/10/2010 - 20:03

I don't see anything funny here, Peter.  My suspicion lies in the fact that the master has been up for more than two years on a very old IOS.

Jerry Ye Sun, 01/10/2010 - 20:28

I was not able to find anything that matches exactly. However, the IOS that you are running has a memory leak issue. You should upgrade the stack to a newer IOS.


Just curious, does the switch stack have no ip routing configured? And do you have the syslog messages from the time of the crash?


Regards,

jerry

vvasisth Mon, 01/11/2010 - 02:01

Hi,


Is the error message you are getting, "%STACKMGR-6-SWITCH_REMOVED: Switch", just for switches 7 and 8?

If yes, what is the status of switches 7 and 8? Are they up?

If yes, then check the physical StackWise cables at the back of these switches.

Otherwise we might be looking at a hardware issue with these switches, but it is too early to jump to any conclusion. Let me know once you have an update.


regards,

varun
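
One way to answer the first question above (whether the SWITCH_REMOVED message shows up only for switches 7 and 8) is to scan a saved copy of the log. This is a rough Python sketch that assumes the log has been exported as plain text in the same format as the lines in the original post. The function name `removed_switches` is made up for illustration.

```python
import re

# Matches the stack manager message exactly as quoted in the original post.
REMOVED_RE = re.compile(
    r"%STACKMGR-6-SWITCH_REMOVED: Switch (\d+) has been REMOVED from the stack"
)

def removed_switches(log_text):
    """Count SWITCH_REMOVED messages per stack member number."""
    counts = {}
    for line in log_text.splitlines():
        match = REMOVED_RE.search(line)
        if match:
            member = int(match.group(1))
            counts[member] = counts.get(member, 0) + 1
    return counts

# The two lines from the original post, used here as sample input.
sample = (
    "040667: Jan  5 09:59:43.891 AEDT: %STACKMGR-6-SWITCH_REMOVED: "
    "Switch 7 has been REMOVED from the stack\n"
    "040668: Jan  5 09:59:44.168 AEDT: %STACKMGR-6-SWITCH_REMOVED: "
    "Switch 8 has been REMOVED from the stack\n"
)
print(removed_switches(sample))  # {7: 1, 8: 1}
```

If the result contains members other than 7 and 8, or repeated removals of the same member, that would be worth mentioning alongside the cable check.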
