Cisco Blade Switch for HP (3120G) crashing

Mar 11th, 2010

I have several Cisco Blade Switches for HP (3120G) that have crashed when I exit configuration mode. The crash produces the following crash dump:


Cisco IOS Software, CBS31X0 Software (CBS31X0-UNIVERSALK9-M), Version 12.2(50)SE1, RELEASE SOFTWARE (fc2)
Copyright (c) 1986-2009 by Cisco Systems, Inc.
Compiled Mon 06-Apr-09 09:28 by amvarma


Debug Exception (Could be NULL pointer dereference) Exception (0x2000)!


SRR0 = 0x01A6E0D4  SRR1 = 0x00029230  SRR2 = 0x0073394C  SRR3 = 0x00021000
ESR = 0x00000000  DEAR = 0x00000000  TSR = 0x8C000000  DBSR = 0x10000000


CPU Register Context:
Vector = 0x00002000  PC = 0x00BB535C  MSR = 0x00029230  CR = 0x22004042
LR = 0x00BB5320  CTR = 0x01A62B78  XER = 0x6000007B
R0 = 0x00BB5320  R1 = 0x05E0D550  R2 = 0x00000000  R3 = 0x03D5F958
R4 = 0x00000000  R5 = 0x00000000  R6 = 0x00000000  R7 = 0x00000000
R8 = 0x00007530  R9 = 0x00000000  R10 = 0x00000000  R11 = 0x00000005
R12 = 0x4FCFB6A1  R13 = 0x00110000  R14 = 0x01D433C0  R15 = 0x00000000
R16 = 0x05AFB89C  R17 = 0x05AFB89C  R18 = 0x00000006  R19 = 0xFFFFFFFF
R20 = 0x00000000  R21 = 0x00000000  R22 = 0x00000000  R23 = 0x00000000
R24 = 0x00000000  R25 = 0x00000000  R26 = 0x00000000  R27 = 0x02EBB180
R28 = 0x05C50C30  R29 = 0x00000000  R30 = 0x05C50C08  R31 = 0x00000000


Stack trace:
PC = 0x00BB535C, SP = 0x05E0D550
Frame 00: SP = 0x05E0D560    PC = 0x00BB5320
Frame 01: SP = 0x05E0D598    PC = 0x01A57AA8
Frame 02: SP = 0x05E0D5B8    PC = 0x01DF6F70
Frame 03: SP = 0x05E0D5C8    PC = 0x00F27980
Frame 04: SP = 0x05E0D5E8    PC = 0x00F1D464
Frame 05: SP = 0x05E0D648    PC = 0x00F1FEBC
Frame 06: SP = 0x05E0D680    PC = 0x00F1FD44
Frame 07: SP = 0x05E0D6A8    PC = 0x00F1F43C
Frame 08: SP = 0x05E0D6B8    PC = 0x00F1EE08
Frame 09: SP = 0x05E0D6C8    PC = 0x00F33F44
Frame 10: SP = 0x05E0D760    PC = 0x00F59CB8
Frame 11: SP = 0x05E0D788    PC = 0x01D5164C
Frame 12: SP = 0x05E0D7A8    PC = 0x01D43688
Frame 13: SP = 0x05E0D7B0    PC = 0x00BB72E8
Frame 14: SP = 0x00000000    PC = 0x00BADDB8


I have searched the Bug Tracking tool but have not found any bugs related to this. I originally reported this to HP 9 months ago, as our support contracts are with HP, and HP was supposed to file a bug with Cisco. Last Monday we experienced another crash after I entered the following command sequence:


switch# configure terminal

interface range gigabitethernet 1/0/4, gigabitethernet 2/0/4, port-channel 14

<CTRL-Z>


Has anyone run into this crash and gotten a resolution from HP or Cisco?

Leo Laohoo Thu, 03/11/2010 - 14:18

Can you find any "crashinfo" file and post it?  Have you considered upgrading the IOS?  Sounds like it could be a bug.

uzimmermannatc Thu, 03/11/2010 - 14:21

The crash info file just contains the posted lines. Upgrading means downtime and currently that is really hard for me to get unless I can point at the specific bug number and it being fixed in a newer code version. Management doesn't like to blindly upgrade and hope it is fixed.

Leo Laohoo Thu, 03/11/2010 - 14:25

I have an option for you to consider:  Download the latest IOS, 12.2(53), change the boot parameters but do not boot your blade.  When the blade crashes again it will boot up the new IOS.  How about that?
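
For anyone wanting to try this, the steps might look roughly like the sketch below. It assumes the new image has already been copied to flash: and that <new-image>.bin stands in for the actual file name, which is a placeholder here and not a real image name. On a stack, the "boot system switch all" form of the command may be the one needed, so check the command reference for your release first.

```
! Sketch only: <new-image>.bin is a placeholder, not a real file name.
! Confirm the new image is present in flash before pointing the boot variable at it.
switch# dir flash:
switch# configure terminal
switch(config)# boot system flash:/<new-image>.bin
switch(config)# end
! Verify the boot path; the switch keeps running the current image until it next reloads.
switch# show boot
```

Since the crash reported here happens when leaving configuration mode, making even this change could trigger it, so plan for that possibility.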

Jon Marshall Fri, 03/12/2010 - 01:25

leolaohoo wrote:


I have an option for you to consider:  Download the latest IOS, 12.2(53), change the boot parameters but do not boot your blade.  When the blade crashes again it will boot up the new IOS.  How about that?


Leo


That is an ingenious way to get around the downtime. It does run the risk of the later IOS version not recognising all commands etc., but I still like it. +5


To the OP: I think not blindly upgrading every time there is a problem is a good approach, but if the device is crashing when you exit configure mode, to me that in itself would be a good enough reason to change the IOS.


Jon

Reza Sharifi Thu, 03/11/2010 - 15:05

Are these switches stacked?


Have a look at this bug ID.


CSCsc59027 Bug Details

mem leak in CEF IPC Background process
Symptom:

Unexpected memory consumption (mem leak) in process "CEF IPC Background"
in function "fib_memory_alloc_named_internal" if "ip routing is disabled" in stacked switches.

Conditions:

This problem has been observed only when switches are stacked.
When "ip routing" is disabled and the master switch has been changed (master failure, or a change of master priority followed by a reload),
a slow, constant memory leak will happen in CEF IPC Background.
This happens only if "ip routing" is disabled.

Workaround:

Enable ip routing and reload the switch stack.

HTH

Reza

uzimmermannatc Fri, 03/12/2010 - 01:49

Bug CSCsc59027 is listed as fixed in 12.2(35)SE and we are running 12.2(50)SE1, so it should not be the same issue. Every instance I found by searching for the crash messages has led me to the same posts about stacked Catalyst 3750 switches and that bug. Also, the bug mentions a slow memory leak and several weeks of uptime. I have had this happen with switches that had been up barely an hour (brand new installation).


I can't just upgrade because I do not believe HP ever actually reported this bug to Cisco (back in June 2009), so even newer versions probably still have the problem. If I upgrade and the issue then happens again, all hell would break loose at my company. That is why I am trying to get this verified with Cisco. For the last 4 days I have been trying to get HP to talk to Cisco about this, something they were supposed to do 9 months ago.

Leo Laohoo Sun, 03/14/2010 - 14:01

uzimmermannatc wrote:


Right now I am trying to get HP to talk to Cisco

Good luck with that part. With the latest developments around Cisco UCS, there are some parts of HP that refuse to talk to Cisco.
