Supervisor engine 720 redundancy takeover failure

Unanswered Question
Mar 4th, 2009

Hi

Supervisor engine 720 redundancy takeover failure

Here is the configuration:

1x 6513

1x WS-C6K-13SLOT

2x WS-SUP720-3B

2x WS-CDC-2500W

2x WS-X6516A-GBIC

2x WS-X6548-RJ-45

OS image:

The sups are in redundancy mode. They are both running the following image:

S72033-pk9sv-mz.122-17d.sxb11.bin

Issue:

The redundant sup did not take over properly.

It indicates that "the boot file did not load; it stayed in ROM monitor".

1) Why did the supervisor engine fail?

2) Why didn't the redundant supervisor engine take over when the active one failed?

Please help me fix this issue. It's critical. Thanks in advance.

Mark Yeates Wed, 03/04/2009 - 06:36

There are many possible reasons why this is happening. I would like to see what your config-register values are. Can you please post the output of "show version" and "show redundancy".

Mark

sivaprakasam81 Wed, 03/04/2009 - 09:44

Redundant System Information :

------------------------------

Available system uptime = 2 days, 23 hours, 16 minutes

Switchovers system experienced = 0

Standby failures = 3

Last switchover reason = none

Hardware Mode = Duplex

Configured Redundancy Mode = sso

Operating Redundancy Mode = sso

Maintenance Mode = Disabled

Communications = Up

Current Processor Information :

-------------------------------

Active Location = slot 8

Current Software state = ACTIVE

Uptime in current state = 2 days, 23 hours, 16 minutes

Image Version = Cisco Internetwork Operating System Software

IOS (tm) s72033_rp Software (s72033_rp-IPSERVICESK9-M), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2007 by cisco Systems, Inc.

Compiled Fri 14-Sep-07 22:01 by kellythw

BOOT = sup-bootflash:s72033-ipservicesk9-mz.122-18.SXF11.bin,12;

BOOTLDR =

Configuration register = 0x2102

Peer Processor Information :

----------------------------

Standby Location = slot 7

Current Software state = STANDBY HOT

Uptime in current state = 2 days, 23 hours, 1 minute

Image Version = Cisco Internetwork Operating System Software

IOS (tm) s72033_rp Software (s72033_rp-IPSERVICESK9-M), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2007 by cisco Systems, Inc.

Compiled Fri 14-Sep-07 22:01 by kellythw

BOOT = sup-bootflash:s72033-ipservicesk9-mz.122-18.SXF11.bin,12;

BOOTLDR =

Configuration register = 0x2102

===================================
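For readers comparing the two supervisors in output like the above, here is a minimal Python sketch (not part of the original thread) that splits `show redundancy` text into its sections and flags differences between the active and peer entries. The function names and the abbreviated `SAMPLE` text are illustrative assumptions; the parsing relies only on the `key = value` layout visible in the post, and it treats `STANDBY HOT` as the expected peer state for SSO.

```python
def parse_show_redundancy(text):
    """Split 'show redundancy' text into {section title: {key: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and separator rules made only of '-' or '='.
        if not line or set(line) <= {"-"} or set(line) <= {"="}:
            continue
        # Section titles end with ':' and carry no '=' sign.
        if line.endswith(":") and "=" not in line:
            current = sections.setdefault(line.rstrip(" :"), {})
            continue
        if "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return sections


def standby_problems(sections):
    """Return a list of human-readable mismatches between active and peer."""
    active = sections.get("Current Processor Information", {})
    peer = sections.get("Peer Processor Information", {})
    problems = []
    if peer.get("Current Software state") != "STANDBY HOT":
        problems.append("peer is not STANDBY HOT")
    for key in ("BOOT", "Configuration register"):
        if active.get(key) != peer.get(key):
            problems.append(f"{key} differs between active and peer")
    return problems


# Abbreviated copy of the output posted above (hypothetical trimming).
SAMPLE = """\
Redundant System Information :
------------------------------
Standby failures = 3
Operating Redundancy Mode = sso
Current Processor Information :
-------------------------------
Active Location = slot 8
Current Software state = ACTIVE
BOOT = sup-bootflash:s72033-ipservicesk9-mz.122-18.SXF11.bin,12;
Configuration register = 0x2102
Peer Processor Information :
----------------------------
Standby Location = slot 7
Current Software state = STANDBY HOT
BOOT = sup-bootflash:s72033-ipservicesk9-mz.122-18.SXF11.bin,12;
Configuration register = 0x2102
"""

if __name__ == "__main__":
    parsed = parse_show_redundancy(SAMPLE)
    print(standby_problems(parsed) or "no mismatches found")
```

An empty list only means the two entries agree at the moment the output was taken; the `Standby failures = 3` counter still shows that the standby had trouble earlier.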

sivaprakasam81 Wed, 03/04/2009 - 09:47

Show version

---------------

Cisco Internetwork Operating System Software

IOS (tm) s72033_rp Software (s72033_rp-IPSERVICESK9-M), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2007 by cisco Systems, Inc.

Compiled Fri 14-Sep-07 22:01 by kellythw

Image text-base: 0x40101040, data-base: 0x42DBD2B0

ROM: System Bootstrap, Version 12.2(17r)S2, RELEASE SOFTWARE (fc1)

BOOTLDR: s72033_rp Software (s72033_rp-IPSERVICESK9-M), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)

1118th307-6513-01 uptime is 2 days, 23 hours, 20 minutes

Time since 1118th307-6513-01 switched to active is 2 days, 23 hours, 19 minutes

System returned to ROM by s/w reset (SP by error - a Software forced crash, PC 0x402DF044)

System image file is "sup-bootflash:s72033-ipservicesk9-mz.122-18.SXF11.bin"

This product contains cryptographic features and is subject to United

States and local country laws governing import, export, transfer and

use. Delivery of Cisco cryptographic products does not imply

third-party authority to import, export, distribute or use encryption.

Importers, exporters, distributors and users are responsible for

compliance with U.S. and local country laws. By using this product you

agree to comply with applicable laws and regulations. If you are unable

to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:

http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to

export@cisco.com.

cisco WS-C6513 (R7000) processor (revision 1.0) with 458720K/65536K bytes of memory.

Processor board ID TBM06150127

SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache

Last reset from power-on

SuperLAT software (copyright 1990 by Meridian Technology Corp).

X.25 software, Version 3.0.0.

Bridging software.

TN3270 Emulation software.

1 Virtual Ethernet/IEEE 802.3 interface

144 FastEthernet/IEEE 802.3 interfaces

36 Gigabit Ethernet/IEEE 802.3 interfaces

1917K bytes of non-volatile configuration memory.

8192K bytes of packet buffer memory.

65536K bytes of Flash internal SIMM (Sector size 512K).

Configuration register is 0x2102

sivaprakasam81 Wed, 03/04/2009 - 09:50

Hi Mark,

Please find above the output of show version and show redundancy.

Please help me out on this. The config register value is 0x2102 on both supervisors.
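As a side note for readers, a configuration register value such as 0x2102 can be broken into its commonly documented fields. The Python sketch below is an illustration added to the thread, not something either poster ran; the function name and dictionary keys are hypothetical, and only a few well-known bits are decoded (boot field, bit 6, bit 8, bit 13).

```python
def decode_config_register(value):
    """Decode a few commonly documented IOS configuration register fields."""
    boot_field = value & 0x000F  # lowest four bits select the boot behavior
    if boot_field == 0x0:
        boot = "stay in ROM monitor"
    elif boot_field == 0x1:
        boot = "boot the first image found in onboard flash"
    else:
        boot = "boot using the BOOT variable / boot system commands"
    return {
        "boot_field": boot_field,
        "boot_behavior": boot,
        "ignore_startup_config": bool(value & 0x0040),            # bit 6
        "break_disabled": bool(value & 0x0100),                   # bit 8
        "rom_fallback_on_netboot_failure": bool(value & 0x2000),  # bit 13
    }


if __name__ == "__main__":
    for name, setting in decode_config_register(0x2102).items():
        print(f"{name}: {setting}")
```

With 0x2102 the boot field is 2, so the supervisor is expected to try the image named in BOOT rather than stop in ROMMON by design; a supervisor that still ends up in ROMMON with this value may simply be failing to find or load that image.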

sivaprakasam81 Wed, 03/04/2009 - 10:33

The Supervisor Engine 720 in slot 7 failed on Sunday, and the redundant supervisor engine in slot 8 was unable to take over. After the reboot, slot 8 was active, but slot 7 kept rebooting and stayed in ROMMON mode.

I need to find out the following:

1. Why did the failure happen? (Log file attached.)

2. Why did the takeover by the redundant Supervisor 720 fail?

3. Why does the failed sup keep rebooting and stay in ROMMON mode?

Thanks in advance.

regards,

Siva

Mark Yeates Wed, 03/04/2009 - 11:29

Siva,

Thanks for the posted info. I'm not sure why it keeps booting into ROMMON. Before we jump to conclusions about faulty hardware, can you post the crashinfo file from your bootflash? I saw in the show version output that there was a software-forced crash. The crash file should tell us what the real problem is.

Mark

sivaprakasam81 Wed, 03/04/2009 - 11:46

-#- ED ----type---- --crc--- -seek-- nlen -length- ---------date/time--------- name

1 .. crashinfo 5CD968E5 C90E8 25 299112 Mar 1 2009 06:06:26 -05:00 crashinfo_20090301-110626

65236760 bytes available (299240 bytes used)
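One detail worth noting when matching this file to the Sunday failure: the name `crashinfo_20090301-110626` and the listed time `Mar 1 2009 06:06:26 -05:00` agree if the timestamp in the file name is read as UTC. The short Python sketch below checks that arithmetic; the helper name is hypothetical, and treating the file name as UTC is an assumption drawn only from this one listing.

```python
from datetime import datetime, timedelta, timezone


def crashinfo_time(name):
    """Parse 'crashinfo_YYYYMMDD-HHMMSS', assuming the stamp is in UTC."""
    stamp = name.split("_", 1)[1]
    return datetime.strptime(stamp, "%Y%m%d-%H%M%S").replace(tzinfo=timezone.utc)


# Time shown in the bootflash listing, with its -05:00 offset.
listed = datetime(2009, 3, 1, 6, 6, 26, tzinfo=timezone(timedelta(hours=-5)))

if __name__ == "__main__":
    from_name = crashinfo_time("crashinfo_20090301-110626")
    print(from_name.isoformat(), "matches listing:", from_name == listed)
```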

sivaprakasam81 Wed, 03/04/2009 - 11:48

Hi Mark,

I have posted the bootflash: output above.

Second, I would like to know how to switch over between the supervisor engines. Can you please help me out?

sivaprakasam81 Wed, 03/04/2009 - 12:10

Directory of bootflash:/

1 -rw- 299112 Mar 1 2009 06:06:26 -05:00 crashinfo_20090301-110626

65536000 bytes total (65236760 bytes free)

Mark Yeates Wed, 03/04/2009 - 12:35

Can you actually pull the crashfile from the switch and post it on here? I am trying to investigate as to why the redundancy is not working.

sivaprakasam81 Wed, 03/04/2009 - 12:40

Okay, I will do that. Before that, can you please tell me how to switch over from the active Sup720 to the standby Sup720?

Thanks again

Mark Yeates Wed, 03/04/2009 - 12:52

The command to force a failover is "redundancy force-switchover". You shouldn't need to failover the supervisors to get the crashfile (unless you are doing something else).

sivaprakasam81 Wed, 03/04/2009 - 13:29

Thanks again.

I am trying to check whether that happens again.

How can I get the crashfile to you?

Mark Yeates Wed, 03/04/2009 - 18:43

If you have a PC running a TFTP server (I personally use and recommend tftpd32), you can copy the crashfile to it using TFTP.

copy bootflash:crashfilename tftp

Address or name of remote host []? 10.10.1.100 (replace IP with your PC IP address)

Destination filename [crashfilename]?

!!

10035 bytes copied in 0.068 secs (147574 bytes/sec)

Note: Make sure the crashfile name is correct. Verify this by issuing the show bootflash: command.

Then add the crashfile as an attachment on the forum.

HTH,

Mark

sivaprakasam81 Thu, 03/05/2009 - 08:06

Hi

I am working over SSH, and no TFTP server is configured at the remote end.

How can I get you the details?

sivaprakasam81 Thu, 03/05/2009 - 08:09

copy crashinfo_20090301-110626 tftpd32

The above command copied the information. How can I get the file?

sivaprakasam81 Thu, 03/05/2009 - 09:55

Hi Mark,

Thank you for your continued support. I appreciate it, Mark!

I am sorry for the last post; I was new to tftpd32.

Okay, I got the crashfile and am attaching it here. Please tell me the cause of the redundancy failure.

Thanks in advance.
