Supervisor engine 720 redundancy takeover failure

sivaprakasam81
Level 1

Hi

Here is the configuration,

1x 6513

1x WS-C6K-13slot

2x ws-sup720-3b

2x ws-cdc-2500W

2x ws-x6516a-gbic

2x ws-x6548-rj-45

OS image:

The sups are in redundancy mode. They are both running the following image:

S72033-pk9sv-mz.122-17d.sxb11.bin

Issue:

The redundant supervisor did not take over properly.

It indicates that "the boot file did not load, it stayed in ROM monitor".

1) Why did the supervisor engine fail?

2) Why didn't the redundant supervisor engine take over when the active one failed?

Please help me fix this issue. It's critical. Thanks in advance.

Mark Yeates
Level 7

There are many possible reasons why this is happening. I would like to see what your config-register values are. Can you please post the output of "show version" and "show redundancy".

Mark

Redundant System Information :

------------------------------

Available system uptime = 2 days, 23 hours, 16 minutes

Switchovers system experienced = 0

Standby failures = 3

Last switchover reason = none

Hardware Mode = Duplex

Configured Redundancy Mode = sso

Operating Redundancy Mode = sso

Maintenance Mode = Disabled

Communications = Up

Current Processor Information :

-------------------------------

Active Location = slot 8

Current Software state = ACTIVE

Uptime in current state = 2 days, 23 hours, 16 minutes

Image Version = Cisco Internetwork Operating System Software

IOS (tm) s72033_rp Software (s72033_rp-IPSERVICESK9-M), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2007 by cisco Systems, Inc.

Compiled Fri 14-Sep-07 22:01 by kellythw

BOOT = sup-bootflash:s72033-ipservicesk9-mz.122-18.SXF11.bin,12;

BOOTLDR =

Configuration register = 0x2102

Peer Processor Information :

----------------------------

Standby Location = slot 7

Current Software state = STANDBY HOT

Uptime in current state = 2 days, 23 hours, 1 minute

Image Version = Cisco Internetwork Operating System Software

IOS (tm) s72033_rp Software (s72033_rp-IPSERVICESK9-M), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2007 by cisco Systems, Inc.

Compiled Fri 14-Sep-07 22:01 by kellythw

BOOT = sup-bootflash:s72033-ipservicesk9-mz.122-18.SXF11.bin,12;

BOOTLDR =

Configuration register = 0x2102

===================================

Show version

---------------

Cisco Internetwork Operating System Software

IOS (tm) s72033_rp Software (s72033_rp-IPSERVICESK9-M), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2007 by cisco Systems, Inc.

Compiled Fri 14-Sep-07 22:01 by kellythw

Image text-base: 0x40101040, data-base: 0x42DBD2B0

ROM: System Bootstrap, Version 12.2(17r)S2, RELEASE SOFTWARE (fc1)

BOOTLDR: s72033_rp Software (s72033_rp-IPSERVICESK9-M), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)

1118th307-6513-01 uptime is 2 days, 23 hours, 20 minutes

Time since 1118th307-6513-01 switched to active is 2 days, 23 hours, 19 minutes

System returned to ROM by s/w reset (SP by error - a Software forced crash, PC 0x402DF044)

System image file is "sup-bootflash:s72033-ipservicesk9-mz.122-18.SXF11.bin"

This product contains cryptographic features and is subject to United

States and local country laws governing import, export, transfer and

use. Delivery of Cisco cryptographic products does not imply

third-party authority to import, export, distribute or use encryption.

Importers, exporters, distributors and users are responsible for

compliance with U.S. and local country laws. By using this product you

agree to comply with applicable laws and regulations. If you are unable

to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:

http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to

export@cisco.com.

cisco WS-C6513 (R7000) processor (revision 1.0) with 458720K/65536K bytes of memory.

Processor board ID TBM06150127

SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache

Last reset from power-on

SuperLAT software (copyright 1990 by Meridian Technology Corp).

X.25 software, Version 3.0.0.

Bridging software.

TN3270 Emulation software.

1 Virtual Ethernet/IEEE 802.3 interface

144 FastEthernet/IEEE 802.3 interfaces

36 Gigabit Ethernet/IEEE 802.3 interfaces

1917K bytes of non-volatile configuration memory.

8192K bytes of packet buffer memory.

65536K bytes of Flash internal SIMM (Sector size 512K).

Configuration register is 0x2102

Hi Mark,

Please find the output of show version and show redundancy in this post.

Please help me out with this. The config-register value is 0x2102.
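
For reference, the 0x2102 value can be decoded bit by bit. The following is a minimal Python sketch, not Cisco tooling: the function name `decode_confreg` is hypothetical, and the bit meanings are assumed to be the commonly documented ones for IOS configuration registers (bits 0-3 boot field, bit 6 ignore startup configuration, bit 8 Break disabled, bit 13 boot default ROM software if network boot fails).

```python
def decode_confreg(value: int) -> dict:
    """Split a 16-bit configuration register into a few commonly documented fields.

    The field meanings below are assumptions based on general IOS
    documentation, not something stated in this thread.
    """
    boot_field = value & 0x000F  # bits 0-3 select the boot behaviour
    if boot_field == 0x0:
        boot_action = "stay in ROM monitor"
    elif boot_field == 0x1:
        boot_action = "boot the first image in onboard flash"
    else:
        boot_action = "boot using the BOOT variable / boot system commands"
    return {
        "boot_field": boot_field,
        "boot_action": boot_action,
        "ignore_startup_config": bool(value & 0x0040),  # bit 6
        "break_disabled": bool(value & 0x0100),         # bit 8
        "rom_fallback_on_netboot_failure": bool(value & 0x2000),  # bit 13
    }


if __name__ == "__main__":
    for name, setting in decode_confreg(0x2102).items():
        print(f"{name}: {setting}")
```

Under these assumptions, 0x2102 has a boot field of 2, which asks the supervisor to boot from the BOOT variable rather than stay in ROM monitor, so the register value alone would not appear to explain a supervisor that stays in ROMMON.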

Log file

The Supervisor Engine 720 in slot 7 failed on Sunday, and the redundant supervisor engine in slot 8 was unable to take over. After the reboot, slot 8 became active, but slot 7 kept rebooting and stayed in ROMMON mode.

I need to find out the following:

1. Why did the failure happen? The log file is attached.

2. Why did the takeover by the redundant Supervisor Engine 720 fail?

3. Why does the failed supervisor keep rebooting and stay in ROMMON mode?

Thanks in advance.

Regards,

Siva

Siva,

Thanks for the posted info. I'm not sure why it keeps booting into ROMMON. Before we jump to conclusions about faulty hardware, can you post the crashinfo file from your bootflash? I saw that there was a software-forced crash in the show version output. This crash file should tell us what the real problem is.

Mark

-#- ED ----type---- --crc--- -seek-- nlen -length- ---------date/time--------- name

1 .. crashinfo 5CD968E5 C90E8 25 299112 Mar 1 2009 06:06:26 -05:00 crashinfo_20090301-110626

65236760 bytes available (299240 bytes used)

Hi Mark,

I have posted the output of bootflash: here.

Second, I would like to know how to switch over between the supervisor engines. Can you please help me with that?

Directory of bootflash:/

1 -rw- 299112 Mar 1 2009 06:06:26 -05:00 crashinfo_20090301-110626

65536000 bytes total (65236760 bytes free)

Can you pull the crash file from the switch and post it here? I am trying to investigate why the redundancy is not working.

Okay, I will do that. Before that, can you please guide me through a switchover from the active Sup720 to the standby Sup720?

Thanks again

The command to force a failover is "redundancy force-switchover". You shouldn't need to failover the supervisors to get the crashfile (unless you are doing something else).
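
Before forcing a switchover, it may be worth confirming that the standby is actually ready. Here is a minimal Python sketch, assuming the `show redundancy` text has been saved from the console. It reads only fields that appear in the output posted earlier in this thread (Communications, Operating Redundancy Mode, and the peer's Current Software state). The function names and the readiness rule are assumptions for illustration, not a Cisco-defined check.

```python
# Trimmed excerpt of the "show redundancy" output posted in this thread.
SAMPLE = """\
Redundant System Information :
------------------------------
Operating Redundancy Mode = sso
Communications = Up
Current Processor Information :
-------------------------------
Active Location = slot 8
Current Software state = ACTIVE
Peer Processor Information :
----------------------------
Standby Location = slot 7
Current Software state = STANDBY HOT
"""


def parse_show_redundancy(text: str) -> dict:
    """Group 'key = value' lines under the section header that precedes them."""
    sections = {}
    current = "header"
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) <= {"-"}:
            continue  # skip blank lines and dashed rulers
        if line.endswith(":") and "=" not in line:
            current = line.rstrip(": ")  # e.g. "Peer Processor Information"
            sections.setdefault(current, {})
            continue
        if "=" in line:
            key, _, value = line.partition("=")
            sections.setdefault(current, {})[key.strip()] = value.strip()
    return sections


def standby_ready(sections: dict) -> bool:
    """Hypothetical readiness rule: SSO operating, link up, peer STANDBY HOT."""
    system = sections.get("Redundant System Information", {})
    peer = sections.get("Peer Processor Information", {})
    return (
        system.get("Communications") == "Up"
        and system.get("Operating Redundancy Mode", "").lower() == "sso"
        and peer.get("Current Software state") == "STANDBY HOT"
    )


if __name__ == "__main__":
    print("standby ready:", standby_ready(parse_show_redundancy(SAMPLE)))
```

Keeping the section name as part of the lookup matters because "Current Software state" appears under both the current and the peer processor.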
