Sup720-3B always goes to rommon

Dec 29th, 2009

Hi,


I have a 6500 with two Sup720-3B supervisors in hot standby. When I reload slot 5, it drops into rommon (slot 6 stays up and takes over fine).

When it is in rommon and I simply type boot, it loads the image fine. Why doesn't it boot by itself?

Both Sup720s run SXI3. The config registers look OK at 0x2102.


Also, when both Sups are up, a strange error appears when I do 'wr mem'; see below.

Is this all caused by the (in)famous battery? That seems unlikely, because these devices have been powered on for a long time, so the battery should be fine.

What is it?



R015_R011#sh run | inc boot
boot-start-marker
boot system flash sup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin
boot system flash slavesup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin
boot-end-marker
no ip bootp server
diagnostic bootup level complete
R015_R011#sh bootvar
BOOT variable = sup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin,1;slavesup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin,1;
CONFIG_FILE variable =
BOOTLDR variable =
Configuration register is 0x2102


Standby is up
Standby has 524288K/65536K bytes of memory.


Standby BOOT variable = sup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin,1;slavesup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin,1;
Standby CONFIG_FILE variable =
Standby BOOTLDR variable =
Standby Configuration register is 0x2102
R015_R011#wr
Building configuration...


Dec 29 10:13:26: %PFREDUN-SP-4-BOOTSTRING_INVALID: The bootfile slavesup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin is present on the active supervisor but not on the standby
Dec 29 10:13:28: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router. [OK]

R015_R011#dir sup-bootflash:
Directory of sup-bootflash:/


    1  -rwx    58931492  Dec 17 2009 15:22:11 +01:00  s72033-ipservicesk9-mz.122-33.SXI3.bin


65536000 bytes total (6604380 bytes free)
R015_R011#dir slavesup-bootflash:
Directory of slavesup-bootflash:/


    1  -rwx    58931492  Dec 17 2009 15:27:27 +01:00  s72033-ipservicesk9-mz.122-33.SXI3.bin


65536000 bytes total (6604380 bytes free)
R015_R011#
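
For readers following along, here is a minimal Python sketch that splits a BOOT variable string like the one shown by sh bootvar above into its entries, so the filesystem prefix of each entry is easy to see. The function name is hypothetical and not part of any Cisco tool, and the trailing number is kept only as an opaque field. In this thread the second entry uses the slavesup-bootflash: prefix, which is the one named in the BOOTSTRING_INVALID warning.

```python
from typing import List, Optional, Tuple


def parse_boot_variable(boot_var: str) -> List[Tuple[str, str, Optional[int]]]:
    """Split a BOOT variable into (filesystem, filename, trailing_number) tuples.

    Hypothetical helper for illustration only. Entries are separated by ';'
    and each entry looks like 'filesystem:filename,number'.
    """
    entries = []
    for item in boot_var.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the final ';'
        if "," in item:
            path, _, number_text = item.rpartition(",")
            number = int(number_text)
        else:
            path, number = item, None
        filesystem, _, filename = path.partition(":")
        entries.append((filesystem, filename, number))
    return entries


if __name__ == "__main__":
    boot = (
        "sup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin,1;"
        "slavesup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin,1;"
    )
    for filesystem, filename, number in parse_boot_variable(boot):
        print(filesystem, filename, number)
```

The same helper also accepts an entry with no filename, such as bootdisk:,1; and returns an empty filename for it.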

R015_R011#sh mod
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  1    4  CSM with SSL                           WS-X6066-SLB-S-K9  SAD100408P8
  3   16  SFM-capable 16 port 1000mb GBIC        WS-X6516-GBIC      SAD0543036C
  4   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL11498ZVZ
  5    2  Supervisor Engine 720 (Active)         WS-SUP720-3B       SAL114883XG
  6    2  Supervisor Engine 720 (Hot)            WS-SUP720-3B       SAL114990Q6


Mod MAC addresses                       Hw    Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  1  0013.c39f.aa98 to 0013.c39f.aa9f   1.3                2.2(2-CSCsu3 Ok
  3  0001.63d2.dc82 to 0001.63d2.dc91   4.3   6.1(3)       12.2(33)SXI3 Ok
  4  001e.13dd.d300 to 001e.13dd.d32f   2.6   12.2(14r)S5  12.2(33)SXI3 Ok
  5  0019.e7d4.202c to 0019.e7d4.202f   5.6   8.5(2)       12.2(33)SXI3 Ok
  6  001a.2f3c.8ec4 to 001a.2f3c.8ec7   5.6   8.5(2)       12.2(33)SXI3 Ok


Mod  Sub-Module                  Model              Serial       Hw     Status
---- --------------------------- ------------------ ----------- ------- -------
  4  Centralized Forwarding Card WS-F6700-CFC       SAL11499CPC  4.0    Ok
  5  Policy Feature Card 3       WS-F6K-PFC3B       SAL11487WUF  2.3    Ok
  5  MSFC3 Daughterboard         WS-SUP720          SAL1148839T  3.1    Ok
  6  Policy Feature Card 3       WS-F6K-PFC3B       SAL11488UHY  2.3    Ok
  6  MSFC3 Daughterboard         WS-SUP720          SAL11488S14  3.1    Ok


Mod  Online Diag Status
---- -------------------
  1  Pass
  3  Pass
  4  Pass
  5  Pass
  6  Pass

vvasisth Tue, 12/29/2009 - 08:24

I think the reason is that you have a wrong boot variable. This is your output:

R015_R011#sh bootvar
BOOT variable = sup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin,1;slavesup-bootflash:s72033-ipservicesk9-mz.122-33.SXI3.bin,1;


Instead of using bootflash, replace it with bootdisk and that should do the trick.


Hope that helps.


Varun

GIULIO FAINI Tue, 12/29/2009 - 08:34

Thanks Varun, but it didn't work...


R015_R011#sh run | inc boot
boot-start-marker
boot system flash sup-bootdisk:s72033-ipservicesk9-mz.122-33.SXI3.bin
boot system flash slavesup-bootdisk:s72033-ipservicesk9-mz.122-33.SXI3.bin

boot-end-marker
no ip bootp server
diagnostic bootup level complete
R015_R011#wr
Building configuration...


Dec 29 16:20:31: %PFINIT-SP-5-CONFIG_SYNC: Sync'ing the startup configuration to the standby Router. [OK]
R015_R011#reload
Proceed with reload? [confirm]


Dec 29 16:20:42: %SYS-SP-3-LOGGER_FLUSHING: System pausing to ensure console debugging output.


Dec 29 16:20:42: %OIR-SP-6-CONSOLE: Changing console ownership to switch processor




Dec 29 16:20:42: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:00:00 to ensure console debugging output.


Dec 29 16:20:45: %SYS-SP-3-LOGGER_FLUSHING: System pausing to ensure console debugging output.




***
*** --- SHUTDOWN NOW ---
***


Dec 29 16:20:45: %SYS-SP-5-RELOAD: Reload requested by Delayed Reload. Reload Reason: reload.
Dec 29 16:20:45: %OIR-SP-6-CONSOLE: Changing console ownership to switch processor




System Bootstrap, Version 8.5(2)
Copyright (c) 1994-2007 by cisco Systems, Inc.
Cat6k-Sup720/SP processor with 524288 Kbytes of main memory


rommon 1 >

Reza Sharifi Tue, 12/29/2009 - 10:54

Hi GIULIO,


What happens when the slot 6 Sup is primary, the slot 5 Sup is the backup, and you reload slot 6 (primary)?

Does the slot 6 Sup also go into rommon, just like slot 5?


Also, is this happening after an upgrade to SXI3? I am running SXI2a with redundant Sups and no issues.



Reza

GIULIO FAINI Mon, 01/04/2010 - 05:27

Via remote login, show bootvar gives 0x0 as the config register for the supervisor; it should have been 0x2102.
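
For context on why a register of 0x0 matters: the low four bits of the configuration register are the boot field, and a boot field of 0 tells the system to stop at the ROM monitor instead of loading an image. The Python sketch below decodes only that field. The function names are illustrative and not part of any Cisco tool, and the wording for boot field 1 follows the generic IOS config-register description, so platform details may differ.

```python
def boot_field(confreg: int) -> int:
    """Return the boot field (the low four bits) of a configuration register."""
    return confreg & 0x000F


def describe_boot_field(confreg: int) -> str:
    """Describe, in generic IOS terms, what the boot field asks the system to do."""
    field = boot_field(confreg)
    if field == 0:
        return "stay at the ROM monitor (rommon)"
    if field == 1:
        # Generic description; exact behavior is platform dependent (assumption).
        return "boot the first image found in onboard flash"
    # Values 0x2 through 0xF
    return "boot using the boot system commands / BOOT variable"


if __name__ == "__main__":
    for value in (0x2102, 0x2100, 0x0):
        print(hex(value), "->", describe_boot_field(value))
```

Read this way, 0x2102 has boot field 2 and autoboots from the BOOT variable, while both 0x2100 and 0x0 have boot field 0, which matches a supervisor that sits at the rommon prompt until boot is typed by hand.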

ccannon88567 Mon, 01/04/2010 - 06:22

I had a similar issue a couple of weeks ago with one of our 6509s in a VSS configuration. Upon reboot, the config register would be lost and the system would boot straight into rommon, but we could easily set it back to 0x2102 and then boot the IOS. After an unspecified amount of time, the system would also produce a software-forced crash and then go back to rommon, again setting the register to 0.


This turned out to be a hardware issue with the Sup720. Have you tried passing a sh tech to TAC for assistance? Maybe you have a similar hardware issue, since the system is forcing you into rommon.

danrya Mon, 01/04/2010 - 22:54

The easiest way to fix a non-synced confreg is to reconfigure it from rommon. The next time it drops into rommon mode, just set the config register to the same value it already has on the RP, and it will sync to the SP.


I've seen this happen before, and the only time that doesn't work is if it's a hardware failure.


Dan

taieb.hlaoui Tue, 01/05/2010 - 05:40

Hello all,


Type the command:

remote command switch show boot

If the config register value is 0x2100, then type:


config-register 0x2102
write mem


Retype the command remote command switch show boot to see whether the value has changed,

and finally reload the module.


hope that it helps.


kind regards
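
The check described above amounts to comparing the config register reported locally with the one reported by the remote command. A rough Python sketch of that comparison follows, working on captured text. It assumes the remote command output contains a "Configuration register is 0x..." line in the same form as the sh bootvar output earlier in this thread; that format and the function names are assumptions for illustration, not a documented interface.

```python
import re
from typing import Dict

# Matches both "Configuration register is 0x2102" and
# "Standby Configuration register is 0x2102" at the start of a line.
CONFREG_RE = re.compile(
    r"^(Standby )?Configuration register is (0x[0-9A-Fa-f]+)", re.MULTILINE
)


def confreg_values(show_bootvar_output: str) -> Dict[str, int]:
    """Extract config register values from captured show bootvar text."""
    values = {}
    for match in CONFREG_RE.finditer(show_bootvar_output):
        key = "standby" if match.group(1) else "local"
        values[key] = int(match.group(2), 16)
    return values


def registers_match(rp_output: str, sp_output: str) -> bool:
    """True only if both outputs report the same local config register."""
    rp = confreg_values(rp_output).get("local")
    sp = confreg_values(sp_output).get("local")
    return rp is not None and rp == sp


if __name__ == "__main__":
    rp_text = "Configuration register is 0x2102\n"
    sp_text = "Configuration register is 0x2100\n"  # hypothetical capture
    print(registers_match(rp_text, sp_text))
```

A False result corresponds to the situation in this thread, where the two views disagree and the register needs to be set again and saved.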

ron_fourie Mon, 02/08/2010 - 20:54

Hi All,


I have a Sup 720-3B in a Cisco 6509E chassis that also only boots into rommon mode, but when I issue the "boot" command it finds the IOS on the bootdisk and tries to boot. The boot process only gets to the point where the Switch Processor (SP) needs to hand over to the Route Processor (RP), then fails and reboots to rommon.


I have checked the "set" variables; a copy is below. When I try to change them to point at the IOS image on the bootdisk specifically, the changes don't seem to take, even if I do a sync afterwards.


rommon 1 > set
PS1=rommon ! >
LOG_PREFIX_VERSION=1
SLOTCACHE=cards;
BOOT=bootdisk:,1;
ACL_DENY=0
BSI=0
CRASHINFO=crashinfo_FAILED
RET_2_RTS=02:29:39 UTC Thu Nov 26 2009
RET_2_RCALTS=1259202581
?=0
PF_REDUN_CRASH_COUNT=0


When I issue the command "confreg 0x2102", it still gets to the same point where the SP needs to hand over to the RP and fails. I do get some useful info on the console, but I am not sure how to fix the problem, as I don't seem to be able to change the boot variables.


Here is the output from the console.


System Bootstrap, Version 8.4(2) Release
Copyright (c) 1994-2005 by cisco Systems, Inc.
Cat6k-Sup720/SP processor with 524288 Kbytes of main memory

Autoboot executing command: "boot bootdisk:"

Initializing ATA monitor library...
string is bootdisk:s72033-ipservices_wan-vz.122-18.SXF11.bin
Loading image, please wait ...


Initializing ATA monitor library...

Self extracting the image... [OK]
Self decompressing the image : ############################################################################################################################################################################################################################### [OK]
running startup....

              Restricted Rights Legend

Use, duplication, or disclosure by the Government is
subject to restrictions as set forth in subparagraph
(c) of the Commercial Computer Software - Restricted
Rights clause at FAR sec. 52.227-19 and subparagraph
(c) (1) (ii) of the Rights in Technical Data and Computer
Software clause at DFARS sec. 252.227-7013.

           cisco Systems, Inc.
           170 West Tasman Drive
           San Jose, California 95134-1706


Cisco Internetwork Operating System Software
IOS (tm) s72033_sp Software (s72033_sp-IPSERVICES_WAN-VM), Version 12.2(18)SXF11, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2007 by cisco Systems, Inc.
Compiled Fri 14-Sep-07 23:25 by kellythw
Image text-base: 0x01020150, data-base: 0x01021000

Created ION FileSystem for disk0, disk1
Active crashed three times, disabling auto-boot and dropping to rommon

Crashdump : 04:25:55.936  Tue Feb 9 2010 : ios-base : (PID=12312, TID=4) : -Traceback=(s72033-ipservices_wan-9-dso-b.so+0x5D8D0) ([32:0]+0x61F74) ([32:0]+0x4C064) ([32:0]+0x46E70) ([22:-9]4+0x13DE4) ([32:0]+0x13DBC)
crashdump called (with pause = 0 sec)

Buffered messages:
Queued messages:
%ALIGN-1-FATAL: Illegal access to a low address
addr=0x0, pc=0x73E88728, ra=0x73E88724, sp=0x2CBDE68
SLOT0:00:00:31: %DUMPER-3-PROCINFO: pid = 12312: (sbin/ios-base), terminated due to signal SIGSEGV, Segmentation violation (Address not mapped)
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             zero     at       v0       v1
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R0   00000000  793F5698  00000000  00000000
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             a0       a1       a2       a3
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R4   0205CB44  797253D4  000079D8  00000000
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             t0       t1       t2       t3
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R8   00000000  00000000  00000053  00000020
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             t4       t5       t6       t7
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R12  00000000  FFFFFFE0  FFFFFFFF  73512000
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             s0       s1       s2       s3
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R16  02CBE128  02CBE158  00000000  00000000
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             s4       s5       s6       s7
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R20  00000000  00000000  00000000  00000000
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             t8       t9       k0       k1
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R24  02CBDD68  732CABB4  00000000  00000000
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             gp       sp       s8       ra
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R28  7A570D50  02CBDE68  00000000  73E88724
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             sr       lo       hi       bad
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R32  1000FC63  00000000  00000001  00000000
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:             cause    pc       epc
SLOT0:00:00:31: %DUMPER-3-REGISTERS_INFO: 12312:   R36  0080000C  73E88728  00000000
SLOT0:00:00:31: %DUMPER-3-TRACE_BACK_INFO: 12312: (s72033-ipservices_wan-7-dso-b.so+0x43F728) (s72033-ipservices_wan-9-dso-b.so+0x5D8D0) (s72033-ipservices_wan-9-dso-b.so+0x61F74) (s72033-ipservices_wan-9-dso-b.so+0x4C064) (s72033-ipservices_wan-9-dso-b.so+0x46E70) (s72033-ipservices_wan-4-dso-b.so+0x13DE4) (s72033-ipservices_wan-4-dso-b.so+0x13DBC)
SLOT0:00:00:34: %DUMPER-6-BAD_PATH: 12312: Choice 1 either not configured or bad path. Trying next choice.
SLOT0:00:00:34: %DUMPER-6-BAD_PATH: 12312: Choice 2 either not configured or bad path. Trying next choice.
SLOT0:00:00:34: %DUMPER-6-BAD_PATH: 12312: Choice 3 either not configured or bad path. Trying next choice.
SLOT0:00:00:35: %DUMPER-3-DUMP_FAILURE: 12312: Core dump failed: Could not create core
SLOT0:00:00:35: %DUMPER-3-CRASHINFO_FILE_NAME: 12312: Crashinfo for process sbin/ios-base at bootflash:/crashinfo_ios-base-20100209-042556
SLOT0:00:00:35: %SYSMGR-3-ABNORMTERM: ios-base:1 (jid 73) abnormally terminated, restart disabled
SLOT0:00:00:35: %SYSMGR-6-ERROR_EOK: ios-base:1 (jid 73) mandatory process exited, rebooting

System Bootstrap, Version 8.4(2) Release
Copyright (c) 1994-2005 by cisco Systems, Inc.
Cat6k-Sup720/SP processor with 524288 Kbytes of main memory


Any ideas would be welcome as I am really desperate to get this issue sorted so that I can deploy this device.

Cheers,

Ali Norouzi Sat, 12/08/2012 - 00:59

Thank you Taieb. Your solution worked for me. Although show version showed the config register as 0x2102, the remote command showed 0x2100. After retyping the command and reloading the router, the problem was solved. Why does this happen?
