ISSU bootloop

Unanswered Question
Jul 14th, 2010

Hi,

I tried an ISSU upgrade on a pre-production VSS system the other day. I was going from SXI3 to SXI4.

Every time I issued the issu runversion command, both the Active and Standby units reloaded. I tried to get it to work with different images, but every time I got the same result. After maybe 5 attempts the Standby switch did not load IOS but stopped in rommon. I removed all connections, turned off the active unit and tried to get it to boot as standalone.

After trying that I can't get it to boot at all. It complains about the image, "%ISSU_PROCESS-SP-3-IMAGE: Active is loading the wrong image", and here we are.

I changed the confreg, erased nvram from the RP, and tried different images for booting, but it always comes back to SP and rommon.

Does anyone know how to clear the unit, remove all config and parameters from the switch?

Back to vanilla and no ISSU process out of control.

------------------------

cisco WS-C6506-E (R7000) processor (revision 1.2) with 983008K/65536K bytes of memory.
Processor board ID SAL1419HCT8
SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache
Last reset from s/w reset
1 Virtual Ethernet interface
3 Gigabit Ethernet interfaces
18 Ten Gigabit Ethernet interfaces
1917K bytes of non-volatile configuration memory.
8192K bytes of packet buffer memory.

65536K bytes of Flash internal SIMM (Sector size 512K).
*Jul 14 15:56:53.875: %SYS-SP-3-LOGGER_FLUSHING: System pausing to ensure console debugging output.

*Jul 14 15:53:47.019: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:00:00 to ensure console debugging output.

*Jul 14 15:56:46.287: %SPANTREE-SP-5-EXTENDED_SYSID: Extended SysId enabled for type vlan. The Bridge IDs of all active STP instances have been updated, which might change the spanning tree topology
*Jul 14 15:56:46.295: SP: SP: Currently running ROMMON from S (Gold) region
*Jul 14 15:56:53.111: %ISSU_PROCESS-SP-3-IMAGE: Active is loading the wrong image [ bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin ], expected image [ bootdisk:/s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin ]
*Jul 14 15:56:53.871: %RF-SP-5-RF_RELOAD: Shelf reload. Reason: Active is loading the wrong image
*Jul 14 15:56:53.875: %OIR-SP-6-CONSOLE: Changing console ownership to switch processor

*Jul 14 15:56:57.207: %SYS-SP-3-LOGGER_FLUSHING: System pausing to ensure console debugging output.

***
*** --- SHUTDOWN NOW ---
***

*Jul 14 15:56:54.819: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:00:00 to ensure console debugging output.

*Jul 14 15:56:57.199: %SYS-SP-5-RELOAD: Reload requested by Delayed Reload. Reload Reason: bus error at PC 0x4174A1D0, address 0x0.
*Jul 14 15:56:57.207: %OIR-SP-6-CONSOLE: Changing console ownership to switch processor

System Bootstrap, Version 8.5(3)
Copyright (c) 1994-2008 by cisco Systems, Inc.
Cat6k-Sup720/SP processor with 1048576 Kbytes of main memory

rommon 1 >

-------------------------------------------------

Best regards

Mikael Gustafsson

bralynch Wed, 07/14/2010 - 09:34

Hey Mikael,

Can you please provide the steps you took in changing the config-register, erasing the NVRAM, and trying to boot the switch?  Does the switch still fail to boot if you manually boot it by specifying an image from one of the available file systems?

Brandon

mik.gustafsson@... Wed, 07/14/2010 - 11:02

Hi Brandon

-First I tried to change the BOOT= parameter in SP rommon. I tested SXI2a, 3 and 4 all on the switch.

   BOOT=sup-bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin.

-I tested to boot directly with the  boot bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin command.

-Then I changed the confreg to 0x2142 to skip startup-config.

-After this I changed the SWITCH_NUMBER= from 2 to 1.

-And then I changed the BOOT= on the RP and tested pointing it at different images.

-Then a nvram_erase on the RP (with rommon 1 > priv).

-I put the confreg back to 0x2102 and after a boot to 0x0.

Between all steps I did at least one boot, and all of them stopped with the error %ISSU_PROCESS-SW2_SP-3-IMAGE: Active is loading the wrong image: ...

I had a couple of crashes during my attempts to boot IOS. They produced crashinfo_20100714 files on the bootflash:

BR

Mikael

bralynch Wed, 07/14/2010 - 11:16

Hey Mikael,

After looking into this further, you may need to clear the persistent ISSU variables.  Please try the following -

rommon > priv

rommon > PRST_VBL_NUKE=1

rommon > sync

Then, confirm the variable is set correctly -

rommon > set

rommon > i

Make sure your boot statement is pointing to the correct image and then reload.  If you experience further boot failures after this, you can also try clearing all BOOT variables -

rommon > unset BOOT

rommon > sync

Hope this helps.  Let me know what you find.

Brandon

mik.gustafsson@... Wed, 07/14/2010 - 12:37

Hi Brandon,

I set the PRST_VBL_NUKE=1 variable and restarted the system. It didn't work, same problem.

After that I removed the BOOT variables, and at first it looked like it was working, but then I got a system crash and a new crashinfo file instead. I reset the system again and the ISSU error was back :-)

Then I again tested to do a > boot bootdisk:reboot and got IOS to start.

The error is:

*Jul 14 19:04:32.223: %ISSU_PROCESS-SP-3-IMAGE: Active is loading the wrong image [ bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin ], expected image [ bootdisk:/s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin ]

The difference here is the "/" after bootdisk:. So I simply tried

rommon 6 > boot bootdisk:/s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin

and it worked.
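The mismatch in that message is easy to miss by eye. As a minimal sketch, assuming the message format shown in this thread, the two paths can be pulled out of the log line and compared. The names `IMAGE_MSG`, `strip_leading_slash` and `image_mismatch` are hypothetical helpers for illustration, not part of any Cisco tool.

```python
import re

# Matches the "wrong image" message as it appears in this thread.
# All names here are illustrative, not a Cisco-provided API.
IMAGE_MSG = re.compile(
    r"Active is loading the wrong image \[\s*(?P<loaded>\S+)\s*\], "
    r"expected image \[\s*(?P<expected>\S+)\s*\]"
)


def strip_leading_slash(path):
    """Treat 'bootdisk:/file' and 'bootdisk:file' as the same file."""
    return path.replace(":/", ":", 1)


def image_mismatch(log_line):
    """Return (loaded, expected, slash_only) or None if the line does not match.

    slash_only is True when the two paths differ only by the '/' after
    the file system name.
    """
    match = IMAGE_MSG.search(log_line)
    if match is None:
        return None
    loaded = match.group("loaded")
    expected = match.group("expected")
    slash_only = (
        loaded != expected
        and strip_leading_slash(loaded) == strip_leading_slash(expected)
    )
    return loaded, expected, slash_only


line = (
    "*Jul 14 19:04:32.223: %ISSU_PROCESS-SP-3-IMAGE: Active is loading the "
    "wrong image [ bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin ], "
    "expected image [ bootdisk:/s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin ]"
)
print(image_mismatch(line))
```

For the line above, the third element of the result is True, which is the same conclusion reached by hand: the file name is identical and only the "/" differs.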

Now I'm in IOS and trying to change the image with the boot system command, but then I get this.

Router#conf t
Enter configuration commands, one per line.  End with CNTL/Z.

Router(config)#$ sup-bootdisk:/s72033-adventerprisek9-vz.122-33.SXI3.bin

% ISSU process is in progress; Boot variable can not be updated.
Router(config)#

This means that I never succeeded in stopping the ISSU, so I tried:

Router#issu abortversion ?
  <1-6>  Active RP slot number
 

Router#issu abortversion
The system is without a fully initialized peer and Service impact will occur. Proceed with abort? [confirm]
% ISSU process can be aborted only from [ Load Version ] or [ Run Version ] or [ Load Version - Switchover ] or [ Run Version - Switchover ] state

Router#

Do you have any suggestions about the ISSU process?

Thanks again for your help so far.

BR

Mikael

bralynch Wed, 07/14/2010 - 13:09

Hey Mikael,

With the box up, can you change the configuration-register to 0x2142 or perform a 'write erase' to clear the configuration?  Or is this also blocked?  Given that we know that we can boot it up by including the extra '/', we may need to test another reload and try to set the ROMMON variable again.

If you are unable to change the config-register or clear the memory with the box up, please log your session and do the following -

1) Warm-reboot the device.

2) When it starts to come up, break into ROMMON using 'Ctrl+Break'.

3) Follow the same procedure as before to set the 'PRST_VBL_NUKE' variable to 1.  Go ahead and clear the boot variables as before as well at this phase before reloading.

4) Manually set the 'BOOT' variable to point to the correct image without the extra '/' symbol.

5) Make sure these changes are saved with 'sync' and confirm them with the 'set' command.

6) Reset the box to see if it boots.

Please attach the log for review.  Thanks.

By the way, what was the full name of the original image you were running and the full name of the image you were trying to upgrade to?

Brandon

mik.gustafsson@... Thu, 07/15/2010 - 03:43

Hi Brandon,

I now succeed in getting IOS up every time I do a reload. The earlier failures were probably due to a mistake of mine, where I set CONFREG 0x4142 under the SP. Now it's changed to 0x2102 and the boot process is normal.
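To compare the register values mentioned in this thread (0x2102, 0x2142, 0x4142 and 0x0), a small decoder helps. This is only a sketch: it assumes the commonly documented Cisco IOS meanings for the boot field (bits 0-3) and for bit 6 (ignore the startup configuration), and the function name `decode_confreg` is hypothetical. Check the platform documentation before relying on any other bit.

```python
# Assumed bit meanings (commonly documented for Cisco IOS config registers):
#   bits 0-3 : boot field
#   bit 6    : ignore the startup configuration in NVRAM
BOOT_FIELD_MASK = 0x000F
IGNORE_STARTUP_CONFIG = 0x0040


def decode_confreg(value):
    """Decode the boot field and bit 6, and list every bit that is set."""
    boot_field = value & BOOT_FIELD_MASK
    if boot_field == 0:
        boot = "stay in ROMMON"
    elif boot_field == 1:
        boot = "boot the first image found (platform dependent)"
    else:
        boot = "boot using the BOOT variable / boot system commands"
    return {
        "value": f"0x{value:04X}",
        "boot": boot,
        "ignore_startup_config": bool(value & IGNORE_STARTUP_CONFIG),
        "set_bits": [bit for bit in range(16) if value & (1 << bit)],
    }


for register in (0x2102, 0x2142, 0x4142, 0x0):
    print(decode_confreg(register))
```

Under these assumptions, 0x4142 differs from 0x2102 in bits 6, 13 and 14, and 0x0 leaves the boot field at zero, which keeps the processor in ROMMON.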

But ISSU is still stuck in a state where it won't allow me to change the image, in IOS or rommon.

Router(config)#boot system sup-bootdisk:^Z
% ISSU process is in progress; Boot variable can not be updated.      

When I look at ISSU it gives me disabled.

Router#sh issu state
                          Slot = 5
                      RP State = Active
                    ISSU State = System Reset
                 Boot Variable = bootdisk:/s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin
% Standby information is not available because it is in 'DISABLED' state

I see a lot of ISSU processes like   129 Mwe 41E73E38            0          1       0 5564/6000   0 IPC ISSU Version

If I could bring up the peer switch and get them to recognise each other, maybe I could somehow trick ISSU?

That's what I'm testing now.

BR

Mikael

mik.gustafsson@... Thu, 07/15/2010 - 06:34

The last test I did, setting up the VSS pair again, worked. It got the ISSU state back to Init.

Router#sh issu state
                          Slot = 1/5
                      RP State = Active
                    ISSU State = Init
                 Boot Variable = bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin;

                          Slot = 2/5
                      RP State = Standby
                    ISSU State = Init
                 Boot Variable = bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin;
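When this check is repeated across several attempts, it can help to turn the output into data rather than reading it each time. The following is a rough sketch that assumes the "key = value" layout shown in this thread; `parse_issu_state` is a hypothetical helper, and the sample text is copied from the output above.

```python
def parse_issu_state(text):
    """Split 'show issu state' output into one dict per Slot entry."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, '%' notices and anything that is not 'key = value'.
        if not line or line.startswith("%") or " = " not in line:
            continue
        key, value = (part.strip() for part in line.split(" = ", 1))
        if key == "Slot":
            entries.append({})  # a Slot line starts a new entry
        if entries:
            entries[-1][key] = value
    return entries


sample = """
                          Slot = 1/5
                      RP State = Active
                    ISSU State = Init
                 Boot Variable = bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin;

                          Slot = 2/5
                      RP State = Standby
                    ISSU State = Init
                 Boot Variable = bootdisk:s72033-advipservicesk9_wan-mz.122-33.SXI2a.bin;
"""

states = parse_issu_state(sample)
for entry in states:
    print(entry["Slot"], entry["RP State"], entry["ISSU State"])
print("both in Init:", all(e["ISSU State"] == "Init" for e in states))
```

The same function also handles the single-supervisor output seen earlier, where the standby line starts with "%" and is simply skipped.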

Now I'm able to start over, put the configuration back and start a new test using ISSU for SXI3 and SXI4.

A thought about this:

Is it possible to remove this variable without a second Catalyst? Is it the same for other 'deep' config like VSS, i.e. not possible to remove it all?

BR

Mikael

bralynch Thu, 07/15/2010 - 07:40

Hey Mikael,

From what I found, setting the mentioned variable to "1" was the suggested way to reinitialize ISSU.  If this was being set from the SP's ROMMON and the problem was still there, it may be worth getting a TAC case open for deeper investigation.  It's possible that in general circumstances with the second switch still connected in the VSS pair, this method would work but did not because the secondary switch was disconnected.  Has the second ISSU procedure worked successfully?

Brandon

mik.gustafsson@... Fri, 07/16/2010 - 03:54

Hi Brandon,

It took some time to get the pre-production lab up again. The VSS system with problems is now connected (MEC) to another VSS pair that works as a core router over two DC sites. And that core pair is in turn connected to another VSS.

I got all features running between them, OSPF, MPLS VPN and so on. I got L2 MEC links working and connected to access-layer N5k with vPC.

With all that done, I tried an ISSU upgrade from the Active switch again, from SXI3 to SXI4. The first time I got:

r-log-oslr-r25-9-1#issu loadversion su
r-log-oslr-r25-9-1#$bootdisk:s72033-adventerprisek9-vz.122-33.SXI4.bin      

000114: Jul 15 22:40:01.955 DST: %ISSU_PROCESS-SW1_SP-3-IPC_AGENT: Failed to send; error code [ timeout ]
000115: Jul 15 22:40:01.955 DST: %ISSU_PROCESS-SW1_SP-3-SYSTEM: Failed to set Standby ISSU state to Pre-LV
% ISSU state could not be set to LV on Standby.

I then rebooted the system and tried the PRST_VBL_NUKE on the Standby (the problem switch).

This worked and I could run the #issu loadversion command.

So I waited for maybe 20 minutes and ran #issu runversion. This produced the same result as before: both Active and Standby reloaded.

This is the output from the Standby unit:

000040: Jul 15 23:22:53.591 DST: %OIR-SW2_SP-6-INSCARD: Card inserted in slot 5, interfaces are now online
000041: Jul 15 23:22:55.291 DST: %LDP-5-GR: LDP restarting gracefully.  Preserving forwarding state for 600 seconds.
SLOT0:00:23:19: %DUMPER-3-PROCINFO: pid = 16406: (sbin/ios-base), terminated due to signal SIGQUIT, Quit (Signal from user)
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             zero     at       v0       v1      
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R0   00000000  7A95AB24  00000004  01000000 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             a0       a1       a2       a3      
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R4   00E93CB8  00E93CC0  00000000  00E93D50 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             t0       t1       t2       t3      
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R8   FFFFFFF8  FFFAFFFF  00E93F5C  00E93F58 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             t4       t5       t6       t7      
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R12  00E93F54  00E93F50  00E93F4C  00E93F48 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             s0       s1       s2       s3      
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R16  00E93CB8  00E93CC0  00000004  00E93E08 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             s4       s5       s6       s7      
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R20  00E93D58  00000000  00000000  00000008 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             t8       t9       k0       k1      
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R24  00E93F94  779D683C  00000000  00000000 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             gp       sp       s8       ra      
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R28  7D2C9F20  00E93C68  00E93D08  779B9C60 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             sr       lo       hi       bad     
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R32  1000FC73  00000000  00000000  7D2BFEF0 
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:             cause    pc       epc     
SLOT0:00:23:19: %DUMPER-3-REGISTERS_INFO: 16406:   R36  00800020  779D684C  00000000 
SLOT0:00:23:19: %DUMPER-3-TRACE_BACK_INFO: 16406: (libc.so+0x4084C) (libc.so+0x23C60) (libc.so+0x554AC) (libc.so+0x55AC0) (libc.so+0x81C90) (s72033_rp-adventerprisek9-13-dso-b.so+0x3C9900) (s72033_rp-adventerprisek9-7-dso-b.so+0x442638) (libc.so+0x2404C)
SLOT0:00:23:25: %DUMPER-6-BAD_PATH: 16406: Choice 1 either not configured or bad path. Trying next choice.
SLOT0:00:23:25: %DUMPER-6-BAD_PATH: 16406: Choice 2 either not configured or bad path. Trying next choice.
SLOT0:00:23:25: %DUMPER-6-BAD_PATH: 16406: Choice 3 either not configured or bad path. Trying next choice.
SLOT0:00:23:26: %DUMPER-3-DUMP_FAILURE: 16406: Core dump failed: Could not create core
SLOT0:00:23:28: %DUMPER-3-CRASHINFO_FILE_NAME: 16406: Crashinfo for process sbin/ios-base at bootflash:/crashinfo_ios-base-20100715-212258
SLOT0:00:23:28: %SYSMGR-3-ABNORMTERM: ios-base:1 (jid 75) abnormally terminated, restart disabled
SLOT0:00:23:28: %SYSMGR-6-ERROR_EOK: ios-base:1 (jid 75) mandatory process exited, rebooting

That was last night. After that I tried ISSU three more times and the last attempt worked, with only about 30 seconds of downtime for failover (OSPF).

I did get a lot of these:

000197: *Jul 16 11:59:35.570 DST: %BIT-SW2_SP-4-OUTOFRANGE: bit 1114112 is not in the expected range of 1920 to 8191 : ios-base : (PID=12311, TID=10) : -Traceback=(s72033-adventerprisek9-0-dso-bnp.so+0x3E284) ([35:0]+0x3E47C) ([23:-3]1-dso-b+0x3F28BC) ([23:-9]4+0x2DC528) ([23:-9]1+0x40F9E4) ([23:-9]4+0x2DD298) ([33:0]+0x2E09B4) ([33:0]+0x2E0E84) ([33:0]+0x2EBEF0)
000198: *Jul 16 11:59:35.570 DST: %BIT-SW2_SP-4-OUTOFRANGE: bit 4980736 is not in the expected range of 1920 to 8191 : ios-base : (PID=12311, TID=10) : -Traceback=(s72033-adventerprisek9-0-dso-bnp.so+0x3E284) ([35:0]+0x3E47C) ([23:-3]1-dso-b+0x3F28BC) ([23:-9]4+0x2DC528) ([23:-9]1+0x40F9E4) ([23:-9]4+0x2DD298) ([33:0]+0x2E09B4) ([33:0]+0x2E0E84) ([33:0]+0x2EBEF0)
000199: *Jul 16 11:59:35.570 DST: %BIT-SW2_SP-4-OUTOFRANGE: bit 58851328 is not in the expected range of 1920 to 8191 : ios-base : (PID=12311, TID=10) : -Traceback=(s72033-adventerprisek9-0-dso-bnp.so+0x3E284) ([35:0]+0x3E47C) ([23:-3]1-dso-b+0x3F28BC) ([23:-9]4+0x2DC528) ([23:-9]1+0x40F9E4) ([23:-9]4+0x2DD298) ([33:0]+0x2E09B4) ([33:0]+0x2E0E84) ([33:0]+0x2EBEF0)
000200: *Jul 16 11:59:37.295 DST: %BIT-SW2_SP-4-OUTOFRANGE: bit 1114112 is not in the expected range of 1920 to 8191 : ios-base : (PID=12311, TID=10) : -Traceback=(s72033-adventerprisek9-0-dso-bnp.so+0x3E284) ([35:0]+0x3E47C) ([23:-3]1-dso-b+0x3F28BC) ([23:-9]4+0x2DC528) ([23:-9]1+0x40F688) ([23:-9]4+0x2DCE98) ([33:0]+0x2E0910) ([33:0]+0x2E0E84) ([33:0]+0x2EBEF0)
000201: *Jul 16 11:59:37.299 DST: %BIT-SW2_SP-4-OUTOFRANGE: bit 4980736 is not in the expected range of 1920 to 8191 : ios-base : (PID=12311, TID=10) : -Traceback=(s72033-adventerprisek9-0-dso-bnp.so+0x3E284) ([35:0]+0x3E47C) ([23:-3]1-dso-b+0x3F28BC) ([23:-9]4+0x2DC528) ([23:-9]1+0x40F688) ([23:-9]4+0x2DCE98) ([33:0]+0x2E0910) ([33:0]+0x2E0E84) ([33:0]+0x2EBEF0).

For this setup of switches my deadline is in about 3 weeks, so I'm going to try ISSU a couple more times to see how it works out, but then I have to start the move to production. My conclusion is to not use ISSU on this pair in a production environment at the moment.

BR

Mikael

bralynch Fri, 07/16/2010 - 06:04

Hey Mikael,

It's not clear at this point what may have triggered these failures.  I presume that you were following a process similar to the following document -

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/configuration/guide/vss.html#wp1169499

From the errors provided, it appears that crash files were being generated so it would certainly be worth getting a TAC case open at some point to investigate this failure further.  By all accounts, this procedure should work successfully.  Was the SXI3 image that you were trying to upgrade from modular as well?  Can you provide the full image names of those that were tested?  This image also needs to have the same feature set.  It looks like the images should match from your previous posts but I wanted to confirm.

Brandon

mik.gustafsson@... Wed, 07/21/2010 - 02:11

Hi Brandon,

Yes I use the configuration guide "12.2(33)SXH  and later releases"

First I did the upgrade from s72033-adventerprisek9-vz.122-33.SXI3.bin to s72033-adventerprisek9-vz.122-33.SXI4.bin.

This is the upgrade that I can't get working. I get a reload on both the active and standby switch.

The bootloop is probably from the confreg mistake I made.

Just 20 minutes ago I tried an ISSU upgrade, again from SXI3 to SXI4. Up until the issu commitversion command all looks fine. But when I run the last command, both standby and active reload.

----From SSH session to VSS pair----

r-log-oslr-r25-9-1#sh redun
r-log-oslr-r25-9-1#sh redundancy
Redundant System Information :
------------------------------
       Available system uptime = 56 minutes
Switchovers system experienced = 1
              Standby failures = 0
        Last switchover reason = active unit removed

                 Hardware Mode = Duplex
    Configured Redundancy Mode = sso
     Operating Redundancy Mode = sso
              Maintenance Mode = Disabled
                Communications = Up

Current Processor Information :
-------------------------------
               Active Location = slot 2/5
        Current Software state = ACTIVE
       Uptime in current state = 31 minutes
                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9-VM), Version 12.2(33)SXI4, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2010 by Cisco Systems, Inc.
Compiled Sat 29-May-10 16:50 by prod_rel_team
                          BOOT = sup-bootdisk:s72033-adventerprisek9-vz.122-33.SXI3.bin,12;
        Configuration register = 0x2102

Peer Processor Information :
----------------------------
              Standby Location = slot 1/5
        Current Software state = STANDBY HOT
       Uptime in current state = 24 minutes
                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9-VM), Version 12.2(33)SXI3, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2009 by Cisco Systems, Inc.
Compiled Tue 27-Oct-09 10:21 by prod_rel_team
                          BOOT = sup-bootdisk:s72033-adventerprisek9-vz.122-33.SXI3.bin,12;
        Configuration register = 0x2102


r-log-oslr-r25-9-1#sh issu stat
r-log-oslr-r25-9-1#sh issu state
                          Slot = 2/5
                      RP State = Active
                    ISSU State = Run Version
                 Boot Variable = bootdisk:s72033-adventerprisek9-vz.122-33.SXI4.bin,12;bootdisk:s72033-adventerprisek9-vz.122-33.SXI3.bin,12

                          Slot = 1/5
                      RP State = Standby
                    ISSU State = Run Version
                 Boot Variable = bootdisk:s72033-adventerprisek9-vz.122-33.SXI3.bin,12
r-log-oslr-r25-9-1#
r-log-oslr-r25-9-1#sh issu rol
r-log-oslr-r25-9-1#sh issu rollback-timer
        Rollback Process State = In progress
      Configured Rollback Time = 00:45:00
       Automatic Rollback Time = 00:08:47

r-log-oslr-r25-9-1#
r-log-oslr-r25-9-1#
r-log-oslr-r25-9-1#iss
r-log-oslr-r25-9-1#issu com
r-log-oslr-r25-9-1#issu commitversion
Building configuration...

----From Standby VSS switch----

(UNKNOWN)SLOT0:00:45:30: %DUMPER-3-PROCINFO: pid = 16406: (sbin/ios-base), terminated due to signal SIGBUS, Bus error (Invalid address alignment)
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             zero     at       v0       v1      
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R0   00000000  77A2550C  00000001  00000FBA 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             a0       a1       a2       a3      
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R4   00001016  7A8CE344  00FDDFC0  0000000C 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             t0       t1       t2       t3      
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R8   00000000  0000CA30  B1C2CA30  0000003C 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             t4       t5       t6       t7      
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R12  00000057  7B1D7770  03A6A8CC  1EC75DDC 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             s0       s1       s2       s3      
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R16  FFFFFFFF  10A2FD50  10A2FDF0  0EAD55C8 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             s4       s5       s6       s7      
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R20  00000000  00000000  00000000  00000000 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             t8       t9       k0       k1      
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R24  00000000  73BFAAD8  00000000  00000000 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             gp       sp       s8       ra      
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R28  7CFD9910  031BED08  00000000  764FC168 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             sr       lo       hi       bad     
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R32  1000FC63  0044B650  00000000  73BFA2E1 
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:             cause    pc       epc     
SLOT0:00:45:30: %DUMPER-3-REGISTERS_INFO: 16406:   R36  80800014  764FC160  00000000 
SLOT0:00:45:30: %DUMPER-3-TRACE_BACK_INFO: 16406: (s72033_rp-adventerprisek9-2-dso-bn.so+0x3A160) (s72033_rp-adventerprisek9-12-dso-b.so+0x22C6EC) (s72033_rp-adventerprisek9-12-dso-b.so+0x22E37C) (s72033_rp-adventerprisek9-4-dso-b.so+0x4541EC) (s72033_rp-adventerprisek9-4-dso-b.so+0x456E24) (s72033_rp-adventerprisek9-4-dso-b.so+0x43309C) (s72033_rp-adventerprisek9-3-dso-b.so+0x57B858) (s72033_rp-adventerprisek9-3-dso-b.so+0x57C10C) (s72033_rp-adventerprisek9-3-dso-b.so+0x575280) (s72
SLOT0:00:45:36: %DUMPER-6-BAD_PATH: 16406: Choice 1 either not configured or bad path. Trying next choice.
SLOT0:00:45:36: %DUMPER-6-BAD_PATH: 16406: Choice 2 either not configured or bad path. Trying next choice.
SLOT0:00:45:36: %DUMPER-6-BAD_PATH: 16406: Choice 3 either not configured or bad path. Trying next choice.
SLOT0:00:45:36: %DUMPER-3-DUMP_FAILURE: 16406: Core dump failed: Could not create core
SLOT0:00:45:38: %DUMPER-3-CRASHINFO_FILE_NAME: 16406: Crashinfo for process sbin/ios-base at bootflash:/crashinfo_ios-base-20100721-084620
SLOT0:00:45:39: %SYSMGR-3-ABNORMTERM: ios-base:1 (jid 75) abnormally terminated, restart disabled
SLOT0:00:45:39: %SYSMGR-6-ERROR_EOK: ios-base:1 (jid 75) mandatory process exited, rebooting

Then, when the boot is done, both switches run SXI4.

r-log-oslr-r25-9-1#sh issu state
                          Slot = 1/5
                      RP State = Active
                    ISSU State = Init
                 Boot Variable = bootdisk:s72033-adventerprisek9-vz.122-33.SXI4.bin,12;

                          Slot = 2/5
                      RP State = Standby
                    ISSU State = Init
                 Boot Variable = bootdisk:s72033-adventerprisek9-vz.122-33.SXI4.bin,12;

I do have a couple of the crashinfo files.

r-log-oslr-r25-9-1#dir all | i crash
    2  -rw-      338073  Jul 15 2010 23:23:04 +02:00  crashinfo_ios-base-20100715-212258
    6  -rw-      426177  Jul 21 2010 10:46:28 +02:00  crashinfo_ios-base-20100721-084620
    2  -rw-      374361  Jul 13 2010 13:24:56 +02:00  crashinfo_ios-base-20100713-112449
    3  -rw-      379847  Jul 13 2010 17:35:25 +02:00  crashinfo_ios-base-20100713-153516
    4  -rw-      380825  Jul 13 2010 20:41:05 +02:00  crashinfo_ios-base-20100713-184057
    8  -rw-      408903  Jul 15 2010 23:23:05 +02:00  crashinfo_ios-base-20100715-212258
    9  -rw-      487739  Jul 21 2010 10:46:28 +02:00  crashinfo_ios-base-20100721-084620
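The listing repeats some file names, which makes it hard to see how many distinct crashes there were. As a small sketch, the timestamp at the end of each name can be parsed and the names de-duplicated. It assumes the name always ends in YYYYMMDD-HHMMSS, as in the listing above; `crashinfo_time` is a hypothetical helper.

```python
from datetime import datetime


def crashinfo_time(name):
    """Parse the trailing YYYYMMDD-HHMMSS stamp of a crashinfo file name."""
    stamp = "-".join(name.rsplit("-", 2)[-2:])
    return datetime.strptime(stamp, "%Y%m%d-%H%M%S")


# File names copied from the 'dir all | i crash' output above.
names = [
    "crashinfo_ios-base-20100715-212258",
    "crashinfo_ios-base-20100721-084620",
    "crashinfo_ios-base-20100713-112449",
    "crashinfo_ios-base-20100713-153516",
    "crashinfo_ios-base-20100713-184057",
    "crashinfo_ios-base-20100715-212258",
    "crashinfo_ios-base-20100721-084620",
]

# Distinct crash events, oldest first.
unique = sorted(set(names), key=crashinfo_time)
for name in unique:
    print(crashinfo_time(name).isoformat(), name)
```

A list like this, ordered by time, is a convenient thing to attach to a TAC case together with the files themselves.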

Mikael

bralynch Wed, 07/21/2010 - 08:24

Hey Mikael,

At this point, I think you need to go ahead and get a TAC case open.  This behavior does not look normal and the fact that the switch is generating crash files should be investigated further.  This could be associated with a software bug.  Let me know if you have any other questions.

Brandon

mik.gustafsson@... Wed, 07/21/2010 - 08:33

Hi Brandon,

I did open a TAC case today as you recommended. I'll try to post an update when there is a conclusion to this.

Thanks for all the help so far.

Mikael

patrick.roche Mon, 10/11/2010 - 08:49

Hello Mikael,

What was the TAC SR number? I seem to be having the same problem.

Thanks,

Pat

mik.gustafsson@... Thu, 10/21/2010 - 01:43

Hi Pat,

There was no resolution to this; TAC could not reproduce the problem, and after I upgraded to SXI4 neither could I.

The problem with the ISSU upgrade was a minor problem that I don't have anymore :-), and I had to put this VSS pair in production.

That covers the upgrade problem, with both VSS members rebooting during the ISSU upgrade. The bootloop was a two-in-one user error: the wrong confreg on the wrong processor, the SP.

BR

Mikael

Posted July 14, 2010 at 9:21 AM