Mass IOS Upgrade Failed with LMS 3.2/RME 4.3.0

Nov 4th, 2009

Upgrade of several 1841s failed, leaving sites down.

RME says:

Processing Device : 1841.SITE1

Start Time : Wed Nov 04 00:35:26 CST 2009

Device is locked for exclusive access.

The supported protocols for image transfer are: TFTP

Going to perform delete operation based on the patterns in the List [.*\.bin, html, snmp, help, .*\.tar]

Files that will be excluded from deletion are: [vlan\.dat, config\.text, config\.txt, private-config\.text, private-config\.txt, multiple-fs, env_vars, system_env_vars, info\.ver, info]

Storage from which files are to be deleted is: flash

Available Flash files: c1841-ipbasek9-mz.100msec_DSR.BIN,

c1841-ipbasek9-mz.100msec_DSR.BIN -> Dosen't match with the patterns in the delete list.

Delete is successful but Squeeze failed, Erasing the Flash...

Failed to erase the flash.

Cause: miscOpInvalidOperation : command invalid or command-protocol-device combination unsupported

Copying c1841-ipbasek9-mz.124-24.T1.bin from Software Repository to tftp-specific directory.

Copied successfully to D:/PROGRA~1/CSCOpx/tftpboot/rep_sw_2710718669462605744

Loading image file to flash device : rep_sw_2710718669462605744 --> flash:c1841-ipbasek9-mz.124-24.T1.bin using TFTP

Loading image file to flash device : rep_sw_2710718669462605744 --> flash:c1841-ipbasek9-mz.124-24.T1.bin using TFTP

Completed loading image file to Flash device.

Total time to copy the image - 1 Hour 12 Minutes 28 Seconds

TFTP fallback image c1841-ipbasek9-mz.100msec_DSR.BIN has been moved to tftpboot directory.

Retrieving configuration file from the device...

Current device configuration is copied to D:\PROGRA~1\CSCOpx\files\rme\jobs\swim\2744\1841.SITE1_Config

Updating configuration file during software upgrade process...

Successfully updated.

Device will be rebooted.

SWIM1114: The device cannot be reached after the reboot. Number of attempts to verify the device status has exceeded the maximum retry count.

Either an invalid image has been loaded onto the device or there are network connectivity problems.

Use the device console to determine if the device has reloaded with the desired image.

Device is unlocked.

Device Upgrade Result : Failed

End Time:Wed Nov 04 01:56:30 CST 2009
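
One detail in the manifest is easy to check offline: the delete list contains `.*\.bin`, yet the existing image ends in `.BIN` and is reported as not matching. The following is a minimal Python sketch, assuming the list entries are regular expressions applied case-sensitively to the whole file name. The log does not show how RME actually applies them, so the helper `matching_patterns` and its matching rule are illustrative only.

```python
import re

# Patterns and file name copied from the job manifest above.
DELETE_PATTERNS = [r".*\.bin", "html", "snmp", "help", r".*\.tar"]
FLASH_FILE = "c1841-ipbasek9-mz.100msec_DSR.BIN"


def matching_patterns(name, patterns, flags=0):
    """Return the patterns that match the whole file name.

    Assumption: whole-name, regex-based matching. RME's real rules
    are not visible in the log.
    """
    return [p for p in patterns if re.fullmatch(p, name, flags)]


# Case-sensitive, as the "doesn't match" line in the log suggests.
case_sensitive = matching_patterns(FLASH_FILE, DELETE_PATTERNS)

# Same check ignoring case, for comparison.
case_insensitive = matching_patterns(FLASH_FILE, DELETE_PATTERNS, re.IGNORECASE)

print("case-sensitive matches:  ", case_sensitive)
print("case-insensitive matches:", case_insensitive)
```

Under that assumption, the uppercase extension alone would account for the "doesn't match" line. It says nothing about the later squeeze and erase steps, which the manifest reports separately.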

Joe Clarke Wed, 11/04/2009 - 14:20

There isn't enough information here to know what failed. What state are these devices in? That is, are they stuck in ROMMON? Did they boot but lose their config? If the router is in ROMMON, is it because the image on flash is corrupt?

drew.salmon Wed, 11/04/2009 - 14:32

Thanks for the follow up Joe. Unfortunately these sites are in California, and I am in Texas. There are 43 of them total, where the WAN is down, so we have no remote access.

What I am being told by the techs on the ground is that the router is in ROMMON and the flash is empty. Once the CF is replaced with one that has the new image, they are able to boot, and all is well.

I have to be able to provide a reason for failure to management. What else do you need from me to be able to make that determination?

Joe Clarke Wed, 11/04/2009 - 14:38

After the fact, nothing will really say for certain why the transfer failed. SWIM debug logs (with debugging enabled) would be required to rule out a problem with RME.

Yes, RME will erase the flash, but according to this job manifest, it copied the new image back to flash. Based on this, the only guess I have is that the code running on the device is buggy, and the copy failed, but the MIB never indicated it failed.

drew.salmon Wed, 11/04/2009 - 15:18

Is it possible that it is a file system problem? We noted the output below. The first site failed its upgrade, but the second site was fine.

SITE1.ROUTER>sh flash all

-#- --length-- -----date/time------ path

1 28820760 Sep 17 2008 15:26:18 -05:00 c1841-adventerprisek9-mz-609147483-debug.bin

2 19466016 Sep 19 2008 21:50:44 -05:00 c1841-ipbasek9-mz.100msec_DSR.BIN

79851520 bytes available (48324608 bytes used)

******** ATA Flash Card Geometry/Format Info ********

ATA CARD GEOMETRY

Number of Heads: 8

Number of Cylinders 980

Sectors per Cylinder 32

Sector Size 512

Total Sectors 250880

ATA CARD FORMAT

Number of FAT Sectors 123

Sectors Per Cluster 8

Number of Clusters 31293

Number of Data Sectors 250737

Base Root Sector 357

Base FAT Sector 111

Base Data Sector 389

ATA MONLIB INFO

Image Monlib size = 55592

Disk monlib size = 56832

Name = piptom-atafslib-m

Monlib Start sector = 2

Monlib End sector = 104

Monlib updated by = C3745-IPBASE-M12.4(20050413:175129)

Monlib version = 1

########################################################

SITE2.ROUTER#sh flash all

-#- --length-- -----date/time------ path

11 18445520 May 09 2007 19:17:04 c1841-ipbasek9-mz.124-11.T.bin

12 27560620 Nov 04 2009 06:25:20 c1841-ipbasek9-mz.124-24.T1.bin

18001920 bytes available (46010368 bytes used)

******** ATA Flash Card Geometry/Format Info ********

ATA CARD GEOMETRY

Manufacturer Name

Model Number STI Flash 7.4.0

Serial Number STI J189307089085918

Firmware Revision 02-10-06

Number of Heads 8

Number of Cylinders 490

Sectors per Cylinder 32

Sector Size 512

Total Sectors 125440

ATA PARTITION 1 INFO

Start Sector 32

Number of Sectors 125408

Size in Bytes 64208896

File System Type FAT16

Number of FAT Sectors 62

Sectors Per Cluster 8

Number of Clusters 15628

Number of Data Sectors 125024

Base FAT Sector 111

Base Root Sector 235

Base Data Sector 267

ATA MONLIB INFO

Image Monlib size 117868

Disk Monlib Size 52248

Disk Space Available 56320

Name piptom-atafslib-m

Start sector 2

End sector 104

Updated By C1841-IPBASE-M12.4(1c)

Version 1
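
The two listings can be sanity-checked with a little arithmetic. The sketch below is a rough Python check, assuming the usual relationships hold for these fields: total sectors = heads x cylinders x sectors per cylinder, and available + used bytes = clusters x sectors per cluster x sector size. The field values are copied from the outputs above; the dictionary layout and the function name `check_card` are made up for illustration.

```python
SECTOR_SIZE = 512            # "Sector Size" in both outputs
NEW_IMAGE_BYTES = 27560620   # length of c1841-ipbasek9-mz.124-24.T1.bin on SITE2

# Values copied from the two "sh flash all" outputs above.
CARDS = {
    "SITE1": {"heads": 8, "cylinders": 980, "sectors_per_cylinder": 32,
              "total_sectors": 250880, "sectors_per_cluster": 8,
              "clusters": 31293, "available": 79851520, "used": 48324608},
    "SITE2": {"heads": 8, "cylinders": 490, "sectors_per_cylinder": 32,
              "total_sectors": 125440, "sectors_per_cluster": 8,
              "clusters": 15628, "available": 18001920, "used": 46010368},
}


def check_card(card):
    """Compare the reported totals with values derived from the geometry."""
    chs_sectors = card["heads"] * card["cylinders"] * card["sectors_per_cylinder"]
    data_bytes = card["clusters"] * card["sectors_per_cluster"] * SECTOR_SIZE
    return {
        "geometry_consistent": chs_sectors == card["total_sectors"],
        "raw_bytes": chs_sectors * SECTOR_SIZE,
        "data_bytes": data_bytes,
        "space_consistent": data_bytes == card["available"] + card["used"],
    }


results = {site: check_card(card) for site, card in CARDS.items()}
for site, result in results.items():
    print(site, result)

# Free space on the SITE1 card compared with the size of the new image.
site1_has_room = CARDS["SITE1"]["available"] >= NEW_IMAGE_BYTES
print("SITE1 free space covers the new image:", site1_has_room)
```

Both cards pass both checks: the SITE1 card works out to 128450560 raw bytes and the SITE2 card to 64225280, and in each case available plus used bytes equals the cluster-derived data area. By these numbers the geometry and space accounting look self-consistent on both cards.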

Joe Clarke Wed, 11/04/2009 - 15:41

It's certainly possible that a flash problem caused this. But I see nothing here that indicates that is the case.

drew.salmon Wed, 11/04/2009 - 15:48

Thanks Joe.

Just curious, why does one CF show a FAT16 file system type, while the other does not show a file system type at all?

Joe Clarke Wed, 11/04/2009 - 15:53

My guess is because they are different types of flash chips. I don't know for certain.

Martin Ermel Thu, 11/05/2009 - 01:09

I am not sure about the "ATA MONLIB INFO" part in detail, but there is a difference in the following parameter:

Monlib updated by = C3745-IPBASE-M12.4(20050413:175129)

whereas the second router states:

Updated By C1841-IPBASE-M12.4(1c)

Perhaps this is an indicator of a possible cause. The Monlib sizes are different as well.

It is just a quick thought, but I have the feeling the first router has a flash card inserted that was formatted on a 3745 and is not accessible on your 1841. So when booting, the router tries to initialize the flash without success and stops in ROMMON.

http://www.cisco.com/en/US/docs/ios/12_3t/12_3t7/feature/guide/gtmonlib.html
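
With 43 sites involved, the comparison above could be scripted against collected `sh flash all` output. This is only a sketch of that idea in Python: it assumes the "updated by" value always starts with a platform token such as `C3745` followed by a hyphen, which is true of the two samples in this thread but not confirmed beyond them. The function `platform_of` and the dictionary layout are hypothetical.

```python
import re

ROUTER_PLATFORM = "C1841"  # both routers in this thread are 1841s

# "Monlib updated by" / "Updated By" values copied from the outputs above.
MONLIB_UPDATED_BY = {
    "SITE1": "C3745-IPBASE-M12.4(20050413:175129)",
    "SITE2": "C1841-IPBASE-M12.4(1c)",
}


def platform_of(updated_by):
    """Extract the leading platform token, e.g. 'C3745'.

    Assumption: the value starts with letters + digits followed by '-'.
    Returns None if the value does not look like that.
    """
    match = re.match(r"([A-Za-z]+\d+)-", updated_by)
    return match.group(1).upper() if match else None


# Sites whose card was last touched by a different platform's image.
mismatched = [
    site for site, value in MONLIB_UPDATED_BY.items()
    if platform_of(value) != ROUTER_PLATFORM
]

for site, value in MONLIB_UPDATED_BY.items():
    print(f"{site}: monlib written by {platform_of(value)} ({value})")
print("cards worth a closer look:", mismatched)
```

A mismatch only flags a card for inspection. It does not show that the card is unreadable on an 1841, which remains a hypothesis in this thread.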
