Failed CMM upgrade: Module won't boot...

Answered Question
May 20th, 2008

When attempting to upgrade the IOS version on a CMM module, the session hung while erasing the bootflash. Now the module won't boot, even after being removed and re-inserted. Syslog contains these entries:

2y7w: %ONLINE-SP-6-DNLDFAIL: Module 8, Proc. 0, Runtime image download failed because of image not found

2y7w: %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Module Failed SCP dnld)

Per other posts and TAC Cases, connecting a console cable to the on-board RJ45 doesn't help because the card is quickly powered off by the supervisor.

There are some disaster recovery instructions for CMM software upgrades, but they are specific to CatOS and we are running IOS. Document found here:

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/hardware/Config_Notes/78_14107.html

The recovery document mentions setting the power management bits to prevent the download mechanism from triggering when the card is reset. Is there a way to do this with IOS as well?

Any recovery suggestions most welcome! Thanks...

Scott


Michael Owuor Tue, 05/20/2008 - 19:54

Hi Scott,

Disaster Recovery for Supervisor Engines running Cisco IOS.

For systems that use Cisco IOS software on the supervisor engine, perform these steps:

Note: This procedure requires Supervisor Engine software release 12.2(18)SXF or later.

Step 1 Before performing the disaster recovery, remove the port adapters from the CMM.

Step 2 Enter the copy tftp [device] command to copy the golden image for the CMM to the supervisor engine Flash memory where [device] can be any flash file system on the supervisor.

Step 3 Under the CLI Config mode, enter the tftp-server [device]:[golden-image] alias [alias-name] command to translate the file name of the downloaded file. The [golden-image] is the file to be downloaded, and the [device] can be any flash file system on the supervisor.

Step 4 Under the CLI exec mode, enter the hw-module module [mod] boot [pm_option] command to set the power management option for the CMM. The appropriate [pm_option] corresponds to the Supervisor Engine type in use.

Step 5 Enter the hw-module module [mod] reset command to reset the module. When the CMM powers up, disaster recovery is complete.

Step 6 After you complete the disaster recovery, under CLI config mode, enter the no tftp-server [device]:[golden-image] command to remove the filename translation.

Step 7 Set the power management option for the CMM to zero (0) by entering the command hw-module module [mod] boot config-register. This step is necessary to prevent the download mechanism from triggering every time that the CMM is reset.

Let us know how this works.

Regards,

Michael.

goodwinscottns Wed, 05/21/2008 - 19:46

Thanks Michael. I tried the steps above but have run into a few snags. Here's the brief summary:

To rule out the IOS version as a factor, we upgraded to Version 12.2(18)SXF13.

Active Sup-720 is in slot 6.

6-port T1 module has been removed from CMM base card.

CMM base card reinserted into Slot 8.

CMM IOS image file wscmm-i6k9s-mz.124-3j.bin copied to disk0:

Added tftp alias entry to the running configuration: "tftp-server disk0:wscmm-i6k9s-mz.124-3j.bin alias cmm_bootme"

The next question was what the correct value for the "pm_option" is. In this document (http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/hardware/Config_Notes/78_14107.html) we found this guideline:

"Enter the set module power down mod pm_option command to power down the CMM. Set the power management bits (pm_option) to 10 if the supervisor engine is in slot 5 or slot 7. Set the power management bits to 11 if the supervisor engine is in slot 6 or slot 8."

So, because active supervisor is in slot 6, we set hw-module boot option to "11". (hw-module module 8 boot 11)

Because the module had already been placed in a power down state by the supervisor, it seemed we needed some way to cause it to reboot, else the software would never load.

Card in slot 8 was powered on this way: (power enable module 8).

After about a minute, the same error messages are present:

2y7w: %ONLINE-SP-6-DNLDFAIL: Module 8, Proc. 0, Runtime image download failed because of image not found

2y7w: %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Module Failed SCP dnld)

So it appears that the CMM module still can't find an image from which to boot.

I see the point of creating an alias for the bootable image, but it seems that there must also be some way to specify that the card in slot 8 should boot from alias name: "cmm_bootme". How does the CMM module know to boot from this alias?

Thanks again for your help. Information on this is very difficult to find, and I've been digging all day...

Scott

Correct Answer
Michael Owuor Thu, 05/22/2008 - 19:29

Hi again Scott,

I think that when the CMM is rebooting in disaster recovery mode, it looks for a specific file name alias.

So in your case, the tftp server statement should be:

'tftp-server disk0:wscmm-i6k9s-mz.124-3j.bin alias disk0:ws-svc-cmm'

Would you please test again with this configuration?

Regards,

Michael.
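
For reference, the recovery procedure with the values from this thread can be sketched as follows. This is only a consolidation of the commands quoted above (Sup-720 in slot 6, CMM in slot 8, image on disk0:), not a verified procedure. The pm_option value 11 comes from the CatOS guideline Scott quoted, and the Step 7 command is copied as posted.

```
! On the supervisor: copy the CMM golden image to flash (Step 2)
copy tftp disk0:
!
! Map the image to the alias the CMM looks for in recovery mode (Step 3)
configure terminal
 tftp-server disk0:wscmm-i6k9s-mz.124-3j.bin alias disk0:ws-svc-cmm
 end
!
! Set the power management option and reset the module (Steps 4 and 5)
hw-module module 8 boot 11
hw-module module 8 reset
!
! If the supervisor has already powered the slot off, Scott brought it
! back with the following (the thread does not say from which mode):
!   power enable module 8
!
! Once the CMM is up, remove the filename translation (Step 6)
configure terminal
 no tftp-server disk0:wscmm-i6k9s-mz.124-3j.bin
 end
!
! Clear the power management option, as quoted in Step 7
hw-module module 8 boot config-register
```

The alias is the part that mattered here: with an arbitrary name such as cmm_bootme the download failed with "image not found", and with ws-svc-cmm it succeeded.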

goodwinscottns Fri, 05/23/2008 - 08:48

Michael,

Brilliant, that was it...

I think I have discovered a different problem in the process. It seems I cannot write anything to the bootflash on the CMM module. A directory listing shows it to be empty. When trying to copy any file (no matter how small), I get a "No space left on device" error. If I try to erase bootflash:, the module hangs and a reboot is necessary.

"Format" is not an option. Any ideas on how I can determine if the bootflash is bad? On the CMM module, is the bootflash a physical memory card that can be replaced?

Thanks again for your expert advice!

Scott

Michael Owuor Fri, 05/23/2008 - 09:48

Hi Scott,

Thanks for the update. Glad it worked. Regarding the second issue, try these steps:

1. Erase the contents of the bootflash by explicitly entering the 'erase bootflash:' command.

2. Once completed, copy the golden image back onto the bootflash using the 'copy tftp bootflash' command.

3. Once completed, test to see if you can now write a file (such as the running configuration) to the bootflash.

Regards,

Michael.

goodwinscottns Sat, 05/24/2008 - 12:10

Thanks Michael.

I have tried a few times to erase the bootflash, but always get the same result. The erase process seems to hang indefinitely. The only way out is the escape character.

Here's what is displayed:

"Erasing the bootflash filesystem will remove all files! Continue? [confirm]y

Erasing device..." [Process hangs here]

After escaping from this, if I try to copy a file to bootflash, on the first attempt there is an error message to say the device is in exclusive use:

"Loading wscmm-i6k9s-mz.124-3j.bin from 172.16.40.222 (via GigabitEthernet1/0): !

%Error opening bootflash:wscmm-i6k9s-mz.124-3j.bin (Device in exclusive use)"

If I reattempt the file copy a second time, I get this message:

"Loading wscmm-i6k9s-mz.124-3j.bin from 172.16.40.222 (via GigabitEthernet1/0): !!

%Error copying bootflash:wscmm-i6k9s-mz.124-3j.bin (No space left on device)"

Available space does not seem to be the actual problem as the bootflash reports 33,548,504 bytes free and the file size is 13,287,424 bytes.

"bootflash directory:

No files in bootflash

[5924 bytes used, 33548504 available, 33554428 total]

32768K bytes of processor board bootflash (Read/Write)"

Could the flash memory be faulty?

Thanks again!

Scott

guiter.carpio Sat, 04/04/2009 - 22:31

I don't know if this will still help at this point, but you need to use the squeeze command.

After deleting the unwanted images, enter the squeeze bootflash: command to recover the free memory.
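
A minimal sketch of that sequence on the CMM. It assumes the bootflash is a file system on which deleted files keep occupying space until squeezed, and that the squeeze command is available on this module. The image name is a placeholder.

```
! List all files, including any that are marked deleted
dir /all bootflash:
!
! Delete an unwanted image (<old-image> is a placeholder)
delete bootflash:<old-image>
!
! Reclaim the space still held by deleted files
squeeze bootflash:
```

Note that Scott's listing above shows only 5924 bytes used, so this may not explain the "No space left on device" error in his case.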
