ACE4710 started to continuously reboot

Unanswered Question
Mar 4th, 2010

Hi, one of our two ACE boxes in an FT group suddenly rebooted from the active state and now reboots continuously with the following error:

insmod: error inserting '/isan/bin/klm_octeon_device.klm': -1 No such device
error inserting /isan/bin/klm_octeon_device.klm
Daughter Card Not Found. Rebooting..
INIT: Sending all processes the TERM signal...


We are currently running software version A1(8.0a). The only solution/workaround I could find on the web is from the Cisco release notes, which say:

Workaround: Upgrade to the c4710ace-xxxx.bin software image to resolve the rebooting issue.


But how do I upload the image if the system doesn't boot? Anyone?

Thank you!

ciscocsoc Thu, 03/04/2010 - 09:49

Hi,

I haven't tried this but it might just work ...

First you need to establish a console session to the rebooting ACE.

http://www.cisco.com/en/US/docs/app_ntwk_services/data_center_app_services/ace_appliances/vA3_1_0/configuration/admin/guide/basiccfg.html#wp1300907

This should give you a boot menu

http://www.cisco.com/en/US/docs/app_ntwk_services/data_center_app_services/ace_appliances/4710/hardware/installation/guide/Install.html

If this were an ACE module, you'd be able to tell it to boot from an image on the supervisor card. I don't know if there is an equivalent network boot option. However, the first stage is to get you to a stable rommon prompt.

HTH

Cathy

jure.peternel Thu, 03/04/2010 - 10:45

Hi Cathy, thanks!

Come to think of it, there is an option in the GRUB boot menu to boot an ACE-APPLIANCE-RECOVERY-IMAGE.bin image. I will try this option tomorrow and see what I can do in the so-called recovery mode...

jure.peternel Fri, 03/05/2010 - 00:34

Nope, same error....

What if I pull out the CF card, copy the new image onto it, and edit GRUB from some other Linux box?

UPDATE:

Yes, it's pretty straightforward: take out the flash card, stick it in a Linux box, and mount both partitions. These are the contents of both:

[[email protected] /]# ll /mnt/usb0/
total 74380
-rw-rw-rw-    1 root     root     75862237 Jun 12  2008 ACE_APPLIANCE_RECOVERY_IMAGE.bin
drwxr-xr-x    2 root     root         1024 Apr 12  2007 grub
-rw-rw-rw-    1 root     root           23 Mar  5 09:52 reload_reason

[[email protected] /]# ll /mnt/usb1/
total 508992
-rw-rw-rw-    1 root     root    180912118 Jun 12  2008 c4710ace-mz.A1_8_0a.bin
drwx------    2 root     root        16384 Jun 12  2008 lost+found
-rw-r--r--    1 root     root     32505856 Jun 12  2008 TN-CERTKEY-STORAGE
-rw-r--r--    1 root     root     74448896 Jun 12  2008 TN-CONFIG
-rw-r--r--    1 root     root    209715200 Jun 12  2008 TN-COREFILE
-rw-r--r--    1 root     root     11534336 Jun 12  2008 TN-HOME
-rw-r--r--    1 root     root     11534336 Jun 12  2008 TN-LOGFILE

Replace the image with the new one, edit the GRUB config (partition 0, /grub/menu.lst), unmount, put the card back and reboot...
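
For anyone repeating this, the copy-and-edit step could be scripted roughly as below. This is only a sketch under assumptions: the function name swap_ace_image and the file name c4710ace-NEW.bin are placeholders of mine, the mount points /mnt/usb0 and /mnt/usb1 are simply the ones from the listing above, and I am assuming menu.lst refers to the image by its file name. It does not mount or unmount anything; do that by hand before and after.

```shell
#!/bin/sh
# Sketch only. swap_ace_image is a hypothetical helper, not a Cisco tool.
#
# swap_ace_image BOOT_DIR IMAGE_DIR OLD_IMAGE NEW_IMAGE_PATH
#   BOOT_DIR        mounted first partition (contains grub/menu.lst)
#   IMAGE_DIR       mounted second partition (contains the c4710ace-*.bin image)
#   OLD_IMAGE       image file name currently referenced in menu.lst
#   NEW_IMAGE_PATH  path to the new image on the Linux box
swap_ace_image() {
    boot_dir=$1
    image_dir=$2
    old_image=$3
    new_image_path=$4
    menu="$boot_dir/grub/menu.lst"
    new_image=$(basename "$new_image_path")

    [ -f "$menu" ] || { echo "menu.lst not found under $boot_dir/grub" >&2; return 1; }
    [ -f "$new_image_path" ] || { echo "new image not found: $new_image_path" >&2; return 1; }

    # Keep a backup of the GRUB config before changing it.
    cp -p "$menu" "$menu.bak" || return 1

    # Copy the new image onto the card; the old image file is left in place.
    cp "$new_image_path" "$image_dir/$new_image" || return 1

    # Escape regex metacharacters in the old name and '&', '/', '\' in the
    # new name so sed treats both file names literally.
    old_re=$(printf '%s' "$old_image" | sed 's/[]\/$*.^[]/\\&/g')
    new_re=$(printf '%s' "$new_image" | sed 's/[&/\]/\\&/g')

    # Rewrite every reference to the old image name in menu.lst.
    sed "s/$old_re/$new_re/g" "$menu.bak" > "$menu" || return 1
}

# Example call (placeholder image name, mount points from the listing above):
#   swap_ace_image /mnt/usb0 /mnt/usb1 c4710ace-mz.A1_8_0a.bin /tmp/c4710ace-NEW.bin
```

The sketch keeps the old image on the card and leaves menu.lst.bak next to the edited file, so the change can be undone. If the second partition is too full for two images, delete the old one first, which is what "replace" above amounts to.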

Unfortunately, my problem remains even with the new image...

Cristobal Armij... Thu, 06/30/2011 - 10:52

Hi,

I have the same problem. Does anyone know how to fix it?


                                                                                    
kernel=(hd0,1)/c4710ace-mz.A1_8_0a.bin ro root=LABEL=/ auto console=ttyS0,9600n8 quiet bigphysarea=32768
   [Linux-bzImage, setup=0x1400, size=0xac869f6]

INIT: version 2.85 booting
/mnt/cf/TN-CONFIG on /TN-CONFIG type ext3 (rw,sync,loop=/dev/loop0)
/mnt/cf/TN-CERTKEY-STORAGE on /TN-CERTKEY-STORAGE type ext3 (rw,sync,loop=/dev/loop1)
/mnt/cf/TN-LOGFILE on /TN-LOGFILE type ext3 (rw,sync,loop=/dev/loop2)
/mnt/cf/TN-HOME on /TN-HOME type ext3 (rw,sync,loop=/dev/loop3)
/mnt/cf/TN-COREFILE on /TN-COREFILE type ext3 (rw,sync,loop=/dev/loop4)
insmod: error inserting '/isan/bin/klm_octeon_device.klm': -1 No such device
error inserting /isan/bin/klm_octeon_device.klm
Daughter Card Not Found. Rebooting..
INIT: Sending all processes the TERM signal...
Sending all processes the KILL signal...
Syncing hardware clock to system time
Unmounting loopback filesystems:
Unmounting file systems:
Please stand by while rebooting the system...
Restarting system.

thanks

Cristobal

I have the same problem; does anyone know how to solve it?
Cesar Roque Thu, 06/30/2011 - 16:19

Hi Cristobal,

It looks like a problem with the PCI card; most of the time, this error is related to a hardware issue.

kchoates1 Thu, 05/17/2012 - 11:23

I'm having the same issue, but what is funny is that it occurred on two 4710s at the same time.

Surya ARBY Fri, 05/18/2012 - 05:01

The Octeon mentioned in the error is the chipset used for the data plane; if it's no longer recognized, the cause is a hardware issue.

Was there any electrical outage that could have burnt some electronics?
