**Updated 16 February 2011** IBM 7816-I4 782x-I4 filesystem errors

Sep 9, 2010 7:51 AM


Summary

Cisco Media Convergence Servers 7816-I4, 7825-I4 (and IBM x3250-M2  equivalent) and 7828-I4 have recently been experiencing technical issues.  These servers are used by Cisco Unified Communications Manager and various other Cisco Collaboration software products.

The symptom is that the file system on the local disk drives goes into read-only mode, which can manifest as application services going down, the server becoming unresponsive via the network or the management interfaces, or, in the worst case, data corruption necessitating a reinstall and restore from backup.

Root cause has been identified by Cisco and its suppliers as a disk drive issue stemming from interaction with system firmware. 

Field Notice 63374 has been published and includes more technical details regarding this  issue.  Cisco and its suppliers are committed to high quality and  apologize for any disruptions or impact caused by this issue.

Solution

The file-system read-only issue that has recently been affecting server models MCS-7816-I4, MCS-7825-I4, and MCS-7828-I4 (or their IBM equivalents) in the field is addressed by CSCti52867 - "IBM 7816-I4 and 782x-I4 READONLY file system".

The fix for CSCti52867 is now available and requires the application of two patch files.  Install both of these patch files in the order listed below.

1. First install ciscocm.ibm-diskex-1.0.cop.sgn 
     The Readme file ciscocm.ibm-diskex-1.0.cop.sgn-Readme.html includes installation instructions for this .cop.sgn.

     Install this utility only when the output of the show hardware CLI command indicates the array is in a healthy state.

     If the filesystem on your server has never gone read-only, this step is optional.
2. Next install Cisco-HDD-FWUpdate-3.0.1-I.ISO .
     The Readme file Cisco-HDD-FWUpdate-3.0.1-I.Readme.pdf includes installation instructions for this ISO.

     This installer is completely independent of the OS installed on the server.

Note:  Installing the FWUpdate v3.0(1) or later will get you firmware with the fix for this defect.  It is always recommended that you apply the latest FWUCD available for your server.

Refer to the Release Note of CSCti52867 and the Readme file for each of the above mentioned patch files for more details.
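
As a rough illustration of the health check called for in step 1, the Python sketch below scans saved show hardware output and reports whether the array looks healthy. It is not a Cisco tool. The function name array_is_healthy is hypothetical, and treating a volume status of "Okay (OKY)" plus a drive state of "Online (ONL)" as the definition of healthy is an assumption based on the sample output in this document. It also assumes each field appears on a single "Label : value" line. The Readme files remain the authority.

```python
import re


def array_is_healthy(show_hardware_text: str) -> bool:
    """Return True when every volume reports Okay and every drive reports Online.

    Assumption: "healthy" means "Okay (OKY)" volumes and "Online (ONL)" drives,
    as seen in the sample output; other states are treated as unhealthy.
    """
    volume_states = re.findall(r"Status of volume\s*:\s*(.+)", show_hardware_text)
    drive_states = re.findall(
        r"^\s*State\s*:\s*(.+)", show_hardware_text, flags=re.MULTILINE
    )
    if not volume_states or not drive_states:
        # Nothing recognisable was parsed, so do not assume the array is healthy.
        return False
    volumes_ok = all(state.strip().startswith("Okay") for state in volume_states)
    drives_ok = all(state.strip().startswith("Online") for state in drive_states)
    return volumes_ok and drives_ok
```

To try it, paste the CLI output into a text file, read it into a string, and pass that string to array_is_healthy.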

Symptoms

  • The file system goes READONLY. CUCM services may then go down and the server may become "unresponsive", meaning that it is not possible to SSH into the server, log in to the console, or reach its web pages, although it may still respond to pings.
  • Traces from all services stop writing (including syslog)
  • You see the following error on the server console
EXT3-fs error (device sda6) in start_transaction: Journal has aborted


  • If you are able to log in to the server via SSH, the following output may be displayed.
Last login: Mon Aug  X XX:XX:XX XXXX from XXX.XXX.XXX.XXX
Command Line Interface is starting up, please wait ...
java.io.FileNotFoundException: /var/log/active/platform/log/cli.bin (Read-only file system)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    :
    :
    :
        at org.apache.log4j.Category.info(Category.java:674)
        at sdMain.main(sdMain.java:611)
log4j:ERROR No output stream or file set for the appender named [CLI_LOG].

   Welcome to the Platform Command Line Interface
    WARNING:
        The /common file system is mounted read only.  <<<<<<<<<<<<<<<<<<

        Please use Recovery Disk to check the file system using fsck.
admin:
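
The symptoms above leave recognisable text in console captures and logs. The following Python sketch searches saved text for those strings. The three patterns are taken from the examples in this section; the function name find_read_only_symptoms and the idea of scanning a saved capture are assumptions for illustration, not part of any Cisco procedure.

```python
import re
from typing import List

# Patterns taken from the console and CLI examples in this section.
READ_ONLY_SIGNATURES = [
    re.compile(r"EXT3-fs error \(device \w+\)"),
    re.compile(r"\(Read-only file system\)"),
    re.compile(r"file system is mounted read only"),
]


def find_read_only_symptoms(text: str) -> List[str]:
    """Return every line of text that matches one of the known signatures."""
    hits = []
    for line in text.splitlines():
        if any(pattern.search(line) for pattern in READ_ONLY_SIGNATURES):
            hits.append(line.strip())
    return hits
```

An empty result does not prove the server is unaffected, because traces stop being written once the file system is read-only.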



How to determine the current version of firmware on the hard drive

  • For MCS-7825-I4 and MCS-7828-I4, running Cisco UCM 7.1 and above, you can use the CLI command 'show hardware' to verify the firmware version.



admin:show hardware

HW Platform       : 7828I4
Processors        : 1
Type              : Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
CPU Speed         : 2660
Memory            : 8192 MBytes
Object ID         : 1.3.6.1.4.1.9.1.899
OS Version        : UCOS 4.0.0.0-34
Serial Number     : KQRBVVB

RAID Version      :
Raid firmware version: 1.26.81.00
Raid Bios version: 6.16.00.00

BIOS Information  :
IBM IBMBIOSVersion1.44-[M9E144AUS-1.44]- 06/11/2009

RAID Details      :
LSI Logic IR Configuration Utility 2.00.15
Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type                       : SAS1064E
  BIOS version                          : 6.16.00.00
  Firmware version                      : 1.26.81.00
  Channel description                   : 1 Serial Attached SCSI
  Initiator ID                          : 112
  Maximum physical devices              : 62
  Concurrent commands supported         : 266
  Slot                                  : 0
  Bus                                   : 1
  Device                                : 0
  Function                              : 0
  RAID Support                          : Yes
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
IR volume 1
  Volume ID                             : 7
  Status of volume                      : Okay (OKY)
  RAID level                            : 1
  Size (in MB)                          : 237464
  Physical hard disks (Target ID)       : 9 8
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
Initiator at ID #112
Target on ID #8
  Device is a Hard disk
  Enclosure #                           : 1
  Slot #                                : 1
  Target ID                             : 8
  State                                 : Online (ONL)
  Size (in MB)/(in sectors)             : 238475/488397168
  Manufacturer                          : ATA
  Model Number                          : WD2502ABYS-23B7A
  Firmware Revision                     : 3B04
  Serial No                             : WD-WCAT1D712130
  Drive Type                            : SATA
Target on ID #9
  Device is a Hard disk
  Enclosure #                           : 1
  Slot #                                : 0
  Target ID                             : 9
  State                                 : Online (ONL)
  Size (in MB)/(in sectors)             : 238475/488397168
  Manufacturer                          : ATA
  Model Number                          : WD2502ABYS-23B7A
  Firmware Revision                     : 3B04
  Serial No                             : WD-WCAT1D723848
  Drive Type                            : SATA
------------------------------------------------------------------------
Enclosure information
------------------------------------------------------------------------
  Enclosure#                            : 1
  Logical ID                            : 5005076b:0648afc0
  Numslots                              : 4
  StartSlot                             : 0
  Start TargetID                        : 0
  Start Bus                             : 0
------------------------------------------------------------------------


The Model Number and Firmware Revision fields of each drive are the information you need.  This output shows a server with two drives with model number WD2502ABYS on 3B04 firmware.  These drives should be upgraded as soon as possible.

  • For MCS-7825-I4 and MCS-7828-I4 models running Cisco UCM versions previous to 7.1, as well as any version of Cisco UCM running on a MCS-7816-I4 model server, you must download and boot off of a CD burned with Cisco-HDD-FWUpdate-3.0.1-I.ISO (refer to the "Solution" section for links to download the ISO and readme).  Upon successful boot of the Cisco-HDD-FWUpdate-3.0.1-I.ISO CD, you will be presented with the current HDD FW version as well as the opportunity to upgrade to HDD FW version 02.03B06.
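
Where show hardware is available, the manual check of model number and firmware revision can be sketched in Python as follows. This is an illustration only, under several assumptions: the output is saved as text with each field on one "Label : value" line, the affected drives are identified by the WD2502ABYS model prefix, 3B06 is the fixed revision, and revisions such as 3B04, 3B05 and 3B06 sort correctly as plain strings. The function name drives_needing_update is hypothetical.

```python
import re
from typing import Dict, List

AFFECTED_MODEL_PREFIX = "WD2502ABYS"  # model number shown in the sample output
FIXED_REVISION = "3B06"               # revision delivered by the firmware update


def drives_needing_update(show_hardware_text: str) -> List[Dict[str, str]]:
    """List drives of the affected model whose firmware is older than 3B06."""
    flagged = []
    # Everything after each "Target on ID #" marker describes one drive.
    for chunk in re.split(r"Target on ID #", show_hardware_text)[1:]:
        target = re.match(r"\d+", chunk)
        model = re.search(r"Model Number\s*:\s*(\S+)", chunk)
        firmware = re.search(r"Firmware Revision\s*:\s*(\S+)", chunk)
        if not (target and model and firmware):
            continue  # skip anything that does not look like a drive entry
        revision = firmware.group(1).upper()
        # Assumption: revisions compare correctly as plain strings.
        if model.group(1).startswith(AFFECTED_MODEL_PREFIX) and revision < FIXED_REVISION:
            flagged.append(
                {
                    "target": target.group(0),
                    "model": model.group(1),
                    "firmware": revision,
                }
            )
    return flagged
```

For the sample output above, both drives (targets 8 and 9) are on 3B04 and would be flagged.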

What should be done if filesystem issues persist after applying the patches?


As of 16 February 2011, if you encounter any further filesystem or hard drive issues after applying both the firmware update and the disk exerciser, you should replace the affected drive(s).

There are three ways you can replace the drive(s).

  1. If you have an MCS server with an active Cisco support contract, open a TAC service request.
  2. If you purchased the IBM x3250 M2 MCS equivalent and have an IBM support contract, contact IBM support.
  3. If you do not have any support contract for the server, you can purchase a new drive from Cisco or IBM.
    • The Cisco part number for the hard drive is HDD-7825-I4-250=.
    • Contact your IBM reseller to confirm the correct part number for the 250GB SATA simple-swap HD for the x3250 M2 server.

If you have any questions you can leave a comment on this document or send an email to ibm-fs-failure@cisco.com.

Sending the email will not generate a TAC SR but will allow us to collect more information.  This is an informal submission with no associated SLA; we will make every effort to follow up on submissions but cannot guarantee a response.

    Related Defects


    Related Links


    Comments

    Phillip Ratliff Fri, 10/15/2010 - 13:24

    If you tried the original firmware update on a 7816 and it didn't work, a new one has been posted that will.  The links in the document all point to the new one.

    gabriel.caclin Mon, 10/18/2010 - 04:11

    This document talks about Linux-based servers, but what happens with Windows-based servers like Cisco UCCX or Cisco Unity?

    I have a number of these servers at my customers. I have hit this bug twice on CUCM, but I am afraid of seeing it on UCCX and Unity servers, which are Windows-based...

    Phillip Ratliff Mon, 10/18/2010 - 08:10 (reply to gabriel.caclin)

    This is a very good question.

    I don't think we've seen it yet on any Windows based system.  My personal guess is because Windows doesn't respond to the underlying hardware issue or timeout in the same way Linux does.  The filesystem going readonly is the OS' way of protecting itself in response to an issue with the disk subsystem.  Windows may not have a similar protection mechanism.

    jjohnson1980 Wed, 11/17/2010 - 11:29 (reply to Phillip Ratliff)

    Have there been any new findings on this? I would love to be able to disregard this for the few servers I have in the field using UCCX, but I cannot justify that based on a guess. Thanks!

    shkirby Wed, 11/17/2010 - 11:58 (reply to jjohnson1980)

    There have been no reports of server models affected by this issue (MCS-7816-I4, or MCS-782X-I4) experiencing this issue when running Windows based operating systems.

    d.haeni Wed, 02/16/2011 - 01:03 (reply to Phillip Ratliff)

    Ryan,

    We recently had an issue with a UCCX 7.x installation, running on a 7816-I4 (and Windows, of course):

    - Windows Event Logs every now and then showed "bad blocks", and the server restarted itself automatically

    - In one case, the server was unresponsive and manually had to be powered off and on in order to restore its function.

    - I attributed it to a bad Drive and had the HDD RMAed.

    My questions:

    - The info on this R/O filesystem issue with respect to Windows OS seems a bit vague. Any chance this was related to the issue we're discussing here?

    - Is HDD FW Upgrade supported on MCSes running Windows OS?

    - Would you even recommend it?

    - How do you accomplish this (step-by-step instructions)

    Thanks for your help

    Phillip Ratliff Wed, 02/16/2011 - 07:25 (reply to d.haeni)

    Thanks David I've updated the document with the bugs you cited.  Thanks for your patience working through the UCCX issues.

    We never saw any complaints of this issue on a Windows server so I cannot confirm or refute that the issue you saw was due to this problem or not.  I would encourage anyone with one of these servers to apply the hard drive firmware update regardless of whether you have seen issues.  The installer is a self contained patch utility from IBM that does not rely on any data on the hard drives.  It can be run on a server with no OS at all.

    If you are seeing bad blocks reported on a hard drive from Windows then I would replace the drive regardless of firmware.  You can also confirm using the IBM DSA utility if the drive is showing SMART errors.

    athamm_DIEP Thu, 03/10/2011 - 04:45 (reply to Phillip Ratliff)

    Yesterday I also hit the "read-only" issue when trying to upgrade to v8.5.1 of CUCM, even though I had already installed the B06 firmware fix in December 2010. So I contacted Cisco TAC and, as described above, requested new HDDs to replace the old ones.

    In the meantime I would like to know if there is any possibility for me to get the CUCM working again until the new HDDs are delivered. I'm a little bit afraid of trying a simple restart of the CUCM, because at the moment about eighty percent of our phones are working, as they were logged in with Extension Mobility when the error occurred. After the restart maybe none of them will work properly, because login with ExMo is not available. I would be pleased if someone could give me any good advice to follow.

    Kind regards from Germany,
    Andi

    Phillip Ratliff Thu, 03/10/2011 - 06:24 (reply to athamm_DIEP)

    Most of the time simply rebooting the server will recover it.  If this doesn't work for you then a filesystem check may get you up enough to proceed but you may be stuck until you get the HDD(s) replaced.

    athamm_DIEP Thu, 03/10/2011 - 08:42 (reply to Phillip Ratliff)

    Phillip,

    I followed your advice and reset the CUCM (right at the front of the machine). Now ExMo and all other services are working again! So I'm very happy and have to thank you very, very much.

    The only thing that I'm missing now is the inactive partition in the "CUCM OS => Settings => Version" window. Normally there should be the option to switch to the displayed inactive partition with V.x.x.x- installed on it. Maybe you have some useful advice for this.

    Kind regards,

    Andreas

    Phillip Ratliff Thu, 03/10/2011 - 10:07 (reply to athamm_DIEP)

    If it happened during an upgrade then it's likely your inactive partition got wiped in preparation for it to become the new active partition.

    lpaquin Tue, 10/19/2010 - 05:42

    I got the same issue, but my hard disks already came from the factory with firmware 3B05.

    I ran the recovery disk and it found no errors.

    After a restart, I'm good for 20-24 hours and then the system goes into read-only file system mode.

    Is there anything else to try before a full reinstall?

    Thanks

    Phillip Ratliff Tue, 10/19/2010 - 06:56 (reply to lpaquin)

    Please gather all of the information outlined in the document and open a TAC SR so that we can get your information over to IBM.

    Phillip Ratliff Wed, 10/27/2010 - 09:06 (reply to silemire)

    Any chance of this issue affecting Cisco Unity Connection as well?

    This issue can hit any of these servers.  We haven't seen it on Windows-based Unity but have on UC, CUP, and CUCM.

    lpaquin Wed, 10/27/2010 - 09:57 (reply to Phillip Ratliff)

    Finally I did a full rebuild of my CUCM and it's stable now.

    Cisco told me a new hard disk firmware upgrade should be out very soon to fix that issue.

    Marcman-Cisco Mon, 11/01/2010 - 08:22 (reply to lpaquin)

    Unfortunately I have the same problem on my Cluster of two 7816I4 with CUCM6.1

    admin:utils create report hardware


             *** WARNING ***
    This process can take several minutes as the disk array, remote console,
    system diagnostics and enviromental systems are probed for their current
    values.


    Continue (y/n)?y
    Internal CLI failure
    admin:utils create report hardware


             *** WARNING ***
    This process can take several minutes as the disk array, remote console,
    system diagnostics and enviromental systems are probed for their current
    values.


    Continue (y/n)?y

    Password:
    Internal CLI failure


    For the password I used the one from my admin login.

    With RTMT I can see the following

    At Mon Nov 01 16:05:19 CET 2010 on node 192.168.0.2; the following SyslogSeverityMatchFound events generated:  SeverityMatch - Alert sudo:    admin : command not allowed ; TTY=unknown ; PWD=/usr/local/platform/bin ; USER=root ; COMMAND=/opt/ibm/dsa/ibm_utl_dsa_212p_rhel3_i386.bin -b -text -d /var/log/active/platform/log

    What can I do? Please Help.

    Phillip Ratliff Mon, 11/01/2010 - 08:33 (reply to Marcman-Cisco)

    For the 7816s the first thing you should do is reboot the server and hit F2 during POST.  This will get you into the drive's self diagnostic application.  If that report shows a failure then you need to replace the hard drive.

    Edit: Looks like the F2 option was removed from these servers.  We are looking for the replacement utility and the doc will be updated when it is known.

    For your issue running the CLI command, you are hitting the 3rd bug listed in the Related Defects above (CSCtg26203).  You can run the bootable DSA or open a TAC SR to use a workaround.

    Phillip Ratliff Mon, 11/01/2010 - 09:17 (reply to Marcman-Cisco)

    "replace the hard drive" - So it is not useful to upgrade the firmware?

    Which HDD is recommended by cisco?

    If you have a faulty hard drive then upgrading firmware will not help.

    The HDD replacement needs to go through Cisco TAC if you have an MCS server.  If you bought the server directly from IBM then you need to contact their support to arrange a replacement drive.  If you do not have a support contract on the server from IBM or Cisco then the IBM part number for the drive is listed on the IBM server solutions page at www.cisco.com/go/swonly.

    Marcman-Cisco Tue, 11/02/2010 - 03:08 (reply to Phillip Ratliff)

    There is no way with F2 during POST to get into the drive's self diagnostic application.

    We bought the MCS7816I4-K9-CMB2 with CUCM6.1 preinstalled.

    Can it be that the Dynamic System Analysis (DSA) is not preinstalled for this machine?

    Is using ibm_fw_dsa_3.10_anyos.iso the only way?

    Phillip Ratliff Tue, 11/02/2010 - 09:16 (reply to Marcman-Cisco)

    You are correct that the F2 option was removed.  We are looking for the replacement application and I will update the document when we have a known working procedure.

    The DSA utility is installed with 6.1 however a bug with the permissions prevents it from being run from the CLI.  TAC can use a remote support account to run the file manually but without going through TAC the only option is to use the bootable DSA utility.

    Marcman-Cisco Mon, 11/08/2010 - 05:01 (reply to Phillip Ratliff)

    I have used the bootable DSA utility.

    I ran "1c. Run HD Self Diagnostic test" and the result was that the test passed.


    I have collected the inventory:

    http://rapidshare.com/files/429589229/4194PBP_KQVBBLN_20101105-125520.txt

    But the version of the HDD firmware seems to be Revision 02.0, so it does not look much like yours...

    I had a look at the system logs (messages) from the day of the failure and could see these messages:

    Error             : ata1: translated ATA stat/err 0x51/10 to SCSI SK/ASC/ASCQ 0xb/14/00

    Warning     : ata1: status=0x51 { DriveReady SeekComplete Error }

    Warning     : ata1: error=0x10 { SectorIdNotFound }

    Phillip Ratliff Mon, 11/08/2010 - 06:20 (reply to Marcman-Cisco)

    Please email the .gz file and the messages* file(s) from the server to ibm-fs-failure@cisco.com so we can get them logged and see if your errors match the other reports.

    networkdefence Mon, 11/08/2010 - 06:29 (reply to Marcman-Cisco)

    Marcus,

    We seem to be experiencing the same issues as you, with the same make and model of server.  I have also checked the system logs and we receive the same error messages you are getting.  In addition, IBM has now swapped out the hard disk for the second time, and we cannot seem to do a restore to the server; it just hangs.  Because of these issues we are currently running on the faulty drive, which has revision 02.0 firmware.

    Has anyone else seen this issue, and if so, would the fix be to fit a different model of hard drive if the firmware is at fault?

    Marcman-Cisco Wed, 11/03/2010 - 05:19 (reply to Phillip Ratliff)

    (I have no support contract on the server from IBM or Cisco)

    There is no information about the IBM part number for the drive in the MCS7816-I4.

    I read in another discussion that there is no IBM equivalent of the MCS7816-I4.

    Can I treat the IBM x3250-M2 server as the equivalent, with IBM part number 39M4509 for the drive?

    Phillip Ratliff Wed, 11/03/2010 - 09:43 (reply to Marcman-Cisco)

    Using the 7825I4 equivalent is exactly the right thing to do.  All three of these servers use the exact same hard drive.

    For the HD self diagnostic I'm getting ready to update the wiki but you need to use the standalone IBM DSA to run the diagnostic.

    camdebuck_jus Tue, 11/16/2010 - 05:43

    I've completed step 1 and still have the same issue (it's a branded Cisco IBM 7828) and have been working with Cisco on the issue.  Nothing like rebooting your production server every few days (if I don't, it will usually crash within 5 to 7 days).  This is ridiculous and I can't believe that Cisco won't provide me with a new server that actually works.  Two months of this broken hardware, and I shouldn't have to babysit and reboot our server because of faulty hardware... (I'm done ranting).

    I'm now on step 1b and cannot seem to format the USB drive.  I've installed the application as mentioned.  I get the following error message when pressing the "Start" button to format the USB thumb drive:

    "The user-supplied DOS system files are not compatible with FAT32"

    I've pointed the DOS system files to the ones supplied (freedos).  It also doesn't give me an option of just "FAT" as described above.  The only two options are FAT32 & NTFS.  I've tried this on three different OS (Windows 7 64bit, Windows Vista 32 bit, and Windows XP 32 bit) and all three give the exact same error message.

    Has anyone been able to get 1b to work, and if so, what am I missing?

    networkdefence Tue, 11/16/2010 - 08:50 (reply to camdebuck_jus)

    Phillip,

    We have just received a replacement server from yourselves, and because I didn't know what HD firmware was running I decided to run a DSA.  Halfway through loading, it tries to read via ata3, which is the SATA disk.  This fails because the communication is too slow.  It then tries to transmit at a lower speed, which again fails, and it keeps repeating this.  I then see I/B errors and SCSI errors.  I have tried running version 3.02 of the DSA and also 3.10, and both versions do the same.

    Because of this I'm unsure whether to continue rebuilding the server, as it doesn't give me much confidence if the DSA fails to load due to these errors.  Note that, from what I can tell, the revision of the hard disk is 02.0.

    shkirby Tue, 11/16/2010 - 09:55 (reply to networkdefence)

    networkdefence,

    Cisco posted a new Cisco-HDD-FWUpdate-3.0.1-I.ISO yesterday November 15, 2010 which includes new HDD firmware that prevents the issue caused by "CSCti52867 - IBM 7816-I4 and 782x-I4 READONLY file system".

    Therefore, please upgrade your HDD FW using this new "Cisco Standalone HDD Firmware Update Version 3.0".

    The read-me for the "Cisco Standalone HDD Firmware Update Version 3.0" may be downloaded here.

    After updating the firmware, please post your results.

    Which Cisco UC application will be installed on this server?

    camdebuck_jus Tue, 11/16/2010 - 12:31 (reply to shkirby)

    I have a 7828-I4.  In the table listed it indicates that HDD Product ID should be "WD2502ABYS-23B7A0".

    However, when I run the "utils create report hardware" it shows the following (basically, the 0 is missing at the end).

    | ModelNumber                              |WD2502ABYS-23B7A                                         |
    | FirmwareRevision                         |3B05                                                     |

    Is this just a typo in the reference chart in the readme.pdf file?

    shkirby Tue, 11/16/2010 - 13:16 (reply to camdebuck_jus)

    Cam,

    Thanks for the update regarding the size of the USB key and the 2GB size limitation.  I'm glad that you were able to proceed.

    In regards to your question, WD2502ABYS-23B7A is the affected HDD Product ID and the readme file will be updated accordingly.  So you can proceed with the HDD update.

    networkdefence Wed, 11/17/2010 - 05:04 (reply to shkirby)

    Shane,

    Because we have already received an additional hard disk from IBM due to this issue, I upgraded the firmware to 3B06 and then tried to run the DSA; it still failed regardless of what version I use.  I have now been informed by Cisco TAC that it isn't a hardware issue and to proceed with the install.  Note that I am installing 6.1.4.

    What concerns me is that, should the server fail again, I won't be able to run the DSA, so I won't be able to know whether there is a hardware issue or not.  I am now installing this software on the HD which doesn't have the latest firmware on it, so we will see what happens.  Do you still recommend installing this firmware upgrade anyway?

    Also, if running the upgraded firmware, is it required to install ciscocm.ibm-diskex-1.0.cop.sgn?

    shkirby Wed, 11/17/2010 - 07:58 (reply to networkdefence)

    Yes, you should install both ciscocm.ibm-diskex-1.0.cop.sgn and Cisco-HDD-FWUpdate-3.0.1-I.ISO.  After running both utilities, try running the DSA again.  Is this an MCS-7816-I4?  Please ensure that all of the information requested in this doc has been uploaded to your TAC SR and send your SR number to ibm-fs-failure@cisco.com.

    camdebuck_jus Tue, 11/16/2010 - 12:26 (reply to camdebuck_jus)

    I did some research on the USB and FAT not showing up.  In most cases, if your USB Drive is bigger than 2 gigs it will have issues (mine was 4 gigs).  I purchased a Dane-Elec 2GB USB Drive from Target for $8.  This allowed the FAT to show up and format the drive without any issues.

    shkirby Tue, 11/16/2010 - 19:14 (reply to camdebuck_jus)

    Cam,

    Thanks again for providing your 2GB USB key finding!  Step 1b has been updated with this recommendation.  I also sympathize with your frustration regarding this issue.  Please rest assured that this issue is top-of-mind and focus for many Cisco resources and suppliers who are working diligently to deliver a resolution for this problem.

    As a follow-on to the previous recommendation of applying the new HDD firmware bundled in Cisco-HDD-FWUpdate-3.0.1-I.ISO, you should ALSO apply the newly published ciscocm.ibm-diskex-1.0.cop.sgn.

    The Readme file ciscocm.ibm-diskex-1.0.cop.sgn-Readme.html includes installation instructions for the .cop.sgn.

    camdebuck_jus Tue, 11/16/2010 - 19:49 (reply to shkirby)

    Thanks for mentioning the .cop.sgn file.  It is in the process of applying right now.  From what I read, it sounds like it shouldn't matter that I'm applying it after I did the firmware.  If this isn't the case, let me know what I need to do.  Thanks.

    camdebuck_jus Tue, 11/16/2010 - 19:36

    I have completed 1b successfully and have the log file.

    I also applied the firmware update that was made available and it now shows that I'm running firmware revision 3B06 instead of 3B05.

    The system is running and we'll see how long it runs before it crashes (hopefully it won't, but only time will tell).

    If you need me to post my logs or anything please let me know.  I did attach it to my TAC that I have open.

    I also ran 1b after doing the firmware as well.  Not sure if that would be helpful.  I did not attach that to my TAC, but thought I would mention it.

    camdebuck_jus Tue, 11/16/2010 - 20:16

    Bad news.  While applying the cop.sgn file it completed on step 14 of 20 (that's a big problem).  The system went into read-only mode <ARG>.  I had already applied the firmware upgrade as the cop.sgn file wasn’t there when I downloaded the .iso file for the firmware.

    Should I try and reapply the cop.sgn file again?

    shkirby Tue, 11/16/2010 - 21:57 (reply to camdebuck_jus)

    Cam,

    Please try step 2a (boot from CUCM recovery CD to run filesystem check).

    Boot from the CUCM recovery disk and select option f to check the filesystem.   

    • If you do not have a recovery disk you may download the ISO from the Software Center on cisco.com. You should always download the latest version of the recovery disk regardless of the CUCM version running on the server.

    If no errors are reported, then reboot the server and from AdminCLI run "show hardware".  Confirm that the "Status of volume" is "Okay".  If the volume is in a resyncing state, wait until the status transitions to "Okay".  Then run ciscocm.ibm-diskex-1.0.cop.sgn again.
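
    The volume check described above can be sketched as a small Python helper for anyone who saves the "show hardware" output to a text file first.  This is a rough illustration only: the exact layout of the "show hardware" output is not shown in this thread, so the "Status of volume: ..." line format below is an assumption, and the sample strings are hypothetical.

```python
import re


def volume_status(show_hardware_output):
    """Return the text following 'Status of volume' on the first matching line, or None."""
    for line in show_hardware_output.splitlines():
        m = re.search(r"Status of volume\s*[:=]?\s*(.+)", line, re.IGNORECASE)
        if m:
            return m.group(1).strip()
    return None


def safe_to_run_cop(show_hardware_output):
    """True only when a volume status line was found and it starts with 'Okay'."""
    status = volume_status(show_hardware_output)
    return status is not None and status.lower().startswith("okay")


# Hypothetical sample lines; real output may be laid out differently.
healthy = "    Status of volume: Okay"
resyncing = "    Status of volume: Resyncing"

print(safe_to_run_cop(healthy))
print(safe_to_run_cop(resyncing))
print(safe_to_run_cop("no volume information here"))
```

    The helper fails closed: if no status line is found at all, it reports that the .cop.sgn should not be run yet.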

    Marcman-Cisco Wed, 11/17/2010 - 01:07 (reply to shkirby)

    At the end of the day, I will get new HDDs from IBM in case of the "CSCti52867 - IBM 7816-I4 and 782x-I4 READONLY file system". I could really believe that there is an old firmware inside...

    So, do you recommend that I install the new firmware before I begin the full installation?

    The old HDD of the MCS-7816-I4 has firmware 02.0; I don't know which version the new one will have.

    An upgrade from 02.0 to "Cisco-HDD-FWUpdate-3.0.1-I.ISO" is possible and no problem?

    I will do the upgrade/installation process tomorrow, so it would be nice to know the way till then...

    (I have to travel through half of Germany for that, so it has to be successful)

    shkirby Wed, 11/17/2010 - 16:59 (reply to Marcman-Cisco)

    Marcus,  the new/replacement HDD that you receive from IBM will NOT be running the new firmware version 3B06 which is the fix for this issue.  Also note that revision "02.0" is NOT the full firmware version.  The full firmware version (ex. 02.03B0X) will be displayed once you have booted the server with the "Cisco-HDD-FWUpdate-3.0.1-I.ISO".  And the installation of this patch file will update the FW to 02.03B06.

    Yes, you can install "Cisco-HDD-FWUpdate-3.0.1-I.ISO" before installing the Cisco UC application on this server.

    Marcman-Cisco Thu, 11/18/2010 - 11:41 (reply to shkirby)

    I did the upgrade for the brand new IBM HDDs from 02.03B05 to 02.03B06. You can only see the full firmware version with that "Cisco-HDD-FWUpdate-3.0.1-I.ISO". With DSA 3.20 I could only see 02.0

    All servers are newly installed.

    Well, we will see how it works... I hope well.

    Thank you for your help!

    shkirby Thu, 11/18/2010 - 13:34 (reply to Marcman-Cisco)

    I did the upgrade for the brand new IBM HDDs from 02.03B05 to 02.03B06. You can only see the full firmware version with that "Cisco-HDD-FWUpdate-3.0.1-I.ISO". With DSA 3.20 I could only see 02.0

    Correct, on the MCS-7816-I4 servers, the DSA will not show the full HD FW version.  It will only show the first 4 digits of the FW version (ex 02.0).  But when you run "Cisco-HDD-FWUpdate-3.0.1-I.ISO" it will show the current "full" version (ex 02.03B04) and the version which will be applied during the upgrade (02.03B06).
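
    To make the point about the truncated display concrete, here is a tiny Python sketch.  It assumes, as described above, that the DSA shows only the first four characters of the version string and that 02.03B06 is the fixed version; the function names are invented for the example.

```python
FIXED_VERSION = "02.03B06"  # full version applied by the ISO, per this thread


def dsa_display(full_version):
    """Mimic the truncated DSA view: only the first four characters are shown."""
    return full_version[:4]


def is_fixed(full_version):
    """True when the full version string matches the fixed firmware."""
    return full_version == FIXED_VERSION


for version in ("02.03B02", "02.03B05", "02.03B06"):
    print(version, dsa_display(version), is_fixed(version))
```

    Every version in the loop displays as 02.0 in the truncated view, which is why the DSA alone cannot tell you whether the fix has been applied.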

    camdebuck_jus Wed, 11/17/2010 - 06:44 (reply to shkirby)

    I did notice when I was running the cop.sgn file previously that the volume was in a resyncing state (it showed this while booting up).  I have verified the file system using the recovery disk and everything is OK.

    I'll retry the cop.sgn file tomorrow night as I've got other tasks going on tonight.  I'll let you know how it goes at that point.

    You might want to update the readme for the cop.sgn file to mention the resyncing state so others don't have the same issue.

    Thank you for your help.

    shkirby Wed, 11/17/2010 - 08:33 (reply to camdebuck_jus)

    Cam,

    If the cop.sgn install fails again, collect the "Event Viewer-System Log" and "Install and Upgrade Logs" via RTMT covering the time of the installation and upload them to your TAC SR.  Then send an email to ibm-fs-failure@cisco.com indicating such.

    networkdefence Wed, 11/17/2010 - 09:33 (reply to shkirby)

    Shane,

    Well after upgrading the HD firmware I then installed 6.1.3 and then upgraded this to 6.1.4.  At the final part where it asks you for any configuration via an xml file I click on continue and it does nothing!

    I have tried this twice and both times it gets stuck.  IBM has also advised me, because of the issue with the DSA, to install disk controller firmware as this could be the problem; however, when I try to run this via http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-BOMC and create a boot ISO, this also fails!

    I'm going to try once again to install 6.1.3 and then upgrade via the GUI, but I'm not confident this will make a difference at all!!

    shkirby Wed, 11/17/2010 - 17:32 (reply to networkdefence)

    networkdefence,

    On an MCS-7816-I4 in the lab I just now performed the following steps with success:

    1. Ran ciscocm.ibm-diskex-1.0.cop.sgn (took ~40min)

    2. Booted from a Cisco-HDD-FWUpdate-3.0.1-I.ISO CD which indicated that the current HDD firmware was 02.03B02

    3. Proceeded to install the new 02.03B06 HDD firmware successfully with the  Cisco-HDD-FWUpdate-3.0.1-I.ISO CD.

    4. Ejected the  Cisco-HDD-FWUpdate-3.0.1-I.ISO CD and replaced it with a Standalone DSA 3.0.2 CD.  Then selected the option to reboot the server.

    5. The server booted via the DSA 3.0.2 CD and I was able to execute the "Hard Drive: Self Test" on "/dev/sda" as per step 1c documented in this article.  And the test completed successfully.

    If you are still having issues after your current efforts, then please give these 5 steps a try.  And if the server still exhibits problems, please update your TAC SR accordingly.  Also ensure that all of the information requested in this doc has been uploaded to your TAC SR and send your SR number to ibm-fs-failure@cisco.com.

    This Document

    Posted September 9, 2010 at 7:51 AM