01-09-2008 04:20 PM
Hi
One of our core WAEs is reporting a disk error but we cannot seem to correct it.
"show disk details" output is as follows:
Software RAID devices:
DEVICE NAME  TYPE    STATUS                       PHYSICAL DEVICES AND STATUS
/dev/md0     RAID-1  NORMAL OPERATION             disk00/00[GOOD] disk01/00[GOOD]
/dev/md1     RAID-1  NORMAL OPERATION             disk00/01[GOOD] disk01/01[GOOD]
/dev/md2     RAID-1  ONE OR MORE DRIVES ABNORMAL  disk01/02[GOOD]
/dev/md3     RAID-1  NORMAL OPERATION             disk00/03[GOOD] disk01/03[GOOD]
/dev/md4     RAID-1  ONE OR MORE DRIVES ABNORMAL  disk01/04[GOOD]
/dev/md5     RAID-1  ONE OR MORE DRIVES ABNORMAL  disk01/05[GOOD]
/dev/md6     RAID-1  ONE OR MORE DRIVES ABNORMAL  disk01/06[GOOD]
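For anyone reading along: the abnormal arrays all list only a disk01 partition, so it is the disk00 members that have dropped out. A minimal Python sketch that pulls this out of the pasted output is below. It assumes the line layout shown in this post; the function name `missing_members` and the default disk names are illustrative, not part of any Cisco tool.

```python
import re

# Output pasted from 'show disk details' in this post (header lines omitted).
SAMPLE = """\
/dev/md0 RAID-1 NORMAL OPERATION disk00/00[GOOD] disk01/00[GOOD]
/dev/md1 RAID-1 NORMAL OPERATION disk00/01[GOOD] disk01/01[GOOD]
/dev/md2 RAID-1 ONE OR MORE DRIVES ABNORMAL disk01/02[GOOD]
/dev/md3 RAID-1 NORMAL OPERATION disk00/03[GOOD] disk01/03[GOOD]
/dev/md4 RAID-1 ONE OR MORE DRIVES ABNORMAL disk01/04[GOOD]
/dev/md5 RAID-1 ONE OR MORE DRIVES ABNORMAL disk01/05[GOOD]
/dev/md6 RAID-1 ONE OR MORE DRIVES ABNORMAL disk01/06[GOOD]
"""

# Matches members such as "disk00/02[GOOD]" and captures the disk and its status.
MEMBER = re.compile(r"(disk\d+)/\d+\[(\w+)\]")


def missing_members(output, expected=("disk00", "disk01")):
    """Map each array to the expected disks that are absent or not GOOD.

    'expected' is an assumption: a two-disk mirror as in this thread.
    """
    problems = {}
    for line in output.splitlines():
        if not line.startswith("/dev/md"):
            continue  # skip headers and blank lines
        device = line.split()[0]
        good = {disk for disk, status in MEMBER.findall(line) if status == "GOOD"}
        absent = [disk for disk in expected if disk not in good]
        if absent:
            problems[device] = absent
    return problems


if __name__ == "__main__":
    for device, disks in missing_members(SAMPLE).items():
        print(device, "is missing", ", ".join(disks))
```

Run against the output above, this reports disk00 as the missing member of md2, md4, md5 and md6, while md0, md1 and md3 still have both halves of the mirror.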
The output of 'show alarms crit detail support' is 'none'.
I ran the disk_check.sh script (as we plan to upgrade this WAE first, from 4.0.13.b.12 to the 4.0.15 release), but that check passed OK:
#type disk_status.txt
Thu Jan 10 11:06:28 EST 2008
device /dev/md1 (/swstore) is OK
device /dev/md0 (/sw) is OK
device /dev/md2 (/state) is OK
device /dev/md6 (/local/local1/spool) is OK
device /dev/md5 (/local/local1) is OK
device /dev/md4 (/disk00-04) is OK
Question: Is there anything we can do to remove the 'abnormal' state? Is it safe to proceed with the software upgrade?
Thanks!
Cameron
01-15-2008 12:20 AM
Cameron,
You can try the following process prior to replacing disk00:
1. From config mode, remove disk00 from the RAID array:
di d disk00 s
2. From config mode, re-add disk00 to the RAID array:
no di d disk00 s f
3. You will be asked to reload if this is a WAE-512; otherwise, the disk should be added back into the array.
If this does not correct the disk state, I would recommend replacing the physical drive.
Zach
01-09-2008 06:10 PM
Cameron,
Was the drive recently replaced? Can you please provide the output from the command 'sh di t d'.
Thanks,
Zach
01-14-2008 01:14 PM
Zach,
Thanks for your reply. Here is the output:
sh disks tech-support details
=== disk00 ===
Device: IBM-ESXS ST3300555SS Version: BA33
Serial number: 3LM1JY51000098037C11
Device type: disk
Transport protocol: SAS
Local Time is: Tue Jan 15 08:10:26 2008 EST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 39 C
Drive Trip Temperature: 68 C
Vendor (Seagate) cache information
Blocks sent to initiator = 652770078
Blocks received from initiator = 666414147
Blocks read from cache and sent to initiator = 8823800
Number of read and write commands whose size <= segment size = 3463715
Number of read and write commands whose size > segment size = 0
Error counter log:
          Errors Corrected    Total      Total   Correction     Gigabytes    Total
              delay:       [rereads/    errors   algorithm      processed    uncorrected
            minor | major  rewrites]  corrected  invocations   [10^9 bytes]  errors
read:      164519        0         0    164519     164519        333.716           0
write:          0        0         0         0          0        341.940           0
verify:       194        0         0       194        194          0.675           0
Non-medium error count: 1
=== disk01 ===
Device: IBM-ESXS ST3300555SS Version: BA33
Serial number: 3LM1GA1Q00009803ZAFF
Device type: disk
Transport protocol: SAS
Local Time is: Tue Jan 15 08:10:28 2008 EST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 36 C
Drive Trip Temperature: 68 C
Vendor (Seagate) cache information
Blocks sent to initiator = 1262471149
Blocks received from initiator = 309317146
Blocks read from cache and sent to initiator = 14889473
Number of read and write commands whose size <= segment size = 18130992
Number of read and write commands whose size > segment size = 0
Error counter log:
          Errors Corrected    Total      Total   Correction     Gigabytes    Total
              delay:       [rereads/    errors   algorithm      processed    uncorrected
            minor | major  rewrites]  corrected  invocations   [10^9 bytes]  errors
read:      241936        1         0    241937     241937        592.212           0
write:          0        0         0         0          0        159.395           0
verify:       552        0         0       552        552          2.814           0
Non-medium error count: 0
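As a quick way to read the two SMART reports above: both drives show a health status of OK and zero in the "Total uncorrected errors" column for read, write and verify. The Python sketch below extracts just those fields. It is an illustration that assumes the line format pasted here (the sample is trimmed to the relevant lines), the function name `summarize` is made up for this example, and a clean result is only a hint about drive health, not a diagnosis.

```python
# Relevant lines copied from the 'sh disks tech-support details' output above.
SAMPLE = """\
=== disk00 ===
SMART Health Status: OK
read: 164519 0 0 164519 164519 333.716 0
write: 0 0 0 0 0 341.940 0
verify: 194 0 0 194 194 0.675 0
Non-medium error count: 1
=== disk01 ===
SMART Health Status: OK
read: 241936 1 0 241937 241937 592.212 0
write: 0 0 0 0 0 159.395 0
verify: 552 0 0 552 552 2.814 0
Non-medium error count: 0
"""


def summarize(output):
    """Return {disk: {"health": str, "uncorrected": int}} for each disk section."""
    disks = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("===") and line.endswith("==="):
            # Section header such as "=== disk00 ==="
            current = line.strip("= ")
            disks[current] = {"health": None, "uncorrected": 0}
        elif current is None:
            continue  # ignore anything before the first section header
        elif line.startswith("SMART Health Status:"):
            disks[current]["health"] = line.split(":", 1)[1].strip()
        elif line.split(":", 1)[0] in ("read", "write", "verify"):
            # The last column of each row is the total uncorrected error count.
            disks[current]["uncorrected"] += int(line.split()[-1])
    return disks


if __name__ == "__main__":
    for disk, info in summarize(SAMPLE).items():
        print(disk, info["health"], "uncorrected errors:", info["uncorrected"])
```

With no uncorrected errors on either drive, re-adding the dropped disk to the array is a reasonable thing to try before swapping hardware.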
We've successfully applied the 4.0.15 software to this WAE in any case, and it seems to operate normally.
No disks have been replaced as far as I am aware.
Thanks for your time Zach.
Cheers
Cameron
01-16-2008 03:44 PM
Many thanks Zach - that fixed it!
DEVICE NAME  TYPE    STATUS            PHYSICAL DEVICES AND STATUS
/dev/md0     RAID-1  NORMAL OPERATION  disk00/00[GOOD] disk01/00[GOOD]
/dev/md1     RAID-1  NORMAL OPERATION  disk00/01[GOOD] disk01/01[GOOD]
/dev/md2     RAID-1  NORMAL OPERATION  disk00/02[GOOD] disk01/02[GOOD]
/dev/md3     RAID-1  NORMAL OPERATION  disk00/03[GOOD] disk01/03[GOOD]
/dev/md4     RAID-1  NORMAL OPERATION  disk00/04[GOOD] disk01/04[GOOD]
/dev/md5     RAID-1  NORMAL OPERATION  disk00/05[GOOD] disk01/05[GOOD]
/dev/md6     RAID-1  NORMAL OPERATION  disk00/06[GOOD] disk01/06[GOOD]