Hi all,
Does anybody have expertise on this matter? Maybe Wes or Ryan have inside info.
We had a TAC case for a failed HDD and a problematic RAID controller in an MCS-7825-I4 server running CUCM 7.1.5.
The symptom appeared when we shut the server down in order to check the IBM FRU number on the memory inside. After the check we powered the server back up, and the RAID controller reported a DEGRADED state. The Cisco OS started to boot and at a certain point reported something like "unsupported hardware... not for production... without TAC support", etc. So we opened the case.
During the case we upgraded all kinds of firmware on the server and also found out that one HDD had failed. We replaced it and troubleshot further.
Finally we got an RMA for the server, but we have spotted very strange behavior on the new server as well (we put it into the lab to test it). So we're wondering: is the following expected behavior, or is there a serious problem in the Cisco OS or the IBM hardware?
- First of all, we upgraded the HDD firmware on the new RMA server to 3B06 (it shipped with 3B05), having in mind CSCti52867, which we had hit earlier. After that we installed CUCM 7.1.5.
- With the server shut down, we pull out the right hard drive (the one in BAY 1) and power the server up. The RAID controller detects one drive missing and reports a DEGRADED state. The Cisco OS starts to boot, and at a certain point it again reports something like "unsupported hardware... not for production... without TAC support".
- Then we shut the server down again and switch the scenario: we put the right HDD back into BAY 1 and pull out the left HDD (BAY 0). When we power the server up, it doesn't even detect a bootable device, as if there were no HDDs inside at all.
- So we shut the server down once more, put both HDDs back where they belong, and power it up. Everything works fine; of course, the RAID is resynching.
- Of course, both HDDs used in the server are in a correct state and have not failed.
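For anyone trying to reproduce this, the disk/RAID state as the Cisco OS sees it can also be captured from the CUCM OS admin CLI before and after each drive pull. These two commands are from the CUCM CLI reference; the exact output format depends on the platform and controller, so treat the annotations below as a sketch, not guaranteed output:

```
admin: show hardware
    (lists platform model, RAID controller type, and logical/physical drive state)
admin: utils create report hardware
    (collects a hardware report that can be attached to the TAC case)
```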
So I'm a little confused, since it's happening in an identical scenario on two different servers; my perception of RAID1 is in serious doubt.
Does this mean that if an HDD fails during a regular maintenance shutdown (which is what happened to us in the first place), the RAID is not expected to deliver an operational system?
To be honest, we never had a disk fail while the server was up and online, so we couldn't see what happens in that situation.
But the fact is, on both the old server and the new one, when we pull out one of the HDDs the system won't boot up properly.
Is this expected behavior? What's the purpose of RAID1 then?
Is RAID1 expected to cover only an online HDD failure (if so), or should it work in every scenario?