Cisco Support Community
New Member

cisco ucs 5108 thermal problem

Hi everyone!

I've installed my first UCS system: 2 UCS 5108 & 2 UCS 6248

The first chassis holds six blade servers (2 B230 M2 and 4 B200 M2); the second holds 5 B200 M2. I have two air conditioners in the server room running at their maximum. Over the last week I have received three faults on the first chassis (Fault Code: F0411). The IOM temperature was about 45-46. After that I moved 3 blade servers to the second chassis until I solve this problem.

UCSM version - 2.0.2r

Everything else is fine, apart from the thermal problem. All blade servers are discovered, with 0 errors and 0 critical faults.

15 REPLIES
Cisco Employee

Hello Sergey,

Please open a TAC service request and attach the UCSM and Chassis 1 and 2 tech support bundles.

http://www.cisco.com/en/US/docs/unified_computing/ucs/ts/Frame-Files-Converted-to-DITA--Do-Not-Use/TS_GeneralTroubleshooting.html#wp1073749

We need more logs to investigate the thermal fault.

Padma

New Member

Please physically reset the IOMs in that chassis. I have done this twice for the thermal issue, and the issue never recurred.

Ram

New Member

I'll try to reset them. Could you tell me how to do this correctly?

New Member

Just unplug the right-hand IOM and plug it back in. Wait 20 minutes for everything to come up, then repeat the steps for the other side.

Ram

New Member

So I shouldn't turn the power off? Just unplug and reinsert one IOM, and then the other?

Cisco Employee

Yes, don't power off; it is not required.

New Member

I can't open a TAC case at the moment: my SMARTnet contract is still being registered. I've created the tech support files for Chassis 1 and 2. Should I post them here or wait for my SMARTnet?

Cisco Employee

This can be caused by an I2C issue on the server.

You can try the following:

Reset the fans one by one.

Reset the PSUs one by one.

Finally, reset the IOMs, starting with the faulty one.

Also, determine which blade is showing any alarms and try to reseat that blade in the chassis.

Please make sure to wait a couple of minutes between resetting each component.

New Member

How do I do this correctly? Power off and then reset, or what?

New Member

I think this is a code bug and you need to go to 2.0(q). I am running two 6248 systems at that level and not having the issue. This thermal stuff plagued ALL the 1.4x releases.

Craig

My UCS Blog http://realworlducs.com
New Member

I've got the same errors on 2.0.2q...

cisco ucs 5108 thermal problem

Sergey,

If this is a real I2C issue, you may still see the same behavior on a 2.0.x release if the I2C bus was not cleared before the upgrade. (At the moment I don't know whether you recently performed an upgrade on the system.)

The I2C bus transports information about the different components of the Unified Computing System: the chassis, IOMs, fans, PSUs, etc. What happens is that all of those components try to send their status updates at the same time, the I2C bus gets overwhelmed, and then no component can report its real status. So we usually recommend that the customer reseat all major components, one at a time, to clear the bus and then do the upgrade. If that is not done before the upgrade, it should still be done after.

Try reseating the fans and PSUs, one at a time, leaving a minute in between, and then the IOMs, one at a time, leaving three minutes in between and beginning with the subordinate to cause minimum disruption.
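For anyone who wants the reseat order written down as a checklist before walking into the server room, here is a minimal Python sketch. It only prints the sequence and the waits described above; it does not talk to UCSM. The names `Step` and `reseat_plan`, the component counts (8 fans, 4 PSUs), and which IOM sits on the subordinate side are assumptions for illustration, so adjust them to your own chassis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    component: str     # label of the part to reseat, e.g. "Fan 3"
    wait_seconds: int  # pause after reseating before the next step


def reseat_plan(fans, psus, subordinate_iom, primary_iom):
    """Build the reseat order: fans, then PSUs, then IOMs (subordinate first)."""
    # Fans and PSUs: one at a time, one minute apart.
    steps = [Step(f"Fan {n}", 60) for n in range(1, fans + 1)]
    steps += [Step(f"PSU {n}", 60) for n in range(1, psus + 1)]
    # IOMs last, three minutes apart, subordinate side first.
    steps.append(Step(f"IOM {subordinate_iom}", 180))
    steps.append(Step(f"IOM {primary_iom}", 180))
    return steps


if __name__ == "__main__":
    # Hypothetical chassis layout; IOM 2 is assumed to be on the subordinate side.
    plan = reseat_plan(8, 4, subordinate_iom=2, primary_iom=1)
    for i, step in enumerate(plan, 1):
        print(f"{i:2d}. reseat {step.component}, then wait {step.wait_seconds} s")
```

With these assumed counts the whole pass is 14 steps and 18 minutes of waiting, not counting the time spent pulling and reinserting parts.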

If this does not clear the situation, you will need to remove the components already mentioned, one at a time, and run "show tech-support chassis # all brief" to see what the I2C bus reports segment by segment (chassis, fans, PSUs...). Once you remove a component and the errors on each segment stop incrementing, you have found your faulty piece of hardware, and a TAC case will be needed to get a replacement sent.
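The comparison step can be sketched in Python as below. This assumes you copy the per-segment I2C error counts out of the tech-support output by hand, once before and once after a short wait, while a given component is removed. The function names, the segment names, and all numbers are hypothetical; the script does not parse the tech-support bundle itself.

```python
def still_incrementing(before, after):
    """Return the segments whose error counter grew between two snapshots."""
    return sorted(seg for seg, count in after.items()
                  if count > before.get(seg, 0))


def find_suspects(trials):
    """trials maps a removed component to its (before, after) snapshots.

    A component is a suspect if no segment's counter grew while it was out.
    """
    return [component for component, (before, after) in trials.items()
            if not still_incrementing(before, after)]


if __name__ == "__main__":
    # Hypothetical counters, transcribed by hand for illustration only.
    trials = {
        "Fan 1": ({"chassis": 10, "fans": 40, "psus": 3},
                  {"chassis": 10, "fans": 55, "psus": 3}),
        "Fan 2": ({"chassis": 10, "fans": 55, "psus": 3},
                  {"chassis": 10, "fans": 55, "psus": 3}),
    }
    print(find_suspects(trials))
```

In this made-up data the fan segment keeps counting errors while Fan 1 is out and stops while Fan 2 is out, so Fan 2 would be the part to raise with TAC.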

For further analysis or assistance, I strongly recommend a TAC case to be opened.

-Kenny

New Member

I have gone through powering off the whole 6248 UCS on 2q, and the issues remain.

Craig

My UCS Blog http://realworlducs.com
Cisco Employee

Actually, you don't have to power off the 6248 FIs. An effective but disruptive solution is to decommission and power-cycle the chassis that is generating those faults, including removing the power cords. Then wait a minute and recommission the chassis. After that, all thermal faults should go away.

New Member

Hi Craig,

Did you manage to resolve your issue?
