Cisco Support Community
Community Member

Power state on Chassis # is redundancy-failed

Hi,

I know this error used to be common a while back, but I'm running 2.0(2q) with four apparently healthy PSUs per chassis in N+1 mode, consuming less than 800 watts across four B200 M2 blades. Help?

Thanks,

Hamish

30 REPLIES
Cisco Employee

Power state on Chassis # is redundancy-failed

Can you paste the output of the following commands:

scope chassis x

show psu detail

show psu-control detail

show fault

Regards,

Robert

Cisco Employee

Power state on Chassis # is redundancy-failed

Hi Hamish,

If you access the UCSM, do you see any PSU with a N/A status?

Does UCSM report any alerts such as upper non recoverable, thermal alerts or power redundancy lost?

Do you see any amber light on the power supplies?

You can reseat the power supplies one by one in order to see if they come back online.

Sometimes the chassis can generate thermal alerts and it can be related to a fan or even the IOM.

Go ahead and check the status of the power supplies physically and on UCSM.

Also collect the information from the commands Robert suggested.

Community Member

Re: Power state on Chassis # is redundancy-failed

Hello,

I'm having the same issue in 2.0(4d). I have a UCS system with two 6120XPs and three 5108 chassis.

The system is configured for N+1 redundancy:

show psu-control detail

Psu Control:

    Redundancy: NPlus1

    Input Power: Ok

    Output Power: Ok

    Cluster Power: Slot 2 Master

    Overall Status: Failed

    Config Error: Redundancy Lost

C61UCSscto01-B /chassis # show fault

Severity  Code     Last Transition Time     ID       Description

--------- -------- ------------------------ -------- -----------

Major     F0408    2013-05-14T18:33:53.694    740475 Power state on chassis 1 is redundancy-failed

show psu detail

PSU:

    PSU: 1

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: 2500W 200-240VAC PSU for UCS 5108 Blade Server Chassis

    PID: N20-PAC5-2500W

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM1616016M

    HW Revision: 0

    PSU: 2

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: 2500W 200-240VAC PSU for UCS 5108 Blade Server Chassis

    PID: N20-PAC5-2500W

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM161703MM

    HW Revision: 0

    PSU: 3

    Overall Status: N/A

    Operability: N/A

    Threshold Status: N/A

    Power State: PwrSave

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: N/A

    Product Name: 2500W 200-240VAC PSU for UCS 5108 Blade Server Chassis

    PID: N20-PAC5-2500W

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM161703L8

    HW Revision: 0

    PSU: 4

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: 2500W 200-240VAC PSU for UCS 5108 Blade Server Chassis

    PID: N20-PAC5-2500W

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM170304LJ

    HW Revision: 0
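When comparing several supplies by eye, it can help to reduce output like the above to just the fields that stand out. The following is a minimal sketch, not a Cisco tool: it assumes the `show psu detail` text has been saved or pasted as a string, and the function names (`parse_psu_detail`, `suspicious_psus`) and the choice of which fields to check are illustrative assumptions. A flagged PSU is not necessarily faulty; under N+1 a supply in `PwrSave` may simply be the spare.

```python
def parse_psu_detail(text):
    """Split `show psu detail` output into one dict per PSU."""
    psus = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "PSU" and value.isdigit():
            # "PSU: <n>" starts a new record; the bare "PSU:" header has no value.
            current = {"PSU": int(value)}
            psus.append(current)
        elif current is not None:
            current[key] = value
    return psus


# Assumption: these are the fields worth checking for "N/A".
HEALTH_FIELDS = ("Overall Status", "Operability", "Threshold Status",
                 "Voltage Status", "Thermal Status")


def suspicious_psus(psus):
    """Return {psu_number: [field names]} for PSUs reporting N/A or not powered On."""
    flagged = {}
    for psu in psus:
        bad = [f for f in HEALTH_FIELDS if psu.get(f) == "N/A"]
        if psu.get("Power State") != "On":
            bad.append("Power State")
        if bad:
            flagged[psu["PSU"]] = bad
    return flagged


# Excerpt of the output posted above (PSU 3 and PSU 4 only).
sample = """
PSU:
    PSU: 3
    Overall Status: N/A
    Operability: N/A
    Threshold Status: N/A
    Power State: PwrSave
    Presence: Equipped
    Thermal Status: OK
    Voltage Status: N/A
    PSU: 4
    Overall Status: Operable
    Operability: Operable
    Threshold Status: OK
    Power State: On
    Presence: Equipped
    Thermal Status: OK
    Voltage Status: OK
"""

print(suspicious_psus(parse_psu_detail(sample)))
```

On this excerpt only PSU 3 is reported, with its N/A fields and its `PwrSave` power state listed.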

Any clue?

Regards,

Javier

Community Member

Re: Power state on Chassis # is redundancy-failed

Hi,

Not sure why this is the first reply I've received notification for.

I opened a case with Cisco on this and was advised to change the policy to Grid, then back to N+1. This cleared the error for me.

Hope this helps.

Hamish

Cisco Employee

Re: Power state on Chassis # is redundancy-failed

The problem might be with the subordinate IOM. The active one, which returned the outputs requested above, looks good.

Please do the following from the UCSM CLI:

ssh to fabric interconnect A

connect iom x (x = chassis # showing redundancy lost)

show platform software cmcctrl power redundancy

ssh to fabric interconnect B

connect iom x

show platform software cmcctrl power redundancy

Thanks,

Robert

Community Member

Re: Power state on Chassis # is redundancy-failed

Hi Robert,

Here is the output of the command:

FabricB (active)

fex-1# show platform software cmcctrl power redundancy

==============================

Last update TS                 : 1718362

Stale TS                 : 1718422

Now                         : 1718378

Cluster master                 : yes

Policy                        : N+1

State                        : Lost

Total power available        : 7500

Total power usage        : 1713

Power budget requested        : 5472

-----------

Grid                        : 0

Active PS        : 0 1 3

Spare PS        : 2

Unavailable PS        :

-----------

==============================

FabricA (subordinate)

fex-1# show platform software cmcctrl power redundancy

==============================

Last update TS                 : 1718486

Stale TS                 : 1718546

Cluster master                 : no

Policy                        : N+1

State                        : Lost

Total power available        : 7500

Total power usage        : 1732

Power budget requested        : 5472

-----------

Grid                        : 0

Active PS        : 0 1 3

Spare PS        : 2

Unavailable PS        :

-----------

==============================

Thanks!
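For anyone collecting this output from both IOMs, a small parser makes the two sides easier to compare. This is a rough sketch under the assumption that the output keeps the `key : value` layout shown above; the function names are hypothetical, and the PS numbers are treated as zero-based only because spare PS 2 lines up with PSU 3, the one in `PwrSave` in the earlier `show psu detail` output.

```python
def parse_power_redundancy(text):
    """Parse the 'key : value' lines of the cmcctrl power redundancy output."""
    fields = {}
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # separator lines such as ===== or -----
        fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Pick out the values that matter for the redundancy state."""
    summary = {
        "policy": fields.get("Policy"),
        "state": fields.get("State"),
        "available_w": int(fields["Total power available"]),
        "usage_w": int(fields["Total power usage"]),
        "budget_w": int(fields["Power budget requested"]),
    }
    for key in ("Active PS", "Spare PS", "Unavailable PS"):
        # PS indices appear to be zero-based (assumption, see lead-in).
        summary[key] = [int(n) for n in fields.get(key, "").split()]
    return summary


# Fabric B output from the post above, timestamps omitted.
sample = """
Cluster master                 : yes
Policy                        : N+1
State                        : Lost
Total power available        : 7500
Total power usage        : 1713
Power budget requested        : 5472
-----------
Grid                        : 0
Active PS        : 0 1 3
Spare PS        : 2
Unavailable PS        :
-----------
"""

print(summarize(parse_power_redundancy(sample)))
```

Running `summarize` on the output from each fabric and comparing the two dictionaries shows quickly whether both IOMs agree on policy, state, and PSU roles.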

Cisco Employee

Power state on Chassis # is redundancy-failed

Have you tried to change the power policy to non-redundant and then back to N+1?

This is not disruptive.

Community Member

Power state on Chassis # is redundancy-failed

Hi,

Yes, we did, but it didn't work for us...

Regards,

Javier

Re: Power state on Chassis # is redundancy-failed

Javier,

Please open a TAC case, and attach the chassis tech support (one for each chassis) and the UCSM tech support to the case.

-Kenny

Cisco Employee

Power state on Chassis # is redundancy-failed

Please open a TAC case as Kenny suggested, but also look into the following known defect and apply the workaround if applicable:

CSCub53747

Community Member

Power state on Chassis # is redundancy-failed

Hi,

According to the Bug Toolkit, CSCub53747 should be fixed in 2.0(4a). We're now on 2.0(4d).

We've opened a TAC case. I'll post the conclusions here.

Thanks!

Javier

Community Member

Power state on Chassis # is redundancy-failed

Hi Javier

Did TAC provide you with a workable solution? I seem to have the same issue on my end. In my case it's a UCS 5108 with 4 power supplies connected and the N+1 option.

Community Member

Power state on Chassis # is redundancy-failed

Hi,

Not yet. We're working on it, but TAC recommends going to 2.0(5b). These bugs seem to be hitting us:

CSCue49366, CSCud48637/CSCue33889.

Javier

Community Member

Power state on Chassis # is redundancy-failed

Hi Javier,

Have you solved the issue yet?

I'm having the same problem, and I'm on 2.1(1e).

Regards,

Bruno

Power state on Chassis # is redundancy-failed

Bruno,

Could you please run the following commands and attach them here:

*connect local a

*connect iom #   <<< # of the chassis where you see the power problem

*show platform soft cmc thermal status

*show platform soft cmc power redundancy

Next

*connect local b

*connect iom #   <<< again, same chassis number

*show platform soft cmc power redundancy

-Kenny

Community Member

Re: Power state on Chassis # is redundancy-failed

Hi Kenny,

See the attached files regarding Fabric-A and Fabric-B

FI-BERNA-A /chassis # show psu-control detail

Psu Control:

    Redundancy: NPlus1

    Input Power: Ok

    Output Power: Ok

    Cluster Power: Slot 1 Master

    Overall Status: Failed

    Config Error: Redundancy Lost

FI-BERNA-A /chassis # show fault

Severity  Code     Last Transition Time     ID       Description

--------- -------- ------------------------ -------- -----------

Major     F0408    2013-06-20T11:03:06.262    341965 Power state on chassis 2 is redundancy-failed

FI-BERNA-A /chassis # show psu detail

PSU:

    PSU: 1

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis

    PID: UCSB-PSU-2500ACPL

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM163000M7

    HW Revision: 0

    Firmware Version: N/A

    PSU: 2

    Overall Status: Operable

    Operability: Operable

    Threshold Status: N/A

    Power State: PwrSave

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: N/A

    Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis

    PID: UCSB-PSU-2500ACPL

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM163000MT

    HW Revision: 0

    Firmware Version: N/A

    PSU: 3

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis

    PID: UCSB-PSU-2500ACPL

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM162900A0

    HW Revision: 0

    Firmware Version: N/A

    PSU: 4

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis

    PID: UCSB-PSU-2500ACPL

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM162900A1

    HW Revision: 0

    Firmware Version: N/A

Kind regards,

Bruno Fernandes

Re: Power state on Chassis # is redundancy-failed

Bruno,

Thanks for the information.

So PSU 2 is in power-save mode:

PSU: 2

    Overall Status: Operable

    Operability: Operable 

    Power State: PwrSave  <<<

From the Active IOM, I can see this:

fex-1# show platform software cmcctrl power redundancy

==============================

Cluster master                 : yes   <<< Shows we are in the primary IOM

Policy                        : N+1

State                        : Lost   <<< This is the only problem, because the PSU is fine

Total power available        : 7500  <<< 3 PSUs available

Total power usage        : 856 <<<< 1 PSU is more than enough to cover this

Power budget requested        : 5472 <<< However, the chassis asks for 3 PSUs to be active; this is not an expected value

-----------

Grid                        : 0

        Active PS        : 0 2 3

        Spare PS        : 1    <<<< 1 is actually PSU 2, which shows up in power save mode

        Unavailable PS        :

-----------

==============================
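The PSU counts in the annotations above follow from simple arithmetic. Here is a quick sketch of it, assuming each supply is rated at 2500 W (consistent with 7500 W corresponding to 3 PSUs); `psus_needed` is just an illustrative helper and says nothing about how the CMC itself evaluates redundancy.

```python
import math

PSU_WATTS = 2500  # assumption: per-PSU capacity, consistent with 7500 W = 3 PSUs


def psus_needed(watts, psu_watts=PSU_WATTS):
    """Smallest number of supplies whose combined rating covers `watts`."""
    return math.ceil(watts / psu_watts)


usage_w = 856     # Total power usage
budget_w = 5472   # Power budget requested

print(psus_needed(usage_w))        # supplies needed for the actual draw
print(psus_needed(budget_w))       # supplies needed for the requested budget
print(psus_needed(budget_w) + 1)   # N+1: one more than the budget requires
```

With these figures the actual draw fits within 1 supply, while the requested budget needs 3, which is why the budget value looks out of line with the load.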

Actions suggested:

1-Change the power policy from N+1 to Grid and vice versa

2-Follow the instructions in the bug CSCty64894 (Note those steps are not disruptive)

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCty64894

I hope this helps, otherwise, let us know.

-Kenny

Community Member

Power state on Chassis # is redundancy-failed

Kenny,

Just to confirm with you:

Regarding the suggested actions:

1-Change the power policy from N+1 to Grid and vice versa

2-Follow the instructions in the bug CSCty64894 (Note those steps are not disruptive)

Neither step 1 nor step 2 is disruptive, correct? Step 2 states that it isn't. I see no reason for step 1 to be disruptive, but I'm not 100% confident. Sorry for the basic question, but I have no spare UCS to test on and this is already in production, so I need to be 100% confident.

Kind regards,

Bruno

Power state on Chassis # is redundancy-failed

Bruno,

Totally safe steps, no disruption whatsoever, since all your PSUs show up as operable.

-Kenny

Community Member

Power state on Chassis # is redundancy-failed

Hi Kenny,

I did both steps with no result, but this morning I repeated step 1 and waited a little longer, and the fault cleared. The chassis also recovered to a healthy state (regarding power redundancy). However, the same PSU still shows a strange result for "Threshold Status" and "Voltage Status".

Could this be because we are using N+1 and in this case only 3 PSUs are in use?

FI-BERNA-B /chassis # show psu detail

PSU:

    PSU: 1

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis

    PID: UCSB-PSU-2500ACPL

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM163000M7

    HW Revision: 0

    Firmware Version: N/A

    PSU: 2

    Overall Status: Operable

    Operability: Operable

    Threshold Status: N/A

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: N/A

    Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis

    PID: UCSB-PSU-2500ACPL

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM163000MT

    HW Revision: 0

    Firmware Version: N/A

    PSU: 3

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis

    PID: UCSB-PSU-2500ACPL

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM162900A0

    HW Revision: 0

    Firmware Version: N/A

    PSU: 4

    Overall Status: Operable

    Operability: Operable

    Threshold Status: OK

    Power State: On

    Presence: Equipped

    Thermal Status: OK

    Voltage Status: OK

    Product Name: Platinum AC PSU for N20-C6508 Blade Server Chassis

    PID: UCSB-PSU-2500ACPL

    VID: V00

    Vendor: Cisco Systems Inc

    Serial (SN): DTM162900A1

    HW Revision: 0

    Firmware Version: N/A

Kind regards,

Bruno Fernandes

Power state on Chassis # is redundancy-failed

Bruno,

Thanks for the feedback, I am glad the power redundancy error message is gone now.

As for the power supply not showing all the correct status information, I recommend that you open a case, as Javier mentioned. This can be an I2C bus issue, where either your PSU is not able to deliver its status messages or the primary IOM is just not receiving them, but this definitely needs deeper analysis.

Please open a TAC case.

-Kenny

Community Member

Re: Power state on Chassis # is redundancy-failed

Hi Bruno,

Not yet. TAC engineers are still working on the case (625642101). We're waiting for an RMA of the 4 PSUs in one chassis. One PSU seems to be causing errors on the I2C bus. We'll probably upgrade to 2.1 for compatibility with new SAN equipment (and also to resolve the bugs that seem to be affecting the system).

Regards

Community Member

Power state on Chassis # is redundancy-failed

Hi Guys

Just to give everyone an update:

1-Change the power policy from N+1 to Grid and vice versa

This worked for us.

Community Member

Re: Power state on Chassis # is redundancy-failed

Hi,

We recently upgraded to 2.1(2a). All seems to be working fine. Let's see how it behaves from now on...

Regards,

Javier

Community Member

Re: Power state on Chassis # is redundancy-failed

I got a notification just this week regarding the power supplies for the chassis.

UCS B-Series chassis power supplies have an issue which can cause shutdown when activated in a redundancy switchover.

Affected units can be identified by the version and serial number format defined in the link below.

http://www.cisco.com/en/US/ts/fn/636/fn63628.html

Community Member

Re: Power state on Chassis # is redundancy-failed

Hi,

Thanks for the info. We have 2 chassis potentially affected by this issue. We have to check the deviation label.

Regards

Re: Power state on Chassis # is redundancy-failed

Hello All,

If you happen to be affected by this Field Notice, please remember that you need a TAC Service Request number; just make reference to this FN number. If you can attach screenshots or pictures, that will speed up the process, and TAC will not have to ask for further information.

Also, please remember that there is no need to open a separate case for each PSU. Confirm how many PSUs are affected, then specify the quantity in the form with the serial numbers separated by commas. If that would exceed 4,000 characters (including blank spaces and commas), you will need to fill out additional forms.
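If there are many affected supplies, splitting the serial numbers by hand gets tedious. The sketch below groups them into comma-separated strings that each stay within the 4,000-character limit mentioned above; `chunk_serials` is a hypothetical helper, and the exact way the form counts characters is an assumption.

```python
def chunk_serials(serials, limit=4000, sep=","):
    """Group serial numbers into `sep`-joined strings of at most `limit` characters."""
    chunks = []
    current = ""
    for sn in serials:
        sn = sn.strip()
        if not sn:
            continue  # ignore empty entries
        if len(sn) > limit:
            raise ValueError(f"serial number longer than limit: {sn!r}")
        candidate = sn if not current else current + sep + sn
        if len(candidate) > limit:
            # Adding this serial would overflow the form field; start a new one.
            chunks.append(current)
            current = sn
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Serial numbers taken from the outputs earlier in this thread; a small limit
# is used here only to make the splitting visible.
serials = ["DTM163000M7", "DTM163000MT", "DTM162900A0", "DTM162900A1"]
for form_value in chunk_serials(serials, limit=25):
    print(form_value)
```

With the default `limit=4000`, each returned string would be pasted into one form.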

-Kenny

Community Member

I am getting this information

I am getting this information in UCSM of UCSB: one PSU for the 1st chassis and one PSU for the 2nd chassis.

Please tell me how I can fix it, as all four power supplies are connected and OK.

Warm regards,

Amit