Activating the Fabric Interconnect Firmware for a Cluster Configuration

Aug 9th, 2012

I have been following the upgrade guide for upgrading within 2.0. Everything went smoothly until I started on the FIs. I activated FI-B first, then I set FI-B as the lead. Since then I have not been able to log in to the UCSM GUI. The virtual IP is not responding, and FI-B does not respond to ping either.

The problem started at the end of page 40 in the "Upgrade UCS from 2.0 to 2.x" guide.

Is there a way to restart the web server, perhaps? Bear in mind that I am in the middle of an upgrade. I have different firmware on the two FIs, because FI-B has already been upgraded according to the instructions in the Upgrade Guide.

When I log in to the two FIs, everything seems to be normal. I think the problem could have arisen because of the strange 10-minute time difference I reported earlier today.

See the fault output from both FIs here.

FI6120XP-A# scope fabric-interconnect a
FI6120XP-A /fabric-interconnect # show fault
Severity  Code     Last Transition Time     ID       Description
--------- -------- ------------------------ -------- -----------
Cleared   F16550   2012-08-09T20:55:29.330    167299 [FSM:STAGE:RETRY:]: Uplink eth port configuration on A(FSM-STAGE:sam:dme:SwEthLanBorderDeploy:UpdateConnectivity)
Cleared   F16550   2012-08-09T20:55:29.330    167303 [FSM:STAGE:RETRY:]: Uplink fc port configuration on A(FSM-STAGE:sam:dme:SwFcSanBorderDeploy:UpdateConnectivity)
Info      F0279    2012-08-09T12:39:44.804     44806 fc port 6 on fabric interconnect A oper state: sfp-not-present
Info      F0279    2012-08-09T12:39:44.803     44801 fc port 1 on fabric interconnect A oper state: sfp-not-present
Info      F0279    2012-08-09T12:39:44.803     44802 fc port 2 on fabric interconnect A oper state: sfp-not-present
Info      F0279    2012-08-09T12:39:44.803     44803 fc port 3 on fabric interconnect A oper state: sfp-not-present
Info      F0279    2012-08-09T12:39:44.803     44804 fc port 4 on fabric interconnect A oper state: sfp-not-present
Info      F0279    2012-08-09T12:39:44.803     44805 fc port 5 on fabric interconnect A oper state: sfp-not-present
Major     F0374    2012-08-07T14:44:23.960    115262 Power supply 1 in fabric interconnect A operability: inoperable
Major     F0369    2012-08-07T14:44:23.959    115261 Power supply 1 in fabric interconnect A power: error
FI6120XP-A /fabric-interconnect #  exit

----

FI6120XP-A# scope fabric-interconnect b
FI6120XP-A /fabric-interconnect # show fault
Severity  Code     Last Transition Time     ID       Description
--------- -------- ------------------------ -------- -----------
Cleared   F16550   2012-08-09T20:55:29.330    167301 [FSM:STAGE:RETRY:]: Uplink eth port configuration on B(FSM-STAGE:sam:dme:SwEthLanBorderDeploy:UpdateConnectivity)
Cleared   F16550   2012-08-09T20:55:29.330    167305 [FSM:STAGE:RETRY:]: Uplink fc port configuration on B(FSM-STAGE:sam:dme:SwFcSanBorderDeploy:UpdateConnectivity)
Major     F0276    2012-08-09T20:40:14.059    103277 ether port 5 on fabric interconnect B oper state: link-down, reason: Link failure or not-connected
Major     F0276    2012-08-09T20:40:14.059    103279 ether port 7 on fabric interconnect B oper state: link-down, reason: Link failure or not-connected
Cleared   F16550   2012-08-09T20:39:29.666    168434 [FSM:STAGE:RETRY:]: internal network configuration on B(FSM-STAGE:sam:dme:SwAccessDomainDeploy:UpdateConnectivity)
Cleared   F0278    2012-08-09T20:38:37.860    168199 ether port 20 on fabric interconnect B oper state: hardware-failure, reason: hardware-failure
Cleared   F0278    2012-08-09T20:38:37.566    168200 ether port 19 on fabric interconnect B oper state: hardware-failure, reason: hardware-failure
Cleared   F16653   2012-08-09T20:38:33.969    168210 [FSM:STAGE:RETRY:]: rebooting remote fabric interconnect(FSM-STAGE:sam:dme:MgmtControllerUpdateSwitch:resetRemote)
Cleared   F0291    2012-08-09T20:37:19.381    168209 Fabric Interconnect B operability: inoperable
Major     F0369    2012-08-07T14:44:52.779    115581 Power supply 1 in fabric interconnect B power: error
Major     F0374    2012-08-07T14:44:52.779    115582 Power supply 1 in fabric interconnect B operability: inoperable
Info      F0279    2012-07-06T10:43:49.427     50643 fc port 5 on fabric interconnect B oper state: sfp-not-present
Info      F0279    2012-07-06T10:43:49.427     50644 fc port 6 on fabric interconnect B oper state: sfp-not-present
Info      F0279    2012-07-06T10:43:49.426     50640 fc port 2 on fabric interconnect B oper state: sfp-not-present
Info      F0279    2012-07-06T10:43:49.426     50641 fc port 3 on fabric interconnect B oper state: sfp-not-present
Info      F0279    2012-07-06T10:43:49.426     50642 fc port 4 on fabric interconnect B oper state: sfp-not-present
Info      F0279    2012-07-06T10:43:49.425     50639 fc port 1 on fabric interconnect B oper state: sfp-not-present
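
With listings this long, it can help to separate the faults that are still active from those already cleared. The following is a minimal Python sketch, not a UCS Manager feature: `parse_faults` and `active_faults` are hypothetical helper names, the column layout is assumed from the `show fault` listing in this thread, and each fault record is assumed to sit on a single line (terminal-wrapped lines would need to be rejoined first). The sample rows are taken from the FI-B listing.

```python
import re

# One fault row: severity, fault code, timestamp, numeric ID, description.
# The layout is an assumption based on the `show fault` output in this thread.
FAULT_RE = re.compile(
    r"^(?P<severity>[A-Za-z]+)\s+(?P<code>F\d+)\s+(?P<time>\S+)\s+"
    r"(?P<id>\d+)\s+(?P<desc>.+)$"
)


def parse_faults(text):
    """Return one dict per fault row; header and separator lines are skipped."""
    faults = []
    for line in text.splitlines():
        match = FAULT_RE.match(line.strip())
        if match:
            faults.append(match.groupdict())
    return faults


def active_faults(faults):
    """Keep only the faults whose severity is not 'Cleared'."""
    return [f for f in faults if f["severity"].lower() != "cleared"]


SAMPLE = """\
Severity  Code     Last Transition Time     ID       Description
--------- -------- ------------------------ -------- -----------
Major     F0276    2012-08-09T20:40:14.059    103277 ether port 5 on fabric interconnect B oper state: link-down, reason: Link failure or not-connected
Cleared   F0291    2012-08-09T20:37:19.381    168209 Fabric Interconnect B operability: inoperable
Major     F0369    2012-08-07T14:44:52.779    115581 Power supply 1 in fabric interconnect B power: error
Info      F0279    2012-07-06T10:43:49.427     50643 fc port 5 on fabric interconnect B oper state: sfp-not-present
"""

faults = parse_faults(SAMPLE)
for fault in active_faults(faults):
    print(fault["severity"], fault["code"], fault["desc"])
```

Filtering this way makes it easier to see that none of the remaining active faults (link-down ports, a power supply error, missing SFPs) points directly at the management path.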

CSCO12035638 Thu, 08/09/2012 - 12:33

This is in the main event log:

FI6120XP-A# show event
Creation Time            ID       Code     Description
------------------------ -------- -------- -----------
2012-08-09T21:40:59.700    183964 E4195227 [FSM:STAGE:REMOTE-ERROR]: Result: end-point-unavailable Code: ERR-NTP-set-error Message: Device Name:[0x3FF] Instance:[63] Error Type:[(null)] code:[255](sam:dme:CommSvcEpUpdateSvcEp:SetEpLocal)
2012-08-09T21:40:59.699    183963 E4195227 [FSM:STAGE:STALE-FAIL]: communication service  configuration to primary(FSM-STAGE:sam:dme:CommSvcEpUpdateSvcEp:SetEpLocal)
2012-08-09T21:40:00.555    183961 E4195227 [FSM:STAGE:STALE-FAIL]: communication service  configuration to primary(FSM-STAGE:sam:dme:CommSvcEpUpdateSvcEp:SetEpLocal)
2012-08-09T21:40:00.555    183962 E4195227 [FSM:STAGE:REMOTE-ERROR]: Result: end-point-unavailable Code: ERR-NTP-set-error Message: Device Name:[0x3FF] Instance:[63] Error Type:[(null)] code:[255](sam:dme:CommSvcEpUpdateSvcEp:SetEpLocal)
2012-08-09T21:38:49.703    183959 E4195227 [FSM:STAGE:STALE-FAIL]: communication service  configuration to primary(FSM-STAGE:sam:dme:CommSvcEpUpdateSvcEp:SetEpLocal)
2012-08-09T21:38:49.703    183960 E4195227 [FSM:STAGE:REMOTE-ERROR]: Result: end-point-unavailable Code: ERR-NTP-set-error Message: Device Name:[0x3FF] Instance:[63] Error Type:[(null)] code:[255](sam:dme:CommSvcEpUpdateSvcEp:SetEpLocal)
2012-08-09T21:37:40.335    183929 E4195227 [FSM:STAGE:STALE-FAIL]: communication service  configuration to primary(FSM-STAGE:sam:dme:CommSvcEpUpdateSvcEp:SetEpL

CSCO12035638 Fri, 08/10/2012 - 00:08

I cannot log in to FI-B with SSH either. Remember that I reported a problem setting the clock on FI-B? It was always 10 minutes fast. The error above is on FI-A, but I think it is on FI-B as well. I will now try to set the clocks again on both FIs.

CSCO12035638 Fri, 08/10/2012 - 00:24

I took a backup of the full config. Will a restore set it back to where I started? (That would still leave the problem with the time difference.)

CSCO12035638 Fri, 08/10/2012 - 00:30

FI6120XP-B# show cluster extended-state
Cluster Id: 0xd03a569cc74311e1-0x9ad4000decfbaec4

Start time: Thu Aug  9 20:46:42 2012
Last election time: Thu Aug  9 20:52:26 2012

B: UP, PRIMARY
A: UP, SUBORDINATE

B: memb state UP, lead state PRIMARY, mgmt services state: UP
A: memb state UP, lead state SUBORDINATE, mgmt services state: UP
   heartbeat state PRIMARY_OK

INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP

HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1338GY8D, state: active
FI6120XP-B#
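
To pull the key facts out of this output (which FI is primary, whether the cluster is HA ready, and the state of the management services), something like the following Python sketch could be used. It is only an illustration: `parse_cluster_state` and `primary_fabric` are hypothetical names, and the line formats are assumed from the output pasted in this post.

```python
import re

# Matches lines such as:
#   "B: memb state UP, lead state PRIMARY, mgmt services state: UP"
# The format is assumed from the output in this thread.
MEMBER_RE = re.compile(
    r"^(?P<fabric>[AB]): memb state (?P<memb>\w+), "
    r"lead state (?P<lead>\w+), mgmt services state: (?P<mgmt>\w+)$"
)


def parse_cluster_state(text):
    """Collect per-fabric membership info and the HA READY flag."""
    state = {"ha_ready": False, "members": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "HA READY":
            state["ha_ready"] = True
            continue
        match = MEMBER_RE.match(line)
        if match:
            state["members"][match.group("fabric")] = {
                "memb": match.group("memb"),
                "lead": match.group("lead"),
                "mgmt": match.group("mgmt"),
            }
    return state


def primary_fabric(state):
    """Return the fabric ID whose lead state is PRIMARY, or None."""
    for fabric, info in state["members"].items():
        if info["lead"] == "PRIMARY":
            return fabric
    return None


SAMPLE = """\
B: UP, PRIMARY
A: UP, SUBORDINATE
B: memb state UP, lead state PRIMARY, mgmt services state: UP
A: memb state UP, lead state SUBORDINATE, mgmt services state: UP
   heartbeat state PRIMARY_OK
HA READY
"""

state = parse_cluster_state(SAMPLE)
print("primary:", primary_fabric(state), "HA ready:", state["ha_ready"])
```

Note that everything here reports healthy even though the GUI is unreachable, so the cluster state alone says nothing about whether the management addresses can be reached from the network.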

CSCO12035638 Fri, 08/10/2012 - 00:33

IOM 1 (Fabric A):
    Running-Vers: 2.0(1t)
    Package Vers: 2.0(1t)A
    Update-Status: Ready
    Activate-Status: Pending Next Boot
IOM 2 (Fabric B):
    Running-Vers: 2.0(3b)
    Package Vers: 2.0(3b)A
    Update-Status: Ready
    Activate-Status: Ready

I will reboot FI-A (it did not say this).

----

That did not help; the situation is the same. The cluster reports that it is OK, but I cannot log in to either the virtual IP or FI-B.

CSCO12035638 Fri, 08/10/2012 - 01:09

I think somehow this step went wrong:

Step 2 To activate the IOM firmware, do the following in the Activate Firmware dialog box:

a) From the Filter drop-down list, choose IO Modules.

b) From the Set Version drop-down list, choose the version for the current 2.0 release.

c) Check the Ignore Compatibility Check check box.

d) Check the Set Startup Version Only check box.

Important: When you configure Set Startup Version Only for an I/O module, the I/O module is rebooted when the fabric interconnect in its data path is rebooted. If you do not configure Set Startup Version Only for an I/O module, the I/O module reboots and disrupts traffic. In addition, if Cisco UCS Manager detects a protocol and firmware version mismatch between the fabric interconnect and the I/O module, Cisco UCS Manager automatically updates the I/O module with the firmware version that matches the firmware in the fabric interconnect and then activates the firmware and reboots the I/O module again.

e) Click Apply.

When the Activate Status column for all IOMs displays pending-next-boot, continue with Step 3.

Step 3 Click OK.

I think so because IOM-A never seems to have been activated: its package version is still 2.0(1t).

FI6120XP-A# show iom version
Chassis 1:
    IOM      Fabric ID Running-Vers    Package Vers   Update-Status   Activate-Status
    -------- --------- --------------- -------------- --------------- ---------------
           1 A         2.0(1t)         2.0(1t)A       Ready           Pending Next Boot
           2 B         2.0(3b)         2.0(3b)A       Ready           Ready
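
The same check can be scripted. Below is a small Python sketch that parses this table and reports which IOMs are still waiting for a reboot and whether the running versions differ. The helper names (`parse_iom_versions`, `pending_ioms`, `mixed_versions`) are made up for this example, and the column layout is assumed from the `show iom version` output above.

```python
import re

# One table row: IOM number, fabric ID, running version, package version,
# update status, and activate status (which may contain spaces).
ROW_RE = re.compile(
    r"^(?P<iom>\d+)\s+(?P<fabric>[AB])\s+(?P<running>\S+)\s+"
    r"(?P<package>\S+)\s+(?P<update>\S+)\s+(?P<activate>.+)$"
)


def parse_iom_versions(text):
    """Return one dict per IOM row; headers and separators are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def pending_ioms(rows):
    """IOM numbers whose activate status is anything other than 'Ready'."""
    return [int(r["iom"]) for r in rows if r["activate"] != "Ready"]


def mixed_versions(rows):
    """True if the IOMs are not all running the same firmware version."""
    return len({r["running"] for r in rows}) > 1


SAMPLE = """\
Chassis 1:
    IOM      Fabric ID Running-Vers    Package Vers   Update-Status   Activate-Status
    -------- --------- --------------- -------------- --------------- ---------------
           1 A         2.0(1t)         2.0(1t)A       Ready           Pending Next Boot
           2 B         2.0(3b)         2.0(3b)A       Ready           Ready
"""

rows = parse_iom_versions(SAMPLE)
print("pending next boot:", pending_ioms(rows))
print("mixed running versions:", mixed_versions(rows))
```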

CSCO12035638 Fri, 08/10/2012 - 01:23

Severity: Major

Code: F0910

Last Transition Time: 2012-08-09T20:22:39.621

ID: 167826

Status: None

Description: default Keyring's certificate is invalid, reason: expired.

Affected Object: sys/pki-ext/keyring-default

Name: Pki Key Ring Status

Cause: Invalid Keyring Certificate

Type: Security

CSCO12035638 Fri, 08/10/2012 - 01:52

Since the IOM firmware is different on IOM-A, maybe I should start there?

Is this how I do it? (I cannot do this in scope firmware.)

FI6120XP-A /system # activate firmware  ucs-2100.2.0.3b.bin (I have a 2104XP)

CSCO12035638 Fri, 08/10/2012 - 06:13

I now understand more. The reason FI-A and IOM-A are showing old packages is that IOM-A will only upgrade to the newest version when we activate the new firmware for FI-A. So the solution may be to go ahead and do this from the CLI. Then they will both be updated.

But my GUI problem may still be there. These may be two different problems. I just thought the GUI problem was related to this.

CSCO12035638 Fri, 08/10/2012 - 06:23

I am thinking of activating these two:

Kernel: ucs-6100-k9-kickstart.5.0.3.N2.2.03b.bin

System: ucs-6100-k9-system.5.0.3.N2.2.03b.bin

In the GUI these are done at the same time. Here they will have to be executed one at a time.

CSCO12035638 Sun, 08/12/2012 - 01:54

This can be closed. We found the mistake: FI-B was connected through its mgmt1 interface. Therefore, when we switched management over to FI-B, we could no longer access the GUI. The same mistake caused my time sync issues.
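
A quick way to catch this kind of cabling mistake before changing the cluster lead is to confirm that every management address answers from the management network. The Python sketch below only attempts a TCP connection; it is an illustration under assumptions, not a Cisco tool. The addresses are hypothetical placeholders from the documentation range, the function names are made up, and port 443 assumes the GUI is served over HTTPS.

```python
import socket


def tcp_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_management(addresses, port=443):
    """Map each label to True or False depending on whether it answers."""
    return {label: tcp_reachable(host, port) for label, host in addresses.items()}


# Hypothetical addresses (RFC 5737 documentation range); replace with your own.
EXAMPLE_ADDRESSES = {
    "FI-A mgmt": "192.0.2.11",
    "FI-B mgmt": "192.0.2.12",
    "cluster virtual IP": "192.0.2.10",
}

# Example call (opens real network connections, so it is left commented out):
# print(check_management(EXAMPLE_ADDRESSES))
```

If the subordinate FI's own management address does not answer while the primary's does, the virtual IP would be expected to stop responding once that FI takes over as lead, which matches what happened here.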

Posted August 9, 2012 at 12:30 PM