Cisco Support Community
New Member

UCSM 2.2(1b) Cluster IP Failover Issue

Hi,

 

We are doing UAT on our newly installed 6248 FIs with the following details, and we are observing some unusual behavior:

 

FI-A - 172.10.15.51 (Primary)

FI-B - 172.10.15.52 (Secondary)

Cluster IP - 172.10.15.50

 

If we shut down FI-A (Primary), the Cluster IP does not fail over to FI-B, and FI-B is not promoted to Primary.

Is there a known issue related to 2.2(1b)?

13 REPLIES
VIP Green

Did you check the cluster status before the test (CLI: show cluster status)?

Can you clarify what you mean by shutting down FI-A? Power off?

After shutting down FI-A, SSH to FI-B and run show cluster status again.

New Member

Below is the output before powering off FI-A:

================================================

lavender-A(local-mgmt)# show cluster extended-state
Cluster Id: 0x7bce8766ecdc11e3-0xa1b4002a6ac23b41

Start time: Sat Jun 14 21:05:55 2014
Last election time: Sat Jun 14 21:17:57 2014

A: UP, PRIMARY
B: UP, SUBORDINATE

A: memb state UP, lead state PRIMARY, mgmt services state: UP
B: memb state UP, lead state SUBORDINATE, mgmt services state: UP
   heartbeat state PRIMARY_OK

INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP

HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1812GNZZ, state: active
Chassis 2, serial: FOX1813GFT7, state: active
Chassis 3, serial: FOX1814G5DR, state: active
lavender-A(local-mgmt)#

================================================

 

Below is the output from FI-B after shutting down FI-A:

================================================

lavender-B(local-mgmt)# show cluster extended-state
Cluster Id: 0x7bce8766ecdc11e3-0xa1b4002a6ac23b41

Start time: Wed Jun 11 19:09:13 2014
Last election time: Sat Jun 14 13:45:18 2014

B: UP, PRIMARY, (Management services: SWITCHOVER IN PROGRESS)
A: DOWN, INAPPLICABLE

B: memb state UP, lead state PRIMARY, mgmt services state: INVALID
A: memb state DOWN, lead state INAPPLICABLE, mgmt services state: DOWN
   heartbeat state SECONDARY_FAILED

INTERNAL NETWORK INTERFACES:
eth1, DOWN
eth2, DOWN

HA NOT READY
Management services: switchover in progress on local Fabric Interconnect
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1812GNZZ, state: active
Chassis 2, serial: FOX1813GFT7, state: active
Chassis 3, serial: FOX1814G5DR, state: active
lavender-B(local-mgmt)#

================================================
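For comparing the two outputs above mechanically, here is a minimal Python sketch that pulls the per-FI summary, the internal interface states, and the HA readiness line out of pasted `show cluster extended-state` text. It assumes the output format is exactly as shown in this thread; the function names and the dictionary layout are my own, not part of any Cisco tool.

```python
import re

# Summary lines look like "A: UP, PRIMARY" or
# "B: UP, PRIMARY, (Management services: SWITCHOVER IN PROGRESS)".
SUMMARY = re.compile(r"^([AB]): (UP|DOWN), (\w+)(?:, \((.*)\))?$")
IFACE = re.compile(r"^eth\d+, (UP|DOWN)$")

# Excerpt of the FI-B output posted above (after FI-A was shut down).
SAMPLE_AFTER = """\
B: UP, PRIMARY, (Management services: SWITCHOVER IN PROGRESS)
A: DOWN, INAPPLICABLE

B: memb state UP, lead state PRIMARY, mgmt services state: INVALID
A: memb state DOWN, lead state INAPPLICABLE, mgmt services state: DOWN
   heartbeat state SECONDARY_FAILED

INTERNAL NETWORK INTERFACES:
eth1, DOWN
eth2, DOWN

HA NOT READY
"""

def parse_extended_state(text):
    """Return per-FI summary state, interface states and the HA flag."""
    result = {"fabric": {}, "interfaces": {}, "ha_ready": None}
    for raw in text.splitlines():
        line = raw.strip()
        m = SUMMARY.match(line)
        if m:
            fi, memb, lead, note = m.groups()
            result["fabric"][fi] = {"memb": memb, "lead": lead, "note": note}
        elif IFACE.match(line):
            name, state = line.split(", ")
            result["interfaces"][name] = state
        elif line == "HA READY":
            result["ha_ready"] = True
        elif line == "HA NOT READY":
            result["ha_ready"] = False
    return result

def switchover_in_progress(state):
    """True if any FI summary line reports a management switchover."""
    return any("SWITCHOVER IN PROGRESS" in (fi["note"] or "")
               for fi in state["fabric"].values())

state = parse_extended_state(SAMPLE_AFTER)
print(state["fabric"], state["ha_ready"], switchover_in_progress(state))
```

Run against the second output, this shows FI-B already elected PRIMARY while the management services switchover has not completed, which matches the VIP being unreachable.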

It's really strange: even when I do a manual cluster lead change, it takes a long time for the VIP to switch from FI-A to FI-B and vice versa.

 

Regards,

Amit Vyas

VIP Green

I assume that initially you can ping FI-A, FI-B, and the VIP?

As can be seen, FI-A is now down and FI-B is primary. Can you ping FI-B and the VIP now?

New Member

I can only ping FI-B; I am unable to ping the VIP until FI-A comes back up.
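The ping checks discussed here can be turned into a small repeatable test. The sketch below is only an illustration: it probes TCP port 22 instead of ICMP (an assumption, chosen because SSH to the individual FIs works in this setup), and the `diagnose` labels are made up for this example. The addresses are the ones from the first post.

```python
import socket

def tcp_reachable(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def diagnose(reach):
    """Map reachability of FI-A, FI-B and the VIP to a short label."""
    a, b, vip = reach["FI-A"], reach["FI-B"], reach["VIP"]
    if vip:
        return "ok"
    if a and b:
        return "vip-down-both-fis-up"
    if a or b:
        # Exactly one FI answers but the cluster IP does not:
        # the situation described in this thread.
        return "vip-not-failed-over"
    return "nothing-reachable"

# Hypothetical usage against the addresses from the first post:
# targets = {"FI-A": "172.10.15.51", "FI-B": "172.10.15.52", "VIP": "172.10.15.50"}
# print(diagnose({name: tcp_reachable(ip) for name, ip in targets.items()}))
```

With FI-A powered off, FI-B reachable and the VIP silent, `diagnose` returns "vip-not-failed-over".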

VIP Green

Must be a bug!

By the way, did you set this?

admin->managementInterfaces->managementInterfaceMonitoringPolicy

Admin status -> enable
Monitoring mechanism -> MII status

VIP Green

CSCum82888

 

After an upgrade to UCSM 2.2.1b we see the following symptoms:
- No UCSM GUI access.
- Virtual IP is not reachable.
- Virtual IP cannot be accessed either by GUI or CLI/SSH.
- Individual FI can be accessed via SSH but not via http.

 
Conditions:
- Issue occurred after an upgrade to version 2.2.1b.

 
- The issue can happen in the following two conditions:
1. default keyring is deleted and the system is upgraded to 2.2.1<x> 
2. When default keyring is deleted on a system running 2.2.1<x> and the system is rebooted.

 
Workaround:
- Make the key and certificate links used by Apache httpd point to any valid key/certificate (by deleting and re-creating the links). This requires loading the debug plugin.

 
The situation can be avoided by:
1. not deleting the default keyring (or recreating it if deleted) before upgrading to 2.2.1<x>.  
2. Not deleting the default keyring even after upgrading to 2.2.1<x>

 
Further Problem Description:
The issue is caused by a deadlock.

 

New Member

I also assume that it might be a bug.

Yes, Admin status is enabled and the Monitoring Mechanism is MII Status.

But that setting didn't help either. When I restarted the primary FI, I lost ping to both the VIP and FI-A, and the VIP only responded again after FI-A came back up.

 

VIP Green

Hi Amit,

I hope you have seen the bug ID (CSCum82888)!

Walter.
 

New Member

I don't think I am hitting this bug, because these devices were shipped with 2.2(1b) rather than upgraded to it.

I am not sure what the issue could be.

 

VIP Green

Understood, but it could also happen under the second condition:

2. When default keyring is deleted on a system running 2.2.1<x> and the system is rebooted.

New Member

No, I haven't deleted the keyring, but I have rebooted the FIs multiple times.

VIP Green

I would open a TAC case and/or upgrade the infrastructure with Auto Install to the latest long-lived, Cisco-recommended release (2.2.1d).

New Member

I have raised an SR, and a TAC engineer is working on it.

Meanwhile, I would like to understand the following two things:

  • How does failover happen between Fabric Interconnects, i.e. how is the Primary/Subordinate role transferred from one to the other?
  • How long should it ideally take for the VIP sub-interface to be created on the peer fabric interconnect when the Primary FI is rebooted or powered down?
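This does not answer what the expected takeover time is, but the observed time can at least be measured so that there is a concrete number to give TAC. A minimal sketch, assuming the VIP is probed at regular intervals during a failover test and each probe is logged as a (timestamp in seconds, reachable) pair; the function name and the sample values are hypothetical.

```python
def longest_outage(samples):
    """Longest unreachable window in seconds.

    samples: list of (t_seconds, reachable) tuples in time order.
    A window runs from the first failed probe to the next successful
    probe, or to the last sample if the target never recovers.
    """
    longest = 0.0
    down_since = None
    for t, ok in samples:
        if not ok and down_since is None:
            down_since = t          # outage starts
        elif ok and down_since is not None:
            longest = max(longest, t - down_since)
            down_since = None       # outage ends
    if down_since is not None and samples:
        longest = max(longest, samples[-1][0] - down_since)
    return longest

# Made-up probe log: VIP answers, drops for a while, then answers again.
probes = [(0, True), (1, False), (2, False), (5, True), (6, True)]
print(longest_outage(probes))
```

The resolution of the result is limited by the probe interval, so a one-second interval gives the outage to within about a second.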

 
