Nexus 1000V split-brain question

hleschin
Level 1

Hello,

I have a question about an unusual split-brain situation we saw with two VSMs.

We have a well-functioning VM data center with VSM-A (active) on host ESX-1 and VSM-B (standby) on host ESX-2. We use version 4.0.4.SV1.3b for the VSMs and VEMs.

Then host ESX-2 developed problems with the network interfaces that carry the VSM system VLANs, so the heartbeat connection between the VSMs was broken. VSM-B changed its state from standby to active, but it was not reachable via IP (I could only see it through the console in vCenter). On VSM-B, "show module" showed no hosts and no standby VSM. VSM-A was still active and showed both hosts, but no standby VSM. All of this was fine so far.

Now the problem:

When the network connection for host ESX-2 came back, the two active VSMs started seeing each other again. At that moment VSM-A was automatically rebooted and VSM-B stayed active. After the reboot, VSM-A was standby.

That seems strange to me, because VSM-B was the standby VSM before the problems with host ESX-2 occurred, and VSM-B did not have a good configuration during the split-brain situation (it did not know about any hosts). My expectation was that the VSM that was active before the split-brain situation would win the active/standby election.

So my question:

When two active VSMs start seeing each other, which VSM will be the active one? What is the trigger for this?

Regards

Hendrik

4 Replies

jagrelo
Cisco Employee

Hello Hendrik,

When two active VSMs in the same domain resume contact with each other, the primary VSM will always reload. This is a static decision that does not depend on which VSM was active before the active-active situation occurred.

thanks,

/Juan

Hi Juan,

Thanks for your answer, but what exactly do you mean by "primary VSM"?

Regards

Hendrik

In a system with two VSMs, the redundancy role of one of them is always configured as "primary" and the other as "secondary" during the initial setup (see the "show system redundancy state" output). The terms primary and secondary are equivalent to "module 1" and "module 2" in the virtual chassis, and are not to be confused with "active" and "standby", which are the redundancy states. When the system boots up, the primary VSM normally becomes the active and the secondary VSM becomes the standby. However, depending on the number of system switchovers after that, the active VSM could be either the primary or the secondary at any point in time.

The redundancy role is equivalent to the module number and does not change after initial setup unless the user explicitly changes it. The redundancy state tells you which VSM is controlling the switch and which one is backing it up, and it may change depending on operational conditions.
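
To illustrate the difference between role and state, here is a minimal Python sketch of the rule described above. It is only a model of the behaviour, not VSM code. The names `Vsm`, `Role`, `State` and `vsm_to_reload` are made up for this example, and it assumes VSM-A was configured as the primary, which is consistent with VSM-A being the one that reloaded.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    PRIMARY = "primary"      # module 1, fixed at initial setup
    SECONDARY = "secondary"  # module 2, fixed at initial setup


class State(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class Vsm:
    name: str
    role: Role    # does not change unless the user changes it
    state: State  # changes with switchovers and failures


def vsm_to_reload(a: Vsm, b: Vsm) -> Optional[Vsm]:
    """Return the VSM that reloads when the two resume contact, or None."""
    if a.state is State.ACTIVE and b.state is State.ACTIVE:
        # Static rule: the primary reloads, regardless of which VSM
        # was active before the active-active situation occurred.
        return a if a.role is Role.PRIMARY else b
    return None  # normal active/standby pair, nothing to resolve


# Assumption: VSM-A is the primary, VSM-B the secondary.
vsm_a = Vsm("VSM-A", Role.PRIMARY, State.ACTIVE)
vsm_b = Vsm("VSM-B", Role.SECONDARY, State.ACTIVE)

loser = vsm_to_reload(vsm_a, vsm_b)
print(loser.name if loser else "no reload")
```

Note that the function never looks at the history of the pair, only at the configured role, which is why the previously active VSM does not automatically win.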

thanks,

/Juan

Thanks Juan, that answers my questions completely.

But I don't think it is good behaviour for the primary VSM to always reload after a split-brain situation.

Example:

When the ESX interface carrying the system VLANs for the secondary VSM is flapping, it causes a split-brain situation, and a few seconds later, when the VSMs see each other again, the well-functioning primary VSM reloads. A few seconds later the secondary VSM is isolated again because of the NIC flapping, and the primary VSM comes back and is active. A few seconds after that, the cycle starts again when the VSMs see each other.

In this situation (with a flapping NIC on the secondary VSM) I would have no working VSM, because the primary VSM is permanently reloading and the secondary VSM disappears every 5 seconds.

And this is not an unlikely example; we had this situation this week.
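
To make the loop concrete, here is a rough Python sketch of the flapping scenario. It is a simplified model under my own assumptions, not actual VSM behaviour: time is reduced to discrete steps, a reload finishes within one step, and the reloaded primary rejoins as standby while the link is up. The function name `count_primary_reloads` and the sample link sequence are made up for illustration.

```python
def count_primary_reloads(link_states):
    """Count primary reloads for a sequence of heartbeat link states.

    link_states: iterable of booleans, True = the VSMs can see each other.
    """
    primary, secondary = "active", "standby"
    reloads = 0
    for link_up in link_states:
        if not link_up:
            # Heartbeat lost: each VSM assumes its peer is gone and goes active.
            primary = secondary = "active"
        elif primary == "active" and secondary == "active":
            # Contact resumed with two actives: the primary reloads and,
            # in this simplified model, rejoins as standby.
            reloads += 1
            primary = "standby"
    return reloads


# Hypothetical flapping NIC on the secondary's host: up, down, up, down, ...
flapping = [True, False, True, False, True, False, True]
print(count_primary_reloads(flapping))  # prints 3
```

Every time the link comes back, the primary reloads once, so with a steadily flapping NIC the primary never stays up for long.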

Regards

Hendrik