I need to know a scenario that would cause a split brain Central Controller. I have my Central Controller split across two different sites and we recently had a failure of the Public and Private WAN. Each site still had local connectivity (Rogger A could communicate with PG A and Rogger B could communicate with PG B) but any synchronization that would have taken place between the duplexed pairs would have failed. There are currently three PGs configured in the Router configuration. I am assuming that if the Roggers could not communicate, they would both assume that the other side was down and try to take control. Finally, would the system automatically recover from a split brain Central Controller or would manual steps be required? Thanks
Thanks for the quick reply. I think I read something similar to this in the SRND, but if you are able to find a separate document on this question specifically I would appreciate any help I can get. I must have had the wrong understanding of split brain. I was under the assumption that split brain meant that both sides thought they were in control because they could not communicate with their duplexed partner. Does that mean that both sides of the Router are not Active when split brain occurs?
John, you're correct: split brain refers to both sides being active at once. ICM tries to avoid this with the scheme that Rob described: side A will go active only if it can communicate with at least half of the configured PGs, while side B will go active only if it can communicate with a majority (more than half). If you have PGs which are split across the WAN, then it is possible for both Routers to think they are communicating with a majority of PGs, since the PGs themselves can become split brain. I imagine that's what is happening in your scenario. There are different ways to deal with this, for example putting a simplex "dummy" PG only on side A or at a third site. It depends on what sites you have available and what type of failover behavior you are looking for.
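To make the thresholds concrete, here is a minimal sketch of the rule exactly as worded in this thread. It is not ICM code, and the function names are made up for illustration. It only shows why "at least half" for side A and "more than half" for side B cannot both be satisfied when every PG is visible to just one side, and why a PG that both sides count changes that.

```python
# Illustrative model of the rule as described in this thread.
# Function names are hypothetical; this is not how ICM is implemented.

def side_a_may_go_active(reachable_pgs: int, configured_pgs: int) -> bool:
    """Side A needs at least half of the configured PGs."""
    return 2 * reachable_pgs >= configured_pgs


def side_b_may_go_active(reachable_pgs: int, configured_pgs: int) -> bool:
    """Side B needs a strict majority (more than half) of the configured PGs."""
    return 2 * reachable_pgs > configured_pgs


configured = 3  # three PGs, as in the original question

# Clean partition: each PG is seen by exactly one side (2 + 1 = 3).
# Only one side can meet its threshold.
print(side_a_may_go_active(2, configured), side_b_may_go_active(1, configured))
print(side_a_may_go_active(1, configured), side_b_may_go_active(2, configured))

# One PG is counted by both sides (for example a duplexed PG that is itself
# split across the WAN), so each Router believes it sees 2 of 3 PGs.
print(side_a_may_go_active(2, configured), side_b_may_go_active(2, configured))
```

The first two lines print one True and one False each; the last prints True for both sides, which is the split brain case. With an even PG count, the asymmetry between the two thresholds is what breaks the tie in favour of side A.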
The only time I have seen a split brain happen was when certain parts of the customer's network were failing intermittently. They had 3 call centre sites - A, B & C - with the side A CC at Site A and side B CC at a data centre site - Site D.
The customer had 9 PGs - 3 agent, 3 IPIVR, 3 MR PGs - with one of each at each call centre site.
An intermittent failure of a fibre port connector was occurring at site A's connection to Site D, which resulted in the network toggling between a primary and a backup route every few seconds. This resulted in a break in the CC traffic, but left each CC able to talk to a majority of PGs (due to the WAN architecture).
This caused both CC sides to declare themselves the primary / active side, which caused the PGs to receive different route results (because of a lack of CC sync). This was ugly.
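A rough way to check a design for this exposure is to write down which PGs each CC side could still reach if only the CC-to-CC path failed, then apply the thresholds described earlier in the thread. The sketch below does that for a layout loosely modelled on this customer. The PG names and the reachability sets are assumptions made for illustration, not the customer's actual topology, and the function name is hypothetical.

```python
# Hypothetical what-if check: could both sides go active for a given
# reachability picture? Names and topology below are assumed.

def both_sides_could_go_active(all_pgs: set, seen_by_a: set, seen_by_b: set) -> bool:
    total = len(all_pgs)
    a_ok = 2 * len(seen_by_a & all_pgs) >= total  # side A: at least half
    b_ok = 2 * len(seen_by_b & all_pgs) > total   # side B: more than half
    return a_ok and b_ok


# Nine PGs: one agent, one IVR and one MR PG at each call centre site.
all_pgs = {f"{site}-{kind}" for site in "ABC" for kind in ("agent", "ivr", "mr")}

# Assumed failure: the Site A to Site D path drops, cutting CC-to-CC traffic.
# Side A (at Site A) still reaches every PG; side B (at Site D) loses only
# the Site A PGs but keeps Sites B and C.
seen_by_a = set(all_pgs)
seen_by_b = {pg for pg in all_pgs if not pg.startswith("A-")}

print(both_sides_could_go_active(all_pgs, seen_by_a, seen_by_b))  # True

# If side B could reach only the Site B PGs instead, it would stay inactive.
seen_by_b_isolated = {pg for pg in all_pgs if pg.startswith("B-")}
print(both_sides_could_go_active(all_pgs, seen_by_a, seen_by_b_isolated))  # False
```

The point of the exercise is the overlap: whenever the WAN lets both CC sides reach the same PGs while the CCs cannot reach each other, both thresholds can be met at once.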
The customer rectified this by changing the routes, and moving the CCs to different links.
The recommendations listed in the above posts cover off what you need to do to minimise the possibility of a split brain issue occurring. But even if you get this 100% correct, make sure the WAN routing is such that you can't end up with the scenario I've mentioned above.