Cisco Unity Connection cluster issue

Unanswered Question
Nov 24th, 2010

Hello,

I have a Cisco Unity Connection cluster of two servers, one publisher and one subscriber. They are running software version 7.1.3ES43.33034-43. Yesterday at about 8:43 I lost connection to the subscriber server. I could still ping it but could not reach it via the GUI or SSH. I spoke to TAC yesterday and they told me to simply reboot the server. I am a little concerned about this because I don't want to cause some kind of split-brain effect. I work in a hospital and it is imperative that the voicemail system stays up. My only other option would be to do this at 4:00 am on a Sunday. Any suggestions?

Thanks

David Hailey Wed, 11/24/2010 - 06:57

I wouldn't worry too much about causing a split-brain condition. In fact, a brief period of split-brain is normal during recovery. In your scenario, I'd go with TAC and reboot the server. Technically, if your ports are configured correctly for failover, you should be able to reboot the subscriber at any time. Here are some additional tidbits on split-brain and what it is:

Effects of a Split-Brain Condition

When the servers in a Cisco Unity Connection cluster have Primary status at the same time (for example, when the servers have lost their connection with each other), both servers handle incoming calls (answer phone calls and take messages), send message notifications, send MWI requests, and accept changes to the administrative interfaces (such as Connection Administration). However, the servers do not replicate the database and message store to each other and do not receive replicated data from each other.

When the connection between the servers is restored, the status of the servers temporarily changes to Split Brain Recovery while the data is replicated between the servers and MWI settings are coordinated. When the recovery process is complete, the publisher server has Primary status and the other server has Secondary status.
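
For anyone who finds it easier to read as code, the status transitions described above can be sketched as a tiny state model. This is only an illustration in Python, not anything Unity Connection itself runs: the `Server` class and the function names are made up for the example, and the only facts it encodes are the ones in the two paragraphs above.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PRIMARY = "Primary"
    SECONDARY = "Secondary"
    SPLIT_BRAIN_RECOVERY = "Split Brain Recovery"


@dataclass
class Server:
    # Hypothetical stand-in for one cluster node; not a Cisco API object.
    name: str
    is_publisher: bool
    status: Status


def is_split_brain(a: Server, b: Server) -> bool:
    # Split-brain: both servers hold Primary status at the same time.
    return a.status is Status.PRIMARY and b.status is Status.PRIMARY


def on_link_restored(a: Server, b: Server) -> None:
    # When the connection comes back, both servers temporarily enter
    # Split Brain Recovery while data is replicated and MWIs are coordinated.
    if is_split_brain(a, b):
        a.status = Status.SPLIT_BRAIN_RECOVERY
        b.status = Status.SPLIT_BRAIN_RECOVERY


def on_recovery_complete(a: Server, b: Server) -> None:
    # After recovery, the publisher is Primary and the other server Secondary.
    for server in (a, b):
        if server.status is Status.SPLIT_BRAIN_RECOVERY:
            server.status = (
                Status.PRIMARY if server.is_publisher else Status.SECONDARY
            )


def demo() -> list:
    # Start from a split-brain condition and walk through recovery.
    pub = Server("pub", True, Status.PRIMARY)
    sub = Server("sub", False, Status.PRIMARY)
    history = [(pub.status.value, sub.status.value)]
    on_link_restored(pub, sub)
    history.append((pub.status.value, sub.status.value))
    on_recovery_complete(pub, sub)
    history.append((pub.status.value, sub.status.value))
    return history
```

Calling `demo()` returns the status pair (publisher, subscriber) at each of the three stages, which makes it easy to see that the split-brain state is transitional and ends with the publisher as Primary.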

Hailey

Please rate helpful posts!

latintrpt Wed, 11/24/2010 - 07:02

Thank you for your fast response.

The reason I bring this up is that I upgraded to this code about a month ago. I first upgraded the publisher and then the subscriber. When the subscriber came back up, it caused a split-brain condition for at least 15-20 minutes. During that time, when I tried to retrieve my voice messages, Unity Connection told me that voice messages were unavailable.

Do you recommend I do this during a downtime then?

David Hailey Wed, 11/24/2010 - 07:06

Again, split-brain during recovery is normal, and recovery after an upgrade takes much longer than after a simple reboot. It takes approximately 15-20 minutes for Tomcat and the other web services to start before a server is recognized, so this could be the lag you saw. Doing it during a downtime is never a bad idea, but if you are experiencing a bottleneck with VM then I'd do it sooner rather than later. That is totally your call. If you are looking for what I believe to be the most stable version of 7.1 code, it is 7.1.3.32900-4 or 7.1(3b)SU2. Tried and tested.

Why don't you just do the reboot this afternoon? I'm sure you'll have some downtime as folks depart for the holidays.

Hailey

Please rate helpful posts!

Rob Huffman Wed, 11/24/2010 - 07:18

Hi latintrpt,

I will add my +5 point vote for this good info from Hailey.

Just to let you know, we have had to reboot our Sub during office hours on two occasions. This was done without any noticeable effect on our users.

Just make sure the Pub is set as Primary and that the Sub has been set to "Stop taking calls" (give calls some time to clear) before rebooting.
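
If it helps to have this written down as a checklist, here is a rough Python sketch of the pre-reboot checks just described. It does not talk to Unity Connection at all: `reboot_blockers` and its parameters are hypothetical, and the values would be read by hand from the cluster management page before being passed in.

```python
def reboot_blockers(pub_status: str, sub_taking_calls: bool,
                    sub_active_calls: int) -> list:
    """Return the reasons to hold off on rebooting the subscriber.

    An empty list means the checks above all pass. The inputs are
    assumed to be entered manually; nothing here queries the servers.
    """
    blockers = []
    if pub_status != "Primary":
        blockers.append("publisher is not Primary")
    if sub_taking_calls:
        blockers.append("subscriber is still taking calls")
    if sub_active_calls > 0:
        blockers.append(
            f"{sub_active_calls} call(s) still active on subscriber"
        )
    return blockers
```

For example, `reboot_blockers("Primary", False, 0)` returns an empty list, meaning the Pub is Primary, the Sub has stopped taking calls, and existing calls have cleared.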

On a related note, our Sub was once out of commission completely for a few days due to a Tomcat service bug. Again, our users saw no issues during this time.

The CUC Cluster design has been well thought out by Cisco and is indeed very resilient!

Cheers!

Rob

latintrpt Wed, 11/24/2010 - 08:20

Thanks for the information guys.

Rob, I am unable to set the Sub to "Stop Taking Calls" because the option says "Not Available". The server status for the subscriber is "Not Reachable".

The Publisher does say "Primary".

David Hailey Wed, 11/24/2010 - 08:32

If the Publisher is primary and you are not experiencing call issues, you simply need to reboot the Subscriber - either now or later.

Another way to force the Sub to not take calls is to remove its ports from the CUCM hunt list/line group configurations.

Hailey

Posted November 24, 2010 at 6:49 AM