Unity Connection 8.5 weird cluster problem

rstewart
Level 1

I'm having a very odd problem with my CUC cluster. It happens after a split-brain caused by a network issue. The sub and pub go into split-brain, and once the network problem is resolved and they can talk again, the sub goes into high CPU utilization and the queue on the pub just backs up; the messages are never delivered. I've even tried a full cluster reboot, and it doesn't seem to solve the problem. During this scenario neither node answers calls, so incoming callers get a busy signal and users can't check their voicemail.

The backstory is that it first happened after a link between the two buildings was disconnected for about 30 seconds during maintenance. I ended up leaving the sub down for a couple of weeks, and things seemed to be OK when it came back up. The next time it happened was recently, when we did a switch upgrade. Exactly the same thing happens: one node, usually the sub, goes into high CPU (I see it on the VMware console) and the queue on the pub backs up.

Anyone have any thoughts on how I can proceed to troubleshoot this? The worst part is I will basically have to do it late at night so as not to affect the users!

Thanks in advance!

Rob


2 Replies

Rob Huffman
Hall of Fame

Hi Rob,

It sounds like you could be hitting this bug:

CSCtq86294 - MTA threads stuck after SBR. No message delivery

Description

Symptom:

After a lengthy subscriber shutdown, when the subscriber is brought back up, the MTA fails to initialize properly and messages are no longer delivered. The messages simply queue up.

Conditions:

Workaround:

First, make sure split-brain resolution (SBR) has completed (show cuc cluster status) and that the Publisher is the Primary. Next, make sure the Message Transfer Agent (MTA) service is started. This can be checked in the GUI (under Cisco Unity Connection Serviceability > Tools > Service Management > Connection Message Transfer Agent) or via the CLI (utils service list, look for "Connection Message Transfer Agent[STARTED]").
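If you capture the `utils service list` output to a text file, the check for the MTA line can be scripted. The following is only a rough sketch: `service_state` is a hypothetical helper, and it assumes every service is printed as `Name[STATE]`, which is the only format quoted in the workaround above. Real output may contain other lines, which the function simply ignores.

```python
import re

# Assumption: each service line looks like "Service Name[STATE]",
# as in the "Connection Message Transfer Agent[STARTED]" example.
SERVICE_LINE = re.compile(r"^\s*(?P<name>.+?)\s*\[(?P<state>[A-Za-z ]+)\]\s*$")


def service_state(cli_output, service_name="Connection Message Transfer Agent"):
    """Return the bracketed state for service_name, or None if not listed."""
    for line in cli_output.splitlines():
        match = SERVICE_LINE.match(line)
        if match and match.group("name") == service_name:
            return match.group("state").strip().upper()
    return None


if __name__ == "__main__":
    # Made-up sample text for illustration, not real CLI output.
    sample = "Connection Message Transfer Agent[STARTED]\nSome Other Service[STOPPED]"
    print(service_state(sample))
```

A return value of `None` means the service name was not found in the pasted text at all, which is worth distinguishing from a service that is listed but not started.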

If the service is started, then the RTMT counters can be examined to see if messages are being delivered (also available via CLI as shown below).

show perf query path "CUC Message Store\Messages Received Total"

show perf query path "CUC Message Store\Messages Delivered Total"

show perf query path "CUC Message Store\Queued Messages Current"

If these do not increment, then the MTA process is likely stuck and needs to be restarted (from the GUI via Cisco Unity Connection Serviceability). The process is "Connection Message Transfer Agent", and it should be stopped and then immediately started. If there is a significant delay between the stop and the start, the system will fail over and will need to be failed back.
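The "do these counters increment" test can be made explicit by taking two readings a few minutes apart and comparing them. This is a minimal sketch, not part of the bug text: `mta_looks_stuck` and the dictionary keys are made-up names, the values are typed in by hand from the three counters, and the rule used (a backlog exists but the delivered total has not moved) is one reasonable reading of the workaround, not an official threshold.

```python
def mta_looks_stuck(before, after):
    """Compare two manual readings of the CUC Message Store counters.

    before / after are dicts with the hypothetical keys:
      'received'  -> Messages Received Total
      'delivered' -> Messages Delivered Total
      'queued'    -> Queued Messages Current
    """
    delivered_moved = after["delivered"] > before["delivered"]
    backlog_present = after["queued"] > 0
    # Assumption: messages waiting in the queue while the delivered
    # total stays flat suggests the MTA is not processing them.
    return backlog_present and not delivered_moved


if __name__ == "__main__":
    # Illustrative numbers only, not taken from a real system.
    first = {"received": 1200, "delivered": 1150, "queued": 50}
    second = {"received": 1230, "delivered": 1150, "queued": 80}
    print(mta_looks_stuck(first, second))
```

An idle system with an empty queue will also show a flat delivered total, which is why the sketch requires a non-zero queue before flagging anything.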

Details

1st Found-in (1): 8.5(1)

Status: Fixed

Last Modified: Dec 06, 2011

Fixed-in (11): 8.6(2.10000.30), 8.6(1.98000.53), 8.6(1.98000.126), 8.6(1.59), 8.6(1.21005.1), 8.6(1)ES4, 8.5(1.13029.1), 8.5(1.13028.2), 8.5(1)ES54, 8.0(3.23041.1), 8.0(3)ES39

Product: Cisco Unity Connection

Platform: Dependent

Severity: 3 - moderate

Cheers!

Rob

Please remember to tag your threads and help support "Teachers without Borders"

https://supportforums.cisco.com/community/netpro/idea-center/communityhelpingcommunity

Thanks for the suggestion, Rob. The thing is, the pub stops taking calls, the sub sits at high CPU, and the queue on the pub keeps getting bigger. Also, the split-brain never clears.
