CUC 8.x Clustering Questions / Concerns

dtran
Level 6

Hi all,

I am looking into deploying a CUC 8.x cluster with the Pub and Sub at two separate locations connected over a 45 Mbps DS3 link. I have about 1500 subscribers with 48 voice messaging ports on each server. According to the Cisco specs for CUC 8.x below, I am completely within specs. I am wondering if anyone has run into this scenario with CUC 8.x and could share their experience. Per Cisco's recommendation, the Subscriber server will be the primary call processor and the Publisher will handle replication, DB maintenance, and MWI. I am wondering whether there will be any issues with MWI with the Pub and Sub connected over a WAN link, even though my WAN link is completely within Cisco specs.

In my existing Unity 4.0.5 failover environment, I have the primary Unity server and the Exchange server located in the same building and the secondary Unity server located at the remote site. I know that whenever I fail over to the secondary Unity server I always have issues with MWI. I understand Cisco has completely changed the architecture with Unity Connection. I hope this won't be an issue.

Connection Cluster Requirements When the Servers Are in Separate Buildings or Sites
Revised April 16, 2010
•Both servers must meet specifications according to the Cisco Unity Connection 8.x Supported Platforms List at http://www.cisco.com/en/US/partner/docs/voice_ip_comm/connection/8x/supported_platforms/8xcucspl.html.
•For a cluster with two physical servers, both servers must have the same platform overlay.
•For a cluster with two virtual servers, both servers must have the same virtual platform overlay.
•For a cluster with one physical and one virtual server:
–The platform-overlay numbers of the physical server and the virtual server must match.
–With Platform Overlay 1 servers, you must add 2 GB of RAM to the physical server so the amount of RAM in the physical server matches the 4 GB of vRAM configured for Connection on the virtual server.
•Depending on the number of voice messaging ports on each Connection server, the path of connectivity must have the following guaranteed bandwidth with no steady-state congestion:
–For 50 voice messaging ports on each server—7 Mbps
–For 100 voice messaging ports on each server—14 Mbps
–For 150 voice messaging ports on each server—21 Mbps
–For 200 voice messaging ports on each server—28 Mbps
–For 250 voice messaging ports on each server—35 Mbps
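
The bandwidth tiers quoted above can be turned into a quick sanity check. This is only a minimal sketch: the tier values come straight from the list, but rounding an in-between port count (such as 48) up to the next listed tier is an assumption, and the function names are made up for illustration. Note that the requirement is for guaranteed bandwidth, so the figure passed in should be what the cluster traffic is actually guaranteed on the path, not the raw link rate.

```python
# Tiers as quoted in the requirements list:
# voice messaging ports per server -> guaranteed bandwidth in Mbps.
BANDWIDTH_TIERS_MBPS = {50: 7, 100: 14, 150: 21, 200: 28, 250: 35}


def required_bandwidth_mbps(ports_per_server: int) -> int:
    """Return the guaranteed bandwidth for a given port count.

    Assumption: a port count between two listed tiers is rounded up
    to the next tier (for example, 48 ports is treated as 50).
    """
    if ports_per_server < 1:
        raise ValueError("ports_per_server must be at least 1")
    for tier_ports in sorted(BANDWIDTH_TIERS_MBPS):
        if ports_per_server <= tier_ports:
            return BANDWIDTH_TIERS_MBPS[tier_ports]
    # The quoted list stops at 250 ports, so nothing is extrapolated here.
    raise ValueError("port count is beyond the quoted table (max 250)")


def link_meets_bandwidth(ports_per_server: int, guaranteed_mbps: float) -> bool:
    """True if the guaranteed bandwidth covers the tier for this port count."""
    return guaranteed_mbps >= required_bandwidth_mbps(ports_per_server)
```

For the setup described in this post, `required_bandwidth_mbps(48)` falls into the 50-port tier, and `link_meets_bandwidth(48, 45)` only holds if the full 45 Mbps is really guaranteed to the cluster traffic.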

Thanks in advance !!! I appreciate any input / suggestions !!

D.

10 Accepted Solutions

Just one comment... you don't want to have the sub with the primary role.  That's considered a failed over system and it won't properly fail back if there's a problem with the sub.  A failover will only occur if the publisher is the primary.

The audio/prompts are all played back from the server answering the call.  So whichever pub/sub answers the call, that's the one that will be playing prompts locally.  For logging into a mailbox, the sub has to contact the pub (on a working system), but that should not be a significant delay in your case (but is a potential source of delay).  If there is a delay issue, you will typically see delays after entering the password or (when taking a message) before the record beep (after the greeting).

You should be fine having the pub handle calls--for the exact reasons the TAC engineer mentions--but I think it would be good to get some diagnostic traces and figure out what the real delay is in getting a response and if there isn't something else going on.  All of the bandwidth restrictions on the product deployment boil down to needing a low latency, lossless connection between the two servers.  If checking messages by dialing the pub is noticeably better performance than checking on the sub, then there may well be something wrong.  It would be better to find out now since these delays could just be the canary in the coal mine for other things, like replication, to be affected. 
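
One rough way to put a number on the delay between the two servers is to time TCP connects from one site to the other. This is only a sketch, not a substitute for the diagnostic traces mentioned above: a TCP connect takes roughly one network round trip plus local overhead, so it says something about the path but nothing about how long the application takes to answer. The function names are made up for illustration, and the host and port you pass in are assumptions; use any TCP port you know is open on the peer server.

```python
import socket
import statistics
import time


def tcp_connect_rtts_ms(host, port, samples=10, timeout=2.0):
    """Time repeated TCP connects to host:port, in milliseconds.

    Each sample covers roughly one network round trip (the TCP
    handshake) plus local overhead.
    """
    rtts = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection raises OSError if the peer is unreachable
        # or the connect does not finish within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        rtts.append((time.perf_counter() - start) * 1000.0)
    return rtts


def summarize(rtts_ms):
    """Return min, median, and max of the collected samples."""
    return {
        "min": min(rtts_ms),
        "median": statistics.median(rtts_ms),
        "max": max(rtts_ms),
    }
```

Run from the site where the Sub lives, something like `summarize(tcp_connect_rtts_ms("cuc-pub.example.com", 443))` (hypothetical host name and port) gives a spread to compare against the round-trip figure in the spec. A large gap between the minimum and the maximum may point at congestion or queuing on the link rather than plain distance.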

I was out and about earlier so I wanted to elaborate on my response:

I still think you ultimately have to do what works best for your environment. If that means running everything on the Publisher then so be it. BUT, I think there are too many variables to just move to that solution without further investigation both internally and from TAC. Understandably, a single server can handle the load - if it couldn't, there wouldn't be a standalone option for CUC. However, the concept of clustering is not simply about load balancing to provide high availability. That's a big part of it - but the separation of roles is also intended to provide better performance, scalability, as well as the active-active HA fault-tolerance. If you meet the specs for clustering over the WAN, then I think the product should perform as expected. My expectation would be for there to be very little, if any, noticeable lag in performance especially as far as end user operations are concerned. If what you are experiencing is par for the course for clustering over the WAN with CUC, I would say there are a lot of customers that would be disappointed. I know I would. In saying that, it doesn't mean that there is an inherent issue with CUC. It could be perfectly fine. However, given that you are running an 8.0 release - I would push for some further inspection of the health of the system. If everything is up to spec then there could be something on the WAN link that is at fault. These types of things always take a little more time to investigate but as Markus said - it would still be in your best interest to figure out what is going on here. Even if you have the Publisher sitting at your HQ site doing all the work, it still has to replicate data within the cluster which would be traversing the WAN. So, I would want to pursue both sides of the coin. You may still end up running everything from one box but at least you will have tried to rule out as many potential issues as possible.

Hailey

Please rate helpful posts!

21 Replies

David Hailey
VIP Alumni

D Tran,

You should be good as long as you are within design specs. One thing to remember is that there is no longer any dependence on AD and Exchange with CUC. In other words, the general "operation" of MWI may be the same but the actual components at play in the transaction are self-contained. I don't know what sort of issues you have with your legacy Unity system when it comes to failover but I wouldn't expect that to be replicated with CUC unless the underlying issue is network-related.

Hailey

Please rate helpful posts!

Hi David,

MWI never works properly for me in my existing environment whenever the Secondary Unity is active. I know for sure this is because the Secondary Unity server and MS Exchange/AD connect across the WAN link. It seems like MWI never works correctly whenever you have Unity and Exchange/AD connecting over any type of WAN link. I know another customer who has the same setup as mine, but their Unity and Exchange connect across a Gigaman WAN link and MWI still doesn't work. I hope this won't be an issue with CUC 8.x

Thanks David !! appreciate your help as always !!!

D.

That's unfortunate to hear that your 4x setup has never worked properly.  I've deployed Unity in a failover via WAN setup without issues.  The good news is you're moving to Unity Connection and there are fewer dependencies which means fewer places where things can break (or be broken on initial setup). With Unity deployments, you really have to dig into how Unity works and the design guide specifications for the various deployment models.  When you do failover via the WAN, the key requirements from the 4x design are:

In a remote failover configuration, the network connectivity should be no less than 100 Mbps between the sites, and the messaging systems and messaging infrastructure components must be accessible by both Cisco Unity servers. (Note that messaging infrastructure components include domain controllers and/or directory servers, global catalog servers, and name resolution hosts.) Regardless of your network connectivity bandwidth, the response time between the Cisco Unity server and the Exchange servers it is connected to should be no more than a 40-millisecond round trip delay in order for Cisco Unity to service subscriber TUI requests normally.

Again, the good news is that you're moving to Unity Connection. FYI: Unity Connection 8.5 EFT (Early Field Trial) should start in August. This is going to be testing of Unity Connection code that provides not only Integrated Messaging (IMAP) and Voice Mail only services but also Unified Messaging. This will be accomplished/supported with Exchange 2003 via WebDAV and Exchange 2007/2010 via Exchange Web Services. So, it's a great time to migrate to Unity Connection as its feature set just continues to grow.

Hailey

Please rate helpful posts!

Hi David,

I only have a 45 Mbps link between the data center and the remote site where the Secondary Unity server resides; that's most likely the reason why MWI never works for me. But I don't know the details of why the other company I know of, which has a Gig connection between sites, has the same issue with MWI.

Thanks for the heads up on CUC 8.5 !! Do you know if CUC 8.5 can support Domino for Unified messaging ?

Thanks David !!!

D.

Only Exchange will be supported from 8x and beyond. This is true of Unity and CUC to my knowledge. You can still do IMAP with Lotus/Domino on supported versions with CUC.

Sent from my iPhone

Hi David,

I finally got my CUC 8.0.2c cluster up with the Sub located in the main data center in CA and the Pub in the DR site in TX. I've configured the Pub as the Primary, which also handles MWI, and configured the Sub to handle calls. The cluster is up but it's not in production yet. I am still in the testing phase and so far I've noticed there is a noticeable delay when I access voice mail. When I press the messages button and enter the PIN there is a noticeable delay before I can hear my recorded name, and the same delay when I listen to messages. Do you think this delay is due to the fact that I have the Pub and Sub connecting over a WAN link, even though my WAN bandwidth is within Cisco specs? Per Cisco specs, 50 voice ports on each server require 7 Mbps of bandwidth and under 40 ms round-trip. I have 48 voice ports on each server with a 45 Mbps WAN link and under 40 ms round-trip. But if I make the Sub the Primary and also have it take calls, I can tell the delay is gone.

I am thinking about moving the Pub back to the main data center and configuring it as the Primary and having it take calls as well. Do you see any issues/concerns with that ?

Thanks David !!!! very much appreciate your help !!!

D.

First question, do you have QoS on the WAN link between the cluster servers?

Hailey

Please rate helpful posts!

Hi David,

Yes, QoS is enabled on the WAN link and the traffic load between the two locations is pretty light.

I was under the impression that whichever server is configured to take calls uses its local DB to process the call. But it seems like the server that takes the call uses whichever server has the active DB (which is the Pub) to process that call.

Technically I can have a single box handling all responsibilities (taking calls, DB processes, MWI), right ? And the Sub can just sit there for DR purposes, correct ?

Thanks David !!!

D.

Technically, yes.  If you didn't have a cluster, then you would just have a single box that handles all functions.  So, you could place all of the load on one box.  In a cluster, the roles of the servers and best practices are different which is why all of this is being discussed.  The Publisher role is similar to that of a CUCM cluster (during normal operations) and the Subscriber is the call processing node.  The Publisher does provide mailstore operations for the cluster so there could be some correlation there especially since you said you see no issues when the Subscriber is the Primary.  The actual specs on BW are more like 150 ms latency (if I remember correctly - which you would be fine there) and the bandwidth is dedicated (amount as specified) but with no steady state congestion on the link.  D Tran, my friend, this one likely boils down to what you are comfortable with and what works best for you.  You could also see what TAC has to say on the situation.  I mean, in your case, let's say that the server locations were reversed.  In that case, all of your calls to VM would traverse the WAN which seems undesirable.  In addition, if you had other remote sites or users at the DR site then they would (theoretically) experience the same issues you have today.  There are a lot of variables at play...but just to retouch on your question (as noted above) - technically, yes one server can handle all functions.  That's what you'd have in a standalone configuration just without the failover aspect.

Hope this helps...there are a lot of variables at play here.

Hailey

Please rate helpful posts!

Thanks David !!! Your input is very much appreciated !!!

I understand that during normal operation the Pub should be Primary and the Sub should be taking calls. Do you see any concerns with having the Sub as Primary during normal operation ? Or do you switch roles only for maintenance purposes ?

Thanks again David !!!

D.

From a "what is the server doing" perspective in your scenario, not really.  From a "how would this be looked at from a support perspective", I can see that this might raise some flags.  If you're having trouble with clustering over the WAN, then I'd consider swapping locations if you're going to go with the deployment model you indicated (one server performing all roles).

Did you have any interest in seeing what TAC says about the issues clustering over the WAN? You're running a zero release of 8x so it is possible that there could be a bug or some other factor at play...not guaranteed, but possible.

Hailey

Please rate helpful posts!

Hi David,

I do have a TAC case open regarding the delay issue. But I really think in the case of clustering over the WAN you'll have to dedicate one box to do all functions and have the other box just sit there for failover purposes. This is most likely what I'll have to do.

Splitting functions between the Pub and Sub is mainly for load balancing purposes, and it seems like it only works when both Pub and Sub are in the same location. I've talked to TAC about having the Pub do all the functions with the Sub just for failover purposes, and they don't have any issues with that.

Thanks David !!! Your input is highly appreciated !!!

D.

In your case, I would say you have to do what works best in your environment...if having one server perform all the functions is what it takes, so be it. However, from a product and feature perspective - it defeats the purpose of having active-active clustering with the concept of Publisher and Subscriber in place. I think clustering over the WAN should work as advertised as long as you're within specs - but that's my 2 cents.

As long as you're supported from a TAC perspective, do what works best for you - can't fault you for that.

Hailey

Please rate helpful posts!

Thanks David,

Here is the response that I got from TAC:

The reason why the Secondary Server answers the call is to load balance to create a High Availability environment. In this way, the primary server will not have all the load for all the tasks (calls, notifications, changes in Administration Page). However, if you would like to maintain only one server active taking all the load, it is fine. Since the servers are actually created in a cluster to support failure of any server, they are capable (from hardware and software perspectives) of handling all that load.

Thanks again David !!!

D.
