I'm wondering if anyone else has encountered this problem and what was done to fix it or work around...
Our configuration is Unity 4.2(1) Voice Mail only with failover (Exchange). If you dial a port number for the secondary server, the secondary server becomes the active Unity server. I can duplicate this "error" over and over. I cannot "failback" to the primary server via the same method. The ports for the primary Unity server unregister with Call Manager when the failover activates.
I have modified the dial plan to prevent these port numbers from being dialed from any location on Call Manager. This workaround works fine. However, I'm not sure whether this is a bug or a misconfiguration of Unity on our part. I'm concerned that the secondary will become active unnecessarily for some other reason that we haven't identified yet.
It's probably not so much dialing the port as something in your failover settings.
Failover only happens if the server senses a fault: network fault, database fault, whatever.
Even network teaming of the NICs can throw off failover and switch it from primary to secondary.
- Check your DNS settings. Name resolution has a ton to do with this. Poor resolution will result in an unexpected failover.
- Network congestion. If failover loses contact with the primary (poor round-trip times, for example), it will fail over.
- Check to ensure failover is configured correctly. It could be that it isn't working at all the way it is supposed to. The Unity secondary should not be running, or should be running bare bones. Answering a call on the secondary is a little odd. That would mean some services are started that shouldn't be. It should be ring-no-answer if you dial those ports.
- NIC teaming. If the teaming is not stable, it can cause SQL to think it has disconnected from the primary or secondary server and hence start a failover.
- If it becomes a problem, change failover to forced instead of automatic.
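For the DNS and round-trip checks in the list above, here is a rough Python sketch you could run from either server. It is only an illustration: the host names in the comments are placeholders, the helper names are made up for this post, and timing a TCP connect to port 1433 (the default SQL Server port) is just one assumed way to approximate round-trip time. It is not what Unity's failover service itself does.

```python
import socket
import time


def short_name(name):
    """Return the lowercase host part of a name, e.g. 'Unity01.corp.local' -> 'unity01'."""
    return name.split(".")[0].lower()


def check_name_resolution(hostname, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve a name to an IP and back again, and report whether the names agree.

    The forward/reverse parameters exist only so the lookups can be replaced
    with stand-ins when testing without a network.
    """
    ip = forward(hostname)
    reverse_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, addresses)
    return {
        "hostname": hostname,
        "ip": ip,
        "reverse_name": reverse_name,
        "consistent": short_name(hostname) == short_name(reverse_name),
    }


def tcp_round_trip_ms(host, port=1433, timeout=2.0):
    """Time a TCP connect to host:port in milliseconds (port 1433 is an assumption)."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


# Example use, with placeholder names for the two servers:
#   print(check_name_resolution("unity01"))
#   print(tcp_round_trip_ms("unity02"))
```

Run it in both directions (primary to secondary and secondary to primary). If "consistent" comes back False, or the connect time jumps around, that is worth chasing before blaming failover itself.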
The port doesn't answer the call, but Unity does see the call, which triggers the failover server to become active. I can see it happen in failover monitor. I've attached a couple of the events from event viewer that show what happened. The file "event01" has an interesting clue perhaps...I'm not sure. It says "is configured to fail over in this condition". Should the check box in "Failover Advanced Options" be checked on the secondary server and NOT the primary? Should it be checked on both? Or unchecked on both? We DO want the failover to become active automatically when there is a fault on the primary.
The DNS settings are working fine from both sides. Network congestion is not an issue. All connections are Gigabit Ethernet (from access to distribution to core). I have verified that there are no NIC teaming issues in the logs. Round trip times are less than 10ms. No SQL errors.
It only fails over (unnecessarily) when a port is called. No other "faults" are showing up in the event viewer. It's a real head scratcher. lol
I changed the partitions of the failover ports so that callers can't dial them by accident anymore. That seems to work fine. I'm just hoping there isn't a more serious problem that will show up after we go live with this new build.
Whatever happened with this issue? Did you get it resolved?
We are experiencing the same thing! We're running Unity 4.2(1) on 2 identical IBM 346 servers with plenty of resources, with Exchange 2000 off-box. All the servers are at the same location, and the two unities are on the same switch, on the same blade, in the same VLAN.
Unity01 (primary) keeps failing over to Unity02 (secondary) without any apparent reason. The logs show that a port on Unity01 did not pick up, which triggered the failover.
We have TAC troubleshooting it but nothing yet. It is believed to be a bug in the TSP (TSP 8.1.3).
Any info will be greatly appreciated.
The "integration ID" was mismatched between the two servers. Go to the Unity Tools Depot, Switch Integration Tools, and then Telephone Integration Manager. Click on Properties and you will see the "Integration" tab. Our primary server has an integration id of 0. Our secondary server had an integration id of 1. I changed the id to 0 on the secondary server using the "modify integration id" box in the top right corner. We haven't had a problem since. I give credit to TAC for helping us with this.
I can't remember if I had to stop Unity services on both servers or not, but I would do it just to remove doubt.
Hopefully, this will fix the problem. Good luck. :)
Please let me know if this solves it. I'm curious if there is a bug that hasn't bitten us yet. We are using the same TSP.
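If you are comparing the two servers by hand, a tiny Python sketch like this can make a mismatch jump out. The values have to be typed in from what each server shows in Telephone Integration Manager; the dictionary keys and the function name are made up for this post, and nothing here reads from Unity itself.

```python
def find_mismatches(primary, secondary):
    """Return {setting: (primary_value, secondary_value)} for settings that differ."""
    keys = sorted(set(primary) | set(secondary))
    return {
        key: (primary.get(key), secondary.get(key))
        for key in keys
        if primary.get(key) != secondary.get(key)
    }


# Values as described in this thread before the fix (entered by hand).
primary_settings = {"integration_id": 0, "tsp_version": "8.1.3"}
secondary_settings = {"integration_id": 1, "tsp_version": "8.1.3"}

print(find_mismatches(primary_settings, secondary_settings))
# prints {'integration_id': (0, 1)}
```

An empty result means the settings you recorded match on both servers, which only rules out this one cause.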
Thank you so much for your quick response. Unfortunately, the integration ID on both of our unities is the same (=1), but since we have a vendor handling this issue, I don't know if this is a recent change or something that has been present since the start.
Currently TAC is looking into it, and they believe it is either a bug in TSP 8.1.3, maybe bug CSCsh35344 (others listed below), or a hardware issue, specifically high CPU and/or I/O utilization.
We don't believe it to be a hardware issue, since the server is pretty beefed up (4 CPUs, 4 GB RAM, etc.) and, aside from being new, it was in production for 3 months (before the failover implementation) as the only Unity.
I'll let you know what happens when it is all done... if ever. :-(
List of bugs related to TSP 8.1.3:
CSCse76319 - Unity sends 3rd Transfer if Connected received after 2nd Transfer
CSCsi65508 - Unity TSP port failback detection fails
CSCsh35344 - failed xfer initiate may cause delay in clearing port
CSCse43664 - supervised xfer cleanup may result in delay answering next call
CSCsd12541 - Unity ports delay failback with CCM port in LAST_ACK
CSCsi89941 - TAPI can block dialing transfer digits
So what is the recommendation for 4.2(1) for failover with 2 Unity servers (not unities)? TSP 8.1.2?
Ready to be flamed rlp :-)
I'm with you on this. I just put failover on 4.2.1 with TSP 8.2.1 and I'm getting random failovers for no reason.
I did notice that the names were mismatched. Switched that and waiting to see.
Any TSP issues lately?
It was something else. I configured a dual integration with failover and split my 16 ports between the two CCMs. When the ports were full, the call would roll to the second Line Group in CCM, which is the failover one. When a call came from that group to the failover server, it would activate the failover server, shutting down the primary.
Under the Failover configuration button, uncheck answer calls and activate. Then, when the ports are full, the next call gets a busy signal.
By default, when a call arrives on the secondary server, it will automatically fail over. It is important to configure your partitions and CSS properly so the phones cannot dial the voicemail ports directly.
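To make the overflow behaviour described above easier to picture, here is a toy Python model of it. It is a sketch of the behaviour reported in this thread, not of how CCM or Unity are implemented; the function name, the parameter names, and the port count of 8 in the example are assumptions for illustration only.

```python
def route_next_call(busy_primary_ports, primary_ports, activate_on_call=True):
    """Say what happens to the next incoming call in this simplified model.

    busy_primary_ports: how many primary ports are currently in use
    primary_ports:      how many ports the first line group (primary) has
    activate_on_call:   stands in for the "answer calls and activate" setting
    """
    if busy_primary_ports < primary_ports:
        return "answered by primary"
    # Every primary port is busy, so the call rolls to the second line group,
    # which points at the failover (secondary) server.
    if activate_on_call:
        return "secondary activates (failover)"
    return "caller hears busy"


# Hypothetical example: 8 ports on the primary, all of them in use.
print(route_next_call(8, 8))                          # default behaviour
print(route_next_call(8, 8, activate_on_call=False))  # option unchecked
```

The point of the model is that a busy hour alone is enough to trigger a failover with the default setting, with no fault on the primary at all.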
I also have a large customer that is experiencing random failovers with Unity 4.21 with TSP 8.21.
I will check to see if the Switch IDs match.