No Calls to Contact Center

Apr 8th, 2010

Dear All,

We are facing major issues at our contact center (CC) for reasons we have not been able to identify. Let me explain the complete scenario. This is a bit long, but please take the time to read it and help us isolate the problem. I will start with our architecture.


Distributed Multisite Architecture with a Side_A and Side_B

Components in Side_A





CVP Call Server

Call Manager (Pub)

Call Manager (Sub1)

Call Manager (Sub2)

Voice Gateway 1

Voice Gateway 2

Components in Side_B





CVP Call Server

CVP Operation Console

CVP Reporting Server


Voice Gateway 1

Voice Gateway 2

Voice Gateway 3


IVR Server

Call Manager (Pub)

Call Manager (Sub1)

Call Manager (Sub2)

* 300+ Agents in Side_B and 150+ Agents in Side_A

* Side_A is the active Central Controller.

* There are 2 WAN links between Side_A and Side_B (8 Mbps and 2 Mbps)

* When there is a WAN link failure, Side_B should work independently according to the architecture and the documentation. For a side to become the active Central Controller, the formula applied is n/2+1, where n is the total number of PGs at both locations, and the side must be able to reach at least that many PGs. At Side_B we have an additional MRPG for this purpose.
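To make the arithmetic in the last point concrete, here is a minimal Python sketch of the n/2+1 rule as it is described in this post. The function names, the use of integer division, and the example PG counts are assumptions for illustration only; the actual PG counts are not stated here, and this is not how ICM itself performs the check.

```python
def pgs_needed(total_pgs: int) -> int:
    """Threshold described in the post: n/2 + 1 (integer division assumed)."""
    if total_pgs < 1:
        raise ValueError("total_pgs must be at least 1")
    return total_pgs // 2 + 1


def side_has_majority(reachable_pgs: int, total_pgs: int) -> bool:
    """True if a side can reach at least the threshold number of PGs."""
    return reachable_pgs >= pgs_needed(total_pgs)


# Hypothetical counts: 2 PGs at Side_A and 3 at Side_B (including the extra MRPG).
total = 2 + 3
print(pgs_needed(total))            # 3
print(side_has_majority(3, total))  # True
print(side_has_majority(2, total))  # False
```

With an odd total, only one side can meet the threshold at a time, which is the stated reason for the extra PG at Side_B.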

Problem Description:

At Side_B, calls stop completely across all Skill groups. This happens even during peak hours, when call volume is normally very high.


* During this time, we see WAN link fluctuations due to regular carrier issues.

* All other applications on the LAN network like Outlook or shared drives are working fine.

* All the Servers at Side_B are up and running.

* Survivability is getting triggered on Voice Gateway.

Logs on Rogger (Side_B)

1. Connectivity with duplexed partner has been lost due to a failure of the private network, or duplexed partner is out of service.

2. MDS is out of service.

3. MDS has reported failure to the router that it is out of service.

4. Message Delivery Service (MDS) feed from the Router to the Logger has failed.

5. Central Controller service is unavailable.

6. Requesting MDS termination due to error.

7. Application Gateway has been taken out of service. Application Gateway ID - 5000

8. Client rtr stopping due to error.

9. Client hlgr stopping due to error.

10. Client clgr stopping due to error.

11. Synchronizer is unable to establish connection to peer.

12. Process rtr on ICM\int1\RouterB has detected a failure. Node Manager is restarting the process.

13. Process rtr on ICM\int1\RouterB is down after running for 366 seconds. It will restart after delaying 1 second for related operations to complete.

14. ICM\int1\LoggerB node process hlgr exited cleanly and requested that it be restarted by the Node Manager.

15. MDS message sync delay exceeded 3000ms

16. Connectivity with duplexed partner has been lost due to a failure of the private network, or duplexed partner out of service.


Logs from PG_B

1. Central Controller side B reports poor response time.

2. Central Controller side B is out of service.

3. Connection to Central Controller side A failed (high priority)

4. Path to Central Controller side A disconnected.

5. Message stream to Central Controller has been broken.

6. FCCS3008 Network communication error <COMM FAILURE> sending message to application <AGENT DESKTOP_17119> The application will be logged out.

            This message appears many times with different AGENT_DESKTOP IDs.

7. Configuration response for Logical Controller 5002 contained OPI error code 1

8. ProcessPIMMsgs:: Invalid PIM message received class 10 type PIM_REPORTING_CONFIG IND

9. Fail Assertion failed: ElemString.Get() Remote Address not present.

Workaround Solution

Restart the PG Processes manually.

Please review this and help us isolate the root cause of this issue. When the WAN link fluctuates or goes down:

1. Side B is supposed to work independently with Rogger_B as the Central Controller. However, this is not happening, and the PG services hang.

2. We are not sure why the Node Manager service did not restart the PG services when they hung. As per the documentation, this should happen automatically, or Node Manager should restart the PG server to bring the services back up, but neither happened.

Edward Umansky Thu, 04/08/2010 - 14:09

You didn't specify which WAN link failed. Assuming it was the private link, it sounds like Router failover is working as expected. Both Routers will continue to see all PGs via the visible link, and the A side will stay active while B is disabled. The tie-breaker PG won't make any difference. I suggest you reread the SRND section on WAN failure, as there are some subtle details to the whole process.

As for why the side B PG is not re-connecting to the active side, most likely it is a configuration or firewall issue. Go over your configuration with a fine-tooth comb and make sure everything is as it should be. Make sure you can telnet to the appropriate ports from PG to Router and vice versa.
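If telnet is not handy on the servers, the same reachability test can be roughly sketched in Python. This is only an illustration: the function names are made up, and the host names and port numbers have to come from your own ICM setup, since none are given in this thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake,
        # which is the same thing a successful telnet session start proves.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_targets(targets, timeout: float = 3.0) -> dict:
    """Check a list of (host, port) pairs and map each pair to True/False."""
    return {(host, port): can_connect(host, port, timeout) for host, port in targets}
```

Run it from the PG toward each Router and again from each Router toward the PG, passing the (host, port) pairs for your deployment to check_targets. A False result in only one direction usually points at a firewall or routing rule rather than at the service itself.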

abraham23482 Fri, 04/09/2010 - 20:34

The Private link fluctuated continuously and there was high latency and packet drops on the visible link.

david.macias Sat, 04/10/2010 - 10:13

Not to sound like a total jerk, but you can't expect the product to function as designed when your networks are not working as expected. Focus on the network first; ICM will fall in line once that's rectified. Also be careful: you might end up with a split brain, and then you're in a bigger mess than before.


