We are running the following setup:
- Approx. 140-seat contact center
- IPCC Enterprise 7.1.5
- System PGs, duplexed for DR
- IPIVR 4.5(2)SR02ES09_Build092, again split for DR
- Nuance ASR (split for DR)
- Call Manager 5.1.3 (1 pub, 6 subs: pub and subs 1, 3, 5 located at site A; subs 2, 4, 6 located at DR site B)
We went "live" on 9/17/07. Since go-live, we have had multiple TAC cases where we appear to have heartbeat issues with the PGs. We are connected to our DR location via a DWDM optical connection. It's our own fiber, stretched the 10 miles between the two locations.
EVERYTHING we run, as a company, is load balanced between both sites. We have NEVER shown a network latency of greater than 1 ms. Yet every 3-4 weeks the PGs seem to get out of sync(?), and the logs inevitably show that they missed heartbeats.
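To make the symptom concrete, here is a minimal Python sketch of the kind of check that could be run against timestamps from an independent probe between the two sites, to see whether short gaps are hiding behind the sub-1 ms latency figures. Everything in it is an assumption for illustration: `find_missed_heartbeats` is a hypothetical helper, not part of any Cisco tool, and the 100 ms interval and 5-miss threshold are placeholder values, not numbers taken from our PG configuration.

```python
from datetime import datetime, timedelta


def find_missed_heartbeats(timestamps, interval_ms=100, missed_threshold=5):
    """Return (previous, current, missed_count) for each gap between
    consecutive timestamps that spans at least `missed_threshold` intervals.

    interval_ms and missed_threshold are illustrative defaults only.
    """
    step = timedelta(milliseconds=interval_ms)
    limit = step * missed_threshold
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur - prev
        if gap >= limit:
            # Whole intervals that elapsed, minus the one that did arrive.
            missed = gap // step - 1
            gaps.append((prev, cur, missed))
    return gaps


if __name__ == "__main__":
    base = datetime(2007, 9, 17, 12, 0, 0)
    # Probe replies at 0, 100, 200 ms, then nothing until 900 ms.
    offsets_ms = [0, 100, 200, 900, 1000]
    samples = [base + timedelta(milliseconds=ms) for ms in offsets_ms]
    for prev, cur, missed in find_missed_heartbeats(samples):
        print(f"gap from {prev.time()} to {cur.time()}: {missed} missed")
```

The point of a check like this is that an average or periodic latency reading can stay under 1 ms while a single sub-second gap still goes unnoticed, so it looks at the spacing between consecutive samples rather than at the latency value itself.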
This has been our longest-standing issue, and we've never really gotten to the bottom of it. The first handful of TAC cases resulted in ES patches, which seemed (and I use this term because the problem has always come back) to handle the situation at the time. I honestly can't say for sure whether the patch took care of the problem or recycling the PGs did. Yes, I agree that the patches fixed issues that probably would have crept up eventually, but the fact that we keep going down the same path is just eating at me. This is where our business users feel the pain, and their opinion of our new phone system is less than stellar. (I'm being kind, given the terms that have been used.) I KNOW this is a good system, but I don't think we've truly hit on what is causing our heartbeat issues. And of course, when those PGs have an issue, the effect on our contact centers is dramatic.