IPCCX HA Database Issues - Slow updates

Unanswered Question
Sep 4th, 2007

Hi

I have a customer with IPCC 4.0(5), recently upgraded from 4.0(4)... they actually had this problem on the previous version as well.

Basically they have two clusters (one still on 4.0(4)) - on this HA cluster, the AgentStateDetail table on CRA_DB is updated pretty much in real time. It's then replicated every few minutes out to the secondary server.

The same normally applies to their second cluster. However, we have occasionally seen the updates slow down: currently AgentStateDetail is only updated every 5 minutes, at which point multiple updates go into the DB at once. The changes then replicate out to the secondary server almost immediately.

This causes problems because the wallboard/Checkmate data is stale compared to what is shown in the Cisco Supervisor.

Has anyone seen this behaviour?

We have recently been running on the secondary server for about two weeks, and just failed back this weekend. Is there any extra action that needs to be taken to clean up the databases after we've been running on the secondary for a while? I've heard something manual needs to be done after 4 days, but I haven't found solid information on it. I have noticed that the AgentStateDetail table currently has more rows on the secondary server than on the primary, which suggests that some data has not replicated back to the primary...
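One quick way to quantify the discrepancy is to run the same row-count query on both nodes and compare. A minimal T-SQL sketch, assuming the reporting database is named db_cra (the standard CRS name, per the reply below the original post calls it CRA_DB) and that AgentStateDetail carries an eventDateTime column (an assumption; verify against your schema):

```sql
-- Run this on BOTH the primary and the secondary node and compare results.
-- db_cra is the standard CRS reporting database name; the eventDateTime
-- column name is an assumption - check it against your actual schema.
USE db_cra;
SELECT COUNT(*)           AS TotalRows,
       MAX(eventDateTime) AS LatestEvent
FROM   AgentStateDetail;
```

If TotalRows or LatestEvent differ significantly between the nodes, that confirms replication is lagging or broken rather than the wallboard simply reading stale data.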

Any suggestions welcomed!

syedbahm Thu, 09/13/2007 - 11:03

Please see responses inline, marked [Basheeruddin].

Hi

I have a customer with IPCC 4.0(5), recently upgraded from 4.0(4)... they actually had this problem on the previous version as well.

Basically they have two clusters (one still on 4.0(4)) - on this HA cluster, the AgentStateDetail table on CRA_DB is updated pretty much in real time. It's then replicated every few minutes out to the secondary server.

[Basheeruddin]: We have db_cra; I assume that is what you mean by CRA_DB and that it's not a custom database.

The same normally applies to their second cluster. However, we have occasionally seen the updates slow down: currently AgentStateDetail is only updated every 5 minutes, at which point multiple updates go into the DB at once. The changes then replicate out to the secondary server almost immediately.

This causes problems as the wallboard/Checkmate data is stale compared to that shown in the Cisco Supervisor.

[Basheeruddin]: Wallboard data depends on two tables, RtCSQsSummary and RtICDStatistics, not on the AgentStateDetail table. These tables are updated on the current CDS master node. Which wallboard are you using?

Has anyone seen this behaviour?

[Basheeruddin]: No, we haven't seen the behavior you describe, where updates to AgentStateDetail slow down and then arrive in bulk. How are you determining this? I'd also like to clarify which updates you are referring to: the replication updates coming from the other server, or the updates driven by call traffic.

We have recently been running on the secondary server for about two weeks, and just failed back this weekend. Is there any extra action that needs to be taken to clean up the databases after we've been running on the secondary for a while? I've heard something manual needs to be done after 4 days, but I haven't found solid information on it. I have noticed that the AgentStateDetail table currently has more rows on the secondary server than on the primary, which suggests that some data has not replicated back to the primary...

[Basheeruddin]: If the primary CRS server node was still connected to the secondary CRS server node and was not completely down (i.e. the SQL services on both nodes could still contact each other), then replication should happen automatically, even while the secondary node was being used to write data. Please confirm whether this was the case.

OR

You may instead be referring to the replication retention period, which is usually 4 days. If a node was down, or its services were stopped so that it could not contact the other SQL Server node, for more than 2 or 4 days (depending on the type of MCS box you are using), then the replication setup between the two nodes is broken and you need to set it up again. Since you were writing data to the subscriber node while the publisher node was down, to retain the data written on the subscriber you need to switch the publisher and subscriber roles, so that all data from the subscriber is copied back to the old publisher that was down. Note that you can retain the data of one particular node and sync both nodes based on that.
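As a generic sanity check, if the two nodes use standard SQL Server transactional replication, the built-in sp_replcounters procedure on the publisher reports replicated-transaction counts and latency per published database. This is stock SQL Server, not a Cisco-specific tool, and it only applies if transactional replication is in use on your version:

```sql
-- Run on the current publisher node. Stock SQL Server system procedure;
-- reports replication latency and throughput for each published database.
EXEC sp_replcounters;
```

Sustained growth in the pending-transaction count, or latency far above your expected replication interval, would point at the replication path rather than the wallboard.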
