
DBReplication Failure

keithknowles
Level 1

I checked the DBReplication status of my 5.1.1 cluster (1 pub and 2 subs), and though it took a long time to create the file, there was no output in it. I then performed a "utils dbreplication stop" on the sub and a "utils dbreplication reset <subscriber2>" and let it run through. Ever since that point I have been getting an "IDSReplicationFailure class_id : CDR DEFINE SERVER command failed on the subscriber class_msg : replstate = 3 specific_msg : We are in the svc routine of ReplTask trying to setup replication on Subscriber AppID : Cisco Database Layer Monitor ClusterID : UofLCCM NodeID : <subscriber2> " alert. If someone can tell me what I have done wrong and how to get replication working again throughout my cluster, that would be awesome. This was all precipitated by an attempt to upgrade from 5.1.1 to 5.1.3: the publisher could see the patch file, but the subscribers couldn't. I had read that failed replication can cause this to happen. Any help would be greatly appreciated.

1 Accepted Solution

Accepted Solutions

The upgrade from 5.1.1 to 5.1.3 needs to be carried out on the Publisher before any Subscriber. Once the Publisher has been upgraded, the Subscribers should then be able to see the upgrade image.

Can you confirm what the DB status is within RTMT? Does it show the status of all nodes or only the Publisher, and what are the states?

The sqlhosts file is present on each server and contains a reference for each Cisco Unified Communications Manager node in the cluster.

If those sqlhosts files are out of sync, the SQL replication fails. Use the 'show tech dbstateinfo' CLI command on each Subscriber to check the local sqlhosts at the bottom of the output for any mismatch on each node.

If there are mismatches within this file, they can only be corrected through root access by TAC.

Rgds

Allan


25 Replies

allan.thomas
Level 8

The dbreplication reset should be carried out on the Publisher node, not the Subscriber.

Run the same process again as follows:-

Initiate 'utils dbreplication stop' on each Subscriber before the Publisher node. Ensure that the stop completes before proceeding to the next Subscriber.

Once the dbreplication stop has been carried out on all Subscribers, initiate the same on the Publisher.

Only when the dbreplication stop has completed on the Publisher should you run 'utils dbreplication reset all'.

HTH.

Allan.

I have followed that procedure, but now I am getting that error for both subscribers. Is it possible I will get that until replication has finished??

The CDR Define Server error you are experiencing across both subscribers definitely suggests a dbreplication failure.

In this instance there is an additional step that you should try before initiating a dbreplication reset from the Publisher.

After stopping dbreplication on both Subs and the Pub as before, execute the command 'utils dbreplication dropadmindb' in the same manner. Wait for it to complete on each Sub before moving to the next, and then finally run it on the Pub.

Once this stage is complete on the Pub, run 'utils dbreplication reset all' from the Pub.

If you receive the error 'Enterprise Replication not active (62)' after running the reset, this is expected. Replication can take up to 30 minutes to re-establish afterwards.

If this still fails to correct the issue, then there may be an underlying issue within the sqlhosts file, which can only be changed via root access by TAC.

HTH.

Allan.
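For anyone turning the sequence above into a checklist, here is a minimal Python sketch of the ordering only. It does not connect to any server; it just lists which command goes on which node, in the order described in this thread. The function name `build_reset_plan` and the node names are hypothetical, and the three command strings are the ones quoted in the posts above.

```python
from typing import List, Tuple

# Command strings exactly as quoted in this thread.
STOP = "utils dbreplication stop"
DROP_ADMINDB = "utils dbreplication dropadmindb"
RESET_ALL = "utils dbreplication reset all"


def build_reset_plan(
    publisher: str, subscribers: List[str], drop_admindb: bool = False
) -> List[Tuple[str, str]]:
    """Return (node, command) pairs in the order they should be run.

    Hypothetical helper: each step is assumed to be run by hand on the
    node's CLI and allowed to complete before the next step starts.
    """
    plan: List[Tuple[str, str]] = []

    # 1. Stop replication on every subscriber, then on the publisher.
    for sub in subscribers:
        plan.append((sub, STOP))
    plan.append((publisher, STOP))

    # 2. Optional extra step: drop the admin DB, subscribers first.
    if drop_admindb:
        for sub in subscribers:
            plan.append((sub, DROP_ADMINDB))
        plan.append((publisher, DROP_ADMINDB))

    # 3. Reset replication for the whole cluster from the publisher only.
    plan.append((publisher, RESET_ALL))
    return plan


if __name__ == "__main__":
    # Placeholder node names, not taken from the original cluster.
    steps = build_reset_plan("pub", ["sub1", "sub2"], drop_admindb=True)
    for number, (node, command) in enumerate(steps, start=1):
        print(f"{number}. on {node}: {command}  (wait for completion)")
```

The point of keeping it as a plain list is that the order is the whole procedure: subscribers before the publisher for every stop or drop, and the reset issued once, from the publisher, at the very end.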

SQL hosts?? I thought 5.X was Informix based, not SQL??

I have tried the dropadmindb, but I am still getting the same error. I have also noticed that the publisher seems to think it's the only node in the cluster; it tells me that when I run "utils dbreplication status".

The upgrade from 5.1.1 to 5.1.3 needs to be carried out on the Publisher before any Subscriber. Once the Publisher has been upgraded, the Subscribers should then be able to see the upgrade image.

Can you confirm what the DB status is within RTMT? Does it show the status of all nodes or only the Publisher, and what are the states?

The sqlhosts file is present on each server and contains a reference for each Cisco Unified Communications Manager node in the cluster.

If those sqlhosts files are out of sync, the SQL replication fails. Use the 'show tech dbstateinfo' CLI command on each Subscriber to check the local sqlhosts at the bottom of the output for any mismatch on each node.

If there are mismatches within this file, they can only be corrected through root access by TAC.

Rgds

Allan
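If you do manage to capture the sqlhosts section from each node, comparing them by eye gets error-prone. Below is a rough Python sketch of one way to spot differences, assuming you have pasted each node's sqlhosts text into a string yourself. It treats every non-empty, non-comment line as an opaque entry and makes no assumption about the column layout. The function names and the sample entries are made up for illustration, and any fix would still have to go through TAC as noted above.

```python
from typing import Dict, Set


def normalize(sqlhosts_text: str) -> Set[str]:
    """Turn pasted sqlhosts text into a set of whitespace-normalized lines."""
    entries: Set[str] = set()
    for line in sqlhosts_text.splitlines():
        line = line.strip()
        # Skip blank lines and lines that look like comments.
        if not line or line.startswith("#"):
            continue
        # Collapse runs of spaces/tabs so spacing alone is not a mismatch.
        entries.add(" ".join(line.split()))
    return entries


def find_mismatches(per_node: Dict[str, str]) -> Dict[str, Set[str]]:
    """For each node, list entries seen on some other node but missing here."""
    parsed = {node: normalize(text) for node, text in per_node.items()}
    all_entries: Set[str] = set().union(*parsed.values())
    report: Dict[str, Set[str]] = {}
    for node, entries in parsed.items():
        missing = all_entries - entries
        if missing:
            report[node] = missing
    return report


if __name__ == "__main__":
    # Placeholder text only; real sqlhosts entries will look different.
    captured = {
        "pub": "entry_for_pub a b\nentry_for_sub1 a b\nentry_for_sub2 a b",
        "sub1": "entry_for_pub a b\nentry_for_sub1   a b\nentry_for_sub2 a b",
        "sub2": "entry_for_pub a b\nentry_for_sub2 a b",
    }
    for node, missing in find_mismatches(captured).items():
        print(f"{node} is missing: {sorted(missing)}")
```

An empty result only means the pasted text matched line for line; it says nothing about whether the entries themselves are correct.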

My version of 5.1 doesn't have a "show tech dbstateinfo" command. Is there an older version of this command?? This is the output of "utils dbreplication status".

SERVER        ID  STATE   STATUS  QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_batman_ccm  2   Active  Local   0

Status cannot be reported for a cluster with a single active node; aborting status check operation

Try the command 'show perf query class "Number of Replicates Created and State of Replication"'.

This will only show the status of the replication, as it would through RTMT. I assume it will only return the one entry.

Can you confirm that the Publisher is able to reach each Subscriber, both by IP address and by hostname?

You can verify this through the CLI using 'utils network host' as below, and 'show tech network hosts' CLI commands on all the cluster nodes:

'utils network host

Can you also post the output from the following command:-

'run sql select name,nodeid from ProcessNode'

It simply appears that the subscribers were not added when they were first installed. I assume these servers are listed in the admin pages under System > Server?

Allan.

==>query class :

- Perf class (Number of Replicates Created and State of Replication) has instances and values:

ReplicateCount -> Number of Replicates Created = 342

ReplicateCount -> Replicate_State = 2

'run sql select name,nodeid from ProcessNode'

================== ======

EnterpriseWideData 1

2

3

4

RTMT DB States.

Publisher: 2

Subscriber 1: 2

Subscriber 2: 3

Curious. The replicate state for the Publisher and Subscriber 1 shows a status of 2, which is good. It seems that only Subscriber 2 has broken replication?

Does the dbreplication status command still only return the publisher node? I would expect to see both subs. Remember the reset could take up to 30 minutes.

Allan.
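As a quick way to read the RTMT numbers above, here is a small Python sketch that flags any node whose replication state is not 2. The only facts taken from this thread are that 2 is good and that Subscriber 2 at 3 looks broken; the other descriptions in `STATE_NOTES` are the meanings commonly given for these counters and should be treated as assumptions for 5.1. The function name is hypothetical.

```python
from typing import Dict, List

GOOD_STATE = 2  # Stated in this thread: a state of 2 is good.

# Assumed meanings for the remaining values; verify against your version.
STATE_NOTES: Dict[int, str] = {
    0: "replication not started (assumed meaning)",
    1: "replicates created, count incorrect (assumed meaning)",
    2: "replication good",
    3: "replication bad",
    4: "replication setup did not succeed (assumed meaning)",
}


def nodes_needing_attention(states: Dict[str, int]) -> List[str]:
    """Return the names of nodes whose state is anything other than 2."""
    return sorted(node for node, state in states.items() if state != GOOD_STATE)


if __name__ == "__main__":
    # Values as posted from RTMT earlier in the thread.
    rtmt_states = {"Publisher": 2, "Subscriber 1": 2, "Subscriber 2": 3}
    for node in nodes_needing_attention(rtmt_states):
        state = rtmt_states[node]
        print(f"{node}: state {state} ({STATE_NOTES.get(state, 'unknown')})")
```

With the values posted here, only Subscriber 2 is flagged, which matches the reading above.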

Still only outputs the publisher node.

I am still getting that error for Sub 1 too, even though its dbReplication status is 2??

Were you able to ping and resolve the hostname of each Subscriber from the Publisher?
