CUCM 7.0 - SUBSCRIBER IN REPLICATION STATE = 4

Unanswered Question
Mar 15th, 2010

Hello,


I have a CUCM 7.0(2) cluster with 3 servers (1 Publisher, 2 Subscribers), and one of the Subscribers is stuck in Replication State 4. I have tried everything I can think of, but I have not been able to change this state.


This is the output of "utils dbreplication clusterreset":


/cm/trace/dbl/sdi/clusterReset_20100225173215.out

Deleting server g_end_ccmsub2_ccm7_0_2_20000_5 from the sub.
command failed -- Enterprise Replication not active  (62)
Deleting server g_end_ccmsub2_ccm7_0_2_20000_5 from the pub.
command failed -- undefined server  (37)

The Sub g_end_ccmsub2_ccm7_0_2_20000_5 can't be defined.
Please analyze the logs in cm/trace/dbl and fix accordingly.
This may be due to a corrupt admin database. If so, execute utils dbreplication dropadmindb
on the node which has indicated failure.
Exiting with errors.


If I try the "utils dbreplication dropadmindb" command, I get the following error:


sucmd_err [su -c 'ulimit -c 0;cdr err --zap' - informix ]
Executing [su -c 'ulimit -c 0;cdr define server --connect=end_ccmsub2_ccm7_0_2_20000_5 --idle=0 --init --sync=g_end_ccmpub_ccm7_0_2_20000_5 g_end_ccmsub2_ccm7_0_2_20000_5 --ats=/var/log/active/cm/log/informix/ats --ris=/var/log/active/cm/log/informix/ris;' - informix]
We got exception in Cdr define
Exception from cdr define e.value[5] e.msg [Error executing [su -c 'ulimit -c 0;cdr define server --connect=end_ccmsub2_ccm7_0_2_20000_5 --idle=0 --init --sync=g_end_ccmpub_ccm7_0_2_20000_5 g_end_ccmsub2_ccm7_0_2_20000_5 --ats=/var/log/active/cm/log/informix/ats --ris=/var/log/active/cm/log/informix/ris;' - informix] returned [1280]]
Executing [su -c 'ulimit -c 0;cdr delete server --connect=end_ccmsub2_ccm7_0_2_20000_5 g_end_ccmsub2_ccm7_0_2_20000_5' - informix]

Exception from cdr delete e.value[62] e.msg [Error executing [su -c 'ulimit -c 0;cdr delete server --connect=end_ccmsub2_ccm7_0_2_20000_5 g_end_ccmsub2_ccm7_0_2_20000_5' - informix] returned [15872]]
Executing [su -c 'ulimit -c 0;cdr delete server g_end_ccmsub2_ccm7_0_2_20000_5' - informix]

Exception from cdr delete e.value[37] e.msg [Error executing [su -c 'ulimit -c 0;cdr delete server g_end_ccmsub2_ccm7_0_2_20000_5' - informix] returned [9472]]

2009-06-19 07:41:57,082 - Drop_Admin_DB - Stopping dblrpc
2009-06-19 07:41:57,083 - Drop_Admin_DB - Executing:[/usr/local/cm/bin/controlcenter.sh "A Cisco DB Replicator" stop]
2009-06-19 07:42:10,538 - Drop_Admin_DB - stoping dblrpc - rc is [0]
2009-06-19 07:42:10,538 - Drop_Admin_DB - Attempting to drop syscdr using IDS cdr remove.
2009-06-19 07:42:10,733 - Drop_Admin_DB - retcode of cdr remove [115]
2009-06-19 07:42:10,733 - Drop_Admin_DB - cdr remove failed on first attempt
2009-06-19 07:42:10,733 - Drop_Admin_DB - Checking to see if syscdr database exists
2009-06-19 07:42:10,733 - Drop_Admin_DB - Executing /bin/su - informix -c "echo select name from sysdatabases | dbaccess sysmaster"> /var/log/active/cm/trace/dbl/sdi/cmd.log
2009-06-19 07:42:10,879 - Drop_Admin_DB - Running onstat -
2009-06-19 07:42:10,999 - Drop_Admin_DB - return value of onstat - = [5]
2009-06-19 07:42:10,999 - Drop_Admin_DB - Database is Up
2009-06-19 07:42:11,005 - Drop_Admin_DB - Running onstat -
2009-06-19 07:42:11,126 - Drop_Admin_DB - return value of onstat - 5 :
2009-06-19 07:42:11,126 - Drop_Admin_DB - Database missing sysdb = 1
2009-06-19 07:42:11,126 - Drop_Admin_DB - rc: [115]
2009-06-19 07:42:11,126 - Drop_Admin_DB - The syscdr database is missing! No need to remove syscdr database
2009-06-19 07:42:11,126 - Drop_Admin_DB - Starting dblrpc.
2009-06-19 07:42:24,545 - Drop_Admin_DB - hd = 0, Syscdr already dropped
.
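
A note on the numbers in the dropadmindb output: each "returned [...]" value equals the matching e.value multiplied by 256 (1280 = 5 x 256, 15872 = 62 x 256, 9472 = 37 x 256). That is consistent with a raw 16-bit wait status that keeps the exit code in its high byte, although nothing in this thread confirms how the tool builds these numbers. A minimal Python sketch under that assumption follows; the function name is illustrative, and the message table holds only the two texts that appear in the clusterreset output:

```python
# Messages for codes 62 and 37 are copied from the clusterreset output in this
# thread. Code 5 is left unmapped because its message is not shown anywhere here.
CDR_MESSAGES = {
    62: "Enterprise Replication not active",
    37: "undefined server",
}


def exit_code(raw_status: int) -> int:
    """Return the high byte of a 16-bit wait status (assumed to be the exit code)."""
    return (raw_status >> 8) & 0xFF


if __name__ == "__main__":
    for raw in (1280, 15872, 9472):
        code = exit_code(raw)
        message = CDR_MESSAGES.get(code, "message not shown in this thread")
        print(f"returned [{raw}] -> cdr code {code}: {message}")
```

Read this way, the two "cdr delete" failures are the same two errors that clusterreset reported, (62) on the subscriber and (37) on the publisher.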


Any ideas? I'm desperate! Thanks!

Aaron Harrison Mon, 03/15/2010 - 10:43

Hi


Ok - first things first... you need to run the Unified CM Cluster Overview report to verify the hosts/rhosts tables are consistent across the cluster. If not, you can reset replication all you like but will never get anywhere.


If that flags any warnings/errors, post them back, but typically the fix is to restart the problem servers so that the files are regenerated.


Secondly - when you do the clusterreset, it's important to stop the replication first (utils dbreplication stop). When you do this, the CLI will block for anything up to 20-30 minutes. Do not CTRL+C it, just wait. Once it's stopped on all the subscribers, then do it on the publisher.


Then do your cluster reset, and monitor the Replicate_State counter from RTMT or from the CLI:


show perf query class "Number of Replicates Created and State of Replication"


The state counter should go to 0, and eventually go to another state... hopefully 2.
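
The order described above can be written down as a small Python sketch. It only builds a checklist of (node, command) pairs and does not connect to anything. The command strings are the ones quoted in this thread; running the clusterreset and the counter query from the publisher is an assumption, and the function name is made up for illustration:

```python
def replication_reset_plan(publisher, subscribers):
    """Build the ordered checklist: stop on every subscriber, then on the
    publisher, then reset the cluster, then watch the Replicate_State counter."""
    stop = "utils dbreplication stop"
    plan = [(sub, stop) for sub in subscribers]  # subscribers first
    plan.append((publisher, stop))               # publisher last
    # Assumption: the reset and the monitoring query are issued on the publisher.
    plan.append((publisher, "utils dbreplication clusterreset"))
    plan.append((
        publisher,
        'show perf query class "Number of Replicates Created and State of Replication"',
    ))
    return plan


if __name__ == "__main__":
    for node, command in replication_reset_plan("END-CCMPUB", ["END-CCMSUB1", "end-ccmsub2"]):
        print(f"{node}: {command}")
```

Each stop step should be allowed to finish (it can block for 20-30 minutes) before moving on to the next line of the checklist.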


Regards


Aaron


Please rate helpful posts..

alicia.vega Mon, 03/15/2010 - 10:58


Some additional information:


Unified CM Database Status


Server        Number of Replicates Created   Replicate_State
10.255.27.1   412                            2 - good
10.255.27.2   412                            2 - good
10.255.27.3   412                            4 - setup failed
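
For anyone who wants to check this table automatically, here is a minimal Python sketch that parses rows in the shape shown above and lists every node whose Replicate_State is not 2. The row format is inferred from this one report, so the regular expression and the function names are assumptions rather than a documented format:

```python
import re

# Rows copied from the Unified CM Database Status report in this post.
STATUS_TEXT = """\
10.255.27.1 412 2 - good
10.255.27.2 412 2 - good
10.255.27.3 412 4 - setup failed
"""

# server, replicate count, state number, state description
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s*-\s*(.+)$")


def parse_status(text):
    """Return (server, replicates, state, description) tuples for matching rows."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            server, count, state, description = match.groups()
            rows.append((server, int(count), int(state), description.strip()))
    return rows


def unhealthy(rows, good_state=2):
    """Return the rows whose state differs from the 'good' state (2 in this report)."""
    return [row for row in rows if row[2] != good_state]


if __name__ == "__main__":
    for server, count, state, description in unhealthy(parse_status(STATUS_TEXT)):
        print(f"{server}: state {state} ({description}), {count} replicates")
```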


Replication Server List (cdr list serv)


10.255.27.1
SERVER                 ID STATE    STATUS     QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_end_ccmpub_ccm7_0_2_20000_5    2 Active   Local           0               
g_end_ccmsub1_ccm7_0_2_20000_5    3 Active   Connected       0 Mar 15 18:19:00


10.255.27.2
SERVER                 ID STATE    STATUS     QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_end_ccmpub_ccm7_0_2_20000_5    2 Active   Connected     512 Mar 15 18:21:06
g_end_ccmsub1_ccm7_0_2_20000_5    3 Active   Local           0              


10.255.27.3

--> Empty! Nothing is listed for this server.
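
The gap in the server lists can be made explicit with a short Python sketch that compares what each node lists against the three server group names seen in this thread. Treating every line that starts with "g_" as a server entry is an assumption based on this output only, and the function names are illustrative:

```python
# "cdr list serv" rows copied from the report in this post; the third node printed nothing.
CDR_LIST = {
    "10.255.27.1": """\
g_end_ccmpub_ccm7_0_2_20000_5    2 Active   Local           0
g_end_ccmsub1_ccm7_0_2_20000_5    3 Active   Connected       0 Mar 15 18:19:00
""",
    "10.255.27.2": """\
g_end_ccmpub_ccm7_0_2_20000_5    2 Active   Connected     512 Mar 15 18:21:06
g_end_ccmsub1_ccm7_0_2_20000_5    3 Active   Local           0
""",
    "10.255.27.3": "",
}

# Group names that appear in the outputs posted in this thread.
EXPECTED = {
    "g_end_ccmpub_ccm7_0_2_20000_5",
    "g_end_ccmsub1_ccm7_0_2_20000_5",
    "g_end_ccmsub2_ccm7_0_2_20000_5",
}


def listed_servers(text):
    """Return the server group names found in one node's listing."""
    return {line.split()[0] for line in text.splitlines() if line.startswith("g_")}


def missing_by_node(cdr_list, expected):
    """For each node, return the expected groups that its listing does not contain."""
    return {node: sorted(expected - listed_servers(text)) for node, text in cdr_list.items()}


if __name__ == "__main__":
    for node, missing in missing_by_node(CDR_LIST, EXPECTED).items():
        print(node, "is missing:", ", ".join(missing) or "nothing")
```

On this data, both healthy nodes are missing only the sub2 group, while 10.255.27.3 lists no groups at all.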

Unified CM Hosts


Server Host Information
10.255.27.1 #This file was generated by the /etc/hosts cluster manager.
#It is automatically updated as nodes are added, changed, removed from the cluster.

127.0.0.1 localhost
10.255.27.2  END-CCMSUB1
10.255.27.1  END-CCMPUB
10.255.27.3  end-ccmsub2


10.255.27.2 #This file was generated by the /etc/hosts cluster manager.
#It is automatically updated as nodes are added, changed, removed from the cluster.

127.0.0.1 localhost
10.255.27.2  END-CCMSUB1
10.255.27.1  END-CCMPUB
10.255.27.3  end-ccmsub2


10.255.27.3 #This file was generated by the /etc/hosts cluster manager.
#It is automatically updated as nodes are added, changed, removed from the cluster.

127.0.0.1 localhost
10.255.27.3  end-ccmsub2
10.255.27.1  end-ccmpub
10.255.27.2  END-CCMSUB1


Unified CM Rhosts


Server rhosts File
10.255.27.1 localhost
END-CCMSUB1
END-CCMPUB
end-ccmsub2

10.255.27.2 localhost
END-CCMPUB
END-CCMSUB1
### IDS BEGIN - DO NOT REMOVE
END-CCMPUB
END-CCMSUB1
end-ccmsub2

### IDS END - DO NOT REMOVE

10.255.27.3 localhost
end-ccmsub2
end-ccmpub
END-CCMSUB1


--> Why is the output for 10.255.27.2 so different?
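
Since the hosts/rhosts tables are supposed to be consistent across the cluster, one way to compare them is a small Python sketch that collects every spelling of each hostname across the nodes' hosts files and reports the entries that differ. The data is copied from the report above; the function names are illustrative, and nothing in this thread establishes whether the upper/lower case difference it finds is what breaks replication:

```python
from collections import defaultdict

# Hosts file entries copied from the Unified CM Hosts report, keyed by reporting node.
HOSTS = {
    "10.255.27.1": """\
127.0.0.1 localhost
10.255.27.2  END-CCMSUB1
10.255.27.1  END-CCMPUB
10.255.27.3  end-ccmsub2
""",
    "10.255.27.2": """\
127.0.0.1 localhost
10.255.27.2  END-CCMSUB1
10.255.27.1  END-CCMPUB
10.255.27.3  end-ccmsub2
""",
    "10.255.27.3": """\
127.0.0.1 localhost
10.255.27.3  end-ccmsub2
10.255.27.1  end-ccmpub
10.255.27.2  END-CCMSUB1
""",
}


def parse_hosts(text):
    """Return {ip: hostname}, skipping comments, blank lines and localhost."""
    entries = {}
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 2 and parts[1] != "localhost":
            entries[parts[0]] = parts[1]
    return entries


def spelling_mismatches(hosts_by_node):
    """Return {ip: set of spellings} for every IP that is spelled more than one way."""
    seen = defaultdict(set)
    for text in hosts_by_node.values():
        for ip, name in parse_hosts(text).items():
            seen[ip].add(name)
    return {ip: names for ip, names in seen.items() if len(names) > 1}


if __name__ == "__main__":
    for ip, names in spelling_mismatches(HOSTS).items():
        print(ip, "appears as:", ", ".join(sorted(names)))
```

On this data the only difference is the publisher, which the failing subscriber spells in lower case.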

Vladimir Stankov Fri, 07/09/2010 - 13:55

Did you manage to fix that?


I am experiencing the same problem with CM 6.1(3), just after upgrading from 6.1(2).

I tried utils dbreplication stop on the sub and the pub, and after that dbreplication reset, but nothing happened.


I'd appreciate any info on this matter.
