Error while joining HA IPCC server

Unanswered Question
Jun 22nd, 2010

When joining node 2 to the HA cluster, all services activate fine except historical reporting, which fails with the following error:

Error Activating components.

com.cisco.cluster.ClusterException: Component Controller CRS Historical Datastore on node 2 : Enable
com.cisco.database.util.DBException: Unable to connect to the DB on node:QWCORPCC1-1
com.cisco.database.util.DBException: On node(QWCORPCC1) mismatch in Hisotrical Datastore DB (db_cra) size. Bootstrap[db_cra=13312MB] does not match SQL2K Engine[db_cra=10240MB]

The size mismatch is what I find confusing; it seems like it should replicate the older database.

Aaron Harrison Tue, 06/22/2010 - 12:57

Hi

The size message could be misleading - size is generally an issue if you add a lower-capacity server to the cluster than the original node.

Assuming that's not the case here, it's probably wise to check the DBs can be connected together (the error may be a result of the line before that says can't connect to DB).

Firstly, did you enter the hostnames/ips in the hosts and lmhosts files on BOTH servers? Also ensure that the lmhosts is lmhosts and not the default lmhosts.sam file (i.e. rename it). This often causes failures if the servers aren't in the same subnet, but is good practice either way.
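
A rough way to check the name files on each server, as a sketch only. It assumes the standard Windows location for these files (%SystemRoot%\System32\drivers\etc); the function names are made up for illustration, and the second hostname in the usage line is a placeholder, not the real node name.

```python
import os


def names_in_hosts_text(text):
    """Return the set of lowercase names found in hosts/lmhosts-style text."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and #PRE-style tags
        parts = line.split()
        if len(parts) >= 2:  # first field is the IP, the rest are names
            names.update(p.lower() for p in parts[1:])
    return names


def check_name_files(etc_dir, required_names):
    """Report, per file, whether it exists and lists every required name."""
    report = {}
    for fname in ("hosts", "lmhosts"):
        path = os.path.join(etc_dir, fname)
        if not os.path.isfile(path):
            # Also the result when only the sample file lmhosts.sam is present.
            report[fname] = "missing"
            continue
        with open(path) as fh:
            found = names_in_hosts_text(fh.read())
        absent = sorted(n for n in required_names if n.lower() not in found)
        report[fname] = "ok" if not absent else "no entry for: " + ", ".join(absent)
    return report


if __name__ == "__main__":
    etc = os.path.join(os.environ.get("SystemRoot", r"C:\Windows"),
                       "System32", "drivers", "etc")
    # "SECOND-NODE" is a placeholder; substitute the real hostname of node 2.
    print(check_name_files(etc, ["QWCORPCC1", "SECOND-NODE"]))
```

Run it on both nodes; anything other than "ok" for either file is worth fixing before retrying the join.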

Next, go into SQL Enterprise Manager on each server (you did install full SQL on both nodes, right?). On each, the current server should be registered. If you right click the server group, you can do a 'new Sql server registration' - try to register the SQL instance on the other server on each (using hostname\crssql as the server). Use Windows Authentication.

Post back any errors that come up when you try this.

Regards

Aaron

mcox00941 Wed, 06/23/2010 - 07:44

Thanks Aaron, Here are the results

The size message could be misleading - size is generally an issue if you add a lower-capacity server to the cluster than the original node.

>This was the problem the first time I tried this, so I already learned that lesson. Now the servers are identical.

Assuming that's not the case here, it's probably wise to check the DBs can be connected together (the error may be a result of the line before that says can't connect to DB).

> The line before (though I didn't capture it) did not show an error; the historical database was the only error entry.

Firstly, did you enter the hostnames/ips in the hosts and lmhosts files on BOTH servers? Also ensure that the lmhosts is lmhosts and not the default lmhosts.sam file (i.e. rename it). This often causes failures if the servers aren't in the same subnet, but is good practice either way.

> I've updated both of these files but assume a reboot is necessary, so I'll do that at the next downtime. The servers are on the same subnet though.

Next, go into SQL Enterprise Manager on each server (you did install full SQL on both nodes, right?). On each, the current server should be registered. If you right click the server group, you can do a 'new Sql server registration' - try to register the SQL instance on the other server on each (using hostname\crssql as the server). Use Windows Authentication.

> This works fine on both servers, but I did find that the new server comes up just as the hostname; there is no crssql in the path. I then went into appadmin and all appears to be in order except for what I found under 'system>datastore control'. It shows only the publisher server under historical with node ID 1, Y/Y, replication status unknown, and last action is 'no available information for the current cluster setup'. Repository/Agent/Configuration all show signs of activity. I'm wondering if maybe historical reporting just isn't activated on this cluster? Below is the licensing info if that's of any help:

Configured Licenses:

Package: Cisco Unified CCX Premium

IVR Port(s): 170

Cisco Unified CCX Premium Seat(s): 85

High Availability Enabled: 1

Cisco Unified CCX Preview Outbound Dialer: Enabled

Cisco Unified CCX Maximum Agents: 300

Aaron Harrison Wed, 06/23/2010 - 10:18

Hi

When I said 'the previous line' I meant this:

com.cisco.cluster.ClusterException: Component Controller CRS Historical Datastore on node 2 : Enable
com.cisco.database.util.DBException: Unable to connect to the DB on node:QWCORPCC1-1
com.cisco.database.util.DBException: On node(QWCORPCC1) mismatch in Hisotrical Datastore DB (db_cra) size. Bootstrap

Three lines, the last whinges about size, the 2nd mentions DB connection failure.
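
To make the two separate complaints easier to see, here is a small sketch that pulls them out of the error text. The error text is copied from the original post; the regular expression and function name are my own and only assume the message keeps this exact shape.

```python
import re

# Error text copied verbatim from the original post in this thread.
ERROR_TEXT = """\
com.cisco.cluster.ClusterException: Component Controller CRS Historical Datastore on node 2 : Enable
com.cisco.database.util.DBException: Unable to connect to the DB on node:QWCORPCC1-1
com.cisco.database.util.DBException: On node(QWCORPCC1) mismatch in Hisotrical Datastore DB (db_cra) size. Bootstrap[db_cra=13312MB] does not match SQL2K Engine[db_cra=10240MB]
"""

# Captures the DB name and size from the Bootstrap[...] and Engine[...] parts.
SIZE_RE = re.compile(r"Bootstrap\[(\w+)=(\d+)MB\].*?Engine\[(\w+)=(\d+)MB\]")


def summarize(error_text):
    """Split the error into connection failures and the reported size pair."""
    lines = [ln.strip() for ln in error_text.splitlines() if ln.strip()]
    connect_failures = [ln for ln in lines if "Unable to connect to the DB" in ln]
    sizes = None
    match = SIZE_RE.search(error_text)
    if match:
        sizes = {
            "db": match.group(1),
            "bootstrap_mb": int(match.group(2)),
            "engine_mb": int(match.group(4)),
        }
    return {"connect_failures": connect_failures, "sizes": sizes}


summary = summarize(ERROR_TEXT)
print(summary["connect_failures"])
print(summary["sizes"])
```

If the connection failure is the real cause, the size line may just be a knock-on effect, which is why it's worth ruling out connectivity first.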

Re: the instance of SQL showing up as just hostname - if you go into Services on this server, is it listed as MSSQL, or MSSQL$CRSSQL?

It sounds like someone has installed a default instance of SQL rather than the required CRSSQL instance. That won't work :-) , and wouldn't happen normally if you are using the Cisco-ised SQL CD.
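
As a quick illustration of the naming difference, a sketch that classifies a Windows service name. It assumes the usual SQL Server convention that a default instance runs as service MSSQLSERVER and a named instance as MSSQL$<name>; the function name is made up.

```python
def instance_from_service(service_name):
    """Return the named-instance name, or None for a default instance."""
    name = service_name.strip().upper()
    if name.startswith("MSSQL$") and len(name) > len("MSSQL$"):
        return name[len("MSSQL$"):]  # named instance, e.g. CRSSQL
    if name == "MSSQLSERVER":
        return None  # default instance
    raise ValueError("not a SQL Server engine service name: %r" % service_name)


# The service the cluster needs, per the discussion above.
assert instance_from_service("MSSQL$CRSSQL") == "CRSSQL"
```

A result of None would point to a default instance, which is the case that won't work here.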

The datastore page won't show the subscriber, as at the moment the datastores aren't activated due to the error you have.

Check on the type of SQL instance (CRSSQL or not) and report back...

Regards

Aaron

mcox00941 Wed, 06/23/2010 - 12:03

The services.msc shows MSSQL$CRSSQL as expected. It is running on both servers under the same account, CRSAdministrator.

Aaron Harrison Wed, 06/23/2010 - 12:48

OK - so you are saying that in SQL Enterprise Manager, if you start a new registration, the server is listed as just the hostname?

If you try to explicitly register hostname\crssql what do you get?

Aaron

mcox00941 Wed, 06/23/2010 - 13:21

Ok, well in Enterprise Manager they both show as qwcorpcc\crssql, so I'm pretty sure it was just how the registration pane showed it. I don't think the issue is in MSSQL. Here is a screenshot.

Attachment: 
Aaron Harrison Wed, 06/23/2010 - 14:57

Hi

So I did once have a size mismatch; I don't know why this happened, or if this is a TAC-friendly solution... I do know the server I put in has been OK for a couple of years though.

What I did is this cheeky little routine:

  • First, take a copy of the entire c:\program files\wfavvid\clusterdata folder - I've never had a problem with this but have seen posts from some folk who have broken their servers.
  • Then go into CET (on either server is OK I think) - start/run, type cet, then hit 'no' in the dialog
  • Click on com.cisco.crs.cluster.config.ClusterDependantConfig, then on the right you should have two records
  • One is for the existing node, and the other for the new one
  • Double click on one of the records on the right, and then in the box that appears click the second tab
  • At the bottom is a 'Database Size' field. You can change this from what it is now for the new node (13312 in your case) to the 10240 that SQL has actually provisioned.
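
For the backup in the first step, a plain copy in Explorer is fine; if you'd rather script it, something like this sketch would do. The source path is the one mentioned above; the destination folder and function name are just examples I've picked.

```python
import os
import shutil
import time


def backup_folder(src, dest_parent):
    """Copy src into dest_parent under a timestamped name; return the new path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = os.path.basename(os.path.normpath(src))
    dest = os.path.join(dest_parent, base + "-backup-" + stamp)
    # copytree raises if dest already exists, so an older backup is never overwritten.
    shutil.copytree(src, dest)
    return dest


# Example call (destination is a hypothetical folder; create it first):
# backup_folder(r"c:\program files\wfavvid\clusterdata", r"c:\backups")
```

Check the copy actually contains files before touching anything in CET.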

Like I say, I don't know how this gets out of sync, but since your DB looks OK, it's worth a try...

Regards

Aaron

mcox00941 Thu, 06/24/2010 - 09:31

Looks like both servers are identical in that respect, perhaps it fixed itself? One thing that concerns me is the fact that the 'publisher' record ID is a negative number. See attached screen shot.

Attachment: 
