I am getting Event ID:208 Warnings
SQL Server Scheduled Job 'Api-Cm-1-Ccm0300-Ccm0300-Api-Cm-2-Ccm0300- 0' (0xBCDB3839D7D06F42B2EB7E866AF0DA24) - Status: Failed - Invoked on: 2/28/2002 3:11:00 PM - Message: The job failed. The Job was invoked by Schedule 3 (Replication Agent Schedule.). The last step to run was step 1 (Run agent.).
and Event ID:203 Information
SubSystem Message - Job 'Api-Cm-1-Ccm0300-Ccm0300-Api-Cm-2-Ccm0300- 0' (0xBCDB3839D7D06F42B2EB7E866AF0DA24), step 1 - The process could not connect to Distributor 'Api-Cm-1'.
Both are logged on my Subscriber server. As far as I can tell, the Subscriber is communicating with the Publisher; in fact, if I bring the Publisher down I can still operate phones from the Subscriber.
Is this something I need to worry about?
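For anyone collecting these events from several servers, here is a minimal sketch of pulling the job name, status, and Distributor name out of the two messages quoted above. The regular expressions are assumptions based only on the wording of these two messages; other SQL Server Agent events may be phrased differently.

```python
import re

# The two messages exactly as quoted in this post.
EVENT_208 = (
    "SQL Server Scheduled Job 'Api-Cm-1-Ccm0300-Ccm0300-Api-Cm-2-Ccm0300- 0' "
    "(0xBCDB3839D7D06F42B2EB7E866AF0DA24) - Status: Failed - Invoked on: "
    "2/28/2002 3:11:00 PM - Message: The job failed. The Job was invoked by "
    "Schedule 3 (Replication Agent Schedule.). The last step to run was "
    "step 1 (Run agent.)."
)
EVENT_203 = (
    "SubSystem Message - Job 'Api-Cm-1-Ccm0300-Ccm0300-Api-Cm-2-Ccm0300- 0' "
    "(0xBCDB3839D7D06F42B2EB7E866AF0DA24), step 1 - The process could not "
    "connect to Distributor 'Api-Cm-1'."
)

# Assumed patterns, derived from the two messages above and nothing else.
JOB_RE = re.compile(r"Job '(?P<job>[^']+)' \((?P<job_id>0x[0-9A-Fa-f]+)\)")
STATUS_RE = re.compile(r"Status: (?P<status>\w+)")
DISTRIBUTOR_RE = re.compile(
    r"could not connect to Distributor '(?P<distributor>[^']+)'"
)


def summarize(message):
    """Return whichever of job, job_id, status, distributor the message contains."""
    summary = {}
    for pattern in (JOB_RE, STATUS_RE, DISTRIBUTOR_RE):
        match = pattern.search(message)
        if match:
            summary.update(match.groupdict())
    return summary


if __name__ == "__main__":
    print(summarize(EVENT_208))
    print(summarize(EVENT_203))
```

Both events carry the same job ID, so grouping on `job_id` ties the failed-job warning to the "could not connect to Distributor" detail.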
If you look in SQL Enterprise Manager on the publisher, drill down until you see the Databases folder. How many CCM030X databases do you see? If you see more than one, then CCM0300 is probably not the current database, and this should not be anything to worry about.
There is only one, and it is CCM0300. Its creation date is listed as 2/24/2002, which I should note is when I had to do a reinstall and restore of the publisher after a failed attempt at upgrading to 3.2.
On the publisher, look at the CCM0300 database and click the plus sign beside it. Then click the plus sign beside Publications, and click on the CCM0300 publication. You should see the name of your subscriber. Does it say active or pending? Now look on the subscriber. Go to Databases and make sure you have a CCM0300 database. Click the plus sign beside CCM0300, then highlight the pull subscription. Does it say succeeded or running, or do you have some other error message? What version of Cisco CallManager are you running? Let me know what you find and I will help you get this resolved.
You may want to open a TAC case and provide access to the engineer. They can run these checks in a little while. Probably more time effective for both parties.
From the publisher, it says the subscriber pull status is Active
On the Subscriber, there is a CCM0300 database. In the Pull subscription area, it says the status is "Retrying" and the last action was "The process could not connect to Distributor API-CM-1"
You can email me directly
Have you changed the SQLSvc user's password on either server? This account is used to get the initial snapshot, and I have seen this error when the passwords have changed. If you did change the passwords, let me know before you try to correct it yourself; you could make things much worse. If you have not changed the passwords for the SQLSvc user, try the link below for reestablishing the subscription.
If you don't think you changed the password and this link does not help you please let me know and I will try and help you.
I did have to change the passwords as part of the failed upgrade attempt. Each server is using its own local SQLSvc account; the passwords are the same, though.
Do you think I should attempt the procedure?
Can you just verify that the passwords are the same? I usually log into CCMAdmin as the SQLSvc user on both servers using what I think the password is. If the SQLSvc passwords are not the same on all systems in the cluster, the initial snapshot of the replication will fail and you usually see the message you are getting. The only reason I say to try redoing the subscription is that you mentioned something about rebuilding the publisher. The later versions of the restore are supposed to rebuild the subscriptions, but I can't remember which version they put that into. Can you elaborate on the rebuild of the pub? Did you rebuild the sub after the restore? If you look at the services, make sure the SQL Server Agent service is logging on as .\SQLSvc on each server. Verify the passwords on both and let me know what you find.
Well, I'm not sure how it happened, but the publisher has a different password than what I expect it to be. I can change it tonight and try bringing the services back up using the new password (when I changed it before, I followed the 21-step procedure in the release notes, "Performing Post-installation Tasks").
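If you would rather check the logon account from a command prompt than click through the Services applet, something like the sketch below could work. It assumes the `sc` utility is available on the server and that the service is registered under the name `SQLSERVERAGENT` (the usual name for a default SQL Server instance); both are assumptions to confirm on your own box. The sample text only illustrates the general shape of `sc qc` output and is not captured from a CallManager server.

```python
import subprocess

# Illustrative sample of `sc qc` output (hypothetical, not from a real server).
SAMPLE_SC_QC_OUTPUT = r"""
[SC] QueryServiceConfig SUCCESS

SERVICE_NAME: SQLSERVERAGENT
        TYPE               : 10  WIN32_OWN_PROCESS
        START_TYPE         : 2   AUTO_START
        BINARY_PATH_NAME   : C:\path\to\sqlagent.exe
        SERVICE_START_NAME : .\SQLSvc
"""


def parse_start_name(sc_qc_output):
    """Return the SERVICE_START_NAME value from `sc qc` output, or None."""
    for line in sc_qc_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SERVICE_START_NAME":
            return value.strip()
    return None


def runs_as(account, expected_user="SQLSvc"):
    """True if the account name (ignoring the '.\\' or domain prefix) matches."""
    if account is None:
        return False
    return account.rsplit("\\", 1)[-1].lower() == expected_user.lower()


def logon_account(service_name):
    """Query a local service with `sc qc` and return its logon account."""
    result = subprocess.run(
        ["sc", "qc", service_name], capture_output=True, text=True, check=True
    )
    return parse_start_name(result.stdout)


if __name__ == "__main__":
    account = parse_start_name(SAMPLE_SC_QC_OUTPUT)
    print(account, runs_as(account))
```

This only confirms which account the service is configured to use. It cannot tell you whether the stored password is still correct, so you still need to verify the password itself on each server.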
As for rebuilding the publisher, it was because I had a failed upgrade to 3.2 and did not have a broken mirror drive to revert back to. The upgrade failure was pretty bad in that my DBL was completely gone. TAC had me reinstall from CD then restore the previously unupgraded publisher DB.
I have learned my lesson on pulling the drive before the upgrade.
Well, if the passwords are different, that is what is causing the replication issues. I am not sure which doc you are talking about, but here is what I do when I change these passwords. Right-click My Computer and click Manage. Go to Users and Groups, right-click the SQLSvc user, and choose Set Password. Then go to Start -> Programs -> Administrative Tools -> Component Services. Click the plus sign beside Component Services and drill down until you get to the DBL service. Right-click DBL and go to Properties. Click the Identity tab, make sure the user is SQLSvc, and set the password to what you changed it to. Then go to Start -> Programs -> Administrative Tools -> Services and make sure all the services logging on as SQLSvc reflect the new password. Then just reboot the box and you should be okay.
The replication should start working at that point. If you rebuilt the box the password would have gotten reset.
Did you reset the SQLSvc password after that?
Let me know if this helps,
Life is good... I reset the password for .\SQLSvc on the Publisher and all is well again. No more error messages on the Subscriber, and the Enterprise Manager on each system shows the correct "status".
I was referring to the procedure in the install guide:
It's a handy tool to follow to ensure you hit all the right places. I did notice, though, that on my servers the MOH and TFTP services also use the .\SqlSvc account and need to be updated when you change the password. The strange thing is that I did this on both the Publisher and the Subscriber "after" I restored the Publisher from backup, which is why I don't understand how it subsequently lost it.
Now that I'm synched up again, I plan on attempting the 3.2 upgrade again this weekend. I guess I'm just a glutton for punishment.
Thanks a lot for your insight, it was incredibly helpful,
I am glad that you got it working. If you have any problems with the 3.2 upgrade, start another post and we will see what we can do to help. Do you have a 7835 or 7825 or some other server?
That sounds good to me. The only reason I asked about the server models is that if you have a 7835, as you do for the publisher, you can pull a drive so that if something goes really wrong you can just revert to the spare drive. It is not really an issue now, but you can keep it for future reference. I know this procedure is outlined in the readme1st.pdf that you get when you download the upgrades.
I've been interested in these forum entries because we recently experienced a 'different' problem that resulted in all of the same errors mentioned here. So this post is to share our experience with others and hopefully avert upgrade problems or downtime.
One week prior to upgrading one of our CallManager clusters from 3.1(2c) to 3.1(3a) + tons of security hot fixes, our domain joined into an existing Active Directory forest. The CallManager cluster was already part of the domain, but the servers were not shut down and restarted following the AD change. Once we completed the upgrade on the publisher last Wednesday, we began to receive the SQL job engine errors, and major services (CCM, CTI Manager, Database Layer Monitor, and Telephony Call Dispatcher) would not start. We applied the 3.1(3aSPb) patch, since it replaces ccm.exe, but this did not fix the problem. We began working with TAC. We finally replaced the mirror drive and restored the publisher back to its 3.1(2c) state ... as I like to quote from Chef Emeril Lagasse, "It's a beautiful thing!"
The following morning, although phones were working, we soon found out that CFwdAll from the phones was not (the screen showed the message "Database unavailable"). Fortunately we could modify the setting from ccmadmin\ccmuser, which got us through the day. Working with TAC again, and thinking we had a broken publisher computer account, we removed the publisher from the domain into a workgroup, deleted the account, then rejoined the domain following synchronization. But by the time we had completed the 3.1(3a) upgrade, we received the exact same errors as the night before. Again, the mirror saved the evening.
We received some additional big guns support from TAC on Friday. The engineer reviewed all of our SQL settings and several tables, Windows server settings (like DNS and the hosts file for SQL 7.0), etc. We also checked the local security policy for the publisher, and here's where we finally found the real problem.
The resultant policy for the publisher password setting, inherited from the Default Domain Policy, had a minimum password length of 8 characters. The subscriber had a setting of 0, with no resultant domain policy as it had not been rebooted since the move to AD. Our engineer felt confident that problems were arising during the upgrade, when the password is changed to blanks, which our domain policy would not permit. The recommendation was to remove both publisher/subscriber from the domain into a workgroup, then proceed with the upgrade. This did the trick! The upgrade was successful, no errors, and all of the services started successfully.
We are now looking into a group policy setting that will override/disable the minimum password length for the OU in which these servers should reside. Obviously the CallManager servers do not need to be part of the domain, but we'd prefer to have them be. Hope this helps :-)
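To make the engineer's reasoning concrete, here is a tiny sketch of the minimum-length rule only. It assumes, as our engineer suspected, that the upgrade temporarily sets a blank password. Real Windows password policy covers more than length, so treat this as an illustration rather than a model of how the policy is enforced.

```python
def meets_min_length(password, min_length):
    """Check only the minimum-length rule of a password policy."""
    return len(password) >= min_length


# Values reported in this post; the blank password is the engineer's hypothesis.
PUBLISHER_MIN_LENGTH = 8   # inherited from the Default Domain Policy
SUBSCRIBER_MIN_LENGTH = 0  # no resultant domain policy (not rebooted since AD move)
BLANK_PASSWORD = ""

if __name__ == "__main__":
    print(meets_min_length(BLANK_PASSWORD, PUBLISHER_MIN_LENGTH))   # False
    print(meets_min_length(BLANK_PASSWORD, SUBSCRIBER_MIN_LENGTH))  # True
```

Under that assumption the blank password is rejected on the publisher and accepted on the subscriber, which would match the behaviour we saw.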
Great observations. I think that a problem with the SQL password was the beginning of my problem as well. When I got to that step in the upgrade from 3.1(2c) to 3.2, it didn't seem very happy, but it didn't give me any alternatives either. When the system came up post upgrade, the database was not just inaccessible - it was gone. There are at least 3 or 4 distinct highlights in the upgrade procedure about how to minimize downtime between Publisher or Subscriber upgrade but almost no mention of the criticality of the SQL password.
One thing to note: in the 3.2 version of CallManager, the upgrade asks you for your administrator password before it starts, so I don't believe it blanks the password out when running the upgrade. I would have to test this in the lab to verify it, but I believe this is the case.
Hope this helps,