I keep getting the error below when I try to install the subscriber on a CUCM 7.0 cluster. What is going on here?
Unable to set the system clock using the NTP server on the first node in the cluster.
Verify that this node was added to the server list in the CUCM admin System>Server.
The steps for adding a subsequent node (subscriber) are listed nicely in this doc:
Configuring a Subsequent Node
To configure a subsequent node in the cluster, follow these steps.
Caution You must configure a subsequent node on the first node by using Cisco Unified Communications Manager Administration before you install the subsequent node.
Install Software from a DVD on a Subsequent Node
Hope this helps!
That's what I thought. This is pretty common. I actually ran into this myself several months back and found that I could not cluster 2 servers (CUCM or CUC) in VMWare. In every instance, the subscriber fails to sync with the "hardware" clock of the Publisher (subscriber is trying to get NTP from Publisher). I also have run into issues in VMWare where the Publisher would not sync with an IOS NTP device (of any stratum) but would sync with an AD DC. That is not constant but the inability to cluster is. I have found conflicting reports on the problem. Some say it works in VMWare Workstation due to an upgraded driver but I have VMWare Workstation as well and that doesn't work. Others say it works in VMWare Server 2.x but I run the latest version (and have run everything from 1.4 and up) and it does not work. Some say it works in ESX or ESXi but I cannot confirm that - I would assume that it would work in an enterprise product since Cisco is moving forward with the UCS line to support virtualization. I hate to be the harbinger of bad news but I think this is par for the course. I'm not sure what version of VMWare you are running but I wish you luck with it. If you find something on how to make it work (regardless of version), let me know...and I'll do the same. Trust me, I've spent a lot of time on this topic.
After reading your post, it seems like you and I have been working on the exact same lab. I am in class with a CCIE Voice professional, and he told me that we need to perform a backup of the pub and then run a clean install on the pub. After this is done, restore the server settings from the SFTP backup. He also told me that this is a well-known issue with VMware. I really hope this helps. This guy has his CCIE Voice, Security, and R&S, so I really trust him. Let me know if this fixes everything for you. Oh yeah, I am running VMware Server 1.0.6.
We're definitely not alone. I ran across a number of "fixes" but have yet to test any mainly because my home lab is more than sufficient running standalone applications in VMWare. From a process perspective, the backup and reinstall from backup doesn't on the surface address what I perceive the issue to be but it's worth a shot. Let me know if it works for you.
I am trying to install CUCM 7.x (namely, 7.1(3b)SU1) in VMWare ESXi 4.0 U1 (modified for Dell) on Dell PowerEdge D410 (Xeon 5620 with 12 GB of RAM). When I try to pair the Subscriber with the Publisher by specifying the Subscriber as Second Node and supplying the Publisher's IP address, host name, and the security password for the cluster, I get the following message:
Configuration validation with
Could not send/receive UDP packets to publisher on port 8500
- Is network connection to
- Is the MTU size correct for this network?
- Does the network allow packet fragments?
If I retry, the installation continues, but eventually it hangs at 100% of the installation script. It just sits there, and if I try to manually reset the VM, when the Subscriber loads up the OS again, it reports that the installation did not succeed and quits. I have done this twice already, and I got the same problem. I have installed CUCM clusters before in the lab and in production. I have also run a Pub and a Sub in VMWare Fusion on my Mac. However, for the life of me, I cannot figure out why this error occurs in ESXi 4.0 U1.
Strangely enough, I am not having any problems with the Subscriber not getting the NTP clock from the Publisher. The Publisher gets its clock from the Catalyst 3560 to which it is connected.
I did create the Subscriber in System -> Server prior to trying to pair the Subscriber with the Publisher.
For me I've found the "Could not send/receive UDP packets to publisher on port 8500" error usually means there is more than one switch in between the Pub and the Sub (common for layer 2 DR configurations). The act of migration from one real/logical server/MAC address to another on another port means you need to clear out the affected Layer 2 tables in all the switches.
Specifically, go to each switch in between Pub and Sub and run the following commands:
# clear arp
# clear mac address-table dynamic vlan N
(where N is the VLAN the Pub/Sub are on).
Here are a few tips to make sure that your Subscriber can synchronize its clock to the Publisher via NTP. If during the addition of the Subsequent node (aka Subscriber) the NTP synchronization does not succeed, the Subscriber's installation will quit with an error (which happens almost all the way towards the end of the installation, so every time you will have wasted plenty of time before this happens). Additionally, if you try to power up the Subscriber after this error occurs, you will get a message that the installation did not succeed, and that you need to start over. Thank you Cisco!
Now, this is what you need to do to avoid this:
1. Your Publisher must meet one of the two conditions:
a. If your Publisher is configured to synchronize its clock via NTP with an NTP server, the clock must be successfully synchronized. If the Publisher's clock was successfully synchronized with the NTP server at one point, but the Publisher is no longer getting its clock from the NTP server at the time when you are trying to add a Subsequent node to the cluster, the Subscriber's installation will error out. To make sure that your Publisher is getting its clock from the NTP server, use the following CLI command:
utils ntp status
The output of this command will clearly say whether the Publisher is synchronized to an NTP server and at what stratum.
b. If you do not have an active NTP server on the network (you are on an isolated network in the lab, or this is a VMware environment), do not set your Publisher to synchronize to an NTP server. Instead, set it to use its internal clock. You can do this by logging in to Cisco Unified Operating System Administration and navigating to Settings -> NTP Servers. There, delete all NTP servers; then navigate to Settings -> Time and set the internal clock to the current time.
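
For condition (a), if you capture the output of utils ntp status into a text file, a small script can check it for you. This is only a rough sketch: it assumes the output contains an ntpstat-style summary line such as "synchronised to NTP server (10.0.0.1) at stratum 3", which may not match every CUCM version, and the function name parse_ntp_status is made up for illustration.

```python
import re

# ASSUMPTION: the status output contains an ntpstat-style line like
#   "synchronised to NTP server (10.0.0.1) at stratum 3"
# Adjust the pattern if your CUCM version prints something different.
_SYNC_LINE = re.compile(
    r"(?<!un)synchroni[sz]ed to NTP server "
    r"\((?P<server>[^)]+)\) at stratum (?P<stratum>\d+)"
)


def parse_ntp_status(output):
    """Report whether the captured status text shows an NTP-synchronized clock."""
    match = _SYNC_LINE.search(output)
    if match is None:
        # No matching line: treat the Publisher as not synchronized.
        return {"synchronized": False, "server": None, "stratum": None}
    return {
        "synchronized": True,
        "server": match.group("server"),
        "stratum": int(match.group("stratum")),
    }
```

If the result says the Publisher is not synchronized, fix that (or switch to the internal clock as in (b)) before starting the Subscriber installation.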
If you make sure that either of the aforementioned conditions - (a) or (b) - is met, you will have no problems adding a Subsequent node to the cluster, as long as you are working with physical hardware. However, if you are running one or both servers - Publisher and Subscriber - in VMware, there is something else you must do.
By default, VMware configures a virtual machine to synchronize its clock with the clock of the hardware server on which the hypervisor is running. This is true for VMware Fusion and VMware ESXi. I am not sure about VMware Workstation and VMware Server, but I would bet some serious money that it is the case with those two products as well. The strange thing is that whereas in VMware Fusion the VM's clock is always configured to synchronize with the Mac on which it is running, in VMware ESXi the clock synchronization with the physical server is hit or miss. I cannot figure out when ESXi decides to enable this feature and when to disable it.
So, to be on the safe side, open the .vmx file for your VM and make sure that the following line appears somewhere in that file:
tools.syncTime = "false"
Alternatively, the syntax could be:
tools.syncTime = "FALSE"
or perhaps even:
tools.syncTime = "0"
I have seen all three versions, with the lower case in quotation marks used by ESXi and the upper case used by Fusion. I believe that all VMware versions understand any variation in the syntax of this command, though.
If your .vmx file has tools.syncTime = "true", tools.syncTime = "TRUE", or even tools.syncTime = "1", you will have to modify it. If your .vmx file does not contain this command at all, you will have to add it as tools.syncTime = "false" or as any variation of the syntax shown above.
Once you have added this command to your .vmx file, save the file and place it back where it belongs - in your VM's folder. You must do this on both servers - the Publisher and the Subscriber. On the Publisher, you will have to power down the VM, make the change in the .vmx file, and then power up the VM again - there is no need to reinstall the Publisher. On the Subscriber, this MUST be done after you have created the VM and BEFORE you start the Configuration Wizard on this Subscriber. If you have installed the Subscriber on the VM but skipped the Configuration Wizard, you can power down the Subscriber, make this change in its .vmx file, save the .vmx file, power up the Subscriber, and then run the Configuration Wizard. You should have no problems adding the Subscriber to the cluster at this point.
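
If you have several VMs to fix, the .vmx edit described above can be scripted. Here is a minimal sketch, assuming the .vmx file is plain text with one key = "value" setting per line; the function name disable_tools_sync is hypothetical, and it only transforms text that you have already read from the file.

```python
import re

# Matches an existing tools.syncTime line, whatever its current value.
# [ \t]* (not \s*) so that blank lines before the setting are left alone.
_SYNC_RE = re.compile(r"^[ \t]*tools\.syncTime[ \t]*=.*$", re.MULTILINE)

NEW_LINE = 'tools.syncTime = "false"'


def disable_tools_sync(vmx_text):
    """Return the .vmx text with tools.syncTime forced to "false"."""
    if _SYNC_RE.search(vmx_text):
        # The setting exists ("true", "TRUE", "1", ...): rewrite it in place.
        return _SYNC_RE.sub(NEW_LINE, vmx_text)
    # The setting is missing: append it, adding a newline first if needed.
    separator = "" if (not vmx_text or vmx_text.endswith("\n")) else "\n"
    return vmx_text + separator + NEW_LINE + "\n"
```

To use it, power down the VM, read the .vmx file into a string, pass it through disable_tools_sync, and write the result back (keep a copy of the original file first). Then power the VM up again.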
By the way, to follow up on my post of last night, I was able to add my Subscriber to the cluster with both the Publisher and the Subscriber running in VMWare ESXi 4.0U1. I would still like to know why I am getting that error every time I try to add a Subscriber, but it seems that when I retry the same step after I get the error (without having to restart the installation), the installation succeeds (sometimes).
Thanks for such an informative post.
I used an NTP source when installing the First Node in VMware Server 2.0, with the host server configured as the NTP source. The CUCM cluster is on an isolated network. The reason I used VMware Server for the test is that it allows me to change the MAC address and get licenses uploaded.
I can install the First Node without any issues.
I did the patch upgrade and license upload on the first node.
I did a DRS restore using production backup files.
I installed the first subscriber, pointing it to the first node.
When installing the second subscriber, it passed all configuration steps and failed at post-installation. It reported a critical error as follows:
The installation has encountered a unrecoverable internal error.
Script "/usr/local/cm/script/18.104.22.1680-16/cm-dbl-install install PostInstall ........"
The system will now halt.
Is there any kind of limit for CUCM in VMWare Server? How many nodes can it support in a cluster?
Try version 22.214.171.12400-11; it works for me on VMware Workstation on Linux. I had the PUB point to a router acting as NTP master and then installed the SUB.
Sergeyugcrop, you are absolutely right with your solutions. I was stuck at the same place with the problems described above: during the CUCM subscriber installation, at the validation step, I had an issue with NTP server accessibility. I checked NTP synchronization on the Pub and restarted the subscriber installation on VMware. The installation finished without any issues. Thanks a lot. Voting 5 for your answer!
I agree with Vikas Srivastava. I was using CUCM 7.15 on Workstation 7.12 with the Windows Time Service acting as an NTP Server. I could not get the PUB's clock to synchronize with the NTP server at all (even though the PUB installed ok and said the NTP server was accessible) despite also trying the instructions suggested by sergyugcorp. As a result I was getting continuous unrecoverable errors when trying to add a subscriber.
I moved to using CUCM 7.13b and used an IOS NTP server with stratum 2. The PUB installed normally (like above) but this time it had also sync'd correctly with the NTP server. The subsequent installation of the subscriber went smoothly with no errors.
I was looking through the forums as I'm trying to find a fix for a problem I'm having building a v9.1 subscriber on ESXi/VMWare, getting the following error:
Interestingly I found all your comments about the NTP source problems, which I have found a fix for as it was also something that cropped up before this 'new issue'!
We are currently using a Windows 2008 server for the NTP source, using Windows Time Service (this is enabled via a reg key in W2008). There was no way I could get the Pub to sync with it to start with, and I was at the point where I was losing the will to live when I found a forum entry on the web which advised the following:
Use Windows server as NTP source
Depending on your Windows version, there are some registry settings you need to set:
Changing the ‘Enabled’ flag to the value 1 enables the NTP Server.
Change the server type to NTP by specifying ‘NTP’ in the ‘Type’ registry entry.
Set the ‘Announce Flags’ registry entry to 5, to indicate a reliable time source.
Set 'LocalClockDispersion' to 0.
The last one is the most important.
After changing the registry, you need to restart the "Windows Time" service.
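
The registry steps above can be turned into commands to paste into an elevated command prompt on the Windows server. The sketch below only prints the commands; it does not touch the registry. The quoted advice does not say where the entries live, so the key paths here are an assumption (the usual W32Time locations under HKLM\SYSTEM\CurrentControlSet\Services\W32Time, with 'Announce Flags' written as AnnounceFlags). Check them against your Windows version before running anything.

```python
# ASSUMPTION: standard W32Time registry locations; verify for your Windows version.
W32TIME = r"HKLM\SYSTEM\CurrentControlSet\Services\W32Time"

# (key path, value name, value type, data), in the order listed in the post.
SETTINGS = [
    (W32TIME + r"\TimeProviders\NtpServer", "Enabled", "REG_DWORD", "1"),
    (W32TIME + r"\Parameters", "Type", "REG_SZ", "NTP"),
    (W32TIME + r"\Config", "AnnounceFlags", "REG_DWORD", "5"),
    (W32TIME + r"\Config", "LocalClockDispersion", "REG_DWORD", "0"),
]


def build_commands():
    """Return the reg.exe commands plus a restart of the Windows Time service."""
    commands = [
        f'reg add "{key}" /v {name} /t {value_type} /d {data} /f'
        for key, name, value_type, data in SETTINGS
    ]
    # Restart the "Windows Time" service so the new settings take effect.
    commands += ["net stop w32time", "net start w32time"]
    return commands


for command in build_commands():
    print(command)
```

Printing the commands rather than applying them lets you review each one first, which seems safer on a production domain controller.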
After changing these settings on the server, boom, the pub sync'd up fine! Annoying huh!
It doesn't help me now as I'm stuck with this new issue but thought you guys might be interested in future..
Tim, I'm having the exact same problem with my subscriber. I can ping it but am still getting the same error. I will keep an eye on this post and will update if I come across a fix.
I got to the bottom of our problem in the end.
We had to rebuild the virtual network in vSphere, as I'd investigated all other avenues and had run out of ideas. Interestingly, it worked. I basically wasted 2 days trying to fix a problem with the CUCM that wasn't there... anyways.
When you do figure it out, you'll see the validation test completes really quickly, usually it takes no longer than a few minutes.
Have you tried pinging the default gateway of your virtual network? It's probably the first test to see whether the virtual NICs/switches are working.
What hardware are you building this cluster on (B/C series blade, ESXi version)?
Hope you get it fixed...I came across another interesting bug when installing the dial plans you might want to look out for on this version too. There's a post on the forums with a fix if you need more info.
I wish that had worked for me. I think I tried that a dozen times.
I'm working with an engineer now and looking through the logs.
It seems my subscriber is in fact communicating with my publisher on some levels, but my publisher seems to be sending back a restart message.
Hopefully I'll hear something back soon and will let you know what I find.