As we all know in the IT world, everything needs to be rebooted at times, and CUCM is no exception. What is a good guideline for rebooting CUCM? Does Cisco have a specific guideline? I looked through the SRNDs for 7.x and 8.x and did not find any recommendations for maintenance. Any reference to a document would be greatly appreciated too!
There is no official recommendation or guideline for this, so there is no document.
Linux has made it more stable than Windows. Still, some bugs or issues, like memory leaks from Tomcat or others, have required, if not a server restart, at least a service restart every week or so. I've seen servers which have been running for over 2 years straight without a reboot.
Unless you're having an issue there's no real need for a reboot (IMHO it doesn't hurt either, I do that every once in a while with my lab servers). I usually reboot them every 2 months just to keep everything running smoothly even if not affected by any bug that requires it.
I'd also be interested in hearing if anyone else has a reboot as part of their maintenance plan and how often they reboot their servers.
We reboot yearly. We'd had a few problems where rebooting the subscriber was the fix and the subscriber had been up for almost two years. So to avoid the problem, we chose to reboot yearly.
One of our clients has 2 CUCM clusters (11 servers in each) running version 8.6.2 on MCS. These servers support about 6000 phones each in a UCCE environment.
They do a scheduled reboot every month.
My first impression was, wow. I have never seen a 1-month reboot period. I know it's a big cluster supporting UCCE. However, based on my experience, Cisco's recommendation and the feedback on this thread, we are trying to convince the client to move to a reboot every 2 months.
Sometimes frequent reboots cause issues as well :).
There are some defects on the UCCE side which affect outbound dialer port activation on the PG servers. The ports go down and register back when a CUCM node reboots, but they do not activate on the PG. A manual association of all the ports is required from the CUCM application user side (JTAPI). This creates a dependency on the IPT and CCSO teams. I had to work with TAC for 4 months to find that it's a defect on the UCCE side. There is a difference of lowercase and uppercase letters for dialer ports which comes into play during a reboot (CUCM and PG). So we were breaking a smooth-running system every month and engaging TAC for this.
I think even for the busiest CUCM cluster, a reboot every month is a bit too frequent.
I would appreciate it if you could share your experiences and offer some guidance based on the load on the CUCM node/cluster and its size.
I have never seen anybody rebooting their CUCM servers every month (that does not mean nobody does it, but it is definitely not common at all). I don't see a need to reboot every month; however, it shouldn't cause any issues.
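For anyone chasing the lowercase/uppercase difference on dialer ports mentioned above, here is a minimal sketch of how one might spot it, assuming the difference is in the port names and that you can export the names from both CUCM and the PG as plain lists. The function name and the sample port names are hypothetical, made up only for illustration.

```python
def case_only_mismatches(cucm_names, pg_names):
    """Return (cucm_name, pg_name) pairs that differ only by letter case."""
    # Index the PG names by their case-folded form for a case-insensitive lookup.
    pg_by_folded = {name.casefold(): name for name in pg_names}
    mismatches = []
    for name in cucm_names:
        other = pg_by_folded.get(name.casefold())
        # Same name ignoring case, but not an exact match: report it.
        if other is not None and other != name:
            mismatches.append((name, other))
    return mismatches


# Hypothetical sample data, not taken from a real system.
cucm_ports = ["DIALER01", "Dialer02"]
pg_ports = ["dialer01", "Dialer02"]
mismatched = case_only_mismatches(cucm_ports, pg_ports)
print(mismatched)
```

Running a check like this before and after a maintenance window is only a way to narrow things down; it does not replace working the defect with TAC.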
The general recommendation is to reboot the servers at least every six months, regardless of the volume that the servers handle.
I don't think you need to schedule a reboot unless there is an issue. I also don't think there is any Cisco guideline or recommendation documented anywhere to restart the cluster for no apparent reason after a given amount of time.
I have never done that nor seen it done; we have several large clusters and different customer environments.
I opened a TAC case at one point to ask this same question. We were told there is no official recommendation, but the TAC engineer said his personal recommendation is every 6 months to every year. That was back on the stand-alone servers. I'm guessing it would be the same for a UCS chassis.
As a Cisco TAC CUCM engineer, from experience, we recommend rebooting the CUCM servers at least every six months. Is this in an official document? No. But we see CUCM issues on a daily basis, and weird behaviors can happen if the server has been running for a very long time.
It's like your PC. We all know that you can skip rebooting your PC for a while, and it will work fine for a few weeks (2-3 weeks), but after a while you start noticing that your PC is acting "funny". What do you do then? Yeah, you reboot your PC, and it starts working properly again.
It's the same principle (except Linux tends to be more stable than Windows); pretty much all software needs a reboot from time to time.
I'd recommend going to the CUCM command line interface and running the "utils system restart" command, instead of rebooting the whole UCS server or restarting it from the vSphere Client/vCenter.
We typically reboot only the VMs (using the CLI commands mentioned below). We reboot all of our UC VMs at the same time (e.g. UNITY, CUPS, CER, CCX, etc).
That's a good question though. We have power maintenance in our data center yearly and we shut down our UCS servers during that time. But we're thinking about a different power structure where we would no longer need to do this. I hadn't thought about what the best practice would be for the UCS. I'll look into that.
Are there any precautions needed before rebooting the cluster?
We have 2 subs and 1 pub in a cluster.
Please give me a suggestion for rebooting the cluster.
Basic troubleshooting should be done first, such as probing the symptoms and patterns to isolate the issue to a certain extent. Only then should you get into restarting services or servers.
To reboot the cluster, you reboot the pub first. Use RTMT to verify it is completely back up. Then you reboot any TFTP servers (but if you only have 1 pub and 2 subs, I assume this does not apply to you). Then you reboot the subs. With just 2 subs, I would reboot them one at a time and verify all services are back up before doing the other. Also, check how many phones are registered at the beginning and make sure the same number are registered at the end.
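The order above can be written down as a small planning sketch. This is only an illustration of the sequence (pub, then TFTP servers, then subs one at a time) plus the phone count check; it does not talk to CUCM or RTMT. The `Node` class, the role labels, the function names and the node names are all hypothetical, and the phone counts are numbers you would read off yourself before and after the reboot.

```python
from dataclasses import dataclass

# Reboot priority by role: lower number goes first (assumed labels).
ROLE_ORDER = {"publisher": 0, "tftp": 1, "subscriber": 2}


@dataclass(frozen=True)
class Node:
    name: str
    role: str  # one of the keys in ROLE_ORDER


def reboot_order(nodes):
    """Return nodes sorted as publisher, TFTP servers, then subscribers."""
    unknown = [n.name for n in nodes if n.role not in ROLE_ORDER]
    if unknown:
        raise ValueError(f"unknown role for: {unknown}")
    # sorted() is stable, so nodes with the same role keep their listed order.
    return sorted(nodes, key=lambda n: ROLE_ORDER[n.role])


def registration_restored(before, after):
    """True if the registered phone count after the reboot matches the count before."""
    return after == before


# Hypothetical 1 pub + 2 subs cluster.
cluster = [
    Node("sub1", "subscriber"),
    Node("pub", "publisher"),
    Node("sub2", "subscriber"),
]
plan = [n.name for n in reboot_order(cluster)]
print(plan)  # ['pub', 'sub1', 'sub2']
```

Each node in the plan would be rebooted on its own and checked for all services being back up before moving to the next one.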