DiRT Recovery Restore Locking up

Sep 2nd, 2008

I have a TAC case open about this but I figured I could double my chances of finding a resolution by bringing it up here.

The issue I am having is that DiRT Restore locks up and makes no more progress once it begins the directory synchronization step. A quick look in task manager shows DiRT as *Not Responding*. I have left it running as is without touching the server over the weekend, thinking something might be happening in the background, but upon arriving in the morning at the beginning of the week DiRT restore is in the same condition it was in when I left. Also a check of the DiRT Recovery Restore log file shows that it just stops, and records no errors.

A little background info for what we're trying to do:

We have a total of 8 Unity 4.0(3) servers in production, 4 primary and 4 secondary. Each pair of Unity servers has an off-box message store (Exchange 2000). Everything runs on Windows 2000. Currently the 4 pairs of servers are located in 3 sites, with 2 of the pairs in the larger site. The 3 sites are each on their own voicemail-only domain.

We are looking to upgrade to 8 Unity 5.0(1) servers, 4 primary and 4 secondary, with off-box Exchange 2003 message stores. Everything will run on Windows 2003. Same setup as before, just newer versions of everything running on new hardware. One major difference will be that all 3 sites will share the same voicemail-only domain.

To achieve this upgrade I have built 8 brand new Unity 5.0(1) servers, and their associated Exchange 2003 message stores, on Windows 2003 in our lab. I have also built 4 temporary Unity 4.0(3) servers running on Windows 2000 (because it won't let me install Unity 4.0(3) on Windows 2003).

The overall objective is to DiRT backup each of the 4 live systems including messages, then DiRT restore the config and messages to the temporary Unity 4.0(3) servers. Once the restore has run and any errors have been accounted for, we will upgrade the box to Unity 5.0(1). When the upgrade is complete we will run a new DiRT backup without messages (they will already have been restored to the proper message store) and DiRT restore this to the freshly built Unity 5.0(1) server running Windows 2003. Once that's done we'll implement failover between each pair of servers.

Now, we have been able to do this successfully for both of the smaller sites that only have a single pair of Unity servers. We restored them to the temporary server, upgraded Unity on that server, then backed up and restored to the permanent Unity 5.0(1) server. No major errors, everything works great. For reference, they have between 400 and 700 subscriber accounts.

Where we run into the problem is when trying to DiRT restore either of the 2 larger systems, each of which has around 3,000 subscriber accounts. I'll include the DbWalker logs from each existing server, as well as the DiRT backup logs, and the restore logs that just freeze, for anyone that wants to take a look at them.

So I guess my question is, has anyone run into DiRT Restore just locking up like that? If so were you able to discover the cause and work around it?

Also, from reading the help file my understanding is that COBRAS can only be used to back up and restore Unity 4.0(5) and up. Am I incorrect? Can I try using it instead of DiRT to back up my Unity 4.0(3) systems?
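
A minimal Python sketch of that version check, assuming the 4.0(5) minimum quoted above is accurate (it is not confirmed anywhere in this thread). The names `parse_unity_version`, `meets_minimum`, and `COBRAS_MIN_VERSION` are hypothetical and only there for illustration; this is not part of COBRAS or DiRT.

```python
import re

# Assumption: minimum version taken from the poster's reading of the help file.
COBRAS_MIN_VERSION = (4, 0, 5)


def parse_unity_version(text):
    """Turn a string such as '4.0(3)' into a comparable tuple (4, 0, 3)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\((\d+)\)", text.strip())
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def meets_minimum(version_text, minimum=COBRAS_MIN_VERSION):
    """Return True if the version is at or above the assumed minimum."""
    return parse_unity_version(version_text) >= minimum
```

Under that assumption, `meets_minimum("4.0(3)")` is false, which matches the concern in the question: a 4.0(3) system would fall below the stated cutoff.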

ranpierce Tue, 09/02/2008 - 09:28

I would not use DiRT at all anyway.

I would build 5.0(1) on the new servers and then use GSM to move your subscribers to the new hardware.

That's what I did here and it worked great.

If you are interested then shoot me an email.

Oh yeah, COBRAS is not supported by TAC yet, so I would not advise using it yet, although I have many friends who are using it.

Randy

dandrools Tue, 09/02/2008 - 09:35

Well, we're completely changing the domain that Unity is on. The domain name is changing, we're merging 3 Unity domains into 1. The new Unity servers are in a lab environment until the upgrade proves successful, they do not have connectivity to the existing system in any way shape or form.

I'm not exactly an expert at getting different domains talking to one another, not to mention I have no idea if GSM can work in that kind of environment.

Also, I'm not interested in being able to move only the subscribers. There are tons of subscriber templates, classes of service, call handlers, all kinds of stuff no one is interested in having to reconfigure by hand.

COBRAS not being supported by TAC is not a major issue to me. This entire process is being done in a lab. If something doesn't work we just stick with the live system and try again.

I really just wanna know why DiRT is locking up, it's never done that to me in the past.

Oh btw, for anyone who may ask, the latest version of DiRT backup is installed and was used on the live servers (the backup was run from the secondary), and the latest version of DiRT restore is being used on the lab servers.

ranpierce Tue, 09/02/2008 - 09:42

I remember reading that it is recommended that DiRT be run from the primary. I have failover too and am running DiRT every night.

I will find it for you.

Maybe that is the issue.

Randy

dandrools Tue, 09/02/2008 - 09:46

What I've read says to do it the other way around. Straight from the DiRT Help file:

"When restoring a backup intended to work as a failover system, be sure to restore the DiRT backup to a stand alone server first, make sure it's running properly then run the failover configuration wizard on that server. You cannot restore a DiRT backup onto either a primary or secondary failover pair, it will not work properly.

If you have two Unity servers configured for fail over it's best to backup the Unity server configured as the fail over server, not the primary if possible. The primary Unity server will have additional links in SQL setup for replicating to the fail over server database and these links will be broken when you restore to a new server. If you then try to reconfigure fail over capabilities on that server it will fail. There's no way to automatically remove these flags from the backed up database so if you have backed up the primary Unity server you will need to manually clean up the SQL properties after restore."

ranpierce Tue, 09/02/2008 - 09:48

Working with systems configured for failover

When restoring a backup intended to work as a failover system, be sure to restore the DiRT backup to a stand alone server first, make sure it's running properly then run the failover configuration wizard on that server. You cannot restore a DiRT backup onto either a primary or secondary failover pair, it will not work properly.

If you have two Unity servers configured for fail over it's best to backup the Unity server configured as the fail over server, not the primary if possible. The primary Unity server will have additional links in SQL setup for replicating to the fail over server database and these links will be broken when you restore to a new server. If you then try to reconfigure fail over capabilities on that server it will fail. There's no way to automatically remove these flags from the backed up database so if you have backed up the primary Unity server you will need to manually clean up the SQL properties after restore. The DiRT restore notices the fact that the restored database has these links and will pop up a warning dialog at the end of the restore reminding you of these (probably why you're here reading this now). The steps to clean up the SQL replication links are as follows:

Open SQL Enterprise Manager.

Expand Microsoft SQL Servers -> SQL Server Group -> "Server Name"

Right-click on the Replication folder and select Configure Publishing, Subscribers, and Distribution...

A wizard will pop up.

Click Next>.

Click Next>.

Click Next>.

Click Yes.

Click Next>.

Click Finish.

Click OK.

Click Close.

Right click on the Replication folder and select Disable Publishing…

A wizard will pop up.

Click Next>.

Choose the option labeled Yes, disable publishing on "Server Name".

Click Next>.

Click Next>.

Click Finish.

An error will pop up.

Click OK to close the error.

Click Cancel.

At this point it should now be possible to reconfigure the restored Unity server for fail over again.

dandrools Tue, 09/02/2008 - 09:28

And here are the other 3. Notice how the restore logs simply stop adding entries. This is how the log file looked when DiRT restore first appeared to have frozen, and it looked exactly the same when I came back several days later after letting the server just sit.
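
For anyone who wants to confirm a stall like this without watching the server, here is a minimal Python sketch that checks how long ago a log file was last written. The function name `log_is_stalled` and the one-hour default are hypothetical choices, and the path to the DiRT restore log is something you would supply yourself; nothing here comes from DiRT.

```python
import os
import time


def log_is_stalled(path, max_idle_seconds=3600, now=None):
    """Return True if the file at `path` was last modified more than
    `max_idle_seconds` ago, measured against `now` (defaults to the
    current time)."""
    if now is None:
        now = time.time()
    last_modified = os.path.getmtime(path)
    return (now - last_modified) > max_idle_seconds
```

A long idle time on its own does not prove a hang, since a slow step may simply not log anything, but combined with a *Not Responding* status in task manager it is a reasonable signal.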

dandrools Wed, 09/03/2008 - 16:14

Just to update for anyone that may come around with this same problem and stumble upon my thread, it appears that TAC and I have discovered the issue.

This Bug ID:

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCee01448

describes exactly the problem I was having. It led us to install ES 72 (which, by the way, has Unity 4.0(3) SR1 as a prerequisite), and at this point that appears to have fixed the problem.

I will update this thread later when I know for sure, but at this point my DiRT Restore is proceeding normally through the directory synchronization step. Considering there are 6556 directory objects to synchronize it is going to take a while =)

Thanks for your input, ranpierce. Whether or not it leads to the fix, an outside viewpoint is always appreciated.
