LMS 2.6 restore sanity check.

May 30th, 2007

Just need a sanity check here:


Backed up the LMS data and used tar to put it all in one file to FTP to a new system.

The tar file is 24 GB!


On the new system, I extracted the tar file and used the restore script to restore the LMS data.


Started the process on Friday May 25.

It is still running!!!


The only thing I can figure out is that I have about 400 devices with config history dating back to 2005. This amounts to nearly 400 config versions.


It looks like the Solaris rm -fr command takes "forever" when the script deletes the /opt/CSCOpx/tempBackupData directory.


My old system was a Sun V210; the new one is a four-CPU V880 with 8 GB of RAM.


But it looks like the system is just traversing the directory structure, deleting the temp files.



Will it ever end?


Thanks

Joe Clarke Wed, 05/30/2007 - 10:45
User Badges:
  • Cisco Employee,
  • Hall of Fame,

    Founding Member

The short answer is "yes." The long answer involves making sure you are on LMS 2.6, which has some extra fixes for restore speed, and then waiting. The amount of data archived in the backup determines how long the restore runs. It is not uncommon for large backups to take hours to restore.

DAVID SCHULTZ Wed, 05/30/2007 - 10:52

This is from LMS 2.6 to LMS 2.6, just on a new system.


I can figure hours!


But days!!! Wow!


Thanks for the reply.

Joe Clarke Wed, 05/30/2007 - 10:56

Days is odd for LMS 2.6. It sounds like I/O might be your bottleneck, though. Please provide the output of ps -efl, along with migration.log and restorebackup.log.

Joe Clarke Wed, 05/30/2007 - 12:07

There is one zombied child process of the restore, and the rm process is sleeping. But it's not stopped, and it does appear to have accumulated a large amount of CPU time. If you run du -sh on the temp backup directory, is its size diminishing? If so, you will have to wait out the rm before the restore can finish.


If not, and the rm process is not accumulating any more CPU time, try sending it a kill -15 (SIGTERM) to see if that kicks the restore into proceeding.
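
To check whether the directory is actually shrinking, the du comparison can be scripted. This is only a rough sketch, assuming a POSIX shell such as ksh or bash; the function names (dir_kb, compare_kb, watch_dir) and the 60-second interval are made up for illustration and are not part of LMS.

```shell
#!/bin/sh
# Sketch only: helper names and the sampling interval are assumptions.

# dir_kb DIR prints the disk usage of DIR in kilobytes.
dir_kb() {
    du -sk "$1" | awk '{print $1}'
}

# compare_kb BEFORE AFTER prints "shrinking", "growing", or "unchanged".
compare_kb() {
    if [ "$2" -lt "$1" ]; then
        echo "shrinking"
    elif [ "$2" -gt "$1" ]; then
        echo "growing"
    else
        echo "unchanged"
    fi
}

# watch_dir DIR SECONDS samples the size twice, SECONDS apart.
watch_dir() {
    before=$(dir_kb "$1")
    sleep "$2"
    after=$(dir_kb "$1")
    echo "$before KB -> $after KB: $(compare_kb "$before" "$after")"
}

# Example (directory taken from this thread):
# watch_dir /opt/CSCOpx/tempBackupData 60
```

If this reports "shrinking", the rm is still making progress. On a tree this large the du itself can take a while, so a longer interval may give a clearer picture.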

DAVID SCHULTZ Wed, 05/30/2007 - 13:13

The size of the directory is diminishing...

I am going to wait this out... I suppose I need to reduce some of the archives and databases somehow.


Thanks.

Joe Clarke Wed, 05/30/2007 - 13:49

You can configure ChangeAudit and Config Archive purging under RME > Admin. That will greatly reduce the size of data backed up.

DAVID SCHULTZ Thu, 05/31/2007 - 03:37

Thanks for the advice.


The restore completed last night.


I restarted dmgtd, but now I get this error when I access CW:


Forbidden

You don't have permission to access /cwhp/LiaisonServlet on this server.


Additionally, a 403 Forbidden error was encountered while trying to use an ErrorDocument to handle the request.



Thoughts?

Joe Clarke Thu, 05/31/2007 - 06:55

Usually we see this error on Windows, but a misconfigured Apache can cause it on Solaris as well. Please provide the /opt/CSCOpx/MDC/Apache/logs/error_log, /opt/CSCOpx/MDC/tomcat/logs/stdout.log, /opt/CSCOpx/MDC/etc/regdaemon.xml, and /opt/CSCOpx/objects/dmgt/dmgtd.conf.
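
If it helps, the four files can be bundled into one archive for attaching. A minimal sketch, assuming a POSIX shell such as ksh or bash; bundle_files is a hypothetical helper and the /tmp/lms_diag.tar output name is just an example.

```shell
#!/bin/sh
# Sketch only: bundle_files and the archive name are assumptions.

# bundle_files ARCHIVE FILE... tars every readable FILE into ARCHIVE
# and reports unreadable or missing ones on stderr.
bundle_files() {
    archive=$1
    shift
    found=""
    for f in "$@"; do
        if [ -r "$f" ]; then
            found="$found $f"
        else
            echo "missing or unreadable: $f" >&2
        fi
    done
    # Nothing to archive: signal failure instead of creating an empty tar.
    [ -n "$found" ] || return 1
    # Word splitting on $found is intentional; these paths contain no spaces.
    tar cf "$archive" $found
}

# Example (file list taken from this post):
# bundle_files /tmp/lms_diag.tar \
#     /opt/CSCOpx/MDC/Apache/logs/error_log \
#     /opt/CSCOpx/MDC/tomcat/logs/stdout.log \
#     /opt/CSCOpx/MDC/etc/regdaemon.xml \
#     /opt/CSCOpx/objects/dmgt/dmgtd.conf
```

Any file reported as missing is worth mentioning when you post, since that is a clue in itself.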

Joe Clarke Thu, 05/31/2007 - 09:22

The hostname is incorrectly configured on this box. That's your first problem. I see both NSG-OV1 and NSG-CW as potential hostnames. Additionally, it looks like you've opened a TAC service request for this, and I see NSG-CW as one of the hostnames being used. So you need to follow the 10-step procedure at http://www.cisco.com/univercd/cc/td/doc/product/rtrmgmt/cw2000/cw2000_d/cs303/usrguide/diagnos.htm#wp1078582 . After completing the steps there, edit /opt/CSCOpx/objects/dmgt/dmgtd.conf, and make sure the one instance of NSG-CW in that file reflects the correct hostname for the server.


If you still have problems after all that, get the same files once more.
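
A quick way to see which name a file actually mentions is to count matching lines. This is just a sketch, assuming a POSIX shell and a POSIX grep (for -F); check_conf_hostname is a hypothetical helper, and it makes no assumption about the layout of dmgtd.conf beyond it being plain text.

```shell
#!/bin/sh
# Sketch only: check_conf_hostname is an illustrative helper, not an LMS tool.

# check_conf_hostname FILE [NAME] reports whether NAME appears in FILE.
# NAME defaults to the output of the hostname command.
check_conf_hostname() {
    file=$1
    name=${2:-$(hostname)}
    # grep -c prints 0 and exits non-zero when nothing matches;
    # fall back to 0 if the file cannot be read at all.
    n=$(grep -F -i -c -e "$name" "$file" 2>/dev/null) || n=0
    if [ "$n" -gt 0 ]; then
        echo "$name: found on $n line(s) of $file"
    else
        echo "$name: not found in $file"
        return 1
    fi
}

# Examples (names and path taken from this thread):
# check_conf_hostname /opt/CSCOpx/objects/dmgt/dmgtd.conf
# check_conf_hostname /opt/CSCOpx/objects/dmgt/dmgtd.conf NSG-CW
```

Running it once with no name (so it uses the system hostname) and once with the old name shows at a glance whether the file still points at the wrong one.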
