backup problems in LMS3.0.1

Feb 28th, 2008

hi,

since upgrading to LMS3.0.1, backups don't seem to be working properly.

despite setting 5 generations under Common Services > Server > Admin > Backup, only a single backup is being kept, and it is overwritten each time the backup schedule runs.

additionally, the error message below appears after each backup attempt.

Starting database engine dfmFhEng

ERROR(951): Fatal error: Database engine 'dfmFhEng' could not be started on database 'dfmFhDb' in Bulk mode..

Backup failed: 2008/02/26 00:12:55

have attached the dbbackup.log.

anyone seen this before?
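For anyone triaging a similar log, here is a minimal sketch that pulls the error codes and failure timestamps out of dbbackup.log-style text. It assumes only the two line shapes quoted above (`ERROR(nnn): ...` and `Backup failed: <timestamp>`); the real log may contain other formats that this does not handle, and the function name is purely illustrative.

```python
import re

# Matches lines such as: ERROR(951): Fatal error: ...
ERROR_RE = re.compile(r"^ERROR\((\d+)\):\s*(.*)$")
# Matches lines such as: Backup failed: 2008/02/26 00:12:55
FAILED_RE = re.compile(
    r"^Backup failed:\s*(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})$"
)


def summarize_backup_log(text):
    """Return (errors, failures) found in dbbackup.log-style text.

    errors   -> list of (code, message) tuples
    failures -> list of timestamp strings
    """
    errors = []
    failures = []
    for raw in text.splitlines():
        line = raw.strip()
        m = ERROR_RE.match(line)
        if m:
            errors.append((int(m.group(1)), m.group(2)))
            continue
        m = FAILED_RE.match(line)
        if m:
            failures.append(m.group(1))
    return errors, failures
```

Calling `summarize_backup_log(open("dbbackup.log").read())` would return the two lists, which makes it easier to see whether every scheduled run fails with the same code.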

Joe Clarke Thu, 02/28/2008 - 09:51

This indicates the backup is not running properly. Shut down dmgtd, and post the list of contents under NMSROOT/databases/dfmFh.

davidharan1 Fri, 02/29/2008 - 07:03

thanks for your response!

do you need the actual file/s? or is the info below useful?

D:\Program Files\CSCOpx\databases\dfmFh>dir

Volume in drive D is CiscoWorks

Volume Serial Number is 68BC-4E57

Directory of D:\Program Files\CSCOpx\databases\dfmFh

29-Feb-08 15:56 .

29-Feb-08 15:56 ..

29-Feb-08 15:56 87,052,288 dfmFh.db

31-Jan-08 15:01 orig

1 File(s) 87,052,288 bytes

3 Dir(s) 7,778,803,712 bytes free

D:\Program Files\CSCOpx\databases\dfmFh>

Joe Clarke Fri, 02/29/2008 - 08:27

One of two things could be wrong here. Either the database is corrupt, or the password is bad. Try changing the password using NMSROOT\bin\perl NMSROOT\bin\dbpasswd.pl dsn=dfmFh first. Then try restarting dmgtd, and see if you are able to complete the backup. If not, try reinitializing this database:

NMSROOT\bin\perl NMSROOT\bin\dbRestoreOrig.pl dsn=dfmFh dmprefix=FH

davidharan1 Tue, 03/04/2008 - 01:02

sorry for the slow reply, just returned to work this morning.

the weekly backup completed ok this morning (backup status = 'successful', the DFM backup folder seems an appropriate size, and the log is attached as dbbackup2.txt), after having failed every week since the upgrade to LMS3.0.1 with the same error code as reported previously.

i didn't get the chance to attempt your two suggestions, however i have made changes since the last unsuccessful backup in relation to other DFM problems I've been having (rediscovered devices remain 'stuck in learning', cannot delete devices, they stay 'pending');

- removed TSM (Tivoli Backup software)

- renamed the copies of msvcm80.dll, msvcp80.dll & msvcr80.dll with a version higher than 8.0.50727.42

- deleted DFM.rps & DFM1.rps from NMSRoot/objects/smarts/local/repos/icf/ as per CSCsi01966 (note these .rps files have not been recreated so far)

- rebooted CiscoWorks server & restarted dmgtd several times

so, it would seem my efforts to resolve the other DFM issues have instead resolved the backup issue, whilst the original issues remain. i didn't mention the other issues as you've covered them in some detail in this forum already and i now have a TAC case open for them (SR 608049289); however, it is currently progressing very slowly.

any other feedback you can provide would be most helpful.

Joe Clarke Tue, 03/04/2008 - 09:56

You have not completed all the necessary steps. When you remove the .rps files, you must also reinitialize your Sybase DFM databases. The command I gave you previously will handle the FH database, but you also need to run the same commands for dsn=dfmEpm dmprefix=EPM and dsn=dfmInv dmprefix=INV.
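The three reinitialization commands can be sketched as a small dry-run helper. The DSN and dmprefix pairs (dfmInv/INV, dfmFh/FH, dfmEpm/EPM) are taken from the commands used in this thread; the function name and the idea of printing rather than executing are illustrative assumptions. The daemon manager should be stopped before any of these are actually run.

```python
import ntpath

# DSN / dmprefix pairs as used in the commands posted in this thread.
DFM_DATABASES = [
    ("dfmInv", "INV"),
    ("dfmFh", "FH"),
    ("dfmEpm", "EPM"),
]


def restore_commands(nmsroot):
    """Build one dbRestoreOrig.pl argument list per DFM database."""
    # ntpath is used so Windows-style paths are built on any platform.
    perl = ntpath.join(nmsroot, "bin", "perl")
    script = ntpath.join(nmsroot, "bin", "dbRestoreOrig.pl")
    return [
        [perl, script, f"dsn={dsn}", f"dmprefix={prefix}"]
        for dsn, prefix in DFM_DATABASES
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in restore_commands(r"d:\Progra~1\CSCOpx"):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to check the install path and prefixes before running anything against the databases.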

Now, if the .rps files are not coming back, you still have a DLL conflict problem. Renaming the DLLs may not be good enough if they are under C:\WINDOWS. It would be better to move them out of the way, or remove them entirely.
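To locate stray copies of the three DLLs, a minimal sketch along these lines could be used. It only finds files by name (case-insensitively) and can skip one directory tree, such as the CiscoWorks install folder; it does not read file versions, which would still need to be checked separately. The function and parameter names are hypothetical.

```python
import os

# The three runtime DLLs named in this thread.
DLL_NAMES = {"msvcm80.dll", "msvcp80.dll", "msvcr80.dll"}


def find_runtime_dlls(search_root, exclude_root=None):
    """Yield paths of the runtime DLLs under search_root.

    Anything at or below exclude_root is skipped.
    """
    skip = None
    if exclude_root:
        skip = os.path.normcase(os.path.abspath(exclude_root))
    for dirpath, dirnames, filenames in os.walk(search_root):
        here = os.path.normcase(os.path.abspath(dirpath))
        if skip and (here == skip or here.startswith(skip + os.sep)):
            dirnames[:] = []  # do not descend into the excluded tree
            continue
        for name in filenames:
            if name.lower() in DLL_NAMES:
                yield os.path.join(dirpath, name)
```

For example, `list(find_runtime_dlls("C:\\", exclude_root="D:\\Program Files\\CSCOpx"))` would list every copy outside the CiscoWorks tree so each one can be inspected before it is moved or removed.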

davidharan1 Thu, 03/06/2008 - 08:45

i've now completed the following steps;

-deleted all copies of msvcm80.dll, msvcp80.dll & msvcr80.dll on the server besides those within the following 4 x CiscoWorks folders. all of these copies are version 8.0.50727.42

D:\Program Files\CSCOpx\lib\vbroker\bin

D:\Program Files\CSCOpx\objects\smarts\bin

D:\Program Files\CSCOpx\objects\smarts\lib

D:\Program Files\CSCOpx\objects\smarts\system

-stopped the daemon manager, i.e. net stop crmdmgtd

-re-initialised the DFM database as per the following commands...

d:\Progra~1\CSCOpx\bin\perl d:\Progra~1\CSCOpx\bin\dbRestoreOrig.pl dsn=dfmInv dmprefix=INV

d:\Progra~1\CSCOpx\bin\perl d:\Progra~1\CSCOpx\bin\dbRestoreOrig.pl dsn=dfmFh dmprefix=FH

d:\Progra~1\CSCOpx\bin\perl d:\Progra~1\CSCOpx\bin\dbRestoreOrig.pl dsn=dfmEpm dmprefix=EPM

-rebooted the server (system reports DFMCTMStartup, UTManager are down...)

-imported 2 x devices into DFM from the DCR

and unfortunately, both devices are stuck in learning, and no new DFM.rps or DFM1.rps files have been created in D:\Program Files\CSCOpx\objects\smarts\local\repos\icf

any ideas?

Martin Ermel Thu, 03/06/2008 - 09:53

check if .NET 2.0 SP1 or .NET 3.0 SP1 is installed on the server; if yes, uninstall these patches, shut down dmgtd (net stop crmdmgtd) and reboot the server; then delete the 2 devices from DFM and re-add them.

davidharan1 Mon, 03/10/2008 - 02:26

thanks for your feedback mermel,

i checked the following registry entries to see if i have .NET 2.0 sp1 installed.. but i don't. i understand the 'sp' value would be '1' if sp1 was installed.

any other ideas? do you know what it is with .NET 2.0 sp1 that causes a problem?

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\NET Framework Setup\NDP\v2.0.50727]

"Install"=dword:00000001

"MSI"=dword:00000001

"SP"=dword:00000000

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\NET Framework Setup\NDP\v2.0.50727\1033]

"Install"=dword:00000001

"MSI"=dword:00000001

"SP"=dword:00000000

Martin Ermel Mon, 03/10/2008 - 08:00

I am not sure how the registry will reflect the installation of the service packs for .NET - but just go to Start > Settings > Control Panel > Add/Remove Programs and have a look if these patches are installed.

These patches have a severe negative effect on DFM, as Joe mentioned here:

http://forum.cisco.com/eforum/servlet/NetProf?page=netprof&forum=Network%20Infrastructure&topic=Network%20Management&CommCmd=MB%3Fcmd%3Dpass_through%26location%3Doutline%40%5E1%40%40.2cbf0e61/8#selected_message

They must be removed if they are present on the CiscoWorks server that has DFM installed.

davidharan1 Mon, 03/10/2008 - 08:35

i only mentioned the registry because i can't see any reference to .NET SP1 in add / remove programs.

all i can see is 'Microsoft .NET Framework 2.0', under which, when i select 'show updates', is 'Security Update for Microsoft .NET Framework 2.0 (KB928365)'... that's it. nothing about SP1.
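For reference, the "SP" values in a registry export like the one posted earlier in this thread can be read with a short script. This is only a sketch: it parses the text of a .reg-style export rather than querying the live registry, it handles DWORD values only, and the reading that "SP" = 1 would indicate SP1 is the understanding stated in that post rather than something this code verifies.

```python
import re

SECTION_RE = re.compile(r"^\[(.+)\]$")
DWORD_RE = re.compile(r'^"([^"]+)"=dword:([0-9a-fA-F]{8})$')


def parse_reg_dwords(text):
    """Map each registry key path to a dict of its DWORD values."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = SECTION_RE.match(line)
        if m:
            current = m.group(1)
            result[current] = {}
            continue
        m = DWORD_RE.match(line)
        if m and current is not None:
            # dword values in .reg exports are hexadecimal
            result[current][m.group(1)] = int(m.group(2), 16)
    return result


def service_pack_levels(text):
    """Return {key path: SP value} for keys that carry an "SP" DWORD."""
    return {
        key: values["SP"]
        for key, values in parse_reg_dwords(text).items()
        if "SP" in values
    }
```

Run against the export quoted above, `service_pack_levels` would report the SP value for each of the two `v2.0.50727` keys.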

Joe Clarke Mon, 03/10/2008 - 09:26

The problem is really the underlying DLLs that .NET 2.0 SP1 and higher install. Regardless of what you think is installed, the root cause of the problem is any version of msvcm80.dll, msvcp80.dll, or msvcr80.dll newer than 8.00.50727.42. If you have any such DLLs on the system, you must remove them along with the software that installed them. The only other workaround is to disable DFM until a fix is available later this month.
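The "newer than 8.00.50727.42" test is a field-by-field numeric comparison, not a string comparison. A minimal sketch, assuming the file version has already been read from the DLL's properties as dotted text (the helper names are illustrative):

```python
def parse_version(text):
    """Turn '8.00.50727.42' into (8, 0, 50727, 42)."""
    return tuple(int(part) for part in text.strip().split("."))


# Baseline version quoted in this thread.
BASELINE = parse_version("8.00.50727.42")


def is_newer_than_baseline(version_text):
    """True if the given file version is newer than 8.00.50727.42."""
    return parse_version(version_text) > BASELINE
```

Parsing each field as an integer means "8.0.50727.42" and "8.00.50727.42" compare as equal, so a copy reporting either form is not flagged, while any higher final build number is.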

davidharan1 Mon, 03/10/2008 - 09:33

as per my comment from Mar 6, 2008, 9:45am PST, I have deleted all of the mentioned .dll's, and removed software (TSM, Tivoli Backup Client & Adobe Reader version 8.0) which seemed to be related. Microsoft .NET 2.0 (no service pack) remains installed, and i still have a problem.

devices remain stuck in learning / pending and cannot be deleted / added to DFM.

if the newer version .dll's are no longer on the system at all, and i've restarted a number of times since their deletion, there's no way that the .dll's could still be the problem.. right?

Joe Clarke Mon, 03/10/2008 - 09:36

If you've done another search, and the newer DLLs are truly gone, then that is not the problem. At what percentage are the devices hanging in DFM? This can be seen under DFM > Device Management > Rediscover/Delete.

davidharan1 Mon, 03/10/2008 - 09:41

devices hang at 10%. I'm pretty certain this is consistent. i've never seen 11% or 50% etc.

***Data Collector Status Information***

Discovery Progress = 10% completed

Joe Clarke Mon, 03/10/2008 - 09:56

This still points to a problem on the EMC side. Please post the output of pdshow and netstat -a -n -o -b.

Joe Clarke Mon, 03/10/2008 - 11:05

There are a lot of problems on this server. It would be helpful to see if there are any errors in the Windows Event Viewer. Also, please post the NMSROOT\objects\smarts\local\logs\brstart.log. And run the command:

NMSROOT\lib\vbroker\bin\osagent -p 42342

What output do you get?
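When reading through a long netstat listing, it can help to filter for the port passed to osagent. The sketch below assumes the usual Windows `netstat -a -n -o` column layout (protocol, local address, foreign address, optional state, PID) and simply skips the extra executable-name lines that `-b` adds; the function name is hypothetical.

```python
def sockets_on_port(netstat_text, port):
    """Return (proto, local_address, pid) for sockets on the given local port."""
    matches = []
    for raw in netstat_text.splitlines():
        fields = raw.split()
        # Data rows start with the protocol; header and -b detail rows do not.
        if len(fields) < 4 or fields[0] not in ("TCP", "UDP"):
            continue
        local = fields[1]
        # rpartition keeps bracketed IPv6 forms such as [::]:42342 intact.
        _, _, local_port = local.rpartition(":")
        if local_port == str(port):
            # With -o, the PID is the last column of the row.
            matches.append((fields[0], local, fields[-1]))
    return matches
```

Saving the output with `netstat -a -n -o > netstat.txt` and calling `sockets_on_port(open("netstat.txt").read(), 42342)` would show whether any process holds that port, and under which PID.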

davidharan1 Tue, 03/11/2008 - 08:15

Windows Event Viewer error messages;

from the system log...

The CiscoWorks VisiBroker Smart Agent service failed to start due to the following error:

The service did not respond to the start or control request in a timely fashion.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp

Timeout (30000 milliseconds) waiting for the CiscoWorks VisiBroker Smart Agent service to connect.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

from the application log...

Windows cannot load extensible counter DLL ASP.NET_2.0.50727, the first DWORD in data section is the Windows error code.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

davidharan1 Tue, 03/11/2008 - 08:20

when i try to run NMSROOT\lib\vbroker\bin\osagent -p 42342, i get an error message 'osagent.exe - Application Error. The application failed to initialize properly (0xc0000135). Click on OK etc'.
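The code in that dialog is a Windows NTSTATUS value. 0xC0000135 is STATUS_DLL_NOT_FOUND, meaning the loader could not find a DLL that the executable depends on. The small lookup below is a sketch covering only a handful of related codes; it is not a complete table.

```python
# A few NTSTATUS values that show up in application start-up failures.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000139: "STATUS_ENTRYPOINT_NOT_FOUND",
    0xC0000142: "STATUS_DLL_INIT_FAILED",
}


def describe_status(code_text):
    """Look up a hex status string such as '0xc0000135'."""
    code = int(code_text, 16)
    return NTSTATUS_NAMES.get(code, "unknown status 0x%08X" % code)
```

Here `describe_status("0xc0000135")` returns `STATUS_DLL_NOT_FOUND`, which suggests checking which DLLs osagent.exe imports and whether they can still be found on the server.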

Joe Clarke Tue, 03/11/2008 - 08:50

It sounds like you still have a DLL problem. The daemons that are failing to work use those same DLLs I mentioned before. In fact, they are the only daemons in LMS that use them. Please post a DOS dir output of NMSROOT\lib\vbroker\bin.

Joe Clarke Tue, 03/11/2008 - 11:26

This looks fine. All my searching regarding this error mentions problems with .NET. Though we are not using .NET, I wonder if the DLLs are somehow calling into some of the same code paths. Since you mentioned you had .NET 2.0 installed, is there any way you can remove it, thus leaving only .NET 1.1 installed?
