LMS2.6 problem archiving configuration

ijdod
Level 1

A fresh Windows install of LMS2.6 at a customer has a problem archiving the configurations of our IOS devices. Archiving does seem to work fine for the few CatOS boxes we still have. The IOS boxes are registered as partially successful, with the following error message:

CM0057 PRIMARY RUNNING Config fetch SUCCESS, archival failed for gv-kgr-02-sw01 Cause: CM0002: Could not archive config Cause: Device may not be reachable, may be in suspended state or credentials may be incorrect. Action: Verify that device is managed, credentials are correct and file system has correct permissions. Increase timeout value, if required. Action: Verify that archive exists for device.

This looks a lot like quite a few problems posted in this forum, but removing and re-adding the devices, mentioned as a possible workaround, does not solve the problem. jclark reported that work was being done on a fix. Is there any ETA on that?

Additionally, we are seeing some stability issues, with LMS services being shut down for no apparent reason after running for a few hours to a couple of days. Services are stopped with "Administrator has stopped this server". So far I haven't had much luck deciphering the log messages to find out why those services stopped.

Any help or suggestions are appreciated.

1 Accepted Solution

Your problem is your System Identity User is missing. Make sure the username configured under Common Services > Server > Security > System Identity Setup is a valid local user with full CiscoWorks roles (e.g. admin).

After correcting the System Identity User, restart dmgtd.

11 Replies

Joe Clarke
Cisco Employee

This config archive error may not require any fixes. It may be a misconfiguration on the server. You should enable ArchiveMgmt Service debugging under RME > Admin > System Preferences > Loglevel Settings, then reproduce the problem. The dcmaservice.log should have errors as to why the archive is failing.
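Once the debug log has been collected, a quick filter can help surface the relevant lines. The following is only a minimal sketch: the keyword list and the helper names `find_error_lines` and `scan_log` are assumptions for illustration, and the actual wording and line format in dcmaservice.log may differ.

```python
import re
from typing import Iterable, List

# Assumed keywords; adjust to whatever the real log actually contains.
ERROR_PATTERN = re.compile(r"error|exception|fail|null or empty", re.IGNORECASE)


def find_error_lines(lines: Iterable[str], pattern: re.Pattern = ERROR_PATTERN) -> List[str]:
    """Return the lines that match the pattern, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan_log(path: str) -> List[str]:
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return find_error_lines(handle)
```

Calling `scan_log` with the path to dcmaservice.log on the server (the location depends on the installation) returns only the matching lines, which is usually easier to paste into a post than the full debug output.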

Which daemons are in the shutdown state? What other errors are you seeing?

See the attached dcmaclient.log file. I started debugging, ran an immediate Synch Conf job (job #1045), which failed with the same error as in the original post, then stopped debugging and copied the output to a file for attachment.

Haven't written the stopped services down, will do so next time. It was a whole bunch.

Other issues we noted:

- Some of the default scheduled jobs fail because they have no owner.

- Trying to set the collection job schedule (RME > Admin > CFG Mgmt > Collection settings) results in a CM0076 error, but the CTMJrmServer & jrm ARE running. See attachment collection_job.log for the debugging output on that one.

I had mentioned the dcmaservice.log, not the dcmaclient.log. They are substantially different. However, it looks like you have bigger problems. Jrm appears to be down. I assume these problems are new? What changed on the server recently (e.g. was the hostname changed?)? A full pdshow output may help isolate the failing processes.
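To make a full pdshow output easier to read, the processes that are not running can be filtered out with a short script. This is a rough sketch that assumes pdshow prints one `Key = value` pair per line and starts each process block with a `Process=` line; that layout, and the helper names `parse_pdshow` and `not_running`, are assumptions rather than a documented format.

```python
from typing import Dict, List


def parse_pdshow(text: str) -> List[Dict[str, str]]:
    """Split pdshow-style text into one dict per process block."""
    records: List[Dict[str, str]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if "=" not in line:
            continue  # skip blank lines and anything that is not Key = value
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "Process":
            # A Process line is assumed to open a new block.
            current = {}
            records.append(current)
        if current is not None:
            current[key] = value
    return records


def not_running(records: List[Dict[str, str]]) -> List[str]:
    """Names of processes whose State is anything but 'Running normally'."""
    return [r["Process"] for r in records if r.get("State") != "Running normally"]
```

Saving the pdshow output to a text file and passing its contents to `parse_pdshow` gives a list that `not_running` can reduce to the suspect daemons. Some processes are expected to be in a state other than "Running normally" (transient jobs that have already finished, for example), so the result is a starting point, not a verdict.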

Ouch, my bad. I sorted on date, never read past "dcma", and went from there. Looking at the dcmaservice.log I do see some errors regarding 'user null or empty'. I have no idea which user is meant. Device credentials check out in CDA, as well as in the ACS logging (ACS is just used for device authentication at this point, with no integration with CW).

Jrm is up as far as I can tell ("Running normally"), both through the Common Services GUI and through pdshow. There have been no recent changes to the server. The server itself is a fresh install, just like CW itself.

The problem is 'new' in the sense that I didn't see this before, but I don't think I tried setting that before. LMS is new to us, production server is still an ancient CW2000 with the old style webGUI.

Here are the daemons that were down at the last incident, with timestamps where I have them. No engineers were actually working with the system when this happened; we only found out later when certain parts of the GUI turned out to be unresponsive. All daemons showed 'Administrator has shut down this server'.

FHServer 10:30:11

FHDbMonitor 10:31:11

CampusOGSServer 10:32:51-53

ChangeAudit

CmfDbMonitor

CMFOGSServer

ConfigMgmtServer

CTMJrmServer

DCRServer

DFMCTMStartup

EDS-GCF

Interactor

InvDBMonitor

InventoryCollector

jrm

NCTemplateMgr

NetShowMgr

NOSServer

PTMServer

RMEOGSServer

SyslogAnalyzer

TISServer

EssentialsDM

The root of the problem appears to be centered around the CMF database. What is the state of CmfDbEngine? Is this Windows or Solaris?

CmfDbEngine was very likely up and running, the list was of services stopped on or around that time.

It's a Win2003 SP1 install.

So is this no longer a problem?

It's a problem in the sense that LMS seems to die without reasons known to us. There is a workaround by restarting the whole shebang.

I assume you've fixed the problem with the System Identity User. If not, go ahead and do that now, then restart LMS. When the problem occurs again, it would be a good idea to get the CmfDbMonitor.log to see why the connection to the database died.
