
LMS 2.6: Config Management Periodic Collection Failing

cbajelis
Level 1

RME 4.0.5 as part of LMS 2.6 on Windows 2003 SP2.

Our standard monthly config management collection has mysteriously started failing for all devices.

The results file shows the following for every single device:

Execution Status  : Job Execution Failed for device

Execution Message : Unable to get results of job execution for device. Retry the job after increasing the job result wait time using the option:Resource Manager Essentials -> Admin -> Config Mgmt -> Archive Mgmt ->Fetch Settings

Whilst the previous value of 60 seconds had been working without any issues for the previous two years, I upped the wait time to 120 seconds, and all this did was double the required poll time.

The actual job log file shows:

[ Mon Feb 08  10:45:04 EST 2010 ],INFO ,[Thread-6],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,handleMultiDeviceExecution,143,JobExecutorThread - MultiDeviceExec DcmaJobExecThread 0 : Running

[ Mon Feb 08  10:45:04 EST 2010 ],INFO ,[Thread-6],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,syncArchive,653,Syncing Archive, # of devices = 752

[ Mon Feb 08  10:45:05 EST 2010 ],INFO ,[Thread-6],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,handleMultiDeviceExecution,149,Completed executeJob(), updating Results

[ Mon Feb 08  10:45:05 EST 2010 ],INFO ,[Thread-6],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,getNumCyclesToPoll,1018,getNumCyclesToPoll Function Started.

[ Mon Feb 08  10:45:05 EST 2010 ],INFO ,[Thread-6],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,updateMultiDeviceExecResults,781,Awaiting Job results: req Id = 1265586304881 Poll time = 1321 min(s)

[ Tue Feb 09  08:46:22 EST 2010 ],INFO ,[Thread-6],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,setDevResultToFailure,897,Could not get results for 752 device(s)

[ Tue Feb 09  08:46:25 EST 2010 ],INFO ,[Thread-6],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,handleMultiDeviceExecution,166,Thread DcmaJobExecThread 0: Stopping

[ Tue Feb 09  08:46:26 EST 2010 ],INFO ,[main],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecutor,run,221,Finished Job Execution

[ Tue Feb 09  08:46:26 EST 2010 ],INFO ,[main],com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecutor,endJobExecution,460,Writing result Files

Historically, when this worked fine, the log file would show the devices as they completed and update the number of remaining devices. The job is still configured for parallel execution; I'm debating setting it to sequential just to see if it makes any difference.

From what I can see, our syslog- or periodic-polling-triggered collections are working fine.

Any suggestions out there from someone who may have seen something similar?
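For anyone comparing their own job logs against the one above, here is a rough Python sketch that pulls the configured poll time, the time actually waited, and the failed device count out of log lines in this format. The function names (`parse_stamp`, `summarize_wait`) are made up for illustration, and the regular expressions are only inferred from the lines quoted in this post, so treat it as a starting point rather than a supported tool.

```python
import re
from datetime import datetime

# Leading "[ Mon Feb 08  10:45:05 EST 2010 ]" stamp; the time zone token is skipped.
STAMP = re.compile(r"\[?\s*\w{3} (\w{3}) (\d{2})\s+(\d{2}:\d{2}:\d{2}) \w+ (\d{4}) \]")
POLL = re.compile(r"Awaiting Job results:.*Poll time = (\d+) min")
FAILED = re.compile(r"Could not get results for (\d+) device")


def parse_stamp(line):
    """Return the line's timestamp as a naive datetime, or None if absent."""
    m = STAMP.match(line)
    if not m:
        return None
    mon, day, clock, year = m.groups()
    return datetime.strptime(f"{mon} {day} {clock} {year}", "%b %d %H:%M:%S %Y")


def summarize_wait(lines):
    """Compare the configured poll time with the time the job actually waited."""
    started = configured = waited = failed = None
    for line in lines:
        poll = POLL.search(line)
        if poll:
            started = parse_stamp(line)
            configured = int(poll.group(1))
        fail = FAILED.search(line)
        if fail and started is not None:
            failed = int(fail.group(1))
            waited = int((parse_stamp(line) - started).total_seconds() // 60)
    return {"configured_min": configured, "waited_min": waited, "failed_devices": failed}


sample = [
    "[ Mon Feb 08  10:45:05 EST 2010 ],INFO ,[Thread-6],"
    "com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,updateMultiDeviceExecResults,781,"
    "Awaiting Job results: req Id = 1265586304881 Poll time = 1321 min(s)",
    "[ Tue Feb 09  08:46:22 EST 2010 ],INFO ,[Thread-6],"
    "com.cisco.nm.rmeng.dcma.jobdriver.DcmaJobExecThread,setDevResultToFailure,897,"
    "Could not get results for 752 device(s)",
]
print(summarize_wait(sample))
```

On the two lines quoted above, the waited time comes out equal to the configured 1321 minutes, with all 752 devices failed. In other words, the job sat through the entire poll window without receiving a single result, which points at the collection back end rather than at the individual devices.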

1 Accepted Solution


Joe Clarke
Cisco Employee

One of the ConfigMgmtServer threads may be getting deadlocked. Try restarting ConfigMgmtServer, and see if subsequent jobs run correctly:

pdterm ConfigMgmtServer

pdexec ConfigMgmtServer
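If the restart needs to be scripted (for example, run before the monthly job), the two commands above could be wrapped along these lines. This is only a sketch: the `restart_daemon` helper, the five-second pause between stop and start, and the injectable `run`/`sleep` parameters are assumptions added for illustration, and it presumes `pdterm` and `pdexec` are on the PATH of the account running it.

```python
import subprocess
import time


def restart_daemon(name="ConfigMgmtServer", pause_s=5,
                   run=subprocess.run, sleep=time.sleep):
    """Stop and then start a daemon with pdterm/pdexec.

    pause_s is an assumed grace period, not a documented requirement.
    run and sleep are injectable so the sequence can be dry-run.
    """
    commands = [["pdterm", name], ["pdexec", name]]
    for index, command in enumerate(commands):
        # check=True raises CalledProcessError if the command exits non-zero.
        run(command, check=True)
        if index == 0:
            sleep(pause_s)  # give the process time to exit before restarting
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    restart_daemon(run=lambda command, check: print(" ".join(command)),
                   sleep=lambda seconds: None)
```

Dropping the `run=` and `sleep=` overrides makes the script execute the commands for real.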


Thanks, Joe. That is exactly what fixed it.
