CiscoWorks RME Daily Archive Poller Failure

Sep 8th, 2008

I'm running LMS 2.6 with RME 4.0.6.


The scheduled job simply fails and doesn't appear to have polled anything.


Because I'm not aware of the exact service at fault, I stop and restart the CW daemon. This fixes the issue for a single daily run. All jobs after the first run fail.


I get a single email:


Hello,


The following is the status of your Change Poller based Collection job:


Job ID : 1015

Status : Job Failed

Description : System config polling job

Details : https://nms.ArkansasElectric.com:443/rme/DcmaJobDetails.do?jobid=1015.192


Start Date and Time : Thu Sep 04 17:00:21 CDT 2008

End Date and Time : Thu Sep 04 17:00:21 CDT 2008


RME Server Name/IP : nms.ArkansasElectric.com


Execution Summary


Pending : 0

NotAttempted : 0

Successfull : 0

Failed : 0

Partial Success : 0


I'm pretty sure this all began after we had a disaster in our data center. I know the server CW runs on went down without being properly shut down, so that probably has something to do with it.


Does anyone have advice on how to remedy this?


Thanks,

Daniel
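
The pattern in the email above (status "Job Failed" with every Execution Summary counter at 0) can be detected mechanically. Here is a minimal Python sketch, assuming the email body uses the `Key : Value` layout shown. The names `parse_job_status` and `never_polled` are hypothetical and not part of CiscoWorks.

```python
# Counter names copied verbatim from the status email, including its spelling.
COUNT_KEYS = ("Pending", "NotAttempted", "Successfull", "Failed", "Partial Success")


def parse_job_status(email_text):
    """Collect the 'Key : Value' lines of a status email into a dict."""
    fields = {}
    for line in email_text.splitlines():
        # Split on the first " : " only, so URLs and timestamps stay intact.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def never_polled(fields):
    """True when the job failed and every execution counter is zero."""
    counts = [int(fields.get(key, "0")) for key in COUNT_KEYS]
    return fields.get("Status") == "Job Failed" and sum(counts) == 0


SAMPLE = """Job ID : 1015
Status : Job Failed
Pending : 0
NotAttempted : 0
Successfull : 0
Failed : 0
Partial Success : 0
"""

fields = parse_job_status(SAMPLE)
print(fields["Job ID"], never_polled(fields))
```

A job that fails with devices still listed as Pending would not match this check, which helps separate the "nothing was attempted" case from other failures.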


Joe Clarke Mon, 09/08/2008 - 11:24
User Badges: Cisco Employee, Hall of Fame, Founding Member

Please post the output of the pdshow command when the polling fails.

dhardy6786 Fri, 09/26/2008 - 07:54

I've attached the output of the pdshow.


The attached image shows an error message that I receive when I access "Archive Management". Access works for several attempts, then I receive this message. After that, I am unable to access any part of CiscoWorks except the home page. This is also the point at which scheduled jobs fail.



Joe Clarke Fri, 09/26/2008 - 12:41

Is this installed on Windows or Solaris?

Joe Clarke Fri, 09/26/2008 - 15:34

It appears there may be a problem with your RME database. It may be corrupt in some way that is triggering a failure, or its connections may be exhausted. Please post your RMEDbMonitor.log and the contents of the Windows Application Event Viewer.

dhardy6786 Fri, 09/26/2008 - 16:44

That would not surprise me. From what I saw, that does appear to be a fair assumption. I filtered the Event Viewer for RMEDbEngine instead of showing everything, but if you'd like to see something else specific, please let me know.



Joe Clarke Fri, 09/26/2008 - 17:23

It does appear your database is corrupt. If you have a known good backup of LMS, you should restore it. If not, you will need to reinitialize the RME database with the command:


NMSROOT\bin\perl NMSROOT\bin\dbRestoreOrig.pl dsn=rmeng dmprefix=RME

dhardy6786 Fri, 09/26/2008 - 19:23

I don't know if my last couple backups are any good. What will that command do to everything I currently have in RME? Will it all be lost?

Joe Clarke Fri, 09/26/2008 - 19:31

Yes, all your RME data will be lost.

dhardy6786 Wed, 10/01/2008 - 12:36

I restored from a backup and it all went fine. It's running fine now, except that the scheduled daily poll seems to fail before it starts. Previously, when it failed, it would report:


Pending : 0

NotAttempted : 0

Successfull : 0

Failed : 0

Partial Success : 0


I guess that was because of the database problem. Now it says all the devices are Pending, but the job still fails. I noticed that manually run syncs completed successfully, but I had started a job that just kept running and would not stop, and it still won't. It just says "Stop Initiated". Now all RME jobs fail. Are there any logs that show this information?

Joe Clarke Wed, 10/01/2008 - 12:38

The dcmaservice.log would show any errors. However, if you have a wedged job, you will need to restart ConfigMgmtServer at the very least for new config jobs to run again.
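
The two steps above (check dcmaservice.log, then restart ConfigMgmtServer) could be scripted roughly as follows. This is a sketch under assumptions: it assumes the CiscoWorks `pdterm` and `pdexec` process-control commands are on the PATH, the error markers are guesses because the log format is not shown in this thread, and the helper names are hypothetical. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Assumed markers; adjust after looking at the real dcmaservice.log.
ERROR_MARKERS = ("ERROR", "FATAL", "Exception")


def error_lines(log_text, markers=ERROR_MARKERS):
    """Return the log lines that contain any of the given markers."""
    return [line for line in log_text.splitlines()
            if any(marker in line for marker in markers)]


def restart_commands(process="ConfigMgmtServer"):
    """Build the stop and start command lines for one managed process."""
    return [["pdterm", process], ["pdexec", process]]


def restart(process="ConfigMgmtServer", dry_run=True):
    """Stop, then start, the process; only print the commands in a dry run."""
    for cmd in restart_commands(process):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


restart()  # dry run: prints the two commands without executing them
```

Calling `restart(dry_run=False)` would actually run the commands, so it is worth confirming with `pdshow` first that the process name matches your installation.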

dhardy6786 Wed, 10/01/2008 - 12:59

Thanks, that finally killed that job. I just noticed another problem, though: ANIServer fails to run. If I start it, it says


Program started - No mgt msgs received


Then


Running with busy flag set


Then


Failed to run


Any ideas?

Joe Clarke Wed, 10/01/2008 - 13:15

Please start a new thread for the problem with ANIServer.

dhardy6786 Wed, 10/01/2008 - 13:25

I restarted the Daemon and so far it looks fine. I'll start a new thread if I continue to have problems with it. Thank you for your help.

dhardy6786 Tue, 06/16/2009 - 09:54

This was a while back, but I'm pretty sure I was referring to crmdmgtd.

scott.lorenzen@... Tue, 06/16/2009 - 09:56

Thank you. I have a little bat file for this. However, I will have to wait until after hours to run this since it shuts everything down and restarts it. I find it takes about 1/2 hour before the system is usable again.
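
A batch file like the one mentioned above might amount to stopping and then starting the `crmdmgtd` service. The sketch below does the same from Python, assuming a Windows server and the standard `net stop` / `net start` commands. The helper names and the pause length are illustrative assumptions, and it only prints the commands unless `dry_run=False` is passed.

```python
import subprocess
import time

SERVICE = "crmdmgtd"  # daemon manager service name given in this thread


def daemon_restart_commands(service=SERVICE):
    """Build the Windows commands that stop and then start the service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_daemon_manager(pause_seconds=60, dry_run=True):
    """Stop the service, pause, then start it again."""
    stop_cmd, start_cmd = daemon_restart_commands()
    for cmd in (stop_cmd, start_cmd):
        if dry_run:
            print(" ".join(cmd))
            continue
        subprocess.run(cmd, check=True)
        if cmd is stop_cmd:
            # Assumed pause so processes can exit before the restart.
            time.sleep(pause_seconds)


restart_daemon_manager()  # dry run: prints the two commands
```

Because a full restart shuts everything down, a script like this is best scheduled after hours, as the post above describes.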
