LMS 3.2 RME 4.2.0
Server suddenly stopped polling inventory and won't run credential verification jobs. The devices are never attempted and the job stays in the running state until the daemons are stopped.
Logs for all failing jobs show this error:
[ Thu Oct 15 06:46:02 BST 2009 ],ERROR,[main],No resource is associated with key "POLLERJOB015".
[ Thu Oct 15 06:46:02 BST 2009 ],ERROR,[main],POLLERJOB015
javax.jms.JMSException: Could not connect to broker URL: tcp://localhost:42351?wireFormat.maxInactivityDuration=0. Reason: java.net.ConnectException: Connection refused: connect
It is an ongoing problem. I looked it up on another machine and that port (42351) is owned by cwjava. On the affected node nothing is listening on that port. The user guide says this is for ESS.
Yes, it is. But your pdshow output looked healthy, which is why I asked whether you were seeing the problem at the moment you captured it. It appeared that you were seeing the problem, then restarted daemons, so everything was okay by the time you captured the pdshow output. I need to see the pdshow output while the problem is occurring, BEFORE daemons are restarted.
Well, the problem is consistent. Our daily inventory polling fails every time, and any inventory job fails without attempting any devices. This happens all the time, including when the pdshow was taken.
When this problem is occurring, before doing anything else, open a TAC service request so a full thread dump can be taken from ESS. If ESS is still running, it appears one of its threads may have died.
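One way to confirm that the problem is present at the moment the pdshow output is captured is to test whether anything accepts TCP connections on the port named in the JMSException (42351). The following is only a rough sketch, not a Cisco tool: the function name port_is_listening is made up for illustration, and it assumes a Python interpreter is available on the server.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; not part of LMS or RME.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False


if __name__ == "__main__":
    ESS_PORT = 42351  # port taken from the JMSException in the job log
    state = "listening" if port_is_listening("localhost", ESS_PORT) else "NOT listening"
    print(f"localhost:{ESS_PORT} is {state}")
```

If this reports that the port is not listening, that matches the "Connection refused" in the job log and is a good time to collect the pdshow output, before any daemon restart.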
What was the fix for this? I'm having the exact same issue. It seems like the inventory will start and run for a while, then come back and say failed. After that, when you look at the job summary, it always says "not attempted".
I don't know that a service request was opened for this, but if the originator does not reply, you can start a new thread for this issue. If the symptoms are the same, a service request may be required to figure out why ESS is either not starting completely, or dying after it starts.
One thing we have seen since this thread was first posted is ESS problems caused by corrupt data stores. Moving the contents of NMSROOT/objects/ess/data to a backup location and then restarting Daemon Manager has been known to help with that.
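The move itself can be done by hand, but for anyone who prefers to script it, here is a minimal sketch. It is not a Cisco-supplied procedure: the function name move_ess_data and the backup path in the example are hypothetical, and it assumes the backup location does not already exist and lies outside the data directory. It is also an assumption on my part that Daemon Manager should be stopped while the files are moved, since ESS may otherwise hold them open.

```python
import shutil
from pathlib import Path
from typing import List


def move_ess_data(data_dir: Path, backup_dir: Path) -> List[str]:
    """Move every entry in data_dir into backup_dir and return the names moved.

    Hypothetical helper for illustration; not part of LMS or RME.
    data_dir itself is left in place, but empty.
    """
    data_dir = data_dir.resolve()
    backup_dir = backup_dir.resolve()
    # Refuse a backup location inside the directory being emptied.
    if backup_dir == data_dir or data_dir in backup_dir.parents:
        raise ValueError("backup_dir must be outside data_dir")
    # Fail rather than mix files into an existing backup.
    backup_dir.mkdir(parents=True, exist_ok=False)
    moved = []
    for entry in sorted(data_dir.iterdir()):
        shutil.move(str(entry), str(backup_dir / entry.name))
        moved.append(entry.name)
    return moved


# Example call (hypothetical backup path; assumes the NMSROOT environment
# variable is set and Daemon Manager is stopped):
#   import os
#   move_ess_data(Path(os.environ["NMSROOT"]) / "objects" / "ess" / "data",
#                 Path("/var/tmp/ess_data_backup"))
```

After the move, restart Daemon Manager as described above and check whether the inventory jobs start attempting devices again. Keeping the backup lets you restore the original files if this does not help.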