Cisco Support Community
New Member

RME 4.3 (LMS 3.2) archive jobs hanging

Hi,

I have an issue where the archive poll job usually hangs (it still shows as running). This also stops all other archive jobs from running until LMS is restarted. The only stack traces are xdi related. Are all the known xdi issues fixed in RME 4.3?

Thanks

[ Wed Jul 15 21:09:08 BST 2009 ],ERROR,[Thread-339],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,error,19,Unexpected Ssh2Exception stacktrace:

[ Wed Jul 15 21:09:08 BST 2009 ],DEBUG,[Thread-339],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,printStackTrace,51,stacktracecom.cisco.nm.lib.cmdsvc.ssh2.Ssh2Exception: Disconnected from remote host

at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.readBytes(StreamPair.java:332)

at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.readPacket(StreamPair.java:183)

at com.cisco.nm.lib.cmdsvc.ssh2.Ssh2Engine.run(Ssh2Engine.java:234)

[ Wed Jul 15 21:09:08 BST 2009 ],DEBUG,[Thread-45],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,printStackTrace,51,stacktracejava.net.SocketException: Broken pipe

at java.net.SocketOutputStream.socketWrite0(Native Method)

at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)

at java.net.SocketOutputStream.write(SocketOutputStream.java:136)

at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)

at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)

at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.flush(StreamPair.java:341)

at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.write(StreamPair.java:164)

at com.cisco.nm.lib.cmdsvc.ssh2.StreamPair.write(StreamPair.java:128)

at com.cisco.nm.lib.cmdsvc.ssh2.Ssh2Engine.write(Ssh2Engine.java:119)

at com.cisco.nm.lib.cmdsvc.ssh2.Ssh2Engine.disconnect(Ssh2Engine.java:375)

at com.cisco.nm.lib.cmdsvc.SSH2Session.disconnect(SSH2Session.java:180)

at com.cisco.nm.lib.cmdsvc.SSH2Session.close(SSH2Session.java:169)

at com.cisco.nm.lib.cmdsvc.OpConnect.revert(OpConnect.java:74)

at com.cisco.nm.lib.cmdsvc.SessionContext.revert(SessionContext.java:587)

at com.cisco.nm.lib.cmdsvc.SessionContext.invoke(SessionContext.java:216)

at com.cisco.nm.lib.cmdsvc.Engine.process(Engine.java:57)

at com.cisco.nm.lib.cmdsvc.LocalProxy.process(LocalProxy.java:22)

at com.cisco.nm.lib.cmdsvc.CmdSvc.close(CmdSvc.java:591)

at com.cisco.nm.xms.xdi.pkgs.LibDcma.persistor.CliOperator.cleanupOperator(CliOperator.java:1219)

at com.cisco.nm.xms.xdi.pkgs.SharedDcmaPIX.transport.PIXCliOperator.cleanupOperator(PIXCliOperator.java:844)

at com.cisco.nm.xms.xdi.pkgs.SharedDcmaPIX.transport.PIXConfigOperator.cleanupOperator(PIXConfigOperator.java:252)

at com.cisco.nm.xms.xdi.pkgs.LibDcma.persistor.OperatorCacheManager.clearCache(OperatorCacheManager.java:95)

at com.cisco.nm.xms.xdi.pkgs.SharedDcmaPIX.transport.PIXConfigOperator.operationDone(PIXConfigOperator.java:259)

at com.cisco.nm.rmeng.dcma.configmanager.ConfigManager.updateArchiveForDevice(ConfigManager.java:840)

at com.cisco.nm.rmeng.dcma.configmanager.ConfigManager.performCollection(ConfigManager.java:1646)

at com.cisco.nm.rmeng.dcma.configmanager.CfgUpdateThread.run(CfgUpdateThread.java:27)

Cisco Employee

Re: RME 4.3 (LMS 3.2) archive jobs hanging

All of the known lock-up bugs have been fixed in RME 4.3. In order to troubleshoot this, you will need to get a full Java thread dump from the ConfigMgmtServer process. If this is on Windows, the procedure can be somewhat involved, and you should contact TAC to have them walk you through it.

New Member

Re: RME 4.3 (LMS 3.2) archive jobs hanging

OK, thanks. It's Solaris, but I will open a TAC case. I will post back anything informative.

Cisco Employee

Re: RME 4.3 (LMS 3.2) archive jobs hanging

Solaris is much easier. You can send a SIGQUIT to the ConfigMgmtServer PID. The thread dump will be written to daemons.log.
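As a rough sketch of that step in Python: the helper below looks up the PID and sends SIGQUIT, which makes a HotSpot JVM print a thread dump without stopping the process. The `pdshow` path and its `Pid = <n>` output line are assumptions based on a default install, and the function names are made up for illustration; check what `pdshow ConfigMgmtServer` prints on your server before relying on it.

```python
import os
import re
import signal
import subprocess

# Assumption: default Solaris install path for the LMS pdshow utility.
PDSHOW = "/opt/CSCOpx/bin/pdshow"


def parse_pid(pdshow_output):
    """Return the first PID found in pdshow-style output, or None."""
    match = re.search(r"^\s*Pid\s*=\s*(\d+)\s*$", pdshow_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def request_thread_dump(process_name="ConfigMgmtServer"):
    """Send SIGQUIT to the named daemon so its JVM writes a thread dump."""
    output = subprocess.run(
        [PDSHOW, process_name], capture_output=True, text=True
    ).stdout
    pid = parse_pid(output)
    if not pid:  # None (no Pid line) or 0 (assumed to mean "not running")
        raise RuntimeError(f"{process_name} does not appear to be running")
    os.kill(pid, signal.SIGQUIT)
    return pid
```

Run `request_thread_dump()` as a user allowed to signal the process (typically root or casuser), then look for the dump in daemons.log.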

New Member

Re: RME 4.3 (LMS 3.2) archive jobs hanging

As attached. Guess it's all those ssh2 locked threads.
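For anyone repeating this check, here is a small Python sketch that lists the threads in a dump that have a frame in the ssh2 package. It assumes the dump has been copied out of daemons.log into plain HotSpot format (each thread starts with a quoted name, frames start with `at `); the function name is hypothetical.

```python
def threads_in_package(dump_text, package="com.cisco.nm.lib.cmdsvc.ssh2"):
    """Return the header lines of threads with at least one frame in `package`."""
    hits = []
    header = None
    matched = False
    for line in dump_text.splitlines():
        if line.startswith('"'):
            # A quoted name starts a new thread entry; flush the previous one.
            if header and matched:
                hits.append(header)
            header, matched = line, False
        elif header and line.strip().startswith("at " + package + "."):
            matched = True
    if header and matched:
        hits.append(header)
    return hits
```

The header line of each hit also carries the thread state (for example "waiting for monitor entry"), which is what shows whether the ssh2 threads are actually locked.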

Cisco Employee

Re: RME 4.3 (LMS 3.2) archive jobs hanging

Looks like you're hitting a Solaris bug. To work around this, edit /opt/CSCOpx/lib/jre/lib/security/java.security, and change the line:

security.provider.1=sun.security.pkcs11.SunPKCS11 ${java.home}/lib/security/sunpkcs11-solaris.cfg

to:

security.provider.1=sun.security.provider.Sun

Then restart dmgtd.
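If you would rather script the edit than do it by hand, a minimal Python sketch follows. It only touches an uncommented `security.provider.1` line that names the SunPKCS11 provider and keeps a backup copy first; the function names and the `.bak` suffix are my own choices, not anything LMS provides.

```python
import shutil

OLD_PREFIX = "security.provider.1=sun.security.pkcs11.SunPKCS11"
NEW_LINE = "security.provider.1=sun.security.provider.Sun"


def swap_first_provider(text):
    """Replace the PKCS11 provider.1 line; return (new_text, changed)."""
    changed = False
    out = []
    for line in text.splitlines(keepends=True):
        if line.strip().startswith(OLD_PREFIX):
            # Preserve the original line ending, if any.
            newline = "\n" if line.endswith("\n") else ""
            out.append(NEW_LINE + newline)
            changed = True
        else:
            out.append(line)
    return "".join(out), changed


def apply_workaround(path="/opt/CSCOpx/lib/jre/lib/security/java.security"):
    """Back up java.security, then rewrite it in place if the line is present."""
    with open(path) as handle:
        new_text, changed = swap_first_provider(handle.read())
    if changed:
        shutil.copy2(path, path + ".bak")  # hypothetical backup name
        with open(path, "w") as handle:
            handle.write(new_text)
    return changed
```

`apply_workaround()` returns False and leaves the file alone when the PKCS11 line is not found. dmgtd still has to be restarted afterwards for the change to take effect.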

New Member

Re: RME 4.3 (LMS 3.2) archive jobs hanging

Thanks, I've made the change. Should know by Monday.

New Member

Re: RME 4.3 (LMS 3.2) archive jobs hanging

All looks good. Do you by any chance have a Solaris patch or bug ID for this?

Either way, I will mark it resolved.

Thanks.

Cisco Employee

Re: RME 4.3 (LMS 3.2) archive jobs hanging

There are a lot of bug IDs associated with this (e.g. 6533630). If you apply the latest Solaris recommended patch cluster, you should be okay. I'm running it on my servers, and I have not seen this hang.
