Cisco Support Community
Community Member

VOIP Monitor is OUT OF SERVICE on UCCX - 9.0.2 - 9.0.2.11001-24

Hi all, we recently implemented UCCX 9.0.2 in an HA over LAN (HAoLAN) setup, and the VoIP Monitor subsystem is OUT OF SERVICE on the Subscriber node.

Restarting the cluster did not help.

Any ideas?

1492016: Jul 24 14:53:10.175 GST %MIVR-SS_VOIPMON_SRV-3-VOIP_OPERATION_ERROR:VOIP subsystem operation error: Module Name=LrmVoipManager.recovery,A specific description for a trace=fail to read OPEN_CONF msg returned is null,Exception=
1492017: Jul 24 14:53:10.773 GST %MIVR-CFG_MGR-7-UNK:BSSession:heartbeat::BSSocket.nextInvokeId=130945
1492018: Jul 24 14:53:10.774 GST %MIVR-CFG_MGR-7-UNK:BSMessageWriter-1.writeMessage-> about to write bootstrap message = HEARTBEAT_REQ[length=-1,invokeId=130945]
1492019: Jul 24 14:53:10.814 GST %MIVR-CFG_MGR-7-UNK:BSMessageReader-1.run-> message read: HEARTBEAT_CONF[length=4,invokeId=130945]
1492020: Jul 24 14:53:15.777 GST %MIVR-CFG_MGR-7-UNK:BSSession:heartbeat::BSSocket.nextInvokeId=130946
1492021: Jul 24 14:53:15.777 GST %MIVR-CFG_MGR-7-UNK:BSMessageWriter-1.writeMessage-> about to write bootstrap message = HEARTBEAT_REQ[length=-1,invokeId=130946]
1492022: Jul 24 14:53:15.818 GST %MIVR-CFG_MGR-7-UNK:BSMessageReader-1.run-> message read: HEARTBEAT_CONF[length=4,invokeId=130946]
1492023: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-VOIP_OPERATION_ERROR:VOIP subsystem operation error: Module Name=LRMConnection.readMSG: lrmHost: 172.17.16.10 , lrmPort: 3000,A specific description for a trace= error is: ,Exception=java.io.EOFException
1492024: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:java.io.EOFException
1492025: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at java.io.DataInputStream.readInt(DataInputStream.java:375)
1492026: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.spanlink.lrm.io.CRSLRMInputStream.readINT(CRSLRMInputStream.java:211)
1492027: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.spanlink.lrm.io.CRSLRMInputStream.readMHDR(CRSLRMInputStream.java:356)
1492028: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.spanlink.VOIPMonitor.subsystem.LRMConnection.readMSG(LRMConnection.java:156)
1492029: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.spanlink.VOIPMonitor.subsystem.LrmVoipManager.recovery(LrmVoipManager.java:444)
1492030: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.spanlink.VOIPMonitor.subsystem.VoipServerHeartbeatThread.run(VoipServerHeartbeatThread.java:50)
1492031: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.cisco.executor.impl.ExecutorStubImpl$RequestImpl.runCommand(ExecutorStubImpl.java:690)
1492032: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.cisco.executor.impl.ExecutorStubImpl$RequestImpl.run(ExecutorStubImpl.java:486)
1492033: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.cisco.executor.impl.ExecutorStubImpl$RequestImpl.run(ExecutorStubImpl.java:762)
1492034: Jul 24 14:53:20.196 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:776)
1492035: Jul 24 14:53:20.196 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.cisco.executor.impl.PooledExecutorStubImpl$1$WorkerImpl.run(PooledExecutorStubImpl.java:99)
1492036: Jul 24 14:53:20.196 GST %MIVR-SS_VOIPMON_SRV-3-EXCEPTION:    at com.cisco.util.ThreadPoolFactory$ThreadImpl.run(ThreadPoolFactory.java:853)
1492037: Jul 24 14:53:20.196 GST %MIVR-SS_VOIPMON_SRV-3-VOIP_OPERATION_ERROR:VOIP subsystem operation error: Module Name=LrmVoipManager.recovery,A specific description for a trace=fail to read OPEN_CONF msg returned is null,Exception=
1492038: Jul 24 14:53:20.781 GST %MIVR-CFG_MGR-7-UNK:BSSession:heartbeat::BSSocket.nextInvokeId=130947
1492039: Jul 24 14:53:20.781 GST %MIVR-CFG_MGR-7-UNK:BSMessageWriter-1.writeMessage-> about to write bootstrap message = HEARTBEAT_REQ[length=-1,invokeId=130947]
1492040: Jul 24 14:53:20.828 GST %MIVR-CFG_MGR-7-UNK:BSMessageReader-1.run-> message read: HEARTBEAT_CONF[length=4,invokeId=130947]
1492041: Jul 24 14:53:25.784 GST %MIVR-CFG_MGR-7-UNK:BSSession:heartbeat::BSSocket.nextInvokeId=130948
1492042: Jul 24 14:53:25.785 GST %MIVR-CFG_MGR-7-UNK:BSMessageWriter-1.writeMessage-> about to write bootstrap message = HEARTBEAT_REQ[length=-1,invokeId=130948]
1492043: Jul 24 14:53:25.825 GST %MIVR-CFG_MGR-7-UNK:BSMessageReader-1.run-> message read: HEARTBEAT_CONF[length=4,invokeId=130948]

SIVANESAN R
Accepted Solutions
Community Member

LRM maintains the address of the active LRM as a key in the LDAP database (Calabrio's). All services and clients use this key to determine which LRM server they have to connect to.

So when the VoIP Monitor subsystem (VoIP Mon SS) looks up that key in its own server's LDAP database and then fails to connect to the server named in the key, either the key holds the IP address of the inactive LRM server (the slave's) or there are network connectivity/firewall issues between the two servers.

There could be several causes: issues in the LDAP database, no update from CVD, or simple connectivity issues.
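To rule out the plain connectivity case, a quick TCP probe against the LRM host and port reported in the log can help. The following is only a minimal sketch: the helper name `can_connect` is made up for illustration, and it has to be run from a machine with Python on the same network path, which may not see the same firewall rules as the UCCX nodes themselves. A successful connect also says nothing about whether that host is the active LRM.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call, using the lrmHost/lrmPort pair from the log in this thread:
# can_connect("172.17.16.10", 3000)
```

If the probe succeeds but the subsystem still logs the EOFException, the stale-key explanation above becomes the more likely one.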

For now, do the following step by step:

1) Restart the LRM service on the Master Node (Node 1)

2) Restart the LRM service on the Slave Node (Node 2)

3) Restart the LDAP Monitor on the Master Node

4) Restart the LDAP Monitor on the Slave Node

5) Restart the CCX Engine on the Master Node. (Note: this will fail the mastership over from Node 1 to Node 2.)

6) Once the Engine is up on Node 1, restart the CCX Engine on Node 2 to fail the mastership back to Node 1.

This should ensure that the key holding the active LRM server's address is reset on both nodes (assuming there are no issues in CVD; otherwise you would need to reboot both servers after performing the above steps).
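The ordering matters here: each restart should finish before the next one starts. As a rough sketch of that sequencing only, the loop below walks the six steps in order and waits for each service to come back. The `restart` and `is_in_service` callables are hypothetical placeholders that the caller must supply (for example, wrappers around whatever admin interface is in use), and the strings in `RESTART_ORDER` are labels taken from the steps above, not CLI service names.

```python
import time

# Order taken from the six steps above; labels only, not CLI service names.
RESTART_ORDER = [
    ("LRM service", "Node 1"),
    ("LRM service", "Node 2"),
    ("LDAP Monitor", "Node 1"),
    ("LDAP Monitor", "Node 2"),
    ("CCX Engine", "Node 1"),
    ("CCX Engine", "Node 2"),
]


def restart_in_order(steps, restart, is_in_service, poll_seconds=10.0, max_polls=60):
    """Restart each (service, node) in turn, waiting for it before moving on.

    `restart` and `is_in_service` are hypothetical callables supplied by the
    caller; both take (service, node).
    """
    for service, node in steps:
        restart(service, node)
        for _ in range(max_polls):
            if is_in_service(service, node):
                break
            time.sleep(poll_seconds)
        else:
            # The inner loop ran out of polls without seeing the service up.
            raise TimeoutError(f"{service} on {node} did not come back in service")
```

Raising on a timeout stops the sequence at the failing step, so a later restart is never issued while an earlier service is still down.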

 

HTH

Prashant

 

Please rate helpful posts and mark the solved threads as answered to drive the content further.

5 REPLIES
Community Member

The following message:

%MIVR-SS_VOIPMON_SRV-3-VOIP_OPERATION_ERROR:VOIP subsystem operation error: Module Name=LRMConnection.readMSG: lrmHost: 172.17.16.10 , lrmPort: 3000,A specific description for a trace= error is: ,Exception=java.io.EOFException

suggests that the VoIP Mon SS on the subscriber is trying to reach the LRM master at 172.17.16.10 on port 3000, but the attempt fails.

By design, this service should point to the master server, not the slave.
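To see which LRM address a node is actually using, the `lrmHost`/`lrmPort` pair can be pulled out of the MIVR log. The snippet below is a small sketch that assumes the log lines follow the format quoted in this thread; the function name `lrm_targets` and the regular expression are illustrative, not part of any Cisco tooling.

```python
import re

# Matches the "lrmHost: <ip> , lrmPort: <port>" fragment seen in the MIVR log.
LRM_TARGET = re.compile(r"lrmHost:\s*(?P<host>[\d.]+)\s*,\s*lrmPort:\s*(?P<port>\d+)")


def lrm_targets(lines):
    """Return the set of (host, port) pairs found in VOIP_OPERATION_ERROR lines."""
    targets = set()
    for line in lines:
        if "VOIP_OPERATION_ERROR" not in line:
            continue
        match = LRM_TARGET.search(line)
        if match:
            targets.add((match.group("host"), int(match.group("port"))))
    return targets


sample = (
    "1492023: Jul 24 14:53:20.195 GST %MIVR-SS_VOIPMON_SRV-3-VOIP_OPERATION_ERROR:"
    "VOIP subsystem operation error: Module Name=LRMConnection.readMSG: "
    "lrmHost: 172.17.16.10 , lrmPort: 3000,A specific description for a trace= "
    "error is: ,Exception=java.io.EOFException"
)
print(lrm_targets([sample]))  # {('172.17.16.10', 3000)}
```

Comparing the extracted address with the IP of the current master shows whether the subsystem is pointing at the wrong node.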

Can you confirm whether this server, 172.17.16.10, is the Master or the Slave server? (Not Pub/Sub.)

Community Member

Hi, thanks for your response. 172.17.16.10 is the Subscriber, which is acting as the slave.

No idea why it is pointing to the slave. The issue occurs only on the Subscriber node (the slave).

SIVANESAN R

Community Member

Thanks for your valuable input. I restarted the UCCX server a couple of times, after which it is working fine on both nodes.

SIVANESAN R
Community Member

I just wanted to comment that this worked for me, and I greatly appreciate the step-by-step instructions and the explanation of what was occurring. I had looked at the MIVR log and saw the IP the Subscriber was using, BUT the log does not clue you in that the wrong IP is the issue.

I did not have to restart the UCCX servers. Restarting the services one at a time, waiting for each restart to complete before moving on to the next server or service, worked for me.

Too bad the log message could not say that the error is caused by the server not using the Master IP.

Cisco's documentation is seriously lacking, in their guides as well as in the logs.

SS_VOIPMON_SRV-3-VOIP_OPERATION_ERROR:VOIP subsystem operation error: Module Name=LRMConnection.readMSG: lrmHost: <lists IP>

THANKS !!
