CUCM 6.X and temp fail on phones

Unanswered Question
Jun 2nd, 2009

We have been dealing with this issue for a month and TAC still hasn't been able to pinpoint it. We upgraded from CUCM 4.x to 6.x. The only other change was that we also upgraded IOS on all MGCP voice gateways to 12.4(21)a. Sporadically, while on a call, users get a "temp fail" message on their phone. The call only drops if the user is a UCCX agent; for a regular user the call stays up. We have downgraded IOS on the gateways and nothing changed. The CUCM traces are not giving any info. We captured a sniffer trace (see attached).


From the trace logs, there is an error:

Error:

SDL Logs:

2009/06/01 10:36:28.481| 002| AlarmErr | | | | | | AlarmClass: CallManager, AlarmName: SDLLinkOOS, AlarmSeverity: Error AlarmMessage: , AlarmDescription: SDL link to remote application out of service., AlarmParameters: LocalNodeId:2, LocalApplicationID:100, RemoteIPAddress:172.16.2.8, RemoteNodeID:3, RemoteApplicationID:100, LinkID:2:100:3:100, AppID:Cisco CallManager, ClusterID:StandAloneCluster, NodeID:CORP-SUB1,

Let's break it down:

AlarmName: SDLLinkOOS (SDL link out of service)

RemoteIPAddress:172.16.2.8 (this might be another CallManager node in your cluster; which server is it?)

LocalApplicationID:100 (this is the CallManager service on the local node; 100 is the same application ID reported for the remote side)

RemoteApplicationID:100 (this is the CallManager service on the remote node)
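As a rough illustration of the breakdown above, the alarm line can be split into its fields with a few lines of Python. This is only a sketch: the function name parse_alarm is made up for the example, and the field layout is assumed from the single SDL log line quoted in this thread, not from any documented trace format.

```python
import re

# The SDL log line quoted above, copied verbatim.
SAMPLE = (
    "2009/06/01 10:36:28.481| 002| AlarmErr | | | | | | AlarmClass: CallManager, "
    "AlarmName: SDLLinkOOS, AlarmSeverity: Error AlarmMessage: , "
    "AlarmDescription: SDL link to remote application out of service., "
    "AlarmParameters: LocalNodeId:2, LocalApplicationID:100, "
    "RemoteIPAddress:172.16.2.8, RemoteNodeID:3, RemoteApplicationID:100, "
    "LinkID:2:100:3:100, AppID:Cisco CallManager, "
    "ClusterID:StandAloneCluster, NodeID:CORP-SUB1,"
)


def parse_alarm(line):
    """Return the timestamp, alarm name and AlarmParameters as a dict.

    Hypothetical helper: assumes the layout seen in the sample line.
    """
    fields = {"timestamp": line.split("|", 1)[0].strip()}

    name = re.search(r"AlarmName:\s*(\w+)", line)
    fields["AlarmName"] = name.group(1) if name else None

    # Everything after "AlarmParameters:" is a comma-separated key:value list.
    _, _, params = line.partition("AlarmParameters:")
    for item in params.split(","):
        # Split on the first colon only, so LinkID keeps its 2:100:3:100 value.
        key, sep, value = item.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    for key, value in parse_alarm(SAMPLE).items():
        print(f"{key}: {value}")
```

Running this over every SDLLinkOOS line in a trace makes it easier to see whether the alarms always point at the same RemoteIPAddress and RemoteNodeID.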

This link will show you some of the output and recommendations:

http://partnerwiki.cisco.com/ViewWiki/index.php/CallManager_Event_Logs

Error Message: %CCM_CALLMANAGER-CALLMANAGER-3-SDLLinkOOS: SDL link to the remote application is out of service. Remote IP address of remote application [String], Unique Link ID. [String], Local node ID [UInt], Local Application ID. [Enum], RemoteNodeID [UInt], Remote application ID.[Enum]

Explanation-This alarm indicates that the local Cisco CallManager has lost communication with the remote Cisco CallManager. This alarm usually indicates network errors or a nonrunning remote Cisco CallManager.

Recommended Action-Investigate why the remote Cisco CallManager does not run or whether a network problem exists.

The other section of this error worries me:

ClusterID:StandAloneCluster, NodeID:CORP-SUB1

This subscriber is in its own cluster???

Maybe something in your cluster groups got changed during the upgrade from 4.x to 6.x. Perhaps CORP-SUB1 is orphaned outside the cluster for some reason. Might be worth looking at.

kiru Tue, 06/02/2009 - 11:24

All 3 servers are part of the same CM group. Is there anything else I should be verifying? What should the ClusterID say?

kiru Tue, 06/02/2009 - 12:45

That is Sub2, which is located away from the other two servers, which are at the colo.

kiru Tue, 06/02/2009 - 12:48

Yes, I saw the above link, but it just isn't helping. Thanks for your reply.

Dennis Mink Wed, 06/03/2009 - 18:37

We had the same issue.

The temp fail happened at the same time as the SDL Link OOS alarm. We also have a multisite cluster.

The SDL link is used for intra-cluster signalling between the CallManager nodes of a cluster.

We had issues on the link between the two data centers (high CPU and no QoS). After we resolved these, the SDL Link OOS alarms disappeared, and so did the temp fail. I would focus my attention on that.
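One way to check whether the temp fail messages really line up with SDL Link OOS alarms is to compare the alarm timestamps from the SDL trace with the times users reported the problem. The sketch below is only an illustration: the function names, the 30-second window and the report times in the usage part are assumptions, and the timestamp format is taken from the single log line quoted earlier in the thread.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y/%m/%d %H:%M:%S.%f"  # format assumed from the quoted SDL line


def alarm_times(lines, alarm_name="SDLLinkOOS"):
    """Collect the timestamps of all trace lines carrying the given alarm."""
    times = []
    for line in lines:
        if f"AlarmName: {alarm_name}" not in line:
            continue
        stamp = line.split("|", 1)[0].strip()
        times.append(datetime.strptime(stamp, STAMP_FORMAT))
    return times


def correlate(report_times, alarms, window_seconds=30):
    """Map each user report to the alarms within +/- window_seconds of it."""
    window = timedelta(seconds=window_seconds)
    return {
        report: [alarm for alarm in alarms if abs(alarm - report) <= window]
        for report in report_times
    }


if __name__ == "__main__":
    # Shortened trace line based on the one in this thread; report times are made up.
    trace = [
        "2009/06/01 10:36:28.481| 002| AlarmErr | | AlarmName: SDLLinkOOS, "
        "AlarmParameters: RemoteIPAddress:172.16.2.8,",
    ]
    reports = [
        datetime(2009, 6, 1, 10, 36, 40),  # hypothetical temp fail report
        datetime(2009, 6, 1, 11, 0, 0),    # hypothetical temp fail report
    ]
    for report, hits in correlate(reports, alarm_times(trace)).items():
        print(report, "->", len(hits), "alarm(s) nearby")
```

If most reports have an alarm nearby, the inter-site link is the place to look; if they do not, the temp fail probably has another cause.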
