I have an intercluster trunk between 2 CallManagers: one is version 4.1(3)sr2 and the other is 4.1(2)sr1. The issue is that when a call is established, after 5 minutes both sides receive a busy tone and the call is lost.
It is an Inter-Cluster Trunk (Non-Gatekeeper Controlled).
Could you please post the detailed CCM trace files for both clusters from when a call is dropped?
What is the topology between the 2 clusters?
Do all calls drop exactly after 5 minutes (5:00)?
Does it happen during an active call, while the UDP stream is active?
- The connection between the 2 CallManager clusters is over a WAN with 3 E1s in an MPLS environment.
- All calls are lost exactly after 5 minutes.
- The UDP stream is active for the whole call; after 5 minutes the end user receives a busy tone.
I attach the trace file.
I see that you dialed 3888.
The CCM trace is edited (TCP handle). I need the complete detailed CCM trace for both clusters.
I see 2 OpenReceiveChannelAck messages from the phone to the CCM (abnormal), and then I see that this phone hangs up at the time shown:
02/16/2006 23:52:23.587 CCM|StationInit: (0000832) OpenReceiveChannelAck Status=0, IpAddr=0x8e0210ac, Port=16660, PartyID=67110209|<:STANDALONECLUSTER><:172.16.1.11><:4><:172.16.2.142><:SEP00137F163AC6>
02/16/2006 23:52:24.868 CCM|StationInit: (0000832) OpenReceiveChannelAck Status=0, IpAddr=0x8e0210ac, Port=16786, PartyID=67110210|<:STANDALONECLUSTER><:172.16.1.11><:4><:172.16.2.142><:SEP00137F163AC6>
02/16/2006 23:57:36.625 CCM|StationInit: (0000832) OnHook.|<:STANDALONECLUSTER><:172.16.1.11><:4><:172.16.2.142><:SEP00137F163AC6>
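To check the timing in the three trace lines above, here is a minimal Python sketch that parses the timestamps and measures the gap between the first OpenReceiveChannelAck and the OnHook. The regular expression and the helper names (`parse_events`, `decode_ip`) are my own, not part of any Cisco tool, and the little-endian reading of the `IpAddr` field is an assumption based on the fact that it then matches the phone address 172.16.2.142 shown in the same lines.

```python
import re
import socket
import struct
from datetime import datetime

# The three trace lines quoted above, trimmed to the part before the
# trailing |<:...> fields.
TRACE = """\
02/16/2006 23:52:23.587 CCM|StationInit: (0000832) OpenReceiveChannelAck Status=0, IpAddr=0x8e0210ac, Port=16660, PartyID=67110209
02/16/2006 23:52:24.868 CCM|StationInit: (0000832) OpenReceiveChannelAck Status=0, IpAddr=0x8e0210ac, Port=16786, PartyID=67110210
02/16/2006 23:57:36.625 CCM|StationInit: (0000832) OnHook.
"""

# Hypothetical pattern: timestamp, then the StationInit message name.
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) CCM\|StationInit: \(\d+\) (\w+)"
)


def parse_events(text):
    """Return a list of (timestamp, message_name) tuples."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S.%f")
            events.append((stamp, match.group(2)))
    return events


def decode_ip(hex_value):
    """Decode the IpAddr field, assuming little-endian byte order."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_value, 16)))


events = parse_events(TRACE)
first_ack = next(ts for ts, name in events if name == "OpenReceiveChannelAck")
on_hook = next(ts for ts, name in events if name == "OnHook")
elapsed = (on_hook - first_ack).total_seconds()

print(f"Seconds from first ack to OnHook: {elapsed:.3f}")
print(f"Decoded IpAddr: {decode_ip('0x8e0210ac')}")
```

The gap comes out at a little over 5 minutes, which fits the reported symptom.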
Which firmware are you using for the phones?
The 5 minutes sounds like H.323 keepalives are missing on the link. H.323 will clear calls if it doesn't receive keepalives: it will trip the 5 minute timer and drop the call, thinking that the other side has gone down. Are you running NAT or do you have a PIX between the clusters? My guess is that the keepalives are not making it from one cluster to the other.
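For readers unfamiliar with keepalives at the TCP level, this is a generic Python sketch of turning the keepalive option on for a socket. It is only an illustration of the operating-system feature, not how CallManager implements it, and the function name `make_keepalive_socket` is made up for this example.

```python
import socket


def make_keepalive_socket():
    """Create a TCP socket with the OS-level keepalive option enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the TCP stack to probe an idle connection periodically. How long
    # it waits before the first probe is an OS setting, not set here.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


sock = make_keepalive_socket()
# getsockopt returns a non-zero value when the option is on.
enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print("keepalive enabled:", enabled)
```

If a device in the path drops or ignores these probes, the side waiting for them can conclude the peer is gone even though the media stream is still flowing.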
As Steven explained, this is most likely a timer issue.
You may also check: CSCeg62469
One option is to increase the KeepAliveTimer.
The registry setting is:
On my 4.1(3)sr2 server the default is 300,000.
Deleting this setting will go back to the Windows 2000 default of 2 hours.
(Cisco OS version 2000.2.4 and earlier)
Or it could be increased to even longer than 2 hours if needed. The setting is in milliseconds.
This does open up the box to this security vulnerability: "An attacker who was able to connect to network applications could cause a DoS condition by establishing numerous connections."
Try the workaround and let us know the results. You need to reboot CallManager for the change to take effect.
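Since the setting is in milliseconds, a quick conversion helps when choosing a new value. The sketch below only does the arithmetic for the figures mentioned in this thread (300,000 and the 2-hour Windows 2000 default); the helper names are mine and it does not touch the registry.

```python
from datetime import timedelta


def ms_to_timedelta(milliseconds):
    """Convert a millisecond setting into a readable duration."""
    return timedelta(milliseconds=milliseconds)


def timedelta_to_ms(duration):
    """Convert a duration back into the millisecond value to enter."""
    return int(duration.total_seconds() * 1000)


current = ms_to_timedelta(300_000)
two_hours_ms = timedelta_to_ms(timedelta(hours=2))

print("300,000 ms is", current)            # 0:05:00
print("2 hours in ms is", two_hours_ms)    # 7200000
```

So the 300,000 value corresponds to exactly 5 minutes, the same interval after which the calls drop.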
Any NAT between the CCM servers?
What is the current value for:
Allow TCP KeepAlives For H323, under CCM service parameters? (The default is True; change it to False and give it a try.)
(Not sure if that parameter was added to 4.1(2)sr1.)
Let us know.
I looked up that keepalive parameter in the registry; the setting is 300,000.
There is no NAT between the clusters, but there is a firewall.
The Allow TCP KeepAlives For H323 service parameter does not exist in CallManager 4.1(2)sr1.
Note: When I make a local PSTN call over the same WAN using H.323 gateways located in the other cluster, the call stays up the whole time.