CUCM won't failover from first to second SIP trunk in route list

Nov 9th, 2009

We are running CUCM 6.1(2)2000-1. We have two Cisco CUBEs running 12.4(20)T4 with the adenterprisek9_IVS-mz feature set. Call processing works fine with both CUBEs. However, when we do a failover test, RTMT shows that CUCM continues to try the first SIP trunk in the route list, no matter which CUBE is listed first. I have tried placing both SIP trunks in the same route group with a top-down algorithm, and I have tried placing the SIP trunks in separate route groups in the route list, with the same results. In the enterprise parameters, "Retry Count for SIP Invite" was set to 6, but we waited two to three minutes for failover to occur and it never did. Since the initial test I have changed this parameter to 2, which should decrease the time to fail over. We have a similar setup with another customer, but they are using CUCM 7.x and are experiencing no problems. I have been unable to find a bug for this in the Bug Toolkit so far. What could be causing this issue?

jeffrey.girard Tue, 11/10/2009 - 18:47

Take a look at the service parameters for Route Plan (in CUCM 7 it is Clusterwide Parameters (Route Plan)).

There are flags there that control what CUCM does when a call is rejected with an unallocated-number or user-busy cause. The default is to stop routing, so CUCM will never try to fail over to the next RG in the RL. You might want to take a look and see if there is something similar in 6.1.

Jeff

e.groves Tue, 11/24/2009 - 10:56

The parameter you mention is for intercluster trunk routing only. We are working with a single cluster that has SIP trunks to two redundant Cisco CUBEs. Thanks for the reply, and sorry for the delayed response.

Anthony Towsley Mon, 10/19/2015 - 17:06

I don't like seeing a thread with no definitive resolution, and this was the first result that came up when I searched on Google, so I am adding this in case anybody else hits it in their search.

jeffrey.girard was right. Even though that parameter refers to intercluster trunks, it is indeed the correct parameter to change the SIP trunk failover behavior in a route list. I was having no issues failing over from SIP to MGCP, but SIP trunk to SIP trunk (to different CUBEs) was failing on CUCM 9.1.

Setting the CUCM service parameter "Stop Routing on Unallocated Number Flag" to False allowed my SIP trunks to fail over in the route list.

I believe this behavior occurs because CUCM doesn't know the SIP trunk is down. I believe SIP-to-SIP failover will work in CUCM if OPTIONS pings are configured, so that CUCM knows when a SIP trunk is up or down. However, I haven't tested this in my lab.

chrisnoon11 Tue, 06/07/2016 - 10:26

Would this apply to SIP trunks that are in the same RG as well? I have two trunks in an RG, and when calls fail (error 404) on the first trunk, they are not routed to the second trunk. Would setting the parameter to False allow routing further down the RG?

Side note: Is there any reason why I shouldn't have them both in one RG?

iptuser55 Tue, 06/07/2016 - 10:36

The stop-routing settings cover busy and unavailable, which in Q.931 can arise from different causes (user busy, as with CFA, or a trunk being down), so the SIP response is different in each case. I think there is also a 404 timer in the service parameters that you need to check.

Anthony Towsley Tue, 06/07/2016 - 10:43

I don't see why it wouldn't apply to two trunks in an RG. It's still a matter of failing over to the next member.

Since you are receiving a 404, the trunk is being told that the number was not found, and CUCM is killing the call. Set the parameters below and then test. It should fail over.

SIP Trying Timer = 200

Retry Count for SIP Invite = 2

Stop Routing on Unallocated Number Flag = False
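
The effect of the flag on hunting, as described in this thread, can be sketched as a toy model. This is only an illustration of the reported behavior, not CUCM's actual routing logic; the function name `hunt`, the trunk names, and the response table are hypothetical.

```python
from typing import Dict, List, Optional


def hunt(trunks: List[str],
         responses: Dict[str, int],
         stop_on_unallocated: bool) -> Optional[str]:
    """Toy model: try each trunk in order and return the one that answers.

    `responses` maps a trunk name to the SIP status code it returns
    (hypothetical test data). A 404 is treated as "unallocated number".
    """
    for trunk in trunks:
        status = responses[trunk]
        if status == 200:
            return trunk  # call completed on this trunk
        if status == 404 and stop_on_unallocated:
            return None   # flag set: stop routing, never try the next trunk
        # otherwise fall through and try the next member
    return None


responses = {"CUBE-A": 404, "CUBE-B": 200}
print(hunt(["CUBE-A", "CUBE-B"], responses, stop_on_unallocated=True))
print(hunt(["CUBE-A", "CUBE-B"], responses, stop_on_unallocated=False))
```

With the flag set, the model gives up at the first trunk; with it cleared, the call moves on to the second trunk, which matches what was observed above.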

MARTIN STREULE Tue, 11/24/2009 - 11:53

I guess you have seen this already, but maybe you want to double-check:

http://www.cisco.com/en/US/products/sw/voicesw/ps556/products_configuration_example09186a008082d76a.shtml

This sounds interesting:

"the delay to retry increases as a geometric progression with a common ratio of 2 and a scale factor equal to the initial failover time."

"This works as designed and there is not a service parameter that you can change to alter the common ratio. However, you can change the initial delay to retry and the number of retries. This will lower the overall time to failover."

hth

e.groves Wed, 11/25/2009 - 06:44

Yes, I considered this. We initially had the retries set to 6, which adds up to 31.5 seconds before failover (500 ms + 1 s + 2 s + 4 s + 8 s + 16 s = 31.5 s). However, we waited much longer than this for failover to occur. I have since lowered the retry count to 2, so it should fail over after 1.5 seconds. My customer has been unable to set up a maintenance window for further testing, and I don't think anything I have done so far would have corrected this issue. As soon as we test again, I will post our status.
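
The arithmetic above can be checked with a short sketch. It assumes only what the quoted document states: the retry delay doubles on each attempt, starting from the initial timer. The function name `failover_delay_ms` is hypothetical and is not a CUCM API.

```python
def failover_delay_ms(initial_timer_ms: int, retry_count: int) -> int:
    """Sum of a geometric series with common ratio 2.

    Attempt 0 waits initial_timer_ms, attempt 1 waits twice that, and so on,
    for retry_count attempts in total.
    """
    return sum(initial_timer_ms * 2 ** attempt for attempt in range(retry_count))


print(failover_delay_ms(500, 6))  # 31500 ms, i.e. 31.5 s
print(failover_delay_ms(500, 2))  # 1500 ms, i.e. 1.5 s
```

Under the same assumption, an initial timer of 200 ms with a retry count of 2 would give 600 ms.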

Is anyone aware of any possible bugs in this version of CUCM that could cause this issue? I have searched the Bug Toolkit, but I know that some bugs are not published there and are visible to Cisco only.

James Hogan Wed, 12/01/2010 - 22:46

Did you ever find a fix for this? I am running into the same thing.

e.groves Thu, 12/02/2010 - 13:28

Yes. We opened a TAC case, but it did not resolve the issue. No bug was listed for this at the time, but we upgraded them from 6.1(2) to 6.1(4) and the issue was resolved.

iptuser55 Thu, 01/06/2011 - 03:47

Hi

We are testing a voice recorder that uses SIP trunks. Overflow from SIP TK1 to SIP TK2 works, but it fails when going on to SIP TK3. The URL given earlier shows an example with two SIP trunks and gateways and mentions two routes a couple of times. Is there a limit on the number of SIP trunks in an RL or RG?

Is this setting used when CUCM does not get a response back from the initial INVITE, much like a CFNA or failure setting for CTI route points? If the destination, in our case a third-party server, is shut off so that SIP registration is lost, is the failover between the SIP trunks instantaneous, i.e. the same as an E1/T1 overflow?

thanks
