Cisco Support Community
cancel
Showing results for 
Search instead for 
Did you mean: 
Announcements
Users might experience few discrepancies in Search results. We are working on this on our side. We apologize for the inconvenience it may have caused.
New Member

SPA-112 periodically loses registration

We have 2 SPA-112 units (running the latest 1.3.3(015) firmware) which are experiencing loss of registration on a regular basis.   The web interface shows that the line is off-hook, and that it is registered.  However, there is no dial tone on the line, and the SIP registrar (a Mitel 3300) shows that the line is out of service.  The telltale sign of this problem is that the "Next Registration In:" field in the Voice tab is blank.  Attached is the SIP debug log of an event - sometime during that window, the SPA lost registration.

1 ACCEPTED SOLUTION

Accepted Solutions
New Member

I had a very long back-and

I had a very long back-and-forth with Cisco small business support.  They recommended the firmware which was released just a few days ago (1.3.5).  I upgraded the firmware, and also applied a patch to the 3300 that Mitel released just for us.  We have not experienced the problem since.

16 REPLIES
VIP Gold

The attached debug log seems

The attached debug log seems not to confirm your claiming. The important lines of them are:

 --------------------------------------------

 

  1. 11:14:50 | [0]Off Hook
  2. 11:14:59 | Calling:918885651662@10.10.0.17:0, rc=0
  3. 11:15:02 | CNG tone detected
  4. 11:15:05 | NEW_CALL_STATE(), call 0: old state = CC_CST_CALLING, new state CC_CST_PROCEEDING
  5. 11:15:06 | NEW_CALL_STATE(), call 0: old state = CC_CST_PROCEEDING, new state CC_CST_CONNECTED
  6. 11:15:55 | [0]On Hook
  7. 11:20:54 | [0]Off Hook
  8. 11:21:03 | Calling:918885651662@10.10.0.17:0, rc=0
  9. 11:21:06 | CNG tone detected
  10. 11:21:09 | NEW_CALL_STATE(), call 0: old state = CC_CST_CALLING, new state CC_CST_PROCEEDING
  11. 11:21:10 | NEW_CALL_STATE(), call 0: old state = CC_CST_PROCEEDING, new state CC_CST_CONNECTED
  12. 11:21:52 | [0]On Hook
  13. 11:22:26 | End of Log

 --------------------------------------------

 

According the log, there has been outgoing fax call from 11:14:50 to 11:15:55 then second outgoing fax call from 11:20:54 to 11:21:52. Both calls has been connected to caller. It mean that from 11:14:50 to 11:21:52 (e.g. almost complete debug window) the line has NOT been out of service. It has been in use.

 

OK, the attached log can't help us so much with the issue analysis.

 

Please catch the SIP packets between your ATA and Mitel. We need to see registration requests send and related responses from Mitel. I wish it will help to solve the problem.

 

 

New Member

Thanks, Dan!I will set up a

Thanks, Dan!

I will set up a wireshark trace.  In the meantime, here's another log file that definitely covers the time period when the SPA lost registration.

New Member

Please see the attached

Please see the attached wireshark trace.  There are some interesting things here: several 'malformed packet' instances.  During the trace, the device lost registration.

VIP Gold

There is no sucessfull

There is no sucessfull registration catched in the file. SO I can't decide if registration has been lost prematurely or it has just expired.

Also, all files you supplied looks simmilar - they cover active T.38 fax call. No other type of call, no phone in idle state. Should I assume the issue manifests only during such type of call ? You didn't mention such condition ...

 

New Member

The SIP boxes in question

The SIP boxes in question only handle fax calls.  At the beginning of the time represented in the trace, the box was registered.  At the end, it was not registered, at least from the point of view of the registrar.  I will do another trace.

VIP Gold

It should start at REGISTER

It should start at REGISTER time (including the register itself) and should not end before the phone become unregistered AND active call has ended. It's not necesarry to catch the RTP  ...

New Member

OK, I've got another capture

OK, I've got another capture in progress.  The problem is that from the SPA's perspective, the call is still in progress.  The web interface shows "off hook".

VIP Gold

If SPA claim off-hook, then

If SPA claim off-hook, then fax is holding the line off-hook. The fax need to hang up the line. It can't be done by SPA.

New Member

I've attached a capture where

I've attached a capture where the phenomenon occurred, but this time the SPA was on-hook but unregistered.  .40 is the SPA, .17 is the SIP registrar.

VIP Gold

Lets analyze. Both lines has

Lets analyze. Both lines has been registered successfully (packets #13-#20) with expiration set to 3600seconds.

Registration has been renewed about 3570s later (packets #21-#28).

The registration has been successfully renewed so many times then. Successful incoming as well as outgoing calls has occurred several times in meantime.

Important part seems to start at #544. Incoming call to 774201 is diverted to 628 because of "user busy". But previous call has been disconnected so long time ago (#388-#407). So why the 774201 is busy ? Syslog&debug packets are not included in the packet dump, but I assume the fax is holding the line off-hook with no apparent reason. Similar session occurred three times.

But it doesn't affect the registration so we can ignore it.

No interesting event happened up to packet #1001.  A device with IP 10.10.0.24 seems to flood the SPA by suspicious (and possibly malformed) T.38 UDPTL packets.

Not registration is still not lost, so even this can be ignored.

Mos suspicious part is starting at packet #1231. External PBX is attempting to setup incomming call, but upstream PBX decided to cancel it in ringing phase. SPA doesn't respond to such attempt. It's not so good, but it should not harm - upstream PBX should consider the dialog expired.

From packet #1245 we see attempt to setup another incoming call - such requests are not responded from SPA's side. The device seems to be dead. Although previous registration is not expired yet, the device is in wrong state and may claim anything - including it's not registered.

 

Well, what's now ?

 

I see several suspicious event that may cause device's abend.

 

At the first, the device's configuration related to T.38 seems to be broken. T.38 reinvite is attempted during almost all calls, but they are unsuccessful. If you are wishing for T.38 fax mode, the issue needs to be solved. Ask your VoIP provider for help as we can't know why the T.38 is rejected by them. If no T.38 is supported then it should be deconfigured on SPA's side.

At the second, there is a flood of packets from .24 node. The source and purpose of such packets needs to be identified. It may confuse the SPA.

Well, if it will not help, there is "at the third" paragraph. Call cancelled during RINGing phase of incoming call is latest event before the SPA become frozen. It should not be so hard to verify it cause the problem or not. Just call the SPA, don't allow fax to go on-hook, cancel the call during ringing phase. If you confirm it freeze the SPA you discovered the firmware bug. Downgrade may help then.

Note - don't try "third" unless "first" and "second" issue become solved.

Hope it help ...

 

 

 

New Member

Dan, this is an utterly

Dan, this is an utterly fantastic analysis.

I'm going to go through your advice bit by bit.  One note: the .24 node is the on-site SIP proxy, which is a Mitel Border Gateway.

Thank you!

 

 

VIP Gold

It required about two hours

It required about two hours of analysis (more detailed in first version, but I deleted unrelated parts then), so it SHOULD be considered useful ;-)

Start with T.38 - just try not to use it even in the case you wish to use it. You may reactivate it later.

Note the SPA112 is not state-of-the-art device even with the latest firmware. The abandoned PAP2T seems to be still more reliable and stable than SPA112. So try not to push the SPA112 to the edges.

And rate useful responses ;-)

 

New Member

For interest's sake, here's

For interest's sake, here's the SIP log from the device at the time of packet #1231 and #1245:

Aug 25 09:41:09 10.10.0.41  ################# CC: running... ############
Aug 25 09:41:53 10.10.0.40  ################# CC: running... ############
Aug 25 09:46:09 10.10.0.41  ################# CC: running... ############
Aug 25 09:46:52 10.10.0.40  ################# CC: running... ############
Aug 25 09:47:39 10.10.0.40 CC_eventProc(), event: CC_EV_SIG_CALL_ARRIVED(0x31), lid: 1, par: 0, par2: 0x407b9900
Aug 25 09:47:39 10.10.0.40 AUD_ccEventProc: event 49 vid 1 par 0x0 par2 0x407b9900
Aug 25 09:47:39 10.10.0.40 CC_lineBusy()
Aug 25 09:47:39 10.10.0.40 getCodecList line 0x22cc04 call 0x22cc08 clRemote: 0x0 bInbound 1
Aug 25 09:47:39 10.10.0.40 pconly: 0
Aug 25 09:47:39 10.10.0.40 ====== Local codec num 4 ======
Aug 25 09:47:39 10.10.0.40 0
Aug 25 09:47:40 10.10.0.40 130
Aug 25 09:47:40 10.10.0.40 8
Aug 25 09:47:40 10.10.0.40 18
Aug 25 09:47:40 10.10.0.40 ================================
Aug 25 09:47:40 10.10.0.40 [AUD]Get UCH node for AUD_LINE 1 0.
Aug 25 09:47:40 10.10.0.40 uchAllocateNode(), Node 0 allocated ret=0
Aug 25 09:47:41 10.10.0.40 [AUD]UCH node 0 allocated to AUD_LINE 1.
Aug 25 09:47:41 10.10.0.40 uchConnectEpToNode(), connecting EP FXS 2 to node 0
Aug 25 09:47:41 10.10.0.40 uchEnableNode(), Node 0 enbaled ret=0
Aug 25 09:47:41 10.10.0.40 CC_eventProc(), inf.strName = ONT GOVT
Aug 25 09:47:41 10.10.0.40 CC_eventProc(), inf.strPhone = 4163142695
Aug 25 09:47:41 10.10.0.40 callEventProcTable[0] is cepIdleProc
Aug 25 09:47:41 10.10.0.40 cepIdleProc(lid=1, call=0x22cc08, event=18(CC_EV_USR_ACCEPTCALL), par=0, par2=0x407b9900)
Aug 25 09:47:42 10.10.0.40 cepIdleProc(), lid=1
Aug 25 09:47:42 10.10.0.40 cepIdleProc(), line->sigProc(CC_CMD_ACCEPT)
Aug 25 09:47:42 10.10.0.40 cepIdleProc(), call->cinf.bAutoAnswer = 0
Aug 25 09:47:42 10.10.0.40 NEW_CALL_STATE(), call 0: old state = CC_CST_IDLE, new state CC_CST_RINGING
Aug 25 09:47:42 10.10.0.40 CID_initGen 8
Aug 25 09:47:42 10.10.0.40 [1]CID:CID_initGen() >>> offhook 0 delay 2200 phone 4163142695 name ONT GOVT
Aug 25 09:47:42 10.10.0.40 SLIC_startRing state 0 ts 0x25b18con 2000 off 4000 len 60000
Aug 25 09:47:43 10.10.0.40 [1]Ring cad event 0 pol 0
Aug 25 09:47:43 10.10.0.40 RTP_nextMediaPort(), port = 16476
Aug 25 09:47:43 10.10.0.40 RTP_nextMediaPort(), rc=16474
Aug 25 09:47:43 10.10.0.40 AUD_allocCallObj() call(0x277360)
Aug 25 09:47:43 10.10.0.40 [1:0]AUD ALLOC CALL (port=16474)
Aug 25 09:47:43 10.10.0.40 +++++ SIP_lineCcCmdProc CC_CMD_ACCEPT AUD_startRtpRx
Aug 25 09:47:43 10.10.0.40 [AUD]AUD_startRtpRx(0x277360) lid 1
Aug 25 09:47:44 10.10.0.40 Local loopback mode: None. Type: None.
Aug 25 09:47:44 10.10.0.40 Remote loopback mode: None. Type None.
Aug 25 09:47:44 10.10.0.40 RTP channel setup: udp_no_checksum 0, sysmmetric_rtp 0, tos 0xb8, cos 6, mlb 0.
Aug 25 09:47:44 10.10.0.40 uchConnectEpToNode(), connecting EP VoIP 0 to node 0
Aug 25 09:47:44 10.10.0.40 uchEnableEchoCan(), lid 1 EP 1 enable
Aug 25 09:47:44 10.10.0.40 uchEnableModemCall() Modem call state(0) not change
Aug 25 09:47:44 10.10.0.40 UCH sync parameter hold off time is 70
Aug 25 09:47:45 10.10.0.40 Set QoS succeed
Aug 25 09:47:45 10.10.0.40 uchSetGTD(), Disable GTD for FXS 2  
Aug 25 09:47:45 10.10.0.40 uchSetGTD() GTD state not change
Aug 25 09:47:45 10.10.0.40 uchSetDTMFMute(), ENABLE
Aug 25 09:47:45 10.10.0.40 cordless_start_rtp(), chan:0 remote ip:(null) port:0 local:16474 rx:1 ipt:0 ptime:30 bInMdmPasstru:0
Aug 25 09:47:45 10.10.0.40 Starting Rx only RTP.
Aug 25 09:47:45 10.10.0.40 Socket 0 bound to port 16474.
Aug 25 09:47:46 10.10.0.40 Remote IP/port: 0.0.0.0:0
Aug 25 09:47:46 10.10.0.40 set RTP_SESSION_OPT_DTMF
Aug 25 09:47:46 10.10.0.40 VAD = 0
Aug 25 09:47:46 10.10.0.40 RTP configuration:
Aug 25 09:47:46 10.10.0.40   audio_mode RTP_MODE_REC_ONLY, media_loop_level RTP_LOOP_LEVEL_NONE, dtmf2833numEndPakcets 3, opts 0x0
Aug 25 09:47:46 10.10.0.40   Codec: duration 30, rx_pt_event 101, tx_pt_event -1, tx_pt 0  
Aug 25 09:47:47 10.10.0.40          rx[0] 0 PCMU/8000, rx[1] 8 PCMA/8000, rx[2] 2 G.726/8000
Aug 25 09:47:47 10.10.0.40          rx[3] 18 G.729/8000, rx[4] 9 G.722/16000, rx[5] 0  
Aug 25 09:47:47 10.10.0.40   Jib: max 180ms, min 60ms, adapt 1
Aug 25 09:47:47 10.10.0.40 RTP Channel 0 is virgin: 1.
Aug 25 09:47:47 10.10.0.40 #### rtp seq number is 5067  
Aug 25 09:47:47 10.10.0.40 RTP session 0 started
Aug 25 09:47:47 10.10.0.40 [AUD]RTP Rx Up
Aug 25 09:51:09 10.10.0.41  ################# CC: running... ############
Aug 25 09:56:09 10.10.0.41  ################# CC: running... ############
Aug 25 10:01:09 10.10.0.41  ################# CC: running... ############ 

VIP Gold

It conform with analysis

It conform with analysis results.

We can see the Call Control is aware of incoming call (CC_EV_SIG_CALL_ARRIVED) , call has entered RINGING Call STate (CC_CST_RINGING ). Also, rx-only early-audio channel has been established (Starting Rx only RTP and bellow).

There is nothing related to neither CANCEL nor INVITE ...

Unfortunately, there is nothing that can help us with identification of device freezing reason as well - so no new things nor advices will follow ;-)

 

New Member

I had a very long back-and

I had a very long back-and-forth with Cisco small business support.  They recommended the firmware which was released just a few days ago (1.3.5).  I upgraded the firmware, and also applied a patch to the 3300 that Mitel released just for us.  We have not experienced the problem since.

VIP Gold

I read 1.3.5 release notes

I read 1.3.5 release notes and I found nothing relevant. But not all changes are documented. I have no information what's Mitel's patch about. So we don't know what bug has been solved (nor who solved the bug).

If you will catch SIP and syslog&debug communication on patched network it may reveal what's changed. Despite I have no Mitel switch around I'm curious - but it's up to you, of course.

500
Views
49
Helpful
16
Replies
CreatePlease to create content