Phones register to SRST even when WAN is UP

Answered Question
Jun 15th, 2009

I have an issue where IP phones register to my SRST reference even though the WAN is up and there are no issues. I see the following in the trace: Last=CM-closed-TCP. Can anyone tell me what this means? Has anyone come across this and fixed it?

Your help is greatly appreciated. Full output is shown below:

I'm also seeing TCPPID socket broken errors at roughly the same time. Any clues?

06/15/2009 16:08:32.167 CCM|StationInit: (0000000) alarmSeverity=2 text="14: Name=SEP0024C40B3949 Load= SCCP11.8-3-4SR1S Last=CM-closed-TCP" parm1=0(0) parm2=0(0).|<CLID::StandAloneCluster><NID::10.0.2.11><CT::2,100,126,1.37691063><IP::10.200.81.31><DEV::>
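For anyone collecting many of these alarm lines, a minimal sketch along the following lines could pull out the device name, firmware load and "Last=" cause so they can be counted per phone. The function name parse_reg_alarm and the regular expression are assumptions made for illustration; they are based only on the line format shown in this thread, not on any Cisco tool.

```python
import re

# Matches the "Name=... Load= ... Last=..." triple as it appears in the
# trace line quoted in this thread (format assumed from that sample only).
ALARM_RE = re.compile(
    r"Name=(?P<name>SEP[0-9A-Fa-f]{12})\s+"
    r"Load=\s*(?P<load>\S+)\s+"
    r"Last=(?P<last>[A-Za-z0-9-]+)"
)

def parse_reg_alarm(line):
    """Return a dict with name, load and last, or None if no alarm is found."""
    match = ALARM_RE.search(line)
    return match.groupdict() if match else None
```

Feeding it each trace or syslog line and tallying the "last" value per "name" gives a quick view of which phones are affected and with which cause.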

Correct Answer
parshah Mon, 06/15/2009 - 13:12

Paul,

What this means is that the TCP connection between the IP phone and the CUCM is getting reset or broken, causing the phone to think that CUCM is unavailable, so it registers with the local SRST.

In CUCM, please view the event viewer application logs through RTMT. You should see either Device Unregistered or Device Transient Connection messages for the MAC of one of the phones you are experiencing this problem with.

That message should include a reason code. Can you tell me what the reason code is?

Most likely, CUCM is closing the TCP connection because it is not receiving keepalives from the IP phone, which makes it think the phone or network is down, so it resets the connection.

Thanks,

paul_savage Tue, 06/16/2009 - 00:28

Hi, thanks for the response.

I see the following for the MAC address on the publisher:

Event Type: Error

Event Source: Cisco CallManager

Event Category: None

Event ID: 3

Date: 6/15/2009

Time: 3:52:50 PM

User: N/A

Computer: UCSTPUB01

Description:

Error: DeviceUnregistered - Device unregistered.

Device name.: SEP0024C40B3949

Device IP address.: 10.200.81.31

Device type. [Optional]: 369

Device description [Optional].: SEP0024C40B3949 NOX

Reason Code [Optional].: 8

App ID: Cisco CallManager

Cluster ID: StandAloneCluster

Node ID: 10.0.2.10

Explanation: A device that has previously registered with Cisco CallManager has unregistered. This event may be issued as part of normal unregistration event or due to some other reason such as loss of keepalives. In cases of SCCP phone normal unregistration with Reason Code 'CallManagerReset', the severity of this alarm is lowered to INFORMATIONAL.

Recommended Action: No action is required if unregistration of this device was expected..

On the SRST reference I see the following:

*Jun 15 15:08:17.577: %IPPHONE-6-REG_ALARM: 10: Name=SEP0024C40B3949 Load= SCCP11.8-3-4SR1S Last=TCP-timeout

I look forward to seeing what you think and thank you for responding

Kind regards

Paul

parshah Tue, 06/16/2009 - 14:03

So, you have Device Unregistered with a reason code of 8. This reason code means 'Device Initiated Reset'.

What this means is that the phone is not receiving keepalive acks from the CUCM.

Now, if this is happening to phones at one particular remote site, then I would say you have some issue with packet drops. If this is happening at the local site and to a lot of phones, then the issue could be in CUCM.

What I would suggest is to span the switchport the phone is plugged into and run a sniffer when this problem happens. If you see that keepalives from the phone are sent to CUCM but no keepalive acks are being received, then you have packet drops in either the WAN or the LAN. You will have to do some legwork to find out who the culprit is. While you are looking at the sniffer, verify that the DSCP marking for skinny packets is CS3 or AF31 for both tx and rx packets.
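As a rough aid for the DSCP check, here is a small sketch that turns the ToS/DiffServ byte shown by a sniffer into its DSCP value and reports whether it is CS3 (24) or AF31 (26). The helper names dscp_from_tos and signalling_marking are hypothetical and only illustrate the arithmetic.

```python
# DSCP code points named in the post: CS3 = 24, AF31 = 26.
SIGNALLING_DSCP = {24: "CS3", 26: "AF31"}

def dscp_from_tos(tos_byte):
    """DSCP is the upper six bits of the IPv4 ToS / DiffServ byte."""
    return (tos_byte & 0xFF) >> 2

def signalling_marking(tos_byte):
    """Return 'CS3' or 'AF31' if the byte carries one of them, else None."""
    return SIGNALLING_DSCP.get(dscp_from_tos(tos_byte))
```

A ToS byte of 0x60 decodes to CS3 and 0x68 to AF31; a result of None on skinny packets in either direction would suggest the marking is being lost or rewritten somewhere along the path.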

Thanks,

allan.thomas Wed, 06/17/2009 - 05:08

In this instance I would tend to agree with the previous post. If IP phones are falling back to SRST when the WAN is up, then it is highly possible that the keepalives are being dropped or delayed across the WAN.

Therefore I would suspect that QoS is the issue if there are no physical problems with the WAN or underlying LAN.

Firstly, verify the switchports that these phones are connected to. Are there any reported errors on the ports? Duplex issues?

Also verify the integrity of the WAN interface. Is the reliability 255/255? Are there CRCs or input errors?

Secondly, is QoS provisioned end-to-end? It is possible that your signalling is not being trusted or is being remarked. Do packet captures at either end to determine whether signalling packets still have the DSCP value.

One other alternative, if QoS is not an option, would be to increase the keepalive interval; this would avoid issues with lossy low-bandwidth WAN circuits. However, this setting is clusterwide!

Hope this helps.

Allan.

paul_savage Wed, 06/17/2009 - 05:19

Thanks to both of you for your comments.

I have been looking at the stats for the interface and I do see some drops, so I'm going to be looking into those as the root cause. See the snippets below.

The really odd thing now is that only 1 phone is experiencing the issue. The other 50-odd phones seem to be stable and I no longer see any transient connection attempts. I have disabled QoS for the time being and the network has settled down, so I think both of you might be right: QoS is at fault and was previously configured incorrectly, which I am going to have to look into further.

Serial4/2 is up, line protocol is up

Hardware is M4T

Description: Connected to North Oxford

Internet address is 192.168.91.41/30

MTU 1500 bytes, BW 2000 Kbit, DLY 20000 usec,

reliability 255/255, txload 11/255, rxload 11/255

Encapsulation HDLC, crc 16, loopback not set

Keepalive set (10 sec)

Restart-Delay is 0 secs

Last input 00:00:00, output 00:00:00, output hang never

Last clearing of "show interface" counters 00:35:36

Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 13

Queueing strategy: weighted fair

Output queue: 0/1000/64/13 (size/max total/threshold/drops)

Conversations 0/20/256 (active/max active/max total)

Reserved Conversations 0/0 (allocated/max allocated)

Available Bandwidth 1600 kilobits/sec

5 minute input rate 94000 bits/sec, 91 packets/sec

5 minute output rate 93000 bits/sec, 57 packets/sec

273611 packets input, 37436546 bytes, 0 no buffer

Received 248 broadcasts, 0 runts, 0 giants, 0 throttles

0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort

149350 packets output, 33491195 bytes, 0 underruns

0 output errors, 0 collisions, 0 interface resets

0 unknown protocol drops

0 output buffer failures, 0 output buffers swapped out

0 carrier transitions DCD=up DSR=up DTR=up RTS=up CTS=up

*Jun 17 08:59:01.243: %IPPHONE-6-REG_ALARM: 14: Name=SEP0024C40B3D12 Load= SCCP11.8-3-4SR1S Last=CM-closed-TCP

*Jun 17 08:59:01.295: %IPPHONE-6-REG_ALARM: 14: Name=SEP0024C40B3D12 Load= SCCP11.8-3-4SR1S Last=CM-closed-TCP

*Jun 17 08:59:01.323: %IPPHONE-6-REGISTER_NEW: ephone-14:SEP0024C40B3D12 IP:10.200.88.32 Socket:1 DeviceType:Phone has registered.

North-Oxford-WAN#sh interfaces serial 0/0/0

Serial0/0/0 is up, line protocol is up

Hardware is GT96K Serial

Description: Connected to Learning Stream$FW_OUTSIDE$

Internet address is 192.168.91.42/20

MTU 1500 bytes, BW 2000 Kbit, DLY 20000 usec,

reliability 255/255, txload 7/255, rxload 8/255

Encapsulation HDLC, loopback not set

Keepalive set (10 sec)

Last input 00:00:03, output 00:00:00, output hang never

Last clearing of "show interface" counters 00:40:06

Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 14

Queueing strategy: weighted fair

Output queue: 0/1000/64/14 (size/max total/threshold/drops)

Conversations 0/106/256 (active/max active/max total)

Reserved Conversations 0/0 (allocated/max allocated)

Available Bandwidth 1600 kilobits/sec

5 minute input rate 67000 bits/sec, 44 packets/sec

5 minute output rate 61000 bits/sec, 65 packets/sec

162739 packets input, 35431076 bytes, 0 no buffer

Received 282 broadcasts, 0 runts, 0 giants, 0 throttles

0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort

293564 packets output, 39448161 bytes, 0 underruns

0 output errors, 0 collisions, 0 interface resets

0 output buffer failures, 0 output buffers swapped out

0 carrier transitions

DCD=up DSR=up DTR=up RTS=up CTS=up
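To put the output-drop counters in the captures above in proportion, a sketch like the following could pull "Total output drops" and "packets output" out of a pasted show interface capture and compute the ratio. The function name output_drop_ratio is hypothetical, and the patterns assume the counter wording seen in the output in this post.

```python
import re

def output_drop_ratio(show_interface_text):
    """Return (drops, packets_output, ratio), or None if a counter is missing."""
    drops = re.search(r"Total output drops:\s*(\d+)", show_interface_text)
    sent = re.search(r"(\d+) packets output", show_interface_text)
    if not drops or not sent:
        return None
    d, p = int(drops.group(1)), int(sent.group(1))
    # Guard against a freshly cleared interface with zero packets sent.
    return d, p, (d / p if p else 0.0)
```

For the Serial4/2 capture this works out to 13 drops against 149350 packets, roughly 0.009%. That figure is averaged over all traffic since the counters were cleared, so it says nothing about whether the drops were concentrated on signalling packets.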

pgcristovam Sun, 06/28/2009 - 11:56

Hi

How can I open/release the TCP port blocked on CUCM?

Thanks

Peterson

paul_savage Mon, 06/29/2009 - 01:00

Hi,

Can you give some more information? Which port do you want CUCM to unblock?

pgcristovam Mon, 06/29/2009 - 05:32

Hello

parshah said: "Most likely, CUCM is closing the TCP connection because it is not receiving keepalives from the IP phone causing it to think the phone/network is down and so it resets it."

So, I want to know how I can open the TCP connection between CUCM and the phones.

Thx

Peterson

paul_savage Mon, 06/29/2009 - 06:17

In the case that I was dealing with, it turned out that the switches were stacked and one of the switches had a faulty backplane. It also didn't help that the switch was an Enterasys PoE switch that I did not look after or have access to.

Basically, what I have gleaned is that the phone will also try to initiate a connection to CallManager after the TCP socket is closed. It basically comes down to network connectivity. Are you dropping packets on your network/WAN/LAN links? Are you seeing Device Unregistered or Device Transient Connection attempts in CallManager?

You shouldn't have to touch CallManager to open the port up. Are you blocking certain ports in an access list somewhere in the network, or certain routes on WAN failure? Is your QoS set up and enabled correctly?

I had to go through everything to try to prove that, in my case, the problem wasn't in the part of the network that I look after, i.e. our path up to the Enterasys LAN.
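One quick way to rule out an access list or routing problem is to test from a host on the phone VLAN whether a TCP handshake to the CallManager signalling port completes. The sketch below assumes SCCP on its default port 2000; the function name tcp_port_open is hypothetical, and the host you pass in would be your own CallManager node.

```python
import socket

SCCP_PORT = 2000  # default SCCP (Skinny) signalling port; an assumption here

def tcp_port_open(host, port=SCCP_PORT, timeout=3.0):
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A True result only shows that the port is reachable at that moment. It does not prove keepalives survive under load, so packet captures are still needed for intermittent drops.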

Hope that helps....

paul_savage Mon, 06/29/2009 - 06:23

Peterson,

The other thing you can try to settle the network down is to increase the TCP keepalive for the IP phones. Beware though, it will require a reset of all IP phones for the setting to take effect. Double the value, but you will still need to investigate the cause of the TCP keepalives being lost.
