
Some 7961 phones unregister during short network failures

MaxPaolini
Level 1

Hi !

Some 7961 phones in a remote branch office unregister from Cisco Call Manager and register with the local SRST router, even during very short VPN failures.

7940, 7941 and 7985 phones don't unregister under the same conditions.

We run Cisco Call Manager 7 (we had the same problem with version 5).

The VPN is an IPsec VPN over an internet connection, managed by SonicWall firewalls.

Thanks for your help

Check whether the 7941 (where the issue isn't seen) and the 7961 (where the issue is seen) are running the same firmware version. The 7941 and 7961 run the same firmware, so if you are seeing different behavior between those models, something else is going on that isn't related to the software on the phone. Perhaps the 7961s are in a different location where the network failures hit harder.

The phones will unregister if they miss 3 keepalives.  Default is 30 seconds, but that can be changed in CM.
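As a rough illustration of the arithmetic behind that rule, here is a minimal Python sketch. It assumes the phone gives up exactly when the third consecutive keepalive goes unanswered, which is a simplification; the function name and the 45-second example value are hypothetical and not taken from any Cisco tool.

```python
def seconds_until_unregister(keepalive_interval_s: float = 30.0,
                             missed_keepalives: int = 3) -> float:
    """Approximate outage length the phone can ride out before unregistering.

    Simplified model: the phone unregisters once `missed_keepalives`
    consecutive keepalives, sent every `keepalive_interval_s` seconds,
    have gone unanswered.
    """
    return keepalive_interval_s * missed_keepalives


default_window = seconds_until_unregister()                          # 30 s interval
raised_window = seconds_until_unregister(keepalive_interval_s=45.0)  # hypothetical larger timer

print(f"default timer: about {default_window:.0f} s of outage tolerated")
print(f"raised timer:  about {raised_window:.0f} s of outage tolerated")
```

Under this model, an outage shorter than about 90 seconds should not by itself cause an unregistration with the default timer.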

Your best bet is to get a packet capture off the phone switchport in parallel with a detailed CM trace off the node the phone is registered to when the issue occurs.

EDIT: Since you already know you have network failures, a workaround would be to increase the SCCP timer to a larger value.  That will keep the phone registered, but it won't behave well when the network is down if softkeys/events occur.

Thanks Steven.

a) I checked 7961 and 7941 firmware version: SCCP-41.9-0-3s (same release)

b) 7961 and 7941 phones are in the same location (sometimes on the same desktop!!)

c) The Call Manager keepalive parameters are at their defaults. As a workaround, as you suggest, I'll add 15 seconds during the weekend, when I can restart Call Manager.

Is there anything I should check in the switch configuration at the remote site?

Thanks again

The switch is irrelevant for this. If your VPN is going down, you need to troubleshoot why it is going down. If the VPN flapping like that is expected behavior, then expect your phones to lose connectivity once in a while, too. You can't troubleshoot at the application layer until the network is solid.

Increasing the keepalive timer is a workaround until you can get stability on your WAN/VPN circuit.

Steven, you are right that a good network shouldn't fail and that I should troubleshoot it when it does.

But I think that something is wrong in my IP telephony system too: a few phones (always the same ones) seem to unregister during very short network failures (shorter than 3 keepalives, I suppose), while other phones in the same location don't unregister and continue to work!

Thanks again

"But I think that something is wrong in my IP telephony system too: a few phones (always the same ones) seem to unregister during very short network failures (shorter than 3 keepalives, I suppose), while other phones in the same location don't unregister and continue to work!"

I am not familiar with third-party VPN systems, but it is also possible that the phone receives an ICMP message such as destination unreachable from the VPN device. In that case it is quite reasonable for it to reset.

So do not assume that the problem is with the CM, because most likely it is not.

Paolo, it's possible that your idea is right, but

a) I'm sure that there is a very short failure in the VPN network, because I find the same error in the Blackberry Enterprise Server Application Log

b) the failure is probably very short because people never had noticeable troubles with other applications in the same conditions

c) some phones seem to unregister too quickly (before the third keepalive?); other phones unregister only during longer network failures

The phones that unregister always have BLF enabled (some have 7914 expansion modules too), but I didn't find anything on the internet linking BLF to my problem.

Can you suggest anything for troubleshooting? The VPN quality is good and failures are infrequent (1-2 per month), but I'd like to reduce these small problems.

Thanks

Maybe phones with BLF use a more complex keepalive scheme that is more susceptible to connectivity interruptions.

I think that if a phone unregisters and then registers back, even if only rarely, that is normal in the presence of interruptions, and actually a good thing, because it makes you aware of the condition.

For the 'premature' unregistrations, it is likely that TCP packets and their retransmissions for valid events (BLF, softkey events, etc.) aren't being ACKed. That will cause the TCP/2000 connection to be dropped before the 3 SCCP keepalives are sent.
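To see why an unacknowledged event could tear the connection down before the keepalive window expires, here is a minimal Python sketch of TCP retransmission backoff. The initial timeout, retry count and backoff factor are illustrative assumptions only; the actual values used by the phone firmware are not known from this thread.

```python
def tcp_give_up_time(initial_rto_s: float, max_retransmits: int,
                     backoff: float = 2.0, rto_cap_s: float = 60.0) -> float:
    """Seconds from the first send until the sender abandons the connection.

    Simplified model: the segment is sent once, then retransmitted
    `max_retransmits` times. After each transmission the sender waits one
    retransmission timeout (RTO), and the RTO is multiplied by `backoff`
    after every attempt, up to `rto_cap_s`.
    """
    total = 0.0
    rto = initial_rto_s
    for _ in range(max_retransmits + 1):  # original send plus each retransmit
        total += min(rto, rto_cap_s)
        rto *= backoff
    return total


keepalive_window_s = 3 * 30.0  # 3 missed keepalives at the 30 s default
give_up_s = tcp_give_up_time(initial_rto_s=1.0, max_retransmits=5)  # assumed values

print(f"TCP gives up after about {give_up_s:.0f} s")
print(f"keepalive window is about {keepalive_window_s:.0f} s")
print("TCP drops first" if give_up_s < keepalive_window_s else "keepalives expire first")
```

With these assumed values, a phone that happens to send a BLF or softkey event at the start of the outage would drop its connection earlier than an idle phone, which matches the pattern of only some phones unregistering.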

It isn't an efficient use of your time to troubleshoot this from the phone's perspective. You need to troubleshoot why your VPN goes down. Ask for help in the VPN forum regarding that.

Steven and Paolo, thank you very much for your help.

I'm not sure that I can get better results by troubleshooting the VPN. It runs over the internet and is very good by Italian standards.

In the past I considered moving the second CM (I have a cluster) to the branch office to limit the problem, but that is not cheap!

Thanks again

You may have some luck configuring a policy-map with a shaper on the interface facing the internet connection.  Shape to the upload speed of the link (run a speed test from a PC to verify actual upload speed).  Nest a voice QoS policy inside of that, with a queue for signaling traffic.

That may help somewhat. Note that this has to be done before the packets get encrypted, unless you pre-classify at the tunnel and classify on DSCP.

Something like:

class-map match-any RTP-Class
 match dscp ef
class-map match-any Call-Control
 match dscp cs3 af31
policy-map VoicePriority
 class RTP-Class
  priority 128
 class Call-Control
  bandwidth 24
 class class-default
  fair-queue
policy-map shaper
 class class-default
  shape average 250000 2500 0
  service-policy VoicePriority
interface
 service-policy output shaper
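As an aid to reading the numbers in that example, the following Python sketch works through the arithmetic. It assumes the usual IOS units: `shape average` takes the rate in bits per second and the committed burst in bits, while `priority` and `bandwidth` take kbps. The function and variable names are hypothetical.

```python
def shaper_interval_ms(cir_bps: int, bc_bits: int) -> float:
    """Shaping interval Tc in milliseconds, from Tc = Bc / CIR."""
    return bc_bits * 1000 / cir_bps


cir_bps = 250_000      # shape average 250000 ...
bc_bits = 2_500        # ... 2500 (committed burst), excess burst 0
priority_kbps = 128    # RTP-Class: priority 128
signaling_kbps = 24    # Call-Control: bandwidth 24

tc_ms = shaper_interval_ms(cir_bps, bc_bits)
reserved_kbps = priority_kbps + signaling_kbps
remaining_kbps = cir_bps / 1000 - reserved_kbps

print(f"Tc = {tc_ms:.0f} ms")
print(f"voice + signaling = {reserved_kbps} kbps of {cir_bps // 1000} kbps shaped")
print(f"left for class-default = {remaining_kbps:.0f} kbps")
```

The 250000 figure in the example is a placeholder for a slow uplink; on a different link the shape rate, and possibly the burst size, would need to be scaled to the measured upload speed.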

I spoke with the Cisco partner managing our systems and we decided to open a case with TAC.

I'll update this thread with information from TAC.

Steven's idea is very interesting and could mitigate the problem in some cases, but I don't know how helpful it would be in our environment, where bandwidth is sufficient (each site has 10 Mbps over fiber), usage is low, and round-trip time is about 16-18 milliseconds and rather stable.

But perhaps my knowledge of networking is too basic (I'm not a technician)!

Thanks

Massimo

That may be the case, but I'd be surprised if the public Internet connection at the remote office has 10M up. Just make sure that you are testing *upload* bandwidth, not download bandwidth. You need to shape the interface to the upload speed, which often isn't symmetrical with the download speed.

Even if your upload is 10M, if you have >10M of traffic in the LAN heading out to the Internet (which is a likely scenario), you still have congestion on the interface, which can cause late or dropped packets. Applying the QoS would be a good preventative measure.

Yes Steven, our public internet connections are 10 Mbps (symmetrical), but I believe you are right that QoS could help.

I'll update this thread after the TAC answer.
