We have run into a problem at one (and only one) of our sites where IP phones, primarily 7940s with a few 7960s, randomly reset and reconfigure. They do this only occasionally, and we cannot identify any real pattern (by firmware, revision, or specific location on the blade). We are using Call Manager ver. 3.3(2)spC. We are aware of Cisco Bug Alert CSCeb72009, which has 3.3(3) Call Manager implementations experiencing random phone resets in network-determined environments.
Our question is this: what is going on on this particular subnet (and ONLY this subnet) that is provoking the resets?
As a point of information before launching into the scenario: we have found a workaround for the issue by segregating the voice traffic (voice VLANs) onto a separate T-1 link. As soon as we did this, the resets stopped COMPLETELY.
Another point of information: from examining Call Manager and Sniffer logs/traces, it seems the resets are caused by keepalive timeouts in which the phone and/or Call Manager lose track of each other, prompting the phone to reset in order to relocate the CM. These resets occur very sporadically: some users go days without one, others are hit three or more times a day.
So what anomalous condition on the 100 Mbit link is provoking these resets?
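To illustrate why keepalive timeouts could show up so sporadically, here is a rough back-of-the-envelope sketch in Python. It assumes a reset happens only when several keepalives in a row are lost, and that losses are independent of each other. The interval, miss count, and loss rates are hypothetical illustration values, not figures taken from our traces, and the function names are made up for this example.

```python
SECONDS_PER_DAY = 86400


def consecutive_loss_probability(loss_rate: float, misses_to_reset: int) -> float:
    """Chance that a given run of `misses_to_reset` keepalives is lost entirely.

    Assumes each keepalive is lost independently with probability `loss_rate`.
    """
    return loss_rate ** misses_to_reset


def expected_resets_per_day(
    loss_rate: float, keepalive_interval_s: float, misses_to_reset: int
) -> float:
    """Rough estimate: keepalives sent per day times the chance of a full run of losses."""
    keepalives_per_day = SECONDS_PER_DAY / keepalive_interval_s
    return keepalives_per_day * consecutive_loss_probability(loss_rate, misses_to_reset)


if __name__ == "__main__":
    # Hypothetical values: 30 s keepalive interval, reset after 3 misses in a row.
    for loss_rate in (0.01, 0.05, 0.10):
        estimate = expected_resets_per_day(loss_rate, 30, 3)
        print(f"loss rate {loss_rate:.0%}: about {estimate:.3f} resets per phone per day")
```

Under these assumptions a small change in loss rate swings the estimate widely, which would fit some users going days without a reset while others see several a day. Real losses on a congested tunnel are likely bursty rather than independent, so this is only a way to reason about the symptom.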
The site involved uses a 100 Mbit pipe leased from an ISP, over which we are running a GRE/IP tunnel. The ISP has fiber running between our central site and the remote site using 2948 L3 switches (all provider equipment). This fiber terminates at RJ-45 media converters that then feed ports on a 4006 (Supv. III) switch at our central site and a 4506 (Supv. IV) switch at the remote site.
The only error we can observe in the IOS-generated stats is constantly incrementing output drops, on the TUNNEL interface only. The other counters on the tunnel interface are clean, as are the physical interfaces themselves. Speed and duplex are set explicitly and CDP is disabled, as per the ISP's recommendations.
We have experimented with removing QoS commands and the creation of tx-queues on the physical interfaces to see if the problem might be related to QoS configuration. Removing the QoS commands does not seem to have affected the drop count greatly, neither reducing nor increasing it.
CPU utilization on the switches involved ranges from 10-60%. The 4506 at the remote site is fully populated with 48-port inline power blades and a Sup. IV engine running 12.1(19)EW. It has the higher utilization, running between 25-60%. Memory utilization is below 35%. The 4006 switch at our central site also functions as the core switch for the VLANs at our central site. It has the Supv. III engine running the exact same IOS version as the 4506, along with a 12-port Gig blade and a 48-port 10/100/1000 blade. It runs at about 10-40% utilization most of the time, with occasional peaks beyond 40%.
We will be replacing the media converters and the associated premise cables in the coming days.
As you can see we have examined this issue from a lot of viewpoints using many resources - now it is your turn! Viewpoints?
It appears that these resets may be due to constantly incrementing output drops on the tunnel. The speed of the software-based forwarding on the tunnel in the 4006 and 4506 switches could be the cause of this. We have moved the tunnel to routers to see if the reset issue is also encountered there.
Another possibility is that the MTU size is set larger than what the provider recommends for the tunnel. The default MTU size for a GRE tunnel between these switches is 1514 bytes. We may have to reduce that to 1450 or so. A question: does the MTU size have to be changed on the tunnel interface, on the physical interface, or on both?
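For reference, the arithmetic behind a reduced tunnel MTU can be sketched in a few lines of Python. This assumes plain GRE over IPv4 with no optional GRE fields (key, sequence number, checksum), which adds a 20-byte outer IP header and a 4-byte GRE header. The 1500-byte path MTU is an assumption about the provider link, not a value we have confirmed, and the function names are made up for this example.

```python
# Header sizes in bytes for plain GRE over IPv4 (no key, sequence or checksum options).
OUTER_IPV4_HEADER = 20
GRE_HEADER = 4


def gre_payload_mtu(path_mtu: int) -> int:
    """Largest inner IP packet that fits in one outer packet on a path with `path_mtu`."""
    return path_mtu - OUTER_IPV4_HEADER - GRE_HEADER


def needs_fragmentation(inner_packet_size: int, path_mtu: int) -> bool:
    """True if the encapsulated packet would exceed the path MTU."""
    return inner_packet_size > gre_payload_mtu(path_mtu)


if __name__ == "__main__":
    path_mtu = 1500  # assumed MTU of the provider link
    print("max inner packet:", gre_payload_mtu(path_mtu))
    for size in (1450, 1476, 1500):
        print(size, "needs fragmentation:", needs_fragmentation(size, path_mtu))
```

If the provider path really carries 1500 bytes, any inner packet above 1476 bytes would have to be fragmented or dropped, so a tunnel setting of 1450 would leave a small margin. If the provider path MTU is lower, the same subtraction applies from that lower value.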