7921 problem stops responding etc

Unanswered Question
Aug 7th, 2008

Edit: I am going to clean this up a bit to try to get a response.

I am having a bit of trouble with these phones. They connect and work, but within 5 minutes the call audio becomes one-way. It is always the 7921G phone that stops receiving audio. The issue is very strange. I have verified many configuration items and have nowhere left to go. The old version of this post didn't read very well, so I cleaned up the details.

EDIT: cleaned up old post below...

We have Call Manager 4.1. The APs are 1310s running IOS 12.4(10b)JA3. We use WLSE to manage the access points. The phones authenticate using EAP-FAST or PEAP on ACS 3.3. I have tried the voice VLAN on both the 802.11a and b/g radios with the same results. I have enabled CAC by using the "Optimize for voice" button and the two check marks for WMM. "dot11 phone dot11e" and "dot11 arp-cache" are also set on the AP. I don't understand how the power settings are supposed to work, but I set "power level client" on the AP and also set both to lower power levels and such. I will read more about this standard, since I am now thinking this is the issue. Anyhow, when I turn on the phone it connects, obtains an IP address, attaches to the Call Manager and obtains its configuration. The firmware also updates if it is not firmware 1.1(1).

As I said above, the phone works fine for a little while, then it seems to stop processing packets. When the phone stops working, the other party can still hear me speak but I can't hear them. In testing I was pinging the phone and getting responses until the problem started; then the pings started to fail. While still on the call, if I hit a number key on the phone I hear a few seconds of audio and pings start working, but about 2 seconds later it stops again. I was sniffing the traffic on the AP and could see that both directions of the conversation were getting to the AP, and the call still looked normal even after I couldn't hear.

After stopping the call I see the phone sending ARP requests for the router's IP, and it is still not responding to pings. If I hit a key on the phone at this time, it will respond to pings for a couple of seconds. If I try to make a call right away, the call will be set up but the audio is still one-way.

If I wait a while, I will see it lose its connection to CM, and eventually it will give up its IP address. The phone will just sit there trying to obtain an IP address. I verified it is still a CDP neighbor of the AP, and a capture of traffic reveals that there are DHCP responses to the requests, but the phone never accepts the offer. The AP still shows the phone association with 0.0.0.0 for the IP address.
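For anyone trying to line up the ping failures with the moment the audio drops, it can help to turn the ping results into explicit outage windows. The following is only a minimal sketch: the function name `outage_windows` and the sample data are made up for illustration, and the samples would have to come from whatever ping tool is being used.

```python
def outage_windows(samples):
    """Find runs of lost ping replies.

    samples: list of (seconds_since_start, reply_received) tuples in time order.
    Returns a list of (start, end) tuples, where start is the time of the first
    lost reply in a run and end is the time of the first reply after the run
    (or None if replies had not resumed by the end of the samples).
    """
    windows = []
    start = None
    for t, ok in samples:
        if not ok and start is None:
            start = t                    # first lost reply of a new run
        elif ok and start is not None:
            windows.append((start, t))   # replies resumed
            start = None
    if start is not None:
        windows.append((start, None))    # still failing at the end
    return windows


# Hypothetical data: replies stop at t=2, resume at t=4 (e.g. after a key
# press), then stop again at t=6.
samples = [(0, True), (1, True), (2, False), (3, False),
           (4, True), (5, True), (6, False)]
windows = outage_windows(samples)
```

Comparing the start of each window with the time the audio went one-way, and the end of each window with the key presses, makes the "works for a couple of seconds after a key press" pattern easier to show to TAC.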

The phone just stops processing packets but hitting a key causes it to process packets for a few seconds.

I have the same issue on both 1.1(1) and 1.0(5) firmware.

I have tested 2 phones and they both experience the same issue.

I am going to try to remove the power management settings from the phone to see if that is causing the issue. I will let you know on Monday but if you have any ideas please please chime in and thanks in advance!!!

jbarger Mon, 08/11/2008 - 06:46

Power management is not the problem. Here is a log file from bootup; I then made a call, and the one-way audio problem started a little while into the call.

jbarger Mon, 08/11/2008 - 07:02

Here is another log file with a few more lines in it. To get this log file I have to attempt to make a call and the the phone to start responding to pings by hitting keys on the phone... If I don't hit keys on the phone I can't get to the web page to get the file and ping fails until I start hitting keys. This is just nuts :( Here is a sample of the messages after the issue has started...

2008-08-11 08:48:11:0480 CP-7921G user.err SCCP: TRP Timer Expired
2008-08-11 08:48:36:0120 CP-7921G user.err SCCP: pri-tcp timeout<-47>
2008-08-11 08:48:36:0320 CP-7921G user.err netd: SCCP Requested a network status check
2008-08-11 08:48:50:0340 CP-7921G user.err GUI: set LCD -1
2008-08-11 08:48:53:0190 CP-7921G user.err SCCP: Skinny_get_tftpIp: Could NOT getGNetutilVarVal!
2008-08-11 08:48:53:0200 CP-7921G user.err SCCP: Skinny_get_alt_tftpIp: Could NOT getGNetutilVarVal!
2008-08-11 08:48:55:0100 CP-7921G user.err SCCP: Skinny_get_IPconnect: Could NOT getGNetutilVarVal!
2008-08-11 08:48:55:0110 CP-7921G user.err SCCP: fallback to CCM
2008-08-11 08:48:55:0290 CP-7921G user.err netd: SCCP Requested a network status check
2008-08-11 08:48:57:0080 CP-7921G user.err SCCP: Skinny_get_tftpIp: Could NOT getGNetutilVarVal!
2008-08-11 08:48:57:0080 CP-7921G user.err SCCP: Skinny_get_alt_tftpIp: Could NOT getGNetutilVarVal!
2008-08-11 08:52:40:0350 CP-7921G user.err SCCP: TRP Timer Expired
2008-08-11 08:53:06:0980 CP-7921G user.err SCCP: pri-tcp timeout<-47>
2008-08-11 08:53:07:0260 CP-7921G user.err netd: SCCP Requested a network status check
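The repeating "TRP Timer Expired" / "pri-tcp timeout" cycle in these messages is easier to see once the lines are parsed. Below is a minimal sketch that assumes the layout seen in the excerpt (date, HH:MM:SS followed by a four-digit sub-second field, host, facility.level, process, message). The function names are hypothetical, and the sub-second field is ignored because its unit is not documented here.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt:
# "YYYY-MM-DD HH:MM:SS:ffff HOST facility.level PROCESS: message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}):\d{4} (\S+) (\S+) (\w+): (.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp, host, level, process, message = m.groups()
    return {
        "time": datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"),
        "host": host,
        "level": level,
        "process": process,
        "message": message,
    }


def seconds_between(lines, needle):
    """Seconds between consecutive log entries whose message contains needle."""
    entries = (parse_line(line) for line in lines)
    times = [e["time"] for e in entries if e and needle in e["message"]]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


sample = [
    "2008-08-11 08:48:11:0480 CP-7921G user.err SCCP: TRP Timer Expired",
    "2008-08-11 08:48:36:0120 CP-7921G user.err SCCP: pri-tcp timeout<-47>",
    "2008-08-11 08:52:40:0350 CP-7921G user.err SCCP: TRP Timer Expired",
]
gaps = seconds_between(sample, "TRP Timer Expired")
```

Run against the full log, the same helper can show whether the SCCP timeouts recur at a regular interval, which would be consistent with keepalives going unanswered rather than with random loss.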

jbarger Mon, 08/11/2008 - 13:26

Umm that sentence should read: Here is another log file with a few more lines in it. To get this log file I have to attempt to make a call and then occasionally press buttons on the phone to get it to start responding to pings.

Note this is after the phone 'breaks'. Before it breaks I can get the web pages fine for as long as I want. It only breaks during a call. After it breaks...

If I don't hit keys on the phone I can't get to the web page to get the file and ping fails until I start hitting keys.

migilles Wed, 08/13/2008 - 18:40

If the issue is in the rx path, check the network stats on the 7921 web page for any discarded packets to see whether the phone is receiving the RTP packets or not. There is really not much the phone can do on the rx side, except that when using power save it must send a trigger packet to receive buffered packets (i.e. U-APSD). If you set the On Call Power Save mode to None in the network profile, then this shouldn't be the issue.

I see that you have opened a TAC case and saw the picture for the call stats showing the MOS 2.0 reflecting 1 way audio and the loss was on the rx side. Also saw that packets are holding the correct DSCP value as well, but also want to ensure they are being sent downstream from the correct queue. In 1.1(1) in the network stats local menu, can see rx stats for each queue. Should see the VO queue #s increasing when on call.

jbarger Wed, 08/13/2008 - 19:42

I have disabled U-APSD power management on the phone and still have the same issue.

I will grab the network stats page after a failed call tomorrow and post it in my tac case.

And thanks for the reply. The TAC case is progressing well, and I hope I will be getting 1.2(1) firmware for testing. But I can say the engineer has not had the time to really look at all the data I have posted, and I have had to recommunicate these details... Now where are all the other folks having this issue?

HAHA

jbarger Thu, 08/14/2008 - 06:02

I am not certain I know which screens you wanted to see... I put a wireless and a network statistics image in my ticket.

The

I am upgrading to 1.2(1) right now to test that firmware.

jbarger Thu, 08/14/2008 - 13:38

Well it turns out to be POWER!!

The AP would decide that the client was in power save mode and buffer packets instead of forwarding them to the phone.

They are sending the issue to the developers... No solution yet. We will probably have to downgrade the AP, but hopefully they will supply an upgrade.

When you run "show dot11 association" for the phone's MAC address, the power save line reads ON.

migilles Fri, 08/15/2008 - 18:06

It may show that power save on even if not using U-APSD or PS-POLL, where the on call power save mode in the network profile is set to "None". It could be off channel scanning for ap discovery, where it would have to send a null data frame with the power save bit set.

jbarger Sat, 08/16/2008 - 12:10

Yes, even when I disable the U-APSD power save mode, the AP still switches to power save ON.

That is why I thought it was not power until I was working with TAC who helped me find the exact issue.

jbarger Tue, 08/19/2008 - 08:04

A possible workaround for the 7921G phones on an 1131 AP: I was able to have a 20-minute call stay working by disabling U-APSD on the phone and removing dot11e from the AP.

"dot11 phone" instead of "dot11 phone dot11e"

#1 With the AP set for “dot11 phone”, not “dot11 phone dot11e”, and U-APSD turned ON on the 7921, the AP shows power-save ON during a call.

The one-way audio issue happened in about 5 minutes.

#2 With the AP set for “dot11 phone”, not “dot11 phone dot11e”, and U-APSD turned OFF on the 7921, the AP shows power-save OFF during a call.

The one-way audio issue did not happen for 20 minutes, and I quit trying to get it to happen...

jbarger Mon, 08/25/2008 - 05:50

If anyone is interested disabling dot11e and U-APSD did work. Calls have stayed connected for over 1 hour.

migilles Mon, 08/25/2008 - 15:55

Dot11 phone dot11e is QBSS, which is the CU (channel utilization) # you see in the 7921 neighbor list. This has absolutely nothing to do with power save and would not prevent the phone from working. The 7921 only uses this for possible roaming and not, per se, for CAC.

It appears that you are having some issues with U-APSD in your environment. I know it was working well in previous AP IOS versions (i.e. 12.3(4g)JA1 and 12.3(8)JEC).

I did look at your sniffer trace, and with WMM enabled U-APSD was negotiated properly, but for some reason the AP was not forwarding packets until the phone entered active mode, which is what happens when you press the keypad (by design). However, you can see the 7921 sending RTP packets all the time but not getting anything in return. With U-APSD, the client must send a trigger packet in order for the AP to send queued packets, which it is doing. I heard the trace was between two 7921s, but the other 7921 was on another AP / channel, so I couldn't see what it was doing; because all the packets were spit out after entering active mode, I can assume that the other client was transmitting RTP the whole time as well. So it appears there is an AP issue with U-APSD on this code version or some issue in the wired network. I think it's on the AP side though, since disabling power save enables it to forward the packets successfully.
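The difference between the expected trigger behaviour and what the trace shows can be illustrated with a toy model. This is a sketch under stated assumptions, not AP firmware: the class `ToyAP`, its methods and the `honors_trigger` flag are all hypothetical, and real U-APSD details such as service period length and per-queue settings are left out.

```python
from collections import deque


class ToyAP:
    """Toy model of downlink buffering for one power-save client.

    Hypothetical illustration only; it is not based on Cisco code.
    """

    def __init__(self, honors_trigger=True):
        self.honors_trigger = honors_trigger
        self.client_in_power_save = True
        self.buffer = deque()

    def downlink(self, frame):
        """A frame arrives from the wired side for the client."""
        if self.client_in_power_save:
            self.buffer.append(frame)   # hold it until the client asks
            return []
        return [frame]                  # active client: forward at once

    def _flush(self):
        delivered = list(self.buffer)
        self.buffer.clear()
        return delivered

    def uplink_trigger(self):
        """The client sends an uplink trigger frame while in power save."""
        return self._flush() if self.honors_trigger else []

    def client_active(self):
        """The client leaves power save, e.g. after a key press."""
        self.client_in_power_save = False
        return self._flush()


# Expected behaviour: the trigger releases the buffered frame.
good = ToyAP(honors_trigger=True)
good.downlink("rtp-1")
good_result = good.uplink_trigger()

# Behaviour resembling the trace: triggers release nothing, and
# everything is delivered only once the client becomes active.
bad = ToyAP(honors_trigger=False)
bad.downlink("rtp-1")
bad.downlink("rtp-2")
bad_trigger_result = bad.uplink_trigger()
bad_active_result = bad.client_active()
```

In the second case the buffered frames only come out when the client goes active, which matches the burst of audio heard right after a key press.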

So if you disable WMM, "no dot11 qos mode", then PS-POLL will be used instead if the 7921 is configured for U-APSD/PS-POLL for on call power save mode in the network profile. If you set it to "None", then will use active mode, however in client status it may still show up as power save due to going off channel via ps-null power save packet to scan other channels for ap discovery.

So disabling WMM is a temporary workaround for you, but definitely not a recommended solution as now there is no QoS.

jbarger Tue, 09/09/2008 - 08:24

Thanks for the detailed explanation and sorry for the late response. I had to give the phones to the users so I can't do any more testing right now, but I have asked for a loaner RMA phone and will test it with 12.3.8 when I get a chance... I do have some of the old-style QoS, but the whole QoS thing just needs to be looked at in detail once we are happy with the power management.

I find it crazy that Jake was unable to reproduce the trouble since I had so much trouble getting a work around. :)

Anyhow I will update when I get the test phone...

ericn8484_2 Wed, 09/24/2008 - 11:05

We had an issue where our 7921 phones kept locking up. We had to downgrade to firmware 4.2.130.0 on our controllers to clean it up. We were running 5.0.148.0.

migilles Fri, 10/03/2008 - 15:45

Yes there was an issue with the large beacon coming to the 7921G phone in 5.0 code. This issue is resolved in the 1.2(1) firmware release for the 7921G.

But this thread is about autonomous APs and not the WLAN controller.

SJessulat_2 Mon, 10/06/2008 - 03:12

Hello everyone,

we have a similar problem with our CP7921G's:

The phone is reachable through ping and you can place calls for about 1 minute after WLAN connection is established. After 1 minute, the ping times out and you cannot place calls with the phone.

We thought the problem is our WLAN (AP1252's), but even without any authentication and with AP1242's, the problem persists.

We tried code 1.2.1 and 1.1.1 on the phones, IOS 12.4(10b)JA and 12.4(10b)JA3 on the APs. And PowerSave-Mode is disabled on the phones.

I assume the problem is with the phones, not the APs. Is there a setting I am missing?

Regards,

Sebastian

migilles Mon, 10/06/2008 - 11:11

Doubt it is a phone issue here as the 7921 works with every other Cisco AP using U-APSD just fine. There was an issue fixed on the WLAN controller side in regards to the AP1250 with U-APSD in the 5.1.151.0 code. For autonomous, not sure if that is integrated yet or not. For the power save mode on the 7921, this refers to on call only. If you want to disable U-APSD, although not recommended, can enter "no dot11 qos mode" under the corresponding radio interface. Can try this for a test, where the 7921 should then use PS-POLL vs U-APSD when in idle.

If you have "None" set for on call power save mode, then talk time will be cut by 50% (about 5 hours vs 10 hours)

jbarger Mon, 10/06/2008 - 11:16

Try disabling dot11e on the ap and U-APSD on the phone...

dmantill Thu, 12/04/2008 - 10:05

My recommendation from the AP side will be:

upgrade to 12.4(10b)JDA or downgrade to 12.3 latest stable version.

From the IP phone side, upgrade the firmware to 1.2.1

If you can upload the latest show tech and show log from the AP, we might find something interesting in there.

Thanks in advance

SJessulat_2 Fri, 12/05/2008 - 02:15

My experience from the last weeks/months is similar:

We ran 12.4(10b)JA3 on our AP-1242's with CP-7921 running 1.2.1. We had many problems concerning roaming, disconnecting calls and sleep-mode-problems.

After downgrading to the last 12.3 IOS some of the problems disappeared, but bad roaming and disconnecting calls were still occurring.

We recently changed our WLAN infrastructure to LWAPP. We added a 2100 WLC running 5.2.157.0. The Access Points only got an LWAPP-Image (same location, same antennas). All of the remaining problems disappeared. There are no roaming issues, calls lasted for more than 10 minutes without interruption, etc.

My advice is to use a controller-based solution every time you use the WLAN for both Voice and Data. You avoid a lot of problems this way.

Greets,

Sebastian

dmantill Fri, 12/05/2008 - 09:33

Sebastian,

Thanks for your report on this.

The reason I suggest downgrading the IOS or using the latest one is that 12.4 is basically a new IOS version, compared to 12.3, which has been developed for a long time.

The roaming issues, according to the description that you provided, seem to be due to RF interference.

Since the controller-based solution has its own auto-RF capabilities, it can manage and decide which channels the APs should use, what the correct power should be, and so on.

I am sure that if there were roaming issues in aIOS, it is quite possible that the channels or TxPower were different.

Regards,

abpsoft Sun, 04/05/2009 - 15:42

Hi,

> But this thread is about autonomous APs and

> not the WLAN controller.

I'm having *exactly* the same issue with 7921s and 7925s running 1.3(2) against the 1142 and 1252 LAPs controlled by a WLC running 5.2.176.0. Even the workaround is the same: disable WMM on the WLC and the issue is gone. The issue is also gone as soon as I replace the 1142/1252 AP with a 1242. I'm quite sure there is a severe bug in WMM timing in a certain stretch of IOS versions around 12.4(10) up to at least 12.4(18a)JA1. What I see from a wireless sniff is essentially a seemingly working WMM ping-pong of management frames (phone wakes up, requests TXOP, LAP ACKs, no data follows, phone goes to sleep again) but data frames don't make it through (ARPs time out, RTP is one-way, but I tested with MOH so there is no RTP at all when the issue strikes) with some exceptions (WLCCP stuff for instance is still trickling down).

I know the issue is fixed in newest internal IOS/WLC code from an engineering release I was able to test against. I too couldn't understand why nobody else seems to see the issue - it was right there after doing nothing but essential configuration, and it was extremely nasty (destroyed a VoWLAN survey entirely by turning it into a bug hunt). I'm not sure whether it's also isolated to something unusual I'm doing like using WPA2-PSK with CCMP exclusively...

Anyway, nice to finally find someone with exactly the same symptoms (down to the "plays a bit of audio when pressing a key" bit). Another observation: When there is more than one (L)AP, as in my case, the one-way/stuck phone situation is resolved if one moves quickly to another AP than the one currently associated to. As soon as you roam, traffic is back. Unless the SKINNY keepalive times out earlier, that is.

Could I have the TAC case ID or ideally the BugID as a reference?

TIA,

Andre.

Leo Laohoo Sun, 04/05/2009 - 22:06

There is an issue with the autonomous AP (CSCsx07150), where CCKM is failing. This appears to be related to the handling of the TSPEC that we send for SCCP traffic (UP4).

The workaround here is to enable "admit-traffic" under the ssid config. In the AP webpage, it is listed as "Call Admission Control", which will add the admit-traffic command.

Symptom:

Client disassociate straight after a CCKM roaming.

As result it will re-authenticate soon after, voice gaps are heard when 7921 are in use.

Conditions:

Standalone AP in WDS.

No CAC in the SSID ( no admit-traffic)

Workaround:

Enable CAC as per design recommendations.

dot11 ssid
 admit-traffic

migilles Sun, 04/05/2009 - 22:38

Yes as of the 7921 1.2(1) release, a TSPEC is sent for signalling (SCCP) tagged as UP4 regardless of whether Admission Control Mandatory is enabled for the voice or video queues.

Even when Admission Control for voice or video is not set to mandatory the AP will send a reassociation response without containing the CCKM IE, which is needed for a successful roam. Because the IE is not present, it causes a multi-second audio gap when on a call.

When roaming, TSPECs are included in the reassociation request. Currently the AP is binding TSPEC to CCKM.

There is a fix being planned to possibly enable admission control by default on the SSID in a future release in order to avoid this issue.

But yes, the current workaround is to enable "admit-traffic" under the SSID (listed as Call Admission Control under SSID in the Web interface) regardless of enabling Admission Control Mandatory for voice or video.

Not recommended to enable ACM for voice or video due to no load based support like there is with the Wireless LAN Controller solution.

abpsoft Wed, 04/08/2009 - 00:02

Please note that this is *not* the same issue as the one that was originally described in this thread, even though they might be related. The issue is a total and final loss of payload communications between the 792[15] and the AP it is currently associated to, apparently by some WMM malfunction that causes the infrastructure side to no longer transmit most payload frames towards the phones.

* I'm not using CCKM but WPA2-PSK (CCMP only)

* There is no gap but final loss of payload communications (voice RTP, SKINNY, even ping), which is unidirectional AP->STA but this of course kills anything that is TCP and only leaves one-way RTP voice (as transmitted by the phone) running for a while, soon to be terminated by SKINNY keepalive timeouts.

* Comms come back for a second when pressing certain keys on the phone (like the green one)

* Enabling or disabling CAC in whatever combinations makes no difference in the issue

* Disabling WMM *does* make a difference - the issue completely disappears.

* The issue is entirely independent from roaming and can be demonstrated with a single AP, roaming only gives a way out of the issue (by roaming quickly to another AP than the one currently associated to - this restores the payload communications until the issue strikes again at the new AP).

* The issue apparently can happen in both an autonomous AP infrastructure as well as a LAP infrastructure if certain relatively new IOS versions on the (L)APs are in use. It is *not* specific to only one use case.

That is why I'm highly interested in the original TAC case number and/or BugId of this issue. The problem must be known inside Cisco (after all, it's been fixed in interim IOS builds for the LAP 1142 from mid-March 2009).

TIA,

Andre.

jbarger Wed, 04/08/2009 - 06:55

The original TAC case number was 609318661.

The ticket changed hands a couple times but Jacob Fussell is the person who is most knowledgeable about it.

Email: jafussel"AT"cisco.com

I am still using the original solution posted earlier in this thread, but as noted it is for IOS and not lightweight access points. I have not had time to find out if it is fixed in newer releases.

HTH

Joe

abpsoft Wed, 04/08/2009 - 07:19

Hi Joe,

thanks for the feedback. I'm in the process of opening a case myself, so this is extremely helpful. I'll keep you posted here about the outcome.

Thanks again,

Andre.

abpsoft Sun, 07/05/2009 - 04:16

Hi Joe (et al),

here's an update to my issue and the outcome of my TAC case. When preparing answers for the case, I had to repeatedly replicate the issue and that helped me to notice two additional settings (non-default and not directly mandated by the VoWLAN design guide) that had to be made for the issue to appear:

1) I had the "low latency MAC" feature activated that supposedly helps voice deployments to scale better. When disabling that, the issue would no longer appear.

2) In the WLC Wireless QoS section, I had modified Wired QoS to make use of 802.1p (as mandated by the design guide), but had changed the Tag for Voice from 6 (the default) to 5, based on everything else I knew about the Cisco QoS baseline. It turns out that this was a failure, even though docs on what really happens here are scarce. There is a contrived statement in the VoWLAN guide that essentially boils down to "leave that value at 6, it's good this way and for a reason", but the reasons are complicated. Let's just say here that 802.11e has another mapping of voice and video classes to user_priority tags than has the Cisco baseline on the LAN side, and the WLC does a lot of magic to make this chasm transparent, but setting the wired side QoS 802.1p tag value to 5 (which *is* the value used on this side, so it seemed completely natural to choose) will partially break that. It should *NOT* break it in the way I observed (simply losing voice QoS would be the expected breakage that nevertheless would be disastrous), but to make the long story short, setting this value back to the default of 6 made the issue disappear, too.
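For reference, the 802.11e/WMM side of that mapping can be written down as a small lookup. A minimal sketch: the table is the standard WMM user-priority to access-category mapping, the function name `access_category` is made up for illustration, and nothing here models what the WLC does internally with the wired-side tag.

```python
# Standard WMM mapping from 802.11e user priority (UP) to access category.
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",   # background
    0: "AC_BE", 3: "AC_BE",   # best effort
    4: "AC_VI", 5: "AC_VI",   # video
    6: "AC_VO", 7: "AC_VO",   # voice
}


def access_category(user_priority):
    """Return the WMM access category for a user priority in the range 0-7."""
    if user_priority not in UP_TO_AC:
        raise ValueError("user priority must be in the range 0-7")
    return UP_TO_AC[user_priority]


voice_default = access_category(6)   # the default tag value discussed above
changed_value = access_category(5)   # the value it was changed to
```

On the wireless side, priority 6 falls in the voice category while 5 falls in video, which at least illustrates why the two values are not interchangeable even though 5 is the usual voice marking on the wired side.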

So I could work around the issue by simply fixing a misconfiguration that seemed to trigger that problem on less-tested code paths, and I could work around it by simply disabling the low latency MAC, which seemed a good idea anyway after some testing. These were completely orthogonal to the issue disappearing in WLC software 6.0 that just got released, so it's solved the real way, too.

Now what does all that mean for your problem which looks exactly the same but on autonomous APs? I'd say you should have a close look on what the APs call "STREAM", which is exactly the low latency MAC (aggressive queue length limitation) and 802.11e tag mapping stuff on the autonomous AP side. You have a lot more configuration options here compared to the WLC, for instance you can really configure the queue lengths for low latency by giving a number, it's not just magically and mostly undocumentedly pulled down to 4. But keep in mind that just choosing "Optimized Voice" from the general QoS page (Radio X Access Categories tab) will automatically activate low latency MAC settings in the STREAM page without the admin necessarily noticing that (I had it active on an autonomous AP of mine, without ever noticing the STREAM page before).

So I'd say you could have some testing around these exact settings, disabling the low latency MAC behaviour, and could well get rid of the issue even with current IOS and WMM/802.11e generally enabled. What we both have seen IMHO lingers somewhere there in the queue limiting code overshooting and killing essentially every frame that was trying to leave the interface, somehow concluding it was too late to make sense at the receiver.

HTH & Thanks,

Andre.

jbarger Mon, 07/13/2009 - 08:55

Abpsoft

Damn man! You said a mouth full! :)

I never went back to get this working better. Those phones sit on the charger all day and hardly ever get used, as far as I know. From what I remember, all that the new changes give these phones is a lower power mode when they are not on a call.

It is great to know exactly how to solve it, though. If I ever move to WCS instead of WLSE, I will fix it at that time.

Just to be certain: you are saying that enabling dot11e and U-APSD and disabling "low latency MAC" should solve the issue as well, and give all the benefits of dot11e/WMM?

Thanks,

abpsoft

Joe

migilles Fri, 07/17/2009 - 16:28

Low latency MAC is a term used on the WLAN controller, which limits the # of retries for packets sent from that queue.

In autonomous world, this is called STREAM, where if enabled, it can reduce retries to 3. So if you have 12-54 enabled on the AP, then it may not downshift to 12 to reach the client if it is out on the edge.

It is not advised to enable low latency mac or STREAM. The point about LLM will be in the next version of the 7921G DG.

The note about setting the 802.1p tag to 6 is already there.

But for autonomous config, just ensure WMM is enabled. No need for CAC as it's not load based. Don't enable STREAM and QBSS is optional.

More config info is in the 7921G Deployment Guide @ http://www.cisco.com/en/US/docs/voice_ip_comm/cuipph/7921g/6_0/english/deployment/guide/7921dply.pdf
