First, I couldn't get any debugs or log files due to the nature of the issue and the area it affected.
Basically, yesterday, while hanging some 3602s that join a different controller, I started getting task reports of wireless phones not working. Because of where the complaints came from, I mistakenly assumed I was causing the havoc myself with the AP replacements. Then the issue started to spread, with carts, COWs, and various other devices unable to connect, until it started affecting me too. I was not able to connect to any SSID at all, which is kind of nerve-racking when you're talking about ICU/NICU areas in a hospital. Once I got to a wired device, my first thought was to move the APs to a different controller to at least try to get things back up, and then work on the controller to see what was going on. However, when I tried to move the APs, most of them wouldn't go. In hindsight, I wish I had just shut off the EtherChannel port on the 6500 that this WLC is on and forced the APs to move that way, but with people standing over me I ended up just rebooting the controller. Of course, once it came back up the APs reattached to it, but everything was working fine.
I went ahead and moved the APs off of it to another controller for now, but I am searching for answers. I'm about to start digging into bug reports, but I'm concerned that this code release is causing the issue and worried about moving to 7.2.110. Any thoughts or suggestions? Has anyone run into this issue?
Hmm, I was afraid of that. Only a small number of users have this issue, and so far TAC doesn't recognize it as a known issue, but we haven't done any troubleshooting yet, it being the 4th. How long have you been on the .110 code? Also, how long were you running the .103 code before experiencing the issue? Thanks for the response.
I first noticed the problem on Monday, 2 July. After a few attempts to fix it, I rebooted and upgraded to the latest .110 version. We had been running the .103 code for at least one month with about 160 APs connected. The msglog was full of these:
*apfMsConnTask_7: Jul 02 09:51:46.245: %APF-1-MOBILE_ENTRY_CREATE_FAILED: Could not create Mobile Station Entry. Unable to lock existing entry or create timers. Mobile:00:0e:35:6b:74:a1, Type: LWAPP AP. Mobile rejected.
*apfMsConnTask_7: Jul 02 09:51:46.244: %OSAPI-0-TIMER_CREATE_FAILED: Failed to create a timer.
From our support I received this answer:
Depletion of WLC's timers in an auto anchor scenario
WLC stops accepting any new client associations (or re-associations) due to depletion of timers.
WLC is running a 7.2 code version and auto-anchor is configured.
The following can be seen in the controller message log once it is in that state:
%APF-1-MOBILE_ENTRY_CREATE_FAILED: apf_ms.c:3472 Could not create Mobile Station Entry. Unable to lock existing entry or create timers
%OSAPI-0-TIMER_CREATE_FAILED: timerlib.c:535 Failed to create a timer.
%OSAPI-0-TIMERCB_ALLOC_FAILED: timerlib.c:493 Unable to allocate timer control block
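For anyone who wants to check a saved message log for these signatures, here is a minimal sketch in Python. It assumes the log has been saved as plain text with one message per line in the format quoted in this thread. The function names are hypothetical and this is not a Cisco tool. It only counts the three message mnemonics listed above.

```python
import re
from collections import Counter

# Message mnemonics quoted in this thread for the timer-depletion condition.
TIMER_SIGNATURES = (
    "APF-1-MOBILE_ENTRY_CREATE_FAILED",
    "OSAPI-0-TIMER_CREATE_FAILED",
    "OSAPI-0-TIMERCB_ALLOC_FAILED",
)

# Matches the "%FACILITY-SEVERITY-MNEMONIC:" part of a log line.
_MNEMONIC = re.compile(r"%([A-Z0-9_]+-\d-[A-Z0-9_]+):")


def count_timer_errors(lines):
    """Count how often each known signature appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = _MNEMONIC.search(line)
        if match and match.group(1) in TIMER_SIGNATURES:
            counts[match.group(1)] += 1
    return counts


def looks_like_timer_depletion(lines):
    """Return True if at least one OSAPI timer failure is present (assumed heuristic)."""
    counts = count_timer_errors(lines)
    return (
        counts["OSAPI-0-TIMER_CREATE_FAILED"] > 0
        or counts["OSAPI-0-TIMERCB_ALLOC_FAILED"] > 0
    )


# The two lines from the msglog excerpt earlier in the thread.
SAMPLE = [
    "*apfMsConnTask_7: Jul 02 09:51:46.245: %APF-1-MOBILE_ENTRY_CREATE_FAILED: "
    "Could not create Mobile Station Entry. Unable to lock existing entry or "
    "create timers. Mobile:00:0e:35:6b:74:a1, Type: LWAPP AP. Mobile rejected.",
    "*apfMsConnTask_7: Jul 02 09:51:46.244: %OSAPI-0-TIMER_CREATE_FAILED: "
    "Failed to create a timer.",
]

if __name__ == "__main__":
    print(dict(count_timer_errors(SAMPLE)))
    print(looks_like_timer_depletion(SAMPLE))
```

To run it against a real capture, pass an open file object instead of `SAMPLE`, for example `count_timer_errors(open("msglog.txt"))`, where the file name is only a placeholder.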
Reboot the WLC
5508 controller does not allow client connections, reboot required
Symptom: 5508 on 184.108.40.206 with close to 500 Access Points. After some time, all wireless clients cannot connect to the network.
However, I'm not aware that we are using auto-anchoring.
I upgraded to the latest FUS image when I installed 7.2.103, as the controller was brand new.
I was also wondering whether this might have something to do with the time correction that happened last weekend (a leap second was added, so one minute had 61 seconds, and a lot of systems couldn't handle that), but I'm not sure.
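As a small illustration of why a 61-second minute trips up software, here is a sketch in Python. It says nothing about what the WLC does internally, and the function name is hypothetical. It only shows that a common date/time type has no way to represent the inserted second, 23:59:60 UTC on 30 June 2012.

```python
from datetime import datetime, timezone


def try_build_leap_second():
    """Try to represent 2012-06-30 23:59:60 UTC with the standard datetime type.

    Returns the datetime if it can be built, or None if the library rejects it.
    """
    try:
        return datetime(2012, 6, 30, 23, 59, 60, tzinfo=timezone.utc)
    except ValueError:
        # datetime only accepts seconds in the range 0..59.
        return None


if __name__ == "__main__":
    result = try_build_leap_second()
    print("representable" if result is not None else "not representable")
```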