We have 9 AP541Ns in a cluster. There are two in an area where there is a high concentration of devices (high being potentially 50-ish on an intermittent basis when Parliament sits). We believe there are only about 10 devices trying to connect at present, and we can only see 6 connections in the console after the recent reboot. These two units had been running along without incident until this week, when the clients started a session (previously only 1-2 clients were present). It seems that first one unit crashed/rebooted, then the other one in the same area rebooted soon after. I am guessing the second one rebooted when the clients all tried to connect to it after the first one crashed. Others in the cluster have remained operational and unaffected. They are running 2.0 code. We have been through the "DHCP issue" and this new version seems to have cured that one, but the crashing under load is new.
They are pretty much default except for IP addresses obviously, a single vlan only, and using Radius mac authentication to a Microsoft RRAS server.
We logged a call with the Cisco helpdesk over other issues along the way (date and time resets, radius entries don't appear in the log until after an AP reboot) and got some person on the phone who eventually decided that they don't support Microsoft Radius, despite us trying to explain to him that it is nothing to do with the issue. He had no idea, no skills and didn't want to help. I am somewhat less impressed by Cisco now than I was. So....
Anyone got any ideas?
Hi, My name is Eric Moyers. I am a Network Support Engineer in the Cisco Small Business Support Center.
What was your case number? I would like to look into your issue a little further.
Cisco Network Support Engineer
Thank you for that info Leon and Glenn, going back I remember that case from when it came through the Center.
As far as this current issue, are those two constantly crashing? When they "crash", what happens next: do they come back on their own, or do you have to restart them? Are they completely down, or can you still connect to them but not get to the Internet? What exactly are your symptoms?
They appear to restart themselves when this happens, although it is not constant. They are in the Parliamentary Chamber, which is only loaded up on a sitting week, roughly every 3 weeks for a week or two, so it is difficult to get a real picture. What happened was that our users came around to IT saying they were being asked for the wireless password (which we don't tell them). By the time they get to us, it works, as there is an AP in our area. When we try to connect to the AP in their area, it won't respond to us via the HTTP admin interface. After some minutes, we eventually get back in. Checking the log, in this case the uptime was minutes, so we assume it had rebooted itself. The second one in that area followed suit a little later, which is why we assume it did the same thing when it suddenly got all the load. The log doesn't seem to give any clues as to events leading up to the reboot.
We have 9 of them in a cluster spread over the precinct, and it only seems to happen to the ones copping more load. Our snmp monitoring did not pick up the dropout, however it only checks every 30 seconds and has to miss 3 retries if I recall.
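As a rough worked example of why a short dropout could slip past that kind of polling, here is a minimal Python sketch. It assumes "miss 3 retries" means three consecutive unanswered polls at a fixed 30-second interval, and it ignores per-request timeouts; the function names are made up for illustration and are not part of any monitoring product.

```python
import math

def missed_polls_range(outage_s, interval_s):
    """Return (min, max) number of fixed-interval polls that can land inside an outage.

    The exact count depends on where the outage starts relative to the poll schedule.
    """
    ratio = outage_s / interval_s
    return math.floor(ratio), math.ceil(ratio)

def alert_outcome(outage_s, interval_s=30.0, misses_needed=3):
    """Say whether an outage of the given length is flagged always, sometimes, or never."""
    fewest, most = missed_polls_range(outage_s, interval_s)
    if fewest >= misses_needed:
        return "always"      # even the unluckiest alignment misses enough polls
    if most >= misses_needed:
        return "sometimes"   # depends on when the outage starts
    return "never"

for seconds in (45, 75, 90, 300):
    print(seconds, alert_outcome(seconds))
```

Under these assumptions, an outage of 60 seconds or less is never flagged, and only outages of 90 seconds or more are always flagged.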
We still do not have an answer for this. These units clearly have an unresolved issue where they reboot themselves after sitting idle for a time and then having a relatively small number of clients connect. We are at V2.0. I see V2.1 is out, but it mentions nothing about this issue or anything similar. We have 9 of them currently in a cluster running 2.4 GHz and doing MAC authentication via 2008 Server RADIUS, with MAC addresses in the username field of AD. It seems to happen to the ones in areas where there are a larger number of potential clients, and it happens after they have sat idle for a week or two and then clients start to connect. Interestingly, the one here in IT (part of the cluster) has yet to act up, but it almost always has a connection or two and never really has more than 2-3 try to connect.
The units also seem to lose their time when they reboot, even though there is an NTP server set. They don't log anything useful in their logs either.
Hopefully the issues will force Management to cough up dollars for something decent.
Just wondered if you ever got to the bottom of this issue. I'm having the same problem now but with just a single AP. It seems to just randomly reload.
Let me know if you got anywhere.
What is the firmware you're currently running? 2.0.0 or 2.0.1? How many clients are connecting to a single AP when this happens? Do you have any Syslog output?
I am running an AP541N on 2.0.1 firmware to supply wireless access for the whole support floor, with 4 VLANs, and I am monitoring my AP using the OnPlus agent.
We can have about 30-40 engineers connected at a given time. Also, for a quicker, high-priority resolution, please give the Cisco Small Business Support Center a call and open a support case (1-866-606-1866).
Nope, they still do it randomly, particularly after being idle for a period.
Jas - Only a few clients as far as we know, maybe 5-10 at the time they reboot. We logged a call with them, but once they ascertained that we use radius to authenticate via Microsoft AD (mac address as user name) the official answer was "Sorry, we don't support AD". After that they just closed the call and refused to work on it any further unless we turned off authentication.
The answer from our Cisco rep. is that "they are Linksys, not Cisco"... strange, they have a Cisco logo emblazoned on them.
I am afraid I am much less impressed with Cisco than I used to be and will think twice before I choose Cisco for our main network and wireless upgrade (which is currently my primary project). I don't mind that there are bugs, but the support attitude I cannot tolerate.
Wow, thanks for the update Glenn.
Jas, here are the details of my issue:
I have a single AP541N-K9 running AP541N-K9-2.0(1). It is running 2 SSIDs with just basic WEP. This is a new installation. The reboot problem became apparent when our monitoring tool reported that the AP kept rebooting every few hours. This was confirmed by checking the uptime on the AP. I noticed some errors in the log relating to the Bonjour service, so I turned that off and the errors stopped. Now the monitoring tool has reported that the AP has rebooted twice in the last 24 hours; however, the uptime on the AP does not reflect that, and it appears to have remained up. However, while I was on the AP via the web interface this morning, I noticed that it stopped responding. I tried an SSH connection to it as well and that also failed, but I could ping it. I was locked out for about 5 minutes and then it came back to life. I then got a message on the monitoring tool saying it had rebooted, but again the AP uptime suggested it hadn't rebooted. The symptom from the client point of view is that they do get disconnected during the time that the monitoring tool loses connection to it. The logs show continuous "deauthed from BSSID (mac address) reason 3" messages every 5 to 10 mins for the 3 clients that are using it. However, the issue also occurred during the night when there was no-one using it at all.
Any suggestions would be great.
Call into the support center and open a support case @ 1-866-606-1866. We’ll need to gather all information like topology, configuration files, logs, etc., so we can analyze your situation, start possible troubleshooting steps, and find a solution for you.
Was this issue resolved? We have a handful of these devices and they all experience the same symptoms (freezing, etc.), and then our monitoring software reports that the unit reboots. We have opened several cases with Cisco and have tried everything that was suggested (uncluster the units, downgrade from 2.0.2 to 2.0.1, reset the unit and rebuild from scratch, and even replace the hardware). Also, we were told that clustering isn't supported on these units (even though it is an available feature). Can anyone provide any color on this?
I love these units, but these issues are keeping us from purchasing more.
Hi, My name is Eric Moyers. I am a Network Support Engineer in the Cisco Small Business Support Center. Thank you for using the Cisco Community Post Forums.
Not sure who told you that clustering is not supported, but it is one of the main features/selling points for this device. What case number/SR number were you given? I would like to review your case(s).
Cisco Network Support Engineer
SBSC Wireless and Surveillance SME
Yes, we still have this problem. I have a feeling that it's when the iPhones/iPads try to connect and request an IP address.
Working for Cisco, I can tell you that Cisco is very committed to customer service, even if certain engineers fail to give great service to our customers; generally those engineers don’t last long and are no longer with us. If I judged all companies by certain engineers I spoke with, then I definitely wouldn’t still be with the service provider that I have to this day. I did look up your case to see who worked with you and, in line with my statement above, the engineer should have supported or assisted with the AP541N, while advising that we don’t support the configuration on your clients and RADIUS server. The issue you describe should have been supported and looked into further. It should not have been handled as you describe.
As Leon said, we still have the problem. Most of the 10 APs reboot randomly with no obvious pattern, except perhaps a period of inactivity, then a little load, and off they go. It may be iPads connecting that cause it, but we can't be sure. We haven't loaded 2.0.2, but somebody else here said they have.
We are looking at replacing the whole infrastructure here shortly anyway so have basically given up trying to get it fixed.
Suffice it to say that the wireless infrastructure going forward is unlikely to be Cisco given our experience. And before anybody says it, I know they are not "real" Cisco, but they have a Cisco badge on them and that was one reason I chose them.
We are also experiencing this problem.
We have 10 APs and one of them has been rebooting by itself during the middle of the night on a daily basis, as reported by our monitoring tool. It is running on firmware version AP541N-K9-2.0(2). One thing weird though is that the system uptime says it has been up for the last 29 days.
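Several posts in this thread describe the same mismatch: the monitoring tool reports a reboot while the AP's own uptime says it never went down. A minimal Python sketch of how a poller could separate the two cases is below. It assumes the poller records the device's reported uptime in seconds on each poll, or `None` when the poll got no reply; the function name and sample values are hypothetical. Note that if the uptime comes from SNMP `sysUpTime`, that counter measures time since the network management portion of the system was re-initialized, so a restart of the SNMP agent alone could look like a reboot; whether that is happening on these units is not established in this thread.

```python
from typing import List, Optional

def classify_gaps(samples: List[Optional[float]]) -> List[str]:
    """Classify events from a series of uptime readings (None = poll unanswered).

    "reboot": the reported uptime went backwards between two answered polls.
    "hang":   polls were missed, but uptime kept counting across the gap.
    """
    events = []
    last_uptime = None
    in_gap = False
    for uptime in samples:
        if uptime is None:
            if last_uptime is not None:
                in_gap = True          # device stopped answering after being seen
            continue
        if last_uptime is not None:
            if uptime < last_uptime:
                events.append("reboot")
            elif in_gap:
                events.append("hang")
        last_uptime = uptime
        in_gap = False
    return events

# Unanswered polls, then uptime continues: unresponsive but not rebooted.
print(classify_gaps([100, 130, None, None, 400]))
# Unanswered polls, then uptime restarts from a small value: rebooted.
print(classify_gaps([1000, None, None, 20]))
```

Comparing uptime across the gap, rather than treating every missed poll as a reboot, is the design choice that distinguishes a frozen management interface from a real restart.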
We have about 150 of these devices deployed, with random ones 'slowing down' and eventually rebooting. We have been working with Cisco on this since February of this year (4 months) and still no resolution has been found. It appears that using RADIUS/MAC authentication triggers some kind of memory leak that causes this issue. I am not the expert on this, but I have been sending logs showing how the CPU spikes prior to the device rebooting. It's hard to get logs like this since the device locks up prior to rebooting. We are still working with them in hopes of finding a resolution for this issue. We only get updates every other week, so I don't expect an answer anytime soon.
Output of top during a quiet early morning (192.168.105.3):
Mem: 45088K used, 11492K free, 0K shrd, 9152K buff, 16192K cached
Load average: 4.00 4.00 4.00
PID USER STATUS VSZ PPID %CPU %MEM COMMAND
421 root RW 1256 144 99.4 2.2 hostapd
485 root RW 1452 483 0.1 2.5 top
380 root SW 9244 144 0.0 16.2 snmpd
144 root SW 5792 1 0.0 10.1 dman
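To make captures like the one above easier to compare across many APs, here is a small Python sketch that parses the process table and picks out anything pinning the CPU. The rows are copied from the post; the function names and the 90% threshold are arbitrary choices for illustration, not anything Cisco provides.

```python
TOP_OUTPUT = """\
PID USER STATUS VSZ PPID %CPU %MEM COMMAND
421 root RW 1256 144 99.4 2.2 hostapd
485 root RW 1452 483 0.1 2.5 top
380 root SW 9244 144 0.0 16.2 snmpd
144 root SW 5792 1 0.0 10.1 dman
"""

def parse_top(text):
    """Turn the process table into a list of dicts keyed by the header names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        # Limit the split so a COMMAND containing spaces stays in one field.
        fields = ln.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows

def busiest(rows, threshold=90.0):
    """Return (command, cpu%) for every process at or above the CPU threshold."""
    return [(r["COMMAND"], float(r["%CPU"]))
            for r in rows if float(r["%CPU"]) >= threshold]

print(busiest(parse_top(TOP_OUTPUT)))
```

For this capture the only process over the threshold is hostapd, the daemon that handles client authentication on the AP, at 99.4% CPU.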