Issues with roaming between and general connectivity with (Cisco) Meraki APs
We have a building with 3 floors (1st floor is manufacturing, 2nd floor is offices, as is the 3rd), with many laptops that frequently travel around to meetings, whiteboard sessions, and so on. We have about 175-200 clients active at any given time, all of which are using our "corporate" SSID with RADIUS authentication, handled by 2 x Aruba ClearPass Policy Manager appliances. We have 21 Meraki APs (mix of MR12, MR16, and MR24, though mostly MR16) handling all of these roaming devices. Our environment is small enough that we have a single /24 for this SSID, so we are not doing any kind of Layer 3 roaming (though you could argue we're doing Layer 2 roaming, between APs). We do not have any MX appliances on-site (I know these are popular with environments that require a concentrator, but I don't see the use for this in our case).
In the past, though it seems to be getting worse as time goes on, we've had intermittent issues with devices either being completely disconnected when roaming from one AP to another (then reconnected a few seconds later, dropping all network connections), or (even more common) sitting in a conference room or workspace and seeing their signal strength jumping from 4 bars, to 1 bar, to 4 bars, and so on. Along with these issues, we see lots of "802.1X deauthentication", along with "802.11 disassociation - unknown reason" messages in the logs. Rebooting APs and running standard packet captures (as Meraki support does) really hasn't produced anything helpful. We have some traditional Cisco WLC 5508s in service as well for some of our other locations, which are providing remote LWAPP APs with connectivity, using the same SSID and RADIUS configuration. We don't appear to have any of these random drops in signal, or trouble with roaming, when clients are using those. I should also add that we're using SAP at times, and the number of client disconnects we get using the Meraki-provided network is ridiculous (to the point where we simply advertise that our users cannot reliably use wireless to access SAP resources).
I know there have been several discussions on Spiceworks about roaming, client roaming aggressiveness, different wireless band selections, and so on. Unfortunately, we've tried many of these tweaks (that is, disabling Band Steering, setting Roaming Aggressiveness on clients to low, medium, and high, and even reducing the number of APs in our environment to avoid over-saturating with available points our clients can associate to), and haven't had much success with any of them. As others have found, Meraki support has been only marginally helpful, citing knowledge base articles and best-practice theories.
So, my questions for anyone willing to answer/add input are:
Is anyone else seeing similar issues, particularly with a similar setup (Meraki RADIUS with 3rd party [Aruba] appliances)?
What have you discovered in your own environment, with possible fixes or workarounds?
From some reading I've done, Meraki seems to handle "true" Layer 2 roaming in a different way than other vendors (Aerohive, Ruckus, Cisco WLC, to name a few). What behaviors/success have you seen in your environments, with true roaming, using Meraki APs?
We're heavily invested in Meraki as a platform, so changing to a different solution entirely simply isn't an option for us (though I concede that at least in terms of roaming, the older Cisco WLC platform does a far better job).
Any help or input you can provide is greatly appreciated. Many thanks in advance.
At present, the people replying on this support platform are mostly Cisco product specialists, unless the question is about the technology in general. Meraki is also a Cisco product now, but to find Meraki experts at present, the other support forum you mentioned would be able to help more.
In general, for a Layer 2 roam, the client should look for the target AP and send it a reassociation request. If there are no coverage holes, the target AP should go directly into the 4-way handshake, which takes much less time than a full 802.1X authentication, and the client should retain the same IP address as well.
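To put rough numbers on that difference, here is a minimal Python sketch. It is not measured data: the function name `estimate_roam_ms`, the round-trip times, and the count of EAP round trips are all assumptions chosen for illustration, and real exchanges vary by EAP method and client.

```python
def estimate_roam_ms(full_8021x, air_rtt_ms=2.0, radius_rtt_ms=5.0, eap_round_trips=10):
    """Rough roam-time estimate in milliseconds (all defaults are assumed values)."""
    # Reassociation request/response: one round trip over the air.
    reassoc = air_rtt_ms
    # 4-way handshake: four messages, i.e. two round trips over the air.
    handshake = 2 * air_rtt_ms
    if not full_8021x:
        # Fast path: reassociation followed directly by the 4-way handshake.
        return reassoc + handshake
    # Full 802.1X: each EAP round trip crosses the air and then goes to the RADIUS server.
    eap = eap_round_trips * (air_rtt_ms + radius_rtt_ms)
    return reassoc + eap + handshake


if __name__ == "__main__":
    print("fast roam  :", estimate_roam_ms(full_8021x=False), "ms")
    print("full 802.1X:", estimate_roam_ms(full_8021x=True), "ms")
```

With these assumed values the full authentication costs more than ten times the fast path, and the gap widens as the RADIUS round trip grows.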
I have seen this a bit with Meraki in the EDU space (that is where Meraki is used heavily). Unfortunately I do not have a solution to the problem. I do know it is focused on the 802.1x WLANs and does not seem to affect the other types of authentication. I feel like it may be related to a timeout issue to the RADIUS server. Implementing a secondary RADIUS server has helped. Also make sure you enable fast reconnect under your PEAP settings on your wireless profile. This example is specific to Microsoft NPS:
Thanks for your thoughts, Nathan. We do actually have the "Enable Fast Reconnect" option selected on our wireless profile. Good idea, though.
We did also (originally) have 2 RADIUS servers defined within our wireless network. What we discovered was that each Meraki AP will try each one in order, top-to-bottom, and then primarily use the server that responded to it first. So, if for any reason you have a short-lived issue with your local RADIUS server responding to requests, and the AP is able to talk to a remote RADIUS server (in our case, one on the other side of the world) instead, the AP will elect to use the remote RADIUS server instead. In our case, the latency is high enough between these APs and this remote RADIUS server that while a client is roaming between APs, and having to re-authenticate, the entire process breaks down because (1) the client is moving between APs faster than the remote RADIUS server can authenticate the client, and (2) the entire exchange and communication ends up timing out -- thus forcing a manual re-connect. This is not a common occurrence by any means, but I just wanted to share what made us later choose to define only 1 RADIUS server, in the network settings. Surely our circumstance here is rather unique, but I thought it might be worth mentioning. Having only 1 RADIUS server defined forces ALL of our APs to use the same RADIUS server, regardless of anything else. It has resulted in a much smoother re-auth process for our clients.
I appreciate the link you sent, however. If I come across anything else that is helpful, I'll certainly post it back here. I appreciate your input once again!
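For anyone who wants to reason about the RADIUS behavior described above, here is a small Python sketch of it. It is only a model of what we observed, not Meraki's actual selection logic: the `RadiusServer` class, `pick_server`, `reauth_ms`, the latencies, the number of EAP round trips, and the one-second budget are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RadiusServer:
    name: str
    rtt_ms: float           # assumed AP-to-server round-trip time
    reachable: bool = True  # False models a short-lived outage


def pick_server(servers):
    """Try servers top-to-bottom and keep the first one that responds."""
    for server in servers:
        if server.reachable:
            return server
    return None


def reauth_ms(server, eap_round_trips=10):
    """Estimated re-authentication time: one server round trip per EAP exchange."""
    return eap_round_trips * server.rtt_ms


if __name__ == "__main__":
    budget_ms = 1000.0  # hypothetical time before the client gives up or moves on
    local = RadiusServer("local", rtt_ms=5.0, reachable=False)  # brief local hiccup
    remote = RadiusServer("remote", rtt_ms=300.0)
    chosen = pick_server([local, remote])
    cost = reauth_ms(chosen)
    print(chosen.name, cost, "ms", "OK" if cost <= budget_ms else "exceeds budget")
```

With only one server in the list, `pick_server` can never fall through to the high-latency server, which is the effect we were after.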
Re: Issues with roaming between and general connectivity with (Cisco) Meraki APs
Hey, we have had similar roaming issues on our Meraki infrastructure, especially with Apple TVs and AirPlay streaming, which we use heavily in our school.
Apart from the generic responses from Meraki Support, we really didn't get much help. Our older Cisco WLC infrastructure worked much better for our Apple TVs. But since some upgrades, we have noticed that the wireless is actually more stable.
I have written a more detailed post on my blog regarding this.