I have just finished upgrading my WLC 4404s to the newest version (bootloader and all), and am running into a series of strange errors.
The initial upgrade occurred on May 8th, and ever since then my WCS (188.8.131.52) has been swarmed with minor security errors related to the access points.
I have 129 APs in my network, and so far about 60 have shown up with these errors.
The errors state:
"Client '<MAC1>' which was associated with AP '<MAC2>', interface '1' is excluded. The reason code is '1(802.11 Association failed repeatedly)'".
"Client '<MAC1>' which was associated with AP '<MAC2>', interface '1' is excluded. The reason code is '3(802.1X Authentication failed 3 times.)'"
I have tested some of the APs listed and am having no trouble associating or authenticating (though I have yet to test them all).
The strange thing is, <MAC1> in both error message types is actually one of my APs' MAC addresses, whereas <MAC2> is the client's MAC (as far as I know). So it looks like the MACs are being listed in the wrong order. I checked the caveats in the release notes for this version but found nothing on this apparent bug.
I know it seems like a typical authentication error (with the odd MAC placements as the exception), but these just started coming up after the upgrade - mainly wondering if that's just coincidence or a bug in the version?
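For anyone sorting through a large number of these alerts, a small parser can help check how often the "client" field really holds an AP MAC. This is only a sketch: the regular expression is written against the two message formats quoted above, and the function name `parse_alert`, the `known_ap_macs` set, and the MAC addresses in the usage note are hypothetical.

```python
import re

# Pattern written against the two WCS exclusion alerts quoted in this post.
ALERT_RE = re.compile(
    r"Client '(?P<client>[^']*)' which was associated with AP '(?P<ap>[^']*)', "
    r"interface '(?P<iface>\d+)' is excluded\. "
    r"The reason code is '(?P<code>\d+)\((?P<reason>[^)]*)\)'"
)


def parse_alert(line, known_ap_macs):
    """Return the alert fields as a dict, or None if the line does not match.

    known_ap_macs is a set of lowercase AP MAC strings (hypothetical input,
    e.g. exported from the controller's AP list).
    """
    match = ALERT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Flag alerts where the value reported as the client is really an AP.
    fields["swapped"] = fields["client"].lower() in known_ap_macs
    return fields
```

Calling `parse_alert(line, {"00:0b:85:aa:bb:cc"})` on an alert whose client field is `00:0B:85:AA:BB:CC` would set `swapped` to `True`; the reason code is returned as a string in `code`.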
Thanks in advance for anyone's input.
Can you confirm if you see this error message for only 2 particular clients or for multiple clients?
Also can you just double confirm if the security configuration on client matches the WLAN security?
It's more than 2 clients, seems to be just random people trying to connect as far as I can tell - random APs also.
Unfortunately I can't confirm the security settings on the client laptops as they are not supplied by the education facility I work at (they're personal computers). I imagine it is possible that it could be a client misconfiguration problem. We have just started a new term, so there's a good possibility a lot of new laptops are on campus. I'll have to look into it.
The APs are on the same software version as the WLCs, but they are on different boot versions (since 184.108.40.206 the software and boot images are in different files). I loaded the new boot image this morning on the controllers but it doesn't look like it pushed through to the APs. Do you think that could be the cause of the WCS thinking the APs are clients?
The boot image is still only 220.127.116.11 on all the APs I looked at. Are these supposed to update?
Thanks for the replies guys.
I'm still having this issue and haven't had any luck finding a resolution.
Another strange error has started to pop up and the more I think of it, the more I think it may be related. This error is being listed as Critical (whereas the one I previously mentioned was just Minor).
The IP that's being reported as duplicate is actually the IP address of the AP that reported it. This looks like it may be an ARP flood, caused by problems with associating maybe (the original error)?
Any thoughts to point me in the right direction?
Also does anyone know if the new upgrade has any extra default settings that weren't present in previous versions that may have snuck into the configuration?
I poked around on our WCS and lo & behold, we are seeing the same error!
Are you pushing the AP address out via the WLC dhcp or by dedicated dhcp server(s)?
Also, what are your lease times?
Which error were you referring to, the one with 00:00:00:00:00:00 as the MAC?
We use dedicated DHCP, lease time of 24hrs.
I did a SPAN session on our switchport connected into one of the access points and found out that these actually -are- a series of ARP requests. Looking at the packet info was quite strange.
Basically, the packet layout on Ethereal for ARP is:
Sender MAC address
Sender IP address
Target MAC address
Target IP address
The Sender MAC address was the AP, sender IP and Target MAC were all 0's, and the Target IP was actually the existing AP's IP address (from DHCP).. it's very, very strange.
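For reference, the field pattern described here (sender IP all zeros, target MAC all zeros, target IP set to the sender's own address) matches what RFC 5227 calls an ARP probe, which a host sends to check whether an address is already in use. Whether that is what the AP firmware intends is not confirmed anywhere in this thread. The sketch below decodes a raw 28-byte Ethernet/IPv4 ARP payload and tests for that pattern; the function names are hypothetical.

```python
import ipaddress
import struct

# Ethernet/IPv4 ARP payload: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes total).
ARP_FMT = "!HHBBH6s4s6s4s"


def parse_arp(payload):
    """Decode the first 28 bytes of an ARP payload into a dict."""
    (_htype, _ptype, _hlen, _plen, oper,
     sha, spa, tha, tpa) = struct.unpack(ARP_FMT, payload[:28])
    return {
        "oper": oper,  # 1 = request, 2 = reply
        "sender_mac": sha.hex(":"),
        "sender_ip": str(ipaddress.IPv4Address(spa)),
        "target_mac": tha.hex(":"),
        "target_ip": str(ipaddress.IPv4Address(tpa)),
    }


def is_arp_probe(arp):
    """True if the decoded packet has the RFC 5227 probe field pattern."""
    return (
        arp["oper"] == 1
        and arp["sender_ip"] == "0.0.0.0"
        and arp["target_mac"] == "00:00:00:00:00:00"
    )
```

The payload bytes would come from a capture export; `bytes.hex(":")` needs Python 3.8 or later.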
It basically seems like the APs don't recognize their own address and are listing it as a 'duplicate' or even a rogue. I'm also getting a few Fake AP attack messages on my WLCs - not sure if that's a coincidence.
Yes, the one with 00:00:00:00:00:00 as the MAC error.
Our leases are served up from 2 WLC 4402s (split scope) and are several days in length.
Have you tried resetting the port that the AP is on? (assuming PoE in use)
Sorry, our DHCP lease is actually for a few days, not just 24hrs.
I have reset an AP and rebooted it post-upgrade before for basic testing, but it is still showing up in the error log with this problem.
Also, each AP was broadcasting an ARP every 10 seconds, which is pretty frequent.. have you noticed this as well? I guess this goes back to my previous question a few posts up where I was wondering if any 'extra configs' were set on this new version.
Looking further into it as I was typing this, I'm also not noticing any ARP replies to the specific AP that I'm running SPAN on. Possibly a problem with the WLCs handling ARP? I have verified that the AP connected into the port I'm monitoring did send out one of these requests as well.
Haven't dug into it yet, will start tomorrow.
Which APs are you using? We are using 1020 (Airespace). I would be curious to see if there is a difference between the LWAPP-native APs (10x0 series) and Aironet-based APs running in LWAPP mode.
I'm using 1010's and 1020's.
I'm starting to lean towards opening a TAC case on this topic to see what they have to say.
No I haven't. Are you seeing anything else out of the ordinary concerning the WLC error logs themselves on your end?
No, nothing on the controllers. I haven't had a chance to place a sniffer on an AP port yet, a bit curious about the ARP stuff though... I do know that the AP will proxy through the switch to the controller, which is why config lines like switchport port-security have no effect at all on the APs.
I haven't seen any more errors on the WLC either, but then again, I backed up the data, wiped the server, put RHEL 4 on it (had RHEL 3 on it before, Cisco does not support WCS 4.x on RHEL 3) and then did a restore. So far so good.
Hmm that's interesting. Maybe that was the actual problem with RHEL 3 not being supported.
Do you think RHEL 5 is supported, or would upgrading to 4 be the best bet? The release notes on WCS4.0 don't mention a 'minimum' of RHEL 4.
Also, have you had a chance to confirm on your end that the APs are not sending ARP requests since the upgrade?
As long as /etc/redhat-release says
"Red Hat Enterprise Linux ES release 4"
(could say AS as well) it won't complain on the install. I have it running on CentOS 4.4 just fine. The only caveat of not using the Cisco-recommended Red Hat OS is lack of TAC support.
Hey, just wondering if you had received any of those errors yet since the upgrade?
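A quick way to pre-check a server before running the installer is to test the release string yourself. This is a sketch only: how the WCS installer actually performs its check is not documented in this thread, so the pattern below simply encodes the string quoted above, and both function names are hypothetical.

```python
import re

# Assumed pattern, based only on the release string quoted in this post;
# the installer's real check may differ.
RELEASE_RE = re.compile(r"Red Hat Enterprise Linux (ES|AS) release 4\b")


def release_looks_like_rhel4(text):
    """True if the release text contains an ES or AS release 4 banner."""
    return RELEASE_RE.search(text) is not None


def check_release_file(path="/etc/redhat-release"):
    """Read the release file and apply the same test."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return release_looks_like_rhel4(handle.read())
```

`check_release_file()` returning `False` only means the banner differs from the quoted string; it says nothing about whether the required libraries are present.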
I'm planning an upgrade to RHEL4 soon (I was also on 3), so hopefully it resolves the issue for me, too.
We're upgrading the server today, but I just wanted to share that we (my coworker and I) may have actually found the source of the problem.
We took a closer look at the problem and realized that our DHCP option 43 was pointing to the management interfaces of our controllers. We changed it to point to the ap-manager interfaces (same subnet), and there haven't been any recurring errors of that type since (it's been 13 hours).
Mind you, this is on the controllers themselves - the WCS is currently disabled/shutdown.
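For anyone editing option 43 on a dedicated DHCP server, the value has to be encoded in the form the AP model expects. To my understanding of Cisco's option 43 documentation, 1000-series APs take a comma-separated ASCII list of controller addresses, while Aironet APs in LWAPP mode take a type-length-value string with type 0xf1; please verify against the documentation for your AP model before relying on this. The sketch below builds both forms; the function names and the addresses in the usage note are hypothetical.

```python
import ipaddress


def option43_tlv_hex(controller_ips):
    """Build the TLV form as a hex string: type 0xf1, length, packed IPv4s.

    Assumption: this is the format used by Aironet APs in LWAPP mode.
    """
    packed = b"".join(ipaddress.IPv4Address(ip).packed for ip in controller_ips)
    # Length byte is 4 bytes per controller address.
    return (bytes([0xF1, len(packed)]) + packed).hex()


def option43_ascii(controller_ips):
    """Build the comma-separated ASCII form.

    Assumption: this is the format used by 1000-series APs.
    """
    return ",".join(str(ipaddress.IPv4Address(ip)) for ip in controller_ips)
```

With two example addresses, `option43_tlv_hex(["192.168.10.5", "192.168.10.20"])` gives `f108c0a80a05c0a80a14`, and `option43_ascii` gives `192.168.10.5,192.168.10.20`. Which controller interface the addresses should belong to is a separate question from the encoding.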
We opened up a TAC case on this problem after our upgrade to 4.1.171 and were told that there is a known issue:
"client deauth events show up with client & ap mac interchanged
It is filed against the WLC & is as yet unresolved. I will plan on linking my case to the bug & letting you know when the fixed code is available!"
Apparently, the parameters between the WLC and the WCS are getting swapped.
Currently, our case is in "Release Pending".
As to which release - that is anyone's guess.
RHEL5 did not work out for us, even when tricking out redhat-release. You'd have to be patient enough to get and install a bunch of compatibility libs, even beyond the libc5-compat packages. TAC pretty much said they had no clue when or if RHEL5 will be supported.
Just updating this to say that I have fully upgraded my WCS to 18.104.22.168 along with RHEL 4.0. The errors were gone for a while (Duplicate IPs from MAC 00:00:00:00:00:00), but they are returning again.
One thing I have noticed is that my ACLs are not actually permitting ICMP packets through (I noticed when trying to do linktests to the APs), would this possibly be the cause for these errors?
Is there anything else that a blocked ICMP protocol would affect directly? I'll most likely be opening the ports anyways, but I would just like to know if this will fix/prevent other issues.
From what I understand, the ARP traffic is a mechanism used by the controllers to detect Rogues, so that's normal - however is it possibly tied into creation of the duplicate IP messages?
I had the same problems until I disabled all the options under "security", "client exclusion options".
I'm not an expert, but this works.. %-(
Well I opened up ICMP, but I'm still getting the report of Duplicate IP addresses.
I was wondering, does anyone else use multiple subnets for their Access Points? Currently I have two 24-bit (Class C) subnets for the APs. Is there any reason why it would -not- be wise to do this, and instead supernet the two into a 23-bit?
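On the addressing side of that question: two /24 networks can only be merged into a single /23 if they are adjacent and the lower one starts on a /23 boundary. The sketch below checks this with Python's standard `ipaddress` module; the function name `try_supernet` and the example subnets are hypothetical, since the actual AP subnets are not given in this thread.

```python
import ipaddress


def try_supernet(net_a, net_b):
    """Return the single network covering both inputs exactly, or None.

    Relies on ipaddress.collapse_addresses, which merges networks only
    when they combine into one valid, aligned prefix.
    """
    nets = [ipaddress.ip_network(net_a), ipaddress.ip_network(net_b)]
    merged = list(ipaddress.collapse_addresses(nets))
    return merged[0] if len(merged) == 1 else None
```

For example, `try_supernet("10.10.2.0/24", "10.10.3.0/24")` yields `10.10.2.0/23`, while `try_supernet("10.10.1.0/24", "10.10.2.0/24")` yields `None` because the pair is adjacent but not aligned on a /23 boundary. This only covers the address arithmetic, not whether a larger broadcast domain is a good idea for the APs.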
This error is starting to bother me as I have no idea if this is impacting the performance of my access points - I can't find documentation on this error -anywhere- and so far I haven't had any luck with troubleshooting this.
Any input would be great at this point.
I contacted TAC and it appears as though this is a cosmetic bug in the system for this version. Bug ID CSCsj68456 has been submitted regarding this problem.