I posted this yesterday but deleted the thread, thinking I had fixed the issue. Alas, I was wrong. In summary, I have a scenario where I am doing wired 802.1x and also wired MAB/CWA. The issue is that a number of external/BYOD hosts have supplicants configured for 802.1x at their "home" organisations, which for obvious reasons can't authenticate on this network. The idea is that MAB and CWA act as the fallback, but the hosts in question don't fail over to MAB efficiently.
If the host has "validate server certificate" enabled (and doesn't have our root selected), then 802.1x fails and falls back to MAB as per the tx timers etc. Hosts that don't validate certificates essentially fail authentication, abandon the EAP session and start a new one, and this process seems to continue for a very long time.
Does anyone have any similar experiences, and if so, can you provide some info? I am looking into tweaking the 802.1x port timers to make this fail quicker/better, but I am not confident this will fix the issue.
Thanks in advance
Please see page 370 of the document at the link below.
What am I looking at on that page? For what it is worth, this is now a TAC case, as the issue is not present on all switches in the environment. I will update this thread accordingly.
Maybe the held-period and quiet-period parameters would help. I would not change the tx-period to anything shorter than 10 seconds. Every Cisco doc that I have ever seen makes this same recommendation, and I can tell you from experience that if you go lower than 10 seconds, you will at times have devices authenticating via MAB when you don't want them to.
Read this doc for best practices, including the timers listed below.
I hope this link works. http://d2zmdbbm9feqrf.cloudfront.net/2014/eur/pdf/BRKSEC-3698.pdf
If not, go to www.ciscolive365.com (sign up if you haven't already) and search for
"BRKSEC-3698 - Advanced ISE and Secure Access Deployment (2014 Milan) - 2 Hours"
Change the dot1x held-period, quiet-period, and ratelimit-period timers to 300.
held-period: Configures the time, in seconds, for which a supplicant will stay in the HELD state (that is, the length of time it will wait before trying to send the credentials again after a failed attempt). The range is from 1 to 65535. The default is 60.
quiet-period: Configures the time, in seconds, that the authenticator (server) remains quiet (in the HELD state) following a failed authentication exchange before trying to reauthenticate the client. For all platforms except the Cisco 7600 series Switch, the range is from 1 to 65535. The default is 120.
ratelimit-period: Throttles the EAP-START packets that are sent from misbehaving client PCs (for example, PCs that send EAP-START packets that result in the wasting of switch processing power). The authenticator ignores EAPOL-Start packets from clients that have successfully authenticated for the rate-limit period duration. The range is from 1 to 65535. By default, rate limiting is disabled.
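For anyone who wants to see what that looks like on a port, here is a rough sketch of the three timers set to 300 using the interface-level "dot1x timeout" command. The interface name is just a placeholder and not something from this thread, and the tx-period line only reflects the "no lower than 10 seconds" advice above. Keyword availability can differ by platform and IOS release, so treat this as an assumption to check against your own switch rather than a tested config.

```
! Sketch only: GigabitEthernet1/0/1 is a placeholder interface name.
interface GigabitEthernet1/0/1
 ! Keep tx-period at 10 seconds or higher, per the advice above.
 dot1x timeout tx-period 10
 ! The three timers described above, set to the suggested 300 seconds.
 dot1x timeout held-period 300
 dot1x timeout quiet-period 300
 dot1x timeout ratelimit-period 300
```

Note that ratelimit-period only affects EAPOL-Start packets from clients that have already authenticated successfully, so on its own it may not help with a client that keeps failing and restarting.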
Thanks for your reply, Justin. I have already altered all of the timers you mentioned, based on the same Cisco Live slides and session. Ultimately, though, I think I am up against some sort of bug, be it switch or client. Basically, I can't replicate this issue in my lab. The long and short of it is that the EAP session is abandoned and the client starts again, which seems to nullify the timers. When I get a response from TAC I will update this thread.
Again thanks for your input.
Have you been able to resolve this? I am pretty much getting the same thing when my switches are running IOS 12.2 but when running IOS 15 everything is fine.
I am in the process of providing additional logs to Cisco TAC and will update when I know more. I am finding the issue on different switches, but what is strange is that I see different results on switches running the exact same code on slightly different hardware:
3750 running 12.2.58 - This has the issue (this code is bug ridden and not an accurate test)
WS-C2960-24PC-L - 15.0(2)SE5 - This has the issue
WS-C2960-48PST-L - 15.0(2)SE5 - This does not have the issue
Pretty annoying issue so far to be honest.
No additional information from TAC, but I am starting to come around to the idea that this is more than likely a supplicant-related issue. Long story short, I did more in-depth testing yesterday (the remote location and remote workers make it difficult) and found that the issue is very much linked to a specific subset of hosts and their SOE build. Put simply, my machine and other machines configured in the same way as the above-mentioned hosts are able to connect.
Nonetheless, I will await the TAC response, but I think this is more than likely a client PC issue.
After sending a packet capture to TAC, I was told my switch running IOS 12.2(58) was dropping EAP packets for my phones that had dot1x turned on but were only configured for MAB in ISE.
Here are the responses from TAC:
"It looks like I may have found your bug.
Bug ID: CSCsr55949"
"I looked at your packet capture, and I can see the phone initiating EAP, and ISE passing back the challenge, and then one minute of timeout, and the phone passing a new EAP request. I am not seeing the EAP notification and answer from the challenge on the phone.
The switch is dropping those EAP packets.
You will want to get away from that bug by upgrading that code."
I am now running 15.0(2)SE5 on both WS-C2960-24PC-L and WS-C2960-48PST-L with no issues.