
radius servers continually failover throughout the day on 4400

btenney
Level 1

I have some concerns about messages on both our IAS servers (Windows 2000 Server) and the 4400 controller itself (3.2.116.21). I would like to know if anybody has any thoughts on this.

I have noticed a lot of these messages on the controller throughout the day. It appears the controller is switching back and forth between the RADIUS servers.

-------------------------------

Wed May 3 03:36:37 2006 [WARNING] apf_api.c 10860: RADIUS Server 10.x.x.2:1812 selected for failover on VAP 2 - marking ACTIVE

Wed May 3 03:36:37 2006 [WARNING] apf_api.c 10820: RADIUS Server 10.x.x.3:1812 failed on VAP 2 - marking INACTIVE

Wed May 3 03:36:25 2006 [WARNING] apf_api.c 10860: RADIUS Server 10.x.x.3:1812 selected for failover on VAP 2 - marking ACTIVE

Wed May 3 03:36:25 2006 [WARNING] apf_api.c 10820: RADIUS Server 10.x.x.2:1812 failed on VAP 2 - marking INACTIVE

------------------------------
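To see how often the controller flips between servers, the messages above can be tallied with a small script. This is only a sketch: the regular expression is written against the two message shapes quoted above, and the function name `count_state_changes` and the `SAMPLE` list are illustrative, not part of any Cisco tool.

```python
import re
from collections import Counter

# Matches the two message shapes quoted above ("selected for failover" / "failed").
LINE_RE = re.compile(
    r"RADIUS Server (?P<server>\S+) "
    r"(?:selected for failover|failed) on VAP (?P<vap>\d+) "
    r"- marking (?P<state>ACTIVE|INACTIVE)"
)

def count_state_changes(lines):
    """Count how often each server is marked ACTIVE or INACTIVE."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match["server"], match["state"])] += 1
    return counts

# The four controller messages from the post.
SAMPLE = [
    "Wed May 3 03:36:37 2006 [WARNING] apf_api.c 10860: RADIUS Server 10.x.x.2:1812 selected for failover on VAP 2 - marking ACTIVE",
    "Wed May 3 03:36:37 2006 [WARNING] apf_api.c 10820: RADIUS Server 10.x.x.3:1812 failed on VAP 2 - marking INACTIVE",
    "Wed May 3 03:36:25 2006 [WARNING] apf_api.c 10860: RADIUS Server 10.x.x.3:1812 selected for failover on VAP 2 - marking ACTIVE",
    "Wed May 3 03:36:25 2006 [WARNING] apf_api.c 10820: RADIUS Server 10.x.x.2:1812 failed on VAP 2 - marking INACTIVE",
]

counts = count_state_changes(SAMPLE)
for (server, state), n in sorted(counts.items()):
    print(server, state, n)
```

Run against a full day of msglog output, a high count for both servers would confirm the back-and-forth pattern rather than a single dead server.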

Also, I am noticing these errors in the IAS logs throughout the day. These are random events. There are successful IAS auths in the logs as well, so I know it is authenticating. I am thinking this is why the IAS server is not responding to the controller, which causes the controller to fail over continually.

------------------------------

Event Type: Error

Event Source: IAS

Event Category: None

Event ID: 3

Date: 5/2/2006

Time: 9:36:24 PM

User: N/A

Computer: xxxx

Description:

Access request for user XXXX was discarded.

Fully-Qualified-User-Name = <undetermined>

NAS-IP-Address = 10.X.X.X

NAS-Identifier = SLC-XXX

Called-Station-Identifier = 00-0B-85-5F-F9-C0:REWIFI3

Calling-Station-Identifier = 00-0E-6A-D7-49-7C

Client-Friendly-Name = SLC-xxx

Client-IP-Address = 10.x.x.x

NAS-Port-Type = 19

NAS-Port = 1

Reason-Code = 97

Reason = The authentication request was dropped because it contained an unexpected packet.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

------------------------------

Finally, I am getting critical alarms on the controller through Cisco WCS. This is basically saying that the radius servers are not responding.

------------------------------

General

Failure Object: Switch SLC-xxx/10.x.x.x

Owner:

Category: Controller

Created: May 2, 2006 4:10:18 PM

Modified: May 2, 2006 9:36:38 PM

Generated By: Device

Severity: Critical

Previous Severity: Critical

Message: Controller '10.x.x.x'. RADIUS server(s) are not responding to authentication requests.

Help: Check network connectivity between RADIUS and Controller '10.x.x.x'.

---------------------------

I am wondering if this is a bug on the controller software causing these failovers. Here are some of my ideas:

Could the controller possibly be sending an attribute that IAS doesn't understand, so that the server doesn't reply and the controller fails over to the other RADIUS server?

Similar to CSCsc05495: Controllers running 3.0.107 code intermittently send a state attribute 24 in an access-request packet.

Workaround: Apply the Microsoft KB 883659 patch to IAS. The Microsoft patch may or may not work. There is no workaround on the controller.

Of course, I'm not running Windows 2003, which this bug specifically addresses.

Could an auth packet be sent to the wrong server causing these errors in the event log? The packet would be sent to the wrong server because of the failover itself.

Any suggestions would be appreciated.

Thanks,

Brett

7 Replies


Do the clients associate and authenticate?

Yes, the clients authenticate without a problem. 98+ percent of the logs are successful. I am now wondering if a few clients are the culprits. I am seeing a pattern.

I get about 6 failed messages, seconds apart, in the IAS event log: first on the primary, then on the secondary (failover takes place). Like this one:

Event Type: Error

Event Source: IAS

Event Category: None

Event ID: 3

Date: 5/11/2006

Time: 9:45:45 AM

User: N/A

Computer: slc-xxx-zzz

Description:

Access request for user host/SLC-xxx-NDEMAC.slc.xxx.com was discarded.

Fully-Qualified-User-Name =

NAS-IP-Address = 10.x.x.x

NAS-Identifier = slc-xxx

Called-Station-Identifier = 00-0B-85-5F-EF-80:REWIFI3

Calling-Station-Identifier = 00-0B-7D-0D-A9-C6

Client-Friendly-Name = slc-xxx

Client-IP-Address = 10.x.x.x

NAS-Port-Type = 19

NAS-Port = 1

Reason-Code = 97

Reason = The authentication request was dropped because it contained an unexpected packet.

Then, a few seconds later, the authentication succeeds and IAS logs this:

Event Type: Information

Event Source: IAS

Event Category: None

Event ID: 1

Date: 5/11/2006

Time: 9:45:59 AM

User: N/A

Computer: slc-xxx-zzz

Description:

User host/SLC-xxx-NDEMAC.slc.xxx.com was granted access.

Fully-Qualified-User-Name = slc.xxx.com/Agents/SLC-xxx-NDEMAC

NAS-IP-Address = 10.x.x.x

NAS-Identifier = slc-xxx

Client-Friendly-Name = slc-xxx

Client-IP-Address = 10.x.x.x

NAS-Port-Type = 19

NAS-Port = 1

Policy-Name = Wireless policy

Authentication-Type = EAP

EAP-Type = Protected EAP (PEAP)

There is definitely a link between these messages and the critical errors in WCS and on the controller

"No radius servers are responding"

Also, I have a VPN 3000 concentrator using the same IAS radius servers and do not have this problem.

Have you tried letting the controller have more time before retrying?

Under [Security > RADIUS Auth > Edit]

Change the Retransmit Timeout to 30 secs

If the above does not help, I'd start checking for STP errors in the ethernet switch between the IAS servers; if they are truly disconnecting, you may have issues there.

When IAS, or any other RADIUS server for that matter, silently discards a request, and the 2 retries that we send are also discarded, we mark the server inactive. When that happens, you'll see those messages in msglog, and the controller sends a trap to WCS (and other trap receivers) saying the RADIUS server is dead.
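The behaviour described above can be modelled roughly as follows. This is a simplified sketch, not the controller's actual code: the `authenticate` function and the `responds` callback are hypothetical, the retry count of 2 comes from the reply above, and the wrap-around to the next server is an assumption inferred from the flip-flopping log messages in the original post.

```python
def authenticate(servers, current, responds, retries=2):
    """Model one client's request against a list of RADIUS servers.

    `responds(name)` is a hypothetical callback that returns True when the
    server answers and False when it silently discards the request.
    Returns (index_of_current_server, success, events).
    """
    events = []
    for _pass in range(len(servers)):
        name = servers[current]
        # One initial request plus `retries` retransmissions.
        if any(responds(name) for _attempt in range(1 + retries)):
            return current, True, events
        # All attempts were discarded: the server is treated as dead.
        # A real controller would also send a trap to WCS at this point.
        events.append(f"{name} failed - marking INACTIVE")
        current = (current + 1) % len(servers)
        events.append(f"{servers[current]} selected for failover - marking ACTIVE")
    return current, False, events

SERVERS = ["10.x.x.2:1812", "10.x.x.3:1812"]

# A client whose requests every server discards knocks out both in turn.
current, ok, events = authenticate(SERVERS, 0, lambda name: False)
for event in events:
    print(event)
```

Under these assumptions, a single misbehaving client whose requests are discarded by both servers is enough to produce the alternating ACTIVE/INACTIVE messages, even though both servers are healthy for every other client.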

Thanks for the responses. I tried changing the auth timeout with no luck. The failed auth messages seem to be coming from the same couple of clients. I'll look into updating the client computers and let you know if I find anything else.
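One way to confirm that only a few clients are responsible is to group the discarded requests by Calling-Station-Identifier. The sketch below assumes each IAS event description has already been exported as a block of text with one `Name = value` field per line, as in the events quoted earlier in this thread; the function names and the abbreviated sample events are illustrative only.

```python
from collections import Counter

def parse_fields(event_text):
    """Turn 'Name = value' lines from an IAS event description into a dict."""
    fields = {}
    for line in event_text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that carry no 'Name = value' pair
            fields[key.strip()] = value.strip()
    return fields

def discards_by_client(events, reason_code="97"):
    """Count discarded requests per client MAC for the given Reason-Code."""
    counts = Counter()
    for text in events:
        fields = parse_fields(text)
        if fields.get("Reason-Code") == reason_code:
            counts[fields.get("Calling-Station-Identifier", "unknown")] += 1
    return counts

# Abbreviated, illustrative events built from fields quoted in this thread.
EVENTS = [
    "Calling-Station-Identifier = 00-0B-7D-0D-A9-C6\nReason-Code = 97",
    "Calling-Station-Identifier = 00-0B-7D-0D-A9-C6\nReason-Code = 97",
    "Calling-Station-Identifier = 00-0E-6A-D7-49-7C\nReason-Code = 97",
]

for mac, n in discards_by_client(EVENTS).most_common():
    print(mac, n)
```

If the same one or two MAC addresses account for nearly all Reason-Code 97 events, the problem is more likely on those clients than on the controller or IAS.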

dkhalyav is correct. When the IAS server discards a request, the controllers think the server is dead. We see these messages all the time when our desktop teams log in with the local admin account on the wireless client computer. Since the RADIUS server can't authenticate that account (host\administrator), it discards the request. We've looked for registry keys to force IAS not to discard these, but no luck. We are running IAS on Windows Server 2003 SP1.

Hey m2w, you may want to look into changing your authmode regkey to "machine only" authentication (no user authentication is performed) on those machines. Set the DWORD value to 2 in the registry. By default (DWORD value of 1), it performs both computer and user authentication in order to connect. If the user auth fails, so will the connection.

That way the computer account will authenticate, not the user who is logging in. As long as the computer account is in your AD and in your wireless security group, it should auth correctly. I've had to do this for machines that I want to use remote desktop on. Normally remote desktop won't work with any wireless client set to perform user 802.1X authentication. Known issue.

In my case, the machine will finally authenticate, but only after about 6 dropped auth packets sent to IAS. So the auth does work. It's only these couple of clients. It also differs in that they are logging onto the machine and not locally. One of the computers that is having problems is running a Dell 1350 mini-PC card. The other one is a Dell 1450. I tried updating to the latest driver with no luck. I will live with this problem for now, as it seems only a few select clients are causing it. I will be throwing in the towel on this one.
