Clients on Guest WLAN Losing Layer 3 Connectivity

Answered Question
Aug 1st, 2008

Hello NetPro gurus!

I am currently troubleshooting an issue we are having with our Guest (completely open) WLAN in which it seems certain clients are losing their layer 3 connectivity while staying 'connected' to the LWAP(s). These certain clients lose their layer 3 configuration and are not able to access internal or external resources until they disable/enable their wireless connection.

I specifically have this problem, and it's only on the Guest WLAN that this occurs. I am using a Lenovo T61 with an Intel 4965AG internal wireless chipset. I know this chipset is relatively new and I have tried multiple drivers, all with the same result. Not all machines have this issue. MacPro laptops do not seem to have this issue nor do machines with Intel Pro 2200BG chipsets. I tested with a Netgear PCMCIA card and did not have this issue either.

Here's some more background information:

We have 5 WLCs (2 WiSM blades, each in a Catalyst 6509, and 1 WLC 4402) and 7 WLANs. The 4 WiSM controllers each have every WLAN configured, and the 4402 WLC only knows about one Guest wireless network (a completely open WLAN, i.e. no security). This is the particular network we see this issue with. We have approximately 200 LWAP 1131AGs (47 in one building, 154 in another), all broadcasting the Guest SSID. Our server core Catalyst 6509s each have separate VLANs (with Port-channels in them) for the WiSM blades. The Guest WLC 4402 is in the DMZ in its own VLAN. Each WLC is providing DHCP for each of the WLANs.

During our troubleshooting, I lose all layer 3 connectivity. I stay "connected" to the AP and signal strength is excellent, but my continuous pings to the Guest WLC (192.168.0.x network) time out and I cannot get out to the Web. I noticed the following error on my laptop (Lenovo T61 w/ an Intel 4965AG wireless chipset) in the system event viewer:

Description:

The system detected that network adapter Intel(R)...Link 4965AG - Packet Scheduler Miniport was disconnected from the network, and the adapter's network configuration has been released. If the network adapter was not disconnected, this may indicate that it has malfunctioned. Please contact your vendor for updated drivers.

This occurred at the exact time I lost my layer 3 connectivity. A co-worker and I did some research and determined that this was exactly halfway through my 1-hour DHCP lease from the Guest WLC (the 4402). The DHCP leases are set to expire after 1 hour because a lot of clients on the Guest WLAN come and go, and we only have one network configured for the Guest WLAN, with 229 available IPs to hand out. We were wondering if it was an issue with the DHCP renewal process from the WLC. This does not occur on the Internal WLANs configured with strict authentication security.
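The half-lease timing is consistent with standard DHCP client behaviour: under RFC 2131 a client first tries to renew at T1 (default 0.5 x lease) and falls back to rebinding at T2 (default 0.875 x lease). A minimal Python sketch of those defaults for the 1-hour lease described above; the function name is illustrative only, and a server may override both timers:

```python
from datetime import timedelta

def dhcp_timers(lease_seconds: int) -> dict:
    """Return the default RFC 2131 renewal (T1) and rebinding (T2) offsets."""
    return {
        "T1_renew": timedelta(seconds=lease_seconds * 0.5),
        "T2_rebind": timedelta(seconds=lease_seconds * 0.875),
        "expiry": timedelta(seconds=lease_seconds),
    }

timers = dhcp_timers(3600)  # 1-hour lease, as configured on the Guest WLC
for name, offset in timers.items():
    print(f"{name}: {offset}")
```

With a 3600-second lease, the first renewal attempt falls at the 30-minute mark, which is when the disconnect was observed.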

We tested with a few machines, such as an Apple laptop, an older laptop with an Intel Pro 2200BG chipset, and even my same laptop with a Netgear PCMCIA WiFi card none of which exhibited this problem. Connectivity at layer 3 was not interrupted. I have tried multiple drivers as well, all with the same result.

Now, we are not sure if it is an issue with the WLC itself or a chipset issue. The Intel 4965AG chipset is rather new but we have a lot of WLAN clients with this chipset on the network. That also doesn't explain why this issue ONLY occurs on the Guest WLAN.

We were thinking of placing a small DHCP server on the network to take over DHCP responsibilities from the Guest WLC to see if that makes a difference. Another idea we had was to increase the DHCP scope to two Class C networks (192.168.0.0 - 192.168.1.255, i.e. a /23, giving us 510 hosts) so we can extend the DHCP lease time.
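The host count for the larger scope can be checked with Python's standard ipaddress module. This is only a sketch, assuming the Guest network is the 192.168.0.0/24 mentioned elsewhere in the thread and that the expanded scope would be 192.168.0.0/23:

```python
import ipaddress

current = ipaddress.ip_network("192.168.0.0/24")
proposed = ipaddress.ip_network("192.168.0.0/23")

def usable_hosts(net: ipaddress.IPv4Network) -> int:
    # Subtract the network and broadcast addresses.
    return net.num_addresses - 2

print(usable_hosts(current))    # 254
print(usable_hosts(proposed))   # 510
print(proposed.network_address, proposed.broadcast_address)
```

The 229 addresses actually available today are fewer than 254 because part of the /24 is presumably held back from the scope.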

I plan on doing further testing today by placing a few more machines on the Guest WLAN with multiple chipsets and taking note of which ones exhibit the problem.

Any and all help is MUCH appreciated. Thanks!

Shane

Scott Fella Fri, 08/01/2008 - 07:10

I have seen this issue with various other Intel cards as well. The fix was to change the drivers, or the 3rd party utility if you are using one, like Access Connections on your IBM. I had to upgrade my driver a couple of times, and also the utility. Since you narrowed it down to certain wireless cards, it's not your WLAN. Make sure you use the driver that the laptop manufacturer recommends and not the latest from Intel. I have had issues using the latest Intel driver from their website. I use Access Connections now and have no problems at all. The fix before that was to remove the 3rd party utility and just use Windows to configure the wifi.

s.clinard Fri, 08/01/2008 - 09:57

Thanks for the information. Doesn't it seem a little odd that this only happens on our guest WLAN? I would think that if it indeed was a driver issue, I/we would see this on any of the WLANs that exist. For reasons I have yet to determine, I think it has something to do with the WLC controlling the guest WLAN, WLC 4402 running software version 4.2.130.0. Our WiSM blades are running 4.2.130.0 as well and the WCS is on version 5.0.56.2. I can paste any relevant configuration of the Guest WLAN if that might help.

Scott Fella Fri, 08/01/2008 - 10:07

Do you notice if it occurs only on certain ap's or on a certain wlc on the WiSM? Post your show run-config on the dmz and on your wlc. This way we can eliminate issues with configuration.

s.clinard Fri, 08/01/2008 - 12:23

It seems to occur when associated to any access point broadcasting the Guest SSID. I want to test my theory of the DHCP issue(s) before submitting all of the configs, just to see if my theory is correct. I think it's doing something funky, though debugs on the WLC didn't show anything interesting. I noticed that when the problem occurred just a little while ago (while testing), I could release my IP but it would not renew.

Scott Fella Fri, 08/01/2008 - 12:45

What DHCP server are you using? The WLC would work best for DHCP on the guest controller in the DMZ.

s.clinard Fri, 08/01/2008 - 12:48

I was thinking of an external Linux DHCP server configured with the same scope on the same subnet. This, obviously, would not actually allow me to determine the true problem though. I'll have the configurations posted in a little while. I am not going to implement this today, so we can talk more.

Thanks for your time.

olivier.nicolas... Mon, 08/04/2008 - 07:06

Did you try to change the lease time or manually do "ipconfig /renew" on the client?

You can also run "debug client enable" on the controller and follow the DHCP requests.

s.clinard Mon, 08/04/2008 - 07:52

Hello, thanks for the reply. When this issue occurs, I am able to release the IP bind information but am not able to renew unless I disable/enable the wireless card in the machine. So far, this occurs on only one Intel chipset that I can tell. I am performing further testing at this time, it might be a WZC service issue, so I am trying to rule that out. I have enabled the client debugging as you have suggested so I will let everyone know what happens.

Thanks!

Scott Fella Mon, 08/04/2008 - 08:10

If you are using WZC, make sure you uninstall any other 3rd party utility like Intel PROSet or IBM Access Connections. This has been an issue with the wireless card driver version. I actually use Access Connections to manage my wireless profiles. Upgrading the utility and the driver solved my issue. It took a while until I found a driver that actually fixed it. It seems like it corrupts the TCP stack. I wasn't even able to ping my own IP when I lost connectivity, but I still seemed connected.

s.clinard Mon, 08/04/2008 - 08:14

Can you tell me what version of driver you are using for the 4965AG? I've got 11.5 on my machine, but even version 11.1 exhibits the same problem.

Scott Fella Mon, 08/04/2008 - 08:22

I had issues with my 3945ABG, which is now on 11.5.0.36. I have some co-workers who have issues with their 4965AG... they just end up rebooting the machine. There is a new version on Intel's website, v12. I don't know if it is supported by the laptop manufacturer though.

s.clinard Mon, 08/04/2008 - 09:48

It doesn't look like Lenovo (T61) has posted any 12.x driver software yet. I was able to debug the problem on the WLC, though I have no idea what the messages really mean. It logged the BOOTREQUEST message when this particular client was halfway through its DHCP lease. No other clients I am testing with are having an issue. I went through and tested without using WZC, and had the same problem. I then uninstalled ALL third-party wireless software, re-enabled WZC and rebooted. I am testing again. The unfortunate part is that I have to wait 30 minutes until I can tell if there is a difference.

s.clinard Mon, 08/04/2008 - 11:18

Well, it doesn't appear to matter what is controlling the wireless connection. I received the following debug message for both the 3945ABG and 4965AG wireless chipsets:

(WLC5) >Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP received op BOOTREQUEST (1) (len 334, port 29, encap 0xec00)

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP selecting relay 1 - control block settings:

dhcpServer: 192.168.0.4, dhcpNetmask: 255.255.255.0,

dhcpGateway: 192.168.0.1, dhcpRelay: 192.168.0.4 VLAN: 0

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP selected relay 1 - 192.168.0.4 (local address 192.168.0.4, gateway 192.168.0.4, VLAN 0, port 29)

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP transmitting DHCP REQUEST (3)

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP op: BOOTREQUEST, htype: Ethernet, hlen: 6, hops: 1

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP xid: 0xce34deed (3459571437), secs: 0, flags: 0

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP chaddr: 00:13:02:24:ca:77

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP ciaddr: 192.168.0.164, yiaddr: 0.0.0.0

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP siaddr: 0.0.0.0, giaddr: 192.168.0.4

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP selecting relay 2 - control block settings:

dhcpServer: 192.168.0.4, dhcpNetmask: 255.255.255.0,

dhcpGateway: 192.168.0.1, dhcpRelay: 192.168.0.4 VLAN: 0

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP selected relay 2 - NONE

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP received op BOOTREPLY (2) (len 548, port 0, encap 0x0)

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP sending packet in EoIP tunnel to foreign 10.50.111.11 (len 346)

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP transmitting DHCP ACK (5)

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP op: BOOTREPLY, htype: Ethernet, hlen: 6, hops: 0

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP xid: 0xce34deed (3459571437), secs: 0, flags: 0

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP chaddr: 00:13:02:24:ca:77

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP ciaddr: 192.168.0.164, yiaddr: 192.168.0.164

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP siaddr: 0.0.0.0, giaddr: 0.0.0.0

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP server id: 2.2.2.2 rcvd server id: 192.168.0.4

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 Clearing Address 192.168.0.164 on mobile

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 RUN (20) Change state to DHCP_REQD (7) last state RUN (20)

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 DHCP_REQD (7) pemAdvanceState2 3850, Adding TMP rule

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 DHCP_REQD (7) Replacing Fast Path rule

type = Airespace AP - Learn IP address

on AP 00:00:00:00:00:00, slot 0, interface = 29, QOS = 0

ACL Id = 255, Jumbo Frames = NO, 802.1P = 0, DSCP = 0, TokenID = 5006

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 DHCP_REQD (7) Successfully plumbed mobile rule (ACL ID 255)

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 apfMmProcessCloseResponse (apf_mm.c:427) Expiring Mobile!

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 0.0.0.0 DHCP_REQD (7) Deleted mobile LWAPP rule on AP [00:00:00:00:00:00]

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 Deleting mobile on AP 00:00:00:00:00:00(0)

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 Set bi-dir guest tunnel for 00:13:02:24:ca:77 as in Export Anchor role

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 Added NPU entry of type 9

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 0.0.0.0 Removed NPU entry.

This occurs exactly halfway through the DHCP lease.

olivier.nicolas... Mon, 08/04/2008 - 12:40

1. A DHCP client will always try to renew its lease once half the lease time has elapsed. (Running ipconfig /renew does the same if you don't want to wait half an hour.)

2. The client tries to renew its lease with the DHCP server (192.168.0.4)

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP siaddr: 0.0.0.0, giaddr: 192.168.0.4

3. The DHCP server replies and tells the client to keep its current address.

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP ciaddr: 192.168.0.164, yiaddr: 192.168.0.164

4. The response is sent to the client via the EoIP tunnel

Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP sending packet in EoIP tunnel to foreign 10.50.111.11 (len 346)

Now the strange points

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 Clearing Address 192.168.0.164 on mobile

Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 RUN (20) Change state to DHCP_REQD (7) last state RUN (20)

For some reason, the controller decides to change the client state from RUN to DHCP_REQD. From that point on, the client's connections are dropped until a full DHCP cycle is done (e.g. ipconfig /release followed by ipconfig /renew).
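Transitions like the one above can be pulled out of a saved debug capture with a short script. The following is only a sketch: the regular expression is written against the line format shown in this thread, and other controller software versions may log differently.

```python
import re

# Matches lines such as:
# "... 13:17:55 2008: <mac> <ip> RUN (20) Change state to DHCP_REQD (7) last state RUN (20)"
STATE_CHANGE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}) \d{4}: "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) .*?"
    r"Change state to (?P<new>\w+) \(\d+\) last state (?P<old>\w+) \(\d+\)"
)

def state_changes(lines):
    """Yield (time, mac, old_state, new_state) for every state-change line."""
    for line in lines:
        m = STATE_CHANGE.search(line)
        if m:
            yield m["time"], m["mac"], m["old"], m["new"]

# Two lines copied from the debug output posted in this thread.
sample = [
    "Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP transmitting DHCP ACK (5)",
    "Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 RUN (20) "
    "Change state to DHCP_REQD (7) last state RUN (20)",
]

changes = list(state_changes(sample))
for t, mac, old, new in changes:
    print(f"{t} {mac}: {old} -> {new}")
```

Run against a full capture, this makes it easier to compare when a failing client and a working client leave the RUN state.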

Could you post the debug messages for a working wireless chipset?

s.clinard Tue, 08/05/2008 - 06:55

I have attached the file that has the initial DHCP transaction and the subsequent renewal that was successful on a Dell Latitude with an Intel PRO/Wireless 2200BG chipset.

olivier.nicolas... Tue, 08/05/2008 - 08:10

I have tested with Intel 3945.

I renewed the IP twice. After the first renewal I enabled "debug pem events", but I didn't see any PEM-related messages, and the client was still running fine after the renewal.

s.clinard Wed, 08/06/2008 - 06:20

What's interesting is that if I force a /release and then a /renew on the client just before the DHCP renewal process occurs on its own, the problem doesn't seem to exist. The client is leased the same IP for another hour...

olivier.nicolas... Wed, 08/06/2008 - 08:13

I see a difference between my capture and yours. When the IP renewal happens naturally at half the lease time, the PEM mechanism is not triggered; DHCP is seen as normal traffic. When I do an ipconfig /renew, the PEM is triggered and changes the connection state from RUN to DHCP_REQD, so all traffic is blocked until a new DHCP query is done.

Does it work again if you do "ipconfig /renew" twice? Does the PEM state change from DHCP_REQD back to RUN?

s.clinard Tue, 08/12/2008 - 13:07

If I disable DHCP proxying, then we can't use the internal WLC DHCP capability. I have plans to implement an external DHCP server tomorrow evening, which means I will need to disable DHCP proxying anyway.

I am seeing similar things on my guest network as well. My anchor controller is in the dmz, port 1 is on the private network, and port 2 is in the DMZ, is trunked, and I am using a dynamic interface for the DMZ subnet. The DHCP server is the DMZ 6509 with the SVI for the guest network.

What it appears is when I am using web auth (passthrough) my clients are getting put back to a dhcp_reqd state, which results in them being stuck in the web auth required.

I don't know why the clients are losing their IP address to begin with - the WCS only captures "Client Moved to DHCP Required State. 8...

The wlan to which client is connecting does not require 802 1x authentication. 7?.0

08/15/2008 15:35:49 EDT INFO 10.2.254.150 Client does not have an IP address yet. 7?.0

08/15/2008 15:35:49 EDT INFO 10.2.254.150 Client L3 authentication is required 7?.0

08/15/2008 15:35:49 EDT INFO 10.2.254.150 Client Moved to DHCP Required State. 7?.0

08/15/2008 15:35:49 EDT INFO 10.2.254.150 Client Moved to DHCP Required State. 7?.

08/15/2008 15:35:50 EDT INFO 10.2.254.150 DHCP successful. ;...

08/15/2008 15:35:50 EDT ERROR 10.2.254.150 Client got an IP address successfully and the WLAN requires Web Auth or Web Auth pass through. ;... "

I'm not sure if some of the messages are just INFO about the l3 auth, this wlan is configured for no security, just web auth passthrough.

The anchor controller is running 4.2.99 while my inside controllers are 4.2.130.

I had a previous problem when the anchor was on 4.2.130 where DHCP messages would get dropped by the inside controller... I may play around and put it back to 4.2.130 just to see.

Scott Fella Sat, 08/16/2008 - 09:23

Brian,

If you configure a DHCP scope on the DMZ controller, do you still see errors and do the clients still hang? I have multiple clients with DMZ controllers running various code, and a couple running 4.2.130, with no issues. I do have the DHCP scope on the DMZ WLC though.

I've never tested long enough with the controller as the server to see this problem.

I have tried it before and it works as the DHCP server, but I've never tried staying connected the 15 or 20 min it would take to get dropped..

A few people at Cisco Networkers this year that spoke on Wireless said not to use the WLC as a DHCP server because it is not an enterprise type DHCP server.. so I've been afraid to use the WLC as the DHCP server.

How big is your clients guest user base with the WLC as the DHCP server ?

Scott Fella Sat, 08/16/2008 - 10:52

Guest is the only time I would use the WLC as a DHCP server. We have implemented this in various environments, but the biggest is probably in hospitals. In some we have run scopes from /24 to /23. Internal wireless has always been done on an enterprise DHCP server. I have also implemented this in a retail mall area that has public wifi on 8 floors. No issue with the WLC being a DHCP server.

Unfortunately, no luck.

I moved the DHCP server to the anchor controller, and enabled DHCP proxy (which works to provide my clients an IP address) but the disconnects continue.

More debugging this weekend has provided me some additional info to consider.

It seems the problem happens on some kind of timer because 2 test machines I'm working with go into the web_auth required state at exactly the same times when they've both been connected at the same time.

The controller gives messages about Mobility role update requests.

"Mobility role update request. from Anchor to Handoff Peer = 10.1.254.150, Old Anchor = 10.2.254.150, New Anchor = 0.0.0.0"

Right at that time the client no longer passes traffic as web auth is required.

If I shut web auth off, the clients are OK, although I'm not positive the problem is gone. It may just be that the short time spent changing states doesn't bother the client when web_auth isn't required.

I've upgraded all of the controllers to 5.1.151 - and the exact same symptoms are with me.

I don't believe it's DHCP related, as this occurs with DHCP on the WLC or on the 6509 in the DMZ.

I believe it has something to do with mobility and/or web auth.

Here is some monitoring output with web_auth disabled. This is why I believe the problem still exists, just not as noticeably with web_auth gone, because the client gets (or keeps) its IP address. With web_auth w/passthrough enabled, the client has to refresh a web browser to get traffic passing again.

08/17/2008 13:05:13 EDT INFO 10.2.254.150 Mobility role update request. from Export Anchor to Handoff Peer = 10.1.254.150, Old Anchor = 10.2.254.150, New Anchor = 0.0.0.0

08/17/2008 13:05:13 EDT INFO 10.2.254.150 Client Moved to DHCP Required State.

08/17/2008 13:05:16 EDT INFO 10.2.254.150 Mobility role update request. from Unassociated to Export Anchor Peer = 0.0.0.0, Old Anchor = 0.0.0.0, New Anchor = 10.2.254.150

08/17/2008 13:05:16 EDT INFO 10.2.254.150 The wlan to which client is connecting does not require 802 1x authentication.

08/17/2008 13:05:16 EDT INFO 10.2.254.150 Client does not have an IP address yet.

08/17/2008 13:05:16 EDT INFO 10.2.254.150 Client Moved to DHCP Required State.

08/17/2008 13:05:16 EDT INFO 10.2.254.150 Mobility role changed. State Update from Mobility-Incomplete to Mobility-Complete, mobility role=ExpAnchor

08/17/2008 13:05:16 EDT INFO 10.2.254.150 Client Moved to DHCP Required State.

08/17/2008 13:05:17 EDT INFO 10.2.254.150 DHCP successful.

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Client has got IP address, no L3 authentication required.

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Client IP address is assigned.

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. transmitting DHCP REQUEST (3)

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. op: BOOTREQUEST, htype: Ethernet, hlen: 6, hops: 1

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. xid: 0x5faa5e58 (1605000792), secs: 0, flags: 0

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. chaddr: 00:1a:73:9d:96:cb

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. ciaddr: 10.99.0.10, yiaddr: 0.0.0.0

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. siaddr: 0.0.0.0, giaddr: 10.99.0.1

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Received DHCP ACK ,dhcp server set.

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. transmitting DHCP ACK (5)

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. op: BOOTREPLY, htype: Ethernet, hlen: 6, hops: 0

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. xid: 0x5faa5e58 (1605000792), secs: 0, flags: 0

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. chaddr: 00:1a:73:9d:96:cb

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. ciaddr: 10.99.0.10, yiaddr: 10.99.0.10

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. siaddr: 0.0.0.0, giaddr: 0.0.0.0

08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. server id: 4.4.4.4 rcvd server id: 10.2.254.150

Scott Fella Sun, 08/17/2008 - 10:33

What are the IP addresses of the WLCs and the guest WLC? Also, what are their dynamic interfaces and VIP?

Scott Fella Sun, 08/17/2008 - 11:22

Okay, well you have a local controller in your inside network (10.1.254.150) and your port 1 on the guest controller is on your internal network also (10.2.254.150), but your port 2 is on the DMZ and is only being used by your guest ssid and the dynamic interface you created for it.

Do you have symmetric tunneling enabled on both the local and guest controllers?

Is your Guest WLAN SSID configured exactly the same on both, and are both set on the management interface?

Your guest WLC WLAN SSID is only configured for port 2 and not as a backup of port 1?

Mobility group is configured and you can eping and mping both ways?

The virtual ip and mobility group name is the same or different between the local and guest wlc's?

What code are you running on both now?

Scott Fella Sun, 08/17/2008 - 12:24

Made a mistake... the local WLAN guest SSID is mapped to the management interface and you create an anchor to the guest controller; on the guest controller, the WLAN SSID is mapped to the dynamic interface that is using port 2.

Yes, you have a correct understanding of the setup.

I do not have symmetric tunneling enabled (unless that is enabled by default, but I don't think it is). Is that required? I couldn't gain a good understanding from documentation on what that provided for this architecture.

The guest WLC wlan ssid is only conf'd for port 2 - not a backup for port 1.

Mobility groups appear up, eping and mping is OK in both directions.

The virtual IP is the same on all controllers (4.4.4.4)

The local controllers are NOT the same mobility group as the anchor controller, although this should not be a requirement as far as I know.

5.1.151 is on all controllers now.

To clarify - the guest network works initially. It's just that after a random amount of time (anywhere from 10 - 20 minutes) the DHCP required state comes back and, with web auth enabled, the client has to accept the web auth screen again.

Scott Fella Sun, 08/17/2008 - 13:16

Is the WLAN set up for DHCP required on both the local and the guest WLC? I also wanted to verify that both WLCs are set up the same way for symmetric tunneling, either enabled or not. It looks like the WLC is not allowing the client on the network because DHCP required is checked. Since the client seems to keep its IP and does not renew for some reason, the WLC stops client traffic.

Scott Fella Sun, 08/17/2008 - 14:45

Okay, but on the guest wlan ssid, do you have dhcp required unchecked on both the local and the guest controller?

Scott Fella Sun, 08/17/2008 - 15:01

No... you should keep that unchecked, just makes it simpler for clients to connect. The error message shows that dhcp required is checked, which is weird, because you do not have it checked. The wlc seems to think that is checked for some reason. Try to delete the ssid and recreate it on both the local and guest wlc.

I was hoping it was checked - but it is unchecked on all of my controllers.

I will attempt rebuilding on new controllers - but this problem is consistent across 4 different local controllers in Raleigh and Atlanta where the problem is present.

The Anchor is at a hosted datacenter in North Carolina. The WAN links are 100mb, with less than 40ms latency.

Do I need symmetric tunneling enabled?

Scott Fella Mon, 08/18/2008 - 06:15

You don't need it, but if you do use it, you will have to enable it on all WLCs in the mobility group and reboot them. Is there a lot of roaming when this happens?

s.clinard Mon, 08/18/2008 - 12:53

It seems like Brian's issue and my issue are a little different, however I am thinking that mine might be due to the fact that symmetric mobility tunneling is not enabled. Here's why: when we implemented an external DHCP server on the 192.168.0.0 /24 network (Guest WLAN network), we saw the DHCP server responding to the requests but never saw client acknowledgements accepting the DHCP information. I think this is because the firewall sees a source IP that does not match the subnet on which the packets are received. I was looking at a WLC document which outlined the following:

You should also enable symmetric mobility tunneling if a firewall installation in the client packet path may drop the packets whose source IP address does not match the subnet on which the packets are received.
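The kind of source-address check the quoted passage refers to can be sketched in a few lines of Python. The interface subnet below is hypothetical (a /24 guessed around the foreign controller address seen in the debug output), and a real firewall's reverse-path logic is more involved:

```python
import ipaddress

def passes_source_check(src_ip: str, ingress_subnet: str) -> bool:
    """True if the packet's source address belongs to the subnet of the
    interface it arrived on; a strict firewall drops it otherwise."""
    return ipaddress.ip_address(src_ip) in ipaddress.ip_network(ingress_subnet)

# Hypothetical subnet of the firewall interface facing the internal controllers.
INTERNAL_IF = "10.50.111.0/24"

print(passes_source_check("10.50.111.11", INTERNAL_IF))    # True
print(passes_source_check("192.168.0.164", INTERNAL_IF))   # False
```

A guest client address arriving on the internal-facing interface would fail this check, which is the situation symmetric tunneling is meant to avoid.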

Correct Answer

Well.. to complete a nice happy ending to my saga.. BUG FOUND!

I opened a TAC case and we came to this conclusion.

In the Advanced settings of the WLAN there is a client time-out default of 1800 seconds.

The clients were dis-associating due to inactivity according to the sniffer traces, causing the dhcp process to kick off and the web_auth reqd state.

We set this down to 60 seconds and watched it over and over..

I have now set it to the max allowed of 65535 (18 hours) as a work around.

Cisco admitted there are bugs when setting this to 0, so they suggest the 65535.

Hope this info helps some of you out!!
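For reference, a quick conversion of the two timeout values quoted in this post. This is a sketch only; that 65535 is the ceiling because the field is 16 bits wide is an assumption, not something TAC stated:

```python
from datetime import timedelta

DEFAULT_TIMEOUT_S = 1800    # default quoted in the post
MAX_TIMEOUT_S = 65535       # maximum quoted in the post

print(timedelta(seconds=DEFAULT_TIMEOUT_S))   # 0:30:00
print(timedelta(seconds=MAX_TIMEOUT_S))       # 18:12:15
print(MAX_TIMEOUT_S == 2**16 - 1)             # True
```

The 1800-second default is 30 minutes, the same as half of a 1-hour DHCP lease, which may be why the drops earlier in the thread looked tied to the lease renewal.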

s.clinard Wed, 08/20/2008 - 13:17

Brian - that seems to have fixed my problem as well. I set it to 60 on one of our WiSM controllers to which the particular AP I was associated to was registered. I saw the problem occur in 30 seconds. I have then set it to 65535 and am testing again, however I am convinced that was the case. Something with these Intel chipsets and the timeout value was screwing with the DHCP renewal.

THANKS!

Shane

sachinraja Fri, 03/06/2009 - 07:30

same issue here.. i'll try setting it to 65535 and see if it resolves the issue.. did you set the timers on all the WLC's, ie the local WLC & Anchor WLC ??

Raj

sachinraja Fri, 03/06/2009 - 08:03

Shane

you had changed the session timeout right (default 1800) ? not the client exclusion timeout , which is defaulted to 60 secs ? I have increased the session timeout to 3600, instead of putting it to 65535.. do u think this would work ?

and i also saw a bug in 5.1.163 which relates to our problem:

CSCsq26446-Clients using a WLAN with web authentication enabled might disconnect every 5 minutes. The "pem timed out" message appears in the controller logs.

Workaround: Authenticate the clients using another WLAN.

s.clinard Fri, 03/06/2009 - 12:06

Yes, the session timeout. I would suspect that changing your value to 3600 would delay this issue from occurring, but whether it completely resolves the problem depends on how long your wireless users are connected at any given time. The workaround for that bug wouldn't help me in my scenario as we only have one guest WLAN. The others are internal 802.1x-secured WLANs.

sachinraja Fri, 03/06/2009 - 12:10

Yeah... with 3600, the client was connected for 1 hour 4 mins before getting disconnected... this seems really strange to me! Now I have increased this to 65535.. let's see.. but the security team isn't going to accept a value of 65535 for sure :) he he..

Has this bug been resolved in any of the latest software trains? Did Cisco TAC answer you on this? The other bug that I mentioned in my previous post hasn't been fixed as of 5.2.178!!!
