Option 43 vs. DNS Resolution

Feb 4th, 2009

For those of us familiar with how an AP finds its controller, the discovery methods are L2 broadcast, DHCP Option 43, DNS resolution, and neighbor information shared via OTAP, plus the final option of statically assigning a controller IP with 'lwapp ap controller ip address x.x.x.x'. If you watch the process with 'debug lwapp client event' on an AP, you will see that each controller IP address is tagged with a number (0-4) indicating how it was learned. Here's my question: are these numbers used as a priority order when an AP attempts to join a controller?


I had a 1252 on a 2106 running 5.2.157.0 with no domain (the AP got its controller IP via 'option 43 ascii x.x.x.x' from a DHCP scope on a 2960 switch). Then I moved it to my lab, which has a 4402-25 running 4.2.130.0 and a domain. I expected the AP to resolve CISCO-LWAPP-CONTROLLER and join my lab controller. However, all I saw were the entries stored in NVRAM from the controller it had previously joined. I had a couple of options to force it to join my lab controller, and I chose Option 43. That seemed to work, and the AP happily downgraded its code.


Any thoughts/comments? I'm just surprised that the new DNS resolution (which did work, b/c the debug showed 'translating [OK]') didn't allow the AP to join my lab controller.


Regards,

Scott

laxcis Wed, 02/04/2009 - 08:46

After we turn on debugging, what is the command to view the debugs??

Scott Pickles Wed, 02/04/2009 - 09:08

laxcis -


When you turn on debugging, the output should show up almost immediately. If you are not getting any debug information, that means your AP is not getting its data to the controller. If you are running debug from the controller, you can use the following:


1. debug lwapp events enable (shows all)

2. debug client (restrict to a particular AP)


If you are running the debug from the AP, the following will help:


1. debug lwapp client event

2. debug lwapp client packet

3. debug lwapp client error


I would recommend running only one debug type at a time so that your output isn't mixed up. You must be consoled into the AP in privileged EXEC mode to run the debug commands. The default login is still Cisco/Cisco unless the administrator has changed it (from the controller, 'config ap username password ' changes the password on the AP).


You can use Cisco.com, the TAC search tool, Cisco's Support Wiki, and the 'net at large to find many relevant docs for troubleshooting the AP join process. If you need some docs specifically, I have a few I can share.


Hope that helps.


Regards,

Scott



laxcis Wed, 02/04/2009 - 09:13

I do have an open case with TAC and they want me to do the following. After I enable these, though, what do I do next to see the debugs? Where do I go to see them?


debug mac addr

debug lwapp events enable

debug dhcp message enable

debug dhcp packet enable

debug pm pki enable


laxcis Wed, 02/04/2009 - 09:35

nevermind, figured out the debug. I sent it to TAC.

Scott Pickles Wed, 02/04/2009 - 09:40

Laxcis -


Be sure to post back what your issue was and how you resolved it. A response like 'nevermind, I figured it out' is going to leave a lot of people reading the thread hanging. The NetPro forums are for professionals to help ourselves and each other. Without a PROBLEM/SOLUTION methodology, this thread would not be useful to others.


Regards,

Scott

laxcis Wed, 02/04/2009 - 10:45

Of course I will post what I find. The "nevermind, I got it" was in reference to how to see the debugging output; I finally realized it just shows up on the screen. My APs are still not locating the right WLC, and I am waiting for TAC to get back to me.

Scott Pickles Wed, 02/04/2009 - 10:48

Have you run debug on an AP? It will tell you what controllers it has learned about and how. Once you know if and how your APs are learning about your controllers, you can begin to figure out where it is breaking down. If you don't see the address you want listed, review the configuration for that method (i.e. Option 43, DNS, etc.). For the short term, you'll make the most progress by hard coding one of your APs with the controller address you want.


Regards,

Scott

laxcis Wed, 02/04/2009 - 11:07

Would you be able to tell me how to hard code my AP with the controller I want? That would be terrific. It's a 1242.

Scott Pickles Wed, 02/04/2009 - 11:33

Sure. Plug your AP into a power source but leave the network port disconnected. The AP will boot and realize it has no connection to the rest of the network and won't attempt the LWAPP join process. Then you can run the following commands to set it up:


clear lwapp private-config

lwapp ap controller ip address x.x.x.x


As for the DNS entries: there was a time I thought the same thing. You actually create the same 'A' record multiple times, each time with a different controller IP address, so every record is named CISCO-LWAPP-CONTROLLER.domain. When an AP does its lookup, it gets all of the records returned and then sends a discovery request to ALL of those controllers. Which controller it ends up on depends on several factors, the most important being AP load: it will choose the controller with the least load (the controller reports this to the AP in its 'DISCOVERY RESPONSE'). You can override this behavior by configuring a controller as 'Master Controller.' However, that is recommended only for provisioning APs in a controlled lab environment, not for production.


In your case, the AP will probably end up joining a controller you don't want it on. When that happens, go into the config for that AP via the controller and set the PRIMARY, SECONDARY, and TERTIARY controller names. Once these are saved to the AP's NVRAM, it will unicast to that controller first the next time around, so after saving you can safely reboot it. The names you enter need to be the HOST names of the controllers, which may or may not exist as DNS records depending on how you have set up your DNS. In code version 5.0 and up, these entries also have an IP address field.
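

The selection logic described above can be pictured with a small Python sketch. This is only a simplified model of what the post describes, not Cisco's actual algorithm: the `DiscoveryResponse` fields, the `choose_controller` function, the controller names, and the definition of "load" as joined APs divided by capacity are all assumptions made for illustration, as is the choice to check configured names before the master controller.

```python
from dataclasses import dataclass


@dataclass
class DiscoveryResponse:
    """Hypothetical summary of what a controller reports back to an AP."""
    name: str               # controller host name
    ap_count: int           # APs currently joined
    ap_capacity: int        # maximum APs the controller supports
    is_master: bool = False


def choose_controller(responses, preferred_names=()):
    """Pick a controller the way the thread describes (simplified model).

    1. Ignore controllers that are already full.
    2. Honor PRIMARY/SECONDARY/TERTIARY names, in that order.
    3. Otherwise use a master controller if one answered (assumed ordering).
    4. Otherwise take the least-loaded controller.
    """
    candidates = [r for r in responses if r.ap_count < r.ap_capacity]
    by_name = {r.name: r for r in candidates}
    for name in preferred_names:
        if name in by_name:
            return by_name[name]
    for response in candidates:
        if response.is_master:
            return response
    if not candidates:
        return None
    # "Load" here is an assumption: fraction of capacity in use.
    return min(candidates, key=lambda r: r.ap_count / r.ap_capacity)


responses = [
    DiscoveryResponse("wlc-lab", ap_count=20, ap_capacity=25),
    DiscoveryResponse("wlc-prod", ap_count=2, ap_capacity=6),
]
print(choose_controller(responses).name)                # wlc-prod
print(choose_controller(responses, ("wlc-lab",)).name)  # wlc-lab
```

With no names configured, the model lands on whichever controller is least loaded, which is why an AP can end up somewhere unexpected until the PRIMARY name is set.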


Regards,

Scott

laxcis Wed, 02/04/2009 - 12:16

When I try and run these commands on the AP CLI, it says "Error!! Command is disabled"


I googled it and it appears I need to change the username/pwd. Easy to do?

Scott Pickles Wed, 02/04/2009 - 13:02

Did you log in as exec user first? If the AP has not ever been on a controller, or if the controller is set to default for user/pass, it will still be Cisco/Cisco. If the AP gives you the 'command disabled' error, that usually means that it has run its LWAPP process. I found that leaving the AP disconnected from the network usually works. Give me a min to test and I'll get back to you.


Regards,

Scott

Scott Pickles Wed, 02/04/2009 - 13:08

Got it - you can't run the command 'clear lwapp private-config' UNLESS the default user/pass HAS been changed. Once those parameters are changed, you'll have the ability to run that command. However, in this case it isn't necessary. Just use the 'lwapp ap controller ip address x.x.x.x' command and you should be all set. Never hurts to use the reset button too. Just hold it down for 1-2 sec. while powering on the AP and let go when you see the LED go amber.


Regards,

Scott

laxcis Wed, 02/04/2009 - 13:14

Still get the same message about command is disabled. Looks like i need to change the username/pwd. I'm not quite sure on how to do that though. Or can I get by with just changing the pwd?

Scott Pickles Wed, 02/04/2009 - 13:30

It's been a while since I did that, and it's coming back to me now. It wasn't an issue for me b/c I have a couple of 2106 controllers that I take out into the field to help with troubleshooting. You can't run any of those commands without changing the password. So what I would do is hook the AP up to my 2106 (it uses local layer 2 broadcast to join) and then change the username and password. You can do this from the controller using the command 'config ap username user_id password passwd [all | ap_name ]', which lets you restrict the change to the one AP you have. But since you can't join yours to a controller, you can't do that. Send me the same debugs you sent Cisco TAC and I'll look at them. Do you have admin access to your infrastructure, or are the switches/controllers under someone else's provisioning?



Regards,

Scott

laxcis Wed, 02/04/2009 - 13:34

The AP actually does join a controller (just the wrong one), so I can do those commands. Let me try that and see how it goes first.

Scott Pickles Wed, 02/04/2009 - 13:35

If it does join a controller, then all you need to do is go into the AP and update the PRIMARY controller field and you'll be all set!!


Regards,

Scott

laxcis Wed, 02/04/2009 - 13:39

Nope, even though the right controller name is in there, it still joins the wrong controller. Firmware version 4.1.8 of the WLC.

laxcis Wed, 02/04/2009 - 13:52

Ok, so I was able to get the pwd changed on the AP, and clear the lwapp and configure it for the other controller. Do I need to do any sort of save like a copy run start that I normally do on Cisco switches?

laxcis Wed, 02/04/2009 - 14:23

Scott - Just want to say thank you for all the time you spent on this with me. I really appreciate it. The APs are now locating the right controller, but it's not due to manually configuring the AP. Honestly, I'm not 100% sure why they are locating it now, but I think it may be because I removed one of the two DNS entries for the LWAPP controller. I had two entries for both controllers. I removed one, and now the APs go to the right controller. I'll probably re-add the other entry after the fact. Thanks again. I think this is solved.

Scott Pickles Thu, 02/05/2009 - 08:04

Laxcis -


No problem, my pleasure. To clarify, you can have multiple DNS entries for CISCO-LWAPP-CONTROLLER, but each time it must point to a different IP address. The fact that you had 4 DNS entries when you only needed 2 surely confused your APs. If you would be so kind as to rate the post(s), I'd appreciate it.
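

One way to see what an AP would get back from DNS is to query the name yourself. The following is a minimal Python sketch, assuming the zone is called `example.com` (a placeholder) and using a helper named `controller_records`, which is hypothetical and not part of any Cisco tooling. It lists the unique addresses returned for CISCO-LWAPP-CONTROLLER and flags any address that shows up more than once.

```python
import socket
from collections import Counter


def controller_records(domain, resolver=socket.gethostbyname_ex):
    """Return (unique_addresses, repeated_addresses) for the controller name.

    `resolver` defaults to the standard library lookup; it can be swapped
    for a stub so the function can be exercised without touching DNS.
    """
    name = f"CISCO-LWAPP-CONTROLLER.{domain}"
    _, _, addresses = resolver(name)
    counts = Counter(addresses)
    unique = list(dict.fromkeys(addresses))  # preserves first-seen order
    repeated = [ip for ip in unique if counts[ip] > 1]
    return unique, repeated


# Stub resolver standing in for a real DNS answer (addresses are made up).
def fake_resolver(name):
    return name, [], ["10.0.0.5", "10.0.1.5", "10.0.0.5"]


unique, repeated = controller_records("example.com", fake_resolver)
print(unique)    # ['10.0.0.5', '10.0.1.5']
print(repeated)  # ['10.0.0.5']
```

Calling `controller_records("your.domain")` without the stub performs a real lookup from the machine you run it on, so run it from a host that uses the same DNS servers as the APs.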


Edit: Even with multiple ways of finding a controller, you still want to make sure you fill in your PRIMARY and SECONDARY controller host names. This is what the AP will use to query a controller to join first, and if that controller is full it will go on to the next one. This is also the method by which you can manually distribute and load balance your APs across multiple controllers, as well as what the APs use for failover/fallback.


Now, does anyone have an answer to my original post..? :)


Regards,

Scott

laxcis Thu, 02/05/2009 - 10:33

I actually had 2 DNS entries, and each one was indeed pointing to a different controller. So in theory it should have been OK, but in reality it wasn't. Once I deleted one entry, the APs only found the other controller, which is what I wanted. I always fill in the Primary and Secondary, but it doesn't always seem to work that well. I have also read on here that the GUI for those fields doesn't really do much, and that it's better to use the CLI. Thanks again.

laxcis Wed, 02/04/2009 - 13:31

I'm confused. I need to change the username/pwd, but not sure how to do that.

Leo Laohoo Thu, 02/05/2009 - 16:05

You'll get this error message when you are running the IOS that the WLC has dished out. Boot into the RCV firmware and then you can run the command.

Scott Pickles Wed, 02/04/2009 - 09:35

Laxcis -


If I wasn't clear in my previous post, if you are either telnetted or consoled into the controller, your debug should show up immediately in your terminal emulation software (i.e. HyperTerminal/PuTTY/CRT/etc.). If you are NOT seeing any debug traffic, then your APs are not able to reach the controller and thus you are not seeing any traffic. A few things that will cause this immediately are these:


1. Firewall blocking ports UDP 12222/12223 (used to send LWAPP packets to/from controller)

2. Incorrect tagging on the interface directly connected to the controller. Remember, the controller ports are hard coded 1 gigabit, full duplex, dot1q trunks.

3. Incorrect port configuration on the switch. Your switchport that connects to your controller MUST be 1 gigabit, full duplex, dot1q trunk.

4. You are not allowing all VLANs across your trunk link (only applies if you chose not to put your controller management IP address on your native VLAN).


Regards,

Scott

laxcis Wed, 02/04/2009 - 11:02

Thanks. Would you be able to tell me how to hard code my AP with the controller I want? That would be terrific. It's a 1242.


I do have another question relating to Option 43 vs. DNS. What happens if you have both? I'm thinking this may be my problem, and why my APs are going to the wrong controller. In DNS, we have the entry for one wireless controller (CISCO-LWAPP-CONTROLLER), but not for the other WLC. Instead, for that one we have Option 43. I guess what I am wondering is, how do you do multiple DNS entries for more than one WLC? Doesn't the name in DNS need to be CISCO-LWAPP-CONTROLLER? Or should it be the name of the controller?

talmadari Wed, 02/04/2009 - 11:24

Hi Scott,


After moving the AP into your lab, where the 4402 WLC is connected, was the AP connected to the same segment as the WLC?

I think it is expected that the AP will keep its controller IP (whether it was learned through DNS or DHCP) for redundancy purposes.

Scott Pickles Wed, 02/04/2009 - 11:54

Talmadari -


Whether the AP is on the same subnet or not really just tells me which discovery methods the AP can/will use. Cisco hasn't recommended L2 in quite a while, and my particular configuration is L3. The APs are on a different subnet. Therefore, it should have gotten a controller address from either DNS or Option 43, or possibly both. Prior to adding Option 43, it should have picked up the address for the 4402 from the DNS server in the domain since it is configured to respond. In addition, debug on the AP indicated that it resolved CISCO-LWAPP-CONTROLLER successfully. So I'm not really sure what your point is. Whether it was on the same subnet or not, the LWAPP join process should have shown the new controller IP address for my lab controller. I agree with you that the AP properly stored the previous controller addresses, but it should have at least learned the new ones via the standard LWAPP join process and then decided 'I already have a primary configured so I'm going to use that.'


Regards,

Scott

Leo Laohoo Thu, 02/05/2009 - 16:15

Here is my experience and understanding of the very-odd LWAPP/CAPWAP discovery process:


Presume you have the following:

1. 21xx WLC for Lab and/or priming purposes running new or old firmware;

2. A number of 41xx or WiSM all throughout the network (Production)


You follow the documentation and prime the APs in your lab. The APs naturally join the first WLC in a small subnet/network, which is the 21xx. Of course, it will upgrade/downgrade the APs' IOS, blah, blah, blah.


We all know how the APs discover and join the WLC. Does anyone know how the APs remember the WLC details? According to Cisco documentation: "Once joined, the AP will have one or more controller IP Addresses stored LOCALLY."


This means that once the AP is pulled out of the lab network and thrown into the production network, it will look for the 21xx details. Unfortunately, it is hit-and-miss whether the AP will say something like "I can't find the 21xx anywhere, I'll join the first WLC I can find."


This is where the command 'lwapp ap controller ip address x.x.x.x' comes in.


I've seen this scenario happen several times. I've set up DHCP Option 43 first and then used "lwapp ap controller ip address x.x.x.x" to sort out most of the problems Option 43 can't fix.


In my humble opinion ...


One more thing ... If you are using the CLI on the LWAPP AP and you are getting the error message "Error!! Command is disabled", this means that you are on the wrong IOS. Commands such as "clear lwapp" can only be invoked if the RCV IOS is being used.

laxcis Thu, 02/05/2009 - 19:53

FWIW - I was able to get rid of the "Error!! Command is disabled" message after I changed the username/pwd on the AP from its factory default. I had to reset one AP today, and of course it joined the wrong controller. My way around this was to delete the DNS entry for the wrong controller and reset the AP to factory default; it then jumped onto the right controller, and I re-added the DNS entry for the wrong controller. Cisco TAC isn't giving me much info on how to solve this besides "you can return it and we'll send you another one". Pretty frustrating. It almost seems like you have to rely on the command you and others posted ('lwapp ap controller ip address x.x.x.x').


-Ryan

Leo Laohoo Thu, 02/05/2009 - 19:58

Looks like I'm wrong with the "Error! Command is disabled" theory. You learn something new everyday. :)

laxcis Thu, 02/12/2009 - 16:04

Just wanted to post a follow-up to this. After working with TAC, we found a solution to my problem (APs joining the wrong controller). To be sure each AP joins the right controller, we created access lists in the GUI (Security, AP Policies). This worked great. Each controller only accepts certain APs. Thanks to all for the help on this.

l.mourits Tue, 02/24/2009 - 00:57

Ouch, that sounds like quite an ugly work-around. It may work great if you have a few controllers and a few APs, but what if you have 20 controllers and 300 APs and are still growing rapidly?


That would be a pain to manage, I think....


Anyway, that said, I am facing the exact same problem with access points sometimes not joining the controller you would expect.


My setup: DHCP option 43 provides the local controller address; DNS resolves to one controller address in a datacenter, which is also the guest anchor controller; OTAP is disabled; and the controllers are always in the server VLANs (the server VLANs have no helper addresses or anything similar, so the L3 broadcast is blocked and does not leave the controller's VLAN). The APs are in a different VLAN.


The problem I am facing is similar, and worse. Sometimes when an AP loses connection to its controller, the AP homes to another controller. In most cases, that controller is in another country/location, may have other country codes enabled, and may not even support the regulatory domain of the AP in question.


Then, when the original controller comes back online, the AP does not go back to its own controller (since it is not in the same mobility group). Resetting the AP brings it back to the original controller, but then the AP loses its country settings: when the AP joins a controller with the same regulatory domain as the AP itself, it seems to select the first enabled country code available in the domain.


In some cases, this results in radios staying down, and I have to manually re-enable the radios and set the correct country code from WCS.


Frustrating! Argh!!!!



From working with TAC, and even engineers who have been on-site, and from reading the documents, it seems the whole discovery and selection process has changed over the years with newer versions, but has not changed since 4.2.

However, one engineer stated it has changed in 5.2 (I am running the latest 5.2 version), but there are no docs to support it.


Another engineer informed me that in the LWAPP discovery response from a controller, the controller sends all IPs of controllers in its mobility group that it knows of. If so, that would require mobility groups to be different for each location. Other engineers state this is not required.


A lot of discrepancies out there, IMHO.


Wish there were one guru who could perfectly outline the whole discovery and selection process, and provide supporting documents showing that the explanation holds for 5.2.


Just venting a bit ;-)


kind regards,

Leo

Scott Pickles Tue, 02/24/2009 - 05:42

Leo -


Call or email me (check my profile) - I can talk more with you about this stuff. I have some documentation about how the process works in pre-5.0 code, but I suspect the difference the engineers are talking about is the migration from LWAPP to CAPWAP (Control And Provisioning of Wireless Access Points), which is an open standard.

dcaughey (Cisco Employee) Fri, 02/27/2009 - 14:01

Scott, the format of the option 43 command in IOS for the AP you are using is incorrect. It should be 'option 43 hex F1aabbccddee', where aa is the hex value of the number of bytes to follow (04 for one IP address, 08 for two) and bbccddee are the hex values of each octet of the IP address of the controller's management interface.
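
As an illustration of the format described above, here is a minimal Python sketch that builds the hex string from a list of controller management IPs. The function name `option43_hex` and the sample addresses are made up for the example; only the layout (type F1, one length byte, then four bytes per IPv4 address) comes from the post.

```python
import ipaddress


def option43_hex(controller_ips):
    """Build the Option 43 hex value: F1, length byte, then each IP's 4 octets."""
    packed = b"".join(ipaddress.IPv4Address(ip).packed for ip in controller_ips)
    if not packed:
        raise ValueError("at least one controller IP is required")
    if len(packed) > 255:
        raise ValueError("too many addresses for a one-byte length field")
    return "f1" + format(len(packed), "02x") + packed.hex()


print(option43_hex(["192.168.10.5"]))                   # f104c0a80a05
print(option43_hex(["192.168.10.5", "192.168.10.20"]))  # f108c0a80a05c0a80a14
```

The resulting string is what would follow 'option 43 hex' in the DHCP scope; the length byte is 04 for one address and 08 for two, matching the description.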

Scott Pickles Fri, 02/27/2009 - 17:58

dcaughey -


Thanks for the reply. I'm aware of the HEX configuration for Option 43. I just did a quick re-read of the documentation for that and didn't realize that the option for ascii only applies to 1000 series APs!! Thanks!!
