APs fail to get DHCP from WLC after upgrade from 4.0 to 4.2

Mar 23rd, 2010

Hi all,


following scenario:


A WLC 4402-25 running 4.0.217.0 with several LAP1131 APs. A local DHCP scope is configured on the WLC for the APs. The management interface and two ap-manager interfaces are in the same subnet. No LAG. Everything works fine: the APs get DHCP addresses from the controller, join via LWAPP, happy.


Now I have tried to upgrade to 4.2.205.0. The controller upgrade went fine. After the reboot, the APs connect to the controller with their existing IP addresses, download the new software successfully and reboot. After that reboot the APs try to get DHCP addresses from the WLC, but the controller log says:


Tue Mar 23 10:22:13 2010: 00:18:ba:75:a3:78 DHCP received op BOOTREQUEST (1) (len 584, port 1, encap 0xec00)

Tue Mar 23 10:22:13 2010: 00:18:ba:75:a3:78 DHCP dropping packet from AP 00:18:ba:75:a3:78 received on port 1, vlan 16455


I can't find VLAN 16455 anywhere in the configuration; the configured management VLAN is 71.



I copied the original 4.0.217.0 config to another WLC in the lab (4402-12) and used the same AP as above. I had to change the IP addresses in the interface and DHCP configuration. No LAG, and only one active interface. The upgrade to 4.2.205.0 went fine, including the APs getting DHCP after reboot:


Tue Mar 23 14:48:31 2010: 00:18:ba:75:a3:78 DHCP received op BOOTREQUEST (1) (len 584, port 1, encap 0xec00)
Tue Mar 23 14:48:31 2010: 00:18:ba:75:a3:78 DHCP received a REQUEST on 'management' interface from AP -- bouncing to local DHCP server.
Tue Mar 23 14:48:31 2010: 00:18:ba:75:a3:78 DHCP sending to local dhcp server (0.0.0.0:68 -> 10.1xx.x.xxx:1067, len 302)



I compared both resulting 4.2.205.0 configs but found differences only in MACs, IPs and timestamps.



Why is this controller dropping the DHCP packets in the live environment after the upgrade? What is VLAN 16455?

I have repeated the upgrade several times, with and without intermediate steps (4.1.xxx); the resulting 4.2 software always behaves the same.



Cheers,

Kai

dancampb Tue, 03/23/2010 - 11:14
User Badges:
  • Cisco Employee,

Do you have DHCP proxy enabled or disabled? If you are using the internal DHCP server on the controller, proxy has to be enabled. In 4.0, disabling proxy just meant that the DHCP server reported to the client was the actual DHCP server's IP instead of the controller's virtual interface IP. In 4.2, having proxy disabled means that the DHCP request is just broadcast out into the VLAN.

kai.freese Wed, 03/24/2010 - 07:47

Yes, good idea. But no, didn't help.


DHCP proxy is enabled in both environments: the failing live controller and the working lab controller.

Neither disabling nor re-enabling DHCP proxy changed anything on the live controller. The same error messages always occur.


Cheers,

Kai

Scott Fella Fri, 03/26/2010 - 06:02
User Badges:
  • Super Silver, 17500 points or more
  • Hall of Fame,

    The Hall of Fame designation is a lifetime achievement award based on significant overall achievements in the community. 

  • Cisco Designated VIP,

    2017 Wireless

Kai,


If possible, try erasing the configuration, upgrading the firmware and configuring the WLC manually instead of restoring a backup. It could be a corrupt config or image.

George Stefanick Sun, 03/28/2010 - 08:42
User Badges:
  • Purple, 4500 points or more
  • Community Spotlight Award,

    Best Publication, October 2015

Just wondering if you tried (maybe you did) extending the AP to the same VLAN/subnet as the management interface to see if it gets an IP?


I would also perhaps try LAG, just for testing purposes.


VLAN 16455: I don't think I've ever come across this one myself. I did a Google search as well and didn't see anything pertaining to it either.

kai.freese Mon, 03/29/2010 - 01:53

@Scott: A clean install is an idea, but I have 8 other WLCs around the country. I don't want to risk running into the same trap before I have either a solution that doesn't need hands on the device or a clear reason why this device is the only faulty one.


I'm relatively sure that the software image is not corrupt. It works fine in the lab, and I have tried several times on the live controller without any change.


But I will try a clean install with 4.2 and compare the final configuration with the faulty one. Maybe then I can see a difference.


@George: The APs are already in the same VLAN and subnet. Trying LAG is another idea, yes.


Thanks so far, I will come back when I have more results.


Kai

Scott Fella Mon, 03/29/2010 - 05:11

Kai,


It doesn't mean you will have that issue when you upgrade each WLC, but the last-resort scenario for troubleshooting is to load the code, configure the basics manually and test. If it works, then you can configure everything else. Since you were able to take the code from one WLC to another and it worked in a lab scenario, it looks like it might just be that one controller, especially if nothing changed in the network except for the WLC upgrade.

kai.freese Fri, 06/18/2010 - 02:01

All,


just for information: after contacting TAC we did a deep dive in several directions. Finally we found a new bug in the WLC software: CSCth31837.


The reason was that packets in this LAN carry a valid CoS value other than zero, together with a valid VLAN ID, in a valid 802.1Q tag.

But the software calculated the VLAN ID from the whole 802.1Q tag including the CoS bits, resulting in VLAN 16455.
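
For anyone wondering where 16455 comes from, here is a minimal sketch of the arithmetic, assuming the standard 802.1Q tag control field layout (3 priority/CoS bits, 1 DEI/CFI bit, 12 VLAN ID bits). The function names are made up for illustration, and the CoS value of 2 is only inferred from the logged number together with management VLAN 71.

```python
# 802.1Q Tag Control Information (TCI) is 16 bits:
#   bits 15-13: PCP (priority / CoS)
#   bit  12   : DEI (formerly CFI)
#   bits 11-0 : VLAN ID

def parse_tci(tci: int) -> dict:
    """Split a 16-bit TCI value into its three fields."""
    return {
        "pcp": (tci >> 13) & 0x7,
        "dei": (tci >> 12) & 0x1,
        "vid": tci & 0x0FFF,
    }

def build_tci(pcp: int, dei: int, vid: int) -> int:
    """Assemble a 16-bit TCI value from its three fields."""
    return ((pcp & 0x7) << 13) | ((dei & 0x1) << 12) | (vid & 0x0FFF)

# The "vlan" number from the controller log, read as a whole TCI:
print(parse_tci(16455))      # {'pcp': 2, 'dei': 0, 'vid': 71}

# The other way round: VLAN 71 with CoS 2 gives the logged value.
print(build_tci(2, 0, 71))   # 16455
```

So 16455 is 0x4047, which is VLAN 71 (0x047) with the priority bits set to 2. With CoS 0 the whole field equals the VLAN ID, which fits the lab setup working fine.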


The proven workaround is to clear the CoS bits to zero for all packets travelling towards the WLC.
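
To illustrate what that workaround does to a frame, here is a rough sketch that zeroes the priority bits of an 802.1Q-tagged Ethernet frame. This is not switch configuration, only a model of the effect on the tag; the helper name `clear_pcp` and the sample frame are hypothetical, and the frame is assumed to be given without its trailing FCS (which would otherwise need to be recomputed).

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag

def clear_pcp(frame: bytes) -> bytes:
    """Return a copy of a tagged Ethernet frame with the PCP/CoS bits set to 0.

    Untagged or too-short frames are returned unchanged.
    """
    if len(frame) < 16:
        return frame
    # Bytes 0-11 are destination and source MAC; TPID and TCI follow.
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        return frame
    new_tci = tci & 0x1FFF  # keep DEI and VLAN ID, drop the three PCP bits
    return frame[:14] + struct.pack("!H", new_tci) + frame[16:]

# Sample frame: zeroed MACs, 802.1Q tag with TCI 16455 (VLAN 71, CoS 2),
# EtherType IPv4 and a dummy payload.
sample = bytes(12) + struct.pack("!HH", TPID_8021Q, 16455) + struct.pack("!H", 0x0800) + b"payload"
cleaned = clear_pcp(sample)
print(struct.unpack_from("!H", cleaned, 14)[0])  # 71
```

After clearing, the whole tag control field equals the VLAN ID again, so even code that wrongly reads all 16 bits as the VLAN number ends up with 71.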


Thanks again for your contribution.


Cheers,

Kai

Greg Focaccio Fri, 07/22/2011 - 11:04

Hello,


So is there a recommended 4.2 version of code without the bug? 


I am having DHCP issues running a 3750 Integrated WLC and will be upgrading from 4.0 to 4.2 soon.


Thanks,

Greg

kai.freese Mon, 07/25/2011 - 05:50



Hi Greg,


After identifying the cause last year, the TAC engineer told me that Cisco was planning to fix it only in version 7.0 and later.


And checking the Bug Toolkit today: yes, only a 7.0 version is listed as fixed.


I'm currently running a 6.0 version with the workaround of CoS = 0 for all packets sent to the WLC - as described above.


Cheers,

Kai
