
Lightweight APs start failing after a couple of failovers to a backup WLC

I built a test environment with two APs (1142 and 1252) and two 4402 WLCs.

I configured them for failover.

The APs join the first WLC and work fine.

After a couple of simulated failures (cable disconnects), though, the APs can no longer join either of the WLCs.

I get the following error at the APs:

*Aug 14 10:26:10.000: DTLS_CLIENT_ERROR: ../dtls/dtls_connection_db.c:2013 Max retransmission count reached!

*Aug 14 10:26:10.000: %DTLS-3-HANDSHAKE_RETRANSMIT: Max retransmit count for 192.168.33.22 is reached.
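When collecting console output from several APs, it can help to tally which controller address the DTLS handshake is giving up on. The following is a minimal Python sketch, not a Cisco tool: the function name `count_dtls_retransmit_failures` is hypothetical, and the regular expression assumes the message format matches the two sample lines above exactly, which may not hold on other software releases.

```python
import re
from collections import Counter

# Assumption: the message format is inferred only from the sample lines
# in this thread; other releases may word it differently.
RETRANSMIT_RE = re.compile(
    r"%DTLS-3-HANDSHAKE_RETRANSMIT: Max retransmit count for "
    r"(?P<peer>\d{1,3}(?:\.\d{1,3}){3}) is reached"
)


def count_dtls_retransmit_failures(log_text):
    """Return a Counter mapping peer IP -> number of abandoned handshakes."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RETRANSMIT_RE.search(line)
        if match:
            counts[match.group("peer")] += 1
    return counts


# The two log lines quoted in the post above.
sample = (
    "*Aug 14 10:26:10.000: DTLS_CLIENT_ERROR: "
    "../dtls/dtls_connection_db.c:2013 Max retransmission count reached!\n"
    "*Aug 14 10:26:10.000: %DTLS-3-HANDSHAKE_RETRANSMIT: "
    "Max retransmit count for 192.168.33.22 is reached.\n"
)

failures = count_dtls_retransmit_failures(sample)
for peer, hits in failures.items():
    print(peer, hits)
```

If every failure points at the same address, that suggests the APs are consistently targeting one AP-manager interface rather than alternating between controllers.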

Does anyone know what could cause this?

After I reboot the APs, they work fine again.

Accepted Solution

Good research, man! A little about CAPWAP, though. On the 4400, 2100, WiSM, 3750, and ISR series controllers, CAPWAP replaces LWAPP as the encrypted transport for AP management traffic only. This is done in a DTLS tunnel; all client traffic is still sent in the clear. With a 5500 series controller, ALL traffic is sent in the DTLS tunnel, both AP management and client traffic. This is the most powerful form of CAPWAP security but also the most processor- and memory-intensive, which is why only the 5508 can support it at this time. As an option, the 5508 can be set to behave just like an older series controller, which reduces the load on the controller's hardware resources. This is generally how I configure the controller, as I don't worry too much about traffic on the copper cables unless I am in a very secure environment. I assume that a good IPS box exists on the physical plant to protect against incursions.

As for rating your own post, I know this is a big problem with this new format. Just pick the answer that makes you the happiest and the post will show as solved. When viewers research your problem, the thread will still show as solved.


9 Replies

Leo Laohoo
Hall of Fame

What firmware are the WLCs running?

How did you enter the High Availability information of the controllers? Per AP or globally?

Console into the APs that are not working and in enable mode, enter the command sh config ap general. Under the Primary/Secondary controllers what are the details?

Firmware: 6.0.182.0 (ED)

The high availability settings are set on the APs. The primary controller and secondary controller entries are the same on both APs.

Both controllers are in the same mobility group, and they "see" each other as "Up" in the web interface.

Let me explain the problem better:

After a couple of times of unplugging the network cables of the primary controller so that I force the APs to join the secondary, the APs cannot join any controller any more.

Debugging shows:

1. APs send discovery messages.

2. The messages are received by the WLCs.

3. The WLCs send a response.

4. APs receive the response.

5. APs try to connect via CAPWAP to the correct AP manager (there is only one) IP address and then they fail and I get the error messages shown above.
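The five steps above can be treated as a checklist for locating where the join breaks. The sketch below is illustrative only: the step labels paraphrase the list above, and `first_failed_step` is a hypothetical helper, not part of any Cisco debug facility.

```python
# Step labels paraphrase the debugging sequence described in this post.
JOIN_STEPS = [
    "AP sends discovery message",
    "WLC receives discovery message",
    "WLC sends discovery response",
    "AP receives discovery response",
    "AP connects via CAPWAP (DTLS handshake) to the AP manager",
]


def first_failed_step(observed):
    """Return the label of the first step that failed, or None if all passed.

    `observed` is a list of booleans, one per entry in JOIN_STEPS,
    filled in by hand from the debug output.
    """
    if len(observed) != len(JOIN_STEPS):
        raise ValueError("expected one observation per join step")
    for step, succeeded in zip(JOIN_STEPS, observed):
        if not succeeded:
            return step
    return None


# The situation reported here: steps 1-4 succeed, step 5 fails.
reported = [True, True, True, True, False]
print(first_failed_step(reported))
```

Since discovery completes in both directions and only the last step fails, basic reachability between the APs and the WLCs appears intact and the fault is confined to the DTLS/CAPWAP join phase.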

I should note that both WLCs connect to the network via two Gigabit Ethernet interfaces with link aggregation (LAG). I have already configured the switch for src-dst-ip load balancing, as recommended by the configuration guides.

6.0.184 will have the fix for CSCta09996.

George Stefanick
VIP Alumni

Just wondering, do you have the mobility groups set on the controllers?

"Satisfaction does not come from knowing the solution, it comes from knowing why." - Rosalind Franklin
___________________________________________________________

Lucien Avramov
Level 10

You are most likely running into bug CSCta09996.

Problem summary of this defect:

When the backup port becomes active because the primary port went down, some LAPs fail to associate to the WLC.

4.2.176 code of WLC is not affected.

5.2 code is affected by this defect.

It could be related, but in my case none of the access points ever connects to either of the two WLCs. Only a reboot of the LAP fixes the problem.

By the way, I thought DTLS was an encryption protocol for the communication between LAP and WLC.

But as far as I know, it is only supported on the 5508 WLC and not on the 4402, which is the one I use.

Could it be a bug where the APs try to use encryption while the WLC doesn't support it?

The problem was solved by using DHCP instead of static IP and static WLC IPs.

It is also a bug known to Cisco.

PS: I cannot choose my own answer as the correct answer to the thread, so the thread remains "Not answered". Perhaps this is something that the admins should look at.

