Lightweight APs start failing after a couple of failovers to a backup WLC

Answered Question
Aug 14th, 2009

I built a test environment with 2 APs (1142 and 1252) and 2 4402 WLCs.

I configured them for failover.

The APs join the 1st one and work fine.

After a couple of simulated failures (cable disconnects), though, the APs can no longer join either of the WLCs.

I get the following error at the APs:

*Aug 14 10:26:10.000: DTLS_CLIENT_ERROR: ../dtls/dtls_connection_db.c:2013 Max retransmission count reached!

*Aug 14 10:26:10.000: %DTLS-3-HANDSHAKE_RETRANSMIT: Max retransmit count for 192.168.33.22 is reached.
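When several APs are affected, it can help to tally which controller address the DTLS handshake is giving up on. The following is only a rough Python sketch that scans saved AP console output for the `%DTLS-3-HANDSHAKE_RETRANSMIT` line quoted above. The function name `failed_dtls_peers` and the idea of saving the console output to text are assumptions for illustration, not a Cisco tool.

```python
import re

# Matches the %DTLS-3-HANDSHAKE_RETRANSMIT message quoted in this post
# and captures the peer (controller) IP address it reports.
HANDSHAKE_RE = re.compile(
    r"%DTLS-3-HANDSHAKE_RETRANSMIT: Max retransmit count for "
    r"(?P<peer>\d{1,3}(?:\.\d{1,3}){3}) is reached"
)


def failed_dtls_peers(log_text):
    """Return a dict mapping peer IP -> number of exhausted DTLS handshakes."""
    counts = {}
    for line in log_text.splitlines():
        match = HANDSHAKE_RE.search(line)
        if match:
            peer = match.group("peer")
            counts[peer] = counts.get(peer, 0) + 1
    return counts


# The two log lines from this post, used as sample input.
sample = (
    "*Aug 14 10:26:10.000: DTLS_CLIENT_ERROR: "
    "../dtls/dtls_connection_db.c:2013 Max retransmission count reached!\n"
    "*Aug 14 10:26:10.000: %DTLS-3-HANDSHAKE_RETRANSMIT: "
    "Max retransmit count for 192.168.33.22 is reached.\n"
)
print(failed_dtls_peers(sample))
```

If every failure points at the same address, that suggests the APs keep retrying one AP-manager interface rather than failing against both controllers.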

Does anyone know what could cause this?

After I reboot the APs, they work fine again.

Leo Laohoo Fri, 08/14/2009 - 20:17

What firmware are the WLCs running?

How did you enter the High Availability information of the controllers? Per AP or globally?

Console into the APs that are not working and in enable mode, enter the command sh config ap general. Under the Primary/Secondary controllers what are the details?

Georgios Nikitas Sun, 08/16/2009 - 23:02

Firmware: 6.0.182.0 (ED)

High Availability settings are set on the APs. The Primary Controller and Secondary Controller entries are the same on both APs.

Both controllers are set on the same mobility group and they "see" each other as "Up" from the web interface.

Let me explain the problem better:

After I unplug the network cables of the primary controller a couple of times to force the APs to join the secondary, the APs can no longer join any controller.

Debugging shows:

1. APs send discovery messages.

2. The messages are received by the WLCs.

3. The WLCs send a response.

4. APs receive the response.

5. APs try to connect via CAPWAP to the correct AP manager (there is only one) IP address and then they fail and I get the error messages shown above.
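The steps above can be sketched as an ordered sequence, which makes it easier to say exactly where the join stalls. This is a minimal Python illustration only. The stage names and the helper `first_failed_stage` are hypothetical labels for the steps listed in this post, not identifiers from the AP or WLC software.

```python
from enum import Enum


class JoinStage(Enum):
    """Stages an AP goes through, in the order described in this post."""
    DISCOVERY_REQUEST = 1   # AP sends discovery messages
    DISCOVERY_RESPONSE = 2  # WLC replies and the AP receives the reply
    DTLS_HANDSHAKE = 3      # AP opens the DTLS session to the AP manager
    JOIN = 4                # AP joins the controller over CAPWAP


def first_failed_stage(observed):
    """Return the earliest stage not seen in the debugs, or None if all were seen."""
    for stage in JoinStage:  # Enum members iterate in definition order
        if stage not in observed:
            return stage
    return None


# What the debugs in this thread show: discovery works in both directions,
# but the DTLS handshake never completes.
seen = {JoinStage.DISCOVERY_REQUEST, JoinStage.DISCOVERY_RESPONSE}
print(first_failed_stage(seen))
```

Under these assumptions the failure sits at the DTLS handshake, which matches the "Max retransmit count" errors: basic IP reachability is fine, so the problem is in the session setup itself.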

I should note that both WLCs connect to the network via 2 Gigabit Ethernet interfaces with link aggregation (LAG). I have already configured the switch for src-dest-ip load balancing, as recommended by the configuration guides.
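To illustrate why the load-balancing method matters with LAG, here is a deliberately simplified Python sketch of address-based link selection. The XOR-and-modulo hash is an assumption chosen for illustration, not the switch's actual algorithm, and the AP address 192.168.33.50 is made up. Only 192.168.33.22 comes from the logs above.

```python
import ipaddress


def pick_link(src_ip, dst_ip, num_links=2):
    """Pick a LAG member link from the source and destination IPv4 addresses.

    Simplified stand-in for src-dest-ip load balancing: XOR the two
    addresses and take the result modulo the number of links.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % num_links


ap = "192.168.33.50"   # hypothetical AP address
wlc = "192.168.33.22"  # controller address from the error message

# The same AP/WLC pair maps to the same link in both directions.
print(pick_link(ap, wlc), pick_link(wlc, ap))
```

The point of the sketch is that a hash over both addresses keeps all traffic between one AP and the controller on a single physical link, regardless of direction.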

Lucien Avramov Sun, 08/16/2009 - 14:46

You are most likely running into bug CSCta09996.

Problem summary of this defect:

When the backup port becomes active because the primary port goes down, some LAPs fail to associate to the WLC.

4.2.176 code of WLC is not affected.

5.2 code is affected by this defect.

Georgios Nikitas Sun, 08/16/2009 - 23:07

It could be related but in my case none of the Access Points ever connect to any of the 2 WLCs. Only a reboot of the LAP fixes the problem.

Georgios Nikitas Mon, 08/17/2009 - 01:27

Btw, I thought DTLS was an encryption protocol for encrypting the communication between LAP and WLC.

But as far as I know it is only supported on the 5508 WLC and not on the 4402, which is the one I use.

Could it be a bug that the APs try to use encryption while the WLC doesn't support it?

Georgios Nikitas Fri, 11/27/2009 - 01:01

The problem was solved by using DHCP instead of static IP and static WLC IPs.

It is also a known bug by Cisco.

PS: I cannot choose my own answer as the correct answer to the thread, so the thread remains "Not answered". Perhaps this is something the admins should look at.

Correct Answer
dennischolmes Mon, 11/30/2009 - 03:18

Good research, man! A little about CAPWAP, though. On the 4400, 2100, WiSM, 3750, and ISR series controllers, CAPWAP replaces LWAPP as the encrypted transport for AP management traffic only. This is done in a DTLS tunnel, and all client traffic is still sent in the clear. With a 5500 series controller, ALL traffic is sent in the DTLS tunnel, both AP management and client traffic. This is the most powerful form of CAPWAP security but also the most processor- and memory-intensive, which is why only the 5508 can support it at this time. As an option, the 5508 can be set to behave just like an older series controller, which reduces the hardware resource load on the controller. This is generally how I configure the controller, as I don't worry too much about traffic on the copper cables unless I am in a very secure environment. I assume that a good IPS box exists on the physical plant to protect against incursions.

As for rating your own post, I know this is a big problem with this new format. Just pick the answer that makes you the happiest of all and the post will show as solved. When a viewer researches your problem, it will still show as solved.
