I have a question regarding the failure detection time for an AP, if the corresponding WLC becomes unavailable.
We're talking about the current 7.2 release here - so no "old" technotes (<5.0) please :-))
I don't completely understand the software configuration guide, the available technotes and some Cisco Live Slides regarding this topic.
Also, the current software configuration guide (7.2) doesn't explain the new fields "AP Retransmit Count" and "AP Retransmit Interval". It only states that these values may be configured, not what they do or what impact they have.
So here's how I understand the functionality:
In this example I used the following values:
- AP heartbeat timeout: 3 seconds
- Local mode AP fast heartbeat: 1 second
- AP primary discovery timeout: 30 seconds (but this is not important in this context, I guess)
The AP probes the WLC every "AP heartbeat timeout" (3 seconds) with an echo request.
First question: How long does the AP wait for a response from the WLC? If I understand it correctly, the "heartbeat timeout" value is more or less the equivalent of the hello interval (HSRP, EIGRP, OSPF), but what's the dead time here?
When the WLC doesn't answer the echo request, the AP starts sending fast heartbeat messages (every 1 second, "Local mode AP fast heartbeat").
Second question: What is the dead interval for those fast messages?
After three failed fast heartbeats (at least I think it's three; I don't know of any document stating how many fast heartbeats need to fail), the AP switches to the backup controller.
What I want to achieve is something like a simple convergence time calculation, like everyone knows from routing, STP and so on :-)
- What is the failure detection time? (heartbeat timeout + <x> times fast heartbeat + .... I don't know)
- After the failure detection time, how long does it take to move to the second controller, going through the join, configuration and run states (assuming same config, same SW)? I know this is hard to answer, so I'd already be glad to have an answer to the first question ("failure detection time").
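
To make the first bullet concrete, here is a minimal sketch of the calculation as I currently understand it. Everything in it is an assumption on my side and not confirmed behavior: the function name, the count of three failed fast heartbeats, and the `response_wait` placeholder for the unknown "dead time" of a single echo request are all hypothetical.

```python
def failure_detection_time(heartbeat_timeout, fast_heartbeat_interval,
                           failed_fast_heartbeats=3, response_wait=0.0):
    """Rough worst-case estimate in seconds (my guess, not documented behavior).

    heartbeat_timeout       -- "AP heartbeat timeout": interval between echo requests
    fast_heartbeat_interval -- "Local mode AP fast heartbeat" interval
    failed_fast_heartbeats  -- ASSUMED number of fast heartbeats that must fail
    response_wait           -- UNKNOWN time the AP waits for an echo response
    """
    # Worst case: the WLC dies right after answering an echo request,
    # so a full heartbeat interval passes before the next probe is sent.
    wait_for_next_probe = heartbeat_timeout
    # Then the assumed number of fast heartbeats has to go unanswered.
    fast_phase = failed_fast_heartbeats * fast_heartbeat_interval
    return wait_for_next_probe + response_wait + fast_phase


# Values from the example above: 3 s heartbeat timeout, 1 s fast heartbeat.
estimate = failure_detection_time(heartbeat_timeout=3, fast_heartbeat_interval=1)
print(f"Estimated failure detection time: {estimate} s")
```

With the example values this gives 6 seconds, but only if the guessed count of three is right and the unknown response wait is zero, which is exactly what I'd like to have confirmed or corrected.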
Perhaps we can develop this together and create some kind of document here - I think a good document explaining this is needed.
At least I'm not aware of any document explaining this without leaving some open questions. I read some sections of RFC 5415 (CAPWAP) as well, but that didn't explain all of it :-)
So I would be really grateful for some input regarding this topic.