This document briefly summarizes the 828 days problem often observed in CSS and CSM.
It explains in detail how the 828 days problem occurs and describes ways to avoid it.
CSS and CSM are based on 32-bit VxWorks.
The VxWorks clock counts at 60 Hz, so the tick counter increases by one every 1/60 of a second. You can read the counter with the tickGet() API. For example, if you call tickGet() five seconds after booting, it returns 0x12c (60Hz * 5sec = 300). CSS and CSM refer to this value for various purposes.
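The conversion between seconds and ticks used throughout this document can be sketched in C as follows. This is only an illustration: seconds_to_ticks() and TICKS_PER_SECOND are hypothetical names chosen for this example, not VxWorks or CSS identifiers, and the 60 Hz rate is the one stated above.

```c
#include <stdint.h>

/* Tick rate stated in this document: the counter advances 60 times per second. */
#define TICKS_PER_SECOND 60u

/* Hypothetical helper (not a VxWorks API): convert a duration in seconds
 * to the number of clock ticks that elapse in that time. */
static uint32_t seconds_to_ticks(uint32_t seconds)
{
    return seconds * TICKS_PER_SECOND;
}
```

With these definitions, seconds_to_ticks(5) gives 300 (0x12c), the value used in the rest of this document for a five-second interval.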
Because tickGet() returns a 32-bit value, the counter can hold 2^32 = 4294967296 distinct values, and its maximum value is 2^32 - 1 = 4294967295 (0xffffffff). When the counter passes this maximum, it wraps around to 0.
In other words, the tickGet() value is reset to 0 after 828 days and 12 hours have elapsed, according to the following formula.
2^32 / (60Hz*60sec*60min*24hr) = 828.5days
Therefore, various problems will occur after 828 days have elapsed.
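The 828.5 days figure can be checked with integer arithmetic. The sketch below assumes only the 60 Hz tick rate and the 32-bit counter width stated in this document. All names in it are hypothetical and chosen for illustration.

```c
#include <stdint.h>

#define TICKS_PER_SECOND 60u
#define SECONDS_PER_HOUR 3600u
#define HOURS_PER_DAY    24u

/* A 32-bit counter takes 2^32 distinct values before it wraps to 0. */
#define COUNTER_PERIOD_TICKS ((uint64_t)1 << 32)

/* Ticks that elapse in one hour and in one day at 60 Hz. */
#define TICKS_PER_HOUR ((uint64_t)TICKS_PER_SECOND * SECONDS_PER_HOUR)
#define TICKS_PER_DAY  (TICKS_PER_HOUR * HOURS_PER_DAY)

/* Whole days of uptime before the tick counter wraps. */
static uint32_t wrap_days(void)
{
    return (uint32_t)(COUNTER_PERIOD_TICKS / TICKS_PER_DAY);
}

/* Whole hours left over after the whole days have been subtracted. */
static uint32_t wrap_remainder_hours(void)
{
    return (uint32_t)((COUNTER_PERIOD_TICKS % TICKS_PER_DAY) / TICKS_PER_HOUR);
}
```

Here wrap_days() evaluates to 828 and wrap_remainder_hours() to 12, which matches the 828 days and 12 hours given above.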
Let us explain this problem in more detail, using the keepalive feature as an example. CSS sends keepalives regularly.
By default, CSS sends ICMP packets to a service every five seconds as an availability check.
Within CSS, the next transmission time is calculated from the current time and the keepalive interval (five seconds, 0x12c ticks): next_keepalive = tickGet() + 0x12c.
For example, if a keepalive is sent 3600 seconds after booting, the next ICMP packets will be sent 3605 seconds after booting.
When the value returned by tickGet() becomes larger than the tick count for 3605 seconds (tickGet() > next_keepalive), the keepalive packets are sent.
Now suppose the tickGet() value is 0xfffffff0. The next_keepalive value is calculated as 0xfffffff0 + 0x12c = 0x10000011c, but the maximum value of tickGet() is 2^32 - 1 = 0xffffffff. When the counter passes this maximum, it is reset to 0, while next_keepalive remains 0x10000011c.
In this case, the condition tickGet() > next_keepalive is never satisfied, and thus CSS stops sending keepalive packets.
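The failure described above can be reproduced in a small C sketch. It is not CSS source code. It assumes, for illustration only, that the deadline is held in a variable wider than 32 bits, so that 0xfffffff0 + 0x12c keeps the value 0x10000011c. All function names are hypothetical. The second pair of functions shows a common wrap-safe idiom (storing the deadline in 32 bits and comparing through the modular difference); it is included for contrast and is not claimed to be the fix applied in CSS or CSM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Five seconds at 60 Hz, as stated in this document. */
#define KEEPALIVE_INTERVAL_TICKS 0x12cu

/* ASSUMPTION: the deadline lives in a 64-bit variable, so it can exceed
 * the largest value a 32-bit tick counter ever returns. */
static uint64_t schedule_naive(uint32_t now)
{
    return (uint64_t)now + KEEPALIVE_INTERVAL_TICKS;
}

/* The check from the text: tickGet() > next_keepalive. */
static bool keepalive_due_naive(uint32_t now, uint64_t next_keepalive)
{
    return (uint64_t)now > next_keepalive;
}

/* Wrap-safe alternative: keep the deadline in 32 bits, so the addition
 * wraps modulo 2^32 exactly as the tick counter does. */
static uint32_t schedule_wrap_safe(uint32_t now)
{
    return now + KEEPALIVE_INTERVAL_TICKS;
}

/* Due when 'now' is strictly past the deadline, measured modulo 2^32.
 * Valid as long as intervals are shorter than 2^31 ticks. */
static bool keepalive_due_wrap_safe(uint32_t now, uint32_t next_keepalive)
{
    uint32_t elapsed = now - next_keepalive;  /* modular difference */
    return elapsed != 0u && elapsed < 0x80000000u;
}
```

With a deadline scheduled at tick 0xfffffff0, keepalive_due_naive() returns false for every possible 32-bit tick value, because no such value exceeds 0x10000011c. The wrap-safe pair schedules the deadline at 0x11c and reports it as due once the wrapped counter reaches 0x11d.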
Changing the base OS from 32-bit to 64-bit would also require significant changes in the CSS/CSM software that runs on the OS. Therefore, we have decided not to upgrade the base OS.
Instead, many bugs that could take effect after 828 days have been corrected in both CSS and CSM.
The root problem, however, remains. Therefore, we suggest that you reboot CSS/CSM before 828 days have elapsed.
Note: End of SW Maintenance Releases Date: September 20, 2012
Also, analysis of some of the reported failures determined that they could not be corrected.
To avoid these problems, it is recommended that CSS/CSM be rebooted every two years.
When the CSS reaches an uptime of 828 days, it cannot send packets to the management port for 18 minutes. This issue affects the management port only. The circuit and VIP addresses work fine. We recommend that you reboot the CSS before its uptime reaches 828 days.