It sounds like you are getting keepalive timeouts due to network load. I think you are seeing three consecutive keepalives fail (one is sent every 10 seconds), followed by the 5-second retry period, for a total of 35 seconds, after which the service is declared down. If a response to one of the keepalives arrives 1 second after the service is declared down, the service will be reported as up again. This could be why you are seeing a service go down and then come back up 1 second later.
As I'm sure you are aware, having a service go down because of slow keepalives is not desirable. If you see this regularly, I would suggest that your best approach is to increase either the retryperiod or the maxfailure value.
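To see how much headroom each change buys, here is a rough sketch of the arithmetic used above (three failed keepalives at 10-second intervals plus the 5-second retry period). The function name and the `frequency_s` parameter name are my own for illustration, and the formula simply mirrors the 35-second estimate in this post; it is not taken from the device documentation, so treat the numbers as approximate.

```python
def time_to_declare_down(frequency_s, maxfailure, retryperiod_s):
    """Approximate seconds before a service is declared down.

    Assumption: follows the estimate in this post, i.e.
    maxfailure keepalives spaced frequency_s apart, plus one retry period.
    """
    return maxfailure * frequency_s + retryperiod_s


# Values described in this thread: 10 s interval, 3 failures, 5 s retry period
current = time_to_declare_down(frequency_s=10, maxfailure=3, retryperiod_s=5)

# Hypothetical tuning options (example values only)
longer_retry = time_to_declare_down(frequency_s=10, maxfailure=3, retryperiod_s=10)
more_failures = time_to_declare_down(frequency_s=10, maxfailure=5, retryperiod_s=5)

print(current, longer_retry, more_failures)
```

Under this estimate, raising maxfailure adds a full keepalive interval per extra failure allowed, while raising retryperiod adds only the extra seconds you configure, so maxfailure is the coarser of the two adjustments.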