04-17-2003 11:27 PM
Hello,
We use a CSS 11503, and for some services we use an HTTP keepalive with the values (10 3 5). We sometimes see DOWN/ALIVE messages just one second apart:
APR 17 23:29:17 1/1 574 NETMAN-2: Enterprise:Service Transition:Test01 -> down
APR 17 23:29:18 1/1 575 NETMAN-5: Enterprise:Service Transition:Test01 -> alive
Why? I thought that, with the values we use for this keepalive, the minimum interval between the down and alive states should be 5 seconds.
Thanks a lot for any explanation of this 1-second gap between the DOWN and ALIVE states.
Didier
04-18-2003 07:09 AM
Hi, it sounds like we are having the same problem.
The service goes down for 1 second and then comes back active. We have an open TAC case on this issue. The services do not actually go down. Here is their current thinking:
Usually we see this type of log entry when the CS-800 is receiving a high volume of extraneous traffic. Because
we act as both a router and a bridge, we must examine each packet. This particular queue handles all non-specific
and non-IP traffic, including Spanning Tree BPDU, non-IP bridgeable traffic, ICMP, ARP, UDP fragments, and
packets with expired TTL. Under situations where the CS-800 receives a high number of these packets, such as
during a DOS attack or where other network anomalies exist, there may be occasional drops in this queue. This
should not have any impact on user TCP traffic, as TCP is sent to a different queue.
If this error appears in the log occasionally, then check the "show dos" commands to make sure the site is not
under attack. Also, check the network topology to make sure the routing is solid. If the log is filling rapidly
with these errors, then a packet capture may be helpful in isolating the cause.
04-23-2003 07:06 PM
Hi Didier,
It sounds like you are getting keepalive timeouts due to network load. I think you are seeing the failure of 3 keepalives, one sent every 10 seconds, and after the following 5-second retry period (35 seconds in total) the service is declared down. If a response to one of the keepalives is received 1 second after the service is declared down, the service is reported as up again. This could be why you see the service go down and then come back up 1 second later.
As I'm sure you are aware, having a service go down because of slow keepalives is not desirable. If you see this regularly, I would suggest that your best approach is to increase either the retryperiod or the maxfailure value.
Hope this helps!
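To make the arithmetic above concrete, here is a minimal Python sketch. It only reproduces the timing model described in this reply; it is not a verified description of how the CSS schedules keepalives. It assumes that (10 3 5) means frequency, maxfailure and retryperiod in that order, and the function and variable names are made up for illustration.

```python
# Sketch of the timing model described in this reply (an assumption,
# not confirmed CSS behaviour): maxfailure keepalives fail, one per
# frequency interval, then one retry period passes before "down".

def seconds_until_down(frequency: int, max_failure: int, retry_period: int) -> int:
    """Seconds from the start of the failing sequence to the 'down' transition."""
    return frequency * max_failure + retry_period


# Assumed mapping of the (10 3 5) values from the original question.
FREQUENCY, MAX_FAILURE, RETRY_PERIOD = 10, 3, 5

down_at = seconds_until_down(FREQUENCY, MAX_FAILURE, RETRY_PERIOD)

# Hypothetical: a slow response arrives 1 second after the service
# was declared down, so it is marked alive again immediately.
late_response_delay = 1
alive_at = down_at + late_response_delay

print(f"declared down after {down_at} s")
print(f"declared alive after {alive_at} s")
print(f"gap between down and alive: {alive_at - down_at} s")
```

Under this model the gap between the down and alive log entries depends only on when the late response arrives, not on the retry period, which is why it can be shorter than 5 seconds.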