We have two ACE4710 appliances running in redundant mode. LB1 hosts the active Context1 and a hot-standby Context2, while LB2 hosts the hot-standby Context1 and the active Context2, essentially to spread the load across the two boxes.
They have been running without a hitch for the two months since they were deployed. Lately, however, we have run into a very odd issue: the health probe configured on Context2 flaps down/up in rapid succession at least once a day. On closer inspection, the flapping evidently coincides with, and appears to be caused by, a status change in the context's FT group.
Strangely enough, this only happens on Context2, while Context1 is humming along just fine. Both contexts have a similar number of VIPs and health probes configured, differing only in IP addressing.
'show resource usage' shows an interesting finding (see attached): LB2 (where Context2 is active) shows a high 'denied' count for the mgmt-traffic rate. I'm not sure exactly what this indicates, or whether it's related at all.
Any insight on this is welcome.