I am in need of some big help. TAC can't help me because they do not get involved in design; they only do break/fix. I totally understand that.
I got a simple office: one flat LAN, one single 1841 router and 2 ISPs.
LAN is 10.10.20.0/24 and is connected to a port on an HWIC card I installed in the 1841. Then FA0/0 connects to ISP1 and FA0/1 connects to ISP2.
Everything is fine except that I am having some issues with the Failover feature. Currently, I am using Object Tracking with SLAs. I am pinging 2 hosts located on the internet and then I have an SLA OR statement which basically say if ANY of the 2 objects are unreachable, DO NOT trigger a failover to ISP2. If in the case that BOTH objects become unreachable, then DO trigger a failover. It works like a charm.
Any internet hiccup obviously makes the router activate the tracks and redirects all traffic to ISP2. However, 99% of the time ISP1 is back online within minutes or seconds, so after 180 seconds the traffic gets redirected back to ISP1. So in essence, the customer suffers 2 interruptions.
Besides internet hiccups, I have also noticed that every time any user tries to copy a big file accross the tunnel (the 1841 has site to site tunnels with 2 branches) the tracks go crazy and the objects become unreachable so a failover is triggered. We were breaking our heads and fighting with the ISP1 provider because every time this happened, we called them but every time they kept telling us that their line was UP and running without any problems. So after careful investigation, I do admit they were right.... it is not so much that the ISP1 experiences hiccups, it is actually the fact that users putting heavy load into the router are causing it to have its track to stop reaching the objects.
Have you guys seen this behavior? Can you please help me?
Do I have to fine tune the tracks?
Can I introduce dynamic routing to the picture?
thank you in advanced.