I'm having big trouble getting this topology to work. The image is not the actual topology I'm working on; I removed some parts so it would be easier for you to focus on the problem I'm having.
SCENARIO
My goal is to remove any single point of failure (SPOF), so as you can see, we have two of everything. If a router/switch/server fails, there's another one. So far, I have GLBP and Pacemaker working just fine.
Switches have the default (blank) config. Routers have their own IP plus the GLBP virtual IP (on BVI1) with no additional options, and servers have their own IP plus the cluster IP (on bond0), also with no additional options.
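For reference, the server side is set up roughly like this (a sketch with iproute2; the interface names eth1/eth2 and bond0 are from my description, but the addresses are placeholders since I removed them from the image):

```shell
# Create the bond; no mode is specified, so the kernel default
# balance-rr (mode 0, round-robin) applies.
ip link add bond0 type bond

# Enslave the two physical interfaces (they must be down first).
ip link set eth1 down
ip link set eth1 master bond0
ip link set eth2 down
ip link set eth2 master bond0

# Placeholder addressing: per-server IP, with the shared .1 as gateway.
ip addr add 192.168.1.10/24 dev bond0
ip link set bond0 up
ip route add default via 192.168.1.1
```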
LABELS
- Green addresses are unique for each device.
- Red (virtual) addresses are shared by the devices on their side in order to provide fault tolerance and/or load balancing. Servers use .1 as their gateway.
- Purple addresses are used by servers to communicate/monitor each other and synchronize databases.
PROBLEMS
- Packets get duplicated and/or arrive on both physical server interfaces.
EXAMPLE
Ping 10 times to SRV1:
- 10 ping requests are sent.
- 13 ping requests are received on bond0 (6 on eth1 and 7 on eth2).
- 10 ping replies are sent.
Yes, all 10 pings succeeded despite the duplicated packets (some of you might think that's good enough), but when I use an upper-layer protocol such as SSH, and packets arrive on both physical interfaces (eth1 and eth2), it just doesn't work. Sometimes even ping doesn't work reliably. I don't know whether packets are being dropped or never received at all (I didn't have time to capture network traffic on that issue today).
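A couple of diagnostics that can narrow this down (standard Linux tools; run as root on the server):

```shell
# Show which bonding mode is actually in use and the state of each slave.
cat /proc/net/bonding/bond0

# Capture ICMP on each physical interface separately to see
# exactly where the duplicates arrive.
tcpdump -ni eth1 icmp
tcpdump -ni eth2 icmp
```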
This is my first time working with a high-availability network design, and I suspect this may be MAC-related.
Any help would be much appreciated.

[EDITED] Solution (December 5th):
According to the Linux kernel bonding documentation (Chapter 11: "Configuring Bonding for High Availability"), in this topology and with the equipment provided, it isn't possible to get both fault tolerance and load balancing on the servers' physical interfaces with the default bonding mode (balance-rr, a round-robin mode). The solution was to switch to active-backup mode, which keeps only one interface active at a time and provides fault tolerance only.
So now I have primary and backup links, which means there's a primary switch and a backup one. If one server's primary link goes down, the servers could end up connected to different switches, so I connected the two switches directly to avoid traffic having to go through the routers.
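The working configuration looks roughly like this (Debian-style ifupdown sketch; addresses are placeholders, and the choice of eth1 as primary is just an example):

```shell
# /etc/network/interfaces (fragment)
auto bond0
iface bond0 inet static
    address 192.168.1.10          # placeholder per-server IP
    netmask 255.255.255.0
    gateway 192.168.1.1           # shared virtual gateway (the red .1)
    bond-slaves eth1 eth2
    bond-mode active-backup       # fault tolerance only: one active slave
    bond-miimon 100               # check link state every 100 ms
    bond-primary eth1             # prefer the primary-switch link when up
```

With `bond-primary` set, the bond fails back to eth1 automatically once its link recovers.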
I hope this saves some time for anyone having the same issue.