Hello all; looking to solicit some opinions on what might have caused an issue we experienced recently. A caveat: I am a systems administrator, so I don't know the exact models of all the hardware involved (but could get this information if it would be helpful).
On one of our networks, we have a cluster of servers running an application that depends on multicast. After a recent issue that caused these servers and the networking equipment to power-cycle at least once, we found that the multicast traffic used by these servers was not propagating properly to the other servers in the cluster.
These servers are all on the same subnet (a /24) and attached to the same Catalyst switch (albeit on different blades), which appears to be configured to do layer 2 only and not to block multicast in any way, shape, or form.
However, we observed that the initial packet in a multicast transmission would be received by the other members of the multicast group exactly once. No further packets were received. A tcpdump (snoop in this case, as these are Solaris machines) showed that the multicast packets were indeed leaving the multicast "transmitter", but a corresponding capture on the receivers showed that after the first received packet, no more were entering the interface (at least to the point where they'd be processed by the sniffer).
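For anyone who wants to reproduce the receiver-side observation independently of our application, here is a minimal sketch of a receiver that joins a group and counts what arrives. The group address and port in the usage note are hypothetical placeholders (not our real values), and the function names are mine, for illustration only.

```python
import socket


def membership_request(group: str, iface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq payload: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def count_packets(group: str, port: int, timeout: float = 10.0) -> int:
    """Join `group`, then count datagrams until `timeout` seconds pass with none arriving."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group is what makes the host send an IGMP membership report.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    sock.settimeout(timeout)
    received = 0
    try:
        while True:
            sock.recvfrom(65535)
            received += 1
    except socket.timeout:
        pass  # no packet within the timeout window; stop counting
    finally:
        sock.close()
    return received
```

Running something like `count_packets("239.1.2.3", 5000)` (hypothetical group and port) on a receiver while the transmitter is sending would show whether the count stalls at 1, as it did for us.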
When we placed these machines on a dedicated Catalyst switch, attached to the previous switch via an uplink crossover cable, the multicast immediately began working as expected. As soon as the machines were placed back on the original switch, the problem returned.
Any ideas on what might have been the culprit here? We initially suspected an IGMP routing device on the same subnet, but if that were the case, it would seem that the problem should have persisted even after we added the secondary switch.
The MACs for the multicast groups, as reported by arp -an on the machines themselves, did fall within the IPv4 multicast MAC address range (01:00:5e:00:00:00 through 01:00:5e:7f:ff:ff), FWIW.
Just looking for suggestions / brainstorming on this issue, as we will likely revisit it later.
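For reference, the check I mean here is the standard IPv4 multicast mapping: the MAC is the fixed prefix 01:00:5e followed by the low 23 bits of the group address. A small sketch of that mapping follows; the function name and the example group addresses are hypothetical and are not the groups our application uses.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # Keep only the low 23 bits of the group and place them under 01:00:5e.
    low23 = int(addr) & 0x7FFFFF
    mac = 0x01005E000000 | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


print(multicast_mac("239.1.2.3"))    # 01:00:5e:01:02:03
print(multicast_mac("224.129.2.3"))  # 01:00:5e:01:02:03 (same MAC, different group)
```

Because only 23 bits carry over, 32 different group addresses share each MAC, so a matching MAC confirms the range but does not uniquely identify the group.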
Thanks in advance, and happy to provide additional technical information if requested.