Losing both primary and backup routes when the primary circuit goes down (EIGRP)
I've got an open case with Cisco but they haven't been able to find anything wrong with the config. I'm hoping someone out there has seen this before and can tell me what I'm missing.
I've got 2 locations. At site-A there is an internet connection to an ASA with a 3750x (IP Services) behind it. The 3750x has VLANs for both public and private addressing. This 3750x has a GRE tunnel interface connected over the internet to site-B. The ASA allows the GRE traffic between the two public IP addresses but is not actually part of the tunnel. This 3750x also has a layer 2 connection between the sites (basically a VLAN over the carrier's network). EIGRP is configured to propagate routes over both the GRE tunnel and the L2 link to another 3750x at site-B.
Site-B also has a 3750x, also running EIGRP, with the L2 connection and the other end of the GRE tunnel. The only real difference is that site-B has a Sonicwall, but I don't think that's relevant to the problem. I added a "delay 120" to the GRE tunnel interface on both ends to be sure this route stays at a higher cost unless the primary (L2) route is unavailable.
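For anyone without the attachment handy, the tunnel side described above looks roughly like the sketch below. The interface names, IP addresses, and EIGRP AS number are placeholders made up for illustration, not copied from the attached config; only the "delay 120" line comes from the description. Note that IOS takes the interface delay value in tens of microseconds.

```
! Site-A 3750x -- illustrative sketch only.
! Tunnel0, Vlan100, AS 100 and all addresses are placeholders.
interface Tunnel0
 description Backup path to site-B over the internet (GRE)
 ip address 10.255.0.1 255.255.255.252
 ! delay is entered in tens of microseconds
 delay 120
 tunnel source Vlan100
 tunnel destination 203.0.113.2
!
router eigrp 100
 ! advertise and form neighbors over the tunnel subnet
 network 10.255.0.0 0.0.0.3
```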
When everything is working, the EIGRP routes are propagated to/from each 3750x, traffic is routed over the L2 connection, and a "sho ip eigrp nei" shows both the primary (L2) and secondary (GRE tunnel) neighbors.
But when the L2 stops passing traffic (the interface stays up), I lose the EIGRP routes completely. I can ping across the GRE tunnel and even route traffic over it by manually creating routes. But I need the failover to be automatic, and the routing needs to be dynamic to support the changing networks at either site. A "sho ip eigrp nei" while the L2 is offline shows nothing. So it seems obvious to me that EIGRP isn't actually being passed over the GRE tunnel, and when the L2 goes down EIGRP loses all information about the other location. Is there a way I can check this at the tunnel level?
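As a rough sketch of what checking at the tunnel level could look like, these are standard IOS commands for seeing whether EIGRP is really running over the tunnel and which path the tunnel itself takes. Tunnel0 and 203.0.113.2 (standing in for the far-end public tunnel destination) are placeholders, not values from the attached config.

```
! Which interfaces EIGRP is enabled on, and the peer count on each
show ip eigrp interfaces
! Neighbor table, including the interface each neighbor was learned on
show ip eigrp neighbors detail
! Tunnel line protocol state, source/destination, and packet counters
show interfaces Tunnel0
! Which route is used to reach the tunnel destination
! (placeholder address; substitute the real far-end public IP)
show ip route 203.0.113.2
! Watch hellos being sent and received per interface (use with care)
debug eigrp packets hello
```

If the route to the tunnel destination points out the L2 link rather than toward the ASA, the GRE packets themselves depend on the L2 path, which would be consistent with both neighbors disappearing together. That is only one possibility to rule out, not a diagnosis.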
Cisco has looked over the configs and checked a few things on the 3750x and believes it should work, but I need to schedule a time to actually take the L2 offline so the Cisco tech can see it not working and gather more information. Finding a time for extended downtime isn't easy. But it did go down again today long enough for me to grab the EIGRP events.
Relevant config is attached. Any and all thoughts are greatly appreciated.