I am looking to understand if there is any configuration method that will prevent failback to a primary link when it restores after a failure, so that traffic stays on the secondary OSPF neighbor. Something akin to the non-preempt behaviour of HSRP for LAN services.
We have dual routers that connect over OSPF to an ISP pair of routers. At present it is a standard OSPF configuration of relevant participating interfaces with a redistribute of our statics to the ISP.
The ISP then takes the OSPF updates and redistributes them into their BGP process and the BGP routes learnt from the WAN into the OSPF process to us.
We want to gain some control of failback should the ISP's primary link fail, so that service stays on the secondary router/links until we are ready to restore to primary manually.
Primary motivation is that our ISP seems to have intermittent link or route flaps on their BGP IPv4 VPN MPLS service on the primary circuit. We have no visibility of this. It is only reflected in OSPF LSA updates to our routers, which is where we want to gain some control. Obviously, once their link is restored the primary link comes back into service. It is this that the client wants us to prevent. We are looking for an automated means but are also considering manual intervention, which moves us to the secondary motivation.
Secondary motivation is that the ISP is not strictly honest, now or in the past, about topology events in their network. We cannot currently monitor their routers, because their legacy GETVPN setup prevents monitoring access from our leveraged management platform, so we can only react post-event. Until we can get our client to agree a change window to update the GETVPN ACLs issued by their key servers, allowing our monitoring devices to reach the encrypted interfaces, we are being tasked with preventing failback to the primary once it has flapped and restored. The delay is down to the fact that they have 20 or so critical sites all running GETVPN and they are historically nervous about changes to it.
The EEM policy looks like a possibility, however complicated. I too am sceptical about the possibility, but I wanted to broadcast it out to see if there was a technology feature I am not aware of.
The issue here seems to be the service provider. You should be able to trust your service provider and they should be able to guarantee you a certain level of uptime according to SLA if that was in your contract.
To be honest they seem kind of shady and I would look to replace them with someone else. Maybe you are already considering that option but I understand that you want to try to find a technical solution as well.
The only other thing I can think of is Performance Routing (PfR), previously called Optimized Edge Routing (OER). It can do policy routing and react to events such as ICMP replies taking too long, jitter, or an interface being congested.
It might be worth reading up on it to see if your equipment supports it.
Re: OSPF Dual homed WAN Links - Preventing failback
So you are redistributing the statics for your LAN subnets on your routers?
If so, EEM does not have to be that complicated. You could monitor the OSPF neighborship between your primary router and the ISP primary router. If the neighborship was lost, you could simply remove the redistribute static command from under your OSPF configuration. Then, when the link came back up, the ISP would no longer receive the routes via OSPF and so would not advertise them out via BGP.
Then, until you added that redistribute statement back, no traffic would come in via that link. I did think about shutting down the LAN interface on your primary router, but if you are using statics then they would still be advertised to the ISP and then via BGP, so traffic would come to your primary router and be dropped.
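As a rough, untested sketch of that idea, an EEM applet could match the adjacency-down syslog message and remove the redistribution. The OSPF process ID (1), the ISP primary neighbor ID (192.0.2.1) and the applet name are placeholders I have assumed, not values from your setup. It also assumes adjacency changes are being logged on the router, and you may need to adjust the pattern if the adjacency drops from FULL to a state other than DOWN.

```
! Placeholders (assumed): OSPF process 1, neighbor 192.0.2.1, applet name.
! Assumes OSPF adjacency changes are logged (log-adjacency-changes).
event manager applet OSPF-PRIMARY-NO-FAILBACK
 ! Trigger when the ISP primary neighbor drops out of FULL state
 event syslog pattern "OSPF-5-ADJCHG.*Nbr 192.0.2.1 on .*from FULL to DOWN"
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 action 3.0 cli command "router ospf 1"
 ! Stop advertising the statics so the ISP has nothing to put back into BGP
 action 4.0 cli command "no redistribute static"
 action 5.0 cli command "end"
 action 6.0 syslog msg "Primary OSPF neighbor lost: static redistribution removed, restore manually"
```

To fail back, you would re-enter your original redistribute static statement under the OSPF process by hand. The applet only changes the running configuration, so a reload without saving would bring the redistribution back.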
I have not used EEM, but I found an example where they monitor the OSPF neighborship and shut an interface down. It could easily be modified to remove the redistribute command instead.