I am still experiencing a nagging issue with my BGP router. I can't be sure if it's an issue with my router or one of my SPs. If I run continuous pings against my router's two serial interfaces, packets drop intermittently. As I watch the echo replies, they seem to drop simultaneously from both interfaces. I'll see replies from both for several intervals, then drops for a few, then replies, drops, etc.
I have default routes set to both of my SPs (Verizon and ATT) in my config. When I do "sh ip bgp", both default routes show up in my BGP table as RIB failures. When I do "sh ip bgp rib-failure", it only lists one of those default routes (ATT), with the reason for the failure being "Higher admin distance". I assume this has something to do with the fact that the SPs are sending me their partial routes, which include a default route. When I deleted the Verizon default route, we lost Internet connectivity.
When I view ip bgp routes, the ATT default route is reported as a RIB-Failure Best Route (r>); the Verizon route is reported as a RIB-failure only (r).
I suspect the dropped echo replies have something to do with the default route issue, but can't be sure. Does anyone have any idea what is going on here?
Any suggestions would be greatly appreciated.
The fact that only one of the default routes is selected as the best path is normal BGP behavior. The exception is when eBGP multipath is used, but it wouldn't help in your case: for eBGP multipath to use both 0/0 routes, they would need to come from the same neighbor AS.
As far as the intermittent ping loss is concerned, it could be due to the CPU on your router being busy running the BGP Scanner process.
What platform is this router?
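If you want to check the CPU theory, a rough sketch of what I would look at is below. These are standard IOS show commands, but their availability and output depend on your IOS release, so treat this as a starting point rather than a procedure. The idea is to see whether CPU spikes from the BGP processes line up with the intervals where the pings drop.

```
! Run these from the exec prompt while the continuous pings are going.
! Overall CPU, busiest processes listed first
show processes cpu sorted
! Only the BGP-related processes (BGP Scanner, BGP Router, BGP I/O)
show processes cpu | include BGP
! CPU utilization graphs for the last 60 seconds, 60 minutes and 72 hours
show processes cpu history
```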
Your assumption is correct: the RIB failure is most likely caused by a default route with a lower AD. If I understand correctly that you have static defaults configured, I would imagine they are the source.
I agree that the problem has something to do with your default routes, primarily because you shouldn't require them on your border routers. If you are load balancing, your IGP should be providing a couple of defaults that point to your border routers. The fact that you are losing your connection when the static is removed leads me to believe it is a configuration issue, either on your side or possibly your ISP's.
If you are getting !.!.!. when your static default is removed, you probably have a problem with the SP redistributing a static to the Internet, thereby black-holing your network.
Below is a sample config that should be pretty close to what you have.
If you can share your routing config and brief topology, we might be able to help more.
Thank you for your response, Jeff.
This is a Cisco 2691 with 128MB of memory.
Here is a partial config:
 ip address 184.108.40.206 255.255.255.252
 no ip unreachables
 service-module t1 timeslots 1-24
interface Serial0/1.100 point-to-point
 ip address 220.127.116.11 255.255.255.252
 no ip unreachables
 no cdp enable
 frame-relay interface-dlci 100 IETF
router bgp 31000
 network 18.104.22.168 mask 255.255.255.0
 timers bgp 30 90
 neighbor 22.214.171.124 remote-as 7018
 neighbor 126.96.36.199 description ATT Peering
 neighbor 188.8.131.52 soft-reconfiguration inbound
 neighbor 184.108.40.206 route-map ATT_only in
 neighbor 220.127.116.11 route-map localonly out
 neighbor 18.104.22.168 remote-as 6995
 neighbor 22.214.171.124 description Verizon Peering
 neighbor 126.96.36.199 soft-reconfiguration inbound
 neighbor 188.8.131.52 route-map Verizon_only in
 neighbor 184.108.40.206 route-map localonly out
ip route 0.0.0.0 0.0.0.0 220.127.116.11
ip route 0.0.0.0 0.0.0.0 18.104.22.168
ip as-path access-list 10 permit ^$
ip as-path access-list 20 permit ^1010$
ip as-path access-list 30 permit ^2020$
route-map ATT_only permit 10
 match as-path 20
route-map Verizon_only permit 10
 match as-path 30
route-map localonly permit 10
 match as-path 10
I rushed to make some assumptions when I stated that you didn't need the default routes. You may indeed need them, so don't pull them yet! ;)
Let me look over your config, nothing is jumping out as obviously wrong...
Maybe your problem is already solved with all these replies; I just want to contribute. I don't see any problem whatsoever with your BGP configuration. The problem is just with your default routes. If you were multihoming with the same SP, that would be fine. But as you are doing it with two different SPs, you need to configure the default routes with different administrative distances, giving priority to one of them.
When you say you are running continuous pings against your router's two serial interfaces, do you mean you are actually pinging the serial interface IP addresses on the router itself? Or are you pinging something that is reachable via those interfaces?
If you are pinging the router itself, you could be seeing echo replies drop because we rate limit echo replies and some other types of ICMP packets. Are you seeing other packets dropping across the link as well?
I am pinging the serial interface addresses of the router. When I ping one of my public addresses from the block I am advertising to my BGP peers (i.e. 22.214.171.124), I almost never get dropped packets. These replies are not sourced from the router.
From what I can guess, it looks as though the router is sending the echo replies out both interfaces in some sort of round-robin fashion. Since the replies are sourced from the router, and the source address of each reply is the router's serial interface, replies sent out through the other SP's network get blocked. That is, replies from the ATT interface that get sent out the Verizon network get blocked, and vice versa.
The only thing I might do is increase the admin distance on my static default routes so they stop showing as RIB-failures.
Does this make sense?
You can increase the admin distance on your static routes so that the BGP routes are installed instead. I would probably pull the static routes altogether, since you are getting BGP routes.
I don't think this will matter on the issue of dropping some echo replies, though, since I think that's just normal rate limiting.
I thought EGP was preferred over IGP (in my experience with BGP). The default route in BGP is the gateway of last resort.
When the two are combined, the EGP route will be injected into the routing table over the IGP route. If you see only the default route, that means your BGP is not working and the gateway of last resort is in use; hence, when you remove the default route, the connection is lost.
Have you done a "show ip bgp neighbor" to see if your BGP sessions with the two neighbors are established?
What do you see in "show ip bgp"? Only the default route?
You can also do a "show ip bgp summary".
It depends on several factors, one of which is the administrative distance of the route. On IOS a static route has a default administrative distance of 1 and an eBGP learned route 20, so static routes will always be preferred over eBGP learned routes, unless you make the static route "float" by increasing its administrative distance.
There are some cases where BGP ignores the administrative distance, and the non-BGP route is preferred over the BGP route, or the other way around, but not in this case--the statics will be installed instead of the eBGP learned routes.
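To make the "floating" idea concrete, here is a minimal sketch. The next-hop addresses are placeholders from the documentation ranges, not the real peer addresses, and 250 is just an arbitrary distance I picked that is higher than eBGP's default of 20. With statics like these, the eBGP learned defaults should be installed while the sessions are up, and the statics should only take over if the BGP defaults disappear.

```
! Hypothetical next hops: substitute each SP's end of your serial links.
! The trailing 250 is the administrative distance (assumed value, anything above 20 floats behind eBGP).
ip route 0.0.0.0 0.0.0.0 192.0.2.1 250
ip route 0.0.0.0 0.0.0.0 198.51.100.1 250
!
! Afterwards, from the exec prompt, check which default actually got installed
! and whether the RIB-failure entries are gone:
!   show ip route 0.0.0.0
!   show ip bgp rib-failure
```

Make the change in a maintenance window, since the default route will switch from the static to the BGP learned path.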
Thanks, Russ. I believe I may have this issue under control. Some things, such as the SPs' router configs and, to some extent, the routes BGP uses in and out of my network, are out of my control. This exercise began because one of the SPs was complaining that the echo replies for their monitoring station were being dropped intermittently. That resulted in some navel gazing and experimentation to try to determine what was going on.
As it stands, it looks as though one of our SPs is not advertising our specific route; it is being summarized. I entered static routes for the SPs' management LANs to force the echo responses to use the correct path.
I'm glad to hear things are under control. You may want to consider removing those static default routes if you don't need them. I can't see if you're using them to redistribute into an IGP, so proceed with caution and common sense on this advice.
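For anyone following along, the kind of static route described here would look roughly like the sketch below. Both the management LAN prefix and the next hop are made-up placeholders from the documentation ranges, not the addresses actually used.

```
! Hypothetical: 203.0.113.0/24 stands in for one SP's monitoring LAN,
! and 192.0.2.1 for that same SP's end of the serial link.
! Replies to the monitoring station then always leave via that SP.
ip route 203.0.113.0 255.255.255.0 192.0.2.1
```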
I suggest this, because if one of your upstream providers has problems, but your link to the next hop stays up, you'll lose the eBGP route, but the static will continue to send traffic to the troubled ISP. This kind of undermines the whole purpose of dual homing.
So, then, it could have been an RPF problem as well; at least that's what it sounds like. I would ask the other ISP to punch a hole in their summary for the other route.