02-15-2012 12:34 PM - edited 03-04-2019 03:16 PM
Here goes....
MPLS customer with 4 T1s in a multilink. If one of the T1s drops, there is a brief delay before traffic picks back up, and I actually lose packets from the premise back to the CO. You can see this loss both by pinging across the circuit and with techs on either end running JPerf. It can take as long as 6 seconds for the multilink to reconverge and for traffic to pick back up. In my experience this is normal behavior for multilinks, but I wanted to throw it out there just in case we're all missing something that can help the convergence time.
I'd also like to note that reconvergence is much quicker when you physically pull a T1, any of the T1s, rather than administratively shutting one down. I understand that hardware detection is quicker than software, and that's a good thing, obviously. I've tried this with and without ppp multilink fragment disable on either end, and every other combination between the two. Each of the 4 serial interfaces is on line timing. I tried free-running on the off chance that it could improve the loss, but it got worse, so I went back to line timing. I've even tried this on other CPE platforms, including two different versions of Adtran CPEs, and I get the same thing. Currently I have a new 2821 CPE in place and still see a brief period of traffic loss, up to 6-7 seconds at times.
Again, I've advised my colleagues that this is pretty normal behavior in my experience, but I want to get some outside input on this. Here are the configs:
7600 side:
interface Multilink592
 ip vrf forwarding ******************
 ip address *************************
 load-interval 30
 no peer neighbor-route
 ppp multilink
 ppp multilink group 592
 ppp multilink fragment disable
 no cdp enable
 service-policy output VPN-TEMPLATE-2
interface Serial9/2/0.2/9:0
 no ip address
 encapsulation ppp
 ppp multilink
 ppp multilink group 592
All 4 are the same, so I'll leave out the other 3.
2800 side:
interface Multilink1
 ip address ********************
 ppp multilink
 ppp multilink group 1
 max-reserved-bandwidth 90
controller T1 0/0/0
 framing esf
 linecode b8zs
 channel-group 1 timeslots 1-24
interface Serial0/0/0:1
 no ip address
 encapsulation ppp
 ppp multilink
 ppp multilink group 1
 max-reserved-bandwidth 90
Again, all 4 are the same, so I'll leave out the redundant info.
Any input is much appreciated.
02-16-2012 03:41 AM
Hi Vince,
When you shut down or disconnect a link on one side, the remote end detects the failure only when its keepalives expire. Until then, it keeps forwarding traffic to that interface, and those packets are lost.
To reduce the convergence time, you can tune the keepalive settings (the keepalive command) under the serial interfaces.
For example, try configuring each serial interface, on both sides, with "keepalive 1 1".
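As a rough sketch, this is what that could look like using the member interface names from the configs posted above. The first value is the keepalive period in seconds, and the second is the number of unanswered keepalives tolerated before the interface is brought down. Only one member link per side is shown here; the assumption is that the same line is repeated on the other three.

```
! 7600 side
interface Serial9/2/0.2/9:0
 ! send a keepalive every 1 second, bring the link down after 1 miss
 keepalive 1 1

! 2800 side
interface Serial0/0/0:1
 keepalive 1 1
```

The setting goes on the serial member links rather than on the Multilink interface, since it is the failure of an individual T1 that needs to be detected quickly.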
Let me know if it helps
Marco
02-16-2012 09:53 AM
Wow, the downtime/traffic loss is greatly improved. We're going to get on either end with JPerf again, but ICMP is telling me that this is now much, much better. Can't believe that didn't cross my mind. Makes perfect sense and your pointer is very much appreciated, bro. This is a customer with a particularly sensitive set of IP phones and PBX, and this might actually do the trick. The funny thing is that I was explaining to management that the keepalive/hold timer for their PBX has to be really, really low....and yet I didn't think to change said parameters on our side. lol
Thanks again. I'll let you know the end result with the customer's voice traffic.
02-21-2012 02:41 PM
Labbed up an IP phone to a Mitel 5000 to mimic the exact same traffic and it performed well. It takes roughly 2 seconds for traffic to come back, and the IP phone never lost sync. Looks like this worked like a charm.
02-21-2012 02:43 PM
The drawback is that there very well could be more bounces due to the lower keepalive, but in this particular situation it's warranted. Much appreciated......
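If the extra bounces ever did become a problem, one possible middle ground would be to keep the 1-second period but tolerate more than one missed keepalive. This is an untested assumption, not something that was tried on this circuit, and the value 3 is only illustrative. It would trade a somewhat longer detection time (roughly period times retries) for fewer false link-downs.

```
interface Serial0/0/0:1
 ! hypothetical alternative: 1-second keepalives, link down after 3 misses
 keepalive 1 3
```

Whatever values are chosen would need to be applied to every member link on both ends.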
02-22-2012 12:31 AM
Glad to see that it helped, Vince.
Cheers
Marco
* Remember to rate useful posts *
02-22-2012 10:12 AM
Post rated.