Cisco 888 modems in back to back configuration - CPE dropping packets
We have 4 sets of 888 DSL modems running in back-to-back configurations. Recently, the CPE side on three sets started dropping packets and we can't figure out why. One set is still running fine, which makes no sense since they all had the same config and the working set goes into the same switch as one of the non-working sets. The CO sides are fine, but the CPE sides are dropping about 30%. Neither I nor any of my co-workers are aware of any network changes, as we would be the ones to make them.

When I log in to a CPE and do a 'sh int', I can see that ATM0 is dropping packets from an unknown protocol. I'm not sure if that is part of the problem or not. I have pasted the config below. Any ideas and/or suggestions would be greatly appreciated.

BTW, the standard subnet/VLAN here is .47 and the secondary is .43. The problem seems to go away if I change the IP addresses of the modems to .47 and swap the switch port to VLAN 47, but we really need them on 43. The odd thing is that once you first config a new set of these, they work fine for a little while (several minutes) and then the CPE side starts dropping. Not only does it drop packets, but the response time goes from 7-8ms to 95ms. I'm not sure if there is a routing issue or what.
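In case it helps, here's roughly how I'm checking things (all standard IOS show commands; nothing here is specific to my setup):

```
show interfaces ATM0     ! txload/rxload and the "unknown protocol drops" counter
show controllers dsl 0   ! trained line rate, noise margin, CRC/errored seconds
show bridge 1            ! verify MAC addresses are being learned across bridge-group 1
```

The unknown-protocol drops show up on ATM0, and the ping times to the CPE's BVI1 address are what jump from 7-8ms to ~95ms once the problem kicks in.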
version 15.2
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
service password-encryption
!
hostname SMK_UG_DSL6_BrCPE
!
boot-start-marker
boot-end-marker
!
no aaa new-model
memory-size iomem 10
!
ip cef
no ipv6 cef
!
multilink bundle-name authenticated
license udi pid CISCO888-K9 sn FTXxxxxxxxx
!
username xxxxxxx privilege 15 password 7 xxxxxxxxxxxxxxxxxx
!
controller DSL 0
 mode atm
 ignore-error-duration 30
 line-rate 1024
!
bridge irb
!
interface BRI0
 no ip address
 encapsulation hdlc
 shutdown
 isdn termination multidrop
!
interface ATM0
 no ip address
 no ip route-cache
 no atm ilmi-keepalive
!
interface ATM0.1 point-to-point
 no ip route-cache
 bridge-group 1
 pvc 0/35
!
interface FastEthernet0
 no ip address
!
interface FastEthernet1
 no ip address
!
interface FastEthernet2
 no ip address
!
interface FastEthernet3
 no ip address
!
interface Vlan1
 no ip address
 bridge-group 1
!
interface BVI1
 ip address 10.110.43.193 255.255.255.0
!
ip forward-protocol nd
no ip http server
no ip http secure-server
!
ip route 10.0.0.0 255.0.0.0 10.110.43.254
ip route 192.168.0.0 255.255.0.0 10.110.43.254
!
control-plane
!
bridge 1 protocol ieee
bridge 1 route ip
!
line con 0
 no modem enable
line aux 0
line vty 0 4
 login
 transport input all
!
scheduler max-task-time 5000
!
end
Hmmmm ... This suggests the ATM0 interface is downloading something at full bandwidth; a load of 255/255 means 100%. Look at the "packets input" vs "packets output" counters: there's a significant ratio of data going from the ISP to your router.
I'd run a netflow report so you'll be able to determine what clients are actually pulling data from the WAN.
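A minimal sketch of what that could look like on one of these routers, assuming its IOS feature set supports NetFlow; the collector address/port and the interface to monitor are assumptions you'd adjust to your network:

```
ip flow-top-talkers
 top 10
 sort-by bytes
!
interface Vlan1
 ip flow ingress
!
! optional: export to an external collector (address and port are examples)
ip flow-export version 5
ip flow-export destination 10.110.43.10 9996
```

After letting it run for a few minutes, "show ip flow top-talkers" (or "show ip cache flow") should show who is pulling the data and to/from where.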
Let me shed a little light on the network topology. The data circuits are provided by Frontier and the ISP is AT&T. Here is a diagram of the layout.
There is nothing connected to the DSL CPE side (we were trying the process of elimination). The cable between the CO and CPE is about a foot long right now. Of course the network in its entirety is way more complex than the diagram above, but we moved the modems directly to the main switch without anything behind them and the problem still exists. Note - in case I wasn't clear, these modems are just used as basic Ethernet extenders to provide a network connection to an area that doesn't have fiber.
Just check your interface. Your download pipe shouldn't be at 100% all the time. Check the interface status regularly; if the result is 255/255 for either "Tx" or "Rx", then something is saturating your bandwidth, and that would explain your packet loss.
If this is the case, then the first step could be to enable netflow so you'll be able to determine who your "top talkers" are and what their destination address(es) is/are.
A cruder method is to disconnect the LAN cables one by one until the ping times drop.
I finally figured out the problem. It was the "line-rate" command that caused the issue. Previously, I had set it to 1024 when we were dealing with some dirty/noisy line conditions and all was good. I have no idea what changed, but now leaving it at 1024 triggers the problem. Setting "line-rate auto" makes it worse: the packet loss still occurs, but the ping times jump to around 179ms.

Now here's the weird part... If you remove the line-rate command from the config entirely and let it auto-detect, it will select a rate of 2304 (which is the highest rate for a 2-wire connection). Strange that 2304 works fine but 1024 (which is slower, so you'd think it would be more stable) doesn't. What's even stranger is that the default should be "line-rate auto", yet that makes it even worse; you have to actually remove the command completely for it to work correctly.

Now as to why the problem just started out of the blue... We had one set that never developed the problem (it's running IOS 15.0). We had two sets that did develop the problem (they are running IOS 15.2). Given this, I thought the "line-rate" issue might be a bug in 15.2. The other weird part is that a third set developed the issue as well. It is running 15.0 (I checked this early on to potentially eliminate an IOS issue). However, after finally figuring out that line-rate was the problem, I went back to check the version on that last set. The CO was indeed 15.0 as I thought, but the CPE is deep underground (disconnected now), so I will have to wait until a later date to see which version it is. Since the devices were mixed sets, the CPE side could actually be running 15.2, which would explain why it had the problem too. Either way, I know "line-rate" is the cause. My best guess is that 15.2 introduced a bug with it.
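For anyone who hits this later, the fix boiled down to removing the command under the DSL controller on both ends (note this is removing it entirely, not setting "line-rate auto", which made things worse for us):

```
configure terminal
controller DSL 0
 no line-rate 1024
end
write memory
```

After this, "show controllers dsl 0" should show the line training at 2304 kbps on a 2-wire pair.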