I have a 7204 VXR router. FA0/0 is the internal network, FA0/1 is the WAN backup to DSL, s1 goes to the Frame Relay ISP, and s2 is load balanced over two T1s to co-location servers. Whenever we have a lot of traffic to the co-lo site (either running an app or downloading a large file using Windows XP to a 2000/2003 server), I begin to see a lot of frame errors. If I plug in a Fluke, I see 20-30% jabber errors (spread evenly among FA0/0 and the local machines trying to download data). I have taken Ethereal traces and have found a lot of the following errors:
nbss - tcp previous segment lost
tcp - tcp dup ack
I seem to be real slow over the load balanced t1's but my internet connection over frame relay seems to be okay.
I have attached my router config. In addition to the slowness, I was wondering about a couple of things.
1. Does per-packet load balancing work well in Windows environments where there is a lot of SMB-type traffic and file copies and pastes?
2. I remember seeing somewhere that on Frame Relay, when using subinterfaces, you should not have an IP address on the physical interface. We do have an IP address on the physical serial1 interface. Does this cause a problem?
Router was setup by a consultant and I just started working here. Any help you can provide will be greatly appreciated.
At first glance, I would turn off 'ip load-sharing per-packet' on your T1 interfaces and use per-destination load sharing instead. Just configuring 'no ip load-sharing per-packet' on the interfaces will enable per-destination load sharing (which is the default when CEF is turned on):
ip address 192.168.250.2 255.255.255.0
--> no ip load-sharing per-packet
ip address 192.168.251.2 255.255.255.0
--> no ip load-sharing per-packet
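To illustrate why per-packet load sharing can produce the "previous segment lost" and "dup ack" symptoms seen in the traces, here is a minimal Python sketch. It is a toy model, not how CEF is implemented: the destination address, the link delays, and the CRC-based hash are all assumptions made up for the example. It only shows that alternating packets of one flow over two links with different delays reorders them, while pinning a destination to one link does not.

```python
import zlib

def per_packet(packets, n_links=2):
    """Round-robin: packet i goes to link i % n_links, regardless of flow."""
    return [(i % n_links, p) for i, p in enumerate(packets)]

def per_destination(packets, n_links=2):
    """Hash the destination so every packet to it uses the same link.
    (Simplified stand-in for a real per-destination hash.)"""
    return [(zlib.crc32(p["dst"].encode()) % n_links, p) for p in packets]

def arrival_order(assignments, link_delay_ms, gap_ms=1.0):
    """Return sequence numbers in arrival order.
    Packet i is sent at i * gap_ms and arrives after its link's delay."""
    arrivals = []
    for i, (link, p) in enumerate(assignments):
        arrivals.append((i * gap_ms + link_delay_ms[link], p["seq"]))
    return [seq for _, seq in sorted(arrivals)]

# Hypothetical flow: six segments to one co-location server.
packets = [{"dst": "192.168.100.10", "seq": s} for s in range(6)]
delays = [5.0, 9.0]  # assumed one-way delay of each T1, in milliseconds

print(arrival_order(per_packet(packets), delays))       # [0, 2, 4, 1, 3, 5]
print(arrival_order(per_destination(packets), delays))  # [0, 1, 2, 3, 4, 5]
```

A TCP receiver that gets segment 2 before segment 1 answers with a duplicate ACK, and a capture taken at that point shows a gap in the sequence numbers, which matches what a sniffer would flag even when nothing was actually dropped.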
Another thing you could check is if you could set FastEthernet0/0 to duplex full instead of auto, but that would depend on which device is connected to FastEthernet0/0. In general, fixed settings are recommended for networking devices (that is, everything except for workstations).
Since you are running OSPF in area 1, do you know where area 0, the backbone area, is located?
Thanks for the reply. FA0/0 is connected to a switch so I can set that to full duplex. We are running RIP to advertise our internal subnets and I believe we are using OSPF to advertise routes to our ISP for internet connectivity. I would guess that Area 0 is in their network.
On the load sharing, does it matter if the far side router (which is controlled by the co-location vendor) is set to per-packet or per-destination? Do you think the frame errors are due specifically to the packets being sent out two different interfaces? I did check and saw that the interfaces seemed to be load balancing properly.
Last question: on the Serial 1 interface, do you suggest (from what I've read) that the physical interface not have an IP address assigned to it? What are the ramifications of having an IP address assigned to the physical interface?
Thanks for your help.
Check your T1s with your telecom provider. Frame errors can be a sign of a physical problem in the link.
Load sharing is probably not the cause of the problem, but in your conditions it can make life more difficult (especially when the error rate is different on each T1).
In addition to what Mikhail said, you also might want to adjust the MSS size on your FastEthernet0/0 interface:
ip tcp adjust-mss 1350
Regarding the IP address on the serial interface: it should not make a difference. You might want to disable Inverse ARP on the interface though, in order to avoid having unnecessary DLCIs mapped to the interface:
no frame-relay inverse-arp
And with regard to the load sharing algorithm used on both sides, that should be the same on both sides indeed, so you might want to ask your provider to enable per-destination on his side as well...
Thanks for the reply. I did a bit of reading on the ip tcp adjust-mss command and had a couple of questions. It says that this is a problem with PPP over Ethernet. I don't think I'm running PPP on the interfaces that have the problem. They are defaulted to HDLC. Am I reading this correctly in that this isn't a problem on HDLC interfaces? If I do make this change, does the far end router also have to make the same change?
If the mss size is an issue, could this be why my fluke reports a large amount of jabber errors (which I read to be packets that are larger than MTU)?
Again, thanks for your help.
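As a rough check on the MSS and jabber question, the arithmetic can be sketched in Python. The header sizes below assume IPv4 and TCP without options and untagged Ethernet; those are assumptions of the example, not values read from the router. Jabber is taken here in the usual sense of a frame longer than the 1518-byte maximum that also has a bad checksum.

```python
IP_HEADER = 20        # IPv4 header, no options (assumed)
TCP_HEADER = 20       # TCP header, no options (assumed)
ETH_OVERHEAD = 18     # 14-byte Ethernet header + 4-byte FCS, no 802.1Q tag
MAX_ETH_FRAME = 1518  # largest legal untagged Ethernet frame

def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one IP packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER

def frame_size(mss):
    """On-the-wire Ethernet frame size for a full-sized TCP segment."""
    return mss + TCP_HEADER + IP_HEADER + ETH_OVERHEAD

print(mss_for_mtu(1500))                  # 1460
print(frame_size(1460))                   # 1518
print(frame_size(1350))                   # 1408
print(frame_size(1460) <= MAX_ETH_FRAME)  # True
```

Under these assumptions a full-sized segment at the default MTU of 1500 is exactly 1518 bytes on the wire, which is still a legal frame, so the MSS by itself would not be expected to produce oversized frames on the Ethernet side. Setting the MSS to 1350 only leaves extra headroom for encapsulation overhead further along the path.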
It could be a problem with one of the network cards on your servers (or maybe even a switch port).
Again, it sounds more like a physical problem. I would recommend that you try to isolate the source:
- try to check where jabber frames are coming from
- try to connect router to another switch port
- try to set the port speed/duplex parameters statically and identically on both sides (router <-> switch, server <-> switch, etc.)
Thanks for the response. I'll try a different switch port for the router and statically enter port speed duplex and see if it helps.
As for the jabber frames: on the Fluke, it starts off with the router reporting the most. As more traffic hits the router, it becomes more evenly dispersed between the router and the workstations hitting the router the most to get to our co-location site. It's been steady at a little over 10% of all frames being listed as jabber frames.
Right now it's evenly distributed over the router, a 2003 DC I have in my local office, and another couple of machines in the local office. Do you think this could be due to dropped frames because of the per-packet load balancing to the co-location site, as has been mentioned before?
Thanks for your input.
For now I cannot see a link between packets dropped on the serial links and jabber errors in your Ethernet segments. Jabber errors usually point to a physical problem with network equipment.
It would be good if you attached the output of the "show interface" command from the router.
Also try changing the patch cord between the router and the switch and see if it makes any difference.
I don't know what you have for an SLA, but if you can get a long enough service window, I would recommend that you swap the two LAN interfaces on the router (as far as I understand, you don't have such problems on the other LAN interface).
But the most important thing is to check ports speed/duplex parameters.
Thanks for the response. Here is the current sho int on FA in question.
HTBYB#sho int fa0/1
FastEthernet0/1 is up, line protocol is up
Hardware is i82543 (Livengood), address is 0009.44a2.a801 (bia 0009.44a2.a801)
Internet address is 10.2.0.50/30
MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 100Mb/s, 100BaseTX/FX
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:07, output 00:00:05, output hang never
Last clearing of "show interface" counters 1w1d
Queueing strategy: fifo
Output queue 0/40, 0 drops; input queue 0/75, 2 drops
5 minute input rate 87000 bits/sec, 15 packets/sec
5 minute output rate 0 bits/sec, 0 packets/sec
13642164 packets input, 4107136019 bytes
Received 0 broadcasts, 0 runts, 0 giants, 3 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 input packets with dribble condition detected
176250 packets output, 17118507 bytes, 0 underruns(0/0/0)
0 output errors, 0 collisions, 4294967295 interface resets
0 babbles, 0 late collision, 0 deferred
1 lost carrier, 0 no carrier
0 output buffer failures, 0 output buffers swapped out
What are interface resets? I had been monitoring earlier but didn't see very many errors.
I did try changing the patch cord and also had an extra fast ethernet interface that I swapped in. No luck on either.
The problem occurs when sending or receiving over the two T1s (load balanced per packet) to the co-location site.
I have not tried changing port speed/duplex but will try over the next few days.
Thanks again for your help.
Is it on the FA0/1 segment that you have a lot of jabber errors? I thought it was FA0/0...
According to IOS documentation interface resets are:
"Number of times an interface has been completely reset. This can happen if packets queued for transmission were not sent within several seconds. On a serial line, this can be caused by a malfunctioning modem that is not supplying the transmit clock signal, or by a cable problem. If the system notices that the carrier detect line of a serial interface is up, but the line protocol is down, it periodically resets the interface in an effort to restart it. Interface resets can also occur when an interface is looped back or shut down."
I would treat such a large number of resets as a physical problem in the link. Another thing that looks strange is that you have throttles. This means that the router sometimes simply cannot keep up with incoming packets. Is the CPU load too high?
I would recommend that you set up a monitoring computer (for example with MRTG) and check interface load, CPU load, memory utilization, and the number of errors on all interfaces.
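As a small illustration of the kind of counter tracking suggested above, here is a Python sketch that pulls the numeric counters out of "show interface" text. It is not how MRTG works (MRTG polls over SNMP). The sample lines are copied from the output posted earlier, and the function names are made up for the example. The sketch also flags any counter sitting at exactly 4294967295, which is 2^32 - 1, the largest value a 32-bit unsigned counter can hold. Treating that as a wrapped or stuck counter rather than a literal count of resets is an assumption, not something confirmed in this thread.

```python
import re

UINT32_MAX = 2**32 - 1  # 4294967295

# Lines copied from the "sho int fa0/1" output posted earlier in the thread.
SAMPLE = """\
Received 0 broadcasts, 0 runts, 0 giants, 3 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 output errors, 0 collisions, 4294967295 interface resets
0 babbles, 0 late collision, 0 deferred
"""

def parse_counters(text):
    """Map counter name -> integer for every '<number> <name>' pair."""
    counters = {}
    for line in text.splitlines():
        for part in line.split(","):
            m = re.match(r"\s*(?:Received\s+)?(\d+)\s+(.+?)\s*$", part)
            if m:
                counters[m.group(2)] = int(m.group(1))
    return counters

def suspicious(counters):
    """Names of counters sitting exactly at the 32-bit unsigned maximum."""
    return [name for name, value in counters.items() if value == UINT32_MAX]

counters = parse_counters(SAMPLE)
print(counters["throttles"])  # 3
print(suspicious(counters))   # ['interface resets']
```

Running the parser against output captured at regular intervals and comparing successive values would show whether the error counters are actually increasing during the large transfers.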