Users are getting a white page over a T1 link

Jul 23rd, 2008

I have users who get a white page when connecting to an application over a T1 link. They are blaming the network, but I did not see anything wrong with the network. The way the application works is that users connect through an HTTPS URL to a server, that server redirects traffic to another database server, and from there users run their queries. I put a sniffer on the network and saw a lot of RST flags and also a lot of TCP fragment of reassembled units. The application manager is still blaming the network. I told him this is an application issue. Has anyone in this forum had this kind of issue before? If so, I would appreciate your help in solving it. Thanks.

tdrais Wed, 07/23/2008 - 09:34

Just tell them you agree it's a network problem, but it's a network problem in the server's TCP stack... lol

For a true network problem you will see either lots of packet retransmissions caused by packet loss, or packets that are sent over and over with no response.
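If you can export the capture to a simple list, a rough way to put a number on "lots of retransmissions" is to count data segments that show up more than once with the same sequence number. This is only a sketch: the `(flow_id, seq, payload_len)` tuple layout and the function name are my own assumptions, not something a sniffer produces as-is, and it ignores partial overlaps, keep-alives, and sequence number wraparound.

```python
def retransmission_rate(segments):
    """Return the fraction of data segments that repeat an earlier one.

    segments: iterable of (flow_id, seq, payload_len) tuples, one per
    captured TCP segment, in capture order. flow_id is any hashable label
    for one direction of one connection (a hypothetical layout).
    """
    seen = set()
    total = 0
    repeated = 0
    for flow_id, seq, payload_len in segments:
        if payload_len == 0:
            # Pure ACKs carry no data and are not counted.
            continue
        total += 1
        key = (flow_id, seq, payload_len)
        if key in seen:
            repeated += 1
        else:
            seen.add(key)
    return repeated / total if total else 0.0
```

A rate near zero on the T1 side of the capture would support the argument that the link itself is not dropping packets.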

Fragmentation is not a good thing, but in general it does not cause a big issue since servers can reassemble the fragments. If it is your T1 causing the fragmentation, you can try increasing the MTU on the serial port and see if it has any effect.

The RST is most likely the problem, and as long as it is not a router sending it you can claim it is not your problem. That never seems to work for me, so you will end up having to help the app guys find out why the RST is being sent.

If it happens during the TCP handshake, it is most likely a firewall rule on the server or simply nothing listening on the requested port. If it happens after the session is past the handshake but still early in the session, it is most likely some issue in the authentication at the application level. If it happens later on, then you just need to be sure that there is no packet loss causing the reset. If all the packets are flowing back and forth cleanly and all of a sudden you get a reset, then the machine sending the reset is the only one that knows why it was sent, and the reason is seldom in the data stream.
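The three cases above can be sketched as a small classifier over one connection's packets. Everything here is illustrative: the `Packet` record, the phase names, and the cutoff of 4 data packets for "early in the session" are my own assumptions, so tune them to what your application's login exchange actually looks like.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: str              # sender label, e.g. "client" or "server"
    flags: frozenset      # TCP flags set on the packet, e.g. {"SYN", "ACK"}
    payload_len: int = 0  # bytes of TCP payload


def classify_reset(packets, early_data_packets=4):
    """Return the phase in which the first RST appears, or None if no RST.

    packets: one connection's packets in capture order.
    early_data_packets: assumed cutoff separating "early" from "later".
    """
    saw_syn = saw_synack = handshake_done = False
    data_packets = 0
    for p in packets:
        flags = set(p.flags)
        if "RST" in flags:
            if not handshake_done:
                return "handshake"      # firewall rule or closed port
            if data_packets <= early_data_packets:
                return "early-session"  # look at application authentication
            return "mid-session"        # check for packet loss first
        if flags == {"SYN"}:
            saw_syn = True
        elif flags == {"SYN", "ACK"} and saw_syn:
            saw_synack = True
        elif "ACK" in flags and saw_synack:
            handshake_done = True
        if handshake_done and p.payload_len > 0:
            data_packets += 1
    return None
```

Whatever phase comes out, also note which side sent the RST (the `src` of that packet), since that is the machine to go ask.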

jeanaguemon Wed, 07/23/2008 - 09:44

Here is the analysis from some experts who decoded the sniffer files:

There are some apparent connectivity issues between the web server and the app server. We saw many instances in the app server capture of the web server sending a SYN packet to initiate the connection, but the app server responding with a RST-ACK, thus denying the connection. Each instance would obviously force the web server to back off then retry to connect. It was not consistent; there appeared to be about as many successful connections as reset connections. With it happening very often, these could potentially slow down the communication between the web and app servers enough that slowness might appear to the users, especially at high traffic periods. These resets could be the result of either mismatched but overlapping port range settings within the configurations of the web server and the app server, or perhaps interference from iptables or the local firewall based on incorrect allowed port ranges.
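One way to reproduce this without a sniffer is to open the same TCP connection many times from the web server and count how often it is refused. A refused connect is what the client sees when its SYN is answered with a RST. This is a minimal sketch using only the Python standard library; the function name, the attempt count, and the host and port in the usage comment are placeholders, not values from the capture.

```python
import socket
from collections import Counter


def probe(host, port, attempts=20, timeout=2.0):
    """Try to open a TCP connection `attempts` times and tally the outcomes."""
    results = Counter()
    for _ in range(attempts):
        try:
            # The connection is closed again as soon as it is established.
            with socket.create_connection((host, port), timeout=timeout):
                results["connected"] += 1
        except ConnectionRefusedError:
            # The peer answered the SYN with a RST.
            results["refused"] += 1
        except socket.timeout:
            # No answer at all within `timeout` seconds.
            results["timeout"] += 1
        except OSError:
            # Anything else, e.g. no route to host.
            results["other_error"] += 1
    return results


# Example (hypothetical host and port):
# print(probe("appserver.example.com", 8080, attempts=100))
```

If the tally shows a mix of connected and refused from the web server itself, with no WAN link in the path, that points at the app server or its local firewall rather than the T1.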

Please let me know what you think, thanks.

tdrais Wed, 07/23/2008 - 10:02

It was a joke at first, but it does appear to be related to the TCP stack on your app server.

With it being random and on the initial SYN packet, this is tough. When it breaks all the time you can blame misconfigured firewalls or routing, but when it's random ....

The only time I have ever seen something like this was when we were pushing dynamic rules to a firewall and the traffic attempted to pass before the rule was fully installed.

My knowledge of how the stack works on a machine is very limited and on my list of things to learn.

Hopefully someone who started on the server side and has now moved to networking is reading this and can provide more insight.

jeanaguemon Wed, 07/23/2008 - 11:11

The application manager is not convinced yet. Now the next step is to plug a laptop into the WAN router and connect to the application and see if I'll get the white page. If I don't then I'll rule out the network.

Thanks for your help. I'll let you know how it goes.

jeanaguemon Mon, 07/28/2008 - 05:58

I ran some tests by hooking up a laptop first to the router, then to the firewall, then to the Layer 2 switch, and I still got the white page in all three tests.

Finally, the manager acknowledged it is a server issue and ruled out the network.

Thanks so much for your input.
