02-20-2010 05:21 AM - edited 03-06-2019 09:48 AM
Hi,
The server team has informed us that one server rebooted because of a lost heartbeat, and they (the server team) believe a network problem is the cause.
My design is like this.
One core switch (6509) and 9 access switches (3750) connected over 10G fibre.
Contexts for Application and DB are created on the FWSM in the 6509.
We have been told about this problem over the last few days. Each time, their server rebooted automatically because the heartbeat was lost. Please find the log from the server below.
153 6.006320 10.128.6.21 10.128.5.35 TCP [TCP Previous segment lost] sdsc-lm > 58953 [PSH, ACK] Seq=1573 Ack=1106 Win=32768 Len=22
Note: 10.128.6.21 = DB Server IP
10.128.5.35 = Application Server IP
Please suggest what I should do to fix this problem.
02-20-2010 05:34 AM
goutam
Can you post the output of sh log from the 6500 switch? This could be a server application issue and have nothing to do with the switches.
HTH
Reza
02-20-2010 05:42 AM
02-21-2010 07:22 AM
Hi,
Did you check the log? Could anyone help with this?
02-21-2010 07:46 AM
So the DB server is behind the FWSM, and the app server has to go through the FWSM to reach the DB server?
If so, I have seen an issue with this setup. It was to do with the timeout for TCP connections on the FWSM. The FWSM would time out the TCP connection between the app server and the DB server, but the servers were not aware of it, so the connection stopped working. The solution was to increase the TCP timeout for that connection.
With FWSM v2.x code you could only increase the TCP timeout globally, i.e. for all connections, which was not ideal. From v3.x code onwards you can increase the timeout for specific TCP connections; see this link for an example -
This is not necessarily the issue you are having, but it might be worth a try.
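If raising the firewall timeout does not help, a common complementary approach (an assumption on my part, not something confirmed for this setup) is to stop the connection from ever looking idle by enabling TCP keepalive on the application's socket. A minimal Python sketch follows; the function name and the timer values are hypothetical, and the per-socket tuning options are platform-specific, so they are applied only where the OS exposes them:

```python
import socket


def enable_keepalive(sock, idle_s=600, interval_s=60, probes=5):
    """Enable TCP keepalive so an otherwise idle connection still sends probes.

    idle_s, interval_s and probes are illustrative values only; pick an idle
    time comfortably below the firewall's TCP idle timeout.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # These tuning options are not available on every platform, so each one
    # is set only if this Python build exposes it and the OS accepts it.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # seconds of idle time before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass  # option known to Python but rejected by this OS
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

Whether the heartbeat application lets you set socket options at all depends on the product, so treat this as an illustration of the idea rather than a fix for this case.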
Jon
02-21-2010 08:09 AM
Hi,
We have seen this type of problem before (TCP connection session timeout).
So we have already increased the timeout to 7 days, but we are still facing the problem. Is there any other solution?
02-21-2010 10:01 PM
You need to sniff the traffic and capture it on both sides: the application server and the DB server.
Then you need to analyze the captured packets and see whether there really are some lost segments.
The firewall log is also very useful in these cases. Can you see any messages concerning this problem in the firewall logs?
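As a rough illustration of that analysis, here is a small Python sketch. It assumes you have already exported the sequence number and payload length of each segment for one direction of the connection from both captures; the function names and the sample values (apart from Seq=1573 Len=22 from the posted log line) are hypothetical. It ignores sequence-number wraparound and the extra sequence number consumed by SYN/FIN.

```python
def find_sequence_gaps(segments):
    """Return (expected_seq, actual_seq) for every forward jump in sequence numbers.

    segments: list of (seq, length) tuples for ONE direction of one TCP
    connection, in capture order. A segment that starts beyond the next
    expected sequence number is what a capture tool would typically flag
    as "previous segment lost".
    """
    gaps = []
    next_expected = None
    for seq, length in segments:
        if next_expected is not None and seq > next_expected:
            gaps.append((next_expected, seq))
        end = seq + length
        # Retransmissions and out-of-order segments must not move the marker back.
        if next_expected is None or end > next_expected:
            next_expected = end
    return gaps


def missing_at_receiver(sent, received):
    """Return segments seen in the sender-side capture but not in the receiver-side one."""
    seen = set(received)
    return [segment for segment in sent if segment not in seen]


if __name__ == "__main__":
    # Hypothetical sender-side capture on the DB server.
    sent = [(1500, 22), (1522, 29), (1551, 22), (1573, 22)]
    # Hypothetical receiver-side capture on the application server.
    received = [(1500, 22), (1522, 29), (1573, 22)]

    print(find_sequence_gaps(received))
    print(missing_at_receiver(sent, received))
```

If a segment shows up in the sender's capture but not in the receiver's, it was dropped somewhere in between. If the gap is already present in the sender's own capture, the problem is more likely on the server itself.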
02-21-2010 11:02 PM
Hi,
Gautam, how often do the App server and the DB server get disconnected? You have only the firewall between the two servers, which exchange heartbeat messages. Is there a specific port on which the heartbeat communicates? The log from the server shows a PSH, ACK segment, which means a TCP connection was already established and data was being pushed over the existing connection.
The best way to troubleshoot is to check which devices sit between the servers and check the timeout for the port on which the heartbeat messages are exchanged. If possible, also check the TCP buffers on both servers to see whether there is any TCP-related issue at either end. Finally, capture a sniffer trace between the servers on the heartbeat port and check what the behaviour is when the connection gets disconnected.
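To see how the heartbeat behaves around a disconnect, one simple check is to take the arrival times of the heartbeat packets from the sniffer trace and look for intervals longer than expected. A minimal Python sketch, where the function name, the timestamps and the threshold are all hypothetical:

```python
def heartbeat_gaps(timestamps, max_interval):
    """Return (previous_time, current_time, interval) for each overlong interval.

    timestamps: heartbeat packet arrival times in seconds, in ascending order.
    max_interval: longest interval, in seconds, that is still considered healthy.
    """
    gaps = []
    for previous, current in zip(timestamps, timestamps[1:]):
        interval = current - previous
        if interval > max_interval:
            gaps.append((previous, current, interval))
    return gaps


if __name__ == "__main__":
    # Hypothetical arrival times: one heartbeat per second, then a 5 second pause.
    arrivals = [0, 1, 2, 7, 8]
    for previous, current, interval in heartbeat_gaps(arrivals, max_interval=2):
        print(f"no heartbeat between t={previous}s and t={current}s ({interval}s)")
```

Comparing the time of each gap with the firewall and switch logs should show whether anything on the network happened at that moment.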
Hope to help !!
Ganesh.H