[e2e] query on behaviour of tcp_keepalive and tcp retransmit on Linux based systems

Zama Ques queszama at yahoo.in
Tue Feb 22 01:24:31 PST 2011


We need some clarification on TCP keepalive. We are facing some issues on our production servers related to TCP behaviour.

The issue is as follows.

We have some machines at one end sending data in real time to another
group of machines at the other end. Due to hardware issues at the
other end, some of those machines become unresponsive or crash. The
client system that pumps the data never learns that the server has
become unresponsive. The connection remains in the ESTABLISHED state
and the client keeps trying to send data, thinking that the connection
is alive, because of which we are seeing a backlog on the client side.

Our understanding of how TCP will handle the connection is as follows.


Q 1) Since the server went down, the client will try to retransmit the
data until it times out. What is the behavior of TCP after the timeout?
We need clarification on the following:
a) Will the kernel close the established connection after the timeout?
It looks like it does not in our case, as we still see the connection
in the ESTABLISHED state after more than 2 hours.
b) Are there any kernel parameters that decide when the client times
out after retransmission fails? What is the behavior of TCP after the
client's retransmissions time out?
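
A note on Q 1 b), offered as a rough sketch rather than a definitive
answer: on Linux the number of retransmissions for an established
connection is governed by the net.ipv4.tcp_retries2 sysctl. Assuming
the usual default of 15 retries, a minimum RTO of 0.2 s and a maximum
RTO of 120 s (all assumptions about this system; the real RTO depends
on the measured round-trip time), the exponential back-off adds up as
shown below. The helper names are hypothetical.

```python
from pathlib import Path


def read_sysctl_int(name):
    """Read an integer sysctl such as 'net.ipv4.tcp_retries2' from /proc/sys.

    Returns None if the value cannot be read (e.g. not on Linux).
    """
    path = Path("/proc/sys") / name.replace(".", "/")
    try:
        return int(path.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def retransmit_timeout_s(retries=15, rto_min=0.2, rto_max=120.0):
    """Estimate the total time spent retransmitting before TCP gives up.

    Assumes the RTO starts at rto_min, doubles after every attempt and is
    capped at rto_max. The sum covers 'retries' retransmissions plus the
    final wait after the last one.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += min(rto, rto_max)
        rto *= 2
    return total


if __name__ == "__main__":
    retries = read_sysctl_int("net.ipv4.tcp_retries2")
    if retries is None:
        retries = 15  # assumed default when the sysctl is not readable
    print("tcp_retries2 =", retries)
    print("estimated give-up time: %.1f s" % retransmit_timeout_s(retries))
```

With those assumed defaults the estimate comes to roughly 15 minutes,
after which the kernel would be expected to report an error on the
socket to the sending application.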


Q 2) There is something called tcp_keepalive which, if implemented in
the kernel (by default it is there, and I think it comes to around
2 hrs 2 minutes), makes the client send some TCP probes after the
keepalive time interval, and if it cannot reach the server, the
established connection on the client side will be closed by the
kernel. This is my understanding. But I can see that the connection
still remains in the ESTABLISHED state after the tcp_keepalive time.
We waited for around 2 hrs 30 minutes, but the connection remained in
the ESTABLISHED state. We tried reducing the keepalive time to around
10 minutes, but the connection still remains in the ESTABLISHED state
on the client side.
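
One thing worth checking for Q 2, given as a minimal sketch under
stated assumptions: on Linux the tcp_keepalive_* sysctls only take
effect on sockets where the application has enabled SO_KEEPALIVE, so
changing the system-wide time alone does nothing for a socket that
never set the option. The example below enables keepalive on a single
socket and overrides the timers with the Linux-specific TCP_KEEPIDLE,
TCP_KEEPINTVL and TCP_KEEPCNT options. The function names and the
chosen values (10 minutes idle, 75 s interval, 9 probes) are
illustrative assumptions, not settings taken from the systems
described here.

```python
import socket


def enable_keepalive(sock, idle_s=600, interval_s=75, probes=9):
    """Enable TCP keepalive on one socket and set its timers (Linux only)."""
    # Without SO_KEEPALIVE no probes are sent at all for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Idle time in seconds before the first probe is sent.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    # Seconds between successive unanswered probes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    # Number of unanswered probes before the connection is dropped.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


def keepalive_detection_s(idle_s, interval_s, probes):
    """Time until an idle, dead peer is detected: idle time plus all probes."""
    return idle_s + interval_s * probes


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("detection time with these values:",
          keepalive_detection_s(600, 75, 9), "s")
    s.close()
```

If the system uses the common Linux defaults of 7200 s idle time, 75 s
between probes and 9 probes (an assumption), the same arithmetic gives
7200 + 75 * 9 = 7875 s, a little over 2 hrs 11 minutes, for an idle
connection with keepalive enabled.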


Where did I go wrong? Please clarify the doubts raised above. What
should we do to resolve the problem we are seeing? Any help will be
highly appreciated, as we are having a hard time resolving the issue.

Thanks in advance




More information about the end2end-interest mailing list