[e2e] query on behaviour of tcp_keepalive and tcp retransmit on Linux based systems
Detlef Bosau
detlef.bosau at web.de
Thu Feb 24 03:06:25 PST 2011
First of all, I'm not quite sure whether this is the right list for
Linux-specific issues.
Second, you should have a look at the basic function of TCP. When the
receiving host dies, the sender will repeatedly run into retransmission
timeouts, with the following consequences:
1. The congestion window, and with it the sending window, shrinks to
1 segment.
2. The RTO is doubled each time a sent packet is not acknowledged in
time. Whether the RTO is capped or not depends on the implementation.
3. The sending socket is shut down after some period of time. (Have a
look at the various timeouts in TCP.)
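
To make points 2 and 3 concrete, here is a minimal sketch of the backoff
arithmetic. The numbers are assumptions, not something stated above: a
minimum RTO of 0.2 s, a cap of 120 s and 15 retransmissions, which is what
I recall as the Linux defaults (the retry count being the tcp_retries2
sysctl). The function name backoff_schedule is made up for illustration.
A real RTO is derived from the measured RTT, so actual times will be longer.

```python
def backoff_schedule(initial_rto, max_rto, retries):
    """Return the wait before each timeout: one for the original
    transmission plus one per retransmission, doubling up to max_rto."""
    waits = []
    rto = initial_rto
    for _ in range(retries + 1):
        waits.append(rto)
        rto = min(rto * 2, max_rto)  # exponential backoff with a cap
    return waits

# Assumed Linux-like values; check them against your own kernel.
waits = backoff_schedule(initial_rto=0.2, max_rto=120.0, retries=15)
total = sum(waits)
print(f"{len(waits)} timeouts, about {total:.1f} s until the sender gives up")
```

Under these assumptions the sender gives up after roughly a quarter of an
hour, and only if there is unacknowledged data in flight. An idle
connection never triggers this timer at all.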
> We need some clarification on tcp_keepalive. We are facing some
> issues on our production servers related to TCP functionality.
>
> The issue is like this.
>
> We have some machines at one end sending data in real time to another
> group of machines at the other end. Due to some hardware issues
> at the other end, some of those machines become unresponsive or crash.
> The client system which pumps data never learns that the server
> went unresponsive. The connection remains in
> ESTABLISHED state and the client keeps trying to send data, thinking
> that the connection is alive, because of which we are seeing a backlog
> on the client side.
>
> Our understanding of how TCP will handle the connection is as follows.
>
>
> Q 1) Since the server went down, the client will try to
> retransmit the data until it times out. What is the behavior of TCP
> after the timeout? We need clarification on
> the following things.
> a) Will the kernel close the established connection after the
> timeout? It looks like it does not in our case, as we see the
> connection still in ESTABLISHED state after more
> than 2 hours.
> b) Are there any kernel parameters which decide when the client
> times out after retransmission fails? What is the behavior of TCP
> after the client's retransmission timeouts?
>
>
> Q 2) There is something called tcp_keepalive which, if implemented in
> the kernel (by default it is there and comes to around 2 hrs 2
> minutes, I think), makes the client send some TCP probes after the
> keepalive time interval, and if it cannot reach the server, then the
> established connection on the client side will be closed by the kernel.
> This is my understanding. But I can see that the connection still
> remains in ESTABLISHED state after the tcp_keepalive time. We waited for
> around 2 hrs 30 minutes but the connection remained in ESTABLISHED
> state. We tried reducing the keepalive time to around 10 minutes,
> but the connection remains in ESTABLISHED state on the client side.
>
>
> Where did I go wrong? Please clarify the doubts raised above. What
> should we do to resolve the problem we are seeing? Any help
> will be highly appreciated, as we are having a hard time
> resolving the issue.
>
> Thanks in Advance
>
>
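
Regarding Q 2, one thing worth checking: on Linux the tcp_keepalive_*
sysctls only apply to sockets on which the application itself has set
SO_KEEPALIVE, so lowering the system-wide keepalive time has no effect on a
socket that never asked for keepalives. Whether that is the cause in your
case I cannot tell from the description. The following is a minimal sketch
of the per-socket setup, not a recommendation of particular values. The
helper name enable_keepalive and all numbers are hypothetical, and
TCP_USER_TIMEOUT is assumed to be available, which needs a reasonably
recent kernel.

```python
import socket

def enable_keepalive(sock, idle_s=600, interval_s=60, probes=5,
                     user_timeout_ms=None):
    """Turn on keepalive probing for one socket (hypothetical helper)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket overrides of the system-wide defaults, where supported.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    # Optional bound on how long sent data may stay unacknowledged.
    if user_timeout_ms is not None and hasattr(socket, "TCP_USER_TIMEOUT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT,
                        user_timeout_ms)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, user_timeout_ms=30000)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
print("keepalive enabled:", bool(keepalive_on))
sock.close()
```

In a real client the helper would be called on the socket before or right
after connect(). An application-level heartbeat is the other common way to
detect a dead peer and does not depend on any of these options.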
--
------------------------------------------------------------------
Detlef Bosau
Galileistraße 30
70565 Stuttgart Tel.: +49 711 5208031
mobile: +49 172 6819937
skype: detlef.bosau
ICQ: 566129673
detlef.bosau at web.de http://www.detlef-bosau.de
------------------------------------------------------------------