[e2e] tcp connection timeout

Mon Mar 6 11:08:09 PST 2006

On Sat, Mar 04, 2006 at 06:13:02PM -0500, Saikat Guha wrote:
> On Sat, 2006-03-04 at 11:37 -0500, Ethan Blanton wrote:
> > there's no reason HTTP *couldn't* be implemented at the transport
> > layer, or SMTP, or ssh, so clearly keepalives (or whatever feature)
> > could be done at the transport layer -- it just wouldn't be a good
> > idea to do so.
> 
> What makes it a "good idea" to implement something at the transport
> level or application level?

In the case of keepalives, the issue is that transport-level keepalive
packets are rarely what an application needs; one usually needs
application-level keepalives.  Sometimes you can make a thin app layer
that's just TCP, but let's ignore that for the moment.

It's rarely valuable to know that the transport connection is up.  What
you want to know is if the other end of the application (web server,
TCP-NFS server, realplayer streamer) is still talking to you.  A
deadlocked web server still ACKs TCP segments.  A working
TCP connection is necessary but not sufficient, so you need an
application keepalive.  End-to-end 101.
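
To make the distinction concrete, here is a minimal Python sketch of an
application-level keepalive.  It is not from any real protocol: the
PING/PONG wire format and the names app_ping, responsive_peer and demo
are all made up for illustration, and a local socketpair stands in for
the TCP connection.  The point is only that the probe succeeds when the
peer *application* answers, not merely when the connection is up.

```python
import socket
import threading

PING = b"PING\n"   # hypothetical wire format, chosen for this sketch
PONG = b"PONG\n"

def app_ping(sock, timeout=1.0):
    """Return True only if the peer application answers PING with PONG in time."""
    sock.settimeout(timeout)
    try:
        sock.sendall(PING)
        reply = b""
        while not reply.endswith(b"\n"):
            chunk = sock.recv(16)
            if not chunk:        # orderly close by the peer
                return False
            reply += chunk
        return reply == PONG
    except OSError:              # includes the timeout raised by recv()
        return False

def responsive_peer(sock):
    """Stand-in for a healthy server: read one request and answer it."""
    if sock.recv(16) == PING:
        sock.sendall(PONG)

def demo(peer_alive):
    """Probe a connected peer that either answers or never reads its socket."""
    ours, theirs = socket.socketpair()
    worker = None
    if peer_alive:
        worker = threading.Thread(target=responsive_peer, args=(theirs,))
        worker.start()
    try:
        return app_ping(ours, timeout=0.5)
    finally:
        if worker is not None:
            worker.join()
        ours.close()
        theirs.close()
```

Calling demo(True) reports a live peer.  Calling demo(False) models the
deadlocked server: the connection stays open and the send succeeds, but
no reply arrives, so the probe reports failure.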

As with other implementations at lower layers, everyone believes that you
can implement the keepalive function lower.  The questions you mentioned
are close to the right ones, but I assert that the right ones include:

	How much does it help?
	How much does it hurt?

I think the first answer is "not much".  If your app is properly
designed, a keepalive at the app layer tells you what you wanted to
know.  TCP timeouts will also tell you that the app connection has
failed, but the absence of a TCP timeout is not conclusive.  A false
positive (TCP checking the connection when the app doesn't care) is
actually harmful.  One advantage is that applications (mis)designed
without a keepalive can use TCP keepalives, but a locked-up application
remains undetectable.

How much it hurts is pretty much the design issue.  There's extra
configuration work to be done for apps to set TCP keepalive parameters
(and if you can't at least turn them off, they're in the way of
applications with long idle connections) and for putting the code into
the stack.  The code is a wash - setting params is going to happen; the
timeout code can be in the stack or a library - but you've put another
thing into TCP to interact with everything else that's in TCP.
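
For a sense of what that configuration work looks like, here is a rough
Python sketch of turning TCP keepalives on or off per socket.  The
helper names configure_keepalive and keepalive_enabled are made up for
illustration.  SO_KEEPALIVE is the standard socket option; the timer
options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are assumed here to
be the Linux names, and the sketch skips any that the local platform
does not expose.

```python
import socket

def configure_keepalive(sock, enable, idle=None, interval=None, count=None):
    """Turn TCP keepalives on or off; tune timers where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1 if enable else 0)
    if not enable:
        return
    # The per-socket timer options below are platform-specific (Linux names);
    # skip any that this platform's socket module does not expose.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None and value is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)

def keepalive_enabled(sock):
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

An application with long idle connections would call
configure_keepalive(sock, False); one that wants the transport-level
check would pass True plus whatever timer values suit it.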

IMHO, adding timeouts is a long run for a short slide.  Applications
that actually care have to add the keepalive functionality anyway.  The
vague assurance of a TCP keepalive is insufficient reason to add
complexity.

Generally, lower-level functionality that gets added in the face of an
end-to-end argument that it's unnecessary is a performance hack, ahem,
sorry, enhancement, but keepalives are a matter of correctness.  A check
at a lower level than the one where you need the assurance is not much
help.  Knowing the phone's connected while the person you're talking to
is out cold doesn't help enough.

Now, I'm obviously wrong, because the option exists, but there's an
argument against them and it has nothing to do with tradition or old OS
v. network designer disagreements.

-- 
Ted Faber
http://www.isi.edu/~faber           PGP: http://www.isi.edu/~faber/pubkeys.asc
Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG