[e2e] Early FIN/close for query-response TCP connections like DNS's

Stefanos Harhalakis v13 at v13.gr
Fri Oct 1 02:08:16 PDT 2010


On Friday 01 of October 2010, David P. Reed wrote:
>   TCP endpoint stacks are not part of IETF standards.  In particular
> "socket" calls and their meaning are not standardized.

Of course. I am not suggesting any change anywhere in the underlying 
stack, only a suggestion for implementations of DNS/TCP clients.

> I suspect that there might be a problem here in the actual *stacks*
> because to "half close" a connection, one issues a "close" to the
> socket.  After such a command, many operating systems may not expect to
> provide more data on the receive side.

In fact, one has to use shutdown() instead of close(). From what I understand, 
close() closes the socket, while shutdown() acts on the connection and leaves 
the socket fd valid.
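
A minimal Python sketch of that distinction, assuming a POSIX-like stack and a 
loopback TCP connection. The helper name half_close_demo and the payloads are 
made up for illustration; this is not the test code mentioned in this message.

```python
import socket

def half_close_demo():
    """Show that shutdown(SHUT_WR) keeps the fd usable, while close() releases it."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()

    client.sendall(b"query")
    client.shutdown(socket.SHUT_WR)      # sends FIN; the descriptor stays valid
    fd_after_shutdown = client.fileno()  # still a real descriptor (>= 0)

    request = b""
    while chunk := server.recv(4096):    # recv() returns b"" once the FIN arrives
        request += chunk
    server.sendall(b"response")          # the server side can still write
    server.close()

    reply = b""
    while chunk := client.recv(4096):    # the client can still read after SHUT_WR
        reply += chunk

    client.close()                       # releases the descriptor
    fd_after_close = client.fileno()     # Python reports -1 for a closed socket
    listener.close()
    return request, reply, fd_after_shutdown, fd_after_close
```

The -1 returned by fileno() after close() is Python's own convention; in C the 
descriptor number simply becomes invalid (and may be reused).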

> I suppose you could signal a half-close without an operating system
> close by some kind of ioctl call (in Linux type OS's).  In OS's like
> Symbian and DOS/Windows, etc., there may be some "control" operation
> that one can call. However it would be decidedly non-standard for the
> client to close when it expects more data.

In Linux it works fine. You only have to shutdown() the socket with 
how=SHUT_WR. Here's the tcpdump of a test where the client sends 10K of data 
and then receives 10K of data, while the server receives 10K of data, sleeps 
for 1 second, and sends 10K of data. The client half-closes the connection 
after it finishes write()ing:

Connect/Accept:
11:38:57.908427 IP 127.0.0.1.44572 > 127.0.0.1.9996: Flags [SEW], seq 
1205578956, win 32792, options [mss 16396,sackOK,TS val 5097055 ecr 
0,nop,wscale 7], length 0
11:38:57.908453 IP 127.0.0.1.9996 > 127.0.0.1.44572: Flags [S.E], seq 
1215158356, ack 1205578957, win 32768, options [mss 16396,sackOK,TS val 
5097055 ecr 5097055,nop,wscale 7], length 0
11:38:57.908467 IP 127.0.0.1.44572 > 127.0.0.1.9996: Flags [.], ack 1, win 
257, options [nop,nop,TS val 5097055 ecr 5097055], length 0

Client -> Server data:
11:38:57.908520 IP 127.0.0.1.44572 > 127.0.0.1.9996: Flags [P.], seq 1:10001, 
ack 1, win 257, options [nop,nop,TS val 5097055 ecr 5097055], length 10000
11:38:57.908544 IP 127.0.0.1.9996 > 127.0.0.1.44572: Flags [.], ack 10001, win 
386, options [nop,nop,TS val 5097055 ecr 5097055], length 0

Client's shutdown(SHUT_WR):
11:38:57.908577 IP 127.0.0.1.44572 > 127.0.0.1.9996: Flags [F.], seq 10001, 
ack 1, win 257, options [nop,nop,TS val 5097055 ecr 5097055], length 0
11:38:57.948106 IP 127.0.0.1.9996 > 127.0.0.1.44572: Flags [.], ack 10002, win 
386, options [nop,nop,TS val 5097095 ecr 5097055], length 0

-- 1 second sleep after read() and before write() on the server side --

Server -> Client data:
11:38:58.908653 IP 127.0.0.1.9996 > 127.0.0.1.44572: Flags [P.], seq 1:10001, 
ack 10002, win 386, options [nop,nop,TS val 5098055 ecr 5097055], length 10000
11:38:58.908676 IP 127.0.0.1.44572 > 127.0.0.1.9996: Flags [.], ack 10001, win 
386, options [nop,nop,TS val 5098055 ecr 5098055], length 0

Final close() from both sides:
11:38:58.908703 IP 127.0.0.1.9996 > 127.0.0.1.44572: Flags [F.], seq 10001, 
ack 10002, win 386, options [nop,nop,TS val 5098055 ecr 5098055], length 0
11:38:58.908710 IP 127.0.0.1.44572 > 127.0.0.1.9996: Flags [.], ack 10002, win 
386, options [nop,nop,TS val 5098055 ecr 5098055], length 0
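
An exchange of this shape can be approximated with the following Python 
sketch. It is an assumption-laden reconstruction, not the original test code: 
the names run_exchange, serve_once and read_until_eof are hypothetical, the 
server runs in a thread of the same process, and the server here reads until 
EOF (that is, until the client's FIN arrives) rather than counting 10K bytes.

```python
import socket
import threading
import time

PAYLOAD_SIZE = 10_000  # matches the 10K transfers in the trace above

def read_until_eof(sock):
    """Read until the peer's FIN arrives (recv() returns b'')."""
    data = bytearray()
    while chunk := sock.recv(65536):
        data += chunk
    return bytes(data)

def serve_once(listener, delay):
    """Accept one connection, read the request, sleep, then reply."""
    conn, _ = listener.accept()
    with conn:
        request = read_until_eof(conn)     # returns once the client half-closes
        time.sleep(delay)                  # stands in for server-side work
        conn.sendall(bytes(len(request)))  # reply with the same number of bytes
    # leaving the with-block close()s the socket, which sends the server's FIN

def run_exchange(delay=1.0):
    """Run one request/response over loopback; return the reply length."""
    with socket.create_server(("127.0.0.1", 0)) as listener:
        worker = threading.Thread(target=serve_once, args=(listener, delay))
        worker.start()
        with socket.create_connection(listener.getsockname()) as client:
            client.sendall(b"x" * PAYLOAD_SIZE)
            client.shutdown(socket.SHUT_WR)  # early FIN: no more client data
            reply = read_until_eof(client)   # reading still works after SHUT_WR
        worker.join()
    return len(reply)
```

Running run_exchange() while capturing on the loopback interface should show 
the client's FIN before the server's data, as in the trace, although exact 
segment sizes and ACK timing will differ between systems.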

> All that said, this is not that big a deal - there are lots of ways to
> deal with this without changing the protocol.  For example, the DNS
> *server* could shorten its fin-wait timeout to a very short timeout,
> after which it just drops any record of the connection.

Agreed, but is there any objection to testing/suggesting such a valid (from 
the protocol's POV) hack? I mean: do you see any problem?

If I understand this correctly, the half-closed connection will speed up the 
client (assuming a blocking client) by RTT/2 or even RTT, depending on how 
the FINs are exchanged.

The code for the above test is available, of course.

