[e2e] Is a non-TCP solution dead?

Jonathan Stone jonathan at DSG.Stanford.EDU
Wed Apr 2 14:16:59 PST 2003


Alex,

When you say "SMB", I assume you mean the Server Message Block
protocol used in (for example) CIFS traffic,  as deployed in Microsoft servers.

>1) Even Old SMB ('90s) allows 128kB of block transfer down to the Transport.


That's not relevant to the stop-and-wait RPC payloads which SMB
passes down to its own transport.  If the traffic layered on SMB is
CIFS remote-file RPC, then CIFS does have the stop-and-wait RPC
behaviour I sketched. So, *if* your example of a ``NOS'' refers to
CIFS traffic, then the problem is not TCP at all, and it's
disingenuous to claim that it is.

While SMB itself may allow 128Kbyte SMB frames, both the open-source
(SNIA) and the long-expired Microsoft CIFS I-D specify a 16-bit field
for the overall octet length of the CIFS payload inside the SMB delimiters.
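A back-of-the-envelope sketch of the ceiling that 16-bit length field
imposes (just arithmetic, not a parse of any actual SMB header layout):

```python
# A 16-bit octet-count field caps a single CIFS payload at
# 2**16 - 1 octets: just under 64 KB per RPC, however large a
# window the TCP underneath could otherwise sustain.
MAX_CIFS_PAYLOAD = 2**16 - 1

print(MAX_CIFS_PAYLOAD)  # 65535
```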

Then again: if your example was not CIFS traffic: show us a packet
trace with more than 64Kbyte stop-and-wait RPCs, where the packet
trace suggests TCP (rather than the stop-and-wait SMB RPC) is at fault.

>Things like NFS have had similar ranges.  

No. NFS does not have the limits familiar to SMB and CIFS users. None
of the NFS implementations I've ever used do stop-and-wait RPCs; they
all employ nfsds on the server and nfsiods on the clients (or
more modern equivalents) to sustain multiple RPCs in flight.
In contrast, SMB (with CIFS) rarely has more than one RPC in flight.
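The cost of one-RPC-in-flight is easy to see from first principles.
A sketch (the 64 KB RPC size and the RTT values are illustrative
assumptions, not figures from any trace):

```python
# Stop-and-wait: one RPC per round trip, so throughput is capped at
# rpc_size / RTT no matter how fast the link or how big the TCP window.
def stop_and_wait_throughput(rpc_bytes, rtt_s):
    """Throughput ceiling in bytes/sec for one RPC in flight."""
    return rpc_bytes / rtt_s

rpc = 64 * 1024  # ~64 KB per CIFS RPC
for rtt_ms in (1, 10, 100):
    bps = stop_and_wait_throughput(rpc, rtt_ms / 1000)
    print(f"RTT {rtt_ms:3} ms -> {bps * 8 / 1e6:8.1f} Mbit/s ceiling")
```

With k RPCs concurrently in flight (as the nfsd/nfsiod arrangement
permits), the ceiling scales by roughly k until the link or the TCP
window becomes the bottleneck.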


>These are beyond the windows a vendor's TCP normally begins at, or
>gets to in a few MB.


Scuse me, but that's nonsense.  I've personally instrumented Ethernet
drivers for CIFS traffic. A maximum-length CIFS read or write is just
over 64Kbytes, which consumes 44-odd standard length Ethernet packets.
I've taken packet traces which show that under typical load, an Intel
gigabit NIC[1] will DMA that train of 44-odd packets into memory in
one hit, and delivers one interrupt for the entire packet train,
shortly after the link goes idle.  In those circumstances the sender's
TCP window clearly *has* to be more than one RPC's worth, because the
receiving TCP isn't seeing any of those 44-odd packets until the whole
burst has already been deposited in the receiver's
memory[2]. The system under test did indeed get there ``in a few MB''.
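The 44-odd-packet figure follows from simple arithmetic (a sketch,
assuming the standard 1500-byte Ethernet MTU and hence a 1460-byte
TCP MSS after 40 bytes of IP+TCP header):

```python
import math

MSS = 1460                 # 1500-byte Ethernet MTU minus 40 bytes IP+TCP header
rpc_bytes = 64 * 1024      # a maximum-length CIFS read/write, ~64 KB of payload

segments = math.ceil(rpc_bytes / MSS)
print(segments)  # 45 -- the "44-odd" standard-length packet train per RPC
```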

The attribution of poor performance of (unspecified) SMB traffic to
TCP's ACK-every-second packet heuristic thus makes no sense whatever.

That said: there is a gotcha in fast-retransmit/fast-recovery with
RPC-oriented traffic: a drop of any of the last 4 segments of an RPC
has too few segments following it to trigger the 3-dup-ack threshold.
But for the specific example of SMB, it's moot: even WindowsCE devices
can (with registry editing) do SACK, and have done for what, nearly
3 years now?
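The gotcha can be sketched with a toy dup-ack counter (an
illustration of the standard 3-dup-ack fast-retransmit threshold,
not any particular stack's code):

```python
# If segment `lost` (0-based) of an n-segment RPC burst is dropped,
# each later segment elicits one duplicate ACK.  Fast retransmit
# needs 3 dup acks, so a drop near the tail of the burst leaves too
# few following segments and forces the sender to wait out an RTO.
def dup_acks(n_segments, lost):
    return n_segments - lost - 1  # segments arriving after the lost one

n = 45  # one maximum-length CIFS RPC, ~45 segments
for lost in (0, n - 4, n - 3, n - 1):
    d = dup_acks(n, lost)
    verdict = "fast retransmit" if d >= 3 else "RTO timeout"
    print(f"drop seg {lost:2}: {d:2} dup acks -> {verdict}")
```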

Alex, if you know of legitimate technical gripes with TCP, I'm
genuinely keen to hear them.  But war stories, with more colour than
fact, with insufficient detail to ascertain causes, yet with the blame
assigned to TCP regardless of the facts, are a waste of our collective time.



[1] one port of an Intel 82546 with Intel-supplied FreeBSD 4.x
drivers (I don't have Microsoft source code to instrument).


[2] I had rfc1323 options enabled.  So has most every other TCP in
the last 5 to 8 years.  RFC-1323 will be eleven years old next month.



