[e2e] Is a non-TCP solution dead?

Cannara cannara at attglobal.net
Fri Apr 4 10:34:40 PST 2003


Jonathan,

First, let me say again that the example I gave was from a real network, owned
and operated by a real (read major) worldwide network provider.  The routed
network was constructed with major vendors' gear (the C word).  TCP traffic
was generated by standard NT servers & stations of excellent capacity, as
measured a few years ago.  The overall path lengths in the tests were no more
than 400 miles.  And, the result was already reported on and published as an
early example of a network-performance tool by NetPredict.com, so you can find
some details there.  The traces I have are many MB of Sniffer(r) .ENC files. 
I'd be happy to view them with you, but because of non-disclosure, they can't
be sent out.  This, furthermore, is just one among several examples of poor
TCP behaviors in real networks that I've personally seen in the last 10
years.  

Now, about the 100-150 ms delay: waiting for an Ack at the end of each send
window, when an odd number of packets has been sent in a block, is pure
overhead.  It arises from: a) the setting of the delayed-Ack timer; b) the
size of the application data block (e.g., SMB); and c) it is inversely
proportional to the size of the send window.  So, if the sender's window
starts small and doesn't grow quickly, so that N windows are needed to send
the block, the block will suffer N such delays -- I'm sure we all get this.
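To make the arithmetic concrete, here's a back-of-envelope sketch.  The model is my own illustration, not taken from the traces: it assumes the receiver Acks every second segment, holds a lone trailing segment for a delayed-Ack timer (150 ms here), and that the sender stalls for that timer once per window ending on an odd segment.

```python
def delayed_ack_overhead(block_bytes, mss=1460, window_segments=3,
                         ack_timer_s=0.150):
    """Estimate seconds of pure delayed-Ack stall while sending one block.

    Illustrative assumptions (mine): the receiver Acks every 2nd segment
    and delays the Ack for a lone segment by ack_timer_s, so a window
    carrying an odd number of segments stalls once.
    """
    segments = -(-block_bytes // mss)           # ceil: segments in the block
    windows = -(-segments // window_segments)   # windows needed to send it
    if window_segments % 2:                     # odd window: every window stalls
        odd_windows = windows
    else:                                       # even window: only an odd
        last = segments - (windows - 1) * window_segments
        odd_windows = 1 if last % 2 else 0      # trailing remainder stalls
    return odd_windows * ack_timer_s

# A 30 kB block pushed through a 3-segment window stalls in every window:
print(delayed_ack_overhead(30 * 1024))   # roughly 1.2 s of pure waiting
```

With a small, slowly growing window the stall time can dwarf the serialization time of the data itself, which is the point above.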

What can cause the send window to not increase?  

1) Well, we know its maximum can be set wrongly by default -- NT shipped with
a 6-packet window limit.  Microsoft raised that a bit later on, but the user
had no access to the adjustment.  So, talk about RFCs is often irrelevant.

2) A lost packet will eventually put the receiver into triple Acking (three
or more duplicate Acks) to ask for the missing segment, if the receiver is of
that vintage.  If the sender doesn't understand that, as this NT TCP
apparently did not, or if a repeated Ack is lost, then the sender falls into
a long retransmission timeout, perhaps back to slow start.
The combination of the above can make a 2 MB file transfer take about 30%
longer using TCP, even when the loss rate on the path is well under 1%.  This
is hardly good, nor evidence of a well-engineered transport.
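Here's a crude model of that inflation.  Every number in it (link rate, timeout, loss rate, per-loss recovery cost) is my own illustrative assumption, not measured from the traces; the point is only that falling back to a full retransmission timeout, rather than recovering via fast retransmit, multiplies the cost of each loss.

```python
def transfer_time_s(file_bytes, mss=1460, link_bps=10e6,
                    loss_rate=0.002, rto_s=0.2, fast_retransmit=True,
                    fast_rtt_s=0.02):
    """Very rough transfer-time model under illustrative assumptions:
    each lost segment costs about one RTT if fast retransmit works,
    or one full retransmission timeout if it does not (e.g., because
    a duplicate Ack was itself lost)."""
    segments = -(-file_bytes // mss)          # ceil: segments in the file
    ideal = segments * mss * 8 / link_bps     # pure serialization time
    expected_losses = segments * loss_rate
    per_loss = fast_rtt_s if fast_retransmit else rto_s
    return ideal + expected_losses * per_loss

good = transfer_time_s(2 * 1024 * 1024)
bad = transfer_time_s(2 * 1024 * 1024, fast_retransmit=False)
print(f"slowdown: {100 * (bad / good - 1):.0f}%")
```

With a 0.2% loss rate -- well under 1% -- the timeout path comes out on the order of 30% slower than the fast-retransmit path in this toy model.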

Add in Alok's RFC quote: "Many TCP's acknowledge only every Kth segment out of
a group of segments arriving within a short time interval;..." and you can see
that very long delays can occur when an Ack is lost.  One question we can
legitimately ask today is: why delay Acks at all?  Ack every segment
received.  That would also allow new feedback info to be carried with each
Ack, as discussed in the archives, and so let the sender know some useful
things more quickly.
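One cost of Ack-every-Kth is easy to quantify (this is my simplification, one duplicate Ack per K arriving segments): fast retransmit needs three duplicate Acks, so roughly 3*K segments must arrive past the hole before recovery can start -- and with fewer Acks on the wire, losing one of them hurts more.

```python
def segments_past_hole(ack_every_k, dupacks_needed=3):
    """Segments that must arrive after a loss before the receiver has
    generated enough duplicate Acks to trigger fast retransmit
    (simplified: one duplicate Ack per ack_every_k arriving segments)."""
    return dupacks_needed * ack_every_k

# Acking every segment halves the wait versus Ack-every-other:
print(segments_past_hole(1), segments_past_hole(2))   # 3 6
```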

By the way, I don't understand the "driver issue" you mention -- you started
that line.  You also need to explain more about what you think happens when an
SMB client makes a block request and a server sends the response.  I've no
idea why you're bringing up CIFS: when a 30 kB SMB Read command arrives, the
server's SMB layer responds by pointing TCP at a 30 kB block in memory, which
TCP segments and sends.  I know Hans brought up CIFS as well, but I really
don't get the point -- SMB blocks are handed to TCP to send, regardless.
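For concreteness, the hand-off I'm describing amounts to nothing more than this sketch (the sizes are assumed for illustration, and this is not Microsoft's actual code):

```python
def segment_block(block_bytes=30 * 1024, mss=1460):
    """Split an application block (e.g., an SMB Read response) into
    MSS-sized TCP segments.  Sizes are illustrative."""
    full, rem = divmod(block_bytes, mss)
    return [mss] * full + ([rem] if rem else [])

segs = segment_block()
print(len(segs), segs[-1])   # 22 60 -- 22 segments, 60-byte tail
```

Whatever protocol name sits above it, TCP just sees a block to segment and send.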

By the way, when falling back on the crutch of "the RFCs," we need to remember
the many implementations of TCP out there that real people must use with real
systems, while they try to accomplish their own, real tasks.  The admonitions
on use of RST provide one example.  In some cases, the folks writing the RFCs
don't make it clear enough what not to do, so we get oddball behaviors from
real vendor products -- HP's TCP using RST instead of FIN, for example.
Sometimes these are corrected later, but the point is that there is no trusted
validation of TCP implementations.  So users have no idea, when problems or
inefficiencies arise, that they need to look in detail at how their purchased
gear, stacks included, works together.  That becomes far more complex when we
add in the millions of Internet targets and their installed products.

Alex

Jonathan Stone wrote:
> 
> Alex,
> 
> I see some fairly distinct issues here. For reasons I don't understand,
> you seem determined to conflate them. Let's not conflate.
> 
> The first point I'd like to address is your claim that TCP's
> ACK-every-other-packet was a contributor to your war story
> about poor file-transfer performance.
> 
> I report, again, the *fact* that a TCP will ACK every other segment
> need not, in itself, be a throughput bottleneck.  In a modern TCP
> (or any TCP which has caught up with RFCs more than 10 years old!),
> window sizes *do* grow sufficiently large that a TCP sender can transmit a
> burst of over 40 full-size Ethernet segments, without receiving a
> single ACK -- in fact, all 40 segments arrive at the receiver before
> the ``communication network'' (rfc793) even informs the receiving TCP
> of the first packet in the burst.
> 
> That is *not* a ``driver'' issue. That is an observed fact about TCP
> window sizes, and the consequent ability of TCP to send data without
> necessarily receiving prompt ACK segments.  In light of that observed
> fact, your claim about one-ACK-every-two segments is (again) specious.
> 
> Now another issue.  You did supply slightly more detail about the
> alleged bad file-transfer performance being due to TCP:
> 
>   > [...] no, I'm not referring to what you term RPC/CIFS [...]
> 
>   >I'm referring to an ordinary file transfer, say from an NT server to
>   >a workstation, where TCP/IP is the installed stack.  This has been a
>   >common config ever since Microsoft began shipping TCP/IP.
> 
> Point of information: network file accesses between Microsoft NT file
> servers using SMB over TCP/IP, actually employ *CIFS* over SMB over
> TCP.  [Leaving aside  remote possibilities, like DFS over MS's DCE RPC].
> 
> Were you using some other file transfer protocol atop SMB-over-TCP?
> 
> >By the way, since we're about 2 miles from one another, I'll be happy to show
> >you any data you'd like to see.
> 
> Post a URL to a packet trace, please. (libpcap preferred, so we can
> feed it into Shawn Osterman's tcptrace; but not required).  If TCP is
> truly as bad as you say, there's lots of us hungry to publish papers
> analyzing the problems and proposing ways to fix them.





More information about the end2end-interest mailing list