[e2e] Is a non-TCP solution dead?

Fri Apr 25 13:58:54 PDT 2003

> I suggest exploring the (network. damn.) tradeoff again in the light
> of ABC. (cpu tradeoffs are, of course, best explored in prehistory
> back when more people cared about them. thanks, moore.)

Some people do still care about the CPU consumption :)

> ABC went experimental (proof that the IESG is stuck on Planet
> Status Quo?[*]), and RFC3465 discusses concerns about the extra
> burstiness that it introduces. Turn off delacks, and you counteract
> some of that burstiness now you're counting acks properly.

(It has been so long I'm sure I'll flub something here but anyhow...)
FWIW, and IIRC, when the VJ Congestion Control and Avoidance went into
the MPE/XL TCP stack round about 89-90, the implementation counted the
number of packets ACKed by the ACKs and adjusted cwnd accordingly. That
was our interpretation of the conservation of packets ideas.  It seemed
to work just fine and dandy on the 9600 baud dial-up links of the time. 
As far as I know (unless someone changed the code after I transferred) it
is still doing that today.  Drifting a bit: after a retransmission
timeout, if the ACK for the retransmitted segment did not cover
everything outstanding at the time of the retransmit, the code also
immediately retransmitted the segment following the one just ACKed.

Now, I cannot say that MPE/XL was/is a major force in the "Internet"
with the capital I, but it was certainly exchanging data around on
networks of constrained bandwidth. 

I can also recall taking measurements of several simultaneous file
transfers across some 9600 baud links in the lab and seeing how much
better (possible, even) things were with the congestion control
heuristics than with the prior behaviour without them.

> I'd mandate ABC with turning off delacks. Except ABC is in the sender,
> and delacks are in the receiver, and the IESG hasn't got the guts to
> get through the deployment/upgrade hump, minor though it is.
> Any future changes to TCP will be smaller than ABC, and you can't get
> much smaller than that.

At the risk of stumbling over an implied irony :)  On the surface it
would seem that if one were to mandate no delacks, you wouldn't need ABC
in the first place...
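
A rough sketch of why that seems so, at least during slow start.  This
is illustrative Python, not code from any real stack: slow_start_cwnd
and its parameters are names made up for the example, the receiver is
assumed to ACK every segs_per_ack full segments (with the delack timer
covering any leftover), and there is no loss.  The per-ACK cap that
RFC3465 puts on the byte-counted increase is not modeled.

```python
MSS = 1460  # bytes, the typical Ethernet MSS


def slow_start_cwnd(rounds, segs_per_ack, count_bytes, mss=MSS):
    """Return cwnd, in segments, after each round trip of slow start.

    segs_per_ack: full segments covered by each ACK
                  (1 = immediate ACKs, 2 = one-for-two delayed ACKs).
    count_bytes:  True  -> grow cwnd by the bytes each ACK covers,
                  False -> grow cwnd by one MSS per ACK received.
    """
    cwnd = mss  # start with a one-segment window
    history = []
    for _ in range(rounds):
        in_flight = cwnd // mss
        # One ACK per segs_per_ack segments, plus one more for any leftover.
        full, rest = divmod(in_flight, segs_per_ack)
        acks = [segs_per_ack] * full + ([rest] if rest else [])
        for covered in acks:
            cwnd += covered * mss if count_bytes else mss
        history.append(cwnd // mss)
    return history


print(slow_start_cwnd(4, 1, False))  # immediate ACKs, ACK counting
print(slow_start_cwnd(4, 2, False))  # delacks, ACK counting
print(slow_start_cwnd(4, 2, True))   # delacks, byte counting
```

Under these assumptions, immediate ACKs with plain ACK counting and
one-for-two delacks with byte counting both double the window each round
trip (2, 4, 8, 16 segments), while delacks with plain ACK counting grow
it more slowly (2, 3, 5, 8).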

I myself am in the camp (assuming there is one) that believes that ACKs
are (necessary) overhead, and in broad terms just as "expensive" to the
hosts as a data segment.  In that sense I like the idea of backing-off
on ACKs beyond one for two and also like the idea of counting the bytes
ACKed rather than the ACKs themselves.  I have not yet fully thought
through the interaction of the two.

I also like the idea that what is a nice two segment request/response
exchange remains a nice two segment request/response exchange and
doesn't become a four segment exchange (req, immediate ACK, resp,
immediate ACK).

HP-UX 11 has an ACK avoidance heuristic in its TCP implementation.  It
also has ways to constrain the heuristic via tuning.  Here is some rough
data between a pair of 440 MHz, UP, PA-RISC systems running HP-UX 11.11
(11i 1.0) speaking over an old Tigon2-based Gigabit link.  The service
demand figures show how much CPU time was consumed to transfer a KB (K
== 1024) of data.  Lower service demand is better.  The MTU of the link
is the standard 1500 bytes, so the typical MSS of 1460 bytes is in effect.
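
For anyone wanting to check the arithmetic, here is a small sketch of
how a service demand figure falls out of CPU utilization and throughput.
The function name and its parameters are made up for illustration (this
is not netperf's own code), and it assumes a single CPU, which holds for
these UP systems.  The sample values are the tcp_deferred_ack_max 2
figures from the results in this message.

```python
def service_demand_us_per_kb(cpu_util_percent, throughput_mbit_per_s, num_cpus=1):
    """CPU microseconds consumed per KB (K == 1024) of data transferred."""
    # Microseconds of CPU time burned in each second of wall-clock time.
    cpu_us_per_second = cpu_util_percent / 100.0 * num_cpus * 1_000_000
    # Throughput is reported in 10^6 bits/s; convert to KB/s with K == 1024.
    kb_per_second = throughput_mbit_per_s * 1_000_000 / 8 / 1024
    return cpu_us_per_second / kb_per_second


# 478.13 * 10^6 bits/s at 65.54% and 56.35% CPU utilization
print(service_demand_us_per_kb(65.54, 478.13))
print(service_demand_us_per_kb(56.35, 478.13))
```

Both values come out within rounding of the 11.230 and 9.654 us/KB that
netperf reported for that run.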

The TCP_MAERTS test is the classic netperf TCP_STREAM test where the
data flows from the netserver to the netperf rather than from netperf to
netserver.  That way, the netperf is the receiver and altering
tcp_deferred_ack_max can be done in the same script invoking netperf
without remsh and such.  Alas the tcp_deferred_ack_max tunable has a
lower bound of 2 segments so I cannot directly measure the effect of
immediate ACK on service demand.  I suppose that someone could
extrapolate it from the other data points.  Note that
tcp_deferred_ack_max is only an upper bound; sometimes the ACKs will be
sent earlier.

tcp_deferred_ack_max 2
TCP MAERTS TEST to 192.168.1.55
Recv   Send    Send                       Utilization     Service Demand
Socket Socket  Message  Elapsed            Send   Recv    Send    Recv
Size   Size    Size     Time   Throughput  local  remote  local   remote
bytes  bytes   bytes    secs.  10^6bits/s  % I    % I     us/KB   us/KB

131072 131072 131072    20.01  478.13   65.54    56.35    11.230  9.654 
tcp_deferred_ack_max 4
131072 131072 131072    20.01  570.90   71.26    59.77    10.226  8.577 
tcp_deferred_ack_max 8
131072 131072 131072    20.01  641.37   72.45    63.95    9.253   8.168 
tcp_deferred_ack_max 16
131072 131072 131072    20.00  680.75   77.98    63.46    9.384   7.637 
tcp_deferred_ack_max 32
131072 131072 131072    20.00  693.86   74.57    66.05    8.804   7.798 

There is a bit of "noise" in the data - at present the implementation of
the TCP_MAERTS test doesn't get along well with the confidence intervals
code in netperf, so these are individual data points.  I'll have to fix
that problem one of these days, or change the script to TCP_STREAM and
use remsh to alter the remote tcp_deferred_ack_max settings... but this
should give a decent flavor for what happens to CPU consumption as the
ACK behaviour changes.
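
To make the trend a little easier to see, here is a quick sketch that
expresses each run relative to the tcp_deferred_ack_max 2 run.  The
numbers are copied from the results above; the names RUNS,
percent_change and relative_to_first are just made up for the example,
and with single data points the usual caveats about noise apply.

```python
# (tcp_deferred_ack_max, throughput in 10^6 bits/s,
#  local service demand in us/KB, remote service demand in us/KB)
RUNS = [
    (2, 478.13, 11.230, 9.654),
    (4, 570.90, 10.226, 8.577),
    (8, 641.37, 9.253, 8.168),
    (16, 680.75, 9.384, 7.637),
    (32, 693.86, 8.804, 7.798),
]


def percent_change(baseline, value):
    """Signed change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def relative_to_first(runs):
    """Compare every run against the first (smallest ack_max) run."""
    _, base_tput, base_local, base_remote = runs[0]
    return [
        (ack_max,
         percent_change(base_tput, tput),
         percent_change(base_local, local_sd),
         percent_change(base_remote, remote_sd))
        for ack_max, tput, local_sd, remote_sd in runs
    ]


for row in relative_to_first(RUNS):
    print("ack_max %2d: throughput %+6.1f%%  local SD %+6.1f%%  remote SD %+6.1f%%" % row)
```

Going from 2 to 32 works out to roughly 45% more throughput for roughly
20% less CPU per KB on either end.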

rick jones
-- 
Wisdom Teeth are impacted, people are affected by the effects of events.
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to raj in cup.hp.com  but NOT BOTH...