[e2e] TCP ex Machina

Richard G. Clegg richard at richardclegg.org
Wed Jul 31 05:47:23 PDT 2013

```
On 22/07/2013 08:04, Jon Crowcroft wrote:
> Richard Clegg at UCL presented a nice study recently where he has been
> trying to fit TCP equations to lots of big packet traces...he found
> precious few loss events, and the best explanation appears to be that the
> world is made of many zebras and mice....zebras are flows that are
> application rate limited...e.g. video streaming...mice are flows that just
> don't make it out of slow start before they are done...e.g. web transactions,
> social media chat etc...it appears TCP CC is rarely invoked these days......

Jon,
Thanks for the kind mention.  I didn't reply earlier as I was
travelling in parts of the world with no Internet coverage.  The work I
presented comes from two papers, one publicly available, one under
submission right now:

This paper was presented at ICC:  On the relationship between
fundamental measurements in TCP flows:

http://www.richardclegg.org/pubs/rgc_icc_2013.pdf

It's a simple model fitting exercise where we look at a large number of
publicly available TCP traces (mainly CAIDA but some MAWI) and try to
fit models inspired by the Padhye et al. form, in which the bandwidth of
a TCP connection is proportional to 1/RTT and 1/sqrt(p), where p is the
probability of packet loss.  There was a burst of papers of this form
some years ago
which took a mathematical model of TCP and tried to get a formula for
throughput given assumptions.  We took the reverse approach and fitted a
large number of packet traces (several billion packets in all) to see
how the throughput of a flow depended on loss, RTT and length of flow.
So the idea was to take the observational statistics approach of
assuming that we knew nothing about TCP and trying to reverse engineer
what factors we can observe in a TCP flow which cause it to have high or
low throughput.

What was interesting to me was that:
1) Length of flow was very important to the model fit -- even if you
filter out short flows to get rid of slow start.
2) Proportion of loss in a flow was not so important, indeed hardly
relevant at all.  There was little correlation between loss and
throughput (we used a variety of traces, some low loss, some high loss).
3) Often it is sufficient to simply know RTT and length of flow in order
to make a good prediction of the completion time for a flow.

The second paper (under submission) was led by Joao Araujo here at UCL.
It looks at mechanisms which affect TCP throughput but are not loss or
delay based.  E.g. throttling by application (YouTube, clickhosts and
P2P hosts often do this by various mechanisms), hitting max window size
(either because of OS limits at client or because server limits it to
cap bandwidth) or throttling by middleboxes.  This used the MAWI data
from Japan and showed that more than half of modern TCP traffic in that
data is not really "traditional" TCP as we think of it (attempting to
"fill the pipe" using loss or delay as a signal to stop doing so) and
the fraction of TCP controlled by other mechanisms appears to be growing.

Joao would be the best person to answer questions about the work
(j.araujo at ucl.ac.uk) -- but I found it very interesting.  Others would
have better insight than I, but I was surprised.  If more than half the
traffic is not at heart controlled by loss/delay then what does this
imply for all of our insights on fairness between flows?  It seems the
majority of traffic we considered is in various ways either deliberately
(YouTube, clickhost) or accidentally (OS limited at client) controlled
differently from the way that we teach in the classroom.

--
Richard G. Clegg,
Dept of Elec. Eng.,
University College London
http://www.richardclegg.org/

```
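
As an illustration of the kind of fit described in the message above (throughput against RTT, loss and flow length), here is a minimal sketch in Python. It is not the method or code from the ICC paper. It assumes a log-linear model, log(throughput) = a + b*log(RTT) + c*log(p) + d*log(length), fitted by ordinary least squares, and it runs on synthetic flows generated to follow throughput proportional to 1/(RTT*sqrt(p)). The function names, the constant 1000, the parameter ranges and the noise level are all invented for the example.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_log_linear(throughput, rtt, loss, length):
    """Least-squares fit of
    log(throughput) = a + b*log(rtt) + c*log(loss) + d*log(length).
    Returns [a, b, c, d]. All inputs must be strictly positive."""
    rows = [[1.0, math.log(r), math.log(p), math.log(n)]
            for r, p, n in zip(rtt, loss, length)]
    y = [math.log(t) for t in throughput]
    k = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def synthetic_flows(n=2000, seed=1):
    """Hypothetical flows obeying throughput ~ 1/(RTT*sqrt(p)) with
    log-normal noise. Flow length has NO effect here by construction."""
    rng = random.Random(seed)
    rtt = [rng.uniform(0.01, 0.5) for _ in range(n)]       # seconds
    loss = [rng.uniform(1e-4, 5e-2) for _ in range(n)]     # loss probability
    length = [float(rng.randint(50, 10000)) for _ in range(n)]  # packets
    thr = [1000.0 / (r * math.sqrt(p)) * math.exp(rng.gauss(0.0, 0.1))
           for r, p in zip(rtt, loss)]
    return thr, rtt, loss, length


if __name__ == "__main__":
    a, b, c, d = fit_log_linear(*synthetic_flows())
    print("RTT exponent %.3f, loss exponent %.3f, length exponent %.3f"
          % (b, c, d))
```

On this synthetic data the fit should recover exponents close to -1 for RTT, -0.5 for loss and 0 for length, since that is how the data were generated. The message reports a different picture for the real traces: loss contributed little and flow length mattered a great deal. In terms of this sketch, that would correspond to a fitted loss exponent near zero and a length exponent clearly away from zero.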