[e2e] TCP ex Machina

Richard G. Clegg richard at richardclegg.org
Wed Jul 31 05:47:23 PDT 2013


On 22/07/2013 08:04, Jon Crowcroft wrote:
> Richard Clegg at UCL presented a nice study recently where he has been
> trying to fit TCP equations to lots of big packet traces...he found
> precious few loss events, and the best explanation appears to be that the
> world is made of many zebras and mice....zebras are flows that are
> application rate limited...e.g. video streaming...mice are flows that just
> don't make it out of slow start before they are done...e.g. web transactions,
> social media chat etc...it appears TCP CC is rarely invoked these days......
Jon,
         Thanks for the kind mention.  I didn't reply earlier as I was 
travelling in parts of the world with no Internet coverage.  The work I 
presented comes from two papers, one publicly available, one under 
submission right now:

The first paper, "On the relationship between fundamental measurements 
in TCP flows", was presented at ICC:

http://www.richardclegg.org/pubs/rgc_icc_2013.pdf

It's a simple model-fitting exercise where we look at a large number of 
publicly available TCP traces (mainly CAIDA but some MAWI) and try to 
fit models inspired by the Padhye et al. form, in which a TCP 
connection's bandwidth is proportional to 1/RTT and 1/sqrt(p), where p 
is the probability of packet loss.  There was a burst of papers of this 
form some years ago which took a mathematical model of TCP and tried to 
derive a formula for throughput under given assumptions.  We took the 
reverse approach and fitted models to a large number of packet traces 
(several billion packets in all) to see how the throughput of a flow 
depended on loss, RTT and flow length.  The idea was to take the 
observational-statistics approach of assuming we knew nothing about TCP 
and trying to reverse engineer which observable factors of a TCP flow 
cause it to have high or low throughput.
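
To make that concrete, here is a minimal sketch (my own illustration, 
not the paper's code) of the kind of fit involved: regress log 
throughput against log RTT, log loss rate and log flow length over 
per-flow summaries.  The data and field names below are made up; a 
Padhye-style dependence would show up as coefficients near -1 for RTT 
and -0.5 for loss.

import numpy as np

# Hypothetical per-flow summaries: throughput (bit/s), RTT (s),
# loss rate, flow length (bytes).  Real input would come from the traces.
flows = np.array([
    [2.0e6, 0.030, 0.001, 5.0e6],
    [4.0e5, 0.120, 0.000, 8.0e4],
    [1.5e6, 0.050, 0.004, 2.0e6],
    [9.0e5, 0.080, 0.002, 6.0e5],
    [3.0e6, 0.020, 0.0005, 1.0e7],
])

eps = 1e-6                          # avoid log(0) for loss-free flows
y = np.log(flows[:, 0])             # log throughput
X = np.column_stack([
    np.ones(len(flows)),            # intercept
    np.log(flows[:, 1]),            # log RTT
    np.log(flows[:, 2] + eps),      # log loss rate
    np.log(flows[:, 3]),            # log flow length
])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["intercept", "rtt", "loss", "length"], coef)))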

What was interesting to me was that
1) Length of flow was very important to the model fit -- even if you 
filter out short flows to get rid of slow start.
2) Proportion of loss in a flow was not so important, indeed hardly 
relevant at all.  There was little correlation between loss and 
throughput (we used a variety of traces, some low loss, some high loss).
3) Often it is sufficient to simply know RTT and length of flow in order 
to make a good prediction of the completion time for a flow.
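
As a rough illustration of point 3 (my own back-of-envelope sketch, not 
taken from the paper): for a flow that finishes inside slow start, 
completion time is roughly one RTT per window doubling, so RTT and flow 
length alone pin it down.

import math

def slow_start_completion(flow_bytes, rtt_s, mss=1460, init_cwnd=10):
    """Rough completion time for a flow that finishes inside slow start."""
    segments = math.ceil(flow_bytes / mss)
    rounds, sent, cwnd = 0, 0, init_cwnd
    while sent < segments:
        sent += cwnd           # one window of segments per round trip
        cwnd *= 2              # window doubles each round in slow start
        rounds += 1
    return rounds * rtt_s      # ignores handshake and transmission time

print(slow_start_completion(50000, 0.05))  # 3 rounds at 50 ms RTT -> 0.15 s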

The second paper (under submission) was led by Joao Araujo here at UCL.  
It looks at mechanisms which affect TCP throughput but are not loss or 
delay based: for example, throttling by the application (YouTube, 
clickhosts and P2P hosts often do this by various mechanisms), hitting 
the maximum window size (either because of OS limits at the client or 
because the server limits it to cap bandwidth), or throttling by 
middleboxes.  This used the MAWI data from Japan and showed that more 
than half of modern TCP traffic in that data is not really "traditional" 
TCP as we think of it (attempting to "fill the pipe", using loss or 
delay as the signal to stop doing so), and the fraction of TCP 
controlled by other mechanisms appears to be growing.
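
To give a feel for one of those non-loss mechanisms (again my own 
illustration, not from the paper): a flow capped by its maximum window, 
whether set by the OS or advertised by the receiver, can never exceed 
roughly window/RTT, however much capacity the path has.

def window_limited_throughput(max_window_bytes, rtt_s):
    """Upper bound (bit/s) for a TCP flow limited by its maximum window."""
    return 8 * max_window_bytes / rtt_s

# e.g. a 64 KB window (no window scaling) over a 100 ms path:
print(window_limited_throughput(65535, 0.100) / 1e6, "Mbit/s")  # ~5.2 Mbit/s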

Joao would be the best person to answer questions about the work 
(j.araujo at ucl.ac.uk) -- but I found it very interesting.  Others 
would have better insight than I do, but I was surprised.  If more than 
half the traffic is not at heart controlled by loss/delay, then what 
does this imply for all of our insights on fairness between flows?  It 
seems the majority of traffic we considered is in various ways either 
deliberately (YouTube, clickhosts) or accidentally (OS-limited at the 
client) controlled in a different way from the one we teach in the 
classroom.

-- 
Richard G. Clegg,
Dept of Elec. Eng.,
University College London
http://www.richardclegg.org/


