[e2e] TCP "experiments"

Jon Crowcroft jon.crowcroft at cl.cam.ac.uk
Sun Jul 28 01:09:19 PDT 2013


we've been using overlays to run experimental protocols on the production
internet for two decades - the (minor) correctness point you raise about
abuse of code points is unfortunately almost unavoidable, because the
internet in the wild doesn't let traffic using new code points through in
enough places to get deployment - this goes back a long time (e.g. when
Microsoft shipped a perfectly ok ECN-enabled TCP stack, only to find that a
bunch of their servers weren't reachable due to incorrect dropping and
re-writing (normalization) of various tcp options by middleboxes) - it is
extremely well illustrated in the work I mentioned on MPTCP (here's the
reference, from NSDI 2012) - a perfectly respectable TCP extension which
has seen a lot of careful work and refinement, and simulation, and testbed
eval, only to hit barriers at the next step, the deployment trials - see
https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final125.pdf
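
(as an aside, here is roughly what a probe for that kind of middlebox
interference looks like - a minimal sketch only, not the methodology of the
NSDI paper: it assumes Scapy, root privileges, and "example.com" as a
placeholder for a server you control, and just checks whether an ECN-setup
SYN gets through and whether ECN is agreed)

# illustrative only: send a SYN with the ECN-setup bits (ECE+CWR) set and
# see whether anything comes back, and whether the SYN-ACK agrees to ECN.
# needs root and Scapy; "example.com" is a placeholder for a host you run.
from scapy.all import IP, TCP, sr1

SYN, ACK, ECE = 0x02, 0x10, 0x40

probe = IP(dst="example.com") / TCP(dport=80, flags="SEC")  # SYN+ECE+CWR
reply = sr1(probe, timeout=3, verbose=0)

if reply is None:
    print("no reply - ECN-setup SYNs may be dropped on this path")
elif TCP in reply and int(reply[TCP].flags) & (SYN | ACK) == (SYN | ACK):
    if int(reply[TCP].flags) & ECE:
        print("SYN-ACK with ECE set: ECN negotiated end to end")
    else:
        print("plain SYN-ACK: ECN setup stripped or not supported")
else:
    print("unexpected reply (e.g. a RST or ICMP error)")

(the local kernel will likely RST the half-open connection afterwards,
which is fine for a one-shot probe)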

I remember having discussions in both the IETF/IAB (e.g. when i was on
it:) and the more "academic" research community (e.g. SIGCOMM when Sally
Floyd and i were co-chairs) 15+ years ago - back then, it was conceivable
someone would build a TCP with a hidden flaw as yet undiscovered, get the
code shipped widely, and then discover in the wild that very bad things
happen

now, we have
a) lots of players who can try things in "their" part of the net, and
switch back if things don't all pan out (a small sketch of what that
per-connection switch looks like follows below)
b) lots of mechanisms to roll out new code (online s/w update in all major
OSes, including over the air for mobile devices)
c) as I said before, many correctly aligned incentives to not break stuff
gratuitously
which between them go a long way towards addressing the worries about
arbitrary changes to TCP breaking the whole internet
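
(the promised sketch, linux-specific and illustrative only: you ask for a
particular congestion controller on one socket and the kernel default
applies if it isn't available or allowed - connect_with_cc and the example
host are made up for illustration; TCP_CONGESTION is the real Linux socket
option, exposed in Python 3.6+)

import socket

def connect_with_cc(host, port, preferred="cubic"):
    # try a specific congestion controller for this one connection; if it
    # isn't loaded/allowed, quietly keep the system default - i.e. "switch
    # back if things don't pan out", in miniature (Linux only)
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION,
                     preferred.encode())
    except OSError:
        pass  # not available: the kernel default applies unchanged
    s.connect((host, port))
    in_use = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
    print("congestion control in use:", in_use.split(b"\0")[0].decode())
    return s

# e.g. connect_with_cc("example.com", 80)   # placeholder host and port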

the environment lets you do statistically safe experiments, and many people
do - also, we are not yet quite as bad as the pharmaceutical industry,
where negative results from clinical trials are suppressed so we don't get
to see a balanced picture (see Ben Goldacre's book, or the summary at
http://en.wikipedia.org/wiki/Bad_Pharma
for that story - although they are getting a lot better recently) - we tend
to report stuff, or at least try to....

I'd note that if you were to try some really disruptive thing using some
widespread deployment, you'd find yourself kicked off the net pretty fast -
planetlab, for example, has had its share of such stories,
but also a very large number of successes - the additional joy of planetlab
actually being connected to the real world, rather than to some academic
idealised notion of what the internet might have been like, is that a LOT
of really useful lessons in scale were learned, and in the process a very
large number of people were trained in what you can and can't do (as well
as a few quite successful startups emerging from some of the work)....
http://www.planet-lab.org/

I do think you also need the testbeds and simulations (of course), but you
need to go that last step to validate the work, and there will always be an
element of risk about it (but we also have the aforementioned mechanisms to
mitigate that risk)....

it has also been said that certain very large network and OS vendor
companies have used the installed customer base as their test engineers,
which kind of implies that what we discuss here is pretty irrelevant in
practice anyhow, so I think i'll go back to my reviewing

ttfn
jon


On Sat, Jul 27, 2013 at 5:24 PM, Joe Touch <touch at isi.edu> wrote:

>
>
> On Jul 26, 2013, at 9:48 PM, Jon Crowcroft <Jon.Crowcroft at cl.cam.ac.uk>
> wrote:
>
> > while linux (pick your flavour) cubic isn't vanilla, neither is
> > microsoft's compound tcp - the latter might have seen a bit more
> > eval in the literature but the former has seen a lot more big iron
> > deployment and doesn't appear to have broken the internet yet
>
> How would we know? They're not instrumented. These are not experiments;
> they're deployments.
>
> Even Schrödinger's cat eventually sees the light of day (as much as there
> is a cat in the first place).
>
> > (although there are rumours and reports of corner case problems)
> > but i dont think either of these are "non tcp" - they are variants
> > on CC behaviour....
>
> Which is a specified standard, which these mechanisms violate.
>
> You do bring up a valid point about the subject line, so I've changed it
> for this thread going forward.
>
> > also - the ability to do any deployment testing of a new tcp in
> > anger _requires you_ to be wireline compatible with TCP because of
> > the "non stadard" but ubiquitous NATs and other middleboxes
>
> The environment doesn't support safe experiments, but that is not a valid
> excuse for unsafe ones.
> ...
> > so the gold standard you quite reasonably want to hold people to,
> > to show their work doesn't do harm in the wild,
> > requires them to do "harm" by making
> > their new variant TCP appear chameleon like,
> > vanilla TCP, so they can get results
>
> So they do harm to avoid doing harm?
>
> They have failed because of their first step.
>
> Joe
>
>

