[e2e] TCP in outer space

J. Noel Chiappa jnc at ginger.lcs.mit.edu
Thu Apr 12 18:17:53 PDT 2001


    > From: Ted Faber <faber at ISI.EDU>

    >> since 1978, there has been no mystery that, for instance, TCP's
    >> retransmission algorithm has been given no means to distinguish among
    >> even the two most important causes of loss -- physical or queue drop.
 
    > Mechanisms based on internal network state were considered and
    > discarded because they conflicted with the well-defined and
    > well-prioritized list of design goals for the nascent Internet.
    > Specifically, depending on such internal information made the net less
    > robust to gateway failures .. so such mechanisms weren't included.
 
Actually, neither of these positions is right. The early Internet *did* try
to *explicitly* signal drops due to congestion, separately from drops due to
damage - that's what the ICMP "Source Quench" message was all about.
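For reference, since it's been a while: a Source Quench is just a tiny ICMP
message - type 4, code 0, per RFC 792 - that a gateway sends back toward the
source when it has to discard a datagram for lack of buffers. A minimal
sketch of the on-the-wire layout, in C; the struct and field names here are
just illustrative, not taken from any particular stack:

    #include <stdint.h>

    /* ICMP Source Quench, RFC 792: type 4, code 0.  A gateway sends one
       toward the source when it discards a datagram for lack of buffer
       space; the body echoes the offending datagram's IP header plus the
       first 64 bits of its payload, so the source can tell which
       connection to slow down. */
    struct icmp_source_quench {            /* illustrative name */
        uint8_t  type;                     /* 4 = source quench  */
        uint8_t  code;                     /* 0                  */
        uint16_t checksum;                 /* ICMP checksum      */
        uint32_t unused;                   /* must be zero       */
        /* ...followed by original IP header + 64 bits of original data */
    };

I.e. the congestion signal comes back explicitly, rather than having to be
inferred from a timeout or a missing ACK.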

It turned out that SQ probably wasn't the right way to go about it - but I
seem to recall a very recent, very lengthy debate about the utility of SQ, as
compared to ECN, so I think those who made that particular mistake back when
can't be castigated too hard.

In fact, we just didn't understand a lot about the dynamics of very large
pure datagram networks back in the late 70's, when the initial TCP work was
being done. (And I'm not talking only about congestion, etc. here.) To start
with, nobody had ever built one (the ARPAnet was *not* a pure datagram
network, for reasons I'm not going to explain here). With the *very* limited
amount of brain/programming power available in the early days of the project
(most of which was focussed on down-to-earth tasks like getting packets in
and out of interfaces), it's not too surprising that there were lots of
poorly-examined regions of the design space.

But to get back to the topic, even ECN is at best a probabilistic mechanism
for letting the source know which drops are due to congestion, and which to
packet damage. The only "more reliable" (not that *any* low-level mechanism
in a pure datagram network is wholly reliable) method is SQ - and as I
mentioned, we just got through a long disquisition about how SQ is not the
right thing either.
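To make the contrast concrete, here's a rough sketch of the router's "mark
instead of drop" decision with ECN. The bit values are the ones from the ECN
spec (RFC 2481); the helper name and the AQM input are made up purely for
illustration:

    #include <stdint.h>

    /* ECN (RFC 2481) uses the two low-order bits of the old IP ToS octet:
       the ECT bit says "this transport understands ECN", and the CE bit is
       what a congested router sets instead of dropping the packet.  The
       receiver then echoes CE back to the sender in the TCP header. */
    #define IP_ECT  0x02   /* ECN-Capable Transport   */
    #define IP_CE   0x01   /* Congestion Experienced  */

    /* Illustrative helper, not from any real stack: given the packet's
       ToS octet and an AQM decision (e.g. RED saying "this one would be
       dropped"), either mark the packet or tell the caller to drop it. */
    static int ecn_mark_instead_of_drop(uint8_t *tos, int aqm_says_congested)
    {
        if (!aqm_says_congested)
            return 1;              /* no congestion action; forward     */
        if (!(*tos & IP_ECT))
            return 0;              /* sender can't understand a mark:
                                      the router has to drop, and the
                                      source is back to guessing why    */
        *tos |= IP_CE;             /* mark instead of dropping          */
        return 1;                  /* forward the marked packet         */
    }

Note that nothing in there labels an actual *drop* - a packet lost to damage,
or one dropped because the queue is simply full, just vanishes, and the
source is left to infer the cause. Hence "at best probabilistic".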


    >> nothing in 2001 is new with respect to Internet design criteria, that
    >> was not evident to network engineers in and out of the Internet
    >> community over a decade ago

Perhaps - but the basic Internet framework was done in the mid-70's, when
this stuff was *not* "evident" to all and sundry.

And if you think that was due to a lack of smarts back then - well, there
are a lot of things you only learn via experience. I'll point to Sorcerer's
Apprentice Syndrome, which was apparently independently re-invented by people
at Xerox PARC *and* MIT - neither of which is known for below-average
cleverness.

    >> If the old tunes were spur-of-the-moment (ad hoc), as we know some
    >> significant ones were, then why be so resistant to new ideas for a new
    >> networking realm? Bureaucracies are resistant. Productive, competent
    >> research cannot be.

As someone who's spent a lot of time trying to get the IETF community to take
on some novel ideas, let me tell you that it's one heck of a lot harder than
it looks.

And it's not "bureaucracy", but a much more complex set of reasons - ones I
don't have the time/energy to expand on right now.

 
	Noel


