[e2e] Open the floodgate

Cannara cannara at attglobal.net
Thu Apr 22 13:04:11 PDT 2004


David, thanks for the tutorial, but you obviously missed my points, for some
reason.  I never asserted TCP is "supposed to correct for failing hardware",
other than by retransmission, now did I?  Nor did I even suggest "TCP obviate
the need for local error control".  Nor did I say anything about control
theory for managing the Internet.  You seem to love going off on tangents to
bring an argument to your terms, while avoiding the points raised.

The problem is, as you well know but somehow can't bring yourself to admit
here, that TCP was modified to provide network congestion control some years
ago -- something that is clearly a violation of layering responsibility and
certainly inconsistent with the vaunted "end-to-end" principle (or lemma as
one joked to me off list).

So, what we've bought by that poorly-engineered and long-maintained kludge,
which ignores all other IP traffic, is a protocol that believes it needs to
slow down whenever it sees a packet loss.  That's a very simple thing to
grasp.  It's a very simple mistake.  There are various causes of loss, a
fact known even in the '80s when TCP was kludged, so why ignore them and
incorrectly lump them with congestion?  Even if it were agreeable to have one
Transport protocol do a Network-layer task this way, it would then require
engineering to be sure the task were performed properly in all likely
scenarios and that no fatal corner cases existed.  This has not been done,
though it has been begun, finally, with some efforts like ECN.
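
To make the loss-equals-congestion point concrete, here is a toy sketch, not
a model of any real TCP stack.  It assumes a bare additive-increase /
multiplicative-decrease window and a path with no capacity limit at all, so
every loss in it is non-congestive (say, a noisy link).  The function name
simulate_aimd, the 64-segment window cap, and the loss rates are all
hypothetical choices made for illustration.

```python
import random


def simulate_aimd(rounds, loss_prob, max_window=64, seed=1):
    """Return the average window (in segments) over `rounds` round trips.

    Assumption: no link capacity is modeled, so no loss here is caused by
    congestion.  Each segment is dropped independently with `loss_prob`.
    """
    rng = random.Random(seed)
    cwnd = 1.0
    total = 0.0
    for _ in range(rounds):
        total += cwnd
        # Did at least one segment of this round trip get lost?
        lost = any(rng.random() < loss_prob for _ in range(int(cwnd)))
        if lost:
            # Multiplicative decrease: the sender cannot tell why the
            # segment was lost, so it backs off regardless of the cause.
            cwnd = max(1.0, cwnd / 2)
        else:
            # Additive increase: one more segment per loss-free round trip.
            cwnd = min(max_window, cwnd + 1)
    return total / rounds


if __name__ == "__main__":
    for p in (0.0, 0.001, 0.01):
        print(f"random loss {p:.3f}: average window {simulate_aimd(1000, p):.1f}")
```

With no loss the window climbs to the cap and stays there; with even a small
random loss rate the average window falls, although nothing on this imaginary
path is congested.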

Another remarked to me off list that even the ITU and OSI work faster than the
IETF, so it should be no surprise that we are where we are.  For this list,
it's especially relevant, because, as said many times, the archives contain
concrete discussions and concrete suggestions of how to improve the situation,
even if TCP were left largely unchanged.  But these are not things a
comfortable, junketing bureaucracy wants to spend time on.  They are the
things new grad students and researchers might well spend time on, but
messages like yours and some others are clearly aimed at discouraging open
discussion and trial of even modest alternatives.

Alex

"David P. Reed" wrote:
> 
> Alex - your note reflects a tremendous misunderstanding of TCP.   Is TCP
> supposed to correct for failing hardware anywhere on the path?   The answer
> is no.   TCP is a protocol that provides end-to-end error control - which
> ensures error-freeness over best efforts networks.
> 
> Does TCP obviate the need for local error control, because it does it on an
> end-to-end basis?   No - it was never supposed to.
> 
> The end-to-end analysis applies here:
> 
> 1. end-to-end reliability cannot be provided at the link level.   Thus we
> must provide it on an end-to-end basis.
> 
> 2. there is vast improvement in the operating point that can be achieved by
> doing local error recovery at the link (or within-AS) level - local error
> recovery allows for tighter control loops, and should be done without
> adding to the end-to-end delay.   Thus it is appropriate to optimize
> (improve) the performance of links by retransmission.
> 
> The second point also captures what Bob Kahn crystallized in creating the
> Internet - the concept of "best efforts".   The word "best" clearly does
> not mean "no effort".   What it means is a subset of the end-to-end
> argument - do what you can where what you do is unambiguously helpful, but
> don't take on the impossible burden of assuring high-level properties with
> low-level mechanisms.
> 
> The worst botch I have ever seen in my consulting to commercial network
> installations was a Fortune 500 company that really misunderstood
> this.   They had been convinced to put in frame relay links between all
> their sites, and to use frame relay's "perfect end-to-end" delivery mode
> between their locations.    That's not a "best efforts" link if you think
> about it - it's a stranded soldier maintaining fanatical adherence to duty
> 20 years after the war is over.
> 
> What happened?   If any link downstream failed (turned off), the frame
> relay link started filling buffers in every underlying switch.   It took
> many seconds to fill up, then when the downstream link came back up, it
> dumped many seconds' worth of completely useless traffic into the destinations.
> 
> The frame-relay sales engineer just could not understand why turning off
> his low-level reliability made his customer happier.  In fact, he kept
> trying to get them to turn it back on - saying that the problem must have
> been with the routers.
> 
> Ultimately, this is the all-too-human problem of perseverating based on an
> incorrect theory of the world.   There's nothing wrong with theories, but
> their utility depends on matching their assumptions to reality.   The
> reality of the Internet is not the reality of traditional control theory.
> 
> Control theory
>   - in the presence of competing and evolving goals at the user level (no
>     single objective function to maximize, but instead a need to develop the
>     most flexibility - that is, the most diverse set of stable operating
>     points in control-theoretic terms), and
>   - in the presence of highly coupled interactions with the clients (the WWW
>     invented caching, which changed the operating point in a completely
>     unpredictable way, without consulting the network planners), and
>   - in the presence of an evolving set of underlying communications
>     technologies
> 
> is now a new science.  This is partly because of people like John Doyle and
> Sally Floyd who took on the challenge of constructing a new control theory
> to match the requirements of the Internet.   Yes, everyone involved in
> developing TCP knows control theory.  But few of them have the illusion
> that the world exists to fit that theory.

