[e2e] a new IRTF group on Transport Models

Cannara cannara at attglobal.net
Tue Jun 7 11:11:23 PDT 2005


The archives for this list, over the past few years, should have many relevant
emails on this general topic of Internet Dysfunction.  To save space, I'll try
to add to Ted's and Frank's msgs in this one email.

1) Since no effort (in shipped code) was ever made to distinguish the real
cause of a missing TCP packet, a loss may be due to something we'd want to
react to (congestion, or a resource shortage, somewhere along the path), or to
something that slowing down -- at either the network or transport layer --
cannot fix.  Using TCP as the Internet's congestion avoider then makes even
less sense, because (apart from ICMP Source Quench) no one ever planned to
tell it, say from IP, that it should slow down for the right reason.  An
application slows down because the last buffer it gave to the transport hasn't
been released yet, so it waits.  A network-layer process slows down because
network management has told it to, and that mgmnt knows exactly why.
Something as simple as an Ethernet driver encountering many collisions will
tell the network layer above it to stop, again by buffer management or by an
explicit mgmnt msg, as when all 15 retries on a collision have failed.
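
To make the point concrete, here's a toy sketch (Python, purely illustrative --
nothing like this exists in shipped stacks; the LossCause values and the hook
that would report them are hypothetical) of how a sender could react if the
layer below told it *why* a segment was lost, instead of inferring congestion
from every loss:

    # Toy sketch: how a sender might react if loss cause were reported.
    # The LossCause values and the reporting hook are hypothetical.
    from enum import Enum

    class LossCause(Enum):
        CONGESTION = 1   # queue overflow somewhere along the path
        CORRUPTION = 2   # bad CRC on a failing link
        RESOURCE   = 3   # device out of buffers or table entries
        UNKNOWN    = 4   # today's TCP: every loss looks like this

    def react_to_loss(cause, cwnd, mss):
        if cause == LossCause.CONGESTION:
            return max(cwnd // 2, 2 * mss)   # back off: the path really is full
        if cause == LossCause.CORRUPTION:
            return cwnd                      # retransmit, but don't slow down
        if cause == LossCause.RESOURCE:
            return max(cwnd - mss, 2 * mss)  # ease off additively, not by half
        return max(cwnd // 2, 2 * mss)       # unknown: stay conservative, as now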

2) Programmers for all manner of packet-handling devices (routers, bridges,
NATs, firewalls...) send packets they can't handle to a drop queue and/or use
the incoming interface's ability to slow the sender (pause frames, not-ready
signals...).  Frames dropped that way can indeed be classed as congestive
losses along the path.  Since the switching systems know internally where and
when this happens, the mgmnt info is available -- but it was never mapped into
TCP/IP, whether datagram or reliable services were the ones affected.  Yes,
there's been talk of ECN, but it is not complete in its description of loss
cause.  So the packet-based Internet is shortchanged of info that could
improve performance.  And we all know that performance is needed for important
functions, like mgmnt/security.
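
For what it's worth, the whole ECN signal is two bits in the IP header -- four
codepoints, only one of which ("Congestion Experienced") reports trouble at
all, and it says nothing about corruption or resource exhaustion.  A quick
sketch of the codepoints (values per RFC 3168):

    # The ECN field: 2 bits in the IP header (RFC 3168).
    ECN_CODEPOINTS = {
        0b00: "Not-ECT (endpoints not ECN-capable)",
        0b10: "ECT(0)  (ECN-capable transport)",
        0b01: "ECT(1)  (ECN-capable transport)",
        0b11: "CE      (Congestion Experienced -- the only 'trouble' signal)",
    }
    # There is no codepoint for "this link is corrupting frames" or
    # "this device ran out of buffers/table space" -- loss cause stays opaque.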

3) Hardware fails.  When a switch backpressures a node to shut up by faking a
collision, that node indeed shuts up, but only for the period the switch needs
to recover, unless either end has a failure.  When a dedicated link on a path
starts failing and errors occur, losses start out small, but TCP performance
tanks, because TCP is ignorant of the reason for the loss (i.e., it's not
congestion, so it should keep going).  So one bad CRC in 100 pkts can bring
today's TCP performance down by more than a factor of 10 on common paths.
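
A back-of-the-envelope check with the well-known Mathis et al. estimate,
throughput ~ (MSS/RTT) * 1.22/sqrt(p), bears that factor of 10 out: raising
the loss rate from 1 in 10,000 to 1 in 100 cuts the estimate by sqrt(100) = 10,
whether or not the losses had anything to do with congestion (the numbers
below are only illustrative):

    # Mathis et al. steady-state TCP throughput estimate (illustrative numbers).
    from math import sqrt

    def mathis_bw(mss_bytes, rtt_s, loss_rate):
        return (mss_bytes * 8 / rtt_s) * 1.22 / sqrt(loss_rate)   # bits/sec

    mss, rtt = 1460, 0.050                 # 1460-byte segments, 50 ms RTT
    clean = mathis_bw(mss, rtt, 1e-4)      # ~28 Mb/s at 1 loss per 10,000 pkts
    dirty = mathis_bw(mss, rtt, 1e-2)      # ~2.8 Mb/s at 1 bad CRC per 100 pkts
    print(f"clean: {clean/1e6:.1f} Mb/s, dirty: {dirty/1e6:.1f} Mb/s, "
          f"ratio: {clean/dirty:.0f}x")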

4) Default installs are often suboptimal, and not in subtle ways.  Default
timers, in particular, can easily hurt performance in TCP flows -- consider
the Delayed-ACK Timer, or the Max Send Window.  This becomes more of an issue
as network data rates increase while files are still passed in smallish
blocks, as before.  See what impact a 100 ms Delayed-ACK Timer has on a large
file transfer over a 100 Mb/s path, when each block fits in an odd number of
TCP packets.
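
A rough worked example, assuming the sending application stalls on each
block's final ACK (as many block-at-a-time transfer programs do) and the
receiver holds that ACK for its full delayed-ACK interval; all the numbers
here are illustrative assumptions:

    # Rough delayed-ACK stall estimate; all figures are illustrative.
    file_bytes  = 1_000_000_000      # ~1 GB transfer
    block_bytes = 7 * 1460           # each block ends on an odd-numbered segment
    delayed_ack = 0.100              # 100 ms delayed-ACK timer at the receiver

    blocks     = file_bytes // block_bytes
    wire_time  = file_bytes * 8 / 100e6     # serialization at 100 Mb/s: ~80 s
    stall_time = blocks * delayed_ack       # one delayed ACK per block: ~9,800 s

    print(f"wire time ~{wire_time:.0f} s, delayed-ACK stalls ~{stall_time:.0f} s")

Even if the receiver holds the odd final segment's ACK for only part of that
interval, the stalls still swamp the 80 seconds the wire actually needs.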

5) The problem is not just with TCP being made the Jedi of congestion
management.  It's a combination of many Internet design shortcomings that have
long needed attention.  Firewalls, IDSs and NAT boxes, for instance, would not
be so needed if we had indeed modelled the Internet on secure,
access-controlled systems and dealt sooner with its always-ancient
addressing.  One of the reasons packets can be lost in so many
resource-limited systems along a path is that no consideration of true
security ever made it into the design.  It's no accident that the military
nets were kept physically separate.  It's no accident that VPNs are de
rigueur.

6) There is no concept of source & release control in the Internet.  That's
why, for instance, folks like uSoft can ship an OS that gives its box a new IP
address if the Ethernet card has its wire out for 10 secs, so any pkts on
their way to it now get dropped.  Or why a vendor allows port-based
(active-mode) FTP inside some apps, even though its insane violation of 3
layers causes connection failures through NAT boxes, etc.
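
Concretely, active-mode FTP writes the client's own IP address and data port
into an application-layer command -- that's the layer violation.  Behind a NAT
the server is told to open its data connection to a private address it cannot
reach, so the transfer fails unless the NAT box inspects and rewrites the FTP
payload for it.  A sketch (the addresses and port are just examples):

    # Active-mode FTP "PORT" command: the client's IP and data port go into
    # the application payload (RFC 959's h1,h2,h3,h4,p1,p2 form).
    def ftp_port_command(ip, port):
        h1, h2, h3, h4 = ip.split(".")
        return f"PORT {h1},{h2},{h3},{h4},{port // 256},{port % 256}"

    # A client behind a NAT advertises its *private* address:
    print(ftp_port_command("192.168.1.23", 50012))
    # -> "PORT 192,168,1,23,195,92"
    # The server then tries to connect to 192.168.1.23:50012, which is
    # unreachable from outside the NAT.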

7) And, of course, lucky 7 -- access control and unique addressing.  Maybe
freedom from Big Brother was the original motivation for "anyone, anywhere,
with any IP address" (until we exhausted them), but now millions of junk
packets are delivered to us every second, with no way of tracking their actual
sources and no way of preventing them from hogging links, unless we spend
billions on all manner of security, anti-spam, yadda-yadda gear and software
that has never been required for our other communication media.

Packet nets depend on random holes, into which a sender can often inject a
packet at will -- essentially "on demand" -- when loads are only modest.
Synchronous nets always have bit slots moving, but allocate exactly what they
can fill and no more.  Mgmnt has always been as important as data in such
nets.  Not so with the Internet protocols.  Same with access control &
security.  Imagine deploying a mail system around the world whose services are
gained by sending "HELO" (or "EHLO", or...) in plain text to establish version
and connection, for something as important as passing private information.
Years ago SMTP was a joke.  Some poor kid even got nailed by the feds for
showing how stupidly it was designed, by executing code at a far system via a
'feature' built into the mail protocols.
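
For anyone who hasn't watched it, the greeting really is that bare.  A minimal
sketch (the host name is a placeholder; this only shows the plaintext greeting
and then hangs up):

    # Minimal sketch of the plaintext SMTP greeting (host is a placeholder).
    import socket

    s = socket.create_connection(("mail.example.com", 25), timeout=10)
    print(s.recv(1024).decode())               # "220 ..." banner, in the clear
    s.sendall(b"EHLO client.example.com\r\n")
    print(s.recv(1024).decode())               # capability list, also in the clear
    s.sendall(b"QUIT\r\n")
    s.close()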

The problem with the Internet is that it is a mess and tweaking a protocol
ain't gonna fix anything.  The shame is that we all paid for it in taxes, we
all ended up with it because of the market subsidy it has enjoyed, and we will
continue to pay, some more than others, until someone is encouraged to rethink
its premises.  Right now, all that seems to happen is that folks with sensible
ideas don't get anywhere, which is exactly the property of an effective
bureaucracy -- one that lost sight of its original purpose and now just
persists for its own benefit.

My suggestion is simple, and I'd be happy if my taxes and contributions to
alma maters helped the research effort to:  a) take a small, dedicated, 
non-IETF group at a small school and charge them with addressing the basic
problems in the Internet (access & address security, network path mgmnt, layer
performance, inter-layer comm...); b) tell them no Internet compatibility is
required above DLC and up to the app layer; c) have them implement a
demonstration campus net (with a few routed remote sites) with only the new
protocols installed; d) provide a gateway (in the true sense of the word) to
the Internet; and e) establish a center for open-source control and release
mgmnt.  This would be a research effort that many could benefit from, that
many good master's theses would arise from, and that, like most small-group
efforts, would result in a good product.  Deployment from there would lead to further
fundable research.  The big problem for any such effort is the lack of the
implicit subsidy given TCP/IP over the years by its free distribution &
inclusion in OSs shipped by everyone from AT&T, Sun, HP, uSoft, Linux...  So,
the inclusion of the new stack in Linux would be an essential task.  In other
words, the competitive playing field would have to be levelled, as it never
was for TCP/IP vs Novell, etc.  Then, competitive results would speak for
themselves, and we might pass through the Age of Spam and the Valley of the
Shadow of Identity Theft more safely.

Alex

Saverio Mascolo wrote:
> 
> >
> > I was part of a team that looked at the particular problem of distinguishing
> > packet drop cause in detail recently.  See, for instance,
> >
> >     http://www.ir.bbn.com/documents/articles/krishnan_cn04.pdf
> >
> > You don't get as much leverage as you'd hope from knowing the cause of
> > packet drops.
> 
> also because the link layer should be well designed so that losses should be
> due only (or mainly) to congestion.
> 
> Saverio


