From simon at limmat.switch.ch  Thu Mar  1 07:09:42 2001
From: simon at limmat.switch.ch (Simon Leinen)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
In-Reply-To: <20010228164710.G51394@ted.isi.edu>
References: <13695.982276825@dstc.edu.au> <aa66hueboy.fsf@limmat.switch.ch>
	<20010228164710.G51394@ted.isi.edu>
Message-ID: <aaofvl4jbd.fsf@limmat.switch.ch>

>>>>> "tf" == Ted Faber <faber@ISI.EDU> writes:
>> [...]
>> TCP           532400385076 95.81 %
>> UDP            21201575665  3.82 %

> Bytes or packets?  Does the other unit support your conclusion, too?

Those were bytes.  Here's a summary from the same transatlantic links
(Eastbound direction only) which counts flows and packets too (new
version of script appended):

protocol.......flows..............packets...............bytes.........
GRE         7071 ( 0.01 %)    268698 ( 0.02 %)     213346212 ( 0.04 %)
ICMP     3473563 ( 6.09 %)  10420689 ( 0.94 %)    1083751656 ( 0.20 %)
IGMP           4 ( 0.00 %)         8 ( 0.00 %)          7264 ( 0.00 %)
IP         11604 ( 0.02 %)   3724884 ( 0.34 %)     763601220 ( 0.14 %)
IPINIP      4716 ( 0.01 %)     14148 ( 0.00 %)       2589084 ( 0.00 %)
TCP     35155287 (61.62 %) 942530269 (85.00 %)  532400385076 (95.81 %)
UDP     18399711 (32.25 %) 151843881 (13.69 %)   21201575665 ( 3.82 %)

So in terms of number of flows and packets, this particular
transatlantic link does indeed have a higher share of UDP than what
the 1998 "beast" paper observed.  Whether this is a general trend, or
just due to different usage patterns between our link and the links
observed by CAIDA, I don't know.  Actually I'd be glad if people could
run the script on other routers which aggregate large numbers of users
(especially non-academic users) and tell me whether the results are
wildly different or wildly similar.

For a comparison, here are the totals from an access router at a
random university:

protocol.........flows.................packets...................bytes.........
GRE            383 ( 0.00 %)       17235 ( 0.00 %)        3602115 ( 0.00 %)
ICMP     101931237 ( 1.75 %)   305793711 ( 0.45 %)    37918420164 ( 0.11 %)
IGMP         34662 ( 0.00 %)      901212 ( 0.00 %)       58578780 ( 0.00 %)
IP         1406788 ( 0.02 %)    15474668 ( 0.02 %)     3528224304 ( 0.01 %)
IPINIP        1297 ( 0.00 %)        1297 ( 0.00 %)         583650 ( 0.00 %)
TCP     4361852662 (74.91 %) 63919315234 (93.39 %) 32455859980970 (96.82 %)
UDP     1357265629 (23.31 %)  4201556174 ( 6.14 %)  1025284993546 ( 3.06 %)

This matches the CAIDA values much better.  Maybe the transatlantic
figures are biased because we run authoritative name servers for some
ccTLDs - those can be expected to generate lots and lots of
single-packet UDP port 53 flows.

Note also that the "flow" concept used in the CAIDA work isn't exactly
the same as Cisco NetFlow's, although the numbers may still be
comparable.  In case someone wants to know, we use the default flow
timeout values (30 minutes maximum lifetime or 1 minute(??) maximum
inactivity).
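
A rough sketch of how timeout-based flow accounting splits traffic into
records may help when comparing flow counts between studies.  It is
written in Python rather than the Perl of the script below, and it is
not NetFlow's actual implementation: the record layout, the function
name and the expire-on-next-packet shortcut are assumptions made for
illustration.  The two timeout values are the ones quoted above, and
the inactivity value is the one marked as uncertain there.

```python
from dataclasses import dataclass

ACTIVE_TIMEOUT = 30 * 60   # seconds; "30 minutes maximum lifetime" as quoted
INACTIVE_TIMEOUT = 60      # seconds; flagged as uncertain in the post


@dataclass
class FlowRecord:
    key: tuple          # hypothetical key: (src, dst, proto, sport, dport)
    first: float        # timestamp of the first packet in the record
    last: float         # timestamp of the most recent packet
    packets: int = 0
    octets: int = 0


def export_flows(packets, active=ACTIVE_TIMEOUT, inactive=INACTIVE_TIMEOUT):
    """Group (timestamp, key, size) tuples, sorted by time, into records.

    A record is closed when its key has been idle for more than
    `inactive` seconds or has existed for at least `active` seconds;
    the next packet with the same key then opens a new record.
    """
    open_flows = {}
    exported = []
    for ts, key, size in packets:
        rec = open_flows.get(key)
        if rec is not None and (ts - rec.last > inactive
                                or ts - rec.first >= active):
            exported.append(rec)      # timed out: emit and start afresh
            rec = None
        if rec is None:
            rec = FlowRecord(key, first=ts, last=ts)
            open_flows[key] = rec
        rec.last = ts
        rec.packets += 1
        rec.octets += size
    exported.extend(open_flows.values())   # flush whatever is still open
    return exported
```

Under these assumptions a single DNS query and a long bulk transfer
each contribute at least one record, and a transfer that outlives the
lifetime limit contributes several, which is one reason flow shares
and byte shares can diverge so much.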
-- 
Simon.

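#!/usr/local/bin/perl -w
# Sum flows, packets and bytes per IP protocol from per-protocol flow
# statistics read from stdin or from the files named on the command line.
use strict;

my (%tb,%tp,%tf); my ($tb,$tp,$tf) = (0,0,0);
while (<>) {
    # Columns used: protocol, flows, flows/sec, packets/flow, bytes/packet.
    my ($proto,$f,$fps,$ppf,$bpp) = split;
    # Skip headers and any line without a numeric flow count.
    next unless defined $bpp && $proto =~ /^[A-Z]/ && $f =~ /^[0-9]+$/;
    next if $proto eq 'Total:';
    $proto =~ s/-.*//;		# fold subtypes such as TCP-WWW into TCP
    # Packet and byte totals are reconstructed from the per-flow averages.
    # ($bytes rather than $b, which is reserved for use by sort.)
    my $p = $f * $ppf; my $bytes = $p * $bpp;
    $tf{$proto} += $f; $tf += $f;
    $tb{$proto} += $bytes; $tb += $bytes;
    $tp{$proto} += $p; $tp += $p;
}
print "protocol.......flows..............packets...............bytes.........\n";
foreach (sort keys %tb) {
  printf "%-7s %8.0f (%5.2f %%) %9.0f (%5.2f %%) %13.0f (%5.2f %%)\n",
	  $_,
	  $tf{$_}, 100.0 * $tf{$_} / $tf,
	  $tp{$_}, 100.0 * $tp{$_} / $tp,
	  $tb{$_}, 100.0 * $tb{$_} / $tb;
}
1;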

From simon at limmat.switch.ch  Thu Mar  1 07:32:23 2001
From: simon at limmat.switch.ch (Simon Leinen)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
In-Reply-To: <4591.983412132@dstc.edu.au>
References: <4591.983412132@dstc.edu.au>
Message-ID: <aaitlt4i9k.fsf@limmat.switch.ch>

>>>>> "gm" == George Michaelson <ggm@dstc.edu.au> writes:
> Wow. Completely proved me wrong.

Well... other networks may be different.  For example, we used to
charge the universities by volume (for transatlantic traffic we still
do), so that certainly has some influence on our usage.

>> 98 Also predates an explosion in IP-in-IP and other encapsulated
>> flows (VPNs, IPSEC, PPPoE) so I'd be willing to hazard there are
>> more fragmented flows than shown there.

> And I look to be wrong on IP-in-IP as well. 

That may also be different for other ISPs.  Our university users don't
do much VPN (yet?).

>   It also predates the explosion of applications such as Napster and
>   Gnutella (which both run over TCP), whose traffic volume dwarfs
>   that of all UDP traffic (at least on our network).

> The application mix that makes TCP predominate.. I didn't expect
> that. I had assumed like FSP these things used UDP layering.

You underestimate people's ability to learn from past mistakes
(-: although TCP is used for concerns other than TCP-friendliness).

> The UDP is going to be NTP and DNS?

Re-running my script with a tiny change:

protocol.............flows..............packets...............bytes.........
GRE               7071 ( 0.01 %)    268698 ( 0.02 %)     213346212 ( 0.04 %)
ICMP           3473563 ( 6.09 %)  10420689 ( 0.94 %)    1083751656 ( 0.20 %)
IGMP                 4 ( 0.00 %)         8 ( 0.00 %)          7264 ( 0.00 %)
IP-other         11604 ( 0.02 %)   3724884 ( 0.34 %)     763601220 ( 0.14 %)
IPINIP            4716 ( 0.01 %)     14148 ( 0.00 %)       2589084 ( 0.00 %)
TCP-BGP         154665 ( 0.27 %)    154665 ( 0.01 %)      10053225 ( 0.00 %)
TCP-FTP        1478600 ( 2.59 %)   7393000 ( 0.67 %)    2336188000 ( 0.42 %)
TCP-FTPD        161850 ( 0.28 %)  69433650 ( 6.26 %)   49783927050 ( 8.96 %)
TCP-Frag           285 ( 0.00 %)      3420 ( 0.00 %)        413820 ( 0.00 %)
TCP-NNTP         70222 ( 0.12 %) 113338308 (10.22 %)   21307601904 ( 3.83 %)
TCP-SMTP        968681 ( 1.70 %)  17436258 ( 1.57 %)    7689389778 ( 1.38 %)
TCP-Telnet       75043 ( 0.13 %)   1876075 ( 0.17 %)     324560975 ( 0.06 %)
TCP-WWW       24258155 (42.52 %) 315356015 (28.44 %)  235255587190 (42.34 %)
TCP-X             6509 ( 0.01 %)   2512474 ( 0.23 %)     293959458 ( 0.05 %)
TCP-other      7981277 (13.99 %) 415026404 (37.43 %)  215398703676 (38.76 %)
UDP-DNS        8185178 (14.35 %)  16370356 ( 1.48 %)    2111775924 ( 0.38 %)
UDP-Frag           444 ( 0.00 %)   1053168 ( 0.09 %)     767759472 ( 0.14 %)
UDP-NTP        3676906 ( 6.44 %)   3676906 ( 0.33 %)     279444856 ( 0.05 %)
UDP-TFTP            11 ( 0.00 %)        11 ( 0.00 %)           693 ( 0.00 %)
UDP-other      6537172 (11.46 %) 130743440 (11.79 %)   18042594720 ( 3.25 %)

So NTP is marginal in terms of traffic, DNS too (although not in terms
of number of flows).  The bulk of UDP *bytes* does in fact come from
"UDP-other" - I can think of audio/video streaming and gaming,
although the latter may be insignificant on a transatlantic link.
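
The per-flow picture behind that summary can be read straight off the
table.  The following sketch (Python, not the Perl of the original
script; the function and variable names are made up for illustration)
divides the UDP rows through to get packets per flow and bytes per
packet.

```python
# (flows, packets, bytes) copied from the UDP rows of the table above
UDP_ROWS = {
    "UDP-DNS":   (8185178, 16370356, 2111775924),
    "UDP-NTP":   (3676906, 3676906, 279444856),
    "UDP-other": (6537172, 130743440, 18042594720),
}


def per_flow_stats(flows, packets, octets):
    """Return (packets per flow, bytes per packet) for one table row."""
    return packets / flows, octets / packets


for name, row in UDP_ROWS.items():
    ppf, bpp = per_flow_stats(*row)
    print(f"{name:10s} {ppf:6.1f} packets/flow {bpp:7.1f} bytes/packet")
```

DNS works out to 2 packets per flow at 129 bytes per packet, NTP to 1
packet per flow at 76 bytes, and UDP-other to 20 packets per flow at
138 bytes.  The ratios are whole numbers presumably because the script
rebuilds packet and byte totals from the router's rounded averages.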

> Are the ssh tunnels looking like TCP and so IPSEC/ip-in-ip doesn't
> figure because grassroots, people use applications tunnels instead?

Maybe.  We definitely have customers who use IPSEC for VPN
applications (probably showing up in the "IP-other" category), but I
don't know whether they do this transatlantically, and our users may
do this less than users of commercial networks(?).
-- 
Simon.

From floyd at aciri.org  Fri Mar  2 09:12:11 2001
From: floyd at aciri.org (Sally Floyd)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...] 
Message-ID: <200103021712.f22HCB567380@elk.aciri.org>

>So the 95% figure for TCP still looks reasonable in 2001, at least for
>that particular link 

Many thanks for posting your results.  I have added a pointer to them
on a web page on "Measurement Studies of End-to-End Congestion Control
in the Internet", at "http://www.aciri.org/floyd/ccmeasure.html",
where we are trying to track information from measurement studies about
how end-to-end congestion control is actually doing in the Internet.

I would be particularly interested if anyone's measurements ever
indicated a surge of non-congestion-controlled traffic in the
Internet...

- Sally
--------------------------------
http://www.aciri.org/floyd/
--------------------------------

From kjc at csl.sony.co.jp  Fri Mar  2 10:25:35 2001
From: kjc at csl.sony.co.jp (Kenjiro Cho)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
 generated...] 
In-Reply-To: <200103021712.f22HCB567380@elk.aciri.org>
References: <200103021712.f22HCB567380@elk.aciri.org>
Message-ID: <20010303032535C.kjc@csl.sony.co.jp>

Sally Floyd wrote:
> >So the 95% figure for TCP still looks reasonable in 2001, at least for
> >that particular link 
> 
> Many thanks for posting your results.  I have added a pointer to them
> on a web page on "Measurement Studies of End-to-End Congestion Control
> in the Internet", at "http://www.aciri.org/floyd/ccmeasure.html",
> where we are trying to track information from measurement studies about
> how end-to-end congestion control is actually doing in the Internet.

We are maintaining trans-Pacific packet traces along with their
summary info taken from the WIDE project backbone at
http://tracer.csl.sony.co.jp/mawi/
(note that addresses are scrambled in tcpdump binary outputs.)

> I would be particularly interested if anyone's measurements ever
> indicated a surge of non-congestion-controlled traffic in the
> Internet...

Our data also confirms that TCP is still more than 90% of the traffic
under normal situations.
But, unfortunately, unusual traffic patterns do happen these days.
For example,
http://tracer.csl.sony.co.jp/mawi/samplepoint-A/2000/200006171359.html
http://tracer.csl.sony.co.jp/mawi/samplepoint-A/2000/200006181359.html

-Kenjiro

From dpreed at reed.com  Fri Mar  2 11:02:32 2001
From: dpreed at reed.com (David P. Reed)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
  generated...] 
In-Reply-To: <200103021712.f22HCB567380@elk.aciri.org>
Message-ID: <5.0.2.1.2.20010302135526.03130a60@mail.reed.com>

At 09:12 AM 3/2/01 -0800, Sally Floyd wrote:
>I would be particularly interested if anyone's measurements ever
>indicated a surge of non-congestion-controlled traffic in the
>Internet...


Good idea, but I'd caution people to observe that non-TCP traffic is still 
capable of congestion control.  For example, one can do streaming media 
over UDP with congestion control - the same signals (lost packets, RED, and 
ECN) can be used to reflect congestion to the endpoints and implement a 
closed-loop adaptive solution (for video, lowering frame rate, and 
prioritizing audio, for example).

So the actual detection and measurement of "non-congestion-controlled" 
traffic flows is an end-to-end issue.  It isn't strictly observable at 
a router, certainly not by just looking at protocol numbers.
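
As a minimal sketch of such a closed loop, assume the receiver reports
a loss (or ECN-mark) fraction once per interval and the sender adjusts
its media rate in response.  The Python below is illustrative only:
the function name, the thresholds and the rate limits are all made-up
values, not taken from any real streaming product.

```python
def adapt_rate(rate_kbps, loss_fraction, *, min_kbps=32.0, max_kbps=1024.0,
               increase_kbps=16.0, decrease_factor=0.5, loss_threshold=0.02):
    """One step of an additive-increase/multiplicative-decrease controller.

    rate_kbps      current sending rate
    loss_fraction  loss or ECN-mark fraction reported by the receiver
                   for the last reporting interval
    All defaults are hypothetical tuning values.
    """
    if loss_fraction > loss_threshold:
        rate_kbps *= decrease_factor      # congestion signal: back off
    else:
        rate_kbps += increase_kbps        # no congestion: probe for more
    # Keep the rate within what the codec can actually produce.
    return max(min_kbps, min(max_kbps, rate_kbps))


rate = 512.0
for loss in [0.0, 0.0, 0.10, 0.0, 0.25, 0.0]:
    rate = adapt_rate(rate, loss)
    print(f"reported loss {loss:4.2f} -> send at {rate:6.1f} kbit/s")
```

An application would map the resulting rate onto frame rate or codec
settings.  Nothing in this loop is visible in the IP protocol number,
which is the point being made above.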


- David
--------------------------------------------
WWW Page: http://www.reed.com/dpr.html


From simon at limmat.switch.ch  Fri Mar  2 12:48:03 2001
From: simon at limmat.switch.ch (Simon Leinen)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
In-Reply-To: <5.0.2.1.2.20010302135526.03130a60@mail.reed.com>
References: <5.0.2.1.2.20010302135526.03130a60@mail.reed.com>
Message-ID: <aa8zmn99to.fsf@limmat.switch.ch>

>>>>> "dpr" == David P Reed <dpreed@reed.com> writes:
> At 09:12 AM 3/2/01 -0800, Sally Floyd wrote:
>> I would be particularly interested if anyone's measurements ever
>> indicated a surge of non-congestion-controlled traffic in the
>> Internet...

> Good idea, but I'd caution people to observe that non-TCP traffic is
> still capable of congestion control.  For example, one can do
> streaming media over UDP with congestion control - the same signals
> (lost packets, RED, and ECN) can be used to reflect congestion to
> the endpoints and implement a closed-loop adaptive solution (for
> video, lowering frame rate, and prioritizing audio, for example).

...or giving up out of frustration, or getting kicked out of a game.

The thing that comes closest to being incapable of congestion control is
probably DNS (except zone transfers).  But in terms of bytes, DNS
makes up only ~0.3% of all traffic around here (even though we have a
couple of ccTLD servers on our network).

Unfortunately I cannot look at the "UDP-other" traffic (~90% of UDP
traffic or 2.7% of all bytes) very well.  I'd venture a guess that
most of this is RealMedia/QuickTime/Windows Media Player.  Those
should use fairly well-defined congestion control mechanisms.  Is
there any work on characterizing these kinds of transport protocols
with respect to their levels of "TCP-friendliness"?

> So the actual detection and measurement of "non-congestion-controlled"
> traffic flows is an end-to-end issue.  It isn't strictly observable at
> a router, certainly not by just looking at protocol numbers.

Absolutely,
-- 
Simon.

From floyd at aciri.org  Fri Mar  2 21:01:11 2001
From: floyd at aciri.org (Sally Floyd)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...] 
Message-ID: <200103030501.f2351B573554@elk.aciri.org>

>I'd venture a guess that
>most of this is RealMedia/QuickTime/Windows Media Player.  

The packet traces from the MAWI Working Group Traffic Archive at
"http://tracer.csl.sony.co.jp/mawi/" break down the udp traffic
into dns, rip, realaud, halflif, everque, quake, and other.  E.g.,
"http://tracer.csl.sony.co.jp/mawi/samplepoint-B/2001/200102251400.html".
For the days that I looked, the UDP traffic on this transoceanic
link was dominated by DNS, actually.  But maybe transoceanic links 
have different traffic mixes than other ones.

>Those
>should use fairly well-defined congestion control mechanisms.  Is
>there any work on characterizing these kinds of transport protocols
>with respect to their levels of "TCP-friendliness"?

We have just started to look at this, in addition to thinking some
about the potential fit of equation-based congestion control (e.g.,
TFRC) for these kinds of traffic.  It turns out that the deployment
of ECN in the Internet would add a new interest to some of these issues.
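
For readers unfamiliar with equation-based congestion control, here is
a sketch in Python.  It assumes the TCP throughput equation in the form
later published in the TFRC specification (RFC 3448, revised as RFC
5348); the message above does not give the equation.  The function
name and the example values are made up for illustration.

```python
from math import sqrt


def tcp_friendly_rate(s, rtt, p, b=1, t_rto=None):
    """Allowed sending rate in bytes per second.

    s      segment size in bytes
    rtt    round-trip time in seconds
    p      loss event rate, 0 < p <= 1
    b      packets acknowledged per ACK
    t_rto  retransmission timeout in seconds; defaults to 4 * rtt,
           the simplification suggested in the specification
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt
    denominator = (rtt * sqrt(2 * b * p / 3)
                   + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denominator


# Example: 1460-byte segments, 100 ms RTT, 1 % loss event rate.
print(f"{tcp_friendly_rate(1460, 0.1, 0.01):.0f} bytes/s")
```

The sender measures loss and RTT, plugs them into the equation, and
sends no faster than the result, which changes more smoothly than
TCP's window halving.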

- Sally
--------------------------------
http://www.aciri.org/floyd/
--------------------------------

From ehall at ehsco.com  Fri Mar  2 22:07:58 2001
From: ehall at ehsco.com (Eric A. Hall)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <200103030501.f2351B573554@elk.aciri.org>
Message-ID: <3AA08A3E.541D233@ehsco.com>

> For the days that I looked, the UDP traffic on this transoceanic
> link was dominated by DNS, actually.  But maybe transoceanic links
> have different traffic mixes than other ones.

People don't play action-oriented multi-player games over long-haul
networks. Shoot-em-up games are very sensitive to latency and packet loss.
Playing a shoot-em-up with >200ms RTT will get you killed fast by players
with <20ms (client-side events have to wait for server-side messages to
arrive so the "closer" player gets a distinct advantage in terms of
shorter inter-command gap). After a while, you learn to play on servers
that are close.

Anyway, trans-oceanic links are radically different in that regard. They
will always have lower gaming levels.

FWIW, network games are fascinating examples of interactive applications.
They are the new TELNET, except that they also have range issues that
TELNET didn't often encounter (similar to the annoying remote echo problem
but on a larger scale). Also, not all of these games use UDP. Many of them
are using TCP for a variety of familiar reasons.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

From akyol at pluris.com  Sun Mar  4 00:22:24 2001
From: akyol at pluris.com (Bora Akyol)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
 generated...] 
In-Reply-To: <200103030501.f2351B573554@elk.aciri.org>
Message-ID: <Pine.GSO.4.05.10103040018420.428-100000@volsung.pluris.com>

I would expect to see lots of content caching/distribution going on in the
transoceanic links such that multimedia traffic probably always gets
served from the nearest content server.

If that is the case, how is the content getting replicated to these
different continents? Do the traffic statistics over the transoceanic
links capture this replication or are these being beamed over satellite?

Thanks

Bora

On Fri, 2 Mar 2001, Sally Floyd wrote:

> >I'd venture a guess that
> >most of this is RealMedia/QuickTime/Windows Media Player.  
> 
> The packet traces from the MAWI Working Group Traffic Archive at
> "http://tracer.csl.sony.co.jp/mawi/" break down the udp traffic
> into dns, rip, realaud, halflif, everque, quake, and other.  E.g.,
> "http://tracer.csl.sony.co.jp/mawi/samplepoint-B/2001/200102251400.html".
> For the days that I looked, the UDP traffic on this transoceanic
> link was dominated by DNS, actually.  But maybe transoceanic links 
> have different traffic mixes than other ones.
> 
> >Those
> >should use fairly well-defined congestion control mechanisms.  Is
> >there any work on characterizing these kinds of transport protocols
> >with respect to their levels of "TCP-friendliness"?
> 
> We have just started to look at this, in addition to thinking some
> about the potential fit of equation-based congestion control (e.g.,
> TFRC) for these kinds of traffic.  It turns out that the deployment
> of ECN in the Internet would add a new interest to some of these issues.
> 
> - Sally
> --------------------------------
> http://www.aciri.org/floyd/
> --------------------------------
> 


From smd at ebone.net  Sun Mar  4 02:55:34 2001
From: smd at ebone.net (Sean Doran)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
Message-ID: <20010304105534.B8DE58A3@sean.ebone.net>

| I would expect to see lots of content caching/distribution going on in the
| transoceanic links such that multimedia traffic probably always gets
| served from the nearest content server.

I wouldn't.

| If that is the case, how is the content getting replicated to these
| different continents? Do the traffic statistics over the transoceanic
| links capture this replication or are these being beamed over satellite?

Some people (Yahoo, CNN, etc.) locate "european-flavour" servers in Europe,
"cantonese-style" servers in Hong Kong, and so forth.   Some people use 
Akamai and their competitors, which seem to be locating stuff in various
places around the world.  Most content just comes from wherever it happens
to be hosted, and often enough that's somewhere in California.  Works great.

	Sean.

From jstevenson at orblynx.com  Sun Mar  4 06:49:55 2001
From: jstevenson at orblynx.com (John Stevenson)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <20010304105534.B8DE58A3@sean.ebone.net>
Message-ID: <3AA25613.5678C303@orblynx.com>

<< Most content just comes from wherever it happens to be hosted, and often
enough that's somewhere in California.  Works great. >>

Not so if the client is in Indonesia, or eastern Europe, or Egypt, etc., a long
or an indirect link over any big cable.

And recently not quite so even in California (when the resulting brownouts kick
in).

John Stevenson


Sean Doran wrote:

> | I would expect to see lots of content caching/distribution going on in the
> | transoceanic links such that multimedia traffic probably always gets
> | served from the nearest content server.
>
> I wouldn't.
>
> | If that is the case, how is the content getting replicated to these
> | different continents? Do the traffic statistics over the transoceanic
> | links capture this replication or are these being beamed over satellite?
>
> Some people (Yahoo, CNN, etc.) locate "european-flavour" servers in Europe,
> "cantonese-style" servers in Hong Kong, and so forth.   Some people use
> Akamai and their competitors, which seem to be locating stuff in various
> places around the world.  Most content just comes from wherever it happens
> to be hosted, and often enough that's somewhere in California.  Works great.
>
>         Sean.
From cerpa at ISI.EDU  Sun Mar  4 21:13:02 2001
From: cerpa at ISI.EDU (Alberto Cerpa)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
 generated...]
In-Reply-To: <20010304105534.B8DE58A3@sean.ebone.net>
Message-ID: <Pine.GSO.4.21.0103042109570.24587-100000@boreas.isi.edu>

On Sun, 4 Mar 2001, Sean Doran wrote:
> | If that is the case, how is the content getting replicated to these
> | different continents? Do the traffic statistics over the transoceanic
> | links capture this replication or are these being beamed over satellite?
> 
> Some people (Yahoo, CNN, etc.) locate "european-flavour" servers in Europe,
> "cantonese-style" servers in Hong Kong, and so forth.   Some people use 
> Akamai and their competitors, which seem to be locating stuff in various
> places around the world.  Most content just comes from wherever it happens
> to be hosted, and often enough that's somewhere in California.  Works great.
> 

Do you have some measurements to back this up?  I would be really
interested to get any pointers to some data available confirming this.

Best regards,
-Al


> 	Sean.
> 


From T.Henderson at cs.ucl.ac.uk  Mon Mar  5 07:29:58 2001
From: T.Henderson at cs.ucl.ac.uk (Tristan Henderson)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
         generated...]
In-Reply-To: Message from "Eric A. Hall" <ehall@ehsco.com> of "Fri, 02 Mar 2001 22:07:58 PST." <3AA08A3E.541D233@ehsco.com>
Message-ID: <20010305153003.9C49337D3F@kylie.cs.ucl.ac.uk>

In message <3AA08A3E.541D233@ehsco.com>, "Eric A. Hall" said:
>
>People don't play action-oriented multi-player games over long-haul
>networks. Shoot-em-up games are very sensitive to latency and packet loss.
>Playing a shoot-em-up with >200ms RTT will get you killed fast by players
>with <20ms (client-side events have to wait for server-side messages to
>arrive so the "closer" player gets a distinct advantage in terms of
>shorter inter-command gap). After a while, you learn to play on servers
>that are close.
>

Do you have any data/stats to support these figures? I'm doing some analysis 
of shoot-em-up games and haven't been able to find anything authoritative 
about the maximum delays for networked games. I've seen figures of ~200ms 
being declared as the "maximum" delay before; e.g. a games designer says that 
they design for 200-300ms delays at
http://www.gamasutra.com/features/19970905/ng_01.htm.

OTOH, there are plenty of usenet postings from people playing with RTTs of 
300-1000ms, e.g.
http://groups.google.com/groups?hl=en&lr=&safe=off&ic=1&th=1daccce21a879875
http://groups.google.com/groups?hl=en&lr=&safe=off&ic=1&th=2cd5a305b3152d89
http://groups.google.com/groups?hl=en&lr=&safe=off&ic=1&th=41803c558ae2df07

(apologies if these google links don't work; I haven't quite got used to their 
usenet archive yet)

It would be useful to know the absolute highest delays that gamers can 
tolerate.
 
>
>FWIW, network games are fascinating examples of interactive applications.

I agree. I'm particularly interested in the multiuser aspects - for example, 
as you state, there are dynamics which may force users with similar network 
characteristics to congregate together. Alas, games seem to have been 
neglected by the networking research community, but hopefully that is changing.

Cheers,
Tristan


From smd at ebone.net  Mon Mar  5 08:10:48 2001
From: smd at ebone.net (Sean Doran)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
In-Reply-To: <Pine.GSO.4.21.0103042109570.24587-100000@boreas.isi.edu> (Alberto Cerpa's message of "Sun, 4 Mar 2001 21:13:02 -0800 (PST)")
References: <Pine.GSO.4.21.0103042109570.24587-100000@boreas.isi.edu>
Message-ID: <52d7bwgprr.fsf@sean.ebone.net>

Alberto Cerpa <cerpa@ISI.EDU> writes:

> Do you have some measurements to back this up?  I would be really
> interested to get any pointers to some data available confirming this.

What type of measurements would you be looking for?

My assertion of "works great" is based on some
measurements of RTT (absolute and variability) and loss that
various networks here and there use to construct their
SLAs.  In the little network in which I play, all approach
the minimum most of the time, even in outlying places like
Bratislava, Budapest, Bucharest, Prague and so on, that
are in the "Eastern Europe" someone suggested was badly
connected to the world.

Yes, TCP's ACK clocking means things farther away from
each other are at a disadvantage, but realistically, it
*is* only ~190ms from Romania to Southern California
(which seems like a reasonable selection of worst case,
topologically speaking).
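
To put a rough number on that disadvantage: a sender that keeps at most
one window of data in flight cannot exceed window/RTT.  The Python
sketch below assumes a 65535-byte window, the largest possible without
window scaling; the window size and the 20 ms comparison RTT are
assumptions for illustration, not measurements.

```python
def window_limited_throughput(window_bytes, rtt_seconds):
    """Upper bound, in bits per second, for one window-limited connection."""
    return window_bytes * 8 / rtt_seconds


for rtt_ms in (20, 190):
    bps = window_limited_throughput(65535, rtt_ms / 1000)
    print(f"{rtt_ms:4d} ms RTT: at most {bps / 1e6:5.2f} Mbit/s")
```

Under that assumption the ceiling is about 2.76 Mbit/s at 190 ms,
against about 26.2 Mbit/s at 20 ms.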

I am open to the argument that these are atypically good
numbers, and that performance to Eastern Europe (for
example) tends to be much worse.  In fact, I would love to
get such an argument into the hands of our sales people. :-) :-)

        Sean.
-- 
robuc101-ta#trace 
Protocol [ip]: 
Target IP address: cs.ucsd.edu
Source address: 213.174.64.13
Numeric display [n]: 
Timeout in seconds [3]: 
Probe count [3]: 10
Minimum Time to Live [1]: 
Maximum Time to Live [30]: 
Port Number [33434]: 
Loose, Strict, Record, Timestamp, Verbose[none]: 
Type escape sequence to abort.
Tracing the route to cs.ucsd.edu (132.239.51.18)
  1 atvie103-ta-s1-0.ebone.net (213.174.70.81) 20 msec 20 msec 12 msec 16 msec 12 msec 24 msec 16 msec 12 msec 16 msec 12 msec
  2 atvie101-tc-r6-0.ebone.net (195.158.245.49) 16 msec 20 msec 12 msec 16 msec 20 msec 20 msec 12 msec 16 msec 12 msec 16 msec
  3 czpra103-tc-p2-0.ebone.net (195.158.242.45) 20 msec 28 msec 28 msec 28 msec 20 msec 24 msec 24 msec 24 msec 20 msec 20 msec
  4 debln302-tc-p2-0.ebone.net (213.174.70.45) 32 msec 32 msec 28 msec 36 msec 32 msec 32 msec 36 msec 40 msec 32 msec 28 msec
  5 debln301-tc-p1-0.ebone.net (213.174.70.37) 32 msec 40 msec 28 msec 36 msec 36 msec 36 msec 32 msec 32 msec 32 msec 32 msec
  6 dedus206-tc-p6-0.ebone.net (213.174.70.41) 36 msec 44 msec 36 msec 40 msec 40 msec 36 msec 40 msec 40 msec 36 msec 36 msec
  7 dedus205-tc-p7-0.ebone.net (213.174.70.125) 40 msec 40 msec 40 msec 36 msec 36 msec 40 msec 40 msec 44 msec 48 msec 40 msec
  8 nlams303-tc-p2-0.ebone.net (213.174.70.134) 40 msec 40 msec 40 msec 40 msec 44 msec 40 msec 40 msec 40 msec 40 msec 44 msec
  9 bebru203-tc-p1-0.ebone.net (213.174.71.1) 44 msec 44 msec 44 msec 44 msec 44 msec 44 msec 44 msec 44 msec 40 msec 44 msec
 10 bebru204-tc-p2-0.ebone.net (195.158.225.82) 40 msec 44 msec 44 msec 44 msec 40 msec 48 msec 40 msec 48 msec 44 msec 48 msec
 11 gblon505-tc-p1-0.ebone.net (195.158.232.41) 48 msec 48 msec 48 msec 48 msec 52 msec 52 msec 48 msec 48 msec 48 msec 48 msec
 12 usnyk105-tc-p1-1.ebone.net (195.158.229.25) 116 msec 120 msec 116 msec 116 msec 120 msec 120 msec 116 msec 120 msec 120 msec 116 msec
 13 sl-bb11-nyc-5-3.sprintlink.net (144.232.9.229) [AS 1239] 116 msec 120 msec 116 msec 116 msec 120 msec 120 msec 116 msec 120 msec 124 msec 116 msec
 14 144.232.9.202 [AS 1239] 140 msec 116 msec 120 msec 116 msec 120 msec 116 msec 116 msec 120 msec 116 msec 116 msec
 15 pos3-0-622M.nyc-bb8.cerf.net (134.24.33.158) [AS 1740] 120 msec 120 msec 116 msec 120 msec 116 msec 120 msec 116 msec 116 msec 116 msec 120 msec
 16 so6-3-0-622M.chi-bb5.cerf.net (134.24.32.213) [AS 1740] 136 msec 136 msec 136 msec 136 msec 136 msec 140 msec 136 msec 140 msec 132 msec 148 msec
 17 pos2-0-622M.chi-bb3.cerf.net (134.24.33.197) [AS 1740] 136 msec 140 msec 136 msec 140 msec 136 msec 140 msec 140 msec 136 msec 140 msec 136 msec
 18 pos0-0-622M.sfo-bb4.cerf.net (134.24.46.58) [AS 1740] 192 msec 192 msec 188 msec 192 msec 192 msec 188 msec 188 msec 188 msec 188 msec 192 msec
 19 pos7-0-622M.sfo-bb3.cerf.net (134.24.32.78) [AS 1740] 196 msec 196 msec 196 msec 196 msec 196 msec 192 msec 196 msec 196 msec 200 msec 196 msec
 20 pos3-0-622M.lax-bb4.cerf.net (134.24.29.234) [AS 1740] 192 msec 188 msec 192 msec 192 msec 188 msec 192 msec 188 msec 192 msec 192 msec 192 msec
 21 so1-0-0-622M.lax-bb7.cerf.net (134.24.33.170) [AS 1740] 188 msec 192 msec 188 msec 192 msec 192 msec 192 msec 188 msec 192 msec 192 msec 188 msec
 22 so-6-0-0.san-bb4.cerf.net (134.24.29.13) [AS 1740] 196 msec 192 msec 196 msec 196 msec 192 msec 196 msec 196 msec 196 msec 196 msec 196 msec
 23 pos1-0-0-155M.san-bb1.cerf.net (134.24.29.190) [AS 1740] 196 msec 196 msec 196 msec 196 msec 204 msec 196 msec 196 msec 192 msec 444 msec * 
 24 sdsc-gw.san-bb1.cerf.net (134.24.12.26) [AS 1740] 204 msec 204 msec 200 msec 200 msec 204 msec 208 msec 200 msec 204 msec 208 msec 200 msec
 25 bigmama.ucsd.edu (192.12.207.5) [AS 195] 220 msec 220 msec 228 msec 220 msec 292 msec 236 msec 244 msec 248 msec 232 msec 224 msec
 26 cse-rs.ucsd.edu (132.239.254.45) [AS 7377] 224 msec 224 msec 244 msec 224 msec 224 msec 228 msec 228 msec 228 msec 224 msec 224 msec
 27 cs.ucsd.edu (132.239.51.18) [AS 7377] 216 msec *  212 msec *  212 msec *  212 msec *  208 msec * 
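
Ten probes per hop are easier to compare once reduced to a minimum and
a median.  The following Python sketch does that for hop lines in the
format shown above; the function name and the regular expressions are
assumptions fitted to this one trace, not a general traceroute parser.

```python
import re
from statistics import median

HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)")   # hop number, then host name
RTT_RE = re.compile(r"(\d+) msec")          # each answered probe


def parse_hop(line):
    """Return (hop, host, min_rtt_ms, median_rtt_ms, timeouts) or None.

    Lines that do not start with a hop number, or that contain no
    answered probe, yield None.  Timeouts are counted as '*' marks.
    """
    m = HOP_RE.match(line)
    if m is None:
        return None
    rtts = [int(x) for x in RTT_RE.findall(line)]
    if not rtts:
        return None
    return int(m.group(1)), m.group(2), min(rtts), median(rtts), line.count("*")


sample = (" 12 usnyk105-tc-p1-1.ebone.net (195.158.229.25) 116 msec "
          "120 msec 116 msec 116 msec 120 msec 120 msec 116 msec "
          "120 msec 120 msec 116 msec")
print(parse_hop(sample))
```

Applied to the whole trace, the minimum column makes the transatlantic
step at hop 12 and the cross-country step at hop 18 stand out.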

From smd at ebone.net  Mon Mar  5 08:25:41 2001
From: smd at ebone.net (Sean Doran)
Date: Thu Mar 25 11:59:32 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
Message-ID: <20010305162541.5E74F8A3@sean.ebone.net>

Mmmm, socio-psychology meets networking.  Always fun, never understood fully. :)

| It would be useful to know the absolute highest delays that gamers can 
| tolerate.

Surely this will be somewhat application-dependent?

However, there's probably some literature here and there about
human reflexes and how fast one needs a result back from a "twitch"
in order to feel reasonably interactive.   Probably very little
of that will focus on network impact.

| >FWIW, network games are fascinating examples of interactive applications.

They're also fun.  I've never been big into shoot-em-up games,
since building the Internet is faster and harder, but some friends
had me over to play Unreal Tournament with their clan the other week,
and my eyes were opened a bit.  UT in any event was more sensitive
to loss and "drop outs" than to stable delay -- for me, anyway, choppy
updates and missed action were more important and harder to compensate
for than aiming ahead along the direction the target is seen to be moving.

| I agree. I'm particularly interested in the multiuser aspects - for example, 
| as you state, there are dynamics which may force users with similar network 
| characteristics to congregate together.

It turns out that LAN parties are pretty common: people drive across
Europe to gather together around a hub or small switch, matching up
as teams in a series of competitions within a broader league.  

The social aspect, it turns out, is as important as the locality.
By analogy, although a good SLA can be gotten from a high-quality
Chinese Restaurant's delivery service, 15 people eating the same
stuff and communicating via a telephone bridge or IRC or whatever
is not as much fun as the same 15 people together in the restaurant,
even if the food is no better prepared or presented, and arrives
at the table no more quickly.

In the Internet space, we all know that there is a significant
value in the social aspect of IETF meetings, despite the formalization
of the mailing lists as being the places where real work happens.

| Alas, games seem to have been neglected by the networking
| research community, but hopefully that is changing.

Heh - well, they're sure popular among operators, at least those
on the operations front, as far as I can tell.  Perhaps that is
a reflection of a dichotomy between people who are reactive & practical
versus people who like to plan in advance and understand the theory
behind things.

	Sean.

From T.Henderson at cs.ucl.ac.uk  Mon Mar  5 09:54:38 2001
From: T.Henderson at cs.ucl.ac.uk (Tristan Henderson)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
         generated...]
In-Reply-To: Message from smd@ebone.net (Sean Doran) of "Mon, 05 Mar 2001 17:25:41 +0100." <20010305162541.5E74F8A3@sean.ebone.net>
Message-ID: <20010305175443.BF70537D3F@kylie.cs.ucl.ac.uk>

In message <20010305162541.5E74F8A3@sean.ebone.net>, Sean Doran said:
>Mmmm, socio-psychology meets networking.  Always fun, never understood fully. 
>:)
>
>| It would be useful to know the absolute highest delays that gamers can 
>| tolerate.
>
>Surely this will be somewhat application-dependent?
>
Yes, you'd expect (within networked games) that delay requirements would look 
like shoot-em-up < RPG < chess. It should be possible, however, to come up 
with some general figures, a G.114 equivalent for shoot-em-ups. I'd just like 
something more concrete than figures pulled out of a hat, so if anyone knows 
of any (reasonably) scientific studies please point me at them.

>However, there's probably some literature here and there about
>human reflexes and how fast one needs a result back from a "twitch"
>in order to feel reasonably interactive.   Probably very little
>of that will focus on network impact.
>
Precisely. There is stuff in the VR and physiology worlds about reflexes, but 
it's not clear that this applies to the Internet, where people seem to put up 
with a lot more than they'll admit to in a lab experiment.

>| >FWIW, network games are fascinating examples of interactive applications.
>
>They're also fun.  I've never been big into shoot-em-up games,
>since building the Internet is faster and harder, but some friends
>had me over to play Unreal Tournament with their clan the other week,
>and my eyes were opened a bit.  UT in any event was more sensitive
>to loss and "drop outs" than to stable delay -- for me, anyway, choppy
>updates and missed action were more important and harder to compensate
>for than aiming ahead along the direction the target is seen to be moving.
>
Interesting. I've been concentrating on Half-Life mainly (it seems to be the 
most widely-played game according to tracking sites such as  
http://www.theclq.com/games.asp) but I might have to give UT a go as well.

>| I agree. I'm particularly interested in the multiuser aspects - for example, 
>| as you state, there are dynamics which may force users with similar network 
>| characteristics to congregate together.
>
>It turns out that LAN parties are pretty common: people drive across
>Europe to gather together around a hub or small switch, matching up
>as teams in a series of competitions within a broader league.  
>
But this isn't always an option for geographically dispersed groups, so a lot 
of games server operators allow clans to book servers for private games. 
That's why I'd quite like to determine the QoS requirements for applications 
such as these; games players are already spending lots of money on their 
habit, so they should be quite receptive to paying for QoS.

>| Alas, games seem to have been neglected by the networking
>| research community, but hopefully that is changing.
>
>Heh - well, they're sure popular among operators, at least those
>on the operations front, as far as I can tell.  Perhaps that is
>a reflection of a dichotomy between people who are reactive & practical
>versus people who like to plan in advance and understand the theory
>behind things.
>
No comment :)

Cheers,
Tristan


From ehall at ehsco.com  Mon Mar  5 10:02:38 2001
From: ehall at ehsco.com (Eric A. Hall)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <20010305153003.9C49337D3F@kylie.cs.ucl.ac.uk>
Message-ID: <3AA3D4BE.212F190E@ehsco.com>

> Do you have any data/stats to support these figures?

Word-of-mouth, casual research. IE, asking the guy that killed me what his
ping is. Also, lots of player forums (newsgroups, message boards, etc). I
would agree that 200ms RTT seems to be about the max for combat.

> design for 200-300ms delays at http://www.gamasutra.com/features/19970905/
> ng_01.htm.

Interesting read. Thanks.

> OTOH, there are plenty of usenet postings from people playing with RTTs
> of 300-1000ms, e.g.

Well, not all of them are telling the truth. I'm not sure I'd believe the
boastings of nine-year olds in public forums.

But there is a lot of skill involved. There are people with 5ms RTT that
can't win no matter what, and there are people with 300ms RTT that win all
of the time.

Another issue here is that not all of the games are shooters. UO in
particular has a lot of social elements, and it doesn't require any combat
at all. A lot of the high-ping players naturally gravitate more towards
the role-playing or social elements instead of combat, particularly after
getting their clock cleaned consistently by low-pingers. I'm not saying
low RTTs are not important, I am saying that there are games which embrace
high-RTT players by offering non-combat activities, and this will likely
become more important over time.

> Alas, games seem to have been neglected by the networking research
> community, but hopefully that is changing.

It has gone both ways. Developers of new Internet-specific apps are not
coming here, either. But I agree that there is a growing separation
between the current Internet and the research community in general.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

From ehall at ehsco.com  Mon Mar  5 10:14:55 2001
From: ehall at ehsco.com (Eric A. Hall)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <20010305162541.5E74F8A3@sean.ebone.net>
Message-ID: <3AA3D79E.B3BBCD7A@ehsco.com>

> However, there's probably some literature here and there about
> human reflexes and how fast one needs a result back from a "twitch"
> in order to feel reasonably interactive.   Probably very little
> of that will focus on network impact.

I'm sure that has something to do with it but I don't think it's the
principal issue. I mean, it might be the primary factor when everybody is
on the same LAN, but when you're talking about cross-country RTTs it's not
the primary issue.

Command queueing is the problem. Longer RTTs mean larger gaps between
commands. This works both ways, in that movement and actions sent from the
client take longer to reach the server, but data coming from the server is
also rapidly becoming outdated by the time it reaches the client. This
puts high RTTs at a distinct disadvantage to low RTTs, regardless of the
player's twitch reflex capabilities.
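
A rough Python sketch of the argument above, under deliberately simple
assumptions that are not from the original message: a symmetric path, no
server tick quantization, and a made-up human reaction time and target
speed. It only shows how the RTT enters twice (stale data coming in, slow
commands going out) while the reaction time cancels when two players are
compared.

```python
def sight_to_effect_ms(rtt_ms: float, reaction_ms: float = 200.0) -> float:
    """Time from an event happening on the server until the player's
    response to it takes effect on the server."""
    one_way = rtt_ms / 2.0  # assumption: symmetric path
    # server -> client update, then human reaction, then client -> server command
    return one_way + reaction_ms + one_way


def stale_offset(speed_units_per_s: float, rtt_ms: float) -> float:
    """How far a moving target has travelled by the time the client
    displays the position the server sent."""
    return speed_units_per_s * (rtt_ms / 2.0) / 1000.0


# Hypothetical players: one near the server, one cross-country.
for rtt in (20, 200):
    print(rtt, sight_to_effect_ms(rtt), stale_offset(300.0, rtt))
```

Whatever reaction time is plugged in, the gap between the two players is
just the difference in their RTTs, which is why reflexes alone do not
close it.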

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

From jms at central.cis.upenn.edu  Mon Mar  5 10:35:29 2001
From: jms at central.cis.upenn.edu (Jonathan M. Smith)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...] 
In-Reply-To: Your message of "Mon, 05 Mar 2001 17:47:25 GMT."
             <Pine.GSO.4.21.0103051730530.9418-100000@regan.ee.surrey.ac.uk> 
Message-ID: <200103051835.f25IZTj27538@central.cis.upenn.edu>

There's a classic book which has a bunch of experiments on timing - very high
quality stuff. The authors are Card, Moran and Newell, and it's called "The
Psychology of Human-Computer Interaction". I looked at it when I was trying to 
understand how much queueing delays and jitter of other types "mattered".

							-JMS


From T.Henderson at cs.ucl.ac.uk  Mon Mar  5 11:39:22 2001
From: T.Henderson at cs.ucl.ac.uk (Tristan Henderson)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
         generated...]
In-Reply-To: Message from Lloyd Wood <l.wood@eim.surrey.ac.uk> of "Mon, 05 Mar 2001 19:17:32 GMT." <Pine.GSO.4.21.0103051910310.10370-100000@regan.ee.surrey.ac.uk>
Message-ID: <20010305193927.CFD4B37D3F@kylie.cs.ucl.ac.uk>

In message <Pine.GSO.4.21.0103051910310.10370-100000@regan.ee.surrey.ac.uk>, Lloyd Wood said:
>On Mon, 5 Mar 2001, Eric A. Hall wrote:
>
>> But there is a lot of skill involved. There are people with 5ms RTT that
>> can't win no matter what, and there are people with 300ms RTT that win all
>> of the time.
>
>hacking your copy of the  game for e.g. shooting accuracy has nothing
>to do with it. (apropos: there's a rant on security of multiplayer
>games under
>http://tuxedo.org/~esr/writings/quake-cheats.html
>) Deliberately compensating for lag in the game client in some
>similar way would be interesting.

Apparently the more delay-tolerant RPGs, e.g. Age of Empires and Warcraft, 
already do some compensation - they deliberately delay all interactions so 
that all players have similar delay. Not sure about shoot-em-ups though.
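
One way such equalization could work, sketched in Python. This is an
assumption for illustration only, not a description of what any of the
games named above actually do: each player is padded with enough
artificial delay to match the slowest player in the session. The function
name and the player names are hypothetical.

```python
def equalising_pads(rtt_ms_by_player):
    """Return the extra delay (ms) to add per player so that everyone
    experiences the delay of the slowest player."""
    if not rtt_ms_by_player:
        return {}
    target = max(rtt_ms_by_player.values())
    return {name: target - rtt for name, rtt in rtt_ms_by_player.items()}


pads = equalising_pads({"alice": 30, "bob": 120, "carol": 250})
print(pads)
```

The cost is obvious from the sketch: the whole session runs at the pace of
the worst connection, which may be tolerable for slower-paced games but
probably not for shoot-em-ups.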

Cheers,
Tristan


From touch at ISI.EDU  Mon Mar  5 11:47:13 2001
From: touch at ISI.EDU (Joe Touch)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <20010305153003.9C49337D3F@kylie.cs.ucl.ac.uk>
Message-ID: <3AA3ED41.E50A300F@isi.edu>


Tristan Henderson wrote:
> 
> In message <3AA08A3E.541D233@ehsco.com>, "Eric A. Hall" said:
> >
> >People don't play action-oriented multi-player games over long-haul
> >networks. Shoot-em-up games are very sensitive to latency and packet loss.
> >Playing a shoot-em-up with >200ms RTT will get you killed fast by players
> >with <20ms (client-side events have to wait for server-side messages to
> >arrive so the "closer" player gets a distinct advantage in terms of
> >shorter inter-command gap). After a while, you learn to play on servers
> >that are close.

As Sean indicated, these are application dependent.
More precisely, they depend on the level of 
predictability in the feedback system, and how
high in the human the processing occurs.

The most basic human feedback loops (single flashing light,
hit a switch) are in the 100 ms range. That means the
network portion must be in the 20ms range to be 'noise'
on the overall system delay. However, it gets longer
for things like "multiple lights, hit the switch only if
one of the lights is red". The response delay gets
larger the more complicated the task.
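
The arithmetic behind that budget, as a small Python sketch. The 20%
ratio is only inferred from the 100 ms and 20 ms figures above, and the
400 ms "complex task" value is a made-up example, not a measured one.

```python
def network_budget_ms(human_loop_ms: float, noise_percent: float = 20.0) -> float:
    """Network delay that stays 'in the noise' relative to the human
    feedback loop, taking the noise threshold as a percentage of it."""
    return human_loop_ms * noise_percent / 100.0


# Simple reflex loop (figure from the text) vs. a hypothetical slower task.
for loop_ms in (100.0, 400.0):
    print(loop_ms, network_budget_ms(loop_ms))
```

The slower the human part of the loop, the more network delay can hide
inside it.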

E.g., ask someone for a review of War and Peace,
and you're liable to be willing to wait a few days. :-)
It's all about expectations.

Figures in the 100-200ms range are for maximum auditory
delay for telephone echoes, and date back to the early
Bell Labs days.

> OTOH, there are plenty of usenet postings from people playing with RTTs of
> 300-1000ms, e.g.

Many old video games had artificial delays incorporated 
(e.g., sluggishness in the controls of space invaders, etc).
Part of the 'game' is getting acclimated to those delays.

> It would be useful to know the absolute highest delays that gamers can
> tolerate.

People play chess by mail. It's more about expectations than
about the inherent delay of the system. 

----------------------

Regarding latency papers, there is Stuart Cheshire's from 1996, 
as well as more recent notes from David Reed. My dissertation
was on this stuff, and examined the fundamental limits of 
latency in communication (pub'd 1992, links on my home page).

Joe
http://www.isi.edu/touch

From ehall at ehsco.com  Mon Mar  5 11:48:42 2001
From: ehall at ehsco.com (Eric A. Hall)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <20010305193927.CFD4B37D3F@kylie.cs.ucl.ac.uk>
Message-ID: <3AA3ED9A.F68F1723@ehsco.com>

> Apparently the more delay-tolerant RPGs, e.g. Age of Empires and
> Warcraft, already do some compensation - they deliberately delay all
> interactions so that all players have similar delay. Not sure about
> shoot-em-ups though.

Some of the shooters are using turn-based play in order to put more of an
emphasis on tactics and less on connection. There are still cheat problems
of course, but when *everything* goes through a scheduler on the server it
really changes the nature of the game.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

From ehall at ehsco.com  Mon Mar  5 12:06:27 2001
From: ehall at ehsco.com (Eric A. Hall)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <Pine.GSO.4.21.0103051910310.10370-100000@regan.ee.surrey.ac.uk>
Message-ID: <3AA3F1C2.F69E2B94@ehsco.com>

> Deliberately compensating for lag in the game client in some
> similar way would be interesting.

This was [re]introduced as a fairly big problem just a few weeks back,
with client-based clock emulators being used to eliminate programmed
delays. EG, for those actions which required client-side delay (casting a
spell, healing, shooting a paced weapon, etc.), emulating the clock (and
running the emulator at very high rates) meant that client-side delays
were essentially removed from the game.

It has been around in some form or another for a while. UO has long
suffered from a "fast walk" hack that allowed players to move at their own
pace instead of a rate set by the server. This was eventually fixed with
rotating random keys and encrypted commands.

The clock hack essentially made these fixes irrelevant, and allowed for
many more cheats.

Closing the loop, what this is driving is a return to closed communities.
Diablo saw it badly (nobody with any sense played on the public servers
when the clients had a built-in "God mode"), others are recognizing the
problem and are now designing for it. I think it is trending away from
massive multiplayer towards multiple-islands.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

From demir at usc.edu  Mon Mar  5 12:12:58 2001
From: demir at usc.edu (demir)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
In-Reply-To: <3AA3ED41.E50A300F@isi.edu>
Message-ID: <Pine.GSO.4.21.0103051210230.6923-100000@aludra.usc.edu>

> As Sean indicated, these are application dependent.
> More precisely, they depend on the level of 
> predictability in the feedback system, and how
> high in the human the processing occurs.

I completely agree with the lines above. I interpret this as "engineering
human perception/action", where "communication" is also a part. Just as
"perception/action" differs in real life, so it should differ across
applications. As Joe stated, it is all about "expectations", I think, too.

Alper K. Demir


From ehall at ehsco.com  Mon Mar  5 12:30:08 2001
From: ehall at ehsco.com (Eric A. Hall)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <20010305153003.9C49337D3F@kylie.cs.ucl.ac.uk> <3AA3ED41.E50A300F@isi.edu>
Message-ID: <3AA3F74F.C2724ACE@ehsco.com>

> > >arrive so the "closer" player gets a distinct advantage in terms of
> > >shorter inter-command gap).

> As Sean indicated, these are application dependent.

> The most basic human feedback loops (single flashing light,
> hit a switch) are in the 100 ms range. That means the
> network portion must be in the 20ms range to be 'noise'
> on the overall system delay. However, it gets longer

Not all functions fall in that category. Strafing is holding down a key
while turning, for example, not click-click-click. Running/motion is
holding down a key. Etc. Whenever a task involves interactive exchange of
packets which are not driven by user interaction, then the player with the
lower latency gets a distinct advantage.

There are also tasks which are user-automated. For example, a user may
have practiced a particular sequence of events, and may have developed a
timing pattern such that they can execute events without waiting for
feedback from the system. Rather than "hit switch when light flashes" it
becomes "hit switch every 5ms because that's how often the light flashes"
which is fundamentally different, and this model also rewards players who
have low RTTs vs high RTTs.

The best Player-vs-Player fighters are trained monkeys with well-honed
reactionary pathways which allow them to react to macros that fail.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/

From touch at ISI.EDU  Mon Mar  5 13:00:44 2001
From: touch at ISI.EDU (Joe Touch)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <20010305153003.9C49337D3F@kylie.cs.ucl.ac.uk> <3AA3ED41.E50A300F@isi.edu> <3AA3F74F.C2724ACE@ehsco.com>
Message-ID: <3AA3FE7C.E4C63F99@isi.edu>


"Eric A. Hall" wrote:
> 
> > > >arrive so the "closer" player gets a distinct advantage in terms of
> > > >shorter inter-command gap).
> 
> > As Sean indicated, these are application dependent.
> 
> > The most basic human feedback loops (single flashing light,
> > hit a switch) are in the 100 ms range. That means the
> > network portion must be in the 20ms range to be 'noise'
> > on the overall system delay. However, it gets longer
> 
> Not all functions fall in that category. Strafing is holding down a key
> while turning, for example, not click-click-click. Running/motion is
> holding down a key. Etc. Whenever a task involves interactive exchange of
> packets which are not driven by user interaction, then the player with the
> lower latency gets a distinct advantage.

Strafing needs high packet rate, but is latency independent.
A better implementation would just send "start strafe" and
"end strafe" signals anyway.

> There are also tasks which are user-automated. For example, a user may
> have practiced a particular sequence of events, and may have developed a
> timing pattern such that they can execute events without waiting for
> feedback from the system. Rather than "hit switch when light flashes" it
> becomes "hit switch every 5ms because that's how often the light flashes"
> which is fundamentally different, and this model also rewards players who
> have low RTTs vs high RTTs.

Any such timing pattern should be uploadable. If you're
forcing the user to input the sequence manually, it's just
like the forced delays of the old Space Invaders days.

> The best Player-vs-Player fighters are trained monkeys with well-honed
> reactionary pathways which allow them to react to macros that fail.

Right - all you really need to adjust is the non-predictable
part.

Joe

From J.Crowcroft at cs.ucl.ac.uk  Mon Mar  5 14:17:11 2001
From: J.Crowcroft at cs.ucl.ac.uk (Jon Crowcroft)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
         generated...]
In-Reply-To: Your message of "Mon, 05 Mar 2001 13:00:44 PST." <3AA3FE7C.E4C63F99@isi.edu>
Message-ID: <11047.983830631@cs.ucl.ac.uk>


what tristan asked for was _evidence_

you've all turned anecdotal or prescriptive since the evidence about
relative TCP and UDP traffic 

fact is there's few facts about any of this, just lots of opinion.

go look at the original bell labs papers on interactive audio RTTs :
that was just opinion too - when we get to games (pace, Cheriton) same
applies in spades
j.

From touch at ISI.EDU  Mon Mar  5 14:51:03 2001
From: touch at ISI.EDU (Joe Touch)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <11047.983830631@cs.ucl.ac.uk>
Message-ID: <3AA41857.9F074B30@isi.edu>


Jon Crowcroft wrote:
> 
> what tristan asked for was _evidence_

There's abundant evidence in the human factors
community. It's just statistical, like
everything involving 'bags of mostly water,'
and it's highly domain-specific, because the
level of comprehensive and predictive complexity
is hard to provide quantitative measures for.

Joe

From smd at ebone.net  Mon Mar  5 15:01:02 2001
From: smd at ebone.net (Sean Doran)
Date: Thu Mar 25 11:59:33 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
Message-ID: <20010305230102.1BAF88A3@sean.ebone.net>

| fact is there's few facts about any of this, just lots of opinion.

Well, yeah, but Jon, given that the Internet is heterogeneous,
anisotropic, expanding, and mutating, it is really hard to be
anything but anecdotal, since even the most comprehensive data
set (one that defeats the observer problem (i.e., in the absence
of isotropism, how do we know what things look like "over there"?)) 
will quickly grow stale.

| go look at the original bell labs papers on interactive audio RTTs :
| that was just opinion too - when we get to games (pace, Cheriton) same
| applies in spades

Are you arguing on the question of whether opinion can be "good enough",
or on the question of whether something much stronger than opinion
or localized (in space and time) measurements can be obtained with
an affordable amount of effort?

	Sean.

[in a long-ago CIDRD wg meeting when they were contentious]
smd: well, that's just my opinion
voice in crowd (tli? postel?): and we're ALL entitled to Sean's opinion


From demir at usc.edu  Mon Mar  5 15:31:38 2001
From: demir at usc.edu (demir)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
 generated...]
In-Reply-To: <20010305230102.1BAF88A3@sean.ebone.net>
Message-ID: <Pine.GSO.4.21.0103051519190.26008-100000@aludra.usc.edu>

I agree with the lines below. However, this all looks "chaotic" to me. I
think the main challenge is how to "engineer" this anisotropic,
expanding, and mutating world as thoroughly as possible so that, maybe,
"the Turing test" can be passed in its current state. As Joe Touch
indicated, "the level of comprehensive and predictive complexity is hard
to provide quantitative measures for". I think searching for "evidence"
requires solving the "relativity" problem as a human factor. I assume
these are all "philosophical" issues that one might think unimportant. I
think an "enhanced architecture" should consider all these and other
related factors. Again, these are all about "expectations".

Alper K. Demir

> | fact is there's few facts about any of this, just lots of opinion.
> 
> Well, yeah, but Jon, given that the Internet is heterogeneous,
> anisotropic, expanding, and mutating, it is really hard to be
> anything but anecdotal, since even the most comprehensive data
> set (one that defeats the observer problem (i.e., in the absence
> of isotropism, how do we know what things look like "over there"?)) 
> will quickly grow stale.
> 
> | go look at the original bell labs papers on interactive audio RTTs :
> | that was just opinion too - when we get to games (pace, Cheriton) same
> | applies in spades
> 
> Are you arguing on the question of whether opinion can be "good enough",
> or on the question of whether something much stronger than opinion
> or localized (in space and time) measurements can be obtained with
> an affordable amount of effort?
> 
> 	Sean.
> 
> [in a long-ago CIDRD wg meeting when they were contentious]
> smd: well, that's just my opinion
> voice in crowd (tli? postel?): and we're ALL entitled to Sean's opinion
> 


From foo at eek.org  Mon Mar  5 17:42:52 2001
From: foo at eek.org (foo)
Date: Thu Mar 25 11:59:34 2004
Subject: [e2e] TEAR.
Message-ID: <20010305194252.L33489@eek.org>

Does anyone have any experience with or thoughts about TEAR (TCP Emulation
at Receivers) developed by Injong Rhee at NCSU? 

http://www.csc.ncsu.edu/faculty/rhee/export/tear_page/

-Brian

From J.Crowcroft at cs.ucl.ac.uk  Tue Mar  6 00:00:35 2001
From: J.Crowcroft at cs.ucl.ac.uk (Jon Crowcroft)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
         generated...]
In-Reply-To: Your message of "Tue, 06 Mar 2001 00:01:02 +0100." <20010305230102.1BAF88A3@sean.ebone.net>
Message-ID: <3377.983865635@cs.ucl.ac.uk>

In message <20010305230102.1BAF88A3@sean.ebone.net>, Sean Doran typed:

 >>| fact is there's few facts about any of this, just lots of opinion.
 
 >>Well, yeah, but Jon, given that the Internet is heterogeneous,
 >>anisotropic, expanding, and mutating, it is really hard to be
 >>anything but anecdotal, since even the most comprehensive data
 >>set (one that defeats the observer problem (i.e., in the absence
 >>of isotropism, how do we know what things look like "over there"?)) 
 >>will quickly grow stale.

a few of us actually try to do some measurements in the real world -
before we do too many, we thought we would see if some other people had
some - there is a LOT on web, a LOT on voice over IP now, and a lot of
it is done over a fairly well characterised set of IP paths globally,
despite what you say about the heterogeneity - sorry, but the fact is
that when it comes to games, there isn't, as far as we can tell, but
we thought we'd ask.
 
 >>| go look at the original bell labs papers on interactive audio RTTs :
 >>| that was just opinion too - when we get to games (pace, Cheriton) same
 >>| applies in spades

 >>Are you arguing on the question of whether opinion can be "good enough",
 >>or on the question of whether something much stronger than opinion
 >>or localized (in space and time) measurements can be obtained with
 >>an affordable amount of effort?

look at vern's work on characterising end to end paths, look at
schulzrinne's and bolot's work on characterising delay jitter and its
effect on voice, look at abundant work on zipf law and not for web
page download size/time, etc etc etc

where is the _equivalent_ _experimental_ data for games, please?

the point is that a lot of early work in this area (50s,60s, itu
standards definitions for toll quality speech) was based on LAB
experiments, often with small, culture specific samples. a LOT of
recent internet measurement work is based on real world data, which is
NOT magic, not impossible (it's hard work, and has to be incremental,
painstaking, and very careful, but there is a lot) - we just wanted to
see where the work had got to in one more part of the space.....

 >>[in a long-ago CIDRD wg meeting when they were contentious]
 >>smd: well, that's just my opinion
 >>voice in crowd (tli? postel?): and we're ALL entitled to Sean's opinion
 
thanks, given that sean's comments are heterogeneous, anisotropic and
expanding and mutating, i guess we are.

 cheers

   jon


From craig at aland.bbn.com  Tue Mar  6 05:28:32 2001
From: craig at aland.bbn.com (Craig Partridge)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...] 
In-Reply-To: Your message of "Mon, 05 Mar 2001 22:17:11 GMT."
             <11047.983830631@cs.ucl.ac.uk> 
Message-ID: <200103061328.IAA19448@aland.bbn.com>

In message <11047.983830631@cs.ucl.ac.uk>, Jon Crowcroft writes:

>go look at the original bell labs papers on interactive audio RTTs :
>that was just opinion too

Hi Jon:

In defense of the Bell Labs folks.  There were some badly done studies in
the 1950s and 1960s [not all at Bell Labs if I recall] -- many of which had
the property that the people doing the studies didn't understand echo
cancellers, with the result that, *surprise surprise*, they all reported
that interactive voice couldn't be sustained with a delay of more than
something like 100ms (exact number no longer remembered) which was the
point at which the lack of echo cancellation started to be a problem.

But there were three very good studies, all out of Bell Labs, which did
the tests properly.  They're still worth reading:  Riesz and Klemmer in
Bell System Technical Journal of Nov 1963, Klemmer in Bell System Technical
Journal of July-August 1967, and P.T. Brady's article in the Bell System
Technical Journal of January 1971.

Incidentally, Klemmer sent me a note in the early 1990s saying that, in
retrospect, his studies didn't account for learned delay sensitivity --
that is, a delay which previously was acceptable will become annoying if
you've become used to a much shorter delay.

Side note: the BSTJ studies often used the following test procedure:

    * every time you picked up your phone handset, a delay was randomly
      chosen from a pool of possible delay times
    * on the phone was a button that you could press if you were unhappy
      with the audio quality, and I think you were rewarded by having
      the delay eliminated

Something like this might work for gaming (though we'd have to get the
incentives right -- if pressing the button eliminates the delay, everyone
will do it all the time)
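
A minimal Python sketch of how that procedure might be adapted for a game
session, with the incentive problem sidestepped by making the button only
record dissatisfaction rather than remove the delay. Everything here (the
class name, the delay pool, the per-session granularity) is hypothetical
and not taken from the BSTJ papers.

```python
import random
from collections import defaultdict

# Hypothetical pool of added delays, in milliseconds.
DELAY_POOL_MS = (0, 50, 100, 200, 400)


class DelayTrialLog:
    """Assign a random delay per session and tally 'unhappy' button presses."""

    def __init__(self, pool=DELAY_POOL_MS, seed=None):
        self.pool = tuple(pool)
        self.rng = random.Random(seed)
        self.sessions = defaultdict(int)
        self.complaints = defaultdict(int)

    def start_session(self):
        """Pick the delay to impose on a new session, uniformly from the pool."""
        return self.rng.choice(self.pool)

    def end_session(self, delay_ms, pressed_button):
        """Record one finished session and whether the player complained."""
        self.sessions[delay_ms] += 1
        if pressed_button:
            self.complaints[delay_ms] += 1

    def complaint_rates(self):
        """Fraction of sessions with a complaint, per imposed delay."""
        return {d: self.complaints[d] / n for d, n in sorted(self.sessions.items())}


log = DelayTrialLog(seed=1)
delay = log.start_session()
log.end_session(delay, pressed_button=False)
print(log.complaint_rates())
```

Plotting the complaint rate against the imposed delay would give the kind
of tolerance curve being asked for in this thread, though the choice of
reward for pressing the button remains an open design question.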

Craig

From smd at ebone.net  Tue Mar  6 06:04:00 2001
From: smd at ebone.net (Sean Doran)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
Message-ID: <20010306140400.EFF438A3@sean.ebone.net>

| Something like this might work for gaming (though we'd have to get the
| incentives right -- if pressing the button eliminates the delay, everyone
| will do it all the time)

If pressing the button debits the presser's account a couple of 
euros in our favour, I would be more than happy to see this supported
as quickly as possible in my network.

	Sean.

P.S.: Peter Lothberg & I like to argue about how QoS stuff can be done so that
      various interrelated networks which generally offer only "platinum"
quality service (zero average queue length, zero drops most of the time)
can offer lower-quality (gold, silver, bronze, lead, barbed wire) at 
more market-competitive prices when there is demand.   The idea is to
put up a web page where the customer can dial a flavour, such as moving
a slide-bar between 0-100% drop probability, and one that lengthens
and thickens the tail of one-way delays at the interface facing the
customer.   While turning the knobs, one would also see the list
price change -- worse quality -> lower pricing.  Press here when satisfied.

(We also like the idea of having pre-built "profiles".  Click here for
network XYZ's observed level of service, price is 5% less than network
XYZ's list, that kind of thing).

There is apparently a decades-long history of doing this in the X.25 world.

From karir at wam.umd.edu  Tue Mar  6 07:38:31 2001
From: karir at wam.umd.edu (Manish Karir)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
 generated...]
In-Reply-To: <20010306140400.EFF438A3@sean.ebone.net>
Message-ID: <Pine.GSO.4.21.0103061032320.6514-100000@rac1.wam.umd.edu>


On Tue, 6 Mar 2001, Sean Doran wrote:

> 
> P.S.: Peter Lothberg & I like to argue how QoS stuff can be done in 
>       various interrelated networks which generally offer only "platinum"
> quality service (zero average queue length, zero drops most of the time)
> can offer lower-quality (gold, silver, bronze, lead, barbed wire) at 
> more market-competitive prices when there is demand.   The idea is to
> put up a web page where the customer can dial a flavour, such as moving
> a slide-bar between 0-100% drop probability, and one that lengthens
> and thickens the tail of one-way delays at the interface facing the
> customer.   While turning the knobs, one would also see the list
> price change -- worse quality -> lower pricing.  Press here when satisfied.
> 
> (We also like the idea of having pre-built "profiles".  Click here for
> network XYZ's observed level of service, price is 5% less than network
> XYZ's list, that kind of thing).
> 
> There is apparently a decades-long history of doing this in the X.25 world.
> 

I think something similar to this was done under the INDEX project at
berkeley, the paper is at:
http://www.path.berkeley.edu/~varaiya/papers_ps.dir/networkpaper.pdf

though the web site for the project itself doesn't seem to exist
anymore or has moved...

manish karir


From kuang at sask.trlabs.ca  Sat Mar  3 15:55:01 2001
From: kuang at sask.trlabs.ca (Tianbo kuang)
Date: Thu Mar 25 11:59:34 2004
Subject: [e2e] TEAR.
In-Reply-To: <20010305194252.L33489@eek.org>
Message-ID: <Pine.LNX.4.10.10103031745150.1886-100000@kuang.sask.trlabs.ca>

Hi,

I was just about to ask a question about TEAR. It seems unclear to me how
TEAR calculates RTT in the technical report (TEAR: TCP emulation at
receivers - flow control for multimedia streaming). Does the sender
calculate it and send it to the receiver, or does the receiver calculate
it (and how?)? In section 3.5, it does mention under the title _timeout_
that, "this information is embedded in the packet header by the sender".
What does "this information" refer to?

Cheers,

--Tianbo

------------------------------------------------------
  Kuang Tianbo                                         
  TRlabs                                               
  111-116 Research Drive                               
  Saskatoon, Saskatchewan S7N 3R3                      
  Tel: (306) 668-9325(office) (306) 343-9747 (home)                                  
  kuang@sask.trlabs.ca                                 
------------------------------------------------------
On Mon, 5 Mar 2001, foo wrote:

> Date: Mon, 5 Mar 2001 19:42:52 -0600
> From: foo <foo@eek.org>
> To: end2end-interest@ISI.EDU, tcp-impl@lerc.nasa.gov
> Subject: [e2e] TEAR.
> Resent-Date: Mon, 5 Mar 2001 19:48:05 -0600
> Resent-From: foo@eek.org
> Resent-To: end2end-interest@postel.org
> 
> Does anyone have any experience with or thoughts about TEAR (TCP Emulation
> at Receivers) developed by Injong Rhee at NCSU? 
> 
> http://www.csc.ncsu.edu/faculty/rhee/export/tear_page/
> 
> -Brian
> 


From rhee at eos.ncsu.edu  Tue Mar  6 09:01:50 2001
From: rhee at eos.ncsu.edu (Injong Rhee)
Date: Thu Mar 25 11:59:34 2004
Subject: [e2e] TEAR.
In-Reply-To: <Pine.LNX.4.10.10103031745150.1886-100000@kuang.sask.trlabs.ca>
Message-ID: <NEBBJFHFCDNOKHIEJPAGMEEFCGAA.rhee@eos.ncsu.edu>

Hi,

I can't help overhearing, and want to drop a few lines. The RTT calculation
can be done by the sender through receiving the receiver report about the
rate. The receiver sends back the time stamp of the last packet received to
the sender with the sequence number which will be used by the sender to
compute the RTT. This is one way to do it, and there are  many other ways.
For instance, you can use the same way that RTP does or use GPS to
synchronize the clocks and measure the one-way time. Then use the one-way
trip time in place of RTT -- I know in this case that TCP-friendliness may
suffer, but it can at least give some bounded fairness. In fact this removes
back-channel concerns completely from flow control such as losses and delays
in the back channels.
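
The first method described above (the sender computes the RTT from a
receiver report that names the last packet received) could be sketched
roughly as follows. This is only an illustration, not TEAR's actual code:
the class name, the hold_time field (how long the receiver held the packet
before reporting) and the TCP-style smoothing gain of 1/8 are all
assumptions made for the example.

```python
class RttEstimator:
    """Sender-side RTT sampling from receiver reports (illustrative only)."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha      # EWMA gain; assumed, borrowed from TCP practice
        self.send_times = {}    # sequence number -> local send time (seconds)
        self.srtt = None        # smoothed RTT estimate (seconds)

    def on_send(self, seq, now):
        """Remember when each packet left the sender."""
        self.send_times[seq] = now

    def on_report(self, last_seq, hold_time, now):
        """Process a report naming the last packet the receiver saw.

        hold_time is the interval, measured on the receiver's own clock,
        between receiving that packet and sending the report. Only the
        sender's clock is used for absolute times, so the two clocks do
        not need to be synchronized.
        """
        sent = self.send_times.pop(last_seq, None)
        if sent is None:
            return self.srtt    # unknown or duplicate report: ignore it
        sample = (now - sent) - hold_time
        if sample < 0:
            return self.srtt    # implausible sample: ignore it
        if self.srtt is None:
            self.srtt = sample
        else:
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        return self.srtt
```

The GPS variant mentioned above would instead need synchronized clocks at
both ends, which this sketch deliberately avoids.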

Some of the nice things about TEAR are that (1) it does not use back channels
much, so it is suitable for wireless comm; (2) rate control is very smooth; (3)
it is TCP-friendly over various ranges of bandwidth --- TFRC has some problems
under very low bandwidth cases.

We have improved TEAR quite a bit from the initial work and TEAR is
incorporated into an MPEG-4 stream player and stream server, and it seems to
give pretty good performance over other existing streaming solutions. Other
areas of exploration are multicast and wireless communication. Sorry I have
not kept you guys up to date about the progress. I got tired of writing
papers and have been digging into writing code.....Maybe it's time to come
out and see the light :-)

Injong


> -----Original Message-----
> From: end2end-interest-admin@postel.org
> [mailto:end2end-interest-admin@postel.org]On Behalf Of Tianbo kuang
> Sent: Saturday, March 03, 2001 6:55 PM
> To: foo
> Cc: end2end-interest@ISI.EDU; tcp-impl@lerc.nasa.gov;
> end2end-interest@postel.org
> Subject: Re: [e2e] TEAR.
>
>
> Hi,
>
> I was just about to ask a question about TEAR. It seems unclear to me how
> TEAR calculates RTT in the technical report (TEAR: TCP emulation at
> receivers - flow control for multimedia streaming). Does the sender
> calculate it and send it to the receiver, or does the receiver calculate
> it (and how?)? In section 3.5, it does mention under the title _timeout_
> that, "this information is embedded in the packet header by the sender".
> What does "this information" refer to?
>
> Cheers,
>
> --Tianbo
>
> ------------------------------------------------------
>   Kuang Tianbo
>   TRlabs
>   111-116 Research Drive
>   Saskatoon, Saskatchewan S7N 3R3
>   Tel: (306) 668-9325(office) (306) 343-9747 (home)
>
>   kuang@sask.trlabs.ca
> ------------------------------------------------------
> On Mon, 5 Mar 2001, foo wrote:
>
> > Date: Mon, 5 Mar 2001 19:42:52 -0600
> > From: foo <foo@eek.org>
> > To: end2end-interest@ISI.EDU, tcp-impl@lerc.nasa.gov
> > Subject: [e2e] TEAR.
> > Resent-Date: Mon, 5 Mar 2001 19:48:05 -0600
> > Resent-From: foo@eek.org
> > Resent-To: end2end-interest@postel.org
> >
> > Does anyone have any experience with or thoughts about TEAR
> (TCP Emulation
> > at Receivers) developed by Injong Rhee at NCSU?
> >
> > http://www.csc.ncsu.edu/faculty/rhee/export/tear_page/
> >
> > -Brian
> >
>
>


From touch at ISI.EDU  Tue Mar  6 11:35:19 2001
From: touch at ISI.EDU (Joe Touch)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
 generated...]
References: <200103061328.IAA19448@aland.bbn.com>
Message-ID: <3AA53BF7.665AC944@isi.edu>


Craig Partridge wrote:
> 
> Side note: the BSTJ studies often used the following test procedure:
> 
>     * every time you picked up your phone handset, a delay was randomly
>       chosen from a pool of possible delay times
>     * on the phone was a button that you could press if you were unhappy
>       with the audio quality, and I think you were rewarded by having
>       the delay eliminated
> 
> Something like this might work for gaming (though we'd have to get the
> incentives right -- if pressing the button eliminates the delay, everyone
> will do it all the time)

A variant of that is that when you press the button, the delay goes down,
but so does the bandwidth. There's your disincentive.

Joe

From J.Crowcroft at cs.ucl.ac.uk  Wed Mar  7 01:24:26 2001
From: J.Crowcroft at cs.ucl.ac.uk (Jon Crowcroft)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
         generated...]
In-Reply-To: Your message of "Tue, 06 Mar 2001 08:00:35 GMT." <3377.983865635@cs.ucl.ac.uk>
Message-ID: <7570.983957066@cs.ucl.ac.uk>

interesting data-
http://www.jisc-tau.ac.uk/linx-access.html
has a nice graph of latency improving as local access speed increases
and matches the in/out capacity better, but
http://www.jisc-tau.ac.uk/usa-access.html
shows how it aint that simple and as latent demand tracks supply, long
haul latency goes up again...roughly speaking...

j.


From dpreed at reed.com  Wed Mar  7 05:12:04 2001
From: dpreed at reed.com (David P. Reed)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be 
  generated...]
In-Reply-To: <7570.983957066@cs.ucl.ac.uk>
References: <Your message of "Tue, 06 Mar 2001 08:00:35 GMT." <3377.983865635@cs.ucl.ac.uk>
Message-ID: <5.0.2.1.2.20010307080716.024df030@mail.reed.com>

If carriers at all points got "paid" based on average latency, the 
investment would be there to move latency to a better attractor, which 
would track latent demand.  This is something I've been trying to get 
started for a long time.  The movement to pay carriers based on traffic 
volume, rather than delay experienced, will always drive the system to its 
worst case latency.

we need a closed loop congestion control that works in the time-scale of 
fiber deployment and LAN-speed upgrades.  We don't have one that does this, 
and no one (other than me) seems to be even seriously thinking about 
it.  I've even done something about it by advising some of the bandwidth 
exchanges.

At 09:24 AM 3/7/01 +0000, Jon Crowcroft wrote:

>interesting data-
>http://www.jisc-tau.ac.uk/linx-access.html
>has a nice graph of latency improving as local access speed increases
>and matches the in/out capacity better, but
>http://www.jisc-tau.ac.uk/usa-access.html
>shows how it aint that simple and as latent demand tracks supply, long
>haul latency goes up again...roughly speaking...
>
>j.

- David
--------------------------------------------
WWW Page: http://www.reed.com/dpr.html


From david_zhang at ins.com  Wed Mar  7 06:41:19 2001
From: david_zhang at ins.com (david_zhang@ins.com)
Date: Thu Mar 25 11:59:34 2004
Subject: [e2e] (no subject)
Message-ID: <00fc01c0a714$ac23b660$df59a4d0@C991473C>

unsubscribe
From smd at ebone.net  Wed Mar  7 08:46:42 2001
From: smd at ebone.net (Sean Doran)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
Message-ID: <20010307164642.B1BCB8A3@sean.ebone.net>

[some graphs from www.jisc-tau.ac.uk]

well, yes, one expects to move bottlenecks around from time to time,

i don't suppose there's any chance of using RED on the US->Europe
side to limit the delay, and doing a comparison to actual utilization?
that would be very cool (but understandably may not be possible)

	Sean.

From smd at ebone.net  Wed Mar  7 08:55:34 2001
From: smd at ebone.net (Sean Doran)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
Message-ID: <20010307165534.C7AA28A3@sean.ebone.net>

| If carriers at all points got "paid" based on average latency, the 
| investment would be there to move latency to a better attractor, which 
| would track latent demand.  This is something I've been trying to get 
| started for a long time.  The movement to pay carriers based on traffic 
| volume, rather than delay experienced, will always drive the system to its 
| worst case latency.

It's not going to be cheaper to have an empty network than to have
one with a bottleneck here and there.   It's also not like there
are that many applications that are so inelastic that latency is
worth paying real extra $ to remove, when volume-over-time figures
are "good enough" to make the per-available-mbps or 95th-percentile-utilization
charges worth it, and there is no obvious killer app that is unamenable
to adapting to the Internet's "rough approximation" of good performance,
on the grounds that it's cheaper to do that than to do fancy QoS everywhere.

| we need a closed loop congestion control that works in the time-scale of 
| fiber deployment and LAN-speed upgrades.  We don't have one that does this, 

Well, so convince people it's cheaper than what we have now, without
eliminating (much) utility.  Start with explaining what it takes to have 
a bounded queueing delay at every potential or real bottleneck.

	Sean.

From djw1005 at cam.ac.uk  Wed Mar  7 15:54:57 2001
From: djw1005 at cam.ac.uk (Damon Wischik)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be generated...]
In-Reply-To: <20010307165534.C7AA28A3@sean.ebone.net>
Message-ID: <Pine.SOL.3.96.1010307234252.1637A-100000@virgo.cus.cam.ac.uk>

On Wed, 7 Mar 2001, Sean Doran wrote:
> It's not going to be cheaper to have an empty network than to have one
> with a bottleneck here and there.  It's also not like there are that
> many applications that are so inelastic that latency is worth paying
> real extra $ to remove, when volume-over-time figures are "good enough"
> to make the per-available-mbps or 95th-percentile-utilization charges
> worth it
> ...

Might it be that by reducing latency one can improve the performance even
of elastic traffic? TCP, for example, controls its rate using a feedback
loop: the lower the latency, the tighter the control loop. There are
results to suggest that tighter control loops will improve the stability
of the network; this could be worth paying for.

For references, see 
  http://www.statslab.cam.ac.uk/~frank/int/
particularly
"Stability of distributed congestion control with heterogeneous feedback
  delays" (L.Massoulie)
"End-to-end congestion control for the Internet: delays and stability"
  (R.Johari and D.Tan.)

Damon Wischik.


From vijay at umbc.edu  Thu Mar  8 08:41:08 2001
From: vijay at umbc.edu (Vijay Gill)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
 generated...]
In-Reply-To: <Pine.GSO.4.21.0103042109570.24587-100000@boreas.isi.edu>
Message-ID: <Pine.SGI.4.31L.02.0103081136130.3071131-100000@irix2.gl.umbc.edu>

On Sun, 4 Mar 2001, Alberto Cerpa wrote:

 [snip snip]


Some data regarding flows from the Pacific Rim to the US. Keep in mind
that the sampling rate used was fairly low.

period:  03/04/2001 15:55:18 - 03/05/2001 15:55:19 PST
Protocol           Pkts       Pkts/sec          Bytes       Bits/sec
--------  -------------  -------------  -------------  -------------
     tcp        4566586             52     1118739284         103585
    icmp          97183              1       59403325           5500
     udp         207550              2       33848925           3134
     esp           2019              0        1216156            112
    skip            673              0         337732             31
     gre            298              0          94225              8
    ipip             93              0          49844              4
    ospf            108              0          33204              3
    ipv6             14              0           1327              0
    rsvp              5              0            580              0

the rsvp is an anomaly.
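
For anyone wanting to compare these figures with the percentage tables
posted earlier in the thread, a small sketch like the one below turns the
sampled counts into per-protocol shares and re-derives the rate columns.
The function names are made up for the example, and the assumption that the
rate columns are truncated rather than rounded is inferred from the numbers
themselves, not from anything stated about the tool.

```python
# Measurement period: 03/04/2001 15:55:18 to 03/05/2001 15:55:19 PST
SECONDS = 24 * 3600 + 1

# protocol: (sampled packets, sampled bytes), copied from the table above
COUNTS = {
    "tcp":  (4566586, 1118739284),
    "icmp": (97183, 59403325),
    "udp":  (207550, 33848925),
    "esp":  (2019, 1216156),
    "skip": (673, 337732),
    "gre":  (298, 94225),
    "ipip": (93, 49844),
    "ospf": (108, 33204),
    "ipv6": (14, 1327),
    "rsvp": (5, 580),
}


def shares(counts):
    """Return {protocol: (percent of packets, percent of bytes)}."""
    total_pkts = sum(pkts for pkts, _ in counts.values())
    total_bytes = sum(nbytes for _, nbytes in counts.values())
    return {
        proto: (100.0 * pkts / total_pkts, 100.0 * nbytes / total_bytes)
        for proto, (pkts, nbytes) in counts.items()
    }


def bits_per_second(nbytes, seconds=SECONDS):
    """Average bit rate over the period, truncated to an integer."""
    return nbytes * 8 // seconds


if __name__ == "__main__":
    for proto, (pkt_pct, byte_pct) in shares(COUNTS).items():
        print(f"{proto:>8} {pkt_pct:6.2f} % pkts {byte_pct:6.2f} % bytes")
```

Since the counts are sampled, the shares are only as good as the sampling.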

/vijay


From P.Gevros at cs.ucl.ac.uk  Thu Mar  8 09:17:49 2001
From: P.Gevros at cs.ucl.ac.uk (Panos GEVROS)
Date: Thu Mar 25 11:59:34 2004
Subject: [e2e] Re: UDP vs. TCP distribution
In-Reply-To: Your message of "Thu, 08 Mar 2001 11:41:08 EST." <Pine.SGI.4.31L.02.0103081136130.3071131-100000@irix2.gl.umbc.edu>
Message-ID: <1602.984071869@cs.ucl.ac.uk>

it is true that providers care about the exchanged volume - and that packet 
statistics are much easier to gather compared to flow statistics -
 still "flow" statistics  would  be very interesting (e.g. active flows per 
T-sec, for some definition of flow)
does anyone know whether such data have been published anywhere?

cheers,
Panos


Vijay Gill writes:
 |On Sun, 4 Mar 2001, Alberto Cerpa wrote:
 |
 | [snip snip]
 |
 |
 |Some data regarding flows from the Pacific Rim to the US. Keep in mind
 |that the sampling rate used was fairly low.
 |
 |period:  03/04/2001 15:55:18 - 03/05/2001 15:55:19 PST
 |Protocol           Pkts       Pkts/sec          Bytes       Bits/sec
 |--------  -------------  -------------  -------------  -------------
 |     tcp        4566586             52     1118739284         103585
 |    icmp          97183              1       59403325           5500
 |     udp         207550              2       33848925           3134
 |     esp           2019              0        1216156            112
 |    skip            673              0         337732             31
 |     gre            298              0          94225              8
 |    ipip             93              0          49844              4
 |    ospf            108              0          33204              3
 |    ipv6             14              0           1327              0
 |    rsvp              5              0            580              0
 |
 |the rsvp is an anomaly.
 |
 |/vijay
 

From braden at ISI.EDU  Thu Mar  8 09:36:51 2001
From: braden at ISI.EDU (Bob Braden)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
 generated...]
Message-ID: <200103081736.RAA11946@gra.isi.edu>

  *>     ipv6             14              0           1327              0
  *>     rsvp              5              0            580              0
  *> 
  *> the rsvp is an anomaly.
  *> 
  *> /vijay

"anomaly"? What does that mean?  If you mean statistically, it would
appear that IPv6 is also an anomaly.

Bob Braden

From vijay at umbc.edu  Thu Mar  8 09:39:16 2001
From: vijay at umbc.edu (Vijay Gill)
Date: Thu Mar 25 11:59:34 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
 generated...]
In-Reply-To: <200103081736.RAA11946@gra.isi.edu>
Message-ID: <Pine.SGI.4.31L.02.0103081237370.2831779-100000@irix2.gl.umbc.edu>

On Thu, 8 Mar 2001, Bob Braden wrote:

>
>   *>     ipv6             14              0           1327              0
>   *>     rsvp              5              0            580              0
>   *>
>   *> the rsvp is an anomaly.

>
> "anomaly"? What does that mean?  If you mean statistically, it would
> appear that IPv6 is also an anomaly.

Apologies for being cryptic. RSVP was a misconfiguration somewhere; once
fixed, it did not come back.  There are some people who fervently wish
that v6 could also be fixed and not come back, so I'm watching the links
with an eagle eye.

/vijay


From dpreed at reed.com  Thu Mar  8 09:07:53 2001
From: dpreed at reed.com (David P. Reed)
Date: Thu Mar 25 11:59:35 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback be
  generated...]
In-Reply-To: <20010307165534.C7AA28A3@sean.ebone.net>
Message-ID: <5.0.2.1.2.20010308120144.02fa35f0@mail.reed.com>

At 05:55 PM 3/7/01 +0100, Sean Doran wrote:
>It's not going to be cheaper to have an empty network than to have
>one with a bottleneck here and there.

This presumes that customers want the lowest price regardless of 
delay.  Not true.  And in any case, operating a network with queues mostly 
full (which increases utilization) a lot of the time is a great strategy if 
you want to maximize profit when you are being paid by the byte (or by the 
portal access rate).

Most applications benefit from low queueing delay, so this isn't about QoS 
differentiations.  Only FTPs with no human in the loop want capacity with 
no delay constraint.


From touch at ISI.EDU  Thu Mar  8 11:20:45 2001
From: touch at ISI.EDU (Joe Touch)
Date: Thu Mar 25 11:59:35 2004
Subject: UDP vs. TCP distribution [was: Re: [e2e] Can feedback 
 begenerated...]
References: <5.0.2.1.2.20010308120144.02fa35f0@mail.reed.com>
Message-ID: <3AA7DB8D.B0A9C8E7@isi.edu>


"David P. Reed" wrote:
> 
> At 05:55 PM 3/7/01 +0100, Sean Doran wrote:
> >It's not going to be cheaper to have an empty network than to have
> >one with a bottleneck here and there.
> 
> This presumes that customers want the lowest price regardless of
> delay.  Not true.  And in any case, operating a network with queues mostly
> full (which increases utilization) a lot of the time is great strategy if
> you want to maximize profit when you are being paid by the byte (or by the
> portal access rate).
> 
> Most applications benefit from low queueing delay, so this isn't about QoS
> differentiations.  Only FTPs with no human in the loop want capacity with
> no delay constraint.

NTP wants it too. 

Capacity can also be used to mask latency, PROVIDED the variability
in the feedback loop can be described (if not predicted). E.g., even
FTPs with people in the loop work - you send the whole directory
when the person does a "cd" (see Infocom 1995).

(both the above cases are related; the variability is derived
from the application, rather than the network).

Further, many interactive systems are more sensitive to variability
in the latency itself than in the latency value (e.g., NTP, it turns
out).

Joe

From P.Gevros at cs.ucl.ac.uk  Thu Mar  8 13:04:13 2001
From: P.Gevros at cs.ucl.ac.uk (Panos GEVROS)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] Re: UDP vs. TCP distribution
In-Reply-To: Your message of "Thu, 08 Mar 2001 11:20:45 PST." <3AA7DB8D.B0A9C8E7@isi.edu>
Message-ID: <1805.984085453@cs.ucl.ac.uk>

Joe Touch writes:
 |
 |
 |"David P. Reed" wrote:
 |> 
 |> At 05:55 PM 3/7/01 +0100, Sean Doran wrote:
 |> >It's not going to be cheaper to have an empty network than to have
 |> >one with a bottleneck here and there.
 |> 
 |> This presumes that customers want the lowest price regardless of
 |> delay.  Not true.  And in any case, operating a network with queues mostly
 |> full (which increases utilization) a lot of the time is great strategy if
 |> you want to maximize profit when you are being paid by the byte (or by the
 |> portal access rate).
 |> 
 |> Most applications benefit from low queueing delay, so this isn't about QoS
 |> differentiations.  Only FTPs with no human in the loop want capacity with
 |> no delay constraint.
 |
 |NTP wants it too. 
 |
 |Capacity can also be used to mask latency, PROVIDED the variability
 |in the feedback loop can be described (if not predicted). E.g., even
 |FTPs with people in the loop work - you send the whole directory
 |when the person does a "cd" (see Infocom 1995).


low delay (or jitter) is good but whether this should be the network design 
goal is a different matter;
if capacity was not a constraint i would be willing to tolerate an extra delay 
to download the whole "structure" (or a specific subset) of a web site and do 
all the searching/browsing locally, also store it for future reference in case 
i find it interesting enough, and save subsequent network accesses
the fact that only a fraction of the information downloaded would be of 
interest may be irrelevant (because there are no capacity constraints)
of course this does not apply to using the web for transactions or dynamic 
content,
so ftp-style transport may not be completely out of fashion yet,

it all depends on whether the need for interactive experience or access to 
information proves to be the killer app - but if bounded delay is a necessity 
and customers are prepared to pay for it then we may be better off with 
something where one directly "dials the web server"

Panos


From touch at ISI.EDU  Thu Mar  8 14:01:15 2001
From: touch at ISI.EDU (Joe Touch)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] Re: UDP vs. TCP distribution
References: <1805.984085453@cs.ucl.ac.uk>
Message-ID: <3AA8012B.FC1BFE58@isi.edu>


Panos GEVROS wrote:
> 
> low delay (or jitter) is good but whether this should be the network design
> goal is a different matter;
...
> it all depends on whether need for interactive experience or access to
> information proves to be the killer app - but if bounded delay is a necessity
> and customers are prepared to pay for it then we may be better of with
> something where one directly "dials the web server"

My concern is optimizing the entire network for a single class of
applications as well. There are different goals - maximizing BW,
minimizing latency, minimizing jitter. 

What matters is how flexible the infrastructure is to providing
these, hopefully concurrently.

Joe

From vern at ee.lbl.gov  Fri Mar  9 16:39:19 2001
From: vern at ee.lbl.gov (Vern Paxson)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] paper on "Difficulties in Simulating the Internet" now available
Message-ID: <200103100039.f2A0dJN00605@daffy.ee.lbl.gov>

The following paper is to appear in IEEE/ACM Transactions on Networking.
It's a revision of a previous paper titled "Why We Don't Know How to
Simulate the Internet".

		Vern & Sally


Difficulties in Simulating the Internet

Sally Floyd & Vern Paxson
AT&T Center for Internet Research at ICSI (ACIRI)
{floyd,vern}@aciri.org

http://www.aciri.org/vern/papers/sim-difficulty.TON.2001.ps.gz
http://www.aciri.org/vern/papers/sim-difficulty.TON.2001.pdf


Simulating how the global Internet behaves is an immensely challenging
undertaking because of the network's great heterogeneity and rapid change.
The heterogeneity ranges from the individual links that carry the network's
traffic, to the protocols that interoperate over the links, to the "mix"
of different applications used at a site, to the levels of congestion seen
on different links.  We discuss two key strategies for developing meaningful
simulations in the face of these difficulties: searching for invariants,
and judiciously exploring the simulation parameter space.  We finish with a
brief look at a collaborative effort within the research community to develop
a common network simulator.

From vijay at umbc.edu  Fri Mar  9 23:00:23 2001
From: vijay at umbc.edu (Vijay Gill)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] Some stats broken down by protocol
In-Reply-To: <Pine.SGI.4.31L.02.0103081136130.3071131-100000@irix2.gl.umbc.edu>
Message-ID: <Pine.SGI.4.31L.02.0103100153550.3225616-100000@irix2.gl.umbc.edu>

Based on some queries regarding protocols and flows, here are some of the
protocol breakdowns.


> Some data regarding flows from the Pacific Rim to the US. Keep in mind
> that the sampling rate used was fairly low.
>
> period:  03/04/2001 15:55:18 - 03/05/2001 15:55:19 PST
> Protocol           Pkts       Pkts/sec          Bytes       Bits/sec
> --------  -------------  -------------  -------------  -------------
>      tcp        4566586             52     1118739284         103585
>     icmp          97183              1       59403325           5500
>      udp         207550              2       33848925           3134
>      esp           2019              0        1216156            112
>     skip            673              0         337732             31
>      gre            298              0          94225              8
>     ipip             93              0          49844              4
>     ospf            108              0          33204              3
>     ipv6             14              0           1327              0
>     rsvp              5              0            580              0
>
> the rsvp is an anomaly.

Adding to the above.

Regarding Panos' query about flows:
Here is what the statman sayeth (josh wepman)

1. How hard would it be to quantify the traffic in terms of "flow"
   (src/dst/port) pair? "Flow" statistics would be very interesting
   (e.g active flows per T-sec, for some definition of flow).  I'm
   working with some folks regarding TCP ECN and flow data would
   be very useful.

Number flows (N) at time (T) (a snapshot)
or
Number flows (N) over time T1 -> T2 (counter over time)

Realtime data is of course out of the question.  We only get "expired"
flow information exported to cflowd.  Historical values can be gotten with
a bit of work.  This is not data available in the cflowd tables maintained
in ARTS data.  #Flows is not an attribute maintained.  So we have to view
the raw flow files.  A snapshot could be obtained by counting all flows in
flows files whose start/end time encompass T(snapshot).  The latter
(counter over time) could be determined by counting all the flows seen
from Time1 to time2.
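
As a rough illustration of the two counts just described, assuming the raw
flow files have already been parsed into (start, end) records: the Flow
type and function names below are hypothetical and are not part of cflowd
or ARTS.

```python
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """One exported flow record (hypothetical shape for this sketch)."""
    start: float  # time of first packet, seconds
    end: float    # time of last packet, seconds


def flows_at(flows: Iterable[Flow], t: float) -> int:
    """Snapshot: flows whose start/end times encompass time t."""
    return sum(1 for f in flows if f.start <= t <= f.end)


def flows_between(flows: Iterable[Flow], t1: float, t2: float) -> int:
    """Counter over time: flows seen at any point from t1 to t2."""
    return sum(1 for f in flows if f.start <= t2 and f.end >= t1)


if __name__ == "__main__":
    records = [Flow(0, 10), Flow(5, 6), Flow(20, 30)]
    print(flows_at(records, 5.5), flows_between(records, 8, 25))
```

Both counts describe exported flow records, so they inherit whatever the
router's export criteria did to the underlying traffic.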

The value is NOT real as a representation of flows on a link.  They are a
value based on "flows" exported from the router as determined by a router.
A flow could be terminated and exported because a FIN occurred, because it
was idle for time T, or because the total time of the flow exceeded a
limit time value.
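
The three export triggers listed above can be sketched as a single
decision function. The timeout defaults here are placeholders chosen for
the example, not the values configured on any particular router, and the
function name is made up.

```python
def export_reason(first_packet, last_packet, now, saw_fin,
                  idle_timeout=15.0, max_lifetime=1800.0):
    """Return why a flow would be exported at time `now`, or None.

    All times are in seconds. idle_timeout and max_lifetime are
    illustrative placeholders, not real router settings.
    """
    if saw_fin:
        return "fin"        # the connection was closed
    if now - last_packet >= idle_timeout:
        return "idle"       # no packets seen for idle_timeout seconds
    if now - first_packet >= max_lifetime:
        return "lifetime"   # flow has been active longer than the limit
    return None             # still cached on the router, not yet exported
```

One consequence of the lifetime limit is that a single long-lived
connection can show up as several exported "flows".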

In order to do either of the above, it needs to be clear that the value
represented is NOT flows on a link at a given time, but flows seen
exported based on flow-export criteria.  It should also be mentioned that
the functionality to do this does not currently exist in cflowd, so any
efforts here would have to be part of a larger Flow Development effort.


Re: DNS

To or from port 53.  More generically, we can use artsprotos to
characterize tcp vs udp vs icmp vs whatever else is seen.  The time
domain can be manipulated to what you may be looking for.  Since
protocol is an ARTS stored value, we have historical data to work with.

Likewise, DNS (port 53 TCP/UDP) is maintained in ARTS and available
via artsportms and artsports.  We can state from T1 -> T2, what
was the protocol distribution, and for a set of ports, what were
the port distributions.

As with the first question, we do not have #Flows, but we do have
pkt/byte data.

# /usr/local/arts/bin/artsports daily/arts.20010308.ports
router: blah blah blah
ifIndex: 27
period:  03/07/2001 15:55:20 - 03/08/2001 15:55:18 PST
selected ports: 20-21,53,80,119,443
    Port         InPkts        InBytes        OutPkts       OutBytes
   -----  -------------  -------------  -------------  -------------
    http        5920632      422354154        1350887     1368382195
    ELSE        1055537      252207931           3039        2648398
    nntp         269884       80456315          32733        2257472
ftp-data          39530        4840976          52149       76430542
  domain          69178        4482451          82340       13109974
   https           9266        1253745           4470        2222407
     ftp          10868         560030           6164         319734

This data was based on a 1:64 packet sampling rate and has not been
extrapolated to 1:1 values.  An optimal N for sampling has not been
determined for this class of link, so the degree of skew in the above
numbers cannot be stated with any certainty.  If we assumed that 1:64
sampling did correctly represent the true population, then multiplying out
the values by 64 would give you the approximate real values.
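
Under that same assumption (that 1:64 sampling is representative, which is
not established for this link), the extrapolation is a plain
multiplication. A minimal sketch, with made-up names and the figures copied
from the table above:

```python
SAMPLING_N = 64  # 1:64 packet sampling, as stated above

# port: (InPkts, InBytes, OutPkts, OutBytes), sampled values
SAMPLED = {
    "http":     (5920632, 422354154, 1350887, 1368382195),
    "ELSE":     (1055537, 252207931, 3039, 2648398),
    "nntp":     (269884, 80456315, 32733, 2257472),
    "ftp-data": (39530, 4840976, 52149, 76430542),
    "domain":   (69178, 4482451, 82340, 13109974),
    "https":    (9266, 1253745, 4470, 2222407),
    "ftp":      (10868, 560030, 6164, 319734),
}


def extrapolate(row, n=SAMPLING_N):
    """Scale sampled counters to approximate 1:1 values."""
    return tuple(value * n for value in row)


if __name__ == "__main__":
    for port, row in SAMPLED.items():
        print(port, extrapolate(row))
```

Ratios between rows are unchanged by the scaling, so relative port shares
can be read straight from the sampled table.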

--end statman

Hope this was useful.

/vijay


From gr224 at hermes.cam.ac.uk  Sat Mar 10 03:37:53 2001
From: gr224 at hermes.cam.ac.uk (Gaurav Raina)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] paper on "Difficulties in Simulating the Internet" now
 available
In-Reply-To: <200103100039.f2A0dJN00605@daffy.ee.lbl.gov>
Message-ID: <Pine.SOL.4.21.0103101111290.21360-100000@yellow.csi.cam.ac.uk>

On Fri, 9 Mar 2001, Vern Paxson wrote:

> Difficulties in Simulating the Internet
....
> and judiciously exploring the simulation parameter space.  We finish with a
> brief look at a collaborative effort within the research community to develop
> a common network simulator.

Apologies if I am merely stating the obvious, but along with developing 
simulation tools I think it is *imperative* to try and develop
theoretical tools which might give some rules of thumb on how the results
could scale to a network as large and complex as the Internet. Apart from 
the obvious industries involved in Internet Research - *some* of the 
academic research groups are:

http://netlab.caltech.edu/
http://www-net.cs.umass.edu/
http://www.statslab.cam.ac.uk/~frank/int/
http://comm.csl.uiuc.edu:80/~srikant/pub.html/ 

An exhaustive list is not possible...

Might it be a good idea to consider having a common database/pool for 
research papers/preprints dealing with the different research topics? Like
the way the physics community has the Los Alamos archive.

Gaurav 


From hgs at cs.columbia.edu  Sat Mar 10 06:01:14 2001
From: hgs at cs.columbia.edu (Henning G. Schulzrinne)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] paper on "Difficulties in Simulating the Internet" 
 nowavailable
References: <Pine.SOL.4.21.0103101111290.21360-100000@yellow.csi.cam.ac.uk>
Message-ID: <3AAA33AA.5FCAA3B9@cs.columbia.edu>

Gaurav Raina wrote:

> Might it be a good idea to consider having a common database/pool for
> research papers/preprints dealing with the different research topics? Like
> the way the physics community has the Los Alamos archive.

<ad>Well, there's netbib, with about 55,000 networking-related papers,
http://www.cs.columbia.edu/~hgs/netbib</ad>

> 
> Gaurav
> 
> 

-- 
Henning Schulzrinne   http://www.cs.columbia.edu/~hgs

From J.Crowcroft at cs.ucl.ac.uk  Sun Mar 11 05:58:05 2001
From: J.Crowcroft at cs.ucl.ac.uk (Jon Crowcroft)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] paper on "Difficulties in Simulating the Internet" 
         nowavailable
In-Reply-To: Your message of "Sat, 10 Mar 2001 09:01:14 EST." <3AAA33AA.5FCAA3B9@cs.columbia.edu>
Message-ID: <18880.984319085@cs.ucl.ac.uk>

In message <3AAA33AA.5FCAA3B9@cs.columbia.edu>, "Henning G. Schulzrinne" typed:

 >>> Might it be a good idea to consider having a common database/pool for
 >>> research papers/preprints dealing with the different research topics? Like
 >>> the way the physics community has the Los Alamos archive.
 
there was/is an attempt to do exactly this, but it takes time to build - i'm
not sure what its current status is

in the meantime, henning is right - his is the nearest we have, and given
the efforts of IEEE Infocom and ACM SIGCOMM and other related
conferences to encourage the archival of conference proceedings online
for all, you can generally find most timely information via netbib
(and citeseer) now without recourse to walking over to your real
library (btw, you have quite a good one below that ugly tower in
cambridge that dominates the skyline from many approaches:-)

 >><ad>Well, there's netbib, with about 55,000 networking-related papers,
 >>http://www.cs.columbia.edu/~hgs/netbib</ad>

& i personally feel that distributed laissez-faire approaches work better
for a diverse growing community than the focused approach that math
& physics people have enjoyed in their social context...
 

 cheers

   jon


From hgs at cs.columbia.edu  Sun Mar 11 06:28:19 2001
From: hgs at cs.columbia.edu (Henning G. Schulzrinne)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] paper on "Difficulties in Simulating the Internet" 
 nowavailable
References: <18880.984319085@cs.ucl.ac.uk>
Message-ID: <3AAB8B83.E0C0D23A@cs.columbia.edu>

Jon Crowcroft wrote:
> 

> 
> in the meantime, henning is right - his is the nearest we have, and given
> the efforts of IEEE Infocom and ACM SIGCOMM and other related
> conferences to encourage the archival of conference proceedings online
> for all, you can generally find most timely information via netbib
> (and citeseer) now without recourse to walking over to your real
> library (btw, you have quite a good one below that ugly tower in
> cambridge that dominates the skyline from many approaches:-)

Also, if your favorite paper (or your own paper) is not yet in netbib,
you're encouraged to enter this information (same web site).

> 
>  >><ad>Well, there's netbib, with about 55,000 networking-related papers,
>  >>http://www.cs.columbia.edu/~hgs/netbib</ad>
> 
> & i personally feel that distributed laissez-faire approaches work better
> for a diverse growing community than the focused approach that math
> & physics people have enjoyed in their social context...

There is also the http://arXiv.org service, but from what I can tell, it
is being used extremely rarely by this community.


> 
> 
>  cheers
> 
>    jon

-- 
Henning Schulzrinne   http://www.cs.columbia.edu/~hgs

From demir at usc.edu  Wed Mar 14 15:36:31 2001
From: demir at usc.edu (demir)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] [Diffserv-interest] A question on Adaptive Protocols vs Expected
 Service Classes of Diffserv
Message-ID: <Pine.GSO.4.21.0103141522250.15277-100000@aludra.usc.edu>

Hi,
There has been a vast amount of research on how TCP will react on top of
services based on AF/AF-like PHBs. However, I am not aware of any research
that examines TCP on top of EF/EF-like PHBs from a service perspective
(or does it seem unnecessary altogether?). I am aware that TCP is a
widely implemented and used protocol for congestion control and
avoidance. I assume that, on a "short" time scale, TCP is reasonable to
use for AF PHB-based services because TCP-friendly traffic conditioners
would refine TCP's behavior. It seems to me that we may need
different adaptive protocols for different service classes (TCP was
developed for the "best-effort" service class). Any
ideas/insights/comments? I would appreciate them very much.

Alper K. Demir


From nichols at packetdesign.com  Wed Mar 14 16:25:38 2001
From: nichols at packetdesign.com (Kathleen Nichols)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] [Diffserv-interest] A question on Adaptive Protocols vs 
 ExpectedService Classes of Diffserv
References: <Pine.GSO.4.21.0103141522250.15277-100000@aludra.usc.edu>
Message-ID: <3AB00C02.A6C2BA33@packetdesign.com>

Kedar Poduri carried out some simulations like this quite a while back
when we were talking about something called a "virtual leased line"
built from the EF PHB described in RFC2598. This is real easy to do
(of course, you need a shaper at the edge, but Van's been saying that
since the origins of this as his "premium" service). You can look at
slide 23 of the talk at:
http://www.nren.nasa.gov/CFP/nichols_pres/index.htm, a NASA NREN
workshop on QoS. 

It's supposed to "look like a wire".

	Kathie

demir wrote:
> 
> Hi,
> There has been a vast amount of research on how TCP will react on top of
> services based on AF/AF-like PHBs. However, I am not aware of any research
> that examines TCP on top of EF/EF-like PHBs from a service perspective
> (or does it seem unnecessary altogether?). I am aware that TCP is a
> widely implemented and used protocol for congestion control and
> avoidance. I assume that, on a "short" time scale, TCP is reasonable to
> use for AF PHB-based services because TCP-friendly traffic conditioners
> would refine TCP's behavior. It seems to me that we may need
> different adaptive protocols for different service classes (TCP was
> developed for the "best-effort" service class). Any
> ideas/insights/comments? I would appreciate them very much.
> 
> Alper K. Demir

From bsikdar at networks.ecse.rpi.edu  Wed Mar 14 16:42:07 2001
From: bsikdar at networks.ecse.rpi.edu (bsikdar@networks.ecse.rpi.edu)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] Vegas on Linux
In-Reply-To: <Pine.GSO.4.21.0103141522250.15277-100000@aludra.usc.edu>
Message-ID: <Pine.GSO.4.10.10103141932260.24945-100000@poisson.ecse.rpi.edu>

Hi,
   Could anyone please direct me to any TCP Vegas implementations
for Linux? Or for any other platforms? And are there any differences
in these implementations? Thanks a lot,
						Biplab Sikdar
						Dept. of ECSE
					Rensselaer Polytechnic Inst.,
						Troy NY 12180


From demir at usc.edu  Wed Mar 14 16:44:28 2001
From: demir at usc.edu (demir)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] [Diffserv-interest] A question on Adaptive Protocols vs 
 ExpectedService Classes of Diffserv
In-Reply-To: <3AB00C02.A6C2BA33@packetdesign.com>
Message-ID: <Pine.GSO.4.21.0103141637080.15277-100000@aludra.usc.edu>

Kathie,
From nseddigh at tropicnetworks.com  Thu Mar 15 07:53:14 2001
From: nseddigh at tropicnetworks.com (Nabil Seddigh)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] [Diffserv-interest] A question on Adaptive Protocols vs 
 ExpectedService Classes of Diffserv
References: <Pine.GSO.4.21.0103141637080.15277-100000@aludra.usc.edu>
Message-ID: <3AB0E56A.1C9DD55C@tropicnetworks.com>

There has been work on both TCP modifications as well as 
"Intelligent" or "TCP-friendly" traffic conditioners to 
address issues with TCP over the AF PHB. We discovered 
some interesting results in our experiments with the latter
approach - send me email if you're interested.

In principle, it should be easier to incorporate changes 
in edge device traffic conditioners than to effect 
modifications to the TCP standard. However, at the same time, there has 
been limited implementation of intelligent traffic conditioners 
or policers in deployed products.

Best,
Nabil Seddigh

demir wrote:

> My intention
> of asking this question was "do we need such an (adaptive) protocol
> complexity"? If not, then "do we need different levels of
> complexity in the protocols that can be used for each service class"? Or
> "One uniform adaptive protocol will suffice because traffic conditioners can
> take care of this"? I assume all are possibilities. What would be the proper
> direction to take? Thank you very much
>

From floyd at aciri.org  Thu Mar 15 09:08:56 2001
From: floyd at aciri.org (Sally Floyd)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] two questions about the Internet
Message-ID: <200103151708.f2FH8u523767@elk.aciri.org>

I maintain a web page
  http://www.aciri.org/floyd/questions.html
of (mostly unanswered) questions about the Internet. 
I just posted two new questions to that page, and I thought I would
also mention them here, in case anyone on this list knows any
(partial) answers to any of them.

The new questions:

ROUND-TRIP TIMES (HOPS, NUMBER OF ASes) OF PACKETS?
For packets on a particular link, each packet could be assigned an
estimated round-trip time, a number of ASes for the end-to-end
path, etc, based on the IP source and destination addresses for
that packet.  For packets on a particular link, what can we say
about the distribution of round-trip times, or of the number of hops
traversed, or the number of ASes traversed, or number of continents
traversed, or (this is harder) the number of congested links traversed?

[Example:  For link X, can we say that most packets/bytes stay on
that continent?  Or that most packets have a minimum round-trip
time of at least S seconds?  Or that most packets on that link
during this period of time traverse more than one congested link
on their path from source to destination?]

PERIODS OF EXTREME CONGESTION AT A ROUTER?
For those routers in the network that do occasionally experience
congestion, how can we characterize their rare periods of *extreme*
congestion (defining extreme congestion, say, as packet drop rates
above 5%)?  How frequently do these periods of extreme congestion
occur, and how long do they last?  What fraction can be attributed
to flash crowds? to Denial of Service attacks? to fiber cuts or
other routing changes?
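
To make the "how frequently, and how long" part of the question concrete,
here is a minimal sketch in Python. It assumes one already has a series of
per-interval packet drop rates for a router, sampled at a fixed interval;
the function name, the sampling interval, and the input format are
illustrative assumptions, and only the 5% threshold comes from the text
above. It says nothing about the cause of each period.

```python
from typing import List, Tuple


def extreme_congestion_periods(
    drop_rates: List[float],
    interval_s: float,
    threshold: float = 0.05,
) -> List[Tuple[float, float]]:
    """Return (start_time_s, duration_s) for each maximal run of
    consecutive samples whose drop rate is strictly above `threshold`.

    drop_rates: one drop-rate sample (0.0 to 1.0) per measurement interval.
    interval_s: length of each measurement interval in seconds (assumed fixed).
    """
    periods: List[Tuple[float, float]] = []
    run_start = None  # index of the first sample in the current run, if any
    for i, rate in enumerate(drop_rates):
        if rate > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # The run ended at the previous sample.
            periods.append((run_start * interval_s, (i - run_start) * interval_s))
            run_start = None
    if run_start is not None:
        # The series ended while still above the threshold.
        periods.append(
            (run_start * interval_s, (len(drop_rates) - run_start) * interval_s)
        )
    return periods
```

The number of returned periods per unit of observation time gives the
frequency, and the second element of each tuple gives the duration.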

Many thanks,
- Sally
--------------------------------
http://www.aciri.org/floyd/
--------------------------------

From dovrolis at mail.eecis.udel.edu  Thu Mar 15 14:33:18 2001
From: dovrolis at mail.eecis.udel.edu (Constantinos Dovrolis)
Date: Thu Mar 25 11:59:35 2004
Subject: [e2e] two questions about the Internet
In-Reply-To: <200103151708.f2FH8u523767@elk.aciri.org>
Message-ID: <Pine.GSO.4.31.0103151648250.9254-100000@galois.cis.udel.edu>

Sally,

I may have a (very partial) answer to the first question, i.e.,
what is the distribution of round-trip times (RTTs) for the
packets on a certain link.

A couple of initial "disclaimers":
a) an RTT can only be associated with a packet of a closed-loop
protocol, and so our measurements only looked at TCP packets
b) our measurements do not refer to per-packet RTTs, but to
per-connection RTTs. It is likely that there are important
differences in these two distributions (short-RTT flows may
tend to carry more data, causing the per-packet RTT distribution to
be heavier on lower RTT values).
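
The difference between the two distributions in (b) can be illustrated with
a small Python sketch. It assumes each connection is summarized as an
(RTT in ms, packet count) pair; that input format and the function name are
assumptions made for illustration, not the format of the CAIDA traces.

```python
from typing import List, Tuple


def fraction_below(
    conns: List[Tuple[float, int]], rtt_limit_ms: float
) -> Tuple[float, float]:
    """Return (per-connection fraction, per-packet fraction) with RTT below
    `rtt_limit_ms`.

    conns: one (rtt_ms, packet_count) pair per TCP connection.
    The per-packet fraction weights each connection by its packet count.
    """
    total_pkts = sum(pkts for _, pkts in conns)
    if not conns or total_pkts == 0:
        raise ValueError("need at least one connection carrying packets")
    below_conns = sum(1 for rtt, _ in conns if rtt < rtt_limit_ms)
    below_pkts = sum(pkts for rtt, pkts in conns if rtt < rtt_limit_ms)
    return below_conns / len(conns), below_pkts / total_pkts
```

With made-up input such as [(20.0, 700), (80.0, 200), (250.0, 50),
(400.0, 50)] and a 100 ms limit, half of the connections but 90% of the
packets fall below the limit, which is the kind of shift towards lower RTT
values described above.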

So, the measurements that I refer to were done in the summer
of 99 at CAIDA. We were processing traffic traces captured from
passive monitors on certain links (note that we get two different
traces, one for each direction of the link). In a certain trace,
we were estimating the RTT of each TCP connection using the
following two rules:

a) if we observe the flow from the caller to the callee, the
RTT is estimated as the time interval from the SYN to the SYN-ACK.

b) if we observe the flow from the callee to the caller (which
is usually the traffic from the server to the client), the RTT
is estimated from the time spacing of the first 2 or 3 slow-start
bursts. The code to do this is tricky (I can send to you, or to
anyone else, the code if you want to play with this).
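
As a rough illustration of rule (b) only, and not the actual code mentioned
above, here is a simplified Python sketch. It assumes the input is the list
of arrival timestamps (in seconds) of one connection's data packets in the
server-to-client direction, and that bursts can be separated by a fixed idle
gap; the gap threshold and all names are hypothetical. It ignores
retransmissions, delayed ACKs, and application pauses, which is presumably
where much of the trickiness lies.

```python
from typing import List, Optional


def burst_starts(timestamps: List[float], gap_s: float) -> List[float]:
    """Return the timestamp of the first packet of each burst.

    A new burst starts whenever the idle time since the previous packet
    exceeds `gap_s` (an assumed, hand-picked threshold).
    """
    starts: List[float] = []
    prev = None
    for t in timestamps:
        if prev is None or t - prev > gap_s:
            starts.append(t)
        prev = t
    return starts


def rtt_from_bursts(
    timestamps: List[float], gap_s: float, max_bursts: int = 3
) -> Optional[float]:
    """Estimate the RTT as the mean spacing between the starts of the first
    `max_bursts` slow-start bursts. Returns None if fewer than two bursts
    are visible in the trace.
    """
    starts = burst_starts(sorted(timestamps), gap_s)[:max_bursts]
    if len(starts) < 2:
        return None
    spacings = [later - earlier for earlier, later in zip(starts, starts[1:])]
    return sum(spacings) / len(spacings)
```

The choice of `gap_s` matters: it has to be larger than the packet spacing
within a burst and smaller than the RTT itself, so a real implementation
would need to pick it per connection rather than use one constant.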

So, using these tricks we were measuring the distribution
of RTTs in the TCP connections that were present in each
(unidirectional) trace. Just as an example of the distributions
that we were getting, take a look at:

http://www.cis.udel.edu/~dovrolis/rtt-sdsc.eps

The graph shows two RTT distributions, one for each direction
of the OC-3 link that used to connect UCSD with CERFnet.

A few major points from the graph:
- About 35% of the connections have RTT < 50ms
- About 60% of the connections have RTT < 100ms
- There is a significant fraction of connections (20-30%)
  with RTT>200ms (which is probably close to the upper bound
  for any type of interactive applications).
- About 10% of the RTTs are quite large (some of them in the
  order of multiple seconds), which may indicate errors in our
  measurement methodology. This is why I did not include that
  fraction of RTTs in the graph.

Some very interesting measurements on this subject also appear
in Mark Allman's "A Web server's view of the transport layer",
published in CCR, Oct. 2000. Mark's measurements originate from
traces of the server's traffic (instead of a passive monitor in
the network). Also, he could measure the RTTs more accurately
based on the time distance between a non-retransmitted packet
and the corresponding ACK. Obviously we cannot do the same,
because we don't have the flow of ACKs in the trace.

It is interesting that Mark's measurements (see Figure 9) are not *very*
different from the graph that I mentioned before. Specifically,
his graph shows:
- About 35% of the RTTs < 100msec
- About 60-70% of the RTTs < 200msec
- About 85% of the RTTs < 500msec.
Of course Mark's measurements/analysis were much more methodically
done (my measurements were only done to get some reasonable values
for simulations about other stuff).

I hope that this helps. I am also very interested in answers
to the rest of your questions.


Constantinos

Computer and Information Sciences - University of Delaware

http://www.cis.udel.edu/~dovrolis/

On Thu, 15 Mar 2001, Sally Floyd wrote:

> I maintain a web page
>   http://www.aciri.org/floyd/questions.html
> of (mostly unanswered) questions about the Internet.
> I just posted two new questions to that page, and I thought I would
> also mention them here, in case anyone on this list knows any
> (partial) answers to any of them.
>
> The new questions:
>
> ROUND-TRIP TIMES (HOPS, NUMBER OF ASes) OF PACKETS?
> For packets on a particular link, each packet could be assigned an
> estimated round-trip time, a number of ASes for the end-to-end
> path, etc, based on the IP source and destination addresses for
> that packet.  For packets on a particular link, what can we say
> about the distribution of round-trip times, or of the number of hops
> traversed, or the number of ASes traversed, or number of continents
> traversed, or (this is harder) the number of congested links traversed?
>
> [Example:  For link X, can we say that most packets/bytes stay on
> that continent?  Or that most packets have a minimum round-trip
> time of at least S seconds?  Or that most packets on that link
> during this period of time traverse more than one congested link
> on their path from source to destination?]
>
> PERIODS OF EXTREME CONGESTION AT A ROUTER?
> For those routers in the network that do occasionally experience
> congestion, how can we characterize their rare periods of *extreme*
> congestion (defining extreme congestion, say, as packet drop rates
> above 5%)?  How frequently do these periods of extreme congestion
> occur, and how long do they last?  What fraction can be attributed
> to flash crowds? to Denial of Service attacks? to fiber cuts or
> other routing changes?
>
> Many thanks,
> - Sally
> --------------------------------
> http://www.aciri.org/floyd/
> --------------------------------
>


From nahum at watson.ibm.com  Thu Mar 15 15:51:11 2001
From: nahum at watson.ibm.com (Erich Nahum)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] two questions about the Internet
In-Reply-To: <Pine.GSO.4.31.0103151648250.9254-100000@galois.cis.udel.edu> from "Constantinos Dovrolis" at Mar 15, 2001 05:33:18 PM
Message-ID: <200103152351.SAA33906@orinoco.watson.ibm.com>

Constantinos Dovrolis writes:
> 
> It is interesting that Mark's measurements (see Figure 9) are not *very*
> different from the graph that I mentioned before. Specifically,
> his graph shows:
> - About 35% of the RTTs < 100msec
> - About 60-70% of the RTTs < 200msec
> - About 85% of the RTTs < 500msec.
> Of course Mark's measurements/analysis were much more methodically
> done (my measurements were only done to get some reasonable values
> for simulations about other stuff).

Srini Seshan (when he was here at Watson) had some packet trace data from 
the 1996 Olympic Web server, but it's a bit old now.  The technique was 
similar to what Mark Allman did.  For the record, though, it had:

- 25% of the RTTs < 115 ms
- 50% of the RTTs < 338 ms
- 75% of the RTTs < 778 ms

The RTTs are obviously going to vary depending on what kind of
connection you have (T3, OC-768) as well as where your clients
are (NY, CA, Greece).

-Erich

-- 
Erich M. Nahum                  IBM T.J. Watson Research Center
Networking Research             P.O. Box 704
nahum@watson.ibm.com            Yorktown Heights NY 10598

From ggm at dstc.edu.au  Thu Mar 15 16:23:51 2001
From: ggm at dstc.edu.au (George Michaelson)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] two questions about the Internet 
In-Reply-To: Your message of "Thu, 15 Mar 2001 18:51:11 EST."
             <200103152351.SAA33906@orinoco.watson.ibm.com> 
Message-ID: <12360.984702231@dstc.edu.au>

  Srini Seshan (when he was here at Watson) had some packet trace data from 
  the 1996 Olympic Web server, but it's a bit old now.  The technique was 
  similar to what Mark Allman did.  For the record, though, it had:
  
  - 25% of the RTTs < 115 ms
  - 50% of the RTTs < 338 ms
  - 75% of the RTTs < 778 ms
  
  The RTTs are obviously going to vary depending on what kind of
  connection you have (T3, OC-768) as well as where your clients
  are (NY, CA, Greece).
  
  -Erich

The 96 Olympics were hosted behind multiple backends, geographically
distributed? I thought Nagano was, I went to a seminar by IBM on it.

Because if so, there were presumably frontend boxes making decisions
on backend server, which would either intuit best-fit path or else
map it into some simple model like BGP AS or link-based region and
so skew RTT in favour of shorter-hop and/or lighter-load hosts.

-George
--
George Michaelson         |  DSTC Pty Ltd
Email: ggm@dstc.edu.au    |  University of Qld 4072
Phone: +61 7 3365 4310    |  Australia
  Fax: +61 7 3365 4311    |  http://www.dstc.edu.au

From nahum at watson.ibm.com  Thu Mar 15 18:04:46 2001
From: nahum at watson.ibm.com (Erich Nahum)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] two questions about the Internet
In-Reply-To: <12360.984702231@dstc.edu.au> from "George Michaelson" at Mar 16, 2001 10:23:51 AM
Message-ID: <200103160204.VAA53100@orinoco.watson.ibm.com>

George Michaelson writes:
> 
> The 96 Olympics were hosted behind multiple backends, geographically
> distributed? I thought Nagano was, I went to a seminar by IBM on it.
> 
> Because if so, there were presumably frontend boxes making decisions
> on backend server, which would either intuit best-fit path or else
> map it into some simple model like BGP AS or link-based region and
> so skew RTT in favour of shorter-hop and/or lighter-load hosts.

96 (Atlanta) was the first olympics that IBM hosted, and I believe it was
just one complex in Southbury, CT.  98 (Nagano) and 2000 (Australia)
were hosted by 4 main sites: Bethesda (for Europe), Schaumburg IL and 
someplace in Ohio (for the Americas) and Tokyo (for Asia).  The
request routing was done on a very coarse-grain level, basically
through the routing tables.  E.g., if you were in Europe,
olympics.com pointed to Bethesda. I think it was done at
the routing layer and not through DNS. 

The front ends of each cluster were IBM network dispatcher TCP
sprayers, which routed to back-end nodes on the same LAN.  So
I believe the RTT distribution seen by a complex would be the same
across nodes within that cluster.

-Erich

-- 
Erich M. Nahum                  IBM T.J. Watson Research Center
Networking Research             P.O. Box 704
nahum@watson.ibm.com            Yorktown Heights NY 10598

From hari at chive.lcs.mit.edu  Thu Mar 15 19:02:17 2001
From: hari at chive.lcs.mit.edu (Hari Balakrishnan)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] two questions about the Internet 
In-Reply-To: Message from Erich Nahum <nahum@watson.ibm.com> 
   of "Thu, 15 Mar 2001 21:04:46 EST." <200103160204.VAA53100@orinoco.watson.ibm.com> 
Message-ID: <200103160302.f2G32Ha28828@chive.lcs.mit.edu>

Erich,

> George Michaelson writes:
> > 
> > The 96 Olympics were hosted behind multiple backends, geographically
> > distributed? I thought Nagano was, I went to a seminar by IBM on it.
> > 
> > Because if so, there were presumably frontend boxes making decisions
> > on backend server, which would either intuit best-fit path or else
> > map it into some simple model like BGP AS or link-based region and
> > so skew RTT in favour of shorter-hop and/or lighter-load hosts.
> 
> 96 (Atlanta) was the first olympics that IBM hosted, and I believe it was
> just one complex in Southbury, CT.  

For the 1996 Atlanta games, IBM actually ran multiple servers for the Olympics, 
but they weren't transparent (i.e., they had distinct DNS names).  The data 
being referred to here was collected at Southbury, CT.  The other sites were, 
if I recall, at Keio (Japan), Cornell (NY), Karlsruhe (Germany), and Hursley 
(UK).

The Southbury site was connected via T3 links to 4 US NAPs: Chicago (Bellcore & 
Ameritech), SF Bay Area (Bellcore and PacBell), NY (Sprint), and DC (MFS 
Datanet).

> 98 (Nagano) and 2000 (Australia)
> were hosted by 4 main sites: Bethesda (for Europe), Schaumburg IL and 
> someplace in Ohio (for the Americas) and Tokyo (for Asia).  The
> request routing was done on a very coarse-grain level, basically
> through the routing tables.  E.g., if you were in Europe,
> olympics.com pointed to Bethesda. I think it was done at
> the routing layer and not through DNS. 

> The front ends of each cluster were IBM network dispatcher TCP
> sprayers, which routed to back-end nodes on the same LAN.  So
> I believe the RTT distribution seen by a complex would be the same
> across nodes within that cluster.

Sounds about right, if you believe the load-balancing was working correctly.  
:}  (I'm not saying it wasn't!)

Hari

> 
> -Erich
> 
> -- 
> Erich M. Nahum                  IBM T.J. Watson Research Center
> Networking Research             P.O. Box 704
> nahum@watson.ibm.com            Yorktown Heights NY 10598


From widmer at informatik.uni-mannheim.de  Fri Mar 16 05:37:48 2001
From: widmer at informatik.uni-mannheim.de (Joerg Widmer)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] SIG on Networked Computer Games
Message-ID: <3AB2172C.EA67F19A@informatik.uni-mannheim.de>

Hi all,
With respect to the recent discussion about networked computer games
this SIG might be of interest.

Cheers,
  Jörg/Martin


                   Call for Participation

Special Interest Group on Networked Computer Games (SIG NetGame)

Over the past three to four years networked computer games have been a
tremendous commercial success. Games like Ultima Online, Everquest, Doom,
Quake, Diablo II and others have attracted an audience of several million
players worldwide. They are one of the few Internet services for which
end users are actually willing to pay money. As the Internet becomes
ubiquitous through wireless and/or cheaper Internet access, the
audience for networked computer games will increase rapidly, creating
a mass market with a multi-billion dollar volume.

However, most - if not all - of the successful networked computer games have
encountered a large number of technical challenges that are inherent to this
application area. These range from inadequate support by network and transport
protocols to consistency problems and security breaches (or cheating as
players prefer to call it). At the same time scientists have begun to discover
networked computer games as an extremely challenging and rewarding area of
research. What makes this area of research particularly fascinating is that
solutions found for networked computer games tend to solve related problems
in other areas such as computer supported collaborative work, distance
education and telemedicine.

It is the aim of this SIG to bring together developers of commercial and
non-commercial networked computer games, service providers, scientists, and
interested individuals in order to discuss - and possibly solve - technical
challenges of networked computer games. Topics of interest include, but are
certainly not limited to:
             
- network and transport protocols 
- application-level protocol design 
- architectures for service providers 
- consistency mechanisms 
- security / cheating prevention 
- middle-ware (e.g. Direct Play) 
- billing and charging 

... for networked computer games. 


You can subscribe to the NetGame mailing list through the NetGame 
webpage:

  http://www.informatik.uni-mannheim.de/netgame/index.html

or by sending a mail to:

  netgame-l-request@pi4.informatik.uni-mannheim.de 

  with the following line in the BODY of the message: 

  subscribe

I'm looking forward to interesting discussions on netgame-l.

Sincerely,

Martin Mauve

Disclaimer: This SIG is currently not affiliated with any other organization.

From oleg at inforocket.com  Fri Mar 16 06:46:43 2001
From: oleg at inforocket.com (Oleg Vishnepolsky)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] two questions about the Internet
In-Reply-To: <200103160204.VAA53100@orinoco.watson.ibm.com>
Message-ID: <NEBBJFHFADLMLFEFPEHEMEBGDLAA.oleg@inforocket.com>

Erich M. Nahum writes:

>96 (Atlanta) was the first olympics that IBM hosted, and I believe it was
>just one complex in Southbury, CT.  98 (Nagano) and 2000 (Australia)
>were hosted by 4 main sites: Bethesda (for Europe), Schaumburg IL and 
>someplace in Ohio (for the Americas) and Tokyo (for Asia).  The
>request routing was done on a very coarse-grain level, basically
>through the routing tables.  E.g., if you were in Europe,
>olympics.com pointed to Bethesda. I think it was done at
>the routing layer and not through DNS. 

How is it even possible not to involve DNS? If DNS was giving out the
same IP address for olympics.com irrespective of where the requests
came from, then routing would have been real tricky, to say the least.

Oleg Vishnepolsky


From pingpan at juniper.net  Fri Mar 16 09:40:48 2001
From: pingpan at juniper.net (Ping Pan)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] two questions about the Internet
References: <200103151708.f2FH8u523767@elk.aciri.org>
Message-ID: <3AB25020.99C75551@cs.columbia.edu>

Sally Floyd wrote:
> 
> The new questions:
> 
> ROUND-TRIP TIMES (HOPS, NUMBER OF ASes) OF PACKETS?
> For packets on a particular link, each packet could be assigned an
> estimated round-trip time, a number of ASes for the end-to-end
> path, etc, based on the IP source and destination addresses for
> that packet.  For packets on a particular link, what can we say
> about the distribution of round-trip times, or of the number of hops
> traversed, or the number of ASes traversed, or number of continents
> traversed, or (this is harder) the number of congested links traversed?
> 

Hi,

Hop-counters: http://www.nlanr.net/NA/Learn/wingspan.html
AS length: http://moat.nlanr.net/ASPL/ (from University of Oregon)

BTW, there are several good pages on Internet questions:
1. Henning Schulzrinne:
http://www.cs.columbia.edu/~hgs/internet/traffic.html
2. Merit: http://www.merit.edu/ipma/reports/
3. NLANR: http://www.nlanr.net/NA/Learn/

- Ping Pan

From pingpan at juniper.net  Fri Mar 16 10:07:48 2001
From: pingpan at juniper.net (Ping Pan)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] two questions about the Internet
References: <200103151708.f2FH8u523767@elk.aciri.org>
Message-ID: <3AB25674.B3EE78A9@juniper.net>

Sally Floyd wrote:
> 
> PERIODS OF EXTREME CONGESTION AT A ROUTER?
> For those routers in the network that do occasionally experience
> congestion, how can we characterize their rare periods of *extreme*
> congestion (defining extreme congestion, say, as packet drop rates
> above 5%)?  How frequently do these periods of extreme congestion
> occur, and how long do they last?  What fraction can be attributed
> to flash crowds? to Denial of Service attacks? to fiber cuts or
> other routing changes?
> 

Almost forgot, please take a look at http://www.nordu.net/stats/. This
is one of the better places where you can monitor the link traffic for
both average and peak rates, and draw your own conclusion on link
congestion and duration.

In the past several years, most US providers have stopped showing their
networks, and only provide the average bandwidth utilization, which is low
anyway. It is believed that the peak/average ratio is around 3-4 or
higher, but I have not seen solid evidence on this since NSFNET.
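
For anyone who does have per-interval link utilization samples, the
peak/average ratio is straightforward to compute. The following Python
sketch assumes a list of utilization samples (for example, bits per second
averaged over equal-length intervals); the function name and input format
are illustrative assumptions. Note that the result depends on the sampling
interval: longer intervals smooth out peaks and lower the ratio.

```python
from typing import Sequence


def peak_to_average(samples: Sequence[float]) -> float:
    """Return max(samples) / mean(samples) for equal-length intervals."""
    if not samples:
        raise ValueError("need at least one utilization sample")
    average = sum(samples) / len(samples)
    if average == 0:
        raise ValueError("average utilization is zero; ratio is undefined")
    return max(samples) / average
```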

I don't think there are too many fiber cuts in the network (well... on
the other hand, the China-US undersea cable was cut many days ago, and the
link is still down.) But in some networks, providers do shift traffic
between links quite often.

- Ping

From nahum at watson.ibm.com  Fri Mar 16 11:00:51 2001
From: nahum at watson.ibm.com (Erich Nahum)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] two questions about the Internet
In-Reply-To: <NEBBJFHFADLMLFEFPEHEMEBGDLAA.oleg@inforocket.com> from "Oleg Vishnepolsky" at Mar 16, 2001 09:46:43 AM
Message-ID: <200103161900.OAA33888@orinoco.watson.ibm.com>

Oleg Vishnepolsky writes:
> 
> >96 (Atlanta) was the first olympics that IBM hosted, and I believe it was
> >just one complex in Southbury, CT.  98 (Nagano) and 2000 (Australia)
> >were hosted by 4 main sites: Bethesda (for Europe), Schaumburg IL and 
> >someplace in Ohio (for the Americas) and Tokyo (for Asia).  The
> >request routing was done on a very coarse-grain level, basically
> >through the routing tables.  E.g., if you were in Europe,
> >olympics.com pointed to Bethesda. I think it was done at
> >the routing layer and not through DNS. 
> 
> How is it even possible not to involve DNS? If DNS was giving out the 
> same IP address for olympics.com irrespective of where the requests 
> came from, then routing would have been real tricky, to say the least. 

I wasn't the one who did the work, so take my recollections with
a grain of salt.  Hari was one of the authors on the SigMetrics 97
and InfoCom 98 papers that describe this work, so I would trust him
on this one about the 96 olympics.

As for the later ones, this is what I've been told.  It doesn't
seem tricky to me, but I'm not a routing person.  I'll try to dig 
up the info and post it here next week.

-Erich

-- 
Erich M. Nahum                  IBM T.J. Watson Research Center
Networking Research             P.O. Box 704
nahum@watson.ibm.com            Yorktown Heights NY 10598

From dino.saija at libero.it  Tue Mar 20 01:57:41 2001
From: dino.saija at libero.it (dino.saija@libero.it)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Traffic generator
Message-ID: <GAHQC5$Irlq_rPaBAj5jHK6KPh0r1OKj7YBTlaqZSFKvHv6oKeS3HH2@libero.it>

I'd like to use a free TCP traffic generator (i.e., in C++). Where can I
find one?
Thank you


From srh at merit.edu  Tue Mar 20 05:50:12 2001
From: srh at merit.edu (Susan Harris)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] NANOG 22 CFP
Message-ID: <Pine.GSO.4.03.10103200849300.23947-100000@backin5.merit.edu>

                     * * * * * * * * * * * * * * * * *

                          CALL FOR PRESENTATIONS
                     *                               *
                                  NANOG 22
                     *                               *
                             May 20 - 22, 2001

                     * * * * * * * * * * * * * * * * *


The North American Network Operators' Group (NANOG) will hold its 22nd
meeting in Scottsdale, Arizona, on May 20-22, 2001. The meeting will
be hosted by CenterGate Research Group.  NANOG conferences provide a
forum for the coordination and dissemination of technical information
related to large-scale (i.e., national/international) Internet backbone
networking technologies and operational practices.

NANOG meetings, held three times each year, include two days of short
presentations, plus afternoon/evening tutorial sessions.  Meetings are
informal, with an emphasis on relevance to current backbone engineering
practices. The conference draws over 600 participants, mainly consisting
of engineering staff from large national service providers, and members
of the research and education community.

Now in its sixth year, NANOG evolved from the NSFNET "regional-techs"  
meetings, where technical staff from the regional networks met to discuss
operational issues of common concern. With the emergence of the
commercial Internet, NANOG meetings evolved to include a broader base of
providers, network operators, and researchers.

The meeting will be held at the DoubleTree Paradise Valley. For more
information about NANOG meetings, schedules, and logistics, see:

     http://www.nanog.org
------------------------------------------------------------------------------

CALL FOR PRESENTATIONS

NANOG invites presentations on backbone engineering, coordination, and
research topics. Presentations should highlight issues relating to
technology already deployed or soon to be deployed in core Internet
backbones and exchange points. 

Previous meetings have included presentations on: 

    - Backbone traffic engineering 
    - Coordination of inter-provider QoS 
    - Deployment experience with queueing disciplines (CAR, RED) 
    - Inter-provider security and routing protocol authentication 
    - Routing scalability in backbone infrastructures 
    - Security issues for the Internet core 
    - Routing policy specification and backbone router configuration 
    - Building large-scale measurement infrastructure 
    - Cooperative inter-provider caching 
    - Alternatives to hot-potato routing 
    - Recommendations on queue management and congestion avoidance 
    - Experience with differentiated services 
    - Reports from next-generation networks (CANARIE, Internet2)
    - Inter-domain multicast deployment 
    - Backbone network failure analysis 

Tutorials have covered topics such as: 

    - BGP case studies 
    - MPLS fundamentals 
    - External route selection 
    - IP multicast technologies 
    - Distributed content caching in large IP networks

------------------------------------------------------------------------------

HOW TO PRESENT

Submit an informal one- or two-paragraph abstract describing the
presentation in email to nanog-support@nanog.org.  The deadline for
proposals is April 9, 2001.  While the majority of speaking slots will be
filled by April 9, a limited number of slots will be available after that
date for topics that are exceptionally timely and important.  
Submissions will be reviewed by the NANOG Program Committee, and
presenters will be notified of acceptance by April 23, 2001.

NANOG also welcomes suggestions/recommendations for tutorials, panels and
other presentation topics.
---------------------------------------------------------------------------


From hussein at ee.washington.edu  Thu Mar 22 18:40:41 2001
From: hussein at ee.washington.edu (Alhussein Abouzeid)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP timestamping inquiry
In-Reply-To: <200103161900.OAA33888@orinoco.watson.ibm.com>
Message-ID: <Pine.GHP.4.21.0103221827380.6719-100000@maxwell.ee.washington.edu>

Hi all,

Does anyone have any pointers to info/measurements regarding noticeable
performance issues with/without TCP time-stamping or any
deployment issues? Specifically, I am interested in any clear negative,
positive or 'no effect' results regarding its use in the Internet,
satellite or wireless/ad-hoc.

Thanks in advance,

Hussein.


From Michael.Meyer at eed.ericsson.se  Fri Mar 23 02:08:31 2001
From: Michael.Meyer at eed.ericsson.se (Michael Meyer (EED))
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] Initial TCP Window
Message-ID: <5E5172B4DE05D311B3AB0008C75DA94106D7B454@edeacnt100.eed.ericsson.se>

Does anyone know the current status of which initial window size is used
for TCP in different operating systems? Are most up to date with RFC 2581
(two segments)?

I would be interested in
- windows NT
- windows 95, 98 and 2000
- Linux 
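
For reference, the two initial-window rules usually compared in this context can be written out as a short sketch. It does not say what any particular operating system does; it only evaluates the RFC 2581 bound (at most 2*SMSS bytes and at most two segments) and the experimental RFC 2414 formula for a few example MSS values. The function names and the chosen MSS values are illustrative assumptions.

```python
def iw_rfc2581(smss: int) -> int:
    """RFC 2581 upper bound on the initial window, in bytes (two segments)."""
    return 2 * smss


def iw_rfc2414(mss: int) -> int:
    """Experimental larger initial window from RFC 2414, in bytes."""
    return min(4 * mss, max(2 * mss, 4380))


if __name__ == "__main__":
    # Example MSS values only; they are not tied to any specific OS.
    for mss in (536, 1460, 4312):
        print(f"MSS {mss:5d}: RFC 2581 <= {iw_rfc2581(mss):5d} bytes, "
              f"RFC 2414 <= {iw_rfc2414(mss):5d} bytes")
```

Which of the two a given stack follows would still have to be checked by measurement or from its documentation.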

/Michael


From mallman at grc.nasa.gov  Fri Mar 23 06:32:36 2001
From: mallman at grc.nasa.gov (Mark Allman)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP timestamping inquiry 
Message-ID: <200103231432.JAA25090@lawyers.grc.nasa.gov>

> Anyone has any pointers to info/measurements regarding noticeable
> performance issues with/without TCP time-stamping or any
> deployment issues? specifically, I am interested in any clear
> negative, positive or 'no effect' results regarding its use in the
> Internet, satellite or wireless/ad-hoc.

Vern Paxson and I found that using timestamps with the current RTO
algorithm doesn't really buy you much.  See:

    Mark Allman, Vern Paxson.  On Estimating End-to-End Network Path
    Properties.  ACM SIGCOMM, September 1999. 
    http://roland.grc.nasa.gov/~mallman/papers/estimation.ps

(However, note that timestamps are needed for PAWS if your sending
rate is quite high.)
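
As a rough illustration of why the sending rate matters for PAWS, the sketch below computes how long it takes to send 2**32 bytes, i.e. to wrap the 32-bit TCP sequence space, at a few rates. The rates are example values picked for illustration, not measurements from the papers cited here.

```python
SEQ_SPACE_BYTES = 2 ** 32  # size of the 32-bit TCP sequence space


def seq_wrap_seconds(rate_bits_per_sec: float) -> float:
    """Seconds needed to send 2**32 bytes at the given rate."""
    return SEQ_SPACE_BYTES * 8 / rate_bits_per_sec


if __name__ == "__main__":
    # Example rates only.
    for label, rate in [("10 Mbit/s", 10e6),
                        ("100 Mbit/s", 100e6),
                        ("1 Gbit/s", 1e9)]:
        print(f"{label:>10}: sequence space wraps in about "
              f"{seq_wrap_seconds(rate):.0f} s")
```

At gigabit rates the wrap time drops to tens of seconds, which is the regime where old duplicate segments can no longer be told apart by sequence number alone.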

As for deployment, I have some measurements on the use of timestamps
"in the wild" in the following paper:

    Mark Allman.  A Web Server's View of the Transport Layer.  ACM
    Computer Communication Review, 30(5), October 2000. 
    http://roland.grc.nasa.gov/~mallman/papers/webobs-ccr.ps

allman


---
Mark Allman -- BBN/NASA GRC -- http://roland.grc.nasa.gov/~mallman/

From michael at tk.uni-linz.ac.at  Fri Mar 23 08:19:49 2001
From: michael at tk.uni-linz.ac.at (Michael Welzl)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] draft on IP Fast Option Lookup
Message-ID: <A17BDB85B175D311804E00E07D02A21D276257@CONAN>

Hi all,

I would really appreciate feedback on this draft, especially from
the router vendor folks   :)

----------------------------------------------------------------------
A New Internet-Draft is available from the on-line Internet-Drafts
directories.


	Title		: IP Fast Option Lookup
	Author(s)	: M. Welzl
	Filename	: draft-welzl-opt-lookup-00.txt
	Pages		: 8
	Date		: 22-Mar-01

This memo describes a new IP Option type that allows routers to more
efficiently check whether the IP header contains options that need
further processing.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-welzl-opt-lookup-00.txt

(..)
----------------------------------------------------------------------

Specific questions include:

- What do you think of the idea in general? Is it nice, or plainly a
  useless waste of space in the IP header?

- I am somewhat unsure about the offsets (the second octet of each
  Option Entry) - should I really leave them in there? They're
  additionally wasted space and are only useful if an option is found...
  which probably means that the packet is going to end up in the
  slow path anyway. An alternate version of the draft (without the
  offset) is available from
  ftp://ftp.tk.uni-linz.ac.at/pub/michael/lookup-nooffset/

- Where else should I discuss this? Would the NANOG list be appropriate?


Cheers,
Michael Welzl


From dpreed at reed.com  Fri Mar 23 09:20:47 2001
From: dpreed at reed.com (David P. Reed)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] web100
Message-ID: <5.0.2.1.2.20010323121349.02fd0ab0@mail.reed.com>

So, I got a press release on web100.org and its TCP improvement software.

The press will probably get this completely wrong (the slant in the press 
release is that TCP is *the big problem* and that scarce bandwidth is the 
reason we can't use 100 MB pipes).

Has anyone done any studies that would reasonably support the huge 
investment here?

- David
--------------------------------------------
WWW Page: http://www.reed.com/dpr.html


From rja at inet.org  Fri Mar 23 09:48:06 2001
From: rja at inet.org (RJ Atkinson)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] draft on IP Fast Option Lookup
In-Reply-To: <A17BDB85B175D311804E00E07D02A21D276257@CONAN>
Message-ID: <5.0.2.1.2.20010323124628.009f6090@10.30.15.2>

At 11:19 23/03/01, Michael Welzl wrote:
>Hi all,
>
>I would really appreciate feedback on this draft, 
>especially from the router vendor folks   :)

        It isn't needed by anyone who has ASIC-based forwarding.
Folks building big routers these days generally either already
have or are moving to ASIC-based forwarding.  I work for a 
small router vendor with ASIC-based forwarding.  This option
isn't especially interesting to us at least.

Ran
rja@inet.org


From vjs at calcite.rhyolite.com  Fri Mar 23 10:27:56 2001
From: vjs at calcite.rhyolite.com (Vernon Schryver)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] draft on IP Fast Option Lookup
Message-ID: <200103231827.f2NIRuX12662@calcite.rhyolite.com>

> From: Michael Welzl <michael@tk.uni-linz.ac.at>

> I would really appreciate feedback on this draft, especially from
> the router vendor folks   :)

I'm not a router person, but I have designed and implemented commercial
stuff that peeked at headers and chose faster or slower paths.


> - What do you think of the idea in general? Is it nice, or plainly a
>   useless waste of space in the IP header?

The history of such speed hints is bad.  A fast path wants to deal
with simple things, and it is usually trivial to detect things that
are not simple.  For IP headers, I bet most implementations would do
best by noticing whether there are any IP options at all.  In this
case, given MPLS and other tag-forwarding schemes, what's the point?
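
The "any IP options at all" test mentioned above is a single comparison on the IPv4 header-length field. A minimal sketch, assuming the packet is available as raw bytes; the function names and the two sample headers are made up for illustration, and a real forwarding path would of course do this in hardware or hand-tuned code:

```python
def has_ip_options(packet: bytes) -> bool:
    """Return True if an IPv4 header carries any options (IHL > 5)."""
    if len(packet) < 20:
        raise ValueError("truncated IPv4 header")
    version = packet[0] >> 4
    ihl = packet[0] & 0x0F  # header length in 32-bit words
    if version != 4 or ihl < 5:
        raise ValueError("not a valid IPv4 header")
    return ihl > 5


def choose_path(packet: bytes) -> str:
    """Send option-free headers to the fast path, everything else to the slow path."""
    return "slow" if has_ip_options(packet) else "fast"


# Two synthetic headers: IHL = 5 (no options) and IHL = 6 (4 bytes of options).
plain = bytes([0x45]) + bytes(19)
with_opts = bytes([0x46]) + bytes(23)

if __name__ == "__main__":
    print(choose_path(plain), choose_path(with_opts))
```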

That this option is variable length means that it is among the
most complicated IP options, which is an odd characteristic for
something that is intended to help things go faster.

That it must not be used if there are fewer than two other IP options
confuses me.  How many IP packets have more than 2 options?  Has the list
of possible IP options exploded while I wasn't paying attention?

There are other problems with options such as this.  For example,
why would hosts add them?  "To make routers go faster" is not
a reason, because the speed of routers doesn't affect a host's
SPECMARK or other benchmark value.

Then there are the boundary and error conditions.  For example, what
happens if an option that is mentioned in this option is absent?  What if there
are 2 Fast Option Lookup options or if another option precedes it?


> ...
> - Where else should I discuss this? Would the NANOG list be appropriate?

Long ago, there was some overlap between those who buy and operate
routers and those who design and implement things.  Today, the camps are
quite separate.  (Never mind those whose resumes say they have "implemented
TCP/IP in the Enterprise".)  Too many operators tend to be uncritical of
sales blarney about the current magic speed pill.  Previous pills included
"ASIC" and "RISC."  "RISC core" and "DSP" are more recent.

It is possible to sell such features in forums such as NANOG, but not
profitably.  Features that cannot be detected by watching the wire
are ignored in the long run.  A router that does the obvious and uses
a fast path for IP headers with no options and a slow path for IP
headers with any option would be indistinguishable from a router that
used a hint like this.  Sooner instead of later, designers omit those
magic sales pills, usually without telling their own salescritters and
customers and always without telling the competition.


Finally and far more important, if you want to design such things,
it's best to start by designing a faster router in private.  Only after
you have some experience with what makes a router (or anything) faster is
it wise to consider publishing an RFC instructing other people.
The weaknesses in the RFC's that tried to instruct how to compute
the TCP checksum faster are classic examples of that syndrome.


Vernon Schryver    vjs@rhyolite.com


From RACarlson at anl.gov  Fri Mar 23 11:08:56 2001
From: RACarlson at anl.gov (Richard Carlson)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] web100
In-Reply-To: <5.0.2.1.2.20010323121349.02fd0ab0@mail.reed.com>
Message-ID: <4.3.2.7.2.20010323123011.00aec6b0@atalanta.ctd.anl.gov>

David;

Can you elaborate on your question?  Are you asking if TCP stacks are 
really a performance bottleneck, if bandwidth is a scarce resource, or if 
we have any proof of this?

From the DOE perspective getting access to high bandwidth pipes is not the 
major problem scientific applications are running into.  There is 'easy' 
access to OC-3 to OC-48 links both within North America and around the 
globe.  (Take a look at the number of OC-3/12 links coming into the US from 
Europe.)  The problem is getting effective e2e throughput (goodput) 
between 2 nodes (e.g., moving a GB of data from a storage system at SLAC to 
a user's desktop at UTK).  The BW*delay product requires large windows on 
both end nodes and almost no loss over SLAC's campus network, ESnet, 
Abilene, and UTK's campus network.
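
To make the window arithmetic concrete, here is a small sketch. The 100 Mbit/s figure echoes the channel speed used as an example in this message; the 70 ms round-trip time is purely an assumed value, and 65535 bytes is the largest window TCP can advertise without the RFC 1323 window-scale option.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth*delay product: bytes that must be in flight to fill the pipe."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """Best-case throughput when at most window_bytes can be sent per RTT."""
    return window_bytes * 8 / rtt_s


RTT = 0.070       # assumed round-trip time in seconds (not a measured value)
LINK = 100e6      # 100 Mbit/s path
MAX_UNSCALED_WINDOW = 65535  # bytes, without window scaling

if __name__ == "__main__":
    print(f"window needed to fill the path: {bdp_bytes(LINK, RTT):.0f} bytes")
    print(f"ceiling with a 64 KB window:    "
          f"{window_limited_bps(MAX_UNSCALED_WINDOW, RTT) / 1e6:.2f} Mbit/s")
```

Under these assumptions an unscaled window alone caps the transfer at a few Mbit/s, before any packet loss is taken into account.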

The major problem DOE scientists have is determining why the goodput is so 
low (i.e., 5 Mbps e2e over a 100 Mbps channel).  The Web100 activities are 
designed to answer the question 'is the biggest problem in the local host, 
the remote host, or the network'.  Getting an authoritative answer to this 
simple question would be of immense value to the DOE scientific community 
and well worth the investment NSF is making in funding the Web100 activities.

Rich

At 12:20 PM 3/23/01 -0500, David P. Reed wrote:
>So, I got a press release on web100.org and its TCP improvement software.
>
>The press will probably get this completely wrong (the slant in the press 
>release is that TCP is *the big problem* and that scarce bandwidth is the 
>reason we can't use 100 MB pipes).
>
>Has anyone done any studies that would reasonably support the huge 
>investment here?
>
>- David
>--------------------------------------------
>WWW Page: http://www.reed.com/dpr.html
>
>

------------------------------------

Richard A. Carlson				e-mail: RACarlson@anl.gov
Network Research Section			phone:  (630) 252-7289
Argonne National Laboratory			fax:    (630) 252-4021
9700 Cass Ave. S.
Argonne,  IL 60439


From michael at tk.uni-linz.ac.at  Fri Mar 23 11:40:24 2001
From: michael at tk.uni-linz.ac.at (Michael Welzl)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] draft on IP Fast Option Lookup
In-Reply-To: <A17BDB85B175D311804E00E07D02A21D345D5F@CONAN>
Message-ID: <A17BDB85B175D311804E00E07D02A21D27625B@CONAN>

> > - What do you think of the idea in general? Is it nice, or plainly a
> >   useless waste of space in the IP header?
> 
> The history of such speed hints is bad.  A fast path wants to deal
> with simple things, and it is usually trivial to detect things that
> are not simple.  For IP headers, I bet most implementations would do
> best by noticing whether there are any IP options at all.  In this
> case, given MPLS and other tag-forwarding schemes, what's the point?

It is useless for routers which simply ignore IP options; this option
is supposed to help routers which DO support options, but only a subset
because most are turned off.
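
For comparison, the linear scan that such a router performs without any index looks roughly like the sketch below. It walks the standard RFC 791 option list (single-octet EOL and NOP, type/length/value for everything else) and does not implement the Option Entry format from the draft, which is not reproduced in this thread. The function names, the sample header, and the set of "enabled" option types are illustrative assumptions.

```python
EOL, NOP = 0, 1  # single-octet options defined in RFC 791


def option_types(header: bytes) -> list:
    """Return the option type octets found in an IPv4 header, in order."""
    ihl = header[0] & 0x0F
    opts = header[20:ihl * 4]
    found, i = [], 0
    while i < len(opts):
        kind = opts[i]
        if kind == EOL:          # end of option list
            break
        if kind == NOP:          # one-octet padding
            i += 1
            continue
        if i + 1 >= len(opts):
            raise ValueError("truncated option")
        length = opts[i + 1]     # length covers type, length, and data octets
        if length < 2 or i + length > len(opts):
            raise ValueError("bad option length")
        found.append(kind)
        i += length
    return found


def carries_enabled_option(header: bytes, enabled: set) -> bool:
    """True if any option in the header is one this router actually processes."""
    return any(kind in enabled for kind in option_types(header))


# Synthetic header with IHL = 8: Router Alert (148), Record Route (7), EOL.
sample_options = bytes([148, 4, 0, 0, 7, 7, 4, 0, 0, 0, 0, 0])
sample_header = bytes([0x48]) + bytes(19) + sample_options

if __name__ == "__main__":
    print(option_types(sample_header))
    print(carries_enabled_option(sample_header, {148}))
```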


> That it must not be used if there are fewer than two other IP options
> confuses me.  How many IP packets have more than 2 options?  

More than 1. And not many, I suppose. It is a small aid for a rare
case    :)
But it is useless when there is only one option (you don't need to
have an index for ONE entry).


> Has the list
> of possible IP options exploded while I wasn't paying attention?

I just registered a few new ones to push this draft      :)
On a serious note, I DO agree that packets with more than one
option will be rare. Still, it's a possibility.
 

> There are other problems with options such as this.  For example,
> why would hosts add them?  "To make routers go faster" is not
> a reason, because the speed of routers doesn't affect a host's
> SPECMARK or other benchmark value.

Right - it's just a recommendation.


> Then there are the boundary and error conditions.  For example, what
> happens if an option that is mentioned in this option is absent?

This is described in the "security issues" section.

> What if there
> are 2 Fast Option Lookup options or if another option precedes it?

This should not be the case according to the draft. But I agree that
it should be discussed (actually, another option preceding it won't
cause much trouble - it's just inefficient).


> Finally and far more important, if you want to design such things,
> it's best to start by designing a faster router in private.  
> Only after
> you have some experience with what makes a router (or 
> anything) faster is
> it wise to consider publishing an RFC instructing other people.

Designing a router is not an option for me.
I trust in the IESG to prevent me from publishing an absolutely
pointless RFC, though    :)

Cheers,
Michael Welzl


From ddc at lcs.mit.edu  Fri Mar 23 11:40:39 2001
From: ddc at lcs.mit.edu (David Clark)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] web100
In-Reply-To: <5.0.2.1.2.20010323121349.02fd0ab0@mail.reed.com>
References: <5.0.2.1.2.20010323121349.02fd0ab0@mail.reed.com>
Message-ID: <v04210113b6e1559c4698@[18.26.0.108]>

At 12:20 PM -0500 3/23/01, David P. Reed wrote:
>So, I got a press release on web100.org and its TCP improvement software.
>
>The press will probably get this completely wrong (the slant in the 
>press release is that TCP is *the big problem* and that scarce 
>bandwidth is the reason we can't use 100 MB pipes).
>
>Has anyone done any studies that would reasonably support the huge 
>investment here?
>
>- David

Dave,
     Not sure what you mean by "huge investment". (They just got 
funded at a rate of just under $1M a year, which is not all that much 
these days...)   I think what these folks are doing is trying to 
distribute software that is pre-configured so that it actually goes 
fast, as opposed to what happens today.  Guys like Matt Mathis have 
put a lot of work into understanding the tuning of TCP, and so on, 
and they have a lot of real world knowledge. The problem today is 
that the vendors are not shipping stuff that benefits from that 
knowledge.

     I think that TCP is the problem, but it is the implementation, 
not the design. That is what the press may get wrong.

Dave


From vjs at calcite.rhyolite.com  Fri Mar 23 11:40:34 2001
From: vjs at calcite.rhyolite.com (Vernon Schryver)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] draft on IP Fast Option Lookup
Message-ID: <200103231940.f2NJeYQ13951@calcite.rhyolite.com>

>         It isn't needed by anyone who has ASIC-based forwarding.
> Folks building big routers these days generally either already
> have or are moving to ASIC-based forwarding.  I work for a 
> small router vendor with ASIC-based forwarding.  This option
> isn't especially interesting to us at least.

"ASIC-based forwarding" might mean "has specialized hardware for the fast
path."  However, in general "ASIC-based" is as meaningful as "electronic
based," no matter that router users and the trade rags have been talking
about "ASIC" as a magic speed pill for 10 years.  From what I've seen,
whether you use full-custom, custom with purchased IP (not the protocol)
such as RISC cores, ASIC's, only commodity parts, or some other point in
the spectrum is no more about speed than other design issues including power,
real estate (both board and package), product life, time-to-market, and
available design and simulation tools and talent.

For example, with high enough volumes, a full custom silicon but rather
slow router (e.g. SOHO) might make sense.

Well, I am assuming that ASIC means application specific integrated
circuit, the less aggressive shore of the full custom swamp.  And I've
never been involved with router custom silicon, although I have watched
fun with ion milling and related wonders in other contexts.


Vernon Schryver    vjs@rhyolite.com

From raj at cup.hp.com  Fri Mar 23 11:55:27 2001
From: raj at cup.hp.com (Rick Jones)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] web100
References: <4.3.2.7.2.20010323123011.00aec6b0@atalanta.ctd.anl.gov>
Message-ID: <3ABBAA2F.785033C@cup.hp.com>

I suspect this could be an issue:

#begin excerpt
http://www.web100.org/papers/bdp.discovery.html

...
The method proposed herein for automatic BDP discovery and caching is to
use a simple mechanism modeled after the ICMP Echo Request and Echo
Reply protocol to discover the bandwidth of the least-capable hop
between a given source and destination host pair. This new mechanism
could be a new type of ICMP Request/Reply pair, or it could be a simple
enhancement to the existing Echo Request/Reply, but using a new IP
option class/number combination. The main difference between the new
mechanism and the existing ICMP Request/Reply pair is that the router
would have to process two new fields in the message.
...

Development of this BDP protocol initially requires the cooperation of
at least one router vendor, though a crude prototype could be
demonstrated with traceroute and SNMP-derived information.
#end

Seems that the ifSpeed fields of the standard SNMP MIBs would be the
best way to go here anyhow. It does mean knowing the community string or
authentication stuff for the SNMP access. True, that will have "issues"
crossing AS (is that the right term?) but then I suspect that AS would
not like to have that bandwidth info escaping their sphere anyhow.
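
A crude version of the traceroute-plus-SNMP idea can be sketched as follows. No SNMP queries are made here: the hop names and ifSpeed values (bits per second) are hard-coded, hypothetical stand-ins for what would be read from each router on the path, and the 70 ms round-trip time is likewise assumed. Note that ifSpeed reports nominal interface speed, not the bandwidth actually available.

```python
# Hypothetical (hop name, ifSpeed in bits/s) pairs; in practice these would
# come from SNMP reads against each router found by traceroute.
PATH = [
    ("campus-gw", 100_000_000),
    ("regional-oc3", 155_520_000),
    ("backbone-oc12", 622_080_000),
    ("remote-campus", 10_000_000),
]


def bottleneck(path):
    """Return the (name, speed) pair with the smallest ifSpeed on the path."""
    return min(path, key=lambda hop: hop[1])


def bdp_bytes(speed_bps, rtt_s):
    """Bandwidth*delay product in bytes for the given speed and RTT."""
    return speed_bps * rtt_s / 8


if __name__ == "__main__":
    name, speed = bottleneck(PATH)
    print(f"least-capable hop: {name} at {speed / 1e6:.1f} Mbit/s")
    print(f"BDP at an assumed 70 ms RTT: {bdp_bytes(speed, 0.070):.0f} bytes")
```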

As far as driving "supercharged Web"
(http://www.web100.org/papers/web100.html) I would have thought that if
the commercial types were that keen on it, they would be taking part in
the SPECweb9X benchmarks and perhaps the IRC bakeoffs. If the 100 in
web100 is supposed to represent 100 Mbit/s, those benchmarks are already
demonstrating solutions going far faster.

The stuff about driving demand for fibre to the home was fun to read in
the context of long-haul bandwidth prices bottoming-out due to
oversupply, and vendors not being able to recoup their investments.

Other interesting things from the concept paper:

#begin
A great deal of fine research has been underway by the Pittsburgh
Supercomputer Center's Networking Research Group, the University of
Washington's Department of Computer Science & Engineering, and several
other groups regarding networking performance tuning and TCP protocol
stack improvements. This research needs to be intensified and
capitalized upon in terms of application to the TCP protocol stack in
the chosen development system. The individual research groups might also
be more effective if their various efforts could be utilized in a
cohesive fashion. For instance, no standing TCP-stack improvement forum
exists to provide a focal point for the exchange of ideas. Finally, it
should be noted that the TCP protocol stack improvement task would be
the most complex and most difficult task of all of those listed.
#end

I guess e2e and tcp-impl don't count... :)

#begin
Needed TCP-stack improvements are listed below.

Include Well-Known Mechanisms

Standard mechanisms like per-destination MTU-discovery (RFC 1191) and
"Large Windows" extensions to TCP (RFC1323) would certainly be included
in the development system.
#end

Is there a commercial stack out there that doesn't already have these
things?!? Their target OS, Linux, already has them.

#begin
include Advanced Mechanisms

In addition to such standard mechanisms as listed above, more advanced
improvements are needed. For instance, TCP Selective Acknowledgment
(SACK), defined by RFC 2018, should also be included in the development
system.
#end

hmm, also in the latest (?) linux bits, and in HP-UX 11, and in Solaris,
and in WinSomething. Seems that is already done...

#begin
Furthermore, work needs to be done not just to improve high-performance
networking, but to improve short-duration network-flows as well,
particularly when congestion is relatively high, as such short-duration
high-loss transfers are typical of most current Web transfers. Current
end-to-end congestion avoidance and congestion control mechanisms can
greatly impede performance in such circumstances.

#end

I must be missing something - that sounds like the increase in the
allowable initial cwnd?

#begin
The following is a list of needed improvements.

Kernel Hooks

Currently, operating system kernels generally provide statistics
regarding network only in the aggregate. Kernel hooks to monitor
individual TCP sessions in real-time need to be added as a foundation
for developing a large class of highly needed network diagnostic and
performance monitoring tools. Such hooks should maintain dynamic counts
of important TCP-session parameters, as well as be able to supply
TCP-session packet streams upon demand.

#end

OK, per-session stats might be interesting. It will be more overhead in
the stack of course :)

#begin
GUI-based TCP-Session Monitoring Tools

Based upon the aforementioned kernel hooks, one or more TCP-monitoring
tools need to be developed that are capable of concurrent, dynamic,
real-time graphing of sets of user-selected real-time TCP-session
statistics. Among these statistics are: data rate, window size,
round-trip-time, number of packets unacknowledged, number of
retransmitted packets, number of out-of-order packets, number of
duplicate packets, etc. A variety of display options should be available
such as totals, deltas, running-averages, etc.
#end
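
The "totals, deltas, running-averages" display options amount to simple post-processing of cumulative per-connection counters. A minimal sketch, assuming the counters have already been sampled at regular intervals; the sample values and function names are invented for illustration and do not come from any Web100 tool:

```python
def deltas(samples):
    """Per-interval change between consecutive cumulative counter samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def running_average(values, window):
    """Average of the most recent `window` values at each position."""
    averages = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Invented cumulative count of retransmitted packets, one sample per interval.
retransmits_total = [0, 2, 2, 5, 9]

if __name__ == "__main__":
    per_interval = deltas(retransmits_total)
    print("total:          ", retransmits_total[-1])
    print("deltas:         ", per_interval)
    print("running average:", running_average(per_interval, 2))
```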

All nice and wizzy, but to what end?

How a GUI for traceroute makes it any better is an open question. (I've
not bothered to quote from the article)

Anyhow, it sounds like nice cushy funding if you can get it :)

rick jones
-- 
ftp://ftp.cup.hp.com/dist/networking/misc/rachel/
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to email, OR post, but please do NOT do BOTH...
my email address is raj in the cup.hp.com domain...

From vjs at calcite.rhyolite.com  Fri Mar 23 12:08:42 2001
From: vjs at calcite.rhyolite.com (Vernon Schryver)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] draft on IP Fast Option Lookup
Message-ID: <200103232008.f2NK8gA14461@calcite.rhyolite.com>

> From: Michael Welzl <michael@tk.uni-linz.ac.at>

> It is useless for routers which simply ignore IP options; this option
> is supposed to help routers which DO support options, but only a subset
> because most are turned off.

I said nothing about routers that simply ignore IP options because
they are not routers, or at best are broken by design.
Please read RFC 1812.  There are a lot of MAY's for IP options, but
there is at least one MUST.

> ...
> On a serious note, I DO agree that packets with more than one
> option will be rare. Still, it's a possibility.

Optimizing rare cases is rarely interesting.


> ...
> Designing a router is not an option for me.

If you want to design router optimizations and you're like most of us and
don't have a few $10M to fund a new router design, then why not get a job
at a router vendor?  Participation in the IETF is no more a substitute
for experience implementing routers than participation in the ISO was a
substitute for designing and implementing transport protocols.

> I trust in the IESG to prevent me from publishing an absolutely
> pointless RFC, though    :)

The last I looked, the IESG is not a router vendor or custom silicon
design group.  In other words, that is not a reasonable or respectable
hope, as demonstrated by plenty of RFC's.

Specifying hardware optimizations without benefit of relevant design
experience is unlikely to improve one's professional reputation outside
the trade rags.  The trade rags are something else.


Vernon Schryver    vjs@rhyolite.com

From braden at ISI.EDU  Fri Mar 23 12:54:03 2001
From: braden at ISI.EDU (Bob Braden)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
Message-ID: <200103232054.UAA16559@gra.isi.edu>

A non-text attachment was scrubbed...
Name: not available
Type: x-sun-attachment
Size: 4895 bytes
Desc: not available
Url : http://www.postel.org/pipermail/end2end-interest/attachments/20010323/7da5391a/attachment.ksh
From campbell at comet.columbia.edu  Fri Mar 23 16:02:23 2001
From: campbell at comet.columbia.edu (Andrew T. Campbell)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] Software release of the Columbia IP Micro-Mobility Suite (CIMS)
Message-ID: <000801c0b3f5$b47f87b0$3d443b80@SWEETPEA>

IP Micro-mobility has been a hot topic over the
last few years. 

We have released the Columbia IP Micro-Mobility Suite (CIMS) 

http://www.comet.columbia.edu/micromobility/

that includes an ns-2 extension for the 
following IP micro-mobility protocols:

-Cellular IP  (draft-ietf-mobileip-cellularip-00.txt)
-HAWAII  (draft-ietf-mobileip-hawaii-00.txt)
-Hierarchical Mobile IP   (draft-ietf-mobileip-reg-tunnel-04.txt) 

The Cellular IP implementation supports hard and semi-soft handoff, 
and IP paging. The Hawaii implementation supports Unicast 
Non-Forwarding (UNF) and Multiple Stream Forwarding (MSF) schemes. 
Hawaii's IP paging capability is currently not supported. 
In addition, the CIMS implementation of Hierarchical Mobile IP 
currently does not support IP paging. These and other features 
will be added in due course - we would be happy to add any 
extensions worked on by other groups to the next release of CIMS.

Best,
---
Andrew
http://comet.columbia.edu/~campbell


From chase at cs.duke.edu  Fri Mar 23 13:58:46 2001
From: chase at cs.duke.edu (Jeff Chase)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
References: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <3ABBC716.3DA09850@cs.duke.edu>

Bob Braden wrote:
....
> the wire from what the user sends.  (See the following sentences from
> RFC 793, for example:
> 
>     The TCP is able to transfer a continuous stream of octets in each
>     direction between its users by packaging some number of octets into
>     segments for transmission through the internet system.  In general,
>     the TCPs decide when to block and forward data at their own
>     convenience.

Bob, your point is indeed subtle.  One could also argue that this
snippet of RFC 793 *supports* the "TCP framing" proposal, which does
nothing more than to affirm the right of mutually consenting TCPs to
block data at their own convenience when speaking amongst themselves (as
negotiated by a ULP).

> Now for the subtle bit.  Generality and optimization are typically
> contradictory.  The Internet protocol suite was designed deliberately
> and carefully for generality, at the possible expense of optimization.
> It was also designed for simplicity at the expense of optimization.
> We recognized that later engineering efforts would rob some of the
> simplicity in order to reach greater optimization, and indeed,
> this has happened and is probably not a bad thing.  On the other
> hand, we should be very wary of over-engineering optimal solutions
> that cut down the generality.

Any TCP implementation that supports this proposal is fully
interoperable with any TCP that does not.  The wire format does not
change, ever, even when the feature is enabled.  Thus it is
interoperable with any other compliant TCP; indeed the RFC snippet you
quoted above ensures this.  So how does it compromise generality?

You almost seem to be suggesting that a TCP implementor should never add
a new locally-selectable policy feature, because any ULP that benefits
from it won't benefit any more if you later take the feature away.  In
this case, the argument is specious because the intended beneficiary (a
layered RDMA protocol called WARP that is still under development) can
run happily over a TCP that does not support this feature; it just won't
be capable of the same degree of hardware acceleration, in the case of
an unreliable network. 

> change.  The ULP proposal changes this to tight coupling, since it only
> works if the send call units are mapped directly into segments.  So
> adopting ULP MAY (and note that one cannot ever be sure that it
> will/won't) reduce the freedom of TCP to adapt to future changes.
> And contrary to what the ULP proponents claim, it is a very fundamental
> change in TCP.  We should think VERY carefully before making such
> a change, and we should be honest about what we are doing.

No, the ULP proposal does not reduce TCP's freedom of choice, it only
assumes that the TCP sender implementation notifies the ULP of its
choice, e.g., by upcalling to the ULP buffers to fill each segment, as
in many current implementations.  No ULP will rely on this behavior, but
if the sender TCP provides it, then the RDMA ULP can benefit from
hardware acceleration.  In most cases (e.g., BSD) this is a superficial
change to the TCP *implementation* itself, although I will allow that it
may be a fundamental change to the way some might think about the
implementation.  

In any case, the proposal does not affect the TCP wire protocol, it does
not affect interoperability, and it does not affect the congestion
behavior.  That is all that its proponents have claimed.  We are honest
people, and we are sincerely trying to do this in a way that is
responsive to the legitimate but sometimes rather shrill concerns about
"changing TCP" among those most experienced with TCP and its history.

One worthwhile question to ask is: can an intermediary observing the
flow at the TCP level determine for certain if this proposed feature is
in use or not?

Jeff

From dpreed at reed.com  Fri Mar 23 13:38:01 2001
From: dpreed at reed.com (David P. Reed)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] web100
In-Reply-To: <4.3.2.7.2.20010323123011.00aec6b0@atalanta.ctd.anl.gov>
References: <5.0.2.1.2.20010323121349.02fd0ab0@mail.reed.com>
Message-ID: <5.0.2.1.2.20010323162458.02fdfd90@mail.reed.com>

At 01:08 PM 3/23/01 -0600, Richard Carlson wrote:
>David;
>
>Can you elaborate on your question?  Are you asking if TCP stacks are 
>really a performance bottleneck, if bandwidth is a scarce resource, or if 
>we have any proof of this?

It was genuinely a question to clarify a press release and website that are 
quite puzzling.  Fixing a performance bottleneck is a good thing to do, I 
just don't understand what the big hoopla is about, or why it takes $3MM.

So, none of the above.  I include the press release here - I've also looked 
at the website.  Reading the press release and the website, I get the idea 
that there is an answer that is already being disseminated in the form of 
software (middleware?), and it has to do with TCP-MIBs and autotuning.

So with claims of first distribution of a "solution" implied in the press 
release, it would be interesting to know if the researchers in the TCP 
field actually have validated that this is the source of the 
problem.  Crappy applications programs and TCP implementations could be the 
problem, as well, one might think.  Or maybe the APIs (Berkeley Sockets?) 
and file system buffering don't work very well.

And the mystery of why the project is called "WEB" 100?  We know that web 
protocols have too much handshaking and parsing to be good bulk transfer 
vehicles.  And what do supercomputer users have to do with the Web?

But what most puzzles me is that this is an NSF research project, not a 
software development project, yet the press release talks about it as the 
latter.

I'm probably just confused.  Maybe this is how science is done these days, 
but I'd think that one grad student could have figured out where a 
bottleneck is by just a few measurements, then passed the info off to the 
community of developers to fix it.  Since the project is "open source" 
according to the website (but I, at least, can't look at the source because 
I don't have a password), one might think that the fix would simply be 
posted, at low cost.
------------------------------------------------------------------------------
FOR IMMEDIATE RELEASE
Mar. 19, 2001
Web100 Takes First Step Towards Improving Network Performance
PITTSBURGH -- The Web100 Project has distributed the initial version
of software that aims to bring data-transmission rates of 100
megabits per second to users of high-speed networks. Select
researchers at universities and government laboratories are getting a
sneak peek at the Web100 software to do real-world testing and
provide feedback to developers.
"Today's release of the Web100 software promises improved network
performance at a time when bandwidth is increasingly precious," said
Tom Greene, the Senior Program Director for Infrastructure in the
National Science Foundation's Division of Advanced Networking
Infrastructure and Research. "This type of middleware can help us
use existing resources more efficiently."
While most home users still connect to the Internet with a 56K modem,
universities, research centers and some businesses today have
connections capable of transmitting data at 100 megabits per second
(Mbps) or higher. Research has shown, however, that users rarely see
performance greater than three Mbps. Web100 researchers traced the
problem to software that governs the Transmission Control Protocol
(TCP) -- a "language" that computers use to communicate across
networks. Networking experts are able to overcome this limit by fine
tuning connections with adjustments to TCP.
The Web100 software will eventually allow users to take full
advantage of available network bandwidth without the help of a
networking expert. Web100 programmers are refining TCP software in
the Linux operating system to automatically achieve the highest
possible transfer rate. "Our goal is to make it easier for everyone
to move data across networks at 100 megabits per second or higher,"
said Matt Mathis, Pittsburgh Supercomputing Center network research
coordinator and one of the principal investigators of Web100.
Twenty-one researchers at ten institutions -- including Stanford
Linear Accelerator Center, Oak Ridge National Laboratory, Lawrence
Berkeley Laboratory and Argonne National Laboratory -- will test the
initial release of Web100 software.
At the University of Michigan, for example, Brian Athey will test the
Web100 software for use with the Visible Human Project. Athey is
working with Art Wetzel at PSC to develop applications that allow
students to view large Visible Human data-sets over high-speed
networks. "In situations of marginal bandwidth availability," said
Athey, "tuning could make the difference between a choppy and
unusable 500 Kbps to 1 Mbps stream to a perfectly useful 2 Mbps to 5
Mbps stream."
The Web100 Project is a collaboration of Pittsburgh Supercomputing
Center, the National Center for Atmospheric Research and the National
Center for Supercomputing Applications. More information can be found
at: http://www.web100.org/

# # #
CONTACT:
Sean Fulton
sfulton@psc.edu
Pittsburgh Supercomputing Center
412-268-4960
[R. Sean Fulton | Public Information Specialist | sfulton@psc.edu]
[***** Pittsburgh Supercomputing Center | 412/268-7141 *****]
-----------------------------------------------------------------------------


> From the DOE perspective getting access to high bandwidth pipes is not 
> the major problem scientific applications are running into.  There is 
> 'easy' access to OC-3 to OC-48 links both within North America and around 
> the globe.  (Take a look at the number of OC-3/12 links coming into the 
> US from Europe.)  The problem is getting effective e2e throughput 
> (goodput) through between 2 nodes (i.e., moving a GB of data from a 
> storage system at SLAC to a users desktop at UTK).  The BW*delay product 
> requires large windows on both end nodes and almost no loss over SLAC's 
> campus network, ESnet, Abilene, and UTK's campus network.
>
>The major problem DOE scientists have is determining why the goodput is so 
>low (i.e., 5 Mbps e2e over a 100 Mbps channel).  The Web100 activities are 
>designed to answer the question 'is the biggest problem in the local host, 
>the remote host, or the network'.  Getting an authoritative answer to this 
>simple question would be of immense value to the DOE scientific community 
>and well worth the investment NSF is making in funding the Web100 activities.
>
>Rich
>
>At 12:20 PM 3/23/01 -0500, David P. Reed wrote:
>>So, I got a press release on web100.org and its TCP improvement software.
>>
>>The press will probably get this completely wrong (the slant in the press 
>>release is that TCP is *the big problem* and that scarce bandwidth is the 
>>reason we can't use 100 MB pipes).
>>
>>Has anyone done any studies that would reasonably support the huge 
>>investment here?
>>
>>- David
>>--------------------------------------------
>>WWW Page: http://www.reed.com/dpr.html
>
>------------------------------------
>
>Richard A. Carlson                              e-mail: RACarlson@anl.gov
>Network Research Section                        phone:  (630) 252-7289
>Argonne National Laboratory                     fax:    (630) 252-4021
>9700 Cass Ave. S.
>Argonne,  IL 60439

- David
--------------------------------------------
WWW Page: http://www.reed.com/dpr.html
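
The BW*delay point in Rich's quoted note above can be made concrete with a
little arithmetic.  What follows is only a sketch: the 70 ms round-trip time
is an assumed figure chosen for illustration (it does not come from the
thread), and the function names are hypothetical.  It shows how large a
window a 100 Mbps path needs, and what a 65535-byte window (the largest
possible without window scaling) can carry at that delay.

```python
def window_for_rate(rate_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a path of rate_bps full."""
    return rate_bps * rtt_s / 8


def rate_for_window(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput (bits/s) with at most window_bytes in flight."""
    return window_bytes * 8 / rtt_s


ASSUMED_RTT_S = 0.070          # assumption: 70 ms round trip, illustration only
LINK_BPS = 100e6               # the 100 Mbps channel mentioned in the thread
UNSCALED_WINDOW = 65535        # largest TCP window without window scaling

needed = window_for_rate(LINK_BPS, ASSUMED_RTT_S)
ceiling = rate_for_window(UNSCALED_WINDOW, ASSUMED_RTT_S)
print(f"window needed: {needed:.0f} bytes")
print(f"ceiling with 64K window: {ceiling / 1e6:.2f} Mbps")
```

Under these assumptions an untuned window caps throughput at single-digit
Mbps regardless of link speed, which is the same order of magnitude as the
5 Mbps figure Rich quotes.  Loss and host limits are not modelled here.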

From rja at inet.org  Fri Mar 23 14:23:20 2001
From: rja at inet.org (RJ Atkinson)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
In-Reply-To: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <5.0.2.1.2.20010323171417.00a09380@10.30.15.2>

>Moral: Over-engineering may be bad for the Internet, eventually.

        I'm not known for being subtle.  I might have
phrased the above more along these lines:

        "Folks are doing lots of micro-optimisations these days,
        at various points in the stack and in the end-to-end 
        network system.  While a given micro-optimisation might
        be worthwhile in a micro-network, many micro-optimisations
        are being deployed in a haphazard manner in the global
        Internet these days.  The real result is a significantly 
        less robust network for very little (if any) real 
        measurable benefit."

>Finally, an historical irony: the ULP hack is acknowledged to be a
>stopgap until SCTP has advanced to save us.  Essentially, SCTP is OSI
>TP4 with features.  TCP's idea of decoupling API from wire protocol
>units was, at the time of its development a new idea that was at
>variance with the evolving OSI suite.  Now, things seem to be
>running full circle.  

        The property of SCTP that I am hearing the most interest
in is the decoupling of the transport-layer session state from
the actual IP address at each end of the connection.  Lots of
folk seem interested in having a transport-layer session that
could have increased robustness simply by multi-homing each
end-host for the session onto different networks (providing
path diversity).  In this regard, I'm influenced by talking
with folks who are implementing or deploying SCTP for various
purposes.  My sample space is certainly not statistically valid.

        IMHO, if the TCP checksum were bound to some form of
host identifier other than the IP address, TCP could provide
that particular property quite nicely.  Obviously I'm influenced
by conversations with jnc on this particular point.  It would
be an interesting exercise to work out the details for such
an approach.

        All that noted, I could imagine using SCTP underneath
a suitably generic API.  It isn't clear to me that the API
has to necessarily be as tightly coupled as is the case in
some SCTP API proposals.  Mayhap I'm confused.

Cheers,

Ran
rja@inet.org


From dpreed at reed.com  Fri Mar 23 14:40:40 2001
From: dpreed at reed.com (David P. Reed)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
In-Reply-To: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <5.0.2.1.2.20010323173039.02fd7e60@mail.reed.com>

Bob - I agree with you that having TCP control the framing underneath it 
places a significant burden on future evolution of the TCP and IP.  You are 
right, as well, about the intention of the TCP spec to avoid linking 
segmentation to the sequence of calls - it was considered and dropped.

And for that matter, I still don't see the need.  If your data should be 
processable out of order, why not use multiple TCP connections, RTP, or 
(gasp) UDP?  If the data needs to be processed in order, then framing can 
be embedded in the data stream.
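
As an illustration of framing embedded in the data stream, here is a minimal
sketch using a length prefix.  The 4-byte big-endian header and the names
frame and FrameDecoder are assumptions made for this example; nothing in the
thread or the draft specifies them.  The decoder accepts bytes in whatever
pieces TCP happens to deliver and hands back whole records.

```python
import struct

HEADER = struct.Struct("!I")   # assumed format: 4-byte big-endian length


def frame(record: bytes) -> bytes:
    """Prefix a record with its length so the receiver can find its end."""
    return HEADER.pack(len(record)) + record


class FrameDecoder:
    """Reassembles records from an in-order byte stream cut at arbitrary points."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every record completed so far."""
        self._buf += chunk
        records = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break                      # record not complete yet
            records.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return records
```

Because the decoder only needs the bytes in order, the sender's TCP stays
free to segment the stream however it likes.  The cost is that nothing can
be placed until the preceding bytes have arrived, which is the case the
draft is trying to avoid.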


From vjs at calcite.rhyolite.com  Fri Mar 23 15:20:00 2001
From: vjs at calcite.rhyolite.com (Vernon Schryver)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
Message-ID: <200103232320.f2NNK0j17500@calcite.rhyolite.com>

> From: Bob Braden <braden@ISI.EDU>

>         Filename        : draft-williams-tcpulpframe-01.txt

> ...
> Moral: Over-engineering may be bad for the Internet, eventually.
>
> Finally, an historical irony: the ULP hack is acknowledged to be a
> stopgap until SCTP has advanced to save us.  Essentially, SCTP is OSI
> TP4 with features.  TCP's idea of decoupling API from wire protocol
> units was, at the time of its development a new idea that was at
> variance with the evolving OSI suite.  Now, things seem to be
> running full circle.  Are we really sure we want to do this?
>
> I will be interested to hear other opinions.

As long as such proposals don't consume finite, non-renewable resources
such as protocol or IP option numbers, who cares?  They won't be
implemented by a significant number of hosts.  No one will ask for them,
instead of the usual TCP byte stream framing code (surely in some C++ or
Java class by now).  It will all be forgotten a lot sooner than TP4.
If we're wrong about all of that and it becomes wildly popular,
then no harm is done.

The fact that the proposal does change TCP could be handled.  The easiest
way is to observe that like the recently suggested IP option acceleration,
the idea is ok but the implementation (protocol) is wrong.  It is easy
to change the TCP API without changing the on-wire behavior to get
and put data directly to and from application buffers when things are
going well (e.g. no retransmissions), and fall back to a slow path in
what must be the other, rare cases.  If you're doing enough retransmitting
to notice, you're not going fast enough to care about such things.

In other words, contrary to various claims, no black magic was required
to "page flip" on both input and output more than 10 years ago.


Vernon Schryver    vjs@rhyolite.com

From day at std.com  Fri Mar 23 15:16:37 2001
From: day at std.com (John Day)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
In-Reply-To: <200103232054.UAA16559@gra.isi.edu>
References: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <v04220801b6e17a964ea7@[208.192.102.66]>

>Hi.  At the IETF just completed, I sat through an exposition of
>the following Internet Draft:
>
>   	"Title		: ULP Framing for TCP
>	Author(s)	: J. Williams et al.
>	Filename	: draft-williams-tcpulpframe-01.txt
>	Pages		: 12
>	Date		: 22-Mar-01
>
>     This document proposes a framing protocol for TCP which is designed
>     to be fully compliant with applicable TCP RFC's and fully
>     interoperable with existing TCP implementations. The framing
>     mechanism is designed to work as a 'shim' between TCP and higher-
>     level protocols, preserving the reliable, in-order delivery of TCP
>     while adding the preservation of higher-level protocol record
>     boundaries if the record is less than or equal to the path MTU. The
>     shim is designed to enable hardware acceleration of data movement
>     operations (e.g. direct placement of receive TCP segments into
>     higher-level protocol buffers) for the protocols that use it, even
>     if TCP segments are delivered out-of-order."

Bob, let me take a stab at this.  I have thoroughly read your note 
but have not read the draft yet.  I will do that next, but it is late 
in the day here and I don't want to wait until I can get to the draft 
which might not be until tomorrow sometime.

I have agonized over this problem for almost 30 years.  I have for 
some time believed that the TCP approach was the proper approach.  (This 
stemming from our early discussions back in the early 70's when 
Multics did streams and that IBM system you were in charge of did 
records.  I think I remember you arguing hard for record capabilities 
in some of those early meetings!  ;-)  Sorry couldn't resist. 
Streams were elegant.  I remember how much I detested those damn half 
duplex terminals and the push for records in Telnet.  Telnet was the 
most elegant thing we created in that early batch of protocols.) 
Streams were more general than records. (In those days, records 
usually meant fixed length as you no doubt remember.)  Records were a 
pain in the neck (or something else).  One of the mistakes I always 
thought the OSI model made (it wasn't just TP4) was the definition 
that "the identity of SDUs were maintained end-to-end."  SDU (Service 
Data Unit) was the lump of stuff handed across the layer boundary. 
SDUs could be fragmented or concatenated however the protocol wanted 
into any number of PDUs for sending to the other end.  The only 
requirement was that the guy on the other end was handed the same SDU 
that it had been given.

Now it is the case that some applications do work with fixed or 
variable records and can't do anything until they have the whole 
thing.  In fact most of our applications today do this except Telnet 
and it probably should be.  (Telnet processing would be a whole lot 
more efficient if you could find the next IAC without looking at 
every byte.  But that is a nit and much less a problem now than with 
the hardware then.)  Right, it was quite a revelation to me when I 
realized that the one protocol that I really thought was pure stream 
would be more efficient not with records but with SDUs.

Now clearly it can be rightly argued that if an application wants to 
see records, then it should do it.  Of course, the application 
designer could just as easily argue that "there are many more 
applications that need to do this than just the one application.  And 
isn't one of the main design principles that if lots of things need 
the same function that rather than do it n times (potentially 
differently), it should be done once consistently?  TCP already has 
the machinery to do reassembly, why not have it do it, why should I 
have to replicate this byte diddling stuff in my application? 
After all, it is perfectly legal for TCP to deliver my data one byte 
at a time.  If I am lucky, one call will give me what I need but more 
likely it will take more system calls to TCP before I get something I 
can use.  Why does it need to be that much work."  Well, yea. . . .
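
The "byte diddling" the application designer is complaining about looks
roughly like the loop below.  This is a sketch only: recv_exact is a
hypothetical helper name, and it assumes the application already knows how
many bytes the record contains.  It uses the ordinary socket.recv call, which
may return fewer bytes than requested, and counts how many calls were needed.

```python
import socket


def recv_exact(sock: socket.socket, n: int):
    """Read exactly n bytes from a stream socket.

    Returns (data, calls), where calls is the number of recv() calls it took.
    """
    parts = []
    remaining = n
    calls = 0
    while remaining > 0:
        chunk = sock.recv(remaining)   # may legally return 1..remaining bytes
        if not chunk:
            raise ConnectionError("peer closed before the record was complete")
        parts.append(chunk)
        remaining -= len(chunk)
        calls += 1
    return b"".join(parts), calls
```

How many calls a given record takes depends on how the bytes happened to
arrive, which is exactly the point: the loop has to be written either way.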

So I started looking for some other reason to decide which side of 
the line it should be done on.  Frankly, I haven't found any.  It 
appears that the work is the same whether TCP does it or the 
application does it.  It may mean replicated code, but that 
is not a big deciding factor.  So no real architectural argument there.

The only thing I have really found goes against us and that sort of 
comes from thinking about these things as objects.  At first I 
thought that the only difference between the OSI SDU and what we 
thought of as records was that records were fixed length and SDUs 
could be variable in length.  But at some point, I realized that from 
the application's point of view it was something different:  "Here I 
want you to send this stuff.  I don't care what you do to it.  Send 
it all at once, send it with something else, break it up into big or 
little pieces, I don't care what kind of mess you make of it.  Just 
clean up after yourself when you are done."  And I thought hmmmm, 
that isn't an unreasonable demand by the layer above and it maintains 
the invariance of the interaction which is always a strong design 
property for interfaces.

So these days, I am still on the fence, but leaning toward a solution 
that maintains invariance, i.e.  what I put in at one end will come 
out the other.  This is really the orthogonality that all interfaces 
should exhibit.  groan.

>
>I would like to suggest two things about this, one simple and one
>subtle.  The simple one is this: to say that the ULP framing is fully
>compliant with the applicable TCP RFCs is simply false.  For some of
>us, at least, such a lack of truth in technical advertising is a red
>flag.
>
>The reason why it is false, and its consequences, form the subtle bit.
>It is true that the proposed shim does not change the definition of the
>TCP protocol on the wire.  However, it does change a more fundamental
>principle of TCP, which is the deliberate decoupling of what happens on
>the wire from what the user sends.  (See the following sentences from

Yes, but in some sense we have decoupled the sender but coupled the 
receiver more tightly.

>RFC 793, for example:
>
>     The TCP is able to transfer a continuous stream of octets in each
>     direction between its users by packaging some number of octets into
>     segments for transmission through the internet system.  In general,
>     the TCPs decide when to block and forward data at their own
>     convenience.
>
>The last sentence may be phrased in a slightly academic manner;
>the reader is assumed to understand that the "convenience" of the
>transport layer is to provide optimal performance.  In an earlier
>paragraph the spec says:
>
>     TCP is designed to work in a very general environment of
>     interconnected networks.
>
>Now for the subtle bit.  Generality and optimization are typically
>contradictory.  The Internet protocol suite was designed deliberately
>and carefully for generality, at the possible expense of optimization.
>It was also designed for simplicity at the expense of optimization.
>We recognized that later engineering efforts would rob some of the
>simplicity in order to reach greater optimization, and indeed,
>this has happened and is probably not a bad thing.  On the other
>hand, we should be very wary of over-engineering optimal solutions
>that cut down the generality.

Above I was trying to point out that this is generality on sending 
but not for receiving.  It does not impact either generality or 
optimality "to clean up after yourself." Be careful about the 
generality thing:  It was generality as we understood it in the 
mid-70's.  (I have found some aspects of TCP that I thought were done 
for good performance and design reasons and have recently realized 
that they were really based on the nature of the traffic at the time 
and that those conditions no longer hold.  ooops.)  We weren't and 
aren't infallible.

>
>The ULP protocol is a classic example of this issue.  The TCP spec
>deliberately decoupled the packaging of data across the API (the ADPU,
>if you will) from the segmentation that TCP does on the wire.  This was
>not an accident; it was designed to allow TCPs to be able to adapt to
>new and different environments, without requiring that applications
>change.  The ULP proposal changes this to tight coupling, since it only
>works if the send call units are mapped directly into segments.  So
>adopting ULP MAY (and note that one cannot ever be sure that it
>will/won't) reduce the freedom of TCP to adapt to future changes.
>And contrary to what the ULP proponents claim, it is a very fundamental
>change in TCP.  We should think VERY carefully before making such
>a change, and we should be honest about what we are doing.

You may be right here.  I will read the document later and see what I 
think.  IF ULP is really a protocol on top of TCP, I don't see how it 
can constrain TCP.  I don't even see that every application that uses 
TCP would have to use ULP.  But I also would agree with you about 
whether this is really a necessary addition at this point.

>
>Moral: Over-engineering may be bad for the Internet, eventually.
>
>Finally, an historical irony: the ULP hack is acknowledged to be a
>stopgap until SCTP has advanced to save us.  Essentially, SCTP is OSI
>TP4 with features.  TCP's idea of decoupling API from wire protocol
>units was, at the time of its development a new idea that was at
>variance with the evolving OSI suite.  Now, things seem to be
>running full circle.  Are we really sure we want to do this?

I have browsed through SCTP and it looks like a bell-head's dream. 
Not sleek and simple at all.  Just lots and lots of mechanism and 
weight.  I would also agree with your appraisal of the OSI situation 
at the time.  I think that when the people put the SDU thing in the 
OSI stuff they were trying to force a fixed record view of the world. 
The problem was that it got written in general terms, and while you 
could read that into it (and I did for years), later, as I indicated, 
I began to realize that it actually said something much more 
interesting.  So now when I talk about this, I say there are three 
cases:  Stream, record and for lack of a better term Idempotent for 
the literal interpretation of the SDU because it keeps things 
invariant.

Well, there it is something for you to think about too.  ;-))) 
Whaddya think?  Now to read the damn thing!  Gotta go to dinner, 
people are hollering at me.

Take care,
John


From mfisk at lanl.gov  Fri Mar 23 15:29:20 2001
From: mfisk at lanl.gov (Mike Fisk)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
In-Reply-To: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <Pine.LNX.4.21.0103231457510.7855-100000@pescado.lanl.gov>

An abstract TCP byte stream is very similar to the byte or bit stream
provided by serial links.  It would be useful to know why one wouldn't use
normal byte stuffing to denote frame boundaries within ULP data.  

I assume the argument is that it is inefficient to scan and twiddle bytes
and that some out-of-band (ala packet segmentation) framing looks cheaper.  
But the draft presents this problem in the context of special-purpose NICs
which could presumably handle byte stuffing pretty cheaply.
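
For readers who have not seen it, "normal byte stuffing" can be sketched as
follows.  The escape values are the ones SLIP uses (RFC 1055); applying them
to ULP records over TCP is purely an illustration and not something the draft
proposes, and the function names are hypothetical.

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD   # SLIP values, RFC 1055


def stuff(record: bytes) -> bytes:
    """Escape END/ESC inside the record, then terminate it with END."""
    out = bytearray()
    for b in record:
        if b == END:
            out += bytes((ESC, ESC_END))
        elif b == ESC:
            out += bytes((ESC, ESC_ESC))
        else:
            out.append(b)
    out.append(END)
    return bytes(out)


def unstuff_stream(stream: bytes) -> list:
    """Split a stuffed byte stream back into the original records."""
    records, cur, escaped = [], bytearray(), False
    for b in stream:
        if escaped:
            if b == ESC_END:
                cur.append(END)
            elif b == ESC_ESC:
                cur.append(ESC)
            else:
                raise ValueError("invalid escape sequence")
            escaped = False
        elif b == ESC:
            escaped = True
        elif b == END:
            records.append(bytes(cur))     # END marks a record boundary
            cur = bytearray()
        else:
            cur.append(b)
    return records
```

Both directions touch every byte, and the stuffed length depends on the
payload contents; those are the costs the "scan and twiddle" objection
refers to.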

With regard to your "subtle problems", the one that first comes to my mind
is a new dependence on the end-to-end characteristics of TCP packets.  
With all of the middleboxes munging TCP packets, this seems dangerous.  
Even if the draft is correct in assuming that the ULP payload won't
contain something that looks like a valid ULP header, any performance 
gains from using this protocol are lost in these situations.

Second, from an upper-layer point of view, I don't know that I want to
have to limit my PDUs to the current path MSS or force the protocol to
degrade when the MSS falls below 512.  It doesn't seem far fetched to me
that some future (wireless?) network will only permit very small MTUs.

What if I have an application that thinks in fixed-size blocks of, say, 1k?  
Depending on the MSS, this can result in a lot of small packets if one is
trying to preserve message boundaries.  For good reasons, people have gone
to a lot of effort to remove small packets from TCP streams.
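
A rough count makes the small-packet concern concrete.  The model below is a
deliberate simplification and an assumption of this sketch, not the draft's
algorithm: "boundary-preserving" means every record starts a fresh segment,
while "stream" means TCP packs bytes into full segments without regard to
record boundaries.  Function names are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def stream_segments(n_records: int, record_size: int, mss: int) -> int:
    """Segments needed when records are packed back to back."""
    return ceil_div(n_records * record_size, mss)


def boundary_segments(n_records: int, record_size: int, mss: int) -> int:
    """Segments needed when each record must start its own segment."""
    return n_records * ceil_div(record_size, mss)


# 1000 records of 1 KiB each, at three illustrative MSS values
for mss in (536, 1460, 9000):
    s = stream_segments(1000, 1024, mss)
    f = boundary_segments(1000, 1024, mss)
    print(f"MSS {mss}: stream={s} boundary-preserving={f}")
```

In this model the gap widens as the MSS grows past the record size, since
each 1k record then travels in a segment that could have carried several.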

Again, it would be helpful if there was a good argument against
byte-stuffing.

On Fri, 23 Mar 2001, Bob Braden wrote:

>   	"Title		: ULP Framing for TCP
> 	Author(s)	: J. Williams et al.
> 	Filename	: draft-williams-tcpulpframe-01.txt
> 	Pages		: 12
> 	Date		: 22-Mar-01
> 	
>     This document proposes a framing protocol for TCP which is designed
>     to be fully compliant with applicable TCP RFC's and fully
>     interoperable with existing TCP implementations. The framing
>     mechanism is designed to work as a 'shim' between TCP and higher-
>     level protocols, preserving the reliable, in-order delivery of TCP
>     while adding the preservation of higher-level protocol record
>     boundaries if the record is less than or equal to the path MTU. The
>     shim is designed to enable hardware acceleration of data movement
>     operations (e.g. direct placement of receive TCP segments into
>     higher-level protocol buffers) for the protocols that use it, even
>     if TCP segments are delivered out-of-order."

> I would like to suggest two things about this, one simple and one
> subtle.  The simple one is this: to say that the ULP framing is fully
> compliant with the applicable TCP RFCs is simply false.  For some of
> us, at least, such a lack of truth in technical advertising is a red
> flag.
> 
> The reason why it is false, and its consequences, form the subtle bit.
> It is true that the proposed shim does not change the definition of the
> TCP protocol on the wire.  However, it does change a more fundamental
> principle of TCP, which is the deliberate decoupling of what happens on
> the wire from what the user sends.  (See the following sentences from
> RFC 793, for example:

-- 
Mike Fisk, RADIANT Team, Network Engineering Group, Los Alamos National Lab
See http://home.lanl.gov/mfisk/ for contact information


From jonathan at DSG.Stanford.EDU  Fri Mar 23 16:11:03 2001
From: jonathan at DSG.Stanford.EDU (Jonathan Stone)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing 
In-Reply-To: Your message of "Fri, 23 Mar 2001 16:20:00 MST."
             <200103232320.f2NNK0j17500@calcite.rhyolite.com> 
Message-ID: <200103240011.QAA10208@champagne.dsg.stanford.edu>

In message <200103232320.f2NNK0j17500@calcite.rhyolite.com>,
Vernon Schryver writes:

>In other words, contrary to various claims, no black magic was required
>to "page flip" on both input and output more than 10 years ago.

Yes, provided that the MSS is a multiple of the pagesize, (or the
sender rounded down to that), and that you already DMAed the packet
into memory, aligned such that the TCP (or whatever) payload
ended up page-aligned.

That is, it works provided the receiver's guess about alignment and
preceding header sizes pays off. Reading between the lines, one of the
aims of this proposal is to address the cases where such guesses would
fail.  The "RDMA" makes me wonder if this isn't just about preserving
record boundaries, but about preserving in-memory alignment of 
each record,  too.
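
The receiver's "guess" can be written down as a small check.  This is a
sketch under stated assumptions: a 4096-byte page, a 14-byte Ethernet header
plus 20-byte IP and 20-byte TCP headers as the guessed prefix, and
hypothetical function names.  It is not taken from any particular driver.

```python
GUESSED_HEADERS = 14 + 20 + 20   # Ethernet + IP + TCP, no options (assumed)


def guessed_buffer_addr(page_addr: int, guessed_header_len: int) -> int:
    """Place the receive buffer so the payload should land on page_addr."""
    return page_addr - guessed_header_len


def payload_is_flippable(buf_addr: int, header_len: int,
                         payload_len: int, page_size: int = 4096) -> bool:
    """True if the payload starts on a page boundary and fills whole pages."""
    payload_addr = buf_addr + header_len
    starts_on_page = payload_addr % page_size == 0
    whole_pages = payload_len > 0 and payload_len % page_size == 0
    return starts_on_page and whole_pages
```

With the buffer placed for a 54-byte prefix, a 4096-byte payload can be
flipped; if the segment instead carries 12 bytes of TCP options (for example
a padded timestamp option), the actual prefix is 66 bytes and the same check
fails, as it does for an Ethernet-sized 1460-byte payload.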

Alignment constraints might be why byte-stuffing (or Stuart Cheshire's
COBS) was not proposed.  More explanation of WARP (the target
remote-DMA ULP) might give some insight.  Maybe the authors of the
draft could comment?
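
For reference, a minimal sketch of COBS (Consistent Overhead Byte Stuffing)
follows.  It implements the basic encoding only; the function names are
hypothetical and the code is not taken from Cheshire's implementation.  The
encoded form contains no zero bytes, so a zero can serve as the record
delimiter.

```python
def cobs_encode(data: bytes) -> bytes:
    """Encode data so that the output contains no 0x00 bytes."""
    out = bytearray([0])          # placeholder for the first code byte
    code_idx, code = 0, 1
    for b in data:
        if b == 0:
            out[code_idx] = code  # close the current block at this zero
            code_idx = len(out)
            out.append(0)
            code = 1
        else:
            out.append(b)
            code += 1
            if code == 0xFF:      # block full: 254 non-zero bytes
                out[code_idx] = code
                code_idx = len(out)
                out.append(0)
                code = 1
    out[code_idx] = code
    return bytes(out)


def cobs_decode(enc: bytes) -> bytes:
    """Invert cobs_encode."""
    out = bytearray()
    i = 0
    while i < len(enc):
        code = enc[i]
        if code == 0:
            raise ValueError("zero byte inside COBS data")
        block = enc[i + 1:i + code]
        if len(block) != code - 1:
            raise ValueError("truncated COBS block")
        out += block
        i += code
        if code < 0xFF and i < len(enc):
            out.append(0)         # implicit zero between blocks
    return bytes(out)
```

Note that every payload byte is shifted by at least one position relative to
its place in the original record, which fits the suggestion above that
alignment constraints may be why stuffing was not proposed.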

That said -- it seems an awful lot of effort, in an Internet where
both Ethernet-sized MTUs, and significantly larger alignment
constraints -- pagesizes of 4K, 8k, 16k or larger -- are common.

From vjs at calcite.rhyolite.com  Fri Mar 23 17:40:05 2001
From: vjs at calcite.rhyolite.com (Vernon Schryver)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
Message-ID: <200103240140.f2O1e5p20358@calcite.rhyolite.com>

> From: Jonathan Stone <jonathan@DSG.Stanford.EDU>

> >In other words, contrary to various claims, no black magic was required
> >to "page flip" on both input and output more than 10 years ago.
>
> Yes, provided that the MSS is a multiple of the pagesize, (or the
> sender rounded down to that), and that you already DMAed the packet
> into memory, aligned such that the TCP (or whatever) payload
> ended up page-aligned.
>
> That is, it works provided the receiver's guess about alignment and
> preceding header sizes pays off...

The proposal also doesn't work unless its somewhat similar guesses
pay off.  That the guesses are made explicit and put on the wire
doesn't change that problem.

I prefer things that just work by default to things that are explicit
but work just as (in)frequently.  But that's largely a matter of style.
I'm not a fan of having an on-the-wire protocol for every cloud on
a white board.  Many people disagree with me, as demonstrated by
enthusiasm for such as the PPP BOD protocol which can do and does
nothing except put on the wire what both peers either already know or
don't care about.


Vernon Schryver    vjs@rhyolite.com

From day at std.com  Fri Mar 23 17:50:03 2001
From: day at std.com (John Day)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
In-Reply-To: <200103232054.UAA16559@gra.isi.edu>
References: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <v04220802b6e1ab83cd59@[208.192.101.9]>

At 20:54 +0000 3/23/01, Bob Braden wrote:
>Hi.  At the IETF just completed, I sat through an exposition of
>the following Internet Draft:
>
>   	"Title		: ULP Framing for TCP
>	Author(s)	: J. Williams et al.
>	Filename	: draft-williams-tcpulpframe-01.txt
>	Pages		: 12
>	Date		: 22-Mar-01

Okay, I am back from dinner and have glanced through this proposal.

What I wrote before was sort of my general view of years of trying to 
figure out the stream vs record vs SDU thing.  Now I could imagine a 
completely independent shim layer above TCP that inserted some 
framing around the user's data and gave it to TCP and then simply 
just took the receiving stream of bytes as they came in in order and 
put together what had been sent and handed it to the application.  So 
I could imagine some good ways to do what Bob was talking about  in 
his note that would not affect TCP, and might provide a common 
procedure for applications.
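
A minimal sketch of such an independent shim, assuming a 4-byte big-endian length prefix per record (the prefix format and the names `frame` and `Deframer` are hypothetical, chosen only for illustration):

```python
import struct


def frame(record: bytes) -> bytes:
    """Prefix a record with its length (assumed 4-byte big-endian format)."""
    return struct.pack("!I", len(record)) + record


class Deframer:
    """Rebuilds records from an in-order byte stream, whatever the chunking."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk: bytes):
        """Add received bytes; return the list of records now complete."""
        self._buf += chunk
        records = []
        while len(self._buf) >= 4:
            (length,) = struct.unpack_from("!I", self._buf, 0)
            if len(self._buf) < 4 + length:
                break  # the rest of this record has not arrived yet
            records.append(bytes(self._buf[4:4 + length]))
            del self._buf[:4 + length]
        return records


stream = frame(b"hello") + frame(b"") + frame(b"world!")
deframer = Deframer()
out = []
for i in range(0, len(stream), 3):  # deliver in arbitrary 3-byte pieces
    out.extend(deframer.feed(stream[i:i + 3]))
print(out)
```

Nothing here depends on where TCP happens to cut its segments, which is the property that keeps the shim independent of TCP itself.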

BUT THIS ISN'T THAT!!!!!!!!!!!!!!

THIS is a real bad idea.  Just scanning it I saw many things that set 
off more than a few red flags.  Perhaps this weekend or early next 
week I can detail my problems with this.  But right now is not the 
time.  I have to get up at 0330 tomorrow morning and that is looking 
sooner than I like.

I'm with Bob on this one.  Although, I think he was too gentle in his objection.

Take care,
John


From Erik.Nordmark at eng.sun.com  Fri Mar 23 23:02:43 2001
From: Erik.Nordmark at eng.sun.com (Erik Nordmark)
Date: Thu Mar 25 11:59:36 2004
Subject: [e2e] TCP Framing
In-Reply-To: "Your message with ID" <200103232054.UAA16559@gra.isi.edu>
Message-ID: <Roam.SIMC.2.0.6.985417363.9198.nordmark@bebop.france>

As much as we might dislike the various middle boxes,
I wonder what would happen if one of these TCP connections
passed through a middle box. While many middleboxes tweak things
on a packet by packet basis, there might be some that are essentially
implemented as a read+write loop in application space, i.e.
the TCP segment boundaries would not be preserved.

Thus trying to make the TCP segment boundaries matter for the ULP
is treading into uncharted territory.
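
A toy model of the concern, assuming a relay that reads whatever has arrived and writes it back out in fixed-size pieces (the `relay` function and the 12-byte read size are made up for the example):

```python
def relay(segments, read_size):
    """Model a proxy built as a read+write loop: it sees only a byte stream
    and re-emits it in read_size pieces, ignoring the sender's boundaries."""
    stream = b"".join(segments)
    return [stream[i:i + read_size] for i in range(0, len(stream), read_size)]


sent = [b"RECORD-A", b"RECORD-B", b"RECORD-C"]  # one record per TCP segment
received = relay(sent, read_size=12)

same_bytes = b"".join(received) == b"".join(sent)
same_boundaries = received == sent
print(received, same_bytes, same_boundaries)
```

The byte stream arrives intact, so the relay is a perfectly valid TCP citizen, yet a receiver that expects one record per segment now sees records split across segments.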

   Erik


From conrad at joda.cis.temple.edu  Sat Mar 24 09:00:59 2001
From: conrad at joda.cis.temple.edu (Phillip Conrad)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: <5.0.2.1.2.20010323173039.02fd7e60@mail.reed.com>
References: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <5.0.2.1.2.20010324112843.039c5770@155.247.71.60>

At 05:40 PM 3/23/2001 -0500, David P. Reed wrote:
>Bob - I agree with you that having TCP control the framing underneath it 
>places a significant burden on future evolution of the TCP and IP.  You 
>are right, as well, about the intention of the TCP spec to avoid linking 
>segmentation to the sequence of calls - it was considered and dropped.
>
>And for that matter, I still don't see the need.  If your data should be 
>processable out of order, why not use multiple TCP connections, RTP, or 
>(gasp) UDP?  If the data needs to be processed in order, then framing can 
>be embedded in the data stream.

Without making any comment on the TCP framing question one way or the 
other, I think I can address David's last question ("why not 
use...?")  Let's take each of the proposed alternatives in turn:

>Why not use multiple TCP connections

   Two reasons: (1) fairness (2) slow start/congestion avoidance.

    Fairness: If I use "n" TCP connections for a single flow because I have
three logical streams that I want to be processed out-of-order with
respect to one another, then I am getting "n" times greater a share of the
bandwidth on congested links than I should reasonably be entitled to.
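
A back-of-the-envelope version of this argument, under the idealized assumption that a congested link ends up divided equally among all TCP connections crossing it (that assumption, the function name, and the numbers are illustrative only):

```python
from fractions import Fraction


def my_share(n, k):
    """Fraction of the bottleneck I receive when I open n connections and
    k competing connections share the link, assuming equal per-connection
    shares."""
    return Fraction(n, n + k)


k = 9  # competing connections from other users (made-up number)
one = my_share(1, k)
three = my_share(3, k)
print(one, three, three / one)
```

Under this model the gain is somewhat less than a factor of "n" and approaches "n" only as the number of competing connections grows.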

    Slow-start/congestion avoidance: if I have "n" TCP connections for my 
packet flow rather than one, there is no communication among them.  If one 
of my "n" TCP connections experiences a packet loss, then I should probably 
back off my sending rate on all three.

My expectation is that having "n" connections all independently doing 
slow-start/congestion avoidance to find an appropriate sending rate, would 
mean that each of the flows would converge to an appropriate sending rate 
more slowly, than if there were a single flow, with the result that the 
overall goodput of the network is reduced.   I may be wrong on this point...
sometimes intuition leads us astray.  If NS-2 work hasn't already been
done to investigate this point, it probably should be... but I'd be 
surprised if it hasn't already.  (If someone reading this message knows of 
such work, could they point it out?)

>RTP

    RTP, it seems to me, is widely misunderstood.   While RTP *does* 
contain some transport layer functionality (e.g. end-to-end delivery, 
sequence numbers, etc.) RTP is most definitively NOT a transport protocol 
in the sense that TCP, UDP, or SCTP are transport protocols.   Typically, 
RTP must be layered on top of one of those (TCP, UDP, SCTP, or something 
else in that category).   Thus it is a red herring in this discussion.

>or (gasp) UDP?

Apart from the issue of reliability, the main reason: flow 
control/TCP-friendly congestion control.  Applications without flow control 
don't work well, and those without TCP-friendly congestion control are 
"considered harmful".  Building TCP-friendliness *correctly* into an 
application built on top of UDP is a corner that many developers are 
inclined to cut.

In short, applications need a wide variety of qualities of service at the 
transport layer: total order, partial order, unordered service... reliable, 
unreliable, or something in between... but what ALL applications need to be 
good network citizens is flow-control and TCP-friendly congestion control.

To my way of thinking, this is why the time is right for SCTP, which offers 
a choice between reliable/ordered service, reliable/partially ordered 
service, and now has extensions under development for unreliable and 
partially reliable service as well.

More thinking on this topic can be found at: 
http://www.cis.udel.edu/~pconrad/thesis and in some tech reports at 
http://netlab.cis.temple.edu/techrpts.

Phill Conrad
Asst. Professor, CIS Dept., Temple University


From dpreed at reed.com  Sat Mar 24 11:54:58 2001
From: dpreed at reed.com (David P. Reed)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: <5.0.2.1.2.20010324112843.039c5770@155.247.71.60>
References: <5.0.2.1.2.20010323173039.02fd7e60@mail.reed.com>
 <200103232054.UAA16559@gra.isi.edu>
Message-ID: <5.0.2.1.2.20010324142828.02fd37a0@mail.reed.com>

At 12:00 PM 3/24/01 -0500, Phillip Conrad wrote:
>I think I can address David's last question ("why not use...?")  Let's 
>take each of the proposed alternatives in turn:
>
>>Why not use multiple TCP connections
>
>   Two reasons: (1) fairness (2) slow start/congestion avoidance.
>
>    Fairness: If I use "n" TCP connections for a single flow because I
> have three logical streams that I want to be processed out-of-order
> with respect to one another, then I am getting "n" times greater a share
> of the bandwidth on congested links than I should reasonably be entitled
> to.

Don't think this is actually true.  Packet drop rate on the shared link has
nothing to do with port numbers - even RED discriminates only on IP 
address.  Now ECN might cause one TCP to back off and another to back off 
less, but the stable state would seem to be the same, whether multiple TCP 
connections are used or not.  (some of the less end-to-endian notions of 
router fairness might give 3 TCP cnxns better service, by looking deeper 
into the packets).

>    Slow-start/congestion avoidance: if I have "n" TCP connections for my 
> packet flow rather than one, there is no communication among them.  If 
> one of my "n" TCP connections experiences a packet loss, then I should 
> probably back off my sending rate on all three.

I would think that if the total traffic between the two ends is divided 
among the "n" connections, that slow start would converge just as 
fast.  But it would be an interesting experiment.

>>RTP
>
>    RTP, it seems to me, is widely misunderstood.   While RTP *does* 
> contain some transport layer functionality (e.g. end-to-end delivery, 
> sequence numbers, etc.) RTP is most definitively NOT a transport protocol 
> in the sense that TCP, UDP, or SCTP are transport protocols.   Typically, 
> RTP must be layered on top of one of those (TCP, UDP, SCTP, or something 
> else in that category).   Thus it is a red herring in this discussion.

Don't agree. RTP on UDP is a transport.  UDP was invented (I was there) as 
sugaring of IP layer in order to allow a wide variety of experimental 
transport protocols, one key particular example being protocols like packet 
voice that didn't want reliable delivery, but instead timely delivery  that 
allowed the application to decide what to do with out-of-order and lost 
packets.  RTP is pretty much for that example.

>>or (gasp) UDP?
>
>Apart from the issue of reliability, the main reason: flow 
>control/TCP-friendly congestion control.  Applications without flow 
>control don't work well, and those without TCP-friendly congestion control 
>are "considered harmful".  Building TCP-friendliness *correctly* into an 
>application built on top of UDP is a corner that many developers are 
>inclined to cut.

One can create a TCP-friendly congestion-controlled protocol on top of UDP 
quite easily.  You just use the same mechanisms as TCP to control the 
aggregate volume of data, respond to the same signals of congestion 
(unack'ed packets from loss and RED, and ECN if capable).  That's what I meant.
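
A minimal sketch of what "the same mechanisms as TCP" could look like for a UDP-based sender, assuming a simple window counted in packets (the class, its method names, and the initial values are hypothetical; timers, retransmission, and the actual socket I/O are left out):

```python
class AimdWindow:
    """TCP-like congestion window in units of packets (illustrative only)."""

    def __init__(self, initial=1.0, ssthresh=64.0):
        self.cwnd = initial
        self.ssthresh = ssthresh

    def on_ack(self):
        """Grow the window when the peer acknowledges a packet."""
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start: +1 packet per ACK
        else:
            self.cwnd += 1.0 / self.cwnd  # avoidance: about +1 per round trip

    def on_congestion(self):
        """Halve the window on a detected loss or an ECN mark."""
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = self.ssthresh

    def may_send(self, in_flight):
        """True if another packet may be sent with in_flight unacknowledged."""
        return in_flight < int(self.cwnd)


w = AimdWindow(initial=1.0, ssthresh=8.0)
for _ in range(7):
    w.on_ack()
after_slow_start = w.cwnd
w.on_congestion()
print(after_slow_start, w.cwnd, w.ssthresh)
```

The application sends a datagram only while `may_send` is true, so the aggregate rate responds to the same congestion signals a TCP sender would see.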

>To my way of thinking, this is why the time is right for SCTP, which 
>offers a choice between reliable/ordered service, reliable/partially 
>ordered service, and now has extensions under development for unreliable 
>and partially reliable service as well.

Maybe.  But the problem with SCTP is that it is a "kitchen-sink" protocol, 
full of options and combinations of options that make it hard to test (for 
performance and so forth) in all its combinations.

I did such a protocol once - it was called DSP, and some of the old-timers 
like Vint, Bob Kahn, and Bob B. may remember that I was arguing for it to 
replace TCP, because it was more "general".
It did all of the things that SCTP seems to want to do, but was much 
simpler.  After much thought and debate, it became obvious that I was 
really ignoring my own "end-to-end argument" because most of the 
functionality was only there as a subroutine library for lazy application 
programmers, and we had no particular way to argue that the application 
needs were optimized by the design.  If so, then it should have been a 
subroutine library, not built into the "standard" - instead we created UDP 
to encourage experimentation in those domains - e.g. RTP, DNS, ...

SCTP may seem to do a lot, and it may be fun to deploy, but I'm 
conservative about throwing features into low level protocols until you can 
prove they are needed (not just wanted by folks who could experiment on top 
of UDP).

This fetish of opposing UDP is based on a falsehood - that somehow UDP 
protocols aren't TCP-friendly, or closed-loop congestion-controlling, by 
definition.  Some may be, but that's because no one has thought it through 
for them.


- David
--------------------------------------------
WWW Page: http://www.reed.com/dpr.html


From vjs at calcite.rhyolite.com  Sat Mar 24 13:00:47 2001
From: vjs at calcite.rhyolite.com (Vernon Schryver)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
Message-ID: <200103242100.f2OL0ld06539@calcite.rhyolite.com>

> From: "David P. Reed" <dpreed@reed.com>

> ...
> This fetish of opposing UDP is based on a falsehood - that somehow UDP 
> protocols aren't TCP-friendly, or closed-loop congestion-controlling, by 
> definition.  Some may be, but that's because no one has thought it through 
> for them.

I think UDP is resisted for reasons like those that give CSMA/CD
a bad name.  People misunderstand "collision" as something bad that
breaks bits or at least uses vast quantities of bandwidth much as
a collision on a freeway causes traffic jams in both directions.
UDP is misunderstood as a bad thing because its acronym is often
expanded as the "unreliable datagram protocol" or at best as the
"unreliable user datagram protocol."  They hear "unreliable" and
think it's not for the precious data of their wonderful application.

For proof, ask http://www.google.com/ about "unreliable datagram protocol"
(with or without the double-quotes).


Vernon Schryver    vjs@rhyolite.com

From jnc at ginger.lcs.mit.edu  Sat Mar 24 14:51:32 2001
From: jnc at ginger.lcs.mit.edu (J. Noel Chiappa)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
Message-ID: <200103242251.RAA07976@ginger.lcs.mit.edu>

    > From: RJ Atkinson <rja@inet.org>

    > if the TCP checksum were bound to some form of host identifier other
    > than the IP address, TCP could provide that particular property quite
    > nicely. Obviously I'm influenced by conversations with jnc on this
    > particular point. It would be an interesting exercise to work out the
    > details for such an approach.

As part of the NSRG work, I have a basically-done I-D which worked out in
some detail how to do it. (I.e. down to specifying the mechanics of how you
do an upwardly compatible change to the TCP checksum, the format of a TCP
option to carry the host identifier in the SYN packet, etc, etc.)

It turns out you can do it with zero extra overhead in both (packet) space
and processing time - if you keep the same checksum algorithm - more if you
upgrade to a better one, as I recall Dave Reed wanted to do originally :-).
(In fact, a tiny bit less computing, if you're recomputing the partial
checksum of the pseudo-header at the moment.)

If people are interested, I can put it up on the web somewhere, or even
(gasp) turn it in as an I-D.

	Noel

From djw1005 at cam.ac.uk  Sat Mar 24 16:22:19 2001
From: djw1005 at cam.ac.uk (Damon Wischik)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: <5.0.2.1.2.20010324112843.039c5770@155.247.71.60>
Message-ID: <Pine.SOL.3.96.1010325001454.10815A-100000@libra.cus.cam.ac.uk>

Phillip Conrad wrote:
> David P. Reed wrote:
> >Why not use multiple TCP connections
> 
> Two reasons: (1) fairness (2) slow start/congestion avoidance.
> Fairness: If I use "n" TCP connections for a single flow because I have
> three logical streams that I want to be processed out-of-order with
> respect to one another, then I am getting "n" times greater a share of
> the bandwidth on congested links that I should reasonably be entitled
> to. 

This begs the question: what are you reasonably entitled to?

If you have three logically separate streams which can be processed
out-of-order, I would have thought there is a case to be made that those
are three essentially independent streams (which just happen to be between
the same end-nodes), and so together they deserve three times the
bandwidth of a single stream. 

Damon Wischik.


From cannara at attglobal.net  Sat Mar 24 22:44:08 2001
From: cannara at attglobal.net (Cannara)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <200103242100.f2OL0ld06539@calcite.rhyolite.com>
Message-ID: <3ABD93B8.E87D26D9@attglobal.net>

Just an added note -- "unreliable" is redundant in UDP.  "Datagram" means
unreliable, best-effort packet service, which is why Xerox XNS termed its
equivalent to IP as "IDP", for "Internet Datagram Protocol".  Further, UDP
isn't completely unreliable, if a packet gets received, since the data are
checksummed.

Alex


Vernon Schryver wrote:
> 
> > From: "David P. Reed" <dpreed@reed.com>
> 
> > ...
> > This fetish of opposing UDP is based on a falsehood - that somehow UDP
> > protocols aren't TCP-friendly, or closed-loop congestion-controlling, by
> > definition.  Some may be, but that's because no one has thought it through
> > for them.
> 
> I think UDP is resisted for reasons like those that give CSMA/CD
> a bad name.  People misunderstand "collision" as something bad that
> breaks bits or at least uses vast quantities of bandwidth much as
> a collision on a freeway causes traffic jams in both directions.
> UDP is misunderstood as a bad thing because its acronym is often
> expanded as the "unreliable datagram protocol" or at best as the
> "unreliable user datagram protocol."  They hear "unreliable" and
> think it's not for the precious data of their wonderful application.
> 
> For proof, ask http://www.google.com/ about "unreliable datagram protocol"
> (with or without the double-quotes).
> 
> Vernon Schryver    vjs@rhyolite.com

From craig at aland.bbn.com  Sun Mar 25 09:00:39 2001
From: craig at aland.bbn.com (Craig Partridge)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing 
In-Reply-To: Your message of "Sat, 24 Mar 2001 22:44:08 PST."
             <3ABD93B8.E87D26D9@attglobal.net> 
Message-ID: <200103251700.f2PH0dZ51050@aland.bbn.com>

In message <3ABD93B8.E87D26D9@attglobal.net>, Cannara writes:

>Just an added note -- "unreliable" is redundant in UDP.

That's why it isn't in UDP's name.  From the RFC index:

    0768 User Datagram Protocol. J. Postel. Aug-28-1980. (Format: TXT=5896
     bytes) (Also STD0006) (Status: STANDARD)

Craig

From dotis at sanlight.net  Sun Mar 25 15:53:14 2001
From: dotis at sanlight.net (Douglas Otis)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <NEBBJGDMMLHHCIKHGBEJIEKJCFAA.dotis@sanlight.net>

Bob,

You are in effect changing the wire specifications of TCP by insisting on
the payload being bound to the TCP frame.  This is a change to the wire
specifications in that a middle box is likely to re-position these bytes
into a non-compatible form forcing a software intervention likely to break
your intended application.  TCP framing is NOT an interim fix if indeed it
is intended to be placed into hardware to perform content directed placement
of data.  As such hardware will not be able to cope with non-aligned data,
you are placing a new requirement on the wire format; that being byte
placement with respect to TCP frames.

At the same time you attempt to implement a major modification to the TCP
API while viewing this modification as unrelated to the TCP standard, there
is already an API/Wire Format that provides the exact features that you
desire that is documented and agreed upon. It is called RFC 2960 or SCTP.
This RFC does the same function as this modified version of TCP is hoped to
do.  The real desire is not for an interim version pending deployment of
SCTP, as anyone knows, protocols built into hardware tend to live much
longer than a short period of time as you allude.  In reality, you are not
happy with SCTP and do not want to use it as it is likely adding features
you do not want to deal with.

What are those features of SCTP that make it hard to deal with I wonder?  Is
it the stream ID that allows multiple independent flows?  A feature surely
to be a boon to hardware implementations as this then requires only a single
flow control for many independent streams.  Is it the multi-homing feature?
Also a boon to those wishing hardware to provide highly reliable
connections.  Is it the ability to prevent spoofing, or DoS attacks?
Perhaps it is the ability of SCTP to identify payloads unlike TCP.  Sorry,
but any framed version of TCP you create will look like a hack compared to
SCTP with its highly desired features.  Please, do not tell me SCTP is too
hard to implement in hardware and only a mangled version of TCP is something
you are willing to attempt.  SCTP will be in place years before your mangled
version of TCP is even seriously considered.

Instead, this is a prelude to something similar to Modem standard wars where
manufacturers either could not wait or could not agree to standards.  I think
before you reject SCTP out of hand, you should make some effort to explain
why a record based protocol does not suit your needs and yet only a modified
byte stream protocol does.  Should I desire to attack your adapter, all I
would need to do is to provide you with non-aligned data, something that
anyone will agree is a valid TCP data stream.  Think twice before using PPP
or IPsec with your framing equipment.  This framing is likely to be the
worst thing ever inflicted upon TCP as it corrupts wire specifications and
APIs.

Doug

> -----Original Message-----
> From: end2end-interest-admin@postel.org
> [mailto:end2end-interest-admin@postel.org]On Behalf Of Bob Braden
> Sent: Friday, March 23, 2001 12:54 PM
> To: end2end-interest@postel.org
> Cc: braden@ISI.EDU
> Subject: [e2e] TCP Framing
>
>
>


From craig at aland.bbn.com  Mon Mar 26 05:39:28 2001
From: craig at aland.bbn.com (Craig Partridge)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] draft on IP Fast Option Lookup 
In-Reply-To: Your message of "Fri, 23 Mar 2001 11:27:56 MST."
             <200103231827.f2NIRuX12662@calcite.rhyolite.com> 
Message-ID: <200103261339.f2QDdSZ54143@aland.bbn.com>

In message <200103231827.f2NIRuX12662@calcite.rhyolite.com>, Vernon Schryver writes:

>The weaknesses in the RFC's that tried to instruct how to compute
>the TCP checksum faster are classic examples of that syndrome.

Really -- my recollection is that those of us who wrote those RFCs actually
had done checksum work.  We just managed to completely bungle writing
down some of the details.  (Which is unfortunate but tars us with a different
brush...)

Craig

From craig at aland.bbn.com  Mon Mar 26 05:46:00 2001
From: craig at aland.bbn.com (Craig Partridge)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing 
In-Reply-To: Your message of "Fri, 23 Mar 2001 15:29:20 PST."
             <Pine.LNX.4.21.0103231457510.7855-100000@pescado.lanl.gov> 
Message-ID: <200103261346.f2QDk0Z54200@aland.bbn.com>

In message <Pine.LNX.4.21.0103231457510.7855-100000@pescado.lanl.gov>, Mike Fisk writes:

>I assume the argument is that it is inefficient to scan and twiddle bytes
>and that some out-of-band (ala packet segmentation) framing looks cheaper.  

COBS is a very efficient byte stuffing that doesn't require much byte
scanning.  If you're asking the question, you might go look at Cheshire's
SIGCOMM paper and see how COBS might fit.
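
For readers who have not seen it, here is a small sketch of COBS following the usual description of Consistent Overhead Byte Stuffing (this is an illustrative Python rendering, not code from the paper; the function names are made up). Each code byte gives the distance to the next zero, so the encoded record contains no zero bytes and a zero can serve as the record delimiter:

```python
def cobs_encode(data: bytes) -> bytes:
    """Encode data so that the result contains no zero bytes."""
    out = bytearray([0])  # placeholder for the first code byte
    code_idx, code = 0, 1
    for b in data:
        if b == 0:
            out[code_idx] = code      # close the block at this zero
            code_idx = len(out)
            out.append(0)
            code = 1
        else:
            out.append(b)
            code += 1
            if code == 0xFF:          # block full: 254 non-zero bytes
                out[code_idx] = code
                code_idx = len(out)
                out.append(0)
                code = 1
    out[code_idx] = code
    return bytes(out)


def cobs_decode(enc: bytes) -> bytes:
    """Invert cobs_encode; raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(enc):
        code = enc[i]
        block = enc[i + 1:i + code]
        if code == 0 or len(block) != code - 1 or 0 in block:
            raise ValueError("malformed COBS data")
        out += block
        i += code
        if code < 0xFF and i < len(enc):
            out.append(0)             # the zero this code byte stood for
    return bytes(out)


records = [b"\x11\x22\x00\x33", b"", b"\x00", bytes(range(1, 255))]
wire = b"\x00".join(cobs_encode(r) for r in records)   # zero-delimited
decoded = [cobs_decode(part) for part in wire.split(b"\x00")]
print(decoded == records)
```

Both directions are a single pass over the data with no lookahead, which is the sense in which little byte scanning is needed.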

Craig

From RShankar at Novell.COM  Mon Mar 26 08:04:44 2001
From: RShankar at Novell.COM (Ramesh Shankar)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <Pine.SOL.3.96.1010325001454.10815A-100000@libra.cus.cam.ac.uk>
Message-ID: <3ABF689C.6060305@Novell.COM>

The fairness issue is an interesting angle and seems relevant only when 
bandwidth is really limited or from an ISP perspective (perhaps). This 
angle is similar to the "fair share scheduling" approach used in time 
sharing UNIX systems. This issue has been discussed in the following 
Ph.D. thesis:

V. N. Padmanabhan
Ph.D. Dissertation
Computer Science Division, University of California at Berkeley, USA
September 1998
(Also published as Technical Report UCB/CSD-98-1016.)

http://www.research.microsoft.com/~padmanab/phd-thesis.html

I am not a researcher to be able to make authoritative statements, but I 
felt that just like the FSS concept is no longer relevant in today's
systems, the fairness issue is perhaps not so relevant. I have been 
curious to understand this issue and perhaps someone can throw some 
light on this.

Thanks,

S.R.

Damon Wischik wrote:

> Phillip Conrad wrote:
> 
>> David P. Reed wrote:
>> 
>>> Why not use multiple TCP connections
>> 
>> Two reasons: (1) fairness (2) slow start/congestion avoidance.
>> Fairness: If I use "n" TCP connections for a single flow because I have
>> three logical streams that I want to be processed out-of-order with
>> respect to one another, then I am getting "n" times greater a share of
>> the bandwidth on congested links that I should reasonably be entitled
>> to. 
> 
> 
> This begs the question: what are you reasonably entitled to?
> 
> If you have three logically separate streams which can be processed
> out-of-order, I would have thought there is a case to be made that those
> are three essentially independent streams (which just happen to be between
> the same end-nodes), and so together they deserve three times the
> bandwidth of a single stream. 
> 
> Damon Wischik.


From cannara at attglobal.net  Mon Mar 26 09:01:45 2001
From: cannara at attglobal.net (Cannara)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <200103251700.f2PH0dZ51050@aland.bbn.com>
Message-ID: <3ABF75F9.16C5209A@attglobal.net>

Craig, this has been a common test for years, to see how old a "network-
knowledgeable" student is.  Ask them what UDP means.  Prior to the interesting
RFC Jeremy produced the "U" stood for just what it stands for in all other
families of protocols that have datagram services -- "unreliable".  Somehow
some Internet folks seemed to become sensitive, almost ashamed, of that very
accurate and truthful engineering label, and turned to seek a "u"-word that
had marketability.  I've yet to meet a user who knowingly "uses" a datagram
protocol.  You're younger than I thought!

Alex


Craig Partridge wrote:
> 
> In message <3ABD93B8.E87D26D9@attglobal.net>, Cannara writes:
> 
> >Just an added note -- "unreliable" is redundant in UDP.
> 
> That's why it isn't in UDP's name.  From the RFC index:
> 
>     0768 User Datagram Protocol. J. Postel. Aug-28-1980. (Format: TXT=5896
>      bytes) (Also STD0006) (Status: STANDARD)
> 
> Craig

From jim.williams at emulex.com  Mon Mar 26 10:43:32 2001
From: jim.williams at emulex.com (Jim Williams)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <200103232054.UAA16559@gra.isi.edu>
Message-ID: <00f101c0b624$a8620d50$710e10ac@giganet.com>

----- Original Message ----- 
From: "Bob Braden" <braden@ISI.EDU>
To: <end2end-interest@postel.org>
Cc: <braden@ISI.EDU>
Sent: Friday, March 23, 2001 3:54 PM
Subject: [e2e] TCP Framing

>Hi.  At the IETF just completed, I sat through an exposition of
>the following Internet Draft:
>
>   "Title  : ULP Framing for TCP
> Author(s) : J. Williams et al.
> Filename : draft-williams-tcpulpframe-01.txt
> Pages  : 12
> Date  : 22-Mar-01
> 
>    This document proposes a framing protocol for TCP which is designed
>    to be fully compliant with applicable TCP RFC's and fully
>    interoperable with existing TCP implementations. The framing
>    mechanism is designed to work as a 'shim' between TCP and higher-
>    level protocols, preserving the reliable, in-order delivery of TCP
>    while adding the preservation of higher-level protocol record
>    boundaries if the record is less than or equal to the path MTU. The
>    shim is designed to enable hardware acceleration of data movement
>    operations (e.g. direct placement of receive TCP segments into
>    higher-level protocol buffers) for the protocols that use it, even
>    if TCP segments are delivered out-of-order."
>
>I would like to suggest two things about this, one simple and one
>subtle.  The simple one is this: to say that the ULP framing is fully
>compliant with the applicable TCP RFCs is simply false.  For some of
>us, at least, such a lack of truth in technical advertising is a red
>flag.

I hope you are not attacking the honesty of the authors.  I may well be
the most intellectually dishonest scoundrel to ever roam the internet,
but I can assure you that the other authors are fine, honest, upstanding
people who would not let me get away with anything underhanded. :-)

More seriously, many alternatives had been considered which defined new
TCP options or defined currently reserved TCP header bits.  The point
being that the submitted proposal does not do any of those things,
which leads to the claim of full compliance with existing RFCs.


>The reason why it is false, and its consequences, form the subtle bit.
>It is true that the proposed shim does not change the definition of the
>TCP protocol on the wire.  However, it does change a more fundamental
>principle of TCP, which is the deliberate decoupling of what happens on
>the wire from what the user sends.  (See the following sentences from
>RFC 793, for example:
>
>    The TCP is able to transfer a continuous stream of octets in each
>    direction between its users by packaging some number of octets into
>    segments for transmission through the internet system.  In general,
>    the TCPs decide when to block and forward data at their own
>    convenience.

These sentences don't seem to support your point.  Stating what
TCPs are able to do, or what they generally do, hardly indicates
what they MUST do.  It seems inconsistent to state on one hand
that APIs are outside the scope of the TCP specification, and
on the other hand claim that a particular implementation is
non-compliant because the API doesn't map to the wire in a
way that suits your liking.

The core of your objections may be that the framing proposal
uses TCP in a way different from what was originally intended.
I would agree with this.  My view is that the point of standards
compliance is interoperability, not "original intent".  If two
consenting endpoints want to violate "original intent", that
should be fine as long as they follow all the rules.

>The last sentence may be phrased in a slightly academic manner;
>the reader is assumed to understand that the "convenience" of the
>transport layer is to provide optimal performance.  In an earlier
>paragraph the spec says:
>
>    TCP is designed to work in a very general environment of
>    interconnected networks.
>
>Now for the subtle bit.  Generality and optimization are typically
>contradictory.  The Internet protocol suite was designed deliberately
>and carefully for generality, at the possible expense of optimization.
>It was also designed for simplicity at the expense of optimization.
>We recognized that later engineering efforts would rob some of the
>simplicity in order to reach greater optimization, and indeed,
>this has happened and is probably not a bad thing.  On the other
>hand, we should be very wary of over-engineering optimal solutions
>that cut down the generality.

These types of arguments tend to be more philosophical 
than technical, and probably the best that can be done
is to clearly state the differences in point of view without 
presuming to resolve it one way or the other.

There are certain things that should be decided
by standards organizations and certain things that should be
decided by the market place.  My view is that the best standards
are those that allow the optimization versus generality tradeoffs
to be resolved by the market place while still ensuring full
interoperability of the various competing design points.


From jnc at ginger.lcs.mit.edu  Mon Mar 26 11:24:12 2001
From: jnc at ginger.lcs.mit.edu (J. Noel Chiappa)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
Message-ID: <200103261924.OAA11530@ginger.lcs.mit.edu>

    > From: Cannara <cannara@attglobal.net>

    > Craig, this has been a common test for years, to see how old a
    > "network-knowledgeable" student is. Ask them what UDP means. Prior to
    > the interesting RFC Jeremy [sic - JNC] produced .. You're younger than
    > I thought!

Well, this is kind of pointless, but in the name of historical accuracy:

Actually, those of us who are *really* "older" will remember that UDP was
*actually* done by Dave Reed (whose name appears nowhere on RFC-768, alas).
RFC-768 is simply a re-packaging of IEN-71, "User Datagram Protocol", by Dave
Reed.

The reason I recall this is that I seem to recall that Dave discussed the
design with me, and I've ever since had this bit set that we screwed up the
"no-checksum" value. (It should have been all-1's, since, as the checksum is
the 1's complement of the 1's-complement sum, and as no sequence of numbers
[except all 0's, which you never see in a real packet] can be
1's-complement summed to 0, all-1's is the value you can never get - and
thus should have been the "no checksum" value. Making it 0 requires a check
of the complement of the sum, and inversion if it's 0, on all packets.)
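
The checksum argument above can be illustrated with a short Python sketch.
It is only an illustration under stated assumptions: `data` stands for all
the bytes the checksum covers (a real UDP checksum also covers a
pseudo-header), and the function names are hypothetical, not taken from any
RFC or library.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of `data`, with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def udp_checksum_field(data: bytes) -> int:
    """Value placed in the checksum field when 0 means 'no checksum'."""
    checksum = ~ones_complement_sum(data) & 0xFFFF
    # The extra per-packet test described above: a computed 0 has to be
    # sent as all-1's, because 0 is reserved for "no checksum".
    return 0xFFFF if checksum == 0 else checksum


if __name__ == "__main__":
    # The sum of these words is 0xFFFF, so the complement is 0 and must be inverted.
    print(hex(udp_checksum_field(b"\x00\x01\xff\xfe")))
```

The sum itself never comes out as 0 unless every word is 0, so its
complement never comes out as all-1's for real data; that is why all-1's
would have been a free "no checksum" marker with no extra test.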

Perhaps Dave Reed will correct me if my memory is wrong?

As to why it now bears Jon's name, that was because he was editing all the
TCP/IP standards documents (IP, TCP, ICMP etc), and he edited the UDP
document to be part of the set.


As for the protocol's name, it was quite deliberately chosen to be "User". At
that point, the only user-accessible service was TCP. There were a number of
theorized services, including host-name lookup, which didn't want a
full-blown bi-directional stream connection. UDP - allowing "users" access to
a datagram protocol - was the answer.

It deliberately wasn't made reliable i) to allow its use by applications which
didn't care about reliability (we'd had experience of this problem, with
packet voice and TCP, where the enforced reliability got in the way of the
application), and ii) to keep it simple.

I believe Name Resolution (IEN-116, I think - not DNS, this was long before
DNS - it allowed a client with no disk, such as a terminal server, to use
hostnames, by querying a time-sharing machine which had a copy of
HOSTS.TXT to convert them to IP addresses) was the first service defined to
run on UDP.

	Noel

From braden at ISI.EDU  Mon Mar 26 11:33:49 2001
From: braden at ISI.EDU (Bob Braden)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
Message-ID: <200103261933.TAA01019@gra.isi.edu>

  *> 
  *> There are certain things that should be decided
  *> by standards organizations and certain things that should be
  *> decided by the market place.  My view is that the best standards
  *> are those that allow the optimization versus generality tradeoffs
  *> to be resolved by the market place while still ensuring full
  *> interoperability of the various competing design points.
  *> 
  *> 

Jim,

I would suggest that the marketplace is most specifically a poor place
to make wise high-level technical decisions.  One could make the case
that TCP/IP has been so successful just because it was allowed to
mature in military and academic environments that shielded it from
irrelevant marketplace pressures for many years.  X.25 is a good
example of a technology that did not have that advantage.  There are
also XNS, WAP, VHS, and lots of other examples of market-driven
entries.

The marketplace is concerned only with optimization, since it is
necessarily very short-term in its outlook.  Any "optimization vs.
generality tradeoff" performed by the marketplace will certainly end up
with generality getting the short end of the stick.  Of course,
generality is in the long-range best interest of the marketplace, but
the marketplace itself is like a 5 year old child, incapable of seeing
its long-range best interest.  This is why there are grown-ups in the
world.

Bob Braden


From lixia at CS.UCLA.EDU  Mon Mar 26 12:16:36 2001
From: lixia at CS.UCLA.EDU (Lixia Zhang)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: <200103261933.TAA01019@gra.isi.edu> from Bob Braden at "Mar 26,
 2001 07:33:49 pm"
Message-ID: <200103262016.MAA23990@aurora.cs.ucla.edu>

> Jim,
> 
> I would suggest that the marketplace is most specifically a poor place
> to make wise high-level technical decisions.  One could make the case
> that TCP/IP has been so successful just because it was allowed to
> mature in military and academic environments that shielded it from
> irrelevant marketplace pressures for many years.  X.25 is a good
> example of a technology that did not have that advantage.  There are
> also XNS, WAP, VHS, and lots of other examples of market-driven
> entries.

I beg to exclude XNS from the rest of the "market-driven" entries.

Lixia
(unrelated to the fact that I worked for Xerox for 7 years)

From dpreed at reed.com  Mon Mar 26 11:54:29 2001
From: dpreed at reed.com (David P. Reed)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: <3ABF75F9.16C5209A@attglobal.net>
References: <200103251700.f2PH0dZ51050@aland.bbn.com>
Message-ID: <5.0.2.1.2.20010326141852.023a3310@mail.reed.com>

At 09:01 AM 3/26/01 -0800, Cannara wrote:
>Craig, this has been a common test for years, to see how old a "network-
>knowledgeable" student is.  Ask them what UDP means.  Prior to the interesting
>RFC Jeremy produced the "U" stood for just what it stands for in all other
>families of protocols that have datagram services -- "unreliable".  Somehow
>some Internet folks seemed to become sensitive, almost ashamed, of that very
>accurate and truthful engineering label, and turned to seek a "u"-word that
>had marketability.  I've yet to meet a user who knowingly "uses" a datagram
>protocol.  You're younger than I thought!

Alex - Craig may be young, but then I must be ancient at only 49.  Anyway, 
I was there at the meeting where we created UDP (and split TCP into the TCP 
and IP layers), in Marina del Rey in winter '77/'78.  We called it the 
"User Datagram Protocol" from the first, and the reason was to distinguish 
it from the IP layer, which was the "datagram protocol" not well tuned for 
users, since you couldn't demux sensibly on the "protocol" field to the 
correct "user process" aka "application program instance".  (I won't bore 
you with the radical idea that we had tried to force into TCP of using a 
64-bit process-specific address in IP, rather than a machine specific 
address - but memory cost a few pennies per bit then, so we were viewed as 
dangerously profligate).

Now there may have been some in the years after that who called it 
"Unreliable ...", but I'd suggest that only those who had fought against 
the idea that a base datagram function was useful would have stooped to 
that kind of propaganda.  Those of us who fought for a datagram protocol 
(the PARC people, Danny Cohen and the speech people, and the LAN people 
like me) used the term "best efforts", not "unreliable", to describe the 
delivery reliability of IP and UDP.

- David

--------------------------------------------
WWW Page: http://www.reed.com/dpr.html


From touch at ISI.EDU  Mon Mar 26 13:03:06 2001
From: touch at ISI.EDU (Joe Touch)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <Pine.SOL.3.96.1010325001454.10815A-100000@libra.cus.cam.ac.uk> <3ABF689C.6060305@Novell.COM>
Message-ID: <3ABFAE8A.EC47C9F2@isi.edu>


Ramesh Shankar wrote:
> 
> The fairness issue is an interesting angle and seems relevant only when
> bandwidth is really limited or from an ISP perspective (perhaps). This
> angle is similar to the "fair share scheduling" approach used in time
> sharing UNIX systems. This issue has been discussed in the following
> Ph.D. thesis:
> 
> V. N. Padmanabhan
> Ph.D. Dissertation
> Computer Science Division, University of California at Berkeley, USA
> September 1998
> (Also published as Technical Report UCB/CSD-98-1016.)
> 
> http://www.research.microsoft.com/~padmanab/phd-thesis.html

FWIW, RFC2140 (April 1997) speaks directly to the issue of how sharing
is compliant with TCP and is an extension of T/TCP concepts.

Fairness can be completely decoupled from the number of connections
between two hosts. 

Joe

> >> David P. Reed wrote:
> >>
> >>> Why not use multiple TCP connections
> >>
> >> Two reasons: (1) fairness (2) slow start/congestion avoidance.
> >> Fairness: If I use "n" TCP connections for a single flow because I have
> >> three logical streams that I want to be processed out-of-order with
> >> respect to one another, then I am getting "n" times greater a share of
> >> the bandwidth on congested links than I should reasonably be entitled
> >> to.
> >
> >
> > This begs the question: what are you reasonably entitled to?
> >
> > If you have three logically separate streams which can be processed
> > out-of-order, I would have thought there is a case to be made that those
> > are three essentially independent streams (which just happen to be between
> > the same end-nodes), and so together they deserve three times the
> > bandwidth of a single stream.
> >
> > Damon Wischik.

From touch at ISI.EDU  Mon Mar 26 13:13:18 2001
From: touch at ISI.EDU (Joe Touch)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <200103232054.UAA16559@gra.isi.edu> <00f101c0b624$a8620d50$710e10ac@giganet.com>
Message-ID: <3ABFB0EE.832AF0E5@isi.edu>


Jim Williams wrote:
> 
> ----- Original Message -----
> From: "Bob Braden" <braden@ISI.EDU>
> To: <end2end-interest@postel.org>
> Cc: <braden@ISI.EDU>
> Sent: Friday, March 23, 2001 3:54 PM
> Subject: [e2e] TCP Framing
> 
> >Hi.  At the IETF just completed, I sat through an exposition of
> >the following Internet Draft:
> >
> >   "Title  : ULP Framing for TCP
> > Author(s) : J. Williams et al.
> > Filename : draft-williams-tcpulpframe-01.txt
> > Pages  : 12
> > Date  : 22-Mar-01
> >
> >    This document proposes a framing protocol for TCP which is designed
> >    to be fully compliant with applicable TCP RFC's and fully
> >    interoperable with existing TCP implementations. The framing
> >    mechanism is designed to work as a 'shim' between TCP and higher-
> >    level protocols, preserving the reliable, in-order delivery of TCP
> >    while adding the preservation of higher-level protocol record
> >    boundaries if the record is less than or equal to the path MTU. The
> >    shim is designed to enable hardware acceleration of data movement
> >    operations (e.g. direct placement of receive TCP segments into
> >    higher-level protocol buffers) for the protocols that use it, even
> >    if TCP segments are delivered out-of-order."
> >
> >I would like to suggest two things about this, one simple and one
> >subtle.  The simple one is this: to say that the ULP framing is fully
> >compliant with the applicable TCP RFCs is simply false.  For some of
> >us, at least, such a lack of truth in technical advertising is a red
> >flag.
> 
> I hope you are not attacking the honesty of the authors.  I may well be
> the most intellectually dishonest scoundrel to ever roam the internet,
> but I can assure you that the other authors are fine, honest, upstanding
> people who would not let me get away with anything underhanded. :-)
> 
> More seriously, many alternatives had been considered which defined new
> TCP options or defined currently reserved TCP header bits.  The point
> being that the submitted proposal does not do any of those things,
> which leads to the claim of full compliance with existing RFCs.

My primary concern is that this appears to be a stopgap measure until
SCTP is available.
	
Stopgap modifications to widely-deployed protocols (e.g., TCP), even
optional ones, should be considered only very hesitantly.

As a stopgap, it might be sufficient to create a new "protocol" which
happens to be based on a TCP implementation with the addition of record
boundary enforcement, as a new (and somewhat temporary) protocol.
Backward compatibility can be achieved by having the server sit on BOTH
protocol ports - conventional TCP and this new
enhanced-reliable-record-transport.
	
This allows implementers to leverage the current base of
silicon-friendly TCP implementations with somewhat minor modifications.

-----

The concern with having even an optional modification to the TCP API is
that it can creep into the assumptions of the default API. I prefer the
freedom of the existing decoupling; anything that even implicitly
endorses an optional modification to that API is sliding down the path
to a true modification. Given the ephemeral nature of this proposed
modification, that seems premature.

Joe

From mfisk at lanl.gov  Mon Mar 26 21:21:09 2001
From: mfisk at lanl.gov (Mike Fisk)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing 
In-Reply-To: <200103261346.f2QDk0Z54200@aland.bbn.com>
Message-ID: <Pine.LNX.4.21.0103262114270.2003-100000@pescado.lanl.gov>

My message was misunderstood; I'm familiar with COBS.  I was attempting to
ask a leading question of the authors of the draft and other supporters of
similar proposals.  I was hoping that they could explain why _they_ don't
feel that byte stuffing is an appropriate solution.  To date, I haven't
heard any credible arguments about why byte-stuffing wouldn't be
sufficient.
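
For readers who have not met it, byte stuffing in the COBS style can be
sketched in a few lines of Python. This is a minimal sketch of the basic
COBS idea (each zero byte is replaced by a count of the bytes up to the next
zero), not code from Cheshire's paper or from the draft; `frame_records` and
`split_records` are hypothetical helper names used only here.

```python
def cobs_encode(data: bytes) -> bytes:
    """Encode `data` so that the output contains no zero bytes."""
    out = bytearray()
    block = bytearray()
    for byte in data:
        if byte == 0:
            out.append(len(block) + 1)  # code byte: distance to the zero
            out += block
            block.clear()
        else:
            block.append(byte)
            if len(block) == 254:       # longest run one code byte can describe
                out.append(0xFF)
                out += block
                block.clear()
    out.append(len(block) + 1)
    out += block
    return bytes(out)


def cobs_decode(encoded: bytes) -> bytes:
    """Invert cobs_encode; raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        block = encoded[i + 1:i + code]
        if code == 0 or len(block) != code - 1 or 0 in block:
            raise ValueError("malformed COBS data")
        out += block
        i += code
        if code < 0xFF and i < len(encoded):
            out.append(0)               # the zero this code byte stood for
    return bytes(out)


def frame_records(records):
    """Put records on a byte stream, each terminated by a single zero byte."""
    return b"".join(cobs_encode(r) + b"\x00" for r in records)


def split_records(stream: bytes):
    """Recover the records by scanning the stream for zero delimiters."""
    return [cobs_decode(frame) for frame in stream.split(b"\x00")[:-1]]
```

Because an encoded record never contains a zero byte, a receiver can find
record boundaries anywhere in the stream by looking for the next zero, at a
cost of roughly one extra byte per 254 bytes of payload.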

On Mon, 26 Mar 2001, Craig Partridge wrote:

> 
> In message <Pine.LNX.4.21.0103231457510.7855-100000@pescado.lanl.gov>,
> Mike Fisk writes:
> 
> >I assume the argument is that it is inefficient to scan and twiddle bytes
> >and that some out-of-band (ala packet segmentation) framing looks cheaper.  
> 
> COBS is a very efficient byte stuffing that doesn't require much byte
> scanning.  If you're asking the question, you might go look at Cheshire's
> SIGCOMM paper and see how COBS might fit.
> 
> Craig
> 

-- 
Mike Fisk, RADIANT Team, Network Engineering Group, Los Alamos National Lab
See http://home.lanl.gov/mfisk/ for contact information


From cannara at attglobal.net  Tue Mar 27 01:31:43 2001
From: cannara at attglobal.net (Cannara)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <5.0.2.1.2.20010323173039.02fd7e60@mail.reed.com>
		 <200103232054.UAA16559@gra.isi.edu> <5.0.2.1.2.20010324142828.02fd37a0@mail.reed.com>
Message-ID: <3AC05DFF.C7030051@attglobal.net>

Actually, with current network processors (e.g., Vitesse, IBM, PMCC, Intel...)
flows are queued and can be classified for RED or other QoS purposes by
5-tuples, which include ports.  This is quite logical, since a conversation on
one port pair, especially to a common system (e.g., server) will rightly
deserve differing flow treatment from other port pairs.  Loss probability
under RED then can vary across connections between individual IP pairs.
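
The difference between classifying on the 5-tuple and classifying on the
address pair alone can be sketched as follows. This is only an illustration
of flow-key granularity in Python: the `Packet` type and the key functions
are hypothetical, and nothing here models RED or any particular network
processor.

```python
from typing import Callable, Hashable, Iterable, NamedTuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def five_tuple_key(packet: Packet) -> Hashable:
    """Classify on addresses, protocol, and both ports."""
    return tuple(packet)


def address_pair_key(packet: Packet) -> Hashable:
    """Classify on the IP address pair only, ignoring ports."""
    return (packet.src_ip, packet.dst_ip)


def count_flows(packets: Iterable[Packet], key: Callable[[Packet], Hashable]) -> int:
    """Number of distinct queues a classifier using `key` would create."""
    return len({key(p) for p in packets})


if __name__ == "__main__":
    # Three TCP connections (protocol 6) between the same two hosts.
    packets = [Packet("10.0.0.1", "10.0.0.2", 6, port, 80)
               for port in (40001, 40002, 40003)]
    print(count_flows(packets, five_tuple_key), count_flows(packets, address_pair_key))
```

With the 5-tuple key the three connections are treated as three flows; with
the address-pair key they collapse into one, which is the distinction the
two positions in this exchange turn on.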

Alex


"David P. Reed" wrote:
> 
[clip]
> Don't think this is actually true.  packet drop rate on the shared link has
> nothing to do with port numbers - even RED discriminates only on IP
> address.  Now ECN might cause one TCP to back off and another to back off
> less, but the stable state would seem to be the same, whether multiple TCP
> connections are used or not.  (some of the less end-to-endian notions of
> router fairness might give 3 TCP cnxns better service, by looking deeper
> into the packets).
> 
[clip]

From cannara at attglobal.net  Tue Mar 27 01:32:49 2001
From: cannara at attglobal.net (Cannara)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <200103262016.MAA23990@aurora.cs.ucla.edu>
Message-ID: <3AC05E41.FC5EBAF1@attglobal.net>

Definitely agree, given Xerox's tradition of 'success' in marketing.  XNS was
researched rather than marketed.  TCP/IP, however, has been subsidized beyond
grandest imaginings -- free distribution with Sun, ATT, HP... machines for
years, untold public $ spent on graduate students, research projects, papers,
committees...  And, the real hero of the Internet, Bob Kahn, rarely gets the
recognition he deserves, for zealously working to maintain the flow of public
finances, even when DARPA was ready to cut and run.  Even now, millions more
are being spent to get back even the basics of a secure, uniformly-addressable
internetworking structure that were overlooked in the adolescent design
process that has left us with the profoundly hackable Internet protocol
family.  I only use "adolescent" rather than "bureaucratic" here, because The
Economist has an Internet piece out using that modifier. {:o]

Alex

Lixia Zhang wrote:
> 
> > Jim,
> >
> > I would suggest that the marketplace is most specifically a poor place
> > to make wise high-level technical decisions.  One could make the case
> > that TCP/IP has been so successful just because it was allowed to
> > mature in military and academic environments that shielded it from
> > irrelevant marketplace pressures for many years.  X.25 is a good
> > example of a technology that did not have that advantage.  There are
> > also XNS, WAP, VHS, and lots of other examples of market-driven
> > entries.
> 
> I beg to exclude XNS from the rest of the "market-driven" entries.
> 
> Lixia
> (unrelated to the fact that I worked for Xerox for 7 years)

From cannara at attglobal.net  Tue Mar 27 01:35:00 2001
From: cannara at attglobal.net (Cannara)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <200103251700.f2PH0dZ51050@aland.bbn.com> <5.0.2.1.2.20010326141852.023a3310@mail.reed.com>
Message-ID: <3AC05EC4.B720078C@attglobal.net>

This is interesting David, having known the people at Parc and being still
older, the idea that packet networking began with those meetings is, as you'll
agree, incorrect.  Since "unreliable" was used in packet networking as
equivalent to "datagram" for years before those meetings, and books describing
UDP even later used "unreliable", perhaps as a matter of ethical choice, I can
only say that the choice of "user" as a modifier for a user-invisible protocol
component underscores how arbitrary many choices of terms in the TCP/IP family
have been.  

The idea of "best effort" is also a hard one to support, since "best" is very
much open to interpretation, especially by a receiver who got nothing, or
something trashed.  If "best effort" is a euphemism for datagram, then it's no
wonder some folks thought the imaginative naming, adopted as you say, was
objectionable.  The problem that "user" and "best-effort" raise is that they
mean nothing and add nothing to pre-existing terms, such as datagram.

Actually, since UDP at least checksums a datagram, it could well have been
called "CDP", for "checksummed datagram protocol", thus being much clearer to
"users" in its purpose and capability.  

Alex


"David P. Reed" wrote:
> 
> At 09:01 AM 3/26/01 -0800, Cannara wrote:
> >Craig, this has been a common test for years, to see how old a "network-
> >knowledgeable" student is.  Ask them what UDP means.  Prior to the interesting
> >RFC Jeremy produced the "U" stood for just what it stands for in all other
> >families of protocols that have datagram services -- "unreliable".  Somehow
> >some Internet folks seemed to become sensitive, almost ashamed, of that very
> >accurate and truthful engineering label, and turned to seek a "u"-word that
> >had marketability.  I've yet to meet a user who knowingly "uses" a datagram
> >protocol.  You're younger than I thought!
> 
> Alex - Craig may be young, but then I must be ancient at only 49.  Anyway,
> I was there at the meeting where we created UDP (and split TCP into the TCP
> and IP layers), in Marina del Rey in winter '77/'78.  We called it the
> "User Datagram Protocol" from the first, and the reason was to distinguish
> it from the IP layer, which was the "datagram protocol" not well tuned for
> users, since you couldn't demux sensibly on the "protocol" field to the
> correct "user process" aka "application program instance".  (I won't bore
> you with the radical idea that we had tried to force into TCP of using a
> 64-bit process-specific address in IP, rather than a machine specific
> address - but memory cost a few pennies per bit then, so we were viewed as
> dangerously profligate).
> 
> Now there may have been some in the years after that that called it
> "Unreliable ...", but I'd suggest that only those who had fought against
> the idea that a base datagram function was useful would have stooped to
> that kind of propaganda.  Those of us who fought for a datagram protocol
> (the PARC people, Danny Cohen and the speech people, and the LAN people
> like me) used the term "best efforts", not "unreliable", to describe the
> delivery reliability of IP and UDP.
> 
> - David
>

From cannara at attglobal.net  Tue Mar 27 01:36:00 2001
From: cannara at attglobal.net (Cannara)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <Pine.GSO.4.21.0103261854110.23457-100000@regan.ee.surrey.ac.uk>
Message-ID: <3AC05F00.D2A42706@attglobal.net>

Lloyd, as I said to Craig it was late (or early) and Jon or Jeremy were
equally good for me.  :]  Ok, so how old is that?  And, is an x.25 datagram
now reliable?

Alex


Lloyd Wood wrote:
> 
> On Mon, 26 Mar 2001, Cannara wrote:
> 
> > Craig, this has been a common test for years, to see how old a "network-
> > knowledgeable" student is.  Ask them what UDP means.  Prior to the interesting
> > RFC Jeremy produced
> 
> Okay, just how old do you have to be to know that Jeremy 'Bentham'
> Postel later changed his name by deed poll to Jon?
> 
> > the "U" stood for just what it stands for in all other
> > families of protocols that have datagram services -- "unreliable".
> 
> such as, oh, X.25 datagram transport?
> 
> L.
> 
[clip]

From J.Crowcroft at cs.ucl.ac.uk  Tue Mar 27 02:02:45 2001
From: J.Crowcroft at cs.ucl.ac.uk (Jon Crowcroft)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: Your message of "Tue, 27 Mar 2001 01:35:00 -0800." <3AC05EC4.B720078C@attglobal.net>
Message-ID: <4382.985687365@cs.ucl.ac.uk>

 >>The idea of "best effort" is also a hard one to support, since "best" is very
 >>much open to interpretation

this isnt rocket science.

the reliability semantics of the UDP service are not distinguishable from
the IP service that carries a UDP payload.

the checksum is optional and the UDPlite work is working on making it
partially optional:-) so even the bit delivery semantics aren't any
better or worse than IP 

other protocols above IP add different value. SCTP and TCP and RDP and
netblt and PGM and so on all add some notion of a lower failure
probability, as well as what quaint old iso people used to call
"signalled" errors only - i.e. apart from a few corner cases that
stone/partridge et al identify in the engineering noise (literally) they
attempt to reduce unsignaled errors to as close to zero as acceptable
for the application (or Upper Layer Protocol as we used to say)....

btw, we used to have several types of datagrams in other networks
other than IP ones - for example, in the cambridge distributed system
there was a Universe Datagram Protocol (i even did a gateway to ip
once for it as well as a layering of IP on it - oh, and it was run on
a sort of ATM layer, except we only had 16bit cells:-) and in X25
nets there WAS actually a datagram service - it was called Fast
Select, and was rarely implemented. Some folks in X.25 made mistakes
about the X.25 semantics and didn't get edge-to-edge reliability
beyond the _interface_ - in this case, while pedantically they were
right, what you actually got was an unreliable data transfer service
without signaled errors - for example the UK academic IP on 2Mbps X.25
service in the late 80s suffered interesting performance effects from
this...

other cases - oh, look at GPRS and the "reliable" (aka window and
GBN/retransmit) link layer it offers as an option - the effect depends
on the interface spec and how long you are actually prepared to _wait_
for a signaled error too...so it's quite subtle in reality....

oh, let's not get into fragmentation debates too :-)...

cheers
jon
who is 43


From laws at dera.gov.uk  Tue Mar 27 02:29:25 2001
From: laws at dera.gov.uk (John Laws)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: <Pine.GSO.4.21.0103261854110.23457-100000@regan.ee.surrey.ac.uk>
References: <3ABF75F9.16C5209A@attglobal.net>
Message-ID: <3AC07995.10056.18800C@localhost>

Lloyd,

On 26 Mar 2001, at 19:09, Lloyd Wood wrote:

> Okay, just how old do you have to be to know that Jeremy 'Bentham'
> Postel later changed his name by deed poll to Jon?

I never knew that (and I'm old), but it's a very "interesting" 
connection (James Burke, Scientific American style) back to another 
Jon (Crowcroft) at UCL AND that the major benefactor for the 
foundation of UCL is a Jeremy Bentham. His mummified body is in a 
display case within UCL (a condition I think of granting his money 
over to UCL).

John

_________________________

John Laws
Security & Information Systems
Battlespace Management Dept.
Integrated Systems Sector, Security Division
Defence Evaluation & Research Agency, Malvern Worcs WR14 3PS UK
Tel +44 1684 89-4903 (with voice mail), Fax +44 1684 89-6064
DERA Standard Internet Disclaimer "The Information contained 
in this e-mail and any subsequent correspondence is private 
and is intended solely for the intended recipient(s).  For 
those other than the intended recipient any disclosure, copying, 
distribution, or any action taken or omitted to be taken in 
reliance on such information is prohibited and may be unlawful."

From J.Crowcroft at cs.ucl.ac.uk  Tue Mar 27 03:37:56 2001
From: J.Crowcroft at cs.ucl.ac.uk (Jon Crowcroft)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
In-Reply-To: Your message of "Tue, 27 Mar 2001 11:29:25 BST." <3AC07995.10056.18800C@localhost>
Message-ID: <4729.985693076@cs.ucl.ac.uk>

In message <3AC07995.10056.18800C@localhost>, John Laws typed:

 >>I never knew that (and I'm old), but it's a very "interesting" 
 >>connection (James Burke, Scientific American style) back to another 
 >>Jon (Crowcroft) at UCL AND that the major benefactor for the 
 >>foundation of UCL is a Jeremy Bentham. His mummified body is in a 
 >>display case within UCL (a condition I think of granting his money 
 >>over to UCL).
 
John

technically, Bentham was not a founder - he was the mentor for a group
of utilitarians who were the actual founders - his body is in a glass
case in the Quad (mummified, sans head)  and has to be present at all
college council meetings (interesting given he was against all
organised religion) - his head is elsewhere (believed to be in a safe
since various pranksters stole it and did various dubious things to
it, though i have heard the same story about oliver cromwell's head in
cambridge (pembroke college?)...)

those of you coming to the London IETF this summer may wish to visit UCL 
and see for yourselves - see
http://www-mice.cs.ucl.ac.uk/ietf/
for a totally informal set of info about this event

i believe ip protocol #7 is still assigned as 
7  UCL UCL [PK]
for those of you interested in history - it was part of a "remote"
transport end-point hack that was called "clean and simple" that
allowed one to concatenate a variety of "end" to end protocols
together and provide transparent protocol translation - the way the
different families linked to each other was through a type of "network
address translation" in a true sense of translation, and so long as the
protocols had the right semantics, the service actually kind of
worked.... (modulo signaled error models...:-)

one of the protocols went by the name of yellow book, and another
was TP4 and another was a Byte Stream Protocol on a Cambridge ring and
another was something we came across in 81 called TCP....

of course, the term "clean and simple" was (i believe) ironic


 cheers

   jon


From cannara at attglobal.net  Tue Mar 27 09:35:16 2001
From: cannara at attglobal.net (Cannara)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <4382.985687365@cs.ucl.ac.uk>
Message-ID: <3AC0CF54.AC35DA8A@attglobal.net>

Exactly the point Jon -- no rocket science.  So we should have no need for
meaningless adjectives that mislead folks naive to the systems.  "DP" could be
as sufficient as "IP".

Alex


Jon Crowcroft wrote:
> 
>  >>The idea of "best effort" is also a hard one to support, since "best" is very
>  >>much open to interpretation
> 
> this isnt rocket science.
> 
> the reliability semantics of the UDP service are not distinguishable from
> the IP service that carries a UDP payload.
[clip]

From ballardie at dial.pipex.com  Tue Mar 27 09:53:37 2001
From: ballardie at dial.pipex.com (Tony Ballardie)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing
References: <4729.985693076@cs.ucl.ac.uk>
Message-ID: <007501c0b6e6$f928ae20$7f0dbc3e@vaionote>

----- Original Message ----- 
From: "Jon Crowcroft" <J.Crowcroft@cs.ucl.ac.uk>
To: <laws@dera.gov.uk>
Cc: "Lloyd Wood" <l.wood@eim.surrey.ac.uk>; <end2end-interest@postel.org>
Sent: 27 March 2001 12:37
Subject: Re: [e2e] TCP Framing


> i believe ip protocol #7 is still assigned as 
> 7  UCL UCL [PK]
> for those of you interested in history - it was part of a "remote"
> transport end-point hack that was called "clean and simple" 

IP ptcl #7 was assigned to CBT in '95 or '96. So, still UCL related,
and... "clean and simple"  too :-)

Tony


From braden at ISI.EDU  Tue Mar 27 13:24:53 2001
From: braden at ISI.EDU (Bob Braden)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] FYI - Proposal: Real-time transfer protocol
Message-ID: <200103272124.VAA01913@gra.isi.edu>

From: "Real-time transfer protocol" <rttp@over-ground.net>
To: <Postel@ISI.EDU>, <JKRey@ISI.EDU>
Subject: Proposal: Real-time transfer protocol
Date: Mon, 26 Mar 2001 12:46:44 +0300

Dear Sir/Madam,

Excuse me for sending this letter without your permission. 
I just wanted to introduce to your attention my research, concerning a protocol for real-time data transmission in Internet.
It is published at:

http://over-ground.net/rttp 

This is a public research, relying on volunteers for its development. It is a request for comments, though this is not an RFC in the common meaning of this abbreviation. 
If my work is outside the range of your interests, I would much appreciate informing your colleagues who could be interested about it.
Thank you!

Yours faithfully, 
Dimitar Aleksandrov


From mfisk at lanl.gov  Tue Mar 27 14:55:36 2001
From: mfisk at lanl.gov (Mike Fisk)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing 
In-Reply-To: <200103270610.WAA10577@champagne.dsg.stanford.edu>
Message-ID: <Pine.LNX.4.21.0103271431550.2003-100000@pescado.lanl.gov>

Once you know where the record boundary is, you can find an upper layer
header and use whatever upper-layer logic is necessary to place (DMA) the
block.  What you don't want is to receive a packet that lacks a header
describing where to put it.  You can add an optional RDMA header to TCP or
IP or you can add it to the TCP payload and make sure that there's one per
packet.

What seems problematic to me is assuming a 1-1 mapping between upper-layer
blocks and TCP segments.  To me, this suggests that when building TCP
segments you want to insert a header into the byte stream right before
each block and at the beginning of each segment.  But this header can be
generated at the last minute by the TCP output routines.  There doesn't
seem to be a need to require that segment sizes match upper-layer protocol
size.

And if you don't want to use something like byte-stuffing to find the
header, you can place the header(s) at the beginning of each segment.  
Assuming that DF is set, and middleboxes are well-behaved (is that an
oxymoron?), that segment should be preserved end-to-end.
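
The scheme described above, a small placement header before each block and
at the start of each segment, generated at the last minute by the output
routine, can be sketched in Python. Everything specific here is an
assumption made for illustration: the 10-byte header layout (record id,
offset within the record, chunk length) and the names `HEADER`,
`build_segments`, and `place_segment` are hypothetical and come from neither
the draft nor any real TCP stack.

```python
import struct

# Hypothetical placement header: record id, offset in record, chunk length.
HEADER = struct.Struct("!IIH")


def build_segments(records, mss):
    """Cut records into segments of at most `mss` bytes, headers included."""
    if mss <= HEADER.size:
        raise ValueError("mss must leave room for at least one payload byte")
    segments = []
    current = bytearray()
    for record_id, record in enumerate(records):
        offset = 0
        while offset < len(record):
            room = mss - len(current) - HEADER.size
            if room <= 0:               # segment full: start a new one
                segments.append(bytes(current))
                current = bytearray()
                continue
            chunk = record[offset:offset + room]
            current += HEADER.pack(record_id, offset, len(chunk)) + chunk
            offset += len(chunk)
    if current:
        segments.append(bytes(current))
    return segments


def place_segment(segment, buffers):
    """Copy each chunk of one segment straight to its place in `buffers`."""
    pos = 0
    while pos < len(segment):
        record_id, offset, length = HEADER.unpack_from(segment, pos)
        pos += HEADER.size
        buf = buffers.setdefault(record_id, bytearray())
        if len(buf) < offset + length:
            buf.extend(bytes(offset + length - len(buf)))  # grow with zeros
        buf[offset:offset + length] = segment[pos:pos + length]
        pos += length
```

Segment sizes here are independent of record sizes, and segments can be
placed in any arrival order. The sketch only works if segments arrive as
they were sent, which is the DF and well-behaved-middlebox assumption
stated above.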

On Mon, 26 Mar 2001, Jonathan Stone wrote:

> I suspect its because they want not just to preserve record
> boundaries, but to align "records" onto suitable memory boundaries.
> 
> Think of scsi-over-tcp, with the TCP stream carrying a mix of
> "scsi CCBs" and "disk blocks."
> 
> Then again, i could be completely wrong...

As could I.  In particular, the folks designing NICs to do this may have
some constraints that I'm not aware of.

-- 
Mike Fisk, RADIANT Team, Network Engineering Group, Los Alamos National Lab
See http://home.lanl.gov/mfisk/ for contact information


From knm at protocol.ece.iisc.ernet.in  Tue Mar 27 19:13:35 2001
From: knm at protocol.ece.iisc.ernet.in (K N Manoj)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] (no subject)
Message-ID: <Pine.GSO.4.10.10103281106120.332-100000@cml>

Hello,

We face the following problem at our mail server <cml.ece.iisc.ernet.in>:

Mail coming from within .iisc.ernet.in reaches the server in time, but
mail coming from outside bounces back.

We are unable to track down the cause. Can you help us?

Thanks and regards,
Manoj.

--
K N Manoj, 
Coding and Modulation Lab,
Department of Electrical Communication Engineering,
Indian Institute of Science,
Bangalore, India. 560 012.  Ph: 309 2855

12 Deg 58 Min N, 77 Deg 39 Min E
--


From mankin at ISI.EDU  Thu Mar 29 05:05:33 2001
From: mankin at ISI.EDU (Allison Mankin)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP Framing 
In-Reply-To: Your message of Fri, 23 Mar 2001 23:02:43 -0800.
             <Roam.SIMC.2.0.6.985417363.9198.nordmark@bebop.france> 
Message-ID: <10103291305.AA15081@maia.east.isi.edu>

> As much as we might dislike the various middle boxes,
> I wonder what would happen if one of these TCP connections
> passed through a middle box. While many middleboxes tweak things
> on a packet by packet basis, there might be some that are essentially
> implemented as a read+write loop in application space, i.e.
> the TCP segment boundaries would not be preserved.
> 
> Thus trying to make the TCP segment boundaries matter for the ULP
> is treading into uncharted territory.

The shim does have a provision for detecting that middleboxes have
happened to it (resegmenting) and reverting to normal processing if so.

Reviewing whether that detection is as sure as the designers hope would
be good.

Allison

From dino.saija at libero.it  Fri Mar 30 07:44:20 2001
From: dino.saija at libero.it (dino.saija@libero.it)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP modeling
Message-ID: <GB0P1W$IxpP6bzcVcj9nDO2GDtwvxK2fHkFXh_kBeJOrDrmjWCWzDD6@libero.it>

Is there a recent analytical model for TCP?
Thank you.


From tjo at research.telcordia.com  Fri Mar 30 08:56:12 2001
From: tjo at research.telcordia.com (Teunis J Ott)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP modeling
In-Reply-To: "dino.saija@libero.it"<dino.saija@libero.it>
        "[e2e] TCP modeling" (Mar 30,  5:44pm)
References: <GB0P1W$IxpP6bzcVcj9nDO2GDtwvxK2fHkFXh_kBeJOrDrmjWCWzDD6@libero.it>
Message-ID: <1010330115611.ZM27735@buzz>

On Mar 30,  5:44pm, dino.saija@libero.it wrote:
> Subject: [e2e] TCP modeling
> Is there a recent analytical model for TCP?
> Thank you.
> 
> 
>-- End of excerpt from dino.saija@libero.it


See
 ftp://ftp.research.telcordia.com/pub/tjo/TCPwindow.ps

It is pretty old, but it makes sense. Teun Ott.

From larse at ISI.EDU  Fri Mar 30 10:27:59 2001
From: larse at ISI.EDU (Lars Eggert)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP modeling
References: <GB0P1W$IxpP6bzcVcj9nDO2GDtwvxK2fHkFXh_kBeJOrDrmjWCWzDD6@libero.it>
Message-ID: <3AC4D02F.8B0106BB@isi.edu>

"dino.saija@libero.it" wrote:
> 
> Is there a recent analytical model for TCP?
> Thank you.

I think Vishal Misra (http://www-net.cs.umass.edu/~misra/) has some papers
on that. He just gave a talk at USC last week.
-- 
Lars Eggert <larse@isi.edu>                 Information Sciences Institute
http://www.isi.edu/larse/                University of Southern California
From padhye at aciri.org  Fri Mar 30 10:34:24 2001
From: padhye at aciri.org (Jitendra Padhye)
Date: Thu Mar 25 11:59:37 2004
Subject: [e2e] TCP modeling
In-Reply-To: <GB0P1W$IxpP6bzcVcj9nDO2GDtwvxK2fHkFXh_kBeJOrDrmjWCWzDD6@libero.it> from "dino.saija@libero.it" at "Mar 30, 2001  5:44:20 pm"
Message-ID: <200103301834.f2UIYOU80169@moose.aciri.org>

http://www.aciri.org/padhye/tcp-model.html

Lists some (but certainly not all) of the papers on this topic. If you find
any more, please let me know!

- Jitu

> Is there a recent analytical model for TCP?
> Thank you.
> 
> 
> 


From guol at cs.bu.edu  Fri Mar 30 12:32:46 2001
From: guol at cs.bu.edu (Guo, Liang)
Date: Thu Mar 25 11:59:38 2004
Subject: [e2e] [ns]: RED treatment to SYN packet from TCP/ECN source
In-Reply-To: <Pine.LNX.4.21.0103292109380.934-100000@pc420.cl.pwf.cam.ac.uk>
Message-ID: <Pine.SOL.4.20.0103301458110.20807-100000@csa.bu.edu>

I'm reading the tcp.cc and red.cc files in ns and am intrigued by the
following questions.

Since tcp.cc in ns assumes a one-way session, the first packet serves
as a SYN packet, although most of the time it is treated as the first
data packet. Here is the problem. For ECN-capable TCP, this is the code
from the output() function:

        if (seqno == 0) {
                if (syn_) {
                        hdr_cmn::access(p)->size() = tcpip_base_hdr_size_;
                }
                if (ecn_) {
                        hf->ecnecho() = 1;
//                      hf->cong_action() = 1;
                        hf->ect() = 0;
			~~~~~~~~~~~~~~~~~~
                }


So the first packet will carry no ECT codepoint. I guess this follows
the specification in draft-ietf-tsvwg-ecn-03.txt, which requires:
"A host MUST NOT set ECT on data packets unless it has sent at least
   one ECN-setup SYN or ECN-setup SYN-ACK packet, and has received at
   least one ECN-setup SYN or ECN-setup SYN-ACK packet, and has sent no
   non-ECN-setup SYN or non-ECN-setup SYN-ACK packet."


However, the RED queue only applies ECN (marking instead of dropping)
to packets that carry the ECT bit. Here's the code from red.cc:

       hdr_flags* hf = hdr_flags::access(pickPacketForECN(pkt));
       if (edp_.setbit && hf->ect() && edv_.v_ave < edp_.th_max) {
                         ~~~~~~~~~~~~
                hf->ce() = 1;   // mark Congestion Experienced bit
                return (0);     // no drop
       } else {
                return (1);     // drop
       }


My question is: does this mean that a SYN packet is more likely to be
dropped than a data packet? That would be horrible, because dropping a
SYN packet causes a 6-second timeout even if the RTT is, say, 0.1 msec.
Wouldn't it be nice if the RED queue also protected these SYN packets?
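
To make the asymmetry concrete, here is a small Python sketch that
mirrors only the branch of red.cc quoted above. The function names are
hypothetical, the arguments reuse the names from the ns snippet, and the
sketch covers only packets that RED has already selected for early
action; it does not model how RED picks them:

```python
def red_early_action(ect, setbit, v_ave, th_max):
    """Fate of a packet RED has already picked for early action.

    Mirrors the quoted condition: mark only if ECN marking is enabled
    (setbit), the packet carries ECT, and the average queue size is
    still below th_max. Otherwise the packet is dropped.
    """
    if setbit and ect and v_ave < th_max:
        return "mark"
    return "drop"


def timeout_in_rtts(timeout_s, rtt_s):
    """Express a retransmission timeout as a number of round-trip times."""
    return timeout_s / rtt_s


# Hypothetical queue state: average queue 10 packets, th_max 15 packets.
data_packet = red_early_action(ect=True, setbit=True, v_ave=10.0, th_max=15.0)
syn_packet = red_early_action(ect=False, setbit=True, v_ave=10.0, th_max=15.0)
print(data_packet, syn_packet)
print(timeout_in_rtts(6.0, 0.0001))
```

Under these assumed numbers the ECT data packet is marked while the
SYN, which carries no ECT, is dropped, and the 6-second timeout from the
example above corresponds to roughly 60,000 round trips at an RTT of
0.1 msec.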

I'm not sure how this is implemented in real network products, but
I have at least seen different implementations of RED on Linux
machines.

One more thing: why does TCP/ECN allow the congestion window to go
below 1 (so if cwnd is not stored in double precision, it could be 0)?
Is there any special purpose for this?


Guo, Liang 

guol@cs.bu.edu                     Dept. of Comp. Sci., Boston Univ.,
(617)353-5222 (O)                  111 Cummington St., MCS-217,
(617)375-9206 (H)                  Boston, MA 02215


From csapuntz at stanford.edu  Fri Mar 30 22:33:41 2001
From: csapuntz at stanford.edu (Constantine Sapuntzakis)
Date: Thu Mar 25 11:59:38 2004
Subject: [e2e] TCP Framing 
References: <Pine.LNX.4.21.0103271431550.2003-100000@pescado.lanl.gov>
Message-ID: <016101c0b9ac$87560950$0f00000a@KEALIACSAPUNTZ>

Hi Mike,

I hope this e-mail can respond to a couple of your very good and
thought-provoking points.

I'll use the term upper-layer protocol (ULP) to talk about the protocol
riding on top of TCP. Some examples of ULPs include iSCSI, SSL, NFS, and
RDMA.

There are two properties we were looking to get from TCP:

1) finding ULP message boundaries in segments received out-of-order

This involves having some signalling discipline for message boundaries.

There are several ways other than the one we proposed of providing this
property (including techniques that do not modify the TCP sender). These
include having  a header periodically in the TCP stream (say every 1000
bytes) or a byte-stuffing technique like COBS.

2) application messages not spanning segments

This simplifies the receiver as it does not have to deal with cases where
ULP headers span segments or where ULP datagrams are broken across TCP
segments.

I don't believe that property #2 can be had without modifying the TCP sender
a la the proposal presented.
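
As an illustration of the byte-stuffing route to property #1, here is a
minimal Python sketch of COBS. The framing convention (one zero byte
after each encoded message) and the helper names frame and
messages_after_resync are assumptions made for this example, not part of
the proposal. Because an encoded message never contains a zero byte, a
receiver holding an arbitrary out-of-order fragment can scan for the
next zero to find a message boundary:

```python
def cobs_encode(data):
    """Encode data so that the result contains no zero bytes."""
    out = bytearray()
    block = bytearray()
    need_final = True
    for b in data:
        need_final = True
        if b == 0:
            # Code byte = distance to the zero it replaces.
            out.append(len(block) + 1)
            out += block
            block.clear()
        else:
            block.append(b)
            if len(block) == 254:
                # Full block: code 0xFF means "254 bytes, no zero follows".
                out.append(0xFF)
                out += block
                block.clear()
                need_final = False
    if need_final:
        out.append(len(block) + 1)
        out += block
    return bytes(out)


def cobs_decode(enc):
    """Invert cobs_encode; raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(enc):
        code = enc[i]
        chunk = enc[i + 1:i + code]
        if code == 0 or len(chunk) != code - 1 or 0 in chunk:
            raise ValueError("malformed COBS data")
        out += chunk
        i += code
        if code != 0xFF and i < len(enc):
            out.append(0)
    return bytes(out)


def frame(messages):
    """Concatenate encoded messages, each terminated by a zero byte."""
    return b"".join(cobs_encode(m) + b"\x00" for m in messages)


def messages_after_resync(fragment):
    """Decode the complete messages in a fragment that starts mid-message."""
    first = fragment.find(b"\x00")
    if first < 0:
        return []
    pieces = fragment[first + 1:].split(b"\x00")
    # The last piece has no terminating zero yet, so it is incomplete.
    return [cobs_decode(p) for p in pieces[:-1]]
```

The cost is at most one extra byte per 254 bytes of payload plus the
delimiter. Note that this addresses only property #1: a message can
still straddle two segments.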

-----------

One could question how critical property #2 is. After all, if stuff arrives
mostly in order except for the occasional drop, you can keep a bit of
application state from packet to packet. I would still argue that property
#2 makes life on the fast path a good deal easier for the receiver.

-Costa


From ggumdol at comis.kaist.ac.kr  Sat Mar 31 03:34:15 2001
From: ggumdol at comis.kaist.ac.kr (Jeong-woo Cho)
Date: Thu Mar 25 11:59:38 2004
Subject: [e2e] TCP modeling
References: <GB0P1W$IxpP6bzcVcj9nDO2GDtwvxK2fHkFXh_kBeJOrDrmjWCWzDD6@libero.it>
Message-ID: <001b01c0b9d6$84208ec0$2992f88f@ggumdol>

----- Original Message ----- 
From: <dino.saija@libero.it>
To: <end2end-interest@postel.org>
Sent: Saturday, March 31, 2001 12:44 AM
Subject: [e2e] TCP modeling


> Is there a recent analytical model for TCP?
> Thank you.
> 
> 
> 

 I think the following is the best paper on TCP modeling.

 Jitendra Padhye, Victor Firoiu, and Donald F. Towsley, "Modeling TCP Reno Performance: A Simple Model and Its Empirical Validation". IEEE/ACM Transactions on Networking, vol. 8, no. 2, April 2000.


From ggumdol at comis.kaist.ac.kr  Sat Mar 31 03:43:03 2001
From: ggumdol at comis.kaist.ac.kr (Jeong-woo Cho)
Date: Thu Mar 25 11:59:38 2004
Subject: [e2e] RED with TFRC
Message-ID: <003901c0b9d7$c1f17d30$2992f88f@ggumdol>


 Although Sally maintains that TFRC can give real-time applications a smooth sending rate (and TFRC is indeed smoother than TCP), I think RED is not a good router mechanism for real-time applications that adopt TFRC as their congestion control mechanism.

 I think RED's dropping strategy is too simple: it cannot avoid "random drops", which are quite bad for TFRC flows, because TFRC uses a weighted sum of the last n packet-drop intervals to estimate its current fair share.
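
 A minimal Python sketch of that weighted average, to show how much a single stray drop moves the estimate. The weights for n = 8 follow the published TFRC specification (RFC 3448) and are an assumption as far as this message is concerned; the function names are hypothetical, and the specification's special handling of the still-open most recent interval is left out.

```python
# Weights for the 8 most recent loss intervals, most recent first
# (assumed from the TFRC specification, RFC 3448).
WEIGHTS = (1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2)


def average_loss_interval(intervals):
    """Weighted average of loss intervals (in packets), most recent first."""
    recent = list(intervals)[:len(WEIGHTS)]
    if not recent:
        raise ValueError("need at least one loss interval")
    weights = WEIGHTS[:len(recent)]
    total = sum(w * i for w, i in zip(weights, recent))
    return total / sum(weights)


def loss_event_rate(intervals):
    """Loss event rate estimate: inverse of the average loss interval."""
    return 1.0 / average_loss_interval(intervals)


steady = [100] * 8                 # a drop every 100 packets
one_stray_drop = [10] + [100] * 7  # most recent interval cut short

print(average_loss_interval(steady), loss_event_rate(steady))
print(average_loss_interval(one_stray_drop), loss_event_rate(one_stray_drop))
```

 With eight intervals of 100 packets the average interval is 100 packets. If one stray drop shortens the most recent interval to 10 packets, the average falls to about 85 packets, so the estimated loss event rate rises from 0.01 to about 0.0118.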

 In conclusion, I think there should be other router mechanisms that avoid these "random packet drops".

 Has there been any discussion of this?


From tqbf at sonicity.com  Mon Mar 26 00:25:26 2001
From: tqbf at sonicity.com (Thomas H. Ptacek)
Date: Thu Mar 25 11:59:46 2004
Subject: [e2e] UDP length field
In-Reply-To: <200104191632.f3JGW2I22876@baskerville.CS.Arizona.EDU>
References: <200104191632.f3JGW2I22876@baskerville.CS.Arizona.EDU>
Message-ID: <985595126.1070.4.camel@tqbf-notebook.int.sonicity.com>

> The validity checks are another issue; all UDPs that we've
> looked at seem to adhere to Craig's rules. There is one
> exception, which is Quake's home-grown UDP. They do some

Quake has its own UDP? I can see (evil) reasons for building one's
own TCP, but almost no benefit to a custom UDP. Is this an OS issue
(i.e., they reimplemented sockets for speed), or did they also build
their own IP? Does anyone know the answer to this?

---
Thomas H. Ptacek