From s.malik at tuhh.de  Mon Aug  1 04:10:12 2005
From: s.malik at tuhh.de (Sireen Habib Malik)
Date: Mon, 01 Aug 2005 13:10:12 +0200
Subject: [e2e] RTO Estimation... was "Agility..."
In-Reply-To: 
References: 
Message-ID: <42EE0314.6020202@tuhh.de>

Hi all,

I have been thinking about David's emails and some points raised by Detlef
in the background discussion. It's a learning process. So ... there are some
questions which I think are important in the context of this discussion.

We say that the RTT's distribution is heavy-tailed. However, the discussion
on heavy-tailed file sizes, the resultant LRD in the traffic and the
sub-exponential queue occupancy distribution, is based upon "open-loop"
queue analysis. TCP, however, is a "closed-loop" protocol (David's point).
The first set of questions then is, "what impacts the queue occupancy
distribution more, the closed-loop operation or the heavy-tailedness of the
E2E distribution?", or, "under what loads/traffic conditions is one of them
more dominant?", or, "is there a dependency between them?".

Second point: It is clear that present RTO estimation will work in the
frame of assumptions under which it is supposed to work. Like Detlef says,
"nobody will complain that a car does not run if it is out of gas", so
nobody should complain if the RTO estimator does not work when traffic
parameters do not fall inside the space of the relevant assumptions.

If that is true, then one way to resolve this issue is to adjust/shape
traffic in such a way that the RTO works (I think this is what Detlef
is saying), or to make a "general purpose" RTO estimator that
reduces/relaxes the set of assumptions - ideally, one that works whether
the IID assumption holds or not.

I think the work in the second direction is more general, and conducive
to practical environments. How difficult or easy it is, I don't know! A
good way is to first find out whether any work has already been done in
this direction.

Thanks and regards,
Sireen Malik

Christian Huitema wrote:

>I think we should just look at a simple question. Does the current
>algorithm actually works?
>
>I personally did measurements 6 years ago. The measurement of
>tcp-connect times to various web servers clearly showed a power law
>distribution. There is in fact a history of finding power laws in
>measurement of communication systems. In fact, Mandelbrot work on
>fractals started with an analysis of the distribution of errors on a
>modem link! Based on all that, it is quite reasonable to assume that the
>distribution of RTT measurement follows a power law.
>
>People will immediately mention that it should be a truncated power law,
>but even that is far from clear. There is at least anecdotal evidence of
>packets being held up in queues and then transmitted after a very long
>time, e.g. half an hour...
>
>The current RTT estimators are based on exponential averages of
>consecutive samples of delays and variations. This is an issue, as the
>exponential average of a heavy tailed distribution also is a heavy
>tailed distribution. If you plug that in a simulation, you will observe
>that the estimates behave erratically.
>
>My personal feeling is that the current RTT estimators do not actually
>work.
>
>-- Christian Huitema
>
>

-- 
M.Sc.-Ing.
Sireen Malik
Communication Networks
Hamburg University of Technology
FSP 4-06 (room 5.012)
Schwarzenbergstrasse 95 (IVD)
21073-Hamburg, Deutschland
Tel: +49 (40) 42-878-3443
Fax: +49 (40) 42-878-2941
E-Mail: s.malik at tuhh.de

--Everything should be as simple as possible, but no simpler (Albert Einstein)

From detlef.bosau at web.de  Mon Aug  1 11:07:46 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Mon, 01 Aug 2005 20:07:46 +0200
Subject: [e2e] RTO Estimation... was "Agility..."
References: <42EE0314.6020202@tuhh.de>
Message-ID: <42EE64F2.BCD0621E@web.de>

Sireen Habib Malik wrote:
> 
> Second point: It is clear that present RTO estimation will work in the
> frame of assumptions under which it is supposed to work. Like Detlef
> says, "nobody will complain that a car does not run if it is out of
> gas", so nobody should complain if the RTO estimator does not work when
> traffic parameters do not fall inside the space of the relevant assumptions.

And we should well consider the consequences if it's true that RTO
estimators actually don't work, as Christian suggested. For the particular
case of mobile wireless networks, we would have to reconsider the whole work
on "spurious timeouts", because what's called a "spurious timeout" is
perhaps not the problem, but a symptom. Unduly frequent spurious timeouts
are nothing else than a too high probability of unwanted retransmissions, to
stay with the wording chosen e.g. by Edge.

> 
> If that is true, then one way to resolve this issue is to adjust/shape
> traffic in such a way that the RTO works (I think this is what Detlef

Exactly. In my post, I focussed solely on the routers. However, any change
in a router's behaviour directly influences the traffic switched by it. Even
more difficult: influencing the traffic on a router will perhaps not only
affect this router, but the behaviour of other routers as well. So I'm not
quite sure whether we are allowed to consider the router queues as being
decoupled.

> is saying), or to make a "general purpose" RTO estimator that
> reduces/relaxes the set of assumptions - ideally, one that works whether
> the IID assumption holds or not.
> 
> I think the work in the second direction is more general, and conducive
> to practical environments. How difficult or easy it is, I don't know! A

The assumptions made by Edge are extremely general. E.g., for RTT and VAR he
assumes hardly more than their pure existence. "Weakly stationary" means
that all observation variables must share the same E and V, and moreover
that the correlation of the latest observation variable with some arbitrary
other one in a given sample does not depend on the sample size. Bearing in
mind that we look for E and V, these assumptions appear rather weak to me.

I've looked around for other estimators. In fact, there may be estimators
which yield better values for forecasting, however under much stronger
assumptions, e.g. that the forecast process must be stationary and obey a
normal distribution.

> good way is to first find out whether any work has already been done in
> this direction.

In addition, I would be interested in work on the convergence speed of EWMA
filters. I'm not quite sure whether I can access the work on EWMA filters
quoted by Edge; these are older textbooks by Cox (1965) or Box (1976). I
think a signal theory perspective would be helpful here. An EWMA filter is
basically nothing else than a first-order low-pass IIR filter.
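For concreteness, this is the recurrence I mean - a minimal sketch in
Python, where the gain 1/8 is only the customary TCP value for the SRTT
filter, not anything mandated by Edge:

    def ewma(samples, gain=1.0/8):
        """First-order low-pass IIR: s_k = (1 - gain) * s_{k-1} + gain * x_k."""
        s = samples[0]              # crude initialisation from the first sample
        out = []
        for x in samples:
            s = (1.0 - gain) * s + gain * x
            out.append(s)
        return out

    # Step response: the "RTT" jumps from 100 ms to 1 s after 20 samples.
    rtts = [0.1] * 20 + [1.0] * 80
    est = ewma(rtts)
    # With gain 1/8 the error after the step decays as (7/8)^n, so the
    # estimate stays below 0.9 s for about 16 samples after the jump.

The impulse response of this filter is simply gain * (1 - gain)^k, a
geometric decay, so "convergence speed" is just a question of how small the
gain is and how often we actually get a sample.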
So it may be helpful to consider step response and impulse response functions of this one, particularly the step impulse because this will reveal the behaviour in case of sudden steps in the latency. However, if we know the impulse response, we can describe the general behaviour on arbitrary signals here. One difficulty here is that the EWMA filters impulse response is that one of a time discrete system and hence its consequences on a real time (continuous time!) system depend on the sampling freuquency, i.e. on the acknowledgement rate. This becomes extremely important in mobile networks where path characteristics may change due to physical or enviorenmental circumstances which are more or less beyound our influence and where e.g. a filtering of changes, which is of course a time discrete filtering, must be adapted to the flows "sampling frequency", i.e. the ACK rate. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From chris at cs.utexas.edu Tue Aug 2 14:33:59 2005 From: chris at cs.utexas.edu (Chris Edmondson-Yurkanan) Date: Tue, 2 Aug 2005 16:33:59 -0500 Subject: [e2e] Cerf & Kahn's Turing Lecture: Open to all, 8/22/2005 Message-ID: <8934b77371b7e568f412d0ef5104e628@cs.utexas.edu> The Turing Lecture by Vint Cerf and Bob Kahn is OPEN TO ALL! SIGCOMM 2005 is the host for this year's ACM Turing Lecture, and has opened the Lecture beyond the conference attendees to ALL who are interested. In addition, SIGCOMM will stream it live over the Internet that Cerf & Kahn helped create. * You are invited to attend the ACM Turing Lecture in Philadelphia, PA, US, August 22nd: 6:00-7:30 EDT (and join the reception which begins at 4:30) at the Irvine Auditorium, University of Pennsylvania (free-of-charge) (with thanks to Penn's School of Engineering & Applied Science) * Bring your colleagues, guests, students, advisors... and help Vint & Bob celebrate the first time that networking researchers have received this prestigious award, in the 39 years of the ACM Turing Award. * The Lecture will be a moderated discussion between Vint and Bob, with the title: Assessing the Internet: Lessons Learned, Strategies for Evolution, and Future Possibilities Afterwards, there will be a Q&A session with the audience. * To reserve one of 600 seats set aside for the public, please sign up via the Turing Lecture web page: http://www.acm.org/sigcomm/sigcomm2005/turinglecture.html That same web page has details, directions, ticket reservations, and info on how-to access the live webcast and the eventual archived webcast. Reservations will be filled on a first-come, first-served basis. --------------------- If you have not heard about this year's Turing Award or have not heard about Cerf and Kahn, here's a little background. The A.M. Turing Award is often recognized as the "Nobel Prize of Computing". The citation for Cerf and Kahn reads: "For pioneering work on internetworking, including the design and implementation of the Internet's basic communications protocols, TCP/IP, and for inspired leadership in networking." Their first paper on "internetworking" was published in IEEE Transactions on Communications, May 1974: A Protocol for Packet Network Intercommunication. If you haven't read their first paper, add it to your summer reading list! Bob Kahn and Vint Cerf started in 1973 to solve the problem of how to interconnect a network of networks, i.e. an "internetwork", or "internet". 
For Bob, new at DARPA, his interest was in building and connecting a packet radio network to the existing ARPA network along with a packet satellite network. Bob invited Vint to work with him, and they jointly designed TCP, which included an internetwork header and a process header (but the two headers didn't start to split into IP and TCP until 5 years later). In 1973 Vint was already the chair of the International Network Working Group, so he was interested as well in interconnecting the ARPA network to the French network Cyclades & the British network at National Physics Laboratory. The following link has a small bio on each: http://www.acm.org/awards/turing_citations/cerf_kahn.html At a reception at the Computer History Museum June 9th, Vint and Bob "cited the collaborative nature of their work, acknowledging the contributions from many in the room who had made their achievements possible." For more information on a few of their collaborators, see: http://campus.acm.org/public/membernet/storypage_2.cfm? ci=July_2005&story=2&CFID=48919977&CFTOKEN=16561738 --------------------- PS: if you cannot attend the lecture, then please do watch the live webcast or the archived lecture. Check out the Turing Lecture website for all details: http://www.acm.org/sigcomm/sigcomm2005/turinglecture.html --------------------- Chris Edmondson-Yurkanan (chris at cs.utexas.edu) Contact info: www.cs.utexas.edu/~chris/ From detlef.bosau at web.de Wed Aug 3 12:54:50 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 03 Aug 2005 21:54:50 +0200 Subject: [e2e] Agility of RTO Estimates, stability, vulneratibilites References: Message-ID: <42F1210A.4030508@web.de> Christian Huitema wrote: > I think we should just look at a simple question. Does the current > algorithm actually works? > > I personally did measurements 6 years ago. The measurement of > tcp-connect times to various web servers clearly showed a power law > distribution. There is in fact a history of finding power laws in > measurement of communication systems. In fact, Mandelbrot work on > fractals started with an analysis of the distribution of errors on a > modem link! Based on all that, it is quite reasonable to assume that the > distribution of RTT measurement follows a power law. > Hm. I believe I remember some newspaper article, where the origin for the work on "the fractal geometry of nature" was the question: How long is the coast of England? Shortly afterwards, we typically learn that a butterly in the Himalaya may cause a tornado in Europe. When I attendet lessons in stochastics, I was told: When we think there may be a stochastic behaviour, we must consider where the stochastic behaviour is supposed to come from. Do we really _expect_ this behaviour? And why do we? It?s the same with the whole thing of chaos theory, self similarity and its variations. 1.: What does it describe exactly? (I frequently miss precise definitions.) 2.: Where does chaotic/self-similar/.... behaviour come from? (It?s not enough to list up occasional observations. Why it?s reasonable to assume a hehaviour like that? Is there hard evidence, that e.g. latencies are self similar? 3.: What do we learn from that behaviour? Does it end in itself? Or can we really tell about "lessons learned" from the self similarity debate? > People will immediately mention that it should be a truncated power law, > but even that is far from clear. There is at least anecdotal evidence of > packets being held up in queues and then transmitted after a very long > time, e.g. 
half an hour...
> 

Does not sound like a solid basis.

> The current RTT estimators are based on exponential averages of
> consecutive samples of delays and variations. This is an issue, as the
> exponential average of a heavy tailed distribution also is a heavy
> tailed distribution. If you plug that in a simulation, you will observe
> that the estimates behave erratically.

O.k. After having played around for a few minutes with EWMA filters in
Octave, I've seen that even the settling behaviour is simply disastrous.

When we keep in mind that Internet latencies vary from some microseconds
(10^-6 s) in an Ethernet segment to some hundred _seconds_ (sic!) in some
mobile wireless networks (10^2 s), then we see that Internet latencies vary
on a scale covering at least eight orders of magnitude. When we further keep
in mind that the Internet is dominated by short-term flows (20 packets or
so), then we must conclude that an ordinary TCP flow is quite unlikely to
see even _one_ correct RTT estimate in its whole lifetime. Is this correct?

Now, to my knowledge, we use an initial value of about 2 seconds, which is a
reasonable upper limit for quite a few Internet connections, and therefore,
during a flow's lifetime, a few distracting RTT measurements do not really
matter. So, from a "practitioner's view", TCP "works". "Somehow". However,
as soon as we are confronted with latencies larger than this initial value,
or subject to variation on a large scale, the situation deteriorates.

> 
> My personal feeling is that the current RTT estimators do not actually
> work.
> 

That should be considered bad news ...

However, I would like to focus the problem a little bit more on a one hop
scenario. The reason for doing so is that, after having read the works by
Zhang, Jain and Edge, two problems become evident:

1. RTT estimators suffer from poor convergence and a problem with their
   initial value.
2. RTT estimators suffer from a poor forecast capability.

There are numerous other difficulties, e.g. Edge's assumptions; however, I
think these can be handled, as can 1. The hard problem may be 2.

Let's consider one hop. Two routers, r1 and r2, one link in between.
(Routers: these systems may well be intermediate systems (IS) in an
arbitrary network path.)

r1--------------------------r2

Consider one packet.

t1: packet's arrival time on r1.
t2: packet's arrival time on r2.

If a packet is yet to arrive at r1 no later than time now + delta, and delta
is known, can we forecast an estimate of (t2 - t1)?

-- 
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From keshav at uwaterloo.ca  Wed Aug  3 13:02:35 2005
From: keshav at uwaterloo.ca (S. Keshav)
Date: Wed, 03 Aug 2005 16:02:35 -0400
Subject: [e2e] end2end-interest Digest, Vol 17, Issue 26
In-Reply-To: 
Message-ID: 

> I think of RED strategies, I remember a strategy where there
> are two thresholds a, b, a < b, for a queue length q. If q < a, packets
> are accepted. If b < q, packets are rejected. If a <= q <= b, packets are
> rejected randomly with a probability p which is linearly increased from
> p=0 if q=a to p=1 if q=b.
> 
> Question: Would it make sense to choose a and b in such a way that
> i) q has a constant expectation and
> ii) q has a constant variance
> for certain periods of time?
> 
> 
> However, I expect that someone has discussed this before, it's just too
> simple.
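As a side note: written out, the rejection rule you describe is just a
linear ramp between the two thresholds. A minimal sketch (Python; the
threshold values are arbitrary placeholders, and note that RED proper
applies the ramp to an averaged queue length, not the instantaneous one):

    import random

    def accept(q, a=20, b=60):
        """Return True if a packet arriving at queue length q is accepted."""
        if q < a:
            return True
        if q > b:
            return False
        p_drop = (q - a) / float(b - a)   # rises linearly from 0 at q=a to 1 at q=b
        return random.random() >= p_drop

Whether a and b can then be chosen so that E[q] and Var[q] stay constant is
really a question about the closed loop this rule forms with the sources,
not about the rule in isolation.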
> The easiest way to make the queueing delay constant, or nearly so, is to introduce wait times where the link is idle even though there are packets in the queue. This reduces delay jitter in the system and makes the whole network more circuit-like. By introducing new 'work', the system is what is called 'non-work-conserving'. Such systems were studied extensively in the early 90's. For more details, you should look up Hui Zhang's comprehensive survey on scheduling: "Service disciplines for guaranteed performance service in packet-switching networks" Proceedings of the IEEE, Volume 83, Issue 10, Oct. 1995 Page(s):1374 - 1396. hope this helps keshav From detlef.bosau at web.de Wed Aug 3 15:20:53 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 04 Aug 2005 00:20:53 +0200 Subject: [e2e] end2end-interest Digest, Vol 17, Issue 26 References: Message-ID: <42F14345.7000804@web.de> S. Keshav wrote: > > The easiest way to make the queueing delay constant, or nearly so, is to > introduce wait times where the link is idle even though there are packets in > the queue. This reduces delay jitter in the system and makes the whole > network more circuit-like. By introducing new 'work', the system is what is Exactly. And I?m not quite sure whether it?s that what I want to do. > called 'non-work-conserving'. Such systems were studied extensively in the > early 90's. For more details, you should look up Hui Zhang's comprehensive > survey on scheduling: "Service disciplines for guaranteed performance > service in packet-switching networks" Proceedings of the IEEE, Volume 83, > Issue 10, Oct. 1995 Page(s):1374 - 1396. > > hope this helps It exactly marks the problem. The more circuit like a network is, the less are the economical advantages for typical "packet switching users". When we make a delay?s _expectation_ constant for a certain amount of time, we can well accept a large variation. Jitter is not the problem. So, this could be overkill here. However, I don?t know of a "weaker" way. In my other post from today (Augst, 3rd) I tried to weaken the problem that way, that I only ask for a limited forecast capability. It is not necessary to keep a queueing delay constant or makeing it obey a certain distribution. It would be sufficient to forecast its expectation, and if possible its variance, for a limited period of time, e.g. 200 ms. Do you think, there?s a way to do so, thereby maintaining the typical "packet-switching best effort" nature of the Internet? Perhaps, this is a borderline between "best effort" traffic shaping (if this even exists) and some kind of guaranteed service. I really don?t know yet. -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From detlef.bosau at web.de Thu Aug 4 05:50:28 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 04 Aug 2005 14:50:28 +0200 Subject: [e2e] Expected latency for a single hop Message-ID: <42F20F14.3000007@web.de> I posted this in another context yesterday, but perhaps, I should isolate the problem to state it more clearly. Consider an arbitrary packet-switching network. Consider two adjacent nodes n1, n2 with link l in between n1--------------------------n2 l Consider a packet traveling the network, it?s path shall contain n1 and n2 subsequently. Now, let t1: packet?s arrival time on r1. t2: packet?s arrival time on r2. Can we forecast expectaition and variance (if only for the _near_ future!) for the "one hop latency" t2 - t1 ? 
I explicitely focus on a "best effort" context. For link l I assume, that expectation and variance of the transport latency exist. Is there any work in this direction? -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From craig at aland.bbn.com Thu Aug 4 07:09:22 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Thu, 04 Aug 2005 10:09:22 -0400 Subject: [e2e] Expected latency for a single hop In-Reply-To: Your message of "Thu, 04 Aug 2005 14:50:28 +0200." <42F20F14.3000007@web.de> Message-ID: <20050804140922.821FF1FF@aland.bbn.com> Is l a physical link, an IPsec or IP-in-IP tunnel, or ...? Note that if it is a tunnel, the answer is that the expectation and variance of latency is potentially the same as any random multi-hop Internet path.... Craig In message <42F20F14.3000007 at web.de>, Detlef Bosau writes: >I posted this in another context yesterday, but perhaps, I should >isolate the problem to state it more clearly. > >Consider an arbitrary packet-switching network. > >Consider two adjacent nodes n1, n2 with link l in between > >n1--------------------------n2 > l > > >Consider a packet traveling the network, it´s path shall contain n1 and >n2 subsequently. > >Now, let > t1: packet´s arrival time on r1. > t2: packet´s arrival time on r2. > >Can we forecast expectaition and variance (if only for the _near_ >future!) for the "one hop latency" t2 - t1 ? > >I explicitely focus on a "best effort" context. > >For link l I assume, that expectation and variance of the transport >latency exist. > >Is there any work in this direction? > > >-- >Detlef Bosau >Galileistrasse 30 >70565 Stuttgart >Mail: detlef.bosau at web.de >Web: http://www.detlef-bosau.de >Mobile: +49 172 681 9937 From keshav at uwaterloo.ca Thu Aug 4 07:30:35 2005 From: keshav at uwaterloo.ca (S. Keshav) Date: Thu, 04 Aug 2005 10:30:35 -0400 Subject: [e2e] end2end-interest Digest, Vol 17, Issue 26 In-Reply-To: <42F14345.7000804@web.de> Message-ID: Detlef, In general, what you are asking for is difficult. Consider the following scenario. Suppose a router forecasts that the queueing delays at a particular interface are small at time t and expects this forecast to hold until t+200ms. Now, suddenly, a burst of packets from multiple input ports destined to that interface arrive at time t+epsilon. This builds up the queue, increasing delays. You have two choices: 1. violate the forecast or 2. drop packets in order to meet the forecast. Neither one is a good alternative. If you violate the forecast, then what use is it? If you drop packets to meet the forecast, that's a waste, because adequate buffers exist. I do not think that dropping packets in order to make RTO computations sane is a good tradeoff. A similar situation holds if traffic is generally high, so that queue lengths are large, and you forecast a large delay. Now, if the traffic dies down, you have to either violate the forecast or add new work to the system. Adding new work delays all subsequent packets, so if you now get a burst, you are in trouble. As such, I believe that any sort of forecast is only possible if there is a way to bound the total incoming traffic, both in terms of rate and burstiness. keshav > > In my other post from today (Augst, 3rd) I tried to weaken the problem > that way, that I only ask for a limited forecast capability. It is not > necessary to keep a queueing delay constant or makeing it obey a certain > distribution. 
It would be sufficient to forecast its expectation, and if > possible its variance, for a limited period of time, e.g. 200 ms. > > Do you think, there?s a way to do so, thereby maintaining the typical > "packet-switching best effort" nature of the Internet? > > Perhaps, this is a borderline between "best effort" traffic shaping (if > this even exists) and some kind of guaranteed service. I really don?t > know yet. > From dpreed at reed.com Thu Aug 4 08:19:08 2005 From: dpreed at reed.com (David P. Reed) Date: Thu, 04 Aug 2005 11:19:08 -0400 Subject: [e2e] Expected latency for a single hop In-Reply-To: <42F20F14.3000007@web.de> References: <42F20F14.3000007@web.de> Message-ID: <42F231EC.1060100@reed.com> Detlef - Though it seems simple, your statement is about as complex as a problem can be. This is the kind of problem statement that creates the definitional trap I was referring to in earlier discussions. By construing the "latency" as being a propery of the "link" rather than of the network as a whole, the statement acquires a misleading simplicity The latency only is well defined for real packets that actually arrive and traverse the link. Expectation and variance are properties of distributions, not packets. There is no random process at all on the link itself (at least in the common case - there are links where the link itself has a random delay, but that usually arises where the link's physical characteristics vary faster and larger than the queue management and link pacing mechanisms). The random process is the network environment that provides competing packets. So the latency is everywhere but the link itself. The other issue is that prediction is more reliable over a collection of packets, but a sufficient collection cannot happen in an instant. The first order predictor is the queue size at the entry to the link. That's a very reliable predictor of latency for the next event. But it provides very little input about variance (which depends entirely on packets arriving from elsewhere at "light speed"). I think there might be a much better (i.e. less complex to state) approach in NOT trying to start with the link and go by induction to the multilink case. Instead, perhaps start with an end-to-end flow (over a path) and reason about what happens as you add flows that superpose themselves on the existing paths. Detlef Bosau wrote: > I posted this in another context yesterday, but perhaps, I should > isolate the problem to state it more clearly. > > Consider an arbitrary packet-switching network. > > Consider two adjacent nodes n1, n2 with link l in between > > n1--------------------------n2 > l > > > Consider a packet traveling the network, it?s path shall contain n1 > and n2 subsequently. > > Now, let > t1: packet?s arrival time on r1. > t2: packet?s arrival time on r2. > > Can we forecast expectaition and variance (if only for the _near_ > future!) for the "one hop latency" t2 - t1 ? > > I explicitely focus on a "best effort" context. > > For link l I assume, that expectation and variance of the transport > latency exist. > > Is there any work in this direction? 
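PS: to make the "first order predictor" concrete: for a FIFO link with a
fixed rate, the backlog at the instant a packet is enqueued already gives
the deterministic part of its one-hop delay; everything that arrives from
elsewhere afterwards is the random part. A toy sketch (all numbers made up):

    def predicted_one_hop_delay(queued_bytes, pkt_bytes, link_rate_bps, prop_delay_s):
        """Deterministic part of the next packet's one-hop latency on a FIFO link."""
        backlog = queued_bytes + pkt_bytes      # bytes ahead of and including this packet
        return 8.0 * backlog / link_rate_bps + prop_delay_s

    # Example: 40 kB already queued, 1500-byte packet, 2 Mbit/s link, 10 ms
    # propagation: predicted_one_hop_delay(40000, 1500, 2e6, 0.010) ~ 0.176 s.

The variance is precisely the part this predictor cannot see.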
From nicolasc at andrew.cmu.edu  Thu Aug  4 07:59:50 2005
From: nicolasc at andrew.cmu.edu (Nicolas Christin)
Date: Thu, 4 Aug 2005 10:59:50 -0400
Subject: [e2e] end2end-interest Digest, Vol 17, Issue 26
In-Reply-To: <42F14345.7000804@web.de>
References: <42F14345.7000804@web.de>
Message-ID: <20050804145950.GA18305@lithium.ini.cmu.edu>

Detlef,

On Wed Aug 03, 2005, Detlef Bosau wrote:
> 
> Do you think, there's a way to do so, thereby maintaining the typical
> "packet-switching best effort" nature of the Internet?
> 
> Perhaps, this is a borderline between "best effort" traffic shaping (if
> this even exists) and some kind of guaranteed service. I really don't
> know yet.

I actually studied a very related problem in the good ol' days of my Ph.D.
dissertation. I basically tried to combine buffer management and packet
scheduling to provide service differentiation without admission control.
The trick is essentially that the bounds that you give to traffic classes
are actually soft, in that they can be violated. (Keshav is completely right
- when traffic is really bursty, it is very difficult to do any type of
intelligent prediction, and you might end up with something that is not much
better than best effort.) The good news is that you can do quite a lot if
you combine scheduling with packet dropping, and even more when you start
looking at ways of playing with TCP congestion control to do essentially
endpoint admission control for you.

If you are interested, a summary of my dissertation is in:

N. Christin and J. Liebeherr. A QoS Architecture for Quantitative Service
Differentiation. In IEEE Communications Magazine 41(6), Special Issue on
Scalability in IP-Oriented Networks, pages 38-45. June 2003.
http://www.comsoc.org/livepubs/ci1/DLPREVIEW/christin.pdf

Best,
Nicolas

From detlef.bosau at web.de  Thu Aug  4 11:27:27 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Thu, 04 Aug 2005 20:27:27 +0200
Subject: [e2e] Expected latency for a single hop
References: <42F20F14.3000007@web.de> <42F231EC.1060100@reed.com>
Message-ID: <42F25E0F.4000601@web.de>

David P. Reed wrote:
> Detlef - Though it seems simple, your statement is about as complex as a
> problem can be.
> This is the kind of problem statement that creates the definitional trap
> I was referring to in earlier discussions. By construing the "latency"
> as being a propery of the "link" rather than of the network as a whole,
> the statement acquires a misleading simplicity
> 

I know. However, the rationale behind my question is quite obvious: if you
place a TCP sender at n1 and the according receiver at n2, the adaptive RTO
mechanism in TCP relies exactly upon estimated mean and variance of (t2-t1).

If I had written: "Can we provide an adaptive RTO for a single hop TCP
connection?", I would surely have been directed to the relevant literature.
Perhaps it would have been considered a stupid question. Thus, I thought it
might be useful to state the same problem (sic!) which TCP claims to solve
(even for n hops!) in somewhat different words >:-)

Honestly, I believe that if we cannot estimate mean and variance for a
_single_ hop, it's perhaps not that much easier to do the job for an
arbitrary number of hops (which of course includes the nasty case of a
single hop).
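Just to fix what I mean by "the adaptive RTO mechanism": roughly the usual
estimator pair sketched below (the gains 1/8 and 1/4 and the factor 4 are
only the customary defaults, they are not taken from Edge's paper):

    def update_rto(srtt, rttvar, sample):
        """One update of the usual smoothed-mean / smoothed-deviation estimator.

        srtt   - smoothed RTT (the mean estimate)
        rttvar - smoothed mean deviation (a cheap stand-in for the std deviation)
        sample - the new measurement, here (t2 - t1) for the single hop
        """
        alpha, beta, k = 1.0 / 8, 1.0 / 4, 4
        rttvar = (1 - beta) * rttvar + beta * abs(sample - srtt)   # deviation first
        srtt = (1 - alpha) * srtt + alpha * sample
        return srtt, rttvar, srtt + k * rttvar                     # rto = srtt + k*rttvar

The factor k only buys a Chebyshev-like tail bound if the deviation estimate
is meaningful for the distribution at hand - which is exactly what is in
question here.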
> The latency only is well defined for real packets that actually arrive
> and traverse the link. Expectation and variance are properties of
> distributions, not packets.
> 

Yes. The intention of the EWMA filtering used in TCP is an attempt to do a
parameterless forecast of mean and variance of the actual latency
distribution. (Some weeks ago someone referred me to this well known saying
by Niels Bohr: "Prediction is hard, especially of the future.") Thus, of
course we estimate properties of an unknown (!) distribution and in turn
derive an RTO by application of an inequality similar to Chebyshev's
inequality. However, the basic assumption is that we can provide estimates
for mean and variance of a packet's round trip time.

> There is no random process at all on the link itself (at least in the
> common case - there are links where the link itself has a random delay,
> but that usually arises where the link's physical characteristics vary
> faster and larger than the queue management and link pacing

My assumptions on l are tough. I totally agree with Craig here. In general,
we know nothing about l. It may be a tunnel, it may be a mobile wireless
link. E.g. for mobile wireless links, I do not even know whether a finite
variance of the link's latency distribution exists.

> mechanisms). The random process is the network environment that
> provides competing packets. So the latency is everywhere but the link
> itself.
> 
> The other issue is that prediction is more reliable over a collection of
> packets, but a sufficient collection cannot happen in an instant.
> 

I did not make any assumptions here; especially, I did not assume that the
estimation should be based upon the observation of one single packet.
Perhaps my formulation was somewhat misleading here. We could use test
packets sent from n1 to n2, or observe traffic from several flows. If n1 and
n2 are routers and we can observe a large number of flows, the job should be
much easier than if it is done at a TCP flow's source, which has to rely on
a _very_ rough sample.

My favourite example is always a TCP path including a 2400 bps link.
(Nowadays forgotten; two years ago known from GSM - and we all know it from
the good old modem times.) Depending on the MSS, the sender gets a sample
every second or so. For the wirebound systems in between, one second is
_ages_. In one of my posts, I claimed (please correct me if I'm wrong) that
in contemporary networks even link bandwidths cover a range of eight orders
of magnitude. While our single packet crawls along a GSM link, a Tier 1
backbone link may convey the whole Encyclopedia Britannica within the same
period of time. However, in a scenario

Sender----Tier1/Enc.Brit. Link--------router----GSM---receiver

the sender estimates mean and variance of the round trip time using EWMA
filters and the extremely rough time series gained from the ACK packets.

Question: Is there a justification for doing so?

I looked at Edge's paper, especially at the assumptions for the observation
variables, i.e. the time series Tn (Tn: stochastic variables, tn:
instances). One sufficient assumption is that all Tn share the same mean and
variance. Some "drift" is accepted, as are "occasional steps" (put in my own
words). When I look at my "Britannica example" and consider "sane mean and
variance", I do not feel comfortable with these assumptions.

> The first order predictor is the queue size at the entry to the link.
> That's a very reliable predictor of latency for the next event.
But it > provides very little input about variance (which depends entirely on > packets arriving from elsewhere at "light speed"). > > I think there might be a much better (i.e. less complex to state) > approach in NOT trying to start with the link and go by induction to the > multilink case. Instead, perhaps start with an end-to-end flow (over a > path) and reason about what happens as you add flows that superpose > themselves on the existing paths. > Is this really that more promising? Admittedly, this a rhetorical question. Basically, this is already being done. So I sharpened it a little bit by omitting n-1 links in the n link case ;-) Or is it a matter of how coarse or fine grained we look at the problem? I?m thinking about this problem - and at the same time, I use TCP and everything seems just to be fine :-) Then I read Raj Jain?s paper about the divergence of RTO estimators and Lixia Zhang?s paper on TCP timers. And I understand that we already addressed a number, perhaps nearly all, issues in these papers. But one issue which I do not yet understand is the use of EWMA filters. - Do they hold for arbitrary TCP connections? Can we reasonable assume the necesseary conditions given by Edge? Or alternative ones? - Do they converge fast enough in case of a sudden step in latency? Do they follow drifting latencies? How must we set the gain? I sometimes here something about "agility" and "stability". Basically, we should minimize the forecast error by proper choice of the gain. Can we use the same gain for all flows? - Is the temporal resolution of an ACK clocked TCP flow sufficient to provide reasonable estimates? Or is the time series? resolution obtained from that too coarse? (Nilsson, Martin and Rhee do so claim in there paper on lateny change / congestion correlation in June, 2003. One central point there was that the temporal resolution of observed round trip times in most cases is by far too coarse to derive reasonable conclusions concerning path properties.) I get no "feeling" for this situation. I see lots of scenarios and individual papers there, but I don?t see the big picture yet. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From detlef.bosau at web.de Thu Aug 4 13:19:25 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 04 Aug 2005 22:19:25 +0200 Subject: [e2e] end2end-interest Digest, Vol 17, Issue 26 References: Message-ID: <42F2784D.5020809@web.de> S. Keshav wrote: > Detlef, > In general, what you are asking for is difficult. Consider the following > scenario. Suppose a router forecasts that the queueing delays at a > particular interface are small at time t and expects this forecast to hold > until t+200ms. Now, suddenly, a burst of packets from multiple input ports > destined to that interface arrive at time t+epsilon. This builds up the > queue, increasing delays. You have two choices: > > 1. violate the forecast > or > 2. drop packets in order to meet the forecast. > > Neither one is a good alternative. If you violate the forecast, then what > use is it? If you drop packets to meet the forecast, that's a waste, because > adequate buffers exist. I do not think that dropping packets in order to > make RTO computations sane is a good tradeoff. > Perhaps, we talk a litte bit cross purposes here. What I?m trying to understand is the estimation of mean and variation of RTT in TCP flows. I don?t want to give any guarantees. 
So the purpose of a forecast is only to estimate latencies for the near future. If there is a traffic burst, then the forecast may be violated. So what? It?s an _estimate_. Moreover, it?s an estimate for a _mean_. An actual latency may well be greater or less. Basically, there are two objectives: 1. provide an RTO estimator with _less_ assumptions than e.g. Edge?s algorithm. 2. alleviate the settling behaviour and the consequences of the sometimes quite rough sampling done by the usual RTT observation. Perhaps, this could be helpful, I don?t know yet. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From detlef.bosau at web.de Mon Aug 8 09:27:23 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 08 Aug 2005 18:27:23 +0200 Subject: [e2e] Expected latency for a single hop: What about 802.11 networks? References: <42F20F14.3000007@web.de> <42F231EC.1060100@reed.com> <42F25E0F.4000601@web.de> Message-ID: <42F787EB.3040006@web.de> I just had a very first glance on a paper by Christoph Lindemann et al., MobiHoc 05. The paper deals with TCP in multihop wireless networks, as far as I see particularly 802.11 networks. The paper mentions the typical consideration: In wireless networks, corruption based loss happens more often than corruption based drop. Now, first of all: What is the MAC algorithm in 802.11 ad hoc (not infrastructure!) networks / MANETs? To the best of my knowledge, this is ALOHA. (BTW: I would greatly appreciate a copy of Abramsons Paper. It?s on my reading list, but I could not find it yet.) AFAIK, ALOHA does _not_ detect collisions but relys upon positive acknowledments: A packet is sent, repeated if necessary, until it is acknowledged by the receiver. Q: Is this correct? If so, we have implict retransmissions on the MAC layer here. Particularly, we would observe transport latencies as the temporal distance between the first sending attempt and the final reception. This seems to be similar to the latency estimation used in the ARPAnet in the 80s and which is proven to be insufficient / divergent according to Jains paper "Divergence of Timeout Algorithms....", refer to the discussion concerning "Round Trip Delay with Retransmissions" in that paper. Q: Does this mean, it is difficult to obtain correct latency estimates by pure TCP/ACK observation in case of networks where local recovery is implicit/compulsory? Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From dpreed at reed.com Mon Aug 8 10:31:25 2005 From: dpreed at reed.com (David P. Reed) Date: Mon, 08 Aug 2005 13:31:25 -0400 Subject: [e2e] Expected latency for a single hop: What about 802.11 networks? In-Reply-To: <42F787EB.3040006@web.de> References: <42F20F14.3000007@web.de> <42F231EC.1060100@reed.com> <42F25E0F.4000601@web.de> <42F787EB.3040006@web.de> Message-ID: <42F796ED.7040401@reed.com> The MAC protocol in 802.11 is not ALOHA. You'd best get the spec if you really want to understand it, because it's pretty complex. It doesn't detect collisions, however. Nor does it depend on positive acks. It relies on collision avoidance techniques to reduce collision losses to a low enough level, and end-to-end acks to clean up the rest. There is a "polled" mode (point coordination function) that is hardly ever implemented. 
Instead, the "distributed coordination function" (DCF) is typically employed, but modified in many cases by RTS/CTS exchanges, this latter being the means to reduce collisions in most cases (CTS is a positive ack for RTS). Many networks are set up so that CTS/RTS applies only to long frames (i.e. file transfers). Ultimately, it means that what TCP/ACK observation sees when an 802.11 link is involved depends on how well the CTS/RTS works. From rja at extremenetworks.com Mon Aug 8 11:44:45 2005 From: rja at extremenetworks.com (RJ Atkinson) Date: Mon, 8 Aug 2005 14:44:45 -0400 Subject: [e2e] Expected latency for a single hop: What about 802.11 networks? In-Reply-To: <42F796ED.7040401@reed.com> References: <42F20F14.3000007@web.de> <42F231EC.1060100@reed.com> <42F25E0F.4000601@web.de> <42F787EB.3040006@web.de> <42F796ED.7040401@reed.com> Message-ID: <8F6DF257-8277-47FC-92A5-13EB5793E349@extremenetworks.com> On Aug 8, 2005, at 13:31, David P. Reed wrote: > The MAC protocol in 802.11 is not ALOHA. You'd best get the spec > if you really want to understand it, because it's pretty complex. > By the way, most IEEE 802.* standards are available in PDF at no cost from this URL: http://standards.ieee.org/getieee802/ From detlef.bosau at web.de Mon Aug 8 14:46:35 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 08 Aug 2005 23:46:35 +0200 Subject: [e2e] Expected latency for a single hop: What about 802.11 networks? References: <42F20F14.3000007@web.de> <42F231EC.1060100@reed.com> <42F25E0F.4000601@web.de> <42F787EB.3040006@web.de> <42F796ED.7040401@reed.com> Message-ID: <42F7D2BB.5040009@web.de> David P. Reed wrote: > The MAC protocol in 802.11 is not ALOHA. You'd best get the spec if you > really want to understand it, because it's pretty complex. > > It doesn't detect collisions, however. Nor does it depend on positive > acks. It relies on collision avoidance techniques to reduce collision > losses to a low enough level, and end-to-end acks to clean up the rest. > Oh :-( You just have destroyed my view of life.......... I knew about the CA stuff before, but not that 802.11 in fact does not care, when collision actually _occurs_. (Call me lazybones, call me coward, but I avoid reading IEEE standards whenever possible =8-0 It?s nevertheless inevitable sometimes, but I rather read 20 RFCs than 1 IEEE stanard. O.k., it?s a standard, not a cartoon.....) However, what you say here totally changes my way of thinking. I typically compare WLAN and Ethernet, which is still possible for low loads and when single, independent segments are compared. I.e., collusion does hardly occur and in a single segment e2e recovery should not behave that different than ALOHA, moreover there is hardly any network capacity at all and CWND etc. is small. In case of increasing load, and therefore an increasing number of collisions), and if the 802.11 network is the last link in a number of subsequent links, there should be quite a difference to Ethernet when all collision losses must be cured end to end.... O.k., bearing this in mind, local recovery protocols like snoop appear totally different to me than before. I think, it will take some days for me to understand all the consequences. Thanks a lot. I?ve learned something new today. BTW: (Of course I will find it in the standards, it?s only I fear it?s on page 345 of 800....) What is the _reason_ for this decision _not_ to handle actual collisions locally but leave it to the e2e protocol? To my understanding (up to now...) 
CA does _avoid_ collisions but does not totally prevent them. Or is CA that successfull that actual collisions can nearly be neglected? Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From crk at research.att.com Mon Aug 8 18:43:26 2005 From: crk at research.att.com (crk@research.att.com) Date: Mon, 8 Aug 2005 21:43:26 -0400 Subject: [e2e] Expected latency for a single hop: What about 802.11networks? Message-ID: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> QoS for 802.11 was actually fairly recently defined in the 802.11e specifications. Since the scheduled "HCCA" mode is a new addition, it is true that it's not widely deployed, but that will change. The unscheduled QoS mode uses randomized backoff timers with the max value determined by traffic class; the scheduled HCCA mode allows clients to provide a Tspec to an Access Point, which can then provide bounded and predictable delay. Since HCCA "parameterizes" QoS via scheduled "polls", the latency is normally max'd at the superframe beacon interval, but can be less. A typical example is 20 msec in one implementation that we've worked on with several vendors. These kinds of guarantees will be needed if you ever want to use 802.11 to provide WVoIP in an enterprise environment... I believe we have some good model outputs from simulations that we could share if there is an interest... Regards, chuck -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of David P. Reed Sent: Monday, August 08, 2005 1:31 PM To: Detlef Bosau Cc: Michael.kochte at gmx.net; end2end-interest at postel.org Subject: Re: [e2e] Expected latency for a single hop: What about 802.11networks? The MAC protocol in 802.11 is not ALOHA. You'd best get the spec if you really want to understand it, because it's pretty complex. It doesn't detect collisions, however. Nor does it depend on positive acks. It relies on collision avoidance techniques to reduce collision losses to a low enough level, and end-to-end acks to clean up the rest. There is a "polled" mode (point coordination function) that is hardly ever implemented. Instead, the "distributed coordination function" (DCF) is typically employed, but modified in many cases by RTS/CTS exchanges, this latter being the means to reduce collisions in most cases (CTS is a positive ack for RTS). Many networks are set up so that CTS/RTS applies only to long frames (i.e. file transfers). Ultimately, it means that what TCP/ACK observation sees when an 802.11 link is involved depends on how well the CTS/RTS works. From detlef.bosau at web.de Tue Aug 16 01:54:36 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 16 Aug 2005 10:54:36 +0200 Subject: [e2e] Latency Variation and Contention. References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> Message-ID: <4301A9CC.8090103@web.de> Hi to all. Recently, I found the following paper by Sherif M. ElRakabawy, Alexander Klemm and Christoph Lindemann: http://mobicom.cs.uni-dortmund.de/publications/TCP-AP_MobiHoc05.pdf The paper proposes a congestion control algorithm for ad hoc networks. Perhaps, this paper is interesting within the context of our latency discussion. However, I?m not yet convinced of this work. 
If I leave out some sheets of paper, some simulations and many words, the paper basically assumes that in ad hoc networks a TCP sender can measurethe degree of network contention using the variance of (recently seen) round trip times: -If the variance is close to zero, the network is hardly loaded. -If the variance is "high" (of course "high" is to be defined) there is a high degree of contention on this network. Afterwards the authors propose a sender pacing scheme, where a TCP flow?s rate is decreased with respect to the so measured "degree of contention". What I do not yet understand is basic assumption: variance 0 <=> no load; variance high <=> heavy load. Perhaps the main difficulty is that I believed this myself for years and it was an admittedly difficult task to convince me that I was wrong %-) However, @article{martin, journal = " IEEE/ACM TRANSACTIONS ON NETWORKING", volume ="11", number = "3", month = "June", year = "2003", title = "Delay--Based Congestion Avoidance for TCP", author = "Jim Martin and Arne Nilsson and Injong Rhee", } eventually did the job. More precisely, I looked at the latencies themselves, not the variances. Let?s consider a simple example. A network B "network" is some shared media packet switching network. Let?s place a TCP sender on A and the according sink on B. The simple question is (and I thought about this years ago without really coming to an end - I?m afraid I didn?t want to): Is a variance close to zero really equivalent for a low load situation? And does increasing variance indicate increasing load? Isn?t it possible that a variance close to zero is a consequence of a fully loaded network? And _decreasing_ load in that situation would cause the latencies to vary? If we could reliably identify a low load situation from a varaince close to zero, we could use the latencies themselves as a load indicator because we could reliably identify a "no load latency" and thus could identify imminent congestion by latency observation. One could even think of a "latency-congestion scale" which is calibrated first by variance observation in order to get the "unloaded" mark and second by drop observation and some loss differentation technique to get the "imminent congestion" mark. To my knowledge, this is extensively discussed in literature - until Martin, Nilsson and Rhee found the mentioned results. Now, back to my example and the basic question: Does the assumption, latency variations indicate the degree of contention in an ad hoch network, really hold? I admit, I personally do not yet see an evidence for this. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From s.malik at tuhh.de Tue Aug 16 03:22:55 2005 From: s.malik at tuhh.de (Sireen Habib Malik) Date: Tue, 16 Aug 2005 12:22:55 +0200 Subject: [e2e] Latency Variation and Contention. In-Reply-To: <4301A9CC.8090103@web.de> References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> <4301A9CC.8090103@web.de> Message-ID: <4301BE7F.9080107@tuhh.de> Hi, Have not read the paper, however, I think that if, RTT = Round Trip Time, and dRTT = variations in RTT, then "dRTT" is a weak/poor indicator of congestion. A congestion signal based upon "dRTT/RTT" would give a much better idea, relatively speaking. -- Sireen Detlef Bosau wrote: > Hi to all. > > Recently, I found the following paper by Sherif M. 
ElRakabawy, > Alexander Klemm and Christoph Lindemann: > > http://mobicom.cs.uni-dortmund.de/publications/TCP-AP_MobiHoc05.pdf > > The paper proposes a congestion control algorithm for ad hoc networks. > Perhaps, this paper is interesting within the context of our latency > discussion. > > However, I?m not yet convinced of this work. > > If I leave out some sheets of paper, some simulations and many words, > the paper basically assumes that in ad hoc networks a TCP sender can > measurethe degree of network contention using the variance of > (recently seen) round trip times: > > -If the variance is close to zero, the network is hardly loaded. > -If the variance is "high" (of course "high" is to be defined) there > is a high degree of contention on this network. > > Afterwards the authors propose a sender pacing scheme, where a TCP > flow?s rate is decreased with respect to the so measured "degree of > contention". > > What I do not yet understand is basic assumption: variance 0 <=> no > load; variance high <=> heavy load. > > Perhaps the main difficulty is that I believed this myself for years > and it was an admittedly difficult task to convince me that I was > wrong %-) > However, > > @article{martin, > journal = " IEEE/ACM TRANSACTIONS ON NETWORKING", > volume ="11", > number = "3", > month = "June", > year = "2003", > title = "Delay--Based Congestion Avoidance for TCP", > author = "Jim Martin and Arne Nilsson and Injong Rhee", > } > eventually did the job. > > More precisely, I looked at the latencies themselves, not the variances. > > > Let?s consider a simple example. > > A network B > > "network" is some shared media packet switching network. > Let?s place a TCP sender on A and the according sink on B. > > The simple question is (and I thought about this years ago without > really coming to an end - I?m afraid I didn?t want to): > > Is a variance close to zero really equivalent for a low load situation? > And does increasing variance indicate increasing load? > > Isn?t it possible that a variance close to zero is a consequence of a > fully loaded network? And _decreasing_ load in that situation would > cause the latencies to vary? > > If we could reliably identify a low load situation from a varaince > close to zero, we could use the latencies themselves as a load > indicator because we could reliably identify a "no load latency" and > thus could identify imminent congestion by latency observation. > > One could even think of a "latency-congestion scale" which is > calibrated first by variance observation in order to get the > "unloaded" mark and second by drop observation and some loss > differentation technique to get the "imminent congestion" mark. > > To my knowledge, this is extensively discussed in literature - until > Martin, Nilsson and Rhee found the mentioned results. > > Now, back to my example and the basic question: Does the assumption, > latency variations indicate the degree of contention in an ad hoch > network, really hold? > > I admit, I personally do not yet see an evidence for this. > > Detlef -- M.Sc.-Ing. 
Sireen Malik Communication Networks Hamburg University of Technology FSP 4-06 (room 5.012) Schwarzenbergstrasse 95 (IVD) 21073-Hamburg, Deutschland Tel: +49 (40) 42-878-3443 Fax: +49 (40) 42-878-2941 E-Mail: s.malik at tuhh.de --Everything should be as simple as possible, but no simpler (Albert Einstein) From detlef.bosau at web.de Tue Aug 16 03:57:16 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 16 Aug 2005 12:57:16 +0200 Subject: [e2e] Latency Variation and Contention. References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> <4301A9CC.8090103@web.de> <4301BE7F.9080107@tuhh.de> Message-ID: <4301C68B.7018257A@web.de> Sireen Habib Malik wrote: > > Hi, > > Have not read the paper, however, I think that if, > > RTT = Round Trip Time, and > dRTT = variations in RTT, > > then "dRTT" is a weak/poor indicator of congestion. > > A congestion signal based upon "dRTT/RTT" would give a much better idea, > relatively speaking. Hm. At least, it looks more complex ;-) However, it does not really affect the "hi-lo-quest". As far as I see, the basic question is: Can we detect / react upon network congestion by latency observation? It is no big deal whether we look at the RTT or variance. We can even look at higher moments of RTT (skewness, curtosis), we can introduce quantiles and thresholds, we can use any formula TeX is able to print :-) The question is: Can we distinguish a loaded network from an unloaded one by (pure) latency observation / evaluation. Detlef > > -- > Sireen > -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From touch at ISI.EDU Tue Aug 16 16:29:25 2005 From: touch at ISI.EDU (Joe Touch) Date: Tue, 16 Aug 2005 16:29:25 -0700 Subject: [e2e] Latency Variation and Contention. In-Reply-To: <4301BE7F.9080107@tuhh.de> References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> <4301A9CC.8090103@web.de> <4301BE7F.9080107@tuhh.de> Message-ID: <430276D5.1010706@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sireen Habib Malik wrote: > Hi, > > Have not read the paper, however, I think that if, > > RTT = Round Trip Time, and > dRTT = variations in RTT, > > then "dRTT" is a weak/poor indicator of congestion. but a good indicator that congestion control will be hard to compute ;-) stability is f(dRTT), not f(RTT) RTT is a function of distance, in general dRTT is a function of the number of hops, in general Changes in the two - relative or absolute - don't seem to tell you much more than that, though. > A congestion signal based upon "dRTT/RTT" would give a much better idea, > relatively speaking. relative variance = variance/mean but noise is more closely correlated to variance than to relative variance, which makes sense if dRTT = variance what you are aiming at is SNR, i.e., 10log10(RTT/dRTT) Joe > > -- > Sireen > > > > > > > > > > > Detlef Bosau wrote: > >> Hi to all. >> >> Recently, I found the following paper by Sherif M. ElRakabawy, >> Alexander Klemm and Christoph Lindemann: >> >> http://mobicom.cs.uni-dortmund.de/publications/TCP-AP_MobiHoc05.pdf >> >> The paper proposes a congestion control algorithm for ad hoc networks. >> Perhaps, this paper is interesting within the context of our latency >> discussion. >> >> However, I?m not yet convinced of this work. 
>> >> If I leave out some sheets of paper, some simulations and many words, >> the paper basically assumes that in ad hoc networks a TCP sender can >> measurethe degree of network contention using the variance of >> (recently seen) round trip times: >> >> -If the variance is close to zero, the network is hardly loaded. >> -If the variance is "high" (of course "high" is to be defined) there >> is a high degree of contention on this network. >> >> Afterwards the authors propose a sender pacing scheme, where a TCP >> flow?s rate is decreased with respect to the so measured "degree of >> contention". >> >> What I do not yet understand is basic assumption: variance 0 <=> no >> load; variance high <=> heavy load. >> >> Perhaps the main difficulty is that I believed this myself for years >> and it was an admittedly difficult task to convince me that I was >> wrong %-) >> However, >> >> @article{martin, >> journal = " IEEE/ACM TRANSACTIONS ON NETWORKING", >> volume ="11", >> number = "3", >> month = "June", >> year = "2003", >> title = "Delay--Based Congestion Avoidance for TCP", >> author = "Jim Martin and Arne Nilsson and Injong Rhee", >> } >> eventually did the job. >> >> More precisely, I looked at the latencies themselves, not the variances. >> >> >> Let?s consider a simple example. >> >> A network B >> >> "network" is some shared media packet switching network. >> Let?s place a TCP sender on A and the according sink on B. >> >> The simple question is (and I thought about this years ago without >> really coming to an end - I?m afraid I didn?t want to): >> >> Is a variance close to zero really equivalent for a low load situation? >> And does increasing variance indicate increasing load? >> >> Isn?t it possible that a variance close to zero is a consequence of a >> fully loaded network? And _decreasing_ load in that situation would >> cause the latencies to vary? >> >> If we could reliably identify a low load situation from a varaince >> close to zero, we could use the latencies themselves as a load >> indicator because we could reliably identify a "no load latency" and >> thus could identify imminent congestion by latency observation. >> >> One could even think of a "latency-congestion scale" which is >> calibrated first by variance observation in order to get the >> "unloaded" mark and second by drop observation and some loss >> differentation technique to get the "imminent congestion" mark. >> >> To my knowledge, this is extensively discussed in literature - until >> Martin, Nilsson and Rhee found the mentioned results. >> >> Now, back to my example and the basic question: Does the assumption, >> latency variations indicate the degree of contention in an ad hoch >> network, really hold? >> >> I admit, I personally do not yet see an evidence for this. >> >> Detlef > > > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDAnbVE5f5cImnZrsRAtC9AKDYkULbaAz4y93+Ym5iIuv/rVZEWgCfW5vy MELJpDvHjw5QDGjl4dDUtLU= =lMcl -----END PGP SIGNATURE----- From s.malik at tuhh.de Wed Aug 17 02:36:41 2005 From: s.malik at tuhh.de (Sireen Habib Malik) Date: Wed, 17 Aug 2005 11:36:41 +0200 Subject: [e2e] Latency Variation and Contention. 
In-Reply-To: <430276D5.1010706@isi.edu> References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> <4301A9CC.8090103@web.de> <4301BE7F.9080107@tuhh.de> <430276D5.1010706@isi.edu> Message-ID: <43030529.3050506@tuhh.de> Hi, >>what you are aiming at is SNR, i.e., 10log10(RTT/dRTT) So we are getting somewhere now :-) Right. SNR is the signal strength normalized to the noise strength. For dRTT=0, SNR=f(RTT/dRTT)=infinite. I considered "congestion" as the noise strength normalized to the signal strength. For dRTT=0, congestion signal based upon dRTT/RTT= f(dRTT/RTT)= zero = no congestion. So I reckon a congestion signal that looks like 1/(10log10(RTT/dRTT)) should do the trick. -- Sireen Joe Touch wrote: >-----BEGIN PGP SIGNED MESSAGE----- >Hash: SHA1 > > > >Sireen Habib Malik wrote: > > >>Hi, >> >>Have not read the paper, however, I think that if, >> >>RTT = Round Trip Time, and >>dRTT = variations in RTT, >> >>then "dRTT" is a weak/poor indicator of congestion. >> >> > >but a good indicator that congestion control will be hard to compute ;-) > >stability is f(dRTT), not f(RTT) > >RTT is a function of distance, in general >dRTT is a function of the number of hops, in general > >Changes in the two - relative or absolute - don't seem to tell you much >more than that, though. > > > >>A congestion signal based upon "dRTT/RTT" would give a much better idea, >>relatively speaking. >> >> > >relative variance = variance/mean > >but noise is more closely correlated to variance than to relative >variance, which makes sense if dRTT = variance > >what you are aiming at is SNR, i.e., 10log10(RTT/dRTT) > >Joe > > > >>-- >>Sireen >> >> >> >> >> >> >> >> >> >> >>Detlef Bosau wrote: >> >> >> >>>Hi to all. >>> >>>Recently, I found the following paper by Sherif M. ElRakabawy, >>>Alexander Klemm and Christoph Lindemann: >>> >>>http://mobicom.cs.uni-dortmund.de/publications/TCP-AP_MobiHoc05.pdf >>> >>>The paper proposes a congestion control algorithm for ad hoc networks. >>>Perhaps, this paper is interesting within the context of our latency >>>discussion. >>> >>>However, I?m not yet convinced of this work. >>> >>>If I leave out some sheets of paper, some simulations and many words, >>>the paper basically assumes that in ad hoc networks a TCP sender can >>>measurethe degree of network contention using the variance of >>>(recently seen) round trip times: >>> >>>-If the variance is close to zero, the network is hardly loaded. >>>-If the variance is "high" (of course "high" is to be defined) there >>>is a high degree of contention on this network. >>> >>>Afterwards the authors propose a sender pacing scheme, where a TCP >>>flow?s rate is decreased with respect to the so measured "degree of >>>contention". >>> >>>What I do not yet understand is basic assumption: variance 0 <=> no >>>load; variance high <=> heavy load. >>> >>>Perhaps the main difficulty is that I believed this myself for years >>>and it was an admittedly difficult task to convince me that I was >>>wrong %-) >>>However, >>> >>> @article{martin, >>> journal = " IEEE/ACM TRANSACTIONS ON NETWORKING", >>> volume ="11", >>> number = "3", >>> month = "June", >>> year = "2003", >>> title = "Delay--Based Congestion Avoidance for TCP", >>> author = "Jim Martin and Arne Nilsson and Injong Rhee", >>> } >>>eventually did the job. >>> >>>More precisely, I looked at the latencies themselves, not the variances. >>> >>> >>>Let?s consider a simple example. >>> >>> A network B >>> >>>"network" is some shared media packet switching network. 
>>>Let?s place a TCP sender on A and the according sink on B. >>> >>>The simple question is (and I thought about this years ago without >>>really coming to an end - I?m afraid I didn?t want to): >>> >>>Is a variance close to zero really equivalent for a low load situation? >>>And does increasing variance indicate increasing load? >>> >>>Isn?t it possible that a variance close to zero is a consequence of a >>>fully loaded network? And _decreasing_ load in that situation would >>>cause the latencies to vary? >>> >>>If we could reliably identify a low load situation from a varaince >>>close to zero, we could use the latencies themselves as a load >>>indicator because we could reliably identify a "no load latency" and >>>thus could identify imminent congestion by latency observation. >>> >>>One could even think of a "latency-congestion scale" which is >>>calibrated first by variance observation in order to get the >>>"unloaded" mark and second by drop observation and some loss >>>differentation technique to get the "imminent congestion" mark. >>> >>>To my knowledge, this is extensively discussed in literature - until >>>Martin, Nilsson and Rhee found the mentioned results. >>> >>>Now, back to my example and the basic question: Does the assumption, >>>latency variations indicate the degree of contention in an ad hoch >>>network, really hold? >>> >>>I admit, I personally do not yet see an evidence for this. >>> >>>Detlef >>> >>> >> >> >> >> >-----BEGIN PGP SIGNATURE----- >Version: GnuPG v1.2.4 (MingW32) >Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org > >iD8DBQFDAnbVE5f5cImnZrsRAtC9AKDYkULbaAz4y93+Ym5iIuv/rVZEWgCfW5vy >MELJpDvHjw5QDGjl4dDUtLU= >=lMcl >-----END PGP SIGNATURE----- > > -- M.Sc.-Ing. Sireen Malik Communication Networks Hamburg University of Technology FSP 4-06 (room 5.012) Schwarzenbergstrasse 95 (IVD) 21073-Hamburg, Deutschland Tel: +49 (40) 42-878-3443 Fax: +49 (40) 42-878-2941 E-Mail: s.malik at tuhh.de --Everything should be as simple as possible, but no simpler (Albert Einstein) From detlef.bosau at web.de Wed Aug 17 04:13:32 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 17 Aug 2005 13:13:32 +0200 Subject: [e2e] Latency Variation and Contention. References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> <4301A9CC.8090103@web.de> <4301BE7F.9080107@tuhh.de> <430276D5.1010706@isi.edu> <43030529.3050506@tuhh.de> Message-ID: <43031BDC.1000608@web.de> Your comments are both, helpful and enlightning. Nevertheless, please, allow me to re-focus the discussion. The assertion made by ElRakbawy, Klemm and Lindemann is: Ass.1: Network contention can be measured by measuring the RTT variance. A small variance is equivalent to a low degree of contention and a high variance is equivalent to a high degree of contention. Assertions like these can be met in literature several times and it?s simply the question whether this assertion is true or not. Personally, I am in great doubt at this. It?s exactly what David P. Reed pointed out some weeks ago. Before building brittle constructs upon questionable assertions, it is important to have a solid _basis_. Here in Germany, we have a saying: "Das Fundament ist die Grundlage jeglicher Basis." I don?t know whether there exists an english equivalent, but this makes the very difference whether a space shuttle pilot coming home is busy with landing or busy with prayer. 
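A quick Python sketch of the quantities being juggled in this sub-thread -- mean RTT, dRTT (read here as the standard deviation of recent samples), Sireen's relative figure dRTT/RTT, and Joe's SNR-style 10*log10(RTT/dRTT). The sample lists are invented purely for illustration; nothing below comes from the paper under discussion.

    import math

    def rtt_figures(samples):
        """Summarize a window of RTT samples (seconds)."""
        n = len(samples)
        mean = sum(samples) / n
        var = sum((x - mean) ** 2 for x in samples) / n
        drtt = math.sqrt(var)              # "dRTT" taken as the std deviation
        rel = drtt / mean                  # the dRTT/RTT figure
        snr = float("inf") if drtt == 0 else 10 * math.log10(mean / drtt)  # 10log10(RTT/dRTT)
        return mean, drtt, rel, snr

    # Illustrative samples only: a quiet path versus a jittery one.
    for label, s in [("quiet", [0.050, 0.051, 0.050, 0.052]),
                     ("jittery", [0.050, 0.120, 0.060, 0.200])]:
        m, d, r, snr = rtt_figures(s)
        print(f"{label}: RTT={m*1000:.1f} ms  dRTT={d*1000:.1f} ms  dRTT/RTT={r:.2f}  SNR={snr:.1f} dB")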
Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From touch at ISI.EDU Wed Aug 17 07:34:08 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed, 17 Aug 2005 07:34:08 -0700 Subject: [e2e] Latency Variation and Contention. In-Reply-To: <43030529.3050506@tuhh.de> References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> <4301A9CC.8090103@web.de> <4301BE7F.9080107@tuhh.de> <430276D5.1010706@isi.edu> <43030529.3050506@tuhh.de> Message-ID: <43034AE0.7070000@isi.edu> Sireen Habib Malik wrote: > Hi, > > >>>what you are aiming at is SNR, i.e., 10log10(RTT/dRTT) > > So we are getting somewhere now :-) > > Right. SNR is the signal strength normalized to the noise strength. For > dRTT=0, SNR=f(RTT/dRTT)=infinite. > > I considered "congestion" as the noise strength normalized to the signal > strength. For dRTT=0, congestion signal based upon dRTT/RTT= > f(dRTT/RTT)= zero = no congestion. You can consider it the noise ratio, but why? There are other reasons that RTT can vary - multipath routing, in particular. All SNR does here is tell you how noisy the RTT is, which tells you how good you can run your feedback control (which is RTT-dependent). It doesn't tell you whether there is congestion, though. There may be a correlation in some systems, but it's not cause-effect. There are too many other causes for noisy RTTs. Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050817/c433745c/signature.bin From touch at ISI.EDU Wed Aug 17 07:37:06 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed, 17 Aug 2005 07:37:06 -0700 Subject: [e2e] Latency Variation and Contention. In-Reply-To: <43031BDC.1000608@web.de> References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> <4301A9CC.8090103@web.de> <4301BE7F.9080107@tuhh.de> <430276D5.1010706@isi.edu> <43030529.3050506@tuhh.de> <43031BDC.1000608@web.de> Message-ID: <43034B92.9040206@isi.edu> Detlef Bosau wrote: > Your comments are both, helpful and enlightning. > > Nevertheless, please, allow me to re-focus the discussion. > > The assertion made by ElRakbawy, Klemm and Lindemann is: > > Ass.1: Network contention can be measured by measuring the RTT > variance. A small variance is equivalent to a low degree of contention > and a high variance is equivalent to a high degree of contention. > > Assertions like these can be met in literature several times and it?s > simply the question whether this assertion is true or not. > > Personally, I am in great doubt at this. As am I. Multipath routing can cause it, i.e. All you know when the RTT is noisy is that the RTT is noisy, and then that anything that depends on the RTT (e.g., the window size) is necessarily imprecise. > It?s exactly what David P. Reed pointed out some weeks ago. > Before building brittle constructs upon questionable assertions, it is > important to have a solid _basis_. > > Here in Germany, we have a saying: "Das Fundament ist die Grundlage > jeglicher Basis." I don?t know whether there exists an english > equivalent, but this makes the very difference whether a space shuttle > pilot coming home is busy with landing or busy with prayer. > > Detlef Ours is "correlation != cause & effect". Gets at the same point, at the end of the day. 
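To make the doubt (and the multipath remark) concrete, a deliberately artificial Python sketch: a bottleneck queue pinned near its limit produces almost constant, i.e. near-zero-variance, RTTs under heavy load, while an idle connection whose packets alternate over two paths of different length produces large variance with no contention at all. Every number here is invented; the point is only that variance by itself does not read off load.

    import statistics

    base = 0.040  # propagation component, seconds (invented)

    # Case 1: bottleneck queue held near its limit -> queueing delay almost constant.
    saturated = [base + 0.100 + jitter for jitter in (0.000, 0.001, 0.000, 0.001, 0.000)]

    # Case 2: idle network, but packets alternate over two paths (multipath routing).
    multipath = [base if i % 2 == 0 else base + 0.030 for i in range(6)]

    for name, rtts in [("saturated FIFO", saturated), ("idle, two paths", multipath)]:
        print(name, "mean %.1f ms" % (statistics.mean(rtts) * 1e3),
              "stdev %.1f ms" % (statistics.stdev(rtts) * 1e3))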
Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050817/1127d32f/signature.bin From faber at ISI.EDU Thu Aug 18 08:59:09 2005 From: faber at ISI.EDU (Ted Faber) Date: Thu, 18 Aug 2005 08:59:09 -0700 Subject: [e2e] Latency Variation and Contention. In-Reply-To: <43034B92.9040206@isi.edu> References: <387B5A9BF31B5D43A2B18DD9F326B8E1DA68AB@NJFPSRVEXG2KCL.research.att.com> <4301A9CC.8090103@web.de> <4301BE7F.9080107@tuhh.de> <430276D5.1010706@isi.edu> <43030529.3050506@tuhh.de> <43031BDC.1000608@web.de> <43034B92.9040206@isi.edu> Message-ID: <20050818155909.GC14126@pun.isi.edu> On Wed, Aug 17, 2005 at 07:37:06AM -0700, Joe Touch wrote: > Detlef Bosau wrote: > > Here in Germany, we have a saying: "Das Fundament ist die Grundlage > > jeglicher Basis." I don?t know whether there exists an english > > equivalent, but this makes the very difference whether a space shuttle > > pilot coming home is busy with landing or busy with prayer. > > Ours is "correlation != cause & effect". Gets at the same point, at the > end of the day. You're basically right, but lets be a little more precise. In any network that queues packets, in the absence of any other effects, the onset of congestion will result in an increase in the RTT of a given connection sampled over an RTT. This is a causal relationship: congestion causes RTT increases. There are at least 3 problems with using that observation to detect congestion: 1. Lots of other things (OS artifacts, route changes, wireless delays, ARQ) cause RTT variation. Just as with using packet loss as a congestion indication, a mistaken inference can cause a source to slow when unnecessary or speed up when unwarranted. All congestion causes RTT increases; not all increased RTTs indicate congestion. 2. Sometimes the change caused by congestion is too small to be reliably detected, even without the noise sources above. This can be because there are a lot of sources in a net near capacity or a lot of fixed delay on the path (queueing delay Earth to Mars might be hard to detect). Small queues also make this difficult, and if the recent SIGCOMM work on sizing routers is to be believed, small buffer sizes may become more common. 3. The queueing discipline in use can make detection of congestion related RTT increases, even without confounding noise in that signal and when the change is detectable, a matter of statistics. The amount of change in RTT that a source sees will be affected by how other packets are interleaved. A source can detect a small change in RTT in a byte-fair WFQ system much more quickly and reliably than in a FIFO system with varying packet sizes. Having to sample and analyze increases the work the sender does and slows the reaction time of sources. Certainly many systems have been proposed that ise RTT as a congestion indication, from Vegas through FAST to a bunch I've certainly lost track of. To use it as the only indication, successfully, in a rich network environment, requires addressing at least the problems above. There are also cases where the network environment is less rich and you can rule one or more of these out. Congestion causes RTT increases. Finding those RTT increases that are due to congestion can be tricky. -- Ted Faber http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc Unexpected attachment on this mail? 
See http://www.isi.edu/~faber/FAQ.html#SIG -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050818/4c027bfe/attachment.bin From keshav at uwaterloo.ca Thu Aug 18 09:15:46 2005 From: keshav at uwaterloo.ca (S. Keshav) Date: Thu, 18 Aug 2005 12:15:46 -0400 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 In-Reply-To: Message-ID: Detlef, >> The assertion made by ElRakbawy, Klemm and Lindemann is: >> >> Ass.1: Network contention can be measured by measuring the RTT >> variance. A small variance is equivalent to a low degree of contention >> and a high variance is equivalent to a high degree of contention. ... >> Personally, I am in great doubt at this. > RTT delay is influenced by the following factors: 1. Speed of light delay in the path 2. Retransmissions in the underlay 3. Queues in buffers due to a. self queueing (queueing behind your own packets) b. queueing due to cross traffic 4. The service rate of within a switch fabric in a router 5. The size of the packet whose RTT is measured Variance in the RTT can be due to variation in any of the above. So, if you want to measure contention, you have to do some things cleverly at the sender: keep packet size fixed send at a `slow' rate and also assume that paths are pinned there are no retransmissions in the underlay If these hold, then you can link RTT variation to contention. keshav From alokdube at hotpop.com Thu Aug 18 10:29:06 2005 From: alokdube at hotpop.com (Alok) Date: Thu, 18 Aug 2005 22:59:06 +0530 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 References: Message-ID: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> > >> Personally, I am in great doubt at this. > > > > RTT delay is influenced by the following factors: > > 1. Speed of light delay in the path > 2. Retransmissions in the underlay > 3. Queues in buffers due to > a. self queueing (queueing behind your own packets) > b. queueing due to cross traffic Do routers/ATM switches use queues for "congestion control" or because most of their cards and backplanes are asynchronous? From touch at ISI.EDU Thu Aug 18 13:23:42 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 18 Aug 2005 13:23:42 -0700 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 In-Reply-To: References: Message-ID: <4304EE4E.7070804@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 S. Keshav wrote: > Detlef, > > >>>The assertion made by ElRakbawy, Klemm and Lindemann is: >>> >>>Ass.1: Network contention can be measured by measuring the RTT >>>variance. A small variance is equivalent to a low degree of contention >>>and a high variance is equivalent to a high degree of contention. > > ... > >>>Personally, I am in great doubt at this. >> > > RTT delay is influenced by the following factors: > > 1. Speed of light delay in the path > 2. Retransmissions in the underlay > 3. Queues in buffers due to > a. self queueing (queueing behind your own packets) > b. queueing due to cross traffic > 4. The service rate of within a switch fabric in a router > 5. The size of the packet whose RTT is measured > > Variance in the RTT can be due to variation in any of the above. 
> So, if you want to measure contention, you have to do some things cleverly > at the sender: > keep packet size fixed > send at a `slow' rate > and also assume that > paths are pinned > there are no retransmissions in the underlay and that the underlay hops have stable RTTs; non-geosync satellites have varying RTTs and the points about pinning and retransmissions apply to the link layers as well as to the network. > If these hold, then you can link RTT variation to contention. Yes - but when RTT variance goes up, it means that contention increased or decreased. It seems more useful to use the first derivative of the RTT than to use the variance, in that case. Joe > > keshav > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDBO5OE5f5cImnZrsRAkxsAJ4hffBVMajFvgKyj/3wEiSD/pEcSgCg8QNr jG//2Tuzz/lXCY6ZgMt2XWM= =Gmt9 -----END PGP SIGNATURE----- From touch at ISI.EDU Thu Aug 18 13:25:41 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 18 Aug 2005 13:25:41 -0700 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 In-Reply-To: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> References: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> Message-ID: <4304EEC5.4050300@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Alok wrote: >>>>Personally, I am in great doubt at this. >>> >>RTT delay is influenced by the following factors: >> >>1. Speed of light delay in the path >>2. Retransmissions in the underlay >>3. Queues in buffers due to >> a. self queueing (queueing behind your own packets) >> b. queueing due to cross traffic > > Do routers/ATM switches use queues for "congestion control" or because most > of their cards and backplanes are asynchronous? it depends on where the queues are: input queues help more for asynch backplanes/cards, as well as forwarding-based congestion (limits to header processing, e.g., for VPNs terminating IPsec) output queues are needed for output port contention congestion control, i.e., where the output link is the limiting factor Joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDBO7FE5f5cImnZrsRAmdoAJ45G5GBo1JicYaRFo6ZQaAMm2eCOACgq66y zryzGiA8BDVfnDi//zugQM0= =ydER -----END PGP SIGNATURE----- From detlef.bosau at web.de Thu Aug 18 14:15:51 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 18 Aug 2005 23:15:51 +0200 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 References: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> Message-ID: <4304FA87.70108@web.de> Alok wrote: >>>>Personally, I am in great doubt at this. >>> >>RTT delay is influenced by the following factors: >> >>1. Speed of light delay in the path >>2. Retransmissions in the underlay >>3. Queues in buffers due to >> a. self queueing (queueing behind your own packets) >> b. queueing due to cross traffic > > > Do routers/ATM switches use queues for "congestion control" or because most > of their cards and backplanes are asynchronous? > > > To my understanding, queues have two purposes. 1. Rate adaptation, this includes adaptation of a flow to possible MAC delays. 2. Interleaving/Mixing of flows. Basically, these two are 3a and 3b in Keshav?s post. So, to answer your question: In a packet switching system congestion takes place in queues of store & forward nodes, especially when incoming and outgoing lines are asynchronous. I?m hesitant to make too much words here, because each word may be wrong. 
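For the textbook case the link between load and delay statistics can at least be written down. Under M/M/1 assumptions (Poisson arrivals, exponential service, one FIFO queue -- assumptions, not a claim about real routers), the sojourn time is exponentially distributed with rate mu - lambda, so its mean and its standard deviation both grow like 1/(mu - lambda) as utilisation rises. A tiny Python sketch:

    MU = 1000.0  # service rate, packets/s (illustrative)

    def mm1_sojourn(rho):
        """Mean and std deviation of time in system for an M/M/1 queue."""
        lam = rho * MU
        mean = 1.0 / (MU - lam)   # E[T] = 1/(mu - lambda)
        std = mean                # exponential distribution: std equals the mean
        return mean, std

    for rho in (0.1, 0.5, 0.9, 0.99):
        mean, std = mm1_sojourn(rho)
        print(f"rho={rho:4.2f}  E[T]={mean*1e3:6.2f} ms  std={std*1e3:6.2f} ms")

Note that this supports the intuition behind Ass.1 only under exactly these assumptions; a finite buffer pinned at its limit, multipath routing or link-layer ARQ all break the correspondence, which is what the thread is arguing about.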
A very helpful rerefence is Raj Jain?s paper "A Delay-Based Approach for Congestion Avoidance in Interconnected Heterogenous Computer Networks". I always found this work helpful to understand the role of switches/routers. For congestion control itself, there are two "extreme positions" and, as in most cases where extreme positions exist, combinationes and middle courses. The first position is a strict End to End approach: Routers don?t care about congestion. If a queue runs out of space, there?s no alternative left for a router than to discard a packet. In this extreme view: _Silently_ discard a packet. Consequently, end systems must react upon packet loss / congestion notification appropriately. Look at the congavoid paper for this approach. The second position is a continous control of each flow hop by hop. Spoken very simplified: We do traffic shaping on each node, in a well controlled manner. I think (I must be careful here, I had a glance at this quite a long time ago, so forgive me if I?m wrong or unprecise here) this appproach is discussed in Keshav?s PhD thesis. If we take the second position: Yes, routers and switches use queues for congestion control. For middle courses and approaches "in between" think of active queue managemet and RED. And of course quite a number of PEP approaches, which often interconnect packet switching networks where congestion control is difficult to achieve using identical algorithms, e.g. (error-)loss free networks and lossy networks as for example 802.11 networks. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From alokdube at hotpop.com Thu Aug 18 23:59:03 2005 From: alokdube at hotpop.com (Alok) Date: Fri, 19 Aug 2005 12:29:03 +0530 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 References: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> <4304FA87.70108@web.de> Message-ID: <009e01c5a48b$7eb6b640$070218ac@rs.riverstonenet.com> Inline => ----- Original Message ----- From: "Detlef Bosau" To: Sent: Friday, August 19, 2005 2:45 AM Subject: Re: [e2e] end2end-interest Digest, Vol 18, Issue 9 > Alok wrote: > >>>>Personally, I am in great doubt at this. > >>> > >>RTT delay is influenced by the following factors: > >> > >>1. Speed of light delay in the path > >>2. Retransmissions in the underlay > >>3. Queues in buffers due to > >> a. self queueing (queueing behind your own packets) > >> b. queueing due to cross traffic > > > > > > Do routers/ATM switches use queues for "congestion control" or because most > > of their cards and backplanes are asynchronous? > > > > > > > > > To my understanding, queues have two purposes. > > 1. Rate adaptation, this includes adaptation of a flow to possible MAC > delays. Which means you have a bandwidth gradient and you buffer to handle the gradient. Which again means you have to "work on windows" and "buffer for windows" as far as TCP is concerned Simply put: ------->10Mbps---->R1----->1Mbps---> Implies R1 has to buffer , and the buffer size can be *finite* only if the traffic has a window/burst size is finite. > > 2. Interleaving/Mixing of flows. > Let me put the question in a simpler manner, assume no TOS/DSCP, why does one need queues at all???? The only time you can do a buffer is if there is an window on top capping ur burst For example, if the 10Meg guy pumps UDP at 10Meg continuously, no amount of buffering is going to help you on the 1Meg link. 
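The 10 Mbps -> R1 -> 1 Mbps picture can be put into numbers. For one window-limited flow, whatever part of the window does not fit into the slow link's pipe stands in R1's queue, so a buffer of roughly W minus the bottleneck's bandwidth-delay product suffices -- finite exactly because W is finite. A back-of-the-envelope Python sketch; all figures are assumed for illustration:

    C_out = 1e6          # bottleneck rate, bit/s
    rtt_prop = 0.05      # propagation RTT, s (assumed)
    bdp = C_out * rtt_prop / 8.0          # bytes the 1 Mbps pipe itself can hold

    for window in (8_000, 64_000, 256_000):          # sender window, bytes
        standing_queue = max(0.0, window - bdp)      # bytes parked at R1
        queue_delay = standing_queue * 8.0 / C_out   # extra delay they cause
        print(f"W={window/1000:5.0f} kB  queue at R1={standing_queue/1000:6.1f} kB"
              f"  added delay={queue_delay*1e3:6.0f} ms")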
As far as I understand, queues are only as the inherent architecture is async, Say I have 1M----| |--------1M 1M----| switching element |--------1M 1M----| |--------1M all my switching element needs to be able to do is to switch at round robin at 6*1M ...right? Now where and why do I need the queues? Only reason that comes to mind is the async. nature (each 1M is not clocked by the same clock etc), but the queue size still does not need to be that high, does it? -thanks Alok From detlef.bosau at web.de Fri Aug 19 06:27:31 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2005 15:27:31 +0200 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 References: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> <4304FA87.70108@web.de> <009e01c5a48b$7eb6b640$070218ac@rs.riverstonenet.com> Message-ID: <4305DE43.6060305@web.de> Alok wrote: >>To my understanding, queues have two purposes. >> >>1. Rate adaptation, this includes adaptation of a flow to possible MAC >>delays. > > > Which means you have a bandwidth gradient and you buffer to handle the > gradient. Yes. > Which again means you have to "work on windows" and "buffer for windows" as > far as TCP is concerned > > Simply put: > ------->10Mbps---->R1----->1Mbps---> > > Implies R1 has to buffer , and the buffer size can be *finite* only if the > traffic has a window/burst size is finite. > > Yes. It?s interesting. Some weeks ago, I got criticism why I beat the conservation principle drum here. taram! taram! tataram! I beat the con-principle drum! I can only repeat it again and again: Exactly _this_ is the purporse of ACK pacing and the conservation principle in TCP. A flow must not have more packets in transit than the congestion window allows (the "equilibrium window") and a packet must not be sent to the network until some other packet was taken away. _This_ and nothing else limits the "energy" put into the network (the analogy to physics is obvious: We talk about energy conservation, impulse conservation, sometimes I think, Van Jacobson and Sir Isaac are best friends :-)) and hence bursts, oscillation etc. are limited. Recall the Takoma bridge disaster, make the wind to stop blowing - the Takoma bridge may oscillate to eternity, but at least it was still there. > >>2. Interleaving/Mixing of flows. >> > > > Let me put the question in a simpler manner, > > assume no TOS/DSCP, why does one need queues at all???? The simple answer is: We do not need them. The more complex answer can be found e.g. in Jains "Delay" paper: Limited queues with a length thoroughly thought through can improve network performance. > The only time you can do a buffer is if there is an window on top capping ur > burst > Not quite. Think of RED. > For example, if the 10Meg guy pumps UDP at 10Meg continuously, no amount of > buffering is going to help you on the 1Meg link. > But this guy is really misbehaved: He is not responsive. Responsiveness is no part of UDP. Therefore, the application is responsible for responsiveness here. Admittedly, people forget about this quite often. It?s not an academic example, but on the support newsgroup of my ISP some guys recently detected ping. Ping. PING. PIIIIIIIIIIIIIIIIIIIIII............... ..................................... ................................................................................... Oh, you miss the rest of my post? The reason is simple: "NG" is yet to come. So, once again I take my drum, taram, taram, tataram..... Perhaps I can join a parade? 
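Since RED was mentioned a few lines up: the core of the idea is an EWMA of the queue length plus a drop (or mark) probability that ramps up between two thresholds, so senders are nudged before the queue actually overflows. A stripped-down Python sketch -- the parameters are common illustrative values, and the count-based probability correction of the original RED paper is left out:

    import random

    class SimpleRED:
        """Stripped-down RED: EWMA of the queue length plus a ramped drop probability."""
        def __init__(self, min_th=5, max_th=15, max_p=0.1, w_q=0.002):
            self.min_th, self.max_th, self.max_p, self.w_q = min_th, max_th, max_p, w_q
            self.avg = 0.0

        def arrive(self, queue_len):
            """Return True if the arriving packet should be dropped (or marked)."""
            self.avg = (1 - self.w_q) * self.avg + self.w_q * queue_len
            if self.avg < self.min_th:
                return False                  # short on average: accept
            if self.avg >= self.max_th:
                return True                   # persistently long: drop
            p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
            return random.random() < p        # in between: drop with ramped probability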
The Internet still remains a well behaved community. Some administrators block ping. The guys on my ISP?s newsgroup call those administrators bad guys. I recall: "Good fences make good neighbours". > As far as I understand, queues are only as the inherent architecture is > async, Even that would not _require_ a queue. Think of Ethernet. What else is a "congestion" than a "collision", when there is no queueing on the router? So, if we had no queues, the Internet would run. Perhaps the throughput could be somewhat higher, perhaps the way the Internet runs would be more similar to a turtle than to Achilles - but who cares? Isn?t there still snail mail delievered sent by soldiers who served with General Custer? However, too large a queue can have the same effect. > > Say I have > > 1M----| |--------1M > 1M----| switching element |--------1M > 1M----| |--------1M > > > all my switching element needs to be able to do is to switch at round robin > at 6*1M ...right? Right. > Now where and why do I need the queues? Only reason that comes to mind is > the async. nature (each 1M is not clocked by the same clock etc), but the > queue size still does not need to be that high, does it? Exactly. And even no queuing (called "cut through switching" in the good old days from the past) would work. But then, packets arriving at the switch at the same time would result in the same effect as collisions. However, this debate was conducted in the eighties. So, I?m curious why some people buried tons of queueing memory in routers during the last ten years (perhaps the disaster in Cobe was overcome and now there was some amount of memory chips to be sold) and recently, researchers detect that small queues could be useful. Queues should be small. IIRC, this is exactly what John Nagle, Raj Jain and perhaps countless others told us twenty years ago. However, in extremely asnchronous situations, think of mobile wireless networks connected to the Internet, a reasonable amount of queuing is unevitable. I got a paper submission rejected this year with the enlightning comment "overqueing is bad, refer to Reiner Ludwigs PhD dissertation". I know Reiner Ludwigs PhD dissertation. When he claims, overqueueing is bad, he is perfectly right as all the researchers before. It?s really an old story. However, when service times oscillate from milliseconds to _minutes_(!) at the last mile (refer to the relevant ETSI/ITU standards for GPRS before calling me nuts), traffic might happen to be a little bursty if not equalized by queues and appropriate techniques. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From alokdube at hotpop.com Fri Aug 19 11:22:46 2005 From: alokdube at hotpop.com (Alok) Date: Fri, 19 Aug 2005 23:52:46 +0530 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 References: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> <4304FA87.70108@web.de> <009e01c5a48b$7eb6b640$070218ac@rs.riverstonenet.com> <4305DE43.6060305@web.de> Message-ID: <010f01c5a4eb$020132f0$6401a8c0@rs.riverstonenet.com> ok..either u have been smoking stuff.. or this is one helluva email :o) ----- Original Message ----- From: "Detlef Bosau" To: "Alok" Cc: Sent: Friday, August 19, 2005 6:57 PM Subject: Re: [e2e] end2end-interest Digest, Vol 18, Issue 9 Alok wrote: >>To my understanding, queues have two purposes. >> >>1. Rate adaptation, this includes adaptation of a flow to possible MAC >>delays. 
> > > Which means you have a bandwidth gradient and you buffer to handle the > gradient. Yes. > Which again means you have to "work on windows" and "buffer for windows" as > far as TCP is concerned > > Simply put: > ------->10Mbps---->R1----->1Mbps---> > > Implies R1 has to buffer , and the buffer size can be *finite* only if the > traffic has a window/burst size is finite. > > Yes. It?s interesting. Some weeks ago, I got criticism why I beat the conservation principle drum here. taram! taram! tataram! I beat the con-principle drum! I can only repeat it again and again: Exactly _this_ is the purporse of ACK pacing and the conservation principle in TCP. Alok=> okie! A flow must not have more packets in transit than the congestion window allows (the "equilibrium window") and a packet must not be sent to the network until some other packet was taken away. Alok=> ahh!! and how do we "know that"?? _This_ and nothing else limits the "energy" put into the network (the analogy to physics is obvious: We talk about energy conservation, impulse conservation, sometimes I think, Van Jacobson and Sir Isaac are best friends :-)) and hence bursts, oscillation etc. are limited. Alok=> ? so? Recall the Takoma bridge disaster, make the wind to stop blowing - the Takoma bridge may oscillate to eternity, but at least it was still there. Alok=> :-) if u can find the freq, it will still beat! > >>2. Interleaving/Mixing of flows. >> > > > Let me put the question in a simpler manner, > > assume no TOS/DSCP, why does one need queues at all???? The simple answer is: We do not need them. The more complex answer can be found e.g. in Jains "Delay" paper: Limited queues with a length thoroughly thought through can improve network performance. Alok=> My ability to read is limited. > The only time you can do a buffer is if there is an window on top capping ur > burst > Not quite. Think of RED. Alok==> how so? > For example, if the 10Meg guy pumps UDP at 10Meg continuously, no amount of > buffering is going to help you on the 1Meg link. > But this guy is really misbehaved: He is not responsive. Responsiveness is no part of UDP. Therefore, the application is responsible for responsiveness here. Admittedly, people forget about this quite often. It?s not an academic example, but on the support newsgroup of my ISP some guys recently detected ping. Ping. PING. PIIIIIIIIIIIIIIIIIIIIII............... ..................................... ............................................................................ ....... Oh, you miss the rest of my post? The reason is simple: "NG" is yet to come. So, once again I take my drum, taram, taram, tataram..... Perhaps I can join a parade? The Internet still remains a well behaved community. Alok=> no doubts about that ;-) Some administrators block ping. The guys on my ISP?s newsgroup call those administrators bad guys. I recall: "Good fences make good neighbours". Alok=> good chics too... > As far as I understand, queues are only as the inherent architecture is > async, Even that would not _require_ a queue. Think of Ethernet. What else is a "congestion" than a "collision", when there is no queueing on the router? Alok=> depends. A collision is the inablity to send something due to a media limitation, and *note*, the end host "orginiating" the packet experinces it in the case of collision So, if we had no queues, the Internet would run. Perhaps the throughput could be somewhat higher, perhaps the way the Internet runs would be more similar to a turtle than to Achilles - but who cares? 
Isn?t there still snail mail delievered sent by soldiers who served with General Custer? However, too large a queue can have the same effect. Alok=> define "too large" > > Say I have > > 1M----| |--------1M > 1M----| switching element |--------1M > 1M----| |--------1M > > > all my switching element needs to be able to do is to switch at round robin > at 6*1M ...right? Right. > Now where and why do I need the queues? Only reason that comes to mind is > the async. nature (each 1M is not clocked by the same clock etc), but the > queue size still does not need to be that high, does it? Exactly. And even no queuing (called "cut through switching" in the good old days from the past) would work. But then, packets arriving at the switch at the same time would result in the same effect as collisions. Alok=> ok. where would you "drop" them is the fundamental question. on an IS? then uve already wasted b/w and queues of an IS for no reason (remember...everything is e2e) However, this debate was conducted in the eighties. So, I?m curious why some people buried tons of queueing memory in routers during the last ten years (perhaps the disaster in Cobe was overcome and now there was some amount of memory chips to be sold) and recently, researchers detect that small queues could be useful. Alok=>$$ is a good reason ;-) Queues should be small. IIRC, this is exactly what John Nagle, Raj Jain and perhaps countless others told us twenty years ago. Alok=> Yep except i lost a bit on nagle's theorem when he kinda didnt wrap around the window. However, in extremely asnchronous situations, think of mobile wireless networks connected to the Internet, a reasonable amount of queuing is unevitable. Alok=> :-) they are good to steal other's passwds when sitting at an airport with nothing to do :-) I got a paper submission rejected this year with the enlightning comment "overqueing is bad, refer to Reiner Ludwigs PhD dissertation". I know Reiner Ludwigs PhD dissertation When he claims, overqueueing is bad, he is perfectly right as all the researchers before. It?s really an old story. Alok=> yep................ but wrap around the window..right? However, when service times oscillate from milliseconds to _minutes_(!) at the last mile (refer to the relevant ETSI/ITU standards for GPRS before calling me nuts), traffic might happen to be a little bursty if not equalized by queues and appropriate techniques. Alok=> My inability to read does wonders... ;-) Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From detlef.bosau at web.de Fri Aug 19 14:05:45 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2005 23:05:45 +0200 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 References: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> <4304FA87.70108@web.de> <009e01c5a48b$7eb6b640$070218ac@rs.riverstonenet.com> <4305DE43.6060305@web.de> <010f01c5a4eb$020132f0$6401a8c0@rs.riverstonenet.com> Message-ID: <430649A9.3030007@web.de> Alok wrote: > > > A flow must not have more packets in transit than the congestion window > allows (the "equilibrium window") and a packet must not be sent to the > network until some other packet was taken away. > > Alok=> ahh!! and how do we "know that"?? A sender knows this from the acknowledgements. 
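The "how do we know that" / "from the acknowledgements" exchange is self-clocking, and it fits in a few lines. A toy Python sender that refuses to put a new packet into the network until an acknowledgement has taken an old one out; cwnd and the transmit callback are placeholders, not an implementation of TCP:

    class SelfClockedSender:
        """Toy illustration of the conservation principle / ACK clocking."""
        def __init__(self, cwnd):
            self.cwnd = cwnd          # equilibrium window, in packets
            self.in_flight = 0        # packets sent but not yet acknowledged
            self.next_seq = 0

        def try_send(self, transmit):
            # Send only while the window has room: one in, one out.
            while self.in_flight < self.cwnd:
                transmit(self.next_seq)
                self.next_seq += 1
                self.in_flight += 1

        def on_ack(self, transmit):
            # An ACK removes a packet from the network and frees one slot.
            self.in_flight -= 1
            self.try_send(transmit)

    sender = SelfClockedSender(cwnd=4)
    sender.try_send(lambda seq: print("send", seq))   # bursts out the initial window
    sender.on_ack(lambda seq: print("send", seq))     # each ACK clocks out one more packet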
> > _This_ and nothing else limits the "energy" put into the network (the > analogy to physics is obvious: We talk about energy conservation, > impulse conservation, sometimes I think, Van Jacobson and Sir Isaac are > best friends :-)) and hence bursts, oscillation etc. are limited. > > Alok=> ? so? > > > Recall the Takoma bridge disaster, make the wind to stop blowing - the > Takoma bridge may oscillate to eternity, but at least it was still there. > > Alok=> :-) if u can find the freq, it will still beat! > > So what? As long as it does not _break_ it may beat! As long as we have no congestion collapse, there is no problem with queue oscillation. Of course, there may be a problem with RTT estimation, which was the original topic of this thread. However, when we have small queues and perhaps queueing delays turn to be neglectible compared to propagation delays, RTT estimation becomes easier than today. > > The more complex answer can be found e.g. in Jains "Delay" paper: > Limited queues with a length thoroughly thought through can improve > network performance. > > > Alok=> My ability to read is limited. I apologize. Perhaps we should send you posts in mp3 format? =8-) I admit, I often write too long posts. However, the issue is extremely difficult. So, i can?t put too short. (Recall Sireens signature and the Einstein quote.) > > > Not quite. Think of RED. > > > Alok==> how so? Some RED disciplines randomly discard packets even when there is no actual queue overun in order to limit oscillation and increase stability. > > Even that would not _require_ a queue. Think of Ethernet. What else is a > "congestion" than a "collision", when there is no queueing on the router? > > Alok=> depends. A collision is the inablity to send something due to a media > limitation, and *note*, the end host "orginiating" the packet experinces it > in the case of collision Not quite. Recall Davids recent post. In 802.11 ad hoc nets a collision results in a silent "discard" exactly as a congestion. This perfectly makes sense: Both, a media limitation and a queue limitation, is a limitation. Some part of the network can not convey the incoming packet. > > > So, if we had no queues, the Internet would run. Perhaps the throughput > could be somewhat higher, perhaps the way the Internet runs would be > more similar to a turtle than to Achilles - but who cares? Isn?t there > still snail mail delievered sent by soldiers who served with General Custer? > > However, too large a queue can have the same effect. > > Alok=> define "too large" > That?s the million dollar question. Especially as a TCP window is limited to 64 kBytes by default. However, if one would follow the "advice" of some "bright" network consultant I read recently, we should play around with window scaling in LANs to improve performance (God in Heaven!). Imagine a TCP sender scaled to AWND units of 1 Megabyte (we will _really_ imrpove performance). So imagine, a TCP sender has an actual window of 2 Megabyte and a router would support this. We would introduce a single trip e2e latency of nearly one second here - from one floor in a building to the other. This is not really what we want to do. In addition, in practical networks the vast majority of flows are short timed flows, so a routers memory is not occupied because there is not enough data in the flow. Hoever, theoretically (refer e.g to Jains paper) too large a buffer can simply bring down a flow?s throughput to _zero_. 
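The window-scaling example is easy to redo numerically: whatever part of the window exceeds the path's bandwidth-delay product becomes a standing queue, and the delay it adds is that excess divided by the bottleneck rate. A sketch with assumed rates (the 2 MB window is the figure from the example above):

    def added_delay(window_bytes, rate_bps, rtt_prop_s):
        """Extra queueing delay caused by a window larger than the pipe."""
        bdp = rate_bps * rtt_prop_s / 8.0               # bytes the path itself holds
        excess = max(0.0, window_bytes - bdp)           # bytes forced into the queue
        return excess * 8.0 / rate_bps                  # seconds of standing delay

    W = 2 * 1024 * 1024                                 # 2 MB window, as in the example
    for rate in (10e6, 100e6):                          # assumed bottleneck rates
        d = added_delay(W, rate, rtt_prop_s=0.001)      # ~1 ms propagation, e.g. a LAN
        print(f"{rate/1e6:.0f} Mb/s: about {d:.2f} s of standing queueing delay")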
This is extremely hard to imagine: A sender?s window may increase beyond all limits, so does a bottleneck queue and so the time for a packet to stay in the queue may increase beyound all limits as well. I must correct the above. It?s not the infinite queueing space which brings the flow to the ground but the _window_ size. But this exactly results from unlimited queueing space if you don?t put an upper limit to a TCP sender?s window. To put Jain and Nagle short: They investigated the behaviour of packet switching networks with unlimited queues - and came to the advice: Make the queues short. > > I got a paper submission rejected this year with the enlightning comment > "overqueing is bad, refer to Reiner Ludwigs PhD dissertation". > I know Reiner Ludwigs PhD dissertation > When he claims, overqueueing is bad, he is perfectly right as all the > researchers before. It?s really an old story. > > > Alok=> yep................ but wrap around the window..right? I lost you. BTW: I do not talk about "Nagles algorithm" here but primarily of papers like: "On Packet Switches With Infinite Storage" from 1987. So basically, we do not even talk about TCP here. > > However, when service times oscillate from milliseconds to _minutes_(!) > at the last mile (refer to the relevant ETSI/ITU standards for GPRS > before calling me nuts), traffic might happen to be a little bursty if > not equalized by queues and appropriate techniques. > > > Alok=> My inability to read does wonders... ;-) I see. But my posts are a good practice. ITU standards are _much_ longer :-) Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From alokdube at hotpop.com Fri Aug 19 14:26:50 2005 From: alokdube at hotpop.com (Alok) Date: Sat, 20 Aug 2005 02:56:50 +0530 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 References: <022c01c5a41a$5878e4b0$6401a8c0@rs.riverstonenet.com> <4304FA87.70108@web.de> <009e01c5a48b$7eb6b640$070218ac@rs.riverstonenet.com> <4305DE43.6060305@web.de> <010f01c5a4eb$020132f0$6401a8c0@rs.riverstonenet.com> <430649A9.3030007@web.de> Message-ID: <000801c5a504$b80fc2a0$6401a8c0@rs.riverstonenet.com> > > Alok=> My inability to read does wonders... ;-) I see. But my posts are a good practice. ITU standards are _much_ longer :-) Alok==> U win! From detlef.bosau at web.de Sat Aug 20 08:57:18 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Sat, 20 Aug 2005 17:57:18 +0200 Subject: [e2e] end2end-interest Digest, Vol 18, Issue 9 References: Message-ID: <430752DE.DFEA69BF@web.de> "S. Keshav" wrote: > > > RTT delay is influenced by the following factors: > > 1. Speed of light delay in the path > 2. Retransmissions in the underlay > 3. Queues in buffers due to > a. self queueing (queueing behind your own packets) > b. queueing due to cross traffic > 4. The service rate of within a switch fabric in a router > 5. The size of the packet whose RTT is measured > > Variance in the RTT can be due to variation in any of the above. > So, if you want to measure contention, you have to do some things cleverly > at the sender: > keep packet size fixed > send at a `slow' rate > and also assume that > paths are pinned > there are no retransmissions in the underlay > > If these hold, then you can link RTT variation to contention. > > keshav Just to see, whether I understood you correctly. The packet size is fixed => serialization delay is constand and hopefully (nearly) the service times. No retransmissions and pinned paths are clear. 
Slow rate => There is no self queueing, any queuing is due to cross traffic. In other terms: You make sure that any RTT variation is only due to cross traffic. Right? Now, even the "low rate" requires explicit knowledge of the network and can hardly achieved along an unknown path. In addition, "cross traffic" may not be "cross traffic" but in fact _traffic_. On the street. Thinks like cars, motorcycles. As traffic signs, buildings etc., this influences the properties of a wireless channel. At least in a mobile wireless network, this is the reason why error recovery in the underlay is inevitable. So, I presume you basically agree that using RTT variation as a universal means for contention estimation is at least questionable. Is this correct? IIRC, the paper from Lindemann?s group does not mention mobility. However, I don?t remember a paper or talk, where ad hoc net users are supposed to stay in quiet and motionless medidation. Anybody is interesed in mobile ad hoc networks today. So, I would like to sharpen my question a bit: Can this approach be made to work with reasonable effort? Or should it be abandoned, beause it is not really promising? This is a hard question, I know. But for horse?s and rider?s benefit still the old saying holds true: "If you discover that you?re riding a dead horse, dismount." Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From david.hagel at gmail.com Sun Aug 21 15:15:08 2005 From: david.hagel at gmail.com (David Hagel) Date: Sun, 21 Aug 2005 18:15:08 -0400 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: References: Message-ID: I was wondering what are the typical coast-to-coast propagation and queuing delays observed by today's backbone networks in North America. Is there any data/study which provides a breakdown of different components of such end-to-end delays in today's backbone networks? Thanks, David From dpreed at reed.com Sun Aug 21 20:44:54 2005 From: dpreed at reed.com (David P. Reed) Date: Sun, 21 Aug 2005 23:44:54 -0400 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: References: Message-ID: <43094A36.1040402@reed.com> I can repeatably easily measure 40 msec. coast-to-coast (Boston-LA), of which around 25 msec. is accounted for by speed of light in fiber (which is 2/3 of speed of light in vacuum, *299,792,458 m s^-1 *, because the refractive index of fiber is approximately 1.5 or 3/2). So assume 2e8 m/s as the speed of light in fiber, 1.6e3 m/mile, and you get 1.25e5 mi/sec. The remaining 15 msec. can be accounted for by the fiber path not being straight line, or by various "buffering delays" (which include queueing delays, and scheduling delays in the case where frames are scheduled periodically and you have to wait for the next frame time to launch your frame). Craig Partridge and I have debated (offline) what the breakdown might actually turn out to be (he thinks the total buffering delay is only 2-3 msec., I think it's more like 10-12), and it would be quite interesting to get more details, but that would involve delving into the actual equipment deployed and its operating modes. From mallman at icir.org Fri Aug 19 08:46:49 2005 From: mallman at icir.org (Mark Allman) Date: Fri, 19 Aug 2005 11:46:49 -0400 Subject: [e2e] pam 2006 cfp Message-ID: <20050819154649.CD7B3335C1D@lawyers.icir.org> An embedded and charset-unspecified text was scrubbed... 
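David Reed's budget, redone as a few lines of Python. The fibre speed (about 2e8 m/s, i.e. 1.25e5 mi/s) and the 40 ms figure are taken from his post; the 3,000-mile route length is an assumption, and with it the split comes out close to the 25 ms / 15 ms he quotes:

    SPEED_MI_PER_S = 1.25e5     # mi/s of light in fibre, as derived in the post
    route_miles = 3_000         # assumed Boston-LA fibre route length (not from the post)
    measured_ms = 40.0          # the coast-to-coast figure quoted in the post

    light_ms = route_miles / SPEED_MI_PER_S * 1e3
    print(f"speed-of-light share: {light_ms:.0f} ms")
    print(f"left for path stretch, store-and-forward, buffering: {measured_ms - light_ms:.0f} ms")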
Name: not available Url: http://www.postel.org/pipermail/end2end-interest/attachments/20050819/15aa9649/attachment.ksh -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 185 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050819/15aa9649/attachment.bin From mbuddhikot at lucent.com Mon Aug 22 06:59:54 2005 From: mbuddhikot at lucent.com (Milind M. Buddhikot) Date: Mon, 22 Aug 2005 09:59:54 -0400 Subject: [e2e] Call for Participation, IEEE ICNP 2005, Nov 6-9, Boston Message-ID: <4309DA5A.6050504@lucent.com> Call For Participation (and Call for Student Posters) http://csr.bu.edu/icnp2005 13th IEEE International Conference on Network Protocols Boston, Massachusetts, USA November 6-9, 2005 Sponsored by: IEEE Computer Society, IEEE TCDP, NSF CISE/CNS, IBM Research, Boston University Important Dates =============== Early Registration: October 12, 2005 Student Travel Award Application: September 23, 2005 Minority Travel Award Application: September 15, 2005 Student Poster Submission: September 15, 2005 Highlights of ICNP 2005 ======================= * Keynote speech by Professor Larry Peterson (Princeton University) on "A Strategy for Continually Reinventing the Internet" * Invited talk by Darleen Fisher (National Science Foundation) on "NSF NeTS Initiatives on New Architectures and Protocols" * Presentations of peer-reviewed technical papers organized into ten sessions: + Interdomain Routing + Sensor & Ad-hoc Protocols + Peer-to-Peer Protocols + Geographic Routing in Ad-hoc Networks + Overlay Protocols + Dimensioning & Traffic Engineering + Security & Safety + Congestion Control + Protocol Implementation & Analysis + Wireless Transport * Workshop on Secure Network Protocols (NPSec) * Three timely tutorials: + Survivable Routing: Algorithms and Protocols + Wireless Mesh Networking + Session Initiation Protocol (SIP): A Protocol for Managing Next Generation Networks * Student work-in-progress poster session ================= -------------- next part -------------- A non-text attachment was scrubbed... Name: mbuddhikot.vcf Type: text/x-vcard Size: 350 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050822/ae160dfe/mbuddhikot.vcf From huitema at windows.microsoft.com Mon Aug 22 09:04:23 2005 From: huitema at windows.microsoft.com (Christian Huitema) Date: Mon, 22 Aug 2005 09:04:23 -0700 Subject: [e2e] Question about propagation and queuing delays Message-ID: > The remaining 15 msec. can be accounted for by the fiber path not being > straight line, or by various "buffering delays" (which include queueing > delays, and scheduling delays in the case where frames are scheduled > periodically and you have to wait for the next frame time to launch your > frame). > > Craig Partridge and I have debated (offline) what the breakdown might > actually turn out to be (he thinks the total buffering delay is only 2-3 > msec., I think it's more like 10-12), and it would be quite interesting > to get more details, but that would involve delving into the actual > equipment deployed and its operating modes. One way to find out is to collect a large set of samples, and then look at the minimum value. As long as the route does not change, the propagation delay is the sum of the transmission times, which are supposed constant, and a set of positive random values. 
The minimum of a large sample is the sum of the transmission times and the minimum of the random values, which tends towards zero. Obviously, you have to verify the "stable route" hypothesis... -- Christian Huitema From david.hagel at gmail.com Mon Aug 22 09:13:41 2005 From: david.hagel at gmail.com (David Hagel) Date: Mon, 22 Aug 2005 09:13:41 -0700 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: <43094A36.1040402@reed.com> References: <43094A36.1040402@reed.com> Message-ID: Thanks, this is interesting. I asked the same question on nanog and got similar responses: that queuing delay is negligible on todays backbone networks compared to other fixed delay components (propagation, store-and-forward, transmission etc). Response on nanog seems to indicate that queuing delay is almost irrelevant today. This may sound like a naive question. But if queuing delays are so insignificant in comparison to other fixed delay components then what does it say about the usefulness of all the extensive techniques for queue management and congestion control (including TCP congestion control, RED and so forth) in the context of today's backbone networks? Any thoughts? Are the congestion control researchers out of touch with reality? - Dave On 8/21/05, David P. Reed wrote: > I can repeatably easily measure 40 msec. coast-to-coast (Boston-LA), of > which around 25 msec. is accounted for by speed of light in fiber (which > is 2/3 of speed of light in vacuum, *299,792,458 m s^-1 *, because the > refractive index of fiber is approximately 1.5 or 3/2). So assume 2e8 > m/s as the speed of light in fiber, 1.6e3 m/mile, and you get 1.25e5 > mi/sec. > > The remaining 15 msec. can be accounted for by the fiber path not being > straight line, or by various "buffering delays" (which include queueing > delays, and scheduling delays in the case where frames are scheduled > periodically and you have to wait for the next frame time to launch your > frame). > > Craig Partridge and I have debated (offline) what the breakdown might > actually turn out to be (he thinks the total buffering delay is only 2-3 > msec., I think it's more like 10-12), and it would be quite interesting > to get more details, but that would involve delving into the actual > equipment deployed and its operating modes. > From detlef.bosau at web.de Mon Aug 22 11:26:07 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 22 Aug 2005 20:26:07 +0200 Subject: [e2e] Question about propagation and queuing delays References: <43094A36.1040402@reed.com> Message-ID: <430A18BF.5030202@web.de> David Hagel wrote: > > This may sound like a naive question. But if queuing delays are so > insignificant in comparison to other fixed delay components then what > does it say about the usefulness of all the extensive techniques for > queue management and congestion control (including TCP congestion > control, RED and so forth) in the context of today's backbone > networks? Any thoughts? Are the congestion control researchers out of > touch with reality? > > - Dave It depends. One answer is: Yes, they are. A more cynical answer is: If a lucky guy joins a PhD program, he must find a topic to write about. In earlier centuries one wrote a doctoral thesis about: "Was Maria virgin until her first intercourse with..., excuse me, before she became pregnant?" O.k., now we know about that. Next thesis. "Was Maria virgin _during_ her pregnancy?". Even that is clear. 
Next thesis, and this is anatomically interesting, perhaps one could not only achieve a D.D. but an M.D. with this: "Was Maria, anatomically correct, virgin after she gave birth to Jesus?" And now the most difficult one: "Was Maria virgin _during_ the birth of Jesus?" No, this is no politically incorrect offense to the readers; these are topics which were discussed extensively in the Middle Ages, here in Germany and in Italy and in other locations of the Roman Catholic church. Nowadays, we are rationalists. We don't debate Maria's virginity. We discuss the importance of timers for congestion control. Some months ago, some people from the group around Christoph Lindemann published about "TCP with Adaptive Pacing for Multihop Wireless Networks" and recognized latency observation as a new crystal ball for congestion forecast and avoidance. If this sounds too cynical: I apologize. During the last weeks, I became mad about Edge's paper about adaptive retransmission timeouts. And the more I'm thinking about that paper and how TCP timers work, the more I become convinced that the insignificance of queueing delays, and the consequence that the Internet latency as perceived by a flow is nearly constant during the lifetime of a flow, is the reason why TCP timers work at all. As soon as latencies are subject to large and sudden change, prominent example: mobile wide area networks, we talk about "spurious timeouts" and other urban legends, which miss the problem. The more often I read Edge's paper and think about it, and the more I play around with the actual RTT estimators, the more I am in doubt whether these will work in a network with highly unstable and quickly changing latencies.
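For concreteness: the "actual RTT estimators" referred to here are essentially Jacobson's EWMA scheme as codified in RFC 2988 - SRTT and RTTVAR are exponentially weighted averages of the measured RTT and of its deviation, and RTO = SRTT + 4*RTTVAR with a lower bound of one second. A minimal stand-alone sketch in Python follows; the RTT trace at the end is invented purely to show how the estimator reacts to a sudden latency jump, it is not a measurement.

    # Sketch of TCP's standard RTO computation (Jacobson / RFC 2988).
    def rto_trace(rtt_samples, alpha=0.125, beta=0.25, min_rto=1.0):
        srtt = rttvar = None
        for r in rtt_samples:
            if srtt is None:
                srtt, rttvar = r, r / 2.0          # first measurement
            else:
                rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
                srtt = (1 - alpha) * srtt + alpha * r
            yield r, srtt, rttvar, max(min_rto, srtt + 4 * rttvar)

    # 20 samples around 50 ms, then the path suddenly jumps to 300 ms
    # (values in seconds, made up for illustration only).
    for r, srtt, rttvar, rto in rto_trace([0.05] * 20 + [0.30] * 5):
        print("rtt=%.3f  srtt=%.3f  rttvar=%.4f  rto=%.3f" % (r, srtt, rttvar, rto))

With the one-second lower bound, the jump in this toy trace is absorbed; remove the bound, or make the jump larger than the current RTO, and the retransmission timer fires before SRTT and RTTVAR have caught up - the "spurious timeout" situation discussed here.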
This is all the more true in mobile wireless networks, where latencies are due to retransmissions and error recovery (without error recovery, TCP flows would break down due to retransmission collapse in those networks) and are therefore subject to changes of path properties beyond our control. It's not Edge's approach which causes the problem. It's our actual approach to RTT estimation, which is as useful as casting dice. So, with respect to your last sentence: we often are out of touch with reality, because with our actual TCP timers we use an unsound basis for TCP congestion control which, by some chance and lucky circumstances, holds in the contemporary Internet. And when some "strange effects" appear in mobile networks, we are glad about it: "Hurray! An effect! A topic for my PhD thesis!" Honestly, if you detect fire in your house, you certainly will not be glad because you can spare fuel; you will call the fire brigade. Using unstable and inappropriate estimators for mean and variance of RTT leads to a number of "strange effects"; "spurious timeouts" are only one of them. However: it's a symptom. Not the reason. A cure must focus on the reason. Not on the symptom. Once again: in the contemporary Internet with negligible queueing delays and almost constant paths, this is absolutely no problem and everything works fine. But falling asleep safe and sound, knowing Kah is around, is perhaps not the best strategy to solve the imminent problem. It's similar to our German welfare system, where politicians ignored (well known!) problems for decades - and now we face a disaster. I'm not even convinced that everything is fine in wirebound networks. Due to some "interesting" discussions here in Germany concerning "fastpath" (some new buzzword with ADSL) I had a first glance at the ITU recommendation for g.dmt. In fact, we do not _yet_ use automatic retransmission here. But if we continue to exploit extremely noisy lines for high speed data transmission, which appears to be promising when you look at the market and which allows me as an unemployed person to use the Internet (with my old ISDN dialup account it was far too expensive), things can turn out differently. Perhaps ARQ might be useful for some lines. Perhaps not only at the last mile, which can be hidden behind a PEP. We discussed ARQ for satellite links recently on this list. And then? Will we complain about "spurious timeouts" then? I apologize if this sounds extremely upset. It's my honest intention not to offend anybody. And if I can contribute an approach here, I will do my very best. I sent some rough ideas to some people; perhaps I will get some feedback on them. But either I am too stupid to understand TCP and its assumptions, or there is real danger of getting into severe trouble if we keep ignoring the timer issue. O.k., I think I will apply for asylum on the Falklands or in the Antarctic now, since I expect to receive evil criticism now. I don't mind. If I'm wrong, I will learn my lesson. But at the moment, I'm simply discouraged. If I'm wrong, I would appreciate somebody correcting me. If not, perhaps I can think about a way out. But every time I start my editor on my dated, ten-year-old P160 with 128 MByte memory, I think: it does not matter whatever I write. As long as I do not provide billions of simulations (AKA repeated assertions) with the NS2, where I would have to change great amounts of code, which would require man-years of work, even with equipment on which even a single _link_ run for the NS2 takes about half an hour, no one would believe me. And as an unemployed person who is, as one minor problem of course, in need of a job and of making a living here in Germany with an admitted 5 million unemployed people (in reality we probably have about 8 to 10 million unemployed people here), I cannot rewrite the whole NS2 and insert layer 2 models, which I do not have because no one gives _real_ channel traces to an unknown guy from Germany, and I cannot implement all the necessary changes _and_ produce convincing traces (which again no one would ever believe) on my own. So, I write one or two lines, and then I shut down the editor and give up. To blather about "TCP with Adaptive Pacing...." is obviously more successful. And to ignore the problem is perhaps the best strategy. Excuse me for writing this, but as I said, I'm _really_ discouraged. And maybe I'm completely wrong. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From marc.herbert at free.fr Mon Aug 22 11:43:37 2005 From: marc.herbert at free.fr (Marc Herbert) Date: Mon, 22 Aug 2005 20:43:37 +0200 (CEST) Subject: [e2e] Question about propagation and queuing delays In-Reply-To: References: <43094A36.1040402@reed.com> Message-ID: On Mon, 22 Aug 2005, David Hagel wrote: > Thanks, this is interesting. I asked the same question on nanog and > got similar responses: that queuing delay is negligible on todays > backbone networks compared to other fixed delay components > (propagation, store-and-forward, transmission etc). Response on nanog > seems to indicate that queuing delay is almost irrelevant today. > > This may sound like a naive question.
But if queuing delays are so > insignificant in comparison to other fixed delay components then what > does it say about the usefulness of all the extensive techniques for > queue management and congestion control (including TCP congestion > control, RED and so forth) in the context of today's backbone > networks? Any thoughts? Are the congestion control researchers out of > touch with reality? The delay-based congestion control techniques you are talking about are not based on a ratio but on a delta between instant and constant delay. This still does not mean the measure is easy; but IMHO not for the reason you give. From dpreed at reed.com Mon Aug 22 14:08:58 2005 From: dpreed at reed.com (David P. Reed) Date: Mon, 22 Aug 2005 17:08:58 -0400 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: References: Message-ID: <430A3EEA.80004@reed.com> Christian Huitema wrote: > >One way to find out is to collect a large set of samples, and then look >at the minimum value. As long as the route does not change, the >propagation delay is the sum of the transmission times, which are >supposed constant, and a set of positive random values. The minimum of a >large sample is the sum of the transmission times and the minimum of the >random values, which tends towards zero. > >Obviously, you have to verify the "stable route" hypothesis... > > This assumes the buffering is elastic. If it includes a fixed delay independent of load in the particular equipment (e.g. a "slotted" multiplexed rate adapter) you could have a long buffer delay without variation. Not all queues are elastic. (i.e. a pair of scheduled train routes with a transfer point can have a constant queueing delay that is the skew in arrival vs. departures at the transfer point.). From fred at cisco.com Mon Aug 22 14:50:52 2005 From: fred at cisco.com (Fred Baker) Date: Tue, 23 Aug 2005 05:50:52 +0800 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: References: <43094A36.1040402@reed.com> Message-ID: <13B1CFCA-3291-4B04-8CC4-D711D4423486@cisco.com> no, but there are different realities, and how one measures them is also relevant. In large fiber backbones, within the backbone we generally run 10:1 overprovisioned or more. within those backbones, as you note, the discussion is moot. But not all traffic stays within the cores of large fiber backbones - much of it is originated and terminates in end systems located in homes and offices. The networks that connect homes and offices to the backbones are often constrained differently. For example, my home (in an affluent community in California) is connected by Cable Modem, and the service that I buy (business service that in its AUP accepts a VPN, unlike the same company's residential service) guarantees a certain amount of bandwidth, and constrains me to that bandwidth - measured in KBPS. I can pretty easily fill that, and when I do certain services like VoIP don't work anywhere near as well. So I wind up playing with the queuing of traffic in the router in my home to work around the service rate limit in my ISP. As I type this morning (in a hotel in Taipei), the hotel provides an access network that I share with the other occupants of the hotel. It's not uncommon for the entire hotel to share a single path for all of its occupants, and that single path is not necessarily in MBPS. 
And, they tell me that the entire world is not connected by large fiber cores - as soon as you step out of the affluent industrialized countries, VSAT, 64 KBPS links, and even 9.6 access over GSM become the access paths available. As to measurement, note that we generally measure that overprovisioning by running MRTG and sampling throughput rates every 300 seconds. When you're discussing general service levels for an ISP, that is probably reasonable. When you're measuring time variations on the order of milliseconds, that's a little like running a bump counter cable across a busy intersection in your favorite downtown, reading the counter once a day, and drawing inferences about the behavior of traffic during light changes during rush hour... http://www.ieee-infocom.org/2004/Papers/37_4.PDF has an interesting data point. They used a much better measurement methodology, and one of the large networks gave them some pretty cool access in order to make those tests. Basically, queuing delays within that particular very-well-engineered large fiber core were on the order of 1 ms or less during the study, with very high confidence. But the same data flows frequently jumped into the 10 ms range even within the 90% confidence interval, and a few times jumped to 100 ms or so. The jumps to high delays would most likely relate to correlated high volume data flows, I suspect, either due to route changes or simple high traffic volume. The people on NANOG and the people in the NRENs live in a certain ivory tower, and have little patience with those who don't. They also measure the world in a certain way that is easy for them. On Aug 23, 2005, at 12:13 AM, David Hagel wrote: > Thanks, this is interesting. I asked the same question on nanog and > got similar responses: that queuing delay is negligible on todays > backbone networks compared to other fixed delay components > (propagation, store-and-forward, transmission etc). Response on > nanog seems to indicate that queuing delay is almost irrelevant today. > > This may sound like a naive question. But if queuing delays are so > insignificant in comparison to other fixed delay components then > what does it say about the usefulness of all the extensive > techniques for queue management and congestion control (including > TCP congestion control, RED and so forth) in the context of today's > backbone networks? Any thoughts? Are the congestion control > researchers out of touch with reality? > > - Dave > > On 8/21/05, David P. Reed wrote: >> I can repeatably easily measure 40 msec. coast-to-coast (Boston- >> LA), of which around 25 msec. is accounted for by speed of light >> in fiber (which is 2/3 of speed of light in vacuum, *299,792,458 m >> s^-1 *, because the refractive index of fiber is approximately 1.5 >> or 3/2). So assume 2e8 m/s as the speed of light in fiber, >> 1.6e3 m/mile, and you get 1.25e5 mi/sec. >> >> The remaining 15 msec. can be accounted for by the fiber path not >> being straight line, or by various "buffering delays" (which >> include queueing delays, and scheduling delays in the case where >> frames are scheduled periodically and you have to wait for the >> next frame time to launch your frame). >> >> Craig Partridge and I have debated (offline) what the breakdown >> might actually turn out to be (he thinks the total buffering delay >> is only 2-3 msec., I think it's more like 10-12), and it would be >> quite interesting to get more details, but that would involve >> delving into the actual equipment deployed and its operating modes. 
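Fred's point about 300-second counters can be illustrated with a toy queue simulation (all numbers below are invented, not measurements): a link whose 5-minute average looks comfortably overprovisioned can still be driven above line rate for a few hundred milliseconds at a time, and those are exactly the intervals in which queueing delay appears.

    # Toy illustration: 300 s of traffic on a ~1 Gbit/s link, mostly at 30%
    # load but with a 200 ms above-line-rate burst roughly every 30 s.
    import random

    random.seed(1)
    capacity = 125000.0                    # bytes the link can serve per millisecond
    window_ms = 300 * 1000                 # one 300-second MRTG-style window
    queue = sent = 0.0
    worst_delay_ms = 0.0
    burst_left = 0

    for _ in range(window_ms):
        if burst_left == 0 and random.random() < 1.0 / 30000:
            burst_left = 200               # start of a 200 ms burst
        load = 1.2 if burst_left else 0.3  # bursts arrive at 120% of line rate
        burst_left = max(0, burst_left - 1)
        queue += load * capacity           # bytes arriving this millisecond
        tx = min(queue, capacity)          # the link drains at most line rate
        queue -= tx
        sent += tx
        worst_delay_ms = max(worst_delay_ms, queue / capacity)

    print("300 s average utilization: %.1f %%" % (100.0 * sent / (capacity * window_ms)))
    print("worst queueing delay seen: %.1f ms" % worst_delay_ms)

With these made-up numbers the averaged counter should report utilization around 30% while the worst packets still waited tens of milliseconds - the kind of behaviour that per-millisecond measurements such as those in the INFOCOM paper cited above make visible and that 300-second counters cannot.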
From detlef.bosau at web.de Mon Aug 22 15:39:28 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 23 Aug 2005 00:39:28 +0200 Subject: [e2e] Question about propagation and queuing delays References: Message-ID: <430A5420.9ACB8F52@web.de> Christian Huitema wrote: > > > One way to find out is to collect a large set of samples, and then look > at the minimum value. As long as the route does not change, the > propagation delay is the sum of the transmission times, which are > supposed constant, and a set of positive random values. The minimum of a > large sample is the sum of the transmission times and the minimum of the > random values, which tends towards zero. the minimum of the random values, which tends towards zero..... Is there evidence for this? I think this is similar to the rationale given in the "Adaptive Pacing..." paper, where delay variation indicates congestion: if the random values represent e.g. queueing delays, why does the sum of these tend towards zero? Why not to an average value? If the sum tended to zero, once again we would have a possibility to calibrate a "congestion level = f(latency)" function. Of course, when you can observe networks in unloaded periods of time, you may be right as long as you take samples for a long enough period, with a sufficiently high sampling rate etc. However, from my own experience with all this "congestion level = f(latency)" magic, I became rather reluctant. It's appealing at first glance - however, it does not look promising at second glance. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From iam4 at cs.waikato.ac.nz Mon Aug 22 16:40:08 2005 From: iam4 at cs.waikato.ac.nz (Ian McDonald) Date: Tue, 23 Aug 2005 11:40:08 +1200 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: <430A18BF.5030202@web.de> References: <43094A36.1040402@reed.com> <430A18BF.5030202@web.de> Message-ID: <430A6258.9090603@cs.waikato.ac.nz> Detlef Bosau wrote: > David Hagel wrote: > >> >> This may sound like a naive question. But if queuing delays are so >> insignificant in comparison to other fixed delay components then what >> does it say about the usefulness of all the extensive techniques for >> queue management and congestion control (including TCP congestion >> control, RED and so forth) in the context of today's backbone >> networks? Any thoughts? Are the congestion control researchers out of >> touch with reality? >> >> - Dave > > > > It depends. > > One answer is: Yes, they are. > > A more cynical answer is: If a lucky guy joins a PhD program, he must > find a topic to write about. As a lucky guy doing a PhD on congestion control I couldn't resist the bait :-) I may be missing something but we need congestion control as long as we have networks. In the USA and in Europe you may all have unlimited bandwidth available at virtually no cost to you but in the rest of the "real world" it doesn't quite work like that. So as long as you are bandwidth constrained you will need congestion control. I think others are out of touch with reality.... Seriously traffic can be constrained for many different reasons apart from backbones: - link at other end (e.g.
web server) is on a "slow" link - mobile networks - link between ISP and upstream ISP (a particular problem in NZ at the moment) - slow speed link at consumer premises Most backbones are over provisioned in the developed world but less so in more remote corners and even less so in developing countries. I have seen presentations showing >50% packet loss in parts of Asia and Africa on this list in the last few months - surely you need congestion control for that! Remember congestion control is also about fairness on your own equipment as well - you want competing flows to share nicely (unless you specify otherwise). Regards, Ian From huitema at windows.microsoft.com Mon Aug 22 17:34:06 2005 From: huitema at windows.microsoft.com (Christian Huitema) Date: Mon, 22 Aug 2005 17:34:06 -0700 Subject: [e2e] Question about propagation and queuing delays Message-ID: > the minimum of the random values, which tends towards zero..... > > Is there evidence for this? Yep. Assuming independent samples, P(min(X1, X2,...,Xn) > y) = P(X > y) to the power N, which tends towards 0 when N increases, except for the value y=0. > If the random values represent e.g. queueing delays, why does the sum of > these tends towards zero? Why not to an average value? Min, not sum. -- Christian Huitema From vgill at vijaygill.com Mon Aug 22 19:45:14 2005 From: vgill at vijaygill.com (vijay gill) Date: Mon, 22 Aug 2005 22:45:14 -0400 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: <13B1CFCA-3291-4B04-8CC4-D711D4423486@cisco.com> References: <43094A36.1040402@reed.com> <13B1CFCA-3291-4B04-8CC4-D711D4423486@cisco.com> Message-ID: <430A8DBA.2000007@vijaygill.com> Fred Baker wrote: > no, but there are different realities, and how one measures them is > also relevant. > > In large fiber backbones, within the backbone we generally run 10:1 > overprovisioned or more. within those backbones, as you note, the > discussion is moot. But not all traffic stays within the cores of large > fiber backbones - much of it is originated and terminates in end > systems located in homes and offices. We don't run 10:1 overprovisioning or n:1 overprovisioning in the backbone because we simply do not know how. I am provisioning a backbone interface, where do I get the 10 to 1 figure from. I have worked at very large backbones for most of my career and in every case, the backbone bandwidth provisioning was simply kicked off when certain paths got to a steady 50% or more utilization. The saving factor is that large macroflows between places are fairly tractacble and we can watch the link utilization and upgrade as needed (I speak to well funded north american networks, if you're running a country over a VSAT link and dialup modem, disregard this). > > The networks that connect homes and offices to the backbones are often > constrained differently. For example, my home (in an affluent community > in California) is connected by Cable Modem, and the service that I buy > (business service that in its AUP accepts a VPN, unlike the same > company's residential service) guarantees a certain amount of > bandwidth, and constrains me to that bandwidth - measured in KBPS. Here is where overprovisioning is common. Normally most cable plants allocate 20 kbps or 25 kbps per paying sub for capacity planning purposes and build the physical plant to support that. > in MBPS. 
And, they tell me that the entire world is not connected by > large fiber cores - as soon as you step out of the affluent > industrialized countries, VSAT, 64 KBPS links, and even 9.6 access over > GSM become the access paths available. > As to measurement, note that we generally measure that overprovisioning > by running MRTG and sampling throughput rates every 300 seconds. When > you're discussing general service levels for an ISP, that is probably > reasonable. When you're measuring time variations on the order of > milliseconds, that's a little like running a bump counter cable across > a busy intersection in your favorite downtown, reading the counter once > a day, and drawing inferences about the behavior of traffic during > light changes during rush hour... Which is why I've been pushing my vendors to implement high watermark counters that measure the maximum queue depth reached. The EWMA counters used in most routers might as well be a random number in terms of finding out microburst caused congestion. It is however, perfectly valid for cap planning for large city-pair flows. > > http://www.ieee-infocom.org/2004/Papers/37_4.PDF has an interesting > data point. They used a much better measurement methodology, and one of > the large networks gave them some pretty cool access in order to make > those tests. Basically, queuing delays within that particular > very-well-engineered large fiber core were on the order of 1 ms or less > during the study, with very high confidence. But the same data flows > frequently jumped into the 10 ms range even within the 90% confidence > interval, and a few times jumped to 100 ms or so. The jumps to high > delays would most likely relate to correlated high volume data flows, I > suspect, either due to route changes or simple high traffic volume. That burstiness occurs more frequently if your customers are connected at links that are on the same bandwidth as the core. Lots of ds3/t1/e3 type customers are not going to cause significant microburstiness issues on a 10 gig backbone. > The people on NANOG and the people in the NRENs live in a certain ivory > tower, and have little patience with those who don't. They also measure > the world in a certain way that is easy for them. > No comment. /vijay From randy at psg.com Mon Aug 22 23:40:13 2005 From: randy at psg.com (Randy Bush) Date: Mon, 22 Aug 2005 23:40:13 -0700 Subject: [e2e] Question about propagation and queuing delays References: <43094A36.1040402@reed.com> <13B1CFCA-3291-4B04-8CC4-D711D4423486@cisco.com> Message-ID: <17162.50381.753858.916232@roam.psg.com> > In large fiber backbones, within the backbone we generally run 10:1 > overprovisioned or more. while i am quite ready to believe that in the backbones that you run, this is the case. in the backbones which are run by the large and medium isps, this is not. in the real world, it's driven by provisioning time. i.e., if one can provision in a matter of weeks, then traffic usually grows sufficiently slowly that utilization of well over 50% can be tolerated. in the more realistic situation where provisioning takes months, 50-66% is more the norm. but as i said, it also depends on rate of traffic growth. > The people on NANOG and the people in the NRENs live in a certain > ivory tower, and have little patience with those who don't. They also > measure the world in a certain way that is easy for them. unfortunately, what is 'easy' is that which is provided by the broken vendor(s). 
these tools are so gross as to only be useful when the law of large numbers is in play in highly aggregated traffic. when small spiky flows are at issue, we're left in what i might term as dirt, not an ivory tower. randy From puddinghead_wilson007 at yahoo.co.uk Tue Aug 23 00:47:55 2005 From: puddinghead_wilson007 at yahoo.co.uk (Puddinhead Wilson) Date: Tue, 23 Aug 2005 08:47:55 +0100 (BST) Subject: [e2e] Question about propagation and queuing delays In-Reply-To: <430A18BF.5030202@web.de> Message-ID: <20050823074755.3014.qmail@web25701.mail.ukl.yahoo.com> How is this for a thesis in older times: if the equator and the latitudes were totally aligned/in parallel with the plane of the revolution of the earth, is daylight saving time needed? (never mind that latitudes may be ellipses) ;-) --- Detlef Bosau wrote: > David Hagel wrote: > > This may sound like a naive question. But if queuing delays are so > > insignificant in comparison to other fixed delay components then what > > does it say about the usefulness of all the extensive techniques for > > queue management and congestion control (including TCP congestion > > control, RED and so forth) in the context of today's backbone > > networks? Any thoughts? Are the congestion control researchers out of > > touch with reality? > > - Dave > It depends. > One answer is: Yes, they are. > A more cynical answer is: If a lucky guy joins a PhD program, he must > find a topic to write about. === message truncated ===
From puddinghead_wilson007 at yahoo.co.uk Tue Aug 23 01:15:58 2005 From: puddinghead_wilson007 at yahoo.co.uk (Puddinhead Wilson) Date: Tue, 23 Aug 2005 09:15:58 +0100 (BST) Subject: [e2e] Question about propagation and queuing delays In-Reply-To: <20050823074755.3014.qmail@web25701.mail.ukl.yahoo.com> Message-ID: <20050823081558.50775.qmail@web25702.mail.ukl.yahoo.com> --- Puddinhead Wilson wrote: > How is this for a thesis in older times: > if the equator and the latitudes were totally > aligned/in parallel with the plane of the > revolution > of the earth, is daylight saving time needed? > (never mind that latitudes may be ellipses) > > ;-) foolish me!! how will I get an ellipse from a cross section of a sphere :-)) From detlef.bosau at web.de Tue Aug 23 06:25:47 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 23 Aug 2005 15:25:47 +0200 Subject: [e2e] Question about propagation and queuing delays References: <43094A36.1040402@reed.com> <430A18BF.5030202@web.de> <430A6258.9090603@cs.waikato.ac.nz> Message-ID: <430B23DB.8050101@web.de> Ian McDonald wrote: > As a lucky guy doing a PhD on congestion control I couldn't resist the bait :-) > > I may be missing something but we need congestion control as long as we have networks. In the USA I'm totally with you. > and in Europe you may all have unlimited bandwidth available at virtually no cost to you but in the > rest of the "real world" it doesn't quite work like that. So as long as you are bandwidth > constrained you will need congestion control. I think others are out of touch with reality.... > > Excuse me, where is the flame bait? What you say is absolutely correct and I totally agree with you! I will only give one example (I always tend to write too much...). When I worked as a network administrator in northern Germany, we had to attach some points of sale, situated in the Czech Republic, to a company network. It is interesting to observe people who always talk about ISDN, DSL, backbones with large bandwidth, and then you are informed: "We do not yet know whether 9k6 can be achieved, we have to check the old POTS line, it may be too noisy." Depending on where you are, you may well encounter different realities! Even here in Germany you may encounter stone-age POTS lines in some rural areas. When I read what you say, I would like to invite you into my ISP's support newsgroup; I think many of the readers could learn a lot from you! Just to give one example from there: we recently had a discussion about "Fastpath". On DSL lines, you need error recovery on the last mile. Now, to save overhead, you do code spreading/interleaving. Some "well informed guys" want the ISP to turn interleaving off in order to spare some "ping time". First of all, it's simply ridiculous that individual customers without any technical knowledge prescribe to the provider the appropriate line coding for one individual wire pair.
Second: not only these customers may be affected by increasing error rates: these guys flood large portions of the network with defective frames, more precisely with defective ATM cells with corrupted payload, which is eventually detected at the customer's AAL 5 peer. (At least AFAIK.) This is a thoughtless waste of bandwidth, but it is nearly impossible to convince those guys that this is malicious in quite a number of cases! What is even more disastrous: in fact, in DSL, TCP appears to be based upon AAL5/UBR. Unspecified bit rate. Hence, all congestion control is done at the TCP endpoints. I'm totally with you that this requires well behaved participants in a network. IIRC, LANE works with ABR, and that would alleviate the problem. > > Seriously traffic can be constrained for many different reasons apart from backbones: > - link at other end (e.g. web server) is on a "slow" link > - mobile networks > - link between ISP and upstream ISP (a particular problem in NZ at the moment) > - slow speed link at consumer premises Could you _please_ join this newsgroup :-) > > Most backbones are over provisioned in the developed world but less so in more remote corners and > even less so in developing countries. I have seen presentations showing >50% packet loss in parts of > Asia and Africa on this list in the last few months - surely you need congestion control for that! Excuse me, but I have nothing against congestion control! Of course we need it! Perhaps my command of the English language is rather poor. But I sincerely hope that no one has misunderstood me in such a way that I denied the necessity of congestion control! The problem _I_ expect is that congestion control, and even proper retransmission control, can run into severe problems when TCP timers don't work. And when I talked about the Internet as it is perceived in Europe and the US, I concluded that in _this_ area TCP works fine. Whether this holds true all over the world and in all kinds of networks is highly questionable. So, I really don't see a flame bait here. Perhaps you understood me in a different way, but from what you wrote, I couldn't agree with you more! Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From detlef.bosau at web.de Tue Aug 23 07:20:16 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 23 Aug 2005 16:20:16 +0200 Subject: [e2e] Question about propagation and queuing delays References: Message-ID: <430B30A0.1030202@web.de> Christian Huitema wrote: >>the minimum of the random values, which tends towards zero..... >> >>Is there evidence for this? > > Yep. Assuming independent samples, P(min(X1, X2,...,Xn) > y) = P(X > y) > to the power N, which tends towards 0 when N increases, except for the > value y=0. O.k. >>If the random values represent e.g. queueing delays, why does the sum of >>these tend towards zero? Why not to an average value? But can you observe this min? If Xi, i = 1..n, are queueing delays, the minimum tends to zero. However, if you observe a packet traveling the network (I think you obtain your samples this way?), the queueing delays will sum up? Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937
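The min-versus-sum question above can be checked numerically with a toy model (the parameters are invented): each end-to-end sample is a fixed propagation delay plus the sum of independent per-hop queueing delays, where a hop's queue is assumed to be empty half of the time and otherwise adds an exponentially distributed wait.

    # Toy check: does the minimum of end-to-end samples reach the
    # propagation floor even though each sample is a *sum* of hop delays?
    import random

    random.seed(1)
    propagation_ms = 25.0
    hops = 10

    def one_sample():
        queueing = sum(0.0 if random.random() < 0.5 else random.expovariate(0.5)
                       for _ in range(hops))       # expovariate(0.5): mean 2 ms per busy hop
        return propagation_ms + queueing

    for n in (10, 100, 1000, 10000, 100000):
        samples = [one_sample() for _ in range(n)]
        print("n=%6d  min=%7.3f ms  mean=%7.3f ms" % (n, min(samples), sum(samples) / n))

The mean stays around 35 ms, but the minimum creeps down towards the 25 ms floor once the sample is large enough that at least one packet found every queue empty (probability 2^-10 per sample here). With more hops, or with queues that are rarely empty, the convergence is correspondingly slower - which seems to be exactly the caveat behind the question above, and the reason the "stable route, long enough observation" qualification matters.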
From fla at inescporto.pt Tue Aug 23 09:35:40 2005 From: fla at inescporto.pt (Filipe Abrantes) Date: Tue, 23 Aug 2005 17:35:40 +0100 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: References: <43094A36.1040402@reed.com> Message-ID: <430B505C.3000303@inescporto.pt> Hello David, David Hagel wrote: > Thanks, this is interesting. I asked the same question on nanog and > got similar responses: that queuing delay is negligible on todays > backbone networks compared to other fixed delay components > (propagation, store-and-forward, transmission etc). Response on nanog > seems to indicate that queuing delay is almost irrelevant today. > > This may sound like a naive question. But if queuing delays are so > insignificant in comparison to other fixed delay components then what > does it say about the usefulness of all the extensive techniques for > queue management and congestion control (including TCP congestion > control, RED and so forth) in the context of today's backbone > networks? Any thoughts? Are the congestion control researchers out of > touch with reality? > The latencies mentioned by David Reed are for the case of a non-congested path, and, as was already mentioned here, nowadays the most common case is for our access link (xDSL/cable...) at home/office to be the bottleneck (the ping would struggle to fill the link, right?). So, to get an approximate value for the maximum queueing delays, you should try a ping while you have background traffic that fully utilizes your access link. Congestion control only plays an active role when there is a bottleneck in the path... (well, not totally true, as the guys from the high bandwidth-delay and lossy paths may tell you). As to queue management, one of its goals is also to promote fairness between flows (TCP is not that good at it), so I can see some usefulness in it too. Whether the final result is actually good enough I still don't know (I haven't gone too deep into this issue). I just did a ping to my home, which is on a 2Mb-dl/128kbit-ul cable connection, from work (where I am) to exemplify this. At home I started a P2P program which had the upload capped at 6KByte/s (capped by the application, so there could be instantaneous overloads, I think). I got this (the upload link was the bottleneck, as my download was well below the dl limit): $ ping xxxxxxxx.no-ip.org PING xxxxxxx.no-ip.org (83.132.76.xx) 56(84) bytes of data.
64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=1 ttl=52 time=71.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=2 ttl=52 time=109 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=3 ttl=52 time=88.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=4 ttl=52 time=29.5 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=5 ttl=52 time=399 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=6 ttl=52 time=307 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=7 ttl=52 time=131 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=8 ttl=52 time=78.6 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=9 ttl=52 time=87.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=10 ttl=52 time=54.2 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=11 ttl=52 time=93.7 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=12 ttl=52 time=22.4 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=13 ttl=52 time=21.8 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=14 ttl=52 time=45.2 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=16 ttl=52 time=251 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=17 ttl=52 time=22.1 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=18 ttl=52 time=297 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=19 ttl=52 time=290 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=20 ttl=52 time=280 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=21 ttl=52 time=21.0 ms --- xxxxxxxx.no-ip.org ping statistics --- 21 packets transmitted, 20 received, 4% packet loss, time 20020ms rtt min/avg/max/mdev = 21.046/135.229/399.044/117.287 ms Then I capped the upload at 3Kbyte/s and got this: $ ping xxxxxxxx.no-ip.org PING xxxxxxx.no-ip.org (83.132.76.xx) 56(84) bytes of data. 
64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=1 ttl=52 time=22.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=2 ttl=52 time=22.2 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=3 ttl=52 time=88.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=4 ttl=52 time=34.3 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=5 ttl=52 time=23.3 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=6 ttl=52 time=24.6 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=7 ttl=52 time=25.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=8 ttl=52 time=22.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=9 ttl=52 time=20.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=10 ttl=52 time=52.5 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=11 ttl=52 time=21.4 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=12 ttl=52 time=30.9 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=13 ttl=52 time=21.4 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=14 ttl=52 time=42.8 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=15 ttl=52 time=20.5 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=16 ttl=52 time=22.0 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=17 ttl=52 time=24.7 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=18 ttl=52 time=24.4 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=19 ttl=52 time=21.3 ms 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=20 ttl=52 time=20.6 ms --- xxxxxxxx.no-ip.org ping statistics --- 20 packets transmitted, 20 received, 0% packet loss, time 19018ms rtt min/avg/max/mdev = 20.593/29.482/88.992/15.843 ms As you can see, queueing delays are noticeable. Best Regards Filipe Abrantes > - Dave > > > On 8/21/05, David P. Reed wrote: > >>I can repeatably easily measure 40 msec. coast-to-coast (Boston-LA), of >>which around 25 msec. is accounted for by speed of light in fiber (which >>is 2/3 of speed of light in vacuum, *299,792,458 m s^-1 *, because the >>refractive index of fiber is approximately 1.5 or 3/2). So assume 2e8 >>m/s as the speed of light in fiber, 1.6e3 m/mile, and you get 1.25e5 >>mi/sec. >> >>The remaining 15 msec. can be accounted for by the fiber path not being >>straight line, or by various "buffering delays" (which include queueing >>delays, and scheduling delays in the case where frames are scheduled >>periodically and you have to wait for the next frame time to launch your >>frame). >> >>Craig Partridge and I have debated (offline) what the breakdown might >>actually turn out to be (he thinks the total buffering delay is only 2-3 >>msec., I think it's more like 10-12), and it would be quite interesting >>to get more details, but that would involve delving into the actual >>equipment deployed and its operating modes. >> > > -- Filipe Lameiro Abrantes INESC Porto Campus da FEUP Rua Dr. 
Roberto Frias, 378 4200-465 Porto Portugal Phone: +351 22 209 4266 E-mail: fla at inescporto.pt From garmitage at swin.edu.au Tue Aug 23 11:27:55 2005 From: garmitage at swin.edu.au (grenville armitage) Date: Tue, 23 Aug 2005 14:27:55 -0400 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: <430B23DB.8050101@web.de> References: <43094A36.1040402@reed.com> <430A18BF.5030202@web.de> <430A6258.9090603@cs.waikato.ac.nz> <430B23DB.8050101@web.de> Message-ID: <430B6AAB.9040804@swin.edu.au> Detlef Bosau wrote: [..] > Just to give one example from there: we recently had a discussion about > "Fastpath". On DSL lines, you need error recovery on the last mile. Now, > to save overhead, you do code spreading/interleaving. Some "well informed > guys" want the ISP to turn interleaving off in order to spare some "ping > time". First of all, it's simply ridiculous that individual customers > without any technical knowledge prescribe to the provider the > appropriate line coding for one individual wire pair. I'm curious if you have any stats on the typical error rates that will be experienced by the people who switch from Interleave to Fastpath mode on their DSL links. It is certainly true that e.g. gamers find Interleave mode to be a pain (at least 20ms additional latency), with good reason. Stats on how much packet loss the gamer will experience (in order to gain the latency improvement of Fastpath) would be interesting to know. If the error rates are low, then in fact it seems like an entirely reasonable thing for a customer to desire Fastpath. > Second: not only > these customers may be affected by increasing error rates: these guys > flood large portions of the network with defective frames, more > precisely with defective ATM cells with corrupted payload, which is > eventually detected at the customer's AAL 5 peer. (At least AFAIK.) > This is a thoughtless waste of bandwidth, but it is nearly impossible to > convince those guys that this is malicious in quite a number of cases! I'm also curious about this "large portions of the network" which is carrying useless ATM AAL5 cells. If the bit error rate is so bad that a noticeable fraction of ATM cells are useless, then gamers are going to have a bad packet loss rate and fairly quickly go back to interleave mode (despite the higher latency). Yet if the gamers are happy with the typical loss rate using Fastpath, then there's probably not that many wasted/useless ATM cells floating around. (Naturally, if the actual AAL_PDU loss rate starts to become more than a percent or so, the gamer's use of TCP for p2p, web surfing and email becomes problematic. But it is hard to argue this by hand-waving - we need stats on the likely bit error rates a typical DSL customer is likely to see using Fastpath.) cheers, gja From rja at extremenetworks.com Tue Aug 23 11:30:36 2005 From: rja at extremenetworks.com (RJ Atkinson) Date: Tue, 23 Aug 2005 14:30:36 -0400 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: References: <43094A36.1040402@reed.com> Message-ID: <91A5CE60-4E8C-475F-9B02-371A0D3EC1BB@extremenetworks.com> On Aug 22, 2005, at 12:13, David Hagel wrote: > Thanks, this is interesting. I asked the same question on nanog and > got similar responses: that queuing delay is negligible on todays > backbone networks compared to other fixed delay components > (propagation, store-and-forward, transmission etc). Response on nanog > seems to indicate that queuing delay is almost irrelevant today.
> > This may sound like a naive question. But if queuing delays are so > insignificant in comparison to other fixed delay components then what > does it say about the usefulness of all the extensive techniques for > queue management and congestion control (including TCP congestion > control, RED and so forth) in the context of today's backbone > networks? Any thoughts? Are the congestion control researchers out of > touch with reality? Congestion still exists today. However, it tends to exist not inside the network core, but instead in the access link (i.e. the link between the campus network and the upstream ISP). In many cases, this congestion is a policy choice on the part of the end site (e.g. pay for NxT1 uplink rather than T3 uplink in order to save money). Ran From marc.herbert at free.fr Tue Aug 23 11:49:09 2005 From: marc.herbert at free.fr (Marc Herbert) Date: Tue, 23 Aug 2005 20:49:09 +0200 (CEST) Subject: [e2e] Question about propagation and queuing delays In-Reply-To: <430B23DB.8050101@web.de> References: <43094A36.1040402@reed.com> <430A18BF.5030202@web.de> <430A6258.9090603@cs.waikato.ac.nz> <430B23DB.8050101@web.de> Message-ID: On Tue, 23 Aug 2005, Detlef Bosau wrote: > When I read what you say, I would like to invite you into my ISP?s > support newsgroup, I think much of the readers can learn a lot from you! > > Just to give one example from there: We recently had a discussion about > "Fastpath". In DSL lines, you need error recovery on the last mile. Now, > to save overhead you do codespreading/interleaing. Some "well informed > guys" want the ISP to turn interleaving off in order to spare some "ping > time". First of all, it?s simply ridiculous, theat individual customers > without any technical knowledge will prescribe the provider the > appropriate line coding for one individual wire pair. Second: Not only > these customers may be affected by increasing error rates: These guys > flood large portions of the network with defictive frames, more > precisely with defective ATM cells with corrupted payload, which is > eventually being detected at the customers AAL 5 peer. (At least AFAIK.) > This is thoughtless waste of bandwidth, but it is nearly impossible to > convince those guys that this is malicious in quite a number of cases! > > What is even more disastrous: In fact, in DSL TCP appears to be based > upon AAL5/UBR. Unspecified bitrate. Hence, all congestion control is > done at the TCP endpoints. I?m totally with you that this requires well > behaved participants in a network. IIRC, LANE works with ABR and that > will alleviate the problem. FYI the second biggest ISP in France (about 1.2M subscribers) gives its subscribers a write access to this "fastpath" interleave level, through a simple web interface. http://translate.google.com/translate?u=http%3A%2F%2Fadsl.free.fr%2Fadmin%2Ffast_path.html&langpair=fr%7Cen&hl=en Of course you also have access to error stats on the DSL line. All gamers know about and love this feature. It helps them gain about 30ms, a huge benefit for "real-time" games. And they don't care much about the rest. Other subscribers don't care and use the default, conservative setting. So everyone is happy with this well-designed feature... I guess that if this feature was "flooding the network with malicious frames" or something, the ISP would obviously not have offered it. 
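In the spirit of Filipe's ping experiment earlier in the thread, the queueing component of an access link's RTT can be estimated crudely by taking the smallest observed sample as the propagation-plus-transmission floor and looking at how far the other samples sit above it. The small script below is only a sketch (the script name and host are placeholders); it reads ordinary ping output on stdin and parses the "time=" fields:

    # Sketch: split ping RTTs into a fixed floor (the minimum) and a
    # queueing component, following the minimum-filtering idea
    # discussed earlier in this thread.
    # Usage (host is a placeholder):  ping -c 20 example.org | python rtt_queue.py
    import re
    import sys

    rtts = [float(m.group(1)) for m in re.finditer(r"time=([\d.]+) ms", sys.stdin.read())]
    if rtts:
        floor = min(rtts)
        excess = [r - floor for r in rtts]
        print("samples: %d   floor: %.1f ms" % (len(rtts), floor))
        print("queueing component  avg: %.1f ms   max: %.1f ms"
              % (sum(excess) / len(excess), max(excess)))

Applied to the two traces Filipe posted, this attributes an average of roughly 110 ms of queueing (with peaks near 380 ms) to the 6 KByte/s upload case, versus under 10 ms for the 3 KByte/s case - consistent with his min/avg/max lines.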
From detlef.bosau at web.de Tue Aug 23 15:09:11 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 24 Aug 2005 00:09:11 +0200 Subject: [e2e] Question about propagation and queuing delays References: <43094A36.1040402@reed.com> <430A18BF.5030202@web.de> <430A6258.9090603@cs.waikato.ac.nz> <430B23DB.8050101@web.de> <430B6AAB.9040804@swin.edu.au> Message-ID: <430B9E87.6070504@web.de> grenville armitage wrote: > Detlef Bosau wrote: > [..] > >> Just to give one example from there: we recently had a discussion >> about "Fastpath". On DSL lines, you need error recovery on the last >> mile. Now, to save overhead, you do code spreading/interleaving. Some >> "well informed guys" want the ISP to turn interleaving off in order to >> spare some "ping time". First of all, it's simply ridiculous that >> individual customers without any technical knowledge prescribe >> to the provider the appropriate line coding for one individual wire pair. > > I'm curious if you have any stats on the typical error rates that will > be experienced by the people who switch from Interleave to Fastpath mode No. I haven't. And I expect they will depend heavily on where you are. I don't think an ISP will reveal this to customers. > on their DSL links. It is certainly true that e.g. gamers find Interleave > mode to be a pain (at least 20ms additional latency), with good reason. > Stats on how much packet loss the gamer will experience (in order to gain > the latency improvement of Fastpath) would be interesting to know. If the > error rates are low, then in fact it seems like an entirely reasonable > thing for a customer to desire Fastpath. If. The problem - besides the fact that this may be off topic on this list, so perhaps we should continue this discussion via PM - is that it is hardly possible to make a decision here without any knowledge of the quality of the line. In addition, from my own professional experience, it is necessary to have a clear service description and service level agreement in any sort of contract. So, ISP and customer agree upon _WHAT_ is offered and not upon _HOW_ it's implemented. Particularly the "ping times" are often wonderful "promises", based upon "there was an interview with a famous guy on this topic in the tabloids....", and then a customer expects a certain QoS. So "fastpath" may end up in a hidden, unspoken "unilateral QoS contract". What happens if, for some reason, the latency increases? As a provider, you must not promise anything you might not be able to keep. It's always a liability issue. And if one customer is located in Markl and the other one in Flensburg, it doesn't matter where these locations are, but there may be a distance of 1000 kilometers in between; this is of course different from a situation where both people are located in Flensburg. Customers typically aren't aware of this. They have read in the tabloids: "Fastpath will guarantee you ping times of 20 ms." It does not say from where to where. It only says: "20 ms". So, when NASA starts its mission to Mars, you should enable fastpath. Then you will have ping times to the spacecraft of 20 ms. > >> Second: not only these customers may be affected by increasing error >> rates: these guys flood large portions of the network with defective >> frames, more precisely with defective ATM cells with corrupted >> payload, which is eventually detected at the customer's AAL 5 >> peer. (At least AFAIK.)
>> This is a thoughtless waste of bandwidth, but it is nearly impossible to >> convince those guys that this is malicious in quite a number of cases! > > I'm also curious about this "large portions of the network" which is > carrying > useless ATM AAL5 cells. If the bit error rate is so bad that a noticeable > fraction > of ATM cells are useless, then gamers are going to have a bad packet loss > rate and > fairly quickly go back to interleave mode (despite the higher latency). When they become aware of this. Until yesterday, DSL 1000 was sufficient for online gamers. As of today, customers threaten the providers with lawsuits in order to get "DSL 6000", because otherwise the online game won't work any longer. It was written in the tabloids, you know. It's like computer worms and viruses. These are typical issues in the evening news on TV here in Germany. > Yet if > the gamers are happy with the typical loss rate using Fastpath, then > there's > probably not that many wasted/useless ATM cells floating around. > > (Naturally, if the actual AAL_PDU loss rate starts to become more than a > percent or so, the gamer's use of TCP for p2p, web surfing and email > becomes > problematic. But it is hard to argue this by hand-waving - we need stats on > the likely bit error rates a typical DSL customer is likely to see using > Fastpath.) > I am totally with you. And even that's the reason why line coding should be left to the provider, who _has_ statistics and can make a decision based upon them. It's what I said before: ISP and customer shall agree upon _what_ is provided. Not _how_ it's provided. However, to get on topic again: basically, I talked about DSL as an _example_ of why things can get more complex and more complicated than they perhaps were in the mid-1980s between UCLA, UCSD and UCBE. Basically, I started some years ago with the question: why do we expect difficulties with TCP in mobile wireless networks? Then, some thousands of papers later and having read tons of paper (is there any forest left?) about varying bandwidth, spurious timeouts, adverse interactions, scheduling problems and other problems which are of course scary - and occur on each and every company LAN with even no wireless component in it - I got into the details of TCP timer estimation. As one would expect, this ended up in the question: why does the Internet work at all? Obviously, it does. _I_ want to understand _if_, and if so _why_, there are problems with TCP in mobile wireless networks. It's not convincing that there are dozens of PhD theses around which claim there are problems here, as long as there is no convincing reason for this. Simulations are not convincing (they prove anything and nothing - whatever you prefer) and "occasional observations" (recall the "cold fusion") aren't either. When we talk about problems with TCP in mobile wireless networks, we must give _reasons_ why there could be problems. Anything else is playing. And not science. When you look at my homepage, you'll find my Path Tail Emulation paper there. I did not write a second paper yet, so a number of objections must be discussed and a number of corrections must be made in later versions. However, at the moment, it is not the question _how_ to solve the "problem" with TCP in mobile wireless networks. It is the question _IF_ there is a problem at all. And the fact that Ludwig, Gurtov, Chakravorty and thousands of others have written tons of papers does not by itself mean that there is one.
I have read a great deal of this stuff, and there are problems in NS2 caused by HICCUP, and this shall make me believe there is a problem in reality, without NS2 and HICCUP. Bash me, beat me, excuse me, but that is not convincing. Either we identify where TCP is vulnerable, or where the "system model" assumed by TCP is violated by mobile wireless networks - or this is all guesswork and hand-waving. That is the reason why I talked about an "urban legend" here yesterday.

You may well say that I question quite a couple of PhD theses and whether it was justified to award the candidates the degree. If you do so, you have understood perfectly what I mean. What I have read so far on this issue is not convincing - but simply sloppy. And if I couldn't do it significantly better, I wouldn't write a second paper. But _if_ I do, and I'm trying to do so, there must be a sound basis in this. And not these "irreproducible observations" and (by NS2) "repeated assertions" I've read so far.

And an embarrassing example of sloppiness is the "Adaptive Pacing" paper by ElRakabawy, Klemm and Lindemann at Mobihoc this year. Taking two or three repeated assertions ("simulations") as a proof of a fundamental, but questionable, theorem cannot be accepted, and I wonder why the paper was accepted by the reviewers. The paper is nicely written, there are nice figures and tables.... But for the benefit of a conference's reputation, there should be some content in there as well, and if the content is correct this is even better.

Detlef

--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From david.hagel at gmail.com Tue Aug 23 15:06:41 2005
From: david.hagel at gmail.com (David Hagel)
Date: Tue, 23 Aug 2005 18:06:41 -0400
Subject: [e2e] Question about propagation and queuing delays
In-Reply-To: <430B505C.3000303@inescporto.pt>
References: <43094A36.1040402@reed.com> <430B505C.3000303@inescporto.pt>
Message-ID: 

From all that I hear, access links seem like the main culprits that cause most of the congestion today. But so many of the current congestion control evaluations focus on alleviating congestion in the network core. Perhaps a simpler network model, in which only access links can be the bottlenecks, might yield much simpler congestion control solutions? Has there been any work in this direction? With all the non-TCP applications (like VoIP) emerging on the horizon, does relying on TCP alone for congestion control make much sense?

(List moderators -- in case I am stirring some old soup of discussion on this list, please feel free to kill this thread. I am new to this list.)

- Dave

On 8/23/05, Filipe Abrantes wrote:
> Hello David,
>
> David Hagel wrote:
> > Thanks, this is interesting. I asked the same question on nanog and
> > got similar responses: that queuing delay is negligible on todays
> > backbone networks compared to other fixed delay components
> > (propagation, store-and-forward, transmission etc). Response on nanog
> > seems to indicate that queuing delay is almost irrelevant today.
> >
> > This may sound like a naive question. But if queuing delays are so
> > insignificant in comparison to other fixed delay components then what
> > does it say about the usefulness of all the extensive techniques for
> > queue management and congestion control (including TCP congestion
> > control, RED and so forth) in the context of today's backbone
> > networks? Any thoughts? Are the congestion control researchers out of
> > touch with reality?
> > > > The latencies mentioned by David Reed are in the case of a > non-congestioned path, and how it was already referred here, nowadays > the most common case is to have our access link (xDSL/cable...) at > home/office to be the bottleneck (the ping would struggle to fill the > link right?). So, to get an approximate value for the maximum queueing > delays you should try a ping when you have background traffic that fully > utilizes your access link. > > Congestion Control only plays an active role when there is a bottleneck > in the path... (well not totally true as the guys from the > high-bandwidth delay and lossy paths may tell you). > > As to queue management, one of it's goals is also to promote fairness > between flows (TCP is not that good at it), so i can see some usefulness > in them too. If the final result is actually good enough I still don't > know (I havent' gone too deep into this issue). > > I just did a ping to my home which is on a 2Mb-dl/128kbit-ul cable > connection from work (where I am) to exemplify this. At home I started a > P2P program which had the upload capped at 6KByte/s (capped by the > application, so there could be instantaneous overloads i think) I got this: > (the upload link was the bottleneck as my download was well below the dl > limit) > > $ ping xxxxxxxx.no-ip.org > PING xxxxxxx.no-ip.org (83.132.76.xx) 56(84) bytes of data. > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=1 > ttl=52 time=71.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=2 > ttl=52 time=109 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=3 > ttl=52 time=88.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=4 > ttl=52 time=29.5 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=5 > ttl=52 time=399 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=6 > ttl=52 time=307 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=7 > ttl=52 time=131 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=8 > ttl=52 time=78.6 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=9 > ttl=52 time=87.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=10 > ttl=52 time=54.2 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=11 > ttl=52 time=93.7 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=12 > ttl=52 time=22.4 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=13 > ttl=52 time=21.8 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=14 > ttl=52 time=45.2 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=16 > ttl=52 time=251 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=17 > ttl=52 time=22.1 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=18 > ttl=52 time=297 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=19 > ttl=52 time=290 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=20 > ttl=52 time=280 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=21 > ttl=52 time=21.0 ms > > --- xxxxxxxx.no-ip.org ping statistics --- > 21 packets transmitted, 20 received, 4% packet loss, time 20020ms > rtt min/avg/max/mdev = 21.046/135.229/399.044/117.287 ms > > > Then I capped the upload at 3Kbyte/s and got this: > > $ ping xxxxxxxx.no-ip.org > PING 
xxxxxxx.no-ip.org (83.132.76.xx) 56(84) bytes of data. > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=1 > ttl=52 time=22.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=2 > ttl=52 time=22.2 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=3 > ttl=52 time=88.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=4 > ttl=52 time=34.3 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=5 > ttl=52 time=23.3 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=6 > ttl=52 time=24.6 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=7 > ttl=52 time=25.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=8 > ttl=52 time=22.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=9 > ttl=52 time=20.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=10 > ttl=52 time=52.5 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=11 > ttl=52 time=21.4 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=12 > ttl=52 time=30.9 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=13 > ttl=52 time=21.4 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=14 > ttl=52 time=42.8 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=15 > ttl=52 time=20.5 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=16 > ttl=52 time=22.0 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=17 > ttl=52 time=24.7 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=18 > ttl=52 time=24.4 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=19 > ttl=52 time=21.3 ms > 64 bytes from a83-132-76-xx.cpe.netcabo.pt (83.132.76.xx): icmp_seq=20 > ttl=52 time=20.6 ms > > --- xxxxxxxx.no-ip.org ping statistics --- > 20 packets transmitted, 20 received, 0% packet loss, time 19018ms > rtt min/avg/max/mdev = 20.593/29.482/88.992/15.843 ms > > > As you can see, queueing delays are noticeable. > > Best Regards > > Filipe Abrantes > > > > - Dave > > > > > > On 8/21/05, David P. Reed wrote: > > > >>I can repeatably easily measure 40 msec. coast-to-coast (Boston-LA), of > >>which around 25 msec. is accounted for by speed of light in fiber (which > >>is 2/3 of speed of light in vacuum, *299,792,458 m s^-1 *, because the > >>refractive index of fiber is approximately 1.5 or 3/2). So assume 2e8 > >>m/s as the speed of light in fiber, 1.6e3 m/mile, and you get 1.25e5 > >>mi/sec. > >> > >>The remaining 15 msec. can be accounted for by the fiber path not being > >>straight line, or by various "buffering delays" (which include queueing > >>delays, and scheduling delays in the case where frames are scheduled > >>periodically and you have to wait for the next frame time to launch your > >>frame). > >> > >>Craig Partridge and I have debated (offline) what the breakdown might > >>actually turn out to be (he thinks the total buffering delay is only 2-3 > >>msec., I think it's more like 10-12), and it would be quite interesting > >>to get more details, but that would involve delving into the actual > >>equipment deployed and its operating modes. > >> > > > > > > -- > Filipe Lameiro Abrantes > INESC Porto > Campus da FEUP > Rua Dr. 
Roberto Frias, 378 > 4200-465 Porto > Portugal > > Phone: +351 22 209 4266 > E-mail: fla at inescporto.pt > From alexkr at cisco.com Tue Aug 23 15:21:43 2005 From: alexkr at cisco.com (Alex Krivonosov (alexkr)) Date: Tue, 23 Aug 2005 15:21:43 -0700 Subject: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP Message-ID: Hi, Can anybody help me to solve this issue? I have a TCP connection handled by the completion port IO model. What is happening is in case I specify a large buffer for receiving (WSARecv), the operation completes only after the buffer is full, not after receiving about 500 bytes (a packet), so a significant delay is introduced. In case of small buffers, performance degrades. Any advice on this? Completion port model is a must. Thank you Alex Krivonosov -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.postel.org/pipermail/end2end-interest/attachments/20050823/ba58f2c7/attachment.html From lars.eggert at netlab.nec.de Wed Aug 24 01:06:42 2005 From: lars.eggert at netlab.nec.de (Lars Eggert) Date: Wed, 24 Aug 2005 10:06:42 +0200 Subject: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP In-Reply-To: References: Message-ID: <0EC343A4-094D-4EE6-9428-7D3FE03CB83E@netlab.nec.de> On Aug 24, 2005, at 0:21, Alex Krivonosov (alexkr) wrote: > I have a TCP connection handled by the completion port IO model. > What is happening is in case I specify a large buffer for receiving > (WSARecv), the operation completes only after the buffer is full, > not after receiving about 500 bytes (a packet), so a significant > delay is introduced. In case of small buffers, performance > degrades. Any advice on this? Completion port model is a must. Please understand that TCP doesn't deliver "packets" to the application, it provides a byte stream. You may want to look into using non-blocking I/O for the receive call. (I don't know what you mean by "completion port model.") Lars -- Lars Eggert NEC Network Laboratories -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3686 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050824/9ed62f66/smime-0001.bin From braden at ISI.EDU Wed Aug 24 05:43:01 2005 From: braden at ISI.EDU (Bob Braden) Date: Wed, 24 Aug 2005 05:43:01 -0700 Subject: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP In-Reply-To: <0EC343A4-094D-4EE6-9428-7D3FE03CB83E@netlab.nec.de> References: Message-ID: <5.1.0.14.2.20050824053938.024567a0@boreas.isi.edu> BTW, the designers of TCP included a protocol mechanism to deal with this problem. It is called "Push". The Berkeley folks chose to ignore it when they developed the socket interface, because it did not fit into their model of a connection as a virtual file. Interesting example of an atrophied protocol mechanism. Bob Braden At 10:06 AM 8/24/2005 +0200, Lars Eggert wrote: >On Aug 24, 2005, at 0:21, Alex Krivonosov (alexkr) wrote: >>I have a TCP connection handled by the completion port IO model. >>What is happening is in case I specify a large buffer for receiving >>(WSARecv), the operation completes only after the buffer is full, >>not after receiving about 500 bytes (a packet), so a significant >>delay is introduced. In case of small buffers, performance >>degrades. Any advice on this? Completion port model is a must. 
> >Please understand that TCP doesn't deliver "packets" to the
> >application, it provides a byte stream. You may want to look into
> >using non-blocking I/O for the receive call. (I don't know what you
> >mean by "completion port model.")
> >
> >Lars
> >--
> >Lars Eggert NEC Network Laboratories

From s.malik at tuhh.de Wed Aug 24 06:30:18 2005
From: s.malik at tuhh.de (Sireen Habib Malik)
Date: Wed, 24 Aug 2005 15:30:18 +0200
Subject: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP
In-Reply-To: 
References: 
Message-ID: <430C766A.4000801@tuhh.de>

> >I have a TCP connection handled by the completion port IO model. What is
> >happening is in case I specify a large buffer for receiving (WSARecv),
> >the operation completes only after the buffer is full, not after
> >receiving about 500 bytes (a packet), so a significant delay is
> >introduced. In case of small buffers, performance degrades. Any advice
> >on this? Completion port model is a must.

No idea what a "completion port model" is! Here are some hints for your question.

Large delay for large buffer is intuitively clear.

For the small buffers, consider the relation:

  Maximum Window (Wmax) = BufferSize + Capacity*RTT

(assuming an error-free medium, and that Capacity is the speed of the ONLY bottleneck). For the more common TCP versions, Wmax should not be so small that TCP does not get a chance to get out of the Slow-Start phase. For triple duplicate ACKs to arrive, there should be at least 3 packets successfully delivered after the lost packet. Otherwise, TCP times out. If your client access speed is small, give your connection enough buffer space to at least get to the saw-tooth behavior.

The other possibility is that with a large buffer the connection can operate at the Maximum Congestion Window; however, with a small buffer you are forcing it to go into the saw-tooth congestion control. So poorer performance, relatively speaking.

Hope that helps.

--
SM

From Anil.Agarwal at viasat.com Wed Aug 24 08:01:44 2005
From: Anil.Agarwal at viasat.com (Agarwal, Anil)
Date: Wed, 24 Aug 2005 11:01:44 -0400
Subject: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP
Message-ID: 

All,

"Completion Port Model" is a Windows-specific mechanism. I suspect this question requires some Windows expertise, rather than TCP expertise.

See links below for some info on this topic -
http://www.nevelsteen.com/coding/completion_ports_in_delphi.htm
http://www.sysinternals.com/Information/IoCompletionPorts.html

Anil Agarwal
Viasat Inc.
22300 Comsat Dr.
Clarksburg, MD 20871
(W) 301-428-4655
Anil.Agarwal at viasat.com

-----Original Message-----
From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Sireen Habib Malik
Sent: Wednesday, August 24, 2005 9:30 AM
To: Alex Krivonosov (alexkr)
Cc: end2end-interest at postel.org
Subject: Re: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP

> >I have a TCP connection handled by the completion port IO model. What is
> >happening is in case I specify a large buffer for receiving (WSARecv),
> >the operation completes only after the buffer is full, not after
> >receiving about 500 bytes (a packet), so a significant delay is
> >introduced. In case of small buffers, performance degrades. Any advice
> >on this? Completion port model is a must.

No idea what a "completion port model" is! Here are some hints for your question.

Large delay for large buffer is intuitively clear.
For the small buffers, consider the relation: Maximum Window (Wmax) = BufferSize+ Capacity*RTT (assuming error-free medium, and that Capacity is the speed of the ONLY bottleneck). For the more common TCP versions, Wmax should not be so small that TCP does not get a chance to get out of the Slow-Start phase. For triple-duplicates to arrive, there should be atleast 3 packets successfully delievered after the lost packet. Otherwise, TCP time-outs. If your client access-speed is small, give your connection enough buffer-space to "atleast" get to the saw-tooth behavior. The other possibility is that with a large buffer the connection can operate at the Maximum Congestion Window, however, with a small buffer you are forcing it go into the saw-tooth congestion control. So poorer performance, relatively speaking. Hope that helps. -- SM From fred at cisco.com Wed Aug 24 02:36:47 2005 From: fred at cisco.com (Fred Baker) Date: Wed, 24 Aug 2005 17:36:47 +0800 Subject: [e2e] Question about propagation and queuing delays In-Reply-To: <13B1CFCA-3291-4B04-8CC4-D711D4423486@cisco.com> References: <43094A36.1040402@reed.com> <13B1CFCA-3291-4B04-8CC4-D711D4423486@cisco.com> Message-ID: <0D1E327F-5E78-49ED-A132-D048995E144A@cisco.com> So I am sitting in a meeting room at APAN, which is meeting in Taipei. I happen to be VPN'd into Cisco in San Jose, but I shut that down to develop a traceroute for your benefit. The traceroute from here to Cisco is: traceroute to irp-view7.cisco.com (171.70.65.144), 64 hops max, 40 byte packets 1 ip-242-001 (140.109.242.1) 8.177 ms 10.311 ms 16.018 ms 2 ae-0-10.br0.tpe.tw.rt.ascc.net (140.109.251.50) 2.096 ms 66.035 ms 49.755 ms 3 s4-1-1-0.br0.pax.us.rt.ascc.net (140.109.251.105) 206.316 ms 162.307 ms 259.891 ms 4 so-5-1.hsa4.sanjose1.level3.net (64.152.81.9) 130.915 ms 274.471 ms 304.699 ms 5 so-2-1-0.bbr2.sanjose1.level3.net (4.68.114.157) 132.229 ms 176.587 ms 135.330 ms 6 ge-11-0.ipcolo1.sanjose1.level3.net (4.68.123.41) 134.507 ms ge-11-2.ipcolo1.sanjose1.level3.net (4.68.123.169) 131.669 ms ge-11-0.ipcolo1.sanjose1.level3.net (4.68.123.41) 134.544 ms 7 p1-0.cisco.bbnplanet.net (4.0.26.14) 130.734 ms 131.757 ms 140.291 ms 8 sjck-dmzbb-gw1.cisco.com (128.107.239.9) 146.848 ms 132.394 ms 168.201 ms ... I ran a ping (through the VPN) to a server inside Cisco. While I did that, I downloaded a number of files. The variation in ping delay is: 225 packets transmitted, 222 packets received, 1% packet loss round-trip min/avg/max/stddev = 132.565/571.710/2167.062/441.876 ms The peak rate sftp reported was about 141.3 KB/s, and the least rate was 34.2 KB/s. The difference most likely relates to the effects of packet loss (1.3% loss is non-negligible), delay variation (a standard deviation in ping RTT of 442 ms and an absolute variation in delay of 2034 ms are also non-negligible), the effects of slow-start and fast-retransmit procedures, or the bandwidth remaining while other users also made use of the link. What this demonstrates is the variation in delay that happens around bottlenecks in the Internet, and why folks that worry about TCP/SCTP congestion management procedures are not playing with recreational pharmaceuticals. I won't speculate where this bottleneck is beyond saying I'll bet it's in one of the first few hops of that traceroute - the access path. On Aug 23, 2005, at 5:50 AM, Fred Baker wrote: > no, but there are different realities, and how one measures them is > also relevant. 
> > In large fiber backbones, within the backbone we generally run 10:1 > overprovisioned or more. within those backbones, as you note, the > discussion is moot. But not all traffic stays within the cores of > large fiber backbones - much of it is originated and terminates in > end systems located in homes and offices. > > The networks that connect homes and offices to the backbones are > often constrained differently. For example, my home (in an affluent > community in California) is connected by Cable Modem, and the > service that I buy (business service that in its AUP accepts a VPN, > unlike the same company's residential service) guarantees a certain > amount of bandwidth, and constrains me to that bandwidth - measured > in KBPS. I can pretty easily fill that, and when I do certain > services like VoIP don't work anywhere near as well. So I wind up > playing with the queuing of traffic in the router in my home to > work around the service rate limit in my ISP. As I type this > morning (in a hotel in Taipei), the hotel provides an access > network that I share with the other occupants of the hotel. It's > not uncommon for the entire hotel to share a single path for all of > its occupants, and that single path is not necessarily in MBPS. > And, they tell me that the entire world is not connected by large > fiber cores - as soon as you step out of the affluent > industrialized countries, VSAT, 64 KBPS links, and even 9.6 access > over GSM become the access paths available. > > As to measurement, note that we generally measure that > overprovisioning by running MRTG and sampling throughput rates > every 300 seconds. When you're discussing general service levels > for an ISP, that is probably reasonable. When you're measuring time > variations on the order of milliseconds, that's a little like > running a bump counter cable across a busy intersection in your > favorite downtown, reading the counter once a day, and drawing > inferences about the behavior of traffic during light changes > during rush hour... > > http://www.ieee-infocom.org/2004/Papers/37_4.PDF has an interesting > data point. They used a much better measurement methodology, and > one of the large networks gave them some pretty cool access in > order to make those tests. Basically, queuing delays within that > particular very-well-engineered large fiber core were on the order > of 1 ms or less during the study, with very high confidence. But > the same data flows frequently jumped into the 10 ms range even > within the 90% confidence interval, and a few times jumped to 100 > ms or so. The jumps to high delays would most likely relate to > correlated high volume data flows, I suspect, either due to route > changes or simple high traffic volume. > > The people on NANOG and the people in the NRENs live in a certain > ivory tower, and have little patience with those who don't. They > also measure the world in a certain way that is easy for them. > > > On Aug 23, 2005, at 12:13 AM, David Hagel wrote: > > >> Thanks, this is interesting. I asked the same question on nanog >> and got similar responses: that queuing delay is negligible on >> todays backbone networks compared to other fixed delay components >> (propagation, store-and-forward, transmission etc). Response on >> nanog seems to indicate that queuing delay is almost irrelevant >> today. >> >> This may sound like a naive question. 
But if queuing delays are so >> insignificant in comparison to other fixed delay components then >> what does it say about the usefulness of all the extensive >> techniques for queue management and congestion control (including >> TCP congestion control, RED and so forth) in the context of >> today's backbone networks? Any thoughts? Are the congestion >> control researchers out of touch with reality? >> >> - Dave >> >> On 8/21/05, David P. Reed wrote: >> >>> I can repeatably easily measure 40 msec. coast-to-coast (Boston- >>> LA), of which around 25 msec. is accounted for by speed of light >>> in fiber (which is 2/3 of speed of light in vacuum, *299,792,458 >>> m s^-1 *, because the refractive index of fiber is approximately >>> 1.5 or 3/2). So assume 2e8 m/s as the speed of light in fiber, >>> 1.6e3 m/mile, and you get 1.25e5 mi/sec. >>> >>> The remaining 15 msec. can be accounted for by the fiber path not >>> being straight line, or by various "buffering delays" (which >>> include queueing delays, and scheduling delays in the case where >>> frames are scheduled periodically and you have to wait for the >>> next frame time to launch your frame). >>> >>> Craig Partridge and I have debated (offline) what the breakdown >>> might actually turn out to be (he thinks the total buffering >>> delay is only 2-3 msec., I think it's more like 10-12), and it >>> would be quite interesting to get more details, but that would >>> involve delving into the actual equipment deployed and its >>> operating modes. >>> > > From sampad_m at rediffmail.com Wed Aug 24 09:06:51 2005 From: sampad_m at rediffmail.com (sampad mishra) Date: 24 Aug 2005 16:06:51 -0000 Subject: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP Message-ID: <20050824160651.789.qmail@webmail8.rediffmail.com> On Wed, 24 Aug 2005 Lars Eggert wrote : >On Aug 24, 2005, at 0:21, Alex Krivonosov (alexkr) wrote: >>I have a TCP connection handled by the completion port IO model. What is happening is in case I specify a large buffer for receiving (WSARecv), the operation completes only after the buffer is full, not after receiving about 500 bytes (a packet), so a significant delay is introduced. In case of small buffers, performance degrades. Any advice on this? Completion port model is a must. > >Please understand that TCP doesn't deliver "packets" to the application, it provides a byte stream. You may want to look into using non-blocking I/O for the receive call. (I don't know what you mean by "completion port model.") > >Lars What Lars said is right, TCP doen't deliver "packets" to the application. Now in your case I think it is going into the blocking mode. One way to verify is, check the return value, Result = WSARecv(....) If the socket is non blocking , it would return WSAEWOULDBLOCK. You have to handle this case using WSAAsyncSelect(SOCKET id , HWND , uint msg,combination of events(like FD_READ,FD_WRITE , etc) Now handle these messages(FD_READ for reading,....) in ur WindowProc of the window specified. You have to go through the MSDN document to get a clear picture... u can use the chunk of code illustarted below: Result = WSARecv(....) 
if (Result == SOCKET_ERROR)
{
    Error = WSAGetLastError();
    switch (Error)
    {
    case WSAENETRESET:        // fall through
    case WSAECONNRESET:
        return FALSE;
    case WSAEWOULDBLOCK:
        // no data yet: ask Windows to post the application-defined
        // window message when the socket becomes readable/writable
        WSAAsyncSelect(SOCKID, HWND, WM_TCP_NET_MESSAGE,
                       FD_CONNECT | FD_READ | FD_WRITE | FD_CLOSE);
        return FALSE;
    default:
        return FALSE;
    }
}

Well, I'm not sure whether this is what you wanted; nevertheless this might still help.

Regards,
Sampad Mishra.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.postel.org/pipermail/end2end-interest/attachments/20050824/3e3fe90e/attachment.html

From sommerfeld at sun.com Wed Aug 24 09:23:01 2005
From: sommerfeld at sun.com (Bill Sommerfeld)
Date: Wed, 24 Aug 2005 12:23:01 -0400
Subject: [e2e] Question about propagation and queuing delays
In-Reply-To: <0D1E327F-5E78-49ED-A132-D048995E144A@cisco.com>
References: <43094A36.1040402@reed.com> <13B1CFCA-3291-4B04-8CC4-D711D4423486@cisco.com> <0D1E327F-5E78-49ED-A132-D048995E144A@cisco.com>
Message-ID: <1124900580.7308.36.camel@thunk>

On Wed, 2005-08-24 at 05:36, Fred Baker wrote:
> The peak rate sftp reported was about 141.3 KB/s, and the least rate
> was 34.2 KB/s.
> The difference most likely relates to the effects of
> packet loss (1.3% loss is non-negligible), delay variation (a
> standard deviation in ping RTT of 442 ms and an absolute variation in
> delay of 2034 ms are also non-negligible), the effects of slow-start
> and fast-retransmit procedures, or the bandwidth remaining while
> other users also made use of the link.

If by sftp you mean the protocol described in some version of draft-ietf-secsh-filexfer running over ssh, I'd be cautious in using it as a path benchmark.

ssh muxes multiple stream channels over a single tcp connection, and as a result does its own flow control on a per-channel basis so that implementations aren't forced to choose between unlimited buffer usage within implementations or deadlock.

There have been anecdotal reports to the secure shell wg that default channel window sizes in commonly used implementations are far from optimal, but I haven't heard any updates in a while. But then, I'm just the cat-herder for that WG and not an implementor of that protocol family...

- Bill

From alexkr at cisco.com Wed Aug 24 09:22:59 2005
From: alexkr at cisco.com (Alex Krivonosov (alexkr))
Date: Wed, 24 Aug 2005 09:22:59 -0700
Subject: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP
Message-ID: 

Sireen,

This issue has nothing to do with the protocol operation; this is clearly an internal Windows problem. While using blocking sockets, it does not happen, independent of the buffer size.

Alex

-----Original Message-----
From: Sireen Habib Malik [mailto:s.malik at tuhh.de]
Sent: Wednesday, August 24, 2005 6:30 AM
To: Alex Krivonosov (alexkr)
Cc: end2end-interest at postel.org
Subject: Re: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP

> >I have a TCP connection handled by the completion port IO model. What
> >is happening is in case I specify a large buffer for receiving
> >(WSARecv), the operation completes only after the buffer is full, not
> >after receiving about 500 bytes (a packet), so a significant delay is
> >introduced. In case of small buffers, performance degrades. Any advice
> >on this? Completion port model is a must.

No idea what a "completion port model" is! Here are some hints for your question.

Large delay for large buffer is intuitively clear.

For the small buffers, consider the relation:

  Maximum Window (Wmax) = BufferSize + Capacity*RTT

(assuming an error-free medium, and that Capacity is the speed of the ONLY bottleneck). For the more common TCP versions, Wmax should not be so small that TCP does not get a chance to get out of the Slow-Start phase. For triple duplicate ACKs to arrive, there should be at least 3 packets successfully delivered after the lost packet. Otherwise, TCP times out. If your client access speed is small, give your connection enough buffer space to at least get to the saw-tooth behavior.

The other possibility is that with a large buffer the connection can operate at the Maximum Congestion Window; however, with a small buffer you are forcing it to go into the saw-tooth congestion control. So poorer performance, relatively speaking.

Hope that helps.
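To put some illustrative numbers on the Wmax relation above, here is a minimal C sketch. The link speed, RTT, buffer size and segment size below are assumed values for illustration only, not measurements from this thread:

  /* Wmax = BufferSize + Capacity*RTT; Capacity*RTT is the bandwidth-delay
   * product of the bottleneck. All numbers are made up for illustration. */
  #include <stdio.h>

  int main(void)
  {
      double capacity_bps = 128e3;   /* assumed bottleneck: 128 kbit/s uplink */
      double rtt_s        = 0.2;     /* assumed round-trip time: 200 ms       */
      double buffer_bytes = 8192.0;  /* assumed bottleneck queue: 8 KB        */
      double mss_bytes    = 500.0;   /* segment size mentioned in the thread  */

      double bdp_bytes  = capacity_bps / 8.0 * rtt_s;   /* Capacity*RTT */
      double wmax_bytes = buffer_bytes + bdp_bytes;

      printf("BDP  = %.0f bytes (%.1f segments)\n", bdp_bytes, bdp_bytes / mss_bytes);
      printf("Wmax = %.0f bytes (%.1f segments)\n", wmax_bytes, wmax_bytes / mss_bytes);

      /* Fast retransmit needs three duplicate ACKs, i.e. roughly four or more
       * segments in flight after a loss; below that, recovery falls back to
       * the retransmission timer. */
      if (wmax_bytes / mss_bytes < 4.0)
          printf("Window too small for triple-duplicate-ACK recovery\n");
      return 0;
  }

With these assumed numbers the BDP is only 3200 bytes, so the buffer term dominates Wmax; the point of the relation is simply that shrinking the buffer shrinks the window the connection can sustain.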
-- SM From randy at psg.com Wed Aug 24 10:49:42 2005 From: randy at psg.com (Randy Bush) Date: Wed, 24 Aug 2005 10:49:42 -0700 Subject: [e2e] Question about propagation and queuing delays References: <43094A36.1040402@reed.com> <13B1CFCA-3291-4B04-8CC4-D711D4423486@cisco.com> <0D1E327F-5E78-49ED-A132-D048995E144A@cisco.com> Message-ID: <17164.45878.715907.892524@roam.psg.com> > The traceroute from here to Cisco is: > > traceroute to irp-view7.cisco.com (171.70.65.144), 64 hops max, 40 > byte packets > 1 ip-242-001 (140.109.242.1) 8.177 ms 10.311 ms 16.018 ms one can not measure to you as there is blockage traceroute to 140.109.242.1 (140.109.242.1), 64 hops max, 40 byte packets 1 psg2 (147.28.0.5) 0.268 ms 0.283 ms 0.355 ms 2 ... 10 s4-4-3-0.br0.tpe.tw.rt.ascc.net (140.109.251.106) 167.259 ms 166.778 ms 166.749 ms 11 * * * but one can measure up to the block and pathchar to 140.109.251.106 (140.109.251.106) mtu limited to 1500 bytes at local host doing 32 probes at each of 45 sizes (64 to 1500 by 32) 0 rip (147.28.0.39) | 33 Mb/s, 133 us (631 us) 1 psg2 (147.28.0.5) | 42 Mb/s, 103 us (1.12 ms) 2 e2.psg1.psg.com (147.28.1.5) | ?? b/s, -93 us (0.91 ms) 3 sl-gw11-sea-0-1.sprintlink.net (144.232.9.61) | 182 Mb/s, 16 us (1.01 ms) 4 sl-bb20-sea-9-2.sprintlink.net (144.232.6.125) | 337 Mb/s, 135 us (1.31 ms) 5 so-3-0-0.gar1.Seattle1.Level3.net (209.0.227.133) | 1138 Mb/s, 28 us (1.38 ms) 6 so-7-0-0.mp1.Seattle1.Level3.net (64.159.1.81) | ?? b/s, 8.61 ms (18.5 ms) 7 as-0-0.bbr2.SanJose1.Level3.net (64.159.0.218) | 1873497444986126 Mb/s, 77 us (18.7 ms) 8 so-14-0.hsa4.SanJose1.Level3.net (4.68.114.158) | 256 Mb/s, 69 us (18.9 ms) 9 REACH-SERVIC.hsa4.Level3.net (64.152.81.10) | 32 Mb/s, 63.7 ms (147 ms), +q 21.3 ms (84.9 KB) 10 s4-4-3-0.br0.tpe.tw.rt.ascc.net (140.109.251.106) 10 hops, rtt 145 ms (147 ms), bottleneck 32 Mb/s, pipe 583768 bytes observe the queue on the 9/10 hop randy From dpreed at reed.com Thu Aug 25 07:37:34 2005 From: dpreed at reed.com (David P. Reed) Date: Thu, 25 Aug 2005 10:37:34 -0400 Subject: [e2e] Need help: setting winsock receive low watermark while using completion port and TCP In-Reply-To: <0EC343A4-094D-4EE6-9428-7D3FE03CB83E@netlab.nec.de> References: <0EC343A4-094D-4EE6-9428-7D3FE03CB83E@netlab.nec.de> Message-ID: <430DD7AE.5060101@reed.com> Lars Eggert wrote: > On Aug 24, 2005, at 0:21, Alex Krivonosov (alexkr) wrote: > >> I have a TCP connection handled by the completion port IO model. >> What is happening is in case I specify a large buffer for receiving >> (WSARecv), the operation completes only after the buffer is full, >> not after receiving about 500 bytes (a packet), so a significant >> delay is introduced. In case of small buffers, performance degrades. >> Any advice on this? Completion port model is a must. > > > Please understand that TCP doesn't deliver "packets" to the > application, it provides a byte stream. You may want to look into > using non-blocking I/O for the receive call. (I don't know what you > mean by "completion port model.") The definition of I/O completion in Winsock *is* buffer full. Size of buffer on receive is not a major performance problem (system calls aren't slow compared to processing), so if you want notification on 500 bytes, use 500 byte buffers. A thought you might not have considered: Perhaps you are sending your 500 byte messages, one per call, on the sender with TCP_NODELAY set? 
This could cause some performance problems if the source end has a fast link, but the receiving node has a slow absorption rate (the packets on the source will not combine into larger frames until the window fills up.) Of course that is exactly what TCP_NODELAY is for (minimizing message latency, but increasing network overhead) - if you don't care so much about latency, don't set TCP_NODELAY. (Or you can get very complex by using I/O completion based app-level output management on the send side to control the latency/efficiency tradeoff, using WSASendMessage to gather multiple frames adaptively and "delaying" sends at the app level until precise conditions hold related to your desired latency goal, and trying to gather 1-3 of your messages into single sends.)

From detlef.bosau at web.de Thu Aug 25 12:13:38 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Thu, 25 Aug 2005 21:13:38 +0200
Subject: [e2e] Retransmission Timouts revisited
Message-ID: <430E1862.23935C68@web.de>

I intentionally do not write "TCP", because this matter is not restricted to TCP: retransmission timeouts are required in _any_ protocol which has to cope with packet losses. It doesn't matter whether a timeout is detected on the sender or on the receiver. As long as there is no other mechanism to detect packet loss, we must rely on timeouts.

I just compared the rto algorithms given in Edge's paper from 1984 and the congavoid paper (obviously in some newer version, where rto is set to mean + 4*variation instead of mean + 2*variation, but this does not matter for the discussion). For simplicity, let's ignore Karn's algorithm and focus on the math for rto. We further assume the preconditions for Edge's work hold. (In fact, we must have a very close look at this, but in this discussion we assume they will hold.)

Then the difference is the choice of the variance estimator. Is this correct?

Both Edge and Jacobson/Karels estimate the RTT mean using an EWMA filter. Edge estimates the variance using an EWMA filter as well, unlike Jacobson, who uses an estimator which gives an approximation of the variance and is easier to calculate.

What makes me curious about that is that the rto given by Edge _essentially_ relies on (a one-tailed version of) Chebyshev's inequality. That's why I ranted the last few days when it came to spurious timeouts. There is much written about the multiplicative factor k in rto = mean + k*variation. However, instead of a qualitative "guess", Edge's paper derives an RTO which respects a _prescribed_ upper limit for an "unwanted retransmission probability", AKA spurious timeout probability.

When we consider Edge's formula

  RTO = mean + e * sqrt(sigma^2 * (1-Y)/Y)

and set e to 1 (which is appropriate because "e" does not appear in the derivation of the formula), then Y is the spurious timeout probability. (Refer to Edge's paper, formulae 20 ff. for details.) Practically, this means that if we choose

  rto = mean + 2*sigma = mean + sqrt(4*sigma^2),

then 4 = (1-Y)/Y, hence Y = 1/5. Hence, the spurious timeout probability has an upper limit of 1/5, _independent_ of the actual RTT distribution. As I said above, a newer version of the congavoid paper uses k = 4; then the spurious timeout probability is bounded by 1/17. This matter was even discussed in a paper by Leung, Klein, Mooney and Haner, who proposed to further increase the RTO to avoid spurious timeouts.

Of course, one can have a religious discussion here.
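For reference, the arithmetic behind these bounds is just the one-sided Chebyshev relation Y = 1/(1 + k^2); a tiny C sketch of it (my illustration, not code from either paper):

  /* One-sided Chebyshev bound used in Edge's derivation:
   *   P(rtt > mean + k*sigma) <= 1/(1 + k^2) = Y   <=>   k = sqrt((1-Y)/Y)  */
  #include <stdio.h>
  #include <math.h>

  static double y_from_k(double k) { return 1.0 / (1.0 + k * k); }
  static double k_from_y(double y) { return sqrt((1.0 - y) / y); }

  int main(void)
  {
      printf("k = 2 -> Y <= %.4f (= 1/5)\n",  y_from_k(2.0));   /* 0.2000 */
      printf("k = 4 -> Y <= %.4f (= 1/17)\n", y_from_k(4.0));   /* 0.0588 */
      printf("Y = 1/5 -> k = %.2f\n",         k_from_y(0.2));   /* 2.00   */
      return 0;
  }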
However, if we take Edge's formula, we prescribe an upper limit for the spurious timeout probability, e.g. Y should be 1/63, and then we have:

  k = sqrt((1-Y)/Y) = sqrt((62/63) * 63) = sqrt(62) = ca. 7.87

Hence, we can define an upper limit for the spurious timeout probability and derive the necessary k.

NB: We did not discuss delayed retransmissions here, which may result from a too large rto.

All this holds true in Edge's paper, at least asymptotically. However, the congavoid paper chooses a different variance estimator, which is easier to calculate.

Q1: Is there evidence that the rto equations by Edge hold true? Particularly: what is the precise relationship between the "mdev" used in the congavoid paper and sigma? There are some vague remarks on this one. However, this is a central issue, as it directly affects the applicability of Edge's formula. The very strength of Edge's formula is that we have a _generic_ estimation for the spurious timeout probability, which is especially independent of the actual RTT distribution.

Q2: As we have more powerful computing machinery now than in 1988, did anybody think of using Edge's original formulae again?

If we recall Edge's formula, that's why I talked about an "urban legend" here recently as far as spurious timeouts are concerned: there _are_ spurious timeouts. Anywhere, anytime. And assuming Edge's formula were applicable, we could define an upper limit for the spurious timeout probability, and hence spurious timeouts would not occur unduly often. It doesn't matter whether we run TCP over Ethernet, GPRS or even with flying pigs.

Whether the assumptions for Edge's formulae will hold on an arbitrary network is a different story. Some problems are:

1. A few years ago, I read a paper that packet order distortions cannot be neglected in the Internet. Thus, the Internet no longer performs "Sequencing Positive Acknowledgement Retransmission (SPAR)". I did not yet understand all the details, but Edge explicitly makes use of the SPAR assumption several times in his paper. Thus, I'm not yet convinced that his rationale will hold if this assumption is violated.

2. The EWMA estimators used by Edge require the observation variables to be independent. In fact, RTT observations are gained from ACK packets, and these are sent by the receiver as TCP packets arrive. (Let's ignore delayed ACK here.) Hence, the latency experienced by a packet directly affects the sampling times of following RTT samples. In other terms: the random variables used for rtt observation directly affect each other, and I'm not sure whether they are really independent.

3. In addition to 2, we must review the weakly stationary assumption. Basically, in Edge's paper this assumption results in convergence statements:
   i) The mean estimator is in fact an asymptotically unbiased estimator for the mean.
   ii) The variance estimator converges to var(t(n+1) - T(n)), where t(n+1) is the (n+1)-th rtt observation and T(n) the n-th estimate of the mean.
   Both convergences hold for n -> inf., i.e. "in the long run".

4. "In the long run": as we know from practical statistics, short-term flows are by far more frequent in the Internet than long-term flows. In other terms: when the estimators "start to converge", the flow of interest may be history.

In any case, the formulae given by Edge make it easier to do proper analysis, as they explicitly state assumptions and preconditions. At the moment, I do not quite see under which circumstances the rto formula in the congavoid paper will hold.
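For comparison, the two estimator/RTO schemes under discussion can be written as per-ACK update rules. The sketch below uses the common textbook gains (1/8 and 1/4) and made-up RTT samples; it is only my illustration of the difference, not code from either paper:

  #include <math.h>
  #include <stdio.h>

  /* Jacobson/Karels (congavoid style): EWMA of the mean and of the
   * absolute deviation ("mdev"), rto = srtt + 4*mdev.                 */
  struct jk   { double srtt, mdev; };

  /* Edge style, as described above: EWMA of the mean and of the
   * squared deviation, rto = mean + k*sqrt(var), k from the target Y. */
  struct edge { double mean, var; };

  static double jk_update(struct jk *e, double sample)
  {
      double err = sample - e->srtt;
      e->srtt += 0.125 * err;                    /* gain 1/8 */
      e->mdev += 0.25 * (fabs(err) - e->mdev);   /* gain 1/4 */
      return e->srtt + 4.0 * e->mdev;
  }

  static double edge_update(struct edge *e, double sample, double k)
  {
      double err = sample - e->mean;
      e->mean += 0.125 * err;
      e->var  += 0.25 * (err * err - e->var);
      return e->mean + k * sqrt(e->var);
  }

  int main(void)
  {
      double samples[] = { 0.10, 0.12, 0.09, 0.30, 0.11 };  /* made-up RTTs, in seconds */
      struct jk   j = { 0.10, 0.0 };
      struct edge g = { 0.10, 0.0 };
      for (int i = 0; i < 5; i++)
          printf("rto_jk = %.3f  rto_edge = %.3f\n",
                 jk_update(&j, samples[i]),
                 edge_update(&g, samples[i], 4.0));
      return 0;
  }

This mainly makes Q1 concrete: mdev tracks the mean absolute deviation while sqrt(var) tracks the standard deviation, and the two are not the same quantity in general.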
In particular, unduly frequent spurious timeouts raise the question whether the problem is in fact the network technology in use, or whether the problem is a violation of the requirements for the rto estimator to work properly.

From the aforementioned problem list, I conclude that there are a number of vulnerabilities in the actual rto estimation scheme. In addition, this is not only a problem for TCP, but for any protocol which requires timeouts and must rely upon estimators for mean and variance to obtain them.

Detlef

--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From detlef.bosau at web.de Thu Aug 25 13:23:08 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Thu, 25 Aug 2005 22:23:08 +0200
Subject: [e2e] Retransmission Timouts revisited
References: <430E1862.23935C68@web.de>
Message-ID: <430E28AC.7E6F320A@web.de>

Detlef Bosau wrote:
> Some problems are: ....

I missed one important problem: the forecast capacity.

Particularly, the congavoid paper recommends to choose the gain in the mean estimator according to noise etc. Edge clearly points out that the gain should be chosen such that the forecast error is minimized. The problem is that RTO is applied to a packet which has not yet been sent. Consider an estimated rtt of 5 seconds; then we forecast a network path's properties for a duration of 5 seconds.

At the moment, we have a "one size fits all" gain in TCP, which practically ignores that actual RTTs span a range of up to eight orders of magnitude. When I read the "coast to coast" latencies posted by several authors here, I was highly interested. Unfortunately, I don't live in the US; I'm an ordinary private Internet user in Germany, connected with a DSL line. Admittedly, RTTs of up to one or two _seconds_ to servers in the US are extremely rare, but they _exist_. And when 3rd generation mobile networks reach the public, we will face much larger round trip times.

This leads to a settling time problem as well. This is particularly a problem for short-term flows.

Just to say that.

Detlef

--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From detlef.bosau at web.de Thu Aug 25 15:15:51 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Fri, 26 Aug 2005 00:15:51 +0200
Subject: [e2e] SPAR, comment to: Re: Retransmission Timouts revisited
References: <430E1862.23935C68@web.de> <430E28AC.7E6F320A@web.de>
Message-ID: <430E4316.2502DB8F@web.de>

I just received a mail from Wes:

> On Thu, Aug 25, 2005 at 09:13:38PM +0200, Detlef Bosau wrote:
> >
> > Some problems are:
> >
> > 1. A few years ago, I read a paper that packet order distortions cannot
> > be neglected in the Internet. Thus, the Internet no longer
> > performs "Sequencing Positive Acknowledgement Retransmission (SPAR)". I
> > did not yet understand all the details, but Edge explicitly makes
> > use of the SPAR assumption several times in his paper. Thus, I'm not yet
> > convinced that his rationale will hold if this assumption
> > is violated.
>
> My interpretation of SPAR is "TCP without SACK", i.e. only the last
> in-sequence segment is acknowledged, whereas PAR is "TCP with SACK" where
> segments are acked as they arrive. So these are properties of the
> transport protocol, not of the network that delivers segments to the
> transport, so network reordering does not violate any assumption.
> Edge's paper states that his algorithm works for both SPAR and PAR
> anyways, so I'm not sure you can call this a "problem".
>
> -Wes

OK, the central point here is packet reordering. I just read his remarks on SPAR and PAR, and in fact it seems Wes is right here. It's even right on the first page of Edge's paper :-|

However, I did not yet completely understand the rationale here; it's not easy. When I stated some "possible problems", it may be that I just did not understand these details. Therefore, please allow me to ask.

--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From arjuna.sathiaseelan at gmail.com Fri Aug 26 12:06:50 2005
From: arjuna.sathiaseelan at gmail.com (Arjuna Sathiaseelan)
Date: Fri, 26 Aug 2005 20:06:50 +0100
Subject: [e2e] Link Level Retransmissions
Message-ID: <1ef2259005082612063e071e5d@mail.gmail.com>

Dear All,

I would like to know the maximum number of link-level retransmissions allowed in an ARQ protocol, especially in satellite networks. Is this based on a timer, i.e. when a timeout happens, does the link layer leave the error recovery to the TCP layer? Please let me know.

Regds,
Arjuna

From detlef.bosau at web.de Fri Aug 26 12:44:28 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Fri, 26 Aug 2005 21:44:28 +0200
Subject: [e2e] SPAR, comment to: Re: Retransmission Timouts revisited
References: <430E1862.23935C68@web.de> <430E28AC.7E6F320A@web.de> <430E4316.2502DB8F@web.de>
Message-ID: <430F711C.4020403@web.de>

Detlef Bosau wrote:
> I just received a mail from Wes:
>
> (mail snipped.)

Although I thought it would be justified to quote this mail without asking Wes, because there is no personal or confidential content in it, it was wrong to do so. I apologize for that.

Detlef

--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From ikob at koganei.wide.ad.jp Wed Aug 31 03:39:22 2005
From: ikob at koganei.wide.ad.jp (Katsushi Kobayashi)
Date: Wed, 31 Aug 2005 19:39:22 +0900
Subject: [e2e] PFLDnet2006 CFP
Message-ID: 

Call For Papers
===============

Fourth International Workshop on Protocols for Fast Long-Distance Networks
PFLDnet2006
February 2-3, 2006
Nara, Japan - An ancient capital city older than Kyoto
------------------------------------------------------------------------
http://www.hpcc.jp/pfldnet2006/
------------------------------------------------------------------------

Fast long-distance networks (i.e., networks operating at 622 Mbit/s, 2.5 Gbit/s, or 10 Gbit/s, and soon 40 Gbit/s, spanning several countries or states) are now becoming commonplace. Increasing numbers of researchers now routinely transfer between 10 GB and multi-TB datasets over gigabit networks. Application domains for such massive transfers include data-intensive Grids (e.g., in Particle Physics, Earth Observation, Bioinformatics, and Radio Astronomy), database mirroring for Web sites (e.g., in e-commerce), and push-based Web cache updates.

Although the connectivity infrastructure is now in place, or will soon be, the transport and application protocols available to date are proving inadequate for fast transfer of large volumes of data over such networks. Current versions of TCP cannot fully exploit the network capacity. For instance, recovery time from a congestion event grows at a super-linear rate, and can easily exceed 10 minutes in very high bandwidth-delay product networks.
It also requires a large congestion window for high throughput, consuming valuable system resources. A number of research teams have begun investigating advanced protocols for domain-specific and general applications. The International Workshop on Protocols for Fast Long-Distance Networks in CERN (http://datatag.web.cern.ch/datatag/pfldnet2003/), in Argonne (http://www-didc.lbl.gov/PFLDnet2004/), and in Lyon (http://www.ens-lyon.fr/LIP/RESO/pfldnet2005/) were very successful in bringing together many researchers from all over the world including North America, Europe and Asia who are working on these problems. This workshop will continue this tradition, and provide a perfect setting for researchers in this area to exchange ideas and experience. This single-track workshop will provide researchers and technologists with a focused, highly interactive opportunity to present, discuss and exchange experience on leading research, development and future directions in high performance transport and application protocols (TCP, UDP, HTTP, FTP, etc.) over fast long-distance networks. In order to facilitate discussions, attendance will be limited to 60 participants. Please register early to ensure your participation. Depending on the number of people who register, we may need to restrict the number of people from a given organization to allow for a broader representation of the research community. Registration will open late 2005. Call For Papers --------------- Participants wishing to present a paper should upload a four- pages extended abstract to http://www.hpcc.jp/pfldnet2006/ by October 14 2005. Authors whose abstracts are selected for presentation will have the option to submit a full paper, to be published on the PFLDnet 2006 web site and in the PFLDnet 2006 proceedings. Scope ----- The PFLDnet2006 workshop will focus on research issues and challenges as well as lessons learned from experience. Topics of interest include and are not limited to: - Protocol issues in fast long-distance networks - Enhancements of TCP and its variants - Novel data transport protocols designed for new application services - Transport over optical networks - RDMA over WANs - Shaping on TCP and UDP traffic - QoS and scalability issues - Parallel transfers and multistreaming - Multicast over fast long-distance networks - Modeling and simulation-based results - Experiments on real networks and actual measurements - Protocol benchmarking - Protocol implementation and hardware issues (PCs, NICs, TOEs, routers, switches, etc.) - Data replications and striping - Requirements and experience from bandwidth demanding applications - Bulk-data transfer applications both TCP and non-TCP based - Transport service for Grids Important Dates --------------- Extended Abstract Submission Deadline: October 14 Acceptance Notification: December 2 Final Paper Submission: January 20 Workshop: February 2-3 Committees ---------- Co-Chairs: Richard Hughes-Jones (Univ. Manchester - UK) Kei Hiraki (Univ. of Tokyo - JP) Jason Leigh (UIC - USA) Steering Committee: Pascale Vicat-Blanc Primet (INRIA - FR) Tomohiro Kudoh (AIST - JP) Katsushi Kobayashi (NICT - JP) Technical Program Committee : Brian L Tierney (LBL - USA) R. 
Les Cottrell (SLAC - USA) Bill Allcock (ANL - USA) Eitan Altman (INRIA - FR) Richard Carlson (Internet 2 - USA) Sally Floyd (ICIR - USA) Pascale Vicat-Blanc Primet (INRIA - FR) Tomohiro Kudoh (AIST - JP) Douglas Leith (Hamilton Institute - IR) Steven Low (CALTECH - USA) Medy Sanadidi (UCLA - USA) Robin Tasker (CCLRC - UK) Hideyuki Shimonishi (NEC - JP) Kenjiro Cho (IIJ - JP) Injong Rhee (NCSU - USA) Andrew Chien (UCSD - USA) Aaron Falk (ISI - USA) Katsushi Kobayashi (NICT - JP) Local Organization Committee: Noritoshi Demizu (NICT - JP) Sponsors: --------- NICT, JAPAN TBD. Contact: ikob at koganei.wide.ad.jp