[e2e] Why do we need TCP flow control (rwnd)?
crovella at cs.bu.edu
Wed Jul 2 13:26:15 PDT 2008
> -----Original Message-----
> From: end2end-interest-bounces at postel.org
> [mailto:end2end-interest-bounces at postel.org] On Behalf Of
> Christian Huitema
> Sent: Wednesday, July 02, 2008 1:28 PM
> To: Lachlan Andrew; jg at laptop.org
> Cc: David P. Reed; end2end-interest list
> Subject: Re: [e2e] Why do we need TCP flow control (rwnd)?
> > I was meaning to address the statement that it is bad to model
> > congestion collapse by people continuing to pound a congested link
> > with Poisson traffic. My point was that, for a core/edge link,
> > individuals each only need to pound once, and we can still have a
> > (batch) Poisson process during the congestion interval. An
> > individual leaving the system because of congestion doesn't stop the
> > arrivals of new individuals (flows, bursts, ...) coming.
> Actually, the arrival of individuals in the system does not
> appear to match the Poisson hypothesis.
The story is actually rather complicated. If the "individuals" we are
talking about are TCP connections, or flows, then the arrivals are often
correlated. The only validated explanations I'm aware of for
correlated flow arrivals are the presence of session-level structure (as
in web browsing, p2p downloads, and in older cases, separate downloads
of web page components). However -- the only studies I've seen of
*session* arrivals show that session arrivals *are* typically well
modeled as Poisson.
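To make the session-level explanation concrete, here is a minimal sketch,
not drawn from any of the studies mentioned. It assumes sessions arrive as
a Poisson process and that each session spawns a fixed number of flows
shortly after it starts. The names, rates, and the choice of the index of
dispersion (variance over mean of per-window counts, about 1 for Poisson)
are all illustrative assumptions.

```python
import random
import statistics

def index_of_dispersion(times, window, horizon):
    """Variance-to-mean ratio of arrival counts per window (about 1 for Poisson)."""
    counts = [0] * int(horizon / window)
    for t in times:
        slot = int(t / window)
        if slot < len(counts):  # drop arrivals that land past the horizon
            counts[slot] += 1
    return statistics.pvariance(counts) / statistics.mean(counts)

def simulate(session_rate=1.0, flows_per_session=5, spread=1.0,
             horizon=20000.0, seed=1):
    """Poisson session arrivals; each session starts several flows soon after."""
    rng = random.Random(seed)
    sessions, flows = [], []
    t = rng.expovariate(session_rate)
    while t < horizon:
        sessions.append(t)
        # Hypothetical session structure: flows start within `spread` seconds.
        flows.extend(t + rng.uniform(0.0, spread)
                     for _ in range(flows_per_session))
        t += rng.expovariate(session_rate)
    return sessions, flows

sessions, flows = simulate()
idc_sessions = index_of_dispersion(sessions, window=10.0, horizon=20000.0)
idc_flows = index_of_dispersion(flows, window=10.0, horizon=20000.0)
print(f"sessions: {idc_sessions:.2f}  flows: {idc_flows:.2f}")
```

Under these assumptions the session counts should look Poisson while the
flow counts come out clearly overdispersed, even though nothing but
session structure was added.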
All of this is conditioned on data analysis being done over a time period
of an hour or less, when statistics appear stable enough that they can
be modeled as stationary. Over time periods of more than an hour,
statistics like the mean change noticeably and then one can see
correlations in session arrivals because of exogenous driving factors,
i.e., diurnal human activity, day of week, etc. There's lots of cites
to give here (many are in Ch 6 of "Internet Measurement" by me and Bala
> Part of the heavy
> tail behavior seems to be due precisely to correlations in
> the arrival process. A given individual is more likely to
> become active if he or she was recently active; multiple
> individuals are more likely to become active if other
> individuals are already active. This "piling up" effect
> contributes to the heavy tail effect, or to power law distributions.
I've never seen good, validated evidence for this claim (that "piling
up" contributes to power law distributions). Pointers would be
> The problem with Poisson modeling is not that it overestimates
> congestion. On the contrary, the Poisson hypothesis tends to
> underestimate the variability of the aggregate, which leads
> to underestimating the risk of congestion. Basically, the
> Poisson hypothesis says that arrivals are independent, and
> thus that aggregates will be smooth. The "self correlation"
> hypothesis says that more arrivals are likely if many
> arrivals are already observed, and thus that peaks of traffic
> are not likely to be smooth.
> -- Christian Huitema
Some more notes: really what we are talking about here are closed vs.
open systems. An open system, such as one driven by Poisson arrivals,
is essentially the limit of a closed system when the population size
gets very large while the utilization stays constant, so that delays in
the system do not feed back noticeably to the population (senders). In
some cases this can be a good model for network traffic, not so in
others. The limit example shows that an open system is a reasonable
model when the population size is very large compared to the rate work
is put into the system and the timescale of interest. This can be the
case for highly multiplexed links in the Internet core, I would expect.
It's probably not the case for less utilized links at the edge. So the
extent to which correlations in arrivals are present, and Poisson vs
something else is needed to describe flow arrivals, depends a lot on the
nature of the link in question. And of course, whether flow arrivals
are Poisson or not, the heavy-tailed nature of flow lengths will serve
to make traffic at the packet level highly correlated.
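The open-as-limit-of-closed point can be checked numerically. What follows
is a minimal sketch, assuming a finite-source M/M/1 queue (the classic
machine-repairman model) stands in for the closed system: N senders each
issue a request at rate lam/N while idle, a single server works at rate mu,
and the offered load lam is held fixed as N grows. Holding offered load
fixed is a simplification of "utilization stays constant". The function
names and parameter values are made up for illustration.

```python
def closed_queue_mean(n_sources, lam, mu):
    """Mean number in system for a finite-source M/M/1 queue with N senders."""
    nu = lam / n_sources  # per-sender request rate while idle
    weights = [1.0]       # unnormalized stationary probabilities
    for n in range(1, n_sources + 1):
        # Birth rate out of state n-1 is (N - (n-1)) * nu; death rate is mu.
        weights.append(weights[-1] * (n_sources - n + 1) * nu / mu)
    total = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / total

def open_queue_mean(lam, mu):
    """Mean number in system for an open M/M/1 queue with Poisson arrivals."""
    rho = lam / mu
    return rho / (1.0 - rho)

lam, mu = 0.5, 1.0
for n_sources in (5, 50, 500, 5000):
    print(n_sources, round(closed_queue_mean(n_sources, lam, mu), 4))
print("open", open_queue_mean(lam, mu))
```

In this model the closed-system mean approaches the open-system value from
below as N grows: with few senders, every request sitting in the queue
removes a noticeable fraction of the idle population, which is exactly the
feedback that the open (Poisson) model ignores.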