[e2e] [Fwd: RED-->ECN]

Christian Huitema huitema at exchange.microsoft.com
Thu Feb 1 08:53:13 PST 2001


Well, I think we have to define elephants, mice and maybe add a 3rd
category, ants. From a TCP point of view, I would say that elephants are
those sessions whose behavior approaches the asymptotic behavior of the
"congestion avoidance" phase, mice are those whose behavior is
essentially limited by the slow-start algorithm, and ants are those who
simply exchange a couple of packets, and are thus not affected at all, even
by slow start.

In the current TCP spec, an application in slow start mode will exchange
N packets in log2(N/X) round trips, where X is the initial window. The
average window will be N/log2(N/X), unless the session encounters a packet
loss. These are the ants: their rate is not limited by losses.
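
That slow-start arithmetic is easy to sanity-check; a minimal Python sketch (the function name is mine, not from any TCP implementation):

```python
import math

def slow_start_round_trips(n_packets, initial_window=1):
    # With the window doubling every round trip from an initial window X,
    # N packets are exchanged in about log2(N/X) round trips.
    return math.log2(n_packets / initial_window)

# A 64-packet transfer starting from a window of 1 finishes in
# about 6 round trips, with no losses involved.
print(slow_start_round_trips(64))  # -> 6.0
```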

If the current packet loss is L, the average interval between losses
will be 1/L. If the number N is larger than 1/L, but not much larger,
then the window before the loss will be approximately 1/(2L), and after the
first loss will be 1/(4L), and that will probably be sufficient to carry
the remaining packets. These sessions, which carry about 1/L packets, are
the mice. The average window for the mice will be about
1/(L.(1-log2(L.X))). This is larger than the individual windows of ants.
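
Plugging numbers into that estimate is straightforward; a small sketch (again, the function name is my own):

```python
import math

def mouse_window(L, X=1):
    # Average window of a "mouse" carrying about 1/L packets:
    # 1 / (L * (1 - log2(L*X))), per the estimate above.
    return 1 / (L * (1 - math.log2(L * X)))

# At X=1 and a 4% loss rate this gives roughly 4.4, which is indeed
# larger than the per-round-trip windows of the ants.
print(round(mouse_window(0.04), 1))  # -> 4.4
```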

The average window of elephants, according to the classic models, is
supposed to be 2/sqrt(L).

At the statistical equilibrium, the loss rate, or the RED marking rate,
is supposed to be such that everything stabilizes. For the large values
of L that we have today, of the order of 1%, the window of elephants
tends to be much larger than the window of mice. Assuming X=1 and L=4%,
the effective window of mice is about 4, and the asymptotic window of
elephants is 10: on average, the elephants run more than twice as fast
as the mice. From a selfish point of view, it makes sense for an
application to just keep one TCP connection open, rather than go back to
slow start with a new connection. 
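
The two averages can be compared directly; a minimal sketch using the formulas from this note:

```python
import math

L, X = 0.04, 1  # 4% loss rate, initial window of 1

elephant = 2 / math.sqrt(L)               # asymptotic CA window
mouse = 1 / (L * (1 - math.log2(L * X)))  # effective mouse window

# At 4% loss the elephant's window (~10) is more than twice the
# mouse's (~4), so keeping one connection open pays off.
print(round(elephant), round(mouse))  # -> 10 4
```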

But consider now what happens if we increase the capacity of the
network. We want to provide better services, faster transmission for
both elephants and mice, and for that we will tend toward lower loss
rates (or lower marking rates). Assume we drop the loss rate to 0.1%, a
value which is in fact mentioned in many service level agreements. The
average window of elephants increases to about 63, which is good: our
FTP transfers are now 6 times faster than they were on the congested link
that showed a 4% loss rate. But, at the same time, the effective window
of mice grows to about 91: now, mice are running faster than elephants.
There is much less incentive to string several mice together and make
them an elephant.

Let's pursue the exercise and increase the network capacity some more, so
that the loss rate drops to 0.01%. Life becomes even better for mice --
their average window increases to about 700, while the average window of
elephants is only 200. At this point, it pays to split the elephant into
a string of mice. For a large file transfer, one gets better results if
one closes the connection after about 10,000 packets, and starts a new
one. Isn't that funny?
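
Sweeping the loss rate through the three values used above makes the crossover visible (same formulas as before, with X=1):

```python
import math

def elephant_window(L):
    return 2 / math.sqrt(L)  # asymptotic congestion-avoidance window

def mouse_window(L, X=1):
    return 1 / (L * (1 - math.log2(L * X)))

for L in (0.04, 0.001, 0.0001):
    print(L, round(elephant_window(L)), round(mouse_window(L)))
# 0.04   -> elephants ~10,  mice ~4   (elephants ahead)
# 0.001  -> elephants ~63,  mice ~91  (mice overtake)
# 0.0001 -> elephants ~200, mice ~700 (splitting the elephant pays)
```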

Indeed, the root of the problem, from a control point of view, is that
the window of mice grows as O(-1/(L.log(L))), while the window of
elephants grows as O(1/sqrt(L)). As you decrease the loss rate, you make
life much better for mice than for elephants. 
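
One way to see the asymptotic gap: the ratio of the mouse window to the elephant window grows without bound as L falls. A quick numeric check:

```python
import math

# Ratio of the mouse window, ~ 1/(L*(1 - log2(L))), to the elephant
# window, ~ 2/sqrt(L): it grows without bound as L -> 0.
for L in (1e-2, 1e-3, 1e-4, 1e-5):
    ratio = (1 / (L * (1 - math.log2(L)))) / (2 / math.sqrt(L))
    print(L, round(ratio, 2))
```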

Indeed, all this is a simplistic reasoning on averages. Decreasing the
loss rate has another effect, that of increasing the average interval
between control signals, especially for elephants -- the interval
increases linearly with the average throughput of the connection. Less
frequent control means more variability -- the rate observed by similar
connections will be expected to vary widely.

I believe that the only way to solve this problem is to change the CA
algorithm. The only way elephants can keep ahead of mice is if their
window scales as O(1/L) instead of O(1/sqrt(L)).
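
As a rough check on that proposal, here is a sketch of a hypothetical CA rule whose window scales as 1/L (the constant factor is arbitrary, and this is not any deployed algorithm):

```python
import math

# Hypothetical congestion-avoidance rule with window ~ 1/L, compared
# with the mouse estimate 1/(L*(1 - log2(L))):
for L in (1e-2, 1e-3, 1e-4):
    elephant = 1 / L
    mouse = 1 / (L * (1 - math.log2(L)))
    # The ratio works out to 1 - log2(L), which keeps growing as L
    # falls, so such elephants would stay ahead of mice at any loss rate.
    print(L, round(elephant / mouse, 2))
```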

-- Christian Huitema

-----Original Message-----
From: Vishal Misra [mailto:misra at newworld.cs.umass.edu] 
Sent: Thursday, February 01, 2001 6:33 AM
To: julian_satran at il.ibm.com
Cc: end2end-interest at postel.org
Subject: Re: [e2e] [Fwd: RED-->ECN]


On Thu, 1 Feb 2001 julian_satran at il.ibm.com wrote:

> accepted queueing models. Do you have in mind some "quantum
> statistics" for networks that we are all unaware of? Do those have
> some tunnel effects that allow you to dream about high utilization AND
> short queues? Is there some mathematics that can support your
> statements?

Yes, it is called "Control theory" and has been around for hundreds of
years. Steven is talking about a *closed loop* control system. TCP
(which is responsible for most of the queues in IP networks) is not an
open loop protocol. You cannot apply open loop queueing theory models to
describe the behavior of "elephants" which Steven was discussing. 

If your traffic is purely "elephant" then it is possible to design
controllers which simultaneously give low delay and (near) full
utilization. Several research groups have already done it and
demonstrated it both mathematically and through simulations/testbeds.
The presence of "mice" complicates things a bit: even though they might
be carried by TCP, they don't live long enough to be classified as
"controllable".

-Vishal



