[e2e] [Iccrg] Re: Reasons not to deploy TCP BIC/Cubic

Lachlan Andrew lachlan.andrew at gmail.com
Fri Feb 3 18:58:36 PST 2012


Greetings Detlef,

On 3 February 2012 23:12, Detlef Bosau <detlef.bosau at web.de> wrote:
> On 02/02/2012 11:45 PM, Lachlan Andrew wrote:
>>
>> it was saying that CUBIC induces excessive queueing delays.
>
> Perfect ;-) So I can ask the author himself: Why?

CUBIC induces high queueing delay because it aims to increase the
window rapidly until just before the point at which the buffer is full,
and then to increase it slowly, so that the window stays near that
buffer-filling size for as long as possible.
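
To make that concrete, here is a minimal Python sketch of CUBIC's window
curve between losses (the RFC 8312 form, with the RFC's C and beta; the
loss window w_max = 100 segments is just an illustrative number, and this
is not the Linux implementation):

    # Sketch of CUBIC's window growth between losses (the RFC 8312 form).
    # C and beta are the RFC defaults; w_max = 100 segments is illustrative.
    C = 0.4        # aggressiveness constant
    beta = 0.7     # multiplicative-decrease factor

    def cubic_window(t, w_max):
        """Window (segments) t seconds after a loss that occurred at w_max."""
        # K is the time at which the window climbs back to w_max.
        K = (w_max * (1 - beta) / C) ** (1.0 / 3.0)
        return C * (t - K) ** 3 + w_max

    # Fast growth far from w_max, nearly flat around t = K: the window
    # sits near the buffer-filling size for most of the cycle.
    for t in range(0, 16, 3):
        print(t, round(cubic_window(t, w_max=100), 1))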

> O.k., "not a simple one" is the correct view. Obviously, a simple look into
> the congavoid paper will help us here:
>
> Look at the footnote on page 2:
>>
>> A conservative flow means that for any given time, the integral of the
>> packet density around the sender–
>> receiver–sender loop is a constant. Since packets have to ‘diffuse’ around
>> this loop, the integral is sufficiently
>> continuous to be a Lyapunov function for the system. A constant function
>> trivially meets the conditions for
>> Lyapunov stability so the system is stable and any superposition of such
>> systems is stable. (See [3], chap. 11–
>> 12 or [21], chap. 9 for excellent introductions to system stability
>> theory.)
>
> However, it clearly shows what mathematical effort must be made to
> apply control theory to our problem. And perhaps the message of this
> footnote is nothing other than "with sufficient effort, it can be shown that
> at least VJCC and well-known control theory do not contradict each other."

That passage justifies window-based rate control.  It argues
that keeping the sliding window *constant* makes the rate stable.
Most studies of TCP stability take this as a starting point, and
assume that "stable window implies stable rates".  They then study the
stability of the dynamics of the window size (with varying
conclusions).
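
To spell out that first step, here is a toy fluid-model sketch (my own
illustration, not taken from the paper; the capacity, base RTT and window
values are arbitrary assumptions) showing that a fixed, ack-clocked window
drives the sending rate and the queue to a fixed point:

    # Toy fluid model: with a *fixed* window, the ack-clocked sending rate
    # and the queue settle to a fixed point.  All values are assumptions.
    capacity = 100.0    # packets per second
    base_rtt = 0.1      # propagation delay, seconds
    W = 15.0            # fixed window, packets

    q = 0.0             # queue backlog, packets
    dt = 0.001
    for _ in range(5000):
        rtt = base_rtt + q / capacity
        rate = W / rtt                       # window-limited sending rate
        q = max(0.0, q + (rate - capacity) * dt)

    # Fixed point: rate == capacity, q == W - capacity * base_rtt (= 5 here).
    print(round(rate, 2), "pkt/s,", round(q, 2), "pkt queued")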

> And from what I've read so far in papers that try to figure out
> differential equations and the like for TCP, I'm even more convinced that an
> analytical approach to TCP stability can hardly be achieved with reasonable
> effort.

There is plenty that can be known, and plenty that can't.  The main
problem isn't lack of understanding of AIMD, but the fact that so much
traffic is in the form of short flows that don't follow AIMD.

>> No.  With loss-based TCP, the feedback delay is of the order of the
>> interval between packet losses, which is very much larger than the
>> RTT.  (To see why that is the feedback delay, consider how the network
>> signals "increase your rate"; it has to withhold the next expected
>> packet loss.)  Although increasing queueing delays makes the RTT much
>> higher, algorithms like CUBIC actually make the feedback delay much
>> less, by causing more frequent losses.
>
> More frequent losses cause more frequent retransmissions.

Of course.  However, our aim isn't usually to avoid retransmissions,
but to allow each flow to get a reasonable throughput.  By allowing
faster feedback, it is in principle possible to cause established
flows to back off faster when a new flow starts up.  (Of course, the
authors of H-TCP will point out that this isn't an automatic
consequence of more frequent feedback, and early versions of CUBIC
were slower to converge to fairness than Reno was.)
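
To illustrate why the inter-loss interval sets the speed of convergence,
here is a toy synchronised-loss AIMD model of an established flow and a
newcomer (purely illustrative; the pipe size and starting windows are
assumptions, and real losses are not synchronised like this):

    # Toy synchronised-loss AIMD: an established flow and a newcomer
    # converge toward equal windows, with one halving per loss epoch.
    # The pipe size and starting windows are illustrative assumptions.
    pipe = 100.0               # packets the path holds before a loss
    w = [90.0, 1.0]            # established flow, new flow

    for epoch in range(10):
        # Additive increase: both flows gain equally until the pipe fills...
        grow = (pipe - sum(w)) / 2.0
        w = [x + grow for x in w]
        # ...then both see the loss and halve (multiplicative decrease).
        w = [x / 2.0 for x in w]
        print(epoch, [round(x, 1) for x in w])

The gap between the two windows halves once per loss epoch, so how quickly
the newcomer reaches its fair share is set by how often losses occur.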

[As another aside, Steven Low pointed out off-list that my comment on
Reno inducing high feedback delay was imprecise.  I should have said
that AIMD performs a filtering of the feedback with a time constant of
the order of the inter-loss time, and a filter imposes a delay of the
order of its time constant.  An alternative view is that Reno gets
rapid but very imprecise feedback, with the information content of
each (N)ACK decaying with the interval between NACKs.]
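
A quick back-of-the-envelope calculation of that time constant for Reno
(the link speed, RTT and packet size below are just assumed example values):

    # Back-of-the-envelope: inter-loss time for a single Reno flow filling a
    # path.  The link speed, RTT and packet size are assumed example values.
    link_bps = 100e6                        # 100 Mbit/s
    rtt = 0.1                               # 100 ms
    pkt_bits = 1500 * 8

    bdp = link_bps * rtt / pkt_bits         # window at loss: ~833 packets
    rtts_between_losses = bdp / 2           # saw-tooth climbs from bdp/2 to
                                            # bdp, one packet per RTT
    print(round(bdp), "pkt window,", round(rtts_between_losses * rtt, 1),
          "s between losses")

With those numbers the "filter" time constant is tens of seconds, hundreds
of times the RTT.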

> Loss detection by delay observation is a topic I dealt with myself about a
> decade ago.
> Delay changes and load changes take
> place on incomparable time scales.

That probably depends on how we define "load".  A load in the form of
a burst of packets will cause a burst of queueing delay on the same
timescale.  If you mean that delay changes on a faster timescale than
the arrival of flows, that is true.  We want the feedback to occur on
a timescale faster than our reactions to it.
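
As a toy illustration of the timescale point (the queue's service rate and
the burst are arbitrary assumptions), a short burst produces a queueing-delay
spike that rises and drains within tens of milliseconds, far faster than
flows come and go:

    # Toy single queue: a 10 ms packet burst into an otherwise idle 1000 pkt/s
    # queue produces a queueing-delay spike on the same millisecond timescale.
    # The rates and burst size are illustrative assumptions.
    service_rate = 1000.0       # packets per second (1 packet per ms)
    dt = 0.001                  # 1 ms steps

    backlog = 0.0
    for ms in range(60):
        arrivals = 5.0 if 10 <= ms < 20 else 0.0   # 5 pkt/ms burst for 10 ms
        backlog = max(0.0, backlog + arrivals - service_rate * dt)
        delay_ms = 1000.0 * backlog / service_rate
        if ms % 10 == 0:
            print(ms, "ms:", round(delay_ms, 1), "ms of queueing delay")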

> What made me leave this approach eventually was the simple fact that this
> approach does what scientists call "ratio ex post". We observe a phenomenon
> which may have different causes, e.g. we observe increasing delay
> which may be caused by increasing load _OR_ by increasing service time, for
> instance on a wireless link, and then we get out the crystal ball to guess
> the one which applies here.

As you often argue, loss can also be due to wireless links rather than
congestion.  Neither loss nor delay is a guarantee that there is
congestion, but both are indications.  A congestion control algorithm
should consider both, and react to both in a way consistent with
either being a false alarm.  Doug Leith's group did some work on
testing whether increased delay is sufficiently correlated with
congestion to be useful.  From memory, they found that it is.

> And in that very sense, congestion detection by delay observation is a
> mixture of clairvoyance and hand clapping against elephants.

> Not to be misunderstood: TCP/Reno and the like _IS_ outstanding work.
> (I don't know whether VJ has already been awarded the ACM Turing Award;
> however, his groundbreaking work deserves it.)

I agree that the insight that loss can be used as a sign of
congestion, rather than simply a trigger for retransmission, was
outstanding.  The robustness of Reno comes largely from its use of AIMD,
which wasn't new to Reno.

Cheers,
Lachlan

-- 
Lachlan Andrew  Centre for Advanced Internet Architectures (CAIA)
Swinburne University of Technology, Melbourne, Australia
<http://caia.swin.edu.au/cv/landrew>
Ph +61 3 9214 4837


