[e2e] [aqm] What is a good burst? -- AQM evaluation guidelines

Sun Dec 15 13:42:12 PST 2013

On Dec 14, 2013, at 9:35 PM, Naeem Khademi <naeem.khademi at gmail.com> wrote:

> Hi all 
> 
> I'm not sure if this has already been covered in any of the other threads, but looking at http://www.ietf.org/proceedings/88/slides/slides-88-aqm-5.pdf and draft-ietf-aqm-recommendation-00, the question remains: "what is a good burst (size) that AQMs should allow?" and/or "how an AQM can have a notion of the right burst size?". 
> 
> and how "naturally-occurring bursts" mentioned in draft-ietf-aqm-recommendation-00 can be defined? 

Imagine, if you will, that you have a host and a network in front of it, including a first-hop switch or router. The host has a TCP offload engine, which is a device that accepts a large chunk of data and sends as much of it as it has permission to send, as quickly as it can. The host has, for the sake of argument, a 10 Mbps interface, and everything else in the network has interfaces whose rates are measured in gigabits per second. The host gives its TSO one chunk of data, so that can't be called a "burst" - it's one message. The TSO sends data as quickly as it can, but presumably does little more than keep the transmission system operating without a pause; while it might queue up 45 messages at a crack, there is no requirement that it do so, so the term "burst" there doesn't have a lot of meaning. And as the data moves through the network, the rate of the particular session is absolutely lost in the available capacity. So a burst, in the sense of the definition, never happens.

Now, repeat the experiment. However, in this case the host has a gig-E interface, and the next interface that its router uses is 10 or 100 Mbps. The host and its TSO, and for that matter the router, do exactly the same thing. As perceived by the router, data is arriving much more quickly than it is leaving, resulting in a temporarily deep queue. If the propagation delay through the remainder of the network and the destination host is appropriate, acknowledgements could arrive at the TSO, soliciting new transmissions, before that queue empties. In that case, it is very possible that the queue remains full for a period of time. This network event could last for quite some time.

The second is clearly a burst, according to the definition, and I would argue that it is naturally occurring.
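
To put rough numbers on the two experiments, here is a minimal fluid-model sketch. It assumes a single burst arriving at a constant ingress rate and draining at a constant egress rate, and it ignores packetization, ACK clocking, and competing traffic. The function name and the 64 KB burst size are illustrative assumptions, not values from the discussion above.

```python
def burst_queue(burst_bytes, ingress_bps, egress_bps):
    """Return (peak backlog in bytes, time in seconds to drain that backlog)
    for one burst under a simple fluid model (hypothetical helper)."""
    # Time for the whole burst to arrive at the queue.
    arrival_time = burst_bytes * 8 / ingress_bps
    # Bytes the egress link can send while the burst is still arriving,
    # capped at the burst size (the queue cannot send more than it received).
    drained_during_arrival = min(burst_bytes, egress_bps * arrival_time / 8)
    # Whatever could not leave during the arrival is the peak backlog.
    peak_backlog = burst_bytes - drained_during_arrival
    # Time for the egress link to clear that backlog once arrivals stop.
    drain_time = peak_backlog * 8 / egress_bps
    return peak_backlog, drain_time


BURST = 64 * 1024  # assumed burst size in bytes, for illustration only

# First experiment: 10 Mbps host, gigabit network. No backlog forms.
slow_host = burst_queue(BURST, ingress_bps=10e6, egress_bps=1e9)

# Second experiment: gig-E host, 10 Mbps next hop. Nearly the whole
# burst sits in the router's queue and takes tens of milliseconds to drain.
fast_host = burst_queue(BURST, ingress_bps=1e9, egress_bps=10e6)

print(slow_host)
print(fast_host)
```

Under these assumptions the same chunk of data produces no queue at all in the first case and a backlog of almost the entire chunk in the second, which is the distinction being drawn.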

I imagine you have heard Van and/or Kathy talk about "good queue" vs "bad queue". "Good queue" keeps enough traffic in it to fully utilize its egress. "Bad queue" also does so, but does so in a manner that also materially increases measured latency. This difference is what is behind my comment on the objective of a congestion management algorithm (such as TCP's, but not limited to it): to keep the amount of data outstanding large enough to maximize its transmission rate through the network, but not so large as to materially increase measured latency or probability of loss.

I would argue that this concept of "Good Queue" is directly relevant to the concept of an acceptable burst size. In the first transmission in a session, the sender has no information about what it will experience, so it behoves it to behave in a manner that is unlikely to create a significant amount of "bad queue" - conservatively. But it by definition has no numbers by which to quantify that. Hence, we make recommendations about the initial window size. After that, I would argue that it should continue to behave in a manner that doesn't lead to "bad queue", but is free to operate in any manner that seeks to keep the amount of data outstanding large enough to maximize its transmission rate through the network, but not so large as to materially increase measured latency or probability of loss. At the point that it sends data in a manner that creates a sustained queue, it has exceeded what would be considered a useful burst size.
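
One way to make "sustained queue" concrete is sketched below. It is an assumption for illustration, not a definition taken from this thread: treat a queue as sustained if its queueing delay never falls below some small threshold during an observation window. The function name, the 5 ms threshold, and the sample values are all hypothetical.

```python
def has_standing_queue(delay_samples_ms, threshold_ms=5.0):
    """Return True if queueing delay stayed above threshold_ms for every
    sample in the window (hypothetical test for a sustained queue)."""
    if not delay_samples_ms:
        # No observations: nothing to call a sustained queue.
        return False
    # A transient burst lets the queue drain at some point, so the minimum
    # delay in the window falls to (or near) zero. A sustained queue does not.
    return min(delay_samples_ms) > threshold_ms


# A burst that builds a deep queue and then drains: "good queue".
transient = [0.0, 12.0, 30.0, 18.0, 4.0, 0.0]

# A queue that never empties during the window: "bad queue".
sustained = [20.0, 25.0, 22.0, 30.0, 21.0]

print(has_standing_queue(transient))
print(has_standing_queue(sustained))
```

The point of using the minimum rather than the peak or the mean is that a large burst is acceptable under this reading as long as the queue it creates actually drains; only the backlog that persists counts against the sender.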