[e2e] Codel and Wireless

Tue Dec 3 18:02:14 PST 2013

On Tue, Dec 3, 2013 at 8:34 PM, Daniel Havey <dhavey at yahoo.com> wrote:
> On Tuesday, December 3, 2013 4:54 PM, Keith Winstein <keithw at mit.edu> wrote:
> Folks here might be interested -- my colleague Anirudh Sivaraman has done
> some quantitative testing on sfqCoDel (the ns-2 module) vs. CoDel (one
> queue) vs. large DropTail queues (with per-flow queueing), in ns-2 emulation
> of a Verizon LTE link.
>
> Paper, code and replication instructions here:
> http://web.mit.edu/anirudh/www/sdn-data-plane.html
>
> We emulate a commercial LTE link queue, where datagrams arrive at the back
> whenever somebody chooses to send one, and are serviced (and delivered) at
> particular instants if the queue is nonempty. We recorded those instants
> from a real LTE network, by keeping the queue nonempty and then measuring
> when datagrams were delivered to the other side, and then play back the same
> services in emulation. (The same strategy was used in our Sprout work.)
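
For readers who want to picture the playback step described above, here is a minimal Python sketch. It is only an illustration, not the code from the paper: the function name replay_trace, the tuple layout, and the sample numbers are all made up for this example. Each recorded instant is treated as one delivery opportunity, which delivers the head-of-line datagram if the queue is nonempty and is wasted otherwise.

```python
from collections import deque


def replay_trace(arrivals, delivery_instants):
    """Replay recorded delivery opportunities against a FIFO queue.

    arrivals: sorted list of (arrival_time_ms, packet_id) pairs.
    delivery_instants: sorted list of times (ms) at which the recorded
        link delivered one datagram.
    Returns a list of (packet_id, arrival_ms, delivery_ms) tuples.
    """
    queue = deque()
    delivered = []
    i = 0
    for t in delivery_instants:
        # Admit every datagram that arrived at or before this opportunity.
        while i < len(arrivals) and arrivals[i][0] <= t:
            queue.append(arrivals[i])
            i += 1
        # The opportunity is wasted if nothing is waiting.
        if queue:
            arrival_ms, pkt = queue.popleft()
            delivered.append((pkt, arrival_ms, t))
    return delivered


# Hypothetical sample data: three datagrams, four recorded opportunities.
arrivals = [(0, "a"), (1, "b"), (30, "c")]
opportunities = [5, 10, 20, 35]
result = replay_trace(arrivals, opportunities)
```

In this toy run the opportunity at 20 ms goes unused because the queue is empty at that instant.
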
>
> The summary is that we find fq_codel is pretty reasonable overall, but there
> are cases where it suffers considerably on some metrics that may be worth
> knowing about. For instance, the replication dataset includes a case where a
> bulk flow of TCP NewReno gets 12 Mbps over LTE (with a bufferbloated link),
> and only 4.4 Mbps if the LTE link is controlled with fq_codel or codel.
>
> There's also a case where we look at a contending bulk-throughput
> application and a Web-like application going over the same link, and *both*
> workloads end up doing a lot better over the bufferbloated, per-flow-queue
> LTE link than the fq_codel link.
>
> This is with a single real-world trace that was recorded from Verizon's LTE
> network in Boston. I don't think our dataset is sufficient to start making
> claims about whether fq_codel is or isn't good in the real world and whether
> Verizon should implement it on their base station queues (and whether
> Qualcomm should implement it on their baseband LTE modem queues for the
> uplink), but it could be useful as a tool to measure the kinds of properties
> Andrew is talking about without having to implement CoDel on an eNodeB.
>
> AFAIK we haven't explored sweeping the CoDel parameters (the 100ms and the
> 5ms) to see if different values perform better empirically on these traces,
> but maybe we should...
>
> Cheers,
> Keith
>
>>>> Nice study!  I was just wondering what would happen if the assumptions
>>>> behind the CoDel parameters did not hold.  Specifically: what happens to
>>>> a connection whose RTT exceeds 100 ms?

CoDel tracks queuing delay on a particular link, not the round trip
time of a connection, which can depend on propagation and transmission
delays on other links as well. Our experiments were run with a 150 ms
minimum RTT, and CoDel is designed for wide area networks where the
minimum RTT routinely exceeds 100 ms.
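
To make the distinction concrete, here is a simplified Python sketch of the CoDel drop decision, assuming the usual 5 ms target and 100 ms interval. It is an illustration only, not the ns-2 module used in the experiments: the class name CoDelSketch and the method should_drop are hypothetical, and details of the real algorithm (for example how the drop count is re-initialised when re-entering the dropping state, and dropping several packets in one dequeue) are left out. The only input is the per-packet sojourn time in this one queue; the connection's RTT never appears.

```python
import math

TARGET_MS = 5.0      # acceptable standing queue delay
INTERVAL_MS = 100.0  # window over which delay must stay above target


class CoDelSketch:
    """Simplified, single-queue CoDel drop decision (illustrative only)."""

    def __init__(self, target=TARGET_MS, interval=INTERVAL_MS):
        self.target = target
        self.interval = interval
        self.first_above = None  # time after which drops may begin
        self.dropping = False
        self.count = 0
        self.drop_next = 0.0

    def should_drop(self, now, sojourn):
        """Return True if the packet dequeued at `now` should be dropped."""
        if sojourn < self.target:
            # Delay dipped below target: leave any dropping state.
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            # First packet above target: start a one-interval grace period.
            self.first_above = now + self.interval
            return False
        if now < self.first_above:
            return False
        # Sojourn time has stayed above target for a full interval.
        if not self.dropping:
            self.dropping = True
            self.count = 1
            self.drop_next = now + self.interval / math.sqrt(self.count)
            return True
        if now >= self.drop_next:
            # Drop spacing shrinks as interval / sqrt(count).
            self.count += 1
            self.drop_next += self.interval / math.sqrt(self.count)
            return True
        return False


# Toy run: one dequeue every 10 ms with a constant 20 ms sojourn time.
codel = CoDelSketch()
drops = [t for t in range(0, 301, 10) if codel.should_drop(float(t), 20.0)]
```

In this toy run the first drop comes one interval after the delay first exceeds the target, and later drops come at shrinking spacing.
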

> ...Daniel
>
>
> On Tue, Dec 3, 2013 at 6:45 PM, Andrew Mcgregor <andrewmcgr at google.com>
> wrote:
>
> Empirically, for fq_codel, long RTT flows work fine so long as RTT < 5
> intervals, roughly speaking, and it degrades very slowly.  So 100ms is
> about right for the internet.
>
>
> On 4 December 2013 10:10, Daniel Havey <dhavey at yahoo.com> wrote:
>
>> On Tuesday, December 3, 2013 2:41 PM, Andrew Mcgregor <
>> andrewmcgr at google.com> wrote:
>>  All of which is why fq_codel is so much better... because flows queue
>> independently, and drops are calculated per flow (although overall queue
>> size is included implicitly via the sojourn time), the RTT delay has far
>> less impact.  CoDel is an ingredient of an AQM system, not a desirable AQM
>> on its own.
>>
>> >Makes sense to me.  We need to get the worst-case RTT right.  If we set
>> the interval to 100 ms then users with larger RTTs may
>> have issues.
>>
>>
>> On 4 December 2013 08:11, Daniel Havey <dhavey at yahoo.com> wrote:
>>
>> On Tuesday, December 3, 2013 12:22 PM, Detlef Bosau <detlef.bosau at web.de>
>> wrote:
>>
>> To my understanding, the "sojourn time" considered in CoDel is the
>> difference between the time when a packet /leaves/ a queue and the time
>> when this packet /arrived/ at the queue. In other words: the time a
>> packet spends in the queue.
>>
>> When this time is unusually high, CoDel infers imminent congestion and
>> drops packets.
>>
>> The problem is that CoDel does not distinguish whether the "sojourn time"
>> is caused by a huge number of packets in the queue, i.e. congestion, or
>> by a huge delivery time resulting from corruption loss and necessary
>> retransmissions.
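
A small Python sketch of this point, under a toy single-server FIFO model (the function name sojourn_times and all numbers are invented for illustration): the waiting time of the last packet is the same whether it sits behind twenty other packets on a clean link or behind one packet that needs 20 ms of link-layer retransmissions.

```python
def sojourn_times(arrivals_ms, service_ms):
    """Time each packet waits in a FIFO queue before its service starts.

    arrivals_ms: sorted arrival times (ms).
    service_ms: per-packet time on the link (ms), including any
        link-layer retransmissions.
    """
    waits = []
    free_at = 0.0  # time at which the link finishes the previous packet
    for arrive, service in zip(arrivals_ms, service_ms):
        start = max(arrive, free_at)  # dequeue when the link is free
        waits.append(start - arrive)
        free_at = start + service
    return waits


# Case A: burst of 21 packets at t=0 on a clean link, 1 ms per packet.
congested = sojourn_times([0.0] * 21, [1.0] * 21)

# Case B: only 2 packets at t=0, but the lossy link needs 20 ms per packet.
lossy = sojourn_times([0.0] * 2, [20.0] * 2)

# Both last packets see the same queueing delay, well above a 5 ms target.
same_delay = congested[-1] == lossy[-1]
```

An AQM that looks only at these waiting times gets the same signal in both cases.
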
>>
>> CoDel parameters are interval and target.  If the queue drains (the
>> sojourn time falls back below target) before the interval expires then
>> there shouldn't be any drops.  Also there is the "leverage" from "RED in
>> a Different Light".  If CoDel decides to drop a packet from a flow
>> that is in congestion avoidance, fast retransmit or slow start then the
>> window is halved and the queue drains quickly.  If the flow doesn't have
>> enough data to trigger fast retransmit then that is unfortunate for that
>> user since they now have to wait an RTT for that packet, and it does not
>> drain the queue very much.
>>
>>
>> Hence, we have the good old loss differentiation problem. And because
>> CoDel is particularly intended for edge routers, the disaster is placed
>> exactly where it is expected to happen......8-)
>>
>> Detlef
>>
>> On 01.12.2013 22:16, Andrew Mcgregor wrote:
>> > I mean sojourn time, one way in the particular queue, as per CoDel,
>> > rather than anything TCP-related.  Clearance rate is fairly simply
>> > related to sojourn time, of course, given enough integration time for
>> > the statistics to converge.
>> >
>> >
>> > On 2 December 2013 02:57, Detlef Bosau <detlef.bosau at web.de> wrote:
>> >
>> >> On 01.12.2013 06:05, Andrew Mcgregor wrote:
>> >>> The actual clearance rate from the queue (or the sojourn time), if
>> >>> that matters for your AQM scheme.  That way you are not assuming a
>> >>> known line rate.
>> >> Clearance rate or sojourn time?
>> >>
>> >> Clearance rate may apply for a packet delivery rate. From a TCP point
>> >> of view, the sojourn time is the difference between the arrival of the
>> >> corresponding ACK and the time a data packet left the sender.
>> >>
>> >> So you omit any recovery latency.
>> >>
>> >>
>> >>
>> >>> On 30 November 2013 00:13, Detlef Bosau <detlef.bosau at web.de> wrote:
>> >>>
>> >>>>  On 29.11.2013 00:24, Andrew Mcgregor wrote:
>> >>>>
>> >>>> In which case... measure, don't assume.  Served us well for 802.11
>> >>>> modulation selection, I don't see why it shouldn't work for AQM.
>> >>>>
>> >>>>
>> >>>> What do you want to measure?
>> >>>>
>> >>>
>> >>
>> >> --
>> >> ------------------------------------------------------------------
>> >> Detlef Bosau
>> >> Galileistraße 30
>> >> 70565 Stuttgart                            Tel.:   +49 711 5208031
>> >>                                            mobile: +49 172 6819937
>> >>                                            skype:     detlef.bosau
>> >>                                            ICQ:          566129673
>> >> detlef.bosau at web.de                    http://www.detlef-bosau.de
>>
>> >>
>> >>
>> >
>>
>>
>>
>> --
>> Andrew McGregor | SRE | andrewmcgr at google.com | +61 4 8143 7128
>>
>>
>>
>