[e2e] 10% packet loss stops TCP flow
Reiner.Ludwig at ericsson.com
Thu Mar 3 05:49:46 PST 2005
At 15:25 02.03.05, David P. Reed wrote:
>You and I are on the same page in suggesting that TCP need not be the
>only place where optimization occurs, and that link level is a pragmatic
>and appropriate place to get part of the way there.
>The 1% was not intended as a "magic number".
>However, your "magic
>solution" goes way too far in the opposite direction, introducing a
>link-layer "perfect efforts" rather than "best efforts" reliability
>function that unilaterally declares that a link knows what the endpoints
>want more than they do.
Fine. Then we do not agree. But, I'll try to make my point clearer ...
For example, assume a wireless link where the optimal size of the retransmission unit, as determined by the link's error characteristics, is 40 bytes. Why should L2 ARQ give up on a packet and leave it to TCP to retransmit? TCP can't do any better, anyway.
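To put rough numbers on the 40-byte example, here is a minimal sketch comparing the expected bytes sent over the link in the two cases. It assumes that unit errors are independent with a fixed probability, that headers and feedback cost nothing, and that without L2 ARQ a packet is lost whenever any one of its units is corrupted. The function names, the 1500-byte packet and the 5% unit error rate are illustrative assumptions, not figures from this thread.

```python
def units_per_packet(packet_bytes, unit_bytes):
    """Number of link-layer retransmission units needed to carry one packet."""
    return -(-packet_bytes // unit_bytes)  # ceiling division


def expected_bytes_l2_arq(packet_bytes, unit_bytes, unit_error_rate):
    """Expected link bytes when L2 resends only the corrupted units.

    Each unit needs a geometrically distributed number of attempts with
    mean 1 / (1 - p), assuming independent unit errors (an assumption).
    """
    units = units_per_packet(packet_bytes, unit_bytes)
    return units * unit_bytes / (1.0 - unit_error_rate)


def expected_bytes_whole_packet(packet_bytes, unit_bytes, unit_error_rate):
    """Expected link bytes when the whole packet is resent end-to-end.

    The packet gets through only if every unit gets through in the same
    attempt, so the success probability is (1 - p) ** units.
    """
    units = units_per_packet(packet_bytes, unit_bytes)
    packet_success = (1.0 - unit_error_rate) ** units
    return units * unit_bytes / packet_success


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    l2 = expected_bytes_l2_arq(1500, 40, 0.05)
    e2e = expected_bytes_whole_packet(1500, 40, 0.05)
    print(f"L2 ARQ on 40-byte units: {l2:.0f} bytes on the link")
    print(f"whole-packet retransmit: {e2e:.0f} bytes on the link")
```

Under these assumptions the per-unit scheme costs about 1600 bytes per delivered packet, while whole-packet retransmission costs several times that, and the gap widens as the unit error rate grows.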
Completely persistent L2 ARQ translates errors into varying delay and congestion, two behaviors of the black box that end-points can adapt to. Error losses inside the black box cannot be understood by the end-points (unless explicitly signaled from within the black box); they can only be (mis-)interpreted as congestion losses.
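A toy simulation can illustrate the point. The sketch below assumes a slotted link that makes one transmission attempt per slot, a fixed arrival interval, independent errors, and a bounded queue with plain tail drop standing in for a real AQM scheme. All names and parameters are hypothetical. The link never discards a packet because of errors; a packet is dropped only when the queue is full.

```python
import random
from collections import deque


def simulate_persistent_arq(slots, arrival_interval, error_rate,
                            queue_limit, seed=0):
    """Slotted link with completely persistent ARQ and a bounded queue.

    Returns (delays, congestion_drops, still_queued). Delays are in slots.
    """
    rng = random.Random(seed)
    queue = deque()          # arrival slot of each queued packet
    delays = []              # per-packet delay of delivered packets
    congestion_drops = 0

    for now in range(slots):
        # One packet arrives every `arrival_interval` slots.
        if now % arrival_interval == 0:
            if len(queue) < queue_limit:
                queue.append(now)
            else:
                congestion_drops += 1   # the only way a packet is lost

        # One attempt per slot on the head-of-line packet. On an error the
        # packet stays at the head and is retried in the next slot.
        if queue and rng.random() >= error_rate:
            delays.append(now - queue.popleft() + 1)

    return delays, congestion_drops, len(queue)


if __name__ == "__main__":
    for p in (0.0, 0.3, 0.9):
        delays, drops, left = simulate_persistent_arq(10000, 2, p, 50)
        mean = sum(delays) / len(delays)
        print(f"error rate {p}: mean delay {mean:.1f} slots, "
              f"congestion drops {drops}, still queued {left}")
```

With a clean link every packet is delivered in one slot. At a moderate error rate the same packets arrive later but are still delivered. Only when the error rate pushes the link's effective capacity below the offered load does the queue fill and packets get dropped, which is what the end-points then see as congestion.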
But why bother about all of this? With a well-designed L1, the common case will be that the L2 needs no retransmissions, or only very few, anyway. If it does need many retransmissions, though, then the link is broken (at least during a transient period). In that case, nothing except re-routing traffic will help.
[Clearly, on a channel that is shared by multiple transceivers that can be in different physical locations, channel-dependent scheduling is needed in addition, to ensure that a user in a bad spot cannot cause head-of-line blocking for the others. But that's an orthogonal issue.]
Note that this is not saying anything about buffer sizes. Clearly, completely persistent L2 ARQ should be combined with AQM (e.g., RED/ECN).
The same arguments also hold when considering a rate-adaptive but delay-sensitive application, such as streaming over DCCP/TFRC.
And, I think that all of this is well in line with the e2e argument:
"best effort" = "try until congestion forces you to drop".