[e2e] 10% packet loss stops TCP flow

Wed Mar 2 06:25:43 PST 2005

Reiner Ludwig wrote:

>
>... this rule with the "magic number 1%" simply can not hold. In the future we will see wireless links running at speeds > 1 Gb/s. Given that the RTT is lower bounded by the laws of physics, we end up with huge bandwidth x delay products. And, we know that TCP does not perform well with a loss rate of 1% in such a regime [FAST-TCP, HS-TCP, Scalable-TCP, etc.]. No single magic number will work for all regimes.
>
>One rule of thumb that works, though, is to make the link-level retransmission completely persistent (don't give up until the link is declared DOWN). That way any errors or variations on the link are translated into congestion, and that is something that at least the rate-adaptive end-points understand.
>  
>
You and I are on the same page in suggesting that TCP need not be the 
only place where optimization occurs, and that link level is a pragmatic 
and appropriate place to get part of the way there.

The 1% was not intended as a "magic number".  However, your "magic 
solution" goes way too far in the opposite direction, introducing a 
link-layer "perfect efforts" rather than "best efforts" reliability 
function that unilaterally declares that a link knows better than the 
endpoints what the endpoints want.
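
For a rough sense of why no single loss number holds across regimes, 
here is a sketch using the widely cited Mathis et al. approximation for 
a Reno-style TCP flow, rate ~ (MSS/RTT) * C/sqrt(p) with C = sqrt(3/2). 
The 1460-byte MSS, the 100 ms RTT, and the function names are 
assumptions picked for illustration, not figures from this thread.

```python
import math

# Constant from the Mathis et al. approximation (periodic-loss model).
MATHIS_C = math.sqrt(1.5)

def reno_rate_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state rate of one Reno-style flow, in bits/s."""
    return (mss_bytes * 8 / rtt_s) * MATHIS_C / math.sqrt(loss_rate)

def loss_rate_to_fill(link_bps, mss_bytes, rtt_s):
    """Loss rate at which the approximation just reaches link_bps."""
    return (MATHIS_C * mss_bytes * 8 / (rtt_s * link_bps)) ** 2

# Assumed example values: 1460-byte MSS, 100 ms RTT, 1 Gb/s link.
rate_at_1pct = reno_rate_bps(1460, 0.100, 0.01)
needed_loss = loss_rate_to_fill(1e9, 1460, 0.100)
print(f"rate at 1% loss: {rate_at_1pct / 1e6:.2f} Mb/s")
print(f"loss rate needed to fill 1 Gb/s: {needed_loss:.1e}")
```

Under these assumptions a 1% loss rate holds a single flow to roughly 
1.4 Mb/s, and filling a 1 Gb/s link would need a loss rate on the order 
of 2e-8, so the tolerable loss rate depends on the bandwidth x delay 
product rather than on any fixed percentage.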

The end-to-end argument we wrote about (as opposed to many strawman 
interpretations, and mere misunderstandings) says that there are network 
functions that are definable and implementable fully only at the 
endpoints - those are called end-to-end functions.   The 
reliability/retransmission tradeoff is one such function.   There are 
MANY reasons not to have low-level links trying to second-guess the 
endpoints' need for retransmission.   The end-to-end argument just 
suggests extreme caution in introducing such ideas.

Just as ECN/RED is quite compatible with the end-to-end argument (and 
ECN/RED is not just a TCP function!  Various UDP-based protocols like 
RTP should also be using it), time-bounded link-level retransmission is 
quite appropriate as well.   ECN and RED are special cases of 
multicasting the output of congestion sensors to those known to be 
interested at the edges (a very nice theory of how to provide general 
and application-independent/protocol-independent network state reflection).
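
The difference between time-bounded and fully persistent link-level 
retransmission can be sketched with a toy model. Everything here is 
hypothetical: the function name link_send, the 10 ms per-attempt cost, 
the 50 ms bound, and the 10 s "link declared DOWN" timeout are 
illustrative assumptions, not parameters of any real link layer.

```python
import random

def link_send(loss_prob, attempt_ms, deadline_ms, rng):
    """Try to deliver one frame across a lossy link.

    Retransmits until the frame gets through or another attempt would
    exceed deadline_ms. Returns (delivered, ms_spent_on_link).
    """
    elapsed_ms = 0
    while elapsed_ms + attempt_ms <= deadline_ms:
        elapsed_ms += attempt_ms
        if rng.random() >= loss_prob:  # this attempt was not lost
            return True, elapsed_ms
    return False, elapsed_ms           # gave up; the endpoints see a loss

rng = random.Random(0)

# During an outage (every attempt lost), compare the two policies.
bounded = link_send(1.0, attempt_ms=10, deadline_ms=50, rng=rng)
persistent = link_send(1.0, attempt_ms=10, deadline_ms=10_000, rng=rng)
print("time-bounded:", bounded)
print("persistent until DOWN:", persistent)
```

In this model the time-bounded link drops the frame after 50 ms and 
leaves the decision to the endpoints, while the persistent link holds 
it for the full 10 s, which the endpoints can only observe as delay.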

The network layer under TCP that works best is one whose end-to-end 
latency is predominantly driven by congestion, and that does not hide 
failures that might be handled by the end-to-end protocol 
(retransmission or route switching) or application (the user canceling 
the transaction if it takes too long being the most trivial application 
control loop).