[e2e] [tcpm] RTTM + timestamps

David P. Reed dpreed at reed.com
Mon Jan 17 18:04:05 PST 2011

For congestyion control purposes, it seems like the proper figure to use 
should factor out queuing along the current path.

That is, since the ideal equilibrium state of the system is to have the 
average packet queue depth in each "bottleneck"* node between source and 
destination be 1 packet, the RTT on which one should base congestion 
management should be the RTT that would be achieved if the system were 
at that state.

If you base RTT on the current cumulative queue-backlog across all 
nodes, you end up creating a positive-feedback control loop where the 
queues grow, making the RTT longer, which encourages more packets to be 
pushed into the queues, making the RTT longer, ...

Positive-feedback control loops are unstable in a very bad way.  You 
will end up pinning the end-to-end latency at the largest possible value.

The proper design of the congestion control in the Internet should 
always push the system to minimize backlog on every queue in the 
system.  The average queue length should always converge (quickly) to 
<=1 packet over all paths.  Otherwise, the system is in a congestive 
degraded state.

I feel weird having to say this, but it appears that the "conventional 
wisdom" all too often assumes that "throughput" will be optimized by 
building up queues in every router and switch.  That is not correct.   
Beyond one packet in the bottleneck links, throughput becomes quite 

* a "bottleneck" link (for the purposes of my point in the email) is the 
link or link(s) along a path that sustain the lowest data rate of all 
links on that path.  Typically there is one bottleneck link, but in some 
cases (where there are chains of links that carry the same data rate) 
there may be two or more.

This is easily seen as the "optimal pipelining" model of scheduling jobs.

/WWW: http://www.reed.com/dpr/

On 01/17/2011 05:57 PM, Scheffenegger, Richard wrote:
> Hi Michael,
> again thank you for these references!
> I wanted to bring this particular issue up, to discuss if an improved RTO calculation should be brought into a hypothetical I-D talking about improved signaling using existing options.
> The one currently used by Linux has the merit of extremely wide deployment, and being conservative (only improves one particular aspect over the current standard calculation).
> For non-technical issues, "Eifel" algorithms offer less good choice for any such future I-D work... (restrictions on IPR prevent the use in commercial implementations unfortunately).
> However, I think most of these scenarios would also benefit from obtaining good feedback about the RTT samples (with the additional side-info, if the RTT was obtained off a reordered or retransmitted segment). But current signaling doesn't yield that feedback back from the client.
>   Best regards,
>     Richard
> From: SCHARF, Michael [mailto:Michael.Scharf at alcatel-lucent.com]
> Sent: Montag, 17. Jänner 2011 21:52
> To: Scheffenegger, Richard; tcpm at ietf.org; iccrg at cs.ucl.ac.uk; end2end-interest at postel.org
> Cc: Matt Mathis; Mark Allman
> Subject: Re: [tcpm] RTTM + timestamps
> Richard,
> I happened to look at the RTO calculation and RTO spike issue a long time ago. For whatever it is worth, I scanned that old work and extracted some references that may or may not be well-known. A number of algorithms have indeed been developed to address these issues (e. g., Linux stack), and several papers tried to further optimize the timer calculation in particular for mobile environments, including amongst others:
> R. Ludwig und K. Sklower. The Eifel Retransmission Timer. ACM SIGCOMM Computer Communications Review, 30(3), 2000, pp. 17-27 [this is not the actual Eifel algorithm]
> H. Ekstroem and R. Ludwig, "The Peak-Hopper: A New End-to-End Retransmission Timer for Reliable Unicast Transport," Proc. IEEE INFOCOM, 2004
> K. Jacobsson, H. Hjalmarsson, N. Moeller and K. H. Johansson, "Round-Trip time estimation in communication networks using adaptive Kalman filtering"
> A. Kesselman, Y. Mansour, "Optimizing TCP Retransmission Timeout", Springer LNCS 3421, 2005
> I. Psaras, V. Tsaoussidis, "Why TCP timers (still) don't work well", Computer Networks, Volume 51, Issue 8, 6 June 2007
> etc.
> And, BTW, a young wannabe-TCP researcher once tried to systematically understand the RTO spikes resulting from RFC 2988:
> Michael Scharf, Marc C. Necker, Bernd Gloss: "The Sensitivity of TCP to Sudden Delay Variations in Mobile Networks", Springer LNCS 3042, 2004, pp. 76-87
> If a better RTT estimation or RTO calculation was indeed needed, these papers might contain some interesting starting points.
> Michael
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110117/4095add4/attachment-0001.html

More information about the end2end-interest mailing list