[e2e] Reacting to corruption based loss

Fri Jun 10 09:09:03 PDT 2005

Note that one of the major storage-systems vendors has for some years used a
modified TCP for the same reason -- improved fast-LAN performance.  I don't
recall seeing its loss behavior, but I did see it completely ignore receive
windows!  They were irrelevant to how the very fast end systems were designed
to buffer I/O data.

Alex

Rik Wade wrote:
> 
> On 7/06/2005, at 7:22 PM, Michael Welzl wrote:
> > This point has been raised several times: how exactly should
> > a sender react to corruption? I fully agree that continuing
> > to send at a high rate isn't a good idea.
> > [...]
> > So, why don't we just decide for a pragmatic approach instead
> > of waiting endlessly for a research solution that we can't come
> > up with? Why don't we simply state that the reaction to corruption
> > has to be: "reduce the rate by multiplying it by 7/8"?
> >
> > Much like the TCP reduction by half, it may not be the perfect
> > solution (Jacobson actually mentions that the reduction by half
> > is "almost certainly too large" in his congavoid paper), but
> > it could be a way to get us forward.
> >
> > ...or is there a reasonable research method that can help us
> > determine the ideal reaction to corruption, irrespective of
> > the cause?
> 
> I did some work on this as part of my PhD several years ago. A
> summary of the work was published as:
> 
> R. Wade, M. Kara, P. M. Dew. "Proposed Modifications to TCP Congestion
> Control for High Bandwidth and Local Area Networks." In
> Proceedings of the 6th IEEE Conference on Telecommunications
> (ICT'98), July 1998.
> (Paper available for download from http://www.rikwade.com)
> 
> At the time, I was working with 155Mb/s ATM and Fast Ethernet, and
> looking at the performance of TCP congestion avoidance algorithms
> over such networks. My thoughts were along the lines of those
> mentioned elsewhere in this thread - why should TCP make such a large
> reduction in its window size if loss was only due to a single ATM
> cell drop, or corruption elsewhere in the stack?
> 
> The proposal in our paper was to maintain a weighted history of the
> congestion window size and to attempt to use this value when
> perceived loss was encountered. If the loss was a unique event, and
> the connection was long-lived, then restart would likely be close to
> the current transmission rate, and the connection could continue as
> normal. If recurrent loss was encountered, then the algorithm
> reverted to its normal mode of operation after three (for example)
> attempts. Various values for the history weighting were simulated in
> order to evaluate whether a more, or less, aggressive approach was
> better.
> 
> I was quite happy with the results and it was a relatively simple
> modification to the Reno implementation in both Keshav's Real
> simulator and NS.
> --
> rik wade
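
For readers who want to experiment with the weighted-history idea Rik
describes above, here is a rough sketch in Python. It is not the code from
the paper or from the Real/NS modifications. The class name, the
exponentially weighted average, the 0.875 weighting, and the "quiet_acks"
rule for deciding that a loss was an isolated event are all assumptions
made for illustration; only the general idea (restore from history, fall
back to normal behaviour after about three attempts) comes from the message.

```python
from dataclasses import dataclass


@dataclass
class HistoryCwnd:
    """Toy congestion window with a weighted history (illustrative only)."""

    cwnd: float = 1.0         # congestion window, in segments
    history: float = 1.0      # weighted history of cwnd
    weight: float = 0.875     # assumed history weighting, not from the paper
    max_restores: int = 3     # "three (for example)" attempts
    quiet_acks: int = 100     # assumed: loss-free ACKs before the count resets
    restores: int = 0         # history restores since the last reset
    acks_since_loss: int = 0  # consecutive ACKs seen without a loss

    def on_ack(self) -> None:
        # Reno-style congestion avoidance: about one segment per RTT.
        self.cwnd += 1.0 / self.cwnd
        # Fold the new window into the weighted history.
        self.history = (self.weight * self.history
                        + (1.0 - self.weight) * self.cwnd)
        # Assumption: a long loss-free run means earlier losses were isolated.
        self.acks_since_loss += 1
        if self.acks_since_loss >= self.quiet_acks:
            self.restores = 0

    def on_loss(self) -> None:
        self.acks_since_loss = 0
        if self.restores < self.max_restores:
            # Treat the loss as possibly isolated: restart from the history.
            self.restores += 1
            self.cwnd = max(1.0, self.history)
        else:
            # Recurrent loss: revert to the usual halving and reset history.
            self.cwnd = max(1.0, self.cwnd / 2.0)
            self.history = self.cwnd
            self.restores = 0
```

On a long-lived connection the history sits close to the current window, so
an isolated loss costs almost nothing, while a fourth loss in quick
succession triggers the ordinary reduction by half. Varying "weight" is one
way to explore the more aggressive versus less aggressive trade-off
mentioned above.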