[e2e] Open the floodgate - back to 1st principles
Guy T Almes
almes at internet2.edu
Mon Apr 26 08:49:15 PDT 2004
We might be in "violent agreement" on this point. There are two
different issues. (1) That the large buffers are there and (2) how
Reno/AIMD uses them.
And, in a way, it all comes down to your phrase "absorb transient
shocks". I think that the TCP experts of the early 1990s were exactly
asking for a delay-bandwidth product worth of buffer per port in order to
absorb transient shocks, namely the transient shock designed into AIMD/Reno
by having it continue to grow its cwnd until something breaks. Thus they
fully intended to fill that buffer during the peak phase of the Reno/AIMD
sawtooth. And they hoped that Reno/AIMD could recover as the buffer was
I do not recall the details of this idea as it was developed in the early
1990s and hope a knowledgeable veteran will step up and correct and/or
complete this story.
But I'm pretty sure that, even if this idea did work well in the early
1990s, it no longer works. On what passes for a high-speed wide-area
path today, it takes many minutes for Reno/AIMD to recover, and
the conventional delay-bandwidth product does not suffice.
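A rough back-of-envelope sketch of the "many minutes" claim, under an idealized Reno model: one loss halves cwnd, which then grows by about one segment per round trip. The 100 ms RTT, the 1500-byte segment size, and the function name are illustrative assumptions, not figures from this thread.

```python
def reno_recovery_seconds(link_bps, rtt_s, mss_bytes=1500):
    """Time for an idealized Reno flow to regrow cwnd after one halving."""
    # Window (in segments) needed to fill the path: the delay-bandwidth product.
    window_segments = link_bps * rtt_s / (8 * mss_bytes)
    # After a loss cwnd is halved; additive increase then adds ~1 segment per RTT,
    # so it takes about window/2 round trips to get back to the full window.
    rtts_to_recover = window_segments / 2
    return rtts_to_recover * rtt_s


# Assumed example paths: 100 ms RTT at three link speeds.
for gbps in (0.1, 1, 10):
    seconds = reno_recovery_seconds(gbps * 1e9, rtt_s=0.1)
    print(f"{gbps:>4} Gb/s, 100 ms RTT: about {seconds / 60:.1f} minutes")
```

Under these assumptions the recovery time scales with bandwidth times RTT squared: roughly 7 minutes at 1 Gb/s and over an hour at 10 Gb/s.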
And, in the meantime, the designed-in behavior of growing cwnd until
something breaks does have some negative environmental impact.
--On Sunday, April 25, 2004 23:57:05 -0400 "David P. Reed"
<dpreed at reed.com> wrote:
> Guy -
>> I am honestly not sure if this rule of thumb is being remembered
>> correctly or if router designers examine it critically.
> I'm not doubting that the extra buffer memory in routers is useful, but
> my point was who decided that it should run filled rather than
> essentially empty?
> That's like saying that all capacitors in a circuit should be charged to
> their breakdown voltage, or all highways should be filled with bumper to
> bumper traffic to optimize the morning commute.
> The whole point of buffer memory in routers is to absorb transient
> shocks, which can only be done if they are near empty.
> At 10 gigabits/second, the United States is about 150 million bits (20
> megabytes) wide. A fully pipelined path with its bottleneck link being
> 10 gigabits/sec would have no more than about 40 megabytes of buffer
> memory occupied in steady state (2 x the number of bytes in transit),
> divided by the number of routers on the path; of course if you send
> packets that are too large, that increases the amount of buffering needed
> to achieve the bottleneck goodput (why users think that performance
> increases with packet size is an interesting thing). Any sustained
> "cross traffic" involving a bottleneck link would reduce the memory
> needed for the path, requiring less occupancy to sustain the maximum
> obtainable goodput. Bursty source traffic would also for the same reason
> reduce the total memory needed to achieve the best achievable goodput.
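The arithmetic in the quoted paragraph can be checked with a short sketch. The 15 ms one-way delay is an assumption: it is the value implied by "150 million bits" at 10 gigabits/second, not a figure stated in the post.

```python
def bits_in_flight(link_bps, one_way_delay_s):
    """Bits "in the pipe" on a fully pipelined path: rate times one-way delay."""
    return link_bps * one_way_delay_s


# Assumption: 15 ms coast-to-coast one-way delay (implied by the quoted figures).
ONE_WAY_DELAY_S = 0.015
LINK_BPS = 10e9

bits = bits_in_flight(LINK_BPS, ONE_WAY_DELAY_S)
megabytes = bits / 8 / 1e6
print(f"path width: {bits / 1e6:.0f} Mbit = {megabytes:.2f} MB")
print(f"2 x bytes in transit: {2 * megabytes:.1f} MB")
```

This gives 18.75 MB in transit and 37.5 MB for twice that, which the post rounds to 20 and 40 megabytes.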
> The point of those big buffers is solely to absorb transients (when a
> link breaks briefly, or a burst of "cross traffic" appears and
> disappears), but if the sources don't slow down quickly, all that happens
> is that you've introduced a sustained backlog clog (essentially a
> sustained traffic jam) that grows monotonically as long as the load is
> maintained, because the outflow can not run any faster than its standard
> The longer it takes for the source to hear about congestion that it can
> help resolve, the more dramatic a slowdown that source will have to do in
> order to prevent massive discarding of traffic in the network. So
> building up big piles of traffic in network buffers that is not needed to
> achieve full pipelining only argues for AIMD-like full-brakes-on
> Worse yet, when the buffers are kept (or become) full, a higher and
> higher proportion of "congestion signals" get returned to senders that
> are near the end of their connections, and thus cannot have much of an
> impact on reducing the congestion. If all I have left to retransmit is
> a few bytes, I can cause very little of the built-up load to go away -
> the connections that sustain the assumed high load will be among the
> newer ones, that are just getting started. They won't have sent much,
> so there is no reason for them to see any problems that will encourage
> them to start holding back.
> The whole point of RED and ECN (which are *brilliant*, elegant concepts)
> is to provide *early* signals that traffic is accumulating in those
> buffers, long before they are full.
> And it's also clear from control theory (and has been demonstrated) that
> when the drops (or ECN bits) are used to signal congestion, the right
> packets to drop (or set the ECN bits on) are the ones at the *head* of
> the congested outgoing router queue, not the tail. That provides a much
> quicker signal of congestion, which leads to a quicker response, and
> smoother control of the application level. This may seem
> "counterintuitive" but the bug is in the intuition - arising from the
> fact that humans tend to have difficulty thinking at the systems level.
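A deliberately simplified sketch of why dropping (or marking) at the head of the queue signals sooner than at the tail. The model and all names are assumptions for illustration: a tail drop is only noticed once the packets already queued ahead of it have drained, while a head drop is noticed as soon as the next queued packet leaves.

```python
def queue_drain_seconds(backlog_bytes, link_bps):
    """Time for the outgoing link to transmit the current backlog."""
    return backlog_bytes * 8 / link_bps


def signal_delay_seconds(backlog_bytes, link_bps, rest_of_path_s, drop_at_head):
    """Approximate delay before the sender can learn of the drop or mark.

    rest_of_path_s: assumed time from the router to the receiver and back
    to the sender, excluding this queue.
    """
    # Head drop: the gap is exposed immediately, so no queueing delay is added.
    # Tail drop: the gap sits behind the whole backlog until it drains.
    queueing = 0.0 if drop_at_head else queue_drain_seconds(backlog_bytes, link_bps)
    return queueing + rest_of_path_s


# Assumed example: 20 MB backlog on a 10 Gb/s link, 50 ms for the rest of the path.
tail = signal_delay_seconds(20e6, 10e9, 0.05, drop_at_head=False)
head = signal_delay_seconds(20e6, 10e9, 0.05, drop_at_head=True)
print(f"tail drop: {tail * 1e3:.0f} ms, head drop: {head * 1e3:.0f} ms")
```

In this model the head-drop signal arrives earlier by exactly the queue drain time, so the advantage grows with the backlog.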