[e2e] Open the floodgate - back to 1st principles

Sun Apr 25 20:57:05 PDT 2004

Guy -
>I am honestly not sure if this rule of thumb is being remembered correctly 
>or if router designers examine it critically.

I'm not doubting that the extra buffer memory in routers is useful, but my 
point was who decided that it should run filled rather than essentially empty?

That's like saying that all capacitors in a circuit should be charged to 
their breakdown voltage, or all highways should be filled with bumper to 
bumper traffic to optimize the morning commute.

The whole point of buffer memory in routers is to absorb transient shocks, 
which can only be done if they are near empty.

At 10 gigabits/second, the United States is about 150 million bits (roughly 
20 megabytes) wide.  A fully pipelined path whose bottleneck link runs at 10 
gigabits/sec would have no more than about 40 megabytes of buffer memory 
occupied in steady state (2 x the number of bytes in transit), and that 
total is divided among the routers on the path.  Of course, if you send 
packets that are too large, that increases the amount of buffering needed 
to achieve the bottleneck goodput (why users think that performance 
increases with packet size is an interesting question).   Any sustained 
"cross traffic" involving a bottleneck link would reduce the memory needed 
for the path, requiring less occupancy to sustain the maximum obtainable 
goodput.  Bursty source traffic would, for the same reason, also reduce 
the total memory needed to achieve the best achievable goodput.
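
For anyone who wants to check the arithmetic above, here is a rough 
sketch.  The 15 ms one-way delay is my assumption, inferred from the 
150-million-bit figure rather than measured, and the names are just 
illustrative.

```python
# Back-of-the-envelope pipe size for a 10 gigabit/sec coast-to-coast path.
# ONE_WAY_DELAY_S is an assumed value chosen to match the 150 Mbit figure.
LINK_RATE_BPS = 10e9       # bottleneck rate: 10 gigabits/second
ONE_WAY_DELAY_S = 0.015    # assumed one-way propagation delay (15 ms)


def bits_in_transit(rate_bps, delay_s):
    """Bits 'in flight' on a path with the given rate and one-way delay."""
    return rate_bps * delay_s


bits = bits_in_transit(LINK_RATE_BPS, ONE_WAY_DELAY_S)
megabytes = bits / 8 / 1e6          # convert bits to megabytes
steady_state_mb = 2 * megabytes     # 2 x the number of bytes in transit

print(f"{bits / 1e6:.0f} million bits, {megabytes:.2f} MB in transit")
print(f"steady-state occupancy bound: {steady_state_mb:.1f} MB")
```

That works out to a bit under 19 megabytes in transit and a bit under 40 
megabytes total, which is where the rounded figures above come from.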

The point of those big buffers is solely to absorb transients (when a link 
breaks briefly, or a burst of "cross traffic" appears and disappears), but 
if the sources don't slow down quickly, all that happens is that you've 
introduced a sustained backlog (essentially a standing traffic jam) 
that grows monotonically as long as the load is maintained, because the 
outflow cannot run any faster than its standard rate.
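
A toy fluid model makes the point concrete.  This is only a sketch under 
simplifying assumptions (constant arrival and service rates, one queue, 
made-up units), and the function name and parameters are hypothetical.

```python
def backlog_over_time(arrival_rate, service_rate, steps, buffer_limit=None):
    """Return queue occupancy after each time step for constant rates.

    Rates are in arbitrary units of traffic per time step.  If
    buffer_limit is given, anything beyond it is treated as dropped.
    """
    backlog = 0.0
    history = []
    for _ in range(steps):
        # The queue gains what arrives and loses what the link can send,
        # but it can never go below empty.
        backlog = max(0.0, backlog + arrival_rate - service_rate)
        if buffer_limit is not None:
            backlog = min(backlog, buffer_limit)  # excess is discarded
        history.append(backlog)
    return history


overloaded = backlog_over_time(arrival_rate=11, service_rate=10, steps=5)
underloaded = backlog_over_time(arrival_rate=9, service_rate=10, steps=5)
capped = backlog_over_time(11, 10, steps=5, buffer_limit=3)
```

With the load even slightly above the outflow rate the backlog climbs by 
the difference on every step; a bigger buffer only changes how long it 
takes before discarding starts, not whether it starts.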

The longer it takes for the source to hear about congestion that it can 
help resolve, the more dramatic a slowdown that source will have to make in 
order to prevent massive discarding of traffic in the network.   So 
building up big piles of traffic in network buffers, beyond what is needed 
to achieve full pipelining, only argues for AIMD-like full-brakes-on 
responses.

Worse yet, when the buffers are kept (or become) full, a higher and 
higher proportion of "congestion signals" gets returned to senders that are 
near the end of their connections, and thus cannot have much of an impact 
on reducing the congestion.   If all I have left to retransmit is a few 
bytes, I can cause very little of the built-up load to go away - the 
connections that sustain the assumed high load will be among the newer 
ones, that are just getting started.   They won't have sent much, so there 
is no reason for them to see any problems that will encourage them to start 
holding back.

The whole point of RED and ECN (which are *brilliant*, elegant concepts) 
is to provide *early* signals that traffic is accumulating in those 
buffers, long before they are full.
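
As a reminder of how the early signal is produced, here is a simplified 
sketch of the RED idea as it is usually described: track a smoothed 
average of the queue length and mark (or drop) with a probability that 
rises between two thresholds.  The threshold and weight values are 
illustrative assumptions, and the sketch leaves out refinements such as 
spacing marks by the count of packets since the last one.

```python
def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * current_queue


def mark_probability(avg_queue, min_th, max_th, max_p):
    """Probability of marking/dropping an arriving packet (simplified RED).

    Below min_th nothing is marked; between min_th and max_th the
    probability rises linearly up to max_p; at or above max_th every
    packet is marked or dropped.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Illustrative numbers only: thresholds in packets, max_p of 10%.
p_idle = mark_probability(5, min_th=10, max_th=30, max_p=0.1)
p_mid = mark_probability(20, min_th=10, max_th=30, max_p=0.1)
p_full = mark_probability(30, min_th=10, max_th=30, max_p=0.1)
```

The thing to notice is that senders start hearing about the buildup at 
min_th, well short of a full buffer.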

And it's also clear from control theory (and has been demonstrated) that 
when the drops (or ECN bits) are used to signal congestion, the right 
packets to drop (or set the ECN bits on) are the ones at the *head* of the 
congested outgoing router queue, not the tail.   That provides a much 
quicker signal of congestion, which leads to a quicker response, and 
smoother control at the application level.   This may seem 
"counterintuitive" but the bug is in the intuition - arising from the fact 
that humans tend to have difficulty thinking at the systems level.
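
A rough way to see the head-versus-tail difference, assuming a simple FIFO 
queue in which a loss is noticed only once the packet following the gap 
leaves the router.  The function and its parameters are hypothetical, and 
the numbers are illustrative.

```python
def signal_delay_s(queue_len_pkts, pkt_bits, link_rate_bps, drop_at_head):
    """Queueing time before the gap left by a drop exits the router.

    With a drop at the head, the next packet out already reveals the gap.
    With a drop at the tail, everything queued ahead must drain first.
    """
    if drop_at_head:
        return 0.0
    return queue_len_pkts * pkt_bits / link_rate_bps


# 1000 queued packets of 12000 bits (1500 bytes) on a 10 gigabit/sec link.
tail_delay = signal_delay_s(1000, 12000, 10e9, drop_at_head=False)
head_delay = signal_delay_s(1000, 12000, 10e9, drop_at_head=True)
print(f"tail drop adds {tail_delay * 1e3:.1f} ms; head drop adds "
      f"{head_delay * 1e3:.1f} ms")
```

The extra delay for a tail drop scales with how full the queue is, so the 
signal is slowest exactly when it is most needed.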