[e2e] Can feedback be generated more fast in ECN?

Wed Feb 21 15:41:19 PST 2001

> I said nothing about timeouts.  Please read about "ack pacing" to
> discover that TCP stops long before a timeout.

If your point is that SQ has almost no chance of alerting a sender before
it has drained the available window, then of course I agree with you (I
thought I said this last week). There are some exceptions where it could
happen, but they are distinct exceptions to the demographic profile of the
bulk sender, and I won't claim them as any kind of proof.

But this does not diminish the fact that SQ offers the ability to bypass
the "default" event handling in order to explicitly state the problem once
this has occurred. Come on, the sender doesn't stop trying to transmit
after the ACKs stop coming back. It is the behavioral modification
capabilities of SQ in the face of explicit congestion -- combined with its
performance gains over default behavior and ECN -- that make it a win.

> A timeout reduces the congestion window, long after transmissions have
> stopped for lack of ACKs.

But it uses default behaviors after the fact, some of which are not going
to be conducive to recovery. Dropping cwin to one segment is not always the
best behavior for all forms of congestion, and neither is doubling the
retrans timer. And so forth. You may want to tell a sender to halve cwin
and double the SRTT without doing anything else, for example.

It is possible to establish a variety of codes for a variety of
conditions, ranging from the weakest signalling ("double SRTT only") to
the harshest ("go away right now"). I don't know how many times I can say
it: this flexibility is what's behind the real value of SQ.
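
As a rough illustration only, a graded code space could be sketched like
this. The code values, the names (SQCode, SenderState,
apply_source_quench), and the specific reactions are all hypothetical; no
RFC defines them, and the real definitions would have to come from a WG.

```python
from dataclasses import dataclass
from enum import IntEnum


class SQCode(IntEnum):
    # Hypothetical code points, weakest to harshest. Not defined by any RFC.
    DOUBLE_SRTT = 0
    HALVE_CWND_DOUBLE_SRTT = 1
    SLOW_START = 2
    STOP_SENDING = 3


@dataclass
class SenderState:
    cwnd: int              # congestion window, in segments
    ssthresh: int          # slow-start threshold, in segments
    srtt: float            # smoothed round-trip time, in seconds
    paused: bool = False   # True once the sender has been told to go away


def apply_source_quench(state: SenderState, code: int) -> SenderState:
    """Apply exactly one mandated reaction per code (assumed semantics)."""
    if code == SQCode.DOUBLE_SRTT:
        state.srtt *= 2
    elif code == SQCode.HALVE_CWND_DOUBLE_SRTT:
        state.cwnd = max(1, state.cwnd // 2)
        state.srtt *= 2
    elif code == SQCode.SLOW_START:
        # Same reaction as a retransmission timeout.
        state.ssthresh = max(2, state.cwnd // 2)
        state.cwnd = 1
    elif code == SQCode.STOP_SENDING:
        state.cwnd = 0
        state.paused = True
    else:
        # An unknown code is rejected rather than guessed at.
        raise ValueError(f"unknown SQ code: {code}")
    return state


if __name__ == "__main__":
    s = SenderState(cwnd=16, ssthresh=64, srtt=0.2)
    print(apply_source_quench(s, SQCode.HALVE_CWND_DOUBLE_SRTT))
```

The point of the sketch is that each code maps to one fixed reaction, so
two senders receiving the same code cannot legitimately behave differently.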

> > SQ did not work because it was vague in terms of explicit behavior.

> A quick check this morning found some RFC's with reasonably specific
> descriptions of how SQ's might work

792  "On receipt of a source quench
      message, the source host should cut back the rate at which it is
      sending traffic to the specified destination until it no longer
      receives source quench messages from the gateway.  The source host
      can then gradually increase the rate at which it sends traffic to
      the destination until it again receives source quench messages."

1122 "In general, the transport or application layer SHOULD implement
      a mechanism to respond to Source Quench"

     "TCP MUST react to a Source Quench by slowing transmission on the
      connection.  The RECOMMENDED procedure is for a Source Quench to
      trigger a "slow start," as if a retransmission timeout had
      occurred."

There is nothing explicit about any of that at all. The sender CAN go to a
cwin zero state, the sender MUST "slow down" in some vague manner.
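
To make the vagueness concrete, here is a small sketch of two senders that
both "cut back the rate" on receipt of an SQ. The first follows the
RECOMMENDED slow-start reaction from 1122; the second, with its 10%
reduction, is a hypothetical implementation invented for this example. The
function names and the window sizes are assumptions as well.

```python
def react_slow_start(cwnd: int, ssthresh: int) -> tuple[int, int]:
    """RECOMMENDED in RFC 1122: act as if a retransmission timeout occurred."""
    # Standard timeout reaction: remember half the window, restart at one segment.
    return 1, max(2, cwnd // 2)


def react_gentle(cwnd: int, ssthresh: int, factor: float = 0.9) -> tuple[int, int]:
    """Hypothetical sender: trims the window by 10% and leaves ssthresh alone."""
    return max(1, int(cwnd * factor)), ssthresh


if __name__ == "__main__":
    cwnd, ssthresh = 32, 64  # segments; illustrative starting point
    print("slow start:", react_slow_start(cwnd, ssthresh))
    print("gentle:    ", react_gentle(cwnd, ssthresh))
```

Both functions reduce the sending rate, so both arguably satisfy "slowing
transmission", yet after one SQ the resulting windows are very far apart.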

Of course the implementations varied, resulting in:

1812 
     "Research seems to suggest that Source Quench consumes network
      bandwidth but is an ineffective (and unfair) antidote to
      congestion."

"Seems to suggest" is what we writers call "bullshitting the reader." It's
exactly the kind of terminology that is used when you have no direct
proof. Given that it follows clauses about rate limiting, it's pretty
obvious to me that the principal intent is unloading SQ duties, using any
reason. "Seems to" was good enough, I guess.

Meanwhile, RFC 896 showed that explicit handling of SQ resulted in
measurable benefits for the network. Apparently there's no need for "seems
to" bullshit when there is a measured and detailed implementation.

I believe that, by defining explicit behavior for explicit problem
scenarios, SQ can be extremely useful for the Internet at large. The
past experience was based on a bunch of "may" clauses and undefined
behaviors, which were naturally optimized for failure. I have no desire to
repeat past mistakes, but those mistakes were vagueness, not SQ.

Note that I am purposefully vague in my recommended behaviors, because I
believe that these must be defined by a WG. I am simply pointing out that
there are a lot of mechanisms available through code interpretation. I
would expect that every SQ code level would be accompanied by an explicit
set of behaviors which MUST be implemented.

> Unreachables and Time Exceed messages are not serious security problems
> only because they are ignored by TCP state machines in Established state
> and completely ignored in all states by some TCP implementations.

There are two obvious ways to abuse SQ. First is to saturate the link with
spurious messages, but this works with DU and TE messages as well. Yes, it
has been done, but it is very easy to profile these attacks and defeat them
when they happen. The other attack (which you seem to hint at) is directly
targeting the sender by spoofing active socket pairs. Where does this
information come from? Does the attacker sweep the sender, hoping to take
out all active sessions? Wouldn't this also be easily profiled? Or is the
attacker a man-in-the-middle bandit? Those attacks are also
profile-friendly, but don't you have a much larger security problem at
this point? A man-in-the-middle can just abscond with all of your
traffic, so why bother with SQ?

> > Who are we designing for? the equipment manufacturers?
> 
> Please offer your estimate of how many CPU and/or memory cycles
> are required to generate an ICMP packet.

> I notice how you've refused my previous request to estimate how many
> SQ's a router might need to generate, as well as completely ignored
> my recent reference to the SQ/sec rate of a Tbit/sec router.

I have never professed any level of expertise with hardware design. Nor
will I. What I have expressed is a doubt that it is insurmountable.

> I apologize to the other readers of the mailing list for continuing
> this trade rag stuff.  I'll shut up now.

If you're not fatigued -- and if you want to -- I'll continue discussing
this with you off-list. We have to agree to some fundamentals before any
progress will be made though. That seems unlikely.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/