[e2e] Can feedback be generated more fast in ECN?

Wed Feb 21 03:25:29 PST 2001

> There are lots of reasons to stop senders when things have gone horribly
> wrong. DOS attacks are one of them, telling the sender to stop quickly and
> reminding them whenever they try to crank up again is good for the entire
> network, is it not? Don't you think people would like this feature if it
> helps to seriously constrain DOS attacks?

Why would someone writing a DOS tool make it slow down in response to SQ (or indeed anything)?

> What about an oversubscribed exchange point, or when a backhoe has
> redirected your routes over slower links? I can point you to a 100%
> utilized network any day of the week, some of which are poorly planned,
> some of which are accidental, all of them are detrimental.
> 
> Don't those also qualify as being good scenarios for telling people to
> slow down, if not to stop altogether? The senders should slow down rather
> than stop when it is non-fatal, and this is a good usage scenario for SQ
> codes above zero, since they are also examples of where ECN fails to
> notify the sender quickly.

Lost TCP ACKs will notify the sender at most one RTT later than SQ could, which will delay the recovery by a few RTTs.  If the best that SQ can do is to recover from catastrophic failures a few seconds earlier then I suggest it's not worth it.
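
As a rough illustration of that arithmetic, the sketch below turns "a few RTTs" into seconds. The 200 ms RTT and the choice of 3 RTTs are assumed figures picked for the example, not measurements, and the function name is hypothetical.

```python
def sq_head_start(rtt_s: float, rtts_saved: float) -> float:
    """Upper bound, in seconds, on how much sooner recovery finishes with SQ.

    rtt_s      -- round-trip time of the path in seconds (assumed value)
    rtts_saved -- how many RTTs earlier the sender recovers (assumed value)
    """
    return rtt_s * rtts_saved


# Assumed figures: a 200 ms path, and "a few" taken to mean 3 RTTs.
print(f"{sq_head_start(0.2, 3):.1f} s")  # prints "0.6 s"
```

Even on a long path the gain stays in the range of a second or so per failure, which is the basis for doubting that it is worth the cost.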

> And of course not everything is TCP.

If your point is that some traffic (such as streaming media) is not responsive then yes, it should be fixed.

If your point is that protocols (such as RTP) could be enhanced to take note of SQ, then couldn't they be enhanced to use ECN instead?  Is the one-RTT speedup really worth it?

> But the key point is that SQ code 0 works for saturated links, while there
> are 255 more codes to use for other scenarios. SQ is capable of solving
> both problems.

What other problems do you have in mind?

> > Isn't the point of ECN to deal with congestion before significant losses
> > occur?  If so, why do you keep talking about how ECN doesn't work with
> > congestion that involves not merely losses but catastrophic losses?
> 
> I am raising the point that the biggest congestion problems that we have
> are from failure, not from incremental build-up. ECN acts like there's
> never any failure, or that failure doesn't matter since it can't do
> anything about it. SQ can deal with both of these scenarios.

ECN's position is that dropping packets already deals with the failure case when it occurs, and that optimising the incremental build-up reduces the chance of failure.

> > If the reason Cisco sends Unreachables, Time Exceededs, and
> > Fragmentation Needed so slowly is only because they're what the
> > other guys call the Evil Empire, then the other guys must be
> > generating those messages fast.  But they're not, are they?
> 
> I'm not naming any names, I haven't done any detailed timing tests, which
> is why I haven't completely dismissed this argument. I know that some of
> them are extremely slow, and some of them are not too slow. It might be
> load related, might be vendor related, might be CPU related, probably all
> of them combined. I have no data but I don't believe they are all crap.

The problem is not that generating SQ packets will be slow, because if the router only has to generate one a second then it will probably be fairly quick (though not as quick as forwarding the packet would be).

The problem is that if the router needs to generate hundreds or thousands of SQ packets a second then this will overload the processor and stop it doing other things like running routing protocols and handling CAM misses in the fast path.
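
A back-of-envelope sketch of that load follows. The 100 microseconds of processor time per generated SQ is purely an assumed figure for illustration (real costs depend on the router), and the function name is hypothetical.

```python
def slow_path_cpu_fraction(sq_per_second: float, cost_us: float) -> float:
    """Fraction of one processor spent generating SQ messages.

    sq_per_second -- SQ messages generated per second
    cost_us       -- processor time per message in microseconds (assumed)
    """
    return sq_per_second * cost_us / 1_000_000


# Assumed cost of 100 microseconds per message at three generation rates.
for rate in (1, 100, 5000):
    print(rate, f"{slow_path_cpu_fraction(rate, 100.0):.2%}")
```

Under that assumption one SQ a second is negligible, while 5000 a second would take half the processor away from routing protocols and other slow-path work.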

Of course you could redesign the router, or put a bigger processor in it, but I suspect most customers would be unhappy with a 10% more expensive router just so that they can recover from backhoes a couple of RTTs faster.

Regards,

    Andy