[e2e] What's wrong with this picture?

Detlef Bosau detlef.bosau at web.de
Sat Sep 12 04:22:39 PDT 2009


Lachlan Andrew wrote:
> No, IP is claimed to run over a "best effort" network.  That means
>   

To run? Or to hobble? ;-)

> that the router *may* discard packets, but doesn't mean that it
> *must*.  

This is not even a matter for theoretical debate.

A router's storage capacity is finite, hence its buffers cannot hold an 
unbounded number of packets.

However, we're talking about TCP here, and TCP simply does not work 
properly without some kind of congestion control.
Hence there is a strong need to inform the sender when the network is 
congested.

> If the delay is less than the IP lifetime (3 minutes?) then
> the router is within spec (from the E2E point of view). 

I don't know of any router that checks a packet's lifetime against a 
clock, although some stone-age specification proposes a temporal 
interpretation of the lifetime field ;-) In practice, the IPv4 lifetime 
is a maximum hop count. In IPv6, this is even true by the spec, which 
calls the field "hop limit".
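
Just to illustrate the hop-count interpretation, here is a minimal 
Python sketch (purely illustrative, the names are my own) of what a 
router actually does with the field - note that no clock is involved:

    from dataclasses import dataclass

    @dataclass
    class Packet:
        ttl: int            # IPv4 TTL / IPv6 hop limit

    def forward(p: Packet) -> bool:
        """Per-hop lifetime handling: decrement, discard at zero."""
        p.ttl -= 1          # one hop consumed, no wall-clock check
        if p.ttl <= 0:
            return False    # discard (a real router would also send
                            # an ICMP Time Exceeded to the source)
        return True

    p = Packet(ttl=3)
    hops = 0
    while forward(p):
        hops += 1
    print(hops)             # -> 2: the packet is dropped at hop three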

>  The dominance
> of IP was exactly that it doesn't place heavy "requirements" on the
> forwarding behaviour of the routers.
>
>   

Which does not mean that there is no need to do so. Particularly when 
any kind of recovery layer comes into effect, we should carefully 
consider requirements for router behaviour.
>> Otherwise, the binding of this particular layer 2 transport (with elastic 10
>> second queues) is something that is just WRONG to claim as an high-speed
>> Internet service.    (except for the case where the other end is 10 light
>> seconds away).
>>     
>
> No, it is not "WRONG" to claim a high bit-rate service is high-speed,
> even if it has high latency. 

Hm. I think the problem is the misconception Zartash mentioned 
yesterday: are we talking about rates? About delays? Or about service 
times?

In packet-switching networks, we generally talk about service times and 
nothing else.

Any kind of "rate" or "throughput" is a derived quantity.

One particular consequence of this is that we may well consider 
_restricting_ service times and discarding packets which cannot be 
serviced within a certain amount of time, in order to keep queues 
stable and to avoid unbounded head-of-line blocking etc.
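
A minimal sketch of such a bounded-service-time queue (Python, purely 
illustrative; the class name and the bound are my own assumptions, not 
a recommendation):

    import time
    from collections import deque

    class DeadlineQueue:
        """FIFO that discards packets which cannot be serviced
        within a bounded waiting time, keeping the queue stable."""

        def __init__(self, max_wait_s: float):
            self.max_wait_s = max_wait_s
            self.q = deque()                 # (enqueue_time, packet)

        def enqueue(self, packet) -> None:
            self.q.append((time.monotonic(), packet))

        def dequeue(self):
            # Serve the head; discard anything that waited too long.
            while self.q:
                t0, packet = self.q.popleft()
                if time.monotonic() - t0 <= self.max_wait_s:
                    return packet            # serviced within the bound
                # else: discarded - the sender eventually sees a loss
            return None

    dq = DeadlineQueue(max_wait_s=0.1)
    dq.enqueue("pkt-1")
    print(dq.dequeue())     # "pkt-1" if served in time, else None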

A side effect of doing so is that the sender is, if only implicitly, 
informed about a "lost packet" and reacts accordingly: it does 
congestion handling.

When a door is congested, it is sometimes an academic debate whether it 
is simply overcrowded or temporarily closed. I cannot pass through it 
anyway. And the appropriate action is either to find another door, or 
to give it another try some time later.

>  High-speed is not equivalent to
> low-delay. 

However, a high-speed network will necessarily have small service times.
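
A quick back-of-the-envelope example: the service (serialization) time 
of a 1500-byte packet is

    1500 * 8 bit / 10 Mbit/s = 1.2 ms
    1500 * 8 bit / 10 Gbit/s = 1.2 us

so "high speed" translates directly into small per-packet service 
times, whatever the propagation delay of the path may be.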

>   
>> The entire congestion control mechanism (and approximate fairness mechanism)
>> of the Internet works on the assumption that congestion is signaled to those
>> who are congesting the network - so they can back off, which they will in
>> fact do.
>>     
>
> If the networks have changed, we should change the assumptions that TCP makes.
>
> In the old days, VJ changed TCP so that it would run over congested
> un-reliable networks. If TCP is now being asked to run over congested
> reliable networks, shouldn't we update TCP? 

I don't see the point.

From the days when VJ wrote the congavoid paper up to now, TCP has 
worked fine over congested reliable networks ;-)
(What, if not "reliable", are wired links?)

Dave's problem arises from unreliable networks (yes, I intentionally 
use a somewhat strange definition of reliability here ;-)), i.e. from 
wireless ones.

One possibility for turning those "unreliable networks" into reliable 
ones is a strict recovery layer which offers an arbitrarily high 
probability of successful packet delivery.
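
A rough back-of-the-envelope calculation (assuming independent losses 
with per-attempt loss probability p): if the recovery layer allows up 
to n transmission attempts, the delivery probability is

    P(delivered) = 1 - p^n

    e.g. p = 0.1, n = 6:  1 - 0.1^6 = 0.999999

so with enough persistence - and enough added delay - the link looks 
arbitrarily "reliable".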

If I were to read Lloyd's post in a black-hearted manner, I could 
interpret RFC 3366 in exactly that way.
However, I don't really think that this is the intended interpretation.

>  There are many methods
> which use delay as an indicator of congestion, as well as using loss.
>   
> (Should I plug Steven Low's FAST here?)  We don't need anything very
> fine-tuned in a case like this; just something very basic.
>   

I pursued ideas like this myself, because they are appealing at first 
glance - and I got several papers rejected.

Although this was disappointing to me at first, I had to accept that 
delay is one of the worst indicators of network congestion one could 
imagine.

One of the first criticisms I got was that there is usually no 
reference delay corresponding to a "sane" network.
(This is different in network management scenarios, where one 
_intentionally_ does some "baselining" in order to obtain exactly that.)

However, this is not the hardest one.

The real problem with delay and delay variations is that there are 
several possible causes for them:
- congestion,
- MAC latencies (similar to congestion),
- high recovery latencies due to large numbers of retransmissions,
- route changes,
- changes in path properties / line coding / channel coding / 
puncturing etc.

Without particular knowledge of the path, you may not be able to 
determine "the one" reason (if there is a single one at all) for a 
delay variation.
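
A toy illustration (hypothetical numbers) of why the sender cannot tell 
these causes apart:

    base_rtt = 0.100                # s: propagation + service time

    # cause A: congestion - the packet waits in a queue
    rtt_congested = base_rtt + 0.080

    # cause B: link-layer recovery - 4 retransmissions, empty queue
    rtt_recovered = base_rtt + 4 * 0.020

    print(rtt_congested, rtt_recovered)   # both ~0.18 s

Identical RTT samples, completely different causes - and only one of 
them calls for the sender to back off.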

So the consequence is that we should abandon the use of delay as a 
congestion indicator.

The original meaning of "congestion" is that a path is full and cannot 
accept more data.
And by far the most compelling indication that "this path cannot accept 
more data" is that the additional data is in fact discarded.


> Of course, fixing TCP to work over any IP connection (as it was
> intended) does not mean that the underlying networks should not be
> optimised.  As Lloyd said, we already have recommendations.
>
>   
And we have the end-to-end recommendations, which tell us that 
underlying networks should not attempt to solve all problems on their 
own.

And a proper trade-off between when a problem can be solved at a "low 
layer" and when it should be passed up to an upper layer (or upper 
layers should at least be involved) is _always_ a concern, not only in 
networking but in every kind of everyday life - including the 
bankruptcy of Lehman Brothers ;-)




-- 
Detlef Bosau		Galileistraße 30	70565 Stuttgart
phone: +49 711 5208031	mobile: +49 172 6819937	skype: detlef.bosau	
ICQ: 566129673		http://detlef.bosau@web.de			



