[e2e] Some thoughts on WLAN etc., was: Re: RES: Why Buffering?
Detlef Bosau
detlef.bosau at web.de
Sun Jul 5 05:44:56 PDT 2009
Lachlan Andrew wrote:
> Greetings Detlef,
>
> 2009/7/5 Detlef Bosau <detlef.bosau at web.de>:
>
>> Lachlan Andrew wrote:
>>
>>> "a period of time over which all packets are lost, which extends for
>>> more than a few average RTTs plus a few hundred milliseconds".
>>>
>> I totally agree with you here in the area of fixed networks, actually we use
>> hello packets and the like in protocols like OSPF. But what about outliers
>> in the RTT on wireless networks, like my 80 ms example?
>>
>
> That is why I said "plus a few hundred milliseconds".
Now, how large is "a few"?
Don't get me wrong: there certainly are networks where the link state can
be determined.
E.g.:
- Ethernet, Normal Link Pulse,
- ISDN, ATM, where we have a continuous bit flow,
- HSDPA, where we have a continuous symbol flow on the pilot channel in
downlink direction and responses from the mobile stations in uplink
direction.
In all these networks, we have continuous or periodic short-time traffic
on the link, and this traffic is reflected by responses within a quite
well known period of time. In addition, the hello-response behaviour
does not depend on any specific traffic: in Ethernet or ATM, a link, or
a link outage respectively, is detected even when no traffic from upper
layers exists.
In some sense, this even holds true for HSDPA, if we define an HSDPA
link to be "down" when the base station no longer receives CQI
indications.
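The common pattern in these examples (continuous or periodic probe traffic, answered within a known deadline) could be sketched as a generic link monitor. This is purely an illustrative sketch; the class, parameter names and values are my own assumptions, not taken from any of the protocols above:

```python
import time

class LinkMonitor:
    """Declare a link "down" when no response (ACK, CQI report,
    pilot echo, ...) has arrived within a deadline.
    Parameter values are illustrative assumptions only."""

    def __init__(self, hello_interval=1.0, deadline=3.0,
                 clock=time.monotonic):
        self.hello_interval = hello_interval  # period of hello traffic
        self.deadline = deadline              # silence tolerated before "down"
        self.clock = clock                    # injectable for testing
        self.last_response = self.clock()

    def on_response(self):
        # any reply observed on the link resets the silence timer
        self.last_response = self.clock()

    def link_up(self):
        return (self.clock() - self.last_response) < self.deadline
```

The point of the sketch is only that such a detector needs traffic that is independent of the upper layers; where no such probe traffic exists (as discussed below for ad hoc networks), the scheme has nothing to measure.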
I'm not quite sure (to be honest: I don't really know) whether similar
mechanisms are available e.g. for Ad Hoc Networks.
Particularly as we well know of hidden terminal / hidden station
problems, where stations in a wireless network cannot even see each other.
> You're right
> that outliers are common in wireless, which is why protocols to run
> over wireless need to be able to handle such things.
>
>
Exactly.
So, we come to an important turn in the discussion. It's not only the
question whether we can detect a link outage.
The question is: How do we deal with a link outage?
In wireline networks, link outages are supposed to be quite rare.
(Nevertheless, the consequences may be painful.)
In contrast to that, link outages are extremely common in MANETs.
Actually, we have to ask what the terms "link", "link outage"
and "disconnection" should even mean in MANETs.
For example, think of TCP. How does TCP deal with a link outage?
Now, if this were a German mailing list and I came from Cologne, I would
write: "Es is wie es is und et kütt wie et kütt." ("It is as it is, and
it comes as it comes.")
More internationally spoken: "Don't worry, be happy."
If the path is finally broken, the TCP flow is broken as well.
If there is an alternative path and the routing is adjusted by some
mechanism, the TCP flow will continue.
Of course, there may be packet loss. So, TCP will retransmit packets.
Of course, the path capacity may change. So, TCP will reassess the path
capacity, either by slow start or by one or several triple-duplicate-ACK /
fast retransmit / fast recovery cycles.
Of course, the throughput may change. That's the least problem of all,
because it's automatically fixed by the ACK clocking mechanism.
Of course, the RTT may change. So, the timers have to converge to a new
expectation.
There will be some rumbling, more or less, but afterwards TCP will keep
on going.
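How the timers converge to a new expectation can be illustrated with the standard SRTT/RTTVAR rules of RFC 6298. The RTT samples below are made-up values for a path change from 50 ms to 200 ms:

```python
def rto_update(srtt, rttvar, sample, alpha=1/8, beta=1/4):
    """One RFC 6298 update step; returns (srtt, rttvar, rto).
    Pass srtt=None for the first RTT measurement."""
    if srtt is None:
        srtt, rttvar = sample, sample / 2
    else:
        # RTTVAR is updated with the *old* SRTT, then SRTT itself
        rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
        srtt = (1 - alpha) * srtt + alpha * sample
    rto = srtt + 4 * rttvar
    return srtt, rttvar, max(rto, 1.0)  # RFC 6298 lower bound of 1 second

# Path change: ten samples at 50 ms, then thirty at 200 ms.
srtt = rttvar = None
for sample in [0.05] * 10 + [0.2] * 30:
    srtt, rttvar, rto = rto_update(srtt, rttvar, sample)
# srtt has converged close to the new 200 ms expectation
```

Note that the convergence is geometric with factor (1 - alpha) per sample, so the estimator needs a few dozen ACKs before it fully trusts the new path; that is exactly the window in which volatile MANET paths can mislead it.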
Either way, there is no smart guy to tell TCP "there is a short time
disconnection." Hence, there is no explicit mechanism in TCP to deal
with short time disconnections, because the TCP mechanisms as they are
work fine - even when short time disconnections and path changes occur.
There is no need for a special "short time disconnection handling".
Of course, this raises the question whether TCP as is can be suitable
for MANETs, because one can well question whether e.g. the RTO
estimation and the CWND assessment algorithms in TCP will hold in the
presence of volatile paths with volatile characteristics.
TCP is supposed to work with a connectionless packet transport mechanism
with "reasonably quasistationary characteristics" and a packet loss
ratio we can reasonably live with.
Or for the people in Cologne: "Es is wie es is und et kütt wie et kütt."
>> Was there a "short time disconnection" then?
>> Certainly not, because the system was busy to deliver the packet all the
>> time.
>>
>
> From the higher layer's point of view, it doesn't matter much whether
> the underlying system was working hard or not...
Correct. From the higher layer's point of view, the questions are:
- is the packet acknowledged at all?
- is the round trip time "quasistationary"? (=> Edge's paper)
- is the packet order maintained, or should we adapt the dupack threshold?
- more TCP specific: is the MSS appropriate or should it be changed?
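The dupack threshold question can be made concrete: with the usual threshold of three duplicate ACKs, any reordering deep enough to produce three duplicates triggers a spurious fast retransmit. A toy sender-side count (a hypothetical sketch, not any real stack):

```python
def spurious_fast_retransmit(acks, dupthresh=3):
    """Return True if the sender would trigger fast retransmit,
    i.e. if a run of duplicate cumulative ACKs reaches dupthresh."""
    dupes, last = 0, None
    for a in acks:
        dupes = dupes + 1 if a == last else 0
        last = a
        if dupes >= dupthresh:
            return True
    return False

# Segment 2 delayed behind 3, 4 and 5: the sender sees ACK 2,
# then three duplicate ACKs for 2, then ACK 6 when the late
# segment finally arrives - no loss, yet fast retransmit fires.
reordered = [2, 2, 2, 2, 6]
```

With the default threshold the reordered trace triggers a retransmit; raising the threshold to four would absorb this particular reordering depth, at the price of reacting later to genuine loss. That trade-off is exactly what "should we adapt the dupack threshold" asks.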
> If the outlier were
> more extreme, then I'd happily call it a short term disconnection, and
> say that the higher layers need to be able to handle it.
>
>
Question: Should we _actively_ _handle_ it (e.g. Freeze-TCP?) or should
we build protocols sufficiently robust that they can implicitly
cope with short time disconnections?
>> So the problem is not a "short time disconnection", the problem is that
>> timeouts don't work
>>
>
> Timeouts are part of the problem. Another problem is reestablishing
> the ACK clock after the disconnection.
>
>
Hm. Where is the problem with the ACK clock?
If anything, the problem could be (and I'm not quite sure about WLAN
here) that a TCP downlink may use more than one path in parallel. Hence,
three packets may be delivered along three different paths - and a
sender in the wireline network sees three ACKs and hence sends three
packets....
However, in the normal "single path scenario", I don't see a severe
problem. Or am I missing something?
>> Actually, e.g. in TCP, we don't deal with "short time disconnections"
>>
>
> There may not be an explicit mechanism to deal with them. I think
> that the earlier comment that they are more important than random
> losses is saying that we *should* perhaps deal with them (somehow), or
> at least include them in our models.
>
I'm actually not convinced that short time disconnections are more
important than random losses.
If this were the attitude of the reviewers who rejected my papers, I
would suppose they were trying to tease me.
Of course, I could redefine any random loss as a short time
disconnection - then there wouldn't be any random loss at all.
However, that would be a nasty kind of hair splitting.
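The hair splitting can actually be quantified. A two-state Gilbert-Elliot chain (which comes up again below) and an IID coin can produce the same average loss rate, yet the Markov chain concentrates its losses in bursts that look like short disconnections. A minimal sketch; all parameter values are made up for illustration:

```python
import random

def gilbert_elliot(n, p_gb=0.01, p_bg=0.1, loss_bad=0.9, seed=1):
    """Two-state Markov loss process: Good (no loss) <-> Bad.
    p_gb / p_bg are per-slot transition probabilities."""
    rng = random.Random(seed)
    bad, losses = False, []
    for _ in range(n):
        if bad:
            losses.append(rng.random() < loss_bad)
            if rng.random() < p_bg:
                bad = False
        else:
            losses.append(False)
            if rng.random() < p_gb:
                bad = True
    return losses

trace = gilbert_elliot(100_000)
loss_rate = sum(trace) / len(trace)   # around 8 % on average

# Longest run of consecutive losses - an IID Bernoulli process with
# the same loss_rate would almost never produce long runs like this.
longest = cur = 0
for lost in trace:
    cur = cur + 1 if lost else 0
    longest = max(longest, cur)
```

So "random loss" and "short time disconnection" are not two phenomena but two parameter regions of one model, which is why declaring one of them the more important seems arbitrary to me.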
I think the most important lesson from my experience last week is that
we must not assume one wireless problem to be more important than the
others.
Of course, this mainly calls into question the opportunistic scheduling
work, which assumes that there is only Rayleigh fading, and that apart
from the useful, well behaved, periodic and predictable Rayleigh fading
for evenly moving mobiles there is no other disturbance on the wireless
channel.
Of course, many students earn their "hats" that way, but the more I
think about it, the less I believe that this really reflects reality.
Detlef
>
>> So, the basic strategy of "upper layers" to deal with short time
>> disconnections, or latencies more than average, is simply not to deal with
>> them - but to ignore them.
>>
>> What about a path change? Do we talk about a "short time disconnection" in
>> TCP, when a link on the path fails and the flow is redirected then? We
>> typically don't worry.
>>
>
> Those delays are typically short enough that TCP handles them OK. If
> we were looking at deploying TCP in an environment with common slow
> redirections, then we should certainly check that it handles those
> short time disconnections.
>
>
>> To me, the problem is not the existence - or non existence - of short time
>> disconnections at all but the question why we should _explicitly_ deal with
>> a phenomenon where no one worries about?
>>
>
> The protocol needn't necessarily deal with them explicitly, but we
> should explicitly make sure that it handles them OK.
>
>
>> Isn't it sufficient to describe the corruption probability?
>>
>
> No, because that ignores the temporal correlation. You say that the
> Gilbert-Elliot model isn't good enough, but an IID model is orders of
> magnitude worse.
>
> Cheers,
> Lachlan
>
>
--
Detlef Bosau Galileistraße 30 70565 Stuttgart
phone: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau
ICQ: 566129673 http://detlef.bosau@web.de