[e2e] local recovery or not local recovery, was: Re: Satellite networks latency and data corruption

Tue Jul 5 09:03:09 PDT 2005

There are pros and cons to both hop-by-hop and end-to-end control.

The nature of loss (weather events) implies that there is generally no
correlation between the error rates of the different hops. In practice,
only one hop will have a significant error rate, while the others will
be nearly error-free.

If you assume a perfect selective ARQ mechanism, you will find that
end-to-end is slightly less efficient from a bandwidth point of view,
because any retransmitted packet is carried over every hop. However,
this is a relatively contained issue, since even with a "large" error
rate only a small fraction of packets are in fact retransmitted.
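
To put rough numbers on this, here is a minimal sketch in Python. It
assumes a path of `hops` links where a single link loses packets
independently with probability `p`, a perfect selective ARQ, and (the
pessimistic case described above) that every end-to-end retransmission
crosses every hop. The function names and sample values are
illustrative, not taken from any measurement.

```python
def expected_tx_per_packet(p):
    """Mean number of transmissions per packet on a link with loss
    probability p under perfect selective ARQ (geometric distribution)."""
    return 1.0 / (1.0 - p)


def link_load_end_to_end(hops, p):
    # Assumption: each end-to-end retransmission is carried over every hop.
    return hops * expected_tx_per_packet(p)


def link_load_hop_by_hop(hops, p):
    # Only the lossy hop repeats packets; the others carry each packet once.
    return (hops - 1) + expected_tx_per_packet(p)


def end_to_end_overhead(hops, p):
    """Relative extra link capacity used by end-to-end recovery."""
    e2e = link_load_end_to_end(hops, p)
    hbh = link_load_hop_by_hop(hops, p)
    return (e2e - hbh) / hbh


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.05):
        print(f"p={p}: overhead {end_to_end_overhead(3, p):.2%}")
```

Under these assumptions, three hops with a 1% loss rate on one of them
cost end-to-end recovery well under 1% in extra link capacity.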

You will also find that a typical hop-by-hop ARQ (e.g. HDLC with
selective rejects) results in large delays if you assume re-sequencing
at each node, because all flows must wait for the retransmission of any
errored frame. In practice, the delay on the offending hop becomes three
times larger than necessary, since the original transmission, the
reject, and the retransmission each have to cross the hop.
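
A simple timeline makes the hold-up concrete. The sketch below assumes
a one-way hop delay `d`, a loss that is noticed when the next frame
arrives, an immediate selective reject, and negligible transmission and
processing times. The function name and the 225 ms figure are
assumptions chosen for illustration.

```python
def delivery_times(send_times, lost_index, d, resequence=True):
    """Return the time each frame is handed on by the receiving node.

    send_times -- frame send times in seconds, in sequence order
    lost_index -- index of the single frame lost on first transmission
                  (must not be the last frame, so the gap can be noticed)
    d          -- one-way hop delay in seconds
    """
    arrivals = [t + d for t in send_times]
    # The gap is noticed when the frame after the lost one arrives.
    detect = arrivals[lost_index + 1]
    # The reject takes d to reach the sender, the retransmission d to return.
    arrivals[lost_index] = detect + 2 * d
    if not resequence:
        return arrivals
    # In-order delivery: no frame is released before its predecessors.
    released = []
    latest = 0.0
    for t in arrivals:
        latest = max(latest, t)
        released.append(latest)
    return released


if __name__ == "__main__":
    sends = [0.0, 0.01, 0.02, 0.03]
    print(delivery_times(sends, lost_index=1, d=0.225))
    print(delivery_times(sends, lost_index=1, d=0.225, resequence=False))
```

With d = 0.225 s and frames sent 10 ms apart, the lost frame and every
frame queued behind it are released about 0.7 s after the first send,
roughly three hop delays. Without re-sequencing only the lost frame is
late.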

You could opt not to implement re-sequencing, but then you have to deal
with end-to-end re-ordering. Given the satellite hop delays, the
out-of-sequence packet will arrive about 450 ms after the original
packet would have arrived. In all likelihood, TCP will have triggered
an end-to-end retransmission by then, thus negating any bandwidth
advantage of hop-by-hop retransmission.
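
A rough check of that claim: TCP's fast retransmit fires on the third
duplicate ACK, so the question is whether three later segments reach
the receiver before the locally repaired one does. The sketch below
assumes a steady segment rate on the flow and ignores the
retransmission timer. The function name and the sample rates are
hypothetical.

```python
DUPACK_THRESHOLD = 3  # standard TCP fast retransmit threshold


def fast_retransmit_fires(segments_per_second, repair_delay_s=0.450):
    """True if at least DUPACK_THRESHOLD later segments reach the
    receiver before the locally retransmitted segment does.

    Assumption: segments arrive at a steady rate, and each out-of-order
    arrival produces one duplicate ACK.
    """
    later_segments = int(segments_per_second * repair_delay_s)
    return later_segments >= DUPACK_THRESHOLD


if __name__ == "__main__":
    for rate in (5, 7, 100):
        print(rate, fast_retransmit_fires(rate))
```

Under these assumptions any flow sending more than roughly seven
segments per second will already have asked for an end-to-end
retransmission by the time the hop-level repair arrives.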

The technique that is most useful is variable per-hop FEC. If the
satellite system can detect that the link is experiencing bad
conditions, it could push the FEC system to use more redundancy for the
duration of the event. The effect may not be perfect: there will still
be some residual errors, but the overall error rate will remain very low
and the transmission delays will remain contained.
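
As an illustration of the idea only, the sketch below picks the amount
of redundancy from a measured loss rate. It assumes a deliberately
simplified model: blocks of `k` data symbols plus `r` parity symbols,
independent symbol losses with probability `p`, and an ideal erasure
code that recovers any `r` losses per block. Real satellite modems use
different codes, and fades are bursty, so the names, the model, and the
target value are all assumptions.

```python
from math import comb


def block_failure_prob(k, r, p):
    """Probability that more than r of the k + r symbols are lost,
    i.e. that the block cannot be recovered (simplified model)."""
    n = k + r
    recoverable = sum(
        comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(r + 1)
    )
    return 1.0 - recoverable


def pick_redundancy(k, p, target=1e-6, max_r=64):
    """Smallest parity count r that keeps the block failure probability
    at or below target, or max_r if the target cannot be met."""
    for r in range(max_r + 1):
        if block_failure_prob(k, r, p) <= target:
            return r
    return max_r


if __name__ == "__main__":
    for p in (0.0, 0.001, 0.01, 0.1):
        print(f"p={p}: r={pick_redundancy(20, p)}")
```

The link would re-run something like `pick_redundancy` as its loss
estimate changes, adding parity during the weather event and dropping
back to the light setting once conditions recover.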

-- Christian Huitema