[e2e] Expected latency for a single hop

Detlef Bosau detlef.bosau at web.de
Thu Aug 4 11:27:27 PDT 2005

David P. Reed wrote:
> Detlef - Though it seems simple, your statement is about as complex as a 
> problem can be.
> This is the kind of problem statement that creates the definitional trap 
> I was referring to in earlier discussions.   By construing the "latency" 
> as being a property of the "link" rather than of the network as a whole, 
> the statement acquires a misleading simplicity

I know. However, the rationale behind my question is quite obvious: if 
you place a TCP sender at n1 and the corresponding receiver at n2, the 
adaptive RTO mechanism in TCP relies exactly upon the estimated mean and 
variance of (t2-t1).
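For concreteness: the adaptive RTO mechanism I mean is the standard
estimator (Jacobson's algorithm, as specified in RFC 2988), which keeps
EWMA estimates of the mean (SRTT) and mean deviation (RTTVAR) of the
measured round trip time. A minimal Python sketch:

```python
# Sketch of the standard TCP RTO estimator (Jacobson's algorithm / RFC 2988).
# SRTT and RTTVAR are EWMA estimates of the mean and the mean deviation
# of the measured round trip time r = t2 - t1.

ALPHA = 1 / 8   # gain for the smoothed RTT
BETA = 1 / 4    # gain for the mean deviation

def update_rto(srtt, rttvar, r):
    """One update step; r is the latest RTT sample in seconds."""
    if srtt is None:                  # first measurement
        srtt, rttvar = r, r / 2
    else:
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - r)
        srtt = (1 - ALPHA) * srtt + ALPHA * r
    rto = srtt + 4 * rttvar           # K = 4 per RFC 2988
    return srtt, rttvar, max(rto, 1.0)  # RFC 2988 lower bound of 1 s

srtt = rttvar = None
for r in [0.20, 0.25, 0.22, 0.80, 0.21]:
    srtt, rttvar, rto = update_rto(srtt, rttvar, r)
```

Note that mean and variance of (t2-t1) enter nowhere except through these
two exponentially weighted averages of a single flow's samples.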

If I had written: "Can we provide an adaptive RTO for a single hop TCP 
connection?" I would surely have been directed to the relevant literature. 
Perhaps it would have been considered a stupid question.

Thus, I thought it might be useful to state the same problem (sic!) which 
TCP claims to solve (even for n hops!) in somewhat different words >:-)

Honestly, I believe that if we cannot estimate mean and variance for a 
_single_ hop, it is perhaps not that much easier to do the job for an 
arbitrary number of hops (which of course includes the nasty case of a 
single hop).

> The latency only is well defined for real packets that actually arrive 
> and traverse the link.   Expectation and variance are properties of 
> distributions, not packets.

Yes. The EWMA filtering used in TCP is an attempt to do a parameterless 
forecast of the mean and variance of the actual latency distribution.

(Some weeks ago someone referred me to this well known saying by Niels 
Bohr: "Prediction is hard, especially of the future.")

Thus, of course, we estimate properties of an unknown (!) distribution 
and in turn derive an RTO by applying an inequality similar to 
Chebyshev's inequality.

However, the basic assumption is that we can provide estimates for the 
mean and variance of a packet's round trip time.
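TCP itself uses RTO = SRTT + 4*RTTVAR rather than a literal Chebyshev
bound, but the Chebyshev-style reasoning can be made concrete. A sketch,
assuming the latency distribution really does have finite mean and
variance: since P(|X - mean| >= k*std) <= 1/k^2, choosing k = 1/sqrt(p)
bounds the probability of a premature timeout by p:

```python
import math

def rto_from_moments(mean, std, p_spurious=0.01):
    """RTO = mean + k*std, with k chosen so that Chebyshev's inequality
    bounds the probability of a premature timeout by p_spurious:
    P(|X - mean| >= k*std) <= 1/k**2, hence k = 1/sqrt(p_spurious)."""
    k = 1.0 / math.sqrt(p_spurious)
    return mean + k * std

# With mean = 300 ms, std = 50 ms and a 1% budget, k = 10,
# giving RTO = 0.300 + 10 * 0.050 = 0.800 s.
rto = rto_from_moments(0.300, 0.050, 0.01)
```

The whole construction collapses, of course, if the variance does not
exist, which is exactly my worry about mobile wireless links below.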

> There is no random process at all on the link itself (at least in the 
> common case - there are links where the link itself has a random delay, 
> but that usually arises where the link's physical characteristics vary 
> faster and larger than the queue management and link pacing 

My assumptions on the link l are deliberately tough. I totally agree 
with Craig here: in general, we know nothing about l. It may be a 
tunnel, it may be a mobile wireless link. For mobile wireless links, I 
do not even know whether a finite variance of the link's latency 
distribution exists.

> mechanisms).  The random process is the network environment that 
> provides competing packets.  So the latency is everywhere but the link 
> itself.

> The other issue is that prediction is more reliable over a collection of 
> packets, but a sufficient collection cannot happen in an instant.

I did not make any assumptions here; in particular, I did not assume that 
the estimation should be based upon the observation of a single packet. 
Perhaps my formulation was somewhat misleading here.

We could use test packets sent from n1 to n2, or observe traffic from 
several flows. If n1 and n2 are routers and we can observe a large 
number of flows, the job should be much easier than if done at a TCP 
flow's source, which has to rely on a _very_ rough sample.

My favourite example is always a TCP path including a 2400 bps link. 
(Nowadays forgotten; two years ago known from GSM - and we all know it 
from the good old modem times.) Depending on the MSS, the sender gets a 
sample every second or so. For the wired systems in between, one second 
is _ages_. In one of my posts, I claimed (please correct me if I'm 
wrong) that in contemporary networks even link bandwidths cover a range 
of eight orders of magnitude. While our single packet crawls along a GSM 
link, a Tier 1 backbone link may convey the whole Encyclopedia 
Britannica within the same period of time.
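The "sample every second or so" figure is simple arithmetic: it is the
serialization delay of a single segment at 2400 bit/s. A back-of-envelope
check (the two MSS values are chosen purely for illustration):

```python
# Serialization delay of one segment on a 2400 bit/s link, i.e. roughly
# the best-case spacing of RTT samples a sender can obtain across it.

LINK_BPS = 2400

def serialization_delay(mss_bytes, bps=LINK_BPS):
    """Seconds needed to clock one segment of mss_bytes onto the link."""
    return mss_bytes * 8 / bps

t_small = serialization_delay(256)   # 2048 / 2400  ~ 0.85 s
t_large = serialization_delay(536)   # 4288 / 2400  ~ 1.79 s
```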

However, in a scenario

Sender----Tier1/Enc.Brit. Link--------router----GSM---receiver

the sender estimates mean and variance of the round trip time using EWMA 
filters and the extremely rough time series gained from the ACK packets.

Question: Is there a justification for doing so?

I looked at Edge's paper, especially at the assumptions on the 
observation variables, i.e. the time series Tn (Tn: stochastic 
variables, tn: instances).

One sufficient assumption is that all Tn share the same mean and variance.

Some "drift" is accepted, as are "occasional steps" (put in my own 
words).

When I look at my "Britannica example" and consider "sane mean and 
variance", I do not feel comfortable with these assumptions.

> The first order predictor is the queue size at the entry to the link.   
> That's a very reliable predictor of latency for the next event.   But it 
> provides very little input about variance (which depends entirely on 
> packets arriving from elsewhere at "light speed").
> I think there might be a much better (i.e. less complex to state) 
> approach in NOT trying to start with the link and go by induction to the 
> multilink case.   Instead, perhaps start with an end-to-end flow (over a 
> path) and reason about what happens as you add flows that superpose 
> themselves on the existing paths.

Is this really that much more promising? Admittedly, this is a 
rhetorical question. Basically, this is already being done. So I 
sharpened it a little bit by omitting n-1 links in the n link case ;-)

Or is it a matter of how coarse or fine grained we look at the problem?

I'm thinking about this problem - and at the same time, I use TCP and 
everything seems to be just fine :-) Then I read Raj Jain's paper about 
the divergence of RTO estimators and Lixia Zhang's paper on TCP timers. 
And I understand that these papers already addressed a number of issues, 
perhaps nearly all of them. But one issue which I do not yet understand 
is the use of EWMA filters.

- Do they hold for arbitrary TCP connections? Can we reasonably assume 
the necessary conditions given by Edge? Or alternative ones?

- Do they converge fast enough in case of a sudden step in latency? Do 
they follow drifting latencies? How must we set the gain? I sometimes 
hear something about "agility" and "stability". Basically, we should 
minimize the forecast error by a proper choice of the gain. Can we use 
the same gain for all flows?

- Is the temporal resolution of an ACK clocked TCP flow sufficient to 
provide reasonable estimates? Or is the resolution of the time series 
obtained from it too coarse? (Nilsson, Martin and Rhee claim so in their 
June 2003 paper on the correlation between latency change and 
congestion. One central point there was that the temporal resolution of 
observed round trip times is in most cases by far too coarse to derive 
reasonable conclusions concerning path properties.)
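The gain question in the second point can at least be illustrated
numerically. A toy step response, assuming noise-free and evenly spaced
samples (which real RTT measurements are not): count how many samples an
EWMA needs to come within 10% of a new latency level, for an aggressive
gain versus TCP's SRTT gain of 1/8:

```python
# Toy EWMA step response: latency jumps from 0.1 s to 1.0 s, and we count
# the samples until the estimate is within 10% of the new level.
# Illustrative only; it says nothing about stability under noise.

def samples_to_settle(gain, old=0.1, new=1.0, tol=0.1):
    est, n = old, 0
    while abs(est - new) > tol * new:
        est += gain * (new - est)   # one EWMA update
        n += 1
    return n

fast = samples_to_settle(1 / 2)   # aggressive gain: settles in 4 samples
slow = samples_to_settle(1 / 8)   # TCP's SRTT gain: settles in 17 samples
```

With the GSM numbers above (one sample roughly every second), the 1/8
gain means the estimator tracks a latency step only after something like
a quarter of a minute - which is the "agility versus stability" trade-off
in one sentence.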

I get no "feeling" for this situation. I see lots of scenarios and 
individual papers, but I don't see the big picture yet.

Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937
