[e2e] Agility of RTO Estimates, stability, vulnerabilities

sireen malik s.malik at tuhh.de
Sun Jul 24 08:12:28 PDT 2005


Hi,

My $2/100.

Self-Similarity (SS) on its own is not a problem; consider Brownian 
motion. However, the property that the sum of correlations is infinite 
(Long Range Dependence) can make performance deviate very significantly 
from classical queuing behavior. Note: when people talk about 
Self-Similarity in Internet traffic, they are in fact talking about 
"Second Order Asymptotic Self-Similarity".

The central limit theorem is applicable under the condition of IID 
variables and, more importantly, the "finite variance" assumption. One 
way of modeling Internet traffic is fractional Brownian motion 
(fractional Gaussian). It has infinite variance and non-summable long 
range correlations. So theoretically, the CLT does not apply to Internet 
traffic. In fact, mathematicians talk about the Generalized CLT, i.e., 
convergence to an alpha-stable distribution.

So TCP's RTO estimator looks to be in trouble under the CLT assumption.
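The difference between the two regimes is easy to see numerically. This 
is only an illustrative sketch (the distributions and parameters are my 
choice, not a traffic model): a finite-variance sample's empirical 
standard deviation settles down, while an infinite-variance Pareto 
sample's does not.

```python
import random

random.seed(42)

def pareto(alpha):
    """Pareto(alpha) variate with x_min = 1, by inverse-CDF sampling."""
    u = 1.0 - random.random()          # uniform in (0, 1]
    return u ** (-1.0 / alpha)

def sample_std(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

n = 100_000

# Gaussian: finite variance, so the empirical std settles near the true
# value of 1.0 -- the classical CLT regime.
gauss_std = sample_std([random.gauss(0.0, 1.0) for _ in range(n)])

# Pareto with alpha = 1.5: finite mean (alpha > 1) but infinite variance
# (alpha < 2). The empirical std never settles; it keeps growing with n,
# dominated by the few largest samples -- the Generalized CLT regime.
pareto_std = sample_std([pareto(1.5) for _ in range(n)])

print("gaussian sample std:", round(gauss_std, 3))
print("pareto sample std:  ", round(pareto_std, 3))
```

Any estimator that averages such samples inherits this behavior, which 
is exactly the worry about the RTO estimator above.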

However, that is theory.

In my opinion, the following are some significant practical considerations:

* Buffers are finite, so buffering delays are bounded. So E2E delays and 
the resultant RTTs will never be infinite.
* If we assume that packets are of fixed size, then a fairly utilized 
queuing system will start putting deterministic/constant gaps into the 
departure process. Recall Prof. Paul Kuhn's (Univ. Stuttgart) formula 
for the G/G/1 system under decomposition: it predicts a perfectly 
deterministic departure process for a 100% utilized queue. Theoretical, 
but it does make sense!
* Buffers operate at small time scales, while SS is a large-time-scale 
asymptotic property... so is it relevant to the question at all?
* The heavy-tailed file-size distribution, linked with the LRD in 
Internet traffic, is truncated in practice. There is no such thing as a 
"file of infinite size". If so, we will have a heavy-tailed 
distribution with "finite" variance. So correlations should only be 
visible over some orders of magnitude of time scales, and not at "all" 
time scales. Therefore the process should not be LRD in the pure 
sense, but rather Short Range Dependent.
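The truncation argument can be made concrete. For a Pareto tail with 
alpha = 1.5 the untruncated variance is infinite, but cutting the 
support at any finite maximum file size yields a finite variance. A 
sketch using the standard closed-form moments (the parameter choices 
are illustrative only):

```python
def truncated_pareto_var(alpha, x_max):
    """Variance of a Pareto(alpha) density truncated to [1, x_max],
    in closed form (valid for alpha not in {1, 2})."""
    c = alpha / (1.0 - x_max ** -alpha)            # normalizing constant
    def moment(k):                                 # E[X^k] on [1, x_max]
        return c * (x_max ** (k - alpha) - 1.0) / (k - alpha)
    m1, m2 = moment(1), moment(2)
    return m2 - m1 * m1

# alpha = 1.5: the untruncated variance is infinite, but every finite
# cutoff ("no file of infinite size") gives a finite variance, growing
# roughly like sqrt(x_max) rather than diverging for any fixed cutoff.
for x_max in (1e3, 1e6, 1e9):
    print("x_max = %g: var = %.1f" % (x_max, truncated_pareto_var(1.5, x_max)))
```

The variance grows with the cutoff, but for any real-world maximum file 
size it is finite, which is all the CLT argument below needs.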

So my feeling is that the E2E delay and the resultant RTT distribution 
have finite mean and finite variance. Therefore, the assumption of the 
CLT is not a bad one for practical applications.
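For reference, the estimator whose assumptions are being discussed is 
Van Jacobson's, as standardized in RFC 2988. A minimal sketch (Karn's 
rule and clock granularity are deliberately omitted here):

```python
class RtoEstimator:
    """Minimal sketch of TCP's RTO computation (Jacobson's algorithm as
    standardized in RFC 2988): EWMAs of the RTT and of its deviation,
    RTO = SRTT + 4 * RTTVAR, floored at 1 second."""

    def __init__(self):
        self.srtt = None
        self.rttvar = None
        self.rto = 3.0              # initial RTO per RFC 2988, seconds

    def sample(self, r):
        """Fold one RTT measurement r (seconds) into the estimate."""
        if self.srtt is None:       # first RTT measurement
            self.srtt = r
            self.rttvar = r / 2.0
        else:                       # beta = 1/4, alpha = 1/8
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - r)
            self.srtt = 0.875 * self.srtt + 0.125 * r
        self.rto = max(1.0, self.srtt + 4.0 * self.rttvar)
        return self.rto
```

Note that it estimates a mean and a mean deviation, i.e., it implicitly 
relies on both being finite, which is exactly the point argued above.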


regards,
Sireen Malik- Hamburg



> I don't want to discuss this here, because this is basically not a 
> TCP-related question. It's the question whether it is at least 
> _possible_ to estimate an RTT. And this is a requirement for the 
> network itself and its structure. To my knowledge, there are quite a 
> few papers around dealing with "self similarity". I'm not quite sure, 
> but if e2e latencies were in fact "self similar" (I use "" here 
> because the term "self similar" is sometimes used without a 
> satisfactory mathematical definition), we could stop the discussion 
> here. In that case, there would be hardly any chance to have 
> acceptable RTT estimates. (I'm no expert here, but estimators often 
> converge due to the SLLN or similar theorems, and there is an 
> assumption "i.i.d." in them. Identically and _independently_ 
> distributed.
> In a self-similar series of stochastic variables I _strongly_ doubt 
> their independence.)


> So, at least _one_ assumption for a network is inevitable in order to 
> use sender-initiated, timeout-based retransmission: convergent 
> estimators for the timeout must exist.
> Unfortunately, a priori we do not know about possible limitations of 
> the RTT; in particular, there is no general upper limit. So it is 
> somewhat cumbersome to derive a 1-alpha confidence interval directly 
> from the sample here. In fact, it is a common approach in statistics 
> to derive confidence intervals from estimates for the expectation and 
> variance of a stochastic variable. Often there is some implicit 
> assumption about the distribution function of this variable, e.g. 
> Gaussian. So, if we use RTT and VAR (as we do in TCP), we implicitly 
> assume that estimators for RTT and VAR _exist_.
>
> But in principle, these estimators are not defined by TCP, they are 
> _assumed_ by TCP. Basically, we _assume_ the existence of RTT/VAR/RTO 
> estimators here and then we use them. And hopefully, we use 
> appropriate ones for the packet-switched network in use.
>
> So once again, very briefly:
>
> The RTO used in TCP is a confidence interval for RTT.
> TCP _assumes_ (if implicitly) the existence of a reasonable RTO 
> estimator.
>




> Is this correct?
>
> O.k.
>
> Then the next steps are:
> -Identification of a generic estimator, if possible.
> -Identification and elimination of vulnerabilities.
>
>>> developing Karn's algorithm (to deal with retransmission ambiguity) and
>>> improving the RSRE estimation algorithm with Van Jacobson's 
>>> replacement.
>>
>>
>
> O.k. Let's ignore the retransmission ambiguity for the moment.
> (An easy way to overcome this would be to mark each TCP datagram sent 
> with a unique identifier, which is reflected by the corresponding 
> ACK. In particular, if a TCP datagram is sent more than once, it 
> would be given a different identifier each time it is sent. AFAIK 
> this is the rationale behind the "sequence number" in ICMP.)
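[The unique-identifier idea quoted above can be sketched in a few 
lines. The names here are hypothetical; in real TCP the timestamp 
option of RFC 1323 plays a comparable role in disambiguating samples 
for retransmitted segments.]

```python
class RttSampler:
    """Sketch of the unique-identifier idea: every transmission,
    including a retransmission of the same segment, carries a fresh id
    which the ACK echoes back, so an RTT sample can never be attributed
    to the wrong transmission."""

    def __init__(self):
        self._next_id = 0
        self._sent_at = {}          # transmission id -> send timestamp

    def on_send(self, now):
        tid = self._next_id         # fresh id even for a retransmission
        self._next_id += 1
        self._sent_at[tid] = now
        return tid                  # carried in the segment, echoed by the ACK

    def on_ack(self, tid, now):
        return now - self._sent_at.pop(tid)   # unambiguous RTT sample

# A segment sent at t = 0.0 and retransmitted at t = 1.0 gets two
# distinct ids; an ACK echoing the second id yields the correct sample.
s = RttSampler()
first = s.on_send(0.0)
second = s.on_send(1.0)
print(s.on_ack(second, 2.5))        # prints 1.5
```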
>
> Q2: What are other vulnerabilities and implicit assumptions?
>
> -Are there assumptions concerning the latency distribution?
> -Are there assumptions concerning the latency _stability_? What about 
> latency oscillations?
>
> In other words: What is the system model behind the RTT estimators 
> used in TCP?
>
> What are the _requirements_ for TCP to work properly? Can we make 
> implicit assumptions explicit?
>
> Which requirements must be met by a network so that TCP can work 
> without problems?
>
> Is this question stupid? If not: Is there existing work on this issue? 
> If so, I would appreciate any hint.
>
> Detlef Bosau
>
>


More information about the end2end-interest mailing list