[e2e] simulation using self-similar traffic

Wed Apr 4 08:07:40 PDT 2001

On Wed, 4 Apr 2001, Soo-hyeong Lee wrote:

> 
> > The heavy-tailed random variable Crovella and Lipsky were talking about
> > was for the Pareto-distributed ON period or file size you are interested in.
> > This is also what I meant by heavy-tailed file sizes (assuming you're
> > using a SURGE-like model to generate traffic already...)
> 
> The difference is that Crovella and Lipsky tried to measure the average of a known random variable,
> while I am trying to measure the average of a random variable which is a function of known random variables.
> Suppose file size is Pareto-distributed; then Crovella and Lipsky tried to measure average file size, while I am trying to measure average throughput when transferring those files.
> 
> > Assuming these are
> > TCP files (which is the case) and if you're interested in measuring
> > 'average TCP throughput', I expect one would need to generate as many
> > samples as necessary to make sure at least the average file size
> > converges. Otherwise, I couldn't see why would average throughput
> > converge.
> 
> You seem to argue that convergence of average file size to its ensemble mean is a necessary condition for convergence of throughput. You don't seem to argue that it's a sufficient condition. What I want is a sufficient condition for throughput convergence.
> 
> However, I am not quite sure if it is a necessary condition.
> The reason that a running average of an iid Pareto sequence doesn't
> converge quickly would be that the law of large numbers doesn't hold here
> because of the infinite variance of the Pareto variable.
> However, throughput is clearly upper bounded by link bandwidth and
> doesn't show infinite variance. In fact, throughput may be calculated as
> (sum of file sizes successfully transferred) / (time taken). Even if
> file size doesn't converge, its ratio over transfer duration may still
> converge.

Soo-hyeong,

I'd like to point out that just because a process is bounded doesn't
necessarily mean it will converge.  And you might be right that it might
not be a necessary condition either.  Again, without formal analysis,
I'm using the method as a heuristic and only claiming that it seems to
make sense 'intuitively'.  Let me borrow your formula:

(sum of file sizes successfully transmitted) / (time taken)

Suppose we complete all file transfers in a simulation. Since the mean
is just the sum of file sizes divided by the number of files, if the
mean doesn't converge, the sum won't converge.  If the sum doesn't
converge, I'm not sure the overall link throughput will converge
(actually, the 'throughput' I meant in earlier emails is individual TCP
connection throughput, not the overall link throughput).
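
To make the "does the mean converge" check concrete, here is a minimal
sketch (not part of the original exchange) that tracks the running
average of iid Pareto file sizes for a few seeds.  The shape parameter
1.2, the unit scale, the sample counts, and the helper name
running_means are all assumptions chosen for illustration only.

```python
import random

def running_means(alpha, n, seed):
    """Running average of n iid Pareto(alpha) samples (scale 1)."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for i in range(1, n + 1):
        total += rng.paretovariate(alpha)  # heavy-tailed "file size"
        means.append(total / i)
    return means

ALPHA = 1.2                      # assumed shape; variance is infinite for alpha <= 2
TRUE_MEAN = ALPHA / (ALPHA - 1)  # ensemble mean of a unit-scale Pareto

for seed in (1, 2, 3):
    means = running_means(ALPHA, 100_000, seed)
    checkpoints = [round(means[k - 1], 2) for k in (100, 1_000, 10_000, 100_000)]
    print(f"seed {seed}: running means {checkpoints}, ensemble mean {TRUE_MEAN:.2f}")
```

Comparing the checkpoints across seeds gives a rough feel for how many
samples are needed before the average file size settles.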

Perhaps the term 'time taken' will not converge either and will somehow
cancel out the degree of oscillation in 'sum of file sizes successfully
transmitted'.  But it's not clear at the moment.  (Hmm... this could
rather be a sufficient condition after all.)
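
One way to probe whether the numerator and denominator cancel is a toy
calculation like the sketch below (not part of the original exchange).
It assumes files are sent back to back at a fixed link capacity with a
fixed idle gap per file; there are no TCP dynamics, so it only speaks
to aggregate link throughput, not per-connection throughput.  The
function name link_throughput and the capacity, overhead, and shape
values are hypothetical.

```python
import random

def link_throughput(alpha, n, capacity, overhead, seed):
    """(sum of file sizes sent) / (time taken) under a toy link model."""
    rng = random.Random(seed)
    sent = 0.0
    elapsed = 0.0
    for _ in range(n):
        size = rng.paretovariate(alpha)       # heavy-tailed file size
        sent += size
        elapsed += size / capacity + overhead  # transfer time plus idle gap
    return sent / elapsed

CAPACITY = 10.0  # assumed link bandwidth, size units per second
OVERHEAD = 0.5   # assumed per-file idle time, seconds

for seed in (1, 2, 3):
    values = [link_throughput(1.2, n, CAPACITY, OVERHEAD, seed)
              for n in (100, 1_000, 10_000, 100_000)]
    print(f"seed {seed}:", [round(v, 3) for v in values])
```

By construction the ratio stays below the capacity, so the interesting
part is how much it still moves between checkpoints and between seeds.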

Nevertheless, I agree the average link throughput is bounded by the
bandwidth.  And the discussion is getting a bit too mathematical.
Perhaps we should take it offline.

Cheers,
-Polly