Paddy Ganti
pganti at gmail.com
Thu Jul 26 14:43:56 PDT 2007
I am thinking of an approach to analytically determine download time as
a function of RTT, given a few initial real-world samples. Say I measured a
web page from 4 locations around the globe. Knowing this sample, what can I
infer about the population of download times as a function of RTT?
If I assume that download time (dt) can be expressed as follows:
dt = n * RTT + c
where n is the number of round trips (RTT ping-pongs; each includes one burst
of data, which can be multiple packets) and c is the server stall time
between sending the data, or server processing time, plus some random noise,
all factored into one constant.
The above equation is of the form y = mx + c, so I can equate the slope m
with the number of round trips n (which makes sense, as the fewer the round
trips, the lower the response time), while x is the RTT.
So if I take enough samples, say 10, and perform a regression analysis on
those to generate the equation, wouldn't that characterize the population? If I
have such an equation, I could plug in various RTTs and, assuming the
R-squared value is high, wouldn't the result be representative of real performance?
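
To make the idea concrete, here is a minimal sketch of the fit, assuming
ordinary least squares on (RTT, dt) pairs. The sample numbers are made up for
illustration and are not my real measurements; the names fit_line and predict
are just placeholders I chose for this sketch.

```python
# Hypothetical samples: (RTT in ms, download time in ms).
# Made-up numbers for illustration only, not real measurements.
samples = [(20, 410), (50, 690), (80, 1010), (120, 1390), (200, 2210)]


def fit_line(points):
    """Least-squares fit of dt = n * rtt + c; returns (n, c, r_squared)."""
    count = len(points)
    mean_x = sum(x for x, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    n = sxy / sxx             # slope: estimated number of round trips
    c = mean_y - n * mean_x   # intercept: server time plus noise
    # R-squared: share of the variance in dt explained by the line.
    ss_res = sum((y - (n * x + c)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    r_squared = 1 - ss_res / ss_tot
    return n, c, r_squared


def predict(rtt, n, c):
    """Plug an RTT into the fitted equation dt = n * rtt + c."""
    return n * rtt + c


n, c, r2 = fit_line(samples)
print(f"round trips n = {n:.2f}, constant c = {c:.1f} ms, R^2 = {r2:.4f}")
print(f"predicted dt at RTT = 150 ms: {predict(150, n, c):.0f} ms")
```

The slope is read as the number of round trips and the intercept as the
constant c, so both are worth checking for plausibility (for example, a
negative c) before trusting any prediction from the fitted line.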
A few initial measurements showed encouraging results, but a few measurements
didn't converge and a few had negative values, etc.
Before I go further and present this to an internal audience I want to poll
this group for any feedback/remarks/comments on using this method and its
pitfalls.
-Paddy Ganti
