[e2e] Is a control theoretic approach sound?

Steven Low slow at caltech.edu
Tue Jul 29 17:23:23 PDT 2003


Hi Christian

Thanks for your insightful analysis.
Tom Kelly's STCP indeed has TimeInterval = O(1),
though in your notation the frequency of corrections
is inversely proportional to the flow's RTT (for
STCP: cwnd * loss_probability = constant, and hence
throughput * loss_probability = constant/RTT).
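
(A quick back-of-the-envelope sketch, in Python, of what this means
numerically.  The RTT, packet size and STCP constant below are purely
illustrative - they are not taken from either message.)

# Plug the Reno and STCP response functions into Christian's
# TimeInterval derivation and compare how the time between
# corrections scales with throughput.  All constants illustrative.

RTT = 0.1                  # seconds
PACKET_SIZE = 1500 * 8     # bits per packet

def time_between_losses(loss_rate, throughput_bps):
    # TimeInterval = PacketInterval * PacketSize / Throughput,
    # with PacketInterval = 1/LossRate packets between losses.
    return (1.0 / loss_rate) * PACKET_SIZE / throughput_bps

def reno_loss_rate(throughput_bps):
    # Reno: Throughput ~ (PacketSize/RTT) * sqrt(1.5/LossRate),
    # i.e. LossRate = O(1/Throughput^2).
    return 1.5 * (PACKET_SIZE / (RTT * throughput_bps)) ** 2

def stcp_loss_rate(throughput_bps, c=0.01):
    # STCP: cwnd * LossRate = c, with cwnd = Throughput*RTT/PacketSize,
    # i.e. LossRate = O(1/Throughput).
    return c * PACKET_SIZE / (RTT * throughput_bps)

for mbps in (10, 100, 1000, 10000):
    bps = mbps * 1e6
    t_reno = time_between_losses(reno_loss_rate(bps), bps)
    t_stcp = time_between_losses(stcp_loss_rate(bps), bps)
    print(f"{mbps:6d} Mbps: Reno corrects every {t_reno:9.1f} s, "
          f"STCP every {t_stcp:5.1f} s")

With these illustrative numbers, Reno's time between corrections grows
from a few seconds at 10 Mbps to over an hour at 10 Gbps, while STCP's
stays fixed at RTT/constant.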

To your last question: as Martin, Nilsson and Rhee
point out in their ToN paper, delay can be a poor or
untimely predictor of packet loss.  This does *not*
mean that it is futile to use delay as a measure of
congestion; rather, it means that using a delay-based
algorithm to predict loss, in the hope of helping a
loss-based algorithm adjust its window, is the wrong
way to address the problems at large windows.
Instead, what is needed is a different approach that
fully exploits delay as a congestion measure,
augmented with loss information.

Roughly, delay as a congestion measure has two
advantages.  First, each loss observation (whether or
not a packet is lost) provides only one bit of
information about congestion with which to filter the
noise in the loss-probability estimate, whereas each
RTT measurement provides multi-bit information.
Hence queueing delay can probably be estimated more
accurately than loss probability, especially in
networks with a large bandwidth-delay product.
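
(To make the one-bit vs. multi-bit point concrete, a rough Python
illustration; the loss probability, mean delay and noise figures are
made up for illustration only.)

import math

p = 1e-6        # loss probability on a large-BDP path (illustrative)
q = 0.020       # mean queueing delay in seconds (illustrative)
sigma = 0.005   # std. dev. of per-sample delay noise, seconds (illustrative)

for n in (1_000, 100_000, 10_000_000):
    # standard error of the sample mean, relative to the quantity estimated
    rel_err_loss = math.sqrt(p * (1 - p) / n) / p
    rel_err_delay = (sigma / math.sqrt(n)) / q
    print(f"n={n:>10,}: loss prob. +/- {100*rel_err_loss:7.1f}%, "
          f"queueing delay +/- {100*rel_err_delay:6.3f}%")

With these made-up numbers, even ten million packets leave the
loss-probability estimate with roughly 30% relative error, while a
thousand RTT samples pin the queueing delay to well under 1%.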

Second, delay has the right scaling with respect to
network capacity, which helps stabilize congestion
control as the network scales up in capacity.  This
scaling property is also, in a sense, what underlies
Matt Mathis's idea of large MTUs.
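
(A crude numerical illustration of the scaling point, again with my
own made-up numbers; it only shows how the two equilibrium signals
scale with capacity, and is not a stability argument.)

RTT = 0.1             # seconds (illustrative)
PACKET = 1500 * 8     # bits per packet (illustrative)
BACKLOG = 20          # packets a Vegas/FAST-style flow keeps queued (illustrative)

for gbps in (0.1, 1, 10, 100):
    cap = gbps * 1e9
    window = cap * RTT / PACKET        # packets in flight at full utilization
    reno_p = 1.5 / window ** 2         # loss prob. Reno needs at that window
    qdelay = BACKLOG * PACKET / cap    # queueing delay from the fixed backlog
    print(f"{gbps:5.1f} Gbps: window {window:9.0f} pkts, "
          f"Reno loss prob {reno_p:.1e}, queueing delay {qdelay*1e3:.4f} ms")

The loss probability a Reno flow must operate at collapses
quadratically with capacity, while the delay signal from a fixed
backlog shrinks only linearly (and can be kept larger by targeting a
bigger backlog).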

Delay-based congestion control has been proposed
since the 1980s by Raj Jain and many others.  We
believe its advantage over the loss-based approach is
small at low speed, but decisive at high speed.

This is not to say that there are no challenges in
implementing a delay-based approach - there are many.
One of them is that the bottleneck (e.g., at a
switch) may happen to have a very small buffer (say,
< 10 ms).  In that case it may be hard to estimate
queueing delay accurately and to stabilize the
control based solely on delay; losses may be
unavoidable, and therefore a good loss recovery
mechanism is essential at large windows; etc.  The
fun continues.

Steven


Christian Huitema wrote:

>>This is a very interesting example:)  Let me follow this example:
>>
>>1) if the cart is very small and light, and if every mouse pulls very
>>hard and brakes very hard to chase the target speed, then the cart
>>goes as what you described.
>>2) if the cart is very heavy, and every mouse pulls slightly, then it
>>takes a long time for the cart to gain the target speed, and the
>>system may oscillate around the target speed slightly.
>>3) If the mice are a little bit cleverer: they consider the speed of
>>the cart. If the speed is far from the target speed, they pull very
>>hard or brake very hard; if the speed of the cart is close to the
>>target speed, they pull slightly or brake slightly, or even stop
>>pulling. Then there is a hope to make the system stays around the
>>target speed? :) Actually, even 50 elephants come to pull the cart,
>>they have to follow the same rule: pull or brake slightly when the
>>speed is close to the target.
>>
> 
> There is another dimension that we ought to consider, which is the
> frequency of corrections. The traditional AIMD congestion control has a
> nasty side effect: the frequency of correction decreases when the
> average transmission speed increases. This is linked to the relation:
> 	Throughput = O(1/SQRT(LossRate))
> Which can also be written
> 	1/LossRate = O(Throughput^2)
> The average number of packets between losses is inversely proportional
> to the loss rate:
> 	PacketInterval = O(1/LossRate)
> The time interval between losses is inversely proportional to the
> throughput:
> 	TimeInterval = PacketInterval*PacketSize/Throughput
> We can express that as a function of the throughput:
> 	TimeInterval = O((1/LossRate)*(1/Throughput))
> 	TimeInterval = O((Throughput^2)*(1/Throughput))
> 	TimeInterval = O(Throughput)
> In short, the faster you go, the less often you correct the situation.
> From a control theory point of view, that could only be sustainable if
> high speed networks were somehow much more stable than low speed
> networks, for example if the rate of arrival and departure of new
> connections somehow diminished as the connection speed increased.
> There is no evidence that this is the case.
> 
> So, if we pursue that particular avenue of control theory, we have two
> possible solutions. One possibility is to change the way TCP reacts to
> packet losses, so that for high speed connections we end up with:
> 	Throughput = O(1/LossRate)
> 	TimeInterval = O(1)
> Or we can try to obtain another source of control information than
> packet losses, so we can apply more frequent corrections. Vegas and FAST
> seem to explore this latter avenue, by using the packet transmission
> delays as a source of information about congestion. On one hand,
> that is seductive, since we could get some limited amount of information
> in every packet. On the other hand, if we believe a paper in the last
> Transactions on Networking, there is only a weak correlation between
> delays and congestion, and I wonder how the designers of FAST deal with
> that.
> 
> -- Christian Huitema
> 
> 
> 


-- 
_____________________________________________
Steven Low                assoc prof, cs & ee
netlab.caltech.edu        tel: (626) 395-6767
slow at caltech.edu          fax: (626) 568-3603




