[e2e] Are Packet Trains / Packet Bursts a Problem in TCP?
fred at cisco.com
Thu Sep 28 08:43:43 PDT 2006
That is of course mostly true. I'll quibble on Little's result: it
says (in the immortal words of the wikipedia) "The average number of
customers in a stable system (over some time interval) is equal to
their average arrival rate, multiplied by their average time in the
system." What you're referring to in this context is the set of
formulae related to delay, variance, etc, in non-deterministic
statistical systems, which are each inversely proportional to some
variation on (one minus the utilization), which approaches a limit of
zero. But your observation regarding those equations tending to
infinity at full utilization is correct.
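To put rough numbers on both statements, here is a minimal sketch in Python. It assumes an M/M/1 queue (Poisson arrivals, exponential service, one server), which is my choice of model for illustration and not something claimed in this thread; the function name and the 1000 packets/second service rate are hypothetical. It computes the mean time in system, which carries the 1/(1 - utilization) factor, and then applies Little's result to get the mean number in the system.

```python
def mm1_stats(arrival_rate, service_rate):
    """Mean delay and occupancy for an M/M/1 queue (an assumed model)."""
    if not 0 <= arrival_rate < service_rate:
        raise ValueError("stable only when arrival_rate < service_rate")
    utilization = arrival_rate / service_rate
    # Mean time in system: W = 1 / (mu - lambda) = (1 / mu) / (1 - rho)
    mean_delay = 1.0 / (service_rate - arrival_rate)
    # Little's result: mean number in system = arrival rate * mean time in system
    mean_in_system = arrival_rate * mean_delay
    return utilization, mean_delay, mean_in_system


if __name__ == "__main__":
    service_rate = 1000.0  # packets per second, a hypothetical link
    for rho in (0.5, 0.9, 0.99, 0.999):
        u, w, n = mm1_stats(rho * service_rate, service_rate)
        print(f"rho={u:.3f}  delay={w * 1000:.1f} ms  in system={n:.1f}")
```

Little's result itself holds at every utilization; it is the delay term that blows up as the utilization approaches one.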
The point is that while a network operator values headroom in the
average case, as you aptly note, the network user values response
time. Now, if he says he wants to look at a ten megabyte file (think
YouTube) and values response time, his proxy (TCP) needs to maximize
throughput in order to minimize elapsed time. Hence, TCP values
throughput, and it does so because its user values response time.
Take a good look at the various congestion management algorithms used
in TCPs over the years, and what they have all tried to do is
maximize throughput, detect the point where utilization at the
bottleneck approaches 100% (whether by measuring loss resulting from
going over the top or by measuring the increase in delay that you
point us to), and then back off a little bit. TCP seeks to maximize
throughput.
Something else that is very important in this context is the matter
of time scale. When a network operator measures capacity used, he
most commonly points to an MRTG trace. MRTG samples SNMP counters
every 300 seconds and plots the deltas between their values. You no
doubt recall T-shirts from a decade ago or more that said something
about "same day service in a nanosecond world". MRTG traces, as
useful as they are in monitoring trends, are not very useful in
telling us about individual TCPs. The comparison is a little like
putting a bump counter in Times Square, reading it at a random time
of day once a month, and making deep remarks about the behavior of
traffic at rush hour. They don't give you that data.
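The time-scale problem is easy to see with a little arithmetic. The sketch below is illustrative only: the 100 Mb/s link, the 3-second burst, and both function names are assumptions of mine, and counter wrap is ignored. It compares the utilization computed from two counter readings 300 seconds apart with the busiest single second inside that interval.

```python
def average_utilization(octets_start, octets_end, interval_s, link_bps):
    """Utilization implied by the delta between two octet-counter readings."""
    bits = (octets_end - octets_start) * 8
    return bits / (interval_s * link_bps)


def burst_trace(link_bps, burst_seconds, interval_s):
    """Octets sent in each second: line rate during the burst, idle after."""
    octets_per_second = link_bps // 8
    return [octets_per_second if t < burst_seconds else 0
            for t in range(interval_s)]


if __name__ == "__main__":
    link_bps = 100_000_000  # hypothetical 100 Mb/s link
    trace = burst_trace(link_bps, burst_seconds=3, interval_s=300)
    avg = average_utilization(0, sum(trace), 300, link_bps)
    peak = max(trace) * 8 / link_bps
    print(f"300 s average: {avg:.1%}, busiest second: {peak:.1%}")
```

In this made-up trace the link is saturated for three seconds, yet the five-minute average is one percent, which is all the graph would show.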
I'd encourage, as an alternative,
http://www.ieee-infocom.org/2004/Papers/37_4.PDF. This looks at real
traffic in the Sprint network a
few years ago and analyzes traffic behavior. It makes three important
observations:
- with 90% confidence, POP-POP variation in delay within the
  network is less than one ms.
- also with 90% confidence, POP-POP delay variation spikes to ~10 ms.
- on occasion (six times in the study), POP-POP delay varies by as
  much as 100 ms, and in so doing follows a pattern whose
  characteristics are consistent with the intersection of high rate
  data streams, not measurement error or router misbehavior as some
  have suggested.
Note that these are POP-POP, not CPE-CPE; the study says nothing
about access networks or access links. It talks about the part of the
network that Sprint engineered.
They don't show what the 10 ms spikes look like, but I will
conjecture that they are smaller versions of the six samples they did
display, and similarly suggest the momentary intersection of high
rate data streams.
What this tells me, coupled with my knowledge of routers and various
applications, is that even within a large and well engineered ISP
network (Sprint's is among the best in the world, IMHO) there are
times when total throughput on a link hits 100% and delay ramps up.
If they are running delay-sensitive or loss-sensitive services, it
will be wise on their part to put in simple queuing mechanisms such
as those suggested in draft-ietf-tsvwg-diffserv-class-aggr-00.txt to
stabilize the service during such events. The overall effect will be
negligible, but it will materially help in maintaining the stability
of routing and of real time services. Simply adding bandwidth helps
immeasurably, but measurement tells me that it is not a final solution.
On Sep 27, 2006, at 5:45 PM, David P. Reed wrote:
> The point of Little's Lemma is that the tradeoff for using the full
> bottleneck bandwidth is asymptotically infinite delay, delay
> variance, and other statistics.
> If you value utilization but not response time, of course you can
> fill the pipes.
> But end users value response time almost always much higher than
> twice the throughput.
> And of course you can give delay-free service to a small percentage
> of traffic by prioritization, but all that does is make the
> asymptotic growth of delay and delay variance for the rest of the
> traffic even worse.
> Fred Baker wrote:
>> On Sep 27, 2006, at 3:34 PM, Detlef Bosau wrote:
>>> Wouldn't this suggest (and I had a short glance at Fred's answer
>>> and perhaps he might contradict here) that we intentionally drop the
>>> goal of achieving a full load in favour of a "load dependent ECN"
>>> mechanism, i.e. when the load of a link exceeds a certain limit,
>>> say 50 % of the bandwidth, any passing packets are stamped with
>>> a forward congestion notification. Thus, we would keep the
>>> throughput on a limit we cannot exceed anyway, but limit the
>>> incoming traffic in such a way that queues can fulfill their
>>> purpose, i.e. interleave the flows and buffer out asynchronous
>> I certainly encouraged Sally et al to publish RFC 3168, and yes I
>> would agree that something other than a loss-triggered approach
>> has real value, especially at STM-n speeds where the difference
>> between "nominal delay due to queuing" and "loss" is pretty sharp.
>> I don't think I would pick "50%"; it would be at a higher rate.
>> But that actually says about the same thing. Change the mechanism
>> for detecting when the application isn't going to get a whole lot
>> more bandwidth even if it jumps the window up by an order of
>> magnitude, but allow it to maximize throughput and minimize loss
>> in a way that is responsive to signals from the network.
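
The load-dependent marking idea in the quoted exchange can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the smoothing of the load estimate with an exponentially weighted moving average, and the default threshold of 0.8 are all hypothetical choices of mine (the thread mentions 50% and "a higher rate" without fixing a number), and nothing here touches real ECN header bits.

```python
class LoadMarker:
    """Decide whether passing packets should carry a congestion mark."""

    def __init__(self, link_bps, threshold=0.8, alpha=0.2):
        if not 0.0 < threshold <= 1.0:
            raise ValueError("threshold must be in (0, 1]")
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.link_bps = link_bps
        self.threshold = threshold  # fraction of link capacity (assumed value)
        self.alpha = alpha          # weight given to the newest sample
        self.load = 0.0             # smoothed utilization estimate

    def update(self, bits_sent, interval_s):
        """Fold one measurement interval into the smoothed load estimate."""
        sample = bits_sent / (self.link_bps * interval_s)
        self.load = (1.0 - self.alpha) * self.load + self.alpha * sample
        return self.load

    def should_mark(self):
        """True when the smoothed load exceeds the configured threshold."""
        return self.load > self.threshold


if __name__ == "__main__":
    marker = LoadMarker(link_bps=1_000_000, threshold=0.8)
    for bits in (300_000, 900_000, 950_000, 990_000, 990_000, 990_000):
        marker.update(bits, interval_s=1.0)
        print(f"load={marker.load:.2f} mark={marker.should_mark()}")
```

The measurement interval matters here for the same reason it does with MRTG: a long interval or a small alpha smooths away exactly the bursts the marking is meant to react to.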