[e2e] What's wrong with this picture?
Anil.Agarwal at viasat.com
Tue Sep 8 10:14:16 PDT 2009
Another possible explanation would be that the IAP equipment (or DOCSIS
modem) provides some level of packet differentiation and
queuing/prioritization (e.g., real-time vs. non-real-time), and during
your test, there was a lot of real-time traffic, reducing the amount of
bandwidth available to TCP traffic. Even if queues are "properly" sized
based on access link speed, multiple queues and prioritization can
create havoc with packet delays.
To some extent, we are all to blame, since we (the IETF community) have
not really offered any strong guidelines on how to size router buffers.
There are plenty of papers on the subject. The most we seem to suggest
is that it should be some fraction of the bandwidth-delay product, with
little agreement on the fraction value.
Some critical questions remain unanswered:
1. What does bandwidth-delay product mean for a poor access router,
probably pre-configured at the factory? Is the bandwidth based on its
own access link speed? What delay value should be used? Delay can vary
dynamically across a wide range.
2. How about links with variable speeds (wireless, satellite)?
Should buffer size be computed dynamically?
3. What happens with use of Differentiated Services and queues? The
bandwidth available to a specific queue can dynamically vary across a
wide range. The Best Effort queue probably suffers the most.
4. Should queues be sized based on maximum queuing delay instead of
a fixed/computed amount of buffer space?
5. Similar questions apply to use of RED/ECN; how do we compute
(all) RED parameters for such links and queues?
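
To make questions 1 and 4 concrete, here is a minimal Python sketch of the
arithmetic involved. The function names and all of the numbers (link speed,
RTT, buffer size) are hypothetical, chosen only for illustration; they are
not measurements from this thread.

```python
def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product, in bytes, for a link rate and an assumed RTT."""
    return link_bps * rtt_s / 8


def drain_time_s(buffer_bytes: float, link_bps: float) -> float:
    """Worst-case queuing delay of a full FIFO buffer draining at link_bps."""
    return buffer_bytes * 8 / link_bps


def buffer_for_delay(link_bps: float, max_delay_s: float) -> float:
    """Buffer size, in bytes, that caps queuing delay at max_delay_s."""
    return link_bps * max_delay_s / 8


if __name__ == "__main__":
    uplink_bps = 1_000_000      # assumed 1 Mbit/s upstream rate
    rtt_s = 0.1                 # assumed 100 ms round-trip time
    fixed_buffer = 1_000_000    # assumed factory-configured buffer, in bytes

    print("BDP (bytes):", bdp_bytes(uplink_bps, rtt_s))
    print("Drain time of fixed buffer (s):", drain_time_s(fixed_buffer, uplink_bps))
    print("Buffer for a 50 ms delay cap (bytes):", buffer_for_delay(uplink_bps, 0.05))
```

With these assumed values, the fixed buffer is about 80 times the BDP and
takes 8 seconds to drain. The same buffer on a link that runs at half the
rate takes twice as long to drain, which is why sizing by maximum queuing
delay behaves differently from sizing by a fixed amount of buffer space.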
20511 Seneca Meadows Parkway
Germantown, MD 20876
From: end2end-interest-bounces at postel.org
[mailto:end2end-interest-bounces at postel.org] On Behalf Of David P. Reed
Sent: Tuesday, September 08, 2009 10:06 AM
To: Jim Gettys
Cc: jeroen at unfix.org; sthaug at nethelp.no; end2end-interest at postel.org
Subject: Re: [e2e] What's wrong with this picture?
Jim - I suspect your Comcast support person was partly right. ICMP
*echoing* is sidelined. However, IP packets that contain ICMP messages
destined farther down the line are NOT dropped by routers and switches.
That would be dumb, though I'm sure some networks that don't want to
monitor their own congestion might be so dumb as to imagine that ICMP
mice will somehow overload a network. (I don't think such people are
members of NANOG.)
It turns out that Comcast's problem (extensively investigated by
technologists rather than their PR dept., only after the Harvard FCC
hearing) was that DOCSIS modems they had bought actually had
multiple-seconds worth of buffering on their upstream-facing interfaces,
and did not under any circumstances drop packets in a way that would
allow TCP to know enough to slow down the AI part of AIMD.
Given the sidelining of *echoing*, yes, pinging a router might not give
much info about that router. But pinging the next, unloaded router down
the route will tell you a lot.
In any case, it's easy to open up a TCP connection and carry out an
end-to-end ping without ever using ICMP. Just wait a few seconds after
a sync, send a few bytes, and have a responder echo them. If you use
the TCP_NODELAY option, you will get a reliable result. I have a python
program on my server that handles such things. In this particular
measurement, the data from this "TCP ping" gave consistent RTTs with
the ICMP ping.
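
For anyone who wants to try this, a minimal sketch of such a "TCP ping" in
Python follows. It is not the program mentioned above; the function names,
the 8-byte payload, and the settle_s pause (my reading of "wait a few
seconds" as a pause after connection setup) are all assumptions. The demo
runs the responder on the loopback interface only so that the example is
self-contained.

```python
import socket
import threading
import time


def echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo bytes back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        # Disable Nagle so small replies are sent immediately.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def tcp_ping(host: str, port: int, count: int = 5, settle_s: float = 0.0) -> list:
    """Return a list of round-trip times in seconds, measured over one TCP connection."""
    rtts = []
    with socket.create_connection((host, port)) as s:
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        time.sleep(settle_s)  # assumed pause after connection setup
        for i in range(count):
            payload = bytes([i % 256]) * 8
            start = time.perf_counter()
            s.sendall(payload)
            received = b""
            # TCP is a byte stream, so read until the whole echo is back.
            while len(received) < len(payload):
                chunk = s.recv(len(payload) - len(received))
                if not chunk:
                    raise ConnectionError("responder closed the connection")
                received += chunk
            rtts.append(time.perf_counter() - start)
    return rtts


def demo() -> list:
    """Run the responder on loopback and ping it three times."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    worker.start()
    try:
        return tcp_ping("127.0.0.1", port, count=3)
    finally:
        worker.join(timeout=2)
        listener.close()


if __name__ == "__main__":
    for rtt in demo():
        print(f"rtt = {rtt * 1000:.3f} ms")
```

To measure a real path, run echo_server on the far host and call tcp_ping
with that host and port, with settle_s set to a few seconds.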
It's fascinating to me that people REALLY WANT to call this "measurement
error". As opposed to *operator* misconfiguration (or
Perhaps someone might actually be able to guess what manufacturer sells
the equipment that routinely buffers 8 seconds of outgoing packets on a
link without a hint of backpressure that would allow TCP's congestion
control to kick in?
I just want to see it fixed before Sandvine sells some more
TCP-RST-injectors and DPI spies to that vendor, and starts accusing
people with some very cool handsets of "attacking the network". Maybe
the handset vendor would be interested in having interactions take less
than 8-20 seconds between gesture and response from a server?
One thing that is clear: the spate of news stories about "spectrum
shortage" has missed a fundamental technical problem that has NOTHING to
do with spectrum.