[e2e] Reacting to corruption based loss
cannara at attglobal.net
Wed Jun 29 11:47:04 PDT 2005
Good response Sam. The kind that leads to more thought, in fact.
How many years ago for the 1st example, you ask. For that one, 6. For the
one this year, 0.25 year. :]
You say you "spent a long time simulating the BSD stack". That's great, and
part of the problem. Folks do simulations which are based on code written to
simulate someone's ideas on how something works. Then, they believe the
simulations, despite what's actually seen in reality. We all know that
simulators and their use can be very limited in relevance, if not accuracy.
One of the biggest issues is lack of release control for things as important
as Internet protocols (e.g., TCP). Thus the NT server may have a different
version of TCP from that on the user's spanking new PC. No one ever addresses
even the basics of stack parameter settings in their manuals, and network
staffers rarely have the time to go in and check versions, timer settings,
yadda, yadda. This is indeed why many performance problems occur. You fixed
IRIX 6 years ago. Great.
Now, why does the Internet work? Not simply because of TCP, for sure. Your
experiment illustrates the rush to acceptance whenever these points are raised:
"I transferred a largish file to my sluggish corporate ftp server. Took 77
seconds (over the Internet, from San Francisco to Sunnyvale). I then did the
same thing, this time I unplugged my Ethernet cable 6 times, each time for 4
seconds. The transfer took 131 seconds."
So, what is "largish" in more precise terms? What are the RTT and limiting
bit-rate of your "Internet" path from SF to S'vale? The file evidently went
right by our house! But, despite the imprecision, we can use your result: 77
+ 6 x 4 = 101. Your transfer actually took 131 seconds, fully 30% more than
one would expect on a link that's simply interrupted, not congested. Good
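For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted from Sam's experiment. The one assumption, labeled in the code, is that an ideal sender on a merely interrupted link loses only the time the cable is actually out; the variable names are illustrative and nothing here is a new measurement.

```python
# Figures quoted from Sam's experiment; nothing here is measured anew.
baseline_s = 77    # uninterrupted transfer time, seconds
unplugs = 6        # number of times the cable was pulled
unplug_s = 4       # duration of each pull, seconds
observed_s = 131   # transfer time with the interruptions, seconds

# Assumption: an ideal sender loses only the time the link is down.
expected_s = baseline_s + unplugs * unplug_s

# Whatever is left over is time lost to recovery, not to the outage itself.
excess_s = observed_s - expected_s
excess_pct = 100.0 * excess_s / expected_s
per_unplug_penalty_s = excess_s / unplugs

print(f"expected {expected_s} s, observed {observed_s} s")
print(f"excess {excess_s} s ({excess_pct:.1f}%), "
      f"about {per_unplug_penalty_s:.1f} s extra per interruption")
```

Under that assumption, each 4-second interruption cost roughly as much again in recovery time as the outage itself.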
Sam Manthorpe wrote:
> On Tue, 28 Jun 2005, Cannara wrote:
> > On the error rates issue, mobile is an extreme case, always subject to
> > difficult conditions in the physical space, so symbol definitions & error
> > correction are paramount. However, most corporate traffic isn't over mobile
> > links, but dedicated lines between routers, or radio/optical bridges, etc.
> > Here, the reality of hardware failures raises its head and we see long-lasting
> > error rates that are quite small and even content dependent. This is where
> > TCP's ignorance of what's going on and its machete approach to slowdown are
> > inappropriate and costly to the enterprise.
> > As an example of the latter, a major telecom company, whose services many of
> > us are using this instant, called a few years back, asking for help
> How many (years?)
> > determining why just some of its offices were getting extremely poor
> > performance downloading files, like customer site maps, from company servers,
> > while other sites had great performance. The maps were a few MB and loaded
> > via SMB/Samba over TCP/IP to staff PCs. The head network engineer was so
> > desperate, he even put a PC in his car and drove all over Florida checking
> > sites. This was actually good. But, best of all, he had access to the
> > company's Distributed Sniffers(r) at many offices and HQ. A few traces told
> > the story: a) some routed paths from some offices were losing 0.4% of pkts,
> > while others lost none; b) the lossy paths experienced 20-30% longer
> > file-download times. By simple triangulation, we decided that he should check
> > the T3 interface on Cisco box X for errors. Sure enough, about 0.4% error
> > rates were being tallied. The phone-line folks fixed the problem and voila,
> > all sites crossing that path were back to speed!
> > Now, if you were a network manager for a major corporation, would you rush to
> > fix a physical problem that generated less than 1% errors, if your boss &
> > users were complaining about mysterious slowdowns many times larger?
> > 0.4% wasn't even enough to trigger an alert on their management consoles. You'd
> > certainly be looking for bigger fish. Well, TCP's algorithms create a bigger
> > fish -- talk about Henny Penny. :]
> I can't help but wonder - if TCP/IP were generally so sensitive to a loss
> of 0.4%, then why does the Internet work? I spent a long time simulating
> the BSD stack a while back and it held up extremely well under random
> loss until you hit 10% at which point things go non-linear. I've also
> never experienced what you describe, neither as a user nor in my
> capacity as engineer debugging customer network problems.
> And what's with that "major corporation" and "boss" stuff? I'm guessing
> they'd like the "replace the hardware" solution to the "replace the
> whole infrastructure with something that's incompatible with everything else
> on the planet" one.
> > The files were transferred in many 34kB SMB blocks, which required something
> > like 23 server pkts per. The NT servers had a send window of about 6 pkts
> > (uSoft later increased that to about 12). All interfaces were 100Mb/s, except
> > the T3 and a couple of T1s, depending on path. RTT was about 70mS for all
> > paths.
> So the NT servers were either misconfigured, or your example is rather
> dated, right?
> > Thankfully, the Sniffer traces also showed exactly what the TCPs at both ends
> > were doing, despite Fast Retransmit, SACK, etc.:
> I don't know a lot about NT's history, but having a 9K window *and* SACK sounds
> historically schizo.
> > a) the typical, default
> > timeouts were knocking the heck out of throughput; b) the fact that transfers
> > required many blocks of odd numbers of pkts meant the Ack Timer at the
> > receiver was expiring on every block, waiting (~100mS) for the magical
> > even-numbered last pkt in the block, which never came. These defaults could
> > have been changed to gain some performance back, but not much. The basic idea
> > that TCP should assume congestion = loss was the Achilles' heel. Even the
> > silly "ack alternate pkts" concept could have been largely automatically
> > eliminated, if the receiver TCP actually learned that it would always get an
> > odd number.
> The issue you describe was fixed a long time ago in most stacks, AFAIAW.
> I fixed it in IRIX aroundabout 6 years ago.
> For fun, I tried an experiment. I transferred a largish file to my sluggish
> corporate ftp server. Took 77 seconds (over the Internet, from San Francisco
> to Sunnyvale). I then did the same thing, this time I unplugged my Ethernet
> cable 6 times, each time for 4 seconds. The transfer took 131 seconds.
> Not bad, I think. At least not bad enough to warrant a rearchitecture.
> -- Sam
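
A rough sketch of the window arithmetic behind the NT example quoted above, for what it is worth. The window (about 6 pkts), the RTT (about 70 ms), the 23 pkts per SMB block and the ~100 ms ack-timer wait come from the quoted text. The 1460-byte payload per packet is an assumption, as is the simple model itself: no loss, the sender limited only by window/RTT, and one ack-timer expiry per block.

```python
# From the quoted example: send window, round-trip time, block size, ack timer.
window_pkts = 6        # NT server send window, packets
rtt_s = 0.070          # round-trip time, seconds
pkts_per_block = 23    # server packets per 34 kB SMB block
ack_delay_s = 0.100    # receiver ack-timer wait on the odd last packet

# Assumption (not in the original): typical Ethernet TCP payload size.
payload_bytes = 1460

# A sender cannot have more than one window in flight per round trip,
# so throughput is bounded by window / RTT whatever the link speed.
window_bytes = window_pkts * payload_bytes
bound_bps = window_bytes * 8 / rtt_s

# Time per block under the bound, without and with the ack-timer wait.
block_s = pkts_per_block / window_pkts * rtt_s
block_with_ack_s = block_s + ack_delay_s

print(f"window-limited bound: {bound_bps / 1e6:.2f} Mb/s")
print(f"per block: {block_s * 1000:.0f} ms, "
      f"{block_with_ack_s * 1000:.0f} ms with the ack-timer wait")
```

Even with no loss at all, this model keeps the transfer far below the T3 and 100Mb/s interfaces in the path, so every timeout lands on a connection that was already running window-limited.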