From d.leith at eee.strath.ac.uk Wed Jun 1 10:22:53 2005 From: d.leith at eee.strath.ac.uk (Douglas Leith) Date: Wed, 1 Jun 2005 18:22:53 +0100 Subject: [e2e] New benchmark test results for High-Speed TCP, Scalable, FAST, BIC, H-TCP. Message-ID: Results of some recent experimental work to evaluate the performance of these TCP proposals are now available online at www.hamilton.ie/net/eval/. To our knowledge, this is the first comparison that controls for differences in network stack implementation - the Linux network stack has known performance issues in high-speed environments (e.g. see www.hamilton.ie/net/) and so most patches for new congestion control algorithms also implement many changes unrelated to the congestion control algorithm, making comparison of congestion control algorithm performance difficult, if not impossible, when patches are used directly. In summary, we find that both Scalable-TCP and FAST-TCP consistently exhibit substantial unfairness, even when competing flows share identical network path characteristics. Scalable-TCP, HS-TCP, FAST-TCP and BIC-TCP all exhibit much greater RTT unfairness than does standard TCP, to the extent that long RTT flows may be completely starved of bandwidth. Scalable-TCP, HS-TCP and BIC-TCP all exhibit slow convergence and sustained unfairness following changes in network conditions such as the start-up of a new flow. FAST-TCP exhibits complex convergence behaviour. Our measured data is available in an online archive, as well as summary reports. Comments appreciated. Doug -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.postel.org/pipermail/end2end-interest/attachments/20050601/841be4aa/attachment.html From thomas.r.henderson at boeing.com Wed Jun 1 12:59:18 2005 From: thomas.r.henderson at boeing.com (Henderson, Thomas R) Date: Wed, 1 Jun 2005 12:59:18 -0700 Subject: [e2e] Packet re-ordering & NewReno Message-ID: <6938661A6EDA8A4EA8D1419BCE46F24C04060B9A@xch-nw-27.nw.nos.boeing.com> > -----Original Message----- > From: Evans, Roy [mailto:revans at emea.att.com] > Sent: Friday, May 27, 2005 6:11 AM > To: end2end-interest at postel.org > Subject: [e2e] Packet re-ordering & NewReno > > > Anyhow, its what happens next that I am getting excited > about, ok perhaps confused by. > > There is a bunch of retransmitted data in flight in the > network , followed by new & retransmitted data. > > The retransmitted data hits B , B responds with > endofwindow-ack again. To me, this seems a reasonable response > > So , I end up with a stream of endofwindow-acks flowing back to A > > ..which triggers a fast / newreno response & its reduction > of ssthresh & cwnd > > another round-trip time later , new-reno exits , with a bunch > of retransmitted data in flight in the network , triggering > duplicate acks , triggering a third fast/newreno response. > > > Any thoughts on what the protocol error is ? It looks like this implementation supports the so-called "Less Careful" variant of NewReno (see Section 5 of RFC 2582). Such a variant is more aggressive in recovering from the loss of the (endofwindow+1) new packet transmission without taking a timeout, at the cost of more unnecessary fast retransmits in other scenarios (which, in this reordering case, is clearly suboptimal). Both RFC 2582 and its update, RFC 3782, recommend the Careful variant; in RFC 3782 the Less Careful variant is dropped altogether and only the Careful variant is specified. See Section 11 of RFC 3782. 
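For concreteness, the Careful check can be sketched in a few lines of Python (my own illustration, not code from the RFCs; the name "recover" loosely follows the send_high/recover variable of RFC 2582/3782, the rest is invented):

```python
# Sketch of the Careful vs. Less Careful decision from RFC 2582, Section 5.
# "recover" holds the highest sequence number outstanding when loss was
# last detected; names and structure are illustrative only.

def should_fast_retransmit(ack_no, dupack_count, recover, careful=True):
    """Return True if duplicate ACKs should trigger another Fast Retransmit."""
    if dupack_count < 3:
        return False
    if careful and ack_no <= recover:
        # Careful variant: duplicate ACKs that do not cover "recover"
        # may merely be echoes of our own retransmissions (as in Roy's
        # reordering trace), so do not cut ssthresh/cwnd again.
        return False
    # Less Careful variant: retransmit regardless, which in the trace
    # above produces the repeated cwnd reductions Roy observed.
    return True
```

In Roy's trace the stream of endofwindow-acks carries an ACK number no greater than "recover", so the Careful variant would ignore them while the Less Careful variant re-enters fast retransmit.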
Tom From nfonseca at ic.unicamp.br Thu Jun 2 10:22:33 2005 From: nfonseca at ic.unicamp.br (nfonseca@ic.unicamp.br) Date: Thu, 2 Jun 2005 14:22:33 -0300 (BRT) Subject: [e2e] Special Issue on LRD Traffic In-Reply-To: References: Message-ID: <2052.143.106.24.10.1117732953.squirrel@webmail.ic.unicamp.br> Dear e2e subscriber, I'd like to call your attention to a recently published special issue of the Computer Networks Journal which might be of interest to those in traffic modeling. Best regards, nelson Fonseca Computer Networks Volume 48, Issue 3, Pages 289-488 (21 June 2005) Long Range Dependent Traffic Edited by M. Devetsikiotis, N.L.S. da Fonseca 1. Editorial board Page CO2 2. Modeling network traffic with long range dependence: characterization, visualization and tools Pages 289-291 Michael Devetsikiotis and Nelson L.S. da Fonseca 3. Multifractality in TCP/IP traffic: the case against Pages 293-313 Darryl Veitch, Nicolas Hohn and Patrice Abry 4. Small-time scaling behavior of Internet backbone traffic Pages 315-334 Vinay J. Ribeiro, Zhi-Li Zhang, Sue Moon and Christophe Diot 5. Network and user driven alpha-beta on-off source model for network traffic Pages 335-350 Shriram Sarvotham, Rudolf Riedi and Richard Baraniuk 6. Envelope process and computation of the equivalent bandwidth of multifractal flows Pages 351-375 Cesar A.V. Melo and Nelson L.S. da Fonseca 7. Self-similarity and long range dependence on the internet: a second look at the evidence, origins and implications Pages 377-399 Wei-Bo Gong, Yong Liu, Vishal Misra and Don Towsley 8. Long-range dependence in a changing Internet traffic mix Pages 401-422 Cheolwoo Park, Félix Hernández-Campos, J.S. Marron and F. Donelson Smith 9. On the wavelet spectrum diagnostic for Hurst parameter estimation in the analysis of Internet traffic Pages 423-445 Stilian Stoev, Murad S. Taqqu, Cheolwoo Park and J.S. Marron 10. Queueing analysis of network traffic: methodology and visualization tools Pages 447-473 D.A. Rolls, G. 
Michailidis and F. Hernández-Campos 11. The notion of end-to-end capacity and its application to the estimation of end-to-end network delays Pages 475-488 Han S. Kim and Ness B. Shroff From floyd at icir.org Thu Jun 2 13:55:40 2005 From: floyd at icir.org (Sally Floyd) Date: Thu, 02 Jun 2005 13:55:40 -0700 Subject: [e2e] a new IRTF group on Transport Models Message-ID: <200506022055.j52Ktetq022482@cougar.icir.org> Gengyu - >By considering end2end transport layer protocol, >whether packet drop rates of the metrics should be segment drop (error) rates? I am not sure that I understand your question. If you are asking whether we should call them packet drops or segment drops, I think it depends on where they are measured. I would use the term "packet drops" to refer to drop rates measured at a router (in a simulation or in an experiment). However, as I assume you are suggesting, it would probably be more accurate to use the term "segment drops" for drop rates measured at the transport end-nodes. - Sally From floyd at icir.org Thu Jun 2 14:33:00 2005 From: floyd at icir.org (Sally Floyd) Date: Thu, 02 Jun 2005 14:33:00 -0700 Subject: [e2e] a new IRTF group on Transport Models Message-ID: <200506022133.j52LX0ag023015@cougar.icir.org> Lloyd - >But the drops of interest measured are presumably segment drops, or >even segments-with-piggybacked acks drops, while drops of pure acks >(which are logically also packet drops) are usually neglected. > >(And congestion/queue overflow/policy drops in routers are distinct >from discards due to detected L2/L3 checksum errors.) > >Gah, semantics. Oh dear, and I wasn't even thinking about acks vs. data packets, or congestion- vs. corruption-related drops, when I wrote the earlier email. (Though I am very much interested in drop rates for pure ack packets, as it has a big influence on the burstiness of the data packets subsequently transmitted in the other direction.) 
I was thinking about fragmentation, where a "segment" in TCP might correspond to several "packets" on the wire. Also resulting in a difference between packet drop rates on-the-wire and segment drop rates seen by the TCP sender. Ah, semantics... - Sally From cannara at attglobal.net Thu Jun 2 22:45:10 2005 From: cannara at attglobal.net (Cannara) Date: Thu, 02 Jun 2005 22:45:10 -0700 Subject: [e2e] a new IRTF group on Transport Models References: <200506022133.j52LX0ag023015@cougar.icir.org> Message-ID: <429FEE66.1047628@attglobal.net> Since Sally & I have exchanged a few notes on what I see as the truly serious issue that never gets attention, I'll just mention it here once, in case some courageous, responsible souls are out there to do for humanity what the IETF & crew won't -- messing with TCP (or any transport) is missing the point of congestion control. The origin of it in TCP had nothing to do with the "e2e principle". It simply created a bandaid to bring the Internet back from the edge of Metcalfe's predicted collapse that scared folks in the '80s. Why? Because the Internet designers never considered anything but IP and a few hosts (ok a few coffee pots too :). DLC? What's that? Unique node addresses? Eh? Admission & flow control at the network layer? What's that? So, with eyes averted, ears covered and mouths that raised such issues taped shut, we got what we have today -- a mess. Whether or not TCP is ever given accurate info to distinguish physical loss from true congestion matters little. The network layer is responsible for its own congestion management. That's where the Internet deserves to finally get a dose of the reality that's been faced for decades in more reliable and secure deployed communications systems -- real networks, in other words. If there are such good souls ready to stand up and do what needs be done, I applaud you. Remember, courage men/women, God hates a coward. 
Alex Sally Floyd wrote: > > Lloyd - > > >But the drops of interest measured are presumably segment drops, or > >even segments-with-piggybacked acks drops, while drops of pure acks > >(which are logically also packet drops) are usually neglected. > > > >(And congestion/queue overflow/policy drops in routers are distinct > >from discards due to detected L2/L3 checksum errors.) > > > >Gah, semantics. > > Oh dear, and I wasn't even thinking about acks vs. data packets, > or congestion- vs. corruption-related drops, when I wrote the earlier > email. (Though I am very much interested in drop rates for pure > ack packets, as it has a big influence on the burstiness of the > data packets subsequently transmitted in the other direction.) > > I was thinking about fragmentation, where a "segment" in TCP > might correspond to several "packets" on the wire. Also resulting > in a difference between packet drop rates on-the-wire and segment > drop rates seen by the TCP sender. > > Ah, semantics... > > - Sally From am.amir at gmail.com Thu Jun 2 23:33:39 2005 From: am.amir at gmail.com (Aamir Mehmood) Date: Fri, 3 Jun 2005 11:33:39 +0500 Subject: [e2e] How measered Jitter is incorporated in E-Model Message-ID: <12a3f40805060223337a93f4de@mail.gmail.com> Hi We are doing performance analysis of our country's internet exchange. We have measured jitter for voice calls on the backbone links. Can someone please let me know how I can incorporate that measured value into the E-Model (ITU G.107)? Regards Amir From jnc at mercury.lcs.mit.edu Fri Jun 3 05:30:12 2005 From: jnc at mercury.lcs.mit.edu (Noel Chiappa) Date: Fri, 3 Jun 2005 08:30:12 -0400 (EDT) Subject: [e2e] a new IRTF group on Transport Models Message-ID: <20050603123012.E812786AFF@mercury.lcs.mit.edu> > From: Cannara > the Internet designers never considered anything but IP and a few hosts > ... > Admission & flow control at the network layer? What's that? > ... 
> Whether or not TCP is ever given accurate info to distinguish physical > loss from true congestion matters little. Gee, Alex, since you seem to know everything (in addition to being smarter than all the early Internet people put together), perhaps you can explain to me what ICMP type 4 messages are supposed to do... (And, for the rest of you, does anyone know how CYCLADES handled congestion? I have the CYCLADES book, so I could go look it up, but I was hoping someone could save me the trouble.) > If there are such good souls ready to stand up and do what needs be > done, I applaud you. Remember, courage men/women, God hates a coward. I would cheerfully say what I really think, but alas, I'm afraid the list maintainers would likely chastise me (rightly) for ad hominem attacks. Noel From fkastenholz at comcast.net Fri Jun 3 06:30:00 2005 From: fkastenholz at comcast.net (frank@kastenholz.org) Date: Fri, 03 Jun 2005 13:30:00 +0000 Subject: [e2e] a new IRTF group on Transport Models Message-ID: <060320051330.21079.42A05B58000004B600005257220588601496040108020A9B9C0E0500@comcast.net> Sally Is there any thought to identifying information that routers and end systems might provide that either can be fed back into the models to refine them or used in parallel to (in)validate them? A simple example might be packet drops. If models assume that the only reason packets are dropped is overflowing queues due to congestion, that leads to certain conclusions, etc, and tweaking our transport protocols in a certain direction. But if it turns out that a significant percentage of packet drops is because of something else, then that conclusion would be incorrect... Frank Kastenholz From craig at aland.bbn.com Fri Jun 3 07:38:32 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Fri, 03 Jun 2005 10:38:32 -0400 Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: Your message of "Fri, 03 Jun 2005 13:30:00 -0000." 
<060320051330.21079.42A05B58000004B600005257220588601496040108020A9B9C0E0500@comcast.net> Message-ID: <20050603143833.043831FF@aland.bbn.com> I was part of a team that looked at the particular problem of distinguishing packet drop cause in detail recently. See, for instance, http://www.ir.bbn.com/documents/articles/krishnan_cn04.pdf You don't get as much leverage as you'd hope from knowing the cause of packet drops. Craig In message <060320051330.21079.42A05B58000004B600005257220588601496040108020A9B 9C0E0500 at comcast.net>, frank at kastenholz.org writes: >Sally > >Is there any thought to identifying information >that routers and end systems might provide that >either can be fed back into the models to refine >them or used in parallel to (in)validate them? > >A simple example might be packet drops. If models >assume that the only reason packets are dropped is >overflowing queues due to congestion, that leads to >certain conclusions, etc, and tweaking our transport >protocols in a certain direction. But if it turns >out that a significant percentage of packet drops >is because of something else, then that conclusion >would be incorrect... > >Frank Kastenholz > > From faber at ISI.EDU Fri Jun 3 12:05:09 2005 From: faber at ISI.EDU (Ted Faber) Date: Fri, 3 Jun 2005 12:05:09 -0700 Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: <429FEE66.1047628@attglobal.net> References: <200506022133.j52LX0ag023015@cougar.icir.org> <429FEE66.1047628@attglobal.net> Message-ID: <20050603190509.GC56295@pun.isi.edu> On Thu, Jun 02, 2005 at 10:45:10PM -0700, Cannara wrote: > Since Sally & I have exchanged a few notes on what I see as the truly serious > issue that never gets attention, I'll just mention it here once, in case some > courageous, responsible souls are out there to do for humanity what the IETF & > crew won't -- messing with TCP (or any transport) is missing the point of > congestion control. 
The origin of it in TCP had nothing to do with the "e2e > principle". It simply created a bandaid to bring the Internet back from the > edge of Metcalfe's predicted collapse that scared folks in the '80s. > > Why? Because the Internet designers never considered anything but IP and a > few hosts (ok a few coffee pots too :). DLC? What's that? Unique node > addresses? Eh? Admission & flow control at the network layer? What's that? > So, with eyes averted, ears covered and mouths that raised such issues taped > shut, we got what we have today -- a mess. Whether or not TCP is ever given > accurate info to distinguish physical loss from true congestion matters > little. The network layer is responsible for its own congestion management. > That's where the Internet deserves to finally get a dose of the reality that's > been faced for decades in more reliable and secure deployed communications > systems -- real networks, in other words. It would be easy to get side-tracked into a discussion of what possible consensus usage of the word "real" you believe doesn't apply to the Internet, but I'm going to avoid that rathole. I think one should use some caution in advocating a network-layer-only approach to congestion control. The new factors that the imaginary network we call the Internet introduces into the equation are the wide variety of applications that the network is potentially used for and the fantastic array of devices used to create the network. The Internet's strength lies in its diversity, and trying to apply a low-level, one-size-fits-all congestion control system imperils that diversity. In the Internet, congestion does not necessarily mean that your Cisco router's buffers are full. It may mean that your firewall is running low on memory or CPU cycles, or that your wireless link has encountered a cloud, or that a new and bursty source is changing the jitter qualities of your end-to-end stream. *Some* resource is getting scarce; the Curve of Truth is flattening out. 
Notice: the problem here depends both on what resources the devices on your path are allocating and what resource is critical to your application. And what resource is critical to the device. A sudden 50% drop in bandwidth (resulting in a drop in network capacity) is a non-event to some applications and a congestion moment to others. A firewall that is running out of memory wants to throttle new connections; a router running out of buffers wants to throttle sources sending lots of data. Now, when we, the users of such an imaginary, whack-O network, can't agree on what's being optimized (throughput, jitter, delay, availability) or what a reasonable response to a resource becoming scarce is (recode your video, try again tomorrow, slow down transmission rate, smooth your traffic), it really stretches the imagination to believe that the only barrier to solving the (ill-defined) problem is a lack of will. One could, I suppose, force the definition of the problem to be best-effort bulk-transfer throughput maximization while maintaining max-min fairness and hammer the fanciful network we have into delivering that through a unified network-level admission and flow control system. I don't even deny that such a solution would help the considerable number of current Internet users who are doing best-effort bulk-transfer throughput maximization. But it's not going to help the guys who want to do other things, nor will it help those groups co-exist. > If there are such good souls ready to stand up and do what needs be done, I > applaud you. Remember, courage men/women, God hates a coward. "God hates me." "Hate 'im back. It works for me." -- Danny Glover and Mel Gibson, Lethal Weapon -- Ted Faber http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050603/7faae9dd/attachment.bin From mascolo at poliba.it Fri Jun 3 12:03:53 2005 From: mascolo at poliba.it (Saverio Mascolo) Date: Fri, 3 Jun 2005 21:03:53 +0200 Subject: [e2e] a new IRTF group on Transport Models References: <20050603143833.043831FF@aland.bbn.com> Message-ID: <009b01c5686e$fcd47340$723bccc1@poliba.it> > > I was part of a team that looked at the particular problem of distinguishing > packet drop cause in detail recently. See, for instance, > > http://www.ir.bbn.com/documents/articles/krishnan_cn04.pdf > > You don't get as much leverage as you'd hope from knowing the cause of > packet drops. also because the link layer should be well designed so that losses should be due only (or mainly) to congestion. Saverio From revans at emea.att.com Mon Jun 6 02:48:21 2005 From: revans at emea.att.com (Evans, Roy) Date: Mon, 6 Jun 2005 10:48:21 +0100 Subject: [e2e] How measered Jitter is incorporated in E-Model Message-ID: <7FC4591491592043B3664FBAF12EE9BE03E565C1@gbhavmsx02.emea.att.com> Amir, I am a user of the model described here - http://portal.acm.org/citation.cfm?id=505669&coll=portal&dl=ACM OK, strictly the tool I use uses a later iteration of G.107 and a model of a vendor's adaptive jitter buffer rather than the fixed one presented above. Out of curiosity, what metric are you measuring? Roy Evans From hoene at tkn.tu-berlin.de Mon Jun 6 05:58:05 2005 From: hoene at tkn.tu-berlin.de (Christian Hoene) Date: Mon, 06 Jun 2005 14:58:05 +0200 Subject: [e2e] How measered Jitter is incorporated in E-Model In-Reply-To: <42A0146E.4030907@tkn.tu-berlin.de> References: <42A01299.70404@ieee.org> <42A0146E.4030907@tkn.tu-berlin.de> Message-ID: <42A4485D.8020701@tkn.tu-berlin.de> Dear Amir Mehmood, It is best to measure the distribution of delay values. 
Jitter - or, more precisely, delay variation - is not important. Only the distribution is relevant. The packet trace would be even better. The overall delay (including playout buffering and the transmission) and the overall loss rate (including late and lost packets) can be used in the ITU E-Model to calculate the conversational quality (measured with the R factor). If you then assume fixed playout buffer lengths (e.g. from 20 to 200ms), you can calculate the overall delay and the losses due to late packets. The delay distribution can be mapped to loss and delay values, which can be used to calculate an optimal R factor or MOS quality rating. However, this approach is problematic because it tries to find an optimal fixed playout buffer length AFTER the transmission. A real-time adaptive playout scheduler would be more realistic. In the publication [1] we addressed this issue and provide open-source software to assess the transmission of VoIP with a precision that has not been reached before. But you can use (and adapt) the software (which is somewhat complicated to install, sorry) to solve your task. Good luck, Christian Hoene TKN TU-Berlin [1] C. Hoene, S. Wiethölter, and A. Wolisz, "Predicting the Perceptual Service Quality Using a Trace of VoIP Packets", In /Proc. of Fifth International Workshop on Quality of future Internet Services (QofIS'04)/, Barcelona, Spain, September 2004. (PDF) > Subject: [e2e] How measered Jitter is incorporated in E-Model > Date: Fri, 3 Jun 2005 11:33:39 +0500 > From: Aamir Mehmood > Reply-To: Aamir Mehmood > To: end2end-interest at postel.org > > >Hi > >We are doing performance analysis of our country's internet exchange. >We have measured jitter for voice calls on the backbone links. 
> >Can some one please let me know that how can i incorporate that >measured value into the E- Model ( ITU G.107) > >Regards > > >Amir > > From boavida at dei.uc.pt Mon Jun 6 02:51:13 2005 From: boavida at dei.uc.pt (Fernando Boavida) Date: Mon, 6 Jun 2005 10:51:13 +0100 Subject: [e2e] INFOCOM 2006 Call for Papers Message-ID: <20050606095057.DF25A4A8129@teste-d.ci.uc.pt> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ------------ Apologies if you receive multiple copies ---------------- ------------ Call for Papers ----------------------------------------- IEEE INFOCOM 2006 Barcelona, April 23-29 http://www.ieee-infocom.org/2006/ Full Paper Due: July 6, 2005 Notification of Acceptance: October 31, 2005 Camera-Ready Version Due: December 18, 2005 ----------------------------------------------------------------------- Best regards, Fernando Boavida, Amitabh Mishra & Ramon Fabregat - Publicity Co-chairs +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ From floyd at icir.org Mon Jun 6 14:04:55 2005 From: floyd at icir.org (Sally Floyd) Date: Mon, 06 Jun 2005 14:04:55 -0700 Subject: [e2e] a new IRTF group on Transport Models Message-ID: <200506062104.j56L4tKA067656@cougar.icir.org> Frank - >Is there any thought to identifying information >that routers and end systems might provide that >either can be fed back into the models to refine >them or used in parallel to (in)validate them? > >A simple example might be packet drops. If models >assume that the only reason packets are dropped is >overflowing queues due to congestion, that leads to >certain conclusions, etc, and tweaking our transport >protocols in a certain direction. But if it turns >out that a significant percentage of packet drops >is because of something else, then that conclusion >would be incorrect... 
Certainly, any evaluation of transport protocols or of congestion control mechanisms needs to take into account the difference between packets dropped due to congestion and packets dropped due to some other reason (e.g., corruption). In simulations, non-congestion-related packet drops are only present when explicitly introduced, and in test beds, I assume that there are mechanisms to detect and report on the different types of packet drops. And in analysis, it is important to consider the possibilities of non-congestion-related packet losses. If you are asking if there has been any thought to proposing additions to the protocol suite to report such information, the short answer is that this is not intended to be the province of the Transport Models Research Group (TMRG). The proposal is that TMRG itself would *not* be proposing changes to transport protocols or to explicit communication between transport protocols and lower layers, but would *only* be concerned with improving our methodologies for evaluating such proposals. E.g., with test suites for simulations or for test beds, discussions of topologies and traffic generation, of the models underlying our analysis, etc. - Sally http://www.icir.org/floyd/ Addendum: I am personally very interested in the potential benefits (and complications) of more explicit communication between link layers and transport protocols, and in particular I like the idea of explicit communication from lower layers to transport saying "the packet with this packet header was dropped due to corruption". It is not clear to me yet exactly what the response of the transport protocol should be to reports of packet corruption, however - continuing to send at a high rate in the face of a high rate of packet corruption doesn't sound too appealing. 
And there are complications with other forms of explicit communication between link layers and transport protocols as well, e.g., as discussed in the IAB internet-draft on "Architectural Implications of Link Indications", at "http://www.iab.org/documents/drafts/draft-iab-link-indications-01.txt". It might be that it is time also for a research group on Explicit Communication between Transport Protocols and Lower Layers, but someone else will have to start it. (I believe that there is already a proposal in the works for a new research group on new congestion control mechanisms, which includes one class of proposed new forms of explicit communication between transport protocols and routers.) From michael.welzl at uibk.ac.at Tue Jun 7 00:22:32 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: 07 Jun 2005 09:22:32 +0200 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <200506062104.j56L4tKA067656@cougar.icir.org> References: <200506062104.j56L4tKA067656@cougar.icir.org> Message-ID: <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> Dear all, I changed the subject because it's drifting away from the original topic... Sally Floyd wrote: > Addendum: > I am personally very interested in the potential benefits (and > complications) of more explicit communication between link layers > and transport protocols, and in particular I like the idea of > explicit communication from lower layers to transport saying "the > packet with this packet header was dropped due to corruption". It > is not clear to me yet exactly what the response of the transport > protocol should be to reports of packet corruption, however - > continuing to send at a high rate in the face of a high rate of > packet corruption doesn't sound too appealing. And there are This point has been raised several times: how exactly should a sender react to corruption? I fully agree that continuing to send at a high rate isn't a good idea. 
Now, given that we're talking about a transport endpoint which is informed about corruption, there probably isn't any knowledge regarding the origin of corruption available - we don't know what type of link layer caused it or why exactly it happened (too many sources sending at the same time? user passed a wall?). However, there seems to be some consensus that the reaction to corruption should be less severe than the reaction to congestion. Also, it has been noted several times (and in several places) that AIMD would work just as well if beta (the multiplicative decrease factor) was, say, 7/8 instead of 1/2. For a historical reason (I think it was the DECbit scheme), 7/8 seems to be a number that survived in people's minds. So, why don't we just decide for a pragmatic approach instead of waiting endlessly for a research solution that we can't come up with? Why don't we simply state that the reaction to corruption has to be: "reduce the rate by multiplying it with 7/8"? Much like the TCP reduction by half, it may not be the perfect solution (Jacobson actually mentions that the reduction by half is "almost certainly too large" in his congavoid paper), but it could be a way to get us forward. ...or is there a reasonable research method that can help us determine the ideal reaction to corruption, irrespective of the cause? > complications with other forms of explicit communication between > link layers and transport protocols as well, e.g., as discussed in > the IAB internet-draft on "Architectural Implications of Link > Indications", at > "http://www.iab.org/documents/drafts/draft-iab-link-indications-01.txt". > > It might be that it is time also for a research group on Explicit > Communication between Transport Protocols and Lower Layers, but > someone else will have to start it. 
(I believe that there is already > a proposal in the works for a new research group on new congestion > control mechanisms, which includes one class of proposed new forms > of explicit communication between transport protocols and routers.) I agree that this would be good to have. There have been many proposals, and none of them appeared to work well enough (*ouch*, I just hurt myself :-) ). Inter-layer communication is a tricky issue, regardless of the direction in the stack. Heck, we don't even have a reasonable way to signal "UDP-Lite packet coming through" to a link layer! Cheers, Michael From mycroft at netbsd.org Tue Jun 7 00:59:20 2005 From: mycroft at netbsd.org (Charles M. Hannum) Date: Tue, 7 Jun 2005 07:59:20 +0000 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> References: <200506062104.j56L4tKA067656@cougar.icir.org> <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> Message-ID: <200506070759.20223.mycroft@netbsd.org> This gets weirder when you consider the implications of corruption on link-layer protocols. E.g., 802.11 typically will adapt to signal problems by detecting corruption (at the link layer) and switching to a slower transfer rate (more accurately: to a different coding method at a slower rate), which usually alleviates the problem. In this case, IP will never detect a corrupt packet (because corruption is handled at a lower level), and a notification from the link layer would likely not be useful (because the link layer adapts to alleviate the problem). The $64 (64-bit?) question is: is there something you could actually do with the information that would be compelling enough to be worth a layering violation? Since this would be annoying for many implementors, there needs to be a really compelling argument for it. 
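Michael's suggested reaction to corruption ("reduce the rate by multiplying it with 7/8") amounts to a one-line change in the AIMD decrease step. A toy sketch of such a sender (mine, purely illustrative; not code from any patch or RFC discussed in this thread):

```python
# Toy AIMD window update that distinguishes congestion loss from
# corruption loss, using the beta = 7/8 value suggested in this thread
# for corruption. Illustrative only.

def update_cwnd(cwnd, event, alpha=1.0, min_cwnd=1.0):
    """Return the new congestion window after one event."""
    if event == "ack":
        return cwnd + alpha / cwnd          # additive increase per ACK
    if event == "congestion_loss":
        return max(min_cwnd, cwnd * 0.5)    # standard TCP: beta = 1/2
    if event == "corruption_loss":
        return max(min_cwnd, cwnd * 7 / 8)  # gentler decrease: beta = 7/8
    raise ValueError(event)
```

Starting from cwnd = 32, a congestion loss drops the window to 16, while a corruption loss only drops it to 28 - the sender backs off, but keeps most of its rate when the loss was not its fault.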
From weddy at grc.nasa.gov Tue Jun 7 04:18:09 2005 From: weddy at grc.nasa.gov (Wesley Eddy) Date: Tue, 7 Jun 2005 07:18:09 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> References: <200506062104.j56L4tKA067656@cougar.icir.org> <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> Message-ID: <20050607111809.GA1970@grc.nasa.gov> On Tue, Jun 07, 2005 at 09:22:32AM +0200, Michael Welzl wrote: > > Also, it has been noted several times (and in several places) that > AIMD would work just as well if beta (the multiplicative decrease > factor) was, say, 7/8 instead of 1/2. For a historical reason (I > think it was the DECbit scheme), 7/8 seems to be a number > that survived in people's minds. > > So, why don't we just decide for a pragmatic approach instead > of waiting endlessly for a research solution that we can't come > up with? Why don't we simply state that the reaction to corruption > has to be: "reduce the rate by multiplying it with 7/8"? > This idea is sort of discussed in the ETEN paper Craig sent a link to earlier. One approach that it describes (CETEN_A) adapts beta between 1/2 and 1 based on the rate of congestion events reported. In the October 2004 CCR, there is a paper that goes into greater depth on CETEN; "New Techniques for Making Transport Protocols Robust to Corruption-Based Loss" by Eddy, Ostermann, and Allman. Another idea, called CETEN_P, involves leaving beta the same, but flipping a coin weighted with the frequency of corruption events relative to congestion events. The multiplicative decrease is either triggered or not triggered depending on the coin flipping outcome. This was found to have some flaws that CETEN_A does not share. See Section 3.1 of: http://roland.grc.nasa.gov/~weddy/papers/ceten-e-thesis.pdf -Wes -- Wesley M. Eddy Verizon FNS / NASA GRC http://roland.grc.nasa.gov/~weddy -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050607/fb7bddf9/attachment.bin From jms at central.cis.upenn.edu Tue Jun 7 05:26:07 2005 From: jms at central.cis.upenn.edu (Jonathan M. Smith) Date: Tue, 7 Jun 2005 08:26:07 -0400 (EDT) Subject: [e2e] Reacting to corruption based loss In-Reply-To: <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> References: <200506062104.j56L4tKA067656@cougar.icir.org> <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> Message-ID: I would hypothesize that there is in fact no ideal reaction to corruption, because the proper response is deeply dependent on the cause. There are, however, potential inferences one could draw about causes from data available from cross-layer analyses, such as link layer checksums, FECs, and the like. Best, -JMS ------------------------------------------------------------------------- Jonathan M. Smith Olga and Alberico Pompa Professor of Engineering and Applied Science Professor of Computer and Information Science, University of Pennsylvania Levine Hall, 3330 Walnut Street, Philadelphia, PA 19104 On Tue, 7 Jun 2005, Michael Welzl wrote: > Dear all, > > I changed the subject because it's drifting away from the > original topic... > > > Sally Floyd wrote: > > > Addendum: > > I am personally very interested in the potential benefits (and > > complications) of more explicit communication between link layers > > and transport protocols, and in particular I like the idea of > > explicit communication from lower layers to transport saying "the > > packet with this packet header was dropped due to corruption". It > > is not clear to me yet exactly what the response of the transport > > protocol should be to reports of packet corruption, however - > > continuing to send at a high rate in the face of a high rate of > > packet corruption doesn't sound too appealing. 
And there are > > This point has been raised several times: how exactly should > a sender react to corruption? I fully agree that continuing > to send at a high rate isn't a good idea. > > Now, given that we're talking about a transport endpoint which > is informed about corruption, there probably isn't any knowledge > regarding the origin of corruption available - we don't know > what type of link layer caused it or why exactly it happened > (too many sources sending at the same time? user passed a wall?). > However, there seems to be some consensus that the reaction > to corruption should be less severe than the reaction to congestion. > > Also, it has been noted several times (and in several places) that > AIMD would work just as well if beta (the multiplicative decrease > factor) was, say, 7/8 instead of 1/2. For a historical reason (I > think it was the DECbit scheme), 7/8 seems to be a number > that survived in people's minds. > > So, why don't we just decide for a pragmatic approach instead > of waiting endlessly for a research solution that we can't come > up with? Why don't we simply state that the reaction to corruption > has to be: "reduce the rate by multiplying it with 7/8"? > > Much like the TCP reduction by half, it may not be the perfect > solution (Jacobson actually mentions that the reduction by half > is "almost certainly too large" in his congavoid paper), but > it could be a way to get us forward. > > ...or is there a reasonable research method that can help us > determine the ideal reaction to corruption, irrespective of > the cause? > > > > complications with other forms of explicit communication between > > link layers and transport protocols as well, e.g., as discussed in > > the IAB internet-draft on "Architectural Implications of Link > > Indications", at > > "http://www.iab.org/documents/drafts/draft-iab-link-indications-01.txt". 
> > > > It might be that it is time also for a research group on Explicit > > Communication between Transport Protocols and Lower Layers, but > > someone else will have to start it. (I believe that there is already > > a proposal in the works for a new research group on new congestion > > control mechanisms, which includes one class of proposed new forms > > of explicit communication between transport protocols and routers.) > > I agree that this would be good to have. There have been many > proposals, and none of them appeared to work well enough > (*ouch*, I just hurt myself :-) ). Inter-layer communication > is a tricky issue, regardless of the direction in the stack. > Heck, we don't even have a reasonable way to signal "UDP-Lite > packet coming through" to a link layer! > > Cheers, > Michael > From Jon.Crowcroft at cl.cam.ac.uk Tue Jun 7 06:15:52 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Tue, 07 Jun 2005 14:15:52 +0100 Subject: [e2e] Reacting to corruption based loss Message-ID: I'm getting tired of this debate - the obvious solution is for all edge routers to open a TCP connection to all other edge routers - looking at contention ratios in the Internet, this would only mean about 100M TCPCBs per edge router, which with clever compression would probably only take about 10 bytes per TCPCB, so 1 G of memory (available for around $50 as SD flash or USB) would work fine - then all the edge networks are switched nowadays, so there's no loss there so there is no point in running TCP end to end - edge to edge will be perfectly good enough (after all, no-one argues for opening a TCP connection between the CPU and the disk controller, so this is consistent) and will lead to far less memory wastage in hosts running all that complicated TCP protocol - they can just send web pages and video and audio and so on as a sequence of IP packets with Moore's law running out of steam, this should recoup some performance for us, especially now that we are all 
going to be using x86s running Xen^H^H^H^Tiger^H^H^longhorn^H^H^ linux^H^H^H Out of Steam operating systems IP over TCP: way to go. note the advantages - all losses ONLY happen because of bad lan management. j p.s. I assure you, I am not advocating hop-by-hop flow control, congestion control, or reliability. p.p.s. I doubly assure that I am serious, no really, actually, yes well, all right, maybe not. p.p.p.s if only corruption based loss applied to political leaders... p.p.p.p.s Isn't this only a step back to those Halcyon days when Cisco implemented IP on X.25 by running TCP connections between all routers over the X.25 "cloud" - only, so much more elegant. p.p.p.p.p.s I think we should try this on Planetlab first - each host opens a TCP connection to all the other 300 planetlab nodes - why, a modest 30k bytes of uncompressed state From dpreed at reed.com Tue Jun 7 07:22:08 2005 From: dpreed at reed.com (David P. Reed) Date: Tue, 07 Jun 2005 10:22:08 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: References: <200506062104.j56L4tKA067656@cougar.icir.org> <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> Message-ID: <42A5AD90.4060806@reed.com> There are two effects that corruption causes. First, it lowers the end-to-end error-free capacity of the channel. Second, it causes congestion because it lowers the potential end-to-end error-free rate of the channel. Clearly, the response of lowering the input rate is one possible way to deal with the second phenomenon. However, this second effect is indistinguishable from bottleneck congestion or overload congestion. So, given that we have an effective way to deal with transient overload, why would "corruption" need a new layered interface change? So any key difference should relate to the first impact. It is well known that there are good reasons to create codes that cross packet boundaries. 
So-called erasure codes or digital fountain techniques provide the ability on an end-to-end basis to deal with data losses that are packet centric. If errors are "bursty" in time, spreading any particular end-to-end bit across several packets (or even across several paths with independent failures) is a good end-to-end response to corruption. So the utility of separation of corruption from overload losses is to be able to code better. Suppose a packet's header is salvageable but its data is not (perhaps putting a code on the header, rather than a checksum, would help here!) Would it be helpful in improving the effective end-to-end capability if decoded at the endpoint? Absolutely - if there are priors that give you a reasonable error model. But the real question here is about coding a stream across a network with packet corruption. It probably is better to look at the end-to-end perspective, which includes such things as latency (spreading a bit across successive packets adds latency when decoded at the receiver) and control-loop latency (how fast can the endpoints change coding of a stream to spread across more packets and more paths, compared to a more local, rapid, link-level response). The observation that 802.11 slows rates automatically based on link quality points out the issue here - such a local tactic improves all end-to-end paths with one fell swoop, whereas there is the possibility that end-to-end responses will be too slow, or else drive each other into mutual instability if the rate of change of link quality varies faster than the end-to-end control loop timing can resolve. I'd argue that intuitions of most protocol designers are weak here, because the state of the system as a whole is not best managed either at the link level or at the end-to-end "session" level - but at the whole network level. 
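[Editor's note: the point about codes that cross packet boundaries can be made concrete with the simplest possible cross-packet erasure code: one XOR parity packet per group of k data packets lets the receiver rebuild any single lost packet end to end, regardless of which link dropped it. Real erasure or digital fountain codes are far more capable; this is only a sketch of the idea, with hypothetical packet contents.]

```python
def xor_packets(packets):
    # Bytewise XOR of equal-length packets.
    out = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            out[i] ^= b
    return bytes(out)

def encode_group(data_packets):
    # Append one parity packet: the XOR of all data packets.
    return list(data_packets) + [xor_packets(data_packets)]

def recover(group, lost_index):
    # XOR of every surviving packet (parity included) rebuilds the lost one.
    return xor_packets([p for i, p in enumerate(group) if i != lost_index])

group = encode_group([b"pkt0", b"pkt1", b"pkt2"])
assert recover(group, 1) == b"pkt1"  # a lost data packet, rebuilt end to end
```

Spreading the parity over several paths with independent failures, as the message suggests, works the same way: the receiver only needs any k of the k+1 packets.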
RED and ECN are decentralized "network level" control strategies - which end up providing a control plane that is implicit among all those who share a common bottleneck link. Similarly, coding strategies that can deal with "corruption" require a "network level" implicit control, not an intuitive fix focused on the TCP state machine. From jishac at grc.nasa.gov Tue Jun 7 09:48:53 2005 From: jishac at grc.nasa.gov (Joseph Ishac) Date: Tue, 7 Jun 2005 12:48:53 -0400 Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: <200506062104.j56L4tKA067656@cougar.icir.org> References: <200506062104.j56L4tKA067656@cougar.icir.org> Message-ID: <20050607164853.GA11348@sunfire.grc.nasa.gov> So, I'm a bit curious about the timidness to continue sending at a high rate when faced with a large amount of corruption. If you are indeed not experiencing congestion, then why slow down? If you assume that corruption is mostly non-deterministic then slowing down wouldn't really help. So is it just fear that the signal is incorrect, and that you may be improperly reacting to what may be really congestion? If that were the case, then it's really the ambiguity/reliability of the signal that's of concern. (are there others?) -Joseph On Mon, Jun 06, 2005 at 02:04:55PM -0700, Sally Floyd wrote: > continuing to send at a high rate in the face of a high rate of > packet corruption doesn't sound too appealing. And there are From amer at cis.udel.edu Tue Jun 7 09:59:17 2005 From: amer at cis.udel.edu (Paul D. 
Amer) Date: Tue, 7 Jun 2005 12:59:17 -0400 Subject: [e2e] Corrected FTP over SCTP results Message-ID: <06a301c56b82$3e2e4f30$3e850480@AMERPC> In the 23rd IEEE International Performance, Computing, and Communications Conf (IPCCC 2004), Phoenix, 4/04, Sourabh Ladha and I published the paper ``Improving multiple file transfers using SCTP multistreaming.'' Unbeknownst to us, in our experiments the SCTP end-points incorrectly used Netbed's error-free, no-delay control connection for retransmissions thereby biasing the results in favor of SCTP. (Multihoming was taking place implicitly.) We have corrected this error, and re-run the entire set of experiments. The corrected results still show that SCTP with multistreaming and the use of command pipelining can provide significant reductions in the time to transfer multiple files, although less significant than previously published. The corrected results are available at http://www.cis.udel.edu/~amer/PEL/poc/pdf/IPCCC2004CORRECTED-FTP-over-SCTP-Natarajan-6-6-2005.pdf We sincerely regret our oversight. Paul D. Amer Sourabh Ladha From gaylord at dirtcheapemail.com Tue Jun 7 10:54:47 2005 From: gaylord at dirtcheapemail.com (Clark Gaylord) Date: Tue, 07 Jun 2005 13:54:47 -0400 Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: <20050607164853.GA11348@sunfire.grc.nasa.gov> References: <200506062104.j56L4tKA067656@cougar.icir.org> <20050607164853.GA11348@sunfire.grc.nasa.gov> Message-ID: <42A5DF67.6080107@dirtcheapemail.com> Joseph Ishac wrote: >So, I'm a bit curious about the timidness to continue sending at a high >rate when faced with a large amount of corruption. If you are indeed > > as example: corruption could be cell loss. cell loss could be due to congestion. 
--ckg From cannara at attglobal.net Tue Jun 7 11:11:23 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 07 Jun 2005 11:11:23 -0700 Subject: [e2e] a new IRTF group on Transport Models References: <20050603143833.043831FF@aland.bbn.com> <009b01c5686e$fcd47340$723bccc1@poliba.it> Message-ID: <42A5E34B.AB1C107E@attglobal.net> The archives for this list, over the past few years, should have many relevant emails on this general topic of Internet Dysfunction. To save space, I'll try to add to Ted's and Frank's msgs in this one email. 1) Since no effort (in shipped code) was ever made to distinguish the real cause of a missing TCP packet, it can be for any of the reasons we'd want to know (if congestion, or resource shortage, is occurring somewhere) and, it can be for reasons we can't do anything about, by slowing down, either at the network or transport layer. Using TCP as the Internet's congestion avoider then makes even less sense, because (apart from quench) no one ever planned to tell it, say from IP, that it should slow down for the right reason. An application slows down because the last buffer it gave to the transport hasn't been released yet, so it waits. A network-layer process slows down because network management has told it to and that mgmnt knows exactly why. Something as simple as an Ethernet driver that's encountering many collisions will tell the network layer above to stop, again by buffer management, or by explicit mgmnt msg, as when all 15 retries on a collision have failed. 2) Programmers for all manner of packet-handling devices (routers, bridges, NATs, firewalls...) send packets they can't handle to a drop queue and/or they use the incoming interface's ability to slow the sender (pause frames, not ready...). Frames that are dropped that way can indeed be classed as congestive losses along the path. Since the switching systems internally know where and when this happens, one has mgmnt info available. 
This info was never mapped into TCP/IP, whether datagram or reliable services were being affected. Yes, there's been talk of ECN, but it's not complete in its description of loss cause. So the packet-based Internet is shortchanged of info that can improve performance. And we all know that performance is needed for important functions, like mgmnt/security. 3) Hardware fails. When a switch backpressures a node to shut up, by faking a collision, that node indeed shuts up, but for exactly the period the switch needs to recover, unless either end has a failure. When a dedicated link on a path starts failing and errors occur, losses start out small, but TCP performance tanks, because TCP is ignorant of the reason for loss (i.e., not congestion, so keep going). So, one bad CRC in 100 pkts can bring today's TCP's performance down by more than a factor of 10 on common paths. 4) Default installs are often suboptimal, and not in subtle ways. Default timers, in particular, can easily hurt performance in TCP flows -- consider the Delayed Ack Timer, or the Max Send Window. This becomes more of an issue as network data rates increase, but files are passed in smallish blocks as before. See what impact a 100 ms Ack Timer value has on a large 100Mb/s file transfer, when each block fits in an odd number of TCP packets. 5) The problem is not just with TCP being made the Jedi of congestion management. It's a combination of many Internet design shortcomings that have long needed attention. Firewalls, IDSs and NAT boxes, for instance, would not be so needed if we indeed had modelled the Internet on secure, access-controlled systems and dealt sooner with its always-ancient addressing. One of the reasons packets can be lost in so many resource limited systems along a path is that no consideration of true security ever made it into the design. It's no accident that the military nets were kept physically separate. It's no accident that VPNs are de rigueur. 
6) There is no concept of source & release control in the Internet. That's why, for instance, folks like uSoft can ship an OS that gives its box a new IP address if the Ethernet card has its wire out for 10 secs, so any pkts on their way to it now get dropped. Or why a vendor allows Port-Based FTP inside some apps, even though its insane violation of 3 layers causes connection failure through NAT boxes, etc. 7) And, of course, lucky 7 -- access control and unique addressing. Maybe freedom from Big Brother was the original motivation for "anyone, anywhere with any IP address" (until we exhausted them), but now we have millions of junk packets every second delivered to us, with no way of tracking the actual sources and no way of preventing them from hogging links, unless we spend billions on all manner of security, anti-spam, yadda, yadda gear and software that has never been required for our other communication media. Packet nets depend on random holes, into which a sender can often inject a packet at will -- essentially "on demand", when loads are only modest. Synchronous nets always have bit slots moving, but allocate exactly what they can fill and no more. Mgmnt has always been as important as data in such nets. Not so with the Internet protocols. Same with access control & security. Imagine deploying a mail system around the world whose services are gained by sending "HELO" (or "ELHO", or...) in plain text to establish version and connection for something as important as private information passing. Years ago SMTP was a joke. Some poor kid even got nailed by the feds years ago for showing how stupidly designed it was, by executing code at a far system via a 'feature' built into the mail protocols. The problem with the Internet is that it is a mess and tweaking a protocol ain't gonna fix anything. 
The shame is that we all paid for it in taxes, we all ended up with it because of the market subsidy it has enjoyed, and we will continue to pay, some more than others, until someone is encouraged to rethink its premises. Right now, all that seems to happen is that folks with sensible ideas don't get anywhere, which is exactly the property of an effective bureaucracy -- one that lost sight of its original purpose and now just persists for its own benefit. My suggestion is simple, and I'd be happy if my taxes and contributions to alma maters helped the research effort to: a) take a small, dedicated, non-IETF group at a small school and charge them with addressing the basic problems in the Internet (access & address security, network path mgmnt, layer performance, inter-layer comm...); b) tell them no Internet compatibility is required above DLC and up to the app layer; c) have them implement a demonstration campus net (with a few routed remote sites) with only the new protocols installed; d) provide a gateway (in the true sense of the word) to the Internet; and e) establish a center for open-source control and release mgmnt. This would be a research effort that many could benefit from, many good master's theses would arise from and, like most small-group efforts, would result in a good product. Deployment from there would lead to further fundable research. The big problem for any such effort is the lack of the implicit subsidy given TCP/IP over the years by its free distribution & inclusion in OSs shipped by everyone from AT&T, Sun, HP, uSoft, Linux... So, the inclusion of the new stack in Linux would be an essential task. In other words, the competitive playing field would have to be levelled, as it never was for TCP/IP vs Novell, etc. Then, competitive results would speak for themselves, and we might pass through the Age of Spam and the Valley of the Shadow of Identity Theft more safely. 
Alex Saverio Mascolo wrote: > > > > > I was part of a team that looked at the particular problem of > distinguishing > > packet drop cause in detail recently. See, for instance, > > > > http://www.ir.bbn.com/documents/articles/krishnan_cn04.pdf > > > > You don't get as much leverage as you'd hope from knowing the cause of > > packet drops. > > also because the link layer should be well designed so that losses should be > due only (or mainly) to congestion. > > Saverio From dpreed at reed.com Tue Jun 7 12:02:32 2005 From: dpreed at reed.com (David P. Reed) Date: Tue, 07 Jun 2005 15:02:32 -0400 Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: <42A5DF67.6080107@dirtcheapemail.com> References: <200506062104.j56L4tKA067656@cougar.icir.org> <20050607164853.GA11348@sunfire.grc.nasa.gov> <42A5DF67.6080107@dirtcheapemail.com> Message-ID: <42A5EF48.8000505@reed.com> Clark Gaylord wrote: > > as example: corruption could be cell loss. cell loss could be due to > congestion. Interesting... by blurring the concept of "cell congestion" together with "router congestion" you end up with a confused description of the situation. TCP is based on a model where datagrams get through or they don't. That's the IP model. The engineering approach is to force all situations into those two cases. Enhancing TCP or replacing it would be possible if the possible outputs that result from a sequence of inputs could be characterized, and doable if the characterization has some predictable structure. If you try to design an end-to-end protocol where the inputs map to arbitrary outputs (any causal output stream-generating process that can be computed by a finite physical apparatus is as likely as any other), I doubt you will succeed. This is ultimately why we put "cell-based links" into a modular black box that doesn't try to expose the cell structure of a datagram to the endpoints. 
It's also why we normally use the "digital abstraction" instead of modeling digital computers as analog machines. One could do otherwise. One could even argue that it would be more "efficient" for some meaning of the term efficiency. From cannara at attglobal.net Tue Jun 7 12:19:50 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 07 Jun 2005 12:19:50 -0700 Subject: [e2e] Reacting to corruption based loss References: <200506062104.j56L4tKA067656@cougar.icir.org> <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> <42A5AD90.4060806@reed.com> Message-ID: <42A5F356.67CE3A7A@attglobal.net> I agree with many of these comments except that the first lines, ending with "So any key difference should relate to the first impact" are overstated a bit. When error drops occur, they often have nothing to do with loading. So, their net effect is fully dependent on the protocols involved, both those at the faulty link's ends and those at the far ends. As it is today, interfaces interior to the systems comprising links may or may not correct such physical errors, but when they are not corrected, the ends that use TCP suffer greatly, simply because TCP, perhaps even with ECN, knows not what to do except slow down. In other words, a lightly-loaded network path, with one bad Tx interior interface, can drop 1% of its bits/pkts and force the running TCP connection to slow down more than 10%, while leaving UDP, VPNs, etc. largely unaffected. This is one important place where the TCP/IP implementations we now have fall down on the job. Alex "David P. Reed" wrote: > > There are two effects that corruption causes. > > First, it lowers the end-to-end error-free capacity of the channel. > > Second, it causes congestion because it lowers the potential end-to-end > error-free rate of the channel. > > Clearly, the response of lowering the input rate is one possible way to > deal with the second phenomenon. 
However, this second effect is > indistinguishable from bottleneck congestion or overload congestion. > So, given that we have an effective way to deal with transient overload, > why would "corruption" need a new layered interface change. > > So any key difference should relate to the first impact. > > It is well known that there are good reasons to create codes that cross > packet boundaries. So-called erasure codes or digital fountain > techniques provide the ability on an end-to-end basis to deal with data > losses that are packet centric. If errors are "bursty" in time, > spreading any particular end-to-end bit across several packets (or even > across several paths with independent failures) is a good end-to-end > response to corruption. > > So the utility of separation of corruption from overload losses is to be > able to code better. Suppose a packet's header is salvageable but its > data is not (perhaps putting a code on the header, rather than a > checksum would help here!) Would it be helpful in improving the > effective end-to-end capability if decoded at the endpoint? Absolutely > - if there are priors that give you a reasonable error model. > > But the real question here is about coding a stream across a network > with packet corruption. It probably is better to look at the > end-to-end perspective, which includes such things as latency (spreading > a bit across successive packets adds latency when decoded at the > receiver) and control-loop latency (how fast can the endpoints change > coding of a stream to spread across more packets and more paths, > compared to a more local, rapid, link-level response). 
> > The observation that 802.11 slows rates automatically based on link > quality points out the issue here - such a local tactic improves all > end-to-end paths with one fell swoop, whereas there is the possibility > that end-to-end responses will be too slow, or else drive each other > into mutual instability if the rate of change of link quality varies > faster than the end-to-end control loop timing can resolve. > > I'd argue that intuitions of most protocol designers are weak here, > because the state of the system as a whole is not best managed either at > the link level or at the end-to-end "session" level - but at the whole > network level. RED and ECN are decentralized "network level" control > strategies - which end up providing a control plane that is implicit > among all those who share a common bottleneck link. Similarly, coding > strategies that can deal with "corruption" require a "network level" > implicit control, not an intuitive fix focused on the TCP state machine. From cannara at attglobal.net Tue Jun 7 12:35:08 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 07 Jun 2005 12:35:08 -0700 Subject: [e2e] Reacting to corruption based loss References: <200506062104.j56L4tKA067656@cougar.icir.org> <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> Message-ID: <42A5F6EC.1ECF2803@attglobal.net> Good, but corruption isn't just wireless related, which has to do with all manner of propagation complexities. It is as simple as a failing chip in an interface on a router, or in a service provider's line termination. In such cases, it makes no sense to slow down, and it would be better to send more pkts, if there's no congestion. This is just one reason why it's so important to let the network layer know whether congestion or error is causing loss at any link in a path. That info can then be used to manage all the flows on the path, not just TCP. 
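[Editor's note: a purely hypothetical sketch of the differentiated reaction the thread keeps circling around. No deployed TCP receives a per-loss "corruption vs. congestion" signal; assuming such a signal existed, a sender could keep the standard halving for congestion losses but apply the gentler 7/8 back-off floated earlier for corruption losses. The names and constants are illustrative, not any specified protocol.]

```python
# Hypothetical differentiated back-off: assumes a (nonexistent today)
# lower-layer signal telling the sender why a packet was lost.
CONGESTION, CORRUPTION = "congestion", "corruption"

def react_to_loss(cwnd, cause, min_cwnd=1.0):
    # Standard TCP halving for congestion; gentler 7/8 for corruption,
    # per the suggestion earlier in this thread.
    beta = 0.5 if cause == CONGESTION else 7 / 8
    return max(min_cwnd, cwnd * beta)

cwnd = 64.0
cwnd = react_to_loss(cwnd, CORRUPTION)   # 56.0
cwnd = react_to_loss(cwnd, CONGESTION)   # 28.0
```

Whether corruption should trigger any slow-down at all is exactly what Ishac questions above; the sketch only shows where such a signal would plug in, not that 7/8 is the right answer.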
And, for Jon, this has nothing to do with X.25, but may have lots to do with other protocol families, used for years by serious networking folks. Alex Michael Welzl wrote: > > Dear all, > > I changed the subject because it's drifting away from the > original topic... > > Sally Floyd wrote: > > > Addendum: > > I am personally very interested in the potential benefits (and > > complications) of more explicit communication between link layers > > and transport protocols, and in particular I like the idea of > > explicit communication from lower layers to transport saying "the > > packet with this packet header was dropped due to corruption". It > > is not clear to me yet exactly what the response of the transport > > protocol should be to reports of packet corruption, however - > > continuing to send at a high rate in the face of a high rate of > > packet corruption doesn't sound too appealing. And there are > > This point has been raised several times: how exactly should > a sender react to corruption? I fully agree that continuing > to send at a high rate isn't a good idea. > > Now, given that we're talking about a transport endpoint which > is informed about corruption, there probably isn't any knowledge > regarding the origin of corruption available - we don't know > what type of link layer caused it or why exactly it happened > (too many sources sending at the same time? user passed a wall?). > However, there seems to be some consensus that the reaction > to corruption should be less severe than the reaction to congestion. > > Also, it has been noted several times (and in several places) that > AIMD would work just as well if beta (the multiplicative decrease > factor) was, say, 7/8 instead of 1/2. For a historical reason (I > think it was the DECbit scheme), 7/8 seems to be a number > that survived in people's minds. > > So, why don't we just decide for a pragmatic approach instead > of waiting endlessly for a research solution that we can't come > up with? 
Why don't we simply state that the reaction to corruption > has to be: "reduce the rate by multiplying it with 7/8"? > > Much like the TCP reduction by half, it may not be the perfect > solution (Jacobson actually mentions that the reduction by half > is "almost certainly too large" in his congavoid paper), but > it could be a way to get us forward. > > ...or is there a reasonable research method that can help us > determine the ideal reaction to corruption, irrespective of > the cause? > > > complications with other forms of explicit communication between > > link layers and transport protocols as well, e.g., as discussed in > > the IAB internet-draft on "Architectural Implications of Link > > Indications", at > > "http://www.iab.org/documents/drafts/draft-iab-link-indications-01.txt". > > > > It might be that it is time also for a research group on Explicit > > Communication between Transport Protocols and Lower Layers, but > > someone else will have to start it. (I believe that there is already > > a proposal in the works for a new research group on new congestion > > control mechanisms, which includes one class of proposed new forms > > of explicit communication between transport protocols and routers.) > > I agree that this would be good to have. There have been many > proposals, and none of them appeared to work well enough > (*ouch*, I just hurt myself :-) ). Inter-layer communication > is a tricky issue, regardless of the direction in the stack. > Heck, we don't even have a reasonable way to signal "UDP-Lite > packet coming through" to a link layer! 
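The 7/8 proposal above is easy to state as code. What follows is a toy sketch, not anything agreed in the thread: an AIMD-style window update whose decrease factor depends on whether the reported loss was congestion or corruption. The function name and all constants are this illustration's assumptions.

```python
# Toy AIMD update with two multiplicative-decrease factors: the standard
# 1/2 for congestion loss, and the proposed 7/8 for corruption loss.
# All values here are illustrative assumptions, not part of any standard.

def aimd_step(cwnd, event, beta_congestion=0.5, beta_corruption=7 / 8):
    """Return the new congestion window (in segments) after one event."""
    if event == "ack":            # additive increase: +1 segment per RTT
        return cwnd + 1
    if event == "congestion":     # multiplicative decrease, standard TCP
        return max(1.0, cwnd * beta_congestion)
    if event == "corruption":     # gentler decrease for non-congestion loss
        return max(1.0, cwnd * beta_corruption)
    raise ValueError(f"unknown event: {event}")

def run(events, cwnd=10.0):
    """Fold a sequence of events into a final window size."""
    for e in events:
        cwnd = aimd_step(cwnd, e)
    return cwnd
```

With a window of 16 segments, a congestion loss cuts it to 8 while a corruption loss only cuts it to 14 - which is the whole of the pragmatic proposal above, stated as one branch of an `if`.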
> > Cheers, > Michael From cannara at attglobal.net Tue Jun 7 12:40:58 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 07 Jun 2005 12:40:58 -0700 Subject: [e2e] a new IRTF group on Transport Models References: <200506062104.j56L4tKA067656@cougar.icir.org> <20050607164853.GA11348@sunfire.grc.nasa.gov> <42A5DF67.6080107@dirtcheapemail.com> Message-ID: <42A5F84A.72EAAF3B@attglobal.net> The operative words are "could be", which is exactly why we need better stuff to get rid of "could". Alex Clark Gaylord wrote: > > Joseph Ishac wrote: > > >So, I'm a bit curious about the timidness to continue sending at a high > >rate when faced with a large amount of corruption. If you are indeed > > > > > > as example: corruption could be cell loss. cell loss could be due to > congestion. > > --ckg From ses at tipper.oit.unc.edu Tue Jun 7 15:35:30 2005 From: ses at tipper.oit.unc.edu (Simon Spero) Date: Tue, 07 Jun 2005 18:35:30 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: References: Message-ID: <42A62132.8060900@tipper.oit.unc.edu> Jon Crowcroft wrote: >and will lead to far less memory wastage in hosts running all >that complicated TCP protocol - they can just send web pages and video >and audio and so on as a sequence of IP packets >[...] >IP over TCP: way to go. > > Yeah right. What happens if one of the nodes on the path is unavailable? The data just gets dropped. That's completely unacceptable. You might think that the correct approach would be to layer IP over SMTP, to take advantage of well-defined store and forward semantics. You might think that, but you'd be wrong, and you'd be wrong for the most obvious reason possible. What part of "IP over SMTP" involves XML encodings and HTTP? None at all. Sheesh. Simon p.s. There's way too much network overbuild right now for it not to be sensible to waste much of it, but the key is to waste responsibly. 
What does it mean if the core has infinite bandwidth, such that if packet makes it way into the core it won't face a congestion drop till it reaches the other side? What happens when the most likely cause of packet loss becomes gremlin perverted BGP converge ? Where is end-to-end when the middle ground vanishes, and the world is split between the wired and connected, whose limits are unfathomable, and the mobile and the mote-ile, where every packet and every joule brings death a little closer? "So close - the infinitesimal and the infinite. But suddenly, I knew they were really the two ends of the same concept. The unbelievably small and the unbelievably vast eventually meet - like the closing of a gigantic circle. I looked up, as if somehow I would grasp the heavens. The universe, worlds beyond number, God's silver tapestry spread across the night." p.p.s. What? You were talking about Bernie Ebbers? Never mind. From lynne at telemuse.net Tue Jun 7 16:33:22 2005 From: lynne at telemuse.net (Lynne Jolitz) Date: Tue, 7 Jun 2005 16:33:22 -0700 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42A62132.8060900@tipper.oit.unc.edu> Message-ID: <003601c56bb9$4aed13a0$6e8944c6@telemuse.net> Right on, Simon! Packet drops *are* unacceptable. Period. The problem with reductionist thinking is that they think they can do it with the minimum number of parts so as to solve a minimum problem. So all we need is Ethernet - as long as it's a *perfect* Ethernet. And there you go, pointing out annoying facts like the world isn't perfect. :-) Lynne. > -----Original Message----- > From: end2end-interest-bounces at postel.org on Behalf Of Simon Spero > Sent: Tuesday, June 07, 2005 3:36 PM > >Jon Crowcroft wrote: > >and will lead to far less memory wastage in hosts runnign all > >that complicated TCP protcol - they can just send web pages and video > >and audio and so on as a sequence of IP packets > >[...] > >IP over TCP: way to go. > > > > > Yeah right. 
What happens if one if the nodes on the path is > unavailable? The data just gets dropped. That's completely unacceptable. > You might think that the correct approach would be to layer IP over > SMTP, to take advantage over well defined store and forward semantics. > You might think that, but you'ld be wrong, and you'ld be wrong for the > most obvious reason possible. What part of "IP over SMTP" involves XML > encodings and HTTP? None at all. Sheesh. > > Simon > p.s. > There's way too much network overbuild right now for it not to be > sensible to waste much of it, but the key is to waste responsibly. > > What does it mean if the core has infinite bandwidth, such that if > packet makes it way into the core it won't face a congestion drop till > it reaches the other side? > What happens when the most likely cause of packet loss becomes gremlin > perverted BGP converge ? > > Where is end-to-end when the middle ground vanishes, and the world is > split between the wired and connected, whose limits are unfathomable, > and the mobile and the mote-ile, where every packet and every joule > brings death a little closer? > > "So close - the infinitesimal and the infinite. But suddenly, I knew > they were really the two ends of the same concept. The unbelievably > small and the unbelievably vast eventually meet - like the closing of a > gigantic circle. I looked up, as if somehow I would grasp the heavens. > The universe, worlds beyond number, God's silver tapestry spread across > the night." > > p.p.s. > What? You were talking about Bernie Ebbers? Never mind. > > > ---- We use SpamQuiz. If your ISP didn't make the grade try http://lynne.telemuse.net From dpreed at reed.com Tue Jun 7 19:38:03 2005 From: dpreed at reed.com (David P. 
Reed) Date: Tue, 07 Jun 2005 22:38:03 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <003601c56bb9$4aed13a0$6e8944c6@telemuse.net> References: <003601c56bb9$4aed13a0$6e8944c6@telemuse.net> Message-ID: <42A65A0B.5040006@reed.com> I really think we missed the boat by not just proving all network components correct. Errors are really unacceptable, given modern mathematical proof techniques. Since Cannara believes that all erroneous packets can be reliably detected and signaled on the control plane, we are nearly there. Just put a theorem prover in each router, prove that the packet will be delivered, and you don't even have to put it on the output queue! A bonus question: if you have two cesium clocks on the ends of a link, they will tick simultaneously, so you should be able to send data without any risk of skew, right? And if you reduce the messages to single photons, you should NEVER have any errors, because photons are irreducible. So if we pursue reductionism to its limit, there should be no errors in our system at all. It's all "Internet Hooey" - the idea that congestion can't be prevented and corruption can't be detected are just foolish notions that SONET would never have to deal with. Cannara is right, the Internet is a completely idiotic idea, and the North American Numbering Plan was all we ever needed. :-) From Jon.Crowcroft at cl.cam.ac.uk Tue Jun 7 22:07:20 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Wed, 08 Jun 2005 06:07:20 +0100 Subject: [e2e] Reacting to corruption based loss In-Reply-To: Message from Simon Spero of "Tue, 07 Jun 2005 18:35:30 EDT." <42A62132.8060900@tipper.oit.unc.edu> Message-ID: In missive <42A62132.8060900 at tipper.oit.unc.edu>, Simon Spero typed: >>Jon Crowcroft wrote: >>>and will lead to far less memory wastage in hosts runnign all >>>that complicated TCP protcol - they can just send web pages and video >>>and audio and so on as a sequence of IP packets >>>[...] >>>IP over TCP: way to go. 
>>Yeah right. What happens if one if the nodes on the path is >>unavailable? The data just gets dropped. That's completely unacceptable. um - sorry - you have lost me - there's a TCP connection from each node to every other node - i can stripe the data how i like... >>You might think that the correct approach would be to layer IP over >>SMTP, to take advantage over well defined store and forward semantics. >>You might think that, but you'ld be wrong, and you'ld be wrong for the >>most obvious reason possible. What part of "IP over SMTP" involves XML >>encodings and HTTP? None at all. Sheesh. why yes- actually i was thinking that DNS lookups should be re-implemented in SOAP, then IP could be carried in DNS requests and responses... of course, some cross layer optimisation could be done as web pages that contain URLs would have the URLs encoded in XML so the lookups of IP addresses (and therefore TCP connections to them, in my architecture) would be done "on the fly" and could be "pipe lined" thus saving seconds, if not minutes, in the download times >>Simon >>p.s. >>There's way too much network overbuild right now for it not to be >>sensible to waste much of it, but the key is to waste responsibly. good point >>What does it mean if the core has infinite bandwidth, such that if >>packet makes it way into the core it won't face a congestion drop till >>it reaches the other side? >>What happens when the most likely cause of packet loss becomes gremlin >>perverted BGP converge ? BGP? who said we were using BGP? >>Where is end-to-end when the middle ground vanishes, and the world is >>split between the wired and connected, whose limits are unfathomable, >>and the mobile and the mote-ile, where every packet and every joule >>brings death a little closer? ah, the iMote in gods I, and the laser beam in your photonic core...yes >>"So close - the infinitesimal and the infinite. But suddenly, I knew >>they were really the two ends of the same concept. 
The unbelievably >>small and the unbelievably vast eventually meet - like the closing of a >>gigantic circle. I looked up, as if somehow I would grasp the heavens. >>The universe, worlds beyond number, God's silver tapestry spread across >>the night." indeed, this is a fair justification for the VLBI project >>p.p.s. >> What? You were talking about Bernie Ebbers? Never mind. who? cheers jon From rja at extremenetworks.com Wed Jun 8 05:31:24 2005 From: rja at extremenetworks.com (RJ Atkinson) Date: Wed, 8 Jun 2005 08:31:24 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42A65A0B.5040006@reed.com> References: <003601c56bb9$4aed13a0$6e8944c6@telemuse.net> <42A65A0B.5040006@reed.com> Message-ID: <5C12EB09-67A8-4733-817C-5DA8FABA164F@extremenetworks.com> On Jun 7, 2005, at 22:38, David P. Reed wrote: > I really think we missed the boat by not just proving all network > components correct. Errors are really unacceptable, given modern > mathematical proof techniques. > > Since Cannara believes that all erroneous packets can be reliably > detected and signaled on the control plane, we are nearly there. > Just put a theorem prover in each router, prove that the packet > will be delivered, and you don't even have to put it on the output > queue! Given the amount of router CPU that would be needed to deploy S-BGP, and the amount of energy being spent in some quarters pushing deployment of S-BGP onto the operators, adding a theorem prover on the router CPU would be easy. Good idea. 
:-) Ran From fkastenholz at comcast.net Wed Jun 8 05:49:57 2005 From: fkastenholz at comcast.net (frank@kastenholz.org) Date: Wed, 08 Jun 2005 12:49:57 +0000 Subject: [e2e] a new IRTF group on Transport Models Message-ID: <060820051249.24299.42A6E975000C949100005EEB220076370496040108020A9B9C0E0500@comcast.net> While all this chatter about certain actions TCP can or can not take and perfect nets with theorem provers in all the routers is as interesting and amusing as brain surgery, my original question stands: Is there any thought to identifying information that routers and end systems might provide that either can be fed back into the models to refine them or used in parallel to (in)validate them? Frank Kastenholz From cannara at attglobal.net Wed Jun 8 09:50:11 2005 From: cannara at attglobal.net (Cannara) Date: Wed, 08 Jun 2005 09:50:11 -0700 Subject: [e2e] Reacting to corruption based loss References: <003601c56bb9$4aed13a0$6e8944c6@telemuse.net> <42A65A0B.5040006@reed.com> Message-ID: <42A721C3.D59F601D@attglobal.net> It seems supercilliousness is the real solution, eh Reed? :] Alex "David P. Reed" wrote: > > I really think we missed the boat by not just proving all network > components correct. Errors are really unacceptable, given modern > mathematical proof techniques. > > Since Cannara believes that all erroneous packets can be reliably > detected and signaled on the control plane, we are nearly there. Just > put a theorem prover in each router, prove that the packet will be > delivered, and you don't even have to put it on the output queue! > > A bonus question: if you have two cesium clocks on the ends of a link, > they will tick simultaneously, so you should be able to send data > without any risk of skew, right? And if you reduce the messages to > single photons, you should NEVER have any errors, because photons are > irreducible. So if we pursue reductionism to its limit, there should > be no errors in our system at all. 
It's all "Internet Hooey" - the > idea that congestion can't be prevented and corruption can't be detected > are just foolish notions that SONET would never have to deal with. > Cannara is right, the Internet is a completely idiotic idea, and the > North American Numbering Plan was all we ever needed. > > :-) From d.leith at eee.strath.ac.uk Wed Jun 8 10:06:59 2005 From: d.leith at eee.strath.ac.uk (Douglas Leith) Date: Wed, 8 Jun 2005 18:06:59 +0100 Subject: [e2e] a new IRTF group on Transport Models Message-ID: Following up on Frank's question, one area where I suspect more data would help is in defining topologies to test TCP performance over. Most work to date has focussed on a dumbell topology. While this seems like a useful starting point, it would be good to have a better understanding of the range of end-to-end topologies experienced by TCP flows in practice. For example, it would be good to know what proportion of flows travel along paths where packets are queued at two or more hops (due to cross-traffic etc) and to better understand the character of such paths assuming they exist in appreciable numbers. This seems to require additional measurement information from that which is currently available - probing from the edge alone can probably only yield limited/ambiguous information on what's happening inside the network and so router information might help out a lot. 
Doug -----Original Message----- From: end2end-interest-bounces at postel.org on behalf of frank at kastenholz.org Sent: Wed 6/8/2005 1:49 PM To: end2end-interest at postel.org Subject: Re: [e2e] a new IRTF group on Transport Models While all this chatter about certain actions TCP can or can not take and perfect nets with theorem provers in all the routers is as interesting and amusing as brain surgery, my original question stands: Is there any thought to identifying information that routers and end systems might provide that either can be fed back into the models to refine them or used in parallel to (in)validate them? Frank Kastenholz -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.postel.org/pipermail/end2end-interest/attachments/20050608/cd4c7faf/attachment.html From ses at tipper.oit.unc.edu Wed Jun 8 10:11:16 2005 From: ses at tipper.oit.unc.edu (Simon Spero) Date: Wed, 08 Jun 2005 13:11:16 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: References: Message-ID: <42A726B4.9090405@tipper.oit.unc.edu> Jon Crowcroft wrote: > What? You were talking about Bernie Ebbers? Never mind. > >who? > > His full name is "Disgraced Former Worldcom CEO Bernie Ebbers" (one word). He's generally considered to be the leading authority on corruption based loss in networks. His techniques can be applied to the situation under discussion by using corruption based packet synthesis to improve overall throughput by generating plausible looking packets in response to upstream loss. Naturally, this technique can't be used for all traffic, as it becomes harder to tell what's really plausible. Fortunately existing approaches can provide for the selective control and regulation of even the highest levels of corruption; the Federal Elections Commission, or FEC, model has proven itself reliable in this environment. Simon // too soon? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.postel.org/pipermail/end2end-interest/attachments/20050608/f48320bd/attachment.html From farooq.bari at cingular.com Wed Jun 8 10:34:43 2005 From: farooq.bari at cingular.com (Bari, Farooq) Date: Wed, 8 Jun 2005 10:34:43 -0700 Subject: [e2e] a new IRTF group on Transport Models Message-ID: Don't we already have a feedback mechanism via the ECN bit to indicate a congested network? Re-marking of the DSCP may also be another indication of congestion. > -----Original Message----- > From: end2end-interest-bounces at postel.org [mailto:end2end-interest- > bounces at postel.org] On Behalf Of frank at kastenholz.org > Sent: Wednesday, June 08, 2005 5:50 AM > To: end2end-interest at postel.org > Subject: Re: [e2e] a new IRTF group on Transport Models > > While all this chatter about certain actions TCP can > or can not take and perfect nets with theorem > provers in all the routers is as interesting and > amusing as brain surgery, my original question stands: > > Is there any thought to identifying information > that routers and end systems might provide that > either can be fed back into the models to refine > them or used in parallel to (in)validate them? > > Frank Kastenholz > From cannara at attglobal.net Wed Jun 8 16:47:00 2005 From: cannara at attglobal.net (Cannara) Date: Wed, 08 Jun 2005 16:47:00 -0700 Subject: [e2e] a new IRTF group on Transport Models References: <200506062104.j56L4tKA067656@cougar.icir.org> Message-ID: <42A78374.D9CEFF7F@attglobal.net> Sally, I'll be happy to supply some example test traces of uncongested paths with small physical-error rates and common RTTs that together challenge present TCP performance. The addresses in the pkts should, of course, be anonymized. Alex Sally Floyd wrote: > > Frank - > > >Is there any thought to identifying information > >that routers and end systems might provide that > >either can be fed back into the models to refine > >them or used in parallel to (in)validate them? 
> > > >A simple example might be packet drops. If models > >assume that the only reason packets are dropped is > >overflowing queues due to congestion, that leads to > >certain conclusions, etc, and tweaking our transport > >protocols in a certain direction. But if it turns > >out that a significant percentage of packet drops > >is because of something else, then that conclusion > >would be incorrect... > > Certainly, any evaluation of transport protocols or of congestion > control mechanisms needs to take into account the difference between > packets dropped due to congestion and packets dropped due to some > other reason (e.g., corruption). In simulations, non-congestion-related > packet drops are only present when explicitly introduced, and in > test beds, I assume that there are mechanisms to detect and report > on the different types of packet drops. And in analysis, it is > important to consider the possibilities of non-congestion-related > packet losses. > > If you are asking if there has been any thought to proposing additions > to the protocol suite to report such information, the short answer > is that this is not intended to be the province of the Transport > Models Research Group (TMRG). The proposal is that TMRG itself > would *not* be proposing changes to transport protocols or to > explicit communication between transport protocols and lower layers, > but would *only* be concerned with improving our methodologies for > evaluating such proposals. E.g., with test suites for simulations > or for test beds, discussions of topologies and traffic generation, > of the models underlying our analysis, etc. 
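Sally's observation above - that in simulation, non-congestion-related drops exist only when explicitly introduced - can be made concrete with a toy loss model. The sketch below is purely illustrative (the queue discipline, the corruption probability, and all names are this example's assumptions): a drop-tail queue combined with an independent per-packet corruption process, where every drop is tagged with its cause so the two can be separated in later analysis.

```python
import random

# Illustrative sketch only: a drop-tail queue (one departure per time slot)
# plus an independent corruption process. Each lost packet is counted under
# the cause of its loss, which is exactly the bookkeeping an evaluation
# needs if it is to distinguish congestion loss from corruption loss.

def classify_drops(arrivals, queue_limit, corruption_prob, seed=0):
    """arrivals: per-slot packet counts. Returns drop counts by cause."""
    rng = random.Random(seed)
    backlog = 0
    drops = {"congestion": 0, "corruption": 0}
    for n in arrivals:
        for _ in range(n):
            if rng.random() < corruption_prob:
                drops["corruption"] += 1   # corrupted on the link
            elif backlog >= queue_limit:
                drops["congestion"] += 1   # queue overflow
            else:
                backlog += 1               # queued successfully
        backlog = max(0, backlog - 1)      # one departure per slot
    return drops
```

With `corruption_prob=0` every drop is a congestion drop, mirroring the default situation in a simulator; any corruption loss has to be introduced deliberately, and once it is, the cause tag is what keeps the analysis honest.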
> > - Sally > http://www.icir.org/floyd/ > > Addendum: > I am personally very interested in the potential benefits (and > complications) of more explicit communication between link layers > and transport protocols, and in particular I like the idea of > explicit communication from lower layers to transport saying "the > packet with this packet header was dropped due to corruption". It > is not clear to me yet exactly what the response of the transport > protocol should be to reports of packet corruption, however - > continuing to send at a high rate in the face of a high rate of > packet corruption doesn't sound too appealing. And there are > complications with other forms of explicit communication between > link layers and transport protocols as well, e.g., as discussed in > the IAB internet-draft on "Architectural Implications of Link > Indications", at > "http://www.iab.org/documents/drafts/draft-iab-link-indications-01.txt". > > It might be that it is time also for a research group on Explicit > Communication between Transport Protocols and Lower Layers, but > someone else will have to start it. (I believe that there is already > a proposal in the works for a new research group on new congestion > control mechanisms, which includes one class of proposed new forms > of explicit communication between transport protocols and routers.) From cannara at attglobal.net Wed Jun 8 17:06:27 2005 From: cannara at attglobal.net (Cannara) Date: Wed, 08 Jun 2005 17:06:27 -0700 Subject: [e2e] a new IRTF group on Transport Models References: <20050603123012.E812786AFF@mercury.lcs.mit.edu> Message-ID: <42A78803.A7A3555E@attglobal.net> Well Noel, I actually did mention Quench, but it's not exactly a vernier adjustment, now is it?... "The source quench message is a request to the host to cut back the rate at which it is sending traffic to the internet destination. The gateway may send a source quench message for every message that it discards. 
On receipt of a source quench message, the source host should cut back the rate at which it is sending traffic to the specified destination until it no longer receives source quench messages from the gateway. The source host can then gradually increase the rate at which it sends traffic to the destination until it again receives source quench messages. The gateway or host MAY send the source quench message when it approaches its capacity limit rather than waiting until the capacity is exceeded. This means that the data datagram which triggered the source quench message may be delivered." Plus, there are those who think it a poor idea anyway. It obviously has no concept of management that's aware of admission commitments, etc. Instead, it's a knee-jerk reaction, which requires long delays after the event, in terms of the quenched sources: "The source host can then gradually increase the rate at which it sends traffic." Hardly a congestion-management system, since the sources may remain slow far longer than needed. In other words, it's at best a stopgap/fallback last resort. I've never had anyone say I "know everything". Thanks! All I know is from experience in networking starting with Xerox Ethernet/XNS, Zilog Znet, the AMD Lance and telecom chip designs of '84, 3Com stuff, including TCP/IP, plus consulting with a few folks on networking for all types of LANs/WANs & protocols for the past 15 years or so. In fact, we consultants love TCP/IP and the Internet, because it provides such a good living solving problems for people at good hourly rates -- just did that today! Never could have made as much if Netware or Vines were still popular. :] The "coward" statement is just what any good shrink will explain to you -- a coward lives every day knowing that about him/her self, which is exactly how whatever God(s) one believes in deal with cowards. 
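Alex's complaint - that quenched sources "may remain slow far longer than needed" - follows directly from the quoted behaviour: a sharp cut on each quench followed by only gradual increase. A toy trace makes the asymmetry visible; the cut factor and recovery step below are this example's assumptions, since the quoted specification text fixes neither.

```python
# Hypothetical sketch of the source quench reaction quoted above: cut the
# send rate on each ICMP Source Quench, then "gradually increase" while no
# further quenches arrive. The halving and the +1-per-slot recovery are
# this illustration's assumptions, not anything the specification mandates.

def source_quench_trace(quench_slots, slots, rate=100.0,
                        cut=0.5, recovery_step=1.0):
    """Return the send rate after each time slot; quench_slots marks
    the slots in which an ICMP Source Quench is received."""
    rates = []
    for t in range(slots):
        if t in quench_slots:
            rate *= cut               # knee-jerk reduction on quench
        else:
            rate += recovery_step     # slow additive recovery
        rates.append(rate)
    return rates
```

A single quench halves the rate instantly, but at one unit per slot the sender needs fifty slots to get back to where it was - hence the "stopgap/fallback last resort" verdict above.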
Alex Noel Chiappa wrote: > > > From: Cannara > > > the Internet designers never considered anything but IP and a few hosts > > ... > > Admission & flow control at the network layer? What's that? > > ... > > Whether or not TCP is ever given accurate info to distinguish physical > > loss from true congestion matters little. > > Gee, Alex, since you seem to know everything (in addition to being smarter > than all the early Internet people put together), perhaps you can explain to > me what ICMP type 4 messages are supposed to do... > > (And, for the rest of you, does anyone know how CYCLADES handled congestion? > I have the CYCLADES book, so I could go look it up, but I was hoping someone > could save me the trouble.) > > > If there are such good souls ready to stand up and do what needs be > > done, I applaud you. Remember, courage men/women, God hates a coward. > > I would cheerfully say what I really think, but alas, I'm afraid the list > maintainers would likely chastise me (rightly) for ad hominem attacks. > > Noel From Jon.Crowcroft at cl.cam.ac.uk Thu Jun 9 00:10:41 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Thu, 09 Jun 2005 08:10:41 +0100 Subject: [e2e] Reacting to corruption based loss In-Reply-To: Message from Cannara of "Wed, 08 Jun 2005 09:50:11 PDT." <42A721C3.D59F601D@attglobal.net> Message-ID: actually, we have 2 pieces of work that make this entirely reasonable 1. my colleagues have a paper at SIGCOMM coming up about using higher order logic to prove TCP correct (including different implementations _and_ the socket layer) 2. one of our PhD students has written an SSHd and other non trivial protocols in ocaml, and thus can avail himself of various model checkers and automatic proof systems and (as it happens) his code has acceptable performance the decrying of good computer science methodology because it might be too slow or not able to cope with "real world" scale systems is simply OUT OF DATE. 
In missive <42A721C3.D59F601D at attglobal.net>, Cannara typed: >>It seems supercilliousness is the real solution, eh Reed? >>:] >> >>Alex >> >>"David P. Reed" wrote: >>> >>> I really think we missed the boat by not just proving all network >>> components correct. Errors are really unacceptable, given modern >>> mathematical proof techniques. >>> >>> Since Cannara believes that all erroneous packets can be reliably >>> detected and signaled on the control plane, we are nearly there. Just >>> put a theorem prover in each router, prove that the packet will be >>> delivered, and you don't even have to put it on the output queue! >>> >>> A bonus question: if you have two cesium clocks on the ends of a link, >>> they will tick simultaneously, so you should be able to send data >>> without any risk of skew, right? And if you reduce the messages to >>> single photons, you should NEVER have any errors, because photons are >>> irreducible. So if we pursue reductionism to its limit, there should >>> be no errors in our system at all. It's all "Internet Hooey" - the >>> idea that congestion can't be prevented and corruption can't be detected >>> are just foolish notions that SONET would never have to deal with. >>> Cannara is right, the Internet is a completely idiotic idea, and the >>> North American Numbering Plan was all we ever needed. >>> >>> :-) >> cheers jon From Jon.Crowcroft at cl.cam.ac.uk Thu Jun 9 00:23:50 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Thu, 09 Jun 2005 08:23:50 +0100 Subject: [e2e] Reacting to corruption based loss In-Reply-To: Message from Simon Spero of "Wed, 08 Jun 2005 13:11:16 EDT." <42A726B4.9090405@tipper.oit.unc.edu> Message-ID: ah, you mean that bernie Arrested and Really Prosecuted for a "Transfer Cash Pronto" ebbers becoming an "Unusually Disgraced Person" resulting in "I've Completely Mismanaged the Portfolio" time-to-live exceeded messages maybe... 
on being visited by the MIB, since it appeared that He Totally Transgressed Protocol and He Tilted Many Laws and He's an X Mr Largeing-it what an eDonkey... In missive <42A726B4.9090405 at tipper.oit.unc.edu>, Simon Spero typed: >>Jon Crowcroft wrote: >> >>> What? You were talking about Bernie Ebbers? Never mind. >>> >>>who? >>> >>> >>His full name is "Disgraced Former Worldcom CEO Benie Ebbers" (one >>word). He's generally considered to be the leading authority on >>corruption based loss in networks. >> >>His techniques can be applied to the situation under discussion by using >>corruption based packet synthesis to improve overall throughput by >>generating plausible looking packets in response to upstream loss. >>Naturally, this technique can't be used for all traffic, as it becomes >>harder to tell what's really plausible. >> >> Fortunately existing approaches can provide for the selective control >>and regulation of even the highest levels of corruption ; the >>Federal Elections Commission, or FEC, model has proven itself reliable >>in this environment. >> >>Simon // too soon? cheers jon From rik at rikwade.com Thu Jun 9 02:02:33 2005 From: rik at rikwade.com (Rik Wade) Date: Thu, 9 Jun 2005 21:02:33 +1200 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> References: <200506062104.j56L4tKA067656@cougar.icir.org> <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> Message-ID: <008B2849-255C-43A4-9130-5CA559B70CC4@rikwade.com> On 7/06/2005, at 7:22 PM, Michael Welzl wrote: > This point has been raised several times: how exactly should > a sender react to corruption? I fully agree that continuing > to send at a high rate isn't a good idea. > [...] > So, why don't we just decide for a pragmatic approach instead > of waiting endlessly for a research solution that we can't come > up with? Why don't we simply state that the reaction to corruption > has to be: "reduce the rate by multiplying it with 7/8"? > > Much like the TCP reduction by half, it may not be the perfect > solution (Jacobson actually mentions that the reduction by half > is "almost certainly too large" in his congavoid paper), but > it could be a way to get us forward. > > ...or is there a reasonable research method that can help us > determine the ideal reaction to corruption, irrespective of > the cause? I did some work on this as part of my PhD several years ago. A summary of the work was published as: R.Wade, M.Kara, P.M.Dew. "Proposed Modifications to TCP Congestion Control for High Bandwidth and Local Area Networks." Appeared in "Proceedings of the 6th IEEE Conference on Telecommunications (ICT'98)", July 1998. (Paper available for download from http://www.rikwade.com) At the time, I was working with 155Mb/s ATM and Fast Ethernet, and looking at the performance of TCP congestion avoidance algorithms over such networks. 
My thoughts were along the lines of those mentioned elsewhere in this thread - why should TCP make such a large reduction in its window size if loss was only due to a single ATM cell drop, or corruption elsewhere in the stack. The proposal in our paper was to maintain a weighted history of the congestion window size and to attempt to use this value when perceived loss was encountered. If the loss was a unique event, and the connection was long-lived, then restart would likely be close to the current transmission rate, and the connection could continue as normal. If recurrent loss was encountered, then the algorithm reverted to its normal mode of operation after three (for example) attempts. Various values for the history weighting were simulated in order to evaluate whether a more, or less, aggressive approach was better. I was quite happy with the results and it was a relatively simple modification to the Reno implementation in both Keshav's Real simulator and NS. -- rik wade From Jon.Crowcroft at cl.cam.ac.uk Thu Jun 9 05:43:39 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Thu, 09 Jun 2005 13:43:39 +0100 Subject: [e2e] Reacting to corruption based loss - refs on provable stuff In-Reply-To: Message from Jon Crowcroft of "Thu, 09 Jun 2005 08:10:41 BST." Message-ID: In missive , Jon Crowcroft typed: >>1. my colleagues have a paper at SIGCOMM coming up about using higher order logic see http://www.cl.cam.ac.uk/users/pes20/Netsem/index.html >> >>"David P. Reed" wrote: >> >>> I really think we missed the boat by not just proving all network >> >>> components correct. Errors are really unacceptable, given modern >> >>> mathematical proof techniques. see The Price of Safety in an Active Network D. Scott Alexander, Paul B. Menage, Angelos D. Keromytis,. William A. Arbaugh, Kostas G Anagnostakis, ... www.cis.upenn.edu/~switchware/papers/saneimp-jcn.pdf j. From dpreed at reed.com Wed Jun 8 21:33:07 2005 From: dpreed at reed.com (David P. 
Reed) Date: Thu, 09 Jun 2005 00:33:07 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42A721C3.D59F601D@attglobal.net> References: <003601c56bb9$4aed13a0$6e8944c6@telemuse.net> <42A65A0B.5040006@reed.com> <42A721C3.D59F601D@attglobal.net> Message-ID: <42A7C683.3020603@reed.com> Cannara wrote: >It seems supercilliousness is the real solution, eh Reed? >:] > > I was aiming at supersilliness. Weren't you? From cannara at attglobal.net Fri Jun 10 09:14:32 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 10 Jun 2005 09:14:32 -0700 Subject: [e2e] Reacting to corruption based loss References: Message-ID: <42A9BC68.92228FA9@attglobal.net> Having already read your msg that follows this one Jon, are controlled substances doing the talking here? :] Alex Jon Crowcroft wrote: > > actually, we have 2 pieces of work that make this entirely reasonable > > 1. my colleagues have a paper at SIGCOMM coming up about using higher order logic > to prove TCP correct (including different implementations _and_ the socket layer) > > 2. one of our PhD students has written an SSHd and other non trivial protocols in > ocaml, and thus can avail himself of various model checkers and automatic proof systems > and (as it happens) his code has acceptable performance > > the decrying of good computer science methodology because it might be too slow or not able to cope with > "real world" scale systems is simply OUT OF DATE. > > In missive <42A721C3.D59F601D at attglobal.net>, Cannara typed: > > >>It seems supercilliousness is the real solution, eh Reed? > >>:] > >> > >>Alex > >> > >>"David P. Reed" wrote: > >>> > >>> I really think we missed the boat by not just proving all network > >>> components correct. Errors are really unacceptable, given modern > >>> mathematical proof techniques. > >>> > >>> Since Cannara believes that all erroneous packets can be reliably > >>> detected and signaled on the control plane, we are nearly there.
Just > >>> put a theorem prover in each router, prove that the packet will be > >>> delivered, and you don't even have to put it on the output queue! > >>> > >>> A bonus question: if you have two cesium clocks on the ends of a link, > >>> they will tick simultaneously, so you should be able to send data > >>> without any risk of skew, right? And if you reduce the messages to > >>> single photons, you should NEVER have any errors, because photons are > >>> irreducible. So if we pursue reductionism to its limit, there should > >>> be no errors in our system at all. It's all "Internet Hooey" - the > >>> idea that congestion can't be prevented and corruption can't be detected > >>> are just foolish notions that SONET would never have to deal with. > >>> Cannara is right, the Internet is a completely idiotic idea, and the > >>> North American Numbering Plan was all we ever needed. > >>> > >>> :-) > >> > > cheers > > jon From cannara at attglobal.net Fri Jun 10 09:09:03 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 10 Jun 2005 09:09:03 -0700 Subject: [e2e] Reacting to corruption based loss References: <200506062104.j56L4tKA067656@cougar.icir.org> <1118128952.4771.23.camel@lap10-c703.uibk.ac.at> <008B2849-255C-43A4-9130-5CA559B70CC4@rikwade.com> Message-ID: <42A9BB1F.9AFFDCBE@attglobal.net> Note that one of the major storage-systems vendors has for some years used a modified TCP for the same reason -- improved fast-LAN performance. I don't recall seeing its loss behavior, but I did see it completely ignore receive windows! They were irrelevant to how the very fast end systems were designed to buffer I/O data. Alex Rik Wade wrote: > > On 7/06/2005, at 7:22 PM, Michael Welzl wrote: > > This point has been raised several times: how exactly should > > a sender react to corruption? I fully agree that continuing > > to send at a high rate isn't a good idea. > > [...] 
> > So, why don't we just decide for a pragmatic approach instead > > of waiting endlessly for a research solution that we can't come > > up with? Why don't we simply state that the reaction to corruption > > has to be: "reduce the rate by multiplying it with 7/8"? > > > > Much like the TCP reduction by half, it may not be the perfect > > solution (Jacobson actually mentions that the reduction by half > > is "almost certainly too large" in his congavoid paper), but > > it could be a way to get us forward. > > > > ...or is there a reasonable research method that can help us > > determine the ideal reaction to corruption, irrespective of > > the cause? > > I did some work on this as part of my PhD several years ago. A > summary of the work was published as: > > R.Wade, M.Kara, P.M.Dew. "Proposed Modifications to TCP Congestion > Control for High Bandwidth and Local Area Networks.". Appeared in > "Proceedings of the 6th IEEE Conference on Telecommunications > (ICT'98)", July 1998. > (Paper available for download from http://www.rikwade.com) > > At the time, I was working with 155Mb/s ATM and Fast Ethernet, and > looking at the performance of TCP congestion avoidance algorithms > over such networks. My thoughts were along the lines of those > mentioned elsewhere in this thread - why should TCP make such a large > reduction in its window size if loss was only due to a single ATM > cell drop, or corruption elsewhere in the stack. > > The proposal in our paper was to maintain a weighted history of the > congestion window size and to attempt to use this value when > perceived loss was encountered. If the loss was a unique event, and > the connection was long-lived, then restart would likely be close to > the current transmission rate, and the connection could continue as > normal. If recurrent loss was encountered, then the algorithm > reverted to its normal mode of operation after three (for example) > attempts. 
Various values for the history weighting were simulated in > order to evaluate whether a more, or less, aggressive approach was > better. > > I was quite happy with the results and it was a relatively simple > modification to the Reno implementation in both Keshav's Real > simulator and NS. > -- > rik wade From cannara at attglobal.net Fri Jun 10 12:16:17 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 10 Jun 2005 12:16:17 -0700 Subject: [e2e] Reacting to corruption based loss References: <003601c56bb9$4aed13a0$6e8944c6@telemuse.net> <42A65A0B.5040006@reed.com> <42A721C3.D59F601D@attglobal.net> <42A7C683.3020603@reed.com> Message-ID: <42A9E701.8F277685@attglobal.net> Hard to know the true mind of another. :] Alex "David P. Reed" wrote: > > Cannara wrote: > > >It seems supercilliousness is the real solution, eh Reed? > >:] > > > > > I was aiming at supersilliness. Weren't you? From cannara at attglobal.net Fri Jun 10 12:33:09 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 10 Jun 2005 12:33:09 -0700 Subject: [e2e] Reacting to corruption based loss References: <003601c56bb9$4aed13a0$6e8944c6@telemuse.net> <42A65A0B.5040006@reed.com> <5C12EB09-67A8-4733-817C-5DA8FABA164F@extremenetworks.com> Message-ID: <42A9EAF5.498600E5@attglobal.net> Every program is a theorem, and TCP implementations are perfect, so that's not needed. Alex RJ Atkinson wrote: > > On Jun 7, 2005, at 22:38, David P. Reed wrote: > > I really think we missed the boat by not just proving all network > > components correct. Errors are really unacceptable, given modern > > mathematical proof techniques. > > > > Since Cannara believes that all erroneous packets can be reliably > > detected and signaled on the control plane, we are nearly there. > > Just put a theorem prover in each router, prove that the packet > > will be delivered, and you don't even have to put it on the output > > queue! 
> > Given the amount of router CPU that would be needed to deploy > S-BGP, and the amount of energy being spent in some quarters pushing > deployment of S-BGP onto the operators, adding a theorem prover on the > router CPU would be easy. Good idea. > > :-) > > Ran From cannara at attglobal.net Fri Jun 10 12:31:27 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 10 Jun 2005 12:31:27 -0700 Subject: [e2e] a new IRTF group on Transport Models References: Message-ID: <42A9EA8F.39B8727@attglobal.net> Anyone who is, or uses, a service provider will certainly have flows over complex, crossing paths with multiple queues. Haven't TCP simulation tests taken this into account? Can it be true they haven't? Just a few weeks ago, a client had just such poor TCP performance through a particular provider, and, lo & behold, it was simply small pkt loss interior to the ISP's net, due to errors. Then began the tedious process of proving to Idiots.net that it was indeed their box. Even corporate nets have a variety of crossing points with a variety of link types & speeds, and a variety of office/site usage schedules. All together, these can create variable path characteristics, even within a second. Distinguishing congestion from error becomes extremely important when, for example, the sales team is at a large customer and accessing important presentation data at the home office in real time. Or maybe when your doctor is remotely monitoring your heart condition and getting ready to do a robotic bypass as the robotic injection he just gave you sends you drifting off...oops, TCP slowed down! :] Alex > Douglas Leith wrote: > > Following up on Frank's question, one area where I suspect more data would > help is in defining topologies to test TCP performance over. > > Most work to date has focussed on a dumbbell topology.
While this seems like > a useful starting point, it would be good to have a better understanding of > the range of end-to-end topologies experienced by TCP flows in practice. > For example, it would be good to know what proportion of flows travel along > paths where packets are queued at two or more hops (due to cross-traffic > etc) and to better understand the character of such paths assuming they > exist in appreciable numbers. This seems to require additional measurement > information from that which is currently available - probing from the edge > alone can probably only yield limited/ambiguous information on what's > happening inside the network and so router information might help out a lot. > > Doug > > -----Original Message----- > From: end2end-interest-bounces at postel.org on behalf of frank at kastenholz.org > Sent: Wed 6/8/2005 1:49 PM > To: end2end-interest at postel.org > Subject: Re: [e2e] a new IRTF group on Transport Models > > While all this chatter about certain actions TCP can > or can not take and perfect nets with theorem > provers in all the routers is as interesting and > amusing as brain surgery, my original question stands: > > Is there any thought to identifying information > that routers and end systems might provide that > either can be fed back into the models to refine > them or used in parallel to (in)validate them? > > Frank Kastenholz From faber at ISI.EDU Fri Jun 10 14:35:06 2005 From: faber at ISI.EDU (Ted Faber) Date: Fri, 10 Jun 2005 14:35:06 -0700 Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: <42A9EA8F.39B8727@attglobal.net> References: <42A9EA8F.39B8727@attglobal.net> Message-ID: <20050610213506.GJ83512@pun.isi.edu> On Fri, Jun 10, 2005 at 12:31:27PM -0700, Cannara wrote: > Or maybe when your doctor is remotely monitoring your heart condition and > getting ready to do a robotic bypass as the robotic injection he just gave you > sends you drifting off...oops, TCP slowed down! 
> :] And setting a new World and Olympic Record in the "straight line for the 'Doctor it hurts when I do that' joke" event, Alex Cannara. :-) :-) :-) (http://www.catb.org/~esr/jargon/html/D/Don-t-do-that-then-.html if that's confusing) -- Ted Faber http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050610/39ccceb4/attachment.bin From cannara at attglobal.net Sat Jun 11 16:57:44 2005 From: cannara at attglobal.net (Cannara) Date: Sat, 11 Jun 2005 16:57:44 -0700 Subject: [e2e] a new IRTF group on Transport Models References: <42A9EA8F.39B8727@attglobal.net> <20050610213506.GJ83512@pun.isi.edu> Message-ID: <42AB7A78.59A397EE@attglobal.net> Cool Ted! And now maybe I've learned why all the Apple Lisas on the show network, when they were introduced in SF, locked up when I accidentally brushed my palm across one's keyboard, while reaching for a brochure! Fortunately, Jobs was right there to chew out their tech crew, when all that could get things working again was a full power cycle. Hey, maybe we could arrange the same fate for TCP as the Lisa had? Oops, I didn't say that. :] Alex Ted Faber wrote: > > On Fri, Jun 10, 2005 at 12:31:27PM -0700, Cannara wrote: > > Or maybe when your doctor is remotely monitoring your heart condition and > > getting ready to do a robotic bypass as the robotic injection he just gave you > > sends you drifting off...oops, TCP slowed down! > > :] > > And setting a new World and Olympic Record in the "straight line for the > 'Doctor it hurts when I do that' joke" event, Alex Cannara.
> > :-) :-) :-) > > (http://www.catb.org/~esr/jargon/html/D/Don-t-do-that-then-.html if > that's confusing) > > -- > Ted Faber > http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc > Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG From braden at ISI.EDU Sun Jun 12 17:21:20 2005 From: braden at ISI.EDU (Bob Braden) Date: Sun, 12 Jun 2005 17:21:20 -0700 Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: <42AB7A78.59A397EE@attglobal.net> References: <42A9EA8F.39B8727@attglobal.net> <20050610213506.GJ83512@pun.isi.edu> Message-ID: <5.1.0.14.2.20050612171614.00a8bc20@boreas.isi.edu> At 04:57 PM 6/11/2005 -0700, Cannara wrote: >Cool Ted! And now maybe I've learned why all the Apple Lisas on the show >network, when they were introduced in SF, locked up when I accidentally >brushed my palm across one's keyboard, while reaching for a brochure! >Fortunately, Jobs was right there to chew out their tech crew, when all that >could get things working again was a full power cycle. > >Hey, maybe we could arrange the same fate for TCP as the Lisa had? Oops, I >didn't say that. :] > >Alex Alex, You are lapsing into adolescent mode (again). If you have a real, well-thought-out proposal for architecture and protocols to replace the Internet protocols, please send us a pointer to your paper on the subject. A technical discussion of an alternate architectural approach would be a good use of this list. Simply hurling innuendos against TCP is boring, unproductive, and unacceptable. Cynicism is not a substitute for technical depth. BTW, don't forget to tell the ACM that they made a terrible mistake in giving the Turing Award to Bob Kahn and Vint Cerf, in San Francisco last evening.
Bob Braden From cannara at attglobal.net Mon Jun 13 11:35:46 2005 From: cannara at attglobal.net (Cannara) Date: Mon, 13 Jun 2005 11:35:46 -0700 Subject: [e2e] a new IRTF group on Transport Models References: <42A9EA8F.39B8727@attglobal.net> <20050610213506.GJ83512@pun.isi.edu> <5.1.0.14.2.20050612171614.00a8bc20@boreas.isi.edu> Message-ID: <42ADD202.50BFE3D8@attglobal.net> Bob, as you and anyone can tell from the archives on this list, I and others have made positive suggestions for specific changes to both TCP and IP to improve matters. However, even those submitting good papers have met with varying degrees of stonewalling. So, the TCP/IP bureaucracy now functions as does any that has long lost its enthusiastic founders. Indeed Bob Kahn deserves an award, since it was he who assured continued funding when folks in Washington were getting tired. Vint Cerf surely deserves something too, since he made honest comments lamenting the failure to continue protocol research, rather than stop essentially where we are. So, my comments are intended to encourage movement beyond the existence proof that today's Internet protocols are, into something more professionally developed and maturing. It's worth noting that the biggest movers on Internet, both in terms of traffic and users, are based on non-IETF developments. Alex Bob Braden wrote: > > At 04:57 PM 6/11/2005 -0700, Cannara wrote: > >Cool Ted! And now maybe I've learned why all the Apple Lisas on the show > >network, when they were introduced in SF, locked up when I accidentally > >brushed my palm across one's keyboard, while reaching for a brochure! > >Fortunately, Jobs was right there to chew out their tech crew, when all that > >could get things working again was a full power cycle. > > > >Hey, maybe we could arrange the same fate for TCP as the Lisa had? Oops, I > >didn't say that. :] > > > >Alex > > Alex, > > You are lapsing into adolescent mode (again).
If you have a real, > well-thought-out > proposal for architecture and protocols to replace the Internet protocols, > please > send us a pointer to your paper on the subject. A technical discussion of an > alternate architectural approach would be a good use of this list. Simply > hurling > innuendos against TCP is boring, unproductive, and unacceptable. Cynicism > is not a > substitute for technical depth. > > BTW, don't forget to tell the ACM that they made a terrible mistake in > giving the > Turing Award to Bob Kahn and Vint Cerf, in San Francisco last evening. > > Bob Braden From lynne at telemuse.net Mon Jun 13 11:58:36 2005 From: lynne at telemuse.net (Lynne Jolitz) Date: Mon, 13 Jun 2005 11:58:36 -0700 Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: <42ADD202.50BFE3D8@attglobal.net> Message-ID: <001401c57049$e758b0c0$6e8944c6@telemuse.net> I'm sure we are all very pleased that Vint and Bob have been honored with the Turing Award this year. They both deserve it - their work has changed our world! Congratulations Vint and Bob! Lynne. ---- We use SpamQuiz. If your ISP didn't make the grade try http://lynne.telemuse.net > -----Original Message----- > From: end2end-interest-bounces at postel.org > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Cannara > Sent: Monday, June 13, 2005 11:36 AM > To: end2end-interest at postel.org > Subject: Re: [e2e] a new IRTF group on Transport Models > > > Bob, as you and anyone can tell from the archives on this list, I > and others > have made positive suggestions for specific changes to both TCP and IP to > improve matters. However, even those submitting good papers have met with > varying degrees of stonewalling. So, the TCP/IP bureaucracy now > functions as > does any that has long lost its enthusiastic founders. > > Indeed Bob Kahn deserves an award, since it was he who assured continued > funding when folks in Washington were getting tired.
Vint Cerf surely > deserves something too, since he made honest comments lamenting > the failure to > continue protocol research, rather than stop essentially where > we are. So, > my comments are intended to encourage movement beyond the > existence proof that > today's Internet protocols are, into something more > professionally developed > and maturing. It's worth noting that the biggest movers on > Internet, both in > terms of traffic and users, are based on non-IETF developments. > > Alex > > Bob Braden wrote: > > > > At 04:57 PM 6/11/2005 -0700, Cannara wrote: > > >Cool Ted! And now maybe I've learned why all the Apple > Lisas on the show > > >network, when they were introduced in SF, locked up when I accidentally > > >brushed my palm across one's keyboard, while reaching for a brochure! > > >Fortunately, Jobs was right there to chew out their tech crew, > when all that > > >could get things working again was a full power cycle. > > > > > >Hey, maybe we could arrange the same fate for TCP as the Lisa > had? Oops, I > > >didn't say that. :] > > > > > >Alex > > > > Alex, > > > > You are lapsing into adolescent mode (again). If you have a real, > > well-thought-out > > proposal for architecture and protocols to replace the Internet > protocols, > > please > > send us a pointer to your paper on the subject. A technical > discussion of an > > alternate architectural approach would be a good use of this > list. Simply > > hurling > > innuendos against TCP is boring, unproductive, and > unacceptable. Cynicism > > is not a > > substitute for technical depth. > > > > BTW, don't forget to tell the ACM that they made a terrible mistake in > > giving the > > Turing Award to Bob Kahn and Vint Cerf, in San Francisco last evening.
> > > > Bob Braden From braden at ISI.EDU Mon Jun 13 12:14:05 2005 From: braden at ISI.EDU (Bob Braden) Date: Mon, 13 Jun 2005 12:14:05 -0700 (PDT) Subject: [e2e] a new IRTF group on Transport Models Message-ID: <200506131914.MAA17274@gra.isi.edu> *> version=2.64 *> *> Bob, as you and anyone can tell from the archives on this list, I and others *> have made positive suggestions for specific changes to both TCP and IP to *> improve matters. Alex, I am not concerned about others. I am sorry, I must have missed your "positive suggestions for specific changes to both TCP and IP to improve matters". Would you mind briefly summarizing them for the others on this list? Thank you. Bob Braden From cannara at attglobal.net Mon Jun 13 13:13:02 2005 From: cannara at attglobal.net (Cannara) Date: Mon, 13 Jun 2005 13:13:02 -0700 Subject: [e2e] a new IRTF group on Transport Models References: <200506131914.MAA17274@gra.isi.edu> Message-ID: <42ADE8CE.794D30B5@attglobal.net> Bob, I outlined specific, general steps to take last week. Did you not see that? Over the last few years, I outlined specific additions to current TCP to, for instance, allow it to make fewer mistakes about why packets may be delayed, missing or repeated unnecessarily. These suggestions were meant to allow anyone to conduct a research project without making large modifications to existing protocol stacks. I also listed default stack settings that are often problematic, which speaks to the need for release control. If you've been on this list regularly, you should know what those issues are. If you've forgotten, I'll try to resurrect the emails. But, the biggest TCP/IP issue is why your email here, today, came sandwiched between: "Can You Last 36 Hours", or "Burn Any Movie onto DVD", yadda, yadda. That's called admission control, and is wholly thwarted by the still-inane IP addressing system and the lack of uniqueness of names as well as addresses.
If you want to start somewhere to improve TCP and other performance, then you need to start at the beginning -- packets entering the net. If you don't, then diddling with TCP (or any other) is mouse nuts (to use an old 3Com adjective :). Alex Bob Braden wrote: > > *> version=2.64 > *> > *> Bob, as you and anyone can tell from the archives on this list, I and others > *> have made positive suggestions for specific changes to both TCP and IP to > *> improve matters. > > Alex, > > I am not concerned about others. I am sorry, I must have missed your > "positive suggestions for specific changes to both TCP and IP to improve > matters". Would you mind briefly summarizing them for the others on > this list? > > Thank you. > > Bob Braden From fu at cs.uni-goettingen.de Wed Jun 15 11:24:14 2005 From: fu at cs.uni-goettingen.de (Xiaoming Fu) Date: Wed, 15 Jun 2005 20:24:14 +0200 Subject: [e2e] Reacting to corruption based loss In-Reply-To: References: Message-ID: <42B0724E.1010906@cs.uni-goettingen.de> Jon, Jon Crowcroft wrote: > >>>and will lead to far less memory wastage in hosts running all > >>>that complicated TCP protocol - they can just send web pages and video > >>>and audio and so on as a sequence of IP packets > >>>[...] > >>>IP over TCP: way to go. > >>Yeah right. What happens if one of the nodes on the path is > >>unavailable? The data just gets dropped. That's completely unacceptable. > > um - sorry - you have lost me - there's a TCP connection from each node to every other node - i can stripe the data > how i like... Your vision of "creating a TCP mesh among (all) first/last hops" sounds interesting. Yeah TCP may run well in the face of packet loss and path dynamics. A question is whether (and if yes, how) an endhost should react to congestion indicated by TCP in its first hop? Can this avoid re-introducing flow/rate control and fragmentation functions in the endhost?
Xiaoming From Jon.Crowcroft at cl.cam.ac.uk Thu Jun 16 01:14:20 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Thu, 16 Jun 2005 09:14:20 +0100 Subject: [e2e] Reacting to corruption based loss In-Reply-To: Message from Xiaoming Fu of "Wed, 15 Jun 2005 20:24:14 +0200." <42B0724E.1010906@cs.uni-goettingen.de> Message-ID: well, it wasn't an entirely serious proposal - more a gedanken experiment but yes, if you want to think of it this way - the idea is that the mesh of TCP connections is an overlay (like RON, but 100% instead of partial), and provides direct information about the capacity pairwise between all edge points in the net, irrespective of the topology in the interdomain routes. but if the first (or last) hop is congested - e.g. it's a wireless hop like GPRS or UMTS and shared, then you would want to run TCP to the first router - this is no new idea either - TCP slice or TCP splice approaches have been around a while (not sure which came first - i think the berkeley folks probably did the first wide area wireless things with two tcp connections glommed together at the impedance mismatched border between the host/edge, and edge/core router... of course fundamentally such a thing has been around for a long time since we had dial up with compressed TCP/IP headers and state restoration, but the de-coupling of the reliability, rate and congestion management at this point was a further step in the evolution of end-to-end if one had edge-to-edge implemented as a mesh of TCP (and remember it ain't going really to scale that well), you could probably do some _very_ simple DOS and flash crowd mitigation trivially too so thinking about the scaling, is a million or even 10^9 connections actually mad?
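[As an editorial aside, the "10^9 connections" scaling question above can be roughed out numerically. Every constant here (the per-TCPCB size, the node count derived from it) is an assumption chosen for illustration, not a figure from the thread:]

```python
from math import ceil, sqrt

# Assumed constants, for illustration only (not figures from the thread):
TCPCB_BYTES = 400           # rough size of one TCP control block
TARGET_CONNECTIONS = 10**9  # the "10^9 connections" figure mentioned above

# A full mesh over N edge nodes needs N*(N-1)/2 connections,
# so solve N*(N-1)/2 = TARGET_CONNECTIONS for N.
n_nodes = ceil((1 + sqrt(1 + 8 * TARGET_CONNECTIONS)) / 2)

per_node_mb = (n_nodes - 1) * TCPCB_BYTES / 2**20    # state one node holds
total_gb = TARGET_CONNECTIONS * TCPCB_BYTES / 2**30  # state across the mesh

print(n_nodes)      # ~45k edge nodes already yield 10^9 pairwise connections
print(per_node_mb)  # ~17 MB of TCPCB state per node, before any sharing tricks
```

[At these assumed sizes the per-node memory cost looks modest, which suggests the harder parts of the scaling question lie elsewhere: connection churn, keepalive/probing traffic, and the shared-state data-structure tricks the message goes on to muse about.]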
there are probably some nice data structure tricks one could do to reduce the memory cost, and share redundant connection state, and maybe keep a lot of it in the same structure if it's "similar" enough - basically, using shared memory like shared library for the TCPCBs while they have the same values, and with copy-on-write when you change more than some delta... hmmm - quite a fun research project for someone....what is the standard deviation of the values in fields in the TCPCB data structure distribution, over space and time? In missive <42B0724E.1010906 at cs.uni-goettingen.de>, Xiaoming Fu typed: >>Jon, >> >>Jon Crowcroft wrote: >>> >>>and will lead to far less memory wastage in hosts running all >>> >>>that complicated TCP protocol - they can just send web pages and video >>> >>>and audio and so on as a sequence of IP packets >>> >>>[...] >>> >>>IP over TCP: way to go. >>> >>> >>Yeah right. What happens if one of the nodes on the path is >>> >>unavailable? The data just gets dropped. That's completely unacceptable. >>> >>> um - sorry - you have lost me - there's a TCP connection from each node to every other node - i can stripe the data >>> how i like... >> >>Your vision of "creating a TCP mesh among (all) first/last hops" sounds >>interesting. Yeah TCP may run well in the face of packet loss and path >>dynamics. A question is whether (and if yes, how) an endhost should >>react to congestion indicated by TCP in its first hop? Can this avoid >>re-introducing flow/rate control and fragmentation functions in the >>endhost? >>Xiaoming cheers jon From mbgreen at dsl.cis.upenn.edu Fri Jun 17 06:29:28 2005 From: mbgreen at dsl.cis.upenn.edu (Michael B Greenwald) Date: Fri, 17 Jun 2005 09:29:28 -0400 (EDT) Subject: [e2e] a new IRTF group on Transport Models In-Reply-To: Your message of "Wed, 08 Jun 2005 18:06:59 BST."
Message-ID: <200506171329.j5HDTS0p003760@codex.cis.upenn.edu> To answer one specific question Doug raises below: as of three years ago an appreciable number of flows traveled along paths where packets were queued at two or more hops. We studied this issue in a paper (``On the Sensitivity of Network Simulation to Topology'', K. G. Anagnostakis, M. Greenwald, R. Ryger. Proceedings of MASCOTS 2002, Postscript: ftp://ftp.cis.upenn.edu/pub/mbgreen/papers/mascots02.ps.gz, or PDF: http://www.cis.upenn.edu/~anagnost/papers/mbottle.pdf) a couple of years ago. (We did coarse probing of 38k paths, but the paper focused on ~2k paths). We were very conservative in deciding that a path experienced multiple congestion points, so the number was an underestimate at that time. Our main question was whether such multi-congestion-point paths occurred in significant numbers --- the answer was "yes", even our conservative lower bound represented a measurable fraction of congested paths. (Because things have changed in 3 years, and the number was an underestimate even then, our exact numbers are not germane at this point [although the curious should feel free to look at the paper.]) Wed, 8 Jun 2005 18:06:59 +0100 "Douglas Leith" Following up on Frank's question, one area where I suspect more data would help is in defining topologies to test TCP performance over. Most work to date has focussed on a dumbbell topology. While this seems like a useful starting point, it would be good to have a better understanding of the range of end-to-end topologies experienced by TCP flows in practice. For example, it would be good to know what proportion of flows travel along paths where packets are queued at two or more hops (due to cross-traffic etc) and to better understand the character of such paths assuming they exist in appreciable numbers.
This seems to require additional measurement information beyond what is currently available - probing from the edge alone can probably only yield limited/ambiguous information on what's happening inside the network, and so router information might help out a lot. Doug -----Original Message----- From: end2end-interest-bounces at postel.org on behalf of frank at kastenholz.org Sent: Wed 6/8/2005 1:49 PM To: end2end-interest at postel.org Subject: Re: [e2e] a new IRTF group on Transport Models While all this chatter about certain actions TCP can or can not take and perfect nets with theorem provers in all the routers is as interesting and amusing as brain surgery, my original question stands: Is there any thought to identifying information that routers and end systems might provide that either can be fed back into the models to refine them or used in parallel to (in)validate them? Frank Kastenholz From detlef.bosau at web.de Sat Jun 25 15:40:52 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Sun, 26 Jun 2005 00:40:52 +0200 Subject: [e2e] Reacting to corruption based loss Message-ID: <42BDDD74.BF9FDB92@web.de> Michael Welzl wrote: > This point has been raised several times: how exactly should > a sender react to corruption? I fully agree that continuing > to send at a high rate isn't a good idea. Please excuse me if my comments seem to be a little bit stupid. I'm new to this list and it's hard to keep all threads and discussions in mind. So, I apologize if I carry coals to Newcastle here. And hopefully, my comments are not too stupid ;-) Basically, we're talking about the old loss differentiation debate. If packet loss is due to corruption, e.g. on a wireless link, there is not necessarily a need for the sender to decrease its rate. Perhaps one could do some FEC or use robust codecs, depending on the application in use. But I do not see a reason for a sender to decrease its rate anyway.
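The FEC option mentioned above can be illustrated with a minimal sketch. This is not any particular link layer's scheme, just the simplest possible erasure code (hypothetical function names, illustration only): one XOR parity packet per group of equal-sized data packets lets the receiver rebuild any single lost packet without a retransmission.

```python
# Minimal XOR-parity FEC sketch (illustrative only, not a real RLP/FEC scheme).
# One parity packet protects a group of k equal-length data packets: any
# single loss within the group can be reconstructed without retransmission.

def make_parity(packets):
    """XOR all data packets together into one parity packet."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            parity[i] ^= b
    return bytes(parity)

def recover(survivors, parity):
    """Rebuild the single missing packet by XORing survivors into the parity."""
    missing = bytearray(parity)
    for pkt in survivors:
        for i, b in enumerate(pkt):
            missing[i] ^= b
    return bytes(missing)
```

The cost is one extra packet per group and the restriction to a single loss per group; real radio-link FEC uses stronger codes, but the recovery-without-retransmission idea is the same.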
Loss differentiation seems to be a naughty issue. Missing packets are really misbehaved: they do not indicate whether they are lost or dropped. In my opinion, and I'm willing to receive contradiction on this point, it is a matter of the Internet system model. Why couldn't we continue to assume loss free links? Is this really a violation of the End to End Principle when we introduce link layer recovery? Or is it simply a well-done separation of concerns to fix link issues at the link layer and to leave transport issues to the transport layer? > > Now, given that we're talking about a transport endpoint which > is informed about corruption, there probably isn't any knowledge > regarding the origin of corruption available - we don't know Oh these misbehaved packets... They pass dozens of links and pass away on one....without leaving even a note.. Or a farewell letter... > what type of link layer caused it or why exactly it happened > (too many sources sending at the same time? user passed a wall?). > However, there seems to be some consensus that the reaction > to corruption should be less severe than the reaction to congestion. > > Also, it has been noted several times (and in several places) that > AIMD would work just as well if beta (the multiplicative decrease > factor) was, say, 7/8 instead of 1/2. For a historical reason (I Hm. In my opinion, it will work with 99/100 as well. In fact, I believe it will work with an arbitrary beta chosen from (0..1). Please note: not [0..1], because 0 and 1 will miss the desired behavior. In my opinion, a choice near 0 will lead to a faster convergence to fairness and a choice near 1 will lead to better link usage. So, in my opinion it is essentially a tradeoff between convergence speed and link usage. Basically, I'm not convinced that 1/2 was really that bad a choice. Even more: for AIMD to work properly in TCP, beta should be the same in all TCP stacks.
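The tradeoff described above is easy to see in a toy model. The sketch below is a deliberately simplified simulation (not a TCP implementation, and the capacity/round numbers are invented): two synchronized AIMD flows share one bottleneck, additive increase leaves the window difference unchanged, and each multiplicative decrease shrinks that difference by a factor of beta — so any beta in (0..1) converges to fairness, with smaller beta converging faster but leaving the link emptier after each backoff.

```python
# Toy AIMD model: two flows share a link of capacity C. Each round both
# windows grow by 1 (additive increase); whenever the sum exceeds C, both
# are multiplied by beta (synchronized multiplicative decrease). The
# window *difference* is untouched by the increases and multiplied by
# beta at every backoff, so it decays geometrically for any beta in (0,1).

def aimd(w1, w2, capacity, beta, rounds):
    for _ in range(rounds):
        w1 += 1.0
        w2 += 1.0
        if w1 + w2 > capacity:
            w1 *= beta
            w2 *= beta
    return w1, w2

# Start far apart; after many rounds the two shares are essentially equal.
a, b = aimd(100.0, 1.0, 120.0, 0.5, 2000)
```

Rerunning with beta = 0.99 shows the other side of the tradeoff: the link stays fuller after each backoff, but the initial 99-unit gap takes far more backoff events to melt away.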
I don't think we want to have dozens of betas floating around the Internet for the next decade or so. > think it was the DECbit scheme), 7/8 seems to be a number > that survived in people's minds. > > So, why don't we just decide for a pragmatic approach instead > of waiting endlessly for a research solution that we can't come > up with? Why don't we simply state that the reaction to corruption > has to be: "reduce the rate by multiplying it with 7/8"? So, why do we want to fix something that isn't yet broken? > > Much like the TCP reduction by half, it may not be the perfect > solution (Jacobson actually mentions that the reduction by half > is "almost certainly too large" in his congavoid paper), but > it could be a way to get us forward. > > ...or is there a reasonable research method that can help us > determine the ideal reaction to corruption, irrespective of > the cause? > Excuse me, but as far as I remember, VJ talks about _congestion_ there. Not about corruption. Consider some VoIP application. Consider running this on a lossy wireless channel. (Don't ask for a justification, I don't see any, but I know dozens of people who believe there is one.) Consider a reasonable corruption rate due to noise. Consider an appropriate noise-tolerant speech codec. Why should we react to this corruption? Would the noise become any less as a result? On the other hand: if some congestion drop would indicate: "Please be so kind not to usurp the whole channel. There are other flows interested in network resources as well." Wouldn't it be appropriate to politely adapt one's resource occupation? I don't know whether you refer to congestion _drop_ and corruption _loss_ as well when you talk about corruption. Personally I always use the terms drop and loss to make the difference perfectly clear. I sincerely think that these are two different issues which should be dealt with differently. And for the moment, I still believe the easiest way is to make loss rates negligible. E.g.
by use of Link Layer Recovery and PEP if applicable. However, I do not know the majority position on this issue here. > > I agree that this would be good to have. There have been many > proposals, and none of them appeared to work well enough > (*ouch*, I just hurt myself :-) ). Inter-layer communication > is a tricky issue, regardless of the direction in the stack. > Heck, we don't even have a reasonable way to signal "UDP-Lite > packet coming through" to a link layer! "Inter-layer communication" is a very big word. However, for a very small problem, perhaps some people might be interested in the little proposal on my home page http://www.detlef-bosau.de I know this is a really special issue and perhaps I'm still at the very beginning of this. However, I tend to take the position that we should first try to exploit well-proven mechanisms rather than raise new issues (or continue everlasting ones). And the drop vs. loss debate is not a trivial one. There are hundreds of papers around dealing with this topic, and I don't see them coming to an end. O.k. This was my first post to this list, and now I'm ready to take a beating, contradiction.... Whatever you prefer :-) DB -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From craig at aland.bbn.com Sat Jun 25 17:13:18 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Sat, 25 Jun 2005 20:13:18 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: Your message of "Sun, 26 Jun 2005 00:40:52 +0200." <42BDDD74.BF9FDB92@web.de> Message-ID: <20050626001318.CB7A424D@aland.bbn.com> In message <42BDDD74.BF9FDB92 at web.de>, Detlef Bosau writes: >Basically, we're talking about the old loss differentiation debate. If >packet loss is due to corruption, e.g. on a wireless link, there is not >necessarily a need for the sender to decrease its rate.
Perhaps one >could do some FEC or use robust codecs, depending on the application in >use. But I do not see a reason for a sender to decrease its rate >anyway. I believe that's the general wisdom. Though I'm not sure anyone has studied whether burst losses might cause synchronized retransmissions. >In my opinion, and I'm willing to receive contradiction on this point, >it is a matter of the Internet system model. Why couldn't we continue >to assume loss free links? Is this really a violation of the End to End >Principle when we introduce link layer recovery? Or is it simply a well >done separation of concerns to fix link issues at the link layer and to >leave transport issues to the transport layer? Take a peek at Reiner Ludwig's (RWTH Aachen) dissertation which says that, in the extreme case, link layer recovery and end-to-end recovery don't mix -- and we know from the E2E principle that some E2E recovery must be present. (I believe there's also work from Uppsala showing that you need at least some link layer recovery or TCP performance is awful -- what this suggests is we're searching for a balance point). Craig From detlef.bosau at web.de Sun Jun 26 06:16:56 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Sun, 26 Jun 2005 15:16:56 +0200 Subject: [e2e] Reacting to corruption based loss References: <20050626001318.CB7A424D@aland.bbn.com> Message-ID: <42BEAAC8.F0FCB714@web.de> Craig Partridge wrote: > > > >In my opinion, and I'm willing to receive contradiction on this point, > >it is a matter of the Internet system model. Why couldn't we continue > >to assume loss free links? Is this really a violation of the End to End > >Principle when we introduce link layer recovery? Or is it simply a well > >done separation of concerns to fix link issues at the link layer and to > >leave transport issues to the transport layer?
> > Take a peek at Reiner Ludwig's (RWTH Aachen) dissertation which says that, > in the extreme case, link layer recovery and end-to-end recovery don't mix -- > and we know from the E2E principle that some E2E recovery must be > present. (I believe there's also work from Uppsala showing that you > need at least some link layer recovery or TCP performance is awful -- > what this suggests is we're searching for a balance point). I totally agree here. My concern is the loss differentiation debate in wireless networks, particularly in mobile wide area networks (GSM, 3G...). (There are quite a few wireless networks where packet corruption is extremely rare or negligible. Think of properly set up WaveLANs for example. In my opinion, there is no urgent need to treat them differently from wired networks.) However, in mobile networks, it often makes sense to use local recovery schemes and / or performance enhancing proxies. My own work deals with exactly such situations. (And no, I do _not_ attempt to propose yet another PEP %-)). Nevertheless, I got some strong criticism because PEP, connection splitting etc. would violate the E2E principle, violate TCP semantics etc. Of course, each of these objections deserves most careful consideration. E.g. a PEP which breaks TCP E2E reliability like the original I-TCP approach can hardly be used. However, even with connection splitting, TCP E2E reliability can be maintained, following Rajiv Chakravorty, Sachin Katti, Jon Crowcroft, and Ian Pratt. "Flow Aggregation for Enhanced TCP over Wide Area Wireless.", INFOCOM 2003. Basically, if we consider 3G networks for access networks only, the alternatives are: First: PEP and connection splitting must not be used, hence there is a need for loss differentiation to react appropriately. Second: In case of severe loss or other "non TCP compliant behaviour" in access networks, it makes sense to use PEP, connection splitting etc. to make corruption loss negligible from the E2E point of view.
Of course, great care must be taken to have _all_ E2E semantics properly maintained. In case of the second alternative, it may be necessary to obey the particular needs/E2E semantics of the transport protocol / application in use. I'm just curious whether there is a general consensus on this matter. DB -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From cannara at attglobal.net Sun Jun 26 10:41:43 2005 From: cannara at attglobal.net (Cannara) Date: Sun, 26 Jun 2005 10:41:43 -0700 Subject: [e2e] Reacting to corruption based loss References: <20050626001318.CB7A424D@aland.bbn.com> Message-ID: <42BEE8D7.D2FFC9E6@attglobal.net> As Craig suggests, a "balance point" is needed. But consider the key statements: "Please be so kind not to usurp the whole channel. There are other flows interested in network resources as well." Note the phrases: "other flows", "network resources". Since not all flows are TCP and not all TCPs are the same, nor are they configured the same way, this phrase makes clear the need for more Internet network-layer flow management. This is exactly what happens within metro networks that distribute many flows of many types through many interfaces. Now to the question: "Why couldn't we continue to assume loss free links?" Because they don't exist with any long-term certainty and depend greatly on link technology. So, something independent of one particular transport (i.e., TCP) needs to consider what to do for Internet losses that aren't due to congestion. This entity needs to do its work for all types of transport, even something like ICMP, that is intended to be e2e. We know clearly how poorly TCP responds to loss, in ignorance of its cause. We know as engineers this is not a worthy design for a generally-useful network. This behavior is one reason TCP is not used in various networks for which loss must be handled more efficiently.
So, we come back to the core meaning of the e2e principle -- assure ends know what each is doing and get data reliably & efficiently transferred. That isn't: "managing network congestion". If managing network congestion were the ends' task, then the ends would need far more accurate and complete info than simply a timeout waiting for an ack to a packet that got lost, or a timeout retransmit when an ack got lost. The network layer is where loads of info exists on why and where packets get lost. That info needs to be used in the Internet, as it is in other component nets. It would also be useful for the next layer up to get some of that info, so whatever responses to loss are available are chosen well. Alex Craig Partridge wrote: > > In message <42BDDD74.BF9FDB92 at web.de>, Detlef Bosau writes: > > >Basically, we're talking about the old loss differentiation debate. If > >packet loss is due to corruption, e.g. on a wireless link, there is not > >necessarily a need for the sender to decrease its rate. Perhaps one > >could do some FEC or use robust codecs, depending on the application in > >use. But I do not see a reason for a sender to decrease its rate > >anyway. > > I believe that's the general wisdom. Though I'm not sure anyone has > studied whether burst losses might cause synchronized retransmissions. > > >In my opinion, and I'm willing to receive contradiction on this point, > >it is a matter of the Internet system model. Why couldn't we continue > >to assume loss free links? Is this really a violation of the End to End > >Principle when we introduce link layer recovery? Or is it simply a well > >done separation of concerns to fix link issues at the link layer and to > >leave transport issues to the transport layer? > > Take a peek at Reiner Ludwig's (RWTH Aachen) dissertation which says that, > in the extreme case, link layer recovery and end-to-end recovery don't mix -- > and we know from the E2E principle that some E2E recovery must be > present.
(I believe there's also work from Uppsala showing that you > need at least some link layer recovery or TCP performance is awful -- > what this suggests is we're searching for a balance point). > > Craig From detlef.bosau at web.de Sun Jun 26 15:07:52 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 27 Jun 2005 00:07:52 +0200 Subject: [e2e] Reacting to corruption based loss References: <20050626001318.CB7A424D@aland.bbn.com> <42BEE8D7.D2FFC9E6@attglobal.net> Message-ID: <42BF2738.AD082D57@web.de> Cannara wrote: > > > Now to the question: "Why couldn't we continue to assume loss free links?" O.k., this remark was intentionally somewhat teasing ;-) And of course it was thought in a "TCP manner", i.e. the assumption that all losses are due to congestion and therefore are treated like congestion notifications. I'm personally interested in wide area mobile wireless networks ("mobile networks") used as access networks to the Internet. In this model, the Internet _core_ consists of wirebound networks and mobile networks are used as _access_ networks only. For this particular case I advocate the use of PEP to make loss differentiation for TCP unnecessary. So, correctly I should say: in the particular case of using mobile wireless networks as access networks to the Internet, we can compensate corruption based loss at the access network by the use of PEP/RLP/... to that degree that a TCP sender may continue to recognize any loss as being due to congestion. Of course, not all protocols are TCP. And of course different protocols may require different strategies. However, I'm not quite sure whether my position meets the general consensus. I've received both strong support and strong criticism when I talked about this to colleagues.
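For readers unfamiliar with connection splitting, the bare structure can be sketched in a few lines. This is only an illustration of the splitting itself, not a PEP: a real proxy would add local retransmission, ACK handling, and flow control on the wireless leg, whereas this sketch (hypothetical function names, Python for convenience) merely terminates the client's TCP connection and opens a second, independent TCP connection toward the server.

```python
# Split-connection relay sketch (illustrative, NOT a performance enhancing
# proxy): one TCP connection is terminated locally, a second one is opened
# toward the server, and bytes are copied between them. Loss on either leg
# is then recovered by that leg's own TCP, independently of the other.
import socket
import threading

def pipe(src, dst):
    """Copy bytes one way until the source side finishes sending."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    try:
        dst.shutdown(socket.SHUT_WR)  # propagate end-of-stream
    except OSError:
        pass

def split_relay(listening_sock, server_addr):
    """Accept one client and splice it to a fresh connection to server_addr."""
    client, _ = listening_sock.accept()               # leg 1: "wireless" side
    upstream = socket.create_connection(server_addr)  # leg 2: wired side
    t = threading.Thread(target=pipe, args=(client, upstream))
    t.start()
    pipe(upstream, client)
    t.join()
    client.close()
    upstream.close()
```

The point of the structure is visible even in this toy: the sender on leg 2 never sees leg 1's losses, which is exactly why critics say splitting touches E2E semantics, and why care is needed to preserve them.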
DB -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From cannara at attglobal.net Sun Jun 26 21:32:10 2005 From: cannara at attglobal.net (Cannara) Date: Sun, 26 Jun 2005 21:32:10 -0700 Subject: [e2e] Reacting to corruption based loss References: <20050626001318.CB7A424D@aland.bbn.com> <42BEE8D7.D2FFC9E6@attglobal.net> <42BF2738.AD082D57@web.de> Message-ID: <42BF814A.C4948522@attglobal.net> Ok Detlef, then the phrase: "to that degree that a TCP sender may continue to recognize any loss as being due to congestion" remains the problem, since TCP designers assumed loss was due to congestion, thus often slowing down unnecessarily and, even when loss is due to congestion, often too much. The problem we have now is that TCP's "congestion control" was never more than a band-aid, added in the '80s, to protect the burgeoning Internet from the dreaded "meltdown", as predicted by Metcalfe & others. The near meltdowns scared Inet folks mightily. As a kludge, making TCP try to manage congestion was never followed up with protocol research and development to properly sense and distribute congestion info. Thus, any loss is considered congestion, and a typical TCP will be brought to its knees by a few % packet losses that are simply due to hardware errors. Alex Detlef Bosau wrote: > > Cannara wrote: > > > > > > Now to the question: "Why couldn't we continue to assume loss free links?" > > O.k., this remark was intentionally somewhat teasing ;-) > > And of course it was thought in a "TCP manner", i.e. the assumption that > all losses are due to congestion and therefore are treated like > congestion notifications. > > I'm personally interested in wide area mobile wireless networks ("mobile > networks") used as access networks to the Internet. In this model, > the Internet _core_ consists of wirebound networks and mobile networks > are used as _access_ networks only.
> > For this particular case I advocate the use of PEP to make loss > differentiation for TCP unnecessary. > > So, correctly I should say: in the particular case of using mobile > wireless networks as access networks to the Internet, we can compensate > corruption based loss at the access network by the use of PEP/RLP/... to > that degree that a TCP sender may continue to recognize any loss as > being due to congestion. > > Of course, not all protocols are TCP. And of course different protocols > may require different strategies. > > However, I'm not quite sure whether my position meets the general > consensus. I've received both strong support and strong criticism when I > talked about > this to colleagues. > > DB > -- > Detlef Bosau > Galileistrasse 30 > 70565 Stuttgart > Mail: detlef.bosau at web.de > Web: http://www.detlef-bosau.de > Mobile: +49 172 681 9937 From sampad_m at rediffmail.com Mon Jun 27 07:35:08 2005 From: sampad_m at rediffmail.com (sampad mishra) Date: 27 Jun 2005 14:35:08 -0000 Subject: [e2e] Reacting to corruption based loss Message-ID: <20050627143508.5843.qmail@webmail30.rediffmail.com> Hi all, Though not an expert, I have something to say... As Craig Partridge wrote: > > In message <42BDDD74.BF9FDB92 at web.de>, Detlef Bosau writes: > > >Basically, we're talking about the old loss differentiation debate. If > >packet loss is due to corruption, e.g. on a wireless link, there is not > >necessarily a need for the sender to decrease its rate. Perhaps one > >could do some FEC or use robust codecs, depending on the application in > >use. But I do not see a reason for a sender to decrease its rate > >anyway. As we know, packet loss (due to corruption) is significant in wireless and is mainly due to fading when there is a shift from one AP to another. I think it would be better if the sender decreases the rate rather than sending data at the same rate, considering the fact that data will be lost...
I mean there is no point in losing more data by continuing to send at the same rate when we know that packets are getting corrupted. So I think it's not much of a worry if the sender reacts to corruption like congestion in some cases, at least for the time being, until we find the reason for the corruption and react accordingly.... The main point is to find the reason for the corruption, then react accordingly. Anyway, the link layer also has its own recovery mechanisms... TCP fast retransmit can also sort this problem out in some sense (not entirely sure)... Kindly make me aware if there are some mistakes in my thoughts... :) Regards, Sampad Mishra, Project Assistant, IISc, India. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.postel.org/pipermail/end2end-interest/attachments/20050627/d16fd8bf/attachment.html From tapankarwa at yahoo.com Mon Jun 27 12:02:12 2005 From: tapankarwa at yahoo.com (Tapan Karwa) Date: Mon, 27 Jun 2005 12:02:12 -0700 (PDT) Subject: [e2e] Receiving RST on a MD5 TCP connection. In-Reply-To: <41DC8B17.7070502@isi.edu> Message-ID: <20050627190212.52306.qmail@web53703.mail.yahoo.com> Hi, I was going through RFC 2385 - Protection of BGP Sessions via the TCP MD5 Signature Option In Section 4.1, it mentions "Similarly, resets generated by a TCP in response to segments sent on a stale connection will also be ignored. Operationally this can be a problem since resets help BGP recover quickly from peer crashes." This can easily happen in the following scenario: XX is talking to YY and both are using MD5. YY suddenly reboots but XX does not know about it yet. XX sends the next segment to YY with the MD5 digest but YY does not recognize it and hence sends a RST. Of course this RST segment does not have the MD5 digest. Even when XX receives the RST, it won't/can't close the connection since it will trash the packet as it does not have the MD5 digest. I was wondering if there is any solution to this problem.
Will it be correct to accept the RST even if the MD5 digest is missing? If we do that, can that open doors for some other attacks? Thanks, tapan. ____________________________________________________ Yahoo! Sports Rekindle the Rivalries. Sign up for Fantasy Football http://football.fantasysports.yahoo.com From detlef.bosau at web.de Mon Jun 27 12:23:37 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 27 Jun 2005 21:23:37 +0200 Subject: [e2e] Reacting to corruption based loss, and some remarks on PTE References: <20050627143508.5843.qmail@webmail30.rediffmail.com> Message-ID: <42C05239.4C780A40@web.de> sampad mishra wrote: > As we know, packet loss (due to corruption) is significant in wireless > and is mainly due to fading when there is a shift from one AP to another. > I think it would be better if the sender decreases the rate rather > than sending data at the same rate, considering the fact that data will > be lost... That's the rationale behind e.g. M-TCP (Brown, Singh, 1997). IIRC, Brown and Singh slow down a TCP sender in the Internet which sends data to a mobile receiver by shrinking the receiver's advertised window (AWND) in handover periods. > I mean there is no point in losing more data by continuing to > send at the same rate when we know that packets are getting > corrupted. To my understanding, there are two issues. First, a sender has to recover properly from corruption when packets get lost. In mobile networks (UMTS, GPRS), loss rates in a channel without any kind of local recovery may become so high that there would be hardly any throughput at all if error recovery were done at the packet level. Unfortunately, mobile networks suffer from high packet corruption rates not only in case of roaming but for a number of other reasons as well (shading, multipath fading,...). Please note: I put the emphasis on _packet_ level here.
Therefore, local recovery in mobile networks is not done on a packet level; typically, mobile networks split up IP packets into smaller portions, so-called "radio blocks", which may have a size of e.g. 171 bits. Please note: this is only one value I've found somewhere in the literature. AFAIK, Manfred Taferner (Vienna) has worked out the pros and cons of different radio block sizes in his PhD thesis. Typically, mobile networks use a Radio Link Protocol (RLP) which basically consists of a combination of an Automatic Repeat reQuest (ARQ) protocol and Forward Error Correction (FEC). Because bit corruption typically occurs in a bursty manner in mobile networks, RLP typically employs block/packet interleaving mechanisms in order to keep the FEC overhead reasonably small. Second, and that's the focus of the loss differentiation work, a sender who experiences packet loss decreases its congestion window (CWND). It is the particular focus of Brown and Singh that a TCP sender shall _maintain_ its CWND even in the presence of transient disconnection / rate decrease in case of roaming. Now, my very concern is that these two issues are often mixed up. Particularly, we have to carefully distinguish between 802.11 networks (e.g. WLAN) and mobile networks here. Particularly in WLAN, the problem sometimes appears to me a little exaggerated. When those networks are properly set up and are being run as designed, today's WLAN installations achieve BERs of less than 10^-6, which is comparable to early Ethernet installations. So, in my opinion (and I'm very interested in any comments on this one), in WLAN we can simply leave the whole matter alone, shut our eyes and forget about the missing wire. I know that WLAN channels may become noisy and distorted e.g. in the presence of ferroconcrete walls etc. But I said: "properly set up and ... being run as designed."
And since we all know that ferroconcrete interferes with wireless communication, we can place access points and antennas on each side of the wall and interconnect them using e.g. Ethernet. Likewise, we should obey velocity constraints etc. There are numerous problems which may occur in WLANs, but we must carefully distinguish whether these are structural problems of TCP or simply consequences of inappropriate / wrong network operation. (Or, as an allusion to a pretty well-known phrase by AST: Never overestimate the range of a motor car running out of petrol.) In my humble opinion (and as usual, I'm willing to take punishment), in networks suffering from heavy loss, the problem is not TCP congestion control but TCP retransmission. Of course, both problems coexist. However: if you consider a 3G mobile network _without_ any RLP, you would quite likely experience packet loss rates of 50 % or even much higher. And in that situation, the congestion control problem is the mouse. And the time and bandwidth spent on retransmission are the elephant. Even more precisely: the retransmission problem is not really due to TCP but due to the size of IP packets. This is the very reason why these are broken down into pieces on mobile links. > So I think it's not much of a worry if the sender reacts to corruption > like congestion in some cases, at least for the time being, until we > find the reason for the corruption and react accordingly.... The main point is Now, the problem is that in case of frequent corruption-based loss, the sender might nearly stop sending. However, I think we must be careful to put the focus on the correct problem. Simply put: 1: "WaveLAN is fixed LAN". (even more AST: The metric ton of salt.) 2: In mobile links, corruption loss can be neglected due to the use of RLP.
So, in mobile networks, problems for TCP are reduced to (following Reiner Ludwig's PhD dissertation) - spurious timeouts, - transient disconnections, - scheduling problems. With respect to this thread's subject this means: in mobile networks, corruption-based losses are made to (nearly) disappear, and so does the problem. However, the other side of the mountain is how a mobile link employing an RLP appears to an Internet sender. One example, as you point out, may be a decrease of bandwidth (or more precisely: achievable throughput) in case of handover. This may require the sender to slow down. In M-TCP this is achieved by shrinking AWND, which may result in a number of difficulties. E.g. shrinking / clamping AWND conflicts with the original AWND semantics: to my understanding, the purpose of AWND is mainly to slow down the sender if the receiver's _application_ is not able / willing to accept any data from the receiver's socket. (E.g. if the stone-age Netscape release on my even older desktop PC collapses when it is flooded by my DSL line ;-) ) Another problem is that it's not an easy task to find out the appropriate AWND which is necessary to have the sender send at the correct rate. It may even be difficult to decide whether clamping is necessary at all. Imagine a situation where a flow is already restricted to a rate the mobile link can carry by some bottleneck in the wired part of the connection. For these reasons I propose the PTE mechanism described in http://www.detlef-bosau.de/043.pdf . The key idea is to make a mobile link appear like an ordinary wirebound link, the storage capacity and bandwidth of which reflect the properties of the mobile link. The key idea immediately follows the congavoid paper by VJ and MK: a network is essentially described by its storage capacity and its bottleneck rate, and a TCP sender is correctly paced by the receiver's ACK packets when the flow is in "equilibrium state", i.e.
CWND is correctly set to the (flow's fair share of) storage capacity. The PTE mechanism is intended as a necessary extension for PEP/split connection approaches and aims to maintain the original semantics of CWND, AWND and ACK pacing in TCP. One could simplify my approach this way: it aims to make the mobile network disappear to the sender. The sender shall not see any difference between a pure wirebound path and a path with a mobile last mile. And the rationale for PTE was whether this is totally achieved by existing PEPs (in my opinion not quite) or whether there remains something to be done on that matter. Although the PTE work seems to be off topic with respect to the thread's subject, it was _exactly_ the question raised in this thread which motivated my work. This work is intended as an _extension_ to existing approaches and aims to bring more compatibility and interoperability to heterogeneous (e.g. wirebound/mobile) networks. It is not intended as a replacement for existing approaches. A major concern was the matter of compatibility. There is a plethora of approaches for successful congestion avoidance and control on the Internet, Active Queue Management, Performance Enhancing Proxies etc. And perhaps there is only a very little gap. Any approach which intends to close this gap must not break or interfere with well-proven and perhaps broadly used mechanisms. (I think this is in the very sense of Jon Postel's "Principle of Robustness".) Therefore the general question was: what must be done to have a mobile link appear to TCP, and the Internet, exactly the way that is expected by the congavoid paper and following ones? The better compliance we have with the basic system model used in TCP, the fewer difficulties need to be dealt with when a new mechanism is proposed or even used.
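As a concrete reading of the equilibrium condition above: the storage capacity a sender should fill is the bottleneck rate times the round-trip time (the bandwidth-delay product). The numbers in the sketch below are invented for illustration (a hypothetical 384 kbit/s mobile link with a 300 ms RTT), not taken from any measurement.

```python
# Bandwidth-delay product sketch: in the congavoid model a flow is in
# equilibrium when CWND matches the path's storage capacity, i.e. the
# bottleneck rate times the round-trip time.

def bdp_bytes(bottleneck_bps, rtt_s):
    """Storage capacity of the path in bytes: rate (bit/s) / 8 * RTT (s)."""
    return bottleneck_bps / 8.0 * rtt_s

def equilibrium_cwnd_segments(bottleneck_bps, rtt_s, mss=1460):
    """CWND, in MSS-sized segments, that just fills the pipe."""
    return max(1, round(bdp_bytes(bottleneck_bps, rtt_s) / mss))

# Hypothetical 384 kbit/s link with 300 ms RTT: 14400 bytes of "pipe".
cwnd = equilibrium_cwnd_segments(384_000, 0.300)
```

A PEP emulating a mobile link in the spirit described above would then present the sender with a wirebound-looking path whose rate and delay yield this same product.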
DB -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From mdalal at cisco.com Mon Jun 27 12:18:40 2005 From: mdalal at cisco.com (Mitesh Dalal) Date: Mon, 27 Jun 2005 12:18:40 -0700 (PDT) Subject: [e2e] Receiving RST on a MD5 TCP connection. In-Reply-To: <20050627190212.52306.qmail@web53703.mail.yahoo.com> References: <20050627190212.52306.qmail@web53703.mail.yahoo.com> Message-ID: refer to http://www.ietf.org/internet-drafts/draft-ietf-tcpm-tcpsecure-03.txt Although not a standard yet, it offers a reasonable mitigation. Thanks Mitesh On Mon, 27 Jun 2005, Tapan Karwa wrote: > Hi, > > I was going through RFC 2385 - Protection of BGP > Sessions via the TCP MD5 Signature Option. > > In Section 4.1, it mentions: > "Similarly, resets generated by a TCP in response to > segments sent on a stale connection will also be > ignored. Operationally this can be a problem since > resets help BGP recover quickly from peer crashes." > > This can easily happen in the following scenario: > XX is talking to YY and both are using MD5. YY > suddenly reboots but XX does not know about it yet. XX > sends the next segment to YY with the MD5 digest but > YY does not recognize it and hence sends a RST. Of > course this RST segment does not have the MD5 digest. > > Even when XX receives the RST, it won't/can't close the > connection since it will trash the packet as it does > not have the MD5 digest. > > I was wondering if there is any solution to this > problem. Would it be correct to accept the RST even if > the MD5 digest is missing? If we do that, can that > open doors for some other attacks? > > Thanks, > tapan. > > > > ____________________________________________________ > Yahoo! Sports > Rekindle the Rivalries.
Sign up for Fantasy Football > http://football.fantasysports.yahoo.com > From tapankarwa at yahoo.com Mon Jun 27 12:52:02 2005 From: tapankarwa at yahoo.com (Tapan Karwa) Date: Mon, 27 Jun 2005 12:52:02 -0700 (PDT) Subject: [e2e] Receiving RST on a MD5 TCP connection. In-Reply-To: Message-ID: <20050627195202.11232.qmail@web53701.mail.yahoo.com> Thanks a lot, Mitesh. I think your draft suggests solutions to the attack part of my question. I am wondering if there is any consensus on how we should deal with the problem mentioned in Section 4.1 of RFC 2385. Thanks. --- Mitesh Dalal wrote: > refer to > http://www.ietf.org/internet-drafts/draft-ietf-tcpm-tcpsecure-03.txt > > Although not a standard yet, it offers a reasonable > mitigation. > > Thanks > Mitesh > > On Mon, 27 Jun 2005, Tapan Karwa wrote: > > > Hi, > > > > I was going through RFC 2385 - Protection of BGP > > Sessions via the TCP MD5 Signature Option. > > > > In Section 4.1, it mentions: > > "Similarly, resets generated by a TCP in response to > > segments sent on a stale connection will also be > > ignored. Operationally this can be a problem since > > resets help BGP recover quickly from peer crashes." > > > > This can easily happen in the following scenario: > > XX is talking to YY and both are using MD5. YY > > suddenly reboots but XX does not know about it yet. XX > > sends the next segment to YY with the MD5 digest but > > YY does not recognize it and hence sends a RST. Of > > course this RST segment does not have the MD5 digest. > > > > Even when XX receives the RST, it won't/can't close the > > connection since it will trash the packet as it does > > not have the MD5 digest. > > > > I was wondering if there is any solution to this > > problem. Would it be correct to accept the RST even if > > the MD5 digest is missing? If we do that, can that > > open doors for some other attacks? > > > > Thanks, > > tapan.
__________________________________ Do you Yahoo!? Yahoo! Mail - Helps protect you from nasty viruses. http://promotions.yahoo.com/new_mail From cannara at attglobal.net Mon Jun 27 13:34:16 2005 From: cannara at attglobal.net (Cannara) Date: Mon, 27 Jun 2005 13:34:16 -0700 Subject: [e2e] Reacting to corruption based loss, and some remarks on PTE References: <20050627143508.5843.qmail@webmail30.rediffmail.com> <42C05239.4C780A40@web.de> Message-ID: <42C062C8.A348B76C@attglobal.net> Detlef, this sounds very reasonable. But remember, no matter whether there are radio links of any sort in a path, or not, the behavior of present TCP is disruptive of throughput as soon as even one packet isn't acked. So, words like differences from wired paths "nearly disappear" are begging the question. This is also why the belief: "The better compliance we have to the basic system model used in TCP..." is bound to fail. TCP is not a "system model" in any complete sense, because its control is based on undecidable (to it) events -- packet losses for unknown (to it) reasons. This "system model" is extended as the false god of "TCP Friendliness". All adherence to it means is that reasonable TCP service will remain denied if even 1% of packets are lost at a failing physical bus interface anywhere in a path, radio links or not. There's no reason to think that a transport design which results in amplifying physical, not congestive, losses at light load is a completed design. It's not even an acceptable design, as various demanding applications demonstrate. Alex Detlef Bosau wrote: > > sampad mishra wrote: > > > As we know, packet loss (due to corruption) is significant in wireless > > and is mainly due to fading when there is a shift from one AP to another.
> > I think it would be better if the sender decreases the rate rather > than sending data at the same rate considering the fact that data will > be lost... > > That's the rationale behind e.g. M-TCP (Brown, Singh, 1997). > IIRC, Brown and Singh slow down a TCP sender in the Internet which sends > data to a mobile receiver by shrinking the receiver's advertised window > (AWND) in handover periods. > > > I mean there is no point losing more data by continuing to > > send data at the same rate when we know that packets are getting > > corrupted. > > To my understanding, there are two issues. > > First, a sender has to recover properly from corruption when packets get > lost. In mobile networks (UMTS, GPRS) loss rates in a channel without > any kind of local recovery may become so high that there would be > hardly any throughput at all if error recovery were done on a packet > level. Unfortunately, mobile networks suffer from high packet corruption > rates not only in case of roaming but for a number of other reasons as > well (shading, multipath fading, ...). Please note: I put the emphasis on > _packet_ level here. Therefore, local recovery in mobile networks is not > done on a packet level; typically, mobile networks split > IP packets up into smaller portions, so-called "radio blocks", which may > have a size of e.g. 171 bits. Please note: this is only one value I've > found somewhere in the literature. AFAIK, Manfred Taferner (Vienna) has > worked out the pros and cons of different radio block sizes in his PhD > thesis. Typically, mobile networks use a Radio Link Protocol (RLP) which > basically consists of a combination of an Automatic Retransmission reQuest > (ARQ) protocol and Forward Error Correction (FEC). Because bit > corruption typically occurs in a bursty manner in mobile networks, RLP > typically employs block/packet interleaving mechanisms in order to keep > the FEC overhead reasonably small.
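[Ed.: the interleaving idea mentioned in the quoted paragraph can be sketched in a few lines: write the radio blocks as rows and transmit column by column, so a burst of consecutive corrupted symbols on the air is spread across many blocks and each block sees only a little damage, which FEC can then correct. A toy illustration, not any specific RLP's interleaver:]

```python
def interleave(blocks):
    """blocks: equal-length radio blocks (bytes). Transmit order is
    column-wise: first byte of every block, then second byte, ..."""
    return bytes(b for col in zip(*blocks) for b in col)

def deinterleave(stream, nblocks):
    """Inverse: regroup the column-wise stream back into blocks."""
    cols = [stream[i:i + nblocks] for i in range(0, len(stream), nblocks)]
    return [bytes(col[j] for col in cols) for j in range(nblocks)]
```

[Ed.: a burst of up to nblocks consecutive symbols in the transmitted stream then corrupts at most one symbol per radio block.]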
> > Second, and that's the focus of the loss differentiation work, a sender > who experiences packet loss decreases its congestion window (CWND). It > is the particular focus of Brown and Singh that a TCP sender shall > _maintain_ its CWND even in the presence of transient disconnection / rate > decrease in case of roaming. > > Now, my very concern is that these two issues are often mixed up. > Particularly, we have to carefully distinguish between 802.11 networks > (e.g. WLAN) > and mobile networks here. Particularly in WLAN, the problem sometimes > appears to me a little exaggerated. When those networks are > properly set up and are being run as designed, nowadays WLAN plants > achieve a BER less than 10^-6, which is comparable to early Ethernet > plants. > > So, in my opinion (and I'm very interested in any comments on this one), > in WLAN we can simply leave the whole matter alone, shut our eyes > and forget about the missing wire. > > I know that WLAN channels may become noisy and distorted e.g. in the > presence of ferroconcrete walls etc. But I said: "properly set up and .. > being run as designed." And since we all know that ferroconcrete > interferes with wireless communication, we can place access points and > antennas on each side of the wall and interconnect them using e.g. > Ethernet. Likewise, we should obey velocity constraints etc. > > There are numerous problems which may occur in WLANs, but we must > carefully distinguish whether these are structural problems of TCP or > simply consequences of inappropriate / wrong network operation. (Or, as > an allusion to a pretty well-known phrase by AST: Never overestimate the > range of a motor car > running out of petrol.) > > In my humble opinion (and as usual, I'm willing to take punishment) in > networks suffering from heavy loss, the problem is not TCP congestion > control but TCP retransmission. Of course, both problems coexist.
> However: If you consider a 3G mobile network _without_ any RLP, you > would not be unlikely to experience packet loss rates of 50 % or even > much higher. And in that situation, the congestion control problem is > the mouse. And the time and bandwidth spent on retransmission are the > elephant. > > Even more precisely: The retransmission problem is not really due to TCP > but due to the size of IP packets. This is the very reason why these > are broken down into pieces on mobile links. > > > So I think it's not much of a worry if the sender reacts to corruption > > like congestion for some cases, at least for the time being, till we > > find the reason for corruption and react accordingly.... Main point is > > Now, the problem is that in case of frequent corruption-based loss, the > sender might nearly stop sending. > > However, I think we must be careful to put the focus on the correct > problem. > > Simply spoken: > > 1: "WaveLAN is fixed LAN". (even more AST: The metric ton of salt.) > 2: In mobile links, corruption loss can be neglected due to the use of > RLP. > > So, in mobile networks, problems for TCP are reduced to (following > Reiner Ludwig's PhD dissertation) > -spurious timeouts, > -transient disconnections, > -scheduling problems. > > With respect to this thread's subject this means: In mobile networks, > corruption-based losses are made to (nearly) disappear, and so does the > problem. > > However, the other side of the mountain is how a mobile link employing > an RLP appears to an Internet sender. One example, as you point out, may > be a decrease of bandwidth (or more precisely: achievable throughput) in > case of handover. > > This may require the sender to slow down. > > In M-TCP this is achieved by shrinking AWND, which may result in a > number of difficulties. E.g.
shrinking / clamping AWND conflicts with > the original AWND semantics: To my understanding, the purpose of AWND is > mainly to slow down the sender if the receiver's _application_ is not > able / willing > to accept any data from the receiver's socket. (E.g. if the stone-aged > Netscape release on my even older desktop PC collapses when it is > flooded > by my DSL line ;-) ) Another problem is that it's not an easy task to > find the appropriate AWND which is necessary to have the sender > send at the correct rate. It may even be difficult to decide whether > clamping is necessary at all. Imagine a situation where a flow is already > restricted to a rate the mobile link can carry by some bottleneck in the > wired part of the connection. > > For these reasons I propose the PTE mechanism described in > http://www.detlef-bosau.de/043.pdf . The key idea is to make a mobile > link appear like > an ordinary wirebound link, the storage capacity and bandwidth of which > reflect the properties of the mobile link. This immediately > follows the congavoid paper by JK and MK: a network is essentially > described by its storage capacity and its bottleneck rate, and a TCP > sender is correctly paced by the receiver's ACK packets when the flow is > in "equilibrium state", i.e. CWND is correctly set to the flow's fair > share of the storage capacity. The PTE mechanism is intended as a necessary > extension for PEP/split-connection approaches and aims to maintain the > original semantics > of CWND, AWND and ACK pacing in TCP. > > One could summarize my approach by saying that it aims to make the mobile > network disappear to the sender. The sender shall not see any difference > between > a purely wirebound path and a path with a mobile last mile. And the > rationale for PTE was the question whether this is fully achieved by existing PEPs > (in my opinion > not quite) or whether there remains something to be done on that matter.
> > Although the PTE work seems to be off topic with respect to the thread's > subject, it was _exactly_ the question raised in this thread which > motivated my work. > This work is intended as an _extension_ to existing approaches and aims > to bring more compatibility and interoperability to heterogeneous (e.g. > wirebound/mobile) networks. It is not intended as a replacement for > existing approaches. > > A major concern was the matter of compatibility. There is a plethora of > approaches for successful congestion avoidance and control on the > Internet, > Active Queue Management, Performance Enhancing Proxies etc. And perhaps > there is only a very small gap left. Any approach which intends to close > this gap must not break or interfere with well-proven and perhaps > broadly used mechanisms. (I think this is in the very sense of Jon > Postel's "Principle of Robustness".) Therefore the general question was: > What must be done to have a mobile link appear to TCP, and the Internet, > exactly the way it is expected by the congavoid paper and following > ones? The better compliance we have with the basic system model used in > TCP, the fewer difficulties need to be dealt with when a new mechanism is > proposed or even used. > > DB > -- > Detlef Bosau > Galileistrasse 30 > 70565 Stuttgart > Mail: detlef.bosau at web.de > Web: http://www.detlef-bosau.de > Mobile: +49 172 681 9937 From braden at ISI.EDU Mon Jun 27 13:57:30 2005 From: braden at ISI.EDU (Bob Braden) Date: Mon, 27 Jun 2005 13:57:30 -0700 (PDT) Subject: [e2e] Receiving RST on a MD5 TCP connection. Message-ID: <200506272057.NAA22115@gra.isi.edu> *> > *> > Even when XX receives the RST, it won't/can't close the *> > connection since it will trash the packet as it does *> > not have the MD5 digest. *> > *> > I was wondering if there is any solution to this *> > problem. Would it be correct to accept the RST even if *> > the MD5 digest is missing? If we do that, can that *> > open doors for some other attacks?
RSTs in TCP are always advisory, I think. For example, they are not transmitted reliably. Bob Braden From jtk at northwestern.edu Mon Jun 27 15:01:30 2005 From: jtk at northwestern.edu (John Kristoff) Date: Mon, 27 Jun 2005 17:01:30 -0500 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42BEE8D7.D2FFC9E6@attglobal.net> References: <20050626001318.CB7A424D@aland.bbn.com> <42BEE8D7.D2FFC9E6@attglobal.net> Message-ID: <20050627220130.5E303136C82@aharp.ittns.northwestern.edu> On Sun, 26 Jun 2005 10:41:43 -0700 Cannara wrote: > So, we come back to the core meaning of the e2e principle -- assure > ends know what each is doing and get data reliably & efficiently > transferred. I'm one person here, perhaps of many, who was not involved in the formation of the principle or the development of the early Internet. As someone who tries painstakingly to deeply understand what those who came before me have written and done, I find it unhelpful to read reinterpretations of e2e ideas and Internet history from those who are often told they are wrong by others who were there. I quote the above only as one example and note to neophytes who may be stumbling upon your posts. Examples of functions that may be best served on an e2e basis are just examples. Perhaps I'm misrepresenting your quote above, but as I read it, you seem to be saying reliability and efficiency of packets transferred are core goals. When in fact, as I understand it, there are no goals per se, only the core argument: where best to put those functions. John From mdalal at cisco.com Mon Jun 27 15:38:51 2005 From: mdalal at cisco.com (Mitesh Dalal) Date: Mon, 27 Jun 2005 15:38:51 -0700 (PDT) Subject: [e2e] Receiving RST on a MD5 TCP connection. In-Reply-To: <20050627195202.11232.qmail@web53701.mail.yahoo.com> References: <20050627195202.11232.qmail@web53701.mail.yahoo.com> Message-ID: On Mon, 27 Jun 2005, Tapan Karwa wrote: > Thanks a lot, Mitesh.
I think your draft suggests > solutions to the attack part of my question. > > I am wondering if there is any consensus on how we > should deal with the problem mentioned in Section 4.1 > of RFC 2385. > AFAIK, there is no common solution for this problem. Implementations may have their own "solutions" besides the obvious one: drop the RST and time out the connection. But for obvious reasons, accepting the MD5-less RST is not acceptable. Mitesh From detlef.bosau at web.de Mon Jun 27 17:12:40 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 28 Jun 2005 02:12:40 +0200 Subject: [e2e] Reacting to corruption based loss Message-ID: <42C095F8.26792D6@web.de> Wesley Eddy weddy at grc.nasa.gov Tue Jun 7 04:18:09 PDT 2005 > This idea is sort of discussed in the ETEN paper Craig sent a link to > earlier. One approach that it describes (CETEN_A) adapts beta between > 1/2 and 1 based on the rate of congestion events reported. In the > October 2004 CCR, there is a paper that goes into greater depth on > CETEN; "New Techniques for Making Transport Protocols Robust to > Corruption-Based Loss" by Eddy, Ostermann, and Allman. I see the point. But one question remains (admittedly, I have not yet read the paper, therefore I apologize if you have given the answer there). How do you achieve _fairness_ when beta may vary? AIMD results in fair shares if alpha and beta are identical in competing flows. If beta is always chosen as 1/2 and alpha derived from the RTT (the idea behind the formula alpha=MSS*MSS/CWND is to increase CWND by one MSS per RTT, IIRC), which should be identical for competing flows, i.e. flows sharing the same path, this is no problem. If alpha and / or beta are different in competing flows, to my understanding CWND generally will not reach the fair share. Do you achieve similar values for beta in competing flows?
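[Ed.: the fairness concern raised here can be checked with a toy model (not CETEN itself): two flows sharing one bottleneck, both increasing additively by the same alpha, each multiplying its window by its own beta at every congestion event. A sketch under the simplifying assumptions of synchronized losses and fluid windows:]

```python
def aimd_windows(beta1, beta2, alpha=0.001, capacity=1.0, events=500):
    """Return the two windows at the last congestion event, i.e. the
    long-run split of the bottleneck between the two flows."""
    w1 = w2 = 0.0
    for _ in range(events):
        while w1 + w2 < capacity:   # additive increase, same alpha for both
            w1 += alpha
            w2 += alpha
        split = (w1, w2)            # shares at the congestion event
        w1 *= beta1                 # multiplicative decrease,
        w2 *= beta2                 # per-flow beta
    return split
```

[Ed.: with beta1 = beta2 = 1/2 the split is even; with beta1 = 1/2, beta2 = 1/3 the second flow is left with a permanently smaller share and never recovers fairness — exactly the concern stated above.]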
-- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From cannara at attglobal.net Mon Jun 27 22:42:00 2005 From: cannara at attglobal.net (Cannara) Date: Mon, 27 Jun 2005 22:42:00 -0700 Subject: [e2e] Reacting to corruption based loss References: <20050626001318.CB7A424D@aland.bbn.com> <42BEE8D7.D2FFC9E6@attglobal.net> <20050627220130.5E303136C82@aharp.ittns.northwestern.edu> Message-ID: <42C0E328.5740ADF6@attglobal.net> John, if there are no goals for a design, then why choose the particular aspects of the design? The purpose of any transport protocol is to move data reliably and efficiently from end to end. That's surely a goal. In other words, the transport layer is designed to fulfill a commitment to the next layer above: whatever data that layer passes down will get to the target's same layer, completely, accurately and efficiently. That's what my remark was simply meant to affirm. TCP, as commonly deployed, fails the last part of the commitment in various real situations, and so is modified or not used when the goal is to meet all parts of the commitment that any good transport must make. Alex John Kristoff wrote: > > On Sun, 26 Jun 2005 10:41:43 -0700 > Cannara wrote: > > > So, we come back to the core meaning of the e2e principle -- assure > > ends know what each is doing and get data reliably & efficiently > > transferred. > > I'm one person here, perhaps of many, who was not involved in the > formation of the principle or the development of the early Internet. > As someone who tries painstakingly to deeply understand what those > who came before me have written and done, I find it unhelpful to read > reinterpretations of e2e ideas and Internet history from those who > are often told they are wrong by others who were there. > > I quote the above only as one example and note to neophytes who may > be stumbling upon your posts.
> > Examples of functions that may be best served on an e2e basis are > just examples. Perhaps I'm misrepresenting your quote above, but > as I read it, you seem to be saying reliability and efficiency of > packets transferred are core goals. When in fact, as I understand > it, there are no goals per se, only the core argument: where > best to put those functions. > > John From rja at extremenetworks.com Tue Jun 28 04:55:29 2005 From: rja at extremenetworks.com (RJ Atkinson) Date: Tue, 28 Jun 2005 07:55:29 -0400 Subject: [e2e] Receiving RST on a MD5 TCP connection. In-Reply-To: <20050627195202.11232.qmail@web53701.mail.yahoo.com> References: <20050627195202.11232.qmail@web53701.mail.yahoo.com> Message-ID: <504E230F-439F-4FF7-BA79-347362AE219F@extremenetworks.com> On Jun 27, 2005, at 15:52, Tapan Karwa wrote: > I am wondering if there is any consensus on how we > should deal with the problem mentioned in Section 4.1 > of RFC 2385. I don't think this is a significant issue in real-world deployments. TCP MD5 is designed to prevent acceptance of unauthenticated TCP RST messages, to reduce the risk of (D)DoS attacks on the TCP sessions of BGP. An adversary could send an unauthenticated RST anytime. If that took out BGP, that would be a much larger operational problem. In practice, if the first (i.e. unauthenticated) RST is ignored, the router will send another RST a bit later on (e.g. after it is rebooted sufficiently to know which MD5 key to use) and that one WILL be authenticated and will be accepted rather than ignored. So it should sort itself out without any spec changes, just taking a time period closer to the reboot-time of the router that is rebooting rather than some small fraction of that time. No real harm done with the current situation at all.
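[Ed.: the behaviour described here follows from the receive-side rule of RFC 2385: on a connection configured with an MD5 key, a segment without a valid digest — RST or not — is silently dropped. A simplified sketch; the real digest covers the TCP pseudo-header, the TCP header with zeroed checksum, and the data, which `segment_bytes` stands in for here:]

```python
import hashlib
import hmac

def rfc2385_accept(segment_bytes, md5_option, key):
    """Simplified receive-side check. Returns True if the segment may be
    processed (including a RST tearing down the connection)."""
    if key is None:            # connection not protected by TCP MD5
        return True
    if md5_option is None:     # an MD5-less segment on a protected
        return False           # connection is dropped -- even a RST
    expected = hashlib.md5(segment_bytes + key).digest()
    return hmac.compare_digest(expected, md5_option)
```

[Ed.: so the first, unauthenticated RST from the rebooted peer fails the check, and recovery waits for a later RST sent once the peer has its key configured again.]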
Ran rja at extremenetworks.com From weddy at grc.nasa.gov Tue Jun 28 07:53:30 2005 From: weddy at grc.nasa.gov (Wesley Eddy) Date: Tue, 28 Jun 2005 10:53:30 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42C095F8.26792D6@web.de> References: <42C095F8.26792D6@web.de> Message-ID: <20050628145330.GD2392@grc.nasa.gov> On Tue, Jun 28, 2005 at 02:12:40AM +0200, Detlef Bosau wrote: > > I see the point. But one question remains (admittedly, I have not yet > read the paper, therefore I apologize if you have given the answer > there). There is much better information in the paper than this email provides, but I'll try to answer anyways :). > How do you achieve _fairness_, when beta may vary? The paper splits this into two questions, so that it makes more sense: 1) Are a bunch of competing CETEN flows "fair" to each other? and 2) Are CETEN flows "friendly" to competing "legacy" TCP flows? Define fair to mean "equal sharing of resources", and define friendly (in a way that's a bit different from what TFRC uses) to mean "doesn't reduce the throughput of a normal competing TCP flow any more than another normal TCP flow would." In other words, by fairness, we mean to say that the enhanced TCP only gains performance improvements from utilizing unused link capacity, not by stealing from competing flows. The answers given to these questions in the paper are: 1) Yes, in fact, at high error rates, CETEN flows are more fair to each other than normal TCP flows are. Under CETEN, each flow has its own floating-point value for beta that's computed from observations of its own TCP behavior and some hints on error rates observed by routers; so it's safe to say that few flows have the same beta, although for most long-lived flows the beta values should be fairly closely grouped. The paper has experimental (simulation-based) evidence that acceptable fairness can be achieved even if beta isn't totally uniformly distributed.
2) The paper's answer to this comes from several simulations over varying levels of bottleneck saturation where the total number of flows is kept constant, but the ratio of CETEN to stock TCP flows is varied. We see that in normal cases, even at high numbers of CETEN flows, the stock flows' throughput is not significantly lower than it is with no CETEN flows. So, CETEN is friendly to stock TCP. However, we also show that it is possible to construct edge cases where the CETEN flows starve out normal TCP flows. -Wes -- Wesley M. Eddy Verizon FNS / NASA GRC http://roland.grc.nasa.gov/~weddy -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050628/e3fb0d78/attachment.bin From detlef.bosau at web.de Tue Jun 28 12:34:29 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 28 Jun 2005 21:34:29 +0200 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> Message-ID: <42C1A645.F4A563F2@web.de> Wesley Eddy wrote: > > On Tue, Jun 28, 2005 at 02:12:40AM +0200, Detlef Bosau wrote: > > > > I see the point. But one question remains (admittedly, I did not yet > > read the paper, therefore I apologize if you have given the answer > > there). > > There is much better information in the paper than this email provides, > but I'll try to answer anyways :). > Thanks a lot. Just one question in advance. What is the rationale behind the error rates used in your simulation? From first glance at your thesis and your papers, I see packet corruption rates varying from about 0.001 to 0.1. Is this correct? In that case I do not understand your choice. Even 0.1 is far too low for mobile networks _without_ RLP. And 0.001 seems to me much too high for wirebound networks and for mobile networks _with_ RLP.
But I'm sure that Alex will provide additional information here. Corruption rates in mobile networks vary over extremely broad ranges. From what I've read so far, some papers use BER (_Block_ Error Rate, because radio blocks are the entity used by RLP) values of 1 %, 2 %, 3 %, 4 %, 5 %; other ones use 5 %, 10 %, 15 %. So let us take 5 % for the moment. This appears to me a reasonable radio block corruption rate which can be found in many papers, so hopefully it is sometimes met even in reality. (Perhaps some person experienced with mobile networks can provide details here.) So, if you consider a mobile network _without_ RLP, and if you consider for example IP packets of 500 bytes = 4000 bits = ca. 23 radio blocks, assuming the 171 bits/block I mentioned yesterday as an example, then the probability for a packet to remain intact in the presence of a block corruption rate of 0.05 is (1-0.05)^23 ≈ 0.31. In other words: your IP packet corruption rate is about seventy percent. So, for mobile links, I would have expected packet corruption rates of about 0.5, 0.6, 0.7, 0.8, 0.9 to meet realistic values. That is why I did not give pure e2e approaches in the "mobile access net scenario" further consideration, because with packet corruption rates of 0.7 or 0.8, congestion control does not matter. It's simply not the problem. The problem is the unacceptable rate of packet retransmissions, which is not only annoying for the user, because it takes large numbers of transmissions and large amounts of time to have a packet eventually delivered. It is annoying for the rest of the world as well, because even wirebound network links with small corruption rates would be occupied by retransmissions. This was exactly the moment where I abandoned the idea of using pure e2e recovery for TCP including mobile channels. In fact, I think, in practical mobile networks the network operators have never given it a thought. From what I see, there is no mobile network without RLP, even good ol'
GSM has a reliable character / byte (?) stream used for various purposes. However, it took a long time to see this point. Personally, I have thought about pure e2e solutions for lossy channels for a really long time. I can honestly say that I have thrown away about 3 years of work because it suddenly became clear to me that I had done work for the waste basket. Please correct me if I'm wrong there. But I'm totally convinced that in really lossy networks, e.g. mobile wireless links, congestion control and loss differentiation simply _miss_ the problem. In mobile networks, I think it is inevitable to make use of the RLP mechanisms which typically decrease packet corruption rates to 10^-3 or 10^-9 (sic!), whatever you prefer. (However, I've never seen a _reliable_ packet transfer there, and I think it's to avoid starvation problems caused by "everlasting packets" when there is no possibility to restrict the number of sending attempts by a finite limit.) > > > How do you achieve _fairness_, when beta may vary? > > The paper splits this into two questions, so that it makes more sense: > > 1) Are a bunch of competing CETEN flows "fair" to each other? > and > 2) Are CETEN flows "friendly" to competing "legacy" TCP flows? > > Define fair to mean "equal sharing of resources", and define friendly (in > a way that's a bit different from what TFRC uses) to mean "doesn't reduce > the throughput of a normal competing TCP flow any more than another normal > TCP flow would." In other words, by fairness, we mean to say that the > enhanced TCP only gains performance improvements from utilizing unused > link capacity, not by stealing from competing flows. Hm ;-) Of course, there are lots of situations where a sender cannot exploit the capacity of the link. In real situations you will often find that a backbone's bandwidth far exceeds the bandwidth used in access links. However, in quite a number of situations, links are fully occupied by actual flows.
In other words: There is no unused link capacity. And in fact, in those situations adding a new flow to the network _of_ _course_ means taking away resources from existing flows and using them for the new one. I remember a slide set from a talk given by Len Kleinrock, where he pointed out the directive "Keep the line full!". In fact, this is the very basis for affordable network communication. And this holds true not only for packet-switched networks but for any kind of network; consider e.g. the telephone system. > The answers given to these questions in the paper are: > > 1) Yes, in fact, at high error rates, CETEN flows are more fair to each > other than normal TCP flows are. Under CETEN, each flow has its own > floating-point value for beta that's computed from observations of its > own TCP behavior and some hints on error rates observed by routers; so > it's safe to say that few flows have the same beta, although for most > long-lived flows the beta values should be fairly closely grouped. The > paper has experimental (simulation-based) evidence that acceptable > fairness can be achieved even if beta isn't totally uniformly > distributed. O.k. One remark from my own ns2 experience: Even in simple dumbbell scenarios, with only a few flows, it takes some time for a number of flows to reach fairness. To my understanding, this was due to the fact that often flows are poorly interleaved and therefore "implicit congestion notification" (ICN, AKA "dropped packets" ;-)) did not reach all senders at the same time, and sometimes not all senders received the same number of ICNs. I'm not quite sure how exactly this matches real networks, because I always consider the possibility of simulation artefacts etc. Thus, I place value on the theoretical basis here.
Although I play around with the ns2 a lot (and even my PTE stuff is implemented with the ns2), and of course _one_ ns2 simulation may yield a counterexample to disprove an approach, I personally would not rely too much on simulation results alone. For the traditional AIMD scheme, it's quite easy to see that AIMD sequences starting from different initial values will end up in the same sawtooth as long as alpha and beta are chosen equally for all competing flows. In the case of different values for beta, even two competing AIMD sequences starting from a fair share at the starting point (e.g. both from zero) would become unfair after the first congestion event and would never reach fairness again. If I had a pencil and a piece of paper here, I would make a little sketch on that matter and the situation would become clear within three lines or so ;-) The flow with the smaller beta is left with a permanently smaller share in the long run. Consider the starting value (0,0), beta 1 = 1/2, beta 2 = 1/3. Congestion event: the sum of the two values is 1. Alpha arbitrary but equally chosen for both flows. Then the sequence is: Start: (0,0) Congestion at (1/2, 1/2) after congestion handling: (1/4, 1/6) Congestion at (0.542, 0.458) after congestion handling: (0.271, 0.153) etc... I think my objection is obvious: with different values for beta, the flows will never reach a fair share. -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From iam4 at cs.waikato.ac.nz Tue Jun 28 14:29:45 2005 From: iam4 at cs.waikato.ac.nz (Ian McDonald) Date: Wed, 29 Jun 2005 09:29:45 +1200 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42C1A645.F4A563F2@web.de> References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> Message-ID: <42C1C149.6070305@cs.waikato.ac.nz> Hi there, Detlef Bosau wrote: > So let us take 5 % for the moment.
This appears to me a reasonable radio > block corruption rate which can be found in many papers, so hopefully it > is sometimes met even in reality. (Perhaps someone experienced with > mobile networks can provide details here.) > Will attempt to from my experience! ... > That is why I did not give pure e2e approaches in the "mobile access net > scenario" further consideration, because with packet corruption rates of > 0.7 or 0.8 congestion control does not matter. It's simply not the > problem. The problem is the unacceptable rate of packet retransmissions, > which is not only annoying for the user, because it takes large numbers > of transmissions and large amounts of time to have a packet eventually > delivered. It is annoying for the rest of the world as well, because > even wirebound network links with small corruption rates would be > occupied by retransmissions. > > > Please correct me if I'm wrong there. But I'm totally convinced that in > really lossy networks, e.g. mobile wireless links, congestion control > and loss differentiation simply _miss_ the problem. > I couldn't agree more with this last part. My experience has been as a software development manager at ECONZ (http://www.econz.co.nz), where we worked with mobile networks across radio, cellular, GPRS etc. The project I was personally responsible for worked over AMPS cellular and CDPD with, at times, very low signals. Our experience was that for AMPS we knocked the baud rate down as low as possible, to 4.8K, and abandoned the use of TCP/IP altogether, as we could hit packet loss rates of 40% quite regularly, plus dropped connections. We implemented a protocol on top of UDP where we just used a sliding window (from memory, about 16), knocked the packet size down to something small (about 100 bytes max), and allowed out-of-order packets, with a timeout-based retransmission. The throughput of this, compared to TCP (Reno), was around 10 times higher.
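The scheme Ian describes (a small fixed window over UDP, tiny packets, per-packet ACKs so out-of-order arrival is fine, and a per-packet retransmission timer) can be sketched in a few lines. This is a rough illustration only, not ECONZ's actual protocol; the class name and all parameter values are invented:

```python
# Sketch of a fixed sliding window over an unreliable datagram channel,
# with individual per-packet ACKs and per-packet retransmission timers.
# Hypothetical names/parameters; not the actual ECONZ protocol.
class SlidingWindowSender:
    def __init__(self, send, window=16, max_payload=100, rto=1.0):
        self.send = send              # callable(seq, chunk): the lossy channel
        self.window = window          # max packets in flight
        self.max_payload = max_payload
        self.rto = rto                # retransmission timeout, seconds
        self.queue = []               # chunks waiting for window space
        self.unacked = {}             # seq -> (chunk, last_send_time)
        self.next_seq = 0

    def enqueue(self, data):
        # Fragment into small datagrams (~100 bytes in the anecdote).
        for i in range(0, len(data), self.max_payload):
            self.queue.append(data[i:i + self.max_payload])

    def pump(self, now):
        # Fill the window from the queue; deliberately no congestion
        # control, which is the point on a loss-dominated link.
        while self.queue and len(self.unacked) < self.window:
            chunk = self.queue.pop(0)
            self.unacked[self.next_seq] = (chunk, now)
            self.send(self.next_seq, chunk)
            self.next_seq += 1

    def on_ack(self, seq):
        # ACKs are per-packet, so they may arrive in any order.
        self.unacked.pop(seq, None)

    def tick(self, now):
        # Retransmit anything whose timer has expired.
        for seq, (chunk, sent_at) in list(self.unacked.items()):
            if now - sent_at >= self.rto:
                self.send(seq, chunk)
                self.unacked[seq] = (chunk, now)
```

Losing a packet here costs one RTO for that packet only; it never collapses the sending rate the way TCP's congestion response does, which is roughly where an order-of-magnitude throughput difference on a 40%-loss link can come from.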
With better conditions we still got much better throughput. This was one of the reasons why ECONZ did so well as a company: we could actually transmit data to remote locations in a reasonable timeframe. Anyway, I may be living in the past a little, but that was my experience! Regards, Ian WAND Network Research http://www.wand.net.nz From cannara at attglobal.net Tue Jun 28 23:35:52 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 28 Jun 2005 23:35:52 -0700 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> Message-ID: <42C24148.ECD1AEC4@attglobal.net> On the error rates issue, mobile is an extreme case, always subject to difficult conditions in the physical space, so symbol definitions & error correction are paramount. However, most corporate traffic isn't over mobile links, but dedicated lines between routers, or radio/optical bridges, etc. Here, the reality of hardware failures raises its head and we see long-lasting error rates that are quite small and even content dependent. This is where TCP's ignorance of what's going on and its machete approach to slowdown are inappropriate and costly to the enterprise. As an example of the latter, a major telecom company, whose services many of us are using this instant, called a few years back, asking for help determining why just some of its offices were getting extremely poor performance downloading files, like customer site maps, from company servers, while other sites had great performance. The maps were a few MB and were loaded via SMB/Samba over TCP/IP to staff PCs. The head network engineer was so desperate, he even put a PC in his car and drove all over Florida checking sites. This was actually good. But, best of all, he had access to the company's Distributed Sniffers(r) at many offices and HQ.
A few traces told the story: a) some routed paths from some offices were losing 0.4% of pkts, while others lost none; b) the lossy paths experienced 20-30% longer file-download times. By simple triangulation, we decided that he should check the T3 interface on Cisco box X for errors. Sure enough, about 0.4% error rates were being tallied. The phone-line folks fixed the problem and voila, all sites crossing that path were back to speed! Now, if you were a network manager for a major corporation, would you rush to fix a physical problem that generated less than 1% errors, if your boss & users were complaining about mysterious slowdowns many times larger? 0.4% wasn't even enough to trigger an alert on their management consoles. You'd certainly be looking for bigger fish. Well, TCP's algorithms create a bigger fish -- talk about Henny Penny. :] The files were transferred in many 34kB SMB blocks, which required something like 23 server pkts per block. The NT servers had a send window of about 6 pkts (uSoft later increased that to about 12). All interfaces were 100Mb/s, except the T3 and a couple of T1s, depending on path. RTT was about 70ms for all paths. Thankfully, the Sniffer traces also showed exactly what the TCPs at both ends were doing, despite Fast Retransmit, SACK, etc.: a) the typical, default timeouts were knocking the heck out of throughput; b) the fact that transfers required many blocks of odd numbers of pkts meant that the ACK timer at the receiver was expiring on every block, waiting (~100ms) for the magical even-numbered last pkt in the block, which never came. These defaults could have been changed to gain some performance back, but not much. The basic idea that TCP should assume congestion = loss was the Achilles' heel. Even the silly "ack alternate pkts" concept could have been largely automatically eliminated, if the receiver TCP actually learned that it would always get an odd number.
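The per-block cost Alex describes can be put into a back-of-envelope model. This is a deliberate simplification (one RTT per window-full, one delayed-ACK stall per block), using only the numbers given in the anecdote:

```python
import math

def block_time_ms(pkts_per_block=23, window_pkts=6, rtt_ms=70,
                  delayed_ack_ms=100, odd_final_pkt=True):
    # The sender is limited by its ~6-packet window, so each 34 kB SMB
    # block takes ceil(23/6) = 4 round trips; an odd-numbered final packet
    # then leaves the receiver's delayed-ACK timer waiting ~100 ms for an
    # even packet that never comes.  Simplified model, illustration only.
    rounds = math.ceil(pkts_per_block / window_pkts)
    stall = delayed_ack_ms if odd_final_pkt else 0
    return rounds * rtt_ms + stall

per_block = block_time_ms()                    # 4*70 + 100 = 380 ms
no_stall = block_time_ms(odd_final_pkt=False)  # 280 ms without the stall
```

On these numbers the delayed-ACK stall alone adds roughly a third to every block, on every path, before any loss-triggered timeouts are counted at all.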
In any case, the network management and CIO at this corporation learned just how important a good transport protocol is, and how careful they need to be of TCP. Another example, not for TCP but just for cute hardware-failure tricks, involved another WAN-connected corporation, whose St. Louis office suddenly couldn't see all the printers & servers they normally did, especially in the larger of their offices around the US. After some Sniffering in LA, it became clear that certain sizes & types of pkts just weren't getting all the way to the St. Louis office LAN. Packets larger than 128B and containing a broadcast destination weren't getting there. Everything else was. Since the servers & printers announced themselves with Netware SAP b'casts, any site with few services sent pkts <128B, while those with more services sent large ones. What could possibly select the 128B b'casts for demolition? After the usual, follow-the-pointing-finger phone discussions with the techs for the WAN provider and the St. Louis phone company, they reported clean local line-test results. Yet, the problem persisted. Fortunately, the WAN tech was experienced and knew anything can happen in hardware, so overnight he ran some special tests into the local phone circuit. The next am, the St. Louis phone company suddenly provided a new hardware path to the T1 office drop. Voila! I could go on, but the idea is that assumptions about "low" error rates are completely protocol (hardware through transport) dependent. We consultants love all these problems, because we get paid well to be suspicious of everything and look at every possibility. I love TCP/IP, because it generates far more business per node than AppleTalk, Vines, Netware, or even DECnet ever did. Even more than Token Ring! And, that's saying something. :] Alex Detlef Bosau wrote: > > Wesley Eddy wrote: > > > > On Tue, Jun 28, 2005 at 02:12:40AM +0200, Detlef Bosau wrote: > > > > > > I see the point. 
But one question remains (admittedly, I did not yet > > > read the paper, therefore I apologize if you have given the answer > > > there). > > > > There is much better information in the paper than this email provides, > > but I'll try to answer anyways :). > > > > Thanks a lot. > > Just one question in advance. What is the rationale behind the error > rates used in your simulation? > > From first glance at your thesis and your papers, I see packet > corruption rates varying from about 0.001 to 0.1. Is this correct? > In that case I do not understand your choice. Even 0.1 is far too low > for mobile networks _without_ RLP. And 0.001 seems to me much too high > for wirebound networks and for mobile networks _with_ RLP. But I'm sure > that Alex will provide additional information here. > > Corruption rates in mobile networks vary over extremely broad ranges. From what > I've read so far, some papers use BER (_Block_ Error Rate, because > radio blocks > are the entity used by RLP) values of 1 %, 2 %, 3 %, 4 %, 5 %, other ones > use 5 %, 10 %, 15 %. > > So let us take 5 % for the moment. This appears to me a reasonable radio > block corruption rate which can be found in many papers, so hopefully it > is sometimes met even in reality. (Perhaps someone experienced with > mobile networks can provide details here.) > > So, if you consider a mobile network _without_ RLP, and if you consider > for example IP packets of 500 bytes = 4000 bits = ca. 23 radio blocks, > assuming the 171 bit/block I mentioned yesterday as an example, then the > probability for a packet to remain intact in the presence of a block > corruption rate of 0.05 is > (1-0.05)^23 = 0.31. In other words: Your IP packet corruption rate is > about seventy percent. So, for mobile links, I would have expected > packet > corruption rates of about 0.5, 0.6, 0.7, 0.8, 0.9 to meet realistic values.
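The arithmetic in the quoted paragraph checks out; as a quick sanity check (the 5% block error rate and the ~23 radio blocks per 500-byte packet are the quoted assumptions):

```python
# Quick check of the quoted arithmetic: per-block errors compound over
# all the radio blocks that make up one IP packet.
block_loss = 0.05                    # quoted radio-block corruption rate
blocks_per_packet = 23               # 500 bytes = 4000 bits / ~171-bit blocks
p_intact = (1 - block_loss) ** blocks_per_packet   # ~0.31
p_corrupt = 1 - p_intact                           # ~0.69, i.e. ~70%
```

So even a modest per-block error rate compounds into a packet corruption rate near 70% without link-layer recovery, which is exactly why RLP, rather than e2e retransmission, does the real work on such links.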
> > That is why I did not give pure e2e approaches in the "mobile access net > scenario" further consideration, because with packet corruption rates of > 0.7 or 0.8 congestion control does not matter. It's simply not the > problem. The problem is the unacceptable rate of packet retransmissions, > which is not only annoying for the user, because it takes large numbers > of transmissions and large amounts of time to have a packet eventually > delivered. It is annoying for the rest of the world as well, because > even wirebound network links with small corruption rates would be > occupied by retransmissions. > > This was exactly the moment where I abandoned the idea of using pure > e2e recovery for TCP including mobile channels. > > In fact, I think, in practical mobile networks the NOs have never given > it a thought. From what I see, there is no mobile network without RLP; > even good ol' GSM > has a reliable character / byte (?) stream used for various purposes. > However, it took me a long time to see this point. Personally, I have > thought about pure e2e solutions for lossy channels for a really long > time. > > I can honestly say that I have thrown away about 3 years of work because > it suddenly became clear to me that I've done > work for the waste basket. > > Please correct me if I'm wrong there. But I'm totally convinced that in > really lossy networks, e.g. mobile wireless links, congestion control > and loss differentiation simply _miss_ the problem. > > In mobile networks, I think it is inevitable to make use of the RLP > mechanisms, which typically decrease packet corruption > rates to 10^-3 or 10^-9 (sic!), whichever you prefer. (However, I've > never seen a _reliable_ packet transfer there, and I think that is to avoid > starvation > problems caused by "everlasting packets" when there is no possibility to > restrict the number of sending attempts to a finite limit.) > > > > > > How do you achieve _fairness_, when beta may vary?
> > > > The paper splits this into two questions, so that it makes more sense: > > > > 1) Are a bunch of competing CETEN flows "fair" to each other? > > and > > 2) Are CETEN flows "friendly" to competing "legacy" TCP flows? > > > > Define fair to mean "equal sharing of resources", and define friendly (in > > a way that's a bit different from what TFRC uses) to mean "doesn't reduce > > the throughput of a normal competing TCP flow any more than another normal > > TCP flow would." In other words, by fairness, we mean to say that the > > enhanced TCP only gains performance improvements from utilizing unused > > link capacity, not by stealing from competing flows. > > Hm ;-) > > Of course, there are lots of situations where a sender cannot exploit > the capacity of the link. In real situations you will often meet the > situation that a backbone's bandwidth by far exceeds the bandwidth used > in access links. > > However, in quite a number of situations, links are fully occupied by > actual flows. In other words: There is no unused link capacity. And in > fact, in those situations adding a new flow to the network _of_ _course_ > means taking away resources from existing flows and using them for the > new one. I remember a slide set from a talk given by Len Kleinrock, > where he pointed out the directive "Keep the line full!". In fact, this > is the very basis for affordable network communication. And this holds > true not only for packet switched networks but for any kind of network; > consider e.g. the telephone system. > > > > The answers given to these questions in the paper are: > > > > 1) Yes, in fact, at high error rates, CETEN flows are more fair to each > > other than normal TCP flows are.
Under CETEN, each flow has its own > > floating point value for beta that's computed from observations of its > > own TCP behavior and some hints on error rates observed by routers; so > > it's safe to say that few flows have the same beta, although for most > > long-lived flows the beta values should be fairly closely grouped. The > > paper has experimental (simulation-based) evidence that acceptable > > fairness can be achieved even if beta isn't totally uniformly > > distributed. > > O.k. > > One remark from my own ns2 experience: Even in simple dumbbell > scenarios, with only a few flows, it takes some time for a number of > flows to reach fairness. > To my understanding, this was due to the fact that often flows are > poorly interleaved and therefore "implicit congestion notification" > (ICN, AKA "dropped packets" ;-)) did not reach all senders at the same > time, and sometimes not all senders received the same number of ICNs. > > I'm not quite sure how exactly this matches real networks, because I > always consider the possibility of simulation artefacts etc. > > Thus, I place weight on the theoretical basis here. Although I play around > with the ns2 a lot (and even my PTE stuff is implemented with the ns2), > and of > course _one_ ns2 simulation may yield a counterexample to disprove an > approach, I personally would not rely too much on simulation results > alone. > > For the traditional AIMD scheme, it's quite easy to see that AIMD > sequences starting from different initial values will end up in the > same sawtooth as long as alpha and beta are chosen equally for all > competing flows. > > In the case of different values for beta, even two competing AIMD sequences > starting from a fair share at the starting point (e.g. both > from zero) would become unfair after the first congestion event and > would never reach fairness again.
If I had a pencil and a piece of paper > here, I would > make a little sketch on that matter and the situation would become > clear within three lines or so ;-) The flow with the smaller beta > is left with a permanently smaller share in the long run. > > Consider the starting value (0,0), beta 1 = 1/2, beta 2 = 1/3. > Congestion event: the sum of the two values is 1. Alpha arbitrary but equally > chosen for both flows. > > Then the sequence is: > Start: (0,0) > Congestion at (1/2, 1/2) > after congestion handling: (1/4, 1/6) > Congestion at (0.542, 0.458) > after congestion handling: (0.271, 0.153) > > etc... > > I think my objection is obvious: with different values for beta, the flows will never > reach a fair share. > > -- > Detlef Bosau > Galileistrasse 30 > 70565 Stuttgart > Mail: detlef.bosau at web.de > Web: http://www.detlef-bosau.de > Mobile: +49 172 681 9937 From sm at mirapoint.com Wed Jun 29 02:03:23 2005 From: sm at mirapoint.com (Sam Manthorpe) Date: Wed, 29 Jun 2005 02:03:23 -0700 (PDT) Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42C24148.ECD1AEC4@attglobal.net> References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> Message-ID: On Tue, 28 Jun 2005, Cannara wrote: > On the error rates issue, mobile is an extreme case, always subject to > difficult conditions in the physical space, so symbol definitions & error > correction are paramount. However, most corporate traffic isn't over mobile > links, but dedicated lines between routers, or radio/optical bridges, etc. > Here, the reality of hardware failures raises its head and we see long-lasting > error rates that are quite small and even content dependent. This is where > TCP's ignorance of what's going on and its machete approach to slowdown are > inappropriate and costly to the enterprise.
> > As an example of the latter, a major telecom company, whose services many of > us are using this instant, called a few years back, asking for help How many (years?) > determining why just some of its offices were getting extremely poor > performance downloading files, like customer site maps, from company servers, > while other sites had great performance. The maps were a few MB and loaded > via SMB/Samba over TCP/IP to staff PCs. The head network engineer was so > desperate, he even put a PC in his car and drove all over Florida checking > sites. This was actually good. But, best of all, he had access to the > company's Distributed Sniffers(r) at many offices and HQ. A few traces told > the story: a) some routed paths from some offices were losing 0.4% of pkts, > while others lost none; b) the lossy paths experienced 20-30% longer > file-download times. By simple triangulation, we decided that he should check > the T3 interface on Cisco box X for errors. Sure enough, about 0.4% error > rates were being tallied. The phone-line folks fixed the problem and voila, > all sites crossing that path were back to speed! > > Now, if you were a network manager for a major corporation, would you rush to > fix a physical problem that generated less than 1% errors, if your boss & > users were complaining about mysterious slowdowns many times larger? > 0.4% wasn't even enough to trigger an alert on their management consoles. You'd > certainly be looking for bigger fish. Well, TCP's algorithms create a bigger > fish -- talk about Henny Penny. :] I can't help but wonder - if TCP/IP were generally so sensitive to a loss of 0.4%, then why does the Internet work? I spent a long time simulating the BSD stack a while back and it held up extremely well under random loss until you hit 10%, at which point things go non-linear. I've also never experienced what you describe, neither as a user nor in my capacity as an engineer debugging customer network problems.
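One standard back-of-envelope helps reconcile the two views here: the well-known square-root throughput approximation (Mathis et al.), rate ≈ (MSS/RTT) * C/sqrt(p). Plugging in the numbers from Alex's anecdote is illustrative only; the model ignores timeouts, delayed ACKs, and the small send window:

```python
from math import sqrt

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.22):
    # Steady-state TCP throughput approximation (Mathis/Semke/Mahdavi/Ott):
    # rate ~ (MSS/RTT) * C / sqrt(p).  Ignores timeouts and window limits.
    return (mss_bytes * 8 / rtt_s) * c / sqrt(loss_rate)

# The 0.4%-loss path from the anecdote: ~1460-byte MSS, ~70 ms RTT.
cap = mathis_throughput_bps(1460, 0.070, 0.004)   # roughly 3.2 Mbit/s
```

So 0.4% random loss caps a single flow around 3 Mbit/s: far below a T3's 45 Mbit/s, yet nowhere near zero, so transfers still complete, just noticeably slower. That fits both the 20-30% slowdown in the anecdote and the observation that the Internet works. (Note also that the anecdote's 6-packet send window would by itself cap the flow near 6*1460*8/0.07 ≈ 1 Mbit/s, so the window, not the loss, may have dominated.)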
And what's with that "major corporation" and "boss" stuff? I'm guessing they'd prefer the "replace the hardware" solution to the "replace the whole infrastructure with something that's incompatible with everything else on the planet" one. > > The files were transferred in many 34kB SMB blocks, which required something > like 23 server pkts per block. The NT servers had a send window of about 6 pkts > (uSoft later increased that to about 12). All interfaces were 100Mb/s, except > the T3 and a couple of T1s, depending on path. RTT was about 70ms for all > paths. So the NT servers were either misconfigured, or your example is rather dated, right? > Thankfully, the Sniffer traces also showed exactly what the TCPs at both ends > were doing, despite Fast Retransmit, SACK, etc.: I don't know a lot about NT's history, but having a 9K window *and* SACK sounds historically schizo. > a) the typical, default > timeouts were knocking the heck out of throughput; b) the fact that transfers > required many blocks of odd numbers of pkts meant that the ACK timer at the > receiver was expiring on every block, waiting (~100ms) for the magical > even-numbered last pkt in the block, which never came. These defaults could > have been changed to gain some performance back, but not much. The basic idea > that TCP should assume congestion = loss was the Achilles' heel. Even the > silly "ack alternate pkts" concept could have been largely automatically > eliminated, if the receiver TCP actually learned that it would always get an > odd number. The issue you describe was fixed a long time ago in most stacks, AFAIAW. I fixed it in IRIX roundabout 6 years ago. For fun, I tried an experiment. I transferred a largish file to my sluggish corporate ftp server. Took 77 seconds (over the Internet, from San Francisco to Sunnyvale). I then did the same thing, this time I unplugged my Ethernet cable 6 times, each time for 4 seconds. The transfer took 131 seconds. Not bad, I think.
At least not bad enough to warrant a rearchitecture. Cheers, -- Sam From detlef.bosau at web.de Wed Jun 29 04:17:39 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 29 Jun 2005 13:17:39 +0200 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> Message-ID: <42C28353.6E5E1861@web.de> Sam Manthorpe wrote: > > As an example of the latter, a major telecom company, whose services many of > > us are using this instant, called a few years back, asking for help > > How many (years?) Alex reminded me of a strange situation I met myself a couple of years ago. However, there is one lesson I've learned meanwhile: It's not the stork that brings the babies ;-) There is a difference between correlation and causality. In other words: It may happen quite often that problems occur at the same time but with no causal relationship. One day, I met severe TCP/IP problems on a WAN line exhibiting a BER of 10^-9, which was higher than specified. However, I thought about the situation a few years later and learned: BER 10^-9 => one packet in 125 MBytes is corrupted => there are about four or five corrupted TCP datagrams when I download an ISO image for the new RedHat Linux distribution. I don't know whether this phrase exists in English as well, but in Germany we call this "beyond good and evil". Four corrupted packets in an ISO image - and please consider, most TCP flows consist of only a few dozen packets. Nobody would ever notice those error rates. This _is_ negligible. I don't know what really caused the trouble. But it surely was not the BER. I sometimes found that those error rates were not the only problem at the time, and more importantly: not the real cause of the problems. A few years ago, we had a Cisco box which definitely scrambled IPX datagrams in certain situations.
This bug was hard to find; in the end we put sniffers at three locations along the path in the company network. However, it could be identified, Cisco fixed the problem, and everything was fine. Software bugs do happen; however, that's not the end of the world. And even more, I can blame no one for software bugs as long as I produce some myself. We had a problem, we identified it, we fixed it - everything was fine and everybody was happy. > > I can't help but wonder - if TCP/IP were generally so sensitive to a loss > of 0.4%, then why does the Internet work? I spent a long time simulating This is my question as well. Just for fun, I simulated TCP flows with packet error rates of 1% to 5%. And as far as I can remember, a 1% packet corruption rate did not really matter. > the BSD stack a while back and it held up extremely well under random > loss until you hit 10% at which point things go non-linear. I've also > never experienced what you describe, neither as a user nor in my > capacity as an engineer debugging customer network problems. > > And what's with that "major corporation" and "boss" stuff? I'm guessing > they'd like the "replace the hardware" solution to the "replace the > whole infrastructure with something that's incompatible with everything else > on the planet" one. Companies do often replace hardware and software if it just fixes the problem. In industrial plants, people often are not interested in the real problem. They want a _fast_ and _cheap_ solution. So, if one says: "It's the Cisco featureset!" and the Cisco box is then replaced by another model - possibly working around the problem as a side effect - everybody is happy about it. It's simply much cheaper to replace even an expensive Cisco box than to have a dozen network consultants looking for the _real_ problem for a few months or so. Perhaps Cisco boxes are a bad example. But we met problems in protocols without flow control - which led to problems in NICs with different buffers.
=> The software was not rewritten; instead, the NIC was replaced. Cheap, works (around the problem), everybody is happy. However, one cannot always derive fundamental problems in TCP from this. And the rationale behind this is an economic one - not a scientific one. However: Does anybody have recent data about e2e packet corruption rates on Internet connections or corporate LANs, even with a large number of hops? I think this would be useful for the discussion here. DB -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From gaylord at dirtcheapemail.com Wed Jun 29 06:42:47 2005 From: gaylord at dirtcheapemail.com (Clark Gaylord) Date: Wed, 29 Jun 2005 09:42:47 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42C28353.6E5E1861@web.de> References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> <42C28353.6E5E1861@web.de> Message-ID: <42C2A557.7010600@dirtcheapemail.com> Detlef Bosau wrote: >Sam Manthorpe wrote: > > >>>As an example of the latter, a major telecom company, whose services many of >>>us are using this instant, called a few years back, asking for help >>> >>> >>How many (years?) >> >> > >Alex reminded me of a strange situation I met myself a couple of years >ago. > > you did? that is strange. what did you say? :-) >One day, I met severe TCP/IP problems on a WAN line exhibiting a BER >10^-9, which was higher than specified. However, I thought about the > > Ok, while we're discussing "corruption-based loss" and weirdness, here's mine: We often talk about bit errors being random. I put it to you that this may not be true. Perhaps it is the traffic data that are the random element and the bit errors are more predictable than we believe. A user called us years ago, when our backbone was an FDDI ring, about a several-megabyte file he could not send to a neighboring building.
He had successfully sent it throughout his LAN, and there were other buildings to which he could send it, but not to this one. He was using ftp. As it turns out, the intended destination was counter-clockwise from him on the ring; all buildings he had successfully sent it to via the backbone were clockwise from him. We did further testing with the user and found that, in fact, there were no buildings to which he could send this file that were counter-clockwise on the ring. Weird. So, we split the file in half and found that one piece would successfully traverse the ring, the other would not. And so we continued via binary search, splitting the unsuccessful piece until we had a piece of the file with a few hundred bytes that were the problem. Out of the entire several-megabyte file, these few hundred bytes absolutely could not be convinced to traverse the ring counter-clockwise from this building, yet could travel anywhere else just fine. If we tried to send a packet with these data, the FDDI interface would always accumulate an error. We sent out a field tech with an alcohol swab, and fixed the problem. The conjecture is that there was a particular bit pattern that would reliably get corrupted by the reflections on this fiber. Cleaning the fiber fixed the problem. --ckg From cannara at attglobal.net Wed Jun 29 10:51:53 2005 From: cannara at attglobal.net (Cannara) Date: Wed, 29 Jun 2005 10:51:53 -0700 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> <42C28353.6E5E1861@web.de> Message-ID: <42C2DFB9.170231FF@attglobal.net> Detlef, on the questions relating to Internet loss, you should talk to the folks at SLAC, who've been doing Ping-around-the-world for years.
Try Les Cottrell, who may be listening. Not long ago, peering points, especially back East, were losing 30% of frames during busy periods. Don't know what that is today. Alex Detlef Bosau wrote: > > Sam Manthorpe wrote: > > > > As an example of the latter, a major telecom company, whose services many of > > > us are using this instant, called a few years back, asking for help > > > > How many (years?) > > Alex reminded me of a strange situation I met myself a couple of years > ago. > However, there is one lesson I've learned meanwhile: It's not the > stork that brings the babies ;-) > There is a difference between correlation and causality. > > In other words: It may happen quite often that problems occur at the > same time but with no causal relationship. > One day, I met severe TCP/IP problems on a WAN line exhibiting a BER of > 10^-9, which was higher than specified. However, I thought about the > situation a few years later and learned: BER 10^-9 => one packet in 125 > MBytes is corrupted => there are about four or five corrupted TCP > datagrams when I download an ISO image for the new RedHat Linux > distribution. > > I don't know whether this phrase exists in English as well, but in > Germany we call this "beyond good and evil". > > Four corrupted packets in an ISO image - and please consider, most TCP > flows consist of only a few dozen packets. > > Nobody would ever notice those error rates. This _is_ negligible. > > I don't know what really caused the trouble. But it surely was not the > BER. > > I sometimes found that those error rates were not the only problem at the > time, and more importantly: not the real cause of the problems. A few years > ago, we had a Cisco box which definitely scrambled IPX datagrams in > certain situations. This bug was hard to find; in the end we put sniffers at > three locations along the path in the company network. However, it could > be identified, Cisco fixed the problem, and everything was fine.
> > Software bugs do happen, however that's not the end of the world. And > even more, I can blame no one for software bugs as long as I produce > ones myself. > > We had a problem, we identified it, we fixed it - everything was fine and > everybody was happy. > > > > I can't help but wonder - if TCP/IP were generally so sensitive to a loss > > of 0.4%, then why does the Internet work? I spent a long time simulating > > This is my question as well. Just for fun, I simulated TCP flows with > packet error rates of 1% to 5%. > > And as far as I can remember, a 1% packet corruption rate did not really > matter. > > > the BSD stack a while back and it held up extremely well under random > > loss until you hit 10% at which point things go non-linear. I've also > > never experienced what you describe, neither as a user nor in my > > capacity as engineer debugging customer network problems. > > > > And what's with that "major corporation" and "boss" stuff? I'm guessing > > they'd like the "replace the hardware" solution to the "replace the > > whole infrastructure with something that's incompatible with everything else > > on the planet" one. > > Companies do often replace hardware and software, if it only fixes the > problem. > > In industrial plants, people often are not interested in the real > problem. They want a _fast_ and _cheap_ solution. So, if one says: "It's > the Cisco featureset!" > and then the Cisco box is replaced by another model - possibly working > around the problem as a side effect, everybody is happy about it. > It's simply much cheaper to replace even an expensive Cisco box than to > have a dozen network consultants looking after the _real_ problem for a few > months or so. > > Perhaps Cisco boxes are a bad example. But we met problems in protocols > without flow control - which led to problems in NICs with different > buffers. > => Not the software was rewritten, but the NIC was replaced. > > Cheap, works (around the problem), everybody is happy.
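[Editorial note: for a rough sense of why even sub-1% random loss can matter on a long path while barely registering on a short one, one can plug numbers into the steady-state throughput bound of Mathis, Semke, Mahdavi and Ott (1997), BW <= (MSS/RTT) * sqrt(3/2)/sqrt(p). A sketch, using the 70 ms RTT from Alex's T3 example; the 1460-byte MSS is an assumption.]

```python
import math

def mathis_bw(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state TCP throughput bound in bits/s
    (Mathis et al. 1997): BW <= (MSS/RTT) * sqrt(3/2) / sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)

mss, rtt = 1460, 0.070   # bytes, seconds (MSS assumed; RTT from the thread)
for p in (0.004, 0.01, 0.05):
    print(f"p = {p:.1%}: about {mathis_bw(mss, rtt, p)/1e6:.1f} Mbit/s")
```

At 0.4% loss a single 70 ms flow is capped at roughly 3 Mbit/s, far below a T3's 45 Mbit/s, while the same loss rate at LAN-scale RTTs is invisible; this is consistent both with Alex's observed slowdown and with the observation that low loss rates "did not really matter" in other settings.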
> > However, one cannot always derive fundamental problems in TCP from this. > > And the rationale behind this is an economical one - not a scientific > one. > > However: Does anybody have recent data about e2e packet corruption rates > in Internet connections or corporate LANs, even with a large number of > hops? > > I think, this would be useful for the discussion here. > > DB > > -- > Detlef Bosau > Galileistrasse 30 > 70565 Stuttgart > Mail: detlef.bosau at web.de > Web: http://www.detlef-bosau.de > Mobile: +49 172 681 9937 From cannara at attglobal.net Wed Jun 29 11:07:30 2005 From: cannara at attglobal.net (Cannara) Date: Wed, 29 Jun 2005 11:07:30 -0700 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> <42C28353.6E5E1861@web.de> <42C2A557.7010600@dirtcheapemail.com> Message-ID: <42C2E362.A1F03AC7@attglobal.net> Good one Clark! Indeed FDDI and other fiber rings have dual interfaces & fibers and anything can happen in any part of the hardware. The assumptions about the physical layer that have been made in most TCP discussions simply evidence lack of understanding of the reality and complexity of underlying layers. This lack extends to the length of time the defects last and go undiscovered, while folks struggle with peformance issues. It would have been interesting if your tech had disconnected one half of one interface and made the ring wrap! Alex > Clark Gaylord wrote: > > Detlef Bosau wrote: > > > Sam Manthorpe wrote: > > > > > >> > As an example of the latter, a major telecom company, whose services > >> > many of > >> > us are using this instant, called a few years back, asking for help > >> > > >> > > >> How many (years?) > >> > >> > > Alex reminded me on a strange situation, I met myself a couple of years > > ago. > > > > > you did? that is strange. what did you say? 
:-) > > One day, I met strong TCP/IP problems on a WAN line exhibiting a BER > > 10^-9, which was more than specified. However, I have thought about the > > > > > Ok, while we're discussing "corruption-based loss" and weirdness, here's > mine: > > We often talk about bit errors being random. I put it to you that this may > not be true. Perhaps it is the traffic data that are the random element and > the bit errors are more predictable than we believe. > > A user called us years ago, when our backbone was a FDDI ring, about a > several megabyte file he could not send to a neighboring building. He had > successfully sent it throughout his LAN, and there were other buildings to > which he could send it, but not to this one. He was using ftp. As it turns > out, the intended destination was counter-clockwise from him on the ring; > all buildings he had successfully sent it to via the backbone were clockwise > from him. We did further testing with the user and found that, in fact, > there were no buildings to which he could send this file that were > counter-clockwise on the ring. Weird. So, we split the file in half and > found that one piece would successfully traverse the ring, the other would > not. And so we continued via binary search splitting the unsuccessful piece > until we had a piece of the file with a few hundred bytes that were the > problem. Out of the entire several megabyte file, these few hundred bytes > absolutely could not be convinced to traverse the ring counter-clockwise > from this building, yet could travel anywhere else just fine. If we tried > to send a packet with these data, the FDDI interface would always accumulate > an error. > > We sent out a field tech with an alcohol swab, and fixed the problem. > > The conjecture is that there was a particular bit pattern that would > reliably get corrupted by the reflections on this fiber. Cleaning the fiber > fixed the problem.
> > --ckg From braden at ISI.EDU Wed Jun 29 11:20:01 2005 From: braden at ISI.EDU (Bob Braden) Date: Wed, 29 Jun 2005 11:20:01 -0700 (PDT) Subject: [e2e] Reacting to corruption based loss Message-ID: <200506291820.LAA22780@gra.isi.edu> *> with a few hundred bytes that were the problem. Out of the entire *> several megabyte file, these few hundred bytes absolutely could not be *> convinced to traverse the ring counter-clockwise from this building, yet *> could travel anywhere else just fine. If we tried to send a packet with *> these data, the FDDI interface would always accumulate an error. *> *> We sent out a field tech with an alcohol swab, and fixed the problem. *> *> The conjecture is that there was a particular bit pattern that would *> reliably get corrupted by the reflections on this fiber. Cleaning the *> fiber fixed the problem. *> *> --ckg GREAT story! Bob Braden From braden at ISI.EDU Wed Jun 29 11:43:40 2005 From: braden at ISI.EDU (Bob Braden) Date: Wed, 29 Jun 2005 11:43:40 -0700 (PDT) Subject: [e2e] Reacting to corruption based loss Message-ID: <200506291843.LAA22838@gra.isi.edu> *> *> Our experience was that for AMPS we knocked the baud rate down as low as *> possible to 4.8K and abandoned the use of TCP/IP altogether as we could *> hit packet rate losses of 40% quite regularly and dropped connections. *> *> We implemented a protocol on top of UDP where we just used a sliding *> window (from memory about 16) and knocked the packet size to a small *> size (about 100 bytes max) and allowed out of order packets but with a *> timeout retransmission. *> How does this differ from TCP with an MTU of 100 bytes and using SACK? (ie where's the magic?)
Bob Braden From cannara at attglobal.net Wed Jun 29 11:47:04 2005 From: cannara at attglobal.net (Cannara) Date: Wed, 29 Jun 2005 11:47:04 -0700 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> Message-ID: <42C2ECA8.734D170C@attglobal.net> Good response Sam. The kind that leads to more thought, in fact. How many years ago for the 1st example, you ask. For that one, 6. For the one this year, 0.25 year. :] You say you "spent a long time simulating the BSD stack". That's great, and part of the problem. Folks do simulations which are based on code written to simulate someone's ideas on how something works. Then, they believe the simulations, despite what's actually seen in reality. We all know that simulators and their use can be very limited in relevance, if not accuracy. One of the biggest issues is lack of release control for things as important as Internet protocols (e.g., TCP). Thus the NT server may have a different version of TCP from that on the user's spanking new PC. No one ever addresses even the basics of stack parameter settings in their manuals, and network staffers rarely have the time to go in and check versions, timer settings, yadda, yadda. This is indeed why many performance problems occur. You fixed IRIX 6 years ago. Great. Now, why does the Internet work? Not simply because of TCP, for sure. Your experiment illustrates the rush to acceptance these points are raised against: "I transfered a largish file to my sluggish corporate ftp server. Took 77 seconds (over the Internet, from San Francisco to Sunnyvale). I then did the same thing, this time I unplugged my Ethernet cable 6 times, each time for 4 seconds. The transfer took 131 seconds." So, what is "largish" in more precise terms? What are the RTT and limiting bit-rate of your "Internet" path from SF to S'vale? The file evidently went right by our house! 
But, despite the imprecision, we can use your result: 77 + 6 x 4 = 101. Your transfer actually took 131 seconds, fully 30% more than one would expect on a link that's simply interrupted, not congested. Good experiment! Alex Sam Manthorpe wrote: > > On Tue, 28 Jun 2005, Cannara wrote: > > > On the error rates issue, mobile is an extreme case, always subject to > > difficult conditions in the physical space, so symbol definitions & error > > correction are paramount. However, most corporate traffic isn't over mobile > > links, but dedicated lines between routers, or radio/optical bridges. etc. > > Here, the reality of hardware failures raises its head and we see long-lasting > > error rates that are quite small and even content dependent. This is where > > TCP's ignorance of what's going on and its machete approach to slowdown are > > inappropriate and costly to the enterprise. > > > > As an example of the latter, a major telecom company, whose services many of > > us are using this instant, called a few years back, asking for help > > How many (years?) > > > determining why just some of its offices were getting extremely poor > > performance downloading files, like customer site maps, from company servers, > > while other sites had great performance. The maps were a few MB and loaded > > via SMB/Samba over TCP/IP to staff PCs. The head network engineer was so > > desperate, he even put a PC in his car and drove all over Florida checking > > sites. This was actually good. But, best of all, he had access to the > > company's Distributed Sniffers(r) at many offices and HQ. A few traces told > > the story: a) some routed paths from some offices were losing 0.4% of pkts, > > while others lost none; b) the lossy paths experienced 20-30% longer > > file-download times. By simple triangulation, we decided that he should check > > the T3 interface on Cisco box X for errors. Sure enough, about 0.4% error > > rates were being tallied. 
The phone-line folks fixed the problem and voila, > > all sites crossing that path were back to speed! > > > > Now, if you were a network manager for a major corporation, would you rush to > > fix a physical problem that generated less than 1% errors, if your boss & > > users were complaining about mysterious slowdowns many times larger? > > 0.4% wasn't even enough to trigger an alert on their management consoles. You'd > > certainly be looking for bigger fish. Well, TCP's algorithms create a bigger > > fish -- talk about Henny Penny. :] > > I can't help but wonder - if TCP/IP were generally so sensitive to a loss > of 0.4%, then why does the Internet work? I spent a long time simulating > the BSD stack a while back and it held up extremely well under random > loss until you hit 10% at which point things go non-linear. I've also > never experienced what you describe, neither as a user nor in my > capacity as engineer debugging customer network problems. > > And what's with that "major corporation" and "boss" stuff? I'm guessing > they'd like the "replace the hardware" solution to the "replace the > whole infrastructure with something that's incompatible with everything else > on the planet" one. > > > > > The files were transferred in many 34kB SMB blocks, which required something > > like 23 server pkts per. The NT servers had a send window of about 6 pkts > > (uSoft later increased that to about 12). All interfaces were 100Mb/s, except > > the T3 and a couple of T1s, depending on path. RTT was about 70mS for all > > paths. > > So the NT servers were either misconfigured, or your example is rather > dated, right? > > > Thankfully, the Sniffer traces also showed exactly what the TCPs at both ends > > were doing, despite Fast Retransmit, SACK, etc.: > > I don't know a lot about NT's history, but having a 9K window *and* SACK sounds > historically schizo.
> > > a) the typical, default > > timeouts were knocking the heck out of throughput; b) the fact that transfers > > required many blocks of odd numbers of pkts meant that the Ack Timer at the > > receiver was expiring on every block, waiting (~100mS) for the magical > > even-numbered last pkt in the block, which never came. These defaults could > > have been changed to gain some performance back, but not much. The basic idea > > that TCP should assume congestion = loss was the Achilles' heel. Even the > > silly "ack alternate pkts" concept could have been largely automatically > > eliminated, if the receiver TCP actually learned that it would always get an > > odd number. > > The issue you describe was fixed a long time ago in most stacks, AFAIAW. > I fixed it in IRIX around about 6 years ago. > > For fun, I tried an experiment. I transferred a largish file to my sluggish > corporate ftp server. Took 77 seconds (over the Internet, from San Francisco > to Sunnyvale). I then did the same thing, this time I unplugged my Ethernet > cable 6 times, each time for 4 seconds. The transfer took 131 seconds. > Not bad, I think. At least not bad enough to warrant a rearchitecture. > > Cheers, > -- Sam From eblanton at cs.ohiou.edu Wed Jun 29 12:18:53 2005 From: eblanton at cs.ohiou.edu (Ethan Blanton) Date: Wed, 29 Jun 2005 14:18:53 -0500 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42C2E362.A1F03AC7@attglobal.net> References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> <42C28353.6E5E1861@web.de> <42C2A557.7010600@dirtcheapemail.com> <42C2E362.A1F03AC7@attglobal.net> Message-ID: <20050629191853.GA30771@colt.internal> Alex, Cannara spake unto us the following "wisdom": > Good one Clark! Indeed FDDI and other fiber rings have dual interfaces & > fibers and anything can happen in any part of the hardware.
The assumptions > about the physical layer that have been made in most TCP discussions simply > evidence lack of understanding of the reality and complexity of underlying > layers. This lack extends to the length of time the defects last and go > undiscovered, while folks struggle with performance issues. Let me get this straight, for my benefit and for the benefit of those who may not understand you. You're suggesting that TCP should have a mechanism by which the hardware layer can communicate that there is a particular sequence of bits which, due to physical imperfections in the transmission medium, cannot be reliably transmitted, and that given this information the TCP stack could choose a different method of communicating those particular bits? I can't believe no one has thought of implementing this richness of signalling before now. This is certainly an inherent flaw in the TCP design. Ethan (On a side note: I keep hearing that the Internet is completely broken and could never possibly work. Why is it that certain emails to the e2e mailing list *unfailingly* reach my mailbox under these circumstances?) -- The laws that forbid the carrying of arms are laws [that have no remedy for evils]. They disarm only those who are neither inclined nor determined to commit crimes. -- Cesare Beccaria, "On Crimes and Punishments", 1764 -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050629/1852f955/attachment.bin From detlef.bosau at web.de Wed Jun 29 13:32:49 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 29 Jun 2005 22:32:49 +0200 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> <42C28353.6E5E1861@web.de> <42C2DFB9.170231FF@attglobal.net> Message-ID: <42C30571.ECA183B3@web.de> Cannara wrote: > > Detlef, on the questions relating to Internet loss, you should talk to the > folks at SLAC, who've been doing Ping-around-the-world for years. Try Les > Cottrell, who may be listening. It was, not long ago, that peering points, > especially back East, were losing 30% of frames for busy periods. Don't know > what that is today. I'm not quite sure, but with respect to the list topic: was this _corruption_ based loss? Or _congestion_ based loss? (Or did "busy" mean, the cable guys were busy cleaning the NICs? *SCNR*) DB -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From cottrell at slac.stanford.edu Wed Jun 29 13:37:01 2005 From: cottrell at slac.stanford.edu (Cottrell, Les) Date: Wed, 29 Jun 2005 13:37:01 -0700 Subject: [e2e] Reacting to corruption based loss Message-ID: <35C208A168A04B4EB99D1E13F2A4DB01DA1D48@exch-mail1.win.slac.stanford.edu> Yes I am a lurker. Within the developed world (US, Canada, Europe, Japan, Korea), losses today are usually << 1% on the networks connecting academic and research institutions. However, there are considerable losses seen to and within developing regions. See for example slides 5, 8, 9, 10, 11, 12, 13 of http://www.slac.stanford.edu/grp/scs/net/talk05/i2-members-may05.ppt.
For more details see the Jan 2005 Report of the ICFA/SCIC Monitoring Working Group at http://www.slac.stanford.edu/xorg/icfa/icfa-net-paper-jan05/ Note this data only refers to losses using ping; it does not address whether the packets are lost to congestion or BER or whatever. -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Cannara Sent: Wednesday, June 29, 2005 10:52 AM To: end2end-interest at postel.org Subject: Re: [e2e] Reacting to corruption based loss Detlef, on the questions relating to Internet loss, you should talk to the folks at SLAC, who've been doing Ping-around-the-world for years. Try Les Cottrell, who may be listening. It was, not long ago, that peering points, especially back East, were losing 30% of frames for busy periods. Don't know what that is today. Alex Detlef Bosau wrote: > However: Does anybody have recent data about e2e packet corruption > rates in Internet connections or corporate LANs, even with a large > number of hops? > > I think this would be useful for the discussion here. > > DB > > -- > Detlef Bosau > Galileistrasse 30 > 70565 Stuttgart > Mail: detlef.bosau at web.de > Web: http://www.detlef-bosau.de > Mobile: +49 172 681 9937 From iam4 at cs.waikato.ac.nz Wed Jun 29 14:17:29 2005 From: iam4 at cs.waikato.ac.nz (Ian McDonald) Date: Thu, 30 Jun 2005 09:17:29 +1200 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <200506291843.LAA22838@gra.isi.edu> References: <200506291843.LAA22838@gra.isi.edu> Message-ID: <42C30FE9.4070505@cs.waikato.ac.nz> Bob Braden wrote: > *> > *> Our experience was that for AMPS we knocked the baud rate down as low as > *> possible to 4.8K and abandoned the use of TCP/IP altogether as we could > *> hit packet rate losses of 40% quite regularly and dropped connections.
> *> > *> We implemented a protocol on top of UDP where we just used a sliding > *> window (from memory about 16) and knocked the packet size to a small > *> size (about 100 bytes max) and allowed out of order packets but with a > *> timeout retransmission. > *> > > How does this differ from TCP with an MTU of 100 bytes and using SACK? (ie > where's the magic?) > > Bob Braden > This was a little while ago and I'm pretty sure the machines didn't have SACK and we couldn't alter parameters on TCP stack for some parts i.e. it was easier to write our own protocol. As you say maybe not the best option today. Ian From dpreed at reed.com Mon Jun 27 03:55:24 2005 From: dpreed at reed.com (David P. Reed) Date: Mon, 27 Jun 2005 06:55:24 -0400 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42BF814A.C4948522@attglobal.net> References: <20050626001318.CB7A424D@aland.bbn.com> <42BEE8D7.D2FFC9E6@attglobal.net> <42BF2738.AD082D57@web.de> <42BF814A.C4948522@attglobal.net> Message-ID: <42BFDB1C.2010009@reed.com> Cannara writes: > a typical TCP will be brought to its knees by a few % packet losses that are simply due to hardware errors. A very good point! I'll raise you one, Alex: I really think that *software* corruption should be treated differently than *hardware* corruption. TCP really gets brought to its knees by software errors, such as buffer overflows, deadlocks, etc. I mean really, those complete turkeys who designed TCP got congestion wrong, hardware wrong, and software wrong. Responding to software errors requires a completely different error recovery technique - I suggest a remote router reboot is usually needed. So we need to characterize the errors so that the link layer can discriminate and do the right thing. What we need is a whole new protocol that is based on characterizing packet corruption. 
Each router is to determine the cause of corruption: hardware corruption can be detected by software, software corruption can be detected by hardware, and congestion can be inserting an optical fiber into the nasal passages of the network consultant. It shouldn't be too difficult to obtain six sigma precise characterization of packet corruption in this way. Of course, financially derived corruption and corruption by power (or absolute power) are possible, but those can be dealt with on an end-to-end basis, as we know that network consultants such as Mr. Cannara are omniscient and incorruptible. Really, this could become a whole new subdiscipline of communications theory - the general systems theory of corruption and the practice of switch-based corruption-characterization. Perhaps we could invent devices that sense the sources of corruption by detecting the spin of the photons that come out of the fiber... From mbgreen at dsl.cis.upenn.edu Wed Jun 29 17:13:32 2005 From: mbgreen at dsl.cis.upenn.edu (mbgreen@dsl.cis.upenn.edu) Date: Wed, 29 Jun 2005 20:13:32 EDT Subject: [e2e] Reacting to corruption based loss In-Reply-To: Your message of "Tue, 28 Jun 2005 21:34:29 +0200." <42C1A645.F4A563F2@web.de> Message-ID: <200506300013.j5U0DWQn027542@codex.cis.upenn.edu> Tue, 28 Jun 2005 21:34:29 +0200 Detlef Bosau For the traditional AIMD scheme, it's quite easy to see that AIMD sequences starting from different initial values will end up in the same sawtooth as long as alpha and beta are equally chosen for all competing flows. [Digression not directly relevant to your main argument:] I don't think this is "easy to see" if the RTTs of the competing flows are different --- I don't even think it is necessarily true that they end up in the same sawtooth. In simulations it is easy to have many flows with exactly the same RTT. In the real world this is less common.
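[Editorial note: the convergence claim under discussion, that equal-RTT AIMD flows with common alpha and beta settle into the same sawtooth, can be illustrated with a toy model. Synchronized losses and identical RTTs are assumed, which is exactly the simplification the digression questions; this is not a full TCP simulation.]

```python
# Toy AIMD: two equal-RTT flows share a link of capacity C packets.
# Each RTT both windows grow by alpha; when their sum exceeds C, both
# back off by beta (a shared loss event). Additive increase preserves
# the gap between the windows; multiplicative decrease shrinks the gap
# by a factor of beta per loss event, so unequal starts converge.
alpha, beta, capacity = 1.0, 0.5, 100.0
w1, w2 = 80.0, 5.0            # deliberately unfair initial windows

for _ in range(2000):         # 2000 simulated RTTs
    w1 += alpha
    w2 += alpha
    if w1 + w2 > capacity:    # synchronized congestion signal
        w1 *= beta
        w2 *= beta

print(abs(w1 - w2))           # gap has decayed to (numerically) zero
```

With different RTTs the flows add alpha at different rates per unit time, the fixed point shifts in favour of the short-RTT flow, and the common-sawtooth argument no longer goes through — which is why the equal-RTT assumption deserves the flag it gets above.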
From craig at aland.bbn.com Wed Jun 29 18:24:01 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Wed, 29 Jun 2005 21:24:01 -0400 Subject: [e2e] AIMD synchronization Re: Reacting to corruption based loss In-Reply-To: Your message of "Wed, 29 Jun 2005 20:13:32 EDT." <200506300013.j5U0DWQn027542@codex.cis.upenn.edu> Message-ID: <20050630012401.4A13F1FF@aland.bbn.com> In message <200506300013.j5U0DWQn027542 at codex.cis.upenn.edu>, mbgreen at dsl.cis.upenn.edu writes: > Tue, 28 Jun 2005 21:34:29 +0200 > Detlef Bosau > > For the traditional AIMD scheme, it's quite easy to see that AIMD > sequences starting from different initial values will end up in the > same sawtooth as long as alpha and beta are equally chosen for all > competing flows. > >[Digression not directly relevant to your main argument:] > >I don't think this is "easy to see" if the RTTs of the competing flows >are different --- I don't even think it is necessarily true that they >end up in the same sawtooth. In simulations it is easy to have many >flows with exactly the same RTT. In the real world this is less >common. It has been a long time since we've discussed synchronization of clock type issues so I'm trying to reconstruct the issues here from memory. As I recall there are two important factors here: * congestion loss occurs when a queue's capacity is exceeded -- such that several datagrams arriving in quick succession will mostly suffer loss (a few will be lucky enough to arrive when a spot in the queue has just opened). Assuming these datagrams are from different TCPs -- this is a synchronizing event. At the moment of loss, all of them effectively enter congestion recovery -- albeit with different timings related to their RTTs and RTOs.
* if you view transmission times on a link (or spots in a queue) as "success" events -- there's reason to believe that senders over time synchronize their sending rate with successful events in routers in the path (e.g., if you send a packet through successfully you get an ack, which causes you to send a new packet at a time when success is likely). There's a note from Van c. 1987 on this subject that I think I still have and will forward if I can find it. Craig From craig at aland.bbn.com Wed Jun 29 18:29:05 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Wed, 29 Jun 2005 21:29:05 -0400 Subject: [e2e] VJ re: Why small pieces sink to the bottom of the bag Message-ID: <20050630012905.082271FF@aland.bbn.com> Here's the note I mentioned. It turns out to be a private note to a few folks in response to a question I asked Van. But I believe the note has been circulated more widely in the past, so I'll risk resending it now. (I owe Van some good brandy if he's displeased -- I still find the note useful reading 18 years later). Craig E-mail: craig at aland.bbn.com or craig at bbn.com ------- Forwarded Message To: Craig Partridge ... Subject: Re: Why small pieces sink to the bottom of the bag In-Reply-To: Your message of Mon, 05 Oct 87 12:32:10 D. Date: Tue, 06 Oct 87 02:54:30 PDT From: Van Jacobson Craig - I suspect I'd need to draw some pictures to come up with a convincing explanation. Maybe we can find some time at the task force meeting to sit down together and sketch what I think is happening (preferably with a bottle of brandy nearby to lubricate our intuition). In the interim, I'll try to confuse the issue with some words. After looking at lots of trace data, I've started to picture time at a gateway as being divided into fairly distinct buckets. Those sawtooth patterns of rtt vs. t say that a gateway operates by accumulating packets for a while, sending none of them, then suddenly sends them as fast as it can, then goes back to accumulating.
(The start of each accumulation cycle is just after the rtt takes its big jump up.) Both inspection and Fourier analysis say that the jumps up happen at approximately equal intervals so you can view this as a constant, periodic structure and ask if there's any fine structure in it. The question is interesting because the "phase" of packet arrivals with respect to the start of the accumulation cycle determines the drop probability: A packet that arrives early in the cycle has a low probability of being dropped (since little has been accumulated and the queues are empty) and a packet that arrives late in the cycle has a high drop probability (if the traffic intensity is on the order of the available buffering). One can now ask two questions: 1) Do clumps of packets from the same tcp conversation tend to stay close together in time? (If the answer is yes, we will see clumps of packets on the order of the conversation's window size in the queue. If no, packets from different conversations will be randomly intermixed, per-conversation "phase" won't have any meaning, and drop probability will simply increase with window size). 2) If the answer to (1) is yes, is there any process that migrates clumps based on their size? From experimental data, the answer to (1) is yes, packets from the same conversation tend to stay together and gateway queues are very stratified. The "force" that drives this seems to be that most protocol implementations are more efficient processing bursts of packets than isolated packets (partly because of fewer boundary crossings / context switches and partly because of delayed acks). This means that if the 2nd packet of a burst and an isolated packet pop out of the receiver-side gateway at about the same time, the ack for the burst packet makes it back first. This means that on the next round, the burst packet is ahead (earlier) in the gateway queue.
Thus bursts tend to move to the front (empty queue) part of the cycle and isolated packets toward the tail (full queue) part. (I have actual data that shows this happening: If the conversations go on long enough, the queue in the gateway gets perfectly sorted into descending window size order.) So, getting back to cookies, necessary and sufficient conditions for the stratification that Phys Rev talked about to occur are 1) a well defined container (the cookie box; an accumulation cycle) 2) different size pieces (whole and broken cookies; different window sizes) 3) a "force" that defines "up" and "down" for the container (gravity; the "surface tension" that keeps packets of a burst together combined with the exclusion principle that says two packets can't occupy the same slot in the queue). All the conditions hold (if the conversations are long enough for the relatively weak "clumping" force to have time to organize the queue) so you end up with the small pieces at the bottom. I.e., packets from the small-window conversations tend to arrive after the big-window conversations have filled the queue. Thus the small window guys preferentially get SQ'd and dropped. Hope that confuses things. - Van ------- End of Forwarded Message From perfgeek at mac.com Wed Jun 29 20:03:14 2005 From: perfgeek at mac.com (rick jones) Date: Wed, 29 Jun 2005 20:03:14 -0700 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42C24148.ECD1AEC4@attglobal.net> References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> Message-ID: On Jun 28, 2005, at 11:35 PM, Cannara wrote: > > Now, if you were a network manager for a major corporation, would you > rush to > fix a physical problem that generated less than 1% errors, if your > boss & > users were complaining about mysterious slowdowns many times larger? > 0.4% > wasn't even enough to trigger an alert on their management consoles. 
> You'd > certainly be looking for bigger fish. Well, TCP's algorithms create a > bigger > fish -- talk about Henny Penny. :] > > The files were transferred in many 34kB SMB blocks, which required > something > like 23 server pkts per. The NT servers had a send window of about 6 > pkts > (uSoft later increased that to about 12). All interfaces were > 100Mb/s, except > the T3 and a couple of T1s, depending on path. RTT was about 70mS for > all > paths. > > Thankfully, the Sniffer traces also showed exactly what the TCPs at > both ends > were doing, despite Fast Retransmit, SACK, etc.: a) the typical, > default > timeouts were knocking the heck out of throughput; with a send window of only 6 packets, and a synchronous request/response protocol like SMB (IIRC) it would seem that fast rtx wouldn't have had much of a chance anyway > b) the fact that transfers > required many blocks of odd numbers of pkts meant that the Ack Timer at > the > receiver was expiring on every block, waiting (~100mS) for the magical > even-numbered last pkt in the block, which never came. Why on earth should that have mattered unless perhaps the sending TCP had a broken implementation of Nagle that was going segment by segment rather than send by send? rick jones Wisdom teeth are impacted, people are affected by the effects of events From detlef.bosau at web.de Thu Jun 30 09:03:00 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 30 Jun 2005 18:03:00 +0200 Subject: [e2e] Reacting to corruption based loss References: <200506300013.j5U0DWQn027542@codex.cis.upenn.edu> Message-ID: <42C417B4.8B26D4B7@web.de> mbgreen at dsl.cis.upenn.edu wrote: > > Tue, 28 Jun 2005 21:34:29 +0200 > Detlef Bosau > > For the traditional AIMD scheme, it's quite easy to see that AIMD > sequences starting from different initial values will end up in the > same sawtooth as long as alpha and beta are equally chosen for all > competing flows.
> > [Digression not directly relevant to your main argument:] > > I don't think this is "easy to see" if the RTTs of the competing flows > are different --- I don't even think it is necessarily true that they > end up in the same sawtooth. In simulations it is easy to have many > flows with exactly the same RTT. In the real world this is less > common. You're correct. However, for competing flows, I assume identical RTT. Of course, this is a simplification. And it depends on how you define "competing flows". If the senders of the competing flows reside on node A and the receivers reside on node B and the paths taken by the packets are identical, these "competing flows" have the same RTT. However, if you consider the typical dumbbell scenario, sources at S1, S2, receivers at R1 and R2 respectively, the two "backbone routers" being B1, B2, then B1-B2 may be the bottleneck, the flows S1-R1 and S2-R2 may be competing and the RTTs of the two may differ greatly. There are funny studies around about how competing flows behave with different values for alpha and beta, and I really think they can be used for a very nice screensaver. I'm still looking for an appropriate choice of colours ;-) In fact, it is essentially the old problem with "theory" and "practice". However, what are the alternatives? Basically two. We either do a centralized congestion control or a decentralized congestion control. If you have a centralized congestion control, which need not be controlled by a single physical node but by some centralized mechanism, reservation scheme etc., then we can all refer to Srinivasan Keshav's PhD dissertation and anything will be fine. If you have a decentralized congestion control scheme, it's a little bit religious. (Isaiah 53:6; NIV) "We all, like sheep, have gone astray, each of us has turned to his own way." (Hm, which was the basis for Handel's Messiah? King James?
Sounds nicer :-) I just put the CD in my player :-) Handel's "Fugato" as an allegory for flows, competing for the audience's attention :-) Even different network load situations are modeled by the music's dynamics :-) The situation is similar to Alex's criticism of the dumbbell scenario some time ago: What is an adequate system model? Not only for the ivory tower but for engineers as well? I don't know - but who am I? So, I use what I have. Dumbbell, AIMD, strange system models. E.g.: Competing flows have similar RTT. (Or better: _Comparable_ flows have similar RTT? What do we really want, if flows are not somewhat "comparable".) And honestly: I'm much too stupid to even understand AIMD. DB -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From touch at ISI.EDU Thu Jun 30 11:11:41 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 30 Jun 2005 11:11:41 -0700 Subject: [e2e] Receiving RST on a MD5 TCP connection. In-Reply-To: <20050627190212.52306.qmail@web53703.mail.yahoo.com> References: <20050627190212.52306.qmail@web53703.mail.yahoo.com> Message-ID: <42C435DD.1040702@isi.edu> See http://www.ietf.org/internet-drafts/draft-ietf-tcpm-tcp-antispoof-01.txt This includes a summary of the issues, and alternate approaches. Joe Tapan Karwa wrote: > Hi, > > I was going through RFC 2385 - Protection of BGP > Sessions via the TCP MD5 Signature Option > > In Section 4.1, it mentions > "Similarly, resets generated by a TCP in response to > segments sent on a stale connection will also be > ignored. Operationally this can be a problem since > resets help BGP recover quickly from peer crashes." > > This can easily happen in the following scenario: > XX is talking to YY and both are using MD5. YY > suddenly reboots but XX does not know about it yet. XX > sends the next segment to YY with the MD5 digest but > YY does not recognize it and hence sends a RST.
Of > course this RST segment does not have the MD5 digest. > > Even when XX receives the RST, it won't/can't close the > connection since it will trash the packet as it does > not have the MD5 digest. > > I was wondering if there is any solution to this > problem. Will it be correct to accept the RST even if > the MD5 digest is missing? If we do that, can that > open doors for some other attacks? > > Thanks, > tapan. > > > > ____________________________________________________ > Yahoo! Sports > Rekindle the Rivalries. Sign up for Fantasy Football > http://football.fantasysports.yahoo.com -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050630/05d6903d/signature.bin From cannara at attglobal.net Thu Jun 30 11:33:57 2005 From: cannara at attglobal.net (Cannara) Date: Thu, 30 Jun 2005 11:33:57 -0700 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> <42C28353.6E5E1861@web.de> <42C2A557.7010600@dirtcheapemail.com> <42C2E362.A1F03AC7@attglobal.net> <20050629191853.GA30771@colt.internal> Message-ID: <42C43B15.62EC968D@attglobal.net> Ethan, for someone who quotes Beccaria (a paesano of my family), you clearly know how to read. So, how could you gather from anything I've said that TCP should have the knowledge you claim of the physical layer? The problem is that TCP is designed to assume the physical layer is not involved in loss. Thus, it slows down when it shouldn't, because it wrongly ascribes all loss to network congestion. Note that the words "network congestion" refer to the network layer.
The kludge done in the '80s, to make one transport at the transport layer protect the network layer from meltdowns that were often nearly happening, is the issue. Alex Ethan Blanton wrote: > > Alex, > > Cannara spake unto us the following "wisdom": > > Good one Clark! Indeed FDDI and other fiber rings have dual interfaces & > > fibers and anything can happen in any part of the hardware. The assumptions > > about the physical layer that have been made in most TCP discussions simply > > evidence lack of understanding of the reality and complexity of underlying > > layers. This lack extends to the length of time the defects last and go > > undiscovered, while folks struggle with performance issues. > > Let me get this straight, for my benefit and for the benefit of those > who may not understand you. You're suggesting that TCP should have a > mechanism by which the hardware layer can communicate that there is a > particular sequence of bits which, due to physical imperfections in > the transmission medium, cannot be reliably transmitted, and that > given this information the TCP stack could choose a different method > of communicating those particular bits? > > I can't believe no one has thought of implementing this richness of > signalling before now. This is certainly an inherent flaw in the TCP > design. > > Ethan > > (On a side note: I keep hearing that the Internet is completely > broken and could never possibly work. Why is it that certain emails > to the e2e mailing list *unfailingly* reach my mailbox under these > circumstances?) > > -- > The laws that forbid the carrying of arms are laws [that have no remedy > for evils]. They disarm only those who are neither inclined nor > determined to commit crimes. > -- Cesare Beccaria, "On Crimes and Punishments", 1764 From touch at ISI.EDU Thu Jun 30 12:03:04 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 30 Jun 2005 12:03:04 -0700 Subject: [e2e] Receiving RST on a MD5 TCP connection.
In-Reply-To: <504E230F-439F-4FF7-BA79-347362AE219F@extremenetworks.com> References: <20050627195202.11232.qmail@web53701.mail.yahoo.com> <504E230F-439F-4FF7-BA79-347362AE219F@extremenetworks.com> Message-ID: <42C441E8.3080806@isi.edu> RJ Atkinson wrote: > > On Jun 27, 2005, at 15:52, Tapan Karwa wrote: > >> I am wondering if there is any consensus on how we >> should deal with the problem mentioned in Section 4.1 >> of RFC 2385. > > I don't think this is a significant issue in real world deployments. > TCP MD5 is designed to prevent acceptance of unauthenticated TCP RST > messages to reduce the risk of (D)DOS attacks on the TCP sessions of BGP. > An adversary could send an unauthenticated RST anytime. If that took > out BGP, such would be a much larger operational problem. > > In practice, if the first (i.e. unauthenticated) RST is ignored, the > router will send another RST a bit later on (e.g. after it is rebooted > sufficiently to know which MD5 key to use) and that one WILL be > authenticated and will be accepted rather than ignored. > > So it should sort itself out without any spec changes, just taking > a time period closer to the reboot-time of the router that is > rebooting rather than some small fraction of that time. No real > harm done with the current situation at all. > > Ran > rja at extremenetworks.com Agreed. Another point along these lines - if you had a secure connection with another host, then the host reboots and 'forgets' the security altogether (i.e., doesn't reestablish keys), it shouldn't be able to reset the old connection anyway. It does suggest, however, that if new keys are used on both sides, then both sides ought to flush their connections entirely (i.e., drop all TCBs using old keys). This affects TCP/MD5 keying, though that keying is not automatically managed. Joe -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050630/3a1cc181/signature.bin From cannara at attglobal.net Thu Jun 30 12:05:01 2005 From: cannara at attglobal.net (Cannara) Date: Thu, 30 Jun 2005 12:05:01 -0700 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> Message-ID: <42C4425D.18C9EABF@attglobal.net> Assumptions are great dullers of thought, Rick, aren't they?! " 2) > > b) the fact that transfers > > required many blocks of odd numbers of pkts meant that the Ack Timer at > > the > > receiver was expiring on every block, waiting (~100mS) for the magical > > even-numbered last pkt in the block, which never came. > > Why on earth should that have mattered unless perhaps the sending TCP > had a broken implementation of Nagle that was going segment by segment > rather than send by send?" ...When it takes about 32000/1460 pkts to send one block, and the sender's window (not easily found & configured in typical stacks) is less than that, more than one send window is needed for each block sent. If the last send window needed is odd in pkt count, then the sender is done and awaiting an ack, while the receiver is awaiting another pkt (for the ack timer value). If the path is running at a reasonable rate, then this wasted ~100mS every 32k block in a multi-MB transfer adds up to lots of dead time and a significant throughput hit. Now, it wouldn't be so bad, if common stacks allowed easy identification & setting of parameters, like send window, ack timer, delayed ack, etc. But, because the Internet folks have never effectively addressed source and release control, we don't even know if a given stack we come across has those knobs available to turn.
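The dead-time argument above can be put in rough numbers. A back-of-the-envelope sketch using the figures quoted in this thread (~32 kB SMB blocks, 1460-byte segments, a ~100 ms delayed-ACK timer); the 10 MB transfer size is a hypothetical chosen for illustration:

```python
# Back-of-the-envelope: cost of one delayed-ACK timer expiry per SMB
# block. Figures are the ones quoted in the thread, not measurements.
block = 32_000          # bytes per SMB block
mss = 1460              # bytes per TCP segment
delack = 0.100          # delayed-ACK timer, seconds
transfer = 10_000_000   # hypothetical 10 MB transfer

pkts_per_block = -(-block // mss)   # ceiling division: ~22 segments
blocks = transfer / block           # ~312 blocks
dead_time = blocks * delack         # ~31 s spent purely waiting

print(f"{pkts_per_block} pkts/block, {blocks:.0f} blocks, "
      f"{dead_time:.2f} s of delayed-ACK dead time")
```

Even on a path otherwise running at line rate, tens of seconds of idle time per 10 MB caps effective throughput far below the link speed, which is the "significant throughput hit" described above.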
That, in itself, is an unacceptable problem for a public utility, or any product affecting so many every day. Imagine if your new TV hadn't been required to meet standards. Now for: "> with a send window of only 6 packets, and a synchronous > request/response protocol like SMB (IIRC) it would seem that fast rtx > wouldn't have had much of a chance anyway " I'll be happy to send the actual packets. SMB is fully capable of giving any transport, including TCP, many, many kbytes to send from one buffer. The reason Microsoft blocked it into about 32kB chunks is known to them alone. But, the proper Fast Retrans implementation in TCPs at both ends would indeed have improved things a lot. Again, this falls directly into the pit of uncontrolled stack releases. Alex rick jones wrote: > > On Jun 28, 2005, at 11:35 PM, Cannara wrote: > > > > Now, if you were a network manager for a major corporation, would you > > rush to > > fix a physical problem that generated less than 1% errors, if your > > boss & > > users were complaining about mysterious slowdowns many times larger? > > 0.4% > > wasn't even enough to trigger an alert on their management consoles. > > You'd > > certainly be looking for bigger fish. Well, TCP's algorithms create a > > bigger > > fish -- talk about Henny Penny. :] > > > > The files were transferred in many 34kB SMB blocks, which required > > something > > like 23 server pkts per. The NT servers had a send window of about 6 > > pkts > > (uSoft later increased that to about 12). All interfaces were > > 100Mb/s, except > > the T3 and a couple of T1s, depending on path. RTT was about 70mS for > > all > > paths. 
> > > > Thankfully, the Sniffer traces also showed exactly what the TCPs at > > both ends > > were doing, despite Fast Retransmit, SACK, etc.: a) the typical, > > default > > timeouts were knocking the heck out of throughput; > > with a send window of only 6 packets, and a synchronous > request/response protocol like SMB (IIRC) it would seem that fast rtx > wouldn't have had much of a chance anyway > > > b) the fact that transfers > > required many blocks of odd numbers of pkts meant that the Ack Timer at > > the > > receiver was expiring on every block, waiting (~100mS) for the magical > > even-numbered last pkt in the block, which never came. > > Why on earth should that have mattered unless perhaps the sending TCP > had a broken implementation of Nagle that was going segment by segment > rather than send by send? > > rick jones > Wisdom teeth are impacted, people are affected by the effects of events From eblanton at cs.ohiou.edu Thu Jun 30 12:43:58 2005 From: eblanton at cs.ohiou.edu (Ethan Blanton) Date: Thu, 30 Jun 2005 14:43:58 -0500 Subject: [e2e] Reacting to corruption based loss In-Reply-To: <42C43B15.62EC968D@attglobal.net> References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> <42C28353.6E5E1861@web.de> <42C2A557.7010600@dirtcheapemail.com> <42C2E362.A1F03AC7@attglobal.net> <20050629191853.GA30771@colt.internal> <42C43B15.62EC968D@attglobal.net> Message-ID: <20050630194358.GH30771@colt.internal> (Reformatted so as not to be top-posted, so others can follow) Cannara spake unto us the following wisdom: > Ethan Blanton wrote: > > Cannara spake unto us the following "wisdom": > > > Good one Clark! Indeed FDDI and other fiber rings have dual > > > interfaces & fibers and anything can happen in any part of the > > > hardware.
The assumptions about the physical layer that have been > > > made in most TCP discussions simply evidence lack of understanding of > > > the reality and complexity of underlying layers. This lack extends to > > > the length of time the defects last and go undiscovered, while folks > > > struggle with performance issues. > > > > Let me get this straight, for my benefit and for the benefit of those > > who may not understand you. You're suggesting that TCP should have a > > mechanism by which the hardware layer can communicate that there is a > > particular sequence of bits which, due to physical imperfections in > > the transmission medium, cannot be reliably transmitted, and that > > given this information the TCP stack could choose a different method > > of communicating those particular bits? > > > Ethan, for someone who quotes Beccaria (a paesano of my family), you clearly > know how to read. So, how could you gather from anything I've said that TCP > should have the knowledge you claim of the physical layer? Actually, I never have any idea what you're saying, which is why I was asking for clarification. Let us carefully consider the scenario leading up to my response: Network conditions: * FDDI ring, which in fact is capable of detecting corruption loss * A particular bit pattern which is reliably corrupted and cannot pass the network as-is * TCP backs off exponentially as this corrupted packet is repeatedly lost due to non-congestion events > The problem is that TCP is designed to assume the physical layer is not > involved in loss. Thus, it slows down when it shouldn't, because it wrongly > ascribes all loss to network congestion. Note that the words "network > congestion" refer to the network layer. The kludge done in the '80s, to make > one transport at the transport layer protect the network layer from meltdowns > that were often nearly happening, is the issue.
Cannara's response: * "The assumptions about the physical layer that have been made in most TCP discussions simply evidence lack of understanding of the reality and complexity of underlying layers." * The above paragraph * This response is made in a long and fruitless thread where Cannara has repeatedly stated that TCP is worthless, doesn't work, the Internet doesn't work, if only we listened to him it would be better * Furthermore, context indicates that it is his belief that if TCP could tell this loss was corruption vs. congestion, TCP could somehow make a smarter choice My question to you, Alex, is exactly *what* do you think TCP could do better than exponentially backing off a packet which will NEVER GET THROUGH. In trying so hard to find examples that prove you right, you have chosen an example which is the absolute WORST possible case for your argument. Let's assume for a moment that TCP was notified that this packet was lost due to corruption, and it retransmitted instantaneously... Now we're flooding the wire as fast as we can send packets with a packet that will never get through. This is surely an improvement. This particular scenario is of course pathological, but one can (easily) construct many scenarios where the response to corruption (or whatever network trouble you choose) does not have one clear right answer. Consider the simple case of a slowly fading signal from an 802.11 WAP; a packet is lost to corruption, and the link simultaneously falls back to a slower bitrate to improve its signal state. Yes, you could create a signal for both "corruption experienced", and a signal for "bitrate reduced". You could also create a signal for "bitrate increased", "competing traffic reduced", "user wants better latency", and a thousand other things. I am aware that there are even transport and network protocols which support these signals. I am also of the opinion that it is likely that TCP can benefit from some knowledge of lower layers.
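The backoff argument above is easy to make concrete. A minimal sketch of capped exponential RTO backoff (the 1 s initial RTO and 64 s cap are conventional illustrative values, not taken from the thread): for a packet that can never get through, doubling the timeout yields a handful of retransmissions over minutes, rather than the back-to-back flood that an immediate "corruption, resend now" signal would produce.

```python
# Minimal sketch of exponential RTO backoff: double the timeout after
# each failed retransmission, up to a cap. For a packet that can
# never get through, this sends only a few retransmits over minutes.
def backoff_schedule(rto=1.0, cap=64.0, deadline=120.0):
    """Return the times (seconds) at which retransmissions fire."""
    t, times = 0.0, []
    while t + rto <= deadline:
        t += rto
        times.append(t)
        rto = min(rto * 2, cap)     # exponential backoff, capped
    return times

times = backoff_schedule()
print(len(times), times)
```

Six retransmissions in two minutes versus wire-speed flooding is the whole point: when the loss is persistent, backing off is the right answer whether the cause is congestion or corruption.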
What I am heartily SICK of hearing, however, is this endless prattle that TCP is completely broken, IP is worthless, it's all done wrong, it will never work, it's insecure, blah blah blah blah blah. The fact remains that the Internet exists, it WORKS, it works very WELL, the Internet protocol stack is simple enough and yet powerful enough to cope with little or no modification on everything from tiny embedded systems with a scant few K of RAM and cycles per second to many-processor iron in server rooms with gigabit links sprawling to hundreds of IP devices on enterprise networks. No matter how much that hurts, it's true. Yes, we can improve it, and yes, there are mistakes in the stack. However, Chicken Little is never going to panic it out of existence. So ... have any actual ideas on how to improve things? (I don't mean citations of previous ideas which no one can find, either.) Ethan P.S.: Somehow your mail got to me again, and it even traversed the Internet... -- The laws that forbid the carrying of arms are laws [that have no remedy for evils]. They disarm only those who are neither inclined nor determined to commit crimes. -- Cesare Beccaria, "On Crimes and Punishments", 1764 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050630/6f811476/attachment.bin From cannara at attglobal.net Thu Jun 30 15:01:13 2005 From: cannara at attglobal.net (Cannara) Date: Thu, 30 Jun 2005 15:01:13 -0700 Subject: [e2e] Reacting to corruption based loss References: <200506300013.j5U0DWQn027542@codex.cis.upenn.edu> <42C417B4.8B26D4B7@web.de> Message-ID: <42C46BA9.2D20B1FB@attglobal.net> Detlef, you're not in the least "stupid"! Stupidity is self-enforced ignorance. You question things well. Alex Detlef Bosau wrote: > [clip] > So, I use what I have. Dumbbell, AIMD, strange system models.
E.g.: > Competing flows have similar RTT. (Or better: _Comparable_ flows have > similar RTT? What do we really want, if flows are not somewhat > "comparable".) > > And honestly: I'm much too stupid to even understand AIMD. > > DB > -- > Detlef Bosau > Galileistrasse 30 > 70565 Stuttgart > Mail: detlef.bosau at web.de > Web: http://www.detlef-bosau.de > Mobile: +49 172 681 9937 From cannara at attglobal.net Thu Jun 30 15:35:21 2005 From: cannara at attglobal.net (Cannara) Date: Thu, 30 Jun 2005 15:35:21 -0700 Subject: [e2e] Reacting to corruption based loss References: <20050626001318.CB7A424D@aland.bbn.com> <42BEE8D7.D2FFC9E6@attglobal.net> <42BF2738.AD082D57@web.de> <42BF814A.C4948522@attglobal.net> <42BFDB1C.2010009@reed.com> Message-ID: <42C473A9.FEE67D3A@attglobal.net> Well, maintaining a serious attitude, guess what David? 1) Indeed "hardware corruption can be detected by software", which is what MIBs report to us via net mgmnt systems, and all metro distribution systems employ. Of course, IP was never designed to gather & use lower-level info like that because its designers thought the network began with their software interfaces to IMPs stuck onto their various time-sharing machines. Otherwise, we might even have had a reasonable addressing scheme from day 2. 2) Indeed link hardware does much more than many TCP/IP folks realize, but "those complete turkeys who designed TCP got congestion wrong, hardware wrong, and software wrong" is rather harsh, don't you think? I mean all they did was assume every loss was due to congestion. Now, how bad is that? :] And, why want TCP to muddle in hardware, if it's got a cogent network layer beneath it to give "best-effort" service (whatever that means)? You're just so demanding. 3) Your wish: "I suggest a remote router reboot is usually needed. So we need to characterize the errors so that the link layer can discriminate and do the right thing" has already been satisfied by uSoft.
Their recent upgrades to Win2000 & XP servers include software that requires reboots in order for ICMP Redirects to be handled correctly after a day or so of running. And, the experiment of pulling the Enet cable from a Windows PC for a few seconds has an intriguing effect on its user's apps -- if held out about 5-10 secs, the hardware driver tells the Windows protocol mgr the network is disconnected and uSoft, always wanting to plug security holes, assumes a tap is being inserted, so forces IP to change its address to Loopback. Thus, all packets coming back to the PC, containing important financial or anti-terror info, are ignored. A denial of service attack generated by a machine's own OS, how clever of uSoft! Glad I got paid a good rate last month for helping find these two. :] These and many other real examples show how the main flaw in the Internet is no control & management of installed software. We can all laugh at uSoft's OS-for-Dummies tricks over the years, but decades have passed with no sensible release control for minor things, like TCP/IP stacks. Linux is far better managed than the middle-level protocols the Internet depends on. Alex "David P. Reed" wrote: > > Cannara writes: > > > a typical TCP will be brought to its knees by a few % packet losses that are simply due to hardware errors. > > A very good point! I'll raise you one, Alex: > > I really think that *software* corruption should be treated differently than *hardware* corruption. TCP really gets brought to its knees by software errors, such as buffer overflows, deadlocks, etc. I mean really, those complete turkeys who designed TCP got congestion wrong, hardware wrong, and software wrong. Responding to software errors requires a completely different error recovery technique - I suggest a remote router reboot is usually needed. So we need to characterize the errors so that the link layer can discriminate and do the right thing.
> > What we need is a whole new protocol that is based on characterizing packet corruption. Each router is to determine the cause of corruption: > > hardware corruption can be detected by software, > > software corruption can be detected by hardware, > > and congestion can be inserting an optical fiber into the nasal passages of the network consultant. > > It shouldn't be too difficult to obtain six sigma precise characterization of packet corruption in this way. > > Of course, financially derived corruption and corruption by power (or absolute power) are possible, but those can be dealt with on an end-to-end basis, as we know that network consultants such as Mr. Cannara are omniscient and incorruptible. > > Really, this could become a whole new subdiscipline of communications theory - the general systems theory of corruption and the practice of switch-based corruption-characterization. Perhaps we could invent devices that sense the sources of corruption by detecting the spin of the photons that come out of the fiber... From cannara at attglobal.net Thu Jun 30 16:02:08 2005 From: cannara at attglobal.net (Cannara) Date: Thu, 30 Jun 2005 16:02:08 -0700 Subject: [e2e] Reacting to corruption based loss References: <42C095F8.26792D6@web.de> <20050628145330.GD2392@grc.nasa.gov> <42C1A645.F4A563F2@web.de> <42C24148.ECD1AEC4@attglobal.net> <42C28353.6E5E1861@web.de> <42C2A557.7010600@dirtcheapemail.com> <42C2E362.A1F03AC7@attglobal.net> <20050629191853.GA30771@colt.internal> <42C43B15.62EC968D@attglobal.net> <20050630194358.GH30771@colt.internal> Message-ID: <42C479F0.23D6DC19@attglobal.net> Ethan, there must be an established bureaucracy that could use your talent for phrasing devious ripostes! "Cannara has repeatedly stated that TCP is worthless, doesn't work, the Internet doesn't work" -- show us where, Ethan.
I suppose if you ever need a heart pump, you'll not question the quality of the device and its manufacturer's sense of responsibility to do a reliable product design. Is that your feeling? It happens to be mine, on TCP/IP as well, since it's so important nowadays, even for our upcoming robotic surgeries, where the doc can do them from his/her PC while taking a break from surfing in Maui. So, your hyperbole, like: "endless prattle that TCP is completely broken, IP is worthless, it's all done wrong, it will never work, it's insecure, blah blah blah blah blah" or "...panic it out of existence" are your own figments, and "endless prattle". The discussion has always been what can be done to improve matters. I mean that was one thing behind IPv6, wasn't it? :] Oh, since you're impressed by, and love saying: "> P.S.: Somehow your mail got to me again, and it even traversed the Internet..." Consider the fully optical networks across Europe around 1800 -- Napoleon got lots of reliable mail that way, about as quickly as you get mine. Why, the interfaces even had compression, encryption and error correction. {:o] Alex Ethan Blanton wrote: > > (Reformatted so as not to be top-posted, so others can follow) > > Cannara spake unto us the following wisdom: > > Ethan Blanton wrote: > > > Cannara spake unto us the following "wisdom": > > > > Good one Clark! Indeed FDDI and other fiber rings have dual > > > > interfaces & fibers and anything can happen in any part of the > > > > hardware. The assumptions about the physical layer that have been > > > > made in most TCP discussions simply evidence lack of understanding of > > > > the reality and complexity of underlying layers. This lack extends to > > > > the length of time the defects last and go undiscovered, while folks > > > > struggle with performance issues. > > > > > > Let me get this straight, for my benefit and for the benefit of those > > > who may not understand you.
You're suggesting that TCP should have a > > > mechanism by which the hardware layer can communicate that there is a > > > particular sequence of bits which, due to physical imperfections in > > > the transmission medium, cannot be reliably transmitted, and that > > > given this information the TCP stack could choose a different method > > > of communicating those particular bits? > > > > > Ethan, for someone who quotes Beccaria (a paesano of my family), you clearly > > know how to read. So, how could you gather from anything I've said that TCP > > should have the knowledge you claim of the physical layer? > > Actually, I never have any idea what you're saying, which is why I was > asking for clarification. Let us carefully consider the scenario leading > up to my response: > > Network conditions: > * FDDI ring, which in fact is capable of detecting corruption loss > * A particular bit pattern which is reliably corrupted and cannot pass > the network as-is > * TCP backs off exponentially as this corrupted packet is repeatedly > lost due to non-congestion events > > > The problem is that TCP is designed to assume the physical layer is not > > involved in loss. Thus, it slows down when it shouldn't, because it wrongly > > ascribes all loss to network congestion. Note that the words "network > > congestion" refer to the network layer. The kludge done in the '80s, to make > > one transport at the transport layer protect the network layer from meltdowns > > that were often nearly happening, is the issue. > > Cannara's response: > * "The assumptions about the physical layer that have been made in > most TCP discussions simply evidence lack of understanding of the > reality and complexity of underlying layers."
> * The above paragraph > * This response is made in a long and fruitless thread where Cannara > has repeatedly stated that TCP is worthless, doesn't work, the > Internet doesn't work, if only we listened to him it would be better > * Furthermore, context indicates that it is his belief that if TCP > could tell this loss was corruption vs. congestion, TCP could > somehow make a smarter choice > > My question to you, Alex, is exactly *what* do you think TCP could do > better than exponentially backing off a packet which will NEVER GET > THROUGH. In trying so hard to find examples that prove you right, you > have chosen an example which is the absolute WORST possible case for > your argument. Let's assume for a moment that TCP was notified that > this packet was lost due to corruption, and it retransmitted > instantaneously... Now we're flooding the wire as fast as we can send packets > with a packet that will never get through. This is surely an > improvement. > > This particular scenario is of course pathological, but one can (easily) > construct many scenarios where the response to corruption (or whatever > network trouble you choose) does not have one clear right answer. > Consider the simple case of a slowly fading signal from an 802.11 WAP; a > packet is lost to corruption, and the link simultaneously falls back to > a slower bitrate to improve its signal state. Yes, you could create a > signal for both "corruption experienced", and a signal for "bitrate > reduced". You could also create a signal for "bitrate increased", > "competing traffic reduced", "user wants better latency", and a thousand > other things. I am aware that there are even transport and network > protocols which support these signals. I am also of the opinion that it is > likely that TCP can benefit from some knowledge of lower layers.
What I > am heartily SICK of hearing, however, is this endless prattle that TCP > is completely broken, IP is worthless, it's all done wrong, it will > never work, it's insecure, blah blah blah blah blah. > > The fact remains that the Internet exists, it WORKS, it works very WELL, > the Internet protocol stack is simple enough and yet powerful enough to > cope with little or no modification on everything from tiny embedded > systems with a scant few K of RAM and cycles per second to > many-processor iron in server rooms with gigabit links sprawling to hundreds of IP > devices on enterprise networks. No matter how much that hurts, it's > true. Yes, we can improve it, and yes, there are mistakes in the stack. > However, Chicken Little is never going to panic it out of existence. > > So ... have any actual ideas on how to improve things? (I don't mean > citations of previous ideas which no one can find, either.) > > Ethan > >