From detlef.bosau at web.de Mon Aug 1 03:41:07 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Mon, 01 Aug 2011 12:41:07 +0200
Subject: [e2e] Taking RTT Samples.
Message-ID: <4E3682C3.6010305@web.de>

Hi to all.

I'm curious whether we should take RTT samples for each segment or not.

AFAIK, the RFC encourages one RTT timer, which is started at least once
per round. An alternative would be one timer per segment. I'm curious how
the K&P algorithm is best implemented in these cases.

From what I see in the literature, it is not yet clear whether one timer
per round or one timer per segment is preferable.

My particular interest, however, is WWAN, so the basic question is
whether we have something like a "weakly stationary RTT" in that case or
not.

At least the problem addressed by K&P should be solved by RFC 1323 RTTM.
Is this correct? If so, the question would be whether the SRTT and RTTVAR
estimators are appropriate in this case and whether the gains are
reasonable.

I would appreciate any discussion here.

Detlef

--
------------------------------------------------------------------
Detlef Bosau
Galileistraße 30
70565 Stuttgart                            Tel.:   +49 711 5208031
                                           mobile: +49 172 6819937
                                           skype:     detlef.bosau
                                           ICQ:          566129673
detlef.bosau at web.de                     http://www.detlef-bosau.de
------------------------------------------------------------------

From detlef.bosau at web.de Wed Aug 3 07:08:54 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Wed, 03 Aug 2011 16:08:54 +0200
Subject: [e2e] Spurious Timeouts, Fact or Fake?
Message-ID: <4E395676.9010407@web.de>

During the recent past, this list has seen quite a few posts regarding
TCP RTT measurement.

Now, first of all, I was interested in how often RTT measurements should
be made and how they can be made. A particular concern is Karn's
algorithm, because to my understanding, the consequence of Karn's
algorithm is that RTT measurements obtained by a single RTT timer can be
taken only when a sender has no outstanding duplicate packets.

Perhaps I'm wrong here.

However, from what I've read so far, it is not yet completely clear how
often RTT measurements should be made. The alternatives discussed so far
are:
- once per round,
- each packet.

The latter appears appealing to me, particularly when implemented with
time stamps (RFC 1323), which overcome the problem discussed by Karn &
Partridge of packets being sent more than once. However, some literature
indicates problems with the SRTT estimator when time stamps are in use.

Now, the whole discussion is somewhat confusing to me.

1.: Spurious timeouts are confusing to me, because spurious timeouts
(i.e. a packet is successfully transmitted, however the ACK does not
reach the sender in time) are basically expected by Edge's paper and the
literature based upon it. However, there are papers around which put the
mere existence of spurious timeouts in question, e.g.

    Francesco Vacirca, Thomas Ziegler, and Eduard Hasenleithner,
    "TCP Spurious Timeout estimation in an operational GPRS/UMTS
    network", Forschungszentrum Telekommunikation Wien Technical
    Report FTW-TR-2005-008, May 2005.

while others give detailed recommendations on how to deal with spurious
timeouts in practical implementations, e.g.
http://tools.ietf.org/search/draft-allman-rto-backoff-02

However, to me the problem seems closely coupled to the underlying
question whether or not we can estimate the expectation and variance of
the RTT in a TCP session. Edge requires the corresponding stochastic
process to be weakly stationary. In other words: In a TCP session, once
it has started and run for some settling time, the observed RTT shall
be, at least roughly, identically distributed.

This distribution should be subject to only very slow and very rare
change, if at all.

And according to RFC 2988, we can obtain SRTT and RTTVAR from RTT samples
using the well-known EWMA estimators for this purpose.
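
For reference, the estimator under discussion can be sketched in a few
lines of Python. This is only a minimal illustration of the update rules
in RFC 2988, Section 2 (alpha = 1/8, beta = 1/4, K = 4, RTO rounded up to
a minimum of 1 second). The class name, the method name and the
`min_rto` / `clock_granularity` parameters are chosen here for
illustration and are not taken from any real TCP stack.

```python
class RttEstimator:
    """Sketch of the SRTT/RTTVAR/RTO computation from RFC 2988, Section 2."""

    ALPHA = 1 / 8   # gain for SRTT
    BETA = 1 / 4    # gain for RTTVAR
    K = 4           # variance multiplier in the RTO formula

    def __init__(self, clock_granularity=0.0, min_rto=1.0):
        # Both parameters are in seconds; the names are illustrative.
        self.clock_granularity = clock_granularity
        self.min_rto = min_rto
        self.srtt = None
        self.rttvar = None

    def update(self, sample):
        """Feed one RTT sample (seconds) and return the resulting RTO."""
        if self.srtt is None:
            # First measurement: SRTT <- R, RTTVAR <- R/2
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # RTTVAR must be updated first, using the old SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        rto = self.srtt + max(self.clock_granularity, self.K * self.rttvar)
        return max(rto, self.min_rto)
```

Calling `update()` once per RTT sample corresponds to either sampling
policy discussed above; only the rate at which samples arrive differs.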

So, my questions are:

1.: How often shall RTTM be made?
2.: Is it reasonable to assume "weakly stationary" RTTs as done by Edge?
3.: Are the EWMA filters from RFC 2988 satisfactory, and in particular,
are they sufficiently generic to yield reasonable results for an
arbitrary TCP session?

One could summarize these in one question: Do we obtain the RTO in a
reasonable way? And when we talk about spurious timeouts, are we talking
about spurious timeouts - or are we talking about shortcomings of the
SRTT and RTTVAR estimators here?

I'm somewhat confused here at the moment. And I would appreciate any
enlightenment ;-)

Detlef

From jasleen at cs.unc.edu Wed Aug 3 09:42:03 2011
From: jasleen at cs.unc.edu (Jasleen Kaur)
Date: Wed, 03 Aug 2011 12:42:03 -0400
Subject: [e2e] Spurious Timeouts, Fact or Fake?
In-Reply-To: <4E395676.9010407@web.de>
References: <4E395676.9010407@web.de>
Message-ID: <4E397A5B.4090201@cs.unc.edu>

Detlef,

You might find our paper (below) interesting, in which we analyze traces
of nearly 3 million Internet transfers to study the performance of TCP
loss detection and recovery mechanisms. Among other things, it also logs
the occurrence of spurious timeouts on a per-OS basis.

S. Rewaskar, J. Kaur, and F.D. Smith, "A Performance Study of Loss
Detection/Recovery in Real-world TCP Implementations," in Proceedings of
the IEEE International Conference on Network Protocols (ICNP'07),
Beijing, China, Oct 2007.

http://www.cs.unc.edu/~jasleen/papers/icnp07.pdf

Thanks,
Jasleen

From emmanuel.lochin at gmail.com Wed Aug 3 10:45:12 2011
From: emmanuel.lochin at gmail.com (Emmanuel Lochin)
Date: Wed, 3 Aug 2011 19:45:12 +0200
Subject: [e2e] Spurious Timeouts, Fact or Fake?
In-Reply-To: <4E395676.9010407@web.de>
References: <4E395676.9010407@web.de>
Message-ID:

Hi Detlef,

I think the study on spurious timeouts you cite cannot be transposed to
the Internet. If you look at:

Sharad Jaiswal, Gianluca Iannaccone, Christophe Diot, James F. Kurose,
Donald F. Towsley,
Measurement and classification of out-of-sequence packets in a tier-1
IP backbone.
IEEE/ACM Trans. Netw. (TON) 15(1):54-66 (2007)

the authors show that 40% of the links present in their dataset
effectively reorder packets (possibly due to load balancing, multiple
network paths, dynamic route generation and link bonding).

Emmanuel

From detlef.bosau at web.de Wed Aug 3 11:54:51 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Wed, 03 Aug 2011 20:54:51 +0200
Subject: [e2e] Spurious Timeouts, Fact or Fake?
In-Reply-To: <4E397A5B.4090201@cs.unc.edu>
References: <4E395676.9010407@web.de> <4E397A5B.4090201@cs.unc.edu>
Message-ID: <4E39997B.9040003@web.de>

On 08/03/2011 06:42 PM, Jasleen Kaur wrote:
> Detlef,
>
> You might find our paper (below) interesting, in which we analyze
> traces of nearly 3 million Internet transfers to study the performance
> of TCP loss detection and recovery mechanisms. Among other things, it
> also logs the occurrence of spurious timeouts on a per-OS basis.
>
> S. Rewaskar, J. Kaur, and F.D. Smith, "A Performance Study of Loss
> Detection/Recovery in Real-world TCP Implementations,"
> in Proceedings of the IEEE International Conference on Network
> Protocols (ICNP'07), Beijing, China, Oct 2007.
>
> http://www.cs.unc.edu/~jasleen/papers/icnp07.pdf

Thanks for the hint.

BTW: The entries for Linux in Table 1 are a bit confusing? RTO = srtt +
vartt, however you note m=1 and k=4?
In addition, the gains a and b are the same for all OSes and as
recommended by RFC 2988.

I think the interesting question is whether the formulae for srtt and
vartt converge.

And of course, there may be additional questions once I've read the
whole paper in detail.

From detlef.bosau at web.de Wed Aug 3 13:44:19 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Wed, 03 Aug 2011 22:44:19 +0200
Subject: [e2e] Spurious Timeouts, Fact or Fake?
In-Reply-To:
References: <4E395676.9010407@web.de>
Message-ID: <4E39B323.5090507@web.de>

On 08/03/2011 07:45 PM, Emmanuel Lochin wrote:
> Hi Detlef,
>
> I think the study on spurious timeout you cite cannot be transposed to
> the Internet.
> If you look at :
>
> Sharad Jaiswal, Gianluca Iannaccone, Christophe Diot, James F. Kurose,
> Donald F. Towsley,
> Measurement and classification of out-of-sequence packets in a tier-1
> IP backbone.
> IEEE/ACM Trans. Netw. (TON) 15(1):54-66 (2007)
>
> the authors show that 40% of the links present in their dataset
> effectively reorder
> packets (might due to load balancing, multiple network paths, dynamic
> route generation and link bonding).

Thanks again for the hint.

However, at the moment, I want to get a perspective for my own work and
therefore, I need a somewhat constructive approach.

First of all, observing packet reordering on 40 % of the links in some
reasonable dataset is worrying. And it makes clear that there is a huge
difference between internetworking in the RFC "theory" and the practical
implementation.

Or, to put it in perhaps dramatic words: To what degree are we doing
sliding window in the Internet, and to what degree are we doing some
crude mixture of loss recovery and slow start? Some kind of
stop-and-wait, and now and then there is some lucky TCP session with some
"windowlet" of two or four segments?

When I looked at Jasleen's paper, all the OSes discussed therein used
the EWMA filters from RFC 2988.

Question: Where do these stem from? Lucky guess? "Many" (i.e. > 10)
experiments? Divine inspiration? Some verses of the Bible? Or the Koran?

I don't know.

Are these natural constants? Or are all networks created equal, endowed
by their creator with some unalienable properties, and that amongst
these are the alpha and beta of the EWMA filters for SRTT and RTTVAR?
(Or should I say: Life, Liberty and the pursuit of Happiness?)

Now, actually, the Internet works. And it works pretty well. So the
recommendations of RFC 2988 can hardly be complete nonsense. However,
can we give a set of assumptions / boundary conditions / ... under which
these concepts hold for sure?

From detlef.bosau at web.de Fri Aug 5 01:06:45 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Fri, 05 Aug 2011 10:06:45 +0200
Subject: [e2e] Rationale for EWMA filters in RTTM
Message-ID: <4E3BA495.1020209@web.de>

Hi to all,

perhaps this sounds stupid, however, I would like to understand this.

I have yet to understand the rationale behind the EWMA filters for SRTT
and RTTVAR as suggested in RFC 2988. From what I've read, these filters
basically assume a Poisson model for a TCP flow.

So, I assumed a TCP flow with a window of 1 segment, i.e.
stop-and-wait, so that the arrival process of ACKs at the sender should
be simple: All the interarrival times are i.i.d. and exponentially
distributed.

So, I used Octave to generate some random numbers from an exponential
distribution and submitted these to the RTT estimator from RFC 2988.

Any misconception of mine up to now? No? O.k.

The results are... When I applied some fat there, perhaps I could pass
the result off as an opus posthumous by Joseph Beuys. "Strange drawing
with fat."

(For those who have no idea about Beuys:
http://www.ionoi.it/index.php?pages=article&cod=short-lasting-comfort
look at the right picture.)

Of course, I played around with the gain alpha. (I have only considered
SRTT so far. I don't want to waste all the butter in my fridge on an
artistic redesign of my results...)

Actually, this had some influence on the results. E.g. choosing
alpha=1/200 appeared to be somewhat more convincing. However, this is
playing around and not science.

Do I have a basic misconception here?

I would appreciate any comment.

From detlef.bosau at web.de Fri Aug 5 11:34:57 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Fri, 05 Aug 2011 20:34:57 +0200
Subject: [e2e] Rationale for EWMA filters in RTTM
In-Reply-To: <4E3BA495.1020209@web.de>
References: <4E3BA495.1020209@web.de>
Message-ID: <4E3C37D1.5010302@web.de>

Perhaps I should put my question in a more general way: In which cases
do we have / do we expect a reliable RTO estimation in TCP sessions?

Thanks!

Detlef

From anoop at alumni.duke.edu Fri Aug 5 16:20:38 2011
From: anoop at alumni.duke.edu (Anoop Ghanwani)
Date: Fri, 5 Aug 2011 16:20:38 -0700
Subject: [e2e] Rationale for EWMA filters in RTTM
In-Reply-To: <4E3C37D1.5010302@web.de>
References: <4E3BA495.1020209@web.de> <4E3C37D1.5010302@web.de>
Message-ID:

Have you looked at RFC 6298? Based on your last email
it looks like you were reading an obsoleted RFC.

I don't think this timer needs to be super accurate since
it kicks in only when duplicate ACKs don't already solve
the problem, e.g. under severe forward or reverse congestion
because of which ACKs aren't making it back.

Having it be a moving average just allows us to pick an
initial value that could be terribly wrong for the environment
(data center at one end, satellite links at the other end)
and we still find a reasonable value after a few RTT.

On Fri, Aug 5, 2011 at 11:34 AM, Detlef Bosau wrote:
> Perhaps, I should put my question in a more general way: In which
> cases do we have / do we expect a reliable RTO estimation in TCP
> sessions?
>
> Thanks!
>
> Detlef

From detlef.bosau at web.de Sat Aug 6 02:42:42 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Sat, 06 Aug 2011 11:42:42 +0200
Subject: [e2e] Rationale for EWMA filters in RTTM
In-Reply-To:
References: <4E3BA495.1020209@web.de> <4E3C37D1.5010302@web.de>
Message-ID: <4E3D0C92.8030707@web.de>

On 08/06/2011 01:20 AM, Anoop Ghanwani wrote:
> Have you looked at RFC 6298? Based on your last email
> it looks like you were reading an obsoleted RFC.

O.k., now we've learned that RFC 6298 obsoletes RFC 2988.

Of course, it is always useful to know the most recent RFC numbers.
However, I don't see a solution for my problem.

> I don't think this timer needs to be super accurate since
> it kicks in only when duplicate ACKs don't already solve
> the problem, e.g. under severe forward or reverse congestion
> because of which ACKs aren't making it back.

Hang on.

First, refer to Manu's post a few days ago and the paper

Sharad Jaiswal, Gianluca Iannaccone, Christophe Diot, James F. Kurose,
Donald F. Towsley,
Measurement and classification of out-of-sequence packets in a tier-1
IP backbone.
IEEE/ACM Trans. Netw. (TON) 15(1):54-66 (2007)

Manu points out that, according to this paper, 40% of the observed links
exhibit more or less severe packet reordering.

In addition, we have known the state variable DUPACKTHRESH for TCP
senders for years now - which was particularly intended to address
packet reordering. From what I've read in recent literature, not even
the least effort is spent on addressing this problem in practical
implementations.

Consequence: Triple dupacks may or may not happen - according to the
phases of the moon or the water level. However, when they are related to
congestion or packet loss, this is pure luck.

In the same way, spurious timeouts may occur on the same "basis", caused
by RTO values being unreasonably small.
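
The point about duplicate ACKs and reordering can be made concrete with a
toy model. The following Python sketch is a deliberately simplified
illustration, not the behaviour of any particular stack: it assumes a
receiver that sends one cumulative ACK per arriving segment, numbers
segments 0, 1, 2, ... and ignores SACK, delayed ACKs and timers. The
function names and the `dupack_threshold` parameter are made up for this
example.

```python
def cumulative_acks(arrival_order):
    """Return the cumulative ACK (next expected segment) sent per arrival."""
    received = set()
    next_expected = 0
    acks = []
    for seg in arrival_order:
        received.add(seg)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmit_triggered(acks, dupack_threshold=3):
    """True if any ACK value is followed by dupack_threshold duplicates."""
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count >= dupack_threshold:
                return True
        else:
            last_ack = ack
            dup_count = 0
    return False


# No segment is lost in either case; only the arrival order differs.
in_order = cumulative_acks([0, 1, 2, 3, 4])    # [1, 2, 3, 4, 5]
reordered = cumulative_acks([0, 2, 3, 4, 1])   # [1, 1, 1, 1, 5]
```

In this model, segment 1 being overtaken by three later segments is
enough to produce three duplicate ACKs without any loss, while a
displacement by a single position is not. Raising the (hypothetical)
`dupack_threshold` to 4 would suppress the retransmission in the
reordered case above.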

> Having it be a moving average just allows us to pick an
> initial value that could be terribly wrong for the environment
> (data center at one end, satellite links at the other end)

I'm with you up to now, but:

> and we still find a reasonable value after a few RTT.

from what I've seen with some playing around in Octave, I would like to
herewith suggest THE one and only reasonable RTO for TCP:

10 milliseconds times Bill Gates' birthday.

At the moment, I'm quite convinced that this is neither worse nor better
than the values we're using today.

I have to apologize for my frustration. However, I have yet to get over
this huge difference between a marvelous, splendid theory and a very
ugly practice.

Please correct me if I'm completely wrong here.

From detlef.bosau at web.de Sat Aug 6 09:50:20 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Sat, 06 Aug 2011 18:50:20 +0200
Subject: [e2e] Rationale for EWMA filters in RTTM
In-Reply-To: <4E3D0C92.8030707@web.de>
References: <4E3BA495.1020209@web.de> <4E3C37D1.5010302@web.de> <4E3D0C92.8030707@web.de>
Message-ID: <4E3D70CC.80307@web.de>

I just made a few "ping" tests this afternoon and wanted to see whether
the filtered reply times make sense.

The results are, to put it gently, somewhat concerning.

From detlef.bosau at web.de Sat Aug 6 10:47:56 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Sat, 06 Aug 2011 19:47:56 +0200
Subject: [e2e] Rationale for EWMA filters in RTTM
In-Reply-To:
References: <4E3BA495.1020209@web.de> <4E3C37D1.5010302@web.de> <4E3D0C92.8030707@web.de> <4E3D70CC.80307@web.de>
Message-ID: <4E3D7E4C.9060806@web.de>

On 08/06/2011 07:25 PM, Jon Crowcroft wrote:
> i dont know how many TCPs still do it, but most used to only update
> the SRTT and RTTVAR running estimates for in order acknowledged
> segments.

At least time stamps are echoed this way according to RFC 1323.

> This is based on Van's original argument about conservation
> of packets and a simple model of the queues on the path....

...at least one should keep this in mind. I just had a look at the
congavoid paper yesterday.

At least Manu's pointer to the remarkable level of packet reordering is
concerning. In a private conversation, one of our colleagues stated that
packet reordering usually points to broken network implementations.
Perhaps the term "broken" may sound too harsh here; however, if we can
avoid problems by simple means, we should do so.

> .the EWMA is
> just the simplest way to keep a running estimate of the 1st & 2nd
> moments of the moving distribution.
> we played around with other
> estimators in the 80s (when path diversity was even higher) - it is
> entirely possible that current middle boxes and queues and multipath
> and switched routers are making such a simple estimate poor - i
> wouldn't care much since it is a question how well it maps into a RTO,
> not whether it is "true" - work on path characterisation has a
> different goal

Hm. At least
http://www.thinkmind.org/index.php?view=article&articleid=icn_2011_14_10_10098
intends to reduce the path diversity here. Although this paper proposes a
middlebox, the idea is particularly to support the _existing_ mechanisms
of TCP.

From touch at isi.edu Sun Aug 7 11:49:04 2011
From: touch at isi.edu (Joe Touch)
Date: Sun, 07 Aug 2011 11:49:04 -0700
Subject: [e2e] Rationale for EWMA filters in RTTM
In-Reply-To: <4E3D0C92.8030707@web.de>
References: <4E3BA495.1020209@web.de> <4E3C37D1.5010302@web.de> <4E3D0C92.8030707@web.de>
Message-ID: <4E3EDE20.9050907@isi.edu>

On 8/6/2011 2:42 AM, Detlef Bosau wrote:
> On 08/06/2011 01:20 AM, Anoop Ghanwani wrote:
>> Have you looked at RFC 6298? Based on your last email
>> it looks like you were reading an obsoleted RFC.
>
> O.k., now we've learned that RFC 6298 obsoletes RFC 2988.

FWIW, that's noted in the header of RFC 6298.
It's generally useful to check the rfc-index.txt file (or its XML equivalent). If you use the tools pages to access the files, obsoleted RFCs are indicated on the old one as well: http://tools.ietf.org/html/rfc2988 A while back I wrote a script called "rfc-what-i-mean" that scans text for RFC numbers and looks up uncited and updated dependencies based on rfc-index; it's available on my tools page: http://www.isi.edu/touch/tools/ The Perl source code is here: http://www.isi.edu/touch/tools/rfc-what-i-mean.pl Joe From detlef.bosau at web.de Mon Aug 8 15:50:53 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 09 Aug 2011 00:50:53 +0200 Subject: [e2e] Once again: TCP RTT and RTO Message-ID: <4E40684D.6060808@web.de> Hi to all. Let me quote a sentence from the congavoid paper: > A good round trip time estimator, the core of the > retransmit timer, is the single most important feature of any > protocol implementation that expects to survive heavy load. To my understanding, a good RTT estimator and a good RTO are important not only from the simple viewpoint of network utilization (too small an RTO would be too aggressive, too large an RTO would result in underutilization) but for network stability as a whole. J&K outline that a congested network needs exponential damping in order to return to stability. To my understanding, this is achieved by the AIMD scheme in CA/Slow Start on the one hand and by the exponential RTO backoff on the other: particularly in flows which have not yet reached equilibrium, the RTO and the RTO backoff impose a (NB: exponentially damped!) "rate" on otherwise self-clocking flows. For this reason, I'm particularly interested in whether we have, or at least expect to achieve, a reliable means of getting RTT measurements _AND_, which is presumably anything but an orthogonal question, whether we expect the RTTM process to be weakly stationary.
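For readers who want the estimator under discussion in concrete form, here is a minimal sketch of the SRTT/RTTVAR/RTO computation as described in RFC 6298 (the RFC mentioned earlier in this thread), including the exponential backoff on timeout. The gains 1/8 and 1/4, the factor K = 4 and the 1-second lower bound are taken from that RFC; the class name, method names and the default clock granularity are assumptions made for illustration and do not come from any particular TCP stack.

```python
class RtoEstimator:
    """Illustrative sketch of the RFC 6298 retransmission-timer rules.

    All names here are hypothetical; only the constants follow the RFC.
    """

    ALPHA = 1 / 8  # gain of the SRTT filter
    BETA = 1 / 4   # gain of the RTTVAR filter
    K = 4          # multiplier applied to RTTVAR

    def __init__(self, granularity=0.001, min_rto=1.0, max_rto=60.0):
        self.granularity = granularity  # clock granularity G in seconds (assumed value)
        self.min_rto = min_rto          # lower bound on the RTO in seconds
        self.max_rto = max_rto          # optional upper bound, at least 60 s
        self.srtt = None                # smoothed RTT, unset until the first sample
        self.rttvar = None              # RTT variation estimate
        self.rto = 1.0                  # initial RTO before any sample is available

    def on_sample(self, rtt):
        """Feed one RTT sample in seconds and return the updated RTO.

        Per Karn's algorithm the caller must not pass samples taken from
        retransmitted segments, unless timestamps remove the ambiguity.
        """
        if self.srtt is None:
            # First measurement: initialise both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the old SRTT, then SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = min(max(rto, self.min_rto), self.max_rto)
        return self.rto

    def on_timeout(self):
        """Exponential backoff: double the RTO when the timer expires."""
        self.rto = min(self.rto * 2, self.max_rto)
        return self.rto
```

Feeding the estimator a series of samples (for instance the ping times mentioned earlier in the thread) and plotting `srtt` and `rto` against the raw samples is one simple way to see how well the EWMA tracks a path whose RTT is not stationary.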
These two issues are obviously intertwined, as a "stationary" process means a process which has settled after some time. (Although "some time" may be infinite from a mathematical point of view ;-)) Hence, the process becoming weakly stationary and the flow reaching equilibrium are basically the same story. I would appreciate a discussion about what is already achieved in this field and what we expect. Perhaps even with respect to different situations in wired networks and mobile wide area networks. Detlef -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From Barry.Constantine at jdsu.com Fri Aug 12 07:03:37 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Fri, 12 Aug 2011 07:03:37 -0700 Subject: [e2e] TCP Performance with Traffic Policing Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> Hi, I did some testing to compare various TCP stack behaviors in the midst of traffic policing. It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). In the iperf testing I conducted, the following set-up was used: Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server The delay was induced using hardware-based commercial gear. 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would kick in at 64K)
Throughput for Window (Mbps)
Platform   32K   64K   128K
--------------------------------------------
Linux      4.9   7.5   3.8
XP         5.8   6.6   5.2
Win7       5.3   3.4   0.44
Does anyone have experience with the intricacies of the various OSes in the midst of Traffic policing?
I was surprised to see such a variation in performance, especially since Windows 7 is supposed to be more advanced than XP. I am going to comb through the packet captures, but wondered if anyone had insight. Thank you, Barry -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110812/1c5317a8/attachment.html From perfgeek at mac.com Fri Aug 12 08:48:26 2011 From: perfgeek at mac.com (rick jones) Date: Fri, 12 Aug 2011 08:48:26 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> Message-ID: On Aug 12, 2011, at 7:03 AM, Barry Constantine wrote: > Hi, > > I did some testing to compare various TCP stack behaviors in the midst of traffic policing. > > It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). > > In the iperf testing I conducted, the following set-up was used: > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > The delay was induced using hardware base commercial gear. > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. > > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would > kick in at 64K) > > Throughput for Window (Mbps) > > Platform 32K 64K 128K > -------------------------------------------- > Linux 4.9 7.5 3.8 > XP 5.8 6.6 5.2 > Win7 5.3 3.4 0.44 > The folks in tcpm might be better able to help, but I'll point out one nit: "Linux" is not that much more specific than saying "Unix". It would be goodness to get into the habit of including the kernel version. And ID the server, since it takes two to TCP...
happy benchmarking, rick jones Wisdom teeth are impacted, people are affected by the effects of events From dart at es.net Fri Aug 12 09:10:57 2011 From: dart at es.net (Eli Dart) Date: Fri, 12 Aug 2011 09:10:57 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> Message-ID: <4E455091.9050601@es.net> The thing about policers is that they induce loss. So, what you're really testing is the ability of a TCP implementation to recover from loss under high-latency conditions, and adapt a connection to the existence of a bottleneck link with no buffer when TCP has potentially allocated more window than the BDP of the path requires. (I'm assuming that the host interface was 100Mbps or faster - this means that the bursts coming out of the host have an instantaneous rate that is probably at least 10x the policed rate, increasing the chance of loss even if the window is not too big). In my experience, most TCP implementations (and I say most only because I have not exhaustively tested all the various flavors) perform poorly in circumstances where there is any loss at all and latency is over 15-20 msec. Thanks, --eli On 8/12/11 7:03 AM, Barry Constantine wrote: > Hi, > > I did some testing to compare various TCP stack behaviors in the midst of traffic policing. > > It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). > > In the iperf testing I conducted, the following set-up was used: > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > The delay was induced using hardware base commercial gear. > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. 
> > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would > kick in at 64K) > > Throughput for Window (Mbps) > > Platform 32K 64K 128K > -------------------------------------------- > Linux 4.9 7.5 3.8 > XP 5.8 6.6 5.2 > Win7 5.3 3.4 0.44 > > > Do anyone have experience with the intricacies of the various OSes in the midst of > Traffic policing? I was surprised to see such a variation in performance, especially since Windows 7 is supposed to more advanced than XP, > > I am going to comb through the packet captures, but wondered if anyone had insight. > > Thank you, > Barry > > -- Eli Dart NOC: (510) 486-7600 ESnet Network Engineering Group (AS293) (800) 333-7638 Lawrence Berkeley National Laboratory PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3 From mellia at tlc.polito.it Fri Aug 12 09:16:38 2011 From: mellia at tlc.polito.it (Marco Mellia) Date: Fri, 12 Aug 2011 09:16:38 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> Message-ID: <4E4551E6.8060108@tlc.polito.it> Hi Barry, Welcome to the TCP testing mess :) From my experience:
- Do consider buffering policies. They are much more critical than anything else.
- In this respect, do not trust delay generators: they have to buffer packets, and sometimes the buffer they have is simply too small, so that they end up adding a huge loss probability.
- The same goes for shapers' buffering and algorithms. How big is the buffer?
- Similarly, do not trust iperf too much. The window parameter especially is very fuzzy in my experience.
Another hint: avoid connecting Gbps links that arrive at a bottleneck which is two orders of magnitude slower. That means that 100 packets arrive at the congested buffer while only 1 packet can exit. As you can imagine, the burstiness makes everything worse...
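The "100 packets in, 1 packet out" figure follows directly from the ratio of the two link rates. A small sketch of that arithmetic, assuming equal-sized packets arriving back-to-back at the fast link's rate; the function name and the example rates (1 Gbit/s into 10 Mbit/s) are illustrative assumptions, not measurements from this thread.

```python
def backlog_after_burst(n_packets, in_rate_bps, out_rate_bps):
    """Packets still queued at the bottleneck right after a burst has arrived.

    Assumes n_packets equal-sized packets arrive back-to-back at in_rate_bps
    while the bottleneck drains continuously at out_rate_bps.
    """
    drained = n_packets * out_rate_bps / in_rate_bps  # packets sent on during the burst
    return n_packets - drained


if __name__ == "__main__":
    # A 100-packet burst from a 1 Gbit/s link into a 10 Mbit/s bottleneck.
    print(backlog_after_burst(100, 1e9, 10e6))
```

Whatever part of that backlog does not fit into the bottleneck buffer is lost, which is why the buffer size matters so much in such a set-up.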
FYI, we have done similar tests, and got similar results. I can forward them to you if you like. Hope this helps, Marco > Hi, > > I did some testing to compare various TCP stack behaviors in the midst > of traffic policing. > > It is common practice for a network provider to police traffic to a > subscriber level agreement (SLA). > > In the iperf testing I conducted, the following set-up was used: > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > The delay was induced using hardware base commercial gear. > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. > > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window > (knowing that policing would > > kick in at 64K) > > Throughput for Window (Mbps) > > Platform 32K 64K 128K > > -------------------------------------------- > > Linux 4.9 7.5 3.8 > > XP 5.8 6.6 5.2 > > Win7 5.3 3.4 0.44 > > Do anyone have experience with the intricacies of the various OSes in > the midst of > > Traffic policing? I was surprised to see such a variation in > performance, especially since Windows 7 is supposed to more advanced > than XP, > > I am going to comb through the packet captures, but wondered if anyone > had insight. > > Thank you, > > Barry > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110812/dee3249d/attachment-0001.html From alexsm at gmail.com Fri Aug 12 09:30:42 2011 From: alexsm at gmail.com (Alex Moura) Date: Fri, 12 Aug 2011 13:30:42 -0300 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> Message-ID: On Fri, Aug 12, 2011 at 12:48, rick jones wrote: > > On Aug 12, 2011, at 7:03 AM, Barry Constantine wrote: > > > Hi, > > > > I did some testing to compare various TCP stack behaviors in the midst of > traffic policing.
> > > > It is common practice for a network provider to police traffic to a > subscriber level agreement (SLA). > > > > In the iperf testing I conducted, the following set-up was used: > > > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > > > The delay was induced using hardware base commercial gear. > > > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. > > > > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window > (knowing that policing would > > kick in at 64K) > > > > Throughput for Window (Mbps) > > > > Platform 32K 64K 128K > > -------------------------------------------- > > Linux 4.9 7.5 3.8 > > XP 5.8 6.6 5.2 > > Win7 5.3 3.4 0.44 > > > > The folks in tcpm might be better able to help? but I'll point-out one nit > - "Linux" is not that much more specific than saying "Unix" - it would be > goodness to get into the habit of including the kernel version. And ID the > server since it takes two to TCP... > BTW, including the latest FreeBSD (or some other BSD) and Mac OS X might give interesting, maybe useful, values for the work. Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110812/68554816/attachment.html From algold at rnp.br Fri Aug 12 09:32:50 2011 From: algold at rnp.br (Alexandre Grojsgold) Date: Fri, 12 Aug 2011 13:32:50 -0300 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> Message-ID: <4E4555B2.4030300@rnp.br> Is there a reason to consider X Mbps policing different from having an X Mbps link midway between source and destination? -- alg. On 12-08-2011 12:48, rick jones wrote: > On Aug 12, 2011, at 7:03 AM, Barry Constantine wrote: > >> Hi, >> >> I did some testing to compare various TCP stack behaviors in the midst of traffic policing.
>> >> It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). >> >> In the iperf testing I conducted, the following set-up was used: >> >> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >> >> The delay was induced using hardware base commercial gear. >> >> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. >> >> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would >> kick in at 64K) >> >> Throughput for Window (Mbps) >> >> Platform 32K 64K 128K >> -------------------------------------------- >> Linux 4.9 7.5 3.8 >> XP 5.8 6.6 5.2 >> Win7 5.3 3.4 0.44 >> > The folks in tcpm might be better able to help? but I'll point-out one nit - "Linux" is not that much more specific than saying "Unix" - it would be goodness to get into the habit of including the kernel version. And ID the server since it takes two to TCP... > > happy benchmarking, > > rick jones > Wisdom teeth are impacted, people are affected by the effects of events > -- _________________________________________________________________ *Alexandre L. Grojsgold*> Diretor de Engenharia e Opera??es Rede Nacional de Ensino e Pesquisa R. Lauro Muller 116 sala 1103 22.290-906 - Rio de Janeiro RJ - Brasil Tel: (21) 2102-9680 Cel: (21) 8136-2209 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110812/4699d775/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: figura1 Type: image/jpeg Size: 25666 bytes Desc: not available Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20110812/4699d775/figura1-0001.jpe From dart at es.net Fri Aug 12 11:37:13 2011 From: dart at es.net (Eli Dart) Date: Fri, 12 Aug 2011 11:37:13 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4555B2.4030300@rnp.br> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4555B2.4030300@rnp.br> Message-ID: <4E4572D9.1080308@es.net> On 8/12/11 9:32 AM, Alexandre Grojsgold wrote: > Is there a reason to consider X Mbps policing differnet of having a X Mbps link > midway between source and destination? In my experience, policing at rate X behaves like an interface of rate X with no buffer. This means a policer must drop if there is any oversubscription at all, while an interface can provide some buffering. As a result, TCP sees loss more easily in policed environments, especially if there is a large difference in bandwidth between the policed rate and the host interface rate (at any instant in time, the host is sending at wire-speed for its interface if it's got data to send and available window, regardless of average rate on the time scale of seconds). Of course, different router vendors have different buffering defaults (and different hardware capabilities), and some policers can be configured with burst allowances. However, many policers don't behave in the ways that they say they do, even when configured with burst allowances. As another post indicated, it's quite a mess... --eli > > -- alg. > > > > > On 12-08-2011 12:48, rick jones wrote: >> On Aug 12, 2011, at 7:03 AM, Barry Constantine wrote: >> >>> Hi, >>> >>> I did some testing to compare various TCP stack behaviors in the midst of traffic policing. >>> >>> It is common practice for a network provider to police traffic to a subscriber level agreement (SLA).
>>> >>> In the iperf testing I conducted, the following set-up was used: >>> >>> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >>> >>> The delay was induced using hardware base commercial gear. >>> >>> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. >>> >>> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would >>> kick in at 64K) >>> >>> Throughput for Window (Mbps) >>> >>> Platform 32K 64K 128K >>> -------------------------------------------- >>> Linux 4.9 7.5 3.8 >>> XP 5.8 6.6 5.2 >>> Win7 5.3 3.4 0.44 >>> >> The folks in tcpm might be better able to help? but I'll point-out one nit - "Linux" is not that much more specific than saying "Unix" - it would be goodness to get into the habit of including the kernel version. And ID the server since it takes two to TCP... >> >> happy benchmarking, >> >> rick jones >> Wisdom teeth are impacted, people are affected by the effects of events >> > > > -- > > _________________________________________________________________ > > > > *Alexandre L. Grojsgold*> > Diretor de Engenharia e Opera??es > Rede Nacional de Ensino e Pesquisa > R. Lauro Muller 116 sala 1103 > 22.290-906 - Rio de Janeiro RJ - Brasil > Tel: (21) 2102-9680 Cel: (21) 8136-2209 > > > -- Eli Dart NOC: (510) 486-7600 ESnet Network Engineering Group (AS293) (800) 333-7638 Lawrence Berkeley National Laboratory PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3 From Barry.Constantine at jdsu.com Fri Aug 12 12:16:52 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Fri, 12 Aug 2011 12:16:52 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4572D9.1080308@es.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4555B2.4030300@rnp.br> <4E4572D9.1080308@es.net> Message-ID: <070EF7B3-CB40-495F-945B-07853ED141D2@jdsu.com> Thanks for answering this Eli, very well said. 
In my experience, the buffering of the slower link allows TCP to adapt more gracefully. Also, thanks to all on this list; this is my first time posting, and the suggestions and information have been fantastic. Barry Sent from my iPhone On Aug 12, 2011, at 2:44 PM, "Eli Dart" wrote: > > > On 8/12/11 9:32 AM, Alexandre Grojsgold wrote: >> Is there a reason to consider X Mbps policing differnet of having a X Mbps link >> midway between source and destination? > > In my experience, policing at rate X behaves like an interface of rate X > with no buffer. This means a policer must drop if there is any > oversubscription at all, while an interface can provide some buffering. > > This means that TCP sees loss more easily in policed environments, > especially if there is a large difference in bandwidth between the > policed rate and the host interface rate (at any instant in time, the > host is sending at wire-speed for its interface if it's got data to send > and available window, regardless of average rate on the time scale of > seconds). > > Of course, different router vendors have different buffering defaults > (and different hardware capabilities), and some policers can be > configured with burst allowances. However, many policers don't behave > in the ways that they say they do, even when configured with burst > allowances. As another post indicated, its quite a mess... > > --eli > > >> >> -- alg. >> >> >> >> >> On 12-08-2011 12:48, rick jones wrote: >>> On Aug 12, 2011, at 7:03 AM, Barry Constantine wrote: >>> >>>> Hi, >>>> >>>> I did some testing to compare various TCP stack behaviors in the midst of traffic policing. >>>> >>>> It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). >>>> >>>> In the iperf testing I conducted, the following set-up was used: >>>> >>>> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >>>> >>>> The delay was induced using hardware base commercial gear.
>>>> >>>> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. >>>> >>>> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would >>>> kick in at 64K) >>>> >>>> Throughput for Window (Mbps) >>>> >>>> Platform 32K 64K 128K >>>> -------------------------------------------- >>>> Linux 4.9 7.5 3.8 >>>> XP 5.8 6.6 5.2 >>>> Win7 5.3 3.4 0.44 >>>> >>> The folks in tcpm might be better able to help? but I'll point-out one nit - "Linux" is not that much more specific than saying "Unix" - it would be goodness to get into the habit of including the kernel version. And ID the server since it takes two to TCP... >>> >>> happy benchmarking, >>> >>> rick jones >>> Wisdom teeth are impacted, people are affected by the effects of events >>> >> >> >> -- >> >> _________________________________________________________________ >> >> >> >> *Alexandre L. Grojsgold*> >> Diretor de Engenharia e Opera??es >> Rede Nacional de Ensino e Pesquisa >> R. Lauro Muller 116 sala 1103 >> 22.290-906 - Rio de Janeiro RJ - Brasil >> Tel: (21) 2102-9680 Cel: (21) 8136-2209 >> >> >> > > -- > Eli Dart NOC: (510) 486-7600 > ESnet Network Engineering Group (AS293) (800) 333-7638 > Lawrence Berkeley National Laboratory > PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3 From Barry.Constantine at jdsu.com Fri Aug 12 12:37:30 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Fri, 12 Aug 2011 12:37:30 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E455091.9050601@es.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E455091.9050601@es.net> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009DB9@MILEXCH1.ds.jdsu.net> Hi, Let me provide better background information. End-customers (business companies) purchase a network service from a network provider and purchase a Service Level Agreement (SLA). 
This SLA specifies the committed information rate (CIR) that the provider will guarantee, along with loss and latency specifications. The end customer generally connects with either a 100M or a 1G interface, and the provider generally polices down to the CIR. The disconnect comes about like this: providers qualify the network at Layer 2/3 by running stateless traffic up to the CIR, and of course network performance is fine. The end customer then runs their traffic across the network (often running iperf or FTP first), gets significantly worse performance than the CIR, and the finger-pointing and churn begin. I am trying to educate network providers that:
a) Testing a network with Layer 2/3 stateless traffic is not a great test (it is a good starting point, but TCP testing needs to be added as best practice)
b) The benefits of traffic shaping can be enormous
So I hope this provides some background, and again, this list has provided great nuggets of information very quickly. Regards, Barry -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Eli Dart Sent: Friday, August 12, 2011 12:11 PM To: end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing The thing about policers is that they induce loss. So, what you're really testing is the ability of a TCP implementation to recover from loss under high-latency conditions, and adapt a connection to the existence of a bottleneck link with no buffer when TCP has potentially allocated more window than the BDP of the path requires. (I'm assuming that the host interface was 100Mbps or faster - this means that the bursts coming out of the host have an instantaneous rate that is probably at least 10x the policed rate, increasing the chance of loss even if the window is not too big).
In my experience, most TCP implementations (and I say most only because I have not exhaustively tested all the various flavors) perform poorly in circumstances where there is any loss at all and latency is over 15-20 msec. Thanks, --eli On 8/12/11 7:03 AM, Barry Constantine wrote: > Hi, > > I did some testing to compare various TCP stack behaviors in the midst of traffic policing. > > It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). > > In the iperf testing I conducted, the following set-up was used: > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > The delay was induced using hardware base commercial gear. > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. > > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would > kick in at 64K) > > Throughput for Window (Mbps) > > Platform 32K 64K 128K > -------------------------------------------- > Linux 4.9 7.5 3.8 > XP 5.8 6.6 5.2 > Win7 5.3 3.4 0.44 > > > Do anyone have experience with the intricacies of the various OSes in the midst of > Traffic policing? I was surprised to see such a variation in performance, especially since Windows 7 is supposed to more advanced than XP, > > I am going to comb through the packet captures, but wondered if anyone had insight. 
> > Thank you, > Barry > > -- Eli Dart NOC: (510) 486-7600 ESnet Network Engineering Group (AS293) (800) 333-7638 Lawrence Berkeley National Laboratory PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3 From anoop at alumni.duke.edu Fri Aug 12 14:43:17 2011 From: anoop at alumni.duke.edu (Anoop Ghanwani) Date: Fri, 12 Aug 2011 14:43:17 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009DB9@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E455091.9050601@es.net> <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009DB9@MILEXCH1.ds.jdsu.net> Message-ID: On Fri, Aug 12, 2011 at 12:37 PM, Barry Constantine < Barry.Constantine at jdsu.com> wrote: > > b) The benefits of traffic shaping can be enormous > I don't think anyone doubts this. It is just super-expensive to implement compared to a policer -- think per-customer counters vs per-customer queues. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110812/baf7bc2e/attachment.html From Anil.Agarwal at viasat.com Sat Aug 13 08:41:41 2011 From: Anil.Agarwal at viasat.com (Agarwal, Anil) Date: Sat, 13 Aug 2011 11:41:41 -0400 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <070EF7B3-CB40-495F-945B-07853ED141D2@jdsu.com> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net><4E4555B2.4030300@rnp.br> <4E4572D9.1080308@es.net> <070EF7B3-CB40-495F-945B-07853ED141D2@jdsu.com> Message-ID: <0B0A20D0B3ECD742AA2514C8DDA3B065055F485E@VGAEXCH01.hq.corp.viasat.com> Barry, You might want to set the "burst size" parameter of the policer to a higher value - e.g., equal to the bandwidth-delay-product at the policer rate or even higher. This will have a **similar** effect as having a buffer with an equivalent rate link. 
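To make the burst-size suggestion concrete, here is a minimal sketch of a single-rate token-bucket policer, assuming tokens accrue at the policed rate and each arriving packet is either forwarded or dropped immediately (no queue). The class, the helper functions and the example numbers (1500-byte packets, a 100 Mbit/s host interface) are assumptions made for illustration; this is not a model of any specific vendor's policer.

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product in bytes for a rate in bit/s and an RTT in seconds."""
    return rate_bps * rtt_s / 8.0


class TokenBucketPolicer:
    """Hypothetical single-rate policer: forward if enough tokens, else drop."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0   # token refill rate in bytes per second
        self.burst = burst_bytes     # bucket depth, i.e. the "burst size"
        self.tokens = burst_bytes    # the bucket starts full
        self.last = 0.0              # arrival time of the previous packet

    def offer(self, now, size):
        """Return True if a packet of `size` bytes arriving at `now` conforms."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # no buffer: a non-conforming packet is simply dropped


def drops_in_burst(policer, n_packets, pkt_bytes, line_rate_bps):
    """Count drops when n_packets arrive back-to-back at the host's line rate."""
    gap = pkt_bytes * 8.0 / line_rate_bps  # spacing between packet arrivals
    return sum(0 if policer.offer(i * gap, pkt_bytes) else 1 for i in range(n_packets))


if __name__ == "__main__":
    burst = bdp_bytes(10e6, 0.050)  # 10 Mbit/s policer, 50 ms RTT
    window_packets = 43             # roughly a 64 KB window of 1500-byte packets
    for depth in (3000, burst):
        policer = TokenBucketPolicer(10e6, depth)
        lost = drops_in_burst(policer, window_packets, 1500, 100e6)
        print(f"burst size {depth:.0f} bytes: {lost} of {window_packets} packets dropped")
```

With a 10 Mbit/s rate and a 50 ms RTT, `bdp_bytes` gives 62,500 bytes, which matches the roughly 62,000 bytes quoted at the start of the thread. In this simplified model, a bucket of that depth passes a whole 64 KB window sent at line rate, while a bucket of only two packets drops most of it.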
Also, testing with multiple TCP connections will result in higher aggregate throughput, even at low burst size values. Also, check if TCP SACK is enabled in all your test cases. You should be able to achieve throughput close to the policer rate. Note that a policer with rate R bps and burst size of x bytes is not exactly equivalent to a link at rate R bps and x bytes of queue space. On a R bps link, TCP packets get spaced out more evenly due to the self-clocking nature of TCP and the transmission time of each packet at R bps. With a policer, there is no "transmission time" effect at the policer; packets in packet trains of a TCP connection tend to get spaced more closely, which can drive a policer into the state, where it drops packets, even when the average data rate (measured over an RTT) is < R bps. Having multiple connections helps - their packet trains tend to get staggered over time. Regards, Anil Anil Agarwal ViaSat Inc. -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Barry Constantine Sent: Friday, August 12, 2011 3:17 PM To: dart at es.net Cc: Alexandre Grojsgold; end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing Thanks for answering this Eli, very well said. The buffeting of the slower link more gracefully allows TCP to adapt in my experience. Also thanks to all on this list, my first time posting and the suggestions and information have been fantastic. Barry Sent from my iPhone On Aug 12, 2011, at 2:44 PM, "Eli Dart" wrote: > > > On 8/12/11 9:32 AM, Alexandre Grojsgold wrote: >> Is there a reason to consider X Mbps policing differnet of having a X Mbps link >> midway between source and destination? > > In my experience, policing at rate X behaves like an interface of rate X > with no buffer. This means a policer must drop if there is any > oversubscription at all, while an interface can provide some buffering. 
> > This means that TCP sees loss more easily in policed environments, > especially if there is a large difference in bandwidth between the > policed rate and the host interface rate (at any instant in time, the > host is sending at wire-speed for its interface if it's got data to send > and available window, regardless of average rate on the time scale of > seconds). > > Of course, different router vendors have different buffering defaults > (and different hardware capabilities), and some policers can be > configured with burst allowances. However, many policers don't behave > in the ways that they say they do, even when configured with burst > allowances. As another post indicated, its quite a mess... > > --eli > > >> >> -- alg. >> >> >> >> >> On 12-08-2011 12:48, rick jones wrote: >>> On Aug 12, 2011, at 7:03 AM, Barry Constantine wrote: >>> >>>> Hi, >>>> >>>> I did some testing to compare various TCP stack behaviors in the midst of traffic policing. >>>> >>>> It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). >>>> >>>> In the iperf testing I conducted, the following set-up was used: >>>> >>>> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >>>> >>>> The delay was induced using hardware base commercial gear. >>>> >>>> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. >>>> >>>> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would >>>> kick in at 64K) >>>> >>>> Throughput for Window (Mbps) >>>> >>>> Platform 32K 64K 128K >>>> -------------------------------------------- >>>> Linux 4.9 7.5 3.8 >>>> XP 5.8 6.6 5.2 >>>> Win7 5.3 3.4 0.44 >>>> >>> The folks in tcpm might be better able to help? but I'll point-out one nit - "Linux" is not that much more specific than saying "Unix" - it would be goodness to get into the habit of including the kernel version. And ID the server since it takes two to TCP... 
>>>
>>> happy benchmarking,
>>>
>>> rick jones
>>> Wisdom teeth are impacted, people are affected by the effects of events
>>>
>>
>>
>> --
>>
>> _________________________________________________________________
>>
>>
>>
>> *Alexandre L. Grojsgold*>
>> Diretor de Engenharia e Operações
>> Rede Nacional de Ensino e Pesquisa
>> R. Lauro Muller 116 sala 1103
>> 22.290-906 - Rio de Janeiro RJ - Brasil
>> Tel: (21) 2102-9680 Cel: (21) 8136-2209
>>
>>
>>
>
> --
> Eli Dart NOC: (510) 486-7600
> ESnet Network Engineering Group (AS293) (800) 333-7638
> Lawrence Berkeley National Laboratory
> PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3

From anil at cmmacs.ernet.in Sat Aug 13 12:14:47 2011
From: anil at cmmacs.ernet.in (anil@cmmacs.ernet.in)
Date: Sun, 14 Aug 2011 00:44:47 +0530 (IST)
Subject: [e2e] TCP Performance with Traffic Policing
In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net>
References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net>
Message-ID: <1449.115.241.106.33.1313262887.squirrel@202.41.64.20>

Hi Barry,

Quite interesting.

I would guess that the different flows (Linux, XP and Win7) in your experiment might have exhibited different burst patterns, and that this would have made the policer treat these flows differently. A time vs. sequence plot on either side of the policing box should help to bring out the real dynamics.

Also, there was a similar post in e2e almost a decade ago.
http://www.postel.org/pipermail/end2end-interest/2002-June/002154.html

It is worth having a look.

Anil

> Hi,
>
> I did some testing to compare various TCP stack behaviors in the midst of
> traffic policing.
>
> It is common practice for a network provider to police traffic to a
> subscriber level agreement (SLA).
> > In the iperf testing I conducted, the following set-up was used: > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > The delay was induced using hardware base commercial gear. > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. > > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window > (knowing that policing would > kick in at 64K) > > Throughput for Window (Mbps) > > Platform 32K 64K 128K > -------------------------------------------- > Linux 4.9 7.5 3.8 > XP 5.8 6.6 5.2 > Win7 5.3 3.4 0.44 > > > Do anyone have experience with the intricacies of the various OSes in the > midst of > Traffic policing? I was surprised to see such a variation in performance, > especially since Windows 7 is supposed to more advanced than XP, > > I am going to comb through the packet captures, but wondered if anyone had > insight. > > Thank you, > Barry > > > From Barry.Constantine at jdsu.com Sat Aug 13 12:46:03 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Sat, 13 Aug 2011 12:46:03 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <0B0A20D0B3ECD742AA2514C8DDA3B065055F485E@VGAEXCH01.hq.corp.viasat.com> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net><4E4555B2.4030300@rnp.br> <4E4572D9.1080308@es.net> <070EF7B3-CB40-495F-945B-07853ED141D2@jdsu.com> <0B0A20D0B3ECD742AA2514C8DDA3B065055F485E@VGAEXCH01.hq.corp.viasat.com> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF8437F@MILEXCH1.ds.jdsu.net> Hi Anil, I have played around with burst size in the past, but not with this particular bench test. The equipment is still in place and I should be able to do more testing next week as you propose. I will let you know the results. 
Thanks, -Barry -----Original Message----- From: Agarwal, Anil [mailto:Anil.Agarwal at viasat.com] Sent: Saturday, August 13, 2011 11:42 AM To: Barry Constantine; dart at es.net Cc: Alexandre Grojsgold; end2end-interest at postel.org Subject: RE: [e2e] TCP Performance with Traffic Policing Barry, You might want to set the "burst size" parameter of the policer to a higher value - e.g., equal to the bandwidth-delay-product at the policer rate or even higher. This will have a **similar** effect as having a buffer with an equivalent rate link. Also, testing with multiple TCP connections will result in higher aggregate throughput, even at low burst size values. Also, check if TCP SACK is enabled in all your test cases. You should be able to achieve throughput close to the policer rate. Note that a policer with rate R bps and burst size of x bytes is not exactly equivalent to a link at rate R bps and x bytes of queue space. On a R bps link, TCP packets get spaced out more evenly due to the self-clocking nature of TCP and the transmission time of each packet at R bps. With a policer, there is no "transmission time" effect at the policer; packets in packet trains of a TCP connection tend to get spaced more closely, which can drive a policer into the state, where it drops packets, even when the average data rate (measured over an RTT) is < R bps. Having multiple connections helps - their packet trains tend to get staggered over time. Regards, Anil Anil Agarwal ViaSat Inc. -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Barry Constantine Sent: Friday, August 12, 2011 3:17 PM To: dart at es.net Cc: Alexandre Grojsgold; end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing Thanks for answering this Eli, very well said. The buffeting of the slower link more gracefully allows TCP to adapt in my experience. 
Also thanks to all on this list, my first time posting and the suggestions and information have been fantastic. Barry Sent from my iPhone On Aug 12, 2011, at 2:44 PM, "Eli Dart" wrote: > > > On 8/12/11 9:32 AM, Alexandre Grojsgold wrote: >> Is there a reason to consider X Mbps policing differnet of having a X Mbps link >> midway between source and destination? > > In my experience, policing at rate X behaves like an interface of rate X > with no buffer. This means a policer must drop if there is any > oversubscription at all, while an interface can provide some buffering. > > This means that TCP sees loss more easily in policed environments, > especially if there is a large difference in bandwidth between the > policed rate and the host interface rate (at any instant in time, the > host is sending at wire-speed for its interface if it's got data to send > and available window, regardless of average rate on the time scale of > seconds). > > Of course, different router vendors have different buffering defaults > (and different hardware capabilities), and some policers can be > configured with burst allowances. However, many policers don't behave > in the ways that they say they do, even when configured with burst > allowances. As another post indicated, its quite a mess... > > --eli > > >> >> -- alg. >> >> >> >> >> On 12-08-2011 12:48, rick jones wrote: >>> On Aug 12, 2011, at 7:03 AM, Barry Constantine wrote: >>> >>>> Hi, >>>> >>>> I did some testing to compare various TCP stack behaviors in the midst of traffic policing. >>>> >>>> It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). >>>> >>>> In the iperf testing I conducted, the following set-up was used: >>>> >>>> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >>>> >>>> The delay was induced using hardware base commercial gear. >>>> >>>> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. 
>>>> >>>> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would >>>> kick in at 64K) >>>> >>>> Throughput for Window (Mbps) >>>> >>>> Platform 32K 64K 128K >>>> -------------------------------------------- >>>> Linux 4.9 7.5 3.8 >>>> XP 5.8 6.6 5.2 >>>> Win7 5.3 3.4 0.44 >>>> >>> The folks in tcpm might be better able to help? but I'll point-out one nit - "Linux" is not that much more specific than saying "Unix" - it would be goodness to get into the habit of including the kernel version. And ID the server since it takes two to TCP... >>> >>> happy benchmarking, >>> >>> rick jones >>> Wisdom teeth are impacted, people are affected by the effects of events >>> >> >> >> -- >> >> _________________________________________________________________ >> >> >> >> *Alexandre L. Grojsgold*> >> Diretor de Engenharia e Opera??es >> Rede Nacional de Ensino e Pesquisa >> R. Lauro Muller 116 sala 1103 >> 22.290-906 - Rio de Janeiro RJ - Brasil >> Tel: (21) 2102-9680 Cel: (21) 8136-2209 >> >> >> > > -- > Eli Dart NOC: (510) 486-7600 > ESnet Network Engineering Group (AS293) (800) 333-7638 > Lawrence Berkeley National Laboratory > PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3 From Barry.Constantine at jdsu.com Sat Aug 13 12:49:17 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Sat, 13 Aug 2011 12:49:17 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> Hi Anil, Your assessments seem reasonable and I will look at the packet captures with Wireshark as you suggest. Also, thanks for pointing me to the old post; it was useful as well. 
-Barry -----Original Message----- From: anil at cmmacs.ernet.in [mailto:anil at cmmacs.ernet.in] Sent: Saturday, August 13, 2011 3:15 PM To: Barry Constantine Cc: end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing Hi Barry, Quite interesting I would guess that the different flows (Linux, XP and Win7) in your experiment might have expressed varying bursty patterns, and that would have made the policing process to treat these flows differently. A time vs sequence plot on either side of the policing box should help to bring out the real dynamics. Also, there was a similar post in e2e almost about a decade ago. http://www.postel.org/pipermail/end2end-interest/2002-June/002154.html It is worth having a look Anil > Hi, > > I did some testing to compare various TCP stack behaviors in the midst of > traffic policing. > > It is common practice for a network provider to police traffic to a > subscriber level agreement (SLA). > > In the iperf testing I conducted, the following set-up was used: > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > The delay was induced using hardware base commercial gear. > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. > > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window > (knowing that policing would > kick in at 64K) > > Throughput for Window (Mbps) > > Platform 32K 64K 128K > -------------------------------------------- > Linux 4.9 7.5 3.8 > XP 5.8 6.6 5.2 > Win7 5.3 3.4 0.44 > > > Do anyone have experience with the intricacies of the various OSes in the > midst of > Traffic policing? I was surprised to see such a variation in performance, > especially since Windows 7 is supposed to more advanced than XP, > > I am going to comb through the packet captures, but wondered if anyone had > insight. 
>
> Thank you,
> Barry
>
>
>

From anil at cmmacs.ernet.in Sat Aug 13 13:23:52 2011
From: anil at cmmacs.ernet.in (anil@cmmacs.ernet.in)
Date: Sun, 14 Aug 2011 01:53:52 +0530 (IST)
Subject: [e2e] TCP Performance with Traffic Policing
In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net>
References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net>
Message-ID: <1793.115.241.106.33.1313267032.squirrel@202.41.64.20>

Hi Barry:

I would be glad to see the plots/sequence data in case you would like to share them. If the cause is burstiness, then the next interesting question would be why the same TCP sender reacted quite differently to the different (standard) clients when the policing was in between, and normally otherwise.

Anil

> Hi Anil,
>
> Your assessments seem reasonable and I will look at the packet captures
> with Wireshark as you suggest.
>
> Also, thanks for pointing me to the old post; it was useful as well.
>
> -Barry
>
> -----Original Message-----
> From: anil at cmmacs.ernet.in [mailto:anil at cmmacs.ernet.in]
> Sent: Saturday, August 13, 2011 3:15 PM
> To: Barry Constantine
> Cc: end2end-interest at postel.org
> Subject: Re: [e2e] TCP Performance with Traffic Policing
>
> Hi Barry,
>
> Quite interesting
>
> I would guess that the different flows (Linux, XP and Win7) in your
> experiment might have expressed varying bursty patterns, and that would
> have made the policing process to treat these flows differently. A time vs
> sequence plot on either side of the policing box should help to bring out
> the real dynamics.
>
> Also, there was a similar post in e2e almost about a decade ago.
> http://www.postel.org/pipermail/end2end-interest/2002-June/002154.html
>
> It is worth having a look
>
> Anil
>
>> Hi,
>>
>> I did some testing to compare various TCP stack behaviors in the midst
>> of
>> traffic policing.
>> >> It is common practice for a network provider to police traffic to a >> subscriber level agreement (SLA). >> >> In the iperf testing I conducted, the following set-up was used: >> >> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >> >> The delay was induced using hardware base commercial gear. >> >> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. >> >> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window >> (knowing that policing would >> kick in at 64K) >> >> Throughput for Window (Mbps) >> >> Platform 32K 64K 128K >> -------------------------------------------- >> Linux 4.9 7.5 3.8 >> XP 5.8 6.6 5.2 >> Win7 5.3 3.4 0.44 >> >> >> Do anyone have experience with the intricacies of the various OSes in >> the >> midst of >> Traffic policing? I was surprised to see such a variation in >> performance, >> especially since Windows 7 is supposed to more advanced than XP, >> >> I am going to comb through the packet captures, but wondered if anyone >> had >> insight. >> >> Thank you, >> Barry >> >> >> > > From detlef.bosau at web.de Sat Aug 13 16:58:20 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Sun, 14 Aug 2011 01:58:20 +0200 Subject: [e2e] TCP Performance over WWAN In-Reply-To: <4E40684D.6060808@web.de> References: <4E40684D.6060808@web.de> Message-ID: <4E470F9C.2040903@web.de> Hi to all, this is not basically a new topic - however, it might be useful to look at the problem from a somewhat different perspective. Some time ago, Dominik Kaspar posted some surprisingly huge RTOs and RTTs which he observed in network paths which included WWAN links. IIRC, many of us were quite surprised about these results and did not really get a clue where these latencies came from. 
It is not too difficult to name a few possible reasons, e.g.,

- huge recovery latencies (one poster pointed out that, e.g., 802.11 LANs provide for up to 254 sending attempts; actually this may be even larger, I have to look it up, but I think one could even achieve an unlimited number of transmission attempts),
- excessive MAC latencies resulting from huge numbers of retransmissions, which lead to excessive competition for shared media,
- fluctuations in the net bit rate when the line coding, channel coding etc. are adapted to accommodate varying channel properties,
- transient "link outages", i.e., suspended flows when a channel becomes too bad, or consequences of opportunistic scheduling,
- mixed best-effort / QoS traffic,
- roaming,
- varying cell load,

only to name a few. The list is certainly incomplete.

However, my interest is not to find out all the reasons for disturbances. I'm interested in whether these things are really annoying or not, and in whether they cause grief or not. So we'll find those disturbances which cause harm to TCP - and get them out of harm's way. (Or at least something similar ;-))

It would be of great help if I could obtain some real-world latency traces or some real-world packet/block-corruption ratio traces for this purpose. Unfortunately, I cannot provide appropriate traces on my own, so I ask whether there is existing material available. Certainly, getting in touch with colleagues who work in this area would be helpful as well.

Thanks.
Detlef

--
------------------------------------------------------------------
Detlef Bosau
Galileistraße 30
70565 Stuttgart
Tel.: +49 711 5208031
mobile: +49 172 6819937
skype: detlef.bosau
ICQ: 566129673
detlef.bosau at web.de http://www.detlef-bosau.de
------------------------------------------------------------------

From detlef.bosau at web.de Sun Aug 14 11:37:46 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Sun, 14 Aug 2011 20:37:46 +0200
Subject: [e2e] TCP Performance over WWAN
In-Reply-To: <4E47FB37.5000708@freedesktop.org>
References: <4E40684D.6060808@web.de> <4E470F9C.2040903@web.de> <4E47FB37.5000708@freedesktop.org>
Message-ID: <4E4815FA.7010305@web.de>

Just a very first reply:

On 08/14/2011 06:43 PM, Jim Gettys wrote:
>
> With aggregation, some packets may get through, and others be damaged
> (802.11n gives you a bit mask telling you which are intact and which are
> damaged). This all gets complicated, as, for example, you don't want to
> drop three packets from the same flow, etc. So the drivers have had to
> all get reworked for 802.11n. So there is buffering after initial
> transmission, so that packets can be retransmitted later if they failed
> in their first attempt.
>

At first glance, I think this may cause packet reordering. How do you deal with TCP then?

> Dave Taht started putting together an OpenWrt distribution (called
> CeroWrt) for bufferbloat work we're doing, as you might imagine, we ran
> right into these issues. In his test case, it was insane time even for
> ping, much less TCP.
Detlef

--
------------------------------------------------------------------
Detlef Bosau
Galileistraße 30
70565 Stuttgart
Tel.: +49 711 5208031
mobile: +49 172 6819937
skype: detlef.bosau
ICQ: 566129673
detlef.bosau at web.de http://www.detlef-bosau.de
------------------------------------------------------------------

From detlef.bosau at web.de Mon Aug 15 01:38:17 2011
From: detlef.bosau at web.de (Detlef Bosau)
Date: Mon, 15 Aug 2011 10:38:17 +0200
Subject: [e2e] TCP Performance over WWAN
In-Reply-To: <4E483C27.8000802@freedesktop.org>
References: <4E40684D.6060808@web.de> <4E470F9C.2040903@web.de> <4E47FB37.5000708@freedesktop.org> <4E4815FA.7010305@web.de> <4E483C27.8000802@freedesktop.org>
Message-ID: <4E48DAF9.2070008@web.de>

On 08/14/2011 11:20 PM, Jim Gettys wrote:
>
> Best wait for the podcast/transcript of Dave questioning Andrew, rather
> than interrogating me.
> - Jim

Fine.

However, at second glance: Because I want to do research based upon these things, I have to make a very clear distinction between problems and bugs.

So, when we talk about latencies and loss, are these structural problems where scientific research is necessary? Or are we talking about implementation bugs?

Or, to put it in a politically incorrect way: We all know that Windows has lots of bugs. However, these are not scientific problems, and it is not up to the university to make M$ programmers read textbooks on OS ;-)

This may sound harsh; however, when I submit a paper on wireless networks, the remark "This is no scientific problem but an artifact or a bug" is likely to be the first reviewer comment.
--
------------------------------------------------------------------
Detlef Bosau
Galileistraße 30
70565 Stuttgart
Tel.: +49 711 5208031
mobile: +49 172 6819937
skype: detlef.bosau
ICQ: 566129673
detlef.bosau at web.de http://www.detlef-bosau.de
------------------------------------------------------------------

From Barry.Constantine at jdsu.com Mon Aug 15 06:39:46 2011
From: Barry.Constantine at jdsu.com (Barry Constantine)
Date: Mon, 15 Aug 2011 06:39:46 -0700
Subject: [e2e] TCP Performance with Traffic Policing
In-Reply-To: <0B0A20D0B3ECD742AA2514C8DDA3B065055F485E@VGAEXCH01.hq.corp.viasat.com>
References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net><4E4555B2.4030300@rnp.br> <4E4572D9.1080308@es.net> <070EF7B3-CB40-495F-945B-07853ED141D2@jdsu.com> <0B0A20D0B3ECD742AA2514C8DDA3B065055F485E@VGAEXCH01.hq.corp.viasat.com>
Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1CCEC905@MILEXCH1.ds.jdsu.net>

Hi,

So I reran the 64KB and 128KB window tests from the Linux client, under the same test conditions, but tweaked the bc value of the policer.

The default value was 312,500 (the value used for the original tests), and I increased it to the maximum of 1,000,000. For the 64KB window test, no packets were dropped, but for the 128KB test, the results remained the same.

I also set be to 1,000,000, with the same result. I also played with the policer PIR value, but no luck there.

Any other suggestions?

Thanks,
Barry

-----Original Message-----
From: Agarwal, Anil [mailto:Anil.Agarwal at viasat.com]
Sent: Saturday, August 13, 2011 11:42 AM
To: Barry Constantine; dart at es.net
Cc: Alexandre Grojsgold; end2end-interest at postel.org
Subject: RE: [e2e] TCP Performance with Traffic Policing

Barry,

You might want to set the "burst size" parameter of the policer to a higher value - e.g., equal to the bandwidth-delay-product at the policer rate or even higher. This will have a **similar** effect as having a buffer with an equivalent rate link.
Also, testing with multiple TCP connections will result in higher aggregate throughput, even at low burst size values. Also, check if TCP SACK is enabled in all your test cases. You should be able to achieve throughput close to the policer rate. Note that a policer with rate R bps and burst size of x bytes is not exactly equivalent to a link at rate R bps and x bytes of queue space. On a R bps link, TCP packets get spaced out more evenly due to the self-clocking nature of TCP and the transmission time of each packet at R bps. With a policer, there is no "transmission time" effect at the policer; packets in packet trains of a TCP connection tend to get spaced more closely, which can drive a policer into the state, where it drops packets, even when the average data rate (measured over an RTT) is < R bps. Having multiple connections helps - their packet trains tend to get staggered over time. Regards, Anil Anil Agarwal ViaSat Inc. -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Barry Constantine Sent: Friday, August 12, 2011 3:17 PM To: dart at es.net Cc: Alexandre Grojsgold; end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing Thanks for answering this Eli, very well said. The buffeting of the slower link more gracefully allows TCP to adapt in my experience. Also thanks to all on this list, my first time posting and the suggestions and information have been fantastic. Barry Sent from my iPhone On Aug 12, 2011, at 2:44 PM, "Eli Dart" wrote: > > > On 8/12/11 9:32 AM, Alexandre Grojsgold wrote: >> Is there a reason to consider X Mbps policing differnet of having a X Mbps link >> midway between source and destination? > > In my experience, policing at rate X behaves like an interface of rate X > with no buffer. This means a policer must drop if there is any > oversubscription at all, while an interface can provide some buffering. 
> > This means that TCP sees loss more easily in policed environments, > especially if there is a large difference in bandwidth between the > policed rate and the host interface rate (at any instant in time, the > host is sending at wire-speed for its interface if it's got data to send > and available window, regardless of average rate on the time scale of > seconds). > > Of course, different router vendors have different buffering defaults > (and different hardware capabilities), and some policers can be > configured with burst allowances. However, many policers don't behave > in the ways that they say they do, even when configured with burst > allowances. As another post indicated, its quite a mess... > > --eli > > >> >> -- alg. >> >> >> >> >> On 12-08-2011 12:48, rick jones wrote: >>> On Aug 12, 2011, at 7:03 AM, Barry Constantine wrote: >>> >>>> Hi, >>>> >>>> I did some testing to compare various TCP stack behaviors in the midst of traffic policing. >>>> >>>> It is common practice for a network provider to police traffic to a subscriber level agreement (SLA). >>>> >>>> In the iperf testing I conducted, the following set-up was used: >>>> >>>> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >>>> >>>> The delay was induced using hardware base commercial gear. >>>> >>>> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. >>>> >>>> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window (knowing that policing would >>>> kick in at 64K) >>>> >>>> Throughput for Window (Mbps) >>>> >>>> Platform 32K 64K 128K >>>> -------------------------------------------- >>>> Linux 4.9 7.5 3.8 >>>> XP 5.8 6.6 5.2 >>>> Win7 5.3 3.4 0.44 >>>> >>> The folks in tcpm might be better able to help? but I'll point-out one nit - "Linux" is not that much more specific than saying "Unix" - it would be goodness to get into the habit of including the kernel version. And ID the server since it takes two to TCP... 
>>> >>> happy benchmarking, >>> >>> rick jones >>> Wisdom teeth are impacted, people are affected by the effects of events >>> >> >> >> -- >> >> _________________________________________________________________ >> >> >> >> *Alexandre L. Grojsgold*> >> Diretor de Engenharia e Opera??es >> Rede Nacional de Ensino e Pesquisa >> R. Lauro Muller 116 sala 1103 >> 22.290-906 - Rio de Janeiro RJ - Brasil >> Tel: (21) 2102-9680 Cel: (21) 8136-2209 >> >> >> > > -- > Eli Dart NOC: (510) 486-7600 > ESnet Network Engineering Group (AS293) (800) 333-7638 > Lawrence Berkeley National Laboratory > PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3 From mallman at icir.org Tue Aug 16 08:55:47 2011 From: mallman at icir.org (Mark Allman) Date: Tue, 16 Aug 2011 11:55:47 -0400 Subject: [e2e] Taking RTT Samples. In-Reply-To: <4E3682C3.6010305@web.de> Message-ID: <20110816155547.E711C4363D4@lawyers.icir.org> An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110816/2b66ba72/attachment.ksh -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 194 bytes Desc: not available Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20110816/2b66ba72/attachment.bin From Barry.Constantine at jdsu.com Tue Aug 16 13:37:18 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Tue, 16 Aug 2011 13:37:18 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> Hi Anil, Attached is a Word document with the sequence charts from Wireshark. The 64KB and 128KB window sizes are shown for each OS. Thanks, Barry -----Original Message----- From: anil at cmmacs.ernet.in [mailto:anil at cmmacs.ernet.in] Sent: Saturday, August 13, 2011 4:24 PM To: Barry Constantine Cc: anil at cmmacs.ernet.in; end2end-interest at postel.org Subject: RE: [e2e] TCP Performance with Traffic Policing Hi Barry: Would be glad to see the plots/sequence data in case you would like to share them. If the cause is burst, then the next interesting question whould be: why the same TCP sender reacted quite differetnly to different (standard) clients when the policing was in between and normal otherwise. Anil > Hi Anil, > > Your assessments seem reasonable and I will look at the packet captures > with Wireshark as you suggest. > > Also, thanks for pointing me to the old post; it was useful as well. 
> > -Barry > > -----Original Message----- > From: anil at cmmacs.ernet.in [mailto:anil at cmmacs.ernet.in] > Sent: Saturday, August 13, 2011 3:15 PM > To: Barry Constantine > Cc: end2end-interest at postel.org > Subject: Re: [e2e] TCP Performance with Traffic Policing > > Hi Barry, > > Quite interesting > > I would guess that the different flows (Linux, XP and Win7) in your > experiment might have expressed varying bursty patterns, and that would > have made the policing process to treat these flows differently. A time vs > sequence plot on either side of the policing box should help to bring out > the real dynamics. > > Also, there was a similar post in e2e almost about a decade ago. > http://www.postel.org/pipermail/end2end-interest/2002-June/002154.html > > It is worth having a look > > Anil > >> Hi, >> >> I did some testing to compare various TCP stack behaviors in the midst >> of >> traffic policing. >> >> It is common practice for a network provider to police traffic to a >> subscriber level agreement (SLA). >> >> In the iperf testing I conducted, the following set-up was used: >> >> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >> >> The delay was induced using hardware base commercial gear. >> >> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. >> >> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window >> (knowing that policing would >> kick in at 64K) >> >> Throughput for Window (Mbps) >> >> Platform 32K 64K 128K >> -------------------------------------------- >> Linux 4.9 7.5 3.8 >> XP 5.8 6.6 5.2 >> Win7 5.3 3.4 0.44 >> >> >> Do anyone have experience with the intricacies of the various OSes in >> the >> midst of >> Traffic policing? I was surprised to see such a variation in >> performance, >> especially since Windows 7 is supposed to more advanced than XP, >> >> I am going to comb through the packet captures, but wondered if anyone >> had >> insight. 
>> >> Thank you, >> Barry >> >> >> > > -------------- next part -------------- A non-text attachment was scrubbed... Name: Linux_XP_Win7_Policing.doc Type: application/msword Size: 160768 bytes Desc: Linux_XP_Win7_Policing.doc Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20110816/ac333078/Linux_XP_Win7_Policing-0001.doc From alexsm at gmail.com Tue Aug 16 14:30:18 2011 From: alexsm at gmail.com (Alex Moura) Date: Tue, 16 Aug 2011 18:30:18 -0300 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> Message-ID: Barry, Have you seen this information about adjustments in Windows 7 TCP? http://www.speedguide.net/articles/windows-7-vista-2008-tweaks-2574 Alex -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110816/8b88a4e1/attachment.html From alexander.zimmermann at comsys.rwth-aachen.de Tue Aug 16 23:12:24 2011 From: alexander.zimmermann at comsys.rwth-aachen.de (Alexander Zimmermann) Date: Wed, 17 Aug 2011 08:12:24 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> Message-ID: <2C31CB76-08DC-4AAC-9B87-FB98B5DECB44@comsys.rwth-aachen.de> A non-text attachment was scrubbed... Name: not available Type: application/pgp-encrypted Size: 11 bytes Desc: PGP/MIME Versions Identification Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/e0d35a76/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.asc Type: application/octet-stream Size: 4170 bytes Desc: Message encrypted with OpenPGP using GPGMail Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/e0d35a76/PGP.obj From alexander.zimmermann at comsys.rwth-aachen.de Wed Aug 17 00:08:19 2011 From: alexander.zimmermann at comsys.rwth-aachen.de (Alexander Zimmermann) Date: Wed, 17 Aug 2011 09:08:19 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> Message-ID: <24BC988C-53DA-4DC0-9BFD-B6A8F377EB5B@comsys.rwth-aachen.de> Ok, once again... my pgp plugin seems to have a bug... Berry, could you put your dumps somewhere so that I can download them? I will run tcptrace/xplot by my own. I hard to see anything on your plots. Alex Am 16.08.2011 um 22:37 schrieb Barry Constantine: > Hi Anil, > > Attached is a Word document with the sequence charts from Wireshark. > > The 64KB and 128KB window sizes are shown for each OS. > > Thanks, > Barry > > -----Original Message----- > From: anil at cmmacs.ernet.in [mailto:anil at cmmacs.ernet.in] > Sent: Saturday, August 13, 2011 4:24 PM > To: Barry Constantine > Cc: anil at cmmacs.ernet.in; end2end-interest at postel.org > Subject: RE: [e2e] TCP Performance with Traffic Policing > > Hi Barry: > > Would be glad to see the plots/sequence data in case you would like to > share them. > > If the cause is burst, then the next interesting question whould be: why > the same TCP sender reacted quite differetnly to different (standard) > clients when the policing was in between and normal otherwise. 
> > Anil > >> Hi Anil, >> >> Your assessments seem reasonable and I will look at the packet captures >> with Wireshark as you suggest. >> >> Also, thanks for pointing me to the old post; it was useful as well. >> >> -Barry >> >> -----Original Message----- >> From: anil at cmmacs.ernet.in [mailto:anil at cmmacs.ernet.in] >> Sent: Saturday, August 13, 2011 3:15 PM >> To: Barry Constantine >> Cc: end2end-interest at postel.org >> Subject: Re: [e2e] TCP Performance with Traffic Policing >> >> Hi Barry, >> >> Quite interesting >> >> I would guess that the different flows (Linux, XP and Win7) in your >> experiment might have expressed varying bursty patterns, and that would >> have made the policing process to treat these flows differently. A time vs >> sequence plot on either side of the policing box should help to bring out >> the real dynamics. >> >> Also, there was a similar post in e2e almost about a decade ago. >> http://www.postel.org/pipermail/end2end-interest/2002-June/002154.html >> >> It is worth having a look >> >> Anil >> >>> Hi, >>> >>> I did some testing to compare various TCP stack behaviors in the midst >>> of >>> traffic policing. >>> >>> It is common practice for a network provider to police traffic to a >>> subscriber level agreement (SLA). >>> >>> In the iperf testing I conducted, the following set-up was used: >>> >>> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >>> >>> The delay was induced using hardware base commercial gear. >>> >>> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. 
>>> >>> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window >>> (knowing that policing would >>> kick in at 64K) >>> >>> Throughput for Window (Mbps) >>> >>> Platform 32K 64K 128K >>> -------------------------------------------- >>> Linux 4.9 7.5 3.8 >>> XP 5.8 6.6 5.2 >>> Win7 5.3 3.4 0.44 >>> >>> >>> Do anyone have experience with the intricacies of the various OSes in >>> the >>> midst of >>> Traffic policing? I was surprised to see such a variation in >>> performance, >>> especially since Windows 7 is supposed to more advanced than XP, >>> >>> I am going to comb through the packet captures, but wondered if anyone >>> had >>> insight. >>> >>> Thank you, >>> Barry >>> >>> >>> >> >> > > // // Dipl.-Inform. Alexander Zimmermann // Department of Computer Science, Informatik 4 // RWTH Aachen University // Ahornstr. 55, 52056 Aachen, Germany // phone: (49-241) 80-21422, fax: (49-241) 80-22222 // email: zimmermann at cs.rwth-aachen.de // web: http://www.umic-mesh.net // -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig Type: application/pgp-signature Size: 163 bytes Desc: OpenPGP digital signature Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/f7cfd91d/PGP.bin From Anil.Agarwal at viasat.com Wed Aug 17 04:50:24 2011 From: Anil.Agarwal at viasat.com (Agarwal, Anil) Date: Wed, 17 Aug 2011 07:50:24 -0400 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net><1449.115.241.106.33.1313262887.squirrel@202.41.64.20><94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net><1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> Message-ID: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> Barry, Can you also post the details of the policer settings used for these tests? Is ctcp enabled for Windows 7? Anil -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Barry Constantine Sent: Tuesday, August 16, 2011 4:37 PM To: anil at cmmacs.ernet.in Cc: end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing Hi Anil, Attached is a Word document with the sequence charts from Wireshark. The 64KB and 128KB window sizes are shown for each OS. Thanks, Barry -----Original Message----- From: anil at cmmacs.ernet.in [mailto:anil at cmmacs.ernet.in] Sent: Saturday, August 13, 2011 4:24 PM To: Barry Constantine Cc: anil at cmmacs.ernet.in; end2end-interest at postel.org Subject: RE: [e2e] TCP Performance with Traffic Policing Hi Barry: Would be glad to see the plots/sequence data in case you would like to share them. 
If the cause is burst, then the next interesting question whould be: why the same TCP sender reacted quite differetnly to different (standard) clients when the policing was in between and normal otherwise. Anil > Hi Anil, > > Your assessments seem reasonable and I will look at the packet captures > with Wireshark as you suggest. > > Also, thanks for pointing me to the old post; it was useful as well. > > -Barry > > -----Original Message----- > From: anil at cmmacs.ernet.in [mailto:anil at cmmacs.ernet.in] > Sent: Saturday, August 13, 2011 3:15 PM > To: Barry Constantine > Cc: end2end-interest at postel.org > Subject: Re: [e2e] TCP Performance with Traffic Policing > > Hi Barry, > > Quite interesting > > I would guess that the different flows (Linux, XP and Win7) in your > experiment might have expressed varying bursty patterns, and that would > have made the policing process to treat these flows differently. A time vs > sequence plot on either side of the policing box should help to bring out > the real dynamics. > > Also, there was a similar post in e2e almost about a decade ago. > http://www.postel.org/pipermail/end2end-interest/2002-June/002154.html > > It is worth having a look > > Anil > >> Hi, >> >> I did some testing to compare various TCP stack behaviors in the midst >> of >> traffic policing. >> >> It is common practice for a network provider to police traffic to a >> subscriber level agreement (SLA). >> >> In the iperf testing I conducted, the following set-up was used: >> >> Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server >> >> The delay was induced using hardware base commercial gear. >> >> 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. 
>> >> Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window >> (knowing that policing would >> kick in at 64K) >> >> Throughput for Window (Mbps) >> >> Platform 32K 64K 128K >> -------------------------------------------- >> Linux 4.9 7.5 3.8 >> XP 5.8 6.6 5.2 >> Win7 5.3 3.4 0.44 >> >> >> Do anyone have experience with the intricacies of the various OSes in >> the >> midst of >> Traffic policing? I was surprised to see such a variation in >> performance, >> especially since Windows 7 is supposed to more advanced than XP, >> >> I am going to comb through the packet captures, but wondered if anyone >> had >> insight. >> >> Thank you, >> Barry >> >> >> > > From alexander.zimmermann at comsys.rwth-aachen.de Wed Aug 17 05:37:37 2011 From: alexander.zimmermann at comsys.rwth-aachen.de (Alexander Zimmermann) Date: Wed, 17 Aug 2011 14:37:37 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> Message-ID: <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> A non-text attachment was scrubbed... Name: not available Type: application/pgp-encrypted Size: 11 bytes Desc: PGP/MIME Versions Identification Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/19c497f8/attachment-0001.bin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.asc Type: application/octet-stream Size: 4687 bytes Desc: Message encrypted with OpenPGP using GPGMail Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/19c497f8/PGP-0001.obj From Barry.Constantine at jdsu.com Wed Aug 17 05:56:17 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Wed, 17 Aug 2011 05:56:17 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> Alexander, I could not read your encrypted email. I saw the email from the postel-moderator and assume the attachment did not go through. This is the dropbox link to the zip file: http://dl.dropbox.com/u/10123514/Linux%20versus%20XP%20versus%20Windows7%20Iperf.zip Thanks, Barry From: Alexander Zimmermann [mailto:alexander.zimmermann at comsys.rwth-aachen.de] Sent: Wednesday, August 17, 2011 8:38 AM To: Agarwal, Anil Cc: Barry Constantine; anil at cmmacs.ernet.in; end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/e8a37640/attachment.html From alexander.zimmermann at comsys.rwth-aachen.de Wed Aug 17 06:30:32 2011 From: alexander.zimmermann at comsys.rwth-aachen.de (Alexander Zimmermann) Date: Wed, 17 Aug 2011 15:30:32 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> Message-ID: Hi Barry, from a quick look of your Linux 64/128 dumps I would say your setup is broken. The trace is highly bursty. Do you have a delay on the reverse path? I guess no... Alex Am 17.08.2011 um 14:56 schrieb Barry Constantine: > Alexander, > > I could not read your encrypted email. > > I saw the email from the postel-moderator and assume the attachment did not go through. > > This is the dropbox link to the zip file: > > http://dl.dropbox.com/u/10123514/Linux%20versus%20XP%20versus%20Windows7%20Iperf.zip > > Thanks, > Barry > > From: Alexander Zimmermann [mailto:alexander.zimmermann at comsys.rwth-aachen.de] > Sent: Wednesday, August 17, 2011 8:38 AM > To: Agarwal, Anil > Cc: Barry Constantine; anil at cmmacs.ernet.in; end2end-interest at postel.org > Subject: Re: [e2e] TCP Performance with Traffic Policing > // // Dipl.-Inform. Alexander Zimmermann // Department of Computer Science, Informatik 4 // RWTH Aachen University // Ahornstr. 
55, 52056 Aachen, Germany // phone: (49-241) 80-21422, fax: (49-241) 80-22222 // email: zimmermann at cs.rwth-aachen.de // web: http://www.umic-mesh.net // -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/851401e1/attachment.html From Barry.Constantine at jdsu.com Wed Aug 17 06:33:02 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Wed, 17 Aug 2011 06:33:02 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D12D@MILEXCH1.ds.jdsu.net> I'd have to look, I don't think so. One thing I forgot to mention was the policer. I used default Cisco settings, 10 Mbps CIR and bc = 312,500 bytes. When I increased bc to max as suggested by Anil (1,000,000 bytes), the 64 KB scenario worked perfectly but 128 KB still poorly. Thanks, Barry From: Alexander Zimmermann [mailto:alexander.zimmermann at comsys.rwth-aachen.de] Sent: Wednesday, August 17, 2011 9:31 AM To: Barry Constantine Cc: Agarwal, Anil; anil at cmmacs.ernet.in; end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing Hi Barry, from a quick look of your Linux 64/128 dumps I would say your setup is broken. The trace is highly bursty. Do you have a delay on the reverse path? I guess no... Alex Am 17.08.2011 um 14:56 schrieb Barry Constantine: Alexander, I could not read your encrypted email. 
I saw the email from the postel-moderator and assume the attachment did not go through. This is the dropbox link to the zip file: http://dl.dropbox.com/u/10123514/Linux%20versus%20XP%20versus%20Windows7%20Iperf.zip Thanks, Barry From: Alexander Zimmermann [mailto:alexander.zimmermann at comsys.rwth-aachen.de] Sent: Wednesday, August 17, 2011 8:38 AM To: Agarwal, Anil Cc: Barry Constantine; anil at cmmacs.ernet.in; end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing // // Dipl.-Inform. Alexander Zimmermann // Department of Computer Science, Informatik 4 // RWTH Aachen University // Ahornstr. 55, 52056 Aachen, Germany // phone: (49-241) 80-21422, fax: (49-241) 80-22222 // email: zimmermann at cs.rwth-aachen.de // web: http://www.umic-mesh.net // -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/c23518db/attachment-0001.html From detlef.bosau at web.de Wed Aug 17 08:18:51 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 17 Aug 2011 17:18:51 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> Message-ID: <4E4BDBDB.6000101@web.de> On 08/12/2011 04:03 PM, Barry Constantine wrote: > > Hi, > > I did some testing to compare various TCP stack behaviors in the midst > of traffic policing. > > It is common practice for a network provider to police traffic to a > subscriber level agreement (SLA). > > In the iperf testing I conducted, the following set-up was used: > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > The delay was induced using hardware base commercial gear. > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. 
> > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window > (knowing that policing would > > kick in at 64K) > What do you mean by a 32k, 64k, 128k window? You surely don't set rwnd to a fixed value, because that would undermine TCP flow control. Are you talking about rwnd? Or about window scaling? In the latter case, please keep in mind that scaling rwnd to larger units might make it difficult to announce small receiver windows when this is necessary, e.g. due to load or memory reasons. So, when you scale a window in units of 128 kByte, you should carefully consider what you're doing and whether this is what you really want to do. -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From Barry.Constantine at jdsu.com Wed Aug 17 10:09:32 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Wed, 17 Aug 2011 10:09:32 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4BDBDB.6000101@web.de> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4BDBDB.6000101@web.de> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> Hi Detlef, The client was running iperf with the "-w 64K" option which controls the client's send buffer. The server was configured to advertise 4 MB RWIN; I used the client side to control all in-flight data.
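As a rough illustration of what a "-w 64K" style option amounts to at the socket level, here is a minimal Python sketch. It assumes the tool simply requests a send buffer through the SO_SNDBUF socket option before connecting; that is an assumption about the mechanism, not a description of iperf's source. The helper name make_sender_socket is hypothetical. The kernel is free to adjust the requested value (Linux, for instance, doubles it to leave room for bookkeeping overhead), which is one reason the same setting can behave differently across operating systems.

```python
import socket


def make_sender_socket(requested_sndbuf: int):
    """Create a TCP socket and request a send-buffer size.

    Hypothetical helper for illustration only. Returns the socket and the
    buffer size the kernel reports after the request.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request the send buffer before connect(), so it is in place
    # when the connection is established.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_sndbuf)
    # Read back what the kernel actually granted; it may differ from the request.
    effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return sock, effective


if __name__ == "__main__":
    sock, effective = make_sender_socket(64 * 1024)
    print(f"requested 65536 bytes, kernel reports {effective} bytes")
    sock.close()
```

Comparing the reported value on Linux, Windows XP and Windows 7 would show whether the three clients really started from the same send-buffer size.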
Thanks, Barry -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Detlef Bosau Sent: Wednesday, August 17, 2011 11:19 AM To: end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing On 08/12/2011 04:03 PM, Barry Constantine wrote: > > Hi, > > I did some testing to compare various TCP stack behaviors in the midst > of traffic policing. > > It is common practice for a network provider to police traffic to a > subscriber level agreement (SLA). > > In the iperf testing I conducted, the following set-up was used: > > Client -> Delay (50ms RTT) -> Cisco (with 10M Policing) -> Server > > The delay was induced using hardware base commercial gear. > > 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. > > Ran Linux, Windows XP, and Windows 7 clients at 32k, 64k, 128k window > (knowing that policing would > > kick in at 64K) > What do you mean by 32k, 64k, 128k Window? You surely don't fix rwnd to a fixed value, because you would undermine TCP flow control then. Are you talking about rwnd? Or about window scaling? In the latter case, please keep in mind that scaling rwnd to larger units might make it difficult to announce small receiver windows, when this is necessary e.g. due to load or memory reasons. So, when you scale a window in units of 128 kByte, you should careful consider what you're doing and whether this is what you really want to do. 
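To put numbers on the window-scaling concern above, the following Python sketch computes the smallest window-scale shift needed to advertise a given window and the resulting granularity of the advertised value. It relies only on the 16-bit window field and on the shift limit of 14 from RFC 1323; the function names are made up for this illustration.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit TCP window field
MAX_SHIFT = 14               # largest shift count permitted by RFC 1323


def min_window_scale(target_bytes: int) -> int:
    """Smallest shift such that (65535 << shift) can express target_bytes."""
    for shift in range(MAX_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= target_bytes:
            return shift
    raise ValueError("window too large for RFC 1323 window scaling")


def granularity(shift: int) -> int:
    """Step size in bytes of the advertised window for a given shift."""
    return 1 << shift


if __name__ == "__main__":
    for label, size in [("32 KB", 32 * 1024), ("64 KB", 64 * 1024),
                        ("128 KB", 128 * 1024), ("4 MB", 4 * 1024 * 1024)]:
        shift = min_window_scale(size)
        print(f"{label}: shift {shift}, granularity {granularity(shift)} bytes")
```

Under these assumptions a 4 MB advertised window needs a shift of 7, so the receiver can still announce windows in steps of 128 bytes. Even at the maximum shift of 14 the step is 16 KiB.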
-- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From detlef.bosau at web.de Wed Aug 17 11:03:14 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 17 Aug 2011 20:03:14 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4BDBDB.6000101@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> Message-ID: <4E4C0262.5090507@web.de> On 08/17/2011 07:09 PM, Barry Constantine wrote: > Hi Detlef, > > The client was running iperf with the "-w 64K" option which controls the client's send buffer. > > The server was configured to advertise 4 MB RWIN; I used the client side to control all in-flight data. > Hm. So, the server advertised a constant rwin value? Detlef -- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From Barry.Constantine at jdsu.com Wed Aug 17 11:04:34 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Wed, 17 Aug 2011 11:04:34 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4C0262.5090507@web.de> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4BDBDB.6000101@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> <4E4C0262.5090507@web.de> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3A2@MILEXCH1.ds.jdsu.net> Yes, 4MB. 
-----Original Message----- From: detlef.bosau at web.de [mailto:detlef.bosau at web.de] Sent: Wednesday, August 17, 2011 2:03 PM To: end2end-interest at postel.org Cc: Barry Constantine Subject: Re: [e2e] TCP Performance with Traffic Policing On 08/17/2011 07:09 PM, Barry Constantine wrote: > Hi Detlef, > > The client was running iperf with the "-w 64K" option which controls the client's send buffer. > > The server was configured to advertise 4 MB RWIN; I used the client side to control all in-flight data. > Hm. So, the server advertised a constant rwin value? Detlef -- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From detlef.bosau at web.de Wed Aug 17 11:15:40 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 17 Aug 2011 20:15:40 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3A2@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4BDBDB.6000101@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> <4E4C0262.5090507@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3A2@MILEXCH1.ds.jdsu.net> Message-ID: <4E4C054C.2040206@web.de> On 08/17/2011 08:04 PM, Barry Constantine wrote: > Yes, 4MB. So, actually, you're doing window clamping? As a consequence, you may derate the path. 
-- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From Barry.Constantine at jdsu.com Wed Aug 17 11:17:16 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Wed, 17 Aug 2011 11:17:16 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4C054C.2040206@web.de> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4BDBDB.6000101@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> <4E4C0262.5090507@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3A2@MILEXCH1.ds.jdsu.net> <4E4C054C.2040206@web.de> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3BE@MILEXCH1.ds.jdsu.net> I just did not want the receiving side (server) to limit the performance. The RWIN does "dip" as the server receives data, but the advertised maximum was 4M. -----Original Message----- From: detlef.bosau at web.de [mailto:detlef.bosau at web.de] Sent: Wednesday, August 17, 2011 2:16 PM To: end2end-interest at postel.org Cc: Barry Constantine Subject: Re: [e2e] TCP Performance with Traffic Policing On 08/17/2011 08:04 PM, Barry Constantine wrote: > Yes, 4MB. So, actually, you're doing window clamping? As a consequence, you may derate the path. 
-- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From detlef.bosau at web.de Wed Aug 17 11:46:52 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 17 Aug 2011 20:46:52 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3BE@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4BDBDB.6000101@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> <4E4C0262.5090507@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3A2@MILEXCH1.ds.jdsu.net> <4E4C054C.2040206@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3BE@MILEXCH1.ds.jdsu.net> Message-ID: <4E4C0C9C.4030804@web.de> An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/10eb9393/attachment.html From Barry.Constantine at jdsu.com Wed Aug 17 11:57:43 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Wed, 17 Aug 2011 11:57:43 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4C0C9C.4030804@web.de> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4BDBDB.6000101@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> <4E4C0262.5090507@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3A2@MILEXCH1.ds.jdsu.net> <4E4C054C.2040206@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3BE@MILEXCH1.ds.jdsu.net> <4E4C0C9C.4030804@web.de> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D438@MILEXCH1.ds.jdsu.net> I understand, but the client iperf "window" setting (-w 64KB) is really the send buffer so only 64KB was attempted to be in-flight (not 4MB). Are we in sync? 
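The relationship between send window, RTT and BDP in this exchange can be checked with a few lines of arithmetic. The Python sketch below uses the figures from the original post (50 ms RTT, 10 Mbps policed rate) and assumes that "32k", "64k" and "128k" mean multiples of 1024 bytes. The function names are hypothetical, and the bound ignores losses, header overhead and policer bursts.

```python
def bdp_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes for a rate in bit/s and an RTT in ms."""
    return rate_bps * rtt_ms // 1000 // 8


def window_limited_bps(window_bytes: int, rtt_ms: int, rate_bps: int) -> int:
    """Upper bound on throughput: one window per RTT, capped by the link rate."""
    per_rtt = window_bytes * 8 * 1000 // rtt_ms
    return min(per_rtt, rate_bps)


if __name__ == "__main__":
    rate, rtt = 10_000_000, 50
    print("BDP:", bdp_bytes(rate, rtt), "bytes")
    for kib in (32, 64, 128):
        bound = window_limited_bps(kib * 1024, rtt, rate)
        print(f"{kib}K window: at most {bound / 1e6:.2f} Mbps")
```

With these inputs the BDP comes out as 62,500 bytes (the post rounds it to 62,000). A 32K window can carry at most about 5.24 Mbps, while 64K and 128K windows exceed the BDP, so the sender can offer more than the policed rate and the policer's burst allowance comes into play.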
Thanks, Barry From: detlef.bosau at web.de [mailto:detlef.bosau at web.de] Sent: Wednesday, August 17, 2011 2:47 PM To: end2end-interest at postel.org Cc: Barry Constantine Subject: Re: [e2e] TCP Performance with Traffic Policing On 08/17/2011 08:17 PM, Barry Constantine wrote: I just did not want the receiving side (server) to limit the performance. The RWIN does "dip" as the server receives data, but the advertised maximum was 4M. Hm. May I quote your original post? 50 msec RTT and bottleneck bandwidth = 10 Mbps, so BDP was 62,000 bytes. Could this be a clue? -- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20110817/ca37783f/attachment.html From alexander.zimmermann at comsys.rwth-aachen.de Wed Aug 17 23:30:05 2011 From: alexander.zimmermann at comsys.rwth-aachen.de (Alexander Zimmermann) Date: Thu, 18 Aug 2011 08:30:05 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> Message-ID: <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> Hi Barry, since you does't send the sysctl's, it is hard to correctly interpret the time 
sequence graph. However, some facts about the Linux 128 KB run:
- The bursty behavior is not caused by a missing delay on the reverse path (sorry, my fault); it is caused by a missing queuing delay.
- I never see Limited Transmit. Also, the sender almost never sends new data on incoming dupacks during recovery (Linux uses Rate Halving). Maybe we are limited by the send window here... You can run iperf without the window parameter to confirm this.
- NewReno can only send one retransmission per RTT. With your strange policer we have massive bursty packet loss. We can see loss recovery phases lasting 4 sec. Use SACK.
- We see 6 RTOs.
- Your receiver is not a Linux box ;-) Why? Linux uses Quick ACKs (= disabling Delayed ACKs) after an out-of-order phase, so that the sender can quickly enlarge its CWND.
- All in all we have nearly more retransmissions than transmissions.
Your problem is not TCP, it's your traffic policer...
Alex
On 17.08.2011 at 15:30, Alexander Zimmermann wrote:
> Hi Barry,
>
> from a quick look of your Linux 64/128 dumps I would say your setup is broken.
> The trace is highly bursty. Do you have a delay on the reverse path? I guess no...
>
> Alex
>
> On 17.08.2011 at 14:56, Barry Constantine wrote:
>
>> Alexander,
>>
>> I could not read your encrypted email.
>>
>> I saw the email from the postel-moderator and assume the attachment did not go through.
>>
>> This is the dropbox link to the zip file:
>>
>> http://dl.dropbox.com/u/10123514/Linux%20versus%20XP%20versus%20Windows7%20Iperf.zip
>>
>> Thanks,
>> Barry
>>
>> From: Alexander Zimmermann [mailto:alexander.zimmermann at comsys.rwth-aachen.de]
>> Sent: Wednesday, August 17, 2011 8:38 AM
>> To: Agarwal, Anil
>> Cc: Barry Constantine; anil at cmmacs.ernet.in; end2end-interest at postel.org
>> Subject: Re: [e2e] TCP Performance with Traffic Policing
>>
>
> //
> // Dipl.-Inform. Alexander Zimmermann
> // Department of Computer Science, Informatik 4
> // RWTH Aachen University
> // Ahornstr.
55, 52056 Aachen, Germany > // phone: (49-241) 80-21422, fax: (49-241) 80-22222 > // email: zimmermann at cs.rwth-aachen.de > // web: http://www.umic-mesh.net > // > // // Dipl.-Inform. Alexander Zimmermann // Department of Computer Science, Informatik 4 // RWTH Aachen University // Ahornstr. 55, 52056 Aachen, Germany // phone: (49-241) 80-21422, fax: (49-241) 80-22222 // email: zimmermann at cs.rwth-aachen.de // web: http://www.umic-mesh.net // From detlef.bosau at web.de Thu Aug 18 08:57:10 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 18 Aug 2011 17:57:10 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D438@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E4BDBDB.6000101@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D320@MILEXCH1.ds.jdsu.net> <4E4C0262.5090507@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3A2@MILEXCH1.ds.jdsu.net> <4E4C054C.2040206@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D3BE@MILEXCH1.ds.jdsu.net> <4E4C0C9C.4030804@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D438@MILEXCH1.ds.jdsu.net> Message-ID: <4E4D3656.5090201@web.de> On 08/17/2011 08:57 PM, Barry Constantine wrote: > > I understand, but the client iperf "window" setting (-w 64KB) is > really the send buffer so only 64KB was attempted to be in-flight (not > 4MB). > > Are we in sync? > Not quite. With regard to the BDP, we discussed this matter off list, and my mistake was to misread the magnitude of your BDP: I read 62 MByte instead of 62 kByte. So, window clamping etc. should be no problem in your case. However, iperf is an application and thus controls neither the sender's socket buffer* nor the amount of data being in flight.
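The BDP figure and the socket buffer question can be made concrete with a small sketch. It is only an illustration under stated assumptions: the helper names bdp_bytes and request_send_buffer are hypothetical, the numbers are the 10 Mbps and 50 msec quoted in this thread, and the value an OS reports back for SO_SNDBUF is platform dependent, so nothing here shows what iperf itself does.

```python
import socket


def bdp_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes (integer arithmetic)."""
    return rate_bps * rtt_ms // (8 * 1000)


def request_send_buffer(sock: socket.socket, size: int) -> int:
    """Ask the OS for a send buffer of `size` bytes and return what it reports.

    The reported value may differ from the request; the exact behaviour
    depends on the operating system.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    # Figures from the thread: 10 Mbps bottleneck, 50 msec RTT.
    print("BDP in bytes:", bdp_bytes(10_000_000, 50))
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("SO_SNDBUF reported:", request_send_buffer(s, 64 * 1024))
```

With these inputs the product is 62,500 bytes, which matches the rough 62 kByte figure and is slightly below a 64 KB buffer. Comparing the requested and the reported buffer size on each test machine is one way to see how far the OS honours the request.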
Particularly, judging from some "two liners" I wrote myself, I'm obviously too stupid to use TCP_NODELAY correctly :-( In other words: I cannot turn the Nagle Algorithm on or off. By turning off the Nagle Algorithm, you _do_ control the amount of data being in flight. O.k., when iperf does a close() after sending/writing its whole buffer, you have some control over the amount of data in flight as well. * I just had a look at the iperf man page. It states that -w would control the sender socket's buffer size. However, I do not yet know a generic, i.e. in particular portable, way to do so using the socket API. So besides OS-specific behaviour, you might encounter quite subtle issues by using one socket option or another, which will not behave exactly the same way on all OSes and OS releases. Detlef -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From detlef.bosau at web.de Thu Aug 18 09:44:51 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 18 Aug 2011 18:44:51 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de>
Message-ID: <4E4D4183.5050305@web.de> On 08/18/2011 08:30 AM, Alexander Zimmermann wrote: > > - Your receiver is not a Linux box ;-) Why? Linux enables Quick ACKs (= > disables Delayed ACKs) after an > out-of-order phase, so that the sender can quickly enlarge its CWND > Is this discussed somewhere in an RFC? Actually, I'm somewhat confused about how some people understand "engineering" in the field of computer networks. We all know the saying: "If builders built buildings the way programmers wrote programs, the first woodpecker that came along would destroy civilization." I have sometimes worked closely together with civil engineers. And for those, there are legal duties and the law, then there are the ten commandments - however, the highest authority of all are the standards. So, why do we have RFCs and why do we have standards, when each and every weird programmer writes his own mess? Computer Science is old enough and great enough and ugly enough to become a grown-up discipline. And therefore, we should expect an OS to implement TCP/IP as it is defined in the RFCs. This is particularly important in the world of internetworking, because we often use distributed algorithms there, and being well behaved and conforming to standards is by far the most important foundation in this field.
-- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From hagen at jauu.net Thu Aug 18 11:38:34 2011 From: hagen at jauu.net (Hagen Paul Pfeifer) Date: Thu, 18 Aug 2011 20:38:34 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4D4183.5050305@web.de> References: <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> Message-ID: <20110818183833.GA2878@nuttenaction> * Detlef Bosau | 2011-08-18 18:44:51 [+0200]: >On 08/18/2011 08:30 AM, Alexander Zimmermann wrote: >> >>- Your receiver is not a Linux box ;-) Why? Linux enables Quick ACKs >>(= disables Delayed ACKs) after an >>out-of-order phase, so that the sender can quickly enlarge its CWND >> > >Is this discussed somewhere in an RFC? No, Quick ACK is not documented in an RFC or I-D. >This is particularly important in the world of internetworking, >because we often use distributed algorithms there, and being well >behaved and conforming to standards is by far the most important >foundation in this field. Detlef, I don't understand your criticism of Quick ACK. Quick ACK as used in Linux is really conservative. It is disabled as soon as the communication is rated bi-directional.
Wrapped sequence numbers (PAWS) and all required characteristics covered by Delayed ACK ARE covered by Quick ACK as well. Trust me, I studied the behavior exactly (and even started to write an I-D ;-) Hagen From detlef.bosau at web.de Thu Aug 18 11:44:53 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 18 Aug 2011 20:44:53 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <20110818183833.GA2878@nuttenaction> References: <1449.115.241.106.33.1313262887.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A0DF84381@MILEXCH1.ds.jdsu.net> <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> Message-ID: <4E4D5DA5.3080000@web.de> On 08/18/2011 08:38 PM, Hagen Paul Pfeifer wrote: > > Detlef, I don't understand your criticism of Quick ACK. Quick ACK as used in I absolutely don't have any problem with Quick ACK. My problem is basically that there are at least as many TCPs as OSes.
-- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From hagen at jauu.net Thu Aug 18 12:06:24 2011 From: hagen at jauu.net (Hagen Paul Pfeifer) Date: Thu, 18 Aug 2011 21:06:24 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4D5DA5.3080000@web.de> References: <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> Message-ID: <20110818190624.GB2878@nuttenaction> * Detlef Bosau | 2011-08-18 20:44:53 [+0200]: >>Detlef, I don't understand your criticism of Quick ACK. Quick ACK as used in > >I absolutely don't have any problem with Quick ACK. > >My problem is basically that there are at least as many TCPs as OSes. I can't speak for other operating systems (with the exception of FreeBSD, where I follow the development loosely), but for Linux I can guarantee that all TCP-related development is scrutinized. A standard-compliant implementation is the ultimate ambition (so as not to harm any other TCP instance). Variety IS good as long as the default is the best possible default. There are several timer knobs, CC algorithms, queues, memory knobs and so on - you are absolutely right. They are provided to tune the stack to meet your requirements, to fit into divergent environments. If you know your environment and you want to tune something, fine, Linux provides a way. Other OSes do the same!
For several months now, FreeBSD has also provided a way to select the CC algorithm on the fly. We are embedded in a complicated (network) world; there is more than just one answer. Cheers, Hagen From detlef.bosau at web.de Thu Aug 18 12:57:39 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 18 Aug 2011 21:57:39 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <20110818190624.GB2878@nuttenaction> References: <1793.115.241.106.33.1313267032.squirrel@202.41.64.20> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09CE4A@MILEXCH1.ds.jdsu.net> <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> Message-ID: <4E4D6EB3.7000006@web.de> On 08/18/2011 09:06 PM, Hagen Paul Pfeifer wrote: > * Detlef Bosau | 2011-08-18 20:44:53 [+0200]: > >>> Detlef, I don't understand your criticism of Quick ACK. Quick ACK as used in >> I absolutely don't have any problem with Quick ACK. >> >> My problem is basically that there are at least as many TCPs as OSes. > I can't speak for other operating systems (with the exception of FreeBSD, where I > follow the development loosely), but for Linux I can guarantee that all TCP-related > development is scrutinized. A standard-compliant implementation is the ultimate > ambition (so as not to harm any other TCP instance). It is a well-proven attitude in science to do research one step after another. Obviously, in Internetworking we have a certain problem with this one. Why couldn't we discuss those things in a number of papers and/or RFC proposals instead of playing around with these things, with absolutely no consideration of the rest of the world?
If Quick ACKs are a good idea, there is no doubt that they will be accepted. However, I have a problem with working beside/without/against the community. My attitude is to pursue the discussion in the community (although, for some reasons, I cannot do this to the degree I would like to) instead of avoiding it. In some sense, Linux is the OS of the GNU generation: GNU's Not Unix. And many of us still use BSD Unix as _the_ reference. When I look at Linux, we have Westwood and Reno options and others as well. So, every user may mix up his own "salad of TCP flavours" without giving a single thought to whether this makes sense. This is an attitude of competitive playing - and I strongly expect our attitude to become more professional in the future. > Variety IS good as long as the default is the best possible default. There are Variety is good. However: variety is somewhat different from chaos. And there is absolutely no place for variety in production setups. And some users attempt to place Linux as a commercial product exactly there. In consequence, the discussion of Vegas or Westwood is a discussion between M$ hackers and know-it-alls who boast of "registry hacks". Excuse me, but this is not a professional attitude. This is laymanship. To avoid the word botch. Proper engineering is something completely different. > several timer knobs, CC algorithms, queues, memory knobs and so on - you are Fine. Refer to the Hengartner et al. paper on TCP Vegas to see how "several knobs" are discussed properly. > absolutely right. They are provided to tune the stack to meet your > requirements, to fit into divergent environments. If you know your > environment and you want to tune something, fine, Linux provides a way. When I compare Germany and what I hear, e.g., from Cuba, I'm convinced that it is a good idea that not each and everyone may tune his car as he wants to, just because he thinks this to be appropriate. > Other OSes do the same! Which is just as bad.
> For several months now, FreeBSD has also provided a way to > select the CC algorithm on the fly. Oh yeah. Do you remember the congavoid paper? One of VJ's goals was achieving and maintaining stability. How can this be ensured by a free choice of CC algorithms without any solid rationale? > We are embedded in a complicated (network) world; there is more than just one > answer. > > Even more, it is necessary to obey the principle of robustness, to be conservative and to be very considerate in what we're doing. Scientific progress is achieved by carefully doing one step after another. Not by chaotic stumbling. Detlef -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From hagen at jauu.net Thu Aug 18 13:36:48 2011 From: hagen at jauu.net (Hagen Paul Pfeifer) Date: Thu, 18 Aug 2011 22:36:48 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4D6EB3.7000006@web.de> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> Message-ID: <20110818203648.GC2878@nuttenaction> I started to comment on a selected set of paragraphs - because I feel that * Detlef Bosau | 2011-08-18 21:57:39 [+0200]: >It is a well-proven attitude in science to do research one step >after another. > >Obviously, in Internetworking we have a certain problem with this >one.
Why couldn't we discuss those things in a number of papers >and/or RFC proposals instead of playing around with these things, >with absolutely no consideration of the rest of the world? > >If Quick ACKs are a good idea, there is no doubt that they will be >accepted. However, I have a problem with working >beside/without/against the community. > >My attitude is to pursue the discussion in the community (although, >for some reasons, I cannot do this to the degree I would like to) instead >of avoiding it. > >In some sense, Linux is the OS of the GNU generation: GNU's Not Unix. > >And many of us still use BSD Unix as _the_ reference. > >When I look at Linux, we have Westwood and Reno options and others as >well. So, every user may mix up his own "salad of TCP flavours" >without giving a single thought to whether this makes sense. > >This is an attitude of competitive playing - and I strongly expect >our attitude to become more professional in the future. There are several aspects here: 1) We chose one of the more conservative congestion control algorithms as the default one. It was BIC for a long time, BUT as soon as several researchers pointed out that there are some fairness issues in networks with a low RTT, we switched to CUBIC. Linux WAS fast in adjusting the default CC algorithm! Faster than any standardization body. 2) Remember: it was Sally's own CC algorithm - Highspeed TCP - standardized in an RFC, which IS unfair. Being standardized alone does not prevent an algorithm from being unfair! 3) Linux implements a mechanism to PREVENT ordinary users from selecting an unfair CC algorithm [1]. This is far more than any other operating system does! If you present some numbers where you spotted some unfairness - fine, this can be discussed too! 4) And in the end: Linux does everything to be fair, knowing that it is used as a Server/Networking OS. But you cannot stop users with root access from harming the network. They can do everything besides selecting the CC algorithm (e.g.
starting a traffic generator at line rate ...). [2] Detlef, if you have solid and substantive concerns you can make a request! This is a little bit off-topic now; you can start a separate thread here in e2e or in tcpm (or email me off-list) Cheers, Hagen [1] http://www.amailbox.org/mailarchive/linux-netdev/2010/5/27/6278101 [2] http://blog.benstrong.com/2010/11/google-and-microsoft-cheat-on-slow.html From detlef.bosau at web.de Fri Aug 19 06:29:11 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2011 15:29:11 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009DB9@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E455091.9050601@es.net> <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009DB9@MILEXCH1.ds.jdsu.net> Message-ID: <4E4E6527.6010101@web.de> On 08/12/2011 09:37 PM, Barry Constantine wrote: > Hi, > > Let me provide better background information. > > End-customers (business companies) purchase a network service from a network provider and purchase a Service Level Agreement (SLA). This SLA specifies the committed information rate (CIR) which the provider will guarantee along with loss / latency specifications. That's in fact the traditional scenario of, e.g., a Frame Relay link as it has been done for years now. Once again my question: How is traffic policing achieved here? > The end customer connects up either with 100M or 1G interface (generally) and the provider generally polices down to the CIR. > Not quite. CIR means in particular: the _least_ information rate. That's the reason why I ask for the policing algorithm. I'm particularly afraid that the algorithm in use may cause a lot of packet order distortion.
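As one possible answer to the question of how policing down to a CIR can work, here is a minimal sketch of a single-rate, two-color token bucket. It is an assumption-laden illustration, not the algorithm of any particular switch: the class name TokenBucketPolicer, the 16000-byte burst size and the 1500-byte packets are all hypothetical choices, and real devices may differ in many details.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucketPolicer:
    """Single-rate, two-color policer: a packet either conforms or is dropped."""

    cir_bps: int                       # committed information rate, bits/s
    burst_bytes: int                   # bucket depth (committed burst), bytes
    tokens: float = field(init=False)  # bytes currently available
    last_time: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        # The bucket starts full.
        self.tokens = float(self.burst_bytes)

    def offer(self, now: float, size: int) -> bool:
        """Return True if a packet of `size` bytes arriving at `now` conforms."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill at CIR, capped at the bucket depth.
        refill = elapsed * self.cir_bps / 8.0
        self.tokens = min(float(self.burst_bytes), self.tokens + refill)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # non-conforming: dropped, never queued


if __name__ == "__main__":
    policer = TokenBucketPolicer(cir_bps=10_000_000, burst_bytes=16_000)
    # A back-to-back burst of 44 full-sized packets at t = 0.
    passed = sum(policer.offer(0.0, 1500) for _ in range(44))
    print("packets passed out of 44:", passed)
```

In this sketch non-conforming packets are discarded rather than queued, so the policer itself does not reorder anything; what a sender sees instead is a run of consecutive losses whenever a burst exceeds the bucket depth. Whether a given device behaves this way has to be checked against its documentation.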
-- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From Barry.Constantine at jdsu.com Fri Aug 19 07:09:41 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Fri, 19 Aug 2011 07:09:41 -0700 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <4E4E6527.6010101@web.de> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E455091.9050601@es.net> <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009DB9@MILEXCH1.ds.jdsu.net> <4E4E6527.6010101@web.de> Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D18D037@MILEXCH1.ds.jdsu.net> It's a standard Cisco policer from a carrier-grade Metro Ethernet switch (ME3400). An excerpt from the command reference guide: "The ME-3400E switch supports 1-rate, 2-color ingress policing and 2-rate, 3-color policing for individual or aggregate policing." Again, my whole intention is to educate network providers that policing can do bad, bad things to TCP performance. Today, network providers only run UDP traffic through the network to "commission" a service, and then they turn on business customer services and then the finger pointing starts. So I performed the OS test in a controlled lab to demonstrate just how badly TCP can be affected by a standard Cisco policer. Thanks, Barry -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Detlef Bosau Sent: Friday, August 19, 2011 9:29 AM To: end2end-interest at postel.org Subject: Re: [e2e] TCP Performance with Traffic Policing On 08/12/2011 09:37 PM, Barry Constantine wrote: > Hi, > > Let me provide better background information.
> > End-customers (business companies) purchase a network service from a network provider and purchase a Service Level Agreement (SLA). This SLA specifies the committed information rate (CIR) which the provider will guarantee along with loss / latency specifications. That's in fact the traditional scenario of, e.g., a Frame Relay link as it has been done for years now. Once again my question: How is traffic policing achieved here? > The end customer connects up either with 100M or 1G interface (generally) and the provider generally polices down to the CIR. > Not quite. CIR means in particular: the _least_ information rate. That's the reason why I ask for the policing algorithm. I'm particularly afraid that the algorithm in use may cause a lot of packet order distortion. -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From detlef.bosau at web.de Fri Aug 19 07:32:51 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2011 16:32:51 +0200 Subject: [e2e] TCP Performance with Traffic Policing In-Reply-To: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D18D037@MILEXCH1.ds.jdsu.net> References: <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009A65@MILEXCH1.ds.jdsu.net> <4E455091.9050601@es.net> <94DEE80C63F7D34F9DC9FE69E39436BE3A0D009DB9@MILEXCH1.ds.jdsu.net> <4E4E6527.6010101@web.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D18D037@MILEXCH1.ds.jdsu.net> Message-ID: <4E4E7413.8080703@web.de> On 08/19/2011 04:09 PM, Barry Constantine wrote: > It's standard Cisco policer from a carrier grade, Metro Ethernet switch (ME3400). O.k., that's the whitepaper answer. How does it work? I don't know whether you're pursuing an academic degree. If so, you should know how algorithms work and why they are used.
It does not matter what it is called by some moneymakers. DB -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From detlef.bosau at web.de Fri Aug 19 07:39:07 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2011 16:39:07 +0200 Subject: [e2e] Linux TCP In-Reply-To: <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> Message-ID: <4E4E758B.9000901@web.de> On 08/19/2011 05:09 AM, Lars Eggert wrote: >> 4) And in the end: Linux does everything to be fair, knowing that it is used as >> a Server/Networking OS. I use Linux as a workstation OS. Am I doing something wrong? >> But you cannot stop users with root access from harming the >> network. You can prevent users from getting root access. >> They can do everything besides selecting the CC algorithm (e.g. >> starting a traffic generator at line rate ...). [2] > No argument here. But as I said above, giving folks tuning knobs is very different from defaulting them to a non-standard CC algorithm. > Linux is frequently used by ordinary users. And because Linux boasts of "BIC" and "CUBIC", some users think themselves to be network experts.
My aim is that we eventually obey standards and don't abuse ordinary workstations as harmful experimental network equipment. Detlef > Lars > > PS: And note that I have nothing against CUBIC. What I dislike is individual implementations that unilaterally decide to move beyond community consensus, esp. in their shipping default. (When I was Transport AD I actually really tried to get CUBIC, C-TCP and H-TCP published as Experimental RFCs, like Highspeed was. But it seemed like the authors never actually cared enough to move the work from the IRTF to the IETF.) -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From hagen at jauu.net Fri Aug 19 07:49:17 2011 From: hagen at jauu.net (Hagen Paul Pfeifer) Date: Fri, 19 Aug 2011 16:49:17 +0200 Subject: [e2e] Linux TCP (was: Re: TCP Performance with Traffic Policing) In-Reply-To: <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> Message-ID: On Thu, 18 Aug 2011 23:09:32 -0400, Lars Eggert wrote: > Neither BIC nor CUBIC are standards. How quickly an implementation can > drop one non-standard CC default for another doesn't really matter much. > How easily it moves beyond what the broader community has agreed on as a > standard does.
>> 3) Linux implements a mechanism to PREVENT ordinary users from selecting >> an unfair CC algorithm [1]. This is far more than any other operating >> system does! If >> you present some numbers where you spotted some unfairness - fine, this >> can be discussed too! > > That's nice, but why has Linux chosen to enable a non-standard CC > algorithm as the default? I'm all for giving knowledgeable folks the knobs > they need, but a default is a default. I am sure you know the answers to most of the questions in your email. I cannot say definitively why Linux selected CUBIC (I cannot speak for David). Maybe 95% of all users are never affected and are fine with NewReno. I write about the decision about NewReno versus CUBIC later in this email. Your questions cannot be answered from a purely technical view. Vendors (and Linux) do not act purely driven by standards. Operating systems often differ because there is simply no standard, or because the standard does not fulfill the requirements of the customer; sometimes the standard is crap, and so on. You asked why Linux has chosen to enable a non-standard CC algorithm as the default. Because the non-standardized - but well analyzed - CUBIC IS a fair and compatible CC algorithm that respects NewReno and friends. AND it simultaneously makes some customers happy (those with larger LFNs). Everybody knows that CUBIC _could_ be the de facto "standard", at least with status Experimental. In my opinion there will always be a race between standardization bodies like the IETF and real-world networking stacks. With one important attribute: respect. Operating systems must respect constraints imposed by standardization bodies - like TCP fairness. If a standardization body shows that there are some real fairness issues in Linux, then Linux will listen and react to this - no doubt. Standardization bodies on the other hand should align their development with real-life demands.
The CUBIC I-D stalled several years ago - there is no standardized CC algorithm that fulfills the requirements of several customers. That's not good, and Eddy in his position as the new TCPM AD should go ahead with this. The story about CC algorithms is more complicated. CC algorithms are not really vendor driven; "vendor lobbying" is low. There is no real standardization pressure behind this. Unlike, say, IW10, where Google and larger HTTP providers have a real advantage. Vendors/people are required to lobby for IW10 because it has a huge impact on the whole ecosystem. Lars, sometimes you are on netdev and follow IETF-relevant topics. People like Ilpo, Alexander, Fernando and other guys are more involved. These people - among others - are the IETF-Keepers, watching out for potentially dangerous changes. I hope that this positive lobbying will be sufficient to ensure that conformity and interoperability are not violated. Hagen From detlef.bosau at web.de Fri Aug 19 09:09:47 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2011 18:09:47 +0200 Subject: [e2e] Linux TCP In-Reply-To: References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> Message-ID: <4E4E8ACB.4030805@web.de> On 08/19/2011 04:49 PM, Hagen Paul Pfeifer wrote: > > In my opinion there will always be a race between Standardization bodies > like IETF and real world networking stacks. With one important attribute: > respect. There must not be a "race" like this.
Perhaps we cannot really agree on this one; however, engineering science is not a race or a competition. It is proper science which either obeys proper rules or ends up as pure botch. We may well leave engineering to practitioners, but we remember well "Indiana's squared circle", which is inevitably the result. At least, I'm severely disappointed by persons who take knowledge and know-how from the academic world, while this kind of "team work" does not reach far enough for them to tell us which algorithms they're using and which algorithms we shall debug. An operating system which is widespread for everyday use must not be abused as an experimental box for tests which belong in a lab. And anything that is to be deployed in the Internet should make its way through the usual discussions and standardization bodies. This is a direct consequence of the well-proven rule "First do no harm!" -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From detlef.bosau at web.de Fri Aug 19 11:33:15 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2011 20:33:15 +0200 Subject: [e2e] Linux TCP In-Reply-To: References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> <4E4E8ACB.4030805@web.de> Message-ID: <4E4EAC6B.4050903@web.de> On 08/19/2011 06:50 PM, Keith Moore wrote: > On
Aug 19, 2011, at 12:09 PM, Detlef Bosau wrote: > >> An operating system which is widespread for everyday use must not be >> abused as an experimental box for tests which belong to a lab. And >> anything what is to be deployed in the Internet should take his way >> to the usual discussions and standardization bodies. > > Strongly disagree. It should not be presumed that standardization > bodies do no harm or have a monopoly on competence. Both are to be presumed by definition. Period. > Nor is it acceptable for standards bodies to be able to exert control > over everything that happens in the Internet. > I don't intend to discuss what is acceptable or not. I simply state the importance of standardization bodies, in terms of both factual and legal competence. > Everything deployed in the Internet is experimental to some degree, The Internet is used, among others, by professional commercial users who have agreed on certain SLAs with their ISPs. These users expect - and deserve - service as agreed. A computer scientist building a computer network has to act with precisely the same sense of responsibility and thoroughness as a civil engineer who builds the Akashi Kaikyo Bridge. If he cannot do so, he should choose a profession adequate to his attitude. (And when you talk to civil engineers in Germany, you are frequently reminded that Konrad Zuse was a civil engineer.)
-- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From jsage at finchhaven.com Fri Aug 19 12:12:02 2011 From: jsage at finchhaven.com (John Sage) Date: Fri, 19 Aug 2011 12:12:02 -0700 Subject: [e2e] Linux TCP In-Reply-To: <4E4E8ACB.4030805@web.de> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> <4E4E8ACB.4030805@web.de> Message-ID: <4E4EB582.6060407@finchhaven.com> On 11-08-19 09:09 AM, Detlef Bosau wrote: > On 08/19/2011 04:49 PM, Hagen Paul Pfeifer wrote: >> >> In my opinion there will always be a race between Standardization bodies >> like IETF and real world networking stacks. With one important attribute: >> respect. > > There must not be a "race" like this. > > Perhaps we cannot really agree on this one, however engineering science > is not a race or a competition. > It is proper science which either obeys proper rules or ends up as pure > botch. I've been reading this thread for some time now, for some reason. The Oxford Dictionary of the English Language defines a "pedant" as one being excessively concerned with minor details and rules or with displaying academic learning. 
I would imagine that you find it very, very difficult to pass through the real world, inhabited with real human beings who do not match up to your standards of "pure rules" and where the only apparent alternative is "botch". Good luck with that. You'll need it. - John -- John Sage From jsage at finchhaven.com Fri Aug 19 12:34:52 2011 From: jsage at finchhaven.com (John Sage) Date: Fri, 19 Aug 2011 12:34:52 -0700 Subject: [e2e] Linux TCP In-Reply-To: <4E4EAC6B.4050903@web.de> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> <4E4E8ACB.4030805@web.de> <4E4EAC6B.4050903@web.de> Message-ID: <4E4EBADC.1060406@finchhaven.com> On 11-08-19 11:33 AM, Detlef Bosau wrote: > On 08/19/2011 06:50 PM, Keith Moore wrote: >> On Aug 19, 2011, at 12:09 PM, Detlef Bosau wrote: > A computer scientist building a computer network has to act with > precisely the same sense of responsibility and thoroughness as a civil > engineer who builds the akashi kaikyo bridge. If he cannot do so, he > should choose a profession adequate to his attitude. (And when you talk > to civil engineers in Germany, you are frequently reminded that Konrad > Zuse was civil engineer.) 
As we used to say back in the good old days: *plonk* - John -- John Sage From detlef.bosau at web.de Fri Aug 19 12:43:12 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2011 21:43:12 +0200 Subject: [e2e] Linux TCP In-Reply-To: <4E4EB582.6060407@finchhaven.com> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> <4E4E8ACB.4030805@web.de> <4E4EB582.6060407@finchhaven.com> Message-ID: <4E4EBCD0.60501@web.de> On 08/19/2011 09:12 PM, John Sage wrote: > > The Oxford Dictionary of the English Language defines a "pedant" as > one being excessively concerned with minor details and rules or with > displaying academic learning. So, call me pedantic. And in some respect, you're perfectly correct. Whatever you mean with "displaying academic learning". This is envy, nothing else. > > I would imagine that you find it very, very difficult to pass through > the real world, inhabited with real human beings who do not match up > to your standards of "pure rules" and where the only apparent > alternative is "botch". > Is that the typical argument between practitioners and theoreticians? Excuse me, but on the one hand, we reach the point where I feel offended personally. On the other hand, we leave the topic of this list. I wanted to emphasize the importance of standardization and academic discussion. In my mind, this is a matter of fact and not a matter of discussion. However, I think we should stop at this point. The discussion as its pursued at this point is not adequate for this list. (BTW: Neither is Linux. 
Linux is an OS and there exists a TCP implementation for Linux. I want to discuss TCP here. As far as Linux is concerned: give the programmers a pointer to the RFC and then make it so.) -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From detlef.bosau at web.de Fri Aug 19 13:26:00 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2011 22:26:00 +0200 Subject: [e2e] Linux TCP In-Reply-To: <4E4EBADC.1060406@finchhaven.com> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> <4E4E8ACB.4030805@web.de> <4E4EAC6B.4050903@web.de> <4E4EBADC.1060406@finchhaven.com> Message-ID: <4E4EC6D8.8080905@web.de> On 08/19/2011 09:34 PM, John Sage wrote: > > > As we used to say back in the good old days: > > *plonk* > > > - John You're free to do that. However, you are _not_ free to bother me with meaningless mails, as happened during the last few hours with Keith Moore. And I think it is sufficient to say this only once. Thank you. (And I added Joe Touch in cc:; I think that, at the latest when I receive unwanted e-mail, this is appropriate.) Let us please calm down and return to the topic of the list.
-- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From touch at isi.edu Fri Aug 19 13:47:39 2011 From: touch at isi.edu (Joe Touch) Date: Fri, 19 Aug 2011 13:47:39 -0700 Subject: [e2e] Linux TCP In-Reply-To: <4E4EB582.6060407@finchhaven.com> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> <4E4E8ACB.4030805@web.de> <4E4EB582.6060407@finchhaven.com> Message-ID: <4E4ECBEB.9020701@isi.edu> Hi, all, This is a friendly reminder from your list admin (me) to keep things civil on this list. Please review the list posting policies on our website if you need further context on appropriate posts or the consequences of not staying within bounds. 
Joe (list admin) From detlef.bosau at web.de Fri Aug 19 14:04:02 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 19 Aug 2011 23:04:02 +0200 Subject: [e2e] Linux TCP In-Reply-To: <2A4A787A-C184-4783-8F9A-78557917A9F3@network-heretics.com> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> <4E4E8ACB.4030805@web.de> <4E4EB582.6060407@finchhaven.com> <2A4A787A-C184-4783-8F9A-78557917A9F3@network-heretics.com> Message-ID: <4E4ECFC2.1080708@web.de> On 08/19/2011 10:42 PM, Keith Moore wrote: > On Aug 19, 2011, at 3:12 PM, John Sage wrote: > >> I would imagine that you find it very, very difficult to pass through the real world, inhabited with real human beings who do not match up to your standards of "pure rules" and where the only apparent alternative is "botch". >> >> Good luck with that. >> >> You'll need it. > I would not be so harsh. I certainly wish that Internet standards were more rigorously defined and more carefully adhered to. > > The aerospace industry does reasonably well with very rigorous design and certification procedures, and so do several other engineering professions. Particularly the aerospace industry uses computers in very excessive manner. And we've seen well known disasters in aerospace industry as a direct consequence in not obeying standards. > It's not ridiculous to imagine that similar disciplines might be desirable for the Internet also. 
But the Internet is still very much in its infancy, engineering discipline hasn't caught up yet, and the whole industry is accustomed to relying on kludges to make things work. Then this is a reminder of what is still to be done here. > Changing this will be a very uphill battle, likely lasting decades. > Fine. However, identifying the hill and turning our steps upwards is a good point to start. > Keith > -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From touch at isi.edu Fri Aug 19 14:07:12 2011 From: touch at isi.edu (Joe Touch) Date: Fri, 19 Aug 2011 14:07:12 -0700 Subject: [e2e] Linux TCP In-Reply-To: References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> Message-ID: <4E4ED080.5090808@isi.edu> On 8/19/2011 7:49 AM, Hagen Paul Pfeifer wrote: ... > Your questions cannot answered from a technical view. Vendors (and Linux) > act not purely driven by standards. Operating systems often differs because > their is simple no standard, or because the standard do not fulfill the > requirements of the customer, sometimes the standard is crap and so on. (as an individual) There *is* a standard for the Internet, and it is not CUBIC ;-) > You asked why has Linux chosen to enable a non-standard CC algorithm as > the default.
Because the non-standardized - but well analyzed - CUBIC IS a > fair and a compatible CC algorithm respecting NewReno and friends. There are a few studies to the contrary, esp. when the bottleneck BW is high (but still a bottleneck). Also, CUBIC is known to perform worse than NewReno in low RTT environments, AFAICT. If you have sufficient evidence to the contrary - please take it to the IETF and suggest a change to the congestion control standards. > AND > simultaneously make some customers happy (those with larger LFN's). > Everybody knows that CUBIC _could_ be the defacto "standard", at least with > status experimental. There's a mechanism for that - it's called the IETF. However, "experimental" does not permit a protocol to be deployed as default in a wide-scale environment; the goal of "experimental" is to do an experiment before such a deployment. ... > Operating systems must respect constraint imposed by standardization > bodies - like TCP fairness. If a standardization body show that there are > some real fairness issue in Linux, then Linux will hear and react to this - > no doubt. Standardization bodies on the other hand should align their > development with real life demands. This is reversed. The onus of proof rests with CUBIC, to ensure that it does no harm *before* it is deployed. The IETF does adjust to the needs of the community, but it relies on the community to bring the issues to the IETF. Not to do end-runs. However, Linux is *famous* (IMO) for accepting code first and asking questions later. It's one reason I prefer FreeBSD for my research (that, and the fact that FreeBSD's stack has been a good decade more advanced than Linux - speaking as one who did things with FreeBSD in 2000 that Linux is only recently able to match). FWIW. 
Joe (again, speaking as an individual) From touch at isi.edu Fri Aug 19 14:43:55 2011 From: touch at isi.edu (Joe Touch) Date: Fri, 19 Aug 2011 14:43:55 -0700 Subject: [e2e] unsubscribe In-Reply-To: <4E4ECF88.5030202@finchhaven.com> References: <0B0A20D0B3ECD742AA2514C8DDA3B065055F4B05@VGAEXCH01.hq.corp.viasat.com> <516BE018-532A-429E-A2A9-3BF8DCBD0A36@comsys.rwth-aachen.de> <94DEE80C63F7D34F9DC9FE69E39436BE3A1D09D0F6@MILEXCH1.ds.jdsu.net> <5BE977C7-2BC4-4F32-A98D-1123AEA681B6@comsys.rwth-aachen.de> <4E4D4183.5050305@web.de> <20110818183833.GA2878@nuttenaction> <4E4D5DA5.3080000@web.de> <20110818190624.GB2878@nuttenaction> <4E4D6EB3.7000006@web.de> <20110818203648.GC2878@nuttenaction> <824E1C32-A75E-4E11-AC78-87DD16CA5D58@nokia.com> <4E4E8ACB.4030805@web.de> <4E4EB582.6060407@finchhaven.com> <4E4ECBEB.9020701@isi.edu> <4E4ECF88.5030202@finchhaven.com> Message-ID: <4E4ED91B.7040100@isi.edu> FYI - subscriptions are managed by the automated interface at our website. Joe (as list admin) From detlef.bosau at web.de Sat Aug 20 03:26:44 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Sat, 20 Aug 2011 12:26:44 +0200 Subject: [e2e] Difference wired, wireless, was: Re: Taking RTT Samples. In-Reply-To: <20110816155547.E711C4363D4@lawyers.icir.org> References: <20110816155547.E711C4363D4@lawyers.icir.org> Message-ID: <4E4F8BE4.1090301@web.de> On 08/16/2011 05:55 PM, Mark Allman wrote: >> I'm curious whether we shall take RTT Samples for each segment or >> not. AFAIK, the RFC encourage one RTT timer, which is started at >> least once a round. >> >> An alternative way would be one timer per segment. > Our work suggests that multiple samples per RTT does not help the > estimator. > > See: > > Mark Allman, Vern Paxson. On Estimating End-to-End Network Path > Properties. Proceedings of the ACM SIGCOMM Technical Symposium, > Cambridge, MA, September 1999. 
> http://www.icir.org/mallman/papers/estimation.ps > > allman > > So, in general we can assume: In wired internetworks, we have reasonable RTT/RTO estimators. Now, when it comes to WWAN, which is basically my central interest, will these mechanisms still hold? And do we have evidence for or against this? Detlef -- ------------------------------------------------------------------ Detlef Bosau Galileistra?e 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------ From griff at ifi.uio.no Mon Aug 22 06:47:51 2011 From: griff at ifi.uio.no (Carsten Griwodz) Date: Mon, 22 Aug 2011 15:47:51 +0200 Subject: [e2e] CFP: ACM MMSys 2012 References: Message-ID: ACM Multimedia Systems 2012 (MMSys 2012) Call For Papers February 22-24, 2012 Chapel Hill, North Carolina, USA http://www.mmsys.org Important Dates: * September 19, 2011: Submission Deadline * November 1, 2011: Notification Date We are pleased to announce a call for papers for the 3rd ACM Multimedia Systems (MMSys) Conference to be held Feb. 22-24, 2012 in Chapel Hill, North Carolina, USA. MMSys provides a forum for researchers, engineers, and scientist to present and share their latest research findings in multimedia systems. While research about specific aspects of multimedia systems is regularly published in the various proceedings and transactions of the networking, operating system, real-time system, and database communities, MMSys aims to cut across these domains in the context of multimedia data types. This provides a unique opportunity to view the intersections and interplay of the various approaches and solutions developed across these domains to deal with multimedia data types. MMSys is your avenue for communicating research that addresses multimedia systems holistically. 
Noteworthy details of MMSys 2012 include: - MMSys full papers are up to 12 pages long. - MMSys invites you to share your datasets with the research community and publish a description of the dataset's features itself. - Keynote speakers at MMSys 2012 will be Deepak S. Turaga (IBM) and Leonidas Kontothanassis (Google). - A journal special issue is planned for papers that focus on multimedia systems aspects that are particularly relevant for MPEG DASH. - The Mobile Video Workshop (MoVid) becomes a regular workshop of MMSys. You can find its own web pages at http://www.eecs.ucf.edu/movid/ *** Submissions *** * Full papers: no more than 12 pages * Short papers: no more than 6 pages * Dataset papers: no more than 6 pages * Demos: no more than 6 pages (separate October deadline) Full, short and dataset papers will be presented in single track sessions. *** Topics of interest *** Generic multimedia systems topics include: * Multimedia Systems * Multimedia Networking * Multimedia Operating Systems * Multimedia Databases * Large-Scale and Remote Display Architectures * Real-Time Support For Multimedia * Networked Games * Virtual and Augmented Worlds * Cyber-Physical Systems * Peer-to-Peer Architectures for Streaming and Multicast * Modeling of Multimedia Systems * Multimedia Interfaces * Multimedia Middleware and Toolkits * Multimedia Programming Languages * Cloud-based Multimedia Processing * Multi-Core Support for Multimedia * Mobile Multimedia Systems * 3D and Multiview Streaming Special DASH topics interest include: * Adaptive, progressive DASH delivery * Live DASH streaming * Use of content distribution infrastructure components * Viewer experiences from large-scale experiments and events * Content generation for DASH-based delivery * Measurement techniques for collecting consumption data * Effects of adaptation on Quality-of-Experience * Combinations of DASH with other streaming standards * Innovative DASH-based applications From Barry.Constantine at jdsu.com Wed Aug 
24 05:04:05 2011 From: Barry.Constantine at jdsu.com (Barry Constantine) Date: Wed, 24 Aug 2011 05:04:05 -0700 Subject: [e2e] Estimating Network Buffer Size with npad or ndt Message-ID: <94DEE80C63F7D34F9DC9FE69E39436BE3A1D283B22@MILEXCH1.ds.jdsu.net> Hi, I know one of these tools can estimate network buffer size on an end-end basis, I think it is ndt? If anyone on this list is familiar with the technique used, please reply. I was pondering whether the TCP test client continues to increase the CWND, measuring RTT along the way till it detects some level of retransmits. Then with the throughput achieved and RTT (before retransmission threshold), calculate network buffer size. Thank you, Barry Constantine From detlef.bosau at web.de Wed Aug 24 19:33:57 2011 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 25 Aug 2011 04:33:57 +0200 Subject: [e2e] TCP Performance over WWAN In-Reply-To: <4E47FB37.5000708@freedesktop.org> References: <4E40684D.6060808@web.de> <4E470F9C.2040903@web.de> <4E47FB37.5000708@freedesktop.org> Message-ID: <4E55B495.5080709@web.de> I would like to restart the discussion from this point on, because the discussion did not yet come to an end. (More precisely: It did not even start.) On 08/14/2011 06:43 PM, Jim Gettys wrote: > > Problems are very real and cause real grief; these stem from both > everyday bugs and mis-understandings on all sides of when packets should > be dropped, and when retransmitted at the link level (which has much > more information about how lossy the channel is this instant than the > end hosts can possibly have). > O.k., I'm afraid, you're correct here ;-) > The situation is therefore more complex than I had originally > understood; This matches my experience from the last 10 years.
> both from the nature of the beast (wireless) and those > caused by aggregation (802.11n will aggregate multiple packets into the > same wireless frame). I expect most people on this list had the same > kind of naive views I started with. > > Here's quick, buggy synopsis as I understand it: > > In 802.11, you can be by far better off (taking less time on the scarce > resource, the wireless channel) attempting to retransmit a packet at the > high rate you are attempting to transmit it at than try to drop the rate > to something lower and retransmit the packet in the face of failure; in > fact, the lower rate may in fact be worse than the higher rate. It may > be much better to try to transmit the same packet conceivably even 5 or > 10 times than to have to drop to a low rate (which may work even worse > due to multipath). > Fine. The problem is: First: When you retransmit the packet: How often should this be done? Second: When alternative rates are available: How do you choose one? Of course, you could ask Q to hold on stardate, make 10 PhD students investigate the problem and make a proposal, then you can take a decision and ask Q to let stardate continue. In theory. How is this achieved practically? > So attempting to retransmit a packet makes sense. > > The problem is, how many times? Exactly. Q1. > And when do you give up and try > something different? When do you give up? Q2. What do you try instead? Q3. > And eventually, you really must drop a packet (or > mark with ECN) to slow down the transmitter). This leaves layer 2. However it raises the question: Does the layer 2 behaviour affect higher Layers? Q4. If so: How? Q5. If Q5 is answered and we have adverse effects: Can these be alleviated, Q6 and if: how? Q7? > With aggregation, some packets may get through, and others be damaged > (802.11n gives you a bit mask telling you which are intact and which are > damaged). 
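A back-of-the-envelope sketch of the trade-off quoted above (retry at the high rate, or fall back to a lower one) may help to frame Q1 to Q3. Everything in it is an assumption made for illustration: transmission attempts are treated as independent with a fixed success probability per rate, airtime is just frame size divided by PHY rate (no preamble, ACK or backoff overhead), and the function name and the numbers are hypothetical, not taken from 802.11 or from any driver.

```python
def expected_airtime(frame_bits, rate_bps, p_success, max_attempts):
    """Return (expected airtime in seconds, residual loss probability).

    Assumes independent attempts with a fixed per-attempt success
    probability and at most max_attempts transmissions of the frame.
    """
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must be in (0, 1]")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    per_attempt = frame_bits / rate_bps      # airtime of one attempt
    p_fail = 1.0 - p_success
    residual = p_fail ** max_attempts        # frame still lost after all tries
    # Expected number of attempts actually made, truncated at max_attempts:
    # sum_{k=0}^{n-1} p_fail^k = (1 - p_fail^n) / p_success
    expected_attempts = (1.0 - residual) / p_success
    return per_attempt * expected_attempts, residual


if __name__ == "__main__":
    frame_bits = 1500 * 8
    # Hypothetical channel: 70% loss per attempt at 54 Mbit/s,
    # 10% loss per attempt at 6 Mbit/s, retry limit 10 in both cases.
    for name, rate, p in (("high", 54e6, 0.3), ("low", 6e6, 0.9)):
        airtime, residual = expected_airtime(frame_bits, rate, p, 10)
        print(f"{name} rate: {airtime * 1e6:.0f} us expected, "
              f"residual loss {residual:.2e}")
```

Under these assumptions the high rate occupies the channel for less time on average even at 70% per-attempt loss, but it leaves a much larger residual loss for the upper layers to deal with. Which of the two matters more is exactly the open question; the sketch only makes the two quantities visible.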
Aggregation may worsen the situation for TCP, because packets on different channels (if you aggregate independent channels) may need different transport times, and hence packet reordering may occur. Unfortunately, up to now, I've seen two positions on these questions. 1.: These problems are solved; in particular, they are not computer science problems and hence should be left to these idiots with the soldering gun. TCP works anyway, it even works (practically proven!) over avian carriers. 2.: Besides ignoring these problems and leaving them to lower layers, there is an attitude not to publish on these problems. From existing WWAN standards, e.g. GPRS, I know that users are offered so-called QoS parameters. (Sometimes called rosary or holy water, you could even ask St. Ignucius to borrow his halo....) Now, an SDU corruption ratio of 10^-3 may appear reasonable in this context. However, promising an SDU corruption ratio of 10^-9 sounds funny :-) Some weeks ago, a colleague asked whether I had an idea how these ratios were achieved. Admittedly, my excuses were quite vague... So, perhaps, one question is: Do we _have_ an idea how rates like these are achieved? Do we _have_ an idea to which degree packet loss should be treated locally? (Yes, we have the paper by Saltzer et al., saying that we should think about it and not do too much retransmission. So, the maximum of accepted retransmissions is 3. I think we can agree upon that. The question remains: What is the correct value for 3?) Sometimes, I think that our understanding of these problems is somewhat vague; in particular, I did not find papers with real hard facts and recommendations. Actually, there are scenarios which simply work. This is achieved by biofeedback adaptation: If it doesn't work, stop the bloody thing and try it again next Thursday. Or stop the bloody thing and try it again at a different location. Or stop the bloody thing, take a decision and delegate it to some other person.
Who will fix the problem because you have so decided. On the one hand, e.g. mobile phones work quite fine. On the other hand, an acquaintance of mine gave me about 15 calls yesterday in the evening, and I don't remember exactly, whether we talked 10 words or 12. There was a certain problem with the network coverage somewhere in the city of Berlin. (That's the difference between Berlin and Stuttgart: In Berlin, you may have a problem to access a network because of coverage problems, in Stuttgart you may have problems catching a train because we cannot decide where to place the train station. There is a transitive extension. If you travel to Stuttgart by plane, you will perhaps encounter a problem how you will travel to Stuttgart from the airport. At least, an attempt to travel by train may lead to well known problems.....) (Meanwhile, the problem worsened significantly, because we cannot decide whether we should destroy the old train station or whether we should not build a new one afterwards.) O.k., that is not a topic for this list, besides, perhaps, the note that conferences (on whatever, networks included) are quite unlikely to take place in Stuttgart for the next future. (For those who have no idea what I'm talking about: Refer to the Washington Post. http://www.washingtonpost.com/wp-dyn/content/article/2010/09/30/AR2010093003592.html And the story still continues...) Therefore, achieving progress in wireless networking would greatly alleviate the situation because we could at least communicate then - if there is no possibility to travel. O.k., back to our issue. I'm still looking for a point to start to get at least a reasonable model for the situation and to achieve a sound assessment whether particularly TCP will encounter problems in WWAN or not and if: how can these be solved? However, before we solve them we must _know_ them. There are many statements of belief in this area. However: I really miss the hard facts. 
(Although we do have sound solutions for quite a few important special problems, or under some restrictive assumptions on the system.) I really would appreciate any discussion in this field. -- ------------------------------------------------------------------ Detlef Bosau Galileistraße 30 70565 Stuttgart Tel.: +49 711 5208031 mobile: +49 172 6819937 skype: detlef.bosau ICQ: 566129673 detlef.bosau at web.de http://www.detlef-bosau.de ------------------------------------------------------------------
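On the question raised in the message above, how an SDU corruption ratio of 10^-9 could follow from a per-attempt ratio of 10^-3, one possible reading can be checked with a few lines of arithmetic. This is only a sketch under a strong assumption: every transmission attempt fails independently with the same probability, which bursty wireless channels generally do not satisfy. Whether GPRS or any other WWAN actually arrives at its figures this way is not claimed here; the function name is hypothetical.

```python
from fractions import Fraction


def attempts_needed(per_attempt_loss, target_loss, max_attempts=64):
    """Smallest number of independent transmission attempts after which
    the residual loss per_attempt_loss**n is at or below target_loss.

    Returns None if max_attempts is not enough. Fractions are used so
    that the comparison is exact and not subject to float rounding.
    """
    per_attempt_loss = Fraction(per_attempt_loss)
    target_loss = Fraction(target_loss)
    if not 0 < per_attempt_loss < 1:
        raise ValueError("per_attempt_loss must be in (0, 1)")
    residual = Fraction(1)
    for attempts in range(1, max_attempts + 1):
        residual *= per_attempt_loss   # one more independent failure
        if residual <= target_loss:
            return attempts
    return None


if __name__ == "__main__":
    n = attempts_needed(Fraction(1, 1000), Fraction(1, 10**9))
    print("attempts needed at 10^-3 per attempt:", n)
    n = attempts_needed(Fraction(1, 10), Fraction(1, 10**9))
    print("attempts needed at 10^-1 per attempt:", n)
```

Under the independence assumption, one transmission plus two retransmissions at 10^-3 each would be enough, while a channel that loses one SDU in ten would need nine attempts. Once losses are correlated, neither number holds, which is one way of restating the question of what "the correct value for 3" is.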