From mellia at tlc.polito.it Sun Aug 9 06:18:07 2009 From: mellia at tlc.polito.it (Marco Mellia) Date: Sun, 09 Aug 2009 15:18:07 +0200 Subject: [e2e] CFP: PAM 2010 - Web site up Message-ID: <4A7ECC8F.7080008@tlc.polito.it> Dear all, while all debates moved to the beaches, here is the 11th PAM cfp. Hope you find it useful. Ciao Marco -------------------------------------------------------------------- Call for Papers *Passive and Active Measurement Conference (PAM) 2010* Zurich, SWITZERLAND APRIL 7-9, 2010 http://www.pam2010.ethz.ch/ -------------------------------------------------------------------- The 11th Passive and Active Measurement conference will be held April 7-9th, 2010 in Zürich, Switzerland. PAM focuses on research and practical applications of network measurement and analysis techniques. The conference's goal is to provide a forum for current work in its early stages. Original papers are invited from the research and operations communities on topics including, but not limited to: * Active Network Measurements * Passive Network Measurements * Performance Metrics * Traffic Statistics * Measurement Visualization * New Measurement Approaches & Techniques * Deployment of Measurement Infrastructure * New Measurement Initiatives * Applications of Network Measurements * Network Measurements and Security * Network Troubleshooting using Measurements The PAM steering committee believes that releasing measurement data allows for better science to be conducted in the field of network measurement. Therefore, an award will be given at PAM 2010 for the best paper based on a new dataset that the authors are releasing for community use in subsequent research. To qualify, a paper must significantly utilize a dataset that has been collected for the work presented in the paper. Further, the dataset must be freely available to any researcher; wireless data sets may, for instance, be published through CRAWDAD. 
Authors should flag papers they wish to be considered for this award with a footnote on the first page of the paper. Novel datasets are especially encouraged. The awarded paper will be chosen from the set of qualifying papers accepted for the conference by a committee made up of a subset of the program and steering committees. -------------------------------------------------------------------- Important Dates -------------------------------------------------------------------- * *Abstract Registration: October 2, 2009 23:59 GMT (UK Time)* * *Paper Submission: October 9, 2009 23:59 GMT (UK Time)* * *Notification of Decision: December 13, 2009* * *Camera Ready: January 15, 2010* * *Conference: April 7-9, 2010* -------------------------------------------------------------------- Paper Submission Instructions -------------------------------------------------------------------- We are soliciting papers, submitted in Springer LNCS format, that do not exceed *10 pages*. No additional space will be given once a paper has been accepted. The paper submission site will be online soon. Authors are, as always, asked to refrain from submitting papers under review at PAM to other venues during the reviewing period. 
-------------------------------------------------------------------- Conference Organizers -------------------------------------------------------------------- General Chair: Bernhard Plattner ETH Zurich Program Chair: Arvind Krishnamurthy University of Washington Local Arrangements Chair: Xenofontas Dimitropoulos ETH Zurich Publicity Chair: Marco Mellia Politecnico di Torino -------------------------------------------------------------------- Program Committee -------------------------------------------------------------------- Matthew Caesar UIUC Mark Crovella Boston University Constantine Dovrolis Georgia Tech Jaeyeon Jung Intel Research Seattle Sachin Katti Stanford University Thomas Karagiannis Microsoft Research, Cambridge Simon Leinen Switch Craig Labovitz Arbor Networks Sridhar Machiraju Google Bruce Maggs CMU Morley Mao University of Michigan, Ann Arbor Alan Mislove Northeastern University Sue Moon KAIST Neil Spring University of Maryland Joel Sommers Colgate University Renata Teixeira LIP6 Arun Venkataramani University of Massachusetts Jia Wang AT&T Research Yinglian Xie Microsoft Research, Silicon Valley Ming Zhang Microsoft Research, Redmond Yin Zhang University of Texas-Austin -------------------------------------------------------------------- Steering Committee -------------------------------------------------------------------- Nevil Brownlee University of Auckland Bernhard Plattner ETH Zurich Mark Claypool Worcester Polytechnic Institute Ian Graham Endace Arvind Krishnamurthy University of Washington Sue Moon KAIST Renata Teixeira LIP6 Michael Rabinovich Case Western Reserve University -- Ciao, /\/\/\rco +-----------------------------------+ | Marco Mellia - Assistant Professor| | Skypeid: mgmellia | | Tel: +39-011-564-4173 | | Cel: +39-331-6714789 | /"\ .. . . . . . . . . . . . . | Politecnico di Torino | \ / . ASCII Ribbon Campaign . | Corso Duca degli Abruzzi 24 | X .- NO HTML/RTF in e-mail . 
| Torino - 10129 - Italy | / \ .- NO Word docs in e-mail. | http://www.telematica.polito.it | .. . . . . . . . . . . . . +-----------------------------------+ The box said "Requires Windows 95 or Better." So I installed Linux. From jain.manish at gmail.com Tue Aug 11 14:57:03 2009 From: jain.manish at gmail.com (Manish Jain) Date: Tue, 11 Aug 2009 17:57:03 -0400 Subject: [e2e] Packet reordering in Internet Message-ID: Hello, I was wondering if there are measurement studies of Internet traffic quantifying the magnitude of packet reordering within a TCP flow. Is reordering a common problem for TCP in the current Internet? How about the load balancing features in the routers from major vendors: is it done on a per-flow or a per-packet basis, and if flow-based load balancing is done, then how is the flow classification done in these routers? What could be/are other sources of reordering within a TCP flow? Thanks, Manish From alexander.afanasyev at ucla.edu Tue Aug 11 15:57:32 2009 From: alexander.afanasyev at ucla.edu (Alexander Afanasyev) Date: Tue, 11 Aug 2009 15:57:32 -0700 Subject: [e2e] Packet reordering in Internet In-Reply-To: References: Message-ID: <19b50e930908111557w1b50ce2ev44739b145f35f2b6@mail.gmail.com> Hi Manish, You can check some recent papers. For example, Arthur et al., "Keeping Order: Determining the Effect of TCP Packet Reordering" (2007) and Arkko et al., "Dagstuhl perspectives workshop on end-to-end protocols for the future internet". They say that load balancing and parallelism (physical, channel, or network layer) will be (if it is not already) the dominant cause of packet reordering in the Internet. Reordering can also be caused by AQM (active queue management) or by various DiffServ implementations. In wireless networks, channel-level retransmission may have a significant impact on reordering. Theoretically, rerouting (e.g., route flapping) can also cause some reordering. 
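To illustrate why per-flow load balancing avoids reordering while per-packet spraying does not, here is a minimal, hypothetical sketch (real routers hash in hardware, usually with CRC-style functions; `pick_link` is an invented name, not any vendor's API):

```python
import hashlib

def pick_link(src_ip, dst_ip, src_port, dst_port, proto, n_links):
    """Hash the 5-tuple so every packet of one flow uses the same link."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    # Any stable hash works for the sketch; hardware uses cheaper ones.
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % n_links

# All packets of one TCP flow map to the same output link, so parallel
# paths cannot reorder the flow; per-packet round-robin would spread
# consecutive packets across links with differing latencies.
flow = ("10.0.0.1", "192.0.2.7", 40321, 80, 6)
links = {pick_link(*flow, n_links=4) for _ in range(100)}
assert len(links) == 1
```

The trade-off, raised later in this thread, is that hashing keeps flows intact but balances load only statistically, and anything that is not flow-hashed (round-robin bonding, parallel internals of a router) reintroduces reordering.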
However, there has been a lot of work to prevent flapping and to enforce consistent routing for individual flows. In fact, today's TCP has some basic mechanisms to mitigate reordering effects. Have you checked the Eifel algorithm (RFC 4015 - the Eifel response algorithm for TCP)? Sincerely, Alexander On Tue, Aug 11, 2009 at 2:57 PM, Manish Jain wrote: > Hello, > > I was wondering if there are measurement studies of Internet traffic > quantifying the magnitude of packet reordering within a TCP flow. Is > reordering a common problem for TCP in the current Internet? How about > the load balancing features in the routers from major vendors: is it > done on a per-flow or a per-packet basis, and if flow-based load balancing > is done, then how is the flow classification done in these routers? > What could be/are other sources of reordering within a TCP flow? > > Thanks, > Manish > From craig at aland.bbn.com Tue Aug 11 16:09:31 2009 From: craig at aland.bbn.com (Craig Partridge) Date: Tue, 11 Aug 2009 19:09:31 -0400 Subject: [e2e] Packet reordering in Internet Message-ID: <20090811230931.0982B28E137@aland.bbn.com> I don't think anyone's replicated the experiment that Bennett, Shectman and I did back in 1999 ("Packet Reordering is Not Pathological Network Behavior" in IEEE/ACM TON, Dec 1999). I could be wrong. Thanks! Craig > Hello, > > I was wondering if there are measurement studies of Internet traffic > quantifying the magnitude of packet reordering within a TCP flow. Is > reordering a common problem for TCP in the current Internet? How about > the load balancing features in the routers from major vendors: is it > done on a per-flow or a per-packet basis, and if flow-based load balancing > is done, then how is the flow classification done in these routers? > What could be/are other sources of reordering within a TCP flow? 
> > Thanks, > Manish ******************** Craig Partridge Chief Scientist, BBN Technologies E-mail: craig at aland.bbn.com or craig at bbn.com Phone: +1 517 324 3425 From L.Wood at surrey.ac.uk Tue Aug 11 19:55:35 2009 From: L.Wood at surrey.ac.uk (Lloyd Wood) Date: Wed, 12 Aug 2009 03:55:35 +0100 Subject: [e2e] Packet reordering in Internet In-Reply-To: <20090811230931.0982B28E137@aland.bbn.com> References: <20090811230931.0982B28E137@aland.bbn.com> Message-ID: http://personal.ee.surrey.ac.uk/Personal/L.Wood/publications/index.html#effects-on-tcp Lloyd Wood, George Pavlou and Barry Evans, 'Effects on TCP of routing strategies in satellite constellations', IEEE Communications Magazine, vol. 39, no. 3, pp. 172-181, March 2001. digs into how TCP's dupack, fast retransmit and recovery, and delack behaviour work when exposed to a rather artificial strawman round-robin per-packet multipath mesh environment, and shows that SACK is a very good idea. This used ns simulation with a distance-vector router and multipath enabled. (It's described in more detail in Ch. 4 of my PhD thesis.) Load balancing in real routers generally defaults to per-destination IP address, which is sufficient to prevent reordering of flows. See e.g. http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a0080094820.shtml L. On 12 Aug 2009, at 00:09, Craig Partridge wrote: > > I don't think anyone's replicated the experiment that Bennett, > Shectman > and I did back in 1999 ("Packet Reordering is Not Pathological Network > Behavior" in IEEE/ACM TON, Dec 1999). I could be wrong. > > Thanks! > > Craig > >> Hello, >> >> I was wondering if there are measurement studies of Internet traffic >> quantifying the magnitude of packet reordering within a TCP flow. Is >> reordering a common problem for TCP in the current Internet? 
How >> about the load balancing features in the routers from major vendors: is it >> done on a per-flow or a per-packet basis, and if flow-based load balancing >> is done, then how is the flow classification done in these routers? >> What could be/are other sources of reordering within a TCP flow? >> >> Thanks, >> Manish > ******************** > Craig Partridge > Chief Scientist, BBN Technologies > E-mail: craig at aland.bbn.com or craig at bbn.com > Phone: +1 517 324 3425 DTN work: http://info.ee.surrey.ac.uk/Personal/L.Wood/saratoga/ From mellia at tlc.polito.it Wed Aug 12 06:36:16 2009 From: mellia at tlc.polito.it (Marco Mellia) Date: Wed, 12 Aug 2009 15:36:16 +0200 Subject: [e2e] Packet reordering in Internet In-Reply-To: References: Message-ID: <4A82C550.6000200@tlc.polito.it> Hi Manish, We did some measurements of TCP anomalies, among which we also identify network reordering. It turned out that reordering is marginal, but sometimes it can be very high, e.g., when you hit a strange path. Details can be found in M. Mellia, M. Meo, L. Muscariello, D. Rossi, Passive analysis of TCP anomalies, Computer Networks, Vol. 52, No. 14, pp. 2663-2676, http://www.tlc-networks.polito.it/mellia/papers/comnet08_TCP.pdf Also, more recent measurements we are doing on different networks show that reordering is today limited. For example, this picture shows the percentage of anomalies identified when monitoring our campus network at Politecnico di Torino. Yellow is network reordering... http://tstat.tlc.polito.it/cgi-bin/tstat_rrd.cgi?template=normidx&var=tcp_anomalies&dir=rrd_data/Polito/LIVE&bigpic=true&ymax=10&direction=both&ymin=-10 Other measurements (not available online due to NDA) show similar behavior... For example, http://tstat.tlc.polito.it/rrd_napa_images/rrd_napa_data.NW-6.tcp_anomalies.normidx.1m.png shows results from a BRAS aggregating more than 30,000 ADSL customers of an ISP ;) You can see that reordering on those paths is less than 1% on average. 
But individual connections can suffer much more severe reordering (e.g., when going through a load-balancing router). Ciao Marco > Hello, > > I was wondering if there are measurement studies of Internet traffic > quantifying the magnitude of packet reordering within a TCP flow. Is > reordering a common problem for TCP in the current Internet? How about > the load balancing features in the routers from major vendors: is it > done on a per-flow or a per-packet basis, and if flow-based load balancing > is done, then how is the flow classification done in these routers? > What could be/are other sources of reordering within a TCP flow? > > Thanks, > Manish > > -- Ciao, /\/\/\rco +-----------------------------------+ | Marco Mellia - Assistant Professor| | Skypeid: mgmellia | | Tel: +39-011-564-4173 | | Cel: +39-331-6714789 | /"\ .. . . . . . . . . . . . . | Politecnico di Torino | \ / . ASCII Ribbon Campaign . | Corso Duca degli Abruzzi 24 | X .- NO HTML/RTF in e-mail . | Torino - 10129 - Italy | / \ .- NO Word docs in e-mail. | http://www.telematica.polito.it | .. . . . . . . . . . . . . +-----------------------------------+ The box said "Requires Windows 95 or Better." So I installed Linux. From bart at man.poznan.pl Wed Aug 12 06:50:43 2009 From: bart at man.poznan.pl (Bartek Belter) Date: Wed, 12 Aug 2009 15:50:43 +0200 Subject: [e2e] Packet reordering in Internet In-Reply-To: References: Message-ID: <008101ca1b53$e66a92a0$b33fb7e0$@poznan.pl> Hi Manish, Some time ago we did some experiments in the pan-European education network. The results of the experiments were summarized in a paper. It is available here: http://tnc2005.terena.org/core/getfile.php?file_id=626 (Shall we worry about Packet Reordering?). Hope it helps. 
Best regards, Bartek -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Manish Jain Sent: Tuesday, August 11, 2009 11:57 PM To: end2end-interest at postel.org Subject: [e2e] Packet reordering in Internet Hello, I was wondering if there are measurement studies of Internet traffic quantifying the magnitude of packet reordering within a TCP flow. Is reordering a common problem for TCP in the current Internet? How about the load balancing features in the routers from major vendors : is it per flow basis or per packet basis, and if flow based load balancing is done, then how is the flow classification is done these routers? What could be/are other sources of reordering withing a TCP flow? Thanks, Manish From jasleen at cs.unc.edu Wed Aug 12 07:37:33 2009 From: jasleen at cs.unc.edu (Jasleen Kaur) Date: Wed, 12 Aug 2009 10:37:33 -0400 Subject: [e2e] Packet reordering in Internet In-Reply-To: References: <20090811230931.0982B28E137@aland.bbn.com> Message-ID: <4A82D3AD.2040903@cs.unc.edu> Manish and others, We've done recent work analyzing out-of-sequence segments (due to reordering as well as losses) in TCP transfers. The two papers below describe our methodology and the results from running it on traces of about 3 million TCP transfers. We'd welcome feedback. S. Rewaskar, J. Kaur, and F.D. Smith, "A Performance Study of Loss Detection/Recovery in Real-world TCP Implementations", in Proceedings of the IEEE International Conference on Network Protocols (ICNP'07), Beijing, China, Oct 2007. http://www.cs.unc.edu/~jasleen/papers/icnp07.pdf S. Rewaskar, J. Kaur, and F.D. Smith, "A Passive State-Machine Approach for Accurate Analysis of TCP Out-of-Sequence Segments", in ACM Computer Communications Review (CCR), July 2006. 
http://www.cs.unc.edu/~jasleen/papers/ccr06.pdf Thanks, Jasleen Lloyd Wood wrote: > http://personal.ee.surrey.ac.uk/Personal/L.Wood/publications/index.html#effects-on-tcp > > Lloyd Wood, George Pavlou and Barry Evans, 'Effects on TCP of routing > strategies in satellite constellations', IEEE Communications Magazine, > vol. 39, no. 3, pp. 172-181, March 2001. > > digs into how TCP's dupack, fast retransmit and recovery, and delack > behaviour work when exposed to a rather artificial strawman > round-robin per-packet multipath mesh environment, and shows that SACK > is a very good idea. This used ns simulation with distance vector > router and multipath enabled. (it's described in more detail in Ch. 4 > of my PhD thesis.) > > Load balancing in real routers generally defaults to per-destination > IP address, which is sufficient to prevent reordering of flows. See e.g. > http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a0080094820.shtml > > > L. > On 12 Aug 2009, at 00:09, Craig Partridge wrote: > >> >> I don't think anyone's replicated the experiment that Bennett, Shectman >> and I did back in 1999 ("Packet Reordering is Not Pathological Network >> Behavior" in IEEE/ACM TON, Dec 1999). I could be wrong. >> >> Thanks! >> >> Craig >> >>> Hello, >>> >>> I was wondering if there are measurement studies of Internet traffic >>> quantifying the magnitude of packet reordering within a TCP flow. Is >>> reordering a common problem for TCP in the current Internet? How about >>> the load balancing features in the routers from major vendors : is it >>> per flow basis or per packet basis, and if flow based load balancing >>> is done, then how is the flow classification is done these routers? >>> What could be/are other sources of reordering withing a TCP flow? 
>>> >>> Thanks, >>> Manish >> ******************** >> Craig Partridge >> Chief Scientist, BBN Technologies >> E-mail: craig at aland.bbn.com or craig at bbn.com >> Phone: +1 517 324 3425 > > DTN work: http://info.ee.surrey.ac.uk/Personal/L.Wood/saratoga/ > > > > > > > From jain.manish at gmail.com Wed Aug 12 08:32:38 2009 From: jain.manish at gmail.com (Manish Jain) Date: Wed, 12 Aug 2009 11:32:38 -0400 Subject: [e2e] Packet reordering in Internet In-Reply-To: <008101ca1b53$e66a92a0$b33fb7e0$@poznan.pl> References: <008101ca1b53$e66a92a0$b33fb7e0$@poznan.pl> Message-ID: Thanks everyone for the useful pointers. Based on some of the pointers and past discussions, I understand that routers in the current Internet have load-balancing implemented in a way to preserve packet order within a TCP flow. Is that a safe assumption? Are there any other mechanisms in the routers/switches that could lead to packet reordering? -- Manish On Wed, Aug 12, 2009 at 9:50 AM, Bartek Belter wrote: > Hi Manish, > > Some time ago we did some experiments in the pan-European education network. The results of the experiments were summarized in a paper. > It is available here: http://tnc2005.terena.org/core/getfile.php?file_id=626 (Shall we worry about Packet Reordering?). > > Hope it helps. > > Best regards, > Bartek > > -----Original Message----- > From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Manish Jain > Sent: Tuesday, August 11, 2009 11:57 PM > To: end2end-interest at postel.org > Subject: [e2e] Packet reordering in Internet > > Hello, > > I was wondering if there are measurement studies of Internet traffic quantifying the magnitude of packet reordering within a TCP flow. Is reordering a common problem for TCP in the current Internet? 
How about the load balancing features in the routers from major vendors: is it done on a per-flow or a per-packet basis, and if flow-based load balancing is done, then how is the flow classification done in these routers? > What could be/are other sources of reordering within a TCP flow? > > Thanks, > Manish > > > From pganti at gmail.com Wed Aug 12 10:21:17 2009 From: pganti at gmail.com (Paddy Ganti) Date: Wed, 12 Aug 2009 10:21:17 -0700 Subject: [e2e] Packet reordering in Internet In-Reply-To: References: <008101ca1b53$e66a92a0$b33fb7e0$@poznan.pl> Message-ID: <2ff1f08a0908121021r7ee5d5e1qeefa576cd4339df5@mail.gmail.com> In our past experience packet reordering is seen most commonly at very small time scales; that is, packets that get sent back to back on a high-speed network are likely to be reordered, but with even a millisecond or so spacing between packets (such as if the packets originated on a 10BaseT net) the chance of reordering is far less. So that's why you usually never see them on LANs but they are very common on a GigE WAN. That said, RFC 4737 is a good read for deciding which packet-reordering metric you want to focus on. >Is that a safe assumption? I don't think so. The following paper clearly details the role of transmit buffer allocation in packet reordering: S. Govind, R. Govindarajan and Joy Kuri, "Packet Reordering in Network Processors", Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS-07), CA, USA, 2007 Also this paper details why the assumption is not safe: A. Bare, A. Jayasumana, N. Piratla, "On Growth of Parallelism within Routers and Its Impact on Packet Reordering," Proceedings of the 2007 15th IEEE Workshop on Local and Metropolitan Area Networks, Princeton, NJ, 2007. >Are there any other mechanisms in the routers/switches that could lead to packet reordering? Link-layer retransmissions and router forwarding lulls are also causes. Please see the following paper for details: K. Leung, Victor O.K. Li, Daiqin Yang, "An Overview of Packet Reordering in Transmission Control Protocol: Problems, Solutions, and Challenges", IEEE Transactions on Parallel and Distributed Systems, Vol. 18, No. 4, 2007 Finally, the following papers might be of interest to you as well: L. Gharai et al., "Packet reordering, high speed networks and transport protocol performance," in Proceedings of the 13th ICCCN, Chicago, IL, October 2004 Yi Wang, Guohan Lu and Xing Li, "A Study of Internet Packet Reordering," in Proceedings of International Conference on Information Networking (ICOIN), Lecture Notes in Computer Science 3090, Springer-Verlag, 2004 S. Jaiswal et al., "Measurement and Classification of Out-of-sequence Packets in a Tier-1 IP Backbone," IEEE/ACM Transactions on Networking, Vol. 15, No. 1, February 2007. S. Kandula, D. Katabi, S. Sinha, A. Berger, "Dynamic load balancing without packet reordering," ACM SIGCOMM Computer Communication Review, 37(2), April 2007 Hope this helps. -Paddy On Wed, Aug 12, 2009 at 8:32 AM, Manish Jain wrote: > Thanks everyone for the useful pointers. > > Based on some of the pointers and past discussions, I understand that > routers in the current Internet have load-balancing implemented in a > way to preserve packet order within a TCP flow. Is that a safe > assumption? Are there any other mechanisms in the routers/switches > that could lead to packet reordering? > > -- > Manish > > On Wed, Aug 12, 2009 at 9:50 AM, Bartek Belter wrote: > > Hi Manish, > > > > Some time ago we did some experiments in the pan-European education > network. The results of the experiments were summarized in a paper. > > It is available here: > http://tnc2005.terena.org/core/getfile.php?file_id=626 (Shall we worry > about Packet Reordering?). > > > > Hope it helps. 
> > > > Best regards, > > Bartek > > > > -----Original Message----- > > From: end2end-interest-bounces at postel.org [mailto: end2end-interest-bounces at postel.org] On Behalf Of Manish Jain > > Sent: Tuesday, August 11, 2009 11:57 PM > > To: end2end-interest at postel.org > > Subject: [e2e] Packet reordering in Internet > > > > Hello, > > > > I was wondering if there are measurement studies of Internet traffic > quantifying the magnitude of packet reordering within a TCP flow. Is > reordering a common problem for TCP in the current Internet? How about the > load balancing features in the routers from major vendors: is it done on a per-flow > or a per-packet basis, and if flow-based load balancing is done, then > how is the flow classification done in these routers? > > What could be/are other sources of reordering within a TCP flow? > > > > Thanks, > > Manish > > > > > > > > From L.Wood at surrey.ac.uk Wed Aug 12 11:39:52 2009 From: L.Wood at surrey.ac.uk (Lloyd Wood) Date: Wed, 12 Aug 2009 19:39:52 +0100 Subject: [e2e] Packet reordering in Internet In-Reply-To: <2ff1f08a0908121021r7ee5d5e1qeefa576cd4339df5@mail.gmail.com> References: <008101ca1b53$e66a92a0$b33fb7e0$@poznan.pl> <2ff1f08a0908121021r7ee5d5e1qeefa576cd4339df5@mail.gmail.com> Message-ID: On 12 Aug 2009, at 18:21, Paddy Ganti wrote: > In our past experience packet reordering is seen most commonly at > very small time scales; that is, packets that get sent back to back > on a high-speed network are likely to be reordered, but with even a > millisecond or so spacing between packets (such as if the packets > originated on a 10BaseT net) the chance of reordering is far less. That is an argument in favour of TCP pacing instead of bursts of back-to-back segments. L. 
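A rough sketch of the spacing pacing buys, with illustrative numbers (`pacing_interval` is an invented helper, not code from any TCP stack):

```python
def pacing_interval(cwnd_bytes, mss, rtt_s):
    """Spread one congestion window evenly over the RTT instead of
    sending it as a back-to-back burst."""
    packets_per_rtt = max(1, cwnd_bytes // mss)
    return rtt_s / packets_per_rtt

# 64 KB window, 1460-byte segments, 100 ms RTT: ~44 segments per RTT,
# so a paced sender emits one segment every ~2.3 ms rather than a
# burst of 44 -- the millisecond-scale spacing regime described above
# as far less likely to be reordered.
interval = pacing_interval(64 * 1024, 1460, 0.100)
```
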
DTN work: http://info.ee.surrey.ac.uk/Personal/L.Wood/saratoga/ From william.allen.simpson at gmail.com Wed Aug 12 14:51:18 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Wed, 12 Aug 2009 17:51:18 -0400 Subject: [e2e] TCP improved closing strategies? Message-ID: <4A833956.2050501@gmail.com> With the advent of more widespread DNSsec deployment, more UDP sessions are likely to fall over into TCP sessions. I've been informed that even today, with more limited TCP activity, busy servers cannot wait 2MSL to finish closing. Also, busy caching servers run out of port numbers, and cycle quickly. So there's ample opportunity for seemingly duplicate transmissions. I've been searching my personal copy of the e2e-interest archives back to '98 (the previous years are only on backup somewhere), and haven't found anything on improved closing strategies. Ideas? Of course, there's T/TCP, but wasn't closing one of its Achilles' heels? From craig at aland.bbn.com Wed Aug 12 16:01:18 2009 From: craig at aland.bbn.com (Craig Partridge) Date: Wed, 12 Aug 2009 19:01:18 -0400 Subject: [e2e] TCP improved closing strategies? Message-ID: <20090812230118.AFD6E28E138@aland.bbn.com> Hi Bill: A couple of questions and an idea. Question first -- why is port cycling an issue? In TCP, one has to keep the tuple unique. To run out of ports you'd have to use so many ports with ONE peer that you'd have problems. Seems unlikely (even if a peer is, actually, a NATed network). The second question is why having a hashed PCB in TIME WAIT is such an issue for 2 MSL... (Is it purely the size of the hash database -- if so, there are ways to compress the hash table...). That said, the problem is fun. As I recall, Andy Tanenbaum used to point out that TP4 had an abrupt close and it worked. 
It does require somewhat more application coordination, but perhaps we can fake that by, say, retransmitting the last segment and the FIN a few times to try to ensure that all data is received by the client? Thanks! Craig > With the advent of more widespread DNSsec deployment, more UDP sessions > are likely to fall over into TCP sessions. > > I've been informed that even today, with more limited TCP activity, > busy servers cannot wait 2MSL to finish closing. > > Also, busy caching servers run out of port numbers, and cycle quickly. > So there's ample opportunity for seemingly duplicate transmissions. > > I've been searching my personal copy of the e2e-interest archives back to > '98 (the previous years are only on backup somewhere), and haven't found > anything on improved closing strategies. Ideas? > > Of course, there's T/TCP, but wasn't closing one of its Achilles' heels? ******************** Craig Partridge Chief Scientist, BBN Technologies E-mail: craig at aland.bbn.com or craig at bbn.com Phone: +1 517 324 3425 From faber at ISI.EDU Wed Aug 12 16:22:51 2009 From: faber at ISI.EDU (Ted Faber) Date: Wed, 12 Aug 2009 16:22:51 -0700 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A833956.2050501@gmail.com> References: <4A833956.2050501@gmail.com> Message-ID: <20090812232251.GF94672@zod.isi.edu> On Wed, Aug 12, 2009 at 05:51:18PM -0400, William Allen Simpson wrote: > I've been searching my personal copy of the e2e-interest archives back to > '98 (the previous years are only on backup somewhere), and haven't found > anything on improved closing strategies. Ideas? http://www.isi.edu/~faber/pubs/html/infocom99/ Basically, if you make sure your clients close the connection rather than your servers, you're fine. That paper includes a couple of possible ways to hack that, but the easiest is to have your clients close. -- Ted Faber http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc Unexpected attachment on this mail? 
See http://www.isi.edu/~faber/FAQ.html#SIG From perfgeek at mac.com Wed Aug 12 18:06:42 2009 From: perfgeek at mac.com (rick jones) Date: Wed, 12 Aug 2009 18:06:42 -0700 Subject: [e2e] Packet reordering in Internet In-Reply-To: References: <008101ca1b53$e66a92a0$b33fb7e0$@poznan.pl> Message-ID: <4E562B6B-3EE0-4D67-876E-DEF19A274192@mac.com> On Aug 12, 2009, at 8:32 AM, Manish Jain wrote: > Thanks everyone for the useful pointers. > > Based on some of the pointers and past discussions, I understand that > routers in the current Internet have load-balancing implemented in a > way to preserve packet order within a TCP flow. Is that a safe > assumption? Are there any other mechanisms in the routers/switches > that could lead to packet reordering? Define "safe." Correct more than 1/2 the time? More than 3/4? 7/8? 99 times out of 10?-) If the "router" is made from "Linux" the Linux bonding (aka load balancing, aggregation, teaming, trunking, call-it-what-you-will) software does offer a "round-robin" option that will spread packets of the same flow across multiple links and will indeed lead to reordering. rick jones > > -- > Manish > > On Wed, Aug 12, 2009 at 9:50 AM, Bartek Belter > wrote: >> Hi Manish, >> >> Some time ago we did some experiments in the pan-European education >> network. The results of the experiments were summarized in a paper. >> It is available here: http://tnc2005.terena.org/core/getfile.php?file_id=626 >> (Shall we worry about Packet Reordering?). >> >> Hope it helps. 
>> >> Best regards, >> Bartek >> >> -----Original Message----- >> From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org >> ] On Behalf Of Manish Jain >> Sent: Tuesday, August 11, 2009 11:57 PM >> To: end2end-interest at postel.org >> Subject: [e2e] Packet reordering in Internet >> >> Hello, >> >> I was wondering if there are measurement studies of Internet >> traffic quantifying the magnitude of packet reordering within a TCP >> flow. Is reordering a common problem for TCP in the current >> Internet? How about the load balancing features in the routers from >> major vendors : is it per flow basis or per packet basis, and if >> flow based load balancing is done, then how is the flow >> classification is done these routers? >> What could be/are other sources of reordering withing a TCP flow? >> >> Thanks, >> Manish >> >> >> > there is no rest for the wicked, yet the virtuous have no pillows From dpreed at reed.com Wed Aug 12 18:14:54 2009 From: dpreed at reed.com (David P. Reed) Date: Wed, 12 Aug 2009 21:14:54 -0400 Subject: [e2e] [unclassified] TCP improved closing strategies? In-Reply-To: <4A833956.2050501@gmail.com> References: <4A833956.2050501@gmail.com> Message-ID: <4A83690E.7040307@reed.com> I'm not sure whether it wouldn't be better to think through a non-TCP solution here. TCP is incredibly heavy duty for the purpose of doing a properly "secure" DNS transaction, which ultimately involves a single request-response in the most common case. And if you do, there is no reason why the server needs to maintain *connection* state at all - connections are for long term interactions. Am I missing something here? On 08/12/2009 05:51 PM, William Allen Simpson wrote: > With the advent of more widespread DNSsec deployment, more UDP sessions > are likely to fallover into TCP sessions. > > I've been informed that even today, with a more limited TCP activity, > busy servers cannot wait 2MSL to finish closing. 
> > Also, busy caching servers run out of port numbers, and cycle quickly. > So there's ample opportunity for seemingly duplicate transmissions. > > I've been searching my personal copy of the e2e-interest archives back to > '98 (the previous years are only on backup somewhere), and haven't found > anything on improved closing strategies. Ideas? > > Of course, there's T/TCP, but wasn't closing one of its Achilles heels? > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20090812/6b0b87cd/attachment.html From danmcd at sun.com Wed Aug 12 20:26:25 2009 From: danmcd at sun.com (Dan McDonald) Date: Wed, 12 Aug 2009 23:26:25 -0400 Subject: [e2e] [unclassified] TCP improved closing strategies? In-Reply-To: <4A83690E.7040307@reed.com> References: <4A833956.2050501@gmail.com> <4A83690E.7040307@reed.com> Message-ID: <20090813032625.GB641@mactavish.gateway.2wire.net> On Wed, Aug 12, 2009 at 09:14:54PM -0400, David P. Reed wrote: > I'm not sure whether it wouldn't be better to think through a non-TCP > solution here. TCP is incredibly heavy duty for the purpose of doing a > properly "secure" DNS transaction, which ultimately involves a single > request-response in the most common case. > > And if you do, there is no reason why the server needs to maintain > *connection* state at all - connections are for long term interactions. > > Am I missing something here? I thought (and I'm not SecureDNS wizard) that SecureDNS packets often exceed PathMTU for most of the Internet, and that you wanted segmentation *and* retransmission covered. Dan From fernando at gont.com.ar Thu Aug 13 02:02:25 2009 From: fernando at gont.com.ar (Fernando Gont) Date: Thu, 13 Aug 2009 06:02:25 -0300 Subject: [e2e] TCP improved closing strategies? 
In-Reply-To: <4A833956.2050501@gmail.com> References: <4A833956.2050501@gmail.com> Message-ID: <4A83D6A1.7000602@gont.com.ar> William Allen Simpson wrote: > I've been informed that even today, with a more limited TCP activity, > busy servers cannot wait 2MSL to finish closing. Not only busy servers. Many systems have reduced the length of the TIME-WAIT state, no matter how "busy" they are. > Also, busy caching servers run out of port numbers, and cycle quickly. > So there's ample opportunity for seemingly duplicate transmissions. > > I've been searching my personal copy of the e2e-interest archives back to > '98 (the previous years are only on backup somewhere), and haven't found > anything on improved closing strategies. Ideas? Well, you do have "improved *opening* strategies" :-). See page 93 of: http://www.gont.com.ar/papers/tn-03-09-security-assessment-TCP.pdf Timestamps can be used to safely recycle the TIME-WAIT state (provided that timestamps are monotonically-increasing across connections.) Thanks! Kind regards, -- Fernando Gont e-mail: fernando at gont.com.ar || fgont at acm.org PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1 From william.allen.simpson at gmail.com Thu Aug 13 05:58:23 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Thu, 13 Aug 2009 08:58:23 -0400 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <20090812230118.AFD6E28E138@aland.bbn.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> Message-ID: <4A840DEF.80503@gmail.com> Craig Partridge wrote: > Question first -- why is port cycling an issue? In TCP, one has to keep > the tuple unique. To run out > of ports you'd have to use so many ports with ONE peer that you'd have > problems.. Seems unlikely (even if a peer is, actually, a NATed network). > I'm not the operator in question, so I don't have the data, but based on a 'phone conversation: [x].gtld-servers.net. (Only the src port varies.) 
If a strategy was adopted to vary the query through all [x] possibilities, there's only 13 of them. There's a good likelihood of collisions, unless every OS carries a separate PAWS timestamp for each connection (something we've mentioned here in the past). Obviously, there are quite a few older OS out there, and some of them keep a system-wide PAWS timestamp. > The second question is why having a hashed PCB in TIME WAIT is such an > issue for 2 MSL... (Is it purely the size of the hash database -- if so, > there are ways to compress the hash table...). > AFAIK, the last survey (6-7 years ago) was 100 million queries per day, so that's roughly 694,444 during each 2MSL period. Of course, that's average, not peak (likely much more).... http://dns.measurement-factory.com/writings/wessels-pam2003-paper.pdf We're talking about Linux, Solaris, HP-UX, AIX, maybe some others. Do all these servers have the capability to handle that many TCP connections, rather than UDP connections? Do *any* of them? > That said, the problem is fun. > > As I recall Andy Tanenbaum used to point out that TP4 had an abrupt close > and it worked. It does require somewhat more application coordination but > perhaps we can fake that by, say, retransmitting the last segment and the FIN > a few times to seek to ensure that all data is received by the client??? > Cannot depend on the DNS client's OS to be that smart. Has to be a server only solution. Or based on a new TCP option, that tells us both ends are smart. (I've an option in mind.) From william.allen.simpson at gmail.com Thu Aug 13 06:23:33 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Thu, 13 Aug 2009 09:23:33 -0400 Subject: [e2e] TCP improved closing strategies? 
In-Reply-To: <20090812232251.GF94672@zod.isi.edu> References: <4A833956.2050501@gmail.com> <20090812232251.GF94672@zod.isi.edu> Message-ID: <4A8413D5.30104@gmail.com> Ted Faber wrote: > > http://www.isi.edu/~faber/pubs/html/infocom99/ > > Basically, if you make sure your clients close the connection rather > than your servers, you're fine. That paper includes a couple possible > ways to hack that, but the easiest is to have your clients close. > Ah yes, I vaguely remember reading that back in the day. Sadly, it's the clients that we cannot depend on closing.... Still, good ideas. From william.allen.simpson at gmail.com Thu Aug 13 06:50:23 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Thu, 13 Aug 2009 09:50:23 -0400 Subject: [e2e] [unclassified] TCP improved closing strategies? In-Reply-To: <20090813032625.GB641@mactavish.gateway.2wire.net> References: <4A833956.2050501@gmail.com> <4A83690E.7040307@reed.com> <20090813032625.GB641@mactavish.gateway.2wire.net> Message-ID: <4A841A1F.6040508@gmail.com> Dan McDonald wrote: > On Wed, Aug 12, 2009 at 09:14:54PM -0400, David P. Reed wrote: >> I'm not sure whether it wouldn't be better to think through a non-TCP >> solution here. TCP is incredibly heavy duty for the purpose of doing a >> properly "secure" DNS transaction, which ultimately involves a single >> request-response in the most common case. >> >> And if you do, there is no reason why the server needs to maintain >> *connection* state at all - connections are for long term interactions. >> >> Am I missing something here? > > I thought (and I'm not SecureDNS wizard) that SecureDNS packets often exceed > PathMTU for most of the Internet, and that you wanted segmentation *and* > retransmission covered. > Dan is correct. At least, *I'm* not aware of paths with greater than 4K. Heck, most paths are 1,500 at best, and 1,460 is common. Worse, it turns out that 4K doesn't work well, either. 
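[Editor's aside: the payload ceilings quoted in this thread follow from simple header arithmetic. A minimal check, assuming typical IPv4 header sizes with no IP or TCP options:]

```python
# Where the "1,460" and "less than 1,420" figures in this thread come from.
# Assumed header sizes: IPv4 = 20 bytes, UDP = 8 bytes, TCP (no options) = 20 bytes.
ETHERNET_MTU = 1500
IPV4, UDP, TCP = 20, 8, 20

max_udp_payload = ETHERNET_MTU - IPV4 - UDP   # 1472: largest unfragmented UDP payload
common_tcp_mss = ETHERNET_MTU - IPV4 - TCP    # 1460: the common TCP MSS

print(max_udp_payload)   # -> 1472
print(common_tcp_mss)    # -> 1460
# PPPoE, tunnels, and conservative firewalls shave these further, which is
# why the thread's rule of thumb for DNS over UDP is "less than 1,420 at most."
```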
The EDNS0 default is 4K; the DNS UDP packets fragment, and the SOHO firewalls and/or NATs drop the fragments. UDP is only good for data less than 1,420 at most. Steve Crocker has a clever draft to allow the user query to limit the algorithms requested, but that only works for end systems. Caching servers still need something more -- a lot of SOHO boxen also act as caching servers, and most every ISP these days designates one or more caching servers in its DHCP. That's a lot of servers. I've got another proposed option to calculate the size of the remaining (post-truncation) response, by RRType. That will allow a caching server to make subsequent UDP queries for each RRType, for a complete cache. Mine also will indicate whether UDP is hopeless, and TCP *has* to be used; when the total response for any single RRType will be greater than 1,420. Both those proposed UDP EDNS0 options will help lower the number of TCP requests, but neither can be *required* to be implemented. So, we still need to handle the existing problem -- a problem that has been known for many years, but no solution has yet been standardized. It's time to bite the bullet. Sadly, there's no TCPng, and it's too late for that to help now. Remember, most network appliances won't be upgraded. The only remaining possibilities are vanilla UDP and vanilla TCP. From william.allen.simpson at gmail.com Thu Aug 13 07:11:10 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Thu, 13 Aug 2009 10:11:10 -0400 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A83D6A1.7000602@gont.com.ar> References: <4A833956.2050501@gmail.com> <4A83D6A1.7000602@gont.com.ar> Message-ID: <4A841EFE.2050501@gmail.com> Fernando Gont wrote: > Well, you do have "improved *opening* strategies" :-). 
See page 93 of: > http://www.gont.com.ar/papers/tn-03-09-security-assessment-TCP.pdf > > Timestamps can be used to safely recycle the TIME-WAIT state (provided > that timestamps are monotonically-increasing across connections.) > Wonderful! A good survey. Nice to have it all in one place. From perfgeek at mac.com Thu Aug 13 08:41:26 2009 From: perfgeek at mac.com (rick jones) Date: Thu, 13 Aug 2009 08:41:26 -0700 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A833956.2050501@gmail.com> References: <4A833956.2050501@gmail.com> Message-ID: On Aug 12, 2009, at 2:51 PM, William Allen Simpson wrote: > With the advent of more widespread DNSsec deployment, more UDP > sessions > are likely to fall over into TCP sessions. > > I've been informed that even today, with a more limited TCP activity, > busy servers cannot wait 2MSL to finish closing. One wonders how a web server survives... > Also, busy caching servers run out of port numbers, and cycle quickly. > So there's ample opportunity for seemingly duplicate transmissions. Presuming a single source and destination IP address and a single well-known server port number, the caching server could run through ~60000/MSL connections per second before attempting to reuse a four-tuple still in TIME_WAIT. If one were holding strictly to a four minute MSL, that is ~250 connections per second. 60 seconds seems to be a rather more common MSL (well, length of TIME_WAIT) so that would be ~1000 connections per second. > I've been searching my personal copy of the e2e-interest archives > back to > '98 (the previous years are only on backup somewhere), and haven't > found > anything on improved closing strategies. Ideas? Don't close after the first transaction. There is a reason HTTP added persistent/pipelined connections. > Of course, there's T/TCP, but wasn't closing one of its Achilles > heels?
T/TCP was an engineering exercise - it was interesting, I enjoyed trying it out and even had a netperf test for it, but it didn't go anywhere. rick jones Wisdom teeth are impacted, people are affected by the effects of events From perfgeek at mac.com Thu Aug 13 08:53:58 2009 From: perfgeek at mac.com (rick jones) Date: Thu, 13 Aug 2009 08:53:58 -0700 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A840DEF.80503@gmail.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> Message-ID: <8475B9C6-BBEC-49C4-AB6E-5B349C87FB32@mac.com> On Aug 13, 2009, at 5:58 AM, William Allen Simpson wrote: > AFAIK, the last survey (6-7 years ago) was 100 million queries per > day, so > that's roughly 694,444 during each 2MSL period. Of course, that's > average, > not peak (likely much more).... > > http://dns.measurement-factory.com/writings/wessels-pam2003-paper.pdf > > We're talking about Linux, Solaris, HP-UX, AIX, maybe some others. > Do all > these servers have the capability to handle that many TCP connections, > rather than UDP connections? > > Do *any* of them? Modulo the variations in how persistent the connections were relative to the transaction rate, and the differences in the metrics, you can probably look at the archives of SPECweb96 (HTTP 1.0) SPECweb99 (1.1), SPECweb99_SSL (new SSL session for each new TCP connection, IIRC, but it has been a while) or even SPECweb2005/SPECweb2009 if you can decide which among "Banking," "Ecommerce," and "Support" workload is closest, to get an idea of how many TCP connections servers can handle. During the heyday of web server benchmarking, there was a lot of work done in minimizing the overhead of TIME_WAIT tracking etc. >> That said, the problem is fun. >> As I recall Andy Tanenbaum used to point out that TP4 had an abrupt >> close >> and it worked.
It does require somewhat more application >> coordination but >> perhaps we can fake that by, say, retransmitting the last segment >> and the FIN >> a few times to seek to ensure that all data is received by the >> client??? > Cannot depend on the DNS client's OS to be that smart. Has to be a > server > only solution. Or based on a new TCP option, that tells us both > ends are > smart. (I've an option in mind.) Isn't a new TCP option by definition depending on the client's OS to be smart? rick jones Wisdom teeth are impacted, people are affected by the effects of events From dpreed at reed.com Thu Aug 13 12:39:32 2009 From: dpreed at reed.com (David P. Reed) Date: Thu, 13 Aug 2009 15:39:32 -0400 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <8475B9C6-BBEC-49C4-AB6E-5B349C87FB32@mac.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <8475B9C6-BBEC-49C4-AB6E-5B349C87FB32@mac.com> Message-ID: <4A846BF4.50501@reed.com> As I mentioned, TCP is a terrible choice for a short "remote procedure call". The arguments for it seem to be: 1) big payload in one or the other direction. I could argue that that probably reflects a bad higher-level protocol design (given that the information exchanged from a pure info theoretic point of view is more or less a couple thousand bits each way - even if you include a full public key verification). 2) MTU issues. Let's assume the MTU is 100 bytes. UDP-based protocols can span packets without requiring all the overhead of a TCP connection per request. 3) Legacy equipment issues. UDP is supported for all legacy equipment. What's the problem? I know of lot's of legacy equipment that cannot under any circumstance support TCP-based DNS *today*. So an argument from legacy is incredibly weak. 4) The "fun problem" of TCP close performance. OK, if you want money to do research on TCP closing, use a different excuse. NSF will fund a grad student. Next problem. 
I'm not against using TCP for this, but if you want to use it, define a protocol like HTTP 1.1 that multiplexes a TCP connection between requester and requested. Then the TCP close will happen far less frequently, as will opens. And it can use very *short* exchanges, because there is no reason to transmit more than the request and response in the common case. From day at std.com Thu Aug 13 12:51:40 2009 From: day at std.com (John Day) Date: Thu, 13 Aug 2009 15:51:40 -0400 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <8475B9C6-BBEC-49C4-AB6E-5B349C87FB32@mac.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <8475B9C6-BBEC-49C4-AB6E-5B349C87FB32@mac.com> Message-ID: Correct me if I am wrong, but in terms of recycling port-ids so that no late packets are accepted and having to wait 2MSL, I believe that Dick Watson proved in 1981 that waiting 2MSL was both necessary and sufficient. Actually it is 2MSL on one side and 3MSL on the other. Watson, Richard. "Timer-Based Mechanisms in Reliable Transport Protocol Connection Management," Computer Networks 5 (1981) 47 - 56. As far as TP4 is concerned, it has the same constraint. There is no avoiding it. Earlier Belnes showed that the 5-way exchange was required to deliver one message reliably as long as there were no failures, but to be absolutely sure timers were required and you were back to Watson's result. Belnes, Dag "Single Message Communication," IEEE Transactions on Communications, Vol. COM-24, No. 2, February 1976, pp 190 - 194. Watson shows that bounding 3 timers is necessary and sufficient to ensure reliable transfer. Matta and his students did a paper recently that looked at the single message case under harsh conditions and found that even though TCP bounds the same 3 timers it isn't as effective as Watson's approach.
Take care, John Day At 8:53 -0700 2009/08/13, rick jones wrote: >On Aug 13, 2009, at 5:58 AM, William Allen Simpson wrote: >>AFAIK, the last survey (6-7 years ago) was 100 million queries per day, so >>that's roughly 694,444 during each 2MSL period. Of course, that's average, >>not peak (likely much more).... >> >>http://dns.measurement-factory.com/writings/wessels-pam2003-paper.pdf >> >>We're talking about Linux, Solaris, HP-UX, AIX, maybe some others. Do all >>these servers have the capability to handle that many TCP connections, >>rather than UDP connections? >> >>Do *any* of them? > >Modulo the variations in how persistent the connections were >relative to the transaction rate, and the differences in the >metrics, you can probably look at the archives of SPECweb96 (HTTP >1.0) SPECweb99 (1.1), SPECweb99_SSL (new SSL session for each new >TCP connection, IIRC, but it has been a while) or even >SPECweb2005/SPECweb2009 if you can decide which among "Banking," >"Ecommerce," and "Support" workload is closest, to get an idea of >how many TCP connections servers can handle. During the heyday of >web server benchmarking, there was a lot of work done in minimizing >the overhead of TIME_WAIT tracking etc. > >>>That said, the problem is fun. >>>As I recall Andy Tanenbaum used to point out that TP4 had an abrupt close >>>and it worked. It does require somewhat more application coordination but >>>perhaps we can fake that by, say, retransmitting the last segment >>>and the FIN >>>a few times to seek to ensure that all data is received by the client??? >>Cannot depend on the DNS client's OS to be that smart. Has to be a server >>only solution. Or based on a new TCP option, that tells us both ends are >>smart. (I've an option in mind.) > >Isn't a new TCP option by definition depending on the client's OS to be smart? > >rick jones >Wisdom teeth are impacted, people are affected by the effects of events -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20090813/efb3bab5/attachment.html From fernando at gont.com.ar Thu Aug 13 14:09:31 2009 From: fernando at gont.com.ar (Fernando Gont) Date: Thu, 13 Aug 2009 18:09:31 -0300 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A841EFE.2050501@gmail.com> References: <4A833956.2050501@gmail.com> <4A83D6A1.7000602@gont.com.ar> <4A841EFE.2050501@gmail.com> Message-ID: <4A84810B.6020704@gont.com.ar> William Allen Simpson wrote: >> Well, you do have "improved *opening* strategies" :-). See page 93 of: >> http://www.gont.com.ar/papers/tn-03-09-security-assessment-TCP.pdf >> >> Timestamps can be used to safely recycle the TIME-WAIT state (provided >> that timestamps are monotonically-increasing across connections.) >> > Wonderful! A good survey. Nice to have it all in one place. No need to say: if you have any feedback, it will be appreciated. (Send it to me at fernando at gont.com.ar) Thanks! Kind regards, -- Fernando Gont e-mail: fernando at gont.com.ar || fgont at acm.org PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1 From falk at bbn.com Fri Aug 14 10:31:39 2009 From: falk at bbn.com (Aaron Falk) Date: Fri, 14 Aug 2009 13:31:39 -0400 Subject: [e2e] [Fwd: Copyrights and the IRTF and Independent Stream] Message-ID: <4A859F7B.40409@bbn.com> Forwarding to all active IRTF RG mail lists because this issue affects the publication rights for IDs and RFCs developed within the IRTF. Information about IRTF RFCs can be found in http://tools.ietf.org/html/draft-irtf-rfcs-03. Please direct comments to the RFC-interest list (coordinates below). --aaron IRTF Chair -------- Original Message -------- Subject: [rfc-i] Copyrights and the IRTF and Independent Stream Date: Thu, 13 Aug 2009 18:59:42 +0200 From: Olaf Kolkman To: RFC Interest Dear Colleagues, This mail is about rights in RFCs and Internet drafts. 
The topic draws context, and uses terminology from: RFC 4844: The RFC Series and RFC Editor RFC 4846: Independent Submissions to the RFC Editor RFC 5378: Rights Contributors Provide to the IETF Trust RFC 5377: Advice to the Trustees of the IETF Trust on Rights to Be Granted in IETF Documents First some administrative notes. This mail serves to set a baseline for a public discussion and is based on a few (virtual and real-life) hallway discussions. The goal of this discussion is to make sure all issues are brought to the table and the stream maintainers and the Trust are aware of the community's opinions. I see my role as an IAB member that is driving the discussion, and mildly moderating it. In the end it is the role of the IAB to see that an appropriate stream-dependent process has been followed and that the streams do not interact inappropriately. So in that sense this discussion informs the IAB too. Although this mail is sent to multiple lists I would like to urge folk to discuss the issues on the rfc-interest list: http://www.rfc-editor.org/mailman/listinfo/rfc-interest Without further ado. As you may know RFCs are published from different streams. With respect to the incoming and outgoing rights, only the rights of the IETF stream documents are well defined. And currently the IAB has chosen to have IAB documents fall under this regime too. The situation is less clear for Independent and IRTF Stream documents: all existing provisions are targeted specifically towards IETF Contributions (in the narrow context defined by RFC5378). Besides, currently and in the context of copyrights, Internet Drafts (I-Ds) are seen as IETF contributions. Maintainers of the Independent and IRTF stream would like to have documents from their streams published with full rights to make derivative works or make no derivative works whatsoever, and require attribution in both cases. There are two strategies to make this work: 1.
Allow I-Ds and RFCs for the IAB, IRTF, and Independent stream to be published with a Creative Commons license added. A rough outline of one of the ideas that is floating around is that all contributions that are intended to become an RFC or are an IETF contribution (in the sense of the RFC 5378 definition) will continue to have the 5378 license terms as defined by the trust. Since 5378 is non-exclusive, authors may add a Creative Commons license if they'd like to. Care should be taken that those licenses would not be applied to IETF contributions, as that could lead to 'spoofed standard-track RFCs'. 2. Have the trust manage the rights for the IAB, IRTF, and Independent streams RFCs and I-Ds. Both strategies may require that, at the moment an Internet Draft is published as an RFC, it is clear on which stream it is intended to be published, with which rights, and that the authors have appropriate authority to grant/sublicense those rights. In both cases we want to avoid IETF contributions (I-Ds and RFCs, although it is not clear whether this is a strong requirement for I-Ds) being published with a license policy that would circumvent the Trust's control over the outgoing rights. A tactic/straw-man to implement (2) is as follows: i) Treat all I-Ds as or similar to IETF contributions (this way there is no doubt about the Trust having all necessary rights, see BCP78 section 5.3). There seem to be two paths to proceed: i.1) The stream controllers choose to apply the RFC5378 rules. This possibility is offered in section 4 of 5378 and the IAB stream chooses to apply the rules through its declaration in section 11. i.2) The streams write a stream specific "Rights contributors provide to the IETF trust" document. ii) Have the stream controllers request the IETF Trust to create license(s) for non IETF-Stream RFCs that grant (full|no) derivative rights, attribution required. (document this request in an RFC).
iii) Ask the trust to develop language that can be put on an I-D with which the author can request (full|no) derivative rights if/when the I-D is published on a specific non-IETF stream document. This sort of text would serve as an indication of consent with wide licensing and could be included as boilerplate material together with an indication of intended stream (as in draft-iab-streams-headers-boilerplates). There are some pros and cons to this scheme that we need to identify in public discussion, hence this mail. During the hallway discussion, I've seen the following arguments and identified the following open questions. I don't claim completeness. - The Trust is not well equipped to hold non-IETF documents. Mainly because it requires the interpretation that all documents are IETF Documents. Is this interpretation based on language in the Trust agreement or on the definition of IETF Documents in the context of BCP78? Or, is the trust well suited, and does implementation only need some boilerplate changes? - IETF Stream RFCs need to be protected by limited derivative rights so that the IETF keeps ownership over its Standards and can maintain those. Suppose an I-D with full derivative rights is posted (either because the author has published it with Creative Commons, or because the Trust allows full derivative rights for stream specific I-Ds), would narrowing down the rights by publication as an IETF stream RFC cause any problems? Feedback welcome, --Olaf Kolkman From touch at ISI.EDU Mon Aug 17 07:16:23 2009 From: touch at ISI.EDU (Joe Touch) Date: Mon, 17 Aug 2009 07:16:23 -0700 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A840DEF.80503@gmail.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> Message-ID: <4A896637.4010907@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 William Allen Simpson wrote: ... >> As I recall Andy Tanenbaum used to point out that TP4 had an abrupt close >> and it worked.
It does require somewhat more application coordination >> but >> perhaps we can fake that by, say, retransmitting the last segment and >> the FIN >> a few times to seek to ensure that all data is received by the client??? >> > Cannot depend on the DNS client's OS to be that smart. Has to be a server > only solution. Or based on a new TCP option, that tells us both ends are > smart. (I've an option in mind.) There are two different problems here: 1) server maintaining state clogging the server 2) server TIME-WAIT slowing connections to a single address Both go away if the client closes the connection. If you are going to modify both ends, then that's a much simpler place to start than a TCP option (which will need to be negotiated during the SYN, and might be removed/dropped by firewalls or NATs, etc.). FWIW, persistent connections helps only #2. If it's the number of different clients connecting a server that is locking up too much server memory, then persistent connections will make the problem worse, not better. Joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkqJZjcACgkQE5f5cImnZruodwCeI3Lgpd8FNgsVt/g/FxPG29He NAAAoOXx0XeCkuadd26u87RBfnNSro3k =ZI0g -----END PGP SIGNATURE----- From dpreed at reed.com Mon Aug 17 08:54:01 2009 From: dpreed at reed.com (David P. Reed) Date: Mon, 17 Aug 2009 11:54:01 -0400 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A896637.4010907@isi.edu> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> Message-ID: <4A897D19.6030702@reed.com> The function of the TCP close protocol has two parts: 1) a "semantic" property that indicates to the *applications* on each end that there will be no more data and that all data sent has been delivered. 
(this has the usual problem that "exactly once" semantics cannot be achieved, and TCP provides "at most once" data delivery semantics on the data octet just prior to the close.) Of course, *most* apps follow the end-to-end principle and use the TCP close only as an "optimization" because they use their data to provide all the necessary semantics for their needs. 2) a "housekeeping" property related to keeping the TCP-layer-state minimal. This is what seems to be of concern here. Avoiding (2) for many parts of TCP is the reason behind "Trickles" (q.v.), a version of TCP that moves state to the client side. If we had a "trickles" version of TCP (which could be done on top of UDP) we could get all the functions of TCP with regard to (2) without server side overloading, other than that necessary for the app itself. Of course, "trickles" is also faithful to all of TCP's end-to-end congestion management and flow control, etc. None of which is needed for the DNS application - in fact, that stuff (slowstart, QBIC, ...) is really ridiculous to think about in the DNS requirements space (as it is also in the HTML page serving space, given RTT and bitrates we observe today, but I can't stop the academic hotrodders from their addiction to tuning terabyte FTPs from unloaded servers for 5 % improvements over 10% lossy links). You all should know that a very practical fix to both close-wait and syn-wait problems is to recognize that 500 *milli*seconds is a much better choice for lost-packet timeouts these days - 250 would be pretty good. Instead, we have a default designed so that a human drinking coffee with one hand can drive a manual connection setup one packet at a time using DDT on an ASR33 TTY while having a chat with a co-worker. And the result is that we have DDOS attacks... I understand the legacy problems, but really. If we still designed modern HDTV signals so that a 1950 Dumont console TV could show a Blu-Ray movie, we'd have not advanced far.
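[Editor's aside: the connectionless request/response pattern argued for here can be sketched in a few lines. This is an editor's illustration of the general shape, not anything proposed in the thread; a real DNS-like protocol adds query IDs, truncation handling, and more careful timer backoff.]

```python
# Minimal sketch of a single UDP request/response with client-side
# retransmission: the server keeps no per-client connection state at all.
import socket
import threading

def server(sock):
    # Receive one datagram and answer it; nothing is remembered afterwards.
    data, addr = sock.recvfrom(2048)
    sock.sendto(b"response:" + data, addr)

srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv.bind(("127.0.0.1", 0))            # ephemeral port on loopback
threading.Thread(target=server, args=(srv,), daemon=True).start()

cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
cli.settimeout(0.5)                   # retransmit timer (cf. the 500 ms figure above)
reply = b""
for _ in range(3):                    # retransmit up to 3 times on timeout
    cli.sendto(b"query", srv.getsockname())
    try:
        reply, _addr = cli.recvfrom(2048)
        break
    except socket.timeout:
        continue
print(reply)
```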
On 08/17/2009 10:16 AM, Joe Touch wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > > William Allen Simpson wrote: > ... > >>> As I recall Andy Tanenbaum used to point out that TP4 had an abrupt close >>> and it worked. It does require somewhat more application coordination >>> but >>> perhaps we can fake that by, say, retransmitting the last segment and >>> the FIN >>> a few times to seek to ensure that all data is received by the client??? >>> >>> >> Cannot depend on the DNS client's OS to be that smart. Has to be a server >> only solution. Or based on a new TCP option, that tells us both ends are >> smart. (I've an option in mind.) >> > There are two different problems here: > > 1) server maintaining state clogging the server > > 2) server TIME-WAIT slowing connections to a single address > > Both go away if the client closes the connection. If you are going to > modify both ends, then that's a much simpler place to start than a TCP > option (which will need to be negotiated during the SYN, and might be > removed/dropped by firewalls or NATs, etc.). > > FWIW, persistent connections helps only #2. If it's the number of > different clients connecting a server that is locking up too much server > memory, then persistent connections will make the problem worse, not better. > > Joe > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.9 (MingW32) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iEYEARECAAYFAkqJZjcACgkQE5f5cImnZruodwCeI3Lgpd8FNgsVt/g/FxPG29He > NAAAoOXx0XeCkuadd26u87RBfnNSro3k > =ZI0g > -----END PGP SIGNATURE----- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20090817/5db367c5/attachment.html From day at std.com Mon Aug 17 13:14:08 2009 From: day at std.com (John Day) Date: Mon, 17 Aug 2009 16:14:08 -0400 Subject: [e2e] TCP improved closing strategies? 
In-Reply-To: <4A897D19.6030702@reed.com>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com>
Message-ID:

At 11:54 -0400 2009/08/17, David P. Reed wrote:
>The function of the TCP close protocol has two parts:
>
>1) a "semantic" property that indicates to the *applications* on
>each end that there will be no more data and that all data sent has
>been delivered. (this has the usual problem that "exactly once"
>semantics cannot be achieved, and TCP provides "at most once" data
>delivery semantics on the data octet just prior to the close. Of
>course, *most* apps follow the end-to-end principle and use the TCP
>close only as an "optimization" because they use their data to
>provide all the necessary semantics for their needs.

Correct. Blowing off even more dust: yes, this result was well understood by at least 1982. And it translates into Ted's solution, that explicit establishment and release of an "application connection" is necessary. Again, see Watson's paper and Lamport's Byzantine Generals paper. Using the release of the lower-level connection to signal the end of the higher-level connection is overloading, and always leads to problems.

You still need 2MSL.

>2) a "housekeeping" property related to keeping the TCP-layer-state
>minimal. This is what seems to be of concern here.

Agreed here as well, taking Dave's point that the value of MSL has gotten completely out of hand. As Dave says, the RFC suggests 30 seconds, 1 or 2 minutes(!) for MSL. Going through 2**32 port-ids in 4 minutes with one host is unlikely, but not *that* unlikely. And of course, because of the well-known-port kludge, you are restricted to the client's port-id space and address. If you had good ole ICP, you wouldn't have 2**64 (there is other stuff going on), but it would be a significant part of that.
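[A back-of-the-envelope check of the recycling rate being debated here. This is illustrative arithmetic only; Day's 2**32 figure counts a larger identifier space, so the 2**16 bound below is the pessimistic per-client-address case against a single well-known server port.]

```python
# With MSL = 2 minutes, a closed connection's (address, port) pair is
# unusable for 2*MSL = 4 minutes. Against one well-known server port,
# a single client address has at most 2**16 port-ids to burn through.
MSL = 120                    # seconds; the high end of the RFC's suggestion
TIME_WAIT = 2 * MSL          # 240 seconds
PORT_IDS = 2 ** 16           # per (client address, server port) pair

max_sustained_rate = PORT_IDS / TIME_WAIT
print(round(max_sustained_rate))  # sustained connections/second before port-ids wrap
```

[About 273 connections per second from one client address, sustained, is enough to wrap the port space inside one 2MSL window -- which is why a busy server behind a NAT'd client population feels this.]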
But the TCP MSL may be adding insult to injury: I have heard rumors that the IP TTL is usually set to 255, which seems absurdly high as well. Even so, surely hitting 255 hops must take well under 4 minutes! So can we guess that TCP is sitting around waiting even though all of the packets are long gone from the network?

2MSL should probably be smaller, but it still has to be there.

Take care,
John

From dpreed at reed.com Mon Aug 17 13:30:31 2009
From: dpreed at reed.com (David P. Reed)
Date: Mon, 17 Aug 2009 16:30:31 -0400
Subject: [e2e] TCP improved closing strategies?
In-Reply-To:
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com>
Message-ID: <4A89BDE7.9030705@reed.com>

You need 2MSL to reject delayed dups. However, one does not need "fully live" individual connections to deal with delayed dups. You can reject delayed dups by saying "port unreachable" without a problem in most cases. 2MSL provides no semantic guarantees whatever.

From day at std.com Mon Aug 17 15:35:44 2009
From: day at std.com (John Day)
Date: Mon, 17 Aug 2009 18:35:44 -0400
Subject: [e2e] TCP improved closing strategies?
In-Reply-To: <4A89BDE7.9030705@reed.com>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A89BDE7.9030705@reed.com>
Message-ID:

At 16:30 -0400 2009/08/17, David P. Reed wrote:
>You need 2MSL to reject delayed dups. However, one does not need "fully live"
>individual connections to deal with delayed dups.

Correct. Bill's question was on how soon port-ids could be re-cycled.

>You can reject delayed dups by saying "port unreachable" without a problem
>in most cases. 2MSL provides no semantic guarantees whatever.

Nor should it. Nor should anyone even try to construe that it might.
From perfgeek at mac.com Mon Aug 17 19:03:25 2009
From: perfgeek at mac.com (rick jones)
Date: Mon, 17 Aug 2009 19:03:25 -0700
Subject: [e2e] TCP improved closing strategies?
In-Reply-To: <4A897D19.6030702@reed.com>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com>
Message-ID:

> Of course, "trickles" is also faithful to all of TCP's end-to-end
> congestion management and flow control, etc. None of which is
> needed for the DNS application - in fact, that stuff (slowstart,
> QBIC, ...)
is really ridiculous to think about in the DNS
> requirements space

Would DNS query processing ever even see slowstart artifacts? 99 times out of 10 the initial cwnd is 4380 bytes or somesuch, right?

> (as it is also in the HTML page serving space, given RTT and
> bitrates we observe today, but I can't stop the academic
> hotrodders from their addiction to tuning terabyte FTPs from
> unloaded servers for 5% improvements over 10% lossy links).

If the FSI guys had their say, latency would be king.

> You all should know that a very practical fix to both close-wait and
> syn-wait problems is to recognize that 500 *milli*seconds is a much
> better choice for lost-packet timeouts these days - 250 would be
> pretty good. Instead, we have a default designed so that a human
> drinking coffee with one hand can drive a manual connection setup
> one packet at a time using DDT on an ASR33 TTY while having a chat
> with a co-worker.

I thought many/most stacks used 500 milliseconds for their TCP RTO lower bound already?

> And the result is that we have DDOS attacks...

Well... are the constants really the heart of that issue?

> I understand the legacy problems, but really. If we still designed
> modern HDTV signals so that a 1950 Dumont console TV could show a
> Blu-Ray movie, we'd have not advanced far.

I must disagree with the presumption that TV has progressed far since the 1950's. At least in so far as content is concerned :)

rick jones
http://homepage.mac.com/perfgeek

From dpreed at reed.com Tue Aug 18 08:28:13 2009
From: dpreed at reed.com (David P. Reed)
Date: Tue, 18 Aug 2009 11:28:13 -0400
Subject: [e2e] TCP improved closing strategies?
In-Reply-To:
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com>
Message-ID: <4A8AC88D.9080103@reed.com>

Minor misunderstanding...
I was referring to all of the advanced logic and state needed to manage TCP flow control when I mentioned "slow start" - and NOT any "artifacts". The only thing needed for a DNS message and response is dealing with assembling an application message from a small number of packets.

Both the request and the response (in DNS at the application level) make no meaningful change to the state variables at either end, so they are commutative and idempotent operations with respect to all other DNS requests and all other DNS responses. That makes almost all of the "semantic functionality" of TCP irrelevant, and means that "connection state" is flushable.

From touch at ISI.EDU Tue Aug 18 08:42:32 2009
From: touch at ISI.EDU (Joe Touch)
Date: Tue, 18 Aug 2009 08:42:32 -0700
Subject: [e2e] TCP improved closing strategies?
In-Reply-To: <4A8AC88D.9080103@reed.com>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com>
Message-ID: <4A8ACBE8.60704@isi.edu>

David P. Reed wrote:
> Minor misunderstanding... I was referring to all of the advanced logic
> and state needed to manage TCP flow control when I mentioned "slow
> start" - and NOT any "artifacts".
> ...
> That makes almost all of the "semantic functionality" of
> TCP irrelevant, and means that "connection state" is flushable.

It means you didn't need TCP.
You can't flush TCP state unless you know you don't need what it provides - notably protection that the next TCP connection on that socket pair won't be affected by late-arriving segments from the previous connection.

Let's not change TCP semantics in this regard; let's just not use TCP where TCP semantics aren't needed.

Joe

From dpreed at reed.com Tue Aug 18 09:04:14 2009
From: dpreed at reed.com (David P. Reed)
Date: Tue, 18 Aug 2009 12:04:14 -0400
Subject: [e2e] TCP improved closing strategies?
In-Reply-To: <4A8ACBE8.60704@isi.edu>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu>
Message-ID: <4A8AD0FE.6080400@reed.com>

On 08/18/2009 11:42 AM, Joe Touch wrote:
> It means you didn't need TCP.

Exactly!

> You can't flush TCP state unless you know
> you don't need what it provides - notably protection that the next TCP
> connection on that socket pair won't be affected by late arriving
> segments from the previous connection.
>
> Let's not change TCP semantics in this regard; let's just not use TCP
> where TCP semantics aren't needed.

If you recall, that was my original point, in my original response. DNS shouldn't use TCP just because some DNS technique gets expansive enough to sometimes require more than 1 IP datagram. As I originally suggested, simple information-theoretic analysis suggests that one can do the DNS request/response within one UDP datagram each way, so my suggestion in this case is to send the DNS layer protocol designer back to the drawing board with an information theorist and a cryptographer at his/her elbow.
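[For reference, the UDP-first behavior the thread keeps circling is mechanical: a resolver sends the query in one UDP datagram and falls back to TCP only when the response comes back with the TC (truncated) bit set. A minimal sketch of the packet handling follows; the helper names are invented for illustration, and no I/O is done so it stands alone.]

```python
import struct

def build_query(qname, qtype=1, txid=0x1234):
    # 12-byte DNS header: id, flags (RD set), QDCOUNT=1; then the question
    # section: length-prefixed labels, a zero byte, QTYPE, QCLASS=IN.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(bytes([len(p)]) + p.encode("ascii")
                      for p in qname.split("."))
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)

def needs_tcp_retry(response):
    # The TC bit (0x0200 in the flags word) means the answer did not fit
    # in the UDP payload and must be re-fetched over TCP.
    flags = struct.unpack("!H", response[2:4])[0]
    return bool(flags & 0x0200)
```

[A resolver would sendto() the query over UDP and, only if needs_tcp_retry() is true for the reply, repeat the exchange over a TCP connection -- which is exactly where the TIME-WAIT/2MSL cost this thread is arguing about enters the picture.]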
From william.allen.simpson at gmail.com Tue Aug 18 09:56:51 2009
From: william.allen.simpson at gmail.com (William Allen Simpson)
Date: Tue, 18 Aug 2009 12:56:51 -0400
Subject: [e2e] TCP improved closing strategies?
In-Reply-To: <4A8AD0FE.6080400@reed.com>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com>
Message-ID: <4A8ADD53.2010107@gmail.com>

David P. Reed wrote:
> If you recall, that was my original point, in my original response. DNS
> shouldn't use TCP just because some DNS technique gets expansive enough
> to sometimes require more than 1 IP datagram. As I originally suggested,
> simple information theoretic analysis suggests that one can do the DNS
> request/response within one UDP datagram each way, so my suggestion in
> this case is to send the DNS layer protocol designer back to the
> drawing board with an information theorist and cryptographer at his/her
> elbow.

Thank you to everybody that provided substantive information and pointers.

I look forward to David's information theoretic cryptology that crams SOA, several NS, and a half dozen digital signatures into 512 bytes over UDP, for the simplest secure case of NXDOMAIN.

With several hundred thousand clients per minute using 65,000 ports. Through NAT boxen that pass *only* TCP and UDP, and don't randomize the Source port, and don't properly handle returning IP fragments. Etc.

Back in the real world, that means TCP semantics, such as retransmission of lost segments.

Or reinventing the wheel (segmentation and retransmission over UDP).

From dpreed at reed.com Tue Aug 18 13:35:42 2009
From: dpreed at reed.com (David P. Reed)
Date: Tue, 18 Aug 2009 16:35:42 -0400
Subject: [e2e] TCP improved closing strategies?
In-Reply-To: <4A8ADD53.2010107@gmail.com>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com>
Message-ID: <4A8B109E.6030508@reed.com>

I'd suggest that identity-based encryption would provide a good starting point for the level of quote-security-endquote that is needed for DNS in the grand practical scheme of things. But I'd probably be accused of being unconnected with the simple reality of people who think that SOA/certificates/etc. being mumbled makes one an expert on "security".

What is the risk and what is the threat model, in one simple statement that doesn't involve claims that DNS is somehow a "super secure" system to start with?

In a world where I check into a hotel that forcibly rapes my packets starting with the ARP packets and going up through DHCP, so that when I send a TCP/IP packet to www.google.com on port 80 it gets redirected to a server that opendns.com (the world's "safest" DNS service) has been told is to handle all google traffic (no NXDOMAIN here), which scrapes my requests in order to sell my personal interests to a marketing company? Get real.

Security used to mean something other than employing security consultants to work on subproblems as if they were the fundamental issue, and crap up fundamentally weak systems with bells-and-whistles like TCP magic close protocols that only add DDOS attack risks, while fixing nothing important.
URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20090818/16887013/attachment.html From vikram.visweswaraiah at gmail.com Tue Aug 18 15:33:05 2009 From: vikram.visweswaraiah at gmail.com (Vikram Visweswaraiah) Date: Tue, 18 Aug 2009 15:33:05 -0700 Subject: [e2e] TCP improved closing strategies? In-Reply-To: References: <4A833956.2050501@gmail.com> Message-ID: <5d9330db0908181533m8c03013vdc26bd9fd6987a4a@mail.gmail.com> On Aug 12, 2009, at 2:51 PM, William Allen Simpson wrote: > > > With the advent of more widespread DNSsec deployment, more UDP sessions > > are likely to fallover into TCP sessions. > > > > I've been informed that even today, with a more limited TCP activity, > > busy servers cannot wait 2MSL to finish closing. > > One wonders how a web server survives... Right, load balancing courtesy of multiple A records returned by the DNS server in a different order per query, which begs the question: Aren't DNS servers themselves configured with some level of redundancy that helps with fault tolerance and load-balancing? Doesn't that mitigate the problem? -V -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20090818/765d87e4/attachment.html From william.allen.simpson at gmail.com Tue Aug 18 16:08:24 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Tue, 18 Aug 2009 19:08:24 -0400 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A8B109E.6030508@reed.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> <4A8B109E.6030508@reed.com> Message-ID: <4A8B3468.2060608@gmail.com> David P. 
Reed wrote: > On 08/18/2009 12:56 PM, William Allen Simpson wrote: >> Thank you to everybody that provided substantive information and >> pointers. >> >> I look forward to David's information theoretic cryptology that crams >> SOA, >> several NS, and a half dozen digital signatures into 512 bytes over UDP, >> for the simplest secure case of NXDOMAIN. >> > I'd suggest that identity based encryption would provide a good starting > point the level of quote-security-endquote that is needed for DNS in the > grand practical scheme of things. But I'd probably be accused of being > unconnected with the simple reality of people who thing that > SOA/certificates/etc. being mumbled makes one an expert on "security". > > What is the risk and what is the threat model, in one simple statement > that doesn't involve claims that DNS is somehow a "super secure" system > to start with? > RTFM. At the risk of alienating the others on the list by replying to this drivel, I'm also looking forward to the magic wand that instantaneously replaces DNS with another protocol and infrastructure. Moreover, folks that don't top post in multipart/alternative text/html, expecting others to do the work of fixing their formatting for readability. The thing that makes some of us more expert in security than others is the day to day experience of securing the "grand practical scheme of things." And the willingness to openly ask questions instead of hurling insults.... >> With several hundred thousand clients per minute using 65,000 ports. >> Through NAT boxen that pass *only* TCP and UDP, and don't randomize the >> Source port, and don't properly handle returning IP fragments. Etc. >> >> Back in the real world, that means TCP semantics, such as retransmission >> of lost segments. >> >> Or reinventing the wheel (segmentation and retransmission over UDP). 
>> > In a world where I check into a hotel that forcibly rapes my packets > starting with the ARP packets and going up through DHCP, so that when I > send a TCP/IP packet to www.google.com on port 80 it gets redirected to > a server that opendns.com (the world's "safest" DNS service) has been > told is to handle all google traffic (no NXDOMAIN here) which scrapes my > requests in order to sell my personal interests to a marketing company? > We've all had that experience. Some of us even *predicted* it long ago (late NSFnet/early commercial Internet days). One of us even designed a secure ARP replacement, and proposed a shared-secret requirement for DHCP, with a requirement that every Internet end-to-end session be secured for authentication, confidentiality, and integrity. Other folks argued against it. The very idea that every system required at least 1 configured secret before installation was considered anathema. What about a thousand systems on the loading dock? One fine fellow had the unmitigated gall to state (paraphrased) the ethernet model works fine today, why change it.... I kept the recording for many years, as that person was forcibly made my "co-author" on Neighbor Discovery, who then removed all the security and hidden terminal (for wireless) discovery. Only now are they adding that back again (badly and inelegantly). Better late than never? N.B.: now ATT 2-wire cable boxes actually come pre-configured with a secret, printed right on the label. Finally! If only it was a UPC, so those could easily be scanned into a database for a thousand boxes on the loading dock.... > Get real. Security used to mean something other than employing security > consultants to work on subproblems as if they were the fundamental > issue, and crap up fundamentally weak systems with bells-and-whistles > like TCP magic close protocols that only add DDOS attack risks, while > fixing nothing important. > Employing? You're being paid for this diatribe? 
Where were you during the crypto-wars? Where was *your* running code? Who was it that specified only 65K UDP ports? Who didn't randomize the Source port to prevent prediction, resulting in DNS cache poisoning? Who didn't even think about security for the Internet as a whole? (Compartment options are not security, they're bureaucracy.) From touch at ISI.EDU Tue Aug 18 23:29:35 2009 From: touch at ISI.EDU (Joe Touch) Date: Tue, 18 Aug 2009 23:29:35 -0700 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A8ADD53.2010107@gmail.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> Message-ID: <4A8B9BCF.50900@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 William Allen Simpson wrote: ... > With several hundred thousand clients per minute using 65,000 ports. The TCP state is supposed to be per socket pair (src/dst IP, src/dst port). So unless you're running those clients behind a single NAT - or keep track of only part of the state, this isn't an issue of port reuse. The issue is more likely consumption of kernel space. > Through NAT boxen that pass *only* TCP and UDP, and don't randomize the > Source port, and don't properly handle returning IP fragments. Etc. > > Back in the real world, that means TCP semantics, such as retransmission > of lost segments. > > Or reinventing the wheel (segmentation and retransmission over UDP). A protocol that breaks a request into 4-5 packets and does even a simple bit-mask NACK retransmission until they all get there isn't anywhere near as complex as TCP. Some wheels don't need to be reinvented. Just dusted off and used where needed. 
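[Archive editor's note: the bit-mask NACK scheme described above can be made concrete in a few lines. The sketch below is purely illustrative; the one-byte fragment header, the mask width, and every name in it are assumptions for this example, not any specified protocol.]

```python
# Sketch of segmentation plus bit-mask NACK over an unreliable channel.
# Header layout (1-byte fragment index) and all names are assumptions.

FRAG_SIZE = 512  # hypothetical fragment payload size

def fragment(payload: bytes) -> list[bytes]:
    """Split a request into numbered fragments: 1-byte index + data."""
    chunks = [payload[i:i + FRAG_SIZE] for i in range(0, len(payload), FRAG_SIZE)]
    return [bytes([idx]) + chunk for idx, chunk in enumerate(chunks)]

class Reassembler:
    """Receiver side: track arrivals in a bitmask, NACK the holes."""
    def __init__(self, total: int):
        self.total = total
        self.got = {}    # fragment index -> data
        self.mask = 0    # bit i set once fragment i has arrived

    def receive(self, frag: bytes) -> None:
        idx, data = frag[0], frag[1:]
        self.got[idx] = data
        self.mask |= 1 << idx

    def nack_mask(self) -> int:
        """Bitmask of fragments still missing (0 when complete)."""
        return ((1 << self.total) - 1) & ~self.mask

    def payload(self) -> bytes:
        assert self.nack_mask() == 0
        return b"".join(self.got[i] for i in range(self.total))

# Lossy first pass: fragment 1 is dropped, the NACK names it, and one
# retransmission completes the transfer.
frags = fragment(b"x" * 1200)               # 3 fragments
rx = Reassembler(len(frags))
rx.receive(frags[0]); rx.receive(frags[2])  # fragment 1 lost in transit
missing = rx.nack_mask()                    # 0b010: retransmit fragment 1
for i in range(len(frags)):
    if missing & (1 << i):
        rx.receive(frags[i])
assert rx.payload() == b"x" * 1200
```

A real deployment would also need a transaction identifier and a retransmission timer; the point here is only that the per-request state is a single integer mask, far less than a full TCP control block.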
Joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkqLm88ACgkQE5f5cImnZrvNegCfcm3tJ5NX3WxmhXbrWxIC1laR F3sAoKKZeqOdfFP2lm0mkQ3rg92DpZqq =oJuq -----END PGP SIGNATURE----- From dpreed at reed.com Wed Aug 19 05:19:57 2009 From: dpreed at reed.com (David P. Reed) Date: Wed, 19 Aug 2009 08:19:57 -0400 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A8B3468.2060608@gmail.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> <4A8B109E.6030508@reed.com> <4A8B3468.2060608@gmail.com> Message-ID: <4A8BEDED.4090207@reed.com> Since you persist in being a jerk, Mr. Simpson, I suggest you ask someone who actually was involved what role I played in the "crypto wars" (a small one, but an important one, I think, which legitimized public key crypto in the broad public market), and what role I played in attempting to get TCP (the original version) made cryptographically secure from the beginning when I was working to try to get Steve Kent's work into the original standard, despite NSA opposition. Yes, I speak flamboyantly sometimes. Sometimes I need to apologize as a result. That is needed to cut through the crap spewed as arguments that are grounded in the notion that security is about "patching" against the last attack, or grounded in the idea that protocol design is about byte ordering and understanding the difference between MUST and SHOULD. Your insults miss the mark, because they are wrong. I'm happy to apologize if you feel personally insulted. However, the situation that concerns me remains: this whole project of "securing DNS" is a pile of bad designs aimed at a narrow script-kiddie threat, while opening up a window for far greater attacks (DDOS, etc.) 
and leaving the barn door open to a wide variety of problems, both security problems and authority problems. A real, pragmatic security expert would realize that security 101 is: if you strengthen A, but leave B weak, the attackers will move to B. If you strengthen A against one attack, but leave problems unaddressed that create a set of new attacks in the process, you actually make the problem worse, not better. The best solution for security starts with an "end-to-end" understanding of the fundamental function you are trying to provide, and its security requirements. Typically "optimizations" introduced in *security* situations have the property of not fully solving the problem, but even worse, introducing new problems (hence the introduction of TCP statefulness into DNS adds the potential of attacks through state disruption, and the introduction of kernel-implemented TCPs with a wide variety of quirks introduces far more attacks through kernel-access paths). Perhaps this doesn't matter in your world. It matters in the *real* world, where I happen to live. On 08/18/2009 07:08 PM, William Allen Simpson wrote: > David P. Reed wrote: >> On 08/18/2009 12:56 PM, William Allen Simpson wrote: >>> Thank you to everybody that provided substantive information and >>> pointers. >>> >>> I look forward to David's information theoretic cryptology that >>> crams SOA, >>> several NS, and a half dozen digital signatures into 512 bytes over >>> UDP, >>> for the simplest secure case of NXDOMAIN. >>> >> I'd suggest that identity based encryption would provide a good >> starting point the level of quote-security-endquote that is needed >> for DNS in the grand practical scheme of things. But I'd probably be >> accused of being unconnected with the simple reality of people who >> thing that SOA/certificates/etc. being mumbled makes one an expert on >> "security". 
>> >> What is the risk and what is the threat model, in one simple >> statement that doesn't involve claims that DNS is somehow a "super >> secure" system to start with? >> > RTFM. > > At the risk of alienating the others on the list by replying to this > drivel, I'm also looking forward to the magic wand that instantaneously > replaces DNS with another protocol and infrastructure. > > Moreover, folks that don't top post in multipart/alternative text/html, > expecting others to do the work of fixing their formatting for > readability. > > The thing that makes some of us more expert in security than others is > the > day to day experience of securing the "grand practical scheme of things." > > And the willingness to openly ask questions instead of hurling > insults.... > > >>> With several hundred thousand clients per minute using 65,000 ports. >>> Through NAT boxen that pass *only* TCP and UDP, and don't randomize the >>> Source port, and don't properly handle returning IP fragments. Etc. >>> >>> Back in the real world, that means TCP semantics, such as >>> retransmission >>> of lost segments. >>> >>> Or reinventing the wheel (segmentation and retransmission over UDP). >>> >> In a world where I check into a hotel that forcibly rapes my packets >> starting with the ARP packets and going up through DHCP, so that when >> I send a TCP/IP packet to www.google.com on port 80 it gets >> redirected to a server that opendns.com (the world's "safest" DNS >> service) has been told is to handle all google traffic (no NXDOMAIN >> here) which scrapes my requests in order to sell my personal >> interests to a marketing company? >> > We've all had that experience. Some of us even *predicted* it long ago > (late NSFnet/early commercial Internet days). One of us even designed a > secure ARP replacement, and proposed a shared-secret requirement for > DHCP, > with a requirement that every Internet end-to-end session be secured for > authentication, confidentiality, and integrity. 
> > Other folks argued against it. The very idea that every system required > at least 1 configured secret before installation was considered anathema. > What about a thousand systems on the loading dock? One fine fellow had > the unmitigated gall to state (paraphrased) the ethernet model works fine > today, why change it.... > > I kept the recording for many years, as that person was forcibly made my > "co-author" on Neighbor Discovery, who then removed all the security and > hidden terminal (for wireless) discovery. Only now are they adding that > back again (badly and inelegantly). Better late than never? > > N.B.: now ATT 2-wire cable boxes actually come pre-configured with a > secret, printed right on the label. Finally! If only it was a UPC, so > those could easily be scanned into a database for a thousand boxes on the > loading dock.... > > >> Get real. Security used to mean something other than employing >> security consultants to work on subproblems as if they were the >> fundamental issue, and crap up fundamentally weak systems with >> bells-and-whistles like TCP magic close protocols that only add DDOS >> attack risks, while fixing nothing important. >> > Employing? You're being paid for this diatribe? > > Where were you during the crypto-wars? Where was *your* running code? > > Who was it that specified only 65K UDP ports? Who didn't randomize the > Source port to prevent prediction, resulting in DNS cache poisoning? > Who didn't even think about security for the Internet as a whole? > > (Compartment options are not security, they're bureaucracy.) > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20090819/80a07cec/attachment-0001.html From william.allen.simpson at gmail.com Wed Aug 19 09:55:40 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Wed, 19 Aug 2009 12:55:40 -0400 Subject: [e2e] TCP improved closing strategies? 
In-Reply-To: <4A8BEDED.4090207@reed.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> <4A8B109E.6030508@reed.com> <4A8B3468.2060608@gmail.com> <4A8BEDED.4090207@reed.com> Message-ID: <4A8C2E8C.9030808@gmail.com> David P. Reed wrote: > Since you persist in being a jerk, Mr. Simpson, I suggest you ask > someone who actually was involved what role I played in the "crypto > wars" (a small one, but an important one, I think, which legitimized > public key crypto in the broad public market), and what role I played in > attempting to get TCP (the original version) made cryptographically > secure from the beginning when I was working to try to get Steve Kent's > work into the original standard, despite NSA opposition. > Since you persist in being a jerk, Mr. Reed.... I was actually involved, and I don't remember your running code. Or quite frankly, your involvement large or small. However, I'm not good at names, perhaps I'd remember your face. My first TCP/IP implementation was started circa 1979-1980 (the PE 7/16), to complement the bisync that came with the OS, and the X.25 that I'd done to talk to the satellite communications coming in from the field weather transmitters for the Integrated Pest Management project. I got the documents from Chris Wendt at Merit, where they were working on their implementation. I don't remember any crypto in them. Perhaps that wasn't forwarded to me by Merit.... Or by somebody to Merit. From william.allen.simpson at gmail.com Thu Aug 20 05:24:50 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Thu, 20 Aug 2009 08:24:50 -0400 Subject: [e2e] TCP improved closing strategies? 
In-Reply-To: <4A8B9BCF.50900@isi.edu> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> <4A8B9BCF.50900@isi.edu> Message-ID: <4A8D4092.8030403@gmail.com> Joe Touch wrote: > William Allen Simpson wrote: > ... >> With several hundred thousand clients per minute using 65,000 ports. > > The TCP state is supposed to be per socket pair (src/dst IP, src/dst > port). So unless you're running those clients behind a single NAT - or > keep track of only part of the state, this isn't an issue of port reuse. > The issue is more likely consumption of kernel space. > I've confirmed with Vixie. Here's my interpretation of his shorthand. The point of view of a busy recursive nameserver: 1) fin-wait-2 locks up the tuple for 2*MSL. 2) ouraddress and ourport are both fixed. 3) fixed theiraddress, from our POV. 4) they've discarded state for theirport, usually this is due to NAT. The solution requires an improved closing strategy, where the onus is entirely on the session initiator. There have been several suggestions in the literature. Thanks again to those that provided useful and interesting pointers. >>... >> Or reinventing the wheel (segmentation and retransmission over UDP). > > A protocol that breaks a request into a 4-5 packets and does even a > simple bit-mask NACK retransmission until they all get there isn't > anywhere near as complex as TCP. > > Some wheels don't need to be reinvented. Just dusted off and used where > needed. > Perhaps you'll enjoy reading: http://www.ietf.org/id/draft-barwood-dnsext-edns-page-option-00.txt That's not the direction I'm heading.... 
From vikram.visweswaraiah at gmail.com Thu Aug 20 13:13:46 2009 From: vikram.visweswaraiah at gmail.com (Vikram Visweswaraiah) Date: Thu, 20 Aug 2009 13:13:46 -0700 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A8D4092.8030403@gmail.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> <4A8B9BCF.50900@isi.edu> <4A8D4092.8030403@gmail.com> Message-ID: <5d9330db0908201313k26ac013bi5496b37b5e1f62b3@mail.gmail.com> On Thu, Aug 20, 2009 at 5:24 AM, William Allen Simpson wrote: > > Joe Touch wrote: >> >> William Allen Simpson wrote: >> ... >>> >>> With several hundred thousand clients per minute using 65,000 ports. >> >> The TCP state is supposed to be per socket pair (src/dst IP, src/dst >> port). So unless you're running those clients behind a single NAT - or >> keep track of only part of the state, this isn't an issue of port reuse. >> The issue is more likely consumption of kernel space. >> > I've confirmed with Vixie. Here's my interpretation of his shorthand. > > The point of view of a busy recursive nameserver: > > 1) fin-wait-2 locks up the > tuple for 2*MSL. > > 2) ouraddress and ourport are both fixed. > > 3) fixed theiraddress, from our POV. 
> > 4) they've discarded state for theirport, usually this is due to NAT. > > The solution requires an improved closing strategy, where the onus is > entirely on the session initiator. > > There have been several suggestions in the literature. Thanks again to > those that provided useful and interesting pointers. I'm still somewhat curious about the problem space. Is NAT the culprit? It seems like DNS has a few reasonable mechanisms to limit the number of queries to servers - TTL setting, resolver caches, round-robin records. Why don't these help? Is there a specific network or scenario where this is a big problem or is this a fairly widespread issue? Thanks! -V PS: Link below to draft seems broken > > >>> ... >>> Or reinventing the wheel (segmentation and retransmission over UDP). >> >> A protocol that breaks a request into a 4-5 packets and does even a >> simple bit-mask NACK retransmission until they all get there isn't >> anywhere near as complex as TCP. >> >> Some wheels don't need to be reinvented. Just dusted off and used where >> needed. >> > Perhaps you'll enjoy reading: > > http://www.ietf.org/id/draft-barwood-dnsext-edns-page-option-00.txt > > That's not the direction I'm heading.... From william.allen.simpson at gmail.com Fri Aug 21 15:47:25 2009 From: william.allen.simpson at gmail.com (William Allen Simpson) Date: Fri, 21 Aug 2009 18:47:25 -0400 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <5d9330db0908201313k26ac013bi5496b37b5e1f62b3@mail.gmail.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> <4A8B9BCF.50900@isi.edu> <4A8D4092.8030403@gmail.com> <5d9330db0908201313k26ac013bi5496b37b5e1f62b3@mail.gmail.com> Message-ID: <4A8F23FD.1070705@gmail.com> Vikram Visweswaraiah wrote: > I'm still somewhat curious about the problem space. Is NAT the > culprit? 
It seems like DNS has a few reasonable mechanisms to limit > the number of queries to servers - TTL setting, resolver caches, > round-robin records. Why don't these help? Is there a specific network > or scenario where this is a big problem or is this a fairly widespread > issue? > Sometimes, but as previously noted, this is about recursive nameservers. How does any of the above help at all? We already know that most stub resolvers don't bother with TTL or cache: 25.4% Identical Query 44.9% Repeated Query 2.15% Legitimate Query http://dns.measurement-factory.com/writings/wessels-pam2003-paper.pdf There were average (not peak) 644+ thousand queries per MSL banging away at 65 thousand ports, and those are old numbers. IIRC, peak was roughly 8,000 per second (920+ thousand per MSL). That's for one root server, *with* *existing* round robin rotation among 13 root servers. As we switch to TCP, that's not scaling well. On namedroppers, bert hubert wrote: # This, however, is not the point; we are not dealing with a 'clean # network'. We are dealing with a network ('The Internet') that saw a # 600-fold increase in TCP traffic, which should not have happened. Anyway, this is supposed to be the end to end group. Why are many folks trying the "Somebody Else's Problem" approach? > PS: Link below to draft seems broken > He's put out a couple more revisions over the past couple of days. Try: http://www.ietf.org/id/draft-barwood-dnsext-edns-page-option-02.txt I've not read it.... That's not the direction I'm going! My drafty draft is circulating, I'll have something less drafty in a few more days, and probably running code in a few weeks. Since there are so many nay-sayers, I'll just send it to any interested parties privately. Thanks again to those that contributed useful information! From L.Wood at surrey.ac.uk Sat Aug 22 03:43:53 2009 From: L.Wood at surrey.ac.uk (L.Wood@surrey.ac.uk) Date: Sat, 22 Aug 2009 11:43:53 +0100 Subject: [e2e] TCP improved closing strategies? 
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> <4A8B9BCF.50900@isi.edu> <4A8D4092.8030403@gmail.com><5d9330db0908201313k26ac013bi5496b37b5e1f62b3@mail.gmail.com> <4A8F23FD.1070705@gmail.com> Message-ID: <4835AFD53A246A40A3B8DA85D658C4BE01368956@EVS-EC1-NODE4.surrey.ac.uk> http://tools.ietf.org/html/draft-barwood-dnsext-edns-page-option won't break. Note that the version number is elided and automatically redirects to the latest version. Ideally, the boilerplate for each draft would include variations on: 'Revisions of this internet-draft can be obtained from: http://tools.ietf.org/html/draft-barwood-dnsext-edns-page-option' before giving 'the list of current Internet-Drafts' > PS: Link below to draft seems broken > He's put out a couple more revisions over the past couple of days. Try: http://www.ietf.org/id/draft-barwood-dnsext-edns-page-option-02.txt -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20090822/68562fb1/attachment.html From per.hurtig at kau.se Mon Aug 24 00:18:24 2009 From: per.hurtig at kau.se (Per Hurtig (work)) Date: Mon, 24 Aug 2009 09:18:24 +0200 Subject: [e2e] Packet reordering in Internet In-Reply-To: <4E562B6B-3EE0-4D67-876E-DEF19A274192@mac.com> References: <008101ca1b53$e66a92a0$b33fb7e0$@poznan.pl> <4E562B6B-3EE0-4D67-876E-DEF19A274192@mac.com> Message-ID: Is reordering a problem for datacenters? Thanks, Per Hurtig On Thu, Aug 13, 2009 at 03:06, rick jones wrote: > > On Aug 12, 2009, at 8:32 AM, Manish Jain wrote: > >> Thanks everyone for the useful pointers. >> >> Based on the some pointers and past discussions, I understand that >> routers in the current Internet have load-balancing implemented in a >> way to preserve packet order within a TCP flow. 
Is that a safe >> assumption? Are there any other mechanisms in the routers/switches >> that could lead to packet reordering? > > Define "safe." Correct more than 1/2 the time? More than 3/4? 7/8? 99 times > out of 10?-) > > If the "router" is made from "Linux" the Linux bonding (aka load balancing, > aggregation, teaming, trunking, call-it-what-you-will) software does offer a > "round-robin" option that will spread packets of the same flow across > multiple links and will indeed lead to reordering. > > rick jones > >> >> -- >> Manish >> >> On Wed, Aug 12, 2009 at 9:50 AM, Bartek Belter wrote: >>> >>> Hi Manish, >>> >>> Some time ago we did some experiments in the pan-European education >>> network. The results of experiments were summarized in a paper. >>> It is available here: >>> http://tnc2005.terena.org/core/getfile.php?file_id=626 (Shall we worry about >>> Packet Reordering?). >>> >>> Hope it helps. >>> >>> Best regards, >>> Bartek >>> >>> -----Original Message----- >>> From: end2end-interest-bounces at postel.org >>> [mailto:end2end-interest-bounces at postel.org] On Behalf Of Manish Jain >>> Sent: Tuesday, August 11, 2009 11:57 PM >>> To: end2end-interest at postel.org >>> Subject: [e2e] Packet reordering in Internet >>> >>> Hello, >>> >>> I was wondering if there are measurement studies of Internet traffic >>> quantifying the magnitude of packet reordering within a TCP flow. Is >>> reordering a common problem for TCP in the current Internet? How about the >>> load balancing features in the routers from major vendors : is it per flow >>> basis or per packet basis, and if flow based load balancing is done, then >>> how is the flow classification is done these routers? >>> What could be/are other sources of reordering withing a TCP flow? 
>>> >>> Thanks, >>> Manish >>> >>> >>> >> > > there is no rest for the wicked, yet the virtuous have no pillows > > > From per.hurtig at kau.se Mon Aug 24 02:44:44 2009 From: per.hurtig at kau.se (Per Hurtig (work)) Date: Mon, 24 Aug 2009 11:44:44 +0200 Subject: [e2e] Packet reordering in Internet In-Reply-To: <66D4D6C711E011498906F791297A7A063F0C8D391E@NLCLUEXM08.connect1.local> References: <4E562B6B-3EE0-4D67-876E-DEF19A274192@mac.com> <66D4D6C711E011498906F791297A7A063F0C8D391E@NLCLUEXM08.connect1.local> Message-ID: Hi Matthias, On Mon, Aug 24, 2009 at 11:25, Krause, Matthias wrote: > Hi Per, > >> -----Original Message----- >> From: end2end-interest-bounces at postel.org [mailto:end2end-interest- >> bounces at postel.org] On Behalf Of Per Hurtig (work) >> Sent: Monday 24 August 2009 9:18 >> To: end2end-interest at postel.org >> Subject: Re: [e2e] Packet reordering in Internet >> >> Is reordering a problem for datacenters? > > No, for TCP -> includes packet "sorting". > I know that TCP is sorting incoming packets, I should have specified my question better :) What I was wondering is if packets are more likely to be reordered in data centers than in the regular Internet, due to the specific architecture of data centers. For instance, I've seen papers that argue that packet reordering is more prevalent in high-speed environments, e.g. [1], and data centers would likely belong to this type of environment. [1] "Packet reordering in high-speed networks and its impact on high-speed TCP variants". Feng, Jie and Ouyang, Zhipeng and Xu, Lisong and Ramamurthy, Byrav. In Computer Communications 32(1), pp. 62-68, January 2009. > No, for any application using UDP that is aware of the problem (and most are) > > It is basically the problem of the network layer and it has to be solved on the network layer. > If it would be an issue, the Internet would not have been successful. 
> > Regards > > Matthias > > The information contained in this message may be confidential and legally protected under applicable law. The message is intended solely for the addressee(s). If you are not the intended recipient, you are hereby notified that any use, forwarding, dissemination, or reproduction of this message is strictly prohibited and may be unlawful. If you are not the intended recipient, please contact the sender by return e-mail and destroy all copies of the original message. > > Regards, Per From touch at ISI.EDU Mon Aug 24 17:29:22 2009 From: touch at ISI.EDU (Joe Touch) Date: Mon, 24 Aug 2009 17:29:22 -0700 Subject: [e2e] TCP improved closing strategies? In-Reply-To: <4A8BEDED.4090207@reed.com> References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com> <4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com> <4A8B109E.6030508@reed.com> <4A8B3468.2060608@gmail.com> <4A8BEDED.4090207@reed.com> Message-ID: <4A933062.6090904@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Regarding portions of recent posts: > Since you persist in being a jerk Ad hominem attacks are not permitted on this list. Any additional offenses will put those parties on the list whose posts are held for review - and held posts are reviewed only once daily. Joe (as list admin) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkqTMGIACgkQE5f5cImnZruDJQCeLVwO33Ct77C+pqapallVgcSw s0EAoPf5Qu2fl9U8S2tdroJbsF1bfjfS =RxMC -----END PGP SIGNATURE----- From touch at ISI.EDU Mon Aug 24 17:45:09 2009 From: touch at ISI.EDU (Joe Touch) Date: Mon, 24 Aug 2009 17:45:09 -0700 Subject: [e2e] TCP improved closing strategies? 
In-Reply-To: <4A8D4092.8030403@gmail.com>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A840DEF.80503@gmail.com>
	<4A896637.4010907@isi.edu> <4A897D19.6030702@reed.com>
	<4A8AC88D.9080103@reed.com> <4A8ACBE8.60704@isi.edu>
	<4A8AD0FE.6080400@reed.com> <4A8ADD53.2010107@gmail.com>
	<4A8B9BCF.50900@isi.edu> <4A8D4092.8030403@gmail.com>
Message-ID: <4A933415.30607@isi.edu>

William Allen Simpson wrote:
> Joe Touch wrote:
>> William Allen Simpson wrote:
>> ...
>>> With several hundred thousand clients per minute using 65,000 ports.
>>
>> The TCP state is supposed to be per socket pair (src/dst IP, src/dst
>> port). So unless you're running those clients behind a single NAT - or
>> keep track of only part of the state - this isn't an issue of port
>> reuse. The issue is more likely consumption of kernel space.
>>
> I've confirmed with Vixie. Here's my interpretation of his shorthand.
>
> The point of view of a busy recursive nameserver:
>
> 1) fin-wait-2 locks up the tuple for 2*MSL.

TIME-WAIT has the 2*MSL delay. FIN-WAIT-2 is supposed to clear after
the FIN is sent, the other side's FIN is received, and an ACK is sent
back.

> 2) ouraddress and ourport are both fixed.
>
> 3) fixed theiraddress, from our POV.

What does "fixed" mean? Presumably there is more than one DNS client,
or is that not the case?

> 4) they've discarded state for theirport, usually this is due to NAT.

Well, this is a huge bug in NATs. When a connection through them is
closed, they shouldn't reuse the source port for new connections for
2*MSL. The question is whether this is causing a problem for you,
though.

> The solution requires an improved closing strategy, where the onus is
> entirely on the session initiator.

The onus to do what?
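The "per socket pair" point, and the NAT port-reuse bug, can be made concrete with a toy table keyed on the 4-tuple. This is only a sketch: the addresses are made up, the MSL of 120 s is the conventional assumed value, and no real stack stores state this way.

```python
# Toy model: TCP connection state is keyed by the full socket pair
# (src IP, src port, dst IP, dst port), and TIME-WAIT pins a tuple for
# 2*MSL.  A NAT reusing a client port inside that window collides with
# the server's lingering state.  Illustrative values throughout.

MSL = 120  # seconds; conventional value (assumed here)

connections = {}  # (src_ip, src_port, dst_ip, dst_port) -> (state, expiry)

def close_connection(tup, now):
    """Closing leaves the tuple in TIME-WAIT until now + 2*MSL."""
    connections[tup] = ("TIME-WAIT", now + 2 * MSL)

def new_syn(tup, now):
    """A SYN reusing a tuple still in TIME-WAIT must be rejected."""
    entry = connections.get(tup)
    if entry and entry[0] == "TIME-WAIT" and now < entry[1]:
        return "rejected: tuple in TIME-WAIT"
    connections[tup] = ("SYN-RECEIVED", None)
    return "accepted"

tup = ("192.0.2.1", 40000, "203.0.113.5", 53)
close_connection(tup, now=0)
print(new_syn(tup, now=10))   # rejected: tuple in TIME-WAIT
print(new_syn(tup, now=300))  # accepted (the 2*MSL = 240 s hold expired)

# With ouraddress/ourport/theiraddress fixed, only ~65,535 client ports
# remain to distinguish connections, bounding the sustainable rate from
# one client address:
rate_budget = 65535 / (2 * MSL)
print(round(rate_budget))     # 273 new connections/s before port reuse
```

Taking "several hundred thousand clients per minute" from the quoted text as roughly 300,000/min (~5,000/s), a single NATed client address would exceed that ~273/s budget by more than an order of magnitude, which is why the port reuse shows up at all.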
Joe


From touch at ISI.EDU Mon Aug 24 17:45:01 2009
From: touch at ISI.EDU (Joe Touch)
Date: Mon, 24 Aug 2009 17:45:01 -0700
Subject: [e2e] TCP improved closing strategies?
In-Reply-To: <4A8F23FD.1070705@gmail.com>
References: <20090812230118.AFD6E28E138@aland.bbn.com> <4A896637.4010907@isi.edu>
	<4A897D19.6030702@reed.com> <4A8AC88D.9080103@reed.com>
	<4A8ACBE8.60704@isi.edu> <4A8AD0FE.6080400@reed.com>
	<4A8ADD53.2010107@gmail.com> <4A8B9BCF.50900@isi.edu>
	<4A8D4092.8030403@gmail.com>
	<5d9330db0908201313k26ac013bi5496b37b5e1f62b3@mail.gmail.com>
	<4A8F23FD.1070705@gmail.com>
Message-ID: <4A93340D.6040408@isi.edu>

William Allen Simpson wrote:
...
> We already know that most stub resolvers don't bother with TTL or cache:
>
> 25.4% Identical Query
> 44.9% Repeated Query
>
> 2.15% Legitimate Query
>
> http://dns.measurement-factory.com/writings/wessels-pam2003-paper.pdf
>
> There were average (not peak) 644+ thousand queries per MSL banging away
> at 65 thousand ports, and those are old numbers.

Per source IP address? Or is there some evidence that you're seeing
lots of short connections from a handful of IP addresses?

> IIRC, peak was roughly 8,000 per second (920+ thousand per MSL). That's
> for one root server, *with* *existing* round robin rotation among 13
> root servers.
>
> As we switch to TCP, that's not scaling well.

What isn't scaling? E.g.:

- TCP's handshake and state establishment cost too much time/memory
- TCP has higher per-packet CPU cost
- old connection state hangs around too long (TIME-WAIT)

Some of the suggestions here will fix the third one, but few have
anything to do with the first two.

...

> Anyway, this is supposed to be the end to end group.
> Why are many folks trying the "Somebody Else's Problem" approach?

I'd like to understand the problem before we jump in and try to fix
it, at least, though... ;-)

So, basically: what is the problem, and is there some evidence that
confirms it? Is this a CPU issue, a state-of-existing-connections
issue, a state-of-old-connections issue, etc.?

Joe


From detlef.bosau at web.de Tue Aug 25 06:39:56 2009
From: detlef.bosau at web.de (Detlef Bosau)
Date: Tue, 25 Aug 2009 15:39:56 +0200
Subject: [e2e] Packet reordering in Internet
In-Reply-To:
References: <4E562B6B-3EE0-4D67-876E-DEF19A274192@mac.com>
	<66D4D6C711E011498906F791297A7A063F0C8D391E@NLCLUEXM08.connect1.local>
Message-ID: <4A93E9AC.3090306@web.de>

Per Hurtig (work) wrote:
> I know that TCP is sorting incoming packets, I should have specified
> my question better :) What I was wondering is if packets are more
> likely to be reordered in data centers than in the regular Internet,
> due to the specific architecture of data centers.
>
> For instance, I've seen papers that argue that packet reordering is
> more prevalent in high-speed environments, e.g. [1], and data centers
> would likely belong to this type of environment.
>
> [1] "Packet reordering in high-speed networks and its impact on
> high-speed TCP variants". Feng, Jie and Ouyang, Zhipeng and Xu, Lisong
> and Ramamurthy, Byrav. In Computer Communications 32(1), pp. 62-68,
> January 2009.
>

Unfortunately, I have not yet read this paper. However, I don't think
that high network throughput alone will cause packet reordering.
Nevertheless, excessive traffic shaping using numerous buffers, token
buckets, and leaky buckets may lead to packet reordering.
I still do not really know, e.g., how things like "committed
information rate" and the like are implemented in Frame Relay, so it
could be worthwhile to have a look at these things in practical
networks.

-- 
Detlef Bosau
Galileistraße 30
70565 Stuttgart
phone: +49 711 5208031
mobile: +49 172 6819937
skype: detlef.bosau
ICQ: 566129673
http://detlef.bosau at web.de


From slblake at petri-meat.com Tue Aug 25 07:58:40 2009
From: slblake at petri-meat.com (slblake@petri-meat.com)
Date: Tue, 25 Aug 2009 10:58:40 -0400
Subject: [e2e] Packet reordering in Internet
In-Reply-To: <4A93E9AC.3090306@web.de>
References: <4E562B6B-3EE0-4D67-876E-DEF19A274192@mac.com>
	<66D4D6C711E011498906F791297A7A063F0C8D391E@NLCLUEXM08.connect1.local>
	<4A93E9AC.3090306@web.de>
Message-ID: <14cd344c85c3211fb9342a5a24853693@petri-meat.com>

On Tue, 25 Aug 2009 15:39:56 +0200, Detlef Bosau wrote:
> Per Hurtig (work) wrote:
>> I know that TCP is sorting incoming packets, I should have specified
>> my question better :) What I was wondering is if packets are more
>> likely to be reordered in data centers than in the regular Internet,
>> due to the specific architecture of data centers.
>>
>> For instance, I've seen papers that argue that packet reordering is
>> more prevalent in high-speed environments, e.g. [1], and data centers
>> would likely belong to this type of environment.
>>
>> [1] "Packet reordering in high-speed networks and its impact on
>> high-speed TCP variants". Feng, Jie and Ouyang, Zhipeng and Xu, Lisong
>> and Ramamurthy, Byrav. In Computer Communications 32(1), pp. 62-68,
>> January 2009.
>>
>
> Unfortunately, I did not yet read this paper. However, I don't think
> that a high network throughput alone will cause packet reordering.
> Nevertheless, excessive traffic shaping using numerous buffers, token
> buckets and leaky buckets may lead to packet reordering.
> I still do not really know, e.g., how things like "committed
> information rate" and the like are implemented in Frame Relay, so it
> could be worthwhile to have a look at these things in practical
> networks.

Assuming stable paths, the only thing that should result in re-ordered
packets within a TCP connection is broken load balancing (inside
packet switches and between them). Properly implemented metering,
queueing, and scheduling do not result in re-ordering.

Regards,

// Steve
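The claim that properly implemented shaping does not reorder can be illustrated with a single per-flow FIFO drained through a token bucket: packets may be delayed while tokens accumulate, but they depart in arrival order. A toy discrete-time sketch (the rate and burst values are arbitrary, and real shapers work on bytes, not whole packets):

```python
# Sketch: a token-bucket shaper over one per-flow FIFO delays packets
# but cannot reorder them.  Toy discrete-time model, illustrative only.

from collections import deque

def shape(packets, rate=1.0, burst=2.0):
    """packets: list of (arrival_time, pkt_id).  Returns a list of
    (depart_time, pkt_id) in departure order, for a token bucket of
    `rate` tokens/s and depth `burst`, one token per packet."""
    tokens, last = burst, 0.0
    fifo = deque(sorted(packets))  # single FIFO, arrival order
    out = []
    now = 0.0
    while fifo:
        t, pid = fifo.popleft()
        now = max(now, t)
        # Accumulate tokens since the last departure, capped at depth.
        tokens = min(burst, tokens + (now - last) * rate)
        if tokens < 1.0:           # wait until a full token is available
            now += (1.0 - tokens) / rate
            tokens = 1.0
        tokens -= 1.0
        last = now
        out.append((now, pid))
    return out

departures = shape([(0.0, "p1"), (0.0, "p2"), (0.0, "p3"), (0.0, "p4")])
print([pid for _, pid in departures])  # ['p1', 'p2', 'p3', 'p4']
```

The burst of four back-to-back packets is spread out in time (the last two are held for tokens), yet the departure sequence matches the arrival sequence; reordering would require multiple queues plus a broken hash, i.e. the load-balancing failure described above.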