From dhc2 at dcrocker.net Thu Dec 1 07:04:04 2005 From: dhc2 at dcrocker.net (Dave Crocker) Date: Thu, 01 Dec 2005 07:04:04 -0800 Subject: [e2e] End-to-end is a design guideline, not a rigid rule Message-ID: <438F10E4.7050007@dcrocker.net> Folks, A posting on Farber's IP list finally prompted me to write some thoughts that have been wandering around in the back of my mind. I'm interested in reactions you might have: "Andrew W. Donoho" wrote: > The debate about NAT obscures the real issue - that there are legitimate reasons to assert policies for net access at organizational boundaries. Yes, we want the internet architecture to be end to end. This struck me as a particularly useful summary statement about some core architectural issues at hand: Internet technical discussions tend to lack good architectural constructs for describing operations, administration and management (OA&M) boundaries, and we lack robustness in the "end to end" construct. The issue of OA&M boundaries has long been present in the Internet. Note the distinction between routing within an Autonomous System and routing between ASs. To carry this a bit further, note that the original Internet had a single core (backbone) service, run by BBN. The creation of NSFNet finally broke this simplistic public routing model and required development of a routing protocol that supported multiple backbones. As another example, the MTA that a domain's public DNS MX record points to is generally viewed as marking this boundary and is often called a Boundary MTA. However, the Internet Mail architecture does not make this construct explicit. For a year or so, I have been searching for a term that marks independent, cohesive operational environments, but haven't found one that the community likes. Some folks have suggested a derivation of an old X.400 term: Administrative Management Domain (ADMD).
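The Boundary MTA idea can be made concrete: the boundary host is whatever a domain advertises in its public MX records, and a sending MTA chooses among the candidates by preference value. A minimal sketch of that selection (the hostnames and preference numbers below are illustrative, not from the message):

```python
# A sending MTA's choice of Boundary MTA: sort the advertised
# (preference, host) pairs and try the lowest preference value first,
# as SMTP's MX processing rules require.  Records are made up.

def order_mx(records):
    """Return MX hosts in the order a sending MTA should try them."""
    return [host for _, host in sorted(records)]

mx_records = [(20, "backup-mx.example.com"), (10, "mail.example.com")]
print(order_mx(mx_records))  # ['mail.example.com', 'backup-mx.example.com']
```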
More generally I think that this issue of boundaries between islands of cohesive policy -- defining differences in the trust within an island, versus between islands -- is a key point of enhancement to the Internet architecture work that we must focus on. I have found "Tussle in Cyberspace: Defining Tomorrow's Internet," (Clark, D., Wroclawski, J., Sollins, K., and R. Braden, ACM SIGCOMM, 2002) a particularly cogent starting point for this issue. On the question of the "end to end" construct, I believe we suffer from viewing it simplistically. What I think our community has missed is that it is a design guideline, not a rigid rule. In fact, with a layered architecture, the construct varies according to the layer. At the IP level, this is demonstrated two ways. One is the next IP hop, which might go through many nodes in a layer-2 network, and the other is the source/destination IP addresses, which might go through multiple IP nodes. The TCP/IP split is the primary example of end-to-end, but it is deceptive. TCP is end-to-end, but only at the TCP layer. The applications that use TCP represent points beyond the supposed end-to-end framework. My own education on this point came from doing EDI over Email. Of course I always viewed the email author-to-recipient path as "end to end", but along came EDI, which did additional routing at the recipient site. To the EDI world, the entire email service was merely one hop. This proved enlightening because the point has come up repeatedly: for email, user-level re-routing and forwarding are common, but outside the scope of the generally recognized architecture. I've been working on a document that tries to fully describe the current Internet Mail architecture; however, it is not clear whether it will reach rough consensus. My own view is that the email concept of end to end has two versions. One is between the posting location and the SMTP RCPT-TO (envelope) address, and the other is between the author and the (final) recipient.
Failure to deal with this explicitly in the architecture is proving problematic for email enhancements that involve transit responsibility, such as SPF or DKIM. In other words, the Internet technology has never been a pure "end to end" model. Rather, end to end is a way of distinguishing between components that compose an infrastructure versus components that use the infrastructure -- at a particular layer. "End to end" is a way of characterizing a preference to keep the infrastructure as simple as possible. This does not mean that we are prohibited from putting anything into the infrastructure or changing the boundaries of the infrastructure, merely that we prefer to keep it unchanged. In this light, NATs (and firewalls) are merely a clear demonstration of market demand for facilities that layer end to end with respect to some operational policies, to permit the addition of a trust boundary between intra-network operations and inter-network operations. We should not be surprised by this additional requirement, nor should we resist it. The primary Internet lesson is about scaling, and this appears to be a rather straightforward example of scaling among very large numbers of independent and diverse operational groups. Growth like this always comes with vast cultural diversity. That means that the basis for trust among the independent groups is more fragile. It needs much more careful definition and enforcement than was required in the kinder and gentler days of a smaller Internet.
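The two "versions" of email end-to-end described above are visible in code: the envelope recipient list handed to SMTP is independent of the author-visible header fields. A sketch using Python's standard library (the addresses are illustrative, and the actual submission call is left commented out):

```python
from email.message import EmailMessage

# Author-to-recipient "end to end" lives in the message header;
# posting-location-to-RCPT-TO "end to end" lives in the SMTP envelope.
# The two routinely differ (Bcc, forwarding, mailing lists).

msg = EmailMessage()
msg["From"] = "author@example.org"      # header: the author
msg["To"] = "recipient@example.net"     # header: the visible recipient
msg.set_content("hello")

# Envelope recipients handed to the MTA; note the extra address that
# never appears in any header:
envelope_rcpts = ["recipient@example.net", "archive@example.com"]

# smtplib.SMTP("mail.example.org").send_message(msg, to_addrs=envelope_rcpts)
print(sorted(set(envelope_rcpts) - {msg["To"]}))  # ['archive@example.com']
```

A mailing list is the everyday case: the author-level "end" stays fixed while the envelope-level "ends" fan out at each relay.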
d/ -- Dave Crocker Brandenburg InternetWorking From detlef.bosau at web.de Thu Dec 1 06:47:20 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 01 Dec 2005 15:47:20 +0100 Subject: [e2e] [SPAM] - Re: number of flows per unit time in routers -Email has different SMTP TO: and MIME TO: fields in the email addresses References: <20051027202348.55CE81FF@aland.bbn.com> <00a901c5dbf9$b2755220$a273f59b@essex.ac.uk><43629ABC.3060106@cs.columbia.edu> <43639273.4ACF5FC6@web.de>, <4363E33F.1040803@cs.columbia.edu> <43668A89.1090208@dirtcheapemail.com> <4366C129.70003@cs.columbia.edu> Message-ID: <438F0CF8.AEABDDAD@web.de> Ping Pan wrote: > > Actually, not really. GE and 10GE are getting real popular for access, > in particular, PON-based access networks. There could be tons of GE/10GE > at aggregation points, but only a single OC3/OC12 toward the core. > I once got a paper rejected, and one of the reviewer comments asked why the "bottleneck link" in the dumbbell was larger than the "access links". Excuse me, I'm aged 42, I have less hair on my head than when I was aged 24, and some guys claim they have seen one or two grey hairs on my head. So perhaps this is the reason why I cannot understand why we use OC-12 links to the _backbone_ and 10 GE networks in the access networks. It may be that some cable guys convince their customers with this, excuse me, most likely superfluous stuff, but apart from very rare cases I don't see a reason to attach a computer to the Internet using a 10 GE pipe. But attaching a network to a _backbone_ using an OC-12 link is perhaps not much better. How many lanes does a highway have? One per direction? Or two? And how many lanes does a lonely lane in Somewherevillage have? Surely 24 per direction. It may be that a shepherd needs the space for his flock to come home. Of course, I admittedly have no idea of actual bandwidths in actual backbone networks.
Rarely, some providers here in Germany boast of "backbone links" of OC-12 lines and consider this somehow attractive. However, I don't see whether this really makes sense. However, I'm willing to learn here, because either way, proper bandwidth planning starts with proper baselining and traffic analysis. > The assumption has been that video traffic can be handled locally within > aggregation networks, (R/F overlay through optical links directly, or > satellite video download etc...) > > What may happen is that large user traffic bursts jam into the narrow > OC-n links toward the core. This is where proper traffic policing > becomes useful. And proper bandwidth planning. I'm by far no expert in fibre links. I really don't see a reason why one should waste 10 GE networks to attach two computers in some office to something and then use OC-12 links, i.e. about 622 Mbit/s, to attach a network, say a provider network, to the Internet core. O.k., perhaps it gives Cisco the opportunity to talk small companies into buying new routers and interfaces and wasting lots of money on unnecessary stuff. Honestly, I sometimes did not believe my eyes when I saw the overprovisioning done in many companies. One of the topics here is congestion control. When I think of congestion control in many companies, I ask: Congestion? Control? ;-) When two PCs are interconnected by a series of 10 GE fibres and the routers in between are equipped with 100 GByte of memory each, the Microsoft "teacher" will recommend window scaling in order to exploit the resources... Congestion? ;-) I've rarely seen it _that_ extreme. However, I would not be surprised when the first "decision maker notebook" is attached to the company LAN using a 1 Terabit/s line, because otherwise a decision maker would not be able to make decisions. O.k., when two notebooks in a decision maker's office are interconnected via a satellite link, 10 GBit/s may suffice....
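The complaint can be put in numbers. Assuming, purely for illustration, 24 GE access ports funnelling into a single OC-12 uplink (the port count is chosen to echo the "24 lanes per direction" joke):

```python
# Back-of-the-envelope oversubscription of an OC-12 uplink fed by
# Gigabit Ethernet access ports.  Rates in Mbit/s; the port count is
# an assumption, not a measured deployment.

OC12_MBPS = 622.08      # SONET OC-12 line rate
GE_MBPS = 1000.0        # Gigabit Ethernet
ACCESS_PORTS = 24

oversubscription = ACCESS_PORTS * GE_MBPS / OC12_MBPS
print(round(oversubscription, 1))   # offered access capacity vs. uplink
```

At roughly 38:1, the "bottleneck" in such a dumbbell is unambiguously the uplink, which is exactly the topology the reviewer found implausible.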
Detlef > > - Ping > > Clark Gaylord wrote: > > > > The other thing you need for that to work is that core link speeds are > > faster than access speeds, but with that assumption it doesn't take long > > to get to this conclusion. > > > > --ckg -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From Jon.Crowcroft at cl.cam.ac.uk Thu Dec 1 08:13:35 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Thu, 01 Dec 2005 16:13:35 +0000 Subject: [e2e] intelligent network design Message-ID: many children grow up and are disappointed to discover that (spoiler) santa claus doesn't really exist - it was mummy and darpa^H^H^ daddy who put all those presents at the end of your bed/under the tree so here's a suggestion combining, seasonally, several ideas that have been floating lightly around the net for a while:- if you want to send a packet to my computer, or you want to send an e-mail to my mail box, first of all you must send an xmas present to be held in escrow for later when i hear that the xmas present is safe somewhere in the north pole, then you will get notification of an address you can reach me at. If I then get a message that is interesting, useful or entertaining, the present will remain in escrow. If I (where I=anyrecipient) deem the message to be dangerous, boring, or dull, then I will inform santa to deliver the present to a needy person (of santa's choosing). Presents such as anti-AIDS drugs (given the day) are welcome. jon. the implementation details are interesting, but there isn't room at the bottom of this screen for me to write them down. suffice it to say that a cryptographically generated virtual private internet space is used as a form of capability with moosewood routing, and a revocation process based on one of adrian perrig's fine re-keying protocols for multicast.
cover traffic is provided by frequent snow flurries that remove traces of moose paw prints... I fully expect this to be funded by NSF as part of the GENI work that must extend outside the US (and where better than the north pole). From dhc2 at dcrocker.net Thu Dec 1 08:45:23 2005 From: dhc2 at dcrocker.net (Dave Crocker) Date: Thu, 01 Dec 2005 08:45:23 -0800 Subject: [e2e] intelligent network design In-Reply-To: References: Message-ID: <438F28A3.7030409@dcrocker.net> > if you want to send a packet to my computer, or you want to send an e-mail > to my mail box, first of all you must send an xmas present to be held > in escrow for later kinda like charging for DNS queries, no? d/ -- Dave Crocker Brandenburg InternetWorking From braden at ISI.EDU Thu Dec 1 09:39:16 2005 From: braden at ISI.EDU (Bob Braden) Date: Thu, 1 Dec 2005 09:39:16 -0800 (PST) Subject: [e2e] intelligent network design Message-ID: <200512011739.JAA28590@gra.isi.edu> Sounds like an April 1 RFC to me... Bob Braden From faber at ISI.EDU Thu Dec 1 09:42:20 2005 From: faber at ISI.EDU (Ted Faber) Date: Thu, 1 Dec 2005 09:42:20 -0800 Subject: [e2e] intelligent network design In-Reply-To: References: Message-ID: <20051201174219.GF24647@hut.isi.edu> On Thu, Dec 01, 2005 at 04:13:35PM +0000, Jon Crowcroft wrote: > if you want to send a packet to my computer, or you want to send an e-mail > to my mail box, first of all you must send an xmas present to be held > in escrow for later > > when i hear that the xmas present is safe somewhere in the north pole, > then you will get notification of an address you can reach me at. > If I then get a message that is interesting, useful or entertaining, > the present will remain in escrow. If I (where I=anyrecipient) > deem the message to be dangerous, boring, or dull, then I will inform > santa to deliver the present to a needy person (of santa's choosing). I'm assuming that by "a needy person" you mean "the recipient."
It's kind of a morally sticky position to put a recipient in, for their entertainment to literally be blocking charity from reaching needy people. (It might encourage more people to adopt my nephew's position that "everything is boring but PlayStation." Being charitable with others' money is pretty easy.) It's an interesting idea, but incentives are easier to grok when they're direct. (A more facetious response might have included the complexities of people who don't celebrate Christmas, have a religious or moral problem with charity, etc. Fortunately, this is a completely serious message.) The first place I heard of this idea (using the simpler, direct incentives) was in Heinlein's _The Cat Who Walks Through Walls_, though there may be earlier references. In _Cat_ ringing the main character's doorbell requires a $20 deposit, for which the ringer gets a minute of time. The $20 is (obviously) refundable at the owner's discretion. It's the difference between "sender pays" and "sender pays *me*." As far as shipping AIDS drugs out, especially today, I'm in favor. -- Ted Faber http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG
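The Heinlein-style "sender pays *me*" scheme being contrasted with Jon's charity escrow can be sketched as a toy state machine; the class, names, and $20 amount below are illustrative inventions, not anyone's proposed protocol:

```python
# A toy deposit box in the spirit of the doorbell scheme: the caller
# posts a deposit, and the owner either refunds it (the interruption
# was worthwhile) or forfeits it (it was spam).  Whether the forfeit
# goes to the owner or to charity is exactly the incentive question
# debated in the thread.

class DepositBox:
    def __init__(self):
        self.held = {}        # caller -> amount currently in escrow
        self.forfeited = 0.0  # total kept (or donated) so far

    def ring(self, caller, amount=20.0):
        """Caller posts a deposit to get the owner's attention."""
        self.held[caller] = self.held.get(caller, 0.0) + amount

    def refund(self, caller):
        """Owner judged the message worthwhile: return the deposit."""
        return self.held.pop(caller, 0.0)

    def forfeit(self, caller):
        """Owner judged it spam: the deposit is lost to the sender."""
        self.forfeited += self.held.pop(caller, 0.0)

box = DepositBox()
box.ring("alice"); box.ring("spammer")
box.refund("alice")      # legitimate correspondent pays nothing net
box.forfeit("spammer")
print(box.forfeited)     # 20.0
```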
From touch at ISI.EDU Thu Dec 1 11:38:42 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 01 Dec 2005 11:38:42 -0800 Subject: [e2e] End-to-end is a design guideline, not a rigid rule In-Reply-To: <438F10E4.7050007@dcrocker.net> References: <438F10E4.7050007@dcrocker.net> Message-ID: <438F5142.5010508@isi.edu> Dave Crocker wrote: > Folks, > > A posting on Farber's IP list finally prompted me to write some thoughts > that have been wandering around in the back of my mind. I'm interested > in reactions you might have: [...] > The TCP/IP split is the primary example of end-to-end, but it is > deceptive. TCP is end-to-end but only at the TCP layer. The > applications that use TCP represent points beyond the supposed > end-to-end framework. The "ends" and "hops" in E2E are relative, at least they always have been to me. All the E2E argument says, in that context, is that you can't compose HBH services to end up with the equivalent E2E. It never said not to do HBH (e.g., for performance). It never said where the ends definitively were for all layers, IMO.
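The claim that hop-by-hop services don't compose into the end-to-end equivalent is the canonical example from the original end-to-end argument: per-link checks cannot catch corruption that happens inside a relay, between the checks. A toy illustration (the message and the "buggy relay" are invented):

```python
# Why per-hop checks don't compose into an end-to-end guarantee: each
# link verifies its own checksum on receipt, but a relay that corrupts
# data in its own buffer does so *between* those checks.  Only a check
# carried end to end -- computed by the sender, verified by the
# receiver -- notices.

import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def relay(data: bytes, buggy: bool = False) -> bytes:
    # Assume the wire transfer itself is clean (the per-hop check
    # passed); a buggy relay corrupts data in memory before re-sending.
    return data.replace(b"10", b"99") if buggy else data

msg = b"pay 10 to bob"
e2e_check = digest(msg)                     # carried end to end

delivered = relay(relay(msg), buggy=True)   # two hops, second relay buggy
print(digest(delivered) == e2e_check)       # False: only the e2e check sees it
```

This is also why the argument never forbade hop-by-hop mechanisms: they can still be worthwhile for performance, just not as a substitute for the end-to-end check.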
Joe From dpreed at reed.com Thu Dec 1 11:40:08 2005 From: dpreed at reed.com (David P. Reed) Date: Thu, 01 Dec 2005 14:40:08 -0500 Subject: [e2e] End-to-end is a design guideline, not a rigid rule In-Reply-To: <438F10E4.7050007@dcrocker.net> References: <438F10E4.7050007@dcrocker.net> Message-ID: <438F5198.1070103@reed.com> [oops, Dave C. pointed out that I replied only to him, instead of only to e2ei, and encouraged me to send it to the whole list] The end-to-end argument was indeed a design guideline, not a rigid rule, as proposed. On the other hand, as you point out, Dave, its value as a guideline is in making a system scalable and evolvable. And there's a corollary: building function into the network has costs as well as benefits. Too often we ignore those costs, because they are less visible than the benefits. However, I disagree with your example. The problem is that topology doesn't map to authority. Yes, there are organizational boundaries, and organizations have an interest in communications between peers. However, those organizational boundaries do NOT correlate closely with physical network boundaries. The premature binding of organizational boundaries to physical topological connect points is why NATs and so forth so often miss the mark on solving the true "end-to-end" problems we have. So, I agree with you on your major point, but I disagree that email is a good example of how to either apply or ignore the end-to-end argument. One merely has to examine the move to having hotel ISPs spoofing SMTP connections based on their organizational "interest" in blocking spam (and their lawyers assert that the law *requires* them to do this).
That man-in-the-middle solution actually prevents better solutions (such as crypto-authentication that prevents man-in-the-middle attacks) to the actual end-to-end requirements that users want. Dave Crocker wrote: > Folks, > > A posting on Farber's IP list finally prompted me to write some > thoughts that have been wandering around in the back of my mind. I'm > interested in reactions you might have: [...] > d/ From Jon.Crowcroft at cl.cam.ac.uk Thu Dec 1 12:00:39 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Thu, 01 Dec 2005 20:00:39 +0000 Subject: [e2e] intelligent network design In-Reply-To: Message from Ted Faber of "Thu, 01 Dec 2005 09:42:20 PST."
<20051201174219.GF24647@hut.isi.edu> Message-ID: it's different from charging (e.g. proof-of-work schemes like penny black) because 1/ the money does NOT go to the receiver 2/ the money _only_ goes if the participants fail to meet metcalfe's law (i.e. fail to increase the value of the net) so the point is that it is a _tax_ on network stupidity. the footnote in the message actually contains clues about a way you really might implement the metcalfe control cheers jon p.s. for those of you from Kansas, let me assure you: 1/ of course, the network hasn't evolved, it was the product of intelligent design and I am just proposing continuing that tradition (perhaps, if you like, i want to play god with IP) 2/ luckily I don't do routing (and this is end2end) or I might have to talk about being touched by the noodly tendrils of the flying spaghetti BGP code monster In missive <20051201174219.GF24647 at hut.isi.edu>, Ted Faber typed: >> >>On Thu, Dec 01, 2005 at 04:13:35PM +0000, Jon Crowcroft wrote: >>> if you want to send a packet to my computer, or you want to send an e-mail >>> to my mail box, first of all you must send an xmas present to be held >>> in escrow for later >>> >>> when i hear that the xmas present is safe somewhere in the north pole, >>> then you will get notification of an address you can reach me at. >>> If I then get a message that is interesting, useful or entertaining, >>> the present will remain in escrow. If I (where I=anyrecipient) >>> deem the message to be dangerous, boring, or dull, then I will inform >>> santa to deliver the present to a needy person (of santa's choosing). >> >>I'm assuming that by "a needy person" you mean "the recipient." It's >>kind of a morally sticky position to put a recipient in for their >>entertainment to literally be blocking charity from reaching needy >>people.
(It might encourage more people to adopt my nephew's position >>that "everything is boring but PlayStation." Being charitable with >>others' money is pretty easy.) It's an interesting idea, but incentives >>are easier to grok when they're direct. >> >>(A more facetious response might have included the complexities of >>people who don't celebrate Christmas, have a religious or moral problem >>with charity, etc. Fortunately, this is a completely serious message.) >> >>The first place I heard of this idea (using the simpler, direct >>incentives) was in Heinlein's _The Cat Who Walks Through Walls_, though >>there may be earlier references. In _Cat_ ringing the main character's >>doorbell requires a $20 deposit for which the ringer gets a minute of >>time. The $20 is (obviously) refundable at the owner's discretion. >>It's the difference between "sender pays" and "sender pays *me*." >> >>As far as shipping AIDS drugs out, especially today, I'm in favor. >> >>-- >>Ted Faber >>http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc >>Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG cheers jon From detlef.bosau at web.de Thu Dec 1 12:50:35 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 01 Dec 2005 21:50:35 +0100 Subject: [e2e] A Question on the TCP handoff References: Message-ID: <438F621B.973CEBE@web.de> Alper Kamil Demir wrote: > > > >Admittedly, I don't quite understand what you mean by "warm up > >connection". > In our work, there is an "actual connection" between a "fixed host" > and a mobile terminal.
"warm-up connection" is a pre-established connection > between a "synchronization agent-SA" and a "fixed-host" on behalf of a mobile > terminal. When a mobile terminal enters into a new cell, we assume that the > "warm-up connection" replaces the "actual connection" (the SA hands over > the congestion state to the mobile terminal) and becomes a new "actual connection", > so that the mobile terminal learns the congestion state of the new path. > I was questioning if this is ever possible and/or meaningful? If so, is > there any tool that can be useful for us? > tcpcp was suggested. I think it cannot be used to solve our problem. > > That's what I feared. Let's draw a network in order to see if I understand you correctly. FH ------------Internet---------------SA1 !!!!wireless network!!!!MH SA2 !!!!wireless network!!! There is some pre-established connection between SA2 and MH. Then MH enters the cell of SA2. What about the path from FH to SA1? Is it replaced by a path from FH to SA2? In that case, you would even have to expect changes of the path capacity in the wired part of your connection. In addition: what does "pre-established" mean? TCP state variables, particularly CWND, result from a dynamic settling process. If there is no traffic from SA2 to MH, there would be no channel storage capacity assigned to your flow. If you enter this cell with some CWND, you would suddenly send packets into the new path. Perhaps you do not even know whether the bottleneck between FH and MH is situated in the Internet or whether the bottleneck is the wireless network. In particular, this may change as a result of roaming. Please correct me if I got you totally wrong. In some respect, your approach reminds me of the M-TCP work by Brown and Singh, 1997. I still think that you try to keep state variables for a TCP connection although its path changes fundamentally. And I'm not convinced that this will work.
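The handoff being debated can be sketched in a few lines; the class, field names, and values below are illustrative inventions, not the actual proposal or tcpcp's interface. The sketch also makes the objection visible: the copied state describes the old path, not the new one.

```python
# A sketch of the "warm-up connection" handoff: the synchronization
# agent hands its congestion state (cwnd, ssthresh, RTT estimate) to
# the connection that replaces it.  Nothing here validates that state
# against the new path -- which is the heart of the objection.

from dataclasses import dataclass, replace

@dataclass
class TcpState:
    cwnd: int          # congestion window, in segments
    ssthresh: int      # slow-start threshold, in segments
    srtt_ms: float     # smoothed round-trip time estimate

warm_up = TcpState(cwnd=40, ssthresh=64, srtt_ms=120.0)

# Handoff: the new "actual connection" inherits the warm-up state
# wholesale, skipping slow start...
actual = replace(warm_up)

# ...but if the bottleneck moved (roaming), the inherited cwnd may
# badly overshoot the new path, and srtt_ms may be simply wrong.
print(actual.cwnd)   # 40, regardless of what the new path can carry
```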
Detlef > However, IMHO there is some basic difficulty in any kind of TCP > handover, which even holds in the existing and well known approaches. Handover itself is basically difficult :) > > Alper K. Demir -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From demir at kou.edu.tr Thu Dec 1 08:57:39 2005 From: demir at kou.edu.tr (Alper Kamil Demir) Date: Thu, 1 Dec 2005 18:57:39 +0200 Subject: [e2e] A Question on the TCP handoff Message-ID: >> Thank you very much for the reply. However, I think tcpcp will not >> solve the problem because tcpcp is useful to migrate a tcp connection >> from place to place (correct me if I am wrong). What I want to achieve >> is to replace the warm-up connection with an already >> established actual connection (so that the replaced new connection >> both has the same previous flow control of the actual connection and >> doesn't go into the slow start process, by having the congestion control state >> of the warm-up). >Admittedly, I don't quite understand what you mean by "warm up >connection". In our work, there is an "actual connection" between a "fixed host" and a mobile terminal. The "warm-up connection" is a pre-established connection between a "synchronization agent-SA" and a "fixed-host" on behalf of a mobile terminal. When a mobile terminal enters into a new cell, we assume that the "warm-up connection" replaces the "actual connection" (the SA hands over the congestion state to the mobile terminal) and becomes a new "actual connection", so that the mobile terminal learns the congestion state of the new path. I was questioning if this is ever possible and/or meaningful? If so, is there any tool that can be useful for us? tcpcp was suggested. I think it cannot be used to solve our problem. >However, IMHO there is some basic difficulty in any kind of TCP >handover, which even holds in the existing and well known approaches. Handover itself is basically difficult :) Alper K.
Demir From fred at cisco.com Thu Dec 1 16:36:16 2005 From: fred at cisco.com (Fred Baker) Date: Thu, 1 Dec 2005 16:36:16 -0800 Subject: [e2e] End-to-end is a design guideline, not a rigid rule In-Reply-To: <438F5198.1070103@reed.com> References: <438F10E4.7050007@dcrocker.net> <438F5198.1070103@reed.com> Message-ID: I'll agree with points made in each of the emails in this thread. From my perspective, "end to end" includes both the "end-to-end-across-a-single-communication" and the "end-to-end-in-a-disruption-tolerant-manner" models that Dave C mentioned. End to end in an email means that when I send a message to the various recipients of end2end-interest, I expect service in each of the hundreds of cases of that to be essentially the same - that the content of the message will not be changed en route, that the envelope of the SMTP message will be updated at each application layer hop to facilitate problem diagnosis, and that delivery will be timely within the service limits of the application or a response will be sent to me saying that it could not be accomplished. Supporting that, the MUA-MTA, MTA-MTA, and MTA-MUA hops will similarly be handled with minimal effort. One would expect the interaction of MUAs and MTAs across a network of ISPs to be indistinguishable from one in which they all happened to be colocated on a common LAN, apart from the rate and timing side-effects of the engineering of the network. One place where I depart from a common view of the end to end argument is that there are times when it makes sense to actively enquire of the network and expect the network to make a response that characterizes itself. A completely "stupid" network, such as a 3/4" diameter yellow coaxial cable, would not be able to respond, and as I understand Isenberg, that is the way all networks should behave. All intelligence should be in the end system and only in the end system. But (Dave R, tell me if I am wrong) Saltzer/Reed doesn't seem to suggest that.
The point of the original end-to-end argument was not that intelligence should reside only in the end station or only in the application; it was that a lower layer should not do something that also had to be done at a higher layer without a good justification. An example, often repeated, is that LAPB go-back-N retransmission is redundant in the presence of TCP or application retransmission, and that it measurably resulted in packet duplication around bit errors. That said, 802.11 also has retransmission, and if it didn't, behavior on wireless LANs would be a lot worse than it is. Hence, we retransmit in TCP in the general case, but 802.11 presents a case where link layer retransmission is still justified. This understanding of the end-to-end principle would seem to suggest that interactions with the network that inform the intelligent edge and enable it to make better decisions are within the principle's scope. I view both the integrated services and the differentiated services architectures in that light - one doesn't want the network to subvert the intent of the intelligent edge, but interactions that enable it to better achieve its intent are good. And then, what is subversion? It is pretty common to put in what amounts to a network honeypot, in which one of the addresses in a prefix is routed down a tunnel to a collector. In the event that anyone sends something to the address, the collector picks it up, and management remediation actions follow. Is this "subversion of routing"? I would argue that it is "routing", but is not "subversion". Ditto the case where a system comes under attack and network ops staff reroutes the address through the same kind of tunnel. That certainly subverts the attack, and makes the targeted system unavailable for a period of time until the attack can be interdicted. But for any legitimate use of the targeted system, it's hard to describe as subversion; it's part of the process of restoration. 
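Fred's retransmission example above (LAPB go-back-N duplicating what TCP already does) can be made concrete with a toy sketch. This is an illustration only, not a model of any real protocol stack; the function and parameter names are invented. The point it shows: when a lower layer retransmits a frame that was merely delayed rather than lost, the receiver sees a duplicate that the higher layer must now filter out, which is exactly the kind of redundant work the end-to-end argument warns about.

```python
# Toy illustration (invented names, not a real stack): if the link layer
# retransmits any frame not acknowledged within its timeout, a frame that
# was only delayed -- not lost -- is delivered twice to the receiver.

def deliver(frames, link_timeout, frame_delay):
    """Sequence numbers seen by the receiver when the link layer
    resends every frame whose ack takes longer than link_timeout."""
    received = []
    for seq in frames:
        received.append(seq)          # the original copy eventually arrives
        if frame_delay > link_timeout:
            received.append(seq)      # ...and so does the link-layer resend
    return received

# A 2-frame transfer over a link whose per-frame delay exceeds the
# link-layer retransmission timeout: every frame is duplicated.
print(deliver([0, 1], link_timeout=1.0, frame_delay=2.5))
# With a generous link timeout, no spurious duplicates occur.
print(deliver([0, 1], link_timeout=3.0, frame_delay=2.5))
```

The 802.11 counterexample Fred gives is the flip side: on a link where losses are frequent and the link round-trip is tiny, the duplicate-filtering cost is cheap compared to forcing every recovery up to the transport layer.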
As to NATs and such - to my way of thinking, a NAT is two things in one. It is a stateful firewall (it maps active authorized address/port pairs, creates such a mapping if it originates from "inside", and if the mapping doesn't exist it blocks communication from the "outside"), which if one thinks having skin on the human body is good for its health one has to consider a reasonable prophylactic protection. To the extent that applications and protocols above the network layer know something about network layer addresses, NATs also create difficulties in deploying such applications. In that sense, a NAT is a man-in-the-middle attack, something that makes life difficult for the application. I'm all for good firewalls; the end to end model doesn't speak highly of things that break application behavior, however. So, coming back to Dave C's point about our current network architecture not doing very well with hidden boundaries, I would say "you are correct; it doesn't". I don't think that is a failure of the end-to-end principle, however. It may be a failure of our ability to apply it correctly. If all applications were message-based, like email is, one could imagine a firewall acting something like an MTA - terminating the conversation in one domain and then repeating it in another, in a manner entirely consistent with the end-to-end principle as applied to email in its two forms of end-to-end-ness. If all applications were able to be proxied, like SIP, or the various users of SOCKS are, the proxy could literally be the trusted-and-known third party that made the transition happen. If IP were very slightly different, with the AS number in the header and listed in DNS and the routing protocols, and addresses being understood as local to the identified AS, we could assign an AS to every region behind a NAT, and the whole thing would work quite nicely end to end.
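The stateful-firewall half of a NAT described above - state is created only by traffic originating "inside", and unsolicited "outside" traffic is dropped - can be sketched roughly as follows. This is a deliberate simplification for illustration (real NATs also rewrite addresses and ports and expire idle mappings); the class and method names are invented.

```python
# Minimal sketch of the stateful-firewall behavior of a NAT:
# outbound traffic creates a mapping; inbound traffic is forwarded
# only if it matches a mapping some inside host already created.
# (A real NAT additionally rewrites addresses/ports and times out
# idle mappings; none of that is modeled here.)

class StatefulFilter:
    def __init__(self):
        # (inside_addr, inside_port, outside_addr, outside_port)
        self.mappings = set()

    def outbound(self, src, sport, dst, dport):
        # inside -> outside: create state and always pass
        self.mappings.add((src, sport, dst, dport))
        return True

    def inbound(self, src, sport, dst, dport):
        # outside -> inside: pass only if an inside host opened this flow
        return (dst, dport, src, sport) in self.mappings

fw = StatefulFilter()
fw.outbound("10.0.0.5", 40000, "192.0.2.1", 80)
print(fw.inbound("192.0.2.1", 80, "10.0.0.5", 40000))    # reply to an inside-originated flow: passes
print(fw.inbound("203.0.113.9", 443, "10.0.0.5", 40000)) # unsolicited outside traffic: blocked
```

The other half of the NAT - the address rewriting that Fred calls a man-in-the-middle attack on address-aware applications - is precisely what this sketch leaves out, which is why a pure stateful filter is far less troublesome to applications than a NAT.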
The problem is not that the architecture and available tools don't handle the concept of separation of domains; it is that our current common implementation of separation of domains involves a man-in-the-middle attack on a subset of the relevant applications and protocols. As Dave R points out, the man-in-the-middle attacks that we build in make the network harder to manage and harder to maintain, and make the applications harder to improve. Fixing the architecture, in my opinion, will involve removing the things that subvert the intent of the end system, which is to say, changing them in the direction of Saltzer/Reed's version of the end-to-end principle. > From: Dave Crocker > Date: December 1, 2005 7:04:04 AM PST > To: end2end-interest at postel.org > Subject: [e2e] End-to-end is a design guideline, not a rigid rule > Reply-To: dcrocker at bbiw.net > > Folks, > > A posting on Farber's IP list finally prompted me to write some > thoughts that have been wandering around in the back of my mind. > I'm interested in reactions you might have: > > > "Andrew W. Donoho" wrote: > > The debate about NAT obscures the real issue - that there are > legitimate reasons to assert policies for net access at > organizational boundaries. Yes, we want the internet architecture > to be end to end. > > > This struck me as a particularly useful summary statement about > some core architectural issues at hand: Internet technical > discussions tend to lack good architectural constructs for > describing operations, administration and management (OA&M) > boundaries, and we lack robustness in the "end to end" construct. > > The issue of OA&M boundaries has long been present in the Internet. > Note the distinction between routing within an Autonomous System > and routing between ASs. To carry this a bit further, note that > the original Internet had a single core (backbone) service, run by > BBN.
The creation of NSFNet finally broke this simplistic public > routing model and required development of a routing protocol that > supported multiple backbones. > > As another example, the email DNS MX record, that one finds over > the open Internet, is also generally viewed as marking this > boundary and is often called a Boundary MTA. However the Internet > Mail architecture does not have the construct explicitly. For a > year or so, I have been searching for a term that marks > independent, cohesive operational environments, but haven't found > one that the community likes. Some folks have suggested a > derivation of an old X.400 term: Administrative Management Domain > (ADMD). > > More generally I think that this issue of boundaries between > islands of cohesive policy -- defining differences in the trust > within an island, versus between islands -- is a key point of > enhancement to the Internet architecture work that we must focus > on. I have found "Tussle in Cyberspace: Defining Tomorrow's > Internet" (Clark, D., Wroclawski, J., Sollins, K., and R. Braden, > ACM SIGCOMM, 2002) a particularly cogent starting point for this > issue. > > On the question of the "end to end" construct I believe we suffer > from viewing it simplistically. What I think our community has > missed is that it is a design guideline, not a rigid rule. In fact > with a layered architecture, the construct varies according to the > layer. At the IP level, this is demonstrated two ways. One is the > next IP hop, which might go through many nodes in a layer-2 > network, and the other is the source/destination IP addresses, > which might go through multiple IP nodes. > > The TCP/IP split is the primary example of end-to-end, but it is > deceptive. TCP is end-to-end but only at the TCP layer. The > applications that use TCP represent points beyond the supposed > end-to-end framework. > > My own education on this point came from doing EDI over Email.
Of > course I always viewed the email author-to-recipient as "end to > end" but along comes EDI that did additional routing at the > recipient site. To the EDI world, the entire email service was > merely one hop. > > This proved enlightening because the point has come up repeatedly: > For email, user-level re-routing and forwarding are common, but > outside the scope of the generally recognized architecture. I've > been working on a document that is trying to fully describe the > current Internet Mail architecture: > > > > However it is not clear whether it will reach rough consensus. > > My own view is that the email concept of end to end has two > versions. One is between the posting location and the SMTP RCPT-TO > (envelope) address and the other is between the author and the > (final) recipient. Failure to deal with this explicitly in the > architecture is proving problematic to such email enhancements as > transit responsibility (such as by SPF or DKIM). > > In other words, the Internet technology has never been a pure "end > to end" model. Rather, end to end is a way of distinguishing > between components that compose an infrastructure versus components > that use the infrastructure -- at a particular layer. "End to end" > is a way of characterizing a preference to keep the infrastructure > as simple as possible. > > This does not mean that we are prohibited from putting anything > into the infrastructure or changing the boundaries of the > infrastructure, merely that we prefer to keep it unchanged. In > this light, NATs (and firewalls) are merely a clear demonstration > of market demand for some facilities that make end to end layered > with respect to some operational policies, to permit the addition > of a trust boundary between intra-network operations and > inter-network operations. > > We should not be surprised by this additional requirement nor > should we resist it.
The primary Internet lesson is about scaling, > and this appears to be a rather straightforward example of scaling > among very large numbers of independent and diverse operational > groups. Growth like this always comes with vast cultural > diversity. That means that the basis for trust among the > independent groups is more fragile. It needs much more careful > definition and enforcement than was required in the kinder and > gentler days of a smaller Internet. > > > d/ > -- > > Dave Crocker > Brandenburg InternetWorking > > From: Joe Touch > Date: December 1, 2005 11:38:42 AM PST > To: dcrocker at bbiw.net > Cc: end2end-interest at postel.org > Subject: Re: [e2e] End-to-end is a design guideline, not a rigid rule > > The "ends" and "hops" in E2E are relative, at least they always > have been to me. All the E2E argument says, in that context, is > that you can't compose HBH services to end up with the equivalent E2E. > > It never said not to do HBH (e.g., for performance). It never said > where the ends definitively were for all layers, IMO. > > Joe > From: "David P. Reed" > Date: December 1, 2005 11:40:08 AM PST > To: end2end-interest at postel.org > Subject: Re: [e2e] End-to-end is a design guideline, not a rigid rule > > [oops, Dave C. pointed out that I replied only to him, instead of > only to e2ei, and encouraged me to send it to the whole list] > > The end-to-end argument was indeed a design guideline not a rigid > rule as proposed. On the other hand, as you point out, Dave, its > value as a guideline is making a system scalable and evolvable. > And there's a corollary: building function into the network has > costs as well as benefits. Too often we ignore those costs, > because they are less visible than the benefits. > > However, I disagree with your example. The problem is that > topology doesn't map to authority. Yes, there are organizational > boundaries, and organizations have an interest in communications > between peers. 
However, those organizational boundaries do NOT > correlate closely with physical network boundaries. The premature > binding of organizational boundaries to physical topological > connect points is why NATs and so forth so often miss the mark on > solving the true "end-to-end" problems we have. > > So, I agree with you on your major point, but I disagree that email > is a good example of how to either apply or ignore the end-to-end > argument. > > One merely has to examine the move to having hotel ISPs spoofing > SMTP connections based on their organizational "interest" in > blocking spam (and their lawyers assert that the law *requires* > them to do this). That man-in-the-middle solution actually prevents > better solutions (such as crypto-authentication that prevents > man-in-the-middle attacks) to the actual end-to-end requirements that > users want. > -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://www.postel.org/pipermail/end2end-interest/attachments/20051201/9f228242/PGP-0001.bin From touch at ISI.EDU Fri Dec 2 08:27:42 2005 From: touch at ISI.EDU (Joe Touch) Date: Fri, 02 Dec 2005 08:27:42 -0800 Subject: [e2e] A Question on the TCP handoff In-Reply-To: <438DFB31.689D138C@web.de> References: <92E20BAD-53D2-4A8F-A1C1-62E4C21D7DB0@mimectl>, <018f01c5c487$099117e0$f5f2010a@seashadow> <251995AB-69FA-49F9-8E51-585028CDAC49@mimectl> <438DFB31.689D138C@web.de> Message-ID: <439075FE.6010201@isi.edu> Detlef Bosau wrote: > Alper Kamil Demir wrote: >> David, >> Thank you very much for the reply. However, I think tcpcp will not >> solve the problem because tcpcp is useful to migrate tcp connection >> from place to place (correct me if I am wrong).
What I want to achieve >> is to replace the warm-up connection with an already >> established actual connection (so that the replacing new connection >> both has the same previous flow control of the actual connection and >> doesn't go into the slow start process by having the congestion control state >> of the warm-up). > > Admittedly, I don't quite understand what you mean by "warm up > connection". That sounds similar to: "Prefetching the Means for Document Transfer: A New Approach for Reducing Web Latency", Edith Cohen and Haim Kaplan, Computer Networks, 2002. For the web case, prefetching 'warms' the router caches and possibly the TCB caches on the end host, as well as setting up the connection in advance of its use. Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 250 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20051202/66396d7a/signature.bin From faber at ISI.EDU Fri Dec 2 08:32:12 2005 From: faber at ISI.EDU (Ted Faber) Date: Fri, 2 Dec 2005 08:32:12 -0800 Subject: [e2e] intelligent network design In-Reply-To: References: <20051201174219.GF24647@hut.isi.edu> Message-ID: <20051202163212.GB97775@hut.isi.edu> On Thu, Dec 01, 2005 at 08:00:39PM +0000, Jon Crowcroft wrote: > it's different from charging (e.g. proof-of-work schemes like penny black) because > 1/ the money does NOT go to the receiver > 2/ the money _only_ goes if the participants fail to meet metcalfe's law. > (i.e. fail to increase the value of the net) > > so the point is that it is a _tax_ on network stupidity. > > the footnote in the message actually contains clues about a way you > really might implement the metcalfe control I don't think there's an objective function for "increases the value of the net," and it's tough to imagine a better metric than "recipient says it increases their value."
I'm willing to accept that metric, but I recognize that there are local/global issues. Victimizers of children probably (locally) believe that receiving messages that facilitate their predations increases the value of the net, while society as a whole (for reasonable definitions of that entity) believes that those same messages (globally) reduce the value of the net. As I say, I'm willing to start with sender validates. I'm also willing to say it all just works. Even under those conditions there are things to think about. When you say that the penalty is a tax, you are speaking precisely, and I wonder where my taxes go. (Sometimes. Often enough the answer makes me unhappy.) I think (rational) users will behave differently depending on where the tax goes. Your charity example earlier is a good case - most people are delighted to be charitable with others' money, and may overmark spam to make contributions. (There's something delightfully bizarre in marking an ACLU e-mail as spam in order to force a contribution to Amnesty International.) A different destination for the money - a government one doesn't like, a private company providing network service that is substandard - may cause users to undermark spam and do the work of filtering it themselves, rather than send the tax to an entity they dislike. (Letting the tax-collecting entity mark spam for recipients has obvious disadvantages.) Significant overmarking or undermarking potentially distorts the penalties assigned, and therefore the amounts of spam one gets - or conversely the amount of directed mailing a charitable organization can do. Of course I haven't proved that the destination of the penalty would cause such *significant* mismarking, but I think it would be an effect to look for if such a system were rolled out. -- Ted Faber http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc Unexpected attachment on this mail?
See http://www.isi.edu/~faber/FAQ.html#SIG -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20051202/16dccfc4/attachment.bin From Black_David at emc.com Sat Dec 3 10:43:31 2005 From: Black_David at emc.com (Black_David@emc.com) Date: Sat, 3 Dec 2005 13:43:31 -0500 Subject: [e2e] End-to-end is a design guideline, not a rigid rule Message-ID: Dave, > On the question of the "end to end" construct I believe we suffer from > viewing it simplistically. What I think our community has missed is that it > is a design guideline, not a rigid rule. In fact with a layered > architecture, the construct varies according to the layer. At the IP level, > this is demonstrated two ways. One is the next IP hop, which might go > through many nodes in a layer-2 network, and the other is the > source/destination IP addresses, which might go through multiple IP nodes. > > The TCP/IP split is the primary example of end-to-end, but it is deceptive. > TCP is end-to-end but only at the TCP layer. The applications that use TCP > represent points beyond the supposed end-to-end framework. > > My own education on this point came from doing EDI over Email. Of course I > always viewed the email author-to-recipient as "end to end" but along comes > EDI that did additional routing at the recipient site. To the EDI world, > the entire email service was merely one hop. > > This proved enlightening because the point has come up repeatedly: I strongly agree with this point, and want to remove it from its original organizational boundary context. IMHO, Organizational boundaries are (or at least start out as) layer 9 (Political) constructs, and Engineering techniques don't seem to be particularly effective much beyond layer 7 ;-). 
Anytime the end-to-end topic comes up in a design discussion, I always ask two questions: - Where are the ends? - What is the service being provided between them? The latter question (IMHO) tends to be both more important and harder to answer than the former. Another area where this "end-to-end is just a hop" perspective comes up is security. In the IPsec arena, both site-to-site and remote access VPNs compress an arbitrary unprotected network path into what looks like a single hop in a somewhat more protected LAN. The underlying end-to-end IPsec service has very strong security properties, but in the bigger picture, it's just a hop in a managed LAN (in some sense) service with different properties. Thanks, --David ---------------------------------------------------- David L. Black, Senior Technologist EMC Corporation, 176 South St., Hopkinton, MA 01748 +1 (508) 293-7953 FAX: +1 (508) 293-7786 black_david at emc.com Mobile: +1 (978) 394-7754 ---------------------------------------------------- From dhc2 at dcrocker.net Sat Dec 3 23:15:44 2005 From: dhc2 at dcrocker.net (Dave Crocker) Date: Sat, 03 Dec 2005 23:15:44 -0800 Subject: [e2e] queriable networks In-Reply-To: References: <438F10E4.7050007@dcrocker.net> <438F5198.1070103@reed.com> Message-ID: <439297A0.6090406@dcrocker.net> > One place where I depart from a common view of the end to end argument is > that there are times when it makes sense to actively enquire of the > network and expect the network to make a response that characterizes > itself. Fred, (I've changed the subject line, since I think your comment is both interesting and independent of the core simple-vs-complex network infrastructure issue that I raised. I don't see any reason a simple network would be prevented from answering queries.) I have two reactions to your comment. One is: I don't see that having a network be queriable (queryable? querulous?) as automatically running against end-to-end design or, as I said above, simple infrastructure. 
I'd argue that traceroute is a long-standing example of such a query, along with pathmtu, and they do not seem to offend anyone excessively. The second is: what "network" would be getting characterized? Given that we do inter-networking, how can an arbitrarily long series of independently-administered networks (between sender and receiver) -- with alternate paths and different intermediate networks available -- characterize itself? (One might almost think that end-to-end QOS over the open Internet would be difficult to provide...) So, it seems straightforward to get the "next" network or the next AS to say something about itself, but I thought we were rather a long way from getting multi-vendor, multi-administration, end-to-end homogeneity (or characterizability). d/ -- Dave Crocker Brandenburg InternetWorking From dpreed at reed.com Sun Dec 4 11:57:47 2005 From: dpreed at reed.com (David P. Reed) Date: Sun, 04 Dec 2005 14:57:47 -0500 Subject: [e2e] queriable networks In-Reply-To: <439297A0.6090406@dcrocker.net> References: <438F10E4.7050007@dcrocker.net> <438F5198.1070103@reed.com> <439297A0.6090406@dcrocker.net> Message-ID: <43934A3B.4050808@reed.com> If you believe that "the network" is something that is "provided" by a few big oligopolists, it's of course quite easy to imagine that "the network" can characterize itself. And imagination is a powerful thing. Imagine Victory in Iraq, and it just happens (as far as our CEO and his crew is concerned, since he never gets out of the bubble that travels with him). The Internet is a collective, emergent noun describing a process to some of us. Probably not to Cisco or Verizon or the ITU, though. :-) So it seems we are talking about two different things. I can ask Cisco to make IPv6 happen, and it's a simple matter, of course. Because they are in charge, right?
:-) Seriously, perhaps it would be a good thing for professional engineers who are working on the Internet to recognize that "the network" used to describe a unitary actor is a slippery concept, not to be used in critically sound discourse. Unless, of course, you work for Cisco. Dave Crocker wrote: > > >> One place where I depart from a common view of the end to end >> argument is >> that there are times when it makes sense to actively enquire of the >> network and expect the network to make a response that characterizes >> itself. > > > Fred, > > (I've changed the subject line, since I think your comment is both > interesting and independent of the core simple-vs-complex network > infrastructure issue that I raised. I don't see any reason a simple > network > would be prevented from answering queries.) > > I have two reactions to your comment. > > One is: I don't see that having a network be queriable (queryable? > querulous?) as automatically running against end-to-end design or, as > I said above, simple infrastructure. I'd argue that traceroute is a > long-standing example of such a query, along with pathmtu, and they do > not seem to offend anyone excessively. > > The second is: what "network" would be getting characterized? Given > that we > do inter-networking, how can an arbitrarily long series of > independently-administered networks (between sender and receiver) -- with > alternate paths and different intermediate networks available -- > characterize itself? (One might almost think that end-to-end QOS over > the > open Internet would be difficult to provide...) > > So, it seems straightforward to get the "next" network or the next AS > to say > something about itself, but I thought we were rather a long way from > getting > multi-vendor, multi-administration, end-to-end homogeneity (or > characterizability).
> > d/ From demir at kou.edu.tr Sun Dec 4 05:43:40 2005 From: demir at kou.edu.tr (Alper Kamil Demir) Date: Sun, 4 Dec 2005 15:43:40 +0200 Subject: [e2e] YNT: A Question on the TCP handoff Message-ID: >> >Admittedly, I don't quite understand what you mean by "warm up >> >connection". >> In our work, there is an "actual connection" between a "fixed host" >> and a mobile terminal. "warm-up connection" is a pre-established connection >> between a "synchronization agent-SA" and a "fixed-host" on behalf of a mobile >> terminal. When a mobile terminal enters a new cell, we assume that >> "warm-up connection" replaces the "actual connection" (SA hands over >> new congestion state to mobile terminal) and becomes a new "actual connection" >> so that mobile terminal learns the congestion state of the new path. >> I was questioning if this is ever possible and/or meaningful? If so, is >> there any tool that can be useful for us? >> tcpcp was suggested. I think it cannot be used to solve our problem. >That's what I feared. >Let's draw a network in order to see if I understand you correctly. >FH ------------Internet---------------SA1 !!!!wireless network!!!!MH > SA2 !!!!wireless network!!! >There is some pre-established connection between SA2 and MH 1) There is a TCP connection ("actual connection") between MH and FH 2) Before MH moves into cell of SA2, a new "warm-up connection" is established between SA2 and FH according to MH's User Mobility Pattern (UMP). (SA2 is as close to MH as possible) 3) When MH enters cell of SA2, "warm-up connection" becomes "actual connection" (warm-up connection is handed over). I am questioning if this is ever meaningful and/or possible? >Then MH enters the cell of SA2. >What about the path of FH to SA1? Is it replaced by a path FH to SA2? Yes. That's correct. >In that case, you would even have to expect changes of the path capacity >in the wired part of your connection. "warm-up connection" does that.
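The three-step scheme Alper enumerates above might be sketched as a plain state transfer, in which the warm-up connection's congestion state replaces that of the actual connection at handover time. This is only an illustration of the proposal being debated, not a working TCP implementation; all names and numeric values are invented for the example.

```python
# Sketch of the proposed warm-up handover (invented names and values;
# this only illustrates the state transfer under discussion, not a
# real TCP stack).

class ConnState:
    """The per-connection congestion state being handed over."""
    def __init__(self, cwnd, ssthresh, rtt):
        self.cwnd, self.ssthresh, self.rtt = cwnd, ssthresh, rtt

# 1) Actual connection MH <-> FH, with state settled on the old path.
actual = ConnState(cwnd=20, ssthresh=32, rtt=0.120)

# 2) Warm-up connection SA2 <-> FH, established before the move per the
#    mobility prediction (UMP); its state reflects the *new* path.
warm_up = ConnState(cwnd=12, ssthresh=16, rtt=0.080)

# 3) On cell entry, the warm-up state replaces the actual connection's
#    state, so MH would skip slow start on the new path.
actual.cwnd, actual.ssthresh, actual.rtt = (
    warm_up.cwnd, warm_up.ssthresh, warm_up.rtt)

print(actual.cwnd)  # MH now sends at the warm-up connection's estimate
```

Detlef's objection in the surrounding messages is aimed at step 2: whether an idle warm-up connection can have a meaningful CWND at all, given that CWND is the product of actual traffic settling on a path.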
>In addition: What does "pre established" mean? TCP state variables, > particularly CWND, result from a dynamic settling process. It means that there is a TCP connection between MH and FH. >If there is no traffic vom SA2 to MH, there would be no channel storage >capacity being assigend to your flow. If you enter >this cell with some CWND, you would suddenly send packets to the new >path. If what we are proposing is meaningful, a new CWND and other congestion parameters resulting from "wam-up connection" is handed over. Before and during handover, there will be some synchronization problems. >Perhaps, you do not even know whether the bottleneck between FH and MH >is situated in the Internet or whether the bottleneck >is the wireless network. Particularly, this may change as a result from >roaming. Assuming that SAs are somewhere close to MH. >In some respekt, your approach reminds me of the M-TCP work by Brown and >Singh, 1997. >I still think that you try to keep state variables for a TCP connection >although its path changes fundamentally. And I?m not convinced that this >will work. That's correct. However, User Mobility Pattern (UMP) is proposed in our work. still not convinced? I appreciate your kind answers very much. A. Detlef >However, IMHO there is some basic difficulty in any kind of TCP > >handover, which even holds in the existing and well known approaches. > Handover itsef is basicly difficult :) > > Alper K. Demir -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From detlef.bosau at web.de Mon Dec 5 10:07:52 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 05 Dec 2005 19:07:52 +0100 Subject: [e2e] YNT: A Question on the TCP handoff References: Message-ID: <439481F8.8D7CBBA6@web.de> I will snip the reference network here. It can be found in the former messages. However, we should care for some readability here. 
Alper, perhaps you could please check your linebreak settings. Alper Kamil Demir wrote: > >There is some pre-established connection between SA2 and MH O.k. What are SA1 and SA2 doing? -Routing? -Splitting? -Spoofing? In my former mails I asserted "routing". > 1) There is a TCP connection ("actual connection") between MH and FH > 2) Before MH moves into cell of SA2, a new "warm-up connection" is established between SA2 and FH according to MH's User Mobility Pattern (UMP). (SA2 is as close to MH as possible) Please define "close". > 3) When MH enters cell of SA2, "warm-up connection" becomes "actual connection" (warm-up connection is handed over). > I am questioning if this is ever meaningful and/or possible? > I think we should define the term "connection" here. For me, a TCP connection has two endpoints. So, if we talk about state variables: In that case it doesn't make sense to keep the old state variables around the cell change. However, the term "connection" wouldn't make sense here, because it's not clear what a "warm up connection" from BS to MH is. If, however, a "warm up connection" means a TCP connection from BS to MH, you would exchange components in split connections. Even in that case, the connection between FH and SA1 may be different from that between FH and SA2. To make a long story short: When a TCP path changes, its state variables can change as well. > > >If there is no traffic from SA2 to MH, there would be no channel storage > >capacity being assigned to your flow. If you enter > >this cell with some CWND, you would suddenly send packets to the new > >path. > If what we are proposing is meaningful, a new CWND and other congestion parameters > resulting from the "warm-up connection" are handed over. Before and during handover, there will be some synchronization problems. > CWND estimates the available fair share for a connection. CWND is estimated on an end to end basis. I don't quite understand how you will get a CWND from a warm up connection.
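Detlef's point that CWND is the result of a settling process on one particular path can be made concrete with a toy bottleneck model. This is an illustration only, not a claim about any real network; the function and all numbers are invented.

```python
# Toy illustration of why a carried-over CWND can misfire: the value was
# settled on the old path, but the new path's bottleneck may hold far
# fewer packets, so a full-window burst overflows it immediately.

def burst_losses(cwnd, bottleneck_capacity):
    """Packets dropped if a burst of cwnd packets hits a bottleneck
    that can absorb only bottleneck_capacity packets at once."""
    return max(0, cwnd - bottleneck_capacity)

old_path_cwnd = 40       # window settled on the old, wired-dominated path
new_path_capacity = 10   # new wireless path absorbs far fewer packets

# Injecting the inherited window at once loses most of the burst...
print(burst_losses(old_path_cwnd, new_path_capacity))
# ...whereas slow start's initial one-packet probe loses nothing.
print(burst_losses(1, new_path_capacity))
```

The converse failure also exists: if the new path could carry more than the inherited window, the carried-over CWND needlessly underutilizes it. Either way, the inherited value is an estimate for the wrong path, which is the core of the objection.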
> >Perhaps, you do not even know whether the bottleneck between FH and MH
> >is situated in the Internet or whether the bottleneck
> >is the wireless network. Particularly, this may change as a result of
> >roaming.
> Assuming that SAs are somewhere close to MH.

Once again: Please define "close".

> >In some respect, your approach reminds me of the M-TCP work by Brown and
> >Singh, 1997.
> >I still think that you try to keep state variables for a TCP connection
> >although its path changes fundamentally. And I'm not convinced that this
> >will work.
> That's correct. However, a User Mobility Pattern (UMP) is proposed in our work. Still not convinced?

No. Particularly, it would be _VERY_ hard to convince me of any kind of User Mobility Patterns. There are tons of UMPs around. Some are bad, the others are worse. Nearly all of them are proven by repeated assertion or by assistance of God or something like that.

First of all, UMP and parameter estimation appear to me to be independent problems. Second, when you want to convince me of a certain UMP, there is exactly one way to do so: You _must_ validate your model with _real_ observations, with measurements from _real_ life. I'm admittedly tired of all those endless "stochastic automatons" or "Markov chains" etc., at least since each and every textbook on Markov chains starts in the introduction with the remark that reality is anything but Markovian. And I'm tired of reading that some researcher has 1. read this, 2. understood this, 3. ignored this.

Admittedly, I did not yet read your UMP. (My blood pressure ;-)) But once again: When we talk about a UMP, I could only be convinced when the pattern is validated against reality. A pure "system model" (hopefully, one of the thousands of them will match reality, or reality will change according to our system model) is not sufficient.

Perhaps this sounds somewhat harsh. It just reflects my own situation.
For several years now I have been thinking about "interactions between L2 and L4 in mobile wireless networks." And now I'm desperately hoping for contradiction of the next sentence: "There are none."

I know that there are dozens of PhD theses which solve these problems or alleviate these interactions or shield the Internet from the mobile network. Interesting solutions. Again waiting for contradiction: "These solutions are looking for problems."

Of course, I'm expecting tons of mails now :-) And I would be glad about them. Our common problem is: Some problems cannot be identified by pure thinking. (It's, to the best of my knowledge, the very change from the attitude of Plato to the research attitude of Leonardo or Galileo Galilei. And I'm ashamed of all these people here in Stuttgart who occasionally ask for my address and then are not able to spell "Galilei" correctly when I tell them I live in the "Galileistrasse"; it's embarrassing that inhabitants of a big city like Stuttgart do not even know the name of one of the most important researchers and scientists of all times, a man who formed some of the elementary basics of our modern attitude to science.) When we propose mobility patterns, when we assert the existence of interactions, all these proposals and assertions must be validated by _observations_ and _experiments_.

But this is off topic. It's somewhat my own frustration and disappointment that I frequently read about things which do not match observations and apparently do not exist in reality.

Detlef

> I appreciate your kind answers very much.
>
> A.
>
> Detlef
> >However, IMHO there is some basic difficulty in any kind of TCP
> > >handover, which even holds in the existing and well known approaches.
> > Handover itself is basically difficult :)
> >
> > Alper K.
Demir
>
> --
> Detlef Bosau
> Galileistrasse 30
> 70565 Stuttgart
> Mail: detlef.bosau at web.de
> Web: http://www.detlef-bosau.de
> Mobile: +49 172 681 9937

--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From atiq at ou.edu Mon Dec 5 11:36:17 2005
From: atiq at ou.edu (Atiquzzaman, Mohammed)
Date: Mon, 5 Dec 2005 13:36:17 -0600
Subject: [e2e] YNT: A Question on the TCP handoff
Message-ID: <96F80D7EBB8DC34092D083C6E2EE5EF504B516A8@XMAIL1.sooner.net.ou.edu>

> Perhaps this sounds somewhat harsh.
>
> It just reflects my own situation. For several years now I have been
> thinking about "interactions between L2 and L4 in mobile wireless networks."
>
> And now I'm desperately hoping for contradiction of the next
> sentence:
>
> "There are none."

You could check our SIGMA mobility management scheme at http://www.cs.ou.edu/~netlab/sigma.html which uses L2 information to carry out an L4 handoff. Use of SCTP has also solved many of the cwnd-related questions raised earlier in the thread. We are currently working on implementing SIGMA using TCP.

Thanks
Mohammed Atiquzzaman
Professor, School of Computer Science, University of Oklahoma
200 Felgar St., Room EL-160, Norman, OK 73019-6151
Tel: (405) 325 8077   Fax: (405) 325 4044
Email: atiq at ou.edu / atiq at ieee.org   www.cs.ou.edu/~atiq

From detlef.bosau at web.de Mon Dec 5 12:23:39 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Mon, 05 Dec 2005 21:23:39 +0100
Subject: [e2e] YNT: A Question on the TCP handoff
References: <96F80D7EBB8DC34092D083C6E2EE5EF504B51652@XMAIL1.sooner.net.ou.edu>
Message-ID: <4394A1CB.53067CCB@web.de>

"Atiquzzaman, Mohammed" wrote:
>
> You could check our SIGMA mobility management scheme at
> http://www.cs.ou.edu/~netlab/sigma.html which uses L2 information
> to carry out an L4 handoff. Use of SCTP has also solved many of

Very stupid question: What is an L4 handoff?
Perhaps I suffer from a somewhat old-fashioned way of thinking. But for me, the Internet is a packet-switching internetwork. Hence, a "connection" in TCP is a purely virtual entity. The path between sender and receiver is fully transparent; the locations of sender and receiver are fully transparent to each other.

Now, when a path _changes_, this is hardly transparent. But this is a matter of fact. And no scientific problem. When a path consists only of 1 GE lines and after a path change there is a 2k4 modem line somewhere in between, perhaps the rate may drop. Perhaps the RTT may increase. Perhaps CWND and SSTHRESH will change. In short: TCP will adapt.

So: _Where_ is the scientific problem?

> the cwnd related questions raised earlier in the thread.

I cannot follow. To my knowledge, SCTP is used for media streaming. Is this correct? So, we are more or less in an environment where QoS considerations apply. The Internet is a best-effort network.

Let me state my question in another way: What is the scientific problem in a path change (MH roams) compared to a receiver situated in a CSMA/CD network in an office at about 9.00 am, when the employees start working and switch on their computers? There may be a load peak. There may even be a noise peak because of poor power supplies used with the computers which interferes with the local Ethernet. But is there anything that cannot be successfully dealt with by TCP as it is?

When your car stops on a highway due to a lack of fuel, this is no reason to go for a PhD. It's a reason to go for petrol.

Your solutions and all the others may be fine. But my question is still: Are there _hard_, _structural_ problems resulting from L2/L4 interactions in wide-area mobile networks like GPRS or UMTS? It's no question that there are perhaps some technological issues. It's no question that there are perhaps noisy lines and this can hinder communication.
(When I pull out the Ethernet plug from my computer, this will hinder communication as well.)

To give a concrete example: For quite a couple of years now, we have been talking about delay spikes and spurious timeouts. (BTW: These are no problems for TCP in recent flavours, because packet loss is typically not detected by timeouts except in some rare situations, IIRC e.g. at connection shutdown, or perhaps in extremely short-lived flows where SSTHRESH will not settle anyway and the discussion itself is quite meaningless.) We've seen "hiccup" tools to introduce arbitrary latencies into simulated TCP connections. Fine. Question: Do spurious timeouts happen in reality? Do delay spikes happen in reality? What are the _reasons_ for delay spikes? Besides, e.g., moving behind a wall of ferroconcrete, because that is once more the problem with the car and the petrol.

Perhaps the question must be put the other way round: In which scenarios is TCP _supposed_ to work? And what is the meaning of "supposed to work"? Is TCP supposed to work when the mobile user jumps into a swimming pool? Is TCP supposed to work when the mobile user runs out of battery power? Is TCP supposed to work with 10 Gbit/s throughput on a GSM line?

With respect to roaming: I myself have used a mobile phone for about 10 years now. And I never perceived any annoying effects from roaming when I made a phone call. So, even granted there would be some smoothing and some tricky mechanisms in dealing with speech, any kind of roaming will hardly interrupt the line for more than, say, 0.1 seconds. Do you know of _any_ TCP (sic! no real-time media streaming!) application which cannot deal with an interruption of 0.1 seconds? I don't.

One day, some guy told me of 42 seconds reattach time after a total cell loss in GSM. O.k. So you pulled out the Ethernet plug with your feet and now you will have to crawl under your desk in order to have it reinserted.
The first one is a design decision in GSM, the second one is a problem with your feet. Is this a problem for TCP? No. I don't see that TCP is supposed to work with broken links and unreachable peers. And I'm not fully convinced that TCP modifications will help here.

From what I understood of TCP, TCP will adapt to different path characteristics. If a path changes, due to roaming or due to a route change or due to a failover or whatever, TCP will adapt to the changed path characteristics. At the moment, I don't see any real problem.

I could even look into the literature. Bakre/Badrinath talk about some issues in WaveLAN handover (I-TCP). I'm not quite sure whether roaming between different base stations in WaveLAN was supported at that time. In addition, I-TCP adds some local recovery layer here. But which _problem_ is solved? Balakrishnan (Snoop) enlightens us with the benefits of local recovery on lossy links. Not that new of course, but: Repetitio est mater sapientiae. (Yes, of course, there is some problem discussed with duplicate acknowledgements on L4 resulting from some unfortunate RLP implementation. I don't know any RLP implementation in mobile networks which ever suffered from that problem.)

This is all fine. And it's all interesting. But did these approaches solve, or even _discuss_, structural questions? What were the _new_ lessons learned here? Perhaps my impression is totally wrong here. But much of the work I've read so far does not teach us new lessons (well, taking into account the time when the work was written) but tells us funny things which are nice to have. And concerning TCP in mobile networks, the essential lesson is what we have all known for years: TCP will work no better than the path allows. If the path is fine, TCP will work fine. If there is no path, TCP will not work.

I intentionally don't say anything about the loss differentiation debate. Either a network link is (nearly) loss free or one would introduce a local recovery layer there.
So, I don't see the problem here. It's perhaps a personal problem of mine. I was totally convinced of the existence of a plethora of problems here. But the more I think about it and the more I question problems of TCP in mobile networks, the fewer problems persist. Perhaps it's not realistic to expect some scenario which is a) realistic and widespread and in which b) the whole community will shout: "We do not yet know how to deal with this situation!"

Roaming is certainly not a scenario like that. In my honest opinion and in my experience, roaming in mobile networks simply works. Period. And as it's not broken, we don't need to fix it. When you have full coverage, you could take your mobile and walk from Stockholm to Bucharest and talk, surf, all the time; you will most likely not notice even one cell change. With actually existing techniques and approaches. O.k., let's take Paris to Moscow. It's perhaps better known outside Europe ;-) (To "Europeans living overseas": Paris and Moscow are continental locations. *SCNR*)

Detlef

--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From erwin.davis at gmail.com Mon Dec 5 14:45:48 2005
From: erwin.davis at gmail.com (Erwin Davis)
Date: Mon, 5 Dec 2005 17:45:48 -0500
Subject: [e2e] TCP fragmentation and reassembly
Message-ID:

Hello,

A packet from the application layer may be framed in the TCP layer based on the MSS (maximum segment size, not the MTU in the IP layer) negotiated between the two TCP layers of the end parties. My question is whether the TCP layer on the receiving side will reassemble the TCP fragments before it forwards the packet to the application layer. If yes, then how does the TCP layer on the receiving side know how many TCP fragments make up this one application packet? If not, will it require intelligence from the application layer for the application packet reassembly?
Thanks for your help,
erwin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.postel.org/pipermail/end2end-interest/attachments/20051205/14c7d55e/attachment.html

From detlef.bosau at web.de Mon Dec 5 15:13:55 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Tue, 06 Dec 2005 00:13:55 +0100
Subject: [e2e] TCP fragmentation and reassembly
References:
Message-ID: <4394C9B3.ED9DE988@web.de>

Erwin Davis wrote:
>
> Hello,
>
> A packet from the application layer may be framed in the TCP layer based on the MSS
> (maximum segment size, not the MTU in the IP layer) negotiated between the two TCP
> layers of the end parties. My question is whether the TCP layer on the
> receiving side will reassemble the TCP fragments before it forwards the
> packet to the application layer. If yes, then how does the TCP layer on the
> receiving side know how many TCP fragments make up this one
> application packet? If not, will it require intelligence from the
> application layer for the application packet reassembly? Thanks for
> your help,
>
> erwin

As the name may suggest, _flow_ sockets appear to the application as a _flow_. An application reads from and writes to a TCP flow exactly the same way as to any ordinary file. You may want to read some textbook on TCP, sockets and socket programming, e.g. TCP/IP Illustrated by W. Richard Stevens or Internetworking with TCP/IP by Doug Comer.
--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From touch at ISI.EDU Mon Dec 5 16:30:25 2005
From: touch at ISI.EDU (Joe Touch)
Date: Mon, 05 Dec 2005 16:30:25 -0800
Subject: [e2e] TCP fragmentation and reassembly
In-Reply-To:
References:
Message-ID: <4394DBA1.80107@isi.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Erwin,

It's useful to keep in mind that TCP is a byte-stream protocol; there are no segment boundaries preserved between the application and transport layers:

Erwin Davis wrote:
> Hello,
>
> A packet from the application layer may be framed in the TCP layer based on the MSS
> (maximum segment size, not the MTU in the IP layer) negotiated between the two TCP
> layers of the end parties.

Apps using TCP don't write in packets; they write bytes. The application can write in whatever units it wants; TCP is allowed to send packets based on that data however it sees fit. While an application can tune to the behavior of a specific TCP implementation, it cannot rely on all TCPs acting the same way.

> My question is whether the TCP layer on the receiving
> side will reassemble the TCP fragments before it forwards the packet to
> the application layer.

TCP reorders, but doesn't maintain application layer boundaries. So long as data is received in order, once it is received and ACK'd it is presented to the receive-side application layer.

> If yes, then how does the TCP layer on the receiving side
> know how many TCP fragments make up this one application
> packet? If not, will it require intelligence from the application
> layer for the application packet reassembly? Thanks for your help,

Applications cannot strictly know what TCP does with data that is sent absent monitoring the traffic directly.
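Joe's point is easy to demonstrate. In the sketch below, a connected stream socket pair stands in for a real TCP connection (the byte-stream behavior is the same): two application writes come out of the stream as one sequence of bytes, with no record of where one write ended and the next began.

```python
import socket

# A connected stream socket pair stands in for a real TCP connection.
a, b = socket.socketpair()

a.sendall(b"hello ")   # two separate application writes...
a.sendall(b"world")
a.close()

chunks = []
while True:            # ...may arrive in any number of reads
    data = b.recv(4096)
    if not data:
        break
    chunks.append(data)
b.close()

received = b"".join(chunks)
# Only the byte stream is preserved, not the write boundaries.
assert received == b"hello world"
```

Whether the receiver sees one chunk or several depends on timing and the stack; only the total byte sequence is guaranteed.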
Joe
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDlNuhE5f5cImnZrsRAgK/AKCD9IoKKeQncJDfvSfBXCQ0cvVZNQCg5c5r
vJQkSrgnXaDPX3WQsug1PSc=
=U7lI
-----END PGP SIGNATURE-----

From touch at ISI.EDU Mon Dec 5 16:40:31 2005
From: touch at ISI.EDU (Joe Touch)
Date: Mon, 05 Dec 2005 16:40:31 -0800
Subject: [e2e] queriable networks
In-Reply-To: <439297A0.6090406@dcrocker.net>
References: <438F10E4.7050007@dcrocker.net> <438F5198.1070103@reed.com> <439297A0.6090406@dcrocker.net>
Message-ID: <4394DDFF.8060909@isi.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dave Crocker wrote:
>
>> One place where I depart from a common view of the end to end argument is
>> that there are times when it makes sense to actively enquire of the
>> network and expect the network to make a response that characterizes
>> itself.

(responding more to Fred than to Dave, but with Dave's subject thread):

This presumes two things:

1) if you ask the question, you actually want the answer

Problems include:
- answers aren't necessarily meaningful by the time they're used
- answers apply to the path the question takes, which may not match the path the next question or action takes

2) if you ask the question, you can trust the answer

Problems include:
- security (spoofing, malicious behavior, etc.)
- accuracy (even in a benign system)
- relevance of the answer to your own situation, i.e., having all drivers respond to uniform traffic reports causes traffic to oscillate rather than distribute

Joe
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDlN3/E5f5cImnZrsRAp49AKC1974qeW8QdwI851dcv+fYVs01ygCg39za
ApF34U/ReCF2EYTIhwnQniM=
=8C1Y
-----END PGP SIGNATURE-----

From erwin.davis at gmail.com Mon Dec 5 17:10:22 2005
From: erwin.davis at gmail.com (Erwin Davis)
Date: Mon, 5 Dec 2005 20:10:22 -0500
Subject: [e2e] TCP fragmentation and reassembly
In-Reply-To: <4394DBA1.80107@isi.edu>
References: <4394DBA1.80107@isi.edu>
Message-ID:

Hi, All,

Thanks for your info. RFC 879 clearly explains the TCP fragmentation on the sending side but it says nothing about reassembly on the receiving side.

Joe, see an example below. Assume an application writes a packet of 10 Kbytes to a TCP layer whose negotiated MSS is 5 Kbytes. Then the TCP layer will fragment the application packet into two TCP segments of 5 Kbytes each. Assume that the first TCP packet arrives at the receiving side. Then the TCP layer on the receiving side wakes up the application listening on this TCP port. The application processes the half packet and fails. The app has no way to know whether it received a complete message, but the TCP layer on the sending side knows. To me, such TCP operation is not transparent to the application. It requires intelligence in the application to determine whether the arriving TCP packet is a complete packet from the sending application or not. Let me know if I misunderstood some points.
Thanks again,
Erwin

On 12/5/05, Joe Touch wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Erwin,
>
> It's useful to keep in mind that TCP is a byte-stream protocol; there
> are no segment boundaries preserved between the application and transport
> layers:
>
> Erwin Davis wrote:
> > Hello,
> >
> > A packet from the application layer may be framed in the TCP layer based on the MSS
> > (maximum segment size, not the MTU in the IP layer) negotiated between the two TCP
> > layers of the end parties.
>
> Apps using TCP don't write in packets; they write bytes. The application
> can write in whatever units it wants; TCP is allowed to send packets
> based on that data however it sees fit. While an application can tune to
> the behavior of a specific TCP implementation, it cannot rely on all
> TCPs acting the same way.
>
> > My question is whether the TCP layer on the receiving
> > side will reassemble the TCP fragments before it forwards the packet to
> > the application layer.
>
> TCP reorders, but doesn't maintain application layer boundaries. So long
> as data is received in order, once it is received and ACK'd it is
> presented to the receive-side application layer.
>
> > If yes, then how does the TCP layer on the receiving side
> > know how many TCP fragments make up this one application
> > packet? If not, will it require intelligence from the application
> > layer for the application packet reassembly? Thanks for your help,
>
> Applications cannot strictly know what TCP does with data that is sent
> absent monitoring the traffic directly.
>
> Joe
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.4 (MingW32)
> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
>
> iD8DBQFDlNuhE5f5cImnZrsRAgK/AKCD9IoKKeQncJDfvSfBXCQ0cvVZNQCg5c5r
> vJQkSrgnXaDPX3WQsug1PSc=
> =U7lI
> -----END PGP SIGNATURE-----
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.postel.org/pipermail/end2end-interest/attachments/20051205/4b552ff0/attachment.html

From touch at ISI.EDU Mon Dec 5 17:16:11 2005
From: touch at ISI.EDU (Joe Touch)
Date: Mon, 05 Dec 2005 17:16:11 -0800
Subject: [e2e] TCP fragmentation and reassembly
In-Reply-To:
References: <4394DBA1.80107@isi.edu>
Message-ID: <4394E65B.80702@isi.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Lloyd Wood wrote:
> On Mon, 5 Dec 2005, Joe Touch wrote:
>
>>It's useful to keep in mind that TCP is a byte-stream protocol; there
>>are no segment boundaries preserved between application and transport layer:
>
> (cough) urgent pointer (cough).

You can set the URG; you can know that at most one URG will be inside each segment; you can't know where TCP will break the data around the URG pointers. Also, when URG pointers are overwritten, only the last one will count. And the sender app doesn't know when a URG is received, so there's no way to ensure you're not overwriting a URG. As a result, you can correlate writes to segment size only in the degenerate case:
- open a connection
- write one byte
- close that connection

Joe
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDlOZbE5f5cImnZrsRAiG2AJ9uiPsPAGUSV+h6q6v/82TQC6ox8ACgjhiX
YhanQk+WaNry0Cpw8ZfA5LI=
=iesi
-----END PGP SIGNATURE-----

From touch at ISI.EDU Mon Dec 5 17:20:48 2005
From: touch at ISI.EDU (Joe Touch)
Date: Mon, 05 Dec 2005 17:20:48 -0800
Subject: [e2e] TCP fragmentation and reassembly
In-Reply-To:
References: <4394DBA1.80107@isi.edu>
Message-ID: <4394E770.7090608@isi.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Erwin Davis wrote:
> Hi, All,
>
> Thanks for your info. RFC 879
> clearly explains the
> TCP fragmentation on the sending side but it
> says nothing about reassembly on the receiving side.

1. that's IP fragmentation, not TCP
2. that document was superseded by RFCs 791 (IP) and 793 (TCP).
> Joe, see an example below. Assume an application writes a packet
> of 10 Kbytes to a TCP layer whose negotiated MSS is 5 Kbytes.

Apps don't write packets; they write byte streams to TCP.

> Then the
> TCP layer will fragment the application packet into two TCP segments
> of 5 Kbytes each. Assume that the first TCP packet arrives at the
> receiving side. Then the TCP layer on the receiving side wakes up the
> application listening on this TCP port. The application processes the
> half packet and fails.

The application has no business assuming this is a packet. It's a byte stream. If it needs packet boundaries, it should keep reading until it sees the packet boundary (which the app on the sending side must insert). This is how HTTP works.

> The app has no way to know whether it received a
> complete message, but the TCP layer on the sending side knows.

The app has to insert framing - either an 'end of packet' reserved byte or its own headers. Again, this is how HTTP works.

> To
> me, such TCP operation is not transparent to the application. It
> requires intelligence in the application to determine whether the
> arriving TCP packet is a complete packet from the sending application or
> not.
> Let me know if I misunderstood some points. Thanks again,
>
> Erwin
>
> On 12/5/05, *Joe Touch* wrote:
>
> Erwin,
>
> It's useful to keep in mind that TCP is a byte-stream protocol; there
> are no segment boundaries preserved between the application and
> transport layers:
>
> Erwin Davis wrote:
> > Hello,
>
> > A packet from the application layer may be framed in the TCP layer based on the MSS
> > (maximum segment size, not the MTU in the IP layer) negotiated between the two TCP
> > layers of the end parties.
>
> Apps using TCP don't write in packets; they write bytes. The application
> can write in whatever units it wants; TCP is allowed to send packets
> based on that data however it sees fit.
> While an application can tune to
> the behavior of a specific TCP implementation, it cannot rely on all
> TCPs acting the same way.
>
> > My question is whether the TCP layer on the receiving
> > side will reassemble the TCP fragments before it forwards the packet to
> > the application layer.
>
> TCP reorders, but doesn't maintain application layer boundaries. So long
> as data is received in order, once it is received and ACK'd it is
> presented to the receive-side application layer.
>
> > If yes, then how does the TCP layer on the receiving side
> > know how many TCP fragments make up this one application
> > packet? If not, will it require intelligence from the application
> > layer for the application packet reassembly? Thanks for your help,
>
> Applications cannot strictly know what TCP does with data that is sent
> absent monitoring the traffic directly.
>
> Joe

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDlOdwE5f5cImnZrsRAvtlAJ42tyQy7mdzMLqyPeos11SEnzGu/gCghiGz
XDpkiQLeC7G8IjJBPzxfnbE=
=1IX+
-----END PGP SIGNATURE-----

From detlef.bosau at web.de Tue Dec 6 04:59:40 2005
From: detlef.bosau at web.de (Detlef Bosau)
Date: Tue, 06 Dec 2005 13:59:40 +0100
Subject: [e2e] TCP fragmentation and reassembly
References: <4394DBA1.80107@isi.edu>
Message-ID: <43958B3C.E27765C4@web.de>

Erwin Davis wrote:
>
> Hi, All,
>
> Thanks for your info. RFC 879 clearly explains the TCP fragmentation
> on the sending side but it
> says nothing about reassembly on the receiving side.
>

Hang on here! TCP fills portions of a stream into "letters". These are received by the receiver. The "MSS" is nothing else than the slot size of a letter box. Or other restrictions given by the underlying transportation system. So TCP breaks down its flow into portions the lower layer can carry. And that's it. Of course, the receiver must resequence these portions.
Particularly as IP packets may arrive at the receiver out of order or even get lost and thus may be retransmitted after some time.

This is different from what IP does. IP delivers a packet ("letter"). Whether this letter is fragmented into a million pieces and reassembled at the receiver ("Scotty, beam me up!") or sent whole ("I prefer a shuttle") does not matter - you will not even know it.

In TCP we talk about a continuous byte flow which is sent using packets. And of course, a receiver must resequence these packets; more correctly: A receiver must reconstruct the byte stream from TCP segments which are sent as payloads of IP packets.

Please make clear to yourself _why_ fragmentation is done at the IP layer. The reason is that you may not be aware of the maximum packet size which can be conveyed on the whole path. The path may start with an Ethernet segment where 1500-byte packets can be carried, and there may be a SLIP line in between, where 536-byte packets can be carried. So, an IP packet traveling this path needs to be split up into pieces to be conveyed. From TCP's point of view, there is a transparent transportation system for packets. Period. You're given an MSS for the whole path and you don't care, would even have no idea of, what happens underneath the IP API (or behind the slot of the letterbox).

> Joe, see an example below. Assume an application writes a packet
> of 10 Kbytes to a TCP layer whose negotiated MSS is 5 Kbytes. Then the
> TCP layer will fragment the application packet into two TCP segments
>

What you talk about now is how TCP splits up a byte stream into pieces. With all the issues that come up with that. And this is not only the MSS. There is a rate control issue, there is a flow control issue, there is a congestion control issue, a loss detection issue, a retransmission issue, etc. etc. And of course this is done by the TCP endpoints.
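As the thread keeps pointing out, if an application needs message boundaries on top of this byte stream, it has to add its own framing. A minimal sketch of length-prefix framing (helper names are purely illustrative, and a stream socket pair stands in for a TCP connection):

```python
import socket
import struct

def send_msg(sock, payload: bytes) -> None:
    # 4-byte big-endian length prefix, then the payload.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n: int) -> bytes:
    # A single recv() may return fewer bytes than asked for,
    # so loop until exactly n bytes have been read.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return buf

def recv_msg(sock) -> bytes:
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

a, b = socket.socketpair()
send_msg(a, b"10 Kbyte application message")
assert recv_msg(b) == b"10 Kbyte application message"
```

This is essentially what Joe means by "its own headers": the receiver reads until it has a whole message, regardless of how TCP segmented the bytes in flight.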
However, once again, I strongly advise that you have a look at a textbook on TCP/IP here to get a first understanding. If there are difficulties, please feel free to ask. However, I would recommend doing this off list, because it might be quite boring for the list. But please start with some good introduction to TCP. Again: Tanenbaum, Computer Networks, could be a good starting point. It is by far not as detailed on TCP as the books by W. Richard Stevens or Doug Comer are. But for the first steps, I think it is quite helpful.

Detlef

--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937

From zhani_med_faten at yahoo.fr Mon Dec 5 16:22:12 2005
From: zhani_med_faten at yahoo.fr (Zhani Mohamed Faten)
Date: Tue, 6 Dec 2005 01:22:12 +0100 (CET)
Subject: [e2e] TCP fragmentation and reassembly
Message-ID: <20051206002212.23645.qmail@web25715.mail.ukl.yahoo.com>

Zhani Mohamed Faten wrote:
Date: Tue, 6 Dec 2005 01:19:08 +0100 (CET)
From: Zhani Mohamed Faten
Subject: RE: [e2e] TCP fragmentation and reassembly
To: Erwin Davis

hi

the TCP header contains all the information needed for reassembling packets; these are the important fields:
- ID (16 bits): used to identify the datagram (the same for all fragments of one original datagram)
- the flags field means: 001: there are more fragments; 000: this is the last fragment; 01X: do not fragment
- FO (15 bits): fragment offset: the position of the fragment in the original datagram; it is zero for the first fragment
Using these fields TCP can reassemble packets.

Erwin Davis wrote:

Hello,

A packet from the application layer may be framed in the TCP layer based on the MSS (maximum segment size, not the MTU in the IP layer) negotiated between the two TCP layers of the end parties. My question is whether the TCP layer on the receiving side will reassemble the TCP fragments before it forwards the packet to the application layer.
If yes, then how does the TCP layer on the receiving side know how many TCP fragments make up this one application packet? If not, will it require intelligence from the application layer for the application packet reassembly?

Thanks for your help,
erwin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.postel.org/pipermail/end2end-interest/attachments/20051206/1a1a4347/attachment.html

From demir at kou.edu.tr Tue Dec 6 07:42:19 2005
From: demir at kou.edu.tr (Alper Kamil Demir)
Date: Tue, 6 Dec 2005 17:42:19 +0200
Subject: [e2e] YNT: A Question on the TCP handoff
Message-ID:

>What are SA1 and SA2 doing?
>-Routing?
>-Splitting?
>-Spoofing?
>In my former mails I asserted "routing".

Synchronization of the "warm-up connection" and the "actual connection" is the responsibility of the SAs that run in the BSs. When an SA is notified to establish a "warm-up connection", warm-up parameters, i.e., the current congestion and flow control window sizes of the actual connection, are also passed to the SA. SAs are doing none of the above (routing, splitting, spoofing). An SA establishes a "warm-up connection" before an MH moves into the cell of the SA, so that the characteristics of the new path are ready to be handed over and to replace the old ones. The term "hand over" might be confusing here. Any other suggestion is appreciated.

>> 1) There is a TCP connection ("actual connection") between MH and FH
>> 2) Before MH moves into the cell of SA2, a new "warm-up connection" is established between SA2 and FH according to MH's User
>>Mobility Pattern (UMP). (SA2 is as close to MH as possible)
>Please define "close".

"Being near in space and time", i.e., an SA is situated on the BS as middleware.
>> 3) When MH enters the cell of SA2, the "warm-up connection" becomes the "actual connection" (the warm-up connection is handed over). >> I am questioning whether this is ever meaningful and/or possible? >I think we should define the term "connection" here. I don't understand why you asked this question. I am talking about a TCP connection. It is defined in RFC 793 as [The reliability and flow control mechanisms described above require that TCPs initialize and maintain certain status information for each data stream. The combination of this information, including sockets, sequence numbers, and window sizes, is called a connection. Each connection is uniquely specified by a pair of sockets identifying its two sides. ...... Since connections must be established between unreliable hosts and over the unreliable internet communication system, a handshake mechanism with clock-based sequence numbers is used to avoid erroneous initialization of connections.] >For me, a TCP connection has to endpoints. This is obvious as defined above. >So, if we talk about state variables: in that case it doesn't make sense >to keep the old state variables around the cell change. >However, the term "connection" wouldn't make sense here, because it's >not clear what a "warm up connection" from BS to MH is. We are not keeping the old state variables. The old state variables are replaced by new state variables obtained from the "warm-up connection". >If, however, a "warm up connection" means a TCP connection from BS to MH, >you would exchange components in split connections. >Even in that case, the connection between FH and SA1 may be different >from that between FH and SA2. If FH is the sender [defined the same as in RFC 793], the "warm-up TCP connection" is between the SA (based on the BS) and FH. Of course they are different, because the paths might change. >To make a long story short: when a TCP path changes, its state variables >can change as well. This is obvious. >CWND estimates the available fair share for a connection.
CWND is >estimated on an end-to-end basis. >I don't quite understand how you will get a CWND from a warm-up >connection. Whatever CWND means on an end-to-end basis on a packet-switched network, the "warm-up connection" is established so that the expected new path characteristics are formed on the "warm-up connection" (between SA2 and FH). When MH enters the cell of SA2, the path characteristics are handed over/passed. > >I still think that you try to keep state variables for a TCP connection > >although its path changes fundamentally. And I'm not convinced that this > >will work. > That's correct. However, a User Mobility Pattern (UMP) is proposed in our work. Still not convinced? >No. We don't keep the old state variables at all. New state variables are formed in advance according to the UMP, so that when MH moves into a new cell it has the characteristics of the new path. >Particularly, it would be _VERY_ hard to convince me of any kind of User >Mobility Patterns. I don't know what to say here, either :) >There are tons of UMPs around. Some are bad, the others are worse. >Nearly all of them are proven >by repeated assertion or by assistance of God or something like that. We are using "Erdal Cayirci, Ian F. Akyildiz, User Mobility Pattern Scheme for Location Update and Paging in Wireless Systems, IEEE Transactions on Mobile Computing", pp. 236-247, ISSN 1536-1233, 2002. >First of all, UMP and parameter estimation appear to me as independent problems. >Second, when you want to convince me of a certain UMP, there is exactly >one way to do so: >you _must_ validate your model with _real_ observations, with >measurements from _real_ life. What is _real_? ;)) >I'm admittedly tired of all those endless "stochastic automatons" or >"Markov chains" etc., at least as >each and every textbook on Markov chains starts in the introduction with >the remark that >reality is anything but Markovian. And I'm tired of reading that some >researcher has >1. read this, >2. understood this, >3. ignored this.
I totally agree with you. >Admittedly, I did not yet read your UMP. (My blood pressure ;-)) See above. >But once again: when we talk about a UMP, I can only be convinced >when the pattern is >validated against reality. A pure "system model" (hopefully, one of the >ten thousands of them >will match reality, or reality would change according to our system model) >is not sufficient. I assume this is a deep discussion. >Perhaps this sounds somewhat harsh. It doesn't to me. >It just reflects my own situation. For several years now I have thought >about "interactions between L2 and L4 >in mobile wireless networks." To me, all layers are "interacting". However, it is easily perceivable that L2 and L4 are "interacting" because they are almost identical and parallel in function, differing only in space/distance. >And now, I'm desperately hoping for contradiction of the next sentence: >"There are none." "There are none" if and unless it is proved ;))) Alper K. Demir From detlef.bosau at web.de Tue Dec 6 10:43:17 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 06 Dec 2005 19:43:17 +0100 Subject: [e2e] YNT: A Question on the TCP handoff References: Message-ID: <4395DBC5.922CAA3A@web.de> Alper Kamil Demir wrote: ...very looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong lines ;-) Alper, it would be _really_ helpful if you could care for some kind of linebreak. I'm not familiar with this M$ stuff you're apparently using. However, there should be some kind of preference or option where your MUA can be made to format lines to some maximum length. I personally set my Netscape to a line length of 72 characters. > > >What are SA1 and SA2 doing? > >-Routing? > >-Splitting? > >-Spoofing? > >In my former mails I asserted "routing". > Synchronization of the "warm-up connection" and the "actual connection" is the responsibility of the SAs that run in the BSs.
When an SA is notified to establish a "warm-up connection", the warm-up parameters, i.e., the current congestion and flow-control window sizes of the actual connection, are also passed to the SA. > The SAs are doing none of the above (routing, splitting, spoofing). An SA establishes a "warm-up connection" before an MH moves into the cell of the SA, so that the characteristics of the new path are ready to be handed over or to replace the old ones. The term "hand over" might be confusing here. Any other suggestion is appreciated. Either way, there _is_ a TCP connection from SA2 to MH before the handover occurs. Correct? So, there _is_ a sender socket on SA2? > > >> 1) There is a TCP connection ("actual connection") between MH and FH > >> 2) Before MH moves into the cell of SA2, a new "warm-up connection" is established between SA2 and FH according to MH's User >>Mobility Pattern (UMP). (SA2 is as close to MH as possible) > >Please define "close". > "Being near in space and time", i.e. an SA is situated on a BS as middleware. SA1 and SA2 are situated on two different BSs, otherwise there would be no need for a handover. Is this correct? > > >> 3) When MH enters the cell of SA2, the "warm-up connection" becomes the "actual connection" (the warm-up connection is handed over). > >> I am questioning whether this is ever meaningful and/or possible? > >I think we should define the term "connection" here. > I don't understand why you asked this question. I am talking about a TCP connection. It is defined in RFC 793 as > > [The reliability and flow control mechanisms described above require > that TCPs initialize and maintain certain status information for > each data stream. The combination of this information, including > sockets, sequence numbers, and window sizes, is called a connection. > Each connection is uniquely specified by a pair of sockets > identifying its two sides. > ......
> Since connections must be established between unreliable hosts and > over the unreliable internet communication system, a handshake > mechanism with clock-based sequence numbers is used to avoid > erroneous initialization of connections.] > > >For me, a TCP connection has to endpoints. ^^next try: two ;-) > This is obvious as defined above. Please correct me if I'm wrong, but good ol' RFC 793 does not know anything about CWND and SSTHRESH; it defines the most basic TCP algorithms. It does not take congestion control into account. Is this correct? This is no problem, because that issue was simply not known when RFC 793 was written in 1981. TCP has seen some kind of evolution since then. > > >So, if we talk about state variables: in that case it doesn't make sense > >to keep the old state variables around the cell change. > >However, the term "connection" wouldn't make sense here, because it's > >not clear what a "warm up connection" from BS to MH is. > We are not keeping the old state variables. The old state variables are replaced by new state variables obtained from the "warm-up connection". O.k. Let's talk about the congestion window CWND. The congestion window probes / estimates the storage capacity available for a TCP flow along the path. (Once again: IIRC this mechanism is _not_ discussed in RFC 793. It's not 24 years old, it's only about 16 years old. So we must look it up in some more recent literature ;-)) In the old connection, CWND describes the fair share of storage capacity along the path from FH to MH. It does not tell you where this capacity is situated. E.g. most of the packets along the path may pile up at the bottleneck, which may be situated in the Internet or in the wireless last mile; you simply _don't_ know. So, even if you would replace only the wireless part of your path and maintain the wired part, you would have absolutely no idea about the correct value of CWND after the path change.
Perhaps you know some fair share of capacity from your "warm up connection". But which part of the former CWND shall be replaced by this? And at which sending rate are you probing? This depends on where the bottleneck in your connection is situated. This is one of the big mysteries in your connection: you have no idea where the bottleneck is! > > >If, however, a "warm up connection" means a TCP connection from BS to MH, > >you would exchange components in split connections. > >Even in that case, the connection between FH and SA1 may be different > >from that between FH and SA2. > > If FH is the sender [defined the same as in RFC 793], the "warm-up TCP connection" is between the SA (based on the BS) and FH. Of course they are different, because the paths might change. O.k. And what does this "warm up connection" do? Particularly with respect to CWND: how does it "struggle for space"? In TCP, there is no rate probing, but there is a struggle for space. As a consequence, a TCP flow will send at a certain rate. And I deliberately say _struggle_. Because there is no "harmless probing" which leaves other streams alone and obtains space by miracle (Christmas is coming, jingle bells, jingle bells, jingle all the way *sing*); as soon as there are several flows competing for resources, there is a _struggle_. Even that is a problem with a warm-up connection. Will it throttle competing flows although it does not actually convey any data? Again, this is not covered in RFC 793. (I don't know the latest RFC concerning TCP, but its number is by far closer to something like 3793 than to 793.) > > >CWND estimates the available fair share for a connection. CWND is > >estimated on an end-to-end basis. > >I don't quite understand how you will get a CWND from a warm-up > >connection. > Whatever CWND means on an end-to-end basis on a packet-switched network, the "warm-up connection" is established so that the expected new path characteristics are formed ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Alper, you're kidding, aren't you?
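[To make the point about CWND concrete, here is a toy Reno-style AIMD update in Python. This is a textbook sketch, not any real TCP implementation: cwnd is a single scalar per connection that estimates the flow's fair share of the whole path and says nothing about where the bottleneck sits, which is exactly why it cannot simply be carried over when the path changes.]

```python
# Toy Reno-style congestion-window evolution (textbook sketch, not real TCP).
# cwnd is one scalar per connection: it estimates the flow's fair share of
# the whole path and carries no information about *where* along the path
# the bottleneck is situated.

def aimd(cwnd: float, loss: bool, ssthresh: float):
    """One RTT of window update; returns the new (cwnd, ssthresh)."""
    if loss:                          # multiplicative decrease on loss
        ssthresh = max(cwnd / 2.0, 2.0)
        return ssthresh, ssthresh
    if cwnd < ssthresh:               # slow start: exponential growth
        return cwnd * 2.0, ssthresh
    return cwnd + 1.0, ssthresh       # congestion avoidance: +1 MSS per RTT
```

[Starting from cwnd = 1 with ssthresh = 64, six loss-free RTTs of slow start reach the threshold, and a single loss then halves the window.]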
<...> > >There are tons of UMPs around. Some are bad, the others are worse. > >Nearly all of them are proven > >by repeated assertion or by assistance of God or something like that. > We are using "Erdal Cayirci, Ian F. Akyildiz, User Mobility Pattern Scheme for Location Update and Paging in Wireless Systems, IEEE Transactions on Mobile Computing", pp. 236-247, ISSN 1536-1233, 2002. I would appreciate a copy, if this is possible. (And as usual, we are pleased that the CiteSeer system behaves as it is supposed to. It's down. :-( ) > > >First of all, UMP and parameter estimation appear to me as independent problems. > >Second, when you want to convince me of a certain UMP, there is exactly > >one way to do so: > >you _must_ validate your model with _real_ observations, with > >measurements from _real_ life. > What is _real_? ;)) I'm not quite sure what the real Detlef is. *looking around* I have thought about this for years now. Actually, I have come to the conclusion that we should define a number of relevant scenarios. We should not _invent_ them. They must be part of the real world. E.g. it will hardly be possible to describe user mobility or the C/I ratio of a wireless channel for the whole world. (I know about those papers which claim that. The big waste basket in our house knows them as well.) But we can investigate typical situations like "Germany, city with about 500,000 inhabitants, downtown, 5.00 pm, pedestrian user". So, we talk about the downtown areas of cities like Hannover or Stuttgart (perhaps Munich? I don't know the number of inhabitants, but it's a village that fits into this category.) We don't talk about Braunschweig or Halberstadt, and we don't talk about Hamburg or Berlin. I'm quite sure that we will need several scenarios like that. Perhaps we need a scenario for Stuttgart, Hannover, Munich. And then we need a more urban scenario: Hamburg, Berlin, Paris, London. Perhaps we need different time schedules, I don't know.
And then we can check whether real users behave as the model says. Example: for nearly 6 years now I have wondered whether delay spikes and spurious timeouts in TCP are reality. They exist in papers, _without_ any kind of rationale for where they should come from or a justification for why they are introduced. There are only some very few observations (I think Chakravorty made some in London) without any deeper analysis of where the observed "symptoms" result from. Sorry. This is not convincing. Show me where a delay spike may come from, and prove to me with reproducible observations that this phenomenon really exists. Afterwards, we can talk about the reasons for this phenomenon. Anything else: => waste basket. > >Admittedly, I did not yet read your UMP. (My blood pressure ;-)) > See above. Be aware of my blood pressure ;-) At _your_ risk ;-) > > >But once again: when we talk about a UMP, I can only be convinced > >when the pattern is > >validated against reality. A pure "system model" (hopefully, one of the > >ten thousands of them > >will match reality, or reality would change according to our system model) > >is not sufficient. > I assume this is a deep discussion. Absolutely. But _science_ is a deep discussion. Or it's not science. > > >Perhaps this sounds somewhat harsh. > It doesn't to me. > > >It just reflects my own situation. For several years now I have thought > >about "interactions between L2 and L4 > >in mobile wireless networks." > To me, all layers are "interacting". However, it is easily perceivable that L2 and L4 are "interacting" because they are almost identical and parallel in function, differing only in space/distance. > It's not that easy for me to see the interactions, particularly the claimed adverse ones. But I'm totally with you that L2 and L4 have pretty much in common in some important aspects. E.g. a Radio Link Protocol in mobile networks has some similarities with transport protocols.
It is quite interesting to see the similarities _and_ the subtle differences, which often become obvious only on closer look. > >And now, I'm desperately hoping for contradiction of the next sentence: > >"There are none." > "There are none" if and unless it is proved ;))) That's exactly what I mean. Unfortunately, my mailbox is still empty. No one wants to kill me, no one contradicts me. So, either nobody has read this sentence, or everybody ignored it. Is there something like "proof by lack of contradiction"? Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From gds at best.com Tue Dec 6 14:49:51 2005 From: gds at best.com (Greg Skinner) Date: Tue, 6 Dec 2005 22:49:51 +0000 Subject: [e2e] YNT: A Question on the TCP handoff In-Reply-To: <4395DBC5.922CAA3A@web.de>; from detlef.bosau@web.de on Tue, Dec 06, 2005 at 07:43:17PM +0100 References: <4395DBC5.922CAA3A@web.de> Message-ID: <20051206224951.A3602@gds.best.vwh.net> On Tue, Dec 06, 2005 at 07:43:17PM +0100, Detlef Bosau wrote: > Please correct me if I'm wrong, but good ol' RFC 793 does not know > anything about CWND and SSTHRESH; it defines the most basic > TCP algorithms. It does not take congestion control into account. Is this > correct? This is no problem, because that issue was simply > not known when RFC 793 was written in 1981. TCP has seen some kind of > evolution since then. I'm not sure whether the issue was not known. A question for the internet-history list, perhaps. I was looking for something else when I happened upon some old tcp-ip archives from the 1980s. I found some early remarks from Van Jacobson on congestion in the 1986 archives. I vaguely recall there was some discussion of the issue on the ietf list around that time, but I don't know of any ietf archives of that time period that are available online.
http://www-mice.cs.ucl.ac.uk/multimedia/misc/tcp_ip/ --gregbo From detlef.bosau at web.de Tue Dec 6 14:56:48 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 06 Dec 2005 23:56:48 +0100 Subject: [e2e] YNT: A Question on the TCP handoff References: <4395DBC5.922CAA3A@web.de> <20051206224951.A3602@gds.best.vwh.net> Message-ID: <43961730.93A7CB88@web.de> Greg Skinner wrote: > > I'm not sure whether the issue was not known. A question for the At least, IIRC, it is not discussed. > internet-history list, perhaps. > > I was looking for something else when I happened upon some old tcp-ip > archives from the 1980s. I found some early remarks from Van Jacobson > on congestion in the 1986 archives. I vaguely recall there was some RFC 793 was published in 1981, i.e. five years earlier :-) But it would be interesting to know when the discussion started. If you are looking around in archives: during the last week I thought about why we use ACKs in TCP and not NAKs. One obvious reason for ACKs is the ACK pacing mechanism. However: has there ever been a discussion? Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From dpreed at reed.com Wed Dec 7 08:02:23 2005 From: dpreed at reed.com (David P. Reed) Date: Wed, 07 Dec 2005 11:02:23 -0500 Subject: [e2e] YNT: A Question on the TCP handoff In-Reply-To: <43961730.93A7CB88@web.de> References: <4395DBC5.922CAA3A@web.de> <20051206224951.A3602@gds.best.vwh.net> <43961730.93A7CB88@web.de> Message-ID: <4397078F.50500@reed.com> Detlef Bosau wrote: > >If you are looking around in archives: during the last week I thought >about why we use ACKs in TCP and not NAKs. One obvious reason for ACKs is >the ACK pacing mechanism. However: has there ever been a discussion? > > Huge discussions in 1975-1976, when I was involved. The issue is the basic assumption about the network environment.
Lampson and Sturgis in their work on two-phase commit protocols assumed you could not reliably tell whether a packet was delivered, so a NAK was at best an unreliable indicator of packet loss. Given that there was no channel model for the Internet as a whole (to presume a stochastic model was to presume homogeneity, and the Internet was about interconnecting maximally heterogeneous networks), one could not even reason probabilistically about what a NAK might mean, or what the lack of a NAK might mean. I didn't invent this network reliability model (Lampson and Sturgis' paper was circulated in draft form very widely during this timeframe), but it is summarized in the opening chapters of my Ph.D. thesis (Naming and Synchronization in a Decentralized Computer System, MIT-LCS-TR-205, 1978.) From zhani_med_faten at yahoo.fr Tue Dec 6 11:16:51 2005 From: zhani_med_faten at yahoo.fr (Zhani Mohamed Faten) Date: Tue, 6 Dec 2005 20:16:51 +0100 (CET) Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <20051206002212.23645.qmail@web25715.mail.ukl.yahoo.com> Message-ID: <20051206191651.57393.qmail@web25710.mail.ukl.yahoo.com> Hi I am sorry for the information I wrote below; it concerns the IP header rather than the TCP one. Zhani Mohamed Faten wrote: hi the TCP header contains all the information needed for reassembling packets; these are the important fields: - ID (16 bits): used to identify the datagram (the same for all fragments of one original datagram) - the Flags in the TCP header mean: 001: there are more fragments; 000: this is the last fragment; 01X: do not fragment - FO (13 bits): Fragment Offset: the position of the fragment in the original datagram, in 8-byte units; it is 0 for the first fragment. Using these fields TCP can reassemble packets. Erwin Davis wrote: Hello, A packet from the application layer may be framed in the TCP layer based on the MSS (maximum segment size, not the MTU of the IP layer) negotiated between the two TCP layers of the end parties.
My question is whether the TCP layer on the receiving side will reassemble the TCP fragments before it forwards the packet to the application layer. If yes, then how does the TCP layer on the receiving side know how many TCP fragments make up this one application packet? If not, will it require intelligence from the application layer for the application-packet reassembly? Thanks for your help, erwin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.postel.org/pipermail/end2end-interest/attachments/20051206/cfa758b2/attachment.html From demir at kou.edu.tr Wed Dec 7 05:54:11 2005 From: demir at kou.edu.tr (Alper Kamil Demir) Date: Wed, 7 Dec 2005 15:54:11 +0200 Subject: [e2e] YNT: A Question on the TCP handoff Message-ID: >...very >looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong >lines ;-) >Alper, it would be _really_ helpful if you could care for some kind of >linebreak. >I'm not familiar with this M$ stuff you're apparently using. However, >there should be some >kind of preference or option where your MUA can be made to format lines >to some maximum length. >I personally set my Netscape to a line length of 72 characters. Sorry for the unreadable mail. However, I am not sure whether this readability issue concerns me or you. Because I am using the Microsoft Outlook Web Access (OWA)/Internet Explorer 6.0 suite to access my email, and this suite has no "quoted-printable" features that can be configured by an end user, as far as I know.
(do not ask me why ;) In order to be helpful, I will "line-break" myself so that any MUA not supporting line wrapping can view it. However, putting in line breaks by hand also has its own problems, such as short and long lines. >Either way, there _is_ a TCP connection from SA2 to MH before the >handover occurs. Correct? >So, there _is_ a sender socket on SA2? Ok. Let's clarify what SA1 and SA2 are doing. Let's draw your reference network again:

     |------------------------ Actual TCP connection ------------------|
FH ------Internet------- !!!! wireless network !!!! SA1 !!!!! MH
                         !!!! wireless network !!!! SA2 !!!!!

Please note that the SAs are based on the BSs. 1) There is a TCP connection between MH and FH. - If MH moved into the cell of SA1 after it had established an "actual TCP connection": ignore this situation for now, because the answer is in the next steps. 2) According to the UMP, SA2 establishes a "warm-up connection":

     |------------------------ Actual TCP connection ------------------|
FH ------Internet------- !!!! wireless network !!!! SA1 !!!!! MH
                         !!!! wireless network !!!! SA2 !!!!!
     |---------- Warm-up TCP connection -----------------|

3) When MH moves into the cell area of SA2, the warm-up connection is handed over:

FH ------Internet------- !!!! wireless network !!!! SA1 !!!!!
                         !!!! wireless network !!!! SA2 !!!!! MH
     |------------------------ Actual TCP connection ------------------|

>SA1 and SA2 are situated on two different BSs, otherwise there would be >no need for a handover. Is this correct? Yes, it is correct. > >I think we should define the term "connection" here. [My reply on the definition of "connection" from RFC 793 goes here.] >Please correct me if I'm wrong, but good ol' RFC 793 does not know >anything about CWND and SSTHRESH; it defines the most basic >TCP algorithms. It does not take congestion control into account. Is this >correct? This is no problem, because that issue was simply >not known when RFC 793 was written in 1981. TCP has seen some kind of >evolution since then.
I referenced RFC 793 in order to give the definition of "connection", NOT "congestion", because you asked me to define it. I think there is a big misunderstanding. I am familiar with TCP congestion control. Yes, what you have stated above is correct, but it is not related to why I wrote it. I am not sure whether this issue was known in 1981; however, somehow my memory says it was a known issue before 1981. I somehow remember a related thread on e2e. >The congestion window probes / estimates the storage capacity >available for a TCP flow along the path. (Once again: IIRC this >mechanism is _not_ discussed in RFC 793. It's not 24 years old, >it's only about 16 years old. So we must look it up in some more recent >literature ;-)) ;))) I don't know what to say here. Once again, I referenced RFC 793 in order to define the term "connection", NOT "congestion". A start on congestion control would be Van Jacobson's SIGCOMM '88 paper, and S. Floyd's home page at ICIR has a very well organized list of related enhancements and recent literature. To me, the recent literature consists of refinements of the SIGCOMM '88 paper. >In the old connection, CWND describes the fair share of storage capacity >along the path from FH to MH. It does not tell you where this >capacity is situated. Was there a CWND in the old TCP at all? (Which "old" are we talking about; RFC 793?) >E.g. most of the packets along the path may pile >up at the bottleneck, which may be situated in the Internet >or in the wireless last mile; you simply _don't_ know. In our approach, we are ignoring this last mile from BS to MH. Does this really matter too much? >So, even if you would replace only the wireless part of your path and >maintain the wired part, you would have absolutely >no idea about the correct value of CWND after the path change. Perhaps >you know some fair share of capacity from your >"warm up connection". But which part of the former CWND shall be >replaced by this? Not the whole wireless part.
Only the last mile from MH to BS. I mentioned this above. We plan to change the whole former CWND part. Isn't this possible at all? Is this against the TCP end-to-end semantics, or simply not possible? >And at which sending rate are you probing? This depends on where the >bottleneck in your connection is situated. >This is one of the big mysteries in your connection: you have no idea >where the bottleneck is! Sending rate? TCP is window-based, not rate-based. If the bottleneck is not on the last mile from BS to MH, we get the most accurate result. >O.k. And what does this "warm up connection" do? Particularly with >respect to CWND: how does it "struggle for space"? The duration of the "warm-up connection" is short. It struggles the same way as other TCPs. >In TCP, there is no rate probing, but there is a struggle for space. As a >consequence, a TCP flow will send at a certain rate. It doesn't last long and is handed over when MH moves into the cell area. >And I deliberately say _struggle_. Because there is no "harmless probing" >which leaves other streams alone and obtains space >by miracle (Christmas is coming, jingle bells, jingle bells, jingle all >the way *sing*); as soon as there are several >flows competing for resources, there is a _struggle_. Even that is a >problem with a warm-up connection. I don't understand this part at all. It will do harm only for a short time, and the bandwidth it requires is quite negligible. >Will it throttle >competing flows although it does not actually convey any data? Yes, it will, but it will convey dummy data. >Again, this is not covered in RFC 793. (I don't know the latest RFC >concerning TCP, but its number is by far closer to something like 3793 than to 793.) The others are enhancements. The semantics is kept the same as it was.
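[The handover being debated amounts to replacing the actual connection's path-dependent state wholesale with the state the warm-up connection has built up. A minimal sketch in Python, with hypothetical names and structure (not from the authors' implementation), of what such a state swap would look like:]

```python
# Hypothetical sketch of the "warm-up handoff" idea under discussion:
# at handover, the path-dependent state (cwnd, ssthresh, RTT estimate)
# is replaced by the warm-up connection's state, while end-to-end state
# such as sequence numbers is kept.  Illustrative only; names and
# structure are invented for this sketch.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TcpState:
    snd_nxt: int      # next sequence number to send (end-to-end, kept)
    cwnd: float       # congestion window, bytes (path-dependent)
    ssthresh: float   # slow-start threshold, bytes (path-dependent)
    rtt_est: float    # smoothed RTT estimate, seconds (path-dependent)

def hand_over(actual: TcpState, warm_up: TcpState) -> TcpState:
    """Swap in the warm-up path state; keep the end-to-end byte stream state."""
    return replace(actual, cwnd=warm_up.cwnd,
                   ssthresh=warm_up.ssthresh, rtt_est=warm_up.rtt_est)
```

[Detlef's objection, in these terms, is that the warm-up connection's cwnd was probed over a different path, so there is no guarantee the swapped-in value is a meaningful estimate for the post-handover path.]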
> Whatever CWND means on an end-to-end basis on a packet-switched network, the "warm-up connection" is established so that the expected new path characteristics are formed ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >Alper, you're kidding, aren't you? No, I am not. Could you define for me what "end-to-end congestion control on a packet-switched network" is? In the second part of the sentence, I mean that the "warm-up connection" obtains the characteristics of the new path, whatever a TCP connection serves up. I don't see any kidding here. >> We are using "Erdal Cayirci, Ian F. Akyildiz, User Mobility Pattern Scheme for Location Update and Paging in Wireless >Systems, >IEEE Transactions on Mobile Computing", pp. 236-247, ISSN 1536-1233, 2002. >I would appreciate a copy, if this is possible. (And as usual, we are >pleased that the CiteSeer system behaves as it is supposed to. It's >down. :-( ) I will send it to you. >I'm not quite sure what the real Detlef is. *looking around* >I have thought about this for years now. Me too. >Actually, I have come to the conclusion that we should define a number of >relevant scenarios. We should not _invent_ them. They must be >part of the real world. E.g. it will hardly be possible to describe >user mobility or the C/I ratio of a wireless channel >for the whole world. (I know about those papers which claim that. The big >waste basket in our house knows them as well.) I agree with you. However, I am still not sure about the "waste basket" part. Maybe experts would brighten and clarify this polemic. I don't have any experience with this issue. >> >Admittedly, I did not yet read your UMP. (My blood pressure ;-)) >> See above. >Be aware of my blood pressure ;-) At _your_ risk ;-) I will send it separately. >It's not that easy for me to see the interactions, particularly the claimed >adverse ones. But I'm totally with you that L2 and L4 have pretty much >in common in some important aspects. E.g.
a Radio Link Protocol in >mobile networks has some similarities with transport protocols. >It is quite interesting to see the similarities _and_ the subtle >differences, which often become obvious only on closer look. Both are responsible for putting data across from one point to another. L4 operates over a virtual link across networks, and L2 over a logical link on a physical medium. To me, it is very obvious even before a closer look. >Is there something like "proof by lack of contradiction"? I guess not, because the whole induction part is missing :)) Alper K. Demir From david.borman at windriver.com Wed Dec 7 10:45:48 2005 From: david.borman at windriver.com (David Borman) Date: Wed, 7 Dec 2005 12:45:48 -0600 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: References: Message-ID: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> TCP does not do fragmentation and reassembly. IP does fragmentation and reassembly. On the sending side, TCP takes a byte stream from the application and sends it as a series of complete TCP/IP packets. On the receive side, TCP will do resequencing of packets if they arrive out of order, so that they are presented to the receiving application in the same order that they were sent. The TCP MSS value specifies the largest size of packet that the receiver can reassemble. I doubt that there are many, if any, TCP implementations that can't handle receiving a full 64K TCP/IP packet (as a series of IP fragments). But since most TCP implementations try very hard not to send packets that will be fragmented at the IP layer, as long as the MSS is larger than the underlying MTU it doesn't really matter. In theory, when using Path MTU Discovery, there is no reason not to always use the maximum MSS value of 64K-1.
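[The receive-side resequencing described above, as distinct from IP reassembly, can be sketched as follows. This is a Python illustration with invented names, not any real stack's code: out-of-order segments are held by sequence number and delivered to the application as a contiguous byte stream as soon as the holes are filled.]

```python
# Sketch of TCP-style receive-side resequencing (illustrative only):
# out-of-order segments are buffered by sequence number and delivered
# in order.  Contrast with IP reassembly, which rebuilds one fragmented
# datagram; TCP instead reorders segments of a continuous byte stream.

class Resequencer:
    def __init__(self, isn: int = 0):
        self.rcv_nxt = isn          # next in-order sequence number expected
        self.ooo = {}               # out-of-order segments: seq -> data

    def segment_arrives(self, seq: int, data: bytes) -> bytes:
        """Buffer the segment; return whatever is now deliverable in order."""
        if seq > self.rcv_nxt:
            self.ooo[seq] = data    # hole before this segment: hold it back
            return b""
        delivered = data[self.rcv_nxt - seq:]   # trim any overlap with old data
        self.rcv_nxt += len(delivered)
        while self.rcv_nxt in self.ooo:         # drain now-contiguous segments
            nxt = self.ooo.pop(self.rcv_nxt)
            delivered += nxt
            self.rcv_nxt += len(nxt)
        return delivered
```

[For example, a segment at sequence 3 arriving before the segment at sequence 0 is held back, and both are delivered together once the hole at 0 is filled.]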
-David Borman On Dec 5, 2005, at 4:45 PM, Erwin Davis wrote: > Hello, > > A packet from the application layer may be framed in the TCP layer based on > the MSS (maximum segment size, not the MTU in the IP layer) negotiated between > the two TCP layers of the end parties. My question is whether the TCP layer > on the receiving side will reassemble the TCP fragments before it > forwards the packet to the application layer. If yes, then how does the > TCP layer on the receiving side know how many TCP fragments make > up this one application packet? If not, will it require > intelligence from the application layer for the application packet > reassembly? Thanks for your help, > > erwin From touch at ISI.EDU Wed Dec 7 12:22:19 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed, 07 Dec 2005 12:22:19 -0800 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> References: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> Message-ID: <4397447B.9040405@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 David Borman wrote: > TCP does not do fragmentation and reassembly. IP does fragmentation > and reassembly. On the sending side, TCP takes a byte stream from the > application, and sends it as a series of complete TCP/IP packets. On > the receive side, TCP will do resequencing of packets if they arrive > out of order, so that they are presented to the receiving application > in the same order that they were sent. > > The TCP MSS value specifies the largest size of packet that the > receiver can reassemble. I doubt that there are many, if any, TCP > implementations that can't handle receiving a full 64K TCP/IP packet > (as a series of IP fragments). See: http://www.psc.edu/networking/projects/tcptune/ The table (grep for "FreeBSD") shows that current OSs start with a default socket size smaller than 64KB (which limits the size of the receive window, right?).
> But since most TCP implementations try > very hard not to send packets that will be fragmented at the IP layer, > as long as the MSS is larger than the underlying MTU it doesn't really > matter. In theory, when using Path MTU discovery, there is no reason > not to always use the maximum MSS value of 64K-1. Most systems don't try anything larger than the outgoing interface MSS, though, which is often much smaller (1500 bytes (Ethernet), 4400 bytes (POS), or 9KB (ATM)). Joe > -David Borman > > On Dec 5, 2005, at 4:45 PM, Erwin Davis wrote: > >> Hello, >> >> A packet from the application layer may be framed in the TCP layer based on the MSS >> (maximum segment size, not the MTU in the IP layer) negotiated between the two >> TCP layers of the end parties. My question is whether the TCP layer on the >> receiving side will reassemble the TCP fragments before it forwards >> the packet to the application layer. If yes, then how does the TCP layer >> on the receiving side know how many TCP fragments make up this >> one application packet? If not, will it require intelligence from >> the application layer for the application packet reassembly? Thanks >> for your help, >> >> erwin -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDl0R6E5f5cImnZrsRAu6FAJ4x8HY7dOpxg8QB4dSmn+jpHbGZ5QCeJYc0 XMaVxojKnPONuZiL+fKi7+w= =L0Pr -----END PGP SIGNATURE----- From detlef.bosau at web.de Wed Dec 7 12:24:08 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 07 Dec 2005 21:24:08 +0100 Subject: [e2e] YNT: A Question on the TCP handoff References: Message-ID: <439744E8.E6F4EA9A@web.de> Alper Kamil Demir wrote: > Sorry for the unreadable mail. However, I am not sure whether this readability issue concerns me or you. > Because I am using the Microsoft Outlook Web Access (OWA)/Internet Explorer 6.0 suite to access > my email, and this suite has no "quoted-printable" features that can be configured by an end user, as far > as I know.
(do not ask me why ;) It's perhaps for a similar reason why it's difficult to forbid some mail readers to execute attachments with the "Sober" virus. > In order to be helpful, I will "line-break" myself so that any MUA not supporting line > wrapping could view it. However, putting in line breaks myself also has its own problems, such as short > and long lines. One problem with automatic line wrapping at the reader side is "ASCII art". E.g. our network drawings :-) I have difficulties with these issues myself. > > >Either way, there _is_ a TCP connection from SA2 to MH before the > >handover occurs. Correct? > >So, there _is_ a sender socket on SA2? > Ok. Let's clarify what SA1 and SA2 are doing. Let's draw your reference network again: > > |------------------------Actual TCP connection ------------------| > FH ------Internet-------!!!! wireless network !!!! SA1 !!!!! MH > !!!!wireless network !!!! SA2 !!!!! > Please note that SAs are based on BSs. > > 1) There is a TCP connection between MH and FH. > - if MH moved into the cell of SA1 after it had established an "actual TCP connection", ignore this > situation for now because the answer is in the next steps. > 2) According to UMP, SA2 establishes a "warm-up connection" > |------------------------Actual TCP connection ------------------| > FH ------Internet-------!!!! wireless network !!!! SA1 !!!!! MH > !!!!wireless network !!!! SA2 !!!!! > |---------- Warm-up TCP Connection -----------------| > The one endpoint of the warm-up connection is FH, correct? And the other is SA2. It cannot be MH, because MH is not yet in the cell of SA2. > 3) When MH moves into the cell area of SA2, the warm-up connection is handed over. > > FH ------Internet-------!!!! wireless network !!!! SA1 !!!!! > !!!!wireless network !!!! SA2 !!!!! MH > |------------------------Actual TCP connection ------------------| > Hm. I will try to understand..... (maybe I suffer from very early Alzheimer symptoms....
%-)) What I do not understand is how the handover takes place and how MH is made the endpoint of a connection which was terminated on SA2? > I referenced RFC793 in order to give the definition of "connection"; NOT "congestion" I did not mix up these two ;-) > cause you asked me to define it. I think there is a big misunderstanding. I am familiar with > TCP congestion control. Yes, what you have stated above is correct, but not related to > what I wrote it for. And exactly that's what I do not understand. > I am not sure if this issue was known in 1981, however somehow my memory recalls that > it was a known issue before 1981. I somehow remember a related thread on e2e. But you surely did not read the list before 1982 ;-) *SCNR* > > >The congestion window probes / estimates the available storage capacity > >available for a TCP flow along the path. (Once again: IIRC this > >mechanism is _not_ discussed in RFC 793. It's not 24 years old, > >it's only about 16 years old. So we must look it up in some more recent > >literature ;-)) > ;))) I don't know what to say here. Once again, I referenced RFC793 in order to define the term > "connection"; NOT "congestion". > A start on congestion control would be V. Jacobson's SIGCOMM'88 paper, and S. Floyd's Really? :-) It's a little bit cumbersome to find this using Google. It's too much to type (JC would certainly write: too painful for his hands :-)) Simply type "congavoid" and follow the first hit. > home page at ICIR has a very well organized collection of related enhancements and recent literature. > To me, the recent literature is refinements of the SIGCOMM'88 paper. This is my opinion as well. However, I would be careful there, as I'm not too familiar with rate based and equation based congestion control. To my knowledge Sally has authored and co-authored quite a few papers on this matter. In addition, there is quite some work from the more control-theoretical community (Mascolo, Massoulie, Vinnicombe, just to name a few.)
But I believe I remember (however, I don't know where I read it) that Sally has called the congestion principle from the congavoid paper the most important basis for congestion control in one of her papers. > > >In the old connection, CWND describes the fair share of storage capacity > >along the path from FH to MH. It does not tell you where this > >capacity is situated. > Was there CWND in the old TCP at all? (which old TCP are we talking about; RFC793?) CWND is not in RFC 793. However, it should be part of any actual TCP connection. A TCP implementation which does not do congestion control should be considered broken. So, back to my point: It not only wouldn't make sense, it's substantially wrong to consider TCP without the state variables CWND and SSTHRESH. > > >E.g. most of the packets along the path may pile > >up at the bottleneck which may be situated in the Internet > >or in the wireless last mile, you simply _don't_ know. > In our approach, we are ignoring this last mile from BS to MH. Does this really matter too much? Oh yeah! It does matter! Particularly in mobile networks. We're not talking about a "base station" for WLAN. At least, I don't. Because in that case, shut your eyes, slip into silent slumber, sail on a silver mist.... and talk to yourself: "It's Ethernet". Believe me, if it's properly installed, you would not notice the difference. However, in mobile networks like GPRS, with latencies up to several hundred seconds and asserted latency bandwidth products of megabyte magnitude (I think Michael Meyer, Ericsson, has written something about that), the last mile _does_ matter. Please note: The last two lines were a pure blackbox understanding of a mobile network. It would be too detailed to discuss these details here, but in fact I think some of these issues need a very careful discussion, perhaps even some redesign in some details.
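For concreteness, the CWND/SSTHRESH state machine referenced here (Jacobson's congavoid scheme: slow start, congestion avoidance, multiplicative decrease) can be sketched roughly as follows. This is a simplified illustration of the classic algorithm, not any particular stack's implementation; the constant and function names are ours:

```python
MSS = 1460  # assumed segment size in bytes (typical for Ethernet paths)

def on_ack(cwnd, ssthresh):
    """Grow the congestion window on each new ACK."""
    if cwnd < ssthresh:
        # slow start: one MSS per ACK, i.e. exponential growth per RTT
        return cwnd + MSS
    # congestion avoidance: roughly one MSS per RTT in total
    return cwnd + MSS * MSS // cwnd

def on_timeout(cwnd):
    """On loss detected by timeout: halve the threshold, restart slow start."""
    ssthresh = max(cwnd // 2, 2 * MSS)
    return MSS, ssthresh  # (new cwnd, new ssthresh)
```

The point of the surrounding discussion is exactly that after a handoff neither `cwnd` nor `ssthresh` can simply be carried over, because they encode the old path's capacity.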
(At the moment, I try to think about it a little bit in public; have a look at http://www.detlef-bosau.de/layers.html if you are interested. It's early work in progress. But I'm in need of comments.) > > >So, even if you would replace only the wireless part of your path and > >would maintain the wired part, you would have absolutely > >no idea about the correct value for CWND after the path change. Perhaps > >you know some fair share of capacity from your > >"warm up connection". But which part of the former CWND shall be > >replaced by this? > Not the whole wireless part. Only the last mile from MH to BS. I mentioned this above. Excuse me? Where is the difference here? > We plan to change the whole former CWND part. Isn't this possible at all? Is this against > the TCP end-to-end semantics? Or not possible at all? Forget about the TCP end-to-end semantics for the moment. First of all, they could be maintained even in the presence of PEPs etc. Second, we must not simplify the wireless part too much. The wireless part not only consists of some electromagnetic waves between two antennas. At least, there is typically a Radio Link Protocol between BS and MH. > > >And with which sending rate are you probing? This depends on where the > >bottleneck in your connection is situated. > >This is one of the big mysteries in your connection: You have no idea > >where the bottleneck is! > Sending rate? TCP is window based; not rate based. If the bottleneck is not on the last mile Yes. So, there is a sending rate. It's a consequence of proper ACK pacing :-) I'm a "window guy". So, when I talk about "rate control", I always mean ACK pacing :-) > from BS to MH, we get the most accurate result. > > >O.k. And what does this "warm up connection" do? Particularly with > >respect to CWND: How does it "struggle for space"? > The duration of the "warm-up connection" is low. It struggles the same way as other TCPs. O.k. With which rate?
For the relationship of rate, bottleneck and windows please refer to the congavoid paper, page 3, Figure 1: Window Flow Control "Self Clocking". I think, if you're dealing with TCP, you will have this figure in mind anyway. (I'm always amused when researchers claim numerous inventors of packet pair and packet train techniques. It's invented _here_ in this very figure. Any later inventions are only the moments when people begin to understand the congavoid paper :-)) > > >In TCP, there is no rate probing but there is a struggle for space. As a > >consequence, a TCP flow will send with a certain rate. > It doesn't last long and is handed over when MH moves into the cell area. Hm. When we face an LBP of 20 Mbytes (which I read in a paper by Michael Meyer) on the last mile and start from a CWND of 32 kBytes in a warm-up connection, it might take some time to adapt... ;-) > > >And I intentionally say _struggle_. Because there is no "harmless probing" > >which leaves other streams alone and obtains space > >by miracle (Christmas is coming, jingle bells, jingle bells, jingle all > >the way *sing*) but as soon as there are several > >flows competing for resources, there is a _struggle_. Even that is a > >problem with a warm-up connection. > I don't understand this part at all. It will do harm for a short time and the bandwidth it requires > is quite negligible. It will not really "harm", as it is intended to share the available capacity between the flows. But we must not ignore it. Particularly as it requires the same bandwidth as the real flow. > > >Will it throttle > >competing flows although it does not actually convey any data? > Yes, it will, but it will convey dummy data. Again: with which rate? The _rate_ comes from the self-clocking and immediately follows from the bottleneck rate, as can be seen in the aforementioned figure. (It's one of the most important figures I've seen in networking. However, it's quoted in numerous textbooks and _not_ carefully explained.)
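The self-clocking relationship invoked here amounts to a one-line formula: in equilibrium, ACK pacing releases one window of data per round trip, so the sending rate settles at window/RTT (which the bottleneck in turn pins down). A trivial sketch of that arithmetic, with names of our own choosing:

```python
def self_clocked_rate(window_bytes, rtt_seconds):
    # At equilibrium, ACK self-clocking clocks out one full window per
    # round trip, so the sending rate converges to window / RTT.
    return window_bytes / rtt_seconds
```

E.g. a 64 KB window over a 500 ms round trip yields only about 128 KB/s, which is why a large latency-bandwidth product on the last mile takes so long to fill from a small starting CWND.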
> > >Again, this is not covered in RFC 793. (I don't know the latest RFC > >concerning TCP, but it is by far closer to > >a number like 3793 than 793.) > Others are enhancements. Semantics is kept the same as it was. Oh yeah ;-) Frankly, I do not completely agree here :-) > > > Whatever CWND means on an end-to-end basis on a packet-switched network, "warm-up connection" is established so that expected new path characteristics are formed > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > >Alper, you're kidding, aren't you? > No, I am not. Could you define for me what "an end-to-end congestion control on a packet-switched network" is? No. Frankly. If Van couldn't, neither could I.... . > >> We are using "Erdal Cayirci, Ian F. Akyildiz, User Mobility Pattern Scheme for Location Update and Paging in Wireless > >Systems, >IEEE Transactions on Mobile Computing", pp. 236-247, ISSN:1536-1233, 2002. > >I would appreciate a copy, if this is possible. (And as usual, we are > >pleased that the citeseer system behaves as it is supposed to. It's > >down. :-( ) > I will send it to you Thank you! > > >Actually, I have come to the conclusion that we should define a number of > >relevant scenarios. We should not _invent_ them. They must be > >part of the real world. E.g. it will hardly be possible to describe > >user mobility or the C/I ratio of a wireless channel > >for the whole world. (I know about those papers which claim that. The big > >waste basket in our house knows them as well.) > I agree with you. However, I am still not sure about the "waste basket" part. Maybe experts > would enlighten us and clarify this polemic. I don't have any experience on this issue. We have a big waste basket on the ground floor of our house where old paper is gathered.
Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From david.borman at windriver.com Wed Dec 7 16:04:22 2005 From: david.borman at windriver.com (David Borman) Date: Wed, 7 Dec 2005 18:04:22 -0600 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <4397447B.9040405@isi.edu> References: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> <4397447B.9040405@isi.edu> Message-ID: <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> On Dec 7, 2005, at 2:22 PM, Joe Touch wrote: >> The TCP MSS value specifies the largest size of packet that the >> receiver can reassemble. I doubt that there are many, if any, TCP >> implementations that can't handle receiving a full 64K TCP/IP packet >> (as a series of IP fragments). > > See: http://www.psc.edu/networking/projects/tcptune/ > > The table (grep for "FreeBSD") shows that current OS's start with a > default socket size smaller than 64KB (which limits the size of the > receive window, right?). Yes, but that has nothing to do with the MSS value. The MSS is simply the largest IP packet that the host can reassemble. >> But since most TCP implementations try >> very hard to not send packets that will be fragmented at the IP >> layer, >> as long as the MSS is larger than the underlying MTU it doesn't >> really >> matter. In theory, when using Path MTU discovery, there is no >> reason >> to not always use the maximum MSS value of 64K-1. > > Most systems don't try anything larger than the outgoing interface > MSS, > though, which is often much smaller (1500 bytes (ethernet), 4400 bytes > (POS), or 9KB (ATM)). The received MSS value is only one of several variables that will limit the size of packets that get sent. While common, basing the MSS on the MTU of the outgoing interface breaks down in the case of asymmetric routing, when the MTU of the incoming interface is larger than the MTU of the outgoing interface.
For this reason, some systems use an MSS that is based on the maximum MTU of all interfaces, rather than the outgoing interface. But the MSS is also a powerful knob that can be used to force remote systems to send smaller packets when they aren't smart enough to send packets that are small enough to not get fragmented along the way. Fundamentally, it is this reason why most hosts don't just use an MSS of 64K-1. -David Borman From touch at ISI.EDU Wed Dec 7 16:56:32 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed, 07 Dec 2005 16:56:32 -0800 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> References: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> <4397447B.9040405@isi.edu> <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> Message-ID: <439784C0.4020307@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 David Borman wrote: > > On Dec 7, 2005, at 2:22 PM, Joe Touch wrote: > >>> The TCP MSS value specifies the largest size of packet that the >>> receiver can reassemble. I doubt that there are many, if any, TCP >>> implementations that can't handle receiving a full 64K TCP/IP packet >>> (as a series of IP fragments). >> >> See: http://www.psc.edu/networking/projects/tcptune/ >> >> The table (grep for "FreeBSD") shows that current OS's start with a >> default socket size smaller than 64KB (which limits the size of the >> receive window, right?). > > Yes, but that has nothing to do with the MSS value. The MSS is simply > the largest IP packet that the host can reassemble. According to RFC793, MSS is the max TCP segment the receiver can handle - - not just the largest IP packet that can be reassembled (though this could be presumed as a prerequisite). If the connection can only handle 8KB outstanding, even if IP can handle a packet that large, TCP cannot, so it seems inappropriate to ever advertise an MSS > max_recv_window, which is bounded by the socket size. 
>>> But since most TCP implementations try >>> very hard to not send packets that will be fragmented at the IP layer, >>> as long as the MSS is larger than the underlying MTU it doesn't really >>> matter. In theory, when using Path MTU discovery, there is no reason >>> to not always use the maximum MSS value of 64K-1. >> >> >> Most systems don't try anything larger than the outgoing interface MSS, >> though, which is often much smaller (1500 bytes (ethernet), 4400 bytes >> (POS), or 9KB (ATM)). > > The received MSS value is only one of several variables that will limit > the size of packets that get sent. While common, basing the MSS on the > MTU of the outgoing interface breaks down in the case of asymmetric > routing, when the MTU of the incoming interface is larger than the MTU > of the outgoing interface. The outgoing segment size is limited by the min of the received MSS, the outgoing interface MTU, and the override value in the routing table - at least that's how it's implemented in BSD. (I was assuming 'bounded by' the outgoing interface MTU, not just assuming the outgoing interface MTU - sorry about using MSS in that context. However, I'm confused by the counterexample - the case where the incoming interface MTU is larger than the outgoing is where you _need_ to look at the outgoing interface, to avoid using the other side's declared MSS and causing fragmentation for your end). > For this reason, some systems use an MSS > that is based on the maximum MTU of all interfaces, rather than the > outgoing interface. The advertised MSS can be that large, but it presumes that all interfaces are capable of receiving and reassembling IP packets equally well, which is not the case where reassembly happens on the NIC. The advertised MSS should be bounded by the incoming interface MTU of this connection.
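The min-of-three rule described above (received MSS, outgoing interface MTU, per-route override, as in BSD) can be sketched as a small helper. A simplified illustration with names of our own choosing, assuming optionless 20-byte IP and TCP headers:

```python
IP_HDR = 20   # bytes, IP header without options
TCP_HDR = 20  # bytes, TCP header without options

def outgoing_segment_size(peer_mss, out_mtu, route_mtu=None):
    """Payload bytes per segment: the minimum of the peer's advertised MSS,
    what the outgoing interface MTU leaves after headers, and any
    per-route override from the routing table (if present)."""
    limits = [peer_mss, out_mtu - IP_HDR - TCP_HDR]
    if route_mtu is not None:
        limits.append(route_mtu - IP_HDR - TCP_HDR)
    return min(limits)
```

So even a peer advertising the maximal MSS of 65495 is sent 1460-byte segments over a 1500-byte Ethernet interface, which is why a huge advertised MSS "doesn't really matter" in practice.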
> But the MSS is also a powerful knob that can be used to force remote > systems to send smaller packets when they aren't smart enough to send > packets that are small enough to not get fragmented along the way. > Fundamentally, it is this reason why most hosts don't just use an MSS > of 64K-1. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFDl4TAE5f5cImnZrsRAnSCAKCBWOANWpfuIuYHs/weNfPvMWEEGgCgww05 kyC6AYAcADgFTSiuSmlF3BA= =Abyi -----END PGP SIGNATURE----- From detlef.bosau at web.de Thu Dec 8 02:15:18 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 08 Dec 2005 11:15:18 +0100 Subject: [e2e] TCP fragmentation and reassembly References: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> <4397447B.9040405@isi.edu> <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> <439784C0.4020307@isi.edu> Message-ID: <439807B6.7CB719F0@web.de> Joe Touch wrote: > > > > Yes, but that has nothing to do with the MSS value. The MSS is simply > > the largest IP packet that the host can reassemble. > > According to RFC793, MSS is the max TCP segment the receiver can handle > - - not just the largest IP packet that can be reassembled (though this > could be presumed as a prerequisite). If the connection can only handle > 8KB outstanding, even if IP can handle a packet that large, TCP cannot, > so it seems inappropriate to ever advertise an MSS > max_recv_window, > which is bounded by the socket size. I think we are splitting hairs here %-) For me, it was helpful to read once more that the MSS is set, amongst others, by the receiver side. I had forgotten about it - but obviously, there is a necessity to do so, and when I think about it, I believe I remember a TCP option for MSS negotiation. Is this correct? However: To my understanding, the packet size of outgoing packets is limited. And there are several limiting factors: - the MTU, i.e.
the maximum packet size an outgoing interface can handle, - the Path MTU (if known), i.e. the maximum packet size the _path_ can handle, - the receiver's MSS, i.e. the maximum segment size a receiver can handle. Whether a receiver's MSS is limited by the IP layer, its network interface card, the receiving TCP socket or the light of the moon does not really matter. The receiver asks politely not to exceed a certain MSS, and the sender is polite and will not do so :-) > > The outgoing segment size is limited by the min of the received MSS, the > outgoing interface MTU, and the override value in the routing table - at > least that's how it's implemented in BSD. And this perfectly makes sense to me. > > > For this reason, some systems use an MSS > > that is based on the maximum MTU of all interfaces, rather than the > > outgoing interface. > > The advertised MSS can be that large, but it presumes that all ...and the receiver's advertised MSS is only _one_ of the upper limits for the MSS to be used, right? > interfaces are capable of receiving and reassembling IP packets equally > well, which is not the case where reassembly happens on the NIC. The > advertised MSS should be bounded by the incoming interface MTU of this > connection. > > > But the MSS is also a powerful knob that can be used to force remote > > systems to send smaller packets when they aren't smart enough to send > > packets that are small enough to not get fragmented along the way. > > Fundamentally, it is this reason why most hosts don't just use an MSS > > of 64K-1. Hm. I think we should be careful about using "powerful" knobs for each and everything, or we get lost in what we use for which purpose. To my understanding, the proper way to avoid fragmentation is Path MTU discovery. Although there may be systems around which do not implement it properly, it's the proper way to go, because a receiver will hardly know the path MTU. In contrast to that, a sender can and will.
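The sender-driven probing behind Path MTU discovery can be sketched in a few lines. This is an idealized illustration (function names are ours), under the simplifying assumption that every constraining router reports its next-hop MTU in an ICMP "fragmentation needed" message, as RFC 1191 recommends:

```python
def discover_path_mtu(send_probe, first_hop_mtu, floor=68):
    """Idealized Path MTU discovery: send DF-marked probes, shrinking to the
    next-hop MTU reported via ICMP 'fragmentation needed' until one gets
    through. `send_probe(size)` returns None on success, or the reported
    next-hop MTU on failure."""
    size = first_hop_mtu
    while size >= floor:  # 68 bytes is the minimum MTU IPv4 guarantees
        reported = send_probe(size)
        if reported is None:
            return size
        size = reported
    raise RuntimeError("no deliverable size above the minimum MTU")
```

Each constraining hop costs one round of probing, which is why the sender, not the receiver, ends up knowing the path MTU.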
And IIRC, in IPv6 the Path MTU discovery mechanism is compulsory and no transparent fragmentation will take place at any router. However, please correct me if I'm wrong; I didn't look it up in the RFCs at the moment. Detlef > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.4 (MingW32) > Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org > > iD8DBQFDl4TAE5f5cImnZrsRAnSCAKCBWOANWpfuIuYHs/weNfPvMWEEGgCgww05 > kyC6AYAcADgFTSiuSmlF3BA= > =Abyi > -----END PGP SIGNATURE----- -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From faber at ISI.EDU Thu Dec 8 08:54:29 2005 From: faber at ISI.EDU (Ted Faber) Date: Thu, 8 Dec 2005 08:54:29 -0800 Subject: [e2e] queriable networks In-Reply-To: <4394DDFF.8060909@isi.edu> References: <438F10E4.7050007@dcrocker.net> <438F5198.1070103@reed.com> <439297A0.6090406@dcrocker.net> <4394DDFF.8060909@isi.edu> Message-ID: <20051208165429.GB36688@hut.isi.edu> On Mon, Dec 05, 2005 at 04:40:31PM -0800, Joe Touch wrote: > Dave Crocker wrote: > > > > > >> One place where I depart from a common view of the end to end argument is > >> that there are times when it makes sense to actively enquire of the > >> network and expect the network to make a response that characterizes > >> itself. > > (responding more to Fred than to Dave, but with Dave's subject thread): > > This presumes two things: > > 1) if you ask the question, you actually want the answer [snip] > 2) if you ask the question, you can trust the answer [snip] Do you think that the set of queries to "the network" for which the asker really wants and can trust the answer is empty? There are certainly lots of ways to go wrong, but it seems that there are times when an entity knows the right question and knows another entity that has the information and is trustable, but there's no protocol to ask the question.
If those cases exist, it seems like it's worthwhile to be able to ask the question. -- Ted Faber http://www.isi.edu/~faber PGP: http://www.isi.edu/~faber/pubkeys.asc Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20051208/1c76c300/attachment.bin From david.borman at windriver.com Thu Dec 8 10:33:40 2005 From: david.borman at windriver.com (David Borman) Date: Thu, 8 Dec 2005 12:33:40 -0600 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <439784C0.4020307@isi.edu> References: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> <4397447B.9040405@isi.edu> <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> <439784C0.4020307@isi.edu> Message-ID: <17ED86FB-FBB5-4135-93AE-DC54C0B5061E@windriver.com> On Dec 7, 2005, at 6:56 PM, Joe Touch wrote: > David Borman wrote: >> >> On Dec 7, 2005, at 2:22 PM, Joe Touch wrote: >> >>>> The TCP MSS value specifies the largest size of packet that the >>>> receiver can reassemble. I doubt that there are many, if any, TCP >>>> implementations that can't handle receiving a full 64K TCP/IP >>>> packet >>>> (as a series of IP fragments). >>> >>> See: http://www.psc.edu/networking/projects/tcptune/ >>> >>> The table (grep for "FreeBSD") shows that current OS's start with a >>> default socket size smaller than 64KB (which limits the size of the >>> receive window, right?). >> >> Yes, but that has nothing to do with the MSS value. The MSS is >> simply >> the largest IP packet that the host can reassemble. > > According to RFC793, MSS is the max TCP segment the receiver can > handle > - - not just the largest IP packet that can be reassembled (though > this > could be presumed as a prerequisite). 
If the connection can only > handle > 8KB outstanding, even if IP can handle a packet that large, TCP > cannot, > so it seems inappropriate to ever advertise an MSS > max_recv_window, > which is bounded by the socket size. No. The receive window has no influence on the MSS value. Look at RFC 1122. The value to use in the MSS option is to be less than or equal to MMS_R - 20, where MMS_R is "the maximum message size that can be received and reassembled in an IP datagram" (RFC 1122, pg 57). When you actually go to send a packet, the received MSS value is only one of several things that can limit the packet size. ... >> For this reason, some systems use an MSS >> that is based on the maximum MTU of all interfaces, rather than the >> outgoing interface. > > The advertised MSS can be that large, but it presumes that all > interfaces are capable of receiving and reassembling IP packets > equally > well, which is not the case where reassembly happens on the NIC. The > advertised MSS sbould be bounded by the incoming interface MTU of this > connection. Provided you know for sure which is the incoming interface. You can always know what interface you are using to send packets, but in many multi-homed situations you can't guarantee on which interface the traffic will arrive. -David Borman From braden at ISI.EDU Thu Dec 8 09:18:14 2005 From: braden at ISI.EDU (Bob Braden) Date: Thu, 08 Dec 2005 09:18:14 -0800 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <439784C0.4020307@isi.edu> References: <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> <4397447B.9040405@isi.edu> <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> Message-ID: <5.1.0.14.2.20051208091636.00aa2438@boreas.isi.edu> RFC 1122 went to considerable pain to get the TCP MSS calculation right, since it was often wrong in early implementations. Why are we rehashing this 15 year old issue? 
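The RFC 1122 rule David cites reduces to a one-liner. A sketch of the arithmetic for the maximal-reassembly case (function name is ours):

```python
def mss_option_value(mms_r):
    # RFC 1122 (p. 57): the value sent in the MSS option is MMS_R - 20,
    # where MMS_R is the largest transport-layer message that can be
    # received and reassembled in an IP datagram, i.e. the effective
    # receive MTU minus the 20-byte IP header.
    return mms_r - 20

# A host that can reassemble a maximal 65535-byte IPv4 datagram has
# MMS_R = 65535 - 20 = 65515 and would therefore advertise an MSS of 65495.
```

Note this value is independent of the receive window, which is exactly the point being argued: the MSS speaks to reassembly capability, not to how much data may be outstanding.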
Bob Braden From demir at kou.edu.tr Thu Dec 8 10:30:42 2005 From: demir at kou.edu.tr (Alper Kamil Demir) Date: Thu, 8 Dec 2005 20:30:42 +0200 Subject: [e2e] YNT: A Question on the TCP handoff Message-ID: >One problem with automatic line wrapping at the reader side is "ASCII >art". >E.g. our network drawings :-) >I have difficulties with these issues myself. It seems the format of my emails gets worse the more I try. Before I send emails they look nice; when I look at them after posting, they look terrible. I don't know what to do when using Microsoft OWA. I am somehow stuck with using it because there are almost always problems with IMAP and POP at my university. >The one endpoint of the warm-up connection is FH, correct? And the other is >SA2. It cannot be MH, because MH is not yet in the cell of SA2. Yes, FH is the one end point of the warm-up connection and SA2 is the other end. No, in warm-up connections, MHs are never end points because they are not in the cell of the related SAs yet. >I will try to understand..... (maybe I suffer from very early >Alzheimer symptoms.... %-)) It seems not so far %-)) >What I do not understand is how the handover takes place and how MH is >made the endpoint >of a connection which was terminated on SA2? This was my question in the whole thread: whether this is possible at all without corrupting the e2e semantics of TCP. My approach is that only the congestion state is updated on FH. The congestion state from FH to SA2, resulting from the "warm-up connection", is ready to be handed over; only the last mile from SA2 to MH is missing in the congestion state. However, some other mechanism could be used to estimate it and integrate it with the already known congestion state resulting from the "warm-up connection". > I referenced RFC793 in order to give the definition of "connection"; NOT "congestion" >I did not mix up these two ;-) Then, I didn't understand why you had asked me to define "connection". We can skip this, if you like. >> cause you asked me to define it.
I think there is a big misunderstanding. I am familiar with >> TCP congestion control. Yes, what you have stated above is correct, but not related to >> what I wrote it for. >And exactly that's what I do not understand. I guess, we are on the same track now. >> I am not sure if this issue was known in 1981, however somehow my memory recalls that >> it was a known issue before 1981. I somehow remember a related thread on e2e. >But you surely did not read the list before 1982 ;-) *SCNR* Unfortunately (or maybe luckily ;), I wasn't around back then. However, I remember a related thread from a couple of years back. >> A start on the congestion control would be V. Jacobson's SIGCOMM'88 paper, and S. Floyd's >Really? :-) I didn't mean for you, because I am familiar with your "Path Tail Emulation" work and it is referenced there, too. I meant we are on the same track on this. > home page at ICIR has a very good and organized collection of related enhancements and recent literature. > To me, the recent literature consists of refinements of the SIGCOMM'88 paper. >However, it should be part of any actual TCP connection. A TCP >implementation which does not do congestion control should be considered >broken. In a way, I don't agree with you here. I don't understand why TCP should be responsible for this. For a long time, and still, UDP is the privileged child of the Internet from the point of view of bandwidth share. To me, congestion control is a type of "implicit admission control". Admission control is a QoS mechanism. TCP cares about network quality of service whereas UDP doesn't. Moreover, the Internet is "best-effort", meaning it embodies only one type of service, based on the "fate-sharing" principle. To me, congestion control is not a transport layer issue. It is a network layer issue. However, when integrated into the transport layer it does prevent network under-utilization. Hence, to replace UDP, DCCP has emerged. There is UDP and DCCP, but there is no TCP without congestion control. 
TCP is actually the TCP Congestion Control Protocol (TCCP) now. In a way, the Internet has emerged with new congestion control protocols to provide an "implicit quality of service" preventing congestion. The existence of RFC793 and other RFCs related to TCP and TCP congestion control is evidence for this. >However, in mobile networks like GPRS with latencies up to several >hundred seconds and asserted latency bandwidth products in >Megabyte magnitude (I think, Michael Meyer, Ericsson, has written >something about that) the last mile _does_ matter. We are planning to use a mechanism to take the last mile problem into account in our approach and integrate it into the FH-SA2 path. >Please note: The last two lines were a pure blackbox understanding of a >mobile network. It would be to detailed here to discuss ^^^^^^^ ;)) Typos do happen, such as my "to" "two" typo. Does this prove history recurs ;)) Especially since I have used quite a variety of keyboards. > >So, even if you would replace only the wireless part in your path and > >would maintain the wired part, you would have absolutely > >no idea about the correct value for CWND after the path change. Perhaps > >you know some fair share of capacity from your > >"warm up connection". But which part of the former CWND shall be > >replaced by this? > Not the whole wireless part. Only the last mile from MH to BS. I mentioned this above. >Excuse me? >Where is the difference here? The whole path from FH to SA2 is a new path. Only the last mile is missing. The true bandwidth integration of the last mile is not possible other than by estimating it before MH moves into the area of the last mile. > We plan to change the whole former CWND part. Isn't this possible at all? Is this against > the TCP end-to-end semantics, or not possible at all? >Forget about the TCP end-to-end semantics for the moment. First of all, >they could be maintained even in the presence >of PEP etc. Second, we must not simplify the wireless part too much. 
The >wireless part not only consists of some >electromagnetic waves between two antennas. At least, there is typically >a Radio Link Protocol between BS and MH. If I forget about e2e semantics then it seems there is no problem at all. It seems I am digging up no problem here. The only problem is if it is possible to replace/update the sender's congestion state at all. I agree with the rest. However, we are trying to attack the problem from the "transport layer" (congestion state changes) and not taking "lower layer interactions and effects" into account for now. > >And with which sending rate are you probing? This depends on where the > >bottleneck in your connection is situated. > >This is one of the big mysteries in your connection: You have no idea > >where the bottleneck is! If it is not on the last mile, then in our approach there is no problem. However, if it is, then we are planning to use an estimation to take this into account. >Yes. So, there is a sending rate. It's a consequence of proper ACK >pacing :-) I would say a rate consequence of proper "ack pacing" constrained with RTT. >O.k. With which rate? The rate of whatever the "warm-up" TCP connection from FH to SA2 achieves. >Hm. When we face a LBP of 20 Mbytes (which I read in a paper by Michael >Meyer) on the last mile >and start from a CWND of 32 kBytes in a warm up connection, it might >take some time to adapt... ;-) I guess I misunderstand and underestimate this last mile issue. Could you give a reference? >Again: with which rate? If you are asking for packet size, we could use some estimated packet size. The rest is TCP slow start, etc., from FH to SA2 for a short time till the congestion state is handed over. >> Others are enhancements. Semantics is kept the same as it was. >Oh yeah ;-) Frankly, I do not completely agree here :-) Why? >We have a big waste basket in the ground floor of our house where old >paper is gathered. Maybe they are the entropy of the universe. Alper K. 
Demir From touch at ISI.EDU Thu Dec 8 11:11:58 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 08 Dec 2005 11:11:58 -0800 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <17ED86FB-FBB5-4135-93AE-DC54C0B5061E@windriver.com> References: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> <4397447B.9040405@isi.edu> <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> <439784C0.4020307@isi.edu> <17ED86FB-FBB5-4135-93AE-DC54C0B5061E@windriver.com> Message-ID: <4398857E.1020805@isi.edu> David Borman wrote: > > On Dec 7, 2005, at 6:56 PM, Joe Touch wrote: > >> David Borman wrote: >>> >>> On Dec 7, 2005, at 2:22 PM, Joe Touch wrote: >>> >>>>> The TCP MSS value specifies the largest size of packet that the >>>>> receiver can reassemble. I doubt that there are many, if any, TCP >>>>> implementations that can't handle receiving a full 64K TCP/IP packet >>>>> (as a series of IP fragments). >>>> >>>> See: http://www.psc.edu/networking/projects/tcptune/ >>>> >>>> The table (grep for "FreeBSD") shows that current OS's start with a >>>> default socket size smaller than 64KB (which limits the size of the >>>> receive window, right?). >>> >>> Yes, but that has nothing to do with the MSS value. The MSS is simply >>> the largest IP packet that the host can reassemble. >> >> According to RFC793, MSS is the max TCP segment the receiver can handle >> - - not just the largest IP packet that can be reassembled (though this >> could be presumed as a prerequisite). If the connection can only handle >> 8KB outstanding, even if IP can handle a packet that large, TCP cannot, >> so it seems inappropriate to ever advertise an MSS > max_recv_window, >> which is bounded by the socket size. > > No. The receive window has no influence on the MSS value. Look at RFC > 1122. The value to use in the MSS option is to be less than or equal to > MMS_R - 20, where MMS_R is "the maximum message size that can be > received and reassembled in an IP datagram" (RFC 1122, pg 57). 
1122 only says: The MSS value to be sent in an MSS option must be less than or equal to: MMS_R - 20 (i.e., it doesn't require 'equal to') I'm suggesting that an _additional_ requirement is that TCP be able to support the reassembled payload. There's nothing in 1122 that precludes that, and 793 implies it. > When you > actually go to send a packet, the received MSS value is only one of > several things that can limit the packet size. > > ... >>> For this reason, some systems use an MSS >>> that is based on the maximum MTU of all interfaces, rather than the >>> outgoing interface. >> >> The advertised MSS can be that large, but it presumes that all >> interfaces are capable of receiving and reassembling IP packets equally >> well, which is not the case where reassembly happens on the NIC. The >> advertised MSS should be bounded by the incoming interface MTU of this >> connection. > > Provided you know for sure which is the incoming interface. You can > always know what interface you are using to send packets, but in many > multi-homed situations you can't guarantee on which interface the > traffic will arrive. You know which ones were used; it might be sufficient to use the lower bound of the incoming interfaces that were actually used as a limit (lower bound of all interfaces if you haven't received anything). Joe 
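The bound being argued over here fits in a few lines. The sketch below is illustrative only (the function name and the optional receive-window cap are mine, not from any real stack): the first bound is the RFC 1122 rule, the second is the additional limit Joe is proposing and Borman is disputing.

```python
# Sketch of the advertised-MSS bound under debate (names illustrative).
# RFC 1122: the MSS option value must be <= MMS_R - 20, where MMS_R is
# the largest IP datagram the host can receive and reassemble.
IP_HEADER_MIN = 20  # bytes subtracted per RFC 1122

def advertised_mss(mms_r, recv_window=None):
    """Upper bound on the MSS option value.

    mms_r: largest reassembled IP datagram this host accepts.
    recv_window: if given, additionally cap by the receive window
    (Joe's suggested extra requirement; RFC 1122 itself imposes
    only the first bound).
    """
    mss = mms_r - IP_HEADER_MIN
    if recv_window is not None:
        mss = min(mss, recv_window)
    return mss

# 64 KB reassembly limit, 32 KB socket buffer:
print(advertised_mss(65535))          # -> 65515 (RFC 1122 bound alone)
print(advertised_mss(65535, 32768))   # -> 32768 (with the debated cap)
```

The disagreement in the thread is precisely whether the second argument belongs in the calculation at all, or whether the receive window alone already prevents oversized packets on the send side.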
From touch at ISI.EDU Thu Dec 8 11:30:02 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 08 Dec 2005 11:30:02 -0800 Subject: [e2e] queriable networks In-Reply-To: <20051208165429.GB36688@hut.isi.edu> References: <438F10E4.7050007@dcrocker.net> <438F5198.1070103@reed.com> <439297A0.6090406@dcrocker.net> <4394DDFF.8060909@isi.edu> <20051208165429.GB36688@hut.isi.edu> Message-ID: <439889BA.7060001@isi.edu> Ted Faber wrote: > On Mon, Dec 05, 2005 at 04:40:31PM -0800, Joe Touch wrote: >> Dave Crocker wrote: >>> >>>> One place where I depart from a common view of the end to end argument is >>>> that there are times when it makes sense to actively enquire of the >>>> network and expect the network to make a response that characterizes >>>> itself. >> (responding more to Fred than to Dave, but with Dave's subject thread): >> >> This presumes two things: >> >> 1) if you ask the question, you actually want the answer > [snip] >> 2) if you ask the question, you can trust the answer > [snip] > > Do you think that the set of queries to "the network" for which the > asker really wants and can trust the answer is empty? Questions that need to be asked of the network: yes. Note that it's not enough to trust the other party; I have to trust all parties who are providing information that might impact the answer. That's why I don't think this sort of trust ever really exists. Questions for which the answers can be measured independently (e.g., pathchar): no. Joe 
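Joe's pathchar example is the case where the asker measures rather than trusts: link properties are inferred from the asker's own probe timings, so no network element has to be believed. A toy, single-link version of that inference might look like the sketch below (all names and numbers are made up for illustration; real pathchar works hop by hop and filters many probes for the minimum RTT at each size).

```python
# Toy pathchar-style inference (illustrative only): estimate a link's
# capacity from how the minimum round-trip time grows with probe size.
# For a symmetric echo, each extra byte adds serialization delay in
# both directions, so slope = 2 / capacity and capacity = 2 / slope.
def capacity_from_probes(size_small, rtt_small, size_large, rtt_large):
    """Estimate link capacity (bytes/s) from min RTTs at two probe sizes."""
    slope = (rtt_large - rtt_small) / (size_large - size_small)  # s/byte
    return 2.0 / slope

# Made-up minimum RTTs for 64-byte and 1500-byte probes over a link
# whose capacity we pretend not to know:
cap = capacity_from_probes(64, 0.010, 1500, 0.010 + 2 * 1436 / 125000.0)
print(round(cap))  # ~125000 bytes/s, i.e. a 1 Mb/s link
```

The point relevant to the thread: nothing in this estimate depends on an answer supplied by the network, only on timings the measuring end observes itself.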
From detlef.bosau at web.de Thu Dec 8 14:03:16 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 08 Dec 2005 23:03:16 +0100 Subject: [e2e] YNT: A Question on the TCP handoff References: Message-ID: <4398ADA4.9090907@web.de> Alper Kamil Demir wrote: >>One problem in automatic line wrapping at the reader side is "ASCII >>arts". >>E.g. our network drawings :-) >>I have difficulties with this issue myself. > > It seems the format of my emails is getting worse, the more I try. > Before I send emails they look nice; when I look at them after posting, > they look terrible. I don't know what to do when using Microsoft OWA. > I am somehow stuck with it because there are almost always problems > with IMAP and POP at my university. > > It's off topic, but usually I recommend: Open your CDROM drive. Insert a Linux installation CD. Powercycle your computer. Follow the instructions on the screen :-) > > >>I will try to understand..... (may be I suffer from very early >>Alzheimer-Symptoms.... %-)) > > It seems not so far %-)) Thank you. However, when I apply for a job, I don't know whether one expects a dead body, some guy suffering from Alzheimer's or some guy suffering from Creutzfeldt-Jakob disease when one reads "age: 42". I've not written any CVs for some months now, I've simply given up :-((( (Don't ask me what money I'm living on, I don't know.) O.k., that's off topic as well, I know. But sometimes, it helps to talk about it. > > >>What I do not understand is how the handover takes place and how MH is >>made the endpoint >>of a connection which was terminated on SA2? > > This was my question in the whole thread: whether this is possible at all without > corrupting the e2e semantics of TCP. 
> My approach is that only congestion state is Welcome to the club :-) So, we're already two guys who do not understand your approach =8-) > updated on FH. The congestion state from FH to SA2, resulting from the "warm-up > connection", is ready to be handed over; only the last mile from SA2 to MH is > missing in the congestion state. However, some other mechanism could be used > to estimate it and integrate it with the already known congestion state resulting from > the "warm-up connection". > You lost me .... Let's assume you do no splitting or spoofing and BS is simply a router. Then CWND on FH tells you how much unacknowledged data, i.e. TCP packets on their way to MH and ACK packets on their way to FH, is allowed to be in transit. ACK packets must be counted for the bytes acknowledged in them, of course. E.g.: CWND = 20 kbyte, then you might have 10 kbyte TCP data on the fly and ACK datagrams for 10 kbyte respectively. You do not know which portion of these is situated between FH and BS, i.e. SA1 before handover, and which portion is situated between BS (SA1) and MH. And exactly that's your problem and I do not see an easy way to overcome it. In addition, depending on your protocol, some data may still reside on SA1 after the handover and might be redirected to MH's new location. So, for a short period of time, FH and SA1 might be competing for resources. > > >>I referenced RFC793 in order to give the definition of "connection"; NOT "congestion" >>I did not mix up these two ;-) > > Then, I didn't understand why you had asked me to define "connection". > We can skip this, if you like. I still do not really know whether we talk about the same thing here... We'll see. > > >>>A start on the congestion control would be V. Jacobson's SIGCOMM'88 paper, and S. Floyd's >> >>Really? :-) > > I didn't mean for you, because I am familiar with your "Path Tail Emulation" work and it > is referenced there, too. I meant we are on the same track on this. Hey :-) You read my work? 
Fine :-) Perhaps you could even have a look at my humble try at the "lower layers"? At the moment I would appreciate some assistance / correction / feedback on some channel coding issues I only begin to understand. > > >>However, it should be part of any actual TCP connection. A TCP >>implementation which does not do congestion control should be considered >>broken. > > In a way, I don't agree with you here. I don't understand why TCP should be responsible > for this. For a long time, and still, UDP is the privileged child of the Internet from the point > of view of bandwidth share. To me, congestion control is a type of "implicit admission control". > Admission control is a QoS mechanism. TCP cares about network quality of service whereas UDP > doesn't. Moreover, the Internet is "best-effort", meaning it embodies only one type of service, > based on the "fate-sharing" principle. To me, congestion control is not a transport layer issue. Hang on here. I will not repeat the whole congestion control debate here because this is sufficiently summarized in the congavoid paper. For me, there are basically two alternatives for congestion control. 1.: Distributed: Each flow cares for itself, each flow is responsive (in the sense defined by e.g. Sally Floyd). Ideally, such a system will eventually converge to a fair share of resources between all flows. 2.: Centralized: There is some mechanism / entity which assigns resources to the individual flows and enforces the flows' rates throughout the network. IIRC, this approach is pursued in Keshav's PhD thesis. A good discussion of this issue can be found in "Myths about Congestion Management in High-Speed Networks" by Raj Jain in the early 90s. Anyway, you'll need some kind of congestion control in order to prevent congestion collapse. If you don't agree with existing congestion mechanisms, you should propose a better one. Nevertheless, it's commonly accepted that some kind of congestion control is inevitable. 
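Detlef's first alternative, each flow responsive on its own, is essentially the AIMD rule of the congavoid paper. A minimal sketch of that window update (constants and names are illustrative; real TCP adds slow start, timeouts, fast recovery, and per-ACK increments):

```python
# Minimal AIMD sketch of the distributed congestion control described
# above (illustrative constants; not a full TCP implementation).
def aimd_step(cwnd, mss, loss_detected):
    """One round-trip's worth of congestion-window update, in bytes."""
    if loss_detected:
        return max(mss, cwnd // 2)   # multiplicative decrease, floor of 1 MSS
    return cwnd + mss                # additive increase, ~one MSS per RTT

cwnd = 10 * 1460                                   # start at 10 segments
cwnd = aimd_step(cwnd, 1460, loss_detected=False)  # grows to 11 segments
cwnd = aimd_step(cwnd, 1460, loss_detected=True)   # halves on loss
print(cwnd // 1460)  # -> 5
```

Run over many flows sharing a bottleneck, this increase/decrease asymmetry is what drives the convergence to a fair share that Detlef mentions; the centralized alternative instead computes and enforces the rates explicitly.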
In the congavoid paper, VJ writes the Internet _has_ _seen_ several congestion collapses; they are not pure fantasy. > It is a network layer issue. However, when integrated into the transport layer it does prevent If you place congestion control in L3, you advocate a mechanism as proposed by Keshav. Right? > network under-utilization. Hence, to replace UDP, DCCP has emerged. There is UDP and DCCP, First of all, it prevents network "over-utilization" :-) Concerning DCCP: At a very first glance, DCCP is intended to be compatible with TCP congestion control. > but there is no TCP without congestion control. TCP is actually the TCP Congestion Control Yes. TCP assumes a stream model and hence there is a possibility to do congestion control for TCP. UDP does not. And in contrast to your view, UDP leaves congestion management as well as reliability issues primarily to the applications. E.g. a rate-controlled media stream using UDP is responsible for congestion control and responsiveness on its own. In addition, a sliding window protocol like TCP _REQUIRES_ an estimation of a fair share of capacity. It's the purpose of congestion control to provide this estimation. > Protocol (TCCP) now. In a way, the Internet has emerged with new congestion control protocols > to provide an "implicit quality of service" preventing congestion. The existence of RFC793 and > other RFCs related to TCP and TCP congestion control is evidence for this. I basically don't follow your mixup of congestion control and QoS. QoS is commonly used for _guaranteed_ service. Congestion control provides for fairness. Perhaps, you can invent a reliable UDP-based protocol controlled by DCCP. However, I think this would result in a TCP-compatible protocol the behaviour and purpose of which are similar to those of TCP. So, why don't you simply use TCP? 
> > >>However, in mobile networks like GPRS with latencies up to several >>hundred seconds and asserted latency bandwidth products in >>Megabyte magnitude (I think, Michael Meyer, Ericsson, has written >>something about that) the last mile _does_ matter. > > We are planning to use a mechanism to take the last mile problem into account in > our approach and integrate it into the FH-SA2 path. > Have fun :-) > >>Please note: The last two lines were a pure blackbox understanding of a >>mobile network. It would be to detailed here to discuss > > ^^^^^^^ > ;)) Typos do happen, such as my "to" "two" typo. Does this prove > history recurs ;)) Especially since I have used quite a variety of keyboards. > I once was confronted with a French one.... God in Heaven :-) > >>Excuse me? >>Where is the difference here? > > The whole path from FH to SA2 is a new path. Only the last mile is missing. _Yes_. > The true bandwidth integration of the last mile is not possible other than > by estimating it before MH moves into the area of the last mile. > This is obvious because you don't know anything about the wireless channel's quality in advance. Consequently (you'll foresee my comment ;-)) you do not know in advance where the bottleneck is; particularly you don't know whether packets will pile up and eventually be discarded at SA2 or at some intermediate node between FH and SA2. > I am digging up no problem here. The only problem is if it is possible to replace/update > the sender's congestion state at all. And I'm in doubt here. > >>>And with which sending rate are you probing? This depends on where the >>>bottleneck in your connection is situated. >>>This is one of the big mysteries in your connection: You have no idea >>>where the bottleneck is! >> > If it is not on the last mile, then in our approach there is no problem. However, That's obvious. But how will you know this in advance? > if it is, then we are planning to use an estimation to take this into account. 
Oh yes ;-) I see, I have to work somewhat faster ;-) For the rest of the world: That's an actual competition between Turkey and Germany =:-) > > >>Yes. So, there is a sending rate. It's a consequence of proper ACK >>pacing :-) > > I would say a rate consequence of proper "ack pacing" constrained with RTT. ^^^^^^^^^^^^^^^^^^^^^^^^^^^? > > >>Hm. When we face a LBP of 20 Mbytes (which I read in a paper by Michael >>Meyer) on the last mile >>and start from a CWND of 32 kBytes in a warm up connection, it might >>take some time to adapt... ;-) > > I guess I misunderstand and underestimate this last mile issue. > Could you give a reference? I will have a look whether I find it. It may take some time; I don't remember where the paper by Michael Meyer was published. Unfortunately, it's hard to find good papers on that issue. Hence, I actually try to understand this issue myself. > > >>>Others are enhancements. Semantics is kept the same as it was. >> >>Oh yeah ;-) Frankly, I do not completely agree here :-) > > Why? See above. > > >>We have a big waste basket in the ground floor of our house where old >>paper is gathered. > > Maybe they are the entropy of the universe. In Germany, wastepaper is typically made into toilet paper.... Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From detlef.bosau at web.de Thu Dec 8 16:07:05 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 09 Dec 2005 01:07:05 +0100 Subject: [e2e] LBP in mobile networks Re: YNT: A Question on the TCP handoff References: Message-ID: <4398CAA9.370FC9AF@web.de> The paper mentioned does not claim 20 MBytes LBP but 20 kBytes. 
However, this is still quite a lot for a wireless channel :-) Michael Meyer and Joachim Sachs (Ericsson Research), Markus Holzke (T-Mobile), "Performance Evaluation of a TCP Proxy in WCDMA Networks", IEEE Wireless Communications, October 2003 -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From david.borman at windriver.com Fri Dec 9 08:21:02 2005 From: david.borman at windriver.com (David Borman) Date: Fri, 9 Dec 2005 10:21:02 -0600 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <4398857E.1020805@isi.edu> References: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> <4397447B.9040405@isi.edu> <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> <439784C0.4020307@isi.edu> <17ED86FB-FBB5-4135-93AE-DC54C0B5061E@windriver.com> <4398857E.1020805@isi.edu> Message-ID: <1D9DA7B2-5938-4B12-9578-7670C6A70F6C@windriver.com> On Dec 8, 2005, at 1:11 PM, Joe Touch wrote: ... > (i.e., it doesn't require 'equal to') I'm suggesting that an > _additional_ requirement is that TCP be able to support the > reassembled > payload. There's nothing in 1122 that precludes that, and 793 > implies it. That already exists. It's called the receive window. Or are you suggesting that there are TCPs that can't handle packets that fit within their receive window? ... >> Provided you know for sure which is the incoming interface. You can >> always know what interface you are using to send packets, but in many >> multi-homed situations you can't guarantee on which interface the >> traffic will arrive. > > You know which ones were used; it might be sufficient to use the lower > bound of the incoming interfaces that were actually used as a limit > (lower bound of all interfaces if you haven't received anything). Not necessarily. For an outbound connection, you have to put the MSS in the initial SYN, so you haven't received any packets yet for that connection. 
For an inbound connection, yes, you can look at what interface the SYN arrived on and use that in determining the MSS value to put in the SYN/ACK. -David From touch at ISI.EDU Fri Dec 9 09:33:40 2005 From: touch at ISI.EDU (Joe Touch) Date: Fri, 09 Dec 2005 09:33:40 -0800 Subject: [e2e] TCP fragmentation and reassembly In-Reply-To: <1D9DA7B2-5938-4B12-9578-7670C6A70F6C@windriver.com> References: <98371DF9-9031-4C95-B8F2-029C015EBED9@windriver.com> <4397447B.9040405@isi.edu> <24BAACA9-027A-432B-A6CF-24C9DAEF638F@windriver.com> <439784C0.4020307@isi.edu> <17ED86FB-FBB5-4135-93AE-DC54C0B5061E@windriver.com> <4398857E.1020805@isi.edu> <1D9DA7B2-5938-4B12-9578-7670C6A70F6C@windriver.com> Message-ID: <4399BFF4.1030407@isi.edu> David Borman wrote: > > On Dec 8, 2005, at 1:11 PM, Joe Touch wrote: > ... > >> (i.e., it doesn't require 'equal to') I'm suggesting that an >> _additional_ requirement is that TCP be able to support the reassembled >> payload. There's nothing in 1122 that precludes that, and 793 implies it. > > That already exists. It's called the receive window. Or are you > suggesting that there are TCPs that can't handle packets that fit within > their receive window? I guess it's handled on the send side - the sender would have an MSS that's larger than the receive window, but never end up using it. It seems odd to advertise the ability to receive a packet at TCP that it is not actually ever able to receive, however. > ... >>> Provided you know for sure which is the incoming interface. You can >>> always know what interface you are using to send packets, but in many >>> multi-homed situations you can't guarantee on which interface the >>> traffic will arrive. >> >> You know which ones were used; it might be sufficient to use the lower >> bound of the incoming interfaces that were actually used as a limit >> (lower bound of all interfaces if you haven't received anything). > > Not necessarily. 
For an outbound connection, you have to put the MSS in > the initial SYN, so you haven't received any packets yet for that > connection. For an inbound connection, yes, you can look at what > interface the SYN arrived on and use that in determining the MSS value > to put in the SYN/ACK. > -David Right. Joe From detlef.bosau at web.de Sat Dec 10 07:36:58 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Sat, 10 Dec 2005 16:36:58 +0100 Subject: [e2e] YNT: A Question on the TCP handoff References: Message-ID: <439AF61A.247015CB@web.de> Alper, some remarks. You wrote: > > I mean the rate depends very much on RTT when depending on "ack pacing". > From the basic idea the rate of a TCP flow is independent of the RTT. Practically, it may oscillate because traffic might be bursty and irregular. Perhaps, you think about the contents of my "very preliminary table", which I don't find even _that_ preliminary. And perhaps you think about PTE and why I proposed it. (It's not because Santa Claus is coming in his Coca-Cola advertisement suit ;-)) Then you wrote: > To me, "congestion control" is a QoS mechanism. I totally disagree here. It's exactly the other way round. Of course, QoS architectures can prevent congestion. However, "congestion control" does not provide any kind of QoS. Congestion control shall prevent "network overload", i.e. congestion collapse. QoS architectures do scheduling, admission control etc.; hence, if this is properly done, congestion collapse cannot happen. And now for something completely different: Your approach. Eventually I think I understand what you're doing. 
Your "warm up connection" shall occupy those "packet slots" from FH to SA2 which are later used by your TCP connection: When your mobile enters the new cell, I think you simply drop the warm up connection and update the state variables in your "old" TCP connection which then is routed along the new path. Now, the problem is that in sliding window you have last_ack....(on the fly)....last_seq..(free space)....last_ack+CWND. Ideally, in the settled state each ACK which arrives at the sender clocks out one TCP packet. And each TCP packet which arrives at the receiver, clocks out one ACK packet. (I intentionally ignore delayed ACKs here.) Perhaps, you have seen the little "ball chain" which some people use as a toy or perhaps you have seen this in physics at school. That's basically how TCP works. Now, when you reroute your flow, you don't exactly know how much of your CWND is currently occupied, you don't know how to set last_seq. Or your scoreboard or whatever. You don't know whether your achievable throughput has changed. So you update a few of your state variables whereas you leave other ones (last_seq, last_ack, scoreboard in TCP/Reno) completely untouched. The relationship between these variables is ignored. Simply put: I don't buy it. Your path changes and so do its properties. Consequently, TCP will settle and adapt. In addition: Which problem do you intend to solve? We talked about mechanisms a lot. However, it's not quite clear which problem we solve in your approach. Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From Jon.Crowcroft at cl.cam.ac.uk Sat Dec 10 10:17:43 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Sat, 10 Dec 2005 18:17:43 +0000 Subject: [e2e] YNT: A Question on the TCP handoff In-Reply-To: Message from Detlef Bosau of "Sat, 10 Dec 2005 16:36:58 +0100." 
<439AF61A.247015CB@web.de> Message-ID: there are two different causes for route change - one is link outage, and the other is mobility - but in both, one could alleviate the impact on e2e tcp performance, and on existing traffic on the links in the new route, as traffic is switched over from the old route, by modifying the routing algorithm. if the routing algorithm computes routes that are incremental steps different from the old route (even including the routers either side of the old route's link that broke, or the access point that one is switching away from in the mobile handover case) and then slowly migrated the route to the new "shortest path", then the impact would be massively reduced as the RTT would change in a small number of steps, rather than one big step, and the "bottleneck capacity" would potentially change more smoothly too. given interdomain routing isn't SPF anyway, i don't see why there can be any objection to this - also, it doesn't need any tcp state in the routers to do it - it's just a change to the way we get from the result of one dijkstra or one BGP outcome, to another - it's like a generalised version of crankback, then crankforward, but with live traffic. an added benefit of such a routing algorithm would be that under churn, or heavy flapping, the route performance as perceived by end2end users would be relatively stable. what would such an algorithm look like? well, it's a bit like the opposite of a link disjoint load balancing routing algorithm - i leave it as an exercise for the reader to come up with a modification to OSPF and BGP to achieve such a goal as there isn't space between these two pixels to write it down...... j. From Jon.Crowcroft at cl.cam.ac.uk Sun Dec 11 03:39:06 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Sun, 11 Dec 2005 11:39:06 +0000 Subject: [e2e] YNT: A Question on the TCP handoff In-Reply-To: Message from Lloyd Wood of "Sun, 11 Dec 2005 00:11:53 GMT." 
Message-ID: In missive, Lloyd Wood typed: >>On Sat, 10 Dec 2005, Jon Crowcroft wrote: >>So, what's the point in introducing more than the one necessary path >>change, since the high-rate traffic will be disrupted at each and >>every routing change of path, no matter how small, while sufficiently >>low-rate traffic won't even notice the intermediate steps you're >>introducing and see only the first and final paths/routes as it did >>before? there's an awful lot of assumptions about the network in that paragraph - and your thesis makes assumptions about the network that are different from i) the terrestrial net we are talking about ii) the likely future topologies of the internet. iii) the traffic model >>See: >>http://www.ee.surrey.ac.uk/Personal/L.Wood/publications/PhD-thesis/ >>figures 2.24 and 2.25, pp.49-50 (pdf pages 71 and 72) >>for a worked example of this. >>I see no point to disrupting high-rate traffic with a cascade of >>multiple (and larger than one hop) path changes and more jitter >>variations, while low-rate traffic won't notice what you're >>introducing if you're propagating the gradual changes on a reasonable >>timescale. I'm afraid your suggestion isn't practical. I didn't spell out a concrete algorithm - the one I have in mind does this as a +side effect+ of the way route computation and update are performed, and has low overhead >>> also, it doesn't need any tcp state in the routers to do it - its >>> just a change to the way we get from the result of one dijkstra or >>> one BGP outcome, to another - its like a generalised version of >>> crankback, then crankforward, but with live traffic >>so you're moving BGP updates around as well? not BGP as we know it today, jim - this is _research_ >>> an added benefit of such a routing algorithm would be that under >>> churn, or heavy flapping, the route performance as perceived by >>> end2end users would be relatively stable.
>>your suggested algorithm _introduces more churn and more >>flapping_ -- and thus more jitter to high-rate traffic. not more flapping - no - flapping is intermittent and oscillatory - i produce piecewise improvements >>Deliberately introducing disruption to gain stability only works if >>you're really really good at control theory and have a tractable >>problem space. Internetworking/comp. sci. people rarely have the >>second, and have generally never been exposed to learning the first. - so i guess my physics (which includes control theory) was irrelevant then:) - or do you mean that most people on this list are compscis? many networking people have EE degrees which have far more control theory actually from year 1...(oh, btw, i taught courses in EE for >10 years, so i was exposed to a lot of it -) if we are to be _polite_ for a change, the problem actually is that we need to use BOTH algorithmics AND control theory, and that requires one to cross a couple of disciplines, which can be done if you have the math, and the patience to read a few books and go on some courses and try out a few things >>(I mean, "crankforward"? It's a googlewhack.) yep >>> what would such an algorithm look like? well, its a bit like the >>> opposite of a link disjoint load balancing routing algorithm - i >>> leave it as an exercise for the reader to come up with a >>> modification to OSPF and BGP to achieve such a goal as there isn't >>> space between these two pixels to write it down...... >>Fermat had the good grace to be right. if you want to be pedantic, we don't know that actually - we know that he claimed a proof, but we do not know if he really had one or whether it was right. just that he asserted it. read the book. >>> >> >>plus I reformatted all jon's crazy linewrapping without grumbling >>about it -- Detlef, be liberal in what you receive, conservative in >>what you send.
Alper and other Outlook users, take note of advice on: >>http://www.ee.surrey.ac.uk/Personal/L.Wood/outlook/ >> >> cheers jon From Jon.Crowcroft at cl.cam.ac.uk Sun Dec 11 05:39:11 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Sun, 11 Dec 2005 13:39:11 +0000 Subject: [e2e] YNT: A Question on the TCP handoff/incremental route change In-Reply-To: Message from Jon Crowcroft of "Sun, 11 Dec 2005 11:39:06 GMT." Message-ID: increasing resolution so we can read between the pixels: so the idea i was getting at was to do with the way both distance/path vector, and link state algorithms choose one route, then, when a better route is discovered, switch abruptly to it. To mitigate the effect (not to get rid of it completely, but to reduce the size of one potential step function in rtt and bottleneck capacity to a set of smaller changes), one can think of routing as a process of recursive re-routing - the idea is quite simple - a current route gets from A to B. There is an outage, or else a new route appears because of the end of an outage (or the introduction of a new link, but let's leave that for now as it's occasional) so we can think of the routers either side of the outage and see if there are routes from any router on the A-B path to the routers either side of an old outage, or if it's a new outage on A-B, from the routers either side of the broken link (yes, and router:), to a better route. How to do this in a distributed way without incurring some huge overhead compared to normal link state? well, let's assume ISPs aren't mad, and that a lot of links that are available for alternate routes are actually part of some planned redundant capacity/topology, rather than accidental (I know this is controversial, as most papers on multipath routing seem to assume that we consider all links in the world, but that's researchers for you - most network providers don't work that way).
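As an editorial aside: the claimed benefit of the staged migration Jon describes — one large RTT step function becoming several small ones — can be sketched with toy numbers. The function name and the values are illustrative only; the post deliberately leaves the concrete algorithm open.

```python
# Toy sketch of a staged route migration: instead of one abrupt switch
# from an old path's RTT to the new path's RTT, the route is migrated in
# `stages` intermediate steps, so the flow sees several small RTT changes.

def rtt_profile(old_rtt, new_rtt, stages):
    """RTT (ms) seen by the flow after each stage of a gradual migration."""
    step = (new_rtt - old_rtt) / stages
    return [old_rtt + step * k for k in range(1, stages + 1)]

# Abrupt switch: one 60 ms jump.  Three-stage migration: three 20 ms jumps.
abrupt = rtt_profile(100.0, 40.0, 1)
gradual = rtt_profile(100.0, 40.0, 3)
assert abrupt == [40.0]
assert gradual == [80.0, 60.0, 40.0]
```

The per-step RTT change shrinks in proportion to the number of stages, which is the "small number of steps, rather than one big step" property claimed for the scheme.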
so then we actually consider the problem _inside out_ - start by reaching points on either side of potential and actual outages, and create a set of routes - how will that work? consider hierarchical OSPF - and you are just labeling routers at each end of outages as in the same level of hierarchy, and links further on as the next level of the hierarchy (can do similar for BGP). how to _stage_ the handover? ok - so we need to cascade the timers for the route update in each level of the hierarchy. How to do that? that's where control theory comes in, maybe, although I was thinking of using a process algebra like stochastic pi calculus myself. j. From detlef.bosau at web.de Sun Dec 11 10:45:37 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Sun, 11 Dec 2005 19:45:37 +0100 Subject: [e2e] TCP in mobile networks, Questions on RLP Message-ID: <439C73D1.96CA039A@web.de> Hi to all. After having read numerous papers on the disastrous interactions between TCP and mobile networks like GPRS and UMTS, I eventually want to understand the reasons for them. Particularly, I want to understand whether these interactions really exist or whether they are pure fantasy....;-) I came out with some questions about the basic design of packet transfer in mobile networks. E.g.: - to me, the RLP frames appear to be chosen quite large, - is there a strict 1 to 1 relationship between RLP and TBF frames? In the TCP context, the question is whether there are adverse interactions which must be dealt with, or there aren't any. I would appreciate getting in touch with someone involved in this issue. A pointer to an appropriate list or forum is also appreciated. Thanks.
Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From demir at kou.edu.tr Fri Dec 9 11:34:32 2005 From: demir at kou.edu.tr (Alper Kamil Demir) Date: Fri, 9 Dec 2005 21:34:32 +0200 Subject: [e2e] YNT: A Question on the TCP handoff Message-ID: >It's off topic, but usually I recommend: Open your CDROM drive. Insert a >Linux installation CD. Powercycle your computer. Follow the instructions >on the screen :-) If it would make life much easier, then why not. However, I don't see any operating-system-related issues here. I have both installed and use either one according to my purpose, needs and convenience. I don't have any profit from any operating system. If I did, certainly, I would choose the most profitable one. I find this topic similar to being a heavy fanatic of an X soccer team where one doesn't have any connection with that team other than his own picking of the team just for personal ego, etc... However, this is off topic. >> This was my question in the whole thread. if this is ever possible or not without >> corrupting e2e semantics of TCP. My approach is that only congestion state is >Welcome to the club :-) So, we're already two guys who do not >understand your approach =8-) Oh, no. I understand our approach. Only, it seems like we are not looking at the same picture. Please see my comments below. >You lost me .... >Let's assume you do no splitting or spoofing and BS is simply a router. >Then CWND on FH tells you how much unacknowledged data, i.e. TCP packets >on their way to MH and ACK packets on their way to FH, is allowed to be >in transit. ACK packets must be counted for the bytes acknowledged in >them, of course. Forget about BS for now and assume that each SA is on BS and SAs are middlewares. If we are talking about mobility, in general, the above problems are inevitable. TCP solves these issues on its own, end to end.
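Detlef's window picture in this exchange (last_ack ... in flight ... last_seq ... free space ... last_ack+CWND, with each arriving ACK clocking out one more packet) can be sketched in a few lines. All names here are illustrative, not taken from any real TCP implementation.

```python
# Minimal sketch of the sliding-window state variables under discussion:
# last_ack .. (in flight) .. last_seq .. (free space) .. last_ack + cwnd.

class Window:
    def __init__(self, cwnd):
        self.last_ack = 0   # highest cumulatively ACKed sequence number
        self.last_seq = 0   # highest sequence number sent so far
        self.cwnd = cwnd    # congestion window, in packets

    def can_send(self):
        # there is free space between last_seq and last_ack + cwnd
        return self.last_seq < self.last_ack + self.cwnd

    def send(self):
        assert self.can_send()
        self.last_seq += 1

    def on_ack(self, ack):
        # each arriving ACK advances the window edge ("ACK clocking")
        self.last_ack = max(self.last_ack, ack)

w = Window(cwnd=4)
while w.can_send():          # fill the initial window
    w.send()
assert not w.can_send()      # window full: 4 packets in flight
w.on_ack(1)                  # one ACK arrives ...
assert w.can_send()          # ... and clocks out exactly one more packet
w.send()
assert not w.can_send()
```

This also makes Detlef's objection concrete: after a reroute, last_ack and last_seq still describe the old path, while cwnd is the quantity the warm-up scheme proposes to overwrite.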
If we are talking about mobility, routing is the main issue. We assume that the network layer solves this issue. Anyway, this is not related to our discussion. TCP is responsible for ACKs. Of course, when MH moves there are some TCP packets and ACKs in flight on the ex-path. However, this is inevitable and TCP will handle these lost TCP packets. There will be no ACK problem as long as TCP semantics are not broken. >You do not know which portion of these are situated between FH and BS, >i.e. SA1 before handover, and which portion is situated between BS >(SA1) and MH. Do I have to know? If TCP data packets or ACKs are lost on the way, then TCP will recover it. Okay. I see why we don't understand each other. I guess this was my mistake. However, I tried to clarify this misunderstanding. Maybe "handover" is not the right terminology to be used. I used update/copy instead of "handover". The "actual connection" is never replaced by the "warm-up connection". Congestion and some other parameters are updated on the "actual connection". These parameters result from the "warm-up connection". The "warm-up connection" is created to find the new path's congestion state from FH to MH in advance, using SA2 as an agent, because MH is not in the cell of SA2 yet. It is known from UMP that MH will move into the cell area of SA2. As a result, a "warm-up connection" is established between FH and SA2. When MH moves into the cell area of SA2, congestion state parameters are updated so that the "actual connection" gets its fair share on the new path. >And exactly that's your problem and I do not see an easy way to overcome >it. Do you think that updating the congestion state parameters of a TCP connection is not easy? >In addition, depending on your protocol, some data may still reside on >SA1 after the handover and might be redirected to MH's new location. This is inevitable. Any approach will face this. As far as I know, this issue is ignored because TCP will eventually solve this.
>So, for a short period of time, FH and SA1 might be competing for >resources. SAs don't compete for resources. They try to find the actual connection's fair share on the new path. SAs create a temporary "warm-up connection" for a short period of time. I don't understand how you concluded that SA1 would compete for resources. I think, according to our reference model, you meant SA2. >> I didn't mean it for you, because I am familiar with your "Path Tail Emulation" work and it >> is referenced there, too. I meant we are on the same track on this. >Hey :-) You read my work? Fine :-) Yes. I did. It is very well put. >Perhaps you could even have a look at my humble try for the "lower >layers"? At the moment I would appreciate some assistance / correction / > feedback in some channel coding issues I only begin to understand. I had a look at it, too. It is very preliminary. I wouldn't say "lower layers in mobile networks used for packet switching". I would say ".... interacting with packet-switching" or some other thing, because packets are not switched on the lower layers. They are frames. I know you may beg to differ. I don't know. I guess it has been used in the literature that way. Just an idea. >For me, there are basically two alternatives for congestion control. For any communication system, there are these two alternatives. This is not new. At least, for a taxonomy from the systems point of view. >1.: Distributed: Each flow cares for itself, each flow is responsive (in >the sense defined by e.g. Sally Floyd). Ideally, such a system will >eventually converge to a fair share of resources between all flows. >2.: Centralized: There is some mechanism / entity which assigns >resources to the individual flows and enforces the flows' rates >throughout the network. IIRC, this approach is conducted in Keshav's PhD > thesis. Somehow, I am truly unable to imagine a centralized approach for the Internet at all.
>A good discussion of this issue can be found in "Myths about Congestion >Management in High-Speed Networks" by Raj Jain in the early 90s. I am not saying that we don't need congestion control. This reminds me of QoS support versus overprovisioning. >If you don't agree with existing congestion mechanisms, you should >propose a better one. Nevertheless, it's commonly accepted that some >kind of congestion control is inevitable. I don't dare not to agree. I respect it very much. >If you place congestion control in L3, you advocate a mechanism as >proposed by Keshav. Right? I am not familiar with Keshav's mechanism, if you are not talking about "fair queuing" or "packet-pair", IIRC!! >First of all, it prevents network "over-utilization" :-) What is network "over-utilization"? >And in contrast to your view, UDP leaves congestion management as well >as reliability issues primarily to the applications. E.g. a rate-controlled >media stream using UDP is responsible for congestion control and >responsiveness on its own. In the past, UDP traffic was not very high. Recently, there has been an increase in UDP traffic and a need for congestion control, as we already know. This was the main reason why UDP didn't implement "congestion control". Some applications tried to implement their own "congestion control". However, we already know that getting it right is not that easy. I don't think that any application or protocol above the network layer is responsible for it in the Internet architecture. What prevents me from sending IP datagrams on the Internet? Standards? To me, if a protocol does not agree with standards, it does not work at all. As long as I use IP and anything on top of it, it works on the Internet. Right? Who controls "congestion" on the Internet? Protocols adopting congestion control on top of IP. If any protocol does not adopt it and is a bad guy, who will punish the bad guy, and how? Moreover, how will I know the bad guy?
>I basically don't follow your mixup of congestion control and QoS. I don't think it is a mixup. To me, congestion control is adopted in some transport layer protocols in order to prevent congestion and hence increase the service quality of IP packets. It is an implicit result. To me, "congestion control" is a QoS mechanism. >QoS is commonly used for _guaranteed_ service. Congestion control >provides for fairness. Are you saying that "best-effort" is not a QoS? Please have a look at Diffserv and Intserv service classes. Maybe not Diffserv yet. I see a mixup with "congestion control" and "fairness". Congestion control does not necessarily have to be fair for every bit of the network. In TCP it is, because there is no bit discrimination on the Internet, yet. >Perhaps, you can invent a reliable UDP based protocol controlled by DCCP. >However, I think this would result in a TCP compatible protocol the >behaviour and purpose of which is similar to that of TCP. This is an integration-versus-differentiation choice. If you integrate, you have the chance of getting a more efficient protocol definition. If you differentiate or layer, you get basic building blocks. This is a very hard decision, I think. You might have useless basic building blocks and less efficient protocols. On the other hand, you might have things duplicated in different places. >So, why don't you simply use TCP? I am, whenever I need it :) >Consequently (you'll foresee my comment ;-)) you do not know in advance >where the bottleneck is, particularly you don't know whether packets >will pile up and eventually are discarded at SA2 or at some intermediate >node between FH and SA2. As I explained above, the "warm-up connection" will solve the bottleneck or congestion state problem on the new path. Why do you think that there is a "warm-up connection" between FH and SA2 for a short period of time before MH moves into the cell area of SA2? >> If it is not on the last mile, then in our approach there is no problem.
However >That's obvious. >But how will you know this in advance? Okay. I explain it again. The "warm-up connection" will know it in advance. That's why we have the "warm-up connection". >> if it is, then we are planning to use an estimation to take this into account. > Oh yes ;-) No other solution can truly solve this problem. There is no choice other than estimating it and taking it into account. >I see, I have to work somewhat faster ;-) For the rest of the world: >That's an actual competition between Turkey and Germany =:-) No, you don't need to ;)) Competition on what? Your comments and discussions with you have been very useful, really. However, if you like to compete, let the championship begin :))) Have there been any unreal competitions? > I would say a rate consequence of proper "ack pacing" constrained with RTT. ^^^^^^^^^^^^^^^^^^^^^^^^^^^ I mean the rate depends strongly on RTT when it depends on "ack pacing". >I will have a look whether I find it. It may take some time, I don't >remember where the paper of Michael Meyer was published. Thank you very much. Alper K. Demir From demir at kou.edu.tr Mon Dec 12 07:22:23 2005 From: demir at kou.edu.tr (Alper Kamil Demir) Date: Mon, 12 Dec 2005 17:22:23 +0200 Subject: [e2e] YNT: A Question on the TCP handoff Message-ID: >From the basic ideal, the rate of a TCP flow is independent of the RTT. >Practically, it may oscillate because traffic might be bursty and >irregular. Could you please check RFC3448 (TCP Friendly Rate Control (TFRC): Protocol Specification). It does depend on RTT. >> To me, "congestion control" is a QoS mechanism. >I totally disagree here. >It's exactly the other way round. Of course, QoS architectures can >prevent congestion. >However, "congestion control" does not provide any kind of QoS. How do you put it the other way round? What is an "expected rate" of TCP? Isn't it a QoS specification? I think it is. Any traffic control is a QoS mechanism.
One can employ a QoS mechanism in order to utilize network resources and/or differentiate traffic. >QoS architectures do scheduling, admission control etc., hence if this >is properly done congestion collapse must not happen. Not necessarily. Some traffic engineering or overprovisioning could provide some sort of QoS. You might have a "best effort" or some sort of default service class and still need to implement some sort of congestion control mechanism so that traffic collapse does not happen. >Eventually I think I understand what you're doing. I am very happy that, finally, you understand our approach. >Your "warm up connection" shall occupy those "packet slots" from FH to >SA2 which are later being used by your TCP connection: >When your mobile enters the new cell, I think you simply drop the warm >up connection and update the state variables in your >"old" TCP connection which then is routed along the new path. That is exactly what we do. I am happy that we got each other. >last_ack....(on the fly)....last_seq..(free space)....last_ack+CWND. >Ideally, in settled state each ACK which arrives at the sender clocks >out one TCP packet. >And each TCP packet which arrives at the receiver, clocks out one ACK >packet. (I intentionally ignore delayed ACKs here.) That's correct. I don't see any problem with the last ACK sent by the receiver (MH). Eventually, it will be received by the sender (FH) and will clock out one TCP packet. The rest is up to routing to get the packet to the receiver. Of course, oscillations will occur due to mobility. This is inevitable. >Now, when you reroute your flow, you don't exactly know how much of your >CWND is currently occupied, you don't know how >to set last_seq. Or your scoreboard or whatever. This is due to routing and inevitable. During that time, some sort of oscillations will happen. However, I don't think that any other proposed mechanisms have dealt with these routing issues. If so, any reference???
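The update/copy step Alper describes — sequencing state left untouched, congestion state taken over from the warm-up connection at handover — can be sketched as follows. The field names are hypothetical; the thread does not specify exactly which variables are copied beyond the congestion parameters.

```python
# Minimal sketch of the handover update in the "warm-up connection"
# scheme: the actual connection keeps its reliability state (last_ack,
# last_seq, scoreboard) and adopts only the congestion/path state that
# the warm-up connection probed on the FH-to-SA2 path.

def adopt_warmup_state(actual, warmup):
    """Copy congestion/path parameters; leave sequencing state alone."""
    for key in ("cwnd", "ssthresh", "srtt"):
        actual[key] = warmup[key]
    return actual

actual = {"last_ack": 1000, "last_seq": 1012,     # untouched at handover
          "cwnd": 24, "ssthresh": 32, "srtt": 0.120}
warmup = {"cwnd": 8, "ssthresh": 16, "srtt": 0.180}

adopt_warmup_state(actual, warmup)
assert actual["cwnd"] == 8 and actual["srtt"] == 0.180
assert actual["last_seq"] == 1012                 # sequencing untouched
```

Detlef's objection in this exchange is precisely that the copied variables and the untouched ones are related, and the sketch makes that split visible: nothing reconciles the old in-flight data (last_seq - last_ack) with the new cwnd.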
>You don't know whether your achievable throughput has changed. I know the achievable throughput obtained from the "warm-up connection". Am I missing something? >So you update a few of your state variables whereas you leave other ones >(last_seq, last_ack, scoreboard in TCP/Reno) completely >untouched. The relationship between these variables is ignored. That's correct. The rest would be enhancements for different TCP variants. Still, I am unable to see how last_seq and last_ack will affect our approach. >Simply spoken: I don't buy it. Seems like we won't be able to sell it to you. >Your path changes and so do its properties. Consequently, TCP will >settle and adapt. However, adaptation would be much faster than with any other mechanism, due to the adaptation parameters of the "warm-up connection" being used immediately, so that TCP will settle and adapt to the path change (won't go into slow start due to timeout, etc.). >In addition: Which problem do you intend to solve? We talked about >mechanisms a lot. However, it's not quite clear which problem >we solve in your approach. The same problem tackled in I-TCP, Freeze-TCP, M-TCP, Snoop, etc... We will compare them with our approach. However, I am unable to find their implementation on ns-2. I am working on this. Alper K. Demir From detlef.bosau at web.de Mon Dec 12 09:49:43 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 12 Dec 2005 18:49:43 +0100 Subject: [e2e] YNT: A Question on the TCP handoff References: Message-ID: <439DB837.56F92E36@web.de> Alper Kamil Demir wrote: > > >From the basic ideal, the rate of a TCP flow is independent of the RTT. > >Practically, it may oscillate because traffic might be bursty and > >irregular. > Could you please check RFC3448 (TCP Friendly Rate Control (TFRC): > Protocol Specification). It does depend on RTT. > I cannot really imagine that the authors went wrong here, so it may be a misunderstanding here.
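For reference, the throughput equation that RFC 3448 (Section 3.1) specifies, and which Alper is pointing to, has the RTT explicitly in the denominator. A quick check with illustrative parameter values (segment size, RTT, and loss rate chosen here only for the demonstration):

```python
# The TFRC throughput equation from RFC 3448, Section 3.1:
#   X = s / (R*sqrt(2*b*p/3) + t_RTO*(3*sqrt(3*b*p/8))*p*(1 + 32*p^2))
# with s = segment size (bytes), R = RTT (s), p = loss event rate,
# b = packets acknowledged per ACK, t_RTO recommended as 4*R.

from math import sqrt

def tfrc_rate(s, R, p, b=1, t_rto=None):
    """Allowed sending rate in bytes/second."""
    if t_rto is None:
        t_rto = 4 * R                      # RFC 3448's recommended setting
    return s / (R * sqrt(2 * b * p / 3)
                + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))

# With t_RTO = 4*R, both denominator terms scale with R, so halving the
# RTT roughly doubles the allowed rate, all else being equal.
fast = tfrc_rate(s=1460, R=0.05, p=0.01)
slow = tfrc_rate(s=1460, R=0.10, p=0.01)
assert 1.9 * slow < fast < 2.1 * slow
```

This supports Alper's reading of the RFC: the TCP-friendly rate does depend on RTT, which is a different statement from Detlef's point about the physical independence of link rate and path length.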
I said: From the basic ideal, the rate of a TCP flow is independent of the RTT. Let's draw a reference network: A-----------------------B (Ethernet, 50 Meter) And now let's keep the rate and shorten the RTT: A-----------B (Ethernet, 25 Meter) And now we build a really big RTT: A--------------------------------------....---------------------B (Ethernet, fibre, 1000 Meter) At least in this example, RTT and rate are independent. Maybe you ignore the concrete case and consider the "more general scenario".... The RTT depends on the path length / capacity, which is independent of the throughput. O.k., I see, the congestion sawtooth increases more slowly as the RTT increases. However, this is not my basic thought. The basic thought is that it's tough to build a really convincing RTT/CWND relationship. I did not think about this in detail and I know that there are many attempts being made to do so. However, to me it appears difficult to do it exactly. > >> To me, "congestion control" is a QoS mechanism. > >I totally disagree here. > >It's exactly the other way round. Of course, QoS architectures can > >prevent congestion. > >However, "congestion control" does not provide any kind of QoS. > How do you put it the other way round? What is an "expected rate" > of TCP? isn't it a QoS specification? I think it is. Any traffic control is No. "Expectation" is a term of statistics. And a pure description. Nothing more, nothing less. > a QoS mechanism. One can employ a QoS mechanism in order to utilize > the network resources and/or traffic differentiation. Exactly. One can. QoS mechanisms can be used to control / prevent congestion. > > >QoS architectures do scheduling, admission control etc., hence if this > >is properly done congestion collapse must not happen. > Not necessarily. Some traffic engineering, overprovisioning could The list given was neither complete nor compulsory. > provide some sort of QoS. You might have a "best effort" or some We talk about the Internet, right?
Maybe you can do overprovisioning to your convenience in a small LAN or some hostel. In the Internet, bottlenecks happen to occur. > sort of default service class and still need to implement some sort of > congestion control mechanism so that traffic collapse must not happen. Again: this is the typical VoIP justification in small companies and offices. It works in this context. But I don't believe in overprovisioning for the Internet. In addition, congestion control by overprovisioning means making router queues so big that they cannot be exhausted by the existing flows. I think the disadvantages of oversized router queues are pretty well known. > That is exactly what we do. I am happy that we got each other. :-) > > >last_ack....(on the fly)....last_seq..(free space)....last_ack+CWND. > >Ideally, in settled state each ACK which arrives at the sender clocks > >out one TCP packet. > >And each TCP packet which arrives at the receiver, clocks out one ACK > >packet. (I intentionally ignore delayed ACKs here.) > That's correct. I don't see any problem with the last ACK sent by the receiver (MH). In my consideration, last_ack is a state variable. > Eventually, it will be received by the sender (FH) and will clock out one TCP packet. > The rest is up to routing to get the packet to the receiver. Of course, oscillations > will occur due to mobility. This is inevitable. Even if MH does not move, oscillations will occur: 1. of course, from the usual congestion sawtooth; 2. as a result of delay variation due to noise. > > >Now, when you reroute your flow, you don't exactly know how much of your > >CWND is currently occupied, you don't know how > >to set last_seq. Or your scoreboard or whatever. > This is due to routing and inevitable. During that time, some sort of > oscillations will happen. However, I don't think that any other proposed > mechanisms have dealt with these routing issues. If so, any reference??? With which routing issue? Perhaps your solution is clear to me.
Now we have to find the matching problem %-) > > >You don't know whether your achievable throughput has changed. > I know the achievable throughput obtained from the "warm-up connection". Yes. And is this achievable by the wireless path as well? That's why I asked where your bottleneck is ;-) Or your argument is: If the wired path is the bottleneck, we can't help it. If the wired path is, the first thing that happens is severe congestion - - if, yes if, you somehow obtain the ACK packets which clock out your TCP packets. Otherwise your warm-up connection would be of no use. > > >So you update a few of your state variables whereas you leave other ones > >(last_seq, last_ack, scoreboard in TCP/Reno) completely > >untouched. The relationship between these variables is ignored. > That's correct. The rest would be enhancements for different TCP > variants. Still, I am unable to see how last_seq and last_ack will > affect our approach. > > >Simply spoken: I don't buy it. > Seems like we won't be able to sell it to you. :-) Basically, I think you have two or three target groups: 1.: If you pursue a PhD: Your supervisor. (Most important.) 2.: The community. (Also important.) 3.: Me :-) (not so important and perhaps too stupid to understand your approach :-)) (I think the "Theory of multiple target groups" is known to you? ;-)) > > >Your path changes and so do its properties. Consequently, TCP will > >settle and adapt. > However, adaptation would be much faster than with any other mechanism, > due to the adaptation parameters of the "warm-up connection" being used > immediately, so that TCP will settle and adapt to the path change (won't go into > slow start due to timeout, etc.). Adaptation? What is adapted to what here? > > >In addition: Which problem do you intend to solve? We talked about > >mechanisms a lot. However, it's not quite clear which problem > >we solve in your approach. > The same problem tackled in I-TCP, Freeze-TCP, M-TCP, Snoop, etc...
We > will compare them with our approach. However, I am unable to find their What are your criteria? Do you intend to build a new TCP flavor just for "good looks"? Or is there a certain problem you want to solve? In addition: I'm not quite sure whether all the mentioned approaches really tackle the same problem. > implementation on ns-2. I am working on this. Fine! So, you have the first step on your schedule: Add dynamic flow control to the NS-2 implementation of TCP, otherwise I-TCP, Freeze-TCP and M-TCP will not work as intended... Have fun! Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From matta at cs.bu.edu Mon Dec 12 20:50:29 2005 From: matta at cs.bu.edu (Ibrahim Matta) Date: Mon, 12 Dec 2005 23:50:29 -0500 Subject: [e2e] YNT: A Question on the TCP handoff/incremental routechange Message-ID: <0511C607B17F804EBE96FFECD1FD98595E51E7@cs-exs2.cs-nt.bu.edu> Jon, how about easing the failure of a link (or introduction of a new link) by gradually changing the "cost" ("metric") of the link? E.g., see: The revised ARPANET routing metric, ACM SIGCOMM Computer Communication Review, Volume 19, Issue 4 (September 1989). Authors: A. Khanna, J. Zinky, BBN Communications Corporation. By limiting the change in successively advertised link cost (cf. "movement limit" in this paper), you can avoid abrupt changes (e.g. due to "infinite" link cost) and show better convergence by contractive mapping of routing metric = function(traffic load) and traffic load = function(routing metric)... In general, abrupt changes in routes can be avoided by making sure the link in question does not look all of a sudden too attractive or too unattractive compared to other links.
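The "movement limit" Ibrahim cites can be sketched as a clamp on successive cost advertisements. The numbers are toy values for illustration only; the actual revised ARPANET metric in the Khanna/Zinky paper is more involved.

```python
# Toy sketch of a per-update "movement limit" on advertised link cost:
# each new advertisement may differ from the previous one by at most
# `limit`, so a failing link grows expensive over several updates
# instead of jumping straight to an effectively infinite cost.

def advertised_cost(prev, measured, limit):
    """Clamp the advertised cost change to +/- limit per update."""
    return max(prev - limit, min(prev + limit, measured))

costs, prev = [], 10
for _ in range(4):                 # link fails: true cost jumps to 100
    prev = advertised_cost(prev, measured=100, limit=25)
    costs.append(prev)
assert costs == [35, 60, 85, 100]  # gradual ramp, not one abrupt step
```

This is the metric-space analogue of Jon's scope-based staging: the route change is still driven by ordinary shortest-path computation, but the inputs to it are prevented from moving abruptly.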
Cheers, Ibrahim -----Original Message----- From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Jon Crowcroft Sent: Sunday, December 11, 2005 8:39 AM To: end2end-interest at postel.org Subject: Re: [e2e] YNT: A Question on the TCP handoff/incremental routechange increasing resolution so we can read between the pixels: so the idea i was getting at was to do with the way both distance/path vector, and link state algorithms choose one route, then, when a better route is discovered, switch abruptly to it. To mitigate the effect (not to get rid of it completely, but to reduce the size of one potential step function in rtt and bottleneck capacity to a set of smaller changes), one can think of routing as a process of recursive re-routing - the idea is quite simple - a current route gets from A to B. There is an outage, or else a new route appears because of the end of an outage (or the introduction of a new link, but let's leave that for now as it's occasional) so we can think of the routers either side of the outage and see if there are routes from any router on the A-B path to the routers either side of an old outage, or if it's a new outage on A-B, from the routers either side of the broken link (yes, and router:), to a better route. How to do this in a distributed way without incurring some huge overhead compared to normal link state? well, let's assume ISPs aren't mad, and that a lot of links that are available for alternate routes are actually part of some planned redundant capacity/topology, rather than accidental (I know this is controversial, as most papers on multipath routing seem to assume that we consider all links in the world, but that's researchers for you - most network providers don't work that way). so then we actually consider the problem _inside out_ - start by reaching points on either side of potential and actual outages, and create a set of routes - how will that work?
consider hierarchical OSPF - and you are just labelling routers at each end of outages as in the same level hierarchy, and links further on as the next level of the hierarchy (can do similar for BGP). how to _stage_ the handover ? ok - so we need to cascade the timers for the route update in each level of the hierarchy. How to do that? that's where control theory comes in, maybe although I was thinking of using a process algebra like stochastic pi calculus myself. j. From Jon.Crowcroft at cl.cam.ac.uk Tue Dec 13 01:18:42 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Tue, 13 Dec 2005 09:18:42 +0000 Subject: [e2e] YNT: A Question on the TCP handoff/incremental routechange In-Reply-To: Message from "Ibrahim Matta" of "Mon, 12 Dec 2005 23:50:29 EST." <0511C607B17F804EBE96FFECD1FD98595E51E7@cs-exs2.cs-nt.bu.edu> Message-ID: hmm, interesting - so basically another way of thinking about this is to greatly expand the metric space over which routes are computed, so that routes that are sort of a bit better but not as good as the best route after a change can look good enough for a while....time dependent metrics (rather than my scheme which was scope based)... seems like something one could look at - there's work by randy bush that shows that when the routing control plane is attacked, the forwarding/data plane is remarkably stable despite inability of new routes to be computed - i wonder if path vector sort of has a behaviour that is a bit like this anyway, and what we are skating around the edges of is some formalisation of a way to make routing vary more smoothly... seems like a good masters project to evaluate your and my proposals... In missive <0511C607B17F804EBE96FFECD1FD98595E51E7 at cs-exs2.cs-nt.bu.edu>, "Ibrahim Matta" typed: >>Jon, how about easing the failure of a link (or introduction of a new >>link) by gradually changing the "cost" ("metric") of the link? 
cheers jon From detlef.bosau at web.de Tue Dec 13 09:46:07 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 13 Dec 2005 18:46:07 +0100 Subject: [e2e] YNT: A Question on the TCP handoff References: Message-ID: <439F08DF.E630C8A8@web.de> Alper Kamil Demir wrote: > >Perhaps you could even have a look at my humble try for the "lower > >layers"? At the moment I would appreciate some assistance / correction / > > feedback in some channel coding issues I only begin to understand. > I had a look at it, too. It is very preliminary. I wouldn't say > "lower layers in mobile networks used for packet switching". I would > say ".... interacting with packet-switching" or some other thing cause > packets are not switched on the lower layers. They are frames. I know, > I hear you beg to differ. I don't know. I guess it has been > used in the literature that way. Just an idea. > > Admittedly, I'm somewhat disappointed here. Of course, I said my table is "preliminary". However, this is not an invitation to write: "Wonderful! It's preliminary! It's far from complete!" Translated from "ancient Greek" to modern English this means nothing else than: "You're an idiot." It's exactly the same experience as if you submit a paper and the reviewer comment is simple, precise, short: "shit." This is always extremely helpful :-) When I presented this link here, this was an honest invitation for correction and comments. O.k., when there is no time to comment on it, it's o.k. But I don't like comments like: "It's wrong." or "It's not correct." or something else without being told what's incorrect or what's missing. I admittedly find this annoying. You're free to call me an idiot - if only you give reasons for it. It's just my personal opinion, I don't want to offend anybody. 
Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From randy at psg.com Tue Dec 13 16:09:06 2005 From: randy at psg.com (Randy Bush) Date: Wed, 14 Dec 2005 07:09:06 +0700 Subject: [e2e] YNT: A Question on the TCP handoff/incremental routechange References: <0511C607B17F804EBE96FFECD1FD98595E51E7@cs-exs2.cs-nt.bu.edu> Message-ID: <17311.25250.646602.502526@roam.psg.com> > seems like something one could look at - there's work by randy bush > that shows that when the routing control plane is attacked, the > forwarding/data plane is remarkably stable despite inability of > new routes to be computed often what seems to happen is what we call "curve ball routing," though i am sure you cricket players have a different term. new routing propagates in sort of a wave front. packets proceed from source on old routing information until they reach a router to which new information has propagated, and then they proceed on new data. we have empirical reason to suspect that a packet can pass through more than two (old and new) routing information states en route to a destination. and credit should also go to tim, morley, jun, and a couple of oregon grad students (see upcoming pam2006 paper). randy From imcdnzl at gmail.com Mon Dec 12 15:00:58 2005 From: imcdnzl at gmail.com (Ian McDonald) Date: Tue, 13 Dec 2005 12:00:58 +1300 Subject: [e2e] YNT: A Question on the TCP handoff In-Reply-To: References: Message-ID: On 12/13/05, Alper Kamil Demir wrote: > >From the basic idea, the rate of a TCP flow is independent of the RTT. > >Practically, it may oscillate because traffic might be bursty and > >irregular. > Could you please check RFC3448 (TCP Friendly Rate Control (TFRC): > Protocol Specification). It does depend on RTT. > You might like to look at http://wand.net.nz/~perry/max_download.php where a colleague of mine has made a TCP rate calculator. 
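For reference, the TFRC throughput equation from RFC 3448 (section 3.1) can be sketched as follows (function name and defaults are illustrative):

```python
from math import sqrt

def tfrc_rate(s, R, p, b=1):
    """Approximate TFRC allowed sending rate in bytes/sec (RFC 3448, sec. 3.1).

    s: segment size (bytes), R: round-trip time (seconds),
    p: loss event rate (0 < p <= 1), b: packets acknowledged per ACK.
    Uses the RFC's recommended simplification t_RTO = 4 * R.
    """
    t_rto = 4 * R
    f_p = R * sqrt(2 * b * p / 3) + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2)
    return s / f_p
```

With t_RTO tied to R, both denominator terms scale with the RTT, so the computed rate is inversely proportional to the RTT, which is exactly the dependence being pointed out above.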
Also useful for DCCP CCID3 etc. as it uses the TFRC calculation as well. -- Ian McDonald http://wand.net.nz/~iam4 WAND Network Research Group University of Waikato New Zealand From kelsayed at gmail.com Mon Dec 19 04:01:28 2005 From: kelsayed at gmail.com (Khaled Elsayed) Date: Mon, 19 Dec 2005 14:01:28 +0200 Subject: [e2e] Scheduling+ARQ In-Reply-To: <438D01AB.7070702@cs.pdx.edu> References: <438D01AB.7070702@cs.pdx.edu> Message-ID: <43A6A118.2070709@gmail.com> Hi All, Given a link-level QoS-oriented scheduler on a system implementing link-level ARQ, I am trying to make a choice between the following architectural alternatives: 1- Alternative 1: When a packet/frame arrives at the data link layer it is first handled by the scheduler. When the packet is scheduled for transmission and it belongs to an ARQ-enabled connection it is forwarded to the ARQ subsystem which updates its state machine and then the packet is forwarded over the PHY (transmission) queues. In this case, the problem is when a packet is lost, how to reschedule the retransmission of the packet in a way that keeps the order of the original packet stream? 2- Alternative 2: When a packet/frame arrives at the data link layer, it is checked if it belongs to an ARQ-enabled connection, if so it is forwarded to the ARQ subsystem and after the state machine update it is forwarded to the scheduler. The scheduler schedules the packet among those in its queues and then the packet is forwarded to the PHY transmission queues. In this case if a packet is lost, the ARQ system will simply send the packet to the scheduler keeping the original order. 
In other words, do we have:

------         ------         ------
      |  -->         |  -->         |
------         ------         ------
SCH            ARQ            TX

OR

------         ------         ------
      |  -->         |  -->         |
------         ------         ------
ARQ            SCH            TX

I am more inclined towards the second for better handling of lost/dropped packets, but for some reason (actually just a feeling :) I think there is some architectural mistake in doing that (I always thought that ARQ is tightly coupled with the PHY TX queues). Any comments? Thanks, Khaled Elsayed -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.postel.org/pipermail/end2end-interest/attachments/20051219/f64556e9/attachment.html From d.manjunath at gmail.com Mon Dec 19 04:45:43 2005 From: d.manjunath at gmail.com (Manjunath D) Date: Mon, 19 Dec 2005 04:45:43 -0800 Subject: [e2e] Analysis of Traceroute Message-ID: Hi, Could you please help me understand the following traceroute output? a) My point of doubt is that from the 2nd hop the packets are hitting the same host (203.200.10.209) and as you can see, this repeats intermittently for the following hops. (a.1) How can the packets travel back to the same host again? (a.2) Does this indicate there was a loop formed (due to a router fault?). (a.3) Or worse, is this a bug in the Solaris 10 version of the traceroute utility? b) what does !H indicate in the hop output? 
My System:> Solaris 10 on Sparc Thanks >> Manjunath ================================================ Traceroute - capture - Start =======================
traceroute 198.133.219.25
traceroute to 198.133.219.25 (198.133.219.25), 30 hops max, 40 byte packets
 1  fire.mycompany.com (192.168.3.1)  1.071 ms  0.443 ms  0.323 ms
 2  203.200.10.209 (203.200.10.209)  1.662 ms  1.716 ms  1.538 ms
 3  * * *
 4  * 203.200.10.209 (203.200.10.209)  1.794 ms !H *
 5  * * *
 6  * * *
 7  * * *
 8  203.200.10.209 (203.200.10.209)  2.678 ms !H * *
 9  * 203.200.10.209 (203.200.10.209)  2.548 ms !H *
10  * * *
11  * * 203.200.10.209 (203.200.10.209)  1.937 ms !H
12  * * *
================================================ Traceroute - capture - End ======================= -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.postel.org/pipermail/end2end-interest/attachments/20051219/eb837ca9/attachment.html From craig at aland.bbn.com Mon Dec 19 06:07:11 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Mon, 19 Dec 2005 09:07:11 -0500 Subject: [e2e] Scheduling+ARQ In-Reply-To: Your message of "Mon, 19 Dec 2005 14:01:28 +0200." <43A6A118.2070709@gmail.com> Message-ID: <20051219140711.EEC194D@aland.bbn.com> I think the second solution is the only right one. In the first case, when ARQ is retransmitting, it will use more bandwidth than it was scheduled for. Craig In message <43A6A118.2070709 at gmail.com>, Khaled Elsayed writes: >Hi All, > >Given a link-level QoS-oriented scheduler on a system implementing >link-level ARQ, I am trying to make a choice between the following >architectural alternatives: > >1- Alternative 1: When a packet/frame arrives at the data link layer it >is first handled by the scheduler. 
When the packet is scheduled for >transmission and it belongs to an ARQ-enabled connection it is >forwarded to the ARQ subsystem which updates its state machine and then >the packet is forwarded over the PHY (transmission) queues. In this >case, the problem is when a packet is lost, how to reschedule the >retransmission of the packet in a way that keeps the order of the >original packet stream? > >2- Alternative 2: When a packet/frame arrives at the data link layer, >it is checked if it belongs to an ARQ-enabled connection, if so it is >forwarded to the ARQ subsystem and after the state machine update it is >forwarded to the scheduler. The scheduler schedules the packet among >those in its queues and then the packet is forwarded to the PHY >transmission queues. In this case if a packet is lost, the ARQ system >will simply send the packet to the scheduler keeping the original order. > >In other words, do we have: >
>------         ------         ------
>      |  -->         |  -->         |
>------         ------         ------
>SCH            ARQ            TX
>
>OR
>
>------         ------         ------
>      |  -->         |  -->         |
>------         ------         ------
>ARQ            SCH            TX
>
>I am more inclined towards the second for better handling of >lost/dropped packets, but for some reason (actually just a feeling :) I >think there is some architectural mistake in doing that (I always >thought that ARQ is tightly coupled with the PHY TX queues). Any comments? > >Thanks, > >Khaled Elsayed
From cottrell at slac.stanford.edu Mon Dec 19 08:26:06 2005 From: cottrell at slac.stanford.edu (Cottrell, Les) Date: Mon, 19 Dec 2005 08:26:06 -0800 Subject: [e2e] Analysis of Traceroute Message-ID: <35C208A168A04B4EB99D1E13F2A4DB0101425995@exch-mail1.win.slac.stanford.edu> My understanding is: the !H means the host requested (198.133.219.25) is unreachable from this router, 203.200.10.209 (the man pages mention this). Traceroute then increments the TTL by 1 and sends the default three more UDP probes until the TTL reaches the default max of 30. It looks like the router sometimes disregards the probes (maybe due to rate limiting), so traceroute gives the timeout response of * for each such probe, and sometimes gives the ICMP host unreachable response. ________________________________ From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of Manjunath D Sent: Monday, December 19, 2005 4:46 AM To: end2end-interest at postel.org Subject: [e2e] Analysis of Traceroute Hi, Could you please help me understand the following traceroute output? a) My point of doubt is that from the 2nd hop the packets are hitting the same host (203.200.10.209) and as you can see, this repeats intermittently for the following hops. (a.1) How can the packets travel back to the same host again? (a.2) Does this indicate there was a loop formed (due to a router fault?). (a.3) Or worse, is this a bug in the Solaris 10 version of the traceroute utility? b) what does !H indicate in the hop output? 
My System:> Solaris 10 on Sparc Thanks >> Manjunath ================================================ Traceroute - capture - Start =======================
traceroute 198.133.219.25
traceroute to 198.133.219.25 (198.133.219.25), 30 hops max, 40 byte packets
 1  fire.mycompany.com (192.168.3.1)  1.071 ms  0.443 ms  0.323 ms
 2  203.200.10.209 (203.200.10.209)  1.662 ms  1.716 ms  1.538 ms
 3  * * *
 4  * 203.200.10.209 (203.200.10.209)  1.794 ms !H *
 5  * * *
 6  * * *
 7  * * *
 8  203.200.10.209 (203.200.10.209)  2.678 ms !H * *
 9  * 203.200.10.209 (203.200.10.209)  2.548 ms !H *
10  * * *
11  * * 203.200.10.209 (203.200.10.209)  1.937 ms !H
12  * * *
================================================ Traceroute - capture - End ======================= From baruch at ev-en.org Mon Dec 19 12:26:54 2005 From: baruch at ev-en.org (Baruch Even) Date: Mon, 19 Dec 2005 20:26:54 +0000 Subject: [e2e] SACK performance improvements - technical report and updated 2.6.6 patches Message-ID: <43A7178E.4030808@ev-en.org> Hello, I wanted to post an update about my work on SACK performance improvements; I've updated the patches on our website and added a technical report on the work so far. It can be found at: http://hamilton.ie/net/research.htm#patches In summary: the Linux stack so far is unable to effectively handle single transfers at 1Gbps over high-RTT links (220 ms RTT is what we tested). The sender is unable to process the ACK packets fast enough, causing lost ACKs and increased transfer times. Our work resulted in a set of patches that enable the Linux TCP stack to handle this load without breaking a sweat. Your comments on this work would be appreciated. Regards, Baruch From craig at aland.bbn.com Mon Dec 19 13:45:37 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Mon, 19 Dec 2005 16:45:37 -0500 Subject: [e2e] SACK performance improvements - technical report and updated 2.6.6 patches In-Reply-To: Your message of "Mon, 19 Dec 2005 20:26:54 GMT." 
<43A7178E.4030808@ev-en.org> Message-ID: <20051219214537.6853B4D@aland.bbn.com> Thanks for sharing! I found the report fun reading in the sense that I don't know of many folks who've worked on slow path performance improvement. However, having written some of these kinds of papers in the past, I'll point out that before it could be considered publishable, it needs a much clearer explanation of the algorithms in the actual code and precisely how they were modified. For instance, it wasn't clear to me if the SACK block code walked the list of outstanding segments for each SACK block or walked the list of segments once, checking all the SACK blocks (they are both n*s algorithms, but the second algorithm will have decidedly better performance due to locality and ordering tricks you can play). Also, it was not clear why the revised algorithm grows as O(lost packets) vs O(cwnd). Thanks! Craig In message <43A7178E.4030808 at ev-en.org>, Baruch Even writes: >Hello, > >I wanted to post an update about my work for SACK performance >improvements, I've updated the patches on our website and added a >technical report on the work so far. > >It can be found at: >http://hamilton.ie/net/research.htm#patches > >In summary: The Linux stack so far is unable to effectively handle >single transfers on 1Gbps with high rtt links (220 ms rtt is what we >tested). The sender is unable to process the ACK packets fast enough >causing lost ACKs and increased transfer times. Our work resulted in a >set of patches that enable the Linux TCP stack to handle this load >without breaking sweat. > >Your comments on this work would be appreciated. 
> >Regards, >Baruch From rhee at eos.ncsu.edu Mon Dec 19 18:11:40 2005 From: rhee at eos.ncsu.edu (Injong Rhee) Date: Mon, 19 Dec 2005 21:11:40 -0500 Subject: [e2e] SACK performance improvements - technical report and updated 2.6.6 patches In-Reply-To: <20051219.132256.61038824.davem@davemloft.net> Message-ID: <200512200213.jBK2DbJl019826@uni03mr.unity.ncsu.edu> I wonder the same. I wonder how this new patch by the HTCP folks improves what we provided for the 2.6.x (which is currently incorporated in the latest linux version). My recollection is that this HTCP patch crashed the system very often -- so we could not run the comparison. BTW, this fast SACK path fix we provided is just a simple clean-up and modification of Tom Kelly's original SACK code - so Tom deserves the full credit for it. > -----Original Message----- > From: netdev-owner at vger.kernel.org [mailto:netdev- > owner at vger.kernel.org] On Behalf Of David S. Miller > Sent: Monday, December 19, 2005 4:23 PM > To: baruch at ev-en.org > Cc: netdev at vger.kernel.org; end2end-interest at postel.org; > d.leith at eee.strath.ac.uk > Subject: Re: SACK performance improvements - technical report and > updated 2.6.6 patches > > From: Baruch Even > Date: Mon, 19 Dec 2005 20:26:54 +0000 > > > Your comments on this work would be appreciated. > > Ummm... how about the patches that fix this which are in the 2.6.x > kernel already? > > Yes, it's not your stuff, but it was incredibly less invasive and > probably works nearly as well. 
> - To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html From rhee at eos.ncsu.edu Tue Dec 20 01:41:56 2005 From: rhee at eos.ncsu.edu (Injong Rhee) Date: Tue, 20 Dec 2005 04:41:56 -0500 Subject: [e2e] SACK performance improvements - technical report and updated 2.6.6 patches In-Reply-To: <20051220.010843.09466616.davem@davemloft.net> Message-ID: <200512200943.jBK9hr8C016703@uni03mr.unity.ncsu.edu> Ditto. I remember we had some discussion on this sometime back in the netdev mailing list (Baruch was part of the discussion). > -----Original Message----- > From: David S. Miller [mailto:davem at davemloft.net] > Sent: Tuesday, December 20, 2005 4:09 AM > To: doug at eee.strath.ac.uk > Cc: rhee at eos.ncsu.edu; baruch at ev-en.org; netdev at vger.kernel.org; > end2end-interest at postel.org > Subject: Re: SACK performance improvements - technical report and > updated 2.6.6 patches > > From: "Douglas Leith" > Date: Tue, 20 Dec 2005 08:40:26 -0000 > > > Well some feedback to that effect might have been useful a while > > back Dave. > > I gave him feedback on at least 5 separate occasions, both > publicly and in private correspondence. > > Others have done so as well. 
From detlef.bosau at web.de Tue Dec 20 15:45:56 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 21 Dec 2005 00:45:56 +0100 Subject: [e2e] Scheduling+ARQ References: <438D01AB.7070702@cs.pdx.edu> <43A6A118.2070709@gmail.com> Message-ID: <43A897B4.72AA1ACC@web.de> > Khaled Elsayed wrote: > > Hi All, > > Given a link-level QoS-oriented scheduler on a system implementing > link-level ARQ, I am trying to make a choice between the following > architectural alternatives: > When you look in my model for packet switching mobile networks (which is still under construction) http://www.detlef-bosau.de/layers.html scheduling is part of the "MAC" layer, ARQ is part of the local recovery layer. Scheduling is a MAC issue, so it should be between ARQ and PHY, as Craig has already written. Perhaps the model on my homepage could help to associate functional parts in mobile networks with corresponding OSI layers. It's not a strict mapping. But it's a helpful orientation. -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From weiye at ISI.EDU Fri Dec 23 00:23:19 2005 From: weiye at ISI.EDU (Wei Ye) Date: Fri, 23 Dec 2005 00:23:19 -0800 Subject: [e2e] CFP: ACM SenSys 2006 Message-ID: <1135326199.3347.2.camel@localhost.localdomain> Our apologies if you have received multiple copies of the CFP. Wei Ye and Cormac J. Sreenan SenSys'06 Publicity Co-Chairs --------- ACM SenSys 2006: Call for Papers The 4th ACM Conference on Embedded Networked Sensor Systems November 1-3, 2006 Boulder, Colorado, USA http://www.isi.edu/sensys2006/ Sponsored by ACM SIGCOMM, SIGMOBILE, SIGARCH, SIGOPS, SIGMETRICS and SIGBED; with support from NSF. The 4th ACM Conference on Embedded Networked Sensor Systems (SenSys) is a highly selective, single-track forum for the presentation of research results on systems issues in the area of embedded, networked sensors. 
Distributed systems based on networked sensors and actuators with embedded computation capabilities allow for an instrumentation of the physical world at an unprecedented scale and density, thus enabling a new generation of monitoring and control applications. This conference provides an ideal venue to address the research challenges facing the design, deployment, use, and fundamental limits of these systems. Sensor networks require contributions from many fields, from wireless communication and networking, embedded systems and hardware, distributed systems, data management, and applications, so we welcome cross-disciplinary work. We seek technical papers describing original, previously unpublished research results. Topics of interest include, but are not limited to, new research results in the following areas for sensor networks: -Sensor network architecture and protocols -Distributed coordination algorithms such as localization, time synchronization, clustering, and topology control -Failure resilience and fault isolation -Energy management -Operating systems -Data, information, and signal processing -Data storage and management -Distributed actuation and control -Programming methodology -Security and privacy -Sensor network planning, provisioning, calibration and deployment -Operational experience and testbeds -Experimental methodology, including measurement, simulation, and emulation infrastructure -Analysis of real-world systems and fundamental limits -Applications -Sensor networks in unusual environments -Integration with back-end systems such as web-based information systems, process control, and enterprise software Important dates: Paper Registration and Abstract: March 30, 2006, midnight US Eastern Time Paper Submission Deadline: April 6, 2006, midnight US Eastern Time Notification of Paper Acceptance: June 29, 2006 Camera Ready Paper Copy: August 22, 2006 All deadlines are firm; we will not honor extensions. 
Papers must be original, unpublished work not under consideration elsewhere. For submission details, see the conference web site. Selected papers from the conference will be forwarded to the ACM Transactions on Sensor Networks for possible publication. For the first time SenSys 2006 will present a best paper/presentation award. Demos: Demonstrations showing innovative research and applications are solicited. SenSys is very interested in demonstrations of technology, platforms, and applications of wireless sensor networks. Abstracts of accepted demos will be published in the SenSys conference proceedings. Submissions from both industry and universities are encouraged. For the first time SenSys 2006 will present a best demo award for the best student demo. For submission details, see the conference web site. A call for demos with submission dates, etc., will be posted at a later point. Posters: Posters showing exciting early work on sensor networks are solicited. Areas of interest are the same as those listed in the technical call for papers. While the poster need not describe completed work, it should report on research for which at least preliminary results are available. We especially encourage submissions by students (that is, for which a student is the first author on the poster). For submission details, see the conference web site. A call for posters with submission dates, etc., will be posted at a later point. Committees: General Chair: Andrew T. Campbell, Dartmouth College Program Co-Chairs: Philippe Bonnet, DIKU; John Heidemann, USC/ISI Program Committee Members: Anish Arora, Ohio State Tucker Balch, GaTech Jan Beutel, ETHZ Nirupama Bulusu, Portland State U. Erdal Cayirci, Istanbul Dave Culler, UCB Deepak Ganesan, UMass Phillip Gibbons, Intel Research Leonid Guibas, Stanford Sanjay Jha, UNSW Akos Ledeczi, Vanderbilt Phil Levis, Stanford Sam Madden, MIT Shivakant Mishra, U. 
Colorado Joe Paradiso, MIT Greg Pottie, UCLA Andreas Savvides, Yale Sergio Servetto, Cornell Yoshito Tobe, Tokyo Denki U. Feng Zhao, Microsoft Research Local Arrangements Chair: Rick Han, U. Colorado Poster Co-Chairs: Henry Tirri, Nokia; Robert Szewczyk, Moteiv Demo Co-Chairs: Chieh Yih Wan, Intel; Jie Liu, Microsoft Publicity Co-Chairs: Wei Ye, USC/ISI; Cormac J. Sreenan, Cork Publications Chair: Sam Madden, MIT Web Chair: Mark Hansen, UCLA Registration Chair: Akos Ledeczi, Vanderbilt U. Finance Chair: Tarek Abdelzaher, UIUC Sponsorships Chair: Injong Rhee, NCSU Travel Awards: Haiyun Luo, UIUC Steering Committee: Anish Arora, Ohio State Victor Bahl, Microsoft Hari Balakrishnan, MIT David Culler, Berkeley Deborah Estrin, UCLA Ramesh Govindan, USC Craig Partridge, BBN Jason Redi, BBN Mani Srivastava, UCLA John Stankovic, U. Virginia Feng Zhao, Microsoft Research Taieb Znati, U. Pittsburgh From evijay at cs.bu.edu Fri Dec 23 14:05:25 2005 From: evijay at cs.bu.edu (Vijay Erramilli) Date: Fri, 23 Dec 2005 17:05:25 -0500 Subject: [e2e] TCP Traffic Measurement Studies Message-ID: <92205ACB-ADF5-4F8A-9A62-FEA12FE429F4@cs.bu.edu> Hi, I'm looking for any characterization or measurement studies which deal with the amount of bytes sent in the "forward" direction by TCP connections compared to the amount sent in the "reverse" direction. By "forward" direction I mean FROM the host that opened the connection, and by "reverse" I mean TO the host that opened the connection. I am trying to get a sense of what the typical ratio of forward/reverse byte traffic is in TCP. This will of course vary depending on the setting and application mix, but any pointers at all are welcome. 
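Vijay's definition of direction can be made concrete with a small sketch: classify each packet by whether the connection initiator (the host that sent the SYN) is its source or destination, then compare payload byte counts. The trace and numbers below are purely illustrative, not from any measurement study.

```python
# Hypothetical sketch: forward/reverse byte ratio for one TCP connection.
# "forward" = from the initiator (the host that opened the connection),
# "reverse" = to the initiator. Packet records are (src, dst, payload_bytes).

def byte_ratio(packets, initiator):
    fwd = sum(n for src, dst, n in packets if src == initiator)
    rev = sum(n for src, dst, n in packets if dst == initiator)
    return fwd / rev if rev else float("inf")

# A toy HTTP-like exchange: small request forward, large response reverse.
trace = [
    ("client", "server", 300),   # request
    ("server", "client", 1460),  # response segment 1
    ("server", "client", 1460),  # response segment 2
    ("client", "server", 0),     # pure ACK carries no payload bytes
]
print(round(byte_ratio(trace, "client"), 3))  # → 0.103
```

Running this over a real trace would of course require parsing pcap records into such tuples first; the point is only the direction convention.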
Thanks, Vijay From detlef.bosau at web.de Fri Dec 23 17:17:21 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Sat, 24 Dec 2005 02:17:21 +0100 Subject: [e2e] TCP Traffic Measurement Studies References: <92205ACB-ADF5-4F8A-9A62-FEA12FE429F4@cs.bu.edu> Message-ID: <43ACA1A1.5090900@web.de> Vijay Erramilli wrote: > > I am trying to get a sense of what typical ratios of forward/reverse > byte traffic is in TCP. This will of course vary depending on the Is there a reason why this should be different from two simplex connections, at least as long as there is no asymmetric bandwidth / load somewhere in the network? -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From gaylord at dirtcheapemail.com Sat Dec 24 03:56:51 2005 From: gaylord at dirtcheapemail.com (Clark Gaylord) Date: Sat, 24 Dec 2005 06:56:51 -0500 Subject: [e2e] TCP Traffic Measurement Studies Message-ID: <1135425411.9360.250491703@webmail.messagingengine.com> Vijay can correct me, of course, but I think the question is more like what's the ratio of server-to-client to client-to-server traffic, e.g., in a typical bulk data xfer you have 1500ish bytes going one way and 64 bytes going the other way (modulo windowing). --ckg On Sat, 24 Dec 2005 10:46:24 +0000 (GMT), "Lloyd Wood" said: > On Sat, 24 Dec 2005, Detlef Bosau wrote: > > > Vijay Erramilli wrote: > > > > > > > > I am trying to get a sense of what typical ratios of forward/reverse > > > byte traffic is in TCP. This will of course vary depending on the > > > > Is there a reason why this should be different from two simplex > > connections, at least as long as there is no asymmetric bandwidth / load > > somewhere in the network? 
> > shared media - traditional Ethernet (now rare), -- Clark Gaylord Blacksburg, VA USA gaylord at dirtcheapemail.com From detlef.bosau at web.de Sat Dec 24 05:00:06 2005 From: detlef.bosau at web.de (Detlef Bosau) Date: Sat, 24 Dec 2005 14:00:06 +0100 Subject: [e2e] TCP Traffic Measurement Studies References: <92205ACB-ADF5-4F8A-9A62-FEA12FE429F4@cs.bu.edu> <43ACA1A1.5090900@web.de> Message-ID: <43AD4656.D1CB8EE9@web.de> Lloyd Wood wrote: > > On Sat, 24 Dec 2005, Detlef Bosau wrote: > > > Vijay Erramilli wrote: > > > > > > > > I am trying to get a sense of what typical ratios of forward/reverse > > > byte traffic is in TCP. This will of course vary depending on the > > > > Is there a reason why this should be different from two simplex > > connections, at least as long as there is no asymmetric bandwidth / load > > somewhere in the network? > > shared media - traditional Ethernet (now rare), that's why I forgot about it :-) However, I think Vijay's question was more related to the transport layer. And I don't see a "pure L4" reason for different throughput in both directions of a TCP connection. (I admit: My computer is connected to my DSL Router via Ethernet ;-) And due to my somewhat dated hub, this is actually CSMA/CD ;-)) Detlef -- Detlef Bosau Galileistrasse 30 70565 Stuttgart Mail: detlef.bosau at web.de Web: http://www.detlef-bosau.de Mobile: +49 172 681 9937 From mellia at tlc.polito.it Sat Dec 24 05:35:16 2005 From: mellia at tlc.polito.it (Marco Mellia) Date: Sat, 24 Dec 2005 14:35:16 +0100 Subject: [e2e] TCP Traffic Measurement Studies In-Reply-To: <92205ACB-ADF5-4F8A-9A62-FEA12FE429F4@cs.bu.edu> References: <92205ACB-ADF5-4F8A-9A62-FEA12FE429F4@cs.bu.edu> Message-ID: <1135431316.43ad4e949e39c@mail.tlc.polito.it> We looked at that using tstat http://tstat.tlc.polito.it Part of the results are available in M. Mellia, R. Lo Cigno, F. Neri Measuring IP and TCP behavior on edge nodes with Tstat Computer Networks, Vol. 47, No. 1, pp. 
1-21, Jan 2005 http://www.tlc-networks.polito.it/mellia/papers/tstat_cn.ps We explicitly investigated the ratio between forward and backward data carried in TCP flows, both considering bytes and packets. The former are strongly asymmetrical (due to client-server kinds of applications), while for packet-based figures there is no bias toward either direction... due to the ACK flow in the "backward" path... Hope this helps. -- Ciao, /\/\/\rco +-----------------------------------+ | Marco Mellia - Assistant Professor| | Tel: 39-011-2276-608 | | Tel: 39-011-564-4173 | | Cel: 39-340-9674888 | /"\ .. . . . . . . . . . . . . | Politecnico di Torino | \ / . ASCII Ribbon Campaign . | Corso Duca degli Abruzzi 24 | X .- NO HTML/RTF in e-mail . | Torino - 10129 - Italy | / \ .- NO Word docs in e-mail. | http://www1.tlc.polito.it/mellia | .. . . . . . . . . . . . . +-----------------------------------+ The box said "Requires Windows 95 or Better." So I installed Linux. Scrive Vijay Erramilli : > > > Hi, > > I'm looking for any characterization or measurement studies which deal > with the amount of bytes sent in the "forward" direction by TCP > connections compared to the amount sent in the "reverse" direction. By > "forward" direction I mean FROM the host that opened the connection, and > by "reverse" I mean TO the host that opened the connection. > > I am trying to get a sense of what typical ratios of forward/reverse > byte traffic is in TCP. This will of course vary depending on the > setting and application mix, but any pointers at all are welcome. > > Thanks, > Vijay > From evijay at cs.bu.edu Sat Dec 24 08:50:40 2005 From: evijay at cs.bu.edu (Vijay Erramilli) Date: Sat, 24 Dec 2005 11:50:40 -0500 (EST) Subject: [e2e] TCP Traffic Measurement Studies In-Reply-To: <1135425411.9360.250491703@webmail.messagingengine.com> References: <1135425411.9360.250491703@webmail.messagingengine.com> Message-ID: Hi Clark, This is exactly what I meant. 
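Mellia's tstat finding (byte counts strongly asymmetric, packet counts roughly balanced, because every burst of data segments in one direction elicits ACK segments in the other) can be illustrated with a toy trace; all numbers below are made up for illustration, not taken from the paper.

```python
# Toy illustration of asymmetric bytes vs. near-symmetric packet counts.
# 'fwd' = initiator -> responder, 'rev' = responder -> initiator.
from collections import Counter

trace = (
    [("fwd", 300)]          # one small request
    + [("rev", 1460)] * 10  # ten full-size response segments
    + [("fwd", 0)] * 10     # ten pure ACKs: packets, but zero payload bytes
)

byte_count = Counter()
pkt_count = Counter()
for direction, payload in trace:
    byte_count[direction] += payload
    pkt_count[direction] += 1

print(byte_count["fwd"], byte_count["rev"])  # 300 14600 : strongly asymmetric
print(pkt_count["fwd"], pkt_count["rev"])    # 11 10     : nearly balanced
```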
I deliberately avoided "client" and "server" but am looking for generic studies which look at the problem from a 'requests' and 'responses' or an 'initiator-responder' point of view. Thanks for all the replies, Vijay On Sat, 24 Dec 2005, Clark Gaylord wrote: > Vijay can correct me, of course, but I think the question is more like > what's the ratio of server-to-client to client-to-server traffic, e.g., > in a typical bulk data xfer you have 1500ish bytes going one way and 64 > bytes going the other way (modulo windowing). > > --ckg > > On Sat, 24 Dec 2005 10:46:24 +0000 (GMT), "Lloyd Wood" > said: > > On Sat, 24 Dec 2005, Detlef Bosau wrote: > > > > > Vijay Erramilli wrote: > > > > > > > > > > > I am trying to get a sense of what typical ratios of forward/reverse > > > > byte traffic is in TCP. This will of course vary depending on the > > > > > > Is there a reason why this should be different from two simplex > > > connections, at least as long as there is no asymmetric bandwidth / load > > > somewhere in the network? > > > > shared media - traditional Ethernet (now rare), > -- > Clark Gaylord > Blacksburg, VA USA > gaylord at dirtcheapemail.com > From gaylord at dirtcheapemail.com Sun Dec 25 14:55:03 2005 From: gaylord at dirtcheapemail.com (Clark Gaylord) Date: Sun, 25 Dec 2005 17:55:03 -0500 Subject: [e2e] TCP Traffic Measurement Studies Message-ID: <1135551303.27776.250538241@webmail.messagingengine.com> On Sat, 24 Dec 2005 14:00:06 +0100, "Detlef Bosau" said: > Lloyd Wood wrote: > > shared media - traditional Ethernet (now rare), > that's why I forgot about it :-) Not so rare -- it's called wireless. --ckg -- Clark Gaylord Blacksburg, VA USA gaylord at dirtcheapemail.com From michael.welzl at uibk.ac.at Mon Dec 26 10:31:32 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Mon, 26 Dec 2005 19:31:32 +0100 Subject: [e2e] Can we revive T/TCP ? 
Message-ID: <001301c60a4a$9831dc60$0200a8c0@fun> Hi everybody, Here's something that I've had on my mind for quite a while now: I'm wondering why T/TCP ( RFC 1644 ) failed. I mean, nobody seems to use it. I believe someone explained this to me once (perhaps even on this list? but I couldn't find this in the archives...), saying that there were security concerns with it, but I don't remember any other details. So - is that all? If so, I'm wondering what kind of security concerns there could be. I can imagine more danger from DDoS floods (the first packet already causes the web site request to be processed!), and I can imagine that some other concerns would relate to authentication - but what about IPSec, then? I don't understand why a web browser of someone doing telebanking with IPSec (not in tunnel mode) needs to set up a new connection whenever a link is clicked. There is a similar problem in the Grid, connections are set up and torn down whenever Grid Services are called even though every participant in the Grid is authenticated. In general, this delay is relatively small in comparison to other overhead (SOAP message processing, etc.), but it's there, and it seems avoidable to me. Also, it could become significant if a Grid node is far away, or the network is congested (a SYN could be dropped). What is it I'm missing? Cheers, Michael From dga+ at cs.cmu.edu Mon Dec 26 10:47:27 2005 From: dga+ at cs.cmu.edu (David Andersen) Date: Mon, 26 Dec 2005 13:47:27 -0500 Subject: [e2e] Can we revive T/TCP ? In-Reply-To: <001301c60a4a$9831dc60$0200a8c0@fun> References: <001301c60a4a$9831dc60$0200a8c0@fun> Message-ID: On Dec 26, 2005, at 1:31 PM, Michael Welzl wrote: > Hi everybody, > > Here's something that I've had on my mind for quite a while now: > I'm wondering why T/TCP ( RFC 1644 ) failed. I mean, nobody seems > to use it. I believe someone explained this to me once (perhaps even > on this list? 
but I couldn't find this in the archives...), saying > that > there > were security concerns with it, but I don't remember any other > details. > > So - is that all? If so, I'm wondering what kind of security concerns > there could be. I can imagine more danger from DDoS floods (the > first packet already causes the web site request to be processed!), The packet can be much more easily spoofed. The TCP three-way handshake does a fairly okay job of verifying that the person you're talking to is actually in control of the IP address their packets are coming from. (not to be confused with verifying that the person or computer at that IP address is the one you think it is). With T/TCP, the world is my DDoS reflector: An attacker can send a single ~100 byte packet that requests a ~1500 byte web page, spoofed to come from a victim. The web server will then send 15x more traffic back to the victim. The other DDoS possibilities that you mentioned are also very possible. > and I can imagine that some other concerns would relate to > authentication - but what about IPSec, then? Sure, iff you have a key already set up with the person you're talking with. > > I don't understand why a web browser of someone doing telebanking > with IPSec (not in tunnel mode) needs to set up a new connection > whenever a link is clicked. There is a similar problem in the Grid, It doesn't. Most links are clicked within the same site, and most servers and browsers support persistent connections. The connection is only torn down after an idle period or some maximum number of requests. The case that this doesn't handle is the "I want to send a single packet to N different people (N is large) and not have to worry about connection setup" - if that's the case, and it really is a single packet, then just send it in UDP. If it really is a reliable, congestion-controlled, in-order stream of data, use TCP, and in many cases, the connection setup RTT isn't going to completely kill you. 
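David's reflector arithmetic can be written down directly; the ~100-byte spoofed request and ~1500-byte response are the illustrative figures from his message, not measurements.

```python
# Sketch of the T/TCP reflector amplification David describes: a single
# spoofed SYN-with-data elicits the full response, which goes to the victim.

def amplification(request_bytes, response_bytes):
    """Bytes delivered to the victim per byte the attacker sends."""
    return response_bytes / request_bytes

print(amplification(100, 1500))  # → 15.0, the "15x more traffic" figure
```

With a regular TCP three-way handshake, the spoofed SYN only elicits a small SYN-ACK, so the attacker gains nothing by reflecting.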
(I'm sure there are scenarios where it will, of course.) In the Grid context, if you're talking about a not-huge set of trusted nodes, they can cache those TCP connections for quite a long time. An interesting example of this is the 'rex' system by Kaminsky and Mazieres. It's a remote execution tool much like ssh, but more flexible. It supports connection caching under the hood, so you don't have to pay the setup time if you're using remote command execution. It's worth noting that the major delay they're avoiding in the local area is the public key crypto processing time, but in the wide-area, both can add significantly to the total delay. -Dave From mycroft at netbsd.org Mon Dec 26 11:05:29 2005 From: mycroft at netbsd.org (Charles M. Hannum) Date: Mon, 26 Dec 2005 19:05:29 +0000 Subject: [e2e] Can we revive T/TCP ? In-Reply-To: <001301c60a4a$9831dc60$0200a8c0@fun> References: <001301c60a4a$9831dc60$0200a8c0@fun> Message-ID: <200512261905.29344.mycroft@netbsd.org> On Monday 26 December 2005 18:31, Michael Welzl wrote: > Here's something that I've had on my mind for quite a while now: > I'm wondering why T/TCP ( RFC 1644 ) failed. I mean, nobody seems > to use it. I believe someone explained this to me once (perhaps even > on this list? but I couldn't find this in the archives...), saying that > there > were security concerns with it, but I don't remember any other details. Here's what I wrote last time this came up: From: "Charles M. Hannum" Organization: The NetBSD Project To: end2end-interest at postel.org Subject: Re: [e2e] T/TCP usage Date: Fri, 1 Oct 2004 22:46:47 +0000 Message-Id: <200410012246.47945.mycroft at netbsd.org> On Friday 01 October 2004 20:30, John Kristoff wrote: > After reviewing some of the Internet's protocol designs this afternoon, > I was making my way through T/TCP and I began to think about some of the > potential DoS vectors it could introduce. Apparently the potential for > problems are well known. 
For example: > > Also see: http://midway.sourceforge.net/doc/ttcp-sec.txt That's a bit old, and I probably wouldn't write it quite the same today, but there it is. See sections 3 and 4, in particular, for comments about DoS attacks. Note that at least two implementations of T/TCP that got some use did not have a way for servers to selectively enable the use of TAO (or it had the wrong default; I forget), and that the hole mentioned in section 2 was in fact used to break into real servers, including at least one case where it was actually done through the rlogin service, as I specifically mentioned. In retrospect, I should have expanded more on my comment about it violating existing RFCs. In fact, we had to change the TCP processing in NetBSD to be compatible with T/TCP -- previously it would drop a SYN-data-ACK packet, as prescribed in RFC 793. I believe the same change had to be made in ka9q at the time. From dkp at ece.iisc.ernet.in Mon Dec 26 11:10:41 2005 From: dkp at ece.iisc.ernet.in (dharmendra) Date: Tue, 27 Dec 2005 00:40:41 +0530 Subject: [e2e] want to unsubscribe Message-ID: <43B04031.2050402@ece.iisc.ernet.in> Hi, I want to unsubscribe from the end-to-end group. Can I do that? Dharmendra kumar -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From michael.welzl at uibk.ac.at Mon Dec 26 12:00:41 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Mon, 26 Dec 2005 21:00:41 +0100 Subject: [e2e] Can we revive T/TCP ? => persistent connections References: <001301c60a4a$9831dc60$0200a8c0@fun> Message-ID: <000901c60a57$0ccaaf00$0200a8c0@fun> First of all, thanks a lot for your fast and detailed answer; it's really helpful. 
You actually settled the issue for me - but a point you raised led me to ANOTHER thing that's been on my mind for a while: > > I don't understand why a web browser of someone doing telebanking > > with IPSec (not in tunnel mode) needs to set up a new connection > > whenever a link is clicked. There is a similar problem in the Grid, > > It doesn't. Most links are clicked within the same site, and most > servers and browsers support persistent connections. The connection > is only torn down after an idle period or some maximum number of > requests. In practice, this doesn't seem to be the case. In all the tests my students did (not a thorough measurement study, just some experiments), the server closed the connection after sending a page. I think this is due to the (quasi-)stateless operation that an HTTP server can achieve this way - I mean, it's much more difficult to keep connections open for a longer period, and close them only after a timer expired, count the number of connections that should be cached, etc. etc. ... if poorly implemented, this might also not scale so well. > (I'm sure there are scenarios where it will, of course.) In the Grid > context, if you're talking about a not-huge set of trusted nodes, > they can cache those TCP connections for quite a long time. But they don't - neither in the smaller nor in the larger Grids that I know of; I think it's because the notion of a "connection" is lost in the (vertical) communication across layers. Grid Services are usually implemented on top of SOAP, which is stateless. How should SOAP tell HTTP to maintain a connection when it can't know whether a Grid Service will be called again? The decision to do so is up to the programmer, who however can't provide the remote SOAP instance with the necessary information because the notion of a "session" isn't part of SOAP. Could connections be cached in a transparent manner in such a scenario (e.g. by tweaking something at the HTTP level, but not above)? 
I think so, but I'm not 100% sure. Also, if it's possible, why isn't it done? In a Grid, this would surely make sense. > An interesting example of this is the 'rex' system by Kaminsky and > Mazieres. It's a remote execution tool much like ssh, but more > flexible. It supports connection caching under the hood, so you > don't have to pay the setup time if you're using remote command > execution. It's worth noting that the major delay they're avoiding > in the local area is the public key crypto processing time, but in > the wide-area, both can add significantly to the total delay. Thanks a lot for the pointer! By "under the hood", you don't mean it's transparent to upper layers, do you? How could it... I mean, if a web server decides to close a connection, there's nothing any system underneath it could do about it, I guess. I heard the term "connection caching" before, and followed it, which led to a few papers on the subject and problems with this type of caching, but no standards. It doesn't seem to be an easy issue, but it looks like it's solvable. If I'm right and common web servers don't implement this (one could of course carry out a larger measurement study for this... perhaps it has already been done), wouldn't an Informational RFC which provides an overview of connection caching methods and suggests an implementation do the trick? I'd be thankful for some pointers to the key papers about connection caching - e.g., where was it introduced? Cheers, Michael From dga+ at cs.cmu.edu Mon Dec 26 13:10:41 2005 From: dga+ at cs.cmu.edu (David Andersen) Date: Mon, 26 Dec 2005 16:10:41 -0500 Subject: [e2e] Can we revive T/TCP ? => persistent connections In-Reply-To: <000901c60a57$0ccaaf00$0200a8c0@fun> References: <001301c60a4a$9831dc60$0200a8c0@fun> <000901c60a57$0ccaaf00$0200a8c0@fun> Message-ID: <20BC21B4-8F61-451A-A026-90A31B5FB0FE@cs.cmu.edu> On Dec 26, 2005, at 3:00 PM, Michael Welzl wrote: >> >> It doesn't. 
Most links are clicked within the same site, and most >> servers and browsers support persistent connections. The connection >> is only torn down after an idle period or some maximum number of >> requests. > > In practice, this doesn't seem to be the case. In all the tests my > students did (not a thorough measurement study, just some > experiments), the server closed the connection after sending a page. > > I think this is due to the (quasi-)stateless operation that a HTTP > server > can achieve this way - I mean, it's much more difficult to keep > connections open for a longer period, and close them only after a > timer expired, count the number of connections that should be cached, > etc. etc. ... if poorly implemented, this might also not scale so > well. Could you elaborate on how you did those tests? A quick, highly scientific check showed:
www.yahoo.com: no connection caching
Google: caching
microsoft.com: caching
www.cmu.edu: caching
The connection timeouts on some of those are fairly short; some are long enough for subsequent clicks, many are only long enough to fetch embedded objects (a few seconds). > > >> (I'm sure there are scenarios where it will, of course.) In the Grid >> context, if you're talking about a not-huge set of trusted nodes, >> they can cache those TCP connections for quite a long time. > > But they don't - neither in the smaller nor in the larger Grids that I > know of; I think it's because the notion of a "connection" is lost > in the (vertical) communication across layers. I'd suggest that that's not the fault of T/TCP, but the fault of the upper layers in the architecture... > > Grid Services are usually implemented on top of SOAP, which is > stateless. How should SOAP tell HTTP to maintain a connection > when it can't know whether a Grid Service will be called again? 
The > decision to do so is up to the programmer, who however can't provide > the remote SOAP instance with the necessary information because > the notion of a "session" isn't part of SOAP. > > Could connections be cached in a transparent manner in such a > scenario (e.g. by tweaking something at the HTTP level, but not > above)? I think so, but I'm not 100% sure. Also, if it's possible, > why isn't it done? In a Grid, this would surely make sense. Yes. Some SOAP and XMLRPC libraries do this. See, e.g., http://www.gnuenterprise.org/tools/common/docs/api/public/ "gnue.common.rpc.drivers.xmlrpc.ClientAdapter.ClientAdapter: Implements an XML-RPC client adapter using persistent HTTP connections as transport." > > >> An interesting example of this is the 'rex' system by Kaminsky and >> Mazieres. It's a remote execution tool much like ssh, but more >> flexible. It supports connection caching under the hood, so you >> don't have to pay the setup time if you're using remote command >> execution. It's worth noting that the major delay they're avoiding >> in the local area is the public key crypto processing time, but in >> the wide-area, both can add significantly to the total delay. > > Thanks a lot for the pointer! > By "under the hood", you don't mean it's transparent to upper > layers, do you? How could it... I mean, if a web server decides > to close a connection, there's nothing any system underneath it > could do about it, I guess. "under the hood" -- underneath what the upper layers see. The web server can close the connection, and the client {library,binary,whatever} can open it up again without having to let the user's program running on top of it know what's going on. > > I heard the term "connection caching" before, and followed it, which > led to a few papers on the subject and problems with this type of > caching, but no standards. It doesn't seem to be an easy issue, but > it looks like it's solvable. 
If I'm right and common web servers don't > implement this (one could of course carry out a larger measurement > study for this... perhaps it has already been done), wouldn't an > Informational RFC which provides an overview of connection caching > methods and suggests an implementation do the trick? I believe you're mistaken. Most web servers support it. It's part of the HTTP 1.1 spec, and has been around literally for years. > > I'd be thankful for some pointers to the key papers about connection > caching - e.g., where was it introduced? Proposed: 1995 sigcomm, Mogul, "The Case for Persistent-Connection HTTP". Dig around in some of his other papers, you'll get a good feel for what's going on. HTTP 1.1 spec. Persistent is the default. HTTP 1.0 hack: the "Connection: keep-alive" header. -d From jg at freedesktop.org Mon Dec 26 14:24:12 2005 From: jg at freedesktop.org (Jim Gettys) Date: Mon, 26 Dec 2005 17:24:12 -0500 Subject: [e2e] Can we revive T/TCP ? => persistent connections In-Reply-To: <20BC21B4-8F61-451A-A026-90A31B5FB0FE@cs.cmu.edu> References: <001301c60a4a$9831dc60$0200a8c0@fun> <000901c60a57$0ccaaf00$0200a8c0@fun> <20BC21B4-8F61-451A-A026-90A31B5FB0FE@cs.cmu.edu> Message-ID: <1135635853.11741.105.camel@localhost.localdomain> On Mon, 2005-12-26 at 16:10 -0500, David Andersen wrote: > > I heard the term "connection caching" before, and followed it, which > > led to a few papers on the subject and problems with this type of > > caching, but no standards. It doesn't seem to be an easy issue, but > > it looks like it's solvable. If I'm right and common web servers don't > > implement this (one could of course carry out a larger measurement > > study for this... perhaps it has already been done), wouldn't an > > Informational RFC which provides an overview of connection caching > > methods and suggests an implementation do the trick? > > I believe you're mistaken. Most web servers support it. 
It's part > of the HTTP 1.1 spec, and has been around literally for years. > > > > > I'd be thankful for some pointers to the key papers about connection > > caching - e.g., where was it introduced? > > Proposed: 1995 sigcomm, Mogul, "The Case for Persistent-Connection > HTTP". Dig around in some of his other papers, you'll get a good > feel for what's going on. > > HTTP 1.1 spec. Persistent is the default. > > HTTP 1.0 hack, the: > > connection: keep-alive > > header. > > -d Web servers have supported persistent connections for a very long time now, well before HTTP/1.1 was completed. All the significant servers do, and have, for many years. What their policy is about closing the connections may vary between implementation, configuration and load. http://www.w3.org/Protocols/HTTP/Performance/Pipeline presents a lot of data on HTTP/1.1 performance. Regards, Jim Gettys HTTP/1.1 editor. From michael.welzl at uibk.ac.at Mon Dec 26 14:27:45 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Mon, 26 Dec 2005 23:27:45 +0100 Subject: [e2e] Can we revive T/TCP ? => persistent connections In-Reply-To: <20BC21B4-8F61-451A-A026-90A31B5FB0FE@cs.cmu.edu> Message-ID: > > In practice, this doesn't seem to be the case. In all the tests my > > students did (not a thorough measurement study, just some > > experiments), the server closed the connection after sending a page. > > > > I think this is due to the (quasi-)stateless operation that a HTTP > > server > > can achieve this way - I mean, it's much more difficult to keep > > connections open for a longer period, and close them only after a > > timer expired, count the number of connections that should be cached, > > etc. etc. ... if poorly implemented, this might also not scale so > > well. > > Could you elaborate on how you did those tests? 
A quick, highly > scientific > check showed: > > www.yahoo.com: no connection caching > Google: caching > microsoft.com: caching > www.cmu.edu: caching > > The connection timeouts on some of those are fairly short; some are > long enough for subsequent clicks, many are only long enough to fetch > embedded objects (a few seconds). ah, okay... well, perhaps it's because we didn't bother to set up a proper testing procedure for this one, then. i remember checking it once with one single web server (clicking + watching ethereal), then (assuming i'm right, and connections aren't kept open) told them to briefly check with some other web servers ... they came back and said that connections were always closed by the server. perhaps they took too much time between clicks, picked too few servers, whatever. anyway, this is news to me, and i find it really interesting! > >> (I'm sure there are scenarios where it will, of course.) In the Grid > >> context, if you're talking about a not-huge set of trusted nodes, > >> they can cache those TCP connections for quite a long time. > > > > But they don't - neither in the smaller nor in the larger Grids that I > > know of; I think it's because the notion of a "connection" is lost > > in the (vertical) communication across layers. > > I'd suggest that that's not the fault of T/TCP, but the fault of the > upper layers in the architecture... yep, this seems like the right place, given the narrow scope of usefulness of t/tcp you pointed out in your previous email ... seems like it could only be useful for grid computing, where things are in fact going wrong at a higher level. > > Grid Services are usually implemented on top of SOAP, which is > > stateless. How should SOAP tell HTTP to maintain a connection > > when it can't know whether a Grid Service will be called again? 
The > > decision to do so is up to the programmer, who however can't provide > > the remote SOAP instance with the necessary information because > > the notion of a "session" isn't part of SOAP. > > > > Could connections be cached in a transparent manner in such a > > scenario (e.g. by tweaking something at the HTTP level, but not > > above)? I think so, but I'm not 100% sure. Also, if it's possible, > > why isn't it done? In a Grid, this would surely make sense. > > Yes. Some SOAP and XMLRPC libraries do this. See, e.g., > > http://www.gnuenterprise.org/tools/common/docs/api/public/ > > "gnue.common.rpc.drivers.xmlrpc.ClientAdapter.ClientAdapter: > Implements an XML-RPC client adapter using persistent HTTP > connections as transport." wow! thanks again for this very helpful pointer! you really clarified a lot of things for me today. cheers, michael From michael.welzl at uibk.ac.at Mon Dec 26 14:30:34 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Mon, 26 Dec 2005 23:30:34 +0100 Subject: [e2e] Can we revive T/TCP ? => persistent connections In-Reply-To: <1135635853.11741.105.camel@localhost.localdomain> Message-ID: > Web servers have supported persistent connections for a very long time > now, well before HTTP/1.1 was completed. All the significant servers > do, and have, for many years. I'm quite aware of this. > What their policy is about closing the connections may vary between > implementation, configuration and load. It's only this aspect that I was referring to, and mistaken about. Cheers, Michael