From Black_David at emc.com Fri Apr 1 06:11:04 2005 From: Black_David at emc.com (Black_David@emc.com) Date: Fri, 1 Apr 2005 09:11:04 -0500 Subject: [e2e] 911 and cell phones Message-ID: David, It's too bad that California screwed this up. I know from actual experience that calling 911 from at least my cell phone in Mass. enables one to reach the state police in short order. As for "exact position" of a cell phone, I think that's a red herring, because the only way to get it accurately appears to involve a GPS receiver in the cell phone, which most cell phones don't have. While I'm not an expert, my impression from what I've seen is that triangulation based on location of cell site antennas has not been sufficiently workable in practice. Even GPS has its limits - if the receiver can't see enough satellites, the result is a 2-D fix instead of 3-D, which can be a problem in a multi-story building. OTOH, if you want to trust your life to a global SLP infrastructure (Uh, where can I find one of those?), that's your choice ... Thanks, --David ---------------------------------------------------- David L. Black, Senior Technologist EMC Corporation, 176 South St., Hopkinton, MA 01748 +1 (508) 293-7953 FAX: +1 (508) 293-7786 black_david at emc.com Mobile: +1 (978) 394-7754 ---------------------------------------------------- > -----Original Message----- > From: end2end-interest-bounces at postel.org > [mailto:end2end-interest-bounces at postel.org] On Behalf Of > David P. Reed > Sent: Thursday, March 31, 2005 1:13 PM > To: Alex Cannara > Cc: end2end-interest at postel.org > Subject: Re: [e2e] Skype and congestion collapse. > > Alex - the underlying assumption is that traditional telephony delivers > 911 functionality best. Well, the word on the street is that in > California, if you call 911 on your basic, non IP cell phone, your exact > position is delivered to ... 
well, no one knows where, but it's a place > that has no capability to actually transfer that information to anyone > who actually can help you in an emergency. Better to call directory > assistance for the phone number of your local police dept. and hope they > don't tell you "911", because that will guarantee 30-90 minutes of > screwing around. > > OK, maybe wired phones still do 911 OK, but do PBXes? I doubt it - so > the point about bosses may be bogus as well. > > Here the argument that 911 should be "in the network" fails. I'd much > rather have my actual physical telephone be smart enough to figure out > how to summon emergency services (perhaps finding the doctor who is in > the next cubicle over if the SLP emergency service existed), I think. > > Dale Hatfield points out that the phone companies have made it > *impossible* to deploy a "911-like" service over the WWW, because who > can trust that a person would actually tell the truth that they have a > life or death situation on their hands. But of course we can ALL trust > Verizon Wireless with our lives... yeah right. From dpreed at reed.com Fri Apr 1 06:53:27 2005 From: dpreed at reed.com (David P. Reed) Date: Fri, 01 Apr 2005 09:53:27 -0500 Subject: [e2e] 911 and cell phones In-Reply-To: References: Message-ID: <424D6067.9040401@reed.com> David - exact position may not matter in most cases, but that's what Vonage is being beaten up about (I have 911 on the Vonage line activated, and it gets through to my local emergency services just fine because I told the system when I set it up where that was.) I note that getting the Massachusetts "state police" is rarely useful unless you are driving on the Massachusetts Turnpike (they might as well be a call center in Bangalore). They cannot by law assist you, and do not have the best means to pass on calls to localities, who might help you if you observe someone being mugged or raped on the street in (say) downtown Brockton. 
As far as I know, every CDMA cell phone being sold today (the vast majority in the market) has GPS in it (in the form of A-GPS, a proprietary technology from Qualcomm, which uses a GPS receiver in the phone, plus an assist from towers that gets the autonomous GPS re-locked fast when it goes out of satellite coverage). I think that GSM phones all have GPS onboard as well.

You are right that tower triangulation has failed, but the E911 mandate for cell phones still holds, and GPS is the technology that has been universally adopted, and works pretty well, as far as getting location.

But as I said, knowing approximate or exact location isn't very good if the system design actually routes calls away from local responders to a single point of failure in some remote, windowless building that has no direct local presence.

From hgs at cs.columbia.edu Fri Apr 1 07:11:31 2005
From: hgs at cs.columbia.edu (Henning Schulzrinne)
Date: Fri, 01 Apr 2005 10:11:31 -0500
Subject: [e2e] 911 and cell phones
In-Reply-To:
References:
Message-ID: <424D64A3.5050607@cs.columbia.edu>

People interested in this topic might want to follow the work of the ECRIT working group in the IETF (and, for location delivery, the GEOPRIV working group). Some related material can be found at http://www.cs.columbia.edu/sip/emergency.html

Short summary: Emergency calling ("911") is undergoing a radical technical transformation, motivated by the difficulty of supporting mobile devices, number portability, telematics services and VoIP in the traditional, 1960s-era technology that is currently being used. As usual, it will take a decade or more for this transition.

Black_David at emc.com wrote:
> David,
>
> It's too bad that California screwed this up. I know from
> actual experience that calling 911 from at least my cell
> phone in Mass. enables one to reach the state police in
> short order.
> > As for "exact position" of a cell phone, I think that's a red > herring, because the only way to get it accurately appears to > involve a GPS receiver in the cell phone, which most cell > phones don't have. While I'm not an expert, my impression > from what I've seen is that triangulation based on location > of cell site antennas has not been sufficiently workable > in practice. Even GPS has its limits - if the receiver can't > see enough satellites, the result is a 2-D fix instead > of 3-D, which can be a problem in a multi-story building. > > OTOH, if you want to trust your life to a global SLP > infrastructure (Uh, where can I find one of those?), > that's your choice ... > > Thanks, > --David > ---------------------------------------------------- > David L. Black, Senior Technologist > EMC Corporation, 176 South St., Hopkinton, MA 01748 > +1 (508) 293-7953 FAX: +1 (508) 293-7786 > black_david at emc.com Mobile: +1 (978) 394-7754 > ---------------------------------------------------- > > > >>-----Original Message----- >>From: end2end-interest-bounces at postel.org >>[mailto:end2end-interest-bounces at postel.org] On Behalf Of >>David P. Reed >>Sent: Thursday, March 31, 2005 1:13 PM >>To: Alex Cannara >>Cc: end2end-interest at postel.org >>Subject: Re: [e2e] Skype and congestion collapse. >> >>Alex - the underlying assumption is that traditional telephony delivers >>911 functionality best. Well, the word on the street is that in >>California, if you call 911 on your basic, non IP cell phone, your exact >>position is delivered to ... well, no one knows where, but it's a place >>that has no capability to actually transfer that information to anyone >>who actually can help you in an emergency. Better to call directory >>assistance for the phone number of your local police dept. and hope they >>don't tell you "911", because that will guarantee 30-90 minutes of >>screwing around. >> >>OK, maybe wired phones still do 911 OK, but do PBXes? 
I doubt it - so >>the point about bosses may be bogus as well. >> >>Here the argument that 911 should be "in the network" fails. I'd much >>rather have my actual physical telephone be smart enough to figure out >>how to summon emergency services (perhaps finding the doctor who is in >>the next cubicle over if the SLP emergency service existed), I think. >> >>Dale Hatfield points out that the phone companies have made it >>*impossible* to deploy a "911-like" service over the WWW, because who >>can trust that a person would actually tell the truth that they have a >>life or death situation on their hands. But of course we can ALL trust >>Verizon Wireless with our lives... yeah right. From dpreed at reed.com Fri Apr 1 07:34:38 2005 From: dpreed at reed.com (David P. Reed) Date: Fri, 01 Apr 2005 10:34:38 -0500 Subject: [e2e] 911 and cell phones In-Reply-To: <424D64A3.5050607@cs.columbia.edu> References: <424D64A3.5050607@cs.columbia.edu> Message-ID: <424D6A0E.4050904@reed.com> Henning, it's great that there's an ECRIT working group with a 10-year transition plan. The best is always worth waiting for. It's kind of like the plan to deal with human-caused climate change. First we need to study it, so we can figure out the optimum theoretical answer (or for that matter, whether there's a problem at all). However, the Internet started based on a quite different approach. We didn't start by creating an IETF to study every problem to death and dole out money to academics doing theoretical studies. I follow ECRIT from afar, but frankly, I wonder if the group has the guts to take any technical leadership role in actually "doing" something. The IETF in many cases, and in my personal opinion, has become a pointless technocracy that talks mostly to itself. In this case, it will be irrelevant if it views the issue as a "smooth" ten-year transition, to be centrally managed and planned. 
Like most cases of innovation in technology that is rapidly changing, innovation in location will come from the edge, from pragmatic experimental labs, from open source, from entrepreneurs doing proprietary things that the public likes; it will be about working code and rough consensus (what the IETF used to be, before the current Stepford academics took it over and got rid of any hope of doing good work, instead focusing on MUST and MAY and WILL and SHOULD).

And the IETF bureaucrats will travel the world, enjoying junkets in fancy hotels, running BOFs and, if they are slightly lucky, maybe having an influence over a tiny part of the future.

From Jon.Crowcroft at cl.cam.ac.uk Fri Apr 1 07:45:03 2005
From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft)
Date: Fri, 01 Apr 2005 16:45:03 +0100
Subject: [e2e] information superhighway finally realized.
Message-ID:

Thinking about this, what Al Gore really meant has just sunk in:

What you need is a person and vehicle tracking system - this can be multi-modal - if a person carries a device that has GPS (or cell/tower or wifi, or other triangulation based) location services, then it's easy - but it's also easy if you have plenty of surveillance cameras - these have two benefits

1/ you can then implement car registration recognition, and charge for road usage (and congestion charge) fairly and efficiently, reducing pollution, accidents and delays and permitting the police to catch folks that break laws (bad and fast driving etc) with high resolution instead of notoriously inaccurate human witnesses

2/ you can also use this to recognize and catch terrorists, since the recognition system can be plugged into a traffic anomaly detection system and automatically detect people renting cars in airports and driving them full of gas into tall buildings

Looking further afield, one could put in automatic speed control in the cars, and even immobilizers so that if the camera at the roadside shows a person who doesn't have the visage of one of
the recognized (safe, approved) drivers of the said car, it either goes very slowly or not at all - even further afield, the vehicle could be automatically routed to a county jail - this could also apply to people who haven't paid their tax, or are on the run.

It could prevent soldiers AWOL from Iraq driving over the border to Canada or way down south to Mexico

3/ M-ad hoc
think of all the benefits - if all the cars are fitted with 802.11 devices, we could also use them to provide a network - we could cause cars to route to places where there is a gap in the connectivity at the moment - as per the Grossglauser/Tse result, and using network coding, this would provide for arbitrary capacity almost unbounded in gas rich countries.

4/ creative accountancy
at the same time, one could have creative CRISPS (Connectivity Rich ISPs) that have ingenious billing schemes - your wireless ad hoc broadband bill could be rolled into the tax on your gas at the gas station - you go fill up with 10 gallons of gas and 100Gbytes of download - if you run a multi-occupancy vehicle, and you also run peer-to-peer file sharing you would get a double discount.

5/ border routing considerations
Of course one would need to consider the regulatory problems - if the US were to build a lot of roads just south of the Canadian border to offer "offshore network capacity", but the Canadians used Hydro to re-charge lots of fuel-cell and electric/hybrid cars, then someone is going to think about power-line broadband - then there could be interference unless one deploys OFDM (Oil For Download Mobility)

6/ denial of service, and other security problems
of course it is easy to jam radio, and it's easy to jam on the radio too. so we need to worry about this, but not too much - if someone blocks your download, you just drive to blockbuster and pick up the DVD there anyhow...

7/ network management considerations

The system should be just as manageable as the internet and the road system.
congestion will be rare (there will be no packet loss in my car), and resilience will be provided by fast oregon bypasses. overlay routing (put that bike in the pickup, put that laptop on the bike, put that USB memory stick in the laptop) will naturally occur and is a matter for further study. Now, back to your normal service... In missive <424D6067.9040401 at reed.com>, "David P. Reed" typed: >>David - exact position may not matter in most cases, but that's what >>Vonage is being beaten up about (I have 911 on the Vonage line >>activated, and it gets through to my local emergency services just fine >>because I told the system when I set it up where that was.) >> >>I note that getting the Massachusetts "state police" is rarely useful >>unless you are driving on the Massachusetts Turnpike (they might as well >>be a call center in Bangalore). They cannot by law assist you, and do >>not have the best means to pass on calls to localities, who might help >>you if you observe someone being mugged or raped on the street in (say) >>downtown Brockton. >> >>As far as I know, every CDMA cell phone being sold today (the vast >>majority in the market) have GPS in them (in the form of A-GPS, a >>proprietary technology that comes from qualcomm, which used GPS receiver >>in the phone, plus an assist from towers that gets the autonomous GPS >>re-locked fast when it goes out of satellite coverage). I think that >>GSM phones also all have GPS onboard as well. >> >>You are right that tower triangulation has failed, but the E911 mandate >>for cell phones still holds, and GPS is the technology that has been >>universally adopted, and works pretty well, as far as getting location. >> >>But as I said, knowing approximate or exact location isn't very good if >>the system design actually routes calls away from local responders to a >>single point of failure in some remote, windowless building that has no >>direct local presence. 
>> >> cheers jon From hgs at cs.columbia.edu Fri Apr 1 07:45:43 2005 From: hgs at cs.columbia.edu (Henning Schulzrinne) Date: Fri, 01 Apr 2005 10:45:43 -0500 Subject: [e2e] 911 and cell phones In-Reply-To: <424D6A0E.4050904@reed.com> References: <424D64A3.5050607@cs.columbia.edu> <424D6A0E.4050904@reed.com> Message-ID: <424D6CA7.1080000@cs.columbia.edu> None of the active participants in the IETF is arguing for a ten-year plan; I think everyone involved would be happy to ditch the existing system tomorrow - and certainly keep people from spending gobs of money on patching it. Limitations of available public funding, industry structures (and, in some cases, lack of technical skills in PSAPs) make a slow deployment likely. It's obviously more fun to write jeremiads about the IETF than deal with the complicated reality of large deployed, safety-critical systems. David P. Reed wrote: > Henning, it's great that there's an ECRIT working group with a 10-year > transition plan. The best is always worth waiting for. It's kind of > like the plan to deal with human-caused climate change. First we need > to study it, so we can figure out the optimum theoretical answer (or for > that matter, whether there's a problem at all). From dpreed at reed.com Fri Apr 1 08:10:56 2005 From: dpreed at reed.com (David P. Reed) Date: Fri, 01 Apr 2005 11:10:56 -0500 Subject: [e2e] 911 and cell phones In-Reply-To: <424D6CA7.1080000@cs.columbia.edu> References: <424D64A3.5050607@cs.columbia.edu> <424D6A0E.4050904@reed.com> <424D6CA7.1080000@cs.columbia.edu> Message-ID: <424D7290.1010501@reed.com> Henning Schulzrinne wrote: > It's obviously more fun to write jeremiads about the IETF than deal > with the complicated reality of large deployed, safety-critical systems. Indeed it is! :-) The difference is that I have worked with such systems in the past, and work directly with people who do have to deal with such things. The IETF has no accountability whatsoever. 
From braden at ISI.EDU Fri Apr 1 09:06:04 2005
From: braden at ISI.EDU (Bob Braden)
Date: Fri, 1 Apr 2005 09:06:04 -0800 (PST)
Subject: [e2e] 911 and cell phones
Message-ID: <200504011706.JAA25472@gra.isi.edu>

What does this topic have to do with the end-to-end principle/practice?

Bob Braden

From jtw at lcs.mit.edu Fri Apr 1 09:35:54 2005
From: jtw at lcs.mit.edu (John Wroclawski)
Date: Fri, 1 Apr 2005 12:35:54 -0500
Subject: [e2e] 911 and cell phones
In-Reply-To: <200504011706.JAA25472@gra.isi.edu>
References: <200504011706.JAA25472@gra.isi.edu>
Message-ID:

At 9:06 AM -0800 4/1/05, Bob Braden wrote:
>What does this topic have to do with the end-to-end principle/practice?
>
>Bob Braden

Actually, it does seem to - when you strip lots of detail away a key question seems to be the architectural choice of "end system knows where it is and tells the dispatcher" vs "infrastructure expected to know where the thing using it is". Interesting that both VoIP and cellular tech are apparently pushing the architecture towards a more e2e model.

--john

From Farooq.Bari at cingular.com Fri Apr 1 14:41:47 2005
From: Farooq.Bari at cingular.com (Bari, Farooq)
Date: Fri, 1 Apr 2005 14:41:47 -0800
Subject: [e2e] 911 and cell phones
Message-ID:

Can the end device be trusted in such matters? Can someone sitting in say Europe make a VoIP emergency call in US? Should not it be like trust but verify?

> -----Original Message-----
> From: end2end-interest-bounces at postel.org [mailto:end2end-interest-bounces at postel.org] On Behalf Of John Wroclawski
> Sent: Friday, April 01, 2005 9:36 AM
> To: Bob Braden; Black_David at emc.com; dpreed at reed.com
> Cc: end2end-interest at postel.org
> Subject: Re: [e2e] 911 and cell phones
>
> At 9:06 AM -0800 4/1/05, Bob Braden wrote:
> >What does this topic have to do with the end-to-end principle/practice?
> >
> >Bob Braden
>
> Actually, it does seem to - when you strip lots of detail away a key
> question seems to be the architectural choice of "end system knows
> where it is and tells the dispatcher" vs "infrastructure expected to
> know where the thing using it is". Interesting that both VoIP and
> cellular tech are apparently pushing the architecture towards a more
> e2e model.
>
> --john

From cannara at attglobal.net Fri Apr 1 20:19:26 2005
From: cannara at attglobal.net (Cannara)
Date: Fri, 01 Apr 2005 20:19:26 -0800
Subject: [e2e] Skype and congestion collapse.
References: <026F8EEDAD2C4342A993203088C1FC051C270E@esealmw109.eemea.ericsson.se> <42360909.6090509@attglobal.net> <424C3D96.3000402@reed.com>
Message-ID: <424E1D4E.3BDC7AFF@attglobal.net>

Actually, David, I wasn't talking about cell phones, but wired lines. However, I did need to use 911 via a cell some months back, when an idiot kid was shooting BBs at cars passing near where we live. He was dumb enough to shoot at my wife's side of the car as we cruised back past, trying to identify his house. So I called 911, which in Calif. goes to the CHP. The CHP called the Sheriff and within minutes 2 cars were there. The officers said they'd give the kid a real scare and put him in one car, while talking about going to Juvenile Hall. They then impressed on him the seriousness of what he'd done, and he also gave up his friend, who had hidden in the garage. Then the absent parent returned home in his obligatory SUV to witness how well he'd brought up his kid. As the officers explained to all of them the illegality of such weapons in our county, they also let them twist in the wind a bit, wondering if we'd press charges. After some time, we let the officers know we wouldn't and they said they'd make the kids write letters of apology. They did a good job of that, cutting short potential lives of crime -- all thanks to a cellphone and 911.
:] At some point, laws and trust can and do work together to provide reliable emergency services. IP phone isn't there yet. Alex "David P. Reed" wrote: > > Alex - the underlying assumption is that traditional telephony delivers > 911 functionality best. Well, the word on the street is that in > California, if you call 911 on your basic, non IP cell phone, your exact > position is delivered to ... well, no one knows where, but it's a place > that has no capability to actually transfer that information to anyone > who actually can help you in an emergency. Better to call directory > assistance for the phone number of your local police dept. and hope they > don't tell you "911", because that will guarantee 30-90 minutes of > screwing around. > > OK, maybe wired phones still do 911 OK, but do PBXes? I doubt it - so > the point about bosses may be bogus as well. > > Here the argument that 911 should be "in the network" fails. I'd much > rather have my actual physical telephone be smart enough to figure out > how to summon emergency services (perhaps finding the doctor who is in > the next cubicle over if the SLP emergency service existed), I think. > > Dale Hatfield points out that the phone companies have made it > *impossible* to deploy a "911-like" service over the WWW, because who > can trust that a person would actually tell the truth that they have a > life or death situation on their hands. But of course we can ALL trust > Verizon Wireless with our lives... yeah right. From cannara at attglobal.net Fri Apr 1 21:13:24 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 01 Apr 2005 21:13:24 -0800 Subject: [e2e] TFRC vs UDP References: Message-ID: <424E29F4.2DC4DA98@attglobal.net> The problem with all of these "The transport does what the network should do" bandaids is that they're just that. 
Combine dealing with congestion at the wrong layer, in naive ways, with the traditional lack of standards review and source control, and that's given us our wonderfully insecure, spam-ridden Internet. What we have today is barely a networking existence proof, after spending many millions of taxpayers' dollars, over about 30 years, for certain 'researchers' and their grad students to write papers, convene around the world and 'invent' rather than engineer protocols. This is not meant as a criticism of DCCP, TFRC, etc., just a commentary on the vast difference between a subsidized, poorly-managed, self-involved, bureaucracy-stunted Internet and something elegant, economical, efficient and worthy of general pride, like Ethernet quickly became and has remained. :]

The just out "E2E Vision" draft says this, perhaps inadvertently, by suggesting that we need a "vision" for data communications for the next "10-15 years". Well, yes, we needed it 10-15 years ago, when serious protocol R&D was stopped -- without properly addressing 'minor' details, like congestion, or even node addressing. The Vision doc also exhibits the Internet bureaucracy's traditional NIH approach, as in:

"The older members of the data communications research community spent some of their formative years in the time when data communications was being revolutionized by the creation of a new paradigm: packet switching. While packet switching is now an accepted, indeed, lauded way to think about data communications, into the early 1980s it was still a radical idea and into the 1990s required periodic justification."

leading newer readers to think the Internet folks invented packet switching. This self-serving comment isn't even appropriate in a "vision" document, but it reflects the tradition of maintaining ignorance that any bureaucracy depends on to avoid scrutiny.
After all, TCP/IP folks then apparently never even knew what a MAC was, because the IMPs on their Unix boxes somehow used phone lines to get to other IMPs on other Unix boxes. They apparently didn't even understand why they should have lauded things like Ethernet, Znet, CromemcoNet, CorvusNet, XNS, SNA, yadda, yadda, which in the '70s depended not on the public dole from DARPA, but real corporate investment in efficient networking systems. Though not being a great IBM fan, I do have to admit to knowing no one who ever hacked an SNA network, but someone probably did something, once, and it was likely harder than getting Sendmail to execute remote code.

What a "Vision" doc could say, to be more trustable than a Karl Rove memo, is maybe:

"Today's Internet has demonstrated that open, international network communication, at will, is a realizable goal. Unfortunately, lack of proper engineering attention to certain areas has also exposed shortsightedness on the part of its designers. For example, a mistaken trust in human nature has left our Internet and all its users exposed to extremely serious issues: insecurity, denial of service, unwanted traffic, lack of efficiency, economic barriers, all manner of traditional scams, and even heightened potential for identity loss. The vision for the Internet now should be to move beyond its existence-proof phase and into the realm of a safe, reliable, economical and responsible utility."

Anyway, it's interesting and encouraging that we at least have some folks concerned about the future. I hope rocking the boat is on their agenda.

Alex

"Phelan, Tom" wrote:
>
> Hi Syed,
>
> DCCP includes TFRC as one of its congestion control algorithms, and there has been quite a bit of discussion in the group of the impact of TFRC on streaming media applications. The DCCP User Guide contains an extensive discussion of the issue.
Unfortunately, it's timed out of the drafts archive as we work out the future course of the guide, but it's available at http://www.phelan-4.com/dccp/draft-ietf-dccp-user-guide-02.txt. > > Tom Phelan > > > -----Original Message----- > > From: end2end-interest-bounces at postel.org > > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Syed Faisal > > Hasan > > Sent: Thursday, March 31, 2005 8:37 AM > > To: gdc at iki.fi > > Cc: end2end-interest at postel.org > > Subject: Re: [e2e] TFRC vs UDP > > > > > > Hi Dado, > > > > > > >Syed Faisal Hasan wrote: > > >> > > >>To whom it may concern, > > >> > > >>TFRC was designed for use by the Continuous Media (CM) > > applications. But > > >>why will a CM application which is performing well using > > UDP, use TFRC if > > >>there is performance gap (more latency, less number of > > packets transmitted > > >>in the same time, high rate fluctuations in the beginning) > > betwen UDP and > > >>TFRC ? May be thats the reason we haven't seen any > > applicatons using TFRC. > > >>On the other hand there is no (I haven't found) research > > which analyzes > > >>the performance difference between UDP and TFRC. It is > > clear that TFRC > > >>will not perform exactly like UDP ( due to TFRC's > > friendliness with TCP), > > >>but how much can we expect from TFRC? > > > > > > > > >Hi Syed, > > > > > >the motivation is that although your application would work > > fine and you (a > > >single person in a society) would have a good welfare using > > UDP, it is your > > >fellow citizens that would potentially suffer from your actions. The > > >network resource should be distributed fairly - whatever it > > means. If we > > >all started to disregard other users, the network might stop working > > >properly - equivalent to anarchy in a society. Mechanisms > > are needed to > > >guarantee fair distribution of the resource. TFRC attempts > > to provide > > >mechanisms to distribute the bandwidth resource in the same > > way TCP does. 
> > > > > >I think performance issues depend more on competing TCP > > flows and the > > >network than on the TFRC control algorithm. > > > > I understand what you are talking about. But I want to know > > the performance > > difference of > > UDP and TFRC in the same scenario. Is there any published > > research on this? > > > > Faisal From cannara at attglobal.net Fri Apr 1 21:20:44 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 01 Apr 2005 21:20:44 -0800 Subject: [e2e] 911 and cell phones References: Message-ID: <424E2BAC.3E57AF8D@attglobal.net> In fact, both infrastructure and end nodes know where the caller is. Which is where the trust & security of the wired POTS system originates -- the physical line has a trusted, internal identifier no one but the telco ever knows, and our cellphones each have unique identifiers we never see, but which the services must know in order to locate us. While we may think our phone numbers are what identify us and our locations, they do not. They're just names on proprietary tables of unique internal system identifiers. Alex "Bari, Farooq" wrote: > > Can the end device be trusted in such matters? Can someone sitting in > say Europe make a VoIP emergency call in US? Should not it be like trust > but verify? > > > -----Original Message----- > > From: end2end-interest-bounces at postel.org [mailto:end2end-interest- > > bounces at postel.org] On Behalf Of John Wroclawski > > Sent: Friday, April 01, 2005 9:36 AM > > To: Bob Braden; Black_David at emc.com; dpreed at reed.com > > Cc: end2end-interest at postel.org > > Subject: Re: [e2e] 911 and cell phones > > > > At 9:06 AM -0800 4/1/05, Bob Braden wrote: > > >What does this topic have to do with the end-to-end > principle/practice? 
> > > > > >Bob Braden > > > > Actually, it does seem to - when you strip lots of detail away a key > > question seems to be the architectural choice of "end system knows > > where it is and tells the dispatcher" vs "infrastructure expected to > > know where the thing using it is". Interesting that both VoIP and > > cellular tech are apparently pushing the architecture towards a more > > e2e model. > > > > --john From cannara at attglobal.net Fri Apr 1 21:31:42 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 01 Apr 2005 21:31:42 -0800 Subject: [e2e] information superhighway finally realized. References: Message-ID: <424E2E3E.EAA691B5@attglobal.net> Good Jon! But, Connecticut and several other states are already passing laws that prevent lots of info from car transponders from being used by anyone but the drivers -- thank goodness! You may have heard of the rental agency in Conn. that was billing customers per mile when they exceeded speed limits. Nasty! :] Alex Jon Crowcroft wrote: > > Thinking about this, what Al Gore really meant has just sunk in: > > What you need is a person and vehicle tracking system - this can be > multi-modal - if a person carries a device that has gps (or cell/tower > or wifi, or other triangulation based) location services, then its > easy - but its also easy if you have plenty of surveillance cameras - > these have two benefits > 1/ you can then implement car registration recognition, and charge for > road usage (and congestion charge) fairly and efficiently, reducing > pollution, accidents and delays and permitting the police to catch > folks that break laws (bad and fast driving etc) with hi resolution > instead of notoriously inaccurate human witnesses > 2/ you can also use this to recognize and catch terrorists, since the > recognition system can be plugged into a traffic anomaly detection > system and autmatically detect people renting cars in airports and > driving them full of gas into tall buildings > > Looking further 
afield, one could put in automatic speed control in
> the cars, and even immobilizers so that if the camera at the roadside
> shows a person who doesn't have the visage of one of the recognized
> (safe, approved) drivers of the said car, it either goes very slowly
> or not at all - even further afield, the vehicle could be
> automatically routed to a county jail - this could also apply to people
> who haven't paid their tax, or are on the run.
>
> It could prevent soldiers awol from iraq driving over the border to
> canada or way down south to mexico
>
> 3/ M-ad hoc
> think of all the benefits - if all the cars are fitted with 802.11
> devices, we could also use them to provide a network - we could cause
> cars to route to places where there is a gap in the connectivity at
> the moment - as per the grossglauser/tse result, and using network
> coding, this would provide for arbitrary capacity almost unbounded in
> gas rich countries.
>
> 4/ creative accountancy
> at the same time, one could have creative CRISPS (Connectivity Rich
> ISPs) that have ingenious billing schemes - your wireless ad hoc
> broadband bill could be rolled into the tax on your gas at the gas
> station - you go fill up with 10 gallons of gas and 100Gbytes of
> download - if you run a multi-occupancy vehicle, and you also
> run peer-to-peer file sharing you would get a double discount.
>
> 5/ border routing considerations
> Of course one would need to consider the regulatory problems - if the
> US were to build a lot of roads just south of the canadian border to
> offer "offshore network capacity", but the canadians used Hydro to
> re-charge lots of fuel-cell and electric/hybrid cars, then someone is
> going to think about power-line broadband - then there could be
> interference unless one deploys OFDM (Oil For Download Mobility)
>
> 6/ denial of service, and other security problems
> of course it is easy to jam radio, and it's easy to jam on the radio
> too.
so we need to worry about this, but not too much - if someone > blocks your download, you just drive to blockbuster and pick up the > DVD there anyhow... > > 7/ network management considerations > > The system should be just as manageable as the internet and the road > system. congestion will be rare (there will be no packet loss in my > car), and resilience will be provided by fast oregon bypasses. > overlay routing (put that bike in the pickup, put that laptop on the > bike, put that USB memory stick in the laptop) will naturally occur > and is a matter for further study. > > Now, back to your normal service... > > In missive <424D6067.9040401 at reed.com>, "David P. Reed" typed: > > >>David - exact position may not matter in most cases, but that's what > >>Vonage is being beaten up about (I have 911 on the Vonage line > >>activated, and it gets through to my local emergency services just fine > >>because I told the system when I set it up where that was.) > >> > >>I note that getting the Massachusetts "state police" is rarely useful > >>unless you are driving on the Massachusetts Turnpike (they might as well > >>be a call center in Bangalore). They cannot by law assist you, and do > >>not have the best means to pass on calls to localities, who might help > >>you if you observe someone being mugged or raped on the street in (say) > >>downtown Brockton. > >> > >>As far as I know, every CDMA cell phone being sold today (the vast > >>majority in the market) have GPS in them (in the form of A-GPS, a > >>proprietary technology that comes from qualcomm, which used GPS receiver > >>in the phone, plus an assist from towers that gets the autonomous GPS > >>re-locked fast when it goes out of satellite coverage). I think that > >>GSM phones also all have GPS onboard as well. 
> >> > >>You are right that tower triangulation has failed, but the E911 mandate > >>for cell phones still holds, and GPS is the technology that has been > >>universally adopted, and works pretty well, as far as getting location. > >> > >>But as I said, knowing approximate or exact location isn't very good if > >>the system design actually routes calls away from local responders to a > >>single point of failure in some remote, windowless building that has no > >>direct local presence. > >> > >> > > cheers > > jon From cannara at attglobal.net Fri Apr 1 22:15:19 2005 From: cannara at attglobal.net (Cannara) Date: Fri, 01 Apr 2005 22:15:19 -0800 Subject: [e2e] Frivolity, was Re: Skype and congestion collapse. References: <026F8EEDAD2C4342A993203088C1FC051C270E@esealmw109.eemea.ericsson.se> <42360909.6090509@attglobal.net> <424C3D96.3000402@reed.com> Message-ID: <424E3877.C199C480@attglobal.net> Good! We had to do the same thing yesterday with our garbage collector, but she actually was only 5 miles away. You realize that McDonald's & Jack in the Box are outsourcing their drive-up order taking too? Will Indians not want the work because of the cow meat? :] Alex Lloyd Wood wrote: > > On Thu, 31 Mar 2005, David P. Reed wrote: > > > Date: Thu, 31 Mar 2005 13:12:38 -0500 > > From: David P. Reed > > To: Alex Cannara > > Cc: end2end-interest at postel.org > > Subject: Re: [e2e] Skype and congestion collapse. > > > > Alex - the underlying assumption is that traditional telephony delivers > > 911 functionality best. Well, the word on the street is that in > > California, if you call 911 on your basic, non IP cell phone, your exact > > position is delivered to ... > > India. > > http://www.doonesbury.com/strip/dailydose/index.html?uc_full_date=20050320 > > L. 
From Jon.Crowcroft at cl.cam.ac.uk Sat Apr 2 01:06:46 2005
From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft)
Date: Sat, 02 Apr 2005 10:06:46 +0100
Subject: [e2e] 911 and cell phones
In-Reply-To: Message from John Wroclawski of "Fri, 01 Apr 2005 12:35:54 CDT."
Message-ID:

yes... what could be more end-to-end than knowing where the end is?

and the means justifies the end especially if it helps you with prevention of theft and denial of service too...

In missive , John Wroclawski typed:

 >>At 9:06 AM -0800 4/1/05, Bob Braden wrote:
 >>>What does this topic have to do with the end-to-end principle/practice?
 >>>
 >>>Bob Braden
 >>
 >>Actually, it does seem to - when you strip lots of detail away a key
 >>question seems to be the architectural choice of "end system knows
 >>where it is and tells the dispatcher" vs "infrastructure expected to
 >>know where the thing using it is". Interesting that both VoIP and
 >>cellular tech are apparently pushing the architecture towards a more
 >>e2e model.
 >>
 >>--john

 cheers

   jon

From faisal at lums.edu.pk Sun Apr 3 03:18:38 2005
From: faisal at lums.edu.pk (Faisal Aslam)
Date: Sun, 3 Apr 2005 15:18:38 +0500
Subject: [e2e] UDP checksum field?
Message-ID: <8C128AD85EEA5747B9F81C6230BA12F6544BB3@jhelum.lumsnet.edu.pk>

Hi,

Why do we have a checksum field in the UDP header, given that UDP does not provide data retransmission etc.?
I think it is used only to silently discard a packet with a wrong checksum (that's it?). Is there any other application of the checksum field?

Sorry if the question is too naive.

Thanks
Faisal

From cannara at attglobal.net Sun Apr 3 20:06:09 2005
From: cannara at attglobal.net (Cannara)
Date: Sun, 03 Apr 2005 20:06:09 -0700
Subject: [e2e] UDP checksum field?
References: <8C128AD85EEA5747B9F81C6230BA12F6544BB3@jhelum.lumsnet.edu.pk>
Message-ID: <4250AF21.552F7D9A@attglobal.net>

Faisal, yes indeed, the checksum lets the receiver discard garbage, just as the CRC at the frame level does.
UDP can be asked to not use the checksum, for devil-may-care applications. :]

Alex

Faisal Aslam wrote:
>
> Hi,
>
> Why do we have a checksum field in the UDP header, given that UDP does not provide data retransmission etc.?
> I think it is used only to silently discard a packet with a wrong checksum (that's it?). Is there any other application of the checksum field?
>
> Sorry if the question is too naive.
>
> Thanks
> Faisal
>

From philippe.gentric at philips.com Mon Apr 4 03:02:06 2005
From: philippe.gentric at philips.com (Philippe Gentric)
Date: Mon, 4 Apr 2005 12:02:06 +0200
Subject: [e2e] 911 and cell phones
In-Reply-To: <200504011706.JAA25472@gra.isi.edu>
Message-ID:

>What does this topic have to do with the end-to-end principle/practice?

imagine anyone could fake ten emergency calls in ten minutes, across an ocean, and get away with it...

don't you think this would be a major end-to-end [user-to-police] *principle* problem?

Philippe.

Bob Braden
Sent by: end2end-interest-bounces at postel.org
2005-04-01 19:06

To: Black_David at emc.com dpreed at reed.com
cc: end2end-interest at postel.org (bcc: Philippe Gentric/SUR/PSW/PHILIPS)
Subject: Re: [e2e] 911 and cell phones
Classification:

What does this topic have to do with the end-to-end principle/practice?

Bob Braden

From lynne at telemuse.net Mon Apr 4 09:30:47 2005
From: lynne at telemuse.net (Lynne Jolitz)
Date: Mon, 4 Apr 2005 09:30:47 -0700
Subject: [e2e] UDP checksum field?
In-Reply-To:
Message-ID: <001701c53933$a7be2960$6e8944c6@telemuse.net>

Yes, Lloyd is exactly right here. It is often the case that people turn off UDP checksums to "buy" more performance by relying on the CRC of the Ethernet packet. It's not a stupid question - it's a very smart question, and a lot of smart people get fooled by this.
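
To make concrete what is being switched off, here is a rough sketch of the 16-bit ones'-complement Internet checksum that UDP carries (RFC 768, RFC 1071). It is written in Python, which nobody in this thread used; the function names and the sample payload are invented for illustration, and the UDP pseudo-header is left out to keep it short, so this is not a drop-in UDP implementation.

```python
def internet_checksum(data: bytes) -> int:
    """Complement of the 16-bit ones'-complement sum of the data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def udp_checksum_field(data: bytes) -> int:
    """Value to place in the header field (pseudo-header omitted here).

    In UDP over IPv4 a transmitted zero means "no checksum was computed",
    so a computed zero is sent as all ones instead (RFC 768).
    """
    return internet_checksum(data) or 0xFFFF


payload = b"NFS write: block 42"      # hypothetical payload
checksum = internet_checksum(payload)

corrupted = bytearray(payload)
corrupted[5] ^= 0x04                  # one bit flipped after the link CRC was checked
bit_flip_detected = internet_checksum(bytes(corrupted)) != checksum
```

The Ethernet CRC is checked and stripped at each hop, so a bit flipped in a NIC's or switch's memory after that check is invisible to it; a check computed by the sender and verified by the receiver still sees the flip.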

For example, the Sun datacenter back in the early 1990s had an NFS cluster project called Sunbox - an array of workstation CPUs that did divide and conquer to build a massive file server. It used an Ethernet multiplexer to dynamically split the load. To buy back performance, they turned off the UDP checksum. It worked fine until they had a bad lot of Ethernet boards with substandard memories - this wasn't picked up in tests because the test units were resending the occasionally corrupted packets (UDP checksums were usually turned on), and in TCP the checksums would trigger resends as well. It was also a fairly rare problem, and the test periods were too short to reveal the nature of the problem easily.

But when UDP checksums were turned off in normal use, the resulting NFS requests were corrupting the filesystem (which in this case were database files), forcing rebuilds and manual repairs of database tables.

As they were about to announce and release it, they suddenly discovered this problem - they noticed the corruption and, in order to determine whether it was in the high level (stack or above) or lower levels, they turned on checksums and it worked immediately.

They then examined the packets that failed the checksum to trace the fault back down the stack, through the link layer, and discover where the corruption occurred. With logic analyzers, they were able to observe that the contents going into memory from the NIC on reception were different from the contents going out of the memory and traveling across the bus to the processor.

This is a surprisingly common problem in datacenters - sometimes the problem would be a switch, sometimes a configuration error, sometimes a programming error in the application, and so forth. I most recently experienced this problem with an overheated Ethernet switch passing video on an internal network.

I also ran into this at an Internet portal company where I was a manager.
We were using NetApps file servers to mirror the daily information - NetApps at the time encouraged staff to turn off checksums to increase performance. The DBAs noticed problems and ended up doing frequent rebuilds, but couldn't figure out why. It took me a lot of time to convince my staff to turn on the checksums because they were told "they don't have to" by NetApps. Most datacenter staff work by cookbook, and this wasn't in the cookbook. When they finally tried it, it worked. This little problem cost us a lot of time and aggravation for very little (if any) performance gain. Performance gain by turning off checksums now can be obviated through the use of intelligent NIC technologies like SiliconTCP (http://jolitz.telemuse.net/pubs/pt2001_01/item) and TOE that calculate the checksum as the packet is being received. But we don't have this in commodity switches yet, so check that switch if you're having problems. Higher level checksums are worth it every time. Don't leave the server without them. :-) Lynne Jolitz. ---- We use SpamQuiz. If your ISP didn't make the grade try http://lynne.telemuse.net > -----Original Message----- > From: end2end-interest-bounces at postel.org > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Lloyd Wood > Sent: Monday, April 04, 2005 2:48 AM > To: Faisal Aslam > Cc: end2end-interest at postel.org > Subject: Re: [e2e] UDP checksum field? > > > On Sun, 3 Apr 2005, Faisal Aslam wrote: > > > Why we have checksum field is in UDP header, as UDP does not provide > > data retransmission etc? I think it is used only to silently > > discarding a packet with wrong checksum (thats it?). > > yes - you need an end-to-end check against a corrupted packet. UDP > could have the checksum turned off, which proved disastrous for a > number of applications, subtly corrupted filing systems which didn't > have higher-level end2end checks etc. > > > Is there any other application of checksum field? 
> > For other applications > http://www.faqs.org/rfcs/rfc3828.html > > UDP Lite originally sprang out of the observation that UDP has > redundant length information, and that this information could be > combined with the checksum (as in TCP/UDP) to give partial coverage. > > L. > > > > > Sorry if the question is too naive. > > > > Thanks > > Faisal > > > > > > > From cannara at attglobal.net Mon Apr 4 10:02:32 2005 From: cannara at attglobal.net (Cannara) Date: Mon, 04 Apr 2005 10:02:32 -0700 Subject: [e2e] UDP checksum field? References: <001701c53933$a7be2960$6e8944c6@telemuse.net> Message-ID: <42517328.77FD2664@attglobal.net> I'll add a funny (if you're not using Oracle TNS gateways) SQL transport example that still exists today, despite being pointed out to Oracle about a decade ago. When Network General was adding more SQL decodes to the Sniffer(r), in the '90s, we had a presentation on the Oracle transport (TNS) underlying SQL Net traffic. TNS rode on Netware SPP, or TCP, etc. The fellow went into packet fields in detail and explained how Oracle also made gateway software available for Sun boxes to go from an Oracle system to an IBM SNA db system. The gateway received SQL on TNS on TCP on IP on Ethernet (for instance) and spit out SQL on TNS or whatever IBM wanted. As he expounded on TNS pkt fields, a few hands went up -- "What's the checksum field for if it's always 0?" asked a few experienced network folks. The presenter turned back to the slide show and said: "It's unimplemented for now". Without malice, another question was posed: "Well if it's unused and your gateway has bad memory, how do you know the data going into the db on the other side will be good?" The presenter, a highly lauded Oracle techy, looked at the screen for a bit, looked back at the audience, shuffled his feet, looked again at the screen, and finally said words like: "I don't know". 
After the presentation, a letter was written to Oracle, copied to Ellison, explaining exactly the problem and urging the TNS checksum be implemented. No response ever came back, and, if you look at a TNS packet today, the checksum is still zero. I guess no one has used the gateway software who cares about their data. :] Alex PS Note that "gateway" here is used in the proper sense, not for "router". Lynne Jolitz wrote: > > Yes, Lloyd is exactly right here. It is often the case that people turn off UDP checksums to "buy" more performance by relying on the CRC of the ethernet packet. It's not a stupid question - it's a very smart question, and a lot of smart people get fooled by this. > > For example, the Sun datacenter back in the early 1990's had an NFS cluster project called Sunbox - an array of workstation CPUs that did divide and conquer to build a massive file server. It used an ethernet multiplexer to dynamically split the load. To buy back performance, they turned off the UDP checksum. It worked fine until they had a bad lot of ethernet boards with substandard memories - this wasn't picked up in tests because the test units were doing resends of the occasionally corrupted packets (UDP checksums usually was turned on), and in TCP the checksums would do resends as well. It was also a fairly rare problem, and the test periods were too short to pick up on the nature of this problem easily. > > But when UDP checksums were turned off in normal use, the resulting NFS requests were corrupting the filesystem (which in this case were database files), forcing rebuilds and manual repairs of database tables. > > As they were about to announce and release it, they suddenly discovered this problem - they noticed the corruption and in order to determine whether it was in the high level (stack or above) or lower levels, they turned on checksums and it worked immediately. 
> > They then examined the failed checksum packets to traceback in the lower level stack-down through the link layer to discover where the corruption occured. With logic analyzers, they were able to observe the contents going into memory from the NIC on reception was different than the contents going out of the memory and traveling across the bus to the processor. > > This is a surprisingly common problem in datacenters - sometimes the problem would be a switch, sometimes a configuration error, sometimes a programming error in the application, and so forth. I most recently experienced this problem with an overheated ethernet switch passing video on an internal network. > > I also ran into this at an Internet portal company where I was a manager. We were using NetApps file servers to mirror the daily information - NetApps at the time encouraged staff to turn off checksums to increase performance. The DBAs noticed problems and ended up doing frequent rebuilds, but couldn't figure out why. It took me a lot of time to convince my staff to turn on the checksums because they were told "they don't have to" by NetApps. Most datacenter staff work by cookbook, and this wasn't in the cookbook. When they finally tried it, it worked. This little problem cost us a lot of time and aggravation for very little (if any) performance gain. > > Performance gain by turning off checksums now can be obviated through the use of intelligent NIC technologies like SiliconTCP (http://jolitz.telemuse.net/pubs/pt2001_01/item) and TOE that calculate the checksum as the packet is being received. But we don't have this in commodity switches yet, so check that switch if you're having problems. > > Higher level checksums are worth it every time. Don't leave the server without them. :-) > > Lynne Jolitz. > > ---- > We use SpamQuiz. 
> If your ISP didn't make the grade try http://lynne.telemuse.net > > > -----Original Message----- > > From: end2end-interest-bounces at postel.org > > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Lloyd Wood > > Sent: Monday, April 04, 2005 2:48 AM > > To: Faisal Aslam > > Cc: end2end-interest at postel.org > > Subject: Re: [e2e] UDP checksum field? > > > > > > On Sun, 3 Apr 2005, Faisal Aslam wrote: > > > > > Why we have checksum field is in UDP header, as UDP does not provide > > > data retransmission etc? I think it is used only to silently > > > discarding a packet with wrong checksum (thats it?). > > > > yes - you need an end-to-end check against a corrupted packet. UDP > > could have the checksum turned off, which proved disastrous for a > > number of applications, subtly corrupted filing systems which didn't > > have higher-level end2end checks etc. > > > > > Is there any other application of checksum field? > > > > For other applications > > http://www.faqs.org/rfcs/rfc3828.html > > > > UDP Lite originally sprang out of the observation that UDP has > > redundant length information, and that this information could be > > combined with the checksum (as in TCP/UDP) to give partial coverage. > > > > L. > > > > > > > > Sorry if the question is too naive. > > > > > > Thanks > > > Faisal From braden at ISI.EDU Mon Apr 4 10:33:29 2005 From: braden at ISI.EDU (Bob Braden) Date: Mon, 4 Apr 2005 10:33:29 -0700 (PDT) Subject: [e2e] UDP checksum field? Message-ID: <200504041733.KAA26987@gra.isi.edu> *> explaining exactly the problem and urging the TNS checksum be implemented. No *> response ever came back, and, if you look at a TNS packet today, the checksum *> is still zero. I guess no one has used the gateway software who cares about *> their data. :] *> *> Alex *> Or, the incidence of (detected) failures is so low that no one cares. 
Bob Braden

From lynne at telemuse.net Mon Apr 4 10:46:29 2005
From: lynne at telemuse.net (Lynne Jolitz)
Date: Mon, 4 Apr 2005 10:46:29 -0700
Subject: [e2e] UDP checksum field?
In-Reply-To: <42517328.77FD2664@attglobal.net>
Message-ID: <002201c5393e$3b629840$6e8944c6@telemuse.net>

(With no apologies to Microsoft...) - If the Oracle tech guy had gone to the Microsoft Research school of obfuscation, he would have said "The probability of this event occurring such that the reliability of the underlying link layer is impaired by an improbably low memory bit error at ten to the minus 12 excluding thermal radiative factors and charge displacement is so low as to be impossible, hence the question is irrelevant". :-)

Lynne Jolitz

----
We use SpamQuiz.
If your ISP didn't make the grade try http://lynne.telemuse.net

> -----Original Message-----
> From: end2end-interest-bounces at postel.org
> [mailto:end2end-interest-bounces at postel.org]On Behalf Of Cannara
> Sent: Monday, April 04, 2005 10:03 AM
> To: end2end-interest at postel.org
> Subject: Re: [e2e] UDP checksum field?
>
>
> I'll add a funny (if you're not using Oracle TNS gateways) SQL transport
> example that still exists today, despite being pointed out to Oracle
> about a decade ago. When Network General was adding more SQL decodes to
> the Sniffer(r), in the '90s, we had a presentation on the Oracle
> transport (TNS) underlying SQL Net traffic. TNS rode on Netware SPP, or
> TCP, etc. The fellow went into packet fields in detail and explained how
> Oracle also made gateway software available for Sun boxes to go from an
> Oracle system to an IBM SNA db system. The gateway received SQL on TNS
> on TCP on IP on Ethernet (for instance) and spit out SQL on TNS or
> whatever IBM wanted.
>
> As he expounded on TNS pkt fields, a few hands went up -- "What's the
> checksum field for if it's always 0?" asked a few experienced network folks.
The > presenter turned back to the slide show and said: "It's unimplemented for > now". Without malice, another question was posed: "Well if it's > unused and > your gateway has bad memory, how do you know the data going into > the db on the > other side will be good?" The presenter, a highly lauded Oracle > techy, looked > at the screen for a bit, looked back at the audience, shuffled his feet, > looked again at the screen, and finally said words like: "I > don't know". > > After the presentation, a letter was written to Oracle, copied to Ellison, > explaining exactly the problem and urging the TNS checksum be > implemented. No > response ever came back, and, if you look at a TNS packet today, > the checksum > is still zero. I guess no one has used the gateway software who > cares about > their data. :] > > Alex > > PS Note that "gateway" here is used in the proper sense, not for "router". > > Lynne Jolitz wrote: > > > > Yes, Lloyd is exactly right here. It is often the case that > people turn off UDP checksums to "buy" more performance by > relying on the CRC of the ethernet packet. It's not a stupid > question - it's a very smart question, and a lot of smart people > get fooled by this. > > > > For example, the Sun datacenter back in the early 1990's had an > NFS cluster project called Sunbox - an array of workstation CPUs > that did divide and conquer to build a massive file server. It > used an ethernet multiplexer to dynamically split the load. To > buy back performance, they turned off the UDP checksum. It worked > fine until they had a bad lot of ethernet boards with substandard > memories - this wasn't picked up in tests because the test units > were doing resends of the occasionally corrupted packets (UDP > checksums usually was turned on), and in TCP the checksums would > do resends as well. It was also a fairly rare problem, and the > test periods were too short to pick up on the nature of this > problem easily. 
> > > > But when UDP checksums were turned off in normal use, the > resulting NFS requests were corrupting the filesystem (which in > this case were database files), forcing rebuilds and manual > repairs of database tables. > > > > As they were about to announce and release it, they suddenly > discovered this problem - they noticed the corruption and in > order to determine whether it was in the high level (stack or > above) or lower levels, they turned on checksums and it worked > immediately. > > > > They then examined the failed checksum packets to traceback in > the lower level stack-down through the link layer to discover > where the corruption occured. With logic analyzers, they were > able to observe the contents going into memory from the NIC on > reception was different than the contents going out of the memory > and traveling across the bus to the processor. > > > > This is a surprisingly common problem in datacenters - > sometimes the problem would be a switch, sometimes a > configuration error, sometimes a programming error in the > application, and so forth. I most recently experienced this > problem with an overheated ethernet switch passing video on an > internal network. > > > > I also ran into this at an Internet portal company where I was > a manager. We were using NetApps file servers to mirror the daily > information - NetApps at the time encouraged staff to turn off > checksums to increase performance. The DBAs noticed problems and > ended up doing frequent rebuilds, but couldn't figure out why. It > took me a lot of time to convince my staff to turn on the > checksums because they were told "they don't have to" by NetApps. > Most datacenter staff work by cookbook, and this wasn't in the > cookbook. When they finally tried it, it worked. This little > problem cost us a lot of time and aggravation for very little (if > any) performance gain. 
> > > > Performance gain by turning off checksums now can be obviated > through the use of intelligent NIC technologies like SiliconTCP > (http://jolitz.telemuse.net/pubs/pt2001_01/item) and TOE that > calculate the checksum as the packet is being received. But we > don't have this in commodity switches yet, so check that switch > if you're having problems. > > > > Higher level checksums are worth it every time. Don't leave the > server without them. :-) > > > > Lynne Jolitz. > > > > ---- > > We use SpamQuiz. > > If your ISP didn't make the grade try http://lynne.telemuse.net > > > > > -----Original Message----- > > > From: end2end-interest-bounces at postel.org > > > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Lloyd Wood > > > Sent: Monday, April 04, 2005 2:48 AM > > > To: Faisal Aslam > > > Cc: end2end-interest at postel.org > > > Subject: Re: [e2e] UDP checksum field? > > > > > > > > > On Sun, 3 Apr 2005, Faisal Aslam wrote: > > > > > > > Why we have checksum field is in UDP header, as UDP does not provide > > > > data retransmission etc? I think it is used only to silently > > > > discarding a packet with wrong checksum (thats it?). > > > > > > yes - you need an end-to-end check against a corrupted packet. UDP > > > could have the checksum turned off, which proved disastrous for a > > > number of applications, subtly corrupted filing systems which didn't > > > have higher-level end2end checks etc. > > > > > > > Is there any other application of checksum field? > > > > > > For other applications > > > http://www.faqs.org/rfcs/rfc3828.html > > > > > > UDP Lite originally sprang out of the observation that UDP has > > > redundant length information, and that this information could be > > > combined with the checksum (as in TCP/UDP) to give partial coverage. > > > > > > L. > > > > > > > > > > > Sorry if the question is too naive. 
> > > >
> > > > Thanks
> > > > Faisal
>

From craig at aland.bbn.com Mon Apr 4 11:32:10 2005
From: craig at aland.bbn.com (Craig Partridge)
Date: Mon, 04 Apr 2005 14:32:10 -0400
Subject: [e2e] UDP checksum field?
In-Reply-To: Your message of "Mon, 04 Apr 2005 10:33:29 PDT." <200504041733.KAA26987@gra.isi.edu>
Message-ID: <20050404183210.1AAF324B@aland.bbn.com>

In message <200504041733.KAA26987 at gra.isi.edu>, Bob Braden writes:
> *> explaining exactly the problem and urging the TNS checksum be implemented. No
> *> response ever came back, and, if you look at a TNS packet today, the checksum
> *> is still zero. I guess no one has used the gateway software who cares about
> *> their data. :]
> *>
> *> Alex
> *>
>
>Or, the incidence of (detected) failures is so low that no one cares.

I vaguely recall that some part of BBN had experience with the NFS checksum problem and that it took a while for the corruption of the filesystem to become visible. That is, errors are infrequent enough that NIC (or switch, or whatever, ...) testing doesn't typically catch them. So bit rot is slow and subtle -- and when you find it, much has been trashed (especially if one ignores early warning signs, such as large compilations occasionally failing with unrepeatable loading/compilation errors).

Craig

From lynne at telemuse.net Mon Apr 4 11:58:12 2005
From: lynne at telemuse.net (Lynne Jolitz)
Date: Mon, 4 Apr 2005 11:58:12 -0700
Subject: [e2e] UDP checksum field?
In-Reply-To: <20050404183210.1AAF324B@aland.bbn.com>
Message-ID: <003301c53948$3fdcf140$6e8944c6@telemuse.net>

Absolutely right Craig - this was exactly the case with the Sunbox project I described earlier, as well as the datacenter mirror example. Too much damage too late.
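
One way to catch this kind of slow rot earlier is a check that travels with the data itself, from the writer to the eventual reader. The sketch below is only an illustration under stated assumptions: it is in Python, the `seal` and `unseal` helpers are hypothetical names, and SHA-256 is simply a convenient digest, not something any system mentioned in this thread is known to have used. It also shows one thing the 16-bit transport checksum cannot see: the same 16-bit words in a different order.

```python
import hashlib

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes


def internet_checksum(data: bytes) -> int:
    """Complement of the 16-bit ones'-complement sum of the data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def seal(record: bytes) -> bytes:
    """Writer side (hypothetical helper): append a digest of the record."""
    return record + hashlib.sha256(record).digest()


def unseal(blob: bytes) -> bytes:
    """Reader side (hypothetical helper): return the record or raise ValueError."""
    record, digest = blob[:-DIGEST_LEN], blob[-DIGEST_LEN:]
    if hashlib.sha256(record).digest() != digest:
        raise ValueError("record was corrupted between writer and reader")
    return record


original = b"\x12\x34\xab\xcd"
reordered = b"\xab\xcd\x12\x34"  # same 16-bit words, swapped

# Addition is commutative, so the transport checksum cannot tell these apart.
same_transport_checksum = internet_checksum(original) == internet_checksum(reordered)

# The digest stored with the record does notice the difference.
blob = seal(original)
damaged = reordered + blob[len(original):]
try:
    unseal(damaged)
    detected = False
except ValueError:
    detected = True
```

The digest is checked where the data is finally used, so it covers the NIC, the bus, the disk and any box in between that recomputes transport checksums; the cost is a few bytes per record and a hash on each read.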
As implicit dependence on reliability increases, the value of checksums becomes very clear - in the early deep space probes they learned the hard way the importance of always providing enough redundancy and error correction, because a single bit error might be the one that leads to the destruction of the communications ability of the spacecraft. One spacecraft had a corruption error like this that destroyed it for precisely this reason. They optimized out reliability to get a slightly greater data rate, and lost the spacecraft (this has happened more than once). We're reaching a point where you have to seriously think about whether an "optimization" is really valuable - since as Craig notes, you may not notice a problem until too late. In this age of ubiquitous computing, with plentiful processor, memory, and network bandwidth, we should be focussed on increased reliability and integrity, but old habits of a more parsimonious age die hard. Another very recent example of ignoring the value of checksums is reflected in the recent 'fasttrack' problems of incorrect billing of tolls. Lynne. > -----Original Message----- > From: end2end-interest-bounces at postel.org > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Craig Partridge > Sent: Monday, April 04, 2005 11:32 AM ... > I vaguely recall that some part of BBN had experience with the NSF > checksum problem and that it took a while for the corruption of the > filesystem to become visible. That is, errors are infrequent enough > that NIC (or switch, or whatever, ...) testing doesn't typically catch > them. So bit rot is slow and subtle -- and when you find it, much has > been trashed (especially if one ignores early warning signs, such as > large compilations occasionally failing with unrepeatable > loading/compilation > errors). > > Craig > ---- We use SpamQuiz. 
If your ISP didn't make the grade try http://lynne.telemuse.net From jonathan at dsg.stanford.edu Mon Apr 4 12:46:04 2005 From: jonathan at dsg.stanford.edu (Jonathan Stone) Date: Mon, 04 Apr 2005 12:46:04 -0700 Subject: [e2e] UDP checksum field? In-Reply-To: Your message of "Mon, 04 Apr 2005 14:32:10 EDT." <20050404183210.1AAF324B@aland.bbn.com> Message-ID: In message <20050404183210.1AAF324B at aland.bbn.com>, Craig Partridge writes: >In message <200504041733.KAA26987 at gra.isi.edu>, Bob Braden writes: > >> *> explaining exactly the problem and urging the TNS checksum be implemented. No >> *> response ever came back, and, if you look at a TNS packet today, the checksum >> *> is still zero. I guess no one has used the gateway software who cares about >> *> their data. :] >> *> >> *> Alex >> *> >> >>Or, the incidence of (detected) failures is so low that no one cares. > >I vaguely recall that some part of BBN had experience with the NSF >checksum problem and that it took a while for the corruption of the >filesystem to become visible. That is, errors are infrequent enough >that NIC (or switch, or whatever, ...) testing doesn't typically catch >them. So bit rot is slow and subtle -- and when you find it, much has >been trashed (especially if one ignores early warning signs, such as >large compilations occasionally failing with unrepeatable loading/compilation >errors). Hi Craig, I believe Steve Crocker mentioned this point after I presented one of our papers on e2e checksums. This instance was, again, a large NFS server (don't know if it was BBN or elsewhere), where the data corruption was not detected until after several backup cycles. So even the backup tapes were corrupted. I was told people working on key projects had to go back to hardcopy print-outs and retype them. Whether it's safe to trust outboard checksum offload is a whole other story. From dpreed at reed.com Mon Apr 4 14:19:58 2005 From: dpreed at reed.com (David P.
Reed) Date: Mon, 04 Apr 2005 17:19:58 -0400 Subject: [e2e] UDP checksum field? In-Reply-To: References: Message-ID: <4251AF7E.9050002@reed.com> When all is said and done, the UDP checksum isn't, and never was, fully end-to-end protection, since there are few, if any, applications where the correctness of the application data can be *fully assured* by making sure that a single datagram gets delivered correctly. It's an optional standardized way to help deal with a common risk that can arise due to bugs and other issues that show up in engineered systems, not a guarantee of any particular property. Since UDP datagrams can still be duplicated and modified by a checksum-preserving modification in the network (such modifications are now common, given middleboxes that discard the checksum and compute a new one in many cases), there is no way to assure by a mere checksum field that data has not been corrupted somewhere in the network. Assurance is not the benefit; applications still need to do truly end-to-end checking - UDP's ability to help in detecting incipient problems is very useful, however. I won't elaborate here on the more subtle issues of TCP's lack of true end-to-end reliability. Suffice it to say that there is a difficult issue in a definition of reliability that must depend on the difference between "design errors" and "random errors". From jag8719 at vip.sina.com Mon Apr 4 17:58:18 2005 From: jag8719 at vip.sina.com (Jason Gao) Date: Tue, 5 Apr 2005 08:58:18 +0800 Subject: [e2e] Paper on ATP; end to end security provided by ATP: where SHA1-80 is enough Message-ID: <200504050052.j350qv620435@boreas.isi.edu> The draft on Asymmetric Transport Protocol: http://219.232.1.66/attached/info/atp-2004.pdf It has a security feature: an encrypted transport mode, which combines AES and SHA1. It is suggested that the mode is optional when ATP is over UDP while mandatory when over IPv6.
The algorithm (rewritten and clarified recently, renamed to AES-SHA1-CV) is that: Problem Space: To ensure successive packets came from the same source (identity of the source), and in the same time To protect confidentiality of the payload. (It is not in the same problem space as AES-CCM or AES-OCB.) It is assumed that a shared secret which is at least 283 bit has been established (using Elliptic Curve Diffie-Hellman key-agreement process, elliptic group sect283k1). In encrypted transport mode, if there is non-null payload no extension header may sit between the ATP fixed header and the payload. The structure of the ATP packet is: 0 1 2 3 |0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1|2|3|4|5|6|7|8|9|0|1| |---------------------------------------------------------------| | OpCode | Data Segment Length | |---------------------------------------------------------------| | Sequence Number |0 |---------------------------------------------------------------| | Connection | | Key |1 |---------------------------------------------------------------| | Next Expected Sequence Number | |---------------------------------------------------------------| | Stack Pointer | Flags | Identity |2 |-------------------------------- /Integrity | | Check | | Code |3 |---------------------------------------------------------------| ~ ~ ~ Payload ~ ~ ~ |---------------------------------------------------------------| ATP fixed header is 192 bit. When it is over UDP, the full ATP fixed header is stored next to UDP header. When it is over native IPv6, apparently ATP fixed header is 128 bit, Sender's behavior ----------------- Encryption, Composition of ICC: Step 1, Get the high-order 16 bits of ICC and the comparison vector (CV): IV80 = SHA1-80(Fixed Header Excluding ICC, Shared Secret) Namely, replace ICC field with the Shared Secret and apply SHA1-80. The high-order 16 bits of IV80 SHALL be stored in the high-order 16 bits of ICC field, while the low-order 64 bits are taken as the CV. 
Note that ; here ',' denotes concatenation. Step 2, Padding The length of the cipher text, which is determined by the length of the original clear text according to the padding method stated below, is stored in the data segment length field. It equals the length of the padded clear text minus 8. The original clear text is first padded with a sequence of zero or more octets. The length of the octet sequence is 15 if the original clear text is already 128-bit aligned; otherwise it is the number of octets required to make the clear text 128-bit aligned, minus 1. The length of the octet sequence, represented by a single octet, is appended as the last octet. (The idea is borrowed from ESP, slightly modified) Then the clear text is padded with the initial 64 bits, which include the OpCode, the data segment length and the sequence number, of the ATP packet, and lastly the 64-bit CV. Step 3, AES-CBC encryption The last 128-bit block of the padded clear text is taken as the initialization vector (IV). The IV and the full padded clear text are fed into the AES-CBC encryption module. The key fed SHOULD be installed by the ULA. By default the key is derived from the shared secret. The key derivation function MUST conform to ANSI-X9.63-KDF [KDF]. The first 64 bits of the cipher text are stored in the low-order 64-bit position of the ICC field. The following bits are stored in the payload field of the ATP packet. Receiver's behavior ------------------- Decryption, Verification of ICC Step 1, Rebuild IV and Preliminary check Again, IV80 = SHA1-80(Fixed Header Excluding ICC, Shared Secret) The high-order 16 bits are compared with the high-order 16 bits of the ICC field of the ATP packet received to preliminarily check whether the packet came from the same source. The initial 64 bits of the ATP packet received and the low-order 64 bits of IV80 form the IV'.
Step 2, AES-CBC decryption The cipher text, taken from the lower-64 bit position of the ICC field and the payload field, together with the IV and the AES key are fed into the AES-CBC decryption module. Step 3, Verifying IV The last block of the decryption result is compared with IV'. If they are equal, the packet SHALL be accepted. Otherwise it MUST be silently discarded. Finally, padding is removed and the clear text payload is delivered to the ULA. ---- It is straightforward to modify the algorithm to use SHA1-144 to obtain the high-order 16 bits of ICC and the 128-bit initialization vector. We choose SHA1-80 because the secure hash algorithm applied used to be MD5, and we believe that an entropy space of 64 bits for the initial word of the IV is enough. The algorithm is actually a combination of AES-CBC and partial HMAC. Partial HMAC protects the packet header which is very short (effectively at most 112 bits take part in the partial HMAC). AES-CBC provides both encryption and message authentication service for the payload. The problem space for the attacker is: Provided that the low-order 64 bits of the MAC value are known (the high-order 64 bits of the MAC value are encrypted and unknown to the attacker, and it is easy to modify the algorithm to make the whole MAC value confidential), and the IV is the clear text of the last block, find a sequence of octets which makes the same AES-CBC-MAC value, and the high-order 64 bits of the MAC value must equal the unknown partial HMAC result of the fixed header while the low-order 64 bits equal the initial 64 bits of the ATP fixed header. Four fields of the fixed header may be modified, to a limited extent: the data segment length, the sequence number, the next expected sequence number and the flag. We believe that the problem is so hard that SHA1-80 is enough here. ------------------- Acknowledgement Thanks to Stephen Sprunk.
He made us aware that the original name of the algorithm, AES-IV-SHA1-80 is misleading and the algorithm itself is obscure. We hope that it is corrected and clarified. From marc.herbert at free.fr Tue Apr 5 02:41:59 2005 From: marc.herbert at free.fr (Marc Herbert) Date: Tue, 5 Apr 2005 11:41:59 +0200 (CEST) Subject: [e2e] very simple IP QoS for the bottleneck access link ? (was Skype and congestion collapse.) In-Reply-To: <9531abdc241f450e15fa92b84fe74310@extremenetworks.com> References: <11ad0fa8050304053342514f51@mail.gmail.com> <200503041318.37290.don@dhoffman.net> <4228E595.9030407@dirtcheapemail.com> <9531abdc241f450e15fa92b84fe74310@extremenetworks.com> Message-ID: On Fri, 4 Mar 2005, RJ Atkinson wrote: > On Mar 4, 2005, at 17:47, Clark Gaylord wrote: > > This is why we really do need some notion of QoS other than The Fat > > Pipe. It doesn't have to be as elaborate as RSVP-disciplined CAC, but > > you need to be able to prioritize traffic that matters and limit the > > amount of traffic that gets prioritized. It doesn't have to be more > > complex than that, but it has to do at least that. [Ergo ... left as > > an exercise to the reader.] > > I don't know that the "network" needs to have a more sophisticated > notion of QoS than best effort. It can sometimes be useful for the > network device connected directly to a congested link (e.g. access > link between a site and its upstream provider) to have some > internal-to-the-box QoS configuration. > > It is not uncommon these days for the access router at the customer > premise to have some ACL ruleset that prefers some traffic over > other traffic or rate-limits certain kinds of traffic -- and > equivalent configuration of the aggregation router on the ISP side > of the same link is also not uncommon these days. OK, so why not generalize, extend, standardize, promote and sell this technique? 
To the point of creating an extremely simple QoS API allowing latency-sensitive applications (assumed to be CBR, as most are) to register their traffic to both ends of the access link. This API would just reliably replace ugly hacks like guessing about "well-known" UDP ports or tedious manual configurations. Let's assume a network overprovisioned at the core, where the bottleneck is the access link for a significant number of nodes (_significant_, not even "majority"). This looks a lot like the current Internet to me. Looking at current technology trends, this looks like it's gonna stay like this for long. OK, maybe some revolutions in transmissions and economics we can't envision today would make the assumptions above wrong in the end. But in the end, we are all dead anyway. For nodes whose access link is not the bottleneck, this does not apply, and they have to solve this latency issue by some other means, assuming they want to solve it. That's all. Simple. The implementation looks simple. The latency-sensitive application regularly sends to both access link halves (up- and down- stream) some way to identify its packets (for instance: dst UDP port 27015 belongs to higher class). The access link implements strict priority for those latency-sensitive packets. Elastic traffic takes the rest. Only two traffic classes, which can be implemented cheaply by a DSLAM and by a consumer device. No complex configurations. For those customers who only have a poor USB DSL modem, this could be implemented in the PC itself. Since it's local it's scalable. No need to perform QoS at lightning speed, the load is spread across numerous network ends, etc. Since it's local it's incremental. It's incremental in the sense you can deploy it for one customer and not the other without any issue. It's incremental in the sense some ISP can start offering it without caring about the other ISPs.
It's incremental in the sense you can deploy it for some applications and not the others _on the same access link_. Legacy applications just get the lower class. It's incremental in the sense you can deploy it first for the upstream access link (the biggest issue today because of the "A" in ADSL) before the downstream link. It's also incremental in the sense you can make it peacefully co-exist with a more primitive and less reliable "guess well-known UDP ports" approach. It's incremental in the sense that, once started, applications will have a strong incentive to move to this API. What about the user registering too much traffic in the upper priority class? Well, it fails. Not worse than today. Most internet users now know how to solve this congestion issue (observed immediately): they shut some applications down. No computations, the simple try and fix approach known today, only better. The only added complexity is the two classes. Since most elastic applications report the currently used throughput, users would not have a hard time understanding that shutting down an application that is left with zero kb/s will not solve their congestion issue in this case. Since it's local I hardly see any security issue. Well you can imagine some rogue application running in your home and stealing bandwidth, but then I would say you have a much bigger issue anyway. From the point of view of the end to end argument, you can think of it as extending the definition of "the end" to include the access link. Is this too heretical? IMHO there have been much worse deviations from The Argument in network history (firewalls anyone?). Do you think it could have any economic viability? I think that if just one ISP and one CBR killer app (Skype, a game, whatever) would start to package it then it would sell. "No more lag thanks to our brand new low-ping advanced technology. Now you can download and play at the same time".
You can even give power users an advanced link access controller allowing them to prioritize most legacy applications and widening the market potential, attracting all geeks. Any issues I missed ? There must be some. This looks too good to be true :-) Thanks a lot in advance for your comments. -- So einfach wie möglich. Aber nicht einfacher -- Albert Einstein From s.malik at tuhh.de Tue Apr 5 05:34:46 2005 From: s.malik at tuhh.de (Sireen Habib Malik) Date: Tue, 05 Apr 2005 14:34:46 +0200 Subject: [e2e] very simple IP QoS for the bottleneck access link ? (was Skype and congestion collapse.) In-Reply-To: References: <11ad0fa8050304053342514f51@mail.gmail.com> <200503041318.37290.don@dhoffman.net> <4228E595.9030407@dirtcheapemail.com> <9531abdc241f450e15fa92b84fe74310@extremenetworks.com> Message-ID: <425285E6.6020905@tuhh.de> Marc Herbert wrote: >The implementation looks simple. The latency-sensitive application >regularly sends to both access link halves (up- and down- stream) some >way to identify their packets (for instance: dst UDP port 27015 >belongs to higher class). The access link implement strict priority >for those latency-sensitive packets. Elastic traffic takes the rest. >Only two traffic classes, can be implemented cheaply by a DSLAM and by >a consumer device. No complex configurations. For those customers who >only have a poor USB DSL modem, this could be implemented in the PC >itself. > > > Here is my understanding of how it is done today. End node marks Layer-2 CoS and/or Layer-3 DSCP fields of the IP/UDP/RTP/Voice packet. Voice traffic is given the top priority and is sent into a Priority Queue (PQ). The low priority queue could be RED or Weighted-RED, WFQ, etc. In order for this to work, the end must be in the "trust" region, i.e., CoS/DSCP fields should not be reset by the downstream routers/switches in the path. The presence of the other, lower priority, queue adds to the "variations" of the departing voice traffic from the PQ.
Studies have shown that packet delay for this type of queue can be well bounded with M/D/1 delay + residual time of the lower priority packets. It is to be noted that if an MPLS type of tunnel is used for "voice only" then delay is modeled with SUM(D) /D/1 type of system which has significantly lower mean packet delay. So there are trade-offs. Please note, QoS for VoIP is an "Mouth-To-Ear" issue so many other factors get involved. -- SM >Since it's local it's scalable. No need to perform QoS at lightning >speed, the load is spreaded to numerous network ends, etc. > >Since it's local it's incremental. It's incremental in the sense you >can deploy it for one customer and not the other without any issue. >It's incremental in the sense some ISP can start offering it without >caring about the others ISP. It's incremental in the sense you can >deploy it for some applications and not the others _on the same access >link_. Legacy applications just get the lower class. It's incremental >in the sense you can deploy it first for the upstream access link (the >biggest issue today because of the "A" in ADSL) before the downstream >link. > >It's also incremental in the sense you can make it peacefully co-exist >with a more primitive and less reliable "guess well-known UDP ports" >approach. It's incremental in the sense that, once started, >applications will have a strong incentive to move to this API. > >What about the user registering too much traffic in the upper priority >class? Well, it fails. Not worst than today. Most internet users now >know how to solve this congestion issue (observed immediately): they >shut some applications down. No computations, the simple try and fix >approach known today, only better. The only added complexity is the >two classes. 
Since most elastic applications report the currently used >throughput, users would not have a hard time understanding that >shutting down an application that is left with zero kb/s will not >solve their congestion issue in this case. > >Since it's local I hardly see any security issue. Well you can imagine >some rogue application running in your home and stealing bandwith, but >then I would say you have a much bigger issue anyway. > >>From the point of view of the end to end argument, you can think of it >as the definition of "the end" has been extended to include the access >link. Is this too much heretical? IMHO there have been much worst >deviances from The Argument in network history (firewalls anyone?). > >Do you think it could have any economical viability? I think that if >just one ISP and one CBR killer app (Skype, a game, whatever) would >start to package it then it would sell. "No more lag thanks to our >brand new low-ping advanced technology. Now you can download and play >at the same time". You can even give to power users an advanced link >access controller allowing them to prioritize most legacy applications >and widening the market potential, attracting all geeks. > >Any issues I missed ? There must be some. This looks too good to >be true :-) Thanks a lot in advance for your comments. > > > > > -- Sireen Malik, M.Sc. PhD. Candidate, Communication Networks Hamburg University of Technology, FSP 4-06 (room 3008) Denickestr. 17 21073 Hamburg, Deutschland Tel: +49 (40) 42-878-3387 Fax: +49 (40) 42-878-2941 E-Mail: s.malik at tuhh.de --Everything should be as simple as possible, but no simpler (Albert Einstein) From marc.herbert at free.fr Tue Apr 5 08:46:39 2005 From: marc.herbert at free.fr (Marc Herbert) Date: Tue, 5 Apr 2005 17:46:39 +0200 (CEST) Subject: [e2e] very simple IP QoS for the bottleneck access link ? 
In-Reply-To: <425285E6.6020905@tuhh.de> References: <11ad0fa8050304053342514f51@mail.gmail.com> <200503041318.37290.don@dhoffman.net> <4228E595.9030407@dirtcheapemail.com> <9531abdc241f450e15fa92b84fe74310@extremenetworks.com> <425285E6.6020905@tuhh.de> Message-ID: On Tue, 5 Apr 2005, Sireen Habib Malik wrote: > Marc Herbert wrote: > > >The implementation looks simple. The latency-sensitive application > >regularly sends to both access link halves (up- and down- stream) some > >way to identify their packets (for instance: dst UDP port 27015 > >belongs to higher class). The access link implement strict priority > >for those latency-sensitive packets. Elastic traffic takes the rest. > >Only two traffic classes, can be implemented cheaply by a DSLAM and by > >a consumer device. No complex configurations. For those customers who > >only have a poor USB DSL modem, this could be implemented in the PC > >itself. > Here is my understanding of how it is done today. End node marks Layer-2 > CoS and/or Layer-3 DSCP fields of the IP/UDP/RTP/Voice packet. > Voice traffic is given the top priority and is sent into a Priority > Queue (PPQ). The low priority queue could be RED or Weighted-RED, WFQ, etc. > > In order for this to work, the end must be in the "trust" region i.e > CoS/DSCP fields should not be reset by the downstream routers/switches > in the path. > It is to be noted that if an MPLS type of tunnel is used... Now I am not sure I made myself clear... I am talking about a very simple solution _local_, _private_ to the access link, and to solve only the bottleneck issue at the access link. You get what you paid for. But it could still be very interesting IMHO. So no tunnels, no routers involved at all. Concerning VoIP for instance, each end would have to implement this trick on its own access link _independently_ from the other end. If only one end does, well only this end can abuse its access link with P2P traffic while phoning simultaneously. 
The other end has to stop eMule as usual. -- So einfach wie möglich. Aber nicht einfacher -- Albert Einstein From cannara at attglobal.net Tue Apr 5 09:31:01 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 05 Apr 2005 09:31:01 -0700 Subject: [e2e] UDP checksum field? References: <002201c5393e$3b629840$6e8944c6@telemuse.net> Message-ID: <4252BD45.572A0A55@attglobal.net> Or, as Steve Ballmer, Prince of OS/2 LanManager, King of Faulty Releases, would glare: "WAD, so stifle". (WAD = works as designed) :] Alex Lynne Jolitz wrote: > > (With no apologies to Microsoft...) - If the Oracle tech guy had gone to the Microsoft Research school of obfuscation, he would have said "The probability of this event occurring such that the reliability of the underlying link layer is impaired by an improbably low memory bit error at ten to the minus 12 excluding thermal radiative factors and charge displacement is so low as to be impossible, hence the question is irrelevant". :-) > Lynne Jolitz > > ---- > We use SpamQuiz. > If your ISP didn't make the grade try http://lynne.telemuse.net > > > -----Original Message----- > > From: end2end-interest-bounces at postel.org > > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Cannara > > Sent: Monday, April 04, 2005 10:03 AM > > To: end2end-interest at postel.org > > Subject: Re: [e2e] UDP checksum field? > > > > > > I'll add a funny (if you're not using Oracle TNS gateways) SQL transport > > example that still exists today, despite being pointed out to > > Oracle about a > > decade ago. When Network General was adding more SQL decodes to the > > Sniffer(r), in the '90s, we had a presentation on the Oracle > > transport (TNS) > > underlying SQL Net traffic. TNS rode on Netware SPP, or TCP, > > etc. The fellow > > went into packet fields in detail and explained how Oracle also > > made gateway > > software available for Sun boxes to go from an Oracle system to > > an IBM SNA db > > system.
The gateway received SQL on TNS on TCP on IP on Ethernet (for > > instance) and spit out SQL on TNS or whatever IBM wanted. > > > > As he expounded on TNS pkt fields, a few hands went up -- "What's > > the checksum > > field for if it's always 0?" asked a few experienced network folks. The > > presenter turned back to the slide show and said: "It's unimplemented for > > now". Without malice, another question was posed: "Well if it's > > unused and > > your gateway has bad memory, how do you know the data going into > > the db on the > > other side will be good?" The presenter, a highly lauded Oracle > > techy, looked > > at the screen for a bit, looked back at the audience, shuffled his feet, > > looked again at the screen, and finally said words like: "I > > don't know". > > > > After the presentation, a letter was written to Oracle, copied to Ellison, > > explaining exactly the problem and urging the TNS checksum be > > implemented. No > > response ever came back, and, if you look at a TNS packet today, > > the checksum > > is still zero. I guess no one has used the gateway software who > > cares about > > their data. :] > > > > Alex > > > > PS Note that "gateway" here is used in the proper sense, not for "router". > > > > Lynne Jolitz wrote: > > > > > > Yes, Lloyd is exactly right here. It is often the case that > > people turn off UDP checksums to "buy" more performance by > > relying on the CRC of the ethernet packet. It's not a stupid > > question - it's a very smart question, and a lot of smart people > > get fooled by this. > > > > > > For example, the Sun datacenter back in the early 1990's had an > > NFS cluster project called Sunbox - an array of workstation CPUs > > that did divide and conquer to build a massive file server. It > > used an ethernet multiplexer to dynamically split the load. To > > buy back performance, they turned off the UDP checksum. 
It worked > > fine until they had a bad lot of ethernet boards with substandard > > memories - this wasn't picked up in tests because the test units > > were doing resends of the occasionally corrupted packets (UDP > > checksums usually was turned on), and in TCP the checksums would > > do resends as well. It was also a fairly rare problem, and the > > test periods were too short to pick up on the nature of this > > problem easily. > > > > > > But when UDP checksums were turned off in normal use, the > > resulting NFS requests were corrupting the filesystem (which in > > this case were database files), forcing rebuilds and manual > > repairs of database tables. > > > > > > As they were about to announce and release it, they suddenly > > discovered this problem - they noticed the corruption and in > > order to determine whether it was in the high level (stack or > > above) or lower levels, they turned on checksums and it worked > > immediately. > > > > > > They then examined the failed checksum packets to traceback in > > the lower level stack-down through the link layer to discover > > where the corruption occured. With logic analyzers, they were > > able to observe the contents going into memory from the NIC on > > reception was different than the contents going out of the memory > > and traveling across the bus to the processor. > > > > > > This is a surprisingly common problem in datacenters - > > sometimes the problem would be a switch, sometimes a > > configuration error, sometimes a programming error in the > > application, and so forth. I most recently experienced this > > problem with an overheated ethernet switch passing video on an > > internal network. > > > > > > I also ran into this at an Internet portal company where I was > > a manager. We were using NetApps file servers to mirror the daily > > information - NetApps at the time encouraged staff to turn off > > checksums to increase performance. 
The DBAs noticed problems and > > ended up doing frequent rebuilds, but couldn't figure out why. It > > took me a lot of time to convince my staff to turn on the > > checksums because they were told "they don't have to" by NetApps. > > Most datacenter staff work by cookbook, and this wasn't in the > > cookbook. When they finally tried it, it worked. This little > > problem cost us a lot of time and aggravation for very little (if > > any) performance gain. > > > > > > Performance gain by turning off checksums now can be obviated > > through the use of intelligent NIC technologies like SiliconTCP > > (http://jolitz.telemuse.net/pubs/pt2001_01/item) and TOE that > > calculate the checksum as the packet is being received. But we > > don't have this in commodity switches yet, so check that switch > > if you're having problems. > > > > > > Higher level checksums are worth it every time. Don't leave the > > server without them. :-) > > > > > > Lynne Jolitz. > > > > > > ---- > > > We use SpamQuiz. > > > If your ISP didn't make the grade try http://lynne.telemuse.net > > > > > > > -----Original Message----- > > > > From: end2end-interest-bounces at postel.org > > > > [mailto:end2end-interest-bounces at postel.org]On Behalf Of Lloyd Wood > > > > Sent: Monday, April 04, 2005 2:48 AM > > > > To: Faisal Aslam > > > > Cc: end2end-interest at postel.org > > > > Subject: Re: [e2e] UDP checksum field? > > > > > > > > > > > > On Sun, 3 Apr 2005, Faisal Aslam wrote: > > > > > > > > > Why we have checksum field is in UDP header, as UDP does not provide > > > > > data retransmission etc? I think it is used only to silently > > > > > discarding a packet with wrong checksum (thats it?). > > > > > > > > yes - you need an end-to-end check against a corrupted packet. UDP > > > > could have the checksum turned off, which proved disastrous for a > > > > number of applications, subtly corrupted filing systems which didn't > > > > have higher-level end2end checks etc. 
> > > > > > > > > Is there any other application of checksum field? > > > > > > > > For other applications > > > > http://www.faqs.org/rfcs/rfc3828.html > > > > > > > > UDP Lite originally sprang out of the observation that UDP has > > > > redundant length information, and that this information could be > > > > combined with the checksum (as in TCP/UDP) to give partial coverage. > > > > > > > > L. > > > > > > > > > > > > > > Sorry if the question is too naive. > > > > > > > > > > Thanks > > > > > Faisal > > From cannara at attglobal.net Tue Apr 5 09:32:23 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 05 Apr 2005 09:32:23 -0700 Subject: [e2e] UDP checksum field? References: <200504041733.KAA26987@gra.isi.edu> Message-ID: <4252BD97.7E66A28E@attglobal.net> When your Social Security check is off by a binary point, Bob, someone we all know will care. {:o] Alex Bob Braden wrote: > > > *> explaining exactly the problem and urging the TNS checksum be implemented. No > *> response ever came back, and, if you look at a TNS packet today, the checksum > *> is still zero. I guess no one has used the gateway software who cares about > *> their data. :] > *> > *> Alex > *> > > Or, the incidence of (detected) failures is so low that no one cares. > > Bob Braden From cannara at attglobal.net Tue Apr 5 10:06:43 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 05 Apr 2005 10:06:43 -0700 Subject: [e2e] UDP checksum field? References: <200504041733.KAA26987@gra.isi.edu> Message-ID: <4252C5A3.24AE19D7@attglobal.net> Note that many manufacturers of bridges & routers over the years have had the intelligence to include error-detection & correction in memory. However, when the marketing decisions are made about test and default configuration, that feature is usually turned off, so performance will be better. Check your system manuals for those options! One of my personal experiences with this mistake was at a major Wall St.
investment house, where their Sun jockeys wrote trading programs that the firm obviously depended on to make $ every second of every day in every market for every commodity around the world. They called us at Net Gen because their programs were changing unpredictably and they thought "it's the network" (the usual guess). So, flew to NYC with a Sniffer(r) and discussed the problem: "m" was changing to "n", "C" to "D", "6" to "7" every once in a while in their sources, so compilations would fail despite no changes by the programmers. I told them a Sniffer won't be able to see changing source files on the net, so we sat down to draw exactly where the bodies were buried in their systems. The short story was, debug the server that holds the sources. Since they had huge disc & RAM in the server, and programs were written to disc but often sat in cache RAM for a while (even days), we decided to test disc, but especially RAM. No tests showed anything. Then one of their network guys (a VP, because banks always have only VPs access data :) said he'd heard of a special, extremely rough pattern test. He downloaded it, ran it, and sure enough one small group of bits in one RAM chip was a little flakey. If EDC RAM had been used, it would not have been an issue. Hey, it wasn't the network, but it was end-end! Alex Lloyd Wood wrote: > > On Mon, 4 Apr 2005, Bob Braden wrote: > > > *> explaining exactly the problem and urging the TNS checksum be implemented. No > > *> response ever came back, and, if you look at a TNS packet today, the checksum > > *> is still zero. I guess no one has used the gateway software who cares about > > *> their data. :] > > *> > > *> Alex > > *> > > > > Or, the incidence of (detected) failures is so low that no one cares. > > This is arguably currently the state with RAM. If you write to a > memory subsystem, you would like some confidence that when you read it > back the value is correct. This is often assumed. 
> > You can write a paranoid application to write to memory locations > multiple times (and those sticking computers in orbit do), read back > and compare and check all of memory for reliability periodically, but > having a checksum on each memory location can be a better safeguard, > though it decreases memory density somewhat. > > There's been much furore of late about 'bad RAM' in Apple Macintoshes; > many computers have moved to ECC RAM, but Apple (bar its > commercially-focused XServe) has not. (A decade ago, people were > grumbling about Apple not using parity RAM.) > > The end-to-end argument remains as valid inside the computer too. > > L. From cannara at attglobal.net Tue Apr 5 15:18:35 2005 From: cannara at attglobal.net (Cannara) Date: Tue, 05 Apr 2005 15:18:35 -0700 Subject: [e2e] UDP checksum field? References: <4251AF7E.9050002@reed.com> Message-ID: <42530EBB.C90343E5@attglobal.net> Of course, David, but the opposite is: no checksum = no chance of correctness. And, the way NAT and other boxes have been intended and deployed, many people consider them as "ends", making the mythical End-End Principle even more of a fantasy. Alex "David P. Reed" wrote: > > When all is said and done, the UDP checksum isn't, and never was, fully > end-to-end protection, since there are few, if any, applications where > the correctness of the application data can be *fully assured* by making > sure that a single datagram gets delivered correctly. It's an optional > standardized way to help deal with a common risk that can arise due to > bugs and other issues that show up in engineered systems, not a > guarantee of any particular property.
> > Since UDP datagrams can still be duplicated and modified by a > checksum-preserving modification in the network (such modifications are > now common, given middleboxes that discard the checksum and compute a > new one in many cases), there is no way to assure by a mere checksum > field that data has not been corrupted somewhere in the network. > Assurance is not the benefit, applications still need to do truly > end-to-end checking - UDP's ability to help in detecting incipient > problems is very useful, however. > > I won't elaborate here on the more subtle issues of TCP's lack of true > end-to-end reliability. Suffice it to say that there is a difficult > issue in a definition of reliability that must depend on the difference > between "design errors" and "random errors". From eblanton at cs.ohiou.edu Tue Apr 5 16:48:36 2005 From: eblanton at cs.ohiou.edu (Ethan Blanton) Date: Tue, 5 Apr 2005 18:48:36 -0500 Subject: [e2e] UDP checksum field? In-Reply-To: <42530EBB.C90343E5@attglobal.net> References: <4251AF7E.9050002@reed.com> <42530EBB.C90343E5@attglobal.net> Message-ID: <20050405234836.GJ32194@colt.internal> Cannara spake unto us the following wisdom: > Of course, David, but the opposite is: no checksum = no chance of > correctness. And, the way NAT and other boxes have been intended and > deployed, many people consider them as "ends", making the mythical End-End > Principle even more of a fantasy. I'm not sure exactly what you're trying to say here (I seldom am), but I think it misses a very important point. There are in fact a _very_ large number of applications which obey the end-to-end principle extensively. Take as an example class of such applications all SSL or TLS streams over TCP. If [heh] you have a particular axe to grind, you can probably come up with some little semantic corner where this is not end-to-end in every respect, but it will be just that -- a semantic little corner.
SSL over TCP performs end-to-end flow control, end-to-end congestion control, weak end-to-end integrity checking at the transport layer, and extremely robust end-to-end integrity checking (possibly as well as authentication) at the application layer. Note that, in this example, each layer of the stack provides the largest reasonable set of guarantees it can provide, and the ultimate "end-to-end" integrity and authentication checks are performed at the _true_ ends of the connection -- the application. I realize this message is probably futile, but I hope it will end the bickering over semantics in this particular thread, and provide some food for thought for future such threads. No, the end-to-end principle isn't practiced everywhere, but it is far from a "fantasy". And yes, I'm sure Ma Bell provided perfect end-to-end service via POTS in 1908 and the Internet is so far behind we might as well not even bother talking about it, no need to tell me that. Since I use the Internet every day (and, miraculously, it works), I'll leave mailing-list theories about how it can't possibly work on the shelf for now. Ethan -- The laws that forbid the carrying of arms are laws [that have no remedy for evils]. They disarm only those who are neither inclined nor determined to commit crimes. -- Cesare Beccaria, "On Crimes and Punishments", 1764 From perfgeek at mac.com Tue Apr 5 18:48:01 2005 From: perfgeek at mac.com (rick jones) Date: Tue, 5 Apr 2005 18:48:01 -0700 Subject: [e2e] UDP checksum field?
In-Reply-To: <20050405234836.GJ32194@colt.internal> References: <4251AF7E.9050002@reed.com> <42530EBB.C90343E5@attglobal.net> <20050405234836.GJ32194@colt.internal> Message-ID: > If [heh] you have a particular axe to grind, you can probably come up > with some little semantic corner where this is not end-to-end in every > respect, but it will be just that -- a semantic little corner. SSL over > TCP performs end-to-end flow control, end-to-end congestion control, > weak end-to-end integrity checking at the transport layer, and extremely > robust end-to-end integrity checking (possibly as well as authentication) > at the application layer. Note that, in this example, each layer > of the stack provides the largest reasonable set of guarantees it can > provide, and the ultimate "end-to-end" integrity and authentication > checks are performed at the _true_ ends of the connection -- the > application. Would that semantic corner include SSL offload NICs like Britestream, and/or SSL offload boxes/blades we see advertised from time to time?-) rick jones there is no rest for the wicked, yet the virtuous have no pillows From cannara at attglobal.net Wed Apr 6 20:20:10 2005 From: cannara at attglobal.net (Cannara) Date: Wed, 06 Apr 2005 20:20:10 -0700 Subject: [e2e] UDP checksum field? References: <4251AF7E.9050002@reed.com> <42530EBB.C90343E5@attglobal.net> <20050405234836.GJ32194@colt.internal> Message-ID: <4254A6EA.9825DE58@attglobal.net> Well, long Erudite responses are always welcome Ethan, but rather than Beccaria, even I, as an Italian American, actually prefer Mao: "All political power stems from the barrel of a gun". :] Alex Ethan Blanton wrote: > > Cannara spake unto us the following wisdom: > > Of course, David, but the opposite is: no checksum = no chance of > > correctness.
And, the way NAT and other boxes have been intended and > > deployed, many people consider them as "ends", making the mythical End-End > > Principle even more of a fantasy. > > I'm not sure exactly what you're trying to say here (I seldom am), but I > think it misses a very important point. There are in fact a _very_ > large number of applictions which obey the end-to-end principle exten- > sively. Take as an example class of such applications all SSL or TLS > streams over TCP. > > If [heh] you have a particular axe to grind, you can probably come up > with some little semantic corner where this is not end-to-end in every > respect, but it will be just that -- a semantic little corner. SSL over > TCP performs end-to-end flow control, end-to-end congestion control, > weak end-to-end integrity checking at the transport layer, and extremely > robust end-to-end integrity checking (possibly as well as authentica- > tion) at the application layer. Note that, in this example, each layer > of the stack provides the largest reasonable set of guarantees it can > provide, and the ultimate "end-to-end" integrity and authentication > checks are performed at the _true_ ends of the connection -- the appli- > cation. > > I realize this message is probably futile, but I hope it will end the > bickering over semantics in this particular thread, and provide some > food for thought for future such threads. No, the end-to-end principle > isn't practiced everywhere, but it is far from a "fantasy". And yes, I'm > sure Ma Bell provided perfect end-to-end service via POTS in 1908 and > the Internet is so far behind we might as well not even bother talking > about it, no need to tell me that. Since I use the Internet every day > (and, miraculously, it works), I'll leave mailing-list theories about > how it can't possibly work on the shelf for now. > > Ethan > > -- > The laws that forbid the carrying of arms are laws [that have no remedy > for evils]. 
They disarm only those who are neither inclined nor > determined to commit crimes. > -- Cesare Beccaria, "On Crimes and Punishments", 1764 > > ------------------------------------------------------------------------------ > Part 1.2Type: application/pgp-signature From Farooq.Bari at cingular.com Thu Apr 7 00:03:37 2005 From: Farooq.Bari at cingular.com (Bari, Farooq) Date: Thu, 7 Apr 2005 00:03:37 -0700 Subject: [e2e] e2e QoS Message-ID: This maybe an old topic but with recent drive for network convergence this topic seems to be popular again. There are several and seemingly overlapping efforts by the industry on it. What do folks on this forum think of on path mechanisms and off path mechanisms for e2e QoS. Farooq From rony3000us at hotmail.com Thu Apr 7 01:35:31 2005 From: rony3000us at hotmail.com (Syed Faisal Hasan) Date: Thu, 07 Apr 2005 08:35:31 +0000 Subject: [e2e] e2e QoS In-Reply-To: Message-ID: Farooq, perhaps you can have a look at the "Revisiting IP QoS: why do we care, what have we learned? ACM SIGCOMM 2003 RIPQOS workshop report ", which can be found at "http://portal.acm.org/citation.cfm?id=963995". Faisal >From: "Bari, Farooq" >To: >Subject: [e2e] e2e QoS >Date: Thu, 7 Apr 2005 00:03:37 -0700 > > >This maybe an old topic but with recent drive for network convergence >this topic seems to be popular again. There are several and seemingly >overlapping efforts by the industry on it. What do folks on this forum >think of on path mechanisms and off path mechanisms for e2e QoS. > >Farooq > > _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! 
http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ From nspring at cs.umd.edu Tue Apr 5 10:18:42 2005 From: nspring at cs.umd.edu (Neil Spring) Date: Tue, 5 Apr 2005 13:18:42 -0400 Subject: [e2e] CFP: HotNets-IV Message-ID: <16ee210f82566d5bea3c599845843c0a@cs.umd.edu> CALL FOR PAPERS Fourth Workshop on Hot Topics in Networks HotNets-IV http://www.acm.org/sigs/sigcomm/HotNets-IV November 14-15, 2005 College Park, MD USA The Fourth Workshop on Hot Topics in Networks, HotNets-IV, will bring together researchers in the networking and distributed systems community to debate emerging research directions. The goal of the workshop is to promote community-wide discussion of ideas that will influence and foster continued research in the field. The workshop will provide a venue for researchers to present new ideas that have the potential to significantly impact the community in the long term, especially those that are architectural or design-oriented in nature. Each potential participant should submit a short paper describing such an idea; the paper could, for example, expose a new problem, advocate a new solution, or debunk existing work. Attendance is limited to around 60 participants, by invitation based primarily on paper submissions. HotNets-IV is sponsored by ACM SIGCOMM. We encourage submissions across the broad range of networking and distributed systems research, not limited to those topics covered by the SIGCOMM conference. Submissions may be on topics traditionally published at SIGCOMM, NSDI, SOSP/OSDI, SenSys, or MobiCom, or they may be on topics that have yet to find a home in an established conference. 
Topics of interest include, but are by no means limited to: * Internet and non-Internet architectures, past, present, and future * Overlay, peer-to-peer, and programmable network infrastructures * Sensor networks, storage area networks, and other examples of "extreme" networking * Wireless networks, mobility, and pervasive computing * Network failures, vulnerabilities, and exploits: detection, analysis and defenses * Network management and control * Novel distributed applications and services, including systems for content distribution and real-time media * Lessons drawn from failed research, and controversial or disruptive topics * Architectural insights or understanding of network behaviors The selection of HotNets papers will be based primarily on their potential to influence future research. This influence can be exercised in many ways, exemplified by but not limited to the following: * Describing a novel approach to an old problem that promises to influence future research * Describing a new problem that requires our attention * Articulating a new perspective about networking and distributed systems * Debunking an old perspective about networking and distributed systems Copies of the accepted papers will be made publicly available via the Web prior to the workshop. Proceedings will be distributed at the workshop and will be made available through ACM's digital library. Examples of papers from past HotNets workshops can be found at: http://www.acm.org/sigs/sigcomm/hotnets. The Program Committee will write short New York Times Book Review-style reviews of accepted papers, for inclusion in the proceedings, to provide the broader community with an additional perspective on future directions in the field. Unlike other workshops and conferences, rejected papers will only receive a very short review. The acceptance of a paper to the HotNets workshop does not preclude the later acceptance of a related paper to the ACM Sigcomm 2006 conference. 
However, any derived Sigcomm submission must provide a significantly more in-depth treatment of the idea, for example, by providing a more complete evaluation. Assuming that there is sufficient new material in a Sigcomm submission, the existence of a prior publication at HotNets will be ignored during the evaluation for acceptance to Sigcomm. Further details about this policy and its application to other conferences will be posted on the HotNets IV Web page (http://www.acm.org/sigs/sigcomm/HotNets-IV). Submission Instructions Submitted papers must be no longer than 6 pages (10 pt font, 1 inch margins). The review process is not blind, each contributing author should be included on the first page. Only electronic submissions in PostScript or PDF will be accepted. Submissions must be written in English, render without error using standard tools (Ghostview or Acrobat Reader) and print on US-Letter sized paper. Following standard academic practice, HotNets requests that its reviewers hold submitted papers in confidence. Only accepted papers will be published in conference proceedings. 
Submission information will be posted at: http://www.acm.org/sigs/sigcomm/HotNets-IV Important Dates Submissions due: 1 August 2005 (11:59PM Eastern Daylight Time) Notification of Acceptance: 10 October 2005 Camera-ready copy due: 31 October 2005 Workshop: 14-15 November 2005 Organizers General Chair: * Neil Spring (UMD) Program Committee: * Jon Crowcroft (Cambridge) (Co-chair) * Srinivasan Seshan (CMU) (Co-chair) * Bengt Ahlgren (SICS) * Paul Barford (UWisc) * John Byers (BU) * Deborah Estrin (UCLA) * Tim Griffin (Cambridge) * Venkata Padmanabhan (Microsoft Research) * Jen Rexford (Princeton) * Ion Stoica (UCB) From braden at ISI.EDU Fri Apr 8 12:28:08 2005 From: braden at ISI.EDU (Bob Braden) Date: Fri, 8 Apr 2005 12:28:08 -0700 (PDT) Subject: [e2e] CFP: First IEEE ICNP Workshop on Secure Network Protocols (NPSec) Message-ID: <200504081928.MAA28433@gra.isi.edu> CALL FOR PAPERS First IEEE ICNP Workshop on Secure Network Protocols (NPSec) Boston, Massachusetts, USA November 6, 2005 http://www.cerias.purdue.edu/npsec/ (In conjunction with ICNP 2005: The 13th IEEE International Conference on Network Protocols) SCOPE: The first IEEE ICNP workshop on Secure Network Protocols (NPSec) is a one-day event held in conjunction with ICNP 2005. NPSec focuses on two general areas. The first focus is on the development and analysis of secure or hardened protocols for the operation (establishment and maintenance) of network infrastructure, including such targets as secure multidomain, ad-hoc, sensor or overlay networks, or other related target areas. This can include new protocols, enhancements to existing protocols, protocol analysis, and new attacks on existing protocols. The second focus is on employing such secure network protocols to create or enhance network applications. Examples include collaborative firewalls, incentive strategies for multiparty networks, and deployment strategies to enable secure applications. 
TOPICS OF INTEREST: * secure or hardened protocols for operation of networks including (but not limited to): - internetworking, e.g., BGP, DNS - MANETs - LANs and WLANs - cellular data networks - p2p and other overlay networks - federated trust systems - sensor networks * vulnerability analysis of existing protocols and applications (both theoretical and case studies), including novel attacks * key distribution * collaborative intrusion detection and response, such as collaborative firewalling * incentive systems for multiparty networks, such as for p2p and MANET routing * protocol configuration and deployment strategies enabling secure applications, e.g., e-commerce IMPORTANT DATES: Paper submission: June 3, 2005 Notification of acceptance: July 15, 2005 Camera ready version: August 5, 2005 ORGANIZING COMMITTEE: General Chair: Sonia Fahmy, Purdue University Technical Program Committee Chairs: George Kesidis, Pennsylvania State University Nicholas Weaver, International Computer Science Institute Publicity Chair: James Minseok Kwon, Rochester Institute of Technology Web Chair: Cristina Nita-Rotaru, Purdue University TECHNICAL PROGRAM COMMITTEE: Ehab Al-Shaer, DePaul University David Brumley, Carnegie Mellon University Guohong Cao, Pennsylvania State University Joseph Evans, U.S. National Science Foundation Lixin Gao, University of Massachusetts, Amherst Carl A. 
Gunter, University of Illinois at Urbana-Champaign George Kesidis, Pennsylvania State University Edward Knightly, Rice University Iordanis Koutsopoulos, University of Thessaly Carl Landwehr, University of Maryland Marco Ajmone Marsan, Politecnico di Torino, Italy Douglas Maughan, Department of Homeland Security Patrick McDaniel, Pennsylvania State University Jelena Mirkovic, University of Delaware Peng Ning, North Carolina State University Cristina Nita-Rotaru, Purdue University Phil Porras, SRI Saswati Sarkar, University of Pennsylvania Lakshminarayanan Subramanian, University of California at Berkeley Nina Taft, Intel Research Nicholas Weaver, International Computer Science Institute Felix Wu, University of California at Davis Jun Xu, Georgia Institute of Technology Bulent Yener, Rensselaer Polytechnic Institute SUBMISSION GUIDELINES: Submissions must be in electronic form, as Postscript or PDF documents. Papers can be up to 6 two-column pages, and can convey work-in-progress that is not completely mature but shows promise. For more information, please see: http://www.cerias.purdue.edu/npsec/ ----- End Included Message ----- From pb at cs.wisc.edu Tue Apr 12 14:28:05 2005 From: pb at cs.wisc.edu (Paul Barford) Date: Tue, 12 Apr 2005 16:28:05 -0500 (CDT) Subject: [e2e] Wisconsin network research lab now openly available Message-ID: All, It is our pleasure to announce the availability of the Wisconsin Advance Internet Laboratory (WAIL) for open use by the network research community. With support from our partners at Cisco, Intel, University of Utah, NSF, and Internet2 we have extended the Emulab user interface to enable remote access and use of 80 PC's and 34 IP routers (see list below). The remote interface - called Schooner - enables users to connect PC's to fixed configurations of routers (or in PC-only configurations like traditional Emulab) thereby creating testbeds suitable for a range of experiments. 
Like Emulab, the PC's come with a basic set of tools and can be modified by users with their own experimental code. At present, we offer a library of fixed router configurations consisting principally of simple topologies such as dumbbells. We can offer limited support in terms of creating customized topologies and are in the process of expanding the topology library to make all systems generally available. Schooner has documentation which should enable users to get up and running with basic configurations, but we emphasize that the environment is a work in progress. We look forward to supporting projects to the extent that our resources allow and hope you will find this environment useful in your work. Please feel free to access the lab via: http://www.schooner.wail.wisc.edu

Best,
Paul Barford - director
Chris Alfeld
Ana Bizarro
Dave Plonka

Current WAIL Equipment List (new equipment is added on a regular basis - if there is something you need, let us know - we may have it):

80 PC's: Intel 2GHz Pentium 4, 1GB RAM, Intel 1Gbps NIC
6 Cisco GSR 12000: OC48, OC12, OC3, Gig, FE interfaces
4 Cisco 7500: OC3, GE, FE, Serial interfaces
10 Cisco 7300: GE interfaces
5 Cisco 7200: GE, FE, OC3, Serial interfaces
5 Cisco 3600: FE interfaces
4 Cisco 2600: FE interfaces

From dima at krioukov.net Wed Apr 13 14:38:58 2005 From: dima at krioukov.net (Dmitri Krioukov) Date: Wed, 13 Apr 2005 14:38:58 -0700 Subject: [e2e] E2E research visions In-Reply-To: <20050329203837.1AC9A24D@aland.bbn.com> Message-ID: <000101c54071$38f883a0$2fe2acc0@zurich> interesting text. few questions: in section 6, do you want to say that "local anti-scale" will somehow be a solution to "global scale", or do you simply want to attract our attention to the former *in addition* to the latter? in any case, you don't have a separate section on global scalability: do you think it's no longer an issue today and it won't be one in the future? -- dima.
http://www.caida.org/~dima/ > -----Original Message----- > From: end2end-interest-bounces at postel.org > [mailto:end2end-interest-bounces at postel.org] On Behalf Of Craig Partridge > Sent: Tuesday, March 29, 2005 12:39 PM > To: end2end-interest at postel.org > Subject: [e2e] E2E research visions > > > > Hi folks: > > At the most recent meeting of the End2End Research Group, the group, along > with some attendees, had a discussion of possible research visions that > could inspire innovative communications research over the next ten > years or so. Dave Clark and I, with help from several other participants > in the discussion, have written up the ideas from that discussion and a copy > is available on my website > > http://www.ir.bbn.com/~craig/e2e-vision.pdf > > for anyone who is interested. > > Craig > > E-mail: craig at aland.bbn.com or craig at bbn.com From tolizhi at gmail.com Fri Apr 15 10:09:54 2005 From: tolizhi at gmail.com (Zhi Li) Date: Fri, 15 Apr 2005 10:09:54 -0700 Subject: [e2e] Resilient UDP Message-ID: <7d6098f405041510095c4c73be@mail.gmail.com> Hello, I recently came across a term called "resilient UDP". I couldn't find any related document on the web. Does anyone know how it operates in detail? Or, could you please suggest some references or papers about it? Thanks a lot and have a nice weekend! Regards, Zhi From huitema at windows.microsoft.com Fri Apr 15 16:52:09 2005 From: huitema at windows.microsoft.com (Christian Huitema) Date: Fri, 15 Apr 2005 16:52:09 -0700 Subject: [e2e] Resilient UDP Message-ID: > I recently came across a term called "resilient UDP". > I couldn't find any related document on the web. > Does anyone know how it operates in detail? Or, could you please suggest > some references or papers about it? It is a version of UDP used with military intelligence.
-- Christian Huitema

From rony3000us at hotmail.com Sat Apr 16 03:12:52 2005
From: rony3000us at hotmail.com (Syed Faisal Hasan)
Date: Sat, 16 Apr 2005 10:12:52 +0000
Subject: [e2e] Can TCP's congestion window go beyond receiver's maximum advertised window?
Message-ID:

Dear Folks,

I was trying to do a simple simulation using NS-2.27 and I found something interesting.

The topology is as follows
[n0]---------------[n1]

n0 is running an FTP application on top of TCP. The TCP receiver's maximum advertised window size is set to 20 (that is the default in NS).

The congestion window (cwnd) should never go beyond the receive window (rwnd), right? Then why, in the simulation, does cwnd grow beyond rwnd? cwnd reaches 24, while rwnd is fixed at 20.

If this is a silly question, I'm sorry for asking. But I'd like to have an explanation.

Faisal
#==========================================
#The NS script is below
#===========================================
set ns [new Simulator]

set file1 [open testout.tr w]
$ns trace-all $file1

set file2 [open ./temp/namtest.nam w]
$ns namtrace-all $file2
set windowfile [ open ./temp/WindowFile w]

proc finish {} {
    global ns file1 file2
    $ns flush-trace
    close $file1
    close $file2

    exec nam ./temp/namtest.nam &
    exec xgraph ./temp/WindowFile -geometry 800x600 &
    exit 0
}

set n0 [$ns node]
set n1 [$ns node]
$ns duplex-link $n0 $n1 0.2Mb 500ms DropTail
$ns duplex-link-op $n0 $n1 orient right

set tcp [new Agent/TCP/Sack1]
$ns attach-agent $n0 $tcp
$tcp set window_ 20

set tcpsink [new Agent/TCPSink]
$ns attach-agent $n1 $tcpsink
$ns connect $tcp $tcpsink

set ftp [new Application/FTP]
$ftp attach-agent $tcp

proc getwindow {source file } {
    global ns
    set now [$ns now]
    set time 0.1
    set cwnd [$source set cwnd_]
    puts $file "$now $cwnd"
    $ns at [expr $now+$time] "getwindow $source $file"
}

$ns at 0.1 "getwindow $tcp $windowfile"

$ns at 0.0 "$ftp start"
$ns at 9.0 "$ftp stop"
$ns at 10 "finish"
$ns run
#===================================
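
A minimal sketch for the question above, in plain Python rather than ns-2 OTcl. In standard TCP the amount of data a sender may have outstanding is bounded by the smaller of the congestion window and the receiver's advertised window, so a traced cwnd variable can sit above rwnd without more data being in flight. The function names are illustrative only. The link figures (0.2 Mb/s, 500 ms one-way delay) come from the duplex-link line in the script; the 1000-byte packet size is an assumption, not something the script states.

```python
def usable_window(cwnd: float, rwnd: float) -> float:
    """Packets the sender may have outstanding: min of cwnd and rwnd."""
    return min(cwnd, rwnd)


def bdp_packets(bandwidth_bps: float, one_way_delay_s: float,
                packet_bytes: int) -> float:
    """Bandwidth-delay product in packets, taking RTT = 2 * one-way delay.

    Ignores queueing and transmission time, so it is only a rough guide.
    """
    rtt_s = 2 * one_way_delay_s
    return bandwidth_bps * rtt_s / 8 / packet_bytes


if __name__ == "__main__":
    # Values from the post: cwnd traced at 24, advertised window fixed at 20.
    print(usable_window(cwnd=24, rwnd=20))      # 20
    # Link from the script: 0.2 Mb/s, 500 ms each way; 1000-byte packets assumed.
    print(bdp_packets(200_000, 0.5, 1000))      # 25.0
```

Under these assumptions the pipe holds roughly 25 packets, so a window of 20 would leave the link slightly underused.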
_________________________________________________________________ FREE pop-up blocking with the new MSN Toolbar - get it now! http://toolbar.msn.click-url.com/go/onm00200415ave/direct/01/ From arjuna.sathiaseelan at gmail.com Sat Apr 16 13:16:09 2005 From: arjuna.sathiaseelan at gmail.com (Arjuna Sathiaseelan) Date: Sat, 16 Apr 2005 21:16:09 +0100 Subject: [e2e] end2end-interest Digest, Vol 14, Issue 16 In-Reply-To: References: Message-ID: <1ef2259005041613166d1575a1@mail.gmail.com> Dear Faisal, Even though this question should be directed to the ns-2 list :) - yes the cwnd can grow beyond the rwnd - but the amount of data that is being sent - i.e. the sending window is always the min of the cwnd and the rwnd. So the best way is to set to a window that is equal to the bandwidth delay product - if you want to utilize the link to its fullest. Regds, Arjuna On 4/16/05, end2end-interest-request at postel.org wrote: > Send end2end-interest mailing list submissions to > end2end-interest at postel.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://www.postel.org/mailman/listinfo/end2end-interest > or, via email, send a message with subject or body 'help' to > end2end-interest-request at postel.org > > You can reach the person managing the list at > end2end-interest-owner at postel.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of end2end-interest digest..." > > Today's Topics: > > 1. Re: Resilient UDP (Christian Huitema) > 2. Can TCP's congestion window go beyond receiver's maximum > advertised window? (Syed Faisal Hasan) > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 15 Apr 2005 16:52:09 -0700 > From: "Christian Huitema" > Subject: Re: [e2e] Resilient UDP > To: "Zhi Li" , > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > > I recently came aross a term called "reslient UDP". 
> > I couldn't find any related document on the web. > > Does anyone know the detail operations? Or, could you please suggest > > me some references or papers about it? > > It is a version of UDP used with military intelligence. > > -- Christian Huitema > > ------------------------------ > > Message: 2 > Date: Sat, 16 Apr 2005 10:12:52 +0000 > From: "Syed Faisal Hasan" > Subject: [e2e] Can TCP's congestion window go beyond receiver's > maximum advertised window? > To: end2end-interest at postel.org > Message-ID: > Content-Type: text/plain; format=flowed > > Dear Folks, > > I was trying to do a simple simulation using NS-2.27 and I found > something interesting. > > The topology is as follows > [n0]---------------[n1] > > n0 is running a ftp application on top of tcp. > TCP receiver's advertised maximum window size is set to 20 (that is > the default in NS) > > Congestion window (cwnd) should never go beyond receive window (rwnd), > right? > Then why in the simulation, cwnd grows beyond rwnd? cwnd reaches to > 24, while rwnd is fixed at 20. > > If this is a silly question, I 'm sorry for asking. But I 'ld like to > have an explanation. 
> > Faisal From stelios at dcs.gla.ac.uk Sun Apr 17 11:09:18 2005 From: stelios at dcs.gla.ac.uk (Stylianos Papanastasiou) Date: Sun, 17 Apr 2005 19:09:18 +0100 Subject: [e2e] Can TCP's congestion window go beyond receiver's maximum advertised window? In-Reply-To: References: Message-ID: <1113761358.7745.6.camel@bioko> I think you should direct similar questions to the ns-users mailing list. The short answer is: the window() (and windowd()) functions return the available cwnd, which is the minimum of cwnd_ and window_ (or wnd_ in C++ space). This is the usable congestion window for all purposes in NS2. Hence, even though your trace says that cwnd_ reaches the value 24 (and it does), when, for instance, halving the congestion window, ns does min(24,20)/2, and so you get a value of 10 for the cwnd. Your traces will verify this. Stelios On Sat, 2005-04-16 at 10:12 +0000, Syed Faisal Hasan wrote: > Dear Folks, > > I was trying to do a simple simulation using NS-2.27 and I found > something interesting. > > The topology is as follows > [n0]---------------[n1] > > n0 is running a ftp application on top of tcp. > TCP receiver's advertised maximum window size is set to 20 (that is > the default in NS) > > Congestion window (cwnd) should never go beyond receive window (rwnd), > right? > Then why in the simulation, cwnd grows beyond rwnd? cwnd reaches to > 24, while rwnd is fixed at 20. > > If this is a silly question, I'm sorry for asking. But I'd like to > have an explanation.
> > Faisal > #========================================== > #The NS script is below > #=========================================== > set ns [new Simulator] > > set file1 [open testout.tr w] > $ns trace-all $file1 > > set file2 [open ./temp/namtest.nam w] > $ns namtrace-all $file2 > set windowfile [ open ./temp/WindowFile w] > > proc finish {} { > > global ns file1 file2 > $ns flush-trace > close $file1 > close $file2 > > exec nam ./temp/namtest.nam & > exec xgraph ./temp/WindowFile -geometry 800x600 & > exit 0 > > } > > set n0 [$ns node] > set n1 [$ns node] > $ns duplex-link $n0 $n1 0.2Mb 500ms DropTail > $ns duplex-link-op $n0 $n1 orient right > > set tcp [new Agent/TCP/Sack1] > $ns attach-agent $n0 $tcp > $tcp set window_ 20 > > set tcpsink [new Agent/TCPSink] > $ns attach-agent $n1 $tcpsink > $ns connect $tcp $tcpsink > > set ftp [new Application/FTP] > $ftp attach-agent $tcp > > proc getwindow {source file } { > > global ns > set now [$ns now] > set time 0.1 > set cwnd [$source set cwnd_] > puts $file "$now $cwnd" > $ns at [expr $now+$time] "getwindow $source $file" > > } > > $ns at 0.1 "getwindow $tcp $windowfile" > > $ns at 0.0 "$ftp start" > $ns at 9.0 "$ftp stop" > $ns at 10 "finish" > $ns run > #=================================== > > _________________________________________________________________ > FREE pop-up blocking with the new MSN Toolbar - get it now! > http://toolbar.msn.click-url.com/go/onm00200415ave/direct/01/ > From craig at aland.bbn.com Mon Apr 18 09:47:27 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Mon, 18 Apr 2005 12:47:27 -0400 Subject: [e2e] Can TCP's congestion window go beyond receiver's maximum advertised window? In-Reply-To: Your message of "Sat, 16 Apr 2005 10:12:52 -0000." 
Message-ID: <20050418164727.913981FF@aland.bbn.com> I don't know why this happens, but it is clear that you have to track the two values (cwnd and rwnd) separately, as the receiver can open its window (and the sender probably never knows for sure how big the fully-open receiver window would be). Craig From rbeverly at rbeverly.net Wed Apr 20 11:26:40 2005 From: rbeverly at rbeverly.net (Robert Beverly) Date: Wed, 20 Apr 2005 14:26:40 -0400 Subject: [e2e] Internet email performance study Message-ID: <20050420182640.GA26116@rbeverly.net> Hi all, We're looking for operational-types lurking on the list with experience running large mail servers. In particular, we have collected a large amount of data as part of an Internet email performance study that we cannot entirely explain. If you can help us or are simply curious about our findings, we'd love to hear from you. WHAT WE DID: Briefly, we used SMTP bounce-backs as the basis of an email active measurement survey. Using random addresses as unique identifiers, we measure latency, loss, paths, etc. to a large set of Internet MTAs. Approximately 1/3 of all servers we've surveyed respond with bounce-backs. We've found some interesting results, for example latencies of days (30 days in one instance). WHAT WE DON'T UNDERSTAND: Most servers behave as we expect, either always replying with bounce-backs or never replying. However, some exhibit odd and seemingly non-deterministic behavior. For example, a server will respond to all emails for weeks, and then reply to only a fraction (e.g., 25-75%) of the emails in a seemingly random pattern for some period of time (e.g., 4 hours). Further, we often see these patterns correlated within a domain (e.g., a subset of the MTAs will enter and exit this loss mode at the same time). We are fairly certain that the loss is an artifact of the MTA behavior or local administration.
While we can guess reasons this might occur, we have yet to find an administrator who can explain this behavior with an architecture used in practice. More details on the project including our exact methodology, plausible explanations for the loss and a FAQ are available on our web site: http://ana.lcs.mit.edu/emailtester Thanks! Rob Beverly / Mike Afergan From arjuna.sathiaseelan at gmail.com Thu Apr 21 00:42:20 2005 From: arjuna.sathiaseelan at gmail.com (Arjuna Sathiaseelan) Date: Thu, 21 Apr 2005 08:42:20 +0100 Subject: [e2e] Question on MTU Message-ID: <1ef2259005042100424feef544@mail.gmail.com> Dear All, I would be very much obliged if you could let me know the following: As MTU is the maximum amount of information per packet that can be sent on the wire, does it include the MSS + TCP header + IP header + DL header (with error correction codes) or is it just the MSS + TCP header + IP header? Because for the Ethernet - which has a MTU of 1500 bytes - we usually have 1460 bytes as MSS + 20 bytes TCP header + 20 bytes IP header. What about the header that would be added in link layer? Please do clarify me. Regds, Arjuna From touch at ISI.EDU Thu Apr 21 09:28:28 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 21 Apr 2005 09:28:28 -0700 Subject: [e2e] Question on MTU In-Reply-To: <1ef2259005042100424feef544@mail.gmail.com> References: <1ef2259005042100424feef544@mail.gmail.com> Message-ID: <4267D4AC.8090503@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Arjuna Sathiaseelan wrote: > Dear All, > I would be very much obliged if you could let me know the following: > > As MTU is the maximum amount of information per packet that can be > sent on the wire, does it include the MSS + TCP header + IP header + > DL header (with error correction codes) or is it just the MSS + TCP > header + IP header? > > Because for the Ethernet - which has a MTU of 1500 bytes - we usually > have 1460 bytes as MSS + 20 bytes TCP header + 20 bytes IP header. 
> What about the header that would be added in link layer? > > Please do clarify me. > > Regds, > Arjuna MSS and MTU both omit headers, i.e., they are payload sizes. MTU usually refers to a link layer, and denotes the maximum link payload size, excluding link header/trailer info. For Ethernet, such header/trailers include: - 14 byte header - 4 byte 802.1q (VLAN) tag - 4 byte CRC Standard ethernet has 1518 byte frames, but 802.1q ethernet has 1522 byte frames. From the link frame size, subtract the link header/trailer to get the MTU. Standard ethernet has an MTU of 1500 bytes, but there are jumbograms of 9,000 bytes in the extended ethernet spec. MSS usually refers to a transport protocol, e.g., TCP, and denotes the max payload size there too. It is also relative to the network (IPv4, IPv6) protocol _and_ link layer used. And just as link layer overhead sizes vary, so do network layer overhead sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are included, e.g., 48 for IPv6 with jumbogram option). Joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCZ9SsE5f5cImnZrsRAlPPAJ42GssC74fPcWXKtjS0pvA+7K5mbwCgnaPz u8ahwcXwaxH7K2anV7oik0Y= =bIR5 -----END PGP SIGNATURE----- From mtariq at cc.gatech.edu Thu Apr 21 10:39:08 2005 From: mtariq at cc.gatech.edu (Muhammad Mukarram Bin Tariq) Date: Thu, 21 Apr 2005 13:39:08 -0400 Subject: [e2e] study on 'NAT'ed hosts Message-ID: <4267E53C.4060802@cc.gatech.edu> Hello, I was wondering whether there is a study on estimating fraction of hosts that are connected to Internet from behind a NAT, or share globally routable IP addresses in some time-multiplexed fashion.
-- Mukarram From tvest at pch.net Thu Apr 21 11:52:06 2005 From: tvest at pch.net (Tom Vest) Date: Thu, 21 Apr 2005 14:52:06 -0400 Subject: [e2e] study on 'NAT'ed hosts In-Reply-To: <4267E53C.4060802@cc.gatech.edu> References: <4267E53C.4060802@cc.gatech.edu> Message-ID: <4569df0b0d481cf1247f31f9d299c388@pch.net> On Apr 21, 2005, at 1:39 PM, Muhammad Mukarram Bin Tariq wrote: > Hello, > > I was wondering whether there is a study on estimating fraction of > hosts that are connected to Internet from behind a NAT, or share > globally routable IP addresses in some time-multiplexed fashion. > > -- Mukarram I would be especially interested in anything that might suggest the degree to which NAPT is used in ways that break the association/ratio between access-related address utilization, e.g., and a peak simultaneous usage rate. If most RIRs/NIRs/LIRs use such ratios as a component of their IP address request validation process (and conversely, most ISPs use it in their IP address requests), doesn't this mean that, practically speaking, NAPT does not in fact break this association? Thanks -- Tom From svp.mailman at gmail.com Thu Apr 21 12:03:29 2005 From: svp.mailman at gmail.com (Swapnil Patil) Date: Thu, 21 Apr 2005 15:03:29 -0400 Subject: [e2e] study on 'NAT'ed hosts In-Reply-To: <4267E53C.4060802@cc.gatech.edu> References: <4267E53C.4060802@cc.gatech.edu> Message-ID: see "A Technique for Counting NATted Hosts" by Steve Bellovin appeared in the Internet Measurement Workshop 2002. regards -swapnil On 4/21/05, Muhammad Mukarram Bin Tariq wrote: > Hello, > > I was wondering whether there is a study on estimating fraction of hosts > that are connected to Internet from behind a NAT, or share globally > routable IP addresses in some time-multiplexed fashion. > > -- Mukarram > -- This is Swapnil Patil's listserv address. 
From ljorgenson at apparentnetworks.com Thu Apr 21 12:27:42 2005 From: ljorgenson at apparentnetworks.com (Loki Jorgenson) Date: Thu, 21 Apr 2005 12:27:42 -0700 Subject: [e2e] MTU - IP layer Message-ID: Minor note - MTU is technically Layer 3 (as opposed to link layer or layer 2). So it is quite correct to describe the MTU as the link layer payload size. So, as noted, 1518 bytes is the frame size at layer 2. However, it is very important to keep in mind that MTU and path MTU discovery operate at Layer 3. For example, boundaries between differing MTUs should be handled by Layer 3 devices (not switches) to avoid end-to-end issues that can arise. Loki ---- "Joe Wrote:" Date: Thu, 21 Apr 2005 09:28:28 -0700 From: Joe Touch Subject: Re: [e2e] Question on MTU To: Arjuna Sathiaseelan Cc: end2end-interest at postel.org Message-ID: <4267D4AC.8090503 at isi.edu> Content-Type: text/plain; charset=ISO-8859-1 MTU usually refers to a link layer, and denotes the maximum link payload size, excluding link header/trailer info. For Ethernet, such header/trailers include: - 14 byte header - 4 byte 802.1q (VLAN) tag - 4 byte CRC Standard ethernet has 1518 byte frames, but 802.1q ethernet has 1522 byte frames. From the link frame size, subtract the link header/trailer to get the MTU. Standard ethernet has an MTU of 1500 bytes, but there are jumbograms of 9,000 bytes in the extended ethernet spec. MSS usually refers to a transport protocol, e.g., TCP, and denotes the max payload size there too. It is also relative to the network (IPv4, IPv6) protocol _and_ link layer used. And just as link layer overhead sizes vary, so do network layer overhead sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are included, e.g., 48 for IPv6 with jumbogram option).
From am.amir at gmail.com Thu Apr 21 12:54:26 2005 From: am.amir at gmail.com (Aamir Mehmood) Date: Fri, 22 Apr 2005 00:54:26 +0500 Subject: [e2e] Jitter Calculations in IP networks. Message-ID: <12a3f40805042112543dd601c2@mail.gmail.com> Hi all, We are analyzing a core IP backbone. Can someone please let me know how jitter is calculated in IP networks? Is there any software other than Ethereal that can calculate the jitter from a captured RTP stream? Regards Amir From david.borman at windriver.com Thu Apr 21 12:58:05 2005 From: david.borman at windriver.com (David Borman) Date: Thu, 21 Apr 2005 14:58:05 -0500 Subject: [e2e] Question on MTU In-Reply-To: <4267D4AC.8090503@isi.edu> References: <1ef2259005042100424feef544@mail.gmail.com> <4267D4AC.8090503@isi.edu> Message-ID: <7bf770bf3d525c13130f6408e21788b7@windriver.com> On Apr 21, 2005, at 11:28 AM, Joe Touch wrote: > MSS usually refers to a transport protocol, e.g., TCP, and denotes the > max payload size there too. It is also relative to the network (IPv4, > IPv6) protocol _and_ link layer used. > > And just as link layer overhead sizes vary, so do network layer > overhead > sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are > included, e.g., 48 for IPv6 with jumbogram option). But the advertised MSS in the TCP MSS option should not be adjusted to reflect any options or intermediary headers, just the fixed IP and TCP header sizes; 40 bytes for IPv4/TCP, and 60 bytes for IPv6/TCP. When the sender generates the packet, he is responsible for reducing the TCP data to allow room for any additional options or headers.
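The arithmetic above can be sketched as follows. This is only a minimal illustration, assuming a 1500-byte Ethernet MTU and the effective-send-MSS formula from RFC 1122 Section 4.2.2.6 (which is written for IPv4; the IPv6 case here simply swaps in the 40-byte fixed header). The function and constant names are made up for the example and do not come from any real TCP stack.

```python
# Illustrative only: all names here are hypothetical, not from a real TCP stack.
IPV4_HEADER = 20   # fixed IPv4 header, no options
IPV6_HEADER = 40   # fixed IPv6 header, no extension headers
TCP_HEADER = 20    # fixed TCP header, no options


def advertised_mss(mtu, ip_header=IPV4_HEADER):
    """Value for the TCP MSS option: MTU minus the fixed headers only."""
    return mtu - ip_header - TCP_HEADER


def effective_send_mss(peer_mss, local_mtu, tcp_options=0, ip_options=0,
                       ip_header=IPV4_HEADER):
    """Largest amount of data the sender puts in one segment.

    Eff.snd.MSS = min(SendMSS + 20, MMS_S) - TCPhdrsize - IPoptionsize,
    with MMS_S taken as the local MTU minus the fixed IP header.
    """
    mms_s = local_mtu - ip_header
    tcp_header_size = TCP_HEADER + tcp_options
    return min(peer_mss + TCP_HEADER, mms_s) - tcp_header_size - ip_options


if __name__ == "__main__":
    print(advertised_mss(1500))                             # IPv4 over Ethernet
    print(advertised_mss(1500, IPV6_HEADER))                # IPv6 over Ethernet
    print(effective_send_mss(1460, 1500, tcp_options=12))   # 12 bytes of TCP options
```

With a 1500-byte MTU this advertises 1460 for IPv4 and 1440 for IPv6, while a sender carrying 12 bytes of TCP options sends at most 1448 bytes of data per segment.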
-David Borman From touch at ISI.EDU Thu Apr 21 13:29:33 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 21 Apr 2005 13:29:33 -0700 Subject: [e2e] MTU - IP layer In-Reply-To: References: Message-ID: <42680D2D.1070309@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 L3 packet size isn't referred to as MTU, esp. in IP (rfc791); it is datagram length (or total length). Fragments in IP must be less than or equal to the MTU, which there (791) refers to the max payload of the L2. path MTU discovery is equivalent to path "max link payload" discovery, rather than path "max network payload" discovery. IMO, therefore, MTU really refers to the L2 payload size, which is not the same as the L3 'frame' size (size of the total IP packet), but is related to the size of an L3 fragment. Joe Loki Jorgenson wrote: > > Minor note - MTU is technically Layer 3 (as opposed to link layer or > layer 2). So it is quite correct to describe the MTU as the link layer > payload size. So, as noted, 1518 bytes is the frame size at layer 2. > > However, it is very important to keep in mind that MTU and path MTU > discovery operate at Layer 3. For example, boundaries between differing > MTUs should be handled by Layer 3 devices (not switches) to avoid > end-to-end issues that can arise. > > Loki > > ---- > > "Joe Wrote:" > > > Date: Thu, 21 Apr 2005 09:28:28 -0700 > From: Joe Touch > Subject: Re: [e2e] Question on MTU > To: Arjuna Sathiaseelan > Cc: end2end-interest at postel.org > Message-ID: <4267D4AC.8090503 at isi.edu> > Content-Type: text/plain; charset=ISO-8859-1 > > > MTU usually refers to a link layer, and denotes the maximum link payload > size, excluding link header/trailer info. For Ethernet, such > header/trailers include: > > - 14 byte header > - 4 byte 802.1q (VLAN) tag > - 4 byte CRC > > Standard ethernet has 1518 byte frames, but 802.1q ethernet has 1522 > byte frames. From the link frame size, subtract the link header/trailer > to get the MTU.
Standard ethernet has an MTU of 1500 bytes, but there > are jumbograms of 9,000 bytes in the extended ethernet spec. > > MSS usually refers to a transport protocol, e.g., TCP, and denotes the > max payload size there too. It is also relative to the network (IPv4, > IPv6) protocol _and_ link layer used. > > And just as link layer overhead sizes vary, so do network layer overhead > sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are > included, e.g., 48 for IPv6 with jumbogram option). > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCaA0tE5f5cImnZrsRAow8AJ4pWCIAqdCRFbQDAhbm4+z1SaZzbACfSvb/ XZXMcs7Veyt+qS6RdSEzzeU= =kDKI -----END PGP SIGNATURE----- From braden at ISI.EDU Thu Apr 21 13:55:26 2005 From: braden at ISI.EDU (Bob Braden) Date: Thu, 21 Apr 2005 13:55:26 -0700 (PDT) Subject: [e2e] Question on MTU Message-ID: <200504212055.NAA02998@gra.isi.edu> *> *> But the advertised MSS in the TCP MSS option should not be adjusted to *> reflect any options or intermediary headers, just the fixed IP and TCP *> header sizes; 40 bytes for IPv4/TCP, and 60 bytes for IPv6/TCP. When *> the sender generates the packet, he is responsible for reducing the TCP *> data to allow room for any additional options or headers. *> *> -David Borman *> Right. Please see Section 4.2.2.6 of RFC 1122 "Requirements for Internet Hosts - Communication Layers" for the details. Bob Braden From ljorgenson at apparentnetworks.com Thu Apr 21 13:55:29 2005 From: ljorgenson at apparentnetworks.com (Loki Jorgenson) Date: Thu, 21 Apr 2005 13:55:29 -0700 Subject: [e2e] MTU - IP layer Message-ID: Hmmmmmm - that's an interesting reading of RFC 791 - and the distinction of fragments over datagrams could be made in that way. My observation remains that MTU is conceptually defined and implemented at Layer 3. 
Taking pains to define it in Layer 2 terms in order to ensure its scope includes all valid cases makes sense - and yet I find it challenged. Promoting the subtle distinction of "Frame payload" over "packet/datagram" doesn't seem beneficial. Perhaps I'm favouring the pragmatic over the precise.... Loki -----Original Message----- From: Joe Touch [mailto:touch at ISI.EDU] Sent: Thursday, April 21, 2005 1:30 PM To: Loki Jorgenson Cc: end2end-interest at postel.org Subject: Re: [e2e] MTU - IP layer -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 L3 packet size isn't referred to as MTU, esp. in IP (rfc791); it is datagram length (or total length). Fragments in IP must be less than or equal to the MTU, which there (791) refers to the max payload of the L2. path MTU discovery is equivalent to path "max link payload" discovery, rather than path "max network payload" discovery. IMO, therefore, MTU really refers to the L2 payload size, which is not the same as the L3 'frame' size (size of the total IP packet), but is related to the size of an L3 fragment. Joe Loki Jorgenson wrote: > > Minor note - MTU is technically Layer 3 (as opposed to link layer or > layer 2). So it is quite correct to describe the MTU as the link layer > payload size. So, as noted, 1518 bytes is the frame size at layer 2. > > However, it is very important to keep in mind that MTU and path MTU > discovery operate at Layer 3. For example, boundaries between differing > MTUs should be handled by Layer 3 devices (not switches) to avoid > end-to-end issues that can arise. > > Loki > > ---- > > "Joe Wrote:" > > > Date: Thu, 21 Apr 2005 09:28:28 -0700 > From: Joe Touch > Subject: Re: [e2e] Question on MTU > To: Arjuna Sathiaseelan > Cc: end2end-interest at postel.org > Message-ID: <4267D4AC.8090503 at isi.edu> > Content-Type: text/plain; charset=ISO-8859-1 > > > MTU usually refers to a link layer, and denotes the maximum link payload > size, excluding link header/trailer info.
For Ethernet, such > header/trailers include: > > - 14 byte header > - 4 byte 802.1q (VLAN) tag > - 4 byte CRC > > Standard ethernet has 1518 byte frames, but 802.1q ethernet has 1522 > byte frames. From the link frame size, subtract the link header/trailer > to get the MTU. Standard ethernet has an MTU of 1500 bytes, but there > are jumbograms of 9,000 bytes in the extended ethernet spec. > > MSS usually refers to a transport protocol, e.g., TCP, and denotes the > max payload size there too. It is also relative to the network (IPv4, > IPv6) protocol _and_ link layer used. > > And just as link layer overhead sizes vary, so do network layer overhead > sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are > included, e.g., 48 for IPv6 with jumbogram option). > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCaA0tE5f5cImnZrsRAow8AJ4pWCIAqdCRFbQDAhbm4+z1SaZzbACfSvb/ XZXMcs7Veyt+qS6RdSEzzeU= =kDKI -----END PGP SIGNATURE----- From touch at ISI.EDU Thu Apr 21 13:59:54 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 21 Apr 2005 13:59:54 -0700 Subject: [e2e] Question on MTU In-Reply-To: <7bf770bf3d525c13130f6408e21788b7@windriver.com> References: <1ef2259005042100424feef544@mail.gmail.com> <4267D4AC.8090503@isi.edu> <7bf770bf3d525c13130f6408e21788b7@windriver.com> Message-ID: <4268144A.6080002@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 See RFC1122 Section 4.2.2.6 on calculating the MSS advertised in the TCP MSS option. Condensed from that section: The eff.snd.MSS takes options at both IP and TCP layers into account - this is the size of the largest segment actually sent. The MSS value sent in the MSS option must be <= MMS_R - 20, where MMS_R is from GET_MAXSIZES in sec 3.4. Sec 3.4 refers to 3.3.3, which defines: MMS_S = EMTU_S - <IP header size> and EMTU_S must be less than or equal to the MTU of the network interface corresponding to the source address of the datagram.
Note that <IP header size> in this equation will be 20, unless the IP reserves space to insert IP options for its own purposes in addition to any options inserted by the transport layer. I.e., IP options ARE accounted for in the advertised MSS. As you noted, intermediate headers (shims like IPsec and HIP) are harder to handle because they aren't treated as options, and may not necessarily be known to either IP or TCP. My understanding is that most implementations adjust the IP MSS accordingly, so it gets passed up to TCP as per secs 3.3.3 and 3.4 of 1122 above. Joe David Borman wrote: > > On Apr 21, 2005, at 11:28 AM, Joe Touch wrote: > >> MSS usually refers to a transport protocol, e.g., TCP, and denotes the >> max payload size there too. It is also relative to the network (IPv4, >> IPv6) protocol _and_ link layer used. >> >> And just as link layer overhead sizes vary, so do network layer overhead >> sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are >> included, e.g., 48 for IPv6 with jumbogram option). > > > But the advertised MSS in the TCP MSS option should not be adjusted to > reflect any options or intermediary headers, just the fixed IP and TCP > header sizes; 40 bytes for IPv4/TCP, and 60 bytes for IPv6/TCP. When > the sender generates the packet, he is responsible for reducing the TCP > data to allow room for any additional options or headers.
> > -David Borman -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCaBRKE5f5cImnZrsRAuVHAJ9eaIBHKXMZxhzcMgldOvFVphYRIACffqGL qWTwK4RCNc/QWYLQxi4tYOU= =SChT -----END PGP SIGNATURE----- From touch at ISI.EDU Thu Apr 21 14:02:11 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 21 Apr 2005 14:02:11 -0700 Subject: [e2e] MTU - IP layer In-Reply-To: References: Message-ID: <426814D3.7030908@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Loki Jorgenson wrote: > Hmmmmmm - that's an interesting reading of RFC 791 - and the distinction > of fragments over datagrams could be made in that way. > > My observation remains that MTU is conceptually defined and implemented > at Layer 3. Taking pains to define it in Layer 2 terms in order to > ensure its scope includes all valid cases makes sense - and yet I find > it challenged. Promoting the subtle distinction of "Frame payload" over > "packet/datagram" doesn't seem beneficial. But that's exactly the difference between a datagram fragment and the entire datagram, when the datagram is larger than the MTU. Fragments are smaller than the L2 MTU, but datagrams are smaller than the L3 'framesize' - whatever we want to call that. ;-) Joe > > Perhaps I'm favouring the pragmatic over the precise.... > > Loki > > -----Original Message----- > From: Joe Touch [mailto:touch at ISI.EDU] > Sent: Thursday, April 21, 2005 1:30 PM > To: Loki Jorgenson > Cc: end2end-interest at postel.org > Subject: Re: [e2e] MTU - IP layer > > L3 packet size isn't referred to as MTU, esp. in IP (rfc791); it is > datagram length (or total length). > > Fragments in IP must be less than or equal to the MTU, which there (791) > refers to the max payload of the L2. > > path MTU discovery is equivalent to path "max link payload" discovery, > rather than path "max network payload" discovery.
> > IMO, therefore, MTU really refers to the L2 payload size, which is not > the same as the L3 'frame' size (size of the total IP packet), but is > related to the size of an L3 fragment. > > Joe > > Loki Jorgenson wrote: > >>>Minor note - MTU is technically Layer 3 (as opposed to link layer or >>>layer 2). So it is quite correct to describe the MTU as the link layer >>>payload size. So, as noted, 1518 bytes is the frame size at layer 2. >>> >>>However, it is very important to keep in mind that MTU and path MTU >>>discovery operate at Layer 3. For example, boundaries between differing >>>MTUs should be handled by Layer 3 devices (not switches) to avoid >>>end-to-end issues that can arise. >>> >>>Loki >>> >>>---- >>> >>>"Joe Wrote:" >>> >>> >>>Date: Thu, 21 Apr 2005 09:28:28 -0700 >>>From: Joe Touch >>>Subject: Re: [e2e] Question on MTU >>>To: Arjuna Sathiaseelan >>>Cc: end2end-interest at postel.org >>>Message-ID: <4267D4AC.8090503 at isi.edu> >>>Content-Type: text/plain; charset=ISO-8859-1 >>> >>> >>>MTU usually refers to a link layer, and denotes the maximum link payload >>>size, excluding link header/trailer info. For Ethernet, such >>>header/trailers include: >>> >>> - 14 byte header >>> - 4 byte 802.1q (VLAN) tag >>> - 4 byte CRC >>> >>>Standard ethernet has 1518 byte frames, but 802.1q ethernet has 1522 >>>byte frames. From the link frame size, subtract the link header/trailer >>>to get the MTU. Standard ethernet has an MTU of 1500 bytes, but there >>>are jumbograms of 9,000 bytes in the extended ethernet spec. >>> >>>MSS usually refers to a transport protocol, e.g., TCP, and denotes the >>>max payload size there too. It is also relative to the network (IPv4, >>>IPv6) protocol _and_ link layer used. >>> >>>And just as link layer overhead sizes vary, so do network layer overhead >>>sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are >>>included, e.g., 48 for IPv6 with jumbogram option).
>>> > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCaBTTE5f5cImnZrsRAn7wAJ0Qx8njXuW53Z6biPzVrgkFecROngCeOcPk hQSXcr8aQWhVwgYWlDqjVhw= =rnwt -----END PGP SIGNATURE----- From braden at ISI.EDU Thu Apr 21 14:26:00 2005 From: braden at ISI.EDU (Bob Braden) Date: Thu, 21 Apr 2005 14:26:00 -0700 (PDT) Subject: [e2e] MTU - IP layer Message-ID: <200504212126.OAA03031@gra.isi.edu> *> *> Hmmmmmm - that's an interesting reading of RFC 791 - and the distinction *> of fragments over datagrams could be made in that way. *> "Interesting"? Joe is correct and completely precise here. Bob Braden From braden at ISI.EDU Thu Apr 21 14:32:22 2005 From: braden at ISI.EDU (Bob Braden) Date: Thu, 21 Apr 2005 14:32:22 -0700 (PDT) Subject: [e2e] MTU - IP layer Message-ID: <200504212132.OAA03036@gra.isi.edu> *> *> My observation remains that MTU is conceptually defined and implemented *> at Layer 3. Making pains to define it in Layer 2 terms in order to *> ensure its scope includes all valid cases makes sense - and yet I find *> it challenged. Promoting the subtle distinction of "Frame payload" over *> "packet/datagram" doesn't seem beneficial. You are not getting the point. This is a completely correct distinction, and it is not subtle. IP permits a link layer frame to be longer (but not shorter) than the IP datagram it contains. There is NOT a necessary equality between layer 2 frame size and layer 3 datagram size. That is (one reason) why an IP header contains a length field; it cannot just assume the length field provided by the link layer (the way TCP inherits the length from IP). We thrashed this point out in 1978 when TCP/IP was being designed. On another issue in this thread, MSS is specific to TCP, because the definition of a "segment" is TCP-specific. (I once tried to convince Jon Postel that "segment" was a superfluous term, but he was not buying... 
;-)) *> *> Perhaps I'm favouring the pragmatic over the precise.... *> Quite the opposite, in fact. Bob Braden *> Loki *> *> -----Original Message----- *> From: Joe Touch [mailto:touch at ISI.EDU] *> Sent: Thursday, April 21, 2005 1:30 PM *> To: Loki Jorgenson *> Cc: end2end-interest at postel.org *> Subject: Re: [e2e] MTU - IP layer *> *> -----BEGIN PGP SIGNED MESSAGE----- *> Hash: SHA1 *> *> L3 packet size isn't referred to as MTU, esp. in IP (rfc791); it is *> datagram length (or total length). *> *> Fragments in IP must be less than or equal to the MTU, which there (791) *> refers to the max payload of the L2. *> *> path MTU discovery is equivalent to path "max link payload" discovery, *> rather than path "max network payload" discovery. *> *> IMO, therefore, MTU really refers to the L2 payload size, which is not *> the same as the L3 'frame' size (size of the total IP packet), but is *> related to the size of an L3 fragment. *> *> Joe *> *> Loki Jorgenson wrote: *> > *> > Minor note - MTU is technically Layer 3 (as opposed to link layer or *> > layer 2). So it is quite correct to describe the MTU as the link *> layer *> > payload size. So, as noted, 1518 bytes is the frame size at layer 2. *> > *> > However, it is very important to keep in mind that MTU and path MTU *> > discovery operate at Layer 3. For example, boundaries between *> differing *> > MTUs should be handled by Layer 3 devices (not switches) to avoid *> > end-to-end issues that can arise. *> > *> > Loki *> > *> > ---- *> > *> > "Joe Wrote:" *> > *> > *> > Date: Thu, 21 Apr 2005 09:28:28 -0700 *> > From: Joe Touch *> > Subject: Re: [e2e] Question on MTU *> > To: Arjuna Sathiaseelan *> > Cc: end2end-interest at postel.org *> > Message-ID: <4267D4AC.8090503 at isi.edu> *> > Content-Type: text/plain; charset=ISO-8859-1 *> > *> > *> > MTU usually refers to a link layer, and denotes the maximum link *> payload *> > size, excluding link header/trailer info.
For Ethernet, such *> > header/trailers include: *> > *> > - 14 byte header *> > - 4 byte 802.1q (VLAN) tag *> > - 4 byte CRC *> > *> > Standard ethernet has 1518 byte frames, but 802.1q ethernet has 1522 *> > byte frames. From the link frame size, subtract the link *> header/trailer *> > to get the MTU. Standard ethernet has an MTU of 1500 bytes, but there *> > are jumbograms of 9,000 bytes in the extended ethernet spec. *> > *> > MSS usually refers to a transport protocol, e.g., TCP, and denotes the *> > max payload size there too. It is also relative to the network (IPv4, *> > IPv6) protocol _and_ link layer used. *> > *> > And just as link layer overhead sizes vary, so do network layer *> overhead *> > sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are *> > included, e.g., 48 for IPv6 with jumbogram option). *> > *> -----BEGIN PGP SIGNATURE----- *> Version: GnuPG v1.2.4 (MingW32) *> Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org *> *> iD8DBQFCaA0tE5f5cImnZrsRAow8AJ4pWCIAqdCRFbQDAhbm4+z1SaZzbACfSvb/ *> XZXMcs7Veyt+qS6RdSEzzeU= *> =kDKI *> -----END PGP SIGNATURE----- *> From david.borman at windriver.com Thu Apr 21 15:13:22 2005 From: david.borman at windriver.com (David Borman) Date: Thu, 21 Apr 2005 17:13:22 -0500 Subject: [e2e] Question on MTU In-Reply-To: <4268144A.6080002@isi.edu> References: <1ef2259005042100424feef544@mail.gmail.com> <4267D4AC.8090503@isi.edu> <7bf770bf3d525c13130f6408e21788b7@windriver.com> <4268144A.6080002@isi.edu> Message-ID: <53edeea330e7ab135170e4d17ee59c68@windriver.com> Joe, The "effective send MSS" takes into account options, but the MSS value put into the TCP MSS option should not. In section 4.2.2.6 of RFC 1122: The MSS value to be sent in an MSS option must be less than or equal to: MMS_R - 20 where MMS_R is the maximum size for a transport-layer message that can be received (and reassembled). TCP obtains MMS_R and MMS_S from the IP layer; see the generic call GET_MAXSIZES in Section 3.4. 
And in section 3.3.2: There MUST be a mechanism by which the transport layer can learn MMS_R, the maximum message size that can be received and reassembled in an IP datagram (see GET_MAXSIZES calls in Section 3.4). If EMTU_R is not indefinite, then the value of MMS_R is given by: MMS_R = EMTU_R - 20 since 20 is the minimum size of an IP header. The receiver can't reliably predict what IP or TCP options the sender is going to put into the packets, so it doesn't include them in the MSS option. The sender then does take those options into account when calculating the "effective send MSS", because it knows exactly what options are going into the packet. -David Borman On Apr 21, 2005, at 3:59 PM, Joe Touch wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > See RFC1122 Section 4.2.2.6 on calculating the MSS advertised in the > TCP > MSS option. Condensed from that section: > > The eff.snd.MSS takes options at both IP and TCP layers > into account - this is the size of the largest segment > actually sent. > > The MSS value sent in the MSS option must be <= MSS_R - 20, > where MSS_R is from GET_MAXSIZES in sec 3.4. > > Sec 3.4 refers to 3.3.3, which defines: > > MMS_S = EMTU_S - > > and EMTU_S must be less than or equal to the MTU of the > network > interface corresponding to the source address of the datagram. > Note that in this equation will be 20, unless > the IP reserves space to insert IP options for its own > purposes > in addition to any options inserted by the transport layer. > > I.e., IP options ARE accounted for in the advertised MSS. > > As you noted, intermediate headers (shims like IPsec and HIP) are > harder > to handle because they aren't treated as options, and may not > necessarily be known to either IP or TCP. My understanding is that most > implementations adjust the IP MSS accordingly, so it gets passed up to > TCP as per secs 3.3.3 and 3.4 of 1122 above. 
> > Joe > > David Borman wrote: >> >> On Apr 21, 2005, at 11:28 AM, Joe Touch wrote: >> >>> MSS usually refers to a transport protocol, e.g., TCP, and denotes >>> the >>> max payload size there too. It is also relative to the network (IPv4, >>> IPv6) protocol _and_ link layer used. >>> >>> And just as link layer overhead sizes vary, so do network layer >>> overhead >>> sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are >>> included, e.g., 48 for IPv6 with jumbogram option). >> >> >> But the advertised MSS in the TCP MSS option should not be adjusted to >> reflect any options or intermediary headers, just the fixed IP and TCP >> header sizes; 40 bytes for IPv4/TCP, and 60 bytes for IPv6/TCP. When >> the sender generates the packet, he is responsible for reducing the >> TCP >> data to allow room for any additional options or headers. >> >> -David Borman > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.4 (MingW32) > Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org > > iD8DBQFCaBRKE5f5cImnZrsRAuVHAJ9eaIBHKXMZxhzcMgldOvFVphYRIACffqGL > qWTwK4RCNc/QWYLQxi4tYOU= > =SChT > -----END PGP SIGNATURE----- From ljorgenson at apparentnetworks.com Thu Apr 21 15:39:11 2005 From: ljorgenson at apparentnetworks.com (Loki Jorgenson) Date: Thu, 21 Apr 2005 15:39:11 -0700 Subject: [e2e] MTU - IP layer Message-ID: OK - I'm convinced that the language is accurate. Thanks for the clarification. So it may simply be the difference between the conceptual and the applied. What I've been struggling with are the conflicting requirements to resolve MTU as an end-to-end value and to handle framesize/MTU at the interface/link layer. If the reality of IP is such that MTU is essentially defined in terms of the link layer, but all the pMTU processes operate at the network layer, how does one avoid, for example, the problems associated with black holes? 
Where this comes up in our experience is when the confusion of MTU with framesize leads to human mistakes being made at mixed MTU boundaries. Either switches are put into place to manage the MTU constriction, or constrictions are accidentally created by miscalculation (9000-byte frames instead of 9018). There is no (effective) mechanism to ensure that pMTU is a well-defined entity based on link layer implementation - it tends to be fragile. At least by keeping MTU conceptually Layer 3, some of the major pitfalls can be avoided, at least at a human level ..... thoughts? Loki Jorgenson wrote: > Hmmmmmm - that's an interesting reading of RFC 791 - and the distinction > of fragments over datagrams could be made in that way. > > My observation remains that MTU is conceptually defined and implemented > at Layer 3. Making pains to define it in Layer 2 terms in order to > ensure its scope includes all valid cases makes sense - and yet I find > it challenged. Promoting the subtle distinction of "Frame payload" over > "packet/datagram" doesn't seem beneficial. But that's exactly the difference between a datagram fragment and the entire datagram, when the datagram is larger than the MTU. Fragments are smaller than the L2 MTU, but datagrams are smaller than the L3 'framesize' - whatever we want to call that. ;-) Joe From cannara at attglobal.net Thu Apr 21 16:42:25 2005 From: cannara at attglobal.net (Cannara) Date: Thu, 21 Apr 2005 16:42:25 -0700 Subject: [e2e] MTU - IP layer References: Message-ID: <42683A61.C812DBFE@attglobal.net> This is interesting for its parochial nature, behind TCP/IP blinders. MTU is a defined term from way, way back and has nothing specifically to do with any protocol. It simply indicates the Maximum Transfer Unit any layer's implementation can handle. In other words, each layer's PDU requires an advertisement of MTU upward and an acceptance of the MTU offered from below.
For the IP world, where 512B seemed to be an important limit from below far longer than it should have been, this meant implementing Fragmentation of IP Datagrams. At higher or lower layers of various protocols, this has been a reality for many years. Alex Loki Jorgenson wrote: > > OK - I'm convinced that the language is accurate. Thanks for the > clarification. So it may simply be the difference between the > conceptual and the applied. > > What I've been struggling with are the conflicting requirements to > resolve MTU as an end-to-end value and to handle framesize/MTU at the > interface/link layer. If the reality of IP is such that MTU is > essentially defined in terms of the link layer, but all the pMTU > processes operate at the network layer, how does one avoid, for example, > the problems associated with black holes? > > Where this comes up in our experience is when the confusion of MTU with > framesize leads to human mistakes being made at mixed MTU boundaries. > Either switches are put into place to manage the MTU constriction or > constrictions being accidentally created by miscalculation (9000 byte > frames instead of 9018). There is no (effective) mechanism to ensure > that pMTU is a well-defined entity based on link layer implementaton - > it tends to be fragile. > > At least by keeping MTU conceptually Layer 3, some of the major pitfalls > can be avoided, at least at a human level ..... thoughts? > > Loki Jorgenson wrote: > > Hmmmmmm - that's an interesting reading of RFC 791 - and the > distinction > > of fragments over datagrams could be made in that way. > > > > My observation remains that MTU is conceptually defined and > implemented > > at Layer 3. Making pains to define it in Layer 2 terms in order to > > ensure its scope includes all valid cases makes sense - and yet I find > > it challenged. Promoting the subtle distinction of "Frame payload" > over > > "packet/datagram" doesn't seem beneficial. 
> > But that's exactly the difference between a datagram fragment and the > entire datagram, when the datagram is larger than the MTU. Fragments are > smaller than the L2 MTU, but datagrams are smaller than the L3 > 'framesize' - whatever we want to call that. ;-) > > Joe From touch at ISI.EDU Thu Apr 21 16:41:42 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 21 Apr 2005 16:41:42 -0700 Subject: [e2e] Question on MTU In-Reply-To: <53edeea330e7ab135170e4d17ee59c68@windriver.com> References: <1ef2259005042100424feef544@mail.gmail.com> <4267D4AC.8090503@isi.edu> <7bf770bf3d525c13130f6408e21788b7@windriver.com> <4268144A.6080002@isi.edu> <53edeea330e7ab135170e4d17ee59c68@windriver.com> Message-ID: <42683A36.7030600@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, Dave, David Borman wrote: > Joe, > > The "effective send MSS" takes into account options, but the MSS value > put into the TCP MSS option should not. In section 4.2.2.6 of RFC 1122: > > The MSS value to be sent in an MSS option must be less than > or equal to: > > MMS_R - 20 > > where MMS_R is the maximum size for a transport-layer > message that can be received (and reassembled). TCP obtains > MMS_R and MMS_S from the IP layer; see the generic call > GET_MAXSIZES in Section 3.4. The MSS you send to the other side is the side you can receive, which has nothing to do with your options - TCP or IP As you correctly note, this is related to MSS_R - 20 (sorry - I used the MSS_S value). > And in section 3.3.2: > > There MUST be a mechanism by which the transport layer can > learn MMS_R, the maximum message size that can be received and > reassembled in an IP datagram (see GET_MAXSIZES calls in > Section 3.4). If EMTU_R is not indefinite, then the value of > MMS_R is given by: > > MMS_R = EMTU_R - 20 > > since 20 is the minimum size of an IP header. > > The receiver can't reliably predict what IP or TCP options the sender is > going to put into the packets, so it doesn't include them in the MSS > option. 
The sender then does take those options into account when > calculating the "effective send MSS", because it knows exactly what > options are going into the packet. Agreed. The primary issue to me was that the options - both IP and TCP - are taken into account in computing the MSS TCP uses, whether obtained by looking at the local interface or learned by the PMTUD mechanisms. FWIW, the shims sometimes cobble things by setting the interface MTU down by the amount they add, effectively 'adding' space for it as a result. (sometimes; sometimes it's not so easy to point to which interface will be used, at which point I don't know if they decrement all interfaces or try to do anything more context-dependent) Joe > On Apr 21, 2005, at 3:59 PM, Joe Touch wrote: > > See RFC1122 Section 4.2.2.6 on calculating the MSS advertised in the TCP > MSS option. Condensed from that section: > > The eff.snd.MSS takes options at both IP and TCP layers > into account - this is the size of the largest segment > actually sent. > > The MSS value sent in the MSS option must be <= MSS_R - 20, > where MSS_R is from GET_MAXSIZES in sec 3.4. > > Sec 3.4 refers to 3.3.3, which defines: > > MMS_S = EMTU_S - > > and EMTU_S must be less than or equal to the MTU of the network > interface corresponding to the source address of the datagram. > Note that in this equation will be 20, unless > the IP reserves space to insert IP options for its own purposes > in addition to any options inserted by the transport layer. > > I.e., IP options ARE accounted for in the advertised MSS. > > As you noted, intermediate headers (shims like IPsec and HIP) are harder > to handle because they aren't treated as options, and may not > necessarily be known to either IP or TCP. My understanding is that most > implementations adjust the IP MSS accordingly, so it gets passed up to > TCP as per secs 3.3.3 and 3.4 of 1122 above. 
> > Joe > > David Borman wrote: > >>>> >>>> On Apr 21, 2005, at 11:28 AM, Joe Touch wrote: >>>> >>>>> MSS usually refers to a transport protocol, e.g., TCP, and denotes the >>>>> max payload size there too. It is also relative to the network (IPv4, >>>>> IPv6) protocol _and_ link layer used. >>>>> >>>>> And just as link layer overhead sizes vary, so do network layer >>>>> overhead >>>>> sizes (minimums of 20 for IPv4, 40 for IPv6 - larger if options are >>>>> included, e.g., 48 for IPv6 with jumbogram option). >>>> >>>> >>>> >>>> But the advertised MSS in the TCP MSS option should not be adjusted to >>>> reflect any options or intermediary headers, just the fixed IP and TCP >>>> header sizes; 40 bytes for IPv4/TCP, and 60 bytes for IPv6/TCP. When >>>> the sender generates the packet, he is responsible for reducing the TCP >>>> data to allow room for any additional options or headers. >>>> >>>> -David Borman > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCaDo2E5f5cImnZrsRAv95AKCMo1Tn9unqDs30y0+fLbqFmWlq7wCgj6TU kyYj0EwJ72DRqmH2Y5/90gU= =wIau -----END PGP SIGNATURE----- From touch at ISI.EDU Thu Apr 21 17:20:22 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 21 Apr 2005 17:20:22 -0700 Subject: [e2e] MTU - IP layer In-Reply-To: References: Message-ID: <42684346.1020105@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Loki Jorgenson wrote: > OK - I'm convinced that the language is accurate. Thanks for the > clarification. So it may simply be the difference between the > conceptual and the applied. What Bob said ;-) > What I've been struggling with are the conflicting requirements to > resolve MTU as an end-to-end value and to handle framesize/MTU at the > interface/link layer. MTU is a link payload issue. Path MTU is the min of the MTUs on the path; it is path MTU that is defined E2E. 
> If the reality of IP is such that MTU is > essentially defined in terms of the link layer, but all the pMTU > processes operate at the network layer, yes... > how does one avoid, for example, > the problems associated with black holes? I'm not sure one has anything to do with the other. The only way to _know_ you will avoid a black hole is to send the smallest IP packets possible - 68 bytes. You can do this by sending small datagrams (28 byte), or by sending larger datagrams and fragment them. Short of that, the only other way is POSITIVE feedback - try larger packets and see what gets through. If it does, report back and use that size. That's already under consideration in the IETF "pmtud" working group. Using NEGATIVE feedback - the absence of error messages bouncing large packets - is what is currently used, and that is what is susceptible to black holes, because black holes look like a successful transmission. > Where this comes up in our experience is when the confusion of MTU with > framesize leads to human mistakes being made at mixed MTU boundaries. That's what automated PMTUD is supposed to fix ;-) > Either switches are put into place to manage the MTU constriction or > constrictions being accidentally created by miscalculation (9000 byte > frames instead of 9018). There is no (effective) mechanism to ensure > that pMTU is a well-defined entity based on link layer implementaton - > it tends to be fragile. That, again, is because paths are not link concepts, so pMTU isn't defined at the link layer. > At least by keeping MTU conceptually Layer 3, some of the major pitfalls > can be avoided, at least at a human level ..... thoughts? IMO, there's no benefit to human management possible; automated systems are the key. The major pitfall, IMO, is trying to track this with brain cells ;-) Joe > Loki Jorgenson wrote: > >>Hmmmmmm - that's an interesting reading of RFC 791 - and the > > distinction > >>of fragments over datagrams could be made in that way. 
>> >>My observation remains that MTU is conceptually defined and > > implemented > >>at Layer 3. Making pains to define it in Layer 2 terms in order to >>ensure its scope includes all valid cases makes sense - and yet I find >>it challenged. Promoting the subtle distinction of "Frame payload" > > over > >>"packet/datagram" doesn't seem beneficial. > > > But that's exactly the difference between a datagram fragment and the > entire datagram, when the datagram is larger than the MTU. Fragments are > smaller than the L2 MTU, but datagrams are smaller than the L3 > 'framesize' - whatever we want to call that. ;-) > > Joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCaENGE5f5cImnZrsRAje5AJ9glKM5wN1vJ2G9NtPqpdV4XbH45ACeLLr+ igUDU1KTiBnn+xc0qp20bk8= =+5lW -----END PGP SIGNATURE----- From ljorgenson at apparentnetworks.com Thu Apr 21 17:46:36 2005 From: ljorgenson at apparentnetworks.com (Loki Jorgenson) Date: Thu, 21 Apr 2005 17:46:36 -0700 Subject: [e2e] MTU - IP layer Message-ID: Joe wrote: > MTU is a link payload issue. Path MTU is the min of the MTUs on the > path; it is path MTU that is defined E2E. And that doesn't seem like a problem? I guess if RFC 1191 was reliably implemented and Layer 2 fed back to the end-to-end.... > > Short of that, the only other way is POSITIVE feedback - try larger > packets and see what gets through. If it does, report back and use that > size. That's already under consideration in the IETF "pmtud" working group. >From the early drafts I looked at they were proposing, as you suggest, a "probing for packet loss by size defines pMTU" mechanism - is that still the case then? That doesn't sound like positive feedback per se. The idea of ICMP DF Set probing (a la RFC1191) at least seemed like positive feedback, if only best-effort.... 
Loki From touch at ISI.EDU Thu Apr 21 21:47:58 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu, 21 Apr 2005 21:47:58 -0700 Subject: [e2e] MTU - IP layer In-Reply-To: References: Message-ID: <426881FE.7030408@isi.edu> Loki Jorgenson wrote: > Joe wrote: > > >>MTU is a link payload issue. Path MTU is the min of the MTUs on the >>path; it is path MTU that is defined E2E. > > And that doesn't seem like a problem? I guess if RFC 1191 was > reliably implemented and Layer 2 fed back to the end-to-end.... See the new PMTUD WG below ;-) >>Short of that, the only other way is POSITIVE feedback - try larger >>packets and see what gets through. If it does, report back and use > > that > >>size. That's already under consideration in the IETF "pmtud" working > > group. > > From the early drafts I looked at they were proposing, as you suggest, a > "probing for packet loss by size defines pMTU" mechanism - is that > still the case then? That doesn't sound like positive feedback per se. > The idea of ICMP DF Set probing (a la RFC1191) at least seemed like > positive feedback, if only best-effort.... > > Loki My use of 'negative' and 'positive' may have been confusing. I meant more like "the absence of feedback" and "the presence of feedback". Positive/negative can be confused with the kind of information you get, _when_ you get feedback (yes it got through, or no it failed). So, current pmtud is based on the absence of "no, it failed" feedback. I.e., if the source gets the ICMP errors back, it knows that particular attempt failed. The algorithm says "try large until you get told NOT to", which is _why_ it is susceptible to black holes - because black holes behave like a working large-mtu path - you do NOT get the feedback that anything failed. The new pmtud is based on the presence of "yes, it got through" feedback. The loss isn't what matters; it's what gets through that does (successful probes).
The algorithm says "stay small, and try large (disposable) probes; if the probes work, THEN get larger". This is not susceptible to black holes - it works only when both the probes get through _and_ the feedback makes it back successfully. Joe From ljorgenson at apparentnetworks.com Thu Apr 21 23:57:53 2005 From: ljorgenson at apparentnetworks.com (Loki Jorgenson) Date: Thu, 21 Apr 2005 23:57:53 -0700 Subject: [e2e] MTU - IP layer Message-ID: OK - I've skimmed the latest draft of the RFC and I see better how they are planning to make this work. It represents a much more significant change than I had gleaned from earlier readings. I'm still wondering about how the loss of these (disposable) probes will be distinguished from congestion and other forms of loss, how the local host's record of PMTU will change if the actual PMTU changes (for example, a route change), and various other scenarios involving interplay between loss, TCP and PLPMTUD .... but I'll just finish reading the current draft and find out how it all turns out. If others are as curious: http://www.ietf.org/internet-drafts/draft-ietf-pmtud-method-04.txt In any case, thanks for your insights Joe. -----Original Message----- From: Joe Touch [mailto:touch at ISI.EDU] Sent: Thursday, April 21, 2005 9:48 PM The new pmtud is based on the presence of "yes, it got through" feedback. The loss isn't what matters; it's what gets through that does (successful probes). The algorithm says "stay small, and try large (disposable) probes; if the probes work, THEN get larger". This is not susceptible to black holes - it works only when both the probes get through _and_ the feedback makes it back successfully.
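[Editorial sketch, not part of the original message.] The "stay small, probe large, grow only on confirmation" behaviour described above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the names `path_mtu`, `make_probe` and `discover_pmtu` are hypothetical, the path is simulated as a list of link MTUs, and the binary search is just one convenient search order; the draft under discussion is not claimed to work this way. The 68-byte floor and the 9000-byte ceiling are the figures mentioned in this thread.

```python
from typing import Callable, Sequence


def path_mtu(link_mtus: Sequence[int]) -> int:
    """Path MTU is the minimum of the link MTUs along the path."""
    return min(link_mtus)


def make_probe(link_mtus: Sequence[int], feedback_works: bool = True) -> Callable[[int], bool]:
    """Simulate a path (hypothetical helper).

    The returned probe reports success only if the probe fits every link
    AND the acknowledgement makes it back to the sender.
    """
    limit = path_mtu(link_mtus)

    def probe(size: int) -> bool:
        return feedback_works and size <= limit

    return probe


def discover_pmtu(probe: Callable[[int], bool], floor: int = 68, ceiling: int = 9000) -> int:
    """Start at `floor` and grow only when a probe is positively confirmed.

    Assumption: `floor` itself always works, so it is never probed.
    Binary search is used here only to keep the number of probes small.
    """
    lo, hi = floor, ceiling
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if probe(mid):
            lo = mid        # confirmed: safe to use this size
        else:
            hi = mid - 1    # silence or loss: never grow past here
    return lo


if __name__ == "__main__":
    print(discover_pmtu(make_probe([1500, 9000, 1400])))
    # A black hole (no feedback at all) simply leaves the sender at the floor.
    print(discover_pmtu(make_probe([1500, 9000, 1400], feedback_works=False)))
```

Note the design point this is meant to show: when nothing comes back, the sender stays at the small size instead of assuming the large size worked, which is why silence cannot create a black hole in this scheme.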
Joe From dpreed at reed.com Fri Apr 22 06:54:09 2005 From: dpreed at reed.com (David P. Reed) Date: Fri, 22 Apr 2005 09:54:09 -0400 Subject: [e2e] MTU - IP layer In-Reply-To: References: Message-ID: <42690201.8080405@reed.com> As a pragmatic architect, it seems to me that pmtud is focusing on micro-optimizing whatever problem turns out to be their motivating problem (FTP, I suspect), and worse yet, binding in narrow assumptions about the underlying *inter* network architecture (like the idea that there is one path, it is slowly changing, and that packet structure is preserved on the path, rather than being tunable to manage latency/jitter). We'll never be able to exploit concurrency in the transport or link layers if we continue binding highly specific low-level assumptions into high-level protocols (also known as optimizing for the narrow domain of the present). So I offer this as a suggestion... It would seem to me that a small-packet network is free to implement large packets by intra-AS fragmentation and reassembly, for example. The objection to same was that reassembly was hard if packets took different paths. But the PMTUD model implies they *Don't*! Reductio ad absurdum. So a much more practical separation of concerns would be to use a small number of end-to-end maximum packet sizes, and perhaps a notion of a much simpler f/r. To cope with the long-term trend towards supporting larger and larger end-to-end datagrams, why not allow any size datagram, but cut it only on power-of-2 or power-of-4 boundaries (like the old "buddy" memory allocator, which simplified the reassembly of "free blocks")? Let reassembly occur wherever it is possible to do so (worst case at the target). Make the end-to-end error check/error correct more robust (perhaps an adapted erasure code implemented at the endpoint would be effective at reducing round-trip overhead for fragment recovery).
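[Editorial sketch, not part of the original message.] One possible reading of the power-of-2 cutting suggestion above, written as a small Python sketch. Everything here is an assumption made for illustration: the function names `largest_pow2_leq`, `cut` and `merge_buddies`, the representation of a fragment as an `(offset, payload)` pair, and the exact merge rule are not taken from the message, which gives no algorithm. The point is only that if every cut falls on a multiple of a power-of-2 block size, any node can re-cut a fragment for a narrower link, and any node holding two "buddy" fragments can merge them without other state.

```python
from typing import Dict, Iterable, List, Tuple

Fragment = Tuple[int, bytes]  # (absolute byte offset in the datagram, payload)


def largest_pow2_leq(n: int) -> int:
    """Largest power of two not exceeding n (requires n >= 1)."""
    return 1 << (n.bit_length() - 1)


def cut(offset: int, payload: bytes, max_size: int) -> List[Fragment]:
    """Cut a fragment so that every cut falls on a power-of-2 block boundary."""
    block = largest_pow2_leq(max_size)
    pieces: List[Fragment] = []
    pos = 0
    while pos < len(payload):
        # Stop at the next multiple of `block`, measured in absolute offsets.
        room = block - ((offset + pos) % block)
        end = min(len(payload), pos + room)
        pieces.append((offset + pos, payload[pos:end]))
        pos = end
    return pieces


def merge_buddies(frags: Iterable[Fragment]) -> List[Fragment]:
    """Merge equal-size neighbours whose pair is aligned on twice their size."""
    by_off: Dict[int, bytes] = dict(frags)
    merged = True
    while merged:
        merged = False
        for off in sorted(by_off):
            if off not in by_off:      # already absorbed into its buddy
                continue
            size = len(by_off[off])
            buddy = off + size
            is_pow2 = size & (size - 1) == 0
            if (is_pow2 and off % (2 * size) == 0
                    and buddy in by_off and len(by_off[buddy]) == size):
                by_off[off] += by_off.pop(buddy)
                merged = True
    return sorted(by_off.items())


if __name__ == "__main__":
    datagram = bytes(range(256)) * 16                # 4096 bytes
    first_hop = cut(0, datagram, 1500)               # 1024-byte blocks
    narrow_hop = [p for off, data in first_hop for p in cut(off, data, 600)]
    print([len(data) for _, data in narrow_hop])
    print(merge_buddies(reversed(narrow_hop)) == [(0, datagram)])
```

A short tail fragment (when the datagram length is not a power of two) is never merged by this rule, so the target would still concatenate whatever remains in offset order.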
Note that this proposal *does* follow the end-to-end principle, making the network simple and moving the work to the endpoints, while allowing the underlying network to be simply specified. This is only a proposal, as usual. Sent in hopes of inspiring useful research by grad student architects and thoughtful systems designers who need to simplify complex tradeoffs. Perhaps cleaning up f/r is a lot more useful than making the "perfect" pmtud algorithm and then ruling out network innovations that can't support it. In anticipation of the usual fiery reaction to end-to-end proposals from the cross-layer optimizers (routerheads) on this list, I'd ask those of you who are allergic to such solutions to please spout your annoyance at me directly, rather than doing a Cannara-like blast of rage and annoyance at past injustices and current bêtes noires to the whole list. From mathis at psc.edu Fri Apr 22 14:21:17 2005 From: mathis at psc.edu (Matt Mathis) Date: Fri, 22 Apr 2005 17:21:17 -0400 (EDT) Subject: [e2e] Question on MTU In-Reply-To: <42683A36.7030600@isi.edu> References: <1ef2259005042100424feef544@mail.gmail.com> <4267D4AC.8090503@isi.edu> <7bf770bf3d525c13130f6408e21788b7@windriver.com> <4268144A.6080002@isi.edu> <53edeea330e7ab135170e4d17ee59c68@windriver.com> <42683A36.7030600@isi.edu> Message-ID: There is another issue here, which I think is more germane to the original question. I quote from -pmtud-method-: MTU, Maximum Transmission Unit, the size in bytes of the largest IP packet, including the IP header and payload, that can be transmitted on a link or path. Note that this could more properly be called the IP MTU, to be consistent with how other standards organizations use the acronym MTU. link MTU, The Maximum Transmission Unit, i.e., maximum IP packet size in bytes, that can be conveyed in one piece over a link. Beware that this definition differs from the definition used by other standards organizations.
For IETF documents, link MTU is uniformly defined as the IP MTU over the link. This includes the IP header, but excludes link layer headers and other framing which is not part of IP or the IP payload. Be aware that other standards organizations generally define link MTU to include the link layer headers. So to make it concrete: To the IETF, Ethernet has a 1500 Byte MTU, to the IEEE, it has a 1518 Byte MTU. This causes endless confusion and errors when people are configuring router interfaces that have selectable MTUs, and other situations where both communities might have to share documentation. I seriously considered trying to pick a new term to replace "IP MTU", but nothing is as crisp or sufficiently motivating to re-train everyone who never thinks about layers below IP. When you read a piece of documentation you can usually tell which MTU the author meant, however once in a while a new product pops up where the HW engineer failed to realize that IP MTU is not the total frame size and did it wrong...... Peace, --MM-- ------------------------------------------- Matt Mathis http://www.psc.edu/~mathis Work:412.268.3319 Home/Cell:412.654.7529 ------------------------------------------- Evil is defined by people who think they know "The Truth" and use force to apply it to others. From touch at ISI.EDU Fri Apr 22 15:53:45 2005 From: touch at ISI.EDU (Joe Touch) Date: Fri, 22 Apr 2005 15:53:45 -0700 Subject: [e2e] MTU - IP layer In-Reply-To: <42690201.8080405@reed.com> References: <42690201.8080405@reed.com> Message-ID: <42698079.9060608@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 David, David P. 
Reed wrote: > As a pragmatic architect, it seems to me that pmtud is focusing on > micro-optimizing whatever problem turns out to be their motivating > problem (FTP, I suspect), and worse yet, binding in narrow assumptions > about the underlying *inter* network architecture (like the idea that > there is one path, it is slowly changing, and that packet structure is > preserved on the path, rather than being tunable to manage > latency/jitter). We'll never be able to exploit concurrency in the > transport or link layers if we continue binding highly specific low > level assumptions into highlevel protocols (also known as optimizing for > the narrow domain of the present). FWIW, I agree completely. Much as the 'positive feedback of positive evidence' variant of the new version of pmtud is a step in the right direction, I disagree with the way the current proposal is entangled with the transport layer. I would be more comfortable if it were just part of the network layer - where the path necessarily lies - and where current PMTUD is basically implemented. ... > So I offer this as a suggestion... > > It would seem to me that a small-packet network is free to implement > large packets by intra-AS fragmentation and reassembly, for example. > The objection to same was that reassembly was hard if packets took > different paths. But the PMTUD model implies they *Don't*! Reductio > ad absurdum. So a much more practical separation of concerns would be > to use a small number of end-to-end maximum packet sizes, and perhaps a > notion of a much simpler f/r. To cope with the long-term trend towards > supporting larger and larger end-to-end datagrams, why not allow any > size datagram, but cut it only on power-of-2 or power-of-4 boundaries > (like the old "buddy" memory allocator, which simplified the reassembly > of "free blocks"). > > Let reassembly occur whereever it is possible to do so (worst case at > the target). 
Make the end-to-end error check/error correct more robust > (perhaps an adapted erasure code implemented at the endpoint would be > effective at reducing round-trip overhead for fragment recovery). I agree that the basic idea should allow layers not to need to be aware of each other beyond direct interface - IP fragments on link MTU, TCP segments only on IP datagram limits rather than how IP fragments. PMTUD is just an optimization, and it should never be the case that an optimization disables functionality (as with black holes on negative-info based current PMTUD). One of the problems is that the optimization turns out to be significant. The unit of loss in the network is an IP fragment, but the unit of congestion control is a TCP MSS. When the two aren't the same, things don't work as expected. PMTUD (old or new) -does- move the work to the endpoints; new PMTUD even more so, because it doesn't rely on ICMP errors from inside the network but rather E2E feedback. Why isn't that consistent with the E2E principle of making the network simpler while moving the work to the endpoints? Joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCaYB5E5f5cImnZrsRAgGrAJsGYpZNcPFnnqIaYcvcQkkUjkHuXACfRrX2 OPbT8mn5PV+oACi4Hjo0X/A= =unoO -----END PGP SIGNATURE----- From kkrama at research.att.com Fri Apr 22 21:54:17 2005 From: kkrama at research.att.com (K. K. Ramakrishnan) Date: Sat, 23 Apr 2005 00:54:17 -0400 Subject: [e2e] Call for Papers: LANMAN 2005 (Deadline May 16, 2005) Message-ID: <4269D4F9.6040007@research.att.com> (Our apologies if you receive multiple copies of this message) Note: updated deadline of May 16, 2005, 5 pm EDT. ====== Call for Papers 14th IEEE Workshop on Local and Metropolitan Area Networks (LANMAN 2005) September 18-21, 2005, Chania, Island of Crete, Greece http://www.ieee-lanman.org Sponsored by: IEEE Communications Society -- K. K. 
Ramakrishnan Email: kkrama at research.att.com AT&T Labs-Research, Rm. A117 Tel: (973)360-8764 180 Park Ave, Florham Park, NJ 07932 Fax: (973) 360-8050 URL: http://www.research.att.com/info/kkrama -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.postel.org/pipermail/end2end-interest/attachments/20050423/f05bcf58/attachment.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: call-for-papers_revised.txt Url: http://www.postel.org/pipermail/end2end-interest/attachments/20050423/f05bcf58/call-for-papers_revised.txt From mwd24 at thompson.cl.cam.ac.uk Sun Apr 24 05:46:28 2005 From: mwd24 at thompson.cl.cam.ac.uk (Michael Dales) Date: 24 Apr 2005 13:46:28 +0100 Subject: [e2e] Jitter Calculations in IP networks. In-Reply-To: <12a3f40805042112543dd601c2@mail.gmail.com> References: <12a3f40805042112543dd601c2@mail.gmail.com> Message-ID: Aamir Mehmood writes: > Hi all, > We are doing analysis of core ip backbone. Can some one please let me > know how jitter is calculated in ip networks. Is there any software > except ethereal which can calculate the jitter from the captured RTP > stream. You might want to look at the work the IETF working group IPPM (IP Performance Metrics) have been doing. Their charter can be found here: http://www.ietf.org/html.charters/ippm-charter.html Specifically they have an RFC specifying how to measure delay variance: http://www.ietf.org/rfc/rfc3393.txt Hope that's of some use. -- Michael Dales From jshen_cad at yahoo.com.cn Wed Apr 27 00:37:02 2005 From: jshen_cad at yahoo.com.cn (Jing Shen) Date: Wed, 27 Apr 2005 15:37:02 +0800 (CST) Subject: [e2e] VoIP traffic characteristics Message-ID: <20050427073703.2658.qmail@web15408.mail.cnb.yahoo.com> Hi, is there any work on VoIP traffic characteristics in current internet? e.g. protocol distribution, packet size distribution, flow size distribution, communication pattern of Gate keeper etc. 
regards
Jing

From jussara at dcc.ufmg.br Thu Apr 28 13:04:57 2005
From: jussara at dcc.ufmg.br (Jussara Marques de Almeida)
Date: Thu, 28 Apr 2005 17:04:57 -0300 (BRT)
Subject: [e2e] SIGMETRICS 2005 - early registration and hotel deadlines approaching
Message-ID:

Call for Participation

****** ACM SIGMETRICS 2005 ******
International Conference on Measurement and Modeling of Computer Systems
June 6-10, 2005
Banff, Alberta, Canada
http://www.cse.cuhk.edu.hk/~sigm2005

**** Early Registration Deadline: May 5, 2005 ****
**** Hotel Reservation Deadline: May 5, 2005 ****

ACM SIGMETRICS 2005, the International Conference on Measurement and Modeling of Computer Systems, will be held June 6-10, 2005 in Banff, Alberta, Canada. The main conference (June 8-10) features eight paper sessions and a poster session, as well as a keynote talk by Urs Hoelzle of Google, Inc., and a hot topics session on Optimization of Communication Networks.

Preceding the main conference are two workshops (June 6):
- MAthematical Modeling and Analysis (MAMA)
- Large Scale Network Inference (LSNI): Methods, Validation and Applications

and a full day of tutorials (June 7):
- Introduction to Control Theory for Computer Scientists
- Mathematical Optimization Techniques for Computer System Design
- Statistical Techniques for Performance Engineers
- Internet Routing: Measurement, Modeling, and Analysis
- Job Fairness in Queue Scheduling
- Using the Open Network Laboratory

Paper session topics include:
- Peer-to-Peer Networks
- Traffic Measurement and Classification
- Wireless Networks
- Caching and File Systems
- Bandwidth Sharing and Scheduling
- Network and Server Performance Measurement and Evaluation
- Traffic Estimation and Topology Inference.
For program details see the conference web site
http://www.cse.cuhk.edu.hk/~sigm2005

Student travel grants are available to encourage student participation. See the website for application details. A Ph.D. student forum and dinner are planned for June 6.

Banff is a world-class vacation spot in Banff National Park, in the heart of the Canadian Rocky Mountains. Conference attendees are encouraged to take advantage of the potential sightseeing and leisure (or not so "leisure") activities. A group hiking expedition is being organized for June 5th.

Organizing Committee
====================

General Co-Chairs:
Derek Eager, University of Saskatchewan, (eager at cs.usask.ca)
Carey Williamson, University of Calgary, (carey at cpsc.ucalgary.ca)

Program Co-Chairs:
Sem Borst, Bell Labs and CWI, (sem at research.bell-labs.com, sem.borst at cwi.nl)
John C.S. Lui, Chinese University of Hong Kong, (cslui at cse.cuhk.edu.hk)

Tutorials Co-Chairs:
Kimberly Keeton, HP Labs, (kkeeton at hpl.hp.com)
Vishal Misra, Columbia University, (misra at cs.columbia.edu)

Finance Chair:
Martin Arlitt, U. Calgary, (arlitt at cpsc.ucalgary.ca)

Proceedings Chair:
Anirban Mahanti, U. Calgary, (mahanti at cpsc.ucalgary.ca)

Registration Chair and Local Arrangements Chair:
Camille Sinanan, U. Calgary, (camille at cpsc.ucalgary.ca)

Publicity Co-Chairs:
Jussara Almeida, UFMG, Brazil, (jussara at dcc.ufmg.br)
Thomas Bonald, France Telecom, (thomas.bonald at francetelecom.com)

Technical Program Committee:
Vikram Adve, UIUC
Marco Ajmone-Marsan, Politecnico di Torino
Mostafa Ammar, Georgia Tech
Francois Baccelli, INRIA/ENS
Ernst Biersack, Institut Eurecom
Thomas Bonald, France Telecom
Edmundo De Souza e Silva, Fed U Rio de Janeiro
Christophe Diot, Intel
Allen Downey, Olin College
Nick Duffield, AT&T
Ashish Goel, Stanford
Leana Golubchik, USC
Albert Greenberg, AT&T
Matthias Grossglauser, EPFL
Mor Harchol-Balter, CMU
Jennifer Hou, UIUC
R.K. Iyer, UIUC
Shivkumar Kalyanaraman, RPI
Kimberly Keeton, HP Labs
Peter Key, Microsoft
Anurag Kumar, IISC Bangalore
Jim Kurose, UMass at Amherst
T.V. Lakshman, Bell Labs
Simon Lam, U Texas at Austin
Jean-Yves Le Boudec, EPFL
Kai Li, Princeton
Zhen Liu, IBM
Laurent Massoulie, Microsoft
Rob van der Mei, CWI/Vrije U
Arif Merchant, HP Labs
Vishal Misra, Columbia
Sue Moon, KAIST
Dick Muntz, UCLA
Erich Nahum, IBM
Philippe Nain, INRIA
Antonio Nucci, Sprint
Banu Ozden, USC
Keith Ross, Polytechnic U
Matthew Roughan, Adelaide U
Dan Rubenstein, Columbia
Sanjay Shakkottai, U Texas Austin
Evgenia Smirni, College of William & Mary
Daniel Sorin, Duke U
Mark Squillante, IBM
R. Srikant, UIUC
Y.C. Tay, NUS
Don Towsley, UMass at Amherst
Phuoc Tran-Gia, U Wurzburg
Jeffrey Vetter, Oak Ridge National Laboratory
Geoff Voelker, UCSD
Jia Wang, AT&T
Randy Wang, Princeton
Jun Xu, Georgia Tech
David K.Y. Yau, Purdue U
Pen-Chung Yew, U Minnesota
Philip S. Yu, IBM
Zhi-Li Zhang, U Minnesota

From antonio.pinizzotto at iit.cnr.it Sat Apr 30 16:29:03 2005
From: antonio.pinizzotto at iit.cnr.it (Antonio Pinizzotto)
Date: Sun, 01 May 2005 01:29:03 +0200
Subject: [e2e] How to read the TCP congestion window value on Linux?
Message-ID: <427414BF.3040503@iit.cnr.it>

Hi everybody.
Do you know of any way to read the TCP cwnd value (congestion window) on Linux? I have read that on Linux it is not possible to enable a socket option (to read the cwnd using the program trpt).

thanks
Antonio

From kaber at trash.net Sat Apr 30 16:53:44 2005
From: kaber at trash.net (Patrick McHardy)
Date: Sun, 01 May 2005 01:53:44 +0200
Subject: [e2e] How to read the TCP congestion window value on Linux?
In-Reply-To: <427414BF.3040503@iit.cnr.it>
References: <427414BF.3040503@iit.cnr.it>
Message-ID: <42741A88.5000809@trash.net>

Antonio Pinizzotto wrote:
>
> Hi everybody.
> Do you know about any way to read the TCP cwnd value (congestion window)
> on Linux?
I guess one of the Linux networking lists would be a better place to ask this. Anyway, you can get cwnd through the TCP socket monitoring interface using the "ss" tool from iproute2 (http://developer.osdl.org/dev/iproute2) or by getsockopt(TCP_INFO).

Regards
Patrick
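
The getsockopt(TCP_INFO) route can be sketched in Python roughly as follows. This is a minimal, Linux-only sketch, not something posted in the thread. It assumes the usual struct tcp_info layout from <linux/tcp.h>, where tcpi_snd_cwnd is the 19th 32-bit field after an 8-byte header (byte offset 80). The names read_cwnd, demo, TCPI_SND_CWND_OFFSET and TCP_INFO_LEN are made up for illustration, and the loopback connection is only there to have a socket to query.

```python
import socket
import struct

# Assumption: Linux struct tcp_info layout from <linux/tcp.h>:
# 8 bytes of one-byte fields (state, options, window scales, ...),
# followed by 32-bit counters. tcpi_snd_cwnd is the 19th counter,
# so its byte offset is 8 + 18 * 4 = 80.
TCPI_SND_CWND_OFFSET = 80
TCP_INFO_LEN = 92  # covers the fields up to and including tcpi_reordering


def read_cwnd(sock):
    """Return the sender congestion window of a TCP socket, in segments."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, TCP_INFO_LEN)
    if len(raw) < TCPI_SND_CWND_OFFSET + 4:
        raise OSError("kernel returned a shorter tcp_info than expected")
    (cwnd,) = struct.unpack_from("=I", raw, TCPI_SND_CWND_OFFSET)
    return cwnd


def demo():
    """Open a loopback TCP connection and return the client's cwnd."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        return read_cwnd(client)
    finally:
        client.close()
        server.close()
        listener.close()


if __name__ == "__main__":
    print("cwnd (segments):", demo())
```

The value is counted in segments, not bytes, so multiplying by the MSS gives an approximate window in bytes. Reading the field by fixed offset keeps the sketch short; a C program including <netinet/tcp.h> would read info.tcpi_snd_cwnd by name and would not depend on the offset assumption.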