From tristan+e2e at ethereal.net Tue Jan 4 23:35:49 2005 From: tristan+e2e at ethereal.net (Tristan Horn) Date: Tue Jan 4 23:37:43 2005 Subject: [e2e] RST effect on socket buffers? Message-ID: <20050105073548.GS15161@ethereal.net> Hi all, I find myself wrestling with a vendor to get their HTTP proxy's half-closed connection handling fixed. One of the issues is that their hardware sends a spurious RST to the client after a 60 second timeout. The effect is that any data remaining in the client TCP's receive buffer appears to vanish; the application is not able to read it. My casual reading of RFC 793 suggests that this behavior is expected: * a connection in e.g. ESTABLISHED state will transition to CLOSED upon the receipt of a valid RST, * "CLOSED" == "non-existent" The vendor doesn't agree. I also seem to only be able to replicate the problem on Windows, not e.g. Linux... Can anyone shed light on what the expected behavior here is (if any)? (No need to convince me that the untimely RST itself is broken, BTW -- I think/hope we've agreed on that point already.) thanks! -- Tristan Horn Sr. Network Engineer CollabNet, Inc. +1 650 228-2567 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050104/216e1386/attachment.bin From michael.welzl at uibk.ac.at Wed Jan 5 04:09:47 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Wed Jan 5 04:05:22 2005 Subject: [e2e] Question about RFC 2581 Message-ID: <001401c4f31f$7332c7a0$0200a8c0@fun> Dear all, I have a question regarding the congestion avoidance part of RFC 2581 - it is probably a stupid one, so please don't shoot me :) RFC 2581 states: One formula commonly used to update cwnd during congestion avoidance is given in equation 2: cwnd += SMSS*SMSS/cwnd (2) This adjustment is executed on every incoming non-duplicate ACK. Equation (2) provides an acceptable approximation to the underlying principle of increasing cwnd by 1 full-sized segment per RTT. (Note that for a connection in which the receiver acknowledges every data segment, (2) proves slightly more aggressive than 1 segment per RTT, and for a receiver acknowledging every-other packet, (2) is less aggressive.) My question is: Why is (2) slightly more aggressive than 1 segment per RTT if the receiver ACKs every segment? I just don't get it. Cheers, Michael From mallman at icir.org Wed Jan 5 04:51:11 2005 From: mallman at icir.org (Mark Allman) Date: Wed Jan 5 04:51:22 2005 Subject: [e2e] Question about RFC 2581 In-Reply-To: <001401c4f31f$7332c7a0$0200a8c0@fun> Message-ID: <20050105125111.3244F77B0CC@guns.icir.org> > Why is (2) slightly more aggressive than 1 segment per RTT if the > receiver ACKs every segment? > > I just don't get it. Stupid Author Error. It will be fixed in the revision (which is underway). allman -- Mark Allman -- ICIR -- http://www.icir.org/mallman/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050105/3a8eeeaa/attachment.bin From gdc at iki.fi Wed Jan 5 06:34:13 2005 From: gdc at iki.fi (Dado Colussi) Date: Wed Jan 5 06:35:21 2005 Subject: [e2e] Question about RFC 2581 In-Reply-To: <20050105125111.3244F77B0CC@guns.icir.org> References: <20050105125111.3244F77B0CC@guns.icir.org> Message-ID: <41DBFAE5.5040707@iki.fi> Mark Allman wrote: >>Why is (2) slightly more aggressive than 1 segment per RTT if the >>receiver ACKs every segment? >> >>I just don't get it. > > > Stupid Author Error. > > It will be fixed in the revision (which is underway). Hi Mark and Michael, if acknowledgment means a *segment* transmitted by a receiver that indicates successfully received segments, then it is indeed more aggressive because (2) is used for each ACK segment, not for each transmitted segment that has been successfully received. If acknowledgment means *indication* of a successfully received segment, then (2) should be used for each ACK segment N times where N is the number of segments indicated successfully received by that ACK segment. I've always found the term ACK confusing. Sometimes it means the former, and sometimes the latter. Cheers, Dado From mallman at icir.org Wed Jan 5 06:41:45 2005 From: mallman at icir.org (Mark Allman) Date: Wed Jan 5 06:44:20 2005 Subject: [e2e] Question about RFC 2581 In-Reply-To: <41DBFAE5.5040707@iki.fi> Message-ID: <20050105144145.65C0177B0CC@guns.icir.org> > if acknowledgment means a *segment* transmitted by a receiver that > indicates successfully received segments, then it is indeed more > aggressive because (2) is used for each ACK segment, not for each > transmitted segment that has been successfully received. Ah, so, I see that. But, it is only more aggressive if the receiver transmits more ACK packets than segments being ACKed (i.e., cwnd segments), right? 
(E.g., one can envision this happening in bi-directional transfers.) In any case, the document needs to be cleaned up. Sorry about all the confusion. Thanks! allman -- Mark Allman -- ICIR -- http://www.icir.org/mallman/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050105/bf88fbe7/attachment.bin From gdc at iki.fi Wed Jan 5 07:09:51 2005 From: gdc at iki.fi (Dado Colussi) Date: Wed Jan 5 07:11:47 2005 Subject: [e2e] Question about RFC 2581 In-Reply-To: <20050105144145.65C0177B0CC@guns.icir.org> References: <20050105144145.65C0177B0CC@guns.icir.org> Message-ID: <41DC033F.6040406@iki.fi> Mark Allman wrote: >>if acknowledgment means a *segment* transmitted by a receiver that >>indicates successfully received segments, then it is indeed more >>aggressive because (2) is used for each ACK segment, not for each >>transmitted segment that has been successfully received. > > > Ah, so, I see that. But, it is only more aggressive if the receiver > transmits more ACK packets than segments being ACKed (i.e., cwnd > segments), right? (E.g., one can envision this happening in > bi-directional transfers.)

To me it seems that the more ACK segments, the more aggressive the algorithm is. If you updated cwnd for each ACK segment and you got an ACK segment for every other segment you transmit, then you would grow your cwnd quite modestly.

Assume a cwnd_i and an increase of delta_i = SMSS * SMSS / cwnd_i. Then cwnd_{i+1} = cwnd_i + delta_i. For all i it holds that cwnd_i < cwnd_{i+1} (because cwnd is incremented for each ACK) and delta_i > delta_{i+1} > 0 (because a constant is divided by the growing window, all positive). The window size after n ACKs can be expressed as

    cwnd_n = cwnd_0 + sum_{i=0}^{n-1} delta_i.

Thus, the more ACK segments, the greater the n and the greater the sum part of the equation.
If I derived this right, then the difference is quite significant to me. However, this is the way it should not be implemented, I hope. Cheers, Dado From mallman at icir.org Wed Jan 5 07:19:16 2005 From: mallman at icir.org (Mark Allman) Date: Wed Jan 5 07:21:51 2005 Subject: [e2e] Question about RFC 2581 In-Reply-To: <41DC033F.6040406@iki.fi> Message-ID: <20050105151916.4A3D777B0CC@guns.icir.org> > Thus, the more ACK segments, the greater the n and the greater the sum > part of the equation. If I derived this right, then the difference is > quite significant to me. However, this is the way it should not be > implemented, I hope. Right. We are on the same page, I think. My own opinion is that congestion avoidance should be implemented using byte counting and that 1 SMSS should be added to cwnd after cwnd bytes have been ACKed. That is allowed in the current RFC. In the revision, should this scheme be not only allowed but encouraged? I only see advantages (mostly in terms of security) of this, not disadvantages. (The only disadvantage that really comes to mind is that a touch more state must be kept... basically how much data has been ACKed since we last bumped the cwnd.) I'd love to hear opinions on this. allman -- Mark Allman -- ICIR -- http://www.icir.org/mallman/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050105/efe1ab66/attachment.bin From Michael.Welzl at uibk.ac.at Wed Jan 5 07:43:05 2005 From: Michael.Welzl at uibk.ac.at (Michael Welzl) Date: Wed Jan 5 07:45:53 2005 Subject: [e2e] Question about RFC 2581 In-Reply-To: <20050105151916.4A3D777B0CC@guns.icir.org> References: <20050105151916.4A3D777B0CC@guns.icir.org> Message-ID: <1104939785.41dc0b09b6e4a@web-mail2.uibk.ac.at> Mark, I was just about to write a note that this would probably require ABC - it resembles Savage's "ACK Division" attack, but it's unintentional - and that for me, it's a reason to vote for a strong SHOULD (if not even a MUST?) regarding usage of ABC with L = at least 1 SMSS. Personally, I'd like to see: MUST L = 1 SMSS, SHOULD L = 2 SMSS. It's really a fix, so a MUST for the more conservative case is worth thinking about IMO. Dado: thanks for bringing up that interesting issue! Cheers, Michael The very moment I saw > > Thus, the more ACK segments, the greater the n and the greater the sum > > part of the equation. If I derived this right, then the difference is > > quite significant to me. However, this is the way it should not be > > implemented, I hope. > > Right. We are on the same page, I think. > > My own opinion is that congestion avoidance should be implemented using > byte counting and that 1 SMSS should be added to cwnd after cwnd bytes > have been ACKed. That is allowed in the current RFC. > > In the revision, should this scheme be not only allowed but encouraged? > I only see advantages (mostly in terms of security) of this, not > disadvantages. (The only disadvantage that really comes to mind is that > a touch more state must be kept... basically how much data has been > ACKed since we last bumped the cwnd.) > > I'd love to hear opinions on this. 
> > allman > > > -- > Mark Allman -- ICIR -- http://www.icir.org/mallman/ > > > > From perfgeek at mac.com Wed Jan 5 08:28:22 2005 From: perfgeek at mac.com (rick jones) Date: Wed Jan 5 08:29:26 2005 Subject: [e2e] RST effect on socket buffers? In-Reply-To: <20050105073548.GS15161@ethereal.net> References: <20050105073548.GS15161@ethereal.net> Message-ID: On Jan 4, 2005, at 11:35 PM, Tristan Horn wrote: > Hi all, > > I find myself wrestling with a vendor to get their HTTP proxy's > half-closed connection handling fixed. > > One of the issues is that their hardware sends a spurious RST to the > client after a 60 second timeout. The effect is that any data > remaining in the client TCP's receive buffer appears to vanish; the > application is not able to read it. Half-closed at which end? If the proxy has the FIN_WAIT_2, then their stack may have an arbitrary fin_wait timer as an overly defensive measure to prevent having a FIN_WAIT_2 connection remain indefinitely when the remote evaporates before sending a FIN. Such timers are often requested of vendors by their customers. Particularly when those customers have clients which are overly fond of abortive closes at their end - the RST's get lost and are not retransmitted. Or, the proxy software may have issued a shutdown(), and waited the 60 seconds itself for the client to do the same, and then set SO_LINGER and called close() - a measure similarly defensive. > My casual reading of RFC 793 suggests that this behavior is expected: > * a connection in e.g. ESTABLISHED state will transition to CLOSED > upon the receipt of a valid RST, > * "CLOSED" == "non-existent" > which behaviour - sending the RST in the first place, or the effect of the RST on the connection? The former is not "expected" the latter is. > The vendor doesn't agree. I also seem to only be able to replicate > the problem on Windows, not e.g. Linux... Sounds like the arbitrary timer in the stack then. 
Although you may want to syscall trace the proxy if you can. > Can anyone shed light on what the expected behavior here is (if any)? Personally, I believe that once a TCP has been in FIN_WAIT_2 for some length of time it should start sending keepalive probes to make sure the remote is still there and if it hasn't received a response to the keepalive probes within R2 (?) - the configured retransmission limit, _then_ it should abort the connection with a RST just like a normal data retransmit. As presently defined (IIRC) FIN_WAIT_2 is a state where the local TCP will do nothing until it receives a segment from the remote, and there are no guarantees the remote will actually be there. rick jones there is no rest for the wicked, yet the virtuous have no pillows From touch at ISI.EDU Wed Jan 5 10:31:57 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed Jan 5 10:33:47 2005 Subject: [e2e] RST effect on socket buffers? In-Reply-To: <20050105073548.GS15161@ethereal.net> References: <20050105073548.GS15161@ethereal.net> Message-ID: <41DC329D.3020204@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Tristan Horn wrote: | Hi all, | | I find myself wrestling with a vendor to get their HTTP proxy's | half-closed connection handling fixed. | | One of the issues is that their hardware sends a spurious RST to the | client after a 60 second timeout. The effect is that any data remaining | in the client TCP's receive buffer appears to vanish; the application is | not able to read it. There seems like a bunch of bugs here: - poor choice of timeout value i.e., 1/2 MSL is far too short - assuming a timeout is appropriate at all there's nothing in TCP that requires a timeout for a half-closed connection - sending a RST because of that timeout RSTs aren't there to clean-up state, UNLESS that state interferes with new connections (SYN to a CONNECTED session) | My casual reading of RFC 793 suggests that this behavior is expected: | * a connection in e.g. 
ESTABLISHED state will transition to CLOSED upon | the receipt of a valid RST, | * "CLOSED" == "non-existent" Yes. Segments are flushed (pg 71) | The vendor doesn't agree. I also seem to only be able to replicate the | problem on Windows, not e.g. Linux... | | Can anyone shed light on what the expected behavior here is (if any)? If the connection is half-closed, that means the receiver ACK'd the FIN. That means all the data up to the seq number of the FIN has been received successfully by TCP. The question is whether the application has received the data yet. You know that only when the other side issues a CLOSE; anything short of that, and you don't know whether the app has the data or not. Sending a RST to a connection when you're in FIN_WAIT1 is KNOWING that you're trashing whatever data remains in the receive buffers; since you can't know whether the data is in the receive buffers or the app, you're taking your chances. If that's not what you intended, then wait for the FIN-ACK and close like the spec says ;-) The basic lesson here is "You can't force behavior on the other end of the connection." Joe | (No need to convince me that the untimely RST itself is broken, BTW -- I | think/hope we've agreed on that point already.) | | thanks! | -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFB3DKdE5f5cImnZrsRAsaUAKDcfb/+qP2h7+RZQKTbCJGOuE28BACgq++q hTcp7OR76r/M9lmqALh9heA= =Xb5b -----END PGP SIGNATURE----- From thomas.r.henderson at boeing.com Wed Jan 5 11:49:34 2005 From: thomas.r.henderson at boeing.com (Henderson, Thomas R) Date: Wed Jan 5 11:51:21 2005 Subject: [e2e] Question about RFC 2581 Message-ID: <6938661A6EDA8A4EA8D1419BCE46F24C040609B0@xch-nw-27.nw.nos.boeing.com> > > My own opinion is that congestion avoidance should be > implemented using > byte counting and that 1 SMSS should be added to cwnd after cwnd bytes > have been ACKed. That is allowed in the current RFC. 
> > In the revision, should this scheme be not only allowed but > encouraged? > I only see advantages (mostly in terms of security) of this, not > disadvantages. (The only disadvantage that really comes to > mind is that > a touch more state must be kept... basically how much data has been > ACKed since we last bumped the cwnd.) > > I'd love to hear opinions on this. > Not all split acks are malevolent. For example, they are a component of Cisco's RBSCP implementation. In fact, I think that one might be able to construct a satellite gateway that maintained true end-to-end semantics (no byte of data acked before the true receiver acked it), and also adhered to the principle of not returning more than one (split) ack for every segment successfully received at the gateway, and approach the performance of TCP-splitting gateways. Tom (ducking for cover) From touch at ISI.EDU Wed Jan 5 13:18:23 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed Jan 5 13:20:06 2005 Subject: [e2e] RST effect on socket buffers? In-Reply-To: <20050105195851.13323.qmail@web53703.mail.yahoo.com> References: <20050105195851.13323.qmail@web53703.mail.yahoo.com> Message-ID: <41DC599F.2030602@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Tapan Karwa wrote: |>If the connection is half-closed, that means the |>receiver ACK'd the FIN. |>That means all the data up to the seq number of the |>FIN has been |>received successfully by TCP. The question is |>whether the application |>has received the data yet. You know that only when |>the other side issues |>a CLOSE; anything short of that, and you don't know |>whether the app has |>the data or not. |> |>Sending a RST to a connection when you're in |>FIN_WAIT1 is KNOWING that |>you're trashing whatever data remains in the receive |>buffers; since you |>can't know whether the data is in the receive |>buffers or the app, you're |>taking your chances. 
If that's not what you |>intended, then wait for the |>FIN-ACK and close like the spec says ;-) |> |>The basic lesson here is "You can't force behavior |>on the other end of the connection." | | | What about the following simultaneous close case : | | If I am a server in the ESTABLISHED state and I | perform a close() on the socket, I will send a FIN to | the other end and move to FIN_WAIT_1 state. Only after all outstanding data your side is sending is ACKd. Then the FIN is sent, etc... | Lets say the other end (client) sends back a FIN at | the same time (simultaneous close), thereby moving the | server to the CLOSING state. Agreed. | Lets say that after | sending this FIN, the client dies/disappears (for some | reason, valid or invalid). OK. Machine could just reboot. | The server will wait for the ACK which would have | moved him to the TIME_WAIT state but that ACK will | never come. So, the server will be stuck in the | CLOSING state, since there is no timeout associated | with that state. None with that state per se, but there's a retransmission timer. The server should be resending its FIN because it hasn't received the FIN-ACK yet. When one of these FINs reaches the other end (which is rebooted and has no state), a RST will be generated in response. That'll clean up both ends, as it's intended to do. The more interesting case would be if the other end never comes back. In that case, you'd stay retransmitting FINs forever, and stay in the CLOSING state forever as well. | Is that ok ? I think there is a timeout, just not one that bumps you out of the state if the other end isn't there. TCP doesn't clean up state to save memory; it cleans up state when it interferes with a new connection. Joe | thanks, | tapan. | | | | __________________________________ | Do you Yahoo!? | Yahoo! Mail - Easier than ever with enhanced search. Learn more. 
| http://info.mail.yahoo.com/mail_250 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFB3FmfE5f5cImnZrsRAtFlAJ0TTERdqN3xG2vAc4Sn7FUY2XKO8gCgzr4r /UV+40DDzbOEVZRQmWOaNhk= =Sgiz -----END PGP SIGNATURE----- From iyengar at mail.eecis.udel.edu Wed Jan 5 14:23:43 2005 From: iyengar at mail.eecis.udel.edu (Janardhan Iyengar) Date: Wed Jan 5 14:25:53 2005 Subject: [e2e] Question about RFC 2581 In-Reply-To: <6938661A6EDA8A4EA8D1419BCE46F24C040609B0@xch-nw-27.nw.nos.boeing.com> References: <6938661A6EDA8A4EA8D1419BCE46F24C040609B0@xch-nw-27.nw.nos.boeing.com> Message-ID: Hi Tom/all, > Not all split acks are malevolent. For example, they are a component > of Cisco's RBSCP implementation. If I understand you correctly, isn't that still an "exploitation" of an RFC 2581 "quirk"? I think ABC should be strongly encouraged. IMHO, SHOULD ABC with L = 1 seems conservative enough, and still considerably strong to me. That gives room for someone with a good reason to NOT implement ABC, to do so. regards, jana --------------------------------------------------------------- Janardhan R. Iyengar http://www.cis.udel.edu/~iyengar Protocol Engineering Lab -- CIS -- University Of Delaware --------------------------------------------------------------- From touch at ISI.EDU Wed Jan 5 16:49:27 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed Jan 5 16:52:39 2005 Subject: [e2e] RST effect on socket buffers? In-Reply-To: <41DC599F.2030602@isi.edu> References: <20050105195851.13323.qmail@web53703.mail.yahoo.com> <41DC599F.2030602@isi.edu> Message-ID: <41DC8B17.7070502@isi.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Joe Touch wrote: | | | Tapan Karwa wrote: | |>If the connection is half-closed, that means the | |>receiver ACK'd the FIN. | |>That means all the data up to the seq number of the | |>FIN has been | |>received successfully by TCP. 
The question is | |>whether the application | |>has received the data yet. You know that only when | |>the other side issues | |>a CLOSE; anything short of that, and you don't know | |>whether the app has | |>the data or not. | |> | |>Sending a RST to a connection when you're in | |>FIN_WAIT1 is KNOWING that | |>you're trashing whatever data remains in the receive | |>buffers; since you | |>can't know whether the data is in the receive | |>buffers or the app, you're | |>taking your chances. If that's not what you | |>intended, then wait for the | |>FIN-ACK and close like the spec says ;-) | |> | |>The basic lesson here is "You can't force behavior | |>on the other end of the connection." | | | | | | What about the following simultaneous close case : | | | | If I am a server in the ESTABLISHED state and I | | perform a close() on the socket, I will send a FIN to | | the other end and move to FIN_WAIT_1 state. | | Only after all outstanding data your side is sending is ACKd. Then the | FIN is sent, etc... Correction (thanks Ted) - the FIN doesn't wait for the data to be ACKd, but it does wait for you to send all your pending data. Only then does it enter FIN_WAIT_1. Joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFB3IsXE5f5cImnZrsRAgISAKDmOdJym9IN49+VnlvbjipPNn9LDwCfbXN1 Ct4YtCnKVwUK7fk+BZZ4w2g= =+b4J -----END PGP SIGNATURE----- From tristan+e2e at ethereal.net Wed Jan 5 22:50:18 2005 From: tristan+e2e at ethereal.net (Tristan Horn) Date: Wed Jan 5 22:51:41 2005 Subject: [e2e] RST effect on socket buffers? 
In-Reply-To: <41DC329D.3020204@isi.edu> References: <20050105073548.GS15161@ethereal.net> <41DC329D.3020204@isi.edu> Message-ID: <20050106065017.GZ15161@ethereal.net> On Wed, Jan 05, 2005 at 10:31:57AM -0800, Joe Touch wrote: > >There seems like a bunch of bugs here: It gets worse: the proxy always sends a FIN to the server as soon as it receives one -- it seems to have no concept of half-closed connections at all. (This is using the HTTP CONNECT method, where there's no room to make guesses at the content of the proxied connection.) > - poor choice of timeout value > i.e., 1/2 MSL is far too short > > - assuming a timeout is appropriate at all > there's nothing in TCP that requires a timeout for > a half-closed connection I can see the value in supporting a timeout, though (as Rick pointed out). > > - sending a RST because of that timeout > RSTs aren't there to clean-up state, UNLESS that > state interferes with new connections (SYN to a > CONNECTED session) Agreed on all points. :) >Yes. Segments are flushed (pg 71) Thanks! That's exactly what I wanted to know. -- Tristan Horn Sr. Network Engineer CollabNet, Inc. +1 650 228-2567 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050105/23635f78/attachment.bin From randall at stewart.chicago.il.us Thu Jan 6 03:12:22 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 6 03:17:30 2005 Subject: [e2e] Question about RFC 2581 In-Reply-To: <6938661A6EDA8A4EA8D1419BCE46F24C040609B0@xch-nw-27.nw.nos.boeing.com> References: <6938661A6EDA8A4EA8D1419BCE46F24C040609B0@xch-nw-27.nw.nos.boeing.com> Message-ID: <41DD1D16.50607@stewart.chicago.il.us> Henderson, Thomas R wrote: >>My own opinion is that congestion avoidance should be >>implemented using >>byte counting and that 1 SMSS should be added to cwnd after cwnd bytes >>have been ACKed. That is allowed in the current RFC. >> >>In the revision, should this scheme be not only allowed but >>encouraged? >>I only see advantages (mostly in terms of security) of this, not >>disadvantages. (The only disadvantage that really comes to >>mind is that >>a touch more state must be kept... basically how much data has been >>ACKed since we last bumped the cwnd.) >> >>I'd love to hear opinions on this. >> > > > Not all split acks are malevolent. For example, they are a component > of Cisco's RBSCP implementation. > > In fact, I think that one might be able to construct a satellite gateway > that maintained true end-to-end semantics (no byte of data acked before > the true receiver acked it), and also adhered to the principle of not > returning more than one (split) ack for every segment successfully > received at the gateway, and approach the performance of TCP-splitting > gateways. Well.. being someone behind the scenes of RBSCP :-D I will comment on this.. The whole point of RBSCP was to NOT break the E-2-E semantics of a connection and at the same time NOT save any state in the routers aka no per-flow-state... 
If I grok what you are saying above I don't see how you could "speed" the satellite connection up...

RBSCP now does this

TCP-Sndr - Router --Sat-modem ---//// -- Sat-modem - Router - TCP-rcv
------Data---->
           -----Tunnel--->------------------------->
                                                    -----Data--->
                                                    <-----ACK----
           <------------ACK-------------------------
<--ACK
<--ACK
<--ACK
(split times)

Now if I grok your statement you would take out the split acks above, which are used to try to get the sndr to open its window faster. Note that the end-2-end semantics are preserved.. just that the ack is split into multiple pieces... If you take the split acks out you won't get the sender to increase the cwnd any faster..

If SCTP is in place it's a different picture, since appropriate byte counting is built into SCTP. However there is an extension floating around for SCTP that provides a

           <-----SACK---------------------
<---SACK+PKTDROP

The packet drop in this case is used by the router to identify the bandwidth characteristics of the current outbound satellite tunnel. This allows SCTP to make adjustments to the send window. Which by the way makes SCTP scream over a satellite and still fair-share the bw...

Of course all of this only works well if you get the send and receive windows up large enough on each side as well.... For SCTP that's not too much of a problem, since the implementations are newer and most implementors have followed my recommendation to put your rwnd/swnd as LARGE as possible (and no, SCTP is not subject to the slipping in the window attack). For TCP implementations it's more problematic.. since many implementations set the windows around 8 or 16k... sometimes 32 if you're lucky :-D

R > > Tom > (ducking for cover) > > -- Randall Stewart 803-345-0369 815-342-5222(cell) From thomas.r.henderson at boeing.com Thu Jan 6 08:47:45 2005 From: thomas.r.henderson at boeing.com (Henderson, Thomas R) Date: Thu Jan 6 08:50:07 2005 Subject: [e2e] Question about RFC 2581 Message-ID: <6938661A6EDA8A4EA8D1419BCE46F24C040609B5@xch-nw-27.nw.nos.boeing.com> > > > > In fact, I think that one might be able to construct a > satellite gateway > > that maintained true end-to-end semantics (no byte of data > acked before > > the true receiver acked it), and also adhered to the > principle of not > > returning more than one (split) ack for every segment successfully > > received at the gateway, and approach the performance of > TCP-splitting > > gateways. > > The whole point of RBSCP was to NOT break the E-2-E semantics of > a connection and at the same time NOT save any state in the routers > aka no per-flow-state... > > If I grok what you are saying above I don't see how you could > "speed" the satellite connection up... I was just suggesting that a gateway that did hold per-flow state could operate somewhat like a split-connection gateway without having to resort to acking segments prematurely. For example, the gateway facing the server could hold the first received ack (for the first full segment of data) and parcel it out to the server in small chunks. (assuming it was not a Linux server, which already uses ABC) Now, one could certainly split every ack into, say, five acks, but that could induce TCP-unfriendly behavior. With a bit of logic (and conservative behavior) in the gateway, the ack splitting could keep the connection TCP-friendly on the terrestrial side. That is, the data transfer would proceed no more quickly than if the gateway were the actual TCP client. > > Of course all of this only works well if you get the send and receive > windows up large enough on each side as well.... 
Somewhat true, although receive window values could be manipulated (but small send buffer size is more problematic). But my main point was just to note that not all split acks are of the DoS variety as noted by Savage et al-- not to advocate against using ABC. There are probably other vendors too who are obtaining performance boosts with similar ack manipulations. Tom From kmp at email.unc.edu Thu Jan 6 07:31:08 2005 From: kmp at email.unc.edu (Ketan Mayer-Patel) Date: Thu Jan 6 08:59:31 2005 Subject: [e2e] CFP NOSSDAV 2005 Message-ID: CALL FOR PAPERS NOSSDAV 2005 http://www.nossdav.org/2005/ 15th ACM International Workshop on Network and Operating System Support for Digital Audio and Video For fifteen years, NOSSDAV has fostered cutting-edge state-of-the-art research in multimedia and newly emerging areas such as networked games and peer-to-peer streaming. The workshop environment encourages lively discussion among participants and invites strong feedback for work in progress. For 2005, NOSSDAV will take place in Skamania, Washington. Located along the beautiful Columbia River about 30 miles east of Portland, Oregon, Skamania offers a variety of outdoor activities including golf, river rafting, kayaking, hiking, and quaint riverfront towns steeped in Lewis-and-Clark-era history. NOSSDAV invites submissions on all areas of multimedia computing and networking and strongly encourages work in progress in emerging areas. Papers grounded in high-quality experimental research based on prototype and real systems are highly valued. Additionally, papers proposing new directions for research or calling into question existing conventional wisdom are welcomed. 
Topics of interest include, but are not limited to: *** Peer-to-peer streaming *** Networked games *** Wireless and mobile multimedia systems - 3D multimedia and tele-immersion - Streaming 3D graphics and virtual worlds - Sensor networks and architectures - In-network stream processing - Application-level multicast - Multimedia security - Digital rights management - Real-time operating system support for multimedia - Multimedia middleware and frameworks New for this year are three topic-specific sessions on peer-to-peer streaming, networked games, and mobile media. These sessions will include an invited paper from leading researchers and discussion panels. Papers in these three areas are strongly encouraged. A broad view will be taken in deciding what topics are within scope. Please feel free to contact the workshop co-chairs if you are unsure and wish to check if a particular paper or topic is within the workshop scope. As always, student participation is strongly encouraged. To encourage a good mix of seasoned researchers as well as students, we will be offering discounted registration for student members who attend with their faculty advisor. Submissions (as well as the camera-ready final versions of accepted papers) should be no longer than 6 pages. We expect these submissions to be the kernel of what will eventually lead to full-length papers at high-quality conferences or journals. Important Dates February 21, 2005: Paper registration deadline (abstract and title only). February 28, 2005: Paper submission deadline (full papers).
More information is available at the workshop website: http://www.nossdav.org/2005 Workshop Co-Chairs: Wu-Chi Feng (Portland State University, wuchi@cs.pdx.edu) Ketan Mayer-Patel (University of North Carolina, kmp@cs.unc.edu) From Anil.Agarwal at viasat.com Tue Jan 11 11:08:28 2005 From: Anil.Agarwal at viasat.com (Agarwal, Anil) Date: Tue Jan 11 11:10:09 2005 Subject: [e2e] Question about RFC 2581 Message-ID: Michael Welzl wrote - RFC 2581 states: One formula commonly used to update cwnd during congestion avoidance is given in equation 2: cwnd += SMSS*SMSS/cwnd (2) This adjustment is executed on every incoming non-duplicate ACK. Equation (2) provides an acceptable approximation to the underlying principle of increasing cwnd by 1 full-sized segment per RTT. (Note that for a connection in which the receiver acknowledges every data segment, (2) proves slightly more aggressive than 1 segment per RTT, and for a receiver acknowledging every-other packet, (2) is less aggressive.) Actually, for a connection in which the receiver acknowledges every data segment, (2) is slightly **less** aggressive than 1 segment per RTT. Take an example of MSS = 1000 bytes over a link with delay >> packet transmission time. At time 0, cwnd = 1000. Case 1. cwnd is increased by 1 segment every RTT After 1 RTT, cwnd = 1000 + 1000 = 2000 After 2 RTTs, cwnd = 2000 + 1000 = 3000 After 3 RTTs, cwnd = 3000 + 1000 = 4000 Case 2. cwnd is increased using (2), and only MSS sized segments are sent After 1 RTT, cwnd = 1000 + (1000 * 1000) / 1000 = 2000 In the second RTT, 2 segments are sent and acknowledged. After the first ACK, cwnd = (2000 + (1000 * 1000) / 2000) = 2500 After the second ACK, cwnd = (2500 + (1000 * 1000) / 2500) = 2900 In the third RTT, 2 segments are sent and acknowledged. After the first ACK, cwnd = (2900 + (1000 * 1000) / 2900) = 3244 After the second ACK, cwnd = (3244 + (1000 * 1000) / 3244) = 3552 Note that cwnd is smaller than the value in Case 1 and remains so subsequently. 
Case 3. cwnd is increased using (2), and partial segments are sent After 1 RTT, cwnd = 1000 + (1000 * 1000) / 1000 = 2000 In the second RTT, 2 segments are sent and acknowledged. After the first ACK, cwnd = (2000 + (1000 * 1000) / 2000) = 2500 After the second ACK, cwnd = (2500 + (1000 * 1000) / 2500) = 2900 In the third RTT, 3 segments are sent and acknowledged (the last segment is of size 900). After the first ACK, cwnd = (2900 + (1000 * 1000) / 2900) = 3244 After the second ACK, cwnd = (3244 + (1000 * 1000) / 3244) = 3552 After the third ACK, cwnd = (3552 + (1000 * 1000) / 3552) = 3833 Note that cwnd is smaller than the value in Case 1, but larger than in Case 2, and remains so subsequently. As Mark suggests, "congestion avoidance should be implemented using byte counting and that 1 SMSS should be added to cwnd after cwnd bytes have been ACKed. That is allowed in the current RFC". Would the following equation be a reasonable alternative way to do this - cwnd += bytes_acked * SMSS / cwnd (3) or cwnd += bytes_acked / 2 * SMSS / cwnd (4) (4) tries to capture the effect of delayed acknowledgements. The above is executed for every ack packet, and does not require any additional state to be maintained. For comparison, (3) with MSS sized segments, and 2 ACKs per segment, gives cwnd after RTT 3 = 3415. Anil Anil.Agarwal@viasat.com From Anil.Agarwal at viasat.com Tue Jan 11 15:08:10 2005 From: Anil.Agarwal at viasat.com (Agarwal, Anil) Date: Tue Jan 11 15:09:36 2005 Subject: [e2e] Question about RFC 2581 Message-ID: Correction to the previous posting - Equation 4 should be cwnd += min(bytes_acked, SMSS) * SMSS / cwnd (4) This is (almost) equivalent to the byte-counting method described in RFC 2581. Equation (3) remains unchanged - cwnd += bytes_acked * SMSS / cwnd (3) Also, with equations (3) or (4), the rate of growth of cwnd is (slightly) smaller, when there are more (partial) ACKs per segment, which is probably a good thing.
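A minimal Python sketch of the per-ACK updates in equations (2), (3) and the corrected (4), assuming integer byte arithmetic (which is what the truncated figures in the cases with MSS = 1000 suggest). The function names, including the helper `replay`, are hypothetical and chosen only for illustration; they do not come from RFC 2581 or from any TCP stack.

```python
def eq2(cwnd, smss):
    """Equation (2): cwnd += SMSS*SMSS/cwnd, applied once per non-duplicate ACK."""
    return cwnd + (smss * smss) // cwnd


def eq3(cwnd, smss, bytes_acked):
    """Equation (3): scale the increment by the bytes this ACK covers."""
    return cwnd + (bytes_acked * smss) // cwnd


def eq4(cwnd, smss, bytes_acked):
    """Corrected equation (4): as (3), but credit at most one SMSS per ACK."""
    return cwnd + (min(bytes_acked, smss) * smss) // cwnd


def replay(cwnd, smss, acks_per_rtt):
    """Apply equation (2) once for every ACK in every RTT; return the final cwnd."""
    for n_acks in acks_per_rtt:
        for _ in range(n_acks):
            cwnd = eq2(cwnd, smss)
    return cwnd


if __name__ == "__main__":
    smss = 1000
    # One ACK in the first RTT, two in the second, two or three in the third.
    print("Case 2:", replay(1000, smss, [1, 2, 2]))
    print("Case 3:", replay(1000, smss, [1, 2, 3]))
    # A single delayed ACK that covers two full segments:
    print("(3):", eq3(2000, smss, 2 * smss))
    print("(4):", eq4(2000, smss, 2 * smss))
```

Run as a script, this prints 3552 for Case 2 and 3833 for Case 3. For an ACK covering two full segments at cwnd = 2000, (3) gives 3000 while the corrected (4) gives 2500, which shows the effect of the min() cap.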
Regards, Anil Anil.Agarwal@viasat.com From michael.welzl at uibk.ac.at Wed Jan 12 12:32:17 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Wed Jan 12 12:27:43 2005 Subject: [e2e] Question about RFC 2581 References: Message-ID: <003a01c4f8e5$cf4a78a0$0200a8c0@fun> Dear all, I'm surprised that nobody seems to care about this message. Is this issue really negligible? I think not. I can imagine that sending only 2 segments in the third RTT (case 2 below) can lead to quite a deviation from the desired one-segment-per-RTT increase behavior when the rate is small (when fast recovery sets in). Cheers, Michael ----- Original Message ----- From: "Agarwal, Anil" To: Sent: Tuesday, January 11, 2005 8:08 PM Subject: [e2e] Question about RFC 2581 > Michael Welzl wrote - > > RFC 2581 states: > > One formula commonly used to update > cwnd during congestion avoidance is given in equation 2: > > cwnd += SMSS*SMSS/cwnd (2) > > This adjustment is executed on every incoming non-duplicate ACK. > Equation (2) provides an acceptable approximation to the underlying > principle of increasing cwnd by 1 full-sized segment per RTT. (Note > that for a connection in which the receiver acknowledges every data > segment, (2) proves slightly more aggressive than 1 segment per RTT, > and for a receiver acknowledging every-other packet, (2) is less > aggressive.) > > > Actually, for a connection in which the receiver acknowledges every data > segment, > (2) is slightly **less** aggressive than 1 segment per RTT. > > Take an example of MSS = 1000 bytes over a link with delay >> packet > transmission time. > At time 0, cwnd = 1000. > > Case 1. cwnd is increased by 1 segment every RTT > After 1 RTT, cwnd = 1000 + 1000 = 2000 > After 2 RTTs, cwnd = 2000 + 1000 = 3000 > After 3 RTTs, cwnd = 3000 + 1000 = 4000 > > Case 2. 
cwnd is increased using (2), and only MSS sized segments are sent > After 1 RTT, cwnd = 1000 + (1000 * 1000) / 1000 = 2000 > > In the second RTT, 2 segments are sent and acknowledged. > After the first ACK, cwnd = (2000 + (1000 * 1000) / 2000) = 2500 > After the second ACK, cwnd = (2500 + (1000 * 1000) / 2500) = 2900 > > In the third RTT, 2 segments are sent and acknowledged. > After the first ACK, cwnd = (2900 + (1000 * 1000) / 2900) = 3244 > After the second ACK, cwnd = (3244 + (1000 * 1000) / 3244) = 3552 > > Note that cwnd is smaller than the value in Case 1 and > remains so subsequently. > > Case 3. cwnd is increased using (2), and partial segments are sent > After 1 RTT, cwnd = 1000 + (1000 * 1000) / 1000 = 2000 > > In the second RTT, 2 segments are sent and acknowledged. > After the first ACK, cwnd = (2000 + (1000 * 1000) / 2000) = 2500 > After the second ACK, cwnd = (2500 + (1000 * 1000) / 2500) = 2900 > > In the third RTT, 3 segments are sent and acknowledged > (the last segment is of size 900). > After the first ACK, cwnd = (2900 + (1000 * 1000) / 2900) = > 3244 > After the second ACK, cwnd = (3244 + (1000 * 1000) / 3244) = > 3552 > After the third ACK, cwnd = (3552 + (1000 * 1000) / 3552) = > 3833 > > Note that cwnd is smaller than the value in Case 1, > but larger than in Case 2, and remains so on subsequently. > > > As Mark suggests, "congestion avoidance should be implemented using > byte counting and that 1 SMSS should be added to cwnd after cwnd bytes > have been ACKed. That is allowed in the current RFC". > > Would the following equation be a reasonable alternative way to do this - > cwnd += bytes_acked * SMSS / cwnd (3) > or > cwnd += bytes_acked / 2 * SMSS / cwnd (4) > > (4) tries to capture the effect of delayed acknowledgements. > > The above is executed for every ack packet, and does not require > any additional state to be maintained. > > For comparison, (3) with MSS sized segments, and 2 ACKs per segment, > gives cwnd after RTT 3 = 3415. 
> > Anil > > Anil.Agarwal@viasat.com > From dpreed at reed.com Wed Jan 12 17:21:24 2005 From: dpreed at reed.com (David P. Reed) Date: Wed Jan 12 17:23:41 2005 Subject: [e2e] overlay over TCP Message-ID: <41E5CD14.4010206@reed.com> Anyone know of any experiments that have involved overlay networks that run over TCP virtual circuits, but which try to avoid some of the application-layer problems of reliable in-order delivery? I'm interested in optimizing any end-to-end goal-function other than bulk transfer speed. (yeah, I know about a lot of the research and hacking that uses multiple TCP connections to blast a file from here to there). Ideally, I'm interested in approaches that focus on preserving TCP-friendliness (and generally would be seen as cooperative in sharing the network resources rather than greedy or dangerous). Obviously I'm interested because I've begun playing around with such ideas. They might be practically useful in a world where UDP is viewed as a "security hole," but TCP is not (I don't agree, but why fight stupid people if you don't have to). Don't want to reinvent the wheel. From jtk at northwestern.edu Wed Jan 12 19:47:20 2005 From: jtk at northwestern.edu (John Kristoff) Date: Wed Jan 12 19:48:12 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41E5CD14.4010206@reed.com> References: <41E5CD14.4010206@reed.com> Message-ID: <20050112214720.766c0292@dsl017-022-068.chi1.dsl.speakeasy.net> On Wed, 12 Jan 2005 20:21:24 -0500 "David P. Reed" wrote: > Anyone know of any experiments that have involved overlay networks that > run over TCP virtual circuits, but which try to avoid some of the > application-layer problems of reliable in-order delivery? No, but I think I know precisely what you're talking about. I've often thought about the very same thing myself recently if so. It would have probably been appropriate as an April 1 RFC a few years ago, but now it doesn't seem so silly. It would be a natural evolution. 
The Internet may route around damage, but users route around the suppression from a centralized control model. First it was freedom of processing, now it is freedom to network. It may come as either this type of overlay or as something more fundamental through new channels (wireless) that have no central control. It would be very interesting to build those Internets on top of the Internet. Dealing with one-way circuit setups, instabilities of that first Internet layer and further attempts to restrict communications at that first layer could pose significant hurdles however. The closest thing that I can think of that has been deployed are things like the onion router projects and more recently Tor: John From michael.welzl at uibk.ac.at Thu Jan 13 02:22:20 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Thu Jan 13 02:28:10 2005 Subject: [e2e] cwnd update correction for congestion avoidance Message-ID: <1105611740.4764.66.camel@lap10-c703.uibk.ac.at> Dear all, I'm now taking this to tcpm, too... On Tuesday, Anil Agarwal sent a message to the end2end list which made it clear that the equation: cwnd += SMSS*SMSS/cwnd from RFC 2581 does not really add a segment every RTT as desired. (He also went into some ABC related details, but I'll not go into them for now to keep things simple.) The underlying problem is that cwnd changes with every ACK that comes in, leading to a slightly decreasing increase factor with each ACK. While tons of books and papers erroneously state that this equation adds *EXACTLY* one segment per RTT, RFC 2581 correctly says that it is an acceptable approximation. Mathematica couldn't solve the equation x(t+1) += 1/x(t) (counting in segments now), but MS Excel told me that the effect is quite negligible for, say, a window of 1000 segments which have 1000 bytes each. However, the problem may become aggravated when cwnd is small compared to the bandwidth-delay product.
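A rough Python sketch of that recurrence, counting in segments and assuming one ACK per segment with exactly w ACKs arriving during an RTT that starts with a window of w segments. The helper name `growth_in_one_rtt` is hypothetical, and the model ignores ACK spacing and delayed ACKs.

```python
def growth_in_one_rtt(window_segments):
    """Window growth (in segments) after window_segments ACKs, each adding 1/x."""
    x = float(window_segments)
    for _ in range(window_segments):
        x += 1.0 / x  # equation (2), expressed in units of segments
    return x - window_segments


if __name__ == "__main__":
    for w in (2, 10, 100, 1000):
        g = growth_in_one_rtt(w)
        print(f"w={w:5d}  growth={g:.4f}  shortfall={1.0 - g:.4f}")
```

Under these assumptions a window of 2 segments grows by only 0.9 segments in the RTT (2 -> 2.5 -> 2.9), whereas a window of 1000 segments grows by a full segment to within about 0.001, so the shortfall only matters at small windows.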
Anil gave this simple example, assuming MSS=1000 and cwnd=1000 at the beginning, and counting in bytes: > After 1 RTT, cwnd = 1000 + (1000 * 1000) / 1000 = 2000 > > In the second RTT, 2 segments are sent and acknowledged. > After the first ACK, cwnd = (2000 + (1000 * 1000) / 2000) = 2500 > After the second ACK, cwnd = (2500 + (1000 * 1000) / 2500) = 2900 > > In the third RTT, 2 segments are sent and acknowledged. > After the first ACK, cwnd = (2900 + (1000 * 1000) / 2900) = 3244 > After the second ACK, cwnd = (3244 + (1000 * 1000) / 3244) = 3552 Only TWO segments are sent in the third RTT - I can imagine that right after a congestion event, this problem might play a role, depending on the spacing of ACKs. I really don't understand why this isn't fixed - it seems to be easy?! * When entering congestion avoidance, set old_cwnd = cwnd * Whenever a non-duplicate ACK arrives during congestion avoidance, do: cwnd += SMSS*SMSS/old_cwnd; if ( (cwnd - old_cwnd) >= SMSS ) old_cwnd = cwnd; This will have old_cwnd advance in SMSS-steps. Perhaps I'm missing something here, but if this code above is wrong, there surely is some other easy way to fix the problem. Cheers, Michael From jshapiro at cse.msu.edu Thu Jan 13 06:43:13 2005 From: jshapiro at cse.msu.edu (Jonathan Shapiro) Date: Thu Jan 13 06:46:28 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41E5CD14.4010206@reed.com> References: <41E5CD14.4010206@reed.com> Message-ID: <41E68901.4020404@cse.msu.edu> In terms of preserving TCP friendliness, would TCP flow control provide a natural form back pressure for TCP sessions coupled together in sequence and thus constrain the rate of a session over the VC to the minimum available bandwidth on any individual overlay link? There's been some work on this back pressure mechanism for application-level multicast. 
The following links might be helpful http://www.cs.bu.edu/techreports/pdf/2003-015-roma.pdf http://www.eurecom.fr/~btroup/BPublished/ngc2002_overlay.pdf http://www-lor.int-evry.fr/~templemo/Academic/REALM/realm.html http://www.eecg.toronto.edu/~bli/papers/jsac04.pdf http://www.cs.cornell.edu/home/rvr/papers/SelectCast.ps http://www.arl.wustl.edu/Publications/2000-04/wucs0017.pdf When you say "avoid some of the application-layer problems of reliable in-order delivery" are you thinking about multiplexing many logical sessions over a single TCP VC? In that case, would a protocol like SCTP be a reasonable alternative to TCP? /jonathan David P. Reed wrote: > Anyone know of any experiments that have involved overlay networks > that run over TCP virtual circuits, but which try to avoid some of the > application-layer problems of reliable in-order delivery? > > I'm interested in optimizing any end-to-end goal-function other than > bulk transfer speed. (yeah, I know about a lot of the research and > hacking that uses multiple TCP connections to blast a file from here > to there). > > Ideally, I'm interested in approaches that focus on preserving > TCP-friendliness (and generally would be seen as cooperative in > sharing the network resources rather than greedy or dangerous). > > Obviously I'm interested because I've begun playing around with such > ideas. They might be practically useful in a world where UDP is > viewed as a "security hole," but TCP is not (I don't agree, but why > fight stupid people if you don't have to). Don't want to reinvent the > wheel. From dpreed at reed.com Thu Jan 13 07:16:00 2005 From: dpreed at reed.com (David P. 
Reed) Date: Thu Jan 13 07:17:44 2005 Subject: [e2e] overlay over TCP In-Reply-To: <20050112214720.766c0292@dsl017-022-068.chi1.dsl.speakeasy.net> References: <41E5CD14.4010206@reed.com> <20050112214720.766c0292@dsl017-022-068.chi1.dsl.speakeasy.net> Message-ID: <41E690B0.6060900@reed.com> John Kristoff wrote: >It may come as either this type of overlay or >as something more fundamental through new channels (wireless) that have >no central control. > > You could say I'm doing my bit to work on both... (central-control-free wireless and applications that motivate edge-based network overlays). I tend to think of them as belt and suspenders, rather than independent alternatives. >It would be very interesting to build those Internets on top of the >Internet. Dealing with one-way circuit setups, instabilities of that >first Internet layer and further attempts to restrict communications >at that first layer could pose significant hurdles however. The >closest thing that I can think of that has been deployed are things >like the onion router projects and more recently Tor: > > > > My reverse-engineering of Skype suggests that whether or not the now-centrally-controlled IETF gets its way, there continue to be ways to connect end-user benefit to decentralized solutions. (wait till someone starts selling middlebox "skype-blockers" and see what they use as "justification" for why people should "be afraid, be very afraid" of Skype). Nothing's perfect, but lowly Skype's done a pretty darn good pragmatic job of what we used to call "internetworking" with a lowercase "i" against the wishes of those who would recreate walled gardens. (a lot more quickly and efficiently than "STUN" or "Teredo" has been deployed). If I were teaching networking protocols today, I'd be teaching bittorrent and skype - how they work and why they work. They approximate the role of the Internet experiment in the late '70's in today's environment. 
(of course the smartest students have already figured that out, but perhaps a 50+-year-old's perspective might be useful to them anyway). From michael.welzl at uibk.ac.at Thu Jan 13 08:10:54 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Thu Jan 13 08:13:40 2005 Subject: [e2e] Re: [tcpm] cwnd update correction for congestion avoidance In-Reply-To: <41E6995C.7080001@nokia.com> References: <1105611740.4764.66.camel@lap10-c703.uibk.ac.at> <41E6995C.7080001@nokia.com> Message-ID: <1105632653.4764.210.camel@lap10-c703.uibk.ac.at> Hi, > I am probably not answering your question, but you are assuming that the > TCP sender will be in congestion avoidance at a very small window size > (the equation that you describe only applies in congestion avoidance.) > When cwnd is small compared to BDP, TCP should be in slow start rather > than in CA. With bigger cwnd, the calculation error tends to be small. I should've been more precise: I meant small but large enough to be in congestion avoidance, e.g. right after fast recovery setting in. Then, the sender could update its window to ALMOST 1 SMSS but just not enough for sending X packets. It sends X-1 packets (causing X-1 ACKs), and it could take up to half an RTT until that situation changes. I haven't analyzed how severe that really is in such a situation, but I imagine that it's nonnegligible. > Another note: cwnd(n+1)+= S^2/cwnd(n), does not capture the fact that > RTT itself is depends upon cwnd and hence on time. In my opinion, the > equation for mathematical analysis should be > > cwnd( t+RTT(t)) = cwnd(t) + S^2/cwnd(t) > > (and even that equation is simplistic at best, because it assumes that > RTT is deterministic and not some stochastic variable!) The ack clocking > mechanism of TCP takes care of random fluctuations of RTT and the > equation in RFC2581 is robust, but while doing mathematic analysis we > tend to overlook this fact. 
I agree - but I wasn't going for thorough mathematical TCP analysis, just checking where this equation is going. BTW, since, as you say, it is common to assume equally spaced RTTs (or "rounds"), most TCP models don't capture this effect. Cheers, Michael From touch at ISI.EDU Thu Jan 13 09:53:00 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 13 09:55:18 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41E5CD14.4010206@reed.com> References: <41E5CD14.4010206@reed.com> Message-ID: <41E6B57C.6060705@isi.edu> David P. Reed wrote: > Anyone know of any experiments that have involved overlay networks that > run over TCP virtual circuits, but which try to avoid some of the > application-layer problems of reliable in-order delivery? Yes. They suffer from TCP's inability to do a true 'push' - i.e., since there is no correlation between writes and TCP segments, it's frequently the case that such use has significant 'stalls', either waiting for packet aggregation (if NAGLE is on, which it should not be, though), or for ACK compression. They've been used for years for tunnels at the application layer. The phrase "TCP tunnel" gets half a million hits in Google, and some of those on the first page are even relevant. It'd be better to use DCCP for such tunnels, though, if you want in-order, TCP-friendly, but also want some sort of segment-boundary semantics preserved at the API. Joe From braden at ISI.EDU Thu Jan 13 11:54:03 2005 From: braden at ISI.EDU (Bob Braden) Date: Thu Jan 13 11:57:06 2005 Subject: [e2e] overlay over TCP Message-ID: <200501131954.LAA11150@gra.isi.edu> *> *> Yes. They suffer from TCP's inability to do a true 'push' Hey, it's not TCP's fault -- TCP has a true Push.
Generations of implementors have chosen to invoke it implicitly rather than explicitly from an application. This choice makes a lot of sense, but it does provide less functionality than TCP allows. Bob Braden From dpreed at reed.com Thu Jan 13 13:31:41 2005 From: dpreed at reed.com (David P. Reed) Date: Thu Jan 13 13:33:44 2005 Subject: [e2e] The One and Only True Push In-Reply-To: <200501131954.LAA11150@gra.isi.edu> References: <200501131954.LAA11150@gra.isi.edu> Message-ID: <41E6E8BD.6000300@reed.com> Regarding "true push" - that's not what I was concerned about, but Bob's right. TCP isn't the Berkeley sockets API. If "push" is missing from the API, that's not the protocol's fault... it's been there from the start. In fact, when we split functionality specifically there in NCP only for Telnet out (here's another story from my own days designing TCP) we implemented the "urgent pointer" (which was pretty much from my DSP protocol, so blame me) and that was supposed to be an end-to-end Push, meaning to the TCP on the other end precisely that "your app client needs to process all the buffered data, including sender, network and receiver buffers, in order at high speed up to the sequence number in the urgent pointer, because something important to you is contained therein". The urgent pointer was a monotone pointer supposed to be the max of all urgent pointer values ever set by the sender, and retransmitted packets were supposed to get the latest (maximum). That's TRUE PUSH. Sadly, the Unix implementors, who were typically arrogant OS guys and weren't at the meetings where we hashed this out, misunderstood the point, and felt that the urgent pointer should not show through in the sockets API they supported, so people didn't use it as the general end-to-end push it was meant to be. Instead, people were looking for "process interrupt" functionality (which was part of Telnet, not part of TCP) and persisted in misunderstanding the point of the layer separation. 
They perseverated on "acking" each urgent pointer value, even though the point was that the INTERRUPT was encoded in the data stream, and thus encumbered the PUSH, at least on Berkeley UNIX systems, with app layer semantics that it should never ever have had. From dpreed at reed.com Thu Jan 13 13:38:53 2005 From: dpreed at reed.com (David P. Reed) Date: Thu Jan 13 13:40:12 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41E6B57C.6060705@isi.edu> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> Message-ID: <41E6EA6D.1080705@reed.com> The reason not to depend on SCTP is the same reason that UDP isn't adequate. The social (non-technical) processes of the "internet community" have labeled anything non-TCP as POISON, KEEP OUT. We have middleboxes and routers that chuck stuff like that on the floor. Interop is about allowing everything not explicitly prohibited, but don't tell that to the "corporate IT" folks, who want to give you freedom from pesky things like ways to do your job better... some jerk with a beard and a technical education knows how and when you should communicate, and he will let you know what communications technology you will be allowed to use. Cisco's firewall division tells him what's OK, and God (or TPC from the President's Analyst) tells Cisco. From touch at ISI.EDU Thu Jan 13 13:45:50 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 13 13:47:53 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41E6EA6D.1080705@reed.com> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> Message-ID: <41E6EC0E.8030303@isi.edu> David P. Reed wrote: > The reason not to depend on SCTP is the same reason that UDP isn't > adequate. I said DCCP, not SCTP, FWIW, and for a number of reasons. > The social (non-technical) processes of the "internet > community" have labeled anything non-TCP as POISON, KEEP OUT. > We have middleboxes and routers that chuck stuff like that on the floor. 
Sure - at that point, you're stuck going over TCP, but then you're also stuck with a few other things: - ACK aggregation delays - messages split across packets (lack of fate sharing, so higher loss rates at the message level) - NATing that will kill interior apps anyway any app the NAT doesn't _already_ know about > Interop is about allowing everything not explicitly prohibited, but > don't tell that to the "corporate IT" folks, who want to give you > freedom from pesky things like ways to do your job better... some jerk > with a beard and a technical education knows how and when you should > communicate, and he will let you know what communications technology you > will be allowed to use. Cisco's firewall division tells him what's OK, > and God (or TPC from the President's Analyst) tells Cisco. In that case, you're safer doing IP over HTTP (yes, there is such an animal). Joe From dpreed at reed.com Thu Jan 13 14:12:33 2005 From: dpreed at reed.com (David P. Reed) Date: Thu Jan 13 14:13:42 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41E6EC0E.8030303@isi.edu> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41E6EC0E.8030303@isi.edu> Message-ID: <41E6F251.4010501@reed.com> Joe Touch wrote: > > > In that case, you're safer doing IP over HTTP (yes, there is such an > animal). > Yup, that's a possibility. But the problems I am interested in, which started this thread, are related to managing the underlying TCP (whether it's via HTTP or raw TCP) end-to-end control algorithms so that one can do things like manage congestion, latency, jitter, etc. as well as possible.
Of course since HTTP is often tinkered with along the way, using HTTP adds more dimensions to the problem space. From puddinghead_wilson007 at yahoo.co.uk Thu Jan 13 19:27:54 2005 From: puddinghead_wilson007 at yahoo.co.uk (Puddinhead Wilson) Date: Thu Jan 13 19:29:44 2005 Subject: [e2e] a problem... Message-ID: <20050114032754.87613.qmail@web25704.mail.ukl.yahoo.com> Hi there! I have N1----->N2. The link from N1 to N2 is a physical wire/unidirectional waveguide. No reverse connection is possible. 1. If the medium is copper, how does N1 detect that link N1-N2 is down so that it should not transmit? (I think I can solve this one.) 2. If the medium is fiber/wireless, how will I solve this? (I don't know how to do this one :( ) Any takers? From puddinghead_wilson007 at yahoo.co.uk Thu Jan 13 19:36:28 2005 From: puddinghead_wilson007 at yahoo.co.uk (Puddinhead Wilson) Date: Thu Jan 13 19:37:45 2005 Subject: [e2e] Re: a problem... Message-ID: <20050114033628.28443.qmail@web25706.mail.ukl.yahoo.com> Perhaps I should modify this, as someone asked for clarification. The channel between N1 and N2 has bandpass X, centred at f (of course copper behaves like copper). I do not wish to send a reverse "ack" for link detection etc. What are the simplest ways to do this? For example, in the case of copper one could simply apply a DC voltage from the receiver side so that the transmitter can look at the presence/absence of DC line voltage to know the link status. What do I do for other cases like fiber or wireless? --- Puddinhead Wilson wrote: > Hi there! > > I have > N1----->N2 > > link from N1 to N2 is physical wire/unidirectional > wave guide. > > > No reverse connection possible > > 1. if the media is copper, how does N1 detect that > link N1-N2 is down so it should not transmit? (i > think > i can solve this one) > 2. if the media is fiber/wireless how will i solve > this?
(i dont know how to this one :( ) > > any takers? From dpreed at reed.com Fri Jan 14 06:42:32 2005 From: dpreed at reed.com (David P. Reed) Date: Fri Jan 14 06:45:45 2005 Subject: [e2e] a problem... In-Reply-To: <20050114032754.87613.qmail@web25704.mail.ukl.yahoo.com> References: <20050114032754.87613.qmail@web25704.mail.ukl.yahoo.com> Message-ID: <41E7DA58.1020105@reed.com> If a tree falls in the forest and no one's there to hear it, does it make a sound? Putting a DC bias on the receiver end of a pair ultimately wastes energy and reduces receiver S/N, cutting bandwidth achievable, and of course it is strictly speaking a reverse channel - so it makes me wonder what constraint you are trying to achieve here, and at what cost. For example - what's wrong with having the connector (fiber) or receiver (wireless) have a yellow label that reminds the guy who connects it to send an SMS to the guy responsible for the sending end every time the link goes up or down? Knowing what would be wrong with that solution might help suggest a good answer. From matta at cs.bu.edu Thu Jan 13 08:17:45 2005 From: matta at cs.bu.edu (Ibrahim Matta) Date: Fri Jan 14 08:06:49 2005 Subject: [e2e] ICNP 2005 CFP Message-ID: <0511C607B17F804EBE96FFECD1FD98591E7D6D@cs-exs2.cs-nt.bu.edu> PRELIMINARY CALL FOR PAPERS ICNP 2005 13th IEEE International Conference on Network Protocols Boston, Massachusetts November 6-9, 2005 http://csr.bu.edu/icnp2005 E-mail: icnp2005-org@cs.bu.edu ICNP is a highly selective single-track conference covering all aspects of network protocols including design, analysis, specification, verification, implementation, and performance.
On its thirteenth anniversary, ICNP 2005 will return to Boston, the Intellectual Hub of The Universe, where it will be held in the historic Backbay area. Papers describing significant research contributions to the field of network protocols are solicited for submission. Papers must be neither previously published nor under review by another conference or journal. Selected top papers from ICNP 2005 will be forwarded to IEEE/ACM Transactions on Networking for possible publication. In addition, a "Best Paper Award" will be given to the outstanding paper presented at the conference. Topics of interest include, but are not limited to: Protocol testing and analysis Protocol design and implementation Network measurement and monitoring Security and resiliency Peer-to-peer/Overlay protocols Routing protocols Wireless and mobile networks Ad hoc and sensor networks QoS and signaling Flow and congestion control Multimedia Distributed gaming ICNP 2005 will feature tutorials and workshops for which it is soliciting proposals. Also, plans are underway for a student poster session and a student travel grant program. Details will be posted on the conference web site as they become available. IMPORTANT DATES: ================ Paper submission: May 6, 2005 Workshop/Tutorial proposals: July 1, 2005 Notification of acceptance: July 15, 2005 Camera ready version: August 5, 2005 STEERING COMMITTEE: =================== Mostafa Ammar, Georgia Tech, USA * Ken Calvert, U. of Kentucky, USA * Mohamed Gouda, U. of Texas, USA Teruo Higashino, Osaka U., Japan * Simon Lam, U. of Texas, USA David Lee, Ohio State U., USA * Mike T. Liu, Ohio State U., USA Raymond Miller, U. 
of Maryland, USA * Krishan Sabnani, Bell Labs, USA * Executive Committee Member ORGANIZING COMMITTEE: ===================== GENERAL CHAIRS: Azer Bestavros, Boston University, USA Jim Kurose, University of Massachusetts at Amherst, USA PROGRAM CHAIRS: Mohamed Gouda, University of Texas at Austin, USA Ibrahim Matta, Boston University, USA PANEL & TUTORIAL CHAIRS: Debanjan Saha, IBM research, USA Nina Taft, Intel Research, USA PUBLICITY CHAIR: Milind Buddhikot, Bell Labs, USA STUDENT POSTER CHAIR: Michalis Faloutsos, U. of California at Riverside, USA TECHNICAL PROGRAM COMMITTEE: Sudhir Aggarwal, Florida State U. Kevin Almeroth, UC Santa Barbara Paul Amer, U. of Delaware Mostafa Ammar, Georgia Tech Anish Arora, Ohio State U. Ehab Al-Shaer, DePaul U. Chadi Barakat, INRIA, France Bobby Bhattacharjee, U. Maryland Supratik Bhattacharyya, Sprint Labs Milind Buddhikot, Bell Labs John Byers, Boston U. Andrew Campbell, Columbia U. Ana Cavalli, INT, France Jorge Cobb, U. Texas at Dallas Mootaz Elnozahy, IBM Austin Research Lab Magda El Zarki, UC Irvine Sonia Fahmy, Purdue university Michalis Faloutsos, UC Riverside Tim Griffin, Intel Research Cambridge Liang Guo, Motorola Labs Khaled Harfoush, NCSU Teruo Higashino, Osaka University Jennifer Hou, UIUC Shudong Jin, Case Western Reserve U. Sandeep Kulkarni, Michigan State U. Ahmed Helmy, USC/ISI Chin-Tser Huang, U. South Carolina Kevin Jeffay, U. North Carolina at Chapel Hill TV Lakshman, Lucent, Bell Labs Simon Lam, U. Texas at Austin David Lee, Ohio State U. Wang-Chien Lee, Penn State University Nick Maxemchuk, Columbia U. Klara Nahrstedt, UIUC Prashant Pradhan, IBM Kihong Park, Purdue U. Sambit Sahu, IBM Medy Yahya Sanadidi, UCLA Udaya Shankar, U. Maryland Michael Smirnov, FOKUS, Germany Ioannis Stavrakakis, U. Athens, Greece Peter Steenkiste, CMU Terry Todd, McMaster University, Canada Don Towsley, Umass Amherst Joe Touch, USC/ISI Hasan Ural, U. of Ottawa, CA Geoffrey Xie, Naval Post Graduate School Richard Yang, Yale U. 
David Yau, Purdue U. Zhi-Li Zhang, U. Minnesota Lixia Zhang, UCLA Ty Znati, NSF and U. Pittsburg _______________________________________________________________ === Ibrahim Matta, Associate Professor Computer Science Department Boston University, Boston, MA 02215, USA Tel: (617) 358-1062, Fax: (617) 353-6457 matta@cs.bu.edu http://www.cs.bu.edu/~matta From touch at ISI.EDU Fri Jan 14 11:39:15 2005 From: touch at ISI.EDU (Joe Touch) Date: Fri Jan 14 11:40:59 2005 Subject: [e2e] a problem... In-Reply-To: <41E7DA58.1020105@reed.com> References: <20050114032754.87613.qmail@web25704.mail.ukl.yahoo.com> <41E7DA58.1020105@reed.com> Message-ID: <41E81FE3.5030501@isi.edu> Also, knowing whether multihop reverse paths exist. I.e., can N2 reach N1 via some other path? If so, that's not so hard either... (use a daemon on N1 that listens to one on N2 and tells N1 it's getting through). Joe David P. Reed wrote: > If a tree falls in the forest and no one's there to hear it, does it > make a sound? > > Putting a DC bias on the receiver end of a pair ultimately wastes energy > and reduces receiver S/N, cutting bandwidth achievable, and of course it > is strictly speaking a reverse channel - so it makes me wonder what > constraint you are trying to achieve here,and at what cost. > > For example - what's wrong with having the connector (fiber) or receiver > (wireless) have a yellow label that reminds the guy who connects it to > send an SMS to the guy responsible for the sending end every time the > link goes up or down? Knowing what would be wrong with that solution > might help suggest a good answer. > > > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050114/f5a18181/signature.bin From long.spike at virgin.net Fri Jan 14 12:11:04 2005 From: long.spike at virgin.net (Darren Long) Date: Fri Jan 14 12:11:45 2005 Subject: [e2e] Re: a problem... In-Reply-To: <20050114033628.28443.qmail@web25706.mail.ukl.yahoo.com> References: <20050114033628.28443.qmail@web25706.mail.ukl.yahoo.com> Message-ID: <41E82758.9030605@virgin.net> If there is no information flow from N2 to N1, how can you expect N1 to detect N2? Darren Puddinhead Wilson wrote: >perhaps I should modify as someone asked for >clarification, > >Channel between N1 and N2 has bandpass X, centred at f >(ofcourse copper behaves like copper) > >do not wish to send a reverse "ack" for link detection >etc. > >What are the simplest ways to do this, >for example the case of copper would be to simply >apply a DC voltage from the receiver side so that >txmtr can look at presence/absence of DC line voltage >to know link status. > >What do i do for other cases like fiber or wireless? > > --- Puddinhead Wilson > wrote: > > >>Hi there! >> >>I have >>N1----->N2 >> >>link from N1 to N2 is physical wire/unidirectional >>wave guide. >> >> >>No reverse connection possible >> >>1. if the media is copper, how does N1 detect that >>link N1-N2 is down so it should not transmit? (i >>think >>i can solve this one) >>2. if the media is fiber/wireless how will i solve >>this? (i dont know how to this one :( ) >> >>any takers? >> >> >> >> >> >> >> >> >> >___________________________________________________________ > > >>ALL-NEW Yahoo! Messenger - all new features - even >>more fun! http://uk.messenger.yahoo.com >> >> >> > > > > > >___________________________________________________________ >ALL-NEW Yahoo! Messenger - all new features - even more fun! 
http://uk.messenger.yahoo.com > > > From zartash at lums.edu.pk Fri Jan 14 15:23:36 2005 From: zartash at lums.edu.pk (Zartash Afzal Uzmi) Date: Fri Jan 14 15:22:35 2005 Subject: [e2e] Re: a problem... In-Reply-To: <20050114033628.28443.qmail@web25706.mail.ukl.yahoo.com> Message-ID: <000701c4fa90$12c43660$0802a8c0@lumsnet.edu.pk> If the simple way on copper is to detect a DC "transmitted" by N2, then an equivalent simple way on wireless is to detect a sine wave at f GHz transmitted by N2. Constant magnitude constant frequency Sine wave is as much information-less as the DC itself! I think you should specify the "layer" when you say that "do not wish to send a reverse ack". For example, consider a voltage source connected through a long transmission line to a resistor. In this case, resistor is my N2 and its open-circuit/in-circuit condition will tell me whether N2 is alive or not. Clearly, N2 is not going to be able to send any ACK but the input impedance seen by N1 (my voltage source) will be able to indicate whether N2 is alive or not. In above example, are we sending a signal from N2 to N1? Well, the conductor pair is sort of acting as a reverse channel at the physical layer but clearly no ack is sent at network layer. You can think of replacing the resistor by the input impedance of a live node... Now here might be something interesting. As with a voltage source, copper conductors, and resistors, I can detect the state of N2 (open or close). i.e., no network layer ack but phy layer ack is implicitly there. Can we do a similar thing over wireless and/or fiber? I am not sure and haven't looked at passive RFIDs but aren't they supposed to be detected in a similar way as the voltage source, conductor and resistor? Someone familiar with passive RFIDs should speak up. If these two cases are similar, do we have something to act similarly over a fiber? 
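One way to make the voltage-source-and-resistor picture concrete is the reflection coefficient seen at the load end of the line. The sketch below is only an illustration: it assumes an ideal lossless line and a 50 ohm characteristic impedance (a value not given anywhere in this thread), and the names reflection_coefficient, link_looks_up and the 0.5 threshold are hypothetical.

```python
import math

Z0 = 50.0  # assumed characteristic impedance of the line, in ohms


def reflection_coefficient(z_load, z0=Z0):
    """Gamma = (Z_L - Z_0) / (Z_L + Z_0) for a purely resistive load."""
    if math.isinf(z_load):
        # Open circuit: the whole incident wave is reflected.
        return 1.0
    return (z_load - z0) / (z_load + z0)


def link_looks_up(z_load, threshold=0.5):
    """Hypothetical rule: treat a weak reflection as 'N2 is terminated'."""
    return abs(reflection_coefficient(z_load)) < threshold


# A matched receiver, an unplugged (open) line, and a shorted line.
for label, z in [("matched", 50.0), ("open", math.inf), ("short", 0.0)]:
    print(label, reflection_coefficient(z), link_looks_up(z))
```

In this picture N1 infers N2's state from how much of its own signal comes back, so nothing that looks like an acknowledgement is sent at any higher layer.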
zartash -----Original Message----- From: end2end-interest-bounces@postel.org [mailto:end2end-interest-bounces@postel.org] On Behalf Of Puddinhead Wilson Sent: Friday, January 14, 2005 8:36 AM To: end2end-interest@postel.org Subject: [e2e] Re: a problem... perhaps I should modify as someone asked for clarification, Channel between N1 and N2 has bandpass X, centred at f (ofcourse copper behaves like copper) do not wish to send a reverse "ack" for link detection etc. What are the simplest ways to do this, for example the case of copper would be to simply apply a DC voltage from the receiver side so that txmtr can look at presence/absence of DC line voltage to know link status. What do i do for other cases like fiber or wireless? --- Puddinhead Wilson wrote: > Hi there! > > I have > N1----->N2 > > link from N1 to N2 is physical wire/unidirectional > wave guide. > > > No reverse connection possible > > 1. if the media is copper, how does N1 detect that > link N1-N2 is down so it should not transmit? (i > think > i can solve this one) > 2. if the media is fiber/wireless how will i solve > this? (i dont know how to this one :( ) > > any takers? > > > > > > > ___________________________________________________________ > > ALL-NEW Yahoo! Messenger - all new features - even > more fun! http://uk.messenger.yahoo.com > ___________________________________________________________ ALL-NEW Yahoo! Messenger - all new features - even more fun! http://uk.messenger.yahoo.com From puddinghead_wilson007 at yahoo.co.uk Fri Jan 14 15:53:47 2005 From: puddinghead_wilson007 at yahoo.co.uk (Puddinhead Wilson) Date: Fri Jan 14 15:55:47 2005 Subject: [e2e] Re: a problem... In-Reply-To: <000701c4fa90$12c43660$0802a8c0@lumsnet.edu.pk> Message-ID: <20050114235348.96617.qmail@web25710.mail.ukl.yahoo.com> i do not want the reverse signal to hog/use away channel capacity though it can transmit from it. 
In other words, for example i know I need a DC offset on the line, so why not let the receiver send it...etc. From puddinghead_wilson007 at yahoo.co.uk Fri Jan 14 15:58:29 2005 From: puddinghead_wilson007 at yahoo.co.uk (Puddinhead Wilson) Date: Fri Jan 14 16:00:13 2005 Subject: [e2e] Re: a problem... In-Reply-To: <41E82758.9030605@virgin.net> Message-ID: <20050114235829.31743.qmail@web25703.mail.ukl.yahoo.com> exactly, A constant DC is "no information" :) You put my question best. but its "presence or absence" is. Say for example I have N1--->N2 if both N1 and N2 use the same frequency/channel etc to transmit, it would result in distortion of information. What is that component that I always need in order to transmit the information but that does not use up my channel capacity? --- Darren Long wrote: > If there is no information flow from N2 to N1, how > can you expect N1 to > detect N2? > > Darren From dpreed at reed.com Fri Jan 14 20:26:01 2005 From: dpreed at reed.com (David P. Reed) Date: Fri Jan 14 20:29:13 2005 Subject: [e2e] Re: a problem... In-Reply-To: <000701c4fa90$12c43660$0802a8c0@lumsnet.edu.pk> References: <000701c4fa90$12c43660$0802a8c0@lumsnet.edu.pk> Message-ID: <41E89B59.2070702@reed.com> Passive RFIDs reflect energy when stimulated by incident energy, with the reflected energy bearing some relationship to the incident energy's structure. For example, trivial kinds of passive RFID include a set of differently tuned antennas with diodes in them. The effect of the diode is reflection of a frequency-doubled version of any wave that resonates with the tuned antenna.
A transmitter sweeps a signal across the frequency range of interest, and listens with a sweep at double the transmitter rate, reading a 1 at each frequency where an antenna reflects energy. Note that this becomes less and less practical the longer the distance between source and passive device. You can do the same thing with optical gratings coupled to the receiving end of a fiber. From Lode.Coene at siemens.com Wed Jan 19 08:17:05 2005 From: Lode.Coene at siemens.com (Coene Lode) Date: Wed Jan 19 08:22:04 2005 Subject: [e2e] overlay over TCP Message-ID: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> >David P. Reed wrote: >> The reason not to depend on SCTP is the same reason that UDP isn't >> adequate. > >I said DCCP, not SCTP, FWIW, and for a number of reasons. > And what might be those reasons???? DCCP will have just about the same deployment difficulties that any other new transport protocol has to jump through.... >> The social (non-technical) processes of the "internet >> community" have labeled anything non-TCP as POISON, KEEP OUT. >> We have middleboxes and routers that chuck stuff like that on the floor. Well even the wrong port number with TCP could put you up the creek without a paddle.... Seems the drop-to-floor boxes have already won then... So all overlay networks would be welded to TCP for the rest of their lives.... > >Sure - at that point, you're stuck going over TCP, but then you're also >stuck with a few other things: > > - ACK aggregation delays > - messages split across packets (lack of fate sharing, > so higher loss rates at the message level) > - NATing that will kill interior apps anyway > any app the NAT doesn't _already_ know about > >> Interop is about allowing everything not explicitly prohibited, but >> don't tell that to the "corporate IT" folks, who want to give you >> freedom from pesky things like ways to do your job better...
some jerk >> with a beard and a technical education knows how and when you should >> communicate, and he will let you know what communications technology you >> will be allowed to use. Cisco's firewall division tells him what's OK, >> and God (or TPC from the President's Analyst) tells Cisco. > >In that case, you're safer doing IP over HTTP (yes, there is such an >animal). And before you know it, all network and transport protocol have a XML syntax... :-) Seriously: you can extend TCP or deploy an already existing transport protocol that gets close to what you want. If it has to do something similar to UDP then PR-SCTP and/or DCCP might do the job.... (PR-SCTP: Partial relialability extension for SCTP) If it should do something similar to TCP, then at present SCTP might be interesting.... Extending TCP or using SCTP/DCCP functionality is actually the battle between: (1)Put functionality in the transport layer Versus (2)The application should do this functionality because it knows what it is doing... (2) Makes sense for some applications, however most applications wouldn't mind (1) because, if later they find out that the functionality actually helps them(you never know..), then they don't have to reimplement it in their own code, they just have to throw a few switches.... > >Joe > Yours sincerely, Lode Coene Siemens COM atealaan 34 2200 Herentals, Belgium E-mail: lode.coene@siemens.com Tel: +32-14-252081 Fax: +32-14-253212 From touch at ISI.EDU Wed Jan 19 12:14:56 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed Jan 19 12:17:53 2005 Subject: [e2e] overlay over TCP In-Reply-To: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> Message-ID: <41EEBFC0.6030901@isi.edu> Coene Lode wrote: >>David P. Reed wrote: >> >>>The reason not to depend on SCTP is the same reason that UDP isn't >>>adequate. >> >>I said DCCP, not SCTP, FWIW, and for a number of reasons. 
> > And what might be those reasons???? > DCCP will have just about the same deployment difficulties that any other > new transport protocol has to jump through.... Recall that David Reed's initial post asked for: 1- TCP-friendliness 2- no app penalty for reliability or in-order delivery SCTP does (1) but NOT (2). DCCP does both (1) and (2) as requested. There are other reasons, notably SCTP's complexity compared to DCCP, as well as features such as multihoming and multistream muxing that may result in an unstable foundation for overlays, e.g., that want to do their own dynamic routing. ... > Seriously: you can extend TCP or deploy an already existing transport > protocol that gets close to what you want. > If it has to do something similar to UDP then PR-SCTP and/or DCCP might do > the job.... > (PR-SCTP: Partial relialability extension for SCTP) Why bother with PR-SCTP when a much simpler DCCP will suffice, esp. when other SCTP properties may be (IMO, are) harmful to overlays? Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050119/8ff13637/signature.bin From randall at stewart.chicago.il.us Wed Jan 19 13:04:35 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Wed Jan 19 13:07:59 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41E6EA6D.1080705@reed.com> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> Message-ID: <41EECB63.9000304@stewart.chicago.il.us> David P. Reed wrote: > The reason not to depend on SCTP is the same reason that UDP isn't > adequate. The social (non-technical) processes of the "internet > community" have labeled anything non-TCP as POISON, KEEP OUT. > We have middleboxes and routers that chuck stuff like that on the floor. 
And one of the things I have been working on in my day-job is to fix this very thing for SCTP. It does take time.. I have been at my day job 5 years now and I still only have the hardware recognizing SCTP... software is still a fight through process and demand.. and even when you can come up with demand.. you still have to get it into and through all the process .. sigh. R > > Interop is about allowing everything not explicitly prohibited, but > don't tell that to the "corporate IT" folks, who want to give you > freedom from pesky things like ways to do your job better... some jerk > with a beard and a technical education knows how and when you should > communicate, and he will let you know what communications technology you > will be allowed to use. Cisco's firewall division tells him what's OK, > and God (or TPC from the President's Analyst) tells Cisco. > > -- Randall Stewart 803-345-0369 815-342-5222(cell) From randall at stewart.chicago.il.us Wed Jan 19 13:11:28 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Wed Jan 19 13:14:10 2005 Subject: [e2e] overlay over TCP In-Reply-To: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> Message-ID: <41EECD00.7080102@stewart.chicago.il.us> Comments below :-D Coene Lode wrote: >>David P. Reed wrote: >> >>>The reason not to depend on SCTP is the same reason that UDP isn't >>>adequate. >> >>I said DCCP, not SCTP, FWIW, and for a number of reasons. >> > > > And what might be those reasons???? > DCCP will have just about the same deployment difficulties that any other > new transport protocol has to jump through.... Even worse for DCCP .. since its further behind SCTP.. and time makes a difference.. let me tell you :-D > > >>>The social (non-technical) processes of the "internet >>>community" have labeled anything non-TCP as POISON, KEEP OUT. >>>We have middleboxes and routers that chuck stuff like that on the floor. 
> > > Well even the wrong portnumber with TCP could put you up the creek without a > paddle.... > Seems the drop-to-floor boxes have already won then... > So all overlay networks would be welded to TCP for the rest of their > lives.... There is hope.. eventually SCTP will be in all the firewalls and NAT's and then sometime down the road from that you will find DCCP in the boxes too... but in either case it will be some time yet for SCTP and DCCP has not even gotten that far yet... > > >>Sure - at that point, you're stuck going over TCP, but then you're also >>stuck with a few other things: >> >> - ACK aggregation delays >> - messages split across packets (lack of fate sharing, >> so higher loss rates at the message level) >> - NATing that will kill interior apps anyway >> any app the NAT doesn't _already_ know about >> >> >>>Interop is about allowing everything not explicitly prohibited, but >>>don't tell that to the "corporate IT" folks, who want to give you >>>freedom from pesky things like ways to do your job better... some jerk >>>with a beard and a technical education knows how and when you should >>>communicate, and he will let you know what communications technology you >>>will be allowed to use. Cisco's firewall division tells him what's OK, >>>and God (or TPC from the President's Analyst) tells Cisco. >> >>In that case, you're safer doing IP over HTTP (yes, there is such an >>animal). > > > And before you know it, all network and transport protocol have a XML > syntax... :-) ug... that would be fun would it not :-D > > Seriously: you can extend TCP or deploy an already existing transport > protocol that gets close to what you want. > If it has to do something similar to UDP then PR-SCTP and/or DCCP might do > the job.... > (PR-SCTP: Partial relialability extension for SCTP) > If it should do something similar to TCP, then at present SCTP might be > interesting.... 
> Extending TCP or using SCTP/DCCP functionality is actually the battle > between: > (1)Put functionality in the transport layer > Versus > (2)The application should do this functionality because it knows what it is > doing... > > (2) Makes sense for some applications, however most applications wouldn't > mind (1) because, if later they find out that the functionality actually > helps them(you never know..), then they don't have to reimplement it in > their own code, they just have to throw a few switches.... I agree.. Most apps would rather not get involved and rather just have a service it can use... yes there will always be some that one more control and involvment.. but just look how little control TCP gives you and you see that large numbers of apps still use it :-D R > > >>Joe >> > > > Yours sincerely, > Lode Coene > > Siemens COM > atealaan 34 2200 Herentals, Belgium > E-mail: lode.coene@siemens.com > Tel: +32-14-252081 > Fax: +32-14-253212 > > > -- Randall Stewart 803-345-0369 815-342-5222(cell) From randall at stewart.chicago.il.us Wed Jan 19 15:08:26 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Wed Jan 19 15:12:47 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EEBFC0.6030901@isi.edu> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> Message-ID: <41EEE86A.5060500@stewart.chicago.il.us> Joe: A question and a comment.. Joe Touch wrote: > > > Coene Lode wrote: > >>> David P. Reed wrote: >>> >>>> The reason not to depend on SCTP is the same reason that UDP isn't >>>> adequate. >>> >>> >>> I said DCCP, not SCTP, FWIW, and for a number of reasons. >> >> >> And what might be those reasons???? >> DCCP will have just about the same deployment difficulties that any other >> new transport protocol has to jump through.... 
> > > Recall that David Reed's initial post asked for: > 1- TCP-friendliness > 2- no app penalty for reliability or in-order delivery I don't get why you answer the way you do on <2> for SCTP... What app penalty are you talking about for reliability or in-order delivery... With SCTP you can have reliability, in-order delivery or no-reliability and out-of-order delivery and any combination. How does that not meet 2? > > SCTP does (1) but NOT (2). > > DCCP does both (1) and (2) as requested. > > There are other reasons, notably SCTP's complexity compared to DCCP, as > well as features such as multihoming and multistream muxing that may > result in an unstable foundation for overlays, e.g., that want to do > their own dynamic routing. > Hmm.. last time I checked pound for pound DCCP specs weighed in about the same length or possibly longer when you add in the CCID's and you need at least one...aka TCP friendly DDCP - 126 pages CCID-02 (TCP) - 22 pages and if you add in what all of them will eventually aka the critical CC update TFRC you add: CCID-03 (TFRC) - 39 pages Take out the 2 pages of boiler plate on each one and you have 181 pages of text. SCTP RFC2960 - 134 pages PR-SCTP ext RFC3758 - 22 pages take out the same boiler plate and you are at 152 pages of text. Seems to me about the same complexity and also DCCP is not an RFC yet.. there may be more to it... I know they talked about a mobility draft and some new CCID for VOIP.. All in all I don't buy your argument that SCTP is more complex... And I think DCCP had a form of multi-homing in it too.. its been a while since I have read through it so it may be removed or moved other places .... Besides which, we are no longer in the 8088 days so complexity in either DCCP or SCTP is not something to be afraid of. An application wants services plain and simple... > ... > >> Seriously: you can extend TCP or deploy an already existing transport >> protocol that gets close to what you want. 
>> If it has to do something similar to UDP then PR-SCTP and/or DCCP >> might do >> the job.... >> (PR-SCTP: Partial relialability extension for SCTP) > > > Why bother with PR-SCTP when a much simpler DCCP will suffice, esp. when > other SCTP properties may be (IMO, are) harmful to overlays? If one does not want multi-homing one can basically turn it off for tunnels.. I can set up my tunnel endpoints so that the association is singly homed.. that's easy... so if you think M-homing is harmful, turn it off. And you will find, IMO, SCTP a bit further along on the way to deployment than DCCP.. this is not to say DCCP won't deploy eventually.. but it too has the same hurdles this thread has discussed with NATs and Firewalls.. and it's just starting... R > > Joe -- Randall Stewart 803-345-0369 815-342-5222(cell) From rja at extremenetworks.com Wed Jan 19 15:31:59 2005 From: rja at extremenetworks.com (RJ Atkinson) Date: Wed Jan 19 15:33:36 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EECB63.9000304@stewart.chicago.il.us> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> Message-ID: <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> Perhaps one of the paths forward is for folks who propose new transport-layer protocols to also have an informational document targeted at folks who build firewalls (or other middle boxes) to help educate them on what the real risks are (and aren't) with the new protocol and also to give them help on how to implement support for that new protocol in their middle box... For example, with SCTP, one of the things that could help would be specific openly published information on efficiently re-calculating the SCTP checksum after a NAT has done its work. Many folks know how to do this with a Fletcher checksum (often because they've looked at BSDish code), but not so many know how to do it with SCTP's new checksum.
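For readers who have not met the checksum in question: SCTP's checksum is CRC32c (the Castagnoli polynomial), adopted in RFC 3309 in place of Adler-32. A minimal bitwise sketch follows. It is a plain full recomputation over a byte string, not the efficient incremental update asked for above, and it leaves out SCTP-specific details such as zeroing the checksum field before computing and the byte order used when storing the result. The function name crc32c is hypothetical.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32c: reflected polynomial 0x82F63B78, init and final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial when a 1 falls off.
            if crc & 1:
                crc = (crc >> 1) ^ 0x82F63B78
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


# Standard CRC-32C check value for the ASCII string "123456789".
print(hex(crc32c(b"123456789")))  # prints 0xe3069283
```

Since the SCTP checksum covers only the SCTP packet (common header plus chunks) and no IP pseudo-header, a box that rewrites only IP addresses should leave it valid; a recomputation like the one sketched here is needed when ports or chunk contents are rewritten.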
(My assumption here is that the big barrier is confusion/ignorance. :-) Ran From touch at ISI.EDU Wed Jan 19 15:34:26 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed Jan 19 15:36:23 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EEE86A.5060500@stewart.chicago.il.us> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> Message-ID: <41EEEE82.10500@isi.edu> Randall Stewart wrote: > Joe: > > A question and a comment.. > ... >> Recall that David Reed's initial post asked for: >> 1- TCP-friendliness >> 2- no app penalty for reliability or in-order delivery > > > I don't get why you answer the way you do on <2> for SCTP... > > What app penalty are you talking about for reliability or > in-order delivery... With SCTP you can have reliability, in-order > delivery or no-reliability and out-of-order delivery and any > combination. PR-SCTP is an extension to SCTP, which isn't as widely deployed as SCTP. I re-read the PR-SCTP spec a few times, and _still_ cannot figure out how to provide true unreliable, any-order delivery. That alone is a fine reason not to use PR-SCTP for this example. See below for other reasons... As to the app penalty, it's related to reliable, in-order delivery costs the application latency if there are losses or out-of-order events in the net (for retransmission and reordering). > How does that not meet 2? > >> SCTP does (1) but NOT (2). >> >> DCCP does both (1) and (2) as requested. >> >> There are other reasons, notably SCTP's complexity compared to DCCP, >> as well as features such as multihoming and multistream muxing that >> may result in an unstable foundation for overlays, e.g., that want to >> do their own dynamic routing. (page count comparisons omitted for brevity)... > Seems to me about the same complexity and also DCCP is not an > RFC yet.. there may be more to it... I know they talked > about a mobility draft and some new CCID for VOIP.. 
> > All in all I don't buy your argument that SCTP is more > complex... > > And I think DCCP had a form of multi-homing in it too.. its > been a while since I have read through it so it may be removed or > moved other places .... > > Besides which, we are no longer in the 8088 days so complexity > in either DCCP or SCTP is not something to be afraid of. An > application wants services plain and simple... Complexity affects a number of things: - reliability of implementation - ability to easily configure and use the protocol In particular, _if_ you're referring to PR-SCTP, please indicate where in the PR-SCTP RFC its use for unreliable, out-of-order messages is simply and clearly described. ;-) >> ... >> >>> Seriously: you can extend TCP or deploy an already existing transport >>> protocol that gets close to what you want. >>> If it has to do something similar to UDP then PR-SCTP and/or DCCP >>> might do >>> the job.... >>> (PR-SCTP: Partial relialability extension for SCTP) >> >> Why bother with PR-SCTP when a much simpler DCCP will suffice, esp. >> when other SCTP properties may be (IMO, are) harmful to overlays? > > If one does not want multi-homing one can basically turn if > off for tunnels.. I can setup my tunnel endpoints so that > the association is singly homed.. thats easy... so if you > think M-homing is harmful, turn it off. And you will > find, IMO, SCTP a bit further along on the way to deployment > then DCCP.. this is not to say DCCP won't deploy eventually.. but > it too has the same hurtles this thread as discussed with NATs > and Firewalls.. and its just starting... There are so many things in SCTP to turn off, it's impossible to consider a valid argument that SCTP is less complex than DCCP. The length of the DCCP spec vs. SCTP may speak to the comparitive clarity or completeness; spec length isn't always a good metric of complexity. My metric is feature set. By that metric DCCP is a much simpler subset for the task requested. 
As to NATs and Firewalls, as we all agreed, anything short of IP over HTTP is liable to be blocked somewhere. Joe From lynne at telemuse.net Wed Jan 19 16:06:50 2005 From: lynne at telemuse.net (Lynne Jolitz) Date: Wed Jan 19 16:08:24 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EECD00.7080102@stewart.chicago.il.us> Message-ID: <000701c4fe83$f09e65e0$6e8944c6@telemuse.net> A few questions... > > DCCP will have just about the same deployment difficulties that > any other > > new transport protocol has to jump through.... > > Even worse for DCCP .. since its further behind SCTP.. and time > makes a difference.. let me tell you :-D Why? What's the rush? > There is hope.. eventually SCTP will be in all the firewalls and NAT's > and then sometime down the road from that you will find DCCP in > the boxes too... but in either case it will be some time yet for > SCTP and DCCP has not even gotten that far yet... Why would either be incorporated into firewalls and NATs? Is there a commitment from Cisco or some other big company to back them? What's the killer application that makes them pour out money to upgrade their firmware and go with either of these? Takes a lot to motivate sales these days, according to the biz guys. Think they read RFCs and get all excited, saying "Hey, that's our next big product"? :-) Lynne. ---- We use SpamQuiz.
If your ISP didn't make the grade try http://lynne.telemuse.net From touch at ISI.EDU Wed Jan 19 17:08:12 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed Jan 19 17:11:16 2005 Subject: [e2e] overlay over TCP In-Reply-To: <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> Message-ID: <41EF047C.9070206@isi.edu> RJ Atkinson wrote: > > Perhaps one of the paths forward is for folks who propose new > transport-layer protocols to also have an informational document > targeted at folks who build firewalls (or other middle boxes) to help > educate them on what the real risks are (and aren't) with the new > protocol and also to give them help on how to implement support for > that new protocol in their middle box... That presumes, IMO, that NAT designers _want_ to incorporate new protocols. > (My assumption here is that the big barrier is confusion/ignorance. :-) For many designers, as well as many customers, "all new protocols are more dangerous than current ones" - as confused/ignorant as that may be. Never mind how complicated support for SCTP would need to be (multipath, multistream + NAT rewriting = ?). Joe From randy at psg.com Wed Jan 19 17:28:47 2005 From: randy at psg.com (Randy Bush) Date: Wed Jan 19 17:29:56 2005 Subject: [e2e] overlay over TCP References: <41EECD00.7080102@stewart.chicago.il.us> <000701c4fe83$f09e65e0$6e8944c6@telemuse.net> Message-ID: <16879.2383.356430.995766@roam.psg.com> > Takes a lot to motivate sales these days, according to the biz guys.
Think > they read RFCs and get all excited, saying "Hey, that's our next big > product"? :-) no. they write rfcs with complex kinky new protocols where the complexity and scalability cause operators to have to buy more routers and then put massive marketing effort behind it. randy From jtk at northwestern.edu Wed Jan 19 17:53:56 2005 From: jtk at northwestern.edu (John Kristoff) Date: Wed Jan 19 17:56:40 2005 Subject: [e2e] overlay over TCP In-Reply-To: <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> Message-ID: <20050119195356.0ecc80cb@dsl017-022-068.chi1.dsl.speakeasy.net> On Wed, 19 Jan 2005 18:31:59 -0500 RJ Atkinson wrote: > Perhaps one of the paths forward is for folks who propose new > transport-layer > protocols to also have an informational document targeted at folks who > build > firewalls (or other middle boxes) to help educate them on what the real > risks > are (and aren't) with the new protocol and also to give them help on how > to implement support for that new protocol in their middle box... It's not just the people who build boxes that get in the way, it is the people who operate them. > (My assumption here is that the big barrier is confusion/ignorance. :-) That's one. Fear is another. My experience in this thread seems to confirm that is a motivator for a handful of people that hang out in even what has historically been a relatively clueful operator list: John From me at armandocaro.net Wed Jan 19 20:06:38 2005 From: me at armandocaro.net (Armando L. Caro, Jr.) 
Date: Wed Jan 19 20:10:01 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EEEE82.10500@isi.edu> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> Message-ID: On Wed, 19 Jan 2005, Joe Touch wrote: > In particular, _if_ you're referring to PR-SCTP, please indicate where > in the PR-SCTP RFC its use for unreliable, out-of-order messages is > simply and clearly described. ;-) For out-of-order messages, refer to Section 6.6 in RFC 2960. I think this section is fairly clear. For unreliable messages, refer to Section in 3.5 and 3.6 in RFC 3758. I think this section is clear, given that the reader is familiar with SCTP basics. ~armando 0-- --0 | Armando L. Caro, Jr. | Protocol Engineering Lab | | www.armandocaro.net | University of Delaware | 0-- --0 From touch at ISI.EDU Wed Jan 19 20:38:29 2005 From: touch at ISI.EDU (Joe Touch) Date: Wed Jan 19 20:40:39 2005 Subject: [e2e] overlay over TCP In-Reply-To: References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> Message-ID: <41EF35C5.9040901@isi.edu> Armando L. Caro, Jr. wrote: > On Wed, 19 Jan 2005, Joe Touch wrote: > > >>In particular, _if_ you're referring to PR-SCTP, please indicate where >>in the PR-SCTP RFC its use for unreliable, out-of-order messages is >>simply and clearly described. ;-) > > For out-of-order messages, refer to Section 6.6 in RFC 2960. I think > this section is fairly clear. > > For unreliable messages, refer to Section in 3.5 and 3.6 in RFC 3758. I > think this section is clear, given that the reader is familiar with SCTP > basics. Greek is clear to a Greek as well. 
;-) Sec 3.5 and 3.6 are obscure, if the intent is to describe something that supports UDP-like semantics with TCP-like congestion control, such as is referred to in passing near the end of Sec 1 (item #3) of that RFC. As to the full set of reasons for which DCCP is preferable to PR-SCTP, see sec 3.3.2 of the DCCP problem statement ID (where PR-SCTP is referred to as U-SCTP). Note that item F2 in sec 3.5 of RFC 3758 also allows delaying outgoing chunks for aggregation; DCCP does not appear to do that (any DCCP experts like to chime in?). DCCP appears to correlate packets on the wire with application writes and reads; the same is not necessarily true with SCTP. There are substantial advantages to such correlation when tunneling network layer packets over transport protocols. I think that's what David Reed was referring to... Joe From michael.welzl at uibk.ac.at Wed Jan 19 23:50:14 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Wed Jan 19 23:52:35 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EECD00.7080102@stewart.chicago.il.us> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EECD00.7080102@stewart.chicago.il.us> Message-ID: <1106207414.4767.29.camel@lap10-c703.uibk.ac.at> > > And what might be those reasons???? > > DCCP will have just about the same deployment difficulties that any other > > new transport protocol has to jump through.... > > > Even worse for DCCP .. since its further behind SCTP.. and time > makes a difference.. let me tell you :-D While I'm all for DCCP, I think that its deployment is in fact a more intricate issue: One can easily argue that using SCTP brings a benefit for certain special applications.
That is a little more difficult for DCCP. From the perspective of a programmer who ponders whether to use the protocol, some potential immediate benefits of DCCP to the application are:
* if the application is planned to have 1000s of users connect to a single server, it may work (i.e. scale) better because congestion control is properly implemented
* TCP-based applications that are used at the same time may work better
* the programmer's own application might experience a smaller loss ratio while maintaining reasonable throughput, i.e. there are perhaps greater chances that most of what I send really makes it to the other end, while my rate is almost as large as it can be under such circumstances. DCCP's ECN support might help here, but on the other hand, merely setting the ECN flag in a UDP based flow is not such a big deal either, provided that the OS allows doing this...
Given that we might be talking about an application which would need to be updated to use the protocol and kind of works already, as well as the standard deployment problems with firewalls etc., I have doubts that these arguments are convincing. The DCCP problem statement draft says:
  There has been substantial agreement [RFC 2309] [FF99] that in order to maintain the continued use of end-to-end congestion control, router mechanisms are needed to detect and penalize uncontrolled high-bandwidth flows in times of high congestion; these router mechanisms are colloquially known as `penalty boxes'.
Now let's say that the only truly convincing argument to use DCCP would be dramatically worse performance with UDP as the result of such penalty boxes (this is what I believe). In this case, DCCP deployment can only happen once such boxes are widely used. An ISP will only have an incentive to install such a box if it brings a benefit - i.e.
if the financial loss from problems with UDP traffic is greater than the loss from customers who switch to a different ISP because their UDP based application doesn't work anymore. If tons of apps use UDP instead of DCCP, the latter loss may be quite significant. So an ISP might have to wait for DCCP to be used by apps before installing penalty boxes, which would motivate app designers to use DCCP ... two parties waiting for each other, and what have we learned from history? Take a look at this paragraph about QoS deployment from RFC 2990: No network operator will make the significant investment in deployment and support of distinguished service infrastructure unless there is a set of clients and applications available to make immediate use of such facilities. Clients will not make the investment in enhanced services unless they see performance gains in applications that are designed to take advantage of such enhanced services. No application designer will attempt to integrate service quality features into the application unless there is a model of operation supported by widespread deployment that makes the additional investment in application complexity worthwhile and clients who are willing to purchase such applications. With all parts of the deployment scenario waiting for the others to move, widespread deployment of distinguished services may require some other external impetus. Will we also need such an other external impetus for DCCP, and what could it be? 
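The remark above that setting the ECN flag on a UDP-based flow is "not such a big deal" can be illustrated with a minimal sketch. It is not from the thread. It assumes an OS that exposes the IPv4 `IP_TOS` socket option and lets an application set the two ECN bits on a datagram socket (Linux does this for UDP; other systems may mask the bits). The function names are made up for illustration. Reading congestion marks on received packets and slowing down in response, which is the hard part that DCCP standardizes, is not shown.

```python
import socket

ECN_MASK = 0x03  # the ECN field is the low two bits of the IPv4 TOS byte
ECT0 = 0x02      # ECT(0) codepoint from RFC 3168: "ECN-capable transport"


def open_ect_udp_socket() -> socket.socket:
    """Create a UDP socket whose outgoing packets are marked ECT(0).

    Hypothetical helper: it keeps the DSCP bits the OS already chose
    and overwrites only the two ECN bits.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, (tos & ~ECN_MASK) | ECT0)
    return sock


def ecn_codepoint(sock: socket.socket) -> int:
    """Return the ECN bits the OS reports it will send for this socket."""
    return sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS) & ECN_MASK
```

Marking packets this way without reacting to congestion marks would defeat the purpose of ECN, which is one reason the one-line socket option is not a substitute for a transport with built-in congestion control.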
Cheers, Michael From randall at stewart.chicago.il.us Thu Jan 20 01:54:28 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 01:58:28 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EEEE82.10500@isi.edu> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> Message-ID: <41EF7FD4.1040108@stewart.chicago.il.us> Joe Touch wrote: > > > Randall Stewart wrote: > >> Joe: >> >> A question and a comment.. >> > ... > >>> Recall that David Reed's initial post asked for: >>> 1- TCP-friendliness >>> 2- no app penalty for reliability or in-order delivery >> >> >> >> I don't get why you answer the way you do on <2> for SCTP... >> >> What app penalty are you talking about for reliability or >> in-order delivery... With SCTP you can have reliability, in-order >> delivery or no-reliability and out-of-order delivery and any >> combination. > > > PR-SCTP is an extension to SCTP, which isn't as widely deployed as SCTP. > I re-read the PR-SCTP spec a few times, and _still_ cannot figure out > how to provide true unreliable, any-order delivery. That alone is a fine > reason not to use PR-SCTP for this example. See below for other reasons... Hmm. The way I interpret the above is that you cannot figure out how to write code for PR-SCTP that gives you unreliability. If that is what you mean, I would not look to a protocol spec to tell you how to use an API. The PR-SCTP spec provides a base "service" (timed reliability) and a protocol mechanism. It's not meant to be a how-to. The 3rd edition of UNP (Unix Network Programming) gives a much better view of how to use the socket API, and I believe (if I remember right) it discusses using the lifetime field. Basically, to get unreliable service (equivalent to UDP), all one has to do is set a lifetime parameter of less than 1 second, i.e. less than RTO.min. If one does that, then you will have the same semantics as UDP.
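The lifetime rule described above can be sketched as a toy model. It is not taken from the thread and is not SCTP code: the function name and parameters are invented for illustration, the retransmission timer is held fixed at RTO.min (1 second by default in RFC 2960) with no backoff, and fast retransmit is ignored. It only shows why a lifetime shorter than the timer means a lost message is never sent again.

```python
RTO_MIN = 1.0  # seconds; default RTO.Min suggested in RFC 2960


def transmissions(lifetime: float, lost_copies: int, rto: float = RTO_MIN) -> int:
    """Count how often one message goes on the wire under timed reliability.

    lifetime    -- seconds the sender is allowed to keep trying (hypothetical name)
    lost_copies -- how many consecutive copies the network drops
    rto         -- retransmission timeout, held constant in this toy model

    The first copy is always sent. Each time the timer fires for a lost
    copy, the sender retransmits only if the lifetime has not yet expired;
    otherwise it abandons the message, as UDP would after a single send.
    """
    sent = 1
    elapsed = 0.0
    while sent <= lost_copies:   # the most recent copy was lost
        elapsed += rto           # wait for the retransmission timer
        if elapsed >= lifetime:  # lifetime expired: give up
            break
        sent += 1                # still within lifetime: retransmit
    return sent
```

With a lifetime of 0.5 s the function always returns 1, however many copies are lost; with an unbounded lifetime it keeps retransmitting until a copy gets through, which is ordinary reliable delivery.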
One can choose to use a smaller time than RTO.min too, I suppose, but if you are after an unreliable service that would be how one would do it. Some implementations actually have a "SEND_ONLY_ONCE"-like flag, which can be used as well, but of course that's an implementation-specific thing. As to unordered delivery, that's a base SCTP feature, and one (as covered in UNP) just passes the MSG_UNORDERED flag in with the message (OR'd in with any other options). Of course that's for the sockets API; other implementations will vary in how they specify unordered :-D > > As to the app penalty, it's related to reliable, in-order delivery costs > the application latency if there are losses or out-of-order events in > the net (for retransmission and reordering). So basically, if you can do unreliable, out-of-order delivery, you meet (2). SCTP with the PR-SCTP extension can do that, so I believe your statement below is incorrect. > >> How does that not meet 2? >> >>> SCTP does (1) but NOT (2). >>> >>> DCCP does both (1) and (2) as requested. >>> >>> There are other reasons, notably SCTP's complexity compared to DCCP, >>> as well as features such as multihoming and multistream muxing that >>> may result in an unstable foundation for overlays, e.g., that want to >>> do their own dynamic routing. > > > (page count comparisons omitted for brevity)... > >> Seems to me about the same complexity and also DCCP is not an >> RFC yet.. there may be more to it... I know they talked >> about a mobility draft and some new CCID for VOIP.. >> >> All in all I don't buy your argument that SCTP is more >> complex... >> >> And I think DCCP had a form of multi-homing in it too.. its >> been a while since I have read through it so it may be removed or >> moved other places .... >> >> Besides which, we are no longer in the 8088 days so complexity >> in either DCCP or SCTP is not something to be afraid of. An >> application wants services plain and simple...
> > > Complexity affects a number of things: > > - reliability of implementation > - ability to easily configure and use the protocol > > In particular, _if_ you're referring to PR-SCTP, please indicate where > in the PR-SCTP RFC its use for unreliable, out-of-order messages is > simply and clearly described. ;-) Check the API document and UNP. As to reliability of implementation and ease to use. a) Last interop we had about 15 stable and reliable implementations of SCTP present reperesenting and running on all O/S's. b) Most all of the implementations supported PR-SCTP, very few did not. c) a large number of implementations supported the sockets api which is quite simple and easy to use and as I said has quite detailed coverage in UNP 3rd edition (you should get a copy). > >>> ... >>> >>>> Seriously: you can extend TCP or deploy an already existing transport >>>> protocol that gets close to what you want. >>>> If it has to do something similar to UDP then PR-SCTP and/or DCCP >>>> might do >>>> the job.... >>>> (PR-SCTP: Partial relialability extension for SCTP) >>> >>> >>> Why bother with PR-SCTP when a much simpler DCCP will suffice, esp. >>> when other SCTP properties may be (IMO, are) harmful to overlays? >> >> >> If one does not want multi-homing one can basically turn if >> off for tunnels.. I can setup my tunnel endpoints so that >> the association is singly homed.. thats easy... so if you >> think M-homing is harmful, turn it off. And you will >> find, IMO, SCTP a bit further along on the way to deployment >> then DCCP.. this is not to say DCCP won't deploy eventually.. but >> it too has the same hurtles this thread as discussed with NATs >> and Firewalls.. and its just starting... > > > There are so many things in SCTP to turn off, it's impossible to > consider a valid argument that SCTP is less complex than DCCP. I don't get this response. You obviously have not used the SCTP API with sockets. Its quite easy to use only one address.. you use a bind call. 
Its quite easy to send with various options (check the UNP or the socket api draft if you prefer a draft). Its not complicated or hard. It took a two line code change to make mozilla run over SCTP. Of course not all features were used.. but not everyone needs to use all features either. It's easy to use and not complex.. and implementation wise we seem to have a lot of them for such a "complex problem". The 15 I mention were what was at the last inter-op. Many implementations did not come since they consider themselves so stable they are not interested in attending... there are probably closer to 30 implemenations total then 15... but I would need to go back and check all of the multitude of interops that have gone on before I could say for sure the exact numbers... > > The length of the DCCP spec vs. SCTP may speak to the comparitive > clarity or completeness; spec length isn't always a good metric of > complexity. My metric is feature set. By that metric DCCP is a much > simpler subset for the task requested. And as yet I could not a) tell you how to write a line of code to it b) tell you how to negotiate and end up with a congestion control profile (one of the things its supposed to do for me) without open loop negotation. c) Tell you for sure what O/S's support it, I believe there MAY be a KAME extension that includes DCCP but I have only tried once to get it to compile and that did not work. It may have a smaller feature set, but it does have its complexities as well and the more CCID's you add the more complex it becomes and the more you end up with features and options.. each CCID is a feature IMO. Which to me, is not a bad thing. As an APP writer I want features from my lower layers. This gives me more optioins. > > As to NATs and Firewalls, as we all agreed, anything short of IP over > HTTP is liable to be blocked somewhere. > At least on this we agree... and this is after something is supported in the NAT and F/W world. 
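One concrete piece of the NAT support discussed in this thread is fixing up the SCTP checksum after a port rewrite. The sketch below is not from the thread; it assumes the CRC32c checksum that RFC 3309 introduced for SCTP and does a full recomputation, not the efficient incremental update RJ Atkinson asks about. The function names are invented, and storing the result least-significant byte first is my reading of the RFC's sample code, so check it against a real stack before relying on it.

```python
import struct


def _make_table():
    # Table for the reflected CRC-32C (Castagnoli) polynomial:
    # 0x1EDC6F41 bit-reversed is 0x82F63B78.
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Plain table-driven CRC-32C over data."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def rewrite_source_port(packet: bytes, new_port: int) -> bytes:
    """Return a copy of an SCTP packet with a new source port and checksum.

    SCTP common header layout: source port (2 bytes), destination port (2),
    verification tag (4), checksum (4). The checksum covers the whole SCTP
    packet with the checksum field set to zero.
    """
    if len(packet) < 12:
        raise ValueError("shorter than the SCTP common header")
    buf = bytearray(packet)
    struct.pack_into("!H", buf, 0, new_port)  # bytes 0-1: source port
    buf[8:12] = b"\x00\x00\x00\x00"           # zero the checksum field
    # Assumption: checksum is stored least-significant byte first.
    struct.pack_into("<I", buf, 8, crc32c(bytes(buf)))
    return bytes(buf)
```

Unlike the TCP and UDP checksums, the SCTP checksum does not cover an IP pseudo-header, so a middlebox that rewrites only IP addresses and leaves the ports alone should not need to touch it at all.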
R -- Randall Stewart 803-345-0369 815-342-5222(cell) From randall at stewart.chicago.il.us Thu Jan 20 02:03:59 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 02:08:21 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EF35C5.9040901@isi.edu> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> <41EF35C5.9040901@isi.edu> Message-ID: <41EF820F.3040106@stewart.chicago.il.us> Joe Touch wrote: > > > Armando L. Caro, Jr. wrote: > >> On Wed, 19 Jan 2005, Joe Touch wrote: >> >> >>> In particular, _if_ you're referring to PR-SCTP, please indicate where >>> in the PR-SCTP RFC its use for unreliable, out-of-order messages is >>> simply and clearly described. ;-) >> >> >> For out-of-order messages, refer to Section 6.6 in RFC 2960. I think >> this section is fairly clear. >> >> For unreliable messages, refer to Section in 3.5 and 3.6 in RFC 3758. I >> think this section is clear, given that the reader is familiar with SCTP >> basics. > > > Greek is clear to a Greek as well. ;-) Hmm but we are all Geeks here does that count :-D > > Sec 3.5 and 3.6 are obscure, if the intent is to describe something that > supports UDP-like semantics with TCp-like congestion control, such as is > referred to in passing near the end of Sec 1 (item #3) of that RFC. > > As to the full set of reasons for which DCCP is preferable to PR-SCTP, > see sec 3.3.2 of the DCCP problem statement ID (where PR-SCTP is > referred to as U-SCTP). Hmm.. is this an expired draft? I searched the ietf drafts and I also looked off the DCCP wg page and can't find this draft... > > Note that item F2 in sec 3.5 of RFC 3758 also allows delaying outgoing > chunks for aggregation; DCCP does not appear to do that (any DCCP > experts like to chime in?)DCCP appers to correlate packets on the wire > with application writes and reads; the same is not necessarily true with > SCTP. 
There are substantial advantages to such correlation when > tunneling network layer packets over transport protocols. I think that's > what David Reed was referring to... > That is a base feature of SCTP as well.. even though its reflected in RFC3758. In the sockets API we call it SCTP_NODELAY (same as in TCP) and it is basically an implemenation of nagel. Of course it is also required that it be disableable... so if one wants "no bundling" one turns SCTP_NODELAY off.. which at least for the socket api in BSD is on by default. R -- Randall Stewart 803-345-0369 815-342-5222(cell) From randall at stewart.chicago.il.us Thu Jan 20 02:11:44 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 02:16:11 2005 Subject: [e2e] overlay over TCP In-Reply-To: <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> Message-ID: <41EF83E0.8050407@stewart.chicago.il.us> RJ Atkinson wrote: > > Perhaps one of the paths forward is for folks who propose new > transport-layer > protocols to also have an informational document targeted at folks who > build > firewalls (or other middle boxes) to help educate them on what the real > risks > are (and aren't) with the new protocol and also to give them help on how > to implement support for that new protocol in their middle box... > > For example, with SCTP, one of the things that could help would be specific > openly published information on efficiently re-calculating the SCTP > checksum > after a NAT has done its work, for example. Many folks know how to do this > with a Fletcher checksum (often because they've looked at BSDish code), > but not so many know how to do it with SCTP's new checksum. > > (My assumption here is that the big barrier is confusion/ignorance. 
:-) Ran: I wish that the big barrier were confusion/ignorance... its not in one large case I know of :-D .. its that there are not enough customers demanding it.. and their are other priorities. One customer (which is all I have requests from) is not enough to get a F/W & NAT change to support SCTP.. at least on the software side. The changes are in on the hardware side.. kind of funny actually... and they say hardware changes slower :-D I have actually started playing with the changes needed to implement SCTP in NAT and F/W worlds of BSD first.. and then I was going to move on to that other big O/S that I work upon occasionaly... and then.. maybe when enough folks ask for it I can hand the finished code to one of my colleages and say.. here.. put this in :-D But of course that is also amongst all my other "to-do"s and not even in todays set :-o R > > Ran > > > -- Randall Stewart 803-345-0369 815-342-5222(cell) From randall at stewart.chicago.il.us Thu Jan 20 02:15:00 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 02:18:25 2005 Subject: [e2e] overlay over TCP In-Reply-To: <000701c4fe83$f09e65e0$6e8944c6@telemuse.net> References: <000701c4fe83$f09e65e0$6e8944c6@telemuse.net> Message-ID: <41EF84A4.8030405@stewart.chicago.il.us> Lynne Jolitz wrote: > A few questions... > >>> DCCP will have just about the same deployment difficulties that >> >> any other >> >>> new transport protocol has to jump through.... >> >> Even worse for DCCP .. since its further behind SCTP.. and time >> makes a difference.. let me tell you :-D > > > Why? What's the rush? > > >> There is hope.. eventually SCTP will be in all the firewalls and >> NAT's and then sometime down the road from that you will find DCCP >> in the boxes too... but in either case it will be some time yet for >> SCTP and DCCP has not even gotten that far yet... > > > Why would either be incoporated into firewalls and NATs? 
Is there a > commitment from Cisco or some other big company to back them? What's > the killer application that makes them pour out money to upgrade > their firmware and go with either of these? > Hardware is not the problem.. thats been done.. its the software side that only has one customer asking for it. And yes Cisco makes money from SCTP. we have at least three products that I know of that use it.. and soon a fourth will be added to that list.. all routers that do netflow. Of course almost all of the apps that are represented are "inside" the network.. so one really does not have to have a F/W or NAT in the picture.. at least thats the argument I have heard so far... and because of that there has not been a huge customer demand.. yet :-D > Takes a lot to motivate sales these days, according to the biz guys. > Think they read RFCs and get all excited, saying "Hey, that's our > next big product"? :-) some do, some don't most in the later... R > > Lynne. > > ---- We use SpamQuiz. If your ISP didn't make the grade try > http://lynne.telemuse.net > > -- Randall Stewart 803-345-0369 815-342-5222(cell) From randall at stewart.chicago.il.us Thu Jan 20 02:19:01 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 02:21:58 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EF047C.9070206@isi.edu> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <41EF047C.9070206@isi.edu> Message-ID: <41EF8595.5040500@stewart.chicago.il.us> Joe Touch wrote: > > > RJ Atkinson wrote: > >> >> Perhaps one of the paths forward is for folks who propose new >> transport-layer protocols to also have an informational document >> targeted at folks who build firewalls (or other middle boxes) to help >> educate them on what the real risks are (and aren't) with the new >> protocol and also to give them help on how to implement 
support for >> that new protocol in their middle box... > > > That presumes, IMO, that NAT designers _want_ to incorporate new protocols. I think not.. its more demand that drives the process IMO or as put in a move "show me the money" ... > >> (My assumption here is that the big barrier is confusion/ignorance. :-) > > > For many, as well as many customers, "all new protocols are more > dangerous than current ones" - as confused/ignorant as that may be. > Nevermind how complicated support for SCTP would need to be (multipath, > multistream + NAT rewriting = ?). Nope.. you DON'T need to rewrite NAT to do SCTP.. its a simple set of changes.. You just don't get multi-homing with NAT. But if you need a NAT chances are you are not too interested in multi-homing anyway. R > > JOe -- Randall Stewart 803-345-0369 815-342-5222(cell) From Lode.Coene at siemens.com Thu Jan 20 03:07:27 2005 From: Lode.Coene at siemens.com (Coene Lode) Date: Thu Jan 20 03:12:44 2005 Subject: [e2e] overlay over TCP Message-ID: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EC@hrtades7.atea.be> Joe, I was pleasently surprised to learn that I can read and (probably) write Greek, in addition to Dutch, French, German and English. Then I remebered that I could also read DCCP and TCP specs and understand them. And they were in a very similar language to the SCTP Specs..... Maybe after all, you can also read and speak greek.... :-) > Note that item F2 in sec 3.5 of RFC 3758 also allows > delaying outgoing > chunks for aggregation; DCCP does not appear to do that (any DCCP > experts like to chime in?)DCCP appers to correlate packets on > the wire > with application writes and reads; the same is not > necessarily true with > SCTP. There are substantial advantages to such correlation when > tunneling network layer packets over transport protocols. I > think that's > what David Reed was referring to... And that is what I am using with SCTP. Bundling messages is an Option in SCTP. 
I turn this option off, so all my messages get on the wire the moment I write them to the socket... Most present day SCTP aplication do this. They do not bundle/aggregate messages. And you are right, there are substantial advantage to such a correlation(to start with debugging, the logical flow of the program and so forth...) That why most SCTP applications want and do this... However some applications may like them aggregate, so no problem for me, just switch it on... It seems that I been doing overlay networks for most of my life, that why we required SCTP to at least have the option to turn aggregation off. And some other options that are included in SCTP which make overlaying a bit easier. Does this add complexity, sure, but then just compare UDP with DCCP... Mathematical speaking , if you go from a protocol which can be described in 10 pages to a protocol(UDP) that must be described on around 100 pages(DCCP), that is a pretty big jump in complexity, Then any application that want to use it, should at least expect to see some advantages from it.... If a transport protocol does not the stuff that I deem necessary for my particular kind of overlay, then I am sorry, then it is very likely that I will not choose that protocol.... DCCP was not around when we were working on our overlay... But now that DCCP is around, it still comes up short in some aspects.... And NO , do not change DCCP to adapt to our needs, because you will then end up with a clone of SCTP..(and the same complexity) (And yes, along those lines , I admit that SCTP is a clone of TCP with some extra thrown in...There is nothing wrong with that.) SCTP does it job . And some of the additional stuff will make it doing it's job even better... My company sells stuff with SCTP in to operators. They use it, so the NAT belonging to those operators have to let SCTP through.. 
Otherwise the consequence is that they make no money from it and believe if that happens, they gone tell that NAT/firewall/router vendor to adapt or take the NAT/firewall/router gear back... > > Joe > Yours sincerly, Lode Coene From michael.welzl at uibk.ac.at Thu Jan 20 05:34:50 2005 From: michael.welzl at uibk.ac.at (Michael Welzl) Date: Thu Jan 20 05:37:58 2005 Subject: [e2e] overlay over TCP In-Reply-To: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EC@hrtades7.atea.be> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EC@hrtades7.atea.be> Message-ID: <1106228089.4767.60.camel@lap10-c703.uibk.ac.at> > Mathematical speaking , if you go from a protocol which can be described in > 10 pages to a protocol(UDP) that must be described on around 100 > pages(DCCP), that is a pretty big jump in complexity, > Then any application that want to use it, should at least expect to see some > advantages from it.... I think you're mixing arguments here. Internal protocol complexity is not the main incentive / deployment problem. e.g., TCP is a complex beast, but using it is not such a big deal. When we talk about incentives to use a protocol, we should talk about complexity that reveals itself towards its users, i.e. API features that are seen by upper layers. Cheers, Michael From touch at ISI.EDU Thu Jan 20 06:56:22 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 20 06:59:27 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EF820F.3040106@stewart.chicago.il.us> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> <41EF35C5.9040901@isi.edu> <41EF820F.3040106@stewart.chicago.il.us> Message-ID: <41EFC696.6040202@isi.edu> Randall Stewart wrote: > Joe Touch wrote: > >> >> >> Armando L. Caro, Jr. 
wrote: >> >>> On Wed, 19 Jan 2005, Joe Touch wrote: >>> >>> >>>> In particular, _if_ you're referring to PR-SCTP, please indicate where >>>> in the PR-SCTP RFC its use for unreliable, out-of-order messages is >>>> simply and clearly described. ;-) >>> >>> >>> >>> For out-of-order messages, refer to Section 6.6 in RFC 2960. I think >>> this section is fairly clear. >>> >>> For unreliable messages, refer to Section in 3.5 and 3.6 in RFC 3758. I >>> think this section is clear, given that the reader is familiar with SCTP >>> basics. >> >> >> >> Greek is clear to a Greek as well. ;-) > > > Hmm but we are all Geeks here does that count :-D > >> >> Sec 3.5 and 3.6 are obscure, if the intent is to describe something >> that supports UDP-like semantics with TCp-like congestion control, >> such as is referred to in passing near the end of Sec 1 (item #3) of >> that RFC. >> >> As to the full set of reasons for which DCCP is preferable to PR-SCTP, >> see sec 3.3.2 of the DCCP problem statement ID (where PR-SCTP is >> referred to as U-SCTP). > > Hmm.. is this an expired draft? I searched the ietf drafts > and I also looked off the DCCP wg page and can't find > this draft... It's an expired draft on the DCCP web page at ICIR, which they have posted. Use Google on "DCCP" >> Note that item F2 in sec 3.5 of RFC 3758 also allows delaying >> outgoing chunks for aggregation; DCCP does not appear to do that (any >> DCCP experts like to chime in?)DCCP appers to correlate packets on the >> wire with application writes and reads; the same is not necessarily >> true with SCTP. There are substantial advantages to such correlation >> when tunneling network layer packets over transport protocols. I think >> that's what David Reed was referring to... >> > That is a base feature of SCTP as well.. even though its reflected > in RFC3758. In the sockets API we call it SCTP_NODELAY (same > as in TCP) and it is basically an implemenation of nagel. Of (Nagle? 
- it's a guy's last name, FYI ;-) > course it is also required that it be disableable... so if > one wants "no bundling" one turns SCTP_NODELAY off.. which > at least for the socket api in BSD is on by default. > > R AOK - the protocol says MAY, doesn't say "MAY, but MUST be disableable". Are you checking that sort of consistency in your bake-offs? Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050120/2ec6319a/signature.bin From touch at ISI.EDU Thu Jan 20 06:59:35 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 20 07:01:32 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EF8595.5040500@stewart.chicago.il.us> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <41EF047C.9070206@isi.edu> <41EF8595.5040500@stewart.chicago.il.us> Message-ID: <41EFC757.7020707@isi.edu> Randall Stewart wrote: > Joe Touch wrote: > >> >> >> RJ Atkinson wrote: >> >>> >>> Perhaps one of the paths forward is for folks who propose new >>> transport-layer protocols to also have an informational document >>> targeted at folks who build firewalls (or other middle boxes) to help >>> educate them on what the real risks are (and aren't) with the new >>> protocol and also to give them help on how to implement support for >>> that new protocol in their middle box... >> >> >> >> That presumes, IMO, that NAT designers _want_ to incorporate new >> protocols. > > > I think not.. its more demand that drives the process IMO or > as put in a move "show me the money" ... > >> >>> (My assumption here is that the big barrier is confusion/ignorance. 
:-) >> >> >> >> For many, as well as many customers, "all new protocols are more >> dangerous than current ones" - as confused/ignorant as that may be. >> Nevermind how complicated support for SCTP would need to be >> (multipath, multistream + NAT rewriting = ?). > > > Nope.. you DON'T need to rewrite NAT to do SCTP.. its a simple > set of changes.. Let's see. You rewrite your NAT to understand a new protocol number, where the ports might be, and how to rewrite DATA IN ITS BODY. How do you accomplish that without "doing SCTP"? > You just don't get multi-homing with NAT. But > if you need a NAT chances are you are not too interested in > multi-homing anyway. > > R Well, tell that to people behind multiple firewall NATs at companies that would like not to be susceptible to one going down. We have a VPN that goes through such NATs (using UDP) that supports multihoming and dynamic routing (which is what dynamic choice of a multihomed path is, IMO), based on a variant of the X-Bone. But then, you knew I preferred modular solutions based on existing protocols rather than rolling a vertical stack... Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050120/565a93e6/signature.bin From randall at stewart.chicago.il.us Thu Jan 20 07:11:21 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 07:14:01 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFC757.7020707@isi.edu> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <41EF047C.9070206@isi.edu> <41EF8595.5040500@stewart.chicago.il.us> <41EFC757.7020707@isi.edu> Message-ID: <41EFCA19.7080700@stewart.chicago.il.us> Joe Touch wrote: >> Nope.. 
you DON'T need to rewrite NAT to do SCTP.. its a simple >> set of changes.. > > > Let's see. You rewrite your NAT to understand a new protocol number, > where the ports might be, and how to rewrite DATA IN ITS BODY. How do > you accomplish that without "doing SCTP"? > Would you like me to send you the code? I have it done for FreeBSD.. have not went through extensive testing yet since I ran out of time and still have the f/w side to complete. As to "doing SCTP" NAT's don't do TCP.. they know about it.. where the ports are, what the c-sum is etc. Same for UDP and of course the same thing is needed for SCTP. You have to understand a "SYN" or an "INIT" but it is not as complex as you make out.. no more complex than having a NAT do TCP... >> You just don't get multi-homing with NAT. But >> if you need a NAT chances are you are not too interested in >> multi-homing anyway. >> >> R > > > Well, tell that to people behind multiple firewall NATs at companies > that would like not to be susceptible to one going down. We have a VPN > that goes through such NATs (using UDP) that supports multihoming and > dynamic routing (which is what dynamic choice of a multihomed path is, > IMO), based on a variant of the X-Bone. But then, you knew I preferred > modular solutions based on existing protocols rather than rolling a > vertical stack... > Well.. one could extend NAT in such a way to support your UDP or SCTPish type multi-homing.. but I have never been a proponent of such.. it gets ugly. And you end up with the same problem with TCP (assuming your earlier routing solution).. since you have two different NAT's and they need to share state to know what has been translated.. the problems are pretty much the same... assuming you of course are not using the same NAT for all networks (which would defeat the whole purpose of multiple networks ... aka no single point of failure since the NAT would be a big one)... so I think the same problem exists... NATs are just plain ugly... 
use them and you lose flexibility... unless you continue to hack an ugly thing :-D R -- Randall Stewart 803-345-0369 815-342-5222(cell) From randall at stewart.chicago.il.us Thu Jan 20 07:13:12 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 07:16:00 2005 Subject: [e2e] overlay over TCP In-Reply-To: <1106228089.4767.60.camel@lap10-c703.uibk.ac.at> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EC@hrtades7.atea.be> <1106228089.4767.60.camel@lap10-c703.uibk.ac.at> Message-ID: <41EFCA88.5080506@stewart.chicago.il.us> Michael Welzl wrote: >>Mathematical speaking , if you go from a protocol which can be described in >>10 pages to a protocol(UDP) that must be described on around 100 >>pages(DCCP), that is a pretty big jump in complexity, >>Then any application that want to use it, should at least expect to see some >>advantages from it.... > > > I think you're mixing arguments here. Internal protocol > complexity is not the main incentive / deployment problem. > I agree... in principle.. > e.g., TCP is a complex beast, but using it is not such > a big deal. When we talk about incentives to use a protocol, > we should talk about complexity that reveals itself towards > its users, i.e. API features that are seen by upper layers. And here I agree as well.. that's why we designed the socket API as we did.. you can use it for SCTP the same way you use it for TCP, with a few tweaks.. want more features, use more of the socket options and additional calls.. It's elegant and does not bury the API user... you learn about new features as you need to use them...
R > Cheers, > Michael > > > -- Randall Stewart 803-345-0369 815-342-5222(cell) From touch at ISI.EDU Thu Jan 20 08:27:44 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 20 08:30:15 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFCA19.7080700@stewart.chicago.il.us> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <41EF047C.9070206@isi.edu> <41EF8595.5040500@stewart.chicago.il.us> <41EFC757.7020707@isi.edu> <41EFCA19.7080700@stewart.chicago.il.us> Message-ID: <41EFDC00.2060002@isi.edu> Randall Stewart wrote: > Joe Touch wrote: > >>> Nope.. you DON'T need to rewrite NAT to do SCTP.. its a simple >>> set of changes.. >> >> >> >> Let's see. You rewrite your NAT to understand a new protocol number, >> where the ports might be, and how to rewrite DATA IN ITS BODY. How do >> you accomplish that without "doing SCTP"? >> > > Would you like me to send you the code? I have it > done for FreeBSD.. have not went through extensive testing > yet since I ran out of time and still have the f/w side > to complete. > > As to "doing SCTP" NAT's don't do TCP.. they know about > it.. where the ports are, what the c-sum is etc. And where the data is, which for TCP and DCCP isn't as tricky ;-) > Same for UDP and of course the same thing is needed > for SCTP. You have to understand a "SYN" or an "INIT" > but it is not as complex as you make out.. no more > complex than having a NAT do TCP... NATs translate data _inside_ the packets too; that's where 'knowing SCTP' is substantially more complex. >>> You just don't get multi-homing with NAT. But >>> if you need a NAT chances are you are not too interested in >>> multi-homing anyway. >>> >>> R >> >> Well, tell that to people behind multiple firewall NATs at companies >> that would like not to be susceptible to one going down. 
We have a VPN >> that goes through such NATs (using UDP) that supports multihoming and >> dynamic routing (which is what dynamic choice of a multihomed path is, >> IMO), based on a variant of the X-Bone. But then, you knew I preferred >> modular solutions based on existing protocols rather than rolling a >> vertical stack... >> > Well.. one could extend NAT in such a way to support your UDP or > SCTPish type multi-homing.. but I have never been a proponent of > such.. it gets ugly. And you end up with the same problem with > TCP (assuming your earlier routing solution).. since you have > two different NAT's and they need to share state to know > what has been translated.. the problems are pretty much the > same... assuming you of course are not using the same NAT > for all networks (which would defeat the whole purpose > of multiple networks ... aka no single point of failure > since the NAT would be a big one)... so I think the same > problem exists... NATs are just plain ugly... use them > and you loose flexibility... unless you continue to hack > an ugly thing :-D > > R Fair enough - enough NAT bashing today. Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050120/aff2c9ec/signature.bin From touch at ISI.EDU Thu Jan 20 08:27:50 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 20 08:30:21 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EF7FD4.1040108@stewart.chicago.il.us> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> <41EF7FD4.1040108@stewart.chicago.il.us> Message-ID: <41EFDC06.5030809@isi.edu> Randall Stewart wrote: > Joe Touch wrote: > >> >> >> Randall Stewart wrote: >> >>> Joe: >>> >>> A question and a comment.. >>> >> ... 
>> >>>> Recall that David Reed's initial post asked for: >>>> 1- TCP-friendliness >>>> 2- no app penalty for reliability or in-order delivery >>> >>> >>> >>> >>> I don't get why you answer the way you do on <2> for SCTP... >>> >>> What app penalty are you talking about for reliability or >>> in-order delivery... With SCTP you can have reliability, in-order >>> delivery or no-reliability and out-of-order delivery and any >>> combination. >> >> >> >> PR-SCTP is an extension to SCTP, which isn't as widely deployed as >> SCTP. I re-read the PR-SCTP spec a few times, and _still_ cannot >> figure out how to provide true unreliable, any-order delivery. That >> alone is a fine reason not to use PR-SCTP for this example. See below >> for other reasons... > > > Hmm. Now how I interpret the above is that you cannot > figure out how to write code for PR-SCTP that gives you > unreliability. If that is what you mean, I would not > look for a protocol spec to tell you how to interface > to an API. I look to a protocol spec to tell me what a protocol can do, and (hopefully) how to get it to do it - that might give me a clue as to how to, e.g., implement an API. > The PR-SCTP spec provides a base "service" aka timed-reliability > and a protocol mechanism. > > Its not meant to give you a how to use it. The 3rd edition > of UNP gives a much better view of how to use the socket > API and I belive (if I remember right) it discusses using > the lifetime field. > > Basically to get unreliable service (equiviant to udp) > all one has to do is set a lifetime parameter less than > the 1 second aka RTO.min. That's perfectly clear! Wait - no, maybe not. So there's no "unreliable" flag, ala the unordered flag? And what does a 'lifetime less than 1 second' mean, exactly - to send-side buffers, receive-side, etc? Will it stall other packets sent? Others received? If so, it's not UDP semantics per se... > If one does that then you will > have the same symantic as UDP. 
One can choose to > use a smaller time then RTO.min too I suppose but > if you are after a unreliable service that would > be how one would do it. You 'suppose'? It's not that simple, is it? > Some implementations actually > have a "SEND_ONLY_ONCE" like flag, this can also > be used as well, but of course thats an implemenation > specific thing. So the protocol MIGHT support UDP semantics, IF the API provides access to enough of SCTP's knobs that you can figure out how to set it? I'm not surprised I didn't notice _that_ in the spec... ... > Check the API document and UNP. As to reliability of implementation > and ease to use. > > a) Last interop we had about 15 stable and reliable implementations > of SCTP present reperesenting and running on all O/S's. > b) Most all of the implementations supported PR-SCTP, very few > did not. > c) a large number of implementations supported the sockets api > which is quite simple and easy to use and as I said has > quite detailed coverage in UNP 3rd edition (you should > get a copy). How many pages?? ;-) ... >> There are so many things in SCTP to turn off, it's impossible to >> consider a valid argument that SCTP is less complex than DCCP. > > I don't get this response. You obviously have not > used the SCTP API with sockets. Its quite easy to > use only one address.. you use a bind call. Its > quite easy to send with various options (check > the UNP or the socket api draft if you prefer a > draft). Its not complicated or hard. > > It took a two line code change to make mozilla > run over SCTP. Of course not all features were > used.. but not everyone needs to use all features > either. 
But as you hinted above, if what you want to do isn't a 'phrase' in the API, you have to figure out how to do it - and be _sure_ (no "suppose" involved ;-) I'll concede that there are more SCTP implementations than DCCP, even that DCCP doesn't have a standard API (though one would expect it would have open/close like TCP with defaults to TCP cong control and a read/write like TCP that preserves boundaries only - i.e., as simple as some of the defaults in SCTP's API you mention). But if the implementations of SCTP are anything like the specs, it's destined to be limited to the small communities that speak its particular dialect of obscura. Joe From randall at stewart.chicago.il.us Thu Jan 20 10:33:33 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 10:36:07 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFDC00.2060002@isi.edu> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <41EF047C.9070206@isi.edu> <41EF8595.5040500@stewart.chicago.il.us> <41EFC757.7020707@isi.edu> <41EFCA19.7080700@stewart.chicago.il.us> <41EFDC00.2060002@isi.edu> Message-ID: <41EFF97D.6010809@stewart.chicago.il.us> Joe Touch wrote: > > > Randall Stewart wrote: > >> Joe Touch wrote: >> >>>> Nope.. you DON'T need to rewrite NAT to do SCTP.. its a simple >>>> set of changes.. >>> >>> >>> >>> >>> Let's see. You rewrite your NAT to understand a new protocol number, >>> where the ports might be, and how to rewrite DATA IN ITS BODY. How do >>> you accomplish that without "doing SCTP"? >>> >> >> Would you like me to send you the code?
I have it >> done for FreeBSD.. have not went through extensive testing >> yet since I ran out of time and still have the f/w side >> to complete. >> >> As to "doing SCTP" NAT's don't do TCP.. they know about >> it.. where the ports are, what the c-sum is etc. > > > And where the data is, which for TCP and DCCP isn't as tricky ;-) There was no trick to it... one does not have to know where the data is since the header is just like TCP, just like UDP, just like DCCP. And all data (data and control) starts after the header.. no different than TCP.. except for one minor wrinkle.. I don't have to do the bit with pseudo headers... I have implemented this.. it's not hard; it is almost a verbatim clone of the TCP code.. except it was a few lines less :-D > >> Same for UDP and of course the same thing is needed >> for SCTP. You have to understand a "SYN" or an "INIT" >> but it is not as complex as you make out.. no more >> complex than having a NAT do TCP... > > > NATs translate data _inside_ the packets too; that's where 'knowing > SCTP' is substantially more complex. FTP, last I checked, does not run over SCTP.. and even if it did it would not be that tough to find the addresses etc... no different than knowing the data format of any other protocol... including TCP.. R -- Randall Stewart 803-345-0369 815-342-5222(cell) From touch at ISI.EDU Thu Jan 20 10:41:11 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 20 10:44:43 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFF97D.6010809@stewart.chicago.il.us> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <41EF047C.9070206@isi.edu> <41EF8595.5040500@stewart.chicago.il.us> <41EFC757.7020707@isi.edu> <41EFCA19.7080700@stewart.chicago.il.us> <41EFDC00.2060002@isi.edu> <41EFF97D.6010809@stewart.chicago.il.us> Message-ID: <41EFFB47.6010607@isi.edu> Randall Stewart wrote: ....
>>> As to "doing SCTP" NAT's don't do TCP.. they know about >>> it.. where the ports are, what the c-sum is etc. >> >> And where the data is, which for TCP and DCCP isn't as tricky ;-) > > There was no trick to it... one does not have to > know where the data is since the header is > just like TCP, just like UDP, just like DCCP. > > And all data (data and control) start after the > header.. no different than TCP.. except for one > minor rinkle.. I don't have to do the bit with > psuedo headers... Yeah, but you do have to deal with the SCTP muxing headers. Don't forget, you have to scan certain protocols to translate IP addresses and port numbers in the data too. That means parsing the data inside the muxing chunks. (see below) >>> Same for UDP and of course the same thing is needed >>> for SCTP. You have to understand a "SYN" or an "INIT" >>> but it is not as complex as you make out.. no more >>> complex than having a NAT do TCP... >> >> NATs translate data _inside_ the packets too; that's where 'knowing >> SCTP' is substantially more complex. > > FTP, last I checked, does not run over SCTP.. and even > if it did it would not be that tough to find the addresses > etc... no different than knowing the data format of > any other protocol... including TCP.. Does HTTP? Will either FTP or HTTP? Any other protocols? Doing "NAT" _means_ translating inner addresses used in the data. If you're ignoring that, sure, it's certainly easy. But that's not what makes NATs hard or complex. Joe -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050120/2170db2b/signature.bin From randall at stewart.chicago.il.us Thu Jan 20 10:43:36 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 10:46:18 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFDC06.5030809@isi.edu> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> <41EF7FD4.1040108@stewart.chicago.il.us> <41EFDC06.5030809@isi.edu> Message-ID: <41EFFBD8.6030405@stewart.chicago.il.us> Joe Touch wrote: > > > Randall Stewart wrote: > >> Joe Touch wrote: >> >>> >>> >>> Randall Stewart wrote: >>> >>>> Joe: >>>> >>>> A question and a comment.. >>>> >>> ... >>> >>>>> Recall that David Reed's initial post asked for: >>>>> 1- TCP-friendliness >>>>> 2- no app penalty for reliability or in-order delivery >>>> >>>> >>>> >>>> >>>> >>>> I don't get why you answer the way you do on <2> for SCTP... >>>> >>>> What app penalty are you talking about for reliability or >>>> in-order delivery... With SCTP you can have reliability, in-order >>>> delivery or no-reliability and out-of-order delivery and any >>>> combination. >>> >>> >>> >>> >>> PR-SCTP is an extension to SCTP, which isn't as widely deployed as >>> SCTP. I re-read the PR-SCTP spec a few times, and _still_ cannot >>> figure out how to provide true unreliable, any-order delivery. That >>> alone is a fine reason not to use PR-SCTP for this example. See below >>> for other reasons... >> >> >> >> Hmm. Now how I interpret the above is that you cannot >> figure out how to write code for PR-SCTP that gives you >> unreliability. If that is what you mean, I would not >> look for a protocol spec to tell you how to interface >> to an API. 
> > > I look to a protocol spec to tell me what a protocol can do, and > (hopefully) how to get it to do it - that might give me a clue as to how > to, e.g., implement an API. > >> The PR-SCTP spec provides a base "service" aka timed-reliability >> and a protocol mechanism. >> >> Its not meant to give you a how to use it. The 3rd edition >> of UNP gives a much better view of how to use the socket >> API and I belive (if I remember right) it discusses using >> the lifetime field. >> >> Basically to get unreliable service (equiviant to udp) >> all one has to do is set a lifetime parameter less than >> the 1 second aka RTO.min. > > > That's perfectly clear! Wait - no, maybe not. So there's no "unreliable" > flag, ala the unordered flag? > > And what does a 'lifetime less than 1 second' mean, exactly - to > send-side buffers, receive-side, etc? Will it stall other packets sent? > Others received? If so, it's not UDP semantics per se... It does not mean anything.. since the first time they will get re-examined is when the timeout goes off.. which is about 1 second later. Yes, things are not going to get discarded as quickly with PR-SCTP as with UDP.. but then if you funnel information to UDP to fast you get drops that never hit the wire .. funny thats hard to find in any spec too. I have found, in all the app work I have done, far more useful API documents such as UNP... and thats because they give you the things the specs were not meant to give IMO.. > >> If one does that then you will >> have the same symantic as UDP. One can choose to >> use a smaller time then RTO.min too I suppose but >> if you are after a unreliable service that would >> be how one would do it. > > > You 'suppose'? It's not that simple, is it? > >> Some implementations actually >> have a "SEND_ONLY_ONCE" like flag, this can also >> be used as well, but of course thats an implemenation >> specific thing. 
> > > So the protocol MIGHT support UDP semantics, IF the API provides access > to enough of SCTP's knobs that you can figure out how to set it? I'm not > surprised I didn't notice _that_ in the spec... > > ... > >> Check the API document and UNP. As to reliability of implementation >> and ease to use. >> >> a) Last interop we had about 15 stable and reliable implementations >> of SCTP present reperesenting and running on all O/S's. >> b) Most all of the implementations supported PR-SCTP, very few >> did not. >> c) a large number of implementations supported the sockets api >> which is quite simple and easy to use and as I said has >> quite detailed coverage in UNP 3rd edition (you should >> get a copy). > > > How many pages?? ;-) There are three full chapters in UNP on SCTP.. and 2 other chapters that had considerable addition.. > ... > >>> There are so many things in SCTP to turn off, it's impossible to >>> consider a valid argument that SCTP is less complex than DCCP. >> >> >> I don't get this response. You obviously have not >> used the SCTP API with sockets. Its quite easy to >> use only one address.. you use a bind call. Its >> quite easy to send with various options (check >> the UNP or the socket api draft if you prefer a >> draft). Its not complicated or hard. >> >> It took a two line code change to make mozilla >> run over SCTP. Of course not all features were >> used.. but not everyone needs to use all features >> either. > > > But as you hinted above, it what you want to do isn't a 'phrase' in the > API, you have to figure out how to do it - and be _sure_ (no "suppose" > involved ;-) And AFAIKT it was something like the UNP that got everyone to the point where they could write network code.. not RFC's or drafts... TCP is a fine spec.. but it was the things Mr. Stevens put out that got people the know how to write to the API.. You need a manual.. just like a dictionary... when you start coding to TCP, UDP or any other network protocol.. 
If you want I will bring you one of my copies of the UNP 3rd edition.. you might find it filling in the missing information.. just like it did for TCP when it was the 1st edition... R -- Randall Stewart 803-345-0369 815-342-5222(cell) From me at armandocaro.net Thu Jan 20 10:49:51 2005 From: me at armandocaro.net (Armando L. Caro, Jr.) Date: Thu Jan 20 10:50:04 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFDC06.5030809@isi.edu> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> <41EF7FD4.1040108@stewart.chicago.il.us> <41EFDC06.5030809@isi.edu> Message-ID: On Thu, 20 Jan 2005, Joe Touch wrote: > > Basically to get unreliable service (equiviant to udp) > > all one has to do is set a lifetime parameter less than > > the 1 second aka RTO.min. > > That's perfectly clear! Wait - no, maybe not. So there's no "unreliable" > flag, ala the unordered flag? > > And what does a 'lifetime less than 1 second' mean, exactly - to > send-side buffers, receive-side, etc? Will it stall other packets sent? > Others received? If so, it's not UDP semantics per se... Well, PR-SCTP was not exactly meant to match UDP semantics. It was meant to offer a richer feature set. It can provide UDP-like reliability service (ie, unreliable), but it can also provide a range of reliability between UDP and TCP/SCTP. You cannot have a simple unreliable flag to provide _that_ service. You _do_ have a partially reliable flag (MSG_PR_SCTP_TTL) and a corresponding lifetime parameter whose units are in milliseconds. > > If one does that then you will > > have the same symantic as UDP. One can choose to > > use a smaller time then RTO.min too I suppose but > > if you are after a unreliable service that would > > be how one would do it. > > You 'suppose'? It's not that simple, is it? Sure it is. You can put a 1 in the lifetime parameter to get a 1ms lifetime for a data chunk. 
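As a rough illustration of the lifetime idea discussed above, here is a toy Python model of the sender-side decision. It is only a sketch, not the SCTP sockets API: the names Chunk, RTO_MIN_MS and on_retransmit_timeout are hypothetical, the 1000 ms value is simply the thread's figure for RTO.min, and treating a lifetime of 0 as "fully reliable" is a modelling assumption.

```python
from dataclasses import dataclass

# Assumed value, taken from the thread's "1 second aka RTO.min".
RTO_MIN_MS = 1000


@dataclass
class Chunk:
    """Hypothetical stand-in for an unacknowledged data chunk on the sender."""
    payload: bytes
    lifetime_ms: int      # 0 means "no lifetime limit" in this toy model
    sent_at_ms: int = 0   # time of the first transmission


def on_retransmit_timeout(chunk: Chunk, now_ms: int) -> str:
    """Decide what to do with an unacked chunk when the retransmit timer fires.

    A chunk whose lifetime has already expired is abandoned instead of being
    retransmitted; everything else is retransmitted as usual.
    """
    expired = chunk.lifetime_ms > 0 and now_ms - chunk.sent_at_ms >= chunk.lifetime_ms
    return "abandon" if expired else "retransmit"


if __name__ == "__main__":
    # Lifetime far below the first timer expiry: sent once, never retransmitted.
    print(on_retransmit_timeout(Chunk(b"voice frame", lifetime_ms=1), RTO_MIN_MS))
    # No lifetime limit: behaves like ordinary reliable delivery.
    print(on_retransmit_timeout(Chunk(b"file block", lifetime_ms=0), RTO_MIN_MS))
```

In this model any lifetime shorter than the first timer expiry gives send-once behaviour, while longer lifetimes fall somewhere between that and full reliability.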
UNP's Section 9.9 explains in one paragraph how to use the sctp_sendmsg to get partially reliable service. > > Some implementations actually > > have a "SEND_ONLY_ONCE" like flag, this can also > > be used as well, but of course thats an implemenation > > specific thing. > > So the protocol MIGHT support UDP semantics, IF the API provides access > to enough of SCTP's knobs that you can figure out how to set it? I'm not > surprised I didn't notice _that_ in the spec... PR-SCTP _does_ provide unreliable service. I don't think it is unreasonable for the application developer to read a man page to determine how to use an API to get unreliable service from PR-SCTP. ~armando 0-- --0 | Armando L. Caro, Jr. | Protocol Engineering Lab | | www.armandocaro.net | University of Delaware | 0-- --0 From randall at stewart.chicago.il.us Thu Jan 20 10:47:39 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 20 10:50:06 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFFB47.6010607@isi.edu> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <41EF047C.9070206@isi.edu> <41EF8595.5040500@stewart.chicago.il.us> <41EFC757.7020707@isi.edu> <41EFCA19.7080700@stewart.chicago.il.us> <41EFDC00.2060002@isi.edu> <41EFF97D.6010809@stewart.chicago.il.us> <41EFFB47.6010607@isi.edu> Message-ID: <41EFFCCB.2070007@stewart.chicago.il.us> Joe Touch wrote: > > > Randall Stewart wrote: > .... > >>>> As to "doing SCTP" NAT's don't do TCP.. they know about >>>> it.. where the ports are, what the c-sum is etc. >>> >>> >>> And where the data is, which for TCP and DCCP isn't as tricky ;-) >> >> >> There was no trick to it... one does not have to >> know where the data is since the header is >> just like TCP, just like UDP, just like DCCP. >> >> And all data (data and control) start after the >> header.. no different than TCP.. 
except for one
>> minor wrinkle.. I don't have to do the bit with
>> pseudo headers...
>
>
> Yeah, but you do have to deal with the SCTP muxing headers. Don't
> forget, you have to scan certain protocols to translate IP addresses and
> port numbers in the data too. That means parsing the data inside the
> muxing chunks. (see below)
>
>>>> Same for UDP and of course the same thing is needed
>>>> for SCTP. You have to understand a "SYN" or an "INIT"
>>>> but it is not as complex as you make out.. no more
>>>> complex than having a NAT do TCP...
>>>
>>>
>>> NATs translate data _inside_ the packets too; that's where 'knowing
>>> SCTP' is substantially more complex.
>>
>>
>> FTP, last I checked, does not run over SCTP.. and even
>> if it did it would not be that tough to find the addresses
>> etc... no different than knowing the data format of
>> any other protocol... including TCP..
>
>
> Does HTTP? Will either FTP or HTTP? Any other protocols?

If HTTP or FTP ever do move to SCTP the logical thing
to do will be to get rid of the silly IP addresses and ports
inside the data stream. Instead they can use streams to
accomplish the same thing .. and get better performance as
well.

The reason they do the "open another connection" thing is to
get around some of the very things that SCTP provides pathways
around, aka head-of-line blocking.

I would hope anyone so wise as to move to SCTP would not make the
same mistakes.. and instead use the protocol mechanisms that
benefit things..

Take a look at:

http://pel.cis.udel.edu/

They have a really interesting paper that compared using the
multi-streaming feature for FTP. They gained a LOT of performance
by doing this instead of starting parallel connections.. exactly
what you don't want to do..

R

>
> Doing "NAT" _means_ translating inner addresses used in the data. If
> you're ignoring that, sure, it's certainly easy. But that's not what
> makes NATs hard or complex.
> > Joe -- Randall Stewart 803-345-0369 815-342-5222(cell) From iyengar at mail.eecis.udel.edu Thu Jan 20 11:04:24 2005 From: iyengar at mail.eecis.udel.edu (Janardhan Iyengar) Date: Thu Jan 20 11:06:01 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFDC06.5030809@isi.edu> References: <57FD2C3A246F76438CA6FDAD8FE9F195A7F5EB@hrtades7.atea.be> <41EEBFC0.6030901@isi.edu> <41EEE86A.5060500@stewart.chicago.il.us> <41EEEE82.10500@isi.edu> <41EF7FD4.1040108@stewart.chicago.il.us> <41EFDC06.5030809@isi.edu> Message-ID: Joe, > That's perfectly clear! Wait - no, maybe not. So there's no "unreliable" > flag, ala the unordered flag? Not every feature comes with a flag. Other forms are also acceptable to me. PR-SCTP offers partial reliability. If I am an application programmer who need unreliable service, IMHO I don't have to be a genius to figure out how to get it from PR-SCTP. > But if the implementations of SCTP are anything like the specs, it's > destined to be limited to the small communities that speak it's > particular dialect of obscura. "destined"? "dialect of obscura"?? I guess I do belong to this "small community" since I understand the SCTP spec. SCTP was one of the first RFCs I read as a grad student, and I'm pretty sure I understood it. I am also pretty sure most folks who implement standards are well versed in the language of the spec ("dialect of obscura"), and would be able to understand it. After all, they're the ones the spec is written for, isn't it? regards, jana --------------------------------------------------------------- Janardhan R. 
Iyengar http://www.cis.udel.edu/~iyengar Protocol Engineering Lab -- CIS -- University Of Delaware --------------------------------------------------------------- From touch at ISI.EDU Thu Jan 20 11:16:49 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 20 11:18:52 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41EFFCCB.2070007@stewart.chicago.il.us> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <41EF047C.9070206@isi.edu> <41EF8595.5040500@stewart.chicago.il.us> <41EFC757.7020707@isi.edu> <41EFCA19.7080700@stewart.chicago.il.us> <41EFDC00.2060002@isi.edu> <41EFF97D.6010809@stewart.chicago.il.us> <41EFFB47.6010607@isi.edu> <41EFFCCB.2070007@stewart.chicago.il.us> Message-ID: <41F003A1.7050909@isi.edu> Randall Stewart wrote: > Joe Touch wrote: > >> >> >> Randall Stewart wrote: >> .... >> >>>>> As to "doing SCTP" NAT's don't do TCP.. they know about >>>>> it.. where the ports are, what the c-sum is etc. >>>> >>>> >>>> >>>> And where the data is, which for TCP and DCCP isn't as tricky ;-) >>> >>> >>> >>> There was no trick to it... one does not have to >>> know where the data is since the header is >>> just like TCP, just like UDP, just like DCCP. >>> >>> And all data (data and control) start after the >>> header.. no different than TCP.. except for one >>> minor rinkle.. I don't have to do the bit with >>> psuedo headers... >> >> >> >> Yeah, but you do have to deal with the SCTP muxing headers. Don't >> forget, you have to scan certain protocols to translate IP addresses >> and port numbers in the data too. That means parsing the data inside >> the muxing chunks. (see below) >> >>>>> Same for UDP and of course the same thing is needed >>>>> for SCTP. You have to understand a "SYN" or an "INIT" >>>>> but it is not as complex as you make out.. no more >>>>> complex than having a NAT do TCP... 
>>>> >>>> >>>> >>>> NATs translate data _inside_ the packets too; that's where 'knowing >>>> SCTP' is substantially more complex. >>> >>> >>> >>> FTP, last I checked, does not run over SCTP.. and even >>> if it did it would not be that tough to find the addresses >>> etc... no different than knowing the data format of >>> any other protocol... including TCP.. >> >> >> >> Does HTTP? Will either FTP or HTTP? Any other protocols? > > > If HTTP or FTP ever do move to SCTP the logical thing > to do will be to get rid of the silly IP addresses and ports > inside the data stream. Instead they can use stream's to > accomplish the same thing .. and get better performance as > well. > > The reason they do the "open another connection" thing is to > get around some of the very things that SCTP provides pathway's > around aka head-of-line blocking. They opened another connection for other reasons too: - to be able to signal "EOF" - to be able to force things to a third-party port (to PUT to a lpr, e.g.) The way the PORT command is spec'd, it almost looks like you could initiate a transfer remotely from a separate machine (A tells B to send a file to C) too. (anyone know whether that's possible? ever implemented?) --- You might map it over, or not. HTTP, in particular, uses DNS names and IP address inside the stream. NATs want to translate them - even if we don't approve. Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050120/6d2bca9b/signature.bin From amer at cis.udel.edu Thu Jan 20 13:51:09 2005 From: amer at cis.udel.edu (Paul D. Amer) Date: Thu Jan 20 13:52:01 2005 Subject: Fw: [e2e] overlay over TCP Message-ID: <046e01c4ff3a$26aacb90$3e850480@AMERPC> > >> FTP, last I checked, does not run over SCTP.. 
and even
> >> if it did it would not be that tough to find the addresses
> >> etc... no different than knowing the data format of
> >> any other protocol... including TCP..

FTP can run over SCTP, and in fact can perform better
than FTP running over TCP. See

S. Ladha, P. Amer
Improving multiple file transfers using SCTP multistreaming
IPCCC '04, Phoenix, AZ, 4/04
http://www.cis.udel.edu/~amer/PEL/poc/pdf/IPCCC2004-ladha.FTP.over.SCTP.pdf

From dhc2 at dcrocker.net Thu Jan 20 15:43:21 2005
From: dhc2 at dcrocker.net (Dave Crocker)
Date: Thu Jan 20 15:44:01 2005
Subject: Fw: [e2e] overlay over TCP
In-Reply-To: <046e01c4ff3a$26aacb90$3e850480@AMERPC>
Message-ID: <2005120154321.377169@bbprime>

On Thu, 20 Jan 2005 16:51:09 -0500, Paul D. Amer wrote:
> FTP can run over SCTP, and in fact can perform better
> than FTP running over TCP.

"Perform better" has some significant pre-conditions, here.

The savings of the work you cite are primarily in transition latencies
(from connection setup and command lock-step performance).

For a single transfer of a large file, this enhancement is irrelevant.
The data transfer portion washes out any command or transition overhead.

For a single transfer of a small file, this enhancement is irrelevant.
Everything is so quick, making the control and transition mechanism
quicker doesn't really make any difference.

In fact the savings are only interesting in the case of having a large
number of small files to transfer. In this case, the "control" overhead
will tend to dominate, so that optimizing it is indeed a Good Thing.

d/
--
Dave Crocker
Brandenburg InternetWorking
+1.408.246.8253
dcrocker a t ...
WE'VE MOVED to: www.bbiw.net

From cannara at attglobal.net Thu Jan 20 22:43:03 2005
From: cannara at attglobal.net (Alex Cannara)
Date: Thu Jan 20 23:22:32 2005
Subject: Fw: [e2e] overlay over TCP
In-Reply-To: <2005120154321.377169@bbprime>
References: <2005120154321.377169@bbprime>
Message-ID: <41F0A477.4020605@attglobal.net>

As the archives for this list should show, there's more to typical TCPs'
behaviors than most people think. And, very few folks have done much to
examine real TCP flows in real situations. TCP has an Achilles' heel
with some interesting performance facets, which real-world
experimentation quickly reveals. The imagined benefits of various
'sophisticated' TCP algorithms pale when the simplest of things happen
on a real link, such as loss, or even just the sending of an odd number
of payloads per application block. One test for anyone who claims TCP
expertise is to have that person estimate the effect of 1% packet loss
on time to transfer a sizeable file. We can then move on to an estimate
of the negative effects of Delayed Ack, etc. In other words, stopping
significant Internet protocol development over 20 years ago was a
mistake we pay for every day in every business now. But hey, we've
settled for lots of other mediocrities.

Alex

Dave Crocker wrote:
> On Thu, 20 Jan 2005 16:51:09 -0500, Paul D. Amer wrote:
>
>> FTP can run over SCTP, and in fact can perform better
>> than FTP running over TCP.
>
>
> "Perform better" has some significant pre-conditions, here.
>
> The savings of the work you cite are primarily in transition latencies
> (from connection setup and command lock-step performance.)
>
> For a single transfer of a large file, this enhancement is irrelevant.
> The data transfer portion washes out any command or transition overhead.
>
> For a single transfer of a small file, this enhancement is irrelevant.
> Everything is so quick, making the control and transition mechanism
> quicker doesn't really make any difference.
>
> In fact the savings are only interesting in the case of having a large
> number of small files to transfer. In this case, the "control" overhead
> will tend to dominate, so that optimizing it is indeed a Good Thing.
>
>
> d/
> --
> Dave Crocker
> Brandenburg InternetWorking
> +1.408.246.8253
> dcrocker a t ...
> WE'VE MOVED to: www.bbiw.net
>
>

From Jon.Crowcroft at cl.cam.ac.uk Fri Jan 21 01:21:27 2005
From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft)
Date: Fri Jan 21 01:22:03 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
Message-ID:

this is mainly for educational reasons, not research:

so i am looking for any papers or dissertations about the typical
behaviour of TCP on a LAN - I cannot find anything that doesn't include
some intermediate device which is a bottleneck, but I'd love to see a
set of traces/analyses of a few of today's typical TCP implementations
(let's say win98, XP, OSX, bsd, linux and some commercial unix server
ones) between typical clients and servers on 10/100 (perhaps gigE)...

it's sort of boring and i guess hard to get published but it's quite
hard to explain and is the base case when starting to teach TCP (and i
know there's lots of "corner case" code which means that MTU and window
choices may be different ...)

pointers appreciated...

j.

From randall at stewart.chicago.il.us Fri Jan 21 04:45:30 2005
From: randall at stewart.chicago.il.us (Randall Stewart)
Date: Fri Jan 21 04:48:10 2005
Subject: Fw: [e2e] overlay over TCP
In-Reply-To: <41F0A477.4020605@attglobal.net>
References: <2005120154321.377169@bbprime> <41F0A477.4020605@attglobal.net>
Message-ID: <41F0F96A.2050407@stewart.chicago.il.us>

Alex:

You raise interesting points... and I agree.. when you
throw in a 1% loss or higher interesting things in real
networks begin to emerge..

It's hard to get a 1% loss on the Big I and test in any sort
of reliable manner ...
but I have been able to set up (in some
of the earlier work I was doing with Cisco's RBSCP) a
pretend satellite network with a modified dummynet (it could
not provide the packet buffering needed for a 550ms rtt and
2Mbit per second links... so I had to add buffering).

I ran roughly 25 passes of multiple sizes using TCP and SCTP
plain (and let me tell you this was a week long project) and
then I did the same for RBSCP technology in place... and I ran
the error rates from 0 - 6% ..

I attach a .ps file I have for just the plain satellite network
0-6% error rates.. I won't bore everyone (unless requested otherwise)
with the comparison with RBSCP in place.

All machines were FreeBSD 4.x using what comes stock in KAME for
TCP and SCTP.

Interesting numbers

I have a whole range of transfer sizes... but I thought these
two would be most interesting...

Network config was something like:

Host-A     R-A   D-Net   R-Z  Host-Z
+==========+======+======+======+

With the RTT between R-A and R-Z being around 550ms.

Error rate was varied 0, 1, 2, 3, 4, 5 and 6 %

Fun stuff this :-D

R

Alex Cannara wrote:
> As the archives for this list should show, there's more to typical TCPs'
> behaviors than most people think. And, very few folks have done much to
> examine real TCP flows in real situations. TCP has an Achilles' heel
> with some interesting performance facets, which real-world
> experimentation quickly reveals. The imagined benefits of various
> 'sophisticated' TCP algorithms pale when the simplest of things happen
> on a real link, such as loss, or even just the sending of an odd number
> of payloads per application block. One test for anyone who claims TCP
> expertise is to have that person estimate the effect of 1% packet loss
> on time to transfer a sizeable file. We can then move on to an estimate
> of the negative effects of Delayed Ack, etc. In other words, stopping
> significant Internet protocol development over 20 years ago was a
> mistake we pay for every day in every business now.
But hey, we've > settled for lots of other mediocrities. > > Alex > > Dave Crocker wrote: > >> On Thu, 20 Jan 2005 16:51:09 -0500, Paul D. Amer wrote: >> >>> FTP can run over SCTP, and in fact can perform better >>> than FTP running over TCP. >> >> >> >> "Perform better" has some significant pre-conditions, here. >> >> The savings of the work you cite are primarily in transition latencies >> (from connection setup and command lock-step performance.) >> >> For a single transfer of a large file, this enhancement is >> irrelevant. The data transfer portion washes out any command or >> transition overhead. >> >> For a single transfer of a small file, this enhancement is irrelevant. >> Everything is so quick, making the control and transition mechanism >> quicker doesn't really make any difference. >> >> In fact the savings are only interesting in the case of having a large >> number of small files to transfer. In this case, the "control" >> overhead will tend to dominate, so that optimizing it is indeed a Good >> Thing. >> >> >> d/ >> -- >> Dave Crocker >> Brandenburg InternetWorking >> +1.408.246.8253 >> dcrocker a t ... >> WE'VE MOVED to: www.bbiw.net >> >> > > > > > > -- Randall Stewart 803-345-0369 815-342-5222(cell) -------------- next part -------------- A non-text attachment was scrubbed... Name: base.100000.ps Type: application/postscript Size: 13523 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050121/85bf7dc0/base.100000-0001.ps -------------- next part -------------- A non-text attachment was scrubbed... Name: base.10000000.ps Type: application/postscript Size: 13600 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050121/85bf7dc0/base.10000000-0001.ps From craig at aland.bbn.com Fri Jan 21 06:22:01 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Fri Jan 21 06:24:01 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? 
In-Reply-To: Your message of "Fri, 21 Jan 2005 09:21:27 GMT."
Message-ID: <20050121142201.7CCC421A@aland.bbn.com>

Hi Jon:

All the studies I can think of were done in the early 1990s and focused on
proving the LAN technology rather than testing TCP (with the exception
of Dave Borman's "whoops, TCP went a gigabit over HIPPI" work). I'd dig
in the IETF archives for Dave's talk. I seem to recall Raj Jain did
a couple of studies on FDDI around the same time (that showed the TTRT
was a key parameter).

Craig

In message , Jon Crowcroft writes:

>this is mainly for educational reasons, not research:
>
>so i am looking for any papers or dissertations about the typical
>behaviour of TCP on a LAN - I cannot find anything that doesn't include
>some intermediate device which is a bottleneck, but I'd love to see a
>set of traces/analyses of a few of today's typical TCP implementations
>(let's say win98, XP, OSX, bsd, linux and some commercial unix server
>ones) between typical clients and servers on 10/100 (perhaps gigE)...
>
>it's sort of boring and i guess hard to get published but it's quite
>hard to explain and is the base case when starting to teach TCP (and i
>know there's lots of "corner case" code which means that MTU and window
>choices may be different ...)
>
>pointers appreciated...
>
>j.

From Jon.Crowcroft at cl.cam.ac.uk Fri Jan 21 06:34:32 2005
From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft)
Date: Fri Jan 21 06:36:20 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
In-Reply-To: Message from Craig Partridge of "Fri, 21 Jan 2005 09:22:01 EST." <20050121142201.7CCC421A@aland.bbn.com>
Message-ID:

right - once people got over the argument "TCP is a heavyweight protocol"
everyone used it to find the limit on Their Favourite LAN - even wide
area, most of the landspeed record stuff (e.g. ipv6 6Gbps coast2coast)
is TCP based

but what I am after is a description of the normal dynamics of TCP on
normal configs ranging from shared to switched 10 to 100 ether
environments - how do the window and tx/rx packet rate vary over time
over a set of flows in a typical site ?

how fair/efficient is TCP in normal operation when there's no router or
buffer in an intermediate node (yes i know some switches have more than
1 packet buffers but ignore those)

In missive <20050121142201.7CCC421A@aland.bbn.com>, Craig Partridge typed:

>>All the studies I can think of were done in the early 1990s and focused on
>>proving the LAN technology rather than testing TCP (with the exception
>>of Dave Borman's "whoops, TCP went a gigabit over HIPPI" work). I'd dig
>>in the IETF archives for Dave's talk. I seem to recall Raj Jain did
>>a couple of studies on FDDI around the same time (that showed the TTRT
>>was a key parameter).
>>
>>Craig
>>
>>In message , Jon Crowcroft writes:
>>
>>>this is mainly for educational reasons, not research:
>>>
>>>so i am looking for any papers or dissertations about the typical
>>>behaviour of TCP on a LAN - I cannot find anything that doesn't include
>>>some intermediate device which is a bottleneck, but I'd love to see a
>>>set of traces/analyses of a few of today's typical TCP implementations
>>>(let's say win98, XP, OSX, bsd, linux and some commercial unix server
>>>ones) between typical clients and servers on 10/100 (perhaps gigE)...
>>>
>>>it's sort of boring and i guess hard to get published but it's quite
>>>hard to explain and is the base case when starting to teach TCP (and i
>>>know there's lots of "corner case" code which means that MTU and window
>>>choices may be different ...)
>>>
>>>pointers appreciated...
>>>
>>>j.

cheers

jon

From dpreed at reed.com Fri Jan 21 08:28:17 2005
From: dpreed at reed.com (David P. Reed)
Date: Fri Jan 21 08:30:10 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
In-Reply-To: References: Message-ID: <41F12DA1.90403@reed.com> Jon - not meaning to distract from your main goal, I find I am very interested in some of this stuff for research purposes, not just teaching. Proxicommunications (is to LAN as telecommunications is to WAN) has always been a poor stepchild in many communications theory and algorithm thinking, but its importance seems to be growing faster than the importance of telecommunications. So additional research on behavior of traffic that involves a mix of proxi- and tele- traffic over LANs is probably well worth carrying out. From dpreed at reed.com Fri Jan 21 08:56:00 2005 From: dpreed at reed.com (David P. Reed) Date: Fri Jan 21 08:58:01 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41F0F96A.2050407@stewart.chicago.il.us> References: <2005120154321.377169@bbprime> <41F0A477.4020605@attglobal.net> <41F0F96A.2050407@stewart.chicago.il.us> Message-ID: <41F13420.3030608@reed.com> On a side channel, I just received this URL, which seems quite relevant (going NUTSS a bit further): http://p2psip.org From dwing at cisco.com Fri Jan 21 09:03:29 2005 From: dwing at cisco.com (Dan Wing) Date: Fri Jan 21 09:04:30 2005 Subject: [e2e] overlay over TCP In-Reply-To: <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> Message-ID: <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> On Jan 19, 2005, at 3:31 PM, RJ Atkinson wrote: > > Perhaps one of the paths forward is for folks who propose new > transport-layer > protocols to also have an informational document targeted at folks who > build > firewalls (or other middle boxes) to help educate them on what the > real risks > are (and aren't) with the new protocol and also to give them help on > how > to implement support for that new protocol in their middle box... 
The IETF BEHAVE working group would be a good home for such work. It is currently chartered to provide guidance for NATs handling UDP and TCP. Its charter could be expanded to other protocols, or individual submissions could follow a framework similar to BEHAVE's current documents. > For example, with SCTP, one of the things that could help would be > specific > openly published information on efficiently re-calculating the SCTP > checksum > after a NAT has done its work, for example. Many folks know how to do > this > with a Fletcher checksum (often because they've looked at BSDish code), > but not so many know how to do it with SCTP's new checksum. > > (My assumption here is that the big barrier is confusion/ignorance. :-) Yes, combined with little market demand, as yet, for a NAT to handle SCTP. -d From falk at ISI.EDU Fri Jan 21 10:04:11 2005 From: falk at ISI.EDU (Aaron Falk) Date: Fri Jan 21 10:06:01 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? In-Reply-To: References: <20050121142201.7CCC421A@aland.bbn.com> Message-ID: <20050121180410.GA6946@isi.edu> Jon Crowcroft wrote: > > how fair/efficient is TCP in normal operation when there's no router > or buffer in an intermediate node (yes i knoiw some switches have > more than 1 packet buffers but ignore those) I seem to recall Matt Mathis talking about this at IETF around 5 years ago in the context of MAC 'capture effects'. --aaron From Jon.Crowcroft at cl.cam.ac.uk Fri Jan 21 10:38:41 2005 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Fri Jan 21 10:40:05 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? In-Reply-To: Message from Aaron Falk of "Fri, 21 Jan 2005 10:04:11 PST." <20050121180410.GA6946@isi.edu> Message-ID: um, yes i recall that - any pointers to papers/web page/results appreciated. 
We are a bit like Sigmund Freud :- we base an entire "science" on pathological behaviour, but most TCP connections are not congested, just like most people aren't crazy (or tell awful austrian 19th century jokes:) how many papers written on bottleneck behaviour and how few on whats normal?- what we need is the Alex Comfort of TCP, so to speak... hmm - i foresee a whole new conference on packet loss and the unconscious and beyond the AIMD principle and endless arguments about IP SLA archetypes and TCP mandalas :-) In missive <20050121180410.GA6946@isi.edu>, Aaron Falk typed: >>Jon Crowcroft wrote: >>> >>> how fair/efficient is TCP in normal operation when there's no router >>> or buffer in an intermediate node (yes i knoiw some switches have >>> more than 1 packet buffers but ignore those) >> >>I seem to recall Matt Mathis talking about this at IETF around 5 years >>ago in the context of MAC 'capture effects'. >> >>--aaron cheers jon From mathis at psc.edu Fri Jan 21 11:29:37 2005 From: mathis at psc.edu (Matt Mathis) Date: Fri Jan 21 11:30:07 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? In-Reply-To: <20050121180410.GA6946@isi.edu> References: <20050121142201.7CCC421A@aland.bbn.com> <20050121180410.GA6946@isi.edu> Message-ID: On Fri, 21 Jan 2005, Aaron Falk wrote: > Jon Crowcroft wrote: > > > > how fair/efficient is TCP in normal operation when there's no router > > or buffer in an intermediate node (yes i knoiw some switches have > > more than 1 packet buffers but ignore those) > > I seem to recall Matt Mathis talking about this at IETF around 5 years > ago in the context of MAC 'capture effects'. > > --aaron Yea, I looked at it, but in the end I decided that the problem is a rathole, and never really finished anything. Here are some "hints". There is a fundamental clash between self clocked (constant) window protocols (e.g. TCP, especially in congestion avoidance) and channels that give priority to the current sender. 
These channels cause
self interference in the self clock because they couple the forward and
reverse paths.

The most common example of a channel that gives priority to the sender is
half duplex Ethernet, with its so-called capture effect, but I believe that
this applies to most channels that are not true full duplex. (It may even
apply to any channel where there is any interference at all between the
forward and return paths - including wireless????)

TCP does especially poorly on a long path with a half duplex span that is
not adjacent to the sender. TCP can do some things to help if the half
duplex span is the first (or only) span because it can inspect the queue in
the NIC, but I would characterize most of these solutions as "hacks". (And
some have been deployed in some OS's).

BTW, my conjectured general "solution" for Ethernet is that a sending NIC
MUST give up (release) the channel periodically (e.g. every N packets), and
must have at least 2N queue space. This modulates the entire flow into
bursts of size N, and assures that there is enough queue to maintain the
proper average data rate.

But the more pragmatic solution (adopted here at PSC and many other places) is
to declare half duplex Ethernet to be broken, and eradicate it wherever
possible. Where not possible, tell people that the maximum theoretical
utilization is 1/e (35%), and they should be pleased if they get any better
than that, because they are operating beyond the designed operating point for
the media.

Good luck,
--MM--
-------------------------------------------
Matt Mathis http://www.psc.edu/~mathis
Work:412.268.3319 Home/Cell:412.654.7529
-------------------------------------------
Evil is defined by people who think they know
"The Truth" and use force to apply it to others.

From craig at aland.bbn.com Fri Jan 21 11:34:47 2005
From: craig at aland.bbn.com (Craig Partridge)
Date: Fri Jan 21 11:36:17 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
In-Reply-To: Your message of "Fri, 21 Jan 2005 14:29:37 EST." Message-ID: <20050121193447.910DA21A@aland.bbn.com> In message , Matt Mathis write s: >But the more pragmatic solution (adopted here at PSC and may other places) is >to declare half duplex Ethernet to be broken, and eradicate it wherever >possible. Where not possible, tell people that the maximum theoretical >utilization is 1/e (35%), and they should be pleased if they get any better >than that, because they are operating beyond the designed operating point for >the media. That 1/e is not consistent with Boggs & Mogul's work from SIGCOMM 1988. Van Jacobson also reported results inconsistent with 1/e. Indeed, I'd thought 1/e had generally been discredited as a mistaken result from inaccurate models. Boggs & Mogul used multiple TCP's. Van used a single one. So what did they do such that the capture effect didn't happen? And does capture really yield 1/e or something different? Craig From mathis at psc.edu Fri Jan 21 11:53:19 2005 From: mathis at psc.edu (Matt Mathis) Date: Fri Jan 21 11:54:04 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? In-Reply-To: <20050121193447.910DA21A@aland.bbn.com> References: <20050121193447.910DA21A@aland.bbn.com> Message-ID: OK, you caught me! It is easy to find pathological cases where you can get no more than 50% utilization (full window in a single burst, gating all of the acks on the return path.) I think capture effect does not become a problem unless the end-system and routers/switches can send true back-to-back packets for a large portion of the window. Just a small amount of idle makes the channel arbitration much more fair. It might be kind of interesting to try to reconstruct the Boggs result with modern PC's..... 
Thanks, --MM-- On Fri, 21 Jan 2005, Craig Partridge wrote: > > In message , Matt Mathis write > s: > > >But the more pragmatic solution (adopted here at PSC and may other places) is > >to declare half duplex Ethernet to be broken, and eradicate it wherever > >possible. Where not possible, tell people that the maximum theoretical > >utilization is 1/e (35%), and they should be pleased if they get any better > >than that, because they are operating beyond the designed operating point for > >the media. > > That 1/e is not consistent with Boggs & Mogul's work from SIGCOMM 1988. > Van Jacobson also reported results inconsistent with 1/e. Indeed, I'd > thought 1/e had generally been discredited as a mistaken result from > inaccurate models. > > Boggs & Mogul used multiple TCP's. Van used a single one. > > So what did they do such that the capture effect didn't happen? And > does capture really yield 1/e or something different? > > Craig > From arthur at gprt.ufpe.br Fri Jan 21 12:15:56 2005 From: arthur at gprt.ufpe.br (Arthur de Castro Callado) Date: Fri Jan 21 12:18:07 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41F13420.3030608@reed.com> References: <2005120154321.377169@bbprime> <41F0A477.4020605@attglobal.net> <41F0F96A.2050407@stewart.chicago.il.us> <41F13420.3030608@reed.com> Message-ID: Specifically for P2P SIP/H.323 Telephony, DUNDI (http://www.dundi.com/) seems to be more evolved. On Fri, 21 Jan 2005, David P. Reed wrote: > On a side channel, I just received this URL, which seems quite relevant > (going NUTSS a bit further): > > http://p2psip.org From dpreed at reed.com Fri Jan 21 12:15:07 2005 From: dpreed at reed.com (David P. 
Reed) Date: Fri Jan 21 12:18:13 2005 Subject: [e2e] overlay over TCP In-Reply-To: <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> Message-ID: <41F162CB.1020000@reed.com> Dan Wing wrote: > > Yes, combined with little market demand, as yet, for a NAT to handle > SCTP. There is this chicken/egg problem. If SCTP doesn't work over NATs it won't be used for applications where NATs are heavily used. Then there won't be demand (at least no evidence of it). There's a difference between "demand" (meaning actual use) and "demand" meaning I would ask and pay to use it if I thought I had any chance of getting it from the blind turkeys who sell things like NAT boxes. Reminds me of 1992 when a Nynex VP told me there was *no demand* for data connectivity between people working at home and their offices. I pointed out that companies like DEC and MIT employed 10's of thousands of people who were using terminals to connect to computers at work. His response was that they had carefully analyzed measured data use and I was wrong. I asked how they measured modems over home phone lines, thinking they could listen for modem tones or something, and he said (I'm not joking): "I thought that was illegal!" It turns out that what they called "data" was a dedicated "data circuit" and that modems were "voice". From craig at aland.bbn.com Fri Jan 21 12:20:28 2005 From: craig at aland.bbn.com (Craig Partridge) Date: Fri Jan 21 12:22:43 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? In-Reply-To: Your message of "Fri, 21 Jan 2005 14:53:19 EST." 
Message-ID: <20050121202028.EA35F21A@aland.bbn.com> In message , Matt Mathis write s: >I think capture effect does not become a problem unless the end-system and >routers/switches can send true back-to-back packets for a large portion of the >window. Just a small amount of idle makes the channel arbitration much more >fair. It might be kind of interesting to try to reconstruct the Boggs result >with modern PC's..... Sounds like great fun. Let's see, what would the experiments be (always fun to do experimental science on the fly).... Van tested a single TCP connection over cable (in the old days when you actually tapped into a cable!) with just two hosts. He used two different Ethernet adapters (and showed that one couldn't go full line rate, and the other could). It seems to me that in today's world you'd do this experiment four ways: * single cable, two hosts (which I think you can do with a crossover cable), one TCP connect. This is about as close as we can get to Van's original experiment. Note that the length of the cable may matter (and should be tested) * same experiment but with a switch in between Probably try multiple switches. Notes: probably worth doing at various TCP window sizes, to eliminate/ identify window effects -- e.g., the delay bandwidth product is probably small, but worth computing, especially through the switch. If one is feeling ambitious, one could also do 802.11 (both direct and through a hub). Boggs and Mogul tested an Ethernet by using 1 to 25 hosts that concurrently each tried to send 20 seconds of fixed length packets as fast as possible. During the middle 10 seconds it measured a bunch of stats (which may or may not be accessible from current adapters). They also experimented with cable length effects. I think you can't quite replicate it now, but you can certainly load up a single switch (probably want an 8-port) and also try networks with more than one switch. 
There's also a question of what the max capacity of the network
is, if the switch is full duplex on all ports (you may end up measuring
the switch backplane if you're not careful).

My gut says the Jacobson experiment is the easier one to replicate and may
reveal a lot on its own. I don't have time to do this experiment, but happy
to help if someone else wants to take it on.

Craig

From jonathan at dsg.stanford.edu Fri Jan 21 12:35:25 2005
From: jonathan at dsg.stanford.edu (Jonathan Stone)
Date: Fri Jan 21 12:36:32 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
In-Reply-To: Your message of "Fri, 21 Jan 2005 14:34:47 EST." <20050121193447.910DA21A@aland.bbn.com>
Message-ID:

In message <20050121193447.910DA21A@aland.bbn.com>,
Craig Partridge writes:
>In message , Matt Mathis writes:
>
>>But the more pragmatic solution (adopted here at PSC and may other places) is
>>to declare half duplex Ethernet to be broken, and eradicate it wherever
>>possible. Where not possible, tell people that the maximum theoretical
>>utilization is 1/e (35%), and they should be pleased if they get any better
>>than that, because they are operating beyond the designed operating point for
>>the media.
>
>That 1/e is not consistent with Boggs & Mogul's work from SIGCOMM 1988.
>Van Jacobson also reported results inconsistent with 1/e. Indeed, I'd
>thought 1/e had generally been discredited as a mistaken result from
>inaccurate models.

If (very) dim memory serves, 1/e is valid for slotted Aloha.

But Ethernet -- even half-duplex Ethernet -- is not Aloha. Indeed, I
believe different Ethernet chips in fast enough workstations (33 MHz
r3000a or thereabouts) would repeatably give different saturation
throughputs for ttcp on a two-host half-duplex Ethernet, due to small
differences in collision-detect and BEB (binary exponential backoff)
hardware implementation yielding slightly more idle time on the wire,
or something like that.

I think Lance versus SEEQ is the pair I once noted in my lab book.
I can ask some more determined practitioners of the time, if you care
for more details.

>Boggs & Mogul used multiple TCP's. Van used a single one.
>
>So what did they do such that the capture effect didn't happen?

I seem to recall they used comparatively slow CPUs (DECWRL Titan) and
an Ethernet chip that required a software intervention after each
packet send. According to the tech report cited below, the driver
interaction took about 100 usec, or about two 10 Mbit/s contention-slot
times. (I have not checked the arithmetic.)

>And does capture really yield 1/e or something different?

See ``A New Binary Logarithmic Arbitration Method for Ethernet'', Mart
L. Molle, Tech report CSRI-298, Computer Science Research Institute,
University of Toronto, 1994. Molle examined the results of Boggs and
Mogul, and proposed a better, fairer backoff algorithm, BLAM, that (in
Molle's words) gave a logarithmic rather than linear estimator of the
offered workload, yielding a less non-work-preserving discipline than
binary exponential backoff. BLAM never went anywhere, since
half-duplex shared Ethernet became obsolescent about that time.

From dwing at cisco.com Fri Jan 21 12:43:07 2005
From: dwing at cisco.com (Dan Wing)
Date: Fri Jan 21 12:44:03 2005
Subject: [e2e] overlay over TCP
In-Reply-To: <41F162CB.1020000@reed.com>
References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> <41F162CB.1020000@reed.com>
Message-ID: <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com>

On Jan 21, 2005, at 12:15 PM, David P. Reed wrote:

> Dan Wing wrote:
>
>>
>> Yes, combined with little market demand, as yet, for a NAT to handle
>> SCTP.
>
> There is this chicken/egg problem. If SCTP doesn't work over NATs it
> won't be used for applications where NATs are heavily used. Then
> there won't be demand (at least no evidence of it).
This egg was demonstrably cooked with IPsec, which had the same problem. IPsec "passthru" was implemented on nearly all vendor's residential NATs at about the same time IPsec-over-UDP was beginning to hit the market. Passthru works by examining SPI's and simple mechanisms have drawbacks (only one session through the NAT, or only one session to a specific remote IP address, for example), and IPsec-over-UDP has even more packet bloat than IPsec itself. I expect DCCP, SCTP, and other new protocols will suffer the same awkward deployment unless we (in the collective sense) improve the situation with guidance from people familiar with those new protocols. draft-xie-tsvwg-sctp-nat-00.txt is a move in the right direction, although it seems NATting SCTP may well be complex. > There's a difference between "demand" (meaning actual use) and > "demand" meaning I would ask and pay to use it if I thought I had any > chance of getting it from the blind turkeys who sell things like NAT > boxes. > > Reminds me of 1992 when a Nynex VP told me there was *no demand* for > data connectivity between people working at home and their offices. I > pointed out that companies like DEC and MIT employed 10's of thousands > of people who were using terminals to connect to computers at work. > His response was that they had carefully analyzed measured data use > and I was wrong. I asked how they measured modems over home phone > lines, thinking they could listen for modem tones or something, and he > said (I'm not joking): "I thought that was illegal!" It turns out > that what they called "data" was a dedicated "data circuit" and that > modems were "voice". :-) -d From kkrama at research.att.com Fri Jan 21 13:56:16 2005 From: kkrama at research.att.com (K. K. Ramakrishnan) Date: Fri Jan 21 13:58:05 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? 
In-Reply-To: <20050121202028.EA35F21A@aland.bbn.com> References: <20050121202028.EA35F21A@aland.bbn.com> Message-ID: <41F17A80.4030505@research.att.com> Jon, Craig and Matt, I meant to respond earlier to Jon's request on short term TCP behavior on LANs. I worked on this in the context of high-performance workstations and shared 10/100 Mbps Ethernet many years ago, and found the problem with capture effect and an approach to overcome it, which got incorporated into subsequent Ethernet MAC chips that were produced by Digital Equipment. Since you folks are talking about it, I thought I'd point out to a paper we wrote on the topic. Here is the URL: http://www.research.att.com/~kkrama/papers.html#capture_effect Of course as you folks mention, full-duplex Ethernet makes this problem less relevant in LANs. But, it is still relevant in other domains, such as in Wireless LANs. K. K. Ramakrishnan Craig Partridge wrote: >In message , Matt Mathis writes: > > >>I think capture effect does not become a problem unless the end-system and >>routers/switches can send true back-to-back packets for a large portion of the >>window. Just a small amount of idle makes the channel arbitration much more >>fair. It might be kind of interesting to try to reconstruct the Boggs result >>with modern PC's..... >> >> ...... -- K. K. Ramakrishnan Email: kkrama@research.att.com AT&T Labs-Research, Rm. A117 Tel: (973)360-8764 180 Park Ave, Florham Park, NJ 07932 Fax: (973) 360-8871 URL: http://www.research.att.com/info/kkrama From rja at extremenetworks.com Fri Jan 21 16:50:35 2005 From: rja at extremenetworks.com (RJ Atkinson) Date: Fri Jan 21 16:52:04 2005 Subject: [e2e] TCP Local Area Normal behaviour In-Reply-To: References: <20050121142201.7CCC421A@aland.bbn.com> <20050121180410.GA6946@isi.edu> Message-ID: (With regards to wired Ethernet...) 
Half-duplex Ethernet is increasingly uncommon in commercial or service-provider environments -- not so much because of any grand plan as because most such organisations primarily buy better quality Ethernet switches that are full-duplex (rather than buying Ethernet hubs). Few 100 Mbps Ethernet deployments seem to be in half-duplex mode. I think half-duplex 10Mbps Ethernet is still quite common in residential environments, where the lower price of the Ethernet hub (and the lower bandwidth demands) are more of a factor in purchasing. With 1 GigE specifications, it was theoretically possible to configure the link to half-duplex, but I do not know of any such deployments (and not all 1 GigE implementations support that mode of operation). With 10 GigE, full-duplex is the only mode supported in the specifications. (Of course, wireless Ethernet is a current major deployment trend; its MAC is somewhat different than these several wired Ethernet MAC protocols.) Cheers, Ran On Jan 21, 2005, at 14:29, Matt Mathis wrote: > But the more pragmatic solution (adopted here at PSC and may other > places) is > to declare half duplex Ethernet to be broken, and eradicate it wherever > possible. Where not possible, tell people that the maximum theoretical > utilization is 1/e (35%), and they should be pleased if they get any > better > than that, because they are operating beyond the designed operating > point for > the media. From huitema at windows.microsoft.com Fri Jan 21 20:37:27 2005 From: huitema at windows.microsoft.com (Christian Huitema) Date: Fri Jan 21 20:40:32 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? Message-ID: > See ``A New Binary Logarithmic Arbitration Method for Ethernet'', Mart > L. Molle, Tech report CSRI-298, Computer Science Research Insitute, > University of Toronto, 1994. 
Molle examined the results of Boggs and
> Mogul, and proposed a better, fairer backoff algorithm, BLAM that (in
> Molle's words) gave a logarithmic rather than linear estimator of the
> offered workload, yielding a less non-work-preserving discipline than
> binary exponential backoff. BLAM never went anywhere, since
> half-duplex shared Ethernet became obsolescent about that time.

If I recall correctly, Gerard Le Lann proposed a similar algorithm in
1987:

G. Le Lann. The 802.3D Protocol: A variation of the IEEE802.3 Standard
for Real-time LANs. Technical Report, INRIA, France, 1987.

-- Christian Huitema

From huitema at windows.microsoft.com Fri Jan 21 20:37:33 2005
From: huitema at windows.microsoft.com (Christian Huitema)
Date: Fri Jan 21 20:40:36 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
Message-ID:

> Since you folks are talking about it, I thought I'd point out to a paper
> we wrote on the topic. Here is the URL:
> http://www.research.att.com/~kkrama/papers.html#capture_effect
> Of course as you folks mention, full-duplex Ethernet makes this problem
> less
> relevant in LANs. But, it is still relevant in other domains, such as in
> Wireless LANs.

Capture effects affect fairness, not throughput. In fact, by allowing a
pair of stations to jump on the channel faster, capture effects enhance
the throughput.

That being said, there are pretty few shared-media Ethernets around
nowadays. The closest thing is a 10baseT hub, and that is already a
pretty ancient piece of equipment. Most wired LANs are built out of
Ethernet switches. Switches are by and large immune to the capture
effect. On 1 Gbps switches, sustained TCP rates of around 800 Mbps are
common.

-- Christian Huitema

From mathis at psc.edu Sat Jan 22 08:16:39 2005
From: mathis at psc.edu (Matt Mathis)
Date: Sat Jan 22 08:18:20 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
In-Reply-To:
References:
Message-ID:

> Capture effects affect fairness, not throughput.
In fact, by allowing a
> pair of stations to jump on the channel faster, capture effects enhance
> the throughput.

...and this is where theory and reality diverge. During the 1GE roll-out,
half duplex FastE was a major headache for people trying to send large data
sets.

Here is a pathological case: A long path with GigE and OC-3 (or better) all
the way from our data center to the hub at the remote site. The last span was
likely to be a point-to-point half duplex link from the hub to some user's
workstation. During slow-start, our server and the long path deliver bursts
to the far hub which are at least 100 Mb/s. Assuming there is enough
buffering, nothing gets dropped, but each burst captures the channel, locking
out all ACKs until the end of the burst. If the last switch is properly
buffered, TCP will progress all the way into congestion avoidance in this
mode: a full window of data on the forward path, followed by a full window of
ACKs on the return path. If you set the windows and buffering properly, you
get about half of the expected performance. If something goes wrong (like
insufficient buffering), you get, shall we say, "interesting" behaviors and
even worse performance.

Generalizing this situation: any link which exhibits any sort of
capture/lockout/modulation and is the dominant bottleneck in a path which is
both high delay and high capacity has the potential to be problematic. If it
is the very first span the stack might be able to do something "hackish" to
fix it, but in all other cases there is nothing constructive for the stack to
do that does not hurt performance on healthy channels.

For the half duplex case, packet traces are so bizarre that anybody who
looked at them immediately understood that something was seriously wrong.
For this reason most I2 connected sites are still rabid about eradicating
both half duplex and unmanaged hubs in people's offices (the prime culprit
these days).
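The alternation described above can be put into a toy model. This is only a sketch under stated assumptions, not a measurement: it assumes the window equals the bandwidth-delay product of the half duplex last hop, that every ACK is locked out until the data burst ends, and that the ACK burst itself takes negligible time. The function name and the example numbers (100 Mb/s, 50 ms RTT) are hypothetical.

```python
def lockout_throughput_bps(window_bytes, link_bps, rtt_s):
    """Toy model: one window of data per (burst time + RTT) cycle.

    Assumes every ACK is held back until the data burst has finished
    crossing the half duplex link, so the sender idles for a full RTT
    after each burst. ACK transmission time is ignored.
    """
    burst_s = window_bytes * 8 / link_bps   # time the burst holds the channel
    cycle_s = burst_s + rtt_s               # then wait for ACKs and new data
    return window_bytes * 8 / cycle_s


link_bps = 100e6                      # hypothetical half duplex FastE last hop
rtt_s = 0.050                         # hypothetical long-path round-trip time
window_bytes = link_bps * rtt_s / 8   # window sized to the bandwidth-delay product

ratio = lockout_throughput_bps(window_bytes, link_bps, rtt_s) / link_bps
print(f"fraction of link rate achieved: {ratio:.2f}")
```

Under these assumptions the burst time equals the RTT, so each cycle lasts two RTTs and the model lands at half the link rate, in line with the "about half" figure; a smaller window does worse.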
Thanks, --MM-- ------------------------------------------- Matt Mathis http://www.psc.edu/~mathis Work:412.268.3319 Home/Cell:412.654.7529 ------------------------------------------- Evil is defined by people who think they know "The Truth" and use force to apply it to others. From cannara at attglobal.net Sat Jan 22 12:07:11 2005 From: cannara at attglobal.net (Cannara) Date: Sat Jan 22 12:10:08 2005 Subject: The 1/e myth, was Re: [e2e] TCP Local Area Normal behaviour? any References: Message-ID: <41F2B26F.F4C6FFEA@attglobal.net> The 37% solution was always bogus -- part of the Token-Ring crowd's (IBM's) attempt to market against Enet, despite extremely higher costs. There's a DEC Research Lab report by Boggs et alia, from '88, that proves via real-life experiments that any Enet segment with 20 or so nodes on it can easily get over 90% throughput. I may have it as a file, if someone wants it. There's also a graphic that I've used with networking students that helps a great deal in seeing how CSMA/CD is superior to any token-passing system, which I may also have as a file -- the gist is that time to complete a given pkt send starts out above zero in a token system and increases in proportion to the number of connected nodes, while the time to send in CSMA/CD starts at 0 on a lightly-loaded LAN and is not at all affected by connected nodes, only by those wanting to transmit also. This latter leads to collisions and backoff, of course, but the delay-to-send curves for token and CSMA/CD start out with the latter lower and only rising to cross the token curve at simultaneous access rates well above 30% for 128B pkts. Since most LANs run well below 1/3 of the nodes colliding at the same time, this crossover, like 1/e, is a silly measure of relative merit. And, the spike in token's heart is not just cost per node, but that the crossover moves to higher access rates as nodes are added, meaning that CSMA/CD becomes better, relatively, as LAN segments have more nodes. 
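For reference, the 1/e figure debated in this thread is the textbook maximum for slotted Aloha, under the usual idealized assumptions of a Poisson offered load of G attempts per slot and no carrier sense. It is not a model of CSMA/CD. A short derivation, with S denoting the fraction of slots carrying a successful transmission:

```latex
S(G) = G e^{-G}, \qquad
\frac{dS}{dG} = (1 - G)\, e^{-G} = 0 \;\Rightarrow\; G = 1, \qquad
S_{\max} = \frac{1}{e} \approx 0.368
```

That is roughly 37%, which is where the "37% solution" label comes from.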
Of course, $ is always the bottom line and when I walked into a PacBell billing center to consult on a performance problem about 8 years ago, and saw their IBM 9745s had Ethernet interfaces, it was obvious where token was going. Some of us have also thought of ATM as the Token Ring of the 2nd millenium, but it died even faster, in the corporate net. :] Alex Jonathan Stone wrote: > > In message <20050121193447.910DA21A@aland.bbn.com>, > Craig Partridge writes: > > >In message , Matt Mathis write > >s: > > > >>But the more pragmatic solution (adopted here at PSC and may other places) is > >>to declare half duplex Ethernet to be broken, and eradicate it wherever > >>possible. Where not possible, tell people that the maximum theoretical > >>utilization is 1/e (35%), and they should be pleased if they get any better > >>than that, because they are operating beyond the designed operating point for > >>the media. > > > >That 1/e is not consistent with Boggs & Mogul's work from SIGCOMM 1988. > >Van Jacobson also reported results inconsistent with 1/e. Indeed, I'd > >thought 1/e had generally been discredited as a mistaken result from > >inaccurate models. > > If (very) dim memory serves, 1/e is valid for slotted Aloha. > > But Ethernet -- even half-duplex Ethernet -- is not Aloha. Indeed, I > beleive different Ethernet chips in fast enough workstations (33 MHz > r3000a or thereabouts) would repeatably give different saturation > throughputs for ttcp on a two-host half-duplex Ethernet, due to small > differences in collision-detect and BEB hardware implementation > yielding slightly more idle time on the wire, or something like that. > > I think Lance versus SEEQ is the pair I once noted in my lab book. I > can ask some more determined practitioners of the time, if you care > for more details. > > >Boggs & Mogul used multiple TCP's. Van used a single one. > > > >So what did they do such that the capture effect didn't happen? 
> > I seem to recall they used comparatively slow CPUs (DECWRL Titan) and > an Ethernet chip that required a software intervention after each > packet send. According to the tech report cited below, the driver > interaction took about 100 usec, or about 2 10Mbit contention-slot > times. (I have not checked the arithmetic.) > > >And does capture really yield 1/e or something different? > > See ``A New Binary Logarithmic Arbitration Method for Ethernet'', Mart > L. Molle, Tech report CSRI-298, Computer Science Research Insitute, > University of Toronto, 1994. Molle examined the results of Boggs and > Mogul, and proposed a better, fairer backoff algorithm, BLAM that (in > Molle's words) gave a logarithmic rather than linear estimator of the > offered workload, yielding a less non-work-preserving discipline than > binary exponential backoff. BLAM never went anywhere, since > half-duplex shared Ethernet became obsolescent about that time. From jms at central.cis.upenn.edu Sat Jan 22 12:40:44 2005 From: jms at central.cis.upenn.edu (Jonathan M. Smith) Date: Sat Jan 22 12:42:29 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? In-Reply-To: <20050121202028.EA35F21A@aland.bbn.com> References: <20050121202028.EA35F21A@aland.bbn.com> Message-ID: Why not do it on WiFi? -JMS ------------------------------------------------------------------------- Jonathan M. Smith Olga and Alberico Pompa Professor of Engineering and Applied Science Professor of Computer and Information Science, University of Pennsylvania Levine Hall, 3330 Walnut Street, Philadelphia, PA 19104 On Fri, 21 Jan 2005, Craig Partridge wrote: > > In message , Matt Mathis write > s: > > >I think capture effect does not become a problem unless the end-system and > >routers/switches can send true back-to-back packets for a large portion of the > >window. Just a small amount of idle makes the channel arbitration much more > >fair. 
It might be kind of interesting to try to reconstruct the Boggs result > >with modern PC's..... > > Sounds like great fun. Let's see, what would the experiments be (always > fun to do experimental science on the fly).... > > Van tested a single TCP connection over cable (in the old days when you > actually tapped into a cable!) with just two hosts. He used two different > Ethernet adapters (and showed that one couldn't go full line rate, and > the other could). It seems to me that in today's world you'd do this > experiment four ways: > > * single cable, two hosts (which I think you can do with a crossover > cable), one TCP connect. > > This is about as close as we can get to Van's original > experiment. Note that the length of the cable may matter > (and should be tested) > > * same experiment but with a switch in between > > Probably try multiple switches. > > Notes: probably worth doing at various TCP window sizes, to eliminate/ > identify window effects -- e.g., the delay bandwidth product is probably > small, but worth computing, especially through the switch. > > If one is feeling ambitious, one could also do 802.11 (both direct > and through a hub). > > Boggs and Mogul tested an Ethernet by using 1 to 25 hosts that > concurrently each tried to send 20 seconds of fixed length packets > as fast as possible. During the middle 10 seconds it measured a bunch > of stats (which may or may not be accessible from current adapters). > They also experimented with cable length effects. I think you can't > quite replicate it now, but you can certainly load up a single switch > (probably want an 8-port) and also try networks with more than one > switch. There's also a question of what the max capacity of the network > is, if the switch is full duplex on all ports (you may end up measuring > the switch backplane if you're not careful). > > My gut says the Jacobson experiment is the easier one to replicate and may > reveal a lot on its own. 
I don't have time to do this experiment, but happy > to help if someone else wants to take it on. > > Craig > From kkrama at research.att.com Sat Jan 22 13:55:51 2005 From: kkrama at research.att.com (K. K. Ramakrishnan) Date: Sat Jan 22 13:56:05 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? In-Reply-To: References: <20050121202028.EA35F21A@aland.bbn.com> Message-ID: <41F2CBE7.6040909@research.att.com> Jonathan, Exactly. There was a paper in last year's ICNP that seemed to show that the problem appeared in wireless networks. It'd be interesting to see to what extent channel capture occurs and if a solution such as the ones suggested for Ethernet capture would work effectively in wireless. K. K. Ramakrishnan Jonathan M. Smith wrote: >Why not do it on WiFi? >-JMS > > >------------------------------------------------------------------------- >Jonathan M. Smith >Olga and Alberico Pompa Professor of Engineering and Applied Science >Professor of Computer and Information Science, University of Pennsylvania >Levine Hall, 3330 Walnut Street, Philadelphia, PA 19104 > >On Fri, 21 Jan 2005, Craig Partridge wrote: > > > >>In message , Matt Mathis write >>s: >> >> >> >>>I think capture effect does not become a problem unless the end-system and >>>routers/switches can send true back-to-back packets for a large portion of the >>>window. Just a small amount of idle makes the channel arbitration much more >>>fair. It might be kind of interesting to try to reconstruct the Boggs result >>>with modern PC's..... >>> >>> >>Sounds like great fun. Let's see, what would the experiments be (always >>fun to do experimental science on the fly).... >> ... From cannara at attglobal.net Sat Jan 22 16:27:26 2005 From: cannara at attglobal.net (Cannara) Date: Sat Jan 22 16:28:06 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? References: Message-ID: <41F2EF6E.1B26BA20@attglobal.net> Ahhh, yes, but -- there's always at least one. 
:] But, 99.9999% of people have no idea of how their stacks' parameters are set by default and less idea how those and their OSs interact with their traffic and links. Again, ask someone thought to be network wise how much performance will be lost on a full-MTU link when just 1% of TCP packets are lost (error, etc., not even congestion). Ask IT managers how their stack params are set, and why. Be ready for blank looks. :] And, as to the ephemeral 'capture effect', the vendors have been so successful making $ selling LAN switches for so long that there are almost no segments anymore for it to exist on, even if it ever was an issue -- "backpressure" mateys, backpressure. It's more relevant that single-chip 16/24-port switches are in D-Link & Linksys boxes at Fry's for $25 now, and, with sub-100nm fabbing, they fail more often, with far more obscure & sneaky performance effects. {:o] Alex Jon Crowcroft wrote: > > um, yes i recall that - any pointers to papers/web page/results appreciated. > > We are a bit like Sigmund Freud :- > we base an entire "science" on pathological behaviour, > but most TCP connections are not congested, > just like most people aren't crazy > (or tell awful austrian 19th century jokes:) > > how many papers written on bottleneck behaviour > and how few on whats normal?- what we need is > the Alex Comfort of TCP, so to speak... > > hmm - i foresee a whole new conference on > packet loss and the unconscious > and > beyond the AIMD principle > and > endless arguments about IP SLA archetypes and TCP mandalas > > :-) > > In missive <20050121180410.GA6946@isi.edu>, Aaron Falk typed: > > >>Jon Crowcroft wrote: > >>> > >>> how fair/efficient is TCP in normal operation when there's no router > >>> or buffer in an intermediate node (yes i knoiw some switches have > >>> more than 1 packet buffers but ignore those) > >> > >>I seem to recall Matt Mathis talking about this at IETF around 5 years > >>ago in the context of MAC 'capture effects'. 
> >> > >>--aaron > > cheers > > jon From avg at kotovnik.com Sat Jan 22 18:06:20 2005 From: avg at kotovnik.com (Vadim Antonov) Date: Sat Jan 22 18:08:05 2005 Subject: The 1/e myth, was Re: [e2e] TCP Local Area Normal behaviour? any In-Reply-To: <41F2B26F.F4C6FFEA@attglobal.net> Message-ID: On Sat, 22 Jan 2005, Cannara wrote: > -- the gist is that time to complete a given > pkt send starts out above zero in a token system and increases in proportion > to the number of connected nodes, This is not exactly true, as you may have multiple tokens From jnc at mercury.lcs.mit.edu Sun Jan 23 05:30:04 2005 From: jnc at mercury.lcs.mit.edu (Noel Chiappa) Date: Sun Jan 23 05:32:06 2005 Subject: The 1/e myth, was Re: [e2e] TCP Local Area Normal behaviour? any Message-ID: <20050123133004.0BA2186AE0@mercury.lcs.mit.edu> > From: Cannara > seeing how CSMA/CD is superior to any token-passing system, You really don't want to get into this. In any event, it's kind of OBE, since most "CSMA/CD" systems these days are anything but. Rather, they are usually actually hosts connected directly to a bunch of switches, the whole kit-n-kaboodle linked together by point-point links, all of which happen to use an access protocol that looks like CSMA/CD. The latter being picked, of course, rather than something designed for the purpose, since that was the dominant market technology at the time the whole concept of "LAN based on inert transmission medium" became another dusty page of computing history (along with rotationally optimizing assemblers, etc), and it was easier to just adopt that, rather than have to replace all the host interfaces. By the way, what is this "full-duplex" stuff people keep talking about? I've never heard of a CSMA/CD network that supported full-duplex operation. Anyone out there actually still have an Ethernet which actually is a long coax cable with hosts hooked to it via transceivers? No, I didn't think so... 
> the spike in token's heart is not just cost per node, but that the
> crossover moves to higher access rates as nodes are added, meaning
> that CSMA/CD becomes better, relatively, as LAN segments have more
> nodes.

Your model seems to leave out a few factors which weigh against real
CSMA/CD (not the ersatz simulacrum people are using these days, see
above), such as network physical size. Gee, what a shock - you obviously
not having any axe to grind one way or another between the two
technologies.

Noel

From cannara at attglobal.net Sun Jan 23 20:02:55 2005
From: cannara at attglobal.net (Cannara)
Date: Sun Jan 23 20:02:09 2005
Subject: The 1/e myth, was Re: [e2e] TCP Local Area Normal behaviour? any
References: <20050123133004.0BA2186AE0@mercury.lcs.mit.edu>
Message-ID: <41F4736F.CACE1794@attglobal.net>

Not sure I get why your response is so energetic, Noel. I think I said that $
motivated much of the decision-making in networking devices, which seems to be
exactly the reason we all now have $20 switches everywhere there are more than
two $10 interfaces!

Sure, there are few true Ethernet segments anymore, except in the backplane
logic of switches. Having recently worked for a manufacturer of these chips,
I can say the contention issues that used to be solved on a shared bus still
need to be solved, but within a shared switch, and with a capability to slow
any node connected to the switch's ports (backpressure).

I also don't get your reference to Full Duplex, perhaps it's a straw man --
yes, FDX Ethernet links aren't doing CSMA/CD, but something more traditional
in telecom: access control via in/out-of-band signalling (e.g. Pause Frames).
So, indeed, I agree "FDX Ethernet" is a misnomer. The deal is that Ethernet
framing has become a standard, while that name itself implies the traditional
access method (CSMA/CD) as well, so there can be confusion.
Alex Noel Chiappa wrote: > > > From: Cannara > > > seeing how CSMA/CD is superior to any token-passing system, > > You really don't want to get into this. > > In any event, it's kind of OBE, since most "CSMA/CD" systems these days are > anything but. Rather, they are usually actually hosts connected directly to a > bunch of switches, the whole kit-n-kaboodle linked together by point-point > links, all of which happen to use an access protocol that looks like > CSMA/CD. The latter being picked, of course, rather than something designed > for the purpose, since that was the dominant market technology at the time the > whole concept of "LAN based on inert transmission medium" became another dusty > page of computing history (along with rotationally optimizing assemblers, > etc), and it was easier to just adopt that, rather than have to replace all > the host interfaces. > > > By the way, what is this "full-duplex" stuff people keep talking about? > I've never heard of a CSMA/CD network that supported full-duplex > operation. > > > Anyone out there actually still have an Ethernet which actually is a long > coax cable with hosts hooked to it via transceivers? No, I didn't think > so... > > > the spike in token's heart is not just cost per node, but that the > > crossover moves to higher access rates as nodes are added, meaning > > that CSMA/CD becomes better, relatively, as LAN segments have more > > nodes. > > Your model seems to leave out a few factors which weigh against real > CSMA/CD (not the ersatz simalcrum people are using these days, see > above), such as network physical size. Gee, what a shock - you obviously > not having any axe to grind one way or another between the two > technologies. 
> > Noel

From fahad.dogar at gmail.com Mon Jan 24 03:45:49 2005
From: fahad.dogar at gmail.com (Fahad Dogar)
Date: Mon Jan 24 03:46:17 2005
Subject: [e2e] Traffic Engineering: OSPF-TE and RSVP-TE
Message-ID:

I am currently implementing a restoration routing module for MPLS
Traffic Engineering. I have a few questions about the various
protocols that are required for traffic engineering. Any help or
pointers to relevant RFCs/drafts would be highly appreciated.

1) Extensions have been proposed to OSPF v2 that specify the
additional information to be propagated for Traffic Engineering
[RFC 3630: TE extensions to OSPF v2]. However, this RFC does not
describe the constraint-based routing that can be done using this
additional information. RFC 2676, "QoS Routing Mechanisms and OSPF
Extensions", specifies constraint-based routing algorithms, but the
OSPF extensions proposed in that document differ from those in
RFC 3630. Can anyone point out the mechanisms that could be used for
constraint-based routing using the additional information proposed in
RFC 3630?

2) Extensions to RSVP have been proposed to support Traffic
Engineering [RFC 3209]. However, these extensions do not describe
the interface between RSVP-TE and OSPF-TE. To elaborate, RSVP-TE
would need to find a path that satisfies the QoS constraint specified
by the request. How would it use OSPF to find a constraint-based
route? Moreover, once appropriate reservations are made, how is it
ensured that the new link state is correctly propagated through the
OSPF-TE extensions? The answers to the above questions relate to the
interfacing issues pertaining to RSVP-TE and OSPF.
Thanks in advance, Fahad From randall at stewart.chicago.il.us Mon Jan 24 04:52:30 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Mon Jan 24 04:56:41 2005 Subject: [e2e] overlay over TCP In-Reply-To: <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> <41F162CB.1020000@reed.com> <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com> Message-ID: <41F4EF8E.4050601@stewart.chicago.il.us> Dan Wing wrote: > > On Jan 21, 2005, at 12:15 PM, David P. Reed wrote: > >> Dan Wing wrote: >> >>> >>> Yes, combined with little market demand, as yet, for a NAT to handle >>> SCTP. >> >> >> There is this chicken/egg problem. If SCTP doesn't work over NATs it >> won't be used for applications where NATs are heavily used. Then >> there won't be demand (at least no evidence of it). > > > This egg was demonstrably cooked with IPsec, which had the same > problem. IPsec "passthru" was implemented on nearly all vendor's > residential NATs at about the same time IPsec-over-UDP was beginning to > hit the market. Passthru works by examining SPI's and simple mechanisms > have drawbacks (only one session through the NAT, or only one session to > a specific remote IP address, for example), and IPsec-over-UDP has even > more packet bloat than IPsec itself. > > I expect DCCP, SCTP, and other new protocols will suffer the same > awkward deployment unless we (in the collective sense) improve the > situation with guidance from people familiar with those new protocols. > draft-xie-tsvwg-sctp-nat-00.txt is a move in the right direction, > although it seems NATting SCTP may well be complex. It's not that complex.. and yes Cisco has had at least one customer ask for it... Have they had lots .. no. 
The reason being where Cisco currently makes money from SCTP is inside the network. Most folks don't run their SS7 over IP network where they want to have a NAT to Global address cross over. There are other places, as well, that Cisco makes money from SCTP.. but again they are all "inside the network" places... However, that all being said, since Cisco does make money from the protocol, and would benefit from the M$ company producing SCTP with its O/S instead having to place an add-on component.. encouraging SCTP by making Cisco NAT's SCTP aware would help in this.. after all someone must crack the egg :-D (and yes Dan, we do ship a special internal version of SCTP for windows to some of our customers :o) R > >> There's a difference between "demand" (meaning actual use) and >> "demand" meaning I would ask and pay to use it if I thought I had any >> chance of getting it from the blind turkeys who sell things like NAT >> boxes. >> >> Reminds me of 1992 when a Nynex VP told me there was *no demand* for >> data connectivity between people working at home and their offices. I >> pointed out that companies like DEC and MIT employed 10's of thousands >> of people who were using terminals to connect to computers at work. >> His response was that they had carefully analyzed measured data use >> and I was wrong. I asked how they measured modems over home phone >> lines, thinking they could listen for modem tones or something, and he >> said (I'm not joking): "I thought that was illegal!" It turns out >> that what they called "data" was a dedicated "data circuit" and that >> modems were "voice". 
>
>
> :-)
>
> -d
>
>

--
Randall Stewart
803-345-0369 815-342-5222(cell)

From puddinghead_wilson007 at yahoo.co.uk Mon Jan 24 06:42:39 2005
From: puddinghead_wilson007 at yahoo.co.uk (Puddinhead Wilson)
Date: Mon Jan 24 06:44:15 2005
Subject: [e2e] Traffic Engineering: OSPF-TE and RSVP-TE
In-Reply-To:
Message-ID: <20050124144239.65703.qmail@web25706.mail.ukl.yahoo.com>

> How would it use OSPF to find a constraint-based route? Moreover, once
> appropriate reservations are made, how is it ensured that the new link
> state is correctly propagated through OSPF-TE extensions? The answers to
> the above questions relate to the interfacing issues pertaining to
> RSVP-TE and OSPF.
>

You sound like a SAP consultant :)

Either way, why should a signalling method not be protocol agnostic?
Make a standard for "TED", irrespective of whether it is OSPF/BGP or
some other information distribution mechanism.

From touch at ISI.EDU Mon Jan 24 08:49:36 2005
From: touch at ISI.EDU (Joe Touch)
Date: Mon Jan 24 08:51:08 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
In-Reply-To:
References:
Message-ID: <41F52720.10304@isi.edu>

Jon Crowcroft wrote:
> right - once people got over the argument
> "TCP is a heavyweight protocol"
> everyone used it to find the limit on
> Their Favourite LAN -
>
> even wide area, most of the landspeed record stuff
> (e.g. ipv6 6Gbps coast2coast) is TCP based
>
> but what I am after is a description of the normal dynamics of TCP on normal
> configs ranging from shared to switched
> 10 to 100 ether environments -
> how does the window and tx/rx packet rate vary over time
> over a set of flows in a
> typical site ?

I did some tests on the ATOMIC LAN at ISI in the early 1990s.
One part often ignored in simulations is that some TCPs basically ignore congestion control on a LAN - if the dest addr is inside the same subnet as the interface being sent, the software "assumes" they're on a LAN and that the best thing to do is fight it out at the MAC layer (not always a good assumption, we helped show). Have you looked to see if the congestion control windows are even being used, on the whole? Joe -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 254 bytes Desc: OpenPGP digital signature Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050124/55f76fc3/signature.bin From mallman at icir.org Mon Jan 24 09:39:48 2005 From: mallman at icir.org (Mark Allman) Date: Mon Jan 24 09:40:47 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41F003A1.7050909@isi.edu> Message-ID: <20050124173948.7E86877ABAF@guns.icir.org> > The way the PORT command is spec'd, it almost looks like you could > initiate a transfer remotely from a separate machine (A tells B to > send a file to C) too. (anyone know whether that's possible? ever > implemented?) Yep. That was a "feature". And, it has been implemented in at least the BSD FTP clients/servers. Some claim that this is a security problem and so the feature might be disabled by this point. As a side note, the ftpext WG extended PORT and PASV to do IPv6 a while back. We proposed something that allowed for choosing the network protocol, as well as the transport protocol. The WG felt that negotiating the transport protocol was unnecessary because we {were stuck with / didn't need more than} TCP. So, that part was ripped out. The resulting RFC is: Mark Allman, Shawn Ostermann, Craig Metz. FTP Extensions for IPv6 and NATs, September 1998. RFC 2428. And, a companion tech report on what the proposal looked like before we chopped it to match WG consensus: Mark Allman, Shawn Ostermann. 
FTP Extensions for Variable Protocol Specification. Technical Report CR-209414, NASA Glenn Research Center, February 2000. http://www.icir.org/mallman/papers/ftp-var-spec.ps allman -- Mark Allman -- ICIR -- http://www.icir.org/mallman/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050124/91ef712d/attachment.bin From jeroen at unfix.org Mon Jan 24 09:59:08 2005 From: jeroen at unfix.org (Jeroen Massar) Date: Mon Jan 24 10:00:28 2005 Subject: [e2e] overlay over TCP In-Reply-To: <20050124173948.7E86877ABAF@guns.icir.org> References: <20050124173948.7E86877ABAF@guns.icir.org> Message-ID: <1106589548.16930.89.camel@firenze.zurich.ibm.com> On Mon, 2005-01-24 at 12:39 -0500, Mark Allman wrote: > > The way the PORT command is spec'd, it almost looks like you could > > initiate a transfer remotely from a separate machine (A tells B to > > send a file to C) too. (anyone know whether that's possible? ever > > implemented?) > > Yep. That was a "feature". And, it has been implemented in at least > the BSD FTP clients/servers. Some claim that this is a security problem > and so the feature might be disabled by this point. It is a great feature, especially a couple of years ago, now with most people having DSL it isn't that worthwhile, though it still has a great value as you can use a slow line to transfer files between two 'highspeed' servers, or better said, not going over your slow line. Generally people will call this FXP btw. Most servers supporting it nicely have it disabled per default but one can turn it on. Greets, Jeroen -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available
Type: application/pgp-signature
Size: 240 bytes
Desc: This is a digitally signed message part
Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050124/767b3b99/attachment.bin

From michael.welzl at uibk.ac.at Mon Jan 24 10:44:27 2005
From: michael.welzl at uibk.ac.at (Michael Welzl)
Date: Mon Jan 24 10:48:34 2005
Subject: [e2e] TCP Local Area Normal behaviour? any references?
In-Reply-To: <41F52720.10304@isi.edu>
References: <41F52720.10304@isi.edu>
Message-ID: <1106592267.7192.48.camel@lap10-c703.uibk.ac.at>

> I did some tests on the ATOMIC LAN at ISI in the early 1990s. One part
> often ignored in simulations is that some TCPs basically ignore
> congestion control on a LAN - if the dest addr is inside the same subnet
> as the interface being sent, the software "assumes" they're on a LAN and
> that the best thing to do is fight it out at the MAC layer (not always a
> good assumption, we helped show).

I can't believe anything like this if it was an "ATOMIC" LAN,
as the name "ATOMIC" is synonymous with REALLY REALLY REALLY
high speed where I live:

http://www.atomicsnow.com/

:-)

The only appropriate use of the word "ATOMIC"...

Cheers,
Michael

From michael.welzl at uibk.ac.at Mon Jan 24 12:08:01 2005
From: michael.welzl at uibk.ac.at (Michael Welzl)
Date: Mon Jan 24 12:09:00 2005
Subject: AW: [e2e] overlay over TCP
In-Reply-To: <1106589548.16930.89.camel@firenze.zurich.ibm.com>
Message-ID:

> -----Original Message-----
> From: end2end-interest-bounces@postel.org
> [mailto:end2end-interest-bounces@postel.org] On Behalf Of Jeroen Massar
> Sent: Monday, 24 January 2005 18:59
> To: mallman@icir.org
> Cc: Randall Stewart; end-to-end list; Joe Touch
> Subject: Re: [e2e] overlay over TCP
>
>
> On Mon, 2005-01-24 at 12:39 -0500, Mark Allman wrote:
> > > The way the PORT command is spec'd, it almost looks like you could
> > > initiate a transfer remotely from a separate machine (A tells B to
> > > send a file to C) too.
(anyone know whether that's possible? ever > > > implemented?) > > > > Yep. That was a "feature". And, it has been implemented in at least > > the BSD FTP clients/servers. Some claim that this is a security problem > > and so the feature might be disabled by this point. > > It is a great feature, especially a couple of years ago, now with most > people having DSL it isn't that worthwhile, though it still has a great > value as you can use a slow line to transfer files between two > 'highspeed' servers, or better said, not going over your slow line. > Generally people will call this FXP btw. > > Most servers supporting it nicely have it disabled per default but one > can turn it on. Hmm, so it looks like they re-invented this feature for GridFTP: http://www.globus.org/datagrid/gridftp.html (the feature list contains "Third-party (direct server-to-server) transfers"). Then again, the PORT command is in the "commands used by GridFTP" list in http://www-fp.mcs.anl.gov/dsl/GridFTP-Protocol-RFC-Draft.pdf so perhaps they turn it on and simply use the PORT command? I think that this is said to be one of the most important features GridFTP has... Cheers, Michael From touch at ISI.EDU Mon Jan 24 20:50:02 2005 From: touch at ISI.EDU (Joe Touch) Date: Mon Jan 24 20:53:14 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? In-Reply-To: <1106592267.7192.48.camel@lap10-c703.uibk.ac.at> References: <41F52720.10304@isi.edu> <1106592267.7192.48.camel@lap10-c703.uibk.ac.at> Message-ID: <41F5CFFA.6060901@isi.edu> Michael Welzl wrote: >>I did some tests on the ATOMIC LAN at ISI in the early 1990s. One part >>often ignored in simulations is that some TCPs basically ignore >>congestion control on a LAN - if the dest addr is inside the same subnet >>as the interface being sent, the software "assumes" they're on a LAN and >>that the best thing to do is fight it out at the MAC layer (not always a >>good assumption, we helped show). 
>
>
> I can't believe anything like this if it was an "ATOMIC" LAN,
> as the name "ATOMIC" is synonymous with REALLY REALLY REALLY
> high speed where I live:
>
> http://www.atomicsnow.com/
>
> :-)
>
>
> The only appropriate use of the word "ATOMIC"...

FWIW, it was synonymous with high speed for us too:

www.isi.edu/div7/atomic
www.isi.edu/atomic2

ATOMIC originally stood for "ATm Over MosaIC", as in ATM over the
Mosaic chipset from CalTech. MOSAIC was a CPU used as a mesh-connected
multiprocessor, and it included a message switching component; ISI used
an array of just the switching components as a packet switch for a
640 Mbps LAN.

At that time (1995), 640 Mbps was fairly fast for a deployed,
operational LAN.

Joe

From s.malik at tuhh.de Tue Jan 25 01:51:21 2005
From: s.malik at tuhh.de (Sireen Habib Malik)
Date: Tue Jan 25 01:53:12 2005
Subject: [e2e] Traffic Engineering: OSPF-TE and RSVP-TE
In-Reply-To:
References:
Message-ID: <41F61699.3020805@tuhh.de>

Hi,

No clue about the RFCs but here is the basic idea....

OSPF-TE maintains the state of the network. The algorithm used for TE,
Constrained Shortest Path First (CSPF), is actually SPF (Dijkstra's)
with a little pre-conditioning.

Suppose your demand is 10 Mbps. In CSPF, all those links which do not
have 10 Mbps of residual capacity are first pruned out. Then SPF is
run on what remains. The route you get will support 10 Mbps. This is
your constraint-based route. You can use other constraints but the idea
remains the same.

RSVP-TE, a signalling protocol, establishes the LSP.

I am not very sure about your question about state propagation, but I
think OSPF would learn of the new network state on its own, on a slower
time granularity.
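The prune-then-SPF idea described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the algorithm from any RFC or router implementation: the function name `cspf`, the `(u, v, cost, residual_mbps)` link layout, and the example topology are all made up for the sketch.

```python
import heapq


def cspf(links, src, dst, demand_mbps):
    """Constrained SPF sketch: prune infeasible links, then run Dijkstra.

    links: iterable of (u, v, cost, residual_mbps) directed links
           (hypothetical layout chosen for this example).
    Returns (total_cost, path) or None when no feasible path exists.
    """
    # Step 1 (pre-conditioning): keep only links whose residual
    # capacity can carry the demand.
    adj = {}
    for u, v, cost, residual in links:
        if residual >= demand_mbps:
            adj.setdefault(u, []).append((v, cost))

    # Step 2: ordinary Dijkstra shortest path on the pruned graph.
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if dst not in dist:
        return None

    # Walk the predecessor chain back from dst to rebuild the path.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[dst], path


# Hypothetical topology: the cheap A-B link has only 5 Mbps left.
example_links = [
    ("A", "B", 1, 5),
    ("B", "D", 1, 100),
    ("A", "C", 2, 50),
    ("C", "D", 2, 50),
]

print(cspf(example_links, "A", "D", 10))
```

With a 10 Mbps demand the A-B link is pruned, so the route goes via C even though the path via B has the lower cost. Other constraints could be handled the same way by changing the pruning test in step 1.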
-- SM Fahad Dogar wrote: >I am currently implementing a restoration routing module for MPLS >Traffic Engineering. I've few questions related to the various >protocols that are required for traffic engineering. Any help or >pointers to relevant RFC's/drafts would be highly appreciated. > >1) There are extensions proposed to OSPF v2 which specifies the >additional information required to be propagated for Traffic >Engineering [RFC 3630:TE extensions to OSPF v2]. However, this RFC >does not provide details of the constraint based routing that can be >done using this additional information. RFC 2676: "QoS routing >Mechanisms and OSPF extensions" specifies the constraint based routing >algorithms but the OSPF extensions proposed in this document are >different from RFC 3630. Can any one point out the mechanisms that >could be used for constraint based routing using the additional >information proposed in RFC 3630. > >2) There have been proposed extensions to RSVP to support Traffic >Engineering [RFC 3209]. However, these extensions do not provide >details of the interfacing between RSVP-TE and OSPF-TE. Just to >elaborate, RSVP-TE would need to find a path that could satisfy the >QoS constraint specified by the request. How would it use OSPF to find >a constraint based route? Moreover, once approprate reservations are >made, how is it ensured that the new link state is correctly >propagated through OSPF-TE extensions? The answers to the above >questions relate to the interfacing issues pertaining to RSVP-TE and >OSPF. > >Thanks in advance, >Fahad > > -- Sireen Malik, M.Sc. PhD. Candidate, Communication Networks Hamburg University of Technology, FSP 4-06 (room 3008) Denickestr. 
17 21073 Hamburg, Deutschland Tel: +49 (40) 42-878-3387 Fax: +49 (40) 42-878-2941 E-Mail: s.malik@tuhh.de --Everything should be as simple as possible, but no simpler (Albert Einstein) From thuel at lucent.com Mon Jan 24 12:45:57 2005 From: thuel at lucent.com (Sandy Thuel) Date: Tue Jan 25 08:16:20 2005 Subject: [e2e] ICCCN 2005 CFP Message-ID: <41F55E85.8467BDFC@lucent.com> ICCCN 2005 CALL FOR PAPERS FOURTEENTH INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS October 17-19, 2005 (Tentative) San Diego, California USA Website: http://icccn.sce.umkc.edu Sponsors*: Technical Co-Sponsorship by IEEE TCCC (Technical Committee on Computer Communications), IBM, Avaya Labs, and Nokia (*pending approval of sponsorships) ICCCN is a major international conference for the presentation of original and fundamental advances in the field of Computer Communications and Networks. It also serves to foster communication among researchers and practitioners working in a wide variety of scientific areas with a common interest in improving Computer Communications and Networks SCOPE: The primary focus of the conference is on new and original research results in the areas of design, implementation and applications of Computer Communications and Networks. We invite you to submit papers that address novel, challenging, and innovative results. 
The topics include, but are not limited to: eCommerce Internet Services/Applications Protocols Network Control and Management Intelligent Networks Data Traffic Engineering Networked Databases Optical Communication Networks Wireless/Mobile/Satellite Networks Cable Broadband Technologies Mobile and Pervasive Computing Multimedia Communication over IP Networks Voice over IP Security/Reliability/Dependability of wired/wireless networks Network Interoperability Multicasting Streaming Networks Network Performance Network Architectures Terabit Optical Technologies Wireless Multimedia Applications DSL Technologies Network Processing Mobile Ad hoc Networks (MANETs) Sensor Networks SUBMISSION OF PAPERS: Authors are invited to submit complete and original papers. Papers to be submitted should not have been previously published in another forum, and should not be currently under review by another journal or conference. All submitted papers will be refereed for quality, correctness, originality and relevance. Of particular interest are papers that address concrete experiences with computer communications/networks and applications. An accepted paper must be presented by one of the authors at the conference venue. These accepted papers will be published in the conference proceedings by IEEE Press. All manuscripts must be limited to 6 pages with font size 10 in standard IEEE camera ready format (double column). The Program Committee reserves the right to decline without review any papers that exceed these length specifications. Submissions also must include the title, author(s) and affiliation, e-mail address, fax/phone numbers and postal address. In case of multiple authors, indicate which author is responsible for correspondence and preparing the camera ready paper for the proceedings. Electronic submission is required (ps or pdf format is preferred). Manuscripts should be submitted by April 15, 2005 to the ICCCN2005 website. 
STUDENT POSTER SESSIONS: The conference includes student poster sessions that highlight recent and ongoing research that has not been published elsewhere. An electronic (Postscript or PDF) version of the poster must be submitted to the conference website, along with a 200-word abstract. The first author of the poster must be a student at the time of submission; the student is expected to attend the conference and be available for discussion during the poster sessions. Accepted abstracts will be published in the conference proceedings. Students with accepted posters receive a discount from the regular conference registration fee. The paper session and poster sessions are two fully independent conference tracks, with separate review procedures. Submitted papers are not considered for poster sessions, and poster submissions are not considered for paper sessions. Submission of identical research material to both paper and poster sessions is not allowed. Please contact Program Co-Chairs below with any questions: Prof. Yuanyuan Yang Dept. of Electrical and Computer Engineering State University of New York at Stony Brook yang@ece.sunysb.edu +1 631 632 8474 (voice) +1 631 632 8494 (fax) Dr. Sandra R. Thuel Bell Laboratories, Room 4F515 Networking Techniques Research Department 101 Crawfords Corner Road Holmdel, New Jersey 07733 thuel@lucent.com +1 732 949 8897 IMPORTANT DATES: Paper submission deadline : April 15, 2005 Poster submission deadline : April 15, 2005 Notification of acceptance: June 27, 2005 Camera ready papers due: July 30, 2005 STUDENT FORUM: We encourage submissions from students. Some travel assistance may be available for students with top quality papers. WEBSITE: Please visit the ICCCN2005 web site http://icccn.sce.umkc.edu for more up-to-date information. GENERAL CHAIR: Prof. 
Luiz DaSilva Virginia Polytechnic Institute and State University ldasilva@vt.edu ****************************************************************** BEST PAPER AWARD: ICCCN will select the best paper each year and authors of the paper will be recognized at the conference ****************************************************************** From cannara at attglobal.net Tue Jan 25 10:37:04 2005 From: cannara at attglobal.net (Cannara) Date: Tue Jan 25 11:14:35 2005 Subject: The 1/e myth, was Re: [e2e] TCP Local Area Normal behaviour? any References: Message-ID: <41F691D0.FA848706@attglobal.net> Sure, multiple tokens allow things to happen better, if the ring has enough propagation delay, but the effect is simply reduced a bit, not eliminated. There's no beating the basic reality of CS: If no one's talking, send now! But we all know we need CD: Oops, who whacked my pkt?! In any case, at the access rates most LANs are run at, there's no sense waiting for any token. One of the expensive absurdities of the '90s was the Token Ring Switch. Alex Vadim Antonov wrote: > > On Sat, 22 Jan 2005, Cannara wrote: > > > -- the gist is that time to complete a given > > pkt send starts out above zero in a token system and increases in proportion > > to the number of connected nodes, > > This is not exactly true, as you may have multiple tokens From cannara at attglobal.net Tue Jan 25 10:03:21 2005 From: cannara at attglobal.net (Cannara) Date: Tue Jan 25 11:15:01 2005 Subject: [e2e] TCP Local Area Normal behaviour? any references? References: Message-ID: <41F689E9.5072B957@attglobal.net> Jon, You might check with some of the large storage-systems companies that still use TCP between fast nodes. One example here in Silicon Valley, a few years ago, altered their stacks so the receive window was simply ignored -- they knew their machines could keep up at 100Mb/s, so the sender just blasted away. 
Sort of like 3Com's NBT for LANs in 1988 -- start out by sending 42 full sized pkts back to back, see what's acked and either continue or throttle way down (real end-end stuff). In testing a Gb system here, our lab uses a pretty common PC and a Gb Enet card to run over 300Mb/s on a LAN with cheap DLink switches (Gb Cu/fiber) -- no bottleneck in the LAN, just at the sender. Of course our Ixia box can whack along at a Gb, but again, no Gb LAN bottleneck. With normal office switched LANs, there certainly will be limits reached in the matrix and interswitch links, and those are usually spelled out in specs: either for boxes or chips. But, that's what backpressure is for. :] Alex Jon Crowcroft wrote: > > this is mainly for educational reasons, not research: > > so i am looking for any papers or dissertations about the typical behaviour of TCP on a LAN - I cannot find > anything that doesnt include some intermediate device which is a bottleneck, but I 'd love to see a set of > traces/analyses of a few of today's typical TCP implementations (lets say win98, XP, OSX, bsd, linux and some commercial > unix server ones) between typical cliens and servers on 10/100 (perhaps gigE)... > > its sort of boring and i guess hard to get published but its quite hard to explain and is the base case when > starting to teach TCP (and i know there's lots of "corner case" code which means that MTU and window choices may be > difference ...) > > pointers appreciated... > > j. 
From dwing at cisco.com Wed Jan 26 09:07:14 2005 From: dwing at cisco.com (Dan Wing) Date: Wed Jan 26 09:25:14 2005 Subject: [e2e] overlay over TCP In-Reply-To: <41F4EF8E.4050601@stewart.chicago.il.us> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> <41F162CB.1020000@reed.com> <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com> <41F4EF8E.4050601@stewart.chicago.il.us> Message-ID: On Jan 24, 2005, at 4:52 AM, Randall Stewart wrote: > Dan Wing wrote: >> On Jan 21, 2005, at 12:15 PM, David P. Reed wrote: >>> Dan Wing wrote: >>> >>>> >>>> Yes, combined with little market demand, as yet, for a NAT to >>>> handle SCTP. >>> >>> >>> There is this chicken/egg problem. If SCTP doesn't work over NATs >>> it won't be used for applications where NATs are heavily used. >>> Then there won't be demand (at least no evidence of it). >> This egg was demonstrably cooked with IPsec, which had the same >> problem. IPsec "passthru" was implemented on nearly all vendor's >> residential NATs at about the same time IPsec-over-UDP was beginning >> to hit the market. Passthru works by examining SPI's and simple >> mechanisms have drawbacks (only one session through the NAT, or only >> one session to a specific remote IP address, for example), and >> IPsec-over-UDP has even more packet bloat than IPsec itself. >> I expect DCCP, SCTP, and other new protocols will suffer the same >> awkward deployment unless we (in the collective sense) improve the >> situation with guidance from people familiar with those new >> protocols. draft-xie-tsvwg-sctp-nat-00.txt is a move in the right >> direction, although it seems NATting SCTP may well be complex. > > It's not that complex.. I admit to only reading that I-D once, but NATting SCTP is certainly more complex than NATting TCP or UDP, especially with multihoming. 
I'm unclear how two SCTP devices, behind their own NATs, can communicate with each other. The communication problem seems akin to two UDP or TCP devices, behind their own NATs, communicating with each other ---- the NAT will have to preserve the port numbers which means only one SCTP device is permitted behind a NAT, or else the NAT will have to multiplex using something other than the SCTP port number, or else you need an SCTP port discovery protocol (akin to STUN for UDP). > and yes Cisco has had at least one > customer ask for it... Have they had lots .. no. The > reason being where Cisco currently makes money from > SCTP is inside the network. Most folks don't run their > SS7 over IP network where they want to have a NAT > to Global address cross over. [...] I expect SCTP will find more applications than just SS7-over-SCTP, and that will help drive the need for NATs that understand SCTP. -d From randall at stewart.chicago.il.us Wed Jan 26 12:20:47 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Wed Jan 26 12:24:36 2005 Subject: [e2e] overlay over TCP In-Reply-To: References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> <41F162CB.1020000@reed.com> <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com> <41F4EF8E.4050601@stewart.chicago.il.us> Message-ID: <41F7FB9F.6070003@stewart.chicago.il.us> Dan Wing wrote: > > On Jan 24, 2005, at 4:52 AM, Randall Stewart wrote: > >> Dan Wing wrote: >> >>> On Jan 21, 2005, at 12:15 PM, David P. Reed wrote: >>> >>>> Dan Wing wrote: >>>> >>>>> >>>>> Yes, combined with little market demand, as yet, for a NAT to >>>>> handle SCTP. >>>> >>>> >>>> >>>> There is this chicken/egg problem. If SCTP doesn't work over NATs >>>> it won't be used for applications where NATs are heavily used. 
>>>> Then there won't be demand (at least no evidence of it).
>>>
>>> This egg was demonstrably cooked with IPsec, which had the same
>>> problem. IPsec "passthru" was implemented on nearly all vendor's
>>> residential NATs at about the same time IPsec-over-UDP was beginning
>>> to hit the market. Passthru works by examining SPI's and simple
>>> mechanisms have drawbacks (only one session through the NAT, or only
>>> one session to a specific remote IP address, for example), and
>>> IPsec-over-UDP has even more packet bloat than IPsec itself.
>>> I expect DCCP, SCTP, and other new protocols will suffer the same
>>> awkward deployment unless we (in the collective sense) improve the
>>> situation with guidance from people familiar with those new
>>> protocols. draft-xie-tsvwg-sctp-nat-00.txt is a move in the right
>>> direction, although it seems NATting SCTP may well be complex.
>>
>>
>> It's not that complex..
>
>
> I admit to only reading that I-D once, but NATting SCTP is certainly
> more complex than NATting TCP or UDP, especially with multihoming.
>
> I'm unclear how two SCTP devices, behind their own NATs, can communicate
> with each other. The communication problem seems akin to two UDP or TCP
> devices, behind their own NATs, communicating with each other ---- the
> NAT will have to preserve the port numbers which means only one SCTP
> device is permitted behind a NAT, or else the NAT will have to multiplex
> using something other than the SCTP port number, or else you need an
> SCTP port discovery protocol (akin to STUN for UDP).

No, why do you say that? If you follow the recommendation of the
drafts you get

--From IP-A'(port:2222)-INIT(**)-To:IP-Z:port-80---->

** No addresses listed aka we are singly homed.

NAT gets it and makes it

----->From IP-A(port:9999)-INIT(**)---To:IPZ:port-80-------->

NAT at IPZ gets it and does whatever static mapping.. i.e. you
have the same problem you have with an Apache server behind a NAT ..
you must have the NAT direct the port 80 connection somewhere.. this is
the same for TCP..

----From IP-A(port:9999)---INIT(**)----To:IPZ':port-8080---->

And the reverse mappings take place the opposite way...

I don't see how this does not work..

R

>
>> and yes Cisco has had at least one
>> customer ask for it... Have they had lots .. no. The
>> reason being where Cisco currently makes money from
>> SCTP is inside the network. Most folks don't run their
>> SS7 over IP network where they want to have a NAT
>> to Global address cross over.
>
> [...]
>
> I expect SCTP will find more applications than just SS7-over-SCTP, and
> that will help drive the need for NATs that understand SCTP.
>
> -d
>
>

--
Randall Stewart
803-345-0369 815-342-5222(cell)

From dwing at cisco.com Wed Jan 26 13:48:15 2005
From: dwing at cisco.com (Dan Wing)
Date: Wed Jan 26 13:50:43 2005
Subject: [e2e] NATting SCTP
In-Reply-To: <41F7FB9F.6070003@stewart.chicago.il.us>
References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu>
 <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us>
 <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com>
 <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com>
 <41F162CB.1020000@reed.com>
 <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com>
 <41F4EF8E.4050601@stewart.chicago.il.us>
 <41F7FB9F.6070003@stewart.chicago.il.us>
Message-ID:

(CC'ing ietf-behave, and setting reply-to to ietf-behave)

On Jan 26, 2005, at 12:20 PM, Randall Stewart wrote:

>>>
>> I admit to only reading that I-D once, but NATting SCTP is certainly
>> more complex than NATting TCP or UDP, especially with multihoming.
>> I'm unclear how two SCTP devices, behind their own NATs, can
>> communicate with each other.
The communication problem seems akin to >> two UDP or TCP devices, behind their own NATs, communicating with >> each other ---- the NAT will have to preserve the port numbers which >> means only one SCTP device is permitted behind a NAT, or else the NAT >> will have to multiplex using something other than the SCTP port >> number, or else you need an SCTP port discovery protocol (akin to >> STUN for UDP). > No why do you say that? If you follow the recomendation of the > drafts you get > > --From IP-A'(port:2222)-INIT(**)-To:IP-Z:port-80----> > > ** No addresses listed aka we are singly homed. > > Nat gets it and makes it > > ----->From IP-A(port:9999)-INIT(**)---To:IPZ:port-80--------> > > Nat at IPZ gets it and does whatever static mapping.. i.e. you > have the same problem you have with a apache server behind a NAT .. you > must have the NAT direct the port 80 connection somewhere.. this is > the same for TCP.. > > ----From IP-A(port:9999)---INIT(**)----To:IPZ':port-8080----> > > > And the reverse mappings take place the opposite ways... > > I don't see how this does not work.. I agree it works, but only if: 1. you know the SCTP servers behind the NAT, and 2. reconfigure your NAT to do port forwarding, and 3. inform the remote SCTP client of your public SCTP port (which, if there is only one SCTP device behind the NAT, might well be the same as your application's default SCTP port). However, the barista at the local coffeeshop won't reconfigure their NAT to support your SCTP server while you're using wireless at the coffeeshop. And my proverbial father wouldn't know how to reconfigure his NAT, either, and my SCTP application won't know the public SCTP port he chose to use when I visit his house with my wireless device. As to your statement that this is how TCP works, there are active efforts to make TCP servers behind NATs 'just work', without requiring the NAT to be configured for port forwarding. 
See for example http://nutss.gforge.cis.cornell.edu/stunt.php, which would require only (3) from my list above -- which is necessary anyway if the same application exists on multiple hosts behind a common NAT. -d > R > >>> and yes Cisco has had at least one >>> customer ask for it... Have they had lots .. no. The >>> reason being where Cisco currently makes money from >>> SCTP is inside the network. Most folks don't run their >>> SS7 over IP network where they want to have a NAT >>> to Global address cross over. >> [...] >> I expect SCTP will find more applications than just SS7-over-SCTP, >> and that will help drive the need for NATs that understand SCTP. >> -d > > > -- > Randall Stewart > 803-345-0369 815-342-5222(cell) > From Laconsults at aol.com Thu Jan 27 13:36:07 2005 From: Laconsults at aol.com (Laconsults@aol.com) Date: Thu Jan 27 13:41:30 2005 Subject: [e2e] {Filename?} Subject : All Windows Platforms Next Generation TCP Message-ID: <36.6b313480.2f2ab8c7@aol.com> Skipped content of type multipart/alternative-------------- next part -------------- This is a message from the MailScanner E-Mail Virus Protection Service ---------------------------------------------------------------------- The original e-mail attachment "NextGenTCP.ZIP" has an unusual filename and could possibly be infected with a virus. As a precaution, the attachment has been quarantined. Virus scanner report for Thu Jan 27 13:36:24 2005: MailScanner: Compiled help files are very dangerous in email (PCATTCP.chm) Quarantine location on the ISI-4-30-3 MailScanner: /var/spool/quarantine/20050127/j0RLaHQ11070 If you were expecting the attachment and would like to receive it, please forward this e-mail to action@isi.edu for assistance. If this is urgent, please call Action at x88289 after forwarding the message. 
Thank you, IPC Computing Services From Laconsults at aol.com Thu Jan 27 14:58:06 2005 From: Laconsults at aol.com (Laconsults@aol.com) Date: Thu Jan 27 14:58:23 2005 Subject: [e2e] Subject : All Windows Platforms Next Generation TCP Message-ID: <197.37373862.2f2acbfe@aol.com> Subject : All Windows Platforms Next Generation TCP here is our NextGenTCP for all Windows platforms, ready for immediate production network uses even for this 1st version ( or visit http://iwxchange.com ) . you can easily set this up for production use on your home/ office LAN PCs. Its guaranteed SAFE not affecting systems in anyway whastsoever. Install this on your LAN/ WAN & immediately working corporate wide within minutes achieving immediate end2end PSTN transmissions quality, not needing multimillion pounds & 6 months timeframe QOS/ MPLS etc for PSTN quality VoIP/ VideoConference this software can be freely distributed widely with 3 months free license automatic grant , expires 1 May 2005. WXChange http://iwxchange.com patent@iwxchange.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.postel.org/pipermail/end2end-interest/attachments/20050127/b39b6e11/attachment.html From alokdube at hotpop.com Thu Jan 27 15:08:35 2005 From: alokdube at hotpop.com (Alok) Date: Thu Jan 27 15:12:09 2005 Subject: [e2e] {Filename?} Subject : All Windows Platforms Next GenerationTCP References: <36.6b313480.2f2ab8c7@aol.com> Message-ID: <002a01c504c5$3cdfde60$350d10ac@rs.riverstonenet.com> is this for real :)) ?? ----- Original Message ----- From: Laconsults@aol.com To: end2end-interest@postel.org Sent: Thursday, January 27, 2005 1:36 PM Subject: [e2e] {Filename?} Subject : All Windows Platforms Next GenerationTCP Warning: This message has had one or more attachments removed (PCATTCP.chm, NextGenTCP.ZIP). Please read the "ISI-4-30-3-Attachment-Warning.txt" attachment(s) for more information. 
Subject : All Windows Platforms Next Generation TCP here is our NextGenTCP for all Windows platforms, ready for immediate production network uses even for this 1st version ( or visit http://iwxchange.com ) . you can easily set this up for production use on your home/ office LAN PCs. Its guaranteed SAFE not affecting systems in anyway whastsoever. Install this on your LAN/ WAN & immediately working corporate wide within minutes achieving immediate end2end PSTN transmissions quality, not needing multimillion pounds & 6 months timeframe QOS/ MPLS etc for PSTN quality VoIP/ VideoConference this software can be freely distributed widely with 3 months free license automatic grant , expires 1 May 2005. WXChange http://iwxchange.com patent@iwxchange.com This is a message from the MailScanner E-Mail Virus Protection Service ---------------------------------------------------------------------- The original e-mail attachment "NextGenTCP.ZIP" has an unusual filename and could possibly be infected with a virus. As a precaution, the attachment has been quarantined. Virus scanner report for Thu Jan 27 13:36:24 2005: MailScanner: Compiled help files are very dangerous in email (PCATTCP.chm) Quarantine location on the ISI-4-30-3 MailScanner: /var/spool/quarantine/20050127/j0RLaHQ11070 If you were expecting the attachment and would like to receive it, please forward this e-mail to action@isi.edu for assistance. If this is urgent, please call Action at x88289 after forwarding the message. Thank you, IPC Computing Services From touch at ISI.EDU Thu Jan 27 15:37:57 2005 From: touch at ISI.EDU (Joe Touch) Date: Thu Jan 27 15:39:10 2005 Subject: [e2e] {Filename?} Subject : All Windows Platforms Next Generation TCP In-Reply-To: <36.6b313480.2f2ab8c7@aol.com> References: <36.6b313480.2f2ab8c7@aol.com> Message-ID: <41F97B55.8070809@isi.edu> Yes, folks, this is for real. Real spam. It was posted from a list subscriber, so closed-list rules didn't help. 
I don't know why the spam filter didn't catch it either. The account owner has been notified and posting privileges suspended until the issue can be further resolved. Now back to our regularly-scheduled rants ;-) Joe
From randall at stewart.chicago.il.us Thu Jan 27 16:00:25 2005 From: randall at stewart.chicago.il.us (Randall Stewart) Date: Thu Jan 27 16:04:22 2005 Subject: [e2e] Re: NATting SCTP In-Reply-To: References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> <41F162CB.1020000@reed.com> <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com> <41F4EF8E.4050601@stewart.chicago.il.us> <41F7FB9F.6070003@stewart.chicago.il.us> Message-ID: <41F98099.6050607@stewart.chicago.il.us> Dan Wing wrote: > (CC'ing ietf-behave, and setting reply-to to ietf-behave) > > On Jan 26, 2005, at 12:20 PM, Randall Stewart wrote: > >>>> >>> I admit to only reading that I-D once, but NATting SCTP is certainly >>> more complex than NATting TCP or UDP, especially with multihoming. >>> I'm unclear how two SCTP devices, behind their own NATs, can >>> communicate with each other.
The communication problem seems akin to
>>> two UDP or TCP devices, behind their own NATs, communicating with
>>> each other ---- the NAT will have to preserve the port numbers which
>>> means only one SCTP device is permitted behind a NAT, or else the NAT
>>> will have to multiplex using something other than the SCTP port
>>> number, or else you need an SCTP port discovery protocol (akin to
>>> STUN for UDP).
>>
>> No why do you say that? If you follow the recomendation of the
>> drafts you get
>>
>> --From IP-A'(port:2222)-INIT(**)-To:IP-Z:port-80---->
>>
>> ** No addresses listed aka we are singly homed.
>>
>> Nat gets it and makes it
>>
>> ----->From IP-A(port:9999)-INIT(**)---To:IPZ:port-80-------->
>>
>> Nat at IPZ gets it and does whatever static mapping.. i.e. you
>> have the same problem you have with a apache server behind a NAT .. you
>> must have the NAT direct the port 80 connection somewhere.. this is
>> the same for TCP..
>>
>> ----From IP-A(port:9999)---INIT(**)----To:IPZ':port-8080---->
>>
>>
>> And the reverse mappings take place the opposite ways...
>>
>> I don't see how this does not work..
>
>
> I agree it works, but only if:
>
> 1. you know the SCTP servers behind the NAT, and

And does this differ from the way we do TCP servers behind a NAT?

> 2. reconfigure your NAT to do port forwarding, and

And again, does this differ from the way we do TCP servers behind a NAT?

> 3. inform the remote SCTP client of your public SCTP port (which,
> if there is only one SCTP device behind the NAT, might well be
> the same as your application's default SCTP port).

Again.. all of these caveats you list are the exact SAME ones that
apply when you put a TCP server behind a NAT.. you:

1) Place the TCP server behind the NAT
and
2) Configure your NAT to port forward <80> (for example) to 8080 at
10.1.1.1
and
3) Tell the client about the public port.

Nothing here is any different from the way current NATs work with
TCP...
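The three steps above can be sketched as a small table lookup. This is only a minimal illustration, assuming a NAT that keeps a hand-configured map from public port to private address and port, and that the same table would serve TCP or SCTP alike. The names Endpoint and StaticNat are hypothetical, and 192.0.2.1 is a documentation address standing in for IP-Z.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    ip: str
    port: int


class StaticNat:
    """Toy NAT with a manually configured port-forward table (hypothetical)."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.forward = {}   # public port -> private Endpoint
        self.reverse = {}   # private Endpoint -> public port

    def add_forward(self, public_port, private):
        # Step 2: somebody has to configure this mapping by hand.
        self.forward[public_port] = private
        self.reverse[private] = public_port

    def inbound(self, dst):
        # Rewrite the destination of a packet arriving from the outside.
        if dst.ip != self.public_ip or dst.port not in self.forward:
            return None  # no mapping configured, so the packet is dropped
        return self.forward[dst.port]

    def outbound(self, src):
        # Rewrite the source of the server's reply (the reverse mapping).
        if src not in self.reverse:
            return None
        return Endpoint(self.public_ip, self.reverse[src])


nat_z = StaticNat("192.0.2.1")                      # stands in for IP-Z
nat_z.add_forward(80, Endpoint("10.1.1.1", 8080))   # public 80 -> 10.1.1.1:8080

print(nat_z.inbound(Endpoint("192.0.2.1", 80)))     # forwarded to the server
print(nat_z.outbound(Endpoint("10.1.1.1", 8080)))   # reply leaves as public port 80
print(nat_z.inbound(Endpoint("192.0.2.1", 9999)))   # unconfigured port
```

The last lookup returns None: without the manual add_forward step the server is unreachable from outside, whichever transport is in use.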
> > However, the barista at the local coffeeshop won't reconfigure their NAT > to support your SCTP server while you're using wireless at the > coffeeshop. And my proverbial father wouldn't know how to reconfigure > his NAT, either, and my SCTP application won't know the public SCTP port > he chose to use when I visit his house with my wireless device. And let me guess you think someone at these same places is going to know how to configure a TCP server behind a NAT too? I think not.. > > As to your statement that this is how TCP works, there are active > efforts to make TCP servers behind NATs 'just work', without requiring > the NAT to be configured for port forwarding. See for example > http://nutss.gforge.cis.cornell.edu/stunt.php, which would require only > (3) from my list above -- which is necessary anyway if the same > application exists on multiple hosts behind a common NAT. Ok.. there are efforts underway ..I will look at your link.. but whatever you come up with for TCP you can do the same exact thing for SCTP.. Its the same work.. and note that it is "an effort under way" which means the folks at the coffee shop are still doing 1 and 2 to get their TCP server to work behind their NAT :-0 R > > -d > > >> R >> >>>> and yes Cisco has had at least one >>>> customer ask for it... Have they had lots .. no. The >>>> reason being where Cisco currently makes money from >>>> SCTP is inside the network. Most folks don't run their >>>> SS7 over IP network where they want to have a NAT >>>> to Global address cross over. >>> >>> [...] >>> I expect SCTP will find more applications than just SS7-over-SCTP, >>> and that will help drive the need for NATs that understand SCTP. 
>>> -d >> >> >> >> -- >> Randall Stewart >> 803-345-0369 815-342-5222(cell) >> > > -- Randall Stewart 803-345-0369 815-342-5222(cell) From alokdube at hotpop.com Thu Jan 27 21:10:19 2005 From: alokdube at hotpop.com (Alok) Date: Thu Jan 27 21:12:23 2005 Subject: [e2e] overlay over TCP References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> <41F162CB.1020000@reed.com><0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com> <41F4EF8E.4050601@stewart.chicago.il.us> Message-ID: <00f501c504f7$ab558930$350d10ac@rs.riverstonenet.com> > > There are other places, as well, that Cisco makes money > from SCTP.. but again they are all "inside the network" > places... > > However, that all being said, since Cisco does make > money from the protocol, and would benefit from > the M$ company producing SCTP with its O/S instead > having to place an add-on component.. encouraging > SCTP by making Cisco NAT's SCTP aware would help in > this.. after all someone must crack the egg :-D > > (and yes Dan, we do ship a special internal version > of SCTP for windows to some of our customers :o) > ......so why not extend RSVP-TE? At least we have devices within the network who "understand" that language...(and cisco can make *even more* money ;o)... ) One could always define a new protocol type for the same and use it. If SCTP can traverse NATs so can x,y,z in the same way, as long as you can reach the end point. 
-thanks Alok From dwing at cisco.com Sun Jan 30 23:20:13 2005 From: dwing at cisco.com (Dan Wing) Date: Sun Jan 30 23:20:40 2005 Subject: [e2e] Re: NATting SCTP In-Reply-To: <41F98099.6050607@stewart.chicago.il.us> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> <41F162CB.1020000@reed.com> <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com> <41F4EF8E.4050601@stewart.chicago.il.us> <41F7FB9F.6070003@stewart.chicago.il.us> <41F98099.6050607@stewart.chicago.il.us> Message-ID: <8C4DFBCB-7358-11D9-98B0-0003938AF740@cisco.com> On Jan 27, 2005, at 4:00 PM, Randall Stewart wrote: > Dan Wing wrote: >> (CC'ing ietf-behave, and setting reply-to to ietf-behave) >> On Jan 26, 2005, at 12:20 PM, Randall Stewart wrote: >>>>> >>>> I admit to only reading that I-D once, but NATting SCTP is >>>> certainly more complex than NATting TCP or UDP, especially with >>>> multihoming. >>>> I'm unclear how two SCTP devices, behind their own NATs, can >>>> communicate with each other. The communication problem seems akin >>>> to two UDP or TCP devices, behind their own NATs, communicating >>>> with each other ---- the NAT will have to preserve the port numbers >>>> which means only one SCTP device is permitted behind a NAT, or else >>>> the NAT will have to multiplex using something other than the SCTP >>>> port number, or else you need an SCTP port discovery protocol (akin >>>> to STUN for UDP). >>> >>> No why do you say that? If you follow the recomendation of the >>> drafts you get >>> >>> --From IP-A'(port:2222)-INIT(**)-To:IP-Z:port-80----> >>> >>> ** No addresses listed aka we are singly homed. >>> >>> Nat gets it and makes it >>> >>> ----->From IP-A(port:9999)-INIT(**)---To:IPZ:port-80--------> >>> >>> Nat at IPZ gets it and does whatever static mapping.. i.e. 
you >>> have the same problem you have with a apache server behind a NAT .. >>> you >>> must have the NAT direct the port 80 connection somewhere.. this is >>> the same for TCP.. >>> >>> ----From IP-A(port:9999)---INIT(**)----To:IPZ':port-8080----> >>> >>> >>> And the reverse mappings take place the opposite ways... >>> >>> I don't see how this does not work.. >> I agree it works, but only if: >> 1. you know the SCTP servers behind the NAT, and > > And does this differ with the way we do TCP servers behind a NAT? > >> 2. reconfigure your NAT to do port forwarding, and > > And again, does this differ with the way we do TCP servers behind a > NAT? >> 3. inform the remote SCTP client of your public SCTP port (which, >> if there is only one SCTP device behind the NAT, might well be >> the same as your application's default SCTP port). > > Again.. All of these cavets you list are the SAME exact ones that > you do when you put a TCP server behind a NAT.. you: > > 1) Place the TCP server behind the NAT > and > 2) configure your nat to port forward <80> (for example) to 8080 at > 10.1.1.1 > and > 3) Tell the client about the public port. > > Nothing here is any differnet then the way current nats work with > TCP... As I said, I agree it works. I agree that's what happens to use TCP servers today. If you're comfortable that SCTP servers will always need manual configuration of NATs, that's great. TCP hasn't worked out that way, though -- for example, Windows fileshares and "p2p" filesharing both want to connect to TCP servers behind NATs. >> However, the barista at the local coffeeshop won't reconfigure their >> NAT to support your SCTP server while you're using wireless at the >> coffeeshop. And my proverbial father wouldn't know how to >> reconfigure his NAT, either, and my SCTP application won't know the >> public SCTP port he chose to use when I visit his house with my >> wireless device. 
>
> And let me guess you think someone at these same places is going
> to know how to configure a TCP server behind a NAT too? I think
> not..

Today's difficulties in getting TCP servers working behind NATs have
caused several workarounds for applications, as you're undoubtedly
aware, because TCP servers that are behind unconfigured NATs are
inaccessible.

>> As to your statement that this is how TCP works, there are active
>> efforts to make TCP servers behind NATs 'just work', without
>> requiring the NAT to be configured for port forwarding. See for
>> example
>> http://nutss.gforge.cis.cornell.edu/stunt.php, which would require
>> only (3) from my list above -- which is necessary anyway if the same
>> application exists on multiple hosts behind a common NAT.
>
>
> Ok.. there are efforts underway ..I will look at your link.. but
> whatever you come up with for TCP you can do the same exact thing
> for SCTP..
>
> Its the same work.. and note that it is "an effort under way" which
> means the folks at the coffee shop are still doing 1 and 2 to
> get their TCP server to work behind their NAT :-0

-d

>
> R
>> -d
>>> R
>>>
>>>>> and yes Cisco has had at least one
>>>>> customer ask for it... Have they had lots .. no. The
>>>>> reason being where Cisco currently makes money from
>>>>> SCTP is inside the network. Most folks don't run their
>>>>> SS7 over IP network where they want to have a NAT
>>>>> to Global address cross over.
>>>>
>>>> [...]
>>>> I expect SCTP will find more applications than just SS7-over-SCTP,
>>>> and that will help drive the need for NATs that understand SCTP.
>>>> -d
>>>
>>>
>>>
>>> --
>>> Randall Stewart
>>> 803-345-0369 815-342-5222(cell)
>>>
>
>
> --
> Randall Stewart
> 803-345-0369 815-342-5222(cell)
>

From randall at stewart.chicago.il.us Mon Jan 31 05:07:33 2005
From: randall at stewart.chicago.il.us (Randall Stewart)
Date: Mon Jan 31 05:11:05 2005
Subject: [e2e] Re: NATting SCTP
In-Reply-To: <8C4DFBCB-7358-11D9-98B0-0003938AF740@cisco.com>
References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu>
 <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us>
 <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com>
 <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com>
 <41F162CB.1020000@reed.com>
 <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com>
 <41F4EF8E.4050601@stewart.chicago.il.us>
 <41F7FB9F.6070003@stewart.chicago.il.us>
 <41F98099.6050607@stewart.chicago.il.us>
 <8C4DFBCB-7358-11D9-98B0-0003938AF740@cisco.com>
Message-ID: <41FE2D95.9040308@stewart.chicago.il.us>

Dan:

So what magic do you expect SCTP to be able to
do that TCP cannot?

Do we design protocols to go through NATs now?

I realize that this is a long-standing problem... I suppose
one could change the fundamental API of SCTP and move
it to use PPIDs in the data chunks INSTEAD of
ports to access apps... but even then you still
have a problem.. how do you find a port someone is
listening on behind a NAT... that is the bottom-line
problem and I don't think I know a solution... do you?

R

Dan Wing wrote:
>
> On Jan 27, 2005, at 4:00 PM, Randall Stewart wrote:
>
>> Dan Wing wrote:
>>
>>> (CC'ing ietf-behave, and setting reply-to to ietf-behave)
>>> On Jan 26, 2005, at 12:20 PM, Randall Stewart wrote:
>>>
>>>>>>
>>>>> I admit to only reading that I-D once, but NATting SCTP is
>>>>> certainly more complex than NATting TCP or UDP, especially with
>>>>> multihoming.
>>>>> I'm unclear how two SCTP devices, behind their own NATs, can
>>>>> communicate with each other.
The communication problem seems akin >>>>> to two UDP or TCP devices, behind their own NATs, communicating >>>>> with each other ---- the NAT will have to preserve the port numbers >>>>> which means only one SCTP device is permitted behind a NAT, or else >>>>> the NAT will have to multiplex using something other than the SCTP >>>>> port number, or else you need an SCTP port discovery protocol (akin >>>>> to STUN for UDP). >>>> >>>> >>>> No why do you say that? If you follow the recomendation of the >>>> drafts you get >>>> >>>> --From IP-A'(port:2222)-INIT(**)-To:IP-Z:port-80----> >>>> >>>> ** No addresses listed aka we are singly homed. >>>> >>>> Nat gets it and makes it >>>> >>>> ----->From IP-A(port:9999)-INIT(**)---To:IPZ:port-80--------> >>>> >>>> Nat at IPZ gets it and does whatever static mapping.. i.e. you >>>> have the same problem you have with a apache server behind a NAT .. you >>>> must have the NAT direct the port 80 connection somewhere.. this is >>>> the same for TCP.. >>>> >>>> ----From IP-A(port:9999)---INIT(**)----To:IPZ':port-8080----> >>>> >>>> >>>> And the reverse mappings take place the opposite ways... >>>> >>>> I don't see how this does not work.. >>> >>> I agree it works, but only if: >>> 1. you know the SCTP servers behind the NAT, and >> >> >> And does this differ with the way we do TCP servers behind a NAT? >> >>> 2. reconfigure your NAT to do port forwarding, and >> >> >> And again, does this differ with the way we do TCP servers behind a NAT? >> >>> 3. inform the remote SCTP client of your public SCTP port (which, >>> if there is only one SCTP device behind the NAT, might well be >>> the same as your application's default SCTP port). >> >> >> Again.. All of these cavets you list are the SAME exact ones that >> you do when you put a TCP server behind a NAT.. 
you: >> >> 1) Place the TCP server behind the NAT >> and >> 2) configure your nat to port forward <80> (for example) to 8080 at >> 10.1.1.1 >> and >> 3) Tell the client about the public port. >> >> Nothing here is any differnet then the way current nats work with >> TCP... > > > As I said, I agree it works. I agree that's what happens to use TCP > servers today. If you're comfortable that SCTP servers will always need > manual configuration of NATs, that's great. TCP hasn't worked out that > way, though -- for example, Windows fileshares and "p2p" filesharing > both want to connect to TCP servers behind NATs. > >>> However, the barista at the local coffeeshop won't reconfigure their >>> NAT to support your SCTP server while you're using wireless at the >>> coffeeshop. And my proverbial father wouldn't know how to >>> reconfigure his NAT, either, and my SCTP application won't know the >>> public SCTP port he chose to use when I visit his house with my >>> wireless device. >> >> >> And let me guess you think someone at these same places is going >> to know how to configure a TCP server behind a NAT too? I think >> not.. > > > Today's difficulties in getting TCP servers working behind NATs has > caused several > workarounds for applications, as you're undoubtedly aware, because TCP > servers > that are behind unconfigured NATs are inaccessible. > >>> As to your statement that this is how TCP works, there are active >>> efforts to make TCP servers behind NATs 'just work', without >>> requiring the NAT to be configured for port forwarding. See for example >>> http://nutss.gforge.cis.cornell.edu/stunt.php, which would require >>> only (3) from my list above -- which is necessary anyway if the same >>> application exists on multiple hosts behind a common NAT. >> >> >> >> Ok.. there are efforts underway ..I will look at your link.. but >> whatever you come up with for TCP you can do the same exact thing >> for SCTP.. >> >> Its the same work.. 
and note that it is "an effort under way" which >> means the folks at the coffee shop are still doing 1 and 2 to >> get their TCP server to work behind their NAT :-0 > > > -d > > >> >> R >> >>> -d >>> >>>> R >>>> >>>>>> and yes Cisco has had at least one >>>>>> customer ask for it... Have they had lots .. no. The >>>>>> reason being where Cisco currently makes money from >>>>>> SCTP is inside the network. Most folks don't run their >>>>>> SS7 over IP network where they want to have a NAT >>>>>> to Global address cross over. >>>>> >>>>> >>>>> [...] >>>>> I expect SCTP will find more applications than just SS7-over-SCTP, >>>>> and that will help drive the need for NATs that understand SCTP. >>>>> -d >>>> >>>> >>>> >>>> >>>> -- >>>> Randall Stewart >>>> 803-345-0369 815-342-5222(cell) >>>> >> >> >> -- >> Randall Stewart >> 803-345-0369 815-342-5222(cell) >> > > -- Randall Stewart 803-345-0369 815-342-5222(cell) From dwing at cisco.com Mon Jan 31 08:18:48 2005 From: dwing at cisco.com (Dan Wing) Date: Mon Jan 31 08:20:35 2005 Subject: [e2e] Re: NATting SCTP In-Reply-To: <41FE2D95.9040308@stewart.chicago.il.us> References: <41E5CD14.4010206@reed.com> <41E6B57C.6060705@isi.edu> <41E6EA6D.1080705@reed.com> <41EECB63.9000304@stewart.chicago.il.us> <508BD1B4-6A72-11D9-A5A2-000D93B505E6@extremenetworks.com> <5F9E4D56-6BCE-11D9-8F73-0003938AF740@cisco.com> <41F162CB.1020000@reed.com> <0E0FA274-6BED-11D9-8F73-0003938AF740@cisco.com> <41F4EF8E.4050601@stewart.chicago.il.us> <41F7FB9F.6070003@stewart.chicago.il.us> <41F98099.6050607@stewart.chicago.il.us> <8C4DFBCB-7358-11D9-98B0-0003938AF740@cisco.com> <41FE2D95.9040308@stewart.chicago.il.us> Message-ID: On Jan 31, 2005, at 5:07 AM, Randall Stewart wrote: > Dan: > > So what magic do you expect SCTP to be able to > do that TCP can not? 
Let me give an example: Today it is possible, with the majority of the NATs on the market, to set up a bi-directional UDP 'connection' with two hosts that are behind their own NATs, _without_ changing the configuration of the NAT. This can be done with STUN (RFC3489), which is done by the application. However, with TCP, it currently isn't possible to do the same -- rather, at least one of the NATs has to be configured. > Do we design protocols to go through NAT's now? I'm not suggesting changes to SCTP, the protocol. Rather, I'm suggesting that the recommendations for NATting SCTP be written to allow two SCTP applications behind their own NATs to communicate directly with each other, without requiring configuration changes to their NATs. I expect that SCTP can do this. To answer your question, though, it's my understanding that IPsec-over-UDP was created primarily to traverse NATs. And I know STUN (RFC3489) was created expressly to assist UDP applications traverse NATs. > I realize that this is a long standing problem... I suppose > one could change the fundamental API of SCTP and move > it to use PPID's in the data chunks INSTEAD of > ports to access apps... but even at that you still > have a problem.. how do you find a port someone is > listening on behind a NAT... that is the bottome line > problem and I don't think I know a solution... do you? The device behind the NAT has to send a packet (towards the Internet) in order to establish the NAT binding. -d > R > > > Dan Wing wrote: >> On Jan 27, 2005, at 4:00 PM, Randall Stewart wrote: >>> Dan Wing wrote: >>> >>>> (CC'ing ietf-behave, and setting reply-to to ietf-behave) >>>> On Jan 26, 2005, at 12:20 PM, Randall Stewart wrote: >>>> >>>>>>> >>>>>> I admit to only reading that I-D once, but NATting SCTP is >>>>>> certainly more complex than NATting TCP or UDP, especially with >>>>>> multihoming. >>>>>> I'm unclear how two SCTP devices, behind their own NATs, can >>>>>> communicate with each other. 
>>>>>> The communication problem seems
>>>>>> akin to two UDP or TCP devices, behind their own NATs,
>>>>>> communicating with each other ---- the NAT will have to preserve
>>>>>> the port numbers which means only one SCTP device is permitted
>>>>>> behind a NAT, or else the NAT will have to multiplex using
>>>>>> something other than the SCTP port number, or else you need an
>>>>>> SCTP port discovery protocol (akin to STUN for UDP).
>>>>>
>>>>>
>>>>> No, why do you say that? If you follow the recommendation of the
>>>>> drafts you get
>>>>>
>>>>> --From IP-A'(port:2222)-INIT(**)-To:IP-Z:port-80---->
>>>>>
>>>>> ** No addresses listed aka we are singly homed.
>>>>>
>>>>> NAT gets it and makes it
>>>>>
>>>>> ----->From IP-A(port:9999)-INIT(**)---To:IPZ:port-80-------->
>>>>>
>>>>> NAT at IPZ gets it and does whatever static mapping.. i.e. you
>>>>> have the same problem you have with an Apache server behind a NAT
>>>>> .. you
>>>>> must have the NAT direct the port 80 connection somewhere.. this is
>>>>> the same for TCP..
>>>>>
>>>>> ----From IP-A(port:9999)---INIT(**)----To:IPZ':port-8080---->
>>>>>
>>>>>
>>>>> And the reverse mappings take place the opposite ways...
>>>>>
>>>>> I don't see how this does not work..
>>>>
>>>> I agree it works, but only if:
>>>> 1. you know the SCTP servers behind the NAT, and
>>>
>>>
>>> And does this differ with the way we do TCP servers behind a NAT?
>>>
>>>> 2. reconfigure your NAT to do port forwarding, and
>>>
>>>
>>> And again, does this differ with the way we do TCP servers behind a
>>> NAT?
>>>
>>>> 3. inform the remote SCTP client of your public SCTP port (which,
>>>> if there is only one SCTP device behind the NAT, might well be
>>>> the same as your application's default SCTP port).
>>>
>>>
>>> Again.. All of these caveats you list are the SAME exact ones that
>>> you do when you put a TCP server behind a NAT..
>>> you:
>>>
>>> 1) Place the TCP server behind the NAT
>>> and
>>> 2) configure your NAT to port forward <80> (for example) to 8080 at
>>> 10.1.1.1
>>> and
>>> 3) Tell the client about the public port.
>>>
>>> Nothing here is any different than the way current NATs work with
>>> TCP...
>> As I said, I agree it works. I agree that's what has to happen to
>> use TCP servers today. If you're comfortable that SCTP servers will
>> always need manual configuration of NATs, that's great. TCP hasn't
>> worked out that way, though -- for example, Windows fileshares and
>> "p2p" filesharing both want to connect to TCP servers behind NATs.
>>>> However, the barista at the local coffeeshop won't reconfigure
>>>> their NAT to support your SCTP server while you're using wireless
>>>> at the coffeeshop. And my proverbial father wouldn't know how to
>>>> reconfigure his NAT, either, and my SCTP application won't know the
>>>> public SCTP port he chose to use when I visit his house with my
>>>> wireless device.
>>>
>>>
>>> And let me guess you think someone at these same places is going
>>> to know how to configure a TCP server behind a NAT too? I think
>>> not..
>> Today's difficulties in getting TCP servers working behind NATs have
>> caused several workarounds for applications, as you're undoubtedly
>> aware, because TCP servers that are behind unconfigured NATs are
>> inaccessible.
>>>> As to your statement that this is how TCP works, there are active
>>>> efforts to make TCP servers behind NATs 'just work', without
>>>> requiring the NAT to be configured for port forwarding. See for
>>>> example
>>>> http://nutss.gforge.cis.cornell.edu/stunt.php, which would require
>>>> only (3) from my list above -- which is necessary anyway if the
>>>> same application exists on multiple hosts behind a common NAT.
>>>
>>>
>>>
>>> Ok.. there are efforts underway ..I will look at your link.. but
>>> whatever you come up with for TCP you can do the same exact thing
>>> for SCTP..
>>>
>>> It's the same work.. and note that it is "an effort under way" which
>>> means the folks at the coffee shop are still doing 1 and 2 to
>>> get their TCP server to work behind their NAT :-0
>> -d
>>>
>>> R
>>>
>>>> -d
>>>>
>>>>> R
>>>>>
>>>>>>> and yes Cisco has had at least one
>>>>>>> customer ask for it... Have they had lots .. no. The
>>>>>>> reason being where Cisco currently makes money from
>>>>>>> SCTP is inside the network. Most folks don't run their
>>>>>>> SS7 over IP network where they want to have a NAT
>>>>>>> to Global address cross over.
>>>>>>
>>>>>>
>>>>>> [...]
>>>>>> I expect SCTP will find more applications than just
>>>>>> SS7-over-SCTP, and that will help drive the need for NATs that
>>>>>> understand SCTP.
>>>>>> -d
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Randall Stewart
>>>>> 803-345-0369 815-342-5222(cell)
>>>>>
>>>
>>>
>>> --
>>> Randall Stewart
>>> 803-345-0369 815-342-5222(cell)
>>>
>
>
> --
> Randall Stewart
> 803-345-0369 815-342-5222(cell)
>

From mallman at icir.org Mon Jan 31 21:18:41 2005
From: mallman at icir.org (Mark Allman)
Date: Mon Jan 31 21:20:37 2005
Subject: [e2e] cwnd update correction for congestion avoidance
In-Reply-To: <1105611740.4764.66.camel@lap10-c703.uibk.ac.at>
Message-ID: <20050201051841.316842379DE@lawyers.icir.org>

> I'm now taking this to tcpm, too...
>
> On Tuesday, Anil Agarwal sent a message to the end2end list
> which made it clear that the equation:
>
> cwnd += SMSS*SMSS/cwnd
>
> from 2581 does not really add a segment every RTT as desired.
> (He also went into some ABC related details, but I'll
> not go into them for now to keep things simple.)

A note here is that the above is not the only way the spec allows for
cwnd growth during congestion avoidance. Byte counting is also
explicitly allowed by RFC 2581:

    Another acceptable way to increase cwnd during congestion avoidance
    is to count the number of bytes that have been acknowledged by ACKs
    for new data.
    (A drawback of this implementation is that it requires maintaining
    an additional state variable.) When the number of bytes
    acknowledged reaches cwnd, then cwnd can be incremented by up to
    SMSS bytes. Note that during congestion avoidance, cwnd MUST NOT be
    increased by more than the larger of either 1 full-sized segment
    per RTT, or the value computed using equation 2.

allman


--
Mark Allman -- ICIR -- http://www.icir.org/mallman/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 185 bytes
Desc: not available
Url : http://www.postel.org/pipermail/end2end-interest/attachments/20050201/256299d5/attachment.bin
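
The two congestion-avoidance updates discussed in the message above can be compared with a small sketch. This is only an illustration, not code from any TCP stack: the value SMSS = 1460, the helper names, and the simplified one-RTT model (a full window acknowledged with no loss, cwnd held in bytes with integer division) are all assumptions made for the example.

```python
SMSS = 1460  # assumed segment size in bytes (illustrative value only)


def eq2_update(cwnd: int) -> int:
    """One per-ACK step of RFC 2581 equation 2: cwnd += SMSS*SMSS/cwnd."""
    # Integer division is an assumption here; add at least one byte so the
    # window still grows when cwnd is very large.
    return cwnd + max(1, SMSS * SMSS // cwnd)


def grow_eq2(cwnd: int, num_acks: int) -> int:
    """Apply equation 2 once for each non-duplicate ACK in one RTT."""
    for _ in range(num_acks):
        cwnd = eq2_update(cwnd)
    return cwnd


def grow_byte_counting(cwnd: int, num_acks: int, bytes_per_ack: int) -> int:
    """Byte counting: add SMSS once the ACKed byte count reaches cwnd."""
    bytes_acked = 0  # the additional state variable the RFC text mentions
    for _ in range(num_acks):
        bytes_acked += bytes_per_ack
        if bytes_acked >= cwnd:
            bytes_acked -= cwnd
            cwnd += SMSS
    return cwnd


if __name__ == "__main__":
    start = 10 * SMSS  # hypothetical window of ten full-sized segments
    # Receiver ACKs every segment: ten ACKs, SMSS bytes each.
    print("eq. 2, ACK every segment :", grow_eq2(start, 10) - start)
    # Receiver ACKs every other segment: five ACKs, 2*SMSS bytes each.
    print("eq. 2, ACK every other   :", grow_eq2(start, 5) - start)
    print("byte counting, every seg :",
          grow_byte_counting(start, 10, SMSS) - start)
    print("byte counting, every 2nd :",
          grow_byte_counting(start, 5, 2 * SMSS) - start)
```

Under these assumptions the per-ACK formula grows the window by somewhat less than one segment per RTT when every segment is acknowledged, and by roughly half of that when every other segment is acknowledged, while byte counting adds one full segment in both cases.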