From detlef.bosau at web.de Mon Jul 2 07:42:02 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 02 Jul 2007 16:42:02 +0200 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <46498002.9000903@gmail.com> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> Message-ID: <46890EBA.9000009@web.de>

Hi to all. Some weeks ago, I asked some questions about this topic on the e2e list. Perhaps the TCCC list might be more appropriate; however, I'm not quite sure. So, if I'm completely wrong here, I would appreciate it if someone could give me a hint.

I refer to systems like the Qualcomm 1xEV-DO wireless system, i.e. to systems with a high data rate TDM downlink from a base station (BS) to several (mobile) terminals (UE) and dedicated CDMA uplinks from the UEs to the BS. To my understanding, at the BS there is one queue per UE. All queues are served in turn, and the question is which scheduling mechanism is appropriate. Now, there is a huge amount of literature about this issue and perhaps I cannot place all of my questions into this one post, but let me please ask the following few.

1. In the uplink direction, CDMA is used. To my understanding, the use of CDMA requires a rather precise channel state model, at least for the purpose of proper power control. So the question is: can I reasonably assume a channel model on each terminal which in particular allows the terminal to identify periods of "locally high SNR", i.e. periods of constructive interference when Rayleigh fading is considered, and periods of "locally low SNR", i.e. periods of destructive interference respectively? To my understanding, each terminal issues a "data rate control" message once per time slot, and the content of this message could be derived from such a channel model. Is this correct?

2. When I look at the "opportunistic scheduling" algorithm as used e.g. by Jalali, Padovani and Pankaj, I honestly do not understand how the rationale of Kelly's "Charging and Rate Control for Elastic Traffic" paper applies to this algorithm. Could someone please give me some hints on this one?

Thanks Detlef

From detlef.bosau at web.de Mon Jul 2 11:16:14 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 02 Jul 2007 20:16:14 +0200 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> <46890EBA.9000009@web.de> <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> Message-ID: <468940EE.90606@web.de>

ksingh at irisa.fr wrote: > Just a small comment that most of the time I read it was for downlink. > (for sure now they are proposing it for uplink) > >

Up to now, I have only read papers where the direction is not specified, or papers which refer to the downlink. Basically, the problem is not to mix up media access and scheduling. In the downlink direction there is only one sender, which serves all terminals; hence we have a pure scheduling problem in this case. If we did OS in both directions, we would employ an OS-like scheme for the uplink media access. The key problem there would be to provide the BS with proper channel state information. In the system model I refer to, the uplink channels are dedicated, and it is hence no problem to send one DRC (data rate control) message to the BS per timeslot and channel.
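For readers who want the scheduling rule under discussion in concrete form: below is a minimal sketch (Python, not taken from any of the cited papers; the variable names and the filter constant t_c are illustrative assumptions) of the proportional fair rule usually attributed to Jalali, Padovani and Pankaj for the 1xEV-DO downlink. In each slot the base station serves the terminal with the largest ratio of requested rate (from its DRC message) to its exponentially smoothed served rate.

    # Sketch of a proportional fair ("opportunistic") downlink scheduler.
    # drc[i]: rate requested by terminal i in this slot (from its DRC message)
    # avg[i]: exponentially smoothed rate actually served to terminal i
    # t_c:    smoothing time constant in slots (often quoted around 1000)
    def pf_schedule_slot(drc, avg, t_c=1000.0):
        # serve the terminal with the largest DRC / average-throughput ratio
        winner = max(range(len(drc)), key=lambda i: drc[i] / max(avg[i], 1e-9))
        # update the smoothed throughputs: only the winner is credited
        for i in range(len(drc)):
            served = drc[i] if i == winner else 0.0
            avg[i] = (1.0 - 1.0 / t_c) * avg[i] + served / t_c
        return winner

With equal averages the rule simply picks the terminal reporting the best channel; over time the averaging term pulls service back towards terminals that have been waiting, which is where the connection to Kelly's log-utility argument is usually claimed.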
> There is a series of papers by Hosein. You may find some hints in the > "initial" derivations done in the paper. I hope that I am giving the > reference to the correct one below. > The key is to assume user utility U(r) = log(r) that comes from kelly's work. > >

Hm. The question is whether this "formula mapping" really suffices to keep/apply Kelly's rationale. From what I have read so far, OS follows one objective, Kelly has an objective as well, and whether both objectives are the same is not clear to me. But I will have a careful look at these papers.

Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937

From detlef.bosau at web.de Tue Jul 3 10:05:14 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 03 Jul 2007 19:05:14 +0200 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> <46890EBA.9000009@web.de> <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> Message-ID: <468A81CA.9000800@web.de>

ksingh at irisa.fr pointed to > QoS control for WCDMA high speed packet data > Hosein, P.A. > Ericsson Wireless Commun. Inc., San Diego, CA, USA; >

I think there is a severe misconception in this paper; however, this kind of misconception is met quite frequently. Kelly's objective is the assignment of shared resources. So, when Kelly talks about "rates", these rates are shares of shared resources and in particular sum up to a total which is less than or equal to the maximum amount of resources. Simple example: think of a 150 MBit/s link (is this OC-3? I always mix up the numbers); this can be shared among three flows which are assigned, for instance, - 20 MBit/s - 80 MBit/s - 50 MBit/s ========= <= 150 MBit/s, fine. :-)

What we talk about in mobile networks are _code_ _rates_, and even this is simplified, because there might be some kind of dynamic channel adaptation by a per-time-slot choice of the symbol set. So, although one flow may be scheduled "at a rate" of 30 kbit/s and some other flow may be scheduled "at a rate" of 60 kbit/s, both flows will typically (in HSDPA-like systems) occupy _the_ _same_ _amount_ of resources, because the radio blocks have the same length in symbols; only the information words differ in length, depending on the actual coding scheme / puncturing scheme. Hence, when we use code rates, even the boundary conditions in Kelly's model cannot be met in the original manner but would have to be adapted to the different meaning of "rate". In particular, I once again emphasize that a stream scheduled "at a small rate", i.e. a small code rate, does _not_ occupy fewer resources than one scheduled "at a high rate". So, simply put: I severely doubt Hosein's rationale.

Another point which is not convincing is that in Hosein's paper the "actual vector of average rates" is made to follow an "optimal vector of rates" which is itself a moving target and may be subject to severe fluctuation. Perhaps there is a severe flaw in my way of thinking, but at the moment my doubts about the "opportunistic scheduling" algorithm as it is in use today become not smaller but bigger.
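To make the contrast drawn above explicit: in Kelly's elastic-traffic model, the allocation that maximises sum_i w_i * log(x_i) subject to sum_i x_i <= C is x_i = C * w_i / sum_j w_j, so the "rates" really are fractions of one shared capacity. A small sketch follows (illustrative only; the weights are made up to reproduce the 20/80/50 split from the example above):

    # Weighted proportionally fair shares of a single link of capacity C:
    #   maximise sum_i w_i * log(x_i)   subject to   sum_i x_i <= C
    # closed-form solution: x_i = C * w_i / sum_j w_j
    def kelly_shares(weights, capacity):
        total = float(sum(weights))
        return [capacity * w / total for w in weights]

    print(kelly_shares([2, 8, 5], 150))   # -> [20.0, 80.0, 50.0] MBit/s

A per-slot code rate in an HSDPA-like downlink does not behave like such an x_i: whatever code rate is chosen, the scheduled terminal consumes the whole slot, which is exactly the boundary-condition mismatch pointed out above.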
Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937

From query.cdac at gmail.com Thu Jul 5 07:39:02 2007 From: query.cdac at gmail.com (query ) Date: Thu, 5 Jul 2007 20:09:02 +0530 Subject: [e2e] query reg improving TCP performane Message-ID: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com>

Hi All, I was doing some bandwidth measurement tests on a 100 Mbit/s link with an RTT of about 70 ms. Based on that, I calculated the BDP as follows.

BDP = Bandwidth * RTT = 13107200 bytes/sec * .07 secs = 896 KBytes = 900 KBytes (approx) = 921600 bytes

After that I adjusted the TCP window size as follows:

/proc/sys/net/core/rmem_max 921600
/proc/sys/net/core/wmem_max 921600
/proc/sys/net/ipv4/tcp_rmem 4096 87380 921600
/proc/sys/net/ipv4/tcp_wmem 4096 87380 921600

These adjustments I made on a Linux host with a 2.6.15 kernel; the congestion control algorithm it is using is BIC. The same window adjustments I performed on the other host, which runs kernel 2.6.9. It is also using BIC for congestion control. The bandwidth performance test I am doing with iperf, a highly popular public domain tool for measuring TCP & UDP bandwidth performance.

With the default Linux 2.6 TCP window settings, I was getting a throughput of nearly 10 Mbit/s, which is very low for a 100 Mbit/s link. So I performed the above TCP adjustments and found the throughput to be around 55 Mbit/s, which is a significant improvement. But that is not fully utilising the link, as it is a dedicated link and there was no other traffic. This I proved with the next experiment, where I reached a link utilisation of a little more than 95 Mbit/s. That is very much O.K. for a 100 Mbit/s link. I did the following adjustments: I increased the above calculated BDP by nearly half of the value. The TCP settings now look like this.

/proc/sys/net/core/rmem_max 175636
/proc/sys/net/core/wmem_max 175636
/proc/sys/net/ipv4/tcp_rmem 4096 87380 175636
/proc/sys/net/ipv4/tcp_wmem 4096 87380 175636

After these settings, I find the link utilisation to be nearly 95 Mbit/s. According to many papers that I read, the TCP window should be equal to the BDP, i.e. the product of bandwidth * RTT. I had done that, but the link utilisation was only 50%. But when I increased it to a much higher value, the link utilisation was nearly 95%. I am confused by my findings. Please clarify so that I can perform the experiment correctly. With Thanks in Advance

From anil at cmmacs.ernet.in Thu Jul 5 22:35:26 2007 From: anil at cmmacs.ernet.in (V Anil Kumar) Date: Fri, 6 Jul 2007 11:05:26 +0530 (IST) Subject: [e2e] query reg improving TCP performane In-Reply-To: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> Message-ID:

Hi, What are the buffer sizes configured at the interface level on intermediate routers? Are they also set to the BDP of the link? Also, what is the buffer management scheme (RED or FIFO) on the intermediate routers? These parameters will also have an impact on the end-to-end throughput. The configuration you attached shows that you could improve the throughput from 55 mbps to 95 mbps by decreasing the buffer size from 921600 to 175636, whereas your text says that you achieved 95 mbps by increasing the buffer size.
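The arithmetic the thread keeps returning to can be written out in a few lines; the sketch below only restates the poster's own numbers (1 Mbit/s taken as 1024*1024/8 bytes/s, matching the 13107200 figure) and the buffer rule of thumb that emerges later in the thread, so treat it as an illustration rather than a recommendation.

    # Bandwidth-delay product for the link under test
    rate_bytes_per_s = 100 * 1024 * 1024 // 8   # 100 Mbit/s -> 13107200 bytes/s
    rtt_s = 0.070                               # 70 ms
    bdp = round(rate_bytes_per_s * rtt_s)       # 917504 bytes, ~896 KBytes
    print("BDP:", bdp)
    # rmem_max/wmem_max and the third field of tcp_rmem/tcp_wmem are upper
    # limits on the socket buffers; the rule of thumb discussed later in
    # this thread is to allow roughly 2 * BDP there for the sender.
    print("suggested sender buffer limit:", 2 * bdp)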
Any idea why a dedicated 100 mbps link gives an RTT of 70 msec. under no load condition? Anil On Thu, 5 Jul 2007, query wrote: > Hi All, > > I was doing some Bandwidth measurement test on a 100 mbs link with a RTT > of about 70ms. > Based on that, I calculated the BDP as follows. > > BDP = Bandwidth * RTT > = 13107200 bytes * .07 secs > = 896 Kbytes > = 900 Kbyes (approx) > = 921600 bytes > > > After that I adjusted the TCP window size as follows: > > /proc/sys/net/core/rmem_max 921600 > /proc/sys/net/core/wmem_max 921600 > /proc/sys/net/ipv4/tcp_rmem 4096 87380 921600 > /proc/sys/net/ipv4/tcp_wmem 4096 87380 921600 > > These adjustmenst I had done on a Linux host with 2.6.15 kernel. The > congestion > control algorithm , it is using BIC > > The same window adjustments I had performed on the other hand on a Linux > host > with kernel 2.6.9 . It is also using BIC for congestion control. > > The Bandwidth Performnce test I am doing using iperf , a highly popular > public > domain tool for measuring TCP & UDP Bandwidth Performance . > > With the default Linux 2.6 TCP window settings , I was getting a > throughput of > nearly 10mbs which is very low for a 100 mbs link. > > So , I performed the above TCP adjustmets and I found the throughput to > be around > 55 mbs which is a significant improvement . But that is not fully > utilsing the link > as it is a dedicated link and there was no other traffic. > This , I proved with the next experiment where I reached a link > utilisation of > little more than 95 mbs . That is very much O.K for a 100mbs link. > > I did the following adjustments. I increased the above calculated BDP by > nearly > half of the value . The TCP settings now look like this. > > /proc/sys/net/core/rmem_max 175636 > /proc/sys/net/core/wmem_max 175636 > /proc/sys/net/ipv4/tcp_rmem 4096 87380 175636 > /proc/sys/net/ipv4/tcp_wmem 4096 87380 175636 > > After these settings , I find the link utilisation to be nearly 95 mbs. > > According to many papers that I read , I found that the BDP should be > equal > to the product of Bandwidth * RTT . > I had done that , but the link utilisation is only 50%. But when I > increased > to a much higher value , the link utilisation is nearly around 95 %. > > I am confused regarding my findings.Please clarify me so that I can > perform > the experiment correctly. > > With Thanks in Advance > From query.cdac at gmail.com Thu Jul 5 23:45:34 2007 From: query.cdac at gmail.com (query ) Date: Fri, 6 Jul 2007 12:15:34 +0530 Subject: [e2e] query reg improving TCP performane In-Reply-To: References: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> Message-ID: <35cb505a0707052345j15f7386dg87bd4b062c9c4d25@mail.gmail.com> On 7/6/07, V Anil Kumar wrote: > > > Hi, > > What are the buffer sizes configured at the interface level on > intermediate routers? Are they also set to the BDP of the link? Also, what > is the buffer management scheme (RED or FIFO) on the intermediate > routers? > These parameters will also have an impact on the end-to-end throughput. The network is not under my administration . So , I will probably ask my metwork provider to give me the result. The configuration you attached shows that you could improve the throughput > from 55 mbps to 95 mbps by decreasing the buffer size from 921600 to > 175636. While your text says that you achieved 95 mbps by increasing the > buffer size. Sorry, that was a typo . It should have been 1756366. (nearly 1.7 MB ). Any idea why a dedicated 100 mbps link gives an RTT of 70 msec. 
under no > load condition? > > The two end points are very far away. It is more than 3000 kms . So, > probably that might me the reason. The RTT is obeying the SLA parameters that was defined for this network. With Thanks and Regards zaman Anil > > On Thu, 5 Jul 2007, query wrote: > > > Hi All, > > > > I was doing some Bandwidth measurement test on a 100 mbs link with a > RTT > > of about 70ms. > > Based on that, I calculated the BDP as follows. > > > > BDP = Bandwidth * RTT > > = 13107200 bytes * .07 secs > > = 896 Kbytes > > = 900 Kbyes (approx) > > = 921600 bytes > > > > > > After that I adjusted the TCP window size as follows: > > > > /proc/sys/net/core/rmem_max 921600 > > /proc/sys/net/core/wmem_max 921600 > > /proc/sys/net/ipv4/tcp_rmem 4096 87380 921600 > > /proc/sys/net/ipv4/tcp_wmem 4096 87380 921600 > > > > These adjustmenst I had done on a Linux host with 2.6.15 kernel. The > > congestion > > control algorithm , it is using BIC > > > > The same window adjustments I had performed on the other hand on a > Linux > > host > > with kernel 2.6.9 . It is also using BIC for congestion control. > > > > The Bandwidth Performnce test I am doing using iperf , a highly > popular > > public > > domain tool for measuring TCP & UDP Bandwidth Performance . > > > > With the default Linux 2.6 TCP window settings , I was getting a > > throughput of > > nearly 10mbs which is very low for a 100 mbs link. > > > > So , I performed the above TCP adjustmets and I found the throughput > to > > be around > > 55 mbs which is a significant improvement . But that is not fully > > utilsing the link > > as it is a dedicated link and there was no other traffic. > > This , I proved with the next experiment where I reached a link > > utilisation of > > little more than 95 mbs . That is very much O.K for a 100mbs link. > > > > I did the following adjustments. I increased the above calculated BDP > by > > nearly > > half of the value . The TCP settings now look like this. > > > > /proc/sys/net/core/rmem_max 175636 > > /proc/sys/net/core/wmem_max 175636 > > /proc/sys/net/ipv4/tcp_rmem 4096 87380 175636 > > /proc/sys/net/ipv4/tcp_wmem 4096 87380 175636 > > > > After these settings , I find the link utilisation to be nearly 95 > mbs. > > > > According to many papers that I read , I found that the BDP should > be > > equal > > to the product of Bandwidth * RTT . > > I had done that , but the link utilisation is only 50%. But when I > > increased > > to a much higher value , the link utilisation is nearly around 95 %. > > > > I am confused regarding my findings.Please clarify me so that I can > > perform > > the experiment correctly. > > > > With Thanks in Advance > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20070706/e1063fb6/attachment.html From detlef.bosau at web.de Fri Jul 6 08:46:26 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Fri, 06 Jul 2007 17:46:26 +0200 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <468A81CA.9000800@web.de> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> <46890EBA.9000009@web.de> <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> <468A81CA.9000800@web.de> Message-ID: <468E63D2.5000403@web.de> I wonder, why there is absolutely no comment on my post. I expected at least some criticism or contradiction. Or does anybody agree? 
Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937 From lachlan.andrew at gmail.com Thu Jul 5 08:59:09 2007 From: lachlan.andrew at gmail.com (Lachlan Andrew) Date: Thu, 5 Jul 2007 08:59:09 -0700 Subject: [e2e] query reg improving TCP performane In-Reply-To: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> References: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> Message-ID: Greetings, On 05/07/07, query wrote: > > I was doing some Bandwidth measurement test on a 100 mbs link with a RTT > of about 70ms. > Based on that, I calculated the BDP as follows. > > BDP = Bandwidth * RTT > = 921600 bytes > I did the following adjustments. I increased the above calculated BDP by > nearly > half of the value . The TCP settings now look like this. > > /proc/sys/net/core/rmem_max 175636 > /proc/sys/net/core/wmem_max 175636 > /proc/sys/net/ipv4/tcp_rmem 4096 87380 > 175636 > /proc/sys/net/ipv4/tcp_wmem 4096 87380 > 175636 > > After these settings , I find the link utilisation to be nearly 95 mbs. > > According to many papers that I read , I found that the BDP should be > equal > to the product of Bandwidth * RTT . The papers probably said that *router* buffers need to equal the bandwidth*RTT. You are adjusting the sender/receiver buffers. These need to be significantly larger, as you have found. In order to allow retransmissions, the sender buffer needs to be able to store all "packets in flight", which include both those in the in the router buffers and those "on the wire" (that is, in the nominal RTT of the link). In order to be able to provide in-order delivery, the receiver buffer needs to be able to hold even more packets. If a packet is lost, it will receive an entire RTT (plus router buffer) worth of data before the first retransmission of that packet will arrive. If the first retransmission is also lost, then it will need to store yet another RTT worth of data. The general rule-of-thumb for Reno is that the send buffer should be at least twice the bandwidth*RTT. For BIC is is probably reduced to about 120% of the BDP (because it reduces its window by a smaller factor when there is a loss). The receive buffer should still be at least equal to the BDP plus the router buffer. I hope this help, Lachlan -- Lachlan Andrew Dept of Computer Science, Caltech 1200 E California Blvd, Mail Code 256-80, Pasadena CA 91125, USA Phone: +1 (626) 395-8820 Fax: +1 (626) 568-3603 From query.cdac at gmail.com Thu Jul 5 23:32:54 2007 From: query.cdac at gmail.com (query ) Date: Fri, 6 Jul 2007 12:02:54 +0530 Subject: [e2e] query reg improving TCP performane In-Reply-To: References: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> Message-ID: <35cb505a0707052332i55809354xfeec0c700d2b9419@mail.gmail.com> Thanks a lot Andrew . It helped me to understand and I feel that my Tunings are O.K. > I was doing some Bandwidth measurement test on a 100 mbs link with a RTT > > of about 70ms. > > Based on that, I calculated the BDP as follows. > > > > BDP = Bandwidth * RTT > > = 921600 bytes > > I did the following adjustments. I increased the above calculated BDP > by > > nearly > > half of the value . The TCP settings now look like this. 
> > > > /proc/sys/net/core/rmem_max 175636 > > /proc/sys/net/core/wmem_max 175636 > > /proc/sys/net/ipv4/tcp_rmem 4096 87380 > > 175636 > > /proc/sys/net/ipv4/tcp_wmem 4096 87380 > > 175636 > > > > After these settings , I find the link utilisation to be nearly 95 > mbs. > > > > According to many papers that I read , I found that the BDP should > be > > equal > > to the product of Bandwidth * RTT . > > The papers probably said that *router* buffers need to equal the > bandwidth*RTT. You are adjusting the sender/receiver buffers. These > need to be significantly larger, as you have found. The papers or rather articles are talking of sender and receiver buffers . Here is one such link where I find it. http://www.psc.edu/networking/projects/tcptune/ . In order to allow retransmissions, the sender buffer needs to be able > to store all "packets in flight", which include both those in the in > the router buffers and those "on the wire" (that is, in the nominal > RTT of the link). > > In order to be able to provide in-order delivery, the receiver buffer > needs to be able to hold even more packets. If a packet is lost, it > will receive an entire RTT (plus router buffer) worth of data before > the first retransmission of that packet will arrive. If the first > retransmission is also lost, then it will need to store yet another > RTT worth of data. > > The general rule-of-thumb for Reno is that the send buffer should be > at least twice the bandwidth*RTT. For BIC is is probably reduced to > about 120% of the BDP (because it reduces its window by a smaller > factor when there is a loss). The receive buffer should still be at > least equal to the BDP plus the router buffer. What I understand from your reply, is that It is not necessary that TCP Window should be equal to BDP in all cases . Had the router buffer size is equal to BDP , then I think I should equal link utilisation equal to the capacity of the link . Since , In Internet it will not be possible to know the router buffer size , so the best thing one can do , is to make the TCP window size twice to BDP as you have suggested. I am finding another problem. The UDP transmission rate on that link is decreased. I changed to the default settings , but it is showing the exact readings after tuning. It seems it is reading some fixed value from something and based on that it is transferring data . The readings are like this.......... iperf -u -c 192.168.60.62 -t 300 -l 1460 -i 2 ------------------------------------------------------------ Client connecting to 192.168.60.62, UDP port 5001 Sending 1460 byte datagrams UDP buffer size: 108 KByte (default) ------------------------------------------------------------ [ 3] local 10.128.0.2 port 32785 connected with 192.168.60.62 port 5001 [ ID] Interval Transfer Bandwidth [ 3] -0.0- 2.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 2.0- 4.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 4.0- 6.0 sec 255 KBytes 1.05 Mbits/sec [ 3] 6.0- 8.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 8.0-10.0 sec 255 KBytes 1.05 Mbits/sec [ 3] 10.0-12.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 12.0-14.0 sec 255 KBytes 1.05 Mbits/sec [ 3] 14.0-16.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 16.0-18.0 sec 257 KBytes 1.05 Mbits/sec The result is for the following tuning. net.core.rmem_default = 110592 net.core.wmem_default = 110592 After that I changed the tuning to net.core.rmem_default = 196608 net.core.wmem_default = 196608 The readings for the tuning is like this... 
iperf -u -c 192.168.60.62 -t 300 -l 1460 -i 2 ------------------------------------------------------------ Client connecting to 192.168.60.62, UDP port 5001 Sending 1460 byte datagrams UDP buffer size: 192 KByte (default) ------------------------------------------------------------ [ 3] local 10.128.0.2 port 32785 connected with 192.168.60.62 port 5001 [ ID] Interval Transfer Bandwidth [ 3] -0.0- 2.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 2.0- 4.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 4.0- 6.0 sec 255 KBytes 1.05 Mbits/sec [ 3] 6.0- 8.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 8.0-10.0 sec 255 KBytes 1.05 Mbits/sec [ 3] 10.0-12.0 sec 257 KBytes 1.05 Mbits/sec [ 3] 12.0-14.0 sec 255 KBytes 1.05 Mbits/sec Kindly please help me to rectify the problem. It is the same link on which I performed the TCp test. regards zaman > > I hope this help, > Lachlan > > -- > Lachlan Andrew Dept of Computer Science, Caltech > 1200 E California Blvd, Mail Code 256-80, Pasadena CA 91125, USA > Phone: +1 (626) 395-8820 Fax: +1 (626) 568-3603 > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20070706/38aad1ea/attachment.html From lachlan.andrew at gmail.com Fri Jul 6 09:24:58 2007 From: lachlan.andrew at gmail.com (Lachlan Andrew) Date: Fri, 6 Jul 2007 09:24:58 -0700 Subject: [e2e] query reg improving TCP performane In-Reply-To: <35cb505a0707052332i55809354xfeec0c700d2b9419@mail.gmail.com> References: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> <35cb505a0707052332i55809354xfeec0c700d2b9419@mail.gmail.com> Message-ID: Greetings Zaman, On 05/07/07, query wrote: > I am finding another problem. The UDP transmission rate on that link is > decreased. > > iperf -u -c 192.168.60.62 -t 300 -l 1460 -i 2 > ------------------------------------------------------------ > Client connecting to 192.168.60.62, UDP port 5001 > Sending 1460 byte datagrams > UDP buffer size: 108 KByte (default) > ------------------------------------------------------------ > [ 3] local 10.128.0.2 port 32785 connected with 192.168.60.62 port 5001 > [ ID] Interval Transfer Bandwidth > [ 3] -0.0- 2.0 sec 257 KBytes 1.05 Mbits/sec To send UDP at a rate higher than 1Mbps, you need to specify -b to iperf. Unlike TCP, UDP doesn't "probe" for bandwidth, so you have to specify the rate you want to send at. From iperf -h: Usage: iperf [-s|-c host] [options] iperf [-h|--help] [-v|--version] Client specific: -b, --bandwidth #[KM] for UDP, bandwidth to send at in bits/sec (default 1 Mbit/sec, implies -u) Cheers, Lachlan -- Lachlan Andrew Dept of Computer Science, Caltech 1200 E California Blvd, Mail Code 256-80, Pasadena CA 91125, USA Phone: +1 (626) 395-8820 Fax: +1 (626) 568-3603 From query.cdac at gmail.com Sat Jul 7 02:18:38 2007 From: query.cdac at gmail.com (query ) Date: Sat, 7 Jul 2007 14:48:38 +0530 Subject: [e2e] query reg improving TCP performane In-Reply-To: References: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> <35cb505a0707052332i55809354xfeec0c700d2b9419@mail.gmail.com> Message-ID: <35cb505a0707070218x6e154690wd0b3396d1a8167c0@mail.gmail.com> Hi Lachlan and All , Once gain thanks a lot. To send UDP at a rate higher than 1Mbps, you need to specify -b to > iperf. Unlike TCP, UDP doesn't "probe" for bandwidth, so you have to > specify the rate you want to send at. 
From iperf -h: > > Usage: iperf [-s|-c host] [options] > iperf [-h|--help] [-v|--version] > Client specific: > -b, --bandwidth #[KM] for UDP, bandwidth to send at in bits/sec > (default 1 Mbit/sec, implies -u) > > I tried with the -b option and once again the results are > impressive. The following are the result. $ iperf -u -c 10.128.0.2 -b 100m -t 300 -i 2 ------------------------------------------------------------ Client connecting to 10.128.0.2, UDP port 5001 Sending 1470 byte datagrams UDP buffer size: 105 KByte (default) ------------------------------------------------------------ [ 3] local 192.168.60.62 port 32772 connected with 10.128.0.2 port 5001 [ 3] 0.0- 2.0 sec 22.9 MBytes 96.0 Mbits/sec [ 3] 2.0- 4.0 sec 22.8 MBytes 95.6 Mbits/sec [ 3] 4.0- 6.0 sec 22.8 MBytes 95.8 Mbits/sec [ 3] 6.0- 8.0 sec 22.8 MBytes 95.8 Mbits/sec [ 3] 8.0-10.0 sec 22.8 MBytes 95.6 Mbits/sec [ 3] 10.0-12.0 sec 22.8 MBytes 95.8 Mbits/sec So , I am getting a link utilisation of 95 mbs . Once again thanks to all who guided me in understanding and completing the experiment satisfactorily. Now , I will try to get the router level information and I will get back to the mailing list. With Thanks and Regards zaman -- > Lachlan Andrew Dept of Computer Science, Caltech > 1200 E California Blvd, Mail Code 256-80, Pasadena CA 91125, USA > Phone: +1 (626) 395-8820 Fax: +1 (626) 568-3603 > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20070707/206e0ce4/attachment-0001.html From caitlinb at broadcom.com Mon Jul 9 12:19:51 2007 From: caitlinb at broadcom.com (Caitlin Bestler) Date: Mon, 9 Jul 2007 12:19:51 -0700 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <468E63D2.5000403@web.de> Message-ID: <1EF1E44200D82B47BD5BA61171E8CE9D0475CA50@NT-IRVA-0750.brcm.ad.broadcom.com> end2end-interest-bounces at postel.org wrote: > I wonder, why there is absolutely no comment on my post. I > expected at least some criticism or contradiction. Or does anybody > agree? > > Detlef My hunch is that this type of problem needs to be generalized so that variable local conditions can be fed back to end-to-end congestion/flow control in a way that is both effective and does not require the end-to-end logic to understand exactly what the local issue is. From query.cdac at gmail.com Tue Jul 10 05:04:44 2007 From: query.cdac at gmail.com (query ) Date: Tue, 10 Jul 2007 17:34:44 +0530 Subject: [e2e] query reg improving TCP performane In-Reply-To: <35cb505a0707052332i55809354xfeec0c700d2b9419@mail.gmail.com> References: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> <35cb505a0707052332i55809354xfeec0c700d2b9419@mail.gmail.com> Message-ID: <35cb505a0707100504l292bcc07te7016609cfe7a61e@mail.gmail.com> Hi All, I was able to identify the size of the buffer in the Router's interface . It was found to be 2040 bytes and the buffer management scheme used is FIFO . Then based on this statement by Lachlan , I did the following tunings. "" The general rule-of-thumb for Reno is that the send buffer should be at least twice the bandwidth*RTT. For BIC is is probably reduced to about 120% of the BDP (because it reduces its window by a smaller factor when there is a loss). The receive buffer should still be at least equal to the BDP plus the router buffer. "" The tunings are as follows. 
Send buffer = BDP + 120 % of BDP = 921600 + 184320 = 1105920 bytes

Receive buffer = BDP + Router's buffer size = 921600 + 2040 = 923640 bytes

After that I tried the same experiments as earlier with iperf. I got an average throughput of 66 Mbit/s. Next, I tried some more tuning: I increased the send buffer size to twice the BDP; the receive buffer is the same as earlier. This is according to what Lachlan suggested for TCP Reno. Based on these tunings, I got an average throughput of 78 Mbit/s using iperf. More improvement, but still not fully utilising the link capacity.

But if I tune the window size to twice the BDP, I get an average throughput of around 88 Mbit/s, which I feel is very much O.K. for a 100 Mbit/s link, though it can be further improved. Also, earlier I wrote on this mailing list that when I tuned the window size to twice the BDP I was getting a throughput of 95 Mbit/s; there I was referring to the maximum throughput. For all these experiments I was using BIC. I also tried the following tunings as suggested, but I did not see any difference.

/sbin/ifconfig eth0 txqueuelen 1000
/sbin/sysctl -w net.ipv4.tcp_adv_win_scale=3

So, based on the results of all these experiments, I reach the following conclusion. The receive buffer should be at least equal to the BDP plus the router buffer. The send buffer should be 20% more than the BDP if you are using BIC. These tunings will probably utilise the maximum link capacity provided the buffer size in the intermediate router is equal to the BDP. This I cannot prove practically, because it is not possible for me to increase the buffer size to the size of the BDP, as the network is not under my administration. But since in most cases the network is under the control of an ISP, it might not be possible for end users to know the buffer size of the router's interface. In that case, the end-to-end send and receive buffer sizes should be at least equal to twice the BDP to obtain maximum throughput. This is reflected in my findings.

It will be helpful to me if everybody gives their views on my conclusion. If you have any contradiction, please write. With Thanks and Regards zaman

On 7/6/07, query wrote: > > > Thanks a lot Andrew . It helped me to understand and I feel that my > Tunings are O.K. > > > I was doing some Bandwidth measurement test on a 100 mbs link with a > > RTT > > > of about 70ms. > > > Based on that, I calculated the BDP as follows. > > > > > > BDP = Bandwidth * RTT > > > = 921600 bytes > > > I did the following adjustments. I increased the above calculated > > BDP by > > > nearly > > > half of the value . The TCP settings now look like this. > > > > > > /proc/sys/net/core/rmem_max 175636 > > > /proc/sys/net/core/wmem_max 175636 > > > /proc/sys/net/ipv4/tcp_rmem 4096 87380 > > > 175636 > > > /proc/sys/net/ipv4/tcp_wmem 4096 87380 > > > 175636 > > > > > > After these settings , I find the link utilisation to be nearly 95 > > mbs. > > > > > > According to many papers that I read , I found that the BDP should > > be > > > equal > > > to the product of Bandwidth * RTT . > > > > The papers probably said that *router* buffers need to equal the > > bandwidth*RTT. You are adjusting the sender/receiver buffers. These > > need to be significantly larger, as you have found. > > > The papers or rather articles are talking of sender > > and receiver > > buffers . Here is one such link where I find it. > > http://www.psc.edu/networking/projects/tcptune/ .
> > > > In order to allow retransmissions, the sender buffer needs to be able > > to store all "packets in flight", which include both those in the in > > the router buffers and those "on the wire" (that is, in the nominal > > RTT of the link). > > > > In order to be able to provide in-order delivery, the receiver buffer > > needs to be able to hold even more packets. If a packet is lost, it > > will receive an entire RTT (plus router buffer) worth of data before > > the first retransmission of that packet will arrive. If the first > > retransmission is also lost, then it will need to store yet another > > RTT worth of data. > > > > The general rule-of-thumb for Reno is that the send buffer should be > > at least twice the bandwidth*RTT. For BIC is is probably reduced to > > about 120% of the BDP (because it reduces its window by a smaller > > factor when there is a loss). The receive buffer should still be at > > least equal to the BDP plus the router buffer. > > > What I understand from your reply, is that It is not necessary that > TCP Window should be > equal to BDP in all cases . Had the router buffer size is equal to BDP > , then I think I > should equal link utilisation equal to the capacity of the link . > Since , In Internet it will not be possible to know the router buffer > size , so the best thing one > can do , is to make the TCP window size twice to BDP as you have > suggested. > > I am finding another problem. The UDP transmission rate on that link is > decreased. I changed > to the default settings , but it is showing the exact readings after > tuning. It seems it is reading some fixed value from something and > based on that it is transferring data . > > The readings are like this.......... > > iperf -u -c 192.168.60.62 -t 300 -l 1460 -i 2 > ------------------------------------------------------------ > Client connecting to 192.168.60.62, UDP port 5001 > Sending 1460 byte datagrams > UDP buffer size: 108 KByte (default) > ------------------------------------------------------------ > [ 3] local 10.128.0.2 port 32785 connected with 192.168.60.62 port 5001 > [ ID] Interval Transfer Bandwidth > [ 3] -0.0- 2.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 2.0- 4.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 4.0- 6.0 sec 255 KBytes 1.05 Mbits/sec > [ 3] 6.0- 8.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 8.0-10.0 sec 255 KBytes 1.05 Mbits/sec > [ 3] 10.0-12.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 12.0-14.0 sec 255 KBytes 1.05 Mbits/sec > [ 3] 14.0-16.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 16.0-18.0 sec 257 KBytes 1.05 Mbits/sec > > The result is for the following tuning. > net.core.rmem_default = 110592 > net.core.wmem_default = 110592 > > After that I changed the tuning to > net.core.rmem_default = 196608 > net.core.wmem_default = 196608 > > The readings for the tuning is like this... 
> iperf -u -c 192.168.60.62 -t 300 -l 1460 -i 2 > ------------------------------------------------------------ > Client connecting to 192.168.60.62, UDP port 5001 > Sending 1460 byte datagrams > UDP buffer size: 192 KByte (default) > ------------------------------------------------------------ > [ 3] local 10.128.0.2 port 32785 connected with 192.168.60.62 port 5001 > [ ID] Interval Transfer Bandwidth > [ 3] -0.0- 2.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 2.0- 4.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 4.0- 6.0 sec 255 KBytes 1.05 Mbits/sec > [ 3] 6.0- 8.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 8.0-10.0 sec 255 KBytes 1.05 Mbits/sec > [ 3] 10.0-12.0 sec 257 KBytes 1.05 Mbits/sec > [ 3] 12.0-14.0 sec 255 KBytes 1.05 Mbits/sec > > Kindly please help me to rectify the problem. It is the same link on > which I > performed the TCp test. > > regards > zaman > > > > > > > > > > > > I hope this help, > > Lachlan > > > > -- > > Lachlan Andrew Dept of Computer Science, Caltech > > 1200 E California Blvd, Mail Code 256-80, Pasadena CA 91125, USA > > Phone: +1 (626) 395-8820 Fax: +1 (626) 568-3603 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20070710/20e8835c/attachment-0001.html From detlef.bosau at web.de Tue Jul 10 06:27:26 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 10 Jul 2007 15:27:26 +0200 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <1EF1E44200D82B47BD5BA61171E8CE9D0475CA50@NT-IRVA-0750.brcm.ad.broadcom.com> References: <1EF1E44200D82B47BD5BA61171E8CE9D0475CA50@NT-IRVA-0750.brcm.ad.broadcom.com> Message-ID: <4693893E.3020305@web.de> Caitlin Bestler wrote: > end2end-interest-bounces at postel.org wrote: > >> I wonder, why there is absolutely no comment on my post. I >> expected at least some criticism or contradiction. Or does anybody >> agree? >> >> Detlef >> > > My hunch is that this type of problem needs to be generalized > so that variable local conditions can be fed back to end-to-end > congestion/flow control in a way that is both effective and does > not require the end-to-end logic to understand exactly what the > local issue is. > > I?m not quite sure about this. Please keep in mind that wireless network conditions may change several times within the transport process of one single IP packet. From that perspective, it is simply not possible for an IP sender to "follow" the wireless channel dynamics because an IP sender can only decide when to send an IP packet or whether to send it at all. This is perhaps far to coarse for wireless networks. On the other hand, the question is whether TCP/IP needs to follow wireless channel conditions at all. Although this is frequently claimed, I wonder why Ethernet works. Although TCP/IP simply doesn?t care for Ethernet dynamics ;-) I?m not yet convinced, that wireless channel dynamics really affect flow control and congestion control. As I?m not convinced on the often claimed dreadful spurious timeouts. Regarding spurious timeouts, I frequently refer to Hasenleitner et al. Either you make careful measurements, then you will not find spurious timeouts (or sp. t. are not significant), or you find a significant number of spurious timeouts, then..... (left to the reader). Back to the subject: I first want to _understand_ the effects of opportunistic scheduling, and therefore I first want to _understand_, how OS works and how the actually used algorithms are justified. 
And then let's see whether this results in any ramifications for the upper layers. If so, and if there are problems, we can look at how to solve them. However, if there are no problems, we must not invent solutions looking for a problem. And I keep well in mind what IIRC Joe Touch wrote some months ago: TCP is not supposed to work perfectly under all circumstances. So, if a wireless channel is bad, the TCP connection may be bad. Period. You cannot make a silk purse from a sow's ear.

BTW: does someone happen to know where I can find mappings from the actual SNR / C/I ratio onto the resulting BLER, given a known coding / modulation / puncturing scheme? And is there anything available about the policies by which the MCS/PS is chosen with respect to the actual SNR? There must be quite some literature available on this topic; however, I frequently fail to find it :-}

Regards Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937

From caitlinb at broadcom.com Tue Jul 10 09:26:58 2007 From: caitlinb at broadcom.com (Caitlin Bestler) Date: Tue, 10 Jul 2007 09:26:58 -0700 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <4693893E.3020305@web.de> Message-ID: <1EF1E44200D82B47BD5BA61171E8CE9D0475CD7F@NT-IRVA-0750.brcm.ad.broadcom.com>

detlef.bosau at web.de wrote: > Caitlin Bestler wrote: >> end2end-interest-bounces at postel.org wrote: >> >>> I wonder, why there is absolutely no comment on my post. I expected >>> at least some criticism or contradiction. Or does anybody agree? >>> >>> Detlef >>> >> >> My hunch is that this type of problem needs to be generalized so that >> variable local conditions can be fed back to end-to-end >> congestion/flow control in a way that is both effective and does not >> require the end-to-end logic to understand exactly what the local >> issue is. >> >> > > I'm not quite sure about this. Please keep in mind that > wireless network conditions may change several times within > the transport process of one single IP packet. From that > perspective, it is simply not possible for an IP sender to > "follow" the wireless channel dynamics because an IP sender > can only decide when to send an IP packet or whether to send > it at all. This is perhaps far too coarse for wireless networks. >

I'd phrase that as knowing the earliest time when sending an IP packet would not recklessly congest the network. And having an understanding of varying L2 delivery capacity, even without understanding the cause, would be valuable for that, at least when the variation is on the first segment. Ultimately, heroic efforts to deliver packets at L2 that stay confined to L2 have the potential to be counter-productive. If you can deliver X frames in time Y, then L4 will presume that the network is capable of doing that and adjust accordingly. Heroic efforts that succeed without an asterisk lead to heroic expectations; eventually packets get dropped anyway -- possibly more of them. Without some sort of feedback, L4 simply has no concept of an opportunistically delivered packet. Congestion notification probably plugs a major portion of that hole, but there is probably more required.
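Since spurious timeouts came up a few messages back, it may help to have the standard TCP retransmission-timeout estimator in front of you: it shows how much delay jitter a connection absorbs before a timeout can fire early. The sketch below follows Jacobson's algorithm as specified in RFC 2988 (alpha = 1/8, beta = 1/4, 1 second minimum RTO); the granularity constant and any sample values are illustrative, not measurements from this thread.

    # RTO computation per RFC 2988: RTO = SRTT + max(G, 4 * RTTVAR), floored at 1 s.
    ALPHA, BETA = 1.0 / 8, 1.0 / 4
    MIN_RTO, G = 1.0, 0.1        # seconds; G is the retransmission timer granularity

    def update_rto(srtt, rttvar, r):
        """Feed one RTT sample r (seconds); return the new (srtt, rttvar, rto)."""
        if srtt is None:                     # first measurement
            srtt, rttvar = r, r / 2.0
        else:
            rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - r)
            srtt = (1 - ALPHA) * srtt + ALPHA * r
        return srtt, rttvar, max(MIN_RTO, srtt + max(G, 4 * rttvar))

A retransmission timeout is spurious only when a segment's real round-trip time exceeds that RTO, so moderate link-layer scheduling jitter is normally absorbed by the 4 * RTTVAR term and the 1 second floor; sustained multi-second stalls are a different matter.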
From anil at cmmacs.ernet.in Tue Jul 10 09:42:33 2007 From: anil at cmmacs.ernet.in (V Anil Kumar) Date: Tue, 10 Jul 2007 22:12:33 +0530 (IST) Subject: [e2e] query reg improving TCP performane In-Reply-To: <35cb505a0707100504l292bcc07te7016609cfe7a61e@mail.gmail.com> Message-ID: Hi, 2040 bytes seem to be really a very low buffer size for any router with FIFO buffer management scheme. With this buffer size, the router would not be able to keep more than 2 packets of size 1500 bytes in its buffer. Alternate possibility is 2040 packets. Regards, Anil On Tue, 10 Jul 2007, query wrote: > Hi All, > > I was able to identify the size of the buffer in the Router's interface > . It was found to be 2040 > bytes and the buffer management scheme used is FIFO . > Then based on this statement by Lachlan , I did the following tunings. > > "" > The general rule-of-thumb for Reno is that the send buffer should be > at least twice the bandwidth*RTT. For BIC is is probably reduced to > about 120% of the BDP (because it reduces its window by a smaller > factor when there is a loss). The receive buffer should still be at > least equal to the BDP plus the router buffer. > "" > The tunings are as follows. > > Send buffer = BDP + 120 % of BDP > = 921600 + 184320 > = 1105920 bytes > > Receive buffer = BDP + Router's buffer size > = 921600 + 2040 > = 923640 bytes > > After that I tried the same earlier experiments with iperf . I got a > average throughput of 66 > Mbits/s . > > Next , I tried some more tuning . I increased the the send buffer > size to twice of BDP. > The receive buffer is same as earlier. This is according to what > Lachlan suggested for TCP > Reno. > Based on these tunings , I got a throughput of average 78 Mbits/s > using iperf. More improvement > but not fully utilise the link capacity. > > But if I tune the window size to twice the size of BDP , I got a > average throughput of > around 88 Mbits/sec which I feel very much O.K for a 100 MBits/sec link > . But it can be further > improved. Also , earlier, I have written in this mailing list that when > I tuned the window size to twice > of BDP , I was getting a throughput of 95 Mbits /s . That I was > referring to the maximum > throughput . > For all these experiments, I was using BIC. > > I also tried with the following tunings as suggested . But I didnot get > any difference. > > /sbin/ifconfig eth0 txqueuelen 1000 > /sbin/sysctl -w net.ipv4.tcp_adv_win_scale=3 > > > So , based on the result of all these experiments , I reach the > following conclusion. > > The receive buffer should be at least equal to the BDP plus the router > buffer . > The send buffer should be 20% more than BDP if you are using BIC. > These tunings will probably try to utilise the maximum link capacity > provided > the buffer size in the intermediate router is equal to the BDP. > This I cannot prove practically because it is not possible fot me to > increase the buffer > size to the size of BDP , because the network is not under my > administration. > > But since in most cases the network is under the control of ISP , so it > might not > be possible for end users to know the size of the router's interface. > In that case , the end to end send and receive buffer size should be > atleast > equal to twice the size of BDP to obtain maximum throughput. This > statement > was reflected by my findings. > > It will be helpfull to me if everybody give there views on my > conclusion. If you > have any contradiction , please write. 
> > With Thanks and Regards > zaman > > > > > > > > > > > > > > > > > > > > On 7/6/07, query wrote: > > > > > > Thanks a lot Andrew . It helped me to understand and I feel that my > > Tunings are O.K. > > > > > I was doing some Bandwidth measurement test on a 100 mbs link with a > > > RTT > > > > of about 70ms. > > > > Based on that, I calculated the BDP as follows. > > > > > > > > BDP = Bandwidth * RTT > > > > = 921600 bytes > > > > I did the following adjustments. I increased the above calculated > > > BDP by > > > > nearly > > > > half of the value . The TCP settings now look like this. > > > > > > > > /proc/sys/net/core/rmem_max 175636 > > > > /proc/sys/net/core/wmem_max 175636 > > > > /proc/sys/net/ipv4/tcp_rmem 4096 87380 > > > > 175636 > > > > /proc/sys/net/ipv4/tcp_wmem 4096 87380 > > > > 175636 > > > > > > > > After these settings , I find the link utilisation to be nearly 95 > > > mbs. > > > > > > > > According to many papers that I read , I found that the BDP should > > > be > > > > equal > > > > to the product of Bandwidth * RTT . > > > > > > The papers probably said that *router* buffers need to equal the > > > bandwidth*RTT. You are adjusting the sender/receiver buffers. These > > > need to be significantly larger, as you have found. > > > > > > The papers or rather articles are talking of sender > > and receiver > > buffers . Here is one such link where I find it. > > http://www.psc.edu/networking/projects/tcptune/ . > > > > > > > > In order to allow retransmissions, the sender buffer needs to be able > > > to store all "packets in flight", which include both those in the in > > > the router buffers and those "on the wire" (that is, in the nominal > > > RTT of the link). > > > > > > In order to be able to provide in-order delivery, the receiver buffer > > > needs to be able to hold even more packets. If a packet is lost, it > > > will receive an entire RTT (plus router buffer) worth of data before > > > the first retransmission of that packet will arrive. If the first > > > retransmission is also lost, then it will need to store yet another > > > RTT worth of data. > > > > > > The general rule-of-thumb for Reno is that the send buffer should be > > > at least twice the bandwidth*RTT. For BIC is is probably reduced to > > > about 120% of the BDP (because it reduces its window by a smaller > > > factor when there is a loss). The receive buffer should still be at > > > least equal to the BDP plus the router buffer. > > > > > > What I understand from your reply, is that It is not necessary that > > TCP Window should be > > equal to BDP in all cases . Had the router buffer size is equal to BDP > > , then I think I > > should equal link utilisation equal to the capacity of the link . > > Since , In Internet it will not be possible to know the router buffer > > size , so the best thing one > > can do , is to make the TCP window size twice to BDP as you have > > suggested. > > > > I am finding another problem. The UDP transmission rate on that link is > > decreased. I changed > > to the default settings , but it is showing the exact readings after > > tuning. It seems it is reading some fixed value from something and > > based on that it is transferring data . > > > > The readings are like this.......... 
> > > > iperf -u -c 192.168.60.62 -t 300 -l 1460 -i 2 > > ------------------------------------------------------------ > > Client connecting to 192.168.60.62, UDP port 5001 > > Sending 1460 byte datagrams > > UDP buffer size: 108 KByte (default) > > ------------------------------------------------------------ > > [ 3] local 10.128.0.2 port 32785 connected with 192.168.60.62 port 5001 > > [ ID] Interval Transfer Bandwidth > > [ 3] -0.0- 2.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 2.0- 4.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 4.0- 6.0 sec 255 KBytes 1.05 Mbits/sec > > [ 3] 6.0- 8.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 8.0-10.0 sec 255 KBytes 1.05 Mbits/sec > > [ 3] 10.0-12.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 12.0-14.0 sec 255 KBytes 1.05 Mbits/sec > > [ 3] 14.0-16.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 16.0-18.0 sec 257 KBytes 1.05 Mbits/sec > > > > The result is for the following tuning. > > net.core.rmem_default = 110592 > > net.core.wmem_default = 110592 > > > > After that I changed the tuning to > > net.core.rmem_default = 196608 > > net.core.wmem_default = 196608 > > > > The readings for the tuning is like this... > > iperf -u -c 192.168.60.62 -t 300 -l 1460 -i 2 > > ------------------------------------------------------------ > > Client connecting to 192.168.60.62, UDP port 5001 > > Sending 1460 byte datagrams > > UDP buffer size: 192 KByte (default) > > ------------------------------------------------------------ > > [ 3] local 10.128.0.2 port 32785 connected with 192.168.60.62 port 5001 > > [ ID] Interval Transfer Bandwidth > > [ 3] -0.0- 2.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 2.0- 4.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 4.0- 6.0 sec 255 KBytes 1.05 Mbits/sec > > [ 3] 6.0- 8.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 8.0-10.0 sec 255 KBytes 1.05 Mbits/sec > > [ 3] 10.0-12.0 sec 257 KBytes 1.05 Mbits/sec > > [ 3] 12.0-14.0 sec 255 KBytes 1.05 Mbits/sec > > > > Kindly please help me to rectify the problem. It is the same link on > > which I > > performed the TCp test. > > > > regards > > zaman > > > > > > > > > > > > > > > > > > > > > > I hope this help, > > > Lachlan > > > > > > -- > > > Lachlan Andrew Dept of Computer Science, Caltech > > > 1200 E California Blvd, Mail Code 256-80, Pasadena CA 91125, USA > > > Phone: +1 (626) 395-8820 Fax: +1 (626) 568-3603 > > > > > > > > From detlef.bosau at web.de Tue Jul 10 10:03:22 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 10 Jul 2007 19:03:22 +0200 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <32980.131.254.10.161.1184083103.squirrel@mail.irisa.fr> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> <46890EBA.9000009@web.de> <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> <468940EE.90606@web.de> <32980.131.254.10.161.1184083103.squirrel@mail.irisa.fr> Message-ID: <4693BBDA.90907@web.de> ksingh at irisa.fr wrote: >> Hm. The question is, whether this "formula mapping" really suffices to >> keep/apply Kelly?s rationale. >> > > another things to look could be to see if "PF scheduling" fulfills the > criteria of being proportionaly fair defined in "Charging and rate control > ..." > > \Sum{ (\lambda_{i}{*} - \lambda_{i})/(\lambda_{i}) } <= 0 > > \lambda_{i}: set of throughput obtained that we want to know are PF or not > \lambda_{i}{*} : any other feasible vector of throughput > > Kamal > > That misses the problem. 
Whatever Kelly talks about in his rationale, Kelly starts with a very precise system model. In this system model, he defines utility functions which are then to be maximized, and boundary conditions which essentially make the discussed optimization problem have a unique solution. At the moment, and in Hosein's paper as well, we talk about utility functions, and we don't care about the boundary conditions. In Kelly's approach, "rates" mean resource shares, and the Kelly approach yields a set of resource shares which solve a given optimization problem. The only shared resource in the HSDPA downlink is service time at the base station. The "rates" used here are in fact coding schemes and not resource shares. So my criticism is that the Kelly paper is based upon a totally different model / problem and is simply not applicable here, even if there are formulae which look similar.

Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937

From detlef.bosau at web.de Tue Jul 10 10:52:42 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Tue, 10 Jul 2007 19:52:42 +0200 Subject: [e2e] Opportunistic Scheduling. In-Reply-To: <33045.131.254.10.161.1184086106.squirrel@mail.irisa.fr> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> <46890EBA.9000009@web.de> <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> <468A81CA.9000800@web.de> <33045.131.254.10.161.1184086106.squirrel@mail.irisa.fr> Message-ID: <4693C76A.7000908@web.de>

ksingh at irisa.fr wrote: > > for the moment not talking about Hosein's rationale or Kelly's: > > What are your doubts regarding PF scheduling? > > PF scheduling divides the time slots equally, ass

First: The very intention of opportunistic scheduling is to have a terminal served when its actual SNR or C/I ratio is _high_. In other words: we want to send in periods of no or constructive interference. So the first goal is to identify periods of high SNR, and I'm not convinced that the currently proposed algorithm matches this goal.

Second: Any form of opportunistic scheduling introduces scheduling jitter into the system. To my knowledge, it is not yet completely understood how large this jitter can be. There is some rumour about the smoothing filters in the OS metrics, but I don't know of substantial work which gives an understanding of how large this jitter can grow. That is the reason why I want to understand the rationale behind OS, because I want to understand this jitter. Even without "QoS requirements", even in a best-effort service, it makes sense - to keep jitter within acceptable limits, - to keep burstiness within acceptable limits, - perhaps to drop packets whose delivery takes too long. I.e., when we know that a packet will be acknowledged far beyond its RTO anyway, and it is not completely delivered yet, it might make sense simply to drop it. In fact, OS has the potential risk of doing "too much" on L2. Basically, this seems to be Caitlin's concern.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ How small is "small" :-) > Do you know of any real measurements where these fluctuations due to PF > caused lot of problems for TCP? > I only know of some paper which mention fluctuations and resulting problems. E.g. the Globecom 04 paper by Thierry Klein. Although I do not yet completely understand, whether these are simulation results or results from real measurements. > and also what could have been the properties of an idle scheduling for you? > > I did not yet think about this. At the moment I?m still in the process of understanding. (I apologize, that this takes some time ;-)) Regards Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937 From query.cdac at gmail.com Wed Jul 11 04:23:55 2007 From: query.cdac at gmail.com (query ) Date: Wed, 11 Jul 2007 16:53:55 +0530 Subject: [e2e] query reg improving TCP performane In-Reply-To: References: <35cb505a0707050739l5855f2abs6de4216f2610fdea@mail.gmail.com> <35cb505a0707052332i55809354xfeec0c700d2b9419@mail.gmail.com> <35cb505a0707100504l292bcc07te7016609cfe7a61e@mail.gmail.com> Message-ID: <35cb505a0707110423w48584f19g2a7b6030d73a3069@mail.gmail.com> On 7/10/07, Lachlan Andrew wrote: > > Greetings Lachlan , > > I was able to identify the size of the buffer in the Router's > interface. > > It was found to be 2040 > > bytes and the buffer management scheme used is FIFO . > > That's a very small buffer -- it can only hold one packet! Are you > sure that was the size? I am not quite sure as I don't have much experience in routers. This is the output of "show buffers " command in Cisco C3750 switch. "" Interface buffer pools: Supervisor MIC Fallback pool buffers, 2040 bytes (total 904, permanent 904): 897 in free list (0 min, 904 max allowed) 44790529 hits, 0 misses supervisor_cpuq_0_pool buffers, 2040 bytes (total 1200, permanent 1200): 700 in free list (0 min, 1200 max allowed) 104718271 hits, 0 misses supervisor_cpuq_2_pool buffers, 2040 bytes (total 64, permanent 64): 0 in free list (0 min, 64 max allowed) 64 hits, 0 misses, 0 trims, 0 created 0 failures (0 no memory) supervisor_cpuq_1_pool buffers, 2040 bytes (total 128, permanent 128): 1 in free list (0 min, 128 max allowed) 25512718 hits, 199587 fallbacks supervisor_cpuq_4_pool buffers, 2040 bytes (total 128, permanent 128): 1 in free list (0 min, 128 max allowed) 3032553 hits, 35456 fallbacks supervisor_cpuq_15_pool buffers, 2040 bytes (total 4, permanent 4): 0 in free list (0 min, 4 max allowed) 801508724 hits, 801508720 misses "" From that I interpreted that the buffer size is 2040 bytes. Correct me if I am wrong. > "" > > The general rule-of-thumb for Reno is that the send buffer should > be > > at least twice the bandwidth*RTT. For BIC is is probably reduced > to > > about 120% of the BDP > > "" > > The tunings are as follows. > > > > Send buffer = BDP + 120 % of BDP > > = 921600 + 184320 > > = 1105920 bytes > > This is 220% of the BDP (100% + 120%). I originally meant just 120%. > Still, using 200% should be better. I think it is only 120 % only of BDP and not 220%. . = 921600 + 921600/100 * 20 = 921600 + 184320 = 1105920 bytes. > > > Receive buffer = BDP + Router's buffer size > > = 921600 + 2040 > > = 923640 bytes > > I would have guessed that this is what is causing the biggest > reduction in your throughput. I'm pretty confident that the router's > buffer size will be bigger than 2040 bytes. 
If you know the appropriate command , please let me know. It is Cisco C3750 catalyst switch. Also , tell me the command to increase the buffer size if you are aware of it. > But if I tune the window size to twice the size of BDP , I got a > > average throughput of > > around 88 Mbits/sec which I feel very much O.K for a 100 MBits/sec > link > > The "window" is controlled by TCP -- you can't tune it. Did you tune > one of the buffer sizes? I did the following tunings at both ends . BDP was 921600 . So twice of BDP is 1756366 /proc/sys/net/core/rmem_max 1756366 /proc/sys/net/core/wmem_max 1756366 /proc/sys/net/ipv4/tcp_rmem 4096 87380 1756366 /proc/sys/net/ipv4/tcp_wmem 4096 87380 1756366 > In that case , the end to end send and receive buffer size should be > > atleast > > equal to twice the size of BDP to obtain maximum throughput. > > That sounds like a good rule of thumb. It is certainly the widely > accepted rule for Reno. All my findings are based on BIC . Now , since in the case of our network , the size of the buffer in the intermediate router is probably not equal to the size of BDP, so in that case I feel that the end to end send and receive buffer size should be atleast equal to twice the size of BDP to obtain maximum throughput. . This applies both for RENO and BIC. But if the size of the buffer in the intermediate buffer is equal to the size of BDP , Then the settings can be whatever you said for BIC. "The receive buffer should be at least equal to the BDP plus the router buffer . The send buffer should be 20% more than BDP " Do you agree with this statement , please give your views. With Thanks and Regards Zaman -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.postel.org/pipermail/end2end-interest/attachments/20070711/053db948/attachment.html From sunshine at cis.udel.edu Tue Jul 10 16:36:33 2007 From: sunshine at cis.udel.edu (Jelena Mirkovic) Date: Tue, 10 Jul 2007 19:36:33 -0400 (EDT) Subject: [e2e] SIGCOMM 2007 - Call for participation Message-ID: SIGCOMM 2007 will be held in Kyoto, Japan, August 27-31. We invite you to participate in this event, which each year attracts top networking researchers and practitioners from around the world. The early registration ends on July 18 at 5pm Japan Standard Time (GMT+9). For more information please visit the SIGCOMM URL: http://www.sigcomm.org/sigcomm2007/ Collocated with the main conference, there will be six workshops: (W1) Mobility in the Evolving Internet Architecture (MobiArch) (W2) Large Scale Attack Defense (LSAD) (W3) Networked Systems for Developing Regions (NSDR) (W4) Internet Network Management 2007 (INM) (W5) Peer-to-Peer Streaming and IPTV Systems (P2P-TV) (W6) IPv6 and the Future of the Internet (IPv6) We hope to see you in Kyoto! For the SIGCOMM organizing commitee: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Jelena Mirkovic, Assistant Professor CIS, University of Delaware 412 Smith Hall, Newark, DE 19716 phone: 302-831-6052, fax: 302-831-8458 http://www.cis.udel.edu/~sunshine ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From detlef.bosau at web.de Wed Jul 11 14:05:15 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 11 Jul 2007 23:05:15 +0200 Subject: [e2e] Opportunistic Scheduling. 
In-Reply-To: <32883.192.44.77.81.1184166278.squirrel@mail.irisa.fr> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> <46890EBA.9000009@web.de> <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> <468A81CA.9000800@web.de> <33045.131.254.10.161.1184086106.squirrel@mail.irisa.fr> <4693C76A.7000908@web.de> <32883.192.44.77.81.1184166278.squirrel@mail.irisa.fr> Message-ID: <4695460B.7040001@web.de> ksingh at irisa.fr wrote: > PF scheduler allocates equal share of time slots to all users over long > term. At the same time it tries to be channel adaptive. Ofcourse, there > are some assumptions for e.g. it may not hold if a user is moving at 3kmph > and the another user is moving in car at 90kmph. > > You may look at this paper by Borst: > http://ftp.cwi.nl/CWIreports/PNA/PNA-R0223.pdf > Among other interesting results like why a strategy "like PF" is optimal, > on page 5 you will see that a user gets 1/M time slots by symmetry. > M is the number of users. > > I have a longer answer in preparation. But in the Borst paper, I just see: "We assume that the feasible rates for the various users vary over time according to some stationary discrete-time stochastic process fR1(t); : : : ;RM(t)g, with Ri(t) representing the feasible rate for user i in time slot t." The key word is "stationary". One _can_ make an assumption like this. However, this does not matsch reality. It is of course possible for many propositions to find scenarios where the proposition holds. And perhaps we can write those as customer?s duties and customer?s responsitibilies in the terms and conditions of network operators ;-) In addition, the paper seems to discuss the distribution of rates. Does it discuss, whether a channel is in its local optimum state? Please keep in mind, that the C/I ratio exhibits a periodic behaviour. However, I will have to read the whole paper, which will take some time. With respect to your very claim, that the sending time is shared in equal portions to all channels in the long run, this is plausible when the vector of feasible rates moves to some stationary process. However, at least at the moment I?m not convinced that this will hold in the general case. Simply spoken: The more I read about HSDPA, the more questions I have and the less convincing this whole stuff appears to me. And the more I read about it, the more rises my strongest objection: HSDPA is an excellent example for what Dave Reed critcized here in the list: My impression is that there is by far to much complexity and "intelligence" in the HSDPA link layer. Particularly, from an end to end perspective, a link layer should be simple and clearly structured. Just one observation, I made yesterday: In HSDPA, the shared downlink is split up into 16 channels using CDMA, 15 of which are used for data transport. And a maximum of 4 terminals may be served in the same time slot. I don?t yet understand the reason for this. But I see a large and complex scheduling algorithm. Simply spoken: From many posts here in the list, particularly from those by David Reed, I learned that networks (and not only networks) shall be kept small and simple. 
When I see HSDPA, this appears to me large and complex :-) Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937 From detlef.bosau at web.de Thu Jul 12 07:12:37 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Thu, 12 Jul 2007 16:12:37 +0200 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: <32883.192.44.77.81.1184166278.squirrel@mail.irisa.fr> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> <46890EBA.9000009@web.de> <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> <468A81CA.9000800@web.de> <33045.131.254.10.161.1184086106.squirrel@mail.irisa.fr> <4693C76A.7000908@web.de> <32883.192.44.77.81.1184166278.squirrel@mail.irisa.fr> Message-ID: <469636D5.8000608@web.de> There is one important point which is perhaps missing in our whole discussion. (One of the "advantages" of being unemployed for quite a long time is that you often cannot sleep at night because the whole situation simply does not leave you alone. And during the last night, I thought about our discussion. It's better than thinking about how to find employment when you cannot sleep. Sorry for being somewhat off topic, but I cannot always hide my situation to the degree I want to.) O.k. The result of the last night is perhaps trivial; some months ago, when I played around with very rough simulations of opportunistic scheduling, they pointed exactly in that direction, and it perfectly matches the observation of the mixup of concepts with the term "rate" here: Do we talk about resource fairness? Or do we talk about throughput fairness? To my understanding e.g. of TCP, we shall talk about resource fairness. And to my understanding e.g. of the congavoid paper and all the work based upon it, we share network _resources_ among network participants. Resource fairness is typically an end to end issue: E.g. the major burden of end to end congestion control is upon the terminals. They may be assisted by nodes in between, e.g. by RED or ECN, however it's up to the terminals to identify congestion, to relieve the network from too much load and - as a side effect - to assign a fair share of _resources_ to each flow. When I look at OS, there is some local decision how the optimal rates for a link are to be set, and then the flows' average rates are made to achieve / follow these rates. It is obvious that this approach will cause the competing flows to achieve equal throughputs or a predefined throughput vector. And of course, this may cause unwanted consequences, e.g. scheduling jitter: When PF scheduling targets equal average throughputs for a number of flows, a flow which starts with some delay will cause all other flows to stop because its average throughput is far less than that of the competitors. And of course, the whole approach requires infinitely backlogged queues. And of course, this assumes greedy sources. And of course, there are numerous approaches to alleviate jitter etc. when e.g. sources are not greedy. However, I come to the conclusion that it is exactly the mixup of "rate" and "resources" in HSDPA and the like, and the goal of achieving throughput fairness instead of resource fairness, that is the very reason for potential problems. A note on throughput fairness: What is throughput fairness? What is throughput? 
The rates we're talking about in this context are code rates or service rates resulting from a certain MCS/PS. We don't talk about block error rates. We don't talk about necessary retransmissions, be it on layer 2 or end to end. We don't take into account whether the application is error tolerant or not. So we don't have an idea what "throughput" means for the user and in terms of the application. So, any fair distribution of throughput on layer 2 is necessarily somewhat arbitrary. Some "self-made fairness goal", which hopefully matches the end to end goals. I well remember that we discussed some economic aspects here on the list some time ago. And it is exactly the economic view that leads to the basic criterion: When two users pay the same price, they shall get the same service. And for a base station in a cellular network that means: they shall get the same amount of sending time. When one user places his mobile directly at the antenna station and the other hides behind a wall of reinforced concrete, then the user with the mobile placed at the antenna station will of course receive a better TCP _good_put than our nearly hidden terminal, which perhaps will not achieve any goodput at all. But is this the network's responsibility? Definitely not! When both users pay the same price, the network will spend the same effort for both users to deliver any pending packets. When I buy a new watch and I pay for a watch that is not water resistant, I will _get_ a watch that is not water resistant. And when I go diving afterwards to 50 meters depth with this wonderful new watch on, I can hardly hold the watchmaker liable when the watch is broken afterwards. When I now take into account the work by Frank Kelly, and if I understand this correctly, this work gives users the opportunity to get better service than others - when they are willing to pay more than others. Of course, Kelly discusses proportional fairness as an example, but this is not the crux of the paper. To my understanding, users may define their own utility functions as long as these are strictly concave, and then we learn from Frank Kelly how to find an optimal schedule even for those utility functions _and_ we find a way to charge users appropriately. So, when a user definitely wants only service with high rates, he is free to define his utility function accordingly. Consequently he is so served - and so charged. In some sense, the utility function of elastic traffic defines the traffic's QoS requirements. And the optimization problem discussed by Kelly is to negotiate a trade-off between the users' requirements and the network's actual capacity. So, we have _no_ best effort network but a network with QoS requirements instead. And of course, Kelly may talk about (service) rates because in Kelly's model rate and service time are reciprocally proportional. When we talk about code rates, i.e. coding schemes or puncturing schemes, where the service time for a block remains the same and only the payload / redundancy ratio varies from slot to slot, Kelly's model will definitely fail. So, to make a long story short, I see three concerns here: 1.: In literature dealing with HSDPA and the like we have a mixup between the terms service rate and code rate. 2.: PF scheduling pursues throughput fairness whereas in best effort networks, we want to pursue resource fairness instead. 3.: Kelly's "recipe" for elastic traffic simply does not apply here. Elastic traffic is not best effort traffic. 
Elastic traffic comes with a utility function while best effort traffic does not. And because best effort traffic is equally charged on a per-resource basis, I don't see Kelly's work as applicable here. O.k., and now I take criticism :-) (And admittedly, I like the idea of exploiting periods of good channel conditions, and perhaps I have some vague idea in mind how this can be done in a reasonable way which is 1. simple and which 2. pursues resource fairness instead of throughput fairness. And perhaps I will eventually write it down.) Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937 From detlef.bosau at web.de Mon Jul 16 14:43:40 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 16 Jul 2007 23:43:40 +0200 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: <32883.192.44.77.81.1184166278.squirrel@mail.irisa.fr> References: <4616E722.3070402@web.de> <46487D95.4040104@web.de> <7.0.1.0.2.20070514143809.02813318@antd.nist.gov> <4648C666.9000607@web.de> <46498002.9000903@gmail.com> <46890EBA.9000009@web.de> <22819.81.193.31.169.1183392962.squirrel@mail.irisa.fr> <468A81CA.9000800@web.de> <33045.131.254.10.161.1184086106.squirrel@mail.irisa.fr> <4693C76A.7000908@web.de> <32883.192.44.77.81.1184166278.squirrel@mail.irisa.fr> Message-ID: <469BE68C.2060003@web.de> For some reason, a post of mine did not appear on the list. So, I will only repeat the very essential parts: I refer to the HSDPA downlink which is shared by some set of users. The shared resource is hence the service time at BS. Let's consider the users, not the flows. So each user gets a certain share of service time. Although there is one common resource, the individual users may experience different noise levels which affect the achieved throughput for a user. Now the most basic question is: Do we pursue resource fairness? Or do we pursue throughput fairness? To my understanding of the end to end design, we typically pursue _resource_ fairness, whereas the current literature on HSDPA and the like pursues _throughput_ fairness. Do you agree here? In general, I see three concerns in this context, which I cut from this post's draft, but I think they may be understood as well: 1.: In literature dealing with HSDPA and the like we have a mixup between the terms service rate and code rate. 2.: PF scheduling pursues throughput fairness whereas in best effort networks, we want to pursue resource fairness instead. 3.: Kelly's "recipe" for elastic traffic simply does not apply here. Elastic traffic is not best effort traffic. Elastic traffic comes with a utility function while best effort traffic does not. And because best effort traffic is equally charged on a per-resource basis, I don't see Kelly's work as applicable here. Do you agree here? Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937 From touch at ISI.EDU Tue Jul 17 07:54:01 2007 From: touch at ISI.EDU (Joe Touch) Date: Tue, 17 Jul 2007 07:54:01 -0700 Subject: [e2e] testing Message-ID: <469CD809.6080106@isi.edu> Test - please ignore. Joe (as list admin) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 250 bytes Desc: OpenPGP digital signature Url : http://mailman.postel.org/pipermail/end2end-interest/attachments/20070717/ddadf9a4/signature.bin From davide+e2e at cs.cmu.edu Tue Jul 17 22:11:05 2007 From: davide+e2e at cs.cmu.edu (Dave Eckhardt) Date: Wed, 18 Jul 2007 01:11:05 -0400 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: Your message of "Thu, 12 Jul 2007 16:12:37 +0200." <469636D5.8000608@web.de> Message-ID: <200707180511.l6I5BJsF013408@boreas.isi.edu> > Do we talk about ressource fairness? > Or do we talk about throughput fairness? It is possible to do both, in a balanced way. Some time back we did some initial thinking about that. D. Eckhardt, P. Steenkiste. Effort-limited Fair (ELF) Scheduling for Wireless Networks. http://www.cs.cmu.edu/~davide/papers/INFOCOM2000.pdf (erratum: In Section VI, the crossover error rate is 55%, not 50%) Dave Eckhardt From detlef.bosau at web.de Wed Jul 18 10:15:18 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 18 Jul 2007 19:15:18 +0200 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: <200707180511.l6I5BJsF013408@boreas.isi.edu> References: <200707180511.l6I5BJsF013408@boreas.isi.edu> Message-ID: <469E4AA6.4020207@web.de> Dave Eckhardt wrote: >> Do we talk about ressource fairness? >> Or do we talk about throughput fairness? >> > > It is possible to do both, in a balanced way. Some time back > we did some initial thinking about that. > > D. Eckhardt, P. Steenkiste. > Effort-limited Fair (ELF) Scheduling for Wireless Networks. > http://www.cs.cmu.edu/~davide/papers/INFOCOM2000.pdf > (erratum: In Section VI, the crossover error rate is 55%, not 50%) > > Dave Eckhardt > I just read your paper and get somewhat lost in the use of - bandwidth, - capacity, - throughput, - capacity loss, - ... It is just an observation of mine, that these terms are somewhat polymorphic, particularly when they are used by CS and EE people. And I personaly think, this is a very serious issue as it is often hardly possible to communicate at all between these disciplines... Just a very premature observation. And now for something completely different: Scheduling. Sometimes, I get the impression they were three kinds of engineers dealing with networks. 1.: CS / packet switching guys, which basically don?t know about scheduling in communication networks. Packets are stored in a FIFO queue and that?s it. 2.: EE/ telco / line swichting guys, which basically can?t imagine anything else than scheduling, they have congenital schedulers inside. 3.: Multimedia / DiffServ / IntServ/ QoS guys, who are something in between :-) O.k. I belong to the first category. (And I strongly don?t belong to the third, even more after having got in touch with multimedia over wireless networks.) So, just for my understanding I raise a perhaps stupid question: Let?s assume a cellular network. Why do we need a scheduler then in the base station? (I once asked this an EE guy and got no answer...) Basically, I see exactly one reason: In networks which require a link layer recovery mechanism and where the BS-Terminal links suffer from very different error rates, we need a mechanism which prevents starvation problems and head of line blocking. Particularly the link layer recovery may insert traffic which must be served before any other traffic for flow control reasons: Unacknowledged data occupies memory at the sender. 
In addition incomplete L3 packets occupy memory at the receiver, particularly becaause incomplete packets cannot be handed to the application. The worst case is that a receiver cannot receive additional data in order to have packets completed and cannot pass packets to the application, so there is a dead lock situation. Therefore, retransmissions are given higher priority then first / new transmissions. Is this correct? Perhaps, I?m a bit nitpicking here. But when I introduce a scheduler at the base station at all, there must be a convincing reason to do so. And I only found the aforementioned one... O.k. Back to the cellular network. So, how do we integrate such a cell with a base station with a scheduler and a number of terminals into the big picture of - best effort - asynchronous - in many cases self clocked - end to end traffic ? At the moment, I think a ressource fair scheduler at the base station would be the best solution to do this. What do you think? Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937 From davide+e2e at cs.cmu.edu Wed Jul 18 12:03:22 2007 From: davide+e2e at cs.cmu.edu (Dave Eckhardt) Date: Wed, 18 Jul 2007 15:03:22 -0400 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: Your message of "Wed, 18 Jul 2007 19:15:18 +0200." <469E4AA6.4020207@web.de> Message-ID: <200707181903.l6IJ3gtv012842@boreas.isi.edu> > Let's assume a cellular network. We more or less anti-assumed that, but I'll try to answer the question anyway. > Why do we need a scheduler then in the base station? You probably don't *need* one. But if an error-sensitive scheduler would let you undetectably degrade the quality experienced by a user in a good spot in exchange for enabling a user in a bad spot to "unfairly" (in an air-time sense) get enough quality to keep paying you by the minute for his call, you might *want* one. That's a possible monetary answer; for LANs one might imagine a sense of community supporting the idea of spending a little extra air time to help out somebody temporarily in a bad spot (maybe next to a microwave oven which will shut off soon). > Basically, I see exactly one reason: In networks which require a link > layer recovery mechanism and where the BS-Terminal links suffer from > very different error rates, we need a mechanism which prevents > starvation problems and head of line blocking. You could do that by abandoning transmission of the head-of-line packet after some amount of time (arguing it is "resource fair" to starve the station having trouble to keep the link going). Or you could fragment packets into link-level frames with different sizes and codings for each station depending on its error environment, and then round-robin sub-packet frames to stations (you'd need to have N head-of-line packets instead of 1, but that should be ok). > Perhaps, I'm a bit nitpicking here. But when I introduce a scheduler at > the base station at all, there must be a convincing reason to do so. An alternative perspective is that if other people have already firmly decided to introduce schedulers at base stations we might want to make suggestions about better schedulers :-) > Back to the cellular network. > > So, how do we integrate such a cell with a base station with a scheduler > and a number of terminals into the big picture of > - best effort > - asynchronous > - in many cases self clocked > - end to end traffic ? 
I'm not sure I understand the question... in CDMA networks (coming soon to a GSM phone near you!) soft handoff already means there is a level of coordination above the "base station". > At the moment, I think a ressource fair scheduler at the base station > would be the best solution to do this. We hope the paper argues that effort-fair (== "resource fair") is "fair" but undesirable in some situations, that outcome-fair is "fair" but undesirable in other situations, and that a hybrid notion of fairness is both desirable and achievable. Dave Eckhardt P.S. There is further material in my dissertation, including a warning about the difficulty of measuring per-station conditions when trying to do scheduling. From detlef.bosau at web.de Wed Jul 18 14:29:42 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 18 Jul 2007 23:29:42 +0200 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: <200707181903.l6IJ3gtv012842@boreas.isi.edu> References: <200707181903.l6IJ3gtv012842@boreas.isi.edu> Message-ID: <469E8646.6030003@web.de> Dave Eckhardt wrote: >> Let's assume a cellular network. >> > > We more or less anti-assumed that, but I'll try to answer the > question anyway. > > Forgive me :-) >> Why do we need a scheduler then in the base station? >> > > You probably don't *need* one. But if an error-sensitive scheduler > would let you undetectably degrade the quality experienced by a > user in a good spot in exchange for enabling a user in a bad spot > to "unfairly" (in an air-time sense) get enough quality to keep > paying you by the minute for his call, you might *want* one. O.k. You refer to the idea of opportunistic scheduling. That?s what I?m currently thinking about, but this is the next step. Basically, and particularly this should hold true in WLAN (please correct me if I?m wrong) you could maintain an interface queue, as e.g. for Ethernet, and send pending packets in FIFO order. Correct? > That's > a possible monetary answer; for LANs one might imagine a sense of > community supporting the idea of spending a little extra air time > to help out somebody temporarily in a bad spot (maybe next to a > microwave oven which will shut off soon). > > Question: Do you make any assumptions about a recovery layer, particularly ARQ, here? I?m not quite sure but I was told, some WLAN settings would employ an ARQ mechanism? > You could do that by abandoning transmission of the head-of-line packet > after some amount of time (arguing it is "resource fair" to starve the > station having trouble to keep the link going). > > Q: Do you think of a stop?n wait ARQ scheme here? This is a basic question, because in a sliding window ARQ scheme, the head-of-line packet would in fact not "stay" at the head of line but would repreatedly occur there again and again. So, it?s no head of line blocking in its literal sense but the effect is the same. > Or you could fragment packets into link-level frames with different sizes > and codings for each station depending on its error environment, and then > round-robin sub-packet frames to stations (you'd need to have N head-of-line > packets instead of 1, but that should be ok). > > And exactly this introduces a scheduler (round robin). Thus, you have a head of line blocking for one channel where the rest is not blocked. And of course, it makes sense to define a maximum number of transmission attempts, a packet is discarded afterwards. >> Perhaps, I'm a bit nitpicking here. 
But when I introduce a scheduler at >> the base station at all, there must be a convincing reason to do so. >> > > An alternative perspective is that if other people have already firmly > decided to introduce schedulers at base stations we might want to make > suggestions about better schedulers :-) > > But this is not "purely scientific" reasoning ;-) Don't you agree? ;-) >> Back to the cellular network. >> >> So, how do we integrate such a cell with a base station with a scheduler >> and a number of terminals into the big picture of >> - best effort >> - asynchronous >> - in many cases self clocked >> - end to end traffic ? >> > > I'm not sure I understand the question... in CDMA networks (coming soon > to a GSM phone near you!) soft handoff already means there is a level > of coordination above the "base station". > > Of course, but I intentionally ignore this for the moment. This is in fact not that unrealistic, as I'm thinking particularly of HSDPA, where (I don't know what the marketing people say) you cannot move too fast if you want to exploit maximum capacity. One (alleged ;-)) idea in HSDPA is to exploit multi user diversity by harnessing the microscopic fading. And that becomes more difficult the faster you move. So, I'm thinking of velocities between 1 and 10 meters per second. Therefore, you shouldn't have to roam that often. (And you're perfectly right, other persons have decided to use a scheduler here and it's an interesting challenge to think whether there are better ones than those actually used ;-)) >> At the moment, I think a ressource fair scheduler at the base station >> would be the best solution to do this. >> > > We hope the paper argues that effort-fair (== "resource fair") is "fair" > but undesirable in some situations, that outcome-fair is "fair" but > undesirable in other situations, and that a hybrid notion of fairness > is both desirable and achievable. > > That's at least a much better position than "purely throughput fair" (= "outcome fair"), which I think is widely used at the moment. > Dave Eckhardt > > P.S. There is further material in my dissertation, including a warning > about the difficulty of measuring per-station conditions when trying > to do scheduling. > Is this available somehow? Could you give me a link? Thanks! Detlef -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937 From detlef.bosau at web.de Sun Jul 22 14:02:35 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Sun, 22 Jul 2007 23:02:35 +0200 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: <200707181903.l6IJ3gtv012842@boreas.isi.edu> References: <200707181903.l6IJ3gtv012842@boreas.isi.edu> Message-ID: <46A3C5EB.1090305@web.de> Dave, there is another very basic objection I have to your Infocom paper. This is a _very_ common misconception, and I have suffered from it for years. I once was involved in a project called COMCAR and had to do a quite strange task within this project. There were no results - and I was blamed for that. And honestly, I think one did me wrong there. My task was to produce an "adaptive middleware" for "adaptive components" in wireless networking. This whole thing started in 2000 and I never talked about this in public. I'm extremely bitter about this project and that I was made responsible for things I'm not responsible for. 
One difficulty was, and during the last years I learned that this is perhaps the most basic reason why adaptation of multimedia documents in mobile networks is condemned to fail before it's even started, that there is no serious possibility to have a long-term or even medium-term prediction of a wireless channel's properties. If you want to adapt a document or a multimedia system's appearance to changing channel properties and if you want to do this in an acceptable manner, i.e. don't want to change your appearance every 10 ms, you inevitably need a certain idea of what properties your channel can be expected to have for, say, the next 10 or 20 seconds. And from all I know about mobile networks, this is simply ridiculous. Of course, there are the usual Nearly Infallible Sources of knowledge: Laplace's demon, clairvoyance and divine inspiration ;-) What I learned in addition, and now I switch to your paper, is that media streaming and TCP are two different cups of tea. I was condemned to work with IP and I was strictly forbidden to have a look at the lower layers, and at that time I did not have the standing to resist such a request. I felt I had to look at the lower layers, and at that time I did not really know why. The reasons are so simple. - Data flows are typically asynchronous and sensitive against data corruption. - Media flows are typically synchronous / isochronous and at least to a certain degree robust against data corruption. With respect to mobile networks that means: because you inevitably have a recovery layer in these systems, any recovery layer based upon ARQ or HARQ as it is offered today simply has no means to accept "blocks with one percent bit error rate" or "blocks with only a small number of bits being corrupted". An ARQ block or HARQ block is either corrupt or intact. So, I'm absolutely convinced that in networks with high error rates, i.e. particularly mobile wireless networks, IP is absolutely ill suited for any kind of media streaming. I know that there is some religious faith which claims the contrary, but after having thought about this problem for many years now, I'm convinced that IP or any other packet switched protocol must not be used for media streaming in mobile networks. You can even take another approach to the same problem. Media flows are typically conveyed using RTP/UDP/IP. When media flows are played out, you need three kinds of information: - What is played out - where and - when. "When" is determined by the RTP header, partially "where", when several flows are multiplexed. "Where" is determined by the RTP header (see above), the UDP header (port) and the IP header (user equipment's address). "What" is determined by the RTP SDU. If anything but the RTP SDU is corrupted, you can throw away your whole packet. So, your media flow may well tolerate 1 percent BER, but if the one corrupted bit is in the time stamp in the RTP header, you can throw away your whole RTP packet, which may consist of say 1 kByte, roughly spoken: 10,000 bits. So, what would be a little flaw for a properly implemented line switching is reliably turned into a disaster when you use packet switching. If I had known this eight years ago, I would simply not have joined the project, or cancelled my employment within the first three months of this nonsense. And I'm sure that my personal situation would be different from what it is now. So, please forgive me if I'm bitter on this one. But when I read your paper, I saw two TCP flows and one Audio flow and one Video flow. 
And then I saw something on throughput, which is necessarily comparing peaches and oranges in that case, because no one is interested in TCP throughput. One is typically interested in TCP _goodput_. And that has to take into account of course TCP retransmissions and can "slightly" differ from any kind of L2 throughput in faulty networks. In consequence, I'm not quite sure whether it makes sense to handle TCP and media flows by the same kind of scheduler anyway. More drastically spoken: I'm strongly convinced that this is simply nonsense. I was not allowed to deal with lower layers at that time. Otherwise, I would have learned what I learned during my unemployed time, because no one was there to tell me that I shouldn't, that in _any_ kind of mobile networks the - coding - forward error correction - adaptation - care for error robustness - data compression - switching for media streams (e.g. speech) is done completely differently from the ARQ/FEC schemes used for data, particularly there is _no_ ARQ and there is no packet switching but line switching (...with the usual metric ton of salt if someone thinks of line switching being done with copper lines) and therefore the - When and - Where for a media flow is determined by the system's schedule and everything is fine. This is one of the rare rules which is not even confirmed by any exception ;-) Therefore, we can make phone calls with mobile phones. But from my experience of the last years, I'm strictly opposed to IP based media streaming in mobile wireless networks. I think that media flows should be delivered using line switching techniques, and when it comes to combining packet switching and line switching here, I am often reminded of the "Beyond IP" project - I think it was pursued by Paul Krueger in Kaiserslautern some years ago - which targeted a similar goal with ATM and IP, IIRC. However, I strongly believe that IP based packet switching in mobile wireless networks is not the correct way to go and that there are better alternatives, e.g. a hybrid approach like Beyond IP. It may well be that the situation is different in WLAN. In fact, I've seen successful media flow distribution via IP Multicast in WLAN. However, in WLAN, you have well placed antennas and no mobility (i.e. pedestrian, i.e. no mobility) and strong distance restrictions. I often say: WLAN is a wirebound network with invisible copper. I well know that there are approaches to add additional protection to UDP / RTP headers in media flows to accommodate them to mobile networks, but I'm not yet convinced that the results are really better than media transport using line switching techniques in that case. O.k. My position is somewhat extreme and perhaps not common sense. But at least, I confessed it :-) In addition: The situation becomes much more complex when it comes to HSDPA networks. I think there are some remarks necessary particularly in this context concerning the term "rate" and some scheduling issues, but I'm still working on it. -- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937 From jnc at mercury.lcs.mit.edu Mon Jul 23 10:34:20 2007 From: jnc at mercury.lcs.mit.edu (Noel Chiappa) Date: Mon, 23 Jul 2007 13:34:20 -0400 (EDT) Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. 
Message-ID: <20070723173420.4E058872D1@mercury.lcs.mit.edu> > From: Detlef Bosau > - Data flows are typically asynchronous and sensitive against data > corruption. > - Media flows are typcially synchronous / isochronous and at least to a > certain degree robust against data corruption. "sensitive to data corruption" would probably be a better way to phrase that first one... > you inevitably have a recovery layer in these systems: ... simply has > no means to accept "blocks with one percent bit error rate" ... block > is either corrupt or intact. > ... > .. in networks with high error rates, i.e. particularly mobile wireless > networks, IP is absolutely ill suited for any kind of media streaming. > ... > IP or any other packet switched protocol must not be used for media > streaming in mobile networks. Well, at an architectural level, the internetworking model is supposed to work for media applications (i.e. error-tolerant applications) on networks with some errors. That's the whole reason IP and TCP were split apart to begin with, because the media applications (packet voice was the one they were actually doing) didn't need the high robustness of TCP, and the data stream delays caused by retranmissions were a bad case of "cure is worse than the disease". That allowed applications which didn't need the data robustness of TCP to be built directly on an unreliable datagram service. Similarly, UDP has a distinguished value of the checksum field for "no checksum", for applications for which data perfection is not critical. (Although on checking the IPv6 specification for a point below, I notice that IPv6 doesn't have this capability. More on this below...) But you do have one (actually several) interesting points below here: > your media flow may well tolerate 1 percent BER, if the one corrupted > bit is in the time stamp in the RTP header, you can throw away your > whole RTP packet which may consist of say 1 kBypte, roughly spoken: > 10.000 bits. IPv4 has a separate header checksum which has to be correct; the thinking being that if the internetwork header is damaged, you might as well toss the packet, since it's quite possibly damaged past the point of being useful (e.g. source or destination address is damaged). However, on high BER networks with a simplistic internetwork<->network adaption layer (important!), there's a good chance (especially with the smaller packets that some media applications produce) that in any packet with error(s), an error will have occurred in the header, and the packet will have to be discarded. The obvious engineering response (especially on low bandwidth networks!), knowing that if the header suffers a bit error, the bandwidth used to send the packet will have been wasted, is to use an internetwork<->network adaption layer that takes special measures to protect the header. E.g. an FEC or CRC - something that can *correct*, not just *detect*, errors - over that part of the packet, or something like that. Does any high-error, low-datarate network used for media applications actually do this? The next obvious response is one that requires a little bit of cooperation from the internetwork level, which is to realize that for data applications, which are sensitive to data error (such as TCP), it's worth turning on network-level reliability mechanisms (such as a CRC over the entire packet). Obviously, the internetwork layer ought to have a bit to say that; the adaption layer shouldn't have to look at the protocol type. 
For one, on low-bandwidth networks, this makes sure that the bandwidth isn't wasted. More importantly, TCP responds poorly when a high percentage of packets are dropped; throughput goes to the floor (because modern TCP's interpret this as a signal of congestion), and worse, the connection may close. Interestingly, again, IPv4 originally did include such a mechanism: the ToS bits did include a "high reliability" bit, but I'm not sure we fully understood the potential application I am speaking of here. (Also, interestingly, speaking of IPv4, IPv6 doen't have a separate header checksum; IIRC the thinking was it was duplicative of the end-end checksum. That's why UDPv6 mandates the checksum. However, it seems to me that this, plus the UDP checksum point, means that in a very critical area, IPv6 is actually less suitable for media applications than IPv4. But I digress..) > .. what would be a litte flaw for a propper implemented line switching > is reliably turned into a disaster, when you use packet switching. Now this raises another interesting point (although not the one you were making, which I think I have answered above); which is that circuit switching is potentially inherently better for transmission systems which have high BER. One possible way this could be true is that the application will perform poorly, and that damaged packets potentially waste bandwidth. However, I think this can be mostly avoided through use of CRC's, etc, as discussed above. Yes, that makes for a slightly more complex design, but in this day and age (we're not back in the age of 74-series TTL anymore) the extra computing power to do a CRC is 'no big deal'. Another point might be the per-packet header overhead, but this is obviously an old debate, one where the other advantages of the datagram model (such as the lack of a setup delay) have carried the day - to the point that even notionally circuit-based systems such as ATM have had datagram features, along with per-packet headers... But I have seriously tried to think of an inherent advantage that circuit swithed networks *necessarily* have over a well-designed packet switching system, and I can't think of one. The key, of course, is that "well-designed"; the entire system (especially the internetwork<->network adaption layer) has to be designed with both these applications (data-damage-insensitive) and also these networks (high BER) in mind. I think the original IPv4 design did a reasonable (not perfect, of course - we had nothing like the experience we have now) job on these things. Perhaps the understanding of those points was not spread as widely as it should have been, though (through not being prominently discussed, as the other points of the datagram design philosophy were). The system you worked on (which apparently used an all-or-nothing approach to data integrity - could they even repair damage, or just detect it?) didn't do that, but I don't think it proved it can't be done (any more than the Titanic proved you can't make ships which don't kill passengers by the thousand :-). Noel From davide+e2e at cs.cmu.edu Mon Jul 23 13:04:16 2007 From: davide+e2e at cs.cmu.edu (Dave Eckhardt) Date: Mon, 23 Jul 2007 16:04:16 -0400 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: Your message of "Sun, 22 Jul 2007 23:02:35 +0200." 
<46A3C5EB.1090305@web.de> Message-ID: <200707232004.l6NK4QBt024839@boreas.isi.edu> > One difficulty was, and during the last years I learned that this is > perhaps the most basic reason why adaptation of multimedia documents in > mobile networks is condemned to fail before it's even started, that > there is no serious possibility to have a long-term or even medium-term > prediction of a wireless channel's properties. As far as I can tell, this is indeed fiendishly difficult. A couple of times people asked for my bit-level traces in order to fit some sort of model to them, but nobody who did so was ever heard from again... this is one reason why my scheduling approach was essentially reactive rather than predictive, and works without needing to measure error rates. It would be easy enough to plug in an oracle if one were available, of course. > But when I read your paper, I saw two TCP flows and one Audio flow and > one Video flow. And then I saw something on throughput, which is necessarily > comparing peaches and oranges in that case, because no one is interested in > TCP throughput. One is typically interested in TCP _goodput_. And that has > to take into account of couse TCP retransmissions and can be "slightly" > differ from any kind of L2 throughput in faulty networks. I have never been a fan of the word "goodput". One layer's "goodput" is just the "throughput" of the next layer up, after all--if the higher layer is thrashing, your "goodput" isn't any good, and you have no way of knowing that. Since there are pre-existing words for "effort" and "outcome", it makes sense to me to use them. Anyway, rest assured that the authors of the ELF scheduling paper know about "goodput" and gave the matter due treatment--but, due to space constraints, not in that paper. > In consequence, I'm not quite sure whether it makes sense to handle TCP > and media flows by the same kind of scheduler anyway. More drastically > spoken: I'm strongly convinced that this is simply nonsense. What we were trying to accomplish was conceptualizing the scheduling of high-error wireless links in terms of effort-fair vs. outcome-fair, arguing that a hybrid is frequently desirable, and demonstrating a basic implementation. It's fine with me if you wish to argue that for data outcome should be measured as "100%-correct packet bytes with latency below 250 ms" but that for voice outcome should be measured in terms of "85%-correct packet bytes with latency below 50 ms". And I wouldn't object if you wanted to argue that effort should be measured in watt-hz-seconds or some other measure of how much spectrum resource is expended. But I believe that in a high-error environment it *does* make sense to integrate scheduling of disparate flow types according to a tradeoff between effort and outcome (and we were arguing for a particular model very different from utility curves). Note that a couple messages back my motivating example for cell phones was that an operator may be able to very slightly degrade the voice quality of some customers in order to "unfairly" boost the experience of another customer in a "dead spot", and that this might keep the customer talking instead of hanging up. No part of that example depends on TCP, "goodput", persistent ARQ, etc. The key issue is the notion of fairness. I don't think we know "the story" on running voice over data-centric networks versus running data over voice-centric networks or whether there is a neutral ground. Last time I looked Real Audio was mostly running over TCP, not UDP... 
let alone anything involving link-level options to deliver partially-mangled packets. And initially GSM was kind of dubious for data because of the voice-centric deep interleaving, right? I think there are plenty of open questions. But I haven't yet seen anything to convince me that the concepts of effort-fair and outcome-fair don't make sense or that either one is better than a tunable hybrid. Dave Eckhardt From detlef.bosau at web.de Mon Jul 23 14:22:28 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Mon, 23 Jul 2007 23:22:28 +0200 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: <20070723173420.4E058872D1@mercury.lcs.mit.edu> References: <20070723173420.4E058872D1@mercury.lcs.mit.edu> Message-ID: <46A51C14.9000203@web.de> Noel Chiappa wrote: > > From: Detlef Bosau > > > - Data flows are typically asynchronous and sensitive against data > > corruption. > > - Media flows are typcially synchronous / isochronous and at least to a > > certain degree robust against data corruption. > > "sensitive to data corruption" would probably be a better way to phrase that > first one... > > Thanks. (And now you all know: I'm not a native speaker. Prepositions and the like are always a problem. And we Germans often strike back with articles :-)) > > IP or any other packet switched protocol must not be used for media > > streaming in mobile networks. > > Well, at an architectural level, the internetworking model is supposed to > work for media applications (i.e. error-tolerant applications) on networks > with some errors. > > That's the whole reason IP and TCP were split apart to begin with, because > the media applications (packet voice was the one they were actually doing) > didn't need the high robustness of TCP, and the data stream delays caused by > retranmissions were a bad case of "cure is worse than the disease". That > allowed applications which didn't need the data robustness of TCP to be built > directly on an unreliable datagram service. > That points in the direction of having header data "better protected" than payload. However: Is this implemented particularly in mobile wireless networks, particularly on "data channels"? And to the best of my knowledge, IP packets typically _are_ conveyed via "data channels". In the mentioned project, I was advised to ignore the checksum in UDP packets. If we ignore possible problems with the header information, this is exactly what you write: Respect the (IP) header checksum and ignore the rest. However: What happens on the link layer? In the link layer specifications for data channels I have read so far (and I never happened to read a standard where header data is given better protection than payload, although I often heard about it) there is frequently a data recovery mechanism which simply tries to recover the data packets and which checks the correctness of a packet by _one_ CRC sum for the whole packet, or for the individual "radio blocks" respectively, which in turn means nothing else than that the whole packet is either error free or will be dropped.
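To put rough numbers on this "all or nothing" behaviour, here is a toy calculation of my own (it assumes independent bit errors, which a fading channel certainly does not give you - real errors come in bursts - so take the numbers as purely illustrative). It compares the probability that a packet of roughly 10,000 payload bits passes one CRC over everything with the probability that just the IPv4/UDP/RTP headers (about 40 bytes without options) arrive intact:

def p_intact(bits, ber):
    # probability that 'bits' consecutive bits survive, assuming
    # independent bit errors at the given bit error rate
    return (1.0 - ber) ** bits

PAYLOAD_BITS = 10000                 # roughly the 1 kByte RTP packet from my example above
HEADER_BITS = (20 + 8 + 12) * 8      # IPv4 + UDP + fixed RTP headers, no options

for ber in (1e-4, 1e-3, 1e-2):
    whole = p_intact(HEADER_BITS + PAYLOAD_BITS, ber)
    header = p_intact(HEADER_BITS, ber)
    print("BER %.0e: whole packet intact %.3g, headers alone intact %.3g"
          % (ber, whole, header))

Already at a bit error rate of 10^-3 practically no full-size packet survives an all-or-nothing check, while the headers alone survive roughly three times out of four - which is exactly why header protection plus delivery of a possibly damaged payload to an error-tolerant application would make such a difference for media flows.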
> Similarly, UDP has a distinguished value of the checksum field for "no > checksum", for applications for which data perfection is not critical. > Surely it has! But how do I tell the link layer not to care for correctness? Where is the switch where I can tell the link layer: "Dear Link Layer! I never will have a look at the UDP checksum, I always leave it alone!" and where I don't get the answer: "So, when you don't care for the checksum, you surely won't mind me caring for a correct checksum, will you?" Some months ago, Dave Reed wrote that mobile networks' link layers are often simply "too smart"! ;-) I'm not quite sure whether we have header checksums particularly for UDP and RTP, or perhaps for UDP and RTP combined; then we could define for the link layer that it shall care for the IP/UDP/RTP headers to be correct - and _must_ _not_ care for the rest. I'm curious whether this is implemented in any wireless mobile network? > IPv4 has a separate header checksum which has to be correct; the thinking > being that if the internetwork header is damaged, you might as well toss the > packet, since it's quite possibly damaged past the point of being useful > (e.g. source or destination address is damaged). > > Yes. And when the checksum is correct the network shall be well behaved and damn stupid and convey the packet. However, some of these networks, like GPRS, UMTS and - I'm not yet sure - HSDPA, are "know it all" networks and even toss the packet when it's perfectly acceptable for (to? Prepositions are terrible ;-) ) the application. That's a typical end-to-end argument: only the application knows whether the packet is o.k. or not. (However, "smart" networks think they know better ;-)) > The obvious engineering response (especially on low bandwidth networks!), > knowing that if the header suffers a bit error, the bandwidth used to send > the packet will have been wasted, is to use an internetwork<->network > adaption layer that takes special measures to protect the header. E.g. an FEC > or CRC - something that can *correct*, not just *detect*, errors - over that > part of the packet, or something like that. > > Does any high-error, low-datarate network used for media applications > actually do this? > > That's exactly my concern! I heard some rumour that this was intended, but I never saw it in practice. (And it's only a concern in _packet_ switching, as in "_line_ switching" / TDM networks the "header information" is put into the schedule and therefore the problem does not exist any more.) > The next obvious response is one that requires a little bit of cooperation > from the internetwork level, which is to realize that for data applications, > which are sensitive to data error (such as TCP), it's worth turning on > network-level reliability mechanisms (such as a CRC over the entire packet). > Obviously, the internetwork layer ought to have a bit to say that; the > adaption layer shouldn't have to look at the protocol type. > > That's the "switch" I missed above. > Interestingly, again, IPv4 originally did include such a mechanism: the ToS > bits did include a "high reliability" bit, but I'm not sure we fully > understood the potential application I am speaking of here. > Tempora mutantur, nos et mutamur in illis, or how the use of the ToS bits changed over the years ;-) > (Also, interestingly, speaking of IPv4, IPv6 doen't have a separate header > checksum; IIRC the thinking was it was duplicative of the end-end checksum. > That's why UDPv6 mandates the checksum. However, it seems to me that this, > plus the UDP checksum point, means that in a very critical area, IPv6 is > actually less suitable for media applications than IPv4. But I digress..) > > Is this really digressive? 
Interestingly, some people within the COMCAR project talked about IPv6 in newspaper interviews and emphasized its suitability for media applications in mobile networks =8-)

> > .. what would be a little flaw for a properly implemented line switching > > is reliably turned into a disaster, when you use packet switching. > > Now this raises another interesting point (although not the one you were > making, which I think I have answered above); which is that circuit switching > is potentially inherently better for transmission systems which have high > BER. > > One possible way this could be true is that the application will perform > poorly, and that damaged packets potentially waste bandwidth. However, I > think this can be mostly avoided through use of CRC's, etc, as discussed > above.

Yes. However, the current IPv6 discussion points in a different direction. And I can well think of projects which recommend IPv6 even for mobile networks. When I consider your points above, this deserves some thought. One question is whether we _can_ use IPv6 even for media transport in packet switched networks when your points are considered. The other is whether the implementations are actually _done_ that way.

> Yes, that makes for a slightly more complex design, but in this day > and age (we're not back in the age of 74-series TTL anymore) the extra > computing power to do a CRC is 'no big deal'. > > Another point might be the per-packet header overhead, but this is obviously > an old debate, one where the other advantages of the datagram model (such as > the lack of a setup delay) have carried the day - to the point that even > notionally circuit-based systems such as ATM have had datagram features, > along with per-packet headers... > >

And we both know about the long-term success of ATM :-)

> I think the original IPv4 design did a reasonable (not perfect, of course - > we had nothing like the experience we have now) job on these things. Perhaps > the understanding of those points was not spread as widely as it should have > been, though (through not being prominently discussed, as the other points of > the datagram design philosophy were). > > The system you worked on (which apparently used an all-or-nothing approach to > data integrity - could they even repair damage, or just detect it?) didn't do > that, but I don't think it proved it can't be done (any more than the Titanic > proved you can't make ships which don't kill passengers by the thousand :-). > >

The system I worked on is spilled milk. And it was not my job to change the IP implementation; I had to work above IP. In addition, I did not see all these details. I started the whole work with little, not to say no, knowledge of mobile networks. It's spilled milk. And perhaps I will learn to accept this one day.

Detlef

> Noel >

-- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937

From Jon.Crowcroft at cl.cam.ac.uk Mon Jul 23 22:27:01 2007 From: Jon.Crowcroft at cl.cam.ac.uk (Jon Crowcroft) Date: Tue, 24 Jul 2007 06:27:01 +0100 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling.
In-Reply-To: <20070723173420.4E058872D1@mercury.lcs.mit.edu> References: <20070723173420.4E058872D1@mercury.lcs.mit.edu> Message-ID:

there is now a burgeoning body of literature with results on network coding, and on generalized packet swarming techniques, that show that if you want to build a data network for today's dominant traffic sources, then you can do very well without any complex resource allocation in the sense of admission control, scheduling, OR routing, and just rely on the codes to do the right thing

such a network depends on packets, not circuits, fundamentally, but is very different from today's internet (as different as IP is from, say, ISDN)

however, right now, it is totally unobvious how such a network would carry a phone call:) it's also not at all obvious what an "end" is in such a network, so there's no point in discussing it on this list...

In missive <20070723173420.4E058872D1 at mercury.lcs.mit.edu>, Noel Chiappa typed: >>Now this raises another interesting point (although not the one you were >>making, which I think I have answered above); which is that circuit switching >>is potentially inherently better for transmission systems which have high >>BER.

p.s. in a wireless network, BER is probably not a terribly good metric for quality - actually the idea of a "link" is not terribly helpful either, which kind of makes the idea of a "hop" fairly redundant, which makes e2e v. hbh a sort of angels-on-the-head-of-a-pin type irrelevance too

such fun j.

From detlef.bosau at web.de Tue Jul 24 17:41:08 2007 From: detlef.bosau at web.de (Detlef Bosau) Date: Wed, 25 Jul 2007 02:41:08 +0200 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling. In-Reply-To: <200707232004.l6NK4QBt024839@boreas.isi.edu> References: <200707232004.l6NK4QBt024839@boreas.isi.edu> Message-ID: <46A69C24.7040109@web.de>

Dave Eckhardt wrote: >> One difficulty was, and during the last years I learned that this is >> perhaps the most basic reason why adaptation of multimedia documents in >> mobile networks is condemned to fail before it's even started, that >> there is no serious possibility to have a long-term or even medium-term >> prediction of a wireless channel's properties. >> > > As far as I can tell, this is indeed fiendishly difficult.

With particular respect to my own experience in 2000 to 2002: is it _difficult_? Or is it _hopeless_? One professor told me: "Why don't you do channel identification? That's a nice challenge!"

> A couple of > times people asked for my bit-level traces in order to fit some sort of > model to them, but nobody who did so was ever heard from again...

I know about Rayleigh channel models. And to my understanding, these models are simply necessary, e.g. to make UMTS work. However, this is a typical misconception between CS and EE. Rayleigh channel models do not attempt to do any kind of prediction or forecast. They attempt to identify the actual channel state. First of all, the temporal perspective is "the next timeslot", i.e. about 10 ms in UMTS and about 2 ms in HSDPA. Second, an estimation of the next timeslot may fail. So we have a problem for one timeslot - no longer. What I needed / was expected to do / however we want to call it, is prediction for a much longer period of time, e.g. 10 or 20 seconds. And I really think this is hopeless.

> This is one reason why my scheduling approach was essentially reactive rather > than predictive, and works without needing to measure error rates.

I think we cannot be "predictive".
We can, seriously speaking, only be reactive.

> It would be easy enough to plug in an oracle if one were available, of course. > >

:-)

That's another reason why I see a difference between data and media. An often used buzzword is "adaptivity". And when we talk about "adaptivity" in mobile networks, everybody tries to "adapt" applications etc. In 2000, I was pointed to approaches like Odyssey (Brian Noble) or the SMIL standard.

What I think now is that we should perhaps rather talk about robustness when we talk about media. (The adaptivity vs. robustness discussion is a stone-aged one, e.g. in electrical engineering and systems theory.) Of course, we can - and actually do - talk about adaptation for data flows. Actually, HSDPA does coding scheme adaptation on layer 2 in every time slot. However, the user's perception of data flows and media flows and the user's requirements are different.

And of course, there is such a weird thing as "QoS mapping" that attempts to find a correlation or relationship or whatever between (basically _informal_ and _non_-technical) requirements based upon the user's perception and (basically _formal_ and _technical_) specifications in networking.

Which bit error rate corresponds to "pleasant to look at"? Which transport delay jitter corresponds to "acceptable for a phone call"? And which average bit rate corresponds to "pleasant to listen to"? And how would Beethoven have answered this question in his youth? And how in his later life, when he was deaf?

>> But when I read your paper, I saw two TCP flows and one Audio flow and >> one Video flow. And then I saw something on throughput, which is necessarily >> comparing peaches and oranges in that case, because no one is interested in >> TCP throughput. One is typically interested in TCP _goodput_. And that has >> to take into account of course TCP retransmissions and can "slightly" >> differ from any kind of L2 throughput in faulty networks. >> > > I have never been a fan of the word "goodput". One layer's "goodput" is >

But isn't the goodput, particularly of a TCP flow, what a user perceives?

> just the "throughput" of the next layer up, after all--if the higher layer > is thrashing, your "goodput" isn't any good, and you have no way of knowing > that. Since there are pre-existing words for "effort" and "outcome", it > makes sense to me to use them. >

Provocative question: how many layers do we need? IIRC, in the Internet we typically think of four layers: subnet, network, transport, application. The OSI 7 layers are often too complex. When I look at mobile networks, GPRS, UMTS and the like, we again pile up layers.

I spent much of my time during the last years in discussions and in thinking about the layers in these networks, and I always have the question in mind: "Do we really need this layer?" Does an additional layer make a system clearer and simpler? Or does it only add complexity? A terrible example is the numerous "convergence layers" in mobile networks, where perhaps old and grey-haired engineers attempt to salvage the fruits of their labour: "80 years ago, when George V. was still king of the United Kingdom, we used something like X.25, and therefore we must have a convergence layer which abstracts the link layer to a generic link layer, and then a convergence layer which encapsulates IP and X.25 in a generic convergence layer which is then placed between X.25 and IP and the abstract convergence layer" etc. etc. etc.
Whenever I see these terrible architecture diagrams which pile up tons of layers and protocols, my hair turns as grey as that of these old engineers :-)

And each layer redefines its own meaning of effort and outcome. And with each layer you have another "QoS mapping". In the end, the lowest layer has no idea what the user originally intended to achieve.

Perhaps part of my problem eight years ago was that I never was a multimedia guy. But when I think of the whole pile of papers I read about layers and QoS mapping in multimedia systems, I'm much less a multimedia guy today than I was eight years ago.

I was confronted with strange "QoS profiles" and the like, and when you attempt to make adaptation decisions, as e.g. in SMIL, you deal exactly with those values - and I found it extremely difficult to maintain the relationship between these technical parameters and the most important questions in networking of all: What does the user want to do? What are the user's needs? What does the user perceive? What is acceptable to the user? And what, particularly in adaptation, is simply annoying? And I was never convinced of bothering an innocent user with slide bars and parameter tuning knobs and profiles etc. that he will never deal with. This is perhaps because I worked for many years at user help desks and in direct contact with users, and so I know from my own experience that users are simply overstrained by these knobs and slide bars and bells and whistles and don't know how to handle them - and so we have basically two classes of users: one class simply ignores this stuff, and the other plays around with it and does more harm than good.

> What we were trying to accomplish was conceptualizing the scheduling of > high-error wireless links in terms of effort-fair vs. outcome-fair, > arguing that a hybrid is frequently desirable, and demonstrating a basic > implementation. > > It's fine with me if you wish to argue that for data outcome should be > measured as "100%-correct packet bytes with latency below 250 ms" but > that for voice outcome should be measured in terms of "85%-correct > packet bytes with latency below 50 ms". And I wouldn't object if you >

From my COMCAR experience, I first of all miss the possibility to model / define / implement "85% correct"; see the discussion with Noel.

The second concern is that we still have to map this onto a user's perspective. For data it's easy: if you check your bank account via home banking, you obviously don't want to be cheated by faulty data. And if you edit a document which is stored on a file server, you don't want to corrupt it a little more each time you read and write it.

O.k., at second glance it's not as easy as it seems: if you download a new installation CD for your Linux installation, you perhaps want this download to complete within your remaining lifetime :-)

But what is acceptable to the user when it comes to media / multimedia systems / multimedia documents?

> wanted to argue that effort should be measured in watt-hz-seconds or > some other measure of how much spectrum resource is expended. > > But I believe that in a high-error environment it *does* make sense to > integrate scheduling of disparate flow types according to a tradeoff > between effort and outcome (and we were arguing for a particular model > very different from utility curves).
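To make the effort-fair / outcome-fair / tunable-hybrid distinction concrete, here is a small Python sketch. It is only a toy - it is not the scheduler from the paper under discussion - and the per-flow delivery probabilities, slot count and blending weight alpha are invented numbers; "effort" is counted in slots spent, "outcome" in frames actually delivered.

import random

def run(p, slots=100_000, alpha=0.5, seed=1):
    """Serve one frame per slot; flow i's frame survives with probability p[i].
    alpha = 0.0 -> pure effort-fair, alpha = 1.0 -> pure outcome-fair."""
    rng = random.Random(seed)
    n = len(p)
    effort = [0] * n       # slots spent per flow
    outcome = [0] * n      # frames delivered per flow

    def lead(i):
        # How far flow i is ahead of its fair share, blending both notions.
        return ((1 - alpha) * (effort[i] - sum(effort) / n)
                + alpha * (outcome[i] - sum(outcome) / n))

    for _ in range(slots):
        i = min(range(n), key=lead)        # serve the flow lagging the most
        effort[i] += 1
        if rng.random() < p[i]:
            outcome[i] += 1
    return effort, outcome

# Flow 0 on a good channel, flow 1 in a "dead spot" (assumed numbers).
for alpha in (0.0, 0.5, 1.0):
    print(alpha, *run([0.95, 0.40], alpha=alpha))
# alpha = 0.0: equal slots, very unequal delivered frames
# alpha = 1.0: equal delivered frames, the weak channel consumes most slots

The only point is that a single knob moves the allocation smoothly between the two notions of fairness discussed here; a real scheduler would of course also have to track channel state, deadlines and so on.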
> Note that a couple messages back my motivating example for cell phones > was that an operator may be able to very slightly degrade the voice quality > of some customers in order to "unfairly" boost the experience of another > customer in a "dead spot", and that this might keep the customer talking >

This is the well-known idea of "graceful degradation". Up to that point it sounds fine. In the next step, you define degradation paths. And from that point on it becomes fiendish. There are many technical concepts for how we can do this. Is any of them accepted by the users?

> instead of hanging up. No part of that example depends on TCP, "goodput", > persistent ARQ, etc. The key issue is the notion of fairness. > > I don't think we know "the story" on running voice over data-centric networks > versus running data over voice-centric networks or whether there is a neutral > ground. Last time I looked Real Audio was mostly running over TCP, not UDP... >

Interesting! But what's the reason? One reason is that nobody uses Real Audio conversationally. So the real reason is: data for Real Audio is downloaded to the user's site and then played back - from disk or memory. So we have data transport, not media streaming. O.k., sometimes the user is fooled with large buffers, preloading and pseudo-live streams. And depending on the network quality, the stuff frequently hangs - until it is eventually hung up on by the user.

(I don't know whether you are blessed with Bushisms via podcast in the U.S.; here in Germany the real patriot listens to the podcast speeches of Sancta Angela. But this is no contradiction to my remark that the podcast is finally hung up on by an annoyed listener.)

> let alone anything involving link-level options to deliver partially-mangled > packets. And initially GSM was kind of dubious for data because of the > voice-centric deep interleaving, right?

Admittedly, I don't know whether GSM was really that bad for data. I have read tons of scientific papers about this and how disastrous it was. Now, as I mentioned before, I worked as a user help desk guy for many years. And there were many users who didn't know that GSM would not work with data - so they used it and were fine. That's similar to the old story about the bumblebee: every engineering student is told that the bumblebee cannot fly. Only the bumblebee does not know this - and so it happily flies.

I know that there are tons of papers and even PhD theses which claim huge difficulties with GSM and data. Now, GSM provided a data rate comparable to old telephone modems, and the users worked with that successfully and without difficulties. In fact, I did user support, among others, for users who accessed their "Intranet" data via GSM or checked their mail via GSM from about 1995 on, and they did so for years. All the people I told about the papers I read from 2000 on, which described the huge difficulties with GSM and data, looked at me as if I had fallen to earth from another planet.

That's basically one reason why I have been looking for the actual difficulties with data and wireless networking for nearly eight years now - and as you correctly guess from the sentences above, I'm in no way convinced by many parts of the scientific literature I have read so far. I think a huge number of so-called scientific papers and even PhD theses on these topics are, drastically spoken, simply urban legends.
And I think - and I'm somewhat bitter about this - that we should be highly self-critical about our claims, and that we must not write tons of papers about spurious timeouts and loss differentiation problems etc. just in order to achieve "scientific honour" or a PhD or something like that - while the public ridicules our work, and some years later we get papers like the Hasenleitner paper which simply debunked the spurious timeout legend as pure nonsense. (And by the way: a look at Edge's original work would have done that as well. It's undergraduate-level knowledge that there is hardly anything as robust as Chebyshev's inequality when it comes to confidence intervals and the like. So I wonder why this topic was discussed at all.)

O.k. That was a digression.

> I think there are plenty of open > questions. > > But I haven't yet seen anything to convince me that the concepts of effort-fair > and outcome-fair don't make sense or that either one is better than a tunable > hybrid. > >

O.k. For me, it would be nice to narrow the discussion a bit. What I have in mind when I talk about this problem is in fact the multi-user diversity debate. There are lots of papers on this issue as well, and there was some hype about this topic during the last ten or twelve years. And there is still some hype around writing sophisticated schedulers which exploit multi-user diversity and adaptive channel coding and the like.

Perhaps I'm about to see another failure of my own work, and perhaps I'm going to be severely disappointed here as well.

Basically, there are two concerns.

First: we claim that we exploit multi-user diversity and by doing so increase spectral efficiency etc. Do we really? Or do we only hope so?

Second: what I have seen so far in the lower layers of actual mobile wireless networks is highly complex and sophisticated, and I have yet to understand most of the details. However, I wonder how terms like "rate" are interpreted differently by CS and EE people, and I wonder why flow control issues, which are typical end-to-end issues, are dealt with locally, and why so many techniques are integrated into the lower layers whose ramifications for the end-to-end behaviour of the system are not yet clear.

Therefore the question of whether we should pursue resource fairness or throughput fairness is primarily: which kind of fairness, if it is pursued on a wireless link, fits best into the current end-to-end Internet design? And what is the real purpose of that "fairness"? To my knowledge, the original idea introduced by Knopp and Humblet and further discussed by Tse et al. simply wants to exploit multi-user diversity, and for this purpose some systems introduce sophisticated schedulers into the downlink channel. - Do these systems really achieve what they set out to do? - Do these systems have ramifications on upper layers? - Do these systems maintain the intended end-to-end behaviour of protocols / applications etc.?

Or is the multi-user diversity debate, as it is conducted at the moment, just another hype?

That was my original intention.

Regards

Detlef

> Dave Eckhardt >

-- Detlef Bosau Mail: detlef.bosau at web.de Galileistrasse 30 Web: http://www.detlef-bosau.de 70565 Stuttgart Skype: detlef.bosau Mobile: +49 172 681 9937

From dpreed at reed.com Wed Jul 25 07:18:13 2007 From: dpreed at reed.com (David P. Reed) Date: Wed, 25 Jul 2007 10:18:13 -0400 Subject: [e2e] Ressource Fairness or Througput Fairness, was Re: Opportunistic Scheduling.
In-Reply-To: <46A69C24.7040109@web.de> References: <200707232004.l6NK4QBt024839@boreas.isi.edu> <46A69C24.7040109@web.de> Message-ID: <46A75BA5.6000603@reed.com>

Hear, hear, Detlef! Yours is indeed a very reasonable critique - that stability is assumed but never validated. Therefore, it is revolutionary.

I too have been sad to see that 99.99% of the papers in IEEE Trans on Networking focused on problems that are iatrogenic - problems that exist only because the physician is trained never to question the received wisdom of the "networking community" - that layers are good, so more layers are better; that FTP speed is the right measure, so we should optimize for FTP in static environments; that centralized control is good when we know everything in advance, so we have to protect centralized controllers from DOS attacks; etc. All of these are subject to critiques, and few are crazy enough to critique the assumptions, because they are taken as given and never questioned.

That said, a great Bob Dylan quote is: "to live outside the law you must be honest". What he meant was that a revolutionary (which includes revolutionaries in science and engineering) must be far more careful, far more self-critical, far more thoughtful than those who coast through life believing things are so because "everyone knows that".

QoS, for example, which you skewer, is one of those "everyone knows" things. Of course you can smooth out the unpredictability of the "channel" - just assume that you can, and demand that a lower layer do just that. You'll look brilliant. :-) And you can get hired by Verizon, because their PR says that they deserve a monopoly on cellular data from the government because they "assure" quality of service - and you will be the engineer who sounds like they really can turn toxic sludge into food. Just utter a few TLAs like QoS, etc.

I wish you luck - it's tough to build systems that work well in real worlds. It's a lot easier to build systems that work well if you can choose the assumptions, designing your own fantasy world of isolated radio links in free space, a network entirely owned by one company, applications restricted to FTP, etc.
From pganti at gmail.com Thu Jul 26 14:43:56 2007 From: pganti at gmail.com (Paddy Ganti) Date: Thu, 26 Jul 2007 17:43:56 -0400 Subject: [e2e] Analytic Model of Download Time as a Function of TCP Connect Time Message-ID: <2ff1f08a0707261443h120ad69fnd5df970f1d54d7d1@mail.gmail.com>

I am thinking of an approach to analytically determine the download time as a function of RTT, given a few initial real-world samples. Say I measured a web page from 4 locations around the globe. Knowing this sample, what can I infer about the population of download times as a function of RTT?

I assume that the download time (dt) can be expressed as follows: dt = n * RTT + c, where n is the number of round trips (RTT ping-pongs; each includes one burst of data, which can be multiple packets) and c is the server stall time between sending the data, or the server processing time, plus some random noise, all factored into one constant.

The above equation is of the form y = mx + c, and I can equate the slope with the number of round trips (which makes sense, as the smaller the number of round trips, the lower the response time), while x is the RTT. So if I take enough samples, say 10, and perform a regression analysis on those to generate the equation, wouldn't that characterize the population? If I have such an equation, I would then plug in various RTTs, and assuming the R-squared value is high, wouldn't that be representative of real performance?

A few initial measurements showed encouraging results, but a few measurements didn't converge, a few had negative values, etc. Before I go further and present this to an internal audience, I want to poll this group for any feedback/remarks/comments on using this method and its pitfalls.

-Paddy Ganti

From cottrell at slac.stanford.edu Thu Jul 26 16:28:08 2007 From: cottrell at slac.stanford.edu (Cottrell, Les) Date: Thu, 26 Jul 2007 16:28:08 -0700 Subject: [e2e] Analytic Model of Download Time as a Function of TCP ConnectTime In-Reply-To: <2ff1f08a0707261443h120ad69fnd5df970f1d54d7d1@mail.gmail.com> References: <2ff1f08a0707261443h120ad69fnd5df970f1d54d7d1@mail.gmail.com> Message-ID: <35C208A168A04B4EB99D1E13F2A4DB0102864409@exch-mail1.win.slac.stanford.edu>

This sounds a bit like an extension of ITU-T Rec. G.1040, "Network contribution to transaction time", which calculates the network contribution to transaction time. The contribution depends on the RTT, the loss probability (p), the Retransmission Time Out (RTO) and the number of round trips (n) involved in a transaction. The Network Contribution to Transaction Time (NCTT) is given as: Average(NCTT) = (n * RTT) + (p * n * RTO). In our case (PingER), typical values for n are 8; for RTO we take 2.5 seconds; the RTT and loss probability (p) are taken from the PingER measurements. The main difference is that you seem to ignore the losses.
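For what it is worth, here is a minimal Python sketch of the two models side by side: an ordinary least-squares fit of dt = n * RTT + c as proposed above, and the G.1040-style network contribution with the loss term as quoted. The sample points and the loss figure are invented; real (RTT, download time) measurements would go in their place.

def fit_line(samples):
    """Ordinary least squares for dt = n * RTT + c."""
    xs = [rtt for rtt, _ in samples]
    ys = [dt for _, dt in samples]
    k = len(samples)
    mx, my = sum(xs) / k, sum(ys) / k
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    n = sxy / sxx            # estimated number of round trips (slope)
    c = my - n * mx          # estimated server / noise constant (intercept)
    return n, c

# (RTT in seconds, measured download time in seconds) -- invented example values.
samples = [(0.020, 0.45), (0.080, 0.95), (0.150, 1.52), (0.240, 2.25)]
n, c = fit_line(samples)
print(f"estimated round trips n = {n:.1f}, constant c = {c:.2f} s")
print(f"predicted download time at RTT = 0.1 s: {n * 0.1 + c:.2f} s")

# ITU-T G.1040-style network contribution to transaction time, as above:
# NCTT = n * RTT + p * n * RTO, i.e. the same n * RTT term plus the
# expected cost of loss-triggered retransmission timeouts.
def nctt(n_round_trips, rtt, loss_p, rto=2.5):
    return n_round_trips * rtt + loss_p * n_round_trips * rto

print(f"NCTT for n = 8, RTT = 0.1 s, 1% loss: {nctt(8, 0.1, 0.01):.2f} s")

One pitfall is visible even in this toy: with only a handful of noisy samples the fitted intercept c can easily come out negative, which matches the negative values reported above.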