[e2e] [tcpm] Question on RFC6298, Managing the RTO Timer and additional lost pakets in Recovery state

Thu Mar 10 05:22:04 PST 2016

Hi Yuchung, thanks for the help

I seem to have gotten the RFC6937 (PRR) behavior in place.  Currently I don't see much gain with PRR; one possible reason is that AQMs in LTE are typically a bit on the "bufferbloated" side, as too low drop thresholds easily cause links to be underutilized. The effect of this is that when a loss event occurs, there will be enough data in the RLC queue to transmit even though the congestion window is cut in half immediately. There could still be benefits with PRR in terms of fewer RTOs.
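
For reference, the per-ACK computation in RFC6937 can be sketched roughly as below. This is a minimal sketch based on the pseudocode in the RFC, counting in whole segments (so MSS = 1); the names PrrState, prr_sndcnt and on_sent are made up for illustration, the final clamp to zero is an added safeguard, and nothing here is taken from the Linux source.

```python
from dataclasses import dataclass


@dataclass
class PrrState:
    ssthresh: int           # target window after the reduction, in segments
    recover_fs: int         # FlightSize when recovery started (RecoverFS)
    prr_delivered: int = 0  # segments delivered to the receiver in recovery
    prr_out: int = 0        # segments sent by the sender in recovery


def prr_sndcnt(state, delivered, pipe, conservative=False):
    """Return how many segments may be sent in response to one ACK.

    delivered: segments newly ACKed or SACKed by this ACK (DeliveredData)
    pipe:      current estimate of segments in flight
    """
    state.prr_delivered += delivered
    if pipe > state.ssthresh:
        # Proportional part:
        # CEIL(prr_delivered * ssthresh / RecoverFS) - prr_out
        num = state.prr_delivered * state.ssthresh
        target = (num + state.recover_fs - 1) // state.recover_fs
        sndcnt = target - state.prr_out
    else:
        # pipe has fallen to or below ssthresh: apply a reduction bound
        if conservative:  # PRR-CRB
            limit = state.prr_delivered - state.prr_out
        else:             # PRR-SSRB, the "+ 1" is one MSS
            limit = max(state.prr_delivered - state.prr_out, delivered) + 1
        sndcnt = min(state.ssthresh - pipe, limit)
    return max(sndcnt, 0)  # assumption: never return a negative count


def on_sent(state, segments):
    """Account for data (re)transmitted during recovery."""
    state.prr_out += segments
```

With ssthresh = 10 and RecoverFS = 20, and one segment delivered per ACK while pipe stays above ssthresh, this allows one transmission on every second ACK, i.e. the window is reduced gradually over the recovery rather than in one step.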

I believe then that I am getting closer to a good Linux TCP model in our simulator.
I still have one particular behavior that I don't really understand, but I am almost 100% sure that the error is on my side.

Thanks for the ref to the sigcomm paper, I seem to have missed it.

/Ingemar

> -----Original Message-----
> From: Yuchung Cheng [mailto:ycheng at google.com]
> Sent: den 5 mars 2016 16:53
> To: Ingemar Johansson S
> Cc: mallman at icir.org; Michael Welzl; tcpm at ietf.org; end2end-
> interest at postel.org
> Subject: Re: [tcpm] Question on RFC6298, Managing the RTO Timer and
> additional lost pakets in Recovery state
> 
> Linux implements RFC6937 not RFC6675 to adjust cwnd in fast recovery.
> Specifically it reduces cwnd gradually toward ssthresh as packets are being
> delivered. If inflight, aka pipe, drops below ssthresh, it tries to slow start
> toward ssthresh, provided no additional packets are lost. The last condition
> was added recently and I had a presentation last meeting:
> https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-7.pdf
> 
> btw, Linux may adjust RTO by taking RTT samples from newly SACKed blocks,
> which is not standardized. It mitigates issues when RTT continues to rise
> during recovery in LTE networks (see figure 15 in
> http://web.eecs.umich.edu/~zmao/Papers/lte_sigcomm13.pdf)
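
To illustrate why extra RTT samples during recovery matter, here is a minimal sketch of the RTO computation from RFC6298 (section 2). The class name RtoEstimator and its parameters are made up for illustration; which ACKs produce samples (cumulative ACKs only, or also newly SACKed blocks as described above) is left to the caller and is an assumption of this sketch, not something taken from the Linux source.

```python
class RtoEstimator:
    ALPHA = 1 / 8  # gain for SRTT
    BETA = 1 / 4   # gain for RTTVAR
    K = 4

    def __init__(self, granularity=0.001, min_rto=1.0):
        self.srtt = None
        self.rttvar = None
        self.granularity = granularity  # clock granularity G, in seconds
        self.min_rto = min_rto          # RFC6298 lower bound of 1 second
        self.rto = 1.0                  # initial RTO before any sample

    def on_rtt_sample(self, r):
        """Feed one RTT measurement r (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR must be updated using the old SRTT
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - r))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        self.rto = max(self.min_rto,
                       self.srtt + max(self.granularity, self.K * self.rttvar))
        return self.rto
```

If the RTT keeps growing while the sender is in recovery and no samples are fed in, the RTO stays at its pre-recovery value and may fire spuriously; feeding samples from SACKed segments lets srtt and rttvar follow the increase.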
> 
> On Sat, Mar 5, 2016 at 7:18 AM, Ingemar Johansson S
> <ingemar.s.johansson at ericsson.com> wrote:
> >
> > Hi
> >
> > Thanks for the response, and thank Michael as well, guess I need to read RFC5681 and RFC6675 again.
> >
> > The line of reasoning seen from an application perspective actually helps to put the puzzle together for me.
> > Also I understand now that case 2 below necessitates an RTO, at least with TCP. I guess QUIC may be different in this respect, as its retransmitted segments have a new transport sequence number?
> >
> > /Ingemar
> >
> > > -----Original Message-----
> > > From: mallman at icir.org [mailto:mallman at icir.org]
> > > Sent: den 3 mars 2016 15:32
> > > To: Ingemar Johansson S
> > > Cc: tcpm at ietf.org; end2end-interest at postel.org
> > > Subject: Re: [tcpm] Question on RFC6298, Managing the RTO Timer and
> > > additional lost pakets in Recovery state
> > >
> > >
> > > > It says quote “(5.3) When an ACK is received that acknowledges new
> > > > data, restart the retransmission timer so that it will expire
> > > > after RTO seconds (for the current value of RTO).”
> > > >
> > > > > What is the definition of new data? The strict interpretation is
> > > > > when SND.UNA advances, but it can also be that the highest SACKed
> > > > > sequence number increases. In the former case it is more likely
> > > > > that an RTO happens.
> > >
> > > Seems like something we should have nailed down in the spec at some
> > > point after SACK became widely prevalent.  Alas.
> > >
> > > I think "new data" can be interpreted as "cumulative ACK advances".
> > >
> > > The spirit of (5.3) is that as long as the connection is making
> > > progress---from an application perspective---we can keep the RTO at
> > > arms length and so we just keep re-arming it.  But, once we have a
> > > stall---or even an indication that we might stall---because a packet
> > > has been lost then we stop pushing the RTO off.
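
The two readings of "new data" in (5.3) discussed above can be put side by side in a small sketch. The type AckInfo and the function should_restart_rto are hypothetical names used only for illustration; they do not come from any RFC or TCP stack.

```python
from dataclasses import dataclass


@dataclass
class AckInfo:
    snd_una: int         # cumulative ACK point after processing this ACK
    highest_sacked: int  # highest sequence number covered by a SACK block


def should_restart_rto(prev, cur, count_sack_progress=False):
    """Decide whether an incoming ACK re-arms the retransmission timer.

    Strict reading: only an advance of the cumulative ACK counts.
    Loose reading (count_sack_progress=True): an advance of the highest
    SACKed sequence number also counts as "new data".
    """
    if cur.snd_una > prev.snd_una:
        return True
    return count_sack_progress and cur.highest_sacked > prev.highest_sacked
```

Under the strict reading, an ACK that only adds SACK blocks while the first hole is still open does not push the timer off, so the RTO expires earlier than under the loose reading.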
> > >
> > > > > The second question is Linux related. Given that a lost packet
> > > > > puts the stack in Recovery state, the congestion window is reduced
> > > > > one step as an effect of this. What happens if additional packets
> > > > > are lost when in Recovery state? I guess the congestion window
> > > > > should decrease further?
> > >
> > > > First, this is a more generic answer, I have no idea what Linux does.
> > >
> > > > I can't tell which of two cases you are talking about here.  Let's
> > > > say you send 20 packets into the network in some window.  Now, the
> > > > cases ...
> > >
> > > (1) We lose packets 1, 5, 13 and 17.  I.e., multiple packets are
> > >     lost from a single transmission window.  So, retransmitting
> > >     packet 1 puts us in recovery and causes congestion control
> > >     action.  I believe that the fact that packets 5, 13 and 17 are
> > >     also lost does not mean we should react to congestion again.
> > >     E.g., RFC 6675 calls for a single CC response regardless of how
> > >     many packets are lost from a window of data.
> > >
> > > (2) We lose packets 1, 5, 13 and 17 and also the retransmit of
> > >     packet 17.  So, we lose 4 packets from the first single
> > >     transmission window.  This triggers one CC response.  But, the
> > >     retransmit of packet 17 is from a subsequent transmission
> > >     window, indicating that perhaps we haven't yet done enough to
> > >     relieve the congestion.  Conservativeness would likely suggest
> > >     that in this case, yes, we should take another CC action.
> > >
> > >     And, e.g., RFC 6675 forces this second CC action by being unable
> > >     to cope with lost retransmissions.  Rather, in this case we fall
> > >     back to the RTO which means another CC response.  I am not
> > >     claiming RFC 6675 is the right approach here.  Just noting what
> > >     some spec does.  We left it this way because we didn't feel that
> > >     the complexity of dealing with this case was really generally
> > >     worth it.  But, one could envision a different algorithm making
> > >     a different choice.
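
The two cases above can be illustrated with a toy model that only counts congestion control responses. This is a minimal sketch under simplifying assumptions: the class ToySender is made up, cwnd stands in for FlightSize when halving, windows are counted in packets, and the recovery point follows the idea in RFC 6675 of remembering the highest packet sent when recovery started.

```python
class ToySender:
    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.ssthresh = None
        self.recovery_point = None  # highest packet sent when recovery began
        self.responses = 0          # number of congestion control responses

    def on_loss_detected(self, seq, high_data):
        """A loss inferred from duplicate ACKs / SACK information."""
        if self.recovery_point is not None and seq <= self.recovery_point:
            return  # same window of data: no additional response
        self.ssthresh = max(self.cwnd // 2, 2)
        self.cwnd = self.ssthresh
        self.recovery_point = high_data
        self.responses += 1

    def on_rto(self):
        """Retransmission timer expired, e.g. after a lost retransmission."""
        self.ssthresh = max(self.cwnd // 2, 2)
        self.cwnd = 1  # restart from one packet
        self.recovery_point = None
        self.responses += 1


sender = ToySender(cwnd=20)
for lost in (1, 5, 13, 17):        # case (1): four losses in one window
    sender.on_loss_detected(lost, high_data=20)
case1_responses = sender.responses  # a single response
sender.on_rto()                     # case (2): retransmit of 17 is lost too
case2_responses = sender.responses  # the RTO adds a second response
```

In case (1) the window is halved once no matter how many of the 20 packets are lost; in case (2) the lost retransmission can only be repaired by the RTO in this model, which brings the second response.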
> > >
> > > I hope that helps!
> > >
> > > allman
> > >
> > >
> > > --
> > > http://www.icir.org/mallman/
> > >
> > >
> >