public inbox for passt-dev@passt.top
From: Yumei Huang <yuhuang@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>,
	Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 2/2] tcp: Resend SYN for inbound connections
Date: Tue, 30 Sep 2025 14:04:24 +0800
Message-ID: <CANsz47=vdhxV00sQ-SyM3T1-LHgxLv-PyxjXQJwZV80v4m8C+g@mail.gmail.com>
In-Reply-To: <aNss3NYaMIOD1qDv@zatzit>

Thank you both for the comments!

On Tue, Sep 30, 2025 at 9:06 AM David Gibson
<david@gibson.dropbear.id.au> wrote:
>
> On Tue, Sep 30, 2025 at 12:23:35AM +0200, Stefano Brivio wrote:
> > On Mon, 29 Sep 2025 16:25:39 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Sun, Sep 28, 2025 at 03:29:46PM +0800, Yumei Huang wrote:
> > > > If a client connects while guest is not connected or ready yet,
> > > > resend SYN instead of just resetting connection after SYN_TIMEOUT.
> > > >
> > > > Signed-off-by: Yumei Huang <yuhuang@redhat.com>
> > >
> > > Simpler than I thought.  Nice.
> > >
> > > However, I think now that we're retrying we probably want to adjust
> > > SYN_TIMEOUT.  I suspect the 10s was a generous amount to mitigate the
> > > fact we didn't retry.
> >
> > That was the idea, yes.
> >
> > > However, AFAICT most OSes resend SYNs faster than that (after 1-3s
> > > initially).
> >
> > Right. Remember all the examples from Linux with one-second retries
> > I wanted to show? :)
> >
> > > They also typically slow down the
> > > resends on subsequent retries.  I'm not sure if that last is important
> > > in our case - since we're talking directly to a guest, we're unlikely
> > > to flood the link this way.
> >
> > Exponential back-off (or whatever is used by other implementations)
> > doesn't only serve the purpose of avoiding flooding the link. It's also
> > about functionality itself.
> >
> > That is, if you waited one second, and you didn't get any reply, that's
> > a good indication that you might not get a reply in one second from
> > now, because the peer might need a little bit longer.
> >
> > > In fact, I haven't read closely enough to be sure, but there was some
> > > language in RFC 6298 and RFC 1122 that suggested to me maybe we should
> > > be using the same backoff calculation for SYN retries as for regular
> > > retransmits.  Which as a bonus might simplify our logic a little bit.
> >
> > Somewhat surprisingly, RFC 9293 doesn't say anything about this. :(
>
> Right, it discusses RTO, and never explicitly talks about SYN resends,
> kind of implying that SYN resends should use the same RTO calculations
> as data retransmits.
>
> > And while I'm fairly sure that RFC 2988 was intended to only cover
> > *data* retransmissions, RFC 6298 (which updates it) seems to simply
> > assume, in section 5., that it's also about SYN segments:
> >
> >    (5.7) If the timer expires awaiting the ACK of a SYN segment and the
> >          TCP implementation is using an RTO less than 3 seconds, the RTO
> >          MUST be re-initialized to 3 seconds when data transmission
> >          begins (i.e., after the three-way handshake completes).
> >
> > so, yes, I tend to agree with this. Let's just use the same logic.
> >
> > Just note that it's an approximation of RFC 6298, in any case, because
> > we don't implement RTT measurements.
> >
> > It's a rather complicated implementation that I originally decided to
> > skip because there's no actual data transmission between us and
> > guest/container, so there isn't much that can go wrong. Maybe we could
> > even assume that the RTT is zero.
> >
> > As a result of that, we can't implement RFC 2988 / RFC 6298 exactly as
> > it is. But we can get quite close to it.
> >
> > > Documentation/networking/ip-sysctl.rst
> >
> > ...in the unlikely case it's not clear: David means a Linux kernel tree
> > here. Yumei, it might be a good idea to have a kernel tree (maybe
> > git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git) at
> > hand to check this kind of stuff.

I found this through Google,
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.rst.
Having a repo at hand would be great too!

> >
> > > has some information on how
> > > Linux handles this (tcp_syn_retries and tcp_syn_linear_timeouts in
> > > particular).  I guess we could configure ourselves to match the host's
> > > settings - we do something similar to determine what we consider
> > > ephemeral ports.
> > >
> > > Stefano, thoughts?
> >
> > Yes, I think reading those values from the host makes sense because
> > that's the same thing that would happen if the host connected to the
> > guest or container with another implementation (veth, or tap).
>
> I think taking tcp_syn_retries from the host makes sense.  I'm a bit
> less sure about tcp_syn_linear_timeouts, since that requires
> implementing the more complex and specific linear backoff behaviour.
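
For the tcp_syn_retries part, here is a minimal sketch of reading the value
once at startup. It is only an illustration, not code from the patch:
read_host_uint() and host_syn_retries() are hypothetical helpers, the
fallback of 6 is an assumption based on the default documented in
ip-sysctl.rst, and the path is the usual procfs location of this sysctl.

```c
#include <stdio.h>

/* Assumed procfs location of the host setting */
#define SYN_RETRIES_PATH	"/proc/sys/net/ipv4/tcp_syn_retries"
#define SYN_RETRIES_DEFAULT	6	/* assumption: documented Linux default */
#define SYN_RETRIES_MAX		8	/* limit suggested in this thread */

/* Hypothetical helper: read one unsigned integer from @path, return
 * @fallback if the file can't be opened or parsed, clamp result to @max.
 */
unsigned int read_host_uint(const char *path, unsigned int fallback,
			    unsigned int max)
{
	unsigned int v = fallback;
	FILE *f = fopen(path, "r");

	if (!f)
		return fallback > max ? max : fallback;

	if (fscanf(f, "%u", &v) != 1)
		v = fallback;

	fclose(f);

	return v > max ? max : v;
}

/* Hypothetical wrapper for the SYN retry count */
unsigned int host_syn_retries(void)
{
	return read_host_uint(SYN_RETRIES_PATH, SYN_RETRIES_DEFAULT,
			      SYN_RETRIES_MAX);
}
```

tcp_syn_linear_timeouts could be read the same way if we decide to
implement the linear part.
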
>
> > It's also the least surprising behaviour for any kind of application
> > that was previously running outside a guest or container and has now
> > been moved there.
>
> Well, probably the least surprising we can achieve.  It's not possible
> for us to match the peer's SYN retry behaviour if it's not Linux or
> has different settings to the host.
>
> > We have just 3 bits here, so we can only try 9 times (retry 8 times),
> > but other than that we can use tcp_syn_retries and
> > tcp_syn_linear_timeouts as they are.
> >
> > Summing up, I would propose that we do something like this:
> >
> > 1. (optional, and might be in another series, but keeping it together
> >    with the rest might be more convenient): read tcp_syn_retries, limit
> >    to 8, and also read tcp_syn_linear_timeouts
> >
> > 2. always use one second as initial RTO (retransmission timeout). That's
> >    what the kernel does (even though rto_initial should be 3 seconds by
> >    default...? I'm not sure)
> >
> > 3. for SYN, implement the tcp_syn_linear_timeouts thing. That is,
> >    in tcp_timer_ctl(), use this timeout:
> >
> >       if (conn->retries < c->tcp_ctx.syn_linear_timeouts)
> >               [one second]
> >       else
> >               [1 << conn->retries]
>
> 1 << conn->retries, or 1 << (conn->retries - syn_linear_timeouts) ?

I think it should be the latter.
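
Otherwise the timeout jumps as soon as the linear phase ends: with
syn_linear_timeouts = 4, "1 << conn->retries" would go from 1 second
straight to 16 seconds at retries == 4, while subtracting
syn_linear_timeouts continues with 1, 2, 4, ... seconds. A rough sketch of
what I have in mind, where syn_timeout_s() is a hypothetical helper (not an
existing function) and the shift clamp is my own assumption to keep the
shift well-defined:

```c
/* Hypothetical helper: SYN retransmission timeout, in seconds.
 * @retries:		number of SYN retries done so far
 * @linear_timeouts:	value of tcp_syn_linear_timeouts read from the host
 */
unsigned int syn_timeout_s(unsigned int retries, unsigned int linear_timeouts)
{
	unsigned int shift;

	/* Linear phase: keep retrying with the initial one-second RTO */
	if (retries < linear_timeouts)
		return 1;

	/* Exponential phase: back-off starts from one second again */
	shift = retries - linear_timeouts;
	if (shift > 30)		/* assumption: clamp to avoid overflow */
		shift = 30;

	return 1U << shift;
}
```

In tcp_timer_ctl() this would be called with conn->retries and
c->tcp_ctx.syn_linear_timeouts, as in the pseudocode above.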

>
> > 4. do the same for data retransmissions, but without the
> >    tcp_syn_linear_timeouts thing, start from one second (see Appendix A
> >    in RFC 6298)... and maybe limit it to 60 seconds as allowed by the
> >    same RFC:
> >
> >    (2.5) A maximum value MAY be placed on RTO provided it is at least 60
> >          seconds.
> >
> >    ? Retransmitting data after 256 seconds doesn't make a lot of sense
> >    to me.
> >
> > It shouldn't be much more complicated than the current patch, it just
> > involves some extra changes in tcp_timer_ctl(), I guess.
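
For item 4, this is roughly how I understand it: plain exponential back-off
from one second, capped at 60 seconds. It's only a sketch under those
assumptions. data_rto_s(), RTO_INIT_S and RTO_MAX_S are hypothetical names,
not existing code.

```c
#define RTO_INIT_S	1	/* initial RTO, in seconds */
#define RTO_MAX_S	60	/* upper bound, see RFC 6298, (2.5) */

/* Hypothetical helper: RTO for data retransmissions, in seconds.
 * @retries:	number of retransmissions done so far
 */
unsigned int data_rto_s(unsigned int retries)
{
	/* 1 << 6 is already 64 seconds, so anything from six retries on
	 * hits the cap. Checking first also keeps the shift small.
	 */
	if (retries >= 6)
		return RTO_MAX_S;

	return RTO_INIT_S << retries;
}
```

That gives 1, 2, 4, 8, 16, 32 seconds, and then 60 seconds for every
further retransmission.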

I'm afraid I won't be able to finish it today. Will work on this after
I come back from holiday.

> >
> > --
> > Stefano
> >
>
> --
> David Gibson (he or they)       | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
>                                 | around.
> http://www.ozlabs.org/~dgibson


-- 
Thanks,

Yumei Huang

