From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yumei Huang
Date: Tue, 30 Sep 2025 14:04:24 +0800
Subject: Re: [PATCH 2/2] tcp: Resend SYN for inbound connections
To: David Gibson, Stefano Brivio
CC: passt-dev@passt.top
References: <20250928072946.15284-1-yuhuang@redhat.com> <20250928072946.15284-3-yuhuang@redhat.com> <20250930002335.240e37cc@elisabeth>
List-Id: Development discussion and patches for passt

Thank you both for the comments!

On Tue, Sep 30, 2025 at 9:06 AM David Gibson wrote:
>
> On Tue, Sep 30, 2025 at 12:23:35AM +0200, Stefano Brivio wrote:
> > On Mon, 29 Sep 2025 16:25:39 +1000
> > David Gibson wrote:
> >
> > > On Sun, Sep 28, 2025 at 03:29:46PM +0800, Yumei Huang wrote:
> > > > If a client connects while guest is not connected or ready yet,
> > > > resend SYN instead of just resetting connection after SYN_TIMEOUT.
> > > >
> > > > Signed-off-by: Yumei Huang
> > >
> > > Simpler than I thought. Nice.
> > >
> > > However, I think now that we're retrying we probably want to adjust
> > > SYN_TIMEOUT. I suspect the 10s was a generous amount to mitigate the
> > > fact we didn't retry.
> >
> > That was the idea, yes.
> >
> > > However, AFAICT most OSes resend SYNs faster than that (after 1-3s
> > > initially).
> >
> > Right. Remember all the examples from Linux with one-second retries
> > I wanted to show? :)
> >
> > > They also typically slow down the
> > > resents on subsequent retries. I'm not sure if that last is important
> > > in our case - since we're talking directly to a guest, we're unlikely
> > > to flood the link this way.
> >
> > Exponential back-off (or whatever is used by other implementations)
> > doesn't only serve the purpose of avoiding to flood the link. It's also
> > about functionality itself.
> >
> > That is, if you waited one second, and you didn't get any reply, that's
> > a good indication that you might not get a reply in one second from
> > now, because the peer might need a little bit longer.
> >
> > > In fact, I haven't read closely enough to be sure, but there was some
> > > language in RFC 6298 and RFC 1122 that suggested to me maybe we should
> > > be using the same backoff calculation for SYN retries as for regular
> > > retransmits. Which as a bonus might simplify our logic a little bit.
> >
> > Somewhat surprisingly, RFC 9293 doesn't say anything about this. :(
>
> Right, it discusses RTO, and never explicitly talks about SYN resends,
> kind of implying that SYN resends should use the same RTO calculations
> as data retransmits.
>
> > And while I'm fairly sure that RFC 2988 was intended to only cover
> > *data* retransmissions, RFC 6298 (which updates it) seems to simply
> > assume, in section 5., that it's also about SYN segments:
> >
> >    (5.7) If the timer expires awaiting the ACK of a SYN segment and the
> >          TCP implementation is using an RTO less than 3 seconds, the RTO
> >          MUST be re-initialized to 3 seconds when data transmission
> >          begins (i.e., after the three-way handshake completes).
> >
> > so, yes, I tend to agree with this. Let's just use the same logic.
> >
> > Just note that it's an approximation of RFC 6298, in any case, because
> > we don't implement RTT measurements.
> >
> > It's a rather complicated implementation that I originally decided to
> > skip because there's no actual data transmission between us and
> > guest/container, so there isn't much that can go wrong. Maybe we could
> > even assume that the RTT is zero.
> >
> > As a result of that, we can't implement RFC 2988 / RFC 6298 exactly as
> > it is. But we can get quite close to it.
> >
> > > Documentation/networking/ip-sysctl.rst
> >
> > ...in the unlikely case it's not clear: David means a Linux kernel tree
> > here. Yumei, it might be a good idea to have a kernel tree (maybe
> > git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git) at
> > hand to check this kind of stuff.

I found this through Google:
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.rst
Having a repo at hand would be great too!

> >
> > > has some information on how
> > > Linux handles this (tcp_syn_retries and tcp_syn_linear_timeouts in
> > > particular). I guess we could configure ourselves to match the host's
> > > settings - we do something similar to determine what we consider
> > > ephemeral ports.
> > >
> > > Stefano, thoughts?
> >
> > Yes, I think reading those values from the host makes sense because the
> > same thing that would happen if the host connects to the guest or
> > container with another implementation (veth, or tap).
>
> I think taking tcp_syn_retries from the host makes sense. I'm a bit
> less sure about tcp_syn_linear_timeouts, since that requires
> implementing the more complex and specific linear backoff behaviour.
>
> > It's also the least surprising behaviour for any kind of application
> > that was previously running outside guest or containers and now it's
> > moved there.
>
> Well, probably the least surprising we can achieve. It's not possible
> for us to match the peer's SYN retry behaviour if it's not Linux or
> has different settings to the host.
>
> > We have just 3 bits here, so we can only try 9 times (retry 8 times),
> > but other than that we can use tcp_syn_retries and
> > tcp_syn_linear_timeouts as they are.
> >
> > Summing up, I would propose that we do something like this:
> >
> > 1. (optional, and might be in another series, but keeping it together
> >    with the rest might be more convenient): read tcp_syn_retries, limit
> >    to 8, and also read tcp_syn_linear_timeouts
> >
> > 2. always use one second as initial RTO (retransmission timeout). That's
> >    what the kernel does (even though rto_initial should be 3 seconds by
> >    default...? I'm not sure)
> >
> > 3. for SYN, implement the tcp_syn_linear_timeouts thing. That is,
> >    in tcp_timer_ctl(), use this timeout:
> >
> >      if (conn->retries < c->tcp_ctx.syn_linear_timeouts)
> >              [one second]
> >      else
> >              [1 << conn->retries]
>
> 1 << conn->retries, or 1 << (conn->retries - syn_linear_timeouts) ?

I think it should be the latter.

>
> > 4. do the same for data retransmissions, but without the
> >    tcp_syn_linear_timeouts thing, start from one second (see Appendix A
> >    in RFC 6298)... and maybe limit it to 60 seconds as allowed by the
> >    same RFC:
> >
> >       (2.5) A maximum value MAY be placed on RTO provided it is at least 60
> >             seconds.
> >
> >    ? Retransmitting data after 256 seconds doesn't make a lot of sense
> >    to me.
> >
> > It shouldn't be much more complicated than the current patch, it just
> > involves some extra changes in tcp_timer_ctl(), I guess.

I'm afraid I won't be able to finish it today. I will work on this after
I come back from holiday.

> >
> > --
> > Stefano
> >
>
> --
> David Gibson (he or they)      | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
>                                | around.
> http://www.ozlabs.org/~dgibson

-- 
Thanks,

Yumei Huang
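
For illustration only, the SYN timeout selection from item 3, using the second variant of the shift (1 << (retries - syn_linear_timeouts)), could be sketched as below. This is a rough sketch and not code from passt: the helper name syn_timeout_secs(), the two macros, and the choice to apply the 60-second cap from item 4 to SYNs as well are all assumptions made for the example.

```c
/* Illustrative sketch only: these names are hypothetical, not passt's */
#define RTO_INIT_SECS	1	/* initial timeout, as in item 2 of the thread */
#define RTO_MAX_SECS	60	/* assumed cap, as allowed by RFC 6298 (2.5) */

/* Timeout in seconds before the next SYN resend, given the number of
 * retries already done and the number of linear (fixed-interval) retries.
 */
static unsigned int syn_timeout_secs(unsigned int retries,
				     unsigned int linear_timeouts)
{
	unsigned int shift, timeout;

	/* Linear phase: keep resending at the initial interval */
	if (retries < linear_timeouts)
		return RTO_INIT_SECS;

	/* Exponential phase, counted from the end of the linear phase */
	shift = retries - linear_timeouts;
	if (shift > 6)		/* 1 << 6 is already above the cap */
		return RTO_MAX_SECS;

	timeout = RTO_INIT_SECS << shift;
	return timeout > RTO_MAX_SECS ? RTO_MAX_SECS : timeout;
}
```

With four linear retries, this gives one second for retries 0 to 4, then 2, 4 and 8 seconds for retries 5, 6 and 7. Using 1 << retries instead would jump from one second straight to 16 seconds at retry 4.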