Date: Tue, 30 Sep 2025 00:23:35 +0200
From: Stefano Brivio
To: David Gibson
CC: Yumei Huang, passt-dev@passt.top
Subject: Re: [PATCH 2/2] tcp: Resend SYN for inbound connections
Message-ID: <20250930002335.240e37cc@elisabeth>
References: <20250928072946.15284-1-yuhuang@redhat.com> <20250928072946.15284-3-yuhuang@redhat.com>
Organization: Red Hat

On Mon, 29 Sep 2025 16:25:39 +1000
David Gibson wrote:

> On Sun, Sep 28, 2025 at 03:29:46PM +0800, Yumei Huang wrote:
> > If a client connects while guest is not connected or ready yet,
> > resend SYN instead of just resetting connection after SYN_TIMEOUT.
> >
> > Signed-off-by: Yumei Huang
>
> Simpler than I thought. Nice.
>
> However, I think now that we're retrying we probably want to adjust
> SYN_TIMEOUT. I suspect the 10s was a generous amount to mitigate the
> fact we didn't retry.

That was the idea, yes.

> However, AFAICT most OSes resend SYNs faster than that (after 1-3s
> initially).

Right. Remember all the examples from Linux with one-second retries I
wanted to show? :)

> They also typically slow down the
> resents on subsequent retries. I'm not sure if that last is important
> in our case - since we're talking directly to a guest, we're unlikely
> to flood the link this way.

Exponential back-off (or whatever is used by other implementations)
doesn't only serve the purpose of avoiding flooding the link. It's also
about functionality itself.

That is, if you waited one second, and you didn't get any reply, that's
a good indication that you might not get a reply in one second from
now, because the peer might need a little bit longer.

> In fact, I haven't read closely enough to be sure, but there was some
> language in RFC 6298 and RFC 1122 that suggested to me maybe we should
> be using the same backoff calculation for SYN retries as for regular
> retransmits. Which as a bonus might simplify our logic a little bit.

Somewhat surprisingly, RFC 9293 doesn't say anything about this. :(

And while I'm fairly sure that RFC 2988 was intended to only cover
*data* retransmissions, RFC 6298 (which obsoletes it) seems to simply
assume, in section 5, that it's also about SYN segments:

   (5.7) If the timer expires awaiting the ACK of a SYN segment and the
         TCP implementation is using an RTO less than 3 seconds, the RTO
         MUST be re-initialized to 3 seconds when data transmission
         begins (i.e., after the three-way handshake completes).

so, yes, I tend to agree with this. Let's just use the same logic.

Just note that it's an approximation of RFC 6298, in any case, because
we don't implement RTT measurements. It's a rather complicated
implementation that I originally decided to skip because there's no
actual data transmission between us and guest/container, so there isn't
much that can go wrong. Maybe we could even assume that the RTT is zero.
As a result of that, we can't implement RFC 2988 / RFC 6298 exactly as
it is. But we can get quite close to it.

> Documentation/networking/ip-sysctl.rst

...in the unlikely case it's not clear: David means a Linux kernel tree
here. Yumei, it might be a good idea to have a kernel tree (maybe
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git) at
hand to check this kind of stuff.

> has some information on how
> Linux handles this (tcp_syn_retries and tcp_syn_linear_timeouts in
> particular). I guess we could configure ourselves to match the host's
> settings - we do something similar to determine what we consider
> ephemeral ports.
>
> Stefano, thoughts?

Yes, I think reading those values from the host makes sense because
that's the same thing that would happen if the host connects to the
guest or container with another implementation (veth, or tap).

It's also the least surprising behaviour for any kind of application
that was previously running outside guests or containers and is now
moved there.

We have just 3 bits here, so we can only try 9 times (retry 8 times),
but other than that we can use tcp_syn_retries and
tcp_syn_linear_timeouts as they are.

Summing up, I would propose that we do something like this:

1. (optional, and might be in another series, but keeping it together
   with the rest might be more convenient): read tcp_syn_retries, limit
   to 8, and also read tcp_syn_linear_timeouts

2. always use one second as initial RTO (retransmission timeout). That's
   what the kernel does (even though rto_initial should be 3 seconds by
   default...? I'm not sure)

3. for SYN, implement the tcp_syn_linear_timeouts thing. That is, in
   tcp_timer_ctl(), use this timeout:

     if (conn->retries < c->tcp_ctx.syn_linear_timeouts)
             [one second]
     else
             [1 << conn->retries]

4. do the same for data retransmissions, but without the
   tcp_syn_linear_timeouts thing, start from one second (see Appendix A
   in RFC 6298)...
   and maybe limit it to 60 seconds as allowed by the same RFC:

     (2.5) A maximum value MAY be placed on RTO provided it is at least
           60 seconds.

   ? Retransmitting data after 256 seconds doesn't make a lot of sense
   to me.

It shouldn't be much more complicated than the current patch, it just
involves some extra changes in tcp_timer_ctl(), I guess.

-- 
Stefano
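
For illustration only, the four steps proposed above could be sketched
roughly as below. This is not passt code. The names read_sysctl_int(),
syn_timeout_s(), data_timeout_s(), RTO_INIT_S and RTO_MAX_S are all
hypothetical, and the sketch follows the "1 << conn->retries"
expression from step 3 literally, so the timeout jumps straight from
one second to 1 << retries once the linear phase ends. Timeouts are in
seconds, and the clamping limit is left to the caller.

```c
#include <stdio.h>

#define RTO_INIT_S	1	/* assumed initial RTO, seconds (step 2) */
#define RTO_MAX_S	60	/* assumed cap for data retransmits (step 4) */

/* Hypothetical helper for step 1: read one integer from @path and clamp
 * it to [0, max]. Returns @fallback if the file can't be opened or parsed.
 */
static int read_sysctl_int(const char *path, int fallback, int max)
{
	FILE *f = fopen(path, "r");
	int v;

	if (!f)
		return fallback;

	if (fscanf(f, "%d", &v) != 1)
		v = fallback;
	fclose(f);

	if (v < 0)
		return 0;

	return v > max ? max : v;
}

/* Step 3: SYN timeout, constant for the first @linear_timeouts retries,
 * then 1 << retries as written in the proposal.
 */
static unsigned syn_timeout_s(unsigned retries, unsigned linear_timeouts)
{
	if (retries < linear_timeouts)
		return RTO_INIT_S;

	return 1U << retries;
}

/* Step 4: data timeout, doubling from one second and capped at
 * RTO_MAX_S. 1 << 5 is 32, 1 << 6 would already exceed the cap.
 */
static unsigned data_timeout_s(unsigned retries)
{
	if (retries > 5)
		return RTO_MAX_S;

	return (unsigned)RTO_INIT_S << retries;
}
```

On a Linux host, step 1 might then look like
read_sysctl_int("/proc/sys/net/ipv4/tcp_syn_retries", fallback, 8) and
the same call for /proc/sys/net/ipv4/tcp_syn_linear_timeouts, with the
fallback value being whatever the caller considers a sane default. With
four linear timeouts, syn_timeout_s() gives 1, 1, 1, 1, 16, 32 seconds
for retries 0 to 5, and data_timeout_s() never returns more than 60.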