Date: Tue, 4 Nov 2025 05:42:43 +0100
From: Stefano Brivio
To: Yumei Huang
CC: passt-dev@passt.top, david@gibson.dropbear.id.au
Subject: Re: [PATCH v7 5/5] tcp: Clamp the retry timeout
Message-ID: <20251104054243.6c18e6b8@elisabeth>
In-Reply-To: <20251031054242.7334-6-yuhuang@redhat.com>
References: <20251031054242.7334-1-yuhuang@redhat.com> <20251031054242.7334-6-yuhuang@redhat.com>
Organization: Red Hat
List-Id: Development discussion and patches for passt

On Fri, 31 Oct 2025 13:42:42 +0800
Yumei Huang wrote:

> Clamp the TCP retry timeout as Linux kernel does. If RTO is less
> than 3 seconds, re-initialize it to 3 seconds for data retransmissions
> according to RFC 6298.
>
> Suggested-by: Stefano Brivio
> Signed-off-by: Yumei Huang
> ---
>  tcp.c | 25 ++++++++++++++++++++-----
>  tcp.h |  2 ++
>  2 files changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/tcp.c b/tcp.c
> index 96ee56a..84a6700 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -187,6 +187,9 @@
>   * for established connections, or (tcp_syn_retries +
>   * tcp_syn_linear_timeouts) times during the handshake, reset the connection
>   *
> + * - RTO_INIT_ACK: if the RTO is less than this, re-initialize RTO to this for
> + * data retransmissions.
> + *
>   * - FIN_TIMEOUT: if a FIN segment was sent to tap/guest (flag ACK_FROM_TAP_DUE
>   * with TAP_FIN_SENT event), and no ACK is received within this time, reset
>   * the connection
> @@ -340,6 +343,7 @@ enum {
>
>  #define ACK_INTERVAL 10 /* ms */
>  #define RTO_INIT 1 /* s, RFC 6298 */
> +#define RTO_INIT_ACK 3 /* s, RFC 6298 */
>  #define FIN_TIMEOUT 60
>  #define ACT_TIMEOUT 7200
>
> @@ -365,9 +369,11 @@ uint8_t tcp_migrate_rcv_queue [TCP_MIGRATE_RCV_QUEUE_MAX];
>
>  #define TCP_SYN_RETRIES "/proc/sys/net/ipv4/tcp_syn_retries"
>  #define TCP_SYN_LINEAR_TIMEOUTS "/proc/sys/net/ipv4/tcp_syn_linear_timeouts"
> +#define TCP_RTO_MAX_MS "/proc/sys/net/ipv4/tcp_rto_max_ms"

Same as my previous comment on 3/5: you could skip "TCP_" in this name,
and:

>
>  #define TCP_SYN_RETRIES_DEFAULT 6
>  #define TCP_SYN_LINEAR_TIMEOUTS_DEFAULT 4
> +#define TCP_RTO_MAX_MS_DEFAULT 120000

here too, and:

>  /* "Extended" data (not stored in the flow table) for TCP flow migration */
>  static struct tcp_tap_transfer_ext migrate_ext[FLOW_MAX];
> @@ -585,10 +591,13 @@ static void tcp_timer_ctl(const struct ctx *c, struct tcp_tap_conn *conn)
>  	if (conn->flags & ACK_TO_TAP_DUE) {
>  		it.it_value.tv_nsec = (long)ACK_INTERVAL * 1000 * 1000;
>  	} else if (conn->flags & ACK_FROM_TAP_DUE) {
> -		int exp = conn->retries;
> +		int exp = conn->retries, timeout = RTO_INIT;
>  		if (!(conn->events & ESTABLISHED))
>  			exp -= c->tcp.syn_linear_timeouts;
> -		it.it_value.tv_sec = RTO_INIT << MAX(exp, 0);
> +		else
> +			timeout = MAX(timeout, RTO_INIT_ACK);
> +		timeout <<= MAX(exp, 0);
> +		it.it_value.tv_sec = MIN(timeout, c->tcp.tcp_rto_max);
>  	} else if (CONN_HAS(conn, SOCK_FIN_SENT | TAP_FIN_ACKED)) {
>  		it.it_value.tv_sec = FIN_TIMEOUT;
>  	} else {
> @@ -2785,18 +2794,24 @@ static socklen_t tcp_probe_tcp_info(void)
>   */
>  void tcp_get_rto_params(struct ctx *c)
>  {
> -	intmax_t tcp_syn_retries, syn_linear_timeouts;
> +	intmax_t tcp_syn_retries, syn_linear_timeouts, tcp_rto_max_ms;
>
>  	tcp_syn_retries = read_file_integer(
>  		TCP_SYN_RETRIES, TCP_SYN_RETRIES_DEFAULT);
>  	syn_linear_timeouts = read_file_integer(
>  		TCP_SYN_LINEAR_TIMEOUTS, TCP_SYN_LINEAR_TIMEOUTS_DEFAULT);
> +	tcp_rto_max_ms = read_file_integer(
> +		TCP_RTO_MAX_MS, TCP_RTO_MAX_MS_DEFAULT);
>
>  	c->tcp.tcp_syn_retries = MIN(tcp_syn_retries, UINT8_MAX);
>  	c->tcp.syn_linear_timeouts = MIN(syn_linear_timeouts, UINT8_MAX);
> +	c->tcp.tcp_rto_max = MIN(
> +		DIV_ROUND_CLOSEST(tcp_rto_max_ms, 1000), SIZE_MAX);
>
> -	debug("Read sysctl values tcp_syn_retries: %"PRIu8", linear_timeouts: %"PRIu8,
> -	      c->tcp.tcp_syn_retries, c->tcp.syn_linear_timeouts);
> +	debug("Read sysctl values tcp_syn_retries: %"PRIu8
> +	      ", linear_timeouts: %"PRIu8", tcp_rto_max: %zu",
> +	      c->tcp.tcp_syn_retries, c->tcp.syn_linear_timeouts,
> +	      c->tcp.tcp_rto_max);
>  }
>
>  /**
> diff --git a/tcp.h b/tcp.h
> index befedde..a238bb7 100644
> --- a/tcp.h
> +++ b/tcp.h
> @@ -59,6 +59,7 @@ union tcp_listen_epoll_ref {
>   * @fwd_out: Port forwarding configuration for outbound packets
>   * @timer_run: Timestamp of most recent timer run
>   * @pipe_size: Size of pipes for spliced connections
> + * @tcp_rto_max: Maximal retry timeout (in s)
>   * @tcp_syn_retries: SYN retries using exponential backoff timeout
>   * @syn_linear_timeouts: SYN retries before using exponential backoff timeout
>   */
> @@ -67,6 +68,7 @@ struct tcp_ctx {
>  	struct fwd_ports fwd_out;
>  	struct timespec timer_run;
>  	size_t pipe_size;
> +	size_t tcp_rto_max;

here too.

>  	uint8_t tcp_syn_retries;
>  	uint8_t syn_linear_timeouts;
>  };

I've finally finished reviewing. Apart from the pending comments, the
series looks good to me.

-- 
Stefano
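
For reference, the timeout selection in the quoted tcp_timer_ctl() hunk
can be sketched as a standalone function. This is only a minimal
illustration, not the passt code: rto_seconds() and its parameters are
hypothetical names, and it doubles the timeout in a loop instead of
using the single shift from the patch, so that a large exponent can't
overflow the int.

```c
#include <stdbool.h>

#define RTO_INIT	1	/* s, initial RTO, as in the patch */
#define RTO_INIT_ACK	3	/* s, floor for data retransmissions, as in the patch */

/* Hypothetical helper (not part of passt): retransmission timeout in seconds.
 *
 * retries:		retransmissions attempted so far
 * established:		true once the handshake has completed
 * syn_linear_timeouts:	handshake retries sent before exponential backoff starts
 * rto_max:		upper bound for the timeout, in seconds
 */
int rto_seconds(int retries, bool established, int syn_linear_timeouts,
		int rto_max)
{
	int exp = retries, timeout = RTO_INIT;

	if (!established)
		exp -= syn_linear_timeouts;	/* linear phase: no backoff yet */
	else if (timeout < RTO_INIT_ACK)
		timeout = RTO_INIT_ACK;		/* data: start from the floor */

	/* Double once per backoff step, stopping as soon as the bound is hit
	 * (assumption: used here instead of "timeout <<= exp" to avoid
	 * overflow for large exponents).
	 */
	while (exp-- > 0 && timeout < rto_max)
		timeout *= 2;

	return timeout < rto_max ? timeout : rto_max;
}
```

With rto_max set to 120 (the value TCP_RTO_MAX_MS_DEFAULT gives after
conversion to seconds), an established connection would wait 3, 6, 12,
24, 48, 96 and then 120 seconds across successive retransmissions.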