From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yumei Huang
To: passt-dev@passt.top, sbrivio@redhat.com
CC: david@gibson.dropbear.id.au, yuhuang@redhat.com
Subject: [PATCH v7 5/5] tcp: Clamp the retry timeout
Date: Fri, 31 Oct 2025 13:42:42 +0800
Message-ID: <20251031054242.7334-6-yuhuang@redhat.com>
In-Reply-To: <20251031054242.7334-1-yuhuang@redhat.com>
References: <20251031054242.7334-1-yuhuang@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="US-ASCII"; x-default=true
List-Id: Development discussion and patches for passt

Clamp the TCP retry timeout to the maximum given by the tcp_rto_max_ms
sysctl (120000 ms if it can't be read), as the Linux kernel does. For
data retransmissions, if the RTO is less than 3 seconds, re-initialize
it to 3 seconds, according to RFC 6298.

Suggested-by: Stefano Brivio
Signed-off-by: Yumei Huang
---
 tcp.c | 25 ++++++++++++++++++++-----
 tcp.h |  2 ++
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/tcp.c b/tcp.c
index 96ee56a..84a6700 100644
--- a/tcp.c
+++ b/tcp.c
@@ -187,6 +187,9 @@
  *   for established connections, or (tcp_syn_retries +
  *   tcp_syn_linear_timeouts) times during the handshake, reset the connection
  *
+ * - RTO_INIT_ACK: if the RTO is less than this, re-initialize RTO to this for
+ *   data retransmissions.
+ *
  * - FIN_TIMEOUT: if a FIN segment was sent to tap/guest (flag ACK_FROM_TAP_DUE
  *   with TAP_FIN_SENT event), and no ACK is received within this time, reset
  *   the connection
@@ -340,6 +343,7 @@ enum {
 
 #define ACK_INTERVAL			10		/* ms */
 #define RTO_INIT			1		/* s, RFC 6298 */
+#define RTO_INIT_ACK			3		/* s, RFC 6298 */
 #define FIN_TIMEOUT			60
 #define ACT_TIMEOUT			7200
 
@@ -365,9 +369,11 @@ uint8_t tcp_migrate_rcv_queue		[TCP_MIGRATE_RCV_QUEUE_MAX];
 
 #define TCP_SYN_RETRIES		"/proc/sys/net/ipv4/tcp_syn_retries"
 #define TCP_SYN_LINEAR_TIMEOUTS	"/proc/sys/net/ipv4/tcp_syn_linear_timeouts"
+#define TCP_RTO_MAX_MS		"/proc/sys/net/ipv4/tcp_rto_max_ms"
 
 #define TCP_SYN_RETRIES_DEFAULT		6
 #define TCP_SYN_LINEAR_TIMEOUTS_DEFAULT	4
+#define TCP_RTO_MAX_MS_DEFAULT		120000
 
 /* "Extended" data (not stored in the flow table) for TCP flow migration */
 static struct tcp_tap_transfer_ext migrate_ext[FLOW_MAX];
@@ -585,10 +591,13 @@ static void tcp_timer_ctl(const struct ctx *c, struct tcp_tap_conn *conn)
 	if (conn->flags & ACK_TO_TAP_DUE) {
 		it.it_value.tv_nsec = (long)ACK_INTERVAL * 1000 * 1000;
 	} else if (conn->flags & ACK_FROM_TAP_DUE) {
-		int exp = conn->retries;
+		int exp = conn->retries, timeout = RTO_INIT;
 		if (!(conn->events & ESTABLISHED))
 			exp -= c->tcp.syn_linear_timeouts;
-		it.it_value.tv_sec = RTO_INIT << MAX(exp, 0);
+		else
+			timeout = MAX(timeout, RTO_INIT_ACK);
+		timeout <<= MAX(exp, 0);
+		it.it_value.tv_sec = MIN(timeout, c->tcp.tcp_rto_max);
 	} else if (CONN_HAS(conn, SOCK_FIN_SENT | TAP_FIN_ACKED)) {
 		it.it_value.tv_sec = FIN_TIMEOUT;
 	} else {
@@ -2785,18 +2794,24 @@ static socklen_t tcp_probe_tcp_info(void)
  */
 void tcp_get_rto_params(struct ctx *c)
 {
-	intmax_t tcp_syn_retries, syn_linear_timeouts;
+	intmax_t tcp_syn_retries, syn_linear_timeouts, tcp_rto_max_ms;
 
 	tcp_syn_retries = read_file_integer(
 		TCP_SYN_RETRIES, TCP_SYN_RETRIES_DEFAULT);
 	syn_linear_timeouts = read_file_integer(
 		TCP_SYN_LINEAR_TIMEOUTS, TCP_SYN_LINEAR_TIMEOUTS_DEFAULT);
+	tcp_rto_max_ms = read_file_integer(
+		TCP_RTO_MAX_MS, TCP_RTO_MAX_MS_DEFAULT);
 
 	c->tcp.tcp_syn_retries = MIN(tcp_syn_retries, UINT8_MAX);
 	c->tcp.syn_linear_timeouts = MIN(syn_linear_timeouts, UINT8_MAX);
+	c->tcp.tcp_rto_max = MIN(
+		DIV_ROUND_CLOSEST(tcp_rto_max_ms, 1000), SIZE_MAX);
 
-	debug("Read sysctl values tcp_syn_retries: %"PRIu8", linear_timeouts: %"PRIu8,
-	      c->tcp.tcp_syn_retries, c->tcp.syn_linear_timeouts);
+	debug("Read sysctl values tcp_syn_retries: %"PRIu8
+	      ", linear_timeouts: %"PRIu8", tcp_rto_max: %zu",
+	      c->tcp.tcp_syn_retries, c->tcp.syn_linear_timeouts,
+	      c->tcp.tcp_rto_max);
 }
 
 /**
diff --git a/tcp.h b/tcp.h
index befedde..a238bb7 100644
--- a/tcp.h
+++ b/tcp.h
@@ -59,6 +59,7 @@ union tcp_listen_epoll_ref {
  * @fwd_out:		Port forwarding configuration for outbound packets
  * @timer_run:		Timestamp of most recent timer run
  * @pipe_size:		Size of pipes for spliced connections
+ * @tcp_rto_max:	Maximal retry timeout (in s)
  * @tcp_syn_retries:	SYN retries using exponential backoff timeout
  * @syn_linear_timeouts: SYN retries before using exponential backoff timeout
  */
@@ -67,6 +68,7 @@ struct tcp_ctx {
 	struct fwd_ports fwd_out;
 	struct timespec timer_run;
 	size_t pipe_size;
+	size_t tcp_rto_max;
 	uint8_t tcp_syn_retries;
 	uint8_t syn_linear_timeouts;
 };
-- 
2.49.0
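
For reference, the timeout computation in the ACK_FROM_TAP_DUE branch of
tcp_timer_ctl() can be sketched in isolation as below. This is only an
illustration, not part of the patch: rto_seconds() and ms_to_s_closest()
are hypothetical helpers, the latter standing in for DIV_ROUND_CLOSEST(),
whose definition isn't shown in this message. The sketch assumes a small,
non-negative retry count, so that the shift can't overflow.

```c
#define RTO_INIT	1	/* s, initial RTO, used during the handshake */
#define RTO_INIT_ACK	3	/* s, minimum initial RTO for data retransmissions */

#define MAX(a, b)	((a) > (b) ? (a) : (b))
#define MIN(a, b)	((a) < (b) ? (a) : (b))

/* Hypothetical helper: round non-negative milliseconds to the closest second */
static long ms_to_s_closest(long ms)
{
	return (ms + 500) / 1000;
}

/*
 * Hypothetical helper: retransmission timeout in seconds
 * @retries:			Retransmissions done so far (assumed small)
 * @established:		Non-zero once the handshake is complete
 * @syn_linear_timeouts:	SYN retries sent before backoff becomes exponential
 * @rto_max:			Upper bound for the timeout, in seconds
 */
static long rto_seconds(int retries, int established,
			int syn_linear_timeouts, long rto_max)
{
	int exp = retries;
	long timeout = RTO_INIT;

	if (!established)
		exp -= syn_linear_timeouts;	/* linear phase: exponent <= 0 */
	else
		timeout = MAX(timeout, RTO_INIT_ACK);

	timeout <<= MAX(exp, 0);	/* exponential backoff */

	return MIN(timeout, rto_max);	/* clamp to the configured maximum */
}
```

With rto_max = ms_to_s_closest(120000), that is, 120 seconds, an
established connection would see timeouts of 3, 6, 12, 24, 48 and 96
seconds for retries 0 to 5, and 120 seconds from retry 6 onwards. During
the handshake, with syn_linear_timeouts = 4, the timeout stays at 1 second
up to retry 4, then doubles on each further retry.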