From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202510 header.b=pCTBab0r; dkim-atps=neutral Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 8C4165A0619 for ; Mon, 13 Oct 2025 02:20:53 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202510; t=1760314851; bh=4ZVKQM57LCjqQwIYI4FGt98TMNgDotjs3dYERxkB7WA=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=pCTBab0r3dIfc7mFyKpsP2YnN0E0GPMSaV0ZZZWbpwEFALLoJnfEwBJDCMWu/r9rG oszxUJ999N2E+nPMQooEyXlICr8JGKnM8c4LeK5ZbklOLTrdKAeAI1SYAXew3mEAN1 TTv6NvszuxWB1PUz096MEg6vxr6S5JbiNiwZeYYJuJ4wnLrh0aEcGurAtaoWG/CThC DZyNx+3rPmvoaZ6Acbdl3nTn+5wYbMzIu+4eaoHqVFbMUoVy3F4ML/2sfByMm61+Zp wISd12p0BbCa9k/6AwohIiai1B6kxp+GIkC7JKSBMcrafGWjp6dKS7XBby7c4yg4Hi jCCMTOwxIxbng== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4clJ0g43Mcz4w9v; Mon, 13 Oct 2025 11:20:51 +1100 (AEDT) Date: Mon, 13 Oct 2025 11:18:13 +1100 From: David Gibson To: Yumei Huang Subject: Re: [PATCH v2 2/3] tcp: Resend SYN for inbound connections Message-ID: References: <20251010074700.22177-1-yuhuang@redhat.com> <20251010074700.22177-3-yuhuang@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="97yJFpvDyz6VLs21" Content-Disposition: inline In-Reply-To: <20251010074700.22177-3-yuhuang@redhat.com> Message-ID-Hash: RCVW4APTH5Z2JG3AGASOMZ42A3QY2ABG X-Message-ID-Hash: RCVW4APTH5Z2JG3AGASOMZ42A3QY2ABG X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, sbrivio@redhat.com X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --97yJFpvDyz6VLs21 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Oct 10, 2025 at 03:46:59PM +0800, Yumei Huang wrote: > If a client connects while guest is not connected or ready yet, > resend SYN instead of just resetting connection after 10 seconds. >=20 > Use the same backoff calculation for the timeout as linux kernel. >=20 > Signed-off-by: Yumei Huang > --- > tcp.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++++----- > tcp.h | 2 ++ > 2 files changed, 89 insertions(+), 8 deletions(-) >=20 > diff --git a/tcp.c b/tcp.c > index 2ec4b0c..85bbdac 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -179,9 +179,11 @@ > * > * Timeouts are implemented by means of timerfd timers, set based on fla= gs: > * > - * - SYN_TIMEOUT: if no ACK is received from tap/guest during handshake = (flag > - * ACK_FROM_TAP_DUE without ESTABLISHED event) within this time, reset= the > - * connection > + * - SYN_TIMEOUT_INIT: if no ACK is received from tap/guest during hands= hake > + * (flag ACK_FROM_TAP_DUE without ESTABLISHED event) within this time,= resend > + * SYN. It's the starting timeout for the first SYN retry. If this per= sists > + * for more than TCP_MAX_RETRIES or (tcp_syn_retries + > + * tcp_syn_linear_timeouts) times in a row, reset the connection > * > * - ACK_TIMEOUT: if no ACK segment was received from tap/guest, after s= ending > * data (flag ACK_FROM_TAP_DUE with ESTABLISHED event), re-send data f= rom the > @@ -309,6 +311,7 @@ > #include "tcp_internal.h" > #include "tcp_buf.h" > #include "tcp_vu.h" > +#include "lineread.h" > =20 > /* > * The size of TCP header (including options) is given by doff (Data Off= set) > @@ -340,7 +343,7 @@ enum { > #define WINDOW_DEFAULT 14600 /* RFC 6928 */ > =20 > #define ACK_INTERVAL 10 /* ms */ > -#define SYN_TIMEOUT 10 /* s */ > +#define SYN_TIMEOUT_INIT 1 /* s */ > #define ACK_TIMEOUT 2 As we've discussed in some places, the RFCs largely treat SYN retries the same as data retransmits. ACK_TIMEOUT controls the latter. 2s is a little odd, IIRC the RFC suggests 3, I'm not sure what Linux does by default. In any case, I'm wondering if we should use a single define for the initial value of both timeouts. > #define FIN_TIMEOUT 60 > #define ACT_TIMEOUT 7200 > @@ -365,6 +368,10 @@ uint8_t tcp_migrate_rcv_queue [TCP_MIGRATE_RCV_QUEU= E_MAX]; > =20 > #define TCP_MIGRATE_RESTORE_CHUNK_MIN 1024 /* Try smaller when above thi= s */ > =20 > +#define TCP_SYN_RETRIES_SYSCTL "/proc/sys/net/ipv4/tcp_syn_retries" > +#define TCP_SYN_LINEAR_TIMEOUTS_SYSCTL \ > + "/proc/sys/net/ipv4/tcp_syn_linear_timeouts" > + > /* "Extended" data (not stored in the flow table) for TCP flow migration= */ > static struct tcp_tap_transfer_ext migrate_ext[FLOW_MAX]; > =20 > @@ -581,8 +588,13 @@ static void tcp_timer_ctl(const struct ctx *c, struc= t tcp_tap_conn *conn) > if (conn->flags & ACK_TO_TAP_DUE) { > it.it_value.tv_nsec =3D (long)ACK_INTERVAL * 1000 * 1000; > } else if (conn->flags & ACK_FROM_TAP_DUE) { > - if (!(conn->events & ESTABLISHED)) > - it.it_value.tv_sec =3D SYN_TIMEOUT; > + if (!(conn->events & ESTABLISHED)) { > + if (conn->retries < c->tcp.syn_linear_timeouts) > + it.it_value.tv_sec =3D SYN_TIMEOUT_INIT; > + else > + it.it_value.tv_sec =3D SYN_TIMEOUT_INIT << > + (conn->retries - c->tcp.syn_linear_timeouts); > + } > else > it.it_value.tv_sec =3D ACK_TIMEOUT; > } else if (CONN_HAS(conn, SOCK_FIN_SENT | TAP_FIN_ACKED)) { > @@ -1961,6 +1973,7 @@ static void tcp_conn_from_sock_finish(const struct = ctx *c, > } > =20 > tcp_send_flag(c, conn, ACK); > + conn->retries =3D 0; I know Stefano said you needed to add this, but on a closer examination I don't think you do. tcp_tap_handler() calls tcp_update_seqack_from_tap() if !ESTABLISHED. That will clear retrans/retries if the ack number from the guest has advanced. That will occur when the guest SYN-ACKs our SYN, which is exactly the point at which we should clear the retries count. > =20 > /* The client might have sent data already, which we didn't > * dequeue waiting for SYN,ACK from tap -- check now. > @@ -2409,8 +2422,16 @@ void tcp_timer_handler(const struct ctx *c, union = epoll_ref ref) > tcp_timer_ctl(c, conn); > } else if (conn->flags & ACK_FROM_TAP_DUE) { > if (!(conn->events & ESTABLISHED)) { > - flow_dbg(conn, "handshake timeout"); > - tcp_rst(c, conn); > + if (conn->retries =3D=3D MIN(TCP_MAX_RETRIES, > + (c->tcp.tcp_syn_retries + c->tcp.tcp_syn_retries))) { Should this be tcp_syn_retries + linear_timeouts? Also, probably safer to use a >=3D rather than =3D=3D - if we somehow send an extra retry, we don't want to then retry forever. > + flow_dbg(conn, "handshake timeout"); > + tcp_rst(c, conn); > + } else { > + flow_dbg(conn, "SYN timeout, retry"); > + tcp_send_flag(c, conn, SYN); > + conn->retries++; > + tcp_timer_ctl(c, conn); > + } > } else if (CONN_HAS(conn, SOCK_FIN_SENT | TAP_FIN_ACKED)) { > flow_dbg(conn, "FIN timeout"); > tcp_rst(c, conn); > @@ -2766,6 +2787,62 @@ static socklen_t tcp_probe_tcp_info(void) > return sl; > } > =20 > +/** > + * get_tcp_syn_param() - Read SYN parameters from /proc/sys > + * @path: Path to the sysctl file > + * @fallback: Default value if file can't be read > + * > + * Return: Parameter value, fallback on failure > +*/ > +int get_tcp_syn_param(const char *path, int fallback) I wonder if it might be worth making a new function in util.c to read a file containing a single number. > +{ > + char *line, *end; > + struct lineread lr; > + long value; > + ssize_t len; > + int fd; > + > + fd =3D open(path, O_RDONLY | O_CLOEXEC); > + if (fd < 0) { > + debug("Unable to open %s", path); > + return fallback; > + } > + > + lineread_init(&lr, fd); > + len =3D lineread_get(&lr, &line); > + close(fd); > + > + if (len < 0) { > + debug("Unable to read %s", path); > + return fallback; > + } > + > + errno =3D 0; > + value =3D strtol(line, &end, 10); > + if (*end && *end !=3D '\n') { > + debug("Invalid format in %s", path); > + return fallback; > + } > + if (errno || value < 0 || value > INT_MAX) { > + debug("Invalid value in %s: %ld", path, value); > + return fallback; > + } > + return (int)value; You return an (int) here, but store the value into a uint8_t in the caller. That will cause unexpected results if the value you read is greater than 255. Somewhere you need to check for this and clamp the value. I'd suggest the function actually reading /proc return a long long (for reuseability), then clamp it in the caller. > +} > + > +/** > + * tcp_syn_params_init() - Get initial syn params for inbound connection > + * @c: Execution context > +*/ > +void tcp_syn_params_init(struct ctx *c) > +{ > + c->tcp.tcp_syn_retries =3D get_tcp_syn_param(TCP_SYN_RETRIES_SYSCTL, 8); > + c->tcp.syn_linear_timeouts =3D > + get_tcp_syn_param(TCP_SYN_LINEAR_TIMEOUTS_SYSCTL, 1); > + debug("TCP SYN parameters: retries=3D%d, linear_timeouts=3D%d", > + c->tcp.tcp_syn_retries, c->tcp.syn_linear_timeouts); > +} > + > /** > * tcp_init() - Get initial sequence, hash secret, initialise per-socket= data > * @c: Execution context > @@ -2776,6 +2853,8 @@ int tcp_init(struct ctx *c) > { > ASSERT(!c->no_tcp); > =20 > + tcp_syn_params_init(c); > + > tcp_sock_iov_init(c); > =20 > memset(init_sock_pool4, 0xff, sizeof(init_sock_pool4)); > diff --git a/tcp.h b/tcp.h > index 234a803..df699a4 100644 > --- a/tcp.h > +++ b/tcp.h > @@ -65,6 +65,8 @@ struct tcp_ctx { > struct fwd_ports fwd_out; > struct timespec timer_run; > size_t pipe_size; > + uint8_t tcp_syn_retries; > + uint8_t syn_linear_timeouts; > }; > =20 > #endif /* TCP_H */ > --=20 > 2.47.0 >=20 --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --97yJFpvDyz6VLs21 Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmjsRTUACgkQzQJF27ox 2GcHNA/+M0EmRvH7Sov043e65uYubj7DoLE+BnFvrIWWw+mNZ1NyFgzMsrOZRdwy 5qOkSnSujrE6OKsYBWg4O8wBYHi9B0vY9s0qgI5fO7e6T6rbOZb6epWNTyWnkY5n w2W8zgtEpggPml8c7cjRa0n4IhrAHfl5G6VbCFzyKfi+TpcOaMa+ebwUio7JqyY/ UjXYoNRvvASRNp4lFRO+Ves8AGT3xSBGdiBzd6XB/o/s/lEIiWAFHcrTI+HtS/2H ds2cHmqpLYU0GRg4dDfPS9uzFqwjj6WcchzYMhSJh+vXk3Lo45bZkBG4pLcV2zxM jn8TWJWitowUBdwV+0bBwMWKnYGaGkYlFt2WSzy1qVZh/BmAuPcfMLTBXGkqMH9F MRaI4ya/grP2FTlS1clTFKZna+O5+HRyrKEUd0jgpW+fJ+VccNRXijCCfmFrZmqB ARGs3r+R90bw0ctUUlKuFdHPcXMAQiPisN3iQNBq1JdicIlST+EGGOKwkmPdSeXg 3cyzTT3AdcCRwDc3YSVxkQU44+/8RxrsO5FHat8Tu72BfExztp0V7lC7RXcrD8V1 1WefH7kF7a63pAOlxLYBKguvpn/s25yuZN/OnWr92VTMxKY2TGHSiHTmMnyeMSov vmbFMRomhyR+jiImD6Vo4BMI4thoPEohsGjGgyqiXyAw+l7Vei4= =Crf8 -----END PGP SIGNATURE----- --97yJFpvDyz6VLs21--