From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id F33755A0272 for ; Wed, 8 Mar 2023 23:51:40 +0100 (CET) Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4PX6xl0RxBz4xDr; Thu, 9 Mar 2023 09:51:39 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1678315899; bh=Ffh+ll/gYkiu3p5iAqDbEEdg5We9c6R5Ly9Eo0PrAOk=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=I0PfM4i/0ebQapqmJslUhoFwc8TEvsAv3n58XT99Cmo7d9r0Mfa0xU7GSz1Ku5e21 pxEnW1Bg5YKzV9F+BwFZDesBBnXWZhjbCwsbPJhT41nPKcz9uYoAlt8H/mI5dgTT9W RhA+mS7FWKFd3yZYpezDZOMjQzt5yOjjZGLJEa3c= Date: Thu, 9 Mar 2023 09:51:23 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH] tcp: Clamp MSS value when queueing data to tap, also for pasta Message-ID: References: <20230308211628.2282752-1-sbrivio@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="2kgkTkmTiGnbT8KS" Content-Disposition: inline In-Reply-To: <20230308211628.2282752-1-sbrivio@redhat.com> Message-ID-Hash: 435H63QVAOE6RECHP4NFAKDYBYB6Y3YU X-Message-ID-Hash: 435H63QVAOE6RECHP4NFAKDYBYB6Y3YU X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, Tom Mombourquette X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --2kgkTkmTiGnbT8KS Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Mar 08, 2023 at 10:16:28PM +0100, Stefano Brivio wrote: > Tom reports that a pattern of repated ~1 MiB chunks downloads over > NNTP over TLS, on Podman 4.4 using pasta as network back-end, results > in pasta taking one full CPU thread after a while, and the download > never succeeds. >=20 > On that setup, we end up re-sending the same frame over and over, > with a consistent 65 534 bytes size, and never get an > acknowledgement from the tap-side client. This only happens for the > default MTU value (65 520 bytes) or for values that are slightly > smaller than that (down to 64 499 bytes). >=20 > We hit this condition because the MSS value we use in > tcp_data_from_sock(), only in pasta mode, is simply clamped to > USHRT_MAX, and not to the actual size of the buffers we pre-cooked > for sending, which is a bit less than that. >=20 > It looks like we got away with it until commit 0fb7b2b9080a ("tap: > Use different io vector bases depending on tap type") fixed the > setting of iov_len. >=20 > Luckily, since it's pasta, we're queueing up to two frames at a time, > so the worst that can happen is a badly segmented TCP stream: we > always have some space at the tail of the buffer. >=20 > Clamp the MSS value to the appropriate maximum given by struct > tcp{4,6}_buf_data_t, no matter if we're running in pasta or passt > mode. >=20 > While at it, fix the comments to those structs to reflect the current > struct size. This is not really relevant for any further calculation > or consideration, but it's convenient to know while debugging this > kind of issues. >=20 > Thanks to Tom for reporting the issue in a very detailed way and for > providing a test setup. >=20 > Reported-by: Tom Mombourquette > Link: https://github.com/containers/podman/issues/17703 > Signed-off-by: Stefano Brivio Reviewed-by: David Gibson > --- > tcp.c | 23 +++++++++-------------- > 1 file changed, 9 insertions(+), 14 deletions(-) >=20 > diff --git a/tcp.c b/tcp.c > index a29e387..0214087 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -461,7 +461,7 @@ static struct tcp4_l2_buf_t { > struct iphdr iph; /* 44 28 */ > struct tcphdr th; /* 64 48 */ > uint8_t data[MSS4]; /* 84 68 */ > - /* 65541 65525 */ > + /* 65536 65532 */ > #ifdef __AVX2__ > } __attribute__ ((packed, aligned(32))) > #else > @@ -489,7 +489,7 @@ struct tcp6_l2_buf_t { > struct ipv6hdr ip6h; /* 32 20 */ > struct tcphdr th; /* 72 60 */ > uint8_t data[MSS6]; /* 92 80 */ > - /* 65639 65627 */ > + /* 65536 65532 */ > #ifdef __AVX2__ > } __attribute__ ((packed, aligned(32))) > #else > @@ -1917,15 +1917,13 @@ int tcp_conn_new_sock(const struct ctx *c, sa_fam= ily_t af) > =20 > /** > * tcp_conn_tap_mss() - Get MSS value advertised by tap/guest > - * @c: Execution context > * @conn: Connection pointer > * @opts: Pointer to start of TCP options > * @optlen: Bytes in options: caller MUST ensure available length > * > * Return: clamped MSS value > */ > -static uint16_t tcp_conn_tap_mss(const struct ctx *c, > - const struct tcp_tap_conn *conn, > +static uint16_t tcp_conn_tap_mss(const struct tcp_tap_conn *conn, > const char *opts, size_t optlen) > { > unsigned int mss; > @@ -1936,13 +1934,10 @@ static uint16_t tcp_conn_tap_mss(const struct ctx= *c, > else > mss =3D ret; > =20 > - /* Don't upset qemu */ > - if (c->mode =3D=3D MODE_PASST) { > - if (CONN_V4(conn)) > - mss =3D MIN(MSS4, mss); > - else > - mss =3D MIN(MSS6, mss); > - } > + if (CONN_V4(conn)) > + mss =3D MIN(MSS4, mss); > + else > + mss =3D MIN(MSS6, mss); > =20 > return MIN(mss, USHRT_MAX); > } > @@ -2066,7 +2061,7 @@ static void tcp_conn_from_tap(struct ctx *c, int af= , const void *addr, > =20 > conn->wnd_to_tap =3D WINDOW_DEFAULT; > =20 > - mss =3D tcp_conn_tap_mss(c, conn, opts, optlen); > + mss =3D tcp_conn_tap_mss(conn, opts, optlen); > if (setsockopt(s, SOL_TCP, TCP_MAXSEG, &mss, sizeof(mss))) > trace("TCP: failed to set TCP_MAXSEG on socket %i", s); > MSS_SET(conn, mss); > @@ -2533,7 +2528,7 @@ static void tcp_conn_from_sock_finish(struct ctx *c= , struct tcp_tap_conn *conn, > if (!(conn->wnd_from_tap >>=3D conn->ws_from_tap)) > conn->wnd_from_tap =3D 1; > =20 > - MSS_SET(conn, tcp_conn_tap_mss(c, conn, opts, optlen)); > + MSS_SET(conn, tcp_conn_tap_mss(conn, opts, optlen)); > =20 > conn->seq_init_from_tap =3D ntohl(th->seq) + 1; > conn->seq_from_tap =3D conn->seq_init_from_tap; --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --2kgkTkmTiGnbT8KS Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmQJEWUACgkQzQJF27ox 2Gevig//QfcNz4Xly+bUZtYn//UsUJGA6TvyXKoZp40/eq6vG+TC9csqBmmw1CAz gMf07YdR2f613J9+jGnFqecYZXBvF5yjYEBxhGj0oYvWw1N3Kf13fX3VQKKngbz3 rm0R2je6QRRSvG2f+WWqFsh8ncGop9tSW7dgNIXxHpdixyH1gOeOE71qMT+M2hYb Qqh8yTJGBCFV7yNKuMAMY5pJEWuZbj+jjNDqxqiRC19Q+nvAVmW9LZOkH5SOBkH1 +rxFS2Q6ESLds4Cevk5SQIoif59dROm01zY1X0PhGflhX4mIe/CkItrE6nQh9puG GoywmjS0a1pGItO054lFNsbX5oeIDroHLzb5YqeHQlf068XRYAgn0RLtjZx0Y0pB j7No1azNOxrxaQNkN1dYJ0kofuaR22zmAXtLKxyPfWLspZlR38DTwJvqSnjHeL+x IgpF9KFvjpTUtFCja94JrrkfhU6i9zcuHbB9AsKY82+V6d6kmU/9HN0HkvOZV3FN h/P9Tb5hY5cyXdEB9BgBl7rHYu9PAYerbaqUFgIZLbPhVW6MRA+onBc3541hgVGl U9P9wGTOyf8Y5oz59KS8K4YFgG/mcy8u8I1rblmsBoNyMm/p5xEF24efQG+1TO8v hiblxkryb/dcc6zUEnMWCww6myHeHMbseyz9lhf0GD3hV6B+b64= =zUCE -----END PGP SIGNATURE----- --2kgkTkmTiGnbT8KS--