From: David Gibson
To: Stefano Brivio
Cc: passt-dev@passt.top, Max Chernoff
Date: Fri, 5 Dec 2025 10:10:06 +1100
Subject: Re: [PATCH 1/8] tcp: Limit advertised window to available, not total sending buffer size
In-Reply-To: <20251204074542.2156548-2-sbrivio@redhat.com>
References: <20251204074542.2156548-1-sbrivio@redhat.com> <20251204074542.2156548-2-sbrivio@redhat.com>
List-Id: Development discussion and patches for passt

On Thu, Dec 04, 2025 at 08:45:34AM +0100, Stefano Brivio wrote:
> For non-local connections, we advertise the same window size as what
> the peer in turn advertises to us, and limit it to the buffer size
> reported via SO_SNDBUF.
>
> That's not quite correct: in order to later avoid failures while
> queueing data to the socket, we need to limit the window to the
> available buffer size, not the total one.
>
> Use the SIOCOUTQ ioctl and subtract the number of outbound queued
> bytes from the total buffer size, then clamp to this value.
>
> Signed-off-by: Stefano Brivio

Reviewed-by: David Gibson

> ---
>  README.md |  2 +-
>  tcp.c     | 18 ++++++++++++++++--
>  2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/README.md b/README.md
> index 897ae8b..8fdc0a3 100644
> --- a/README.md
> +++ b/README.md
> @@ -291,7 +291,7 @@ speeding up local connections, and usually requiring NAT. _pasta_:
>  * ✅ all capabilities dropped, other than `CAP_NET_BIND_SERVICE` (if granted)
>  * ✅ with default options, user, mount, IPC, UTS, PID namespaces are detached
>  * ✅ no external dependencies (other than a standard C library)
> -* ✅ restrictive seccomp profiles (33 syscalls allowed for _passt_, 43 for
> +* ✅ restrictive seccomp profiles (34 syscalls allowed for _passt_, 43 for
>    _pasta_ on x86_64)
>  * ✅ examples of [AppArmor](/passt/tree/contrib/apparmor) and
>    [SELinux](/passt/tree/contrib/selinux) profiles available
> diff --git a/tcp.c b/tcp.c
> index fa95f6b..863ccdb 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -1031,6 +1031,8 @@ void tcp_fill_headers(const struct ctx *c, struct tcp_tap_conn *conn,
>   * @tinfo: tcp_info from kernel, can be NULL if not pre-fetched
>   *
>   * Return: 1 if sequence or window were updated, 0 otherwise
> + *
> + * #syscalls ioctl
>   */
>  int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>  			  bool force_seq, struct tcp_info_linux *tinfo)
> @@ -1113,9 +1115,21 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>  	if ((conn->flags & LOCAL) || tcp_rtt_dst_low(conn)) {
>  		new_wnd_to_tap = tinfo->tcpi_snd_wnd;
>  	} else {
> +		uint32_t sendq;
> +		int limit;
> +
> +		if (ioctl(s, SIOCOUTQ, &sendq)) {
> +			debug_perror("SIOCOUTQ on socket %i, assuming 0", s);
> +			sendq = 0;
> +		}
>  		tcp_get_sndbuf(conn);
> -		new_wnd_to_tap = MIN((int)tinfo->tcpi_snd_wnd,
> -				     SNDBUF_GET(conn));
> +
> +		if ((int)sendq > SNDBUF_GET(conn)) /* Due to memory pressure? */
> +			limit = 0;
> +		else
> +			limit = SNDBUF_GET(conn) - (int)sendq;
> +
> +		new_wnd_to_tap = MIN((int)tinfo->tcpi_snd_wnd, limit);
>  	}
>
>  	new_wnd_to_tap = MIN(new_wnd_to_tap, MAX_WINDOW);
> --
> 2.43.0
>

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
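
For reference, the clamping arithmetic from the quoted tcp.c hunk can be
sketched in isolation. This is only an illustration, not code from passt:
sndbuf_avail() and wnd_clamp() are hypothetical names, and the sketch assumes
the caller has already obtained the total buffer size (as SNDBUF_GET() does in
the patch) and the queued byte count (as the SIOCOUTQ ioctl does in the patch).

```c
#include <assert.h>

#define MIN(a, b)	((a) < (b) ? (a) : (b))

/* Hypothetical helper, not part of passt: bytes of the sending buffer
 * that are still free, given its total size and the bytes already queued.
 */
int sndbuf_avail(int sndbuf, int sendq)
{
	/* Assumption for this sketch: both inputs are non-negative */
	assert(sndbuf >= 0 && sendq >= 0);

	if (sendq > sndbuf)	/* Queue can exceed the reported size */
		return 0;

	return sndbuf - sendq;
}

/* Hypothetical helper: window to advertise, that is, the peer's window
 * clamped to the free (not total) sending buffer space.
 */
int wnd_clamp(int peer_wnd, int sndbuf, int sendq)
{
	return MIN(peer_wnd, sndbuf_avail(sndbuf, sendq));
}
```

With a 65536-byte buffer of which 16384 bytes are queued, 49152 bytes are
available, so a peer window of 60000 would be clamped to 49152, whereas
clamping to the total size would have left it at 60000.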