Date: Thu, 21 May 2026 10:50:24 +1000
From: David Gibson
To: Stefano Brivio
CC: passt-dev@passt.top, Paul Holzinger
Subject: Re: [PATCH 3/6] tcp_splice: Clean up flow control path for splice forwarding
References: <20260520130851.436931-1-david@gibson.dropbear.id.au> <20260520130851.436931-4-david@gibson.dropbear.id.au> <20260520222851.19e5f430@elisabeth>
In-Reply-To: <20260520222851.19e5f430@elisabeth>
List-Id: Development discussion and patches for passt

On Wed, May 20, 2026 at 10:28:52PM +0200, Stefano Brivio wrote:
> Ah, yes, it looks better now. Three remarks:
>
> On Wed, 20 May 2026 23:08:48 +1000
> David Gibson wrote:
>
> > Splice forwarding can be blocked either waiting for data from one side
> > or waiting for space on the other. For that reason,
> > tcp_splice_sock_handler() on either socket can forward data in either or
> > both directions, depending on whether we have EPOLLIN, EPOLLOUT or both
> > events.
> >
> > The flow control for this is quite hard to follow though, since we forward
> > in one direction, then sometimes loop back with a goto to do it in the
> > other direction. Simplify this by adding a tcp_splice_forward() function
> > with the logic to forward in one direction and calling it either once or
> > twice from tcp_splice_sock_handler().
> >
> > Signed-off-by: David Gibson
> > ---
> >  tcp_splice.c | 137 ++++++++++++++++++++++++++-------------------------
> >  1 file changed, 71 insertions(+), 66 deletions(-)
> >
> > diff --git a/tcp_splice.c b/tcp_splice.c
> > index 34ffea73..18e8b303 100644
> > --- a/tcp_splice.c
> > +++ b/tcp_splice.c
> > @@ -474,67 +474,20 @@ void tcp_splice_conn_from_sock(const struct ctx *c, union flow *flow, int s0)
> >  }
> >
> >  /**
> > - * tcp_splice_sock_handler() - Handler for socket mapped to spliced connection
> > + * tcp_splice_forward() - Forward data in one direction using splice()
> >   * @c:		Execution context
> > - * @ref:	epoll reference
> > - * @events:	epoll events bitmap
> > + * @conn:	Connection to forward data for
> > + * @fromsidei:	Side to forward data from
> >   *
> >   * #syscalls:pasta splice
> >   */
> > -void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
> > -			     uint32_t events)
> > +static int tcp_splice_forward(struct ctx *c, struct
> > +			      tcp_splice_conn *conn, unsigned fromsidei)
>
> I think the struct argument should all be on the same line.

Oops, definitely. Forgot to document the return value too.

> >  {
> > -	struct tcp_splice_conn *conn = conn_at_sidx(ref.flowside);
> > -	unsigned evsidei = ref.flowside.sidei, fromsidei;
> > -	uint8_t lowat_set_flag, lowat_act_flag;
> > -	int eof, never_read;
> > -
> > -	assert(conn->f.type == FLOW_TCP_SPLICE);
> > -
> > -	if (conn->events == SPLICE_CLOSED)
> > -		return;
> > -
> > -	if (events & EPOLLERR) {
> > -		int err, rc;
> > -		socklen_t sl = sizeof(err);
> > -
> > -		rc = getsockopt(ref.fd, SOL_SOCKET, SO_ERROR, &err, &sl);
> > -		if (rc)
> > -			flow_perror(conn, "Error retrieving SO_ERROR");
> > -		else
> > -			flow_dbg(conn, "Error event on %s socket: %s",
> > -				 pif_name(conn->f.pif[evsidei]),
> > -				 strerror_(err));
> > -		goto reset;
> > -	}
> > -
> > -	if (conn->events == SPLICE_CONNECT) {
> > -		if (!(events & EPOLLOUT)) {
> > -			flow_err(conn, "Unexpected events 0x%x during connect",
> > -				 events);
> > -			goto reset;
> > -		}
> > -		if (tcp_splice_connect_finish(c, conn))
> > -			goto reset;
> > -	}
> > -
> > -	if (events & EPOLLOUT) {
> > -		fromsidei = !evsidei;
> > -		conn_event(conn, ~OUT_WAIT(evsidei));
> > -	} else {
> > -		fromsidei = evsidei;
> > -	}
> > -
> > -	if (events & EPOLLRDHUP)
> > -		/* For side 0 this is fake, but implied */
> > -		conn_event(conn, FIN_RCVD(evsidei));
> > -
> > -swap:
> > -	eof = 0;
> > -	never_read = 1;
> > -
> > -	lowat_set_flag = RCVLOWAT_SET(fromsidei);
> > -	lowat_act_flag = RCVLOWAT_ACT(fromsidei);
> > +	uint8_t lowat_set_flag = RCVLOWAT_SET(fromsidei);
> > +	uint8_t lowat_act_flag = RCVLOWAT_ACT(fromsidei);
> > +	int never_read = 1;
> > +	int eof = 0;
> >
> >  	while (1) {
> >  		ssize_t readlen, written, pending;
> > @@ -551,7 +504,7 @@ retry:
> >  		if (readlen < 0 && errno != EAGAIN) {
> >  			flow_perror(conn, "Splicing from %s socket",
> >  				    pif_name(conn->f.pif[fromsidei]));
> > -			goto reset;
> > +			return -1;
> >  		}
> >
> >  		flow_trace(conn, "%zi from read-side call", readlen);
> > @@ -578,7 +531,7 @@ retry:
> >  		if (written < 0 && errno != EAGAIN) {
> >  			flow_perror(conn, "Splicing to %s socket",
> >  				    pif_name(conn->f.pif[!fromsidei]));
> > -			goto reset;
> > +			return -1;
> >  		}
> >
> >  		flow_trace(conn, "%zi from write-side call (passed %zi)",
> > @@ -639,24 +592,76 @@ retry:
> >  				if (shutdown(conn->s[!sidei], SHUT_WR) < 0) {
> >  					flow_perror(conn, "shutdown() on %s",
> >  						    pif_name(conn->f.pif[!sidei]));
> > -					goto reset;
> > +					return -1;
> >  				}
> >  				conn_event(conn, FIN_SENT(!sidei));
> >  			}
> >  		}
> >  	}
> >
> > -	if (CONN_HAS(conn, FIN_SENT(0) | FIN_SENT(1))) {
> > -		/* Clean close, no reset */
> > -		conn_flag(conn, CLOSING);
> > +	return 0;
> > +}
> > +
> > +/**
> > + * tcp_splice_sock_handler() - Handler for socket mapped to spliced connection
> > + * @c:		Execution context
> > + * @ref:	epoll reference
> > + * @events:	epoll events bitmap
> > + */
> > +void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
> > +			     uint32_t events)
> > +{
> > +	struct tcp_splice_conn *conn = conn_at_sidx(ref.flowside);
> > +	unsigned evsidei = ref.flowside.sidei;
> > +
> > +	assert(conn->f.type == FLOW_TCP_SPLICE);
> > +
> > +	if (conn->events == SPLICE_CLOSED)
> >  		return;
> > +
> > +	if (events & EPOLLERR) {
> > +		int err, rc;
> > +		socklen_t sl = sizeof(err);
> > +
> > +		rc = getsockopt(ref.fd, SOL_SOCKET, SO_ERROR, &err, &sl);
> > +		if (rc)
> > +			flow_perror(conn, "Error retrieving SO_ERROR");
> > +		else
> > +			flow_dbg(conn, "Error event on %s socket: %s",
> > +				 pif_name(conn->f.pif[evsidei]),
> > +				 strerror_(err));
> > +		goto reset;
> > +	}
> > +
> > +	if (conn->events == SPLICE_CONNECT) {
> > +		if (!(events & EPOLLOUT)) {
> > +			flow_err(conn, "Unexpected events 0x%x during connect",
> > +				 events);
> > +			goto reset;
> > +		}
> > +		if (tcp_splice_connect_finish(c, conn))
> > +			goto reset;
> > +	}
> > +
> > +	if (events & EPOLLRDHUP)
> > +		/* For side 0 this is fake, but implied */
> > +		conn_event(conn, FIN_RCVD(evsidei));
>
> I saw this all goes away in 5/6, so it wouldn't be relevant.
> But in case we decide to drop 5/6, here are my remarks on this.
>
> EPOLLRDHUP is now handled before checking the other direction of the
> connection in case of EPOLLOUT.

I'm pretty sure that hasn't changed. In the old code EPOLLRDHUP
handling was before we did any of the actual data handling for EPOLLIN
or EPOLLOUT.

> I think it actually makes more sense this way because we update flags
> with everything we know until that point, and it shouldn't have a
> functional effect (the check at the end of the new tcp_splice_forward()
> is on FIN_RCVD(fromsidei)), but I'm raising that in case the change
> wasn't intended.
>
> > +
> > +	if (events & EPOLLOUT) {
> > +		if (tcp_splice_forward(c, conn, !evsidei))
> > +			goto reset;
> > +		conn_event(conn, ~OUT_WAIT(evsidei));
> >  	}
> >
> > -	if ((events & (EPOLLIN | EPOLLOUT)) == (EPOLLIN | EPOLLOUT)) {
> > -		events = EPOLLIN;
> > +	if (events & EPOLLIN) {
> > +		if (tcp_splice_forward(c, conn, evsidei))
> > +			goto reset;
>
> This should be:
>
> goto reset;
>
> instead of:
>
> goto reset;

Oops, fixed.

>
> > +	}
> >
> > -		fromsidei = !fromsidei;
> > -		goto swap;
> > +	if (CONN_HAS(conn, FIN_SENT(0) | FIN_SENT(1))) {
> > +		/* Clean close, no reset */
> > +		conn_flag(conn, CLOSING);
> > +		return;
> >  	}
> >
> >  	if (events & EPOLLHUP) {
>
> --
> Stefano
>

--
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
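
For anyone reading the thread without the tree at hand, the control flow the patch moves to can be sketched without real sockets. This is only an illustration under stated assumptions, not code from the series: demo_conn, demo_forward() and demo_sock_handler() are hypothetical stand-ins, and the actual splice() calls, connection flags and reset path are left out. The only things taken from the patch are the one-direction helper and the mapping of EPOLLOUT to forwarding from the other side and EPOLLIN to forwarding from the side that got the event.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical, socket-free model of a spliced connection */
struct demo_conn {
	int forwarded[2];	/* Successful forward calls, by source side */
	int fail_side;		/* Source side that fails, or -1 for none */
};

/* Stand-in for the one-direction helper: 0 on success, -1 on error */
static int demo_forward(struct demo_conn *conn, unsigned fromsidei)
{
	if ((int)fromsidei == conn->fail_side)
		return -1;

	conn->forwarded[fromsidei]++;
	return 0;
}

/* Stand-in for the handler: 0 on success, -1 where the real code resets */
static int demo_sock_handler(struct demo_conn *conn, unsigned evsidei,
			     uint32_t events)
{
	assert(evsidei < 2);

	/* Room to write on this socket: drain data from the other side */
	if ((events & EPOLLOUT) && demo_forward(conn, !evsidei))
		return -1;

	/* Data to read on this socket: forward it to the other side */
	if ((events & EPOLLIN) && demo_forward(conn, evsidei))
		return -1;

	return 0;
}
```

With both events set, the handler calls the helper once per direction and needs no goto to loop back. An error from either call is reported once to the caller, which is where the real handler would jump to its reset label.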