From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id A64175A026D for ; Mon, 25 Sep 2023 05:31:51 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1695612707; bh=qEbAZMaXpH61eN4cCnD8vhxA2fd90+oLrv6LdoE3KqM=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=mDFM2ufJLhVUdHLVwT+g9M056cA8P8qL6Mzt8tZnbgCKvixJlEWxS+R7CSq+X7gVp fAni8P4MAFP3vgq6USuAelJF/xxcAW4zaRC6qaNWjkI4ouGOTiX+d9IeD6Pnw+N4C6 5GnN/GnCPNymTmFZcZce7GwoD169ai547lmBONfI= Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4Rv7hg5HxMz4xKl; Mon, 25 Sep 2023 13:31:47 +1000 (AEST) Date: Mon, 25 Sep 2023 13:07:24 +1000 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH RFT 2/5] tcp: Reset STALLED flag on ACK only, check for pending socket data Message-ID: References: <20230922220610.58767-1-sbrivio@redhat.com> <20230922220610.58767-3-sbrivio@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="05Rwwdl30Skun6ZB" Content-Disposition: inline In-Reply-To: <20230922220610.58767-3-sbrivio@redhat.com> Message-ID-Hash: EEKGIQ2K5XZASCR4QDOVCMFPFGSR757D X-Message-ID-Hash: EEKGIQ2K5XZASCR4QDOVCMFPFGSR757D X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: Matej Hrica , passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --05Rwwdl30Skun6ZB Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable I think the change itself here is sound, but I have some nits to pick with the description and reasoning. On Sat, Sep 23, 2023 at 12:06:07AM +0200, Stefano Brivio wrote: > In tcp_tap_handler(), we shouldn't reset the STALLED flag (indicating > that we ran out of tap-side window space, or that all available > socket data is already in flight -- better names welcome! Hmm.. when you put it like that it makes me wonder if those two quite different conditions really need the same handling. Hrm.. I guess both conditions mean that we can't accept data from the socket, even if it's availble. > ) on any > event: do that only if the first packet in a batch has the ACK flag > set. "First packet in a batch" may not be accurate here - we're looking at whichever packet we were up to before calling data_from_tap(). There could have been earlier packets in the receive batch that were already processed. This also raises the question of why the first data packet should be particularly privileged here. I'm wondering if what we really want to check is whether data_from_tap() advanced the ack pointer at all. I'm not clear on when the th->ack check would ever fail in practice: aren't the only normal packets in a TCP connection without ACK the initial SYN or an RST? We've handled the SYN case earlier, so should we just have a blanket case above this that if we get a packet with !ACK, we reset the connection? > Make sure we check for pending socket data when we reset it: > reverting back to level-triggered epoll events, as tcp_epoll_ctl() > does, isn't guaranteed to actually trigger a socket event. Which sure seems like a kernel bug. Some weird edge conditions for edge-triggered seems expected, but this doesn't seem like valid level-triggered semantics. Hmmm... is toggling EPOLLET even what we want. IIUC, the heart of what's going on here is that we can't take more data from the socket until something happens on the tap side (either the window expands, or it acks some data). In which case should we be toggling EPOLLIN on the socket instead? That seems more explicitly to be saying to the socket side "we don't currently care if you have data available". > Further, note that the flag only makes sense once a connection is > established, so move all this to the right place, which is convenient > for the next patch, as we want to check if the STALLED flag was set > before processing any new information about the window size > advertised by the tap. >=20 > Signed-off-by: Stefano Brivio > --- > tcp.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) >=20 > diff --git a/tcp.c b/tcp.c > index aa1c8c9..5528e05 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -2572,8 +2572,6 @@ int tcp_tap_handler(struct ctx *c, int af, const vo= id *saddr, const void *daddr, > if (th->ack && !(conn->events & ESTABLISHED)) > tcp_update_seqack_from_tap(c, conn, ntohl(th->ack_seq)); > =20 > - conn_flag(c, conn, ~STALLED); > - > /* Establishing connection from socket */ > if (conn->events & SOCK_ACCEPTED) { > if (th->syn && th->ack && !th->fin) { > @@ -2631,6 +2629,11 @@ int tcp_tap_handler(struct ctx *c, int af, const v= oid *saddr, const void *daddr, > if (conn->seq_ack_to_tap !=3D conn->seq_from_tap) > ack_due =3D 1; > =20 > + if ((conn->flags & STALLED) && th->ack) { > + conn_flag(c, conn, ~STALLED); > + tcp_data_from_sock(c, conn); > + } > + > if ((conn->events & TAP_FIN_RCVD) && !(conn->events & SOCK_FIN_SENT)) { > shutdown(conn->sock, SHUT_WR); > conn_event(c, conn, SOCK_FIN_SENT); --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --05Rwwdl30Skun6ZB Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmUQ+WUACgkQzQJF27ox 2GershAAjT+e1JI3E6+rKwsGFuilJTdYpQoZqMHcI1G3RTRfBZ9DNYtxxOi5Ydba R5NOYako8fWarYUddpgsjmW2NzCtQ+PEO4tik8mDckpM0dBuygWWw+dJig79/G5e rR3PDB+fEUMjcz7RQU/5E7rYhDeqGYmSSEb2TicdvVzCewwvfPIyVdLQEaUL53iD KKLh1AzvdJram2OrRVlHMNIi0fHFK3KfNlwF/nsiMW8uoniJ3qN+KYLXoDkLrB+Y qo8xR28zMpMfTXzHyJ7TRYeSjK4UyWrL9NrCsdJKcD6uCnTruvkhsenX9riaLyYP 0j5NqTMJT+LofnRH8iFlxTVcmM17bqfH2uMyvk82JIJAbZqtptdp8YHXfI1lv07n 1CwYaYGYNBHjoPnHDq6mIOCKmpTqwjYvkb0GHv+w44unwy6FQV+h+dZWJyNKrvpY QnhZD0HskbceMAfxhiWv8uBnVe8Ewo/dtwCeQwrp54KP/yXHLBTNSSDwtVSJQ81Y SyMxQTHgmm8DwVcOjbv6yHDLM/DKEdtncn5F15Mbn5ZIr3YJVaP+WTj0csMZ/WUe 12E+MimqY5hkPDOcbi8bQl/hGxuEWgMOes5BQuzmvNoJVPlMkMvVb7qK99pXyv97 bfsj8lqKBfUXQWs/jeziGYHHvsD8Trpj4sRSFUiTJGJC3weSuQg= =eNS4 -----END PGP SIGNATURE----- --05Rwwdl30Skun6ZB--