From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202412 header.b=JMCpZ3WE; dkim-atps=neutral Received: from mail.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id 351E35A004E for ; Tue, 21 Jan 2025 04:01:45 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202412; t=1737428492; bh=AUp0Go3ZVf2OElsgYE/wn+b10187fEuyXfn7JxvNb5Y=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=JMCpZ3WEm3W/g/62keqqgNBpEYTxkSO4wH1iXmwiucUjzzm97PGnEZQH4QbZT9fbA XlRqk4r9OKpnorIu66lfQvTd5aGzEqv7B0xid8SfrrrtVtlHI7yiXQx+57jnYP0QmY ybWGuLtjbqaf5PsoREjjgOSdtnXhm+p2qmNzXIzT/g8cxQ+mJ4M/X1LnFBXeE6BUui pkaDIQktFZwrqUAXaGXjO84WRpFC/0TNzzMx9qRcuKz6XQbufYyC8r9hHhZz4L+L/u hYk2e00k5k/rMgLhoa4CoCcP2QUO+/ez6WOCVx6T2xYv3aeI/Zh2l27u42AEjpXd+Z g4lCvt852vfBw== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4YcX6N57DGz4x8h; Tue, 21 Jan 2025 14:01:32 +1100 (AEDT) Date: Tue, 21 Jan 2025 13:31:03 +1030 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH v2] tcp: Disable Nagle's algorithm (set TCP_NODELAY) on all sockets Message-ID: References: <20250120172816.2102833-1-sbrivio@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="PguZ1sFy+LitFVxy" Content-Disposition: inline In-Reply-To: <20250120172816.2102833-1-sbrivio@redhat.com> Message-ID-Hash: KSNWJ4YJPUZLYNZXDOZN4PMNU7CAI6G5 X-Message-ID-Hash: KSNWJ4YJPUZLYNZXDOZN4PMNU7CAI6G5 X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --PguZ1sFy+LitFVxy Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Jan 20, 2025 at 06:28:16PM +0100, Stefano Brivio wrote: > Following up on 725acd111ba3 ("tcp_splice: Set (again) TCP_NODELAY on > both sides"), David argues that, in general, we don't know what kind > of TCP traffic we're dealing with, on any side or path. >=20 > TCP segments might have been delivered to our socket with a PSH flag, > but we don't have a way to know about it. >=20 > Similarly, the guest might send us segments with PSH or URG set, but > we don't know if we should generally TCP_CORK sockets and uncork on > those flags, because that would assume they're running a Linux kernel > (and a particular version of it) matching the kernel that delivers > outbound packets for us. >=20 > Given that we can't make any assumption and everything might very well > be interactive traffic, disable Nagle's algorithm on all non-spliced > sockets as well. >=20 > After all, John Nagle himself is nowadays recommending that delayed > ACKs should never be enabled together with his algorithm, but we > don't have a practical way to ensure that our environment is free from > delayed ACKs (TCP_QUICKACK is not really usable for this purpose): >=20 > https://news.ycombinator.com/item?id=3D34180239 >=20 > Suggested-by: David Gibson > Signed-off-by: Stefano Brivio > --- > v2: Set TCP_NODELAY on inbound socket after accept4(), not on the > listening sockets, and change failure message to debug() > instead of trace() Uh.. you've moved it to after accept() for incoming connections, but it looks like you've accidentally removed the call for outgoing connections (tcp_conn_new_sock()). >=20 > tcp.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) >=20 > diff --git a/tcp.c b/tcp.c > index a012b81..4d6a6b3 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -756,6 +756,19 @@ static void tcp_sock_set_bufsize(const struct ctx *c= , int s) > trace("TCP: failed to set SO_SNDBUF to %i", v); > } > =20 > +/** > + * tcp_sock_set_nodelay() - Set TCP_NODELAY option (disable Nagle's algo= rithm) > + * @s: Socket, can be -1 to avoid check in the caller > + */ > +static void tcp_sock_set_nodelay(int s) > +{ > + if (s =3D=3D -1) > + return; > + > + if (setsockopt(s, SOL_TCP, TCP_NODELAY, &((int){ 1 }), sizeof(int))) > + debug("TCP: failed to set TCP_NODELAY on socket %i", s); > +} > + > /** > * tcp_update_csum() - Calculate TCP checksum > * @psum: Unfolded partial checksum of the IPv4 or IPv6 pseudo-header > @@ -1285,6 +1298,7 @@ static int tcp_conn_new_sock(const struct ctx *c, s= a_family_t af) > return -errno; > =20 > tcp_sock_set_bufsize(c, s); > + tcp_sock_set_nodelay(s); > =20 > return s; > } > @@ -2058,6 +2072,7 @@ void tcp_listen_handler(const struct ctx *c, union = epoll_ref ref, > goto cancel; > =20 > tcp_sock_set_bufsize(c, s); > + tcp_sock_set_nodelay(s); > =20 > /* FIXME: When listening port has a specific bound address, record that > * as our address --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --PguZ1sFy+LitFVxy Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmePDeIACgkQzQJF27ox 2GeyeA/6AglhWATj05hQoV9gMmUb4cOd8Pxz51VN8blQRCEWzWvAxRy2a34ZZVua 9BrR4NRr98D/gIhobanXHf+AsIBM+YNdf/uVbC0qmi6fpJaVDbZukydW51MDdT02 KsXJjEbC50D0h1Msv4zr5p80z7l/74+8aRgzPFO8qNgx66YVgTRKYOiqZ+tP83hB UsISzt/jm/ocqGDwT9ud8nvqC9iM0s9AXwzC+pPCqv4FNxSXIYvnweWjsXelDd+/ Y67gv2bBTPFPKRqoFDA3P02fzJXcQM3CXIWxcqexAK843RImgaY2HC0lxMA968CR urlPwCKkw6FPTUx8jx1G9qDJ/sfdmUVkw/aoaxMQrm1jPLwPtKAj3Rp3PvwiUwFb Xb4So4z0wEta3VR5uwuIPMzGUxG3bjzmHkMwjCPk27ioxNwlphqUMHt06Oi5YpJN OepDepOW1MMbnVRvRCRTUZaFHAamQn8bDqik/GOe2vW1d2lFEqtwr+R8LTRyE0kt urQd3aMNpR85bTg4j7xXcMGwM1zwxqnY9VjYEM8Jry+gsL5JqGPdyU1ht/CKT/On ODNt9octGQCk315XzirLVMdCAtzwRm9t/EdPmkSGGt3OrwZlWWH2S/mRX+oe6AW3 d5rVo6CkTllJacCcIHt5L1yda4a2sE5qvQrj09sScTwbEO2zFCM= =gQm5 -----END PGP SIGNATURE----- --PguZ1sFy+LitFVxy--