From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76])
 by passt.top (Postfix) with ESMTPS id 736D75A026D
 for ; Fri, 26 Apr 2024 05:31:02 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gibson.dropbear.id.au; s=202312; t=1714102258;
 bh=tqlk+0LRC5fNwtlhtK7MoyftjYsjm0xmxajGDnnHU/Q=;
 h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
 b=gcTmuQzY/4I3iyRu+2J58V5rxl5cTaUxvKVU0gCwB8Cde+kuTR6G3skOvZ4uJLSyF
 vehbOjpDyh5hIbD6T4/6hnPclfIHfQnOHica+x8OwbXjKSWAxQp78j846f1sObZ0uY
 rUc3pbbKv6fiwYMnUwybW+cz7+MuaHIvRojqNjhPphVCKEbMFcC0/GNlTzMGZIvylE
 kVkbywp0i6UyE62ttGaCIEYlz5wIE8mlhfkOr9OKv5Zy5443Xfe4dvdvKFDX+7/vQX
 CrkzZ8QCsxu83fuX8MULq85oG5bPUstSVd1ZvmZeCEwasUtc3Da703VgWG9h0Gy7G9
 RXBHgDFqF9chQ==
Received: by gandalf.ozlabs.org (Postfix, from userid 1007)
 id 4VQdXy4LQxz4wx6; Fri, 26 Apr 2024 13:30:58 +1000 (AEST)
Date: Fri, 26 Apr 2024 13:27:11 +1000
From: David Gibson
To: Stefano Brivio
Subject: Re: [PATCH 1/2] tcp: leverage support of SO_PEEK_OFF socket option
 when available
Message-ID:
References: <20240420191920.104876-1-jmaloy@redhat.com>
 <20240420191920.104876-2-jmaloy@redhat.com>
 <20240423195010.2b4d5c13@elisabeth>
 <20240424203044.2df748d7@elisabeth>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="46BjwOuGabkzwjTj"
Content-Disposition: inline
In-Reply-To: <20240424203044.2df748d7@elisabeth>
Message-ID-Hash: KBBYTMQB2MMSVYB5BJ6OEKNSIX66ZG25
X-Message-ID-Hash: KBBYTMQB2MMSVYB5BJ6OEKNSIX66ZG25
X-MailFrom: dgibson@gandalf.ozlabs.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
 loop; banned-address; member-moderation; nonmember-moderation;
 administrivia; implicit-dest; max-recipients; max-size; news-moderation;
 no-subject; digests; suspicious-header
CC: Jon Maloy, passt-dev@passt.top, lvivier@redhat.com, dgibson@redhat.com
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches
 for passt
Archived-At:
Archived-At:
List-Archive:
List-Archive:
List-Help:
List-Owner:
List-Post:
List-Subscribe:
List-Unsubscribe:

--46BjwOuGabkzwjTj
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Apr 24, 2024 at 08:30:44PM +0200, Stefano Brivio wrote:
> On Wed, 24 Apr 2024 10:48:05 +1000
> David Gibson wrote:
>
> > On Tue, Apr 23, 2024 at 07:50:10PM +0200, Stefano Brivio wrote:
> > > On Sat, 20 Apr 2024 15:19:19 -0400
> > > Jon Maloy wrote:
> > [snip]
> > > > + set_peek_offset(s, 0);
> > >
> > > Do we really need to initialise it to zero on a new connection? Extra
> > > system calls on this path matter for latency of connection
> > > establishment.
> >
> > Sort of, yes: we need to enable the SO_PEEK_OFF behaviour by setting
> > it to 0, rather than the default -1.
>
> By the way of which, this is not documented at this point -- a man page
> patch (linux-man and linux-api lists) would be nice.
>
> > We could lazily enable it, but
> > we'd need either to a) do it later in the handshake (maybe when we set
> > ESTABLISHED), but we'd need to be careful it is always set before the
> > first MSG_PEEK
>
> I was actually thinking that we could set it only as we receive data
> (not every connection will receive data), and keep this out of the
> handshake (which we want to keep "faster", I think).

That makes sense, but I think it would need a per-connection flag.

> And setting it as we mark a connection as ESTABLISHED should have the
> same effect on latency as setting it on a new connection -- that's not
> really lazy. So, actually:

Good point.

> > or b) keep track of whether it's set on a per-socket
> > basis (this would have the advantage of robustness if we ever
> > encountered a kernel that weirdly allows it for some but not all TCP
> > sockets).
>
> ...this could be done as we receive data in tcp_data_from_sock(), with
> a new flag in tcp_tap_conn::flags, to avoid adding latency to the
> handshake. It also looks more robust to me, and done/checked in a
> single place where we need it.
>
> We have just three bits left there which isn't great, but if we need to
> save one at a later point, we can drop this new flag easily.

I just realised that folding the feature detection into this is a bit
costlier than I thought. If we globally probe the feature we just
need one bit per connection: is SO_PEEK_OFF set yet or not. If we
tried to probe per-connection we'd need a tristate: haven't tried /
SO_PEEK_OFF enabled / tried and failed.

-- 
David Gibson                   | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
                               | _way_ _around_!
http://www.ozlabs.org/~dgibson

--46BjwOuGabkzwjTj
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmYrHw4ACgkQzQJF27ox
2GevwRAAiYxVQkl+Q4dGvMoGS7AsHZx+8HnNL4OAPwwAZhWp3U7l4dhB+nhEBu09
4CcpNah4k5U8VDtYKgpoFeKZJjw3p2MR2Hp2By01W1iZ0MkZonIdO6fkavDL4uKN
YTq4Q3B3gGnsv/A0mD9WOD/uxAwo8ERPOS4YBwG0fhx1bbBUARQ3MMqBugt9/MZ3
aEhjYNYBLwhlmWSIJ6/eBQvIRfqPOu8wuGoSbpGInQJMoGfVKVCIAHsRGp0Y8gv9
nkvYpYyYlL8sUaYqfSWjgpzTO8ggZehBRKZy9i1fOxz6t+4jgnIO6YzqjvZfS/IO
RmmDrsAOV/IO2DmibaxJpdLteUg5aegPk07wi3t3PKlLWZy/V1k6z2wGFfHRJ9z6
Deyd5ygxnUspg/Hac+TTPUZQSdhrr2gOwWKTmXWQIjfc18o8pyfqYBW5i+UxyxtF
LNmalVPSiGLpDbZW20aMKO4ozX0Lfzwu4dQ+h9xWzWrjiWbPX4c04+bOBJDLaUTs
jhMX7r79HscLmQBPLmB4u56ZRe0eV844jwjGnhYik/7C6lKA8ZtIo+t1V9CXb/2V
zJ7lZOUq1O9pTX+0HcpnQpF6/SVvPJMF9zZJRX1i4tKecMGmEyfVlXG7vc6iLuDs
wC2sReLnhQU3e35D84/0zDeF6IG0PQkULlHfC2LUiAYr2nXXnQw=
=eORU
-----END PGP SIGNATURE-----

--46BjwOuGabkzwjTj--