From: David Gibson
To: Jon Maloy
Cc: passt-dev@passt.top, sbrivio@redhat.com, lvivier@redhat.com, dgibson@redhat.com
Date: Fri, 31 May 2024 11:54:38 +1000
Subject: Re: [PATCH v7 2/3] tcp: leverage support of SO_PEEK_OFF socket option when available
References: <20240524172656.193183-1-jmaloy@redhat.com> <20240524172656.193183-3-jmaloy@redhat.com>
In-Reply-To: <20240524172656.193183-3-jmaloy@redhat.com>
List-Id: Development discussion and patches for passt

On Fri, May 24, 2024 at 01:26:55PM -0400, Jon Maloy wrote:
> From linux-6.9.0 the kernel will contain
> commit 05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option").
> 
> This new feature makes is possible to call recv_msg(MSG_PEEK) and make
> it start reading data from a given offset set by the SO_PEEK_OFF socket
> option. This way, we can avoid repeated reading of already read bytes of
> a received message, hence saving read cycles when forwarding TCP
> messages in the host->name space direction.
> 
> In this commit, we add functionality to leverage this feature when
> available, while we fall back to the previous behavior when not.
> 
> Measurements with iperf3 shows that throughput increases with 15-20
> percent in the host->namespace direction when this feature is used.
> 
> Signed-off-by: Jon Maloy
> ---
>  tcp.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 51 insertions(+), 8 deletions(-)
> 
> diff --git a/tcp.c b/tcp.c
> index 146ab8f..01898f1 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -509,6 +509,9 @@ static struct iovec tcp6_l2_iov [TCP_FRAMES_MEM][TCP_NUM_IOVS];
>  static struct iovec tcp4_l2_flags_iov [TCP_FRAMES_MEM][TCP_NUM_IOVS];
>  static struct iovec tcp6_l2_flags_iov [TCP_FRAMES_MEM][TCP_NUM_IOVS];
>  
> +/* Does the kernel support TCP_PEEK_OFF? */
> +static bool peek_offset_cap;
> +
>  /* sendmsg() to socket */
>  static struct iovec tcp_iov [UIO_MAXIOV];
>  
> @@ -524,6 +527,20 @@ static_assert(ARRAY_SIZE(tc_hash) >= FLOW_MAX,
>  int init_sock_pool4 [TCP_SOCK_POOL_SIZE];
>  int init_sock_pool6 [TCP_SOCK_POOL_SIZE];
>  
> +/**
> + * tcp_set_peek_offset() - Set SO_PEEK_OFF offset on a socket if supported
> + * @s: Socket to update
> + * @offset: Offset in bytes
> + */
> +static void tcp_set_peek_offset(int s, int offset)
> +{
> +	if (!peek_offset_cap)
> +		return;
> +
> +	if (setsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &offset, sizeof(offset)))
> +		err("Failed to set SO_PEEK_OFF to %i in socket %i", offset, s);

I feel like we need to reset the connection if we ever reach here.
Failing here means that SO_PEEK_OFF is now out of sync and we
apparently can't fix it.  If we keep the connection alive, we will
inevitably send incorrect data across it, which seems pretty bad.

Or, if we think this is unlikely enough, we could just die().

Otherwise, LGTM.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
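
To make the suggestion about resetting the connection concrete, here is a
minimal, untested sketch. It is not taken from the patch. It assumes the helper
returns an int instead of void, so that callers can react when setsockopt()
fails. struct conn_sketch, conn_update_peek_offset() and the reset flag are
hypothetical stand-ins for illustration; a real caller would reset the actual
connection at that point.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SO_PEEK_OFF
#define SO_PEEK_OFF 42	/* assumption: asm-generic Linux value, for old headers */
#endif

/* Does the kernel support SO_PEEK_OFF on TCP sockets? Set by a probe elsewhere */
bool peek_offset_cap;

/* Hypothetical stand-in for a connection, not passt's real structure */
struct conn_sketch {
	int sock;	/* connected socket */
	bool reset;	/* set once the connection must be torn down */
};

/**
 * set_peek_offset() - Set SO_PEEK_OFF offset on a socket if supported
 * @s:		Socket to update
 * @offset:	Offset in bytes
 *
 * Return: 0 on success or if unsupported, -1 if setsockopt() failed
 */
int set_peek_offset(int s, int offset)
{
	if (!peek_offset_cap)
		return 0;

	if (setsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &offset, sizeof(offset))) {
		fprintf(stderr,
			"Failed to set SO_PEEK_OFF to %i in socket %i: %s\n",
			offset, s, strerror(errno));
		return -1;
	}

	return 0;
}

/* Hypothetical caller: give up on the connection if the offset can't be set */
void conn_update_peek_offset(struct conn_sketch *conn, int offset)
{
	assert(conn);

	if (set_peek_offset(conn->sock, offset))
		conn->reset = true;	/* stand-in for resetting the connection */
}
```

Returning an error keeps the policy decision (reset the one connection, or
die()) with the caller, rather than burying it in the helper.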