From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202602 header.b=AQwF2fHr; dkim-atps=neutral Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 193395A0265 for ; Thu, 12 Mar 2026 03:40:41 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202602; t=1773283237; bh=IIe+PXQHZc7Hlo86g7I3OXt3LumpU+YpntsE+650HQY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=AQwF2fHr19qM9aZq5mS8Yp80TnyL4SNuE8fzLLMSLsiyrlMo57hzncxdkUxiBEFd2 Kg+hBE1RnyVbu+TTYKpeyMjEOidvKeiRgS4OngrYcSi3eiKyYuEDispEJSiVTqO2gG 5sWA9KYHTyY9VWAHNqMByqmY/3W7i7rvuthchsuDmUEMj8j8KVOAu5TEKUpb5le0m5 m2NuiNwGluRrVg7It9dlUsm2AWFCvbsFrxTY8Bc5Z2QJq2B6fL0fZ84t3UYo3jEyy2 HC3nr8uc5YmhJg8yw/1uNmF3xe5STSaT1+u1bcfuz2kUJVMMJOiIJHScqPGFx0D7fY Zn7atnfzeTm0w== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4fWX0j2C2cz4wDJ; Thu, 12 Mar 2026 13:40:37 +1100 (AEDT) Date: Thu, 12 Mar 2026 13:38:40 +1100 From: David Gibson To: Laurent Vivier Subject: Re: [PATCH v2 04/13] udp_vu: Use iov_tail to manage virtqueue buffers Message-ID: References: <20260309094744.1907754-1-lvivier@redhat.com> <20260309094744.1907754-5-lvivier@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="BGCXgNpjmyFtkp3M" Content-Disposition: inline In-Reply-To: <20260309094744.1907754-5-lvivier@redhat.com> Message-ID-Hash: VP5Z7ASGYZNKYJAYQBTVX5OAK4FYWHAN X-Message-ID-Hash: VP5Z7ASGYZNKYJAYQBTVX5OAK4FYWHAN X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --BGCXgNpjmyFtkp3M Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Mar 09, 2026 at 10:47:35AM +0100, Laurent Vivier wrote: > Replace direct iovec pointer arithmetic in UDP vhost-user handling with > iov_tail operations. >=20 > udp_vu_sock_recv() now takes an iov/cnt pair instead of using the > file-scoped iov_vu array, and returns the data length rather than the > iov count. Internally it uses iov_drop_header() to skip past L2/L3/L4 > headers before receiving, and iov_tail_clone() to build the recvmsg() > iovec, removing the manual pointer offset and restore pattern. >=20 > udp_vu_prepare() and udp_vu_csum() take a const struct iov_tail * > instead of referencing iov_vu directly, making data flow explicit. >=20 > udp_vu_csum() uses iov_drop_header() and IOV_REMOVE_HEADER() to locate > the UDP header and payload, replacing manual offset calculations via > vu_payloadv4()/vu_payloadv6(). >=20 > Signed-off-by: Laurent Vivier Minor notes only. > --- > udp_vu.c | 111 ++++++++++++++++++++++++++++--------------------------- > 1 file changed, 57 insertions(+), 54 deletions(-) >=20 > diff --git a/udp_vu.c b/udp_vu.c > index 439f2cb399b7..a39254776099 100644 > --- a/udp_vu.c > +++ b/udp_vu.c > @@ -59,21 +59,25 @@ static size_t udp_vu_hdrlen(bool v6) > /** > * udp_vu_sock_recv() - Receive datagrams from socket into vhost-user bu= ffers > * @c: Execution context > + * @iov: IO vector for the frame (modified on output) > + * @cnt: Number of IO vector entries (in/out) Nit: "modified on output" and "in/out" are different ways of saying the same thing, yes? > * @vq: virtqueue to use to receive data > * @s: Socket to receive from > * @v6: Set for IPv6 connections > - * @dlen: Size of received data (output) > * > - * Return: number of iov entries used to store the datagram, 0 if the da= tagram > + * Return: size of received data, 0 if the datagram > * was discarded because the virtqueue is not ready, -1 on error > */ > -static int udp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq, in= t s, > - bool v6, ssize_t *dlen) > +static ssize_t udp_vu_sock_recv(const struct ctx *c, struct iovec *iov, > + size_t *cnt, struct vu_virtq *vq, int s, > + bool v6) > { > const struct vu_dev *vdev =3D c->vdev; > struct msghdr msg =3D { 0 }; > - int iov_cnt, iov_used; > + struct iov_tail payload; > size_t hdrlen; > + ssize_t dlen; > + int iov_cnt; > =20 > ASSERT(!c->no_udp); > =20 > @@ -83,78 +87,74 @@ static int udp_vu_sock_recv(const struct ctx *c, stru= ct vu_virtq *vq, int s, > if (recvmsg(s, &msg, MSG_DONTWAIT) < 0) > debug_perror("Failed to discard datagram"); > =20 > + *cnt =3D 0; > return 0; > } > =20 > /* compute L2 header length */ > hdrlen =3D udp_vu_hdrlen(v6); > =20 > - vu_init_elem(elem, iov_vu, ARRAY_SIZE(elem)); > + vu_init_elem(elem, iov, *cnt); > =20 > iov_cnt =3D vu_collect(vdev, vq, elem, ARRAY_SIZE(elem), > IP_MAX_MTU + ETH_HLEN + VNET_HLEN, NULL); > if (iov_cnt =3D=3D 0) > return -1; > =20 > - /* reserve space for the headers */ > - ASSERT(iov_vu[0].iov_len >=3D MAX(hdrlen, ETH_ZLEN + VNET_HLEN)); > + payload =3D IOV_TAIL(iov, iov_cnt, hdrlen); > =20 > - iov_vu[0].iov_base =3D (char *)iov_vu[0].iov_base + hdrlen; > - iov_vu[0].iov_len -=3D hdrlen; > + struct iovec msg_iov[payload.cnt]; We generally avoid inline declarations, although C11 does allow them. > + msg.msg_iov =3D msg_iov; > + msg.msg_iovlen =3D iov_tail_clone(msg.msg_iov, payload.cnt, &payload); > /* read data from the socket */ > - msg.msg_iov =3D iov_vu; > - msg.msg_iovlen =3D iov_cnt; > - > - *dlen =3D recvmsg(s, &msg, 0); > - if (*dlen < 0) { > + dlen =3D recvmsg(s, &msg, 0); > + if (dlen < 0) { > vu_queue_rewind(vq, iov_cnt); > return -1; > } > =20 > - /* restore the pointer to the headers address */ > - iov_vu[0].iov_base =3D (char *)iov_vu[0].iov_base - hdrlen; > - iov_vu[0].iov_len +=3D hdrlen; > - > /* Pad short frames to ETH_ZLEN */ > - if (ETH_ZLEN + VNET_HLEN > *dlen + hdrlen) { > - iov_memset(iov_vu, iov_cnt, *dlen + hdrlen, 0, > - ETH_ZLEN + VNET_HLEN - (*dlen + hdrlen)); > + if (ETH_ZLEN + VNET_HLEN > dlen + hdrlen) { > + iov_memset(iov, iov_cnt, dlen + hdrlen, 0, > + ETH_ZLEN + VNET_HLEN - (dlen + hdrlen)); > } > - iov_used =3D iov_truncate(iov_vu, iov_cnt, *dlen + hdrlen); > + *cnt =3D iov_truncate(iov, iov_cnt, dlen + hdrlen); > - vu_set_vnethdr(iov_vu[0].iov_base, iov_used); > + vu_set_vnethdr(iov[0].iov_base, *cnt); > =20 > /* release unused buffers */ > - vu_queue_rewind(vq, iov_cnt - iov_used); > + vu_queue_rewind(vq, iov_cnt - *cnt); > =20 > - return iov_used; > + return dlen; > } > =20 > /** > * udp_vu_prepare() - Prepare the packet header > * @c: Execution context > + * @data: IO vector tail for the frame > * @toside: Address information for one side of the flow > * @dlen: Packet data length > * > * Return: Layer-4 length > */ > -static size_t udp_vu_prepare(const struct ctx *c, > +static size_t udp_vu_prepare(const struct ctx *c, const struct iov_tail = *data, > const struct flowside *toside, ssize_t dlen) > { > + const struct iovec *iov =3D data->iov; > struct ethhdr *eh; > size_t l4len; > =20 > /* ethernet header */ > - eh =3D vu_eth(iov_vu[0].iov_base); > + eh =3D vu_eth(iov[0].iov_base); > =20 > memcpy(eh->h_dest, c->guest_mac, sizeof(eh->h_dest)); > memcpy(eh->h_source, c->our_tap_mac, sizeof(eh->h_source)); > =20 > /* initialize header */ > if (inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr)) { > - struct iphdr *iph =3D vu_ip(iov_vu[0].iov_base); > - struct udp_payload_t *bp =3D vu_payloadv4(iov_vu[0].iov_base); > + struct iphdr *iph =3D vu_ip(iov[0].iov_base); > + struct udp_payload_t *bp =3D vu_payloadv4(iov[0].iov_base); > =20 > eh->h_proto =3D htons(ETH_P_IP); > =20 > @@ -162,8 +162,8 @@ static size_t udp_vu_prepare(const struct ctx *c, > =20 > l4len =3D udp_update_hdr4(iph, bp, toside, dlen, true); > } else { > - struct ipv6hdr *ip6h =3D vu_ip(iov_vu[0].iov_base); > - struct udp_payload_t *bp =3D vu_payloadv6(iov_vu[0].iov_base); > + struct ipv6hdr *ip6h =3D vu_ip(iov[0].iov_base); > + struct udp_payload_t *bp =3D vu_payloadv6(iov[0].iov_base); > =20 > eh->h_proto =3D htons(ETH_P_IPV6); > =20 > @@ -178,25 +178,25 @@ static size_t udp_vu_prepare(const struct ctx *c, > /** > * udp_vu_csum() - Calculate and set checksum for a UDP packet > * @toside: Address information for one side of the flow > - * @iov_used: Number of used iov_vu items > + * @data: IO vector tail for the frame With or without the VU header? > */ > -static void udp_vu_csum(const struct flowside *toside, int iov_used) > +static void udp_vu_csum(const struct flowside *toside, > + const struct iov_tail *data) > { > const struct in_addr *src4 =3D inany_v4(&toside->oaddr); > const struct in_addr *dst4 =3D inany_v4(&toside->eaddr); > - char *base =3D iov_vu[0].iov_base; > - struct udp_payload_t *bp; > - struct iov_tail data; > + struct iov_tail payload =3D *data; > + struct udphdr *uh, uh_storage; > + bool ipv4 =3D src4 && dst4; > =20 > - if (src4 && dst4) { > - bp =3D vu_payloadv4(base); > - data =3D IOV_TAIL(iov_vu, iov_used, (char *)&bp->data - base); > - csum_udp4(&bp->uh, *src4, *dst4, &data); > - } else { > - bp =3D vu_payloadv6(base); > - data =3D IOV_TAIL(iov_vu, iov_used, (char *)&bp->data - base); > - csum_udp6(&bp->uh, &toside->oaddr.a6, &toside->eaddr.a6, &data); > - } > + iov_drop_header(&payload, > + udp_vu_hdrlen(!ipv4) - sizeof(struct udphdr)); This construction is a bit awkward, better to IOV_DROP_HEADER() on the ethernet, then the IP header? > + uh =3D IOV_REMOVE_HEADER(&payload, uh_storage); > + > + if (ipv4) > + csum_udp4(uh, *src4, *dst4, &payload); > + else > + csum_udp6(uh, &toside->oaddr.a6, &toside->eaddr.a6, &payload); > } > =20 > /** > @@ -212,23 +212,26 @@ void udp_vu_sock_to_tap(const struct ctx *c, int s,= int n, flow_sidx_t tosidx) > bool v6 =3D !(inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr)); > struct vu_dev *vdev =3D c->vdev; > struct vu_virtq *vq =3D &vdev->vq[VHOST_USER_RX_QUEUE]; > + struct iov_tail data; > int i; > =20 > for (i =3D 0; i < n; i++) { > + size_t iov_cnt; > ssize_t dlen; > - int iov_used; > =20 > - iov_used =3D udp_vu_sock_recv(c, vq, s, v6, &dlen); > - if (iov_used < 0) > + iov_cnt =3D VIRTQUEUE_MAX_SIZE; > + dlen =3D udp_vu_sock_recv(c, iov_vu, &iov_cnt, vq, s, v6); > + if (dlen < 0) > break; > =20 > - if (iov_used > 0) { > - udp_vu_prepare(c, toside, dlen); > + if (iov_cnt > 0) { > + data =3D IOV_TAIL(iov_vu, iov_cnt, 0); > + udp_vu_prepare(c, &data, toside, dlen); > if (*c->pcap) { > - udp_vu_csum(toside, iov_used); > - pcap_iov(iov_vu, iov_used, VNET_HLEN); > + udp_vu_csum(toside, &data); > + pcap_iov(data.iov, data.cnt, VNET_HLEN); > } > - vu_flush(vdev, vq, elem, iov_used); > + vu_flush(vdev, vq, elem, data.cnt); > } > } > } > --=20 > 2.53.0 >=20 --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --BGCXgNpjmyFtkp3M Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmmyJy8ACgkQzQJF27ox 2Gf6LxAAhzN/UASTtGL/xuyJbZcBc2nAwbcBYEv7b3y8UC5fxWUdISX75l8stunD CaDJckCl2ahIDONwlse2gRvXJ8b87M34Txjh9cMtSYjNdIYqFQt28O0+ajZKrm2K BWEzbsOMZ2vFBhhXpCIK6SRjotPXHicCqf8dorUXxuUi3Pxvot5xoSlKq6c3UB/o wi4ZYmg3wpFucY60EOV671x6sdRQzlTtsnn11x0aqtbpopdA+LC87i6MGT4GmOgs 6t6r1YRYti3O14OGmYEhxiRCF391RZdxn49MEIxvWdDyg2K4KDUwUKA3BnrYynMg GqdZLh5U9hGAvbqEMZH3CbGEDIvPVsxLwSLYbxeVmPJXRyDgopAkWScebVpZsEjY bVVptTFfs/PM9KjxkfCY7ldEtO6v/v6k8SEldqDb77Sk6vAbFVGmXAmrcs/JZgeX YnolT/F08F3EXw3bxefTVqXBKq36h8EAYicP7h6Oj5uarsqArtPr0cKkGKpjZPyn xLux0S8mweaAY2eqridBDTzLNgbIXDzd2dAnjFc0MvO39EAoT6IohwkwwMLQ3IPM czz9BW7yaPyuHD4lOR72nfvAwav7JqvYyUTaP6VfiHhMLMa2NqaBp5k5fPQINpP3 YqxAzzsr70efEHVp+sAxR38D3XETd47Gm+e/L28HxO/lBBvdrL4= =1PNl -----END PGP SIGNATURE----- --BGCXgNpjmyFtkp3M--