From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202602 header.b=Vwp9PXnx; dkim-atps=neutral Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 77A295A0275 for ; Mon, 02 Mar 2026 01:13:46 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202602; t=1772410422; bh=IVwrRiEjD5ymfxQmRzbtSQclNu+iGyJ3CztClXGrZ7Y=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=Vwp9PXnxtVBu5BQrNkxoPICNyKWjUvc5hn7rlxnSDvX+it9SRr0z7fY5o+CZNVcK8 N/bbDgGo4PSR+DpiN/RcQr8UnLkHUOG+7B3iRbSXXfSjNggHn3yKi+oPdmy/rHSrP7 xTCQSPZuy377KI0JdZIJgpCAd1bQQgseYMvwZe7dDbilCblo4PxvErIBkeZbJnHoJF pALEQi4LVrbyO42a/SQeraZ+XK2GKq/PSfW8oCvD4JqFLN89Va6wuGeoJzrEzL9KkP RnyvT9l5uzsL7IoAIrnZRf2zsPafwU+fli5I8gzDEfTJWJPQXnf8Fm6o6J78tMGsi5 TzIRLAaOif2Sw== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4fPKCp4D9Jz4wCP; Mon, 02 Mar 2026 11:13:42 +1100 (AEDT) Date: Mon, 2 Mar 2026 11:03:29 +1100 From: David Gibson To: Laurent Vivier Subject: Re: [PATCH 03/12] udp_vu: Use iov_tail to manage virtqueue buffers Message-ID: References: <20260227140330.2216753-1-lvivier@redhat.com> <20260227140330.2216753-4-lvivier@redhat.com> MIME-Version: 1.0 In-Reply-To: <20260227140330.2216753-4-lvivier@redhat.com> X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="V5HuraenfRznxWe3" Content-Disposition: inline Message-ID-Hash: Y6FSIOQ37N7H37DYFX76BW6AYQDMUDTS X-Message-ID-Hash: Y6FSIOQ37N7H37DYFX76BW6AYQDMUDTS X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --V5HuraenfRznxWe3 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Feb 27, 2026 at 03:03:21PM +0100, Laurent Vivier wrote: > Replace direct iovec pointer arithmetic in UDP vhost-user handling with > iov_tail operations introduced in the previous commit. >=20 > udp_vu_sock_recv() now takes a struct iov_tail and returns the received > data length rather than the number of iov entries used. It uses > iov_drop_header() to skip past L2/L3/L4 headers before receiving socket > data, iov_tail_truncate() to trim unused buffer space, and > iov_tail_zero_end() to zero-pad short frames instead of vu_pad(). >=20 > udp_vu_prepare() and udp_vu_csum() take a const struct iov_tail instead > of referencing the file-scoped iov_vu array directly, making data flow > explicit. >=20 > udp_vu_csum() uses iov_drop_header() and IOV_REMOVE_HEADER() to locate > the UDP header and payload, replacing manual offset calculations via > vu_payloadv4()/vu_payloadv6(). >=20 > Signed-off-by: Laurent Vivier > --- > iov.c | 2 - > udp_vu.c | 121 +++++++++++++++++++++++++++---------------------------- > 2 files changed, 60 insertions(+), 63 deletions(-) >=20 > diff --git a/iov.c b/iov.c > index cb4d6fef5567..8836305fb701 100644 > --- a/iov.c > +++ b/iov.c > @@ -175,7 +175,6 @@ bool iov_tail_prune(struct iov_tail *tail) > * @tail:=09IO vector tail (modified in place, including backing iovecs) > * @size:=09Maximum number of bytes to keep, relative to current tail of= fset > */ > -/* cppcheck-suppress unusedFunction */ > void iov_tail_truncate(struct iov_tail *tail, size_t size) > { > =09size_t i, off; > @@ -195,7 +194,6 @@ void iov_tail_truncate(struct iov_tail *tail, size_t = size) > * @tail:=09IO vector tail (backing buffers modified in place) > * @size:=09Number of leading bytes to preserve > */ > -/* cppcheck-suppress unusedFunction */ > void iov_tail_zero_end(struct iov_tail *tail, size_t size) > { > =09size_t i, off; > diff --git a/udp_vu.c b/udp_vu.c > index 6f6477f7d046..8f4d0aedac10 100644 > --- a/udp_vu.c > +++ b/udp_vu.c > @@ -59,21 +59,23 @@ static size_t udp_vu_hdrlen(bool v6) > /** > * udp_vu_sock_recv() - Receive datagrams from socket into vhost-user bu= ffers > * @c:=09=09Execution context > + * @data:=09IO vector tail for the frame (modified on output) > * @vq:=09=09virtqueue to use to receive data > * @s:=09=09Socket to receive from > * @v6:=09=09Set for IPv6 connections > - * @dlen:=09Size of received data (output) > * > - * Return: number of iov entries used to store the datagram, 0 if the da= tagram > + * Return: size of received data, 0 if the datagram > * was discarded because the virtqueue is not ready, -1 on error > */ > -static int udp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq, in= t s, > -=09=09=09 bool v6, ssize_t *dlen) > +static ssize_t udp_vu_sock_recv(const struct ctx *c, struct iov_tail *da= ta, > +=09=09=09=09struct vu_virtq *vq, int s, bool v6) > { > =09const struct vu_dev *vdev =3D c->vdev; > -=09int iov_cnt, idx, iov_used; > -=09size_t off, hdrlen, l2len; > =09struct msghdr msg =3D { 0 }; > +=09struct iov_tail payload; > +=09size_t hdrlen; > +=09ssize_t dlen; > +=09int iov_cnt; > =20 > =09ASSERT(!c->no_udp); > =20 > @@ -83,82 +85,77 @@ static int udp_vu_sock_recv(const struct ctx *c, stru= ct vu_virtq *vq, int s, > =09=09if (recvmsg(s, &msg, MSG_DONTWAIT) < 0) > =09=09=09debug_perror("Failed to discard datagram"); > =20 > +=09=09data->cnt =3D 0; > =09=09return 0; > =09} > =20 > =09/* compute L2 header length */ > =09hdrlen =3D udp_vu_hdrlen(v6); > =20 > -=09vu_init_elem(elem, iov_vu, ARRAY_SIZE(elem)); > +=09vu_init_elem(elem, (struct iovec *)data->iov, data->cnt); > =20 > =09iov_cnt =3D vu_collect(vdev, vq, elem, ARRAY_SIZE(elem), > =09=09=09 IP_MAX_MTU + ETH_HLEN + VNET_HLEN, NULL); > =09if (iov_cnt =3D=3D 0) > =09=09return -1; > =20 > +=09data->cnt =3D iov_cnt; Does something limit iov_cnt to be <=3D data->cnt? If so, it's not very obvious from here. If not, then the line above is clearly dangerous. > + > =09/* reserve space for the headers */ > -=09ASSERT(iov_vu[0].iov_len >=3D MAX(hdrlen, ETH_ZLEN + VNET_HLEN)); > -=09iov_vu[0].iov_base =3D (char *)iov_vu[0].iov_base + hdrlen; > -=09iov_vu[0].iov_len -=3D hdrlen; > +=09ASSERT(iov_tail_size(data) >=3D MAX(hdrlen, ETH_ZLEN + VNET_HLEN)); > =20 > -=09/* read data from the socket */ > -=09msg.msg_iov =3D iov_vu; > -=09msg.msg_iovlen =3D iov_cnt; > +=09payload =3D *data; > +=09iov_drop_header(&payload, hdrlen); > + > +=09struct iovec msg_iov[payload.cnt]; > +=09msg.msg_iov =3D msg_iov; > +=09msg.msg_iovlen =3D iov_tail_clone(msg.msg_iov, payload.cnt, &payload)= ; > =20 > -=09*dlen =3D recvmsg(s, &msg, 0); > -=09if (*dlen < 0) { > +=09/* read data from the socket */ > +=09dlen =3D recvmsg(s, &msg, 0); > +=09if (dlen < 0) { > =09=09vu_queue_rewind(vq, iov_cnt); > =09=09return -1; > =09} > =20 > -=09/* restore the pointer to the headers address */ > -=09iov_vu[0].iov_base =3D (char *)iov_vu[0].iov_base - hdrlen; > -=09iov_vu[0].iov_len +=3D hdrlen; > - > -=09/* count the numbers of buffer filled by recvmsg() */ > -=09idx =3D iov_skip_bytes(iov_vu, iov_cnt, *dlen + hdrlen, &off); > - > -=09/* adjust last iov length */ > -=09if (idx < iov_cnt) > -=09=09iov_vu[idx].iov_len =3D off; > -=09iov_used =3D idx + !!off; > - > -=09/* pad frame to 60 bytes: first buffer is at least ETH_ZLEN long */ > -=09l2len =3D *dlen + hdrlen - VNET_HLEN; > -=09vu_pad(&iov_vu[0], l2len); > +=09iov_tail_truncate(data, MAX(dlen + hdrlen, ETH_ZLEN + VNET_HLEN)); > +=09iov_tail_zero_end(data, dlen + hdrlen); > +=09iov_tail_truncate(data, dlen + hdrlen); Zeroing the tail, then truncating it seems kind of weird. > -=09vu_set_vnethdr(vdev, iov_vu[0].iov_base, iov_used); > +=09vu_set_vnethdr(vdev, data->iov[0].iov_base, data->cnt); > =20 > =09/* release unused buffers */ > -=09vu_queue_rewind(vq, iov_cnt - iov_used); > +=09vu_queue_rewind(vq, iov_cnt - data->cnt); > =20 > -=09return iov_used; > +=09return dlen; > } > =20 > /** > * udp_vu_prepare() - Prepare the packet header > * @c:=09=09Execution context > + * @data:=09IO vector tail for the frame > * @toside:=09Address information for one side of the flow > * @dlen:=09Packet data length > * > * Return: Layer-4 length > */ > -static size_t udp_vu_prepare(const struct ctx *c, > +static size_t udp_vu_prepare(const struct ctx *c, const struct iov_tail = *data, > =09=09=09 const struct flowside *toside, ssize_t dlen) > { > +=09const struct iovec *iov =3D data->iov; > =09struct ethhdr *eh; > =09size_t l4len; > =20 > =09/* ethernet header */ > -=09eh =3D vu_eth(iov_vu[0].iov_base); > +=09eh =3D vu_eth(iov[0].iov_base); Now that you have an iov_tail, could you clone it and use IOV_{PEEK,REMOVE}_HEADER() instead of the more specific vu_eth(), vu_ip() etc? > =20 > =09memcpy(eh->h_dest, c->guest_mac, sizeof(eh->h_dest)); > =09memcpy(eh->h_source, c->our_tap_mac, sizeof(eh->h_source)); > =20 > =09/* initialize header */ > =09if (inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr)) { > -=09=09struct iphdr *iph =3D vu_ip(iov_vu[0].iov_base); > -=09=09struct udp_payload_t *bp =3D vu_payloadv4(iov_vu[0].iov_base); > +=09=09struct iphdr *iph =3D vu_ip(iov[0].iov_base); > +=09=09struct udp_payload_t *bp =3D vu_payloadv4(iov[0].iov_base); > =20 > =09=09eh->h_proto =3D htons(ETH_P_IP); > =20 > @@ -166,8 +163,8 @@ static size_t udp_vu_prepare(const struct ctx *c, > =20 > =09=09l4len =3D udp_update_hdr4(iph, bp, toside, dlen, true); > =09} else { > -=09=09struct ipv6hdr *ip6h =3D vu_ip(iov_vu[0].iov_base); > -=09=09struct udp_payload_t *bp =3D vu_payloadv6(iov_vu[0].iov_base); > +=09=09struct ipv6hdr *ip6h =3D vu_ip(iov[0].iov_base); > +=09=09struct udp_payload_t *bp =3D vu_payloadv6(iov[0].iov_base); > =20 > =09=09eh->h_proto =3D htons(ETH_P_IPV6); > =20 > @@ -182,25 +179,25 @@ static size_t udp_vu_prepare(const struct ctx *c, > /** > * udp_vu_csum() - Calculate and set checksum for a UDP packet > * @toside:=09Address information for one side of the flow > - * @iov_used:=09Number of used iov_vu items > + * @data:=09IO vector tail for the frame > */ > -static void udp_vu_csum(const struct flowside *toside, int iov_used) > +static void udp_vu_csum(const struct flowside *toside, > +=09=09=09const struct iov_tail *data) > { > =09const struct in_addr *src4 =3D inany_v4(&toside->oaddr); > =09const struct in_addr *dst4 =3D inany_v4(&toside->eaddr); > -=09char *base =3D iov_vu[0].iov_base; > -=09struct udp_payload_t *bp; > -=09struct iov_tail data; > +=09struct iov_tail payload =3D *data; > +=09struct udphdr *uh, uh_storage; > +=09bool ipv4 =3D src4 && dst4; > =20 > -=09if (src4 && dst4) { > -=09=09bp =3D vu_payloadv4(base); > -=09=09data =3D IOV_TAIL(iov_vu, iov_used, (char *)&bp->data - base); > -=09=09csum_udp4(&bp->uh, *src4, *dst4, &data); > -=09} else { > -=09=09bp =3D vu_payloadv6(base); > -=09=09data =3D IOV_TAIL(iov_vu, iov_used, (char *)&bp->data - base); > -=09=09csum_udp6(&bp->uh, &toside->oaddr.a6, &toside->eaddr.a6, &data); > -=09} > +=09iov_drop_header(&payload, > +=09=09=09udp_vu_hdrlen(!ipv4) - sizeof(struct udphdr)); > +=09uh =3D IOV_REMOVE_HEADER(&payload, uh_storage); > + > +=09if (ipv4) > +=09=09csum_udp4(uh, *src4, *dst4, &payload); > +=09else > +=09=09csum_udp6(uh, &toside->oaddr.a6, &toside->eaddr.a6, &payload); > } > =20 > /** > @@ -216,23 +213,25 @@ void udp_vu_sock_to_tap(const struct ctx *c, int s,= int n, flow_sidx_t tosidx) > =09bool v6 =3D !(inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr)); > =09struct vu_dev *vdev =3D c->vdev; > =09struct vu_virtq *vq =3D &vdev->vq[VHOST_USER_RX_QUEUE]; > +=09struct iov_tail data; > =09int i; > =20 > =09for (i =3D 0; i < n; i++) { > =09=09ssize_t dlen; > -=09=09int iov_used; > =20 > -=09=09iov_used =3D udp_vu_sock_recv(c, vq, s, v6, &dlen); > -=09=09if (iov_used < 0) > +=09=09data =3D IOV_TAIL(iov_vu, VIRTQUEUE_MAX_SIZE, 0); > + > +=09=09dlen =3D udp_vu_sock_recv(c, &data, vq, s, v6); > +=09=09if (dlen < 0) > =09=09=09break; > =20 > -=09=09if (iov_used > 0) { > -=09=09=09udp_vu_prepare(c, toside, dlen); > +=09=09if (data.cnt > 0) { > +=09=09=09udp_vu_prepare(c, &data, toside, dlen); > =09=09=09if (*c->pcap) { > -=09=09=09=09udp_vu_csum(toside, iov_used); > -=09=09=09=09pcap_iov(iov_vu, iov_used, VNET_HLEN); > +=09=09=09=09udp_vu_csum(toside, &data); > +=09=09=09=09pcap_iov(data.iov, data.cnt, VNET_HLEN); > =09=09=09} > -=09=09=09vu_flush(vdev, vq, elem, iov_used); > +=09=09=09vu_flush(vdev, vq, elem, data.cnt); > =09=09} > =09} > } > --=20 > 2.53.0 >=20 --=20 David Gibson (he or they)=09| I'll have my music baroque, and my code david AT gibson.dropbear.id.au=09| minimalist, thank you, not the other way =09=09=09=09| around. http://www.ozlabs.org/~dgibson --V5HuraenfRznxWe3 Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmmk09AACgkQzQJF27ox 2Gchnw//TsGtajgszODdv7UkAmhql00YQN9l7OHXM888gb1IU53iouchPe8aqYXP a9HujvByKF5e/HbSr9yFBpWs8YUKKTD0H6IGPNC3fvhFgwpmO6iX8pqnpYo4vySl oMCICS9zYm2/03IyxjTZkJTjPXIca8lQbH6fcZSoGEmKvtYKN0wB+khAWHidZk8k OTRzieOEaJxeL95l2/4cmrMLWjkszPBOFfTQw3cN+io3GW5FR4hv79XDgjF8gmga pceMAhm1yk/Aqw/EJ4Hmnwn4nOkN9+WN1Vb0OGl/Md99jxwDd5SHUyMJLReUe9fc /UFgBgBdBvNrZF3DehZNHhZVi91Js/+6w9+Hjrp9WVgUwZaQeWNI82Kn0UNpReSA c0K5F13MKgxY5m8of6pXWZMNrmHDiJtYmxpFnSKl8qzcotIDmI5akH2wL67ydCCY SEKlfaNPmncsZ9EGJqoUaKUZ97kywMMqAL4WxXaflyVvOJMmgZ8oIfrlmDF9la6j uGiWCknbWWec505dJ2X/pbrElx7ghZ/+QX6YdsFWVrebFSBATG6kiKj4c8BfoPKW pkxiwJ7sR8JQ4YdZgzY3kN1di6A2nUaG71A93dxXR4ZtAm12DOfZXrtv8JQNjPSl 6BcVj/FoBgz5V2mCpS4A1vgPhoG0VrO53W9n6x9VG3Zr+g9jfcI= =O0Hz -----END PGP SIGNATURE----- --V5HuraenfRznxWe3--