From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202602 header.b=JyesMdKV; dkim-atps=neutral Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id CA68C5A0262 for ; Tue, 24 Mar 2026 03:11:26 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202602; t=1774318282; bh=RTasFzw2dFAKR4pzEU2Zg0+CThJZ4m3zEhJ7qdbwNgM=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=JyesMdKV5/PNHsu6inP0rq1MTqWtHuhbQA2XMlTheUyzhaYnSEg2qgy9FI+Xir+ac JN8mhJ8318Duu89MN2xqUXpg2FdMKFa4jjWG9hGq61oTmCweEi0wZfevWVx2/0wAvo oTPrbdBOPLfyWX05I4/fgT2/IM54znPejJbKzCC1YcLHTcNS19qnxwLehM6g7gD+7Z rg8v92QGMTl7oXbmO3qxxYe9kNeJG43ma07FCQ92wA+AAhWRuLRibbRuZgXmiMMLvf R3T70uRKzkJSXUjLmUMQ3afZCQQU37bnbDBWrhAjUxjz+opkO+v+M/wbnRElssSZxz 5CSyHOL4ud3Qg== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4fftnQ4mvlz4wCt; Tue, 24 Mar 2026 13:11:22 +1100 (AEDT) Date: Tue, 24 Mar 2026 13:11:09 +1100 From: David Gibson To: Laurent Vivier Subject: Re: [PATCH v4 2/5] udp_vu: Use iov_tail to manage virtqueue buffers Message-ID: References: <20260323143151.538673-1-lvivier@redhat.com> <20260323143151.538673-3-lvivier@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="jUx9bFY4ddBQpSo3" Content-Disposition: inline In-Reply-To: <20260323143151.538673-3-lvivier@redhat.com> Message-ID-Hash: A4JB64VZWNF53WO66VLNKCEJUBALRU5D X-Message-ID-Hash: A4JB64VZWNF53WO66VLNKCEJUBALRU5D X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --jUx9bFY4ddBQpSo3 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Mar 23, 2026 at 03:31:48PM +0100, Laurent Vivier wrote: > Replace direct iovec pointer arithmetic in UDP vhost-user handling with > iov_tail operations. >=20 > udp_vu_sock_recv() now takes an iov/cnt pair instead of using the > file-scoped iov_vu array, and returns the data length rather than the > iov count. Internally it uses IOV_TAIL() to create a view past the > L2/L3/L4 headers, and iov_tail_clone() to build the recvmsg() iovec, > removing the manual pointer offset and restore pattern. >=20 > udp_vu_prepare() and udp_vu_csum() take a const struct iov_tail * > instead of referencing iov_vu directly, making data flow explicit. >=20 > udp_vu_csum() uses iov_drop_header() and IOV_REMOVE_HEADER() to locate > the UDP header and payload, replacing manual offset calculations via > vu_payloadv4()/vu_payloadv6(). >=20 > Signed-off-by: Laurent Vivier Reviewed-by: David Gibson Though a couple of nits below. > --- > udp_vu.c | 124 +++++++++++++++++++++++++++++++------------------------ > 1 file changed, 71 insertions(+), 53 deletions(-) >=20 > diff --git a/udp_vu.c b/udp_vu.c > index 6a1e1696b19f..6638cb0ddc5f 100644 > --- a/udp_vu.c > +++ b/udp_vu.c > @@ -59,21 +59,26 @@ static size_t udp_vu_hdrlen(bool v6) > /** > * udp_vu_sock_recv() - Receive datagrams from socket into vhost-user bu= ffers > * @c: Execution context > + * @iov: IO vector for the frame (in/out) > + * @cnt: Number of IO vector entries (in/out) > * @vq: virtqueue to use to receive data > * @s: Socket to receive from > * @v6: Set for IPv6 connections > - * @dlen: Size of received data (output) > * > - * Return: number of iov entries used to store the datagram, 0 if the da= tagram > + * Return: size of received data, 0 if the datagram > * was discarded because the virtqueue is not ready, -1 on error > */ > -static int udp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq, in= t s, > - bool v6, ssize_t *dlen) > +static ssize_t udp_vu_sock_recv(const struct ctx *c, struct iovec *iov, > + size_t *cnt, unsigned *elem_used, > + struct vu_virtq *vq, int s, bool v6) > { > const struct vu_dev *vdev =3D c->vdev; > - int elem_cnt, elem_used, iov_used; > struct msghdr msg =3D { 0 }; > - size_t iov_cnt, hdrlen; > + struct iov_tail payload; > + size_t hdrlen, iov_used; > + unsigned elem_cnt; > + unsigned i, j; > + ssize_t dlen; > =20 > assert(!c->no_udp); > =20 > @@ -83,6 +88,7 @@ static int udp_vu_sock_recv(const struct ctx *c, struct= vu_virtq *vq, int s, > if (recvmsg(s, &msg, MSG_DONTWAIT) < 0) > debug_perror("Failed to discard datagram"); > =20 > + *cnt =3D 0; > return 0; > } > =20 > @@ -90,67 +96,70 @@ static int udp_vu_sock_recv(const struct ctx *c, stru= ct vu_virtq *vq, int s, > hdrlen =3D udp_vu_hdrlen(v6); > =20 > elem_cnt =3D vu_collect(vdev, vq, elem, ARRAY_SIZE(elem), > - iov_vu, ARRAY_SIZE(iov_vu), &iov_cnt, > + iov, *cnt, &iov_used, > IP_MAX_MTU + ETH_HLEN + VNET_HLEN, NULL); > if (elem_cnt =3D=3D 0) > return -1; > =20 > - assert((size_t)elem_cnt =3D=3D iov_cnt); /* one iovec per element */ > + assert((size_t)elem_cnt =3D=3D iov_used); /* one iovec per element */ > =20 > - /* reserve space for the headers */ > - assert(iov_vu[0].iov_len >=3D MAX(hdrlen, ETH_ZLEN + VNET_HLEN)); > - iov_vu[0].iov_base =3D (char *)iov_vu[0].iov_base + hdrlen; > - iov_vu[0].iov_len -=3D hdrlen; > + payload =3D IOV_TAIL(iov, iov_used, hdrlen); > =20 > - /* read data from the socket */ > - msg.msg_iov =3D iov_vu; > - msg.msg_iovlen =3D iov_cnt; > + struct iovec msg_iov[payload.cnt]; > + msg.msg_iov =3D msg_iov; > + msg.msg_iovlen =3D iov_tail_clone(msg.msg_iov, payload.cnt, &payload); > =20 > - *dlen =3D recvmsg(s, &msg, 0); > - if (*dlen < 0) { > + /* read data from the socket */ > + dlen =3D recvmsg(s, &msg, 0); > + if (dlen < 0) { > vu_queue_rewind(vq, elem_cnt); > return -1; > } > =20 > - /* restore the pointer to the headers address */ > - iov_vu[0].iov_base =3D (char *)iov_vu[0].iov_base - hdrlen; > - iov_vu[0].iov_len +=3D hdrlen; > + *cnt =3D vu_pad(iov, iov_used, 0, dlen + hdrlen); > =20 > - iov_used =3D vu_pad(iov_vu, iov_cnt, 0, *dlen + hdrlen); > - elem_used =3D iov_used; /* one iovec per element */ > + *elem_used =3D 0; > + for (i =3D 0, j =3D 0; j < *cnt && i < elem_cnt; i++) { Nit, this would be slightly easier to read with the i condition before the j condition. > + if (j + elem[i].in_num > *cnt) > + elem[i].in_num =3D *cnt - j; > + j +=3D elem[i].in_num; > + (*elem_used)++; > + } > =20 > - vu_set_vnethdr(iov_vu[0].iov_base, elem_used); > + vu_set_vnethdr(iov[0].iov_base, *elem_used); > =20 > /* release unused buffers */ > - vu_queue_rewind(vq, elem_cnt - elem_used); > + vu_queue_rewind(vq, elem_cnt - *elem_used); > =20 > - return iov_used; > + return dlen; > } > =20 > /** > * udp_vu_prepare() - Prepare the packet header > * @c: Execution context > + * @data: IO vector tail for the frame > * @toside: Address information for one side of the flow > * @dlen: Packet data length > * > * Return: Layer-4 length > */ > -static size_t udp_vu_prepare(const struct ctx *c, > +static size_t udp_vu_prepare(const struct ctx *c, const struct iov_tail = *data, > const struct flowside *toside, ssize_t dlen) > { > + const struct iovec *iov =3D data->iov; > struct ethhdr *eh; > size_t l4len; > =20 > /* ethernet header */ > - eh =3D vu_eth(iov_vu[0].iov_base); > + eh =3D vu_eth(iov[0].iov_base); > =20 > memcpy(eh->h_dest, c->guest_mac, sizeof(eh->h_dest)); > memcpy(eh->h_source, c->our_tap_mac, sizeof(eh->h_source)); > =20 > /* initialize header */ > if (inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr)) { > - struct iphdr *iph =3D vu_ip(iov_vu[0].iov_base); > - struct udp_payload_t *bp =3D vu_payloadv4(iov_vu[0].iov_base); > + struct iphdr *iph =3D vu_ip(iov[0].iov_base); > + struct udp_payload_t *bp =3D vu_payloadv4(iov[0].iov_base); > =20 > eh->h_proto =3D htons(ETH_P_IP); > =20 > @@ -158,8 +167,8 @@ static size_t udp_vu_prepare(const struct ctx *c, > =20 > l4len =3D udp_update_hdr4(iph, bp, toside, dlen, true); > } else { > - struct ipv6hdr *ip6h =3D vu_ip(iov_vu[0].iov_base); > - struct udp_payload_t *bp =3D vu_payloadv6(iov_vu[0].iov_base); > + struct ipv6hdr *ip6h =3D vu_ip(iov[0].iov_base); > + struct udp_payload_t *bp =3D vu_payloadv6(iov[0].iov_base); > =20 > eh->h_proto =3D htons(ETH_P_IPV6); > =20 > @@ -174,25 +183,29 @@ static size_t udp_vu_prepare(const struct ctx *c, > /** > * udp_vu_csum() - Calculate and set checksum for a UDP packet > * @toside: Address information for one side of the flow > - * @iov_used: Number of used iov_vu items > + * @data: IO vector tail for the frame (including vnet header) > */ > -static void udp_vu_csum(const struct flowside *toside, int iov_used) > +static void udp_vu_csum(const struct flowside *toside, > + const struct iov_tail *data) > { > const struct in_addr *src4 =3D inany_v4(&toside->oaddr); > const struct in_addr *dst4 =3D inany_v4(&toside->eaddr); > - char *base =3D iov_vu[0].iov_base; > - struct udp_payload_t *bp; > - struct iov_tail data; > + struct iov_tail payload =3D *data; > + struct udphdr *uh, uh_storage; > + bool ipv4 =3D src4 && dst4; > + > + IOV_DROP_HEADER(&payload, struct virtio_net_hdr_mrg_rxbuf); > + IOV_DROP_HEADER(&payload, struct ethhdr); > + if (ipv4) > + IOV_DROP_HEADER(&payload, struct iphdr); > + else > + IOV_DROP_HEADER(&payload, struct ipv6hdr); > + uh =3D IOV_REMOVE_HEADER(&payload, uh_storage); If uh_storage is actually used, this function is not correct (we never write it back to the iov). That doesn't add any limitation that wasn't already there, but it's arguably less obvious. I'm guessing you address this later in the series, but maybe a FIXME comment? > - if (src4 && dst4) { > - bp =3D vu_payloadv4(base); > - data =3D IOV_TAIL(iov_vu, iov_used, (char *)&bp->data - base); > - csum_udp4(&bp->uh, *src4, *dst4, &data); > - } else { > - bp =3D vu_payloadv6(base); > - data =3D IOV_TAIL(iov_vu, iov_used, (char *)&bp->data - base); > - csum_udp6(&bp->uh, &toside->oaddr.a6, &toside->eaddr.a6, &data); > - } > + if (ipv4) > + csum_udp4(uh, *src4, *dst4, &payload); > + else > + csum_udp6(uh, &toside->oaddr.a6, &toside->eaddr.a6, &payload); > } > =20 > /** > @@ -208,23 +221,28 @@ void udp_vu_sock_to_tap(const struct ctx *c, int s,= int n, flow_sidx_t tosidx) > bool v6 =3D !(inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr)); > struct vu_dev *vdev =3D c->vdev; > struct vu_virtq *vq =3D &vdev->vq[VHOST_USER_RX_QUEUE]; > + struct iov_tail data; > int i; > =20 > for (i =3D 0; i < n; i++) { > + unsigned elem_used; > + size_t iov_cnt; > ssize_t dlen; > - int iov_used; > =20 > - iov_used =3D udp_vu_sock_recv(c, vq, s, v6, &dlen); > - if (iov_used < 0) > + iov_cnt =3D ARRAY_SIZE(iov_vu); > + dlen =3D udp_vu_sock_recv(c, iov_vu, &iov_cnt, &elem_used, vq, > + s, v6); > + if (dlen < 0) > break; > =20 > - if (iov_used > 0) { > - udp_vu_prepare(c, toside, dlen); > + if (iov_cnt > 0) { > + data =3D IOV_TAIL(iov_vu, iov_cnt, 0); > + udp_vu_prepare(c, &data, toside, dlen); > if (*c->pcap) { > - udp_vu_csum(toside, iov_used); > - pcap_iov(iov_vu, iov_used, VNET_HLEN); > + udp_vu_csum(toside, &data); > + pcap_iov(data.iov, data.cnt, VNET_HLEN); > } > - vu_flush(vdev, vq, elem, iov_used); > + vu_flush(vdev, vq, elem, elem_used); > } > } > } > --=20 > 2.53.0 >=20 --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --jUx9bFY4ddBQpSo3 Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmnB8rwACgkQzQJF27ox 2GdB3hAAqFcklMJJtCXl4mGOi01OxZqvCSOnoL9oUeMeECjaKzo9lFP7mvNyBg59 Y5EmljhoOY4au6jZltrcAWVuN2KAnxc6Q0MErLG7D0+nY4N7b9Yj2BQXac/mplbS poIA1QYIdF599UI/qOB4RbhHzDPOoGgZYcK7kjPNz2nhySFEigsBnkuMMlkgaGSa TIyMZTZXcy8A5L6ue8ZdwlK+UAGz0FtO9iNKeUILj6sIeim7R8Q3WvJIFkS3rOSL TYOwBbV97DfGOVBIkYYF91QOTpt5uvIsNHx3cHafosg0q1SWnq82bGjkO4CuXUPM /mNOhhq4xM1CGjo4OOowdqgBzQY6RHAhi70gQ4i4zMeQWmD1chFipZZjc691R2bH HDeS4RMamfsHBDB7yNNb5Lp4Ao1J2weQMraOvSV88mfJri/+vBbhly0mqvlm5Nmy QNOt15RWgZkpN5NwQ/AfEElP4GXSTzlftWdppc/q4aWCzwxaT4m1qsVn4juoXqCW 93W3hRLza5DyhtjMsKpNW5AZfAFmpN3wfvIHjOAaahtdXPyHA/bkbOlC7jc9LX++ w8fNC+seT6jtIW4BSfvR9/LMl5SzIN64zIlXYE/RPn3B8uIERaTR7swkTwimokec KZ1aBj+p8Dls/dhhFFGH4aQPkuW61LyQxZf1P7ktpddFixcJTAA= =tkQo -----END PGP SIGNATURE----- --jUx9bFY4ddBQpSo3--