From: Stefano Brivio <sbrivio@redhat.com>
To: Laurent Vivier <lvivier@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v6 1/3] udp_vu: Allow virtqueue elements with multiple iovec entries
Date: Fri, 03 Apr 2026 13:53:23 +0200 (CEST) [thread overview]
Message-ID: <20260403135309.1d84f5dd@elisabeth> (raw)
In-Reply-To: <20260401192326.1783350-2-lvivier@redhat.com>
On Wed, 1 Apr 2026 21:23:24 +0200
Laurent Vivier <lvivier@redhat.com> wrote:
> The previous code assumed a 1:1 mapping between virtqueue elements and
> iovec entries (enforced by an assert). Drop that assumption to allow
> elements that span multiple iovecs: track elem_used separately by
> walking the element list against the iov count returned after padding.
> This also fixes vu_queue_rewind() and vu_flush() to use the element
> count rather than the iov count.
>
> Use iov_tail_clone() in udp_vu_sock_recv() to handle header offset,
> replacing the manual base/len adjustment and restore pattern.
>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
> udp_vu.c | 29 ++++++++++++++---------------
> 1 file changed, 14 insertions(+), 15 deletions(-)
>
> diff --git a/udp_vu.c b/udp_vu.c
> index 30af64034516..5608a3a96ff5 100644
> --- a/udp_vu.c
> +++ b/udp_vu.c
> @@ -64,30 +64,25 @@ static size_t udp_vu_hdrlen(bool v6)
> */
> static ssize_t udp_vu_sock_recv(struct iovec *iov, size_t *cnt, int s, bool v6)
> {
> + struct iovec msg_iov[*cnt];
Variable-length Arrays (VLAs) are allowed starting from C99 but we
should really really avoid them. If 'cnt' is big enough, we risk
writing all over the place. That's the main reason why they were more
or less banned from the Linux kernel some years ago and eventually
eradicated:
https://lore.kernel.org/lkml/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com/
https://lore.kernel.org/lkml/20181028172401.GA41102@beast/
Can we use VIRTQUEUE_MAX_SIZE as upper bound like udp_vu_sock_to_tap()
does?
We should probably add -Wvla by the way.
> struct msghdr msg = { 0 };
> + struct iov_tail payload;
> size_t hdrlen, iov_used;
> ssize_t dlen;
>
> /* compute L2 header length */
> hdrlen = udp_vu_hdrlen(v6);
>
> - /* reserve space for the headers */
> - assert(iov[0].iov_len >= MAX(hdrlen, ETH_ZLEN + VNET_HLEN));
> - iov[0].iov_base = (char *)iov[0].iov_base + hdrlen;
> - iov[0].iov_len -= hdrlen;
> + payload = IOV_TAIL(iov, *cnt, hdrlen);
>
> - /* read data from the socket */
> - msg.msg_iov = iov;
> - msg.msg_iovlen = *cnt;
> + msg.msg_iov = msg_iov;
> + msg.msg_iovlen = iov_tail_clone(msg.msg_iov, payload.cnt, &payload);
>
> + /* read data from the socket */
> dlen = recvmsg(s, &msg, 0);
> if (dlen < 0)
> return -1;
>
> - /* restore the pointer to the headers address */
> - iov[0].iov_base = (char *)iov[0].iov_base - hdrlen;
> - iov[0].iov_len += hdrlen;
> -
> iov_used = iov_skip_bytes(iov, *cnt,
> MAX(dlen + hdrlen, VNET_HLEN + ETH_ZLEN),
> NULL);
> @@ -205,7 +200,7 @@ void udp_vu_sock_to_tap(const struct ctx *c, int s, int n, flow_sidx_t tosidx)
> }
>
> for (i = 0; i < n; i++) {
> - unsigned elem_cnt, elem_used;
> + unsigned elem_cnt, elem_used, j, k;
> size_t iov_cnt;
> ssize_t dlen;
>
> @@ -215,15 +210,19 @@ void udp_vu_sock_to_tap(const struct ctx *c, int s, int n, flow_sidx_t tosidx)
> if (elem_cnt == 0)
> break;
>
> - assert((size_t)elem_cnt == iov_cnt); /* one iovec per element */
> -
> dlen = udp_vu_sock_recv(iov_vu, &iov_cnt, s, v6);
> if (dlen < 0) {
> vu_queue_rewind(vq, elem_cnt);
> break;
> }
>
> - elem_used = iov_cnt; /* one iovec per element */
> + elem_used = 0;
> + for (j = 0, k = 0; k < iov_cnt && j < elem_cnt; j++) {
> + if (k + elem[j].in_num > iov_cnt)
> + elem[j].in_num = iov_cnt - k;
I think it would be more intuitive to write it like this:
size_t iov_still_needed = iov_cnt - k;
if (elem[j].in_num > iov_still_needed)
elem[j].in_num = iov_still_needed;
...otherwise it looks like 'k + elem[j].in_num' needs to satisfy some
kind of bound, but that's not the case.
> + k += elem[j].in_num;
> + elem_used++;
> + }
>
> /* release unused buffers */
> vu_queue_rewind(vq, elem_cnt - elem_used);
--
Stefano
next prev parent reply other threads:[~2026-04-03 11:53 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-01 19:23 [PATCH v6 0/3] vhost-user,udp: Handle multiple iovec entries per virtqueue element Laurent Vivier
2026-04-01 19:23 ` [PATCH v6 1/3] udp_vu: Allow virtqueue elements with multiple iovec entries Laurent Vivier
2026-04-03 11:53 ` Stefano Brivio [this message]
2026-04-03 15:18 ` Laurent Vivier
2026-04-03 16:59 ` Stefano Brivio
2026-04-03 17:14 ` Laurent Vivier
2026-04-01 19:23 ` [PATCH v6 2/3] iov: Introduce IOV_PUSH_HEADER() macro Laurent Vivier
2026-04-01 19:23 ` [PATCH v6 3/3] udp: Pass iov_tail to udp_update_hdr4()/udp_update_hdr6() Laurent Vivier
2026-04-03 11:53 ` Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260403135309.1d84f5dd@elisabeth \
--to=sbrivio@redhat.com \
--cc=lvivier@redhat.com \
--cc=passt-dev@passt.top \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).