On Wed, May 13, 2026 at 11:17:12AM +0200, Laurent Vivier wrote: > On 5/11/26 04:01, David Gibson wrote: > > On Thu, Apr 16, 2026 at 05:57:21PM +0200, Laurent Vivier wrote: > > > The previous per-protocol padding done by vu_pad() in tcp_vu.c and > > > udp_vu.c was only correct for single-buffer frames: it assumed the > > > padding area always fell within the first iov, writing past its end > > > with a plain memset(). > > > > > > It also required each caller to compute MAX(..., ETH_ZLEN + VNET_HLEN) > > > for vu_collect() and to call vu_pad() at the right point, duplicating > > > the minimum-size logic across protocols. > > > > > > Move the Ethernet minimum size enforcement into vu_collect() itself, so > > > that enough buffer space is always reserved for padding regardless of > > > the requested frame size. > > > > > > Rewrite vu_pad() to take a full iovec array and use iov_memset(), > > > making it safe for multi-buffer (mergeable rx buffer) frames. > > > > > > In tcp_vu_sock_recv(), replace iov_truncate() with iov_skip_bytes(): > > > now that all consumers receive explicit data lengths, truncating the > > > iovecs is no longer needed. In tcp_vu_data_from_sock(), cap each > > > frame's data length against the remaining bytes actually received from > > > the socket, so that the last partial frame gets correct headers and > > > sequence number advancement. > > > > > > Signed-off-by: Laurent Vivier > > > > LGTM, except for what looks like one minor bug. > > > > [snip] > > > index 704e908aa02c..d07f584f228a 100644 > > > --- a/vu_common.c > > > +++ b/vu_common.c > > > @@ -74,6 +74,7 @@ int vu_collect(const struct vu_dev *vdev, struct vu_virtq *vq, > > > size_t current_iov = 0; > > > int elem_cnt = 0; > > > + size = MAX(size, ETH_ZLEN /* Ethernet minimum size */ + VNET_HLEN); > > > > This seems to imply size should include the vnet header... > > size is the max of "size" provided below by vu_single() to vu_collect() (and > you noted includes vnet header) and the the minimum frame size (Ethernet > minimum + vnet header) > > > > > > while (current_size < size && elem_cnt < max_elem && > > > current_iov < max_in_sg) { > > > int ret; > > > @@ -261,29 +262,27 @@ int vu_send_single(const struct ctx *c, const void *buf, size_t size) > > > return -1; > > > } > > > - size += VNET_HLEN; > > > elem_cnt = vu_collect(vdev, vq, elem, ARRAY_SIZE(elem), in_sg, > > > - ARRAY_SIZE(in_sg), &in_total, size, &total); > > > - if (elem_cnt == 0 || total < size) { > > > + ARRAY_SIZE(in_sg), &in_total, VNET_HLEN + size, &total); > > > > ...but this seems to imply it doesn't. > > This is not the same "size". Here "size" is without vnet header, but we need > to provide a size with vnet header to vu_collect(). Oops. I thought I'd checked that these were the same function, but clearly I got myself confused. Naming tweaks might make this harder to mistake - I think l2len would be appropriate for this one. We don't have a standard name for the length including VNET_LEN (or whatever "device" level header/descriptor we have, like the qemu frame length), maybe we should invent one? > > > > > > + if (elem_cnt == 0 || total < VNET_HLEN + size) { > > > debug("vu_send_single: no space to send the data " > > > "elem_cnt %d size %zu", elem_cnt, total); > > > goto err; > > > } > > > - total -= VNET_HLEN; > > > - > > > /* copy data from the buffer to the iovec */ > > > - iov_from_buf(in_sg, in_total, VNET_HLEN, buf, total); > > > + iov_from_buf(in_sg, in_total, VNET_HLEN, buf, size); > > > if (*c->pcap) > > > pcap_iov(in_sg, in_total, VNET_HLEN, size); > > > + vu_pad(in_sg, in_total, VNET_HLEN + size); > > > > As does this. > > Same here (see vu_pad() comment header) > > > > > > vu_flush(vdev, vq, elem, elem_cnt, VNET_HLEN + size); > > > > And this. > > See vu_flush() comment header > > > > > > vu_queue_notify(vdev, vq); > > > - trace("vhost-user sent %zu", total); > > > + trace("vhost-user sent %zu", size); > > > - return total; > > > + return size; > > > err: > > > for (i = 0; i < elem_cnt; i++) > > > vu_queue_detach_element(vq); > > > @@ -292,15 +291,15 @@ err: > > > } > > > /** > > > - * vu_pad() - Pad 802.3 frame to minimum length (60 bytes) if needed > > > - * @iov: Buffer in iovec array where end of 802.3 frame is stored > > > - * @l2len: Layer-2 length already filled in frame > > > + * vu_pad() - Pad short frames to minimum Ethernet length and truncate iovec > > > + * @iov: Pointer to iovec array > > > + * @cnt: Number of entries in @iov > > > + * @frame_len: Data length in @iov (including virtio-net header) > > > */ > > > -void vu_pad(struct iovec *iov, size_t l2len) > > > +void vu_pad(const struct iovec *iov, size_t cnt, size_t frame_len) > > > { > > > - if (l2len >= ETH_ZLEN) > > > - return; > > > + size_t min_frame_len = ETH_ZLEN + VNET_HLEN; > > > - memset((char *)iov->iov_base + iov->iov_len, 0, ETH_ZLEN - l2len); > > > - iov->iov_len += ETH_ZLEN - l2len; > > > + if (frame_len < min_frame_len) > > > + iov_memset(iov, cnt, frame_len, 0, min_frame_len - frame_len); > > > } > > > diff --git a/vu_common.h b/vu_common.h > > > index 77d1849e6115..51f70084a7cb 100644 > > > --- a/vu_common.h > > > +++ b/vu_common.h > > > @@ -44,6 +44,6 @@ void vu_flush(const struct vu_dev *vdev, struct vu_virtq *vq, > > > void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref, > > > const struct timespec *now); > > > int vu_send_single(const struct ctx *c, const void *buf, size_t size); > > > -void vu_pad(struct iovec *iov, size_t l2len); > > > +void vu_pad(const struct iovec *iov, size_t cnt, size_t frame_len); > > > #endif /* VU_COMMON_H */ > > > -- > > > 2.53.0 > > > > > > > Thanks, > Laurent > -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson