On Wed, May 13, 2026 at 01:52:18PM +0200, Laurent Vivier wrote: > The previous per-protocol padding done by vu_pad() in tcp_vu.c and > udp_vu.c was only correct for single-buffer frames: it assumed the > padding area always fell within the first iov, writing past its end > with a plain memset(). > > It also required each caller to compute MAX(..., ETH_ZLEN + VNET_HLEN) > for vu_collect() and to call vu_pad() at the right point, duplicating > the minimum-size logic across protocols. > > Move the Ethernet minimum size enforcement into vu_collect() itself, so > that enough buffer space is always reserved for padding regardless of > the requested frame size. > > Rewrite vu_pad() to take a full iovec array and use iov_memset(), > making it safe for multi-buffer (mergeable rx buffer) frames. > > In tcp_vu_sock_recv(), replace iov_truncate() with iov_skip_bytes(): > now that all consumers receive explicit data lengths, truncating the > iovecs is no longer needed. In tcp_vu_data_from_sock(), cap each > frame's data length against the remaining bytes actually received from > the socket, so that the last partial frame gets correct headers and > sequence number advancement. > > Signed-off-by: Laurent Vivier > Reviewed-by: Jon Maloy Reviewed-by: David Gibson But following on from my comments on v3, a couple of clarity nits for possible follow up: [snip] > diff --git a/vu_common.c b/vu_common.c > index 704e908aa02c..d07f584f228a 100644 > --- a/vu_common.c > +++ b/vu_common.c > @@ -74,6 +74,7 @@ int vu_collect(const struct vu_dev *vdev, struct vu_virtq *vq, > size_t current_iov = 0; > int elem_cnt = 0; > > + size = MAX(size, ETH_ZLEN /* Ethernet minimum size */ + VNET_HLEN); Here I think "size" is a reasonable name, since it's the size of the buffer we're obtaining, i.e. a bound, but not otherwise related to the length of the frame. > while (current_size < size && elem_cnt < max_elem && > current_iov < max_in_sg) { > int ret; > @@ -261,29 +262,27 @@ int vu_send_single(const struct ctx *c, const void *buf, size_t size) > return -1; > } > > - size += VNET_HLEN; > elem_cnt = vu_collect(vdev, vq, elem, ARRAY_SIZE(elem), in_sg, > - ARRAY_SIZE(in_sg), &in_total, size, &total); > - if (elem_cnt == 0 || total < size) { > + ARRAY_SIZE(in_sg), &in_total, VNET_HLEN + size, &total); > + if (elem_cnt == 0 || total < VNET_HLEN + size) { Here, "l2len" would be a much better name than "size". > debug("vu_send_single: no space to send the data " > "elem_cnt %d size %zu", elem_cnt, total); > goto err; > } > > - total -= VNET_HLEN; > - > /* copy data from the buffer to the iovec */ > - iov_from_buf(in_sg, in_total, VNET_HLEN, buf, total); > + iov_from_buf(in_sg, in_total, VNET_HLEN, buf, size); > > if (*c->pcap) > pcap_iov(in_sg, in_total, VNET_HLEN, size); > > + vu_pad(in_sg, in_total, VNET_HLEN + size); > vu_flush(vdev, vq, elem, elem_cnt, VNET_HLEN + size); > vu_queue_notify(vdev, vq); > > - trace("vhost-user sent %zu", total); > + trace("vhost-user sent %zu", size); > > - return total; > + return size; > err: > for (i = 0; i < elem_cnt; i++) > vu_queue_detach_element(vq); > @@ -292,15 +291,15 @@ err: > } > > /** > - * vu_pad() - Pad 802.3 frame to minimum length (60 bytes) if needed > - * @iov: Buffer in iovec array where end of 802.3 frame is stored > - * @l2len: Layer-2 length already filled in frame > + * vu_pad() - Pad short frames to minimum Ethernet length and truncate iovec > + * @iov: Pointer to iovec array > + * @cnt: Number of entries in @iov > + * @frame_len: Data length in @iov (including virtio-net header) > */ > -void vu_pad(struct iovec *iov, size_t l2len) > +void vu_pad(const struct iovec *iov, size_t cnt, size_t frame_len) Here we have the actual frame length, including device header, but not padding. "frame_len" is different from the other standard names we use, so it's not terrible, but "frame" often refers to the L2 object so it's not great either. Not sure if 'l1len' or 'l0len' would be getting too cutesy with what "physical" layer means in a virtual network. Something like "device_len" maybe? But that should probably include padding as well. Or alternatively, vu_pad() could be updated to take l2len, and add VNET_HLEN inside. > { > - if (l2len >= ETH_ZLEN) > - return; > + size_t min_frame_len = ETH_ZLEN + VNET_HLEN; > > - memset((char *)iov->iov_base + iov->iov_len, 0, ETH_ZLEN - l2len); > - iov->iov_len += ETH_ZLEN - l2len; > + if (frame_len < min_frame_len) > + iov_memset(iov, cnt, frame_len, 0, min_frame_len - frame_len); > } > diff --git a/vu_common.h b/vu_common.h > index 77d1849e6115..51f70084a7cb 100644 > --- a/vu_common.h > +++ b/vu_common.h > @@ -44,6 +44,6 @@ void vu_flush(const struct vu_dev *vdev, struct vu_virtq *vq, > void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref, > const struct timespec *now); > int vu_send_single(const struct ctx *c, const void *buf, size_t size); > -void vu_pad(struct iovec *iov, size_t l2len); > +void vu_pad(const struct iovec *iov, size_t cnt, size_t frame_len); > > #endif /* VU_COMMON_H */ > -- > 2.54.0 > -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson