From: Laurent Vivier
To: passt-dev@passt.top
CC: Laurent Vivier
Subject: [PATCH v3 5/8] udp_vu: Use iov_tail to manage virtqueue buffers
Date: Mon, 16 Mar 2026 19:07:18 +0100
Message-ID: <20260316180721.2230640-6-lvivier@redhat.com>
In-Reply-To: <20260316180721.2230640-1-lvivier@redhat.com>
References: <20260316180721.2230640-1-lvivier@redhat.com>
List-Id: Development discussion and patches for passt

Replace direct iovec pointer arithmetic in UDP vhost-user handling with
iov_tail operations.

udp_vu_sock_recv() now takes an iov/cnt pair instead of using the
file-scoped iov_vu array, and returns the data length rather than the
iov count. Internally it uses IOV_TAIL() to create a view past the
L2/L3/L4 headers, and iov_tail_clone() to build the recvmsg() iovec,
removing the manual pointer offset and restore pattern.

udp_vu_prepare() and udp_vu_csum() take a const struct iov_tail *
instead of referencing iov_vu directly, making data flow explicit.

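For illustration only (this is not part of the patch): the header-skipping view described above for udp_vu_sock_recv() can be sketched in plain C. The helper name iov_clone_skip() is hypothetical and is not passt's iov_tail_clone(); it only shows the idea of filling a separate iovec array that starts past a byte offset while the original array stays untouched, which is why no offset-and-restore step is needed afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper, not passt's iov_tail_clone(): fill dst with the
 * parts of src[0..cnt) that remain after skipping the first off bytes.
 * src is never modified. Returns the number of entries written to dst,
 * which is at most cnt, so dst needs room for cnt entries.
 */
static size_t iov_clone_skip(struct iovec *dst, const struct iovec *src,
			     size_t cnt, size_t off)
{
	size_t i, n = 0;

	for (i = 0; i < cnt; i++) {
		if (off >= src[i].iov_len) {
			/* Entry is empty or entirely inside the skipped bytes */
			off -= src[i].iov_len;
			continue;
		}

		/* First kept entry starts off bytes in, later ones at 0 */
		dst[n].iov_base = (char *)src[i].iov_base + off;
		dst[n].iov_len = src[i].iov_len - off;
		off = 0;
		n++;
	}

	return n;
}
```

A caller would pass dst as msg_iov to recvmsg(), so received data lands after the space reserved for headers, and then keep using src for the complete frame.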
udp_vu_csum() uses iov_drop_header() and IOV_REMOVE_HEADER() to locate
the UDP header and payload, replacing manual offset calculations via
vu_payloadv4()/vu_payloadv6().

Signed-off-by: Laurent Vivier
---
 udp_vu.c | 123 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 70 insertions(+), 53 deletions(-)

diff --git a/udp_vu.c b/udp_vu.c
index 5e030effa703..8b0de312949c 100644
--- a/udp_vu.c
+++ b/udp_vu.c
@@ -59,21 +59,26 @@ static size_t udp_vu_hdrlen(bool v6)
 /**
  * udp_vu_sock_recv() - Receive datagrams from socket into vhost-user buffers
  * @c:		Execution context
+ * @iov:	IO vector for the frame (in/out)
+ * @cnt:	Number of IO vector entries (in/out)
  * @vq:		virtqueue to use to receive data
  * @s:		Socket to receive from
  * @v6:		Set for IPv6 connections
- * @dlen:	Size of received data (output)
  *
- * Return: number of iov entries used to store the datagram, 0 if the datagram
+ * Return: size of received data, 0 if the datagram
  *	   was discarded because the virtqueue is not ready, -1 on error
  */
-static int udp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq, int s,
-			    bool v6, ssize_t *dlen)
+static ssize_t udp_vu_sock_recv(const struct ctx *c, struct iovec *iov,
+				size_t *cnt, unsigned *elem_used,
+				struct vu_virtq *vq, int s, bool v6)
 {
 	const struct vu_dev *vdev = c->vdev;
-	int elem_cnt, elem_used, iov_used;
 	struct msghdr msg = { 0 };
-	size_t iov_cnt, hdrlen;
+	struct iov_tail payload;
+	size_t hdrlen, iov_used;
+	unsigned elem_cnt;
+	unsigned i, j;
+	ssize_t dlen;

 	ASSERT(!c->no_udp);

@@ -83,6 +88,7 @@ static int udp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq, int s,
 		if (recvmsg(s, &msg, MSG_DONTWAIT) < 0)
 			debug_perror("Failed to discard datagram");

+		*cnt = 0;
 		return 0;
 	}

@@ -90,68 +96,70 @@ static int udp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq, int s,
 	hdrlen = udp_vu_hdrlen(v6);

 	elem_cnt = vu_collect(vdev, vq, elem, ARRAY_SIZE(elem),
-			      iov_vu, ARRAY_SIZE(iov_vu), &iov_cnt,
+			      iov, *cnt, &iov_used,
 			      IP_MAX_MTU + ETH_HLEN + VNET_HLEN, NULL);
 	if (elem_cnt == 0)
 		return -1;

-	ASSERT((size_t)elem_cnt == iov_cnt);	/* one iovec per element */
+	ASSERT((size_t)elem_cnt == iov_used);	/* one iovec per element */

-	/* reserve space for the headers */
-	ASSERT(iov_vu[0].iov_len >= MAX(hdrlen, ETH_ZLEN + VNET_HLEN));
+	payload = IOV_TAIL(iov, iov_used, hdrlen);

-	iov_vu[0].iov_base = (char *)iov_vu[0].iov_base + hdrlen;
-	iov_vu[0].iov_len -= hdrlen;
+	struct iovec msg_iov[payload.cnt];
+	msg.msg_iov = msg_iov;
+	msg.msg_iovlen = iov_tail_clone(msg.msg_iov, payload.cnt, &payload);

 	/* read data from the socket */
-	msg.msg_iov = iov_vu;
-	msg.msg_iovlen = iov_cnt;
-
-	*dlen = recvmsg(s, &msg, 0);
-	if (*dlen < 0) {
+	dlen = recvmsg(s, &msg, 0);
+	if (dlen < 0) {
 		vu_queue_rewind(vq, elem_cnt);
 		return -1;
 	}

-	/* restore the pointer to the headers address */
-	iov_vu[0].iov_base = (char *)iov_vu[0].iov_base - hdrlen;
-	iov_vu[0].iov_len += hdrlen;
+	*cnt = vu_pad(iov, iov_used, 0, dlen + hdrlen);

-	iov_used = vu_pad(iov_vu, iov_cnt, 0, *dlen + hdrlen);
-	elem_used = iov_used;	/* one iovec per element */
+	*elem_used = 0;
+	for (i = 0, j = 0; j < *cnt && i < elem_cnt; i++) {
+		if (j + elem[i].in_num > *cnt)
+			elem[i].in_num = *cnt - j;
+		j += elem[i].in_num;
+		(*elem_used)++;
+	}

-	vu_set_vnethdr(iov_vu[0].iov_base, elem_used);
+	vu_set_vnethdr(iov[0].iov_base, *elem_used);

 	/* release unused buffers */
-	vu_queue_rewind(vq, elem_cnt - elem_used);
+	vu_queue_rewind(vq, elem_cnt - *elem_used);

-	return iov_used;
+	return dlen;
 }

 /**
  * udp_vu_prepare() - Prepare the packet header
  * @c:		Execution context
+ * @data:	IO vector tail for the frame
  * @toside:	Address information for one side of the flow
  * @dlen:	Packet data length
  *
  * Return: Layer-4 length
  */
-static size_t udp_vu_prepare(const struct ctx *c,
+static size_t udp_vu_prepare(const struct ctx *c, const struct iov_tail *data,
 			     const struct flowside *toside, ssize_t dlen)
 {
+	const struct iovec *iov = data->iov;
 	struct ethhdr *eh;
 	size_t l4len;

 	/* ethernet header */
-	eh = vu_eth(iov_vu[0].iov_base);
+	eh = vu_eth(iov[0].iov_base);

 	memcpy(eh->h_dest, c->guest_mac, sizeof(eh->h_dest));
 	memcpy(eh->h_source, c->our_tap_mac, sizeof(eh->h_source));

 	/* initialize header */
 	if (inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr)) {
-		struct iphdr *iph = vu_ip(iov_vu[0].iov_base);
-		struct udp_payload_t *bp = vu_payloadv4(iov_vu[0].iov_base);
+		struct iphdr *iph = vu_ip(iov[0].iov_base);
+		struct udp_payload_t *bp = vu_payloadv4(iov[0].iov_base);

 		eh->h_proto = htons(ETH_P_IP);

@@ -159,8 +167,8 @@ static size_t udp_vu_prepare(const struct ctx *c,

 		l4len = udp_update_hdr4(iph, bp, toside, dlen, true);
 	} else {
-		struct ipv6hdr *ip6h = vu_ip(iov_vu[0].iov_base);
-		struct udp_payload_t *bp = vu_payloadv6(iov_vu[0].iov_base);
+		struct ipv6hdr *ip6h = vu_ip(iov[0].iov_base);
+		struct udp_payload_t *bp = vu_payloadv6(iov[0].iov_base);

 		eh->h_proto = htons(ETH_P_IPV6);

@@ -175,25 +183,29 @@ static size_t udp_vu_prepare(const struct ctx *c,
 /**
  * udp_vu_csum() - Calculate and set checksum for a UDP packet
  * @toside:	Address information for one side of the flow
- * @iov_used:	Number of used iov_vu items
+ * @data:	IO vector tail for the frame (including vnet header)
  */
-static void udp_vu_csum(const struct flowside *toside, int iov_used)
+static void udp_vu_csum(const struct flowside *toside,
+			const struct iov_tail *data)
 {
 	const struct in_addr *src4 = inany_v4(&toside->oaddr);
 	const struct in_addr *dst4 = inany_v4(&toside->eaddr);
-	char *base = iov_vu[0].iov_base;
-	struct udp_payload_t *bp;
-	struct iov_tail data;
+	struct iov_tail payload = *data;
+	struct udphdr *uh, uh_storage;
+	bool ipv4 = src4 && dst4;
+
+	IOV_DROP_HEADER(&payload, struct virtio_net_hdr_mrg_rxbuf);
+	IOV_DROP_HEADER(&payload, struct ethhdr);
+	if (ipv4)
+		IOV_DROP_HEADER(&payload, struct iphdr);
+	else
+		IOV_DROP_HEADER(&payload, struct ipv6hdr);
+	uh = IOV_REMOVE_HEADER(&payload, uh_storage);

-	if (src4 && dst4) {
-		bp = vu_payloadv4(base);
-		data = IOV_TAIL(iov_vu, iov_used, (char *)&bp->data - base);
-		csum_udp4(&bp->uh, *src4, *dst4, &data);
-	} else {
-		bp = vu_payloadv6(base);
-		data = IOV_TAIL(iov_vu, iov_used, (char *)&bp->data - base);
-		csum_udp6(&bp->uh, &toside->oaddr.a6, &toside->eaddr.a6, &data);
-	}
+	if (ipv4)
+		csum_udp4(uh, *src4, *dst4, &payload);
+	else
+		csum_udp6(uh, &toside->oaddr.a6, &toside->eaddr.a6, &payload);
 }

 /**
@@ -209,23 +221,28 @@ void udp_vu_sock_to_tap(const struct ctx *c, int s, int n, flow_sidx_t tosidx)
 	bool v6 = !(inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr));
 	struct vu_dev *vdev = c->vdev;
 	struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE];
+	struct iov_tail data;
 	int i;

 	for (i = 0; i < n; i++) {
+		unsigned elem_used;
+		size_t iov_cnt;
 		ssize_t dlen;
-		int iov_used;

-		iov_used = udp_vu_sock_recv(c, vq, s, v6, &dlen);
-		if (iov_used < 0)
+		iov_cnt = ARRAY_SIZE(iov_vu);
+		dlen = udp_vu_sock_recv(c, iov_vu, &iov_cnt, &elem_used, vq,
+					s, v6);
+		if (dlen < 0)
 			break;

-		if (iov_used > 0) {
-			udp_vu_prepare(c, toside, dlen);
+		if (iov_cnt > 0) {
+			data = IOV_TAIL(iov_vu, iov_cnt, 0);
+			udp_vu_prepare(c, &data, toside, dlen);
 			if (*c->pcap) {
-				udp_vu_csum(toside, iov_used);
-				pcap_iov(iov_vu, iov_used, VNET_HLEN);
+				udp_vu_csum(toside, &data);
+				pcap_iov(data.iov, data.cnt, VNET_HLEN);
 			}
-			vu_flush(vdev, vq, elem, iov_used);
+			vu_flush(vdev, vq, elem, elem_used);
 		}
 	}
 }
-- 
2.53.0
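
For illustration only (not part of the patch): the element accounting loop added to udp_vu_sock_recv() can be tried in isolation. This is a minimal sketch, assuming each element only needs to expose its count of buffers. struct elem_sketch and elems_for_iov() are hypothetical stand-ins, and only the in_num field name comes from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a virtqueue element: only the number of
 * buffers it contributes (in_num) is modelled here.
 */
struct elem_sketch {
	unsigned in_num;
};

/* Count how many elements are needed to cover iov_used buffers, and
 * trim in_num of the last counted element if it would overshoot.
 * Same loop shape as in the diff, with the outputs returned instead
 * of stored through pointers.
 */
static unsigned elems_for_iov(struct elem_sketch *elem, unsigned elem_cnt,
			      size_t iov_used)
{
	unsigned i, used = 0;
	size_t j = 0;

	for (i = 0; j < iov_used && i < elem_cnt; i++) {
		if (j + elem[i].in_num > iov_used)
			elem[i].in_num = (unsigned)(iov_used - j);
		j += elem[i].in_num;
		used++;
	}

	return used;
}
```

With elements contributing 2, 3 and 1 buffers and 4 buffers actually used, the sketch counts two elements and trims the second one to 2 buffers. The third element is left for the caller to hand back, which is the role of the vu_queue_rewind() call on the elem_cnt - *elem_used difference in the patch.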