Date: Tue, 10 Sep 2024 17:47:35 +0200
From: Stefano Brivio <sbrivio@redhat.com>
To: Laurent Vivier
Cc: passt-dev@passt.top
Subject: Re: [PATCH v4 3/4] vhost-user: introduce vhost-user API
Message-ID: <20240910174735.1e80713c@elisabeth>
In-Reply-To: <20240906160455.2088854-4-lvivier@redhat.com>
References: <20240906160455.2088854-1-lvivier@redhat.com>
	<20240906160455.2088854-4-lvivier@redhat.com>
Organization: Red Hat

Nits and a couple of questions only:

On Fri, 6 Sep 2024 18:04:48 +0200
Laurent Vivier wrote:

> Add vhost_user.c and vhost_user.h that define the functions needed
> to implement vhost-user backend.
> 
> Signed-off-by: Laurent Vivier
> ---
>  Makefile     |    4 +-
>  iov.c        |    1 -
>  vhost_user.c | 1265 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  vhost_user.h |  203 ++++++++
>  virtio.c     |    5 -
>  virtio.h     |    2 +-
>  6 files changed, 1471 insertions(+), 9 deletions(-)
>  create mode 100644 vhost_user.c
>  create mode 100644 vhost_user.h
> 
> diff --git a/Makefile b/Makefile
> index e9a154bdd718..01e95ac1b62c 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -47,7 +47,7 @@ FLAGS += -DDUAL_STACK_SOCKETS=$(DUAL_STACK_SOCKETS)
>  PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c flow.c fwd.c \
> 	icmp.c igmp.c inany.c iov.c ip.c isolation.c lineread.c log.c mld.c \
> 	ndp.c netlink.c packet.c passt.c pasta.c pcap.c pif.c tap.c tcp.c \
> -	tcp_buf.c tcp_splice.c udp.c udp_flow.c util.c virtio.c
> +	tcp_buf.c tcp_splice.c udp.c udp_flow.c util.c vhost_user.c virtio.c
>  QRAP_SRCS = qrap.c
>  SRCS = $(PASST_SRCS) $(QRAP_SRCS)
>  
> @@ -57,7 +57,7 @@ PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h flow.h fwd.h \
> 	flow_table.h icmp.h icmp_flow.h inany.h iov.h ip.h isolation.h \
> 	lineread.h log.h ndp.h netlink.h packet.h passt.h pasta.h pcap.h pif.h \
> 	siphash.h tap.h tcp.h tcp_buf.h tcp_conn.h tcp_internal.h tcp_splice.h \
> -	udp.h udp_flow.h util.h virtio.h
> +	udp.h udp_flow.h util.h vhost_user.h virtio.h
>  HEADERS = $(PASST_HEADERS) seccomp.h
>  
>  C := \#include \nstruct tcp_info x = { .tcpi_snd_wnd = 0 };
> diff --git a/iov.c b/iov.c
> index 3f9e229a305f..3741db21790f 100644
> --- a/iov.c
> +++ b/iov.c
> @@ -68,7 +68,6 @@ size_t iov_skip_bytes(const struct iovec *iov, size_t n,
>   *
>   * Returns: The number of bytes successfully copied.
>   */
> -/* cppcheck-suppress unusedFunction */
>  size_t iov_from_buf(const struct iovec *iov, size_t iov_cnt,
> 		    size_t offset, const void *buf, size_t bytes)
>  {
> diff --git a/vhost_user.c b/vhost_user.c
> new file mode 100644
> index 000000000000..6008a8adc967
> --- /dev/null
> +++ b/vhost_user.c
> @@ -0,0 +1,1265 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * vhost-user API, command management and virtio interface
> + *
> + * Copyright Red Hat
> + * Author: Laurent Vivier
> + */
> +/* some parts from QEMU subprojects/libvhost-user/libvhost-user.c

s/some/Some/, no need to split comments (just leave one extra line...).

> + * licensed under the following terms:
> + *
> + * Copyright IBM, Corp. 2007
> + * Copyright (c) 2016 Red Hat, Inc.
> + *
> + * Authors:
> + *  Anthony Liguori
> + *  Marc-André Lureau
> + *  Victor Kaplansky
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later. See the COPYING file in the top-level directory.
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include "util.h"
> +#include "passt.h"
> +#include "tap.h"
> +#include "vhost_user.h"
> +
> +/* vhost-user version we are compatible with */
> +#define VHOST_USER_VERSION 1
> +
> +/**
> + * vu_print_capabilities() - print vhost-user capabilities
> + * 			     this is part of the vhost-user backend
> + * 			     convention.
> + */
> +/* cppcheck-suppress unusedFunction */
> +void vu_print_capabilities(void)
> +{
> +	info("{");
> +	info(" \"type\": \"net\"");
> +	info("}");
> +	exit(EXIT_SUCCESS);
> +}
> +
> +/**
> + * vu_request_to_string() - convert a vhost-user request number to its name
> + * @req:	request number
> + *
> + * Return: the name of request number
> + */
> +static const char *vu_request_to_string(unsigned int req)
> +{
> +	if (req < VHOST_USER_MAX) {
> +#define REQ(req) [req] = #req
> +		static const char * const vu_request_str[VHOST_USER_MAX] = {
> +			REQ(VHOST_USER_NONE),
> +			REQ(VHOST_USER_GET_FEATURES),
> +			REQ(VHOST_USER_SET_FEATURES),
> +			REQ(VHOST_USER_SET_OWNER),
> +			REQ(VHOST_USER_RESET_OWNER),
> +			REQ(VHOST_USER_SET_MEM_TABLE),
> +			REQ(VHOST_USER_SET_LOG_BASE),
> +			REQ(VHOST_USER_SET_LOG_FD),
> +			REQ(VHOST_USER_SET_VRING_NUM),
> +			REQ(VHOST_USER_SET_VRING_ADDR),
> +			REQ(VHOST_USER_SET_VRING_BASE),
> +			REQ(VHOST_USER_GET_VRING_BASE),
> +			REQ(VHOST_USER_SET_VRING_KICK),
> +			REQ(VHOST_USER_SET_VRING_CALL),
> +			REQ(VHOST_USER_SET_VRING_ERR),
> +			REQ(VHOST_USER_GET_PROTOCOL_FEATURES),
> +			REQ(VHOST_USER_SET_PROTOCOL_FEATURES),
> +			REQ(VHOST_USER_GET_QUEUE_NUM),
> +			REQ(VHOST_USER_SET_VRING_ENABLE),
> +			REQ(VHOST_USER_SEND_RARP),
> +			REQ(VHOST_USER_NET_SET_MTU),
> +			REQ(VHOST_USER_SET_BACKEND_REQ_FD),
> +			REQ(VHOST_USER_IOTLB_MSG),
> +			REQ(VHOST_USER_SET_VRING_ENDIAN),
> +			REQ(VHOST_USER_GET_CONFIG),
> +			REQ(VHOST_USER_SET_CONFIG),
> +			REQ(VHOST_USER_POSTCOPY_ADVISE),
> +			REQ(VHOST_USER_POSTCOPY_LISTEN),
> +			REQ(VHOST_USER_POSTCOPY_END),
> +			REQ(VHOST_USER_GET_INFLIGHT_FD),
> +			REQ(VHOST_USER_SET_INFLIGHT_FD),
> +			REQ(VHOST_USER_GPU_SET_SOCKET),
> +			REQ(VHOST_USER_VRING_KICK),
> +			REQ(VHOST_USER_GET_MAX_MEM_SLOTS),
> +			REQ(VHOST_USER_ADD_MEM_REG),
> +			REQ(VHOST_USER_REM_MEM_REG),
> +		};
> +#undef REQ
> +		return vu_request_str[req];
> +	}
> +
> +	return "unknown";
> +}
> +
> +/**
> + * qva_to_va() -  Translate front-end (QEMU) virtual address to our virtual
> + * 		  address
> + * @dev:		vhost-user device
> + * @qemu_addr:		front-end userspace address
> + *
> + * Return: the memory address in our process virtual address space.
> + */
> +static void *qva_to_va(struct vu_dev *dev, uint64_t qemu_addr)
> +{
> +	unsigned int i;
> +
> +	/* Find matching memory region. */
> +	for (i = 0; i < dev->nregions; i++) {
> +		const struct vu_dev_region *r = &dev->regions[i];
> +
> +		if ((qemu_addr >= r->qva) && (qemu_addr < (r->qva + r->size))) {
> +			/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
> +			return (void *)(qemu_addr - r->qva + r->mmap_addr +
> +					r->mmap_offset);
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +/**
> + * vmsg_close_fds() - Close all file descriptors of a given message
> + * @vmsg:	vhost-user message with the list of the file descriptors
> + */
> +static void vmsg_close_fds(const struct vhost_user_msg *vmsg)
> +{
> +	int i;
> +
> +	for (i = 0; i < vmsg->fd_num; i++)
> +		close(vmsg->fds[i]);
> +}
> +
> +/**
> + * vu_remove_watch() - Remove a file descriptor from our passt epoll
> + * 		       file descriptor
> + * @vdev:	vhost-user device
> + * @fd:		file descriptor to remove
> + */
> +static void vu_remove_watch(const struct vu_dev *vdev, int fd)
> +{
> +	/* Placeholder to add passt related code */
> +	(void)vdev;
> +	(void)fd;
> +}
> +
> +/**
> + * vmsg_set_reply_u64() - Set reply payload.u64 and clear request flags
> + * 			  and fd_num
> + * @vmsg:	vhost-user message
> + * @val:	64-bit value to reply
> + */
> +static void vmsg_set_reply_u64(struct vhost_user_msg *vmsg, uint64_t val)
> +{
> +	vmsg->hdr.flags = 0; /* defaults will be set by vu_send_reply() */
> +	vmsg->hdr.size = sizeof(vmsg->payload.u64);
> +	vmsg->payload.u64 = val;
> +	vmsg->fd_num = 0;
> +}
> +
> +/**
> + * vu_message_read_default() - Read incoming vhost-user message from the
> + * 			       front-end
> + * @conn_fd:	vhost-user command socket
> + * @vmsg:	vhost-user message
> + *
> + * Return: -1 there is an error,

It doesn't return on error anymore.

> + *         0 if recvmsg() has been interrupted or if there's no data to read,
> + *         1 if a message has been received
> + */
> +static int vu_message_read_default(int conn_fd, struct vhost_user_msg *vmsg)
> +{
> +	char control[CMSG_SPACE(VHOST_MEMORY_BASELINE_NREGIONS *
> +		     sizeof(int))] = { 0 };
> +	struct iovec iov = {
> +		.iov_base = (char *)vmsg,
> +		.iov_len = VHOST_USER_HDR_SIZE,
> +	};
> +	struct msghdr msg = {
> +		.msg_iov = &iov,
> +		.msg_iovlen = 1,
> +		.msg_control = control,
> +		.msg_controllen = sizeof(control),
> +	};
> +	ssize_t ret, sz_payload;
> +	struct cmsghdr *cmsg;
> +
> +	ret = recvmsg(conn_fd, &msg, MSG_DONTWAIT);
> +	if (ret < 0) {
> +		if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK)
> +			return 0;
> +		die_perror("vhost-user message receive (recvmsg)");
> +	}
> +
> +	vmsg->fd_num = 0;
> +	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
> +	     cmsg = CMSG_NXTHDR(&msg, cmsg)) {
> +		if (cmsg->cmsg_level == SOL_SOCKET &&
> +		    cmsg->cmsg_type == SCM_RIGHTS) {
> +			size_t fd_size;
> +
> +			ASSERT(cmsg->cmsg_len >= CMSG_LEN(0));
> +			fd_size = cmsg->cmsg_len - CMSG_LEN(0);
> +			ASSERT(fd_size <= sizeof(vmsg->fds));
> +			vmsg->fd_num = fd_size / sizeof(int);
> +			memcpy(vmsg->fds, CMSG_DATA(cmsg), fd_size);
> +			break;
> +		}
> +	}
> +
> +	sz_payload = vmsg->hdr.size;
> +	if ((size_t)sz_payload > sizeof(vmsg->payload)) {
> +		die("vhost-user message request too big: %d,"
> +			 " size: vmsg->size: %zd, "
> +			 "while sizeof(vmsg->payload) = %zu",
> +			 vmsg->hdr.request, sz_payload, sizeof(vmsg->payload));
> +	}
> +
> +	if (sz_payload) {
> +		do
> +			ret = recv(conn_fd, &vmsg->payload, sz_payload, 0);
> +		while (ret < 0 && (errno == EINTR || errno == EAGAIN));

Perhaps you missed this from my comment to v3: the socket is blocking,
so checking for EAGAIN shouldn't be needed.

> +		if (ret < 0)
> +			die_perror("vhost-user message receive");
> +
> +		if (ret < sz_payload)
> +			die("EOF on vhost-user message receive");

I guess you want to terminate on a short read (which, as far as I
understand, you never expect as normal behaviour), but if ret > 0, can
you still call it EOF? Perhaps we should distinguish the two cases
here, ret == 0 (EOF) and ret < sz_payload (short read).
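
Just to illustrate what I mean, a rough (untested) sketch on top of
this patch, where the "Short read" wording is only a placeholder:

	if (ret < 0)
		die_perror("vhost-user message receive");

	if (ret == 0)
		die("EOF on vhost-user message receive");

	if (ret < sz_payload)
		die("Short read on vhost-user message receive");
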
> +	}
> +
> +	return 1;
> +}
> +
> +/**
> + * vu_message_write() - Send a message to the front-end
> + * @conn_fd:	vhost-user command socket
> + * @vmsg:	vhost-user message
> + *
> + * #syscalls:vu sendmsg
> + */
> +static void vu_message_write(int conn_fd, struct vhost_user_msg *vmsg)
> +{
> +	char control[CMSG_SPACE(VHOST_MEMORY_BASELINE_NREGIONS * sizeof(int))] = { 0 };
> +	struct iovec iov = {
> +		.iov_base = (char *)vmsg,
> +		.iov_len = VHOST_USER_HDR_SIZE + vmsg->hdr.size,
> +	};
> +	struct msghdr msg = {
> +		.msg_iov = &iov,
> +		.msg_iovlen = 1,
> +		.msg_control = control,
> +	};
> +	int rc;
> +
> +	ASSERT(vmsg->fd_num <= VHOST_MEMORY_BASELINE_NREGIONS);
> +	if (vmsg->fd_num > 0) {
> +		size_t fdsize = vmsg->fd_num * sizeof(int);
> +		struct cmsghdr *cmsg;
> +
> +		msg.msg_controllen = CMSG_SPACE(fdsize);
> +		cmsg = CMSG_FIRSTHDR(&msg);
> +		cmsg->cmsg_len = CMSG_LEN(fdsize);
> +		cmsg->cmsg_level = SOL_SOCKET;
> +		cmsg->cmsg_type = SCM_RIGHTS;
> +		memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
> +	}
> +
> +	do
> +		rc = sendmsg(conn_fd, &msg, 0);
> +	while (rc < 0 && (errno == EINTR || errno == EAGAIN));

Same as above with EAGAIN.

> +
> +	if (rc < 0)
> +		die_perror("vhost-user message send");
> +
> +	if ((uint32_t)rc < VHOST_USER_HDR_SIZE + vmsg->hdr.size)
> +		die("EOF on vhost-user message send");
> +}
> +
> +/**
> + * vu_send_reply() - Update message flags and send it to front-end
> + * @conn_fd:	vhost-user command socket
> + * @vmsg:	vhost-user message
> + */
> +static void vu_send_reply(int conn_fd, struct vhost_user_msg *msg)
> +{
> +	msg->hdr.flags &= ~VHOST_USER_VERSION_MASK;
> +	msg->hdr.flags |= VHOST_USER_VERSION;
> +	msg->hdr.flags |= VHOST_USER_REPLY_MASK;
> +
> +	vu_message_write(conn_fd, msg);
> +}
> +
> +/**
> + * vu_get_features_exec() - Provide back-end features bitmask to front-end
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: True as a reply is requested
> + */
> +static bool vu_get_features_exec(struct vu_dev *vdev,
> +				 struct vhost_user_msg *msg)
> +{
> +	uint64_t features =
> +		1ULL << VIRTIO_F_VERSION_1 |
> +		1ULL << VIRTIO_NET_F_MRG_RXBUF |
> +		1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
> +
> +	(void)vdev;
> +
> +	vmsg_set_reply_u64(msg, features);
> +
> +	debug("Sending back to guest u64: 0x%016"PRIx64, msg->payload.u64);
> +
> +	return true;
> +}
> +
> +/**
> + * vu_set_enable_all_rings() - Enable/disable all the virtqueues
> + * @vdev:	vhost-user device
> + * @enable:	New virtqueues state
> + */
> +static void vu_set_enable_all_rings(struct vu_dev *vdev, bool enable)
> +{
> +	uint16_t i;
> +
> +	for (i = 0; i < VHOST_USER_MAX_QUEUES; i++)
> +		vdev->vq[i].enable = enable;
> +}
> +
> +/**
> + * vu_set_features_exec() - Enable features of the back-end
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_features_exec(struct vu_dev *vdev,
> +				 struct vhost_user_msg *msg)
> +{
> +	debug("u64: 0x%016"PRIx64, msg->payload.u64);
> +
> +	vdev->features = msg->payload.u64;
> +	/* We only support devices conforming to VIRTIO 1.0 or
> +	 * later
> +	 */
> +	if (!vu_has_feature(vdev, VIRTIO_F_VERSION_1))
> +		die("virtio legacy devices aren't supported by passt");
> +
> +	if (!vu_has_feature(vdev, VHOST_USER_F_PROTOCOL_FEATURES))
> +		vu_set_enable_all_rings(vdev, true);
> +
> +	/* virtio-net features */
> +
> +	/* VIRTIO_F_VERSION_1 always uses struct virtio_net_hdr_mrg_rxbuf */
> +	vdev->hdrlen = sizeof(struct virtio_net_hdr_mrg_rxbuf);
> +
> +	return false;
> +}
> +
> +/**
> + * vu_set_owner_exec() - Session start flag, do nothing in our case
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_owner_exec(struct vu_dev *vdev,
> +			      struct vhost_user_msg *msg)
> +{
> +	(void)vdev;
> +	(void)msg;
> +
> +	return false;
> +}
> +
> +/**
> + * map_ring() - Convert ring front-end (QEMU) addresses to our process
> + * 		virtual address space.
> + * @vdev:	vhost-user device
> + * @vq:		Virtqueue
> + *
> + * Return: True if ring cannot be mapped to our address space
> + */
> +static bool map_ring(struct vu_dev *vdev, struct vu_virtq *vq)
> +{
> +	vq->vring.desc = qva_to_va(vdev, vq->vra.desc_user_addr);
> +	vq->vring.used = qva_to_va(vdev, vq->vra.used_user_addr);
> +	vq->vring.avail = qva_to_va(vdev, vq->vra.avail_user_addr);
> +
> +	debug("Setting virtq addresses:");
> +	debug(" vring_desc at %p", (void *)vq->vring.desc);
> +	debug(" vring_used at %p", (void *)vq->vring.used);
> +	debug(" vring_avail at %p", (void *)vq->vring.avail);
> +
> +	return !(vq->vring.desc && vq->vring.used && vq->vring.avail);
> +}
> +
> +/**
> + * vu_packet_check_range() - Check if a given memory zone is contained in
> + * 			     a mapped guest memory region
> + * @buf:	Array of the available memory regions
> + * @offset:	Offset of data range in packet descriptor
> + * @size:	Length of desired data range
> + * @start:	Start of the packet descriptor
> + *
> + * Return: 0 if the zone is in a mapped memory region, -1 otherwise
> + */
> +/* cppcheck-suppress unusedFunction */
> +int vu_packet_check_range(void *buf, size_t offset, size_t len,
> +			  const char *start)
> +{
> +	struct vu_dev_region *dev_region;
> +
> +	for (dev_region = buf; dev_region->mmap_addr; dev_region++) {
> +		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
> +		char *m = (char *)dev_region->mmap_addr;
> +
> +		if (m <= start &&
> +		    start + offset + len <= m + dev_region->mmap_offset +
> +					    dev_region->size)
> +			return 0;
> +	}
> +
> +	return -1;
> +}
> +
> +/**
> + * vu_set_mem_table_exec() - Sets the memory map regions to be able to
> + * 			     translate the vring addresses.
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + *
> + * #syscalls:vu mmap munmap
> + */
> +static bool vu_set_mem_table_exec(struct vu_dev *vdev,
> +				  struct vhost_user_msg *msg)
> +{
> +	struct vhost_user_memory m = msg->payload.memory, *memory = &m;
> +	unsigned int i;
> +
> +	for (i = 0; i < vdev->nregions; i++) {
> +		struct vu_dev_region *r = &vdev->regions[i];
> +		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
> +		void *mm = (void *)r->mmap_addr;
> +
> +		if (mm)
> +			munmap(mm, r->size + r->mmap_offset);
> +	}
> +	vdev->nregions = memory->nregions;
> +
> +	debug("vhost-user nregions: %u", memory->nregions);
> +	for (i = 0; i < vdev->nregions; i++) {
> +		struct vhost_user_memory_region *msg_region = &memory->regions[i];
> +		struct vu_dev_region *dev_region = &vdev->regions[i];
> +		void *mmap_addr;
> +
> +		debug("vhost-user region %d", i);
> +		debug(" guest_phys_addr: 0x%016"PRIx64,
> +		      msg_region->guest_phys_addr);
> +		debug(" memory_size: 0x%016"PRIx64,
> +		      msg_region->memory_size);
> +		debug(" userspace_addr 0x%016"PRIx64,
> +		      msg_region->userspace_addr);
> +		debug(" mmap_offset 0x%016"PRIx64,
> +		      msg_region->mmap_offset);
> +
> +		dev_region->gpa = msg_region->guest_phys_addr;
> +		dev_region->size = msg_region->memory_size;
> +		dev_region->qva = msg_region->userspace_addr;
> +		dev_region->mmap_offset = msg_region->mmap_offset;
> +
> +		/* We don't use offset argument of mmap() since the
> +		 * mapped address has to be page aligned.
> +		 */
> +		mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset,
> +				 PROT_READ | PROT_WRITE, MAP_SHARED |
> +				 MAP_NORESERVE, msg->fds[i], 0);
> +
> +		if (mmap_addr == MAP_FAILED)
> +			die_perror("vhost-user region mmap error");
> +
> +		dev_region->mmap_addr = (uint64_t)(uintptr_t)mmap_addr;
> +		debug(" mmap_addr: 0x%016"PRIx64,
> +		      dev_region->mmap_addr);
> +
> +		close(msg->fds[i]);
> +	}
> +
> +	for (i = 0; i < VHOST_USER_MAX_QUEUES; i++) {
> +		if (vdev->vq[i].vring.desc) {
> +			if (map_ring(vdev, &vdev->vq[i]))
> +				die("remapping queue %d during setmemtable", i);
> +		}
> +	}
> +
> +	return false;
> +}
> +
> +/**
> + * vu_set_vring_num_exec() - Set the size of the queue (vring size)
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_vring_num_exec(struct vu_dev *vdev,
> +				  struct vhost_user_msg *msg)
> +{
> +	unsigned int idx = msg->payload.state.index;
> +	unsigned int num = msg->payload.state.num;
> +
> +	debug("State.index: %u", idx);
> +	debug("State.num: %u", num);
> +	vdev->vq[idx].vring.num = num;
> +
> +	return false;
> +}
> +
> +/**
> + * vu_set_vring_addr_exec() - Set the addresses of the vring
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_vring_addr_exec(struct vu_dev *vdev,
> +				   struct vhost_user_msg *msg)
> +{
> +	/* We need to copy the payload to vhost_vring_addr structure
> +	 * to access index because address of msg->payload.addr
> +	 * can be unaligned as it is packed.
> +	 */
> +	struct vhost_vring_addr addr = msg->payload.addr;
> +	struct vu_virtq *vq = &vdev->vq[addr.index];
> +
> +	debug("vhost_vring_addr:");
> +	debug(" index: %d", addr.index);
> +	debug(" flags: %d", addr.flags);
> +	debug(" desc_user_addr: 0x%016" PRIx64,
> +	      (uint64_t)addr.desc_user_addr);
> +	debug(" used_user_addr: 0x%016" PRIx64,
> +	      (uint64_t)addr.used_user_addr);
> +	debug(" avail_user_addr: 0x%016" PRIx64,
> +	      (uint64_t)addr.avail_user_addr);
> +	debug(" log_guest_addr: 0x%016" PRIx64,
> +	      (uint64_t)addr.log_guest_addr);
> +
> +	vq->vra = msg->payload.addr;
> +	vq->vring.flags = addr.flags;
> +	vq->vring.log_guest_addr = addr.log_guest_addr;
> +
> +	if (map_ring(vdev, vq))
> +		die("Invalid vring_addr message");
> +
> +	vq->used_idx = le16toh(vq->vring.used->idx);
> +
> +	if (vq->last_avail_idx != vq->used_idx) {
> +		debug("Last avail index != used index: %u != %u",
> +		      vq->last_avail_idx, vq->used_idx);
> +	}
> +
> +	return false;
> +}
> +/**
> + * vu_set_vring_base_exec() - Sets the next index to use for descriptors
> + * 			      in this vring
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_vring_base_exec(struct vu_dev *vdev,
> +				   struct vhost_user_msg *msg)
> +{
> +	unsigned int idx = msg->payload.state.index;
> +	unsigned int num = msg->payload.state.num;
> +
> +	debug("State.index: %u", idx);
> +	debug("State.num: %u", num);
> +	vdev->vq[idx].shadow_avail_idx = vdev->vq[idx].last_avail_idx = num;
> +
> +	return false;
> +}
> +
> +/**
> + * vu_get_vring_base_exec() - Stops the vring and returns the current
> + * 			      descriptor index or indices
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: True as a reply is requested
> + */
> +static bool vu_get_vring_base_exec(struct vu_dev *vdev,
> +				   struct vhost_user_msg *msg)
> +{
> +	unsigned int idx = msg->payload.state.index;
> +
> +	debug("State.index: %u", idx);
> +	msg->payload.state.num = vdev->vq[idx].last_avail_idx;
> +	msg->hdr.size = sizeof(msg->payload.state);
> +
> +	vdev->vq[idx].started = false;
> +
> +	if (vdev->vq[idx].call_fd != -1) {
> +		close(vdev->vq[idx].call_fd);
> +		vdev->vq[idx].call_fd = -1;
> +	}
> +	if (vdev->vq[idx].kick_fd != -1) {
> +		vu_remove_watch(vdev, vdev->vq[idx].kick_fd);
> +		close(vdev->vq[idx].kick_fd);
> +		vdev->vq[idx].kick_fd = -1;
> +	}
> +
> +	return true;
> +}
> +
> +/**
> + * vu_set_watch() - Add a file descriptor to the passt epoll file descriptor
> + * @vdev:	vhost-user device
> + * @fd:		file descriptor to add
> + */
> +static void vu_set_watch(const struct vu_dev *vdev, int fd)
> +{
> +	/* Placeholder to add passt related code */
> +	(void)vdev;
> +	(void)fd;
> +}
> +
> +/**
> + * vu_wait_queue() - wait for new free entries in the virtqueue
> + * @vq:		virtqueue to wait on
> + */
> +static int vu_wait_queue(const struct vu_virtq *vq)
> +{
> +	eventfd_t kick_data;
> +	ssize_t rc;
> +	int status;
> +
> +	/* wait for the kernel to put new entries in the queue */
> +	status = fcntl(vq->kick_fd, F_GETFL);
> +	if (status == -1)
> +		return -1;

Same as on v3 (I see you changed this below, but not here): if you
don't use status later, you can omit storing it.

> +
> +	if (fcntl(vq->kick_fd, F_SETFL, status & ~O_NONBLOCK))
> +		return -1;
> +
> +	rc = eventfd_read(vq->kick_fd, &kick_data);
> +
> +	if (fcntl(vq->kick_fd, F_SETFL, status))
> +		return -1;
> +
> +	if (rc == -1)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +/**
> + * vu_send() - Send a buffer to the front-end using the RX virtqueue
> + * @vdev:	vhost-user device
> + * @buf:	address of the buffer
> + * @size:	size of the buffer
> + *
> + * Return: number of bytes sent, -1 if there is an error
> + */
> +/* cppcheck-suppress unusedFunction */
> +int vu_send(struct vu_dev *vdev, const void *buf, size_t size)
> +{
> +	struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE];
> +	struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE];
> +	struct iovec in_sg[VIRTQUEUE_MAX_SIZE];
> +	size_t lens[VIRTQUEUE_MAX_SIZE];
> +	__virtio16 *num_buffers_ptr = NULL;
> +	size_t hdrlen = vdev->hdrlen;
> +	int in_sg_count = 0;
> +	size_t offset = 0;
> +	int i = 0, j;
> +
> +	debug("vu_send size %zu hdrlen %zu", size, hdrlen);
> +
> +	if (!vu_queue_enabled(vq) || !vu_queue_started(vq)) {
> +		err("Got packet, but no available descriptors on RX virtq.");
> +		return 0;
> +	}
> +
> +	while (offset < size) {
> +		size_t len;
> +		int total;
> +		int ret;
> +
> +		total = 0;
> +
> +		if (i == ARRAY_SIZE(elem) ||
> +		    in_sg_count == ARRAY_SIZE(in_sg)) {
> +			err("virtio-net unexpected long buffer chain");
> +			goto err;
> +		}
> +
> +		elem[i].out_num = 0;
> +		elem[i].out_sg = NULL;
> +		elem[i].in_num = ARRAY_SIZE(in_sg) - in_sg_count;
> +		elem[i].in_sg = &in_sg[in_sg_count];
> +
> +		ret = vu_queue_pop(vdev, vq, &elem[i]);
> +		if (ret < 0) {
> +			if (vu_wait_queue(vq) != -1)
> +				continue;
> +			if (i) {
> +				err("virtio-net unexpected empty queue: "
> +				    "i %d mergeable %d offset %zd, size %zd, "
> +				    "features 0x%" PRIx64,
> +				    i, vu_has_feature(vdev,
> +						      VIRTIO_NET_F_MRG_RXBUF),
> +				    offset, size, vdev->features);
> +			}
> +			offset = -1;
> +			goto err;
> +		}
> +		in_sg_count += elem[i].in_num;
> +
> +		if (elem[i].in_num < 1) {
> +			err("virtio-net receive queue contains no in buffers");
> +			vu_queue_detach_element(vq);
> +			offset = -1;
> +			goto err;
> +		}
> +
> +		if (i == 0) {
> +			struct virtio_net_hdr hdr = {
> +				.flags = VIRTIO_NET_HDR_F_DATA_VALID,
> +				.gso_type = VIRTIO_NET_HDR_GSO_NONE,
> +			};
> +
> +			ASSERT(offset == 0);
> +			ASSERT(elem[i].in_sg[0].iov_len >= hdrlen);
> +
> +			len = iov_from_buf(elem[i].in_sg, elem[i].in_num, 0,
> +					   &hdr, sizeof(hdr));
> +
> +			num_buffers_ptr = (__virtio16 *)((char *)elem[i].in_sg[0].iov_base +
> +							 len);
> +
> +			total += hdrlen;
> +		}
> +
> +		len = iov_from_buf(elem[i].in_sg, elem[i].in_num, total,
> +				   (char *)buf + offset, size - offset);
> +
> +		total += len;
> +		offset += len;
> +
> +		/* If buffers can't be merged, at this point we
> +		 * must have consumed the complete packet.
> +		 * Otherwise, drop it.
> +		 */
> +		if (!vu_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) &&
> +		    offset < size) {
> +			vu_queue_unpop(vq);
> +			goto err;
> +		}
> +
> +		lens[i] = total;
> +		i++;
> +	}
> +
> +	if (num_buffers_ptr && vu_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
> +		*num_buffers_ptr = htole16(i);
> +
> +	for (j = 0; j < i; j++) {
> +		debug("filling total %zd idx %d", lens[j], j);
> +		vu_queue_fill(vq, &elem[j], lens[j], j);
> +	}
> +
> +	vu_queue_flush(vq, i);
> +	vu_queue_notify(vdev, vq);
> +
> +	debug("vhost-user sent %zu", offset);
> +
> +	return offset;
> +err:
> +	for (j = 0; j < i; j++)
> +		vu_queue_detach_element(vq);
> +
> +	return offset;
> +}
> +
> +/**
> + * vu_handle_tx() - Receive data from the TX virtqueue
> + * @vdev:	vhost-user device
> + * @index:	index of the virtqueue
> + * @now:	Current timestamp
> + */
> +static void vu_handle_tx(struct vu_dev *vdev, int index,
> +			 const struct timespec *now)
> +{
> +	struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE];
> +	struct iovec out_sg[VIRTQUEUE_MAX_SIZE];
> +	struct vu_virtq *vq = &vdev->vq[index];
> +	int hdrlen = vdev->hdrlen;
> +	int out_sg_count;
> +	int count;
> +

Excess newline (same as v3).

> +
> +	if (!VHOST_USER_IS_QUEUE_TX(index)) {
> +		debug("vhost-user: index %d is not a TX queue", index);
> +		return;
> +	}
> +
> +	tap_flush_pools();
> +
> +	count = 0;
> +	out_sg_count = 0;
> +	while (count < VIRTQUEUE_MAX_SIZE) {

So, I see that this is limited to 1024 iterations now (it was limited
also earlier, but I didn't realise that).

If we loop at most VIRTQUEUE_MAX_SIZE times, that means, I guess, that
while we're popping elements, the queue can't be written to, correct?
Or it can be written to, but we'll get an additional kick after
vu_queue_notify() if that happens?

> +		int ret;
> +
> +		elem[count].out_num = 1;
> +		elem[count].out_sg = &out_sg[out_sg_count];
> +		elem[count].in_num = 0;
> +		elem[count].in_sg = NULL;
> +		ret = vu_queue_pop(vdev, vq, &elem[count]);
> +		if (ret < 0)
> +			break;
> +		out_sg_count += elem[count].out_num;
> +
> +		if (elem[count].out_num < 1) {
> +			debug("virtio-net header not in first element");
> +			break;
> +		}
> +		ASSERT(elem[count].out_num == 1);
> +
> +		tap_add_packet(vdev->context,
> +			       elem[count].out_sg[0].iov_len - hdrlen,
> +			       (char *)elem[count].out_sg[0].iov_base + hdrlen);
> +		count++;
> +	}
> +	tap_handler(vdev->context, now);
> +
> +	if (count) {
> +		int i;
> +
> +		for (i = 0; i < count; i++)
> +			vu_queue_fill(vq, &elem[i], 0, i);
> +		vu_queue_flush(vq, count);
> +		vu_queue_notify(vdev, vq);
> +	}
> +}
> +
> +/**
> + * vu_kick_cb() - Called on a kick event to start to receive data
> + * @vdev:	vhost-user device
> + * @ref:	epoll reference information
> + * @now:	Current timestamp
> + */
> +/* cppcheck-suppress unusedFunction */
> +void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref,
> +		const struct timespec *now)
> +{
> +	eventfd_t kick_data;
> +	ssize_t rc;
> +	int idx;
> +
> +	for (idx = 0; idx < VHOST_USER_MAX_QUEUES; idx++) {
> +		if (vdev->vq[idx].kick_fd == ref.fd)
> +			break;
> +	}
> +
> +	if (idx == VHOST_USER_MAX_QUEUES)
> +		return;
> +
> +	rc = eventfd_read(ref.fd, &kick_data);
> +	if (rc == -1)
> +		die_perror("vhost-user kick eventfd_read()");
> +
> +	debug("vhost-user: ot kick_data: %016"PRIx64" idx:%d",
> +	      kick_data, idx);
> +	if (VHOST_USER_IS_QUEUE_TX(idx))
> +		vu_handle_tx(vdev, idx, now);
> +}
> +
> +/**
> + * vu_check_queue_msg_file() - Check if a message is valid,
> + * 			       close fds if NOFD bit is set
> + * @vmsg:	vhost-user message
> + */
> +static void vu_check_queue_msg_file(struct vhost_user_msg *msg)
> +{
> +	bool nofd = msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK;
> +	int idx = msg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
> +
> +	if (idx >= VHOST_USER_MAX_QUEUES)
> +		die("Invalid vhost-user queue index: %u", idx);
> +
> +	if (nofd) {
> +		vmsg_close_fds(msg);
> +		return;
> +	}
> +
> +	if (msg->fd_num != 1)
> +		die("Invalid fds in vhost-user request: %d", msg->hdr.request);
> +}
> +
> +/**
> + * vu_set_vring_kick_exec() - Set the event file descriptor for adding buffers
> + * 			      to the vring
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_vring_kick_exec(struct vu_dev *vdev,
> +				   struct vhost_user_msg *msg)
> +{
> +	bool nofd = msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK;
> +	int idx = msg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
> +
> +	debug("u64: 0x%016"PRIx64, msg->payload.u64);
> +
> +	vu_check_queue_msg_file(msg);
> +
> +	if (vdev->vq[idx].kick_fd != -1) {
> +		vu_remove_watch(vdev, vdev->vq[idx].kick_fd);
> +		close(vdev->vq[idx].kick_fd);
> +	}
> +
> +	vdev->vq[idx].kick_fd = nofd ? -1 : msg->fds[0];
> +	debug("Got kick_fd: %d for vq: %d", vdev->vq[idx].kick_fd, idx);
> +
> +	vdev->vq[idx].started = true;
> +
> +	if (vdev->vq[idx].kick_fd != -1 && VHOST_USER_IS_QUEUE_TX(idx)) {
> +		vu_set_watch(vdev, vdev->vq[idx].kick_fd);
> +		debug("Waiting for kicks on fd: %d for vq: %d",
> +		      vdev->vq[idx].kick_fd, idx);
> +	}
> +
> +	return false;
> +}
> +
> +/**
> + * vu_set_vring_call_exec() - Set the event file descriptor to signal when
> + * 			      buffers are used
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_vring_call_exec(struct vu_dev *vdev,
> +				   struct vhost_user_msg *msg)
> +{
> +	bool nofd = msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK;
> +	int idx = msg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
> +
> +	debug("u64: 0x%016"PRIx64, msg->payload.u64);
> +
> +	vu_check_queue_msg_file(msg);
> +
> +	if (vdev->vq[idx].call_fd != -1)
> +		close(vdev->vq[idx].call_fd);
> +
> +	vdev->vq[idx].call_fd = nofd ? -1 : msg->fds[0];
> +
> +	/* in case of I/O hang after reconnecting */
> +	if (vdev->vq[idx].call_fd != -1)
> +		eventfd_write(msg->fds[0], 1);
> +
> +	debug("Got call_fd: %d for vq: %d", vdev->vq[idx].call_fd, idx);
> +
> +	return false;
> +}
> +
> +/**
> + * vu_set_vring_err_exec() - Set the event file descriptor to signal when
> + * 			     error occurs
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_vring_err_exec(struct vu_dev *vdev,
> +				  struct vhost_user_msg *msg)
> +{
> +	bool nofd = msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK;
> +	int idx = msg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
> +
> +	debug("u64: 0x%016"PRIx64, msg->payload.u64);
> +
> +	vu_check_queue_msg_file(msg);
> +
> +	if (vdev->vq[idx].err_fd != -1) {
> +		close(vdev->vq[idx].err_fd);
> +		vdev->vq[idx].err_fd = -1;
> +	}
> +
> +	/* cppcheck-suppress redundantAssignment */
> +	vdev->vq[idx].err_fd = nofd ? -1 : msg->fds[0];

Maybe you missed this comment to v3:

--
Wouldn't it be easier (and not require a suppression) to say:

	if (!nofd)
		vdev->vq[idx].err_fd = msg->fds[0];

?
--

> +
> +	return false;
> +}
> +
> +/**
> + * vu_get_protocol_features_exec() - Provide the protocol (vhost-user) features
> + * 				     to the front-end
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: True as a reply is requested
> + */
> +static bool vu_get_protocol_features_exec(struct vu_dev *vdev,
> +					  struct vhost_user_msg *msg)
> +{
> +	uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK;
> +
> +	(void)vdev;
> +	vmsg_set_reply_u64(msg, features);
> +
> +	return true;
> +}
> +
> +/**
> + * vu_set_protocol_features_exec() - Enable protocol (vhost-user) features
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_protocol_features_exec(struct vu_dev *vdev,
> +					  struct vhost_user_msg *msg)
> +{
> +	uint64_t features = msg->payload.u64;
> +
> +	debug("u64: 0x%016"PRIx64, features);
> +
> +	vdev->protocol_features = msg->payload.u64;
> +
> +	if (vu_has_protocol_feature(vdev,
> +				    VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS) &&
> +	    (!vu_has_protocol_feature(vdev, VHOST_USER_PROTOCOL_F_BACKEND_REQ) ||
> +	     !vu_has_protocol_feature(vdev, VHOST_USER_PROTOCOL_F_REPLY_ACK))) {

Same as v3:

--
Do we actually care about VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS
at all, I wonder? This whole part (coming from ff1320050a3a
"libvhost-user: implement in-band notifications") is rather hard to
read/understand, so it would be great if we could just get rid of it
altogether.

But if not, sure, let's leave it like the original, I'd say.
--

> +	/*
> +	 * The use case for using messages for kick/call is simulation, to make
> +	 * the kick and call synchronous. To actually get that behaviour, both
> +	 * of the other features are required.
> +	 * Theoretically, one could use only kick messages, or do them without
> +	 * having F_REPLY_ACK, but too many (possibly pending) messages on the
> +	 * socket will eventually cause the master to hang, to avoid this in
> +	 * scenarios where not desired enforce that the settings are in a way
> +	 * that actually enables the simulation case.
> +	 */
> +		die("F_IN_BAND_NOTIFICATIONS requires F_BACKEND_REQ && F_REPLY_ACK");
> +	}
> +
> +	return false;
> +}
> +
> +/**
> + * vu_get_queue_num_exec() - Tell how many queues we support
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: True as a reply is requested
> + */
> +static bool vu_get_queue_num_exec(struct vu_dev *vdev,
> +				  struct vhost_user_msg *msg)
> +{
> +	(void)vdev;
> +
> +	vmsg_set_reply_u64(msg, VHOST_USER_MAX_QUEUES);
> +
> +	return true;
> +}
> +
> +/**
> + * vu_set_vring_enable_exec() - Enable or disable corresponding vring
> + * @vdev:	vhost-user device
> + * @vmsg:	vhost-user message
> + *
> + * Return: False as no reply is requested
> + */
> +static bool vu_set_vring_enable_exec(struct vu_dev *vdev,
> +				     struct vhost_user_msg *msg)
> +{
> +	unsigned int enable = msg->payload.state.num;
> +	unsigned int idx = msg->payload.state.index;
> +
> +	debug("State.index: %u", idx);
> +	debug("State.enable: %u", enable);
> +
> +	if (idx >= VHOST_USER_MAX_QUEUES)
> +		die("Invalid vring_enable index: %u", idx);
> +
> +	vdev->vq[idx].enable = enable;
> +	return false;
> +}
> +
> +/**
> + * vu_init() - Initialize vhost-user device structure
> + * @c:		execution context
> + * @vdev:	vhost-user device
> + */
> +/* cppcheck-suppress unusedFunction */
> +void vu_init(struct ctx *c, struct vu_dev *vdev)
> +{
> +	int i;
> +
> +	vdev->context = c;
> +	vdev->hdrlen = 0;
> +	for (i = 0; i < VHOST_USER_MAX_QUEUES; i++) {
> +		vdev->vq[i] = (struct vu_virtq){
> +			.call_fd = -1,
> +			.kick_fd = -1,
> +			.err_fd = -1,
> +			.notification = true,
> +		};
> +	}
> +}
> +
> +/**
> + * vu_cleanup() - Reset vhost-user device
> + * @vdev:	vhost-user device
> + */
> +/* cppcheck-suppress unusedFunction */
> +void vu_cleanup(struct vu_dev *vdev)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < VHOST_USER_MAX_QUEUES; i++) {
> +		struct vu_virtq *vq = &vdev->vq[i];
> +
> +		vq->started = false;
> +		vq->notification = true;
> +
> +		if (vq->call_fd != -1) {
> +			close(vq->call_fd);
> +			vq->call_fd = -1;
> +		}
> +		if (vq->err_fd != -1) {
> +			close(vq->err_fd);
> +			vq->err_fd = -1;
> +		}
> +		if (vq->kick_fd != -1) {
> +			vu_remove_watch(vdev, vq->kick_fd);
> +			close(vq->kick_fd);
> +			vq->kick_fd = -1;
> +		}
> +
> +		vq->vring.desc = 0;
> +		vq->vring.used = 0;
> +		vq->vring.avail = 0;
> +	}
> +	vdev->hdrlen = 0;
> +
> +	for (i = 0; i < vdev->nregions; i++) {
> +		const struct vu_dev_region *r = &vdev->regions[i];
> +		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
> +		void *m = (void *)r->mmap_addr;
> +
> +		if (m)
> +			munmap(m, r->size + r->mmap_offset);
> +	}
> +	vdev->nregions = 0;
> +}
> +
> +/**
> + * vu_sock_reset() - Reset connection socket
> + * @vdev:	vhost-user device
> + */
> +static void vu_sock_reset(struct vu_dev *vdev)
> +{
> +	/* Placeholder to add passt related code */
> +	(void)vdev;
> +}
> +
> +static bool (*vu_handle[VHOST_USER_MAX])(struct vu_dev *vdev,
> +					struct vhost_user_msg *msg) = {
> +	[VHOST_USER_GET_FEATURES]	   = vu_get_features_exec,
> +	[VHOST_USER_SET_FEATURES]	   = vu_set_features_exec,
> +	[VHOST_USER_GET_PROTOCOL_FEATURES] = vu_get_protocol_features_exec,
> +	[VHOST_USER_SET_PROTOCOL_FEATURES] = vu_set_protocol_features_exec,
> +	[VHOST_USER_GET_QUEUE_NUM]	   = vu_get_queue_num_exec,
> +	[VHOST_USER_SET_OWNER]		   = vu_set_owner_exec,
> +	[VHOST_USER_SET_MEM_TABLE]	   = vu_set_mem_table_exec,
> +	[VHOST_USER_SET_VRING_NUM]	   = vu_set_vring_num_exec,
> +	[VHOST_USER_SET_VRING_ADDR]	   = vu_set_vring_addr_exec,
> +	[VHOST_USER_SET_VRING_BASE]	   = vu_set_vring_base_exec,
> +	[VHOST_USER_GET_VRING_BASE]	   = vu_get_vring_base_exec,
> +	[VHOST_USER_SET_VRING_KICK]	   = vu_set_vring_kick_exec,
> +	[VHOST_USER_SET_VRING_CALL]	   = vu_set_vring_call_exec,
> +	[VHOST_USER_SET_VRING_ERR]	   = vu_set_vring_err_exec,
> +	[VHOST_USER_SET_VRING_ENABLE]	   = vu_set_vring_enable_exec,
> +};
> +
> +/**
> + * vu_control_handler() - Handle control commands for vhost-user
> + * @vdev:	vhost-user device
> + * @fd:		vhost-user message socket
> + * @events:	epoll events
> + */
> +/* cppcheck-suppress unusedFunction */
> +void vu_control_handler(struct vu_dev *vdev, int fd, uint32_t events)
> +{
> +	struct vhost_user_msg msg = { 0 };
> +	bool need_reply, reply_requested;
> +	int ret;
> +
> +	if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) {
> +		vu_sock_reset(vdev);
> +		return;
> +	}
> +
> +	ret = vu_message_read_default(fd, &msg);
> +	if (ret == 0) {
> +		vu_sock_reset(vdev);
> +		return;
> +	}
> +	debug("================ Vhost user message ================");
> +	debug("Request: %s (%d)", vu_request_to_string(msg.hdr.request),
> +		msg.hdr.request);
> +	debug("Flags:   0x%x", msg.hdr.flags);
> +	debug("Size:    %u", msg.hdr.size);
> +
> +	need_reply = msg.hdr.flags & VHOST_USER_NEED_REPLY_MASK;
> +
> +	if (msg.hdr.request >= 0 && msg.hdr.request < VHOST_USER_MAX &&
> +	    vu_handle[msg.hdr.request])
> +		reply_requested = vu_handle[msg.hdr.request](vdev, &msg);
> +	else
> +		die("Unhandled request: %d", msg.hdr.request);
> +
> +	/* cppcheck-suppress legacyUninitvar */
> +	if (!reply_requested && need_reply) {
> +		msg.payload.u64 = 0;
> +		msg.hdr.flags = 0;
> +		msg.hdr.size = sizeof(msg.payload.u64);
> +		msg.fd_num = 0;
> +		reply_requested = true;
> +	}
> +
> +	if (reply_requested)
> +		vu_send_reply(fd, &msg);
> +}
> diff --git a/vhost_user.h b/vhost_user.h
> new file mode 100644
> index 000000000000..ed4074c6b915
> --- /dev/null
> +++ b/vhost_user.h
> @@ -0,0 +1,203 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * vhost-user API, command management and virtio interface
> + *
> + * Copyright Red Hat
> + * Author: Laurent Vivier
> + */
> +
> +/* some parts from subprojects/libvhost-user/libvhost-user.h */
> +
> +#ifndef VHOST_USER_H
> +#define VHOST_USER_H
> +
> +#include "virtio.h"
> +#include "iov.h"
> +
> +#define VHOST_USER_F_PROTOCOL_FEATURES 30
> +
> +#define VHOST_MEMORY_BASELINE_NREGIONS 8
> +
> +/**
> + * enum vhost_user_protocol_feature - List of available vhost-user features
> + */
> +enum vhost_user_protocol_feature {
> +	VHOST_USER_PROTOCOL_F_MQ = 0,
> +	VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
> +	VHOST_USER_PROTOCOL_F_RARP = 2,
> +	VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
> +	VHOST_USER_PROTOCOL_F_NET_MTU = 4,
> +	VHOST_USER_PROTOCOL_F_BACKEND_REQ = 5,
> +	VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6,
> +	VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
> +	VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
> +	VHOST_USER_PROTOCOL_F_CONFIG = 9,
> +	VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD = 10,
> +	VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 11,
> +	VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD = 12,
> +	VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS = 14,
> +	VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS = 15,
> +
> +	VHOST_USER_PROTOCOL_F_MAX
> +};
> +
> +/**
> + * enum vhost_user_request - List of available vhost-user requests
> + */
> +enum vhost_user_request {
> +	VHOST_USER_NONE = 0,
> +	VHOST_USER_GET_FEATURES = 1,
> +	VHOST_USER_SET_FEATURES = 2,
> +	VHOST_USER_SET_OWNER = 3,
> +	VHOST_USER_RESET_OWNER = 4,
> +	VHOST_USER_SET_MEM_TABLE = 5,
> +	VHOST_USER_SET_LOG_BASE = 6,
> +	VHOST_USER_SET_LOG_FD = 7,
> +	VHOST_USER_SET_VRING_NUM = 8,
> +	VHOST_USER_SET_VRING_ADDR = 9,
> +	VHOST_USER_SET_VRING_BASE = 10,
> +	VHOST_USER_GET_VRING_BASE = 11,
> +	VHOST_USER_SET_VRING_KICK = 12,
> +	VHOST_USER_SET_VRING_CALL = 13,
> +	VHOST_USER_SET_VRING_ERR = 14,
> +	VHOST_USER_GET_PROTOCOL_FEATURES = 15,
> +	VHOST_USER_SET_PROTOCOL_FEATURES = 16,
> +	VHOST_USER_GET_QUEUE_NUM = 17,
> +	VHOST_USER_SET_VRING_ENABLE = 18,
> +	VHOST_USER_SEND_RARP = 19,
> +	VHOST_USER_NET_SET_MTU = 20,
> +	VHOST_USER_SET_BACKEND_REQ_FD = 21,
> +	VHOST_USER_IOTLB_MSG = 22,
> +	VHOST_USER_SET_VRING_ENDIAN = 23,
> +	VHOST_USER_GET_CONFIG = 24,
> +	VHOST_USER_SET_CONFIG = 25,
> +	VHOST_USER_CREATE_CRYPTO_SESSION = 26,
> +	VHOST_USER_CLOSE_CRYPTO_SESSION = 27,
> +	VHOST_USER_POSTCOPY_ADVISE = 28,
> +	VHOST_USER_POSTCOPY_LISTEN = 29,
> +	VHOST_USER_POSTCOPY_END = 30,
> +	VHOST_USER_GET_INFLIGHT_FD = 31,
> +	VHOST_USER_SET_INFLIGHT_FD = 32,
> +	VHOST_USER_GPU_SET_SOCKET = 33,
> +	VHOST_USER_VRING_KICK = 35,
> +	VHOST_USER_GET_MAX_MEM_SLOTS = 36,
> +	VHOST_USER_ADD_MEM_REG = 37,
> +	VHOST_USER_REM_MEM_REG = 38,
> +	VHOST_USER_MAX
> +};
> +
> +/**
> + * struct vhost_user_header - vhost-user message header
> + * @request:	Request type of the message
> + * @flags:	Request flags
> + * @size:	The following payload size
> + */
> +struct vhost_user_header {
> +	enum vhost_user_request request;
> +
> +#define VHOST_USER_VERSION_MASK 0x3
> +#define VHOST_USER_REPLY_MASK (0x1 << 2)
> +#define VHOST_USER_NEED_REPLY_MASK (0x1 << 3)
> +	uint32_t flags;
> +	uint32_t size;
> +} __attribute__ ((__packed__));
> +
> +/**
> + * struct vhost_user_memory_region - Front-end shared memory region information
> + * @guest_phys_addr:	Guest physical address of the region
> + * @memory_size:	Memory size
> + * @userspace_addr:	front-end (QEMU) userspace address
> + * @mmap_offset:	region offset in the shared memory area
> + */
> +struct vhost_user_memory_region {
> +	uint64_t guest_phys_addr;
> +	uint64_t memory_size;
> +	uint64_t userspace_addr;
> +	uint64_t mmap_offset;
> +};
> +
> +/**
> + * struct vhost_user_memory - List of all the shared memory regions
> + * @nregions:	Number of memory regions
> + * @padding:	Padding
> + * @regions:	Memory regions list
> + */
> +struct vhost_user_memory {
> +	uint32_t nregions;
> +	uint32_t padding;
> +	struct vhost_user_memory_region regions[VHOST_MEMORY_BASELINE_NREGIONS];
> +};
> +
> +/**
> + * union vhost_user_payload - vhost-user message payload
> + * @u64:		64-bit payload
> + * @state:		vring state payload
> + * @addr:		vring addresses payload
> + * vhost_user_memory:	Memory regions information payload
> + */
> +union vhost_user_payload {
> +#define VHOST_USER_VRING_IDX_MASK 0xff
> +#define VHOST_USER_VRING_NOFD_MASK (0x1 << 8)
> +	uint64_t u64;
> +	struct vhost_vring_state state;
> +	struct vhost_vring_addr addr;
> +	struct vhost_user_memory memory;
> +};
> +
> +/**
> + * struct vhost_user_msg - vhost-use message
> + * @hdr:		Message header
> + * @payload:		Message payload
> + * @fds:		File descriptors associated with the message
> + * 			in the ancillary data.
> + * 			(shared memory or event file descriptors)
> + * @fd_num:		Number of file descriptors
> + */
> +struct vhost_user_msg {
> +	struct vhost_user_header hdr;
> +	union vhost_user_payload payload;
> +
> +	int fds[VHOST_MEMORY_BASELINE_NREGIONS];
> +	int fd_num;
> +} __attribute__ ((__packed__));
> +#define VHOST_USER_HDR_SIZE sizeof(struct vhost_user_header)
> +
> +/* index of the RX virtqueue */
> +#define VHOST_USER_RX_QUEUE 0
> +/* index of the TX virtqueue */
> +#define VHOST_USER_TX_QUEUE 1
> +
> +/* in case of multiqueue, the RX and TX queues are interleaved */
> +#define VHOST_USER_IS_QUEUE_TX(n)	(n % 2)
> +#define VHOST_USER_IS_QUEUE_RX(n)	(!(n % 2))
> +
> +/**
> + * vu_queue_enabled - Return state of a virtqueue
> + * @vq:		virtqueue to check
> + *
> + * Return: true if the virqueue is enabled, false otherwise
> + */
> +static inline bool vu_queue_enabled(const struct vu_virtq *vq)
> +{
> +	return vq->enable;
> +}
> +
> +/**
> + * vu_queue_started - Return state of a virtqueue
> + * @vq:		virtqueue to check
> + *
> + * Return: true if the virqueue is started, false otherwise
> + */
> +static inline bool vu_queue_started(const struct vu_virtq *vq)
> +{
> +	return vq->started;
> +}
> +
> +int vu_send(struct vu_dev *vdev, const void *buf, size_t size);
> +void vu_print_capabilities(void);
> +void vu_init(struct ctx *c, struct vu_dev *vdev);
> +void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref,
> +		const struct timespec *now);
> +void vu_cleanup(struct vu_dev *vdev);
> +void vu_control_handler(struct vu_dev *vdev, int fd, uint32_t events);
> +#endif /* VHOST_USER_H */
> diff --git a/virtio.c b/virtio.c
> index 380590afbca3..237395396606 100644
> --- a/virtio.c
> +++ b/virtio.c
> @@ -328,7 +328,6 @@ static bool vring_can_notify(const struct vu_dev *dev, struct vu_virtq *vq)
>   * @dev:	Vhost-user device
>   * @vq:		Virtqueue
>   */
> -/* cppcheck-suppress unusedFunction */
>  void vu_queue_notify(const struct vu_dev *dev, struct vu_virtq *vq)
>  {
> 	if (!vq->vring.avail)
> @@ -504,7 +503,6 @@ static int vu_queue_map_desc(struct vu_dev *dev, struct vu_virtq *vq, unsigned i
>   *
>   * Return: -1 if there is an error, 0 otherwise
>   */
> -/* cppcheck-suppress unusedFunction */
>  int vu_queue_pop(struct vu_dev *dev, struct vu_virtq *vq, struct vu_virtq_element *elem)
>  {
> 	unsigned int head;
> @@ -553,7 +551,6 @@ void vu_queue_detach_element(struct vu_virtq *vq)
>   * vu_queue_unpop() - Push back the previously popped element from the virqueue
>   * @vq:		Virtqueue
>   */
> -/* cppcheck-suppress unusedFunction */
>  void vu_queue_unpop(struct vu_virtq *vq)
>  {
> 	vq->last_avail_idx--;
> @@ -621,7 +618,6 @@ void vu_queue_fill_by_index(struct vu_virtq *vq, unsigned int index,
>   * @len:	Size of the element
>   * @idx:	Used ring entry index
>   */
> -/* cppcheck-suppress unusedFunction */
>  void vu_queue_fill(struct vu_virtq *vq, const struct vu_virtq_element *elem,
> 		   unsigned int len, unsigned int idx)
>  {
> @@ -645,7 +641,6 @@ static inline void vring_used_idx_set(struct vu_virtq *vq, uint16_t val)
>   * @vq:		Virtqueue
>   * @count:	Number of entry to flush
>   */
> -/* cppcheck-suppress unusedFunction */
>  void vu_queue_flush(struct vu_virtq *vq, unsigned int count)
>  {
> 	uint16_t old, new;
> diff --git a/virtio.h b/virtio.h
> index 0e5705581bd2..d58b9ef7fc1d 100644
> --- a/virtio.h
> +++ b/virtio.h
> @@ -106,6 +106,7 @@ struct vu_dev_region {
>   * @hdrlen:		Virtio -net header length
>   */
>  struct vu_dev {
> +	struct ctx *context;
> 	uint32_t nregions;
> 	struct vu_dev_region regions[VHOST_USER_MAX_RAM_SLOTS];
> 	struct vu_virtq vq[VHOST_USER_MAX_QUEUES];
> @@ -162,7 +163,6 @@ static inline bool vu_has_feature(const struct vu_dev *vdev,
>   *
>   * Return:	True if the feature is available
>   */
> -/* cppcheck-suppress unusedFunction */
>  static inline bool vu_has_protocol_feature(const struct vu_dev *vdev,
> 					   unsigned int fbit)
>  {

The rest looks good to me.

-- 
Stefano