Date: Fri, 17 Jan 2025 19:05:06 +0100
From: Stefano Brivio <sbrivio@redhat.com>
To: Laurent Vivier
CC: passt-dev@passt.top
Subject: Re: [PATCH 8/9] vhost-user: add VHOST_USER_SET_DEVICE_STATE_FD command
Message-ID: <20250117190506.51b3946f@elisabeth>
In-Reply-To: <20241219111400.2352110-9-lvivier@redhat.com>
References: <20241219111400.2352110-1-lvivier@redhat.com> <20241219111400.2352110-9-lvivier@redhat.com>
Organization: Red Hat
List-Id: Development discussion and patches for passt

On Thu, 19 Dec 2024 12:13:59 +0100
Laurent Vivier wrote:

> Set the file descriptor to use to transfer the
> backend device state during migration.
>
> Signed-off-by: Laurent Vivier
> ---
>  epoll_type.h |  2 ++
>  passt.c      |  4 +++
>  vhost_user.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++-
>  virtio.h     |  2 ++
>  vu_common.c  | 49 +++++++++++++++++++++++++++++++
>  vu_common.h  |  1 +
>  6 files changed, 138 insertions(+), 1 deletion(-)
>
> diff --git a/epoll_type.h b/epoll_type.h
> index f3ef41584757..fd9eac392f77 100644
> --- a/epoll_type.h
> +++ b/epoll_type.h
> @@ -40,6 +40,8 @@ enum epoll_type {
>  	EPOLL_TYPE_VHOST_CMD,
>  	/* vhost-user kick event socket */
>  	EPOLL_TYPE_VHOST_KICK,
> +	/* vhost-user migration socket */
> +	EPOLL_TYPE_VHOST_MIGRATION,
>
>  	EPOLL_NUM_TYPES,
>  };
> diff --git a/passt.c b/passt.c
> index 957f3d0f4ddc..25d9823739cf 100644
> --- a/passt.c
> +++ b/passt.c
> @@ -75,6 +75,7 @@ char *epoll_type_str[] = {
>  	[EPOLL_TYPE_TAP_LISTEN] = "listening qemu socket",
>  	[EPOLL_TYPE_VHOST_CMD] = "vhost-user command socket",
>  	[EPOLL_TYPE_VHOST_KICK] = "vhost-user kick socket",
> +	[EPOLL_TYPE_VHOST_MIGRATION] = "vhost-user migration socket",
>  };
>  static_assert(ARRAY_SIZE(epoll_type_str) == EPOLL_NUM_TYPES,
>  	      "epoll_type_str[] doesn't match enum epoll_type");
> @@ -356,6 +357,9 @@ loop:
>  		case EPOLL_TYPE_VHOST_KICK:
>  			vu_kick_cb(c.vdev, ref, &now);
>  			break;
> +		case EPOLL_TYPE_VHOST_MIGRATION:
> +			vu_migrate(c.vdev, eventmask);
> +			break;
>  		default:
>  			/* Can't happen */
>  			ASSERT(0);
> diff --git a/vhost_user.c b/vhost_user.c
> index 90c46d5b89fd..11b0b447850d 100644
> --- a/vhost_user.c
> +++ b/vhost_user.c
> @@ -981,6 +981,78 @@ static bool vu_set_vring_enable_exec(struct vu_dev *vdev,
>  	return false;
>  }
>
> +/**
> + * vu_set_migration_watch() -- Add the migration file descriptor to

Single '-' between function name and comment.

> + * to the passt epoll file descriptor
> + * @vdev: vhost-user device
> + * @fd: File descriptor to add
> + * @direction: Direction of the migration (save or load backend state)
> + */
> +static void vu_set_migration_watch(const struct vu_dev *vdev, int fd,
> +				   int direction)

Shouldn't direction be uint32?

> +{
> +	union epoll_ref ref = {
> +		.type = EPOLL_TYPE_VHOST_MIGRATION,
> +		.fd = fd,
> +	};
> +	struct epoll_event ev = { 0 };
> +
> +	ev.data.u64 = ref.u64;
> +	switch (direction) {
> +	case VHOST_USER_TRANSFER_STATE_DIRECTION_SAVE:
> +		ev.events = EPOLLOUT;
> +		break;
> +	case VHOST_USER_TRANSFER_STATE_DIRECTION_LOAD:
> +		ev.events = EPOLLIN;
> +		break;
> +	default:
> +		ASSERT(0);
> +	}
> +
> +	epoll_ctl(vdev->context->epollfd, EPOLL_CTL_ADD, ref.fd, &ev);
> +}
> +
> +/**
> + * vu_set_device_state_fd_exec() -- Set the device state migration channel

Single '-' between function name and comment.

> + * @vdev: vhost-user device
> + * @vmsg: vhost-user message
> + *
> + * Return: True as the reply contains 0 to indicate success
> + * and set bit 8 as we don't provide our own fd.
> + */
> +static bool vu_set_device_state_fd_exec(struct vu_dev *vdev,
> +					struct vhost_user_msg *msg)
> +{
> +	unsigned int direction = msg->payload.transfer_state.direction;
> +	unsigned int phase = msg->payload.transfer_state.phase;
> +
> +	if (msg->fd_num != 1)
> +		die("Invalid device_state_fd message");
> +
> +	if (phase != VHOST_USER_TRANSFER_STATE_PHASE_STOPPED)
> +		die("Invalid device_state_fd phase: %d", phase);
> +
> +	if (direction != VHOST_USER_TRANSFER_STATE_DIRECTION_SAVE &&
> +	    direction != VHOST_USER_TRANSFER_STATE_DIRECTION_LOAD)
> +		die("Invalide device_state_fd direction: %d", direction);
> +
> +	if (vdev->device_state_fd != -1) {
> +		vu_remove_watch(vdev, vdev->device_state_fd);
> +		close(vdev->device_state_fd);
> +	}
> +
> +	vdev->device_state_fd = msg->fds[0];
> +	vdev->device_state_result = -1;
> +	vu_set_migration_watch(vdev, vdev->device_state_fd, direction);
> +
> +	debug("Got device_state_fd: %d", vdev->device_state_fd);
> +
> +	/* We don't provide a new fd for the data transfer */
> +	vmsg_set_reply_u64(msg, VHOST_USER_VRING_NOFD_MASK);
> +
> +	return true;
> +}
> +
>  /**
>   * vu_check_device_state_exec() -- Return device state migration result

Single '-' between function name and comment.

>   * @vdev: vhost-user device
> @@ -1019,6 +1091,7 @@ void vu_init(struct ctx *c)
>  	}
>  	c->vdev->log_table = NULL;
>  	c->vdev->log_call_fd = -1;
> +	c->vdev->device_state_fd = -1;
>  	c->vdev->device_state_result = -1;
>  }
>
> @@ -1069,7 +1142,12 @@ void vu_cleanup(struct vu_dev *vdev)
>
>  	vu_close_log(vdev);
>
> -	vdev->device_state_result = -1;
> +	if (vdev->device_state_fd != -1) {
> +		vu_remove_watch(vdev, vdev->device_state_fd);
> +		close(vdev->device_state_fd);
> +		vdev->device_state_fd = -1;
> +		vdev->device_state_result = -1;
> +	}
>  }
>
>  /**
> @@ -1100,6 +1178,7 @@ static bool (*vu_handle[VHOST_USER_MAX])(struct vu_dev *vdev,
>  	[VHOST_USER_SET_VRING_CALL] = vu_set_vring_call_exec,
>  	[VHOST_USER_SET_VRING_ERR] = vu_set_vring_err_exec,
>  	[VHOST_USER_SET_VRING_ENABLE] = vu_set_vring_enable_exec,
> +	[VHOST_USER_SET_DEVICE_STATE_FD] = vu_set_device_state_fd_exec,
>  	[VHOST_USER_CHECK_DEVICE_STATE] = vu_check_device_state_exec,
>  };
>
> diff --git a/virtio.h b/virtio.h
> index 512ec1bedcd3..7bef2d274acd 100644
> --- a/virtio.h
> +++ b/virtio.h
> @@ -106,6 +106,7 @@ struct vu_dev_region {
>   * @log_call_fd: Eventfd to report logging update
>   * @log_size: Size of the logging memory region
>   * @log_table: Base of the logging memory region
> + * @device_state_fd: Device state migration channel
>   * @device_state_result: Device state migration result
>   */
>  struct vu_dev {
> @@ -118,6 +119,7 @@ struct vu_dev {
>  	int log_call_fd;
>  	uint64_t log_size;
>  	uint8_t *log_table;
> +	int device_state_fd;
>  	int device_state_result;
>  };
>
> diff --git a/vu_common.c b/vu_common.c
> index 16e7e76a07f3..3142b585c29f 100644
> --- a/vu_common.c
> +++ b/vu_common.c
> @@ -281,3 +281,52 @@ err:
>
>  	return -1;
>  }
> +
> +/**
> + * vu_migrate() -- Send/receive passt insternal state to/from QEMU

Single '-' between function name and comment.

> + * @vdev: vhost-user device
> + * @events: epoll events
> + */
> +void vu_migrate(struct vu_dev *vdev, uint32_t events)
> +{
> +	int ret;
> +
> +	/* TODO: collect/set passt internal state
> +         * and use vdev->device_state_fd to send/receive it
> +         */

The second and third lines are indented with spaces instead of tabs.

> +	debug("vu_migrate fd %d events %x", vdev->device_state_fd, events);
> +	if (events & EPOLLOUT) {
> +		debug("Saving backend state");
> +
> +		/* send some stuff */
> +		ret = write(vdev->device_state_fd, "PASST", 6);

So, yeah, I still have my open questions/concerns here (essentially:
"what if write() returns 5?"), but they can very well fit under the TODO
above. We might need to refactor this anyway, perhaps even use writev().

So I think it's totally fine for now.

> +		/* value to be returned by VHOST_USER_CHECK_DEVICE_STATE */
> +		vdev->device_state_result = ret == -1 ? -1 : 0;

Shouldn't we err() on error? Even right now for development purposes?

> +		/* Closing the file descriptor signals the end of transfer */
> +		epoll_ctl(vdev->context->epollfd, EPOLL_CTL_DEL,
> +			  vdev->device_state_fd, NULL);
> +		close(vdev->device_state_fd);
> +		vdev->device_state_fd = -1;
> +	} else if (events & EPOLLIN) {
> +		char buf[6];
> +
> +		debug("Loading backend state");
> +		/* read some stuff */
> +		ret = read(vdev->device_state_fd, buf, sizeof(buf));
> +		/* value to be returned by VHOST_USER_CHECK_DEVICE_STATE */
> +		if (ret != sizeof(buf)) {
> +			vdev->device_state_result = -1;

Same here.

> +		} else {
> +			ret = strncmp(buf, "PASST", sizeof(buf));
> +			vdev->device_state_result = ret == 0 ? 0 : -1;
> +		}
> +	} else if (events & EPOLLHUP) {
> +		debug("Closing migration channel");
> +
> +		/* The end of file signals the end of the transfer.
> +		 */
> +		epoll_ctl(vdev->context->epollfd, EPOLL_CTL_DEL,
> +			  vdev->device_state_fd, NULL);
> +		close(vdev->device_state_fd);
> +		vdev->device_state_fd = -1;
> +	}
> +}
> diff --git a/vu_common.h b/vu_common.h
> index bd70faf3e226..d56c021ab0f9 100644
> --- a/vu_common.h
> +++ b/vu_common.h
> @@ -57,4 +57,5 @@ void vu_flush(const struct vu_dev *vdev, struct vu_virtq *vq,
>  void vu_kick_cb(struct vu_dev *vdev, union epoll_ref ref,
>  		const struct timespec *now);
>  int vu_send_single(const struct ctx *c, const void *buf, size_t size);
> +void vu_migrate(struct vu_dev *vdev, uint32_t events);
>  #endif /* VU_COMMON_H */

The rest of the series looks good to me. I can also fix up all the
formal things on merge, but I guess you want to respin (at least for
the "fake RARP" thing) anyway?

--
Stefano
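
For reference, a minimal sketch of how the short-write case ("what if
write() returns 5?") could be handled. This is only an illustration
under assumptions: write_all() is a hypothetical helper, not something
from the patch or from passt, and the actual migration code might well
end up using writev() or a different structure entirely.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of passt): write exactly len bytes to fd,
 * retrying after short writes and EINTR.
 *
 * Return: 0 on success, -1 on error with errno set
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (n == 0) {
			/* No progress: report an error instead of spinning */
			errno = EIO;
			return -1;
		}

		p += n;
		len -= (size_t)n;
	}

	return 0;
}
```

With something like this, the save path could set device_state_result
from the return value instead of comparing write()'s result with -1.
Note the sketch assumes a blocking descriptor: on a non-blocking one,
EAGAIN would have to go back through epoll rather than be treated as a
failure.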