From: Laurent Vivier <lvivier@redhat.com>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 8/9] vhost-user: add VHOST_USER_SET_DEVICE_STATE_FD command
Date: Fri, 20 Dec 2024 08:56:27 +0100 [thread overview]
Message-ID: <61d7f502-6f4d-40bc-b3a5-feed4229585d@redhat.com> (raw)
In-Reply-To: <20241219204759.65bc5353@elisabeth>
On 19/12/2024 20:47, Stefano Brivio wrote:
> On Thu, 19 Dec 2024 12:13:59 +0100
> Laurent Vivier <lvivier@redhat.com> wrote:
>
>> +/**
>> + * vu_migrate() -- Send/receive passt insternal state to/from QEMU
>
> Magic!
>
>> + * @vdev: vhost-user device
>> + * @events: epoll events
>> + */
>> +void vu_migrate(struct vu_dev *vdev, uint32_t events)
>> +{
>> + int ret;
>> +
>> + /* TODO: collect/set passt internal state
>> + * and use vdev->device_state_fd to send/receive it
>> + */
>> + debug("vu_migrate fd %d events %x", vdev->device_state_fd, events);
>> + if (events & EPOLLOUT) {
>
> I haven't really reviewed the series yet, but I have a preliminary
> question: how does the hypervisor tell us that we're writing too much?
The hypervisor is not aware of anything. It's only a bridge between the two passt instances.
But normally the destination side can report an error, this aborts the migration on both
sides.
In QEMU, the code is:
while (true) {
ssize_t read_ret;
/* read the data from the backend */
read_ret = RETRY_ON_EINTR(read(read_fd, transfer_buf, chunk_size));
if (read_ret < 0) {
ret = -errno;
error_setg_errno(errp, -ret, "Failed to receive state");
goto fail;
}
assert(read_ret <= chunk_size);
qemu_put_be32(f, read_ret);
if (read_ret == 0) {
/* EOF */
break;
}
/* send to destination QEMU */
qemu_put_buffer(f, transfer_buf, read_ret);
}
/*
* Back-end will not really care, but be clean and close our end of the pipe
* before inquiring the back-end about whether transfer was successful
*/
close(read_fd);
On the other side:
while (true) {
size_t this_chunk_size = qemu_get_be32(f);
ssize_t write_ret;
const uint8_t *transfer_pointer;
if (this_chunk_size == 0) {
/* End of state */
break;
}
if (transfer_buf_size < this_chunk_size) {
transfer_buf = g_realloc(transfer_buf, this_chunk_size);
transfer_buf_size = this_chunk_size;
}
if (qemu_get_buffer(f, transfer_buf, this_chunk_size) <
this_chunk_size)
{
error_setg(errp, "Failed to read state");
ret = -EINVAL;
goto fail;
}
transfer_pointer = transfer_buf;
while (this_chunk_size > 0) {
write_ret = RETRY_ON_EINTR(
write(write_fd, transfer_pointer, this_chunk_size)
);
if (write_ret < 0) {
ret = -errno;
error_setg_errno(errp, -ret, "Failed to send state");
goto fail;
} else if (write_ret == 0) {
error_setg(errp, "Failed to send state: Connection is closed");
ret = -ECONNRESET;
goto fail;
}
assert(write_ret <= this_chunk_size);
this_chunk_size -= write_ret;
transfer_pointer += write_ret;
}
}
/*
* Close our end, thus ending transfer, before inquiring the back-end about
* whether transfer was successful
*/
close(write_fd);
Moreover, I think it's important to know, the source side is stopped at the end of
migration but it is in a state it can be restarted.
> I guess we'll do a short write and we'll need to go back to EPOLLOUT?
> There's no minimum chunk size we can write, correct?
Correct. It works even if there is no write() in this code.
The hypervisor reads until we close the file descriptor (it's a pipe in fact).
>
>> + debug("Saving backend state");
>> +
>> + /* send some stuff */
>> + ret = write(vdev->device_state_fd, "PASST", 6);
>> + /* value to be returned by VHOST_USER_CHECK_DEVICE_STATE */
>> + vdev->device_state_result = ret == -1 ? -1 : 0;
>> + /* Closing the file descriptor signals the end of transfer */
>> + epoll_ctl(vdev->context->epollfd, EPOLL_CTL_DEL,
>> + vdev->device_state_fd, NULL);
>> + close(vdev->device_state_fd);
>> + vdev->device_state_fd = -1;
>> + } else if (events & EPOLLIN) {
>
> ...and similarly here, I guess we'll get a short read?
We must read until we get a close().
>
>> + char buf[6];
>> +
>> + debug("Loading backend state");
>> + /* read some stuff */
>> + ret = read(vdev->device_state_fd, buf, sizeof(buf));
>> + /* value to be returned by VHOST_USER_CHECK_DEVICE_STATE */
>> + if (ret != sizeof(buf)) {
>> + vdev->device_state_result = -1;
>> + } else {
>> + ret = strncmp(buf, "PASST", sizeof(buf));
>> + vdev->device_state_result = ret == 0 ? 0 : -1;
>> + }
>> + } else if (events & EPOLLHUP) {
>> + debug("Closing migration channel");
>> +
>> + /* The end of file signals the end of the transfer. */
>> + epoll_ctl(vdev->context->epollfd, EPOLL_CTL_DEL,
>> + vdev->device_state_fd, NULL);
>> + close(vdev->device_state_fd);
>> + vdev->device_state_fd = -1;
>> + }
>> +}
>
Thanks,
Laurent
next prev parent reply other threads:[~2024-12-20 7:56 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-19 11:13 [PATCH 0/9] vhost-user: Migration support Laurent Vivier
2024-12-19 11:13 ` [PATCH 1/9] virtio: Use const pointer for vu_dev Laurent Vivier
2024-12-20 0:24 ` David Gibson
2024-12-19 11:13 ` [PATCH 2/9] vhost-user: update protocol features and commands list Laurent Vivier
2024-12-19 11:13 ` [PATCH 3/9] vhost-user: add VHOST_USER_SET_LOG_FD command Laurent Vivier
2024-12-19 11:13 ` [PATCH 4/9] vhost-user: Pass vu_dev to more virtio functions Laurent Vivier
2024-12-19 11:13 ` [PATCH 5/9] vhost-user: add VHOST_USER_SET_LOG_BASE command Laurent Vivier
2024-12-19 11:13 ` [PATCH 6/9] vhost-user: Report to front-end we support VHOST_USER_PROTOCOL_F_LOG_SHMFD Laurent Vivier
2024-12-19 11:13 ` [PATCH 7/9] vhost-user: add VHOST_USER_CHECK_DEVICE_STATE command Laurent Vivier
2024-12-19 11:13 ` [PATCH 8/9] vhost-user: add VHOST_USER_SET_DEVICE_STATE_FD command Laurent Vivier
2024-12-19 19:47 ` Stefano Brivio
2024-12-20 7:56 ` Laurent Vivier [this message]
2024-12-20 13:28 ` Stefano Brivio
2024-12-19 11:14 ` [PATCH 9/9] vhost-user: Report to front-end we support VHOST_USER_PROTOCOL_F_DEVICE_STATE Laurent Vivier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=61d7f502-6f4d-40bc-b3a5-feed4229585d@redhat.com \
--to=lvivier@redhat.com \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).