From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top, Laurent Vivier <lvivier@redhat.com>
Subject: Re: [PATCH v3 17/20] vhost_user: Make source quit after reporting migration state
Date: Mon, 3 Feb 2025 19:52:37 +1100 [thread overview]
Message-ID: <Z6CD1Xse_07cPC3f@zatzit> (raw)
In-Reply-To: <20250203070932.2159fdc0@elisabeth>
[-- Attachment #1: Type: text/plain, Size: 2958 bytes --]
On Mon, Feb 03, 2025 at 07:09:32AM +0100, Stefano Brivio wrote:
> On Mon, 3 Feb 2025 12:55:47 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Fri, Jan 31, 2025 at 08:39:50PM +0100, Stefano Brivio wrote:
> > > On migration, the source process asks passt-helper to set TCP sockets
> > > in repair mode, dumps the information we need to migrate connections,
> > > and closes them.
> > >
> > > At this point, we can't pass them back to passt-helper using
> > > SCM_RIGHTS, because they are closed, from that perspective, and
> > > sendmsg() will give us EBADF. But if we don't clear repair mode, the
> > > port they are bound to will not be available for binding in the
> > > target.
> > >
> > > Terminate once we're done with the migration and we reported the
> > > state. This is equivalent to clearing repair mode on the sockets we
> > > just closed.
> >
> > As noted on the passt-repair patch, I think this is based on a
> > misinterpreation of the situation. I think the problem is that the
> > sockets aren't closed in passt-repair, so the additional handle copy
> > is keeping the underlying socket open. This appears to work, because
> > it is causing passt-repair to also terminate.
>
> Right, exactly that.
>
> > That said, we probably want to terminate on the source side after a
> > succesful migrate anyway. At the very least we need to close() all
> > our sockets, and delete the corresponding flows, because we don't own
> > them any more. Quitting is probably the simplest way to do that.
>
> I'm not sure if there's an established behaviour for helpers supporting
> state migration.
By "helper" do you mean passt as a device helper to qemu, or
passt-repair as a helper to passt. For the latter I wouldn't expect
so - it's only a weirdness of our situation that we need passt-repair
at all. If the former, I'm not really sure what you're after.
> We could probably close sockets, delete flows, and keep things up and
> running for the rest (restart from a clean situation), but at that
> point we already the guest networking is already broken in a number of
> ways. So, yeah, maybe let's keep this instead.
So, I realised it's a bit more complicated than that. We need to
identify exactly where the "point of no return" is. I'll discuss in
our call tonight.
> > Which also makes me realise, on a *failed* outbound migration, we _do_
> > need to turn repair mode off on everything again. Is that implemented
> > yet?
>
> No, sorry, that's the "/* TODO: rollback */" comments in
> flow_migrate_source_pre(), flow.c. But other than that, it should be
> pretty much implemented, at a migrate.c level.
Right, I realised that a bit after I wrote this.
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2025-02-03 9:25 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-31 19:39 [PATCH v3 00/20] Draft, incomplete series introducing state migration Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 01/20] tcp: Always pass NULL event with EPOLL_CTL_DEL Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 02/20] util: Rename and make global vu_remove_watch() Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 03/20] icmp, udp: Pad time_t timestamp to 64-bit to ease state migration Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 04/20] flow, flow_table: Pad flow table entries to 128 bytes, hash entries to 32 bits Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 05/20] flow_table: Use size in extern declaration for flowtab Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 06/20] util: Add read_remainder() and read_all_buf() Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 07/20] Introduce facilities for guest migration on top of vhost-user infrastructure Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 08/20] Introduce passt-repair Stefano Brivio
2025-02-03 1:46 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 09/20] Add interfaces and configuration bits for passt-repair Stefano Brivio
2025-02-03 5:22 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 10/20] flow, tcp: Basic pre-migration source handler to dump sequence numbers Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 11/20] migrate: vu_migrate_{source,target}() aren't actually vu speciic Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 12/20] migrate: Move repair_sock_init() to vu_init() Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 13/20] migrate: Make more handling common rather than vhost-user specific Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 14/20] migrate: Don't handle the migration channel through epoll Stefano Brivio
2025-02-03 1:50 ` David Gibson
2025-02-03 5:38 ` Stefano Brivio
2025-02-03 8:45 ` David Gibson
2025-02-03 2:16 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 15/20] flow, flow_table: Export declaration of hash table Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 16/20] vhost_user: Turn vhost-user message reports to trace() Stefano Brivio
2025-02-03 3:11 ` David Gibson
2025-02-03 6:10 ` Stefano Brivio
2025-02-03 8:47 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 17/20] vhost_user: Make source quit after reporting migration state Stefano Brivio
2025-02-03 1:55 ` David Gibson
2025-02-03 6:09 ` Stefano Brivio
2025-02-03 8:52 ` David Gibson [this message]
2025-02-03 9:44 ` Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 18/20] tcp: Get our socket port using getsockname() when connecting from guest Stefano Brivio
2025-02-03 2:05 ` David Gibson
2025-02-03 6:09 ` Stefano Brivio
2025-02-03 8:59 ` David Gibson
2025-02-03 9:45 ` Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 19/20] tcp: Add HOSTSIDE(x), HOSTFLOW(x) macros Stefano Brivio
2025-02-03 2:06 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 20/20] Implement target side of migration Stefano Brivio
2025-02-01 7:45 ` [PATCH v3 00/20] Draft, incomplete series introducing state migration Stefano Brivio
2025-02-03 2:18 ` David Gibson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z6CD1Xse_07cPC3f@zatzit \
--to=david@gibson.dropbear.id.au \
--cc=lvivier@redhat.com \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).