From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2] flow, repair: Wait for a short while for passt-repair to connect
Date: Wed, 12 Mar 2025 21:39:10 +0100
Message-ID: <20250312213910.059118d6@elisabeth>
In-Reply-To: <Z9DjZ9zctFY9WzrP@zatzit>
On Wed, 12 Mar 2025 12:29:11 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Tue, Mar 11, 2025 at 10:55:32PM +0100, Stefano Brivio wrote:
> > On Tue, 11 Mar 2025 12:13:46 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Fri, Mar 07, 2025 at 11:41:29PM +0100, Stefano Brivio wrote:
> > > > ...and time out after that. This will be needed because of an upcoming
> > > > change to passt-repair enabling it to start before passt is started,
> > > > on both source and target, by means of an inotify watch.
> > > >
> > > > Once the inotify watch triggers, passt-repair will connect right away,
> > > > but we have no guarantees that the connection completes before we
> > > > start the migration process, so wait for it (for a reasonable amount
> > > > of time).
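> > > >
> > > > Conceptually, the wait is just a short poll() on the repair
> > > > listening socket before giving up. A minimal sketch (hypothetical
> > > > helper name, not the actual passt code):
> > > >
> > > >   #include <poll.h>
> > > >   #include <sys/socket.h>
> > > >
> > > >   /* Wait up to 10 ms for passt-repair to connect, then accept it.
> > > >    * Returns the connected socket, or -1 on timeout/error.
> > > >    */
> > > >   static int repair_wait_accept(int listen_fd)
> > > >   {
> > > >   	struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
> > > >
> > > >   	if (poll(&pfd, 1, 10 /* ms */) <= 0)
> > > >   		return -1;	/* helper didn't connect in time */
> > > >
> > > >   	return accept(listen_fd, NULL, NULL);
> > > >   }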
> > > >
> > > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > >
> > > I still think it's ugly, of course, but I don't see a better way, so:
> > >
> > > Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> > >
> > > > ---
> > > > v2:
> > > >
> > > > - Use 10 ms as timeout instead of 100 ms. Given that I'm unable to
> > > > migrate a simple guest with 256 MiB of memory and no storage other
> > > > than an initramfs in less than 4 milliseconds, at least on my test
> > > > system (rather fast CPU threads and memory interface), I think that
> > > > 10 ms shouldn't make a big difference in case passt-repair is not
> > > > available for whatever reason
> > >
> > > So, IIUC, that 4ms is the *total* migration time.
> >
> > Ah, no, that's passt-to-passt in the migrate/basic test, to have a fair
> > comparison. That is:
> >
> > $ git diff
> > diff --git a/migrate.c b/migrate.c
> > index 0fca77b..3d36843 100644
> > --- a/migrate.c
> > +++ b/migrate.c
> > @@ -286,6 +286,13 @@ void migrate_handler(struct ctx *c)
> >  	if (c->device_state_fd < 0)
> >  		return;
> >
> > +#include <time.h>
> > +	{
> > +		struct timespec now;
> > +		clock_gettime(CLOCK_REALTIME, &now);
> > +		err("tv: %li.%09li", now.tv_sec, now.tv_nsec);
> > +	}
> > +
> >  	debug("Handling migration request from fd: %d, target: %d",
> >  	      c->device_state_fd, c->migrate_target);
>
> Ah. That still doesn't really measure the guest downtime, for two reasons:
> * It measures from start of migration on the source to start of
> migration on the target, largely ignoring the actual duration of
> passt's processing
> * It ignores everything that happens during the final migration
> phase *except* for passt itself
>
> But, it is necessarily a lower bound on the downtime, which I guess is
> enough in this instance.
>
> > $ grep tv\: test/test_logs/context_passt_*.log
> > test/test_logs/context_passt_1.log:tv: 1741729630.368652064
> > test/test_logs/context_passt_2.log:tv: 1741729630.378664420
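> >
> > (That is, 1741729630.378664420 - 1741729630.368652064 = 0.010012356 s,
> > so just over 10 ms between the two err() calls.)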
> >
> > In this case it's 10 ms, but I can sometimes get 7 ms. This is with 512
> > MiB, but with 256 MiB I typically get 5 to 6 ms, and sometimes slightly
> > more than 4 ms. One flow or zero flows seem to make little difference.
>
> Of course, because both ends of the measurement take place before we
> actually do anything. I wouldn't expect it to vary based on how much
> we're doing. All this really measures is the latency from notifying
> the source passt to notifying the target passt.
>
> > > The concern here is
> > > not that we add to the total migration time, but that we add to the
> > > migration downtime, that is, the time the guest is not running
> > > anywhere. The downtime can be much smaller than the total migration
> > > time. Furthermore qemu has no way to account for this delay in its
> > > estimate of what the downtime will be - the time for transferring
> > > device state is pretty much assumed to be negligible in comparison to
> > > transferring guest memory contents. So, if qemu stops the guest at
> > > the point that the remaining memory transfer will just fit in the
> > > downtime limit, any delays we add will likely cause the downtime limit
> > > to be missed by that much.
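> > >
> > > (As a worked example with assumed numbers: at, say, 10 GiB/s of
> > > effective transfer rate, a 300 ms limit means qemu stops the guest
> > > once at most ~3 GiB of memory remains, expecting downtime to land
> > > right at the limit; a 10 ms device-state delay it doesn't account
> > > for then gives ~310 ms, and 100 ms would give ~400 ms.)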
> > >
> > > Now, as it happens, the default downtime limit is 300ms, so an
> > > additional 10ms is probably fine (though 100ms really wasn't).
> > > Nonetheless the reasoning above isn't valid.
> >
> > ~50 ms is actually quite easy to get with a few (8) gigabytes of
> > memory,
>
> 50ms as measured above? That's a bit surprising, because there's no
> particular reason for it to depend on memory size. AFAICT
> SET_DEVICE_STATE_FD is called close to immediately before actually
> reading/writing the stream from the backend.
Oops, right, the figure I had in mind actually came from a rather
different measurement: checking, in traffic captures taken with iperf3
running, when the guest appeared to resume.
I definitely can't see this difference if I repeat the same measurement
as above.
> The memory size will of course affect the total migration time, and
> maybe the downtime. As soon as qemu thinks it can transfer all
> remaining RAM within its downtime limit, qemu will go to the stopped
> phase. With a fast local to local connection, it's possible qemu
> could enter that stopped phase almost immediately.
>
> > that's why 100 ms also looked fine to me, but sure, 10 ms
> > sounds more reasonable.
--
Stefano
Thread overview: 6+ messages
2025-03-07 22:41 [PATCH v2] flow, repair: Wait for a short while for passt-repair to connect Stefano Brivio
2025-03-11 1:13 ` David Gibson
2025-03-11 21:55 ` Stefano Brivio
2025-03-12 1:29 ` David Gibson
2025-03-12 20:39 ` Stefano Brivio [this message]
2025-03-13 3:03 ` David Gibson