From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2] flow, repair: Wait for a short while for passt-repair to connect
Date: Wed, 12 Mar 2025 21:39:10 +0100
Message-ID: <20250312213910.059118d6@elisabeth>
In-Reply-To: <Z9DjZ9zctFY9WzrP@zatzit>

On Wed, 12 Mar 2025 12:29:11 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Mar 11, 2025 at 10:55:32PM +0100, Stefano Brivio wrote:
> > On Tue, 11 Mar 2025 12:13:46 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Fri, Mar 07, 2025 at 11:41:29PM +0100, Stefano Brivio wrote:  
> > > > ...and time out after that. This will be needed because of an upcoming
> > > > change to passt-repair enabling it to start before passt is started,
> > > > on both source and target, by means of an inotify watch.
> > > > 
> > > > Once the inotify watch triggers, passt-repair will connect right away,
> > > > but we have no guarantees that the connection completes before we
> > > > start the migration process, so wait for it (for a reasonable amount
> > > > of time).
> > > > 
> > > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>    
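
For context, the inotify mechanism mentioned above amounts to watching
for the socket path to appear, then connecting right away. A minimal
sketch of the idea in C, with made-up names, not the actual
passt-repair code:

#include <limits.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Sketch: block until something is created in 'dir', then try to
 * connect to dir/name. A real implementation would also check the
 * event name and retry; error handling is trimmed for brevity. */
static int connect_when_created(const char *dir, const char *name)
{
	char buf[sizeof(struct inotify_event) + NAME_MAX + 1];
	struct sockaddr_un a = { .sun_family = AF_UNIX };
	int ifd = inotify_init1(0);
	int sfd;

	if (ifd < 0 || inotify_add_watch(ifd, dir, IN_CREATE) < 0)
		return -1;

	if (read(ifd, buf, sizeof(buf)) < 0)	/* wait for IN_CREATE */
		return -1;

	snprintf(a.sun_path, sizeof(a.sun_path), "%s/%s", dir, name);

	sfd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (sfd < 0 || connect(sfd, (struct sockaddr *)&a, sizeof(a)) < 0)
		return -1;

	close(ifd);
	return sfd;
}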
> > > 
> > > I still think it's ugly, of course, but I don't see a better way, so:
> > > 
> > > Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> > >   
> > > > ---
> > > > v2:
> > > > 
> > > > - Use 10 ms as timeout instead of 100 ms. Given that I'm unable to
> > > >   migrate a simple guest with 256 MiB of memory and no storage other
> > > >   than an initramfs in less than 4 milliseconds, at least on my test
> > > >   system (rather fast CPU threads and memory interface), I think that
> > > >   10 ms shouldn't make a big difference in case passt-repair is not
> > > >   available for whatever reason    
> > > 
> > > So, IIUC, that 4ms is the *total* migration time.  
> > 
> > Ah, no, that's passt-to-passt in the migrate/basic test, to have a fair
> > comparison. That is:
> > 
> > $ git diff
> > diff --git a/migrate.c b/migrate.c
> > index 0fca77b..3d36843 100644
> > --- a/migrate.c
> > +++ b/migrate.c
> > @@ -286,6 +286,13 @@ void migrate_handler(struct ctx *c)
> >  	if (c->device_state_fd < 0)
> >  		return;
> >  
> > +#include <time.h>
> > +	{
> > +		struct timespec now;
> > +		clock_gettime(CLOCK_REALTIME, &now);
> > +		err("tv: %li.%09li", now.tv_sec, now.tv_nsec);
> > +	}
> > +
> >  	debug("Handling migration request from fd: %d, target: %d",
> >  	      c->device_state_fd, c->migrate_target);  
> 
> Ah.  That still doesn't really measure the guest downtime, for two reasons:
>   * It measures from start of migration on the source to start of
>     migration on the target, largely ignoring the actual duration of
>     passt's processing
>   * It ignores everything that happens during the final migration
>     phase *except* for passt itself
> 
> But, it is necessarily a lower bound on the downtime, which I guess is
> enough in this instance.
> 
> > $ grep tv\: test/test_logs/context_passt_*.log
> > test/test_logs/context_passt_1.log:tv: 1741729630.368652064
> > test/test_logs/context_passt_2.log:tv: 1741729630.378664420
> > 
> > In this case it's 10 ms, but I can sometimes get 7 ms. This is with 512
> > MiB, but with 256 MiB I typically get 5 to 6 ms, and sometimes slightly
> > more than 4 ms. One flow or zero flows seem to make little difference.  
> 
> Of course, because both ends of the measurement take place before we
> actually do anything.  I wouldn't expect it to vary based on how much
> we're doing.  All this really measures is the latency from notifying
> the source passt to notifying the target passt.
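
Indeed. To also bound passt's own processing, the same hack could log
a second timestamp once the handler returns; a sketch in the same
spirit as the diff above (helper name made up, not actual code;
CLOCK_REALTIME keeps timestamps from the two processes comparable on
the same host):

#include <stdio.h>
#include <time.h>

/* Log a wall-clock timestamp with a tag, so that "enter"/"leave"
 * pairs in the source and target logs bracket the handler itself,
 * not just the notification-to-notification latency. */
static void log_ts(const char *tag)
{
	struct timespec now;

	clock_gettime(CLOCK_REALTIME, &now);
	fprintf(stderr, "%s: %li.%09li\n", tag,
		(long)now.tv_sec, now.tv_nsec);
}

/* In migrate_handler():
 *
 *	log_ts("enter");
 *	...existing handling...
 *	log_ts("leave");
 */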
> 
> > > The concern here is
> > > not that we add to the total migration time, but that we add to the
> > > migration downtime, that is, the time the guest is not running
> > > anywhere.  The downtime can be much smaller than the total migration
> > > time.  Furthermore qemu has no way to account for this delay in its
> > > estimate of what the downtime will be - the time for transferring
> > > device state is pretty much assumed to be negligible in comparison to
> > > transferring guest memory contents.  So, if qemu stops the guest at
> > > the point that the remaining memory transfer will just fit in the
> > > downtime limit, any delays we add will likely cause the downtime limit
> > > to be missed by that much.
> > > 
> > > Now, as it happens, the default downtime limit is 300ms, so an
> > > additional 10ms is probably fine (though 100ms really wasn't).
> > > Nonetheless the reasoning above isn't valid.  
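
To put rough numbers on that, using the figures from this thread: if
qemu stops the guest exactly when the remaining RAM transfer is
estimated to fill the default 300 ms budget, an extra 100 ms of
device-state delay gives roughly 400 ms of actual downtime, a 33%
overshoot, while an extra 10 ms overshoots by only about 3%.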
> > 
> > ~50 ms is actually quite easy to get with a few (8) gigabytes of
> > memory,  
> 
> 50ms as measured above?  That's a bit surprising, because there's no
> particular reason for it to depend on memory size.  AFAICT
> SET_DEVICE_STATE_FD is called close to immediately before actually
> reading/writing the stream from the backend.

Oops, right, the figure I had in mind actually came from a rather
different measurement: checking traffic captures, taken with iperf3
running, to see when the guest appeared to resume.

I definitely can't see this difference if I repeat the same measurement
as above.
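
A quick way to extract that kind of figure from a capture is to look
for the longest gap between consecutive packets of the iperf3 flow.
A throwaway sketch, assuming libpcap (just the idea, not what was
actually used here):

#include <pcap/pcap.h>
#include <stdio.h>

/* Report the longest gap between consecutive packets in a capture.
 * With iperf3 saturating the link, that gap approximates how long
 * the guest was down. */
int main(int argc, char **argv)
{
	char errbuf[PCAP_ERRBUF_SIZE];
	struct pcap_pkthdr *h;
	const u_char *data;
	double prev = 0, max_gap = 0;
	pcap_t *p;

	if (argc != 2 || !(p = pcap_open_offline(argv[1], errbuf))) {
		fprintf(stderr, "usage: %s FILE.pcap\n", argv[0]);
		return 1;
	}

	while (pcap_next_ex(p, &h, &data) == 1) {
		double t = h->ts.tv_sec + h->ts.tv_usec / 1e6;

		if (prev && t - prev > max_gap)
			max_gap = t - prev;
		prev = t;
	}

	printf("longest gap: %.6f s\n", max_gap);
	pcap_close(p);
	return 0;
}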

> The memory size will of course affect the total migration time, and
> maybe the downtime.  As soon as qemu thinks it can transfer all
> remaining RAM within its downtime limit, qemu will go to the stopped
> phase.  With a fast local to local connection, it's possible qemu
> could enter that stopped phase almost immediately.
> 
> > that's why 100 ms also looked fine to me, but sure, 10 ms
> > sounds more reasonable.
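
For the archives, the wait this patch adds boils down to something
like the following (sketch only, not the actual passt code, with the
10 ms value discussed above):

#include <poll.h>
#include <sys/socket.h>

/* Wait up to 10 ms for passt-repair to connect to an already
 * listening Unix domain socket; give up on timeout so migration can
 * proceed without it. */
static int repair_accept_timed(int listen_fd)
{
	struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };

	/* poll() returns 0 on timeout, > 0 if a connection is pending */
	if (poll(&pfd, 1, 10 /* ms */) <= 0)
		return -1;

	return accept(listen_fd, NULL, NULL);
}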

-- 
Stefano

