Date: Wed, 12 Mar 2025 12:29:11 +1100
From: David Gibson
To: Stefano Brivio
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2] flow, repair: Wait for a short while for passt-repair to connect
References: <20250307224129.2789988-1-sbrivio@redhat.com> <20250311225532.7ddaa1cd@elisabeth>
In-Reply-To: <20250311225532.7ddaa1cd@elisabeth>

On Tue, Mar 11, 2025 at 10:55:32PM +0100, Stefano Brivio wrote:
> On Tue, 11 Mar 2025 12:13:46 +1100
> David Gibson wrote:
> 
> > On Fri, Mar 07, 2025 at 11:41:29PM +0100, Stefano Brivio wrote:
> > > ...and time out after that. This will be needed because of an upcoming
> > > change to passt-repair enabling it to start before passt is started,
> > > on both source and target, by means of an inotify watch.
> > > 
> > > Once the inotify watch triggers, passt-repair will connect right away,
> > > but we have no guarantees that the connection completes before we
> > > start the migration process, so wait for it (for a reasonable amount
> > > of time).
> > > 
> > > Signed-off-by: Stefano Brivio
> > 
> > I still think it's ugly, of course, but I don't see a better way, so:
> > 
> > Reviewed-by: David Gibson
> > 
> > > ---
> > > v2:
> > > 
> > > - Use 10 ms as timeout instead of 100 ms. Given that I'm unable to
> > >   migrate a simple guest with 256 MiB of memory and no storage other
> > >   than an initramfs in less than 4 milliseconds, at least on my test
> > >   system (rather fast CPU threads and memory interface), I think that
> > >   10 ms shouldn't make a big difference in case passt-repair is not
> > >   available for whatever reason
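
Just to make that concrete for anyone skimming the thread, the kind of
bounded wait being described would look roughly like the sketch below.
The function and macro names, and the plain poll()/accept() pair, are
made up for illustration; this is not the actual patch.

/* Rough sketch only: wait briefly for passt-repair to connect to an
 * already-listening UNIX domain socket.  Hypothetical names, not the
 * actual passt code.
 */
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

#define REPAIR_ACCEPT_TIMEOUT_MS	10

/* Return the accepted socket, or -1 if nobody connected in time */
static int repair_wait_connect(int listen_fd)
{
	struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
	int rc;

	do
		rc = poll(&pfd, 1, REPAIR_ACCEPT_TIMEOUT_MS);
	while (rc < 0 && errno == EINTR);

	if (rc <= 0)		/* timed out (0) or failed (-1) */
		return -1;

	return accept(listen_fd, NULL, NULL);
}

If the helper never shows up, the wait costs at most those 10 ms, which
is the trade-off being weighed in this thread.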
> > 
> > So, IIUC, that 4ms is the *total* migration time.
> 
> Ah, no, that's passt-to-passt in the migrate/basic test, to have a fair
> comparison. That is:
> 
> $ git diff
> diff --git a/migrate.c b/migrate.c
> index 0fca77b..3d36843 100644
> --- a/migrate.c
> +++ b/migrate.c
> @@ -286,6 +286,13 @@ void migrate_handler(struct ctx *c)
>  	if (c->device_state_fd < 0)
>  		return;
>  
> +#include <time.h>
> +	{
> +		struct timespec now;
> +		clock_gettime(CLOCK_REALTIME, &now);
> +		err("tv: %li.%li", now.tv_sec, now.tv_nsec);
> +	}
> +
>  	debug("Handling migration request from fd: %d, target: %d",
>  	      c->device_state_fd, c->migrate_target);

Ah. That still doesn't really measure the guest downtime, for two
reasons:

 * It measures from start of migration on the source to start of
   migration on the target, largely ignoring the actual duration of
   passt's processing
 * It ignores everything that happens during the final migration
   phase *except* for passt itself

But, it is necessarily a lower bound on the downtime, which I guess is
enough in this instance.

> $ grep tv\: test/test_logs/context_passt_*.log
> test/test_logs/context_passt_1.log:tv: 1741729630.368652064
> test/test_logs/context_passt_2.log:tv: 1741729630.378664420
> 
> In this case it's 10 ms, but I can sometimes get 7 ms. This is with 512
> MiB, but with 256 MiB I typically get 5 to 6 ms, and sometimes slightly
> more than 4 ms. One flow or zero flows seem to make little difference.

Of course, because both ends of the measurement take place before we
actually do anything. I wouldn't expect it to vary based on how much
we're doing. All this really measures is the latency from notifying the
source passt to notifying the target passt.

> > The concern here is
> > not that we add to the total migration time, but that we add to the
> > migration downtime, that is, the time the guest is not running
> > anywhere. The downtime can be much smaller than the total migration
> > time. Furthermore qemu has no way to account for this delay in its
> > estimate of what the downtime will be - the time for transferring
> > device state is pretty much assumed to be negligible in comparison to
> > transferring guest memory contents. So, if qemu stops the guest at
> > the point that the remaining memory transfer will just fit in the
> > downtime limit, any delays we add will likely cause the downtime limit
> > to be missed by that much.
> > 
> > Now, as it happens, the default downtime limit is 300ms, so an
> > additional 10ms is probably fine (though 100ms really wasn't).
> > Nonetheless the reasoning above isn't valid.
> 
> ~50 ms is actually quite easy to get with a few (8) gigabytes of
> memory,

50ms as measured above? That's a bit surprising, because there's no
particular reason for it to depend on memory size. AFAICT
SET_DEVICE_STATE_FD is called close to immediately before actually
reading/writing the stream from the backend.

The memory size will of course affect the total migration time, and
maybe the downtime. As soon as qemu thinks it can transfer all
remaining RAM within its downtime limit, qemu will go to the stopped
phase. With a fast local to local connection, it's possible qemu could
enter that stopped phase almost immediately.
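
To put that last point in context, here is a back-of-the-envelope
version of the decision qemu makes before stopping the guest. It is
illustrative only, not qemu's actual code, and the names are invented;
the point is that any wait we add on top (such as the 10 ms for
passt-repair) is not part of this estimate, so it comes straight out of
the downtime budget.

/* Illustrative only, not actual qemu code: qemu stops the guest once
 * it expects the remaining data to fit within the downtime limit at
 * the current transfer bandwidth.
 */
#include <stdbool.h>
#include <stdint.h>

static bool ready_to_stop(uint64_t pending_bytes,
			  uint64_t bandwidth_bytes_per_ms,
			  uint64_t downtime_limit_ms)
{
	/* Bytes we expect to be able to push while the guest is paused */
	uint64_t budget = bandwidth_bytes_per_ms * downtime_limit_ms;

	return pending_bytes <= budget;
}

With the default 300 ms limit mentioned above, an extra 10 ms is a
small fraction of that budget; 100 ms would not have been.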

> that's why 100 ms also looked fine to me, but sure, 10 ms
> sounds more reasonable.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson