From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202502 header.b=i1XHBkNK; dkim-atps=neutral Received: from mail.ozlabs.org (unknown [150.107.74.76]) by passt.top (Postfix) with ESMTPS id A49A55A0276 for ; Tue, 11 Mar 2025 02:15:07 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202502; t=1741655676; bh=aJAjAYzTn/V32DSHOZbbfMFPEULFrmhNFuFCXacOHQk=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=i1XHBkNKMsRk1usm/f5uihTOTA//KliVx9aQ1WVYqgxCmoscUaIYLRQafUqAhTjCm oSoCl30jfGPBqrl9q05wM7nee4MYNUvrm27GJ1WTxXzu9aiETOqrvwAlHcKZvU7pMv XhcvK7+YMGWVXzlBxt45jArJvBaBMGv1eGoprc96Mukt9ECB0/3CQSr56B6kTZVyT8 5FXYGiBEOLaYkaaOg3eGnmhL5K3QeB7daAiztW+ecqrMI/F1u5J2tzEoshP2fPeskl WznG97uHqxi4lWG7XYAwNTZHih6gyzjneipCtiZdNeoMRq7cDBWIz9Q3EEAsoz9UoJ RLGzgeJYX5AHA== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4ZBbQN733wz4x8R; Tue, 11 Mar 2025 12:14:36 +1100 (AEDT) Date: Tue, 11 Mar 2025 12:13:46 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH v2] flow, repair: Wait for a short while for passt-repair to connect Message-ID: References: <20250307224129.2789988-1-sbrivio@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="Uyyx27q6gSwqWwbR" Content-Disposition: inline In-Reply-To: <20250307224129.2789988-1-sbrivio@redhat.com> Message-ID-Hash: G2QIYXOIYOTWQRLLZYTIJVPYIBAHEDQT X-Message-ID-Hash: G2QIYXOIYOTWQRLLZYTIJVPYIBAHEDQT X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --Uyyx27q6gSwqWwbR Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Mar 07, 2025 at 11:41:29PM +0100, Stefano Brivio wrote: > ...and time out after that. This will be needed because of an upcoming > change to passt-repair enabling it to start before passt is started, > on both source and target, by means of an inotify watch. >=20 > Once the inotify watch triggers, passt-repair will connect right away, > but we have no guarantees that the connection completes before we > start the migration process, so wait for it (for a reasonable amount > of time). >=20 > Signed-off-by: Stefano Brivio I still think it's ugly, of course, but I don't see a better way, so: Reviewed-by: David Gibson > --- > v2: >=20 > - Use 10 ms as timeout instead of 100 ms. Given that I'm unable to > migrate a simple guest with 256 MiB of memory and no storage other > than an initramfs in less than 4 milliseconds, at least on my test > system (rather fast CPU threads and memory interface), I think that > 10 ms shouldn't make a big difference in case passt-repair is not > available for whatever reason So, IIUC, that 4ms is the *total* migration time. The concern here is not that we add to the total migration time, but that we add to the migration downtime, that is, the time the guest is not running anywhere. The downtime can be much smaller than the total migration time. Furthermore qemu has no way to account for this delay in its estimate of what the downtime will be - the time for transferring device state is pretty much assumed to be neglible in comparison to transferring guest memory contents. So, if qemu stops the guest at the point that the remaining memory transfer will just fit in the downtime limit, any delays we add will likely cause the downtime limit to be missed by that much. Now, as it happens, the default downtime limit is 300ms, so an additional 10ms is probably fine (though 100ms really wasn't). Nonetheless the reasoning above isn't valid. > - Move the static assert next to the initialisation of 'tv' >=20 > flow.c | 20 ++++++++++++++++++++ > repair.c | 31 +++++++++++++++++++++++++++++++ > repair.h | 1 + > 3 files changed, 52 insertions(+) >=20 > diff --git a/flow.c b/flow.c > index 749c498..5e64b79 100644 > --- a/flow.c > +++ b/flow.c > @@ -911,6 +911,21 @@ static int flow_migrate_source_rollback(struct ctx *= c, unsigned bound, int ret) > return ret; > } > =20 > +/** > + * flow_migrate_need_repair() - Do we need to set repair mode for any fl= ow? > + * > + * Return: true if repair mode is needed, false otherwise > + */ > +static bool flow_migrate_need_repair(void) > +{ > + union flow *flow; > + > + foreach_established_tcp_flow(flow) > + return true; > + > + return false; > +} > + > /** > * flow_migrate_repair_all() - Turn repair mode on or off for all flows > * @c: Execution context > @@ -966,6 +981,9 @@ int flow_migrate_source_pre(struct ctx *c, const stru= ct migrate_stage *stage, > (void)stage; > (void)fd; > =20 > + if (flow_migrate_need_repair()) > + repair_wait(c); > + > if ((rc =3D flow_migrate_repair_all(c, true))) > return -rc; > =20 > @@ -1083,6 +1101,8 @@ int flow_migrate_target(struct ctx *c, const struct= migrate_stage *stage, > if (!count) > return 0; > =20 > + repair_wait(c); > + > if ((rc =3D flow_migrate_repair_all(c, true))) > return -rc; > =20 > diff --git a/repair.c b/repair.c > index 3ee089f..ebeb248 100644 > --- a/repair.c > +++ b/repair.c > @@ -27,6 +27,10 @@ > =20 > #define SCM_MAX_FD 253 /* From Linux kernel (include/net/scm.h), not in = UAPI */ > =20 > +/* Wait for a while for TCP_REPAIR helper to connect if it's not there y= et */ > +#define REPAIR_ACCEPT_TIMEOUT_MS 10 > +#define REPAIR_ACCEPT_TIMEOUT_US (REPAIR_ACCEPT_TIMEOUT_MS * 1000) > + > /* Pending file descriptors for next repair_flush() call, or command cha= nge */ > static int repair_fds[SCM_MAX_FD]; > =20 > @@ -138,6 +142,33 @@ void repair_handler(struct ctx *c, uint32_t events) > repair_close(c); > } > =20 > +/** > + * repair_wait() - Wait (with timeout) for TCP_REPAIR helper to connect > + * @c: Execution context > + */ > +void repair_wait(struct ctx *c) > +{ > + struct timeval tv =3D { .tv_sec =3D 0, > + .tv_usec =3D (long)(REPAIR_ACCEPT_TIMEOUT_US) }; > + static_assert(REPAIR_ACCEPT_TIMEOUT_US < 1000 * 1000); > + > + if (c->fd_repair >=3D 0 || c->fd_repair_listen =3D=3D -1) > + return; > + > + if (setsockopt(c->fd_repair_listen, SOL_SOCKET, SO_RCVTIMEO, > + &tv, sizeof(tv))) { > + err_perror("Set timeout on TCP_REPAIR listening socket"); > + return; > + } > + > + repair_listen_handler(c, EPOLLIN); > + > + tv.tv_usec =3D 0; > + if (setsockopt(c->fd_repair_listen, SOL_SOCKET, SO_RCVTIMEO, > + &tv, sizeof(tv))) > + err_perror("Clear timeout on TCP_REPAIR listening socket"); > +} > + > /** > * repair_flush() - Flush current set of sockets to helper, with current= command > * @c: Execution context > diff --git a/repair.h b/repair.h > index de279d6..1d37922 100644 > --- a/repair.h > +++ b/repair.h > @@ -10,6 +10,7 @@ void repair_sock_init(const struct ctx *c); > void repair_listen_handler(struct ctx *c, uint32_t events); > void repair_handler(struct ctx *c, uint32_t events); > void repair_close(struct ctx *c); > +void repair_wait(struct ctx *c); > int repair_flush(struct ctx *c); > int repair_set(struct ctx *c, int s, int cmd); > =20 --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --Uyyx27q6gSwqWwbR Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmfPjkUACgkQzQJF27ox 2Geu0RAAq4smjpNJ/7IxKrqq2uah0sCir4OqO+qeV5X6zVkdMANGR/KMNYKNtw5n xesZWkKPh6nh3mDi2K/kZnRqJZ7OhQvMb44SJVK9IvZVIfuJ72kKez9WxN9cDVbW xlZsgt3f4zIpf5UwQDhEM23haeqV39y2q54iTkynfZNknZJMw0UU3x8VP9X8Q1t3 96Xy3MxIiMx8s4BWF8RGKl5ky1HcE3u5Mk4oABsutjeEQ+LdrnlIvWKBF5awNEGF 2hnJiPtewbOoBJQCggLxw+lJQffv606Ktg/aihnAkVJb4nM2M79auS1xE0orhzY2 IKf1BaBLZ+2RtGBeg7eQkX59h6Uxa44ZblSRQDkgzEsdDWdq4CIH8tpfGX5Y+55R OUQ0WKD9r7QYKY42ZtpcdNJpBfjW1wbSdTHGhe9EqHZeFT5r+Je+fBqwC9OsP9Sr l2NT4z9RaKsu4AN5fDchgY7t3tlqrwUWdidRGxHR9LNygsJZgttD4Zkacl9t6A8s pXvmizXA3zKuHTzNatKnOjV5fGgWnZKbWwz979iAyPZDxYK20kxei23tSVap2Vpy Gx4kiCxGW/WAiMnHXAq8wtuSwvLkQ3FQloTX65Sb0t5UTIKt6wA25UqHMh37wlZa LX/X0nEaIwPlLU51uOct6zyNjZyywqj3zAdDSs647Rt9SmTQLEE= =ih03 -----END PGP SIGNATURE----- --Uyyx27q6gSwqWwbR--