From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202506 header.b=zBJke5IA; dkim-atps=neutral Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 928C55A0278 for ; Mon, 28 Jul 2025 07:52:00 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202506; t=1753681715; bh=fH9ffs2DnlGRl1dmbvUwY6JysIyvvZLVSp6x4a8DCZc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=zBJke5IA+XgVD8FYRjql4CfP+UGNt4UB8VEyOcGTxNmZHlnmKtz6V/X1R5eV0RH3Y cio/rO7RB24lpE6DIREU3FrZ6EwnMItocCsRAvAGym4a+A9b2pCTXUjfEjMvmxCxfJ KgldWt7P08k+aI6Ld9Fzlflx2tGYAfM5+jKdIjWNUo6CQlMQgAcgGETs4fbcDJvOgv GVJyIHQQMvybQw51CTEmcsxAl9PuDHrwvY0ooaa0/AlANwhx/OtE1Qgro1UeIpHUVi fEaD/dpm8ldLaoWOMaAlOkOvC+QfLOMhq8lCdXbAUyaOmt6lru38b2Rz6jW5vSqTGP GYgap4cKxqH9w== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4br6wM5JMkz4wbx; Mon, 28 Jul 2025 15:48:35 +1000 (AEST) Date: Mon, 28 Jul 2025 15:51:53 +1000 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH v3] treewide: By default, don't quit source after migration, keep sockets open Message-ID: References: <20250724172858.1189615-1-sbrivio@redhat.com> <20250725071058.0842f7a2@elisabeth> <20250725102112.55910998@elisabeth> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="x4r4zJeVTD2kh5ao" Content-Disposition: inline In-Reply-To: <20250725102112.55910998@elisabeth> Message-ID-Hash: FWH3IRTKXRNAFOKJLST3I4ZBQZKHSV42 X-Message-ID-Hash: FWH3IRTKXRNAFOKJLST3I4ZBQZKHSV42 X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; 
	administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: passt-dev@passt.top, Nir Dothan
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches for passt
Archived-At:
Archived-At:
List-Archive:
List-Archive:
List-Help:
List-Owner:
List-Post:
List-Subscribe:
List-Unsubscribe:

--x4r4zJeVTD2kh5ao
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Jul 25, 2025 at 10:21:12AM +0200, Stefano Brivio wrote:
> On Fri, 25 Jul 2025 16:50:23 +1000
> David Gibson wrote:
> > On Fri, Jul 25, 2025 at 07:10:58AM +0200, Stefano Brivio wrote:
> > > On Fri, 25 Jul 2025 14:04:17 +1000
> > > David Gibson wrote:
[snip]
> > > > > v3:
> > > > > - Nir reported occasional failures (connections being reset)
> > > > >   with both v1 and v2, because, in KubeVirt's usage, we quit as
> > > > >   QEMU exits. Disable --one-off after migration as source, and
> > > > >   document this exception
> > > >
> > > > This seems like an awful, awful hack.
> > >
> > > Well, of course, it is, and long term it should be fixed in
> > > either KubeVirt or libvirt (even though I'm not sure how, see below)
> > > instead.
> >
> > But this hack means that even when it's fixed we'll still have this
> > wildly counterintuitive behaviour that every future user will have to
> > work around.
>
> No, why? We can change that as well. We changed semantics of options
> in the past and I don't see an issue doing that as long as we
> coordinate things to a reasonable extent (like we do with Podman and
> rootlesskit and with distributions and LSMs...).

Ah, ok. That kind of changes everything. I thought this indicated
committing to these semantics indefinitely. I think the co-ordination
you're suggesting may be messier than you think, but if we're
explicitly willing to (technically) break compatibility to remove this
again, then, sure, ok.
It's still gross, but fine.

[snip]
> > > We found out that the guest is generally suspended during that while,
> > > but sometimes it might even have already exited. The pod remains,
> > > though, as long as it's needed. That's the only certainty we have.
> >
> > Keeping the pod around is fine. What needs to change is that the
> > guest's IP(s) needs to be removed from the source host before qemu
> > (and therefore passt) is terminated. The pod must have at least one
> > other IP, or it would be impossible to perform the migration in the
> > first place.
>
> Maybe, yes. I'm not sure if it's doable.

KubeVirt and/or libvirt really need to figure out how to make it
doable, because having two simultaneously active things with the same
IP on the same network is bound to cause trouble. At some point it's
likely to cause trouble that we can't hack our way around in passt.

> > This essentially matches the situation for bridged networking: with
> > the source guest suspended the source host will no longer respond to
> > the guest IP
> >
> > > So, do we want to drop --one-off from the libvirt integration, and have
> > > libvirt manage passt's lifecycle entirely (note that all users outside
> > > KubeVirt don't use migration, so we would make the general case vastly
> > > more complicated for the sake of correctness for a single usage...)?
> >
> > Hmm.. if I understand correctly the network swizzling is handled by
> > KubeVirt, not libvirt.
>
> That's OVN-Kubernetes in KubeVirt's case.

To clarify, I'm not talking so much about what actually makes the
network arrangements, but what component initiates the changeover.

>
> > I'm hoping that means there's a suitable point
> > at which it can remove the IP without having to alter libvirt.
>
> I hope so too, eventually. Or we could make sure that QEMU is alive as
> long as needed, this is probably easier to ensure from virt-launcher.

Right.
There's kind of a basic mismatch here - libvirt aims to manage
the whole migration, end-to-end, so you just tell it to migrate and it
does. But in the k8s context, something else needs to do network
jiggery-pokery co-ordinated with that, so it kind of breaks the model.

passt adds a further complication in that the host owns the IP, to
share it with the guest. That means suspending the guest is no longer
sufficient to prevent the source from responding on that IP.

So, maybe that means libvirt logically ought to be handling the IP
assignment/de-assignment when migrating with passt. But... no doubt
that would conflict with OVN-Kubernetes wanting to own the network
configuration.

Yet again, libvirt's wish to manage everything (in the manner of a Xen
machine circa 2010) does not play well with the kind of circumstances
people try to use it in.

> I haven't looked at the details yet, but in passt it's one line and we
> can drop it later as needed, in KubeVirt it's probably much more
> complicated than that.
>
> > > Well, we can try to do that. Except that libvirt doesn't know either
> > > for how long this traffic will reach the source pod (that's a KubeVirt
> > > concept). So it should implement the same hack: let it outlive QEMU on
> > > migration... as long as we have that issue in KubeVirt.
> > >
> > > But I asked KubeVirt people, and it turns out that it's extremely
> > > complicated to fix this in KubeVirt. So, actually, I don't see another
> > > way to fix this in the short term. And without KubeVirt using this we
> > > could also drop the whole feature...
>

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson --x4r4zJeVTD2kh5ao Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmiHD+EACgkQzQJF27ox 2Ge9Bg/9F5H3wLGoMxA/1IhwhVHLS4R9YPn6k9Tst/3H27jCNoE7EFgPJHPfO1Vp Aud7GI/KdBwbUFwJvjDrckQeXJL0GpIgwVSeT+JQFc+Jzy4dEKxajI6n/Gr2E5YN MtqIHsiqLQ3tTSib8VjvZ8Mt/3711FVEkh54rr/VWMTx9xbIJRK9KbU+Uv+bbmSq wgmMkeWmwWAuno9ZM6JChthxQ5p9yDa7g89F8GFYWwzTl3yO6p5pO+fTsKJSoUqb T3fyooxZI6hQ5tTky/YTWs3xEaHVjhMX5IJD/IAwqu0VXequ3Q48u6klprBTWZT5 Lq0VmLrA0dznh/RKNOfA9tZBKW9yspxJ61kuId4sLXb75NpMTKr4XPEXAkh+wFO8 LkVpgHe/i636KWvrGr2gj87OGjD8a6vMyDxZt1ErDOXu7d11waWQZ2YQtEyvnvW4 dwAwSIL8jsk2lmhAqgmRdfpypnXCV45cnPurN2FzofECB48vT4iAZ2bfhCZ2Jg7P s5Jeaiq989rZuvEGgEbiKI0c8JBofz2M86FgEwe5PyDKodV7u0Ycz0YsasciR1FL BcGdJtTBf5KJa0yHaml9buK9Bs0RLseri5oY4iX+ioRTCwRxQ16RldmwbS9vFF/Z pE9nJRBdIJ5XSvbTDgx3u9lfUBlwMQ8Tnqs6JIT418Fjhx6whu0= =YJwD -----END PGP SIGNATURE----- --x4r4zJeVTD2kh5ao--