From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 22 May 2026 16:36:52 +1000
From: David Gibson
To: Stefano Brivio
CC: Laurent Vivier, passt-dev@passt.top, Jon Maloy
Subject: Re: [PATCH v4 00/10] vhost-user: Preparatory series for multiple iovec entries per virtqueue element
References: <20260520173445.0658dfef@elisabeth>
 <20260520180708.275ec4de@elisabeth>
 <20260520181852.1f0119ff@elisabeth>
 <20260520225340.54490a21@elisabeth>
 <50d79312-0493-4af0-b0bc-7c590885cbd2@redhat.com>
 <20260522062239.4fcd3314@elisabeth>
 <20260522074455.15e6cc3e@elisabeth>
 <20260522082349.3141a1f9@elisabeth>
In-Reply-To: <20260522082349.3141a1f9@elisabeth>
List-Id: Development discussion and patches for passt
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="xEvdVKLtjSFMaVGh"
Content-Disposition: inline

--xEvdVKLtjSFMaVGh
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline

On Fri, May 22, 2026 at 08:23:50AM +0200, Stefano Brivio wrote:
> On Fri, 22 May 2026 16:15:08 +1000
> David Gibson wrote:
>
> > On Fri, May 22, 2026 at 07:44:56AM +0200, Stefano Brivio wrote:
> > > On Fri, 22 May 2026 06:22:39 +0200
> > > Stefano Brivio wrote:
> > >
> > > > On Fri, 22 May 2026 01:13:33 +0200
> > > > Laurent Vivier wrote:
> > > >
> > > > > On 5/21/26 10:30, Laurent Vivier wrote:
> > > > > > On 5/20/26 22:53, Stefano Brivio wrote:
> > > > > >> On Wed, 20 May 2026 18:18:52 +0200
> > > > > >> Stefano Brivio wrote:
> > > > > >>
> > > > > >>> On Wed, 20 May 2026 18:07:08 +0200
> > > > > >>> Stefano Brivio wrote:
> > > > > >>>
> > > > > >>>> On Wed, 20 May 2026 17:34:45 +0200
> > > > > >>>> Stefano Brivio wrote:
> > > > > >>>>> On Wed, 13 May 2026 13:52:08 +0200
> > > > > >>>>> Laurent Vivier wrote:
> > > > > >>>>>> Currently, the vhost-user path assumes each virtqueue element contains
> > > > > >>>>>> exactly one iovec entry covering the entire frame. This assumption
> > > > > >>>>>> breaks as some virtio-net drivers (notably iPXE) provide descriptors where the
> > > > > >>>>>> vnet header and the frame payload are in separate buffers, resulting in
> > > > > >>>>>> two iovec entries per virtqueue element.
> > > > > >>>>>>
> > > > > >>>>>> This series refactors the vhost-user data path so that frame lengths,
> > > > > >>>>>> header sizes, and padding are tracked and passed explicitly rather than
> > > > > >>>>>> being derived from iovec sizes. This decoupling is a prerequisite for
> > > > > >>>>>> correctly handling padding of multi-buffer frames.
> > > > > >>>>>
> > > > > >>>>> Sorry to bring (likely) bad news, but this series seems to introduce a
> > > > > >>>>> regression: I saw the migration/rampstream_in tests fail twice in a
> > > > > >>>>> row, which I've never seen happen (I think I saw a single failure a
> > > > > >>>>> long time ago when the machine had a high CPU load, but nothing else).
> > > > > >>>>>
> > > > > >>>>> I'm currently bisecting and the bisect seems to point towards the end
> > > > > >>>>> of the series (probably 10/10), but I haven't finished yet. I'll keep
> > > > > >>>>> you posted. I haven't spotted anything that might cause issues there.
> > > > > >>>>
> > > > > >>>> Yeah, that's the one :(
> > > > > >>>>
> > > > > >>>> $ git bisect bad
> > > > > >>>> db798fc60f4c5869cb53168354e068fb4dabd91a is the first bad commit
> > > > > >>>> commit db798fc60f4c5869cb53168354e068fb4dabd91a
> > > > > >>>> Author: Laurent Vivier
> > > > > >>>> Date:   Wed May 13 13:52:18 2026 +0200
> > > > > >>>>
> > > > > >>>>     vhost-user: Centralise Ethernet frame padding in vu_collect() and vu_pad()
> > > > > >
> > > > > > I checked on my system with the commit preceding this series,
> > > > > > bcc3d37a6e01 ("util: Fix changes to assert_with_msg()") and rampstream_in fails too (not
> > > > > > every time).
> > > > > >
> > > > > > > TCP/IPv4: sequence check, ramps, inbound
> > > > > > ...failed.
> > > > > >
> > > > > > and rampstream_out sometimes hangs too.
> > > > > >
> > > > > > I'm going to try with earlier commits.
> > > > >
> > > > > For me the problem can happen with any commit...
> > > > >
> > > > > As it depends on the execution path and on the load and speed of the system it looks like
> > > > > a race condition.
> > > >
> > > > Hah, thanks for checking. Maybe...
> > > >
> > > > > Did you try to test on a host with a kernel patched with
> > > > > "[PATCH net v2 0/2] Fix race condition between TCP_REPAIR dump and data receive" ?
> > > >
> > > > Now I tried, and yes, the test doesn't hang anymore! I seem to have an
> > > > issue with teardown functions on recent kernels (current net.git HEAD
> > > > more or less):
> > > >
> > > > ---
> > > > + teardown_migrate
> > > > + cat /tmp/passt-tests-VVtLn0/migrate/qemu_1.pid
> > > > + /home/sbrivio/passt/test/nstool exec /tmp/passt-tests-VVtLn0/migrate/ns1.hold -- kill 16
> > > > qemu-system-x86_64: terminating on signal 15 from pid 34 ()
> > > > + cat /tmp/passt-tests-VVtLn0/migrate/qemu_2.pid
> > > > + /home/sbrivio/passt/test/nstool exec /tmp/passt-tests-VVtLn0/migrate/ns1.hold -- kill 15
> > > > 18.8974: ================ Vhost user message ================
> > > > 18.8974: Request: VHOST_USER_GET_VRING_BASE (11)
> > > > 18.8974: Flags: 0x1
> > > > 18.8974: Size: 8
> > > > 18.8974: State.index: 0
> > > > 18.8975: ================ Vhost user message ================
> > > > 18.8975: Request: VHOST_USER_GET_VRING_BASE (11)
> > > > 18.8975: Flags: 0x1
> > > > 18.8975: Size: 8
> > > > 18.8975: State.index: 1
> > > > qemu-system-x86_64: terminating on signal 15 from pid 35 ()
> > > > 18.7961: Client connection closed
> > > > 18.7962: Closing TCP_REPAIR helper socket
> > > > + context_wait qemu_1
> > > > + __name=qemu_1
> > > > + __pidfile=/tmp/passt-tests-VVtLn0/migrate/context_qemu_1.pid
> > > > + cat /tmp/passt-tests-VVtLn0/migrate/context_qemu_1.pid
> > > > + rc=0
> > > > + rm /tmp/passt-tests-VVtLn0/migrate/context_passt_repair_2.stdout.9pwpVbQr /tmp/passt-tests-VVtLn0/migrate/context_passt_repair_2.stderr.dSY5hBu1
> > > > + __pid=67766
> > > > + rm /tmp/passt-tests-VVtLn0/migrate/context_qemu_1.pid
> > > > + [ 1 -eq 1 ]
> > > > + echo [Exit code: 0]
> > > > + echo -n passt_repair_2$
> > > > + return 0
> > > > 18.9016: Client connection closed
> > > > 18.9018: Closing TCP_REPAIR helper socket
> > > > + wait 67766
> > > > + rc=0
> > > > + rm /tmp/passt-tests-VVtLn0/migrate/context_passt_repair_1.stdout.JEyDGxXe /tmp/passt-tests-VVtLn0/migrate/context_passt_repair_1.stderr.WU550iEI
> > > > + [ 1 -eq 1 ]
> > > > + echo [Exit code: 0]
> > > > + echo -n passt_repair_1$
> > > > + return 0
> > > > + rc=0
> > > > + rm /tmp/passt-tests-VVtLn0/migrate/context_qemu_2.stdout.Dm8EAhfl /tmp/passt-tests-VVtLn0/migrate/context_qemu_2.stderr.207qJYPA
> > > > + [ 1 -eq 1 ]
> > > > + echo [Exit code: 0]
> > > > + echo -n qemu_2$
> > > > + return 0
> > > > 2026/05/22 04:08:23 socat[73089] E connect(5, AF=40 cid:94558 port:22, 16): Connection timed out
> > > > Connection closed by UNKNOWN port 65535
> > > > ...
> > > > ---
> > > >
> > > > It looks like we stop QEMU a bit too early. But it should be unrelated.
> > > >
> > > > I'm now trying to find some kind of workaround for existing (not fixed)
> > > > kernel versions. Maybe stopping rampstream_in for a moment or something
> > > > like that.
> > >
> > > For some weird reason even very blatant throttling (100 ms - 1 s delays
> > > every 10000 ramps, or an explicit 500 ms pause via signal before
> > > migration) doesn't help.
> > >
> > > So it doesn't seem to be *that* kind of race. I should probably check
> > > the exact same kernel version with and without the fix...
> >
> > If it's due to the kernel not stopping the queues on REPAIR, then the
> > only real way to fix the test is to cut off the source machine's
> > network before we trigger migration.
>
> Well, that's a rather complicated way to do it. One could simply stop
> the traffic instead.

I don't know that "simply" is quite so simple. You can suspend the
source of the data, but then you need to wait a hard-to-determine
amount of time for that to reach the guest, and for all the acks to
come back. For rampstream_out it's worse: the source is in the guest,
which isn't supposed to know about the migration in advance, so you
can't really stop it without stopping the guest's whole network.

> But it doesn't help, so there's probably another
> issue.
>
> > That could be done with
> > netfilter (in a user+netns). But probably more natural would be to
> > not do the migration between local passt instances, but actually
> > between two host namespaces, with separate netifs for external
> > connectivity and for the migration. Remove the external netif on the
> > source, then trigger migration, then add the external netif on the
> > destination.
> >
> > It's quite a bit of hassle :(. But it does model something much
> > closer to a real migration scenario. As a bonus it would mean we'd no
> > longer rely on the hack of guessing when to exit the source passt in
> > order to allow the destination passt to bind.
>
> I struggle to see how that would be worth the investment, especially if
> we're working around a kernel issue that should eventually be fixed.
>
> Or, at least, right now, I'm just trying to get tests to pass while
> keeping Laurent's changes in the tree...
>
> --
> Stefano
>

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson

--xEvdVKLtjSFMaVGh
Content-Type: application/pgp-signature; name=signature.asc

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmoP+YQACgkQzQJF27ox
2Gd5Xg//aMWUUE+/Az9NbHPI7FfuRETTu8I7JmLd5NsgkmsUkUOh/XA0pyB4/gS4
9ZHlAJJ7Te7/Hz8B9s9qrZRB1Dn8CZpDEnNQPKDDH+f3HLqOAu4JXtisJGE49pI7
ybdNS3KKQRI2RUWeOzHZ08Tgt9rHRiiu8UkP/FbVarS4VXI39OTX6mxGvn8Hsfbg
A6NV35pVdBfoV2EDhrbLuf6WC9I6GnGi6mhMImx/AdcLltLXW7J+HivLuwuJGD0V
2yFQcTAs8kcB1uGg9tUwE8RH86Ji4OjPMtnvzON+L6Af0nY/BedMfxOl6NFJnZWU
IegffYnMG86hrvIkP+559hW1JeRZzhZfTOqpUMqn452BNreNhXmtEwKalYGLh8L3
P8s3TuG2DPParLq6iPdvn3AN/YVAHnCcfVWkqOq361/RwXJNK+gjek83a/XVKhXk
gp6Xzit6bQQqNDVU9BACZxAOouqwAuDmu0j2V1c57hY6KsZX5rLGJoFbh/SLuRqx
YG6Td3CZvxTNLebggXWvnSeFYtDlgd59rEY1UlNXGcOIL+lJywDBS/qVMBEJjj/b
xUi911E2Ee4XK2phLrgzYtL1LKqv3yLBuav0jHzHdWhKEiM9yaCjTRXdLO2w4pmV
MAt5Nub5lBC3AFj9+69RXdPayO3rX8zdWYSN716DQ7VqzSdOCEU=
=BtnD
-----END PGP SIGNATURE-----

--xEvdVKLtjSFMaVGh--