public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David GIbson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>,
	passt-dev@passt.top, Jon Maloy <jmaloy@redhat.com>
Subject: Re: [PATCH v4 00/10] vhost-user: Preparatory series for multiple iovec entries per virtqueue element
Date: Fri, 22 May 2026 16:36:52 +1000	[thread overview]
Message-ID: <ag_5hDFdOrZ60DTu@zatzit> (raw)
In-Reply-To: <20260522082349.3141a1f9@elisabeth>

[-- Attachment #1: Type: text/plain, Size: 9081 bytes --]

On Fri, May 22, 2026 at 08:23:50AM +0200, Stefano Brivio wrote:
> On Fri, 22 May 2026 16:15:08 +1000
> David GIbson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, May 22, 2026 at 07:44:56AM +0200, Stefano Brivio wrote:
> > > On Fri, 22 May 2026 06:22:39 +0200
> > > Stefano Brivio <sbrivio@redhat.com> wrote:
> > >   
> > > > On Fri, 22 May 2026 01:13:33 +0200
> > > > Laurent Vivier <lvivier@redhat.com> wrote:
> > > >   
> > > > > On 5/21/26 10:30, Laurent Vivier wrote:    
> > > > > > On 5/20/26 22:53, Stefano Brivio wrote:      
> > > > > >> On Wed, 20 May 2026 18:18:52 +0200
> > > > > >> Stefano Brivio <sbrivio@redhat.com> wrote:
> > > > > >>      
> > > > > >>> On Wed, 20 May 2026 18:07:08 +0200
> > > > > >>> Stefano Brivio <sbrivio@redhat.com> wrote:
> > > > > >>>      
> > > > > >>>> On Wed, 20 May 2026 17:34:45 +0200
> > > > > >>>> Stefano Brivio <sbrivio@redhat.com> wrote:      
> > > > > >>>>> On Wed, 13 May 2026 13:52:08 +0200
> > > > > >>>>> Laurent Vivier <lvivier@redhat.com> wrote:      
> > > > > >>>>>> Currently, the vhost-user path assumes each virtqueue element contains
> > > > > >>>>>> exactly one iovec entry covering the entire frame.  This assumption
> > > > > >>>>>> breaks as some virtio-net drivers (notably iPXE) provide descriptors where the
> > > > > >>>>>> vnet header and the frame payload are in separate buffers, resulting in
> > > > > >>>>>> two iovec entries per virtqueue element.
> > > > > >>>>>>
> > > > > >>>>>> This series refactors the vhost-user data path so that frame lengths,
> > > > > >>>>>> header sizes, and padding are tracked and passed explicitly rather than
> > > > > >>>>>> being derived from iovec sizes.  This decoupling is a prerequisite for
> > > > > >>>>>> correctly handling padding of multi-buffer frames.      
> > > > > >>>>>
> > > > > >>>>> Sorry to bring (likely) bad news, but this series seems to introduce a
> > > > > >>>>> regression: I got the migration/rampstream_in tests fail twice in a
> > > > > >>>>> row, which I've never saw happening (I think I saw a single failure a
> > > > > >>>>> long time ago when the machine had a high CPU load, but nothing else).
> > > > > >>>>>
> > > > > >>>>> I'm currently bisecting and the bisect seems to point towards the end
> > > > > >>>>> of the series (probably 10/10), but I haven't finished yet. I'll keep
> > > > > >>>>> you posted. I haven't spotted anything that might cause issues there.      
> > > > > >>>>
> > > > > >>>> Yeah, that's the one :(
> > > > > >>>>
> > > > > >>>> $ git bisect bad
> > > > > >>>> db798fc60f4c5869cb53168354e068fb4dabd91a is the first bad commit
> > > > > >>>> commit db798fc60f4c5869cb53168354e068fb4dabd91a
> > > > > >>>> Author: Laurent Vivier <lvivier@redhat.com>
> > > > > >>>> Date:   Wed May 13 13:52:18 2026 +0200
> > > > > >>>>
> > > > > >>>>      vhost-user: Centralise Ethernet frame padding in vu_collect() and vu_pad()      
> > > > > > 
> > > > > > I checked on my system with the commit previous to this series,
> > > > > > bcc3d37a6e01 ("util: Fix changes to assert_with_msg()") and rampstream_in fails too (not 
> > > > > > everytime).
> > > > > >       
> > > > > >  > TCP/IPv4: sequence check, ramps, inbound      
> > > > > > ...failed.
> > > > > > 
> > > > > > and rampstream_out hangs sometime too.
> > > > > > 
> > > > > > I'm going to try with ealier commits.      
> > > > > 
> > > > > For me the problem can happen with any commit...
> > > > > 
> > > > > As it depends on the execution path and on the load and speed of the system it looks like 
> > > > > a race condition.    
> > > > 
> > > > Hah, thanks for checking. Maybe...
> > > >   
> > > > > Did you try to test on a host with a kernel patched with
> > > > > "[PATCH net v2 0/2] Fix race condition between TCP_REPAIR dump and data receive" ?    
> > > > 
> > > > Now I tried, and yes, the test doesn't hang anymore! I seem to have an
> > > > issue with teardown functions on recent kernels (current net.git HEAD
> > > > more or less):
> > > > 
> > > > ---
> > > > + teardown_migrate
> > > > + cat /tmp/passt-tests-VVtLn0/migrate/qemu_1.pid
> > > > + /home/sbrivio/passt/test/nstool exec /tmp/passt-tests-VVtLn0/migrate/ns1.hold -- kill 16
> > > > qemu-system-x86_64: terminating on signal 15 from pid 34 ()
> > > > + cat /tmp/passt-tests-VVtLn0/migrate/qemu_2.pid
> > > > + /home/sbrivio/passt/test/nstool exec /tmp/passt-tests-VVtLn0/migrate/ns1.hold -- kill 15
> > > > 18.8974: ================ Vhost user message ================
> > > > 18.8974: Request: VHOST_USER_GET_VRING_BASE (11)
> > > > 18.8974: Flags:   0x1
> > > > 18.8974: Size:    8
> > > > 18.8974: State.index: 0
> > > > 18.8975: ================ Vhost user message ================
> > > > 18.8975: Request: VHOST_USER_GET_VRING_BASE (11)
> > > > 18.8975: Flags:   0x1
> > > > 18.8975: Size:    8
> > > > 18.8975: State.index: 1
> > > > qemu-system-x86_64: terminating on signal 15 from pid 35 ()
> > > > 18.7961: Client connection closed
> > > > 18.7962: Closing TCP_REPAIR helper socket
> > > > + context_wait qemu_1
> > > > + __name=qemu_1
> > > > + __pidfile=/tmp/passt-tests-VVtLn0/migrate/context_qemu_1.pid
> > > > + cat /tmp/passt-tests-VVtLn0/migrate/context_qemu_1.pid
> > > > + rc=0
> > > > + rm /tmp/passt-tests-VVtLn0/migrate/context_passt_repair_2.stdout.9pwpVbQr /tmp/passt-tests-VVtLn0/migrate/context_passt_repair_2.stderr.dSY5hBu1
> > > > + __pid=67766
> > > > + rm /tmp/passt-tests-VVtLn0/migrate/context_qemu_1.pid
> > > > + [ 1 -eq 1 ]
> > > > + echo [Exit code: 0]
> > > > + echo -n passt_repair_2$ 
> > > > + return 0
> > > > 18.9016: Client connection closed
> > > > 18.9018: Closing TCP_REPAIR helper socket
> > > > + wait 67766
> > > > + rc=0
> > > > + rm /tmp/passt-tests-VVtLn0/migrate/context_passt_repair_1.stdout.JEyDGxXe /tmp/passt-tests-VVtLn0/migrate/context_passt_repair_1.stderr.WU550iEI
> > > > + [ 1 -eq 1 ]
> > > > + echo [Exit code: 0]
> > > > + echo -n passt_repair_1$ 
> > > > + return 0
> > > > + rc=0
> > > > + rm /tmp/passt-tests-VVtLn0/migrate/context_qemu_2.stdout.Dm8EAhfl /tmp/passt-tests-VVtLn0/migrate/context_qemu_2.stderr.207qJYPA
> > > > + [ 1 -eq 1 ]
> > > > + echo [Exit code: 0]
> > > > + echo -n qemu_2$ 
> > > > + return 0
> > > > 2026/05/22 04:08:23 socat[73089] E connect(5, AF=40 cid:94558 port:22, 16): Connection timed out
> > > > Connection closed by UNKNOWN port 65535
> > > > ...
> > > > ---
> > > > 
> > > > it looks like we stop QEMU a bit too early. But it should be unrelated.
> > > > 
> > > > I'm now trying to find some kind of workaround for existing (not fixed)
> > > > kernel versions. Maybe stopping rampstream_in for a moment or something
> > > > like that.  
> > > 
> > > For some weird reason even very blatant throttling (100 ms - 1 s delays
> > > every 10000 ramps, or an explicit 500 ms pause via signal before
> > > migration) doesn't help.
> > > 
> > > So it doesn't seem to be *that* kind of race. I should probably check
> > > the same exact kernel version with fix and without...  
> > 
> > If it's due to the kernel not stopping the queues on REPAIR, then the
> > only real way to fix the test is to cut off the source machine's
> > network before we trigger migration.
> 
> Well, that's a rather complicated way to do it. One could simply stop
> the traffic instead.

I don't know that "simply" is quite so simple.  You can suspend the
source of the data, but you need to wait a difficult to ascertain
amount of time for that to make it to the guest, and all the acks to
come back.

For rampstream_out it's worse: the source is in the guest which isn't
supposed to know about the migration in advance, so you can't really
stop it without stopping the guest's whole network.

> But it doesn't help, so there's probably another
> issue.
> 
> > That could be done with
> > netfilter (in a user+netns).  But probably more natural would be to
> > not do the migration between local passt instances, but actually
> > between two host namespaces, with separate netifs for external
> > connectivity and for the migration.  Remove the external netif on the
> > source, then trigger migration, then add the external netif on the
> > destination.
> > 
> > It's quite a bit of hassle :(.  But it does model something much
> > closer to a real migration scenario.  As a bonus it would mean we'd no
> > longer rely on the hack of guessing when to exit the source passt in
> > order to allow the destination passt to bind.
> 
> I struggle to see how that would be worth the investment, especially if
> we're working around a kernel issue that should eventually be fixed.
> 
> Or, at least, right now, I'm just trying to get tests to pass while
> keeping Laurent changes in the tree..
> 
> -- 
> Stefano
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2026-05-22  6:37 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-13 11:52 Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 01/10] iov: Introduce iov_memset() Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 02/10] iov: Add iov_memcpy() to copy data between iovec arrays Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 03/10] vu_common: Move vnethdr setup into vu_flush() Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 04/10] udp_vu: Move virtqueue management from udp_vu_sock_recv() to its caller Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 05/10] udp_vu: Pass iov explicitly to helpers instead of using file-scoped array Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 06/10] checksum: Pass explicit L4 length to checksum functions Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 07/10] pcap: Pass explicit L2 length to pcap_iov() Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 08/10] vu_common: Pass explicit frame length to vu_flush() Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 09/10] tcp: Pass explicit data length to tcp_fill_headers() Laurent Vivier
2026-05-13 11:52 ` [PATCH v4 10/10] vhost-user: Centralise Ethernet frame padding in vu_collect() and vu_pad() Laurent Vivier
2026-05-14  1:24   ` David Gibson
2026-05-20  0:52 ` [PATCH v4 00/10] vhost-user: Preparatory series for multiple iovec entries per virtqueue element Stefano Brivio
2026-05-20 15:34 ` Stefano Brivio
2026-05-20 16:07   ` Stefano Brivio
2026-05-20 16:18     ` Stefano Brivio
2026-05-20 20:53       ` Stefano Brivio
2026-05-21  8:30         ` Laurent Vivier
2026-05-21 23:13           ` Laurent Vivier
2026-05-22  4:22             ` Stefano Brivio
2026-05-22  5:44               ` Stefano Brivio
2026-05-22  6:15                 ` David GIbson
2026-05-22  6:23                   ` Stefano Brivio
2026-05-22  6:36                     ` David GIbson [this message]
2026-05-22  6:45                       ` Stefano Brivio
2026-05-22 12:04                 ` Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ag_5hDFdOrZ60DTu@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=jmaloy@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).