From mboxrd@z Thu Jan  1 00:00:00 1970
Authentication-Results: passt.top; dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: passt.top;
	dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=UNQLv18V;
	dkim-atps=neutral
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124])
	by passt.top (Postfix) with ESMTPS id 3E1BD5A0622
	for <passt-dev@passt.top>; Thu, 27 Feb 2025 05:32:27 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1740630746;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=kKLTJSz5fBH7jB7I4wsZ2vHWkf9deRZ+C+gnu86J2zA=;
	b=UNQLv18V3RQl2wJ7oQyb0tBnLhFFjOW1NLdjbAvMvyAvM4aPoivi8fG+KsjUBQXykY1cgE
	p0KzMUaGUQmmMLHpFYER5g3XtnQGfBHCDVINBOfvEcHoDUPldqBlLiPK8S/+cJvvSyLiI5
	gUkZuzLbivvrGGwp5zO1BDaI1BW9zZo=
Received: from mail-wr1-f71.google.com (mail-wr1-f71.google.com
 [209.85.221.71]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id
 us-mta-205-ruoL86ciN_e8PTSIgMnMyg-1; Wed, 26 Feb 2025 23:32:24 -0500
X-MC-Unique: ruoL86ciN_e8PTSIgMnMyg-1
X-Mimecast-MFC-AGG-ID: ruoL86ciN_e8PTSIgMnMyg_1740630743
Received: by mail-wr1-f71.google.com with SMTP id ffacd0b85a97d-38f55ccb04bso262343f8f.3
        for <passt-dev@passt.top>; Wed, 26 Feb 2025 20:32:24 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1740630743; x=1741235543;
        h=content-transfer-encoding:mime-version:organization:references
         :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state
         :from:to:cc:subject:date:message-id:reply-to;
        bh=kKLTJSz5fBH7jB7I4wsZ2vHWkf9deRZ+C+gnu86J2zA=;
        b=ojMrxbguwMNYgcdG0MpCMeCTp7lD6lMYxTEikIQFNWb9vKhu18g5gOpas7B8wI8+Jk
         0KyXccwFc7Kz6hqk+gSI34mOl7D//QgAqGkMyyTNsHNicaNkIfzHRLMxNfnn5D9DQb45
         Ue7SzDjJACDBdsaAq7RB6cRsvlF29yJBDUl9gXDEmlGhGskPQxa0CmoW6X21LSUgRkU6
         RotWhMbBHMyAJ4xhf/B6XaWjpBOzktK6EWlgNJ/QuJuqtE0j0XqPyDQ3sdvcuZ7pcIdu
         QgWQ1MOKjGYlV03HaGZ0sAq24ULqZwNEYsEeRn8hcAVrHjxnxhy2PtN3gvPRNIM3pEsU
         GTwA==
X-Gm-Message-State: AOJu0YzjBprNHEsxQrpQUNw3clIZbwkJgThxhn/iGEFM5z0QRZiYwBpP
	jU06VfgxjzyRXVN3YHO3gAZnA1GaKjwZqOxWlUZQcY+UOxnhtONz5yYxgjnO7cN7ntyUbZLna+K
	5dxeQtT0xaOe305R7uEk6Q1pFNlAZooPqXmucry9HqW0eBS4UGw==
X-Gm-Gg: ASbGncttIkO951w2fm5N5WUKBfz92WPCC4SqN4/q8Vcf5atvm+duXcbfa/6WGZBo6nA
	kQkXeIXYeVtv3gk5Vi+fsPuFW4U9V3UtuV3DyNcp1Tb/2+6oOjl5a4hfabmnUB27tiKMBkShLjB
	pD83ajT/1QqGCy+7hT1BOKATnFdz4Qu92KqCneNc5se6GzJnTrLVCAWgOCenJVNHAqokrcMa9zY
	yIGs9XkthcrRhVXBDrmqDNzp5LfIx1wFhi0w6k+rjEOrZHV7TujwP8QAdVGXaUZUKCHw1QBh4St
	IN4e0rQIaLPc9ckWE48Za0JVYpA0IVd+h69L6fX7DK8C
X-Received: by 2002:a5d:59ad:0:b0:390:deff:f233 with SMTP id ffacd0b85a97d-390defff408mr2721925f8f.24.1740630743378;
        Wed, 26 Feb 2025 20:32:23 -0800 (PST)
X-Google-Smtp-Source: AGHT+IHXlTLG8EBjqXAPOgkHtIt9YWAxpdpLWLT6Tj3StPJ8dTfwd56TkoGpVvEfRuoPmbOJ8bRw7A==
X-Received: by 2002:a5d:59ad:0:b0:390:deff:f233 with SMTP id ffacd0b85a97d-390defff408mr2721915f8f.24.1740630742975;
        Wed, 26 Feb 2025 20:32:22 -0800 (PST)
Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [176.103.220.4])
        by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-390e4844c2bsm738253f8f.77.2025.02.26.20.32.22
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 26 Feb 2025 20:32:22 -0800 (PST)
Date: Thu, 27 Feb 2025 05:32:19 +0100
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v2 0/2] More graceful handling of migration without
 passt-repair
Message-ID: <20250227053219.01e76539@elisabeth>
In-Reply-To: <Z7_DTVnEZFYlRMJl@zatzit>
References: <20250225055132.3677190-1-david@gibson.dropbear.id.au>
	<20250225184316.407247f4@elisabeth>
	<Z75f9IDhnLS7UmDW@zatzit>
	<20250226090948.3d1fff91@elisabeth>
	<Z77V_6xrLXlkmuDx@zatzit>
	<20250226122412.33009f77@elisabeth>
	<Z7_DTVnEZFYlRMJl@zatzit>
Organization: Red Hat
X-Mailer: Claws Mail 4.2.0 (GTK 3.24.41; x86_64-pc-linux-gnu)
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-MFC-PROC-ID: XxrOb3z4yXwIHsD-vmeE8hmK8uj0ehZS8LhqrF8Qq5s_1740630743
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-ID-Hash: NZGKO32ZDFDJVT5L52IIP4K3LG2C72AO
X-Message-ID-Hash: NZGKO32ZDFDJVT5L52IIP4K3LG2C72AO
X-MailFrom: sbrivio@redhat.com
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: passt-dev@passt.top
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches for passt <passt-dev.passt.top>
Archived-At: <https://archives.passt.top/passt-dev/20250227053219.01e76539@elisabeth/>
Archived-At: <https://passt.top/hyperkitty/list/passt-dev@passt.top/message/NZGKO32ZDFDJVT5L52IIP4K3LG2C72AO/>
List-Archive: <https://archives.passt.top/passt-dev/>
List-Archive: <https://passt.top/hyperkitty/list/passt-dev@passt.top/>
List-Help: <mailto:passt-dev-request@passt.top?subject=help>
List-Owner: <mailto:passt-dev-owner@passt.top>
List-Post: <mailto:passt-dev@passt.top>
List-Subscribe: <mailto:passt-dev-join@passt.top>
List-Unsubscribe: <mailto:passt-dev-leave@passt.top>

On Thu, 27 Feb 2025 12:43:41 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Feb 26, 2025 at 12:24:12PM +0100, Stefano Brivio wrote:
> > On Wed, 26 Feb 2025 19:51:11 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Wed, Feb 26, 2025 at 09:09:48AM +0100, Stefano Brivio wrote:  
> > > > On Wed, 26 Feb 2025 11:27:32 +1100
> > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > >     
> > > > > On Tue, Feb 25, 2025 at 06:43:16PM +0100, Stefano Brivio wrote:    
> > > > > > On Tue, 25 Feb 2025 16:51:30 +1100
> > > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > > >       
> > > > > > > From Red Hat internal testing we've had some reports that if
> > > > > > > attempting to migrate without passt-repair, the failure mode is uglier
> > > > > > > than we'd like.  The migration fails, which is somewhat expected, but
> > > > > > > we don't correctly roll things back on the source, so it breaks
> > > > > > > network there as well.
> > > > > > > 
> > > > > > > Handle this more gracefully by allowing the migration to proceed in this
> > > > > > > case, but allow TCP connections to break.
> > > > > > > 
> > > > > > > I've now tested this reasonably:
> > > > > > >  * I get a clean migration if there are no active flows
> > > > > > >  * Migration completes, although connections are broken if
> > > > > > >    passt-repair isn't connected
> > > > > > >  * Basic test suite (minus perf)
> > > > > > > 
> > > > > > > I didn't manage to test with libvirt yet, but I'm pretty convinced the
> > > > > > > behaviour should be better than it was.      
> > > > > > 
> > > > > > I did, and it is. The series looks good to me and I would apply it as
> > > > > > it is, but I'm waiting a bit longer in case you want to try out some
> > > > > > variations based on my tests as well. Here's what I did.      
> > > > > 
> > > > > [snip]
> > > > > 
> > > > > Thanks for the detailed instructions.  More complex than I might have
> > > > > liked, but oh well.
> > > > >     
> > > > > >   $ virsh migrate --verbose --p2p --live --unsafe alpine --tunneled qemu+ssh://88.198.0.161:10951/session
> > > > > >   Migration: [97.59 %]error: End of file while reading data: : Input/output error
> > > > > > 
> > > > > > ...despite --verbose the error doesn't tell much (perhaps I need
> > > > > > LIBVIRT_DEBUG=1 instead?), but passt terminates at this point. With
> > > > > > this series (I just used 'make install' from the local build), migration
> > > > > > succeeds instead:
> > > > > > 
> > > > > >   $ virsh migrate --verbose --p2p --live --unsafe alpine --tunneled qemu+ssh://88.198.0.161:10951/session
> > > > > >   Migration: [100.00 %]
> > > > > > 
> > > > > > Now, on the target, I still have to figure out how to tell libvirt
> > > > > > to start QEMU and prepare for the migration (equivalent of '-incoming'
> > > > > > as we use in our tests), instead of just starting a new instance like
> > > > > > it does. Otherwise, I have no chance to start passt-repair there.
> > > > > > Perhaps it has something to do with persistent mode described here:      
> > > > > 
> > > > > Ah.  So I'm pretty sure virsh migrate will automatically start qemu
> > > > > with --incoming on the target.    
> > > > 
> > > > ("-incoming"), yes, see src/qemu/qemu_migration.c,
> > > > qemuMigrationDstPrepare().
> > > >     
> > > > > IIUC the problem here is more about
> > > > > timing: we want it to start it early, so that we have a chance to
> > > > > start passt-repair and let it connect before the migration actually
> > > > > happens.    
> > > > 
> > > > For the timing itself, we could actually wait for passt-repair to be
> > > > there, with a timeout (say, 100ms).    
> > > 
> > > I guess.  That still requires some way for KubeVirt (or whatever) to
> > > know at least roughly when it needs to launch passt-repair, and I'm
> > > not sure if that's something we can currently get from libvirt.  
> > 
> > KubeVirt sets up the target pod, and that's when it should be done (if
> > we have an inotify mechanism or similar). I can't point to an exact code
> > path yet but there's something like that.  
> 
> Right, but that approach does require inotify and starting
> passt-repair before passt, which might be fine, but I have the concern
> noted below.  To avoid that we'd need notification after passt & qemu
> are started on the target, but before the migration is actually
> initiated which I don't think libvirt provides.
> 
> > > > We could also modify passt-repair to set up an inotify watcher if the
> > > > socket isn't there yet.    
> > > 
> > > Maybe, yes.  This kind of breaks our "passt starts first, passt-repair
> > > connects to it" model though, and I wonder if we need to revisit the
> > > security implications of that.  
> > 
> > I don't think it actually breaks that model for security purposes,
> > because the guest has no chance anyway to cause a connection to
> > passt-repair. The guest is still suspended (or missing) at that point.  
> 
> I wasn't thinking of threat models coming from the guest, but an
> attack from an unrelated process impersonating passt in order to
> access passt-repair's superpowers.

Then an inotify watch shouldn't substantially change things. The
attacker could create the socket earlier and obtain the same outcome.

> [...]
>
> > We could even think of deferring switching repair mode off until the
> > right address is there, by the way. That would make a difference to
> > us.  
> 
> Do you mean by blocking?  Or by returning to normal operation with the
> flow flagged somehow to be "woken up" by a netlink monitor?

The latter. I don't think we should block connectivity (with new
addresses) meanwhile.

-- 
Stefano