From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 12 Mar 2025 21:39:10 +0100
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson
Subject: Re: [PATCH v2] flow, repair: Wait for a short while for passt-repair to connect
Message-ID: <20250312213910.059118d6@elisabeth>
In-Reply-To:
References: <20250307224129.2789988-1-sbrivio@redhat.com>
 <20250311225532.7ddaa1cd@elisabeth>
Organization: Red Hat
CC: passt-dev@passt.top

On Wed, 12 Mar 2025 12:29:11 +1100
David Gibson wrote:

> On Tue, Mar 11, 2025 at 10:55:32PM +0100, Stefano Brivio wrote:
> > On Tue, 11 Mar 2025 12:13:46 +1100
> > David Gibson wrote:
> > 
> > > On Fri, Mar 07, 2025 at 11:41:29PM +0100, Stefano Brivio wrote:
> > > > ...and time out after that. This will be needed because of an upcoming
> > > > change to passt-repair enabling it to start before passt is started,
> > > > on both source and target, by means of an inotify watch.
> > > > 
> > > > Once the inotify watch triggers, passt-repair will connect right away,
> > > > but we have no guarantees that the connection completes before we
> > > > start the migration process, so wait for it (for a reasonable amount
> > > > of time).
> > > > 
> > > > Signed-off-by: Stefano Brivio
> > > 
> > > I still think it's ugly, of course, but I don't see a better way, so:
> > > 
> > > Reviewed-by: David Gibson
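
To make the mechanism concrete: assuming passt already has the repair
socket bound and listening, the wait boils down to something like the
sketch below. This is illustrative only, not the actual passt code; the
helper name and the constant are made up:

	/* Sketch: give the peer (passt-repair) up to 10 ms to connect to a
	 * listening UNIX domain socket, instead of blocking the migration
	 * indefinitely.  Returns the accepted socket, or -1 on timeout/error.
	 */
	#include <poll.h>
	#include <sys/socket.h>

	#define REPAIR_ACCEPT_TIMEOUT_MS	10

	static int accept_repair_conn(int listen_fd)
	{
		struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };

		/* poll() returns 0 on timeout, -1 on error, and a positive
		 * value if a connection is pending on listen_fd
		 */
		if (poll(&pfd, 1, REPAIR_ACCEPT_TIMEOUT_MS) <= 0)
			return -1;

		return accept(listen_fd, NULL, NULL);
	}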

> > > > ---
> > > > v2:
> > > > 
> > > > - Use 10 ms as timeout instead of 100 ms. Given that I'm unable to
> > > >   migrate a simple guest with 256 MiB of memory and no storage other
> > > >   than an initramfs in less than 4 milliseconds, at least on my test
> > > >   system (rather fast CPU threads and memory interface), I think that
> > > >   10 ms shouldn't make a big difference in case passt-repair is not
> > > >   available for whatever reason
> > > 
> > > So, IIUC, that 4ms is the *total* migration time.
> > 
> > Ah, no, that's passt-to-passt in the migrate/basic test, to have a fair
> > comparison. That is:
> > 
> > $ git diff
> > diff --git a/migrate.c b/migrate.c
> > index 0fca77b..3d36843 100644
> > --- a/migrate.c
> > +++ b/migrate.c
> > @@ -286,6 +286,13 @@ void migrate_handler(struct ctx *c)
> >  	if (c->device_state_fd < 0)
> >  		return;
> >  
> > +#include <time.h>
> > +	{
> > +		struct timespec now;
> > +		clock_gettime(CLOCK_REALTIME, &now);
> > +		err("tv: %li.%li", now.tv_sec, now.tv_nsec);
> > +	}
> > +
> >  	debug("Handling migration request from fd: %d, target: %d",
> >  	      c->device_state_fd, c->migrate_target);
> 
> Ah. That still doesn't really measure the guest downtime, for two
> reasons:
>  * It measures from start of migration on the source to start of
>    migration on the target, largely ignoring the actual duration of
>    passt's processing
>  * It ignores everything that happens during the final migration
>    phase *except* for passt itself
> 
> But, it is necessarily a lower bound on the downtime, which I guess is
> enough in this instance.
> 
> > $ grep tv\: test/test_logs/context_passt_*.log
> > test/test_logs/context_passt_1.log:tv: 1741729630.368652064
> > test/test_logs/context_passt_2.log:tv: 1741729630.378664420
> > 
> > In this case it's 10 ms, but I can sometimes get 7 ms. This is with 512
> > MiB, but with 256 MiB I typically get 5 to 6 ms, and sometimes slightly
> > more than 4 ms. One flow or zero flows seem to make little difference.
> 
> Of course, because both ends of the measurement take place before we
> actually do anything. I wouldn't expect it to vary based on how much
> we're doing. All this really measures is the latency from notifying
> the source passt to notifying the target passt.
> 
> > > The concern here is
> > > not that we add to the total migration time, but that we add to the
> > > migration downtime, that is, the time the guest is not running
> > > anywhere. The downtime can be much smaller than the total migration
> > > time. Furthermore qemu has no way to account for this delay in its
> > > estimate of what the downtime will be - the time for transferring
> > > device state is pretty much assumed to be negligible in comparison to
> > > transferring guest memory contents. So, if qemu stops the guest at
> > > the point that the remaining memory transfer will just fit in the
> > > downtime limit, any delays we add will likely cause the downtime limit
> > > to be missed by that much.
> > > 
> > > Now, as it happens, the default downtime limit is 300ms, so an
> > > additional 10ms is probably fine (though 100ms really wasn't).
> > > Nonetheless the reasoning above isn't valid.
> > 
> > ~50 ms is actually quite easy to get with a few (8) gigabytes of
> > memory,
> 
> 50ms as measured above? That's a bit surprising, because there's no
> particular reason for it to depend on memory size. AFAICT
> SET_DEVICE_STATE_FD is called close to immediately before actually
> reading/writing the stream from the backend.

Oops, right, this figure I had in mind actually came from a rather
different measurement, that is, checking when the guest appeared to
resume from traffic captures with iperf3 running. I definitely can't
see this difference if I repeat the same measurement as above.

> The memory size will of course affect the total migration time, and
> maybe the downtime. As soon as qemu thinks it can transfer all
> remaining RAM within its downtime limit, qemu will go to the stopped
> phase. With a fast local to local connection, it's possible qemu
> could enter that stopped phase almost immediately.
> 
> > that's why 100 ms also looked fine to me, but sure, 10 ms
> > sounds more reasonable.

-- 
Stefano
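
P.S. For completeness, the 10 ms figure quoted above is simply the delta
between the two CLOCK_REALTIME stamps in the test logs. A standalone
example, with the values copied from the grep output above and a made-up
helper name:

	#include <stdio.h>
	#include <time.h>

	/* Difference b - a between two CLOCK_REALTIME stamps, in milliseconds
	 * (truncating; good enough to read off the order of magnitude)
	 */
	static long long stamp_delta_ms(const struct timespec *a,
					const struct timespec *b)
	{
		return (b->tv_sec - a->tv_sec) * 1000LL +
		       (b->tv_nsec - a->tv_nsec) / 1000000LL;
	}

	int main(void)
	{
		struct timespec src = { .tv_sec = 1741729630, .tv_nsec = 368652064 };
		struct timespec dst = { .tv_sec = 1741729630, .tv_nsec = 378664420 };

		printf("%lld ms\n", stamp_delta_ms(&src, &dst));	/* prints "10 ms" */
		return 0;
	}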