From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top, Laurent Vivier <lvivier@redhat.com>
Subject: Re: [PATCH 6/7] Introduce facilities for guest migration on top of vhost-user infrastructure
Date: Thu, 30 Jan 2025 09:32:36 +0100
Message-ID: <20250130093236.117c3fd0@elisabeth>
In-Reply-To: <Z5ssbg6ID_Tqx6Eq@zatzit>
On Thu, 30 Jan 2025 18:38:22 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> Right, but in the present draft you pay that cost whether or not
> you're actually using the flows. Unfortunately a busy server with
> heaps of active connections is exactly the case that's likely to be
> most sensitive to additional downtime, but there's not really any
> getting around that. A machine with a lot of state will need either
> high downtime or high migration bandwidth.
It's... sixteen megabytes. A KubeVirt node is only allowed to perform up
to _four_ migrations in parallel, and that's our main use case at the
moment. "High downtime" is kind of relative.
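For reference, the sixteen megabytes come from the numbers in this
thread: 128k flow entries, each padded to 128 bytes by patch 2/7. A
minimal sketch of the arithmetic, where the macro and function names
are made up for illustration and the link speeds are arbitrary examples:

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions taken from this thread: 128k entries, 128 bytes each.
 * The names below are hypothetical, not identifiers from the series.
 */
#define FLOW_ENTRIES	(128 * 1024)
#define FLOW_ENTRY_SZ	128

/* Bytes sent if the whole flow table is transferred */
uint64_t flow_table_bytes(void)
{
	return (uint64_t)FLOW_ENTRIES * FLOW_ENTRY_SZ;
}

/* Milliseconds to send 'bytes' at 'mbit_per_s' (10^6 bit/s), rounded up,
 * ignoring any protocol overhead
 */
uint64_t transfer_ms(uint64_t bytes, uint64_t mbit_per_s)
{
	uint64_t bits_per_ms = mbit_per_s * 1000;

	assert(mbit_per_s > 0);

	return (bytes * 8 + bits_per_ms - 1) / bits_per_ms;
}
```

Under these assumptions, transfer_ms(flow_table_bytes(), 1000) gives
135 ms on a 1 Gbit/s link, and 14 ms at 10 Gbit/s.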
> But, I'm really hoping we can move relatively quickly to a model where
> a guest with only a handful of connections _doesn't_ have to pay that
> 128k flow cost - and can consequently migrate ok even with quite
> constrained migration bandwidth. In that scenario the size of the
> header could become significant.
I think the biggest cost of the full flow table transfer is rather in
the code: it's a bit quicker to write (I just managed to properly set
sequences on the target, connections don't quite "flow" yet), but it's
relatively expensive to maintain (as you mentioned, we need to be
careful about every single field) and easy to break.
I would like to quickly complete the whole flow first, because I think
we can inform design and implementation decisions much better at that
point, and we can be sure it's feasible, but I'm not particularly keen
to merge this patch as it is, if we can switch it relatively swiftly
to an implementation where we model a smaller fixed-endian structure
with just the stuff we need.
And again, to be a bit more sure of which stuff we need in it, the full
flow is useful to have implemented.
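To illustrate what a smaller fixed-endian structure could look like,
here is a rough sketch. struct flow_xfer and its fields are hypothetical
placeholders (which fields we actually need is exactly what is still to
be found out). Fields are packed byte by byte in big-endian order, so
the wire format depends neither on host endianness nor on struct
padding:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one TCP flow: placeholder fields only */
struct flow_xfer {
	uint32_t seq_to_tap;
	uint32_t seq_from_tap;
	uint16_t sport;
	uint16_t dport;
};

#define FLOW_XFER_WIRE_SZ	12	/* 4 + 4 + 2 + 2, no padding on the wire */

static uint8_t *put_be16(uint8_t *p, uint16_t v)
{
	p[0] = v >> 8;
	p[1] = v & 0xff;
	return p + 2;
}

static uint8_t *put_be32(uint8_t *p, uint32_t v)
{
	p = put_be16(p, v >> 16);
	return put_be16(p, v & 0xffff);
}

static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)(p[0] << 8 | p[1]);
}

static uint32_t get_be32(const uint8_t *p)
{
	return (uint32_t)get_be16(p) << 16 | get_be16(p + 2);
}

/* Pack f into buf (at least FLOW_XFER_WIRE_SZ bytes), return bytes written */
size_t flow_xfer_pack(uint8_t *buf, const struct flow_xfer *f)
{
	uint8_t *p = buf;

	p = put_be32(p, f->seq_to_tap);
	p = put_be32(p, f->seq_from_tap);
	p = put_be16(p, f->sport);
	p = put_be16(p, f->dport);

	assert(p - buf == FLOW_XFER_WIRE_SZ);
	return p - buf;
}

/* Unpack FLOW_XFER_WIRE_SZ bytes from buf into f */
void flow_xfer_unpack(struct flow_xfer *f, const uint8_t *buf)
{
	f->seq_to_tap   = get_be32(buf);
	f->seq_from_tap = get_be32(buf + 4);
	f->sport        = get_be16(buf + 8);
	f->dport        = get_be16(buf + 10);
}
```

Explicit shifts instead of a packed struct plus htonl() keep the sketch
independent of compiler layout, at the cost of a bit more code per
field.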
Actually the biggest complications I see in switching to that approach,
from the current point, are that we need to, I guess:
1. model arrays (not really complicated by itself)
2. have a temporary structure where we store flows instead of using the
flow table directly (meaning that the "data model" needs to logically
decouple source and destination of the copy)
3. batch stuff to some extent. We'll call socket() and connect() once
for each socket anyway, obviously, but sending one message to the
TCP_REPAIR helper for each socket looks like a rather substantial
and avoidable overhead
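As a sketch of point 3, several sockets can travel in a single message
as SCM_RIGHTS ancillary data over a UNIX domain socket. This assumes
the helper accepts file descriptors that way. The message layout (one
command byte plus up to BATCH_MAX descriptors) and all the names here
are hypothetical, not the actual helper protocol:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH_MAX	16	/* hypothetical cap on sockets per message */

union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int) * BATCH_MAX)];
	struct cmsghdr align;	/* forces suitable alignment of buf */
};

/* Send n sockets and a one-byte command to the helper in one message */
int fd_batch_send(int helper, const int *fds, int n, char cmd)
{
	struct iovec iov = { .iov_base = &cmd, .iov_len = sizeof(cmd) };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	struct cmsghdr *cmsg;
	union fd_cmsg u;

	if (n < 1 || n > BATCH_MAX)
		return -1;

	memset(&u, 0, sizeof(u));
	msg.msg_control = u.buf;
	msg.msg_controllen = CMSG_SPACE(sizeof(int) * n);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int) * n);
	memcpy(CMSG_DATA(cmsg), fds, sizeof(int) * n);

	return sendmsg(helper, &msg, 0) == (ssize_t)sizeof(cmd) ? 0 : -1;
}

/* Helper side: receive one batch into fds (room for BATCH_MAX entries),
 * return the number of sockets received, -1 on error
 */
int fd_batch_recv(int sock, int *fds, char *cmd)
{
	struct iovec iov = { .iov_base = cmd, .iov_len = sizeof(*cmd) };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	struct cmsghdr *cmsg;
	union fd_cmsg u;
	int n;

	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*cmd))
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;

	n = (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int);
	assert(n <= BATCH_MAX);	/* bounded by the control buffer size */
	memcpy(fds, CMSG_DATA(cmsg), sizeof(int) * n);

	return n;
}
```

The command byte could carry, for instance, the value the helper should
set for TCP_REPAIR on every socket of the batch. That mapping is an
assumption as well.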
> > > It's both easier to do
> > > and a bigger win in most cases. That would dramatically reduce the
> > > size sent here.
> >
> > Yep, feel free.
>
> It's on my queue for the next few days.
To me this part actually looks like the biggest priority after/while
getting the whole thing to work, because we can start right with a 'v1'
which looks more sustainable.
And I would just get stuff working on x86_64 in that case, without even
implementing conversions and endianness switches etc.
--
Stefano