public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 1/1] migrate: Use forward table information to close() listening sockets
Date: Sat, 31 Jan 2026 10:47:28 +0100 (CET)	[thread overview]
Message-ID: <20260131104727.2fbdfaff@elisabeth> (raw)
In-Reply-To: <20260130055811.2408284-2-david@gibson.dropbear.id.au>

On Fri, 30 Jan 2026 16:58:11 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On incoming migrations we need to bind() reconstructed sockets to their
> correct local address.  We can't do this if the origin passt instance is
> in the same namespace and still has those addresses bound.  Arguably that's
> a bug in bind()s operation during repair mode, but for now we have to work
> around it.
> 
> So, to allow local-to-local migrations we close() sockets on the outgoing
> side as we process them.  In addition to closing the connected socket we
> also have to close the associated listen()ing socket, because that can also
> cause an address conflict.
> 
> To do that, we introduced the listening_sock field in the connection
> state, because we had no other way to find the right listening sockets.
> Now that we have the forwarding table, we have a complete list of
> listening sockets elsewhere.  We can use that instead, to close all
> listening sockets on outbound migration, rather than just the ones that
> might conflict.
> 
> This is cleaner and, importantly, saves a valuable 32-bits in the flow
> state structure.  It does mean that there is a longer window where a peer
> attempting to connect during migration might get a Connection Refused.
> I think this is an acceptable trade-off for now: arguably we should not
> allow local-to-local migrations in any case, since the socket closes make
> it impossible to safely roll back migration as per the qemu model.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  flow.c     | 12 ++++++++++++
>  fwd.c      | 21 +++++++++++++++++++++
>  fwd.h      |  1 +
>  tcp.c      |  9 ---------
>  tcp_conn.h |  3 ---
>  5 files changed, 34 insertions(+), 12 deletions(-)
> 
> diff --git a/flow.c b/flow.c
> index fd4d5f38..5207143d 100644
> --- a/flow.c
> +++ b/flow.c
> @@ -1023,6 +1023,9 @@ static int flow_migrate_source_rollback(struct ctx *c, unsigned bound, int ret)
>  
>  	debug("...roll back migration");
>  
> +	if (fwd_listen_sync(c, &c->tcp.fwd_in, PIF_HOST, IPPROTO_TCP) < 0)
> +		die("Failed to re-establish listening sockets");
> +
>  	foreach_established_tcp_flow(flow) {
>  		if (FLOW_IDX(flow) >= bound)
>  			break;
> @@ -1147,6 +1150,15 @@ int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage,

Nit: the comment to this function currently says "Send data (flow
table) for flow, close listening". I fixed that up (dropped ", close listening").

>  		return flow_migrate_source_rollback(c, FLOW_MAX, rc);
>  	}
>  
> +	/* HACK: A local to local migrate will fail if the origin passt has the
> +	 * listening sockets still open when the destination passt tries to bind
> +	 * them.  This does mean there's a window where we lost our listen()s,
> +	 * even if the migration is rolled back later.  The only way to really
> +	 * fix that is to not allow local to local migration, which arguably we
> +	 * should (use namespaces for testing instead). */

Actually, we already use namespaces in the current tests, but we didn't
(always) do that during development, and it might be convenient in
general to have the possibility to test *a part* of the implementation
using the same namespace as long as it's reasonably cheap (it seems to
be).

That's just a part because anyway bind() and connect() will conflict,
if we're in the same namespace, which is a kernel issue you already
noted:

  https://pad.passt.top/p/TcpRepairTodo#L3
  Repair mode sockets should not have address conflicts with non-repair
  sockets (both bind() and connect())

but even that part is convenient to have, I think, so I'm a bit worried
that somebody might take this comment as a to-do item, while I don't
think it should be.

Patch applied anyway, to give this as much testing time and exposure as
possible.

-- 
Stefano


  reply	other threads:[~2026-01-31  9:47 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-30  5:58 [PATCH 0/1] migrate: Eliminate listening_sock field from TCP connection state David Gibson
2026-01-30  5:58 ` [PATCH 1/1] migrate: Use forward table information to close() listening sockets David Gibson
2026-01-31  9:47   ` Stefano Brivio [this message]
2026-02-02  0:24     ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260131104727.2fbdfaff@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).