* [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED)
@ 2025-02-20 6:03 David Gibson
2025-02-20 6:03 ` [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows David Gibson
2025-02-20 6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson
0 siblings, 2 replies; 10+ messages in thread
From: David Gibson @ 2025-02-20 6:03 UTC (permalink / raw)
To: Stefano Brivio, passt-dev; +Cc: David Gibson
From Red Hat internal testing, we've had reports that if we attempt
to migrate without passt-repair, the failure mode is uglier than we'd
like. The migration fails, which is somewhat expected, but we don't
correctly roll things back on the source, so it breaks networking
there as well.
Handle this more gracefully by allowing the migration to proceed in
this case, while accepting that TCP connections will break.
Unfortunately, I only spotted the ticket about this quite late in my
day. I'm posting this now for visibility, but I ran out of time to
test it. I'll do that tomorrow if someone doesn't pick it up before
then.
David Gibson (2):
migrate, flow: Trivially succeed if migrating with no flows
migrate, flow: Don't attempt to migrate TCP flows without passt-repair
flow.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows
2025-02-20 6:03 [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED) David Gibson
@ 2025-02-20 6:03 ` David Gibson
2025-02-20 6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson
1 sibling, 0 replies; 10+ messages in thread
From: David Gibson @ 2025-02-20 6:03 UTC (permalink / raw)
To: Stefano Brivio, passt-dev; +Cc: David Gibson
We could get a migration request when we have no active flows, or at
least none that we need or are able to migrate. In this case, after
sending or receiving the number of flows, we continue to step through
various lists. In the target case, this could include communication
with passt-repair. If passt-repair wasn't started, that could cause
further errors, but of course they shouldn't matter if we have nothing
to repair.
Make it more obvious that there's nothing to do and avoid such errors by
short-circuiting flow_migrate_{source,target}() if there are no migratable
flows.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
flow.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/flow.c b/flow.c
index bb5dcc3c..6cf96c26 100644
--- a/flow.c
+++ b/flow.c
@@ -999,6 +999,9 @@ int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage,
debug("Sending %u flows", ntohl(count));
+ if (!count)
+ return 0;
+
/* Dump and send information that can be stored in the flow table.
*
* Limited rollback options here: if we fail to transfer any data (that
@@ -1070,6 +1073,9 @@ int flow_migrate_target(struct ctx *c, const struct migrate_stage *stage,
count = ntohl(count);
debug("Receiving %u flows", count);
+ if (!count)
+ return 0;
+
if ((rc = flow_migrate_repair_all(c, true)))
return -rc;
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
2025-02-20 6:03 [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED) David Gibson
2025-02-20 6:03 ` [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows David Gibson
@ 2025-02-20 6:03 ` David Gibson
2025-02-20 8:07 ` Stefano Brivio
1 sibling, 1 reply; 10+ messages in thread
From: David Gibson @ 2025-02-20 6:03 UTC (permalink / raw)
To: Stefano Brivio, passt-dev; +Cc: David Gibson
Migrating TCP flows requires passt-repair in order to use TCP_REPAIR. If
passt-repair is not started, our failure mode is pretty ugly though: we'll
attempt the migration, hitting various problems when we can't enter repair
mode. In some cases we may not roll back these changes properly, meaning
we break network connections on the source.
Our general approach is not to completely block migration if there are
problems, but simply to break any flows we can't migrate. So, if we have
no connection from passt-repair, carry on with the migration, but don't
attempt to migrate any TCP connections.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
flow.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/flow.c b/flow.c
index 6cf96c26..749c4984 100644
--- a/flow.c
+++ b/flow.c
@@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
union flow *flow;
int rc;
+ /* If we don't have a repair helper, there's nothing we can do */
+ if (c->fd_repair < 0)
+ return 0;
+
foreach_established_tcp_flow(flow) {
if (enable)
rc = tcp_flow_repair_on(c, &flow->tcp);
@@ -987,8 +991,11 @@ int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage,
(void)c;
(void)stage;
- foreach_established_tcp_flow(flow)
- count++;
+ /* If we don't have a repair helper, we can't migrate TCP flows */
+ if (c->fd_repair >= 0) {
+ foreach_established_tcp_flow(flow)
+ count++;
+ }
count = htonl(count);
if (write_all_buf(fd, &count, sizeof(count))) {
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
2025-02-20 6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson
@ 2025-02-20 8:07 ` Stefano Brivio
2025-02-20 10:18 ` David Gibson
0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2025-02-20 8:07 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev
On Thu, 20 Feb 2025 17:03:18 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> Migrating TCP flows requires passt-repair in order to use TCP_REPAIR. If
> passt-repair is not started, our failure mode is pretty ugly though: we'll
> attempt the migration, hitting various problems when we can't enter repair
> mode. In some cases we may not roll back these changes properly, meaning
> we break network connections on the source.
>
> Our general approach is not to completely block migration if there are
> problems, but simply to break any flows we can't migrate. So, if we have
> no connection from passt-repair, carry on with the migration, but don't
> attempt to migrate any TCP connections.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> flow.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/flow.c b/flow.c
> index 6cf96c26..749c4984 100644
> --- a/flow.c
> +++ b/flow.c
> @@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
> union flow *flow;
> int rc;
>
> + /* If we don't have a repair helper, there's nothing we can do */
> + if (c->fd_repair < 0)
> + return 0;
> +
This doesn't fix the behaviour in a relatively likely failure mode:
passt-repair is there, but we can't communicate with it (LSM policy
issues or similar).
In that case, unconditionally terminating on failure in the rollback
function:
if (tcp_flow_repair_off(c, &flow->tcp))
die("Failed to roll back TCP_REPAIR mode");
if (repair_flush(c))
die("Failed to roll back TCP_REPAIR mode");
isn't a very productive thing to do: we go from an uneventful failure
where flows were not affected at all to a guest left without
connectivity.
That starts looking less robust than the alternative (e.g. what I
implemented in v12: silently fail and continue) at least without
https://patchew.org/QEMU/20250217092550.1172055-1-lvivier@redhat.com/
in a general case as well: if we continue, we'll have hanging flows
that will expire on timeout, but if we don't, again, we'll have a
guest without connectivity.
I understand that leaving flows around for that long might present a
relevant inconsistency, though.
So I'm wondering about some alternatives: actually, the rollback
function shouldn't be called at all in this case. Or it could just
(indirectly) call tcp_rst() on all the flows that were possibly
affected.
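Roughly, that tcp_rst() option could look like this (sketch only,
untested, reusing the helpers visible in the patches above):

	/* Reset every flow that might have been affected, instead of
	 * die()ing.  For sockets in repair mode no RST reaches the
	 * peer; for the others it does.  Either way the flow is
	 * cleaned up locally and connectivity can recover.
	 */
	foreach_established_tcp_flow(flow)
		tcp_rst(c, &flow->tcp);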
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
2025-02-20 8:07 ` Stefano Brivio
@ 2025-02-20 10:18 ` David Gibson
2025-02-20 10:38 ` Stefano Brivio
0 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2025-02-20 10:18 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev
[-- Attachment #1: Type: text/plain, Size: 4008 bytes --]
On Thu, Feb 20, 2025 at 09:07:26AM +0100, Stefano Brivio wrote:
> On Thu, 20 Feb 2025 17:03:18 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > Migrating TCP flows requires passt-repair in order to use TCP_REPAIR. If
> > passt-repair is not started, our failure mode is pretty ugly though: we'll
> > attempt the migration, hitting various problems when we can't enter repair
> > mode. In some cases we may not roll back these changes properly, meaning
> > we break network connections on the source.
> >
> > Our general approach is not to completely block migration if there are
> > problems, but simply to break any flows we can't migrate. So, if we have
> > no connection from passt-repair, carry on with the migration, but don't
> > attempt to migrate any TCP connections.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > flow.c | 11 +++++++++--
> > 1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/flow.c b/flow.c
> > index 6cf96c26..749c4984 100644
> > --- a/flow.c
> > +++ b/flow.c
> > @@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
> > union flow *flow;
> > int rc;
> >
> > + /* If we don't have a repair helper, there's nothing we can do */
> > + if (c->fd_repair < 0)
> > + return 0;
> > +
>
> This doesn't fix the behaviour in a relatively likely failure mode:
> passt-repair is there, but we can't communicate with it (LSM policy
> issues or similar).
Ah... true. Although it shouldn't make it any worse for that case,
right, so that could be a separate fix.
> In that case, unconditionally terminating on failure in the rollback
> function:
>
> if (tcp_flow_repair_off(c, &flow->tcp))
> die("Failed to roll back TCP_REPAIR mode");
>
> if (repair_flush(c))
> die("Failed to roll back TCP_REPAIR mode");
>
> isn't a very productive thing to do: we go from an uneventful failure
> where flows were not affected at all to a guest left without
> connectivity.
So, the issue is that leaving sockets in repair mode after we leave
the migration path would be very bad. We can't easily close
sockets/flows for which that's the case, because the batching means if
there's a failure we don't actually know which sockets are in which
mode, hence the die().
> That starts looking less robust than the alternative (e.g. what I
> implemented in v12: silently fail and continue) at least without
> https://patchew.org/QEMU/20250217092550.1172055-1-lvivier@redhat.com/
> in a general case as well: if we continue, we'll have hanging flows
> that will expire on timeout, but if we don't, again, we'll have a
> guest without connectivity.
>
> I understand that leaving flows around for that long might present a
> relevant inconsistency, though.
>
> So I'm wondering about some alternatives: actually, the rollback
> function shouldn't be called at all in this case. Or it could just
> (indirectly) call tcp_rst() on all the flows that were possibly
> affected.
Making it be a safe no-op if we never managed to turn repair on for
anything would make sense to me. Unfortunately, in this situation we
won't see an error until we do a repair_flush() which means we now
don't know the state of any sockets we already passed to
repair_set().
We could, I suppose, close all flows that we passed to repair_set() by
the time we see the error. If we have < one batch's worth of
connections, that will kill connectivity almost as much as die()ing,
though. I guess it will come back without needing qemu to restart us,
though, so that's something.
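As a sketch (untested, simplified to a single batch; 'last' is
hypothetical bookkeeping, not something flow.c has today):

	union flow *last = NULL;

	/* Queue repair mode for everything, remembering how far we
	 * got before the flush.
	 */
	foreach_established_tcp_flow(flow) {
		tcp_flow_repair_on(c, &flow->tcp);
		last = flow;
	}

	/* One error covers the whole batch: we can't tell which
	 * sockets actually switched, so reset every flow we passed
	 * in, along the lines of the tcp_rst() idea above.
	 */
	if (repair_flush(c) && last) {
		foreach_established_tcp_flow(flow) {
			tcp_rst(c, &flow->tcp);
			if (flow == last)
				break;
		}
	}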
This sort of thing is, incidentally, why I did way back suggest the
possibility of passt-repair reporting failures per-fd, rather than
just per-batch.
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
2025-02-20 10:18 ` David Gibson
@ 2025-02-20 10:38 ` Stefano Brivio
2025-02-21 2:40 ` David Gibson
0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2025-02-20 10:38 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev
On Thu, 20 Feb 2025 21:18:06 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Thu, Feb 20, 2025 at 09:07:26AM +0100, Stefano Brivio wrote:
> > On Thu, 20 Feb 2025 17:03:18 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > Migrating TCP flows requires passt-repair in order to use TCP_REPAIR. If
> > > passt-repair is not started, our failure mode is pretty ugly though: we'll
> > > attempt the migration, hitting various problems when we can't enter repair
> > > mode. In some cases we may not roll back these changes properly, meaning
> > > we break network connections on the source.
> > >
> > > Our general approach is not to completely block migration if there are
> > > problems, but simply to break any flows we can't migrate. So, if we have
> > > no connection from passt-repair, carry on with the migration, but don't
> > > attempt to migrate any TCP connections.
> > >
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > ---
> > > flow.c | 11 +++++++++--
> > > 1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/flow.c b/flow.c
> > > index 6cf96c26..749c4984 100644
> > > --- a/flow.c
> > > +++ b/flow.c
> > > @@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
> > > union flow *flow;
> > > int rc;
> > >
> > > + /* If we don't have a repair helper, there's nothing we can do */
> > > + if (c->fd_repair < 0)
> > > + return 0;
> > > +
> >
> > This doesn't fix the behaviour in a relatively likely failure mode:
> > passt-repair is there, but we can't communicate with it (LSM policy
> > issues or similar).
>
> Ah... true. Although it shouldn't make it any worse for that case,
> right, so that could be a separate fix.
Sure.
> > In that case, unconditionally terminating on failure in the rollback
> > function:
> >
> > if (tcp_flow_repair_off(c, &flow->tcp))
> > die("Failed to roll back TCP_REPAIR mode");
> >
> > if (repair_flush(c))
> > die("Failed to roll back TCP_REPAIR mode");
> >
> > isn't a very productive thing to do: we go from an uneventful failure
> > where flows were not affected at all to a guest left without
> > connectivity.
>
> So, the issue is that leaving sockets in repair mode after we leave
> the migration path would be very bad.
Why? I really can't see anything catastrophic happening as a result of
that (hence my v12 version of this). Surely not as bad as the guest
losing connectivity without any possible recovery.
> We can't easily close
> sockets/flows for which that's the case, because the batching means if
> there's a failure we don't actually know which sockets are in which
> mode, hence the die().
They can be closed (via tcp_rst()) anyway. If they're in repair mode,
no RST will reach the peer, and if they aren't, it will.
> > That starts looking less robust than the alternative (e.g. what I
> > implemented in v12: silently fail and continue) at least without
> > https://patchew.org/QEMU/20250217092550.1172055-1-lvivier@redhat.com/
> > in a general case as well: if we continue, we'll have hanging flows
> > that will expire on timeout, but if we don't, again, we'll have a
> > guest without connectivity.
> >
> > I understand that leaving flows around for that long might present a
> > relevant inconsistency, though.
> >
> > So I'm wondering about some alternatives: actually, the rollback
> > function shouldn't be called at all in this case. Or it could just
> > (indirectly) call tcp_rst() on all the flows that were possibly
> > affected.
>
> Making it be a safe no-op if we never managed to turn repair on for
> anything would make sense to me. Unfortunately, in this situation we
> won't see an error until we do a repair_flush() which means we now
> don't know the state of any sockets we already passed to
> repair_set().
>
> We could, I suppose, close all flows that we passed to repair_set() by
> the time we see the error. If we have < one batch's worth of
> > connections, that will kill connectivity almost as much as die()ing,
> though. I guess it will come back without needing qemu to restart us,
> though, so that's something.
Right, yes, that's what I'm suggesting.
> This sort of thing is, incidentally, why I did way back suggest the
> possibility of passt-repair reporting failures per-fd, rather than
> just per-batch.
Sorry, I somehow missed that proposal, and I can't find any trace of it.
But anyway, the problem is that if we fail to read a batch for any
reason (invalid ancillary data... maybe always implying a kernel issue,
but I'm not sure), you can't _reliably_ report per-fd failures.
*Usually*, you can. Worth it?
In any case, if it's simple, we can still do it, because passt and
passt-repair are distributed together. You can't pass back the file
descriptors via SCM_RIGHTS though, because we want to close() them
before we reply.
Another alternative could be that passt-repair reverts back the state
of the file descriptors that were already switched, on failure.
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
2025-02-20 10:38 ` Stefano Brivio
@ 2025-02-21 2:40 ` David Gibson
2025-02-21 5:59 ` Stefano Brivio
0 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2025-02-21 2:40 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev
[-- Attachment #1: Type: text/plain, Size: 6406 bytes --]
On Thu, Feb 20, 2025 at 11:38:00AM +0100, Stefano Brivio wrote:
> On Thu, 20 Feb 2025 21:18:06 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Thu, Feb 20, 2025 at 09:07:26AM +0100, Stefano Brivio wrote:
> > > On Thu, 20 Feb 2025 17:03:18 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > Migrating TCP flows requires passt-repair in order to use TCP_REPAIR. If
> > > > passt-repair is not started, our failure mode is pretty ugly though: we'll
> > > > attempt the migration, hitting various problems when we can't enter repair
> > > > mode. In some cases we may not roll back these changes properly, meaning
> > > > we break network connections on the source.
> > > >
> > > > Our general approach is not to completely block migration if there are
> > > > problems, but simply to break any flows we can't migrate. So, if we have
> > > > no connection from passt-repair, carry on with the migration, but don't
> > > > attempt to migrate any TCP connections.
> > > >
> > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > ---
> > > > flow.c | 11 +++++++++--
> > > > 1 file changed, 9 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/flow.c b/flow.c
> > > > index 6cf96c26..749c4984 100644
> > > > --- a/flow.c
> > > > +++ b/flow.c
> > > > @@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
> > > > union flow *flow;
> > > > int rc;
> > > >
> > > > + /* If we don't have a repair helper, there's nothing we can do */
> > > > + if (c->fd_repair < 0)
> > > > + return 0;
> > > > +
> > >
> > > This doesn't fix the behaviour in a relatively likely failure mode:
> > > passt-repair is there, but we can't communicate with it (LSM policy
> > > issues or similar).
> >
> > Ah... true. Although it shouldn't make it any worse for that case,
> > right, so that could be a separate fix.
>
> Sure.
>
> > > In that case, unconditionally terminating on failure in the rollback
> > > function:
> > >
> > > if (tcp_flow_repair_off(c, &flow->tcp))
> > > die("Failed to roll back TCP_REPAIR mode");
> > >
> > > if (repair_flush(c))
> > > die("Failed to roll back TCP_REPAIR mode");
> > >
> > > isn't a very productive thing to do: we go from an uneventful failure
> > > where flows were not affected at all to a guest left without
> > > connectivity.
> >
> > So, the issue is that leaving sockets in repair mode after we leave
> > the migration path would be very bad.
>
> Why? I really can't see anything catastrophic happening as a result of
> that (hence my v12 version of this). Surely not as bad as the guest
> losing connectivity without any possible recovery.
I meant specifically trying to carry on using a repair-mode socket as
if it weren't in repair mode, not closing it out.
> > We can't easily close
> > sockets/flows for which that's the case, because the batching means if
> > there's a failure we don't actually know which sockets are in which
> > mode, hence the die().
>
> They can be closed (via tcp_rst()) anyway. If they're in repair mode,
> no RST will reach the peer, and if they aren't, it will.
Ah, yes, I guess we can just close anything that might be affected.
Brutal, but effective. It will disrupt connectivity, of course, but
not as badly as dying completely.
> > > That starts looking less robust than the alternative (e.g. what I
> > > implemented in v12: silently fail and continue) at least without
> > > https://patchew.org/QEMU/20250217092550.1172055-1-lvivier@redhat.com/
> > > in a general case as well: if we continue, we'll have hanging flows
> > > that will expire on timeout, but if we don't, again, we'll have a
> > > guest without connectivity.
> > >
> > > I understand that leaving flows around for that long might present a
> > > relevant inconsistency, though.
> > >
> > > So I'm wondering about some alternatives: actually, the rollback
> > > function shouldn't be called at all in this case. Or it could just
> > > (indirectly) call tcp_rst() on all the flows that were possibly
> > > affected.
> >
> > Making it be a safe no-op if we never managed to turn repair on for
> > anything would make sense to me. Unfortunately, in this situation we
> > won't see an error until we do a repair_flush() which means we now
> > don't know the state of any sockets we already passed to
> > repair_set().
> >
> > We could, I suppose, close all flows that we passed to repair_set() by
> > the time we see the error. If we have < one batch's worth of
> > connections, that will kill connectivity almost as much as die()ing,
> > though. I guess it will come back without needing qemu to restart us,
> > though, so that's something.
>
> Right, yes, that's what I'm suggesting.
Ok. I'll work on something to do that.
> > This sort of thing is, incidentally, why I did way back suggest the
> > possibility of passt-repair reporting failures per-fd, rather than
> > just per-batch.
>
> Sorry, I somehow missed that proposal, and I can't find any trace of
> it.
It may have just been on IRC somewhere.
> But anyway, the problem is that if we fail to read a batch for any
> reason (invalid ancillary data... maybe always implying a kernel issue,
> but I'm not sure), you can't _reliably_ report per-fd failures.
> *Usually*, you can. Worth it?
Ah, I see. We could handle that by being able to report both per-fd
and "whole batch" failure (equivalent to failure on every fd), but
that would complexify the protocol, of course.
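For illustration, a reply carrying both could be as simple as this
(hypothetical wire format; neither the struct nor the bound exists in
passt-repair):

	/* Hypothetical reply: one whole-batch error, plus per-fd
	 * results in the order the fds were received.  REPAIR_BATCH_MAX
	 * is a made-up bound for the sake of the example.
	 */
	struct repair_reply {
		int32_t batch_err;		/* 0, or -errno for the batch */
		int32_t fd_err[REPAIR_BATCH_MAX];	/* 0, or -errno per fd */
	};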
> In any case, if it's simple, we can still do it, because passt and
> passt-repair are distributed together. You can't pass back the file
> descriptors via SCM_RIGHTS though, because we want to close() them
> before we reply.
>
> Another alternative could be that passt-repair reverts back the state
> of the file descriptors that were already switched, on failure.
That might help a bit, but we'd still need to rework the passt-side
interface to know what needs reverting at the right stage.
I'll take those ideas and see what I can come up with today.
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
2025-02-21 2:40 ` David Gibson
@ 2025-02-21 5:59 ` Stefano Brivio
2025-02-21 6:37 ` David Gibson
0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2025-02-21 5:59 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev
On Fri, 21 Feb 2025 13:40:12 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Thu, Feb 20, 2025 at 11:38:00AM +0100, Stefano Brivio wrote:
> > On Thu, 20 Feb 2025 21:18:06 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > This sort of thing is, incidentally, why I did way back suggest the
> > > possibility of passt-repair reporting failures per-fd, rather than
> > > just per-batch.
> >
> > Sorry, I somehow missed that proposal, and I can't find any trace of
> > it.
>
> It may have just been on IRC somewhere.
>
> > But anyway, the problem is that if we fail to read a batch for any
> > reason (invalid ancillary data... maybe always implying a kernel issue,
> > but I'm not sure), you can't _reliably_ report per-fd failures.
> > *Usually*, you can. Worth it?
>
> Ah, I see. We could handle that by being able to report both per-fd
> and "whole batch" failure (equivalent to failure on every fd), but
> that would complexify the protocol, of course.
By the way, after having another look at the kernel interface and
implementation: this makes no sense.
Either we're able to set repair mode for all the sockets, or for none
of them (EPERM). And if there's any invalid file descriptor in the set,
we'll get EBADF for the whole sendmsg().
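That is, the helper necessarily treats the batch as a unit, roughly
(simplified sketch, not the actual passt-repair.c code):

#include <errno.h>
#include <sys/socket.h>
#include <netinet/tcp.h>	/* SOL_TCP, TCP_REPAIR */

/* Every socket in the batch meets the same fate: an LSM denial gives
 * EPERM for each setsockopt() alike, so one failure stands for the
 * whole batch.
 */
static int repair_batch(int *fd, int n, int op)
{
	int i;

	for (i = 0; i < n; i++) {
		if (setsockopt(fd[i], SOL_TCP, TCP_REPAIR,
			       &op, sizeof(op)))
			return -errno;
	}

	return 0;
}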
The stuff I'm proposing below is well beyond my threshold of things
that make no sense to implement, but at least it limits damage in terms
of complexity (and hence of potential impact on the actual usage,
because that's what we're talking about here: a die() that makes no
sense but now proves to be actually harmful). I wouldn't go any further
than that.
> > In any case, if it's simple, we can still do it, because passt and
> > passt-repair are distributed together. You can't pass back the file
> > descriptors via SCM_RIGHTS though, because we want to close() them
> > before we reply.
> >
> > Another alternative could be that passt-repair reverts back the state
> > of the file descriptors that were already switched, on failure.
>
> > That might help a bit, but we'd still need to rework the passt-side
> interface to know what needs reverting at the right stage.
Not for what I'm proposing:
1. passt sends 1 (TCP_REPAIR_ON) for sockets 2, 3
2. passt-repair sets repair mode for 2
3. passt-repair fails to set repair mode for 3
4. passt-repair clears repair mode for 2
5. passt-repair closes the connection to signal failure
6. passt knows that repair mode is *not* set for any socket in the batch
The interface remains the same (per-batch error only), but you can rely
on the whole batch to have failed.
You already can, even with no changes in passt-repair (see above), by
the way.
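In code, the passt-repair side of that could be as simple as this
(hypothetical sketch; TCP_REPAIR_ON/TCP_REPAIR_OFF values as in
linux/tcp.h):

/* Steps 1 to 6 above: on any failure, undo the sockets already
 * switched (step 4), then return nonzero so that the caller closes
 * the connection to signal failure (step 5).
 */
static int repair_set_all(int *fd, int n, int op)
{
	int undo = op == TCP_REPAIR_ON ? TCP_REPAIR_OFF : TCP_REPAIR_ON;
	int i, j;

	for (i = 0; i < n; i++) {
		if (!setsockopt(fd[i], SOL_TCP, TCP_REPAIR,
				&op, sizeof(op)))
			continue;

		for (j = 0; j < i; j++)
			setsockopt(fd[j], SOL_TCP, TCP_REPAIR,
				   &undo, sizeof(undo));
		return -1;
	}

	return 0;
}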
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
2025-02-21 5:59 ` Stefano Brivio
@ 2025-02-21 6:37 ` David Gibson
2025-02-21 7:03 ` Stefano Brivio
0 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2025-02-21 6:37 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev
[-- Attachment #1: Type: text/plain, Size: 3734 bytes --]
On Fri, Feb 21, 2025 at 06:59:12AM +0100, Stefano Brivio wrote:
> On Fri, 21 Feb 2025 13:40:12 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Thu, Feb 20, 2025 at 11:38:00AM +0100, Stefano Brivio wrote:
> > > On Thu, 20 Feb 2025 21:18:06 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > This sort of thing is, incidentally, why I did way back suggest the
> > > > possibility of passt-repair reporting failures per-fd, rather than
> > > > just per-batch.
> > >
> > > Sorry, I somehow missed that proposal, and I can't find any trace of
> > > it.
> >
> > It may have just been on IRC somewhere.
> >
> > > But anyway, the problem is that if we fail to read a batch for any
> > > reason (invalid ancillary data... maybe always implying a kernel issue,
> > > but I'm not sure), you can't _reliably_ report per-fd failures.
> > > *Usually*, you can. Worth it?
> >
> > Ah, I see. We could handle that by being able to report both per-fd
> > and "whole batch" failure (equivalent to failure on every fd), but
> > that would complexify the protocol, of course.
>
> By the way, after having another look at the kernel interface and
> implementation: this makes no sense.
>
> Either we're able to set repair mode for all the sockets, or for none
> of them (EPERM).
Well.. probably. I suspect something sufficiently insane in an LSM
could break that rule.
> And if there's any invalid file descriptor in the set,
> we'll get EBADF for the whole sendmsg().
>
> The stuff I'm proposing below is well beyond my threshold of things
> that make no sense to implement, but at least it limits damage in terms
> of complexity (and hence of potential impact on the actual usage,
> because that's what we're talking about here: a die() that makes no
> sense but now proves to be actually harmful). I wouldn't go any further
> than that.
>
> > > In any case, if it's simple, we can still do it, because passt and
> > > passt-repair are distributed together. You can't pass back the file
> > > descriptors via SCM_RIGHTS though, because we want to close() them
> > > before we reply.
> > >
> > > Another alternative could be that passt-repair reverts back the state
> > > of the file descriptors that were already switched, on failure.
> >
> > > That might help a bit, but we'd still need to rework the passt-side
> > interface to know what needs reverting at the right stage.
>
> Not for what I'm proposing:
>
> 1. passt sends 1 (TCP_REPAIR_ON) for sockets 2, 3
>
> 2. passt-repair sets repair mode for 2
>
> 3. passt-repair fails to set repair mode for 3
>
> 4. passt-repair clears repair mode for 2
Again, it shouldn't happen in practice, but you will get a mess if you
ever managed to set repair mode, but failed to clear it.
There's a similar question in the other direction, if passt is trying
to REPAIR_OFF, and we fail part way through. Should passt-repair
REPAIR_ON again? I'm not sure it has the information to make a sane
choice here.
> 5. passt-repair closes the connection to signal failure
>
> 6. passt knows that repair mode is *not* set for any socket in the batch
>
> The interface remains the same (per-batch error only), but you can rely
> on the whole batch to have failed.
>
> You already can, even with no changes in passt-repair (see above), by
> the way.
Well, I just finished implementing a simple way of reporting partial
failures. See what you think of it, I guess.
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
2025-02-21 6:37 ` David Gibson
@ 2025-02-21 7:03 ` Stefano Brivio
0 siblings, 0 replies; 10+ messages in thread
From: Stefano Brivio @ 2025-02-21 7:03 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev
On Fri, 21 Feb 2025 17:37:18 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Fri, Feb 21, 2025 at 06:59:12AM +0100, Stefano Brivio wrote:
> > On Fri, 21 Feb 2025 13:40:12 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Thu, Feb 20, 2025 at 11:38:00AM +0100, Stefano Brivio wrote:
> > > > On Thu, 20 Feb 2025 21:18:06 +1100
> > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > >
> > > > > This sort of thing is, incidentally, why I did way back suggest the
> > > > > possibility of passt-repair reporting failures per-fd, rather than
> > > > > just per-batch.
> > > >
> > > > Sorry, I somehow missed that proposal, and I can't find any trace of
> > > > it.
> > >
> > > It may have just been on IRC somewhere.
> > >
> > > > But anyway, the problem is that if we fail to read a batch for any
> > > > reason (invalid ancillary data... maybe always implying a kernel issue,
> > > > but I'm not sure), you can't _reliably_ report per-fd failures.
> > > > *Usually*, you can. Worth it?
> > >
> > > Ah, I see. We could handle that by being able to report both per-fd
> > > and "whole batch" failure (equivalent to failure on every fd), but
> > > that would complexify the protocol, of course.
> >
> > By the way, after having another look at the kernel interface and
> > implementation: this makes no sense.
> >
> > Either we're able to set repair mode for all the sockets, or for none
> > of them (EPERM).
>
> Well.. probably. I suspect something sufficiently insane in an LSM
> could break that rule.
Not in the LSMs we ship policies for, because we don't run as root, so
we can't magically relabel (setfilecon()) sockets or change subprofile
(aa_change_hat()) before/after creating some, even if we wanted. And
from the little I know of TOMOYO that should be impossible as well.
> > And if there's any invalid file descriptor in the set,
> > we'll get EBADF for the whole sendmsg().
> >
> > The stuff I'm proposing below is well beyond my threshold of things
> > that make no sense to implement, but at least it limits damage in terms
> > of complexity (and hence of potential impact on the actual usage,
> > because that's what we're talking about here: a die() that makes no
> > sense but now proves to be actually harmful). I wouldn't go any further
> > than that.
> >
> > > > In any case, if it's simple, we can still do it, because passt and
> > > > passt-repair are distributed together. You can't pass back the file
> > > > descriptors via SCM_RIGHTS though, because we want to close() them
> > > > before we reply.
> > > >
> > > > Another alternative could be that passt-repair reverts back the state
> > > > of the file descriptors that were already switched, on failure.
> > >
> > > That might help a bit, but we'd still need to rework the passt-side
> > > interface to know what needs reverting at the right stage.
> >
> > Not for what I'm proposing:
> >
> > 1. passt sends 1 (TCP_REPAIR_ON) for sockets 2, 3
> >
> > 2. passt-repair sets repair mode for 2
> >
> > 3. passt-repair fails to set repair mode for 3
> >
> > 4. passt-repair clears repair mode for 2
>
> Again, it shouldn't happen in practice, but you will get a mess if you
> ever managed to set repair mode, but failed to clear it.
"It shouldn't happen in practice" == "It doesn't happen". And again, in
this impossible case, we would either get a socket that's stuck, for two
hours, or close it because we get some other error on it.
> There's a similar question in the other direction, if passt is trying
> to REPAIR_OFF, and we fail part way through. Should passt-repair
> REPAIR_ON again? I'm not sure it has the information to make a sane
> choice here.
Yes, I think it should. The information it has is that it just set
REPAIR_OFF, so repair mode was presumably on to begin with.
> > 5. passt-repair closes the connection to signal failure
> >
> > 6. passt knows that repair mode is *not* set for any socket in the batch
> >
> > The interface remains the same (per-batch error only), but you can rely
> > on the whole batch to have failed.
> >
> > You already can, even with no changes in passt-repair (see above), by
> > the way.
>
> Well, I just finished implementing a simple way of reporting partial
> failures. I guess see what you think of it.
I guess it conflicts with my priority, which is to fix actual bugs
without introducing new ones, but other than that, I'm not necessarily
against it.
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-02-21 7:03 UTC | newest]
Thread overview: 10 messages
2025-02-20 6:03 [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED) David Gibson
2025-02-20 6:03 ` [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows David Gibson
2025-02-20 6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson
2025-02-20 8:07 ` Stefano Brivio
2025-02-20 10:18 ` David Gibson
2025-02-20 10:38 ` Stefano Brivio
2025-02-21 2:40 ` David Gibson
2025-02-21 5:59 ` Stefano Brivio
2025-02-21 6:37 ` David Gibson
2025-02-21 7:03 ` Stefano Brivio
Code repositories for project(s) associated with this public inbox
https://passt.top/passt