* [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED)
@ 2025-02-20  6:03 David Gibson
  2025-02-20  6:03 ` [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows David Gibson
  2025-02-20  6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson

From: David Gibson @ 2025-02-20  6:03 UTC
To: Stefano Brivio, passt-dev; +Cc: David Gibson

From Red Hat internal testing we've had some reports that, if
attempting to migrate without passt-repair, the failure mode is uglier
than we'd like. The migration fails, which is somewhat expected, but we
don't correctly roll things back on the source, so it breaks networking
there as well.

Handle this more gracefully, allowing the migration to proceed in this
case, but allowing TCP connections to break.

Unfortunately, I only spotted the ticket about this quite late in my
day. I'm posting this now for visibility, but I ran out of time to test
it. I'll do that tomorrow if someone doesn't pick it up before then.

David Gibson (2):
  migrate, flow: Trivially succeed if migrating with no flows
  migrate, flow: Don't attempt to migrate TCP flows without passt-repair

 flow.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

-- 
2.48.1
* [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows
  2025-02-20  6:03 [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED) David Gibson
@ 2025-02-20  6:03 ` David Gibson
  2025-02-20  6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson

From: David Gibson @ 2025-02-20  6:03 UTC
To: Stefano Brivio, passt-dev; +Cc: David Gibson

We could get a migration request when we have no active flows; or at
least none that we need, or are able, to migrate. In this case, after
sending or receiving the number of flows, we continue to step through
various lists. In the target case, this could include communication
with passt-repair. If passt-repair wasn't started, that could cause
further errors, but of course they shouldn't matter if we have nothing
to repair.

Make it more obvious that there's nothing to do, and avoid such errors,
by short-circuiting flow_migrate_{source,target}() if there are no
migratable flows.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 flow.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/flow.c b/flow.c
index bb5dcc3c..6cf96c26 100644
--- a/flow.c
+++ b/flow.c
@@ -999,6 +999,9 @@ int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage,
 
 	debug("Sending %u flows", ntohl(count));
 
+	if (!count)
+		return 0;
+
 	/* Dump and send information that can be stored in the flow table.
 	 *
 	 * Limited rollback options here: if we fail to transfer any data (that
@@ -1070,6 +1073,9 @@ int flow_migrate_target(struct ctx *c, const struct migrate_stage *stage,
 	count = ntohl(count);
 	debug("Receiving %u flows", count);
 
+	if (!count)
+		return 0;
+
 	if ((rc = flow_migrate_repair_all(c, true)))
 		return -rc;
 
-- 
2.48.1
* [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  2025-02-20  6:03 [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED) David Gibson
  2025-02-20  6:03 ` [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows David Gibson
@ 2025-02-20  6:03 ` David Gibson
  2025-02-20  8:07   ` Stefano Brivio

From: David Gibson @ 2025-02-20  6:03 UTC
To: Stefano Brivio, passt-dev; +Cc: David Gibson

Migrating TCP flows requires passt-repair in order to use TCP_REPAIR.
If passt-repair is not started, though, our failure mode is pretty
ugly: we'll attempt the migration, hitting various problems when we
can't enter repair mode. In some cases we may not roll back these
changes properly, meaning we break network connections on the source.

Our general approach is not to completely block migration if there are
problems, but simply to break any flows we can't migrate. So, if we
have no connection from passt-repair, carry on with the migration, but
don't attempt to migrate any TCP connections.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 flow.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/flow.c b/flow.c
index 6cf96c26..749c4984 100644
--- a/flow.c
+++ b/flow.c
@@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
 	union flow *flow;
 	int rc;
 
+	/* If we don't have a repair helper, there's nothing we can do */
+	if (c->fd_repair < 0)
+		return 0;
+
 	foreach_established_tcp_flow(flow) {
 		if (enable)
 			rc = tcp_flow_repair_on(c, &flow->tcp);
@@ -987,8 +991,11 @@ int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage,
 	(void)c;
 	(void)stage;
 
-	foreach_established_tcp_flow(flow)
-		count++;
+	/* If we don't have a repair helper, we can't migrate TCP flows */
+	if (c->fd_repair >= 0) {
+		foreach_established_tcp_flow(flow)
+			count++;
+	}
 
 	count = htonl(count);
 	if (write_all_buf(fd, &count, sizeof(count))) {
-- 
2.48.1
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  2025-02-20  6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson
@ 2025-02-20  8:07   ` Stefano Brivio
  2025-02-20 10:18     ` David Gibson

From: Stefano Brivio @ 2025-02-20  8:07 UTC
To: David Gibson; +Cc: passt-dev

On Thu, 20 Feb 2025 17:03:18 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Migrating TCP flows requires passt-repair in order to use TCP_REPAIR.
> If passt-repair is not started, though, our failure mode is pretty
> ugly: we'll attempt the migration, hitting various problems when we
> can't enter repair mode. In some cases we may not roll back these
> changes properly, meaning we break network connections on the source.
>
> Our general approach is not to completely block migration if there are
> problems, but simply to break any flows we can't migrate. So, if we
> have no connection from passt-repair, carry on with the migration, but
> don't attempt to migrate any TCP connections.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  flow.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/flow.c b/flow.c
> index 6cf96c26..749c4984 100644
> --- a/flow.c
> +++ b/flow.c
> @@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
>  	union flow *flow;
>  	int rc;
>  
> +	/* If we don't have a repair helper, there's nothing we can do */
> +	if (c->fd_repair < 0)
> +		return 0;
> +

This doesn't fix the behaviour in a relatively likely failure mode:
passt-repair is there, but we can't communicate with it (LSM policy
issues or similar).

In that case, unconditionally terminating on failure in the rollback
function:

	if (tcp_flow_repair_off(c, &flow->tcp))
		die("Failed to roll back TCP_REPAIR mode");

	if (repair_flush(c))
		die("Failed to roll back TCP_REPAIR mode");

isn't a very productive thing to do: we go from an uneventful failure,
where flows were not affected at all, to a guest left without
connectivity.

That starts looking less robust than the alternative (e.g. what I
implemented in v12: silently fail and continue), at least without

  https://patchew.org/QEMU/20250217092550.1172055-1-lvivier@redhat.com/

in the general case as well: if we continue, we'll have hanging flows
that will expire on timeout, but if we don't, again, we'll have a
guest without connectivity.

I understand that leaving flows around for that long might present a
relevant inconsistency, though.

So I'm wondering about some alternatives: actually, the rollback
function shouldn't be called at all in this case. Or it could just
(indirectly) call tcp_rst() on all the flows that were possibly
affected.

-- 
Stefano
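[A rough sketch of that last alternative, not part of this series: the
function name is made up, and it assumes tcp_rst() and
foreach_established_tcp_flow() work as used in the patch above. The
rollback would reset every possibly-affected flow instead of
terminating:]

/* Sketch only: instead of die()ing when we can't roll back TCP_REPAIR
 * mode, reset every established flow whose repair state is now
 * unknown. tcp_rst() is assumed to be safe in either state: if the
 * socket is still in repair mode, no RST reaches the peer; if it
 * isn't, one does. The flow is cleaned up locally in both cases.
 */
static void flow_migrate_rollback_reset(struct ctx *c)
{
	union flow *flow;

	foreach_established_tcp_flow(flow)
		tcp_rst(c, &flow->tcp);
}

[This trades a hard failure for broken but recoverable connections:
the guest loses the affected flows, but passt itself keeps running.]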
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  2025-02-20  8:07   ` Stefano Brivio
@ 2025-02-20 10:18     ` David Gibson
  2025-02-20 10:38       ` Stefano Brivio

From: David Gibson @ 2025-02-20 10:18 UTC
To: Stefano Brivio; +Cc: passt-dev

On Thu, Feb 20, 2025 at 09:07:26AM +0100, Stefano Brivio wrote:
> On Thu, 20 Feb 2025 17:03:18 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > Migrating TCP flows requires passt-repair in order to use
> > TCP_REPAIR. If passt-repair is not started, though, our failure
> > mode is pretty ugly: we'll attempt the migration, hitting various
> > problems when we can't enter repair mode. In some cases we may not
> > roll back these changes properly, meaning we break network
> > connections on the source.
> >
> > Our general approach is not to completely block migration if there
> > are problems, but simply to break any flows we can't migrate. So,
> > if we have no connection from passt-repair, carry on with the
> > migration, but don't attempt to migrate any TCP connections.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  flow.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/flow.c b/flow.c
> > index 6cf96c26..749c4984 100644
> > --- a/flow.c
> > +++ b/flow.c
> > @@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
> >  	union flow *flow;
> >  	int rc;
> >  
> > +	/* If we don't have a repair helper, there's nothing we can do */
> > +	if (c->fd_repair < 0)
> > +		return 0;
> > +
>
> This doesn't fix the behaviour in a relatively likely failure mode:
> passt-repair is there, but we can't communicate with it (LSM policy
> issues or similar).

Ah... true. Although it shouldn't make it any worse for that case,
right, so that could be a separate fix.

> In that case, unconditionally terminating on failure in the rollback
> function:
>
> 	if (tcp_flow_repair_off(c, &flow->tcp))
> 		die("Failed to roll back TCP_REPAIR mode");
>
> 	if (repair_flush(c))
> 		die("Failed to roll back TCP_REPAIR mode");
>
> isn't a very productive thing to do: we go from an uneventful failure,
> where flows were not affected at all, to a guest left without
> connectivity.

So, the issue is that leaving sockets in repair mode after we leave
the migration path would be very bad. We can't easily close
sockets/flows for which that's the case, because the batching means,
if there's a failure, we don't actually know which sockets are in
which mode, hence the die().

> That starts looking less robust than the alternative (e.g. what I
> implemented in v12: silently fail and continue), at least without
>
>   https://patchew.org/QEMU/20250217092550.1172055-1-lvivier@redhat.com/
>
> in the general case as well: if we continue, we'll have hanging flows
> that will expire on timeout, but if we don't, again, we'll have a
> guest without connectivity.
>
> I understand that leaving flows around for that long might present a
> relevant inconsistency, though.
>
> So I'm wondering about some alternatives: actually, the rollback
> function shouldn't be called at all in this case. Or it could just
> (indirectly) call tcp_rst() on all the flows that were possibly
> affected.

Making it be a safe no-op if we never managed to turn repair on for
anything would make sense to me.
Unfortunately, in this situation we won't see an error until we do a
repair_flush(), which means we now don't know the state of any sockets
we already passed to repair_set().

We could, I suppose, close all flows that we passed to repair_set() by
the time we see the error. If we have < one batch's worth of
connections, that will kill connectivity almost as much as die()ing,
though. I guess it will come back without needing qemu to restart us,
though, so that's something.

This sort of thing is, incidentally, why I did, way back, suggest the
possibility of passt-repair reporting failures per-fd, rather than
just per-batch.

-- 
David Gibson (he or they)          | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au     | minimalist, thank you, not the other way
                                   | around.
http://www.ozlabs.org/~dgibson
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  2025-02-20 10:18     ` David Gibson
@ 2025-02-20 10:38       ` Stefano Brivio
  2025-02-21  2:40         ` David Gibson

From: Stefano Brivio @ 2025-02-20 10:38 UTC
To: David Gibson; +Cc: passt-dev

On Thu, 20 Feb 2025 21:18:06 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Feb 20, 2025 at 09:07:26AM +0100, Stefano Brivio wrote:
> > On Thu, 20 Feb 2025 17:03:18 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > Migrating TCP flows requires passt-repair in order to use
> > > TCP_REPAIR. If passt-repair is not started, though, our failure
> > > mode is pretty ugly: we'll attempt the migration, hitting various
> > > problems when we can't enter repair mode. In some cases we may
> > > not roll back these changes properly, meaning we break network
> > > connections on the source.
> > >
> > > Our general approach is not to completely block migration if
> > > there are problems, but simply to break any flows we can't
> > > migrate. So, if we have no connection from passt-repair, carry on
> > > with the migration, but don't attempt to migrate any TCP
> > > connections.
> > >
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > ---
> > >  flow.c | 11 +++++++++--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/flow.c b/flow.c
> > > index 6cf96c26..749c4984 100644
> > > --- a/flow.c
> > > +++ b/flow.c
> > > @@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable)
> > >  	union flow *flow;
> > >  	int rc;
> > >  
> > > +	/* If we don't have a repair helper, there's nothing we can do */
> > > +	if (c->fd_repair < 0)
> > > +		return 0;
> > > +
> >
> > This doesn't fix the behaviour in a relatively likely failure mode:
> > passt-repair is there, but we can't communicate with it (LSM policy
> > issues or similar).
>
> Ah... true. Although it shouldn't make it any worse for that case,
> right, so that could be a separate fix.

Sure.

> > In that case, unconditionally terminating on failure in the rollback
> > function:
> >
> > 	if (tcp_flow_repair_off(c, &flow->tcp))
> > 		die("Failed to roll back TCP_REPAIR mode");
> >
> > 	if (repair_flush(c))
> > 		die("Failed to roll back TCP_REPAIR mode");
> >
> > isn't a very productive thing to do: we go from an uneventful
> > failure, where flows were not affected at all, to a guest left
> > without connectivity.
>
> So, the issue is that leaving sockets in repair mode after we leave
> the migration path would be very bad.

Why? I really can't see anything catastrophic happening as a result of
that (hence my v12 version of this). Surely not as bad as the guest
losing connectivity without any possible recovery.

> We can't easily close sockets/flows for which that's the case,
> because the batching means, if there's a failure, we don't actually
> know which sockets are in which mode, hence the die().

They can be closed (via tcp_rst()) anyway. If they're in repair mode,
no RST will reach the peer, and if they aren't, it will.

> > That starts looking less robust than the alternative (e.g. what I
> > implemented in v12: silently fail and continue), at least without
> >
> >   https://patchew.org/QEMU/20250217092550.1172055-1-lvivier@redhat.com/
> >
> > in the general case as well: if we continue, we'll have hanging
> > flows that will expire on timeout, but if we don't, again, we'll
> > have a guest without connectivity.
> >
> > I understand that leaving flows around for that long might present
> > a relevant inconsistency, though.
> >
> > So I'm wondering about some alternatives: actually, the rollback
> > function shouldn't be called at all in this case. Or it could just
> > (indirectly) call tcp_rst() on all the flows that were possibly
> > affected.
>
> Making it be a safe no-op if we never managed to turn repair on for
> anything would make sense to me. Unfortunately, in this situation we
> won't see an error until we do a repair_flush(), which means we now
> don't know the state of any sockets we already passed to
> repair_set().
>
> We could, I suppose, close all flows that we passed to repair_set()
> by the time we see the error. If we have < one batch's worth of
> connections, that will kill connectivity almost as much as die()ing,
> though. I guess it will come back without needing qemu to restart us,
> though, so that's something.

Right, yes, that's what I'm suggesting.

> This sort of thing is, incidentally, why I did, way back, suggest the
> possibility of passt-repair reporting failures per-fd, rather than
> just per-batch.

Sorry, I somehow missed that proposal, and I can't find any trace of
it.

But anyway, the problem is that if we fail to read a batch for any
reason (invalid ancillary data... maybe always implying a kernel
issue, but I'm not sure), you can't _reliably_ report per-fd failures.
*Usually*, you can. Worth it?

In any case, if it's simple, we can still do it, because passt and
passt-repair are distributed together. You can't pass back the file
descriptors via SCM_RIGHTS, though, because we want to close() them
before we reply.

Another alternative could be that passt-repair reverts the state of
the file descriptors that were already switched, on failure.

-- 
Stefano
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  2025-02-20 10:38       ` Stefano Brivio
@ 2025-02-21  2:40         ` David Gibson
  2025-02-21  5:59           ` Stefano Brivio

From: David Gibson @ 2025-02-21  2:40 UTC
To: Stefano Brivio; +Cc: passt-dev

On Thu, Feb 20, 2025 at 11:38:00AM +0100, Stefano Brivio wrote:
> On Thu, 20 Feb 2025 21:18:06 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Thu, Feb 20, 2025 at 09:07:26AM +0100, Stefano Brivio wrote:
> > > On Thu, 20 Feb 2025 17:03:18 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > Migrating TCP flows requires passt-repair in order to use
> > > > TCP_REPAIR. If passt-repair is not started, though, our failure
> > > > mode is pretty ugly: we'll attempt the migration, hitting
> > > > various problems when we can't enter repair mode. In some cases
> > > > we may not roll back these changes properly, meaning we break
> > > > network connections on the source.
> > > >
> > > > [...]
> > > >
> > > > +	/* If we don't have a repair helper, there's nothing we can do */
> > > > +	if (c->fd_repair < 0)
> > > > +		return 0;
> > > > +
> > >
> > > This doesn't fix the behaviour in a relatively likely failure
> > > mode: passt-repair is there, but we can't communicate with it
> > > (LSM policy issues or similar).
> >
> > Ah... true. Although it shouldn't make it any worse for that case,
> > right, so that could be a separate fix.
>
> Sure.
>
> > > In that case, unconditionally terminating on failure in the
> > > rollback function:
> > >
> > > 	if (tcp_flow_repair_off(c, &flow->tcp))
> > > 		die("Failed to roll back TCP_REPAIR mode");
> > >
> > > 	if (repair_flush(c))
> > > 		die("Failed to roll back TCP_REPAIR mode");
> > >
> > > isn't a very productive thing to do: we go from an uneventful
> > > failure, where flows were not affected at all, to a guest left
> > > without connectivity.
> >
> > So, the issue is that leaving sockets in repair mode after we leave
> > the migration path would be very bad.
>
> Why? I really can't see anything catastrophic happening as a result
> of that (hence my v12 version of this). Surely not as bad as the
> guest losing connectivity without any possible recovery.

I was meaning specifically trying to carry on using a repair mode
socket as if it wasn't in repair mode, not closing it out.

> > We can't easily close sockets/flows for which that's the case,
> > because the batching means, if there's a failure, we don't actually
> > know which sockets are in which mode, hence the die().
>
> They can be closed (via tcp_rst()) anyway. If they're in repair mode,
> no RST will reach the peer, and if they aren't, it will.
Ah, yes, I guess we can just close anything that might be affected.
Brutal, but effective. It will disrupt connectivity, of course, but
not as badly as dying completely.

> > > That starts looking less robust than the alternative (e.g. what I
> > > implemented in v12: silently fail and continue), at least without
> > >
> > >   https://patchew.org/QEMU/20250217092550.1172055-1-lvivier@redhat.com/
> > >
> > > in the general case as well: if we continue, we'll have hanging
> > > flows that will expire on timeout, but if we don't, again, we'll
> > > have a guest without connectivity.
> > >
> > > I understand that leaving flows around for that long might
> > > present a relevant inconsistency, though.
> > >
> > > So I'm wondering about some alternatives: actually, the rollback
> > > function shouldn't be called at all in this case. Or it could
> > > just (indirectly) call tcp_rst() on all the flows that were
> > > possibly affected.
> >
> > Making it be a safe no-op if we never managed to turn repair on for
> > anything would make sense to me. Unfortunately, in this situation
> > we won't see an error until we do a repair_flush(), which means we
> > now don't know the state of any sockets we already passed to
> > repair_set().
> >
> > We could, I suppose, close all flows that we passed to repair_set()
> > by the time we see the error. If we have < one batch's worth of
> > connections, that will kill connectivity almost as much as
> > die()ing, though. I guess it will come back without needing qemu to
> > restart us, though, so that's something.
>
> Right, yes, that's what I'm suggesting.

Ok. I'll work on something to do that.

> > This sort of thing is, incidentally, why I did, way back, suggest
> > the possibility of passt-repair reporting failures per-fd, rather
> > than just per-batch.
>
> Sorry, I somehow missed that proposal, and I can't find any trace of
> it.

It may have just been on IRC somewhere.

> But anyway, the problem is that if we fail to read a batch for any
> reason (invalid ancillary data... maybe always implying a kernel
> issue, but I'm not sure), you can't _reliably_ report per-fd
> failures. *Usually*, you can. Worth it?

Ah, I see. We could handle that by being able to report both per-fd
and "whole batch" failure (equivalent to failure on every fd), but
that would complexify the protocol, of course.

> In any case, if it's simple, we can still do it, because passt and
> passt-repair are distributed together. You can't pass back the file
> descriptors via SCM_RIGHTS, though, because we want to close() them
> before we reply.
>
> Another alternative could be that passt-repair reverts the state of
> the file descriptors that were already switched, on failure.

That might help a bit; we'd still need to rework the passt-side
interface to know what needs reverting at the right stage.

I'll take those ideas and see what I can come up with today.

-- 
David Gibson (he or they)          | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au     | minimalist, thank you, not the other way
                                   | around.
http://www.ozlabs.org/~dgibson
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  2025-02-21  2:40         ` David Gibson
@ 2025-02-21  5:59           ` Stefano Brivio
  2025-02-21  6:37             ` David Gibson

From: Stefano Brivio @ 2025-02-21  5:59 UTC
To: David Gibson; +Cc: passt-dev

On Fri, 21 Feb 2025 13:40:12 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Feb 20, 2025 at 11:38:00AM +0100, Stefano Brivio wrote:
> > On Thu, 20 Feb 2025 21:18:06 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > This sort of thing is, incidentally, why I did, way back, suggest
> > > the possibility of passt-repair reporting failures per-fd, rather
> > > than just per-batch.
> >
> > Sorry, I somehow missed that proposal, and I can't find any trace
> > of it.
>
> It may have just been on IRC somewhere.
>
> > But anyway, the problem is that if we fail to read a batch for any
> > reason (invalid ancillary data... maybe always implying a kernel
> > issue, but I'm not sure), you can't _reliably_ report per-fd
> > failures. *Usually*, you can. Worth it?
>
> Ah, I see. We could handle that by being able to report both per-fd
> and "whole batch" failure (equivalent to failure on every fd), but
> that would complexify the protocol, of course.

By the way, after having another look at the kernel interface and
implementation: this makes no sense.

Either we're able to set repair mode for all the sockets, or for none
of them (EPERM). And if there's any invalid file descriptor in the
set, we'll get EBADF for the whole sendmsg().

The stuff I'm proposing below is well beyond my threshold of things
that make no sense to implement, but at least it limits damage in
terms of complexity (and hence of potential impact on the actual
usage, because that's what we're talking about here: a die() that
makes no sense but now proves to be actually harmful). I wouldn't go
any further than that.

> > In any case, if it's simple, we can still do it, because passt and
> > passt-repair are distributed together. You can't pass back the file
> > descriptors via SCM_RIGHTS, though, because we want to close() them
> > before we reply.
> >
> > Another alternative could be that passt-repair reverts the state of
> > the file descriptors that were already switched, on failure.
>
> That might help a bit; we'd still need to rework the passt-side
> interface to know what needs reverting at the right stage.

Not for what I'm proposing:

1. passt sends 1 (TCP_REPAIR_ON) for sockets 2, 3

2. passt-repair sets repair mode for 2

3. passt-repair fails to set repair mode for 3

4. passt-repair clears repair mode for 2

5. passt-repair closes the connection to signal failure

6. passt knows that repair mode is *not* set for any socket in the
   batch

The interface remains the same (per-batch error only), but you can
rely on the whole batch to have failed.

You already can, even with no changes in passt-repair (see above), by
the way.

-- 
Stefano
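[For illustration, a minimal standalone sketch of steps 1 to 6 on the
passt-repair side, with made-up function and parameter names; the
SCM_RIGHTS framing and the reply are left out. cmd would be 1
(TCP_REPAIR_ON) or 0 (TCP_REPAIR_OFF), and revert_cmd the opposite
value:]

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Sketch: apply 'cmd' to every socket in the batch; on any failure,
 * revert the sockets that were already switched, so the batch fails
 * or succeeds as a whole and a single per-batch error is enough for
 * passt. Returns 0 on success, -1 on failure.
 */
static int repair_cmd_batch(const int *fds, int n, int cmd, int revert_cmd)
{
	int i, j;

	for (i = 0; i < n; i++) {
		if (setsockopt(fds[i], IPPROTO_TCP, TCP_REPAIR,
			       &cmd, sizeof(cmd)))
			goto revert;
	}

	return 0;

revert:
	for (j = 0; j < i; j++)
		setsockopt(fds[j], IPPROTO_TCP, TCP_REPAIR,
			   &revert_cmd, sizeof(revert_cmd));
	return -1;
}

[Because the revert works in both directions, the same loop would also
cover a failed TCP_REPAIR_OFF batch by switching repair mode back on,
which is the case discussed below.]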
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  2025-02-21  5:59           ` Stefano Brivio
@ 2025-02-21  6:37             ` David Gibson
  2025-02-21  7:03               ` Stefano Brivio

From: David Gibson @ 2025-02-21  6:37 UTC
To: Stefano Brivio; +Cc: passt-dev

On Fri, Feb 21, 2025 at 06:59:12AM +0100, Stefano Brivio wrote:
> On Fri, 21 Feb 2025 13:40:12 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Thu, Feb 20, 2025 at 11:38:00AM +0100, Stefano Brivio wrote:
> > > On Thu, 20 Feb 2025 21:18:06 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > This sort of thing is, incidentally, why I did, way back,
> > > > suggest the possibility of passt-repair reporting failures
> > > > per-fd, rather than just per-batch.
> > >
> > > Sorry, I somehow missed that proposal, and I can't find any trace
> > > of it.
> >
> > It may have just been on IRC somewhere.
> >
> > > But anyway, the problem is that if we fail to read a batch for
> > > any reason (invalid ancillary data... maybe always implying a
> > > kernel issue, but I'm not sure), you can't _reliably_ report
> > > per-fd failures. *Usually*, you can. Worth it?
> >
> > Ah, I see. We could handle that by being able to report both per-fd
> > and "whole batch" failure (equivalent to failure on every fd), but
> > that would complexify the protocol, of course.
>
> By the way, after having another look at the kernel interface and
> implementation: this makes no sense.
>
> Either we're able to set repair mode for all the sockets, or for none
> of them (EPERM).

Well... probably. I suspect something sufficiently insane in an LSM
could break that rule.

> And if there's any invalid file descriptor in the set, we'll get
> EBADF for the whole sendmsg().
>
> The stuff I'm proposing below is well beyond my threshold of things
> that make no sense to implement, but at least it limits damage in
> terms of complexity (and hence of potential impact on the actual
> usage, because that's what we're talking about here: a die() that
> makes no sense but now proves to be actually harmful). I wouldn't go
> any further than that.
>
> > > In any case, if it's simple, we can still do it, because passt
> > > and passt-repair are distributed together. You can't pass back
> > > the file descriptors via SCM_RIGHTS, though, because we want to
> > > close() them before we reply.
> > >
> > > Another alternative could be that passt-repair reverts the state
> > > of the file descriptors that were already switched, on failure.
> >
> > That might help a bit; we'd still need to rework the passt-side
> > interface to know what needs reverting at the right stage.
>
> Not for what I'm proposing:
>
> 1. passt sends 1 (TCP_REPAIR_ON) for sockets 2, 3
>
> 2. passt-repair sets repair mode for 2
>
> 3. passt-repair fails to set repair mode for 3
>
> 4. passt-repair clears repair mode for 2

Again, it shouldn't happen in practice, but you will get a mess if you
ever managed to set repair mode, but failed to clear it.

There's a similar question in the other direction: if passt is trying
to REPAIR_OFF, and we fail part way through, should passt-repair
REPAIR_ON again? I'm not sure it has the information to make a sane
choice here.

> 5. passt-repair closes the connection to signal failure
>
> 6. passt knows that repair mode is *not* set for any socket in the
>    batch
>
> The interface remains the same (per-batch error only), but you can
> rely on the whole batch to have failed.
>
> You already can, even with no changes in passt-repair (see above), by
> the way.

Well, I just finished implementing a simple way of reporting partial
failures. I guess see what you think of it.

-- 
David Gibson (he or they)          | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au     | minimalist, thank you, not the other way
                                   | around.
http://www.ozlabs.org/~dgibson
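[One possible shape for such partial-failure reporting, as a sketch
only, with made-up names and batch limit, not necessarily what was
actually implemented: passt-repair could answer with one result byte
per received descriptor, 0 on success or the errno from the failing
setsockopt(), instead of a single byte per batch:]

#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#define REPAIR_BATCH_MAX 32	/* hypothetical batch size limit */

/* Sketch: apply 'cmd' to each socket independently and reply with a
 * per-fd result byte, so passt can roll back or reset exactly the
 * connections that failed.
 */
static int repair_handle_batch(int conn, const int *fds, int n, int cmd)
{
	unsigned char reply[REPAIR_BATCH_MAX];
	int i;

	for (i = 0; i < n && i < REPAIR_BATCH_MAX; i++) {
		if (setsockopt(fds[i], IPPROTO_TCP, TCP_REPAIR,
			       &cmd, sizeof(cmd)))
			reply[i] = (unsigned char)errno; /* non-zero */
		else
			reply[i] = 0;
	}

	return send(conn, reply, i, 0) == i ? 0 : -1;
}

[A closed or short reply could still signal whole-batch failure,
covering the case above where the batch couldn't be read at all.]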
* Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  2025-02-21  6:37             ` David Gibson
@ 2025-02-21  7:03               ` Stefano Brivio

From: Stefano Brivio @ 2025-02-21  7:03 UTC
To: David Gibson; +Cc: passt-dev

On Fri, 21 Feb 2025 17:37:18 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Feb 21, 2025 at 06:59:12AM +0100, Stefano Brivio wrote:
> > On Fri, 21 Feb 2025 13:40:12 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > [...]
> > >
> > > Ah, I see. We could handle that by being able to report both
> > > per-fd and "whole batch" failure (equivalent to failure on every
> > > fd), but that would complexify the protocol, of course.
> >
> > By the way, after having another look at the kernel interface and
> > implementation: this makes no sense.
> >
> > Either we're able to set repair mode for all the sockets, or for
> > none of them (EPERM).
>
> Well... probably. I suspect something sufficiently insane in an LSM
> could break that rule.

Not in the LSMs we ship policies for, because we don't run as root, so
we can't magically relabel (setfilecon()) sockets or change subprofile
(aa_change_hat()) before/after creating some, even if we wanted to.
And from the little I know of TOMOYO, that should be impossible as
well.

> > And if there's any invalid file descriptor in the set, we'll get
> > EBADF for the whole sendmsg().
> >
> > The stuff I'm proposing below is well beyond my threshold of things
> > that make no sense to implement, but at least it limits damage in
> > terms of complexity (and hence of potential impact on the actual
> > usage, because that's what we're talking about here: a die() that
> > makes no sense but now proves to be actually harmful). I wouldn't
> > go any further than that.
> >
> > > [...]
> >
> > Not for what I'm proposing:
> >
> > 1. passt sends 1 (TCP_REPAIR_ON) for sockets 2, 3
> >
> > 2. passt-repair sets repair mode for 2
> >
> > 3. passt-repair fails to set repair mode for 3
> >
> > 4. passt-repair clears repair mode for 2
>
> Again, it shouldn't happen in practice, but you will get a mess if
> you ever managed to set repair mode, but failed to clear it.
"It shouldn't happen in practice" == "It doesn't happen". And again, in this impossible case, we would either get a socket that's stuck, for two hours, or close it because we get some other error on it. > There's a similar question in the other direction, if passt is trying > to REPAIR_OFF, and we fail part way through. Should passt-repair > REPAIR_ON again? I'm not sure it has the information to make a sane > choice here. Yes, I think it should. The information it has is that it just set REPAIR_OFF, so repair mode was presumably on to begin with. > > 5. passt-repair closes the connection to signal failure > > > > 6. passt knows that repair mode is *not* set for any socket in the batch > > > > The interface remains the same (per-batch error only), but you can rely > > on the whole batch to have failed. > > > > You already can, even with no changes in passt-repair (see above), by > > the way. > > Well, I just finished implementing a simple way of reporting partial > failures. I guess see what you think of it. I guess it conflicts with my priority which is to fix actual bugs without introducing new ones, but other than that, I'm not necessarily against it. -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
End of thread, other threads: [~2025-02-21  7:03 UTC | newest]

Thread overview: 10+ messages
2025-02-20  6:03 [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED) David Gibson
2025-02-20  6:03 ` [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows David Gibson
2025-02-20  6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson
2025-02-20  8:07   ` Stefano Brivio
2025-02-20 10:18     ` David Gibson
2025-02-20 10:38       ` Stefano Brivio
2025-02-21  2:40         ` David Gibson
2025-02-21  5:59           ` Stefano Brivio
2025-02-21  6:37             ` David Gibson
2025-02-21  7:03               ` Stefano Brivio