public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair
Date: Fri, 21 Feb 2025 08:03:00 +0100	[thread overview]
Message-ID: <20250221080300.3dcf97a2@elisabeth> (raw)
In-Reply-To: <Z7gfHldsxIDfCnlG@zatzit>

On Fri, 21 Feb 2025 17:37:18 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Feb 21, 2025 at 06:59:12AM +0100, Stefano Brivio wrote:
> > On Fri, 21 Feb 2025 13:40:12 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Thu, Feb 20, 2025 at 11:38:00AM +0100, Stefano Brivio wrote:  
> > > > On Thu, 20 Feb 2025 21:18:06 +1100
> > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > >     
> > > > > This sort of thing is, incidentally, why I did way back suggest the
> > > > > possibility of passt-repair reporting failures per-fd, rather than
> > > > > just per-batch.    
> > > > 
> > > > Sorry, I somehow missed that proposal, and I can't find any trace of
> > > > it.    
> > > 
> > > It may have just been on IRC somewhere.
> > >   
> > > > But anyway, the problem is that if we fail to read a batch for any
> > > > reason (invalid ancillary data... maybe always implying a kernel issue,
> > > > but I'm not sure), you can't _reliably_ report per-fd failures.
> > > > *Usually*, you can. Worth it?    
> > > 
> > > Ah, I see.  We could handle that by being able to report both per-fd
> > > and "whole batch" failure (equivalent to failure on every fd), but
> > > that would complexify the protocol, of course.  
> > 
> > By the way, after having another look at the kernel interface and
> > implementation: this makes no sense.
> > 
> > Either we're able to set repair mode for all the sockets, or for none
> > of them (EPERM).  
> 
> Well.. probably.  I suspect something sufficiently insane in an LSM
> could break that rule.

Not in the LSMs we ship policies for, because we don't run as root, so
we can't magically relabel (setfilecon()) sockets or change subprofile
(aa_change_hat()) before/after creating some, even if we wanted. And
from the little I know of TOMOYO that should be impossible as well.

> > And if there's any invalid file descriptor in the set,
> > we'll get EBADF for the whole sendmsg().
> > 
> > The stuff I'm proposing below is well beyond my threshold of things
> > that make no sense to implement, but at least it limits damage in terms
> > of complexity (and hence of potential impact on the actual usage,
> > because that's what we're talking about here: a die() that makes no
> > sense but now proves to be actually harmful). I wouldn't go any further
> > than that.
> >   
> > > > In any case, if it's simple, we can still do it, because passt and
> > > > passt-repair are distributed together. You can't pass back the file
> > > > descriptors via SCM_RIGHTS though, because we want to close() them
> > > > before we reply.
> > > > 
> > > > Another alternative could be that passt-repair reverts back the state
> > > > of the file descriptors that were already switched, on failure.    
> > > 
> > > That might help a bit, we'd still need to rework the passt-side
> > > interface to know what needs reverting at the right stage.  
> > 
> > Not for what I'm proposing:
> > 
> > 1. passt sends 1 (TCP_REPAIR_ON) for sockets 2, 3
> > 
> > 2. passt-repair sets repair mode for 2
> > 
> > 3. passt-repair fails to set repair mode for 3
> > 
> > 4. passt-repair clears repair mode for 2  
> 
> Again, it shouldn't happen in practice, but you will get a mess if you
> ever managed to set repair mode, but failed to clear it.

"It shouldn't happen in practice" == "It doesn't happen". And again, in
this impossible case, we would either get a socket that's stuck, for two
hours, or close it because we get some other error on it.

> There's a similar question in the other direction, if passt is trying
> to REPAIR_OFF, and we fail part way through.  Should passt-repair
> REPAIR_ON again?  I'm not sure it has the information to make a sane
> choice here.

Yes, I think it should. The information it has is that it just set
REPAIR_OFF, so repair mode was presumably on to begin with.

> > 5. passt-repair closes the connection to signal failure
> > 
> > 6. passt knows that repair mode is *not* set for any socket in the batch
> > 
> > The interface remains the same (per-batch error only), but you can rely
> > on the whole batch to have failed.
> > 
> > You already can, even with no changes in passt-repair (see above), by
> > the way.  
> 
> Well, I just finished implementing a simple way of reporting partial
> failures.  I guess see what you think of it.

I guess it conflicts with my priority which is to fix actual bugs without
introducing new ones, but other than that, I'm not necessarily against
it.

-- 
Stefano


      reply	other threads:[~2025-02-21  7:03 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-20  6:03 [PATCH 0/2] RFC: More graceful handling of migration without passt-repair (UNTESTED) David Gibson
2025-02-20  6:03 ` [PATCH 1/2] migrate, flow: Trivially succeed if migrating with no flows David Gibson
2025-02-20  6:03 ` [PATCH 2/2] migrate, flow: Don't attempt to migrate TCP flows without passt-repair David Gibson
2025-02-20  8:07   ` Stefano Brivio
2025-02-20 10:18     ` David Gibson
2025-02-20 10:38       ` Stefano Brivio
2025-02-21  2:40         ` David Gibson
2025-02-21  5:59           ` Stefano Brivio
2025-02-21  6:37             ` David Gibson
2025-02-21  7:03               ` Stefano Brivio [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250221080300.3dcf97a2@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).