public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 1/3] conf: Introduce --no-bindtodevice option for testing
Date: Tue, 13 Jan 2026 14:00:18 +1100	[thread overview]
Message-ID: <aWW1Qrn-rb9I9aY9@zatzit> (raw)
In-Reply-To: <20260113011201.05a80cb7@elisabeth>

[-- Attachment #1: Type: text/plain, Size: 11180 bytes --]

On Tue, Jan 13, 2026 at 01:12:01AM +0100, Stefano Brivio wrote:
> On Mon, 12 Jan 2026 14:42:39 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Sun, Jan 11, 2026 at 12:33:14AM +0100, Stefano Brivio wrote:
> > > On Mon,  5 Jan 2026 19:28:48 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >   
> > > > We need to support (as best we can) older kernels which don't allow
> > > > unprivilieged processes to use the SO_BINDTODEVICE socket option.  
> > > 
> > > Nit: unprivileged
> > >   
> > > > Fallcaks for that case are controlled by the c->no_bindtodevice variable.  
> > > 
> > > Fallbacks  
> > 
> > Oops & oops.  Fixed.
> > 
> > > > Currently testing behaviour of those fallbacks requires setting up a test
> > > > system with a kernel that doesn't support the option, which is pretty
> > > > awkward.  We can test it almost as well and much more easily by adding a
> > > > command line option to explicitly disable use of SO_BINDTODEVICE.  
> > > 
> > > It's kind of hard to understand if this patch entirely does that, I
> > > think.  
> > 
> > Well, it forces c->no_bindtodevice to be true.  If we attempt to use
> > SO_BINDTODEVICE in that case, it's a bug elsewhere.
> 
> Yes... but we wouldn't find it with this patch. We would only find it
> with a kernel actually not supporting it, or by replacing all the
> setsockopt() calls with something else.

True.  What I was looking to test with this was behaviour of the
higher level workarounds - e.g. that we split -[TU] forwards into
127.0.0.1 and ::1 instead of using *%lo.

> > > We still have a separate, implicit probing of SO_BINDTODEVICE in
> > > sock_l4_(), which is perhaps excluded by c->no_bindtodevice (but then
> > > the comment is misleading?).  
> > 
> > It should indeed be excluded because we should never call sock_l4_()
> > with a non-empty ifname if !c->no_bindtodevice.  It's not really
> > probing, because we outright fail sock_l4_(), there's no fallback
> > there.  The error path is there:
> >   * As a backstop if there is a bug elsewhere meaning we do call this
> >     with non-empty ifname
> >   * If the SO_BINDTODEVICE call fails for a reason other than being
> >     globally unavailable (non existent interface, out of memory,
> >     sufficiently perverse selinux module).
> > 
> > Given the above, probably should be an err(), and the comment there is
> > no longer accurate / helpful (we already moved it to
> > sock_probe_features()).  I've made those changes for the next spin.
> 
> Ah, okay.
> 
> > > > Like --no-splice this is envisaged as something for developers' and
> > > > testers' convenience, not a supported option for end users.  The man page
> > > > text reflects that.  
> > > 
> > > I never really understood the point of --no-splice, as there was no
> > > user request whatsoever behind it, but fine, the argument was that it
> > > added some needed functionality, even though I couldn't quite grasp
> > > which one it was.  
> > 
> > That was never the argument from _me_ for --no-splice.  For me it was
> > always that it was useful for development / testing / debugging, not
> > that it was (directly) useful to end users.
> 
> Right, I think Jon meant it was useful to end users. Otherwise, I would
> have argued, it should be mentioned in the man page, and, I would have
> argued further, the option shouldn't exist at all.
> 
> > That's true in at least
> > two ways:
> >   * Allows testing non-splice functionality without having to either
> >     use passt or create some non-loopback addresses
> 
> ...but without a loopback address we can't use the tap path anyway.

I'm not sure what you mean here.  If I want to exercise something on
the tap path I can use:
	$ pasta --no-splice [whatever else]
	[...]
	$ socat STDIO TCP:localhost:12345

and I don't need to look up my host's current global IP.  Or if I want
to test tap with multiple different host-side oaddrs, I can use
127.0.0.0/8 without 

> >   * Lets us ask a user reporting a problem to try --no-splice if we
> >     suspect, but aren't sure that it's specific to the splice logic
> 
> ...which we never had to do (because it's obvious whether they're using
> the splice logic or not, I simply ask what kind of address they're
> using).

Admittedly, I don't think we've ever used it like that since it was
introduced.  I do know that before it existed there were several bugs
where it would have been helpful (obviously not essential) to try
that.

> > My case for --no-bindtodevice is the same: it's useful to me (and
> > therefore I'm guessing to other developers and testers).
> 
> I have some doubts about other developers and testers, in the sense
> that to me it really looks like something you need just for the
> implementation.

Eh, maybe.

> > The man page update is pretty explicit about that.
> 
> Sure, better than --no-splice.
> 
> > > However, with this, the question is where we draw the line. There are
> > > probably other options we could use to make debugging or testing
> > > slightly simpler, but if they don't offer actual functionality, we
> > > always kept them out so far.  
> > 
> > I mean, maybe, none are immediately occurring to me.  If they do in
> > future, I think we should consider adding them.
> 
> The thing is, 'passt -h' already reports 117 lines. It's still somewhat
> usable, but 200 lines would be substantially less usable, I think.
> 
> A counter-example (at least for me) is 'qemu-system-x86_64 -h', 524
> lines on my build. I don't think that's usable and I don't think we
> should go there.
> 
> > Note that
> > --no-splice, and especially --no-bindtodevice are extremely simple to
> > implement.  I would not be arguing for them if they were more complex.
> 
> My concern isn't really about complexity of the implementation, rather
> about the fact that we add more command line options. Users don't need
> them, but they have to scroll through them (in --help output and man
> page) just because we needed them (quite likely) once.

That's a reasonable point.

> > > That's because we already have a long list of options and making it
> > > unnecessarily longer is a disservice to users, I think.  
> > 
> > That's a valid point.  Would it be more palatable to you if we made
> > these suboptions of some explicit "developer hacks" option?  (--hacks?
> > --debugopt? --devtest?)
> 
> At that point the hassle looks comparable to a mandatory macro
> implementing (or not) the setsockopt(), which can be selected at build
> time.

True, a build time option might do almost as well.

> But anyway, not really, because they would also need to be documented
> command-line options. How would we use them otherwise as developers?

Well, we could limit --help and the man page to just stating the
existence of the top-level option and a pointer to a HACKS.md or
whatever for the details.  And we could make it explicitly subject to
change without notice between versions.

> > > Would using something like this:
> > > 
> > >   sed -i 's/(\(setsockopt([a-z]*, SOL_SOCKET, SO_BINDTODEVICE\)/((errno = EPERM) || \1/g' *.c
> > > 
> > > be totally outrageous, for testing purposes?  
> > 
> > Totally outrageous, no.  A bit more hassle, yes.
> 
> ...what about a script? Or a macro with a #define?
> 
> > > It has the advantage of making it easier to verify if we're really
> > > disabling the usage of SO_BINDTODEVICE on all the paths (together with
> > > grep / git / editors), and not introducing additional command line
> > > options.
> > > 
> > > Another trick I use sometimes to selectively disable or enable kernel
> > > features is to handle system calls via seitan, in this case the
> > > (simple) recipe would something like:
> > > 
> > > [
> > >   {
> > >     "match": [
> > >       { "setsockopt": { "level": socket", "name": "bindtodevice" } }
> > >     ],
> > >     "return": { "value": "EPERM", "error": -1 }
> > >   }
> > > ]
> > > 
> > > but I haven't implemented setsockopt() yet. :(
> > >   
> > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > ---
> > > >  conf.c  | 2 ++
> > > >  passt.1 | 6 ++++++
> > > >  2 files changed, 8 insertions(+)
> > > > 
> > > > diff --git a/conf.c b/conf.c
> > > > index ceb9aa55..70ea168c 100644
> > > > --- a/conf.c
> > > > +++ b/conf.c
> > > > @@ -962,6 +962,7 @@ static void usage(const char *name, FILE *f, int status)
> > > >  		"  --no-ndp		Disable NDP responses\n"
> > > >  		"  --no-dhcpv6		Disable DHCPv6 server\n"
> > > >  		"  --no-ra		Disable router advertisements\n"
> > > > +		"  --no-bindtodevice	Disable SO_BINDTODEVICE\n"
> > > >  		"  --freebind		Bind to any address for forwarding\n"
> > > >  		"  --no-map-gw		Don't map gateway address to host\n"
> > > >  		"  -4, --ipv4-only	Enable IPv4 operation only\n"
> > > > @@ -1454,6 +1455,7 @@ void conf(struct ctx *c, int argc, char **argv)
> > > >  		{"no-dhcpv6",	no_argument,		&c->no_dhcpv6,	1 },
> > > >  		{"no-ndp",	no_argument,		&c->no_ndp,	1 },
> > > >  		{"no-ra",	no_argument,		&c->no_ra,	1 },
> > > > +		{"no-bindtodevice", no_argument,	&c->no_bindtodevice, 1},
> > > >  		{"no-splice",	no_argument,		&c->no_splice,	1 },
> > > >  		{"freebind",	no_argument,		&c->freebind,	1 },
> > > >  		{"no-map-gw",	no_argument,		&no_map_gw,	1 },
> > > > diff --git a/passt.1 b/passt.1
> > > > index db0d6620..4859d9e5 100644
> > > > --- a/passt.1
> > > > +++ b/passt.1
> > > > @@ -348,6 +348,12 @@ namespace will be silently dropped.
> > > >  Disable Router Advertisements. Router Solicitations coming from guest or target
> > > >  namespace will be ignored.
> > > >  
> > > > +.TP
> > > > +.BR \-\-no-bindtodevice
> > > > +Development/testing option, do not use.  Disables use of
> > > > +SO_BINDTODEVICE socket option.  Implicitly enabled on older kernels
> > > > +which don't permit unprivileged use of SO_BINDTODEVICE.
> > > > +
> > > >  .TP
> > > >  .BR \-\-freebind
> > > >  Allow any binding address to be specified for \fB-t\fR and \fB-u\fR  
> > > 
> > > The change looks otherwise good to me... I just hope we can avoid it
> > > somehow, but if not, so be it.  
> > 
> > I mean, it's not essential to anything that follows, but it was useful
> > to me during testing.  If you really don't want it, well, I'll cope.
> 
> I'm not sure but... if the threshold is "useful during testing" we
> should also build something reordering TCP segments so that we can
> reproduce https://bugs.passt.top/show_bug.cgi?id=159 from time to time.
> 
> And that could actually be a clean and relatively simple implementation,
> but it just adds noise to the documentation.
> 
> I don't see a big damage we do with two extra options, but... then
> maybe we should we stop at 5? 10?

Hrm, yeah.  Ok, you convinced me for now, I'll drop this one.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2026-01-13  3:06 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-05  8:28 [PATCH 0/3] Allow listen functions to return fds David Gibson
2026-01-05  8:28 ` [PATCH 1/3] conf: Introduce --no-bindtodevice option for testing David Gibson
2026-01-10 23:33   ` Stefano Brivio
2026-01-12  3:42     ` David Gibson
2026-01-13  0:12       ` Stefano Brivio
2026-01-13  3:00         ` David Gibson [this message]
2026-01-05  8:28 ` [PATCH 2/3] tcp, udp, conf: Don't silently ignore listens on unsupported IP versions David Gibson
2026-01-10 23:33   ` Stefano Brivio
2026-01-12  3:48     ` David Gibson
2026-01-13  0:12       ` Stefano Brivio
2026-01-13  3:05         ` David Gibson
2026-01-05  8:28 ` [PATCH 3/3] tcp, udp: Make {tcp,udp}_listen() return socket fds David Gibson
2026-01-10 23:33   ` Stefano Brivio
2026-01-12  3:50     ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aWW1Qrn-rb9I9aY9@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).