From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Jan 2026 01:12:01 +0100
From: Stefano Brivio
To: David Gibson
Subject: Re: [PATCH 1/3] conf: Introduce --no-bindtodevice option for testing
Message-ID: <20260113011201.05a80cb7@elisabeth>
References: <20260105082850.1985300-1-david@gibson.dropbear.id.au>
 <20260105082850.1985300-2-david@gibson.dropbear.id.au>
 <20260111003314.2e24f648@elisabeth>
Organization: Red Hat
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
CC: passt-dev@passt.top
List-Id: Development discussion and patches for passt

On Mon, 12 Jan 2026 14:42:39 +1100
David Gibson wrote:

> On Sun, Jan 11, 2026 at 12:33:14AM +0100, Stefano Brivio wrote:
> > On Mon, 5 Jan 2026 19:28:48 +1100
> > David Gibson wrote:
> >
> > > We need to support (as best we can) older kernels which don't allow
> > > unprivilieged processes to use the SO_BINDTODEVICE socket option.
> >
> > Nit: unprivileged
> >
> > > Fallcaks for that case are controlled by the c->no_bindtodevice variable.
> >
> > Fallbacks
>
> Oops & oops.
> Fixed.
>
> > > Currently testing behaviour of those fallbacks requires setting up a test
> > > system with a kernel that doesn't support the option, which is pretty
> > > awkward. We can test it almost as well and much more easily by adding a
> > > command line option to explicitly disable use of SO_BINDTODEVICE.
> >
> > It's kind of hard to understand if this patch entirely does that, I
> > think.
>
> Well, it forces c->no_bindtodevice to be true. If we attempt to use
> SO_BINDTODEVICE in that case, it's a bug elsewhere.

Yes... but we wouldn't find it with this patch. We would only find it
with a kernel actually not supporting it, or by replacing all the
setsockopt() calls with something else.

> > We still have a separate, implicit probing of SO_BINDTODEVICE in
> > sock_l4_(), which is perhaps excluded by c->no_bindtodevice (but then
> > the comment is misleading?).
>
> It should indeed be excluded because we should never call sock_l4_()
> with a non-empty ifname if !c->no_bindtodevice. It's not really
> probing, because we outright fail sock_l4_(), there's no fallback
> there. The error path is there:
>  * As a backstop if there is a bug elsewhere meaning we do call this
>    with non-empty ifname
>  * If the SO_BINDTODEVICE call fails for a reason other than being
>    globally unavailable (non existent interface, out of memory,
>    sufficiently perverse selinux module).
>
> Given the above, probably should be an err(), and the comment there is
> no longer accurate / helpful (we already moved it to
> sock_probe_features()). I've made those changes for the next spin.

Ah, okay.

> > > Like --no-splice this is envisaged as something for developers' and
> > > testers' convenience, not a supported option for end users. The man page
> > > text reflects that.
> >
> > I never really understood the point of --no-splice, as there was no
> > user request whatsoever behind it, but fine, the argument was that it
> > added some needed functionality, even though I couldn't quite grasp
> > which one it was.
>
> That was never the argument from _me_ for --no-splice. For me it was
> always that it was useful for development / testing / debugging, not
> that it was (directly) useful to end users.

Right, I think Jon meant it was useful to end users. Otherwise, I would
have argued, it should be mentioned in the man page, and, I would have
argued further, the option shouldn't exist at all.

> That's true in at least
> two ways:
>  * Allows testing non-splice functionality without having to either
>    use passt or create some non-loopback addresses

...but without a loopback address we can't use the tap path anyway.

>  * Lets us ask a user reporting a problem to try --no-splice if we
>    suspect, but aren't sure that it's specific to the splice logic

...which we never had to do (because it's obvious whether they're using
the splice logic or not, I simply ask what kind of address they're
using).

> My case for --no-bindtodevice is the same: it's useful to me (and
> therefore I'm guessing to other developers and testers).

I have some doubts about other developers and testers, in the sense
that to me it really looks like something you need just for the
implementation.

> The man page update is pretty explicit about that.

Sure, better than --no-splice.

> > However, with this, the question is where we draw the line. There are
> > probably other options we could use to make debugging or testing
> > slightly simpler, but if they don't offer actual functionality, we
> > always kept them out so far.
>
> I mean, maybe, none are immediately occurring to me. If they do in
> future, I think we should consider adding them.

The thing is, 'passt -h' already reports 117 lines. It's still somewhat
usable, but 200 lines would be substantially less usable, I think.
A counter-example (at least for me) is 'qemu-system-x86_64 -h', 524
lines on my build. I don't think that's usable and I don't think we
should go there.

> Note that
> --no-splice, and especially --no-bindtodevice are extremely simple to
> implement. I would not be arguing for them if they were more complex.

My concern isn't really about complexity of the implementation, rather
about the fact that we add more command line options. Users don't need
them, but they have to scroll through them (in --help output and man
page) just because we needed them (quite likely) once.

> > That's because we already have a long list of options and making it
> > unnecessarily longer is a disservice to users, I think.
>
> That's a valid point. Would it be more palatable to you if we made
> these suboptions of some explicit "developer hacks" option? (--hacks?
> --debugopt? --devtest?)

At that point the hassle looks comparable to a mandatory macro
implementing (or not) the setsockopt(), which can be selected at build
time.

But anyway, not really, because they would also need to be documented
command-line options. How would we use them otherwise as developers?

> > Would using something like this:
> >
> >   sed -i 's/(\(setsockopt([a-z]*, SOL_SOCKET, SO_BINDTODEVICE\)/((errno = EPERM) || \1/g' *.c
> >
> > be totally outrageous, for testing purposes?
>
> Totally outrageous, no. A bit more hassle, yes.

...what about a script? Or a macro with a #define?

> > It has the advantage of making it easier to verify if we're really
> > disabling the usage of SO_BINDTODEVICE on all the paths (together with
> > grep / git / editors), and not introducing additional command line
> > options.
> >
> > Another trick I use sometimes to selectively disable or enable kernel
> > features is to handle system calls via seitan, in this case the
> > (simple) recipe would be something like:
> >
> >   [
> >     {
> >       "match": [
> >         { "setsockopt": { "level": "socket", "name": "bindtodevice" } }
> >       ],
> >       "return": { "value": "EPERM", "error": -1 }
> >     }
> >   ]
> >
> > but I haven't implemented setsockopt() yet. :(
> >
> > > Signed-off-by: David Gibson
> > > ---
> > >  conf.c  | 2 ++
> > >  passt.1 | 6 ++++++
> > >  2 files changed, 8 insertions(+)
> > >
> > > diff --git a/conf.c b/conf.c
> > > index ceb9aa55..70ea168c 100644
> > > --- a/conf.c
> > > +++ b/conf.c
> > > @@ -962,6 +962,7 @@ static void usage(const char *name, FILE *f, int status)
> > >  		" --no-ndp Disable NDP responses\n"
> > >  		" --no-dhcpv6 Disable DHCPv6 server\n"
> > >  		" --no-ra Disable router advertisements\n"
> > > +		" --no-bindtodevice Disable SO_BINDTODEVICE\n"
> > >  		" --freebind Bind to any address for forwarding\n"
> > >  		" --no-map-gw Don't map gateway address to host\n"
> > >  		" -4, --ipv4-only Enable IPv4 operation only\n"
> > > @@ -1454,6 +1455,7 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  		{"no-dhcpv6", no_argument, &c->no_dhcpv6, 1 },
> > >  		{"no-ndp", no_argument, &c->no_ndp, 1 },
> > >  		{"no-ra", no_argument, &c->no_ra, 1 },
> > > +		{"no-bindtodevice", no_argument, &c->no_bindtodevice, 1},
> > >  		{"no-splice", no_argument, &c->no_splice, 1 },
> > >  		{"freebind", no_argument, &c->freebind, 1 },
> > >  		{"no-map-gw", no_argument, &no_map_gw, 1 },
> > > diff --git a/passt.1 b/passt.1
> > > index db0d6620..4859d9e5 100644
> > > --- a/passt.1
> > > +++ b/passt.1
> > > @@ -348,6 +348,12 @@ namespace will be silently dropped.
> > >  Disable Router Advertisements. Router Solicitations coming from guest or target
> > >  namespace will be ignored.
> > >  
> > > +.TP
> > > +.BR \-\-no-bindtodevice
> > > +Development/testing option, do not use. Disables use of
> > > +SO_BINDTODEVICE socket option. Implicitly enabled on older kernels
> > > +which don't permit unprivileged use of SO_BINDTODEVICE.
> > > +
> > >  .TP
> > >  .BR \-\-freebind
> > >  Allow any binding address to be specified for \fB-t\fR and \fB-u\fR
> >
> > The change looks otherwise good to me... I just hope we can avoid it
> > somehow, but if not, so be it.
>
> I mean, it's not essential to anything that follows, but it was useful
> to me during testing. If you really don't want it, well, I'll cope.

I'm not sure but... if the threshold is "useful during testing" we
should also build something reordering TCP segments so that we can
reproduce https://bugs.passt.top/show_bug.cgi?id=159 from time to time.

And that could actually be a clean and relatively simple
implementation, but it just adds noise to the documentation.

I don't see a big damage we do with two extra options, but... then
maybe we should stop at 5? 10?

-- 
Stefano