From: Mateusz Andrzejewski
To: passt-dev@passt.top
CC: piotr.bzdrega@mikronika.com.pl, mateusz.andrzejewski@mikronika.com.pl
Subject: [PATCH v2] isolation: Add --chroot-fallback option
Date: Thu, 2 Jul 2026 09:13:31 +0200
Message-ID: <20260702071526.2460400-1-mateusz.andrzejewski@mikronika.com.pl>
List-Id: Development discussion and patches for passt

Integrations that run with the root filesystem on tmpfs or initramfs
can't use pivot_root(): the call fails with an invalid argument (EINVAL)
error.

Introduce the --chroot-fallback option as a workaround, using MS_MOVE +
chroot(). Because the chroot() method gives weaker isolation (we don't
unmount the old root), the user has to enable the fallback explicitly
with the new option.

We always try to sandbox with pivot_root() first. In both cases, the new
root is an empty tmpfs.

For the fallback to work, we keep the CAP_SYS_CHROOT capability, which is
dropped at the end of the isolate_prefork() function.

Link: https://bugs.passt.top/show_bug.cgi?id=104
Signed-off-by: Mateusz Andrzejewski
---
Changes since v1:
- added a description of the new option to the man page
- renamed option to --chroot-fallback
- removed auxiliary variable in conf()
- passed execution context to isolate_user()
- fixed indentation and other coding style issues
- fixed cppcheck issue with variableScope in isolate_prefork()

Writing directly to &c->chroot_fallback from the option table is not
valid: it's a bool variable, and compilation gives an
incompatible-pointer-types warning (int * expected). To fix this, the
assignment was moved to the '32' label in the switch statement, so the
temporary helper variable could be removed. This follows the same pattern
as other boolean options.
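
As an illustration of the note above (a minimal sketch, not code from
this patch): the flag member of struct option is an int *, so it can't
point at a bool, while a NULL flag makes getopt_long() return val, which
a switch can then handle. struct sketch_ctx and sketch_conf() are
hypothetical names used only here; the option entry and the 'case 32'
assignment are the only parts that mirror the patch.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of struct ctx */
struct sketch_ctx {
	bool chroot_fallback;
};

/* Hypothetical, reduced version of conf(): parse --chroot-fallback only */
static void sketch_conf(struct sketch_ctx *c, int argc, char **argv)
{
	const struct option options[] = {
		/* flag is NULL, so getopt_long() returns val (32) */
		{ "chroot-fallback", no_argument, NULL, 32 },
		{ 0 },
	};
	int name;

	optind = 1;	/* Restart scanning, so this can be called again */

	while ((name = getopt_long(argc, argv, "", options, NULL)) != -1) {
		switch (name) {
		case 32:
			/* Plain assignment: no int * to a bool needed */
			c->chroot_fallback = true;
			break;
		default:
			break;
		}
	}
}
```

Called with { "passt", "--chroot-fallback" }, sketch_conf() sets
chroot_fallback; without the option, the field is left untouched.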
 conf.c      |  8 +++++-
 isolation.c | 71 ++++++++++++++++++++++++++++++++++++++++++-----------
 isolation.h |  4 +--
 passt.1     |  9 +++++++
 passt.h     |  2 ++
 5 files changed, 77 insertions(+), 17 deletions(-)

diff --git a/conf.c b/conf.c
index 4755a9f..6755bf6 100644
--- a/conf.c
+++ b/conf.c
@@ -654,6 +654,8 @@ static void usage(const char *name, FILE *f, int status)
 		" --no-ra Disable router advertisements\n"
 		" --freebind Bind to any address for forwarding\n"
 		" --no-map-gw Don't map gateway address to host\n"
+		" --chroot-fallback Use chroot() if pivot_root() fails\n"
+		" can be useful for rootfs on tmpfs or initramfs\n"
 		" -4, --ipv4-only Enable IPv4 operation only\n"
 		" -6, --ipv6-only Enable IPv6 operation only\n"
 		" -t, --tcp-ports SPEC TCP port forwarding to %s\n"
@@ -1233,6 +1235,7 @@ void conf(struct ctx *c, int argc, char **argv)
 		{"migrate-no-linger", no_argument, NULL, 30 },
 		{"stats", required_argument, NULL, 31 },
 		{"conf-path", required_argument, NULL, 'c' },
+		{"chroot-fallback", no_argument, NULL, 32 },
 		{ 0 },
 	};
 	const char *optstring = "+dqfel:hs:c:F:I:p:P:m:a:n:M:g:i:o:D:S:H:461t:u:T:U:";
@@ -1467,6 +1470,9 @@ void conf(struct ctx *c, int argc, char **argv)
 				die("Can't display statistics if not running in foreground");
 			c->stats = strtol(optarg, NULL, 0);
 			break;
+		case 32:
+			c->chroot_fallback = true;
+			break;
 		case 'd':
 			c->debug = 1;
 			c->quiet = 0;
@@ -1879,7 +1885,7 @@ void conf(struct ctx *c, int argc, char **argv)
 
 	conf_open_files(c); /* Before any possible setuid() / setgid() */
 
-	isolate_user(uid, gid, !netns_only, userns, c->mode);
+	isolate_user(c, uid, gid, !netns_only, userns);
 
 	if (c->no_icmp)
 		c->no_ndp = 1;
diff --git a/isolation.c b/isolation.c
index 7e6225d..08e4008 100644
--- a/isolation.c
+++ b/isolation.c
@@ -166,6 +166,31 @@ static void clamp_caps(void)
 		die_perror("Couldn't drop inheritable capabilities");
 }
 
+/**
+ * move_root() - Use chroot() instead of pivot_root() for sandboxing
+ *
+ * Return: negative error code on failure, zero on success
+ */
+static int move_root(void)
+{
+	if (mount(TMPDIR, "/", "", MS_MOVE, "")) {
+		err_perror("Failed to move root into empty tmpfs");
+		return -errno;
+	}
+
+	if (chroot(".")) {
+		err_perror("Failed to chroot() into empty tmpfs");
+		return -errno;
+	}
+
+	if (chdir("/")) {
+		err_perror("Failed to change directory into new root");
+		return -errno;
+	}
+
+	return 0;
+}
+
 /**
  * isolate_initial() - Early, mostly config independent self isolation
  * @argc: Argument count
@@ -195,14 +220,18 @@ void isolate_initial(int argc, char **argv)
 	 * - CAP_SYS_ADMIN, so that we can setns() to the netns.
 	 * - Keep CAP_NET_ADMIN, so that we can configure interfaces
 	 *
+	 * We have to keep CAP_SYS_CHROOT in case of --chroot-fallback option
+	 * being enabled, so we can fall back from pivot_root() to chroot() in
+	 * isolate_prefork().
+	 *
 	 * It's debatable whether it's useful to drop caps when we
 	 * retain SETUID and SYS_ADMIN, but we might as well. We drop
 	 * further capabilities in isolate_user() and
 	 * isolate_prefork().
 	 */
 	keep = BIT(CAP_NET_BIND_SERVICE) | BIT(CAP_SETUID) | BIT(CAP_SETGID) |
-	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN) | BIT(CAP_DAC_OVERRIDE);
-
+	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN) | BIT(CAP_DAC_OVERRIDE) |
+	       BIT(CAP_SYS_CHROOT);
 	/* Since Linux 5.12, if we want to update /proc/self/uid_map to create
 	 * a mapping from UID 0, which only happens with pasta spawning a child
 	 * from a non-init user namespace (pasta can't run as root), we need to
@@ -220,11 +249,11 @@ void isolate_initial(int argc, char **argv)
 
 /**
  * isolate_user() - Switch to final UID/GID and move into userns
+ * @c: Execution context
  * @uid: User ID to run as (in original userns)
  * @gid: Group ID to run as (in original userns)
  * @use_userns: Whether to join or create a userns
  * @userns: userns path to enter, may be empty
- * @mode: Mode (passt or pasta)
  *
  * Should:
  * - set our final UID and GID
@@ -232,8 +261,8 @@ void isolate_initial(int argc, char **argv)
  * Mustn't:
  * - remove filesystem access (we need that for further setup)
  */
-void isolate_user(uid_t uid, gid_t gid, bool use_userns, const char *userns,
-		  enum passt_modes mode)
+void isolate_user(const struct ctx *c, uid_t uid, gid_t gid, bool use_userns,
+		  const char *userns)
 {
 	uint64_t ns_caps = 0;
 
@@ -277,7 +306,14 @@ void isolate_user(uid_t uid, gid_t gid, bool use_userns, const char *userns,
 	 * netns
 	 */
 	ns_caps |= BIT(CAP_SYS_ADMIN);
-	if (mode == MODE_PASTA) {
+
+	/* Only keep CAP_SYS_CHROOT for the --chroot-fallback case. Otherwise
+	 * it can be dropped
+	 */
+	if (c->chroot_fallback)
+		ns_caps |= BIT(CAP_SYS_CHROOT);
+
+	if (c->mode == MODE_PASTA) {
 		/* Keep CAP_NET_ADMIN, so we can configure the if */
 		ns_caps |= BIT(CAP_NET_ADMIN);
 		/* Keep CAP_NET_BIND_SERVICE, so we can splice
@@ -331,7 +367,7 @@ int isolate_prefork(const struct ctx *c)
 	if (mount("", TMPDIR, "tmpfs",
 		  MS_NODEV | MS_NOEXEC | MS_NOSUID | MS_RDONLY,
 		  "nr_inodes=2,nr_blocks=0")) {
-		err_perror("Failed to mount empty tmpfs for pivot_root()");
+		err_perror("Failed to mount empty tmpfs for sandboxing");
 		return -errno;
 	}
 
@@ -341,13 +377,20 @@ int isolate_prefork(const struct ctx *c)
 	}
 
 	if (syscall(SYS_pivot_root, ".", ".")) {
-		err_perror("Failed to pivot_root() into empty tmpfs");
-		return -errno;
-	}
-
-	if (umount2(".", MNT_DETACH | UMOUNT_NOFOLLOW)) {
-		err_perror("Failed to unmount original root filesystem");
-		return -errno;
+		if (c->chroot_fallback) {
+			int rc;
+			info("Failed to pivot_root(), fallback to chroot()...");
+			if ((rc = move_root()))
+				return rc;
+		} else {
+			err_perror("Failed to pivot_root() into empty tmpfs");
+			return -errno;
+		}
+	} else {
+		if (umount2(".", MNT_DETACH | UMOUNT_NOFOLLOW)) {
+			err_perror("Failed to unmount original root filesystem");
+			return -errno;
+		}
 	}
 
 	/* Now that initialization is more-or-less complete, we can
diff --git a/isolation.h b/isolation.h
index 0576168..66b6968 100644
--- a/isolation.h
+++ b/isolation.h
@@ -11,8 +11,8 @@
 #include
 
 void isolate_initial(int argc, char **argv);
-void isolate_user(uid_t uid, gid_t gid, bool use_userns, const char *userns,
-		  enum passt_modes mode);
+void isolate_user(const struct ctx *c, uid_t uid, gid_t gid, bool use_userns,
+		  const char *userns);
 int isolate_prefork(const struct ctx *c);
 void isolate_postfork(const struct ctx *c);
 
diff --git a/passt.1 b/passt.1
index 908fd4a..8a33c2e 100644
--- a/passt.1
+++ b/passt.1
@@ -393,6 +393,15 @@ as destination, to the host. Implied if there is no gateway on the selected
 default route, or if there is no default route, for any of the enabled address
 families.
 
+.TP
+.BR \-\-chroot-fallback
+Enable a fallback to chroot() in case pivot_root() returns an error. Useful for
+integrations that use a root filesystem on tmpfs or initramfs, where
+pivot_root() is not allowed and fails with an invalid argument error (EINVAL).
+
+By default, the fallback is disabled: if pivot_root() fails, the entire
+sandboxing process fails.
+
 .TP
 .BR \-\-map-guest-addr " " \fIaddr
 Translate \fIaddr\fR in the guest to be equal to the guest's assigned
diff --git a/passt.h b/passt.h
index 16506dc..a20148e 100644
--- a/passt.h
+++ b/passt.h
@@ -214,6 +214,7 @@ struct ip6_ctx {
  * @splice_only: Only enable loopback forwarding
  * @host_lo_to_ns_lo: Map host loopback addresses to ns loopback addresses
  * @freebind: Allow binding of non-local addresses for forwarding
+ * @chroot_fallback: Use chroot() in case pivot_root() fails
  * @low_wmem: Low probed net.core.wmem_max
  * @low_rmem: Low probed net.core.rmem_max
  * @no_bindtodevice: Unprivileged SO_BINDTODEVICE not available
@@ -299,6 +300,7 @@ struct ctx {
 	int splice_only;
 	int host_lo_to_ns_lo;
 	int freebind;
+	bool chroot_fallback;
 	int low_wmem;
 	int low_rmem;
 
-- 
2.43.7
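
For readers who want the control flow of this patch in one place, here
is a minimal standalone sketch of "try pivot_root(), then optionally
fall back to MS_MOVE + chroot()". It is not the code from the patch:
enter_empty_root() is a hypothetical name, error reporting is left out,
and the caller is assumed to have mounted an empty tmpfs on dir already
and to hold the capabilities each call needs.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: make the mount at dir the new root directory.
 * Returns zero on success, negative errno value on failure.
 */
static int enter_empty_root(const char *dir, bool chroot_fallback)
{
	if (chdir(dir))
		return -errno;

	if (!syscall(SYS_pivot_root, ".", ".")) {
		/* The old root is now stacked on top of the new one:
		 * detach it, so that nothing of it stays reachable
		 */
		if (umount2(".", MNT_DETACH | UMOUNT_NOFOLLOW))
			return -errno;

		return 0;
	}

	/* pivot_root() failed, e.g. with EINVAL for a rootfs on initramfs */
	if (!chroot_fallback)
		return -errno;

	/* Weaker variant: move the mount over "/", then chroot() into it.
	 * The old root is not unmounted here.
	 */
	if (mount(dir, "/", "", MS_MOVE, ""))
		return -errno;

	if (chroot("."))
		return -errno;

	if (chdir("/"))
		return -errno;

	return 0;
}
```

Without the needed privileges, or with a dir that doesn't exist, the
function only returns a negative errno value, for example -ENOENT for a
missing directory.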