* [PATCH] RFC: Remove unusable --netns-only option
@ 2022-07-19 6:23 David Gibson
2022-07-19 20:39 ` Stefano Brivio
0 siblings, 1 reply; 4+ messages in thread
From: David Gibson @ 2022-07-19 6:23 UTC (permalink / raw)
To: passt-dev
[-- Attachment #1: Type: text/plain, Size: 8811 bytes --]
The intended semantics of --netns-only are pretty unclear to me. It's
intended for pasta, but it's not clear whether its saying the spawned shell
should only enter the target netns, or that the passt/pasta packet
forwarding process should only sandbox itself in a network namespace, not
a user namespace.
In any case, as far as I can tell there's not actually any case in which
the --netns-only option will work. If nothing else, we will always fail
in sandbox(), because it attempts a number of operations which require
CAP_SYS_ADMIN in our current user namespace. We drop all capabilities in
our initial user namespace when we start, so the only way we can have
CAP_SYS_ADMIN at this point is if we've joined a new user namespace, which
we won't do with --netns-only.
For pasta joining an existing namespace (the apparently intended use case), we'll actually fail before
we'll fail before we get to that point: in conf_ns_check() we'll attempt
to join the target network namespace. This also requires CAP_SYS_ADMIN in
both our current user namespace and the user namespace which owns the
target network namespace. Again, since we've dropped capabilities in our
original namespace this will never be the case.
For pasta creating its own network namespace we'll fail for a similar
reason in yet another place. This time we'll fail in nl_sock_init() again
because we attempt to enter the new network ns via NS_CALL without having
regained CAP_SYS_ADMIN by joining a new user namespace. Because this
happens after spawning the shell, it results in a weird failure mode, where
the pasta spawned shell is running, but pasta isn't actually handling
packets. Exiting the shell will lead to a hang until the process is
explicitly killed.
Since there's no way to invoke it, remove this feature.
Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au>
---
conf.c | 33 ++++++++++-----------------------
passt.c | 10 ++++------
passt.h | 2 --
pasta.c | 28 ++++++++++------------------
util.c | 3 +--
5 files changed, 25 insertions(+), 51 deletions(-)
diff --git a/conf.c b/conf.c
index cddc769..1dfbba1 100644
--- a/conf.c
+++ b/conf.c
@@ -498,7 +498,7 @@ static int conf_ns_check(void *arg)
{
struct ctx *c = (struct ctx *)arg;
- if ((!c->netns_only && setns(c->pasta_userns_fd, CLONE_NEWUSER)) ||
+ if (setns(c->pasta_userns_fd, CLONE_NEWUSER) ||
setns(c->pasta_netns_fd, CLONE_NEWNET))
c->pasta_userns_fd = c->pasta_netns_fd = -1;
@@ -518,17 +518,12 @@ static int conf_ns_check(void *arg)
static int conf_ns_opt(struct ctx *c,
char *nsdir, const char *conf_userns, const char *optarg)
{
- int ufd = -1, nfd = -1, try, ret, netns_only_reset = c->netns_only;
+ int ufd = -1, nfd = -1, try, ret;
char userns[PATH_MAX] = { 0 }, netns[PATH_MAX];
char *endptr;
long pid_arg;
pid_t pid;
- if (c->netns_only && *conf_userns) {
- err("Both --userns and --netns-only given");
- return -EINVAL;
- }
-
/* It might be a PID, a netns path, or a netns name */
for (try = 0; try < 3; try++) {
if (try == 0) {
@@ -538,7 +533,7 @@ static int conf_ns_opt(struct ctx *c,
pid = pid_arg;
- if (!*conf_userns && !c->netns_only) {
+ if (!*conf_userns) {
ret = snprintf(userns, PATH_MAX,
"/proc/%i/ns/user", pid);
if (ret <= 0 || ret > (int)sizeof(userns))
@@ -548,9 +543,6 @@ static int conf_ns_opt(struct ctx *c,
if (ret <= 0 || ret > (int)sizeof(netns))
continue;
} else if (try == 1) {
- if (!*conf_userns)
- c->netns_only = 1;
-
ret = snprintf(netns, PATH_MAX, "%s", optarg);
if (ret <= 0 || ret > (int)sizeof(userns))
continue;
@@ -562,19 +554,17 @@ static int conf_ns_opt(struct ctx *c,
}
/* Don't pass O_CLOEXEC here: ns_enter() needs those files */
- if (!c->netns_only) {
- if (*conf_userns)
- /* NOLINTNEXTLINE(android-cloexec-open) */
- ufd = open(conf_userns, O_RDONLY);
- else if (*userns)
- /* NOLINTNEXTLINE(android-cloexec-open) */
- ufd = open(userns, O_RDONLY);
- }
+ if (*conf_userns)
+ /* NOLINTNEXTLINE(android-cloexec-open) */
+ ufd = open(conf_userns, O_RDONLY);
+ else if (*userns)
+ /* NOLINTNEXTLINE(android-cloexec-open) */
+ ufd = open(userns, O_RDONLY);
/* NOLINTNEXTLINE(android-cloexec-open) */
nfd = open(netns, O_RDONLY);
- if (nfd == -1 || (ufd == -1 && !c->netns_only)) {
+ if (nfd == -1 || ufd == -1) {
if (nfd >= 0)
close(nfd);
@@ -604,8 +594,6 @@ static int conf_ns_opt(struct ctx *c,
}
}
- c->netns_only = netns_only_reset;
-
return -ENOENT;
}
@@ -1046,7 +1034,6 @@ void conf(struct ctx *c, int argc, char **argv)
{"tcp-ns", required_argument, NULL, 'T' },
{"udp-ns", required_argument, NULL, 'U' },
{"userns", required_argument, NULL, 2 },
- {"netns-only", no_argument, &c->netns_only, 1 },
{"nsrun-dir", required_argument, NULL, 3 },
{"config-net", no_argument, &c->pasta_conf_ns, 1 },
{"ns-mac-addr", required_argument, NULL, 4 },
diff --git a/passt.c b/passt.c
index 58d9062..64edd39 100644
--- a/passt.c
+++ b/passt.c
@@ -199,12 +199,10 @@ static int sandbox(struct ctx *c)
{
int flags = CLONE_NEWIPC | CLONE_NEWNS | CLONE_NEWUTS;
- if (!c->netns_only) {
- if (c->pasta_userns_fd == -1)
- flags |= CLONE_NEWUSER;
- else
- setns(c->pasta_userns_fd, CLONE_NEWUSER);
- }
+ if (c->pasta_userns_fd == -1)
+ flags |= CLONE_NEWUSER;
+ else
+ setns(c->pasta_userns_fd, CLONE_NEWUSER);
c->pasta_userns_fd = -1;
diff --git a/passt.h b/passt.h
index e541341..7f9d54b 100644
--- a/passt.h
+++ b/passt.h
@@ -110,7 +110,6 @@ enum passt_modes {
* @gid: GID we should drop to, if started as root
* @pasta_netns_fd: File descriptor for network namespace in pasta mode
* @pasta_userns_fd: Descriptor for user namespace to join, -1 once joined
- * @netns_only: In pasta mode, don't join or create a user namespace
* @no_netns_quit: In pasta mode, don't exit if fs-bound namespace is gone
* @netns_base: Base name for fs-bound namespace, if any, in pasta mode
* @netns_dir: Directory of fs-bound namespace, if any, in pasta mode
@@ -177,7 +176,6 @@ struct ctx {
int pasta_netns_fd;
int pasta_userns_fd;
- int netns_only;
int no_netns_quit;
char netns_base[PATH_MAX];
diff --git a/pasta.c b/pasta.c
index 5166082..dc35fef 100644
--- a/pasta.c
+++ b/pasta.c
@@ -82,16 +82,12 @@ static int pasta_wait_for_ns(void *arg)
int flags = O_RDONLY | O_CLOEXEC;
char ns[PATH_MAX];
- if (c->netns_only)
- goto netns;
-
snprintf(ns, PATH_MAX, "/proc/%i/ns/user", pasta_child_pid);
do
while ((c->pasta_userns_fd = open(ns, flags)) < 0);
while (setns(c->pasta_userns_fd, CLONE_NEWUSER) &&
!close(c->pasta_userns_fd));
-netns:
snprintf(ns, PATH_MAX, "/proc/%i/ns/net", pasta_child_pid);
do
while ((c->pasta_netns_fd = open(ns, flags)) < 0);
@@ -121,21 +117,18 @@ static int pasta_setup_ns(void *arg)
{
struct pasta_setup_ns_arg *a = (struct pasta_setup_ns_arg *)arg;
char *shell;
+ char buf[BUFSIZ];
- if (!a->c->netns_only) {
- char buf[BUFSIZ];
-
- snprintf(buf, BUFSIZ, "%i %i %i", 0, a->euid, 1);
+ snprintf(buf, BUFSIZ, "%i %i %i", 0, a->euid, 1);
- FWRITE("/proc/self/uid_map", buf,
- "Cannot set uid_map in namespace");
+ FWRITE("/proc/self/uid_map", buf,
+ "Cannot set uid_map in namespace");
- FWRITE("/proc/self/setgroups", "deny",
- "Cannot write to setgroups in namespace");
+ FWRITE("/proc/self/setgroups", "deny",
+ "Cannot write to setgroups in namespace");
- FWRITE("/proc/self/gid_map", buf,
- "Cannot set gid_map in namespace");
- }
+ FWRITE("/proc/self/gid_map", buf,
+ "Cannot set gid_map in namespace");
FWRITE("/proc/sys/net/ipv4/ping_group_range", "0 0",
"Cannot set ping_group_range, ICMP requests might fail");
@@ -165,9 +158,8 @@ void pasta_start_ns(struct ctx *c)
pasta_child_pid = clone(pasta_setup_ns,
ns_fn_stack + sizeof(ns_fn_stack) / 2,
- (c->netns_only ? 0 : CLONE_NEWNET) |
- CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWUSER |
- CLONE_NEWUTS,
+ CLONE_NEWNET | CLONE_NEWIPC | CLONE_NEWPID |
+ CLONE_NEWUSER | CLONE_NEWUTS,
(void *)&arg);
if (pasta_child_pid == -1) {
diff --git a/util.c b/util.c
index f45bc72..11cd3f4 100644
--- a/util.c
+++ b/util.c
@@ -524,8 +524,7 @@ void check_root(struct ctx *c)
*/
int ns_enter(const struct ctx *c)
{
- if (!c->netns_only &&
- c->pasta_userns_fd != -1 &&
+ if (c->pasta_userns_fd != -1 &&
setns(c->pasta_userns_fd, CLONE_NEWUSER))
exit(EXIT_FAILURE);
--
@@ -524,8 +524,7 @@ void check_root(struct ctx *c)
*/
int ns_enter(const struct ctx *c)
{
- if (!c->netns_only &&
- c->pasta_userns_fd != -1 &&
+ if (c->pasta_userns_fd != -1 &&
setns(c->pasta_userns_fd, CLONE_NEWUSER))
exit(EXIT_FAILURE);
--
2.36.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] RFC: Remove unusable --netns-only option
2022-07-19 6:23 [PATCH] RFC: Remove unusable --netns-only option David Gibson
@ 2022-07-19 20:39 ` Stefano Brivio
2022-07-20 2:45 ` David Gibson
0 siblings, 1 reply; 4+ messages in thread
From: Stefano Brivio @ 2022-07-19 20:39 UTC (permalink / raw)
To: passt-dev
[-- Attachment #1: Type: text/plain, Size: 2539 bytes --]
On Tue, 19 Jul 2022 16:23:10 +1000
David Gibson <david(a)gibson.dropbear.id.au> wrote:
> The intended semantics of --netns-only are pretty unclear to me. It's
> intended for pasta, but it's not clear whether its saying the spawned shell
> should only enter the target netns, or that the passt/pasta packet
> forwarding process should only sandbox itself in a network namespace, not
> a user namespace.
The latter. I think this is marginally more clear in the man page, but needs
indeed a better explanation.
> In any case, as far as I can tell there's not actually any case in which
> the --netns-only option will work. If nothing else, we will always fail
> in sandbox(), because it attempts a number of operations which require
> CAP_SYS_ADMIN in our current user namespace. We drop all capabilities in
> our initial user namespace when we start, so the only way we can have
> CAP_SYS_ADMIN at this point is if we've joined a new user namespace, which
> we won't do with --netns-only.
>
> For pasta joining an existing namespace (the apparently intended use case), we'll actually fail before
> we'll fail before we get to that point: in conf_ns_check() we'll attempt
> to join the target network namespace. This also requires CAP_SYS_ADMIN in
> both our current user namespace and the user namespace which owns the
> target network namespace. Again, since we've dropped capabilities in our
> original namespace this will never be the case.
...however, we can also have UID 0 in a non-init user namespace, and
that will work.
This is what happens in the Podman integration case. Unfortunately the
demo is broken at the moment (I had to rebase the patch with a bit of
care, I'll publish the updated one soon).
> For pasta creating its own network namespace we'll fail for a similar
> reason in yet another place. This time we'll fail in nl_sock_init() again
> because we attempt to enter the new network ns via NS_CALL without having
> regained CAP_SYS_ADMIN by joining a new user namespace. Because this
> happens after spawning the shell, it results in a weird failure mode, where
> the pasta spawned shell is running, but pasta isn't actually handling
> packets. Exiting the shell will lead to a hang until the process is
> explicitly killed.
Ouch, I didn't think of this.
Anyway, let me get back to you in a couple of days on the whole issue.
The usage is there, albeit poorly documented, with a broken demo, and
no handling of (kind of) corner cases.
--
Stefano
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] RFC: Remove unusable --netns-only option
2022-07-19 20:39 ` Stefano Brivio
@ 2022-07-20 2:45 ` David Gibson
2022-07-20 8:00 ` Stefano Brivio
0 siblings, 1 reply; 4+ messages in thread
From: David Gibson @ 2022-07-20 2:45 UTC (permalink / raw)
To: passt-dev
[-- Attachment #1: Type: text/plain, Size: 3319 bytes --]
On Tue, Jul 19, 2022 at 10:39:25PM +0200, Stefano Brivio wrote:
> On Tue, 19 Jul 2022 16:23:10 +1000
> David Gibson <david(a)gibson.dropbear.id.au> wrote:
>
> > The intended semantics of --netns-only are pretty unclear to me. It's
> > intended for pasta, but it's not clear whether its saying the spawned shell
> > should only enter the target netns, or that the passt/pasta packet
> > forwarding process should only sandbox itself in a network namespace, not
> > a user namespace.
>
> The latter. I think this is marginally more clear in the man page, but needs
> indeed a better explanation.
Definitely. At present it also appears to affect the spawned shell as
well, it a rather counter-intuitive way.
> > In any case, as far as I can tell there's not actually any case in which
> > the --netns-only option will work. If nothing else, we will always fail
> > in sandbox(), because it attempts a number of operations which require
> > CAP_SYS_ADMIN in our current user namespace. We drop all capabilities in
> > our initial user namespace when we start, so the only way we can have
> > CAP_SYS_ADMIN at this point is if we've joined a new user namespace, which
> > we won't do with --netns-only.
> >
> > For pasta joining an existing namespace (the apparently intended use case), we'll actually fail before
> > we'll fail before we get to that point: in conf_ns_check() we'll attempt
> > to join the target network namespace. This also requires CAP_SYS_ADMIN in
> > both our current user namespace and the user namespace which owns the
> > target network namespace. Again, since we've dropped capabilities in our
> > original namespace this will never be the case.
>
> ...however, we can also have UID 0 in a non-init user namespace, and
> that will work.
Hrm.. I thought being UID 0 just meant we started with all the
capabilities, so once we've explicitly dropped them we still won't be
able to do this. That seemed to be what happened when I tried running
it as root.
> This is what happens in the Podman integration case. Unfortunately the
> demo is broken at the moment (I had to rebase the patch with a bit of
> care, I'll publish the updated one soon).
Can you explain a bit more about what the podman use case is, and why
it requires the netns only logic?
> > For pasta creating its own network namespace we'll fail for a similar
> > reason in yet another place. This time we'll fail in nl_sock_init() again
> > because we attempt to enter the new network ns via NS_CALL without having
> > regained CAP_SYS_ADMIN by joining a new user namespace. Because this
> > happens after spawning the shell, it results in a weird failure mode, where
> > the pasta spawned shell is running, but pasta isn't actually handling
> > packets. Exiting the shell will lead to a hang until the process is
> > explicitly killed.
>
> Ouch, I didn't think of this.
>
> Anyway, let me get back to you in a couple of days on the whole issue.
> The usage is there, albeit poorly documented, with a broken demo, and
> no handling of (kind of) corner cases.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] RFC: Remove unusable --netns-only option
2022-07-20 2:45 ` David Gibson
@ 2022-07-20 8:00 ` Stefano Brivio
0 siblings, 0 replies; 4+ messages in thread
From: Stefano Brivio @ 2022-07-20 8:00 UTC (permalink / raw)
To: passt-dev
[-- Attachment #1: Type: text/plain, Size: 6071 bytes --]
On Wed, 20 Jul 2022 12:45:26 +1000
David Gibson <david(a)gibson.dropbear.id.au> wrote:
> On Tue, Jul 19, 2022 at 10:39:25PM +0200, Stefano Brivio wrote:
> > On Tue, 19 Jul 2022 16:23:10 +1000
> > David Gibson <david(a)gibson.dropbear.id.au> wrote:
> >
> > > The intended semantics of --netns-only are pretty unclear to me. It's
> > > intended for pasta, but it's not clear whether its saying the spawned shell
> > > should only enter the target netns, or that the passt/pasta packet
> > > forwarding process should only sandbox itself in a network namespace, not
> > > a user namespace.
> >
> > The latter. I think this is marginally more clear in the man page, but needs
> > indeed a better explanation.
>
> Definitely. At present it also appears to affect the spawned shell as
> well, it a rather counter-intuitive way.
Right, in that case we should restrict conditions where we can spawn a
shell to having UID 0 in a non-init namespace. See working example
below.
> > > In any case, as far as I can tell there's not actually any case in which
> > > the --netns-only option will work. If nothing else, we will always fail
> > > in sandbox(), because it attempts a number of operations which require
> > > CAP_SYS_ADMIN in our current user namespace. We drop all capabilities in
> > > our initial user namespace when we start, so the only way we can have
> > > CAP_SYS_ADMIN at this point is if we've joined a new user namespace, which
> > > we won't do with --netns-only.
> > >
> > > For pasta joining an existing namespace (the apparently intended use case), we'll actually fail before
> > > we'll fail before we get to that point: in conf_ns_check() we'll attempt
> > > to join the target network namespace. This also requires CAP_SYS_ADMIN in
> > > both our current user namespace and the user namespace which owns the
> > > target network namespace. Again, since we've dropped capabilities in our
> > > original namespace this will never be the case.
> >
> > ...however, we can also have UID 0 in a non-init user namespace, and
> > that will work.
>
> Hrm.. I thought being UID 0 just meant we started with all the
> capabilities, so once we've explicitly dropped them we still won't be
> able to do this. That seemed to be what happened when I tried running
> it as root.
If you run it as root, it will drop to nobody (or user passed via
--runas), and it drops capabilities anyway, so it won't be able to do
that.
If you run it as UID 0 in a non-init namespace, it won't change the
UID, though, and even after dropping capabilities, it will be able to
join a network namespace.
> > This is what happens in the Podman integration case. Unfortunately the
> > demo is broken at the moment (I had to rebase the patch with a bit of
> > care, I'll publish the updated one soon).
>
> Can you explain a bit more about what the podman use case is, and why
> it requires the netns only logic?
Podman creates a network namespace (with a filesystem handle), starts
slirp4netns (or pasta, in the integration draft) as UID 0 in a new user
namespace, pointing it to the network namespace:
# ps aux|grep pasta
sbrivio 2283703 0.0 0.0 2070672 56468 pts/10 Sl+ Jul19 0:40 ./bin/podman run --net=pasta:-T,5213-5214,-U,5213-5214 -p 5203-5204:5203-5204/tcp -p 5203-5204:5203-5204/udp --rm -ti alpine sh
sbrivio 2283760 0.1 0.0 85300 51120 ? Ss Jul19 0:57 /usr/bin/pasta --config-net -u 5203:5203 -t 5203:5203 -T 5213-5214 -U 5213-5214 /run/user/1000/netns/netns-3b6147d8-34e1-a516-87c3-631938a1973e
# readlink /proc/2283703/ns/net
net:[4026531992]
# readlink /proc/2283760/ns/net
net:[4026531992]
# readlink /proc/2283703/ns/user
user:[4026533032]
# readlink /proc/2283760/ns/user
user:[4026533032]
It's equivalent to this example (for convenience, with PIDs instead of
filesystem handles):
---
[TTY #0]
$ unshare -Ur
# echo $$
4117948
[TTY #1]
$ nsenter --preserve-credentials -U -t 4117948
# unshare -n
# ip li sh
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# echo $$
4126920
[TTY #0]
# ./pasta -f --netns-only 4126920
Outbound interface: enp9s0, namespace interface: enp9s0
ARP:
address: a8:a1:59:8e:d7:b6
DHCP:
assign: 88.198.0.164
mask: 255.255.255.224
router: 88.198.0.161
DNS:
185.12.64.1
185.12.64.2
NDP/DHCPv6:
assign: 2a01:4f8:222:904::2
router: fe80::1
our link-local: fe80::aaa1:59ff:fe8e:d7b6
DNS:
2a01:4ff:ff00::add:2
2a01:4ff:ff00::add:1
[TTY #1]
# ip li sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp9s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether f2:c0:09:fe:89:c3 brd ff:ff:ff:ff:ff:ff
---
Unrelated to the Podman case: you can also do this and let pasta spawn
an interactive shell with its network namespace (also created by
pasta) detached:
---
$ unshare -Ur
# ./pasta --netns-only
Cannot set ping_group_range, ICMP requests might fail
$ ip li sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp9s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether a8:a1:59:8e:d7:b6 brd ff:ff:ff:ff:ff:ff
---
...if you then log out from this shell, it will hang:
openat(AT_FDCWD, "/proc/6500/ns/net", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/proc/6500/ns/net", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/proc/6500/ns/net", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
but that's a separate issue (which I just discovered).
--
Stefano
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-07-20 8:00 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-19 6:23 [PATCH] RFC: Remove unusable --netns-only option David Gibson
2022-07-19 20:39 ` Stefano Brivio
2022-07-20 2:45 ` David Gibson
2022-07-20 8:00 ` Stefano Brivio
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).