From: Stefano Brivio <sbrivio@redhat.com>
To: passt-dev@passt.top
Subject: Re: [PATCH] RFC: Remove unusable --netns-only option
Date: Wed, 20 Jul 2022 10:00:40 +0200
Message-ID: <20220720100040.120eec70@elisabeth>
In-Reply-To: <YtdsRs1dunkTNp4L@yekko>
On Wed, 20 Jul 2022 12:45:26 +1000
David Gibson <david(a)gibson.dropbear.id.au> wrote:
> On Tue, Jul 19, 2022 at 10:39:25PM +0200, Stefano Brivio wrote:
> > On Tue, 19 Jul 2022 16:23:10 +1000
> > David Gibson <david(a)gibson.dropbear.id.au> wrote:
> >
> > > The intended semantics of --netns-only are pretty unclear to me. It's
> > > intended for pasta, but it's not clear whether it's saying the spawned shell
> > > should only enter the target netns, or that the passt/pasta packet
> > > forwarding process should only sandbox itself in a network namespace, not
> > > a user namespace.
> >
> > The latter. I think this is marginally clearer in the man page, but it
> > does need a better explanation.
>
> Definitely. At present it also appears to affect the spawned shell,
> in a rather counter-intuitive way.
Right, in that case we should restrict conditions where we can spawn a
shell to having UID 0 in a non-init namespace. See working example
below.
> > > In any case, as far as I can tell there's not actually any case in which
> > > the --netns-only option will work. If nothing else, we will always fail
> > > in sandbox(), because it attempts a number of operations which require
> > > CAP_SYS_ADMIN in our current user namespace. We drop all capabilities in
> > > our initial user namespace when we start, so the only way we can have
> > > CAP_SYS_ADMIN at this point is if we've joined a new user namespace, which
> > > we won't do with --netns-only.
> > >
> > > For pasta joining an existing namespace (the apparently intended use case),
> > > we'll actually fail before we get to that point: in conf_ns_check() we'll attempt
> > > to join the target network namespace. This also requires CAP_SYS_ADMIN in
> > > both our current user namespace and the user namespace which owns the
> > > target network namespace. Again, since we've dropped capabilities in our
> > > original namespace this will never be the case.
> >
> > ...however, we can also have UID 0 in a non-init user namespace, and
> > that will work.
>
> Hrm.. I thought being UID 0 just meant we started with all the
> capabilities, so once we've explicitly dropped them we still won't be
> able to do this. That seemed to be what happened when I tried running
> it as root.
If you run it as root, it will drop to nobody (or user passed via
--runas), and it drops capabilities anyway, so it won't be able to do
that.
If you run it as UID 0 in a non-init namespace, it won't change the
UID, though, and even after dropping capabilities, it will be able to
join a network namespace.
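A rough way to tell the two cases apart from a shell is to look at
/proc/self/uid_map. The helper below is a sketch only: userns_kind is a
made-up name, not something pasta provides, and it assumes that the
initial user namespace is the one showing the identity mapping
"0 0 4294967295". A privileged process could set up the same mapping in
a child namespace, so take this as a heuristic:

```shell
# Hypothetical helper (not part of pasta): classify the contents of a
# /proc/<pid>/uid_map file passed as a single string argument.
# Heuristic: the initial user namespace shows the identity mapping
# "0 0 4294967295"; anything else is reported as "non-init".
userns_kind() {
    # Intentionally unquoted: split the map into whitespace-separated fields
    set -- $1
    if [ "$#" -eq 3 ] && [ "$1" = 0 ] && [ "$2" = 0 ] && [ "$3" = 4294967295 ]; then
        echo init
    else
        echo non-init
    fi
}
```

You would call it as: userns_kind "$(cat /proc/self/uid_map)". After
'unshare -Ur', the map is typically a single line mapping 0 to your
original UID with length 1, which this reports as non-init.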
> > This is what happens in the Podman integration case. Unfortunately the
> > demo is broken at the moment (I had to rebase the patch with a bit of
> > care, I'll publish the updated one soon).
>
> Can you explain a bit more about what the podman use case is, and why
> it requires the netns only logic?
Podman creates a network namespace (with a filesystem handle), starts
slirp4netns (or pasta, in the integration draft) as UID 0 in a new user
namespace, pointing it to the network namespace:
# ps aux|grep pasta
sbrivio 2283703 0.0 0.0 2070672 56468 pts/10 Sl+ Jul19 0:40 ./bin/podman run --net=pasta:-T,5213-5214,-U,5213-5214 -p 5203-5204:5203-5204/tcp -p 5203-5204:5203-5204/udp --rm -ti alpine sh
sbrivio 2283760 0.1 0.0 85300 51120 ? Ss Jul19 0:57 /usr/bin/pasta --config-net -u 5203:5203 -t 5203:5203 -T 5213-5214 -U 5213-5214 /run/user/1000/netns/netns-3b6147d8-34e1-a516-87c3-631938a1973e
# readlink /proc/2283703/ns/net
net:[4026531992]
# readlink /proc/2283760/ns/net
net:[4026531992]
# readlink /proc/2283703/ns/user
user:[4026533032]
# readlink /proc/2283760/ns/user
user:[4026533032]
It's equivalent to this example (for convenience, with PIDs instead of
filesystem handles):
---
[TTY #0]
$ unshare -Ur
# echo $$
4117948
[TTY #1]
$ nsenter --preserve-credentials -U -t 4117948
# unshare -n
# ip li sh
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
# echo $$
4126920
[TTY #0]
# ./pasta -f --netns-only 4126920
Outbound interface: enp9s0, namespace interface: enp9s0
ARP:
address: a8:a1:59:8e:d7:b6
DHCP:
assign: 88.198.0.164
mask: 255.255.255.224
router: 88.198.0.161
DNS:
185.12.64.1
185.12.64.2
NDP/DHCPv6:
assign: 2a01:4f8:222:904::2
router: fe80::1
our link-local: fe80::aaa1:59ff:fe8e:d7b6
DNS:
2a01:4ff:ff00::add:2
2a01:4ff:ff00::add:1
[TTY #1]
# ip li sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp9s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether f2:c0:09:fe:89:c3 brd ff:ff:ff:ff:ff:ff
---
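The readlink comparisons from the Podman listing above can be wrapped
in a small function, if you want to check quickly whether two processes
share a namespace. This is just a sketch, and same_ns is a name I made
up for illustration; it relies only on the /proc/<pid>/ns/<type>
symlinks already used above:

```shell
# Hypothetical helper: succeed if two PIDs are in the same namespace of
# the given type (net, user, ...). Returns 2 if a link can't be read.
same_ns() {
    # $1, $2: PIDs to compare; $3: namespace type
    a=$(readlink "/proc/$1/ns/$3") || return 2
    b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$a" = "$b" ]
}
```

For instance, 'same_ns 4117948 4126920 user && echo same' with the PIDs
from the two terminals in the example. Reading the links of processes
owned by another user may need additional privileges.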
Unrelated to the Podman case: you can also do this and let pasta spawn
an interactive shell with its network namespace (also created by
pasta) detached:
---
$ unshare -Ur
# ./pasta --netns-only
Cannot set ping_group_range, ICMP requests might fail
$ ip li sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp9s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether a8:a1:59:8e:d7:b6 brd ff:ff:ff:ff:ff:ff
---
...if you then log out from this shell, it will hang:
openat(AT_FDCWD, "/proc/6500/ns/net", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/proc/6500/ns/net", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/proc/6500/ns/net", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
but that's a separate issue (which I just discovered).
--
Stefano