* Issues when using pasta with bubblewrap @ 2025-07-06 15:15 Lisa Gnedt 2025-07-06 17:08 ` Lisa Gnedt [not found] ` <175188240057.3062894.4319502484182397394@maja> 0 siblings, 2 replies; 8+ messages in thread From: Lisa Gnedt @ 2025-07-06 15:15 UTC (permalink / raw) To: passt-user Hi, I am working on integrating pasta into NixPak [1], which boils down to using pasta together with bubblewrap [2]. It basically works, but in a specific edge-case I am running into problems. The edge-case is when bubblewrap creates two layers of user namespaces. What I am basically doing is let bubblewrap create a new network namespace and then start pasta to create the interfaces accordingly. In the NixPak support, I am doing this in coordination with bubblewrap to start pasta before the actual application is launched. However, here are a few minimum examples for re-producing the problem. All testcases were run using pasta 2025_06_11.0293c6f on Linux 6.12.34-hardened1. Testcase A: Single layer of user namespaces -> Works ---------------------------------------------------- First, I start the bwrap sandbox with a new network namespace: $ bwrap --unshare-all --ro-bind / / /bin/sh sh-5.2$ ip -br a lo UNKNOWN 127.0.0.1/8 ::1/128 In another terminal window, I start pasta with the pid of the bubblewrap child which runs inside the new Linux namespaces: $ pasta --config-net --no-dhcp --no-dhcpv6 --no-ndp --no-ra --no-map-gw --tcp-ns none --udp-ns none --tcp-ports none --udp-ports none --ns-ifname eth0 --address 192.168.1.100 --netmask 255.255.255.0 --gateway 192.168.1.1 --mac-addr 52:54:00:12:34:56 --dns-forward 192.168.1.1 --search none 389267 No interfaces with usable IPv6 routes Template interface: eno2 (IPv4) Namespace interface: eth0 MAC: host: 52:54:00:12:34:56 DNS: 192.168.1.1 Then, I go back to the bwrap sandbox and check that the network namespace is now fully set up: sh-5.2$ ip -br a lo UNKNOWN 127.0.0.1/8 ::1/128 eth0 UNKNOWN 192.168.1.100/24 fe80::bceb:d3ff:fe9c:d037/64 Testcase B: 
Two layers of user namespaces -> Fails directly, but works with nsenter ----------------------------------------------------------------------------------- First, I start again the bwrap sandbox with a new network namespace and the option --dev which is one case where bubblewrap creates two layers of user namespaces: $ bwrap --unshare-all --ro-bind / / --dev /dev /bin/sh sh-5.2$ ip -br a lo UNKNOWN 127.0.0.1/8 ::1/128 In another terminal window, I try to start pasta with the pid of the bubblewrap child which runs inside the new Linux namespaces: $ pasta --config-net --no-dhcp --no-dhcpv6 --no-ndp --no-ra --no-map-gw --tcp-ns none --udp-ns none --tcp-ports none --udp-ports none --ns-ifname eth0 --address 192.168.1.100 --netmask 255.255.255.0 --gateway 192.168.1.1 --mac-addr 52:54:00:12:34:56 --dns-forward 192.168.1.1 --search none 390352 No interfaces with usable IPv6 routes Couldn't switch to pasta namespaces: Operation not permitted This does not work, since pasta joined the second layer user namespace which does not own the network namespace. 
What works although is if I join the first layer user namespace with nsenter and then let pasta run: $ nsenter -t 390352 -U --preserve-credentials --user-parent -- pasta --config-net --no-dhcp --no-dhcpv6 --no-ndp --no-ra --no-map-gw --tcp-ns none --udp-ns none --tcp-ports none --udp-ports none --ns-ifname eth0 --address 192.168.1.100 --netmask 255.255.255.0 --gateway 192.168.1.1 --mac-addr 52:54:00:12:34:56 --dns-forward 192.168.1.1 --search none --netns /proc/390352/ns/net No interfaces with usable IPv6 routes Template interface: eno2 (IPv4) Namespace interface: eth0 MAC: host: 52:54:00:12:34:56 DNS: 192.168.1.1 Then, I go back to the bwrap sandbox and check that the network namespace is now fully set up: sh-5.2$ ip -br a lo UNKNOWN 127.0.0.1/8 ::1/128 eth0 UNKNOWN 192.168.1.100/24 fe80::54fa:e1ff:fe87:79e/64 Ideas for Solutions ------------------- I am trying to find a solution that works with both testcases (single and two layers of user namespaces). My idea would be to always join the owning user namespace of the network namespace. I tried to simulate this with nsenter, but for some reason I am not getting pasta working for the single layer user namespace (testcase A): $ bwrap --unshare-all --ro-bind / / /bin/sh sh-5.2$ ip -br a lo UNKNOWN 127.0.0.1/8 ::1/128 $ nsenter -t 390424 -U --preserve-credentials -- pasta --config-net --no-dhcp --no-dhcpv6 --no-ndp --no-ra --no-map-gw --tcp-ns none --udp-ns none --tcp-ports none --udp-ports none --ns-ifname eth0 --address 192.168.1.100 --netmask 255.255.255.0 --gateway 192.168.1.1 --mac-addr 52:54:00:12:34:56 --dns-forward 192.168.1.1 --search none --netns /proc/390424/ns/net No interfaces with usable IPv6 routes Couldn't switch to pasta namespaces: Operation not permitted While experimenting with this, I wondered if it is a scenario that pasta would like to support out of the box. It might be easier to get it correct when directly controlling all syscalls involved and not have to mix and match multiple tools. 
Since Linux 4.9 it seems to be possible to get the owning user namespace of a network namespace with the ioctl NS_GET_USERNS [3]. Do you consider looking into this or would you accept a patch for adding support for this? Best regards, Lisa Gnedt [1] https://github.com/nixpak/nixpak [2] https://github.com/containers/bubblewrap [3] https://man7.org/linux/man-pages/man2/ns_get_userns.2const.html ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Issues when using pasta with bubblewrap 2025-07-06 15:15 Issues when using pasta with bubblewrap Lisa Gnedt @ 2025-07-06 17:08 ` Lisa Gnedt [not found] ` <175188240057.3062894.4319502484182397394@maja> 1 sibling, 0 replies; 8+ messages in thread
From: Lisa Gnedt @ 2025-07-06 17:08 UTC
To: passt-user

Hi,

On 2025-07-06 17:15, Lisa Gnedt wrote:
> It might be easier to get it correct when directly controlling all
> syscalls involved and not have to mix and match multiple tools.
> Since Linux 4.9 it seems to be possible to get the owning user namespace
> of a network namespace with the ioctl NS_GET_USERNS [3].

I just wrote a hacky patch as proof-of-concept of this idea.
It is working for me fine in both testcases.
However, in its current form it breaks the --userns parameter. But it should not be too hard to address this issue.

I am not sure, what kernel version compatibility you are targeting, since the ioctl is only available since Linux 4.9.
Would it be an option for you to make it the default behavior when a PID is specified?
From my perspective this should be the expected behavior and should not break any previously working use case.

Best regards, Lisa Gnedt diff --git a/conf.c b/conf.c index 36845e2..cd67e7a 100644 --- a/conf.c +++ b/conf.c @@ -642,7 +642,7 @@ static void conf_pasta_ns(int *netns_only, char *userns, char *netns, if (!*userns) { if (snprintf_check(userns, PATH_MAX, - "/proc/%ld/ns/user", pidval)) + "/proc/%ld/ns/net", pidval)) die_perror("Can't build userns path"); } } diff --git a/isolation.c b/isolation.c index bbcd23b..cbfe0f0 100644 --- a/isolation.c +++ b/isolation.c @@ -81,6 +81,7 @@ #include <linux/audit.h> #include <linux/capability.h> #include <linux/filter.h> +#include <linux/nsfs.h> #include <linux/seccomp.h> #include "util.h" @@ -254,6 +255,14 @@ void isolate_user(uid_t uid, gid_t gid, bool use_userns, const char *userns, if (ufd < 0) die_perror("Couldn't open user namespace %s", userns); + int real_ufd; + real_ufd = ioctl(ufd, NS_GET_USERNS); + if (real_ufd < 0) + die_perror("Couldn't get user namespace from network namespace %s", userns); + + close(ufd); + ufd = real_ufd; + if (setns(ufd, CLONE_NEWUSER) != 0) die_perror("Couldn't enter user namespace %s", userns); ^ permalink raw reply related [flat|nested] 8+ messages in thread
[parent not found: <175188240057.3062894.4319502484182397394@maja>]
* Re: Issues when using pasta with bubblewrap [not found] ` <175188240057.3062894.4319502484182397394@maja> @ 2025-07-07 10:56 ` Stefano Brivio 2025-07-07 16:19 ` Stefano Brivio 1 sibling, 0 replies; 8+ messages in thread From: Stefano Brivio @ 2025-07-07 10:56 UTC (permalink / raw) To: Lisa Gnedt; +Cc: passt-user Hi Lisa, On Sun, 6 Jul 2025 19:08:46 +0200 Lisa Gnedt via user <passt-user@passt.top> wrote: > Hi, > > On 2025-07-06 17:15, Lisa Gnedt wrote: > > It might be easier to get it correct when directly controlling all > > syscalls involved and not have to mix and match multiple tools. > > Since Linux 4.9 it seems to be possible to get the owning user namespace > > of a network namespace with the ioctl NS_GET_USERNS [3]. > > I just wrote a hacky patch as proof-of-concept of this idea. > It is working for me fine in both testcases. > However, in its current form it breaks the --userns parameter. But it should not be too hard to address this issue. > > I am not sure, what kernel version compatibility you are targeting, since the ioctl is only available since Linux 4.9. Thanks for reporting this and for the draft. I didn't look into your issue and patch yet, I plan to get to it later today, but just as a quick answer to this point: the earlier the better, not everything is reasonable, but 4.9 should be. And yes, patches for compatibility are always warmly welcome. -- Stefano ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Issues when using pasta with bubblewrap [not found] ` <175188240057.3062894.4319502484182397394@maja> 2025-07-07 10:56 ` Stefano Brivio @ 2025-07-07 16:19 ` Stefano Brivio 2025-07-08 23:54 ` Lisa Gnedt [not found] ` <175204738851.3062894.16732172806767761140@maja> 1 sibling, 2 replies; 8+ messages in thread From: Stefano Brivio @ 2025-07-07 16:19 UTC (permalink / raw) To: Lisa Gnedt; +Cc: passt-user, Paul Holzinger [Cc'ing Paul as Podman maintainer, thread at: https://archives.passt.top/passt-user/671252c8-88f6-45b7-b719-b82786e84bb7@gnedt.at/] On Sun, 6 Jul 2025 19:08:46 +0200 Lisa Gnedt via user <passt-user@passt.top> wrote: > Hi, > > On 2025-07-06 17:15, Lisa Gnedt wrote: > > It might be easier to get it correct when directly controlling all > > syscalls involved and not have to mix and match multiple tools. > > Since Linux 4.9 it seems to be possible to get the owning user namespace > > of a network namespace with the ioctl NS_GET_USERNS [3]. > > I just wrote a hacky patch as proof-of-concept of this idea. > It is working for me fine in both testcases. > However, in its current form it breaks the --userns parameter. But it should not be too hard to address this issue. > > I am not sure, what kernel version compatibility you are targeting, since the ioctl is only available since Linux 4.9. > Would it be an option for you to make it the default behavior when a PID is specified? > >From my perspective this should be the expected behavior and should not break any previously working use case. I finally had a second look, a bit quicker than I wanted but I think I grasped the issue. For context, "this" is: always join the user namespace owning a network namespace. It looks reasonable (and desirable) to me, but I'm not sure how / why it breaks the --userns parameter. We should probably never do this when --netns-only is given (that's Podman's case, for example). 
It would be good to have a way to "cleanly" exclude this new behaviour, but, once we add the NS_GET_USERNS trick, --netns-only doesn't exactly get us back to the previous behaviour. What about --userns-from-pid or something like that? That name isn't great though. Now, 4.9 feels "old" enough, but pasta used to run on a 3.13 kernel a while ago, then a few things were (inadvertently) broken. But it "almost" does. Couldn't we just add a fallback for the case where NS_GET_USERNS fails? You're already handling the error. You could just print a warning and continue instead of calling die_perror()... > diff --git a/conf.c b/conf.c > index 36845e2..cd67e7a 100644 > --- a/conf.c > +++ b/conf.c > @@ -642,7 +642,7 @@ static void conf_pasta_ns(int *netns_only, char *userns, char *netns, > > if (!*userns) { > if (snprintf_check(userns, PATH_MAX, > - "/proc/%ld/ns/user", pidval)) > + "/proc/%ld/ns/net", pidval)) > die_perror("Can't build userns path"); > } > } > diff --git a/isolation.c b/isolation.c > index bbcd23b..cbfe0f0 100644 > --- a/isolation.c > +++ b/isolation.c > @@ -81,6 +81,7 @@ > #include <linux/audit.h> > #include <linux/capability.h> > #include <linux/filter.h> > +#include <linux/nsfs.h> > #include <linux/seccomp.h> > > #include "util.h" > @@ -254,6 +255,14 @@ void isolate_user(uid_t uid, gid_t gid, bool use_userns, const char *userns, > if (ufd < 0) > die_perror("Couldn't open user namespace %s", userns); > > + int real_ufd; > + real_ufd = ioctl(ufd, NS_GET_USERNS); > + if (real_ufd < 0) > + die_perror("Couldn't get user namespace from network namespace %s", userns); > + > + close(ufd); > + ufd = real_ufd; > + > if (setns(ufd, CLONE_NEWUSER) != 0) > die_perror("Couldn't enter user namespace %s", userns); > -- Stefano ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Issues when using pasta with bubblewrap 2025-07-07 16:19 ` Stefano Brivio @ 2025-07-08 23:54 ` Lisa Gnedt 2025-07-17 12:58 ` Stefano Brivio [not found] ` <175204738851.3062894.16732172806767761140@maja> 1 sibling, 1 reply; 8+ messages in thread From: Lisa Gnedt @ 2025-07-08 23:54 UTC (permalink / raw) To: Stefano Brivio; +Cc: passt-user, Paul Holzinger Hi Stefano, Thanks for you fast feedback! On 2025-07-07 18:19, Stefano Brivio wrote: > For context, "this" is: always join the user namespace owning a network > namespace. Yes, exactly. > It looks reasonable (and desirable) to me, but I'm not sure how / why > it breaks the --userns parameter. In the case, where a PID is supplied, it misuses the userns variable and sets it to the path of the network namespace and then always calls the ioctl NS_GET_USERNS to get the owning user namespace. However, the userns variable might also be manually set via the --userns option, whereas we expect users to set this parameter to the path of a user namespace. The ioctl NS_GET_USERNS returns the parent user namespace when a user namespace is given. Therefore, we join the parent user namespace with this patch instead of joining the given user namespace. > We should probably never do this when --netns-only is given (that's > Podman's case, for example). I agree, it does not make sense to join a user namespace, when a user explicitly only wants to join the network namespace. > It would be good to have a way to "cleanly" exclude this new behaviour, > but, once we add the NS_GET_USERNS trick, --netns-only doesn't exactly > get us back to the previous behaviour. What about --userns-from-pid or > something like that? That name isn't great though. I agree that this would be one of the possible solutions. It would enable the use either with PID or with the --netns option. Maybe --userns-from-netns would be better? 
Somehow it would be cool to include the relationship between userns and netns more concretely like --join-owning-userns-from-netns, but on the other hand it is also a bit too long. However, I am not sure, if it is really necessary to have a separate CLI option. Maybe it would also be a fine new default behavior just for the case when a PID is specified, but no network or user namespace is explicitly given. If anyone really needs the old behavior, it is still possible to specify the user namespace explicitly and, therefore, deactivate the new behavior. It seems that podman uses the --netns option, so it should be fully unaffected by this proposed change of default behavior. I am fine with both solutions. I thought a bit more about the current and changed behavior by iterating trough the possible combinations of options with my code changes in mind. I hope this is not too much, but it also uncovered a few already existing strange edge cases. CLI options -> behavior ---------------------------------------------- PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID) --netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID) ***2 It looks like this is currently already a strange behavior, as it would get the netns and userns from PID. --userns X PID -> existing behavior (netns from PID, userns from option) --userns X --netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID - strange) ***1 It looks like this is currently already a strange behavior, as it would also get the userns from PID. 
--netns-only --userns X PID -> existing behavior (netns from PID, userns from option - strange) ***1 --netns X PID -> existing behavior (invalid) (skipping further combination with --netns and PID) COMMAND -> existing behavior (new netns, new userns) --netns-only COMMAND -> existing behavior (new netns, no userns) --userns X COMMAND -> existing behavior (new netns, userns from option) --userns X --netns-only COMMAND -> existing behavior (new netns, no userns - a bit strange) ***1 --netns-only --userns X COMMAND -> existing behavior (new netns, userns from option - strange) ***1 --netns X COMMAND -> existing behavior (invalid) (skipping further combination with --netns and COMMAND) --netns X -> existing behavior (netns from option, no userns) --netns X --netns-only -> existing behavior (netns from option, no userns) --netns X --userns Y -> existing behavior (netns from option, userns from option) --netns X --userns Y --netns-only -> existing behavior (netns from option, no userns - a bit strange) ***1 --netns X --netns-only --userns Y -> existing behavior (netns from option, userns from option - strange) ***1 Although it is not directly related to the change I am proposing, it might make sense to clean up the CLI option behavior a bit. I would argue to forbid --userns in combination with --netns-only completely (everything marked with ***1). Furthermore, --netns-only PID seems to be currently broken (marked with ***2). I think the netns_only variable (or use_userns how it is called inside isolate.c) should most likely get higher priority than the userns variable itself. This should fix the behavior to only use the netns from PID and no userns. > Now, 4.9 feels "old" enough, but pasta used to run on a 3.13 kernel a > while ago, then a few things were (inadvertently) broken. But it > "almost" does. Couldn't we just add a fallback for the case where > NS_GET_USERNS fails? You're already handling the error. 
You could just > print a warning and continue instead of calling die_perror()...

Yes, it makes sense to implement a fallback when changing the default behavior. If it becomes a separate option instead, an automatic fallback seems counter-intuitive.

I just thought it might be best to first discuss the desired behavior before starting to implement more complex changes.

Best regards,
Lisa
* Re: Issues when using pasta with bubblewrap 2025-07-08 23:54 ` Lisa Gnedt @ 2025-07-17 12:58 ` Stefano Brivio 2025-07-19 21:04 ` Lisa Gnedt 0 siblings, 1 reply; 8+ messages in thread From: Stefano Brivio @ 2025-07-17 12:58 UTC (permalink / raw) To: Lisa Gnedt; +Cc: passt-user, Paul Holzinger Apologies for the delay. On Wed, 9 Jul 2025 01:54:36 +0200 Lisa Gnedt <lisa+passt-user@gnedt.at> wrote: > Hi Stefano, > > Thanks for you fast feedback! > > On 2025-07-07 18:19, Stefano Brivio wrote: > > For context, "this" is: always join the user namespace owning a network > > namespace. > > Yes, exactly. > > > It looks reasonable (and desirable) to me, but I'm not sure how / why > > it breaks the --userns parameter. > > In the case, where a PID is supplied, it misuses the userns variable > and sets it to the path of the network namespace and then always calls > the ioctl NS_GET_USERNS to get the owning user namespace. > However, the userns variable might also be manually set via the --userns > option, whereas we expect users to set this parameter to the path of a > user namespace. The ioctl NS_GET_USERNS returns the parent user namespace > when a user namespace is given. Therefore, we join the parent user > namespace with this patch instead of joining the given user namespace. Ah, I see. Well, in that case, I guess we could simply skip the NS_GET_USERNS ioctl() if --userns is given. > > We should probably never do this when --netns-only is given (that's > > Podman's case, for example). > > I agree, it does not make sense to join a user namespace, when a user > explicitly only wants to join the network namespace. > > > It would be good to have a way to "cleanly" exclude this new behaviour, > > but, once we add the NS_GET_USERNS trick, --netns-only doesn't exactly > > get us back to the previous behaviour. What about --userns-from-pid or > > something like that? That name isn't great though. > > I agree that this would be one of the possible solutions. 
It would > enable the use either with PID or with the --netns option. > Maybe --userns-from-netns would be better? Somehow it would be cool to > include the relationship between userns and netns more concretely like > --join-owning-userns-from-netns, but on the other hand it is also a bit > too long. Would --userns-from-netns imply that the PID given on the command line always refers to the network namespace, and the user namespace comes from it? If that's the case, the name looks fitting (but it needs a bit of explanation in the man page and usage message). > However, I am not sure, if it is really necessary to have a separate > CLI option. Maybe it would also be a fine new default behavior just for > the case when a PID is specified, but no network or user namespace is > explicitly given. > If anyone really needs the old behavior, it is still possible to specify > the user namespace explicitly and, therefore, deactivate the new > behavior. > It seems that podman uses the --netns option, so it should be fully > unaffected by this proposed change of default behavior. Right, Podman shouldn't be affected at all. I wonder about rootlesskit (used by moby / Docker) though: https://github.com/rootless-containers/rootlesskit/blob/3c8213d359b54284f4f0aa373ef9adb61d913e0e/pkg/network/pasta/pasta.go#L178 from what I understand, --netns is passed to pasta only if the user gives an explicit --detach-netns. Now, even with the change you propose, things should always work, but I guess we should test it at least in the common use case (Docker starting a container). > I am fine with both solutions. I thought a bit more about the current > and changed behavior by iterating trough the possible combinations of > options with my code changes in mind. I hope this is not too much, but > it also uncovered a few already existing strange edge cases. 
> > CLI options -> behavior > ---------------------------------------------- > > PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID) > --netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID) ***2 > It looks like this is currently already a strange behavior, as it would get the netns and userns from PID. I'm not sure about this part: the intended behaviour is to only care about a target network namespace, because who starts pasta already joined / detached the intended user namespace. You mention it's broken but I'm not sure why. I don't think the behaviour should change here. > --userns X PID -> existing behavior (netns from PID, userns from option) > --userns X --netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID - strange) ***1 > It looks like this is currently already a strange behavior, as it would also get the userns from PID. > --netns-only --userns X PID -> existing behavior (netns from PID, userns from option - strange) ***1 > --netns X PID -> existing behavior (invalid) > (skipping further combination with --netns and PID) > > COMMAND -> existing behavior (new netns, new userns) > --netns-only COMMAND -> existing behavior (new netns, no userns) > --userns X COMMAND -> existing behavior (new netns, userns from option) > --userns X --netns-only COMMAND -> existing behavior (new netns, no userns - a bit strange) ***1 > --netns-only --userns X COMMAND -> existing behavior (new netns, userns from option - strange) ***1 > --netns X COMMAND -> existing behavior (invalid) > (skipping further combination with --netns and COMMAND) > > --netns X -> existing behavior (netns from option, no userns) > --netns X --netns-only -> existing behavior (netns from option, no userns) > --netns X --userns Y -> existing behavior (netns from option, userns from option) > --netns X --userns Y --netns-only -> existing behavior 
(netns from option, no userns - a bit strange) ***1 > --netns X --netns-only --userns Y -> existing behavior (netns from option, userns from option - strange) ***1 Thanks for the table, it's really helpful, and everything else makes sense to me. > Although it is not directly related to the change I am proposing, it > might make sense to clean up the CLI option behavior a bit. I would > argue to forbid --userns in combination with --netns-only completely > (everything marked with ***1). Right, that's probably a good idea. By the way, I'd suggest checking with David Gibson <david@gibson.dropbear.id.au> as well, as he's going to get back online soon (at some point next week) and he took care of the most recent rework in this area. I'll take care of asking him to have a look at this thread when he's back. > Furthermore, --netns-only PID seems to be currently broken (marked > with ***2). I think the netns_only variable (or use_userns how it is called > inside isolate.c) should most likely get higher priority than the userns > variable itself. This should fix the behavior to only use the netns > from PID and no userns. I'm not quite sure what the current problem is. > > Now, 4.9 feels "old" enough, but pasta used to run on a 3.13 kernel a > > while ago, then a few things were (inadvertently) broken. But it > > "almost" does. Couldn't we just add a fallback for the case where > > NS_GET_USERNS fails? You're already handling the error. You could just > > print a warning and continue instead of calling die_perror()... > > Yes, it makes sense to implement a fallback when changing the default > behavior. If it will become a separate option, it seems counter-intuitive > to have an automatic fallback. > I just thought it might be best to first discuss the wanted behavior before > starting to implement more complex changes. -- Stefano ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Issues when using pasta with bubblewrap 2025-07-17 12:58 ` Stefano Brivio @ 2025-07-19 21:04 ` Lisa Gnedt 0 siblings, 0 replies; 8+ messages in thread From: Lisa Gnedt @ 2025-07-19 21:04 UTC (permalink / raw) To: Stefano Brivio; +Cc: passt-user, Paul Holzinger Hi, On 2025-07-17 14:58, Stefano Brivio wrote: > Apologies for the delay. No worries. > Ah, I see. Well, in that case, I guess we could simply skip the > NS_GET_USERNS ioctl() if --userns is given. Yes, this is exactly what I am suggesting with my table below to change the current default behavior only when a PID is supplied. > Would --userns-from-netns imply that the PID given on the command line > always refers to the network namespace, and the user namespace comes > from it? If that's the case, the name looks fitting (but it needs a bit > of explanation in the man page and usage message). Yes, but it would also be usable with the --netns option. That's also the main difference when compared to the other suggestion of changing the default behavior when only a PID is supplied. > Right, Podman shouldn't be affected at all. I wonder about rootlesskit > (used by moby / Docker) though: > > https://github.com/rootless-containers/rootlesskit/blob/3c8213d359b54284f4f0aa373ef9adb61d913e0e/pkg/network/pasta/pasta.go#L178 > > from what I understand, --netns is passed to pasta only if the user > gives an explicit --detach-netns. Now, even with the change you > propose, things should always work, but I guess we should test it at > least in the common use case (Docker starting a container). Good point. >> --netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID) ***2 >> It looks like this is currently already a strange behavior, as it would get the netns and userns from PID. > > I'm not sure about this part: the intended behaviour is to only care > about a target network namespace, because who starts pasta already > joined / detached the intended user namespace. 
You mention it's broken > but I'm not sure why. > > I don't think the behaviour should change here.

Maybe I was not very clear about this case. I think the current behavior of the code is broken and does not do what you described (see below for why). If we leave this broken code as it is now and apply the code changes I have in mind, the result is the changed behavior described in the table, which is still broken. Therefore, I think the best outcome would be to also fix the issue, which should then result in the behavior you describe: skipping user namespace handling altogether and assuming we are already in the correct user namespace.

>> Furthermore, --netns-only PID seems to be currently broken (marked
>> with ***2). I think the netns_only variable (or use_userns how it is called
>> inside isolate.c) should most likely get higher priority than the userns
>> variable itself. This should fix the behavior to only use the netns
>> from PID and no userns.
>
> I'm not quite sure what the current problem is.

Let's go through the conf() function when the command line --netns-only PID is given and see what happens to the userns and netns_only variables.

1. Initialization
   Set userns = ""
   Set netns_only = 0

2. Parsing of --netns-only argument in getopt_long loop
   Set userns = NULL
   Set netns_only = 1

3. Parsing of remaining opts in conf_pasta_ns()
   Since PID is a number and userns is empty (ignoring the fact that netns_only is 1):
   Set userns = "/proc/{PID}/ns/user"

4. Calling isolate_user() with use_userns = !netns_only and userns = userns
   Since userns is set, join the given user namespace (ignoring the fact that use_userns is false, since it would only be checked if userns is not set)

I think the problem needs to be fixed in either step 3 or step 4 by respecting the netns_only / use_userns options, so that no user namespace is joined. Once this is fixed, the behavior would stay the same even with the change of default behavior I described.
This was a bit misleading in my posted table, since the table assumed that it would not be fixed.

Best regards,
Lisa
[parent not found: <175204738851.3062894.16732172806767761140@maja>]
* Re: Issues when using pasta with bubblewrap [not found] ` <175204738851.3062894.16732172806767761140@maja> @ 2025-07-23 5:35 ` David Gibson 0 siblings, 0 replies; 8+ messages in thread From: David Gibson @ 2025-07-23 5:35 UTC (permalink / raw) To: Lisa Gnedt; +Cc: Stefano Brivio, passt-user, Paul Holzinger [-- Attachment #1: Type: text/plain, Size: 10863 bytes --] On Wed, Jul 09, 2025 at 01:54:36AM +0200, Lisa Gnedt via user wrote: > Date: Wed, 9 Jul 2025 01:54:36 +0200 > From: Lisa Gnedt <lisa+passt-user@gnedt.at> > To: Stefano Brivio <sbrivio@redhat.com> > CC: passt-user@passt.top, Paul Holzinger <pholzing@redhat.com> > Subject: Re: Issues when using pasta with bubblewrap > List-Id: "For passt users: support, questions and answers" > <passt-user.passt.top> > > Hi Stefano, Hi Lisa, I'm back from extended leave and looking at this thread as Stefano suggested. > Thanks for you fast feedback! > > On 2025-07-07 18:19, Stefano Brivio wrote: > > For context, "this" is: always join the user namespace owning a network > > namespace. > > Yes, exactly. > > > It looks reasonable (and desirable) to me, but I'm not sure how / why > > it breaks the --userns parameter. > > In the case, where a PID is supplied, it misuses the userns variable > and sets it to the path of the network namespace and then always calls > the ioctl NS_GET_USERNS to get the owning user namespace. > However, the userns variable might also be manually set via the --userns > option, whereas we expect users to set this parameter to the path of a > user namespace. The ioctl NS_GET_USERNS returns the parent user namespace > when a user namespace is given. Therefore, we join the parent user > namespace with this patch instead of joining the given user namespace. > > > We should probably never do this when --netns-only is given (that's > > Podman's case, for example). > > I agree, it does not make sense to join a user namespace, when a user > explicitly only wants to join the network namespace. So.. 
I think this discussion is missing a crucial point: In order to
operate, pasta *must* end up in the userns owning the guest's netns.
--userns and --netns-only aren't exceptions to that, they're just
different ways of locating the correct userns.

In PID/--netns mode, I think NS_GET_USERNS makes --userns entirely
obsolete.  No need to have the user tell us the right userns when we
have a reliable way to discover it.

It also makes --netns-only obsolete.  The slight wrinkle here is that
you can't re-enter the userns you're already in.  So, we have to test
if the ns returned by NS_GET_USERNS is the one we already inhabit.
According to namespaces(7) we can do that by using fstat/stat and
checking the device and inode of the fd from NS_GET_USERNS versus that
of /proc/self/ns/user.

So, I think we should straight out deprecate --userns and --netns-only
for the PID/--netns case.  NS_GET_USERNS is basically an all around
better solution.

For COMMAND mode, NS_GET_USERNS can't help us, because we need to
locate the userns before we can create the netns.  We still have the
three options of where to create the new netns:
  * In a new userns (default)
  * In our current userns (--netns-only)
  * In a different existing userns (--userns)

With that in mind, opinions on the specific behaviours we should aim
for below.

> > It would be good to have a way to "cleanly" exclude this new behaviour,
> > but, once we add the NS_GET_USERNS trick, --netns-only doesn't exactly
> > get us back to the previous behaviour. What about --userns-from-pid or
> > something like that? That name isn't great though.
>
> I agree that this would be one of the possible solutions. It would
> enable the use either with PID or with the --netns option.
> Maybe --userns-from-netns would be better? Somehow it would be cool to
> include the relationship between userns and netns more concretely like
> --join-owning-userns-from-netns, but on the other hand it is also a bit
> too long.
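
To make the device-and-inode comparison from my reply above concrete, here is a rough sketch. It is in Python purely for brevity (pasta itself is C), the function names are made up for illustration, and the NS_GET_USERNS value is my assumption of what _IO(0xb7, 0x1) from <linux/nsfs.h> works out to, so treat it as an illustration rather than a patch.

```python
import fcntl
import os

# Assumption: NS_GET_USERNS is _IO(0xb7, 0x1) as defined in <linux/nsfs.h>.
NS_GET_USERNS = 0xB701


def namespaces_equal(a, b):
    """True if a and b (paths or open fds) refer to the same namespace.

    Two namespace handles are the same namespace when both the device
    and the inode number reported by stat()/fstat() match.
    """
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def owning_userns_fd(netns_path):
    """Return an fd for the userns owning netns_path.

    Raises OSError if the ioctl is unavailable or not permitted.
    """
    fd = os.open(netns_path, os.O_RDONLY | os.O_CLOEXEC)
    try:
        return fcntl.ioctl(fd, NS_GET_USERNS)
    finally:
        os.close(fd)


def must_join(netns_path):
    """True if the owning userns is not the one we already inhabit."""
    ufd = owning_userns_fd(netns_path)
    try:
        return not namespaces_equal(ufd, "/proc/self/ns/user")
    finally:
        os.close(ufd)
```

Something like must_join("/proc/<PID>/ns/net") would then tell us whether a setns() into the owning userns is needed at all, which is the "can't re-enter the userns you're already in" wrinkle.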
>
> However, I am not sure, if it is really necessary to have a separate
> CLI option. Maybe it would also be a fine new default behavior just for
> the case when a PID is specified, but no network or user namespace is
> explicitly given.
> If anyone really needs the old behavior, it is still possible to specify
> the user namespace explicitly and, therefore, deactivate the new
> behavior.
> It seems that podman uses the --netns option, so it should be fully
> unaffected by this proposed change of default behavior.
>
> I am fine with both solutions. I thought a bit more about the current
> and changed behavior by iterating trough the possible combinations of
> options with my code changes in mind. I hope this is not too much, but
> it also uncovered a few already existing strange edge cases.

Here's what I think we should do for each case (many of these match
Lisa's suggestions).  This aims to keep old kernel compatibility as
best we can.

> CLI options -> behavior
> ----------------------------------------------
>
> PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID)

netns from PID, userns from NS_GET_USERNS.  If NS_GET_USERNS fails,
userns from PID.

> --netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID) ***2
>   It looks like this is currently already a strange behavior, as it would get the netns and userns from PID.

netns from PID.  Check NS_GET_USERNS, if it succeeds, but isn't the
same as our current namespace, exit with error.  If it fails, userns
from PID.

> --userns X PID -> existing behavior (netns from PID, userns from option)

netns from PID.  Check NS_GET_USERNS, if it succeeds, but isn't the
same as given userns, exit with error.  If it fails, userns from
option.
> --userns X --netns-only PID -> new behavior (netns from PID, userns from netns from PID with fallback to userns from PID - strange) ***1
>   It looks like this is currently already a strange behavior, as it would also get the userns from PID.

Fail with an error.  This already makes no sense, because we're giving
contradictory instructions on what userns to use.

> --netns-only --userns X PID -> existing behavior (netns from PID, userns from option - strange) ***1

Same case as previous.

> --netns X PID -> existing behavior (invalid)
> (skipping further combination with --netns and PID)

Fail with an error (as we already do).

> --netns X -> existing behavior (netns from option, no userns)

netns from option, userns from NS_GET_USERNS.  If NS_GET_USERNS fails,
exit with error.

> --netns X --netns-only -> existing behavior (netns from option, no userns)

netns from option.  Check NS_GET_USERNS, if it succeeds but isn't the
same as current namespace, exit with error.  If it fails, fall back to
keeping current userns and hope for the best (i.e. existing
behaviour).

> --netns X --userns Y -> existing behavior (netns from option, userns from option)

netns from option.  Check NS_GET_USERNS, if it succeeds but isn't the
same as given userns, exit with error.  If it fails, fall back to
using given userns and hope for the best (existing behaviour).

> --netns X --userns Y --netns-only -> existing behavior (netns from option, no userns - a bit strange) ***1

Fail with an error.

> --netns X --netns-only --userns Y -> existing behavior (netns from option, userns from option - strange) ***1

Same case as previous.

> COMMAND -> existing behavior (new netns, new userns)

Existing behaviour.

> --netns-only COMMAND -> existing behavior (new netns, no userns)

Existing behaviour.

> --userns X COMMAND -> existing behavior (new netns, userns from option)

Existing behaviour.

> --userns X --netns-only COMMAND -> existing behavior (new netns, no userns - a bit strange) ***1

Fail with an error.
> --netns-only --userns X COMMAND -> existing behavior (new netns, userns from option - strange) ***1

Fail with an error.

> --netns X COMMAND -> existing behavior (invalid)
> (skipping further combination with --netns and COMMAND)

Fail with an error (which is existing behaviour).

> Although it is not directly related to the change I am proposing, it
> might make sense to clean up the CLI option behavior a bit. I would
> argue to forbid --userns in combination with --netns-only completely
> (everything marked with ***1).

As noted above, I agree.  This already makes no sense.

> Furthermore, --netns-only PID seems to be currently broken (marked
> with ***2). I think the netns_only variable (or use_userns how it is called
> inside isolate.c) should most likely get higher priority than the userns
> variable itself. This should fix the behavior to only use the netns
> from PID and no userns.

Oh, interesting.  As above, I think we should deprecate this case
anyway (with NS_GET_USERNS + checking if we're already in the right
namespace, it's not necessary).  If it happens to already be broken,
that gives us an excuse to just remove support without going through
the deprecation process.

> > Now, 4.9 feels "old" enough, but pasta used to run on a 3.13 kernel a
> > while ago, then a few things were (inadvertently) broken. But it
> > "almost" does. Couldn't we just add a fallback for the case where
> > NS_GET_USERNS fails? You're already handling the error. You could just
> > print a warning and continue instead of calling die_perror()...

For the plain 'PID' case this makes sense.  For other cases it's
murkier - we may have no reasonable fallback, or the "fallback" might
contradict the behaviour we'd have in the case that NS_GET_USERNS
succeeds, which would be weird.  I've opined on each case separately
above.

> Yes, it makes sense to implement a fallback when changing the default
> behavior.
> If it will become a separate option, it seems counter-intuitive
> to have an automatic fallback.
> I just thought it might be best to first discuss the wanted behavior before
> starting to implement more complex changes.

Yes, that makes sense.  Are you comfortable with implementing the
behaviour I've described above?  I could tackle this, but I'd love for
someone else to do it instead.  If you are happy to tackle it, two
small suggestions:

  - Start with a preliminary patch to make combining --netns-only and
    --userns an error.  That never made sense and making it explicit
    is a good idea regardless of what else we do.

  - Another preliminary patch can add a "namespaces_equal" helper,
    since that will be needed in a number of places.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
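
For reference, the per-case behaviour proposed in this reply can be condensed into one small decision function. This is only a sketch in Python (not pasta's C code): choose_userns, OptionError, the "pid"/"netns"/"command" labels and the returned action names are all hypothetical, namespaces are represented by opaque comparable values, and owner=None stands for "NS_GET_USERNS failed". Combining --netns with PID or COMMAND is assumed to be rejected before this point.

```python
class OptionError(ValueError):
    """Raised where the proposal says 'fail with an error'."""


def choose_userns(target, owner=None, userns_opt=None,
                  netns_only=False, current="current"):
    """Decide which userns to use.

    target:     "pid", "netns" (--netns X) or "command"
    owner:      result of NS_GET_USERNS on the netns, None if it failed
    userns_opt: value given to --userns, or None
    netns_only: True if --netns-only was given
    current:    the userns we already inhabit
    Returns (action, namespace) with action in
    {"join", "stay", "new", "from_pid"}.
    """
    if userns_opt is not None and netns_only:
        raise OptionError("--userns and --netns-only are contradictory")

    if target == "command":
        # NS_GET_USERNS can't help: the netns doesn't exist yet.
        if netns_only:
            return ("stay", current)
        if userns_opt is not None:
            return ("join", userns_opt)
        return ("new", None)

    if target not in ("pid", "netns"):
        raise OptionError("unknown target: %r" % (target,))

    # What the user told us to expect, if anything.
    expected = current if netns_only else userns_opt

    if owner is not None:
        if expected is not None and owner != expected:
            raise OptionError("requested userns does not own the netns")
        # We can't re-enter the userns we're already in.
        return ("stay", current) if owner == current else ("join", owner)

    # NS_GET_USERNS failed (e.g. old kernel): per-case fallbacks.
    if userns_opt is not None:
        return ("join", userns_opt)
    if target == "pid":
        return ("from_pid", None)
    if netns_only:
        return ("stay", current)
    raise OptionError("cannot determine the userns owning the netns")
```

For example, choose_userns("pid", owner="A") gives ("join", "A"), while choose_userns("netns") with no owner raises OptionError, matching the "--netns X" case where there is no reasonable fallback.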
end of thread, other threads:[~2025-07-23 5:35 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-07-06 15:15 Issues when using pasta with bubblewrap Lisa Gnedt 2025-07-06 17:08 ` Lisa Gnedt [not found] ` <175188240057.3062894.4319502484182397394@maja> 2025-07-07 10:56 ` Stefano Brivio 2025-07-07 16:19 ` Stefano Brivio 2025-07-08 23:54 ` Lisa Gnedt 2025-07-17 12:58 ` Stefano Brivio 2025-07-19 21:04 ` Lisa Gnedt [not found] ` <175204738851.3062894.16732172806767761140@maja> 2025-07-23 5:35 ` David Gibson