From: David Gibson
To: Stefano Brivio
Cc: passt-dev@passt.top
Date: Wed, 12 Oct 2022 13:55:02 +1100
Subject: Alas for CAP_NET_BIND_SERVICE
List-Id: Development discussion and patches for passt

Hi Stefano,

I've looked deeper into why giving passt/pasta CAP_NET_BIND_SERVICE
isn't working, and I'm afraid I have bad news.

We lose CAP_NET_BIND_SERVICE in the initial namespace as soon as we
unshare() or setns() into the isolated namespace, and this appears to
be intended behaviour.  From user_namespaces(7), in the Capabilities
section:

    The child process created by clone(2) with the CLONE_NEWUSER flag
    starts out with a complete set of capabilities in the new user
    namespace.  Likewise, a process that creates a new user namespace
    using unshare(2) or joins an existing user namespace using
    setns(2) gains a full set of capabilities in that namespace.
    ***On the other hand, that process has no capabilities in the
    parent (in the case of clone(2)) or previous (in the case of
    unshare(2) and setns(2)) user namespace, even if the new
    namespace is created or joined by the root user (i.e., a process
    with user ID 0 in the root namespace).***

Emphasis (***) mine.

Basically, despite the way it's phrased in many places, processes
don't have an independent set of capabilities in each userns, they
only have a set of capabilities in their current userns.  Any
capabilities in other namespaces are implied in a pretty much
all-or-nothing way - if the process's UID (the real, init ns one) owns
the userns (or one of its ancestors), it gets all caps, otherwise
none.  The specific logic is in cap_capable() in the kernel.

So, using CAP_NET_BIND_SERVICE isn't compatible with isolating
ourselves in our own userns.  At the very least, "auto" inbound
forwarding of low ports is pretty much off the cards.

For forwarding of specific low ports, we could delay our entry into
the new userns until we've set up the listening sockets, although it
does mean rolling back some of the simplification we gained from the
new-style userns handling.

Or, we could abandon CAP_NET_BIND_SERVICE, and recommend the
net.ipv4.ip_unprivileged_port_start sysctl as the only way to handle
low ports in passt.
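
To make the all-or-nothing rule above concrete, here is a rough
userspace model of the check, loosely following the shape of
cap_capable() in security/commoncap.c.  It is a sketch only: the names
model_userns, model_cred and model_cap_capable are made up for
illustration and are not kernel APIs, and the real function handles
details (kuid types, audit options) that are ignored here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical stand-ins for the kernel's structures (assumption). */
struct model_userns {
	const struct model_userns *parent;	/* NULL for the initial userns */
	int level;				/* 0 for the initial userns */
	uid_t owner;				/* init-ns UID that created it */
};

struct model_cred {
	const struct model_userns *user_ns;	/* the one userns we are in */
	uid_t euid;				/* effective UID, init-ns view */
	uint64_t cap_effective;			/* only meaningful in user_ns */
};

/* Same number as CAP_NET_BIND_SERVICE in <linux/capability.h> */
#define MODEL_CAP_NET_BIND_SERVICE	10

/* Returns 0 if 'cred' has 'cap' with respect to 'target', else -EPERM. */
int model_cap_capable(const struct model_cred *cred,
		      const struct model_userns *target, int cap)
{
	const struct model_userns *ns;

	assert(cred && cred->user_ns && target);

	for (ns = target; ns; ns = ns->parent) {
		/* Our own userns: the effective set decides. */
		if (ns == cred->user_ns)
			return (cred->cap_effective >> cap) & 1 ? 0 : -EPERM;

		/* At or above our own level without a match: no caps. */
		if (ns->level <= cred->user_ns->level)
			return -EPERM;

		/* Owner of a direct child of our userns: all caps in it. */
		if (ns->parent == cred->user_ns && ns->owner == cred->euid)
			return 0;
	}

	return -EPERM;
}
```

With an initial namespace at level 0 and a child namespace owned by
UID 1000, a credential set that has moved into the child gets -EPERM
for CAP_NET_BIND_SERVICE against the initial namespace, whatever its
cap_effective holds.  A process with euid 1000 that is still in the
initial namespace gets 0 for any capability against the child.
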
I do see a fair bit of logic in the sysctl-only approach: passt has no
meaningful way to limit what users do with the low ports it allows
them (indirectly) to bind to, so giving passt CAP_NET_BIND_SERVICE is
pretty much equivalent to giving CAP_NET_BIND_SERVICE to any process
which can invoke passt.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson