From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 828CA5A005E for ; Thu, 13 Oct 2022 11:37:25 +0200 (CEST) Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4Mp4Dg1NpSz4xH1; Thu, 13 Oct 2022 20:37:23 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1665653843; bh=wOdkvGwsKDrCAltzVvKKRczMHFU/fj+/jWGYD6WWgOE=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=HWE7xijNQyIYFWh5klFZIhuhjnvuA8XVpdn3dwr5yuyocRbg25dsQEIqN9fMx0bxC d2briIN5lYjqCW/SJ7kSBHd9TnLTEyercEUei6XpSKPw7QHbZ6VZbxkgj/uJB+g+Dh g0fbObBiFqvfTngEwyp/60/MFvipWqYl+SDdX1lM= Date: Thu, 13 Oct 2022 20:33:34 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH 08/10] isolation: Prevent any child processes gaining capabilities Message-ID: References: <20221011054018.1449506-1-david@gibson.dropbear.id.au> <20221011054018.1449506-9-david@gibson.dropbear.id.au> <20221013041730.6d75759a@elisabeth> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="zFwCqliugpdGxo86" Content-Disposition: inline In-Reply-To: <20221013041730.6d75759a@elisabeth> Message-ID-Hash: X3XCXMDYBRAHE4OWMPSVBSMYYYFJZCCL X-Message-ID-Hash: X3XCXMDYBRAHE4OWMPSVBSMYYYFJZCCL X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.3 Precedence: list List-Id: Development discussion and patches for passt Archived-At: <> Archived-At: List-Archive: <> List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --zFwCqliugpdGxo86 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Oct 13, 2022 at 04:17:30AM +0200, Stefano Brivio wrote: > On Tue, 11 Oct 2022 16:40:16 +1100 > David Gibson wrote: >=20 > > We drop our own capabilities, but it's possible that processes we exec() > > could gain extra privilege via file capabilities. It shouldn't be poss= ible > > for us to exec() anyway due to seccomp() and our filesystem isolation. = But > > just in case, zero the bounding and inheritable capability sets to prev= ent > > any such child from gainin privilege. > >=20 > > Note that we do this *after* spawning the pasta shell/command (if any), > > because we do want the user to be able to give that privilege if they w= ant. > >=20 > > Signed-off-by: David Gibson > > --- > > isolation.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 56 insertions(+) > >=20 > > diff --git a/isolation.c b/isolation.c > > index 2468f84..e1a024d 100644 > > --- a/isolation.c > > +++ b/isolation.c > > @@ -120,6 +120,61 @@ static void drop_caps_ep_except(uint64_t keep) > > } > > } > > =20 > > +/** > > + * clamp_caps() - Prevent any children from gaining caps >=20 > "clamp" doesn't sound very specific or clear. caps_drop_inherit_bound() > would actually tell me what the function does, but it's a bit of a > mouthful in comparison. I'm fine with both, really, but if you have a > better idea... Yeah, I couldn't think of something that was both brief and specific, so I went with brief. > > + * > > + * This drops all capabilities from both the inheritable and the > > + * bounding set. This means that any exec()ed processes can't gain > > + * capabilities, even if they have file capabilities which would grant > > + * them. We shouldn't ever exec() in any case, but this provides an > > + * additional layer of protection. Executing this requires > > + * CAP_SETPCAP, which we will have within our userns. > > + * > > + * Note that dropping capabilites from the bounding set limits > > + * exec()ed processes, but does not remove them from the effective or > > + * permitted sets, so it doesn't reduce our own capabilities. > > + */ > > +static void clamp_caps(void) > > +{ > > + struct __user_cap_header_struct hdr =3D { > > + .version =3D CAP_VERSION, > > + .pid =3D 0, > > + }; > > + struct __user_cap_data_struct data[CAP_WORDS]; >=20 > For consistency, I'd move this before hdr. Ok. > > + int i; > > + > > + for (i =3D 0; i < 64; i++) { > > + /* Some errors can be ignored: > > + * - EINVAL, we'll get this for all values in 0..63 > > + * that are not actually allocated caps > > + * - EPERM, we'll get this if we don't have > > + * CAP_SETPCAP, which can happen if using > > + * --netns-only. We don't need CAP_SETPCAP for > > + * normal operation, so carry on without it. > > + */ > > + if (prctl(PR_CAPBSET_DROP, i, 0, 0, 0) && > > + errno !=3D EINVAL && errno !=3D EPERM) { > > + err("Couldn't drop cap %i from bounding set: %s", > > + i, strerror(errno)); > > + exit(EXIT_FAILURE); > > + } > > + } > > + > > + if (syscall(SYS_capget, &hdr, data)) { > > + err("Couldn't get current capabilities: %s", strerror(errno)); > > + exit(EXIT_FAILURE); > > + } > > + > > + for (i =3D 0; i < CAP_WORDS; i++) > > + data[i].inheritable =3D 0; >=20 > Any specific reason why? Initialisers can have variable sizes to some > extent, but if there's a reason why it can't be done, perhaps that > would warrant a comment here. Why what? We're not trying to alter the permitted or effective sets here, so we're doing a capget() first, zeroing the inheritable field, then setting it back again. > > + > > + if (syscall(SYS_capset, &hdr, data)) { > > + err("Couldn't drop inheritable capabilities: %s", > > + strerror(errno)); > > + exit(EXIT_FAILURE); > > + } > > +} > > + > > /** > > * isolate_initial() - Early, config independent self isolation > > * > > @@ -287,6 +342,7 @@ int isolate_prefork(struct ctx *c) > > ns_caps |=3D 1UL << CAP_NET_BIND_SERVICE; > > } > > =20 > > + clamp_caps(); > > drop_caps_ep_except(ns_caps); > > =20 > > return 0; >=20 --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --zFwCqliugpdGxo86 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEoULxWu4/Ws0dB+XtgypY4gEwYSIFAmNH21YACgkQgypY4gEw YSIOpBAAlSCpn9ShqtgqTdxnTfg6xvaVAJBkubSsdkZozBGgrA6PVGzgxg9cbErb 2h/HEMAU1nHbjyuL1CBigQQV1PtRb2y+w6BG0dDnxuORWbtlVm1z9TcF9eUqMLHD knUZbVQWU6JpnSm44Gg382fBXSGZgBros0vn48xYyaNlHJsBGb0RGLC/gszRODcd t0sWMXkuiUrBpSyQHWRGmq+51z7c5JbQW+zkNxXDpvsd+MruUgpMY595YwZ9lEzX EWMvdtPTqoQwkuVNwX5q5FdY3sF8RAOa5KuTRaLZ6yOXLXJSUm0IwFixW2DChEEZ AuMHGb2PxWX4/nuy5ZmXznemax035cg8GhOI98sfenrPpv9dSK/VFOagXBcSMgdR FjL4uS6n3mMNiiCoLBkx6yyJPrvK8fFBNwY1EKcQ5ps5ouChrPfIHqpp+Fs6Hz1l 3X41XMAqIeiSJUg2JLlGmk+Z8IVyylcSdO90dytj3BtaC2UYjehRRVnLAm6ILwO9 DDRQSIPocfi1SdIiZNrMvXQkM3ifAthhs96lzb2YBalFy6fB0WAnPjPyNXmFfiHk 75yp9illDojXWJJd86Wl0hCi8JaSa2E1666/vHmGz4mqKsF9fQyBHGVsoddyc3hZ xRzpQQxQVeBurODDnTcYNLWOWFsup3VQKF6BPJn+ZFltZ/167lU= =vFjw -----END PGP SIGNATURE----- --zFwCqliugpdGxo86--