Date: Wed, 7 Aug 2024 13:37:16 +0200
From: Stefano Brivio
To: Paul Holzinger
CC: passt-dev@passt.top, David Gibson
Subject: Re: [PATCH v3] passt, util: Close any open file that the parent might have leaked
Message-ID: <20240807133716.6ac546d3@elisabeth>
In-Reply-To: <138d459e-e805-4cc8-9a6e-f4bdcb4347d3@redhat.com>
References: <20240807111100.2086825-1-sbrivio@redhat.com> <138d459e-e805-4cc8-9a6e-f4bdcb4347d3@redhat.com>
Organization: Red Hat
List-Id: Development discussion and patches for passt

On Wed, 7 Aug 2024 13:26:32 +0200
Paul Holzinger wrote:

> On 07/08/2024 13:11, Stefano Brivio wrote:
> > If a parent accidentally or due to implementation reasons leaks any
> > open file, we don't want to have access to them, except for the file
> > passed via --fd, if any.
> >
> > This is the case for Podman when Podman's parent leaks files into
> > Podman: it's not practical for Podman to close unrelated files before
> > starting pasta, as reported by Paul.
> >
> > Use close_range(2) to close all open files except for standard streams
> > and the one from --fd.
> >
> > Given that parts of conf() depend on other files to be already opened,
> > such as the epoll file descriptor, we can't easily defer this to a
> > more convenient point, where --fd was already parsed. Introduce a
> > minimal, duplicate version of --fd parsing to keep this simple.
> >
> > As we need to check that the passed --fd option doesn't exceed
> > INT_MAX, because we'll parse it with strtol() but file descriptor
> > indices are signed ints (regardless of the arguments close_range()
> > take), extend the existing check in the actual --fd parsing in conf(),
> > while at it.
> >
> > Suggested-by: Paul Holzinger
> > Signed-off-by: Stefano Brivio
> > ---
> > v3: Handle --fd 3 case, and don't overflow if the --fd number exceeds
> >     UINT_MAX: add an explicit check to ensure it's less than INT_MAX
> >
> > v2: Move call to close_open_files() to isolate_initial()
> >
> >  conf.c      |  3 ++-
> >  isolation.c | 12 +++++++++---
> >  isolation.h |  2 +-
> >  passt.c     |  2 +-
> >  util.c      | 38 ++++++++++++++++++++++++++++++++++++++
> >  util.h      |  1 +
> >  6 files changed, 52 insertions(+), 6 deletions(-)
> >
> > diff --git a/conf.c b/conf.c
> > index 14d8ece..5422813 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -1260,6 +1260,7 @@ void conf(struct ctx *c, int argc, char **argv)
> >  	c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = FWD_UNSET;
> >  	c->udp.fwd_in.mode = c->udp.fwd_out.mode = FWD_UNSET;
> >
> > +	optind = 1;
> >  	do {
> >  		name = getopt_long(argc, argv, optstring, options, NULL);
> >
> > @@ -1426,7 +1427,7 @@ void conf(struct ctx *c, int argc, char **argv)
> >  			errno = 0;
> >  			c->fd_tap = strtol(optarg, NULL, 0);
> >
> > -			if (c->fd_tap < 0 || errno)
> > +			if (c->fd_tap < 0 || errno || c->fd_tap > INT_MAX)
> >  				die("Invalid --fd: %s", optarg);
> >
> >  			c->one_off = true;
> > diff --git a/isolation.c b/isolation.c
> > index 4956d7e..45fba1e 100644
> > --- a/isolation.c
> > +++ b/isolation.c
> > @@ -29,7 +29,8 @@
> >   *
> >   * Executed immediately after startup, drops capabilities we don't
> >   * need at any point during execution (or which we gain back when we
> > - * need by joining other namespaces).
> > + * need by joining other namespaces), and closes any leaked file we
> > + * might have inherited from the parent process.
> >   *
> >   * 2. isolate_user()
> >   * =================
> > @@ -166,14 +167,17 @@ static void clamp_caps(void)
> >  }
> >
> >  /**
> > - * isolate_initial() - Early, config independent self isolation
> > + * isolate_initial() - Early, mostly config independent self isolation
> > + * @argc:	Argument count
> > + * @argv:	Command line options: only --fd (if present) is relevant here
> >   *
> >   * Should:
> >   *  - drop unneeded capabilities
> > + *  - close all open files except for standard streams and the one from --fd
> >   * Musn't:
> >   *  - remove filesytem access (we need to access files during setup)
> >   */
> > -void isolate_initial(void)
> > +void isolate_initial(int argc, char **argv)
> >  {
> >  	uint64_t keep;
> >
> > @@ -207,6 +211,8 @@ void isolate_initial(void)
> >  		keep |= BIT(CAP_SETFCAP) | BIT(CAP_SYS_PTRACE);
> >
> >  	drop_caps_ep_except(keep);
> > +
> > +	close_open_files(argc, argv);
> >  }
> >
> >  /**
> > diff --git a/isolation.h b/isolation.h
> > index 846b2af..80bb68d 100644
> > --- a/isolation.h
> > +++ b/isolation.h
> > @@ -7,7 +7,7 @@
> >  #ifndef ISOLATION_H
> >  #define ISOLATION_H
> >
> > -void isolate_initial(void);
> > +void isolate_initial(int argc, char **argv);
> >  void isolate_user(uid_t uid, gid_t gid, bool use_userns, const char *userns,
> >  		  enum passt_modes mode);
> >  int isolate_prefork(const struct ctx *c);
> > diff --git a/passt.c b/passt.c
> > index ea5bece..4b3c306 100644
> > --- a/passt.c
> > +++ b/passt.c
> > @@ -211,7 +211,7 @@ int main(int argc, char **argv)
> >
> >  	arch_avx2_exec(argv);
> >
> > -	isolate_initial();
> > +	isolate_initial(argc, argv);
> >
> >  	c.pasta_netns_fd = c.fd_tap = c.pidfile_fd = -1;
> >
> > diff --git a/util.c b/util.c
> > index 07fb21c..9c6be6a 100644
> > --- a/util.c
> > +++ b/util.c
> > @@ -26,6 +26,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #include "util.h"
> >  #include "iov.h"
> > @@ -694,3 +695,40 @@ const char *str_ee_origin(const struct sock_extended_err *ee)
> >
> >  	return "";
> >  }
> > +
> > +/**
> > + * close_open_files() - Close leaked files, but not --fd, stdin, stdout, stderr
> > + * @argc:	Argument count
> > + * @argv:	Command line options, as we need to skip any file given via --fd
> > + */
> > +void close_open_files(int argc, char **argv)
> > +{
> > +	const struct option optfd[] = { { "fd", required_argument, NULL, 'F' },
> > +					{ 0 },
> > +				      };
> > +	long fd = -1;
> > +	int name;
> > +
> > +	do {
> > +		name = getopt_long(argc, argv, ":F", optfd, NULL);
> > +
> > +		if (name == 'F') {
> > +			errno = 0;
> > +			fd = strtol(optarg, NULL, 0);
> > +
> > +			if (fd < 0 || errno || fd > INT_MAX)
> > +				die("Invalid --fd: %s", optarg);
> > +		}
> > +	} while (name != -1);
> > +
> > +	if (fd == -1 || fd == 3) {
> > +		unsigned int first = (fd == 3) ? 4 : 3;
> > +
> > +		if (close_range(first, ~0U, CLOSE_RANGE_UNSHARE))
> > +			die_perror("Failed to close files leaked by parent");
> > +	} else {
> > +		if (close_range(3, fd - 1, CLOSE_RANGE_UNSHARE) ||
> > +		    close_range(fd + 1, ~0U, CLOSE_RANGE_UNSHARE))
> > +			die_perror("Failed to close files leaked by parent");
> > +	}
> Sorry that I didn't mentioned this before but doesn't this still fail
> when given fd 0, 1 or 2? I guess it should check if (fd < 3) in the
> die(Invalid --fd) case, unless someone sees a reason to allow passing
> the fd on stdio fds?

Oops, I missed your comment as I was sending v4.

Right, that would also be problematic. I think the only semi-reasonable
thing somebody could try is to close our standard input, because we
don't use it at all, and use that as --fd.

But it wouldn't work anyway at the moment, in any case where we call
__daemon(), because we would happily close that file descriptor. Same
for standard output and standard error anyway.

So, yeah, let's add the check you proposed, and should somebody ever
come up with a '--fd 0' use case, we'll handle that. v5 coming.

-- 
Stefano
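
For illustration only: the range selection from the quoted patch, with
the proposed fd < 3 check added, might look roughly like the sketch
below. The helper leaked_fd_ranges() and struct fd_range are
hypothetical names, not part of passt, and this is not the v5 patch.
The sketch only computes the ranges that would be handed to
close_range(2); it doesn't close anything itself.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical type, not part of passt: one inclusive range of file
 * descriptors, as it would be passed to close_range(first, last, flags).
 */
struct fd_range {
	unsigned int first;
	unsigned int last;
};

/**
 * leaked_fd_ranges() - Compute descriptor ranges to close (hypothetical)
 * @fd:		Value parsed from --fd, or -1 if the option wasn't given
 * @r:		Filled with up to two ranges, in ascending order
 *
 * Return: number of ranges stored in @r, -1 if @fd is not acceptable
 */
static int leaked_fd_ranges(long fd, struct fd_range r[2])
{
	int n = 0;

	assert(r);

	if (fd == -1) {
		/* No --fd: close everything above standard streams */
		r[n++] = (struct fd_range){ 3, ~0U };
		return n;
	}

	/* Reject standard streams (the check proposed in this thread)
	 * and anything that doesn't fit in a signed int.
	 */
	if (fd < 3 || fd > INT_MAX)
		return -1;

	/* --fd 3 has nothing between standard streams and itself */
	if (fd > 3)
		r[n++] = (struct fd_range){ 3, (unsigned int)fd - 1 };

	r[n++] = (struct fd_range){ (unsigned int)fd + 1, ~0U };

	return n;
}
```

Each returned range would then go to close_range(first, last,
CLOSE_RANGE_UNSHARE), as in the patch. Using -1 to mean "no --fd" is a
simplification of this sketch: the patch itself rejects negative values
while parsing, so -1 can only be the initial value there.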