public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: passt-dev@passt.top
Cc: David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH] pasta: Workaround: wait for execvp() to be done in child before entering netns
Date: Fri,  4 Nov 2022 02:53:28 +0100	[thread overview]
Message-ID: <20221104015328.3831630-1-sbrivio@redhat.com> (raw)

This happens about every third time on the two_guests/basic test, and
on that test only: we clone() twice, first to spawn a child, then to
spawn a thread to check that we can enter the target network
namespace.

In this thread, we open a file descriptor associated to the target
namespace. It might happen that it doesn't exist yet: the kernel can
legitimately take its time to create one, after clone(). In this
case, at least on a 5.15 Linux kernel, trying to open that file again
always yields EACCES, and we get stuck there.

This only occurs if we spawn two instances of pasta very close
together, as it's done in the two_guests/basic case.

I couldn't figure out what the race condition is, yet, and especially
if it's a kernel issue or something we're doing wrong. However, if we
wait until the execvp() in the child is done, the issue disappears.
I'm not sure yet if it's just because of timing and this is hiding an
unrelated race condition.

The workaround consists of checking /proc/PID/exe against our own. If
it's different, that means execvp() already completed and we can
proceed. It's rather ugly, but much better than the alternative.
Leave a FIXME there for the moment being.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 pasta.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/pasta.c b/pasta.c
index db86317..36072b2 100644
--- a/pasta.c
+++ b/pasta.c
@@ -81,9 +81,26 @@ void pasta_child_handler(int signal)
  */
 static int pasta_wait_for_ns(void *arg)
 {
+	char ns_exe_link[PATH_MAX], ns[PATH_MAX];
 	struct ctx *c = (struct ctx *)arg;
 	int flags = O_RDONLY | O_CLOEXEC;
-	char ns[PATH_MAX];
+	char exe[PATH_MAX] = { 0 };
+
+	/* FIXME: Why do we have to wait until execvp() is done in the child?
+	 * If we don't, and the first call to open() below returns ENOENT, any
+	 * subsequent call to it returns EACCES, at least on Linux 5.15, even
+	 * though the observed PID is correct, and another process can open that
+	 * path, and call setns() on that.
+	 */
+	snprintf(ns_exe_link, PATH_MAX, "/proc/%i/exe", pasta_child_pid);
+	if (readlink("/proc/self/exe", exe, PATH_MAX - 1) != -1) {
+		char ns_exe[PATH_MAX] = { 0 };
+
+		do {
+			if (readlink(ns_exe_link, ns_exe, PATH_MAX - 1) == -1)
+				break;
+		} while (!strncmp(exe, ns_exe, PATH_MAX - 1));
+	}
 
 	snprintf(ns, PATH_MAX, "/proc/%i/ns/net", pasta_child_pid);
 	do
-- 
@@ -81,9 +81,26 @@ void pasta_child_handler(int signal)
  */
 static int pasta_wait_for_ns(void *arg)
 {
+	char ns_exe_link[PATH_MAX], ns[PATH_MAX];
 	struct ctx *c = (struct ctx *)arg;
 	int flags = O_RDONLY | O_CLOEXEC;
-	char ns[PATH_MAX];
+	char exe[PATH_MAX] = { 0 };
+
+	/* FIXME: Why do we have to wait until execvp() is done in the child?
+	 * If we don't, and the first call to open() below returns ENOENT, any
+	 * subsequent call to it returns EACCES, at least on Linux 5.15, even
+	 * though the observed PID is correct, and another process can open that
+	 * path, and call setns() on that.
+	 */
+	snprintf(ns_exe_link, PATH_MAX, "/proc/%i/exe", pasta_child_pid);
+	if (readlink("/proc/self/exe", exe, PATH_MAX - 1) != -1) {
+		char ns_exe[PATH_MAX] = { 0 };
+
+		do {
+			if (readlink(ns_exe_link, ns_exe, PATH_MAX - 1) == -1)
+				break;
+		} while (!strncmp(exe, ns_exe, PATH_MAX - 1));
+	}
 
 	snprintf(ns, PATH_MAX, "/proc/%i/ns/net", pasta_child_pid);
 	do
-- 
2.35.1


             reply	other threads:[~2022-11-04  1:53 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04  1:53 Stefano Brivio [this message]
2022-11-07  3:17 ` [PATCH] pasta: Workaround: wait for execvp() to be done in child before entering netns David Gibson
2022-11-07  9:51   ` Stefano Brivio
2022-11-07 18:40     ` Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221104015328.3831630-1-sbrivio@redhat.com \
    --to=sbrivio@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).