Date: Fri, 3 Feb 2023 19:55:36 +0100
From: Stefano Brivio
To: Paul Holzinger
Subject: Re: [PATCH] pasta: wait for netns setup before calling exec
Message-ID: <20230203195536.667a09f6@elisabeth>
In-Reply-To: <20230203173703.564f73c3@elisabeth>
References: <20230201180116.21281-1-pholzing@redhat.com>
 <20230202112506.187d852e@elisabeth>
 <7df94a05-50a5-26d8-784c-3565b6578212@redhat.com>
 <20230203173703.564f73c3@elisabeth>
Organization: Red Hat
CC: passt-dev@passt.top
List-Id: Development discussion and patches for passt

On Fri, 3 Feb 2023 17:37:03 +0100
Stefano Brivio wrote:

> On Fri, 3 Feb 2023 15:44:40 +0100
> Paul Holzinger wrote:
>
> > On 02/02/2023 11:25, Stefano Brivio wrote:
> > > On Wed, 1 Feb 2023 19:01:16 +0100
> > > Paul Holzinger wrote:
> > >
> > >> When a user spawns a command with pasta they expect the network to be
> > >> ready. Currently this does not work because pasta will fork/exec
> > >> before it will setup the network config.
> > >>
> > >> This patch fixes it by using a pipe to sync parent and child. The child
> > >> will now block reading from this pipe before the exec call. The parent
> > >> will then unblock the child only after the netns was configured.
> > > Thanks for the patch! I'm reviewing this in a bit.
> > >
> > > A few considerations meanwhile:
> > >
> > > - there's actually a bigger issue (you're fixing here) than the
> > >   namespace configuration (via netlink) itself: the tap device isn't
> > >   ready (tap_sock_init() hasn't been called yet) when we spawn the
> > >   command in the new namespace. Oops.
> > >
> > >   If you're wondering: we can't just reorder things, because to complete
> > >   the configuration phase (conf()) we need the namespace to be set up,
> > >   and we can't initialise the tap device before it's set up
> > >
> > > - pipes are more commonly used to transfer data around (hence the whole
> > >   code you need to open a communication channel, check it, close it).
> > >   Did you try with a signal? Or is there a reason why it wouldn't work?
> > >
> > >   You could simply SIGSTOP the child, from the child itself:
> > >
> > > 	kill(getpid(), SIGSTOP);
> > >
> > >   and send a SIGCONT to it (we already store the PID of the child in
> > >   pasta_child_pid) once we're ready.
> > >
> > >   SIGCONT is special in that it doesn't need CAP_KILL or the processes
> > >   to run under the same UID -- just in the same session, so it wouldn't
> > >   risk interfering with the isolation_*() calls.
> > >
> > > I haven't tested this but I think it should lead to simpler code.
> >
> > Thinking about this more STOP/CONT will not work reliably, it could stop
> > the child forever when the parent sends SIGCONT before the child
> > SIGSTOPs itself. While this is unlikely we have no control over how both
> > processes are scheduled.
> >
> > With this pipe version there is no problem when the parent closes the fd
> > before the child calls read, read will simply return EOF and the child can
> > continue, thus it will work correctly in all cases.
>
> Ah, right, nice catch. Still, you could probably use pause() or
> sigsuspend() instead of the SIGSTOP. Let me try a quick stand-alone
> experiment and I'll get back to you (probably early next week), unless
> you manage to get it working before.

Sorry, forget about it -- it doesn't solve the problem of waiting, in
the parent, that the child is stopped, which is exactly the point you
raised.
A waitpid() with WUNTRACED does:

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>

#define DELAY_PARENT 0
#define DELAY_CHILD 0

int main(void)
{
	pid_t pid;
	int i;

	if ((pid = fork())) {
		/* Parent */
#if DELAY_PARENT
		for (i = 0; i < 10000000; i++);
#endif
		/* Returns once the child has stopped (or terminated) */
		waitpid(pid, NULL, WUNTRACED);
		kill(pid, SIGCONT);
		sleep(1);
		return 0;
	}

	/* Child */
#if DELAY_CHILD
	for (i = 0; i < 10000000; i++);
#endif
	raise(SIGSTOP);
	fprintf(stderr, "received SIGCONT\n");
	return 0;
}

I left in some busyloops you can use to check. It's three lines, with
error checks probably 9, still less than the pipe thing (~16) and it
looks simpler (to me).

--
Stefano
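
For reference, here is a rough sketch of what the error-checked variant of the waitpid()/SIGSTOP/SIGCONT handshake might look like. It is untested against pasta itself. The helper name stop_and_resume_child() and its exit-code parameter are hypothetical and exist only for illustration, and the comments mark where the network setup and the exec of the user command would presumably go.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of pasta: fork a child that stops itself,
 * wait until it is really stopped, resume it, then reap it.
 * Returns the child's exit status, or -1 on error.
 */
int stop_and_resume_child(int child_exit_code)
{
	int status;
	pid_t pid;

	pid = fork();
	if (pid < 0) {
		perror("fork");
		return -1;
	}

	if (!pid) {
		/* Child: stop here until the parent sends SIGCONT */
		if (raise(SIGSTOP))
			_exit(127);

		/* Assumption: the exec of the user command would go here */
		_exit(child_exit_code);
	}

	/* Parent: with WUNTRACED, waitpid() also returns when the child
	 * stops, so SIGCONT can't be sent before SIGSTOP took effect.
	 */
	for (;;) {
		if (waitpid(pid, &status, WUNTRACED) < 0) {
			if (errno == EINTR)
				continue;
			perror("waitpid");
			return -1;
		}
		break;
	}

	if (!WIFSTOPPED(status)) {
		fprintf(stderr, "child terminated before stopping\n");
		return -1;
	}

	/* Assumption: namespace and tap setup would complete here */

	if (kill(pid, SIGCONT) < 0) {
		perror("kill");
		return -1;
	}

	/* Without WUNTRACED, this only returns once the child terminates */
	for (;;) {
		if (waitpid(pid, &status, 0) < 0) {
			if (errno == EINTR)
				continue;
			perror("waitpid");
			return -1;
		}
		break;
	}

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling stop_and_resume_child(7) should return 7 regardless of which process gets scheduled first, because the parent only sends SIGCONT after waitpid() has reported the stop.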