From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 6 Feb 2023 20:53:08 +0100
From: Paul Holzinger <pholzing@redhat.com>
To: Stefano Brivio
CC: passt-dev@passt.top
Subject: Re: [PATCH] pasta: wait for netns setup before calling exec
In-Reply-To: <20230203195536.667a09f6@elisabeth>
References: <20230201180116.21281-1-pholzing@redhat.com> <20230202112506.187d852e@elisabeth> <7df94a05-50a5-26d8-784c-3565b6578212@redhat.com> <20230203173703.564f73c3@elisabeth> <20230203195536.667a09f6@elisabeth>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: Development discussion and patches for passt

On 03/02/2023 19:55, Stefano Brivio wrote:
> On Fri, 3 Feb 2023 17:37:03 +0100
> Stefano Brivio wrote:
>
>> On Fri, 3 Feb 2023 15:44:40 +0100
>> Paul Holzinger wrote:
>>
>>> On 02/02/2023 11:25, Stefano Brivio wrote:
>>>> On Wed, 1 Feb 2023 19:01:16 +0100
>>>> Paul Holzinger wrote:
>>>>
>>>>> When a user spawns a command with pasta they expect the network to be
>>>>> ready. Currently this does not work because pasta will fork/exec
>>>>> before it will setup the network config.
>>>>>
>>>>> This patch fixes it by using a pipe to sync parent and child. The child
>>>>> will now block reading from this pipe before the exec call. The parent
>>>>> will then unblock the child only after the netns was configured.
>>>> Thanks for the patch! I'm reviewing this in a bit.
>>>>
>>>> A few considerations meanwhile:
>>>>
>>>> - there's actually a bigger issue (you're fixing here) than the
>>>>   namespace configuration (via netlink) itself: the tap device isn't
>>>>   ready (tap_sock_init() hasn't been called yet) when we spawn the
>>>>   command in the new namespace. Oops.
>>>>
>>>>   If you're wondering: we can't just reorder things, because to complete
>>>>   the configuration phase (conf()) we need the namespace to be set up,
>>>>   and we can't initialise the tap device before it's set up
>>>>
>>>> - pipes are more commonly used to transfer data around (hence the whole
>>>>   code you need to open a communication channel, check it, close it).
>>>>   Did you try with a signal? Or is there a reason why it wouldn't work?
>>>>
>>>>   You could simply SIGSTOP the child, from the child itself:
>>>>
>>>>     kill(getpid(), SIGSTOP);
>>>>
>>>>   and send a SIGCONT to it (we already store the PID of the child in
>>>>   pasta_child_pid) once we're ready.
>>>>
>>>> SIGCONT is special in that it doesn't need CAP_KILL or the processes
>>>> to run under the same UID -- just in the same session, so it wouldn't
>>>> risk interfering with the isolation_*() calls.
>>>>
>>>> I haven't tested this but I think it should lead to simpler code.
>>> Thinking about this more STOP/CONT will not work reliably, it could stop
>>> the child forever when the parent sends SIGCONT before the child
>>> SIGSTOPs itself. While this is unlikely we have no control over how both
>>> processes are scheduled.
>>>
>>> With this pipe version there is no problem when the parent closes the fd
>>> before the child calls read, read will simply return EOF and the child can
>>> continue, thus it will work correctly in all cases.
>> Ah, right, nice catch. Still, you could probably use pause() or
>> sigsuspend() instead of the SIGSTOP. Let me try a quick stand-alone
>> experiment and I'll get back to you (probably early next week), unless
>> you manage to get it working before.
> Sorry, forget about it -- it doesn't solve the problem of waiting, in
> the parent, that the child is stopped, which is exactly the point you
> raised. A waitpid() with WUNTRACED does:
>
> #include <stdio.h>
> #include <unistd.h>
> #include <signal.h>
> #include <sys/wait.h>
>
> #define DELAY_PARENT 0
> #define DELAY_CHILD 0
>
> int main()
> {
>     pid_t pid;
>     int i;
>
>     if ((pid = fork())) {
> #if DELAY_PARENT
>         for (i = 0; i < 10000000; i++);
> #endif
>         waitpid(pid, NULL, WUNTRACED);
>         kill(pid, SIGCONT);
>         sleep(1);
>         return 0;
>     }
>
> #if DELAY_CHILD
>     for (i = 0; i < 10000000; i++);
> #endif
>     raise(SIGSTOP);
>     fprintf(stderr, "received SIGCONT\n");
>     return 0;
> }
>
> I left in some busyloops you can use to check. It's three lines, with
> error checks probably 9, still less than the pipe thing (~16) and it
> looks simpler (to me).
>
I don't know what it is, but this doesn't work when I implement it in
pasta. Somehow the child doesn't seem to be stopped.
A short-lived process such as ip addr causes pasta to exit even before
the parent gets to the point where it would send SIGCONT. If I take
nanosecond timestamps before and after raise(SIGSTOP), there is almost no
delay. It is clear that the child keeps running after raise(SIGSTOP) even
though the parent has not sent SIGCONT at that point. In fact, I can
completely remove the kill(pid, SIGCONT) call and the program still works
without hanging.

And of course if I run this through strace it works just fine, so I am a
bit lost right now. Likely because strace makes things much slower?

The waitpid() call also always fails with ECHILD, and I don't understand
why. I think it must have something to do with the SIGCHLD signal handler
that already reaps the child?

So either I made a stupid mistake somewhere or it simply does not work.
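
For comparison, here is a minimal stand-alone sketch of the pipe-based
synchronisation described earlier in the thread, which does not rely on
stop signals or on waitpid() reporting a stopped child. It is not the code
from the patch: spawn_after_setup() and its setup callback are
hypothetical names used only for illustration, with setup() standing in
for the namespace and tap configuration.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not pasta's code: fork a child that executes
 * argv[0] only after setup() has returned in the parent.
 * Returns the child's exit status, or -1 on error. */
int spawn_after_setup(void (*setup)(void), char *const argv[])
{
    int fds[2], status;
    pid_t pid;
    char c;

    if (pipe(fds))
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (!pid) {
        /* Child: drop our copy of the write end, otherwise read()
         * below would never see EOF. */
        close(fds[1]);

        /* Block until the parent closes its write end. If it already
         * did, read() returns 0 right away. */
        while (read(fds[0], &c, 1) < 0 && errno == EINTR)
            ;
        close(fds[0]);

        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: the read end is only needed by the child. */
    close(fds[0]);

    if (setup)
        setup(); /* assumption: stands in for netns and tap setup */

    close(fds[1]); /* child sees EOF and may exec now */

    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with argv = { "ip", "addr", NULL } and a callback that configures
the namespace, this would exec ip only once the callback has returned. If
the parent closes its end before the child reaches read(), read() returns
0 immediately, so the order in which the two processes are scheduled
doesn't matter.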