From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 16 Feb 2023 10:15:12 +0100
From: Stefano Brivio
To: Michal Prívozník
Subject: Re: [PATCH 4/4] qemu_passt: Don't let passt fork off
Message-ID: <20230216101512.4a3e4246@elisabeth>
In-Reply-To: <0ab46c27-33aa-35cf-f233-91ca31c26987@redhat.com>
References: <5abfc412e4692a38e980c8dc600e1bfbd03ddcfd.1676374699.git.mprivozn@redhat.com>
 <20230214140253.49bbc13a@elisabeth>
 <90dbb5f3-7b3f-893c-ca32-a7653eb486c6@redhat.com>
 <7cbc3713-9d51-2950-2a3c-ae90928b83b6@redhat.com>
 <20230215193020.4af13f54@elisabeth>
 <0ab46c27-33aa-35cf-f233-91ca31c26987@redhat.com>
Organization: Red Hat
CC: Laine Stump, Libvirt, passt-dev@passt.top
List-Id: Development discussion and patches for passt

On Thu, 16 Feb 2023 09:52:27 +0100
Michal Prívozník wrote:

> On 2/15/23 19:30, Stefano Brivio wrote:
> > On Wed, 15 Feb 2023 18:04:56 +0100
> > Michal Prívozník wrote:
> >
> >> On 2/15/23 08:50, Laine Stump wrote:
> >>> On 2/14/23 8:02 AM, Stefano Brivio wrote:
> >>>> On Tue, 14 Feb 2023 12:51:22 +0100
> >>>> Michal Privoznik wrote:
> >>>>
> >>>>> When passt starts it tries to do some security measures to
> >>>>> restrict itself. For instance, it creates its own namespaces,
> >>>>> umounts basically everything, drops capabilities, forks off to
> >>>>> further restrict itself (the child is where all interesting work
> >>>>> takes place now).
> >>>>> This is sound, except it's causing two
> >>>>> problems:
> >>>>>
> >>>>> 1) the PID file FD, which we leak into the passt process, gets
> >>>>>    closed (and thus our virPidFile*() helpers see unlocked PID
> >>>>>    file, which makes them think the process is gone),
> >>>>
> >>>> I didn't realise this was the case, but giving passt write (unless I'm
> >>>> missing something) access to a file created by libvirtd doesn't look
> >>>> desirable to me.
> >>>
> >>>>
> >>>>> 2) the PID file no longer reflects true PID of the process.
> >>>>>
> >>>>> Worse, the child calls setsid() so we can't even kill the whole
> >>>>> process group. I mean, we can but it won't be any good.
> >>>
> >>> I think that (incorrect PID in the pidfile) is happening because Michal
> >>> is using the original version of my patches that were pushed - I had
> >>> mimicked the behavior of slirp, where libvirt daemonizes the new
> >>> process. If that process then daemonizes itself, we have some sort of
> >>> "double daemon"; libvirt has saved off the pid of what it thinks is
> >>> going to be the final process, but then that process further forks and
> >>> exits from the process whose pid libvirt saved. But because passt was
> >>> cleaning up after itself I hadn't noticed the discrepancy in pids when
> >>> testing.
> >>>
> >>> Without going into all the details of the pidfile and locking and etc, I
> >>> just want to say that if we can fork/exec dnsmasq and let it daemonize
> >>> itself and create its own pidfile, then certainly we can do the same
> >>> thing for passt. (and if there's a fundamental problem, then it's a
> >>> fundamental problem for dnsmasq as well).
> >>
> >> Alright. I think I have a solution that would please everybody involved.
> >> I'll post it tomorrow though. I need to test it thoroughly.
> >> We would be
> >> able to get passt's PID (which is needed not only for killing it, but
> >> also for CGroup placement), NOT use --foreground and still pass errors
> >> from it to users (that is unless logfile was specified, because
> >> unfortunately, --log-file and --stderr are mutually exclusive).
> >
> > That doesn't need to be the case (--log-file and --stderr being
> > mutually exclusive)... if you have a use case for it, let's change that
> > in passt. I just wanted to keep it simple for users ("give a log file,
> > and be sure it won't spam").
> >
> > Also mind that Laine's series:
> >   https://archives.passt.top/passt-dev/20230215082437.110151-1-laine@redhat.com/
>
> Thanks, this looks exactly like what we need. So for now I can just pass
> --stderr if there's no --log-file, to deal with those "releases" that
> don't have those patches merged yet.

I wouldn't even bother with that: the user base (especially with
libvirt) is small enough that we can be quite confident that 100% of
the users will upgrade as soon as a new release (most likely coming
this week) includes them. I'd suggest keeping it simple, because you
really can.

Those are actual releases. Their naming just doesn't follow "semantic
versioning".

> > *should* already cover all the cases where libvirt is interested in
> > relaying "early" errors back to the user.
> >
> > By the way, the one below is pretty much the patch I would have proposed
> > for libvirt. I prepared it earlier today and didn't have a chance to
> > test it yet, it's compile-tested only, and doesn't take cgroups into
> > account (which, it seems, is needed no matter the lifecycle).
> >
> > So I'm sharing it here as reference (that's how simple I wanted it to
> > be -- minus cgroups), or if it's convenient for you to copy and paste
> > something.
>
> This effectively disables placing passt into the CGroup set up for
> emulator thread. And I don't think we want that.
> Firstly, it makes
> statistics gathering report incorrect values. Secondly, these helper
> processes are "implementation detail" - I mean, users don't really care
> (from accounting POV) whether a task runs in emulator thread inside of
> QEMU or in a separate process. It's still an emulation and as such
> should be accounted for. And also, on NUMA machines we definitely want
> to place passt as close to the emulator as possible (i.e. if emulator
> thread is pinned then helper processes should be pinned too).

Yes, definitely, I see now -- I thought, earlier, that cgroups were
just used to handle lifecycles at the moment.

> [...]

-- 
Stefano