From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 75D495A0082 for ; Mon, 24 Oct 2022 01:53:33 +0200 (CEST) Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4MwZlq3tQ3z4xGS; Mon, 24 Oct 2022 10:53:27 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1666569207; bh=5QBCoZB2PTbKSU5rR3erGtEeWQgcE1UxI5eWMrfDgvY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=JeUgQM32vkR7tHoMqZquo+PxyiYWLEMnwbCRGnE1WYFPx8rfPAG2bUH6O7guIRc/I MRl70HPovJUvW3FvRoDJDP/reBPBW/dQKN4TPO9znC1IxuFv4ug538ivY7U3FkknMf HXTxSo0dnlDF3nYtVza3cP3Zpw4HgxCnlsXT3c2c= Date: Mon, 24 Oct 2022 10:52:55 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH] util: Set NS_FN_STACK_SIZE to one eighth of ulimit-reported maximum stack size Message-ID: References: <20221022064503.386563-1-sbrivio@redhat.com> <20221022101535.4ceda8d6@elisabeth> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="lsp/anHezChOOOrL" Content-Disposition: inline In-Reply-To: Message-ID-Hash: F4W36DY7UNFCP6VF7JVUFOBQCVZDTL2Z X-Message-ID-Hash: F4W36DY7UNFCP6VF7JVUFOBQCVZDTL2Z X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, Andrea Bolognani X-Mailman-Version: 3.3.3 Precedence: list List-Id: Development discussion and patches for passt Archived-At: <> Archived-At: List-Archive: <> List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --lsp/anHezChOOOrL Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Oct 24, 2022 at 10:36:19AM +1100, David Gibson wrote: > On Sat, Oct 22, 2022 at 10:15:35AM +0200, Stefano Brivio wrote: > > On Sat, 22 Oct 2022 08:45:03 +0200 > > Stefano Brivio wrote: > >=20 > > > ...instead of one fourth. On the main() -> conf() -> nl_sock_init() > > > call path, LTO from gcc 12 on (at least) x86_64 decides to inline... > > > everything: nl_sock_init() is effectively part of main(), after > > > commit 3e2eb4337bc0 ("conf: Bind inbound ports with > > > CAP_NET_BIND_SERVICE before isolate_user()"). > > >=20 > > > This means we exceed the maximum stack size, and we get SIGSEGV, > > > under any condition, at start time, as reported by Andrea on a recent > > > build for CentOS Stream 9. > > >=20 > > > The calculation of NS_FN_STACK_SIZE, which is the stack size we > > > reserve for clones, was previously obtained by dividing the maximum > > > stack size by two, to avoid an explicit check on architecture (on > > > PA-RISC, also known as hppa, the stack grows up, so we point the > > > clone to the middle of this area), and then further divided by two > > > to allow for any additional usage in the caller. > > >=20 > > > Well, if there are essentially no function calls anymore, this is > > > not enough. Divide it by eight, which is anyway much more than > > > possibly needed by any clone()d callee. > > >=20 > > > I think this is robust, so it's a fix in some sense. Strictly > > > speaking, though, we have no formal guarantees that this isn't > > > either too little or too much. > > >=20 > > > What we should do, eventually: check cloned() callees, there are just > > > thirteen of them at the moment. Note down any stack usage (they are > > > mostly small helpers), bonus points for an automated way at build > > > time, quadruple that or so, to allow for extreme clumsiness, and use > > > as NS_FN_STACK_SIZE. Perhaps introduce a specific condition for hppa. > > >=20 > > > Reported-by: Andrea Bolognani > > > Fixes: 3e2eb4337bc0 ("conf: Bind inbound ports with CAP_NET_BIND_SERV= ICE before isolate_user()") > > > Signed-off-by: Stefano Brivio > > > --- > >=20 > > I posted this in any case for (later) review, but I'm actually applying > > it right away, given that some builds are completely unusable otherwise. >=20 > That's sensible, however, this patch confuses me. I don't really > understand how reducing the stack size is avoiding a SEGV, regardless > of what LTO does. Shortly after I wrote this, I realized what the issue was. IIUC, the problem is basically because we're allocating the stack of the sub-thread as a buffer on the stack of the main thread, so the main thread stack has to have room for both. > The fact that we're basing the runtime stack size > on a limit that's from build time also doesn't really make sense to > me. This aspect still seems pretty bogus to me. > > I guess it would make sense to extend build ("distro") tests with some > > compiler flags, common and less common. > >=20 >=20 --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --lsp/anHezChOOOrL Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEoULxWu4/Ws0dB+XtgypY4gEwYSIFAmNV09EACgkQgypY4gEw YSLiiQ//fM8UvJ8kKBT91NmHdu0QlK2zJoEOIJVQHBgHzVhdUMICCAR7xnT/+brh ENnlmXH2bxWKTqizIZCmB8+v35A/RcZIBvUhhoDgN6CX1YBgF/HOHZRj5y5JHJBP CCC4yviw/2DPglH2WD/QzGhB1tZdqf7KAmzTZScD8275jFTOkM79ZRz/UN7jRtas XJpyqOa77YlYswyjBZfliUG1jGZs+SnuN+bZLcIccmMQY3YzTcBlzuIxH/JO0dZi G/bbJ49J7dbSwQOyGomQVA0oNGNDcGcJqBUcpob7CPw3A2VlVdfaTqtXavJKc/GP Bhn2NOABiblCbYOy4akgy+a3vqrzeaxoNnMd+UzT6n8uZjYDybL+7u6rQPgOVXTO jMM4+ta+DzD+neKAqlYddhyeiRs45vESgMYS2JOYw1cBHJJFKOasx1xQId7lJOMU iwKs9sLUA2atYPZy+1EKZaxK8boLYXE16OP8DFxdZ6XiibtqXbAnY3N7m6D8W1GJ gE9zdYrpXrCibXMLwiHF6s+pUmhOoj+yehSbwLDl1zlBu6K1LMjIU+/+h79sp1XF C+5NQt2n9TLId8tEAx7OjxHlk8/+xPKjMZPFUQnMMp65dG1Jy0GY+LbMMV38GF07 o6gy/O37m4mhWXMv2NJc3OzQJi3ZOME4fhR3UFWPdofWahZVeHQ= =8zVC -----END PGP SIGNATURE----- --lsp/anHezChOOOrL--