From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 494F45A0268 for ; Mon, 27 Feb 2023 11:47:56 +0100 (CET) Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4PQHJm5lCRz4x8x; Mon, 27 Feb 2023 21:47:52 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1677494872; bh=51M2PKQ/Mpj8/J1PHVvUfWdgcWtD4DjvgTRhsm7Rm6g=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=VlRCOQ/eDp0E4PE9zfiHgTFtzdimN2eLOCnWFAeYxAhNQxWmvKPGo7cgHd2fkTLwl HsYnfqxzdYRMw/wZfhVfkQn62wUpLyg6nRRFCIV7Zoa5OJ0DOMvyuitliUMfxPE7MX iztkHJFUbjrxO25S3H5CDCWpG4kNu4SdFbWfhyS4= Date: Mon, 27 Feb 2023 21:47:50 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH] treewide: Disable gcc strict aliasing rules as needed, drop workarounds Message-ID: References: <20230227095929.225622-1-sbrivio@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="Gef9geWATm7lfPi9" Content-Disposition: inline In-Reply-To: <20230227095929.225622-1-sbrivio@redhat.com> Message-ID-Hash: EJ557EUSDBLGFPR52GIA6E2OR3EF527M X-Message-ID-Hash: EJ557EUSDBLGFPR52GIA6E2OR3EF527M X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --Gef9geWATm7lfPi9 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Feb 27, 2023 at 10:59:29AM +0100, Stefano Brivio wrote: > Recently, commit 4ddbcb9c0c55 ("tcp: Disable optimisations > for tcp_hash()") worked around yet another issue we hit with gcc 12 > and '-flto -O2': some stores affecting the input data to siphash_20b() > were omitted altogether, and tcp_hash() wouldn't get the correct hash > for incoming connections. >=20 > Digging further into this revealed that, at least according to gcc's > interpretation of C99 aliasing rules, passing pointers to functions > with different types compared to the effective type of the object > (for example, a uint8_t pointer to an anonymous struct, as it happens > in tcp_hash()), doesn't guarantee that stores are not reordered > across the function call. > This means that, in general, our checksum and hash functions might > not see parts of input data that was intended to be provided by > callers. >=20 > Not even switching from uint8_t to character types, which should be > appropriate here, according to C99 (ISO/IEC 9899, TC3, draft N1256), > section 6.5, "Expressions", paragraph 7: >=20 > An object shall have its stored value accessed only by an lvalue > expression that has one of the following types: >=20 > [...] >=20 > =E2=80=94 a character type. Huh, weird. I certainly thought char/uint8_t pointers were explicitly allowed to be aliased with anything, or an enormous number of C-isms can't be counted on. > does the trick. I guess this is also subject to interpretation: > casting passed pointers to character types, and then using those as > different types, might still violate (dubious) aliasing rules. Well, at least we know why now. > Disable gcc strict aliasing rules for potentially affected functions, > which, in turn, disables gcc's Type-Based Alias Analysis (TBAA) > optimisations based on those function arguments. Excellent, I like this workaround much better than the previous one. Who knows exactly what "-O0" mean, here it's much more explicit what we're saying. I'd still prefer to make the code strict-aliasing correct if we can, but if going via a (char *) isn't doing it, then I really don't know how at this point. > Drop the existing workarounds. Also the (seemingly?) bogus > 'maybe-uninitialized' warning on the tcp_tap_handler() > tcp_hash() > > siphash_20b() path goes away with -fno-strict-aliasing on > siphash_20b(). >=20 > Signed-off-by: Stefano Brivio Reviewed-by: David Gibson > --- > Makefile | 21 --------------------- > checksum.c | 13 ++++++++++--- > siphash.c | 22 +++++++++++++++++++--- > tcp.c | 6 ------ > 4 files changed, 29 insertions(+), 33 deletions(-) >=20 > diff --git a/Makefile b/Makefile > index 080c748..667ddfb 100644 > --- a/Makefile > +++ b/Makefile > @@ -56,27 +56,6 @@ PASST_HEADERS =3D arch.h arp.h checksum.h conf.h dhcp.= h dhcpv6.h icmp.h \ > tcp_splice.h udp.h util.h > HEADERS =3D $(PASST_HEADERS) seccomp.h > =20 > -# On gcc 11 and 12, with -O2 and -flto, tcp_hash() and siphash_20b(), if > -# inlined, seem to be hitting something similar to: > -# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D78993 > -# from the pointer arithmetic used from the tcp_tap_handler() path to ge= t the > -# remote connection address. > -# > -# TODO: With the same combination, in ndp(), gcc optimises away the stor= e of > -# hop_limit in the IPv6 header (temporarily set to the protocol number f= or > -# convenience, to mimic the ICMPv6 checksum pseudo-header) before the ca= ll to > -# csum_unaligned(). Mark csum_unaligned() as "noipa" as a quick work-aro= und, > -# while we figure out if a corresponding gcc issue has already been repo= rted. > -ifeq (,$(filter-out 11 12, $(shell $(CC) -dumpversion))) > -ifneq (,$(filter -flto%,$(FLAGS) $(CFLAGS) $(CPPFLAGS))) > -ifneq (,$(filter -O2,$(FLAGS) $(CFLAGS) $(CPPFLAGS))) > - FLAGS +=3D -DTCP_HASH_NOINLINE > - FLAGS +=3D -DSIPHASH_20B_NOINLINE > - FLAGS +=3D -DCSUM_UNALIGNED_NO_IPA > -endif > -endif > -endif > - > C :=3D \#include \nstruct tcp_info x =3D { .tcpi_snd_wnd = =3D 0 }; > ifeq ($(shell printf "$(C)" | $(CC) -S -xc - -o - >/dev/null 2>&1; echo = $$?),0) > FLAGS +=3D -DHAS_SND_WND > diff --git a/checksum.c b/checksum.c > index 29769d9..9631f91 100644 > --- a/checksum.c > +++ b/checksum.c > @@ -69,6 +69,8 @@ > * > * Return: 32-bit sum of 16-bit words > */ > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > uint32_t sum_16b(const void *buf, size_t len) > { > const uint16_t *p =3D buf; > @@ -107,9 +109,8 @@ uint16_t csum_fold(uint32_t sum) > * > * Return: 16-bit IPv4-style checksum > */ > -#if CSUM_UNALIGNED_NO_IPA > -__attribute__((__noipa__)) /* See comment in Makefile */ > -#endif > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) > { > return (uint16_t)~csum_fold(sum_16b(buf, len) + init); > @@ -245,6 +246,8 @@ void csum_icmp6(struct icmp6hdr *icmp6hr, > * - sum_a/sum_b unpacking is interleaved and not sequential to reduce s= talls > * - coding style adaptation > */ > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > static uint32_t csum_avx2(const void *buf, size_t len, uint32_t init) > { > __m256i a, b, sum256, sum_a_hi, sum_a_lo, sum_b_hi, sum_b_lo, c, d; > @@ -391,6 +394,8 @@ less_than_128_bytes: > * > * Return: 16-bit folded, complemented checksum sum > */ > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > uint16_t csum(const void *buf, size_t len, uint32_t init) > { > return (uint16_t)~csum_fold(csum_avx2(buf, len, init)); > @@ -406,6 +411,8 @@ uint16_t csum(const void *buf, size_t len, uint32_t i= nit) > * > * Return: 16-bit folded, complemented checksum > */ > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > uint16_t csum(const void *buf, size_t len, uint32_t init) > { > return csum_unaligned(buf, len, init); > diff --git a/siphash.c b/siphash.c > index 811918b..e8b144d 100644 > --- a/siphash.c > +++ b/siphash.c > @@ -104,6 +104,17 @@ > * > * Return: the 64-bit hash output > */ > +/* Type-Based Alias Analysis (TBAA) optimisation in gcc 11 and 12 (-flto= -O2) > + * makes these functions essentially useless by allowing reordering of s= tores of > + * input data across function calls. Not even declaring @in as char poin= ter is > + * enough: disable gcc's interpretation of strict aliasing altogether. S= ee also: > + * > + * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D106706 > + * https://stackoverflow.com/questions/2958633/gcc-strict-aliasing-and-h= orror-stories > + * https://lore.kernel.org/all/alpine.LFD.2.00.0901121128080.6528__33422= =2E5328093909$1232291247$gmane$org@localhost.localdomain/ > + */ > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) > /* cppcheck-suppress unusedFunction */ > uint64_t siphash_8b(const uint8_t *in, const uint64_t *k) > { > @@ -123,6 +134,8 @@ uint64_t siphash_8b(const uint8_t *in, const uint64_t= *k) > * > * Return: 32 bits obtained by XORing the two halves of the 64-bit hash = output > */ > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > /* cppcheck-suppress unusedFunction */ > uint32_t siphash_12b(const uint8_t *in, const uint64_t *k) > { > @@ -148,9 +161,8 @@ uint32_t siphash_12b(const uint8_t *in, const uint64_= t *k) > * > * Return: the 64-bit hash output > */ > -#if SIPHASH_20B_NOINLINE > -__attribute__((__noinline__)) /* See comment in Makefile */ > -#endif > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > uint64_t siphash_20b(const uint8_t *in, const uint64_t *k) > { > uint32_t *in32 =3D (uint32_t *)in; > @@ -179,6 +191,8 @@ uint64_t siphash_20b(const uint8_t *in, const uint64_= t *k) > * > * Return: the 64-bit hash output > */ > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > /* cppcheck-suppress unusedFunction */ > uint32_t siphash_32b(const uint8_t *in, const uint64_t *k) > { > @@ -205,6 +219,8 @@ uint32_t siphash_32b(const uint8_t *in, const uint64_= t *k) > * > * Return: 32 bits obtained by XORing the two halves of the 64-bit hash = output > */ > +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */ > +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */ > uint32_t siphash_36b(const uint8_t *in, const uint64_t *k) > { > uint32_t *in32 =3D (uint32_t *)in; > diff --git a/tcp.c b/tcp.c > index 21c319d..cbd537e 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -1182,12 +1182,6 @@ static int tcp_hash_match(const struct tcp_tap_con= n *conn, > * > * Return: hash value, already modulo size of the hash table > */ > -#if TCP_HASH_NOINLINE > -__attribute__((__noinline__)) /* See comment in Makefile */ > -#endif > -__attribute__((optimize("O0"))) /* TODO: with -O2 and -flto on gcc 12.2, > - * siphash_20b() doesn't see 'addr', why? > - */ > static unsigned int tcp_hash(const struct ctx *c, const union inany_addr= *addr, > in_port_t tap_port, in_port_t sock_port) > { --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --Gef9geWATm7lfPi9 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmP8ilAACgkQzQJF27ox 2GegEQ//UvQ2TOd5ZbjOagzgGEX4yRIjjkiTdFVAw4N8AOjGQfsKeZCLlHnCM3XR ZHEZ4bqtu/je8hy0qjZZpT0hrNSkQCT5y6ImS/8bUEKmUA1FZIiegkMMJiggfuMS nsCBM7xfRKMwWL8O7RJ14OMYnlBn9mghbx69owLcLnZK/CgWEqJxDGKD1J+ziwJO LiybjmpDjrdEyKaoPnTLNGriFIjeMR3gz+c4KdpgO0UU01lJVOPyh2S2Kavx+N8S Gh5oc0Bmv54M/L3nbR5QNN+Bj/x6kakr7g4ZuyDbhIbrs5aO1IT/+FURICYE+pkc xzhJTW0adRuZLdknihfsTDk5fpgSl+3eTYbJlpjMndjOPckTmzkAkdaDFrE0boRo BXNKyfSdhMR1y5rgl+Khs1uWYh3hrSo6kWOm8cYzxlfkt7FIvkHR2IBlfQg+RtDP iVp4ApTu/++74nHbn0Jp53BGhLWhfLTuNM/7pZ0S/40/3BHG0YJRuPAcY2jSLCla tJr21DmlSNOA7QwL33gCgvjucabkeDTyuDf6t2KMV3mD3z+Kc03kvmzJQ8Shfrz0 mEoP8PI6IvZYTuxs3hPhRYLI4f14ORQSQlWdlrhIkUGJPFUOwwCoEPpZPHFQue+Z JJg9cerIE5NVw0R1mqYeEApo783eLDnnHQqC31Tz4Qmw0CrvRQo= =ZeMv -----END PGP SIGNATURE----- --Gef9geWATm7lfPi9--