From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH] treewide: Disable gcc strict aliasing rules as needed, drop workarounds
Date: Mon, 27 Feb 2023 21:47:50 +1100 [thread overview]
Message-ID: <Y/yKVkgsmR9+sR5s@yekko> (raw)
In-Reply-To: <20230227095929.225622-1-sbrivio@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 10023 bytes --]
On Mon, Feb 27, 2023 at 10:59:29AM +0100, Stefano Brivio wrote:
> Recently, commit 4ddbcb9c0c55 ("tcp: Disable optimisations
> for tcp_hash()") worked around yet another issue we hit with gcc 12
> and '-flto -O2': some stores affecting the input data to siphash_20b()
> were omitted altogether, and tcp_hash() wouldn't get the correct hash
> for incoming connections.
>
> Digging further into this revealed that, at least according to gcc's
> interpretation of C99 aliasing rules, passing pointers to functions
> with different types compared to the effective type of the object
> (for example, a uint8_t pointer to an anonymous struct, as it happens
> in tcp_hash()), doesn't guarantee that stores are not reordered
> across the function call.
> This means that, in general, our checksum and hash functions might
> not see parts of input data that was intended to be provided by
> callers.
>
> Not even switching from uint8_t to character types, which should be
> appropriate here, according to C99 (ISO/IEC 9899, TC3, draft N1256),
> section 6.5, "Expressions", paragraph 7:
>
> An object shall have its stored value accessed only by an lvalue
> expression that has one of the following types:
>
> [...]
>
> — a character type.
Huh, weird. I certainly thought char/uint8_t pointers were explicitly
allowed to be aliased with anything, or an enormous number of C-isms
can't be counted on.
> does the trick. I guess this is also subject to interpretation:
> casting passed pointers to character types, and then using those as
> different types, might still violate (dubious) aliasing rules.
Well, at least we know why now.
> Disable gcc strict aliasing rules for potentially affected functions,
> which, in turn, disables gcc's Type-Based Alias Analysis (TBAA)
> optimisations based on those function arguments.
Excellent, I like this workaround much better than the previous one.
Who knows exactly what "-O0" mean, here it's much more explicit what
we're saying.
I'd still prefer to make the code strict-aliasing correct if we can,
but if going via a (char *) isn't doing it, then I really don't know
how at this point.
> Drop the existing workarounds. Also the (seemingly?) bogus
> 'maybe-uninitialized' warning on the tcp_tap_handler() > tcp_hash() >
> siphash_20b() path goes away with -fno-strict-aliasing on
> siphash_20b().
>
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> Makefile | 21 ---------------------
> checksum.c | 13 ++++++++++---
> siphash.c | 22 +++++++++++++++++++---
> tcp.c | 6 ------
> 4 files changed, 29 insertions(+), 33 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 080c748..667ddfb 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -56,27 +56,6 @@ PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h icmp.h \
> tcp_splice.h udp.h util.h
> HEADERS = $(PASST_HEADERS) seccomp.h
>
> -# On gcc 11 and 12, with -O2 and -flto, tcp_hash() and siphash_20b(), if
> -# inlined, seem to be hitting something similar to:
> -# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78993
> -# from the pointer arithmetic used from the tcp_tap_handler() path to get the
> -# remote connection address.
> -#
> -# TODO: With the same combination, in ndp(), gcc optimises away the store of
> -# hop_limit in the IPv6 header (temporarily set to the protocol number for
> -# convenience, to mimic the ICMPv6 checksum pseudo-header) before the call to
> -# csum_unaligned(). Mark csum_unaligned() as "noipa" as a quick work-around,
> -# while we figure out if a corresponding gcc issue has already been reported.
> -ifeq (,$(filter-out 11 12, $(shell $(CC) -dumpversion)))
> -ifneq (,$(filter -flto%,$(FLAGS) $(CFLAGS) $(CPPFLAGS)))
> -ifneq (,$(filter -O2,$(FLAGS) $(CFLAGS) $(CPPFLAGS)))
> - FLAGS += -DTCP_HASH_NOINLINE
> - FLAGS += -DSIPHASH_20B_NOINLINE
> - FLAGS += -DCSUM_UNALIGNED_NO_IPA
> -endif
> -endif
> -endif
> -
> C := \#include <linux/tcp.h>\nstruct tcp_info x = { .tcpi_snd_wnd = 0 };
> ifeq ($(shell printf "$(C)" | $(CC) -S -xc - -o - >/dev/null 2>&1; echo $$?),0)
> FLAGS += -DHAS_SND_WND
> diff --git a/checksum.c b/checksum.c
> index 29769d9..9631f91 100644
> --- a/checksum.c
> +++ b/checksum.c
> @@ -69,6 +69,8 @@
> *
> * Return: 32-bit sum of 16-bit words
> */
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> uint32_t sum_16b(const void *buf, size_t len)
> {
> const uint16_t *p = buf;
> @@ -107,9 +109,8 @@ uint16_t csum_fold(uint32_t sum)
> *
> * Return: 16-bit IPv4-style checksum
> */
> -#if CSUM_UNALIGNED_NO_IPA
> -__attribute__((__noipa__)) /* See comment in Makefile */
> -#endif
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init)
> {
> return (uint16_t)~csum_fold(sum_16b(buf, len) + init);
> @@ -245,6 +246,8 @@ void csum_icmp6(struct icmp6hdr *icmp6hr,
> * - sum_a/sum_b unpacking is interleaved and not sequential to reduce stalls
> * - coding style adaptation
> */
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> static uint32_t csum_avx2(const void *buf, size_t len, uint32_t init)
> {
> __m256i a, b, sum256, sum_a_hi, sum_a_lo, sum_b_hi, sum_b_lo, c, d;
> @@ -391,6 +394,8 @@ less_than_128_bytes:
> *
> * Return: 16-bit folded, complemented checksum sum
> */
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> uint16_t csum(const void *buf, size_t len, uint32_t init)
> {
> return (uint16_t)~csum_fold(csum_avx2(buf, len, init));
> @@ -406,6 +411,8 @@ uint16_t csum(const void *buf, size_t len, uint32_t init)
> *
> * Return: 16-bit folded, complemented checksum
> */
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> uint16_t csum(const void *buf, size_t len, uint32_t init)
> {
> return csum_unaligned(buf, len, init);
> diff --git a/siphash.c b/siphash.c
> index 811918b..e8b144d 100644
> --- a/siphash.c
> +++ b/siphash.c
> @@ -104,6 +104,17 @@
> *
> * Return: the 64-bit hash output
> */
> +/* Type-Based Alias Analysis (TBAA) optimisation in gcc 11 and 12 (-flto -O2)
> + * makes these functions essentially useless by allowing reordering of stores of
> + * input data across function calls. Not even declaring @in as char pointer is
> + * enough: disable gcc's interpretation of strict aliasing altogether. See also:
> + *
> + * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106706
> + * https://stackoverflow.com/questions/2958633/gcc-strict-aliasing-and-horror-stories
> + * https://lore.kernel.org/all/alpine.LFD.2.00.0901121128080.6528__33422.5328093909$1232291247$gmane$org@localhost.localdomain/
> + */
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing")))
> /* cppcheck-suppress unusedFunction */
> uint64_t siphash_8b(const uint8_t *in, const uint64_t *k)
> {
> @@ -123,6 +134,8 @@ uint64_t siphash_8b(const uint8_t *in, const uint64_t *k)
> *
> * Return: 32 bits obtained by XORing the two halves of the 64-bit hash output
> */
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> /* cppcheck-suppress unusedFunction */
> uint32_t siphash_12b(const uint8_t *in, const uint64_t *k)
> {
> @@ -148,9 +161,8 @@ uint32_t siphash_12b(const uint8_t *in, const uint64_t *k)
> *
> * Return: the 64-bit hash output
> */
> -#if SIPHASH_20B_NOINLINE
> -__attribute__((__noinline__)) /* See comment in Makefile */
> -#endif
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> uint64_t siphash_20b(const uint8_t *in, const uint64_t *k)
> {
> uint32_t *in32 = (uint32_t *)in;
> @@ -179,6 +191,8 @@ uint64_t siphash_20b(const uint8_t *in, const uint64_t *k)
> *
> * Return: the 64-bit hash output
> */
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> /* cppcheck-suppress unusedFunction */
> uint32_t siphash_32b(const uint8_t *in, const uint64_t *k)
> {
> @@ -205,6 +219,8 @@ uint32_t siphash_32b(const uint8_t *in, const uint64_t *k)
> *
> * Return: 32 bits obtained by XORing the two halves of the 64-bit hash output
> */
> +/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
> +__attribute__((optimize("-fno-strict-aliasing"))) /* See siphash_8b() */
> uint32_t siphash_36b(const uint8_t *in, const uint64_t *k)
> {
> uint32_t *in32 = (uint32_t *)in;
> diff --git a/tcp.c b/tcp.c
> index 21c319d..cbd537e 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -1182,12 +1182,6 @@ static int tcp_hash_match(const struct tcp_tap_conn *conn,
> *
> * Return: hash value, already modulo size of the hash table
> */
> -#if TCP_HASH_NOINLINE
> -__attribute__((__noinline__)) /* See comment in Makefile */
> -#endif
> -__attribute__((optimize("O0"))) /* TODO: with -O2 and -flto on gcc 12.2,
> - * siphash_20b() doesn't see 'addr', why?
> - */
> static unsigned int tcp_hash(const struct ctx *c, const union inany_addr *addr,
> in_port_t tap_port, in_port_t sock_port)
> {
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
prev parent reply other threads:[~2023-02-27 10:47 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-27 9:59 [PATCH] treewide: Disable gcc strict aliasing rules as needed, drop workarounds Stefano Brivio
2023-02-27 10:47 ` David Gibson [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y/yKVkgsmR9+sR5s@yekko \
--to=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).