public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH v2 0/2] Some fixes for valgrind suppressions
@ 2023-11-16  5:43 David Gibson
  2023-11-16  5:43 ` [PATCH v2 1/2] valgrind: Adjust suppression for MSG_TRUNC with NULL buffer David Gibson
  2023-11-16  5:43 ` [PATCH v2 2/2] valgrind: Don't disable optimizations for valgrind builds David Gibson
  0 siblings, 2 replies; 5+ messages in thread
From: David Gibson @ 2023-11-16  5:43 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

With the new versions of compiler and valgrind in Fedora 39, I hit
some spurious test failures.  Here are some workarounds.

Changes since v1:
 * Took entirely different approach, fixing the existing suppression
 * Extra patch allowing optimised valgrind builds

David Gibson (2):
  valgrind: Adjust suppression for MSG_TRUNC with NULL buffer
  valgrind: Don't disable optimizations for valgrind builds

 Makefile           | 2 +-
 tcp.c              | 9 +++++++++
 test/valgrind.supp | 3 +--
 3 files changed, 11 insertions(+), 3 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/2] valgrind: Adjust suppression for MSG_TRUNC with NULL buffer
  2023-11-16  5:43 [PATCH v2 0/2] Some fixes for valgrind suppressions David Gibson
@ 2023-11-16  5:43 ` David Gibson
  2023-11-16  7:21   ` Stefano Brivio
  2023-11-16  5:43 ` [PATCH v2 2/2] valgrind: Don't disable optimizations for valgrind builds David Gibson
  1 sibling, 1 reply; 5+ messages in thread
From: David Gibson @ 2023-11-16  5:43 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

valgrind complains if we pass a NULL buffer to recv(), even if we use
MSG_TRUNC, in which case it's actually safe.  For a long time we've had
a valgrind suppression for this.  It singles out the recv() in
tcp_sock_consume(), the only place we use MSG_TRUNC.

However, tcp_sock_consume() only has a single caller, which makes it a
prime candidate for inlining.  If inlined, it won't appear on the stack and
valgrind won't match the correct suppression.

It appears that certain compiler versions (for example gcc-13.2.1 in
Fedora 39) will inline this function even with the -O0 we use for valgrind
builds.  This breaks the suppression leading to a spurious failure in the
tests.

There's not really any way to adjust the suppression itself without making
it overly broad (we don't want to match other recv() calls).  So, as a hack
explicitly prevent inlining of this function when we're making a valgrind
build.  To accomplish this add an explicit -DVALGRIND when making such a
build.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 Makefile           | 2 +-
 tcp.c              | 9 +++++++++
 test/valgrind.supp | 3 +--
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index ff21459..57b2544 100644
--- a/Makefile
+++ b/Makefile
@@ -128,7 +128,7 @@ qrap: $(QRAP_SRCS) passt.h
 valgrind: EXTRA_SYSCALLS += rt_sigprocmask rt_sigtimedwait rt_sigaction	\
 			    getpid gettid kill clock_gettime mmap	\
 			    munmap open unlink gettimeofday futex
-valgrind: FLAGS:=-g -O0 $(filter-out -O%,$(FLAGS))
+valgrind: FLAGS:=-g -O0 $(filter-out -O%,$(FLAGS)) -DVALGRIND
 valgrind: all
 
 .PHONY: clean
diff --git a/tcp.c b/tcp.c
index cfcd40a..b86d433 100644
--- a/tcp.c
+++ b/tcp.c
@@ -2097,6 +2097,15 @@ static void tcp_conn_from_tap(struct ctx *c,
  *
  * Return: 0 on success, negative error code from recv() on failure
  */
+#ifdef VALGRIND
+/* valgrind doesn't realise that passing a NULL buffer to recv() is ok if using
+ * MSG_TRUNC.  We have a supression for this in the tests, but it relies on
+ * valgrind being able to see the tcp_sock_consume() stack frame, which it won't
+ * if this gets inlined.  This has a single caller making it a likely inlining
+ * candidate, and certain compiler versions will do so even at -O0.
+ */
+ __attribute__((noinline))
+#endif /* VALGRIND */
 static int tcp_sock_consume(const struct tcp_tap_conn *conn, uint32_t ack_seq)
 {
 	/* Simply ignore out-of-order ACKs: we already consumed the data we
diff --git a/test/valgrind.supp b/test/valgrind.supp
index 1228056..a158394 100644
--- a/test/valgrind.supp
+++ b/test/valgrind.supp
@@ -3,7 +3,6 @@
    passt_recv_MSG_TRUNC_into_NULL_buffer
    Memcheck:Param
    socketcall.recvfrom(buf)
-   fun:recv
    ...
-   fun:tcp_sock_consume*
+   fun:tcp_sock_consume
 }
-- 
@@ -3,7 +3,6 @@
    passt_recv_MSG_TRUNC_into_NULL_buffer
    Memcheck:Param
    socketcall.recvfrom(buf)
-   fun:recv
    ...
-   fun:tcp_sock_consume*
+   fun:tcp_sock_consume
 }
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] valgrind: Don't disable optimizations for valgrind builds
  2023-11-16  5:43 [PATCH v2 0/2] Some fixes for valgrind suppressions David Gibson
  2023-11-16  5:43 ` [PATCH v2 1/2] valgrind: Adjust suppression for MSG_TRUNC with NULL buffer David Gibson
@ 2023-11-16  5:43 ` David Gibson
  1 sibling, 0 replies; 5+ messages in thread
From: David Gibson @ 2023-11-16  5:43 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

When we plan to use valgrind, we need to build passt a bit differently:
  * We need debug symbols so that valgrind can match problems it finds to
    meaningful locations
  * We need to allow additional syscalls in the seccomp filter, because
    valgrind's wrappers need them

Currently we also disable optimization (-O0).  This is unfortunate, because
it will make valgrind tests even slower than they'd otherwise be.  Worse,
it's possible that the asm behaviour without optimization might be
different enough that valgrind could miss a use of uninitialized variable
or other fault it would detect.

I suspect this was originally done because without it inlining could mean
that suppressions we use don't reliably match the places we want them to.
Alas, it turns out this is true even with -O0.  We've now implemented a
more robust workaround for this (explicit ((noinline)) attributes when
compiled with -DVALGRIND).  So, we can re-enable optimization for valgrind
builds, making them closer to regular builds.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 57b2544..6c53695 100644
--- a/Makefile
+++ b/Makefile
@@ -128,7 +128,7 @@ qrap: $(QRAP_SRCS) passt.h
 valgrind: EXTRA_SYSCALLS += rt_sigprocmask rt_sigtimedwait rt_sigaction	\
 			    getpid gettid kill clock_gettime mmap	\
 			    munmap open unlink gettimeofday futex
-valgrind: FLAGS:=-g -O0 $(filter-out -O%,$(FLAGS)) -DVALGRIND
+valgrind: FLAGS += -g -DVALGRIND
 valgrind: all
 
 .PHONY: clean
-- 
@@ -128,7 +128,7 @@ qrap: $(QRAP_SRCS) passt.h
 valgrind: EXTRA_SYSCALLS += rt_sigprocmask rt_sigtimedwait rt_sigaction	\
 			    getpid gettid kill clock_gettime mmap	\
 			    munmap open unlink gettimeofday futex
-valgrind: FLAGS:=-g -O0 $(filter-out -O%,$(FLAGS)) -DVALGRIND
+valgrind: FLAGS += -g -DVALGRIND
 valgrind: all
 
 .PHONY: clean
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2] valgrind: Adjust suppression for MSG_TRUNC with NULL buffer
  2023-11-16  5:43 ` [PATCH v2 1/2] valgrind: Adjust suppression for MSG_TRUNC with NULL buffer David Gibson
@ 2023-11-16  7:21   ` Stefano Brivio
  2023-11-16  9:16     ` David Gibson
  0 siblings, 1 reply; 5+ messages in thread
From: Stefano Brivio @ 2023-11-16  7:21 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Thu, 16 Nov 2023 16:43:49 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> valgrind complains if we pass a NULL buffer to recv(), even if we use
> MSG_TRUNC, in which case it's actually safe.  For a long time we've had
> a valgrind suppression for this.  It singles out the recv() in
> tcp_sock_consume(), the only place we use MSG_TRUNC.
> 
> However, tcp_sock_consume() only has a single caller, which makes it a
> prime candidate for inlining.  If inlined, it won't appear on the stack and
> valgrind won't match the correct suppression.
> 
> It appears that certain compiler versions (for example gcc-13.2.1 in
> Fedora 39) will inline this function even with the -O0 we use for valgrind
> builds.  This breaks the suppression leading to a spurious failure in the
> tests.
> 
> There's not really any way to adjust the suppression itself without making
> it overly broad (we don't want to match other recv() calls).  So, as a hack
> explicitly prevent inlining of this function when we're making a valgrind
> build.  To accomplish this add an explicit -DVALGRIND when making such a
> build.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  Makefile           | 2 +-
>  tcp.c              | 9 +++++++++
>  test/valgrind.supp | 3 +--
>  3 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index ff21459..57b2544 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -128,7 +128,7 @@ qrap: $(QRAP_SRCS) passt.h
>  valgrind: EXTRA_SYSCALLS += rt_sigprocmask rt_sigtimedwait rt_sigaction	\
>  			    getpid gettid kill clock_gettime mmap	\
>  			    munmap open unlink gettimeofday futex
> -valgrind: FLAGS:=-g -O0 $(filter-out -O%,$(FLAGS))
> +valgrind: FLAGS:=-g -O0 $(filter-out -O%,$(FLAGS)) -DVALGRIND
>  valgrind: all
>  
>  .PHONY: clean
> diff --git a/tcp.c b/tcp.c
> index cfcd40a..b86d433 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -2097,6 +2097,15 @@ static void tcp_conn_from_tap(struct ctx *c,
>   *
>   * Return: 0 on success, negative error code from recv() on failure
>   */
> +#ifdef VALGRIND
> +/* valgrind doesn't realise that passing a NULL buffer to recv() is ok if using
> + * MSG_TRUNC.  We have a supression for this in the tests, but it relies on
                              ^ pp

(I can fix it up on merge but I guess you prefer to repost). Rest of the
series looks good to me.

-- 
Stefano


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2] valgrind: Adjust suppression for MSG_TRUNC with NULL buffer
  2023-11-16  7:21   ` Stefano Brivio
@ 2023-11-16  9:16     ` David Gibson
  0 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2023-11-16  9:16 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 2826 bytes --]

On Thu, Nov 16, 2023 at 08:21:34AM +0100, Stefano Brivio wrote:
> On Thu, 16 Nov 2023 16:43:49 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > valgrind complains if we pass a NULL buffer to recv(), even if we use
> > MSG_TRUNC, in which case it's actually safe.  For a long time we've had
> > a valgrind suppression for this.  It singles out the recv() in
> > tcp_sock_consume(), the only place we use MSG_TRUNC.
> > 
> > However, tcp_sock_consume() only has a single caller, which makes it a
> > prime candidate for inlining.  If inlined, it won't appear on the stack and
> > valgrind won't match the correct suppression.
> > 
> > It appears that certain compiler versions (for example gcc-13.2.1 in
> > Fedora 39) will inline this function even with the -O0 we use for valgrind
> > builds.  This breaks the suppression leading to a spurious failure in the
> > tests.
> > 
> > There's not really any way to adjust the suppression itself without making
> > it overly broad (we don't want to match other recv() calls).  So, as a hack
> > explicitly prevent inlining of this function when we're making a valgrind
> > build.  To accomplish this add an explicit -DVALGRIND when making such a
> > build.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  Makefile           | 2 +-
> >  tcp.c              | 9 +++++++++
> >  test/valgrind.supp | 3 +--
> >  3 files changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Makefile b/Makefile
> > index ff21459..57b2544 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -128,7 +128,7 @@ qrap: $(QRAP_SRCS) passt.h
> >  valgrind: EXTRA_SYSCALLS += rt_sigprocmask rt_sigtimedwait rt_sigaction	\
> >  			    getpid gettid kill clock_gettime mmap	\
> >  			    munmap open unlink gettimeofday futex
> > -valgrind: FLAGS:=-g -O0 $(filter-out -O%,$(FLAGS))
> > +valgrind: FLAGS:=-g -O0 $(filter-out -O%,$(FLAGS)) -DVALGRIND
> >  valgrind: all
> >  
> >  .PHONY: clean
> > diff --git a/tcp.c b/tcp.c
> > index cfcd40a..b86d433 100644
> > --- a/tcp.c
> > +++ b/tcp.c
> > @@ -2097,6 +2097,15 @@ static void tcp_conn_from_tap(struct ctx *c,
> >   *
> >   * Return: 0 on success, negative error code from recv() on failure
> >   */
> > +#ifdef VALGRIND
> > +/* valgrind doesn't realise that passing a NULL buffer to recv() is ok if using
> > + * MSG_TRUNC.  We have a supression for this in the tests, but it relies on
>                               ^ pp
> 
> (I can fix it up on merge but I guess you prefer to repost). Rest of the
> series looks good to me.

Oops, yes.  Resent.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-11-16  9:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-16  5:43 [PATCH v2 0/2] Some fixes for valgrind suppressions David Gibson
2023-11-16  5:43 ` [PATCH v2 1/2] valgrind: Adjust suppression for MSG_TRUNC with NULL buffer David Gibson
2023-11-16  7:21   ` Stefano Brivio
2023-11-16  9:16     ` David Gibson
2023-11-16  5:43 ` [PATCH v2 2/2] valgrind: Don't disable optimizations for valgrind builds David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).