public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH 0/3] More caution with NONBLOCK flag on Unix sockets
@ 2026-05-13  4:14 David Gibson
  2026-05-13  4:14 ` [PATCH 1/3] treewide: Add SOCK_CLOEXEC to accept() calls that are missing it David Gibson
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: David Gibson @ 2026-05-13  4:14 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

This is a revision of the one patch from my pesto series that wasn't
merged: a rework to the handling of the blocking flag for both
listening and accepted Unix sockets.  This new version is split into
several pieces to make the rationales clearer at each step.  It's also
a bit more cautious in what it does, so should avoid the problems with
the original version.

David Gibson (3):
  treewide: Add SOCK_CLOEXEC to accept() calls that are missing it
  conf, tap, repair: Uniformly use non-blocking accept() on Unix sockets
  conf, repair, tap: More caution about blocking flag on Unix sockets

 conf.c   | 10 +++++++---
 repair.c | 13 +++++++++----
 tap.c    | 10 ++++++++--
 util.c   |  2 +-
 4 files changed, 25 insertions(+), 10 deletions(-)

-- 
2.54.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] treewide: Add SOCK_CLOEXEC to accept() calls that are missing it
  2026-05-13  4:14 [PATCH 0/3] More caution with NONBLOCK flag on Unix sockets David Gibson
@ 2026-05-13  4:14 ` David Gibson
  2026-05-16 15:46   ` Stefano Brivio
  2026-05-13  4:14 ` [PATCH 2/3] conf, tap, repair: Uniformly use non-blocking accept() on Unix sockets David Gibson
  2026-05-13  4:14 ` [PATCH 3/3] conf, repair, tap: More caution about blocking flag " David Gibson
  2 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2026-05-13  4:14 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Generally we try to set the O_CLOEXEC flag on every fd we create.  This
seems to be generally accepted security best practice these days, and we
never fork(), so certainly have no need to pass fds to children.

A handful of accept4() calls on Unix sockets are missing the SOCK_CLOEXEC
flag to set this though.  Add the missing flag.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 repair.c | 5 +++--
 tap.c    | 4 ++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/repair.c b/repair.c
index 69c53077..3e0e3e0a 100644
--- a/repair.c
+++ b/repair.c
@@ -87,7 +87,7 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
 	/* Another client is already connected: accept and close right away. */
 	if (c->fd_repair != -1) {
 		int discard = accept4(c->fd_repair_listen, NULL, NULL,
-				      SOCK_NONBLOCK);
+				      SOCK_NONBLOCK | SOCK_CLOEXEC);
 
 		if (discard == -1)
 			return errno;
@@ -99,7 +99,8 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
 		return EEXIST;
 	}
 
-	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL, 0)) < 0) {
+	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL,
+				    SOCK_CLOEXEC)) < 0) {
 		rc = errno;
 		debug_perror("accept4() on TCP_REPAIR helper listening socket");
 		return rc;
diff --git a/tap.c b/tap.c
index 0920a325..e7cac9df 100644
--- a/tap.c
+++ b/tap.c
@@ -1477,7 +1477,7 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
 	/* Another client is already connected: accept and close right away. */
 	if (c->fd_tap != -1) {
 		int discard = accept4(c->fd_tap_listen, NULL, NULL,
-				      SOCK_NONBLOCK);
+				      SOCK_NONBLOCK | SOCK_CLOEXEC);
 
 		if (discard == -1)
 			return;
@@ -1490,7 +1490,7 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
 		return;
 	}
 
-	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, 0);
+	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, SOCK_CLOEXEC);
 
 	if (!getsockopt(c->fd_tap, SOL_SOCKET, SO_PEERCRED, &ucred, &len))
 		info("accepted connection from PID %i", ucred.pid);
-- 
2.54.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 2/3] conf, tap, repair: Uniformly use non-blocking accept() on Unix sockets
  2026-05-13  4:14 [PATCH 0/3] More caution with NONBLOCK flag on Unix sockets David Gibson
  2026-05-13  4:14 ` [PATCH 1/3] treewide: Add SOCK_CLOEXEC to accept() calls that are missing it David Gibson
@ 2026-05-13  4:14 ` David Gibson
  2026-05-13  5:51   ` David Gibson
  2026-05-13  4:14 ` [PATCH 3/3] conf, repair, tap: More caution about blocking flag " David Gibson
  2 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2026-05-13  4:14 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

sock_unix(), which creates a listening Unix socket, doesn't set the
SOCK_NONBLOCK flag, meaning that accept() will block if called with no
pending connections.  Generally, this doesn't matter because we only
accept() once we've received an epoll event indicating there's a pending
connection request.

Control connections (pesto) are an exception, because the way we queue
connections requires that we call accept() when we close one connection to
see if there's another one waiting.  We rely on an EAGAIN here to know that
there's nothing waiting.  To handle these we have an explicit fcntl() to
enable NONBLOCK on the control listening socket.

However, always using non-blocking accept() for Unix sockets would make
things a bit more uniform, and should be a bit less fragile in the case
that we ever somehow got a spurious connection event.  So, alter
sock_unix() to always use the SOCK_NONBLOCK flag.  Remove the control
socket's special case fcntl(), and adjust the error handling on each
Unix socket accept() for the new behaviour.  As a bonus the last adds
reporting for accept() errors on tap socket connections.

we will need non-blocking accept() for the upcoming control/configuration
socket.  Always add SOCK_NONBLOCK, which is more robust and in keeping with
the normal non-blocking style of passt.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 conf.c   | 4 +---
 repair.c | 4 ++--
 tap.c    | 5 +++++
 util.c   | 2 +-
 4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/conf.c b/conf.c
index 029b9c7c..dec43fca 100644
--- a/conf.c
+++ b/conf.c
@@ -1091,8 +1091,6 @@ static void conf_open_files(struct ctx *c)
 			die_perror("Couldn't open control socket %s",
 				   c->control_path);
 		}
-		if (fcntl(c->fd_control_listen, F_SETFL, O_NONBLOCK))
-			die_perror("Couldn't set O_NONBLOCK on control socket");
 	} else {
 		c->fd_control_listen = -1;
 	}
@@ -2087,7 +2085,7 @@ retry:
 	fd = accept4(c->fd_control_listen, NULL, NULL, SOCK_CLOEXEC);
 	if (fd < 0) {
 		if (errno != EAGAIN)
-			warn_perror("accept4() on configuration listening socket");
+			warn_perror("Error accept()ing configuration socket");
 		return;
 	}
 
diff --git a/repair.c b/repair.c
index 3e0e3e0a..42c4ae97 100644
--- a/repair.c
+++ b/repair.c
@@ -101,8 +101,8 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
 
 	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL,
 				    SOCK_CLOEXEC)) < 0) {
-		rc = errno;
-		debug_perror("accept4() on TCP_REPAIR helper listening socket");
+		if ((rc = errno) != EAGAIN)
+			warn_perror("Error accept()ing repair helper");
 		return rc;
 	}
 
diff --git a/tap.c b/tap.c
index e7cac9df..fda2da9b 100644
--- a/tap.c
+++ b/tap.c
@@ -1491,6 +1491,11 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
 	}
 
 	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, SOCK_CLOEXEC);
+	if (c->fd_tap < 0) {
+		if (errno != EAGAIN)
+			warn_perror("Error accepting tap client");
+		return;
+	}
 
 	if (!getsockopt(c->fd_tap, SOL_SOCKET, SO_PEERCRED, &ucred, &len))
 		info("accepted connection from PID %i", ucred.pid);
diff --git a/util.c b/util.c
index 73c9d51d..204391c7 100644
--- a/util.c
+++ b/util.c
@@ -238,7 +238,7 @@ int sock_l4_dualstack_any(const struct ctx *c, enum epoll_type type,
  */
 int sock_unix(char *sock_path)
 {
-	int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
+	int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK, 0);
 	struct sockaddr_un addr = {
 		.sun_family = AF_UNIX,
 	};
-- 
2.54.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 3/3] conf, repair, tap: More caution about blocking flag on Unix sockets
  2026-05-13  4:14 [PATCH 0/3] More caution with NONBLOCK flag on Unix sockets David Gibson
  2026-05-13  4:14 ` [PATCH 1/3] treewide: Add SOCK_CLOEXEC to accept() calls that are missing it David Gibson
  2026-05-13  4:14 ` [PATCH 2/3] conf, tap, repair: Uniformly use non-blocking accept() on Unix sockets David Gibson
@ 2026-05-13  4:14 ` David Gibson
  2026-05-16 15:46   ` Stefano Brivio
  2 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2026-05-13  4:14 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Most of our operation is asynchronous, based on non-blocking fds handled
in our epoll loop.  However, our several Unix sockets (tap client, repair
helper, control client) are all blocking fds after accept().

That's correct for the repair helper, and (for now) correct for the control
client.  However, the reasons for that might not be obvious, so add some
extra comments giving the rationale.

I don't believe it's correct for the tap client; having this socket be
blocking means we could potentially block the main loop if we ever got a
a spurious EPOLL{IN,OUT} event on the tap socket.  Switch the tap socket
to non-blocking for better robustness, and consistency with nearly every
other fd we track.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 conf.c   | 6 ++++++
 repair.c | 4 ++++
 tap.c    | 3 ++-
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/conf.c b/conf.c
index dec43fca..dc85f0f8 100644
--- a/conf.c
+++ b/conf.c
@@ -2082,6 +2082,12 @@ static void conf_accept(struct ctx *c)
 	int fd, rc;
 
 retry:
+	/* Currently we perform the configuration transaction more-or-less
+	 * synchronously, so we want the accepted socket to be blocking.
+	 *
+	 * FIXME: We should make the configuration update asynchronous, like
+	 * most of our operation, so a misbehaving configuration client can't
+	 * block the main forwarding loop */
 	fd = accept4(c->fd_control_listen, NULL, NULL, SOCK_CLOEXEC);
 	if (fd < 0) {
 		if (errno != EAGAIN)
diff --git a/repair.c b/repair.c
index 42c4ae97..8a2d119d 100644
--- a/repair.c
+++ b/repair.c
@@ -99,6 +99,10 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
 		return EEXIST;
 	}
 
+	/* We want accepted socket to be blocking; we use it during migration
+	 * which is a synchronous interruption to our normal non-blocking
+	 * behaviour.
+	 */
 	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL,
 				    SOCK_CLOEXEC)) < 0) {
 		if ((rc = errno) != EAGAIN)
diff --git a/tap.c b/tap.c
index fda2da9b..3b8a3f3d 100644
--- a/tap.c
+++ b/tap.c
@@ -1490,7 +1490,8 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
 		return;
 	}
 
-	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, SOCK_CLOEXEC);
+	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL,
+			    SOCK_NONBLOCK | SOCK_CLOEXEC);
 	if (c->fd_tap < 0) {
 		if (errno != EAGAIN)
 			warn_perror("Error accepting tap client");
-- 
2.54.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] conf, tap, repair: Uniformly use non-blocking accept() on Unix sockets
  2026-05-13  4:14 ` [PATCH 2/3] conf, tap, repair: Uniformly use non-blocking accept() on Unix sockets David Gibson
@ 2026-05-13  5:51   ` David Gibson
  2026-05-16 15:46     ` Stefano Brivio
  0 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2026-05-13  5:51 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev

[-- Attachment #1: Type: text/plain, Size: 4352 bytes --]

On Wed, May 13, 2026 at 02:14:22PM +1000, David Gibson wrote:
> sock_unix(), which creates a listening Unix socket, doesn't set the
> SOCK_NONBLOCK flag, meaning that accept() will block if called with no
> pending connections.  Generally, this doesn't matter because we only
> accept() once we've received an epoll event indicating there's a pending
> connection request.
> 
> Control connections (pesto) are an exception, because the way we queue
> connections requires that we call accept() when we close one connection to
> see if there's another one waiting.  We rely on an EAGAIN here to know that
> there's nothing waiting.  To handle these we have an explicit fcntl() to
> enable NONBLOCK on the control listening socket.
> 
> However, always using non-blocking accept() for Unix sockets would make
> things a bit more uniform, and should be a bit less fragile in the case
> that we ever somehow got a spurious connection event.  So, alter
> sock_unix() to always use the SOCK_NONBLOCK flag.  Remove the control
> socket's special case fcntl(), and adjust the error handling on each
> Unix socket accept() for the new behaviour.  As a bonus the last adds
> reporting for accept() errors on tap socket connections.

I didn't realise it, but adding that reporting also removes a valid,
if fairly minor coverity warning (at least with coverity 2026.3.0).

> we will need non-blocking accept() for the upcoming control/configuration
> socket.  Always add SOCK_NONBLOCK, which is more robust and in keeping with
> the normal non-blocking style of passt.

Oops.  This paragraph is left over from a previous version.  Can you
remove on merge, if there's no other reason to respin?

> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  conf.c   | 4 +---
>  repair.c | 4 ++--
>  tap.c    | 5 +++++
>  util.c   | 2 +-
>  4 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/conf.c b/conf.c
> index 029b9c7c..dec43fca 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -1091,8 +1091,6 @@ static void conf_open_files(struct ctx *c)
>  			die_perror("Couldn't open control socket %s",
>  				   c->control_path);
>  		}
> -		if (fcntl(c->fd_control_listen, F_SETFL, O_NONBLOCK))
> -			die_perror("Couldn't set O_NONBLOCK on control socket");
>  	} else {
>  		c->fd_control_listen = -1;
>  	}
> @@ -2087,7 +2085,7 @@ retry:
>  	fd = accept4(c->fd_control_listen, NULL, NULL, SOCK_CLOEXEC);
>  	if (fd < 0) {
>  		if (errno != EAGAIN)
> -			warn_perror("accept4() on configuration listening socket");
> +			warn_perror("Error accept()ing configuration socket");
>  		return;
>  	}
>  
> diff --git a/repair.c b/repair.c
> index 3e0e3e0a..42c4ae97 100644
> --- a/repair.c
> +++ b/repair.c
> @@ -101,8 +101,8 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
>  
>  	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL,
>  				    SOCK_CLOEXEC)) < 0) {
> -		rc = errno;
> -		debug_perror("accept4() on TCP_REPAIR helper listening socket");
> +		if ((rc = errno) != EAGAIN)
> +			warn_perror("Error accept()ing repair helper");
>  		return rc;
>  	}
>  
> diff --git a/tap.c b/tap.c
> index e7cac9df..fda2da9b 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1491,6 +1491,11 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
>  	}
>  
>  	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, SOCK_CLOEXEC);
> +	if (c->fd_tap < 0) {
> +		if (errno != EAGAIN)
> +			warn_perror("Error accepting tap client");
> +		return;
> +	}
>  
>  	if (!getsockopt(c->fd_tap, SOL_SOCKET, SO_PEERCRED, &ucred, &len))
>  		info("accepted connection from PID %i", ucred.pid);
> diff --git a/util.c b/util.c
> index 73c9d51d..204391c7 100644
> --- a/util.c
> +++ b/util.c
> @@ -238,7 +238,7 @@ int sock_l4_dualstack_any(const struct ctx *c, enum epoll_type type,
>   */
>  int sock_unix(char *sock_path)
>  {
> -	int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
> +	int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK, 0);
>  	struct sockaddr_un addr = {
>  		.sun_family = AF_UNIX,
>  	};
> -- 
> 2.54.0
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] treewide: Add SOCK_CLOEXEC to accept() calls that are missing it
  2026-05-13  4:14 ` [PATCH 1/3] treewide: Add SOCK_CLOEXEC to accept() calls that are missing it David Gibson
@ 2026-05-16 15:46   ` Stefano Brivio
  0 siblings, 0 replies; 8+ messages in thread
From: Stefano Brivio @ 2026-05-16 15:46 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Wed, 13 May 2026 14:14:21 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> Generally we try to set the O_CLOEXEC flag on every fd we create.  This
> seems to be generally accepted security best practice these days, and we
> never fork(), so certainly have no need to pass fds to children.

But we do clone() with CLONE_FILES (even though when we clone() to call
execvp() later, we don't set CLONE_FILES), so, even though I don't see
a reason to skip O_CLOEXEC for c->fd_tap, this conclusion shouldn't be
automatic from the fact we don't fork().

I spent some time on it and I really couldn't find a reason why we
don't have O_CLOEXEC there, so probably there isn't any, and I think
this patch is fine.

I would just change this paragraph to "[...] these days, and we don't
need to pass file descriptors to children."

> A handful of accept4() calls on Unix sockets are missing the SOCK_CLOEXEC
> flag to set this though.  Add the missing flag.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  repair.c | 5 +++--
>  tap.c    | 4 ++--
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/repair.c b/repair.c
> index 69c53077..3e0e3e0a 100644
> --- a/repair.c
> +++ b/repair.c
> @@ -87,7 +87,7 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
>  	/* Another client is already connected: accept and close right away. */
>  	if (c->fd_repair != -1) {
>  		int discard = accept4(c->fd_repair_listen, NULL, NULL,
> -				      SOCK_NONBLOCK);
> +				      SOCK_NONBLOCK | SOCK_CLOEXEC);
>  
>  		if (discard == -1)
>  			return errno;
> @@ -99,7 +99,8 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
>  		return EEXIST;
>  	}
>  
> -	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL, 0)) < 0) {
> +	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL,
> +				    SOCK_CLOEXEC)) < 0) {
>  		rc = errno;
>  		debug_perror("accept4() on TCP_REPAIR helper listening socket");
>  		return rc;
> diff --git a/tap.c b/tap.c
> index 0920a325..e7cac9df 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1477,7 +1477,7 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
>  	/* Another client is already connected: accept and close right away. */
>  	if (c->fd_tap != -1) {
>  		int discard = accept4(c->fd_tap_listen, NULL, NULL,
> -				      SOCK_NONBLOCK);
> +				      SOCK_NONBLOCK | SOCK_CLOEXEC);
>  
>  		if (discard == -1)
>  			return;
> @@ -1490,7 +1490,7 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
>  		return;
>  	}
>  
> -	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, 0);
> +	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, SOCK_CLOEXEC);
>  
>  	if (!getsockopt(c->fd_tap, SOL_SOCKET, SO_PEERCRED, &ucred, &len))
>  		info("accepted connection from PID %i", ucred.pid);

-- 
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] conf, repair, tap: More caution about blocking flag on Unix sockets
  2026-05-13  4:14 ` [PATCH 3/3] conf, repair, tap: More caution about blocking flag " David Gibson
@ 2026-05-16 15:46   ` Stefano Brivio
  0 siblings, 0 replies; 8+ messages in thread
From: Stefano Brivio @ 2026-05-16 15:46 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Wed, 13 May 2026 14:14:23 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> Most of our operation is asynchronous, based on non-blocking fds handled
> in our epoll loop.  However, our several Unix sockets (tap client, repair
> helper, control client) are all blocking fds after accept().
> 
> That's correct for the repair helper, and (for now) correct for the control
> client.  However, the reasons for that might not be obvious, so add some
> extra comments giving the rationale.
> 
> I don't believe it's correct for the tap client; having this socket be
> blocking means we could potentially block the main loop if we ever got a
> a spurious EPOLL{IN,OUT} event on the tap socket.  Switch the tap socket
> to non-blocking for better robustness, and consistency with nearly every
> other fd we track.

That socket needs to be blocking for the second usage we make of it in
tap_send_frames_passt(), that is, the one via write_remainder() without
MSG_DONTWAIT.

While a part of https://bugs.passt.top/show_bug.cgi?id=38 is solved
(there are no blocking reads left in tap_passt_input()), this isn't the
case for the writing side of it.

Some nits below:

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  conf.c   | 6 ++++++
>  repair.c | 4 ++++
>  tap.c    | 3 ++-
>  3 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/conf.c b/conf.c
> index dec43fca..dc85f0f8 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -2082,6 +2082,12 @@ static void conf_accept(struct ctx *c)
>  	int fd, rc;
>  
>  retry:
> +	/* Currently we perform the configuration transaction more-or-less
> +	 * synchronously, so we want the accepted socket to be blocking.
> +	 *
> +	 * FIXME: We should make the configuration update asynchronous, like
> +	 * most of our operation, so a misbehaving configuration client can't
> +	 * block the main forwarding loop */

	 * ... loop.
	 */

>  	fd = accept4(c->fd_control_listen, NULL, NULL, SOCK_CLOEXEC);
>  	if (fd < 0) {
>  		if (errno != EAGAIN)
> diff --git a/repair.c b/repair.c
> index 42c4ae97..8a2d119d 100644
> --- a/repair.c
> +++ b/repair.c
> @@ -99,6 +99,10 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
>  		return EEXIST;
>  	}
>  
> +	/* We want accepted socket to be blocking; we use it during migration

"the accepted socket"

> +	 * which is a synchronous interruption to our normal non-blocking
> +	 * behaviour.
> +	 */
>  	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL,
>  				    SOCK_CLOEXEC)) < 0) {
>  		if ((rc = errno) != EAGAIN)
> diff --git a/tap.c b/tap.c
> index fda2da9b..3b8a3f3d 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1490,7 +1490,8 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
>  		return;
>  	}
>  
> -	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, SOCK_CLOEXEC);
> +	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL,
> +			    SOCK_NONBLOCK | SOCK_CLOEXEC);

...so this part would need to be dropped. 

>  	if (c->fd_tap < 0) {
>  		if (errno != EAGAIN)
>  			warn_perror("Error accepting tap client");

The rest looks good to me.

-- 
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] conf, tap, repair: Uniformly use non-blocking accept() on Unix sockets
  2026-05-13  5:51   ` David Gibson
@ 2026-05-16 15:46     ` Stefano Brivio
  0 siblings, 0 replies; 8+ messages in thread
From: Stefano Brivio @ 2026-05-16 15:46 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Wed, 13 May 2026 15:51:27 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, May 13, 2026 at 02:14:22PM +1000, David Gibson wrote:
> > sock_unix(), which creates a listening Unix socket, doesn't set the
> > SOCK_NONBLOCK flag, meaning that accept() will block if called with no
> > pending connections.  Generally, this doesn't matter because we only
> > accept() once we've received an epoll event indicating there's a pending
> > connection request.
> > 
> > Control connections (pesto) are an exception, because the way we queue
> > connections requires that we call accept() when we close one connection to
> > see if there's another one waiting.  We rely on an EAGAIN here to know that
> > there's nothing waiting.  To handle these we have an explicit fcntl() to
> > enable NONBLOCK on the control listening socket.
> > 
> > However, always using non-blocking accept() for Unix sockets would make
> > things a bit more uniform, and should be a bit less fragile in the case
> > that we ever somehow got a spurious connection event.  So, alter
> > sock_unix() to always use the SOCK_NONBLOCK flag.  Remove the control
> > socket's special case fcntl(), and adjust the error handling on each
> > Unix socket accept() for the new behaviour.  As a bonus the last adds
> > reporting for accept() errors on tap socket connections.  
> 
> I didn't realise it, but adding that reporting also removes a valid,
> if fairly minor coverity warning (at least with coverity 2026.3.0).
> 
> > we will need non-blocking accept() for the upcoming control/configuration
> > socket.  Always add SOCK_NONBLOCK, which is more robust and in keeping with
> > the normal non-blocking style of passt.  
> 
> Oops.  This paragraph is left over from a previous version.  Can you
> remove on merge, if there's no other reason to respin?

I think the comments I'm raising (the one below and the one to 3/3)
actually warrant a respin at this point.

> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  conf.c   | 4 +---
> >  repair.c | 4 ++--
> >  tap.c    | 5 +++++
> >  util.c   | 2 +-
> >  4 files changed, 9 insertions(+), 6 deletions(-)
> > 
> > diff --git a/conf.c b/conf.c
> > index 029b9c7c..dec43fca 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -1091,8 +1091,6 @@ static void conf_open_files(struct ctx *c)
> >  			die_perror("Couldn't open control socket %s",
> >  				   c->control_path);
> >  		}
> > -		if (fcntl(c->fd_control_listen, F_SETFL, O_NONBLOCK))
> > -			die_perror("Couldn't set O_NONBLOCK on control socket");
> >  	} else {
> >  		c->fd_control_listen = -1;
> >  	}
> > @@ -2087,7 +2085,7 @@ retry:
> >  	fd = accept4(c->fd_control_listen, NULL, NULL, SOCK_CLOEXEC);
> >  	if (fd < 0) {
> >  		if (errno != EAGAIN)
> > -			warn_perror("accept4() on configuration listening socket");
> > +			warn_perror("Error accept()ing configuration socket");
> >  		return;
> >  	}
> >  
> > diff --git a/repair.c b/repair.c
> > index 3e0e3e0a..42c4ae97 100644
> > --- a/repair.c
> > +++ b/repair.c
> > @@ -101,8 +101,8 @@ int repair_listen_handler(struct ctx *c, uint32_t events)
> >  
> >  	if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL,
> >  				    SOCK_CLOEXEC)) < 0) {
> > -		rc = errno;
> > -		debug_perror("accept4() on TCP_REPAIR helper listening socket");
> > +		if ((rc = errno) != EAGAIN)
> > +			warn_perror("Error accept()ing repair helper");

See repair_wait() for the reason why this particular listening socket
needs to be blocking (with a timeout).

> >  		return rc;
> >  	}
> >  
> > diff --git a/tap.c b/tap.c
> > index e7cac9df..fda2da9b 100644
> > --- a/tap.c
> > +++ b/tap.c
> > @@ -1491,6 +1491,11 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
> >  	}
> >  
> >  	c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, SOCK_CLOEXEC);
> > +	if (c->fd_tap < 0) {
> > +		if (errno != EAGAIN)
> > +			warn_perror("Error accepting tap client");
> > +		return;
> > +	}
> >  
> >  	if (!getsockopt(c->fd_tap, SOL_SOCKET, SO_PEERCRED, &ucred, &len))
> >  		info("accepted connection from PID %i", ucred.pid);
> > diff --git a/util.c b/util.c
> > index 73c9d51d..204391c7 100644
> > --- a/util.c
> > +++ b/util.c
> > @@ -238,7 +238,7 @@ int sock_l4_dualstack_any(const struct ctx *c, enum epoll_type type,
> >   */
> >  int sock_unix(char *sock_path)
> >  {
> > -	int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
> > +	int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK, 0);
> >  	struct sockaddr_un addr = {
> >  		.sun_family = AF_UNIX,
> >  	};
> > -- 
> > 2.54.0

-- 
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2026-05-16 15:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-05-13  4:14 [PATCH 0/3] More caution with NONBLOCK flag on Unix sockets David Gibson
2026-05-13  4:14 ` [PATCH 1/3] treewide: Add SOCK_CLOEXEC to accept() calls that are missing it David Gibson
2026-05-16 15:46   ` Stefano Brivio
2026-05-13  4:14 ` [PATCH 2/3] conf, tap, repair: Uniformly use non-blocking accept() on Unix sockets David Gibson
2026-05-13  5:51   ` David Gibson
2026-05-16 15:46     ` Stefano Brivio
2026-05-13  4:14 ` [PATCH 3/3] conf, repair, tap: More caution about blocking flag " David Gibson
2026-05-16 15:46   ` Stefano Brivio

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).