On Sat, May 16, 2026 at 05:46:27PM +0200, Stefano Brivio wrote: > On Wed, 13 May 2026 15:51:27 +1000 > David Gibson wrote: > > > On Wed, May 13, 2026 at 02:14:22PM +1000, David Gibson wrote: > > > sock_unix(), which creates a listening Unix socket, doesn't set the > > > SOCK_NONBLOCK flag, meaning that accept() will block if called with no > > > pending connections. Generally, this doesn't matter because we only > > > accept() once we've received an epoll event indicating there's a pending > > > connection request. > > > > > > Control connections (pesto) are an exception, because the way we queue > > > connections requires that we call accept() when we close one connection to > > > see if there's another one waiting. We rely on an EAGAIN here to know that > > > there's nothing waiting. To handle these we have an explicit fcntl() to > > > enable NONBLOCK on the control listening socket. > > > > > > However, always using non-blocking accept() for Unix sockets would make > > > things a bit more uniform, and should be a bit less fragile in the case > > > that we ever somehow got a spurious connection event. So, alter > > > sock_unix() to always use the SOCK_NONBLOCK flag. Remove the control > > > socket's special case fcntl(), and adjust the error handling on each > > > Unix socket accept() for the new behaviour. As a bonus the last adds > > > reporting for accept() errors on tap socket connections. > > > > I didn't realise it, but adding that reporting also removes a valid, > > if fairly minor coverity warning (at least with coverity 2026.3.0). > > > > > we will need non-blocking accept() for the upcoming control/configuration > > > socket. Always add SOCK_NONBLOCK, which is more robust and in keeping with > > > the normal non-blocking style of passt. > > > > Oops. This paragraph is left over from a previous version. Can you > > remove on merge, if there's no other reason to respin? > > I think the comments I'm raising (the one below and the one to 3/3) > actually warrant a respin at this point. Agreed. > > > Signed-off-by: David Gibson > > > --- > > > conf.c | 4 +--- > > > repair.c | 4 ++-- > > > tap.c | 5 +++++ > > > util.c | 2 +- > > > 4 files changed, 9 insertions(+), 6 deletions(-) > > > > > > diff --git a/conf.c b/conf.c > > > index 029b9c7c..dec43fca 100644 > > > --- a/conf.c > > > +++ b/conf.c > > > @@ -1091,8 +1091,6 @@ static void conf_open_files(struct ctx *c) > > > die_perror("Couldn't open control socket %s", > > > c->control_path); > > > } > > > - if (fcntl(c->fd_control_listen, F_SETFL, O_NONBLOCK)) > > > - die_perror("Couldn't set O_NONBLOCK on control socket"); > > > } else { > > > c->fd_control_listen = -1; > > > } > > > @@ -2087,7 +2085,7 @@ retry: > > > fd = accept4(c->fd_control_listen, NULL, NULL, SOCK_CLOEXEC); > > > if (fd < 0) { > > > if (errno != EAGAIN) > > > - warn_perror("accept4() on configuration listening socket"); > > > + warn_perror("Error accept()ing configuration socket"); > > > return; > > > } > > > > > > diff --git a/repair.c b/repair.c > > > index 3e0e3e0a..42c4ae97 100644 > > > --- a/repair.c > > > +++ b/repair.c > > > @@ -101,8 +101,8 @@ int repair_listen_handler(struct ctx *c, uint32_t events) > > > > > > if ((c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL, > > > SOCK_CLOEXEC)) < 0) { > > > - rc = errno; > > > - debug_perror("accept4() on TCP_REPAIR helper listening socket"); > > > + if ((rc = errno) != EAGAIN) > > > + warn_perror("Error accept()ing repair helper"); > > See repair_wait() for the reason why this particular listening socket > needs to be blocking (with a timeout). Ah, good point. That makes this whole patch pretty ill-conceived. I'll ditch everything except the error reporting addition. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson