public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top, Laurent Vivier <lvivier@redhat.com>
Subject: Re: [PATCH v2 7/8] Add interfaces and configuration bits for passt-repair
Date: Wed, 29 Jan 2025 09:46:10 +0100	[thread overview]
Message-ID: <20250129094610.148de3c6@elisabeth> (raw)
In-Reply-To: <Z5nGAzvWAG0l7NOe@zatzit>

On Wed, 29 Jan 2025 17:09:07 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Jan 29, 2025 at 12:39:39AM +0100, Stefano Brivio wrote:
> > In vhost-user mode, by default, create a second UNIX domain socket
> > accepting connections from passt-repair, with the usual listener
> > socket.
> > 
> > When we need to set or clear TCP_REPAIR on sockets, we'll send them
> > via SCM_RIGHTS to passt-repair, who sets the socket option values we
> > ask for.
> > 
> > To that end, introduce batched functions to request TCP_REPAIR
> > settings on sockets, so that we don't have to send a single message
> > for each socket, on migration. When needed, repair_flush() will
> > send the message and check for the reply.
> > 
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > ---
> >  Makefile     |  12 ++--
> >  conf.c       |  46 ++++++++++--
> >  epoll_type.h |   4 ++
> >  passt.1      |  11 +++
> >  passt.c      |   9 +++
> >  passt.h      |   7 ++
> >  repair.c     | 192 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  repair.h     |  16 +++++
> >  tap.c        |  65 +----------------
> >  util.c       |  62 +++++++++++++++++
> >  util.h       |   1 +
> >  11 files changed, 353 insertions(+), 72 deletions(-)
> >  create mode 100644 repair.c
> >  create mode 100644 repair.h
> > 
> > diff --git a/Makefile b/Makefile
> > index 1b71cb0..f67a20b 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -38,9 +38,9 @@ FLAGS += -DDUAL_STACK_SOCKETS=$(DUAL_STACK_SOCKETS)
> >  
> >  PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c flow.c fwd.c \
> >  	icmp.c igmp.c inany.c iov.c ip.c isolation.c lineread.c log.c mld.c \
> > -	ndp.c netlink.c migrate.c packet.c passt.c pasta.c pcap.c pif.c tap.c \
> > -	tcp.c tcp_buf.c tcp_splice.c tcp_vu.c udp.c udp_flow.c udp_vu.c util.c \
> > -	vhost_user.c virtio.c vu_common.c
> > +	ndp.c netlink.c migrate.c packet.c passt.c pasta.c pcap.c pif.c \
> > +	repair.c tap.c tcp.c tcp_buf.c tcp_splice.c tcp_vu.c udp.c udp_flow.c \
> > +	udp_vu.c util.c vhost_user.c virtio.c vu_common.c
> >  QRAP_SRCS = qrap.c
> >  PASST_REPAIR_SRCS = passt-repair.c
> >  SRCS = $(PASST_SRCS) $(QRAP_SRCS) $(PASST_REPAIR_SRCS)
> > @@ -50,9 +50,9 @@ MANPAGES = passt.1 pasta.1 qrap.1
> >  PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h flow.h fwd.h \
> >  	flow_table.h icmp.h icmp_flow.h inany.h iov.h ip.h isolation.h \
> >  	lineread.h log.h migrate.h ndp.h netlink.h packet.h passt.h pasta.h \
> > -	pcap.h pif.h siphash.h tap.h tcp.h tcp_buf.h tcp_conn.h tcp_internal.h \
> > -	tcp_splice.h tcp_vu.h udp.h udp_flow.h udp_internal.h udp_vu.h util.h \
> > -	vhost_user.h virtio.h vu_common.h
> > +	pcap.h pif.h repair.h siphash.h tap.h tcp.h tcp_buf.h tcp_conn.h \
> > +	tcp_internal.h tcp_splice.h tcp_vu.h udp.h udp_flow.h udp_internal.h \
> > +	udp_vu.h util.h vhost_user.h virtio.h vu_common.h
> >  HEADERS = $(PASST_HEADERS) seccomp.h
> >  
> >  C := \#include <sys/random.h>\nint main(){int a=getrandom(0, 0, 0);}
> > diff --git a/conf.c b/conf.c
> > index df2b016..85dec44 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -816,6 +816,9 @@ static void usage(const char *name, FILE *f, int status)
> >  			"    UNIX domain socket is provided by -s option\n"
> >  			"  --print-capabilities	print back-end capabilities in JSON format,\n"
> >  			"    only meaningful for vhost-user mode\n");
> > +		FPRINTF(f,
> > +			"  --repair-path PATH	path for passt-repair(1)\n"  
> 
> Nit: as a privileged helper, should it be passt-repair(8)?

So, I spent a couple of minutes on this as I wrote this, and I spent a
bit longer now: the most authoritative definition I can find of section
8 is from man-pages(7):

       8 System management commands
              Commands like mount(8), many of which only root can
              execute.

It's not really a system management command, and the idea is to run it
with CAP_NET_ADMIN, but not necessarily as root. So I would rather keep
it in section 1, unless there's some other conflicting definition I'm
not aware of.

There's also the topic of where it should be installed (/sbin,
/usr/sbin/, /bin, /usr/bin). I'd pick /usr/bin, because /sbin doesn't
really mean much nowadays, and it's anyway fitting with the FHS 3.0:

  Utilities used for system administration (and other root-only
  commands) are stored in /sbin, /usr/ sbin, and /usr/local/sbin.
  
  /sbin contains binaries essential for booting, restoring, recovering,
  and/or repairing the system in addition to the binaries in /bin.
  18 Programs executed after /usr is known to be mounted (when there
  are no problems) are generally placed into /usr/sbin.

...and I don't think this helper would qualify for /sbin or /usr/sbin.

> 
> > +			"    default: append '.repair' to UNIX domain path\n");
> >  	}
> >  
> >  	FPRINTF(f,
> > @@ -1240,8 +1243,30 @@ static void conf_nat(const char *arg, struct in_addr *addr4,
> >   */
> >  static void conf_open_files(struct ctx *c)
> >  {
> > -	if (c->mode != MODE_PASTA && c->fd_tap == -1)
> > -		c->fd_tap_listen = tap_sock_unix_open(c->sock_path);
> > +	if (c->mode != MODE_PASTA && c->fd_tap == -1) {
> > +		c->fd_tap_listen = sock_unix(c->sock_path);
> > +
> > +		if (c->mode == MODE_VU && strcmp(c->repair_path, "none")) {
> > +			if (!strncmp(c->repair_path, "./", 2)) {
> > +				memmove(c->repair_path, c->repair_path + 2,
> > +					sizeof(c->repair_path) - 2);
> > +			}  
> 
> Do you need this? Shouldn't "./whatever" be usable as-is?

Ah, yes, I didn't know. I explicitly added this for the '--repair-path
./none' case, but it's not actually needed. I'll drop this.

> > +
> > +			if (!*c->repair_path &&
> > +			    snprintf_check(c->repair_path,
> > +					   sizeof(c->repair_path), "%s.repair",
> > +					   c->sock_path)) {
> > +				warn("passt-repair path %s not usable",
> > +				     c->repair_path);  
> 
> I'd prefer a die() here - I think omitting a possibly expected
> feature, with just a warning that could easily be lost in the logs is
> not a good idea.

I was thinking about that, but should we really risk *not* starting
because with ".repair" the path is now too long, for a feature that,
realistically, most users won't actually use?

If we're started by any kind of framework (which is where we run the
risk of the warning being ignored), then there should be explicit
checks about the path and the usability of passt-repair (or equivalent).

> > +				c->fd_repair_listen = -1;
> > +			} else {
> > +				c->fd_repair_listen = sock_unix(c->repair_path);
> > +			}
> > +		} else {
> > +			c->fd_repair_listen = -1;
> > +		}
> > +		c->fd_repair = -1;
> > +	}
> >  
> >  	if (*c->pidfile) {
> >  		c->pidfile_fd = output_file_open(c->pidfile, O_WRONLY);
> > @@ -1354,9 +1379,12 @@ void conf(struct ctx *c, int argc, char **argv)
> >  		{"host-lo-to-ns-lo", no_argument, 	NULL,		23 },
> >  		{"dns-host",	required_argument,	NULL,		24 },
> >  		{"vhost-user",	no_argument,		NULL,		25 },
> > +
> >  		/* vhost-user backend program convention */
> >  		{"print-capabilities", no_argument,	NULL,		26 },
> >  		{"socket-path",	required_argument,	NULL,		's' },
> > +
> > +		{"repair-path",	required_argument,	NULL,		27 },
> >  		{ 0 },
> >  	};
> >  	const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt";
> > @@ -1824,8 +1852,8 @@ void conf(struct ctx *c, int argc, char **argv)
> >  	if (c->ifi4 && IN4_IS_ADDR_UNSPECIFIED(&c->ip4.guest_gw))
> >  		c->no_dhcp = 1;
> >  
> > -	/* Inbound port options & DNS can be parsed now (after IPv4/IPv6
> > -	 * settings)
> > +	/* Inbound port options, DNS, and --repair-path can be parsed now, after
> > +	 * IPv4/IPv6 settings and --vhost-user.
> >  	 */
> >  	fwd_probe_ephemeral();
> >  	udp_portmap_clear();
> > @@ -1871,6 +1899,16 @@ void conf(struct ctx *c, int argc, char **argv)
> >  			}
> >  
> >  			die("Cannot use DNS address %s", optarg);
> > +		} else if (name == 27) {
> > +			if (c->mode != MODE_VU && strcmp(optarg, "none"))
> > +				die("--repair-path is for vhost-user mode only");
> > +
> > +			if (snprintf_check(c->repair_path,
> > +					   sizeof(c->repair_path), "%s",
> > +					   optarg))
> > +				die("Invalid passt-repair path: %s", optarg);
> > +
> > +			break;
> >  		}
> >  	} while (name != -1);
> >  
> > diff --git a/epoll_type.h b/epoll_type.h
> > index fd9eac3..706238a 100644
> > --- a/epoll_type.h
> > +++ b/epoll_type.h
> > @@ -42,6 +42,10 @@ enum epoll_type {
> >  	EPOLL_TYPE_VHOST_KICK,
> >  	/* vhost-user migration socket */
> >  	EPOLL_TYPE_VHOST_MIGRATION,
> > +	/* TCP_REPAIR helper listening socket */
> > +	EPOLL_TYPE_REPAIR_LISTEN,
> > +	/* TCP_REPAIR helper socket */
> > +	EPOLL_TYPE_REPAIR,
> >  
> >  	EPOLL_NUM_TYPES,
> >  };
> > diff --git a/passt.1 b/passt.1
> > index d9cd33e..63a3a01 100644
> > --- a/passt.1
> > +++ b/passt.1
> > @@ -418,6 +418,17 @@ Enable vhost-user. The vhost-user command socket is provided by \fB--socket\fR.
> >  .BR \-\-print-capabilities
> >  Print back-end capabilities in JSON format, only meaningful for vhost-user mode.
> >  
> > +.TP
> > +.BR \-\-repair-path " " \fIpath
> > +Path for UNIX domain socket used by the \fBpasst-repair\fR(1) helper to connect  
> 
> passt-repair(8)?

See above.

> > +to \fBpasst\fR in order to set or clear the TCP_REPAIR option on sockets, during
> > +migration. \fB--repair-path none\fR disables this interface (if you need to
> > +specify a socket path called "none" you can prefix the path by \fI./\fR).
> > +
> > +Default, for \-\-vhost-user mode only, is to append \fI.repair\fR to the path
> > +chosen for the hypervisor UNIX domain socket. No socket is created if not in
> > +\-\-vhost-user mode.
> > +
> >  .TP
> >  .BR \-F ", " \-\-fd " " \fIFD
> >  Pass a pre-opened, connected socket to \fBpasst\fR. Usually the socket is opened
> > diff --git a/passt.c b/passt.c
> > index 184d4e5..1fa2ddd 100644
> > --- a/passt.c
> > +++ b/passt.c
> > @@ -51,6 +51,7 @@
> >  #include "tcp_splice.h"
> >  #include "ndp.h"
> >  #include "vu_common.h"
> > +#include "repair.h"
> >  
> >  #define EPOLL_EVENTS		8
> >  
> > @@ -76,6 +77,8 @@ char *epoll_type_str[] = {
> >  	[EPOLL_TYPE_VHOST_CMD]		= "vhost-user command socket",
> >  	[EPOLL_TYPE_VHOST_KICK]		= "vhost-user kick socket",
> >  	[EPOLL_TYPE_VHOST_MIGRATION]	= "vhost-user migration socket",
> > +	[EPOLL_TYPE_REPAIR_LISTEN]	= "TCP_REPAIR helper listening socket",
> > +	[EPOLL_TYPE_REPAIR]		= "TCP_REPAIR helper socket",
> >  };
> >  static_assert(ARRAY_SIZE(epoll_type_str) == EPOLL_NUM_TYPES,
> >  	      "epoll_type_str[] doesn't match enum epoll_type");
> > @@ -360,6 +363,12 @@ loop:
> >  		case EPOLL_TYPE_VHOST_MIGRATION:
> >  			vu_migrate(&c, eventmask);
> >  			break;
> > +		case EPOLL_TYPE_REPAIR_LISTEN:
> > +			repair_listen_handler(&c, eventmask);
> > +			break;
> > +		case EPOLL_TYPE_REPAIR:
> > +			repair_handler(&c, eventmask);
> > +			break;
> >  		default:
> >  			/* Can't happen */
> >  			ASSERT(0);
> > diff --git a/passt.h b/passt.h
> > index 0dd4efa..85b0a10 100644
> > --- a/passt.h
> > +++ b/passt.h
> > @@ -20,6 +20,7 @@ union epoll_ref;
> >  #include "siphash.h"
> >  #include "ip.h"
> >  #include "inany.h"
> > +#include "migrate.h"
> >  #include "flow.h"
> >  #include "icmp.h"
> >  #include "fwd.h"
> > @@ -193,6 +194,7 @@ struct ip6_ctx {
> >   * @foreground:		Run in foreground, don't log to stderr by default
> >   * @nofile:		Maximum number of open files (ulimit -n)
> >   * @sock_path:		Path for UNIX domain socket
> > + * @repair_path:	TCP_REPAIR helper path, can be "none", empty for default
> >   * @pcap:		Path for packet capture file
> >   * @pidfile:		Path to PID file, empty string if not configured
> >   * @pidfile_fd:		File descriptor for PID file, -1 if none
> > @@ -203,6 +205,8 @@ struct ip6_ctx {
> >   * @epollfd:		File descriptor for epoll instance
> >   * @fd_tap_listen:	File descriptor for listening AF_UNIX socket, if any
> >   * @fd_tap:		AF_UNIX socket, tuntap device, or pre-opened socket
> > + * @fd_repair_listen:	File descriptor for listening TCP_REPAIR socket, if any
> > + * @fd_repair:		Connected AF_UNIX socket for TCP_REPAIR helper
> >   * @our_tap_mac:	Pasta/passt's MAC on the tap link
> >   * @guest_mac:		MAC address of guest or namespace, seen or configured
> >   * @hash_secret:	128-bit secret for siphash functions
> > @@ -244,6 +248,7 @@ struct ctx {
> >  	int foreground;
> >  	int nofile;
> >  	char sock_path[UNIX_PATH_MAX];
> > +	char repair_path[UNIX_PATH_MAX];
> >  	char pcap[PATH_MAX];
> >  
> >  	char pidfile[PATH_MAX];
> > @@ -260,6 +265,8 @@ struct ctx {
> >  	int epollfd;
> >  	int fd_tap_listen;
> >  	int fd_tap;
> > +	int fd_repair_listen;
> > +	int fd_repair;
> >  	unsigned char our_tap_mac[ETH_ALEN];
> >  	unsigned char guest_mac[ETH_ALEN];
> >  	uint64_t hash_secret[2];
> > diff --git a/repair.c b/repair.c
> > new file mode 100644
> > index 0000000..24966f5
> > --- /dev/null
> > +++ b/repair.c
> > @@ -0,0 +1,192 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +
> > +/* PASST - Plug A Simple Socket Transport
> > + *  for qemu/UNIX domain socket mode
> > + *
> > + * PASTA - Pack A Subtle Tap Abstraction
> > + *  for network namespace/tap device mode
> > + *
> > + * repair.c - Interface (server) for passt-repair, set/clear TCP_REPAIR
> > + *
> > + * Copyright (c) 2025 Red Hat GmbH
> > + * Author: Stefano Brivio <sbrivio@redhat.com>
> > + */
> > +
> > +#include <errno.h>
> > +#include <sys/uio.h>
> > +
> > +#include "util.h"
> > +#include "ip.h"
> > +#include "passt.h"
> > +#include "inany.h"
> > +#include "flow.h"
> > +#include "flow_table.h"
> > +
> > +#include "repair.h"
> > +
> > +#define SCM_MAX_FD 253 /* From Linux kernel (include/net/scm.h), not in UAPI */
> > +
> > +static int fds[SCM_MAX_FD];  
> 
> Even 'static', I'd prefer a longer name for a global variable.

A longer name for this also means a longer name for nfds below, which
is not necessarily practical, but I'll check and try to change these to
repair_fds[] / repair_nfds if doable.

> > +static int current_cmd;  
> 
> Is there any particular rationale behind these being globals, whereas
> fd_repair is in struct ctx?  AFAICT they're basically equally global
> in practice.

Yes: c->fd_repair_listen needs to be in struct ctx, and I want to keep
this consistent with c->fd_tap_listen / c->fd_tap.

Besides, the variables declared here are really hacks representing
state for this compilation unit only.

> > +static int nfds;
> > +
> > +/**
> > + * repair_sock_init() - Start listening for connections on helper socket
> > + * @c:		Execution context
> > + */
> > +void repair_sock_init(const struct ctx *c)
> > +{
> > +	union epoll_ref ref = { .type = EPOLL_TYPE_REPAIR_LISTEN };
> > +	struct epoll_event ev = { 0 };
> > +
> > +	listen(c->fd_repair_listen, 0);
> > +
> > +	ref.fd = c->fd_repair_listen;
> > +	ev.events = EPOLLIN | EPOLLHUP | EPOLLET;
> > +	ev.data.u64 = ref.u64;
> > +	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_repair_listen, &ev);
> > +}
> > +
> > +/**
> > + * repair_listen_handler() - Handle events on TCP_REPAIR helper listening socket
> > + * @c:		Execution context
> > + * @events:	epoll events
> > + */
> > +void repair_listen_handler(struct ctx *c, uint32_t events)
> > +{
> > +	union epoll_ref ref = { .type = EPOLL_TYPE_REPAIR };
> > +	struct epoll_event ev = { 0 };
> > +	struct ucred ucred;
> > +	socklen_t len;
> > +
> > +	if (events != EPOLLIN) {
> > +		debug("Spurious event 0x%04x on TCP_REPAIR helper socket",
> > +		      events);
> > +		return;
> > +	}
> > +
> > +	len = sizeof(ucred);
> > +
> > +	/* Another client is already connected: accept and close right away. */  
> 
> For the repair socket, last-connection-wins would make more sense to
> me than first-connection-wins.  While hacking/debugging seems it might
> be useful to fix something in the passt-repair, re-run it and have it
> displace the stale version for existing passt instances.

In the whole debugging I've done so far I'm actually using passt-repair
(otherwise I don't even start it), so it already terminates once it's
done.

I think it's more secure to keep it like this, because it's more robust
against races.

Let's say we have an issue in KubeVirt's virt-handler so that the
migration starts a bit before the helper is started, and somebody
manages to use this time window to connect another helper, this attempt
will actually go unnoticed.

If passt-repair fails to connect, it will be very obvious.

> > +	if (c->fd_repair != -1) {
> > +		int discard = accept4(c->fd_repair_listen, NULL, NULL,
> > +				      SOCK_NONBLOCK);
> > +
> > +		if (discard == -1)
> > +			return;
> > +
> > +		if (!getsockopt(discard, SOL_SOCKET, SO_PEERCRED, &ucred, &len))
> > +			info("Discarding TCP_REPAIR helper, PID %i", ucred.pid);
> > +
> > +		close(discard);
> > +		return;
> > +	}
> > +
> > +	c->fd_repair = accept4(c->fd_repair_listen, NULL, NULL, 0);
> > +
> > +	if (!getsockopt(c->fd_repair, SOL_SOCKET, SO_PEERCRED, &ucred, &len))
> > +		info("Accepted TCP_REPAIR helper, PID %i", ucred.pid);
> > +
> > +	ref.fd = c->fd_repair;
> > +	ev.events = EPOLLHUP | EPOLLET;
> > +	ev.data.u64 = ref.u64;
> > +	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_repair, &ev);
> > +}
> > +
> > +/**
> > + * repair_close() - Close connection to TCP_REPAIR helper
> > + * @c:		Execution context
> > + */
> > +void repair_close(struct ctx *c)
> > +{
> > +	debug("Closing TCP_REPAIR helper socket");
> > +
> > +	epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_repair, NULL);
> > +	close(c->fd_repair);
> > +	c->fd_repair = -1;
> > +}
> > +
> > +/**
> > + * repair_handler() - Handle EPOLLHUP and EPOLLERR on TCP_REPAIR helper socket
> > + * @c:		Execution context
> > + * @events:	epoll events
> > + */
> > +void repair_handler(struct ctx *c, uint32_t events)
> > +{
> > +	(void)events;
> > +
> > +	repair_close(c);
> > +}
> > +
> > +/**
> > + * repair_flush() - Flush current set of sockets to helper, with current command
> > + * @c:		Execution context
> > + *
> > + * Return: 0 on success, negative error code on failure
> > + */
> > +int repair_flush(struct ctx *c)
> > +{
> > +	struct iovec iov = { &((int8_t){ current_cmd }), sizeof(int8_t) };
> > +	char buf[CMSG_SPACE(sizeof(int) * SCM_MAX_FD)]
> > +	     __attribute__ ((aligned(__alignof__(struct cmsghdr))));
> > +	struct cmsghdr *cmsg;
> > +	struct msghdr msg;
> > +	int ret = 0;
> > +
> > +	if (!nfds)
> > +		return 0;
> > +
> > +	msg = (struct msghdr){ NULL, 0, &iov, 1,
> > +			       buf, CMSG_SPACE(sizeof(int) * nfds), 0 };
> > +	cmsg = CMSG_FIRSTHDR(&msg);
> > +
> > +	cmsg->cmsg_level = SOL_SOCKET;
> > +	cmsg->cmsg_type = SCM_RIGHTS;
> > +	cmsg->cmsg_len = CMSG_LEN(sizeof(int) * nfds);
> > +	memcpy(CMSG_DATA(cmsg), fds, sizeof(int) * nfds);
> > +
> > +	nfds = 0;
> > +
> > +	if (sendmsg(c->fd_repair, &msg, 0) < 0) {
> > +		ret = -errno;  
> 
> This error code won't be reported to the caller: you'll continue on to
> the recv() below, which will return EBADF, clobbering ret.

Oops, right, I'll return early in this case (or equivalent).

> > +		err_perror("Failed to send sockets to TCP_REPAIR helper");
> > +		repair_close(c);
> > +	}
> > +
> > +	if (recv(c->fd_repair, &((int8_t){ 0 }), 1, 0) < 0) {
> > +		ret = -errno;
> > +		err_perror("Failed to receive reply from TCP_REPAIR helper");
> > +		repair_close(c);
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +/**
> > + * repair_flush() - Add socket to TCP_REPAIR set with given command
> > + * @c:		Execution context
> > + * @s:		Socket to add
> > + * @cmd:	TCP_REPAIR_ON, TCP_REPAIR_OFF, or TCP_REPAIR_OFF_NO_WP
> > + *
> > + * Return: 0 on success, negative error code on failure
> > + */
> > +/* cppcheck-suppress unusedFunction */
> > +int repair_set(struct ctx *c, int s, int cmd)
> > +{
> > +	int rc;
> > +
> > +	if (nfds && current_cmd != cmd) {
> > +		if ((rc = repair_flush(c)))
> > +			return rc;
> > +	}
> > +
> > +	current_cmd = cmd;
> > +	fds[nfds++] = s;
> > +
> > +	if (nfds >= SCM_MAX_FD) {
> > +		if ((rc = repair_flush(c)))
> > +			return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > diff --git a/repair.h b/repair.h
> > new file mode 100644
> > index 0000000..693c515
> > --- /dev/null
> > +++ b/repair.h
> > @@ -0,0 +1,16 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later
> > + * Copyright (c) 2025 Red Hat GmbH
> > + * Author: Stefano Brivio <sbrivio@redhat.com>
> > + */
> > + 
> > +#ifndef REPAIR_H
> > +#define REPAIR_H
> > +
> > +void repair_sock_init(const struct ctx *c);
> > +void repair_listen_handler(struct ctx *c, uint32_t events);
> > +void repair_handler(struct ctx *c, uint32_t events);
> > +void repair_close(struct ctx *c);
> > +int repair_flush(struct ctx *c);
> > +int repair_set(struct ctx *c, int s, int cmd);
> > +
> > +#endif /* REPAIR_H */
> > diff --git a/tap.c b/tap.c
> > index cd32a90..0e60eb4 100644
> > --- a/tap.c
> > +++ b/tap.c
> > @@ -56,6 +56,7 @@
> >  #include "netlink.h"
> >  #include "pasta.h"
> >  #include "packet.h"
> > +#include "repair.h"
> >  #include "tap.h"
> >  #include "log.h"
> >  #include "vhost_user.h"
> > @@ -1151,68 +1152,6 @@ void tap_handler_pasta(struct ctx *c, uint32_t events,
> >  		tap_pasta_input(c, now);
> >  }
> >  
> > -/**
> > - * tap_sock_unix_open() - Create and bind AF_UNIX socket
> > - * @sock_path:	Socket path. If empty, set on return (UNIX_SOCK_PATH as prefix)
> > - *
> > - * Return: socket descriptor on success, won't return on failure
> > - */
> > -int tap_sock_unix_open(char *sock_path)
> > -{
> > -	int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
> > -	struct sockaddr_un addr = {
> > -		.sun_family = AF_UNIX,
> > -	};
> > -	int i;
> > -
> > -	if (fd < 0)
> > -		die_perror("Failed to open UNIX domain socket");
> > -
> > -	for (i = 1; i < UNIX_SOCK_MAX; i++) {
> > -		char *path = addr.sun_path;
> > -		int ex, ret;
> > -
> > -		if (*sock_path)
> > -			memcpy(path, sock_path, UNIX_PATH_MAX);
> > -		else if (snprintf_check(path, UNIX_PATH_MAX - 1,
> > -					UNIX_SOCK_PATH, i))
> > -			die_perror("Can't build UNIX domain socket path");
> > -
> > -		ex = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC,
> > -			    0);
> > -		if (ex < 0)
> > -			die_perror("Failed to check for UNIX domain conflicts");
> > -
> > -		ret = connect(ex, (const struct sockaddr *)&addr, sizeof(addr));
> > -		if (!ret || (errno != ENOENT && errno != ECONNREFUSED &&
> > -			     errno != EACCES)) {
> > -			if (*sock_path)
> > -				die("Socket path %s already in use", path);
> > -
> > -			close(ex);
> > -			continue;
> > -		}
> > -		close(ex);
> > -
> > -		unlink(path);
> > -		ret = bind(fd, (const struct sockaddr *)&addr, sizeof(addr));
> > -		if (*sock_path && ret)
> > -			die_perror("Failed to bind UNIX domain socket");
> > -
> > -		if (!ret)
> > -			break;
> > -	}
> > -
> > -	if (i == UNIX_SOCK_MAX)
> > -		die_perror("Failed to bind UNIX domain socket");
> > -
> > -	info("UNIX domain socket bound at %s", addr.sun_path);
> > -	if (!*sock_path)
> > -		memcpy(sock_path, addr.sun_path, UNIX_PATH_MAX);
> > -
> > -	return fd;
> > -}
> > -
> >  /**
> >   * tap_backend_show_hints() - Give help information to start QEMU
> >   * @c:		Execution context
> > @@ -1423,6 +1362,8 @@ void tap_backend_init(struct ctx *c)
> >  		tap_sock_tun_init(c);
> >  		break;
> >  	case MODE_VU:
> > +		repair_sock_init(c);
> > +		/* fall through */
> >  	case MODE_PASST:
> >  		tap_sock_unix_init(c);
> >  
> > diff --git a/util.c b/util.c
> > index 36857d4..e98da74 100644
> > --- a/util.c
> > +++ b/util.c
> > @@ -178,6 +178,68 @@ int sock_l4_sa(const struct ctx *c, enum epoll_type type,
> >  	return fd;
> >  }
> >  
> > +/**
> > + * sock_unix() - Create and bind AF_UNIX socket
> > + * @sock_path:	Socket path. If empty, set on return (UNIX_SOCK_PATH as prefix)
> > + *
> > + * Return: socket descriptor on success, won't return on failure
> > + */  
> 
> I like making tap_sock_unix_open() more general.  Could be split into
> a separate patch.

I need it here though, and keeping it here saves lines in the commit
message...

-- 
Stefano


  reply	other threads:[~2025-01-29  8:46 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-28 23:39 [PATCH v2 0/8] Draft, incomplete series introducing state migration Stefano Brivio
2025-01-28 23:39 ` [PATCH v2 1/8] icmp, udp: Pad time_t timestamp to 64-bit to ease " Stefano Brivio
2025-01-29  1:34   ` David Gibson
2025-01-28 23:39 ` [PATCH v2 2/8] flow, flow_table: Pad flow table entries to 128 bytes, hash entries to 32 bits Stefano Brivio
2025-01-29  1:35   ` David Gibson
2025-01-28 23:39 ` [PATCH v2 3/8] flow_table: Use size in extern declaration for flowtab Stefano Brivio
2025-01-28 23:39 ` [PATCH v2 4/8] util: Add read_remainder() and read_all_buf() Stefano Brivio
2025-01-29  1:37   ` David Gibson
2025-01-28 23:39 ` [PATCH v2 5/8] Introduce facilities for guest migration on top of vhost-user infrastructure Stefano Brivio
2025-01-29  5:41   ` David Gibson
2025-01-29  8:46     ` Stefano Brivio
2025-01-30  1:28       ` David Gibson
2025-01-28 23:39 ` [PATCH v2 6/8] Introduce passt-repair Stefano Brivio
2025-01-28 23:39 ` [PATCH v2 7/8] Add interfaces and configuration bits for passt-repair Stefano Brivio
2025-01-29  6:09   ` David Gibson
2025-01-29  8:46     ` Stefano Brivio [this message]
2025-01-30  1:33       ` David Gibson
2025-01-28 23:39 ` [PATCH v2 8/8] flow, tcp: Basic pre-migration source handler to dump sequence numbers Stefano Brivio
2025-01-29  6:15   ` David Gibson
2025-01-29  8:46     ` Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250129094610.148de3c6@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=lvivier@redhat.com \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).