public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top, Laurent Vivier <lvivier@redhat.com>
Subject: Re: [PATCH 7/7] Introduce passt-repair
Date: Tue, 28 Jan 2025 12:51:59 +1100	[thread overview]
Message-ID: <Z5g4P-Ox2clykt5b@zatzit> (raw)
In-Reply-To: <20250127231532.672363-8-sbrivio@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 6312 bytes --]

On Tue, Jan 28, 2025 at 12:15:32AM +0100, Stefano Brivio wrote:
> A privileged helper to set/clear TCP_REPAIR on sockets on behalf of
> passt. Not used yet.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  Makefile       |  10 +++--
>  passt-repair.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 118 insertions(+), 3 deletions(-)
>  create mode 100644 passt-repair.c
> 
> diff --git a/Makefile b/Makefile
> index 1383875..1b71cb0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -42,7 +42,8 @@ PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c flow.c fwd.c \
>  	tcp.c tcp_buf.c tcp_splice.c tcp_vu.c udp.c udp_flow.c udp_vu.c util.c \
>  	vhost_user.c virtio.c vu_common.c
>  QRAP_SRCS = qrap.c
> -SRCS = $(PASST_SRCS) $(QRAP_SRCS)
> +PASST_REPAIR_SRCS = passt-repair.c
> +SRCS = $(PASST_SRCS) $(QRAP_SRCS) $(PASST_REPAIR_SRCS)
>  
>  MANPAGES = passt.1 pasta.1 qrap.1
>  
> @@ -72,9 +73,9 @@ mandir		?= $(datarootdir)/man
>  man1dir		?= $(mandir)/man1
>  
>  ifeq ($(TARGET_ARCH),x86_64)
> -BIN := passt passt.avx2 pasta pasta.avx2 qrap
> +BIN := passt passt.avx2 pasta pasta.avx2 qrap passt-repair
>  else
> -BIN := passt pasta qrap
> +BIN := passt pasta qrap passt-repair
>  endif
>  
>  all: $(BIN) $(MANPAGES) docs
> @@ -101,6 +102,9 @@ pasta.avx2 pasta.1 pasta: pasta%: passt%
>  qrap: $(QRAP_SRCS) passt.h
>  	$(CC) $(FLAGS) $(CFLAGS) $(CPPFLAGS) -DARCH=\"$(TARGET_ARCH)\" $(QRAP_SRCS) -o qrap $(LDFLAGS)
>  
> +passt-repair: $(PASST_REPAIR_SRCS)
> +	$(CC) $(FLAGS) $(CFLAGS) $(CPPFLAGS) $(PASST_REPAIR_SRCS) -o passt-repair $(LDFLAGS)
> +
>  valgrind: EXTRA_SYSCALLS += rt_sigprocmask rt_sigtimedwait rt_sigaction	\
>  			    rt_sigreturn getpid gettid kill clock_gettime mmap \
>  			    mmap2 munmap open unlink gettimeofday futex statx \
> diff --git a/passt-repair.c b/passt-repair.c
> new file mode 100644
> index 0000000..e9b9609
> --- /dev/null
> +++ b/passt-repair.c
> @@ -0,0 +1,111 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +
> +/* PASST - Plug A Simple Socket Transport
> + *  for qemu/UNIX domain socket mode
> + *
> + * passt-repair.c - Privileged helper to set/clear TCP_REPAIR on sockets
> + *
> + * Copyright (c) 2025 Red Hat GmbH
> + * Author: Stefano Brivio <sbrivio@redhat.com>
> + *
> + * Connect to passt via UNIX domain socket, receive sockets via SCM_RIGHTS along
> + * with commands mapping to TCP_REPAIR values, and switch repair mode on or
> + * off. Reply by echoing the command. Exit if the command is INT_MAX.
> + */
> +
> +#include <sys/types.h>
> +#include <sys/socket.h>
> +#include <sys/un.h>
> +#include <errno.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <limits.h>
> +#include <unistd.h>
> +#include <netdb.h>
> +
> +#include <netinet/tcp.h>
> +
> +#define SCM_MAX_FD 253 /* From Linux kernel (include/net/scm.h), not in UAPI */
> +
> +int main(int argc, char **argv)
> +{
> +	char buf[CMSG_SPACE(sizeof(int) * SCM_MAX_FD)]
> +	     __attribute__ ((aligned(__alignof__(struct cmsghdr))));
> +	struct sockaddr_un a = { AF_UNIX, "" };
> +	int cmd, fds[SCM_MAX_FD], s, ret, i;
> +	struct cmsghdr *cmsg;
> +	struct msghdr msg;
> +	struct iovec iov;
> +
> +	iov = (struct iovec){ &cmd, sizeof(cmd) };

I mean, local to local, it's *probably* fine, but still a network
protocol not defined in terms of explicit width fields makes me
nervous.  I'd prefer to see the cmd being a packed structure with
fixed width elements.

I also think we should do some sort of basic magic / version exchange.
I don't see any reason we'd need to extend the protocol, but I'd
rather have the option if we have to.  Plus checking a magic number
should make things less damaging and more debuggable if you were to
point the repair helper at an entirely unrelated unix socket instead
of passt's repair socket.

> +	msg = (struct msghdr){ NULL, 0, &iov, 1, buf, sizeof(buf), 0 };
> +	cmsg = CMSG_FIRSTHDR(&msg);
> +
> +	if (argc != 2) {
> +		fprintf(stderr, "Usage: %s PATH\n", argv[0]);
> +		return -1;
> +	}
> +
> +	ret = snprintf(a.sun_path, sizeof(a.sun_path), "%s", argv[1]);
> +	if (ret <= 0 || ret >= (int)sizeof(a.sun_path)) {
> +		fprintf(stderr, "Invalid socket path: %s\n", argv[1]);
> +		return -1;
> +	}
> +
> +	if ((s = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {

Hmm.. would a datagram socket better suit our needs here?

> +		perror("Failed to create AF_UNIX socket");
> +		return -1;
> +	}
> +
> +	if (connect(s, (struct sockaddr *)&a, sizeof(a))) {
> +		fprintf(stderr, "Failed to connect to %s: %s\n", argv[1],
> +			strerror(errno));
> +		return -1;
> +	}
> +
> +	while (1) {
> +		int n;
> +
> +		if (recvmsg(s, &msg, 0) < 0) {
> +			perror("Failed to receive message");
> +			return -1;
> +		}
> +
> +		if (!cmsg ||
> +		    cmsg->cmsg_len < CMSG_LEN(sizeof(int)) ||
> +		    cmsg->cmsg_len > CMSG_LEN(sizeof(int) * SCM_MAX_FD) ||
> +		    cmsg->cmsg_type != SCM_RIGHTS)
> +			return -1;
> +
> +		n = cmsg->cmsg_len / CMSG_LEN(sizeof(int));
> +		memcpy(fds, CMSG_DATA(cmsg), sizeof(int) * n);
> +
> +		switch (cmd) {
> +		case INT_MAX:
> +			return 0;
> +		case TCP_REPAIR_ON:
> +		case TCP_REPAIR_OFF:
> +		case TCP_REPAIR_OFF_NO_WP:
> +			for (i = 0; i < n; i++) {
> +				if (setsockopt(fds[i], SOL_TCP, TCP_REPAIR,
> +					       &cmd, sizeof(int))) {
> +					perror("Setting TCP_REPAIR");
> +					return -1;

We probably want this to report errors back to passt, rather than just
dying in this case.  That way if for some weird reason one socket
can't be placed in repair mode, we can still migrate all the other
connections.

> +				}
> +			}
> +
> +			/* Confirm setting by echoing the command back */
> +			if (send(s, &cmd, sizeof(int), 0) < 0) {
> +				fprintf(stderr, "Reply to command %i: %s\n",
> +					cmd, strerror(errno));
> +				return -1;
> +			}
> +
> +			break;
> +		default:
> +			fprintf(stderr, "Unsupported command 0x%04x\n", cmd);
> +			return -1;
> +		}
> +	}
> +}

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2025-01-28  1:52 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-27 23:15 [PATCH 0/7] Draft, incomplete series introducing state migration Stefano Brivio
2025-01-27 23:15 ` [PATCH 1/7] icmp, udp: Pad time_t timestamp to 64-bit to ease " Stefano Brivio
2025-01-28  0:49   ` David Gibson
2025-01-28  6:48     ` Stefano Brivio
2025-01-27 23:15 ` [PATCH 2/7] flow, flow_table: Pad flow table entries to 128 bytes, hash entries to 32 bits Stefano Brivio
2025-01-28  0:50   ` David Gibson
2025-01-27 23:15 ` [PATCH 3/7] tcp_conn: Avoid 7-bit hole in struct tcp_splice_conn Stefano Brivio
2025-01-28  0:53   ` David Gibson
2025-01-28  6:48     ` Stefano Brivio
2025-01-29  1:02       ` David Gibson
2025-01-29  7:33         ` Stefano Brivio
2025-01-30  0:44           ` David Gibson
2025-01-30  4:55             ` Stefano Brivio
2025-01-30  7:27               ` David Gibson
2025-01-27 23:15 ` [PATCH 4/7] flow_table: Use size in extern declaration for flowtab Stefano Brivio
2025-01-27 23:15 ` [PATCH 5/7] util: Add read_remainder() and read_all_buf() Stefano Brivio
2025-01-28  0:59   ` David Gibson
2025-01-28  6:48     ` Stefano Brivio
2025-01-29  1:03       ` David Gibson
2025-01-29  7:33         ` Stefano Brivio
2025-01-30  0:44           ` David Gibson
2025-01-27 23:15 ` [PATCH 6/7] Introduce facilities for guest migration on top of vhost-user infrastructure Stefano Brivio
2025-01-28  1:40   ` David Gibson
2025-01-28  6:50     ` Stefano Brivio
2025-01-29  1:16       ` David Gibson
2025-01-29  7:33         ` Stefano Brivio
2025-01-30  0:48           ` David Gibson
2025-01-30  4:55             ` Stefano Brivio
2025-01-30  7:38               ` David Gibson
2025-01-30  8:32                 ` Stefano Brivio
2025-01-30  8:54                   ` David Gibson
2025-01-27 23:15 ` [PATCH 7/7] Introduce passt-repair Stefano Brivio
2025-01-27 23:31   ` Stefano Brivio
2025-01-28  1:51   ` David Gibson [this message]
2025-01-28  6:51     ` Stefano Brivio
2025-01-29  1:29       ` David Gibson
2025-01-29  7:04         ` Stefano Brivio
2025-01-30  0:53           ` David Gibson
2025-01-30  4:55             ` Stefano Brivio
2025-01-30  7:43               ` David Gibson
2025-01-30  7:56                 ` Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z5g4P-Ox2clykt5b@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=lvivier@redhat.com \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).