From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=Za605sbJ; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTPS id 09D085A0271 for ; Tue, 18 Nov 2025 01:19:27 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1763425166; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9CBMdFStwxVyKLwR2vBUSk3mTpYLlqPNneBvpf8xv5k=; b=Za605sbJw1uEdaYxEyhlcUh4PNeFvE4azzd0+40uG//bS6uOO7vWMHtDg3qxBzju90E3gf +myss2sn2yBaqK1eHqmLyRxXKErocaaUp++v//Z1rN31JOj1KEo7bqUmdVeOOZRgzlXcWD VYL/ZAGpgmBxpp7KCWxkO1/6R53NnJQ= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-643-5YhKYVeSMSqmFEkQUpM2SA-1; Mon, 17 Nov 2025 19:19:25 -0500 X-MC-Unique: 5YhKYVeSMSqmFEkQUpM2SA-1 X-Mimecast-MFC-AGG-ID: 5YhKYVeSMSqmFEkQUpM2SA_1763425164 Received: by mail-wm1-f70.google.com with SMTP id 5b1f17b1804b1-47777158a85so43452015e9.3 for ; Mon, 17 Nov 2025 16:19:25 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763425164; x=1764029964; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=9CBMdFStwxVyKLwR2vBUSk3mTpYLlqPNneBvpf8xv5k=; b=NFfaLmBg+NaRFNASVFQjKdi1QAc7+DSufgQWW1/oYhz5s4447CPQ8JWAer9XEmA7rh nEIGaxHGW7lX+J+X8V0//yagDQ/AUKk3l9BGpibiSiaVrGiAKBIHTKkD2n5ffEdV1k6r vUzYO2Vz8/gB8aUYmTQxQ2FELT+ZgBfRasINx31yapKfSQDbJBK2wyPIpbwb+GasdZPS +5PSTOthKf1PgKuC9ESUX4VniKRTkVZdZ9F0g6mdf7TI6nT1uxPPrZrKHgc53rqe+T9q HD2+6PzKciobDJUiwRILZzSrp9UdAcage8WAvVNx/CwIRW4pBoZ3a+XjJobcFFqwZMdT Vaaw== X-Gm-Message-State: AOJu0Yw0XmdcEYt31RpOoEAn4HhWgyqo9asL0v6jEeAeS2PfwSJ2Dxb6 3gk+Hu8hj3tZf6Ks1bnBBqjKeD9eCCt/icgaW1U9ZORwFKwgcFygb10KQoUAfFzKO02L4tPP/8Z uro4d/e2lzK6TQOz7kARpBerh+PDG8wRRxG4dCqivgnIjoTIG6e6jciHetVFqhA== X-Gm-Gg: ASbGncu39CImnS3cOIUa44HiPS/VDuoJYSUhQQNgvO6MwIy1qhvdvJIvWFOqG10TWMZ tZEiIpPW9yfVXdT9cKiZ9e1roI+2HWKtEycCt2ijz5GkKDMLuwqQqBPZmiw43Tey6qX/BIXrluL LH9V8yJK1KTs2O30WhiTFAliJIq6oked/c2R+G8bzYJE1CmSIs6XD8zagldDKjSb2ukFjog533L npI9IPEijI16nvBV9LP7aZu5U6RSLG1bwPKpPlrndyqRUkBnmVoJEypNtxHCEh/mrsovzncBPff Y2dhrdSnWvDsHLoIWjzEsL6Fh6dhSF5ACrESO2ajstx/hCShgxjJucnnquK6gufbF6xI/JKBdda 93RKjWxlyDby2BcB/H4Ary5ztttfBITXO52oGiA== X-Received: by 2002:a05:600c:a05:b0:475:dc5c:3a89 with SMTP id 5b1f17b1804b1-4778fea881fmr138635155e9.34.1763425163912; Mon, 17 Nov 2025 16:19:23 -0800 (PST) X-Google-Smtp-Source: AGHT+IEQw/gvYt3iJOFuJCkzOdZiHBHnFSSyrB2OtkkDGBFj2azxV8XNQhSK3cR5t/HXcC+05qJ2NA== X-Received: by 2002:a05:600c:a05:b0:475:dc5c:3a89 with SMTP id 5b1f17b1804b1-4778fea881fmr138634995e9.34.1763425163440; Mon, 17 Nov 2025 16:19:23 -0800 (PST) Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [176.103.220.4]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42b53e7ae5bsm28582282f8f.8.2025.11.17.16.19.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Nov 2025 16:19:22 -0800 (PST) Date: Tue, 18 Nov 2025 01:19:21 +0100 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH v3 2/8] util, flow, pif: Simplify sock_l4_sa() interface Message-ID: <20251118011921.4094e698@elisabeth> In-Reply-To: References: <20251029062628.1647051-1-david@gibson.dropbear.id.au> <20251029062628.1647051-3-david@gibson.dropbear.id.au> <20251113073313.1287b4dc@elisabeth> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.49; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: u1ty_CBr_xANJPSSBZNY5ZbJWp_-MUommt2EVCJc6CU_1763425164 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: DA33X5HIGT5HVGWKG6MLSDEXAOR3KCRU X-Message-ID-Hash: DA33X5HIGT5HVGWKG6MLSDEXAOR3KCRU X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Fri, 14 Nov 2025 10:21:46 +1100 David Gibson wrote: > On Thu, Nov 13, 2025 at 07:33:13AM +0100, Stefano Brivio wrote: > > On Wed, 29 Oct 2025 17:26:22 +1100 > > David Gibson wrote: > > > > > sock_l4_sa() has a somewhat confusing 'v6only' option controlling whether > > > to set the IPV6_V6ONLY socket option. Usually it's set when the given > > > address is IPv6, but not when we want to create a dual stack listening > > > socket. The latter only makes sense when the address is :: however. > > > > > > Clarify this by only keeping the v6only option in an internal helper > > > sock_l4_(). External users will call either sock_l4() which always creates > > > a socket bound to a specific IP version, or sock_l4_dualstack() which > > > creates a dual stack socket, but takes only a port not an address. > > > > I'm not sure if we'll ever need anything different, but I guess that > > this is not the only obvious semantic of sock_l4_dualstack(), as it > > could take a sockaddr_inany eventually, and bind() IPv6 address and its > > v4-mapped equivalent (...does that even work?). > > Do you mean that if we have a v4-mapped address, then using an IPv6 > "dual stack" socket will listen both for IPv4 traffic and for IPv6 > traffic actually using that v4-mapped address on the wire (presumably > as a result of a router translating to a local IPv6-only network)? I > think that will work, though I haven't tested. Yes, that's what I meant. > In that case we can determine that we need IPV6_V6ONLY from the > address. The only case that doesn't cover is if we want to listen for > v4-mapped traffic already translated by a router but *not* native IPv4 > traffic. I don't see a lot of reason to ever do that, so it's in the > "refactor if we ever discover we need it" pile. I thought that we might want to listen on both IP versions for whatever reason, on a single socket, with a specific address (say, that v4-mapped address and the equivalent untranslated address...?). I know it can't be done now anyway, I'm just saying that sock_l4_dualstack() forcing wildcard addresses isn't something we should imply as part of "dualstack". > Otherwise, the only case in which a single dual stack socket actually > listens to traffic from both protocols is for a wildcard. Maybe there > are obscure wildcard addresses other than :: / 0.0.0.0, but that's > also in the "worry about it later" pile. Sure. > Note that: > > https://github.com/containers/podman/pull/14026/commits/772ead25318dfa340541197e92322bd2346df087 > > implies some sort of dual stack localhost support (it treats "dual > stack" ::1 as listening on both ::1 and 127.0.0.1). However, AFAICT > that's just not correct. On Linux, listening on ::1 listens only on > ::1 even with V6ONLY explicitly set to 0. Right, I don't even know what "simulated" means there. Actually there's no problem description at all. Go figure. I'm not sure if we want to report something (I'm not even sure what we should report). > > > We drop the '_sa' suffix while we're at it - it exists because this used > > > to be an internal version with a sock_l4() wrapper. The wrapper no longer > > > exists so the '_sa' is no longer useful. > > > > > > Signed-off-by: David Gibson > > > --- > > > flow.c | 6 ++---- > > > pif.c | 10 +++------- > > > util.c | 27 +++++++++++++++++++++++---- > > > util.h | 8 +++++--- > > > 4 files changed, 33 insertions(+), 18 deletions(-) > > > > > > diff --git a/flow.c b/flow.c > > > index 9926f408..fd530ddb 100644 > > > --- a/flow.c > > > +++ b/flow.c > > > @@ -186,8 +186,7 @@ static int flowside_sock_splice(void *arg) > > > > > > ns_enter(a->c); > > > > > > - a->fd = sock_l4_sa(a->c, a->type, a->sa, NULL, > > > - a->sa->sa_family == AF_INET6, a->data); > > > + a->fd = sock_l4(a->c, a->type, a->sa, NULL, a->data); > > > a->err = errno; > > > > > > return 0; > > > @@ -222,8 +221,7 @@ int flowside_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif, > > > else if (sa.sa_family == AF_INET6) > > > ifname = c->ip6.ifname_out; > > > > > > - return sock_l4_sa(c, type, &sa, ifname, > > > - sa.sa_family == AF_INET6, data); > > > + return sock_l4(c, type, &sa, ifname, data); > > > > > > case PIF_SPLICE: { > > > struct flowside_sock_args args = { > > > diff --git a/pif.c b/pif.c > > > index 31723b29..5fb1f455 100644 > > > --- a/pif.c > > > +++ b/pif.c > > > @@ -75,11 +75,7 @@ int pif_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif, > > > const union inany_addr *addr, const char *ifname, > > > in_port_t port, uint32_t data) > > > { > > > - union sockaddr_inany sa = { > > > - .sa6.sin6_family = AF_INET6, > > > - .sa6.sin6_addr = in6addr_any, > > > - .sa6.sin6_port = htons(port), > > > - }; > > > + union sockaddr_inany sa; > > > > > > ASSERT(pif_is_socket(pif)); > > > > > > @@ -90,8 +86,8 @@ int pif_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif, > > > } > > > > > > if (!addr) > > > - return sock_l4_sa(c, type, &sa, ifname, false, data); > > > + return sock_l4_dualstack(c, type, port, ifname, data); > > > > > > pif_sockaddr(c, &sa, pif, addr, port); > > > - return sock_l4_sa(c, type, &sa, ifname, sa.sa_family == AF_INET6, data); > > > + return sock_l4(c, type, &sa, ifname, data); > > > } > > > diff --git a/util.c b/util.c > > > index 976fcabe..c94efae4 100644 > > > --- a/util.c > > > +++ b/util.c > > > @@ -40,7 +40,7 @@ > > > #endif > > > > > > /** > > > - * sock_l4_sa() - Create and bind socket to socket address, add to epoll list > > > + * sock_l4_() - Create and bind socket to socket address, add to epoll list > > > * @c: Execution context > > > * @type: epoll type > > > * @sa: Socket address to bind to > > > @@ -50,9 +50,9 @@ > > > * > > > * Return: newly created socket, negative error code on failure > > > */ > > > -int sock_l4_sa(const struct ctx *c, enum epoll_type type, > > > - const union sockaddr_inany *sa, const char *ifname, > > > - bool v6only, uint32_t data) > > > +static int sock_l4_(const struct ctx *c, enum epoll_type type, > > > + const union sockaddr_inany *sa, const char *ifname, > > > + bool v6only, uint32_t data) > > > { > > > sa_family_t af = sa->sa_family; > > > union epoll_ref ref = { .type = type, .data = data }; > > > @@ -182,6 +182,25 @@ int sock_l4_sa(const struct ctx *c, enum epoll_type type, > > > return fd; > > > } > > > > > > +int sock_l4(const struct ctx *c, enum epoll_type type, > > > + const union sockaddr_inany *sa, const char *ifname, > > > + uint32_t data) > > > > Not extremely useful but it saves one "lookup": > > > > /** > > * sock_l4() - Create and bind socket to given address, add to epoll list > > * @c: Execution context > > * @type: epoll type > > * @sa: Socket address to bind to > > * @ifname: Interface for binding, NULL for any > > * > > * Return: newly created socket, negative error code on failure > > */ > > Oops, I meant to go back and add function comments here, but I > obviously forgot. Fixed. > > While there I removed the "add to epoll list" which is no longer > correct. Oops, I hadn't solved the merge conflict yet... > > > +{ > > > + return sock_l4_(c, type, sa, ifname, sa->sa_family == AF_INET6, data); > > > +} > > > + > > > +int sock_l4_dualstack(const struct ctx *c, enum epoll_type type, > > > + in_port_t port, const char *ifname, uint32_t data) > > > > ...same here, and the comment might be used to clarify the > > functionality. > > Done. > > > > > > +{ > > > + union sockaddr_inany sa = { > > > + .sa6.sin6_family = AF_INET6, > > > + .sa6.sin6_addr = in6addr_any, > > > + .sa6.sin6_port = htons(port), > > > + }; > > > + > > > + return sock_l4_(c, type, &sa, ifname, 0, data); > > > +} > > > + > > > /** > > > * sock_unix() - Create and bind AF_UNIX socket > > > * @sock_path: Socket path. If empty, set on return (UNIX_SOCK_PATH as prefix) > > > diff --git a/util.h b/util.h > > > index e1a1ebc9..7f0cf686 100644 > > > --- a/util.h > > > +++ b/util.h > > > @@ -203,9 +203,11 @@ int do_clone(int (*fn)(void *), char *stack_area, size_t stack_size, int flags, > > > struct ctx; > > > union sockaddr_inany; > > > > > > -int sock_l4_sa(const struct ctx *c, enum epoll_type type, > > > - const union sockaddr_inany *sa, const char *ifname, > > > - bool v6only, uint32_t data); > > > +int sock_l4(const struct ctx *c, enum epoll_type type, > > > + const union sockaddr_inany *sa, const char *ifname, > > > + uint32_t data); > > > +int sock_l4_dualstack(const struct ctx *c, enum epoll_type type, > > > + in_port_t port, const char *ifname, uint32_t data); > > > int sock_unix(char *sock_path); > > > void sock_probe_mem(struct ctx *c); > > > long timespec_diff_ms(const struct timespec *a, const struct timespec *b); -- Stefano