Date: Fri, 14 Nov 2025 10:21:46 +1100
From: David Gibson
To: Stefano Brivio
Cc: passt-dev@passt.top
Subject: Re: [PATCH v3 2/8] util, flow, pif: Simplify sock_l4_sa() interface
In-Reply-To: <20251113073313.1287b4dc@elisabeth>

On Thu, Nov 13, 2025 at 07:33:13AM +0100, Stefano Brivio wrote:
> On Wed, 29 Oct 2025 17:26:22 +1100
> David Gibson wrote:
>
> > sock_l4_sa() has a somewhat confusing 'v6only' option controlling whether
> > to set the IPV6_V6ONLY socket option.  Usually it's set when the given
> > address is IPv6, but not when we want to create a dual stack listening
> > socket.  The latter only makes sense when the address is :: however.
> >
> > Clarify this by only keeping the v6only option in an internal helper
> > sock_l4_().  External users will call either sock_l4() which always creates
> > a socket bound to a specific IP version, or sock_l4_dualstack() which
> > creates a dual stack socket, but takes only a port not an address.
>
> I'm not sure if we'll ever need anything different, but I guess that
> this is not the only obvious semantic of sock_l4_dualstack(), as it
> could take a sockaddr_inany eventually, and bind() IPv6 address and its
> v4-mapped equivalent (...does that even work?).

Do you mean that if we have a v4-mapped address, then using an IPv6
"dual stack" socket will listen both for IPv4 traffic and for IPv6
traffic actually using that v4-mapped address on the wire (presumably
as a result of a router translating to a local IPv6-only network)?

I think that will work, though I haven't tested.  In that case we can
determine that we need IPV6_V6ONLY from the address.
The only case that doesn't cover is if we want to listen for v4-mapped
traffic already translated by a router but *not* native IPv4 traffic.
I don't see a lot of reason to ever do that, so it's in the "refactor
if we ever discover we need it" pile.

Otherwise, the only case in which a single dual stack socket actually
listens to traffic from both protocols is for a wildcard.  Maybe there
are obscure wildcard addresses other than :: / 0.0.0.0, but that's
also in the "worry about it later" pile.

Note that:
    https://github.com/containers/podman/pull/14026/commits/772ead25318dfa340541197e92322bd2346df087
implies some sort of dual stack localhost support (it treats "dual
stack" ::1 as listening on both ::1 and 127.0.0.1).  However, AFAICT
that's just not correct.  On Linux, listening on ::1 listens only on
::1 even with V6ONLY explicitly set to 0.

> > We drop the '_sa' suffix while we're at it - it exists because this used
> > to be an internal version with a sock_l4() wrapper.  The wrapper no longer
> > exists so the '_sa' is no longer useful.
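
A quick way to check the ::1 behaviour described above would be
something like the following. It is a rough standalone sketch, not
passt code: listen6(), reachable_v4() and dualstack_sees_v4() are
names made up for illustration, and it only looks at TCP over the
loopback interface.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Listen on [@addr6]:<ephemeral port> with IPV6_V6ONLY set to @v6only.
 * Returns the listening fd and stores the port (host order), -1 on error.
 */
int listen6(const struct in6_addr *addr6, int v6only, uint16_t *port)
{
	struct sockaddr_in6 sa = {
		.sin6_family = AF_INET6,
		.sin6_addr = *addr6,
	};
	socklen_t sl = sizeof(sa);
	int fd = socket(AF_INET6, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
		       &v6only, sizeof(v6only)) ||
	    bind(fd, (struct sockaddr *)&sa, sizeof(sa)) ||
	    listen(fd, 1) ||
	    getsockname(fd, (struct sockaddr *)&sa, &sl)) {
		close(fd);
		return -1;
	}

	*port = ntohs(sa.sin6_port);
	return fd;
}

/* Try an IPv4 connect() to 127.0.0.1:@port: 1 if it succeeds, 0 if not */
int reachable_v4(uint16_t port)
{
	struct sockaddr_in sa = {
		.sin_family = AF_INET,
		.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
		.sin_port = htons(port),
	};
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int ok;

	if (fd < 0)
		return 0;

	ok = !connect(fd, (struct sockaddr *)&sa, sizeof(sa));
	close(fd);
	return ok;
}

/* Does a V6ONLY=0 listener bound to @addr6 accept IPv4 loopback traffic?
 * Returns 1 if yes, 0 if no, -1 if the listener couldn't be set up.
 */
int dualstack_sees_v4(const struct in6_addr *addr6)
{
	uint16_t port;
	int fd = listen6(addr6, 0, &port);
	int ret;

	if (fd < 0)
		return -1;

	ret = reachable_v4(port);
	close(fd);
	return ret;
}
```

On Linux one would expect dualstack_sees_v4(&in6addr_any) to return 1
and dualstack_sees_v4(&in6addr_loopback) to return 0, that is, only
the wildcard listener is dual stack in practice.
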
> >
> > Signed-off-by: David Gibson
> > ---
> >  flow.c |  6 ++----
> >  pif.c  | 10 +++-------
> >  util.c | 27 +++++++++++++++++++++++----
> >  util.h |  8 +++++---
> >  4 files changed, 33 insertions(+), 18 deletions(-)
> >
> > diff --git a/flow.c b/flow.c
> > index 9926f408..fd530ddb 100644
> > --- a/flow.c
> > +++ b/flow.c
> > @@ -186,8 +186,7 @@ static int flowside_sock_splice(void *arg)
> >
> >  	ns_enter(a->c);
> >
> > -	a->fd = sock_l4_sa(a->c, a->type, a->sa, NULL,
> > -			   a->sa->sa_family == AF_INET6, a->data);
> > +	a->fd = sock_l4(a->c, a->type, a->sa, NULL, a->data);
> >  	a->err = errno;
> >
> >  	return 0;
> > @@ -222,8 +221,7 @@ int flowside_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif,
> >  		else if (sa.sa_family == AF_INET6)
> >  			ifname = c->ip6.ifname_out;
> >
> > -		return sock_l4_sa(c, type, &sa, ifname,
> > -				  sa.sa_family == AF_INET6, data);
> > +		return sock_l4(c, type, &sa, ifname, data);
> >
> >  	case PIF_SPLICE: {
> >  		struct flowside_sock_args args = {
> > diff --git a/pif.c b/pif.c
> > index 31723b29..5fb1f455 100644
> > --- a/pif.c
> > +++ b/pif.c
> > @@ -75,11 +75,7 @@ int pif_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif,
> >  		const union inany_addr *addr, const char *ifname,
> >  		in_port_t port, uint32_t data)
> >  {
> > -	union sockaddr_inany sa = {
> > -		.sa6.sin6_family = AF_INET6,
> > -		.sa6.sin6_addr = in6addr_any,
> > -		.sa6.sin6_port = htons(port),
> > -	};
> > +	union sockaddr_inany sa;
> >
> >  	ASSERT(pif_is_socket(pif));
> >
> > @@ -90,8 +86,8 @@ int pif_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif,
> >  	}
> >
> >  	if (!addr)
> > -		return sock_l4_sa(c, type, &sa, ifname, false, data);
> > +		return sock_l4_dualstack(c, type, port, ifname, data);
> >
> >  	pif_sockaddr(c, &sa, pif, addr, port);
> > -	return sock_l4_sa(c, type, &sa, ifname, sa.sa_family == AF_INET6, data);
> > +	return sock_l4(c, type, &sa, ifname, data);
> >  }
> > diff --git a/util.c b/util.c
> > index 976fcabe..c94efae4 100644
> > --- a/util.c
> > +++ b/util.c
> > @@ -40,7 +40,7 @@
> >  #endif
> >
> >  /**
> > - * sock_l4_sa() - Create and bind socket to socket address, add to epoll list
> > + * sock_l4_() - Create and bind socket to socket address, add to epoll list
> >   * @c:		Execution context
> >   * @type:	epoll type
> >   * @sa:		Socket address to bind to
> > @@ -50,9 +50,9 @@
> >   *
> >   * Return: newly created socket, negative error code on failure
> >   */
> > -int sock_l4_sa(const struct ctx *c, enum epoll_type type,
> > -	       const union sockaddr_inany *sa, const char *ifname,
> > -	       bool v6only, uint32_t data)
> > +static int sock_l4_(const struct ctx *c, enum epoll_type type,
> > +		    const union sockaddr_inany *sa, const char *ifname,
> > +		    bool v6only, uint32_t data)
> >  {
> >  	sa_family_t af = sa->sa_family;
> >  	union epoll_ref ref = { .type = type, .data = data };
> > @@ -182,6 +182,25 @@ int sock_l4_sa(const struct ctx *c, enum epoll_type type,
> >  	return fd;
> >  }
> >
> > +int sock_l4(const struct ctx *c, enum epoll_type type,
> > +	    const union sockaddr_inany *sa, const char *ifname,
> > +	    uint32_t data)
>
> Not extremely useful but it saves one "lookup":
>
> /**
>  * sock_l4() - Create and bind socket to given address, add to epoll list
>  * @c:		Execution context
>  * @type:	epoll type
>  * @sa:		Socket address to bind to
>  * @ifname:	Interface for binding, NULL for any
>  *
>  * Return: newly created socket, negative error code on failure
>  */

Oops, I meant to go back and add function comments here, but I
obviously forgot.  Fixed.  While there I removed the "add to epoll
list" which is no longer correct.
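
On the v4-mapped idea discussed further up: if sock_l4() ever had to
pick IPV6_V6ONLY from the address rather than from the family alone,
the decision might look roughly like this. This is a hypothetical
sketch and not part of the patch: addr6_needs_v6only() and
v6only_from_text() are invented names, and treating a v4-mapped
address as a request for a dual stack socket is the untested
assumption from that discussion. The wildcard is deliberately left
alone here, since sock_l4_dualstack() covers that case.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Hypothetical: should IPV6_V6ONLY be set for an IPv6 socket bound to @a?
 * Assumption: only v4-mapped addresses (::ffff:a.b.c.d) ask for a dual
 * stack socket, any other address is IPv6 only.
 */
bool addr6_needs_v6only(const struct in6_addr *a)
{
	return !IN6_IS_ADDR_V4MAPPED(a);
}

/* Convenience for experiments: parse @s as an IPv6 address and return
 * 1 or 0 as above, or -1 if @s is not a valid IPv6 address.
 */
int v6only_from_text(const char *s)
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, s, &a) != 1)
		return -1;

	return addr6_needs_v6only(&a);
}
```

With this, v6only_from_text("::ffff:192.0.2.1") selects a dual stack
socket, while "::1" and "2001:db8::1" keep IPV6_V6ONLY set.
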
> > +{
> > +	return sock_l4_(c, type, sa, ifname, sa->sa_family == AF_INET6, data);
> > +}
> > +
> > +int sock_l4_dualstack(const struct ctx *c, enum epoll_type type,
> > +		      in_port_t port, const char *ifname, uint32_t data)
>
> ...same here, and the comment might be used to clarify the
> functionality.

Done.

>
> > +{
> > +	union sockaddr_inany sa = {
> > +		.sa6.sin6_family = AF_INET6,
> > +		.sa6.sin6_addr = in6addr_any,
> > +		.sa6.sin6_port = htons(port),
> > +	};
> > +
> > +	return sock_l4_(c, type, &sa, ifname, 0, data);
> > +}
> > +
> >  /**
> >   * sock_unix() - Create and bind AF_UNIX socket
> >   * @sock_path:	Socket path. If empty, set on return (UNIX_SOCK_PATH as prefix)
> > diff --git a/util.h b/util.h
> > index e1a1ebc9..7f0cf686 100644
> > --- a/util.h
> > +++ b/util.h
> > @@ -203,9 +203,11 @@ int do_clone(int (*fn)(void *), char *stack_area, size_t stack_size, int flags,
> >  struct ctx;
> >  union sockaddr_inany;
> >
> > -int sock_l4_sa(const struct ctx *c, enum epoll_type type,
> > -	       const union sockaddr_inany *sa, const char *ifname,
> > -	       bool v6only, uint32_t data);
> > +int sock_l4(const struct ctx *c, enum epoll_type type,
> > +	    const union sockaddr_inany *sa, const char *ifname,
> > +	    uint32_t data);
> > +int sock_l4_dualstack(const struct ctx *c, enum epoll_type type,
> > +		      in_port_t port, const char *ifname, uint32_t data);
> >  int sock_unix(char *sock_path);
> >  void sock_probe_mem(struct ctx *c);
> >  long timespec_diff_ms(const struct timespec *a, const struct timespec *b);
>
> --
> Stefano
>

--
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson