On Thu, Nov 13, 2025 at 07:33:13AM +0100, Stefano Brivio wrote:
> On Wed, 29 Oct 2025 17:26:22 +1100
> David Gibson wrote:
>
> > sock_l4_sa() has a somewhat confusing 'v6only' option controlling whether
> > to set the IPV6_V6ONLY socket option. Usually it's set when the given
> > address is IPv6, but not when we want to create a dual stack listening
> > socket. The latter only makes sense when the address is :: however.
> >
> > Clarify this by only keeping the v6only option in an internal helper
> > sock_l4_(). External users will call either sock_l4() which always creates
> > a socket bound to a specific IP version, or sock_l4_dualstack() which
> > creates a dual stack socket, but takes only a port not an address.
>
> I'm not sure if we'll ever need anything different, but I guess that
> this is not the only obvious semantic of sock_l4_dualstack(), as it
> could take a sockaddr_inany eventually, and bind() IPv6 address and its
> v4-mapped equivalent (...does that even work?).

Do you mean that if we have a v4-mapped address, then using an IPv6
"dual stack" socket will listen both for IPv4 traffic and for IPv6
traffic actually using that v4-mapped address on the wire (presumably
as a result of a router translating to a local IPv6-only network)?

I think that will work, though I haven't tested. In that case we can
determine that we need IPV6_V6ONLY from the address. The only case
that doesn't cover is if we want to listen for v4-mapped traffic
already translated by a router but *not* native IPv4 traffic. I don't
see a lot of reason to ever do that, so it's in the "refactor if we
ever discover we need it" pile.

Otherwise, the only case in which a single dual stack socket actually
listens to traffic from both protocols is for a wildcard. Maybe there
are obscure wildcard addresses other than :: / 0.0.0.0, but that's
also in the "worry about it later" pile.

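To illustrate what "determine that we need IPV6_V6ONLY from the address" might look like, here is a minimal sketch. sa_wants_v6only() is a hypothetical name and is not part of this series. It assumes, untested as noted above, that a v4-mapped bind address is only useful on a socket without IPV6_V6ONLY. The wildcard :: is deliberately left as IPv6-only, since a dual stack wildcard listener is requested explicitly rather than inferred.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper (illustration only, not in the patch): decide whether
 * IPV6_V6ONLY should be set, based purely on the bind address.
 */
static bool sa_wants_v6only(const struct sockaddr *sa)
{
	const struct sockaddr_in6 *sa6;

	assert(sa);

	/* The option is meaningless on AF_INET sockets */
	if (sa->sa_family != AF_INET6)
		return false;

	sa6 = (const struct sockaddr_in6 *)sa;

	/* Assumption: a v4-mapped address implies we want the dual stack
	 * behaviour, so IPV6_V6ONLY must stay clear.  Every other IPv6
	 * address, including ::, gets an IPv6-only socket.
	 */
	return !IN6_IS_ADDR_V4MAPPED(&sa6->sin6_addr);
}
```

With something like this, the explicit dual stack entry point would remain the only way to get a socket listening on both :: and 0.0.0.0.
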
Note that:
    https://github.com/containers/podman/pull/14026/commits/772ead25318dfa340541197e92322bd2346df087

implies some sort of dual stack localhost support (it treats "dual
stack" ::1 as listening on both ::1 and 127.0.0.1). However, AFAICT
that's just not correct. On Linux, listening on ::1 listens only on
::1 even with V6ONLY explicitly set to 0.

> > We drop the '_sa' suffix while we're at it - it exists because this used
> > to be an internal version with a sock_l4() wrapper. The wrapper no longer
> > exists so the '_sa' is no longer useful.
> >
> > Signed-off-by: David Gibson
> > ---
> >  flow.c |  6 ++----
> >  pif.c  | 10 +++-------
> >  util.c | 27 +++++++++++++++++++++++----
> >  util.h |  8 +++++---
> >  4 files changed, 33 insertions(+), 18 deletions(-)
> >
> > diff --git a/flow.c b/flow.c
> > index 9926f408..fd530ddb 100644
> > --- a/flow.c
> > +++ b/flow.c
> > @@ -186,8 +186,7 @@ static int flowside_sock_splice(void *arg)
> >
> >  	ns_enter(a->c);
> >
> > -	a->fd = sock_l4_sa(a->c, a->type, a->sa, NULL,
> > -			   a->sa->sa_family == AF_INET6, a->data);
> > +	a->fd = sock_l4(a->c, a->type, a->sa, NULL, a->data);
> >  	a->err = errno;
> >
> >  	return 0;
> > @@ -222,8 +221,7 @@ int flowside_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif,
> >  		else if (sa.sa_family == AF_INET6)
> >  			ifname = c->ip6.ifname_out;
> >
> > -		return sock_l4_sa(c, type, &sa, ifname,
> > -				  sa.sa_family == AF_INET6, data);
> > +		return sock_l4(c, type, &sa, ifname, data);
> >
> >  	case PIF_SPLICE: {
> >  		struct flowside_sock_args args = {
> > diff --git a/pif.c b/pif.c
> > index 31723b29..5fb1f455 100644
> > --- a/pif.c
> > +++ b/pif.c
> > @@ -75,11 +75,7 @@ int pif_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif,
> >  		const union inany_addr *addr, const char *ifname,
> >  		in_port_t port, uint32_t data)
> >  {
> > -	union sockaddr_inany sa = {
> > -		.sa6.sin6_family = AF_INET6,
> > -		.sa6.sin6_addr = in6addr_any,
> > -		.sa6.sin6_port = htons(port),
> > -	};
> > +	union sockaddr_inany sa;
> >
> >  	ASSERT(pif_is_socket(pif));
> >
> > @@ -90,8 +86,8 @@ int pif_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif,
> >  	}
> >
> >  	if (!addr)
> > -		return sock_l4_sa(c, type, &sa, ifname, false, data);
> > +		return sock_l4_dualstack(c, type, port, ifname, data);
> >
> >  	pif_sockaddr(c, &sa, pif, addr, port);
> > -	return sock_l4_sa(c, type, &sa, ifname, sa.sa_family == AF_INET6, data);
> > +	return sock_l4(c, type, &sa, ifname, data);
> >  }
> > diff --git a/util.c b/util.c
> > index 976fcabe..c94efae4 100644
> > --- a/util.c
> > +++ b/util.c
> > @@ -40,7 +40,7 @@
> >  #endif
> >
> >  /**
> > - * sock_l4_sa() - Create and bind socket to socket address, add to epoll list
> > + * sock_l4_() - Create and bind socket to socket address, add to epoll list
> >   * @c:		Execution context
> >   * @type:	epoll type
> >   * @sa:		Socket address to bind to
> > @@ -50,9 +50,9 @@
> >   *
> >   * Return: newly created socket, negative error code on failure
> >   */
> > -int sock_l4_sa(const struct ctx *c, enum epoll_type type,
> > -	       const union sockaddr_inany *sa, const char *ifname,
> > -	       bool v6only, uint32_t data)
> > +static int sock_l4_(const struct ctx *c, enum epoll_type type,
> > +		    const union sockaddr_inany *sa, const char *ifname,
> > +		    bool v6only, uint32_t data)
> >  {
> >  	sa_family_t af = sa->sa_family;
> >  	union epoll_ref ref = { .type = type, .data = data };
> > @@ -182,6 +182,25 @@ int sock_l4_sa(const struct ctx *c, enum epoll_type type,
> >  	return fd;
> >  }
> >
> > +int sock_l4(const struct ctx *c, enum epoll_type type,
> > +	    const union sockaddr_inany *sa, const char *ifname,
> > +	    uint32_t data)
>
> Not extremely useful but it saves one "lookup":
>
> /**
>  * sock_l4() - Create and bind socket to given address, add to epoll list
>  * @c:		Execution context
>  * @type:	epoll type
>  * @sa:		Socket address to bind to
>  * @ifname:	Interface for binding, NULL for any
>  *
>  * Return: newly created socket, negative error code on failure
>  */

Oops, I meant to go back and add function comments here, but I
obviously forgot. Fixed. While there I removed the "add to epoll
list" which is no longer correct.

> > +{
> > +	return sock_l4_(c, type, sa, ifname, sa->sa_family == AF_INET6, data);
> > +}
> > +
> > +int sock_l4_dualstack(const struct ctx *c, enum epoll_type type,
> > +		      in_port_t port, const char *ifname, uint32_t data)
>
> ...same here, and the comment might be used to clarify the
> functionality.

Done.

> > +{
> > +	union sockaddr_inany sa = {
> > +		.sa6.sin6_family = AF_INET6,
> > +		.sa6.sin6_addr = in6addr_any,
> > +		.sa6.sin6_port = htons(port),
> > +	};
> > +
> > +	return sock_l4_(c, type, &sa, ifname, 0, data);
> > +}
> > +
> >  /**
> >   * sock_unix() - Create and bind AF_UNIX socket
> >   * @sock_path:	Socket path. If empty, set on return (UNIX_SOCK_PATH as prefix)
> > diff --git a/util.h b/util.h
> > index e1a1ebc9..7f0cf686 100644
> > --- a/util.h
> > +++ b/util.h
> > @@ -203,9 +203,11 @@ int do_clone(int (*fn)(void *), char *stack_area, size_t stack_size, int flags,
> >  struct ctx;
> >  union sockaddr_inany;
> >
> > -int sock_l4_sa(const struct ctx *c, enum epoll_type type,
> > -	       const union sockaddr_inany *sa, const char *ifname,
> > -	       bool v6only, uint32_t data);
> > +int sock_l4(const struct ctx *c, enum epoll_type type,
> > +	    const union sockaddr_inany *sa, const char *ifname,
> > +	    uint32_t data);
> > +int sock_l4_dualstack(const struct ctx *c, enum epoll_type type,
> > +		      in_port_t port, const char *ifname, uint32_t data);
> >  int sock_unix(char *sock_path);
> >  void sock_probe_mem(struct ctx *c);
> >  long timespec_diff_ms(const struct timespec *a, const struct timespec *b);
>
> --
> Stefano
>

--
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson