Date: Tue, 18 Nov 2025 01:19:26 +0100
From: Stefano Brivio
To: David Gibson
CC: passt-dev@passt.top
Subject: Re: [PATCH v3 6/8] util: Fix setting of IPV6_V6ONLY socket option
Message-ID: <20251118011926.462fe7f9@elisabeth>
References: <20251029062628.1647051-1-david@gibson.dropbear.id.au> <20251029062628.1647051-7-david@gibson.dropbear.id.au> <20251113073335.3a73f9b9@elisabeth>

On Fri, 14 Nov 2025 11:24:31 +1100
David Gibson wrote:

> On Thu, Nov 13, 2025 at 07:33:35AM +0100, Stefano Brivio wrote:
> > On Wed, 29 Oct 2025 17:26:26 +1100
> > David Gibson wrote:
> >
> > > Currently we only call setsockopt() on IPV6_V6ONLY when we want to set it
> > > to 1, which we typically do on all IPv6 sockets except those explicitly for
> > > dual stack listening.
> > > That's not quite right in two ways:
> > >
> > > * Although IPV6_V6ONLY==0 is normally the default on Linux, that can be
> > >   changed with the net.ipv6.bindv6only sysctl. It may also have different
> > >   defaults on other OSes if we ever support them. We know we need it off
> > >   for dual stack sockets, so explicitly set it to 0 in that case.
> > >
> > > * At the same time setting IPV6_V6ONLY to 1 for IPv6 sockets bound to a
> > >   specific address is harmless but pointless. Don't set the option at all
> > >   in this case, saving a syscall.
> >
> > I haven't checked the implications of this but __inet6_bind() handles
> > address "types" IPV6_ADDR_ANY and IPV6_ADDR_MAPPED in the same way, for
> > IPV6_V6ONLY purposes. I'm not sure if this has any influence on
> > functionality though.
>
> AFAICT, technically yes, but not in a way that really matters. IIUC
> for a v4-mapped address using IPV6_V6ONLY=0 will let the socket handle
> IPv4 traffic with the corresponding address. IPV6_V6ONLY will only
> handle IPv6 traffic that actually has the v4-mapped address on it.
>
> We won't actually get to the point of passing v4-mapped addresses to
> the kernel in passt/pasta, because we already use those to mean "IPv4"
> and will go to the IPv4 paths.
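
For reference, a minimal sketch of the first point in the quoted commit
message: setting IPV6_V6ONLY explicitly and reading it back, so that the
result doesn't depend on net.ipv6.bindv6only. This is not passt code, and
v6only_roundtrip() is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of passt: create a TCP IPv6 socket, force
 * IPV6_V6ONLY to @want (0 or 1) before any bind(), and return the value the
 * kernel reports back, or -1 if any step fails (e.g. no IPv6 support).
 */
int v6only_roundtrip(int want)
{
	int got = -1;
	socklen_t len = sizeof(got);
	int fd;

	assert(want == 0 || want == 1);

	fd = socket(AF_INET6, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* Set the option explicitly instead of trusting the system default */
	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &want, sizeof(want)) ||
	    getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &got, &len))
		got = -1;

	close(fd);
	return got;
}
```

On a host with IPv6 available, v6only_roundtrip(0) should report 0 whatever
the sysctl says, which is the property a dual stack listening socket relies
on.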
>
> > > Signed-off-by: David Gibson
> > > ---
> > >  util.c | 27 +++++++++++++++++++++------
> > >  1 file changed, 21 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/util.c b/util.c
> > > index c94efae4..62f43895 100644
> > > --- a/util.c
> > > +++ b/util.c
> > > @@ -45,14 +45,14 @@
> > >   * @type:      epoll type
> > >   * @sa:        Socket address to bind to
> > >   * @ifname:    Interface for binding, NULL for any
> > > - * @v6only:    Set IPV6_V6ONLY socket option
> > > + * @v6only:    If >= 0, set IPV6_V6ONLY socket option to this value
> > >   * @data:      epoll reference portion for protocol handlers
> > >   *
> > >   * Return: newly created socket, negative error code on failure
> > >   */
> > >  static int sock_l4_(const struct ctx *c, enum epoll_type type,
> > >                      const union sockaddr_inany *sa, const char *ifname,
> > > -                    bool v6only, uint32_t data)
> > > +                    int v6only, uint32_t data)
> > >  {
> > >          sa_family_t af = sa->sa_family;
> > >          union epoll_ref ref = { .type = type, .data = data };
> > > @@ -101,9 +101,11 @@ static int sock_l4_(const struct ctx *c, enum epoll_type type,
> > >
> > >          ref.fd = fd;
> > >
> > > -        if (v6only)
> > > -                if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &y, sizeof(y)))
> > > -                        debug("Failed to set IPV6_V6ONLY on socket %i", fd);
> > > +        if (v6only >= 0)
> > > +                if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
> > > +                               &v6only, sizeof(v6only)))
> > > +                        debug("Failed to set IPV6_V6ONLY to %d on socket %i",
> > > +                              v6only, fd);
> >
> > Nit: curly brackets (two pairs) for consistency.
>
> Fixed.

> > >
> > >          if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &y, sizeof(y)))
> > >                  debug("Failed to set SO_REUSEADDR on socket %i", fd);
> > > @@ -186,7 +188,16 @@ int sock_l4(const struct ctx *c, enum epoll_type type,
> > >              const union sockaddr_inany *sa, const char *ifname,
> > >              uint32_t data)
> > >  {
> > > -        return sock_l4_(c, type, sa, ifname, sa->sa_family == AF_INET6, data);
> > > +        int v6only = -1;
> > > +
> > > +        /* The option doesn't exist for IPv4 sockets, and is irrelevant for IPv6
> > > +         * sockets with a non-wildcard address.
> >
> > Same as above: I don't think this is true, strictly speaking, but I
> > didn't check whether this inaccuracy is in any way relevant.
>
> Right, so technically yes, but I don't think it's relevant for the
> reasons I gave above. I've rephrased to "we don't care about it for
> IPv6 sockets with ...". Does that help?

Yes, it does, thanks.

> > > +         */
> > > +        if (sa->sa_family == AF_INET6 &&
> > > +            IN6_IS_ADDR_UNSPECIFIED(&sa->sa6.sin6_addr))
> > > +                v6only = 1;
> > > +
> > > +        return sock_l4_(c, type, sa, ifname, v6only, data);
> > >  }
> > >
> > >  int sock_l4_dualstack(const struct ctx *c, enum epoll_type type,
> > > @@ -198,6 +209,10 @@ int sock_l4_dualstack(const struct ctx *c, enum epoll_type type,
> > >                  .sa6.sin6_port = htons(port),
> > >          };
> > >
> > > +        /* Dual stack sockets require IPV6_V6ONLY == 0. Usually that's the
> > > +         * default, but sysctl net.ipv6.bindv6only can change that, so set the
> > > +         * sockopt explicitly.
> > > +         */
> > >          return sock_l4_(c, type, &sa, ifname, 0, data);
> > >  }
> > >
> >
> > The rest of the series looks good to me, including 8/8, but it's not
> > clear to me what the "secondary reasons" to consider 8/8 at this stage
> > might be, so I'm not actually sure what to do with it.
> >
> > Is it because it drops some code?
>
> * It drops some code
> * I think it will address bug 113 (still need to check)

Right, I forgot about this part.

> * It might also help for some other systemd-resolved edge cases
> * It will make life a bit easier to implement bug 171
> * I think the improved symmetry will make other flexible forwarding
>   changes a bit easier
>
> I'll look into bug 113 and revise the commit message for 8/8 in the
> next spin.

-- 
Stefano
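
As a footnote to the sock_l4() hunk quoted above, the tri-state choice it
makes can be sketched on its own. The helper below is a hypothetical
stand-in, not the passt function: it takes a plain struct sockaddr instead
of union sockaddr_inany, and only mirrors the logic shown in the diff.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the decision in the quoted sock_l4() hunk:
 * return the value to pass to setsockopt(IPV6_V6ONLY), or -1 to leave the
 * option untouched.
 */
int v6only_for(const struct sockaddr *sa)
{
	const struct sockaddr_in6 *sa6;

	assert(sa);

	if (sa->sa_family != AF_INET6)
		return -1;	/* the option doesn't exist for IPv4 sockets */

	sa6 = (const struct sockaddr_in6 *)sa;
	if (IN6_IS_ADDR_UNSPECIFIED(&sa6->sin6_addr))
		return 1;	/* IPv6 wildcard: IPv6 traffic only */

	return -1;		/* specific address: skip the syscall */
}
```

A dual stack listener wouldn't go through this helper at all: as in the
sock_l4_dualstack() hunk, it would pass 0 unconditionally.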