From mboxrd@z Thu Jan 1 00:00:00 1970
Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au
Authentication-Results: passt.top; dkim=fail reason="key not found in DNS" header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202312 header.b=Ep7NnzIj; dkim-atps=neutral
Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])
	by passt.top (Postfix) with ESMTPS id 186FB5A026E
	for ; Fri, 16 Aug 2024 02:56:00 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gibson.dropbear.id.au; s=202312; t=1723769755;
	bh=47sCkfJhGqiPcFmbQIiLHKabc4Uus+Yv9N5kNUyQTGw=;
	h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
	b=Ep7NnzIjZhn6TXG2pKL9L0g9K3fM2QTlYUPhDoQaSgCgNHX/Ey3pnvdPF9dHR0Brv
	 H5A8u3HBplo9UKFpmmAqY5s1gn1jrM6KTnyViEU9rFUcdVFCOjld5Nsg1wKsBoMztI
	 HkA2s3rsIqW/JXBYbiKDr51/mWkxhepzV9Ja0w2flp5NXphd5Jayqfx56sU00h2ldY
	 /mVCS86yqemQ6Z2gIc+2WtIxlpUcqygv6WF7l3V5VXXyYVufiiI6997s34sM0Xrjx7
	 dPAbPR3HUXjBUbtb3SMjhTZY1m6oRbpIEDP9nZMvGDhK39zmtrWmJoNRSsNHBBV6/4
	 +jio8FE4wKB0g==
Received: by gandalf.ozlabs.org (Postfix, from userid 1007)
	id 4WlNpM3wmGz4x0t; Fri, 16 Aug 2024 10:55:55 +1000 (AEST)
Date: Fri, 16 Aug 2024 10:55:45 +1000
From: David Gibson
To: Stefano Brivio
Subject: Re: [PATCH v2 4/7] netlink, pasta: Disable DAD for link-local addresses on namespace interface
Message-ID:
References: <20240815083649.4188007-1-sbrivio@redhat.com>
	<20240815083649.4188007-5-sbrivio@redhat.com>
	<20240815125932.53f38296@elisabeth>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="6ZP6e9zwO18/cJYX"
Content-Disposition: inline
In-Reply-To: <20240815125932.53f38296@elisabeth>
Message-ID-Hash: X33MDHMZELMX7WX6JDD3462AGCHGOYJN
X-Message-ID-Hash: X33MDHMZELMX7WX6JDD3462AGCHGOYJN
X-MailFrom: dgibson@gandalf.ozlabs.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation;
	nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: passt-dev@passt.top, Paul Holzinger
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches for passt
Archived-At:
Archived-At:
List-Archive:
List-Archive:
List-Help:
List-Owner:
List-Post:
List-Subscribe:
List-Unsubscribe:

--6ZP6e9zwO18/cJYX
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Aug 15, 2024 at 12:59:32PM +0200, Stefano Brivio wrote:
> On Thu, 15 Aug 2024 20:38:17 +1000
> David Gibson wrote:
>
> > On Thu, Aug 15, 2024 at 10:36:46AM +0200, Stefano Brivio wrote:
> > > It makes no sense for a container or a guest to try and perform
> > > duplicate address detection for their link-local address, as we'll
> > > anyway not relay neighbour solicitations with an unspecified source
> > > address.
> > >
> > > While they perform duplicate address detection, the link-local address
> > > is not usable, which prevents us from bringing up especially
> > > containers and communicate with them right away via IPv6.
> > >
> > > This is not enough to prevent DAD and reach the container right away:
> > > we'll need a couple more patches.
> > >
> > > As we send NLM_F_REPLACE requests right away, while we still have to
> > > read out other addresses on the same socket, we can't use nl_do():
> > > keep a count of messages we send (addresses we change) and deal with
> > > the answer to those NLM_F_REPLACE requests in a separate loop, later.
> > >
> > > Link: https://github.com/containers/podman/pull/23561#discussion_r1711639663
> > > Signed-off-by: Stefano Brivio
> > > ---
> > >  netlink.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  netlink.h |  1 +
> > >  pasta.c   |  6 ++++++
> > >  3 files changed, 62 insertions(+)
> > >
> > > diff --git a/netlink.c b/netlink.c
> > > index 873e6c7..59f2fd9 100644
> > > --- a/netlink.c
> > > +++ b/netlink.c
> > > @@ -673,6 +673,61 @@ int nl_route_dup(int s_src, unsigned int ifi_src,
> > >  	return 0;
> > >  }
> > >
> > > +/**
> > > + * nl_addr_set_ll_nodad() - Set IFA_F_NODAD on IPv6 link-local addresses
> > > + * @s:		Netlink socket
> > > + * @ifi:	Interface index in target namespace
> > > + *
> > > + * Return: 0 on success, negative error code on failure
> > > + */
> > > +int nl_addr_set_ll_nodad(int s, unsigned int ifi)
> > > +{
> > > +	struct req_t {
> > > +		struct nlmsghdr nlh;
> > > +		struct ifaddrmsg ifa;
> > > +	} req = {
> > > +		.ifa.ifa_family = AF_INET6,
> > > +		.ifa.ifa_index = ifi,
> > > +	};
> > > +	unsigned ll_addrs = 0;
> > > +	struct nlmsghdr *nh;
> > > +	char buf[NLBUFSIZ];
> > > +	ssize_t status;
> > > +	uint32_t seq;
> > > +
> > > +	seq = nl_send(s, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
> > > +	nl_foreach_oftype(nh, status, s, buf, seq, RTM_NEWADDR) {
> > > +		struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
> > > +		struct rtattr *rta;
> > > +		size_t na;
> > > +
> > > +		if (ifa->ifa_index != ifi || ifa->ifa_scope != RT_SCOPE_LINK)
> > > +			continue;
> > > +
> > > +		ifa->ifa_flags |= IFA_F_NODAD;
> > > +
> > > +		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
> > > +		     rta = RTA_NEXT(rta, na)) {
> > > +			/* If 32-bit flags are used, add IFA_F_NODAD there */
> > > +			if (rta->rta_type == IFA_FLAGS)
> > > +				*(uint32_t *)RTA_DATA(rta) |= IFA_F_NODAD;
> > > +		}
> > > +
> > > +		nl_send(s, nh, RTM_NEWADDR, NLM_F_REPLACE, nh->nlmsg_len);
> > > +		ll_addrs++;
> > > +	}
> > > +
> > > +	if (status < 0)
> > > +		return status;
> >
> > Ah... one gotcha with the nl_send() in the loop.  We should make sure
> > we get the responses from any of those we sent, even if the original
> > request failed.  Otherwise we'll be out of sync on the netlink socket again.
>
> I'm ignoring the return code of nl_send(), so, minus the issue you're
> raising about nl_foreach() below, that should already be sorted, right?

No.  The return code from nl_send() is mostly irrelevant - it's just
the sequence number (other errors die()).  But the point is you've
queued requests, so the kernel will queue responses and if you exit
the function here, nothing will consume them.

> > > +	seq += ll_addrs;
> > > +
> > > +	nl_foreach(nh, status, s, buf, seq)
> > > +		warn("netlink: Unexpected response message");
> >
> > I don't think this will work right if there's > 1 address.  It will be
> > looking for the last sequence number on the first iteration and will
> > die in nl_status() when it mismatches.
>
> Ah, oops, right.
>
> > Maybe just loop on nl_next() until you get the last seq number, then
> > call nl_status()?
>
> How do I check for errors on the answers before the next one? I mean,
> nl_foreach() should fit here, it's just that I need to start from the
> right sequence number.
>
> > That also means you could just save the seq each
> > time you nl_send(), overwriting the previous one, rather than relying
> > on the fact that we allocate seqs, well, sequentially.
>
> I don't understand how this fits with calling nl_next() until I get
> to the last sequence number. Letting that aside, can't I simply use
> nl_foreach(), but start with the sequence of the first nl_send()
> instead of the last one?

Uh.. yeah, it's a bit fiddly.  Especially since in those foreach loops
status does double duty as the remaining data in the current message
and as the status code.

# Option 1

Assuming contiguous sequence numbers, which is true for now.
- Change the nl_send() within the first loop to

	last_seq = nl_send(...);

Then immediately after the first loop

	ssize_t status2 = status;

	for (seq++; seq <= last_seq; seq++) {
		nl_foreach(nh, status2, s, buf, seq)
			;
		/* Keep only the first error reported */
		if (status == 0)
			status = status2;
	}

At this point you will have consumed all the responses and status will
have the first reported error code.

# Option 2

Refactor nl_status() to have a version that reports sequence number
instead of taking & checking it.  Loop on nl_next() until
nl_status_variant() returns <= 0 *and* reports the last sequence number.

# Option 3

Open-coded version of (2)

	ssize_t err = status;

	do {
		nh = nl_next(s, buf, nh, &status);
		/* Keep only the first error reported */
		if (err == 0 && nh->nlmsg_type == NLMSG_ERROR) {
			struct nlmsgerr *errmsg = (struct nlmsgerr *)NLMSG_DATA(nh);

			err = errmsg->error;
		}
	} while (nh->nlmsg_seq != last_seq ||
		 (nh->nlmsg_type != NLMSG_DONE &&
		  nh->nlmsg_type != NLMSG_ERROR));

And at this point, again, you've consumed all the responses and 'err'
has the first error code.

I think this is roughly what I was suggesting originally, but it is
messier than I thought.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

--6ZP6e9zwO18/cJYX
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAma+o4wACgkQzQJF27ox
2Gd5JQ//XM8O5RcWZhQUnz6Jwf1HHjn9yyEE4wP0xCunFwODH4DaqT4SF2JfCslh
Tz07cUKc+7QWz2WAOz9CLpVAhWYcbGpUx1BY0wiOGfFF8B9IZGVt1qgt2eNynjDL
+Tesp5yiaznKxfw51anNIUM+YF8XHwD1xzh3qdgd8UwbVxfybzH1kjxI1k2RDg0X
J5U2e2bn1NFpBK93Z/rLG1MhTP2y3upB/jnzzdZNn2KiDcvFK11+9Os7N8vRl0Pe
kY8Uxny4kjWjc6bk6t5jldfy9n99LuVXZtPQflpFVe3KpZLq98SSFniD75ulbumw
m3/40h+W0mr7Wyri20rX8CfXhhqNIvo1D56DBGX7FpcnzTjrRaWXAZ3ffhwOoYxi
whrXAMtAYL7C12f1IHIA11AFc7+zl15UxbbCO5oFRyw3nUQITd3DN/OUhCEDPASC
dF/GGihPYVKWzvpBAqYjFuaJ3KBGve/esXsFjuva+adUpGfaCUV0NQNjMXyR/Sst
iNsp8mY5KHJUywq5NnMlaujnR7Es2upm507jARG3JlZfRBxljI6zLtJgpgM6Z7Hk
HxFMgFSqjeUqMdMgxEb9A/T2rugrthycfCFtYAQIXsckZu6OfXCZ6V19g0D8RI7I
vSlRVvRoU7H2ReiHsa/IdaSpUEdaq4RtdArC90Pvl7mcdODCYAI=
=h+AT
-----END PGP SIGNATURE-----

--6ZP6e9zwO18/cJYX--
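
For illustration only, the response-draining idea from Option 3 in the
message above can be tried out against a mock queue instead of a real
netlink socket. This is a minimal sketch, not code from passt:
struct mock_resp, drain_responses() and the demo_*() helpers are
hypothetical names, and the mock message types only mirror the values
of NLMSG_ERROR and NLMSG_DONE. It assumes every request is answered by
an error message, with error code 0 standing for an ACK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for netlink message types: the values mirror
 * NLMSG_ERROR and NLMSG_DONE from <linux/netlink.h>, but nothing here
 * touches a real socket.
 */
enum mock_type { MOCK_NLMSG_ERROR = 2, MOCK_NLMSG_DONE = 3 };

/* Hypothetical, simplified response: only what the loop inspects */
struct mock_resp {
	uint32_t seq;		/* Sequence number of the answered request */
	enum mock_type type;	/* Message type */
	int error;		/* Error for MOCK_NLMSG_ERROR, 0 means ACK */
};

/**
 * drain_responses() - Consume mock responses up to the last request
 * @q:		Queued responses, in arrival order
 * @n:		Number of entries in @q, at least one
 * @last_seq:	Sequence number of the last request sent
 * @err:	Error already seen by the caller, 0 if none
 * @used:	Set to the number of entries consumed
 *
 * Return: first error seen, including @err, or 0 if there was none
 */
int drain_responses(const struct mock_resp *q, size_t n, uint32_t last_seq,
		    int err, size_t *used)
{
	const struct mock_resp *r;
	size_t i = 0;

	assert(n > 0);

	do {
		r = &q[i++];
		/* Remember the first error only, but keep on reading */
		if (!err && r->type == MOCK_NLMSG_ERROR)
			err = r->error;
	} while (i < n && (r->seq != last_seq ||
			   (r->type != MOCK_NLMSG_DONE &&
			    r->type != MOCK_NLMSG_ERROR)));

	*used = i;
	return err;
}

/* Three requests, sequence numbers 11 to 13, the second one is refused;
 * the entry with sequence number 14 answers a later, unrelated request.
 */
static const struct mock_resp demo_queue[] = {
	{ 11, MOCK_NLMSG_ERROR, 0 },
	{ 12, MOCK_NLMSG_ERROR, -17 },	/* -EEXIST on Linux */
	{ 13, MOCK_NLMSG_ERROR, 0 },
	{ 14, MOCK_NLMSG_ERROR, 0 },
};

/* Demo helper: first error reported for requests 11 to 13 */
int demo_first_error(void)
{
	size_t used;

	return drain_responses(demo_queue, 4, 13, 0, &used);
}

/* Demo helper: number of responses consumed for requests 11 to 13 */
size_t demo_consumed(void)
{
	size_t used;

	drain_responses(demo_queue, 4, 13, 0, &used);
	return used;
}
```

With the queue above, the loop stops once it has read the answer
carrying the last sequence number, so the answer to the unrelated
request stays queued, and the error from the second request is the one
reported, even though a later request succeeded.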