From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])
	by passt.top (Postfix) with ESMTPS id 148C95A0272;
	Thu, 3 Aug 2023 04:27:09 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gibson.dropbear.id.au; s=201602; t=1691029621;
	bh=/0nzTrv+wVevqhltJ9m45cn4WTEnkyGLGObXL/AiQqw=;
	h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
	b=NoM8acjxvukdk+v5TPwsj4Ye26fvUi4BP6WrHD84TBkfXznyQ68bKvaxXo713099W
	 jJuY7lyqbXwSu2n+KCnOMNGnzdVXliCKcim0lYt7MaAl3uras5xbV/Y+e3j9dMiZhM
	 oSMln1ikT4ntUsoELYHrmMNYuCXF8ZpF/J8N/VXY=
Received: by gandalf.ozlabs.org (Postfix, from userid 1007)
	id 4RGXmP4JnMz4wqW; Thu, 3 Aug 2023 12:27:01 +1000 (AEST)
Date: Thu, 3 Aug 2023 12:09:16 +1000
From: David Gibson
To: Stefano Brivio
Subject: Re: [PATCH 01/17] netlink: Split up functionality if nl_link()
References: <20230724060936.952659-1-david@gibson.dropbear.id.au>
	<20230724060936.952659-2-david@gibson.dropbear.id.au>
	<20230803004729.03ca0e36@elisabeth>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="SaebAXqryUJycrHR"
Content-Disposition: inline
In-Reply-To: <20230803004729.03ca0e36@elisabeth>
Message-ID-Hash: 6FFXGYSFQFM4ZS5NPFS7RNVZVBFH4PXM
X-Message-ID-Hash: 6FFXGYSFQFM4ZS5NPFS7RNVZVBFH4PXM
X-MailFrom: dgibson@gandalf.ozlabs.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
	loop; banned-address; member-moderation; nonmember-moderation;
	administrivia; implicit-dest; max-recipients; max-size;
	news-moderation; no-subject; digests; suspicious-header
CC: passt-dev@passt.top
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches for passt

--SaebAXqryUJycrHR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Aug 03, 2023 at 12:47:29AM +0200, Stefano Brivio wrote:
> In the subject: s/if/of/.
>
> On Mon, 24 Jul 2023 16:09:20 +1000
> David Gibson wrote:
>
> > nl_link() performs a number of functions: it can bring links up, set MAC
> > address and MTU and also retrieve the existing MAC. This makes for a small
> > number of lines of code, but high conceptual complexity: it's quite hard
> > to follow what's going on both in nl_link() itself and it's also not very
> > obvious which function its callers are intending to use.
>
> Actually I don't find nl_link() *that* bad, but for consistency with the
> next patches this definitely makes sense.

Eh.

> > Clarify this, by splitting nl_link() into nl_link_up(), nl_link_set_mac(),
> > and nl_link_get_mac(). The first brings up a link, optionally setting the
> > MTU, the others get or set the MAC address.
> >
> > This fixes an arguable bug in pasta_ns_conf(): it looks as though that was
> > intended to retrieve the guest MAC whether or not c->pasta_conf_ns is set.
> > However, it only actually does so in the !c->pasta_conf_ns case: the fact
> > that we set up==1 means we would only ever set, never get, the MAC in the
> > nl_link() call in the other path. We get away with this because the MAC
> > will quickly be discovered once we receive packets on the tap interface.
> > Still, it's neater to always get the MAC address here.
>
> Actually, the intention wasn't to always retrieve the namespaced MAC
> address: I thought I'd do that only if we don't configure the
> interface, because we want NDP and DHCP to be "ready".

Huh, ok. Still very hard to follow though, because a policy decision
of the caller is being implemented by the subtle interactions of
parameters to nl_link() itself, making the intent clear in neither
location.

> But that's not
> really relevant... I guess yes, it's more consistent if we fetch it in
> any case (as long as we don't configure it).
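
To make the point about caller policy concrete, here's a minimal
standalone sketch. Everything in it is hypothetical:
stub_link_get_mac(), stub_link_set_mac() and get_or_set_mac() are
stand-ins that only record which operation was requested, they are not
the passt functions and don't touch netlink at all.

```c
#include <assert.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-ins for the netlink helpers: they only record the call */
enum call { CALL_NONE, CALL_GET_MAC, CALL_SET_MAC };
static enum call last_call = CALL_NONE;

/* Pretend the kernel reports this (made-up) address for the interface */
static void stub_link_get_mac(unsigned char *mac)
{
	static const unsigned char kernel_mac[ETH_ALEN] = { 0x02, 0, 0, 0, 0, 0x01 };

	memcpy(mac, kernel_mac, ETH_ALEN);
	last_call = CALL_GET_MAC;
}

/* Pretend to program the given address into the interface */
static void stub_link_set_mac(const unsigned char *mac)
{
	(void)mac;
	last_call = CALL_SET_MAC;
}

static int mac_is_zero(const unsigned char *mac)
{
	static const unsigned char zero[ETH_ALEN];

	return !memcmp(mac, zero, ETH_ALEN);
}

/* The caller states its policy directly: fetch if unset, otherwise apply */
static void get_or_set_mac(unsigned char *mac)
{
	if (mac_is_zero(mac))
		stub_link_get_mac(mac);
	else
		stub_link_set_mac(mac);
}

/* Usage: an unset address is fetched, a configured one is applied */
static void example(void)
{
	unsigned char unset[ETH_ALEN] = { 0 };
	unsigned char fixed[ETH_ALEN] = { 0x02, 0, 0, 0, 0, 0x02 };

	get_or_set_mac(unset);
	assert(last_call == CALL_GET_MAC);

	get_or_set_mac(fixed);
	assert(last_call == CALL_SET_MAC);
}
```

With separate get and set helpers, the choice between them is a plain
branch in the caller rather than a side effect of which arguments
happen to be zero.
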
>
> >
> > Signed-off-by: David Gibson
> > ---
> >  conf.c    |   4 +-
> >  netlink.c | 143 +++++++++++++++++++++++++++++++-----------------------
> >  netlink.h |   4 +-
> >  pasta.c   |  12 +++--
> >  4 files changed, 96 insertions(+), 67 deletions(-)
> >
> > diff --git a/conf.c b/conf.c
> > index 78eaf2d..2ff9e2a 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -670,7 +670,7 @@ static unsigned int conf_ip4(unsigned int ifi,
> >  	memcpy(&ip4->addr_seen, &ip4->addr, sizeof(ip4->addr_seen));
> >
> >  	if (MAC_IS_ZERO(mac))
> > -		nl_link(0, ifi, mac, 0, 0);
> > +		nl_link_get_mac(0, ifi, mac);
> >
> >  	if (IN4_IS_ADDR_UNSPECIFIED(&ip4->addr) ||
> >  	    MAC_IS_ZERO(mac))
> > @@ -711,7 +711,7 @@ static unsigned int conf_ip6(unsigned int ifi,
> >  	memcpy(&ip6->addr_ll_seen, &ip6->addr_ll, sizeof(ip6->addr_ll));
> >
> >  	if (MAC_IS_ZERO(mac))
> > -		nl_link(0, ifi, mac, 0, 0);
> > +		nl_link_get_mac(0, ifi, mac);
> >
> >  	if (IN6_IS_ADDR_UNSPECIFIED(&ip6->addr) ||
> >  	    IN6_IS_ADDR_UNSPECIFIED(&ip6->addr_ll) ||
> > diff --git a/netlink.c b/netlink.c
> > index e15e23f..4b1f75e 100644
> > --- a/netlink.c
> > +++ b/netlink.c
> > @@ -486,83 +486,44 @@ next:
> >  }
> >
> >  /**
> > - * nl_link() - Get/set link attributes
> > + * nl_link_get_mac() - Get link MAC address
> >   * @ns: Use netlink socket in namespace
> >   * @ifi: Interface index
> > - * @mac: MAC address to fill, if passed as zero, to set otherwise
> > - * @up: If set, bring up the link
> > - * @mtu: If non-zero, set interface MTU
> > + * @mac: Fill with current MAC address
> >   */
> > -void nl_link(int ns, unsigned int ifi, void *mac, int up, int mtu)
> > +void nl_link_get_mac(int ns, unsigned int ifi, void *mac)
> >  {
> > -	int change = !MAC_IS_ZERO(mac) || up || mtu;
> >  	struct req_t {
> >  		struct nlmsghdr nlh;
> >  		struct ifinfomsg ifm;
> > -		struct rtattr rta;
> > -		union {
> > -			unsigned char mac[ETH_ALEN];
> > -			struct {
> > -				unsigned int mtu;
> > -			} mtu;
> > -		} set;
> >  	} req = {
> > -		.nlh.nlmsg_type = change ? RTM_NEWLINK : RTM_GETLINK,
> > -		.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
> > -		.nlh.nlmsg_flags = NLM_F_REQUEST | (change ? NLM_F_ACK : 0),
> > +		.nlh.nlmsg_type = RTM_GETLINK,
> > +		.nlh.nlmsg_len = sizeof(req),
>
> I don't think there's a practical issue with this, but there were two
> reasons why I used NLMSG_LENGTH(sizeof(struct ifinfomsg)) instead:
>
> - NLMSG_LENGTH() aligns to 4 bytes, not to whatever
>   architecture-dependent alignment we might have: the message might
>   actually be smaller

Oof... so. On the one hand, I see the issue; if these are different,
I'm not sure what the effect will be. On the other hand, if we use
NLMSG_LENGTH and it *is* longer than the structure size, we'll be
saying that this message is longer than the datagram containing it.
I'm not sure what the effect of that will be either.

Not really sure what to do about this.

> - I see that this works with gcc and clang, but, strictly
>   speaking, is the size of the struct known "before"
>   (sequence-point-wise) we're done initialising it? I have a very vague
>   memory of this not working with gcc 2.9 or suchlike -- which is not a
>   problem, as long as our new friend C11 actually supports this (but
>   I'm not entirely sure).

I'm pretty sure it's ok, regardless of C11 state. It's not really a
question of sequence points: those are about the ordering of run time
operations. Even though the structure is being defined inline,
determining its size and layout will still happen at compile time,
whereas the initialization is obviously a runtime event.

> Then, in 9/17, NLMSG_LENGTH() could be conveniently used by nl_req().
>
> > +		.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
> >  		.nlh.nlmsg_seq = nl_seq++,
> >  		.ifm.ifi_family = AF_UNSPEC,
> >  		.ifm.ifi_index = ifi,
> > -		.ifm.ifi_flags = up ? IFF_UP : 0,
> > -		.ifm.ifi_change = up ? IFF_UP : 0,
> >  	};
> > -	struct ifinfomsg *ifm;
> >  	struct nlmsghdr *nh;
> > -	struct rtattr *rta;
> >  	char buf[NLBUFSIZ];
> >  	ssize_t n;
> > -	size_t na;
> > -
> > -	if (!MAC_IS_ZERO(mac)) {
> > -		req.nlh.nlmsg_len = sizeof(req);
> > -		memcpy(req.set.mac, mac, ETH_ALEN);
> > -		req.rta.rta_type = IFLA_ADDRESS;
> > -		req.rta.rta_len = RTA_LENGTH(ETH_ALEN);
> > -		if (nl_req(ns, buf, &req, req.nlh.nlmsg_len) < 0)
> > -			return;
> > -
> > -		up = 0;
> > -	}
> > -
> > -	if (mtu) {
> > -		req.nlh.nlmsg_len = offsetof(struct req_t, set.mtu)
> > -				  + sizeof(req.set.mtu);
> > -		req.set.mtu.mtu = mtu;
> > -		req.rta.rta_type = IFLA_MTU;
> > -		req.rta.rta_len = RTA_LENGTH(sizeof(unsigned int));
> > -		if (nl_req(ns, buf, &req, req.nlh.nlmsg_len) < 0)
> > -			return;
> > -
> > -		up = 0;
> > -	}
> > -
> > -	if (up && nl_req(ns, buf, &req, req.nlh.nlmsg_len) < 0)
> > -		return;
> > -
> > -	if (change)
> > -		return;
> >
> > -	if ((n = nl_req(ns, buf, &req, req.nlh.nlmsg_len)) < 0)
> > +	n = nl_req(ns, buf, &req, sizeof(req));
> > +	if (n < 0)
> >  		return;
> > +
> > +	for (nh = (struct nlmsghdr *)buf;
> > +	     NLMSG_OK(nh, n) && nh->nlmsg_type != NLMSG_DONE;
> > +	     nh = NLMSG_NEXT(nh, n)) {
> > +		struct ifinfomsg *ifm = (struct ifinfomsg *)NLMSG_DATA(nh);
> > +		struct rtattr *rta;
> > +		size_t na;
> >
> > -	nh = (struct nlmsghdr *)buf;
> > -	for ( ; NLMSG_OK(nh, n); nh = NLMSG_NEXT(nh, n)) {
> >  		if (nh->nlmsg_type != RTM_NEWLINK)
> > -			goto next;
> > -
> > -		ifm = (struct ifinfomsg *)NLMSG_DATA(nh);
> > +			continue;
> >
> > -		for (rta = IFLA_RTA(ifm), na = RTM_PAYLOAD(nh); RTA_OK(rta, na);
> > +		for (rta = IFLA_RTA(ifm), na = RTM_PAYLOAD(nh);
> > +		     RTA_OK(rta, na);
> >  		     rta = RTA_NEXT(rta, na)) {
> >  			if (rta->rta_type != IFLA_ADDRESS)
> >  				continue;
> > @@ -570,8 +531,70 @@ void nl_link(int ns, unsigned int ifi, void *mac, int up, int mtu)
> >  			memcpy(mac, RTA_DATA(rta), ETH_ALEN);
> >  			break;
> >  		}
> > -next:
> > -		if (nh->nlmsg_type == NLMSG_DONE)
> > -			break;
> >  	}
> >  }
> > +
> > +/**
> > + * nl_link_set_mac() - Set link MAC address
> > + * @ns: Use netlink socket in namespace
> > + * @ifi: Interface index
> > + * @mac: MAC address to set
> > + */
> > +void nl_link_set_mac(int ns, unsigned int ifi, void *mac)
> > +{
> > +	struct req_t {
> > +		struct nlmsghdr nlh;
> > +		struct ifinfomsg ifm;
> > +		struct rtattr rta;
> > +		unsigned char mac[ETH_ALEN];
> > +	} req = {
> > +		.nlh.nlmsg_type = RTM_NEWLINK,
> > +		.nlh.nlmsg_len = sizeof(req),
>
> Same here.
>
> > +		.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
> > +		.nlh.nlmsg_seq = nl_seq++,
> > +		.ifm.ifi_family = AF_UNSPEC,
> > +		.ifm.ifi_index = ifi,
> > +		.rta.rta_type = IFLA_ADDRESS,
> > +		.rta.rta_len = RTA_LENGTH(ETH_ALEN),
> > +	};
> > +	char buf[NLBUFSIZ];
> > +
> > +	memcpy(req.mac, mac, ETH_ALEN);
> > +
> > +	nl_req(ns, buf, &req, sizeof(req));
> > +}
> > +
> > +/**
> > + * nl_link_up() - Bring link up
> > + * @ns: Use netlink socket in namespace
> > + * @ifi: Interface index
> > + * @mtu: If non-zero, set interface MTU
> > + */
> > +void nl_link_up(int ns, unsigned int ifi, int mtu)
> > +{
> > +	struct req_t {
> > +		struct nlmsghdr nlh;
> > +		struct ifinfomsg ifm;
> > +		struct rtattr rta;
> > +		unsigned int mtu;
> > +	} req = {
> > +		.nlh.nlmsg_type = RTM_NEWLINK,
> > +		.nlh.nlmsg_len = sizeof(req),
>
> And here.
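
For reference, one standalone way to compare the two length
computations for the request layouts in this patch. This is only a
sketch: it assumes the Linux UAPI headers are available, and the names
req_mac, req_mtu, nl_len_one_attr() and print_lengths() are made up for
the example, they aren't in passt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <linux/if_ether.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Same member layout as the request in nl_link_set_mac() */
struct req_mac {
	struct nlmsghdr nlh;
	struct ifinfomsg ifm;
	struct rtattr rta;
	unsigned char mac[ETH_ALEN];
};

/* Same member layout as the request in nl_link_up() */
struct req_mtu {
	struct nlmsghdr nlh;
	struct ifinfomsg ifm;
	struct rtattr rta;
	unsigned int mtu;
};

/* The attribute payload is assumed to start right after the rtattr header */
static_assert(sizeof(struct rtattr) == RTA_ALIGN(sizeof(struct rtattr)),
	      "rtattr header size is not a multiple of RTA_ALIGNTO");

/* Length by netlink's macros: ifinfomsg followed by one attribute */
static size_t nl_len_one_attr(size_t payload)
{
	return NLMSG_LENGTH(sizeof(struct ifinfomsg) + RTA_LENGTH(payload));
}

/* Print the compiler's view and the macros' view side by side */
static void print_lengths(void)
{
	printf("set_mac: sizeof(req) %zu, macros %zu\n",
	       sizeof(struct req_mac), nl_len_one_attr(ETH_ALEN));
	printf("link_up: sizeof(req) %zu, macros %zu\n",
	       sizeof(struct req_mtu), nl_len_one_attr(sizeof(unsigned int)));
	printf("link_up, no MTU: offsetof(rta) %zu, macros %zu\n",
	       offsetof(struct req_mtu, rta),
	       (size_t)NLMSG_LENGTH(sizeof(struct ifinfomsg)));
}
```

Calling print_lengths() on a given target shows whether the compiler's
padding and netlink's 4-byte rounding agree there. I'd expect any
difference to be limited to trailing padding after the MAC, but that's
an assumption about the ABI, not something the headers guarantee.
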
>
> > +		.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
> > +		.nlh.nlmsg_seq = nl_seq++,
> > +		.ifm.ifi_family = AF_UNSPEC,
> > +		.ifm.ifi_index = ifi,
> > +		.ifm.ifi_flags = IFF_UP,
> > +		.ifm.ifi_change = IFF_UP,
> > +		.rta.rta_type = IFLA_MTU,
> > +		.rta.rta_len = RTA_LENGTH(sizeof(unsigned int)),
> > +		.mtu = mtu,
> > +	};
> > +	char buf[NLBUFSIZ];
> > +
> > +	if (!mtu)
> > +		/* Shorten request to drop MTU attribute */
> > +		req.nlh.nlmsg_len = offsetof(struct req_t, rta);
>
> Pre-existing issue I see now: we should probably use NLMSG_LENGTH()
> here, in any case.

Well.. if NLMSG_LENGTH() really is different here, we're (by
definition) including some of req.rta in the message, which isn't our
intention. So.. if we trust the rta member to be aligned properly for
the case where we *do* include it, can't we also trust it for the case
where we don't?

> > +
> > +	nl_req(ns, buf, &req, req.nlh.nlmsg_len);
> > +}
> > diff --git a/netlink.h b/netlink.h
> > index cd0e666..980ac44 100644
> > --- a/netlink.h
> > +++ b/netlink.h
> > @@ -18,6 +18,8 @@ void nl_route(enum nl_op op, unsigned int ifi, unsigned int ifi_ns,
> >  	      sa_family_t af, void *gw);
> >  void nl_addr(enum nl_op op, unsigned int ifi, unsigned int ifi_ns,
> >  	     sa_family_t af, void *addr, int *prefix_len, void *addr_l);
> > -void nl_link(int ns, unsigned int ifi, void *mac, int up, int mtu);
> > +void nl_link_get_mac(int ns, unsigned int ifi, void *mac);
> > +void nl_link_set_mac(int ns, unsigned int ifi, void *mac);
> > +void nl_link_up(int ns, unsigned int ifi, int mtu);
> >
> >  #endif /* NETLINK_H */
> > diff --git a/pasta.c b/pasta.c
> > index 8c85546..3b5537d 100644
> > --- a/pasta.c
> > +++ b/pasta.c
> > @@ -272,13 +272,19 @@ void pasta_start_ns(struct ctx *c, uid_t uid, gid_t gid,
> >   */
> >  void pasta_ns_conf(struct ctx *c)
> >  {
> > -	nl_link(1, 1 /* lo */, MAC_ZERO, 1, 0);
> > +	nl_link_up(1, 1 /* lo */, 0);
> > +
> > +	/* Get or set guest MAC */
>
> I know it's called mac_guest, my bad, but what about "MAC address in
> the target namespace"?

Good idea, changed.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--SaebAXqryUJycrHR
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmTLDDcACgkQzQJF27ox
2Gd2IhAArO4pak1hzKzi673CoT0s87/sRfoQGolO10ncCHWvTvhz0b1OS/ipsarB
IY2shYFQ6H4w/ZMjZsYLQlpWhkdqVfq/hhV5Lw400OlojNvpamHKwzr5aBEL8bPo
d+n6ccBlT3itwyuV0M3kXoF8zxip3MFLLboKpa5HD2BaPdROUyLL8pxeaWarlen4
J0Q8kOReHfG4QQeuP7fAH0bo+HtsW34auQtXjd1WFBs7tsW61MwPsVV//5XU3G7J
41/I1orXVH6eRBQdEyi9cM3149/5Goyqc/cVQWI7bWSe7m46v5MRZNErWhK6QjdU
hpZO3HoSVe0DS6ufwdcOS6GA96NwSww3tjfiXBMDcnJLhiDKLxkZE4mX95blDbgz
G0+piydFgAAvOHIN8zfRa5iBfJKx5R0w0MVidhkIlOeh4gXFp+ffJgWHGx9W3PlA
bQRa1F12jEBe86JpCdJ6WOqiXfX/HLYWN64avQ1eA2wyJKKaAQlfe4cTUFTIO/ki
ZFzeE0w2men9YrkOcQ2HIGw9TDe245Y1K936gW1oM1h6wyAjBngszn+q2icrvite
s/btFdy3E0BbRGTwqyzp1jgM+7P9YR9Pt8RLj0UzSBrhU/cWILT5Kz8+A9u+mK1H
6Sd/h+EGv7amTGT5iO4kwmcl9kEySbTCJtCGjdL5V7vhH6uloCE=
=6dXw
-----END PGP SIGNATURE-----

--SaebAXqryUJycrHR--