From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 01/17] netlink: Split up functionality if nl_link()
Date: Thu, 3 Aug 2023 12:09:16 +1000
Message-ID: <ZMsMTByU6J7ce/xx@zatzit>
In-Reply-To: <20230803004729.03ca0e36@elisabeth>
On Thu, Aug 03, 2023 at 12:47:29AM +0200, Stefano Brivio wrote:
> In the subject: s/if/of/.
>
> On Mon, 24 Jul 2023 16:09:20 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > nl_link() performs a number of functions: it can bring links up, set MAC
> > address and MTU and also retrieve the existing MAC. This makes for a small
> > number of lines of code, but high conceptual complexity: it's quite hard
> > to follow what's going on both in nl_link() itself and it's also not very
> > obvious which function its callers are intending to use.
>
> Actually I don't find nl_link() *that* bad, but for consistency with the
> next patches this definitely makes sense.
Eh.
> > Clarify this, by splitting nl_link() into nl_link_up(), nl_link_set_mac(),
> > and nl_link_get_mac(). The first brings up a link, optionally setting the
> > MTU, the others get or set the MAC address.
> >
> > This fixes an arguable bug in pasta_ns_conf(): it looks as though that was
> > intended to retrieve the guest MAC whether or not c->pasta_conf_ns is set.
> > However, it only actually does so in the !c->pasta_conf_ns case: the fact
> > that we set up==1 means we would only ever set, never get, the MAC in the
> > nl_link() call in the other path. We get away with this because the MAC
> > will quickly be discovered once we receive packets on the tap interface.
> > Still, it's neater to always get the MAC address here.
>
> Actually, the intention wasn't to always retrieve the namespaced MAC
> address: I thought I'd do that only if we don't configure the
> interface, because we want NDP and DHCP to be "ready".
Huh, ok. Still very hard to follow though, because a policy decision
of the caller is being implemented by the subtle interactions of
parameters to nl_link() itself, making the intent clear in neither
location.
> But that's not
> really relevant... I guess yes, it's more consistent if we fetch it in
> any case (as long as we don't configure it).
>
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > conf.c | 4 +-
> > netlink.c | 143 +++++++++++++++++++++++++++++++-----------------------
> > netlink.h | 4 +-
> > pasta.c | 12 +++--
> > 4 files changed, 96 insertions(+), 67 deletions(-)
> >
> > diff --git a/conf.c b/conf.c
> > index 78eaf2d..2ff9e2a 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -670,7 +670,7 @@ static unsigned int conf_ip4(unsigned int ifi,
> > memcpy(&ip4->addr_seen, &ip4->addr, sizeof(ip4->addr_seen));
> >
> > if (MAC_IS_ZERO(mac))
> > - nl_link(0, ifi, mac, 0, 0);
> > + nl_link_get_mac(0, ifi, mac);
> >
> > if (IN4_IS_ADDR_UNSPECIFIED(&ip4->addr) ||
> > MAC_IS_ZERO(mac))
> > @@ -711,7 +711,7 @@ static unsigned int conf_ip6(unsigned int ifi,
> > memcpy(&ip6->addr_ll_seen, &ip6->addr_ll, sizeof(ip6->addr_ll));
> >
> > if (MAC_IS_ZERO(mac))
> > - nl_link(0, ifi, mac, 0, 0);
> > + nl_link_get_mac(0, ifi, mac);
> >
> > if (IN6_IS_ADDR_UNSPECIFIED(&ip6->addr) ||
> > IN6_IS_ADDR_UNSPECIFIED(&ip6->addr_ll) ||
> > diff --git a/netlink.c b/netlink.c
> > index e15e23f..4b1f75e 100644
> > --- a/netlink.c
> > +++ b/netlink.c
> > @@ -486,83 +486,44 @@ next:
> > }
> >
> > /**
> > - * nl_link() - Get/set link attributes
> > + * nl_link_get_mac() - Get link MAC address
> > * @ns: Use netlink socket in namespace
> > * @ifi: Interface index
> > - * @mac: MAC address to fill, if passed as zero, to set otherwise
> > - * @up: If set, bring up the link
> > - * @mtu: If non-zero, set interface MTU
> > + * @mac: Fill with current MAC address
> > */
> > -void nl_link(int ns, unsigned int ifi, void *mac, int up, int mtu)
> > +void nl_link_get_mac(int ns, unsigned int ifi, void *mac)
> > {
> > - int change = !MAC_IS_ZERO(mac) || up || mtu;
> > struct req_t {
> > struct nlmsghdr nlh;
> > struct ifinfomsg ifm;
> > - struct rtattr rta;
> > - union {
> > - unsigned char mac[ETH_ALEN];
> > - struct {
> > - unsigned int mtu;
> > - } mtu;
> > - } set;
> > } req = {
> > - .nlh.nlmsg_type = change ? RTM_NEWLINK : RTM_GETLINK,
> > - .nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
> > - .nlh.nlmsg_flags = NLM_F_REQUEST | (change ? NLM_F_ACK : 0),
> > + .nlh.nlmsg_type = RTM_GETLINK,
> > + .nlh.nlmsg_len = sizeof(req),
>
> I don't think there's a practical issue with this, but there were two
> reasons why I used NLMSG_LENGTH(sizeof(struct ifinfomsg)) instead:
>
> - NLMSG_LENGTH() aligns to 4 bytes, not to whatever
> architecture-dependent alignment we might have: the message might
> actually be smaller
Oof... so. On the one hand, I see the issue; if sizeof(req) and the
NLMSG_LENGTH() value are different, I'm not sure what the effect will
be. On the other hand, if we use NLMSG_LENGTH and it *is* longer than
the structure size, we'll be saying that this message is longer than
the datagram containing it. I'm not sure what the effect of that will
be either.
Not really sure what to do about this.
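One way to at least see whether the two lengths can differ on a given
build is a quick sketch like the one below. The struct name get_req and
both helper names are made up for illustration; they are not in the
patch. Note that NLMSG_LENGTH() adds the aligned header length to the
payload length it is given, so the comparison comes down to whether the
compiler pads the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical stand-in for the request struct in nl_link_get_mac() */
struct get_req {
	struct nlmsghdr nlh;
	struct ifinfomsg ifm;
};

/* The netlink header itself should need no padding to NLMSG_ALIGNTO */
static_assert(sizeof(struct nlmsghdr) % NLMSG_ALIGNTO == 0,
	      "nlmsghdr is not a multiple of NLMSG_ALIGNTO");

/* Length as computed in the patch */
static size_t get_req_len_sizeof(void)
{
	return sizeof(struct get_req);
}

/* Length as computed by the original nl_link() */
static size_t get_req_len_macro(void)
{
	return NLMSG_LENGTH(sizeof(struct ifinfomsg));
}
```

On the usual Linux ABIs I'd expect both helpers to return the same
value, since every member here is a multiple of 4 bytes, but that is an
assumption about the ABI rather than something the headers guarantee.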
> - I see that this works with gcc and clang, but, strictly
> speaking, is the size of the struct known "before"
> (sequence-point-wise) we're done initialising it? I have a very vague
> memory of this not working with gcc 2.9 or suchlike -- which is not a
> problem, as long as our new friend C11 actually supports this (but
> I'm not entirely sure).
I'm pretty sure it's ok, regardless of C11 state. It's not really a
question of sequence points: those are about the ordering of run time
operations. Even though the structure is being defined inline,
determining its size and layout will still happen at compile time,
whereas the initialization is obviously a runtime event.
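To illustrate the point, here is a minimal sketch using a made-up
header type (struct hdr and self_sized_len() are hypothetical, not from
passt). The identifier is in scope as soon as its declarator is
complete, so it can be named in its own initializer, and sizeof on a
non-variable-length type does not evaluate its operand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header carrying its own total length */
struct hdr {
	unsigned int len;
	unsigned int seq;
};

/* Build a request whose length field is sizeof() the request itself */
static unsigned int self_sized_len(void)
{
	struct {
		struct hdr h;
		unsigned char payload[6];
	} req = {
		/* The type is complete here: sizeof is a compile time constant */
		.h.len = sizeof(req),
		.h.seq = 1,
	};

	/* The stored length covers at least header plus payload */
	assert(req.h.len >= sizeof(req.h) + sizeof(req.payload));

	return req.h.len;
}
```

The returned value includes any trailing padding the compiler adds,
which is the same effect being discussed for nlmsg_len above.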
> Then, in 9/17, NLMSG_LENGTH() could be conveniently used by nl_req().
>
> > + .nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
> > .nlh.nlmsg_seq = nl_seq++,
> > .ifm.ifi_family = AF_UNSPEC,
> > .ifm.ifi_index = ifi,
> > - .ifm.ifi_flags = up ? IFF_UP : 0,
> > - .ifm.ifi_change = up ? IFF_UP : 0,
> > };
> > - struct ifinfomsg *ifm;
> > struct nlmsghdr *nh;
> > - struct rtattr *rta;
> > char buf[NLBUFSIZ];
> > ssize_t n;
> > - size_t na;
> > -
> > - if (!MAC_IS_ZERO(mac)) {
> > - req.nlh.nlmsg_len = sizeof(req);
> > - memcpy(req.set.mac, mac, ETH_ALEN);
> > - req.rta.rta_type = IFLA_ADDRESS;
> > - req.rta.rta_len = RTA_LENGTH(ETH_ALEN);
> > - if (nl_req(ns, buf, &req, req.nlh.nlmsg_len) < 0)
> > - return;
> > -
> > - up = 0;
> > - }
> > -
> > - if (mtu) {
> > - req.nlh.nlmsg_len = offsetof(struct req_t, set.mtu)
> > - + sizeof(req.set.mtu);
> > - req.set.mtu.mtu = mtu;
> > - req.rta.rta_type = IFLA_MTU;
> > - req.rta.rta_len = RTA_LENGTH(sizeof(unsigned int));
> > - if (nl_req(ns, buf, &req, req.nlh.nlmsg_len) < 0)
> > - return;
> > -
> > - up = 0;
> > - }
> > -
> > - if (up && nl_req(ns, buf, &req, req.nlh.nlmsg_len) < 0)
> > - return;
> > -
> > - if (change)
> > - return;
> >
> > - if ((n = nl_req(ns, buf, &req, req.nlh.nlmsg_len)) < 0)
> > + n = nl_req(ns, buf, &req, sizeof(req));
> > + if (n < 0)
> > return;
> > +
> > + for (nh = (struct nlmsghdr *)buf;
> > + NLMSG_OK(nh, n) && nh->nlmsg_type != NLMSG_DONE;
> > + nh = NLMSG_NEXT(nh, n)) {
> > + struct ifinfomsg *ifm = (struct ifinfomsg *)NLMSG_DATA(nh);
> > + struct rtattr *rta;
> > + size_t na;
> >
> > - nh = (struct nlmsghdr *)buf;
> > - for ( ; NLMSG_OK(nh, n); nh = NLMSG_NEXT(nh, n)) {
> > if (nh->nlmsg_type != RTM_NEWLINK)
> > - goto next;
> > -
> > - ifm = (struct ifinfomsg *)NLMSG_DATA(nh);
> > + continue;
> >
> > - for (rta = IFLA_RTA(ifm), na = RTM_PAYLOAD(nh); RTA_OK(rta, na);
> > + for (rta = IFLA_RTA(ifm), na = RTM_PAYLOAD(nh);
> > + RTA_OK(rta, na);
> > rta = RTA_NEXT(rta, na)) {
> > if (rta->rta_type != IFLA_ADDRESS)
> > continue;
> > @@ -570,8 +531,70 @@ void nl_link(int ns, unsigned int ifi, void *mac, int up, int mtu)
> > memcpy(mac, RTA_DATA(rta), ETH_ALEN);
> > break;
> > }
> > -next:
> > - if (nh->nlmsg_type == NLMSG_DONE)
> > - break;
> > }
> > }
> > +
> > +/**
> > + * nl_link_set_mac() - Set link MAC address
> > + * @ns: Use netlink socket in namespace
> > + * @ifi: Interface index
> > + * @mac: MAC address to set
> > + */
> > +void nl_link_set_mac(int ns, unsigned int ifi, void *mac)
> > +{
> > + struct req_t {
> > + struct nlmsghdr nlh;
> > + struct ifinfomsg ifm;
> > + struct rtattr rta;
> > + unsigned char mac[ETH_ALEN];
> > + } req = {
> > + .nlh.nlmsg_type = RTM_NEWLINK,
> > + .nlh.nlmsg_len = sizeof(req),
>
> Same here.
>
> > + .nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
> > + .nlh.nlmsg_seq = nl_seq++,
> > + .ifm.ifi_family = AF_UNSPEC,
> > + .ifm.ifi_index = ifi,
> > + .rta.rta_type = IFLA_ADDRESS,
> > + .rta.rta_len = RTA_LENGTH(ETH_ALEN),
> > + };
> > + char buf[NLBUFSIZ];
> > +
> > + memcpy(req.mac, mac, ETH_ALEN);
> > +
> > + nl_req(ns, buf, &req, sizeof(req));
> > +}
> > +
> > +/**
> > + * nl_link_up() - Bring link up
> > + * @ns: Use netlink socket in namespace
> > + * @ifi: Interface index
> > + * @mtu: If non-zero, set interface MTU
> > + */
> > +void nl_link_up(int ns, unsigned int ifi, int mtu)
> > +{
> > + struct req_t {
> > + struct nlmsghdr nlh;
> > + struct ifinfomsg ifm;
> > + struct rtattr rta;
> > + unsigned int mtu;
> > + } req = {
> > + .nlh.nlmsg_type = RTM_NEWLINK,
> > + .nlh.nlmsg_len = sizeof(req),
>
> And here.
>
> > + .nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
> > + .nlh.nlmsg_seq = nl_seq++,
> > + .ifm.ifi_family = AF_UNSPEC,
> > + .ifm.ifi_index = ifi,
> > + .ifm.ifi_flags = IFF_UP,
> > + .ifm.ifi_change = IFF_UP,
> > + .rta.rta_type = IFLA_MTU,
> > + .rta.rta_len = RTA_LENGTH(sizeof(unsigned int)),
> > + .mtu = mtu,
> > + };
> > + char buf[NLBUFSIZ];
> > +
> > + if (!mtu)
> > + /* Shorten request to drop MTU attribute */
> > + req.nlh.nlmsg_len = offsetof(struct req_t, rta);
>
> Pre-existing issue I see now: we should probably use NLMSG_LENGTH()
> here, in any case.
Well.. if NLMSG_LENGTH() really is different here, we're (by
definition) including some of req.rta in the message, which isn't our
intention. So.. if we trust the rta member to be aligned properly for
the case where we *do* include it, can't we also trust it for the case
where we don't?
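Something like the following sketch could be used to check that
assumption. The struct name up_req and the helper names are
hypothetical; the layout mirrors the request in nl_link_up().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical stand-in for the request struct in nl_link_up() */
struct up_req {
	struct nlmsghdr nlh;
	struct ifinfomsg ifm;
	struct rtattr rta;
	unsigned int mtu;
};

/* The attribute must start on a netlink alignment boundary */
static_assert(offsetof(struct up_req, rta) % NLMSG_ALIGNTO == 0,
	      "rta member is not aligned to NLMSG_ALIGNTO");

/* Lengths as the patch computes them, from the struct layout */
static size_t up_len_no_mtu(void)
{
	return offsetof(struct up_req, rta);
}

static size_t up_len_with_mtu(void)
{
	return sizeof(struct up_req);
}

/* The same lengths built from the netlink macros */
static size_t up_len_no_mtu_macro(void)
{
	return NLMSG_LENGTH(sizeof(struct ifinfomsg));
}

static size_t up_len_with_mtu_macro(void)
{
	return NLMSG_LENGTH(sizeof(struct ifinfomsg)) +
	       RTA_LENGTH(sizeof(unsigned int));
}
```

If the layout-based and macro-based lengths agree in both cases, the
two approaches are interchangeable for this struct; I'd expect that on
common ABIs, but I haven't checked anything unusual.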
> > +
> > + nl_req(ns, buf, &req, req.nlh.nlmsg_len);
> > +}
> > diff --git a/netlink.h b/netlink.h
> > index cd0e666..980ac44 100644
> > --- a/netlink.h
> > +++ b/netlink.h
> > @@ -18,6 +18,8 @@ void nl_route(enum nl_op op, unsigned int ifi, unsigned int ifi_ns,
> > sa_family_t af, void *gw);
> > void nl_addr(enum nl_op op, unsigned int ifi, unsigned int ifi_ns,
> > sa_family_t af, void *addr, int *prefix_len, void *addr_l);
> > -void nl_link(int ns, unsigned int ifi, void *mac, int up, int mtu);
> > +void nl_link_get_mac(int ns, unsigned int ifi, void *mac);
> > +void nl_link_set_mac(int ns, unsigned int ifi, void *mac);
> > +void nl_link_up(int ns, unsigned int ifi, int mtu);
> >
> > #endif /* NETLINK_H */
> > diff --git a/pasta.c b/pasta.c
> > index 8c85546..3b5537d 100644
> > --- a/pasta.c
> > +++ b/pasta.c
> > @@ -272,13 +272,19 @@ void pasta_start_ns(struct ctx *c, uid_t uid, gid_t gid,
> > */
> > void pasta_ns_conf(struct ctx *c)
> > {
> > - nl_link(1, 1 /* lo */, MAC_ZERO, 1, 0);
> > + nl_link_up(1, 1 /* lo */, 0);
> > +
> > + /* Get or set guest MAC */
>
> I know it's called mac_guest, my bad, but what about "MAC address in
> the target namespace"?
Good idea, changed.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson