Date: Thu, 18 Dec 2025 15:44:31 +1100
From: David Gibson
To: Jon Maloy
Subject: Re: [RFC 09/12] netlink: Add host-side monitoring for late template interface binding
References: <20251215015441.887736-1-jmaloy@redhat.com> <20251215015441.887736-10-jmaloy@redhat.com>
In-Reply-To: <20251215015441.887736-10-jmaloy@redhat.com>
CC: sbrivio@redhat.com, dgibson@redhat.com, passt-dev@passt.top

On Sun, Dec 14, 2025 at 08:54:38PM -0500, Jon Maloy wrote:
> When pasta starts without an active template interface (e.g., WiFi
> not yet connected), it falls back to local mode. This change adds
> support for late binding: when the template interface gets an address
> later, pasta detects this via a host-side netlink socket and
> propagates the configuration to the namespace.
>
> Late binding occurs when:
> - A specific interface is given via -I and it later gets an address

"-i"?  -I is the guest side interface name.

> - No interface is specified, and any interface gets an address.
>   In the latter case the first discovered interface is adopted as
>   template.
>
> The key changes we make in this commit are:
> - We add a host-side netlink socket (nl_sock_linkaddr_host) to
>   monitor link and address changes on the template interface.
> - We add a nl_linkaddr_host_handler() to process these events
>   and propagate addresses to the namespace.
> - We add support for late binding: when ifi4/ifi6 are unset, we
>   adopt the interface that receives an address - either the one
>   specified via -I, or the first one to get an address if -I was
>   not given.
> - We bring the interface UP after first address is added via late
>   binding.
> - We retain CAP_NET_ADMIN in isolate_prefork() for pasta mode to allow
>   dynamic interface configuration after sandboxing.
>
> Signed-off-by: Jon Maloy
> ---
>  epoll_type.h |   4 +-
>  isolation.c  |   4 +
>  netlink.c    | 314 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  netlink.h    |   3 +
>  passt.c      |   4 +
>  5 files changed, 328 insertions(+), 1 deletion(-)
>
> diff --git a/epoll_type.h b/epoll_type.h
> index 0a16d94..8dc6b8a 100644
> --- a/epoll_type.h
> +++ b/epoll_type.h
> @@ -46,8 +46,10 @@ enum epoll_type {
>  	EPOLL_TYPE_REPAIR,
>  	/* Netlink neighbour subscription socket */
>  	EPOLL_TYPE_NL_NEIGH,
> -	/* Netlink link/address subscription socket */
> +	/* Netlink link/address subscription socket (namespace) */
>  	EPOLL_TYPE_NL_LINKADDR,
> +	/* Netlink link/address subscription socket (host, for template) */
> +	EPOLL_TYPE_NL_LINKADDR_HOST,
>
>  	EPOLL_NUM_TYPES,
>  };
> diff --git a/isolation.c b/isolation.c
> index b25f349..633c396 100644
> --- a/isolation.c
> +++ b/isolation.c
> @@ -356,6 +356,10 @@ int isolate_prefork(const struct ctx *c)
>  	if (c->mode == MODE_PASTA) {
>  		/* Keep CAP_SYS_ADMIN, so we can enter the netns */
>  		ns_caps |= BIT(CAP_SYS_ADMIN);
> +		/* Keep CAP_NET_ADMIN for dynamic interface configuration
> +		 * (late binding when template interface comes up after start)
> +		 */
> +		ns_caps |= BIT(CAP_NET_ADMIN);
>  		/* Keep CAP_NET_BIND_SERVICE, so we can splice
>  		 * outbound connections to low port numbers
>  		 */
> diff --git a/netlink.c b/netlink.c
> index a8d3116..583ada8 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -41,6 +41,9 @@
>  #include "netlink.h"
>  #include "epoll_ctl.h"
>
> +/* Default namespace interface name from conf.c */
> +extern const char *pasta_default_ifn;
> +
>  /* Same as RTA_NEXT() but for nexthops: RTNH_NEXT() doesn't take 'attrlen' */
>  #define RTNH_NEXT_AND_DEC(rtnh, attrlen)				\
>  	((attrlen) -= RTNH_ALIGN((rtnh)->rtnh_len), RTNH_NEXT(rtnh))
> @@ -63,6 +66,7 @@ int nl_sock = -1;
>  int nl_sock_ns = -1;
>  static int nl_sock_neigh = -1;
>  static int nl_sock_linkaddr = -1;
> +static int nl_sock_linkaddr_host = -1;
>  static int
nl_seq = 1;
>
>  /**
> @@ -249,6 +253,175 @@ static bool nl_addr6_del(struct ctx *c, const struct in6_addr *addr)
>  	return true;
>  }
>
> +/**
> + * nl_linkaddr_host_msg_read() - Handle host-side link/addr changes
> + * @c:		Execution context
> + * @nh:	Netlink message header
> + *
> + * Monitor template interface changes and propagate to namespace.
> + * Supports late binding: if no template was detected at startup,
> + * adopt the interface specified by -I when it gets an address.
> + */
> +static void nl_linkaddr_host_msg_read(struct ctx *c, const struct nlmsghdr *nh)
> +{
> +	if (nh->nlmsg_type == NLMSG_DONE || nh->nlmsg_type == NLMSG_ERROR)
> +		return;
> +
> +	if (nh->nlmsg_type == RTM_NEWADDR || nh->nlmsg_type == RTM_DELADDR) {
> +		bool is_new = (nh->nlmsg_type == RTM_NEWADDR);
> +		const struct ifaddrmsg *ifa = NLMSG_DATA(nh);
> +		struct rtattr *rta = IFA_RTA(ifa);
> +		size_t na = IFA_PAYLOAD(nh);
> +		bool late_binding = false;
> +		unsigned int template_ifi;
> +		char ifname[IFNAMSIZ];
> +		void *addr = NULL;
> +		bool is_default;
> +		bool is_match;
> +		bool unbound;
> +
> +		/* Get interface name for this message */
> +		if (!if_indextoname(ifa->ifa_index, ifname))
> +			snprintf(ifname, sizeof(ifname), "?");
> +
> +		/* Get template interface index, handling late binding.
> +		 * Late binding occurs when ifi4/ifi6 <= 0 (local mode) and either:
> +		 * - pasta_ifn is set and matches this interface, or
> +		 * - pasta_ifn contains the default name

I think this is trying to infer that we're in local mode from the
interface name.  That seems roundabout and fragile to me, better to
add a flag to struct ctx if we need it.  (If nothing else, using -I
would break this, wouldn't it?).

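To make the flag idea concrete, here is a rough, untested sketch.
Everything in it is made up for illustration: struct ctx_sketch stands in
for struct ctx, and the 'local_mode' and 'host_ifn' fields are
assumptions, they don't exist in the tree or in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define IFN_LEN 16	/* Same value as IFNAMSIZ on Linux */

/* Hypothetical, cut-down stand-in for struct ctx */
struct ctx_sketch {
	bool local_mode;	/* Assumed flag: set at startup, no template found */
	unsigned int ifi4;	/* IPv4 template interface index, 0 if unbound */
	unsigned int ifi6;	/* IPv6 template interface index, 0 if unbound */
	char host_ifn[IFN_LEN];	/* Assumed: requested host interface, or "" */
};

/* Can host interface @ifname be adopted as late template for this family? */
static bool late_bind_ok(const struct ctx_sketch *c, const char *ifname,
			 bool v4)
{
	unsigned int bound = v4 ? c->ifi4 : c->ifi6;

	assert(c && ifname);

	/* Only relevant if we started in local mode and are still unbound */
	if (!c->local_mode || bound)
		return false;

	/* No host interface requested: the first one with an address wins */
	if (!c->host_ifn[0])
		return true;

	return !strcmp(c->host_ifn, ifname);
}
```

With something like this, the decision no longer depends on whatever the
guest-side name given with -I happens to be.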
> +		 */
> +		if (ifa->ifa_family == AF_INET)
> +			template_ifi = c->ifi4;
> +		else if (ifa->ifa_family == AF_INET6)
> +			template_ifi = c->ifi6;
> +		else
> +			return;
> +
> +		/* Check for late binding conditions */
> +		is_default = !strcmp(c->pasta_ifn, pasta_default_ifn);
> +		is_match = !strcmp(ifname, c->pasta_ifn);
> +		unbound = (ifa->ifa_family == AF_INET) ?
> +			  (int)c->ifi4 <= 0 : (int)c->ifi6 <= 0;
> +
> +		if (unbound && (is_default || is_match)) {
> +			debug("Late binding: using %s as %s template", ifname,
> +			      ifa->ifa_family == AF_INET ? "IPv4" : "IPv6");
> +
> +			if (ifa->ifa_family == AF_INET) {
> +				c->ifi4 = ifa->ifa_index;
> +				template_ifi = c->ifi4;
> +			} else {
> +				c->ifi6 = ifa->ifa_index;
> +				template_ifi = c->ifi6;
> +			}
> +			late_binding = true;
> +
> +			if (is_default)
> +				snprintf(c->pasta_ifn, sizeof(c->pasta_ifn),
> +					 "%s", ifname);
> +		}
> +
> +		if (ifa->ifa_index != template_ifi)
> +			return;
> +
> +		/* Re-initialize rta/na for attribute parsing */
> +		rta = IFA_RTA(ifa);
> +		na = IFA_PAYLOAD(nh);
> +
> +		for (; RTA_OK(rta, na); rta = RTA_NEXT(rta, na)) {
> +			if (ifa->ifa_family == AF_INET &&
> +			    rta->rta_type == IFA_LOCAL) {
> +				addr = RTA_DATA(rta);
> +				break;
> +			} else if (ifa->ifa_family == AF_INET6 &&
> +				   rta->rta_type == IFA_ADDRESS) {
> +				addr = RTA_DATA(rta);
> +				break;
> +			}
> +		}
> +
> +		if (!addr) {
> +			info("No addr found in netlink linkaddr message");

Maybe best to check for this *before* touching important state
variables like c->ifi4?

> +			return;
> +		}
> +
> +		if (ifa->ifa_family == AF_INET) {
> +			struct in_addr *a = (struct in_addr *)addr;
> +			char buf[INET_ADDRSTRLEN];
> +			int rc;
> +
> +			inet_ntop(AF_INET, a, buf, sizeof(buf));
> +
> +			if (!is_new) {
> +				nl_addr4_del(c, a);
> +				nl_addr_del(nl_sock_ns, c->pasta_ifi,
> +					    AF_INET, a, ifa->ifa_prefixlen);

We only want to actually poke the guest if c->pasta_conf_ns.

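Roughly, for the removal path, the ordering could look like the untested
sketch below. It assumes a made-up callback in place of nl_addr_del() so
that it stands alone; struct ctx_sketch, its 'naddr' counter and the
counting stub are placeholders too. Only pasta_conf_ns is a real field
name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct ctx */
struct ctx_sketch {
	bool pasta_conf_ns;	/* Are we configuring the namespace at all? */
	unsigned int naddr;	/* Placeholder for our own address bookkeeping */
};

/* Stand-in for the namespace-side call, e.g. a wrapper of nl_addr_del() */
typedef int (*ns_del_fn)(const void *addr, int prefix_len);

static int ns_del_calls;	/* Demo only: how often we touched the ns */

/* Demo stub: counts calls instead of sending a netlink request */
static int ns_del_count(const void *addr, int prefix_len)
{
	(void)addr;
	(void)prefix_len;
	ns_del_calls++;
	return 0;
}

/* Handle removal of a template address on the host side */
static int host_addr_gone(struct ctx_sketch *c, const void *addr,
			  int prefix_len, ns_del_fn ns_del)
{
	assert(c && ns_del);

	/* Nothing parsed from the message: bail out before touching state */
	if (!addr)
		return -1;

	/* Our own view of the host is always updated */
	if (c->naddr)
		c->naddr--;

	/* The guest is only reconfigured if we were asked to do that */
	if (!c->pasta_conf_ns)
		return 0;

	return ns_del(addr, prefix_len);
}
```

The same gate would apply to the add path and to the IPv6 branch.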
> +				return;
> +			}
> +			rc = nl_addr_set(nl_sock_ns, c->pasta_ifi,
> +					 AF_INET, a,
> +					 ifa->ifa_prefixlen);
> +			if (rc < 0) {
> +				debug("Failed to add %s/%u to ns: %s", buf,
> +				      ifa->ifa_prefixlen, strerror_(-rc));
> +			} else {
> +				nl_addr4_add(c, a, ifa->ifa_prefixlen);

What's nl_addr4_add() and how does it differ from nl_addr_set()?

> +				c->ip4.addr_seen = *a;

This is a host side event, so we shouldn't be updating addr_seen.

> +				debug("Added %s/%u to namespace",
> +				      buf, ifa->ifa_prefixlen);
> +
> +				/* Bring interface UP on late binding */
> +				if (late_binding && !c->pasta_ifi_up) {
> +					nl_link_set_flags(nl_sock_ns,
> +							  c->pasta_ifi,
> +							  IFF_UP, IFF_UP);
> +					c->pasta_ifi_up = 1;
> +					debug("Brought interface up");
> +				}
> +				if (late_binding || c->pasta_ifi_up)

c->pasta_ifi_up must always be true at this point, no?

> +					arp_send_init_req(c);
> +			}
> +		} else if (ifa->ifa_family == AF_INET6) {
> +			struct in6_addr *a = (struct in6_addr *)addr;
> +			char buf[INET6_ADDRSTRLEN];
> +			int rc;
> +
> +			inet_ntop(AF_INET6, a, buf, sizeof(buf));
> +
> +			if (!is_new) {
> +				nl_addr6_del(c, a);
> +				nl_addr_del(nl_sock_ns, c->pasta_ifi,
> +					    AF_INET6, a, ifa->ifa_prefixlen);
> +				return;
> +			}
> +			rc = nl_addr_set(nl_sock_ns, c->pasta_ifi,
> +					 AF_INET6, a, ifa->ifa_prefixlen);
> +			if (rc < 0) {
> +				debug("Failed to add %s/%u to ns: %s",
> +				      buf, ifa->ifa_prefixlen,
> +				      strerror_(-rc));
> +			} else {
> +				nl_addr6_add(c, a, ifa->ifa_prefixlen);
> +				c->ip6.addr_seen = *a;
> +				debug("Added %s/%u to namespace",
> +				      buf, ifa->ifa_prefixlen);
> +
> +				/* Bring interface UP on late binding */
> +				if (late_binding && !c->pasta_ifi_up) {
> +					nl_link_set_flags(nl_sock_ns,
> +							  c->pasta_ifi,
> +							  IFF_UP, IFF_UP);
> +					c->pasta_ifi_up = 1;
> +					debug("Brought interface up");
> +				}
> +				if ((late_binding || c->pasta_ifi_up) &&
> +				    !c->no_ndp)
> +					ndp_send_init_req(c);
> +			}
> +		}
> +		return;
> +	}
> +}
> +
>  /**
>   * nl_linkaddr_msg_read() - Parse and log a netlink link/addr message
>   * @c:
Execution context
> @@ -432,6 +605,36 @@ void nl_linkaddr_notify_handler(struct ctx *c)
>  	}
>  }
>
> +/**
> + * nl_linkaddr_host_handler() - Handle events from host link/addr notifier
> + * @c:		Execution context
> + *
> + * Monitor template interface changes and propagate to namespace
> + */
> +void nl_linkaddr_host_handler(struct ctx *c)
> +{
> +	char buf[NLBUFSIZ];
> +
> +	for (;;) {
> +		ssize_t n = recv(nl_sock_linkaddr_host, buf, sizeof(buf),
> +				 MSG_DONTWAIT);
> +		struct nlmsghdr *nh = (struct nlmsghdr *)buf;
> +
> +		if (n < 0) {
> +			if (errno == EINTR)
> +				continue;
> +			if (errno != EAGAIN)
> +				info("Host recv() error: %s", strerror_(errno));
> +			break;
> +		}
> +
> +		info("Host netlink: received %zd bytes", n);
> +
> +		for (; NLMSG_OK(nh, n); nh = NLMSG_NEXT(nh, n))
> +			nl_linkaddr_host_msg_read(c, nh);
> +	}
> +}
> +
>  /**
>   * nl_linkaddr_init_do() - Actually create and bind the netlink socket
>   * @arg:	Execution context (for namespace entry) or NULL
> @@ -464,6 +667,38 @@ static int nl_linkaddr_init_do(void *arg)
>  	return 0;
>  }
>
> +/**
> + * nl_linkaddr_host_init_do() - Create host-side link/addr notifier socket
> + * @arg:	Unused
> + *
> + * Return: 0 on success, -1 on failure
> + */
> +static int nl_linkaddr_host_init_do(void *arg)

Why the void *?  This is host side, so you don't need an NS_CALL().

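For instance, as an untested sketch: the function name, the 'groups'
parameter and the choice to return the fd (rather than set the static
directly) are all assumptions for illustration, not anything in the
tree. The socket(), bind() and RTMGRP_* usage is the same as in the patch.

```c
#include <assert.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Create a host-side rtnetlink notifier socket, no namespace entry needed.
 * Returns the socket on success, -1 on failure.
 */
static int nl_host_notifier_sock(unsigned int groups)
{
	struct sockaddr_nl addr = {
		.nl_family = AF_NETLINK,
		.nl_groups = groups,
	};
	int s;

	assert(groups);	/* Subscribing to nothing would be a caller bug */

	s = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
	if (s < 0)
		return -1;

	if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(s);
		return -1;
	}

	return s;
}
```

The caller would pass RTMGRP_LINK | RTMGRP_IPV4_IFADDR |
RTMGRP_IPV6_IFADDR, check the return value, and only then store the fd
and add it to the epoll set.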
> +{
> +	struct sockaddr_nl addr = { .nl_family = AF_NETLINK,
> +		.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR };
> +
> +	(void)arg;
> +
> +	nl_sock_linkaddr_host = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC,
> +				       NETLINK_ROUTE);
> +	if (nl_sock_linkaddr_host < 0) {
> +		debug("socket() failed for host: %s", strerror_(errno));
> +		return -1;
> +	}
> +
> +	if (bind(nl_sock_linkaddr_host, (struct sockaddr *)&addr,
> +		 sizeof(addr)) < 0) {
> +		debug("bind() failed for host: %s", strerror_(errno));
> +		close(nl_sock_linkaddr_host);
> +		nl_sock_linkaddr_host = -1;
> +		return -1;
> +	}
> +
> +	debug("host socket fd=%d", nl_sock_linkaddr_host);
> +	return 0;
> +}
> +
>  /**
>   * nl_linkaddr_notify_init() - Initialize link/address change notifier
>   * @c:		Execution context
> @@ -502,6 +737,33 @@ int nl_linkaddr_notify_init(const struct ctx *c)
>  		return -1;
>  	}
>
> +	debug("namespace socket fd=%d", nl_sock_linkaddr);

Looks like it belongs in an earlier patch.  Plus "namespace socket fd"
isn't very specific.

> +
> +	/* In PASTA mode, also create a host-side socket to monitor
> +	 * template interface changes
> +	 */
> +	if (c->mode == MODE_PASTA) {
> +		nl_linkaddr_host_init_do(NULL);
> +
> +		if (nl_sock_linkaddr_host < 0) {
> +			warn("Failed to create host link/addr notifier socket");
> +			/* Non-fatal - continue without host monitoring */
> +		} else {
> +			ref.type = EPOLL_TYPE_NL_LINKADDR_HOST;
> +			ev.data.u64 = ref.u64;
> +			if (epoll_ctl(c->epollfd, EPOLL_CTL_ADD,
> +				      nl_sock_linkaddr_host, &ev) == -1) {
> +				warn("epoll_ctl() failed on host notifier: %s",
> +				     strerror_(errno));
> +				close(nl_sock_linkaddr_host);
> +				nl_sock_linkaddr_host = -1;
> +			} else {
> +				info("Host netlink socket fd=%d, pasta_ifn=%s",
> +				     nl_sock_linkaddr_host, c->pasta_ifn);
> +			}
> +		}
> +	}
> +
>  	return 0;
>  }
>  /**
> @@ -1340,6 +1602,58 @@ int nl_addr_set(int s, unsigned int ifi, sa_family_t af,
>  	return nl_do(s, &req, RTM_NEWADDR, NLM_F_CREATE | NLM_F_EXCL, len);
>  }
>
> +/**
> + * nl_addr_del() - Delete IP address from given interface
> + * @s:		Netlink socket
> + * @ifi:	Interface index
> + * @af:		Address family
> + * @addr:	Address to delete
> + * @prefix_len:	Prefix length
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int nl_addr_del(int s, unsigned int ifi, sa_family_t af,
> +		const void *addr, int prefix_len)
> +{
> +	struct req_t {
> +		struct nlmsghdr nlh;
> +		struct ifaddrmsg ifa;
> +		union {
> +			struct {
> +				struct rtattr rta_l;
> +				struct in_addr l;
> +			} a4;
> +			struct {
> +				struct rtattr rta_l;
> +				struct in6_addr l;
> +			} a6;
> +		} del;
> +	} req = {
> +		.ifa.ifa_family    = af,
> +		.ifa.ifa_index     = ifi,
> +		.ifa.ifa_prefixlen = prefix_len,
> +	};
> +	ssize_t len;
> +
> +	if (af == AF_INET6) {
> +		size_t rta_len = RTA_LENGTH(sizeof(req.del.a6.l));
> +
> +		len = offsetof(struct req_t, del.a6) + sizeof(req.del.a6);
> +		memcpy(&req.del.a6.l, addr, sizeof(req.del.a6.l));
> +		req.del.a6.rta_l.rta_len = rta_len;
> +
req.del.a6.rta_l.rta_type = IFA_LOCAL;
> +	} else {
> +		size_t rta_len = RTA_LENGTH(sizeof(req.del.a4.l));
> +
> +		len = offsetof(struct req_t, del.a4) + sizeof(req.del.a4);
> +		memcpy(&req.del.a4.l, addr, sizeof(req.del.a4.l));
> +		req.del.a4.rta_l.rta_len = rta_len;
> +		req.del.a4.rta_l.rta_type = IFA_LOCAL;
> +	}
> +
> +	return nl_do(s, &req, RTM_DELADDR, 0, len);
> +}
> +
>  /**
>   * nl_addr_dup() - Copy IP addresses for given interface and address family
>   * @s_src:	Netlink socket in source network namespace
> diff --git a/netlink.h b/netlink.h
> index 1796a72..f65ae10 100644
> --- a/netlink.h
> +++ b/netlink.h
> @@ -35,5 +35,8 @@ void nl_neigh_notify_handler(const struct ctx *c);
>
>  int nl_linkaddr_notify_init(const struct ctx *c);
>  void nl_linkaddr_notify_handler(struct ctx *c);
> +void nl_linkaddr_host_handler(struct ctx *c);
> +int nl_addr_del(int s, unsigned int ifi, sa_family_t af,
> +		const void *addr, int prefix_len);
>
>  #endif /* NETLINK_H */
> diff --git a/passt.c b/passt.c
> index f274858..438dac8 100644
> --- a/passt.c
> +++ b/passt.c
> @@ -81,6 +81,7 @@ char *epoll_type_str[] = {
>  	[EPOLL_TYPE_REPAIR]		= "TCP_REPAIR helper socket",
>  	[EPOLL_TYPE_NL_NEIGH]		= "netlink neighbour notifier socket",
>  	[EPOLL_TYPE_NL_LINKADDR]	= "netlink link/address notifier socket",
> +	[EPOLL_TYPE_NL_LINKADDR_HOST]	= "netlink host link/address notifier socket",
>  };
>  static_assert(ARRAY_SIZE(epoll_type_str) == EPOLL_NUM_TYPES,
>  	      "epoll_type_str[] doesn't match enum epoll_type");
> @@ -308,6 +309,9 @@ static void passt_worker(void *opaque, int nfds, struct epoll_event *events)
>  		case EPOLL_TYPE_NL_LINKADDR:
>  			nl_linkaddr_notify_handler(c);
>  			break;
> +		case EPOLL_TYPE_NL_LINKADDR_HOST:
> +			nl_linkaddr_host_handler(c);
> +			break;
>  		default:
>  			/* Can't happen */
>  			ASSERT(0);
> --
> 2.51.1
>

--
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the
other way
				| around.
http://www.ozlabs.org/~dgibson