From: David Gibson <david@gibson.dropbear.id.au>
To: Jon Maloy <jmaloy@redhat.com>
Cc: sbrivio@redhat.com, dgibson@redhat.com, passt-dev@passt.top
Subject: Re: [RFC 10/12] netlink: Add host-side route monitoring and propagation
Date: Thu, 18 Dec 2025 15:53:56 +1100
Message-ID: <aUOI5Ou25bj3FuD_@zatzit>
In-Reply-To: <20251215015441.887736-11-jmaloy@redhat.com>
On Sun, Dec 14, 2025 at 08:54:39PM -0500, Jon Maloy wrote:
> We extend host-side netlink monitoring to also track default route
> changes on the template interface and propagate them to the namespace.
>
> - Subscribe to RTMGRP_IPV4_ROUTE and RTMGRP_IPV6_ROUTE groups on the
> host-side netlink socket
> - Handle RTM_NEWROUTE/RTM_DELROUTE events for default routes.
> - Support late binding via routes: if no template interface is bound
> yet, adopt the interface in question when a default route appears
> on it.
> - When a default route is added, set guest_gw/our_tap_addr and
> propagate the route to the namespace via nl_route_set_def()
> - When a default route is removed, clear guest_gw/our_tap_addr
>
> Signed-off-by: Jon Maloy <jmaloy@redhat.com>
> ---
> netlink.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 97 insertions(+), 3 deletions(-)
>
> diff --git a/netlink.c b/netlink.c
> index 583ada8..d049239 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -199,7 +199,7 @@ static bool nl_addr6_add(struct ctx *c, const struct in6_addr *addr,
> idx = c->ip6.addr_count++;
> c->ip6.addrs[idx].addr = *addr;
> c->ip6.addrs[idx].prefix_len = prefix_len;
> - c->ip6.addrs[idxyes].permanent = 0;
> + c->ip6.addrs[idx].permanent = 0;
> return true;
> }
>
> @@ -254,7 +254,7 @@ static bool nl_addr6_del(struct ctx *c, const struct in6_addr *addr)
> }
>
> /**
> - * nl_linkaddr_host_msg_read() - Handle host-side link/addr changes
> + * nl_linkaddr_host_msg_read() - Handle host-side link/addr/route changes
> * @c: Execution context
> * @nh: Netlink message header
> *
> @@ -420,6 +420,99 @@ static void nl_linkaddr_host_msg_read(struct ctx *c, const struct nlmsghdr *nh)
> }
> return;
> }
> +
> + if (nh->nlmsg_type == RTM_NEWROUTE || nh->nlmsg_type == RTM_DELROUTE) {
There's enough in this block that it's probably worth splitting out into a function.
> + bool is_new = (nh->nlmsg_type == RTM_NEWROUTE);
> + const struct rtmsg *rtm = NLMSG_DATA(nh);
> + struct rtattr *rta = RTM_RTA(rtm);
> + size_t na = RTM_PAYLOAD(nh);
> + unsigned int template_ifi;
> + char ifname[IFNAMSIZ];
> + unsigned int oif = 0;
> + void *gw = NULL;
> + bool is_default;
> + bool is_match;
> + bool unbound;
> +
> + /* Only interested in default routes */
I'm not convinced this is enough. Just as we have to copy non-default
routes in nl_route_dup(), I think we're going to need to keep them
updated here. Speaking of which, it's ugly to have nl_route_dup() for
the initial route copy, then an entirely different path for subsequent
updates. Similar to the neighbour table, I think it should be
possible to unify these by setting up the handler, then forcing an
enumeration of the existing routes.
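A minimal sketch of what forcing that enumeration could look like,
assuming a plain rtnetlink dump request: RTM_GETROUTE with NLM_F_DUMP
makes the kernel answer with one RTM_NEWROUTE message per existing
route, followed by NLMSG_DONE, so a single handler could process both
the initial state and later updates. struct route_dump_req and
route_dump_req_init() are hypothetical names, not existing passt code.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: netlink header plus route message,
 * with no attributes attached
 */
struct route_dump_req {
	struct nlmsghdr nlh;
	struct rtmsg rtm;
};

/* Fill @req so that the kernel replies with one RTM_NEWROUTE message
 * per existing route of address family @af
 */
void route_dump_req_init(struct route_dump_req *req, unsigned char af,
			 uint32_t seq)
{
	memset(req, 0, sizeof(*req));

	req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req->rtm));
	req->nlh.nlmsg_type = RTM_GETROUTE;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req->nlh.nlmsg_seq = seq;

	req->rtm.rtm_family = af;
}
```

The request would be sent once per address family after the handler is
in place, with the replies fed to the same RTM_NEWROUTE code path that
handles live notifications.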
> + if (rtm->rtm_dst_len != 0)
> + return;
> +
> + for (; RTA_OK(rta, na); rta = RTA_NEXT(rta, na)) {
> + if (rta->rta_type == RTA_GATEWAY)
> + gw = RTA_DATA(rta);
> + else if (rta->rta_type == RTA_OIF)
> + oif = *(unsigned int *)RTA_DATA(rta);
> + }
> +
> + if (!gw || !oif)
> + return;
> +
> + /* Get interface name for late binding check */
> + if (!if_indextoname(oif, ifname))
> + return;
> +
> + /* Check for late binding conditions */
> + is_default = !strcmp(c->pasta_ifn, pasta_default_ifn);
> + is_match = !strcmp(ifname, c->pasta_ifn);
Again, checking by interface name doesn't seem right.
> + if (rtm->rtm_family == AF_INET)
> + template_ifi = c->ifi4;
> + else if (rtm->rtm_family == AF_INET6)
> + template_ifi = c->ifi6;
> + else
> + return;
> +
> + unbound = (rtm->rtm_family == AF_INET) ?
> + (int)c->ifi4 <= 0 : (int)c->ifi6 <= 0;
Can some of this filtering logic be shared with the address handling
path?
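One possible shape for a shared helper, as a sketch only: map the
address family to the matching template index and test whether it is
still unbound, so that both the address and the route paths could call
the same code. struct tmpl_sketch stands in for the ifi4/ifi6 members
of struct ctx, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Stand-in for the ifi4/ifi6 members of struct ctx (assumption) */
struct tmpl_sketch {
	unsigned int ifi4;
	unsigned int ifi6;
};

/* Return a pointer to the template index for @af, NULL if the family
 * is neither AF_INET nor AF_INET6
 */
unsigned int *tmpl_ifi(struct tmpl_sketch *t, int af)
{
	if (af == AF_INET)
		return &t->ifi4;
	if (af == AF_INET6)
		return &t->ifi6;
	return NULL;
}

/* Same test as in the patch: a zero or negative index means unbound */
bool tmpl_unbound(const unsigned int *ifi)
{
	return (int)*ifi <= 0;
}
```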
> +
> + if (unbound && (is_default || is_match)) {
> + debug("Late binding (route): using %s as %s template",
> + ifname,
> + rtm->rtm_family == AF_INET ? "IPv4" : "IPv6");
> +
> + if (rtm->rtm_family == AF_INET) {
> + c->ifi4 = oif;
> + template_ifi = c->ifi4;
> + } else {
> + c->ifi6 = oif;
> + template_ifi = c->ifi6;
> + }
> +
> + if (is_default)
> + snprintf(c->pasta_ifn, sizeof(c->pasta_ifn),
> + "%s", ifname);
> + }
> +
> + if (oif != template_ifi)
> + return;
> +
> + if (rtm->rtm_family == AF_INET) {
> + char buf[INET_ADDRSTRLEN];
> +
> + if (!is_new) {
> + c->ip4.guest_gw = (struct in_addr){ 0 };
> + c->ip4.our_tap_addr = (struct in_addr){ 0 };
> + return;
This doesn't seem right. It will delete our gw information when *any*
default route is removed, even if another one still exists.
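A sketch of a narrower delete path, assuming the gateway carried in the
RTM_DELROUTE message is compared against the stored one before anything
is cleared. gw4_forget_if_match() is a hypothetical helper; a complete
fix would presumably also need to look for a remaining default route to
fall back to.

```c
#include <stdbool.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Clear the stored gateway @cur only if @removed, the gateway of the
 * deleted route, is the one we are using. Return true if it was cleared.
 */
bool gw4_forget_if_match(struct in_addr *cur, const struct in_addr *removed)
{
	if (cur->s_addr != removed->s_addr)
		return false;	/* Some other default route went away */

	cur->s_addr = htonl(INADDR_ANY);
	return true;
}
```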
> + }
> + c->ip4.guest_gw = *(struct in_addr *)gw;
> + c->ip4.our_tap_addr = c->ip4.guest_gw;
> + nl_route_set_def(nl_sock_ns, c->pasta_ifi, AF_INET, gw);
We should only touch the guest if c->pasta_conf_ns.
> + inet_ntop(AF_INET, &c->ip4.guest_gw, buf, sizeof(buf));
> + debug("Set IPv4 default route via %s", buf);
> + } else if (rtm->rtm_family == AF_INET6) {
> + char buf[INET6_ADDRSTRLEN];
> +
> + if (!is_new) {
> + c->ip6.guest_gw = (struct in6_addr){ 0 };
> + return;
> + }
> + c->ip6.guest_gw = *(struct in6_addr *)gw;
> + nl_route_set_def(nl_sock_ns, c->pasta_ifi, AF_INET6, gw);
> + inet_ntop(AF_INET6, &c->ip6.guest_gw, buf, sizeof(buf));
> + debug("Set IPv6 default route via %s", buf);
> + }
> + }
> }
>
> /**
> @@ -676,7 +769,8 @@ static int nl_linkaddr_init_do(void *arg)
> static int nl_linkaddr_host_init_do(void *arg)
> {
> struct sockaddr_nl addr = { .nl_family = AF_NETLINK,
> - .nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR };
> + .nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR |
> + RTMGRP_IPV4_ROUTE | RTMGRP_IPV6_ROUTE };
>
> (void)arg;
>
> --
> 2.51.1
>
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson