public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Jon Maloy <jmaloy@redhat.com>
Cc: sbrivio@redhat.com, dgibson@redhat.com, passt-dev@passt.top
Subject: Re: [RFC  10/12] netlink: Add host-side route monitoring and propagation
Date: Thu, 18 Dec 2025 15:53:56 +1100
Message-ID: <aUOI5Ou25bj3FuD_@zatzit>
In-Reply-To: <20251215015441.887736-11-jmaloy@redhat.com>


On Sun, Dec 14, 2025 at 08:54:39PM -0500, Jon Maloy wrote:
> We extend host-side netlink monitoring to also track default route
> changes on the template interface and propagate them to the namespace.
> 
> - Subscribe to RTMGRP_IPV4_ROUTE and RTMGRP_IPV6_ROUTE groups on the
>   host-side netlink socket
> - Handle RTM_NEWROUTE/RTM_DELROUTE events for default routes.
> - Support late binding via routes: if no template interface is bound
>   yet, adopt the interface in question when a default route appears
>   on it.
> - When a default route is added, set guest_gw/our_tap_addr and
>   propagate the route to the namespace via nl_route_set_def()
> - When a default route is removed, clear guest_gw/our_tap_addr
> 
> Signed-off-by: Jon Maloy <jmaloy@redhat.com>
> ---
>  netlink.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 97 insertions(+), 3 deletions(-)
> 
> diff --git a/netlink.c b/netlink.c
> index 583ada8..d049239 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -199,7 +199,7 @@ static bool nl_addr6_add(struct ctx *c, const struct in6_addr *addr,
>  	idx = c->ip6.addr_count++;
>  	c->ip6.addrs[idx].addr = *addr;
>  	c->ip6.addrs[idx].prefix_len = prefix_len;
> -	c->ip6.addrs[idxyes].permanent = 0;
> +	c->ip6.addrs[idx].permanent = 0;
>  	return true;
>  }
>  
> @@ -254,7 +254,7 @@ static bool nl_addr6_del(struct ctx *c, const struct in6_addr *addr)
>  }
>  
>  /**
> - * nl_linkaddr_host_msg_read() - Handle host-side link/addr changes
> + * nl_linkaddr_host_msg_read() - Handle host-side link/addr/route changes
>   * @c:		Execution context
>   * @nh:	Netlink message header
>   *
> @@ -420,6 +420,99 @@ static void nl_linkaddr_host_msg_read(struct ctx *c, const struct nlmsghdr *nh)
>  		}
>  		return;
>  	}
> +
> +	if (nh->nlmsg_type == RTM_NEWROUTE || nh->nlmsg_type == RTM_DELROUTE) {

There's enough in this block that it's probably worth splitting out into a function.

> +		bool is_new = (nh->nlmsg_type == RTM_NEWROUTE);
> +		const struct rtmsg *rtm = NLMSG_DATA(nh);
> +		struct rtattr *rta = RTM_RTA(rtm);
> +		size_t na = RTM_PAYLOAD(nh);
> +		unsigned int template_ifi;
> +		char ifname[IFNAMSIZ];
> +		unsigned int oif = 0;
> +		void *gw = NULL;
> +		bool is_default;
> +		bool is_match;
> +		bool unbound;
> +
> +		/* Only interested in default routes */

I'm not convinced this is enough.  Just as we have to copy non-default
routes in nl_route_dup(), I think we're going to need to keep them
updated here.  Speaking of which, it's ugly to have nl_route_dup() for
the initial route copy, then an entirely different path for subsequent
updates.  Similar to the neighbour table, I think it should be
possible to unify these by setting up the handler, then forcing an
enumeration of the existing routes.
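
As a rough illustration of "forcing an enumeration": a dump request sent
after the handler is in place makes the kernel report each existing route
as an RTM_NEWROUTE message, ending with NLMSG_DONE, so one handler could
see both the initial state and later updates.  This is only a sketch.
route_dump_request() is a hypothetical name, not an existing passt
function, and the replies are assumed to be read by whatever code already
processes messages on that socket.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: ask the kernel to dump all routes of one family.
 * @s:		Netlink socket (NETLINK_ROUTE)
 * @family:	AF_INET or AF_INET6
 * @seq:	Sequence number to match replies against
 *
 * Return: 0 if the request was sent, -1 on failure
 */
static int route_dump_request(int s, unsigned char family, unsigned int seq)
{
	struct {
		struct nlmsghdr nlh;
		struct rtmsg rtm;
	} req;

	memset(&req, 0, sizeof(req));
	req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req.rtm));
	req.nlh.nlmsg_type = RTM_GETROUTE;
	/* NLM_F_DUMP asks for every matching entry, not a single lookup */
	req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req.nlh.nlmsg_seq = seq;
	req.rtm.rtm_family = family;

	if (send(s, &req, req.nlh.nlmsg_len, 0) < 0)
		return -1;

	return 0;
}
```

If the dump goes out on the same socket that is subscribed to the route
groups, dump replies and asynchronous notifications can arrive
interleaved, so the handler would have to cope with seeing the same route
more than once.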

> +		if (rtm->rtm_dst_len != 0)
> +			return;
> +
> +		for (; RTA_OK(rta, na); rta = RTA_NEXT(rta, na)) {
> +			if (rta->rta_type == RTA_GATEWAY)
> +				gw = RTA_DATA(rta);
> +			else if (rta->rta_type == RTA_OIF)
> +				oif = *(unsigned int *)RTA_DATA(rta);
> +		}
> +
> +		if (!gw || !oif)
> +			return;
> +
> +		/* Get interface name for late binding check */
> +		if (!if_indextoname(oif, ifname))
> +			return;
> +
> +		/* Check for late binding conditions */
> +		is_default = !strcmp(c->pasta_ifn, pasta_default_ifn);
> +		is_match = !strcmp(ifname, c->pasta_ifn);

Again, checking by interface name doesn't seem right.

> +		if (rtm->rtm_family == AF_INET)
> +			template_ifi = c->ifi4;
> +		else if (rtm->rtm_family == AF_INET6)
> +			template_ifi = c->ifi6;
> +		else
> +			return;
> +
> +		unbound = (rtm->rtm_family == AF_INET) ?
> +			  (int)c->ifi4 <= 0 : (int)c->ifi6 <= 0;

Can some of this filtering logic be shared with the address handling
path?

> +
> +		if (unbound && (is_default || is_match)) {
> +			debug("Late binding (route): using %s as %s template",
> +			      ifname,
> +			      rtm->rtm_family == AF_INET ? "IPv4" : "IPv6");
> +
> +			if (rtm->rtm_family == AF_INET) {
> +				c->ifi4 = oif;
> +				template_ifi = c->ifi4;
> +			} else {
> +				c->ifi6 = oif;
> +				template_ifi = c->ifi6;
> +			}
> +
> +			if (is_default)
> +				snprintf(c->pasta_ifn, sizeof(c->pasta_ifn),
> +					 "%s", ifname);
> +		}
> +
> +		if (oif != template_ifi)
> +			return;
> +
> +		if (rtm->rtm_family == AF_INET) {
> +			char buf[INET_ADDRSTRLEN];
> +
> +			if (!is_new) {
> +				c->ip4.guest_gw = (struct in_addr){ 0 };
> +				c->ip4.our_tap_addr = (struct in_addr){ 0 };
> +				return;

This doesn't seem right.  It will delete our gw information when *any*
default route is removed, even if another one still exists.
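
One possible way to avoid that, sketched under assumptions: remember which
default gateways have been seen on the template interface, and only clear
the gateway once the last one goes away.  struct def_gw4_table,
def_gw4_add(), def_gw4_del() and MAX_DEF_GW are hypothetical names for
illustration, not existing passt code.  The sketch also keys entries by
gateway address alone, so two default routes via the same gateway with
different metrics would be treated as one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <netinet/in.h>

#define MAX_DEF_GW	8	/* hypothetical limit, for illustration only */

/* Hypothetical table of IPv4 default gateways on the template interface */
struct def_gw4_table {
	struct in_addr gw[MAX_DEF_GW];
	size_t count;
};

/* Record a gateway from RTM_NEWROUTE.
 * Return: true if the gateway is now tracked, false if the table is full
 */
static bool def_gw4_add(struct def_gw4_table *t, struct in_addr gw)
{
	size_t i;

	for (i = 0; i < t->count; i++)
		if (t->gw[i].s_addr == gw.s_addr)
			return true;	/* already known */

	if (t->count == MAX_DEF_GW)
		return false;

	t->gw[t->count++] = gw;
	return true;
}

/* Forget a gateway from RTM_DELROUTE.
 * Return: true if no default route is left, so the caller may clear
 *	   guest_gw; false if at least one other default route remains
 */
static bool def_gw4_del(struct def_gw4_table *t, struct in_addr gw)
{
	size_t i;

	for (i = 0; i < t->count; i++) {
		if (t->gw[i].s_addr != gw.s_addr)
			continue;

		/* Fill the hole with the last entry, order doesn't matter */
		t->gw[i] = t->gw[--t->count];
		break;
	}

	return t->count == 0;
}
```

With something like this, the RTM_DELROUTE path would zero guest_gw only
when def_gw4_del() returns true, and otherwise fall back to one of the
remaining entries.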

> +			}
> +			c->ip4.guest_gw = *(struct in_addr *)gw;
> +			c->ip4.our_tap_addr = c->ip4.guest_gw;
> +			nl_route_set_def(nl_sock_ns, c->pasta_ifi, AF_INET, gw);

We should only touch the guest if c->pasta_conf_ns.

> +			inet_ntop(AF_INET, &c->ip4.guest_gw, buf, sizeof(buf));
> +			debug("Set IPv4 default route via %s", buf);
> +		} else if (rtm->rtm_family == AF_INET6) {
> +			char buf[INET6_ADDRSTRLEN];
> +
> +			if (!is_new) {
> +				c->ip6.guest_gw = (struct in6_addr){ 0 };
> +				return;
> +			}
> +			c->ip6.guest_gw = *(struct in6_addr *)gw;
> +			nl_route_set_def(nl_sock_ns, c->pasta_ifi, AF_INET6, gw);
> +			inet_ntop(AF_INET6, &c->ip6.guest_gw, buf, sizeof(buf));
> +			debug("Set IPv6 default route via %s", buf);
> +		}
> +	}
>  }
>  
>  /**
> @@ -676,7 +769,8 @@ static int nl_linkaddr_init_do(void *arg)
>  static int nl_linkaddr_host_init_do(void *arg)
>  {
>  	struct sockaddr_nl addr = { .nl_family = AF_NETLINK,
> -		.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR };
> +		.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR |
> +			     RTMGRP_IPV4_ROUTE | RTMGRP_IPV6_ROUTE };
>  
>  	(void)arg;
>  
> -- 
> 2.51.1
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson


