From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 5 Feb 2024 15:23:02 +1100
From: David Gibson
To: Stefano Brivio
Cc: passt-dev@passt.top, Ed Santiago
Subject: Re: [PATCH] netlink: Add support to fetch default gateway from multipath routes
References: <20240201233257.2287011-1-sbrivio@redhat.com>
In-Reply-To: <20240201233257.2287011-1-sbrivio@redhat.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="Igfdllvm0zGd6EN9"
Content-Disposition: inline
List-Id: Development discussion and patches for passt

--Igfdllvm0zGd6EN9
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Fri, Feb 02, 2024 at 12:32:57AM +0100, Stefano Brivio wrote:
> If the default route for a given IP version is a multipath one,
> instead of refusing to start because there's no RTA_GATEWAY attribute
> in the set returned by the kernel, we can just pick one of the paths.
> 
> To make this somewhat less arbitrary, pick the path with the highest
> weight, if weights differ.
> 
> Reported-by: Ed Santiago
> Link: https://github.com/containers/podman/issues/20927
> Signed-off-by: Stefano Brivio

Reviewed-by: David Gibson

I still hope I can dramatically simplify this at some point by making
a distinction between a general "host" pif and pifs bound to specific
host interfaces.  But this looks reasonable for the time being.

> ---
>  netlink.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++---
>  passt.1   |  6 ++++--
>  2 files changed, 52 insertions(+), 5 deletions(-)
> 
> diff --git a/netlink.c b/netlink.c
> index bf79dd4..f0b04cb 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -271,18 +271,60 @@ unsigned int nl_get_ext_if(int s, sa_family_t af)
>  
>  		for (rta = RTM_RTA(rtm), na = RTM_PAYLOAD(nh); RTA_OK(rta, na);
>  		     rta = RTA_NEXT(rta, na)) {
> -			if (rta->rta_type != RTA_OIF)
> -				continue;
> +			if (rta->rta_type == RTA_OIF) {
> +				ifi = *(unsigned int *)RTA_DATA(rta);
> +			} else if (rta->rta_type == RTA_MULTIPATH) {
> +				struct rtnexthop *rtnh;
>  
> -			ifi = *(unsigned int *)RTA_DATA(rta);
> +				rtnh = (struct rtnexthop *)RTA_DATA(rta);
> +				ifi = rtnh->rtnh_ifindex;
> +			}
>  		}
>  	}
> +
>  	if (status < 0)
>  		warn("netlink: RTM_GETROUTE failed: %s", strerror(-status));
>  
>  	return ifi;
>  }
>  
> +/**
> + * nl_route_get_def_multipath() - Get lowest-weight route from nexthop list
> + * @rta:	Routing netlink attribute with type RTA_MULTIPATH
> + * @gw:		Default gateway to fill
> + *
> + * Return: true if a gateway was found, false otherwise
> + */
> +bool nl_route_get_def_multipath(struct rtattr *rta, void *gw)
> +{
> +	struct rtnexthop *rtnh;
> +	bool found = false;
> +	int hops = -1;
> +
> +	for (rtnh = (struct rtnexthop *)RTA_DATA(rta);
> +	     RTNH_OK(rtnh, RTA_PAYLOAD(rta)); rtnh = RTNH_NEXT(rtnh)) {
> +		size_t len = rtnh->rtnh_len - sizeof(*rtnh);
> +		struct rtattr *rta_inner;
> +
> +		if (rtnh->rtnh_hops < hops)
> +			continue;
> +
> +		hops = rtnh->rtnh_hops;
> +
> +		for (rta_inner = RTNH_DATA(rtnh); RTA_OK(rta_inner, len);
> +		     rta_inner = RTA_NEXT(rta_inner, len)) {
> +
> +			if (rta_inner->rta_type != RTA_GATEWAY)
> +				continue;
> +
> +			memcpy(gw, RTA_DATA(rta_inner), RTA_PAYLOAD(rta_inner));
> +			found = true;
> +		}
> +	}
> +
> +	return found;
> +}
> +
>  /**
>   * nl_route_get_def() - Get default route for given interface and address family
>   * @s:		Netlink socket
> @@ -326,6 +368,9 @@ int nl_route_get_def(int s, unsigned int ifi, sa_family_t af, void *gw)
>  
>  		for (rta = RTM_RTA(rtm), na = RTM_PAYLOAD(nh); RTA_OK(rta, na);
>  		     rta = RTA_NEXT(rta, na)) {
> +			if (rta->rta_type == RTA_MULTIPATH)
> +				found = nl_route_get_def_multipath(rta, gw);
> +
>  			if (rta->rta_type != RTA_GATEWAY)
>  				continue;
>  
> diff --git a/passt.1 b/passt.1
> index efd6bb7..cc678ed 100644
> --- a/passt.1
> +++ b/passt.1
> @@ -171,8 +171,10 @@ Assign IPv4 \fIaddr\fR as default gateway via DHCP (option 3), or IPv6
>  \fIaddr\fR as source for NDP Router Advertisement and DHCPv6 messages.
>  This option can be specified zero (for defaults) to two times (once for IPv4,
>  once for IPv6).
> -By default, IPv4 and IPv6 addresses are taken from the host interface with the
> -first default route for the corresponding IP version.
> +By default, IPv4 and IPv6 gateways are taken from the host interface with the
> +first default route for the corresponding IP version. If the default route is a
> +multipath one, the gateway is the first nexthop router returned by the kernel
> +which has the highest weight in the set of paths.
>  
>  Note: these addresses are also used as source address for packets directed to
>  the guest or to the target namespace having a loopback or local source address,

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--Igfdllvm0zGd6EN9
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmXAYqUACgkQzQJF27ox
2GfGxBAAncp+smbbfsBCdkPpq1oLZD9hHpklWPTOGvGzNudkXH8O1EQvru5tyGXe
XBC8irtmdelyMsGz/q+OSKGbOo5hLahqChGgepi96BJT3lpfogBWPSMmxBdDvrCx
llsCv0Q6EyWuoqQviiaUD8jI1rFXAJmLP+GE4sH0cDqYBH/3aWae8485QyAmWEu+
U0Ht9b1+bkalkqu5B3onDl7vDOvyeLdC2MtViFCzvXptQkZfcBw9DDd93B7ElHwQ
AcBtjHNJBlSojP54THUSEwdArUkge0rres95jNJj5ADxGqoKMhl5Ik/e3Q1daPbP
V9BAHKrOjh4z+hEQ/U5SJFtdxtSGAuSrcenqP61cET3m3tkt83qgCinSZ1yDlwlK
ZL4xuffvRyxSsQkRNA/fGEayIgYUqZ2wrOOZqC4EDy+ZUfQ3gNGQrhJs4fF48tPB
EfGPzDgCwTrtDy+9sG0jEXKf+NDwwFvUzDJ5IJm5fi+cY363cBNvJAur1RNg56Fr
wgP38XhHKjqqU/TNYOwgh3XaJYtGWU/39DBxffXyo82txyMGm82XznBvJ8IdNPGf
m32rDZ3AgS1FJDh1iRdO+0kPswxEnfB3Vj/ex1MddihjAFnCcr47s+EJVXINkbpU
lhkgZBhwGjkNUcqvNcPcNp4QUg6hNAkCD3Fd3hNvTXGIJWBSOBM=
=5BZJ
-----END PGP SIGNATURE-----

--Igfdllvm0zGd6EN9--