From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 18 Mar 2024 14:28:44 +1100
From: David Gibson
To: Stefano Brivio
CC: passt-dev@passt.top, Martin Pitt, Paul Holzinger
Subject: Re: [PATCH v2] conf, netlink: Don't require a default route to start
References: <20240315161326.651583-1-sbrivio@redhat.com>
In-Reply-To: <20240315161326.651583-1-sbrivio@redhat.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature"; boundary="x4CyN20jihukZMCe"
Content-Disposition: inline
List-Id: Development discussion and patches for passt

--x4CyN20jihukZMCe
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Fri, Mar 15, 2024 at 05:13:26PM +0100, Stefano Brivio wrote:
> There might be isolated testing environments where default routes and
> global connectivity are not needed, a single interface has all
> non-loopback addresses and routes, and still passt and pasta are
> expected to work.
>
> In this case, it's pretty obvious what our upstream interface should
> be, so go ahead and select the only interface with at least one
> route, disabling DHCP and implying --no-map-gw as the documentation
> already states.
>
> If there are multiple interfaces with routes, though, refuse to start,
> because at that point it's really not clear what we should do.
>
> Reported-by: Martin Pitt
> Link: https://github.com/containers/podman/issues/21896
> Signed-off-by: Stefano brivio

It's an ugly hack, but a useful one, so

Reviewed-by: David Gibson

> ---
> v2: Initialise rtnh in nl_get_ext_if() before using it...
>
>  conf.c    |  4 ++--
>  netlink.c | 31 ++++++++++++++++++++++++++++---
>  passt.1   | 45 +++++++++++++++++++++++++++++----------------
>  3 files changed, 59 insertions(+), 21 deletions(-)
>
> diff --git a/conf.c b/conf.c
> index ac9fb34..644752c 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -584,7 +584,7 @@ static unsigned int conf_ip4(unsigned int ifi,
>  		ifi = nl_get_ext_if(nl_sock, AF_INET);
> 
>  	if (!ifi) {
> -		info("No interface with a default route for IPv4: disabling IPv4");
> +		info("No interface with a route for IPv4: disabling IPv4");
>  		return 0;
>  	}
> 
> @@ -656,7 +656,7 @@ static unsigned int conf_ip6(unsigned int ifi,
>  		ifi = nl_get_ext_if(nl_sock, AF_INET6);
> 
>  	if (!ifi) {
> -		info("No interface with a default route for IPv6: disabling IPv6");
> +		info("No interface with a route for IPv6: disabling IPv6");
>  		return 0;
>  	}
> 
> diff --git a/netlink.c b/netlink.c
> index 20de9b3..f93f377 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -254,6 +254,7 @@ unsigned int nl_get_ext_if(int s, sa_family_t af)
>  		.rtm.rtm_type = RTN_UNICAST,
>  		.rtm.rtm_family = af,
>  	};
> +	bool default_only = true;
>  	unsigned int ifi = 0;
>  	struct nlmsghdr *nh;
>  	struct rtattr *rta;
> @@ -262,21 +263,40 @@ unsigned int nl_get_ext_if(int s, sa_family_t af)
>  	uint32_t seq;
>  	size_t na;
> 
> +again:
> +	/* Look for an interface with a default route first, failing that, look
> +	 * for any interface with a route, and pick it only if it's the only
> +	 * interface with a route.
> +	 */
>  	seq = nl_send(s, &req, RTM_GETROUTE, NLM_F_DUMP, sizeof(req));
>  	nl_foreach_oftype(nh, status, s, buf, seq, RTM_NEWROUTE) {
>  		struct rtmsg *rtm = (struct rtmsg *)NLMSG_DATA(nh);
> 
> -		if (ifi || rtm->rtm_dst_len || rtm->rtm_family != af)
> -			continue;
> +		if (default_only) {
> +			if (ifi || rtm->rtm_dst_len || rtm->rtm_family != af)
> +				continue;
> +		} else {
> +			if (rtm->rtm_family != af)
> +				continue;
> +		}
> 
>  		for (rta = RTM_RTA(rtm), na = RTM_PAYLOAD(nh); RTA_OK(rta, na);
>  		     rta = RTA_NEXT(rta, na)) {
>  			if (rta->rta_type == RTA_OIF) {
> +				if (!default_only && ifi &&
> +				    ifi != *(unsigned int *)RTA_DATA(rta))
> +					return 0;
> +
>  				ifi = *(unsigned int *)RTA_DATA(rta);
>  			} else if (rta->rta_type == RTA_MULTIPATH) {
>  				const struct rtnexthop *rtnh;
> 
>  				rtnh = (struct rtnexthop *)RTA_DATA(rta);
> +
> +				if (!default_only && ifi &&
> +				    (int)ifi != rtnh->rtnh_ifindex)
> +					return 0;
> +
>  				ifi = rtnh->rtnh_ifindex;
>  			}
>  		}
> @@ -285,6 +305,11 @@ unsigned int nl_get_ext_if(int s, sa_family_t af)
>  	if (status < 0)
>  		warn("netlink: RTM_GETROUTE failed: %s", strerror(-status));
> 
> +	if (!ifi && default_only) {
> +		default_only = false;
> +		goto again;
> +	}
> +
>  	return ifi;
>  }
> 
> @@ -332,7 +357,7 @@ bool nl_route_get_def_multipath(struct rtattr *rta, void *gw)
>   * @af:		Address family
>   * @gw:		Default gateway to fill on NL_GET
>   *
> - * Return: 0 on success, negative error code on failure
> + * Return: error on netlink failure, or 0 (gw unset if default route not found)
>   */
>  int nl_route_get_def(int s, unsigned int ifi, sa_family_t af, void *gw)
>  {
> diff --git a/passt.1 b/passt.1
> index 9c492f5..3a23a43 100644
> --- a/passt.1
> +++ b/passt.1
> @@ -148,7 +148,9 @@ for an IPv6 \fIaddr\fR.
>  This option can be specified zero (for defaults) to two times (once for IPv4,
>  once for IPv6).
>  By default, assigned IPv4 and IPv6 addresses are taken from the host interfaces
> -with the first default route for the corresponding IP version.
> +with the first default route, if any, for the corresponding IP version. If no
> +default routes are available and there is just one interface with any route,
> +that interface will be chosen instead.
> 
>  .TP
>  .BR \-n ", " \-\-netmask " " \fImask
> @@ -172,9 +174,11 @@ Assign IPv4 \fIaddr\fR as default gateway via DHCP (option 3), or IPv6
>  This option can be specified zero (for defaults) to two times (once for IPv4,
>  once for IPv6).
>  By default, IPv4 and IPv6 gateways are taken from the host interface with the
> -first default route for the corresponding IP version. If the default route is a
> -multipath one, the gateway is the first nexthop router returned by the kernel
> -which has the highest weight in the set of paths.
> +first default route, if any, for the corresponding IP version. If the default
> +route is a multipath one, the gateway is the first nexthop router returned by
> +the kernel which has the highest weight in the set of paths. If no default
> +routes are available and there is just one interface with any route, that
> +interface will be chosen instead.
> 
>  Note: these addresses are also used as source address for packets directed to
>  the guest or to the target namespace having a loopback or local source address,
> @@ -185,9 +189,11 @@ to allow mapping of local traffic to guest and target namespace. See the
>  .BR \-i ", " \-\-interface " " \fIname
>  Use host interface \fIname\fR to derive addresses and routes.
>  Default is to use the interfaces specified by \fB--outbound-if4\fR and
> -\fB--outbound-if6\fR, for IPv4 and IPv6 addresses and routes, respectively. If
> -no interfaces are given, the interface with the first default routes for each IP
> -version is selected.
> +\fB--outbound-if6\fR, for IPv4 and IPv6 addresses and routes, respectively.
> +
> +If no interfaces are given, the interface with the first default routes for each
> +IP version is selected. If no default routes are available and there is just one
> +interface with any route, that interface will be chosen instead.
> 
>  .TP
>  .BR \-o ", " \-\-outbound " " \fIaddr
> @@ -203,14 +209,20 @@ By default, the source address is selected by the routing tables.
>  Bind IPv4 outbound sockets to host interface \fIname\fR, and, unless another
>  interface is specified via \fB-i\fR, \fB--interface\fR, use this interface to
>  derive IPv4 addresses and routes.
> -By default, the interface given by the default route is selected.
> +
> +By default, the interface given by the default route is selected. If no default
> +routes are available and there is just one interface with any route, that
> +interface will be chosen instead.
> 
>  .TP
>  .BR \-\-outbound-if6 " " \fIname
>  Bind IPv6 outbound sockets to host interface \fIname\fR, and, unless another
>  interface is specified via \fB-i\fR, \fB--interface\fR, use this interface to
>  derive IPv6 addresses and routes.
> -By default, the interface given by the default route is selected.
> +
> +By default, the interface given by the default route is selected. If no default
> +routes are available and there is just one interface with any route, that
> +interface will be chosen instead.
> 
>  .TP
>  .BR \-D ", " \-\-dns " " \fIaddr
> @@ -305,19 +317,20 @@ namespace will be ignored.
>  .BR \-\-no-map-gw
>  Don't remap TCP connections and untracked UDP traffic, with the gateway address
>  as destination, to the host. Implied if there is no gateway on the selected
> -default route for any of the enabled address families.
> +default route, or if there is no default route, for any of the enabled address
> +families.
> 
>  .TP
>  .BR \-4 ", " \-\-ipv4-only
>  Enable IPv4-only operation. IPv6 traffic will be ignored.
> -By default, IPv6 operation is enabled as long as at least an IPv6 default route
> -and an interface address are configured on a given host interface.
> +By default, IPv6 operation is enabled as long as at least an IPv6 route and an
> +interface address are configured on a given host interface.
> 
>  .TP
>  .BR \-6 ", " \-\-ipv6-only
>  Enable IPv6-only operation. IPv4 traffic will be ignored.
> -By default, IPv4 operation is enabled as long as at least an IPv4 default route
> -and an interface address are configured on a given host interface.
> +By default, IPv4 operation is enabled as long as at least an IPv4 route and an
> +interface address are configured on a given host interface.
> 
>  .SS \fBpasst\fR-only options
> 
> @@ -817,8 +830,8 @@ local addresses, and it would also be impossible for guest or target namespace
>  to route answers back.
> 
>  For convenience, and somewhat arbitrarily, the source address on these packets
> -is translated to the address of the default IPv4 or IPv6 gateway -- this is
> -known to be an existing, valid address on the same subnet.
> +is translated to the address of the default IPv4 or IPv6 gateway (if any) --
> +this is known to be an existing, valid address on the same subnet.
> 
>  Loopback destination addresses are instead translated to the observed external
>  address of the guest or target namespace. For IPv6 packets, if usage of a

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--x4CyN20jihukZMCe--