From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 23 May 2023 13:08:21 +1000
From: David Gibson
To: Stefano Brivio
Subject: Re: [PATCH v2 03/10] netlink: Add functionality to copy routes from outer namespace
References: <20230521234224.2770015-1-sbrivio@redhat.com> <20230521234224.2770015-4-sbrivio@redhat.com> <20230522115851.2c7f7736@elisabeth>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="a3W2/2d0TkescMQM"
Content-Disposition: inline
In-Reply-To: <20230522115851.2c7f7736@elisabeth>
CC: passt-dev@passt.top, Callum Parsey, me@yawnt.com, lemmi@nerd2nerd.org, Andrea Arcangeli
List-Id: Development discussion and patches for passt

--a3W2/2d0TkescMQM
Content-Type: text/plain;
charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, May 22, 2023 at 11:58:51AM +0200, Stefano Brivio wrote:
> On Mon, 22 May 2023 18:42:01 +1000
> David Gibson wrote:
>
> > On Mon, May 22, 2023 at 01:42:17AM +0200, Stefano Brivio wrote:
> > > Instead of just fetching the default gateway and configuring a single
> > > equivalent route in the target namespace, on 'pasta --config-net', it
> > > might be desirable in some cases to copy the whole set of routes
> > > corresponding to a given output interface.
> > >
> > > For instance, in:
> > > https://github.com/containers/podman/issues/18539
> > > IPv4 Default Route Does Not Propagate to Pasta Containers on Hetzner VPSes
> > >
> > > configuring the default gateway won't work without a gateway-less
> > > route (specifying the output interface only), because the default
> > > gateway is, somewhat dubiously, not on the same subnet as the
> > > container.
> > >
> > > This is a similar case to the one covered by commit 7656a6f88882
> > > ("conf: Adjust netmask on mismatch between IPv4 address/netmask and
> > > gateway"), and I'm not exactly proud of that workaround.
> > >
> > > We also have:
> > > https://bugs.passt.top/show_bug.cgi?id=49
> > > pasta does not work with tap-style interface
> > >
> > > for which, eventually, we should be able to configure a gateway-less
> > > route in the target namespace.
> > >
> > > Introduce different operation modes for nl_route(), including a new
> > > NL_DUP one, not exposed yet, which simply parrots back to the kernel
> > > the route dump for a given interface from the outer namespace, fixing
> > > up flags and interface indices on the way, and requesting to add the
> > > same routes in the target namespace, on the interface we manage.
> > >
> > > For n routes we want to duplicate, send n identical netlink requests
> > > including the full dump: routes might depend on each other and the
> > > kernel processes RTM_NEWROUTE messages sequentially, not atomically,
> > > and repeating the full dump naturally resolves dependencies without
> > > the need to actually calculate them.
> > >
> > > I'm not kidding, it actually works pretty well.
> >
> > If there's a way to detect whether the kernel rejected some of the
> > routes, it would be nice to cut that loop short as soon as all the
> > routes are inserted. Obviously that could be a followup improvement,
> > though.
>
> Yes, there's a way, but to keep things asynchronous in a simple way we
> process errors from nl_req() only at the next request.
>
> This part doesn't really need to be asynchronous, though: we could add
> a flag for nl_req() saying that we want to know about NLMSG_ERROR right
> away. This looks relatively straightforward, and already an improvement
> in the sense you mentioned.
>
> Actually parsing the error and finding out the offending route is a bit
> more complicated, though.

Right, but we don't necessarily need to do that: all we need is that
if there are *no* errors we can stop the loop early.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--a3W2/2d0TkescMQM
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmRsLh4ACgkQzQJF27ox
2Gdyig/9EQVTrYR0w3efEfCNlmHWvg+Y1+7z6ZlqfqZd1sdI5RHjNg2OtyLzO8zK
SKF6D3UBNzJUvE7vy+Y3rOGGLZw3yLJ9eJmI/n3TdaV4J1evtJ5ukXuIut0pMBWE
VcRGFIMB7kfhz8/cx+6DJOssTe/sR/IxiLBUyVSNgdVB91xaEgBbaO9I9JNrBmRn
lUETD0lZL1sfUIITB+Rf7dLyuk587Ov1/0iSdqxhdr9FiP2TvW3tUHe/yD1D5xJW
gaJDtarF8XnIof7YQ5dbbaq5BBJEdvfXyaovGfiou6aFijIoifCZfvphzzVAxASi
pW0aeQIxpi50izOJN122jwvLxvnEaxH6XIyU+wPEdAsCvAjO130lXDeyXliUJmXZ
4v0RXK32W+0grb7D74fH2z7uVuwKB7xCaGGnw/J0wlI9CrdnO1hTLkY4/+GkAKuf
0gE87m+cR5coO5O3EesUzGFHNa3hClYPJKYfHWmuHlVJXFueZauCV1kxGYxdHvps
lkqFxJ/vKAOh8EJgsu9BsmPSq03udYYnpMdCSLiCXmwkhxw96RMSfxfpgBJTu8FW
xfrqbNAMFLzHjj/sr2LxtvVYmeoCGmjrisgKO2+P+TYX/MqwSZ/5K6bAl+NRmAar
hRNEoDqcR3wTFYN/DjGWODzYzp5VMz2Y3p0lwR85fWz5XUZ5evE=
=7k7n
-----END PGP SIGNATURE-----

--a3W2/2d0TkescMQM--
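
For readers of the archive, the early-exit idea discussed in this message
can be illustrated with a toy model. This is only a sketch under stated
assumptions, not passt code: ToyKernel, Route and dup_routes() are
hypothetical names, the "kernel" here accepts a route with a gateway only
once a gateway-less route covering that gateway is installed, and re-adding
an existing route is assumed not to count as an error. The addresses are
documentation examples. The whole dump is replayed at most n times, and the
loop stops as soon as one pass completes with no errors, without trying to
work out which route failed.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    dst: str                  # destination prefix, e.g. "0.0.0.0/0"
    gw: Optional[str] = None  # next hop, or None for a gateway-less route


class ToyKernel:
    """Hypothetical stand-in for the routing table of the target namespace."""

    def __init__(self):
        self.table = []

    def _reachable(self, gw):
        # A gateway is usable once a gateway-less route covers its address.
        addr = ipaddress.ip_address(gw)
        return any(r.gw is None and addr in ipaddress.ip_network(r.dst)
                   for r in self.table)

    def add(self, route):
        """Return True on success, False if the gateway is not reachable yet."""
        if route in self.table:
            return True  # assumption: re-adding an existing route is no error
        if route.gw is not None and not self._reachable(route.gw):
            return False
        self.table.append(route)
        return True


def dup_routes(kernel, dump):
    """Replay the full dump until a pass is error-free, at most len(dump) times.

    Returns the number of passes used.
    """
    for attempt in range(1, len(dump) + 1):
        errors = sum(not kernel.add(r) for r in dump)
        if errors == 0:
            return attempt  # nothing was rejected: stop the loop early
    return len(dump)


# Two routes depend on a gateway-less route that comes last in the dump.
dump = [
    Route("0.0.0.0/0", "192.0.2.1"),
    Route("198.51.100.0/24", "192.0.2.1"),
    Route("192.0.2.1/32"),
]
kernel = ToyKernel()
passes = dup_routes(kernel, dump)
print(passes)  # 2 with this ordering, fewer than the 3-pass worst case
```

Only a per-pass "any error?" answer is needed for this, which matches the
point made in the reply: the offending route never has to be identified.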