From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202410 header.b=laad7E+Z; dkim-atps=neutral Received: from mail.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id 4371B5A004C for ; Tue, 26 Nov 2024 05:01:23 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202410; t=1732593668; bh=aCosWA+5V5eCcEdkzchDYGnmuce5FqEABBtqI9XbIcs=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=laad7E+ZascRENQqSDaY1d58u+K9F46eYsG/KMXBxHWE8cWqf5+Sj6JOnpESxIx+L DFJ2acow93kGSI+c/PoGmNJc84ZtxHtdZO//PIaxjBeDCZj4zZU8ciU3QIkvd6vEk5 jygBNl84A7BTia+mdA7kMDLjysYqktD3JO430xJwPNwH/KCvJ56DlSC8MD9xwtfSyH /+EO02Xn+80jHW19/8kVpyxuadZ3lGNvAoucSxy5B39auQqORYkNsFYKOeYRXsjEvI awaAWO5ProAL2Xb27ZcUD40YkRaad/ZRYt07deILfLMgPBgB3EvRSCqFy9RpzYQDFA 7Uzo5xdlFvN0Q== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4Xy8506Y0lz4xfx; Tue, 26 Nov 2024 15:01:08 +1100 (AEDT) Date: Tue, 26 Nov 2024 15:01:06 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH v2] treewide: Introduce 'local mode' for disconnected setups Message-ID: References: <20241124235037.4127862-1-sbrivio@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="1qJhLL+wC8buaGlJ" Content-Disposition: inline In-Reply-To: <20241124235037.4127862-1-sbrivio@redhat.com> Message-ID-Hash: X5PNLDFAKXTIMRAUYR6VYEIC3JFJRKFP X-Message-ID-Hash: X5PNLDFAKXTIMRAUYR6VYEIC3JFJRKFP X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, Paul Holzinger X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --1qJhLL+wC8buaGlJ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Nov 25, 2024 at 12:50:37AM +0100, Stefano Brivio wrote: > There are setups where no host interface is available or configured > at all, intentionally or not, temporarily or not, but users expect > (Podman) containers to run in any case as they did with slirp4netns, > and we're now getting reports that we broke such setups at a rather > alarming rate. This looks pretty good to me. It's simpler than I feared, and I don't think it will tie our hands much in terms of future fancier handling (netlink monitors, no-template-interface mode etc.). I'm pretty good to go ahead with this soon. I do have a few thoughts on some details which might be polished, see below. > To this end, if we don't find any usable host interface, instead of > exiting: >=20 > - for IPv4, use 169.254.2.1 as guest/container address and 169.254.2.2 > as default gateway >=20 > - for IPv6, don't assign any address (forcibly disable DHCPv6), and > use the *first* link-local address we observe to represent the > guest/container. Advertise fe80::1 as default gateway >=20 > - use 'tap0' as default interface name for pasta >=20 > Change ifi4 and ifi6 in struct ctx to int and accept a special -1 > value meaning that no host interface was selected, but the IP family > is enabled. The fact that the kernel uses unsigned int values for > those is not an issue as 1. one can't create so many interfaces > anyway and 2. we otherwise handle those values transparently. ifi4 and ifi6 are currently kind of overloaded. A few places that actually need to deal with the host side plumbing use them for the interface index. Most places use these only to tell if IPv4/IPv6 is enabled (for the guest). This patch is disconnecting whether we have IPv4/IPv6 enabled from whether there exists a suitable external interface. So, I think it makes sense to disconnect the variables as well, rather than complexifying the encoding into ifi4/ifi6. I'd suggest instead that we introduce new boolean ip[46]_enabled variables, and change all the places that check ifi4/ifi6 only for that reason to use those instead. That can be a preiminary patch. Then we keep the encoding of ifi4/ifi6 as is. I think that will make things slightly cleaner and not a great deal harder for this initial version of a "local" setting. It will also extend more naturally if we have future modes that can interact with multiple host side interfaces or other more complex scenarios. > Fix a botched conditional in conf_print() to actually skip printing > DHCPv6 information if DHCPv6 is disabled (and skip printing NDP > information if NDP is disabled). >=20 > Link: https://github.com/containers/podman/issues/24614 > Signed-off-by: Stefano Brivio > --- > v2: > - drop fixed link-local address for IPv6 > - change addresses to be reminiscent of libslirp's default choices > - add man page changes and commit message > - fix several things around, from testing (checked with several > --map-guest-addr and --map-host-loopback combinations etc.) >=20 > conf.c | 92 +++++++++++++++++++++++++++++++++++++++++++++------------ > passt.1 | 33 +++++++++++++++++---- > passt.h | 8 ++--- > pasta.c | 7 +++-- > tap.c | 3 ++ > 5 files changed, 113 insertions(+), 30 deletions(-) >=20 > diff --git a/conf.c b/conf.c > index 86566db..551c2b0 100644 > --- a/conf.c > +++ b/conf.c > @@ -48,6 +48,20 @@ > =20 > #define NETNS_RUN_DIR "/run/netns" > =20 > +#define IP4_LL_GUEST_ADDR (struct in_addr){ htonl_constant(0xa9fe0201) } > + /* 169.254.2.1, libslirp default: 10.0.2.1 */ > + > +#define IP4_LL_GUEST_GW (struct in_addr){ htonl_constant(0xa9fe0202) } > + /* 169.254.2.2, libslirp default: 10.0.2.2 */ > + > +#define IP4_LL_PREFIX_LEN 16 > + > +#define IP6_LL_GUEST_GW (struct in6_addr) \ > + {{{ 0xfe, 0x80, 0, 0, 0, 0, 0, 0, \ > + 0, 0, 0, 0, 0, 0, 0, 0x01 }}} > + > +const char *pasta_default_ifn =3D "tap0"; Not sure I like this as a default name, though I guess it's fine. It kind of exposes the underlying mechanisms we're using in a way that's not really necessary. Maybe just call it "pasta" or "pasta0"? > + > /** > * next_chunk - Return the next piece of a string delimited by a charact= er > * @s: String to search > @@ -631,7 +645,7 @@ static unsigned int conf_ip4(unsigned int ifi, struct= ip4_ctx *ip4) > ifi =3D nl_get_ext_if(nl_sock, AF_INET); > =20 > if (!ifi) { > - info("Couldn't pick external interface: disabling IPv4"); > + debug("Failed to detect external interface for IPv4"); > return 0; > } > =20 > @@ -639,8 +653,8 @@ static unsigned int conf_ip4(unsigned int ifi, struct= ip4_ctx *ip4) > int rc =3D nl_route_get_def(nl_sock, ifi, AF_INET, > &ip4->guest_gw); > if (rc < 0) { > - err("Couldn't discover IPv4 gateway address: %s", > - strerror(-rc)); > + debug("Couldn't discover IPv4 gateway address: %s", > + strerror(-rc)); > return 0; > } > } > @@ -649,8 +663,8 @@ static unsigned int conf_ip4(unsigned int ifi, struct= ip4_ctx *ip4) > int rc =3D nl_addr_get(nl_sock, ifi, AF_INET, > &ip4->addr, &ip4->prefix_len, NULL); > if (rc < 0) { > - err("Couldn't discover IPv4 address: %s", > - strerror(-rc)); > + debug("Couldn't discover IPv4 address: %s", > + strerror(-rc)); > return 0; > } > } > @@ -677,6 +691,19 @@ static unsigned int conf_ip4(unsigned int ifi, struc= t ip4_ctx *ip4) > return ifi; > } > =20 > +/** > + * conf_ip4_local() - Configure IPv4 addresses and attributes for local = mode > + * @ip4: IPv4 context (will be written) > + */ > +static void conf_ip4_local(struct ip4_ctx *ip4) > +{ > + ip4->addr_seen =3D ip4->addr =3D IP4_LL_GUEST_ADDR; > + ip4->our_tap_addr =3D ip4->guest_gw =3D IP4_LL_GUEST_GW; > + ip4->prefix_len =3D IP4_LL_PREFIX_LEN; > + > + ip4->no_copy_addrs =3D ip4->no_copy_routes =3D true; > +} > + > /** > * conf_ip6() - Verify or detect IPv6 support, get relevant addresses > * @ifi: Host interface to attempt (0 to determine one) > @@ -693,15 +720,15 @@ static unsigned int conf_ip6(unsigned int ifi, stru= ct ip6_ctx *ip6) > ifi =3D nl_get_ext_if(nl_sock, AF_INET6); > =20 > if (!ifi) { > - info("Couldn't pick external interface: disabling IPv6"); > + debug("Failed to detect external interface for IPv6"); > return 0; > } > =20 > if (IN6_IS_ADDR_UNSPECIFIED(&ip6->guest_gw)) { > rc =3D nl_route_get_def(nl_sock, ifi, AF_INET6, &ip6->guest_gw); > if (rc < 0) { > - err("Couldn't discover IPv6 gateway address: %s", > - strerror(-rc)); > + debug("Couldn't discover IPv6 gateway address: %s", > + strerror(-rc)); > return 0; > } > } > @@ -710,7 +737,7 @@ static unsigned int conf_ip6(unsigned int ifi, struct= ip6_ctx *ip6) > IN6_IS_ADDR_UNSPECIFIED(&ip6->addr) ? &ip6->addr : NULL, > &prefix_len, &ip6->our_tap_ll); > if (rc < 0) { > - err("Couldn't discover IPv6 address: %s", strerror(-rc)); > + debug("Couldn't discover IPv6 address: %s", strerror(-rc)); > return 0; > } > =20 > @@ -726,6 +753,17 @@ static unsigned int conf_ip6(unsigned int ifi, struc= t ip6_ctx *ip6) > return ifi; > } > =20 > +/** > + * conf_ip6_local() - Configure IPv6 addresses and attributes for local = mode > + * @ip6: IPv6 context (will be written) > + */ > +static void conf_ip6_local(struct ip6_ctx *ip6) > +{ > + ip6->our_tap_ll =3D ip6->guest_gw =3D IP6_LL_GUEST_GW; > + > + ip6->no_copy_addrs =3D ip6->no_copy_routes =3D true; > +} > + > /** > * usage() - Print usage, exit with given status code > * @name: Executable name > @@ -948,12 +986,14 @@ static void conf_print(const struct ctx *c) > char bufmac[ETH_ADDRSTRLEN], ifn[IFNAMSIZ]; > int i; > =20 > - info("Template interface: %s%s%s%s%s", > - c->ifi4 ? if_indextoname(c->ifi4, ifn) : "", > - c->ifi4 ? " (IPv4)" : "", > - (c->ifi4 && c->ifi6) ? ", " : "", > - c->ifi6 ? if_indextoname(c->ifi6, ifn) : "", > - c->ifi6 ? " (IPv6)" : ""); > + if (c->ifi4 > 0 || c->ifi6 > 0) { > + info("Template interface: %s%s%s%s%s", > + c->ifi4 > 0 ? if_indextoname(c->ifi4, ifn) : "", > + c->ifi4 > 0 ? " (IPv4)" : "", > + (c->ifi4 && c->ifi6) ? ", " : "", > + c->ifi6 > 0 ? if_indextoname(c->ifi6, ifn) : "", > + c->ifi6 > 0 ? " (IPv6)" : ""); > + } > =20 > if (*c->ip4.ifname_out || *c->ip6.ifname_out) { > info("Outbound interface: %s%s%s%s%s", > @@ -1024,9 +1064,9 @@ static void conf_print(const struct ctx *c) > =20 > if (!c->no_ndp && !c->no_dhcpv6) > info("NDP/DHCPv6:"); > - else if (!c->no_ndp) > - info("DHCPv6:"); > else if (!c->no_dhcpv6) > + info("DHCPv6:"); > + else if (!c->no_ndp) > info("NDP:"); > else > goto dns6; > @@ -1735,8 +1775,20 @@ void conf(struct ctx *c, int argc, char **argv) > c->ifi6 =3D conf_ip6(ifi6, &c->ip6); > if ((!c->ifi4 && !c->ifi6) || > (*c->ip4.ifname_out && !c->ifi4) || > - (*c->ip6.ifname_out && !c->ifi6)) > - die("External interface not usable"); I guess a question is if the host has IPv4 but no IPv6 connectivity, do we want to disable IPv6 in the guest, or configure local-mode IPv6? That is something we can probably change without too much danger later on. > + (*c->ip6.ifname_out && !c->ifi6)) { > + info("No external interface as template, switch to local mode"); > + > + conf_ip4_local(&c->ip4); > + c->ifi4 =3D -1; > + > + conf_ip6_local(&c->ip6); > + c->ifi6 =3D -1; > + > + if (!*c->pasta_ifn) { > + strncpy(c->pasta_ifn, pasta_default_ifn, > + sizeof(c->pasta_ifn) - 1); > + } > + } > =20 > if (c->ifi4 && !no_map_gw && > IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback)) > @@ -1840,6 +1892,8 @@ void conf(struct ctx *c, int argc, char **argv) > if (!c->ifi6) { > c->no_ndp =3D 1; > c->no_dhcpv6 =3D 1; > + } else if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr)) { > + c->no_dhcpv6 =3D 1; > } > =20 > if (!c->mtu) > diff --git a/passt.1 b/passt.1 > index f084978..f5cf8a4 100644 > --- a/passt.1 > +++ b/passt.1 > @@ -160,7 +160,9 @@ once for IPv6). > By default, assigned IPv4 and IPv6 addresses are taken from the host int= erfaces > with the first default route, if any, for the corresponding IP version. = If no > default routes are available and there is any interface with any route f= or a > -given IP version, the first of these interfaces will be chosen instead. > +given IP version, the first of these interfaces will be chosen instead. = If no > +such interface exists, the link-local address 169.254.2.1 is assigned fo= r IPv4, > +and no additional address will be assigned for IPv6. > =20 > .TP > .BR \-n ", " \-\-netmask " " \fImask > @@ -188,7 +190,9 @@ first default route, if any, for the corresponding IP= version. If the default > route is a multipath one, the gateway is the first nexthop router return= ed by > the kernel which has the highest weight in the set of paths. If no defau= lt > routes are available and there is just one interface with any route, that > -interface will be chosen instead. > +interface will be chosen instead. If no such interface exists, the link-= local > +address 169.254.2.2 is used for IPv4, and the link-local address fe80::1= is used > +for IPv6. > =20 > Note: these addresses are also used as source address for packets direct= ed to > the guest or to the target namespace having a loopback or local source a= ddress, > @@ -203,7 +207,9 @@ Default is to use the interfaces specified by \fB--ou= tbound-if4\fR and > =20 > If no interfaces are given, the interface with the first default routes = for each > IP version is selected. If no default routes are available and there is = just one > -interface with any route, that interface will be chosen instead. > +interface with any route, that interface will be chosen instead. If no s= uch > +interface exists, host interfaces will be ignored for the purposes of as= signing > +addresses and routes, and link-local addresses will be used instead. > =20 > .TP > .BR \-o ", " \-\-outbound " " \fIaddr > @@ -222,7 +228,8 @@ derive IPv4 addresses and routes. > =20 > By default, the interface given by the default route is selected. If no = default > routes are available and there is just one interface with any route, that > -interface will be chosen instead. > +interface will be chosen instead. If no such interface exists, outbound = sockets > +will not be bound to any specific interface. > =20 > .TP > .BR \-\-outbound-if6 " " \fIname > @@ -232,7 +239,8 @@ derive IPv6 addresses and routes. > =20 > By default, the interface given by the default route is selected. If no = default > routes are available and there is just one interface with any route, that > -interface will be chosen instead. > +interface will be chosen instead. If no such interface exists, outbound = sockets > +will not be bound to any specific interface. > =20 > .TP > .BR \-D ", " \-\-dns " " \fIaddr > @@ -504,6 +512,7 @@ Default is \fBnone\fR. > .BR \-I ", " \-\-ns-ifname " " \fIname > Name of tap interface to be created in target namespace. > By default, the same interface name as the external, routable interface = is used. > +If no such interface exists, the name \fItap0\fR will be used instead. > =20 > .TP > .BR \-t ", " \-\-tcp-ports " " \fIspec > @@ -1032,6 +1041,20 @@ If the sending window cannot be queried, it will a= lways be announced as the > current sending buffer size to guest or target namespace. This might aff= ect > throughput of TCP connections. > =20 > +.SS Local mode for disconnected setups > + > +If \fBpasst\fR and \fBpasta\fR fail to find a host interface with a conf= igured > +address, other than loopback addresses, they will, obviously, not attemp= t to > +source addresses or routes from the host. > + > +In this case, unless configured otherwise, they will assign the IPv4 lin= k-local > +address 169.254.2.1 to the guest or target namespace, and no IPv6 addres= s. The > +notion of the guest or target namespace IPv6 address is derived from the= first > +link-local address observed. > + > +Default gateways will be assigned as the link-local address 169.254.2.2 = for > +IPv4, and as the link-local address fe80::1 for IPv6. > + > .SH LIMITATIONS > =20 > Currently, IGMP/MLD proxying (RFC 4605) and support for SCTP (RFC 4960) = are not > diff --git a/passt.h b/passt.h > index 72c7f72..799ee50 100644 > --- a/passt.h > +++ b/passt.h > @@ -202,10 +202,10 @@ struct ip6_ctx { > * @our_tap_mac: Pasta/passt's MAC on the tap link > * @guest_mac: MAC address of guest or namespace, seen or configured > * @hash_secret: 128-bit secret for siphash functions > - * @ifi4: Index of template interface for IPv4, 0 if IPv4 disabled > + * @ifi4: Template interface for IPv4, -1: none, 0: IPv4 disabled > * @ip: IPv4 configuration > * @dns_search: DNS search list > - * @ifi6: Index of template interface for IPv6, 0 if IPv6 disabled > + * @ifi6: Template interface for IPv6, -1: none, 0: IPv6 disabled > * @ip6: IPv6 configuration > * @pasta_ifn: Name of namespace interface for pasta > * @pasta_ifi: Index of namespace interface for pasta > @@ -258,12 +258,12 @@ struct ctx { > unsigned char guest_mac[ETH_ALEN]; > uint64_t hash_secret[2]; > =20 > - unsigned int ifi4; > + int ifi4; > struct ip4_ctx ip4; > =20 > struct fqdn dns_search[MAXDNSRCH]; > =20 > - unsigned int ifi6; > + int ifi6; > struct ip6_ctx ip6; > =20 > char pasta_ifn[IF_NAMESIZE]; > diff --git a/pasta.c b/pasta.c > index a117704..96dacc3 100644 > --- a/pasta.c > +++ b/pasta.c > @@ -369,8 +369,11 @@ void pasta_ns_conf(struct ctx *c) > 0, IFF_NOARP); > =20 > if (c->ip6.no_copy_addrs) { > - rc =3D nl_addr_set(nl_sock_ns, c->pasta_ifi, > - AF_INET6, &c->ip6.addr, 64); > + if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr)) { > + rc =3D nl_addr_set(nl_sock_ns, > + c->pasta_ifi, AF_INET6, > + &c->ip6.addr, 64); > + } > } else { > rc =3D nl_addr_dup(nl_sock, c->ifi6, > nl_sock_ns, c->pasta_ifi, > diff --git a/tap.c b/tap.c > index 14d9b3d..5347df4 100644 > --- a/tap.c > +++ b/tap.c > @@ -803,6 +803,9 @@ resume: > if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_seen)) { > c->ip6.addr_seen =3D *saddr; > } > + > + if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr)) > + c->ip6.addr =3D *saddr; Hmm.. not entirely sure if this is a good idea, or whether we should instead keep the assigned address blank as an indication that we never assigned something, the guest just picked for itself. > } else if (!IN6_IS_ADDR_UNSPECIFIED(saddr)){ > c->ip6.addr_seen =3D *saddr; > } --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --1qJhLL+wC8buaGlJ Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmdFR/AACgkQzQJF27ox 2GdSLA/+PPrdeTzNX40hCSdk8S2I3gWGyMWntwadRbjta25nwB33U4Co2eydoPvv grfR1KJT4oNpJgr9lj6C959qbW3Kxuna8547nxfX+d1Pga4gmIpb5o9HPCUJpELo 3rvZBUz9Xxdy42WeC4PeVmSyT1H9XDVG97TET35jp38jwc2uIUHPNjgVvYq68Xu7 bqGPFQbUdxSGF3LqQ9bWL1UH5ZNxDsS1YLs7QHV33aX1IocZn4Aa9Y8nxNMwn4Hr fAgtL88z2QWCl7aNGIa+yUnpnlnbQ1UYUUyDpFuim1mtfM9JL6VM/VNKKPx9iAHQ KRwFicL4YnWrXPFSkAo2dO8R9FeOwsidiPsu0vqwJXPYBsccE7Ogjr8nqVagLrsH dwJC9vPBpfFhkxWBEO22PN7WJZGjCTcI3i426m4XgJA6j+tUI2mtN6uUm5fWkC9f NiGJbEZBA0sD4ywwVzx5CcTv++3ynDlN89DiMW9zXWPkFF3zOjUUNh+bzDQQJSuG l1XWWcXHUkxUdGzqd+iX8PrS8yHywF83/8SzBMt7sgjepYzay+Y3mmv3pTOYB30G p9yH6rOR4LhpzzFuRrP/ndbC6O/3fJ66mr4w1ViSFDyqGe2UXzpJaNX5E4lszb6p Lp9MbvyN9dnJmAol2Al0TXi2Ckr2KXWyE0etjV01osAHBymrzvY= =TWXS -----END PGP SIGNATURE----- --1qJhLL+wC8buaGlJ--