public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top, Paul Holzinger <pholzing@redhat.com>
Subject: Re: [PATCH v2] treewide: Introduce 'local mode' for disconnected setups
Date: Tue, 26 Nov 2024 15:01:06 +1100	[thread overview]
Message-ID: <Z0VIAqA1LimOwOnl@zatzit> (raw)
In-Reply-To: <20241124235037.4127862-1-sbrivio@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 17011 bytes --]

On Mon, Nov 25, 2024 at 12:50:37AM +0100, Stefano Brivio wrote:
> There are setups where no host interface is available or configured
> at all, intentionally or not, temporarily or not, but users expect
> (Podman) containers to run in any case as they did with slirp4netns,
> and we're now getting reports that we broke such setups at a rather
> alarming rate.

This looks pretty good to me.  It's simpler than I feared, and I don't
think it will tie our hands much in terms of future fancier handling
(netlink monitors, no-template-interface mode etc.).  I'm pretty good
to go ahead with this soon.

I do have a few thoughts on some details which might be polished, see
below.

> To this end, if we don't find any usable host interface, instead of
> exiting:
> 
> - for IPv4, use 169.254.2.1 as guest/container address and 169.254.2.2
>   as default gateway
> 
> - for IPv6, don't assign any address (forcibly disable DHCPv6), and
>   use the *first* link-local address we observe to represent the
>   guest/container. Advertise fe80::1 as default gateway
> 
> - use 'tap0' as default interface name for pasta
> 
> Change ifi4 and ifi6 in struct ctx to int and accept a special -1
> value meaning that no host interface was selected, but the IP family
> is enabled. The fact that the kernel uses unsigned int values for
> those is not an issue as 1. one can't create so many interfaces
> anyway and 2. we otherwise handle those values transparently.

ifi4 and ifi6 are currently kind of overloaded.  A few places that
actually need to deal with the host side plumbing use them for the
interface index.  Most places use these only to tell if IPv4/IPv6 is
enabled (for the guest).

This patch is disconnecting whether we have IPv4/IPv6 enabled from
whether there exists a suitable external interface.  So, I think it
makes sense to disconnect the variables as well, rather than
complexifying the encoding into ifi4/ifi6.

I'd suggest instead that we introduce new boolean ip[46]_enabled
variables, and change all the places that check ifi4/ifi6 only for
that reason to use those instead.  That can be a preiminary patch.
Then we keep the encoding of ifi4/ifi6 as is.

I think that will make things slightly cleaner and not a great deal
harder for this initial version of a "local" setting.  It will also
extend more naturally if we have future modes that can interact with
multiple host side interfaces or other more complex scenarios.

> Fix a botched conditional in conf_print() to actually skip printing
> DHCPv6 information if DHCPv6 is disabled (and skip printing NDP
> information if NDP is disabled).
> 
> Link: https://github.com/containers/podman/issues/24614
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
> v2:
>  - drop fixed link-local address for IPv6
>  - change addresses to be reminiscent of libslirp's default choices
>  - add man page changes and commit message
>  - fix several things around, from testing (checked with several
>    --map-guest-addr and --map-host-loopback combinations etc.)
> 
>  conf.c  | 92 +++++++++++++++++++++++++++++++++++++++++++++------------
>  passt.1 | 33 +++++++++++++++++----
>  passt.h |  8 ++---
>  pasta.c |  7 +++--
>  tap.c   |  3 ++
>  5 files changed, 113 insertions(+), 30 deletions(-)
> 
> diff --git a/conf.c b/conf.c
> index 86566db..551c2b0 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -48,6 +48,20 @@
>  
>  #define NETNS_RUN_DIR	"/run/netns"
>  
> +#define IP4_LL_GUEST_ADDR	(struct in_addr){ htonl_constant(0xa9fe0201) }
> +				/* 169.254.2.1, libslirp default: 10.0.2.1 */
> +
> +#define IP4_LL_GUEST_GW		(struct in_addr){ htonl_constant(0xa9fe0202) }
> +				/* 169.254.2.2, libslirp default: 10.0.2.2 */
> +
> +#define IP4_LL_PREFIX_LEN	16
> +
> +#define IP6_LL_GUEST_GW		(struct in6_addr)			\
> +				{{{ 0xfe, 0x80, 0, 0, 0, 0, 0, 0,	\
> +				       0, 0, 0, 0, 0, 0, 0, 0x01 }}}
> +
> +const char *pasta_default_ifn = "tap0";

Not sure I like this as a default name, though I guess it's fine.  It
kind of exposes the underlying mechanisms we're using in a way that's
not really necessary.  Maybe just call it "pasta" or "pasta0"?

> +
>  /**
>   * next_chunk - Return the next piece of a string delimited by a character
>   * @s:		String to search
> @@ -631,7 +645,7 @@ static unsigned int conf_ip4(unsigned int ifi, struct ip4_ctx *ip4)
>  		ifi = nl_get_ext_if(nl_sock, AF_INET);
>  
>  	if (!ifi) {
> -		info("Couldn't pick external interface: disabling IPv4");
> +		debug("Failed to detect external interface for IPv4");
>  		return 0;
>  	}
>  
> @@ -639,8 +653,8 @@ static unsigned int conf_ip4(unsigned int ifi, struct ip4_ctx *ip4)
>  		int rc = nl_route_get_def(nl_sock, ifi, AF_INET,
>  					  &ip4->guest_gw);
>  		if (rc < 0) {
> -			err("Couldn't discover IPv4 gateway address: %s",
> -			    strerror(-rc));
> +			debug("Couldn't discover IPv4 gateway address: %s",
> +			      strerror(-rc));
>  			return 0;
>  		}
>  	}
> @@ -649,8 +663,8 @@ static unsigned int conf_ip4(unsigned int ifi, struct ip4_ctx *ip4)
>  		int rc = nl_addr_get(nl_sock, ifi, AF_INET,
>  				     &ip4->addr, &ip4->prefix_len, NULL);
>  		if (rc < 0) {
> -			err("Couldn't discover IPv4 address: %s",
> -			    strerror(-rc));
> +			debug("Couldn't discover IPv4 address: %s",
> +			      strerror(-rc));
>  			return 0;
>  		}
>  	}
> @@ -677,6 +691,19 @@ static unsigned int conf_ip4(unsigned int ifi, struct ip4_ctx *ip4)
>  	return ifi;
>  }
>  
> +/**
> + * conf_ip4_local() - Configure IPv4 addresses and attributes for local mode
> + * @ip4:	IPv4 context (will be written)
> + */
> +static void conf_ip4_local(struct ip4_ctx *ip4)
> +{
> +	ip4->addr_seen = ip4->addr = IP4_LL_GUEST_ADDR;
> +	ip4->our_tap_addr = ip4->guest_gw = IP4_LL_GUEST_GW;
> +	ip4->prefix_len = IP4_LL_PREFIX_LEN;
> +
> +	ip4->no_copy_addrs = ip4->no_copy_routes = true;
> +}
> +
>  /**
>   * conf_ip6() - Verify or detect IPv6 support, get relevant addresses
>   * @ifi:	Host interface to attempt (0 to determine one)
> @@ -693,15 +720,15 @@ static unsigned int conf_ip6(unsigned int ifi, struct ip6_ctx *ip6)
>  		ifi = nl_get_ext_if(nl_sock, AF_INET6);
>  
>  	if (!ifi) {
> -		info("Couldn't pick external interface: disabling IPv6");
> +		debug("Failed to detect external interface for IPv6");
>  		return 0;
>  	}
>  
>  	if (IN6_IS_ADDR_UNSPECIFIED(&ip6->guest_gw)) {
>  		rc = nl_route_get_def(nl_sock, ifi, AF_INET6, &ip6->guest_gw);
>  		if (rc < 0) {
> -			err("Couldn't discover IPv6 gateway address: %s",
> -			    strerror(-rc));
> +			debug("Couldn't discover IPv6 gateway address: %s",
> +			      strerror(-rc));
>  			return 0;
>  		}
>  	}
> @@ -710,7 +737,7 @@ static unsigned int conf_ip6(unsigned int ifi, struct ip6_ctx *ip6)
>  			 IN6_IS_ADDR_UNSPECIFIED(&ip6->addr) ? &ip6->addr : NULL,
>  			 &prefix_len, &ip6->our_tap_ll);
>  	if (rc < 0) {
> -		err("Couldn't discover IPv6 address: %s", strerror(-rc));
> +		debug("Couldn't discover IPv6 address: %s", strerror(-rc));
>  		return 0;
>  	}
>  
> @@ -726,6 +753,17 @@ static unsigned int conf_ip6(unsigned int ifi, struct ip6_ctx *ip6)
>  	return ifi;
>  }
>  
> +/**
> + * conf_ip6_local() - Configure IPv6 addresses and attributes for local mode
> + * @ip6:	IPv6 context (will be written)
> + */
> +static void conf_ip6_local(struct ip6_ctx *ip6)
> +{
> +	ip6->our_tap_ll = ip6->guest_gw = IP6_LL_GUEST_GW;
> +
> +	ip6->no_copy_addrs = ip6->no_copy_routes = true;
> +}
> +
>  /**
>   * usage() - Print usage, exit with given status code
>   * @name:	Executable name
> @@ -948,12 +986,14 @@ static void conf_print(const struct ctx *c)
>  	char bufmac[ETH_ADDRSTRLEN], ifn[IFNAMSIZ];
>  	int i;
>  
> -	info("Template interface: %s%s%s%s%s",
> -	     c->ifi4 ? if_indextoname(c->ifi4, ifn) : "",
> -	     c->ifi4 ? " (IPv4)" : "",
> -	     (c->ifi4 && c->ifi6) ? ", " : "",
> -	     c->ifi6 ? if_indextoname(c->ifi6, ifn) : "",
> -	     c->ifi6 ? " (IPv6)" : "");
> +	if (c->ifi4 > 0 || c->ifi6 > 0) {
> +		info("Template interface: %s%s%s%s%s",
> +		     c->ifi4 > 0 ? if_indextoname(c->ifi4, ifn) : "",
> +		     c->ifi4 > 0 ? " (IPv4)" : "",
> +		     (c->ifi4 && c->ifi6) ? ", " : "",
> +		     c->ifi6 > 0 ? if_indextoname(c->ifi6, ifn) : "",
> +		     c->ifi6 > 0 ? " (IPv6)" : "");
> +	}
>  
>  	if (*c->ip4.ifname_out || *c->ip6.ifname_out) {
>  		info("Outbound interface: %s%s%s%s%s",
> @@ -1024,9 +1064,9 @@ static void conf_print(const struct ctx *c)
>  
>  		if (!c->no_ndp && !c->no_dhcpv6)
>  			info("NDP/DHCPv6:");
> -		else if (!c->no_ndp)
> -			info("DHCPv6:");
>  		else if (!c->no_dhcpv6)
> +			info("DHCPv6:");
> +		else if (!c->no_ndp)
>  			info("NDP:");
>  		else
>  			goto dns6;
> @@ -1735,8 +1775,20 @@ void conf(struct ctx *c, int argc, char **argv)
>  		c->ifi6 = conf_ip6(ifi6, &c->ip6);
>  	if ((!c->ifi4 && !c->ifi6) ||
>  	    (*c->ip4.ifname_out && !c->ifi4) ||
> -	    (*c->ip6.ifname_out && !c->ifi6))
> -		die("External interface not usable");

I guess a question is if the host has IPv4 but no IPv6 connectivity,
do we want to disable IPv6 in the guest, or configure local-mode IPv6?
That is something we can probably change without too much danger later
on.

> +	    (*c->ip6.ifname_out && !c->ifi6)) {
> +		info("No external interface as template, switch to local mode");
> +
> +		conf_ip4_local(&c->ip4);
> +		c->ifi4 = -1;
> +
> +		conf_ip6_local(&c->ip6);
> +		c->ifi6 = -1;
> +
> +		if (!*c->pasta_ifn) {
> +			strncpy(c->pasta_ifn, pasta_default_ifn,
> +				sizeof(c->pasta_ifn) - 1);
> +		}
> +	}
>  
>  	if (c->ifi4 && !no_map_gw &&
>  	    IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback))
> @@ -1840,6 +1892,8 @@ void conf(struct ctx *c, int argc, char **argv)
>  	if (!c->ifi6) {
>  		c->no_ndp = 1;
>  		c->no_dhcpv6 = 1;
> +	} else if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr)) {
> +		c->no_dhcpv6 = 1;
>  	}
>  
>  	if (!c->mtu)
> diff --git a/passt.1 b/passt.1
> index f084978..f5cf8a4 100644
> --- a/passt.1
> +++ b/passt.1
> @@ -160,7 +160,9 @@ once for IPv6).
>  By default, assigned IPv4 and IPv6 addresses are taken from the host interfaces
>  with the first default route, if any, for the corresponding IP version. If no
>  default routes are available and there is any interface with any route for a
> -given IP version, the first of these interfaces will be chosen instead.
> +given IP version, the first of these interfaces will be chosen instead. If no
> +such interface exists, the link-local address 169.254.2.1 is assigned for IPv4,
> +and no additional address will be assigned for IPv6.
>  
>  .TP
>  .BR \-n ", " \-\-netmask " " \fImask
> @@ -188,7 +190,9 @@ first default route, if any, for the corresponding IP version. If the default
>  route is a multipath one, the gateway is the first nexthop router returned by
>  the kernel which has the highest weight in the set of paths. If no default
>  routes are available and there is just one interface with any route, that
> -interface will be chosen instead.
> +interface will be chosen instead. If no such interface exists, the link-local
> +address 169.254.2.2 is used for IPv4, and the link-local address fe80::1 is used
> +for IPv6.
>  
>  Note: these addresses are also used as source address for packets directed to
>  the guest or to the target namespace having a loopback or local source address,
> @@ -203,7 +207,9 @@ Default is to use the interfaces specified by \fB--outbound-if4\fR and
>  
>  If no interfaces are given, the interface with the first default routes for each
>  IP version is selected. If no default routes are available and there is just one
> -interface with any route, that interface will be chosen instead.
> +interface with any route, that interface will be chosen instead. If no such
> +interface exists, host interfaces will be ignored for the purposes of assigning
> +addresses and routes, and link-local addresses will be used instead.
>  
>  .TP
>  .BR \-o ", " \-\-outbound " " \fIaddr
> @@ -222,7 +228,8 @@ derive IPv4 addresses and routes.
>  
>  By default, the interface given by the default route is selected. If no default
>  routes are available and there is just one interface with any route, that
> -interface will be chosen instead.
> +interface will be chosen instead. If no such interface exists, outbound sockets
> +will not be bound to any specific interface.
>  
>  .TP
>  .BR \-\-outbound-if6 " " \fIname
> @@ -232,7 +239,8 @@ derive IPv6 addresses and routes.
>  
>  By default, the interface given by the default route is selected. If no default
>  routes are available and there is just one interface with any route, that
> -interface will be chosen instead.
> +interface will be chosen instead. If no such interface exists, outbound sockets
> +will not be bound to any specific interface.
>  
>  .TP
>  .BR \-D ", " \-\-dns " " \fIaddr
> @@ -504,6 +512,7 @@ Default is \fBnone\fR.
>  .BR \-I ", " \-\-ns-ifname " " \fIname
>  Name of tap interface to be created in target namespace.
>  By default, the same interface name as the external, routable interface is used.
> +If no such interface exists, the name \fItap0\fR will be used instead.
>  
>  .TP
>  .BR \-t ", " \-\-tcp-ports " " \fIspec
> @@ -1032,6 +1041,20 @@ If the sending window cannot be queried, it will always be announced as the
>  current sending buffer size to guest or target namespace. This might affect
>  throughput of TCP connections.
>  
> +.SS Local mode for disconnected setups
> +
> +If \fBpasst\fR and \fBpasta\fR fail to find a host interface with a configured
> +address, other than loopback addresses, they will, obviously, not attempt to
> +source addresses or routes from the host.
> +
> +In this case, unless configured otherwise, they will assign the IPv4 link-local
> +address 169.254.2.1 to the guest or target namespace, and no IPv6 address. The
> +notion of the guest or target namespace IPv6 address is derived from the first
> +link-local address observed.
> +
> +Default gateways will be assigned as the link-local address 169.254.2.2 for
> +IPv4, and as the link-local address fe80::1 for IPv6.
> +
>  .SH LIMITATIONS
>  
>  Currently, IGMP/MLD proxying (RFC 4605) and support for SCTP (RFC 4960) are not
> diff --git a/passt.h b/passt.h
> index 72c7f72..799ee50 100644
> --- a/passt.h
> +++ b/passt.h
> @@ -202,10 +202,10 @@ struct ip6_ctx {
>   * @our_tap_mac:	Pasta/passt's MAC on the tap link
>   * @guest_mac:		MAC address of guest or namespace, seen or configured
>   * @hash_secret:	128-bit secret for siphash functions
> - * @ifi4:		Index of template interface for IPv4, 0 if IPv4 disabled
> + * @ifi4:		Template interface for IPv4, -1: none, 0: IPv4 disabled
>   * @ip:			IPv4 configuration
>   * @dns_search:		DNS search list
> - * @ifi6:		Index of template interface for IPv6, 0 if IPv6 disabled
> + * @ifi6:		Template interface for IPv6, -1: none, 0: IPv6 disabled
>   * @ip6:		IPv6 configuration
>   * @pasta_ifn:		Name of namespace interface for pasta
>   * @pasta_ifi:		Index of namespace interface for pasta
> @@ -258,12 +258,12 @@ struct ctx {
>  	unsigned char guest_mac[ETH_ALEN];
>  	uint64_t hash_secret[2];
>  
> -	unsigned int ifi4;
> +	int ifi4;
>  	struct ip4_ctx ip4;
>  
>  	struct fqdn dns_search[MAXDNSRCH];
>  
> -	unsigned int ifi6;
> +	int ifi6;
>  	struct ip6_ctx ip6;
>  
>  	char pasta_ifn[IF_NAMESIZE];
> diff --git a/pasta.c b/pasta.c
> index a117704..96dacc3 100644
> --- a/pasta.c
> +++ b/pasta.c
> @@ -369,8 +369,11 @@ void pasta_ns_conf(struct ctx *c)
>  					  0, IFF_NOARP);
>  
>  			if (c->ip6.no_copy_addrs) {
> -				rc = nl_addr_set(nl_sock_ns, c->pasta_ifi,
> -						 AF_INET6, &c->ip6.addr, 64);
> +				if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr)) {
> +					rc = nl_addr_set(nl_sock_ns,
> +							 c->pasta_ifi, AF_INET6,
> +							 &c->ip6.addr, 64);
> +				}
>  			} else {
>  				rc = nl_addr_dup(nl_sock, c->ifi6,
>  						 nl_sock_ns, c->pasta_ifi,
> diff --git a/tap.c b/tap.c
> index 14d9b3d..5347df4 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -803,6 +803,9 @@ resume:
>  			if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_seen)) {
>  				c->ip6.addr_seen = *saddr;
>  			}
> +
> +			if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr))
> +				c->ip6.addr = *saddr;

Hmm.. not entirely sure if this is a good idea, or whether we should
instead keep the assigned address blank as an indication that we never
assigned something, the guest just picked for itself.

>  		} else if (!IN6_IS_ADDR_UNSPECIFIED(saddr)){
>  			c->ip6.addr_seen = *saddr;
>  		}

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-11-26  4:01 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-24 23:50 [PATCH v2] treewide: Introduce 'local mode' for disconnected setups Stefano Brivio
2024-11-26  4:01 ` David Gibson [this message]
2024-11-26  4:39   ` Stefano Brivio
2024-11-26  5:28     ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z0VIAqA1LimOwOnl@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    --cc=pholzing@redhat.com \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).