public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Implement explicit outbound address and interface selection
@ 2023-03-08  7:34 Stefano Brivio
  2023-03-08  7:34 ` [PATCH 1/2] conf, passt.h: Rename "outbound" interface to "template" interface Stefano Brivio
  2023-03-08  7:34 ` [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface Stefano Brivio
  0 siblings, 2 replies; 8+ messages in thread
From: Stefano Brivio @ 2023-03-08  7:34 UTC (permalink / raw)
  To: passt-dev

Stefano Brivio (2):
  conf, passt.h: Rename "outbound" interface to "template" interface
  conf, icmp, tcp, udp: Add options to bind to outbound address and
    interface

 conf.c  | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++------
 icmp.c  | 24 +++++++++++++---
 passt.1 | 19 ++++++++++++
 passt.h | 14 +++++++--
 tcp.c   | 60 ++++++++++++++++++++++++++++++++++++++
 udp.c   | 54 +++++++++++++++++++++++-----------
 6 files changed, 228 insertions(+), 32 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] conf, passt.h: Rename "outbound" interface to "template" interface
  2023-03-08  7:34 [PATCH 0/2] Implement explicit outbound address and interface selection Stefano Brivio
@ 2023-03-08  7:34 ` Stefano Brivio
  2023-03-08 21:05   ` David Gibson
  2023-03-08  7:34 ` [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface Stefano Brivio
  1 sibling, 1 reply; 8+ messages in thread
From: Stefano Brivio @ 2023-03-08  7:34 UTC (permalink / raw)
  To: passt-dev

In preparation for the next patch, make it clear that the first
routable interface fetched via netlink, or the one configured via
-i/--interface, is simply used as template to copy addresses and
routes, not an interface we actually use to derive the source address
(which will be _bound to_) for outgoing packets.

The man page and usage message appear to be already clear enough.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 conf.c  | 11 +++++++----
 passt.h |  4 ++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/conf.c b/conf.c
index 0e512f4..3aa3314 100644
--- a/conf.c
+++ b/conf.c
@@ -903,10 +903,13 @@ static void conf_print(const struct ctx *c)
 	char buf4[INET_ADDRSTRLEN], ifn[IFNAMSIZ];
 	int i;
 
-	if (c->ifi4)
-		info("Outbound interface (IPv4): %s", if_indextoname(c->ifi4, ifn));
-	if (c->ifi6)
-		info("Outbound interface (IPv6): %s", if_indextoname(c->ifi6, ifn));
+	info("Template interface: %s%s%s%s%s",
+	     c->ifi4 ? if_indextoname(c->ifi4, ifn) : "",
+	     c->ifi4 ? " (IPv4)" : "",
+	     (c->ifi4 && c->ifi6) ? ", " : "",
+	     c->ifi6 ? if_indextoname(c->ifi6, ifn) : "",
+	     c->ifi6 ? " (IPv6)" : "");
+
 	if (c->mode == MODE_PASTA)
 		info("Namespace interface: %s", c->pasta_ifn);
 
diff --git a/passt.h b/passt.h
index e0383eb..cc60c84 100644
--- a/passt.h
+++ b/passt.h
@@ -164,10 +164,10 @@ struct ip6_ctx {
  * @fd_tap:		AF_UNIX socket, tuntap device, or pre-opened socket
  * @mac:		Host MAC address
  * @mac_guest:		MAC address of guest or namespace, seen or configured
- * @ifi4:		Index of routable interface for IPv4, 0 if IPv4 disabled
+ * @ifi4:		Index of template interface for IPv4, 0 if IPv4 disabled
  * @ip:			IPv4 configuration
  * @dns_search:		DNS search list
- * @ifi6:		Index of routable interface for IPv6, 0 if IPv6 disabled
+ * @ifi6:		Index of template interface for IPv6, 0 if IPv6 disabled
  * @ip6:		IPv6 configuration
  * @pasta_ifn:		Name of namespace interface for pasta
  * @pasta_ifn:		Index of namespace interface for pasta
-- 
@@ -164,10 +164,10 @@ struct ip6_ctx {
  * @fd_tap:		AF_UNIX socket, tuntap device, or pre-opened socket
  * @mac:		Host MAC address
  * @mac_guest:		MAC address of guest or namespace, seen or configured
- * @ifi4:		Index of routable interface for IPv4, 0 if IPv4 disabled
+ * @ifi4:		Index of template interface for IPv4, 0 if IPv4 disabled
  * @ip:			IPv4 configuration
  * @dns_search:		DNS search list
- * @ifi6:		Index of routable interface for IPv6, 0 if IPv6 disabled
+ * @ifi6:		Index of template interface for IPv6, 0 if IPv6 disabled
  * @ip6:		IPv6 configuration
  * @pasta_ifn:		Name of namespace interface for pasta
  * @pasta_ifn:		Index of namespace interface for pasta
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface
  2023-03-08  7:34 [PATCH 0/2] Implement explicit outbound address and interface selection Stefano Brivio
  2023-03-08  7:34 ` [PATCH 1/2] conf, passt.h: Rename "outbound" interface to "template" interface Stefano Brivio
@ 2023-03-08  7:34 ` Stefano Brivio
  2023-03-08 22:02   ` David Gibson
  1 sibling, 1 reply; 8+ messages in thread
From: Stefano Brivio @ 2023-03-08  7:34 UTC (permalink / raw)
  To: passt-dev

I didn't notice earlier: libslirp (and slirp4netns) supports binding
outbound sockets to specific IPv4 and IPv6 addresses, to force the
source addresse selection. If we want to claim feature parity, we
should implement that as well.

Further, Podman supports specifying outbound interfaces as well, but
this is simply done by resolving the primary address for an interface
when the network back-end is started. However, since kernel version
5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for
non-root users"), we can actually bind to a specific interface name,
which doesn't need to be validated in advance.

Implement -o / --outbound ADDR to bind to IPv4 and IPv6 addresses,
and --outbound-ip4 and --outbound-ip6 to bind IPv4 and IPv6 sockets
to given interfaces.

For ICMP and UDP, we call sock_l4() to open outbound sockets, as we
already needed to bind to given ports or echo identifiers, and we
can bind() a socket only once: there, pass address (if any) and
interface (if any) for the existing bind() and setsockopt() calls.

For TCP, in general, we wouldn't otherwise bind sockets. Add a
specific helper to do that.

For UDP outbound sockets, we need to know if the final destination
of the socket is a loopback address, before we decide whether it
makes sense to bind the socket at all: move the block mangling the
address destination before the creation of the socket in the IPv4
path. This was already the case for the IPv6 path.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 conf.c  | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
 icmp.c  | 24 +++++++++++++++---
 passt.1 | 19 ++++++++++++++
 passt.h | 10 ++++++++
 tcp.c   | 60 ++++++++++++++++++++++++++++++++++++++++++++
 udp.c   | 54 ++++++++++++++++++++++++++-------------
 6 files changed, 219 insertions(+), 26 deletions(-)

diff --git a/conf.c b/conf.c
index 3aa3314..15506ec 100644
--- a/conf.c
+++ b/conf.c
@@ -776,6 +776,13 @@ static void usage(const char *name)
 	info(   "    default: gateway from interface with default route");
 	info(   "  -i, --interface NAME	Interface for addresses and routes");
 	info(   "    default: interface with first default route");
+	info(   "  -o, --outbound ADDR	Bind to address as outbound source");
+	info(   "    can be specified zero to two times (for IPv4 and IPv6)");
+	info(   "    default: use source address from routing tables");
+	info(   "  --outbound-if4 NAME	Bind to outbound interface for IPv4");
+	info(   "    default: use interface from default route");
+	info(   "  --outbound-if6 NAME	Bind to outbound interface for IPv6");
+	info(   "    default: use interface from default route");
 	info(   "  -D, --dns ADDR	Use IPv4 or IPv6 address as DNS");
 	info(   "    can be specified multiple times");
 	info(   "    a single, empty option disables DNS information");
@@ -900,7 +907,7 @@ pasta_opts:
  */
 static void conf_print(const struct ctx *c)
 {
-	char buf4[INET_ADDRSTRLEN], ifn[IFNAMSIZ];
+	char buf4[INET_ADDRSTRLEN], buf6[INET6_ADDRSTRLEN], ifn[IFNAMSIZ];
 	int i;
 
 	info("Template interface: %s%s%s%s%s",
@@ -910,6 +917,26 @@ static void conf_print(const struct ctx *c)
 	     c->ifi6 ? if_indextoname(c->ifi6, ifn) : "",
 	     c->ifi6 ? " (IPv6)" : "");
 
+	if (*c->ip4.ifname_out || *c->ip6.ifname_out) {
+		info("Outbound interface: %s%s%s%s%s",
+		     *c->ip4.ifname_out ? c->ip4.ifname_out : "",
+		     *c->ip4.ifname_out ? " (IPv4)" : "",
+		     (*c->ip4.ifname_out && *c->ip6.ifname_out) ? ", " : "",
+		     *c->ip6.ifname_out ? c->ip6.ifname_out : "",
+		     *c->ip6.ifname_out ? " (IPv6)" : "");
+	}
+
+	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ||
+	    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) {
+		info("Outbound address: %s%s%s",
+		     IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ? "" :
+		     inet_ntop(AF_INET, &c->ip4.addr_out, buf4, sizeof(buf4)),
+		     (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
+		      !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) ? ", " : "",
+		     IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) ? "" :
+		     inet_ntop(AF_INET6, &c->ip6.addr_out, buf6, sizeof(buf6)));
+	}
+
 	if (c->mode == MODE_PASTA)
 		info("Namespace interface: %s", c->pasta_ifn);
 
@@ -948,8 +975,6 @@ static void conf_print(const struct ctx *c)
 	}
 
 	if (c->ifi6) {
-		char buf6[INET6_ADDRSTRLEN];
-
 		if (!c->no_ndp && !c->no_dhcpv6)
 			info("NDP/DHCPv6:");
 		else if (!c->no_ndp)
@@ -1125,6 +1150,7 @@ void conf(struct ctx *c, int argc, char **argv)
 		{"mac-addr",	required_argument,	NULL,		'M' },
 		{"gateway",	required_argument,	NULL,		'g' },
 		{"interface",	required_argument,	NULL,		'i' },
+		{"outbound",	required_argument,	NULL,		'o' },
 		{"dns",		required_argument,	NULL,		'D' },
 		{"search",	required_argument,	NULL,		'S' },
 		{"no-tcp",	no_argument,		&c->no_tcp,	1 },
@@ -1157,6 +1183,8 @@ void conf(struct ctx *c, int argc, char **argv)
 		{"runas",	required_argument,	NULL,		12 },
 		{"log-size",	required_argument,	NULL,		13 },
 		{"version",	no_argument,		NULL,		14 },
+		{"outbound-if4", required_argument,	NULL,		15 },
+		{"outbound-if6", required_argument,	NULL,		16 },
 		{ 0 },
 	};
 	struct get_bound_ports_ns_arg ns_ports_arg = { .c = c };
@@ -1175,9 +1203,9 @@ void conf(struct ctx *c, int argc, char **argv)
 
 	if (c->mode == MODE_PASTA) {
 		c->no_dhcp_dns = c->no_dhcp_dns_search = 1;
-		optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:";
+		optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:o:D:S:46t:u:T:U:";
 	} else {
-		optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:D:S:461t:u:";
+		optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:o:D:S:461t:u:";
 	}
 
 	c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = 0;
@@ -1295,6 +1323,26 @@ void conf(struct ctx *c, int argc, char **argv)
 				c->mode == MODE_PASST ? "passt " : "pasta ");
 			fprintf(stdout, VERSION_BLOB);
 			exit(EXIT_SUCCESS);
+		case 15:
+			if (*c->ip4.ifname_out)
+				die("Redundant outbound interface: %s", optarg);
+
+			ret = snprintf(c->ip4.ifname_out,
+				       sizeof(c->ip4.ifname_out), "%s", optarg);
+			if (ret <= 0 || ret >= (int)sizeof(c->ip4.ifname_out))
+				die("Invalid interface name: %s", optarg);
+
+			break;
+		case 16:
+			if (*c->ip6.ifname_out)
+				die("Redundant outbound interface: %s", optarg);
+
+			ret = snprintf(c->ip6.ifname_out,
+				       sizeof(c->ip6.ifname_out), "%s", optarg);
+			if (ret <= 0 || ret >= (int)sizeof(c->ip6.ifname_out))
+				die("Invalid interface name: %s", optarg);
+
+			break;
 		case 'd':
 			if (c->debug)
 				die("Multiple --debug options given");
@@ -1466,6 +1514,26 @@ void conf(struct ctx *c, int argc, char **argv)
 				die("Invalid interface name %s: %s", optarg,
 				    strerror(errno));
 			break;
+		case 'o':
+			if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)	  &&
+			    inet_pton(AF_INET6, optarg, &c->ip6.addr_out) &&
+			    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)	  &&
+			    !IN6_IS_ADDR_LOOPBACK(&c->ip6.addr_out)	  &&
+			    !IN6_IS_ADDR_V4MAPPED(&c->ip6.addr_out)	  &&
+			    !IN6_IS_ADDR_V4COMPAT(&c->ip6.addr_out)	  &&
+			    !IN6_IS_ADDR_MULTICAST(&c->ip6.addr_out))
+				break;
+
+			if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)	 &&
+			    inet_pton(AF_INET, optarg, &c->ip4.addr_out) &&
+			    !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)	 &&
+			    !IN4_IS_ADDR_BROADCAST(&c->ip4.addr_out)	 &&
+			    !IN4_IS_ADDR_MULTICAST(&c->ip4.addr_out))
+				break;
+
+			die("Invalid or redundant outbound address: %s",
+			    optarg);
+			break;
 		case 'D':
 			if (!strcmp(optarg, "none")) {
 				if (c->no_dns)
diff --git a/icmp.c b/icmp.c
index b842fa8..ddf83f8 100644
--- a/icmp.c
+++ b/icmp.c
@@ -170,8 +170,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr,
 		iref.icmp.id = id = ntohs(ih->un.echo.id);
 
 		if ((s = icmp_id_map[V4][id].sock) <= 0) {
-			s = sock_l4(c, AF_INET, IPPROTO_ICMP, NULL, NULL, id,
-				    iref.u32);
+			const struct in_addr *bind_addr = NULL;
+			const char *bind_if;
+
+			bind_if = *c->ip4.ifname_out ? c->ip4.ifname_out : NULL;
+
+			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out))
+				bind_addr = &c->ip4.addr_out;
+
+			s = sock_l4(c, AF_INET, IPPROTO_ICMP, bind_addr,
+				    bind_if, id, iref.u32);
 			if (s < 0)
 				goto fail_sock;
 			if (s > SOCKET_MAX) {
@@ -216,8 +224,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr,
 
 		iref.icmp.id = id = ntohs(ih->icmp6_identifier);
 		if ((s = icmp_id_map[V6][id].sock) <= 0) {
-			s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, NULL, NULL, id,
-				    iref.u32);
+			const struct in6_addr *bind_addr = NULL;
+			const char *bind_if;
+
+			bind_if = *c->ip6.ifname_out ? c->ip6.ifname_out : NULL;
+
+			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out))
+				bind_addr = &c->ip6.addr_out;
+
+			s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, bind_addr,
+				    bind_if, id, iref.u32);
 			if (s < 0)
 				goto fail_sock;
 			if (s > SOCKET_MAX) {
diff --git a/passt.1 b/passt.1
index f317c33..5c97f30 100644
--- a/passt.1
+++ b/passt.1
@@ -183,6 +183,25 @@ Use host interface \fIname\fR to derive addresses and routes.
 Default is to use the interfaces with the first default routes for each IP
 version.
 
+.TP
+.BR \-o ", " \-\-outbound " " \fIaddr
+Use an IPv4 \fIaddr\fR as source address for IPv4 outbound TCP connections, UDP
+flows, ICMP requests, or an IPv6 \fIaddr\fR for IPv6 ones, by binding outbound
+sockets to it.
+This option can be specified zero (for defaults) to two times (once for IPv4,
+once for IPv6).
+By default, the source address is selected by the routing tables.
+
+.TP
+.BR \-\-outbound-if4 " " \fIname
+Bind IPv4 outbound sockets to host interface \fIname\fR.
+By default, the interface given by the default route is selected.
+
+.TP
+.BR \-\-outbound-if6 " " \fIname
+Bind IPv6 outbound sockets to host interface \fIname\fR.
+By default, the interface given by the default route is selected.
+
 .TP
 .BR \-D ", " \-\-dns " " \fIaddr
 Use \fIaddr\fR (IPv4 or IPv6) for DHCP, DHCPv6, NDP or DNS forwarding, as
diff --git a/passt.h b/passt.h
index cc60c84..b73f4ff 100644
--- a/passt.h
+++ b/passt.h
@@ -106,6 +106,8 @@ enum passt_modes {
  * @dns:		DNS addresses for DHCP, zero-terminated, network order
  * @dns_match:		Forward DNS query if sent to this address, network order
  * @dns_host:		Use this DNS on the host for forwarding, network order
+ * @addr_out:		Optional source address for outbound traffic
+ * @ifname_out:		Optional interface name to bind outbound sockets to
  */
 struct ip4_ctx {
 	struct in_addr addr;
@@ -115,6 +117,9 @@ struct ip4_ctx {
 	struct in_addr dns[MAXNS + 1];
 	struct in_addr dns_match;
 	struct in_addr dns_host;
+
+	struct in_addr addr_out;
+	char ifname_out[IFNAMSIZ];
 };
 
 /**
@@ -127,6 +132,8 @@ struct ip4_ctx {
  * @dns:		DNS addresses for DHCPv6 and NDP, zero-terminated
  * @dns_match:		Forward DNS query if sent to this address
  * @dns_host:		Use this DNS on the host for forwarding
+ * @addr_out:		Optional source address for outbound traffic
+ * @ifname_out:		Optional interface name to bind outbound sockets to
  */
 struct ip6_ctx {
 	struct in6_addr addr;
@@ -137,6 +144,9 @@ struct ip6_ctx {
 	struct in6_addr dns[MAXNS + 1];
 	struct in6_addr dns_match;
 	struct in6_addr dns_host;
+
+	struct in6_addr addr_out;
+	char ifname_out[IFNAMSIZ];
 };
 
 #include <netinet/if_ether.h>
diff --git a/tcp.c b/tcp.c
index b674311..8e8d653 100644
--- a/tcp.c
+++ b/tcp.c
@@ -1946,6 +1946,61 @@ static uint16_t tcp_conn_tap_mss(const struct ctx *c,
 	return MIN(mss, USHRT_MAX);
 }
 
+/**
+ * tcp_bind_outbound() - Bind socket to outbound address and interface if given
+ * @c:		Execution context
+ * @s:		Outbound TCP socket
+ * @af:		Address family
+ */
+static void tcp_bind_outbound(const struct ctx *c, int s, sa_family_t af)
+{
+	if (af == AF_INET) {
+		if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)) {
+			struct sockaddr_in addr4 = {
+				.sin_family = AF_INET,
+				.sin_port = 0,
+				.sin_addr = c->ip4.addr_out,
+			};
+
+			if (bind(s, (struct sockaddr *)&addr4, sizeof(addr4))) {
+				debug("Can't bind IPv4 TCP socket address: %s",
+				      strerror(errno));
+			}
+		}
+
+		if (*c->ip4.ifname_out) {
+			if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE,
+				       c->ip4.ifname_out,
+				       strlen(c->ip4.ifname_out))) {
+				debug("Can't bind IPv4 TCP socket to interface:"
+				      " %s", strerror(errno));
+			}
+		}
+	} else if (af == AF_INET6) {
+		if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) {
+			struct sockaddr_in6 addr6 = {
+				.sin6_family = AF_INET6,
+				.sin6_port = 0,
+				.sin6_addr = c->ip6.addr_out,
+			};
+
+			if (bind(s, (struct sockaddr *)&addr6, sizeof(addr6))) {
+				debug("Can't bind IPv6 TCP socket address: %s",
+				      strerror(errno));
+			}
+		}
+
+		if (*c->ip6.ifname_out) {
+			if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE,
+				       c->ip6.ifname_out,
+				       strlen(c->ip6.ifname_out))) {
+				debug("Can't bind IPv6 TCP socket to interface:"
+				      " %s", strerror(errno));
+			}
+		}
+	}
+}
+
 /**
  * tcp_conn_from_tap() - Handle connection request (SYN segment) from tap
  * @c:		Execution context
@@ -2052,6 +2107,11 @@ static void tcp_conn_from_tap(struct ctx *c, int af, const void *addr,
 	if (errno != EADDRNOTAVAIL && errno != EACCES)
 		conn_flag(c, conn, LOCAL);
 
+	if ((af == AF_INET &&  !IN4_IS_ADDR_LOOPBACK(&addr4.sin_addr)) ||
+	    (af == AF_INET6 && !IN6_IS_ADDR_LOOPBACK(&addr6.sin6_addr) &&
+			       !IN6_IS_ADDR_LINKLOCAL(&addr6.sin6_addr)))
+		tcp_bind_outbound(c, s, af);
+
 	if (connect(s, sa, sl)) {
 		if (errno != EINPROGRESS) {
 			tcp_rst(c, conn);
diff --git a/udp.c b/udp.c
index b7cbfdc..99cfc9f 100644
--- a/udp.c
+++ b/udp.c
@@ -843,20 +843,6 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
 		sa = (struct sockaddr *)&s_in;
 		sl = sizeof(s_in);
 
-		if (!(s = udp_tap_map[V4][src].sock)) {
-			union udp_epoll_ref uref = { .udp.port = src };
-
-			s = sock_l4(c, AF_INET, IPPROTO_UDP, NULL, NULL, src,
-				    uref.u32);
-			if (s < 0)
-				return p->count;
-
-			udp_tap_map[V4][src].sock = s;
-			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
-		}
-
-		udp_tap_map[V4][src].ts = now->tv_sec;
-
 		if (IN4_ARE_ADDR_EQUAL(&s_in.sin_addr, &c->ip4.dns_match) &&
 		    ntohs(s_in.sin_port) == 53) {
 			s_in.sin_addr = c->ip4.dns_host;
@@ -868,13 +854,37 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
 			else
 				s_in.sin_addr = c->ip4.addr_seen;
 		}
+
+		if (!(s = udp_tap_map[V4][src].sock)) {
+			union udp_epoll_ref uref = { .udp.port = src };
+			in_addr_t bind_addr = { 0 };
+			const char *bind_if = NULL;
+
+			if (!IN6_IS_ADDR_LOOPBACK(&s_in.sin_addr) &&
+			    *c->ip6.ifname_out)
+				bind_if = c->ip6.ifname_out;
+
+			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
+			    !IN4_IS_ADDR_LOOPBACK(&s_in.sin_addr))
+				bind_addr = c->ip4.addr_out.s_addr;
+
+			s = sock_l4(c, AF_INET, IPPROTO_UDP, &bind_addr,
+				    bind_if, src, uref.u32);
+			if (s < 0)
+				return p->count;
+
+			udp_tap_map[V4][src].sock = s;
+			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
+		}
+
+		udp_tap_map[V4][src].ts = now->tv_sec;
 	} else {
 		s_in6 = (struct sockaddr_in6) {
 			.sin6_family = AF_INET6,
 			.sin6_port = uh->dest,
 			.sin6_addr = *(struct in6_addr *)addr,
 		};
-		const void *bind_addr = &in6addr_any;
+		const struct in6_addr *bind_addr = &in6addr_any;
 
 		sa = (struct sockaddr *)&s_in6;
 		sl = sizeof(s_in6);
@@ -898,9 +908,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
 		if (!(s = udp_tap_map[V6][src].sock)) {
 			union udp_epoll_ref uref = { .udp.v6 = 1,
 						     .udp.port = src };
+			const char *bind_if = NULL;
+
+			if (!IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
+			    *c->ip6.ifname_out)
+				bind_if = c->ip6.ifname_out;
+
+			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) &&
+			    !IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
+			    !IN6_IS_ADDR_LINKLOCAL(&s_in6.sin6_addr))
+				bind_addr = &c->ip6.addr_out;
 
-			s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, NULL,
-				    src, uref.u32);
+			s = sock_l4(c, AF_INET6, IPPROTO_UDP, &bind_addr,
+				    bind_if, src, uref.u32);
 			if (s < 0)
 				return p->count;
 
-- 
@@ -843,20 +843,6 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
 		sa = (struct sockaddr *)&s_in;
 		sl = sizeof(s_in);
 
-		if (!(s = udp_tap_map[V4][src].sock)) {
-			union udp_epoll_ref uref = { .udp.port = src };
-
-			s = sock_l4(c, AF_INET, IPPROTO_UDP, NULL, NULL, src,
-				    uref.u32);
-			if (s < 0)
-				return p->count;
-
-			udp_tap_map[V4][src].sock = s;
-			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
-		}
-
-		udp_tap_map[V4][src].ts = now->tv_sec;
-
 		if (IN4_ARE_ADDR_EQUAL(&s_in.sin_addr, &c->ip4.dns_match) &&
 		    ntohs(s_in.sin_port) == 53) {
 			s_in.sin_addr = c->ip4.dns_host;
@@ -868,13 +854,37 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
 			else
 				s_in.sin_addr = c->ip4.addr_seen;
 		}
+
+		if (!(s = udp_tap_map[V4][src].sock)) {
+			union udp_epoll_ref uref = { .udp.port = src };
+			in_addr_t bind_addr = { 0 };
+			const char *bind_if = NULL;
+
+			if (!IN6_IS_ADDR_LOOPBACK(&s_in.sin_addr) &&
+			    *c->ip6.ifname_out)
+				bind_if = c->ip6.ifname_out;
+
+			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
+			    !IN4_IS_ADDR_LOOPBACK(&s_in.sin_addr))
+				bind_addr = c->ip4.addr_out.s_addr;
+
+			s = sock_l4(c, AF_INET, IPPROTO_UDP, &bind_addr,
+				    bind_if, src, uref.u32);
+			if (s < 0)
+				return p->count;
+
+			udp_tap_map[V4][src].sock = s;
+			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
+		}
+
+		udp_tap_map[V4][src].ts = now->tv_sec;
 	} else {
 		s_in6 = (struct sockaddr_in6) {
 			.sin6_family = AF_INET6,
 			.sin6_port = uh->dest,
 			.sin6_addr = *(struct in6_addr *)addr,
 		};
-		const void *bind_addr = &in6addr_any;
+		const struct in6_addr *bind_addr = &in6addr_any;
 
 		sa = (struct sockaddr *)&s_in6;
 		sl = sizeof(s_in6);
@@ -898,9 +908,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
 		if (!(s = udp_tap_map[V6][src].sock)) {
 			union udp_epoll_ref uref = { .udp.v6 = 1,
 						     .udp.port = src };
+			const char *bind_if = NULL;
+
+			if (!IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
+			    *c->ip6.ifname_out)
+				bind_if = c->ip6.ifname_out;
+
+			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) &&
+			    !IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
+			    !IN6_IS_ADDR_LINKLOCAL(&s_in6.sin6_addr))
+				bind_addr = &c->ip6.addr_out;
 
-			s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, NULL,
-				    src, uref.u32);
+			s = sock_l4(c, AF_INET6, IPPROTO_UDP, &bind_addr,
+				    bind_if, src, uref.u32);
 			if (s < 0)
 				return p->count;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] conf, passt.h: Rename "outbound" interface to "template" interface
  2023-03-08  7:34 ` [PATCH 1/2] conf, passt.h: Rename "outbound" interface to "template" interface Stefano Brivio
@ 2023-03-08 21:05   ` David Gibson
  0 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2023-03-08 21:05 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 2505 bytes --]

On Wed, Mar 08, 2023 at 08:34:48AM +0100, Stefano Brivio wrote:
> In preparation for the next patch, make it clear that the first
> routable interface fetched via netlink, or the one configured via
> -i/--interface, is simply used as template to copy addresses and
> routes, not an interface we actually use to derive the source address
> (which will be _bound to_) for outgoing packets.
> 
> The man page and usage message appear to be already clear enough.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  conf.c  | 11 +++++++----
>  passt.h |  4 ++--
>  2 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/conf.c b/conf.c
> index 0e512f4..3aa3314 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -903,10 +903,13 @@ static void conf_print(const struct ctx *c)
>  	char buf4[INET_ADDRSTRLEN], ifn[IFNAMSIZ];
>  	int i;
>  
> -	if (c->ifi4)
> -		info("Outbound interface (IPv4): %s", if_indextoname(c->ifi4, ifn));
> -	if (c->ifi6)
> -		info("Outbound interface (IPv6): %s", if_indextoname(c->ifi6, ifn));
> +	info("Template interface: %s%s%s%s%s",
> +	     c->ifi4 ? if_indextoname(c->ifi4, ifn) : "",
> +	     c->ifi4 ? " (IPv4)" : "",
> +	     (c->ifi4 && c->ifi6) ? ", " : "",
> +	     c->ifi6 ? if_indextoname(c->ifi6, ifn) : "",
> +	     c->ifi6 ? " (IPv6)" : "");
> +
>  	if (c->mode == MODE_PASTA)
>  		info("Namespace interface: %s", c->pasta_ifn);
>  
> diff --git a/passt.h b/passt.h
> index e0383eb..cc60c84 100644
> --- a/passt.h
> +++ b/passt.h
> @@ -164,10 +164,10 @@ struct ip6_ctx {
>   * @fd_tap:		AF_UNIX socket, tuntap device, or pre-opened socket
>   * @mac:		Host MAC address
>   * @mac_guest:		MAC address of guest or namespace, seen or configured
> - * @ifi4:		Index of routable interface for IPv4, 0 if IPv4 disabled
> + * @ifi4:		Index of template interface for IPv4, 0 if IPv4 disabled
>   * @ip:			IPv4 configuration
>   * @dns_search:		DNS search list
> - * @ifi6:		Index of routable interface for IPv6, 0 if IPv6 disabled
> + * @ifi6:		Index of template interface for IPv6, 0 if IPv6 disabled
>   * @ip6:		IPv6 configuration
>   * @pasta_ifn:		Name of namespace interface for pasta
>   * @pasta_ifn:		Index of namespace interface for pasta

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface
  2023-03-08  7:34 ` [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface Stefano Brivio
@ 2023-03-08 22:02   ` David Gibson
  2023-03-08 23:33     ` Stefano Brivio
  0 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2023-03-08 22:02 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 19875 bytes --]

On Wed, Mar 08, 2023 at 08:34:49AM +0100, Stefano Brivio wrote:
> I didn't notice earlier: libslirp (and slirp4netns) supports binding
> outbound sockets to specific IPv4 and IPv6 addresses, to force the
> source addresse selection. If we want to claim feature parity, we
> should implement that as well.
> 
> Further, Podman supports specifying outbound interfaces as well, but
> this is simply done by resolving the primary address for an interface
> when the network back-end is started. However, since kernel version
> 5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for
> non-root users"), we can actually bind to a specific interface name,
> which doesn't need to be validated in advance.
> 
> Implement -o / --outbound ADDR to bind to IPv4 and IPv6 addresses,
> and --outbound-ip4 and --outbound-ip6 to bind IPv4 and IPv6 sockets
> to given interfaces.

You have 'outbound-ip' here but 'outbound-if' in the code, I think you
intended the latter.

> For ICMP and UDP, we call sock_l4() to open outbound sockets, as we
> already needed to bind to given ports or echo identifiers, and we
> can bind() a socket only once: there, pass address (if any) and
> interface (if any) for the existing bind() and setsockopt() calls.
> 
> For TCP, in general, we wouldn't otherwise bind sockets. Add a
> specific helper to do that.
> 
> For UDP outbound sockets, we need to know if the final destination
> of the socket is a loopback address, before we decide whether it
> makes sense to bind the socket at all: move the block mangling the
> address destination before the creation of the socket in the IPv4
> path. This was already the case for the IPv6 path.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  conf.c  | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
>  icmp.c  | 24 +++++++++++++++---
>  passt.1 | 19 ++++++++++++++
>  passt.h | 10 ++++++++
>  tcp.c   | 60 ++++++++++++++++++++++++++++++++++++++++++++
>  udp.c   | 54 ++++++++++++++++++++++++++-------------
>  6 files changed, 219 insertions(+), 26 deletions(-)
> 
> diff --git a/conf.c b/conf.c
> index 3aa3314..15506ec 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -776,6 +776,13 @@ static void usage(const char *name)
>  	info(   "    default: gateway from interface with default route");
>  	info(   "  -i, --interface NAME	Interface for addresses and routes");
>  	info(   "    default: interface with first default route");

So, I think the outbound IP and the IP we advertise to the guest
should be the same.  Certainly by default, and I'm not sure I can even
think of any case where it would be useful for them to be different.
That fits with the "only NAT when we really have to" goal.

So, could we just add say, --if4 and --if6 instead of having explicit
--outbound-* options?  -i X would be equivalent to "--if4 X --if6 X".

> +	info(   "  -o, --outbound ADDR	Bind to address as outbound source");
> +	info(   "    can be specified zero to two times (for IPv4 and IPv6)");
> +	info(   "    default: use source address from routing tables");

On the same reasoning, can we merge this option with '-a' setting both
outbound and assigned address by default?

Even if we want to allow setting these differently for some strange
cases, I think the more obvious primary "address" and "interface"
options should set the outbound address/interface.  The DHCP
advertised address should match that unless we use some more obscure
option to override it.

In fact... thinking further on that, I'm not convinced there's any
real use for a "template" interface.  Normally it should match the
outbound interface - and if you want it different, I don't think it
really buys anything over explicitly overriding the DHCP advertised
address etc.

> +	info(   "  --outbound-if4 NAME	Bind to outbound interface for IPv4");
> +	info(   "    default: use interface from default route");
> +	info(   "  --outbound-if6 NAME	Bind to outbound interface for IPv6");
> +	info(   "    default: use interface from default route");
>  	info(   "  -D, --dns ADDR	Use IPv4 or IPv6 address as DNS");
>  	info(   "    can be specified multiple times");
>  	info(   "    a single, empty option disables DNS information");
> @@ -900,7 +907,7 @@ pasta_opts:
>   */
>  static void conf_print(const struct ctx *c)
>  {
> -	char buf4[INET_ADDRSTRLEN], ifn[IFNAMSIZ];
> +	char buf4[INET_ADDRSTRLEN], buf6[INET6_ADDRSTRLEN], ifn[IFNAMSIZ];
>  	int i;
>  
>  	info("Template interface: %s%s%s%s%s",
> @@ -910,6 +917,26 @@ static void conf_print(const struct ctx *c)
>  	     c->ifi6 ? if_indextoname(c->ifi6, ifn) : "",
>  	     c->ifi6 ? " (IPv6)" : "");
>  
> +	if (*c->ip4.ifname_out || *c->ip6.ifname_out) {
> +		info("Outbound interface: %s%s%s%s%s",
> +		     *c->ip4.ifname_out ? c->ip4.ifname_out : "",
> +		     *c->ip4.ifname_out ? " (IPv4)" : "",
> +		     (*c->ip4.ifname_out && *c->ip6.ifname_out) ? ", " : "",
> +		     *c->ip6.ifname_out ? c->ip6.ifname_out : "",
> +		     *c->ip6.ifname_out ? " (IPv6)" : "");
> +	}
> +
> +	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ||
> +	    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) {
> +		info("Outbound address: %s%s%s",
> +		     IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ? "" :
> +		     inet_ntop(AF_INET, &c->ip4.addr_out, buf4, sizeof(buf4)),
> +		     (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
> +		      !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) ? ", " : "",
> +		     IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) ? "" :
> +		     inet_ntop(AF_INET6, &c->ip6.addr_out, buf6, sizeof(buf6)));
> +	}
> +
>  	if (c->mode == MODE_PASTA)
>  		info("Namespace interface: %s", c->pasta_ifn);
>  
> @@ -948,8 +975,6 @@ static void conf_print(const struct ctx *c)
>  	}
>  
>  	if (c->ifi6) {
> -		char buf6[INET6_ADDRSTRLEN];
> -
>  		if (!c->no_ndp && !c->no_dhcpv6)
>  			info("NDP/DHCPv6:");
>  		else if (!c->no_ndp)
> @@ -1125,6 +1150,7 @@ void conf(struct ctx *c, int argc, char **argv)
>  		{"mac-addr",	required_argument,	NULL,		'M' },
>  		{"gateway",	required_argument,	NULL,		'g' },
>  		{"interface",	required_argument,	NULL,		'i' },
> +		{"outbound",	required_argument,	NULL,		'o' },
>  		{"dns",		required_argument,	NULL,		'D' },
>  		{"search",	required_argument,	NULL,		'S' },
>  		{"no-tcp",	no_argument,		&c->no_tcp,	1 },
> @@ -1157,6 +1183,8 @@ void conf(struct ctx *c, int argc, char **argv)
>  		{"runas",	required_argument,	NULL,		12 },
>  		{"log-size",	required_argument,	NULL,		13 },
>  		{"version",	no_argument,		NULL,		14 },
> +		{"outbound-if4", required_argument,	NULL,		15 },
> +		{"outbound-if6", required_argument,	NULL,		16 },
>  		{ 0 },
>  	};
>  	struct get_bound_ports_ns_arg ns_ports_arg = { .c = c };
> @@ -1175,9 +1203,9 @@ void conf(struct ctx *c, int argc, char **argv)
>  
>  	if (c->mode == MODE_PASTA) {
>  		c->no_dhcp_dns = c->no_dhcp_dns_search = 1;
> -		optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:";
> +		optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:o:D:S:46t:u:T:U:";
>  	} else {
> -		optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:D:S:461t:u:";
> +		optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:o:D:S:461t:u:";
>  	}
>  
>  	c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = 0;
> @@ -1295,6 +1323,26 @@ void conf(struct ctx *c, int argc, char **argv)
>  				c->mode == MODE_PASST ? "passt " : "pasta ");
>  			fprintf(stdout, VERSION_BLOB);
>  			exit(EXIT_SUCCESS);
> +		case 15:
> +			if (*c->ip4.ifname_out)
> +				die("Redundant outbound interface: %s", optarg);
> +
> +			ret = snprintf(c->ip4.ifname_out,
> +				       sizeof(c->ip4.ifname_out), "%s", optarg);
> +			if (ret <= 0 || ret >= (int)sizeof(c->ip4.ifname_out))
> +				die("Invalid interface name: %s", optarg);
> +
> +			break;
> +		case 16:
> +			if (*c->ip6.ifname_out)
> +				die("Redundant outbound interface: %s", optarg);
> +
> +			ret = snprintf(c->ip6.ifname_out,
> +				       sizeof(c->ip6.ifname_out), "%s", optarg);
> +			if (ret <= 0 || ret >= (int)sizeof(c->ip6.ifname_out))
> +				die("Invalid interface name: %s", optarg);
> +
> +			break;
>  		case 'd':
>  			if (c->debug)
>  				die("Multiple --debug options given");
> @@ -1466,6 +1514,26 @@ void conf(struct ctx *c, int argc, char **argv)
>  				die("Invalid interface name %s: %s", optarg,
>  				    strerror(errno));
>  			break;
> +		case 'o':
> +			if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)	  &&
> +			    inet_pton(AF_INET6, optarg, &c->ip6.addr_out) &&
> +			    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)	  &&
> +			    !IN6_IS_ADDR_LOOPBACK(&c->ip6.addr_out)	  &&
> +			    !IN6_IS_ADDR_V4MAPPED(&c->ip6.addr_out)	  &&
> +			    !IN6_IS_ADDR_V4COMPAT(&c->ip6.addr_out)	  &&
> +			    !IN6_IS_ADDR_MULTICAST(&c->ip6.addr_out))
> +				break;
> +
> +			if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)	 &&
> +			    inet_pton(AF_INET, optarg, &c->ip4.addr_out) &&
> +			    !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)	 &&
> +			    !IN4_IS_ADDR_BROADCAST(&c->ip4.addr_out)	 &&
> +			    !IN4_IS_ADDR_MULTICAST(&c->ip4.addr_out))
> +				break;
> +
> +			die("Invalid or redundant outbound address: %s",
> +			    optarg);
> +			break;

I haven't really looked at any of the conf stuff above, since it will
change if we alter the option semantics as I've suggested.

>  		case 'D':
>  			if (!strcmp(optarg, "none")) {
>  				if (c->no_dns)
> diff --git a/icmp.c b/icmp.c
> index b842fa8..ddf83f8 100644
> --- a/icmp.c
> +++ b/icmp.c
> @@ -170,8 +170,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr,
>  		iref.icmp.id = id = ntohs(ih->un.echo.id);
>  
>  		if ((s = icmp_id_map[V4][id].sock) <= 0) {
> -			s = sock_l4(c, AF_INET, IPPROTO_ICMP, NULL, NULL, id,
> -				    iref.u32);
> +			const struct in_addr *bind_addr = NULL;
> +			const char *bind_if;
> +
> +			bind_if = *c->ip4.ifname_out ? c->ip4.ifname_out : NULL;
> +
> +			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out))
> +				bind_addr = &c->ip4.addr_out;
> +
> +			s = sock_l4(c, AF_INET, IPPROTO_ICMP, bind_addr,
> +				    bind_if, id, iref.u32);

LGTM, modulo field names - again I think we should position "outbound
address" as simply "the address", with "DHCP advertised address" as
the special case override.

>  			if (s < 0)
>  				goto fail_sock;
>  			if (s > SOCKET_MAX) {
> @@ -216,8 +224,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr,
>  
>  		iref.icmp.id = id = ntohs(ih->icmp6_identifier);
>  		if ((s = icmp_id_map[V6][id].sock) <= 0) {
> -			s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, NULL, NULL, id,
> -				    iref.u32);
> +			const struct in6_addr *bind_addr = NULL;
> +			const char *bind_if;
> +
> +			bind_if = *c->ip6.ifname_out ? c->ip6.ifname_out : NULL;
> +
> +			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out))
> +				bind_addr = &c->ip6.addr_out;
> +
> +			s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, bind_addr,
> +				    bind_if, id, iref.u32);

Gitto.

>  			if (s < 0)
>  				goto fail_sock;
>  			if (s > SOCKET_MAX) {
> diff --git a/passt.1 b/passt.1
> index f317c33..5c97f30 100644
> --- a/passt.1
> +++ b/passt.1
> @@ -183,6 +183,25 @@ Use host interface \fIname\fR to derive addresses and routes.
>  Default is to use the interfaces with the first default routes for each IP
>  version.
>  
> +.TP
> +.BR \-o ", " \-\-outbound " " \fIaddr
> +Use an IPv4 \fIaddr\fR as source address for IPv4 outbound TCP connections, UDP
> +flows, ICMP requests, or an IPv6 \fIaddr\fR for IPv6 ones, by binding outbound
> +sockets to it.
> +This option can be specified zero (for defaults) to two times (once for IPv4,
> +once for IPv6).
> +By default, the source address is selected by the routing tables.
> +
> +.TP
> +.BR \-\-outbound-if4 " " \fIname
> +Bind IPv4 outbound sockets to host interface \fIname\fR.
> +By default, the interface given by the default route is selected.
> +
> +.TP
> +.BR \-\-outbound-if6 " " \fIname
> +Bind IPv6 outbound sockets to host interface \fIname\fR.
> +By default, the interface given by the default route is selected.
> +
>  .TP
>  .BR \-D ", " \-\-dns " " \fIaddr
>  Use \fIaddr\fR (IPv4 or IPv6) for DHCP, DHCPv6, NDP or DNS forwarding, as
> diff --git a/passt.h b/passt.h
> index cc60c84..b73f4ff 100644
> --- a/passt.h
> +++ b/passt.h
> @@ -106,6 +106,8 @@ enum passt_modes {
>   * @dns:		DNS addresses for DHCP, zero-terminated, network order
>   * @dns_match:		Forward DNS query if sent to this address, network order
>   * @dns_host:		Use this DNS on the host for forwarding, network order
> + * @addr_out:		Optional source address for outbound traffic
> + * @ifname_out:		Optional interface name to bind outbound sockets to
>   */
>  struct ip4_ctx {
>  	struct in_addr addr;
> @@ -115,6 +117,9 @@ struct ip4_ctx {
>  	struct in_addr dns[MAXNS + 1];
>  	struct in_addr dns_match;
>  	struct in_addr dns_host;
> +
> +	struct in_addr addr_out;
> +	char ifname_out[IFNAMSIZ];

Again, I think we should just use ip4_ctx::addr as the outbound
address, with an additional "dhcp_addr" if we must.

>  };
>  
>  /**
> @@ -127,6 +132,8 @@ struct ip4_ctx {
>   * @dns:		DNS addresses for DHCPv6 and NDP, zero-terminated
>   * @dns_match:		Forward DNS query if sent to this address
>   * @dns_host:		Use this DNS on the host for forwarding
> + * @addr_out:		Optional source address for outbound traffic
> + * @ifname_out:		Optional interface name to bind outbound sockets to
>   */
>  struct ip6_ctx {
>  	struct in6_addr addr;
> @@ -137,6 +144,9 @@ struct ip6_ctx {
>  	struct in6_addr dns[MAXNS + 1];
>  	struct in6_addr dns_match;
>  	struct in6_addr dns_host;
> +
> +	struct in6_addr addr_out;
> +	char ifname_out[IFNAMSIZ];

Ditto.

>  };
>  
>  #include <netinet/if_ether.h>
> diff --git a/tcp.c b/tcp.c
> index b674311..8e8d653 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -1946,6 +1946,61 @@ static uint16_t tcp_conn_tap_mss(const struct ctx *c,
>  	return MIN(mss, USHRT_MAX);
>  }
>  
> +/**
> + * tcp_bind_outbound() - Bind socket to outbound address and interface if given
> + * @c:		Execution context
> + * @s:		Outbound TCP socket
> + * @af:		Address family
> + */
> +static void tcp_bind_outbound(const struct ctx *c, int s, sa_family_t af)
> +{
> +	if (af == AF_INET) {
> +		if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)) {
> +			struct sockaddr_in addr4 = {
> +				.sin_family = AF_INET,
> +				.sin_port = 0,
> +				.sin_addr = c->ip4.addr_out,
> +			};
> +
> +			if (bind(s, (struct sockaddr *)&addr4, sizeof(addr4))) {
> +				debug("Can't bind IPv4 TCP socket address: %s",
> +				      strerror(errno));
> +			}
> +		}
> +
> +		if (*c->ip4.ifname_out) {
> +			if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE,
> +				       c->ip4.ifname_out,
> +				       strlen(c->ip4.ifname_out))) {
> +				debug("Can't bind IPv4 TCP socket to interface:"
> +				      " %s", strerror(errno));
> +			}
> +		}
> +	} else if (af == AF_INET6) {
> +		if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) {
> +			struct sockaddr_in6 addr6 = {
> +				.sin6_family = AF_INET6,
> +				.sin6_port = 0,
> +				.sin6_addr = c->ip6.addr_out,
> +			};
> +
> +			if (bind(s, (struct sockaddr *)&addr6, sizeof(addr6))) {
> +				debug("Can't bind IPv6 TCP socket address: %s",
> +				      strerror(errno));
> +			}
> +		}
> +
> +		if (*c->ip6.ifname_out) {
> +			if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE,
> +				       c->ip6.ifname_out,
> +				       strlen(c->ip6.ifname_out))) {
> +				debug("Can't bind IPv6 TCP socket to interface:"
> +				      " %s", strerror(errno));
> +			}
> +		}
> +	}
> +}
> +
>  /**
>   * tcp_conn_from_tap() - Handle connection request (SYN segment) from tap
>   * @c:		Execution context
> @@ -2052,6 +2107,11 @@ static void tcp_conn_from_tap(struct ctx *c, int af, const void *addr,
>  	if (errno != EADDRNOTAVAIL && errno != EACCES)
>  		conn_flag(c, conn, LOCAL);
>  
> +	if ((af == AF_INET &&  !IN4_IS_ADDR_LOOPBACK(&addr4.sin_addr)) ||
> +	    (af == AF_INET6 && !IN6_IS_ADDR_LOOPBACK(&addr6.sin6_addr) &&
> +			       !IN6_IS_ADDR_LINKLOCAL(&addr6.sin6_addr)))
> +		tcp_bind_outbound(c, s, af);

If we have an IPv6 link-local, shouldn't we still do the
SO_BINDTODEVICE, even though we don't want the bind()?

I also wonder if we should allow setting our outbound link-local IPv6
address, since AIUI there could be multiple on the host.  e.g. allow
-a to be given three times: one for IPv4, one for IPv6 global and one
for IPv6 link-local.


>  	if (connect(s, sa, sl)) {
>  		if (errno != EINPROGRESS) {
>  			tcp_rst(c, conn);
> diff --git a/udp.c b/udp.c
> index b7cbfdc..99cfc9f 100644
> --- a/udp.c
> +++ b/udp.c
> @@ -843,20 +843,6 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
>  		sa = (struct sockaddr *)&s_in;
>  		sl = sizeof(s_in);
>  
> -		if (!(s = udp_tap_map[V4][src].sock)) {
> -			union udp_epoll_ref uref = { .udp.port = src };
> -
> -			s = sock_l4(c, AF_INET, IPPROTO_UDP, NULL, NULL, src,
> -				    uref.u32);
> -			if (s < 0)
> -				return p->count;
> -
> -			udp_tap_map[V4][src].sock = s;
> -			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
> -		}
> -
> -		udp_tap_map[V4][src].ts = now->tv_sec;
> -
>  		if (IN4_ARE_ADDR_EQUAL(&s_in.sin_addr, &c->ip4.dns_match) &&
>  		    ntohs(s_in.sin_port) == 53) {
>  			s_in.sin_addr = c->ip4.dns_host;
> @@ -868,13 +854,37 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
>  			else
>  				s_in.sin_addr = c->ip4.addr_seen;
>  		}
> +
> +		if (!(s = udp_tap_map[V4][src].sock)) {
> +			union udp_epoll_ref uref = { .udp.port = src };
> +			in_addr_t bind_addr = { 0 };
> +			const char *bind_if = NULL;
> +
> +			if (!IN6_IS_ADDR_LOOPBACK(&s_in.sin_addr) &&
> +			    *c->ip6.ifname_out)
> +				bind_if = c->ip6.ifname_out;
> +
> +			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
> +			    !IN4_IS_ADDR_LOOPBACK(&s_in.sin_addr))
> +				bind_addr = c->ip4.addr_out.s_addr;
> +
> +			s = sock_l4(c, AF_INET, IPPROTO_UDP, &bind_addr,
> +				    bind_if, src, uref.u32);
> +			if (s < 0)
> +				return p->count;
> +
> +			udp_tap_map[V4][src].sock = s;
> +			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
> +		}
> +
> +		udp_tap_map[V4][src].ts = now->tv_sec;
>  	} else {
>  		s_in6 = (struct sockaddr_in6) {
>  			.sin6_family = AF_INET6,
>  			.sin6_port = uh->dest,
>  			.sin6_addr = *(struct in6_addr *)addr,
>  		};
> -		const void *bind_addr = &in6addr_any;
> +		const struct in6_addr *bind_addr = &in6addr_any;
>  
>  		sa = (struct sockaddr *)&s_in6;
>  		sl = sizeof(s_in6);
> @@ -898,9 +908,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
>  		if (!(s = udp_tap_map[V6][src].sock)) {
>  			union udp_epoll_ref uref = { .udp.v6 = 1,
>  						     .udp.port = src };
> +			const char *bind_if = NULL;
> +
> +			if (!IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
> +			    *c->ip6.ifname_out)
> +				bind_if = c->ip6.ifname_out;
> +
> +			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) &&
> +			    !IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
> +			    !IN6_IS_ADDR_LINKLOCAL(&s_in6.sin6_addr))
> +				bind_addr = &c->ip6.addr_out;
>  
> -			s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, NULL,
> -				    src, uref.u32);
> +			s = sock_l4(c, AF_INET6, IPPROTO_UDP, &bind_addr,

bind_addr is already a pointer, I don't think you want the '&' here.

> +				    bind_if, src, uref.u32);
>  			if (s < 0)
>  				return p->count;
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface
  2023-03-08 22:02   ` David Gibson
@ 2023-03-08 23:33     ` Stefano Brivio
  2023-03-09  0:18       ` David Gibson
  0 siblings, 1 reply; 8+ messages in thread
From: Stefano Brivio @ 2023-03-08 23:33 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Thu, 9 Mar 2023 09:02:57 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Mar 08, 2023 at 08:34:49AM +0100, Stefano Brivio wrote:
> > I didn't notice earlier: libslirp (and slirp4netns) supports binding
> > outbound sockets to specific IPv4 and IPv6 addresses, to force the
> > source addresse selection. If we want to claim feature parity, we
> > should implement that as well.
> > 
> > Further, Podman supports specifying outbound interfaces as well, but
> > this is simply done by resolving the primary address for an interface
> > when the network back-end is started. However, since kernel version
> > 5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for
> > non-root users"), we can actually bind to a specific interface name,
> > which doesn't need to be validated in advance.
> > 
> > Implement -o / --outbound ADDR to bind to IPv4 and IPv6 addresses,
> > and --outbound-ip4 and --outbound-ip6 to bind IPv4 and IPv6 sockets
> > to given interfaces.  
> 
> You have 'outbound-ip' here but 'outbound-if' in the code, I think you
> intended the latter.

Oops, right.

> > For ICMP and UDP, we call sock_l4() to open outbound sockets, as we
> > already needed to bind to given ports or echo identifiers, and we
> > can bind() a socket only once: there, pass address (if any) and
> > interface (if any) for the existing bind() and setsockopt() calls.
> > 
> > For TCP, in general, we wouldn't otherwise bind sockets. Add a
> > specific helper to do that.
> > 
> > For UDP outbound sockets, we need to know if the final destination
> > of the socket is a loopback address, before we decide whether it
> > makes sense to bind the socket at all: move the block mangling the
> > address destination before the creation of the socket in the IPv4
> > path. This was already the case for the IPv6 path.
> > 
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > ---
> >  conf.c  | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
> >  icmp.c  | 24 +++++++++++++++---
> >  passt.1 | 19 ++++++++++++++
> >  passt.h | 10 ++++++++
> >  tcp.c   | 60 ++++++++++++++++++++++++++++++++++++++++++++
> >  udp.c   | 54 ++++++++++++++++++++++++++-------------
> >  6 files changed, 219 insertions(+), 26 deletions(-)
> > 
> > diff --git a/conf.c b/conf.c
> > index 3aa3314..15506ec 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -776,6 +776,13 @@ static void usage(const char *name)
> >  	info(   "    default: gateway from interface with default route");
> >  	info(   "  -i, --interface NAME	Interface for addresses and routes");
> >  	info(   "    default: interface with first default route");  
> 
> So, I think the outbound IP and the IP we advertise to the guest
> should be the same.  Certainly by default, and I'm not sure I can even
> think of any case where it would be useful for them to be different.
> That fits with the "only NAT when we really have to" goal.

Sure, that's a goal, but users might want to do NAT for whatever
reason, even just for slirp4netns compatibility (which we should
*really* support to play along nicely with Podman).

At the moment, it's already enough to pass '-a 10.200.0.2' and the
outbound address will be (in general) different from what we advertise.

> So, could we just add say, --if4 and --if6 instead of having explicit
> --outbound-* options?  -i X would be equivalent to "--if4 X --if6 X".
> 
> > +	info(   "  -o, --outbound ADDR	Bind to address as outbound source");
> > +	info(   "    can be specified zero to two times (for IPv4 and IPv6)");
> > +	info(   "    default: use source address from routing tables");  
> 
> On the same reasoning, can we merge this option with '-a' setting both
> outbound and assigned address by default?

Well, but this would effectively disable any possibility to do NAT: if
you want to assign a different address, it needs to be available on the
host too.

> Even if we want to allow setting these differently for some strange
> cases, I think the more obvious primary "address" and "interface"
> options should set the outbound address/interface.  The DHCP
> advertised address should match that unless we use some more obscure
> option to override it.

What do you do if that address is not available on the host, though?

Also mind that this option (which I'd like to add mostly for
slirp4netns feature parity) seems to have been introduced in libslirp
to enable the selection of a particular address on an interface with
multiple addresses -- so --if4/--if6 wouldn't be enough for that.

We could have -a selecting the outbound address, but then users can't
do NAT with it.

We could skip the --outbound-if4 and --outbound-if6 parts, but they
have the advantage (over bind()) that the interface doesn't actually
need to exist when we start.

There's also the advantage that the user doesn't need to fetch a given
address (which might even change later) if all they want is controlling
the outbound path, without affecting the source address.

Those options (--outbound-if4 and --outbound-if6) have no equivalent in
libslirp, and Podman approximates them by picking an address from the
given interface just before slirp4netns starts, so strictly speaking
this is not needed for slirp4netns compatibility, but I thought it's a
more "correct" way to implement that.

> In fact... thinking further on that, I'm not convinced there's any
> real use for a "template" interface.  Normally it should match the
> outbound interface - and if you want it different, I don't think it
> really buys anything over explicitly overriding the DHCP advertised
> address etc.

Normally, yes, but if you want it different, what you propose also
imposes some serious limitations.

> > +	info(   "  --outbound-if4 NAME	Bind to outbound interface for IPv4");
> > +	info(   "    default: use interface from default route");
> > +	info(   "  --outbound-if6 NAME	Bind to outbound interface for IPv6");
> > +	info(   "    default: use interface from default route");
> >  	info(   "  -D, --dns ADDR	Use IPv4 or IPv6 address as DNS");
> >  	info(   "    can be specified multiple times");
> >  	info(   "    a single, empty option disables DNS information");
> > @@ -900,7 +907,7 @@ pasta_opts:
> >   */
> >  static void conf_print(const struct ctx *c)
> >  {
> > -	char buf4[INET_ADDRSTRLEN], ifn[IFNAMSIZ];
> > +	char buf4[INET_ADDRSTRLEN], buf6[INET6_ADDRSTRLEN], ifn[IFNAMSIZ];
> >  	int i;
> >  
> >  	info("Template interface: %s%s%s%s%s",
> > @@ -910,6 +917,26 @@ static void conf_print(const struct ctx *c)
> >  	     c->ifi6 ? if_indextoname(c->ifi6, ifn) : "",
> >  	     c->ifi6 ? " (IPv6)" : "");
> >  
> > +	if (*c->ip4.ifname_out || *c->ip6.ifname_out) {
> > +		info("Outbound interface: %s%s%s%s%s",
> > +		     *c->ip4.ifname_out ? c->ip4.ifname_out : "",
> > +		     *c->ip4.ifname_out ? " (IPv4)" : "",
> > +		     (*c->ip4.ifname_out && *c->ip6.ifname_out) ? ", " : "",
> > +		     *c->ip6.ifname_out ? c->ip6.ifname_out : "",
> > +		     *c->ip6.ifname_out ? " (IPv6)" : "");
> > +	}
> > +
> > +	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ||
> > +	    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) {
> > +		info("Outbound address: %s%s%s",
> > +		     IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ? "" :
> > +		     inet_ntop(AF_INET, &c->ip4.addr_out, buf4, sizeof(buf4)),
> > +		     (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
> > +		      !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) ? ", " : "",
> > +		     IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) ? "" :
> > +		     inet_ntop(AF_INET6, &c->ip6.addr_out, buf6, sizeof(buf6)));
> > +	}
> > +
> >  	if (c->mode == MODE_PASTA)
> >  		info("Namespace interface: %s", c->pasta_ifn);
> >  
> > @@ -948,8 +975,6 @@ static void conf_print(const struct ctx *c)
> >  	}
> >  
> >  	if (c->ifi6) {
> > -		char buf6[INET6_ADDRSTRLEN];
> > -
> >  		if (!c->no_ndp && !c->no_dhcpv6)
> >  			info("NDP/DHCPv6:");
> >  		else if (!c->no_ndp)
> > @@ -1125,6 +1150,7 @@ void conf(struct ctx *c, int argc, char **argv)
> >  		{"mac-addr",	required_argument,	NULL,		'M' },
> >  		{"gateway",	required_argument,	NULL,		'g' },
> >  		{"interface",	required_argument,	NULL,		'i' },
> > +		{"outbound",	required_argument,	NULL,		'o' },
> >  		{"dns",		required_argument,	NULL,		'D' },
> >  		{"search",	required_argument,	NULL,		'S' },
> >  		{"no-tcp",	no_argument,		&c->no_tcp,	1 },
> > @@ -1157,6 +1183,8 @@ void conf(struct ctx *c, int argc, char **argv)
> >  		{"runas",	required_argument,	NULL,		12 },
> >  		{"log-size",	required_argument,	NULL,		13 },
> >  		{"version",	no_argument,		NULL,		14 },
> > +		{"outbound-if4", required_argument,	NULL,		15 },
> > +		{"outbound-if6", required_argument,	NULL,		16 },
> >  		{ 0 },
> >  	};
> >  	struct get_bound_ports_ns_arg ns_ports_arg = { .c = c };
> > @@ -1175,9 +1203,9 @@ void conf(struct ctx *c, int argc, char **argv)
> >  
> >  	if (c->mode == MODE_PASTA) {
> >  		c->no_dhcp_dns = c->no_dhcp_dns_search = 1;
> > -		optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:";
> > +		optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:o:D:S:46t:u:T:U:";
> >  	} else {
> > -		optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:D:S:461t:u:";
> > +		optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:o:D:S:461t:u:";
> >  	}
> >  
> >  	c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = 0;
> > @@ -1295,6 +1323,26 @@ void conf(struct ctx *c, int argc, char **argv)
> >  				c->mode == MODE_PASST ? "passt " : "pasta ");
> >  			fprintf(stdout, VERSION_BLOB);
> >  			exit(EXIT_SUCCESS);
> > +		case 15:
> > +			if (*c->ip4.ifname_out)
> > +				die("Redundant outbound interface: %s", optarg);
> > +
> > +			ret = snprintf(c->ip4.ifname_out,
> > +				       sizeof(c->ip4.ifname_out), "%s", optarg);
> > +			if (ret <= 0 || ret >= (int)sizeof(c->ip4.ifname_out))
> > +				die("Invalid interface name: %s", optarg);
> > +
> > +			break;
> > +		case 16:
> > +			if (*c->ip6.ifname_out)
> > +				die("Redundant outbound interface: %s", optarg);
> > +
> > +			ret = snprintf(c->ip6.ifname_out,
> > +				       sizeof(c->ip6.ifname_out), "%s", optarg);
> > +			if (ret <= 0 || ret >= (int)sizeof(c->ip6.ifname_out))
> > +				die("Invalid interface name: %s", optarg);
> > +
> > +			break;
> >  		case 'd':
> >  			if (c->debug)
> >  				die("Multiple --debug options given");
> > @@ -1466,6 +1514,26 @@ void conf(struct ctx *c, int argc, char **argv)
> >  				die("Invalid interface name %s: %s", optarg,
> >  				    strerror(errno));
> >  			break;
> > +		case 'o':
> > +			if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)	  &&
> > +			    inet_pton(AF_INET6, optarg, &c->ip6.addr_out) &&
> > +			    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)	  &&
> > +			    !IN6_IS_ADDR_LOOPBACK(&c->ip6.addr_out)	  &&
> > +			    !IN6_IS_ADDR_V4MAPPED(&c->ip6.addr_out)	  &&
> > +			    !IN6_IS_ADDR_V4COMPAT(&c->ip6.addr_out)	  &&
> > +			    !IN6_IS_ADDR_MULTICAST(&c->ip6.addr_out))
> > +				break;
> > +
> > +			if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)	 &&
> > +			    inet_pton(AF_INET, optarg, &c->ip4.addr_out) &&
> > +			    !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)	 &&
> > +			    !IN4_IS_ADDR_BROADCAST(&c->ip4.addr_out)	 &&
> > +			    !IN4_IS_ADDR_MULTICAST(&c->ip4.addr_out))
> > +				break;
> > +
> > +			die("Invalid or redundant outbound address: %s",
> > +			    optarg);
> > +			break;  
> 
> I haven't really looked at any of the conf stuff above, since it will
> change if we alter the option semantics as I've suggested.
> 
> >  		case 'D':
> >  			if (!strcmp(optarg, "none")) {
> >  				if (c->no_dns)
> > diff --git a/icmp.c b/icmp.c
> > index b842fa8..ddf83f8 100644
> > --- a/icmp.c
> > +++ b/icmp.c
> > @@ -170,8 +170,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr,
> >  		iref.icmp.id = id = ntohs(ih->un.echo.id);
> >  
> >  		if ((s = icmp_id_map[V4][id].sock) <= 0) {
> > -			s = sock_l4(c, AF_INET, IPPROTO_ICMP, NULL, NULL, id,
> > -				    iref.u32);
> > +			const struct in_addr *bind_addr = NULL;
> > +			const char *bind_if;
> > +
> > +			bind_if = *c->ip4.ifname_out ? c->ip4.ifname_out : NULL;
> > +
> > +			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out))
> > +				bind_addr = &c->ip4.addr_out;
> > +
> > +			s = sock_l4(c, AF_INET, IPPROTO_ICMP, bind_addr,
> > +				    bind_if, id, iref.u32);  
> 
> LGTM, modulo field names - again I think we should position "outbound
> address" as simply "the address", with "DHCP advertised address" as
> the special case override.

But from passt/pasta's perspective, "the address" is already what's
expected to be used by the guest/container. We use that all over the
place, and I think it might really need to be different from an address
we "externally" bind to.

I really wouldn't like to see users coming up with netfilter-based hacks
to support this. :/

> >  			if (s < 0)
> >  				goto fail_sock;
> >  			if (s > SOCKET_MAX) {
> > @@ -216,8 +224,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr,
> >  
> >  		iref.icmp.id = id = ntohs(ih->icmp6_identifier);
> >  		if ((s = icmp_id_map[V6][id].sock) <= 0) {
> > -			s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, NULL, NULL, id,
> > -				    iref.u32);
> > +			const struct in6_addr *bind_addr = NULL;
> > +			const char *bind_if;
> > +
> > +			bind_if = *c->ip6.ifname_out ? c->ip6.ifname_out : NULL;
> > +
> > +			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out))
> > +				bind_addr = &c->ip6.addr_out;
> > +
> > +			s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, bind_addr,
> > +				    bind_if, id, iref.u32);  
> 
> Gitto.

Interesting typo. :)

> >  			if (s < 0)
> >  				goto fail_sock;
> >  			if (s > SOCKET_MAX) {
> > diff --git a/passt.1 b/passt.1
> > index f317c33..5c97f30 100644
> > --- a/passt.1
> > +++ b/passt.1
> > @@ -183,6 +183,25 @@ Use host interface \fIname\fR to derive addresses and routes.
> >  Default is to use the interfaces with the first default routes for each IP
> >  version.
> >  
> > +.TP
> > +.BR \-o ", " \-\-outbound " " \fIaddr
> > +Use an IPv4 \fIaddr\fR as source address for IPv4 outbound TCP connections, UDP
> > +flows, ICMP requests, or an IPv6 \fIaddr\fR for IPv6 ones, by binding outbound
> > +sockets to it.
> > +This option can be specified zero (for defaults) to two times (once for IPv4,
> > +once for IPv6).
> > +By default, the source address is selected by the routing tables.
> > +
> > +.TP
> > +.BR \-\-outbound-if4 " " \fIname
> > +Bind IPv4 outbound sockets to host interface \fIname\fR.
> > +By default, the interface given by the default route is selected.
> > +
> > +.TP
> > +.BR \-\-outbound-if6 " " \fIname
> > +Bind IPv6 outbound sockets to host interface \fIname\fR.
> > +By default, the interface given by the default route is selected.
> > +
> >  .TP
> >  .BR \-D ", " \-\-dns " " \fIaddr
> >  Use \fIaddr\fR (IPv4 or IPv6) for DHCP, DHCPv6, NDP or DNS forwarding, as
> > diff --git a/passt.h b/passt.h
> > index cc60c84..b73f4ff 100644
> > --- a/passt.h
> > +++ b/passt.h
> > @@ -106,6 +106,8 @@ enum passt_modes {
> >   * @dns:		DNS addresses for DHCP, zero-terminated, network order
> >   * @dns_match:		Forward DNS query if sent to this address, network order
> >   * @dns_host:		Use this DNS on the host for forwarding, network order
> > + * @addr_out:		Optional source address for outbound traffic
> > + * @ifname_out:		Optional interface name to bind outbound sockets to
> >   */
> >  struct ip4_ctx {
> >  	struct in_addr addr;
> > @@ -115,6 +117,9 @@ struct ip4_ctx {
> >  	struct in_addr dns[MAXNS + 1];
> >  	struct in_addr dns_match;
> >  	struct in_addr dns_host;
> > +
> > +	struct in_addr addr_out;
> > +	char ifname_out[IFNAMSIZ];  
> 
> Again, I think we should just use ip4_ctx::addr as the outbound
> address, with an additional "dhcp_addr" if we must.
> 
> >  };
> >  
> >  /**
> > @@ -127,6 +132,8 @@ struct ip4_ctx {
> >   * @dns:		DNS addresses for DHCPv6 and NDP, zero-terminated
> >   * @dns_match:		Forward DNS query if sent to this address
> >   * @dns_host:		Use this DNS on the host for forwarding
> > + * @addr_out:		Optional source address for outbound traffic
> > + * @ifname_out:		Optional interface name to bind outbound sockets to
> >   */
> >  struct ip6_ctx {
> >  	struct in6_addr addr;
> > @@ -137,6 +144,9 @@ struct ip6_ctx {
> >  	struct in6_addr dns[MAXNS + 1];
> >  	struct in6_addr dns_match;
> >  	struct in6_addr dns_host;
> > +
> > +	struct in6_addr addr_out;
> > +	char ifname_out[IFNAMSIZ];  
> 
> Ditto.
> 
> >  };
> >  
> >  #include <netinet/if_ether.h>
> > diff --git a/tcp.c b/tcp.c
> > index b674311..8e8d653 100644
> > --- a/tcp.c
> > +++ b/tcp.c
> > @@ -1946,6 +1946,61 @@ static uint16_t tcp_conn_tap_mss(const struct ctx *c,
> >  	return MIN(mss, USHRT_MAX);
> >  }
> >  
> > +/**
> > + * tcp_bind_outbound() - Bind socket to outbound address and interface if given
> > + * @c:		Execution context
> > + * @s:		Outbound TCP socket
> > + * @af:		Address family
> > + */
> > +static void tcp_bind_outbound(const struct ctx *c, int s, sa_family_t af)
> > +{
> > +	if (af == AF_INET) {
> > +		if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)) {
> > +			struct sockaddr_in addr4 = {
> > +				.sin_family = AF_INET,
> > +				.sin_port = 0,
> > +				.sin_addr = c->ip4.addr_out,
> > +			};
> > +
> > +			if (bind(s, (struct sockaddr *)&addr4, sizeof(addr4))) {
> > +				debug("Can't bind IPv4 TCP socket address: %s",
> > +				      strerror(errno));
> > +			}
> > +		}
> > +
> > +		if (*c->ip4.ifname_out) {
> > +			if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE,
> > +				       c->ip4.ifname_out,
> > +				       strlen(c->ip4.ifname_out))) {
> > +				debug("Can't bind IPv4 TCP socket to interface:"
> > +				      " %s", strerror(errno));
> > +			}
> > +		}
> > +	} else if (af == AF_INET6) {
> > +		if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) {
> > +			struct sockaddr_in6 addr6 = {
> > +				.sin6_family = AF_INET6,
> > +				.sin6_port = 0,
> > +				.sin6_addr = c->ip6.addr_out,
> > +			};
> > +
> > +			if (bind(s, (struct sockaddr *)&addr6, sizeof(addr6))) {
> > +				debug("Can't bind IPv6 TCP socket address: %s",
> > +				      strerror(errno));
> > +			}
> > +		}
> > +
> > +		if (*c->ip6.ifname_out) {
> > +			if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE,
> > +				       c->ip6.ifname_out,
> > +				       strlen(c->ip6.ifname_out))) {
> > +				debug("Can't bind IPv6 TCP socket to interface:"
> > +				      " %s", strerror(errno));
> > +			}
> > +		}
> > +	}
> > +}
> > +
> >  /**
> >   * tcp_conn_from_tap() - Handle connection request (SYN segment) from tap
> >   * @c:		Execution context
> > @@ -2052,6 +2107,11 @@ static void tcp_conn_from_tap(struct ctx *c, int af, const void *addr,
> >  	if (errno != EADDRNOTAVAIL && errno != EACCES)
> >  		conn_flag(c, conn, LOCAL);
> >  
> > +	if ((af == AF_INET &&  !IN4_IS_ADDR_LOOPBACK(&addr4.sin_addr)) ||
> > +	    (af == AF_INET6 && !IN6_IS_ADDR_LOOPBACK(&addr6.sin6_addr) &&
> > +			       !IN6_IS_ADDR_LINKLOCAL(&addr6.sin6_addr)))
> > +		tcp_bind_outbound(c, s, af);  
> 
> If we have an IPv6 link-local, shouldn't we still do the
> SO_BINDTODEVICE, even though we don't want the bind()?

If an outbound connection uses a link-local address, we use c->ifi6 as
sin6_scope_id (see tcp_conn_from_tap()) -- we need one and it's a
reasonable assumption to use that as that's where we source the first
hop/default gateway from.

But with SO_BINDTODEVICE the kernel resolves the interface name, which
avoids 1. a netlink exchange on every new connection and 2. the need
for the interface to be present at any given time. So I'm not sure we
can actually check that they match in a safe way -- and if there's any
advantage in doing so.

I tend to see the link-local case as completely separate from any
"outbound" consideration, because that's where we make the
guest/container think they're actually talking to a gateway, not to the
"outside world".

> I also wonder if we should allow setting our outbound link-local IPv6
> address, since AIUI there could be multiple on the host.  e.g. allow
> -a to be given three times: one for IPv4, one for IPv6 global and one
> for IPv6 link-local.

Hmm, yes, I guess we should at some point.

> >  	if (connect(s, sa, sl)) {
> >  		if (errno != EINPROGRESS) {
> >  			tcp_rst(c, conn);
> > diff --git a/udp.c b/udp.c
> > index b7cbfdc..99cfc9f 100644
> > --- a/udp.c
> > +++ b/udp.c
> > @@ -843,20 +843,6 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
> >  		sa = (struct sockaddr *)&s_in;
> >  		sl = sizeof(s_in);
> >  
> > -		if (!(s = udp_tap_map[V4][src].sock)) {
> > -			union udp_epoll_ref uref = { .udp.port = src };
> > -
> > -			s = sock_l4(c, AF_INET, IPPROTO_UDP, NULL, NULL, src,
> > -				    uref.u32);
> > -			if (s < 0)
> > -				return p->count;
> > -
> > -			udp_tap_map[V4][src].sock = s;
> > -			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
> > -		}
> > -
> > -		udp_tap_map[V4][src].ts = now->tv_sec;
> > -
> >  		if (IN4_ARE_ADDR_EQUAL(&s_in.sin_addr, &c->ip4.dns_match) &&
> >  		    ntohs(s_in.sin_port) == 53) {
> >  			s_in.sin_addr = c->ip4.dns_host;
> > @@ -868,13 +854,37 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
> >  			else
> >  				s_in.sin_addr = c->ip4.addr_seen;
> >  		}
> > +
> > +		if (!(s = udp_tap_map[V4][src].sock)) {
> > +			union udp_epoll_ref uref = { .udp.port = src };
> > +			in_addr_t bind_addr = { 0 };
> > +			const char *bind_if = NULL;
> > +
> > +			if (!IN6_IS_ADDR_LOOPBACK(&s_in.sin_addr) &&
> > +			    *c->ip6.ifname_out)
> > +				bind_if = c->ip6.ifname_out;
> > +
> > +			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
> > +			    !IN4_IS_ADDR_LOOPBACK(&s_in.sin_addr))
> > +				bind_addr = c->ip4.addr_out.s_addr;
> > +
> > +			s = sock_l4(c, AF_INET, IPPROTO_UDP, &bind_addr,
> > +				    bind_if, src, uref.u32);
> > +			if (s < 0)
> > +				return p->count;
> > +
> > +			udp_tap_map[V4][src].sock = s;
> > +			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
> > +		}
> > +
> > +		udp_tap_map[V4][src].ts = now->tv_sec;
> >  	} else {
> >  		s_in6 = (struct sockaddr_in6) {
> >  			.sin6_family = AF_INET6,
> >  			.sin6_port = uh->dest,
> >  			.sin6_addr = *(struct in6_addr *)addr,
> >  		};
> > -		const void *bind_addr = &in6addr_any;
> > +		const struct in6_addr *bind_addr = &in6addr_any;
> >  
> >  		sa = (struct sockaddr *)&s_in6;
> >  		sl = sizeof(s_in6);
> > @@ -898,9 +908,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
> >  		if (!(s = udp_tap_map[V6][src].sock)) {
> >  			union udp_epoll_ref uref = { .udp.v6 = 1,
> >  						     .udp.port = src };
> > +			const char *bind_if = NULL;
> > +
> > +			if (!IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
> > +			    *c->ip6.ifname_out)
> > +				bind_if = c->ip6.ifname_out;
> > +
> > +			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) &&
> > +			    !IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
> > +			    !IN6_IS_ADDR_LINKLOCAL(&s_in6.sin6_addr))
> > +				bind_addr = &c->ip6.addr_out;
> >  
> > -			s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, NULL,
> > -				    src, uref.u32);
> > +			s = sock_l4(c, AF_INET6, IPPROTO_UDP, &bind_addr,  
> 
> bind_addr is already a pointer, I don't think you want the '&' here.

Okay, at least this is fixed. :)

> > +				    bind_if, src, uref.u32);
> >  			if (s < 0)
> >  				return p->count;
> >    

-- 
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface
  2023-03-08 23:33     ` Stefano Brivio
@ 2023-03-09  0:18       ` David Gibson
  2023-03-09  2:41         ` Stefano Brivio
  0 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2023-03-09  0:18 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 27488 bytes --]

On Thu, Mar 09, 2023 at 12:33:20AM +0100, Stefano Brivio wrote:
> On Thu, 9 Mar 2023 09:02:57 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Wed, Mar 08, 2023 at 08:34:49AM +0100, Stefano Brivio wrote:
> > > I didn't notice earlier: libslirp (and slirp4netns) supports binding
> > > outbound sockets to specific IPv4 and IPv6 addresses, to force the
> > > source addresse selection. If we want to claim feature parity, we
> > > should implement that as well.
> > > 
> > > Further, Podman supports specifying outbound interfaces as well, but
> > > this is simply done by resolving the primary address for an interface
> > > when the network back-end is started. However, since kernel version
> > > 5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for
> > > non-root users"), we can actually bind to a specific interface name,
> > > which doesn't need to be validated in advance.
> > > 
> > > Implement -o / --outbound ADDR to bind to IPv4 and IPv6 addresses,
> > > and --outbound-ip4 and --outbound-ip6 to bind IPv4 and IPv6 sockets
> > > to given interfaces.  
> > 
> > You have 'outbound-ip' here but 'outbound-if' in the code, I think you
> > intended the latter.
> 
> Oops, right.
> 
> > > For ICMP and UDP, we call sock_l4() to open outbound sockets, as we
> > > already needed to bind to given ports or echo identifiers, and we
> > > can bind() a socket only once: there, pass address (if any) and
> > > interface (if any) for the existing bind() and setsockopt() calls.
> > > 
> > > For TCP, in general, we wouldn't otherwise bind sockets. Add a
> > > specific helper to do that.
> > > 
> > > For UDP outbound sockets, we need to know if the final destination
> > > of the socket is a loopback address, before we decide whether it
> > > makes sense to bind the socket at all: move the block mangling the
> > > address destination before the creation of the socket in the IPv4
> > > path. This was already the case for the IPv6 path.
> > > 
> > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > > ---
> > >  conf.c  | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
> > >  icmp.c  | 24 +++++++++++++++---
> > >  passt.1 | 19 ++++++++++++++
> > >  passt.h | 10 ++++++++
> > >  tcp.c   | 60 ++++++++++++++++++++++++++++++++++++++++++++
> > >  udp.c   | 54 ++++++++++++++++++++++++++-------------
> > >  6 files changed, 219 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/conf.c b/conf.c
> > > index 3aa3314..15506ec 100644
> > > --- a/conf.c
> > > +++ b/conf.c
> > > @@ -776,6 +776,13 @@ static void usage(const char *name)
> > >  	info(   "    default: gateway from interface with default route");
> > >  	info(   "  -i, --interface NAME	Interface for addresses and routes");
> > >  	info(   "    default: interface with first default route");  
> > 
> > So, I think the outbound IP and the IP we advertise to the guest
> > should be the same.  Certainly by default, and I'm not sure I can even
> > think of any case where it would be useful for them to be different.
> > That fits with the "only NAT when we really have to" goal.
> 
> Sure, that's a goal, but users might want to do NAT for whatever
> reason, even just for slirp4netns compatibility (which we should
> *really* support to play along nicely with Podman).
> 
> At the moment, it's already enough to pass '-a 10.200.0.2' and the
> outbound address will be (in general) different from what we advertise.

Yes, I know, but I'm saying I think that's kind of undesirable.  The
more prominent option should change both outbound and advertised
address to match.  Separately overriding the advertised address should
be the special case.

> > So, could we just add say, --if4 and --if6 instead of having explicit
> > --outbound-* options?  -i X would be equivalent to "--if4 X --if6 X".
> > 
> > > +	info(   "  -o, --outbound ADDR	Bind to address as outbound source");
> > > +	info(   "    can be specified zero to two times (for IPv4 and IPv6)");
> > > +	info(   "    default: use source address from routing tables");  
> > 
> > On the same reasoning, can we merge this option with '-a' setting both
> > outbound and assigned address by default?
> 
> Well, but this would effectively disable any possibility to do NAT: if
> you want to assign a different address, it needs to be available on the
> host too.
> 
> > Even if we want to allow setting these differently for some strange
> > cases, I think the more obvious primary "address" and "interface"
> > options should set the outbound address/interface.  The DHCP
> > advertised address should match that unless we use some more obscure
> > option to override it.
> 
> What do you do if that address is not available on the host, though?

Well under this proposal -a would obviously only work for host
addresses, but the --dhcp-address or --guest-address or whatever would
allow anything.

> Also mind that this option (which I'd like to add mostly for
> slirp4netns feature parity) seems to have been introduced in libslirp
> to enable the selection of a particular address on an interface with
> multiple addresses -- so --if4/--if6 wouldn't be enough for that.

Yes, I'm saying you'd use -a for that.  -if4 or --if6 would only be
useful where you want to bind to interface rather than address.

> We could have -a selecting the outbound address, but then users can't
> do NAT with it.

Not *only* with that, but they can if they also set guest address to
override.  Actually.. that's a point, my whole thing here is to only
do NAT if you really want it, so maybe the option should explicitly
mention NAT.

> We could skip the --outbound-if4 and --outbound-if6 parts, but they
> have the advantage (over bind()) that the interface doesn't actually
> need to exist when we start.

I'm not proposing changing that, I still think we should have an
option to set the outbound interface.  It's just treating that as "the
interface", the idea of template interface goes away, replaced by an
explicit override for guest DHCP address.

> There's also the advantage that the user doesn't need to fetch a given
> address (which might even change later) if all they want is controlling
> the outbound path, without affecting the source address.

Not sure what you're getting at here.

> Those options (--outbound-if4 and --outbound-if6) have no equivalent in
> libslirp, and Podman approximates them by picking an address from the
> given interface just before slirp4netns starts, so strictly speaking
> this is not needed for slirp4netns compatibility, but I thought it's a
> more "correct" way to implement that.

No argument on that point - it's really just a matter of naming I'm
talking about here.

> > In fact... thinking further on that, I'm not convinced there's any
> > real use for a "template" interface.  Normally it should match the
> > outbound interface - and if you want it different, I don't think it
> > really buys anything over explicitly overriding the DHCP advertised
> > address etc.
> 
> Normally, yes, but if you want it different, what you propose also
> imposes some serious limitations.

I'm not seeing how.

Normal mode:
	-a sets both the outbound and guest address
	--if4 and --if6 set the outbound interface.  If there's no -a,
	both outbound and guest address are derived from this as well.

NAT mode:
	Same as above, but guest address can separately be set to
	anything you want.

> > > +	info(   "  --outbound-if4 NAME	Bind to outbound interface for IPv4");
> > > +	info(   "    default: use interface from default route");
> > > +	info(   "  --outbound-if6 NAME	Bind to outbound interface for IPv6");
> > > +	info(   "    default: use interface from default route");
> > >  	info(   "  -D, --dns ADDR	Use IPv4 or IPv6 address as DNS");
> > >  	info(   "    can be specified multiple times");
> > >  	info(   "    a single, empty option disables DNS information");
> > > @@ -900,7 +907,7 @@ pasta_opts:
> > >   */
> > >  static void conf_print(const struct ctx *c)
> > >  {
> > > -	char buf4[INET_ADDRSTRLEN], ifn[IFNAMSIZ];
> > > +	char buf4[INET_ADDRSTRLEN], buf6[INET6_ADDRSTRLEN], ifn[IFNAMSIZ];
> > >  	int i;
> > >  
> > >  	info("Template interface: %s%s%s%s%s",
> > > @@ -910,6 +917,26 @@ static void conf_print(const struct ctx *c)
> > >  	     c->ifi6 ? if_indextoname(c->ifi6, ifn) : "",
> > >  	     c->ifi6 ? " (IPv6)" : "");
> > >  
> > > +	if (*c->ip4.ifname_out || *c->ip6.ifname_out) {
> > > +		info("Outbound interface: %s%s%s%s%s",
> > > +		     *c->ip4.ifname_out ? c->ip4.ifname_out : "",
> > > +		     *c->ip4.ifname_out ? " (IPv4)" : "",
> > > +		     (*c->ip4.ifname_out && *c->ip6.ifname_out) ? ", " : "",
> > > +		     *c->ip6.ifname_out ? c->ip6.ifname_out : "",
> > > +		     *c->ip6.ifname_out ? " (IPv6)" : "");
> > > +	}
> > > +
> > > +	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ||
> > > +	    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) {
> > > +		info("Outbound address: %s%s%s",
> > > +		     IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ? "" :
> > > +		     inet_ntop(AF_INET, &c->ip4.addr_out, buf4, sizeof(buf4)),
> > > +		     (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
> > > +		      !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) ? ", " : "",
> > > +		     IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) ? "" :
> > > +		     inet_ntop(AF_INET6, &c->ip6.addr_out, buf6, sizeof(buf6)));
> > > +	}
> > > +
> > >  	if (c->mode == MODE_PASTA)
> > >  		info("Namespace interface: %s", c->pasta_ifn);
> > >  
> > > @@ -948,8 +975,6 @@ static void conf_print(const struct ctx *c)
> > >  	}
> > >  
> > >  	if (c->ifi6) {
> > > -		char buf6[INET6_ADDRSTRLEN];
> > > -
> > >  		if (!c->no_ndp && !c->no_dhcpv6)
> > >  			info("NDP/DHCPv6:");
> > >  		else if (!c->no_ndp)
> > > @@ -1125,6 +1150,7 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  		{"mac-addr",	required_argument,	NULL,		'M' },
> > >  		{"gateway",	required_argument,	NULL,		'g' },
> > >  		{"interface",	required_argument,	NULL,		'i' },
> > > +		{"outbound",	required_argument,	NULL,		'o' },
> > >  		{"dns",		required_argument,	NULL,		'D' },
> > >  		{"search",	required_argument,	NULL,		'S' },
> > >  		{"no-tcp",	no_argument,		&c->no_tcp,	1 },
> > > @@ -1157,6 +1183,8 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  		{"runas",	required_argument,	NULL,		12 },
> > >  		{"log-size",	required_argument,	NULL,		13 },
> > >  		{"version",	no_argument,		NULL,		14 },
> > > +		{"outbound-if4", required_argument,	NULL,		15 },
> > > +		{"outbound-if6", required_argument,	NULL,		16 },
> > >  		{ 0 },
> > >  	};
> > >  	struct get_bound_ports_ns_arg ns_ports_arg = { .c = c };
> > > @@ -1175,9 +1203,9 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  
> > >  	if (c->mode == MODE_PASTA) {
> > >  		c->no_dhcp_dns = c->no_dhcp_dns_search = 1;
> > > -		optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:";
> > > +		optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:o:D:S:46t:u:T:U:";
> > >  	} else {
> > > -		optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:D:S:461t:u:";
> > > +		optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:o:D:S:461t:u:";
> > >  	}
> > >  
> > >  	c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = 0;
> > > @@ -1295,6 +1323,26 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  				c->mode == MODE_PASST ? "passt " : "pasta ");
> > >  			fprintf(stdout, VERSION_BLOB);
> > >  			exit(EXIT_SUCCESS);
> > > +		case 15:
> > > +			if (*c->ip4.ifname_out)
> > > +				die("Redundant outbound interface: %s", optarg);
> > > +
> > > +			ret = snprintf(c->ip4.ifname_out,
> > > +				       sizeof(c->ip4.ifname_out), "%s", optarg);
> > > +			if (ret <= 0 || ret >= (int)sizeof(c->ip4.ifname_out))
> > > +				die("Invalid interface name: %s", optarg);
> > > +
> > > +			break;
> > > +		case 16:
> > > +			if (*c->ip6.ifname_out)
> > > +				die("Redundant outbound interface: %s", optarg);
> > > +
> > > +			ret = snprintf(c->ip6.ifname_out,
> > > +				       sizeof(c->ip6.ifname_out), "%s", optarg);
> > > +			if (ret <= 0 || ret >= (int)sizeof(c->ip6.ifname_out))
> > > +				die("Invalid interface name: %s", optarg);
> > > +
> > > +			break;
> > >  		case 'd':
> > >  			if (c->debug)
> > >  				die("Multiple --debug options given");
> > > @@ -1466,6 +1514,26 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  				die("Invalid interface name %s: %s", optarg,
> > >  				    strerror(errno));
> > >  			break;
> > > +		case 'o':
> > > +			if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)	  &&
> > > +			    inet_pton(AF_INET6, optarg, &c->ip6.addr_out) &&
> > > +			    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)	  &&
> > > +			    !IN6_IS_ADDR_LOOPBACK(&c->ip6.addr_out)	  &&
> > > +			    !IN6_IS_ADDR_V4MAPPED(&c->ip6.addr_out)	  &&
> > > +			    !IN6_IS_ADDR_V4COMPAT(&c->ip6.addr_out)	  &&
> > > +			    !IN6_IS_ADDR_MULTICAST(&c->ip6.addr_out))
> > > +				break;
> > > +
> > > +			if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)	 &&
> > > +			    inet_pton(AF_INET, optarg, &c->ip4.addr_out) &&
> > > +			    !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)	 &&
> > > +			    !IN4_IS_ADDR_BROADCAST(&c->ip4.addr_out)	 &&
> > > +			    !IN4_IS_ADDR_MULTICAST(&c->ip4.addr_out))
> > > +				break;
> > > +
> > > +			die("Invalid or redundant outbound address: %s",
> > > +			    optarg);
> > > +			break;  
> > 
> > I haven't really looked at any of the conf stuff above, since it will
> > change if we alter the option semantics as I've suggested.
> > 
> > >  		case 'D':
> > >  			if (!strcmp(optarg, "none")) {
> > >  				if (c->no_dns)
> > > diff --git a/icmp.c b/icmp.c
> > > index b842fa8..ddf83f8 100644
> > > --- a/icmp.c
> > > +++ b/icmp.c
> > > @@ -170,8 +170,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr,
> > >  		iref.icmp.id = id = ntohs(ih->un.echo.id);
> > >  
> > >  		if ((s = icmp_id_map[V4][id].sock) <= 0) {
> > > -			s = sock_l4(c, AF_INET, IPPROTO_ICMP, NULL, NULL, id,
> > > -				    iref.u32);
> > > +			const struct in_addr *bind_addr = NULL;
> > > +			const char *bind_if;
> > > +
> > > +			bind_if = *c->ip4.ifname_out ? c->ip4.ifname_out : NULL;
> > > +
> > > +			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out))
> > > +				bind_addr = &c->ip4.addr_out;
> > > +
> > > +			s = sock_l4(c, AF_INET, IPPROTO_ICMP, bind_addr,
> > > +				    bind_if, id, iref.u32);  
> > 
> > LGTM, modulo field names - again I think we should position "outbound
> > address" as simply "the address", with "DHCP advertised address" as
> > the special case override.
> 
> But from passt/pasta's perspective, "the address" is already what's
> expected to be used by the guest/container. We use that all over the
> place, and I think it might really need to be different from an address
> we "externally" bind to.

Maybe occasionally, but not in the normal case.  So, I'm saying the
existing option should set *both*.

> I really wouldn't like to see users coming up with netfilter-based hacks
> to support this. :/
> 
> > >  			if (s < 0)
> > >  				goto fail_sock;
> > >  			if (s > SOCKET_MAX) {
> > > @@ -216,8 +224,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr,
> > >  
> > >  		iref.icmp.id = id = ntohs(ih->icmp6_identifier);
> > >  		if ((s = icmp_id_map[V6][id].sock) <= 0) {
> > > -			s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, NULL, NULL, id,
> > > -				    iref.u32);
> > > +			const struct in6_addr *bind_addr = NULL;
> > > +			const char *bind_if;
> > > +
> > > +			bind_if = *c->ip6.ifname_out ? c->ip6.ifname_out : NULL;
> > > +
> > > +			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out))
> > > +				bind_addr = &c->ip6.addr_out;
> > > +
> > > +			s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, bind_addr,
> > > +				    bind_if, id, iref.u32);  
> > 
> > Gitto.
> 
> Interesting typo. :)

Heh.

> 
> > >  			if (s < 0)
> > >  				goto fail_sock;
> > >  			if (s > SOCKET_MAX) {
> > > diff --git a/passt.1 b/passt.1
> > > index f317c33..5c97f30 100644
> > > --- a/passt.1
> > > +++ b/passt.1
> > > @@ -183,6 +183,25 @@ Use host interface \fIname\fR to derive addresses and routes.
> > >  Default is to use the interfaces with the first default routes for each IP
> > >  version.
> > >  
> > > +.TP
> > > +.BR \-o ", " \-\-outbound " " \fIaddr
> > > +Use an IPv4 \fIaddr\fR as source address for IPv4 outbound TCP connections, UDP
> > > +flows, ICMP requests, or an IPv6 \fIaddr\fR for IPv6 ones, by binding outbound
> > > +sockets to it.
> > > +This option can be specified zero (for defaults) to two times (once for IPv4,
> > > +once for IPv6).
> > > +By default, the source address is selected by the routing tables.
> > > +
> > > +.TP
> > > +.BR \-\-outbound-if4 " " \fIname
> > > +Bind IPv4 outbound sockets to host interface \fIname\fR.
> > > +By default, the interface given by the default route is selected.
> > > +
> > > +.TP
> > > +.BR \-\-outbound-if6 " " \fIname
> > > +Bind IPv6 outbound sockets to host interface \fIname\fR.
> > > +By default, the interface given by the default route is selected.
> > > +
> > >  .TP
> > >  .BR \-D ", " \-\-dns " " \fIaddr
> > >  Use \fIaddr\fR (IPv4 or IPv6) for DHCP, DHCPv6, NDP or DNS forwarding, as
> > > diff --git a/passt.h b/passt.h
> > > index cc60c84..b73f4ff 100644
> > > --- a/passt.h
> > > +++ b/passt.h
> > > @@ -106,6 +106,8 @@ enum passt_modes {
> > >   * @dns:		DNS addresses for DHCP, zero-terminated, network order
> > >   * @dns_match:		Forward DNS query if sent to this address, network order
> > >   * @dns_host:		Use this DNS on the host for forwarding, network order
> > > + * @addr_out:		Optional source address for outbound traffic
> > > + * @ifname_out:		Optional interface name to bind outbound sockets to
> > >   */
> > >  struct ip4_ctx {
> > >  	struct in_addr addr;
> > > @@ -115,6 +117,9 @@ struct ip4_ctx {
> > >  	struct in_addr dns[MAXNS + 1];
> > >  	struct in_addr dns_match;
> > >  	struct in_addr dns_host;
> > > +
> > > +	struct in_addr addr_out;
> > > +	char ifname_out[IFNAMSIZ];  
> > 
> > Again, I think we should just use ip4_ctx::addr as the outbound
> > address, with an additional "dhcp_addr" if we must.
> > 
> > >  };
> > >  
> > >  /**
> > > @@ -127,6 +132,8 @@ struct ip4_ctx {
> > >   * @dns:		DNS addresses for DHCPv6 and NDP, zero-terminated
> > >   * @dns_match:		Forward DNS query if sent to this address
> > >   * @dns_host:		Use this DNS on the host for forwarding
> > > + * @addr_out:		Optional source address for outbound traffic
> > > + * @ifname_out:		Optional interface name to bind outbound sockets to
> > >   */
> > >  struct ip6_ctx {
> > >  	struct in6_addr addr;
> > > @@ -137,6 +144,9 @@ struct ip6_ctx {
> > >  	struct in6_addr dns[MAXNS + 1];
> > >  	struct in6_addr dns_match;
> > >  	struct in6_addr dns_host;
> > > +
> > > +	struct in6_addr addr_out;
> > > +	char ifname_out[IFNAMSIZ];  
> > 
> > Ditto.
> > 
> > >  };
> > >  
> > >  #include <netinet/if_ether.h>
> > > diff --git a/tcp.c b/tcp.c
> > > index b674311..8e8d653 100644
> > > --- a/tcp.c
> > > +++ b/tcp.c
> > > @@ -1946,6 +1946,61 @@ static uint16_t tcp_conn_tap_mss(const struct ctx *c,
> > >  	return MIN(mss, USHRT_MAX);
> > >  }
> > >  
> > > +/**
> > > + * tcp_bind_outbound() - Bind socket to outbound address and interface if given
> > > + * @c:		Execution context
> > > + * @s:		Outbound TCP socket
> > > + * @af:		Address family
> > > + */
> > > +static void tcp_bind_outbound(const struct ctx *c, int s, sa_family_t af)
> > > +{
> > > +	if (af == AF_INET) {
> > > +		if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)) {
> > > +			struct sockaddr_in addr4 = {
> > > +				.sin_family = AF_INET,
> > > +				.sin_port = 0,
> > > +				.sin_addr = c->ip4.addr_out,
> > > +			};
> > > +
> > > +			if (bind(s, (struct sockaddr *)&addr4, sizeof(addr4))) {
> > > +				debug("Can't bind IPv4 TCP socket address: %s",
> > > +				      strerror(errno));
> > > +			}
> > > +		}
> > > +
> > > +		if (*c->ip4.ifname_out) {
> > > +			if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE,
> > > +				       c->ip4.ifname_out,
> > > +				       strlen(c->ip4.ifname_out))) {
> > > +				debug("Can't bind IPv4 TCP socket to interface:"
> > > +				      " %s", strerror(errno));
> > > +			}
> > > +		}
> > > +	} else if (af == AF_INET6) {
> > > +		if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) {
> > > +			struct sockaddr_in6 addr6 = {
> > > +				.sin6_family = AF_INET6,
> > > +				.sin6_port = 0,
> > > +				.sin6_addr = c->ip6.addr_out,
> > > +			};
> > > +
> > > +			if (bind(s, (struct sockaddr *)&addr6, sizeof(addr6))) {
> > > +				debug("Can't bind IPv6 TCP socket address: %s",
> > > +				      strerror(errno));
> > > +			}
> > > +		}
> > > +
> > > +		if (*c->ip6.ifname_out) {
> > > +			if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE,
> > > +				       c->ip6.ifname_out,
> > > +				       strlen(c->ip6.ifname_out))) {
> > > +				debug("Can't bind IPv6 TCP socket to interface:"
> > > +				      " %s", strerror(errno));
> > > +			}
> > > +		}
> > > +	}
> > > +}
> > > +
> > >  /**
> > >   * tcp_conn_from_tap() - Handle connection request (SYN segment) from tap
> > >   * @c:		Execution context
> > > @@ -2052,6 +2107,11 @@ static void tcp_conn_from_tap(struct ctx *c, int af, const void *addr,
> > >  	if (errno != EADDRNOTAVAIL && errno != EACCES)
> > >  		conn_flag(c, conn, LOCAL);
> > >  
> > > +	if ((af == AF_INET &&  !IN4_IS_ADDR_LOOPBACK(&addr4.sin_addr)) ||
> > > +	    (af == AF_INET6 && !IN6_IS_ADDR_LOOPBACK(&addr6.sin6_addr) &&
> > > +			       !IN6_IS_ADDR_LINKLOCAL(&addr6.sin6_addr)))
> > > +		tcp_bind_outbound(c, s, af);  
> > 
> > If we have an IPv6 link-local, shouldn't we still do the
> > SO_BINDTODEVICE, even though we don't want the bind()?
> 
> If an outbound connection uses a link-local address, we use c->ifi6 as
> sin6_scope_id (see tcp_conn_from_tap()) -- we need one and it's a
> reasonable assumption to use that as that's where we source the first
> hop/default gateway from.
> 
> But with SO_BINDTODEVICE the kernel resolves the interface name, which
> avoids 1. a netlink exchange on every new connection and 2. the need
> for the interface to be present at any given time. So I'm not sure we
> can actually check that they match in a safe way -- and if there's any
> advantage in doing so.

You mean that the link-local address the guest is sending to matches
the interface we're bound to?  No I don't think we can check that as
such either.  But the guest presumably got the idea it can use that
address from somewhere.  If we're bound to a specific outbound
interface I think we should not allow link-local traffic from any
other interface to reach the guest.  If the guest just makes up a
bogus link-local address, that's kind of its problem.

> I tend to see the link-local case as completely separate from any
> "outbound" consideration, because that's where we make the
> guest/container think they're actually talking to a gateway, not to the
> "outside world".

Uh.. maybe.  We kind of need to choose, is the link-local scope
covering just the passt<->guest link, or does the guest have a view to
(one of) the host's link-local scope.  At the moment, I'm not sure
we're really consistent about how we treat this.

> > I also wonder if we should allow setting our outbound link-local IPv6
> > address, since AIUI there could be multiple on the host.  e.g. allow
> > -a to be given three times: one for IPv4, one for IPv6 global and one
> > for IPv6 link-local.
> 
> Hmm, yes, I guess we should at some point.
> 
> > >  	if (connect(s, sa, sl)) {
> > >  		if (errno != EINPROGRESS) {
> > >  			tcp_rst(c, conn);
> > > diff --git a/udp.c b/udp.c
> > > index b7cbfdc..99cfc9f 100644
> > > --- a/udp.c
> > > +++ b/udp.c
> > > @@ -843,20 +843,6 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
> > >  		sa = (struct sockaddr *)&s_in;
> > >  		sl = sizeof(s_in);
> > >  
> > > -		if (!(s = udp_tap_map[V4][src].sock)) {
> > > -			union udp_epoll_ref uref = { .udp.port = src };
> > > -
> > > -			s = sock_l4(c, AF_INET, IPPROTO_UDP, NULL, NULL, src,
> > > -				    uref.u32);
> > > -			if (s < 0)
> > > -				return p->count;
> > > -
> > > -			udp_tap_map[V4][src].sock = s;
> > > -			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
> > > -		}
> > > -
> > > -		udp_tap_map[V4][src].ts = now->tv_sec;
> > > -
> > >  		if (IN4_ARE_ADDR_EQUAL(&s_in.sin_addr, &c->ip4.dns_match) &&
> > >  		    ntohs(s_in.sin_port) == 53) {
> > >  			s_in.sin_addr = c->ip4.dns_host;
> > > @@ -868,13 +854,37 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
> > >  			else
> > >  				s_in.sin_addr = c->ip4.addr_seen;
> > >  		}
> > > +
> > > +		if (!(s = udp_tap_map[V4][src].sock)) {
> > > +			union udp_epoll_ref uref = { .udp.port = src };
> > > +			in_addr_t bind_addr = { 0 };
> > > +			const char *bind_if = NULL;
> > > +
> > > +			if (!IN6_IS_ADDR_LOOPBACK(&s_in.sin_addr) &&
> > > +			    *c->ip6.ifname_out)
> > > +				bind_if = c->ip6.ifname_out;
> > > +
> > > +			if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) &&
> > > +			    !IN4_IS_ADDR_LOOPBACK(&s_in.sin_addr))
> > > +				bind_addr = c->ip4.addr_out.s_addr;
> > > +
> > > +			s = sock_l4(c, AF_INET, IPPROTO_UDP, &bind_addr,
> > > +				    bind_if, src, uref.u32);
> > > +			if (s < 0)
> > > +				return p->count;
> > > +
> > > +			udp_tap_map[V4][src].sock = s;
> > > +			bitmap_set(udp_act[V4][UDP_ACT_TAP], src);
> > > +		}
> > > +
> > > +		udp_tap_map[V4][src].ts = now->tv_sec;
> > >  	} else {
> > >  		s_in6 = (struct sockaddr_in6) {
> > >  			.sin6_family = AF_INET6,
> > >  			.sin6_port = uh->dest,
> > >  			.sin6_addr = *(struct in6_addr *)addr,
> > >  		};
> > > -		const void *bind_addr = &in6addr_any;
> > > +		const struct in6_addr *bind_addr = &in6addr_any;
> > >  
> > >  		sa = (struct sockaddr *)&s_in6;
> > >  		sl = sizeof(s_in6);
> > > @@ -898,9 +908,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
> > >  		if (!(s = udp_tap_map[V6][src].sock)) {
> > >  			union udp_epoll_ref uref = { .udp.v6 = 1,
> > >  						     .udp.port = src };
> > > +			const char *bind_if = NULL;
> > > +
> > > +			if (!IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
> > > +			    *c->ip6.ifname_out)
> > > +				bind_if = c->ip6.ifname_out;
> > > +
> > > +			if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) &&
> > > +			    !IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) &&
> > > +			    !IN6_IS_ADDR_LINKLOCAL(&s_in6.sin6_addr))
> > > +				bind_addr = &c->ip6.addr_out;
> > >  
> > > -			s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, NULL,
> > > -				    src, uref.u32);
> > > +			s = sock_l4(c, AF_INET6, IPPROTO_UDP, &bind_addr,  
> > 
> > bind_addr is already a pointer, I don't think you want the '&' here.
> 
> Okay, at least this is fixed. :)
> 
> > > +				    bind_if, src, uref.u32);
> > >  			if (s < 0)
> > >  				return p->count;
> > >    
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface
  2023-03-09  0:18       ` David Gibson
@ 2023-03-09  2:41         ` Stefano Brivio
  0 siblings, 0 replies; 8+ messages in thread
From: Stefano Brivio @ 2023-03-09  2:41 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Thu, 9 Mar 2023 11:18:42 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Mar 09, 2023 at 12:33:20AM +0100, Stefano Brivio wrote:
> > On Thu, 9 Mar 2023 09:02:57 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Wed, Mar 08, 2023 at 08:34:49AM +0100, Stefano Brivio wrote:  
> > > > I didn't notice earlier: libslirp (and slirp4netns) supports binding
> > > > outbound sockets to specific IPv4 and IPv6 addresses, to force the
> > > > source addresse selection. If we want to claim feature parity, we
> > > > should implement that as well.
> > > > 
> > > > Further, Podman supports specifying outbound interfaces as well, but
> > > > this is simply done by resolving the primary address for an interface
> > > > when the network back-end is started. However, since kernel version
> > > > 5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for
> > > > non-root users"), we can actually bind to a specific interface name,
> > > > which doesn't need to be validated in advance.
> > > > 
> > > > Implement -o / --outbound ADDR to bind to IPv4 and IPv6 addresses,
> > > > and --outbound-ip4 and --outbound-ip6 to bind IPv4 and IPv6 sockets
> > > > to given interfaces.    
> > > 
> > > You have 'outbound-ip' here but 'outbound-if' in the code, I think you
> > > intended the latter.  
> > 
> > Oops, right.
> >   
> > > > For ICMP and UDP, we call sock_l4() to open outbound sockets, as we
> > > > already needed to bind to given ports or echo identifiers, and we
> > > > can bind() a socket only once: there, pass address (if any) and
> > > > interface (if any) for the existing bind() and setsockopt() calls.
> > > > 
> > > > For TCP, in general, we wouldn't otherwise bind sockets. Add a
> > > > specific helper to do that.
> > > > 
> > > > For UDP outbound sockets, we need to know if the final destination
> > > > of the socket is a loopback address, before we decide whether it
> > > > makes sense to bind the socket at all: move the block mangling the
> > > > address destination before the creation of the socket in the IPv4
> > > > path. This was already the case for the IPv6 path.
> > > > 
> > > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > > > ---
> > > >  conf.c  | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
> > > >  icmp.c  | 24 +++++++++++++++---
> > > >  passt.1 | 19 ++++++++++++++
> > > >  passt.h | 10 ++++++++
> > > >  tcp.c   | 60 ++++++++++++++++++++++++++++++++++++++++++++
> > > >  udp.c   | 54 ++++++++++++++++++++++++++-------------
> > > >  6 files changed, 219 insertions(+), 26 deletions(-)
> > > > 
> > > > diff --git a/conf.c b/conf.c
> > > > index 3aa3314..15506ec 100644
> > > > --- a/conf.c
> > > > +++ b/conf.c
> > > > @@ -776,6 +776,13 @@ static void usage(const char *name)
> > > >  	info(   "    default: gateway from interface with default route");
> > > >  	info(   "  -i, --interface NAME	Interface for addresses and routes");
> > > >  	info(   "    default: interface with first default route");    
> > > 
> > > So, I think the outbound IP and the IP we advertise to the guest
> > > should be the same.  Certainly by default, and I'm not sure I can even
> > > think of any case where it would be useful for them to be different.
> > > That fits with the "only NAT when we really have to" goal.  
> > 
> > Sure, that's a goal, but users might want to do NAT for whatever
> > reason, even just for slirp4netns compatibility (which we should
> > *really* support to play along nicely with Podman).
> > 
> > At the moment, it's already enough to pass '-a 10.200.0.2' and the
> > outbound address will be (in general) different from what we advertise.  
> 
> Yes, I know, but I'm saying I think that's kind of undesirable.  The
> more prominent option should change both outbound and advertised
> address to match.  Separately overriding the advertised address should
> be the special case.

So, summing up from offline discussion: ideally, yes, but we don't
really want to break "-a <private address>" at this point. This covers
a few point below, and I'll answer/summarise the rest later, except for
this:

> Normal mode:
> 	-a sets both the outbound and guest address
> 	--if4 and --if6 set the outbound interface.  If there's no -a,
> 	both outbound and guest address are derived from this as well.

...which is pretty much done in v4 -- the rest can most likely be
addressed later on without this series being a substantial obstacle.

-- 
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-03-09  2:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-08  7:34 [PATCH 0/2] Implement explicit outbound address and interface selection Stefano Brivio
2023-03-08  7:34 ` [PATCH 1/2] conf, passt.h: Rename "outbound" interface to "template" interface Stefano Brivio
2023-03-08 21:05   ` David Gibson
2023-03-08  7:34 ` [PATCH 2/2] conf, icmp, tcp, udp: Add options to bind to outbound address and interface Stefano Brivio
2023-03-08 22:02   ` David Gibson
2023-03-08 23:33     ` Stefano Brivio
2023-03-09  0:18       ` David Gibson
2023-03-09  2:41         ` Stefano Brivio

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).