public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH 0/3] Probe host's ephemeral ports, rather than using RFC values
@ 2024-08-28  5:56 David Gibson
  2024-08-28  5:56 ` [PATCH 1/3] conf, fwd: Make ephemeral port logic more flexible David Gibson
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: David Gibson @ 2024-08-28  5:56 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

As discussed on our recent call, this implements basing which ports we
consider "ephemeral" on probing the host's settings, rather than just
assuming the RFC 6335 recommended values, which are not what Linux
uses by default.

I think this is more correct, but additionally using the Linux values
means we consider more ports ephemeral, reducing kernel memory
consumption for -t all -u all.

David Gibson (3):
  conf, fwd: Make ephemeral port logic more flexible
  conf, fwd: Don't attempt to forward port 0
  fwd, conf: Probe host's ephemeral ports

 conf.c | 19 ++++++++++++----
 fwd.c  | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fwd.h  |  3 +++
 util.h |  3 ---
 4 files changed, 88 insertions(+), 7 deletions(-)

-- 
2.46.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] conf, fwd: Make ephemeral port logic more flexible
  2024-08-28  5:56 [PATCH 0/3] Probe host's ephemeral ports, rather than using RFC values David Gibson
@ 2024-08-28  5:56 ` David Gibson
  2024-08-28 10:01   ` Laurent Vivier
  2024-08-28  5:56 ` [PATCH 2/3] conf, fwd: Don't attempt to forward port 0 David Gibson
  2024-08-28  5:56 ` [PATCH 3/3] fwd, conf: Probe host's ephemeral ports David Gibson
  2 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2024-08-28  5:56 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

"Ephemeral" ports are those which the kernel may allocate as local
port numbers for outgoing connections or datagrams.  Because of that,
they're generally not good choices for listening servers to bind to.

Thefore when using -t all, -u all or exclude-only ranges, we map only
non-ephemeral ports.  Our logic for this is a bit rigid though: we
assume the ephemeral ports are always a fixed range at the top of the
port number space.  We also assume PORT_EPHEMERAL_MIN is a multiple of
8, or we won't set the forward bitmap correctly.

Make the logic in conf.c more flexible, using a helper moved into
fwd.[ch], although we don't change which ports we consider ephemeral
(yet).

The new handling is undoubtedly more computationally expensive, but
since it's a once-off operation at start off, I don't think it really
matters.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 conf.c | 12 ++++++++----
 fwd.c  | 17 +++++++++++++++++
 fwd.h  |  2 ++
 util.h |  3 ---
 4 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/conf.c b/conf.c
index e29b6a92..6b3dafd5 100644
--- a/conf.c
+++ b/conf.c
@@ -156,9 +156,12 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
 			die("'all' port forwarding is only allowed for passt");
 
 		fwd->mode = FWD_ALL;
-		memset(fwd->map, 0xff, PORT_EPHEMERAL_MIN / 8);
 
-		for (i = 0; i < PORT_EPHEMERAL_MIN; i++) {
+		for (i = 0; i < NUM_PORTS; i++) {
+			if (fwd_port_is_ephemeral(i))
+				continue;
+
+			bitmap_set(fwd->map, i);
 			if (optname == 't') {
 				ret = tcp_sock_init(c, AF_UNSPEC, NULL, NULL,
 						    i);
@@ -259,8 +262,9 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
 	} while ((p = next_chunk(p, ',')));
 
 	if (exclude_only) {
-		for (i = 0; i < PORT_EPHEMERAL_MIN; i++) {
-			if (bitmap_isset(exclude, i))
+		for (i = 0; i < NUM_PORTS; i++) {
+			if (fwd_port_is_ephemeral(i) ||
+			    bitmap_isset(exclude, i))
 				continue;
 
 			bitmap_set(fwd->map, i);
diff --git a/fwd.c b/fwd.c
index 2a0452fa..adf61cb5 100644
--- a/fwd.c
+++ b/fwd.c
@@ -27,6 +27,23 @@
 #include "lineread.h"
 #include "flow_table.h"
 
+/* Empheral port range: values from RFC 6335 */
+static const uint16_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
+static const uint16_t fwd_ephemeral_max = NUM_PORTS - 1;
+
+/**
+ * fwd_port_is_ephemeral() - Is port number ephemeral?
+ * @port:	Port number
+ *
+ * Return: true if @port is ephemeral, that is may be allocated by the kernel as
+ *         a local port for outgoing connections or datagrams, but should not be
+ *         used for binding services to.
+ */
+bool fwd_port_is_ephemeral(uint16_t port)
+{
+	return (port >= fwd_ephemeral_min) && (port <= fwd_ephemeral_max);
+}
+
 /* See enum in kernel's include/net/tcp_states.h */
 #define UDP_LISTEN	0x07
 #define TCP_LISTEN	0x0a
diff --git a/fwd.h b/fwd.h
index b4aa8d57..42fe57eb 100644
--- a/fwd.h
+++ b/fwd.h
@@ -12,6 +12,8 @@ struct flowside;
 /* Number of ports for both TCP and UDP */
 #define	NUM_PORTS	(1U << 16)
 
+bool fwd_port_is_ephemeral(uint16_t port);
+
 enum fwd_ports_mode {
 	FWD_UNSET = 0,
 	FWD_SPEC = 1,
diff --git a/util.h b/util.h
index 1463c921..c7a59d5d 100644
--- a/util.h
+++ b/util.h
@@ -95,9 +95,6 @@
 #define FD_PROTO(x, proto)						\
 	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
 
-#define PORT_EPHEMERAL_MIN	((1 << 15) + (1 << 14))		/* RFC 6335 */
-#define PORT_IS_EPHEMERAL(port) ((port) >= PORT_EPHEMERAL_MIN)
-
 #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
 #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
 
-- 
@@ -95,9 +95,6 @@
 #define FD_PROTO(x, proto)						\
 	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
 
-#define PORT_EPHEMERAL_MIN	((1 << 15) + (1 << 14))		/* RFC 6335 */
-#define PORT_IS_EPHEMERAL(port) ((port) >= PORT_EPHEMERAL_MIN)
-
 #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
 #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
 
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] conf, fwd: Don't attempt to forward port 0
  2024-08-28  5:56 [PATCH 0/3] Probe host's ephemeral ports, rather than using RFC values David Gibson
  2024-08-28  5:56 ` [PATCH 1/3] conf, fwd: Make ephemeral port logic more flexible David Gibson
@ 2024-08-28  5:56 ` David Gibson
  2024-08-28 10:03   ` Laurent Vivier
  2024-08-28  5:56 ` [PATCH 3/3] fwd, conf: Probe host's ephemeral ports David Gibson
  2 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2024-08-28  5:56 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

When using -t all, -u all or exclude-only ranges, we'll attempt to forward
all non-ephemeral port numbers, including port 0.  However, this won't work
as intended: bind() treats a zero port not as literal port 0, but as
"pick a port for me".  Because of the special meaning of port 0, we mostly
outright exclude it in our handling.

Do the same for setting up forwards, not attempting to forward for port 0.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 conf.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/conf.c b/conf.c
index 6b3dafd5..3eb117ff 100644
--- a/conf.c
+++ b/conf.c
@@ -157,7 +157,10 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
 
 		fwd->mode = FWD_ALL;
 
-		for (i = 0; i < NUM_PORTS; i++) {
+		/* Skip port 0.  It has special meaning for many socket APIs, so
+		 * trying to bind it is not really safe.
+		 */
+		for (i = 1; i < NUM_PORTS; i++) {
 			if (fwd_port_is_ephemeral(i))
 				continue;
 
@@ -262,7 +265,10 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
 	} while ((p = next_chunk(p, ',')));
 
 	if (exclude_only) {
-		for (i = 0; i < NUM_PORTS; i++) {
+		/* Skip port 0.  It has special meaning for many socket APIs, so
+		 * trying to bind it is not really safe.
+		 */
+		for (i = 1; i < NUM_PORTS; i++) {
 			if (fwd_port_is_ephemeral(i) ||
 			    bitmap_isset(exclude, i))
 				continue;
-- 
@@ -157,7 +157,10 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
 
 		fwd->mode = FWD_ALL;
 
-		for (i = 0; i < NUM_PORTS; i++) {
+		/* Skip port 0.  It has special meaning for many socket APIs, so
+		 * trying to bind it is not really safe.
+		 */
+		for (i = 1; i < NUM_PORTS; i++) {
 			if (fwd_port_is_ephemeral(i))
 				continue;
 
@@ -262,7 +265,10 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
 	} while ((p = next_chunk(p, ',')));
 
 	if (exclude_only) {
-		for (i = 0; i < NUM_PORTS; i++) {
+		/* Skip port 0.  It has special meaning for many socket APIs, so
+		 * trying to bind it is not really safe.
+		 */
+		for (i = 1; i < NUM_PORTS; i++) {
 			if (fwd_port_is_ephemeral(i) ||
 			    bitmap_isset(exclude, i))
 				continue;
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] fwd, conf: Probe host's ephemeral ports
  2024-08-28  5:56 [PATCH 0/3] Probe host's ephemeral ports, rather than using RFC values David Gibson
  2024-08-28  5:56 ` [PATCH 1/3] conf, fwd: Make ephemeral port logic more flexible David Gibson
  2024-08-28  5:56 ` [PATCH 2/3] conf, fwd: Don't attempt to forward port 0 David Gibson
@ 2024-08-28  5:56 ` David Gibson
  2024-08-28 10:22   ` Laurent Vivier
  2 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2024-08-28  5:56 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

When we forward "all" ports (-t all or -u all), or use an exclude-only
range, we don't actually forward *all* ports - that wouln't leave local
ports to use for outgoing connections.  Rather we forward all non-ephemeral
ports - those that won't be used for outgoing connections or datagrams.

Currently we assume the range of ephemeral ports is that recommended by
RFC 6335, 49152-65535.  However, that's not the range used by default on
Linux, 32768-60999 but configurable with the net.ipv4.ip_local_port_range
sysctl.

We can't really know what range the guest will consider ephemeral, but if
it differs too much from the host it's likely to cause problems we can't
avoid anyway.  So, using the host's ephemeral range is a better guess than
using the RFC 6335 range.

Therefore, add logic to probe the host's ephemeral range, falling back to
the RFC 6335 range if that fails.  This has the bonus advantage of
reducing the number of ports bound by -t all, -u all on most Linux machines
thereby reducing kernel memory usage.  Specifically this reduces kernel
memory usage with -t all, -u all from ~380MiB to ~289MiB.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 conf.c |  1 +
 fwd.c  | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 fwd.h  |  1 +
 3 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/conf.c b/conf.c
index 3eb117ff..b2758864 100644
--- a/conf.c
+++ b/conf.c
@@ -1721,6 +1721,7 @@ void conf(struct ctx *c, int argc, char **argv)
 	/* Inbound port options & DNS can be parsed now (after IPv4/IPv6
 	 * settings)
 	 */
+	fwd_probe_ephemeral();
 	udp_portmap_clear();
 	optind = 0;
 	do {
diff --git a/fwd.c b/fwd.c
index adf61cb5..40f556e9 100644
--- a/fwd.c
+++ b/fwd.c
@@ -28,8 +28,61 @@
 #include "flow_table.h"
 
 /* Empheral port range: values from RFC 6335 */
-static const uint16_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
-static const uint16_t fwd_ephemeral_max = NUM_PORTS - 1;
+static uint16_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
+static uint16_t fwd_ephemeral_max = NUM_PORTS - 1;
+
+#define PORT_RANGE_SYSCTL	"/proc/sys/net/ipv4/ip_local_port_range"
+
+/** fwd_probe_ephemeral() - Determine what ports this host considers ephemeral
+ *
+ * Work out what ports the host thinks are emphemeral and record it for later
+ * use by fwd_port_is_ephemeral().  If we're unable to probe, assume the range
+ * recommended by RFC 6335.
+ */
+void fwd_probe_ephemeral(void)
+{
+	char *line, *tab, *end;
+	struct lineread lr;
+	long min, max;
+	ssize_t len;
+	int fd;
+
+	fd = open(PORT_RANGE_SYSCTL, O_RDONLY | O_CLOEXEC);
+	if (fd < 0)
+		warn_perror("Unable to open %s", PORT_RANGE_SYSCTL);
+
+	lineread_init(&lr, fd);
+	len = lineread_get(&lr, &line);
+	if (len < 0)
+		goto parse_err;
+
+	tab = strchr(line, '\t');
+	if (!tab)
+		goto parse_err;
+	*tab = '\0';
+
+	errno = 0;
+	min = strtol(line, &end, 10);
+	if (*end || errno)
+		goto parse_err;
+
+	errno = 0;
+	max = strtol(tab + 1, &end, 10);
+	if (*end || errno)
+		goto parse_err;
+
+	if (min < 0 || min >= NUM_PORTS ||
+	    max < 0 || max >= NUM_PORTS)
+		goto parse_err;
+
+	fwd_ephemeral_min = min;
+	fwd_ephemeral_max = max;
+
+	return;
+
+parse_err:
+	warn("Unable to parse %s", PORT_RANGE_SYSCTL);
+}
 
 /**
  * fwd_port_is_ephemeral() - Is port number ephemeral?
diff --git a/fwd.h b/fwd.h
index 42fe57eb..23aac5b2 100644
--- a/fwd.h
+++ b/fwd.h
@@ -12,6 +12,7 @@ struct flowside;
 /* Number of ports for both TCP and UDP */
 #define	NUM_PORTS	(1U << 16)
 
+void fwd_probe_ephemeral(void);
 bool fwd_port_is_ephemeral(uint16_t port);
 
 enum fwd_ports_mode {
-- 
@@ -12,6 +12,7 @@ struct flowside;
 /* Number of ports for both TCP and UDP */
 #define	NUM_PORTS	(1U << 16)
 
+void fwd_probe_ephemeral(void);
 bool fwd_port_is_ephemeral(uint16_t port);
 
 enum fwd_ports_mode {
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] conf, fwd: Make ephemeral port logic more flexible
  2024-08-28  5:56 ` [PATCH 1/3] conf, fwd: Make ephemeral port logic more flexible David Gibson
@ 2024-08-28 10:01   ` Laurent Vivier
  0 siblings, 0 replies; 10+ messages in thread
From: Laurent Vivier @ 2024-08-28 10:01 UTC (permalink / raw)
  To: David Gibson, passt-dev, Stefano Brivio

On 28/08/2024 07:56, David Gibson wrote:
> "Ephemeral" ports are those which the kernel may allocate as local
> port numbers for outgoing connections or datagrams.  Because of that,
> they're generally not good choices for listening servers to bind to.
> 
> Thefore when using -t all, -u all or exclude-only ranges, we map only
> non-ephemeral ports.  Our logic for this is a bit rigid though: we
> assume the ephemeral ports are always a fixed range at the top of the
> port number space.  We also assume PORT_EPHEMERAL_MIN is a multiple of
> 8, or we won't set the forward bitmap correctly.
> 
> Make the logic in conf.c more flexible, using a helper moved into
> fwd.[ch], although we don't change which ports we consider ephemeral
> (yet).
> 
> The new handling is undoubtedly more computationally expensive, but
> since it's a once-off operation at start off, I don't think it really
> matters.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>   conf.c | 12 ++++++++----
>   fwd.c  | 17 +++++++++++++++++
>   fwd.h  |  2 ++
>   util.h |  3 ---
>   4 files changed, 27 insertions(+), 7 deletions(-)
> 
> diff --git a/conf.c b/conf.c
> index e29b6a92..6b3dafd5 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -156,9 +156,12 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
>   			die("'all' port forwarding is only allowed for passt");
>   
>   		fwd->mode = FWD_ALL;
> -		memset(fwd->map, 0xff, PORT_EPHEMERAL_MIN / 8);
>   
> -		for (i = 0; i < PORT_EPHEMERAL_MIN; i++) {
> +		for (i = 0; i < NUM_PORTS; i++) {
> +			if (fwd_port_is_ephemeral(i))
> +				continue;
> +
> +			bitmap_set(fwd->map, i);
>   			if (optname == 't') {
>   				ret = tcp_sock_init(c, AF_UNSPEC, NULL, NULL,
>   						    i);
> @@ -259,8 +262,9 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
>   	} while ((p = next_chunk(p, ',')));
>   
>   	if (exclude_only) {
> -		for (i = 0; i < PORT_EPHEMERAL_MIN; i++) {
> -			if (bitmap_isset(exclude, i))
> +		for (i = 0; i < NUM_PORTS; i++) {
> +			if (fwd_port_is_ephemeral(i) ||
> +			    bitmap_isset(exclude, i))
>   				continue;
>   
>   			bitmap_set(fwd->map, i);
> diff --git a/fwd.c b/fwd.c
> index 2a0452fa..adf61cb5 100644
> --- a/fwd.c
> +++ b/fwd.c
> @@ -27,6 +27,23 @@
>   #include "lineread.h"
>   #include "flow_table.h"
>   
> +/* Empheral port range: values from RFC 6335 */
> +static const uint16_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
> +static const uint16_t fwd_ephemeral_max = NUM_PORTS - 1;
> +
> +/**
> + * fwd_port_is_ephemeral() - Is port number ephemeral?
> + * @port:	Port number
> + *
> + * Return: true if @port is ephemeral, that is may be allocated by the kernel as
> + *         a local port for outgoing connections or datagrams, but should not be
> + *         used for binding services to.
> + */
> +bool fwd_port_is_ephemeral(uint16_t port)
> +{
> +	return (port >= fwd_ephemeral_min) && (port <= fwd_ephemeral_max);
> +}
> +
>   /* See enum in kernel's include/net/tcp_states.h */
>   #define UDP_LISTEN	0x07
>   #define TCP_LISTEN	0x0a
> diff --git a/fwd.h b/fwd.h
> index b4aa8d57..42fe57eb 100644
> --- a/fwd.h
> +++ b/fwd.h
> @@ -12,6 +12,8 @@ struct flowside;
>   /* Number of ports for both TCP and UDP */
>   #define	NUM_PORTS	(1U << 16)
>   
> +bool fwd_port_is_ephemeral(uint16_t port);
> +
>   enum fwd_ports_mode {
>   	FWD_UNSET = 0,
>   	FWD_SPEC = 1,
> diff --git a/util.h b/util.h
> index 1463c921..c7a59d5d 100644
> --- a/util.h
> +++ b/util.h
> @@ -95,9 +95,6 @@
>   #define FD_PROTO(x, proto)						\
>   	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
>   
> -#define PORT_EPHEMERAL_MIN	((1 << 15) + (1 << 14))		/* RFC 6335 */
> -#define PORT_IS_EPHEMERAL(port) ((port) >= PORT_EPHEMERAL_MIN)
> -
>   #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
>   #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
>   

Reviewed-by: Laurent Vivier <lvivier@redhat.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] conf, fwd: Don't attempt to forward port 0
  2024-08-28  5:56 ` [PATCH 2/3] conf, fwd: Don't attempt to forward port 0 David Gibson
@ 2024-08-28 10:03   ` Laurent Vivier
  0 siblings, 0 replies; 10+ messages in thread
From: Laurent Vivier @ 2024-08-28 10:03 UTC (permalink / raw)
  To: David Gibson, passt-dev, Stefano Brivio

On 28/08/2024 07:56, David Gibson wrote:
> When using -t all, -u all or exclude-only ranges, we'll attempt to forward
> all non-ephemeral port numbers, including port 0.  However, this won't work
> as intended: bind() treats a zero port not as literal port 0, but as
> "pick a port for me".  Because of the special meaning of port 0, we mostly
> outright exclude it in our handling.
> 
> Do the same for setting up forwards, not attempting to forward for port 0.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>   conf.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/conf.c b/conf.c
> index 6b3dafd5..3eb117ff 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -157,7 +157,10 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
>   
>   		fwd->mode = FWD_ALL;
>   
> -		for (i = 0; i < NUM_PORTS; i++) {
> +		/* Skip port 0.  It has special meaning for many socket APIs, so
> +		 * trying to bind it is not really safe.
> +		 */
> +		for (i = 1; i < NUM_PORTS; i++) {
>   			if (fwd_port_is_ephemeral(i))
>   				continue;
>   
> @@ -262,7 +265,10 @@ static void conf_ports(const struct ctx *c, char optname, const char *optarg,
>   	} while ((p = next_chunk(p, ',')));
>   
>   	if (exclude_only) {
> -		for (i = 0; i < NUM_PORTS; i++) {
> +		/* Skip port 0.  It has special meaning for many socket APIs, so
> +		 * trying to bind it is not really safe.
> +		 */
> +		for (i = 1; i < NUM_PORTS; i++) {
>   			if (fwd_port_is_ephemeral(i) ||
>   			    bitmap_isset(exclude, i))
>   				continue;

Reviewed-by: Laurent Vivier <lvivier@redhat.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] fwd, conf: Probe host's ephemeral ports
  2024-08-28  5:56 ` [PATCH 3/3] fwd, conf: Probe host's ephemeral ports David Gibson
@ 2024-08-28 10:22   ` Laurent Vivier
  2024-08-29  1:29     ` David Gibson
  0 siblings, 1 reply; 10+ messages in thread
From: Laurent Vivier @ 2024-08-28 10:22 UTC (permalink / raw)
  To: David Gibson, passt-dev, Stefano Brivio

On 28/08/2024 07:56, David Gibson wrote:
> When we forward "all" ports (-t all or -u all), or use an exclude-only
> range, we don't actually forward *all* ports - that wouln't leave local
> ports to use for outgoing connections.  Rather we forward all non-ephemeral
> ports - those that won't be used for outgoing connections or datagrams.
> 
> Currently we assume the range of ephemeral ports is that recommended by
> RFC 6335, 49152-65535.  However, that's not the range used by default on
> Linux, 32768-60999 but configurable with the net.ipv4.ip_local_port_range
> sysctl.
> 
> We can't really know what range the guest will consider ephemeral, but if
> it differs too much from the host it's likely to cause problems we can't
> avoid anyway.  So, using the host's ephemeral range is a better guess than
> using the RFC 6335 range.
> 
> Therefore, add logic to probe the host's ephemeral range, falling back to
> the RFC 6335 range if that fails.  This has the bonus advantage of
> reducing the number of ports bound by -t all, -u all on most Linux machines
> thereby reducing kernel memory usage.  Specifically this reduces kernel
> memory usage with -t all, -u all from ~380MiB to ~289MiB.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>   conf.c |  1 +
>   fwd.c  | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>   fwd.h  |  1 +
>   3 files changed, 57 insertions(+), 2 deletions(-)
> 
> diff --git a/conf.c b/conf.c
> index 3eb117ff..b2758864 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -1721,6 +1721,7 @@ void conf(struct ctx *c, int argc, char **argv)
>   	/* Inbound port options & DNS can be parsed now (after IPv4/IPv6
>   	 * settings)
>   	 */
> +	fwd_probe_ephemeral();
>   	udp_portmap_clear();
>   	optind = 0;
>   	do {
> diff --git a/fwd.c b/fwd.c
> index adf61cb5..40f556e9 100644
> --- a/fwd.c
> +++ b/fwd.c
> @@ -28,8 +28,61 @@
>   #include "flow_table.h"
>   
>   /* Empheral port range: values from RFC 6335 */
> -static const uint16_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
> -static const uint16_t fwd_ephemeral_max = NUM_PORTS - 1;
> +static uint16_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
> +static uint16_t fwd_ephemeral_max = NUM_PORTS - 1;
> +
> +#define PORT_RANGE_SYSCTL	"/proc/sys/net/ipv4/ip_local_port_range"
> +
> +/** fwd_probe_ephemeral() - Determine what ports this host considers ephemeral
> + *
> + * Work out what ports the host thinks are emphemeral and record it for later
> + * use by fwd_port_is_ephemeral().  If we're unable to probe, assume the range
> + * recommended by RFC 6335.
> + */
> +void fwd_probe_ephemeral(void)
> +{
> +	char *line, *tab, *end;
> +	struct lineread lr;
> +	long min, max;
> +	ssize_t len;
> +	int fd;
> +
> +	fd = open(PORT_RANGE_SYSCTL, O_RDONLY | O_CLOEXEC);

Why O_CLOEXEC?
There is no close() in the function, do you rely on it to close the file descriptor?

> +	if (fd < 0)
> +		warn_perror("Unable to open %s", PORT_RANGE_SYSCTL);

goto parse_error ?

or if you add the close() in parse_error, we need a return.

> +
> +	lineread_init(&lr, fd);
> +	len = lineread_get(&lr, &line);
> +	if (len < 0)
> +		goto parse_err;
> +
> +	tab = strchr(line, '\t');
> +	if (!tab)
> +		goto parse_err;
> +	*tab = '\0';
> +
> +	errno = 0;
> +	min = strtol(line, &end, 10);
> +	if (*end || errno)
> +		goto parse_err;
> +
> +	errno = 0;
> +	max = strtol(tab + 1, &end, 10);
> +	if (*end || errno)
> +		goto parse_err;

As /proc files are well formated, why don't you use fscanf()?
Something like:

         FILE *f;

         f = fopen(PORT_RANGE_SYSCTL, "r");
	if (f == NULL) {
		warn("Unable to parse %s", PORT_RANGE_SYSCTL);
		return;
	}
         ret = fscanf(f, "%d %d", &min, &max);
         fclose(f);
         if (ret != 2)
                 goto parse_error;

Thanks,
Laurent
> +
> +	if (min < 0 || min >= NUM_PORTS ||
> +	    max < 0 || max >= NUM_PORTS)
> +		goto parse_err;
> +
> +	fwd_ephemeral_min = min;
> +	fwd_ephemeral_max = max;
> +
> +	return;
> +
> +parse_err:
> +	warn("Unable to parse %s", PORT_RANGE_SYSCTL);
> +}
>   
>   /**
>    * fwd_port_is_ephemeral() - Is port number ephemeral?
> diff --git a/fwd.h b/fwd.h
> index 42fe57eb..23aac5b2 100644
> --- a/fwd.h
> +++ b/fwd.h
> @@ -12,6 +12,7 @@ struct flowside;
>   /* Number of ports for both TCP and UDP */
>   #define	NUM_PORTS	(1U << 16)
>   
> +void fwd_probe_ephemeral(void);
>   bool fwd_port_is_ephemeral(uint16_t port);
>   
>   enum fwd_ports_mode {


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] fwd, conf: Probe host's ephemeral ports
  2024-08-28 10:22   ` Laurent Vivier
@ 2024-08-29  1:29     ` David Gibson
  2024-08-29  2:59       ` Stefano Brivio
  0 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2024-08-29  1:29 UTC (permalink / raw)
  To: Laurent Vivier; +Cc: passt-dev, Stefano Brivio

[-- Attachment #1: Type: text/plain, Size: 5487 bytes --]

On Wed, Aug 28, 2024 at 12:22:18PM +0200, Laurent Vivier wrote:
> On 28/08/2024 07:56, David Gibson wrote:
> > When we forward "all" ports (-t all or -u all), or use an exclude-only
> > range, we don't actually forward *all* ports - that wouln't leave local
> > ports to use for outgoing connections.  Rather we forward all non-ephemeral
> > ports - those that won't be used for outgoing connections or datagrams.
> > 
> > Currently we assume the range of ephemeral ports is that recommended by
> > RFC 6335, 49152-65535.  However, that's not the range used by default on
> > Linux, 32768-60999 but configurable with the net.ipv4.ip_local_port_range
> > sysctl.
> > 
> > We can't really know what range the guest will consider ephemeral, but if
> > it differs too much from the host it's likely to cause problems we can't
> > avoid anyway.  So, using the host's ephemeral range is a better guess than
> > using the RFC 6335 range.
> > 
> > Therefore, add logic to probe the host's ephemeral range, falling back to
> > the RFC 6335 range if that fails.  This has the bonus advantage of
> > reducing the number of ports bound by -t all, -u all on most Linux machines
> > thereby reducing kernel memory usage.  Specifically this reduces kernel
> > memory usage with -t all, -u all from ~380MiB to ~289MiB.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >   conf.c |  1 +
> >   fwd.c  | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> >   fwd.h  |  1 +
> >   3 files changed, 57 insertions(+), 2 deletions(-)
> > 
> > diff --git a/conf.c b/conf.c
> > index 3eb117ff..b2758864 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -1721,6 +1721,7 @@ void conf(struct ctx *c, int argc, char **argv)
> >   	/* Inbound port options & DNS can be parsed now (after IPv4/IPv6
> >   	 * settings)
> >   	 */
> > +	fwd_probe_ephemeral();
> >   	udp_portmap_clear();
> >   	optind = 0;
> >   	do {
> > diff --git a/fwd.c b/fwd.c
> > index adf61cb5..40f556e9 100644
> > --- a/fwd.c
> > +++ b/fwd.c
> > @@ -28,8 +28,61 @@
> >   #include "flow_table.h"
> >   /* Empheral port range: values from RFC 6335 */
> > -static const uint16_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
> > -static const uint16_t fwd_ephemeral_max = NUM_PORTS - 1;
> > +static uint16_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
> > +static uint16_t fwd_ephemeral_max = NUM_PORTS - 1;
> > +
> > +#define PORT_RANGE_SYSCTL	"/proc/sys/net/ipv4/ip_local_port_range"
> > +
> > +/** fwd_probe_ephemeral() - Determine what ports this host considers ephemeral
> > + *
> > + * Work out what ports the host thinks are emphemeral and record it for later
> > + * use by fwd_port_is_ephemeral().  If we're unable to probe, assume the range
> > + * recommended by RFC 6335.
> > + */
> > +void fwd_probe_ephemeral(void)
> > +{
> > +	char *line, *tab, *end;
> > +	struct lineread lr;
> > +	long min, max;
> > +	ssize_t len;
> > +	int fd;
> > +
> > +	fd = open(PORT_RANGE_SYSCTL, O_RDONLY | O_CLOEXEC);
> 
> Why O_CLOEXEC?

AIUI current security best practices recommend using O_CLOEXEC
basically always.  clang-tidy complains if it's not there.

> There is no close() in the function, do you rely on it to close the file descriptor?

No, just a very dumb oversight.

> > +	if (fd < 0)
> > +		warn_perror("Unable to open %s", PORT_RANGE_SYSCTL);
> 
> goto parse_error ?

No, this is a different error, but there should be a return.  Added.

> or if you add the close() in parse_error, we need a return.
> 
> > +
> > +	lineread_init(&lr, fd);
> > +	len = lineread_get(&lr, &line);
> > +	if (len < 0)
> > +		goto parse_err;
> > +
> > +	tab = strchr(line, '\t');
> > +	if (!tab)
> > +		goto parse_err;
> > +	*tab = '\0';
> > +
> > +	errno = 0;
> > +	min = strtol(line, &end, 10);
> > +	if (*end || errno)
> > +		goto parse_err;
> > +
> > +	errno = 0;
> > +	max = strtol(tab + 1, &end, 10);
> > +	if (*end || errno)
> > +		goto parse_err;
> 
> As /proc files are well formated, why don't you use fscanf()?
> Something like:
> 
>         FILE *f;
> 
>         f = fopen(PORT_RANGE_SYSCTL, "r");
> 	if (f == NULL) {
> 		warn("Unable to parse %s", PORT_RANGE_SYSCTL);
> 		return;
> 	}
>         ret = fscanf(f, "%d %d", &min, &max);
>         fclose(f);
>         if (ret != 2)
>                 goto parse_error;

Hm, maybe.  I never feel like I know exactly what the parse rules for
scanf() are, so I tend to avoid it.  Stefano, any thoughts?

> 
> Thanks,
> Laurent
> > +
> > +	if (min < 0 || min >= NUM_PORTS ||
> > +	    max < 0 || max >= NUM_PORTS)
> > +		goto parse_err;
> > +
> > +	fwd_ephemeral_min = min;
> > +	fwd_ephemeral_max = max;
> > +
> > +	return;
> > +
> > +parse_err:
> > +	warn("Unable to parse %s", PORT_RANGE_SYSCTL);
> > +}
> >   /**
> >    * fwd_port_is_ephemeral() - Is port number ephemeral?
> > diff --git a/fwd.h b/fwd.h
> > index 42fe57eb..23aac5b2 100644
> > --- a/fwd.h
> > +++ b/fwd.h
> > @@ -12,6 +12,7 @@ struct flowside;
> >   /* Number of ports for both TCP and UDP */
> >   #define	NUM_PORTS	(1U << 16)
> > +void fwd_probe_ephemeral(void);
> >   bool fwd_port_is_ephemeral(uint16_t port);
> >   enum fwd_ports_mode {
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] fwd, conf: Probe host's ephemeral ports
  2024-08-29  1:29     ` David Gibson
@ 2024-08-29  2:59       ` Stefano Brivio
  2024-08-29  4:14         ` David Gibson
  0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2024-08-29  2:59 UTC (permalink / raw)
  To: David Gibson; +Cc: Laurent Vivier, passt-dev

On Thu, 29 Aug 2024 11:29:30 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Aug 28, 2024 at 12:22:18PM +0200, Laurent Vivier wrote:
> > On 28/08/2024 07:56, David Gibson wrote:  
> > >
> > > [...]
> > >
> > > +	lineread_init(&lr, fd);
> > > +	len = lineread_get(&lr, &line);
> > > +	if (len < 0)
> > > +		goto parse_err;
> > > +
> > > +	tab = strchr(line, '\t');
> > > +	if (!tab)
> > > +		goto parse_err;
> > > +	*tab = '\0';
> > > +
> > > +	errno = 0;
> > > +	min = strtol(line, &end, 10);
> > > +	if (*end || errno)
> > > +		goto parse_err;
> > > +
> > > +	errno = 0;
> > > +	max = strtol(tab + 1, &end, 10);
> > > +	if (*end || errno)
> > > +		goto parse_err;  
> > 
> > As /proc files are well formated, why don't you use fscanf()?
> > Something like:
> > 
> >         FILE *f;
> > 
> >         f = fopen(PORT_RANGE_SYSCTL, "r");
> > 	if (f == NULL) {
> > 		warn("Unable to parse %s", PORT_RANGE_SYSCTL);
> > 		return;
> > 	}
> >         ret = fscanf(f, "%d %d", &min, &max);
> >         fclose(f);
> >         if (ret != 2)
> >                 goto parse_error;  
> 
> Hm, maybe.  I never feel like I know exactly what the parse rules for
> scanf() are, so I tend to avoid it.  Stefano, any thoughts?

The issue with fopen() and ffscanf() is that they need dynamic
allocation, see also 32d07f5e59f2 ("passt, pasta: Completely avoid
dynamic memory allocation").

Still, here, we could use lineread_get() and sscanf() as we do in fwd.c,
procfs_scan_listen(), but I don't really have a preference. If you find
that scanf() is confusing for you, perhaps it would be better to use
something simpler as you did. On the other hand it would be 2 lines of
code instead of (somewhat bug-prone) 14 lines.

I'm not sure, I'd say whatever you feel most comfortable with...

-- 
Stefano


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] fwd, conf: Probe host's ephemeral ports
  2024-08-29  2:59       ` Stefano Brivio
@ 2024-08-29  4:14         ` David Gibson
  0 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2024-08-29  4:14 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: Laurent Vivier, passt-dev

[-- Attachment #1: Type: text/plain, Size: 2323 bytes --]

On Thu, Aug 29, 2024 at 04:59:55AM +0200, Stefano Brivio wrote:
> On Thu, 29 Aug 2024 11:29:30 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Wed, Aug 28, 2024 at 12:22:18PM +0200, Laurent Vivier wrote:
> > > On 28/08/2024 07:56, David Gibson wrote:  
> > > >
> > > > [...]
> > > >
> > > > +	lineread_init(&lr, fd);
> > > > +	len = lineread_get(&lr, &line);
> > > > +	if (len < 0)
> > > > +		goto parse_err;
> > > > +
> > > > +	tab = strchr(line, '\t');
> > > > +	if (!tab)
> > > > +		goto parse_err;
> > > > +	*tab = '\0';
> > > > +
> > > > +	errno = 0;
> > > > +	min = strtol(line, &end, 10);
> > > > +	if (*end || errno)
> > > > +		goto parse_err;
> > > > +
> > > > +	errno = 0;
> > > > +	max = strtol(tab + 1, &end, 10);
> > > > +	if (*end || errno)
> > > > +		goto parse_err;  
> > > 
> > > As /proc files are well formated, why don't you use fscanf()?
> > > Something like:
> > > 
> > >         FILE *f;
> > > 
> > >         f = fopen(PORT_RANGE_SYSCTL, "r");
> > > 	if (f == NULL) {
> > > 		warn("Unable to parse %s", PORT_RANGE_SYSCTL);
> > > 		return;
> > > 	}
> > >         ret = fscanf(f, "%d %d", &min, &max);
> > >         fclose(f);
> > >         if (ret != 2)
> > >                 goto parse_error;  
> > 
> > Hm, maybe.  I never feel like I know exactly what the parse rules for
> > scanf() are, so I tend to avoid it.  Stefano, any thoughts?
> 
> The issue with fopen() and ffscanf() is that they need dynamic
> allocation, see also 32d07f5e59f2 ("passt, pasta: Completely avoid
> dynamic memory allocation").

Ah, good point, I'd forgotten that.

> Still, here, we could use lineread_get() and sscanf() as we do in fwd.c,
> procfs_scan_listen(), but I don't really have a preference. If you find
> that scanf() is confusing for you, perhaps it would be better to use
> something simpler as you did. On the other hand it would be 2 lines of
> code instead of (somewhat bug-prone) 14 lines.

True.  Ok, I've switched over to sscanf() for the next spin.

> I'm not sure, I'd say whatever you feel most comfortable with...
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-08-29  4:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-28  5:56 [PATCH 0/3] Probe host's ephemeral ports, rather than using RFC values David Gibson
2024-08-28  5:56 ` [PATCH 1/3] conf, fwd: Make ephemeral port logic more flexible David Gibson
2024-08-28 10:01   ` Laurent Vivier
2024-08-28  5:56 ` [PATCH 2/3] conf, fwd: Don't attempt to forward port 0 David Gibson
2024-08-28 10:03   ` Laurent Vivier
2024-08-28  5:56 ` [PATCH 3/3] fwd, conf: Probe host's ephemeral ports David Gibson
2024-08-28 10:22   ` Laurent Vivier
2024-08-29  1:29     ` David Gibson
2024-08-29  2:59       ` Stefano Brivio
2024-08-29  4:14         ` David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).