public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH v2 0/4] Don't expose container loopback services to the host
@ 2024-10-01  6:24 David Gibson
  2024-10-01  6:24 ` [PATCH v2 1/4] passt.1: Mark --stderr as deprecated more prominently David Gibson
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: David Gibson @ 2024-10-01  6:24 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

podman issue #24045 pointed out that pasta's spliced forwarding logic
can expose services within the namespace bound only to 127.0.0.1 or
::1 to the host.  However, the namespace probably expects those to
only be accessible to itself, so that's probably not what we want.

This changes our forwarding logic so as not to do this.  Note that the
podman tests will currently fail with this series, I've submitted
podman PR https://github.com/containers/podman/pull/24064 to fix that.

Link: https://github.com/containers/podman/issues/24045

Changes since v1:
 * Add --host-lo-to-ns-lo option to preserve the old behaviour
 * Clarify the new behaviour in the man page
 * Add some extra patches making some other corrections to the man
   page

David Gibson (4):
  passt.1: Mark --stderr as deprecated more prominently
  passt.1: Clarify and update "Handling of local addresses" section
  test: Clarify test for spliced inbound transfers
  fwd: Direct inbound spliced forwards to the guest's external address

 conf.c               |  9 ++++++
 fwd.c                | 31 +++++++++++++-----
 passt.1              | 75 +++++++++++++++++++++++++++-----------------
 passt.h              |  1 +
 test/passt_in_ns/tcp |  8 ++---
 test/passt_in_ns/udp |  4 +--
 test/pasta/tcp       | 16 +++++-----
 test/pasta/udp       |  8 ++---
 8 files changed, 97 insertions(+), 55 deletions(-)

-- 
2.46.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/4] passt.1: Mark --stderr as deprecated more prominently
  2024-10-01  6:24 [PATCH v2 0/4] Don't expose container loopback services to the host David Gibson
@ 2024-10-01  6:24 ` David Gibson
  2024-10-01  6:25 ` [PATCH v2 2/4] passt.1: Clarify and update "Handling of local addresses" section David Gibson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2024-10-01  6:24 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

The description of this option says that it's deprecated, but unlike
--no-copy-addrs and --no-copy-routes it doesn't have a clear label.  Add
one to make it easier to spot.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 passt.1 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/passt.1 b/passt.1
index 79d134d..acc1f92 100644
--- a/passt.1
+++ b/passt.1
@@ -95,7 +95,7 @@ detached PID namespace after starting, because the PID itself cannot change.
 Default is to fork into background.
 
 .TP
-.BR \-e ", " \-\-stderr
+.BR \-e ", " \-\-stderr " " (DEPRECATED)
 This option has no effect, and is maintained for compatibility purposes only.
 
 Note that this configuration option is \fBdeprecated\fR and will be removed in a
-- 
@@ -95,7 +95,7 @@ detached PID namespace after starting, because the PID itself cannot change.
 Default is to fork into background.
 
 .TP
-.BR \-e ", " \-\-stderr
+.BR \-e ", " \-\-stderr " " (DEPRECATED)
 This option has no effect, and is maintained for compatibility purposes only.
 
 Note that this configuration option is \fBdeprecated\fR and will be removed in a
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/4] passt.1: Clarify and update "Handling of local addresses" section
  2024-10-01  6:24 [PATCH v2 0/4] Don't expose container loopback services to the host David Gibson
  2024-10-01  6:24 ` [PATCH v2 1/4] passt.1: Mark --stderr as deprecated more prominently David Gibson
@ 2024-10-01  6:25 ` David Gibson
  2024-10-01  6:25 ` [PATCH v2 3/4] test: Clarify test for spliced inbound transfers David Gibson
  2024-10-01  6:25 ` [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address David Gibson
  3 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2024-10-01  6:25 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

This section didn't mention the effect of the --map-host-loopback option
which now alters this behaviour.  Update it accordingly.

It used "local addresses" to mean specifically 127.0.0.0/8 and ::1.
However, "local" could also refer to link-local addresses or to addresses
of any scope which happen to be configured on the host.  Use "loopback
address" to be more precise about this.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 passt.1 | 54 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 28 insertions(+), 26 deletions(-)

diff --git a/passt.1 b/passt.1
index acc1f92..11104e1 100644
--- a/passt.1
+++ b/passt.1
@@ -863,38 +863,40 @@ root@localhost's password:
 
 .SH NOTES
 
-.SS Handling of traffic with local destination and source addresses
-
-Both \fBpasst\fR and \fBpasta\fR can bind on ports with a local address,
-depending on the configuration. Local destination or source addresses need to be
-changed before packets are delivered to the guest or target namespace: most
-operating systems would drop packets received from non-loopback interfaces with
-local addresses, and it would also be impossible for guest or target namespace
-to route answers back.
-
-For convenience, and somewhat arbitrarily, the source address on these packets
-is translated to the address of the default IPv4 or IPv6 gateway (if any) --
-this is known to be an existing, valid address on the same subnet.
-
-Loopback destination addresses are instead translated to the observed external
-address of the guest or target namespace. For IPv6 packets, if usage of a
-link-local address by guest or namespace has ever been observed, and the
-original destination address is also a link-local address, the observed
-link-local address is used. Otherwise, the observed global address is used. For
-both IPv4 and IPv6, if no addresses have been seen yet, the configured addresses
-will be used instead.
+.SS Handling of traffic with loopback destination and source addresses
+
+Both \fBpasst\fR and \fBpasta\fR can bind on ports with a loopback
+address (127.0.0.0/8 or ::1), depending on the configuration. Loopback
+destination or source addresses need to be changed before packets are
+delivered to the guest or target namespace: most operating systems
+would drop packets received with loopback addresses on non-loopback
+interfaces, and it would also be impossible for guest or target
+namespace to route answers back.
+
+For convenience, the source address on these packets is translated to
+the address specified by the \fB\-\-map-host-loopback\fR option.  If
+not specified this defaults, somewhat arbitrarily, to the address of
+default IPv4 or IPv6 gateway (if any) -- this is known to be an
+existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
+\fB\-\-map-host-loopback none\fR are specified this translation is
+disabled and packets with loopback addresses are simply dropped.
+
+Loopback destination addresses are translated to the observed external
+address of the guest or target namespace. For IPv6, the observed
+link-local address is used if the translated source address is
+link-local, otherwise the observed global address is used. For both
+IPv4 and IPv6, if no addresses have been seen yet, the configured
+addresses will be used instead.
 
 For example, if \fBpasst\fR or \fBpasta\fR receive a connection from 127.0.0.1,
 with destination 127.0.0.10, and the default IPv4 gateway is 192.0.2.1, while
 the last observed source address from guest or namespace is 192.0.2.2, this will
 be translated to a connection from 192.0.2.1 to 192.0.2.2.
 
-Similarly, for traffic coming from guest or namespace, packets with destination
-address corresponding to the default gateway will have their destination address
-translated to a loopback address, if and only if a packet, in the opposite
-direction, with a loopback destination or source address, port-wise matching for
-UDP, or connection-wise for TCP, has been recently forwarded to guest or
-namespace. This behaviour can be disabled with \-\-no\-map\-gw.
+Similarly, for traffic coming from guest or namespace, packets with
+destination address corresponding to the \fB\-\-map-host-loopback\fR
+address will have their destination address translated to a loopback
+address.
 
 .SS Handling of local traffic in pasta
 
-- 
@@ -863,38 +863,40 @@ root@localhost's password:
 
 .SH NOTES
 
-.SS Handling of traffic with local destination and source addresses
-
-Both \fBpasst\fR and \fBpasta\fR can bind on ports with a local address,
-depending on the configuration. Local destination or source addresses need to be
-changed before packets are delivered to the guest or target namespace: most
-operating systems would drop packets received from non-loopback interfaces with
-local addresses, and it would also be impossible for guest or target namespace
-to route answers back.
-
-For convenience, and somewhat arbitrarily, the source address on these packets
-is translated to the address of the default IPv4 or IPv6 gateway (if any) --
-this is known to be an existing, valid address on the same subnet.
-
-Loopback destination addresses are instead translated to the observed external
-address of the guest or target namespace. For IPv6 packets, if usage of a
-link-local address by guest or namespace has ever been observed, and the
-original destination address is also a link-local address, the observed
-link-local address is used. Otherwise, the observed global address is used. For
-both IPv4 and IPv6, if no addresses have been seen yet, the configured addresses
-will be used instead.
+.SS Handling of traffic with loopback destination and source addresses
+
+Both \fBpasst\fR and \fBpasta\fR can bind on ports with a loopback
+address (127.0.0.0/8 or ::1), depending on the configuration. Loopback
+destination or source addresses need to be changed before packets are
+delivered to the guest or target namespace: most operating systems
+would drop packets received with loopback addresses on non-loopback
+interfaces, and it would also be impossible for guest or target
+namespace to route answers back.
+
+For convenience, the source address on these packets is translated to
+the address specified by the \fB\-\-map-host-loopback\fR option.  If
+not specified this defaults, somewhat arbitrarily, to the address of
+default IPv4 or IPv6 gateway (if any) -- this is known to be an
+existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
+\fB\-\-map-host-loopback none\fR are specified this translation is
+disabled and packets with loopback addresses are simply dropped.
+
+Loopback destination addresses are translated to the observed external
+address of the guest or target namespace. For IPv6, the observed
+link-local address is used if the translated source address is
+link-local, otherwise the observed global address is used. For both
+IPv4 and IPv6, if no addresses have been seen yet, the configured
+addresses will be used instead.
 
 For example, if \fBpasst\fR or \fBpasta\fR receive a connection from 127.0.0.1,
 with destination 127.0.0.10, and the default IPv4 gateway is 192.0.2.1, while
 the last observed source address from guest or namespace is 192.0.2.2, this will
 be translated to a connection from 192.0.2.1 to 192.0.2.2.
 
-Similarly, for traffic coming from guest or namespace, packets with destination
-address corresponding to the default gateway will have their destination address
-translated to a loopback address, if and only if a packet, in the opposite
-direction, with a loopback destination or source address, port-wise matching for
-UDP, or connection-wise for TCP, has been recently forwarded to guest or
-namespace. This behaviour can be disabled with \-\-no\-map\-gw.
+Similarly, for traffic coming from guest or namespace, packets with
+destination address corresponding to the \fB\-\-map-host-loopback\fR
+address will have their destination address translated to a loopback
+address.
 
 .SS Handling of local traffic in pasta
 
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/4] test: Clarify test for spliced inbound transfers
  2024-10-01  6:24 [PATCH v2 0/4] Don't expose container loopback services to the host David Gibson
  2024-10-01  6:24 ` [PATCH v2 1/4] passt.1: Mark --stderr as deprecated more prominently David Gibson
  2024-10-01  6:25 ` [PATCH v2 2/4] passt.1: Clarify and update "Handling of local addresses" section David Gibson
@ 2024-10-01  6:25 ` David Gibson
  2024-10-01  6:25 ` [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address David Gibson
  3 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2024-10-01  6:25 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

The tests in pasta/tcp and pasta/udp for inbound transfers have the server
listening within the namespace explicitly bound to 127.0.0.1 or ::1.  This
only works because of the behaviour of inbound splice connections, which
always appear with both source and destination addresses as loopback in
the namespace.  That's not an inherent property for "spliced" connections
and arguably an undesirable one.  Also update the test names to make it
clearer that these tests are expecting to exercise the "splice" path.

Interestingly this was already correct for the equivalent passt_in_ns/*,
although we also update the test names for clarity there.

Note that there are similar issues in some of the podman tests, addressed
in https://github.com/containers/podman/pull/24064

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/passt_in_ns/tcp |  8 ++++----
 test/passt_in_ns/udp |  4 ++--
 test/pasta/tcp       | 16 ++++++++--------
 test/pasta/udp       |  8 ++++----
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/test/passt_in_ns/tcp b/test/passt_in_ns/tcp
index aaf340e..319880b 100644
--- a/test/passt_in_ns/tcp
+++ b/test/passt_in_ns/tcp
@@ -32,7 +32,7 @@ host	socat -u OPEN:__BASEPATH__/big.bin TCP4:127.0.0.1:10001
 guestw
 guest	cmp test_big.bin /root/big.bin
 
-test	TCP/IPv4: host to ns: big transfer
+test	TCP/IPv4: host to ns (spliced): big transfer
 nsb	socat -u TCP4-LISTEN:10002 OPEN:__TEMP_NS_BIG__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/big.bin TCP4:127.0.0.1:10002
@@ -90,7 +90,7 @@ host	socat -u OPEN:__BASEPATH__/small.bin TCP4:127.0.0.1:10001
 guestw
 guest	cmp test_small.bin /root/small.bin
 
-test	TCP/IPv4: host to ns: small transfer
+test	TCP/IPv4: host to ns (spliced): small transfer
 nsb	socat -u TCP4-LISTEN:10002 OPEN:__TEMP_NS_SMALL__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/small.bin TCP4:127.0.0.1:10002
@@ -146,7 +146,7 @@ host	socat -u OPEN:__BASEPATH__/big.bin TCP6:[::1]:10001
 guestw
 guest	cmp test_big.bin /root/big.bin
 
-test	TCP/IPv6: host to ns: big transfer
+test	TCP/IPv6: host to ns (spliced): big transfer
 nsb	socat -u TCP6-LISTEN:10002 OPEN:__TEMP_NS_BIG__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/big.bin TCP6:[::1]:10002
@@ -204,7 +204,7 @@ host	socat -u OPEN:__BASEPATH__/small.bin TCP6:[::1]:10001
 guestw
 guest	cmp test_small.bin /root/small.bin
 
-test	TCP/IPv6: host to ns: small transfer
+test	TCP/IPv6: host to ns (spliced): small transfer
 nsb	socat -u TCP6-LISTEN:10002 OPEN:__TEMP_NS_SMALL__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/small.bin TCP6:[::1]:10002
diff --git a/test/passt_in_ns/udp b/test/passt_in_ns/udp
index 3426ab9..791511c 100644
--- a/test/passt_in_ns/udp
+++ b/test/passt_in_ns/udp
@@ -30,7 +30,7 @@ host	socat -u OPEN:__BASEPATH__/medium.bin UDP4:127.0.0.1:10001,shut-null
 guestw
 guest	cmp test.bin /root/medium.bin
 
-test	UDP/IPv4: host to ns
+test	UDP/IPv4: host to ns (recvmmsg/sendmmsg)
 nsb	socat -u UDP4-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/medium.bin UDP4:127.0.0.1:10002,shut-null
@@ -88,7 +88,7 @@ host	socat -u OPEN:__BASEPATH__/medium.bin UDP6:[::1]:10001,shut-null
 guestw
 guest	cmp test.bin /root/medium.bin
 
-test	UDP/IPv6: host to ns
+test	UDP/IPv6: host to ns (recvmmsg/sendmmsg)
 nsb	socat -u UDP6-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/medium.bin UDP6:[::1]:10002,shut-null
diff --git a/test/pasta/tcp b/test/pasta/tcp
index 6ab18c5..53b6f25 100644
--- a/test/pasta/tcp
+++ b/test/pasta/tcp
@@ -19,8 +19,8 @@ set	TEMP_NS_BIG __STATEDIR__/test_ns_big.bin
 set	TEMP_SMALL __STATEDIR__/test_small.bin
 set	TEMP_NS_SMALL __STATEDIR__/test_ns_small.bin
 
-test	TCP/IPv4: host to ns: big transfer
-nsb	socat -u TCP4-LISTEN:10002,bind=127.0.0.1 OPEN:__TEMP_NS_BIG__,create,trunc
+test	TCP/IPv4: host to ns (spliced): big transfer
+nsb	socat -u TCP4-LISTEN:10002 OPEN:__TEMP_NS_BIG__,create,trunc
 host	socat -u OPEN:__BASEPATH__/big.bin TCP4:127.0.0.1:10002
 nsw
 check	cmp __BASEPATH__/big.bin __TEMP_NS_BIG__
@@ -38,8 +38,8 @@ ns	socat -u OPEN:__BASEPATH__/big.bin TCP4:__GW__:10003
 hostw
 check	cmp __BASEPATH__/big.bin __TEMP_BIG__
 
-test	TCP/IPv4: host to ns: small transfer
-nsb	socat -u TCP4-LISTEN:10002,bind=127.0.0.1 OPEN:__TEMP_NS_SMALL__,create,trunc
+test	TCP/IPv4: host to ns (spliced): small transfer
+nsb	socat -u TCP4-LISTEN:10002 OPEN:__TEMP_NS_SMALL__,create,trunc
 host	socat OPEN:__BASEPATH__/small.bin TCP4:127.0.0.1:10002
 nsw
 check	cmp __BASEPATH__/small.bin __TEMP_NS_SMALL__
@@ -57,8 +57,8 @@ ns	socat -u OPEN:__BASEPATH__/small.bin TCP4:__GW__:10003
 hostw
 check	cmp __BASEPATH__/small.bin __TEMP_SMALL__
 
-test	TCP/IPv6: host to ns: big transfer
-nsb	socat -u TCP6-LISTEN:10002,bind=[::1] OPEN:__TEMP_NS_BIG__,create,trunc
+test	TCP/IPv6: host to ns (spliced): big transfer
+nsb	socat -u TCP6-LISTEN:10002 OPEN:__TEMP_NS_BIG__,create,trunc
 host	socat -u OPEN:__BASEPATH__/big.bin TCP6:[::1]:10002
 nsw
 check	cmp __BASEPATH__/big.bin __TEMP_NS_BIG__
@@ -77,8 +77,8 @@ ns	socat -u OPEN:__BASEPATH__/big.bin TCP6:[__GW6__%__IFNAME__]:10003
 hostw
 check	cmp __BASEPATH__/big.bin __TEMP_BIG__
 
-test	TCP/IPv6: host to ns: small transfer
-nsb	socat -u TCP6-LISTEN:10002,bind=[::1] OPEN:__TEMP_NS_SMALL__,create,trunc
+test	TCP/IPv6: host to ns (spliced): small transfer
+nsb	socat -u TCP6-LISTEN:10002 OPEN:__TEMP_NS_SMALL__,create,trunc
 host	socat -u OPEN:__BASEPATH__/small.bin TCP6:[::1]:10002
 nsw
 check	cmp __BASEPATH__/small.bin __TEMP_NS_SMALL__
diff --git a/test/pasta/udp b/test/pasta/udp
index 30e3a85..7734d02 100644
--- a/test/pasta/udp
+++ b/test/pasta/udp
@@ -17,8 +17,8 @@ htools	dd socat ip jq
 set	TEMP __STATEDIR__/test.bin
 set	TEMP_NS __STATEDIR__/test_ns.bin
 
-test	UDP/IPv4: host to ns
-nsb	socat -u UDP4-LISTEN:10002,bind=127.0.0.1,null-eof OPEN:__TEMP_NS__,create,trunc
+test	UDP/IPv4: host to ns (recvmmsg/sendmmsg)
+nsb	socat -u UDP4-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 host	socat OPEN:__BASEPATH__/medium.bin UDP4:127.0.0.1:10002,shut-null
 nsw
 check	cmp __BASEPATH__/medium.bin __TEMP_NS__
@@ -37,8 +37,8 @@ ns	socat -u OPEN:__BASEPATH__/medium.bin UDP4:__GW__:10003,shut-null
 hostw
 check	cmp __BASEPATH__/medium.bin __TEMP__
 
-test	UDP/IPv6: host to ns
-nsb	socat -u UDP6-LISTEN:10002,bind=[::1],null-eof OPEN:__TEMP_NS__,create,trunc
+test	UDP/IPv6: host to ns (recvmmsg/sendmmsg)
+nsb	socat -u UDP6-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 host	socat -u OPEN:__BASEPATH__/medium.bin UDP6:[::1]:10002,shut-null
 nsw
 check	cmp __BASEPATH__/medium.bin __TEMP_NS__
-- 
@@ -17,8 +17,8 @@ htools	dd socat ip jq
 set	TEMP __STATEDIR__/test.bin
 set	TEMP_NS __STATEDIR__/test_ns.bin
 
-test	UDP/IPv4: host to ns
-nsb	socat -u UDP4-LISTEN:10002,bind=127.0.0.1,null-eof OPEN:__TEMP_NS__,create,trunc
+test	UDP/IPv4: host to ns (recvmmsg/sendmmsg)
+nsb	socat -u UDP4-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 host	socat OPEN:__BASEPATH__/medium.bin UDP4:127.0.0.1:10002,shut-null
 nsw
 check	cmp __BASEPATH__/medium.bin __TEMP_NS__
@@ -37,8 +37,8 @@ ns	socat -u OPEN:__BASEPATH__/medium.bin UDP4:__GW__:10003,shut-null
 hostw
 check	cmp __BASEPATH__/medium.bin __TEMP__
 
-test	UDP/IPv6: host to ns
-nsb	socat -u UDP6-LISTEN:10002,bind=[::1],null-eof OPEN:__TEMP_NS__,create,trunc
+test	UDP/IPv6: host to ns (recvmmsg/sendmmsg)
+nsb	socat -u UDP6-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 host	socat -u OPEN:__BASEPATH__/medium.bin UDP6:[::1]:10002,shut-null
 nsw
 check	cmp __BASEPATH__/medium.bin __TEMP_NS__
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address
  2024-10-01  6:24 [PATCH v2 0/4] Don't expose container loopback services to the host David Gibson
                   ` (2 preceding siblings ...)
  2024-10-01  6:25 ` [PATCH v2 3/4] test: Clarify test for spliced inbound transfers David Gibson
@ 2024-10-01  6:25 ` David Gibson
  2024-10-01 18:22   ` Stefano Brivio
  3 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2024-10-01  6:25 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

In pasta mode, where addressing permits we "splice" connections, forwarding
directly from host socket to guest/container socket without any L2 or L3
processing.  This gives us a very large performance improvement when it's
possible.

Since the traffic is from a local socket within the guest, it will go over
the guest's 'lo' interface, and accordingly we set the guest side address
to be the loopback address.  However this has a surprising side effect:
sometimes guests will run services that are only supposed to be used within
the guest and are therefore bound to only 127.0.0.1 and/or ::1.  pasta's
forwarding exposes those services to the host, which isn't generally what
we want.

Correct this by instead forwarding inbound "splice" flows to the guest's
external address.

Link: https://github.com/containers/podman/issues/24045

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 conf.c  |  9 +++++++++
 fwd.c   | 31 +++++++++++++++++++++++--------
 passt.1 | 23 +++++++++++++++++++----
 passt.h |  1 +
 4 files changed, 52 insertions(+), 12 deletions(-)

diff --git a/conf.c b/conf.c
index 6e62510..b5318f3 100644
--- a/conf.c
+++ b/conf.c
@@ -908,6 +908,9 @@ pasta_opts:
 		"  -U, --udp-ns SPEC	UDP port forwarding to init namespace\n"
 		"    SPEC is as described above\n"
 		"    default: auto\n"
+		"  --host-lo-to-ns-lo	DEPRECATED:\n"
+		"			Translate host-loopback forwards to\n"
+		"			namespace loopback\n"
 		"  --userns NSPATH 	Target user namespace to join\n"
 		"  --netns PATH|NAME	Target network namespace to join\n"
 		"  --netns-only		Don't join existing user namespace\n"
@@ -1284,6 +1287,7 @@ void conf(struct ctx *c, int argc, char **argv)
 		{"netns-only",	no_argument,		NULL,		20 },
 		{"map-host-loopback", required_argument, NULL,		21 },
 		{"map-guest-addr", required_argument,	NULL,		22 },
+		{"host-lo-to-ns-lo", no_argument, 	NULL,		23 },
 		{ 0 },
 	};
 	const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt";
@@ -1461,6 +1465,11 @@ void conf(struct ctx *c, int argc, char **argv)
 			conf_nat(optarg, &c->ip4.map_guest_addr,
 				 &c->ip6.map_guest_addr, NULL);
 			break;
+		case 23:
+			if (c->mode != MODE_PASTA)
+				die("--host-lo-to-ns-lo is for pasta mode only");
+			c->host_lo_to_ns_lo = 1;
+			break;
 		case 'd':
 			c->debug = 1;
 			c->quiet = 0;
diff --git a/fwd.c b/fwd.c
index a505098..c71f5e1 100644
--- a/fwd.c
+++ b/fwd.c
@@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
 	    (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
 		/* spliceable */
 
-		/* Preserve the specific loopback adddress used, but let the
-		 * kernel pick a source port on the target side
+		/* The traffic will go over the guest's 'lo' interface, but by
+		 * default use its external address, so we don't inadvertently
+		 * expose services that listen only on the guest's loopback
+		 * address.  That can be overridden by --host-lo-to-ns-lo which
+		 * will instead forward to the loopback address in the guest.
+		 *
+		 * In either case, let the kernel pick the source address to
+		 * match.
 		 */
-		tgt->oaddr = ini->eaddr;
+		if (inany_v4(&ini->eaddr)) {
+			if (c->host_lo_to_ns_lo)
+				tgt->eaddr = inany_loopback4;
+			else
+				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
+			tgt->oaddr = inany_any4;
+		} else {
+			if (c->host_lo_to_ns_lo)
+				tgt->eaddr = inany_loopback6;
+			else
+				tgt->eaddr.a6 = c->ip6.addr_seen;
+			tgt->oaddr = inany_any6;
+		}
+
+		/* Let the kernel pick source port */
 		tgt->oport = 0;
 		if (proto == IPPROTO_UDP)
 			/* But for UDP preserve the source port */
 			tgt->oport = ini->eport;
 
-		if (inany_v4(&ini->eaddr))
-			tgt->eaddr = inany_loopback4;
-		else
-			tgt->eaddr = inany_loopback6;
-
 		return PIF_SPLICE;
 	}
 
diff --git a/passt.1 b/passt.1
index 11104e1..332384c 100644
--- a/passt.1
+++ b/passt.1
@@ -586,6 +586,13 @@ Configure UDP port forwarding from target namespace to init namespace.
 
 Default is \fBauto\fR.
 
+.TP
+.BR \-\-host-lo-to-ns-lo " " (DEPRECATED)
+If specified, connections forwarded with \fB\-t\fR and \fB\-u\fR from
+the host's loopback address will appear on the loopback address in the
+guest as well.  Without this option such forwarded packets will appear
+to come from the guest's public address.
+
 .TP
 .BR \-\-userns " " \fIspec
 Target user namespace to join, as a path. If PID is given, without this option,
@@ -874,8 +881,9 @@ interfaces, and it would also be impossible for guest or target
 namespace to route answers back.
 
 For convenience, the source address on these packets is translated to
-the address specified by the \fB\-\-map-host-loopback\fR option.  If
-not specified this defaults, somewhat arbitrarily, to the address of
+the address specified by the \fB\-\-map-host-loopback\fR option (with
+some exceptions in pasta mode, see next section below).  If not
+specified this defaults, somewhat arbitrarily, to the address of
 default IPv4 or IPv6 gateway (if any) -- this is known to be an
 existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
 \fB\-\-map-host-loopback none\fR are specified this translation is
@@ -912,8 +920,15 @@ and the new socket using the \fBsplice\fR(2) system call, and for UDP, a pair
 of \fBrecvmmsg\fR(2) and \fBsendmmsg\fR(2) system calls deals with packet
 transfers.
 
-This bypass only applies to local connections and traffic, because it's not
-possible to bind sockets to foreign addresses.
+Because it's not possible to bind sockets to foreign addresses, this
+bypass only applies to local connections and traffic.  It also means
+that the address translation differs slightly from passt mode.
+Connections from loopback to loopback on the host will appear to come
+from the target namespace's public address within the guest, unless
+\fB\-\-host-lo-to-ns-lo\fR is specified, in which case they will
+appear to come from loopback in the namespace as well.  The latter
+behaviour used to be the default, but is usually undesirable, since it
+can unintentionally expose namespace local services to the host.
 
 .SS Binding to low numbered ports (well-known or system ports, up to 1023)
 
diff --git a/passt.h b/passt.h
index 031c9b6..3b7b281 100644
--- a/passt.h
+++ b/passt.h
@@ -284,6 +284,7 @@ struct ctx {
 	int no_dhcpv6;
 	int no_ndp;
 	int no_ra;
+	int host_lo_to_ns_lo;
 
 	int low_wmem;
 	int low_rmem;
-- 
@@ -284,6 +284,7 @@ struct ctx {
 	int no_dhcpv6;
 	int no_ndp;
 	int no_ra;
+	int host_lo_to_ns_lo;
 
 	int low_wmem;
 	int low_rmem;
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address
  2024-10-01  6:25 ` [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address David Gibson
@ 2024-10-01 18:22   ` Stefano Brivio
  2024-10-02  1:29     ` David Gibson
  0 siblings, 1 reply; 8+ messages in thread
From: Stefano Brivio @ 2024-10-01 18:22 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

Some nits, but the series looks overall good to me and I would also
apply it like this:

On Tue,  1 Oct 2024 16:25:02 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> In pasta mode, where addressing permits we "splice" connections, forwarding
> directly from host socket to guest/container socket without any L2 or L3
> processing.  This gives us a very large performance improvement when it's
> possible.
> 
> Since the traffic is from a local socket within the guest, it will go over
> the guest's 'lo' interface, and accordingly we set the guest side address
> to be the loopback address.  However this has a surprising side effect:
> sometimes guests will run services that are only supposed to be used within
> the guest and are therefore bound to only 127.0.0.1 and/or ::1.  pasta's
> forwarding exposes those services to the host, which isn't generally what
> we want.
> 
> Correct this by instead forwarding inbound "splice" flows to the guest's
> external address.
> 
> Link: https://github.com/containers/podman/issues/24045
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  conf.c  |  9 +++++++++
>  fwd.c   | 31 +++++++++++++++++++++++--------
>  passt.1 | 23 +++++++++++++++++++----
>  passt.h |  1 +
>  4 files changed, 52 insertions(+), 12 deletions(-)
> 
> diff --git a/conf.c b/conf.c
> index 6e62510..b5318f3 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -908,6 +908,9 @@ pasta_opts:
>  		"  -U, --udp-ns SPEC	UDP port forwarding to init namespace\n"
>  		"    SPEC is as described above\n"
>  		"    default: auto\n"
> +		"  --host-lo-to-ns-lo	DEPRECATED:\n"

For me, this is harder to type (and remember) than --expose-ns-lo or
similar. It doesn't matter so much as it's a deprecated option anyway.

--help lines are precious, I wouldn't use one to say DEPRECATED. On the
other hand, it's deprecated :) so I don't care that much (and it has
the advantage of making it very clear for users).

> +		"			Translate host-loopback forwards to\n"
> +		"			namespace loopback\n"
>  		"  --userns NSPATH 	Target user namespace to join\n"
>  		"  --netns PATH|NAME	Target network namespace to join\n"
>  		"  --netns-only		Don't join existing user namespace\n"
> @@ -1284,6 +1287,7 @@ void conf(struct ctx *c, int argc, char **argv)
>  		{"netns-only",	no_argument,		NULL,		20 },
>  		{"map-host-loopback", required_argument, NULL,		21 },
>  		{"map-guest-addr", required_argument,	NULL,		22 },
> +		{"host-lo-to-ns-lo", no_argument, 	NULL,		23 },
>  		{ 0 },
>  	};
>  	const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt";
> @@ -1461,6 +1465,11 @@ void conf(struct ctx *c, int argc, char **argv)
>  			conf_nat(optarg, &c->ip4.map_guest_addr,
>  				 &c->ip6.map_guest_addr, NULL);
>  			break;
> +		case 23:
> +			if (c->mode != MODE_PASTA)
> +				die("--host-lo-to-ns-lo is for pasta mode only");
> +			c->host_lo_to_ns_lo = 1;
> +			break;
>  		case 'd':
>  			c->debug = 1;
>  			c->quiet = 0;
> diff --git a/fwd.c b/fwd.c
> index a505098..c71f5e1 100644
> --- a/fwd.c
> +++ b/fwd.c
> @@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
>  	    (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
>  		/* spliceable */
>  
> -		/* Preserve the specific loopback adddress used, but let the
> -		 * kernel pick a source port on the target side
> +		/* The traffic will go over the guest's 'lo' interface, but by
> +		 * default use its external address, so we don't inadvertently
> +		 * expose services that listen only on the guest's loopback
> +		 * address.  That can be overridden by --host-lo-to-ns-lo which
> +		 * will instead forward to the loopback address in the guest.
> +		 *
> +		 * In either case, let the kernel pick the source address to
> +		 * match.
>  		 */
> -		tgt->oaddr = ini->eaddr;
> +		if (inany_v4(&ini->eaddr)) {
> +			if (c->host_lo_to_ns_lo)
> +				tgt->eaddr = inany_loopback4;
> +			else
> +				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> +			tgt->oaddr = inany_any4;
> +		} else {
> +			if (c->host_lo_to_ns_lo)
> +				tgt->eaddr = inany_loopback6;
> +			else
> +				tgt->eaddr.a6 = c->ip6.addr_seen;
> +			tgt->oaddr = inany_any6;
> +		}
> +
> +		/* Let the kernel pick source port */
>  		tgt->oport = 0;
>  		if (proto == IPPROTO_UDP)
>  			/* But for UDP preserve the source port */
>  			tgt->oport = ini->eport;
>  
> -		if (inany_v4(&ini->eaddr))
> -			tgt->eaddr = inany_loopback4;
> -		else
> -			tgt->eaddr = inany_loopback6;
> -
>  		return PIF_SPLICE;
>  	}
>  
> diff --git a/passt.1 b/passt.1
> index 11104e1..332384c 100644
> --- a/passt.1
> +++ b/passt.1
> @@ -586,6 +586,13 @@ Configure UDP port forwarding from target namespace to init namespace.
>  
>  Default is \fBauto\fR.
>  
> +.TP
> +.BR \-\-host-lo-to-ns-lo " " (DEPRECATED)
> +If specified, connections forwarded with \fB\-t\fR and \fB\-u\fR from
> +the host's loopback address will appear on the loopback address in the
> +guest as well.  Without this option such forwarded packets will appear
> +to come from the guest's public address.
> +
>  .TP
>  .BR \-\-userns " " \fIspec
>  Target user namespace to join, as a path. If PID is given, without this option,
> @@ -874,8 +881,9 @@ interfaces, and it would also be impossible for guest or target
>  namespace to route answers back.
>  
>  For convenience, the source address on these packets is translated to
> -the address specified by the \fB\-\-map-host-loopback\fR option.  If
> -not specified this defaults, somewhat arbitrarily, to the address of
> +the address specified by the \fB\-\-map-host-loopback\fR option (with
> +some exceptions in pasta mode, see next section below).  If not
> +specified this defaults, somewhat arbitrarily, to the address of
>  default IPv4 or IPv6 gateway (if any) -- this is known to be an
>  existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
>  \fB\-\-map-host-loopback none\fR are specified this translation is
> @@ -912,8 +920,15 @@ and the new socket using the \fBsplice\fR(2) system call, and for UDP, a pair
>  of \fBrecvmmsg\fR(2) and \fBsendmmsg\fR(2) system calls deals with packet
>  transfers.
>  
> -This bypass only applies to local connections and traffic, because it's not
> -possible to bind sockets to foreign addresses.
> +Because it's not possible to bind sockets to foreign addresses, this
> +bypass only applies to local connections and traffic.  It also means
> +that the address translation differs slightly from passt mode.
> +Connections from loopback to loopback on the host will appear to come
> +from the target namespace's public address within the guest, unless
> +\fB\-\-host-lo-to-ns-lo\fR is specified, in which case they will
> +appear to come from loopback in the namespace as well.  The latter
> +behaviour used to be the default, but is usually undesirable, since it
> +can unintentionally expose namespace local services to the host.
>  
>  .SS Binding to low numbered ports (well-known or system ports, up to 1023)
>  
> diff --git a/passt.h b/passt.h
> index 031c9b6..3b7b281 100644
> --- a/passt.h
> +++ b/passt.h
> @@ -284,6 +284,7 @@ struct ctx {
>  	int no_dhcpv6;
>  	int no_ndp;
>  	int no_ra;
> +	int host_lo_to_ns_lo;

This is missing from the struct comment, but... again, it's deprecated,
so it doesn't matter much.

-- 
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address
  2024-10-01 18:22   ` Stefano Brivio
@ 2024-10-02  1:29     ` David Gibson
  2024-10-02  5:32       ` Stefano Brivio
  0 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2024-10-02  1:29 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 8480 bytes --]

On Tue, Oct 01, 2024 at 08:22:20PM +0200, Stefano Brivio wrote:
> Some nits, but the series looks overall good to me and I would also
> apply it like this:
> 
> On Tue,  1 Oct 2024 16:25:02 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > In pasta mode, where addressing permits we "splice" connections, forwarding
> > directly from host socket to guest/container socket without any L2 or L3
> > processing.  This gives us a very large performance improvement when it's
> > possible.
> > 
> > Since the traffic is from a local socket within the guest, it will go over
> > the guest's 'lo' interface, and accordingly we set the guest side address
> > to be the loopback address.  However this has a surprising side effect:
> > sometimes guests will run services that are only supposed to be used within
> > the guest and are therefore bound to only 127.0.0.1 and/or ::1.  pasta's
> > forwarding exposes those services to the host, which isn't generally what
> > we want.
> > 
> > Correct this by instead forwarding inbound "splice" flows to the guest's
> > external address.
> > 
> > Link: https://github.com/containers/podman/issues/24045
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  conf.c  |  9 +++++++++
> >  fwd.c   | 31 +++++++++++++++++++++++--------
> >  passt.1 | 23 +++++++++++++++++++----
> >  passt.h |  1 +
> >  4 files changed, 52 insertions(+), 12 deletions(-)
> > 
> > diff --git a/conf.c b/conf.c
> > index 6e62510..b5318f3 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -908,6 +908,9 @@ pasta_opts:
> >  		"  -U, --udp-ns SPEC	UDP port forwarding to init namespace\n"
> >  		"    SPEC is as described above\n"
> >  		"    default: auto\n"
> > +		"  --host-lo-to-ns-lo	DEPRECATED:\n"
> 
> For me, this is harder to type (and remember) than --expose-ns-lo or
> similar. It doesn't matter so much as it's a deprecated option anyway.

I'm not especially attached to the "host-lo-to-ns-lo" name, but I also
don't love "expose-ns-lo".  I was aiming to give a bit more of a hint
of exactly *when* the ns lo is exposed.

> --help lines are precious, I wouldn't use one to say DEPRECATED. On the
> other hand, it's deprecated :) so I don't care that much (and it has
> the advantage of making it very clear for users).

Right, I copied the style to be consistent with --no-copy-routes and
--no-copy-addrs.

> 
> > +		"			Translate host-loopback forwards to\n"
> > +		"			namespace loopback\n"
> >  		"  --userns NSPATH 	Target user namespace to join\n"
> >  		"  --netns PATH|NAME	Target network namespace to join\n"
> >  		"  --netns-only		Don't join existing user namespace\n"
> > @@ -1284,6 +1287,7 @@ void conf(struct ctx *c, int argc, char **argv)
> >  		{"netns-only",	no_argument,		NULL,		20 },
> >  		{"map-host-loopback", required_argument, NULL,		21 },
> >  		{"map-guest-addr", required_argument,	NULL,		22 },
> > +		{"host-lo-to-ns-lo", no_argument, 	NULL,		23 },
> >  		{ 0 },
> >  	};
> >  	const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt";
> > @@ -1461,6 +1465,11 @@ void conf(struct ctx *c, int argc, char **argv)
> >  			conf_nat(optarg, &c->ip4.map_guest_addr,
> >  				 &c->ip6.map_guest_addr, NULL);
> >  			break;
> > +		case 23:
> > +			if (c->mode != MODE_PASTA)
> > +				die("--host-lo-to-ns-lo is for pasta mode only");
> > +			c->host_lo_to_ns_lo = 1;
> > +			break;
> >  		case 'd':
> >  			c->debug = 1;
> >  			c->quiet = 0;
> > diff --git a/fwd.c b/fwd.c
> > index a505098..c71f5e1 100644
> > --- a/fwd.c
> > +++ b/fwd.c
> > @@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
> >  	    (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
> >  		/* spliceable */
> >  
> > -		/* Preserve the specific loopback adddress used, but let the
> > -		 * kernel pick a source port on the target side
> > +		/* The traffic will go over the guest's 'lo' interface, but by
> > +		 * default use its external address, so we don't inadvertently
> > +		 * expose services that listen only on the guest's loopback
> > +		 * address.  That can be overridden by --host-lo-to-ns-lo which
> > +		 * will instead forward to the loopback address in the guest.
> > +		 *
> > +		 * In either case, let the kernel pick the source address to
> > +		 * match.
> >  		 */
> > -		tgt->oaddr = ini->eaddr;
> > +		if (inany_v4(&ini->eaddr)) {
> > +			if (c->host_lo_to_ns_lo)
> > +				tgt->eaddr = inany_loopback4;
> > +			else
> > +				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> > +			tgt->oaddr = inany_any4;
> > +		} else {
> > +			if (c->host_lo_to_ns_lo)
> > +				tgt->eaddr = inany_loopback6;
> > +			else
> > +				tgt->eaddr.a6 = c->ip6.addr_seen;
> > +			tgt->oaddr = inany_any6;
> > +		}
> > +
> > +		/* Let the kernel pick source port */
> >  		tgt->oport = 0;
> >  		if (proto == IPPROTO_UDP)
> >  			/* But for UDP preserve the source port */
> >  			tgt->oport = ini->eport;
> >  
> > -		if (inany_v4(&ini->eaddr))
> > -			tgt->eaddr = inany_loopback4;
> > -		else
> > -			tgt->eaddr = inany_loopback6;
> > -
> >  		return PIF_SPLICE;
> >  	}
> >  
> > diff --git a/passt.1 b/passt.1
> > index 11104e1..332384c 100644
> > --- a/passt.1
> > +++ b/passt.1
> > @@ -586,6 +586,13 @@ Configure UDP port forwarding from target namespace to init namespace.
> >  
> >  Default is \fBauto\fR.
> >  
> > +.TP
> > +.BR \-\-host-lo-to-ns-lo " " (DEPRECATED)
> > +If specified, connections forwarded with \fB\-t\fR and \fB\-u\fR from
> > +the host's loopback address will appear on the loopback address in the
> > +guest as well.  Without this option such forwarded packets will appear
> > +to come from the guest's public address.
> > +
> >  .TP
> >  .BR \-\-userns " " \fIspec
> >  Target user namespace to join, as a path. If PID is given, without this option,
> > @@ -874,8 +881,9 @@ interfaces, and it would also be impossible for guest or target
> >  namespace to route answers back.
> >  
> >  For convenience, the source address on these packets is translated to
> > -the address specified by the \fB\-\-map-host-loopback\fR option.  If
> > -not specified this defaults, somewhat arbitrarily, to the address of
> > +the address specified by the \fB\-\-map-host-loopback\fR option (with
> > +some exceptions in pasta mode, see next section below).  If not
> > +specified this defaults, somewhat arbitrarily, to the address of
> >  default IPv4 or IPv6 gateway (if any) -- this is known to be an
> >  existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
> >  \fB\-\-map-host-loopback none\fR are specified this translation is
> > @@ -912,8 +920,15 @@ and the new socket using the \fBsplice\fR(2) system call, and for UDP, a pair
> >  of \fBrecvmmsg\fR(2) and \fBsendmmsg\fR(2) system calls deals with packet
> >  transfers.
> >  
> > -This bypass only applies to local connections and traffic, because it's not
> > -possible to bind sockets to foreign addresses.
> > +Because it's not possible to bind sockets to foreign addresses, this
> > +bypass only applies to local connections and traffic.  It also means
> > +that the address translation differs slightly from passt mode.
> > +Connections from loopback to loopback on the host will appear to come
> > +from the target namespace's public address within the guest, unless
> > +\fB\-\-host-lo-to-ns-lo\fR is specified, in which case they will
> > +appear to come from loopback in the namespace as well.  The latter
> > +behaviour used to be the default, but is usually undesirable, since it
> > +can unintentionally expose namespace local services to the host.
> >  
> >  .SS Binding to low numbered ports (well-known or system ports, up to 1023)
> >  
> > diff --git a/passt.h b/passt.h
> > index 031c9b6..3b7b281 100644
> > --- a/passt.h
> > +++ b/passt.h
> > @@ -284,6 +284,7 @@ struct ctx {
> >  	int no_dhcpv6;
> >  	int no_ndp;
> >  	int no_ra;
> > +	int host_lo_to_ns_lo;
> 
> This is missing from the struct comment, but... again, it's deprecated,
> so it doesn't matter much.

Oops.  That I've added.  I could repost immediately, or wait and see
if we can brainstorm a better option name.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address
  2024-10-02  1:29     ` David Gibson
@ 2024-10-02  5:32       ` Stefano Brivio
  0 siblings, 0 replies; 8+ messages in thread
From: Stefano Brivio @ 2024-10-02  5:32 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Wed, 2 Oct 2024 11:29:19 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Oct 01, 2024 at 08:22:20PM +0200, Stefano Brivio wrote:
> > Some nits, but the series looks overall good to me and I would also
> > apply it like this:
> > 
> > On Tue,  1 Oct 2024 16:25:02 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > In pasta mode, where addressing permits we "splice" connections, forwarding
> > > directly from host socket to guest/container socket without any L2 or L3
> > > processing.  This gives us a very large performance improvement when it's
> > > possible.
> > > 
> > > Since the traffic is from a local socket within the guest, it will go over
> > > the guest's 'lo' interface, and accordingly we set the guest side address
> > > to be the loopback address.  However this has a surprising side effect:
> > > sometimes guests will run services that are only supposed to be used within
> > > the guest and are therefore bound to only 127.0.0.1 and/or ::1.  pasta's
> > > forwarding exposes those services to the host, which isn't generally what
> > > we want.
> > > 
> > > Correct this by instead forwarding inbound "splice" flows to the guest's
> > > external address.
> > > 
> > > Link: https://github.com/containers/podman/issues/24045
> > > 
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > ---
> > >  conf.c  |  9 +++++++++
> > >  fwd.c   | 31 +++++++++++++++++++++++--------
> > >  passt.1 | 23 +++++++++++++++++++----
> > >  passt.h |  1 +
> > >  4 files changed, 52 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/conf.c b/conf.c
> > > index 6e62510..b5318f3 100644
> > > --- a/conf.c
> > > +++ b/conf.c
> > > @@ -908,6 +908,9 @@ pasta_opts:
> > >  		"  -U, --udp-ns SPEC	UDP port forwarding to init namespace\n"
> > >  		"    SPEC is as described above\n"
> > >  		"    default: auto\n"
> > > +		"  --host-lo-to-ns-lo	DEPRECATED:\n"  
> > 
> > For me, this is harder to type (and remember) than --expose-ns-lo or
> > similar. It doesn't matter so much as it's a deprecated option anyway.  
> 
> I'm not especially attached to the "host-lo-to-ns-lo" name, but I also
> don't love "expose-ns-lo".  I was aiming to give a bit more of a hint
> of exactly *when* the ns lo is exposed.

Ah, right, fair enough.

> 
> > --help lines are precious, I wouldn't use one to say DEPRECATED. On the
> > other hand, it's deprecated :) so I don't care that much (and it has
> > the advantage of making it very clear for users).  
> 
> Right, I copied the style to be consistent with --no-copy-routes and
> --no-copy-addrs.
> 
> >   
> > > +		"			Translate host-loopback forwards to\n"
> > > +		"			namespace loopback\n"
> > >  		"  --userns NSPATH 	Target user namespace to join\n"
> > >  		"  --netns PATH|NAME	Target network namespace to join\n"
> > >  		"  --netns-only		Don't join existing user namespace\n"
> > > @@ -1284,6 +1287,7 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  		{"netns-only",	no_argument,		NULL,		20 },
> > >  		{"map-host-loopback", required_argument, NULL,		21 },
> > >  		{"map-guest-addr", required_argument,	NULL,		22 },
> > > +		{"host-lo-to-ns-lo", no_argument, 	NULL,		23 },
> > >  		{ 0 },
> > >  	};
> > >  	const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt";
> > > @@ -1461,6 +1465,11 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  			conf_nat(optarg, &c->ip4.map_guest_addr,
> > >  				 &c->ip6.map_guest_addr, NULL);
> > >  			break;
> > > +		case 23:
> > > +			if (c->mode != MODE_PASTA)
> > > +				die("--host-lo-to-ns-lo is for pasta mode only");
> > > +			c->host_lo_to_ns_lo = 1;
> > > +			break;
> > >  		case 'd':
> > >  			c->debug = 1;
> > >  			c->quiet = 0;
> > > diff --git a/fwd.c b/fwd.c
> > > index a505098..c71f5e1 100644
> > > --- a/fwd.c
> > > +++ b/fwd.c
> > > @@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
> > >  	    (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
> > >  		/* spliceable */
> > >  
> > > -		/* Preserve the specific loopback adddress used, but let the
> > > -		 * kernel pick a source port on the target side
> > > +		/* The traffic will go over the guest's 'lo' interface, but by
> > > +		 * default use its external address, so we don't inadvertently
> > > +		 * expose services that listen only on the guest's loopback
> > > +		 * address.  That can be overridden by --host-lo-to-ns-lo which
> > > +		 * will instead forward to the loopback address in the guest.
> > > +		 *
> > > +		 * In either case, let the kernel pick the source address to
> > > +		 * match.
> > >  		 */
> > > -		tgt->oaddr = ini->eaddr;
> > > +		if (inany_v4(&ini->eaddr)) {
> > > +			if (c->host_lo_to_ns_lo)
> > > +				tgt->eaddr = inany_loopback4;
> > > +			else
> > > +				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> > > +			tgt->oaddr = inany_any4;
> > > +		} else {
> > > +			if (c->host_lo_to_ns_lo)
> > > +				tgt->eaddr = inany_loopback6;
> > > +			else
> > > +				tgt->eaddr.a6 = c->ip6.addr_seen;
> > > +			tgt->oaddr = inany_any6;
> > > +		}
> > > +
> > > +		/* Let the kernel pick source port */
> > >  		tgt->oport = 0;
> > >  		if (proto == IPPROTO_UDP)
> > >  			/* But for UDP preserve the source port */
> > >  			tgt->oport = ini->eport;
> > >  
> > > -		if (inany_v4(&ini->eaddr))
> > > -			tgt->eaddr = inany_loopback4;
> > > -		else
> > > -			tgt->eaddr = inany_loopback6;
> > > -
> > >  		return PIF_SPLICE;
> > >  	}
> > >  
> > > diff --git a/passt.1 b/passt.1
> > > index 11104e1..332384c 100644
> > > --- a/passt.1
> > > +++ b/passt.1
> > > @@ -586,6 +586,13 @@ Configure UDP port forwarding from target namespace to init namespace.
> > >  
> > >  Default is \fBauto\fR.
> > >  
> > > +.TP
> > > +.BR \-\-host-lo-to-ns-lo " " (DEPRECATED)
> > > +If specified, connections forwarded with \fB\-t\fR and \fB\-u\fR from
> > > +the host's loopback address will appear on the loopback address in the
> > > +guest as well.  Without this option such forwarded packets will appear
> > > +to come from the guest's public address.
> > > +
> > >  .TP
> > >  .BR \-\-userns " " \fIspec
> > >  Target user namespace to join, as a path. If PID is given, without this option,
> > > @@ -874,8 +881,9 @@ interfaces, and it would also be impossible for guest or target
> > >  namespace to route answers back.
> > >  
> > >  For convenience, the source address on these packets is translated to
> > > -the address specified by the \fB\-\-map-host-loopback\fR option.  If
> > > -not specified this defaults, somewhat arbitrarily, to the address of
> > > +the address specified by the \fB\-\-map-host-loopback\fR option (with
> > > +some exceptions in pasta mode, see next section below).  If not
> > > +specified this defaults, somewhat arbitrarily, to the address of
> > >  default IPv4 or IPv6 gateway (if any) -- this is known to be an
> > >  existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
> > >  \fB\-\-map-host-loopback none\fR are specified this translation is
> > > @@ -912,8 +920,15 @@ and the new socket using the \fBsplice\fR(2) system call, and for UDP, a pair
> > >  of \fBrecvmmsg\fR(2) and \fBsendmmsg\fR(2) system calls deals with packet
> > >  transfers.
> > >  
> > > -This bypass only applies to local connections and traffic, because it's not
> > > -possible to bind sockets to foreign addresses.
> > > +Because it's not possible to bind sockets to foreign addresses, this
> > > +bypass only applies to local connections and traffic.  It also means
> > > +that the address translation differs slightly from passt mode.
> > > +Connections from loopback to loopback on the host will appear to come
> > > +from the target namespace's public address within the guest, unless
> > > +\fB\-\-host-lo-to-ns-lo\fR is specified, in which case they will
> > > +appear to come from loopback in the namespace as well.  The latter
> > > +behaviour used to be the default, but is usually undesirable, since it
> > > +can unintentionally expose namespace local services to the host.
> > >  
> > >  .SS Binding to low numbered ports (well-known or system ports, up to 1023)
> > >  
> > > diff --git a/passt.h b/passt.h
> > > index 031c9b6..3b7b281 100644
> > > --- a/passt.h
> > > +++ b/passt.h
> > > @@ -284,6 +284,7 @@ struct ctx {
> > >  	int no_dhcpv6;
> > >  	int no_ndp;
> > >  	int no_ra;
> > > +	int host_lo_to_ns_lo;  
> > 
> > This is missing from the struct comment, but... again, it's deprecated,
> > so it doesn't matter much.  
> 
> Oops.  That I've added.  I could repost immediately, or wait and see
> if we can brainstorm a better option name.

I don't really have better ideas, let's keep it that way then.

-- 
Stefano


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-10-02  5:32 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-01  6:24 [PATCH v2 0/4] Don't expose container loopback services to the host David Gibson
2024-10-01  6:24 ` [PATCH v2 1/4] passt.1: Mark --stderr as deprecated more prominently David Gibson
2024-10-01  6:25 ` [PATCH v2 2/4] passt.1: Clarify and update "Handling of local addresses" section David Gibson
2024-10-01  6:25 ` [PATCH v2 3/4] test: Clarify test for spliced inbound transfers David Gibson
2024-10-01  6:25 ` [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address David Gibson
2024-10-01 18:22   ` Stefano Brivio
2024-10-02  1:29     ` David Gibson
2024-10-02  5:32       ` Stefano Brivio

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).