public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH v4 0/7] Don't expose container loopback services to the host
@ 2024-10-17  5:32 David Gibson
  2024-10-17  5:32 ` [PATCH v4 1/7] arp: Fix a handful of small warts David Gibson
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: David Gibson @ 2024-10-17  5:32 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

podman issue #24045 pointed out that pasta's spliced forwarding logic
can expose services within the namespace bound only to 127.0.0.1 or
::1 to the host.  However, the namespace probably expects those to
only be accessible to itself, so that's probably not what we want.

This changes our forwarding logic so as not to do this.  Note that the
podman tests will currently fail with this series, I've submitted
podman PR https://github.com/containers/podman/pull/24064 to fix that.

Link: https://github.com/containers/podman/issues/24045

Changes since v3:
 * Added several extra patches working around failures on Debian,
   because its dhclient-script doesn't wait for DAD to complete
Changes since v2:
 * Add new field do structure comment
Changes since v1:
 * Add --host-lo-to-ns-lo option to preserve the old behaviour
 * Clarify the new behaviour in the man page
 * Add some extra patches making some other corrections to the man
   page

David Gibson (7):
  arp: Fix a handful of small warts
  test: Explicitly wait for DAD to complete on SLAAC addresses
  test: Wait for DAD on DHCPv6 addresses
  passt.1: Mark --stderr as deprecated more prominently
  passt.1: Clarify and update "Handling of local addresses" section
  test: Clarify test for spliced inbound transfers
  fwd: Direct inbound spliced forwards to the guest's external address

 arp.c                 |  8 ++---
 conf.c                |  9 ++++++
 fwd.c                 | 31 +++++++++++++-----
 passt.1               | 75 ++++++++++++++++++++++++++-----------------
 passt.h               |  2 ++
 test/passt/dhcp       |  2 ++
 test/passt/ndp        |  4 ++-
 test/passt_in_ns/tcp  |  8 ++---
 test/passt_in_ns/udp  |  4 +--
 test/pasta/dhcp       |  2 ++
 test/pasta/ndp        |  3 +-
 test/pasta/tcp        | 16 ++++-----
 test/pasta/udp        |  8 ++---
 test/two_guests/basic |  6 +++-
 14 files changed, 115 insertions(+), 63 deletions(-)

-- 
2.47.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4 1/7] arp: Fix a handful of small warts
  2024-10-17  5:32 [PATCH v4 0/7] Don't expose container loopback services to the host David Gibson
@ 2024-10-17  5:32 ` David Gibson
  2024-10-17  8:45   ` Stefano Brivio
  2024-10-17  5:32 ` [PATCH v4 2/7] test: Explicitly wait for DAD to complete on SLAAC addresses David Gibson
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: David Gibson @ 2024-10-17  5:32 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

This fixes a number of harmless but slightly ugly warts in the ARP
resolution code:
 * Use in4addr_any to represent 0.0.0.0 rather than hand constructing an
   example.
 * When comparing am->sip against 0.0.0.0 use sizeof(am->sip) instead of
   sizeof(am->tip) (same value, but makes more logical sense)
 * Described the guest's assigned address as such, rather than as "our
   address" - that's not usually what we mean by "our address" these days
 * Remove "we might have the same IP address" comment which I can't make
   sense of in context (possibly it's relating to the statement below,
   which already has its own comment?)

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arp.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arp.c b/arp.c
index 53334da..d34b20e 100644
--- a/arp.c
+++ b/arp.c
@@ -59,14 +59,12 @@ int arp(const struct ctx *c, const struct pool *p)
 	    ah->ar_op  != htons(ARPOP_REQUEST))
 		return 1;
 
-	/* Discard announcements (but not 0.0.0.0 "probes"): we might have the
-	 * same IP address, hide that.
-	 */
-	if (memcmp(am->sip, (unsigned char[4]){ 0 }, sizeof(am->tip)) &&
+	/* Discard announcements, but not 0.0.0.0 "probes" */
+	if (memcmp(am->sip, &in4addr_any, sizeof(am->sip)) &&
 	    !memcmp(am->sip, am->tip, sizeof(am->sip)))
 		return 1;
 
-	/* Don't resolve our own address, either. */
+	/* Don't resolve guest's assigned address, either. */
 	if (!memcmp(am->tip, &c->ip4.addr, sizeof(am->tip)))
 		return 1;
 
-- 
@@ -59,14 +59,12 @@ int arp(const struct ctx *c, const struct pool *p)
 	    ah->ar_op  != htons(ARPOP_REQUEST))
 		return 1;
 
-	/* Discard announcements (but not 0.0.0.0 "probes"): we might have the
-	 * same IP address, hide that.
-	 */
-	if (memcmp(am->sip, (unsigned char[4]){ 0 }, sizeof(am->tip)) &&
+	/* Discard announcements, but not 0.0.0.0 "probes" */
+	if (memcmp(am->sip, &in4addr_any, sizeof(am->sip)) &&
 	    !memcmp(am->sip, am->tip, sizeof(am->sip)))
 		return 1;
 
-	/* Don't resolve our own address, either. */
+	/* Don't resolve guest's assigned address, either. */
 	if (!memcmp(am->tip, &c->ip4.addr, sizeof(am->tip)))
 		return 1;
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 2/7] test: Explicitly wait for DAD to complete on SLAAC addresses
  2024-10-17  5:32 [PATCH v4 0/7] Don't expose container loopback services to the host David Gibson
  2024-10-17  5:32 ` [PATCH v4 1/7] arp: Fix a handful of small warts David Gibson
@ 2024-10-17  5:32 ` David Gibson
  2024-10-17  5:33 ` [PATCH v4 3/7] test: Wait for DAD on DHCPv6 addresses David Gibson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2024-10-17  5:32 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

Getting a SLAAC address takes a little while because the kernel must
complete Duplicate Address Detection (DAD) before marking the address as
ready.  In several places we have an explicit 'sleep 2' to wait for that
to complete.

Fixed length delays are never a great idea, although this one is pretty
solid.  Still, it would be better to explicitly wait for DAD to complete
in case of long delays (which might happen on slow emulated hosts, or with
heavy load), and to speed the tests up if DAD completes quicker.

Replace the fixed sleeps with a loop waiting for DAD to complete.  We do
this by looping waiting for all tentative addresses to disappear.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/passt/ndp        | 4 +++-
 test/pasta/ndp        | 3 ++-
 test/two_guests/basic | 3 ++-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/test/passt/ndp b/test/passt/ndp
index 6bf8af3..f54b8ce 100644
--- a/test/passt/ndp
+++ b/test/passt/ndp
@@ -16,7 +16,9 @@ htools	ip jq sipcalc grep cut
 
 test	Interface name
 gout	IFNAME ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname'
-guest	ip link set dev __IFNAME__ up && sleep 2
+guest	ip link set dev __IFNAME__ up
+# Wait for DAD to complete
+guest	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 hout	HOST_IFNAME6 ip -j -6 route show|jq -rM '[.[] | select(.dst == "default").dev] | .[0]'
 check	[ -n "__IFNAME__" ]
 
diff --git a/test/pasta/ndp b/test/pasta/ndp
index d45ff7b..c59627f 100644
--- a/test/pasta/ndp
+++ b/test/pasta/ndp
@@ -18,7 +18,8 @@ test	Interface name
 nsout	IFNAME ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname'
 check	[ -n "__IFNAME__" ]
 ns	ip link set dev __IFNAME__ up
-sleep	2
+# Wait for DAD to complete
+ns	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 
 test	SLAAC: prefix
 nsout	ADDR6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME__").addr_info[] | select(.scope == "global" and .prefixlen == 64).local] | .[0]'
diff --git a/test/two_guests/basic b/test/two_guests/basic
index 4d49e85..ac50ff8 100644
--- a/test/two_guests/basic
+++ b/test/two_guests/basic
@@ -36,7 +36,8 @@ check	[ "__ADDR2__" = "__HOST_ADDR__" ]
 
 test	DHCPv6: addresses
 # Link is up now, wait for DAD to complete
-sleep	2
+guest1	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
+guest2	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 guest1	/sbin/dhclient -6 __IFNAME1__
 guest2	/sbin/dhclient -6 __IFNAME2__
 g1out	ADDR1_6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME1__").addr_info[] | select(.prefixlen == 128).local] | .[0]'
-- 
@@ -36,7 +36,8 @@ check	[ "__ADDR2__" = "__HOST_ADDR__" ]
 
 test	DHCPv6: addresses
 # Link is up now, wait for DAD to complete
-sleep	2
+guest1	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
+guest2	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 guest1	/sbin/dhclient -6 __IFNAME1__
 guest2	/sbin/dhclient -6 __IFNAME2__
 g1out	ADDR1_6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME1__").addr_info[] | select(.prefixlen == 128).local] | .[0]'
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 3/7] test: Wait for DAD on DHCPv6 addresses
  2024-10-17  5:32 [PATCH v4 0/7] Don't expose container loopback services to the host David Gibson
  2024-10-17  5:32 ` [PATCH v4 1/7] arp: Fix a handful of small warts David Gibson
  2024-10-17  5:32 ` [PATCH v4 2/7] test: Explicitly wait for DAD to complete on SLAAC addresses David Gibson
@ 2024-10-17  5:33 ` David Gibson
  2024-10-17  8:45   ` Stefano Brivio
  2024-10-17  5:33 ` [PATCH v4 4/7] passt.1: Mark --stderr as deprecated more prominently David Gibson
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: David Gibson @ 2024-10-17  5:33 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

After running dhclient -6 we expect the DHCPv6 assigned address to be
immediately usable.  That's true with the Fedora dhclient-script (and the
upstream ISC DHCP one), however it's not true with the Debian
dhclient-script.  The Debian script can complete with the address still
in "tentative" state, and the address won't be usable until Duplicate
Address Detection (DAD) completes.  That's arguably a bug in Debian (see
link below), but for the time being we need to work around it anyway.

We usually get away with this, because by the time we do anything where the
address matters, DAD has completed.  However, it's not robust, so we should
explicitly wait for DAD to complete when we get an DHCPv6 address.

Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1085231

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/passt/dhcp       | 2 ++
 test/pasta/dhcp       | 2 ++
 test/two_guests/basic | 3 +++
 3 files changed, 7 insertions(+)

diff --git a/test/passt/dhcp b/test/passt/dhcp
index e05a4bb..9925ab9 100644
--- a/test/passt/dhcp
+++ b/test/passt/dhcp
@@ -49,6 +49,8 @@ check	[ "__SEARCH__" = "__HOST_SEARCH__" ]
 
 test	DHCPv6: address
 guest	/sbin/dhclient -6 __IFNAME__
+# Wait for DAD to complete
+guest	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 gout	ADDR6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME__").addr_info[] | select(.prefixlen == 128).local] | .[0]'
 hout	HOST_ADDR6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__HOST_IFNAME6__").addr_info[] | select(.scope == "global" and .deprecated != true).local] | .[0]'
 check	[ "__ADDR6__" = "__HOST_ADDR6__" ]
diff --git a/test/pasta/dhcp b/test/pasta/dhcp
index 41556b8..d4f3ad5 100644
--- a/test/pasta/dhcp
+++ b/test/pasta/dhcp
@@ -35,6 +35,8 @@ check	[ __MTU__ = 65520 ]
 
 test	DHCPv6: address
 ns	/sbin/dhclient -6 --no-pid __IFNAME__
+# Wait for DAD to complete
+ns	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 hout	HOST_IFNAME6 ip -j -6 route show|jq -rM '[.[] | select(.dst == "default").dev] | .[0]'
 nsout	ADDR6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME__").addr_info[] | select(.prefixlen == 128).local] | .[0]'
 hout	HOST_ADDR6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__HOST_IFNAME6__").addr_info[] | select(.scope == "global" and .deprecated != true).local] | .[0]'
diff --git a/test/two_guests/basic b/test/two_guests/basic
index ac50ff8..9ba5efe 100644
--- a/test/two_guests/basic
+++ b/test/two_guests/basic
@@ -40,6 +40,9 @@ guest1	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1;
 guest2	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 guest1	/sbin/dhclient -6 __IFNAME1__
 guest2	/sbin/dhclient -6 __IFNAME2__
+# Wait for DAD to complete on the DHCP address
+guest1	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
+guest2	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 g1out	ADDR1_6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME1__").addr_info[] | select(.prefixlen == 128).local] | .[0]'
 g2out	ADDR2_6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME2__").addr_info[] | select(.prefixlen == 128).local] | .[0]'
 hout	HOST_ADDR6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__HOST_IFNAME6__").addr_info[] | select(.scope == "global" and .deprecated != true).local] | .[0]'
-- 
@@ -40,6 +40,9 @@ guest1	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1;
 guest2	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 guest1	/sbin/dhclient -6 __IFNAME1__
 guest2	/sbin/dhclient -6 __IFNAME2__
+# Wait for DAD to complete on the DHCP address
+guest1	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
+guest2	while ip -j -6 addr show tentative | jq -e '.[].addr_info'; do sleep 0.1; done
 g1out	ADDR1_6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME1__").addr_info[] | select(.prefixlen == 128).local] | .[0]'
 g2out	ADDR2_6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__IFNAME2__").addr_info[] | select(.prefixlen == 128).local] | .[0]'
 hout	HOST_ADDR6 ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "__HOST_IFNAME6__").addr_info[] | select(.scope == "global" and .deprecated != true).local] | .[0]'
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 4/7] passt.1: Mark --stderr as deprecated more prominently
  2024-10-17  5:32 [PATCH v4 0/7] Don't expose container loopback services to the host David Gibson
                   ` (2 preceding siblings ...)
  2024-10-17  5:33 ` [PATCH v4 3/7] test: Wait for DAD on DHCPv6 addresses David Gibson
@ 2024-10-17  5:33 ` David Gibson
  2024-10-17  5:33 ` [PATCH v4 5/7] passt.1: Clarify and update "Handling of local addresses" section David Gibson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2024-10-17  5:33 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

The description of this option says that it's deprecated, but unlike
--no-copy-addrs and --no-copy-routes it doesn't have a clear label.  Add
one to make it easier to spot.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 passt.1 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/passt.1 b/passt.1
index ef33267..c573788 100644
--- a/passt.1
+++ b/passt.1
@@ -95,7 +95,7 @@ detached PID namespace after starting, because the PID itself cannot change.
 Default is to fork into background.
 
 .TP
-.BR \-e ", " \-\-stderr
+.BR \-e ", " \-\-stderr " " (DEPRECATED)
 This option has no effect, and is maintained for compatibility purposes only.
 
 Note that this configuration option is \fBdeprecated\fR and will be removed in a
-- 
@@ -95,7 +95,7 @@ detached PID namespace after starting, because the PID itself cannot change.
 Default is to fork into background.
 
 .TP
-.BR \-e ", " \-\-stderr
+.BR \-e ", " \-\-stderr " " (DEPRECATED)
 This option has no effect, and is maintained for compatibility purposes only.
 
 Note that this configuration option is \fBdeprecated\fR and will be removed in a
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 5/7] passt.1: Clarify and update "Handling of local addresses" section
  2024-10-17  5:32 [PATCH v4 0/7] Don't expose container loopback services to the host David Gibson
                   ` (3 preceding siblings ...)
  2024-10-17  5:33 ` [PATCH v4 4/7] passt.1: Mark --stderr as deprecated more prominently David Gibson
@ 2024-10-17  5:33 ` David Gibson
  2024-10-17  5:33 ` [PATCH v4 6/7] test: Clarify test for spliced inbound transfers David Gibson
  2024-10-17  5:33 ` [PATCH v4 7/7] fwd: Direct inbound spliced forwards to the guest's external address David Gibson
  6 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2024-10-17  5:33 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

This section didn't mention the effect of the --map-host-loopback option
which now alters this behaviour.  Update it accordingly.

It used "local addresses" to mean specifically 127.0.0.0/8 and ::1.
However, "local" could also refer to link-local addresses or to addresses
of any scope which happen to be configured on the host.  Use "loopback
address" to be more precise about this.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 passt.1 | 54 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 28 insertions(+), 26 deletions(-)

diff --git a/passt.1 b/passt.1
index c573788..46100e2 100644
--- a/passt.1
+++ b/passt.1
@@ -882,38 +882,40 @@ root@localhost's password:
 
 .SH NOTES
 
-.SS Handling of traffic with local destination and source addresses
-
-Both \fBpasst\fR and \fBpasta\fR can bind on ports with a local address,
-depending on the configuration. Local destination or source addresses need to be
-changed before packets are delivered to the guest or target namespace: most
-operating systems would drop packets received from non-loopback interfaces with
-local addresses, and it would also be impossible for guest or target namespace
-to route answers back.
-
-For convenience, and somewhat arbitrarily, the source address on these packets
-is translated to the address of the default IPv4 or IPv6 gateway (if any) --
-this is known to be an existing, valid address on the same subnet.
-
-Loopback destination addresses are instead translated to the observed external
-address of the guest or target namespace. For IPv6 packets, if usage of a
-link-local address by guest or namespace has ever been observed, and the
-original destination address is also a link-local address, the observed
-link-local address is used. Otherwise, the observed global address is used. For
-both IPv4 and IPv6, if no addresses have been seen yet, the configured addresses
-will be used instead.
+.SS Handling of traffic with loopback destination and source addresses
+
+Both \fBpasst\fR and \fBpasta\fR can bind on ports with a loopback
+address (127.0.0.0/8 or ::1), depending on the configuration. Loopback
+destination or source addresses need to be changed before packets are
+delivered to the guest or target namespace: most operating systems
+would drop packets received with loopback addresses on non-loopback
+interfaces, and it would also be impossible for guest or target
+namespace to route answers back.
+
+For convenience, the source address on these packets is translated to
+the address specified by the \fB\-\-map-host-loopback\fR option.  If
+not specified this defaults, somewhat arbitrarily, to the address of
+default IPv4 or IPv6 gateway (if any) -- this is known to be an
+existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
+\fB\-\-map-host-loopback none\fR are specified this translation is
+disabled and packets with loopback addresses are simply dropped.
+
+Loopback destination addresses are translated to the observed external
+address of the guest or target namespace. For IPv6, the observed
+link-local address is used if the translated source address is
+link-local, otherwise the observed global address is used. For both
+IPv4 and IPv6, if no addresses have been seen yet, the configured
+addresses will be used instead.
 
 For example, if \fBpasst\fR or \fBpasta\fR receive a connection from 127.0.0.1,
 with destination 127.0.0.10, and the default IPv4 gateway is 192.0.2.1, while
 the last observed source address from guest or namespace is 192.0.2.2, this will
 be translated to a connection from 192.0.2.1 to 192.0.2.2.
 
-Similarly, for traffic coming from guest or namespace, packets with destination
-address corresponding to the default gateway will have their destination address
-translated to a loopback address, if and only if a packet, in the opposite
-direction, with a loopback destination or source address, port-wise matching for
-UDP, or connection-wise for TCP, has been recently forwarded to guest or
-namespace. This behaviour can be disabled with \-\-no\-map\-gw.
+Similarly, for traffic coming from guest or namespace, packets with
+destination address corresponding to the \fB\-\-map-host-loopback\fR
+address will have their destination address translated to a loopback
+address.
 
 .SS Handling of local traffic in pasta
 
-- 
@@ -882,38 +882,40 @@ root@localhost's password:
 
 .SH NOTES
 
-.SS Handling of traffic with local destination and source addresses
-
-Both \fBpasst\fR and \fBpasta\fR can bind on ports with a local address,
-depending on the configuration. Local destination or source addresses need to be
-changed before packets are delivered to the guest or target namespace: most
-operating systems would drop packets received from non-loopback interfaces with
-local addresses, and it would also be impossible for guest or target namespace
-to route answers back.
-
-For convenience, and somewhat arbitrarily, the source address on these packets
-is translated to the address of the default IPv4 or IPv6 gateway (if any) --
-this is known to be an existing, valid address on the same subnet.
-
-Loopback destination addresses are instead translated to the observed external
-address of the guest or target namespace. For IPv6 packets, if usage of a
-link-local address by guest or namespace has ever been observed, and the
-original destination address is also a link-local address, the observed
-link-local address is used. Otherwise, the observed global address is used. For
-both IPv4 and IPv6, if no addresses have been seen yet, the configured addresses
-will be used instead.
+.SS Handling of traffic with loopback destination and source addresses
+
+Both \fBpasst\fR and \fBpasta\fR can bind on ports with a loopback
+address (127.0.0.0/8 or ::1), depending on the configuration. Loopback
+destination or source addresses need to be changed before packets are
+delivered to the guest or target namespace: most operating systems
+would drop packets received with loopback addresses on non-loopback
+interfaces, and it would also be impossible for guest or target
+namespace to route answers back.
+
+For convenience, the source address on these packets is translated to
+the address specified by the \fB\-\-map-host-loopback\fR option.  If
+not specified this defaults, somewhat arbitrarily, to the address of
+default IPv4 or IPv6 gateway (if any) -- this is known to be an
+existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
+\fB\-\-map-host-loopback none\fR are specified this translation is
+disabled and packets with loopback addresses are simply dropped.
+
+Loopback destination addresses are translated to the observed external
+address of the guest or target namespace. For IPv6, the observed
+link-local address is used if the translated source address is
+link-local, otherwise the observed global address is used. For both
+IPv4 and IPv6, if no addresses have been seen yet, the configured
+addresses will be used instead.
 
 For example, if \fBpasst\fR or \fBpasta\fR receive a connection from 127.0.0.1,
 with destination 127.0.0.10, and the default IPv4 gateway is 192.0.2.1, while
 the last observed source address from guest or namespace is 192.0.2.2, this will
 be translated to a connection from 192.0.2.1 to 192.0.2.2.
 
-Similarly, for traffic coming from guest or namespace, packets with destination
-address corresponding to the default gateway will have their destination address
-translated to a loopback address, if and only if a packet, in the opposite
-direction, with a loopback destination or source address, port-wise matching for
-UDP, or connection-wise for TCP, has been recently forwarded to guest or
-namespace. This behaviour can be disabled with \-\-no\-map\-gw.
+Similarly, for traffic coming from guest or namespace, packets with
+destination address corresponding to the \fB\-\-map-host-loopback\fR
+address will have their destination address translated to a loopback
+address.
 
 .SS Handling of local traffic in pasta
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 6/7] test: Clarify test for spliced inbound transfers
  2024-10-17  5:32 [PATCH v4 0/7] Don't expose container loopback services to the host David Gibson
                   ` (4 preceding siblings ...)
  2024-10-17  5:33 ` [PATCH v4 5/7] passt.1: Clarify and update "Handling of local addresses" section David Gibson
@ 2024-10-17  5:33 ` David Gibson
  2024-10-17  5:33 ` [PATCH v4 7/7] fwd: Direct inbound spliced forwards to the guest's external address David Gibson
  6 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2024-10-17  5:33 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

The tests in pasta/tcp and pasta/udp for inbound transfers have the server
listening within the namespace explicitly bound to 127.0.0.1 or ::1.  This
only works because of the behaviour of inbound splice connections, which
always appear with both source and destination addresses as loopback in
the namespace.  That's not an inherent property for "spliced" connections
and arguably an undesirable one.  Also update the test names to make it
clearer that these tests are expecting to exercise the "splice" path.

Interestingly this was already correct for the equivalent passt_in_ns/*,
although we also update the test names for clarity there.

Note that there are similar issues in some of the podman tests, addressed
in https://github.com/containers/podman/pull/24064

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/passt_in_ns/tcp |  8 ++++----
 test/passt_in_ns/udp |  4 ++--
 test/pasta/tcp       | 16 ++++++++--------
 test/pasta/udp       |  8 ++++----
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/test/passt_in_ns/tcp b/test/passt_in_ns/tcp
index aaf340e..319880b 100644
--- a/test/passt_in_ns/tcp
+++ b/test/passt_in_ns/tcp
@@ -32,7 +32,7 @@ host	socat -u OPEN:__BASEPATH__/big.bin TCP4:127.0.0.1:10001
 guestw
 guest	cmp test_big.bin /root/big.bin
 
-test	TCP/IPv4: host to ns: big transfer
+test	TCP/IPv4: host to ns (spliced): big transfer
 nsb	socat -u TCP4-LISTEN:10002 OPEN:__TEMP_NS_BIG__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/big.bin TCP4:127.0.0.1:10002
@@ -90,7 +90,7 @@ host	socat -u OPEN:__BASEPATH__/small.bin TCP4:127.0.0.1:10001
 guestw
 guest	cmp test_small.bin /root/small.bin
 
-test	TCP/IPv4: host to ns: small transfer
+test	TCP/IPv4: host to ns (spliced): small transfer
 nsb	socat -u TCP4-LISTEN:10002 OPEN:__TEMP_NS_SMALL__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/small.bin TCP4:127.0.0.1:10002
@@ -146,7 +146,7 @@ host	socat -u OPEN:__BASEPATH__/big.bin TCP6:[::1]:10001
 guestw
 guest	cmp test_big.bin /root/big.bin
 
-test	TCP/IPv6: host to ns: big transfer
+test	TCP/IPv6: host to ns (spliced): big transfer
 nsb	socat -u TCP6-LISTEN:10002 OPEN:__TEMP_NS_BIG__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/big.bin TCP6:[::1]:10002
@@ -204,7 +204,7 @@ host	socat -u OPEN:__BASEPATH__/small.bin TCP6:[::1]:10001
 guestw
 guest	cmp test_small.bin /root/small.bin
 
-test	TCP/IPv6: host to ns: small transfer
+test	TCP/IPv6: host to ns (spliced): small transfer
 nsb	socat -u TCP6-LISTEN:10002 OPEN:__TEMP_NS_SMALL__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/small.bin TCP6:[::1]:10002
diff --git a/test/passt_in_ns/udp b/test/passt_in_ns/udp
index 3426ab9..791511c 100644
--- a/test/passt_in_ns/udp
+++ b/test/passt_in_ns/udp
@@ -30,7 +30,7 @@ host	socat -u OPEN:__BASEPATH__/medium.bin UDP4:127.0.0.1:10001,shut-null
 guestw
 guest	cmp test.bin /root/medium.bin
 
-test	UDP/IPv4: host to ns
+test	UDP/IPv4: host to ns (recvmmsg/sendmmsg)
 nsb	socat -u UDP4-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/medium.bin UDP4:127.0.0.1:10002,shut-null
@@ -88,7 +88,7 @@ host	socat -u OPEN:__BASEPATH__/medium.bin UDP6:[::1]:10001,shut-null
 guestw
 guest	cmp test.bin /root/medium.bin
 
-test	UDP/IPv6: host to ns
+test	UDP/IPv6: host to ns (recvmmsg/sendmmsg)
 nsb	socat -u UDP6-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 sleep	1
 host	socat -u OPEN:__BASEPATH__/medium.bin UDP6:[::1]:10002,shut-null
diff --git a/test/pasta/tcp b/test/pasta/tcp
index 6ab18c5..53b6f25 100644
--- a/test/pasta/tcp
+++ b/test/pasta/tcp
@@ -19,8 +19,8 @@ set	TEMP_NS_BIG __STATEDIR__/test_ns_big.bin
 set	TEMP_SMALL __STATEDIR__/test_small.bin
 set	TEMP_NS_SMALL __STATEDIR__/test_ns_small.bin
 
-test	TCP/IPv4: host to ns: big transfer
-nsb	socat -u TCP4-LISTEN:10002,bind=127.0.0.1 OPEN:__TEMP_NS_BIG__,create,trunc
+test	TCP/IPv4: host to ns (spliced): big transfer
+nsb	socat -u TCP4-LISTEN:10002 OPEN:__TEMP_NS_BIG__,create,trunc
 host	socat -u OPEN:__BASEPATH__/big.bin TCP4:127.0.0.1:10002
 nsw
 check	cmp __BASEPATH__/big.bin __TEMP_NS_BIG__
@@ -38,8 +38,8 @@ ns	socat -u OPEN:__BASEPATH__/big.bin TCP4:__GW__:10003
 hostw
 check	cmp __BASEPATH__/big.bin __TEMP_BIG__
 
-test	TCP/IPv4: host to ns: small transfer
-nsb	socat -u TCP4-LISTEN:10002,bind=127.0.0.1 OPEN:__TEMP_NS_SMALL__,create,trunc
+test	TCP/IPv4: host to ns (spliced): small transfer
+nsb	socat -u TCP4-LISTEN:10002 OPEN:__TEMP_NS_SMALL__,create,trunc
 host	socat OPEN:__BASEPATH__/small.bin TCP4:127.0.0.1:10002
 nsw
 check	cmp __BASEPATH__/small.bin __TEMP_NS_SMALL__
@@ -57,8 +57,8 @@ ns	socat -u OPEN:__BASEPATH__/small.bin TCP4:__GW__:10003
 hostw
 check	cmp __BASEPATH__/small.bin __TEMP_SMALL__
 
-test	TCP/IPv6: host to ns: big transfer
-nsb	socat -u TCP6-LISTEN:10002,bind=[::1] OPEN:__TEMP_NS_BIG__,create,trunc
+test	TCP/IPv6: host to ns (spliced): big transfer
+nsb	socat -u TCP6-LISTEN:10002 OPEN:__TEMP_NS_BIG__,create,trunc
 host	socat -u OPEN:__BASEPATH__/big.bin TCP6:[::1]:10002
 nsw
 check	cmp __BASEPATH__/big.bin __TEMP_NS_BIG__
@@ -77,8 +77,8 @@ ns	socat -u OPEN:__BASEPATH__/big.bin TCP6:[__GW6__%__IFNAME__]:10003
 hostw
 check	cmp __BASEPATH__/big.bin __TEMP_BIG__
 
-test	TCP/IPv6: host to ns: small transfer
-nsb	socat -u TCP6-LISTEN:10002,bind=[::1] OPEN:__TEMP_NS_SMALL__,create,trunc
+test	TCP/IPv6: host to ns (spliced): small transfer
+nsb	socat -u TCP6-LISTEN:10002 OPEN:__TEMP_NS_SMALL__,create,trunc
 host	socat -u OPEN:__BASEPATH__/small.bin TCP6:[::1]:10002
 nsw
 check	cmp __BASEPATH__/small.bin __TEMP_NS_SMALL__
diff --git a/test/pasta/udp b/test/pasta/udp
index 30e3a85..7734d02 100644
--- a/test/pasta/udp
+++ b/test/pasta/udp
@@ -17,8 +17,8 @@ htools	dd socat ip jq
 set	TEMP __STATEDIR__/test.bin
 set	TEMP_NS __STATEDIR__/test_ns.bin
 
-test	UDP/IPv4: host to ns
-nsb	socat -u UDP4-LISTEN:10002,bind=127.0.0.1,null-eof OPEN:__TEMP_NS__,create,trunc
+test	UDP/IPv4: host to ns (recvmmsg/sendmmsg)
+nsb	socat -u UDP4-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 host	socat OPEN:__BASEPATH__/medium.bin UDP4:127.0.0.1:10002,shut-null
 nsw
 check	cmp __BASEPATH__/medium.bin __TEMP_NS__
@@ -37,8 +37,8 @@ ns	socat -u OPEN:__BASEPATH__/medium.bin UDP4:__GW__:10003,shut-null
 hostw
 check	cmp __BASEPATH__/medium.bin __TEMP__
 
-test	UDP/IPv6: host to ns
-nsb	socat -u UDP6-LISTEN:10002,bind=[::1],null-eof OPEN:__TEMP_NS__,create,trunc
+test	UDP/IPv6: host to ns (recvmmsg/sendmmsg)
+nsb	socat -u UDP6-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 host	socat -u OPEN:__BASEPATH__/medium.bin UDP6:[::1]:10002,shut-null
 nsw
 check	cmp __BASEPATH__/medium.bin __TEMP_NS__
-- 
@@ -17,8 +17,8 @@ htools	dd socat ip jq
 set	TEMP __STATEDIR__/test.bin
 set	TEMP_NS __STATEDIR__/test_ns.bin
 
-test	UDP/IPv4: host to ns
-nsb	socat -u UDP4-LISTEN:10002,bind=127.0.0.1,null-eof OPEN:__TEMP_NS__,create,trunc
+test	UDP/IPv4: host to ns (recvmmsg/sendmmsg)
+nsb	socat -u UDP4-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 host	socat OPEN:__BASEPATH__/medium.bin UDP4:127.0.0.1:10002,shut-null
 nsw
 check	cmp __BASEPATH__/medium.bin __TEMP_NS__
@@ -37,8 +37,8 @@ ns	socat -u OPEN:__BASEPATH__/medium.bin UDP4:__GW__:10003,shut-null
 hostw
 check	cmp __BASEPATH__/medium.bin __TEMP__
 
-test	UDP/IPv6: host to ns
-nsb	socat -u UDP6-LISTEN:10002,bind=[::1],null-eof OPEN:__TEMP_NS__,create,trunc
+test	UDP/IPv6: host to ns (recvmmsg/sendmmsg)
+nsb	socat -u UDP6-LISTEN:10002,null-eof OPEN:__TEMP_NS__,create,trunc
 host	socat -u OPEN:__BASEPATH__/medium.bin UDP6:[::1]:10002,shut-null
 nsw
 check	cmp __BASEPATH__/medium.bin __TEMP_NS__
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 7/7] fwd: Direct inbound spliced forwards to the guest's external address
  2024-10-17  5:32 [PATCH v4 0/7] Don't expose container loopback services to the host David Gibson
                   ` (5 preceding siblings ...)
  2024-10-17  5:33 ` [PATCH v4 6/7] test: Clarify test for spliced inbound transfers David Gibson
@ 2024-10-17  5:33 ` David Gibson
  6 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2024-10-17  5:33 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

In pasta mode, where addressing permits we "splice" connections, forwarding
directly from host socket to guest/container socket without any L2 or L3
processing.  This gives us a very large performance improvement when it's
possible.

Since the traffic is from a local socket within the guest, it will go over
the guest's 'lo' interface, and accordingly we set the guest side address
to be the loopback address.  However this has a surprising side effect:
sometimes guests will run services that are only supposed to be used within
the guest and are therefore bound to only 127.0.0.1 and/or ::1.  pasta's
forwarding exposes those services to the host, which isn't generally what
we want.

Correct this by instead forwarding inbound "splice" flows to the guest's
external address.

Link: https://github.com/containers/podman/issues/24045

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 conf.c  |  9 +++++++++
 fwd.c   | 31 +++++++++++++++++++++++--------
 passt.1 | 23 +++++++++++++++++++----
 passt.h |  2 ++
 4 files changed, 53 insertions(+), 12 deletions(-)

diff --git a/conf.c b/conf.c
index c631019..b3b5342 100644
--- a/conf.c
+++ b/conf.c
@@ -912,6 +912,9 @@ pasta_opts:
 		"  -U, --udp-ns SPEC	UDP port forwarding to init namespace\n"
 		"    SPEC is as described above\n"
 		"    default: auto\n"
+		"  --host-lo-to-ns-lo	DEPRECATED:\n"
+		"			Translate host-loopback forwards to\n"
+		"			namespace loopback\n"
 		"  --userns NSPATH 	Target user namespace to join\n"
 		"  --netns PATH|NAME	Target network namespace to join\n"
 		"  --netns-only		Don't join existing user namespace\n"
@@ -1289,6 +1292,7 @@ void conf(struct ctx *c, int argc, char **argv)
 		{"netns-only",	no_argument,		NULL,		20 },
 		{"map-host-loopback", required_argument, NULL,		21 },
 		{"map-guest-addr", required_argument,	NULL,		22 },
+		{"host-lo-to-ns-lo", no_argument, 	NULL,		23 },
 		{"dns-host",	required_argument,	NULL,		24 },
 		{ 0 },
 	};
@@ -1467,6 +1471,11 @@ void conf(struct ctx *c, int argc, char **argv)
 			conf_nat(optarg, &c->ip4.map_guest_addr,
 				 &c->ip6.map_guest_addr, NULL);
 			break;
+		case 23:
+			if (c->mode != MODE_PASTA)
+				die("--host-lo-to-ns-lo is for pasta mode only");
+			c->host_lo_to_ns_lo = 1;
+			break;
 		case 24:
 			if (inet_pton(AF_INET6, optarg, &c->ip6.dns_host) &&
 			    !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_host))
diff --git a/fwd.c b/fwd.c
index a505098..c71f5e1 100644
--- a/fwd.c
+++ b/fwd.c
@@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
 	    (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
 		/* spliceable */
 
-		/* Preserve the specific loopback adddress used, but let the
-		 * kernel pick a source port on the target side
+		/* The traffic will go over the guest's 'lo' interface, but by
+		 * default use its external address, so we don't inadvertently
+		 * expose services that listen only on the guest's loopback
+		 * address.  That can be overridden by --host-lo-to-ns-lo which
+		 * will instead forward to the loopback address in the guest.
+		 *
+		 * In either case, let the kernel pick the source address to
+		 * match.
 		 */
-		tgt->oaddr = ini->eaddr;
+		if (inany_v4(&ini->eaddr)) {
+			if (c->host_lo_to_ns_lo)
+				tgt->eaddr = inany_loopback4;
+			else
+				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
+			tgt->oaddr = inany_any4;
+		} else {
+			if (c->host_lo_to_ns_lo)
+				tgt->eaddr = inany_loopback6;
+			else
+				tgt->eaddr.a6 = c->ip6.addr_seen;
+			tgt->oaddr = inany_any6;
+		}
+
+		/* Let the kernel pick source port */
 		tgt->oport = 0;
 		if (proto == IPPROTO_UDP)
 			/* But for UDP preserve the source port */
 			tgt->oport = ini->eport;
 
-		if (inany_v4(&ini->eaddr))
-			tgt->eaddr = inany_loopback4;
-		else
-			tgt->eaddr = inany_loopback6;
-
 		return PIF_SPLICE;
 	}
 
diff --git a/passt.1 b/passt.1
index 46100e2..f084978 100644
--- a/passt.1
+++ b/passt.1
@@ -605,6 +605,13 @@ Configure UDP port forwarding from target namespace to init namespace.
 
 Default is \fBauto\fR.
 
+.TP
+.BR \-\-host-lo-to-ns-lo " " (DEPRECATED)
+If specified, connections forwarded with \fB\-t\fR and \fB\-u\fR from
+the host's loopback address will appear on the loopback address in the
+guest as well.  Without this option such forwarded packets will appear
+to come from the guest's public address.
+
 .TP
 .BR \-\-userns " " \fIspec
 Target user namespace to join, as a path. If PID is given, without this option,
@@ -893,8 +900,9 @@ interfaces, and it would also be impossible for guest or target
 namespace to route answers back.
 
 For convenience, the source address on these packets is translated to
-the address specified by the \fB\-\-map-host-loopback\fR option.  If
-not specified this defaults, somewhat arbitrarily, to the address of
+the address specified by the \fB\-\-map-host-loopback\fR option (with
+some exceptions in pasta mode, see next section below).  If not
+specified this defaults, somewhat arbitrarily, to the address of
 default IPv4 or IPv6 gateway (if any) -- this is known to be an
 existing, valid address on the same subnet.  If \fB\-\-no-map-gw\fR or
 \fB\-\-map-host-loopback none\fR are specified this translation is
@@ -931,8 +939,15 @@ and the new socket using the \fBsplice\fR(2) system call, and for UDP, a pair
 of \fBrecvmmsg\fR(2) and \fBsendmmsg\fR(2) system calls deals with packet
 transfers.
 
-This bypass only applies to local connections and traffic, because it's not
-possible to bind sockets to foreign addresses.
+Because it's not possible to bind sockets to foreign addresses, this
+bypass only applies to local connections and traffic.  It also means
+that the address translation differs slightly from passt mode.
+Connections from loopback to loopback on the host will appear to come
+from the target namespace's public address within the guest, unless
+\fB\-\-host-lo-to-ns-lo\fR is specified, in which case they will
+appear to come from loopback in the namespace as well.  The latter
+behaviour used to be the default, but is usually undesirable, since it
+can unintentionally expose namespace local services to the host.
 
 .SS Binding to low numbered ports (well-known or system ports, up to 1023)
 
diff --git a/passt.h b/passt.h
index 4908ed9..72c7f72 100644
--- a/passt.h
+++ b/passt.h
@@ -225,6 +225,7 @@ struct ip6_ctx {
  * @no_dhcpv6:		Disable DHCPv6 server
  * @no_ndp:		Disable NDP handler altogether
  * @no_ra:		Disable router advertisements
+ * @host_lo_to_ns_lo:	Map host loopback addresses to ns loopback addresses
  * @freebind:		Allow binding of non-local addresses for forwarding
  * @low_wmem:		Low probed net.core.wmem_max
  * @low_rmem:		Low probed net.core.rmem_max
@@ -285,6 +286,7 @@ struct ctx {
 	int no_dhcpv6;
 	int no_ndp;
 	int no_ra;
+	int host_lo_to_ns_lo;
 	int freebind;
 
 	int low_wmem;
-- 
@@ -225,6 +225,7 @@ struct ip6_ctx {
  * @no_dhcpv6:		Disable DHCPv6 server
  * @no_ndp:		Disable NDP handler altogether
  * @no_ra:		Disable router advertisements
+ * @host_lo_to_ns_lo:	Map host loopback addresses to ns loopback addresses
  * @freebind:		Allow binding of non-local addresses for forwarding
  * @low_wmem:		Low probed net.core.wmem_max
  * @low_rmem:		Low probed net.core.rmem_max
@@ -285,6 +286,7 @@ struct ctx {
 	int no_dhcpv6;
 	int no_ndp;
 	int no_ra;
+	int host_lo_to_ns_lo;
 	int freebind;
 
 	int low_wmem;
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/7] arp: Fix a handful of small warts
  2024-10-17  5:32 ` [PATCH v4 1/7] arp: Fix a handful of small warts David Gibson
@ 2024-10-17  8:45   ` Stefano Brivio
  2024-10-18  0:43     ` David Gibson
  0 siblings, 1 reply; 12+ messages in thread
From: Stefano Brivio @ 2024-10-17  8:45 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Thu, 17 Oct 2024 16:32:58 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> This fixes a number of harmless but slightly ugly warts in the ARP
> resolution code:
>  * Use in4addr_any to represent 0.0.0.0 rather than hand constructing an
>    example.
>  * When comparing am->sip against 0.0.0.0 use sizeof(am->sip) instead of
>    sizeof(am->tip) (same value, but makes more logical sense)
>  * Described the guest's assigned address as such, rather than as "our
>    address" - that's not usually what we mean by "our address" these days
>  * Remove "we might have the same IP address" comment which I can't make
>    sense of in context (possibly it's relating to the statement below,
>    which already has its own comment?)

Right, yes, I added the part below later, in 64c0f20ab3f8 ("arp: Don't
resolve own, configured IPv4 address").

The first check comes from 2166c5872e9b ("arp: Don't answer announcements
from guest or namespace") instead. The commit title makes it obvious, the
comment not so much.

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arp.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/arp.c b/arp.c
> index 53334da..d34b20e 100644
> --- a/arp.c
> +++ b/arp.c
> @@ -59,14 +59,12 @@ int arp(const struct ctx *c, const struct pool *p)
>  	    ah->ar_op  != htons(ARPOP_REQUEST))
>  		return 1;
>  
> -	/* Discard announcements (but not 0.0.0.0 "probes"): we might have the
> -	 * same IP address, hide that.
> -	 */
> -	if (memcmp(am->sip, (unsigned char[4]){ 0 }, sizeof(am->tip)) &&
> +	/* Discard announcements, but not 0.0.0.0 "probes" */
> +	if (memcmp(am->sip, &in4addr_any, sizeof(am->sip)) &&
>  	    !memcmp(am->sip, am->tip, sizeof(am->sip)))
>  		return 1;
>  
> -	/* Don't resolve our own address, either. */
> +	/* Don't resolve guest's assigned address, either. */

Nit: "the guest's assigned address".

>  	if (!memcmp(am->tip, &c->ip4.addr, sizeof(am->tip)))
>  		return 1;

-- 
Stefano


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/7] test: Wait for DAD on DHCPv6 addresses
  2024-10-17  5:33 ` [PATCH v4 3/7] test: Wait for DAD on DHCPv6 addresses David Gibson
@ 2024-10-17  8:45   ` Stefano Brivio
  2024-10-18  1:13     ` David Gibson
  0 siblings, 1 reply; 12+ messages in thread
From: Stefano Brivio @ 2024-10-17  8:45 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Thu, 17 Oct 2024 16:33:00 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> After running dhclient -6 we expect the DHCPv6 assigned address to be
> immediately usable.  That's true with the Fedora dhclient-script (and the
> upstream ISC DHCP one), however it's not true with the Debian
> dhclient-script.  The Debian script can complete with the address still
> in "tentative" state, and the address won't be usable until Duplicate
> Address Detection (DAD) completes.  That's arguably a bug in Debian (see
> link below), but for the time being we need to work around it anyway.
> 
> We usually get away with this, because by the time we do anything where the
> address matters, DAD has completed.  However, it's not robust, so we should
> explicitly wait for DAD to complete when we get an DHCPv6 address.
> 
> Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1085231
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  test/passt/dhcp       | 2 ++
>  test/pasta/dhcp       | 2 ++
>  test/two_guests/basic | 3 +++

What about the equivalent cases in passt_in_ns/dhcp and perf/passt_tcp?

-- 
Stefano


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/7] arp: Fix a handful of small warts
  2024-10-17  8:45   ` Stefano Brivio
@ 2024-10-18  0:43     ` David Gibson
  0 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2024-10-18  0:43 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 2410 bytes --]

On Thu, Oct 17, 2024 at 10:45:24AM +0200, Stefano Brivio wrote:
> On Thu, 17 Oct 2024 16:32:58 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > This fixes a number of harmless but slightly ugly warts in the ARP
> > resolution code:
> >  * Use in4addr_any to represent 0.0.0.0 rather than hand constructing an
> >    example.
> >  * When comparing am->sip against 0.0.0.0 use sizeof(am->sip) instead of
> >    sizeof(am->tip) (same value, but makes more logical sense)
> >  * Described the guest's assigned address as such, rather than as "our
> >    address" - that's not usually what we mean by "our address" these days
> >  * Remove "we might have the same IP address" comment which I can't make
> >    sense of in context (possibly it's relating to the statement below,
> >    which already has its own comment?)
> 
> Right, yes, I added the part below later, in 64c0f20ab3f8 ("arp: Don't
> resolve own, configured IPv4 address").
> 
> The first check comes from 2166c5872e9b ("arp: Don't answer announcements
> from guest or namespace") instead. The commit title makes it obvious, the
> comment not so much.
> 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  arp.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arp.c b/arp.c
> > index 53334da..d34b20e 100644
> > --- a/arp.c
> > +++ b/arp.c
> > @@ -59,14 +59,12 @@ int arp(const struct ctx *c, const struct pool *p)
> >  	    ah->ar_op  != htons(ARPOP_REQUEST))
> >  		return 1;
> >  
> > -	/* Discard announcements (but not 0.0.0.0 "probes"): we might have the
> > -	 * same IP address, hide that.
> > -	 */
> > -	if (memcmp(am->sip, (unsigned char[4]){ 0 }, sizeof(am->tip)) &&
> > +	/* Discard announcements, but not 0.0.0.0 "probes" */
> > +	if (memcmp(am->sip, &in4addr_any, sizeof(am->sip)) &&
> >  	    !memcmp(am->sip, am->tip, sizeof(am->sip)))
> >  		return 1;
> >  
> > -	/* Don't resolve our own address, either. */
> > +	/* Don't resolve guest's assigned address, either. */
> 
> Nit: "the guest's assigned address".

Fixed.

> >  	if (!memcmp(am->tip, &c->ip4.addr, sizeof(am->tip)))
> >  		return 1;
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/7] test: Wait for DAD on DHCPv6 addresses
  2024-10-17  8:45   ` Stefano Brivio
@ 2024-10-18  1:13     ` David Gibson
  0 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2024-10-18  1:13 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 1517 bytes --]

On Thu, Oct 17, 2024 at 10:45:29AM +0200, Stefano Brivio wrote:
> On Thu, 17 Oct 2024 16:33:00 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > After running dhclient -6 we expect the DHCPv6 assigned address to be
> > immediately usable.  That's true with the Fedora dhclient-script (and the
> > upstream ISC DHCP one), however it's not true with the Debian
> > dhclient-script.  The Debian script can complete with the address still
> > in "tentative" state, and the address won't be usable until Duplicate
> > Address Detection (DAD) completes.  That's arguably a bug in Debian (see
> > link below), but for the time being we need to work around it anyway.
> > 
> > We usually get away with this, because by the time we do anything where the
> > address matters, DAD has completed.  However, it's not robust, so we should
> > explicitly wait for DAD to complete when we get an DHCPv6 address.
> > 
> > Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1085231
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  test/passt/dhcp       | 2 ++
> >  test/pasta/dhcp       | 2 ++
> >  test/two_guests/basic | 3 +++
> 
> What about the equivalent cases in passt_in_ns/dhcp and perf/passt_tcp?

Oops, somehow I missed those.  Will fix and respin.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-10-18  1:36 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-17  5:32 [PATCH v4 0/7] Don't expose container loopback services to the host David Gibson
2024-10-17  5:32 ` [PATCH v4 1/7] arp: Fix a handful of small warts David Gibson
2024-10-17  8:45   ` Stefano Brivio
2024-10-18  0:43     ` David Gibson
2024-10-17  5:32 ` [PATCH v4 2/7] test: Explicitly wait for DAD to complete on SLAAC addresses David Gibson
2024-10-17  5:33 ` [PATCH v4 3/7] test: Wait for DAD on DHCPv6 addresses David Gibson
2024-10-17  8:45   ` Stefano Brivio
2024-10-18  1:13     ` David Gibson
2024-10-17  5:33 ` [PATCH v4 4/7] passt.1: Mark --stderr as deprecated more prominently David Gibson
2024-10-17  5:33 ` [PATCH v4 5/7] passt.1: Clarify and update "Handling of local addresses" section David Gibson
2024-10-17  5:33 ` [PATCH v4 6/7] test: Clarify test for spliced inbound transfers David Gibson
2024-10-17  5:33 ` [PATCH v4 7/7] fwd: Direct inbound spliced forwards to the guest's external address David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).