public inbox for passt-dev@passt.top
* [PATCH 0/3] Fix bug 113
@ 2025-11-20  4:34 David Gibson
  2025-11-20  4:34 ` [PATCH 1/3] util: Rename sock_l4_dualstack() to sock_l4_dualstack_any() David Gibson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: David Gibson @ 2025-11-20  4:34 UTC
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

My previous changes to socket binding didn't quite fix bug 113, but
they get us close.  Here are the last few pieces we need to fix it.

This is based on my previous series improving consistency of listening
socket binding.

David Gibson (3):
  util: Rename sock_l4_dualstack() to sock_l4_dualstack_any()
  tcp: Always populate oaddr field for socket initiated flows
  fwd: Preserve non-standard loopback address when splice forwarding

 fwd.c  | 4 +++-
 pif.c  | 2 +-
 tcp.c  | 8 +++-----
 util.c | 6 +++---
 util.h | 4 ++--
 5 files changed, 12 insertions(+), 12 deletions(-)

-- 
2.51.1


* [PATCH 1/3] util: Rename sock_l4_dualstack() to sock_l4_dualstack_any()
From: David Gibson @ 2025-11-20  4:34 UTC
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Stefano correctly noted that a socket being dual-stack doesn't necessarily
imply that it is bound to a wildcard address.  While that's the only case
in which we use dual-stack sockets, there may be others.  Therefore, rename
this function to make it clearer that it always uses a wildcard bind.
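
For readers unfamiliar with the term, here is a minimal sketch of what a
dual-stack wildcard bind looks like with plain sockets.  It is not the passt
implementation: dualstack_any_listen() is a hypothetical helper, and it
assumes a Linux-like host with IPv6 enabled.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not passt code: returns a listening TCP socket bound
 * to the IPv6 wildcard (::) with IPV6_V6ONLY cleared, or -1 on error.
 */
static int dualstack_any_listen(uint16_t port)
{
	struct sockaddr_in6 sa;
	int off = 0, fd;

	fd = socket(AF_INET6, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sin6_family = AF_INET6;
	sa.sin6_addr = in6addr_any;	/* the wildcard part of the name */
	sa.sin6_port = htons(port);

	/* Clearing IPV6_V6ONLY is what makes the socket dual-stack */
	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off)) ||
	    bind(fd, (struct sockaddr *)&sa, sizeof(sa)) ||
	    listen(fd, 8)) {
		close(fd);
		return -1;
	}

	return fd;
}
```

The two properties are set independently here (the socket option and the
bound address), which is the distinction the new name is meant to capture.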

Suggested-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 pif.c  | 2 +-
 util.c | 6 +++---
 util.h | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/pif.c b/pif.c
index db447b4f..3d7a90e5 100644
--- a/pif.c
+++ b/pif.c
@@ -82,7 +82,7 @@ int pif_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif,
 	ASSERT(pif_is_socket(pif));
 
 	if (!addr) {
-		ref.fd = sock_l4_dualstack(c, type, port, ifname);
+		ref.fd = sock_l4_dualstack_any(c, type, port, ifname);
 	} else {
 		union sockaddr_inany sa;
 
diff --git a/util.c b/util.c
index 82491326..b460dda2 100644
--- a/util.c
+++ b/util.c
@@ -197,7 +197,7 @@ int sock_l4(const struct ctx *c, enum epoll_type type,
 }
 
 /**
- * sock_l4_dualstack() - Create a dual stack socket bound with wildcard address
+ * sock_l4_dualstack_any() - Create dualstack socket bound to :: and 0.0.0.0
  * @c:		Execution context
  * @type:	epoll type
  * @port	Port to bind to (:: and 0.0.0.0)
@@ -207,8 +207,8 @@ int sock_l4(const struct ctx *c, enum epoll_type type,
  *
  * A dual stack socket is effectively bound to both :: and 0.0.0.0.
  */
-int sock_l4_dualstack(const struct ctx *c, enum epoll_type type,
-		      in_port_t port, const char *ifname)
+int sock_l4_dualstack_any(const struct ctx *c, enum epoll_type type,
+			  in_port_t port, const char *ifname)
 {
 	union sockaddr_inany sa = {
 		.sa6.sin6_family = AF_INET6,
diff --git a/util.h b/util.h
index d9fbfe5d..963c0ec8 100644
--- a/util.h
+++ b/util.h
@@ -210,8 +210,8 @@ union sockaddr_inany;
 
 int sock_l4(const struct ctx *c, enum epoll_type type,
 	    const union sockaddr_inany *sa, const char *ifname);
-int sock_l4_dualstack(const struct ctx *c, enum epoll_type type,
-		      in_port_t port, const char *ifname);
+int sock_l4_dualstack_any(const struct ctx *c, enum epoll_type type,
+			  in_port_t port, const char *ifname);
 int sock_unix(char *sock_path);
 void sock_probe_mem(struct ctx *c);
 long timespec_diff_ms(const struct timespec *a, const struct timespec *b);
-- 
2.51.1


* [PATCH 2/3] tcp: Always populate oaddr field for socket initiated flows
From: David Gibson @ 2025-11-20  4:34 UTC
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

When we receive a TCP connection, we get the peer address from the accept()
call.  In the case of a listening socket with an unspecified address (:: or
0.0.0.0) the local address of the accept()ed socket could vary.  We don't
get that from the accept() - we must explicitly call getsockname() to get
it.
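
As an illustration of the point above, the following sketch accepts one
connection and reports both endpoints.  accept_with_local() is a
hypothetical helper written for this note, not a function from passt, and it
uses struct sockaddr_storage rather than passt's own address types.

```c
#include <netinet/in.h>		/* struct sockaddr_in, for callers */
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: accept one connection from @lfd, storing the peer
 * address (from accept()) in @peer and the local address of the new socket
 * (from getsockname()) in @local.  Returns the new socket, or -1 on error.
 */
static int accept_with_local(int lfd, struct sockaddr_storage *peer,
			     struct sockaddr_storage *local)
{
	socklen_t sl = sizeof(*peer);
	int s;

	s = accept(lfd, (struct sockaddr *)peer, &sl);
	if (s < 0)
		return -1;

	/* accept() only reports the peer; the local side needs its own call */
	sl = sizeof(*local);
	if (getsockname(s, (struct sockaddr *)local, &sl)) {
		close(s);
		return -1;
	}

	return s;
}
```

With a listener bound to 0.0.0.0, a client connecting to 127.0.0.2 should
yield 127.0.0.2 in @local on Linux, while getsockname() on the listening
socket itself still reports the wildcard.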

Currently we avoid the latency of that extra syscall, and therefore don't
populate the 'oaddr' field of the initiating side of a flow created by an
incoming TCP socket connection.  This more or less works, because we rarely
need that local address, but it does cause some oddities:

 * For migration we need the local address to recreate the socket on the
   destination, so we *do* call getsockname() in vhost-user mode.
 * It limits our options in terms of forwarding connections flexibly based
   on the address on which they're received.
 * It differs from UDP, where we explicitly use the IP_PKTINFO cmsg to
   populate oaddr.
 * It means (some) flow debug messages will contain wildcards instead of
   real local addresses.

In theory we can elide this call when accept()ing from a socket bound to
a specific address instead of a wildcard.  However, doing that will require
revisions to the data structures we use to keep track of listening sockets.

The lack of this information is making it hard to implement some fixes we
want.  So, pay the price of the extra syscall to get this information, with
the hope that we can later optimise it away for some cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tcp.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tcp.c b/tcp.c
index bd25952f..e734a957 100644
--- a/tcp.c
+++ b/tcp.c
@@ -2354,11 +2354,9 @@ void tcp_listen_handler(const struct ctx *c, union epoll_ref ref,
 	ini = flow_initiate_sa(flow, ref.tcp_listen.pif, &sa,
 			       NULL, ref.tcp_listen.port);
 
-	if (c->mode == MODE_VU) { /* Rebind to same address after migration */
-		if (getsockname(s, &sa.sa, &sl) ||
-		    inany_from_sockaddr(&ini->oaddr, &ini->oport, &sa) < 0)
-			err_perror("Can't get local address for socket %i", s);
-	}
+	if (getsockname(s, &sa.sa, &sl) ||
+	    inany_from_sockaddr(&ini->oaddr, &ini->oport, &sa) < 0)
+		err_perror("Can't get local address for socket %i", s);
 
 	if (!inany_is_unicast(&ini->eaddr) || ini->eport == 0) {
 		char sastr[SOCKADDR_STRLEN];
-- 
2.51.1


* [PATCH 3/3] fwd: Preserve non-standard loopback address when splice forwarding
From: David Gibson @ 2025-11-20  4:34 UTC
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

When forwarding "spliced" connections outwards (-T or -U) we listen on the
guest's loopback and always forward to 127.0.0.1 (or ::1) on the host.
However, it's also possible for clients on the guest to attempt connecting
to other addresses in 127.0.0.0/8 (systemd-resolved uses 127.0.0.53 in
practice).  If the host side server is only listening on that specific
non-standard loopback address, the forward won't work.  Fix this by
preserving the specific (loopback) address when forwarding such
connections.
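
The selection logic described above can be sketched in isolation as follows.
This is only an approximation under stated assumptions: it uses struct
in6_addr with IPv4 represented as v4-mapped addresses as a stand-in for
passt's own address type, and splice_target() and is_unspecified() are
hypothetical names, not functions from the tree.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in: true for :: and for v4-mapped 0.0.0.0 */
static bool is_unspecified(const struct in6_addr *a)
{
	if (IN6_IS_ADDR_UNSPECIFIED(a))
		return true;

	return IN6_IS_ADDR_V4MAPPED(a) &&
	       !memcmp(&a->s6_addr[12], "\0\0\0\0", 4);
}

/* Hypothetical sketch: pick the host-side target for a spliced flow from
 * the address the guest-side connection was received on (@oaddr).
 */
static struct in6_addr splice_target(const struct in6_addr *oaddr)
{
	struct in6_addr tgt;

	if (!is_unspecified(oaddr))
		return *oaddr;		/* e.g. keep ::ffff:127.0.0.53 */

	/* No specific address known: fall back to the standard loopback */
	if (IN6_IS_ADDR_V4MAPPED(oaddr))
		inet_pton(AF_INET6, "::ffff:127.0.0.1", &tgt);
	else
		tgt = in6addr_loopback;

	return tgt;
}
```

For example, an @oaddr of ::ffff:127.0.0.53 is returned unchanged, whereas a
wildcard @oaddr falls back to 127.0.0.1 or ::1 depending on address family.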

Link: https://bugs.passt.top/show_bug.cgi?id=113

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 fwd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fwd.c b/fwd.c
index c417e0f5..44a0e109 100644
--- a/fwd.c
+++ b/fwd.c
@@ -660,7 +660,9 @@ uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
 		return PIF_NONE;
 	}
 
-	if (inany_v4(&ini->eaddr))
+	if (!inany_is_unspecified(&ini->oaddr))
+		tgt->eaddr = ini->oaddr;
+	else if (inany_v4(&ini->oaddr))
 		tgt->eaddr = inany_loopback4;
 	else
 		tgt->eaddr = inany_loopback6;
-- 
2.51.1


Code repositories for project(s) associated with this public inbox

	https://passt.top/passt
