public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>, passt-dev@passt.top
Cc: David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH v5 14/15] tcp: Always populate oaddr field for socket initiated flows
Date: Tue,  2 Dec 2025 15:02:14 +1100
Message-ID: <20251202040215.2351792-15-david@gibson.dropbear.id.au>
In-Reply-To: <20251202040215.2351792-1-david@gibson.dropbear.id.au>

When we receive a TCP connection, we get the peer address from the accept()
call.  In the case of a listening socket with an unspecified address (:: or
0.0.0.0) the local address of the accept()ed socket could vary.  We don't
get that from the accept() - we must explicitly call getsockname() to get
it.
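
To illustrate the point above, here is a minimal standalone sketch (not
passt code; accept_with_local() and demo_local_addr() are hypothetical
names, and it is IPv4-only for brevity).  It listens on 0.0.0.0, connects
over loopback, and shows that accept() only yields the peer address, while
the local address has to come from a separate getsockname() call:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: accept() fills @peer only, so the local address of
 * the new socket is fetched with an extra getsockname() call.
 * Returns the accepted socket, or -1 on failure.
 */
static int accept_with_local(int lfd, struct sockaddr_in *peer,
			     struct sockaddr_in *local)
{
	socklen_t sl = sizeof(*peer);
	int s = accept(lfd, (struct sockaddr *)peer, &sl);

	if (s < 0)
		return -1;

	sl = sizeof(*local);
	if (getsockname(s, (struct sockaddr *)local, &sl)) {
		close(s);
		return -1;
	}
	return s;
}

/* Hypothetical demo: listen on the wildcard address, connect via loopback,
 * and write the accepted socket's local address to @buf as a string.
 * Returns 0 on success, -1 on failure.
 */
static int demo_local_addr(char *buf, socklen_t len)
{
	struct sockaddr_in a, peer, local;
	socklen_t sl = sizeof(a);
	int lfd, cfd = -1, s = -1, rc = -1;

	memset(&a, 0, sizeof(a));
	a.sin_family = AF_INET;
	a.sin_addr.s_addr = htonl(INADDR_ANY);	/* wildcard, port 0 */

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return -1;
	if (bind(lfd, (struct sockaddr *)&a, sizeof(a)) || listen(lfd, 1) ||
	    getsockname(lfd, (struct sockaddr *)&a, &sl))
		goto out;

	/* Connect to the kernel-chosen port via the loopback address */
	a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0 || connect(cfd, (struct sockaddr *)&a, sizeof(a)))
		goto out;

	s = accept_with_local(lfd, &peer, &local);
	if (s >= 0 && inet_ntop(AF_INET, &local.sin_addr, buf, len))
		rc = 0;
out:
	if (s >= 0)
		close(s);
	if (cfd >= 0)
		close(cfd);
	close(lfd);
	return rc;
}
```

Although the listening socket is bound to 0.0.0.0, the address reported
for the accepted socket is the one the client actually connected to
(the loopback address here), which is the information this patch wants
in oaddr.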

Currently we avoid the latency of that extra syscall, and therefore don't
populate the 'oaddr' field on the initiating side of a flow created by an
incoming TCP socket connection.  This more or less works, because we rarely
need that local address, but it does cause some oddities:

 * For migration we need the local address to recreate the socket on the
   destination, so we *do* call getsockname() in vhost-user mode
 * It limits our options in terms of forwarding connections flexibly based
   on the address on which they're received
 * It differs from UDP, where we explicitly use the IP_PKTINFO cmsg to
   populate oaddr
 * It means (some) flow debug messages will contain wildcards instead of
   real local addresses
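
For comparison with the UDP case mentioned in the list above, this is a
rough standalone sketch of how a datagram's destination address can be
read from the IP_PKTINFO control message on Linux.  It is not the passt
implementation: recv_with_dst() and demo_udp_dst() are hypothetical names,
and only IPv4 is handled.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: receive one datagram on @s and store the address it
 * was sent to (from the IP_PKTINFO cmsg) in @dst.  Returns 0 on success.
 */
static int recv_with_dst(int s, struct in_addr *dst)
{
	union {	/* union ensures cmsghdr alignment, as in cmsg(3) */
		char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
		struct cmsghdr align;
	} u;
	char data[64];
	struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
	struct msghdr mh = { .msg_iov = &iov, .msg_iovlen = 1,
			     .msg_control = u.buf,
			     .msg_controllen = sizeof(u.buf) };
	struct cmsghdr *cm;

	if (recvmsg(s, &mh, 0) < 0)
		return -1;

	for (cm = CMSG_FIRSTHDR(&mh); cm; cm = CMSG_NXTHDR(&mh, cm)) {
		if (cm->cmsg_level == IPPROTO_IP &&
		    cm->cmsg_type == IP_PKTINFO) {
			struct in_pktinfo pi;

			memcpy(&pi, CMSG_DATA(cm), sizeof(pi));
			*dst = pi.ipi_addr;	/* header destination */
			return 0;
		}
	}
	return -1;
}

/* Hypothetical demo: bind a UDP socket to the wildcard address, send it a
 * datagram via loopback, and write the reported destination to @buf.
 */
static int demo_udp_dst(char *buf, socklen_t len)
{
	struct sockaddr_in a;
	socklen_t sl = sizeof(a);
	struct in_addr dst;
	int on = 1, rc = -1;
	int rx = socket(AF_INET, SOCK_DGRAM, 0);
	int tx = socket(AF_INET, SOCK_DGRAM, 0);

	memset(&a, 0, sizeof(a));
	a.sin_family = AF_INET;
	a.sin_addr.s_addr = htonl(INADDR_ANY);	/* wildcard, port 0 */

	if (rx < 0 || tx < 0)
		goto out;
	if (setsockopt(rx, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on)) ||
	    bind(rx, (struct sockaddr *)&a, sizeof(a)) ||
	    getsockname(rx, (struct sockaddr *)&a, &sl))
		goto out;

	a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (sendto(tx, "x", 1, 0, (struct sockaddr *)&a, sizeof(a)) != 1)
		goto out;

	if (!recv_with_dst(rx, &dst) && inet_ntop(AF_INET, &dst, buf, len))
		rc = 0;
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return rc;
}
```

Unlike the TCP case, the address arrives with the datagram itself, so no
extra syscall is needed per packet.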

In theory we can elide this call when accept()ing from a socket bound to
a specific address instead of a wildcard.  However, doing that will need
revisions to the data structures we use to keep track of listening sockets.

The lack of this information is making it hard to implement some fixes we
want.  So, pay the price of the extra syscall to get this information, with
the hope that we can later optimise it away for some cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tcp.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tcp.c b/tcp.c
index aacc5b20..535d6cfa 100644
--- a/tcp.c
+++ b/tcp.c
@@ -2354,11 +2354,9 @@ void tcp_listen_handler(const struct ctx *c, union epoll_ref ref,
 	ini = flow_initiate_sa(flow, ref.tcp_listen.pif, &sa,
 			       NULL, ref.tcp_listen.port);
 
-	if (c->mode == MODE_VU) { /* Rebind to same address after migration */
-		if (getsockname(s, &sa.sa, &sl) ||
-		    inany_from_sockaddr(&ini->oaddr, &ini->oport, &sa) < 0)
-			err_perror("Can't get local address for socket %i", s);
-	}
+	if (getsockname(s, &sa.sa, &sl) ||
+	    inany_from_sockaddr(&ini->oaddr, &ini->oport, &sa) < 0)
+		err_perror("Can't get local address for socket %i", s);
 
 	if (!inany_is_unicast(&ini->eaddr) || ini->eport == 0) {
 		char sastr[SOCKADDR_STRLEN];
-- 
2.52.0

