public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 3/3] tcp, udp: Bind outbound listening sockets by interface instead of address
Date: Tue, 21 Oct 2025 23:51:12 +0200	[thread overview]
Message-ID: <20251021235112.17369db5@elisabeth> (raw)
In-Reply-To: <20251017003447.414103-4-david@gibson.dropbear.id.au>

On Fri, 17 Oct 2025 11:34:47 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Currently, outbound forwards (-T, -U) are handled by sockets bound to the
> loopback address.  Typically we create two sockets, one for 127.0.0.1 and
> one for ::1.
> 
> This has some disadvantages:
>  * The guest can't connect to these services using its global IP address,
>    it must explicitly use 127.0.0.1 or ::1 (bug 100)
>  * The guest can't even connect via 127.0.0.0/8 addresses other than
>    127.0.0.1
>  * We can't use dual-stack sockets, we have to have separate sockets for
>    IPv4 and IPv6.
> 
> The restriction exist for a reason though.  If the guest has any interfaces
> other than pasta (e.g. a VPN tunnel) external hosts could reach the host
> via the forwards.  Especially combined with -T auto / -U auto this would
> make it very easy to make a mistake with nasty security implications.
> 
> We can achieve both goals, however, if we don't bind the outbound listening
> sockets to a particular address, but _do_ use SO_BINDTODEVICE to restrict
> them to the "lo" interface.

Nice trick, I didn't think of it. I wonder if doing the same host-side
might help solving a part of https://bugs.passt.top/show_bug.cgi?id=113
as well.

> Link: https://bugs.passt.top/show_bug.cgi?id=100
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  pif.c |  6 ------
>  tcp.c | 18 ++----------------
>  udp.c | 27 ++++++++++-----------------
>  3 files changed, 12 insertions(+), 39 deletions(-)
> 
> diff --git a/pif.c b/pif.c
> index 592fafaa..84e3ceae 100644
> --- a/pif.c
> +++ b/pif.c
> @@ -87,12 +87,6 @@ int pif_sock_l4(const struct ctx *c, enum epoll_type type, uint8_t pif,
>  
>  	ASSERT(pif_is_socket(pif));
>  
> -	if (pif == PIF_SPLICE) {
> -		/* Sanity checks */
> -		ASSERT(!ifname);
> -		ASSERT(addr && inany_is_loopback(addr));
> -	}
> -
>  	if (!addr)
>  		return sock_l4_sa(c, type, &sa, sizeof(sa.sa6),
>  				  ifname, false, data);
> diff --git a/tcp.c b/tcp.c
> index 15c012d7..982c9190 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -2592,20 +2592,6 @@ int tcp_sock_init(const struct ctx *c, uint8_t pif,
>  
>  	return r4 < 0 ? r4 : r6;
>  }
> -/**
> - * tcp_ns_sock_init() - Init socket to listen for spliced outbound connections
> - * @c:		Execution context
> - * @port:	Port, host order
> - */
> -static void tcp_ns_sock_init(const struct ctx *c, in_port_t port)
> -{
> -	ASSERT(!c->no_tcp);
> -
> -	if (c->ifi4)
> -		tcp_sock_init_one(c, PIF_SPLICE, &inany_loopback4, NULL, port);
> -	if (c->ifi6)
> -		tcp_sock_init_one(c, PIF_SPLICE, &inany_loopback6, NULL, port);
> -}
>  
>  /**
>   * tcp_ns_socks_init() - Bind sockets in namespace for outbound connections
> @@ -2625,7 +2611,7 @@ static int tcp_ns_socks_init(void *arg)
>  		if (!bitmap_isset(c->tcp.fwd_out.map, port))
>  			continue;
>  
> -		tcp_ns_sock_init(c, port);
> +		tcp_sock_init(c, PIF_SPLICE, NULL, "lo", port);

I thought the "lo" string would be part of the Linux UAPI, but that's
not the case, and loopback_net_init() just calls:

	alloc_netdev(0, "lo", NET_NAME_PREDICTABLE, loopback_setup);

so I think it's relatively unproblematic to hardcode that as well, and it
looks like we can't create a second loopback interface, even though:

$ pasta -- sh -c 'ip link set dev lo down; ip link change dev lo name lol; ip link show lol'
1: lol: <LOOPBACK> mtu 65536 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

I don't have any quick solution and I don't think we care enough as to
write a function in netlink.c fetching links with loopback type, so I'm
totally fine with this as it is.

By the way, if we fail to use SO_BINDTODEVICE, we already defensively
close the socket. The only possible flaw that occurs to me is that
somebody could rename 'lo' and then create a link called 'lo' of a
different type. But that needs CAP_NET_ADMIN in the container anyway.

>  	}
>  
>  	return 0;
> @@ -2805,7 +2791,7 @@ static void tcp_port_rebind(struct ctx *c, bool outbound)
>  		if ((c->ifi4 && socks[port][V4] == -1) ||
>  		    (c->ifi6 && socks[port][V6] == -1)) {
>  			if (outbound)
> -				tcp_ns_sock_init(c, port);
> +				tcp_sock_init(c, PIF_SPLICE, NULL, "lo", port);

Should we have/keep a fallback for pre-5.7 / pre-c427bfec18f2 kernels?

>  			else
>  				tcp_sock_init(c, PIF_HOST, NULL, NULL, port);
>  		}
> diff --git a/udp.c b/udp.c
> index 49dd0144..e38114eb 100644
> --- a/udp.c
> +++ b/udp.c
> @@ -1127,26 +1127,16 @@ int udp_sock_init(const struct ctx *c, uint8_t pif,
>  	}
>  
>  	if ((!addr || inany_v4(addr)) && c->ifi4) {
> -		const union inany_addr *a = addr ?
> -			addr : &inany_any4;
> -
> -		if (pif == PIF_SPLICE)
> -			a = &inany_loopback4;
> -
> -		r4 = pif_sock_l4(c, EPOLL_TYPE_UDP_LISTEN, pif, a, ifname,
> +		r4 = pif_sock_l4(c, EPOLL_TYPE_UDP_LISTEN, pif,
> +				 addr ? addr : &inany_any4, ifname,
>  				 port, uref.u32);
>  
>  		socks[V4][port] = r4 < 0 ? -1 : r4;
>  	}
>  
>  	if ((!addr || !inany_v4(addr)) && c->ifi6) {
> -		const union inany_addr *a = addr ?
> -			addr : &inany_any6;
> -
> -		if (pif == PIF_SPLICE)
> -			a = &inany_loopback6;
> -
> -		r6 = pif_sock_l4(c, EPOLL_TYPE_UDP_LISTEN, pif, a, ifname,
> +		r6 = pif_sock_l4(c, EPOLL_TYPE_UDP_LISTEN, pif,
> +				 addr ? addr : &inany_any6, ifname,
>  				 port, uref.u32);
>  
>  		socks[V6][port] = r6 < 0 ? -1 : r6;
> @@ -1214,9 +1204,12 @@ static void udp_port_rebind(struct ctx *c, bool outbound)
>  			continue;
>  
>  		if ((c->ifi4 && socks[V4][port] == -1) ||
> -		    (c->ifi6 && socks[V6][port] == -1))
> -			udp_sock_init(c, outbound ? PIF_SPLICE : PIF_HOST,
> -				      NULL, NULL, port);
> +		    (c->ifi6 && socks[V6][port] == -1)) {
> +			if (outbound)
> +				udp_sock_init(c, PIF_SPLICE, NULL, "lo", port);
> +			else
> +				udp_sock_init(c, PIF_HOST, NULL, NULL, port);

Same here, should we add a fallback case? The rest of the series looks
good to me.

> +		}
>  	}
>  }
>  

-- 
Stefano


  reply	other threads:[~2025-10-21 21:51 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-17  0:34 [PATCH 0/3] RFC: Reduce differences between inbound and outbound socket binding David Gibson
2025-10-17  0:34 ` [PATCH 1/3] tcp: Merge tcp_ns_sock_init[46]() into tcp_sock_init_one() David Gibson
2025-10-20  6:08   ` Stefano Brivio
2025-10-20  9:24     ` David Gibson
2025-10-20  6:09   ` Stefano Brivio
2025-10-20  9:25     ` David Gibson
2025-10-17  0:34 ` [PATCH 2/3] udp: Unify some more inbound/outbound parts of udp_sock_init() David Gibson
2025-10-21 21:51   ` Stefano Brivio
2025-10-22  0:08     ` David Gibson
2025-10-17  0:34 ` [PATCH 3/3] tcp, udp: Bind outbound listening sockets by interface instead of address David Gibson
2025-10-21 21:51   ` Stefano Brivio [this message]
2025-10-22  0:34     ` David Gibson
2025-10-22  8:59       ` Stefano Brivio
2025-10-23  1:18         ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251021235112.17369db5@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).