From: David Gibson
To: Stefano Brivio, passt-dev@passt.top
Subject: [PATCH 02/16] tcp, udp: Don't precompute port remappings in epoll references
Date: Mon, 29 Jan 2024 15:35:43 +1100
Message-ID: <20240129043557.823451-3-david@gibson.dropbear.id.au>
In-Reply-To: <20240129043557.823451-1-david@gibson.dropbear.id.au>
References: <20240129043557.823451-1-david@gibson.dropbear.id.au>
List-Id: Development discussion and patches for passt

The epoll references for
both TCP listening sockets and UDP sockets include a port number.  This
gives the destination port that traffic to that socket will be sent to on
the other side.  That will usually be the same as the socket's bound port,
but might not if the -t, -u, -T or -U options are given with different
original and forwarded port numbers.

As we move towards a more flexible forwarding model for passt, it's going
to become possible for that destination port to vary depending on more
things (for example the source or destination address).  So, it will no
longer make sense to have a fixed value for a listening socket.

Change to simpler semantics where this field in the reference gives the
bound port of the socket.  We apply the translations to the correct
destination port later on, when we're actually forwarding.

Signed-off-by: David Gibson
---
 tcp.c        |  8 ++++----
 tcp.h        |  2 +-
 tcp_splice.c |  4 ++++
 udp.c        | 14 ++++++++------
 4 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/tcp.c b/tcp.c
index fdf56713..cd833728 100644
--- a/tcp.c
+++ b/tcp.c
@@ -2676,7 +2676,7 @@ static void tcp_tap_conn_from_sock(struct ctx *c,
 	conn_event(c, conn, SOCK_ACCEPTED);

 	inany_from_sockaddr(&conn->faddr, &conn->fport, sa);
-	conn->eport = ref.port;
+	conn->eport = ref.port + c->tcp.fwd_in.delta[ref.port];

 	tcp_snat_inbound(c, &conn->faddr);

@@ -2861,7 +2861,7 @@ static int tcp_sock_init_af(const struct ctx *c, sa_family_t af, in_port_t port,
 			    const void *addr, const char *ifname)
 {
 	union tcp_listen_epoll_ref tref = {
-		.port = port + c->tcp.fwd_in.delta[port],
+		.port = port,
 		.pif = PIF_HOST,
 	};
 	int s;
@@ -2923,7 +2923,7 @@ int tcp_sock_init(const struct ctx *c, sa_family_t af, const void *addr,
 static void tcp_ns_sock_init4(const struct ctx *c, in_port_t port)
 {
 	union tcp_listen_epoll_ref tref = {
-		.port = port + c->tcp.fwd_out.delta[port],
+		.port = port,
 		.pif = PIF_SPLICE,
 	};
 	struct in_addr loopback = IN4ADDR_LOOPBACK_INIT;
@@ -2949,7 +2949,7 @@ static void tcp_ns_sock_init4(const struct ctx *c, in_port_t port)
 static void tcp_ns_sock_init6(const struct ctx *c, in_port_t port)
 {
 	union tcp_listen_epoll_ref tref = {
-		.port = port + c->tcp.fwd_out.delta[port],
+		.port = port,
 		.pif = PIF_SPLICE,
 	};
 	int s;
diff --git a/tcp.h b/tcp.h
index 875006ed..5e6756d4 100644
--- a/tcp.h
+++ b/tcp.h
@@ -37,7 +37,7 @@ union tcp_epoll_ref {

 /**
  * union tcp_listen_epoll_ref - epoll reference portion for TCP listening
- * @port:	Port number we're forwarding *to* (listening port plus delta)
+ * @port:	Bound port number of the socket
  * @pif:	pif in which the socket is listening
  * @u32:	Opaque u32 value of reference
  */
diff --git a/tcp_splice.c b/tcp_splice.c
index cc9745e8..b8d64eba 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -401,6 +401,8 @@ static int tcp_splice_new(const struct ctx *c, struct tcp_splice_conn *conn,
 		int *p = CONN_V6(conn) ? init_sock_pool6 : init_sock_pool4;
 		sa_family_t af = CONN_V6(conn) ? AF_INET6 : AF_INET;

+		port += c->tcp.fwd_out.delta[port];
+
 		s = tcp_conn_pool_sock(p);
 		if (s < 0)
 			s = tcp_conn_new_sock(c, af);
@@ -409,6 +411,8 @@ static int tcp_splice_new(const struct ctx *c, struct tcp_splice_conn *conn,

 		ASSERT(pif == PIF_HOST);

+		port += c->tcp.fwd_in.delta[port];
+
 		/* If pool is empty, refill it first */
 		if (p[TCP_SOCK_POOL_SIZE-1] < 0)
 			NS_CALL(tcp_sock_refill_ns, c);
diff --git a/udp.c b/udp.c
index c839e269..02cb7889 100644
--- a/udp.c
+++ b/udp.c
@@ -762,6 +762,11 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
 	if (c->no_udp || !(events & EPOLLIN))
 		return;

+	if (ref.udp.pif == PIF_SPLICE)
+		dstport += c->udp.fwd_out.f.delta[dstport];
+	else if (ref.udp.pif == PIF_HOST)
+		dstport += c->udp.fwd_in.f.delta[dstport];
+
 	if (v6) {
 		mmh_recv = udp6_l2_mh_sock;
 		udp6_localname.sin6_port = htons(dstport);
@@ -989,16 +994,13 @@ int udp_sock_init(const struct ctx *c, int ns, sa_family_t af,
 		  const void *addr, const char *ifname, in_port_t port)
 {
 	union udp_epoll_ref uref = { .splice = (c->mode == MODE_PASTA),
-				     .orig = true };
+				     .orig = true, .port = port };
 	int s, r4 = FD_REF_MAX + 1, r6 = FD_REF_MAX + 1;

-	if (ns) {
+	if (ns)
 		uref.pif = PIF_SPLICE;
-		uref.port = (in_port_t)(port + c->udp.fwd_out.f.delta[port]);
-	} else {
+	else
 		uref.pif = PIF_HOST;
-		uref.port = (in_port_t)(port + c->udp.fwd_in.f.delta[port]);
-	}

 	if ((af == AF_INET || af == AF_UNSPEC) && c->ifi4) {
 		uref.v6 = 0;
-- 
2.43.0
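
For readers following the series, the lookup that this patch defers to
forwarding time can be sketched in isolation.  This is only an
illustration, not code from passt: struct port_fwd, port_fwd_map() and
port_fwd_translate() are hypothetical names, and the only thing taken from
the patch is the "port + delta[port]" arithmetic on 16-bit port numbers.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PORTS 65536

/* Hypothetical stand-in for a per-direction forwarding table: delta[p] is
 * the offset added to bound port p to obtain the destination port.
 */
struct port_fwd {
	uint16_t delta[NUM_PORTS];
};

/* Record that traffic arriving on port 'bound' should go to port 'dest'.
 * The difference is stored modulo 2^16, so it also works when dest < bound.
 */
static void port_fwd_map(struct port_fwd *fwd, uint16_t bound, uint16_t dest)
{
	assert(fwd);
	fwd->delta[bound] = (uint16_t)(dest - bound);
}

/* Translate at forwarding time, as the patch does, instead of storing the
 * already-translated port when the socket is created.
 */
static uint16_t port_fwd_translate(const struct port_fwd *fwd, uint16_t bound)
{
	assert(fwd);
	return (uint16_t)(bound + fwd->delta[bound]);
}
```

With a zero-initialised table every port maps to itself.  After
port_fwd_map(&fwd, 8080, 80), port_fwd_translate(&fwd, 8080) gives 80,
because the stored delta wraps around modulo 2^16.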