From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76])
	by passt.top (Postfix) with ESMTPS id 4B7865A005E
	for ; Tue, 22 Nov 2022 04:44:13 +0100 (CET)
Received: by gandalf.ozlabs.org (Postfix, from userid 1007)
	id 4NGVVX10HNz4xN9; Tue, 22 Nov 2022 14:44:04 +1100 (AEDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gibson.dropbear.id.au; s=201602; t=1669088644;
	bh=sfwyZ9W6QZyKlO4qiN3rSNziRPsx1pFCwU5+zNypT94=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=mRihjpCeZKuUY49uvnK78NAfspCjT7s+G9j6JzvBiPnOhcOTDYaw+NdlMBWQhBz/h
	 qxp4fRnYmGzR6hJnZ18X21/P0vSwUfn8ArJHVubhIDHTMoG6NznyVjrRlxX0PCKrs+
	 muUwaARAI5Tcdjtf3wlSloQx7V1xFM6D38tRLljA=
From: David Gibson
To: Stefano Brivio, passt-dev@passt.top
Subject: [PATCH 06/11] udp: Split splice field in udp_epoll_ref into (mostly) independent bits
Date: Tue, 22 Nov 2022 14:43:57 +1100
Message-Id: <20221122034402.1517544-7-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221122034402.1517544-1-david@gibson.dropbear.id.au>
References: <20221122034402.1517544-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Message-ID-Hash: AAK5NS3YRCYP4VK5MW5BHUY734B2TLM3
X-Message-ID-Hash: AAK5NS3YRCYP4VK5MW5BHUY734B2TLM3
X-MailFrom: dgibson@gandalf.ozlabs.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: David Gibson
X-Mailman-Version: 3.3.3
Precedence: list
List-Id: Development discussion and patches for passt

The @splice field in union udp_epoll_ref can have a number of values for
different types of "spliced" packet flows.
Split it into several single-bit fields with more or less independent
meanings.  The new @splice field is just a boolean indicating whether the
socket is associated with a spliced flow, making it identical to the
@splice field in tcp_epoll_ref.

The new bit @orig indicates whether this is a socket which can originate
new UDP packet flows (created with -u or -U) or a socket created on the
fly to handle reply packets.  @ns indicates whether the socket lives in
the pasta namespace (set) or the init namespace (clear).

Making these bits more orthogonal to each other will simplify some future
cleanups.

Signed-off-by: David Gibson
---
 passt.h |  2 ++
 udp.c   | 53 ++++++++++++++++++++++++++---------------------------
 udp.h   | 15 +++++++--------
 3 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/passt.h b/passt.h
index 6649c0a..01d8893 100644
--- a/passt.h
+++ b/passt.h
@@ -31,6 +31,8 @@ struct tap_l4_msg {
 
 union epoll_ref;
 
+#include <stdbool.h>
+
 #include "packet.h"
 #include "icmp.h"
 #include "port_fwd.h"
diff --git a/udp.c b/udp.c
index 2c5e4e9..4c87e66 100644
--- a/udp.c
+++ b/udp.c
@@ -46,19 +46,20 @@
  *   - from init to namespace:
  *
  *     - forward direction: 127.0.0.1:5000 -> 127.0.0.1:80 in init from socket s,
- *       with epoll reference: index = 80, splice = UDP_TO_NS
+ *       with epoll reference: index = 80, splice = 1, orig = 1, ns = 0
  *       - if udp_splice_to_ns[V4][5000].target_sock:
  *         - send packet to udp_splice_to_ns[V4][5000].target_sock, with
  *           destination port 80
  *       - otherwise:
  *         - create new socket udp_splice_to_ns[V4][5000].target_sock
  *         - bind in namespace to 127.0.0.1:5000
- *         - add to epoll with reference: index = 5000, splice: UDP_BACK_TO_INIT
+ *         - add to epoll with reference: index = 5000, splice = 1, orig = 0,
+ *           ns = 1
  *         - set udp_splice_to_ns[V4][5000].orig_sock to s
  *         - update udp_splice_to_ns[V4][5000].ts with current time
  *
  *     - reverse direction: 127.0.0.1:80 -> 127.0.0.1:5000 in namespace socket s,
- *       having epoll reference: index = 5000, splice = UDP_BACK_TO_INIT
+ *       having epoll reference: index = 5000, splice = 1, orig = 0, ns = 1
  *       - if udp_splice_to_ns[V4][5000].orig_sock:
  *         - send to udp_splice_to_ns[V4][5000].orig_sock, with destination port
  *           5000
@@ -67,19 +68,20 @@
  *   - from namespace to init:
  *
  *     - forward direction: 127.0.0.1:2000 -> 127.0.0.1:22 in namespace from
- *       socket s, with epoll reference: index = 22, splice = UDP_TO_INIT
+ *       socket s, with epoll reference: index = 22, splice = 1, orig = 1, ns = 1
  *       - if udp4_splice_to_init[V4][2000].target_sock:
  *         - send packet to udp_splice_to_init[V4][2000].target_sock, with
  *           destination port 22
  *       - otherwise:
  *         - create new socket udp_splice_to_init[V4][2000].target_sock
  *         - bind in init to 127.0.0.1:2000
- *         - add to epoll with reference: index = 2000, splice = UDP_BACK_TO_NS
+ *         - add to epoll with reference: index = 2000, splice = 1, orig = 0,
+ *           ns = 0
  *         - set udp_splice_to_init[V4][2000].orig_sock to s
  *         - update udp_splice_to_init[V4][2000].ts with current time
  *
  *     - reverse direction: 127.0.0.1:22 -> 127.0.0.1:2000 in init from socket s,
- *       having epoll reference: index = 2000, splice = UDP_BACK_TO_NS
+ *       having epoll reference: index = 2000, splice = 1, orig = 0, ns = 0
  *       - if udp_splice_to_init[V4][2000].orig_sock:
  *         - send to udp_splice_to_init[V4][2000].orig_sock, with destination port
  *           2000
@@ -404,18 +406,18 @@ static void udp_sock6_iov_init(void)
  * #syscalls:pasta getsockname
  */
 int udp_splice_new(const struct ctx *c, int v6, int bound_sock, in_port_t src,
-		   int splice)
+		   bool ns)
 {
 	struct epoll_event ev = { .events = EPOLLIN | EPOLLRDHUP | EPOLLHUP };
 	union epoll_ref ref = { .r.proto = IPPROTO_UDP,
-				.r.p.udp.udp = { .splice = splice, .v6 = v6,
-						 .port = src }
+				.r.p.udp.udp = { .splice = true, .ns = ns,
+						 .v6 = v6, .port = src }
 			      };
 	struct udp_splice_flow *flow;
 	int s;
 	int act;
 
-	if (splice == UDP_BACK_TO_INIT) {
+	if (ns) {
 		flow = &udp_splice_to_ns[v6 ? V6 : V4][src];
 		act = UDP_ACT_SPLICE_NS;
 	} else {
@@ -499,8 +501,7 @@ static int udp_splice_new_ns(void *arg)
 	if (ns_enter(a->c))
 		return 0;
 
-	a->s = udp_splice_new(a->c, a->v6, a->bound_sock, a->src,
-			      UDP_BACK_TO_INIT);
+	a->s = udp_splice_new(a->c, a->v6, a->bound_sock, a->src, true);
 
 	return 0;
 }
@@ -538,8 +539,8 @@ static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
 		src = ntohs(sa->sin_port);
 	}
 
-	switch (ref.r.p.udp.udp.splice) {
-	case UDP_TO_NS:
+
+	if (ref.r.p.udp.udp.orig && !ref.r.p.udp.udp.ns) {
 		src += c->udp.fwd_out.rdelta[src];
 
 		if (!(s = udp_splice_to_ns[v6][src].target_sock)) {
@@ -551,27 +552,24 @@ static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
 			if ((s = arg.s) < 0)
 				return;
 		}
+
 		udp_splice_to_ns[v6][src].ts = now->tv_sec;
-		break;
-	case UDP_BACK_TO_INIT:
+	} else if (!ref.r.p.udp.udp.orig && ref.r.p.udp.udp.ns) {
 		if (!(s = udp_splice_to_ns[v6][dst].orig_sock))
 			return;
-		break;
-	case UDP_TO_INIT:
+	} else if (ref.r.p.udp.udp.orig && ref.r.p.udp.udp.ns) {
 		src += c->udp.fwd_in.rdelta[src];
 
 		if (!(s = udp_splice_to_init[v6][src].target_sock)) {
-			s = udp_splice_new(c, v6, ref.r.s, src, UDP_BACK_TO_NS);
+			s = udp_splice_new(c, v6, ref.r.s, src, false);
 			if (s < 0)
 				return;
 		}
 		udp_splice_to_init[v6][src].ts = now->tv_sec;
-		break;
-	case UDP_BACK_TO_NS:
+	} else if (!ref.r.p.udp.udp.orig && !ref.r.p.udp.udp.ns) {
 		if (!(s = udp_splice_to_init[v6][dst].orig_sock))
 			return;
-		break;
-	default:
+	} else {
 		return;
 	}
 
@@ -1097,15 +1095,16 @@ void udp_sock_init(const struct ctx *c, int ns, sa_family_t af,
 
 		if (c->mode == MODE_PASTA) {
 			bind_addr = &(uint32_t){ htonl(INADDR_LOOPBACK) };
-			uref.udp.splice = UDP_TO_NS;
+			uref.udp.splice = uref.udp.orig = true;
 
 			sock_l4(c, AF_INET, IPPROTO_UDP, bind_addr, ifname,
 				port, uref.u32);
 		}
 
 		if (ns) {
+			uref.udp.splice = uref.udp.orig = uref.udp.ns = true;
+
 			bind_addr = &(uint32_t){ htonl(INADDR_LOOPBACK) };
-			uref.udp.splice = UDP_TO_INIT;
 
 			sock_l4(c, AF_INET, IPPROTO_UDP, bind_addr, ifname,
 				port, uref.u32);
@@ -1130,7 +1129,7 @@ void udp_sock_init(const struct ctx *c, int ns, sa_family_t af,
 
 		if (c->mode == MODE_PASTA) {
 			bind_addr = &in6addr_loopback;
-			uref.udp.splice = UDP_TO_NS;
+			uref.udp.splice = uref.udp.orig = true;
 
 			sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, ifname,
 				port, uref.u32);
@@ -1138,7 +1137,7 @@ void udp_sock_init(const struct ctx *c, int ns, sa_family_t af,
 
 		if (ns) {
 			bind_addr = &in6addr_loopback;
-			uref.udp.splice = UDP_TO_INIT;
+			uref.udp.splice = uref.udp.orig = uref.udp.ns = true;
 
 			sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, ifname,
 				port, uref.u32);
diff --git a/udp.h b/udp.h
index 43bd28a..053991e 100644
--- a/udp.h
+++ b/udp.h
@@ -23,20 +23,19 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s,
  * union udp_epoll_ref - epoll reference portion for TCP connections
  * @bound:		Set if this file descriptor is a bound socket
  * @splice:		Set if descriptor is associated to "spliced" connection
+ * @orig:		Set if a spliced socket which can originate "connections"
+ * @ns:			Set if this is a socket in the pasta network namespace
  * @v6:			Set for IPv6 sockets or connections
  * @port:		Source port for connected sockets, bound port otherwise
  * @u32:		Opaque u32 value of reference
  */
 union udp_epoll_ref {
 	struct {
-		uint32_t	splice:3,
-#define UDP_TO_NS		1
-#define	UDP_TO_INIT		2
-#define UDP_BACK_TO_NS		3
-#define	UDP_BACK_TO_INIT	4
-
-				v6:1,
-				port:16;
+		bool		splice:1,
+				orig:1,
+				ns:1,
+				v6:1;
+		uint32_t	port:16;
 	} udp;
 	uint32_t u32;
 };
-- 
2.38.1
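
Reviewer's note (illustration only, not part of the patch): the four old
@splice values correspond one-to-one to combinations of the new @orig and
@ns bits, as the updated comments in udp.c spell out. The sketch below
restates that mapping in standalone C. The union is a local copy of the
layout from udp.h, while enum udp_splice_role, udp_ref_make() and
udp_ref_role() are hypothetical names introduced here for illustration;
none of them exist in passt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local, illustrative copy of the bit layout introduced by the patch */
union udp_epoll_ref {
	struct {
		bool		splice:1,
				orig:1,
				ns:1,
				v6:1;
		uint32_t	port:16;
	} udp;
	uint32_t u32;
};

/* Hypothetical role names mirroring the removed UDP_* macros */
enum udp_splice_role {
	ROLE_NONE = 0,		/* not a spliced socket */
	ROLE_TO_NS,		/* was UDP_TO_NS:        orig = 1, ns = 0 */
	ROLE_TO_INIT,		/* was UDP_TO_INIT:      orig = 1, ns = 1 */
	ROLE_BACK_TO_NS,	/* was UDP_BACK_TO_NS:   orig = 0, ns = 0 */
	ROLE_BACK_TO_INIT	/* was UDP_BACK_TO_INIT: orig = 0, ns = 1 */
};

/* Hypothetical helper: build a spliced reference from the two new bits */
static union udp_epoll_ref udp_ref_make(bool orig, bool ns)
{
	union udp_epoll_ref ref = { .u32 = 0 };

	ref.udp.splice = true;
	ref.udp.orig = orig;
	ref.udp.ns = ns;

	return ref;
}

/* Hypothetical helper: recover the old-style role from the new bits */
static enum udp_splice_role udp_ref_role(union udp_epoll_ref ref)
{
	if (!ref.udp.splice)
		return ROLE_NONE;

	if (ref.udp.orig)
		return ref.udp.ns ? ROLE_TO_INIT : ROLE_TO_NS;

	return ref.udp.ns ? ROLE_BACK_TO_INIT : ROLE_BACK_TO_NS;
}
```

For example, udp_ref_role(udp_ref_make(true, false)) gives ROLE_TO_NS,
matching the "splice = 1, orig = 1, ns = 0" reference used for the
forward init-to-namespace direction. Code that only cares where a socket
lives can test @ns alone, which is what udp_splice_new() does in this
patch.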