From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Gibson
To: passt-dev@passt.top, Stefano Brivio
Cc: David Gibson
Subject: [PATCH v2 7/8] udp: Decide whether to "splice" per datagram rather than per socket
Date: Thu, 15 Dec 2022 12:30:17 +1100
Message-Id: <20221215013018.1556807-8-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221215013018.1556807-1-david@gibson.dropbear.id.au>
References: <20221215013018.1556807-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Development discussion and patches for passt

Currently we have special sockets for receiving datagrams from
localhost which can use the optimized "splice" path rather than going
across the tap interface.

We want to loosen this so that sockets can receive datagrams that will
be forwarded by both the spliced and non-spliced paths. To do this, we
alter the meaning of the @splice bit in the reference to mean that
packets received on this socket *can* be spliced, not that they *will*
be spliced. They'll only actually be spliced if they come from
127.0.0.1 or ::1.

We can't (for now) remove the splice bit entirely, unlike with TCP. Our
gateway mapping means that if the ns initiates communication to the gw
address, we'll translate that to target 127.0.0.1 on the host side.
Reply packets will therefore have source address 127.0.0.1 when
received on the host, but these need to go via the tap path where that
will be translated back to the gateway address. We need the @splice bit
to distinguish that case from packets going from localhost to a port
mapped explicitly with -u which should be spliced.

Signed-off-by: David Gibson
---
 udp.c | 52 +++++++++++++++++++++++++++++++++-------------------
 udp.h |  2 +-
 2 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/udp.c b/udp.c
index ed31216..8ba3453 100644
--- a/udp.c
+++ b/udp.c
@@ -513,16 +513,25 @@ static int udp_splice_new_ns(void *arg)
 }
 
 /**
- * sa_port() - Determine port from a sockaddr_in or sockaddr_in6
+ * udp_mmh_splice_port() - Is source address of message suitable for splicing?
  * @v6:		Is @sa a sockaddr_in6 (otherwise sockaddr_in)?
- * @sa:		Pointer to either sockaddr_in or sockaddr_in6
+ * @mmh:	mmsghdr of incoming message
+ *
+ * Return: if @sa refers to localhost (127.0.0.1 or ::1) the port from
+ *         @sa in host order, otherwise -1.
  */
-static in_port_t sa_port(bool v6, const void *sa)
+static int udp_mmh_splice_port(bool v6, const struct mmsghdr *mmh)
 {
-	const struct sockaddr_in6 *sa6 = sa;
-	const struct sockaddr_in *sa4 = sa;
+	const struct sockaddr_in6 *sa6 = mmh->msg_hdr.msg_name;
+	const struct sockaddr_in *sa4 = mmh->msg_hdr.msg_name;
+
+	if (v6 && IN6_IS_ADDR_LOOPBACK(&sa6->sin6_addr))
+		return ntohs(sa6->sin6_port);
+
+	if (!v6 && IN4_IS_ADDR_LOOPBACK(&sa4->sin_addr))
+		return ntohs(sa4->sin_port);
 
-	return v6 ? ntohs(sa6->sin6_port) : ntohs(sa4->sin_port);
+	return -1;
 }
 
 /**
@@ -918,23 +927,28 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
 	if (n <= 0)
 		return;
 
-	if (!ref.r.p.udp.udp.splice) {
-		udp_tap_send(c, 0, n, dstport, v6, now);
-		return;
-	}
-
 	for (i = 0; i < n; i += m) {
-		in_port_t src = sa_port(v6, mmh_recv[i].msg_hdr.msg_name);
+		int splicefrom = -1;
+		m = n;
+
+		if (ref.r.p.udp.udp.splice) {
+			splicefrom = udp_mmh_splice_port(v6, mmh_recv + i);
+
+			for (m = 1; i + m < n; m++) {
+				int p;
 
-		for (m = 1; i + m < n; m++) {
-			void *mname = mmh_recv[i + m].msg_hdr.msg_name;
-			if (sa_port(v6, mname) != src)
-				break;
+				p = udp_mmh_splice_port(v6, mmh_recv + i + m);
+				if (p != splicefrom)
+					break;
+			}
 		}
 
-		udp_splice_sendfrom(c, i, m, src, dstport, v6,
-				    ref.r.p.udp.udp.ns, ref.r.p.udp.udp.orig,
-				    now);
+		if (splicefrom >= 0)
+			udp_splice_sendfrom(c, i, m, splicefrom, dstport,
+					    v6, ref.r.p.udp.udp.ns,
+					    ref.r.p.udp.udp.orig, now);
+		else
+			udp_tap_send(c, i, m, dstport, v6, now);
 	}
 }
 
diff --git a/udp.h b/udp.h
index 053991e..2a03335 100644
--- a/udp.h
+++ b/udp.h
@@ -22,7 +22,7 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s,
 /**
  * union udp_epoll_ref - epoll reference portion for TCP connections
  * @bound:		Set if this file descriptor is a bound socket
- * @splice:		Set if descriptor is associated to "spliced" connection
+ * @splice:		Set if descriptor packets to be "spliced"
  * @orig:		Set if a spliced socket which can originate "connections"
  * @ns:			Set if this is a socket in the pasta network namespace
  * @v6:			Set for IPv6 sockets or connections
-- 
2.38.1
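
For readers following the batching logic outside the passt tree, here is a
minimal standalone sketch of the per-datagram decision described above. It is
not the patch's code. splice_port() and run_length() are hypothetical names,
the source addresses come from a plain array of struct sockaddr_storage
rather than from mmh_recv[], and the IPv4 check assumes that all of
127.0.0.0/8 counts as loopback (the definition of IN4_IS_ADDR_LOOPBACK is not
shown in this patch and may differ).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for udp_mmh_splice_port(): takes a raw source
 * address instead of a struct mmsghdr.
 *
 * Returns the source port in host order if the address is loopback
 * (assumed here: 127.0.0.0/8 or ::1), otherwise -1.
 */
static int splice_port(bool v6, const void *sa)
{
	const struct sockaddr_in6 *sa6 = sa;
	const struct sockaddr_in *sa4 = sa;

	if (v6 && IN6_IS_ADDR_LOOPBACK(&sa6->sin6_addr))
		return ntohs(sa6->sin6_port);

	if (!v6 && (ntohl(sa4->sin_addr.s_addr) >> 24) == 127)
		return ntohs(sa4->sin_port);

	return -1;
}

/* Hypothetical helper: length of the run of datagrams starting at index @i
 * that can be forwarded together.
 *
 * If the socket cannot splice at all, everything from @i onwards goes out
 * in one batch and *@port is -1 (tap path). Otherwise the run extends while
 * consecutive datagrams give the same splice_port() result, and *@port is
 * that result (>= 0 means spliced path, -1 means tap path).
 */
static size_t run_length(bool v6, const struct sockaddr_storage *src,
			 size_t n, size_t i, bool can_splice, int *port)
{
	size_t m;

	*port = -1;
	if (!can_splice)
		return n - i;

	*port = splice_port(v6, &src[i]);
	for (m = 1; i + m < n; m++) {
		if (splice_port(v6, &src[i + m]) != *port)
			break;
	}

	return m;
}
```

A caller would loop with i += run_length(...) and pick the spliced or tap
path for each run depending on whether the returned port is non-negative,
which mirrors the shape of the loop in udp_sock_handler() above.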