From: David Gibson
To: passt-dev@passt.top, Stefano Brivio
Cc: David Gibson
Subject: [PATCH v2 16/16] udp: Correct splice forwarding when receiving from multiple sources
Date: Thu, 24 Nov 2022 12:16:59 +1100
Message-Id: <20221124011659.1024901-17-david@gibson.dropbear.id.au>
In-Reply-To: <20221124011659.1024901-1-david@gibson.dropbear.id.au>
References: <20221124011659.1024901-1-david@gibson.dropbear.id.au>
List-Id: Development discussion and patches for passt

udp_sock_handler_splice() reads a whole batch of datagrams at once with
recvmmsg().  It then forwards them all via a single socket on the other
side, based on the source port.

However, it's entirely possible that the datagrams in the set have
different source ports, and thus ought to be forwarded via different
sockets on the destination side.  In fact this situation arises with the
iperf3 -P4 throughput tests in our own test suite.  AFAICT we only get
away with this because iperf3 is strictly one way and doesn't send reply
packets, which would be misdirected because of the incorrect source
ports.

Alter udp_sock_handler_splice() to split the packets it receives into
batches with the same source port and send each batch with a separate
sendmmsg().  For now we only look for already contiguous batches, which
means that if there are multiple active flows interleaved this is likely
to degenerate to batches of size 1.  This is the simplest way to correct
the behaviour; we can try to optimize later.

Signed-off-by: David Gibson
---
 udp.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/udp.c b/udp.c
index 2311e7d..ee5c2c5 100644
--- a/udp.c
+++ b/udp.c
@@ -572,9 +572,9 @@ static void udp_splice_sendfrom(const struct ctx *c, struct mmsghdr *mmh, int n,
 static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
 				    uint32_t events, const struct timespec *now)
 {
-	in_port_t src, dst = ref.r.p.udp.udp.port;
+	in_port_t dst = ref.r.p.udp.udp.port;
+	int v6 = ref.r.p.udp.udp.v6, n, i, m;
 	struct mmsghdr *mmh_recv, *mmh_send;
-	int v6 = ref.r.p.udp.udp.v6, n, i;
 
 	if (!(events & EPOLLIN))
 		return;
@@ -610,12 +610,23 @@ static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
 		});
 	}
 
-	for (i = 0; i < n; i++)
-		mmh_send[i].msg_hdr.msg_iov->iov_len = mmh_recv[i].msg_len;
-
-	src = sa_port(v6, mmh_recv[0].msg_hdr.msg_name);
-	udp_splice_sendfrom(c, mmh_send, n, src, ref.r.p.udp.udp.port,
-			    v6, ref.r.p.udp.udp.ns, ref.r.p.udp.udp.orig, now);
+	for (i = 0; i < n; i += m) {
+		const struct mmsghdr *mmh = &mmh_recv[i];
+		in_port_t src = sa_port(v6, mmh->msg_hdr.msg_name);
+
+		m = 0;
+		do {
+			mmh_send[i + m].msg_hdr.msg_iov->iov_len = mmh->msg_len;
+			mmh++;
+			m++;
+		} while (sa_port(v6, mmh->msg_hdr.msg_name) == src);
+
+		udp_splice_sendfrom(c, mmh_send + i, m,
+				    src, ref.r.p.udp.udp.port,
+				    v6, ref.r.p.udp.udp.ns,
+				    ref.r.p.udp.udp.orig,
+				    now);
+	}
 }
 
 /**
-- 
2.38.1
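
The contiguous batching described in the commit message can be sketched
in isolation.  This is a minimal illustration under stated assumptions,
not the passt code: struct dgram, run_len() and split_batches() are
hypothetical names, and a bare source port stands in for the msg_name of
each received mmsghdr.  The sketch bounds the inner scan by the number
of datagrams received, so it never looks past the end of the batch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one received datagram; only the source
 * port matters for the batching decision. */
struct dgram {
	uint16_t src_port;
};

/* Length of the contiguous run starting at 'start' whose datagrams all
 * share the source port of d[start].  Entries at index >= n are never
 * read. */
static size_t run_len(const struct dgram *d, size_t n, size_t start)
{
	size_t m;

	assert(start < n);
	for (m = 1; start + m < n; m++) {
		if (d[start + m].src_port != d[start].src_port)
			break;
	}
	return m;
}

/* Split d[0..n) into contiguous same-port batches.  The length of each
 * batch is stored in lens[] (which must have room for n entries) and
 * the number of batches is returned.  A real forwarder would issue one
 * sendmmsg() per batch where this sketch records the length. */
static size_t split_batches(const struct dgram *d, size_t n, size_t *lens)
{
	size_t i, m, batches = 0;

	for (i = 0; i < n; i += m) {
		m = run_len(d, n, i);
		lens[batches++] = m;
	}
	return batches;
}
```

For source ports 5000, 5000, 5001, 5000 this yields three batches of
length 2, 1 and 1.  Fully interleaved flows (for example 5000, 5001,
5000, 5001) degenerate to batches of size 1, as the commit message
notes.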