From: David Gibson <david@gibson.dropbear.id.au>
To: passt-dev@passt.top, Stefano Brivio <sbrivio@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH v2 16/16] udp: Correct splice forwarding when receiving from multiple sources
Date: Thu, 24 Nov 2022 12:16:59 +1100
Message-ID: <20221124011659.1024901-17-david@gibson.dropbear.id.au>
In-Reply-To: <20221124011659.1024901-1-david@gibson.dropbear.id.au>

udp_sock_handler_splice() reads a whole batch of datagrams at once with
recvmmsg().  It then forwards them all via a single socket on the other
side, based on the source port.

However, it's entirely possible that the datagrams in the set have
different source ports, and thus ought to be forwarded via different
sockets on the destination side.  In fact this situation arises with the
iperf3 -P4 throughput tests in our own test suite.  AFAICT we only get
away with this because iperf3 is strictly one way and doesn't send reply
packets, which would be misdirected because of the incorrect source ports.

Alter udp_sock_handler_splice() to split the packets it receives into
batches with the same source port and send each batch with a separate
sendmmsg().

For now we only look for already contiguous batches, which means that if
there are multiple active flows interleaved this is likely to degenerate
to batches of size 1.  This is the simplest way to correct the behaviour
and we can try to optimize later.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
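Note: the contiguous-batch splitting described above can be sketched
standalone, outside of passt.  This is only an illustration under stated
assumptions: struct dgram, run_len() and split_runs() are hypothetical
names that do not exist in udp.c, and the sketch only records run lengths
where a real caller would issue one sendmmsg() per run.  It bounds the
scan with i + m < n so it never inspects an entry past the n datagrams
that were received.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one received datagram: only the source port
 * matters for batching.  This is not a passt structure. */
struct dgram {
	uint16_t src_port;
};

/* Length of the contiguous run starting at index i whose entries all share
 * d[i].src_port.  Requires i < n; never reads d[n] or beyond. */
static size_t run_len(const struct dgram *d, size_t n, size_t i)
{
	size_t m;

	assert(i < n);
	for (m = 1; i + m < n; m++)
		if (d[i + m].src_port != d[i].src_port)
			break;
	return m;
}

/* Split d[0..n) into contiguous same-port runs and store each run's length
 * in lens[], which must have room for n entries.  Returns the run count. */
static size_t split_runs(const struct dgram *d, size_t n, size_t *lens)
{
	size_t i, m, runs = 0;

	for (i = 0; i < n; i += m) {
		m = run_len(d, n, i);
		lens[runs++] = m;	/* a real caller would send the run here */
	}
	return runs;
}
```

With ports 5000, 5000, 5001, 5000 this yields three runs of lengths 2, 1
and 1: the last 5000 is not merged with the first two because only already
contiguous entries are grouped, which is the degenerate case for
interleaved flows mentioned in the commit message.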
 udp.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/udp.c b/udp.c
index 2311e7d..ee5c2c5 100644
--- a/udp.c
+++ b/udp.c
@@ -572,9 +572,9 @@ static void udp_splice_sendfrom(const struct ctx *c, struct mmsghdr *mmh, int n,
 static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
 				    uint32_t events, const struct timespec *now)
 {
-	in_port_t src, dst = ref.r.p.udp.udp.port;
+	in_port_t dst = ref.r.p.udp.udp.port;
+	int v6 = ref.r.p.udp.udp.v6, n, i, m;
 	struct mmsghdr *mmh_recv, *mmh_send;
-	int v6 = ref.r.p.udp.udp.v6, n, i;
 
 	if (!(events & EPOLLIN))
 		return;
@@ -610,12 +610,23 @@ static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
 		});
 	}
 
-	for (i = 0; i < n; i++)
-		mmh_send[i].msg_hdr.msg_iov->iov_len = mmh_recv[i].msg_len;
-
-	src = sa_port(v6, mmh_recv[0].msg_hdr.msg_name);
-	udp_splice_sendfrom(c, mmh_send, n, src, ref.r.p.udp.udp.port,
-			    v6, ref.r.p.udp.udp.ns, ref.r.p.udp.udp.orig, now);
+	for (i = 0; i < n; i += m) {
+		const struct mmsghdr *mmh = &mmh_recv[i];
+		in_port_t src = sa_port(v6, mmh->msg_hdr.msg_name);
+
+		m = 0;
+		do {
+			mmh_send[i + m].msg_hdr.msg_iov->iov_len = mmh->msg_len;
+			mmh++;
+			m++;
+		} while (i + m < n && sa_port(v6, mmh->msg_hdr.msg_name) == src);
+
+		udp_splice_sendfrom(c, mmh_send + i, m,
+				    src, ref.r.p.udp.udp.port,
+				    v6, ref.r.p.udp.udp.ns,
+				    ref.r.p.udp.udp.orig,
+				    now);
+	}
 }
 
 /**
-- 
2.38.1

