* [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications
@ 2023-01-05 4:26 David Gibson
2023-01-05 4:26 ` [PATCH v4 1/8] udp: Move sending pasta tap frames to the end of udp_sock_handler() David Gibson
` (8 more replies)
0 siblings, 9 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
At present, the UDP "splice" and "tap" paths are quite separate. We
have separate sockets to receive packets bound for the tap and splice
paths. This leads to some code duplication, and extra open sockets.
This series partially unifies the two paths, allowing us to use a
single (host side) socket, bound to 0.0.0.0 or :: to receive packets
for both cases.
Changes since v3:
* Fixed really dumb compile error, and actually ran through the tests.
Oops.
Changes since v2:
* Don't receive multiple packets at once for pasta mode - seems to
hurt throughput on balance.
* Add some comments clarifying reasoning here.
Changes since v1:
* Renamed udp_localname[46] to udp[46]_localname
* Allow handling of UDP port 0
* Fix a bug which could misidentify certain v6 packets as v4-spliceable
* Some minor cosmetic fixes to code and commit messages
David Gibson (8):
udp: Move sending pasta tap frames to the end of udp_sock_handler()
udp: Split sending to passt tap interface into separate function
udp: Split receive from preparation and send in udp_sock_handler()
udp: Don't handle tap receive batch size calculation within a #define
udp: Pre-populate msg_names with local address
udp: Unify udp_sock_handler_splice() with udp_sock_handler()
udp: Decide whether to "splice" per datagram rather than per socket
udp: Don't use separate sockets to listen for spliced packets
udp.c | 388 ++++++++++++++++++++++++++++++---------------------------
udp.h | 2 +-
util.h | 7 ++
3 files changed, 213 insertions(+), 184 deletions(-)
--
2.39.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v4 1/8] udp: Move sending pasta tap frames to the end of udp_sock_handler()
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
@ 2023-01-05 4:26 ` David Gibson
2023-01-05 4:26 ` [PATCH v4 2/8] udp: Split sending to passt tap interface into separate function David Gibson
` (7 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
udp_sock_handler() has a surprising difference in flow between pasta and
passt mode: For pasta we send each frame to the tap interface as we prepare
it. For passt, though, we prepare all the frames, then send them with a
single sendmmsg().
Alter the pasta path to also prepare all the frames, then send them at the
end. We already have a suitable data structure for the passt case. This
will make it easier to abstract out the tap backend difference in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 61 ++++++++++++++++++++++++++++++++++++++++-------------------
1 file changed, 42 insertions(+), 19 deletions(-)
diff --git a/udp.c b/udp.c
index 4610a02..a27fe88 100644
--- a/udp.c
+++ b/udp.c
@@ -783,6 +783,35 @@ static size_t udp_update_hdr6(const struct ctx *c, int n, in_port_t dstport,
return buf_len;
}
+/**
+ * udp_tap_send_pasta() - Send datagrams to the pasta tap interface
+ * @c: Execution context
+ * @mmh: Array of message headers to send
+ * @n: Number of message headers to send
+ *
+ * #syscalls:pasta write
+ */
+static void udp_tap_send_pasta(const struct ctx *c, struct mmsghdr *mmh,
+ unsigned int n)
+{
+ unsigned int i, j;
+
+ for (i = 0; i < n; i++) {
+ for (j = 0; j < mmh[i].msg_hdr.msg_iovlen; j++) {
+ struct iovec *iov = &mmh[i].msg_hdr.msg_iov[j];
+
+ /* We can't use writev() because the tap
+ * character device relies on the write()
+ * boundaries to discern frame boundaries
+ */
+ if (write(c->fd_tap, iov->iov_base, iov->iov_len) < 0)
+ debug("tap write: %s", strerror(errno));
+ else
+ pcap(iov->iov_base, iov->iov_len);
+ }
+ }
+}
+
/**
* udp_sock_handler() - Handle new data from socket
* @c: Execution context
@@ -835,31 +864,25 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
else
buf_len = udp_update_hdr4(c, i, dstport, now);
- if (c->mode == MODE_PASTA) {
- void *frame = tap_iov[i].iov_base;
+ tap_iov[i].iov_len = buf_len;
- if (write(c->fd_tap, frame, buf_len) < 0)
- debug("tap write: %s", strerror(errno));
- pcap(frame, buf_len);
- } else {
- tap_iov[i].iov_len = buf_len;
-
- /* With bigger messages, qemu closes the connection. */
- if (msg_bufs && msg_len + buf_len > SHRT_MAX) {
- tap_mmh[msg_i].msg_hdr.msg_iovlen = msg_bufs;
- msg_i++;
- tap_mmh[msg_i].msg_hdr.msg_iov = &tap_iov[i];
- msg_len = msg_bufs = 0;
- }
-
- msg_len += buf_len;
- msg_bufs++;
+ /* With bigger messages, qemu closes the connection. */
+ if (c->mode == MODE_PASST && msg_bufs &&
+ msg_len + buf_len > SHRT_MAX) {
+ tap_mmh[msg_i].msg_hdr.msg_iovlen = msg_bufs;
+ msg_i++;
+ tap_mmh[msg_i].msg_hdr.msg_iov = &tap_iov[i];
+ msg_len = msg_bufs = 0;
}
+ msg_len += buf_len;
+ msg_bufs++;
}
tap_mmh[msg_i].msg_hdr.msg_iovlen = msg_bufs;
- if (c->mode == MODE_PASTA)
+ if (c->mode == MODE_PASTA) {
+ udp_tap_send_pasta(c, tap_mmh, msg_i + 1);
return;
+ }
ret = sendmmsg(c->fd_tap, tap_mmh, msg_i + 1,
MSG_NOSIGNAL | MSG_DONTWAIT);
--
@@ -783,6 +783,35 @@ static size_t udp_update_hdr6(const struct ctx *c, int n, in_port_t dstport,
return buf_len;
}
+/**
+ * udp_tap_send_pasta() - Send datagrams to the pasta tap interface
+ * @c: Execution context
+ * @mmh: Array of message headers to send
+ * @n: Number of message headers to send
+ *
+ * #syscalls:pasta write
+ */
+static void udp_tap_send_pasta(const struct ctx *c, struct mmsghdr *mmh,
+ unsigned int n)
+{
+ unsigned int i, j;
+
+ for (i = 0; i < n; i++) {
+ for (j = 0; j < mmh[i].msg_hdr.msg_iovlen; j++) {
+ struct iovec *iov = &mmh[i].msg_hdr.msg_iov[j];
+
+ /* We can't use writev() because the tap
+ * character device relies on the write()
+ * boundaries to discern frame boundaries
+ */
+ if (write(c->fd_tap, iov->iov_base, iov->iov_len) < 0)
+ debug("tap write: %s", strerror(errno));
+ else
+ pcap(iov->iov_base, iov->iov_len);
+ }
+ }
+}
+
/**
* udp_sock_handler() - Handle new data from socket
* @c: Execution context
@@ -835,31 +864,25 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
else
buf_len = udp_update_hdr4(c, i, dstport, now);
- if (c->mode == MODE_PASTA) {
- void *frame = tap_iov[i].iov_base;
+ tap_iov[i].iov_len = buf_len;
- if (write(c->fd_tap, frame, buf_len) < 0)
- debug("tap write: %s", strerror(errno));
- pcap(frame, buf_len);
- } else {
- tap_iov[i].iov_len = buf_len;
-
- /* With bigger messages, qemu closes the connection. */
- if (msg_bufs && msg_len + buf_len > SHRT_MAX) {
- tap_mmh[msg_i].msg_hdr.msg_iovlen = msg_bufs;
- msg_i++;
- tap_mmh[msg_i].msg_hdr.msg_iov = &tap_iov[i];
- msg_len = msg_bufs = 0;
- }
-
- msg_len += buf_len;
- msg_bufs++;
+ /* With bigger messages, qemu closes the connection. */
+ if (c->mode == MODE_PASST && msg_bufs &&
+ msg_len + buf_len > SHRT_MAX) {
+ tap_mmh[msg_i].msg_hdr.msg_iovlen = msg_bufs;
+ msg_i++;
+ tap_mmh[msg_i].msg_hdr.msg_iov = &tap_iov[i];
+ msg_len = msg_bufs = 0;
}
+ msg_len += buf_len;
+ msg_bufs++;
}
tap_mmh[msg_i].msg_hdr.msg_iovlen = msg_bufs;
- if (c->mode == MODE_PASTA)
+ if (c->mode == MODE_PASTA) {
+ udp_tap_send_pasta(c, tap_mmh, msg_i + 1);
return;
+ }
ret = sendmmsg(c->fd_tap, tap_mmh, msg_i + 1,
MSG_NOSIGNAL | MSG_DONTWAIT);
--
2.39.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v4 2/8] udp: Split sending to passt tap interface into separate function
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
2023-01-05 4:26 ` [PATCH v4 1/8] udp: Move sending pasta tap frames to the end of udp_sock_handler() David Gibson
@ 2023-01-05 4:26 ` David Gibson
2023-01-05 4:26 ` [PATCH v4 3/8] udp: Split receive from preparation and send in udp_sock_handler() David Gibson
` (6 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
The last part of udp_sock_handler() does the actual sending of frames
to the tap interface. For pasta that's just a call to
udp_tap_send_pasta() but for passt, it's moderately complex and open
coded.
For symmetry, move the passt send path into its own function,
udp_tap_send_passt(). This will make it easier to abstract the tap
interface in future (e.g. when we want to add vhost-user).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 130 ++++++++++++++++++++++++++++++++--------------------------
1 file changed, 72 insertions(+), 58 deletions(-)
diff --git a/udp.c b/udp.c
index a27fe88..7281bc3 100644
--- a/udp.c
+++ b/udp.c
@@ -812,6 +812,73 @@ static void udp_tap_send_pasta(const struct ctx *c, struct mmsghdr *mmh,
}
}
+/**
+ * udp_tap_send_passt() - Send datagrams to the passt tap interface
+ * @c: Execution context
+ * @mmh: Array of message headers to send
+ * @n: Number of message headers to send
+ *
+ * #syscalls:passt sendmmsg sendmsg
+ */
+static void udp_tap_send_passt(const struct ctx *c, struct mmsghdr *mmh, int n)
+{
+ struct msghdr *last_mh;
+ ssize_t missing = 0;
+ size_t msg_len = 0;
+ unsigned int i;
+ int ret;
+
+ ret = sendmmsg(c->fd_tap, mmh, n, MSG_NOSIGNAL | MSG_DONTWAIT);
+ if (ret <= 0)
+ return;
+
+ /* If we lose some messages to sendmmsg() here, fine, it's UDP. However,
+ * the last message needs to be delivered completely, otherwise qemu
+ * will fail to reassemble the next message and close the connection. Go
+ * through headers from the last sent message, counting bytes, and, if
+ * and as soon as we see more bytes than sendmmsg() sent, re-send the
+ * rest with a blocking call.
+ *
+ * In pictures, given this example:
+ *
+ * iov #0 iov #1 iov #2 iov #3
+ * tap_mmh[ret - 1].msg_hdr: .... ...... ..... ......
+ * tap_mmh[ret - 1].msg_len: 7 .... ...
+ *
+ * when 'msglen' reaches: 10 ^
+ * and 'missing' below is: 3 ---
+ *
+ * re-send everything from here: ^-- ----- ------
+ */
+ last_mh = &mmh[ret - 1].msg_hdr;
+ for (i = 0; i < last_mh->msg_iovlen; i++) {
+ if (missing <= 0) {
+ msg_len += last_mh->msg_iov[i].iov_len;
+ missing = msg_len - mmh[ret - 1].msg_len;
+ }
+
+ if (missing > 0) {
+ uint8_t **iov_base;
+ int first_offset;
+
+ iov_base = (uint8_t **)&last_mh->msg_iov[i].iov_base;
+ first_offset = last_mh->msg_iov[i].iov_len - missing;
+ *iov_base += first_offset;
+ last_mh->msg_iov[i].iov_len = missing;
+
+ last_mh->msg_iov = &last_mh->msg_iov[i];
+
+ if (sendmsg(c->fd_tap, last_mh, MSG_NOSIGNAL) < 0)
+ debug("UDP: %li bytes to tap missing", missing);
+
+ *iov_base -= first_offset;
+ break;
+ }
+ }
+
+ pcapmm(mmh, ret);
+}
+
/**
* udp_sock_handler() - Handle new data from socket
* @c: Execution context
@@ -820,16 +887,14 @@ static void udp_tap_send_pasta(const struct ctx *c, struct mmsghdr *mmh,
* @now: Current timestamp
*
* #syscalls recvmmsg
- * #syscalls:passt sendmmsg sendmsg
*/
void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
const struct timespec *now)
{
in_port_t dstport = ref.r.p.udp.udp.port;
- ssize_t n, msg_len = 0, missing = 0;
struct mmsghdr *tap_mmh, *sock_mmh;
- int msg_bufs = 0, msg_i = 0, ret;
- struct msghdr *last_mh;
+ int msg_bufs = 0, msg_i = 0;
+ ssize_t n, msg_len = 0;
struct iovec *tap_iov;
unsigned int i;
@@ -879,61 +944,10 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
}
tap_mmh[msg_i].msg_hdr.msg_iovlen = msg_bufs;
- if (c->mode == MODE_PASTA) {
+ if (c->mode == MODE_PASTA)
udp_tap_send_pasta(c, tap_mmh, msg_i + 1);
- return;
- }
-
- ret = sendmmsg(c->fd_tap, tap_mmh, msg_i + 1,
- MSG_NOSIGNAL | MSG_DONTWAIT);
- if (ret <= 0)
- return;
-
- /* If we lose some messages to sendmmsg() here, fine, it's UDP. However,
- * the last message needs to be delivered completely, otherwise qemu
- * will fail to reassemble the next message and close the connection. Go
- * through headers from the last sent message, counting bytes, and, if
- * and as soon as we see more bytes than sendmmsg() sent, re-send the
- * rest with a blocking call.
- *
- * In pictures, given this example:
- *
- * iov #0 iov #1 iov #2 iov #3
- * tap_mmh[ret - 1].msg_hdr: .... ...... ..... ......
- * tap_mmh[ret - 1].msg_len: 7 .... ...
- *
- * when 'msglen' reaches: 10 ^
- * and 'missing' below is: 3 ---
- *
- * re-send everything from here: ^-- ----- ------
- */
- last_mh = &tap_mmh[ret - 1].msg_hdr;
- for (i = 0, msg_len = 0; i < last_mh->msg_iovlen; i++) {
- if (missing <= 0) {
- msg_len += last_mh->msg_iov[i].iov_len;
- missing = msg_len - tap_mmh[ret - 1].msg_len;
- }
-
- if (missing > 0) {
- uint8_t **iov_base;
- int first_offset;
-
- iov_base = (uint8_t **)&last_mh->msg_iov[i].iov_base;
- first_offset = last_mh->msg_iov[i].iov_len - missing;
- *iov_base += first_offset;
- last_mh->msg_iov[i].iov_len = missing;
-
- last_mh->msg_iov = &last_mh->msg_iov[i];
-
- if (sendmsg(c->fd_tap, last_mh, MSG_NOSIGNAL) < 0)
- debug("UDP: %li bytes to tap missing", missing);
-
- *iov_base -= first_offset;
- break;
- }
- }
-
- pcapmm(tap_mmh, ret);
+ else
+ udp_tap_send_passt(c, tap_mmh, msg_i + 1);
}
/**
--
@@ -812,6 +812,73 @@ static void udp_tap_send_pasta(const struct ctx *c, struct mmsghdr *mmh,
}
}
+/**
+ * udp_tap_send_passt() - Send datagrams to the passt tap interface
+ * @c: Execution context
+ * @mmh: Array of message headers to send
+ * @n: Number of message headers to send
+ *
+ * #syscalls:passt sendmmsg sendmsg
+ */
+static void udp_tap_send_passt(const struct ctx *c, struct mmsghdr *mmh, int n)
+{
+ struct msghdr *last_mh;
+ ssize_t missing = 0;
+ size_t msg_len = 0;
+ unsigned int i;
+ int ret;
+
+ ret = sendmmsg(c->fd_tap, mmh, n, MSG_NOSIGNAL | MSG_DONTWAIT);
+ if (ret <= 0)
+ return;
+
+ /* If we lose some messages to sendmmsg() here, fine, it's UDP. However,
+ * the last message needs to be delivered completely, otherwise qemu
+ * will fail to reassemble the next message and close the connection. Go
+ * through headers from the last sent message, counting bytes, and, if
+ * and as soon as we see more bytes than sendmmsg() sent, re-send the
+ * rest with a blocking call.
+ *
+ * In pictures, given this example:
+ *
+ * iov #0 iov #1 iov #2 iov #3
+ * tap_mmh[ret - 1].msg_hdr: .... ...... ..... ......
+ * tap_mmh[ret - 1].msg_len: 7 .... ...
+ *
+ * when 'msglen' reaches: 10 ^
+ * and 'missing' below is: 3 ---
+ *
+ * re-send everything from here: ^-- ----- ------
+ */
+ last_mh = &mmh[ret - 1].msg_hdr;
+ for (i = 0; i < last_mh->msg_iovlen; i++) {
+ if (missing <= 0) {
+ msg_len += last_mh->msg_iov[i].iov_len;
+ missing = msg_len - mmh[ret - 1].msg_len;
+ }
+
+ if (missing > 0) {
+ uint8_t **iov_base;
+ int first_offset;
+
+ iov_base = (uint8_t **)&last_mh->msg_iov[i].iov_base;
+ first_offset = last_mh->msg_iov[i].iov_len - missing;
+ *iov_base += first_offset;
+ last_mh->msg_iov[i].iov_len = missing;
+
+ last_mh->msg_iov = &last_mh->msg_iov[i];
+
+ if (sendmsg(c->fd_tap, last_mh, MSG_NOSIGNAL) < 0)
+ debug("UDP: %li bytes to tap missing", missing);
+
+ *iov_base -= first_offset;
+ break;
+ }
+ }
+
+ pcapmm(mmh, ret);
+}
+
/**
* udp_sock_handler() - Handle new data from socket
* @c: Execution context
@@ -820,16 +887,14 @@ static void udp_tap_send_pasta(const struct ctx *c, struct mmsghdr *mmh,
* @now: Current timestamp
*
* #syscalls recvmmsg
- * #syscalls:passt sendmmsg sendmsg
*/
void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
const struct timespec *now)
{
in_port_t dstport = ref.r.p.udp.udp.port;
- ssize_t n, msg_len = 0, missing = 0;
struct mmsghdr *tap_mmh, *sock_mmh;
- int msg_bufs = 0, msg_i = 0, ret;
- struct msghdr *last_mh;
+ int msg_bufs = 0, msg_i = 0;
+ ssize_t n, msg_len = 0;
struct iovec *tap_iov;
unsigned int i;
@@ -879,61 +944,10 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
}
tap_mmh[msg_i].msg_hdr.msg_iovlen = msg_bufs;
- if (c->mode == MODE_PASTA) {
+ if (c->mode == MODE_PASTA)
udp_tap_send_pasta(c, tap_mmh, msg_i + 1);
- return;
- }
-
- ret = sendmmsg(c->fd_tap, tap_mmh, msg_i + 1,
- MSG_NOSIGNAL | MSG_DONTWAIT);
- if (ret <= 0)
- return;
-
- /* If we lose some messages to sendmmsg() here, fine, it's UDP. However,
- * the last message needs to be delivered completely, otherwise qemu
- * will fail to reassemble the next message and close the connection. Go
- * through headers from the last sent message, counting bytes, and, if
- * and as soon as we see more bytes than sendmmsg() sent, re-send the
- * rest with a blocking call.
- *
- * In pictures, given this example:
- *
- * iov #0 iov #1 iov #2 iov #3
- * tap_mmh[ret - 1].msg_hdr: .... ...... ..... ......
- * tap_mmh[ret - 1].msg_len: 7 .... ...
- *
- * when 'msglen' reaches: 10 ^
- * and 'missing' below is: 3 ---
- *
- * re-send everything from here: ^-- ----- ------
- */
- last_mh = &tap_mmh[ret - 1].msg_hdr;
- for (i = 0, msg_len = 0; i < last_mh->msg_iovlen; i++) {
- if (missing <= 0) {
- msg_len += last_mh->msg_iov[i].iov_len;
- missing = msg_len - tap_mmh[ret - 1].msg_len;
- }
-
- if (missing > 0) {
- uint8_t **iov_base;
- int first_offset;
-
- iov_base = (uint8_t **)&last_mh->msg_iov[i].iov_base;
- first_offset = last_mh->msg_iov[i].iov_len - missing;
- *iov_base += first_offset;
- last_mh->msg_iov[i].iov_len = missing;
-
- last_mh->msg_iov = &last_mh->msg_iov[i];
-
- if (sendmsg(c->fd_tap, last_mh, MSG_NOSIGNAL) < 0)
- debug("UDP: %li bytes to tap missing", missing);
-
- *iov_base -= first_offset;
- break;
- }
- }
-
- pcapmm(tap_mmh, ret);
+ else
+ udp_tap_send_passt(c, tap_mmh, msg_i + 1);
}
/**
--
2.39.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v4 3/8] udp: Split receive from preparation and send in udp_sock_handler()
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
2023-01-05 4:26 ` [PATCH v4 1/8] udp: Move sending pasta tap frames to the end of udp_sock_handler() David Gibson
2023-01-05 4:26 ` [PATCH v4 2/8] udp: Split sending to passt tap interface into separate function David Gibson
@ 2023-01-05 4:26 ` David Gibson
2023-01-05 4:26 ` [PATCH v4 4/8] udp: Don't handle tap receive batch size calculation within a #define David Gibson
` (5 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
The receive part of udp_sock_handler() and udp_sock_handler_splice() is now
almost identical. In preparation for merging that, split the receive part
of udp_sock_handler() from the part preparing and sending the frames for
sending on the tap interface. The latter goes into a new udp_tap_send()
function.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 79 +++++++++++++++++++++++++++++++++++++++--------------------
1 file changed, 52 insertions(+), 27 deletions(-)
diff --git a/udp.c b/udp.c
index 7281bc3..64c9219 100644
--- a/udp.c
+++ b/udp.c
@@ -880,51 +880,39 @@ static void udp_tap_send_passt(const struct ctx *c, struct mmsghdr *mmh, int n)
}
/**
- * udp_sock_handler() - Handle new data from socket
+ * udp_tap_send() - Prepare UDP datagrams and send to tap interface
* @c: Execution context
- * @ref: epoll reference
- * @events: epoll events bitmap
+ * @start: Index of first datagram in udp[46]_l2_buf pool
+ * @n: Number of datagrams to send
+ * @dstport: Destination port number
+ * @v6: True if using IPv6
* @now: Current timestamp
*
- * #syscalls recvmmsg
+ * Return: size of tap frame with headers
*/
-void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
- const struct timespec *now)
+static void udp_tap_send(const struct ctx *c,
+ unsigned int start, unsigned int n,
+ in_port_t dstport, bool v6, const struct timespec *now)
{
- in_port_t dstport = ref.r.p.udp.udp.port;
- struct mmsghdr *tap_mmh, *sock_mmh;
int msg_bufs = 0, msg_i = 0;
- ssize_t n, msg_len = 0;
+ struct mmsghdr *tap_mmh;
struct iovec *tap_iov;
+ ssize_t msg_len = 0;
unsigned int i;
- if (events == EPOLLERR)
- return;
-
- if (ref.r.p.udp.udp.splice) {
- udp_sock_handler_splice(c, ref, events, now);
- return;
- }
-
- if (ref.r.p.udp.udp.v6) {
+ if (v6) {
tap_mmh = udp6_l2_mh_tap;
- sock_mmh = udp6_l2_mh_sock;
tap_iov = udp6_l2_iov_tap;
} else {
tap_mmh = udp4_l2_mh_tap;
- sock_mmh = udp4_l2_mh_sock;
tap_iov = udp4_l2_iov_tap;
}
- n = recvmmsg(ref.r.s, sock_mmh, UDP_TAP_FRAMES, 0, NULL);
- if (n <= 0)
- return;
-
- tap_mmh[0].msg_hdr.msg_iov = &tap_iov[0];
- for (i = 0; i < (unsigned)n; i++) {
+ tap_mmh[0].msg_hdr.msg_iov = &tap_iov[start];
+ for (i = start; i < start + n; i++) {
size_t buf_len;
- if (ref.r.p.udp.udp.v6)
+ if (v6)
buf_len = udp_update_hdr6(c, i, dstport, now);
else
buf_len = udp_update_hdr4(c, i, dstport, now);
@@ -950,6 +938,43 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
udp_tap_send_passt(c, tap_mmh, msg_i + 1);
}
+/**
+ * udp_sock_handler() - Handle new data from socket
+ * @c: Execution context
+ * @ref: epoll reference
+ * @events: epoll events bitmap
+ * @now: Current timestamp
+ *
+ * #syscalls recvmmsg
+ */
+void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
+ const struct timespec *now)
+{
+ in_port_t dstport = ref.r.p.udp.udp.port;
+ bool v6 = ref.r.p.udp.udp.v6;
+ struct mmsghdr *sock_mmh;
+ ssize_t n;
+
+ if (events == EPOLLERR)
+ return;
+
+ if (ref.r.p.udp.udp.splice) {
+ udp_sock_handler_splice(c, ref, events, now);
+ return;
+ }
+
+ if (ref.r.p.udp.udp.v6)
+ sock_mmh = udp6_l2_mh_sock;
+ else
+ sock_mmh = udp4_l2_mh_sock;
+
+ n = recvmmsg(ref.r.s, sock_mmh, UDP_TAP_FRAMES, 0, NULL);
+ if (n <= 0)
+ return;
+
+ udp_tap_send(c, 0, n, dstport, v6, now);
+}
+
/**
* udp_tap_handler() - Handle packets from tap
* @c: Execution context
--
@@ -880,51 +880,39 @@ static void udp_tap_send_passt(const struct ctx *c, struct mmsghdr *mmh, int n)
}
/**
- * udp_sock_handler() - Handle new data from socket
+ * udp_tap_send() - Prepare UDP datagrams and send to tap interface
* @c: Execution context
- * @ref: epoll reference
- * @events: epoll events bitmap
+ * @start: Index of first datagram in udp[46]_l2_buf pool
+ * @n: Number of datagrams to send
+ * @dstport: Destination port number
+ * @v6: True if using IPv6
* @now: Current timestamp
*
- * #syscalls recvmmsg
+ * Return: size of tap frame with headers
*/
-void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
- const struct timespec *now)
+static void udp_tap_send(const struct ctx *c,
+ unsigned int start, unsigned int n,
+ in_port_t dstport, bool v6, const struct timespec *now)
{
- in_port_t dstport = ref.r.p.udp.udp.port;
- struct mmsghdr *tap_mmh, *sock_mmh;
int msg_bufs = 0, msg_i = 0;
- ssize_t n, msg_len = 0;
+ struct mmsghdr *tap_mmh;
struct iovec *tap_iov;
+ ssize_t msg_len = 0;
unsigned int i;
- if (events == EPOLLERR)
- return;
-
- if (ref.r.p.udp.udp.splice) {
- udp_sock_handler_splice(c, ref, events, now);
- return;
- }
-
- if (ref.r.p.udp.udp.v6) {
+ if (v6) {
tap_mmh = udp6_l2_mh_tap;
- sock_mmh = udp6_l2_mh_sock;
tap_iov = udp6_l2_iov_tap;
} else {
tap_mmh = udp4_l2_mh_tap;
- sock_mmh = udp4_l2_mh_sock;
tap_iov = udp4_l2_iov_tap;
}
- n = recvmmsg(ref.r.s, sock_mmh, UDP_TAP_FRAMES, 0, NULL);
- if (n <= 0)
- return;
-
- tap_mmh[0].msg_hdr.msg_iov = &tap_iov[0];
- for (i = 0; i < (unsigned)n; i++) {
+ tap_mmh[0].msg_hdr.msg_iov = &tap_iov[start];
+ for (i = start; i < start + n; i++) {
size_t buf_len;
- if (ref.r.p.udp.udp.v6)
+ if (v6)
buf_len = udp_update_hdr6(c, i, dstport, now);
else
buf_len = udp_update_hdr4(c, i, dstport, now);
@@ -950,6 +938,43 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
udp_tap_send_passt(c, tap_mmh, msg_i + 1);
}
+/**
+ * udp_sock_handler() - Handle new data from socket
+ * @c: Execution context
+ * @ref: epoll reference
+ * @events: epoll events bitmap
+ * @now: Current timestamp
+ *
+ * #syscalls recvmmsg
+ */
+void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
+ const struct timespec *now)
+{
+ in_port_t dstport = ref.r.p.udp.udp.port;
+ bool v6 = ref.r.p.udp.udp.v6;
+ struct mmsghdr *sock_mmh;
+ ssize_t n;
+
+ if (events == EPOLLERR)
+ return;
+
+ if (ref.r.p.udp.udp.splice) {
+ udp_sock_handler_splice(c, ref, events, now);
+ return;
+ }
+
+ if (ref.r.p.udp.udp.v6)
+ sock_mmh = udp6_l2_mh_sock;
+ else
+ sock_mmh = udp4_l2_mh_sock;
+
+ n = recvmmsg(ref.r.s, sock_mmh, UDP_TAP_FRAMES, 0, NULL);
+ if (n <= 0)
+ return;
+
+ udp_tap_send(c, 0, n, dstport, v6, now);
+}
+
/**
* udp_tap_handler() - Handle packets from tap
* @c: Execution context
--
2.39.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v4 4/8] udp: Don't handle tap receive batch size calculation within a #define
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
` (2 preceding siblings ...)
2023-01-05 4:26 ` [PATCH v4 3/8] udp: Split receive from preparation and send in udp_sock_handler() David Gibson
@ 2023-01-05 4:26 ` David Gibson
2023-01-05 4:26 ` [PATCH v4 5/8] udp: Pre-populate msg_names with local address David Gibson
` (4 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
UDP_MAX_FRAMES gives the maximum number of datagrams we'll ever handle as a
batch for sizing our buffers and control structures. The subtly different
UDP_TAP_FRAMES gives the maximum number of datagrams we'll actually try to
receive at once for tap packets in the current configuration.
This depends on the mode, meaning that the macro has a non-obvious
dependency on the usual 'c' context variable being available. We only use
it in one place, so it makes more sense to open code this. Add an
explanatory comment while we're there.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/udp.c b/udp.c
index 64c9219..6951c4c 100644
--- a/udp.c
+++ b/udp.c
@@ -119,7 +119,6 @@
#define UDP_CONN_TIMEOUT 180 /* s, timeout for ephemeral or local bind */
#define UDP_MAX_FRAMES 32 /* max # of frames to receive at once */
-#define UDP_TAP_FRAMES (c->mode == MODE_PASST ? UDP_MAX_FRAMES : 1)
/**
* struct udp_tap_port - Port tracking based on tap-facing source port
@@ -950,10 +949,14 @@ static void udp_tap_send(const struct ctx *c,
void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
const struct timespec *now)
{
+ /* For not entirely clear reasons (data locality?) pasta gets
+ * better throughput if we receive the datagrams one at a
+ * time.
+ */
+ ssize_t n = (c->mode == MODE_PASST ? UDP_MAX_FRAMES : 1);
in_port_t dstport = ref.r.p.udp.udp.port;
bool v6 = ref.r.p.udp.udp.v6;
struct mmsghdr *sock_mmh;
- ssize_t n;
if (events == EPOLLERR)
return;
@@ -968,7 +971,7 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
else
sock_mmh = udp4_l2_mh_sock;
- n = recvmmsg(ref.r.s, sock_mmh, UDP_TAP_FRAMES, 0, NULL);
+ n = recvmmsg(ref.r.s, sock_mmh, n, 0, NULL);
if (n <= 0)
return;
--
@@ -119,7 +119,6 @@
#define UDP_CONN_TIMEOUT 180 /* s, timeout for ephemeral or local bind */
#define UDP_MAX_FRAMES 32 /* max # of frames to receive at once */
-#define UDP_TAP_FRAMES (c->mode == MODE_PASST ? UDP_MAX_FRAMES : 1)
/**
* struct udp_tap_port - Port tracking based on tap-facing source port
@@ -950,10 +949,14 @@ static void udp_tap_send(const struct ctx *c,
void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
const struct timespec *now)
{
+ /* For not entirely clear reasons (data locality?) pasta gets
+ * better throughput if we receive the datagrams one at a
+ * time.
+ */
+ ssize_t n = (c->mode == MODE_PASST ? UDP_MAX_FRAMES : 1);
in_port_t dstport = ref.r.p.udp.udp.port;
bool v6 = ref.r.p.udp.udp.v6;
struct mmsghdr *sock_mmh;
- ssize_t n;
if (events == EPOLLERR)
return;
@@ -968,7 +971,7 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
else
sock_mmh = udp4_l2_mh_sock;
- n = recvmmsg(ref.r.s, sock_mmh, UDP_TAP_FRAMES, 0, NULL);
+ n = recvmmsg(ref.r.s, sock_mmh, n, 0, NULL);
if (n <= 0)
return;
--
2.39.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v4 5/8] udp: Pre-populate msg_names with local address
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
` (3 preceding siblings ...)
2023-01-05 4:26 ` [PATCH v4 4/8] udp: Don't handle tap receive batch size calculation within a #define David Gibson
@ 2023-01-05 4:26 ` David Gibson
2023-01-05 4:26 ` [PATCH v4 6/8] udp: Unify udp_sock_handler_splice() with udp_sock_handler() David Gibson
` (3 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
udp_splice_namebuf is now used only for spliced sending, and so it is
only ever populated with the localhost address, either IPv4 or IPv6.
So, replace the awkward initialization in udp_sock_handler_splice()
with statically initialized versions for IPv4 and IPv6. We then just
need to update the port number in udp_sock_handler_splice().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 40 ++++++++++++++++++----------------------
util.h | 7 +++++++
2 files changed, 25 insertions(+), 22 deletions(-)
diff --git a/udp.c b/udp.c
index 6951c4c..f6c07a5 100644
--- a/udp.c
+++ b/udp.c
@@ -232,11 +232,18 @@ static struct mmsghdr udp4_l2_mh_tap [UDP_MAX_FRAMES];
static struct mmsghdr udp6_l2_mh_tap [UDP_MAX_FRAMES];
/* recvmmsg()/sendmmsg() data for "spliced" connections */
-static struct sockaddr_storage udp_splice_namebuf;
-
static struct iovec udp4_iov_splice [UDP_MAX_FRAMES];
static struct iovec udp6_iov_splice [UDP_MAX_FRAMES];
+static struct sockaddr_in udp4_localname = {
+ .sin_family = AF_INET,
+ .sin_addr = IN4ADDR_LOOPBACK_INIT,
+};
+static struct sockaddr_in6 udp6_localname = {
+ .sin6_family = AF_INET6,
+ .sin6_addr = IN6ADDR_LOOPBACK_INIT,
+};
+
static struct mmsghdr udp4_mh_splice [UDP_MAX_FRAMES];
static struct mmsghdr udp6_mh_splice [UDP_MAX_FRAMES];
@@ -610,23 +617,10 @@ static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
if (n <= 0)
return;
- if (v6) {
- *((struct sockaddr_in6 *)&udp_splice_namebuf) =
- ((struct sockaddr_in6) {
- .sin6_family = AF_INET6,
- .sin6_addr = IN6ADDR_LOOPBACK_INIT,
- .sin6_port = htons(dst),
- .sin6_scope_id = 0,
- });
- } else {
- *((struct sockaddr_in *)&udp_splice_namebuf) =
- ((struct sockaddr_in) {
- .sin_family = AF_INET,
- .sin_addr = { .s_addr = htonl(INADDR_LOOPBACK) },
- .sin_port = htons(dst),
- .sin_zero = { 0 },
- });
- }
+ if (v6)
+ udp6_localname.sin6_port = htons(dst);
+ else
+ udp4_localname.sin_port = htons(dst);
for (i = 0; i < n; i += m) {
in_port_t src = sa_port(v6, mmh_recv[i].msg_hdr.msg_name);
@@ -1256,9 +1250,11 @@ static void udp_splice_iov_init(void)
struct msghdr *mh4 = &udp4_mh_splice[i].msg_hdr;
struct msghdr *mh6 = &udp6_mh_splice[i].msg_hdr;
- mh4->msg_name = mh6->msg_name = &udp_splice_namebuf;
- mh4->msg_namelen = sizeof(udp_splice_namebuf);
- mh6->msg_namelen = sizeof(udp_splice_namebuf);
+ mh4->msg_name = &udp4_localname;
+ mh4->msg_namelen = sizeof(udp4_localname);
+
+ mh6->msg_name = &udp6_localname;
+ mh6->msg_namelen = sizeof(udp6_localname);
udp4_iov_splice[i].iov_base = udp4_l2_buf[i].data;
udp6_iov_splice[i].iov_base = udp6_l2_buf[i].data;
diff --git a/util.h b/util.h
index 02b55a9..3baec91 100644
--- a/util.h
+++ b/util.h
@@ -79,6 +79,13 @@
(IN_MULTICAST(ntohl((a)->s_addr)))
#define IN4_ARE_ADDR_EQUAL(a, b) \
(((struct in_addr *)(a))->s_addr == ((struct in_addr *)b)->s_addr)
+#if __BYTE_ORDER == __BIG_ENDIAN
+#define IN4ADDR_LOOPBACK_INIT \
+ { .s_addr = INADDR_LOOPBACK }
+#else
+#define IN4ADDR_LOOPBACK_INIT \
+ { .s_addr = __bswap_constant_32(INADDR_LOOPBACK) }
+#endif
#define NS_FN_STACK_SIZE (RLIMIT_STACK_VAL * 1024 / 8)
int do_clone(int (*fn)(void *), char *stack_area, size_t stack_size, int flags,
--
@@ -79,6 +79,13 @@
(IN_MULTICAST(ntohl((a)->s_addr)))
#define IN4_ARE_ADDR_EQUAL(a, b) \
(((struct in_addr *)(a))->s_addr == ((struct in_addr *)b)->s_addr)
+#if __BYTE_ORDER == __BIG_ENDIAN
+#define IN4ADDR_LOOPBACK_INIT \
+ { .s_addr = INADDR_LOOPBACK }
+#else
+#define IN4ADDR_LOOPBACK_INIT \
+ { .s_addr = __bswap_constant_32(INADDR_LOOPBACK) }
+#endif
#define NS_FN_STACK_SIZE (RLIMIT_STACK_VAL * 1024 / 8)
int do_clone(int (*fn)(void *), char *stack_area, size_t stack_size, int flags,
--
2.39.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v4 6/8] udp: Unify udp_sock_handler_splice() with udp_sock_handler()
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
` (4 preceding siblings ...)
2023-01-05 4:26 ` [PATCH v4 5/8] udp: Pre-populate msg_names with local address David Gibson
@ 2023-01-05 4:26 ` David Gibson
2023-01-05 4:26 ` [PATCH v4 7/8] udp: Decide whether to "splice" per datagram rather than per socket David Gibson
` (2 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
These two functions now have a very similar structure, and their first
part (calling recvmmsg()) is functionally identical. So, merge the two
functions into one.
This does have the side effect of meaning we no longer receive multiple
packets at once for splice (we already didn't for tap). This does hurt
throughput for small spliced packets, but improves it for large spliced
packets and tap packets.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 94 +++++++++++++++++++++--------------------------------------
1 file changed, 34 insertions(+), 60 deletions(-)
diff --git a/udp.c b/udp.c
index f6c07a5..f160ecc 100644
--- a/udp.c
+++ b/udp.c
@@ -590,52 +590,6 @@ static void udp_splice_sendfrom(const struct ctx *c, unsigned start, unsigned n,
sendmmsg(s, mmh_send + start, n, MSG_NOSIGNAL);
}
-/**
- * udp_sock_handler_splice() - Handler for socket mapped to "spliced" connection
- * @c: Execution context
- * @ref: epoll reference
- * @events: epoll events bitmap
- * @now: Current timestamp
- */
-static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
- uint32_t events, const struct timespec *now)
-{
- in_port_t dst = ref.r.p.udp.udp.port;
- int v6 = ref.r.p.udp.udp.v6, n, i, m;
- struct mmsghdr *mmh_recv;
-
- if (!(events & EPOLLIN))
- return;
-
- if (v6)
- mmh_recv = udp6_l2_mh_sock;
- else
- mmh_recv = udp4_l2_mh_sock;
-
- n = recvmmsg(ref.r.s, mmh_recv, UDP_MAX_FRAMES, 0, NULL);
-
- if (n <= 0)
- return;
-
- if (v6)
- udp6_localname.sin6_port = htons(dst);
- else
- udp4_localname.sin_port = htons(dst);
-
- for (i = 0; i < n; i += m) {
- in_port_t src = sa_port(v6, mmh_recv[i].msg_hdr.msg_name);
-
- for (m = 1; i + m < n; m++) {
- void *mname = mmh_recv[i + m].msg_hdr.msg_name;
- if (sa_port(v6, mname) != src)
- break;
- }
-
- udp_splice_sendfrom(c, i, m, src, dst, v6, ref.r.p.udp.udp.ns,
- ref.r.p.udp.udp.orig, now);
- }
-}
-
/**
* udp_update_hdr4() - Update headers for one IPv4 datagram
* @c: Execution context
@@ -944,32 +898,52 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
const struct timespec *now)
{
/* For not entirely clear reasons (data locality?) pasta gets
- * better throughput if we receive the datagrams one at a
- * time.
+ * better throughput if we receive tap datagrams one at a
+ * atime. For small splice datagrams throughput is slightly
+ * better if we do batch, but it's slightly worse for large
+ * splice datagrams. Since we don't know before we receive
+ * whether we'll use tap or splice, always go one at a time
+ * for pasta mode.
*/
ssize_t n = (c->mode == MODE_PASST ? UDP_MAX_FRAMES : 1);
in_port_t dstport = ref.r.p.udp.udp.port;
bool v6 = ref.r.p.udp.udp.v6;
- struct mmsghdr *sock_mmh;
+ struct mmsghdr *mmh_recv;
+ unsigned int i, m;
- if (events == EPOLLERR)
+ if (!(events & EPOLLIN))
return;
- if (ref.r.p.udp.udp.splice) {
- udp_sock_handler_splice(c, ref, events, now);
- return;
+ if (v6) {
+ mmh_recv = udp6_l2_mh_sock;
+ udp6_localname.sin6_port = htons(dstport);
+ } else {
+ mmh_recv = udp4_l2_mh_sock;
+ udp4_localname.sin_port = htons(dstport);
}
- if (ref.r.p.udp.udp.v6)
- sock_mmh = udp6_l2_mh_sock;
- else
- sock_mmh = udp4_l2_mh_sock;
-
- n = recvmmsg(ref.r.s, sock_mmh, n, 0, NULL);
+ n = recvmmsg(ref.r.s, mmh_recv, n, 0, NULL);
if (n <= 0)
return;
- udp_tap_send(c, 0, n, dstport, v6, now);
+ if (!ref.r.p.udp.udp.splice) {
+ udp_tap_send(c, 0, n, dstport, v6, now);
+ return;
+ }
+
+ for (i = 0; i < n; i += m) {
+ in_port_t src = sa_port(v6, mmh_recv[i].msg_hdr.msg_name);
+
+ for (m = 1; i + m < n; m++) {
+ void *mname = mmh_recv[i + m].msg_hdr.msg_name;
+ if (sa_port(v6, mname) != src)
+ break;
+ }
+
+ udp_splice_sendfrom(c, i, m, src, dstport, v6,
+ ref.r.p.udp.udp.ns, ref.r.p.udp.udp.orig,
+ now);
+ }
}
/**
--
@@ -590,52 +590,6 @@ static void udp_splice_sendfrom(const struct ctx *c, unsigned start, unsigned n,
sendmmsg(s, mmh_send + start, n, MSG_NOSIGNAL);
}
-/**
- * udp_sock_handler_splice() - Handler for socket mapped to "spliced" connection
- * @c: Execution context
- * @ref: epoll reference
- * @events: epoll events bitmap
- * @now: Current timestamp
- */
-static void udp_sock_handler_splice(const struct ctx *c, union epoll_ref ref,
- uint32_t events, const struct timespec *now)
-{
- in_port_t dst = ref.r.p.udp.udp.port;
- int v6 = ref.r.p.udp.udp.v6, n, i, m;
- struct mmsghdr *mmh_recv;
-
- if (!(events & EPOLLIN))
- return;
-
- if (v6)
- mmh_recv = udp6_l2_mh_sock;
- else
- mmh_recv = udp4_l2_mh_sock;
-
- n = recvmmsg(ref.r.s, mmh_recv, UDP_MAX_FRAMES, 0, NULL);
-
- if (n <= 0)
- return;
-
- if (v6)
- udp6_localname.sin6_port = htons(dst);
- else
- udp4_localname.sin_port = htons(dst);
-
- for (i = 0; i < n; i += m) {
- in_port_t src = sa_port(v6, mmh_recv[i].msg_hdr.msg_name);
-
- for (m = 1; i + m < n; m++) {
- void *mname = mmh_recv[i + m].msg_hdr.msg_name;
- if (sa_port(v6, mname) != src)
- break;
- }
-
- udp_splice_sendfrom(c, i, m, src, dst, v6, ref.r.p.udp.udp.ns,
- ref.r.p.udp.udp.orig, now);
- }
-}
-
/**
* udp_update_hdr4() - Update headers for one IPv4 datagram
* @c: Execution context
@@ -944,32 +898,52 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
const struct timespec *now)
{
/* For not entirely clear reasons (data locality?) pasta gets
- * better throughput if we receive the datagrams one at a
- * time.
+ * better throughput if we receive tap datagrams one at a
+ * atime. For small splice datagrams throughput is slightly
+ * better if we do batch, but it's slightly worse for large
+ * splice datagrams. Since we don't know before we receive
+ * whether we'll use tap or splice, always go one at a time
+ * for pasta mode.
*/
ssize_t n = (c->mode == MODE_PASST ? UDP_MAX_FRAMES : 1);
in_port_t dstport = ref.r.p.udp.udp.port;
bool v6 = ref.r.p.udp.udp.v6;
- struct mmsghdr *sock_mmh;
+ struct mmsghdr *mmh_recv;
+ unsigned int i, m;
- if (events == EPOLLERR)
+ if (!(events & EPOLLIN))
return;
- if (ref.r.p.udp.udp.splice) {
- udp_sock_handler_splice(c, ref, events, now);
- return;
+ if (v6) {
+ mmh_recv = udp6_l2_mh_sock;
+ udp6_localname.sin6_port = htons(dstport);
+ } else {
+ mmh_recv = udp4_l2_mh_sock;
+ udp4_localname.sin_port = htons(dstport);
}
- if (ref.r.p.udp.udp.v6)
- sock_mmh = udp6_l2_mh_sock;
- else
- sock_mmh = udp4_l2_mh_sock;
-
- n = recvmmsg(ref.r.s, sock_mmh, n, 0, NULL);
+ n = recvmmsg(ref.r.s, mmh_recv, n, 0, NULL);
if (n <= 0)
return;
- udp_tap_send(c, 0, n, dstport, v6, now);
+ if (!ref.r.p.udp.udp.splice) {
+ udp_tap_send(c, 0, n, dstport, v6, now);
+ return;
+ }
+
+ for (i = 0; i < n; i += m) {
+ in_port_t src = sa_port(v6, mmh_recv[i].msg_hdr.msg_name);
+
+ for (m = 1; i + m < n; m++) {
+ void *mname = mmh_recv[i + m].msg_hdr.msg_name;
+ if (sa_port(v6, mname) != src)
+ break;
+ }
+
+ udp_splice_sendfrom(c, i, m, src, dstport, v6,
+ ref.r.p.udp.udp.ns, ref.r.p.udp.udp.orig,
+ now);
+ }
}
/**
--
2.39.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v4 7/8] udp: Decide whether to "splice" per datagram rather than per socket
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
` (5 preceding siblings ...)
2023-01-05 4:26 ` [PATCH v4 6/8] udp: Unify udp_sock_handler_splice() with udp_sock_handler() David Gibson
@ 2023-01-05 4:26 ` David Gibson
2023-01-05 4:26 ` [PATCH v4 8/8] udp: Don't use separate sockets to listen for spliced packets David Gibson
2023-01-05 21:50 ` [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications Stefano Brivio
8 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
Currently we have special sockets for receiving datagrams from locahost
which can use the optimized "splice" path rather than going across the tap
interface.
We want to loosen this so that sockets can receive sockets that will be
forwarded by both the spliced and non-spliced paths. To do this, we alter
the meaning of the @splice bit in the reference to mean that packets
receieved on this socket *can* be spliced, not that they *will* be spliced.
They'll only actually be spliced if they come from 127.0.0.1 or ::1.
We can't (for now) remove the splice bit entirely, unlike with TCP. Our
gateway mapping means that if the ns initiates communication to the gw
address, we'll translate that to target 127.0.0.1 on the host side. Reply
packets will therefore have source address 127.0.0.1 when received on the
host, but these need to go via the tap path where that will be translated
back to the gateway address. We need the @splice bit to distinguish that
case from packets going from localhost to a port mapped explicitly with
-u which should be spliced.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 52 +++++++++++++++++++++++++++++++++-------------------
udp.h | 2 +-
2 files changed, 34 insertions(+), 20 deletions(-)
diff --git a/udp.c b/udp.c
index f160ecc..3f216ed 100644
--- a/udp.c
+++ b/udp.c
@@ -513,16 +513,25 @@ static int udp_splice_new_ns(void *arg)
}
/**
- * sa_port() - Determine port from a sockaddr_in or sockaddr_in6
+ * udp_mmh_splice_port() - Is source address of message suitable for splicing?
* @v6: Is @sa a sockaddr_in6 (otherwise sockaddr_in)?
- * @sa: Pointer to either sockaddr_in or sockaddr_in6
+ * @mmh: mmsghdr of incoming message
+ *
+ * Return: if @sa refers to localhost (127.0.0.1 or ::1) the port from
+ * @sa in host order, otherwise -1.
*/
-static in_port_t sa_port(bool v6, const void *sa)
+static int udp_mmh_splice_port(bool v6, const struct mmsghdr *mmh)
{
- const struct sockaddr_in6 *sa6 = sa;
- const struct sockaddr_in *sa4 = sa;
+ const struct sockaddr_in6 *sa6 = mmh->msg_hdr.msg_name;
+ const struct sockaddr_in *sa4 = mmh->msg_hdr.msg_name;
+
+ if (v6 && IN6_IS_ADDR_LOOPBACK(&sa6->sin6_addr))
+ return ntohs(sa6->sin6_port);
+
+ if (!v6 && IN4_IS_ADDR_LOOPBACK(&sa4->sin_addr))
+ return ntohs(sa4->sin_port);
- return v6 ? ntohs(sa6->sin6_port) : ntohs(sa4->sin_port);
+ return -1;
}
/**
@@ -926,23 +935,28 @@ void udp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events,
if (n <= 0)
return;
- if (!ref.r.p.udp.udp.splice) {
- udp_tap_send(c, 0, n, dstport, v6, now);
- return;
- }
-
for (i = 0; i < n; i += m) {
- in_port_t src = sa_port(v6, mmh_recv[i].msg_hdr.msg_name);
+ int splicefrom = -1;
+ m = n;
+
+ if (ref.r.p.udp.udp.splice) {
+ splicefrom = udp_mmh_splice_port(v6, mmh_recv + i);
+
+ for (m = 1; i + m < n; m++) {
+ int p;
- for (m = 1; i + m < n; m++) {
- void *mname = mmh_recv[i + m].msg_hdr.msg_name;
- if (sa_port(v6, mname) != src)
- break;
+ p = udp_mmh_splice_port(v6, mmh_recv + i + m);
+ if (p != splicefrom)
+ break;
+ }
}
- udp_splice_sendfrom(c, i, m, src, dstport, v6,
- ref.r.p.udp.udp.ns, ref.r.p.udp.udp.orig,
- now);
+ if (splicefrom >= 0)
+ udp_splice_sendfrom(c, i, m, splicefrom, dstport,
+ v6, ref.r.p.udp.udp.ns,
+ ref.r.p.udp.udp.orig, now);
+ else
+ udp_tap_send(c, i, m, dstport, v6, now);
}
}
diff --git a/udp.h b/udp.h
index 053991e..2a03335 100644
--- a/udp.h
+++ b/udp.h
@@ -22,7 +22,7 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s,
/**
* union udp_epoll_ref - epoll reference portion for TCP connections
* @bound: Set if this file descriptor is a bound socket
- * @splice: Set if descriptor is associated to "spliced" connection
+ * @splice: Set if descriptor packets to be "spliced"
* @orig: Set if a spliced socket which can originate "connections"
* @ns: Set if this is a socket in the pasta network namespace
* @v6: Set for IPv6 sockets or connections
--
@@ -22,7 +22,7 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s,
/**
* union udp_epoll_ref - epoll reference portion for TCP connections
* @bound: Set if this file descriptor is a bound socket
- * @splice: Set if descriptor is associated to "spliced" connection
+ * @splice: Set if descriptor packets to be "spliced"
* @orig: Set if a spliced socket which can originate "connections"
* @ns: Set if this is a socket in the pasta network namespace
* @v6: Set for IPv6 sockets or connections
--
2.39.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v4 8/8] udp: Don't use separate sockets to listen for spliced packets
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
` (6 preceding siblings ...)
2023-01-05 4:26 ` [PATCH v4 7/8] udp: Decide whether to "splice" per datagram rather than per socket David Gibson
@ 2023-01-05 4:26 ` David Gibson
2023-01-05 21:50 ` [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications Stefano Brivio
8 siblings, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-05 4:26 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: David Gibson
Currently, when ports are forwarded inbound in pasta mode, we open two
sockets for incoming traffic: one listens on the public IP address and will
forward packets to the tuntap interface. The other listens on localhost
and forwards via "splicing" (resending directly via sockets in the ns).
Now that we've improved the logic about whether we "splice" any individual
packet, we don't need this. Instead we can have a single socket bound to
0.0.0.0 or ::, marked as able to splice and udp_sock_handler() will deal
with each packet as appropriate.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 53 +++++++++++++----------------------------------------
1 file changed, 13 insertions(+), 40 deletions(-)
diff --git a/udp.c b/udp.c
index 3f216ed..dbfdb61 100644
--- a/udp.c
+++ b/udp.c
@@ -1124,7 +1124,6 @@ void udp_sock_init(const struct ctx *c, int ns, sa_family_t af,
const void *addr, const char *ifname, in_port_t port)
{
union udp_epoll_ref uref = { .u32 = 0 };
- const void *bind_addr;
int s;
if (ns) {
@@ -1136,67 +1135,41 @@ void udp_sock_init(const struct ctx *c, int ns, sa_family_t af,
}
if ((af == AF_INET || af == AF_UNSPEC) && c->ifi4) {
- if (!addr && c->mode == MODE_PASTA)
- bind_addr = &c->ip4.addr;
- else
- bind_addr = addr;
-
uref.udp.v6 = 0;
+ uref.udp.splice = (c->mode == MODE_PASTA);
+ uref.udp.orig = true;
if (!ns) {
- uref.udp.splice = 0;
- s = sock_l4(c, AF_INET, IPPROTO_UDP, bind_addr, ifname,
+ s = sock_l4(c, AF_INET, IPPROTO_UDP, addr, ifname,
port, uref.u32);
udp_tap_map[V4][uref.udp.port].sock = s;
-
- if (c->mode == MODE_PASTA) {
- bind_addr = &(uint32_t){ htonl(INADDR_LOOPBACK) };
- uref.udp.splice = uref.udp.orig = true;
-
- s = sock_l4(c, AF_INET, IPPROTO_UDP, bind_addr,
- ifname, port, uref.u32);
- udp_splice_init[V4][port].sock = s;
- }
+ udp_splice_init[V4][port].sock = s;
} else {
- uref.udp.splice = uref.udp.orig = uref.udp.ns = true;
-
- bind_addr = &(uint32_t){ htonl(INADDR_LOOPBACK) };
+ struct in_addr loopback = { htonl(INADDR_LOOPBACK) };
+ uref.udp.ns = true;
- s = sock_l4(c, AF_INET, IPPROTO_UDP, bind_addr,
+ s = sock_l4(c, AF_INET, IPPROTO_UDP, &loopback,
ifname, port, uref.u32);
udp_splice_ns[V4][port].sock = s;
}
}
if ((af == AF_INET6 || af == AF_UNSPEC) && c->ifi6) {
- if (!addr && c->mode == MODE_PASTA)
- bind_addr = &c->ip6.addr;
- else
- bind_addr = addr;
-
uref.udp.v6 = 1;
+ uref.udp.splice = (c->mode == MODE_PASTA);
+ uref.udp.orig = true;
if (!ns) {
- uref.udp.splice = 0;
- s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, ifname,
+ s = sock_l4(c, AF_INET6, IPPROTO_UDP, addr, ifname,
port, uref.u32);
udp_tap_map[V6][uref.udp.port].sock = s;
-
- if (c->mode == MODE_PASTA) {
- bind_addr = &in6addr_loopback;
- uref.udp.splice = uref.udp.orig = true;
-
- s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr,
- ifname, port, uref.u32);
- udp_splice_init[V6][port].sock = s;
- }
+ udp_splice_init[V6][port].sock = s;
} else {
- bind_addr = &in6addr_loopback;
- uref.udp.splice = uref.udp.orig = uref.udp.ns = true;
+ uref.udp.ns = true;
- s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr,
+ s = sock_l4(c, AF_INET6, IPPROTO_UDP, &in6addr_loopback,
ifname, port, uref.u32);
udp_splice_ns[V6][port].sock = s;
}
--
@@ -1124,7 +1124,6 @@ void udp_sock_init(const struct ctx *c, int ns, sa_family_t af,
const void *addr, const char *ifname, in_port_t port)
{
union udp_epoll_ref uref = { .u32 = 0 };
- const void *bind_addr;
int s;
if (ns) {
@@ -1136,67 +1135,41 @@ void udp_sock_init(const struct ctx *c, int ns, sa_family_t af,
}
if ((af == AF_INET || af == AF_UNSPEC) && c->ifi4) {
- if (!addr && c->mode == MODE_PASTA)
- bind_addr = &c->ip4.addr;
- else
- bind_addr = addr;
-
uref.udp.v6 = 0;
+ uref.udp.splice = (c->mode == MODE_PASTA);
+ uref.udp.orig = true;
if (!ns) {
- uref.udp.splice = 0;
- s = sock_l4(c, AF_INET, IPPROTO_UDP, bind_addr, ifname,
+ s = sock_l4(c, AF_INET, IPPROTO_UDP, addr, ifname,
port, uref.u32);
udp_tap_map[V4][uref.udp.port].sock = s;
-
- if (c->mode == MODE_PASTA) {
- bind_addr = &(uint32_t){ htonl(INADDR_LOOPBACK) };
- uref.udp.splice = uref.udp.orig = true;
-
- s = sock_l4(c, AF_INET, IPPROTO_UDP, bind_addr,
- ifname, port, uref.u32);
- udp_splice_init[V4][port].sock = s;
- }
+ udp_splice_init[V4][port].sock = s;
} else {
- uref.udp.splice = uref.udp.orig = uref.udp.ns = true;
-
- bind_addr = &(uint32_t){ htonl(INADDR_LOOPBACK) };
+ struct in_addr loopback = { htonl(INADDR_LOOPBACK) };
+ uref.udp.ns = true;
- s = sock_l4(c, AF_INET, IPPROTO_UDP, bind_addr,
+ s = sock_l4(c, AF_INET, IPPROTO_UDP, &loopback,
ifname, port, uref.u32);
udp_splice_ns[V4][port].sock = s;
}
}
if ((af == AF_INET6 || af == AF_UNSPEC) && c->ifi6) {
- if (!addr && c->mode == MODE_PASTA)
- bind_addr = &c->ip6.addr;
- else
- bind_addr = addr;
-
uref.udp.v6 = 1;
+ uref.udp.splice = (c->mode == MODE_PASTA);
+ uref.udp.orig = true;
if (!ns) {
- uref.udp.splice = 0;
- s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, ifname,
+ s = sock_l4(c, AF_INET6, IPPROTO_UDP, addr, ifname,
port, uref.u32);
udp_tap_map[V6][uref.udp.port].sock = s;
-
- if (c->mode == MODE_PASTA) {
- bind_addr = &in6addr_loopback;
- uref.udp.splice = uref.udp.orig = true;
-
- s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr,
- ifname, port, uref.u32);
- udp_splice_init[V6][port].sock = s;
- }
+ udp_splice_init[V6][port].sock = s;
} else {
- bind_addr = &in6addr_loopback;
- uref.udp.splice = uref.udp.orig = uref.udp.ns = true;
+ uref.udp.ns = true;
- s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr,
+ s = sock_l4(c, AF_INET6, IPPROTO_UDP, &in6addr_loopback,
ifname, port, uref.u32);
udp_splice_ns[V6][port].sock = s;
}
--
2.39.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
` (7 preceding siblings ...)
2023-01-05 4:26 ` [PATCH v4 8/8] udp: Don't use separate sockets to listen for spliced packets David Gibson
@ 2023-01-05 21:50 ` Stefano Brivio
2023-01-06 0:59 ` David Gibson
2023-01-13 0:07 ` Stefano Brivio
8 siblings, 2 replies; 12+ messages in thread
From: Stefano Brivio @ 2023-01-05 21:50 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev
On Thu, 5 Jan 2023 15:26:17 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> At present, the UDP "splice" and "tap" paths are quite separate. We
> have separate sockets to receive packets bound for the tap and splice
> paths. This leads to some code duplication, and extra open sockets.
>
> This series partially unifies the two paths, allowing us to use a
> single (host side) socket, bound to 0.0.0.0 or :: to receive packets
> for both cases.
>
> Changes since v3:
> * Fixed really dumb compile error, and actually ran through the tests.
> Oops.
> Changes since v2:
> * Don't receive multiple packets at once for pasta mode - seems to
> hurt throughput on balance.
> * Add some comments clarifying reasoning here.
> Changes since v1:
> * Renamed udp_localname[46] to udp[46]_localname
> * Allow handling of UDP port 0
> * Fix a bug which could misidentify certain v6 packets as v4-spliceable
> * Some minor cosmetic fixes to code and commit messages
Thanks, this looks good to me now, and I'm trying to run the tests
before applying it, but for some reason they get invariably stuck at
perf/passt_udp in the host to guest throughput test.
I think it has nothing to do with this series, and it's rather related
to the "new" suspected virtio-net TX timeout issue we started hitting a
while ago, I still need to play with kernel version/workarounds etc.
I'll keep you posted.
--
Stefano
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications
2023-01-05 21:50 ` [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications Stefano Brivio
@ 2023-01-06 0:59 ` David Gibson
2023-01-13 0:07 ` Stefano Brivio
1 sibling, 0 replies; 12+ messages in thread
From: David Gibson @ 2023-01-06 0:59 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev
[-- Attachment #1: Type: text/plain, Size: 1929 bytes --]
On Thu, Jan 05, 2023 at 10:50:03PM +0100, Stefano Brivio wrote:
> On Thu, 5 Jan 2023 15:26:17 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > At present, the UDP "splice" and "tap" paths are quite separate. We
> > have separate sockets to receive packets bound for the tap and splice
> > paths. This leads to some code duplication, and extra open sockets.
> >
> > This series partially unifies the two paths, allowing us to use a
> > single (host side) socket, bound to 0.0.0.0 or :: to receive packets
> > for both cases.
> >
> > Changes since v3:
> > * Fixed really dumb compile error, and actually ran through the tests.
> > Oops.
> > Changes since v2:
> > * Don't receive multiple packets at once for pasta mode - seems to
> > hurt throughput on balance.
> > * Add some comments clarifying reasoning here.
> > Changes since v1:
> > * Renamed udp_localname[46] to udp[46]_localname
> > * Allow handling of UDP port 0
> > * Fix a bug which could misidentify certain v6 packets as v4-spliceable
> > * Some minor cosmetic fixes to code and commit messages
>
> Thanks, this looks good to me now, and I'm trying to run the tests
> before applying it, but for some reason they get invariably stuck at
> perf/passt_udp in the host to guest throughput test.
>
> I think it has nothing to do with this series, and it's rather related
> to the "new" suspected virtio-net TX timeout issue we started hitting a
> while ago, I still need to play with kernel version/workarounds etc.
> I'll keep you posted.
Ah, right. I think I hit a similar issue with tx timeouts showing
up. I think I just retried and it managed to get through on the 3rd
or fourth attempt :/.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications
2023-01-05 21:50 ` [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications Stefano Brivio
2023-01-06 0:59 ` David Gibson
@ 2023-01-13 0:07 ` Stefano Brivio
1 sibling, 0 replies; 12+ messages in thread
From: Stefano Brivio @ 2023-01-13 0:07 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev, Laurent Vivier
On Thu, 5 Jan 2023 22:50:03 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:
> On Thu, 5 Jan 2023 15:26:17 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > At present, the UDP "splice" and "tap" paths are quite separate. We
> > have separate sockets to receive packets bound for the tap and splice
> > paths. This leads to some code duplication, and extra open sockets.
> >
> > This series partially unifies the two paths, allowing us to use a
> > single (host side) socket, bound to 0.0.0.0 or :: to receive packets
> > for both cases.
> >
> > Changes since v3:
> > * Fixed really dumb compile error, and actually ran through the tests.
> > Oops.
> > Changes since v2:
> > * Don't receive multiple packets at once for pasta mode - seems to
> > hurt throughput on balance.
> > * Add some comments clarifying reasoning here.
> > Changes since v1:
> > * Renamed udp_localname[46] to udp[46]_localname
> > * Allow handling of UDP port 0
> > * Fix a bug which could misidentify certain v6 packets as v4-spliceable
> > * Some minor cosmetic fixes to code and commit messages
>
> Thanks, this looks good to me now, and I'm trying to run the tests
> before applying it, but for some reason they get invariably stuck at
> perf/passt_udp in the host to guest throughput test.
>
> I think it has nothing to do with this series, and it's rather related
> to the "new" suspected virtio-net TX timeout issue we started hitting a
> while ago, I still need to play with kernel version/workarounds etc.
> I'll keep you posted.
Finally, tests consistently pass with a workaround Laurent found, that
is, reverting kernel commit a7766ef18b33 ("virtio_net: disable cb
aggressively") from the kernel I'm using for guests.
I just pushed this, I'm testing the other pending series now.
--
Stefano
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2023-01-13 0:07 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-05 4:26 [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
2023-01-05 4:26 ` [PATCH v4 1/8] udp: Move sending pasta tap frames to the end of udp_sock_handler() David Gibson
2023-01-05 4:26 ` [PATCH v4 2/8] udp: Split sending to passt tap interface into separate function David Gibson
2023-01-05 4:26 ` [PATCH v4 3/8] udp: Split receive from preparation and send in udp_sock_handler() David Gibson
2023-01-05 4:26 ` [PATCH v4 4/8] udp: Don't handle tap receive batch size calculation within a #define David Gibson
2023-01-05 4:26 ` [PATCH v4 5/8] udp: Pre-populate msg_names with local address David Gibson
2023-01-05 4:26 ` [PATCH v4 6/8] udp: Unify udp_sock_handler_splice() with udp_sock_handler() David Gibson
2023-01-05 4:26 ` [PATCH v4 7/8] udp: Decide whether to "splice" per datagram rather than per socket David Gibson
2023-01-05 4:26 ` [PATCH v4 8/8] udp: Don't use separate sockets to listen for spliced packets David Gibson
2023-01-05 21:50 ` [PATCH v4 0/8] Don't use additional sockets for receiving "spliced" UDP communications Stefano Brivio
2023-01-06 0:59 ` David Gibson
2023-01-13 0:07 ` Stefano Brivio
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).