public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH 0/8] Clean ups and speed ups to benchmarks
@ 2023-11-06  7:08 David Gibson
  2023-11-06  7:08 ` [PATCH 1/8] test/perf: Remove stale iperf3c/iperf3s directives David Gibson
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

Our standard "make check" includes a number of benchmarks, which take
quite a long time to run.  This series improves how we run them,
reducing wasted time and cutting the full run time by some 10-12
minutes.

David Gibson (8):
  test/perf: Remove stale iperf3c/iperf3s directives
  test/perf: Get iperf3 stats from client side
  test/perf: Start iperf3 server less often
  test/perf: Small MTUs for spliced TCP aren't interesting
  test/perf: Explicitly control UDP packet length, instead of MTU
  test/perf: "MTU" changes in passt_tcp host to guest aren't useful
  test/perf: Remove unnecessary --pacing-timer options
  test/perf: Simplify calculation of "omit" time for TCP throughput

 .gitignore          |   2 +-
 test/lib/test       |  83 ++++++++++++++++-----------
 test/perf/passt_tcp |  84 +++++++++++++--------------
 test/perf/passt_udp |  86 +++++++++++++---------------
 test/perf/pasta_tcp | 125 ++++++++++++++++------------------------
 test/perf/pasta_udp | 136 ++++++++++++++++++++++++++------------------
 6 files changed, 260 insertions(+), 256 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/8] test/perf: Remove stale iperf3c/iperf3s directives
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
@ 2023-11-06  7:08 ` David Gibson
  2023-11-06  7:08 ` [PATCH 2/8] test/perf: Get iperf3 stats from client side David Gibson
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

Some older revisions used separate iperf3c and iperf3s test directives to
invoke the iperf3 client and server.  Those were combined into a single
iperf3 directive some time ago, but a couple of places still have the old
syntax.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/perf/pasta_tcp | 3 +--
 test/perf/pasta_udp | 4 ----
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp
index 4b13384..9e9dc37 100644
--- a/test/perf/pasta_tcp
+++ b/test/perf/pasta_tcp
@@ -43,8 +43,7 @@ ns	ip link set dev lo mtu 1500
 iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 ns	ip link set dev lo mtu 4000
-iperf3c	ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
-iperf3s	BW host 100${i}3 __THREADS__
+iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 ns	ip link set dev lo mtu 16384
 iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
diff --git a/test/perf/pasta_udp b/test/perf/pasta_udp
index 7007b6f..3de73a0 100644
--- a/test/perf/pasta_udp
+++ b/test/perf/pasta_udp
@@ -84,8 +84,6 @@ tr	UDP throughput over IPv6: host to ns
 bw	-
 bw	-
 bw	-
-#iperf3c	host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
-#iperf3s	BW ns 100${i}2 __THREADS__
 iperf3	BW host ns ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
 bw	__BW__ 7.0 9.0
 
@@ -103,8 +101,6 @@ tr	UDP throughput over IPv4: host to ns
 bw	-
 bw	-
 bw	-
-#iperf3c	host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
-#iperf3s	BW ns 100${i}2 __THREADS__
 iperf3	BW host ns 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
 bw	__BW__ 7.0 9.0
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/8] test/perf: Get iperf3 stats from client side
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
  2023-11-06  7:08 ` [PATCH 1/8] test/perf: Remove stale iperf3c/iperf3s directives David Gibson
@ 2023-11-06  7:08 ` David Gibson
  2023-11-06  7:08 ` [PATCH 3/8] test/perf: Start iperf3 server less often David Gibson
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

iperf3 generates statistics about its run on both the client and server
sides.  They don't have exactly the same information, but both have the
pieces we need (AFAICT the server communicates some information to the
client over the control socket, so the most important information is in the
client side output, even if measured by the server).

Currently we use the server side information for our measurements. Using
the client side information has several advantages though:

 * We can directly wait for the client to complete and we know we'll have
   the output we want.  We don't need to sleep to give the server time to
   write out the results.
 * That in turn means we can wrap up as soon as the client is done; we
   don't need to wait overlong to make sure everything is finished.
 * The slightly different organisation of the data in the client output
   means that we always want the same json value, rather than requiring
   slightly different ones for UDP and TCP.

The fact that we avoid some extra delays speeds up the overall run of the
perf tests by around 7 minutes (out of around 35 minutes) on my laptop.

The fact that we no longer unconditionally kill client and server after
a certain time means that the client could run indefinitely if the server
doesn't respond.  We mitigate that by setting a 1s connect timeout on the
client.  This isn't foolproof - if we get an initial response, but then
lose connectivity, this could still run indefinitely; however, it does cover
by far the most likely failure cases.  --snd-timeout would provide more
robustness, but I've hit odd failures when trying to use it.
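
As a standalone sketch of the client-side aggregation (the filenames and
the jq filter mirror what test/lib/test ends up doing; the JSON content
here is mocked for illustration, real files come from
'iperf3 -J -c ... > c${i}.json'):

```shell
# Mock two per-client result files, then aggregate them the way the
# harness does: slurp all c*.json, extract the end-of-run receive
# rate from each, and sum across the parallel client processes.
dir=$(mktemp -d)
cd "${dir}"
printf '{"end": {"sum_received": {"bits_per_second": 1000000000}}}\n' > c0.json
printf '{"end": {"sum_received": {"bits_per_second": 1500000000}}}\n' > c1.json
__jval=".end.sum_received.bits_per_second"
__bw=$(cat c*.json | jq -rMs "map(${__jval}) | add")
echo "${__bw}"	# total bits per second across all clients
```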

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 .gitignore    |  2 +-
 test/lib/test | 32 ++++++++++++++------------------
 2 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/.gitignore b/.gitignore
index d3d0e2c..d1c8be9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -6,5 +6,5 @@
 /qrap
 /pasta.1
 /seccomp.h
-/s*.json
+/c*.json
 README.plain.md
diff --git a/test/lib/test b/test/lib/test
index 115dd21..3ca5dbc 100755
--- a/test/lib/test
+++ b/test/lib/test
@@ -31,41 +31,37 @@ test_iperf3() {
 	__procs="$((${1} - 1))"; shift
 	__time="${1}"; shift
 
-	pane_or_context_run "${__sctx}" 'rm -f s*.json'
+	pane_or_context_run "${__cctx}" 'rm -f c*.json'
 
 	pane_or_context_run_bg "${__sctx}" 				\
 		 'for i in $(seq 0 '${__procs}'); do'			\
-		 '	(iperf3 -s1J -p'${__port}' -i'${__time}		\
-		 '	 > s${i}.json) &'				\
-		 '	echo $! > s${i}.pid &'				\
+		 '	(iperf3 -s1 -p'${__port}' -i'${__time}') &'	\
+		 '	echo $! > s${i}.pid; '				\
 		 'done'							\
 
 	sleep 1		# Wait for server to be ready
 
-	pane_or_context_run_bg "${__cctx}" 				\
+        # A 1s wait for connection on what's basically a local link
+        # indicates something is pretty wrong
+        __timeout=1000
+	pane_or_context_run "${__cctx}" 				\
 		 '('							\
 		 '	for i in $(seq 0 '${__procs}'); do'		\
-		 '		iperf3 -c '${__dest}' -p '${__port}	\
-		 '		 -t'${__time}' -i0 -T s${i} '"${@}"' &' \
+		 '		iperf3 -J -c '${__dest}' -p '${__port}	\
+		 '		 --connect-timeout '${__timeout}	\
+		 '		 -t'${__time}' -i0 -T c${i} '"${@}"	\
+                 ' 		> c${i}.json &'				\
 		 '	done;'						\
 		 '	wait'						\
 		 ')'
 
-	sleep $((__time + 5))
-
-	# If client fails to deliver control message, tell server we're done
+	# Kill the server, just in case -1 didn't work right
 	pane_or_context_run "${__sctx}" 'kill -INT $(cat s*.pid); rm s*.pid'
 
-	sleep 1		# ...and wait for output to be flushed
-
 	__jval=".end.sum_received.bits_per_second"
-	for __opt in ${@}; do
-		# UDP test
-		[ "${__opt}" = "-u" ] && __jval=".intervals[0].sum.bits_per_second"
-	done
 
-	__bw=$(pane_or_context_output "${__sctx}"			\
-		 'cat s*.json | jq -rMs "map('${__jval}') | add"')
+	__bw=$(pane_or_context_output "${__cctx}"			\
+		 'cat c*.json | jq -rMs "map('${__jval}') | add"')
 
 	TEST_ONE_subs="$(list_add_pair "${TEST_ONE_subs}" "__${__var}__" "${__bw}" )"
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/8] test/perf: Start iperf3 server less often
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
  2023-11-06  7:08 ` [PATCH 1/8] test/perf: Remove stale iperf3c/iperf3s directives David Gibson
  2023-11-06  7:08 ` [PATCH 2/8] test/perf: Get iperf3 stats from client side David Gibson
@ 2023-11-06  7:08 ` David Gibson
  2023-11-06  7:08 ` [PATCH 4/8] test/perf: Small MTUs for spliced TCP aren't interesting David Gibson
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

Currently we start both the iperf3 server(s) and client(s) afresh each time
we want to make a bandwidth measurement.  That's not really necessary as
usually a whole batch of bandwidth measurements can use the same server.

Split up the iperf3 directive into 3 directives: iperf3s to start the
server, iperf3 to make a measurement and iperf3k to kill the server, so
that we can start the server less often.  This - and more importantly, the
reduced number of waits for the server to be ready - reduces runtime of the
performance tests on my laptop by about 4 minutes (out of ~28 minutes).

For now we still restart the server between IPv4 and IPv6 tests.  That's
because in some cases the latency measurements we make in between use the
same ports.
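
The start/measure/kill lifecycle the new directives implement can be
sketched as below; 'sleep' stands in for 'iperf3 -s' so the pattern is
runnable without iperf3, but the pid-file handling matches the
iperf3s/iperf3k helpers:

```shell
# Start a batch of long-lived background servers, record their pids,
# run any number of measurements against them, then kill them once.
dir=$(mktemp -d)
cd "${dir}"
__procs=2				# one server process per thread
for i in $(seq 0 $((__procs - 1))); do
	sleep 60 &			# placeholder for: iperf3 -s -p "${__port}"
	echo $! > "s${i}.pid"
done
# ...a whole batch of iperf3 client runs would go here...
kill -INT $(cat s*.pid) && rm s*.pid	# iperf3k equivalent
echo "servers stopped"
```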

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/lib/test       | 57 ++++++++++++++++++++++-----------
 test/perf/passt_tcp | 59 ++++++++++++++++++++--------------
 test/perf/passt_udp | 57 ++++++++++++++++++++-------------
 test/perf/pasta_tcp | 77 ++++++++++++++++++++++++++++++---------------
 test/perf/pasta_udp | 72 +++++++++++++++++++++++++++++-------------
 5 files changed, 213 insertions(+), 109 deletions(-)

diff --git a/test/lib/test b/test/lib/test
index 3ca5dbc..1d571c3 100755
--- a/test/lib/test
+++ b/test/lib/test
@@ -13,19 +13,45 @@
 # Copyright (c) 2021 Red Hat GmbH
 # Author: Stefano Brivio <sbrivio@redhat.com>
 
+# test_iperf3s() - Start iperf3 server
+# $1:	Destination/server context
+# $2:	Port number, ${i} is translated to process index
+# $3:	Number of processes to run in parallel
+test_iperf3s() {
+	__sctx="${1}"
+	__port="${2}"
+	__procs="$((${3} - 1))"
+
+	pane_or_context_run_bg "${__sctx}" 				\
+		 'for i in $(seq 0 '${__procs}'); do'			\
+		 '	iperf3 -s -p'${__port}' &'			\
+		 '	echo $! > s${i}.pid; '				\
+		 'done'							\
+
+	sleep 1		# Wait for server to be ready
+}
+
+# test_iperf3k() - Kill iperf3 server
+# $1:	Destination/server context
+test_iperf3k() {
+	__sctx="${1}"
+
+	pane_or_context_run "${__sctx}" 'kill -INT $(cat s*.pid); rm s*.pid'
+
+	sleep 3		# Wait for kernel to free up ports
+}
+
 # test_iperf3() - Ugly helper for iperf3 directive
 # $1:	Variable name: to put the measure bandwidth into
 # $2:	Source/client context
-# $3:	Destination/server context
-# $4:	Destination name or address for client
-# $5:	Port number, ${i} is translated to process index
-# $6:	Number of processes to run in parallel
-# $7:	Run time, in seconds
+# $3:	Destination name or address for client
+# $4:	Port number, ${i} is translated to process index
+# $5:	Number of processes to run in parallel
+# $6:	Run time, in seconds
 # $@:	Client options
 test_iperf3() {
 	__var="${1}"; shift
 	__cctx="${1}"; shift
-	__sctx="${1}"; shift
 	__dest="${1}"; shift
 	__port="${1}"; shift
 	__procs="$((${1} - 1))"; shift
@@ -33,14 +59,6 @@ test_iperf3() {
 
 	pane_or_context_run "${__cctx}" 'rm -f c*.json'
 
-	pane_or_context_run_bg "${__sctx}" 				\
-		 'for i in $(seq 0 '${__procs}'); do'			\
-		 '	(iperf3 -s1 -p'${__port}' -i'${__time}') &'	\
-		 '	echo $! > s${i}.pid; '				\
-		 'done'							\
-
-	sleep 1		# Wait for server to be ready
-
         # A 1s wait for connection on what's basically a local link
         # indicates something is pretty wrong
         __timeout=1000
@@ -55,17 +73,12 @@ test_iperf3() {
 		 '	wait'						\
 		 ')'
 
-	# Kill the server, just in case -1 didn't work right
-	pane_or_context_run "${__sctx}" 'kill -INT $(cat s*.pid); rm s*.pid'
-
 	__jval=".end.sum_received.bits_per_second"
 
 	__bw=$(pane_or_context_output "${__cctx}"			\
 		 'cat c*.json | jq -rMs "map('${__jval}') | add"')
 
 	TEST_ONE_subs="$(list_add_pair "${TEST_ONE_subs}" "__${__var}__" "${__bw}" )"
-
-	sleep 3		# Wait for kernel to free up ports
 }
 
 test_one_line() {
@@ -283,6 +296,12 @@ test_one_line() {
 	"lat")
 		table_value_latency ${__arg} || TEST_ONE_perf_nok=1
 		;;
+	"iperf3s")
+		test_iperf3s ${__arg}
+                ;;
+	"iperf3k")
+		test_iperf3k ${__arg}
+                ;;
 	"iperf3")
 		test_iperf3 ${__arg}
 		;;
diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp
index 7046f3c..9363922 100644
--- a/test/perf/passt_tcp
+++ b/test/perf/passt_tcp
@@ -50,22 +50,25 @@ th	MTU 256B 576B 1280B 1500B 9000B 65520B
 
 
 tr	TCP throughput over IPv6: guest to host
+iperf3s	ns 100${i}2 __THREADS__
+
 bw	-
 bw	-
-
 guest	ip link set dev __IFNAME__ mtu 1280
-iperf3	BW guest ns __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 4M
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 4M
 bw	__BW__ 1.2 1.5
 guest	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW guest ns __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 4M
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 4M
 bw	__BW__ 1.6 1.8
 guest	ip link set dev __IFNAME__ mtu 9000
-iperf3	BW guest ns __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 8M
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 8M
 bw	__BW__ 4.0 5.0
 guest	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW guest ns __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 16M
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 16M
 bw	__BW__ 7.0 8.0
 
+iperf3k	ns
+
 tl	TCP RR latency over IPv6: guest to host
 lat	-
 lat	-
@@ -86,27 +89,30 @@ nsb	tcp_crr --nolog -6
 gout	LAT tcp_crr --nolog -6 -c -H __GW6__%__IFNAME__ | sed -n 's/^throughput=\(.*\)/\1/p'
 lat	__LAT__ 500 400
 
-
 tr	TCP throughput over IPv4: guest to host
+iperf3s	ns 100${i}2 __THREADS__
+
 guest	ip link set dev __IFNAME__ mtu 256
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 1M
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 1M
 bw	__BW__ 0.2 0.3
 guest	ip link set dev __IFNAME__ mtu 576
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 1M
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 1M
 bw	__BW__ 0.5 0.8
 guest	ip link set dev __IFNAME__ mtu 1280
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 4M
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 4M
 bw	__BW__ 1.2 1.5
 guest	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 4M
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 4M
 bw	__BW__ 1.6 1.8
 guest	ip link set dev __IFNAME__ mtu 9000
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 8M
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 8M
 bw	__BW__ 4.0 5.0
 guest	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 16M
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -w 16M
 bw	__BW__ 7.0 8.0
 
+iperf3k	ns
+
 tl	TCP RR latency over IPv4: guest to host
 lat	-
 lat	-
@@ -127,24 +133,27 @@ nsb	tcp_crr --nolog -4
 gout	LAT tcp_crr --nolog -4 -c -H __GW__ | sed -n 's/^throughput=\(.*\)/\1/p'
 lat	__LAT__ 500 400
 
-
 tr	TCP throughput over IPv6: host to guest
+iperf3s	guest 100${i}1 __THREADS__
+
 bw	-
 bw	-
 ns	ip link set dev lo mtu 1280
-iperf3	BW ns guest ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 1.0 1.2
 ns	ip link set dev lo mtu 1500
-iperf3	BW ns guest ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 2.0 3.0
 ns	ip link set dev lo mtu 9000
-iperf3	BW ns guest ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 5.0 6.0
 ns	ip link set dev lo mtu 65520
-iperf3	BW ns guest ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 6.0 6.8
 ns	ip link set dev lo mtu 65535
 
+iperf3k	guest
+
 tl	TCP RR latency over IPv6: host to guest
 lat	-
 lat	-
@@ -169,27 +178,31 @@ lat	__LAT__ 500 350
 
 
 tr	TCP throughput over IPv4: host to guest
+iperf3s	guest 100${i}1 __THREADS__
+
 ns	ip link set dev lo mtu 256
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 0.3 0.5
 ns	ip link set dev lo mtu 576
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 0.5 1.0
 ns	ip link set dev lo mtu 1280
 ns	ip addr add ::1 dev lo
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 2.0 3.0
 ns	ip link set dev lo mtu 1500
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 2.0 3.0
 ns	ip link set dev lo mtu 9000
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 5.0 6.0
 ns	ip link set dev lo mtu 65520
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 6.0 6.8
 ns	ip link set dev lo mtu 65535
 
+iperf3k	guest
+
 tl	TCP RR latency over IPv4: host to guest
 lat	-
 lat	-
diff --git a/test/perf/passt_udp b/test/perf/passt_udp
index a117b6a..12d8fbb 100644
--- a/test/perf/passt_udp
+++ b/test/perf/passt_udp
@@ -41,23 +41,26 @@ report	passt udp __THREADS__ __FREQ__
 
 th	MTU 256B 576B 1280B 1500B 9000B 65520B
 
-
 tr	UDP throughput over IPv6: guest to host
+iperf3s	ns 100${i}2 __THREADS__
+
 bw	-
 bw	-
 guest	ip link set dev __IFNAME__ mtu 1280
-iperf3	BW guest ns __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G
 bw	__BW__ 0.8 1.2
 guest	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW guest ns __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 1.0 1.5
 guest	ip link set dev __IFNAME__ mtu 9000
-iperf3	BW guest ns __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 5G
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 5G
 bw	__BW__ 4.0 5.0
 guest	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW guest ns __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 7G
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 7G
 bw	__BW__ 4.0 5.0
 
+iperf3k	ns
+
 tl	UDP RR latency over IPv6: guest to host
 lat	-
 lat	-
@@ -70,25 +73,29 @@ lat	__LAT__ 200 150
 
 
 tr	UDP throughput over IPv4: guest to host
+iperf3s	ns 100${i}2 __THREADS__
+
 guest	ip link set dev __IFNAME__ mtu 256
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 500M
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 500M
 bw	__BW__ 0.0 0.0
 guest	ip link set dev __IFNAME__ mtu 576
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 1G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 1G
 bw	__BW__ 0.4 0.6
 guest	ip link set dev __IFNAME__ mtu 1280
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G
 bw	__BW__ 0.8 1.2
 guest	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 1.0 1.5
 guest	ip link set dev __IFNAME__ mtu 9000
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 6G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 6G
 bw	__BW__ 4.0 5.0
 guest	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW guest ns __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 7G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 7G
 bw	__BW__ 4.0 5.0
 
+iperf3k	ns
+
 tl	UDP RR latency over IPv4: guest to host
 lat	-
 lat	-
@@ -101,21 +108,25 @@ lat	__LAT__ 200 150
 
 
 tr	UDP throughput over IPv6: host to guest
+iperf3s	guest 100${i}1 __THREADS__
+
 bw	-
 bw	-
 ns	ip link set dev lo mtu 1280
-iperf3	BW ns guest  ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 2G
 bw	__BW__ 0.8 1.2
 ns	ip link set dev lo mtu 1500
-iperf3	BW ns guest ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 2G
 bw	__BW__ 1.0 1.5
 ns	ip link set dev lo mtu 9000
-iperf3	BW ns guest ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 3.0 4.0
 ns	ip link set dev lo mtu 65520
-iperf3	BW ns guest ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 3.0 4.0
 
+iperf3k	guest
+
 tl	UDP RR latency over IPv6: host to guest
 lat	-
 lat	-
@@ -130,26 +141,30 @@ ns	ip link set dev lo mtu 65535
 
 
 tr	UDP throughput over IPv4: host to guest
+iperf3s	guest 100${i}1 __THREADS__
+
 ns	ip link set dev lo mtu 256
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 1G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 1G
 bw	__BW__ 0.0 0.0
 ns	ip link set dev lo mtu 576
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 1G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 1G
 bw	__BW__ 0.4 0.6
 ns	ip link set dev lo mtu 1280
 ns	ip addr add ::1 dev lo
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 0.8 1.2
 ns	ip link set dev lo mtu 1500
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 1.0 1.5
 ns	ip link set dev lo mtu 9000
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 3.0 4.0
 ns	ip link set dev lo mtu 65520
-iperf3	BW ns guest 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 3.0 4.0
 
+iperf3k	guest
+
 tl	UDP RR latency over IPv4: host to guest
 lat	-
 lat	-
diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp
index 9e9dc37..a8938c3 100644
--- a/test/perf/pasta_tcp
+++ b/test/perf/pasta_tcp
@@ -37,21 +37,24 @@ report	pasta lo_tcp __THREADS__ __FREQ__
 
 th	MTU 1500B 4000B 16384B 65535B
 
-
 tr	TCP throughput over IPv6: ns to host
+iperf3s	host 100${i}3 __THREADS__
+
 ns	ip link set dev lo mtu 1500
-iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 ns	ip link set dev lo mtu 4000
-iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 ns	ip link set dev lo mtu 16384
-iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 ns	ip link set dev lo mtu 65535
-iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 
+iperf3k	host
+
 tl	TCP RR latency over IPv6: ns to host
 lat	-
 lat	-
@@ -72,19 +75,23 @@ lat	__LAT__ 500 350
 
 
 tr	TCP throughput over IPv4: ns to host
+iperf3s	host 100${i}3 __THREADS__
+
 ns	ip link set dev lo mtu 1500
-iperf3	BW ns host 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 ns	ip link set dev lo mtu 4000
-iperf3	BW ns host 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 ns	ip link set dev lo mtu 16384
-iperf3	BW ns host 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 ns	ip link set dev lo mtu 65535
-iperf3	BW ns host 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 
+iperf3k	host
+
 tl	TCP RR latency over IPv4: ns to host
 lat	-
 lat	-
@@ -103,14 +110,17 @@ nsout	LAT tcp_crr --nolog -P 10003 -C 10013 -4 -c -H 127.0.0.1 | sed -n 's/^thro
 hostw
 lat	__LAT__ 500 350
 
-
 tr	TCP throughput over IPv6: host to ns
+iperf3s	ns 100${i}2 __THREADS__
+
 bw	-
 bw	-
 bw	-
-iperf3	BW host ns ::1 100${i}2 __THREADS__ __TIME__ __OPTS__
+iperf3	BW host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 
+iperf3k	ns
+
 tl	TCP RR latency over IPv6: host to ns
 lat	-
 lat	-
@@ -131,12 +141,16 @@ lat	__LAT__ 1000 700
 
 
 tr	TCP throughput over IPv4: host to ns
+iperf3s	ns 100${i}2 __THREADS__
+
 bw	-
 bw	-
 bw	-
-iperf3	BW host ns 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__
+iperf3	BW host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 
+iperf3k	ns
+
 tl	TCP RR latency over IPv4: host to ns
 lat	-
 lat	-
@@ -158,7 +172,6 @@ lat	__LAT__ 1000 700
 
 te
 
-
 test	pasta: throughput and latency (connections via tap)
 
 nsout	GW ip -j -4 route show|jq -rM '.[] | select(.dst == "default").gateway'
@@ -173,21 +186,24 @@ report	pasta tap_tcp __THREADS__ __FREQ__
 
 th	MTU 1500B 4000B 16384B 65520B
 
-
 tr	TCP throughput over IPv6: ns to host
+iperf3s	host 100${i}3 __THREADS__
+
 ns	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW ns host __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 512k
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 512k
 bw	__BW__ 0.2 0.4
 ns	ip link set dev __IFNAME__ mtu 4000
-iperf3	BW ns host __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 1M
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 1M
 bw	__BW__ 0.3 0.5
 ns	ip link set dev __IFNAME__ mtu 16384
-iperf3	BW ns host __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 8M
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 8M
 bw	__BW__ 1.5 2.0
 ns	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW ns host __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 8M
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 8M
 bw	__BW__ 2.0 2.5
 
+iperf3k	host
+
 tl	TCP RR latency over IPv6: ns to host
 lat	-
 lat	-
@@ -208,19 +224,23 @@ lat	__LAT__ 1500 500
 
 
 tr	TCP throughput over IPv4: ns to host
+iperf3s	host 100${i}3 __THREADS__
+
 ns	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW ns host __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 512k
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 512k
 bw	__BW__ 0.2 0.4
 ns	ip link set dev __IFNAME__ mtu 4000
-iperf3s	BW ns host __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 1M
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 1M
 bw	__BW__ 0.3 0.5
 ns	ip link set dev __IFNAME__ mtu 16384
-iperf3	BW ns host __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 8M
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 8M
 bw	__BW__ 1.5 2.0
 ns	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW ns host __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 8M
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -w 8M
 bw	__BW__ 2.0 2.5
 
+iperf3k	host
+
 tl	TCP RR latency over IPv4: ns to host
 lat	-
 lat	-
@@ -239,16 +259,19 @@ nsout	LAT tcp_crr --nolog -P 10003 -C 10013 -4 -c -H __GW__ | sed -n 's/^through
 hostw
 lat	__LAT__ 1500 500
 
-
 tr	TCP throughput over IPv6: host to ns
+iperf3s	ns 100${i}2 __THREADS__
+
 nsout	IFNAME ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname'
 nsout	ADDR6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__IFNAME__").addr_info[] | select(.scope == "global" and .prefixlen == 64).local'
 bw	-
 bw	-
 bw	-
-iperf3	BW host ns __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__
+iperf3	BW host __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 8.0 10.0
 
+iperf3k	ns
+
 tl	TCP RR latency over IPv6: host to ns
 lat	-
 lat	-
@@ -270,13 +293,17 @@ lat	__LAT__ 5000 10000
 
 
 tr	TCP throughput over IPv4: host to ns
+iperf3s	ns 100${i}2 __THREADS__
+
 nsout	ADDR ip -j -4 addr show|jq -rM '.[] | select(.ifname == "__IFNAME__").addr_info[0].local'
 bw	-
 bw	-
 bw	-
-iperf3	BW host ns __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__
+iperf3	BW host __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 8.0 10.0
 
+iperf3k	ns
+
 tl	TCP RR latency over IPv4: host to ns
 lat	-
 lat	-
diff --git a/test/perf/pasta_udp b/test/perf/pasta_udp
index 3de73a0..0628bd9 100644
--- a/test/perf/pasta_udp
+++ b/test/perf/pasta_udp
@@ -33,19 +33,23 @@ th	MTU 1500B 4000B 16384B 65535B
 
 
 tr	UDP throughput over IPv6: ns to host
+iperf3s	host 100${i}3 __THREADS__
+
 ns	ip link set dev lo mtu 1500
-iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 1.0 1.5
 ns	ip link set dev lo mtu 4000
-iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 1.2 1.8
 ns	ip link set dev lo mtu 16384
-iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 10G
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 10G
 bw	__BW__ 5.0 6.0
 ns	ip link set dev lo mtu 65535
-iperf3	BW ns host ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 15G
 bw	__BW__ 7.0 9.0
 
+iperf3k	host
+
 tl	UDP RR latency over IPv6: ns to host
 lat	-
 lat	-
@@ -57,19 +61,23 @@ lat	__LAT__ 200 150
 
 
 tr	UDP throughput over IPv4: ns to host
+iperf3s	host 100${i}3 __THREADS__
+
 ns	ip link set dev lo mtu 1500
-iperf3	BW ns host 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 1.0 1.5
 ns	ip link set dev lo mtu 4000
-iperf3	BW ns host 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 1.2 1.8
 ns	ip link set dev lo mtu 16384
-iperf3	BW ns host 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 10G
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 10G
 bw	__BW__ 5.0 6.0
 ns	ip link set dev lo mtu 65535
-iperf3	BW ns host 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 15G
 bw	__BW__ 7.0 9.0
 
+iperf3k	host
+
 tl	UDP RR latency over IPv4: ns to host
 lat	-
 lat	-
@@ -81,12 +89,16 @@ lat	__LAT__ 200 150
 
 
 tr	UDP throughput over IPv6: host to ns
+iperf3s	ns 100${i}2 __THREADS__
+
 bw	-
 bw	-
 bw	-
-iperf3	BW host ns ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
 bw	__BW__ 7.0 9.0
 
+iperf3k	ns
+
 tl	UDP RR latency over IPv6: host to ns
 lat	-
 lat	-
@@ -98,12 +110,15 @@ lat	__LAT__ 200 150
 
 
 tr	UDP throughput over IPv4: host to ns
+iperf3s	ns 100${i}2 __THREADS__
 bw	-
 bw	-
 bw	-
-iperf3	BW host ns 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
 bw	__BW__ 7.0 9.0
 
+iperf3k	ns
+
 tl	UDP RR latency over IPv4: host to ns
 lat	-
 lat	-
@@ -129,19 +144,23 @@ report	pasta tap_udp 1 __FREQ__
 th	MTU 1500B 4000B 16384B 65520B
 
 tr	UDP throughput over IPv6: ns to host
+iperf3s	host 100${i}3 __THREADS__
+
 ns	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW ns host __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 2G
 bw	__BW__ 0.3 0.5
 ns	ip link set dev __IFNAME__ mtu 4000
-iperf3	BW ns host __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 0.5 0.8
 ns	ip link set dev __IFNAME__ mtu 16384
-iperf3	BW ns host __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 4G
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 4G
 bw	__BW__ 3.0 4.0
 ns	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW ns host __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 6G
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 6G
 bw	__BW__ 6.0 7.0
 
+iperf3k	host
+
 tl	UDP RR latency over IPv6: ns to host
 lat	-
 lat	-
@@ -153,19 +172,23 @@ lat	__LAT__ 200 150
 
 
 tr	UDP throughput over IPv4: ns to host
+iperf3s	host 100${i}3 __THREADS__
+
 ns	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW ns host __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 2G
 bw	__BW__ 0.3 0.5
 ns	ip link set dev __IFNAME__ mtu 4000
-iperf3	BW ns host __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
 bw	__BW__ 0.5 0.8
 ns	ip link set dev __IFNAME__ mtu 16384
-iperf3	BW ns host __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 4G
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 4G
 bw	__BW__ 3.0 4.0
 ns	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW ns host __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 6G
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 6G
 bw	__BW__ 6.0 7.0
 
+iperf3k	host
+
 tl	UDP RR latency over IPv4: ns to host
 lat	-
 lat	-
@@ -175,16 +198,19 @@ nsout	LAT udp_rr --nolog -P 10003 -C 10013 -4 -c -H __GW__ | sed -n 's/^throughp
 hostw
 lat	__LAT__ 200 150
 
-
 tr	UDP throughput over IPv6: host to ns
+iperf3s	ns 100${i}2 __THREADS__
+
 nsout	IFNAME ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname'
 nsout	ADDR6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__IFNAME__").addr_info[] | select(.scope == "global" and .prefixlen == 64).local'
 bw	-
 bw	-
 bw	-
-iperf3	BW host ns __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW host __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
 bw	__BW__ 7.0 9.0
 
+iperf3k	ns
+
 tl	UDP RR latency over IPv6: host to ns
 lat	-
 lat	-
@@ -196,13 +222,17 @@ lat	__LAT__ 200 150
 
 
 tr	UDP throughput over IPv4: host to ns
+iperf3s	ns 100${i}2 __THREADS__
+
 nsout	ADDR ip -j -4 addr show|jq -rM '.[] | select(.ifname == "__IFNAME__").addr_info[0].local'
 bw	-
 bw	-
 bw	-
-iperf3	BW host ns __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW host __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
 bw	__BW__ 7.0 9.0
 
+iperf3k	ns
+
 tl	UDP RR latency over IPv4: host to ns
 lat	-
 lat	-
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/8] test/perf: Small MTUs for spliced TCP aren't interesting
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
                   ` (2 preceding siblings ...)
  2023-11-06  7:08 ` [PATCH 3/8] test/perf: Start iperf3 server less often David Gibson
@ 2023-11-06  7:08 ` David Gibson
  2023-11-06  7:08 ` [PATCH 5/8] test/perf: Explicitly control UDP packet length, instead of MTU David Gibson
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

Currently we make TCP throughput measurements for spliced connections with
a number of different MTU values.  However, the results from this aren't
really interesting.

Unlike with tap connections, spliced connections only involve the loopback
interface on host and container, not a "real" external interface.  lo
typically has an MTU of 65535 and there is very little reason to ever
change that.  So, the measurements for smaller MTUs are rarely going to be
relevant.

In addition, the fact that we can offload all the {de,}packetization to the
kernel with splice(2) means that the throughput difference between these
MTUs isn't very great anyway.

Remove the short MTUs and only show spliced throughput for the normal
65535 byte loopback MTU.  This reduces runtime of the performance tests on
my laptop by about 1 minute (out of ~24 minutes).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/perf/pasta_tcp | 53 +--------------------------------------------
 1 file changed, 1 insertion(+), 52 deletions(-)

diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp
index a8938c3..3a8ad40 100644
--- a/test/perf/pasta_tcp
+++ b/test/perf/pasta_tcp
@@ -35,39 +35,23 @@ hout	FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC
 info	Throughput in Gbps, latency in µs, __THREADS__ threads at __FREQ__ GHz, __STREAMS__ streams each
 report	pasta lo_tcp __THREADS__ __FREQ__
 
-th	MTU 1500B 4000B 16384B 65535B
+th	MTU 65535B
 
 tr	TCP throughput over IPv6: ns to host
 iperf3s	host 100${i}3 __THREADS__
 
-ns	ip link set dev lo mtu 1500
-iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 15.0 20.0
-ns	ip link set dev lo mtu 4000
-iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 15.0 20.0
-ns	ip link set dev lo mtu 16384
-iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 15.0 20.0
-ns	ip link set dev lo mtu 65535
 iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 
 iperf3k	host
 
 tl	TCP RR latency over IPv6: ns to host
-lat	-
-lat	-
-lat	-
 hostb	tcp_rr --nolog -P 10003 -C 10013 -6
 nsout	LAT tcp_rr --nolog -P 10003 -C 10013 -6 -c -H ::1 | sed -n 's/^throughput=\(.*\)/\1/p'
 hostw
 lat	__LAT__ 150 100
 
 tl	TCP CRR latency over IPv6: ns to host
-lat	-
-lat	-
-lat	-
 hostb	tcp_crr --nolog -P 10003 -C 10013 -6
 nsout	LAT tcp_crr --nolog -P 10003 -C 10013 -6 -c -H ::1 | sed -n 's/^throughput=\(.*\)/\1/p'
 hostw
@@ -77,34 +61,18 @@ lat	__LAT__ 500 350
 tr	TCP throughput over IPv4: ns to host
 iperf3s	host 100${i}3 __THREADS__
 
-ns	ip link set dev lo mtu 1500
-iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 15.0 20.0
-ns	ip link set dev lo mtu 4000
-iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 15.0 20.0
-ns	ip link set dev lo mtu 16384
-iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 15.0 20.0
-ns	ip link set dev lo mtu 65535
 iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 
 iperf3k	host
 
 tl	TCP RR latency over IPv4: ns to host
-lat	-
-lat	-
-lat	-
 hostb	tcp_rr --nolog -P 10003 -C 10013 -4
 nsout	LAT tcp_rr --nolog -P 10003 -C 10013 -4 -c -H 127.0.0.1 | sed -n 's/^throughput=\(.*\)/\1/p'
 hostw
 lat	__LAT__ 150 100
 
 tl	TCP CRR latency over IPv4: ns to host
-lat	-
-lat	-
-lat	-
 hostb	tcp_crr --nolog -P 10003 -C 10013 -4
 nsout	LAT tcp_crr --nolog -P 10003 -C 10013 -4 -c -H 127.0.0.1 | sed -n 's/^throughput=\(.*\)/\1/p'
 hostw
@@ -113,27 +81,18 @@ lat	__LAT__ 500 350
 tr	TCP throughput over IPv6: host to ns
 iperf3s	ns 100${i}2 __THREADS__
 
-bw	-
-bw	-
-bw	-
 iperf3	BW host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 
 iperf3k	ns
 
 tl	TCP RR latency over IPv6: host to ns
-lat	-
-lat	-
-lat	-
 nsb	tcp_rr --nolog -P 10002 -C 10012 -6
 hout	LAT tcp_rr --nolog -P 10002 -C 10012 -6 -c -H ::1 | sed -n 's/^throughput=\(.*\)/\1/p'
 nsw
 lat	__LAT__ 150 100
 
 tl	TCP CRR latency over IPv6: host to ns
-lat	-
-lat	-
-lat	-
 nsb	tcp_crr --nolog -P 10002 -C 10012 -6
 hout	LAT tcp_crr --nolog -P 10002 -C 10012 -6 -c -H ::1 | sed -n 's/^throughput=\(.*\)/\1/p'
 nsw
@@ -143,28 +102,18 @@ lat	__LAT__ 1000 700
 tr	TCP throughput over IPv4: host to ns
 iperf3s	ns 100${i}2 __THREADS__
 
-bw	-
-bw	-
-bw	-
 iperf3	BW host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 15.0 20.0
 
 iperf3k	ns
 
 tl	TCP RR latency over IPv4: host to ns
-lat	-
-lat	-
-lat	-
 nsb	tcp_rr --nolog -P 10002 -C 10012 -4
 hout	LAT tcp_rr --nolog -P 10002 -C 10012 -4 -c -H 127.0.0.1 | sed -n 's/^throughput=\(.*\)/\1/p'
 nsw
 lat	__LAT__ 150 100
 
 tl	TCP CRR latency over IPv4: host to ns
-lat	-
-lat	-
-lat	-
-sleep	1
 nsb	tcp_crr --nolog -P 10002 -C 10012 -4
 hout	LAT tcp_crr --nolog -P 10002 -C 10012 -4 -c -H 127.0.0.1 | sed -n 's/^throughput=\(.*\)/\1/p'
 nsw
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/8] test/perf: Explicitly control UDP packet length, instead of MTU
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
                   ` (3 preceding siblings ...)
  2023-11-06  7:08 ` [PATCH 4/8] test/perf: Small MTUs for spliced TCP aren't interesting David Gibson
@ 2023-11-06  7:08 ` David Gibson
  2023-11-06  7:08 ` [PATCH 6/8] test/perf: "MTU" changes in passt_tcp host to guest aren't useful David Gibson
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

Packet size can make a big difference to UDP throughput, so it makes sense
to measure it for a variety of different sizes.  Currently we do this by
adjusting the MTU on the relevant interface before running iperf3.

However, the UDP packet size has no inherent connection to the MTU - it's
controlled by the sender, and the MTU just affects whether the packet will
make it through or be fragmented.  The only reason adjusting the MTU works
is because iperf3 bases its default packet size on the (path) MTU.

We can test this more simply by using the -l option to the iperf3 client
to directly control the packet size, instead of adjusting the MTU.

As well as simplifying things, this lets us test different packet sizes
for host to ns traffic.  We couldn't do that previously, because we
don't have permission to change the MTU on the host.
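
The -l values in the hunks below follow the header arithmetic stated in
the added comments.  A minimal sketch of that calculation (variable
names here are illustrative only, not part of the test framework):

```sh
# On-wire packet size targeted by a table column
pktlen=1500

# IPv4: 20-byte IP header + 8-byte UDP header = 28 bytes of overhead
l4=$((pktlen - 28))	# 1472, passed as "iperf3 ... -l 1472"

# IPv6: 40-byte IP header + 8-byte UDP header = 48 bytes of overhead
l6=$((pktlen - 48))	# 1452, passed as "iperf3 ... -l 1452"

echo "$l4 $l6"
```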

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/perf/passt_udp |  69 +++++++++++-------------------
 test/perf/pasta_udp | 100 ++++++++++++++++++++++----------------------
 2 files changed, 75 insertions(+), 94 deletions(-)

diff --git a/test/perf/passt_udp b/test/perf/passt_udp
index 12d8fbb..10f638f 100644
--- a/test/perf/passt_udp
+++ b/test/perf/passt_udp
@@ -39,24 +39,21 @@ info	Throughput in Gbps, latency in µs, __THREADS__ threads at __FREQ__ GHz, on
 
 report	passt udp __THREADS__ __FREQ__
 
-th	MTU 256B 576B 1280B 1500B 9000B 65520B
+th	pktlen 256B 576B 1280B 1500B 9000B 65520B
 
 tr	UDP throughput over IPv6: guest to host
 iperf3s	ns 100${i}2 __THREADS__
+# (datagram size) = (packet size) - 48: 40 bytes of IPv6 header, 8 of UDP header
 
 bw	-
 bw	-
-guest	ip link set dev __IFNAME__ mtu 1280
-iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G -l 1232
 bw	__BW__ 0.8 1.2
-guest	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G -l 1452
 bw	__BW__ 1.0 1.5
-guest	ip link set dev __IFNAME__ mtu 9000
-iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 5G
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 5G -l 8952
 bw	__BW__ 4.0 5.0
-guest	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 7G
+iperf3	BW guest __GW6__%__IFNAME__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 7G -l 65472
 bw	__BW__ 4.0 5.0
 
 iperf3k	ns
@@ -74,24 +71,19 @@ lat	__LAT__ 200 150
 
 tr	UDP throughput over IPv4: guest to host
 iperf3s	ns 100${i}2 __THREADS__
+# (datagram size) = (packet size) - 28: 20 bytes of IPv4 header, 8 of UDP header
 
-guest	ip link set dev __IFNAME__ mtu 256
-iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 500M
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 500M -l 228
 bw	__BW__ 0.0 0.0
-guest	ip link set dev __IFNAME__ mtu 576
-iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 1G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 1G -l 548
 bw	__BW__ 0.4 0.6
-guest	ip link set dev __IFNAME__ mtu 1280
-iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G -l 1252
 bw	__BW__ 0.8 1.2
-guest	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G -l 1472
 bw	__BW__ 1.0 1.5
-guest	ip link set dev __IFNAME__ mtu 9000
-iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 6G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 6G -l 8972
 bw	__BW__ 4.0 5.0
-guest	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 7G
+iperf3	BW guest __GW__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 7G -l 65492
 bw	__BW__ 4.0 5.0
 
 iperf3k	ns
@@ -109,20 +101,17 @@ lat	__LAT__ 200 150
 
 tr	UDP throughput over IPv6: host to guest
 iperf3s	guest 100${i}1 __THREADS__
+# (datagram size) = (packet size) - 48: 40 bytes of IPv6 header, 8 of UDP header
 
 bw	-
 bw	-
-ns	ip link set dev lo mtu 1280
-iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 2G -l 1232
 bw	__BW__ 0.8 1.2
-ns	ip link set dev lo mtu 1500
-iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 2G -l 1452
 bw	__BW__ 1.0 1.5
-ns	ip link set dev lo mtu 9000
-iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G -l 8952
 bw	__BW__ 3.0 4.0
-ns	ip link set dev lo mtu 65520
-iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G -l 65472
 bw	__BW__ 3.0 4.0
 
 iperf3k	guest
@@ -137,30 +126,23 @@ guestb	udp_rr --nolog -P 10001 -C 10011 -6
 sleep	1
 nsout	LAT udp_rr --nolog -P 10001 -C 10011 -6 -c -H ::1 | sed -n 's/^throughput=\(.*\)/\1/p'
 lat	__LAT__ 200 150
-ns	ip link set dev lo mtu 65535
 
 
 tr	UDP throughput over IPv4: host to guest
 iperf3s	guest 100${i}1 __THREADS__
+# (datagram size) = (packet size) - 28: 20 bytes of IPv4 header, 8 of UDP header
 
-ns	ip link set dev lo mtu 256
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 1G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 1G -l 228
 bw	__BW__ 0.0 0.0
-ns	ip link set dev lo mtu 576
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 1G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 1G -l 548
 bw	__BW__ 0.4 0.6
-ns	ip link set dev lo mtu 1280
-ns	ip addr add ::1 dev lo
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G -l 1252
 bw	__BW__ 0.8 1.2
-ns	ip link set dev lo mtu 1500
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G -l 1472
 bw	__BW__ 1.0 1.5
-ns	ip link set dev lo mtu 9000
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G -l 8972
 bw	__BW__ 3.0 4.0
-ns	ip link set dev lo mtu 65520
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__ -b 3G -l 65492
 bw	__BW__ 3.0 4.0
 
 iperf3k	guest
@@ -175,6 +157,5 @@ guestb	udp_rr --nolog -P 10001 -C 10011 -4
 sleep	1
 nsout	LAT udp_rr --nolog -P 10001 -C 10011 -4 -c -H 127.0.0.1 | sed -n 's/^throughput=\(.*\)/\1/p'
 lat	__LAT__ 200 150
-ns	ip link set dev lo mtu 65535
 
 te
diff --git a/test/perf/pasta_udp b/test/perf/pasta_udp
index 0628bd9..5e3db1e 100644
--- a/test/perf/pasta_udp
+++ b/test/perf/pasta_udp
@@ -29,23 +29,20 @@ info	Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__
 
 report	pasta lo_udp 1 __FREQ__
 
-th	MTU 1500B 4000B 16384B 65535B
+th	pktlen 1500B 4000B 16384B 65535B
 
 
 tr	UDP throughput over IPv6: ns to host
 iperf3s	host 100${i}3 __THREADS__
+# (datagram size) = (packet size) - 48: 40 bytes of IPv6 header, 8 of UDP header
 
-ns	ip link set dev lo mtu 1500
-iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G -l 1452
 bw	__BW__ 1.0 1.5
-ns	ip link set dev lo mtu 4000
-iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G -l 3952
 bw	__BW__ 1.2 1.8
-ns	ip link set dev lo mtu 16384
-iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 10G
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 10G -l 16336
 bw	__BW__ 5.0 6.0
-ns	ip link set dev lo mtu 65535
-iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW ns ::1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 15G -l 65487
 bw	__BW__ 7.0 9.0
 
 iperf3k	host
@@ -62,18 +59,15 @@ lat	__LAT__ 200 150
 
 tr	UDP throughput over IPv4: ns to host
 iperf3s	host 100${i}3 __THREADS__
+# (datagram size) = (packet size) - 28: 20 bytes of IPv4 header, 8 of UDP header
 
-ns	ip link set dev lo mtu 1500
-iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G -l 1472
 bw	__BW__ 1.0 1.5
-ns	ip link set dev lo mtu 4000
-iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G -l 3972
 bw	__BW__ 1.2 1.8
-ns	ip link set dev lo mtu 16384
-iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 10G
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 10G -l 16356
 bw	__BW__ 5.0 6.0
-ns	ip link set dev lo mtu 65535
-iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW ns 127.0.0.1 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 15G -l 65507
 bw	__BW__ 7.0 9.0
 
 iperf3k	host
@@ -91,10 +85,13 @@ lat	__LAT__ 200 150
 tr	UDP throughput over IPv6: host to ns
 iperf3s	ns 100${i}2 __THREADS__
 
-bw	-
-bw	-
-bw	-
-iperf3	BW host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G -l 1452
+bw	__BW__ 1.0 1.5
+iperf3	BW host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G -l 3952
+bw	__BW__ 1.2 1.8
+iperf3	BW host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 10G -l 16336
+bw	__BW__ 5.0 6.0
+iperf3	BW host ::1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G -l 65487
 bw	__BW__ 7.0 9.0
 
 iperf3k	ns
@@ -111,10 +108,13 @@ lat	__LAT__ 200 150
 
 tr	UDP throughput over IPv4: host to ns
 iperf3s	ns 100${i}2 __THREADS__
-bw	-
-bw	-
-bw	-
-iperf3	BW host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G -l 1372
+bw	__BW__ 1.0 1.5
+iperf3	BW host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G -l 3972
+bw	__BW__ 1.2 1.8
+iperf3	BW host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 10G -l 16356
+bw	__BW__ 5.0 6.0
+iperf3	BW host 127.0.0.1 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G -l 65507
 bw	__BW__ 7.0 9.0
 
 iperf3k	ns
@@ -141,22 +141,19 @@ nsout	IFNAME ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifnam
 info	Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams
 report	pasta tap_udp 1 __FREQ__
 
-th	MTU 1500B 4000B 16384B 65520B
+th	pktlen 1500B 4000B 16384B 65520B
 
 tr	UDP throughput over IPv6: ns to host
 iperf3s	host 100${i}3 __THREADS__
+# (datagram size) = (packet size) - 48: 40 bytes of IPv6 header, 8 of UDP header
 
-ns	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 2G -l 1472
 bw	__BW__ 0.3 0.5
-ns	ip link set dev __IFNAME__ mtu 4000
-iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G -l 3972
 bw	__BW__ 0.5 0.8
-ns	ip link set dev __IFNAME__ mtu 16384
-iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 4G
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 4G -l 16356
 bw	__BW__ 3.0 4.0
-ns	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 6G
+iperf3	BW ns __GW6__%__IFNAME__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 6G -l 65472
 bw	__BW__ 6.0 7.0
 
 iperf3k	host
@@ -173,18 +170,15 @@ lat	__LAT__ 200 150
 
 tr	UDP throughput over IPv4: ns to host
 iperf3s	host 100${i}3 __THREADS__
+# (datagram size) = (packet size) - 28: 20 bytes of IPv4 header, 8 of UDP header
 
-ns	ip link set dev __IFNAME__ mtu 1500
-iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 2G
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 2G -l 1472
 bw	__BW__ 0.3 0.5
-ns	ip link set dev __IFNAME__ mtu 4000
-iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 3G -l 3972
 bw	__BW__ 0.5 0.8
-ns	ip link set dev __IFNAME__ mtu 16384
-iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 4G
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 4G -l 16356
 bw	__BW__ 3.0 4.0
-ns	ip link set dev __IFNAME__ mtu 65520
-iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 6G
+iperf3	BW ns __GW__ 100${i}3 __THREADS__ __TIME__ __OPTS__ -b 6G -l 65492
 bw	__BW__ 6.0 7.0
 
 iperf3k	host
@@ -203,10 +197,13 @@ iperf3s	ns 100${i}2 __THREADS__
 
 nsout	IFNAME ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname'
 nsout	ADDR6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__IFNAME__").addr_info[] | select(.scope == "global" and .prefixlen == 64).local'
-bw	-
-bw	-
-bw	-
-iperf3	BW host __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW host __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G -l 1472
+bw	__BW__ 0.3 0.5
+iperf3	BW host __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G -l 3972
+bw	__BW__ 0.5 0.8
+iperf3	BW host __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 4G -l 16356
+bw	__BW__ 3.0 4.0
+iperf3	BW host __ADDR6__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G -l 65472
 bw	__BW__ 7.0 9.0
 
 iperf3k	ns
@@ -225,10 +222,13 @@ tr	UDP throughput over IPv4: host to ns
 iperf3s	ns 100${i}2 __THREADS__
 
 nsout	ADDR ip -j -4 addr show|jq -rM '.[] | select(.ifname == "__IFNAME__").addr_info[0].local'
-bw	-
-bw	-
-bw	-
-iperf3	BW host __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G
+iperf3	BW host __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 2G -l 1472
+bw	__BW__ 0.3 0.5
+iperf3	BW host __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 3G -l 3972
+bw	__BW__ 0.5 0.8
+iperf3	BW host __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 4G -l 16356
+bw	__BW__ 3.0 4.0
+iperf3	BW host __ADDR__ 100${i}2 __THREADS__ __TIME__ __OPTS__ -b 15G -l 65492
 bw	__BW__ 7.0 9.0
 
 iperf3k	ns
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/8] test/perf: "MTU" changes in passt_tcp host to guest aren't useful
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
                   ` (4 preceding siblings ...)
  2023-11-06  7:08 ` [PATCH 5/8] test/perf: Explicitly control UDP packet length, instead of MTU David Gibson
@ 2023-11-06  7:08 ` David Gibson
  2023-11-06  7:08 ` [PATCH 7/8] test/perf: Remove unnecessary --pacing-timer options David Gibson
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

The TCP packet size used on the passt L2 link (qemu socket) makes a huge
difference to passt/pasta throughput; many of passt's overheads (chiefly
syscalls) are per-packet.

That packet size is largely determined by the MTU on the L2 link, so we
benchmark for a number of different MTUs.  That works well for the guest to
host transfers.  For the host to guest transfers, we purport to test for
different MTUs, but we're not actually adjusting anything interesting.

The host to guest transfers adjust the MTU on the "host's" (actually ns)
loopback interface.  However, that only affects the packet size for the
socket going to passt, not the packet size for the L2 link that passt
manages - passt can and will repack the stream into packets of its own
size.  Since the depacketization on that socket is handled by the kernel, it
doesn't have much bearing on passt's performance.

We can't fix this by changing the L2 link MTU from the guest side (as we do
for guest to host), because that would only change the guest's view of the
MTU, passt would still think it has the large MTU.  We could test this by
using the --mtu option to passt, but that would require restarting passt
for each run, which is awkward in the current setup.  So, for now, drop all
the "small MTU" tests for host to guest.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/perf/passt_tcp | 37 ++++++++-----------------------------
 1 file changed, 8 insertions(+), 29 deletions(-)

diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp
index 9363922..205b9af 100644
--- a/test/perf/passt_tcp
+++ b/test/perf/passt_tcp
@@ -138,19 +138,11 @@ iperf3s	guest 100${i}1 __THREADS__
 
 bw	-
 bw	-
-ns	ip link set dev lo mtu 1280
-iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 1.0 1.2
-ns	ip link set dev lo mtu 1500
-iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 2.0 3.0
-ns	ip link set dev lo mtu 9000
-iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 5.0 6.0
-ns	ip link set dev lo mtu 65520
+bw	-
+bw	-
+bw	-
 iperf3	BW ns ::1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 6.0 6.8
-ns	ip link set dev lo mtu 65535
 
 iperf3k	guest
 
@@ -180,26 +172,13 @@ lat	__LAT__ 500 350
 tr	TCP throughput over IPv4: host to guest
 iperf3s	guest 100${i}1 __THREADS__
 
-ns	ip link set dev lo mtu 256
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 0.3 0.5
-ns	ip link set dev lo mtu 576
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 0.5 1.0
-ns	ip link set dev lo mtu 1280
-ns	ip addr add ::1 dev lo
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 2.0 3.0
-ns	ip link set dev lo mtu 1500
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 2.0 3.0
-ns	ip link set dev lo mtu 9000
-iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
-bw	__BW__ 5.0 6.0
-ns	ip link set dev lo mtu 65520
+bw	-
+bw	-
+bw	-
+bw	-
+bw	-
 iperf3	BW ns 127.0.0.1 100${i}1 __THREADS__ __TIME__ __OPTS__
 bw	__BW__ 6.0 6.8
-ns	ip link set dev lo mtu 65535
 
 iperf3k	guest
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 7/8] test/perf: Remove unnecessary --pacing-timer options
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
                   ` (5 preceding siblings ...)
  2023-11-06  7:08 ` [PATCH 6/8] test/perf: "MTU" changes in passt_tcp host to guest aren't useful David Gibson
@ 2023-11-06  7:08 ` David Gibson
  2023-11-06  7:08 ` [PATCH 8/8] test/perf: Simplify calculation of "omit" time for TCP throughput David Gibson
  2023-11-07 12:45 ` [PATCH 0/8] Clean ups and speed ups to benchmarks Stefano Brivio
  8 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

We always set --pacing-timer when invoking iperf3.  However, the iperf3
man page implies this is only relevant for the -b option.  We only use the
-b option for the UDP tests, not TCP, so remove --pacing-timer from the TCP
cases.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/perf/passt_tcp | 2 +-
 test/perf/pasta_tcp | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp
index 205b9af..da4e369 100644
--- a/test/perf/passt_tcp
+++ b/test/perf/passt_tcp
@@ -41,7 +41,7 @@ set	THREADS 1
 set	STREAMS 8
 set	TIME 10
 hout	OMIT echo __TIME__ / 6 | bc -l
-set	OPTS -Z -P __STREAMS__ -l 1M -O__OMIT__ --pacing-timer 1000000
+set	OPTS -Z -P __STREAMS__ -l 1M -O__OMIT__
 
 info	Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams
 report	passt tcp __THREADS__ __FREQ__
diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp
index 3a8ad40..11c73f8 100644
--- a/test/perf/pasta_tcp
+++ b/test/perf/pasta_tcp
@@ -25,7 +25,7 @@ set	THREADS 2
 set	STREAMS 2
 set	TIME 10
 hout	OMIT echo __TIME__ / 6 | bc -l
-set	OPTS -Z -w 4M -l 1M -P __STREAMS__ -O__OMIT__ --pacing-timer 10000
+set	OPTS -Z -w 4M -l 1M -P __STREAMS__ -O__OMIT__
 
 hout	FREQ_PROCFS (echo "scale=1"; sed -n 's/cpu MHz.*: \([0-9]*\)\..*$/(\1+10^2\/2)\/10^3/p' /proc/cpuinfo) | bc -l | head -n1
 hout	FREQ_CPUFREQ (echo "scale=1"; printf '( %i + 10^5 / 2 ) / 10^6\n' $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq) ) | bc -l
@@ -128,7 +128,7 @@ nsout	GW6 ip -j -6 route show|jq -rM '.[] | select(.dst == "default").gateway'
 nsout	IFNAME ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname'
 set	THREADS 1
 set	STREAMS 2
-set	OPTS -Z -P __STREAMS__ -i1 -O__OMIT__ --pacing-timer 100000
+set	OPTS -Z -P __STREAMS__ -i1 -O__OMIT__
 
 info	Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams
 report	pasta tap_tcp __THREADS__ __FREQ__
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 8/8] test/perf: Simplify calculation of "omit" time for TCP throughput
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
                   ` (6 preceding siblings ...)
  2023-11-06  7:08 ` [PATCH 7/8] test/perf: Remove unnecessary --pacing-timer options David Gibson
@ 2023-11-06  7:08 ` David Gibson
  2023-11-07 12:45 ` [PATCH 0/8] Clean ups and speed ups to benchmarks Stefano Brivio
  8 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2023-11-06  7:08 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

For the TCP throughput tests, we use iperf3's -O "omit" option, which
ignores results for the given time at the beginning of the test.  Currently
we calculate this as 1/6th of the test measurement time.  The purpose of
-O, however, is to skip over the TCP slow start period, which in no way
depends on the overall length of the test.

The slow start time is, roughly speaking,
    log_2 ( max_window_size / MSS ) * round_trip_time
These factors all vary between tests and machines we're running on, but we
can estimate some reasonable bounds for them:
  * The maximum window size is bounded by the buffer sizes at each end,
    which shouldn't exceed 16MiB
  * The MSS varies with the MTU we use, but the smallest we use in tests is
    ~256 bytes
  * Round trip time will vary with the system, but with these essentially
    local transfers it will typically be well under 1ms (on my laptop it is
    closer to 0.03ms)

That gives a worst case slow start time of about 16ms.  Setting an omit
time of 0.1s uniformly is therefore more than enough, and substantially
smaller than what we calculate now for the default case (10s / 6 ~= 1.7s).
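
A quick back-of-the-envelope check of the bound above, plugging in the
estimates from the bullet points (the 1 ms round trip time is a deliberately
generous assumption, as noted):

```python
import math

# Bounds from the commit message (assumed worst-case values)
max_window = 16 * 2**20   # 16 MiB buffer bound
mss = 256                 # smallest MSS used in the tests
rtt = 0.001               # generous 1 ms round trip for local transfers

slow_start = math.log2(max_window / mss) * rtt
print(f"worst-case slow start: {slow_start * 1000:.0f} ms")  # 16 ms

# Compare with the old omit time for the default TIME=10
print(f"old omit time: {10 / 6:.1f} s")  # 1.7 s
```

So even the worst case is well under the chosen 0.1 s omit time.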

This reduces total time for the standard benchmark run by around 30s.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 test/perf/passt_tcp | 2 +-
 test/perf/pasta_tcp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp
index da4e369..631a407 100644
--- a/test/perf/passt_tcp
+++ b/test/perf/passt_tcp
@@ -40,7 +40,7 @@ hout	FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC
 set	THREADS 1
 set	STREAMS 8
 set	TIME 10
-hout	OMIT echo __TIME__ / 6 | bc -l
+set	OMIT 0.1
 set	OPTS -Z -P __STREAMS__ -l 1M -O__OMIT__
 
 info	Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams
diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp
index 11c73f8..7777532 100644
--- a/test/perf/pasta_tcp
+++ b/test/perf/pasta_tcp
@@ -24,7 +24,7 @@ ns	/sbin/sysctl -w net.ipv4.tcp_timestamps=0
 set	THREADS 2
 set	STREAMS 2
 set	TIME 10
-hout	OMIT echo __TIME__ / 6 | bc -l
+set	OMIT 0.1
 set	OPTS -Z -w 4M -l 1M -P __STREAMS__ -O__OMIT__
 
 hout	FREQ_PROCFS (echo "scale=1"; sed -n 's/cpu MHz.*: \([0-9]*\)\..*$/(\1+10^2\/2)\/10^3/p' /proc/cpuinfo) | bc -l | head -n1
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/8] Clean ups and speed ups to benchmarks
  2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
                   ` (7 preceding siblings ...)
  2023-11-06  7:08 ` [PATCH 8/8] test/perf: Simplify calculation of "omit" time for TCP throughput David Gibson
@ 2023-11-07 12:45 ` Stefano Brivio
  8 siblings, 0 replies; 10+ messages in thread
From: Stefano Brivio @ 2023-11-07 12:45 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Mon,  6 Nov 2023 18:08:25 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Our standard "make check" includes a number of benchmarks, which take
> quite a long time to run.  This series makes a number of improvements
> to how we run these, which reduces wasted time and reduces the full
> run time by some 10-12 minutes.
> 
> David Gibson (8):
>   test/perf: Remove stale iperf3c/iperf3s directives
>   test/perf: Get iperf3 stats from client side
>   test/perf: Start iperf3 server less often
>   test/perf: Small MTUs for spliced TCP aren't interesting
>   test/perf: Explicitly control UDP packet length, instead of MTU
>   test/perf: "MTU" changes in passt_tcp host to guest aren't useful
>   test/perf: Remove unnecessary --pacing-timer options
>   test/perf: Simplify calculation of "omit" time for TCP throughput

Applied.

-- 
Stefano


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-11-07 12:45 UTC | newest]

Thread overview: 10+ messages
2023-11-06  7:08 [PATCH 0/8] Clean ups and speed ups to benchmarks David Gibson
2023-11-06  7:08 ` [PATCH 1/8] test/perf: Remove stale iperf3c/iperf3s directives David Gibson
2023-11-06  7:08 ` [PATCH 2/8] test/perf: Get iperf3 stats from client side David Gibson
2023-11-06  7:08 ` [PATCH 3/8] test/perf: Start iperf3 server less often David Gibson
2023-11-06  7:08 ` [PATCH 4/8] test/perf: Small MTUs for spliced TCP aren't interesting David Gibson
2023-11-06  7:08 ` [PATCH 5/8] test/perf: Explicitly control UDP packet length, instead of MTU David Gibson
2023-11-06  7:08 ` [PATCH 6/8] test/perf: "MTU" changes in passt_tcp host to guest aren't useful David Gibson
2023-11-06  7:08 ` [PATCH 7/8] test/perf: Remove unnecessary --pacing-timer options David Gibson
2023-11-06  7:08 ` [PATCH 8/8] test/perf: Simplify calculation of "omit" time for TCP throughput David Gibson
2023-11-07 12:45 ` [PATCH 0/8] Clean ups and speed ups to benchmarks Stefano Brivio

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt
