* Performance with unified IPv4/IPv6 tap queues
@ 2024-10-11 0:11 Jon Maloy
2024-10-11 18:07 ` Stefano Brivio
From: Jon Maloy @ 2024-10-11 0:11 UTC (permalink / raw)
To: passt-dev, Stefano Brivio, David Gibson, Laurent Vivier
Hi all,
I added the addressing/routing workarounds suggested by Stefano, and
the performance measurements now seem to be working flawlessly,
even the one Stefano said failed in his runs.
I made 5 runs from the master branch, and 5 with my two patches applied.
You can observe the results at
https://drive.google.com/drive/folders/1xGcWJ79smELbWOPwcJdsmyvoIrTz9R56
However, there seems to be a systematic decrease in throughput.
If we take the average over the runs for IPv6 ns -> host via tap,
we get 33.56 Gb/s vs 31.84 Gb/s, i.e. a 5% difference.
I don't really know what to make of this, and would like to know if
anybody else can confirm or falsify this.
///jon
* Re: Performance with unified IPv4/IPv6 tap queues
2024-10-11 0:11 Performance with unified IPv4/IPv6 tap queues Jon Maloy
@ 2024-10-11 18:07 ` Stefano Brivio
From: Stefano Brivio @ 2024-10-11 18:07 UTC (permalink / raw)
To: Jon Maloy; +Cc: passt-dev, David Gibson, Laurent Vivier
[-- Attachment #1: Type: text/plain, Size: 3036 bytes --]
On Thu, 10 Oct 2024 20:11:57 -0400
Jon Maloy <jmaloy@redhat.com> wrote:
> Hi all,
> I added the addressing/routing workarounds suggested by Stefano, and
For context: there were/are two issues in the tests with Jon's setup
(private IPv6 address and route on the host):
1. This private address was assigned with a /40 netmask, but in the
pasta throughput tests via tap (namespace to host), to find a local,
non-loopback address to use, we do:
ip -j -6 addr show|jq -rM '.[] | select(.ifname == "eth0").addr_info[] | select(.scope == "global" and .prefixlen == 64).local'
...I don't remember if there's a valid reason why we filter on /64
addresses. I guess we should drop that if not needed. Workaround for
this: assign the address as /64 (see the sketch after point 2 below).
2. The default gateway for IPv6 wasn't a link-local address. In ndp(),
we use our_tap_ll as the source address for advertisements (and before
the introduction of our_tap_ll, this was conceptually the same).
However, with --config-net (which is not used in these tests, because
we want to test NDP and DHCPv6), we would copy routes, including the
default gateway, from the host, and that copied default gateway is the
gateway address we also expect in the container.
I quickly tried to change this logic (I'm not sure if we really *need*
to use a link-local address as source for the advertisement, hence as
router address), but if I use a non-link-local address, the kernel
refuses to assign it.
Workaround: use a link-local address as gateway address.
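For reference, a minimal sketch of the two workarounds on the host side
(not from the original messages; 2001:db8::1, fe80::1 and eth0 are
placeholders for the actual address, gateway and interface):

  # 1. assign the private address with a /64 prefix instead of /40
  ip -6 addr add 2001:db8::1/64 dev eth0
  # 2. make the IPv6 default route point to a link-local gateway
  ip -6 route replace default via fe80::1 dev eth0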
> the performance measurements now seem to be working flawlessly,
> even the one Stefano said failed in his runs.
>
> I made 5 runs from the master branch, and 5 with my two patches applied.
>
> You can observe the results at
> https://drive.google.com/drive/folders/1xGcWJ79smELbWOPwcJdsmyvoIrTz9R56
...kind of, one would need a Google account and specific access.
Anyway, I attached your logs to this email.
> However, there seems to be a systematic decrease in throughput.
> If we take the average over the runs for IPv6 ns -> host via tap,
> we get 33.56 Gb/s vs 31.84 Gb/s, i.e. a 5% difference.
That's not on the path that's directly affected by your patches: that's
namespace to host, while the queues are used in the host to namespace
(or guest) direction.
On the other hand, acknowledgement segments are actually using those
queues.
> I don't really know what to make of this, and would like to know if
> anybody else can confirm or falsify this.
It's quite hard to get statistically significant figures with those
tests (transfers last one second) -- those are there just to check that
there's nothing seriously wrong (that is, a massive decrease in
throughput).
To understand whether this is an actual decrease in throughput, I would
suggest running a much longer manual test (at least 20-30 seconds),
with pasta or passt running under perf(1). Then, check the throughput
and the cycles spent in the various system calls involved.
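Something along these lines, for instance (just a sketch, assuming
iperf3 is available, <pid> is pasta's PID, and the client runs from
inside the namespace towards the host address used in the tests):

  # record pasta for 30 seconds while a long transfer is running
  perf record -g -p <pid> -- sleep 30 &
  iperf3 -c <host address> -t 30
  perf report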
--
Stefano
[-- Attachment #2: test.log_mini_no_patches_1 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]
[-- Attachment #3: test.log_mini_no_patches_2 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]
[-- Attachment #4: test.log_mini_no_patches_3 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]
[-- Attachment #5: test.log_mini_no_patches_4 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]
[-- Attachment #6: test.log_mini_no_patches_5 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]
[-- Attachment #7: test.log_mini_two_patches_1 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]
[-- Attachment #8: test.log_mini_two_patches_2 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]
[-- Attachment #9: test.log_mini_two_patches_3 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]
[-- Attachment #10: test.log_mini_two_patches_4 --]
[-- Type: application/octet-stream, Size: 9846 bytes --]
[-- Attachment #11: test.log_mini_two_patches_5 --]
[-- Type: application/octet-stream, Size: 8610 bytes --]