From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2 11/12] tap: Don't size pool_tap[46] for the maximum number of packets
Date: Wed, 1 Jan 2025 22:54:44 +0100
Message-ID: <20250101225444.130c1034@elisabeth>
In-Reply-To: <20241220083535.1372523-12-david@gibson.dropbear.id.au>

On Fri, 20 Dec 2024 19:35:34 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Currently we attempt to size pool_tap[46] so they have room for the maximum
> possible number of packets that could fit in pkt_buf, TAP_MSGS. However,
> the calculation isn't quite correct: TAP_MSGS is based on ETH_ZLEN (60) as
> the minimum possible L2 frame size. But, we don't enforce that L2 frames
> are at least ETH_ZLEN when we receive them from the tap backend, and since
> we're dealing with virtual interfaces we don't have the physical Ethernet
> limitations requiring that length. Indeed it is possible to generate a
> legitimate frame smaller than that (e.g. a zero-payload UDP/IPv4 frame on
> the 'pasta' backend is only 42 bytes long).
>
> It's also unclear if this limit is sufficient for vhost-user which isn't
> limited by the size of pkt_buf as the other modes are.
>
> We could attempt to correct the calculation, but that would leave us with
> even larger arrays, which in practice rarely accumulate more than a handful
> of packets. So, instead, put an arbitrary cap on the number of packets we
> can put in a batch, and if we run out of space, process and flush the
> batch.
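
To make the frame size point above concrete, here is a minimal sketch
of the arithmetic. The SKETCH_* constants and both function names are
made up for illustration and are not taken from the passt sources. It
also ignores any per-frame length descriptor a backend might add, so
the counts are only indicative.

```c
#include <assert.h>
#include <stddef.h>

/* Header sizes, spelled out so the sketch doesn't need system headers */
#define SKETCH_ETH_HLEN	14	/* dst MAC (6) + src MAC (6) + EtherType (2) */
#define SKETCH_IP4_HLEN	20	/* IPv4 header without options */
#define SKETCH_UDP_HLEN	8	/* UDP header */
#define SKETCH_ETH_ZLEN	60	/* minimum Ethernet frame length, without FCS */

/* Length of a UDP/IPv4 Ethernet frame carrying payload_len payload bytes */
size_t udp4_frame_len(size_t payload_len)
{
	return SKETCH_ETH_HLEN + SKETCH_IP4_HLEN + SKETCH_UDP_HLEN +
	       payload_len;
}

/* How many whole frames of frame_len bytes fit in buf_len bytes */
size_t frames_in_buf(size_t buf_len, size_t frame_len)
{
	assert(frame_len > 0);
	return buf_len / frame_len;
}
```

udp4_frame_len(0) gives the 42 bytes mentioned in the quoted text,
which is below SKETCH_ETH_ZLEN, so a bound computed from the 60-byte
minimum undercounts the frames that can fit in a given buffer.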

I ran a few more tests with this, keeping TAP_MSGS at 256, and in
general I couldn't really see a difference in latency (especially for
UDP streams with small packets) or throughput. Figures from short
throughput tests (such as the ones from the test suite) look a bit more
variable, but I don't have any statistically meaningful data.

Then I looked into how many messages we might have in the array without
this change, and I realised that, with the throughput tests from the
suite, we very easily exceed the 256 limit.

Perhaps surprisingly, we get the highest buffer counts with TCP transfers
and intermediate MTUs: we're at about 4000-5000 with a 1500-byte MTU (and
more like ~1000 with 1280 bytes), meaning that we move 6 to 8 megabytes
in one shot, every 5-10ms (at 8 Gbps). With that kind of time interval,
the extra system call overhead from forcibly flushing batches might
become rather relevant.
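
As a rough illustration of what a cap means for bursts of that size,
here is a minimal sketch of a flush-on-full batch that only counts
frames. struct batch_sketch and the three functions are hypothetical
names, not passt code, and the sketch doesn't model what a flush
actually costs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical batch state: counts frames only, stores no payload */
struct batch_sketch {
	size_t cap;	/* maximum frames per batch */
	size_t count;	/* frames currently queued */
	size_t flushes;	/* number of times the batch was processed */
};

/* Process and empty the batch, if it contains anything */
void batch_flush(struct batch_sketch *b)
{
	if (!b->count)
		return;

	b->flushes++;
	b->count = 0;
}

/* Queue one frame, forcing a flush first if the batch is full */
void batch_add(struct batch_sketch *b)
{
	assert(b->cap > 0);

	if (b->count == b->cap)
		batch_flush(b);

	b->count++;
}

/* Number of flushes needed to handle a burst of frames with a given cap */
size_t flushes_for_burst(size_t frames, size_t cap)
{
	struct batch_sketch b = { .cap = cap };
	size_t i;

	for (i = 0; i < frames; i++)
		batch_add(&b);

	batch_flush(&b);	/* final flush at the end of the burst */

	return b.flushes;
}
```

With these assumptions, flushes_for_burst(5000, 256) returns 20, while
a cap of 5000 or more handles the same burst with a single flush.
Whether the 19 additional rounds are measurable depends on the
per-flush cost, which is exactly what would need profiling.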

With lower MTUs, it looks like we have a lower CPU load and
transmissions are scheduled differently (resulting in smaller batches),
but I didn't really trace things.

So I'm starting to think that this has the *potential* to introduce a
performance regression in some cases, and we shouldn't just assume that
some arbitrary 256 limit is good enough. I didn't check with perf(1),
though.

Right now that array takes, effectively, less than 100 KiB (it's ~5000
copies of struct iovec, 16 bytes each), and in theory that could be
~2.5 MiB (at 161319 items). Even if we double or triple that (let's
assume we use 2 * ETH_ALEN to keep it simple) it's not much... and will
have no practical effect anyway.
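
The figures above can be checked with a short sketch. SKETCH_IOV_SIZE
hardcodes the 16 bytes per struct iovec quoted in the text (the usual
size on 64-bit Linux) rather than using sizeof(), and the function
names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of struct iovec: 16 bytes, as stated in the text */
#define SKETCH_IOV_SIZE	16

/* Bytes taken by an array of 'entries' iovec-sized items */
size_t pool_array_bytes(size_t entries)
{
	assert(entries <= SIZE_MAX / SKETCH_IOV_SIZE);

	return entries * SKETCH_IOV_SIZE;
}

/* Convert bytes to KiB, rounding down */
size_t bytes_to_kib(size_t bytes)
{
	return bytes / 1024;
}
```

pool_array_bytes(5000) is 80000 bytes, about 78 KiB, and
pool_array_bytes(161319) is 2581104 bytes, about 2520 KiB or just
under 2.5 MiB. Tripling the larger figure still gives less than 8 MiB.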

All in all, I think we shouldn't change this limit without a deeper
understanding of the practical impact. While this change doesn't bring
any practical advantage, the current behaviour is somewhat tested by
now, and a small limit isn't.

--
Stefano