From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 3/3] tap: Don't size pool_tap[46] for the maximum number of packets
Date: Fri, 20 Dec 2024 12:13:23 +1100 [thread overview]
Message-ID: <Z2TEs6DN5VWmxVos@zatzit> (raw)
In-Reply-To: <20241219100015.3e4b7599@elisabeth>
[-- Attachment #1: Type: text/plain, Size: 6108 bytes --]
On Thu, Dec 19, 2024 at 10:00:15AM +0100, Stefano Brivio wrote:
> On Fri, 13 Dec 2024 23:01:56 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > Currently we attempt to size pool_tap[46] so they have room for the maximum
> > possible number of packets that could fit in pkt_buf, TAP_MSGS. However,
> > the calculation isn't quite correct: TAP_MSGS is based on ETH_ZLEN (60) as
> > the minimum possible L2 frame size. But, we don't enforce that L2 frames
> > are at least ETH_ZLEN when we receive them from the tap backend, and since
> > we're dealing with virtual interfaces we don't have the physical Ethernet
> > limitations requiring that length. Indeed it is possible to generate a
> > legitimate frame smaller than that (e.g. a zero-payload UDP/IPv4 frame on
> > the 'pasta' backend is only 42 bytes long).
> >
> > It's also unclear if this limit is sufficient for vhost-user which isn't
> > limited by the size of pkt_buf as the other modes are.
> >
> > We could attempt to correct the calculation, but that would leave us with
> > even larger arrays, which in practice rarely accumulate more than a handful
> > of packets. So, instead, put an arbitrary cap on the number of packets we
> > can put in a batch, and if we run out of space, process and flush the
> > batch.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > packet.c | 13 ++++++++++++-
> > packet.h | 3 +++
> > passt.h | 2 --
> > tap.c | 18 +++++++++++++++---
> > tap.h | 3 ++-
> > vu_common.c | 3 ++-
> > 6 files changed, 34 insertions(+), 8 deletions(-)
> >
> > diff --git a/packet.c b/packet.c
> > index 5bfa7304..b68580cc 100644
> > --- a/packet.c
> > +++ b/packet.c
> > @@ -22,6 +22,17 @@
> > #include "util.h"
> > #include "log.h"
> >
> > +/**
> > + * pool_full() - Is a packet pool full?
> > + * @p: Pointer to packet pool
> > + *
> > + * Return: true if the pool is full, false if more packets can be added
> > + */
> > +bool pool_full(const struct pool *p)
> > +{
> > + return p->count >= p->size;
> > +}
> > +
> > /**
> > * packet_add_do() - Add data as packet descriptor to given pool
> > * @p: Existing pool
> > @@ -35,7 +46,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
> > {
> > size_t idx = p->count;
> >
> > - if (idx >= p->size) {
> > + if (pool_full(p)) {
> > trace("add packet index %zu to pool with size %zu, %s:%i",
> > idx, p->size, func, line);
> > return;
> > diff --git a/packet.h b/packet.h
> > index 98eb8812..3618f213 100644
> > --- a/packet.h
> > +++ b/packet.h
> > @@ -6,6 +6,8 @@
> > #ifndef PACKET_H
> > #define PACKET_H
> >
> > +#include <stdbool.h>
> > +
> > /**
> > * struct pool - Generic pool of packets stored in nmemory
> > * @size: Number of usable descriptors for the pool
> > @@ -23,6 +25,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
> > void *packet_get_do(const struct pool *p, const size_t idx,
> > size_t offset, size_t len, size_t *left,
> > const char *func, int line);
> > +bool pool_full(const struct pool *p);
> > void pool_flush(struct pool *p);
> >
> > #define packet_add(p, len, start) \
> > diff --git a/passt.h b/passt.h
> > index 0dd4efa0..81b2787f 100644
> > --- a/passt.h
> > +++ b/passt.h
> > @@ -70,8 +70,6 @@ static_assert(sizeof(union epoll_ref) <= sizeof(union epoll_data),
> >
> > #define TAP_BUF_BYTES \
> > ROUND_DOWN(((ETH_MAX_MTU + sizeof(uint32_t)) * 128), PAGE_SIZE)
> > -#define TAP_MSGS \
> > - DIV_ROUND_UP(TAP_BUF_BYTES, ETH_ZLEN - 2 * ETH_ALEN + sizeof(uint32_t))
> >
> > #define PKT_BUF_BYTES MAX(TAP_BUF_BYTES, 0)
> > extern char pkt_buf [PKT_BUF_BYTES];
> > diff --git a/tap.c b/tap.c
> > index 68231f09..42370a26 100644
> > --- a/tap.c
> > +++ b/tap.c
> > @@ -61,6 +61,8 @@
> > #include "vhost_user.h"
> > #include "vu_common.h"
> >
> > +#define TAP_MSGS 256
>
> Sorry, I stopped at 2/3, had just a quick look at this one, and I
> missed this.
>
> Assuming 4 KiB pages, this changes from 161319 to 256. You mention that
Yes. I'm certainly open to arguments on what the number should be.
> in practice we never have more than a handful of messages, which is
> probably almost always the case, but I wonder if that's also the case
> with UDP "real-time" streams, where we could have bursts of a few
> hundred (thousand?) messages at a time.
Maybe. If we are getting them in large bursts, then we're no longer
really suceeding at the streams being "real-time", but sure, we should
try to catch up as best we can.
> I wonder: how bad would it be to correct the calculation, instead? We
> wouldn't actually use more memory, right?
I was pretty painful when I tried, and it would use more memory. The
safe option would be to use ETH_HLEN as the minimum size (which is
pretty much all we enforce in the tap layer), which would expand the
iovec array here by 2-3x. It's not enormous, but it's not nothing.
Or do you mean the unused pages of the array would never be
instantiated? In which case, yeah, I guess not.
Remember that with the changes in this patch if we exceed TAP_MSGS,
nothing particularly bad happens: we don't crash, and we don't drop
packets; we just process things in batches of TAP_MSGS frames at a
time. So this doesn't need to be large enough to handle any burst we
could ever get, just large enough to adequately mitigate the per-batch
costs, which I don't think are _that_ large. 256 was a first guess at
that. Maybe it's not enough, but I'd be pretty surprised if it needed
to be greater than ~1000 to make the per-batch costs negligible
compared to the per-frame costs. UDP_MAX_FRAMES, which is on the
reverse path but serves a similar function, is only 32.
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-12-20 1:13 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-13 12:01 [PATCH 0/3] Cleanups to packet pool handling and sizing David Gibson
2024-12-13 12:01 ` [PATCH 1/3] packet: Use flexible array member in struct pool David Gibson
2024-12-13 12:01 ` [PATCH 2/3] packet: Don't have struct pool specify its buffer David Gibson
2024-12-19 9:00 ` Stefano Brivio
2024-12-20 0:59 ` David Gibson
2024-12-20 9:51 ` Stefano Brivio
2024-12-21 6:59 ` David Gibson
2024-12-13 12:01 ` [PATCH 3/3] tap: Don't size pool_tap[46] for the maximum number of packets David Gibson
2024-12-19 9:00 ` Stefano Brivio
2024-12-20 1:13 ` David Gibson [this message]
2024-12-20 9:51 ` Stefano Brivio
2024-12-21 7:00 ` David Gibson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z2TEs6DN5VWmxVos@zatzit \
--to=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).