From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2 09/12] packet: Distinguish severities of different packet_{add,git}_do() errors
Date: Thu, 2 Jan 2025 13:58:19 +1100 [thread overview]
Message-ID: <Z3YAywsKK17UE7kB@zatzit> (raw)
In-Reply-To: <20250101225441.00bdb632@elisabeth>
[-- Attachment #1: Type: text/plain, Size: 13146 bytes --]
On Wed, Jan 01, 2025 at 10:54:41PM +0100, Stefano Brivio wrote:
> On Fri, 20 Dec 2024 19:35:32 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > packet_add() and packet_get() can fail for a number of reasons, and we
> > currently treat them all basically the same: we log a trace() level message
> > and for packet_get() we return NULL. However the different causes of
> > error are quite different and suggest different handling:
> >
> > 1) If we run out of space in the pool on add, that's (rightly) non-fatal,
> > but is an unusual situation which we might want to know about. Promote
> > it's message to debug() level.
> >
> > 2) All packet_check_range() errors and the checks on packet length indicate
> > a serious problem. Due to either a bug in calling code, or some sort
> > of memory-clobbering, we've been given addresses or sizes of packets
> > that are nonsensical. If things are this bad, we shouldn't try to carry
> > on replace these with asserts.
> >
> > 3) Requesting a packet index that doesn't exist in the pool in packet_get()
> > indicates a bug in the caller - it should always check the index first.
> > Replace this with an assert.
>
> The reasons for 2) and 3) being trace() messages (they should probably
> be debug() following the same reasoning as 1)) are graceful degradation
> and security (availability).
Hmm..
> If we have a packet triggering some sort of "memory clobbering", or an
> issue in the caller, that might very well be just one packet or a few
> of them.
I see the case for packet_add_do(). In that case it could be that
this is just a bad packet triggering bugs in the initial processing
code, and just dropping it will be sufficient to save us from damaging
state.
For packet_get_do(), however, this indicates that we have a packet
that's out of bounds, despite the fact that we checked its bounds when
it was added to the pool. That means we *already* have corrupted
memory, and we have no way to assess the extent of the damage. It's
entirely possible we're already badly compromised. Under those
circumstances I think it is much better to terminate immediately,
rather than risk doing any more hard to trace damage.
> I think it's a disservice to users to crash in that case, and it has
> the potential to worsen a security flaw if this packet can be built on
> purpose. It's also a disservice to package maintainers because it has
> the potential to turn a harmless issue into an annoying situation with
> associated time pressure and everything.
Ok, that argument makes sense to me for packet_add_do(), but again,
not for packet_get_do().
> Those are not warn() calls, by the way, because we don't want to make
> it too easy, in that (perhaps unlikely) case, to flood logs, and
> possibly hide information due to rotation.
Right, I see the argument, but it also worries me that the message -
which even if we are successfully containing the damage here
definitely indicates a bug in the caller - could easily never be seen
at trace or even debug level.
> I suppose that the argument for asserts here is so that we discover
> functional issues as soon as possible, which I understand of course,
That's the main reason. The secondary reason is that it clarifies the
meaning of a NULL return from packet_get_do(). With this change it
means: you requested a range that's not in the packet, whereas
previously it could mean either that or "something is horribly wrong
in the pool state, and we don't know what".
> but it relies on two assumptions that might not hold:
>
> 1. the crash will be reported, while a debug() message will go
> unnoticed.
>
> Well, the crash is more noticeable, but that doesn't mean that it
> will be reported. Users might/will try to hack something around it
> or try to survive with slirp4netns and all its flaws.
I suppose so.
> On the other hand, a malfunction (say, failed connections) might be
> equally likely to be reported, along with debug() messages.
That only applies if this causes malfunctions in intended behaviour.
If it's triggered due to an attack, rather than a flaw handling
expected behaviour then there's no reason it would be reported.
> 2. the check and the assertion themselves are correct. This was not the
> case for the two most recent severe issues (commit 61c0b0d0f199, bug
> #105) we discovered through assertions.
That's fair, I can see why you'd be gun shy about asserts given that.
> Issues such as https://github.com/containers/podman/issues/22925, on
> the other hand, are good examples of how asserts saved us a lot
> of time and effort later on (compared to, say, "connections stop
> working after three days"). But I would argue that those are
> substantially different cases where assertions are checking
> something that's much more inherent to the implementation.
Eh... I can't really see a way these assertions are different other
than that one turned out to be wrong and the other didn't.
> In this case, I would argue that there's no need to crash, so we
> shouldn't. We can presumably carry on after that, and we're not leaking
> any resource or sneakily leave behind some condition that will cause
> apparently unrelated issues at a later time.
Again, true (plausibly) for packet_add_do(), but not for
packet_get_do().
>
> > 4) On packet_get() requesting a section of the packet beyond its bounds is
> > correct already. This happens routinely from some callers when they
> > probe to see if certain parts of the packet are present. At worst it
> > indicates that the guest has generate a malformed packet, which we
> > should quietly drop as we already do.
This patch doesn't change that. We check against the stated packet
bounds first and return NULL *before* we check against the buffer
bounds and possibly trigger an assert.
Ok... here's a tentative proposal, let me know what you think:
* I promote the packet_check_range() messages to warn(), or even
err(). Yes, this might allow for log flooding, but I would argue
that if we have a bug allowing this path to be triggered and
something is frequently triggering it, we *want* that to be noisy.
We *think* we're containing the damage here, but we don't really
know.
* I make packet_check_range() failures in packet_add_do() drop the
packet and continue, as they used to
* packet_check_range() failures in packet_get_do() will still
assert()
* I leave the assert()s for packet _index_ out of range
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > packet.c | 71 ++++++++++++++++++++---------------------------------
> > packet.h | 3 ++-
> > vu_common.c | 13 +++++++---
> > 3 files changed, 38 insertions(+), 49 deletions(-)
> >
> > diff --git a/packet.c b/packet.c
> > index c921aa15..24f12448 100644
> > --- a/packet.c
> > +++ b/packet.c
> > @@ -30,37 +30,27 @@
> > * @func: For tracing: name of calling function
> > * @line: For tracing: caller line of function call
> > *
> > - * Return: 0 if the range is valid, -1 otherwise
> > + * ASSERT()s if the given range isn't valid (within the expected buffer(s) for
> > + * the pool).
> > */
> > -static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
> > - const char *func, int line)
> > +static void packet_check_range(const struct pool *p,
> > + const char *ptr, size_t len,
> > + const char *func, int line)
> > {
> > if (p->buf_size == 0) {
> > - int ret;
> > -
> > - ret = vu_packet_check_range((void *)p->buf, ptr, len);
> > -
> > - if (ret == -1)
> > - trace("cannot find region, %s:%i", func, line);
> > -
> > - return ret;
> > - }
> > -
> > - if (ptr < p->buf) {
> > - trace("packet range start %p before buffer start %p, %s:%i",
> > - (void *)ptr, (void *)p->buf, func, line);
> > - return -1;
> > - }
> > -
> > - if (ptr + len > p->buf + p->buf_size) {
> > - trace("packet range end %p after buffer end %p, %s:%i",
> > - (void *)(ptr + len), (void *)(p->buf + p->buf_size),
> > - func, line);
> > - return -1;
> > + vu_packet_check_range((void *)p->buf, ptr, len, func, line);
> > + return;
> > }
> >
> > - return 0;
> > + ASSERT_WITH_MSG(ptr >= p->buf,
> > + "packet range start %p before buffer start %p, %s:%i",
> > + (void *)ptr, (void *)p->buf, func, line);
> > + ASSERT_WITH_MSG(ptr + len <= p->buf + p->buf_size,
> > + "packet range end %p after buffer end %p, %s:%i",
> > + (void *)(ptr + len), (void *)(p->buf + p->buf_size),
> > + func, line);
> > }
> > +
> > /**
> > * packet_add_do() - Add data as packet descriptor to given pool
> > * @p: Existing pool
> > @@ -75,18 +65,16 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
> > size_t idx = p->count;
> >
> > if (idx >= p->size) {
> > - trace("add packet index %zu to pool with size %zu, %s:%i",
> > + debug("add packet index %zu to pool with size %zu, %s:%i",
> > idx, p->size, func, line);
> > return;
> > }
> >
> > - if (packet_check_range(p, start, len, func, line))
> > - return;
> > + packet_check_range(p, start, len, func, line);
> >
> > - if (len > PACKET_MAX_LEN) {
> > - trace("add packet length %zu, %s:%i", len, func, line);
> > - return;
> > - }
> > + ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
> > + "add packet length %zu (max %zu), %s:%i",
> > + len, PACKET_MAX_LEN, func, line);
> >
> > p->pkt[idx].iov_base = (void *)start;
> > p->pkt[idx].iov_len = len;
> > @@ -111,16 +99,12 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
> > {
> > char *ptr;
> >
> > - if (idx >= p->size || idx >= p->count) {
> > - trace("packet %zu from pool size: %zu, count: %zu, %s:%i",
> > - idx, p->size, p->count, func, line);
> > - return NULL;
> > - }
> > -
> > - if (len > PACKET_MAX_LEN) {
> > - trace("packet data length %zu, %s:%i", len, func, line);
> > - return NULL;
> > - }
> > + ASSERT_WITH_MSG(idx < p->size && idx < p->count,
> > + "packet %zu from pool size: %zu, count: %zu, %s:%i",
> > + idx, p->size, p->count, func, line);
> > + ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
> > + "packet range length %zu (max %zu), %s:%i",
> > + len, PACKET_MAX_LEN, func, line);
> >
> > if (len + offset > p->pkt[idx].iov_len) {
> > trace("data length %zu, offset %zu from length %zu, %s:%i",
> > @@ -130,8 +114,7 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
> >
> > ptr = (char *)p->pkt[idx].iov_base + offset;
> >
> > - if (packet_check_range(p, ptr, len, func, line))
> > - return NULL;
> > + packet_check_range(p, ptr, len, func, line);
> >
> > if (left)
> > *left = p->pkt[idx].iov_len - offset - len;
> > diff --git a/packet.h b/packet.h
> > index f95cda08..b164f77e 100644
> > --- a/packet.h
> > +++ b/packet.h
> > @@ -31,7 +31,8 @@ struct pool {
> > struct iovec pkt[];
> > };
> >
> > -int vu_packet_check_range(void *buf, const char *ptr, size_t len);
> > +void vu_packet_check_range(void *buf, const char *ptr, size_t len,
> > + const char *func, int line);
> > void packet_add_do(struct pool *p, size_t len, const char *start,
> > const char *func, int line);
> > void *packet_get_do(const struct pool *p, const size_t idx,
> > diff --git a/vu_common.c b/vu_common.c
> > index 531f8786..528b9b08 100644
> > --- a/vu_common.c
> > +++ b/vu_common.c
> > @@ -24,10 +24,13 @@
> > * @buf: Array of the available memory regions
> > * @ptr: Start of desired data range
> > * @size: Length of desired data range
> > + * @func: For tracing: name of calling function
> > + * @line: For tracing: caller line of function call
>
> If those arguments are mandatory, I would drop the "For tracing"
> specification.
>
> > *
> > - * Return: 0 if the zone is in a mapped memory region, -1 otherwise
> > + * Aborts if the zone isn't in any of the device regions
> > */
> > -int vu_packet_check_range(void *buf, const char *ptr, size_t len)
> > +void vu_packet_check_range(void *buf, const char *ptr, size_t len,
> > + const char *func, int line)
> > {
> > struct vu_dev_region *dev_region;
> >
> > @@ -37,10 +40,12 @@ int vu_packet_check_range(void *buf, const char *ptr, size_t len)
> >
> > if (m <= ptr &&
> > ptr + len <= m + dev_region->mmap_offset + dev_region->size)
> > - return 0;
> > + return;
> > }
> >
> > - return -1;
> > + abort_with_msg(
> > + "package range at %p, length %zd not within dev region %s:%i",
>
> s/package/packet/
>
> > + (void *)ptr, len, func, line);
> > }
> >
> > /**
>
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2025-01-02 3:47 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-20 8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
2024-12-20 8:35 ` [PATCH v2 01/12] test focus David Gibson
2024-12-20 8:35 ` [PATCH v2 02/12] hack: stop on fail, but not perf fail David Gibson
2024-12-20 8:35 ` [PATCH v2 03/12] make passt dumpable David Gibson
2024-12-20 8:35 ` [PATCH v2 04/12] packet: Use flexible array member in struct pool David Gibson
2024-12-20 8:35 ` [PATCH v2 05/12] packet: Don't pass start and offset separately too packet_check_range() David Gibson
2024-12-20 8:35 ` [PATCH v2 06/12] packet: Don't hard code maximum packet size to UINT16_MAX David Gibson
2025-01-01 21:54 ` Stefano Brivio
2025-01-02 1:00 ` David Gibson
2025-01-02 21:59 ` Stefano Brivio
2025-01-03 1:16 ` David Gibson
2025-01-05 23:43 ` Stefano Brivio
2024-12-20 8:35 ` [PATCH v2 07/12] packet: Remove unhelpful packet_get_try() macro David Gibson
2025-01-01 21:54 ` Stefano Brivio
2025-01-02 2:15 ` David Gibson
2025-01-02 22:00 ` Stefano Brivio
2025-01-03 4:48 ` David Gibson
2025-01-06 10:55 ` Stefano Brivio
2024-12-20 8:35 ` [PATCH v2 08/12] util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers David Gibson
2024-12-20 8:35 ` [PATCH v2 09/12] packet: Distinguish severities of different packet_{add,git}_do() errors David Gibson
2025-01-01 21:54 ` Stefano Brivio
2025-01-02 2:58 ` David Gibson [this message]
2025-01-02 22:00 ` Stefano Brivio
2025-01-03 5:06 ` David Gibson
2025-01-06 10:55 ` Stefano Brivio
2024-12-20 8:35 ` [PATCH v2 10/12] packet: Move packet length checks into packet_check_range() David Gibson
2024-12-20 8:35 ` [PATCH v2 11/12] tap: Don't size pool_tap[46] for the maximum number of packets David Gibson
2025-01-01 21:54 ` Stefano Brivio
2025-01-02 3:46 ` David Gibson
2025-01-02 22:00 ` Stefano Brivio
2025-01-03 6:06 ` David Gibson
2024-12-20 8:35 ` [PATCH v2 12/12] packet: More cautious checks to avoid pointer arithmetic UB David Gibson
2024-12-20 9:00 ` [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
2024-12-20 10:06 ` Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z3YAywsKK17UE7kB@zatzit \
--to=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).