From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2 09/12] packet: Distinguish severities of different packet_{add,git}_do() errors
Date: Thu, 2 Jan 2025 23:00:08 +0100 [thread overview]
Message-ID: <20250102230008.391068cb@elisabeth> (raw)
In-Reply-To: <Z3YAywsKK17UE7kB@zatzit>
On Thu, 2 Jan 2025 13:58:19 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Wed, Jan 01, 2025 at 10:54:41PM +0100, Stefano Brivio wrote:
> > On Fri, 20 Dec 2024 19:35:32 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > packet_add() and packet_get() can fail for a number of reasons, and we
> > > currently treat them all basically the same: we log a trace() level message
> > > and for packet_get() we return NULL. However the different causes of
> > > error are quite different and suggest different handling:
> > >
> > > 1) If we run out of space in the pool on add, that's (rightly) non-fatal,
> > > but is an unusual situation which we might want to know about. Promote
> > > it's message to debug() level.
> > >
> > > 2) All packet_check_range() errors and the checks on packet length indicate
> > > a serious problem. Due to either a bug in calling code, or some sort
> > > of memory-clobbering, we've been given addresses or sizes of packets
> > > that are nonsensical. If things are this bad, we shouldn't try to carry
> > > on replace these with asserts.
> > >
> > > 3) Requesting a packet index that doesn't exist in the pool in packet_get()
> > > indicates a bug in the caller - it should always check the index first.
> > > Replace this with an assert.
> >
> > The reasons for 2) and 3) being trace() messages (they should probably
> > be debug() following the same reasoning as 1)) are graceful degradation
> > and security (availability).
>
> Hmm..
>
> > If we have a packet triggering some sort of "memory clobbering", or an
> > issue in the caller, that might very well be just one packet or a few
> > of them.
>
> I see the case for packet_add_do(). In that case it could be that
> this is just a bad packet triggering bugs in the initial processing
> code, and just dropping it will be sufficient to save us from damaging
> state.
>
> For packet_get_do(), however, this indicates that we have a packet
> that's out of bounds, despite the fact that we checked its bounds when
> it was added to the pool. That means we *already* have corrupted
> memory, and we have no way to assess the extent of the damage. It's
> entirely possible we're already badly compromised. Under those
> circumstances I think it is much better to terminate immediately,
> rather than risk doing any more hard to trace damage.
It depends on which part of packet_get_do(). If we're exceeding the
packet count or the size of a pool, then I guess yes, otherwise not so
much.
> > I think it's a disservice to users to crash in that case, and it has
> > the potential to worsen a security flaw if this packet can be built on
> > purpose. It's also a disservice to package maintainers because it has
> > the potential to turn a harmless issue into an annoying situation with
> > associated time pressure and everything.
>
> Ok, that argument makes sense to me for packet_add_do(), but again,
> not for packet_get_do().
>
> > Those are not warn() calls, by the way, because we don't want to make
> > it too easy, in that (perhaps unlikely) case, to flood logs, and
> > possibly hide information due to rotation.
>
> Right, I see the argument, but it also worries me that the message -
> which even if we are successfully containing the damage here
> definitely indicates a bug in the caller - could easily never be seen
> at trace or even debug level.
>
> > I suppose that the argument for asserts here is so that we discover
> > functional issues as soon as possible, which I understand of course,
>
> That's the main reason. The secondary reason is that it clarifies the
> meaning of a NULL return from packet_get_do(). With this change it
> means: you requested a range that's not in the packet, whereas
> previously it could mean either that or "something is horribly wrong
> in the pool state, and we don't know what".
>
> > but it relies on two assumptions that might not hold:
> >
> > 1. the crash will be reported, while a debug() message will go
> > unnoticed.
> >
> > Well, the crash is more noticeable, but that doesn't mean that it
> > will be reported. Users might/will try to hack something around it
> > or try to survive with slirp4netns and all its flaws.
>
> I suppose so.
>
> > On the other hand, a malfunction (say, failed connections) might be
> > equally likely to be reported, along with debug() messages.
>
> That only applies if this causes malfunctions in intended behaviour.
> If it's triggered due to an attack, rather than a flaw handling
> expected behaviour then there's no reason it would be reported.
>
> > 2. the check and the assertion themselves are correct. This was not the
> > case for the two most recent severe issues (commit 61c0b0d0f199, bug
> > #105) we discovered through assertions.
>
> That's fair, I can see why you'd be gun shy about asserts given that.
>
> > Issues such as https://github.com/containers/podman/issues/22925, on
> > the other hand, are good examples of how asserts saved us a lot
> > of time and effort later on (compared to, say, "connections stop
> > working after three days"). But I would argue that those are
> > substantially different cases where assertions are checking
> > something that's much more inherent to the implementation.
>
> Eh... I can't really see a way these assertions are different other
> than that one turned out to be wrong and the other didn't.
Ah, sorry, I meant that there is a substantial difference between the
ones related to Podman issue 22925 *and what might happen in
packet_get_range()*.
Not that there's a difference between those related to Podman issue
22925 and those from commit 61c0b0d0f199 and bug #105.
> > In this case, I would argue that there's no need to crash, so we
> > shouldn't. We can presumably carry on after that, and we're not leaking
> > any resource or sneakily leave behind some condition that will cause
> > apparently unrelated issues at a later time.
>
> Again, true (plausibly) for packet_add_do(), but not for
> packet_get_do().
>
> > > 4) On packet_get() requesting a section of the packet beyond its bounds is
> > > correct already. This happens routinely from some callers when they
> > > probe to see if certain parts of the packet are present. At worst it
> > > indicates that the guest has generate a malformed packet, which we
> > > should quietly drop as we already do.
>
> This patch doesn't change that. We check against the stated packet
> bounds first and return NULL *before* we check against the buffer
> bounds and possibly trigger an assert.
Yes... I wasn't actually disputing that (quote is from yourself).
> Ok... here's a tentative proposal, let me know what you think:
So, first off, I have to say that it's a bit hard to reason about
something that is not giving users any problem at the moment (well, at
least not without a part of this series), and which didn't really give
us problems in the past.
It's difficult to quantify with experience and examples (even letting
alone the priority of the whole matter for a moment...). Anyway, let me
try:
> * I promote the packet_check_range() messages to warn(), or even
> err(). Yes, this might allow for log flooding, but I would argue
> that if we have a bug allowing this path to be triggered and
> something is frequently triggering it, we *want* that to be noisy.
> We *think* we're containing the damage here, but we don't really
> know.
I still think it's our responsibility to not flood the system log in
*any* circumstance.
Remember that this is not necessarily *our* log where the worst that
can happen is that we hide information. Even with sane system log
configurations, one message per packet can get close to a substantial
denial of service (especially with systemd-syslogd, see also
https://github.com/systemd/systemd/issues/13425).
I guess you don't realise until you run passt without -f, for example
with libvirt, for a large number of virtual machines. Or in KubeVirt.
Or for a bunch of containers. It's a common and very much intended
usage.
It's not true in general that we don't know. We can quantify the
difference in damage very clearly if the attacker manages to make
packet_check_range() print one line for each packet (and nothing
else).
> * I make packet_check_range() failures in packet_add_do() drop the
> packet and continue, as they used to
> * packet_check_range() failures in packet_get_do() will still
> assert()
I guess the (start < p->buf) one makes sense, but not so much the one
where we check that offset plus length is within the pool, because that
one could be very well triggered by a specific packet itself.
> * I leave the assert()s for packet _index_ out of range
Okay, yes, those make sense to me as well.
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > ---
> > > packet.c | 71 ++++++++++++++++++++---------------------------------
> > > packet.h | 3 ++-
> > > vu_common.c | 13 +++++++---
> > > 3 files changed, 38 insertions(+), 49 deletions(-)
> > >
> > > diff --git a/packet.c b/packet.c
> > > index c921aa15..24f12448 100644
> > > --- a/packet.c
> > > +++ b/packet.c
> > > @@ -30,37 +30,27 @@
> > > * @func: For tracing: name of calling function
> > > * @line: For tracing: caller line of function call
> > > *
> > > - * Return: 0 if the range is valid, -1 otherwise
> > > + * ASSERT()s if the given range isn't valid (within the expected buffer(s) for
> > > + * the pool).
> > > */
> > > -static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
> > > - const char *func, int line)
> > > +static void packet_check_range(const struct pool *p,
> > > + const char *ptr, size_t len,
> > > + const char *func, int line)
> > > {
> > > if (p->buf_size == 0) {
> > > - int ret;
> > > -
> > > - ret = vu_packet_check_range((void *)p->buf, ptr, len);
> > > -
> > > - if (ret == -1)
> > > - trace("cannot find region, %s:%i", func, line);
> > > -
> > > - return ret;
> > > - }
> > > -
> > > - if (ptr < p->buf) {
> > > - trace("packet range start %p before buffer start %p, %s:%i",
> > > - (void *)ptr, (void *)p->buf, func, line);
> > > - return -1;
> > > - }
> > > -
> > > - if (ptr + len > p->buf + p->buf_size) {
> > > - trace("packet range end %p after buffer end %p, %s:%i",
> > > - (void *)(ptr + len), (void *)(p->buf + p->buf_size),
> > > - func, line);
> > > - return -1;
> > > + vu_packet_check_range((void *)p->buf, ptr, len, func, line);
> > > + return;
> > > }
> > >
> > > - return 0;
> > > + ASSERT_WITH_MSG(ptr >= p->buf,
> > > + "packet range start %p before buffer start %p, %s:%i",
> > > + (void *)ptr, (void *)p->buf, func, line);
> > > + ASSERT_WITH_MSG(ptr + len <= p->buf + p->buf_size,
> > > + "packet range end %p after buffer end %p, %s:%i",
> > > + (void *)(ptr + len), (void *)(p->buf + p->buf_size),
> > > + func, line);
> > > }
> > > +
> > > /**
> > > * packet_add_do() - Add data as packet descriptor to given pool
> > > * @p: Existing pool
> > > @@ -75,18 +65,16 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
> > > size_t idx = p->count;
> > >
> > > if (idx >= p->size) {
> > > - trace("add packet index %zu to pool with size %zu, %s:%i",
> > > + debug("add packet index %zu to pool with size %zu, %s:%i",
> > > idx, p->size, func, line);
> > > return;
> > > }
> > >
> > > - if (packet_check_range(p, start, len, func, line))
> > > - return;
> > > + packet_check_range(p, start, len, func, line);
> > >
> > > - if (len > PACKET_MAX_LEN) {
> > > - trace("add packet length %zu, %s:%i", len, func, line);
> > > - return;
> > > - }
> > > + ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
> > > + "add packet length %zu (max %zu), %s:%i",
> > > + len, PACKET_MAX_LEN, func, line);
> > >
> > > p->pkt[idx].iov_base = (void *)start;
> > > p->pkt[idx].iov_len = len;
> > > @@ -111,16 +99,12 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
> > > {
> > > char *ptr;
> > >
> > > - if (idx >= p->size || idx >= p->count) {
> > > - trace("packet %zu from pool size: %zu, count: %zu, %s:%i",
> > > - idx, p->size, p->count, func, line);
> > > - return NULL;
> > > - }
> > > -
> > > - if (len > PACKET_MAX_LEN) {
> > > - trace("packet data length %zu, %s:%i", len, func, line);
> > > - return NULL;
> > > - }
> > > + ASSERT_WITH_MSG(idx < p->size && idx < p->count,
> > > + "packet %zu from pool size: %zu, count: %zu, %s:%i",
> > > + idx, p->size, p->count, func, line);
> > > + ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
> > > + "packet range length %zu (max %zu), %s:%i",
> > > + len, PACKET_MAX_LEN, func, line);
> > >
> > > if (len + offset > p->pkt[idx].iov_len) {
> > > trace("data length %zu, offset %zu from length %zu, %s:%i",
> > > @@ -130,8 +114,7 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
> > >
> > > ptr = (char *)p->pkt[idx].iov_base + offset;
> > >
> > > - if (packet_check_range(p, ptr, len, func, line))
> > > - return NULL;
> > > + packet_check_range(p, ptr, len, func, line);
> > >
> > > if (left)
> > > *left = p->pkt[idx].iov_len - offset - len;
> > > diff --git a/packet.h b/packet.h
> > > index f95cda08..b164f77e 100644
> > > --- a/packet.h
> > > +++ b/packet.h
> > > @@ -31,7 +31,8 @@ struct pool {
> > > struct iovec pkt[];
> > > };
> > >
> > > -int vu_packet_check_range(void *buf, const char *ptr, size_t len);
> > > +void vu_packet_check_range(void *buf, const char *ptr, size_t len,
> > > + const char *func, int line);
> > > void packet_add_do(struct pool *p, size_t len, const char *start,
> > > const char *func, int line);
> > > void *packet_get_do(const struct pool *p, const size_t idx,
> > > diff --git a/vu_common.c b/vu_common.c
> > > index 531f8786..528b9b08 100644
> > > --- a/vu_common.c
> > > +++ b/vu_common.c
> > > @@ -24,10 +24,13 @@
> > > * @buf: Array of the available memory regions
> > > * @ptr: Start of desired data range
> > > * @size: Length of desired data range
> > > + * @func: For tracing: name of calling function
> > > + * @line: For tracing: caller line of function call
> >
> > If those arguments are mandatory, I would drop the "For tracing"
> > specification.
> >
> > > *
> > > - * Return: 0 if the zone is in a mapped memory region, -1 otherwise
> > > + * Aborts if the zone isn't in any of the device regions
> > > */
> > > -int vu_packet_check_range(void *buf, const char *ptr, size_t len)
> > > +void vu_packet_check_range(void *buf, const char *ptr, size_t len,
> > > + const char *func, int line)
> > > {
> > > struct vu_dev_region *dev_region;
> > >
> > > @@ -37,10 +40,12 @@ int vu_packet_check_range(void *buf, const char *ptr, size_t len)
> > >
> > > if (m <= ptr &&
> > > ptr + len <= m + dev_region->mmap_offset + dev_region->size)
> > > - return 0;
> > > + return;
> > > }
> > >
> > > - return -1;
> > > + abort_with_msg(
> > > + "package range at %p, length %zd not within dev region %s:%i",
> >
> > s/package/packet/
> >
> > > + (void *)ptr, len, func, line);
> > > }
> > >
> > > /**
--
Stefano
next prev parent reply other threads:[~2025-01-02 22:00 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-20 8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
2024-12-20 8:35 ` [PATCH v2 01/12] test focus David Gibson
2024-12-20 8:35 ` [PATCH v2 02/12] hack: stop on fail, but not perf fail David Gibson
2024-12-20 8:35 ` [PATCH v2 03/12] make passt dumpable David Gibson
2024-12-20 8:35 ` [PATCH v2 04/12] packet: Use flexible array member in struct pool David Gibson
2024-12-20 8:35 ` [PATCH v2 05/12] packet: Don't pass start and offset separately too packet_check_range() David Gibson
2024-12-20 8:35 ` [PATCH v2 06/12] packet: Don't hard code maximum packet size to UINT16_MAX David Gibson
2025-01-01 21:54 ` Stefano Brivio
2025-01-02 1:00 ` David Gibson
2025-01-02 21:59 ` Stefano Brivio
2025-01-03 1:16 ` David Gibson
2025-01-05 23:43 ` Stefano Brivio
2024-12-20 8:35 ` [PATCH v2 07/12] packet: Remove unhelpful packet_get_try() macro David Gibson
2025-01-01 21:54 ` Stefano Brivio
2025-01-02 2:15 ` David Gibson
2025-01-02 22:00 ` Stefano Brivio
2025-01-03 4:48 ` David Gibson
2025-01-06 10:55 ` Stefano Brivio
2024-12-20 8:35 ` [PATCH v2 08/12] util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers David Gibson
2024-12-20 8:35 ` [PATCH v2 09/12] packet: Distinguish severities of different packet_{add,git}_do() errors David Gibson
2025-01-01 21:54 ` Stefano Brivio
2025-01-02 2:58 ` David Gibson
2025-01-02 22:00 ` Stefano Brivio [this message]
2025-01-03 5:06 ` David Gibson
2025-01-06 10:55 ` Stefano Brivio
2024-12-20 8:35 ` [PATCH v2 10/12] packet: Move packet length checks into packet_check_range() David Gibson
2024-12-20 8:35 ` [PATCH v2 11/12] tap: Don't size pool_tap[46] for the maximum number of packets David Gibson
2025-01-01 21:54 ` Stefano Brivio
2025-01-02 3:46 ` David Gibson
2025-01-02 22:00 ` Stefano Brivio
2025-01-03 6:06 ` David Gibson
2024-12-20 8:35 ` [PATCH v2 12/12] packet: More cautious checks to avoid pointer arithmetic UB David Gibson
2024-12-20 9:00 ` [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
2024-12-20 10:06 ` Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250102230008.391068cb@elisabeth \
--to=sbrivio@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).