From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top, Matej Hrica <mhrica@redhat.com>
Subject: Re: [PATCH 3/4] util, lineread, tap: Overflow checks on long signed sums and subtractions
Date: Fri, 28 Jun 2024 20:30:54 +0200 [thread overview]
Message-ID: <20240628203054.73eff0a3@elisabeth> (raw)
In-Reply-To: <20240628095510.4d300810@elisabeth>
On Fri, 28 Jun 2024 09:55:18 +0200
Stefano Brivio <sbrivio@redhat.com> wrote:
> On Fri, 28 Jun 2024 17:11:43 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Thu, Jun 27, 2024 at 09:55:46AM +0200, Stefano Brivio wrote:
> > > On Thu, 27 Jun 2024 11:13:14 +1000
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > On Thu, Jun 27, 2024 at 01:45:35AM +0200, Stefano Brivio wrote:
> > > > > Potential sum and subtraction overflows were reported by Coverity in a
> > > > > few places where we use size_t and ssize_t types.
> > > > >
> > > > > Strictly speaking, those are not false positives even if I don't see a
> > > > > way to trigger those: for instance, if we receive up to n bytes from a
> > > > > socket up, the return value from recv() is already constrained and
> > > > > can't overflow for the usage we make in tap_handler_passt().
> > > >
> > > > Actually, I think they are false positives. In a bunch of cases the
> > > > reasoning for that does rely on assuming the kernel will never return
> > > > a value greater than the buffer size for read(), write() or similar.
> > >
> > > No, that was exactly my point: return values are constrained by the
> > > kernel, but a static checker doesn't necessarily have to assume a kernel
> > > that's properly functioning.
> >
> > Well, yes.
> >
> > > In general, static checkers do, especially if POSIX has a clear
> > > definition of a system call, and for what matters to us, they should.
> >
> > Right, that's the assumption I was working under.
> >
> > > But here Coverity is ignoring that, and I'm not sure we should call it
> > > a false positive. It's kind of arbitrary really. I think Coverity in
> > > these cases just prefers to "blindly" apply CERT C INT32-C locally,
> > > which is not necessarily a bad choice, because "false positives" are
> > > not so much of a nuisance.
> > >
> > > > So possibly just ASSERT()ing that would suffice.
> > >
> > > In some cases yes, but as we have built-ins in gcc and Clang that aim
> > > at keeping the cost of the checks down by, quoting gcc documentation,
> > > using "hardware instructions to implement these built-in functions where
> > > possible", and they already implement the operation, open-coding our own
> > > checks for assertions looks redundant and might result in slower
> > > code.
> >
> > It's not runtime cost I'm concerned about, I'm sure that's trivial.
> > What does concern me is:
> > - the overflow checking functions look like line noise
>
> Sure, that bothers me as well. I'm not sure: would comments improve that? Say:
>
> /* n += len; */
> if (sadd_overflow(n, len, &n))
> return;
>
> or a different macro? Or an ASSERT() on sadd_overflow() itself, so that
> it doesn't really look "inline"?
...that doesn't "work" either, which is rather puzzling. With this
diff, on top of this series:
diff --git a/lineread.c b/lineread.c
index f72a14e..82f262a 100644
--- a/lineread.c
+++ b/lineread.c
@@ -77,6 +77,7 @@ ssize_t lineread_get(struct lineread *lr, char **line)
{
bool eof = false;
ssize_t line_len;
+ ssize_t test;
while ((line_len = peek_line(lr, eof)) < 0) {
ssize_t rc;
@@ -103,8 +104,10 @@ ssize_t lineread_get(struct lineread *lr, char
**line)
if (rc == 0)
eof = true;
- else if (sadd_overflow(lr->count, rc, &lr->count))
- return -ERANGE;
+ else {
+ ASSERT(!sadd_overflow(lr->count, rc, &test));
+ lr->count += rc;
+ }
}
*line = lr->buf + lr->next_line;
I get these "events":
Called function "read(lr->fd, lr->buf + lr->next_line + lr->count, 8192L - lr->next_line - lr->count)", and a possible return value may be less than zero.
Assigning: "rc" = "read(lr->fd, lr->buf + lr->next_line + lr->count, 8192L - lr->next_line - lr->count)".
around read(), then:
Condition "!!__builtin_add_overflow(lr->count, rc, &test)", taking false branch.
and at the sum:
The expression "lr->count" is considered to have possibly overflowed.
More attempts below:
> > - it introduces extra error paths which make it harder to see what
> > the code's doing
>
> We already have equivalent error paths in all the cases I'm touching
> here, though.
>
> > - it doesn't really save reasoning about what ranges things can
> > have, because we need to know where to put them, unless we put
> > them on every single operation, which makes the above points much
> > worse
>
> I wouldn't put them on every single operation. Again, it's clear that
> in all these cases, we *can't* hit those overflows. Not even in the
> write_remainder() case
>
> I would just continue with the only possible reasonable approach, which
> is, reasoning about which ranges things can have, and then test things
> with static checkers, and if they have false positives, well, a bit of
> cruft here and there (be it suppressions or redundant checks) is a
> very reasonable price to pay for what they offer, I think.
>
> > I prefer checking that the syscall return values are within the bounds
> > we expect, rather than checking for later overflows, because as well
> > as the above, if we ever do get weird values out of the syscalls it
> > should show up the problem as close to the source as possible.
> >
> > > > > In any case, take care of those by adding two functions that
> > > > > explicitly check for overflows in sums and subtractions of long signed
> > > > > values, and using them.
> > > > >
> > > > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > > > > ---
> > > > > lineread.c | 5 +++--
> > > > > tap.c | 21 +++++++++++++++------
> > > > > util.c | 7 +++++--
> > > > > util.h | 46 +++++++++++++++++++++++++++++++++++++++++++++-
> > > > > 4 files changed, 68 insertions(+), 11 deletions(-)
> > > > >
> > > > > diff --git a/lineread.c b/lineread.c
> > > > > index 0387f4a..12f2d24 100644
> > > > > --- a/lineread.c
> > > > > +++ b/lineread.c
> > > > > @@ -17,6 +17,7 @@
> > > > > #include <string.h>
> > > > > #include <stdbool.h>
> > > > > #include <unistd.h>
> > > > > +#include <errno.h>
> > > > >
> > > > > #include "lineread.h"
> > > > > #include "util.h"
> > > > > @@ -102,8 +103,8 @@ ssize_t lineread_get(struct lineread *lr, char **line)
> > > > >
> > > > > if (rc == 0)
> > > > > eof = true;
> > > > > - else
> > > > > - lr->count += rc;
> > > >
> > > > From the construction of the read, lr->count + rc can never exceed
> > > > LINEREAD_BUFFER_SIZE - lr->next_line, so this can't overflow.
> > >
> > > Sure. But, especially as package maintainer, in this case I prefer to
> > > have a useless check than carrying suppressions around.
> >
> > Right, but my hope is that if we ASSERT() or otherwise check that
> > property of the read return value, then that will allow Coverity to
> > reason out the rest without needing an explicit suppression.
>
> Well, I tried (of course...), along with dozens of other attempts, but
> it didn't work for any of these cases. For example, here, other than an
> ASSERT() on the return value of the read(), I also tried stuff like:
>
> ASSERT(lr->count < SSIZE_MAX && rc < SSIZE_MAX)
> lr->count += (size_t)rc;
>
> but no, it doesn't work. I think what Coverity wants to see is the sum
> tested in infinite precision, first.
I also tried to check the return value of the read() call in the most
obvious way, that is, just after read():
ASSERT(rc <= LINEREAD_BUFFER_SIZE - lr->next_line - lr->count);
but lr->count is still reported as potentially overflowed. I think the
reason is that the subtraction in the ASSERT() itself is prone to
overflow.
Anyway, given that both rc and lr->count are ssize_t, the condition of
this ASSERT() can't overflow:
ASSERT((size_t)rc + lr->count < SSIZE_MAX);
but at this point Coverity says:
The check "(size_t)rc + lr->count < 9223372036854775807UL", which appears to be a guard against integer overflow, is not a useful guard because it is either always true, or never true. This taints "rc".
And simpler, rather obvious stuff like:
ASSERT(rc <= LINEREAD_BUFFER_SIZE);
ASSERT(lr->count <= LINEREAD_BUFFER_SIZE);
doesn't work either.
I think this proves that Coverity really isn't happy unless the sum
itself happens in infinite precision (not even size_t domain with
ssize_t operands is enough).
So, well, I can report the false positive, unless you have further
ideas to check.
Meanwhile, we can either try to make this patch more acceptable, or
I'll suppress checks (downstream) as needed.
> > > > > + else if (saddl_overflow(lr->count, rc, &lr->count))
> > > > > + return -ERANGE;
> > > > > }
> > > > >
> > > > > *line = lr->buf + lr->next_line;
> > > > > diff --git a/tap.c b/tap.c
> > > > > index ec994a2..7f8c26d 100644
> > > > > --- a/tap.c
> > > > > +++ b/tap.c
> > > > > @@ -1031,7 +1031,11 @@ redo:
> > > > > */
> > > > > if (l2len > n) {
> > > > > rem = recv(c->fd_tap, p + n, l2len - n, 0);
> > > > > - if ((n += rem) != l2len)
> > > >
> > > > Similarly, rem <= l2len - n, and therefore n + rem <= l2len.
> > >
> > > Same here.
> > >
> > > > > +
> > > > > + if (saddl_overflow(n, rem, &n))
> > > > > + return;
> > > > > +
> > > > > + if (n != l2len)
> > > > > return;
> > > > > }
> > > > >
> > > > > @@ -1046,7 +1050,9 @@ redo:
> > > > >
> > > > > next:
> > > > > p += l2len;
> > > > > - n -= l2len;
> > > >
> > > > We already checked that l2len <= n, so this one can't overflow either.
> > >
> > > Same here.
> > >
> > > > Not sure why Coverity can't see that itself, though :/. Possibly it
> > > > doesn't understand gotos well enough to see that the only goto next is
> > > > after that check.
> > >
> > > It sees that, that's the path it takes in reporting a potential
> > > overflow here. I think here, again, it's just blindly requesting
> > > INT32-C from CERT C rules, locally.
> >
> > Hmmm... in each of these cases the reasoning to show that there can't
> > be an overflow isn't very complicated - at least once you put in the
> > assumption about the syscall return values, which is why I'm hoping
> > asserting that fact will let Coverity sort it out.
>
> It doesn't.
>
> > If it's reasoning more locally than the function, I can't see how that
> > won't devolve into anything other than "never use signed arithmetic in
> > C, at all, ever".
>
> We do a lot of signed arithmetic, and yet, just those five cases are
> problematic. I guess there's something peculiar with system calls
> return values, or with ssize_t / size_t, even. Maybe I could try to
> show Coverity a re-definition of those types... other than that I'm
> pretty much out of ideas.
--
Stefano
next prev parent reply other threads:[~2024-06-28 18:31 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-26 23:45 [PATCH 0/4] Small, assorted "hardening" fixes Stefano Brivio
2024-06-26 23:45 ` [PATCH 1/4] conf: Copy up to MAXDNSRCH - 1 bytes, not MAXDNSRCH Stefano Brivio
2024-06-27 0:45 ` David Gibson
2024-06-27 7:27 ` Stefano Brivio
2024-06-27 10:11 ` David Gibson
2024-06-26 23:45 ` [PATCH 2/4] tcp_splice: Check return value of setsockopt() for SO_RCVLOWAT Stefano Brivio
2024-06-27 0:46 ` David Gibson
2024-06-26 23:45 ` [PATCH 3/4] util, lineread, tap: Overflow checks on long signed sums and subtractions Stefano Brivio
2024-06-27 1:13 ` David Gibson
2024-06-27 7:55 ` Stefano Brivio
2024-06-27 20:46 ` Stefano Brivio
2024-06-28 7:15 ` David Gibson
2024-06-28 7:11 ` David Gibson
2024-06-28 7:55 ` Stefano Brivio
2024-06-28 18:30 ` Stefano Brivio [this message]
2024-07-08 13:01 ` Stefano Brivio
2024-06-26 23:45 ` [PATCH 4/4] tap: Drop frames from guest whose length is more than remaining buffer Stefano Brivio
2024-06-27 1:30 ` David Gibson
2024-06-27 8:21 ` Stefano Brivio
2024-06-28 7:19 ` David Gibson
2024-06-28 7:56 ` Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240628203054.73eff0a3@elisabeth \
--to=sbrivio@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=mhrica@redhat.com \
--cc=passt-dev@passt.top \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).