From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTP id B86D85A004F for ; Fri, 28 Jun 2024 20:31:35 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1719599494; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MtWQNfxkcFXB4avKOlP8Tagp1lCe7xjMMSBrHT664bE=; b=MaFl3wV+lWpQVakiDVAWD5nM2gkk1nB4lPq8EFPQff1zFfoeHHHEeJ6Emm4GrUlSF/x0jn bBO1egJgYCJVLDuphsLRdJkt4cynWySG8KAvgSHEwgR2kbmQ7ADSJj/kBYl80WIOaqdlIZ 83Iv1gZAPidiGJ8DQzb7w+pZfIlog2Q= Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-139-UXDK5EHaN5KD-3H_ELvvxg-1; Fri, 28 Jun 2024 14:31:32 -0400 X-MC-Unique: UXDK5EHaN5KD-3H_ELvvxg-1 Received: by mail-qk1-f198.google.com with SMTP id af79cd13be357-79d56e0b68cso128213185a.1 for ; Fri, 28 Jun 2024 11:31:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1719599492; x=1720204292; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=MtWQNfxkcFXB4avKOlP8Tagp1lCe7xjMMSBrHT664bE=; b=URXLGt60iRnksUTRAnmrWDAREZJeWf0B45t1lhxvVZPdVCx9579gPtkFiQlVr9/myu gMnwQAiZyNrVWYUBbiJ0GBwNzK9ZJMvZdbjU9g8z6HXte+gfUVQI5mM79WvCuR1pLF7d XEEdulj1M384XV60MvxMG7it3V1CLpppno0b+5o+0Lgj98AcWpyqaX7xo8L9yv2ZD22x p1poVwscZDtW1B3CF7+nHjXswG69I6yPHlSoh+qHszU+fk4JiYgkIm8/+hGt17AUG/Gn +P37J7CdVjGcRwjBJf+H4N2MotGS7uvHPHtyMeSwU44dqCEMKNJGxNEPAcFWCtJU5p12 q7RQ== X-Gm-Message-State: AOJu0YyKFBTSTwThq3OM4qalqr1ThLXtg6pPjzcgwKMgwkFrAKL7S+Qm hQjywSss4xHT8lbL41xgt7b13sENBGe9sd1XL309Sp6aVBdgwHnDbJaAslZQi5MJTnQsKy6pyXr kMHZ6ZvXWFlYluNpb0QCoG6krzQI84GhwmH/O5lFj0InNEVrWLQ== X-Received: by 2002:a05:620a:4108:b0:79c:c55b:f5fd with SMTP id af79cd13be357-79cc55bfa30mr1024087885a.76.1719599491997; Fri, 28 Jun 2024 11:31:31 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFO8tDy56VqhfPljJEizr3WwLGqwIr7k8ulSXYFXJoVYU4GnOc5LgkmW/ArjC1DgcwRtXDcKA== X-Received: by 2002:a05:620a:4108:b0:79c:c55b:f5fd with SMTP id af79cd13be357-79cc55bfa30mr1024084185a.76.1719599491388; Fri, 28 Jun 2024 11:31:31 -0700 (PDT) Received: from maya.cloud.tilaa.com (maya.cloud.tilaa.com. [164.138.29.33]) by smtp.gmail.com with ESMTPSA id af79cd13be357-79d69302764sm95265185a.99.2024.06.28.11.31.29 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 28 Jun 2024 11:31:30 -0700 (PDT) Date: Fri, 28 Jun 2024 20:30:54 +0200 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH 3/4] util, lineread, tap: Overflow checks on long signed sums and subtractions Message-ID: <20240628203054.73eff0a3@elisabeth> In-Reply-To: <20240628095510.4d300810@elisabeth> References: <20240626234536.3306466-1-sbrivio@redhat.com> <20240626234536.3306466-4-sbrivio@redhat.com> <20240627095537.4d3f20f8@elisabeth> <20240628095510.4d300810@elisabeth> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.41; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: I2HPNNFGZFRNLWHFOK2H2DLOBDVO2LBP X-Message-ID-Hash: I2HPNNFGZFRNLWHFOK2H2DLOBDVO2LBP X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, Matej Hrica X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Fri, 28 Jun 2024 09:55:18 +0200 Stefano Brivio wrote: > On Fri, 28 Jun 2024 17:11:43 +1000 > David Gibson wrote: > > > On Thu, Jun 27, 2024 at 09:55:46AM +0200, Stefano Brivio wrote: > > > On Thu, 27 Jun 2024 11:13:14 +1000 > > > David Gibson wrote: > > > > > > > On Thu, Jun 27, 2024 at 01:45:35AM +0200, Stefano Brivio wrote: > > > > > Potential sum and subtraction overflows were reported by Coverity in a > > > > > few places where we use size_t and ssize_t types. > > > > > > > > > > Strictly speaking, those are not false positives even if I don't see a > > > > > way to trigger those: for instance, if we receive up to n bytes from a > > > > > socket up, the return value from recv() is already constrained and > > > > > can't overflow for the usage we make in tap_handler_passt(). > > > > > > > > Actually, I think they are false positives. In a bunch of cases the > > > > reasoning for that does rely on assuming the kernel will never return > > > > a value greater than the buffer size for read(), write() or similar. > > > > > > No, that was exactly my point: return values are constrained by the > > > kernel, but a static checker doesn't necessarily have to assume a kernel > > > that's properly functioning. > > > > Well, yes. > > > > > In general, static checkers do, especially if POSIX has a clear > > > definition of a system call, and for what matters to us, they should. > > > > Right, that's the assumption I was working under. > > > > > But here Coverity is ignoring that, and I'm not sure we should call it > > > a false positive. It's kind of arbitrary really. I think Coverity in > > > these cases just prefers to "blindly" apply CERT C INT32-C locally, > > > which is not necessarily a bad choice, because "false positives" are > > > not so much of a nuisance. > > > > > > > So possibly just ASSERT()ing that would suffice. > > > > > > In some cases yes, but as we have built-ins in gcc and Clang that aim > > > at keeping the cost of the checks down by, quoting gcc documentation, > > > using "hardware instructions to implement these built-in functions where > > > possible", and they already implement the operation, open-coding our own > > > checks for assertions looks redundant and might result in slower > > > code. > > > > It's not runtime cost I'm concerned about, I'm sure that's trivial. > > What does concern me is: > > - the overflow checking functions look like line noise > > Sure, that bothers me as well. I'm not sure: would comments improve that? Say: > > /* n += len; */ > if (sadd_overflow(n, len, &n)) > return; > > or a different macro? Or an ASSERT() on sadd_overflow() itself, so that > it doesn't really look "inline"? ...that doesn't "work" either, which is rather puzzling. With this diff, on top of this series: diff --git a/lineread.c b/lineread.c index f72a14e..82f262a 100644 --- a/lineread.c +++ b/lineread.c @@ -77,6 +77,7 @@ ssize_t lineread_get(struct lineread *lr, char **line) { bool eof = false; ssize_t line_len; + ssize_t test; while ((line_len = peek_line(lr, eof)) < 0) { ssize_t rc; @@ -103,8 +104,10 @@ ssize_t lineread_get(struct lineread *lr, char **line) if (rc == 0) eof = true; - else if (sadd_overflow(lr->count, rc, &lr->count)) - return -ERANGE; + else { + ASSERT(!sadd_overflow(lr->count, rc, &test)); + lr->count += rc; + } } *line = lr->buf + lr->next_line; I get these "events": Called function "read(lr->fd, lr->buf + lr->next_line + lr->count, 8192L - lr->next_line - lr->count)", and a possible return value may be less than zero. Assigning: "rc" = "read(lr->fd, lr->buf + lr->next_line + lr->count, 8192L - lr->next_line - lr->count)". around read(), then: Condition "!!__builtin_add_overflow(lr->count, rc, &test)", taking false branch. and at the sum: The expression "lr->count" is considered to have possibly overflowed. More attempts below: > > - it introduces extra error paths which make it harder to see what > > the code's doing > > We already have equivalent error paths in all the cases I'm touching > here, though. > > > - it doesn't really save reasoning about what ranges things can > > have, because we need to know where to put them, unless we put > > them on every single operation, which makes the above points much > > worse > > I wouldn't put them on every single operation. Again, it's clear that > in all these cases, we *can't* hit those overflows. Not even in the > write_remainder() case > > I would just continue with the only possible reasonable approach, which > is, reasoning about which ranges things can have, and then test things > with static checkers, and if they have false positives, well, a bit of > cruft here and there (be it suppressions or redundant checks) is a > very reasonable price to pay for what they offer, I think. > > > I prefer checking that the syscall return values are within the bounds > > we expect, rather than checking for later overflows, because as well > > as the above, if we ever do get weird values out of the syscalls it > > should show up the problem as close to the source as possible. > > > > > > > In any case, take care of those by adding two functions that > > > > > explicitly check for overflows in sums and subtractions of long signed > > > > > values, and using them. > > > > > > > > > > Signed-off-by: Stefano Brivio > > > > > --- > > > > > lineread.c | 5 +++-- > > > > > tap.c | 21 +++++++++++++++------ > > > > > util.c | 7 +++++-- > > > > > util.h | 46 +++++++++++++++++++++++++++++++++++++++++++++- > > > > > 4 files changed, 68 insertions(+), 11 deletions(-) > > > > > > > > > > diff --git a/lineread.c b/lineread.c > > > > > index 0387f4a..12f2d24 100644 > > > > > --- a/lineread.c > > > > > +++ b/lineread.c > > > > > @@ -17,6 +17,7 @@ > > > > > #include > > > > > #include > > > > > #include > > > > > +#include > > > > > > > > > > #include "lineread.h" > > > > > #include "util.h" > > > > > @@ -102,8 +103,8 @@ ssize_t lineread_get(struct lineread *lr, char **line) > > > > > > > > > > if (rc == 0) > > > > > eof = true; > > > > > - else > > > > > - lr->count += rc; > > > > > > > > From the construction of the read, lr->count + rc can never exceed > > > > LINEREAD_BUFFER_SIZE - lr->next_line, so this can't overflow. > > > > > > Sure. But, especially as package maintainer, in this case I prefer to > > > have a useless check than carrying suppressions around. > > > > Right, but my hope is that if we ASSERT() or otherwise check that > > property of the read return value, then that will allow Coverity to > > reason out the rest without needing an explicit suppression. > > Well, I tried (of course...), along with dozens of other attempts, but > it didn't work for any of these cases. For example, here, other than an > ASSERT() on the return value of the read(), I also tried stuff like: > > ASSERT(lr->count < SSIZE_MAX && rc < SSIZE_MAX) > lr->count += (size_t)rc; > > but no, it doesn't work. I think what Coverity wants to see is the sum > tested in infinite precision, first. I also tried to check the return value of the read() call in the most obvious way, that is, just after read(): ASSERT(rc <= LINEREAD_BUFFER_SIZE - lr->next_line - lr->count); but lr->count is still reported as potentially overflowed. I think the reason is that the subtraction in the ASSERT() itself is prone to overflow. Anyway, given that both rc and lr->count are ssize_t, the condition of this ASSERT() can't overflow: ASSERT((size_t)rc + lr->count < SSIZE_MAX); but at this point Coverity says: The check "(size_t)rc + lr->count < 9223372036854775807UL", which appears to be a guard against integer overflow, is not a useful guard because it is either always true, or never true. This taints "rc". And simpler, rather obvious stuff like: ASSERT(rc <= LINEREAD_BUFFER_SIZE); ASSERT(lr->count <= LINEREAD_BUFFER_SIZE); doesn't work either. I think this proves that Coverity really isn't happy unless the sum itself happens in infinite precision (not even size_t domain with ssize_t operands is enough). So, well, I can report the false positive, unless you have further ideas to check. Meanwhile, we can either try to make this patch more acceptable, or I'll suppress checks (downstream) as needed. > > > > > + else if (saddl_overflow(lr->count, rc, &lr->count)) > > > > > + return -ERANGE; > > > > > } > > > > > > > > > > *line = lr->buf + lr->next_line; > > > > > diff --git a/tap.c b/tap.c > > > > > index ec994a2..7f8c26d 100644 > > > > > --- a/tap.c > > > > > +++ b/tap.c > > > > > @@ -1031,7 +1031,11 @@ redo: > > > > > */ > > > > > if (l2len > n) { > > > > > rem = recv(c->fd_tap, p + n, l2len - n, 0); > > > > > - if ((n += rem) != l2len) > > > > > > > > Similarly, rem <= l2len - n, and therefore n + rem <= l2len. > > > > > > Same here. > > > > > > > > + > > > > > + if (saddl_overflow(n, rem, &n)) > > > > > + return; > > > > > + > > > > > + if (n != l2len) > > > > > return; > > > > > } > > > > > > > > > > @@ -1046,7 +1050,9 @@ redo: > > > > > > > > > > next: > > > > > p += l2len; > > > > > - n -= l2len; > > > > > > > > We already checked that l2len <= n, so this one can't overflow either. > > > > > > Same here. > > > > > > > Not sure why Coverity can't see that itself, though :/. Possibly it > > > > doesn't understand gotos well enough to see that the only goto next is > > > > after that check. > > > > > > It sees that, that's the path it takes in reporting a potential > > > overflow here. I think here, again, it's just blindly requesting > > > INT32-C from CERT C rules, locally. > > > > Hmmm... in each of these cases the reasoning to show that there can't > > be an overflow isn't very complicated - at least once you put in the > > assumption about the syscall return values, which is why I'm hoping > > asserting that fact will let Coverity sort it out. > > It doesn't. > > > If it's reasoning more locally than the function, I can't see how that > > won't devolve into anything other than "never use signed arithmetic in > > C, at all, ever". > > We do a lot of signed arithmetic, and yet, just those five cases are > problematic. I guess there's something peculiar with system calls > return values, or with ssize_t / size_t, even. Maybe I could try to > show Coverity a re-definition of those types... other than that I'm > pretty much out of ideas. -- Stefano