From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 10/11] tap: Discard guest data on length descriptor mismatch
Date: Thu, 25 Jul 2024 13:09:19 +0200 [thread overview]
Message-ID: <20240725130908.2ca21e7c@elisabeth> (raw)
In-Reply-To: <ZqInu0VUjlLW9T3S@zatzit>
On Thu, 25 Jul 2024 20:23:55 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Thu, Jul 25, 2024 at 11:15:03AM +0200, Stefano Brivio wrote:
> > On Thu, 25 Jul 2024 14:37:43 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Wed, Jul 24, 2024 at 11:50:16PM +0200, Stefano Brivio wrote:
> > > > This was reported by Matej a while ago, but we forgot to fix it. Even
> > > > if the hypervisor is necessarily trusted by passt, as it can in any
> > > > case terminate the guest or disrupt guest connectivity, it's a good
> > > > idea to be robust against possible issues.
> > > >
> > > > Instead of resetting the connection to the hypervisor, just discard
> > > > the data we read with a single recv(), as we had a few cases where
> > > > QEMU would get the length descriptor wrong, in the past.
> > > >
> > > > While at it, change l2len in tap_handler_passt() to uint32_t, as the
> > > > length descriptor is logically unsigned and 32-bit wide.
> > > >
> > > > Reported-by: Matej Hrica <mhrica@redhat.com>
> > > > Suggested-by: Matej Hrica <mhrica@redhat.com>
> > > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > > > ---
> > > > tap.c | 10 ++++++----
> > > > 1 file changed, 6 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/tap.c b/tap.c
> > > > index 44bd444..62ba6a4 100644
> > > > --- a/tap.c
> > > > +++ b/tap.c
> > > > @@ -1011,15 +1011,18 @@ redo:
> > > > }
> > > >
> > > > while (n > (ssize_t)sizeof(uint32_t)) {
> > > > - ssize_t l2len = ntohl(*(uint32_t *)p);
> > > > + uint32_t l2len = ntohl(*(uint32_t *)p);
> > > >
> > > > p += sizeof(uint32_t);
> > > > n -= sizeof(uint32_t);
> > > >
> > > > + if (l2len > (ssize_t)TAP_BUF_BYTES - n)
> > > > + return;
> > >
> > > Neither the condition nor the action makes much sense to me here.
> > > We're testing if the frame can fit in the the remaining buffer space.
> >
> > Not really, we're just checking that the length descriptor fits the
> > remaining buffer space. We're using this in the second recv() below,
> > that's why it matters here.
>
> But AFAICT, what we need to know is if the remainder of the frame fits
> in the buffer.
It always does because of how TAP_BUF_BYTES and TAP_BUF_FILL are
defined.
> That could be less than the length descriptor if we've
> already recv()ed part of a frame.
>
> > > But we may have already read part (or all) of the frame - i.e. it's
> > > included in 'n'. So I don't see how that condition is useful.
> >
> > ...that is, it has nothing to do with exceeding or not exceeding the
> > buffer on recv(), that's already taken care of by the recv() call,
> > implicitly.
> >
> > > Then, simply returning doesn't seem right under pretty much any
> > > circumstances - that discards some amount of data, and leaves us in an
> > > unsynchronized state w.r.t. the frame boundaries.
> >
> > That might happen, of course, but it might also happen that the
> > hypervisor sent us *one* corrupted buffer, and the next recv() will
> > read consistent data.
>
> Well, sure, it's possible, but it doesn't seem particularly likely to
> me. AFAICT this is a stream which we need every length field to
> interpret properly. If we lose one, or it's corrupted, I think we're
> done for.
In most cases, the content of one recv() corresponds to a given number
of whole frames, so what we're doing by returning here is, practically
speaking, similar to what you're suggesting below with MSG_TRUNC.
> > > If this is just supposed to be a sanity check on the frame length,
> > > then I think we'd be better off with a fixed limit - 64kiB is the
> > > obvious choice.
> >
> > That's already checked below (l2len > ETH_MAX_MTU), and...
>
> Right. I wonder if it would make sense to do that earlier.
We can. But right now what I want to fix here is just that missing
check, because it actually causes passt to crash (even though we assume
a trusted hypervisor, so it's not really security relevant, but not nice
either).
> > > If we hit that, we can warn() and discard data up to
> > > the end of the too-large frame. That at least has a chance of letting
> > > us recover and move on to future acceptable frames.
> >
> > that's exactly what we do in that case (goto next).
>
> Only for the case that the length is too long, but not *too* long. In
> particular it needs to fit in the buffer to even get there. If we
> sanity checked the frame length earlier we could use MSG_TRUNC to
> discard even a ludicrously large frame and still continue on to the
> next one.
We don't know if the length descriptor actually matches the length of
the frame, though. If you have a way you think is more robust, would
you mind sending a patch?
> > > > /* At most one packet might not fit in a single read, and this
> > > > * needs to be blocking.
> > > > */
> > > > - if (l2len > n) {
> > > > + if (l2len > (size_t)n) {
> > > > rem = recv(c->fd_tap, p + n, l2len - n, 0);
> > ^^^^^^^^^^^^^^^^
> >
> > This the reason why the check above is relevant.
>
> Relevant, sure, but I still don't think it's right. Actually
> (TAP_BUF_BYTES - n) is an even stranger quantity than I initially
> thought. It's the total space of the buffer minus the current partial
> frame - counting *both* the stuff before our partial frame and after
> it.
That should be enough to make sure we don't have bad side effects on
this second recv(), but yes:
> I think instead we need to check for (p + l2len > pkt_buf + TAP_BUF_BYTES).
...that is actually more accurate. I can fix this up in another
version, unless you can think of a more comprehensive change that also
gives us better possibilities to recover from a corrupted stream.
> > > > if ((n += rem) != l2len)
> > > > return;
> > >
> > > Pre-existing, but a 'return' here basically lands us in a situation we
> > > have no meaningful chance of recovering from. A die() would be
> > > preferable. Better yet would be continuing to re-recv() until we have
> > > the whole frame, similar to what we do for write_remainder().
> >
> > Same as above, it depends on what failure you're assuming. If it's just
> > one botched recv(), instead, we recv() again the next time and we
> > recover.
>
> Even if it's just one bad recv(), we still have no idea where we are
> w.r.t. frame boundaries, so I can't see any way we could recover.
The idea is that the recv() is _usually_ big enough as to flush the
buffer anyway.
> > But yes, the first attempt should probably be to recv() the rest of the
> > frame. I didn't implement this because it adds complexity and I think
> > that, eventually, we should turn this into a proper ringbuffer anyway.
> >
> > > > @@ -1028,8 +1031,7 @@ redo:
> > > > /* Complete the partial read above before discarding a malformed
> > > > * frame, otherwise the stream will be inconsistent.
> > > > */
> > > > - if (l2len < (ssize_t)sizeof(struct ethhdr) ||
> > > > - l2len > (ssize_t)ETH_MAX_MTU)
> > > > + if (l2len < sizeof(struct ethhdr) || l2len > ETH_MAX_MTU)
> > > > goto next;
> > > > tap_add_packet(c, l2len, p);
> >
>
--
Stefano
next prev parent reply other threads:[~2024-07-25 11:10 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-24 21:50 [PATCH 00/11] Minor assorted fixes, mostly logging and tests Stefano Brivio
2024-07-24 21:50 ` [PATCH 01/11] tap: Don't quit if pasta gets EIO on writev() to tap, interface might be down Stefano Brivio
2024-07-25 3:21 ` David Gibson
2024-07-24 21:50 ` [PATCH 02/11] tcp: Change SO_PEEK_OFF support message to debug() Stefano Brivio
2024-07-25 3:22 ` David Gibson
2024-07-24 21:50 ` [PATCH 03/11] log: Drop newlines in the middle of the perror()-like messages Stefano Brivio
2024-07-25 3:26 ` David Gibson
2024-07-25 11:27 ` Stefano Brivio
2024-07-26 0:33 ` David Gibson
2024-07-24 21:50 ` [PATCH 04/11] log: Fix sub-second part in relative log time calculation Stefano Brivio
2024-07-25 3:32 ` David Gibson
2024-07-25 7:51 ` Stefano Brivio
2024-07-24 21:50 ` [PATCH 05/11] log: Initialise timestamp for relative log time also if we use a log file Stefano Brivio
2024-07-25 3:35 ` David Gibson
2024-07-25 7:51 ` Stefano Brivio
2024-07-24 21:50 ` [PATCH 06/11] test: Fix memory/passt tests, --netns-only is not a valid option for passt Stefano Brivio
2024-07-25 3:50 ` David Gibson
2024-07-24 21:50 ` [PATCH 07/11] test: Update names of symbols and slabinfo entries Stefano Brivio
2024-07-25 3:54 ` David Gibson
2024-07-24 21:50 ` [PATCH 08/11] test: iperf3 3.16 introduces multiple threads, drop our own implementation of that Stefano Brivio
2024-07-25 4:23 ` David Gibson
2024-07-24 21:50 ` [PATCH 09/11] tap: Exit if we fail to bind a UNIX domain socket with explicit path Stefano Brivio
2024-07-25 4:00 ` David Gibson
2024-07-24 21:50 ` [PATCH 10/11] tap: Discard guest data on length descriptor mismatch Stefano Brivio
2024-07-25 4:37 ` David Gibson
2024-07-25 9:15 ` Stefano Brivio
2024-07-25 10:23 ` David Gibson
2024-07-25 11:09 ` Stefano Brivio [this message]
2024-07-26 1:22 ` David Gibson
2024-07-24 21:50 ` [PATCH 11/11] conf: Accept addresses enclosed by square brackets in port forwarding specifiers Stefano Brivio
2024-07-25 4:39 ` David Gibson
2024-07-25 11:26 ` Stefano Brivio
2024-07-25 11:28 ` [PATCH 00/11] Minor assorted fixes, mostly logging and tests Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240725130908.2ca21e7c@elisabeth \
--to=sbrivio@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).