From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id D246D5A0265 for ; Wed, 9 Nov 2022 11:16:34 +0100 (CET) Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4N6gqJ6Tdsz4xTt; Wed, 9 Nov 2022 21:16:28 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1667988988; bh=yglsTm+NmNF7OnvZJUmTQsrKKsMA7h8akYbx1d/7z1U=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=aK3MvHMOxaC5p3cBM0yvYgg8uPgAljllIdp+2ixeTX1z8PPlWaVoByrCUQ/Y0+h34 BeDgI5YGf2OHwS5zCgfxPZ0LrSXGMZGFGjLQh1oB3C5FxhNORdsZ5drO9jUJyM8wJ1 fBuVJiigoa56TuuTcts2GN4oYnX3v4v92yhDZxqc= Date: Wed, 9 Nov 2022 21:15:48 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH 1/2] tap: Keep stream consistent if qemu length descriptor spans two recv() calls Message-ID: References: <20221108085419.3220900-1-sbrivio@redhat.com> <20221108085419.3220900-2-sbrivio@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="zthasQgHVJxDwJf2" Content-Disposition: inline In-Reply-To: <20221108085419.3220900-2-sbrivio@redhat.com> Message-ID-Hash: DT7DFWKFTYJOIMMTTSSDMKGUX3WDNIOY X-Message-ID-Hash: DT7DFWKFTYJOIMMTTSSDMKGUX3WDNIOY X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.3 Precedence: list List-Id: Development discussion and patches for passt Archived-At: <> Archived-At: List-Archive: <> List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --zthasQgHVJxDwJf2 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Nov 08, 2022 at 09:54:18AM +0100, Stefano Brivio wrote: > I got all paranoid after triggering a divide-by-zero general > protection fault in passt with a qemu version without the virtio_net > TX hang fix, while flooding UDP. I start thinking this was actually > coming from some random changes I was playing with, but before > reaching this conclusion I reviewed once more the relatively short > path in tap_handler_passt() before we start using packet_*() > functions, and found this. >=20 > Never observed in practice, but artificially reproduced with changes > in qemu's socket interface: if we don't receive from qemu a complete > length descriptor in one recv() call, or if we receive a partial one > at the end of one call, we currently disregard the rest, which would > make the stream inconsistent. >=20 > Nothing really bad happens, except that from that point on we would > disregard all the packets we get until, if ever, we get the stream > back in sync by chance. >=20 > Force reading a complete packet length descriptor with a blocking > recv(), if needed -- not just a complete packet later. >=20 > Signed-off-by: Stefano Brivio This seems an ok short term fix, but I think we want another approach in the slightly longer term. Read on.. > --- > tap.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) >=20 > diff --git a/tap.c b/tap.c > index f8314ef..11ac732 100644 > --- a/tap.c > +++ b/tap.c > @@ -747,14 +747,26 @@ redo: > return -ECONNRESET; > } > =20 > - while (n > (ssize_t)sizeof(uint32_t)) { > - ssize_t len =3D ntohl(*(uint32_t *)p); > + while (n > 0) { > + ssize_t len; > + > + /* Force receiving at least a complete length descriptor to > + * avoid an inconsistent stream. > + */ Is it actually enough for this to be blocking? AFAICT, recv() on a stream socket, like read(), can return less data than you requested. > + if (n < (ssize_t)sizeof(uint32_t)) { > + rem =3D recv(c->fd_tap, p + n, > + (ssize_t)sizeof(uint32_t) - n, 0); > + if ((n +=3D rem) !=3D (ssize_t)sizeof(uint32_t)) > + return 0; > + } > + > + len =3D ntohl(*(uint32_t *)p); > =20 > p +=3D sizeof(uint32_t); > n -=3D sizeof(uint32_t); > =20 > /* At most one packet might not fit in a single read, and this > - * needs to be blocking. > + * also needs to be blocking. Same issue here (obviously not introduced by this patch, though). > */ > if (len > n) { > rem =3D recv(c->fd_tap, p + n, len - n, 0); Can we handle both these cases more neatly (and without blocking recv()) calls, if we maintain two pointers into pkt_buf. The first one tracks how much we've read from the qemu socket, the second tracks how much has been parsed into packets. When we get an epoll notification on the qemu socket, we recv() and advance the first pointer. Then we discern as many full packets as we can, advancing the second pointer. --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --zthasQgHVJxDwJf2 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEoULxWu4/Ws0dB+XtgypY4gEwYSIFAmNrfaYACgkQgypY4gEw YSJEDQ//RMYzthZMpq+fKJNxbZRcC39u6bxbDcROEPqPyQZs9uBmlHO/AAx5of3y tFoJkrTXmN914fYZMyKk0felUb8hWpAp78ZUA4AcYEPZq2+kPL4uGTKrgHD5bEqj j/oyE1xRKjk+LEijnTGJ6nk5DqDOG9M6X8ne8bHXoD6h2L7d3qnGvOdvSU6cGYQg buE/w36YU+iFfQxuF0Ja+uuq+EoB0ITlXRBf8mLBnNWyMi3TVWoaaxOzf1dqBjFP NZWM4aKa6b15z0cqIz47sumU5u4XRvVPRvSVfrHSQiXRjHeW5LThTtn2gloaoE3R eysShydebCH7/iWAvfIOMWEwdiLXhvzUecbN3aIBI3IWaiapeWnr04UnNZThKktx rUsXVl/1ge7AtVjfAeZ/MrFak8ODfZaRBd06OjB+Po32J4HqmlBQRx5okefG1skl BY4HhmeDAmsP4M2M94L0u8fjISrK1Ux2BGxjnrFRKe4scuWW7KlNqE1izxWH+duW NY0H5uwWLUVqMGo54+07+MP+efcDfMh9pOs42l5jojS2cU8WNHUaPozpEjRAYfzO wMPfbC6tf9nRtC4snoh49weju6f0+RGHWU46BeN3uMNoPFGKqHkEBrlr56opG+5z WTgZTl8MY0XNPnnDeat5ctnNKRNqgHVfcugMCPwKA74RbMmwnE4= =NKA7 -----END PGP SIGNATURE----- --zthasQgHVJxDwJf2--