On Thu, Jun 27, 2024 at 10:21:04AM +0200, Stefano Brivio wrote: > On Thu, 27 Jun 2024 11:30:02 +1000 > David Gibson wrote: > > > On Thu, Jun 27, 2024 at 01:45:36AM +0200, Stefano Brivio wrote: > > > This was reported by Matej a while ago, but we forgot to fix it. Even > > > if the hypervisor is necessarily trusted by passt, as it can in any > > > case terminate the guest or disrupt guest connectivity, it's a good > > > idea to be robust against possible issues. > > > > > > Instead of resetting the connection to the hypervisor, just skip the > > > offending frame, as we had a few cases where QEMU would get the > > > length descriptor wrong, in the past. > > > > > > Reported-by: Matej Hrica > > > Suggested-by: Matej Hrica > > > Signed-off-by: Stefano Brivio > > > --- > > > tap.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/tap.c b/tap.c > > > index 7f8c26d..bb993e0 100644 > > > --- a/tap.c > > > +++ b/tap.c > > > @@ -1026,6 +1026,9 @@ redo: > > > > So.. I just realised there's a different pre-existing problem here, > > above what's quoted in the patch we have: > > > > ssize_t l2len = ntohl(*(uint32_t *)p); > > > > On a platform where ssize_t is only 32-bits we could get a negative > > value here, which would be very bad. > > We can't, because the length is anyway limited to the maximum IPv4 path > MTU in any hypervisor we might be talking to. Only if we trust the hypervisor. And that the user didn't misconfigure us to point the socket at something that's not actually a hypervisor. And that it's not some future version of the hypervisor configured to use a different framing format. And that our code is robust enough to never get out of sync on the stream. I really think it's better to read this into a u32, and range sanity check it before we do anything else. > > So l2len should be a uint32_t, not ssize_t. > > True, I can also change that while at it. > > > We do then need to make sure that the comparison between > > l2len and n is unsigned - it's safe to cast n to size_t there, because > > we've verified it's positive as the loop condition. > > > > Or... maybe it's simpler. The frame length is encoded as 32-bits, but > > we can't meaningfully have frames above 64k (maybe 64k+ETH_HLEN). So > > possibly we should just reset the tap connection if we see such a > > frame (most likely it means we've somehow gotten out of sync, anyway). > > I'd rather "fix" type and comparison, for the sake of whatever static > checkers might eventually come up with. > > > > p += sizeof(uint32_t); > > > n -= sizeof(uint32_t); > > > > > > > + if (l2len > (ssize_t)TAP_BUF_BYTES - n) > > > > I hate to discard valid frames from the guest. > > That won't happen because of how TAP_BUF_FILL and TAP_BUF_BYTES are > defined. If this condition is true, the length descriptor actually > mismatched. If you have a better proposal, let me know. Hmm.. I'll have a look. > > > + goto next; > > > > ..and this is not safe. This skips (l2len > n) check, which means > > that the n -= l2len at next could now have a signed overflow, which is > > UB. > > It's safe because of the change from 3/4, which will just return on > overflow. In any case, yes, I can add a return directly, it's not very > productive to speculate what kind of issues a hypervisor might have, > let's just avoid crashing in case. > > > > + > > > /* At most one packet might not fit in a single read, and this > > > * needs to be blocking. > > > */ > > > -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson