public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 1/2] tap: Keep stream consistent if qemu length descriptor spans two recv() calls
Date: Thu, 10 Nov 2022 13:59:38 +0100	[thread overview]
Message-ID: <20221110135938.27d23884@elisabeth> (raw)
In-Reply-To: <20221109112929.0ee6c107@elisabeth>

On Wed, 9 Nov 2022 11:29:29 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:

> On Wed, 9 Nov 2022 21:15:48 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Nov 08, 2022 at 09:54:18AM +0100, Stefano Brivio wrote:  
> > > I got all paranoid after triggering a divide-by-zero general
> > > protection fault in passt with a qemu version without the virtio_net
> > > TX hang fix, while flooding UDP. I start thinking this was actually
> > > coming from some random changes I was playing with, but before
> > > reaching this conclusion I reviewed once more the relatively short
> > > path in tap_handler_passt() before we start using packet_*()
> > > functions, and found this.
> > > 
> > > Never observed in practice, but artificially reproduced with changes
> > > in qemu's socket interface: if we don't receive from qemu a complete
> > > length descriptor in one recv() call, or if we receive a partial one
> > > at the end of one call, we currently disregard the rest, which would
> > > make the stream inconsistent.
> > > 
> > > Nothing really bad happens, except that from that point on we would
> > > disregard all the packets we get until, if ever, we get the stream
> > > back in sync by chance.
> > > 
> > > Force reading a complete packet length descriptor with a blocking
> > > recv(), if needed -- not just a complete packet later.
> > > 
> > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>    
> > 
> > This seems an ok short term fix, but I think we want another approach
> > in the slightly longer term.  Read on..
> >   
> > > ---
> > >  tap.c | 18 +++++++++++++++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/tap.c b/tap.c
> > > index f8314ef..11ac732 100644
> > > --- a/tap.c
> > > +++ b/tap.c
> > > @@ -747,14 +747,26 @@ redo:
> > >  		return -ECONNRESET;
> > >  	}
> > >  
> > > -	while (n > (ssize_t)sizeof(uint32_t)) {
> > > -		ssize_t len = ntohl(*(uint32_t *)p);
> > > +	while (n > 0) {
> > > +		ssize_t len;
> > > +
> > > +		/* Force receiving at least a complete length descriptor to
> > > +		 * avoid an inconsistent stream.
> > > +		 */    
> > 
> > Is it actually enough for this to be blocking?  AFAICT, recv() on a
> > stream socket, like read(), can return less data than you requested.  
> 
> It's not enough, hence the check on 'rem' afterwards, and this doesn't
> cover anyway the case were qemu would decide to send one byte at a
> time (because as you pointed out blocking doesn't mean we'll get the
> full amount requested), which never happens in practice, though.
> 
> > > +		if (n < (ssize_t)sizeof(uint32_t)) {
> > > +			rem = recv(c->fd_tap, p + n,
> > > +				   (ssize_t)sizeof(uint32_t) - n, 0);
> > > +			if ((n += rem) != (ssize_t)sizeof(uint32_t))
> > > +				return 0;
> > > +		}
> > > +
> > > +		len = ntohl(*(uint32_t *)p);
> > >  
> > >  		p += sizeof(uint32_t);
> > >  		n -= sizeof(uint32_t);
> > >  
> > >  		/* At most one packet might not fit in a single read, and this
> > > -		 * needs to be blocking.
> > > +		 * also needs to be blocking.    
> > 
> > Same issue here (obviously not introduced by this patch, though).  
> 
> Same here.
> 
> > >  		 */
> > >  		if (len > n) {
> > >  			rem = recv(c->fd_tap, p + n, len - n, 0);    
> > 
> > Can we handle both these cases more neatly (and without blocking
> > recv()) calls, if we maintain two pointers into pkt_buf.  The first
> > one tracks how much we've read from the qemu socket, the second tracks
> > how much has been parsed into packets.  When we get an epoll
> > notification on the qemu socket, we recv() and advance the first
> > pointer.  Then we discern as many full packets as we can, advancing
> > the second pointer.  
> 
> Yes, and I actually drafted something like that, but it takes a lot of
> attention and time to get it right, so I preferred to keep it simple
> until now. I can file a ticket as enhancement.

https://bugs.passt.top/show_bug.cgi?id=38

-- 
Stefano


  reply	other threads:[~2022-11-10 12:59 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-08  8:54 [PATCH 0/2] Try harder to avoid inconsistent qemu packet stream Stefano Brivio
2022-11-08  8:54 ` [PATCH 1/2] tap: Keep stream consistent if qemu length descriptor spans two recv() calls Stefano Brivio
2022-11-09 10:15   ` David Gibson
2022-11-09 10:29     ` Stefano Brivio
2022-11-10 12:59       ` Stefano Brivio [this message]
2022-11-08  8:54 ` [PATCH 2/2] tap: Return -EIO from tap_handler_passt() on inconsistent packet stream Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221110135938.27d23884@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).