public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: "Richard W.M. Jones" <rjones@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2 3/5] passt, tap: Add --fd option
Date: Thu, 17 Nov 2022 17:18:05 +0100	[thread overview]
Message-ID: <20221117171805.3746f53a@elisabeth> (raw)
In-Reply-To: <20221117160243.GH7636@redhat.com>

On Thu, 17 Nov 2022 16:02:43 +0000
"Richard W.M. Jones" <rjones@redhat.com> wrote:

> On Thu, Nov 17, 2022 at 04:49:31PM +0100, Stefano Brivio wrote:
> > On Thu, 17 Nov 2022 15:33:06 +0000
> > "Richard W.M. Jones" <rjones@redhat.com> wrote:
> >   
> > > On Thu, Nov 17, 2022 at 04:26:40PM +0100, Stefano Brivio wrote:  
> > > > From: "Richard W.M. Jones" <rjones@redhat.com>
> > > > 
> > > > This passes a fully connected stream socket to passt.
> > > > 
> > > > Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
> > > > [sbrivio: reuse fd_tap instead of adding a new descriptor,
> > > >  imply --one-off on --fd, add to optstring and usage()]
> > > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > > > ---
> > > > v2:
> > > >  - reuse fd_tap, we don't need a separate file descriptor
> > > >  - add F to optstring and usage(), for both passt and pasta
> > > >  - imply --one-off, we can't do much once the socket is closed
> > > > 
> > > > With this, the trick from 5/5 goes a bit further: passt reads
> > > > from the file descriptor passed by the wrapper.    
> > > 
> > > Thanks for the v2 .. I'll add it to my series and play with it.
> > >   
> > > > However, we get EPOLLRDHUP right away, from the close() on
> > > > one end of the socket pair I guess. Should we just ignore
> > > > all EPOLLRDHUP events, just the first one...?    
> > > 
> > > Does it see the event out-of-band or does it get an in-band
> > > read(2) == 0 after finishing reading the data?  
> > 
> > Out-of-band, so to speak: we won't even recv() if we get EPOLLRDHUP
> > (that's handled in tap_handler()). If I do this on top of this patch:
> > 
> > --- a/tap.c
> > +++ b/tap.c
> > @@ -1073,7 +1073,7 @@ void tap_sock_init(struct ctx *c)
> >                         struct epoll_event ev = { 0 };
> >  
> >                         ev.data.fd = c->fd_tap;
> > -                       ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP;
> > +                       ev.events = EPOLLIN | EPOLLET;
> > 
> > Then it gets those four bytes:
> > 
> > [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = 1
> > [pid 2538704] recvfrom(4, 0x560797677000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4
> > [pid 2538704] epoll_wait(5, [], 8, 1000) = 0
> > [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = -1 EINTR (Interrupted system call)
> > 
> > and does nothing with them, as expected. Two epoll_wait() calls later,
> > the syscall is interrupted, I'm not sure why and how we should react (in
> > main(), passt.c) in that case.  
> 
> With EPOLLRDHUP removed it's a bit deadlock-y.  One case I commonly
> see is:
> 
> child (qemu):
> 
> 1295637 write(5, "\0\0\0\0", 4 <unfinished ...>
> 1295637 <... write resumed>)            = 4
> 1295637 poll([{fd=5, events=POLLIN|POLLOUT}], 1, -1 <unfinished ...>
> 1295637 <... poll resumed>)             = 1 ([{fd=5, revents=POLLOUT}])
> 1295637 read(3,  <unfinished ...>
> 1295637 <... read resumed>"", 512)      = 0
> 1295637 shutdown(5, SHUT_WR <unfinished ...>
> 1295637 <... shutdown resumed>)         = 0
> 1295637 poll([{fd=5, events=POLLIN}], 1, -1 <unfinished ...>
> 
> then later the parent (passt):
> 
> 1295636 epoll_wait(5, 0x7ffe93a894e0, 8, 1000) = 1
> 1295636 recvfrom(4, 0x5576383cf000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> 1295636 epoll_wait(5, [], 8, 1000)      = 0
> (forever)

Ah, yes, we ignore EPOLLRDHUP, and there's a one-second timeout there,
so it's actually expected.

> Removing EPOLLET (edge triggered) delivers the EPOLLIN event over and
> over again to passt as expected:
> 
> 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0
> 1299436 epoll_wait(5, 0x7ffdd8a65640, 8, 1000) = 1
> 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0
> 1299436 epoll_wait(5, 0x7ffdd8a65640, 8, 1000) = 1
> 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0
> (forever)

I'm not sure that's EPOLLIN, it could be anything (we don't check
explicitly).

> but I expected that passt would exit the first time it reads 0 from
> the socket.
> 
> tap_handler_passt() seems like it only considers the n<0 (error) and
> n>0 (data) cases.  What do you think about checking for n == 0 and  
> exiting immediately if c->one_off is true?

I think it's reasonable in any case. At the same time, we don't
necessarily need to ignore EPOLLRDHUP, we could also simply try to read
from the socket first, that is, move this:

	if ((c->mode == MODE_PASST && tap_handler_passt(c, now)) ||
	    (c->mode == MODE_PASTA && tap_handler_pasta(c, now)))
		goto reinit;

before this:

	if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR))
		goto reinit;

and we'll just get EBADF on recv(). If we have data to read, we won't
necessarily get a EPOLLRDHUP (or EPOLLHUP) and EPOLLIN separately, so
trying to read first makes sense for the case you described.

An explicit check on EPOLLIN is probably a good idea too: read if we
have anything to read, first, then quit if we have a reason to do so.

-- 
Stefano


  reply	other threads:[~2022-11-17 16:18 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-17 12:26 [PATCH passt 0/5] Add fuzzing Richard W.M. Jones
2022-11-17 12:26 ` [PATCH passt 1/5] build: Force-create pasta symlink Richard W.M. Jones
2022-11-17 12:26 ` [PATCH passt 2/5] build: Remove *~ files with make clean Richard W.M. Jones
2022-11-17 12:26 ` [PATCH passt 3/5] Add --fd option Richard W.M. Jones
2022-11-17 15:26   ` [PATCH v2 3/5] passt, tap: " Stefano Brivio
2022-11-17 15:31     ` Stefano Brivio
2022-11-17 15:33       ` Richard W.M. Jones
2022-11-17 15:33     ` Richard W.M. Jones
2022-11-17 15:49       ` Stefano Brivio
2022-11-17 16:02         ` Richard W.M. Jones
2022-11-17 16:18           ` Stefano Brivio [this message]
2022-11-17 12:26 ` [PATCH passt 4/5] XXX build: Add extra syscalls needed by AFL instrumentation Richard W.M. Jones
2022-11-17 14:22   ` Stefano Brivio
2022-11-17 12:26 ` [PATCH passt 5/5] Import fuzzing wrapper from libnbd Richard W.M. Jones
2022-11-17 15:35   ` Stefano Brivio
2022-11-17 13:58 ` [PATCH passt 0/5] Add fuzzing Stefano Brivio
2022-11-17 14:13   ` Richard W.M. Jones

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221117171805.3746f53a@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=passt-dev@passt.top \
    --cc=rjones@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).