public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: passt-dev@passt.top, Stefano Brivio <sbrivio@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH 3/5] tap: Don't use EPOLLET on Qemu sockets
Date: Fri, 26 Jul 2024 17:20:29 +1000	[thread overview]
Message-ID: <20240726072031.3941305-4-david@gibson.dropbear.id.au> (raw)
In-Reply-To: <20240726072031.3941305-1-david@gibson.dropbear.id.au>

Currently we set EPOLLET (edge trigger) on the epoll flags for the
connected Qemu Unix socket.  It's not clear that there's a reason for
doing this: for TCP sockets we need to use EPOLLET, because we leave data
in the socket buffers for our flow control handling.  That consideration
doesn't apply to the way we handle the qemu socket however.

Furthermore, using EPOLLET causes additional complications:

1) We don't set EPOLLET when opening /dev/net/tun for pasta mode, however
   we *do* set it when using pasta mode with --fd.  This inconsistency
   doesn't seem to have broken anything, but it's odd.

2) EPOLLET requires that tap_handler_passt() loop until all data available
   is read (otherwise we may have data in the buffer but never get an event
   causing us to read it).  We do that with a rather ugly goto.

   Worse, our condition for that goto appears to be incorrect.  We'll only
   loop if rem is non-zero, which will only happen if we perform a blocking
   recv() for a partially received frame.  We'll only perform that second
   recv() if the original recv() resulted in a partially read frame.  As
   far as I can tell the original recv() could end on a frame boundary
   (never triggering the second recv()) even if there is additional data in
   the socket buffer.  In that circumstance we wouldn't goto redo and could
   leave unprocessed frames in the qemu socket buffer indefinitely.

   This doesn't seem to have caused any problems in practice, but since
   there's no obvious reason to use EPOLLET here anyway, we might as well
   get rid of it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tap.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/tap.c b/tap.c
index 08700f65..df1287ad 100644
--- a/tap.c
+++ b/tap.c
@@ -989,7 +989,7 @@ static void tap_sock_reset(struct ctx *c)
 void tap_handler_passt(struct ctx *c, uint32_t events,
 		       const struct timespec *now)
 {
-	ssize_t n, rem;
+	ssize_t n;
 	char *p;
 
 	if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) {
@@ -997,9 +997,7 @@ void tap_handler_passt(struct ctx *c, uint32_t events,
 		return;
 	}
 
-redo:
 	p = pkt_buf;
-	rem = 0;
 
 	tap_flush_pools();
 
@@ -1028,7 +1026,7 @@ redo:
 		 * needs to be blocking.
 		 */
 		if (l2len > n) {
-			rem = recv(c->fd_tap, p + n, l2len - n, 0);
+			ssize_t rem = recv(c->fd_tap, p + n, l2len - n, 0);
 			if ((n += rem) != l2len)
 				return;
 		}
@@ -1040,10 +1038,6 @@ redo:
 	}
 
 	tap_handler(c, now);
-
-	/* We can't use EPOLLET otherwise. */
-	if (rem)
-		goto redo;
 }
 
 /**
@@ -1228,7 +1222,7 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
 		trace("tap: failed to set SO_SNDBUF to %i", v);
 
 	ref.fd = c->fd_tap;
-	ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP;
+	ev.events = EPOLLIN | EPOLLRDHUP;
 	ev.data.u64 = ref.u64;
 	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
 }
@@ -1317,7 +1311,7 @@ void tap_sock_init(struct ctx *c)
 		else
 			ref.type = EPOLL_TYPE_TAP_PASTA;
 
-		ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP;
+		ev.events = EPOLLIN | EPOLLRDHUP;
 		ev.data.u64 = ref.u64;
 		epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
 		return;
-- 
@@ -989,7 +989,7 @@ static void tap_sock_reset(struct ctx *c)
 void tap_handler_passt(struct ctx *c, uint32_t events,
 		       const struct timespec *now)
 {
-	ssize_t n, rem;
+	ssize_t n;
 	char *p;
 
 	if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) {
@@ -997,9 +997,7 @@ void tap_handler_passt(struct ctx *c, uint32_t events,
 		return;
 	}
 
-redo:
 	p = pkt_buf;
-	rem = 0;
 
 	tap_flush_pools();
 
@@ -1028,7 +1026,7 @@ redo:
 		 * needs to be blocking.
 		 */
 		if (l2len > n) {
-			rem = recv(c->fd_tap, p + n, l2len - n, 0);
+			ssize_t rem = recv(c->fd_tap, p + n, l2len - n, 0);
 			if ((n += rem) != l2len)
 				return;
 		}
@@ -1040,10 +1038,6 @@ redo:
 	}
 
 	tap_handler(c, now);
-
-	/* We can't use EPOLLET otherwise. */
-	if (rem)
-		goto redo;
 }
 
 /**
@@ -1228,7 +1222,7 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
 		trace("tap: failed to set SO_SNDBUF to %i", v);
 
 	ref.fd = c->fd_tap;
-	ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP;
+	ev.events = EPOLLIN | EPOLLRDHUP;
 	ev.data.u64 = ref.u64;
 	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
 }
@@ -1317,7 +1311,7 @@ void tap_sock_init(struct ctx *c)
 		else
 			ref.type = EPOLL_TYPE_TAP_PASTA;
 
-		ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP;
+		ev.events = EPOLLIN | EPOLLRDHUP;
 		ev.data.u64 = ref.u64;
 		epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
 		return;
-- 
2.45.2


  parent reply	other threads:[~2024-07-26  7:20 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-26  7:20 [PATCH 0/5] Fix assorted errors in the Qemu socket tap receive handler David Gibson
2024-07-26  7:20 ` [PATCH 1/5] tap: Better report errors receiving from QEMU socket David Gibson
2024-07-26 11:25   ` Stefano Brivio
2024-07-26 11:50     ` Stefano Brivio
2024-07-26 12:02     ` David Gibson
2024-07-26  7:20 ` [PATCH 2/5] tap: Don't attempt to carry on if we get a bad frame length from qemu David Gibson
2024-07-26 11:26   ` Stefano Brivio
2024-07-26  7:20 ` David Gibson [this message]
2024-07-26  8:00   ` [PATCH 3/5] tap: Don't use EPOLLET on Qemu sockets Stefano Brivio
2024-07-26 10:44     ` Stefano Brivio
2024-07-26 12:12     ` David Gibson
2024-07-26 13:25       ` Stefano Brivio
2024-07-29  1:15         ` David Gibson
2024-07-26  7:20 ` [PATCH 4/5] tap: Correctly handle frames of odd length David Gibson
2024-07-26  7:20 ` [PATCH 5/5] tap: Improve handling of partially received frames on qemu socket David Gibson
2024-07-26 11:39   ` Stefano Brivio
2024-07-26 12:33     ` David Gibson
2024-07-26 13:19 ` [PATCH 0/5] Fix assorted errors in the Qemu socket tap receive handler Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240726072031.3941305-4-david@gibson.dropbear.id.au \
    --to=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).