From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202512 header.b=GkC/lVX7; dkim-atps=neutral Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 486AE5A004E for ; Sat, 27 Dec 2025 05:06:08 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202512; t=1766808365; bh=nuswDMZ9kFMZn/PSw2pRivGaQ2YjE7KpwthbbkwrUNU=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=GkC/lVX7mO7HA83nhxqALEph4mQQFgGfIv15EiJ/NLwFBHY8rJfmkG+FvVQGCcYal MtKtR4I2xDT+cefJ2r3T9pNcYH+40wYM12majGPvDMZ2zI/OWilkUEAyFMYmldxntS uC/RUqnmo3CsvSmyYb+heIAVM8GSztMpqawBOZXRVrxs/hXoNh62A9XlWwIhWPUXLV nnWWwnAABXV+edhTZEBLt/FsQXFuX+IRGPDR82R13+oQJvuk5NXYvPGEHndMSHG5yw voPX6V9ir5ZIqdBWyn0ToMAm5Us/TBPcQCT4C68rLv+/oWXY00wIF3UGxBH6umeBun h+Hu6yZBkbilQ== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4ddTRx6fwRz4wCm; Sat, 27 Dec 2025 15:06:05 +1100 (AEDT) Date: Sat, 27 Dec 2025 15:05:41 +1100 From: David Gibson To: Laurent Vivier Subject: Re: [PATCH 6/7] flow: Compute epoll events inside flow_epoll_set() Message-ID: References: <20251219164518.930012-1-lvivier@redhat.com> <20251219164518.930012-7-lvivier@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="cjP9R0QdeXTYI3Cm" Content-Disposition: inline In-Reply-To: <20251219164518.930012-7-lvivier@redhat.com> Message-ID-Hash: KDDMFFPSW6INI4OWEUSTNAKE5STDTTMF X-Message-ID-Hash: KDDMFFPSW6INI4OWEUSTNAKE5STDTTMF X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --cjP9R0QdeXTYI3Cm Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Dec 19, 2025 at 05:45:17PM +0100, Laurent Vivier wrote: > Rather than having each caller compute the appropriate epoll events and > pass them to flow_epoll_set(), move the event computation into > flow_epoll_set() itself. The function already switches on epoll_type to > determine other parameters, so computing events there is a natural fit. I'm not sure about this. Yes, the switch already exists. But, adding more protocol specific logic - and more access to the protocol specific flow entrie from flow_epoll_set() seems like a layering violation. I feel like it belongs better with the rest of the protocol specific logic. Maybe there's a sufficient advantage to putting it in flow_epollset() to outweigh that, but it's not obvious to me from this commit message. > Move tcp_conn_epoll_events() from tcp.c and tcp_splice_conn_epoll_events() > from tcp_splice.c into flow.c where they can be called internally. For > simpler cases (PING, UDP, TCP_TIMER), the events are constant values that > are now set directly in the switch cases. >=20 > This centralizes all epoll event logic in one place, making it easier to > understand and maintain the relationship between flow types and their > epoll configurations. Yeah... I'm not convinced about this. >=20 > Signed-off-by: Laurent Vivier > --- > flow.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++--- > flow.h | 2 +- > icmp.c | 2 +- > tcp.c | 43 ++------------------------ > tcp_splice.c | 35 ++------------------- > udp_flow.c | 2 +- > 6 files changed, 90 insertions(+), 80 deletions(-) >=20 > diff --git a/flow.c b/flow.c > index a3f75015a078..6b9ccca6ef50 100644 > --- a/flow.c > +++ b/flow.c > @@ -391,21 +391,77 @@ void flow_epollid_clear(struct flow_common *f) > f->epollid =3D EPOLLFD_ID_INVALID; > } > =20 > +/** > + * tcp_splice_conn_epoll_events() - epoll events masks for given state > + * @events: Connection event flags > + * @ev: Events to fill in, 0 is accepted socket, 1 is connecting socket > + */ > +static uint32_t tcp_splice_conn_epoll_events(uint16_t events, unsigned s= idei) > +{ > + uint32_t e =3D 0; > + > + if (events & SPLICE_ESTABLISHED) { > + if (!(events & FIN_SENT(!sidei))) > + e =3D EPOLLIN | EPOLLRDHUP; > + } else if (sidei =3D=3D 1 && events & SPLICE_CONNECT) { > + e =3D EPOLLOUT; > + } > + > + if (events & OUT_WAIT(sidei)) > + e |=3D EPOLLOUT; > + if (events & OUT_WAIT(!sidei)) > + e &=3D ~EPOLLIN; > + > + return e; > +} > + > +/** > + * tcp_conn_epoll_events() - epoll events mask for given connection state > + * @events: Current connection events > + * @conn_flags: Connection flags > + * > + * Return: epoll events mask corresponding to implied connection state > + */ > +static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) > +{ > + if (!events) > + return 0; > + > + if (events & ESTABLISHED) { > + if (events & TAP_FIN_SENT) > + return EPOLLET; > + > + if (conn_flags & STALLED) { > + if (conn_flags & ACK_FROM_TAP_BLOCKS) > + return EPOLLRDHUP | EPOLLET; > + > + return EPOLLIN | EPOLLRDHUP | EPOLLET; > + } > + > + return EPOLLIN | EPOLLRDHUP; > + } > + > + if (events =3D=3D TAP_SYN_RCVD) > + return EPOLLOUT | EPOLLET | EPOLLRDHUP; > + > + return EPOLLET | EPOLLRDHUP; > +} > + > /** > * flow_epoll_set() - Add or modify epoll registration for a flow socket > * @type: epoll type > * @f: Flow to register socket for > - * @events: epoll events to watch for > * @fd: File descriptor to register > * @sidei: Side index of the flow > * > * Return: 0 on success, -1 on error (from epoll_ctl()) > */ > int flow_epoll_set(enum epoll_type type, const struct flow_common *f, > - uint32_t events, int fd, unsigned int sidei) > + int fd, unsigned int sidei) > { > struct epoll_event ev; > union epoll_ref ref; > + uint32_t events; > int epollfd; > int m; > =20 > @@ -413,8 +469,24 @@ int flow_epoll_set(enum epoll_type type, const struc= t flow_common *f, > ref.type =3D type; > =20 > switch (type) { > - case EPOLL_TYPE_TCP_SPLICE: > - case EPOLL_TYPE_TCP: > + case EPOLL_TYPE_TCP_SPLICE: { > + const struct tcp_splice_conn *conn =3D &((union flow *)f)->tcp_splice; > + > + ref.flowside =3D flow_sidx(f, sidei); > + if (flow_in_epoll(f)) { > + m =3D EPOLL_CTL_MOD; > + epollfd =3D flow_epollfd(f); > + } else { > + m =3D EPOLL_CTL_ADD; > + epollfd =3D epoll_id_to_fd[EPOLLFD_ID_DEFAULT]; > + } > + > + events =3D tcp_splice_conn_epoll_events(conn->events, sidei); > + break; > + } > + case EPOLL_TYPE_TCP: { > + const struct tcp_tap_conn *conn =3D &((union flow *)f)->tcp; > + > ref.flowside =3D flow_sidx(f, sidei); > if (flow_in_epoll(f)) { > m =3D EPOLL_CTL_MOD; > @@ -423,19 +495,24 @@ int flow_epoll_set(enum epoll_type type, const stru= ct flow_common *f, > m =3D EPOLL_CTL_ADD; > epollfd =3D epoll_id_to_fd[EPOLLFD_ID_DEFAULT]; > } > + > + events =3D tcp_conn_epoll_events(conn->events, conn->flags); > break; > + } > case EPOLL_TYPE_TCP_TIMER: { > const struct tcp_tap_conn *conn =3D &((union flow *)f)->tcp; > =20 > ref.flow =3D flow_idx(f); > m =3D conn->timer =3D=3D -1 ? EPOLL_CTL_ADD : EPOLL_CTL_MOD; > epollfd =3D flow_epollfd(f); > + events =3D EPOLLIN | EPOLLET; > break; > } > case EPOLL_TYPE_PING: > ref.flowside =3D flow_sidx(f, sidei); > m =3D EPOLL_CTL_ADD; > epollfd =3D flow_epollfd(f); > + events =3D EPOLLIN; > break; > case EPOLL_TYPE_UDP: { > union { > @@ -446,6 +523,7 @@ int flow_epoll_set(enum epoll_type type, const struct= flow_common *f, > ref.data =3D fref.data; > m =3D EPOLL_CTL_ADD; > epollfd =3D flow_epollfd(f); > + events =3D EPOLLIN; > break; > } > default: > diff --git a/flow.h b/flow.h > index 9740ede8963e..ed1e16139a4d 100644 > --- a/flow.h > +++ b/flow.h > @@ -266,7 +266,7 @@ int flow_epollfd(const struct flow_common *f); > void flow_epollid_set(struct flow_common *f, int epollid); > void flow_epollid_clear(struct flow_common *f); > int flow_epoll_set(enum epoll_type type, const struct flow_common *f, > - uint32_t events, int fd, unsigned int sidei); > + int fd, unsigned int sidei); > void flow_epollid_register(int epollid, int epollfd); > void flow_defer_handler(const struct ctx *c, const struct timespec *now); > int flow_migrate_source_early(struct ctx *c, const struct migrate_stage = *stage, > diff --git a/icmp.c b/icmp.c > index f0b6fcff1776..5487a904117d 100644 > --- a/icmp.c > +++ b/icmp.c > @@ -210,7 +210,7 @@ static struct icmp_ping_flow *icmp_ping_new(const str= uct ctx *c, > goto cancel; > =20 > flow_epollid_set(&pingf->f, EPOLLFD_ID_DEFAULT); > - if (flow_epoll_set(EPOLL_TYPE_PING, &pingf->f, EPOLLIN, pingf->sock, > + if (flow_epoll_set(EPOLL_TYPE_PING, &pingf->f, pingf->sock, > TGTSIDE) < 0) { > close(pingf->sock); > goto cancel; > diff --git a/tcp.c b/tcp.c > index ec4f6d2ecc04..2e8a7d516273 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -489,38 +489,6 @@ int tcp_set_peek_offset(const struct tcp_tap_conn *c= onn, int offset) > return 0; > } > =20 > -/** > - * tcp_conn_epoll_events() - epoll events mask for given connection state > - * @events: Current connection events > - * @conn_flags: Connection flags > - * > - * Return: epoll events mask corresponding to implied connection state > - */ > -static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) > -{ > - if (!events) > - return 0; > - > - if (events & ESTABLISHED) { > - if (events & TAP_FIN_SENT) > - return EPOLLET; > - > - if (conn_flags & STALLED) { > - if (conn_flags & ACK_FROM_TAP_BLOCKS) > - return EPOLLRDHUP | EPOLLET; > - > - return EPOLLIN | EPOLLRDHUP | EPOLLET; > - } > - > - return EPOLLIN | EPOLLRDHUP; > - } > - > - if (events =3D=3D TAP_SYN_RCVD) > - return EPOLLOUT | EPOLLET | EPOLLRDHUP; > - > - return EPOLLET | EPOLLRDHUP; > -} > - > /** > * tcp_epoll_ctl() - Add/modify/delete epoll state from connection events > * @c: Execution context > @@ -530,8 +498,6 @@ static uint32_t tcp_conn_epoll_events(uint8_t events,= uint8_t conn_flags) > */ > static int tcp_epoll_ctl(const struct ctx *c, struct tcp_tap_conn *conn) > { > - uint32_t events; > - > if (conn->events =3D=3D CLOSED) { > int epollfd =3D flow_in_epoll(&conn->f) ? flow_epollfd(&conn->f) : c->= epollfd; > if (flow_in_epoll(&conn->f)) > @@ -541,9 +507,7 @@ static int tcp_epoll_ctl(const struct ctx *c, struct = tcp_tap_conn *conn) > return 0; > } > =20 > - events =3D tcp_conn_epoll_events(conn->events, conn->flags); > - > - if (flow_epoll_set(EPOLL_TYPE_TCP, &conn->f, events, conn->sock, > + if (flow_epoll_set(EPOLL_TYPE_TCP, &conn->f, conn->sock, > !TAPSIDE(conn))) > return -errno; > =20 > @@ -551,7 +515,7 @@ static int tcp_epoll_ctl(const struct ctx *c, struct = tcp_tap_conn *conn) > =20 > if (conn->timer !=3D -1) { > if (flow_epoll_set(EPOLL_TYPE_TCP_TIMER, &conn->f, > - EPOLLIN | EPOLLET, conn->timer, 0)) > + conn->timer, 0)) > return -errno; > } > =20 > @@ -582,8 +546,7 @@ static void tcp_timer_ctl(const struct ctx *c, struct= tcp_tap_conn *conn) > return; > } > =20 > - if (flow_epoll_set(EPOLL_TYPE_TCP_TIMER, &conn->f, > - EPOLLIN | EPOLLET, fd, 0)) { > + if (flow_epoll_set(EPOLL_TYPE_TCP_TIMER, &conn->f, fd, 0)) { > flow_dbg_perror(conn, "failed to add timer"); > close(fd); > return; > diff --git a/tcp_splice.c b/tcp_splice.c > index ccf627f4427d..1eeb678d534b 100644 > --- a/tcp_splice.c > +++ b/tcp_splice.c > @@ -109,30 +109,6 @@ static struct tcp_splice_conn *conn_at_sidx(flow_sid= x_t sidx) > return &flow->tcp_splice; > } > =20 > -/** > - * tcp_splice_conn_epoll_events() - epoll events masks for given state > - * @events: Connection event flags > - * @ev: Events to fill in, 0 is accepted socket, 1 is connecting socket > - */ > -static uint32_t tcp_splice_conn_epoll_events(uint16_t events, unsigned s= idei) > -{ > - uint32_t e =3D 0; > - > - if (events & SPLICE_ESTABLISHED) { > - if (!(events & FIN_SENT(!sidei))) > - e =3D EPOLLIN | EPOLLRDHUP; > - } else if (sidei =3D=3D 1 && events & SPLICE_CONNECT) { > - e =3D EPOLLOUT; > - } > - > - if (events & OUT_WAIT(sidei)) > - e |=3D EPOLLOUT; > - if (events & OUT_WAIT(!sidei)) > - e &=3D ~EPOLLIN; > - > - return e; > -} > - > /** > * tcp_splice_epoll_ctl() - Add/modify/delete epoll state from connectio= n events > * @conn: Connection pointer > @@ -141,15 +117,8 @@ static uint32_t tcp_splice_conn_epoll_events(uint16_= t events, unsigned sidei) > */ > static int tcp_splice_epoll_ctl(struct tcp_splice_conn *conn) > { > - uint32_t events[2]; > - > - events[0] =3D tcp_splice_conn_epoll_events(conn->events, 0); > - events[1] =3D tcp_splice_conn_epoll_events(conn->events, 1); > - > - if (flow_epoll_set(EPOLL_TYPE_TCP_SPLICE, &conn->f, events[0], > - conn->s[0], 0) || > - flow_epoll_set(EPOLL_TYPE_TCP_SPLICE, &conn->f, events[1], > - conn->s[1], 1)) { > + if (flow_epoll_set(EPOLL_TYPE_TCP_SPLICE, &conn->f, conn->s[0], 0) || > + flow_epoll_set(EPOLL_TYPE_TCP_SPLICE, &conn->f, conn->s[1], 1)) { > int ret =3D -errno; > flow_perror(conn, "ERROR on epoll_ctl()"); > return ret; > diff --git a/udp_flow.c b/udp_flow.c > index 07ff589670c1..afacfb356261 100644 > --- a/udp_flow.c > +++ b/udp_flow.c > @@ -84,7 +84,7 @@ static int udp_flow_sock(const struct ctx *c, > } > =20 > flow_epollid_set(&uflow->f, EPOLLFD_ID_DEFAULT); > - rc =3D flow_epoll_set(EPOLL_TYPE_UDP, &uflow->f, EPOLLIN, s, sidei); > + rc =3D flow_epoll_set(EPOLL_TYPE_UDP, &uflow->f, s, sidei); > if (rc < 0) { > close(s); > return rc; > --=20 > 2.51.1 >=20 --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --cjP9R0QdeXTYI3Cm Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmlPWwoACgkQzQJF27ox 2Gcc0w//dWnAflLmTDvkdA7I6DdVJNAwu1gRKZd4+iCiqjbxVBWw9meGvFkUteF1 XnCQOkXdrWYlbWgG9gSFTBXA8CyMcfRhg0FMvoUb360y49GIEsXG/nTwLWIH98kt DThn74ArsTbAEj6CLw1JFX8BZ3jFMe5tLjpkEaUugbkPYwTE+uASm4KdngAnscf1 KK2RSGnnIopJVQnBy2wDH3lsdIQFiBmb9cyKTB/TAA41OuiRPqop0i+yO7sKnn1g RZW2IFb9KSpfeJQALONcD8e9YR5b+F4lywKOWAiahDtwTh47NWuvORDj1c2pHBt9 hD1lV8tHK+b/VnSu3Az4Gkkf9mA9CFrub//1vyxFw/4bF21Tug7bb9bPEVkOhRwE g5YIuuMvTuiT1YL3bnFfHA6muV7vDaP0Q84+d2bZ8qMHt8MFoIgOAksRpa5qCVwB qgPlzdIOxNKp2K726eEyiWwCNuPOYKmWvm0voqvffSOUj3d2EVNna5q7Tnd/AJ4Y YjtFh1wHrlhH1nMLb0oTde5eGnIMJTWqVedfB/V10dviSGRuiZd344uvqSemXbUc /nCBivlxVmb3iKF9qTHR36cuQmVjmvC94dOxzP2HgNhrrlWKbn5TeJLYistyt0aU KOWjp/OePVHFG3jNdietpYG4xnrx0J7vlFHD+ymgFc3P0lWBGDg= =ADlK -----END PGP SIGNATURE----- --cjP9R0QdeXTYI3Cm--