* [PATCH v2 0/3] Register TCP flows with epoll at creation time
@ 2026-01-19 16:19 Laurent Vivier
2026-01-19 16:19 ` [PATCH v2 1/3] tcp_splice: Register fds with epoll at flow creation Laurent Vivier
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Laurent Vivier @ 2026-01-19 16:19 UTC (permalink / raw)
To: passt-dev; +Cc: Laurent Vivier
This series simplifies epoll registration by moving it to flow creation
time rather than deferring it to the first event update. By registering
sockets with epoll immediately (using events=0), the epoll control
functions can always use EPOLL_CTL_MOD, eliminating the need to track
whether a flow is already registered.
Patch 1 registers both splice sockets with epoll in tcp_splice_connect()
before initiating the connection, allowing tcp_splice_epoll_ctl() to
always use EPOLL_CTL_MOD.
Patch 2 does the same for regular TCP connections in tcp_conn_from_tap(),
tcp_tap_conn_from_sock(), and tcp_flow_migrate_target(), simplifying
tcp_epoll_ctl() by removing the ADD/MOD conditional logic.
Patch 3 removes EPOLLFD_ID_INVALID and related infrastructure now that
all flows are registered at creation: flow_in_epoll(), flow_epollid_clear(),
and the defensive checks in flow_epollfd().
Thanks,
Laurent
Changes in v2:
- Explained error handling and rollback behavior in commit message
(patch 1)
- Added error handling for flow_epoll_set() in tcp_conn_from_tap() and
tcp_tap_conn_from_sock() (patch 2)
- Added ASSERT for epollid bounds check in flow_epollid_register()
(patch 3)
Laurent Vivier (3):
tcp_splice: Register fds with epoll at flow creation
tcp: Register fds with epoll at flow creation
flow: Remove EPOLLFD_ID_INVALID
flow.c | 34 +++-------------------------------
flow.h | 6 +-----
icmp.c | 1 -
tcp.c | 47 ++++++++++++++++++++++++-----------------------
tcp_splice.c | 20 ++++++++++----------
udp_flow.c | 1 -
6 files changed, 38 insertions(+), 71 deletions(-)
--
2.52.0
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH v2 1/3] tcp_splice: Register fds with epoll at flow creation 2026-01-19 16:19 [PATCH v2 0/3] Register TCP flows with epoll at creation time Laurent Vivier @ 2026-01-19 16:19 ` Laurent Vivier 2026-01-20 0:06 ` David Gibson 2026-01-19 16:19 ` [PATCH v2 2/3] tcp: " Laurent Vivier ` (2 subsequent siblings) 3 siblings, 1 reply; 12+ messages in thread From: Laurent Vivier @ 2026-01-19 16:19 UTC (permalink / raw) To: passt-dev; +Cc: Laurent Vivier Register both splice connection sockets with epoll using empty events (events=0) in tcp_splice_connect(), before initiating the connection. This allows tcp_splice_epoll_ctl() to always use EPOLL_CTL_MOD, removing the need to check whether fds are already registered. As a result, the conditional ADD/MOD logic is no longer needed, simplifying the function. If the second flow_epoll_set() fails after the first succeeds, we don't need explicit rollback: tcp_splice_conn_from_sock() sets the CLOSING flag on error, and conn_flag() handles it by calling epoll_del() for both sockets. Signed-off-by: Laurent Vivier <lvivier@redhat.com> --- tcp_splice.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/tcp_splice.c b/tcp_splice.c index a7c04ca8652a..cb81e012ee4b 100644 --- a/tcp_splice.c +++ b/tcp_splice.c @@ -142,20 +142,12 @@ static uint32_t tcp_splice_conn_epoll_events(uint16_t events, unsigned sidei) static int tcp_splice_epoll_ctl(struct tcp_splice_conn *conn) { uint32_t events[2]; - int m; - - if (flow_in_epoll(&conn->f)) { - m = EPOLL_CTL_MOD; - } else { - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); - m = EPOLL_CTL_ADD; - } events[0] = tcp_splice_conn_epoll_events(conn->events, 0); events[1] = tcp_splice_conn_epoll_events(conn->events, 1); - if (flow_epoll_set(&conn->f, m, events[0], conn->s[0], 0) || - flow_epoll_set(&conn->f, m, events[1], conn->s[1], 1)) { + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events[0], conn->s[0], 0) || + flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events[1], conn->s[1], 1)) { int ret = -errno; flow_perror(conn, "ERROR on epoll_ctl()"); return ret; @@ -368,6 +360,14 @@ static int tcp_splice_connect(const struct ctx *c, struct tcp_splice_conn *conn) pif_sockaddr(c, &sa, tgtpif, &tgt->eaddr, tgt->eport); + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->s[0], 0) || + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->s[1], 1)) { + int ret = -errno; + flow_perror(conn, "Cannot register to epollfd"); + return ret; + } + conn_event(conn, SPLICE_CONNECT); if (connect(conn->s[1], &sa.sa, socklen_inany(&sa))) { -- 2.52.0 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] tcp_splice: Register fds with epoll at flow creation 2026-01-19 16:19 ` [PATCH v2 1/3] tcp_splice: Register fds with epoll at flow creation Laurent Vivier @ 2026-01-20 0:06 ` David Gibson 0 siblings, 0 replies; 12+ messages in thread From: David Gibson @ 2026-01-20 0:06 UTC (permalink / raw) To: Laurent Vivier; +Cc: passt-dev [-- Attachment #1: Type: text/plain, Size: 2734 bytes --] On Mon, Jan 19, 2026 at 05:19:13PM +0100, Laurent Vivier wrote: > Register both splice connection sockets with epoll using empty events > (events=0) in tcp_splice_connect(), before initiating the connection. > > This allows tcp_splice_epoll_ctl() to always use EPOLL_CTL_MOD, removing > the need to check whether fds are already registered. As a result, the > conditional ADD/MOD logic is no longer needed, simplifying the function. > > If the second flow_epoll_set() fails after the first succeeds, we don't > need explicit rollback: tcp_splice_conn_from_sock() sets the CLOSING > flag on error, and conn_flag() handles it by calling epoll_del() for > both sockets. > > Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> > --- > tcp_splice.c | 20 ++++++++++---------- > 1 file changed, 10 insertions(+), 10 deletions(-) > > diff --git a/tcp_splice.c b/tcp_splice.c > index a7c04ca8652a..cb81e012ee4b 100644 > --- a/tcp_splice.c > +++ b/tcp_splice.c > @@ -142,20 +142,12 @@ static uint32_t tcp_splice_conn_epoll_events(uint16_t events, unsigned sidei) > static int tcp_splice_epoll_ctl(struct tcp_splice_conn *conn) > { > uint32_t events[2]; > - int m; > - > - if (flow_in_epoll(&conn->f)) { > - m = EPOLL_CTL_MOD; > - } else { > - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > - m = EPOLL_CTL_ADD; > - } > > events[0] = tcp_splice_conn_epoll_events(conn->events, 0); > events[1] = tcp_splice_conn_epoll_events(conn->events, 1); > > - if (flow_epoll_set(&conn->f, m, events[0], conn->s[0], 0) || > - flow_epoll_set(&conn->f, m, events[1], conn->s[1], 1)) { > + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events[0], conn->s[0], 0) || > + flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events[1], conn->s[1], 1)) { > int ret = -errno; > flow_perror(conn, "ERROR on epoll_ctl()"); > return ret; > @@ -368,6 +360,14 @@ static int tcp_splice_connect(const struct ctx *c, struct tcp_splice_conn *conn) > > pif_sockaddr(c, &sa, tgtpif, &tgt->eaddr, tgt->eport); > > + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->s[0], 0) || > + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->s[1], 1)) { > + int ret = -errno; > + flow_perror(conn, "Cannot register to epollfd"); > + return ret; > + } > + > conn_event(conn, SPLICE_CONNECT); > > if (connect(conn->s[1], &sa.sa, socklen_inany(&sa))) { > -- > 2.52.0 > -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 2/3] tcp: Register fds with epoll at flow creation 2026-01-19 16:19 [PATCH v2 0/3] Register TCP flows with epoll at creation time Laurent Vivier 2026-01-19 16:19 ` [PATCH v2 1/3] tcp_splice: Register fds with epoll at flow creation Laurent Vivier @ 2026-01-19 16:19 ` Laurent Vivier 2026-01-20 0:08 ` David Gibson 2026-01-21 8:13 ` Stefano Brivio 2026-01-19 16:19 ` [PATCH v2 3/3] flow: Remove EPOLLFD_ID_INVALID Laurent Vivier 2026-01-20 20:24 ` [PATCH v2 0/3] Register TCP flows with epoll at creation time Stefano Brivio 3 siblings, 2 replies; 12+ messages in thread From: Laurent Vivier @ 2026-01-19 16:19 UTC (permalink / raw) To: passt-dev; +Cc: Laurent Vivier Register connection sockets with epoll using empty events (events=0) in tcp_conn_from_tap(), tcp_tap_conn_from_sock() and tcp_flow_repair_socket(). This allows tcp_epoll_ctl() to always use EPOLL_CTL_MOD, removing the need to check whether fds are already registered. As a result, the conditional ADD/MOD logic is no longer needed, simplifying the function. Signed-off-by: Laurent Vivier <lvivier@redhat.com> --- flow.c | 1 + tcp.c | 46 ++++++++++++++++++++++++---------------------- 2 files changed, 25 insertions(+), 22 deletions(-) diff --git a/flow.c b/flow.c index cefe6c8b5b24..532339ce7fe1 100644 --- a/flow.c +++ b/flow.c @@ -357,6 +357,7 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) * * Return: true if flow is registered with epoll, false otherwise */ +/* cppcheck-suppress unusedFunction */ bool flow_in_epoll(const struct flow_common *f) { return f->epollid != EPOLLFD_ID_INVALID; diff --git a/tcp.c b/tcp.c index 1db861705ddb..29d69354bd94 100644 --- a/tcp.c +++ b/tcp.c @@ -528,37 +528,22 @@ static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) static int tcp_epoll_ctl(struct tcp_tap_conn *conn) { uint32_t events; - int m; if (conn->events == CLOSED) { - if (flow_in_epoll(&conn->f)) { - int epollfd = flow_epollfd(&conn->f); + int epollfd = flow_epollfd(&conn->f); - epoll_del(epollfd, conn->sock); - if (conn->timer != -1) - epoll_del(epollfd, conn->timer); - } + epoll_del(epollfd, conn->sock); + if (conn->timer != -1) + epoll_del(epollfd, conn->timer); return 0; } events = tcp_conn_epoll_events(conn->events, conn->flags); - if (flow_in_epoll(&conn->f)) { - m = EPOLL_CTL_MOD; - } else { - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); - m = EPOLL_CTL_ADD; - } - - if (flow_epoll_set(&conn->f, m, events, conn->sock, - !TAPSIDE(conn)) < 0) { - int ret = -errno; - - if (m == EPOLL_CTL_ADD) - flow_epollid_clear(&conn->f); - return ret; - } + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events, conn->sock, + !TAPSIDE(conn)) < 0) + return -errno; return 0; } @@ -1710,6 +1695,11 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, conn->sock = s; conn->timer = -1; conn->listening_sock = -1; + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, TGTSIDE) < 0) { + flow_perror(flow, "Can't register with epoll"); + goto cancel; + } conn_event(c, conn, TAP_SYN_RCVD); conn->wnd_to_tap = WINDOW_DEFAULT; @@ -2433,6 +2423,15 @@ static void tcp_tap_conn_from_sock(const struct ctx *c, union flow *flow, conn->sock = s; conn->timer = -1; conn->ws_to_tap = conn->ws_from_tap = 0; + + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, INISIDE) < 0) { + flow_perror(flow, "Can't register with epoll"); + conn_flag(c, conn, CLOSING); + FLOW_ACTIVATE(conn); + return; + } + conn_event(c, conn, SOCK_ACCEPTED); hash = flow_hash_insert(c, TAP_SIDX(conn)); @@ -3825,6 +3824,9 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) return 0; } + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->sock, !TAPSIDE(conn)); + flow_hash_insert(c, TAP_SIDX(conn)); FLOW_ACTIVATE(conn); -- 2.52.0 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/3] tcp: Register fds with epoll at flow creation 2026-01-19 16:19 ` [PATCH v2 2/3] tcp: " Laurent Vivier @ 2026-01-20 0:08 ` David Gibson 2026-01-21 8:13 ` Stefano Brivio 1 sibling, 0 replies; 12+ messages in thread From: David Gibson @ 2026-01-20 0:08 UTC (permalink / raw) To: Laurent Vivier; +Cc: passt-dev [-- Attachment #1: Type: text/plain, Size: 4011 bytes --] On Mon, Jan 19, 2026 at 05:19:14PM +0100, Laurent Vivier wrote: > Register connection sockets with epoll using empty events > (events=0) in tcp_conn_from_tap(), tcp_tap_conn_from_sock() > and tcp_flow_repair_socket(). > > This allows tcp_epoll_ctl() to always use EPOLL_CTL_MOD, removing > the need to check whether fds are already registered. As a result, the > conditional ADD/MOD logic is no longer needed, simplifying the function. > > Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> > --- > flow.c | 1 + > tcp.c | 46 ++++++++++++++++++++++++---------------------- > 2 files changed, 25 insertions(+), 22 deletions(-) > > diff --git a/flow.c b/flow.c > index cefe6c8b5b24..532339ce7fe1 100644 > --- a/flow.c > +++ b/flow.c > @@ -357,6 +357,7 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) > * > * Return: true if flow is registered with epoll, false otherwise > */ > +/* cppcheck-suppress unusedFunction */ > bool flow_in_epoll(const struct flow_common *f) > { > return f->epollid != EPOLLFD_ID_INVALID; > diff --git a/tcp.c b/tcp.c > index 1db861705ddb..29d69354bd94 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -528,37 +528,22 @@ static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) > static int tcp_epoll_ctl(struct tcp_tap_conn *conn) > { > uint32_t events; > - int m; > > if (conn->events == CLOSED) { > - if (flow_in_epoll(&conn->f)) { > - int epollfd = flow_epollfd(&conn->f); > + int epollfd = flow_epollfd(&conn->f); > > - epoll_del(epollfd, conn->sock); > - if (conn->timer != -1) > - epoll_del(epollfd, conn->timer); > - } > + epoll_del(epollfd, conn->sock); > + if (conn->timer != -1) > + epoll_del(epollfd, conn->timer); > > return 0; > } > > events = tcp_conn_epoll_events(conn->events, conn->flags); > > - if (flow_in_epoll(&conn->f)) { > - m = EPOLL_CTL_MOD; > - } else { > - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > - m = EPOLL_CTL_ADD; > - } > - > - if (flow_epoll_set(&conn->f, m, events, conn->sock, > - !TAPSIDE(conn)) < 0) { > - int ret = -errno; > - > - if (m == EPOLL_CTL_ADD) > - flow_epollid_clear(&conn->f); > - return ret; > - } > + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events, conn->sock, > + !TAPSIDE(conn)) < 0) > + return -errno; > > return 0; > } > @@ -1710,6 +1695,11 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, > conn->sock = s; > conn->timer = -1; > conn->listening_sock = -1; > + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, TGTSIDE) < 0) { > + flow_perror(flow, "Can't register with epoll"); > + goto cancel; > + } > conn_event(c, conn, TAP_SYN_RCVD); > > conn->wnd_to_tap = WINDOW_DEFAULT; > @@ -2433,6 +2423,15 @@ static void tcp_tap_conn_from_sock(const struct ctx *c, union flow *flow, > conn->sock = s; > conn->timer = -1; > conn->ws_to_tap = conn->ws_from_tap = 0; > + > + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, INISIDE) < 0) { > + flow_perror(flow, "Can't register with epoll"); > + conn_flag(c, conn, CLOSING); > + FLOW_ACTIVATE(conn); > + return; > + } > + > conn_event(c, conn, SOCK_ACCEPTED); > > hash = flow_hash_insert(c, TAP_SIDX(conn)); > @@ -3825,6 +3824,9 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) > return 0; > } > > + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->sock, !TAPSIDE(conn)); > + > flow_hash_insert(c, TAP_SIDX(conn)); > FLOW_ACTIVATE(conn); > > -- > 2.52.0 > -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/3] tcp: Register fds with epoll at flow creation 2026-01-19 16:19 ` [PATCH v2 2/3] tcp: " Laurent Vivier 2026-01-20 0:08 ` David Gibson @ 2026-01-21 8:13 ` Stefano Brivio 2026-01-21 8:43 ` Laurent Vivier 1 sibling, 1 reply; 12+ messages in thread From: Stefano Brivio @ 2026-01-21 8:13 UTC (permalink / raw) To: Laurent Vivier; +Cc: passt-dev, David Gibson On Mon, 19 Jan 2026 17:19:14 +0100 Laurent Vivier <lvivier@redhat.com> wrote: > Register connection sockets with epoll using empty events > (events=0) in tcp_conn_from_tap(), tcp_tap_conn_from_sock() > and tcp_flow_repair_socket(). > > This allows tcp_epoll_ctl() to always use EPOLL_CTL_MOD, removing > the need to check whether fds are already registered. As a result, the > conditional ADD/MOD logic is no longer needed, simplifying the function. > > Signed-off-by: Laurent Vivier <lvivier@redhat.com> > --- > flow.c | 1 + > tcp.c | 46 ++++++++++++++++++++++++---------------------- > 2 files changed, 25 insertions(+), 22 deletions(-) > > diff --git a/flow.c b/flow.c > index cefe6c8b5b24..532339ce7fe1 100644 > --- a/flow.c > +++ b/flow.c > @@ -357,6 +357,7 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) > * > * Return: true if flow is registered with epoll, false otherwise > */ > +/* cppcheck-suppress unusedFunction */ > bool flow_in_epoll(const struct flow_common *f) > { > return f->epollid != EPOLLFD_ID_INVALID; > diff --git a/tcp.c b/tcp.c > index 1db861705ddb..29d69354bd94 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -528,37 +528,22 @@ static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) > static int tcp_epoll_ctl(struct tcp_tap_conn *conn) > { > uint32_t events; > - int m; > > if (conn->events == CLOSED) { > - if (flow_in_epoll(&conn->f)) { > - int epollfd = flow_epollfd(&conn->f); > + int epollfd = flow_epollfd(&conn->f); > > - epoll_del(epollfd, conn->sock); > - if (conn->timer != -1) > - epoll_del(epollfd, conn->timer); > - } > + epoll_del(epollfd, conn->sock); > + if (conn->timer != -1) > + epoll_del(epollfd, conn->timer); > > return 0; > } > > events = tcp_conn_epoll_events(conn->events, conn->flags); > > - if (flow_in_epoll(&conn->f)) { > - m = EPOLL_CTL_MOD; > - } else { > - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > - m = EPOLL_CTL_ADD; > - } > - > - if (flow_epoll_set(&conn->f, m, events, conn->sock, > - !TAPSIDE(conn)) < 0) { > - int ret = -errno; > - > - if (m == EPOLL_CTL_ADD) > - flow_epollid_clear(&conn->f); > - return ret; > - } > + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events, conn->sock, > + !TAPSIDE(conn)) < 0) > + return -errno; > > return 0; > } > @@ -1710,6 +1695,11 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, > conn->sock = s; > conn->timer = -1; > conn->listening_sock = -1; > + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, TGTSIDE) < 0) { > + flow_perror(flow, "Can't register with epoll"); > + goto cancel; > + } > conn_event(c, conn, TAP_SYN_RCVD); > > conn->wnd_to_tap = WINDOW_DEFAULT; > @@ -2433,6 +2423,15 @@ static void tcp_tap_conn_from_sock(const struct ctx *c, union flow *flow, > conn->sock = s; > conn->timer = -1; > conn->ws_to_tap = conn->ws_from_tap = 0; > + > + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, INISIDE) < 0) { > + flow_perror(flow, "Can't register with epoll"); > + conn_flag(c, conn, CLOSING); > + FLOW_ACTIVATE(conn); > + return; > + } > + > conn_event(c, conn, SOCK_ACCEPTED); > > hash = flow_hash_insert(c, TAP_SIDX(conn)); > @@ -3825,6 +3824,9 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) > return 0; > } > > + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->sock, !TAPSIDE(conn)); Oops, I overlooked this, because I missed re-checking on v2 after David pointed it out: this causes Coverity Scan to (reasonably, I think) complain that: /home/sbrivio/passt/tcp.c:3693:2: Type: Unchecked return value (CHECKED_RETURN) /home/sbrivio/passt/tcp.c:3648:2: Unchecked call to function 1. path: Condition "!(flow = flow_alloc())", taking false branch. /home/sbrivio/passt/tcp.c:3653:2: 2. path: Condition "read_all_buf(fd, &t, 112UL /* sizeof (t) */)", taking false branch. /home/sbrivio/passt/tcp.c:3685:2: 3. path: Condition "rc = tcp_flow_repair_socket(c, conn)", taking false branch. /home/sbrivio/passt/tcp.c:3693:2: 4. path: Condition "!(conn->f.pif[1] == PIF_TAP)", taking true branch. /home/sbrivio/passt/tcp.c:3693:2: 5. check_return: Calling "flow_epoll_set" without checking return value (as is done elsewhere 6 out of 7 times). /home/sbrivio/passt/icmp.c:213:2: Examples where return value from this function is checked 6. example_checked: Example 1: "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U)" has its value checked in "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U) < 0". /home/sbrivio/passt/tcp.c:1695:2: Examples where return value from this function is checked 7. example_checked: Example 2: "flow_epoll_set(&conn->f, 1, 0U, s, 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, s, 1U) < 0". /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked 8. example_checked: Example 3: "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)". /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked 9. example_checked: Example 4: "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)". /home/sbrivio/passt/udp_flow.c:87:2: Examples where return value from this function is checked 10. example_checked: Example 5: "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei)" has its value checked in "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei) < 0". I don't think it's overly problematic but epoll_ctl() can, in theory, fail regardless of the arguments. > + > flow_hash_insert(c, TAP_SIDX(conn)); > FLOW_ACTIVATE(conn); > -- Stefano ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/3] tcp: Register fds with epoll at flow creation 2026-01-21 8:13 ` Stefano Brivio @ 2026-01-21 8:43 ` Laurent Vivier 2026-01-21 11:41 ` Stefano Brivio 0 siblings, 1 reply; 12+ messages in thread From: Laurent Vivier @ 2026-01-21 8:43 UTC (permalink / raw) To: Stefano Brivio; +Cc: passt-dev, David Gibson On 1/21/26 09:13, Stefano Brivio wrote: > On Mon, 19 Jan 2026 17:19:14 +0100 > Laurent Vivier <lvivier@redhat.com> wrote: > >> Register connection sockets with epoll using empty events >> (events=0) in tcp_conn_from_tap(), tcp_tap_conn_from_sock() >> and tcp_flow_repair_socket(). >> >> This allows tcp_epoll_ctl() to always use EPOLL_CTL_MOD, removing >> the need to check whether fds are already registered. As a result, the >> conditional ADD/MOD logic is no longer needed, simplifying the function. >> >> Signed-off-by: Laurent Vivier <lvivier@redhat.com> >> --- >> flow.c | 1 + >> tcp.c | 46 ++++++++++++++++++++++++---------------------- >> 2 files changed, 25 insertions(+), 22 deletions(-) >> >> diff --git a/flow.c b/flow.c >> index cefe6c8b5b24..532339ce7fe1 100644 >> --- a/flow.c >> +++ b/flow.c >> @@ -357,6 +357,7 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) >> * >> * Return: true if flow is registered with epoll, false otherwise >> */ >> +/* cppcheck-suppress unusedFunction */ >> bool flow_in_epoll(const struct flow_common *f) >> { >> return f->epollid != EPOLLFD_ID_INVALID; >> diff --git a/tcp.c b/tcp.c >> index 1db861705ddb..29d69354bd94 100644 >> --- a/tcp.c >> +++ b/tcp.c >> @@ -528,37 +528,22 @@ static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) >> static int tcp_epoll_ctl(struct tcp_tap_conn *conn) >> { >> uint32_t events; >> - int m; >> >> if (conn->events == CLOSED) { >> - if (flow_in_epoll(&conn->f)) { >> - int epollfd = flow_epollfd(&conn->f); >> + int epollfd = flow_epollfd(&conn->f); >> >> - epoll_del(epollfd, conn->sock); >> - if (conn->timer != -1) >> - epoll_del(epollfd, conn->timer); >> - } >> + epoll_del(epollfd, conn->sock); >> + if (conn->timer != -1) >> + epoll_del(epollfd, conn->timer); >> >> return 0; >> } >> >> events = tcp_conn_epoll_events(conn->events, conn->flags); >> >> - if (flow_in_epoll(&conn->f)) { >> - m = EPOLL_CTL_MOD; >> - } else { >> - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); >> - m = EPOLL_CTL_ADD; >> - } >> - >> - if (flow_epoll_set(&conn->f, m, events, conn->sock, >> - !TAPSIDE(conn)) < 0) { >> - int ret = -errno; >> - >> - if (m == EPOLL_CTL_ADD) >> - flow_epollid_clear(&conn->f); >> - return ret; >> - } >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events, conn->sock, >> + !TAPSIDE(conn)) < 0) >> + return -errno; >> >> return 0; >> } >> @@ -1710,6 +1695,11 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, >> conn->sock = s; >> conn->timer = -1; >> conn->listening_sock = -1; >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, TGTSIDE) < 0) { >> + flow_perror(flow, "Can't register with epoll"); >> + goto cancel; >> + } >> conn_event(c, conn, TAP_SYN_RCVD); >> >> conn->wnd_to_tap = WINDOW_DEFAULT; >> @@ -2433,6 +2423,15 @@ static void tcp_tap_conn_from_sock(const struct ctx *c, union flow *flow, >> conn->sock = s; >> conn->timer = -1; >> conn->ws_to_tap = conn->ws_from_tap = 0; >> + >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, INISIDE) < 0) { >> + flow_perror(flow, "Can't register with epoll"); >> + conn_flag(c, conn, CLOSING); >> + FLOW_ACTIVATE(conn); >> + return; >> + } >> + >> conn_event(c, conn, SOCK_ACCEPTED); >> >> hash = flow_hash_insert(c, TAP_SIDX(conn)); >> @@ -3825,6 +3824,9 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) >> return 0; >> } >> >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); >> + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->sock, !TAPSIDE(conn)); > > Oops, I overlooked this, because I missed re-checking on v2 after David > pointed it out: this causes Coverity Scan to (reasonably, I think) > complain that: > > /home/sbrivio/passt/tcp.c:3693:2: > Type: Unchecked return value (CHECKED_RETURN) > > /home/sbrivio/passt/tcp.c:3648:2: Unchecked call to function > 1. path: Condition "!(flow = flow_alloc())", taking false branch. > /home/sbrivio/passt/tcp.c:3653:2: > 2. path: Condition "read_all_buf(fd, &t, 112UL /* sizeof (t) */)", taking false branch. > /home/sbrivio/passt/tcp.c:3685:2: > 3. path: Condition "rc = tcp_flow_repair_socket(c, conn)", taking false branch. > /home/sbrivio/passt/tcp.c:3693:2: > 4. path: Condition "!(conn->f.pif[1] == PIF_TAP)", taking true branch. > /home/sbrivio/passt/tcp.c:3693:2: > 5. check_return: Calling "flow_epoll_set" without checking return value (as is done elsewhere 6 out of 7 times). > /home/sbrivio/passt/icmp.c:213:2: Examples where return value from this function is checked > 6. example_checked: Example 1: "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U)" has its value checked in "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U) < 0". > /home/sbrivio/passt/tcp.c:1695:2: Examples where return value from this function is checked > 7. example_checked: Example 2: "flow_epoll_set(&conn->f, 1, 0U, s, 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, s, 1U) < 0". > /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked > 8. example_checked: Example 3: "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)". > /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked > 9. example_checked: Example 4: "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)". > /home/sbrivio/passt/udp_flow.c:87:2: Examples where return value from this function is checked > 10. example_checked: Example 5: "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei)" has its value checked in "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei) < 0". > > I don't think it's overly problematic but epoll_ctl() can, in theory, > fail regardless of the arguments. > Yes, I agree, but the problem is catched later by tcp_epoll_ctl() within tcp_flow_migrate_target_ext(). Furthermore, at this point, to handle the error we have to return 0 (single point of failure) and still execute FLOW_ACTIVATE() anyway. The result is therefore identical (only flow_hash_insert() is not executed in the absence of an error). Thanks, Laurent ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/3] tcp: Register fds with epoll at flow creation 2026-01-21 8:43 ` Laurent Vivier @ 2026-01-21 11:41 ` Stefano Brivio 2026-01-21 12:18 ` Laurent Vivier 0 siblings, 1 reply; 12+ messages in thread From: Stefano Brivio @ 2026-01-21 11:41 UTC (permalink / raw) To: Laurent Vivier; +Cc: passt-dev, David Gibson On Wed, 21 Jan 2026 09:43:55 +0100 Laurent Vivier <lvivier@redhat.com> wrote: > On 1/21/26 09:13, Stefano Brivio wrote: > > On Mon, 19 Jan 2026 17:19:14 +0100 > > Laurent Vivier <lvivier@redhat.com> wrote: > > > >> Register connection sockets with epoll using empty events > >> (events=0) in tcp_conn_from_tap(), tcp_tap_conn_from_sock() > >> and tcp_flow_repair_socket(). > >> > >> This allows tcp_epoll_ctl() to always use EPOLL_CTL_MOD, removing > >> the need to check whether fds are already registered. As a result, the > >> conditional ADD/MOD logic is no longer needed, simplifying the function. > >> > >> Signed-off-by: Laurent Vivier <lvivier@redhat.com> > >> --- > >> flow.c | 1 + > >> tcp.c | 46 ++++++++++++++++++++++++---------------------- > >> 2 files changed, 25 insertions(+), 22 deletions(-) > >> > >> diff --git a/flow.c b/flow.c > >> index cefe6c8b5b24..532339ce7fe1 100644 > >> --- a/flow.c > >> +++ b/flow.c > >> @@ -357,6 +357,7 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) > >> * > >> * Return: true if flow is registered with epoll, false otherwise > >> */ > >> +/* cppcheck-suppress unusedFunction */ > >> bool flow_in_epoll(const struct flow_common *f) > >> { > >> return f->epollid != EPOLLFD_ID_INVALID; > >> diff --git a/tcp.c b/tcp.c > >> index 1db861705ddb..29d69354bd94 100644 > >> --- a/tcp.c > >> +++ b/tcp.c > >> @@ -528,37 +528,22 @@ static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) > >> static int tcp_epoll_ctl(struct tcp_tap_conn *conn) > >> { > >> uint32_t events; > >> - int m; > >> > >> if (conn->events == CLOSED) { > >> - if (flow_in_epoll(&conn->f)) { > >> - int epollfd = flow_epollfd(&conn->f); > >> + int epollfd = flow_epollfd(&conn->f); > >> > >> - epoll_del(epollfd, conn->sock); > >> - if (conn->timer != -1) > >> - epoll_del(epollfd, conn->timer); > >> - } > >> + epoll_del(epollfd, conn->sock); > >> + if (conn->timer != -1) > >> + epoll_del(epollfd, conn->timer); > >> > >> return 0; > >> } > >> > >> events = tcp_conn_epoll_events(conn->events, conn->flags); > >> > >> - if (flow_in_epoll(&conn->f)) { > >> - m = EPOLL_CTL_MOD; > >> - } else { > >> - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > >> - m = EPOLL_CTL_ADD; > >> - } > >> - > >> - if (flow_epoll_set(&conn->f, m, events, conn->sock, > >> - !TAPSIDE(conn)) < 0) { > >> - int ret = -errno; > >> - > >> - if (m == EPOLL_CTL_ADD) > >> - flow_epollid_clear(&conn->f); > >> - return ret; > >> - } > >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events, conn->sock, > >> + !TAPSIDE(conn)) < 0) > >> + return -errno; > >> > >> return 0; > >> } > >> @@ -1710,6 +1695,11 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, > >> conn->sock = s; > >> conn->timer = -1; > >> conn->listening_sock = -1; > >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, TGTSIDE) < 0) { > >> + flow_perror(flow, "Can't register with epoll"); > >> + goto cancel; > >> + } > >> conn_event(c, conn, TAP_SYN_RCVD); > >> > >> conn->wnd_to_tap = WINDOW_DEFAULT; > >> @@ -2433,6 +2423,15 @@ static void tcp_tap_conn_from_sock(const struct ctx *c, union flow *flow, > >> conn->sock = s; > >> conn->timer = -1; > >> conn->ws_to_tap = conn->ws_from_tap = 0; > >> + > >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, INISIDE) < 0) { > >> + flow_perror(flow, "Can't register with epoll"); > >> + conn_flag(c, conn, CLOSING); > >> + FLOW_ACTIVATE(conn); > >> + return; > >> + } > >> + > >> conn_event(c, conn, SOCK_ACCEPTED); > >> > >> hash = flow_hash_insert(c, TAP_SIDX(conn)); > >> @@ -3825,6 +3824,9 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) > >> return 0; > >> } > >> > >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > >> + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->sock, !TAPSIDE(conn)); > > > > Oops, I overlooked this, because I missed re-checking on v2 after David > > pointed it out: this causes Coverity Scan to (reasonably, I think) > > complain that: > > > > /home/sbrivio/passt/tcp.c:3693:2: > > Type: Unchecked return value (CHECKED_RETURN) > > > > /home/sbrivio/passt/tcp.c:3648:2: Unchecked call to function > > 1. path: Condition "!(flow = flow_alloc())", taking false branch. > > /home/sbrivio/passt/tcp.c:3653:2: > > 2. path: Condition "read_all_buf(fd, &t, 112UL /* sizeof (t) */)", taking false branch. > > /home/sbrivio/passt/tcp.c:3685:2: > > 3. path: Condition "rc = tcp_flow_repair_socket(c, conn)", taking false branch. > > /home/sbrivio/passt/tcp.c:3693:2: > > 4. path: Condition "!(conn->f.pif[1] == PIF_TAP)", taking true branch. > > /home/sbrivio/passt/tcp.c:3693:2: > > 5. check_return: Calling "flow_epoll_set" without checking return value (as is done elsewhere 6 out of 7 times). > > /home/sbrivio/passt/icmp.c:213:2: Examples where return value from this function is checked > > 6. example_checked: Example 1: "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U)" has its value checked in "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U) < 0". > > /home/sbrivio/passt/tcp.c:1695:2: Examples where return value from this function is checked > > 7. example_checked: Example 2: "flow_epoll_set(&conn->f, 1, 0U, s, 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, s, 1U) < 0". > > /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked > > 8. example_checked: Example 3: "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)". > > /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked > > 9. example_checked: Example 4: "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)". > > /home/sbrivio/passt/udp_flow.c:87:2: Examples where return value from this function is checked > > 10. example_checked: Example 5: "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei)" has its value checked in "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei) < 0". > > > > I don't think it's overly problematic but epoll_ctl() can, in theory, > > fail regardless of the arguments. > > > > Yes, I agree, but the problem is catched later by tcp_epoll_ctl() within > tcp_flow_migrate_target_ext(). I see. It's just not really obvious, and other than that it adds static checkers noise that I'm trying to keep to a minimum. > Furthermore, at this point, to handle the error we have to return 0 (single point of > failure) and still execute FLOW_ACTIVATE() anyway. The result is therefore identical (only > flow_hash_insert() is not executed in the absence of an error). Well, not calling flow_hash_insert() in that case looks like a substantial simplification (for auditing / debugging purposes) to me. Given that we already have an early return above on failures from tcp_flow_repair_socket(), maybe that could become an 'out' goto label at the end, and we would have a similar comment for this other case? I'll send a patch in this sense in a bit, unless you see something wrong with it. -- Stefano ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/3] tcp: Register fds with epoll at flow creation 2026-01-21 11:41 ` Stefano Brivio @ 2026-01-21 12:18 ` Laurent Vivier 0 siblings, 0 replies; 12+ messages in thread From: Laurent Vivier @ 2026-01-21 12:18 UTC (permalink / raw) To: Stefano Brivio; +Cc: passt-dev, David Gibson On 1/21/26 12:41, Stefano Brivio wrote: > On Wed, 21 Jan 2026 09:43:55 +0100 > Laurent Vivier <lvivier@redhat.com> wrote: > >> On 1/21/26 09:13, Stefano Brivio wrote: >>> On Mon, 19 Jan 2026 17:19:14 +0100 >>> Laurent Vivier <lvivier@redhat.com> wrote: >>> >>>> Register connection sockets with epoll using empty events >>>> (events=0) in tcp_conn_from_tap(), tcp_tap_conn_from_sock() >>>> and tcp_flow_repair_socket(). >>>> >>>> This allows tcp_epoll_ctl() to always use EPOLL_CTL_MOD, removing >>>> the need to check whether fds are already registered. As a result, the >>>> conditional ADD/MOD logic is no longer needed, simplifying the function. >>>> >>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com> >>>> --- >>>> flow.c | 1 + >>>> tcp.c | 46 ++++++++++++++++++++++++---------------------- >>>> 2 files changed, 25 insertions(+), 22 deletions(-) >>>> >>>> diff --git a/flow.c b/flow.c >>>> index cefe6c8b5b24..532339ce7fe1 100644 >>>> --- a/flow.c >>>> +++ b/flow.c >>>> @@ -357,6 +357,7 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) >>>> * >>>> * Return: true if flow is registered with epoll, false otherwise >>>> */ >>>> +/* cppcheck-suppress unusedFunction */ >>>> bool flow_in_epoll(const struct flow_common *f) >>>> { >>>> return f->epollid != EPOLLFD_ID_INVALID; >>>> diff --git a/tcp.c b/tcp.c >>>> index 1db861705ddb..29d69354bd94 100644 >>>> --- a/tcp.c >>>> +++ b/tcp.c >>>> @@ -528,37 +528,22 @@ static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) >>>> static int tcp_epoll_ctl(struct tcp_tap_conn *conn) >>>> { >>>> uint32_t events; >>>> - int m; >>>> >>>> if (conn->events == CLOSED) { >>>> - if (flow_in_epoll(&conn->f)) { >>>> - int epollfd = flow_epollfd(&conn->f); >>>> + int epollfd = flow_epollfd(&conn->f); >>>> >>>> - epoll_del(epollfd, conn->sock); >>>> - if (conn->timer != -1) >>>> - epoll_del(epollfd, conn->timer); >>>> - } >>>> + epoll_del(epollfd, conn->sock); >>>> + if (conn->timer != -1) >>>> + epoll_del(epollfd, conn->timer); >>>> >>>> return 0; >>>> } >>>> >>>> events = tcp_conn_epoll_events(conn->events, conn->flags); >>>> >>>> - if (flow_in_epoll(&conn->f)) { >>>> - m = EPOLL_CTL_MOD; >>>> - } else { >>>> - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); >>>> - m = EPOLL_CTL_ADD; >>>> - } >>>> - >>>> - if (flow_epoll_set(&conn->f, m, events, conn->sock, >>>> - !TAPSIDE(conn)) < 0) { >>>> - int ret = -errno; >>>> - >>>> - if (m == EPOLL_CTL_ADD) >>>> - flow_epollid_clear(&conn->f); >>>> - return ret; >>>> - } >>>> + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events, conn->sock, >>>> + !TAPSIDE(conn)) < 0) >>>> + return -errno; >>>> >>>> return 0; >>>> } >>>> @@ -1710,6 +1695,11 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, >>>> conn->sock = s; >>>> conn->timer = -1; >>>> conn->listening_sock = -1; >>>> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); >>>> + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, TGTSIDE) < 0) { >>>> + flow_perror(flow, "Can't register with epoll"); >>>> + goto cancel; >>>> + } >>>> conn_event(c, conn, TAP_SYN_RCVD); >>>> >>>> conn->wnd_to_tap = WINDOW_DEFAULT; >>>> @@ -2433,6 +2423,15 @@ static void tcp_tap_conn_from_sock(const struct ctx *c, union flow *flow, >>>> conn->sock = s; >>>> conn->timer = -1; >>>> conn->ws_to_tap = conn->ws_from_tap = 0; >>>> + >>>> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); >>>> + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, INISIDE) < 0) { >>>> + flow_perror(flow, "Can't register with epoll"); >>>> + conn_flag(c, conn, CLOSING); >>>> + FLOW_ACTIVATE(conn); >>>> + return; >>>> + } >>>> + >>>> conn_event(c, conn, SOCK_ACCEPTED); >>>> >>>> hash = flow_hash_insert(c, TAP_SIDX(conn)); >>>> @@ -3825,6 +3824,9 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) >>>> return 0; >>>> } >>>> >>>> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); >>>> + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->sock, !TAPSIDE(conn)); >>> >>> Oops, I overlooked this, because I missed re-checking on v2 after David >>> pointed it out: this causes Coverity Scan to (reasonably, I think) >>> complain that: >>> >>> /home/sbrivio/passt/tcp.c:3693:2: >>> Type: Unchecked return value (CHECKED_RETURN) >>> >>> /home/sbrivio/passt/tcp.c:3648:2: Unchecked call to function >>> 1. path: Condition "!(flow = flow_alloc())", taking false branch. >>> /home/sbrivio/passt/tcp.c:3653:2: >>> 2. path: Condition "read_all_buf(fd, &t, 112UL /* sizeof (t) */)", taking false branch. >>> /home/sbrivio/passt/tcp.c:3685:2: >>> 3. path: Condition "rc = tcp_flow_repair_socket(c, conn)", taking false branch. >>> /home/sbrivio/passt/tcp.c:3693:2: >>> 4. path: Condition "!(conn->f.pif[1] == PIF_TAP)", taking true branch. >>> /home/sbrivio/passt/tcp.c:3693:2: >>> 5. check_return: Calling "flow_epoll_set" without checking return value (as is done elsewhere 6 out of 7 times). >>> /home/sbrivio/passt/icmp.c:213:2: Examples where return value from this function is checked >>> 6. example_checked: Example 1: "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U)" has its value checked in "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U) < 0". >>> /home/sbrivio/passt/tcp.c:1695:2: Examples where return value from this function is checked >>> 7. example_checked: Example 2: "flow_epoll_set(&conn->f, 1, 0U, s, 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, s, 1U) < 0". >>> /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked >>> 8. example_checked: Example 3: "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)". >>> /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked >>> 9. example_checked: Example 4: "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)". >>> /home/sbrivio/passt/udp_flow.c:87:2: Examples where return value from this function is checked >>> 10. example_checked: Example 5: "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei)" has its value checked in "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei) < 0". >>> >>> I don't think it's overly problematic but epoll_ctl() can, in theory, >>> fail regardless of the arguments. >>> >> >> Yes, I agree, but the problem is catched later by tcp_epoll_ctl() within >> tcp_flow_migrate_target_ext(). > > I see. It's just not really obvious, and other than that it adds static > checkers noise that I'm trying to keep to a minimum. > >> Furthermore, at this point, to handle the error we have to return 0 (single point of >> failure) and still execute FLOW_ACTIVATE() anyway. The result is therefore identical (only >> flow_hash_insert() is not executed in the absence of an error). > > Well, not calling flow_hash_insert() in that case looks like a > substantial simplification (for auditing / debugging purposes) to me. > > Given that we already have an early return above on failures from > tcp_flow_repair_socket(), maybe that could become an 'out' goto label at > the end, and we would have a similar comment for this other case? > > I'll send a patch in this sense in a bit, unless you see something > wrong with it. > Yes, that makes sense. Go ahead. Thanks, Laurent ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 3/3] flow: Remove EPOLLFD_ID_INVALID 2026-01-19 16:19 [PATCH v2 0/3] Register TCP flows with epoll at creation time Laurent Vivier 2026-01-19 16:19 ` [PATCH v2 1/3] tcp_splice: Register fds with epoll at flow creation Laurent Vivier 2026-01-19 16:19 ` [PATCH v2 2/3] tcp: " Laurent Vivier @ 2026-01-19 16:19 ` Laurent Vivier 2026-01-20 0:09 ` David Gibson 2026-01-20 20:24 ` [PATCH v2 0/3] Register TCP flows with epoll at creation time Stefano Brivio 3 siblings, 1 reply; 12+ messages in thread From: Laurent Vivier @ 2026-01-19 16:19 UTC (permalink / raw) To: passt-dev; +Cc: Laurent Vivier As all flows are now registered with an epollid at creation, we no longer need to test if a flow is in epoll. Remove all related code including flow_in_epoll() and flow_epollid_clear(). Signed-off-by: Laurent Vivier <lvivier@redhat.com> --- flow.c | 35 +++-------------------------------- flow.h | 6 +----- icmp.c | 1 - tcp.c | 1 - udp_flow.c | 1 - 5 files changed, 4 insertions(+), 40 deletions(-) diff --git a/flow.c b/flow.c index 532339ce7fe1..1b8643358575 100644 --- a/flow.c +++ b/flow.c @@ -127,7 +127,7 @@ static_assert(ARRAY_SIZE(flow_epoll) == FLOW_NUM_TYPES, unsigned flow_first_free; union flow flowtab[FLOW_MAX]; static const union flow *flow_new_entry; /* = NULL */ -static int epoll_id_to_fd[EPOLLFD_ID_MAX]; +static int epoll_id_to_fd[EPOLLFD_ID_SIZE]; /* Hash table to index it */ #define FLOW_HASH_LOAD 70 /* % */ @@ -351,18 +351,6 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) flow_log_details_(f, LOG_DEBUG, MAX(state, oldstate)); } -/** - * flow_in_epoll() - Check if flow is registered with an epoll instance - * @f: Flow to check - * - * Return: true if flow is registered with epoll, false otherwise - */ -/* cppcheck-suppress unusedFunction */ -bool flow_in_epoll(const struct flow_common *f) -{ - return f->epollid != EPOLLFD_ID_INVALID; -} - /** * flow_epollfd() - Get the epoll file descriptor for a flow * @f: Flow to query @@ -371,13 +359,6 @@ bool flow_in_epoll(const struct flow_common *f) */ int flow_epollfd(const struct flow_common *f) { - if (f->epollid >= EPOLLFD_ID_MAX) { - flow_log_(f, true, LOG_WARNING, - "Invalid epollid %i for flow, assuming default", - f->epollid); - return epoll_id_to_fd[EPOLLFD_ID_DEFAULT]; - } - return epoll_id_to_fd[f->epollid]; } @@ -388,20 +369,11 @@ int flow_epollfd(const struct flow_common *f) */ void flow_epollid_set(struct flow_common *f, int epollid) { - ASSERT(epollid < EPOLLFD_ID_MAX); + ASSERT(epollid < EPOLLFD_ID_SIZE); f->epollid = epollid; } -/** - * flow_epollid_clear() - Clear the flow epoll id - * @f: Flow to update - */ -void flow_epollid_clear(struct flow_common *f) -{ - f->epollid = EPOLLFD_ID_INVALID; -} - /** * flow_epoll_set() - Add or modify epoll registration for a flow socket * @f: Flow to register socket for @@ -435,7 +407,7 @@ int flow_epoll_set(const struct flow_common *f, int command, uint32_t events, */ void flow_epollid_register(int epollid, int epollfd) { - ASSERT(epollid < EPOLLFD_ID_MAX); + ASSERT(epollid < EPOLLFD_ID_SIZE); epoll_id_to_fd[epollid] = epollfd; } @@ -643,7 +615,6 @@ union flow *flow_alloc(void) flow_new_entry = flow; memset(flow, 0, sizeof(*flow)); - flow_epollid_clear(&flow->f); flow_set_state(&flow->f, FLOW_STATE_NEW); return flow; diff --git a/flow.h b/flow.h index a74e13507957..d636358df422 100644 --- a/flow.h +++ b/flow.h @@ -178,7 +178,7 @@ int flowside_connect(const struct ctx *c, int s, * @pif[]: Interface for each side of the flow * @side[]: Information for each side of the flow * @tap_omac: MAC address of remote endpoint as seen from the guest - * @epollid: epollfd identifier, or EPOLLFD_ID_INVALID + * @epollid: epollfd identifier */ struct flow_common { #ifdef __GNUC__ @@ -203,8 +203,6 @@ struct flow_common { #define EPOLLFD_ID_DEFAULT 0 #define EPOLLFD_ID_SIZE (1 << EPOLLFD_ID_BITS) -#define EPOLLFD_ID_MAX (EPOLLFD_ID_SIZE - 1) -#define EPOLLFD_ID_INVALID EPOLLFD_ID_MAX #define FLOW_INDEX_BITS 17 /* 128k - 1 */ #define FLOW_MAX MAX_FROM_BITS(FLOW_INDEX_BITS) @@ -261,10 +259,8 @@ flow_sidx_t flow_lookup_sa(const struct ctx *c, uint8_t proto, uint8_t pif, union flow; void flow_init(void); -bool flow_in_epoll(const struct flow_common *f); int flow_epollfd(const struct flow_common *f); void flow_epollid_set(struct flow_common *f, int epollid); -void flow_epollid_clear(struct flow_common *f); int flow_epoll_set(const struct flow_common *f, int command, uint32_t events, int fd, unsigned int sidei); void flow_epollid_register(int epollid, int epollfd); diff --git a/icmp.c b/icmp.c index eb7f11be5dad..9da5a787d181 100644 --- a/icmp.c +++ b/icmp.c @@ -213,7 +213,6 @@ static struct icmp_ping_flow *icmp_ping_new(const struct ctx *c, if (flow_epoll_set(&pingf->f, EPOLL_CTL_ADD, EPOLLIN, pingf->sock, TGTSIDE) < 0) { close(pingf->sock); - flow_epollid_clear(&pingf->f); goto cancel; } diff --git a/tcp.c b/tcp.c index 29d69354bd94..31d4e41ecb1a 100644 --- a/tcp.c +++ b/tcp.c @@ -3759,7 +3759,6 @@ static int tcp_flow_repair_connect(const struct ctx *c, return rc; } - flow_epollid_clear(&conn->f); conn->timer = -1; conn->listening_sock = -1; diff --git a/udp_flow.c b/udp_flow.c index 80b15433f0ac..d824e2c55b2c 100644 --- a/udp_flow.c +++ b/udp_flow.c @@ -86,7 +86,6 @@ static int udp_flow_sock(const struct ctx *c, flow_epollid_set(&uflow->f, EPOLLFD_ID_DEFAULT); if (flow_epoll_set(&uflow->f, EPOLL_CTL_ADD, EPOLLIN, s, sidei) < 0) { rc = -errno; - flow_epollid_clear(&uflow->f); close(s); return rc; } -- 2.52.0 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 3/3] flow: Remove EPOLLFD_ID_INVALID 2026-01-19 16:19 ` [PATCH v2 3/3] flow: Remove EPOLLFD_ID_INVALID Laurent Vivier @ 2026-01-20 0:09 ` David Gibson 0 siblings, 0 replies; 12+ messages in thread From: David Gibson @ 2026-01-20 0:09 UTC (permalink / raw) To: Laurent Vivier; +Cc: passt-dev [-- Attachment #1: Type: text/plain, Size: 5941 bytes --] On Mon, Jan 19, 2026 at 05:19:15PM +0100, Laurent Vivier wrote: > As all flows are now registered with an epollid at creation, we no > longer need to test if a flow is in epoll. Remove all related code > including flow_in_epoll() and flow_epollid_clear(). > > Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> > --- > flow.c | 35 +++-------------------------------- > flow.h | 6 +----- > icmp.c | 1 - > tcp.c | 1 - > udp_flow.c | 1 - > 5 files changed, 4 insertions(+), 40 deletions(-) > > diff --git a/flow.c b/flow.c > index 532339ce7fe1..1b8643358575 100644 > --- a/flow.c > +++ b/flow.c > @@ -127,7 +127,7 @@ static_assert(ARRAY_SIZE(flow_epoll) == FLOW_NUM_TYPES, > unsigned flow_first_free; > union flow flowtab[FLOW_MAX]; > static const union flow *flow_new_entry; /* = NULL */ > -static int epoll_id_to_fd[EPOLLFD_ID_MAX]; > +static int epoll_id_to_fd[EPOLLFD_ID_SIZE]; > > /* Hash table to index it */ > #define FLOW_HASH_LOAD 70 /* % */ > @@ -351,18 +351,6 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) > flow_log_details_(f, LOG_DEBUG, MAX(state, oldstate)); > } > > -/** > - * flow_in_epoll() - Check if flow is registered with an epoll instance > - * @f: Flow to check > - * > - * Return: true if flow is registered with epoll, false otherwise > - */ > -/* cppcheck-suppress unusedFunction */ > -bool flow_in_epoll(const struct flow_common *f) > -{ > - return f->epollid != EPOLLFD_ID_INVALID; > -} > - > /** > * flow_epollfd() - Get the epoll file descriptor for a flow > * @f: Flow to query > @@ -371,13 +359,6 @@ bool flow_in_epoll(const struct flow_common *f) > */ > int flow_epollfd(const struct flow_common *f) > { > - if (f->epollid >= EPOLLFD_ID_MAX) { > - flow_log_(f, true, LOG_WARNING, > - "Invalid epollid %i for flow, assuming default", > - f->epollid); > - return epoll_id_to_fd[EPOLLFD_ID_DEFAULT]; > - } > - > return epoll_id_to_fd[f->epollid]; > } > > @@ -388,20 +369,11 @@ int flow_epollfd(const struct flow_common *f) > */ > void flow_epollid_set(struct flow_common *f, int epollid) > { > - ASSERT(epollid < EPOLLFD_ID_MAX); > + ASSERT(epollid < EPOLLFD_ID_SIZE); > > f->epollid = epollid; > } > > -/** > - * flow_epollid_clear() - Clear the flow epoll id > - * @f: Flow to update > - */ > -void flow_epollid_clear(struct flow_common *f) > -{ > - f->epollid = EPOLLFD_ID_INVALID; > -} > - > /** > * flow_epoll_set() - Add or modify epoll registration for a flow socket > * @f: Flow to register socket for > @@ -435,7 +407,7 @@ int flow_epoll_set(const struct flow_common *f, int command, uint32_t events, > */ > void flow_epollid_register(int epollid, int epollfd) > { > - ASSERT(epollid < EPOLLFD_ID_MAX); > + ASSERT(epollid < EPOLLFD_ID_SIZE); > > epoll_id_to_fd[epollid] = epollfd; > } > @@ -643,7 +615,6 @@ union flow *flow_alloc(void) > > flow_new_entry = flow; > memset(flow, 0, sizeof(*flow)); > - flow_epollid_clear(&flow->f); > flow_set_state(&flow->f, FLOW_STATE_NEW); > > return flow; > diff --git a/flow.h b/flow.h > index a74e13507957..d636358df422 100644 > --- a/flow.h > +++ b/flow.h > @@ -178,7 +178,7 @@ int flowside_connect(const struct ctx *c, int s, > * @pif[]: Interface for each side of the flow > * @side[]: Information for each side of the flow > * @tap_omac: MAC address of remote endpoint as seen from the guest > - * @epollid: epollfd identifier, or EPOLLFD_ID_INVALID > + * @epollid: epollfd identifier > */ > struct flow_common { > #ifdef __GNUC__ > @@ -203,8 +203,6 @@ struct flow_common { > > #define EPOLLFD_ID_DEFAULT 0 > #define EPOLLFD_ID_SIZE (1 << EPOLLFD_ID_BITS) > -#define EPOLLFD_ID_MAX (EPOLLFD_ID_SIZE - 1) > -#define EPOLLFD_ID_INVALID EPOLLFD_ID_MAX > > #define FLOW_INDEX_BITS 17 /* 128k - 1 */ > #define FLOW_MAX MAX_FROM_BITS(FLOW_INDEX_BITS) > @@ -261,10 +259,8 @@ flow_sidx_t flow_lookup_sa(const struct ctx *c, uint8_t proto, uint8_t pif, > union flow; > > void flow_init(void); > -bool flow_in_epoll(const struct flow_common *f); > int flow_epollfd(const struct flow_common *f); > void flow_epollid_set(struct flow_common *f, int epollid); > -void flow_epollid_clear(struct flow_common *f); > int flow_epoll_set(const struct flow_common *f, int command, uint32_t events, > int fd, unsigned int sidei); > void flow_epollid_register(int epollid, int epollfd); > diff --git a/icmp.c b/icmp.c > index eb7f11be5dad..9da5a787d181 100644 > --- a/icmp.c > +++ b/icmp.c > @@ -213,7 +213,6 @@ static struct icmp_ping_flow *icmp_ping_new(const struct ctx *c, > if (flow_epoll_set(&pingf->f, EPOLL_CTL_ADD, EPOLLIN, pingf->sock, > TGTSIDE) < 0) { > close(pingf->sock); > - flow_epollid_clear(&pingf->f); > goto cancel; > } > > diff --git a/tcp.c b/tcp.c > index 29d69354bd94..31d4e41ecb1a 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -3759,7 +3759,6 @@ static int tcp_flow_repair_connect(const struct ctx *c, > return rc; > } > > - flow_epollid_clear(&conn->f); > conn->timer = -1; > conn->listening_sock = -1; > > diff --git a/udp_flow.c b/udp_flow.c > index 80b15433f0ac..d824e2c55b2c 100644 > --- a/udp_flow.c > +++ b/udp_flow.c > @@ -86,7 +86,6 @@ static int udp_flow_sock(const struct ctx *c, > flow_epollid_set(&uflow->f, EPOLLFD_ID_DEFAULT); > if (flow_epoll_set(&uflow->f, EPOLL_CTL_ADD, EPOLLIN, s, sidei) < 0) { > rc = -errno; > - flow_epollid_clear(&uflow->f); > close(s); > return rc; > } > -- > 2.52.0 > -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] Register TCP flows with epoll at creation time 2026-01-19 16:19 [PATCH v2 0/3] Register TCP flows with epoll at creation time Laurent Vivier ` (2 preceding siblings ...) 2026-01-19 16:19 ` [PATCH v2 3/3] flow: Remove EPOLLFD_ID_INVALID Laurent Vivier @ 2026-01-20 20:24 ` Stefano Brivio 3 siblings, 0 replies; 12+ messages in thread From: Stefano Brivio @ 2026-01-20 20:24 UTC (permalink / raw) To: Laurent Vivier; +Cc: passt-dev, David Gibson On Mon, 19 Jan 2026 17:19:12 +0100 Laurent Vivier <lvivier@redhat.com> wrote: > This series simplifies epoll registration by moving it to flow creation > time rather than deferring it to the first event update. By registering > sockets with epoll immediately (using events=0), the epoll control > functions can always use EPOLL_CTL_MOD, eliminating the need to track > whether a flow is already registered. > > Patch 1 registers both splice sockets with epoll in tcp_splice_connect() > before initiating the connection, allowing tcp_splice_epoll_ctl() to > always use EPOLL_CTL_MOD. > > Patch 2 does the same for regular TCP connections in tcp_conn_from_tap(), > tcp_tap_conn_from_sock(), and tcp_flow_migrate_target(), simplifying > tcp_epoll_ctl() by removing the ADD/MOD conditional logic. > > Patch 3 removes EPOLLFD_ID_INVALID and related infrastructure now that > all flows are registered at creation: flow_in_epoll(), flow_epollid_clear(), > and the defensive checks in flow_epollfd(). Applied. -- Stefano ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-01-21 12:18 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2026-01-19 16:19 [PATCH v2 0/3] Register TCP flows with epoll at creation time Laurent Vivier 2026-01-19 16:19 ` [PATCH v2 1/3] tcp_splice: Register fds with epoll at flow creation Laurent Vivier 2026-01-20 0:06 ` David Gibson 2026-01-19 16:19 ` [PATCH v2 2/3] tcp: " Laurent Vivier 2026-01-20 0:08 ` David Gibson 2026-01-21 8:13 ` Stefano Brivio 2026-01-21 8:43 ` Laurent Vivier 2026-01-21 11:41 ` Stefano Brivio 2026-01-21 12:18 ` Laurent Vivier 2026-01-19 16:19 ` [PATCH v2 3/3] flow: Remove EPOLLFD_ID_INVALID Laurent Vivier 2026-01-20 0:09 ` David Gibson 2026-01-20 20:24 ` [PATCH v2 0/3] Register TCP flows with epoll at creation time Stefano Brivio
Code repositories for project(s) associated with this public inbox https://passt.top/passt This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for IMAP folder(s).