* [PATCH] flow: Set EPOLLFD_ID_DEFAULT on newly allocated flows, not EPOLLFD_ID_INVALID
@ 2025-12-08 21:28 Stefano Brivio
2025-12-08 21:54 ` Stefano Brivio
0 siblings, 1 reply; 6+ messages in thread
From: Stefano Brivio @ 2025-12-08 21:28 UTC (permalink / raw)
To: passt-dev; +Cc: David Gibson, Laurent Vivier
We're somehow hitting:
ASSERTION FAILED in flow_epollfd (flow.c:362): f->epollid < ((1 << 8) - 1)
on an inbound spliced connection, with a single forwarded port, an
HTTP server in a Podman container, and a GET request. Reproducer at
https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
printf 'FROM registry.fedoraproject.org/fedora:latest\nRUN /usr/bin/dnf install -y httpd\nEXPOSE 80\nCMD ["-D", "FOREGROUND"]\nENTRYPOINT ["/usr/sbin/httpd"]\n' > Containerfile
podman build -t fedora-httpd $(pwd)
podman run -d -p 8080:80 localhost/fedora-httpd
curl http://localhost:8080
I guess we don't set EPOLLFD_ID_DEFAULT early enough on inbound spliced
sockets for some reason and we get a socket event while we still have
EPOLLFD_ID_INVALID set.
As we're not really using epoll identifiers yet, set
EPOLLFD_ID_DEFAULT right away on newly allocated flows, while we
figure this out.
Link: https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
I just merged this, posting for awareness / review.
flow.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/flow.c b/flow.c
index 8d72965..f1bde9a 100644
--- a/flow.c
+++ b/flow.c
@@ -382,7 +382,8 @@ void flow_epollid_set(struct flow_common *f, int epollid)
*/
void flow_epollid_clear(struct flow_common *f)
{
- f->epollid = EPOLLFD_ID_INVALID;
+ /* FIXME: Use EPOLLFD_ID_INVALID instead once it's safe to do so */
+ f->epollid = EPOLLFD_ID_DEFAULT;
}
/**
--
2.43.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] flow: Set EPOLLFD_ID_DEFAULT on newly allocated flows, not EPOLLFD_ID_INVALID
2025-12-08 21:28 [PATCH] flow: Set EPOLLFD_ID_DEFAULT on newly allocated flows, not EPOLLFD_ID_INVALID Stefano Brivio
@ 2025-12-08 21:54 ` Stefano Brivio
2025-12-08 23:36 ` David Gibson
2025-12-09 0:01 ` David Gibson
0 siblings, 2 replies; 6+ messages in thread
From: Stefano Brivio @ 2025-12-08 21:54 UTC (permalink / raw)
To: passt-dev; +Cc: David Gibson, Laurent Vivier
On Mon, 8 Dec 2025 22:28:22 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:
> We're somehow hitting:
>
> ASSERTION FAILED in flow_epollfd (flow.c:362): f->epollid < ((1 << 8) - 1)
>
> on an inbound spliced connection, with a single forwarded port, an
> HTTP server in a Podman container, and a GET request. Reproducer at
> https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
>
> printf 'FROM registry.fedoraproject.org/fedora:latest\nRUN /usr/bin/dnf install -y httpd\nEXPOSE 80\nCMD ["-D", "FOREGROUND"]\nENTRYPOINT ["/usr/sbin/httpd"]\n' > Containerfile
> podman build -t fedora-httpd $(pwd)
> podman run -d -p 8080:80 localhost/fedora-httpd
>
> curl http://localhost:8080
>
> I guess we don't set EPOLLFD_ID_DEFAULT early enough on inbound spliced
> sockets for some reason and we get a socket event while we still have
> EPOLLFD_ID_INVALID set.
>
> As we're not really using epoll identifiers yet, set
> EPOLLFD_ID_DEFAULT right away on newly allocated flows, while we
> figure this out.
>
> Link: https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
> I just merged this, posting for awareness / review.
Ah, never mind, this makes it worse somehow:
5.6384: Flow 0 (TCP connection (spliced)): SPLICE_CONNECT
5.6384: Flow 0 (TCP connection (spliced)): ERROR on epoll_ctl(): No such file or directory
...still looking for a workaround / fix.
--
Stefano
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] flow: Set EPOLLFD_ID_DEFAULT on newly allocated flows, not EPOLLFD_ID_INVALID
2025-12-08 21:54 ` Stefano Brivio
@ 2025-12-08 23:36 ` David Gibson
2025-12-08 23:46 ` Stefano Brivio
2025-12-09 0:01 ` David Gibson
1 sibling, 1 reply; 6+ messages in thread
From: David Gibson @ 2025-12-08 23:36 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, Laurent Vivier
[-- Attachment #1: Type: text/plain, Size: 2031 bytes --]
On Mon, Dec 08, 2025 at 10:54:00PM +0100, Stefano Brivio wrote:
> On Mon, 8 Dec 2025 22:28:22 +0100
> Stefano Brivio <sbrivio@redhat.com> wrote:
>
> > We're somehow hitting:
> >
> > ASSERTION FAILED in flow_epollfd (flow.c:362): f->epollid < ((1 << 8) - 1)
> >
> > on an inbound spliced connection, with a single forwarded port, an
> > HTTP server in a Podman container, and a GET request. Reproducer at
> > https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
> >
> > printf 'FROM registry.fedoraproject.org/fedora:latest\nRUN /usr/bin/dnf install -y httpd\nEXPOSE 80\nCMD ["-D", "FOREGROUND"]\nENTRYPOINT ["/usr/sbin/httpd"]\n' > Containerfile
> > podman build -t fedora-httpd $(pwd)
> > podman run -d -p 8080:80 localhost/fedora-httpd
> >
> > curl http://localhost:8080
> >
> > I guess we don't set EPOLLFD_ID_DEFAULT early enough on inbound spliced
> > sockets for some reason and we get a socket event while we still have
> > EPOLLFD_ID_INVALID set.
> >
> > As we're not really using epoll identifiers yet, set
> > EPOLLFD_ID_DEFAULT right away on newly allocated flows, while we
> > figure this out.
> >
> > Link: https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > ---
> > I just merged this, posting for awareness / review.
>
> Ah, never mind, this makes it worse somehow:
>
> 5.6384: Flow 0 (TCP connection (spliced)): SPLICE_CONNECT
> 5.6384: Flow 0 (TCP connection (spliced)): ERROR on epoll_ctl(): No such file or directory
Does this imply you managed to reproduce locally? You hadn't as of
your comment a few comments after the one linked. I haven't managed
to reproduce this either.
> ...still looking for a workaround / fix.
>
> --
> Stefano
>
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] flow: Set EPOLLFD_ID_DEFAULT on newly allocated flows, not EPOLLFD_ID_INVALID
2025-12-08 23:36 ` David Gibson
@ 2025-12-08 23:46 ` Stefano Brivio
0 siblings, 0 replies; 6+ messages in thread
From: Stefano Brivio @ 2025-12-08 23:46 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev, Laurent Vivier
On Tue, 9 Dec 2025 10:36:01 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Mon, Dec 08, 2025 at 10:54:00PM +0100, Stefano Brivio wrote:
> > On Mon, 8 Dec 2025 22:28:22 +0100
> > Stefano Brivio <sbrivio@redhat.com> wrote:
> >
> > > We're somehow hitting:
> > >
> > > ASSERTION FAILED in flow_epollfd (flow.c:362): f->epollid < ((1 << 8) - 1)
> > >
> > > on an inbound spliced connection, with a single forwarded port, an
> > > HTTP server in a Podman container, and a GET request. Reproducer at
> > > https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
> > >
> > > printf 'FROM registry.fedoraproject.org/fedora:latest\nRUN /usr/bin/dnf install -y httpd\nEXPOSE 80\nCMD ["-D", "FOREGROUND"]\nENTRYPOINT ["/usr/sbin/httpd"]\n' > Containerfile
> > > podman build -t fedora-httpd $(pwd)
> > > podman run -d -p 8080:80 localhost/fedora-httpd
> > >
> > > curl http://localhost:8080
> > >
> > > I guess we don't set EPOLLFD_ID_DEFAULT early enough on inbound spliced
> > > sockets for some reason and we get a socket event while we still have
> > > EPOLLFD_ID_INVALID set.
> > >
> > > As we're not really using epoll identifiers yet, set
> > > EPOLLFD_ID_DEFAULT right away on newly allocated flows, while we
> > > figure this out.
> > >
> > > Link: https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
> > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > > ---
> > > I just merged this, posting for awareness / review.
> >
> > Ah, never mind, this makes it worse somehow:
> >
> > 5.6384: Flow 0 (TCP connection (spliced)): SPLICE_CONNECT
> > 5.6384: Flow 0 (TCP connection (spliced)): ERROR on epoll_ctl(): No such file or directory
>
> Does this imply you managed to reproduce locally? You hadn't as of
> your comment a few after the one linked. I also haven't managed to
> reproduce this.
Just simulate an error (that's not EINPROGRESS) on connect() in
tcp_splice_connect(). Patch coming.
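
One way to simulate such a failure is a small fault-injection wrapper around connect(). This is only a sketch: connect_inject() and its inject_errno parameter are hypothetical names and not part of passt, and the thread does not say how the error was actually simulated.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical test hook, not part of passt: if inject_errno is non-zero,
 * skip the real connect() and fail with that errno instead, so the caller's
 * error path runs without needing an unreachable peer.
 */
int connect_inject(int s, const struct sockaddr *sa, socklen_t len,
		   int inject_errno)
{
	if (inject_errno) {
		errno = inject_errno;
		return -1;
	}

	return connect(s, sa, len);
}
```

Passing something like ECONNREFUSED (anything but EINPROGRESS) as inject_errno exercises the immediate-failure path described above. Passing 0 leaves the behaviour of connect() unchanged.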
--
Stefano
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] flow: Set EPOLLFD_ID_DEFAULT on newly allocated flows, not EPOLLFD_ID_INVALID
2025-12-08 21:54 ` Stefano Brivio
2025-12-08 23:36 ` David Gibson
@ 2025-12-09 0:01 ` David Gibson
2025-12-09 0:05 ` Stefano Brivio
1 sibling, 1 reply; 6+ messages in thread
From: David Gibson @ 2025-12-09 0:01 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, Laurent Vivier
[-- Attachment #1: Type: text/plain, Size: 2466 bytes --]
On Mon, Dec 08, 2025 at 10:54:00PM +0100, Stefano Brivio wrote:
> On Mon, 8 Dec 2025 22:28:22 +0100
> Stefano Brivio <sbrivio@redhat.com> wrote:
>
> > We're somehow hitting:
> >
> > ASSERTION FAILED in flow_epollfd (flow.c:362): f->epollid < ((1 << 8) - 1)
> >
> > on an inbound spliced connection, with a single forwarded port, an
> > HTTP server in a Podman container, and a GET request. Reproducer at
> > https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
> >
> > printf 'FROM registry.fedoraproject.org/fedora:latest\nRUN /usr/bin/dnf install -y httpd\nEXPOSE 80\nCMD ["-D", "FOREGROUND"]\nENTRYPOINT ["/usr/sbin/httpd"]\n' > Containerfile
> > podman build -t fedora-httpd $(pwd)
> > podman run -d -p 8080:80 localhost/fedora-httpd
> >
> > curl http://localhost:8080
> >
> > I guess we don't set EPOLLFD_ID_DEFAULT early enough on inbound spliced
> > sockets for some reason and we get a socket event while we still have
> > EPOLLFD_ID_INVALID set.
> >
> > As we're not really using epoll identifiers yet, set
> > EPOLLFD_ID_DEFAULT right away on newly allocated flows, while we
> > figure this out.
> >
> > Link: https://bodhi.fedoraproject.org/updates/FEDORA-2025-93b4eb64c3#comment-4473411
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > ---
> > I just merged this, posting for awareness / review.
>
> Ah, never mind, this makes it worse somehow:
>
> 5.6384: Flow 0 (TCP connection (spliced)): SPLICE_CONNECT
> 5.6384: Flow 0 (TCP connection (spliced)): ERROR on epoll_ctl(): No such file or directory
This makes sense: epollid != EPOLLFD_ID_INVALID indicates that the
flow's fds are already in the epoll (flow_in_epoll() will return
true). With epollid initialised to EPOLLFD_ID_DEFAULT, we'll attempt
EPOLL_CTL_MOD on the very first tcp_splice_epoll_ctl(), having never
added the fds to the epoll set, hence this error.
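
The "No such file or directory" message is ENOENT, which is what epoll_ctl() reports for EPOLL_CTL_MOD on a descriptor that was never added to the set. The standalone sketch below shows that kernel behaviour outside passt. The helper name mod_errno() is made up for illustration, and a pipe stands in for the flow's sockets.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Illustrative helper, not passt code: issue EPOLL_CTL_MOD on the read end of
 * a pipe, optionally after an EPOLL_CTL_ADD. Returns 0 if the MOD succeeds,
 * the errno it failed with, or -1 if the setup itself failed.
 */
int mod_errno(int add_first)
{
	struct epoll_event ev = { .events = EPOLLIN };
	int pipefd[2], epfd, ret = -1;

	epfd = epoll_create1(EPOLL_CLOEXEC);
	if (epfd < 0)
		return -1;

	if (pipe(pipefd) < 0) {
		close(epfd);
		return -1;
	}

	if (add_first && epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev))
		goto out;

	/* Save errno right away: the close() calls below may change it */
	ret = epoll_ctl(epfd, EPOLL_CTL_MOD, pipefd[0], &ev) ? errno : 0;

out:
	close(pipefd[0]);
	close(pipefd[1]);
	close(epfd);
	return ret;
}
```

Calling mod_errno(0) returns ENOENT, while mod_errno(1) returns 0, because the descriptor is registered before it is modified.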
> ...still looking for a workaround / fix.
Could the flow - for some other reason - be closing almost
immediately, before it even adds itself to the epoll? If that's the
case, we could potentially trigger this in the (flag == CLOSING)
section of conn_flag_do().
I haven't managed to reproduce, so I can't test this myself.
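
The two failure modes discussed in this thread can be put side by side in a toy model. Everything here is an assumption made for illustration: the toy_ names are invented, the value of the invalid identifier is inferred from the assertion text, the default of 0 is a guess, and the real logic in flow.c and tcp_splice.c may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: the assertion implies INVALID is (1 << 8) - 1, and the
 * default of 0 is a guess made for this sketch.
 */
#define TOY_EPOLLFD_ID_DEFAULT	0
#define TOY_EPOLLFD_ID_INVALID	((1 << 8) - 1)

enum toy_op { TOY_CTL_ADD, TOY_CTL_MOD };

struct toy_flow {
	uint8_t epollid;
	bool registered;	/* true once an ADD actually happened */
};

/* Mirrors the asserted condition: f->epollid < ((1 << 8) - 1) */
bool toy_epollid_valid(const struct toy_flow *f)
{
	return f->epollid < TOY_EPOLLFD_ID_INVALID;
}

/* As described above: any identifier other than INVALID is taken to mean
 * "already in the epoll set", so MOD is chosen over ADD.
 */
enum toy_op toy_pick_op(const struct toy_flow *f)
{
	return f->epollid != TOY_EPOLLFD_ID_INVALID ? TOY_CTL_MOD : TOY_CTL_ADD;
}

/* A MOD on something that was never added is the case that fails */
bool toy_op_would_fail(const struct toy_flow *f)
{
	return toy_pick_op(f) == TOY_CTL_MOD && !f->registered;
}
```

In this model, a flow that closes while still holding TOY_EPOLLFD_ID_INVALID fails the validity check, which corresponds to the original assertion. A flow that starts with TOY_EPOLLFD_ID_DEFAULT passes that check but picks MOD before any ADD, which corresponds to the epoll_ctl() error.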
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] flow: Set EPOLLFD_ID_DEFAULT on newly allocated flows, not EPOLLFD_ID_INVALID
2025-12-09 0:01 ` David Gibson
@ 2025-12-09 0:05 ` Stefano Brivio
0 siblings, 0 replies; 6+ messages in thread
From: Stefano Brivio @ 2025-12-09 0:05 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev, Laurent Vivier
On Tue, 9 Dec 2025 11:01:27 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> Could the flow - for some other reason - be closing almost
> immediately, before it even adds itself to the epoll? If that's the
> case, we could potentially trigger this in the (flag == CLOSING)
> section of conn_flag_do().
Yes, that's what happens, see my previous email.
--
Stefano
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-12-09 0:05 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
2025-12-08 21:28 [PATCH] flow: Set EPOLLFD_ID_DEFAULT on newly allocated flows, not EPOLLFD_ID_INVALID Stefano Brivio
2025-12-08 21:54 ` Stefano Brivio
2025-12-08 23:36 ` David Gibson
2025-12-08 23:46 ` Stefano Brivio
2025-12-09 0:01 ` David Gibson
2025-12-09 0:05 ` Stefano Brivio