On Tue, Aug 13, 2024 at 10:58:42PM -0700, Matt Hamilton wrote:
> I am using Podman in Fedora 40, which uses pasta by default for rootless
> container networking.
>
> Fedora 40's base version of passt is `passt-0^20240326.g4988e2b-1.fc40`, but
> recently two newer versions were released,
> `passt-0^20240726.g57a21d2-1.fc40` and `0^20240806.gee36266-1.fc40`.
>
> After upgrading, one pod kept going offline after a few minutes. The
> containers remained running, but could not make outbound connections.
> Journalctl revealed that the pasta process for the pod had crashed with:
>
> Aug 08 23:07:55 dev pasta[95859]: ASSERTION FAILED in flow_hash
> (flow.c:566): pif != PIF_NONE && !inany_is_unspecified(&side->eaddr)
> && side->eport != 0 && side->fport != 0

Ouch.

> Aug 08 23:07:55 dev audit[95859]: SECCOMP auid=1000 uid=1000
> gid=1000 ses=1
> subj=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023
> pid=95859 comm="pasta.avx2" exe="/usr/bin/pasta.avx2" sig=31
> arch=c000003e syscall=186 compat=0 ip=0x7f8f8c23b64f code=0x80000000
> Aug 08 23:07:55 dev audit[95859]: ANOM_ABEND auid=1000 uid=1000
> gid=1000 ses=1
> subj=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023
> pid=95859 comm="pasta.avx2" exe="/usr/bin/pasta.avx2" sig=31 res=1
>
> After much debugging, I isolated the trigger to a particular container
> making a peer-to-peer TCP connection to a remote address with port 0.

Huh.

> Reverting passt to version 20240326 works as expected, and the container
> stays online. It's been a long time since I wrote any C, but the code seems
> clear and checks that the endpoint and forwarding ports do not equal 0. I
> assume that a port 0 connection is not realistic or useful, and that actual
> attempt to connect over this port indicate a bug in the client code. Is this
> correct?

So, AFAICT the RFCs don't preclude using port 0 for connections on the wire.
However, it's usually not really sensible to do so: at least on systems
with a BSD-like socket interface, a port of 0 usually means
"unspecified" or "kernel, please pick for me".

Obviously this client is making it happen - my guess would be that a 0
port in connect() is interpreted as a literal port 0, but I'm not sure
how the server is receiving it in this case, since a bind() with port 0
will cause the kernel to pick a port.

So, it does look like the client is doing something weird, although
whether it's technically invalid is debatable.

Even if it is valid for the client to do this, pasta can't really handle
that case, because it's using the sockets interface to do the
forwarding.

BUT, it absolutely should not be crashing - it should log a debug
message, drop the connection and carry on. We have code which is
supposed to handle this case gracefully before reaching that assertion.
I'm not immediately sure why that's not working.

One possibility is that the client _isn't_ doing something weird, but an
unusual port forwarding configuration on pasta is remapping a sensible
port to port 0, thus causing the crash. Getting the full podman command
line for the failing container would be the next step here.

If you could file a bug at https://bugs.passt.top that would be most
helpful.

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
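
For reference, the bind() behaviour mentioned above (port 0 meaning
"kernel, please pick for me") can be checked with a short sketch. This
is only an illustration of the sockets interface, not pasta code; the
helper name bind_ephemeral is made up for this example, and it assumes
a loopback interface is available.

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind a TCP socket to port 0 and return the port the kernel chose.

    Hypothetical helper for illustration only; not part of passt/pasta.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Port 0 in bind() is a request for any free port,
        # not a request for literal port 0.
        s.bind((host, 0))
        # getsockname() reports the address and port actually assigned.
        return s.getsockname()[1]


if __name__ == "__main__":
    print("kernel-assigned port:", bind_ephemeral())
```

The sketch deliberately leaves out connect() with port 0: how that is
treated is the open question in this thread, so nothing is claimed
about it here.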