public inbox for passt-dev@passt.top
* [PATCH] netlink: Don't try to get further datagrams in nl_route_dup() on NLMSG_DONE
@ 2024-03-15 11:24 Stefano Brivio
  2024-03-15 13:11 ` Paul Holzinger
  2024-03-18  3:16 ` David Gibson
  0 siblings, 2 replies; 6+ messages in thread
From: Stefano Brivio @ 2024-03-15 11:24 UTC
  To: passt-dev; +Cc: Martin Pitt, Paul Holzinger, David Gibson

Martin reports that, with Fedora Linux kernel version
kernel-core-6.9.0-0.rc0.20240313gitb0546776ad3f.4.fc41.x86_64,
including commit 87d381973e49 ("genetlink: fit NLMSG_DONE into same
read() as families"), pasta doesn't exit once the network namespace
is gone.

Actually, pasta is completely non-functional, at least with default
options, because nl_route_dup(), which duplicates routes from the
parent namespace into the target namespace at start-up, is stuck on
a second receive operation for RTM_GETROUTE.

However, with that commit, the kernel is now able to fit the whole
response, including the NLMSG_DONE message, into a single datagram,
so no further messages will be received.
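
As an illustration of the single-datagram case, here is a minimal
sketch (not passt code) that walks one buffer with the standard
NLMSG_OK() and NLMSG_NEXT() macros and reports whether NLMSG_DONE was
found in it. The names datagram_has_done() and put_msg() are
hypothetical, and the messages carry empty payloads for brevity:

```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stdbool.h>
#include <string.h>

/* Walk a single datagram, counting RTM_NEWROUTE messages.
 * Return true if the datagram itself contains NLMSG_DONE, that is,
 * if no further datagram should be expected for this request.
 */
static bool datagram_has_done(void *buf, int len, int *routes)
{
	struct nlmsghdr *nh;

	*routes = 0;
	for (nh = buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
		if (nh->nlmsg_type == NLMSG_DONE)
			return true;
		if (nh->nlmsg_type == RTM_NEWROUTE)
			(*routes)++;
	}

	return false;
}

/* Hypothetical helper: append a header-only message of the given type
 * at offset 'off', return the offset just past it.
 */
static int put_msg(char *buf, int off, unsigned short type)
{
	struct nlmsghdr nh;

	memset(&nh, 0, sizeof(nh));
	nh.nlmsg_len = NLMSG_LENGTH(0);
	nh.nlmsg_type = type;
	memcpy(buf + off, &nh, sizeof(nh));

	return off + NLMSG_ALIGN(nh.nlmsg_len);
}
```

With a buffer holding one RTM_NEWROUTE followed by NLMSG_DONE, the
function returns true, and a caller would have no reason to call
recv() again.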

It turns out that commit 4d6e9d0816e2 ("netlink: Always process all
responses to a netlink request") accidentally relied on the fact that
we would always get at least two datagrams as a response to
RTM_GETROUTE.

That is, the test for whether we expect another datagram is based on
the 'status' variable, which is 0 if we just parsed NLMSG_DONE, but
we also expect another datagram if NLMSG_OK on the last message is
false. And NLMSG_OK with a zero length is always false.

The problem is that we don't distinguish whether status is zero
because we got an NLMSG_DONE message, or because we processed all
the available datagram bytes.
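
The ambiguity can be reproduced in isolation. The sketch below copies
the old and new conditions from the diff into two hypothetical helpers,
more_expected_old() and more_expected_new(), so that they can be
evaluated on hand-built headers. hdr() is also a hypothetical helper,
and 'status' is assumed to be 0 both after NLMSG_DONE and once all
bytes are consumed, as described above:

```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stdbool.h>
#include <string.h>

/* Condition before this patch: NLMSG_OK(nh, 0) is always false, so a
 * zero status always makes us wait for another datagram.
 */
static bool more_expected_old(const struct nlmsghdr *nh, int status)
{
	return !NLMSG_OK(nh, status) || status > 0;
}

/* Condition after this patch: NLMSG_DONE ends the exchange, whatever
 * the value of status is.
 */
static bool more_expected_new(const struct nlmsghdr *nh, int status)
{
	return nh->nlmsg_type != NLMSG_DONE &&
	       (!NLMSG_OK(nh, status) || status > 0);
}

/* Hypothetical helper: build a header-only message of the given type */
static struct nlmsghdr hdr(unsigned short type)
{
	struct nlmsghdr nh;

	memset(&nh, 0, sizeof(nh));
	nh.nlmsg_len = NLMSG_LENGTH(0);
	nh.nlmsg_type = type;

	return nh;
}
```

For a last message of type NLMSG_DONE and status 0, the old condition
is true (another receive is attempted) and the new one is false. For a
last message of type RTM_NEWROUTE and status 0, both are true.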

Introduce an explicit check on NLMSG_DONE. We should probably
refactor this slightly, for example by introducing a special return
code from nl_status(), but this is probably the least invasive fix
for the issue at hand.
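
One possible shape for such a refactoring, purely as a sketch: the
enum and classify_msg() below are hypothetical, they don't reflect the
actual signature of nl_status(), and error handling (NLMSG_ERROR) is
left out. The point is only that "end of datagram" and "NLMSG_DONE"
become distinct results:

```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical result codes, not part of passt */
enum msg_state {
	MSG_MORE,		/* Valid message, keep parsing */
	MSG_END_OF_DATAGRAM,	/* Out of bytes, receive another datagram */
	MSG_DONE,		/* NLMSG_DONE seen, response is complete */
};

/* Classify the message at 'nh', given the bytes left in the buffer.
 * NLMSG_OK() is checked first, so nlmsg_type is only read if a whole
 * header is known to be available.
 */
static enum msg_state classify_msg(const struct nlmsghdr *nh, int remaining)
{
	if (!NLMSG_OK(nh, remaining))
		return MSG_END_OF_DATAGRAM;

	if (nh->nlmsg_type == NLMSG_DONE)
		return MSG_DONE;

	return MSG_MORE;
}
```

A caller would then receive a further datagram only on
MSG_END_OF_DATAGRAM, without having to infer the reason from a zero
byte count.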

Reported-by: Martin Pitt <mpitt@redhat.com>
Link: https://github.com/containers/podman/issues/22052
Fixes: 4d6e9d0816e2 ("netlink: Always process all responses to a netlink request")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 netlink.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/netlink.c b/netlink.c
index 9e7cccb..20de9b3 100644
--- a/netlink.c
+++ b/netlink.c
@@ -525,7 +525,8 @@ int nl_route_dup(int s_src, unsigned int ifi_src,
 		}
 	}
 
-	if (!NLMSG_OK(nh, status) || status > 0) {
+	if (nh->nlmsg_type != NLMSG_DONE &&
+	    (!NLMSG_OK(nh, status) || status > 0)) {
 		/* Process any remaining datagrams in a different
 		 * buffer so we don't overwrite the first one.
 		 */
-- 
2.39.2


Thread overview: 6+ messages
2024-03-15 11:24 [PATCH] netlink: Don't try to get further datagrams in nl_route_dup() on NLMSG_DONE Stefano Brivio
2024-03-15 13:11 ` Paul Holzinger
2024-03-15 14:52   ` Stefano Brivio
2024-03-15 15:17     ` Stefano Brivio
2024-03-15 15:21       ` Paul Holzinger
2024-03-18  3:16 ` David Gibson
