Date: Sat, 10 Jan 2026 19:12:02 +0100
From: Stefano Brivio
To: Paul Holzinger
Subject: Re: [PATCH] conf, pasta: Add --no-tap option
Message-ID: <20260110191202.027b7f95@elisabeth>
In-Reply-To: <337b7401-7794-4538-80b0-7ddae66daae7@redhat.com>
References: <20251229095558.918055-1-yuhuang@redhat.com>
 <20260105221056.71e7ff8b@elisabeth>
 <337b7401-7794-4538-80b0-7ddae66daae7@redhat.com>
Organization: Red Hat
CC: Yumei Huang, passt-dev@passt.top, david@gibson.dropbear.id.au, Jon Maloy
List-Id: Development discussion and patches for passt

[Cc'ing Jon for awareness around the part about netlink monitor and
capabilities, four paragraphs down]

On Wed, 7 Jan 2026 16:20:18 +0100
Paul Holzinger wrote:

> On 05/01/2026 22:10, Stefano Brivio wrote:
> > On Mon, 5 Jan 2026 14:48:15 +0100
> > Paul Holzinger wrote:
> >
> >> Sorry I was out for a while so I didn't had time to clarify on the bug
> >> before.
> >>
> >> On 29/12/2025 10:55, Yumei Huang wrote:
> >>> This patch introduces a mode where we only forward loopback connections
> >>> and traffic between two namespaces (via the loopback interface, 'lo'),
> >>> without a tap device.
> >>>
> >>> With this, podman can support forwarding ::1 in custom networks when using
> >>> rootlesskit for forwarding ports.
> >> I guess I didn't really communicate my requirements well.
> > I guess it's more likely that you actually did, but I mixed up the
> > association between requirements and use cases, sorry for that.
> >
> > In any case, good that we need this anyway, just for another use case.
> > :)
> >
> >> When we use
> >> rootlessport (rootlesskit) today for custom networks we only do so as
> >> rootless user and it forwards ::1 (by possibly mapping this to v4 inside
> >> the container) fine.
> > So, wait a moment, is my comment at:
> >
> >   https://github.com/containers/podman/issues/14491#issuecomment-2898191772
> >
> > actually wrong? I don't have time right now to test that but from user
> > reports and some vague memory I thought ::1 forwarding wouldn't work
> > with custom networks regardless of root or rootless, because
> > rootlesskit didn't handle that anyway.
>
> yes, rootlesskit handles ipv6 just fine, it is just that our
> rootlessport code remaps that to v4 inside the container.

Actually, at a glance, I don't think that this could be fixed entirely
in the rootlessport implementation, as rootlesskit doesn't seem to look
at the destination address of the original connection at all.

> >> My main point for this feature was using as root (requires further
> >> changes to allow pasta running as root).
> > ...which should be entirely on Podman side and it's still on my plate,
> > by the way:
> >
> >   https://github.com/containers/podman/issues/17840
> >   https://pad.passt.top/p/Features_2025#L40
>
> I don't see how this can be fixed on the podman side, the network
> namespace of a rootful container (not userns=auto) is owned by the root
> user. If you configure something in there you must have real
> CAP_NET_ADMIN from the host init userns. So pasta must not drop this
> privilege before configuring the netns.

Oops, right. My starting point was this change, which is actually
trivial (at least as a test) and something I already tried out, but
then I hit a number of issues in Podman I never really figured out.

So yes, it takes one change in pasta, but the substantial part left for
me to figure out is why Podman didn't just work with it. It's not
necessarily complicated, I spent just a couple of hours on it, so maybe
there's something simple I missed.

> And even then with the future
> netlink monitor work we would need to keep that privilege level to
> modify the netns even during runtime?

This just reminded me that, somewhat surprisingly, for netlink
operations, the check on capabilities is not just performed on the
process creating the socket when the socket is created, but also later
*on the sender of the message*.

This is inconsistent with other operations on other types of sockets
where the whole context is checked and assigned at the time of the
creation, and was introduced because of a specific behaviour of Zebra
(the routing daemon) in 2014, see discussion around:

  https://lore.kernel.org/all/87d2g7d9ag.fsf_-_@x220.int.ebiederm.org/#r

and I stumbled upon it a while ago while preparing a seitan demo
replaying nft messages for an unprivileged container:

  https://seitan.rocks/seitan/tree/demo/nft.hjson#n38

So, my blanket answer "we create that socket at the beginning" doesn't
apply here.
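
For illustration only, a minimal sketch (not taken from passt, seitan or
Podman) of how the per-message nature of this check could be observed
on Linux with Python's standard socket module. It assumes that
interface index 1 is 'lo' and uses an RTM_SETLINK request that changes
nothing; the helper names build_setlink(), parse_ack() and try_setlink()
are hypothetical. Creating the socket needs no privilege, and any
refusal only arrives in the kernel's acknowledgement of the request:

```python
import errno
import socket
import struct

# Constants from <linux/netlink.h> and <linux/rtnetlink.h>
NLMSG_ERROR = 2
NLM_F_REQUEST = 0x01
NLM_F_ACK = 0x04
RTM_SETLINK = 19

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # family, pad, type, index, flags, change


def build_setlink(ifindex, seq=1):
    """Build an RTM_SETLINK request with no flag changes and no attributes."""
    body = IFINFOMSG.pack(socket.AF_UNSPEC, 0, ifindex, 0, 0)
    header = NLMSGHDR.pack(NLMSGHDR.size + len(body), RTM_SETLINK,
                           NLM_F_REQUEST | NLM_F_ACK, seq, 0)
    return header + body


def parse_ack(data):
    """Return the errno carried by an NLMSG_ERROR reply (0 is a plain ACK)."""
    _, msg_type, _, _, _ = NLMSGHDR.unpack_from(data, 0)
    if msg_type != NLMSG_ERROR:
        raise ValueError("not an NLMSG_ERROR message")
    (error,) = struct.unpack_from("=i", data, NLMSGHDR.size)
    return -error


def try_setlink(ifindex=1):
    """Open a route netlink socket, send the request, return the errno."""
    # Opening and binding the socket is allowed for unprivileged processes
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                       socket.NETLINK_ROUTE) as sock:
        sock.bind((0, 0))                            # kernel picks port ID
        sock.sendto(build_setlink(ifindex), (0, 0))  # port ID 0: the kernel
        return parse_ack(sock.recv(4096))
```

Calling errno.errorcode.get(try_setlink(), "OK") as an unprivileged
user would be expected to report EPERM even though the socket was
created without trouble. Note that this only shows that the refusal
happens per message: by itself it doesn't tell apart the check on the
opener of the socket from the one on the sender.
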
However, assuming that this RFC patch from Jon actually works (I
haven't tested it):

  https://archives.passt.top/passt-dev/20251215015441.887736-11-jmaloy@redhat.com/

I would say we're fine with it. Well, there's still the possibility
that it doesn't work if Podman originally detached the network
namespace, I'm not sure.

If it doesn't work, we'll need to retain more capabilities, or even
keep a cloned process around for this kind of stuff. We could also fix
that in the kernel, Zebra doesn't need that quirk anymore.

> >> Because as root podman does
> >> port forwarding via DNAT firewall rules (i.e. custom nftables rules we
> >> add). The kernel however never added support for DNAT on ::1 meaning
> >> clients trying to access that are not getting forwarded. The only way to
> >> support this is using a user space helper. Right now this doesn't work
> >> and we do not use rootlessport for this either so I was just thinking
> >> ahead because we do have these users requests who want ::1 to work as root.
> >>
> >> For the current rootlessport use case we also must bind all ports as
> >> given (i.e. also addresses 0.0.0.0 bind address), just forwarding
> >> loopback to loopback is not what we want or do for security reasons, see
> >> CVE-2021-20199. And logically it would not really work to have another
> >> process bind 0.0.0.0 and this pasta helper bind lo on the same port at
> >> the same time.
> >>
> >> The way I am thinking is bind ports as normal, add the no-tap option and
> >> add two options to give the v4 and v6 namespace (container) side connect
> >> addresses so we never actually connect to lo. Then we also should have a
> >> dynamic way to update the connect addresses at runtime which is required
> >> for podman network connect/disconnect to work which changes the
> >> addresses inside the namespace, see
> >> https://github.com/containers/podman/commit/e88d8dbeae2aebd2d816f16a21891764163afcd4.
> >>
> >> Overall none of this is a blocker for removing rootlessport. I think our
> >> plan was and still is to use the dynamic port forwarding logic David is
> >> working on to replace the rootless custom network port forwarding case
> >> with that.
> > Regardless of other requirements that are needed as well to support
> > forwarding ::1 for root containers (or rootless with --userns=auto),
> > this feature by itself makes sense as it is and we'll need it as it is,
> > right?
> >
> > By the way we routinely get requests for this feature by pasta (and
> > Podman) users, regardless of any specific Podman integration, so I
> > think the feature is generic enough as to make sense regardless of your
> > plan for root containers.
>
> I am not sure how I would use or integrate a loopback to loopback
> forwarder in podman so I don't think we would need or can use that as is.

Well, I'm not sure, I just remember that you had in mind some use cases
that could be fixed with this (and even noted them down in the
references from the ticket).

Sorry Yumei, I should have checked more recently, as it looks like this
doesn't currently have as much priority as I thought, at least in
Podman's perspective. In any case it's definitely useful.

By the way, if it's for the root case, we'll still need it the day we
support operation when started as root. If it's to fix up IPv4 / IPv6
loopback mapping in the rootless case, it would be usable right away.

> I think the use case itself is still interesting and if there are end
> users asking for it sure not objections from me. I guess it could be
> interesting to expose a service without giving it access to the full
> internet and without having to deal with complicated firewall rules,
> i.e. with this we get a container that only could communicate by
> replying to the forwarded ports.

Right, yes, it might also be one way to implement "isolated" containers
as described in https://bugs.passt.top/show_bug.cgi?id=139 (I still
have to follow up on comments there, and that might take a while, but
let me quickly mention that it has little/nothing to do with local
mode).

-- 
Stefano