From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=bToPPQ0T; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTP id A47E45A0262 for ; Tue, 01 Oct 2024 20:22:27 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1727806946; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=luvBbpsip4Ysd1ZkdWsx81ePoLUDxMdwc6hecV1NKfo=; b=bToPPQ0TnzgvL8Ej6I9Kb/ZPyBQV8eAJEoN54X/bsXVMrzBF2NjXOCAzcpN/8+wez9myjY VPaDIRk1X5ixoMvmFol5JLK+f78DUllrfPlfG0fbBpDI9eol/kZDvUCai3SoQrqSYhmycR SXnt6PgiEtzWTv5YrvVW+zmMQIigxl8= Received: from mail-pl1-f200.google.com (mail-pl1-f200.google.com [209.85.214.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-169-W9JN_d03P86y4Wko0OffQA-1; Tue, 01 Oct 2024 14:22:25 -0400 X-MC-Unique: W9JN_d03P86y4Wko0OffQA-1 Received: by mail-pl1-f200.google.com with SMTP id d9443c01a7336-20bb491189aso10396545ad.3 for ; Tue, 01 Oct 2024 11:22:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727806944; x=1728411744; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=luvBbpsip4Ysd1ZkdWsx81ePoLUDxMdwc6hecV1NKfo=; b=WP8a10PD4+8GJLHlV1+zVlY5rgs+qValBtCqciKGTIOtTM4yN+AYoAsALgE3jrcyBf WYUx4P6lBY3gYHhRnqLur3RuEbwlX3/KrB4hVkZr+Qr5uPeUjWHmPUpeKwSRSO3H3toF T1G1RruaIjytkg4AX8vGPTWCn4fOIJBA3hu8OSyU3Q5p8xA1tvhzPsOtEeWUT7ZuUXGf XRcyKItln2s+0XOXSM3TdO44owPYzDWunmB7GfJyVnJRq8kVW2u1tUKMA0YPSOLnw9lZ hQXC1/AJNXxwIOur6kaTI1l5sD9775sGARir2411JNauGRa/6OBB0yse5+7RLBdOOZRw PjqA== X-Gm-Message-State: AOJu0YzOxQY57+FcnTov8ewtumXnjMBjDoO7mVThpqWa1Z+uQRLuZH2Y c/A+99z3JfxqMwRXOLw1Gu3idbHRR4bb9L8vjZ0ADuBpBUZfj85Eqmw826XNBXUfWAfsASW49A+ wkfzmtwr+QfasoZLC6FEkpaaXOVEc2beV9ZUY6B5Pu/ieh8/oDZtQk22QAA== X-Received: by 2002:a17:902:e550:b0:205:6a9b:7e3e with SMTP id d9443c01a7336-20bc5a8799cmr10050405ad.56.1727806943871; Tue, 01 Oct 2024 11:22:23 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE55PRJYKALHQIhuuLd6C99wtEanzMntOAvaHdvr5vToxdoG+binuEsj7bHpV5s2d7p0MJSmQ== X-Received: by 2002:a17:902:e550:b0:205:6a9b:7e3e with SMTP id d9443c01a7336-20bc5a8799cmr10049975ad.56.1727806943446; Tue, 01 Oct 2024 11:22:23 -0700 (PDT) Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [2a10:fc81:a806:d6a9::1]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-20b37e61c4csm72697835ad.274.2024.10.01.11.22.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Oct 2024 11:22:22 -0700 (PDT) Date: Tue, 1 Oct 2024 20:22:20 +0200 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH v2 4/4] fwd: Direct inbound spliced forwards to the guest's external address Message-ID: <20241001202220.7683c732@elisabeth> In-Reply-To: <20241001062502.1345449-5-david@gibson.dropbear.id.au> References: <20241001062502.1345449-1-david@gibson.dropbear.id.au> <20241001062502.1345449-5-david@gibson.dropbear.id.au> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.41; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: EMHPSONLSRRHZZIU32ECL6SL6THZ5QRY X-Message-ID-Hash: EMHPSONLSRRHZZIU32ECL6SL6THZ5QRY X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: Some nits, but the series looks overall good to me and I would also apply it like this: On Tue, 1 Oct 2024 16:25:02 +1000 David Gibson wrote: > In pasta mode, where addressing permits we "splice" connections, forwarding > directly from host socket to guest/container socket without any L2 or L3 > processing. This gives us a very large performance improvement when it's > possible. > > Since the traffic is from a local socket within the guest, it will go over > the guest's 'lo' interface, and accordingly we set the guest side address > to be the loopback address. However this has a surprising side effect: > sometimes guests will run services that are only supposed to be used within > the guest and are therefore bound to only 127.0.0.1 and/or ::1. pasta's > forwarding exposes those services to the host, which isn't generally what > we want. > > Correct this by instead forwarding inbound "splice" flows to the guest's > external address. > > Link: https://github.com/containers/podman/issues/24045 > > Signed-off-by: David Gibson > --- > conf.c | 9 +++++++++ > fwd.c | 31 +++++++++++++++++++++++-------- > passt.1 | 23 +++++++++++++++++++---- > passt.h | 1 + > 4 files changed, 52 insertions(+), 12 deletions(-) > > diff --git a/conf.c b/conf.c > index 6e62510..b5318f3 100644 > --- a/conf.c > +++ b/conf.c > @@ -908,6 +908,9 @@ pasta_opts: > " -U, --udp-ns SPEC UDP port forwarding to init namespace\n" > " SPEC is as described above\n" > " default: auto\n" > + " --host-lo-to-ns-lo DEPRECATED:\n" For me, this is harder to type (and remember) than --expose-ns-lo or similar. It doesn't matter so much as it's a deprecated option anyway. --help lines are precious, I wouldn't use one to say DEPRECATED. On the other hand, it's deprecated :) so I don't care that much (and it has the advantage of making it very clear for users). > + " Translate host-loopback forwards to\n" > + " namespace loopback\n" > " --userns NSPATH Target user namespace to join\n" > " --netns PATH|NAME Target network namespace to join\n" > " --netns-only Don't join existing user namespace\n" > @@ -1284,6 +1287,7 @@ void conf(struct ctx *c, int argc, char **argv) > {"netns-only", no_argument, NULL, 20 }, > {"map-host-loopback", required_argument, NULL, 21 }, > {"map-guest-addr", required_argument, NULL, 22 }, > + {"host-lo-to-ns-lo", no_argument, NULL, 23 }, > { 0 }, > }; > const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt"; > @@ -1461,6 +1465,11 @@ void conf(struct ctx *c, int argc, char **argv) > conf_nat(optarg, &c->ip4.map_guest_addr, > &c->ip6.map_guest_addr, NULL); > break; > + case 23: > + if (c->mode != MODE_PASTA) > + die("--host-lo-to-ns-lo is for pasta mode only"); > + c->host_lo_to_ns_lo = 1; > + break; > case 'd': > c->debug = 1; > c->quiet = 0; > diff --git a/fwd.c b/fwd.c > index a505098..c71f5e1 100644 > --- a/fwd.c > +++ b/fwd.c > @@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto, > (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) { > /* spliceable */ > > - /* Preserve the specific loopback adddress used, but let the > - * kernel pick a source port on the target side > + /* The traffic will go over the guest's 'lo' interface, but by > + * default use its external address, so we don't inadvertently > + * expose services that listen only on the guest's loopback > + * address. That can be overridden by --host-lo-to-ns-lo which > + * will instead forward to the loopback address in the guest. > + * > + * In either case, let the kernel pick the source address to > + * match. > */ > - tgt->oaddr = ini->eaddr; > + if (inany_v4(&ini->eaddr)) { > + if (c->host_lo_to_ns_lo) > + tgt->eaddr = inany_loopback4; > + else > + tgt->eaddr = inany_from_v4(c->ip4.addr_seen); > + tgt->oaddr = inany_any4; > + } else { > + if (c->host_lo_to_ns_lo) > + tgt->eaddr = inany_loopback6; > + else > + tgt->eaddr.a6 = c->ip6.addr_seen; > + tgt->oaddr = inany_any6; > + } > + > + /* Let the kernel pick source port */ > tgt->oport = 0; > if (proto == IPPROTO_UDP) > /* But for UDP preserve the source port */ > tgt->oport = ini->eport; > > - if (inany_v4(&ini->eaddr)) > - tgt->eaddr = inany_loopback4; > - else > - tgt->eaddr = inany_loopback6; > - > return PIF_SPLICE; > } > > diff --git a/passt.1 b/passt.1 > index 11104e1..332384c 100644 > --- a/passt.1 > +++ b/passt.1 > @@ -586,6 +586,13 @@ Configure UDP port forwarding from target namespace to init namespace. > > Default is \fBauto\fR. > > +.TP > +.BR \-\-host-lo-to-ns-lo " " (DEPRECATED) > +If specified, connections forwarded with \fB\-t\fR and \fB\-u\fR from > +the host's loopback address will appear on the loopback address in the > +guest as well. Without this option such forwarded packets will appear > +to come from the guest's public address. > + > .TP > .BR \-\-userns " " \fIspec > Target user namespace to join, as a path. If PID is given, without this option, > @@ -874,8 +881,9 @@ interfaces, and it would also be impossible for guest or target > namespace to route answers back. > > For convenience, the source address on these packets is translated to > -the address specified by the \fB\-\-map-host-loopback\fR option. If > -not specified this defaults, somewhat arbitrarily, to the address of > +the address specified by the \fB\-\-map-host-loopback\fR option (with > +some exceptions in pasta mode, see next section below). If not > +specified this defaults, somewhat arbitrarily, to the address of > default IPv4 or IPv6 gateway (if any) -- this is known to be an > existing, valid address on the same subnet. If \fB\-\-no-map-gw\fR or > \fB\-\-map-host-loopback none\fR are specified this translation is > @@ -912,8 +920,15 @@ and the new socket using the \fBsplice\fR(2) system call, and for UDP, a pair > of \fBrecvmmsg\fR(2) and \fBsendmmsg\fR(2) system calls deals with packet > transfers. > > -This bypass only applies to local connections and traffic, because it's not > -possible to bind sockets to foreign addresses. > +Because it's not possible to bind sockets to foreign addresses, this > +bypass only applies to local connections and traffic. It also means > +that the address translation differs slightly from passt mode. > +Connections from loopback to loopback on the host will appear to come > +from the target namespace's public address within the guest, unless > +\fB\-\-host-lo-to-ns-lo\fR is specified, in which case they will > +appear to come from loopback in the namespace as well. The latter > +behaviour used to be the default, but is usually undesirable, since it > +can unintentionally expose namespace local services to the host. > > .SS Binding to low numbered ports (well-known or system ports, up to 1023) > > diff --git a/passt.h b/passt.h > index 031c9b6..3b7b281 100644 > --- a/passt.h > +++ b/passt.h > @@ -284,6 +284,7 @@ struct ctx { > int no_dhcpv6; > int no_ndp; > int no_ra; > + int host_lo_to_ns_lo; This is missing from the struct comment, but... again, it's deprecated, so it doesn't matter much. -- Stefano