Date: Thu, 10 Oct 2024 16:57:32 +1100
From: David Gibson
To: Stefano Brivio
CC: passt-dev@passt.top
Subject: Re: [PATCH v3 4/4] fwd: Direct inbound spliced forwards to the guest's external address
References: <20241002054826.1812844-1-david@gibson.dropbear.id.au> <20241002054826.1812844-5-david@gibson.dropbear.id.au> <20241009150721.63af48f6@elisabeth> <20241009224433.7fc28fc7@elisabeth>
In-Reply-To: <20241009224433.7fc28fc7@elisabeth>
List-Id: Development discussion and patches for passt

On Wed, Oct 09, 2024 at 10:44:33PM +0200, Stefano Brivio wrote:
> On Wed, 9 Oct 2024 15:07:21 +0200
> Stefano Brivio wrote:
>
> > On Wed, 2 Oct 2024 15:48:26 +1000
> > David Gibson wrote:
> >
> > > In pasta mode, where addressing permits we "splice" connections, forwarding
> > > directly from host socket to guest/container socket without any L2 or L3
> > > processing.  This gives us a very large performance improvement when it's
> > > possible.
> > >
> > > Since the traffic is from a local socket within the guest, it will go over
> > > the guest's 'lo' interface, and accordingly we set the guest side address
> > > to be the loopback address.  However this has a surprising side effect:
> > > sometimes guests will run services that are only supposed to be used within
> > > the guest and are therefore bound to only 127.0.0.1 and/or ::1.  pasta's
> > > forwarding exposes those services to the host, which isn't generally what
> > > we want.
> > >
> > > Correct this by instead forwarding inbound "splice" flows to the guest's
> > > external address.
> > >
> > > Link: https://github.com/containers/podman/issues/24045
> > >
> > > Signed-off-by: David Gibson
> > > ---
> > >  conf.c  |  9 +++++++++
> > >  fwd.c   | 31 +++++++++++++++++++++++--------
> > >  passt.1 | 23 +++++++++++++++++++----
> > >  passt.h |  2 ++
> > >  4 files changed, 53 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/conf.c b/conf.c
> > > index 6e62510..b5318f3 100644
> > > --- a/conf.c
> > > +++ b/conf.c
> > > @@ -908,6 +908,9 @@ pasta_opts:
> > >  		" -U, --udp-ns SPEC UDP port forwarding to init namespace\n"
> > >  		" SPEC is as described above\n"
> > >  		" default: auto\n"
> > > +		" --host-lo-to-ns-lo DEPRECATED:\n"
> > > +		" Translate host-loopback forwards to\n"
> > > +		" namespace loopback\n"
> > >  		" --userns NSPATH Target user namespace to join\n"
> > >  		" --netns PATH|NAME Target network namespace to join\n"
> > >  		" --netns-only Don't join existing user namespace\n"
> > > @@ -1284,6 +1287,7 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  		{"netns-only", no_argument, NULL, 20 },
> > >  		{"map-host-loopback", required_argument, NULL, 21 },
> > >  		{"map-guest-addr", required_argument, NULL, 22 },
> > > +		{"host-lo-to-ns-lo", no_argument, NULL, 23 },
> > >  		{ 0 },
> > >  	};
> > >  	const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt";
> > > @@ -1461,6 +1465,11 @@ void conf(struct ctx *c, int argc, char **argv)
> > >  			conf_nat(optarg, &c->ip4.map_guest_addr,
> > >  				 &c->ip6.map_guest_addr, NULL);
> > >  			break;
> > > +		case 23:
> > > +			if (c->mode != MODE_PASTA)
> > > +				die("--host-lo-to-ns-lo is for pasta mode only");
> > > +			c->host_lo_to_ns_lo = 1;
> > > +			break;
> > >  		case 'd':
> > >  			c->debug = 1;
> > >  			c->quiet = 0;
> > > diff --git a/fwd.c b/fwd.c
> > > index a505098..c71f5e1 100644
> > > --- a/fwd.c
> > > +++ b/fwd.c
> > > @@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
> > >  	    (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
> > >  		/* spliceable */
> > >
> > > -		/* Preserve the specific loopback adddress used, but let the
> > > -		 * kernel pick a source port on the target side
> > > +		/* The traffic will go over the guest's 'lo' interface, but by
> > > +		 * default use its external address, so we don't inadvertently
> > > +		 * expose services that listen only on the guest's loopback
> > > +		 * address.  That can be overridden by --host-lo-to-ns-lo which
> > > +		 * will instead forward to the loopback address in the guest.
> > > +		 *
> > > +		 * In either case, let the kernel pick the source address to
> > > +		 * match.
> > >  		 */
> > > -		tgt->oaddr = ini->eaddr;
> > > +		if (inany_v4(&ini->eaddr)) {
> > > +			if (c->host_lo_to_ns_lo)
> > > +				tgt->eaddr = inany_loopback4;
> > > +			else
> > > +				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> > > +			tgt->oaddr = inany_any4;
> > > +		} else {
> > > +			if (c->host_lo_to_ns_lo)
> > > +				tgt->eaddr = inany_loopback6;
> > > +			else
> > > +				tgt->eaddr.a6 = c->ip6.addr_seen;
> >
> > Either this...
> >
> > > +			tgt->oaddr = inany_any6;
> >
> > or this (and not something before this patch, up to 3/4) make the
> > "TCP/IPv6: host to ns (spliced): big transfer" test in pasta/tcp hang,
> > sometimes (about one in three/four runs), that's what I mistakenly
> > reported as coming from Laurent's series at:

Huh, interesting.  Just got back from my leave and ran that group of
tests in a loop this afternoon, but didn't manage to reproduce.  I
have administrivia that will probably fill the rest of this week, but
I'll look into this as soon as I can.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
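
For anyone reading the fwd.c hunk quoted above in isolation, the
destination choice it makes for spliced inbound flows can be sketched
on its own.  This is only a minimal sketch under stated assumptions,
not passt's code: struct sketch_ctx, splice_target4() and
splice_target6() are hypothetical names, and plain struct in_addr /
struct in6_addr stand in for passt's own address type.  It models only
the guest-side destination (tgt->eaddr in the patch); the source
address, which the patch leaves for the kernel to pick, is not
modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical, cut-down stand-in for the fields of passt's context
 * that the quoted hunk reads.  Names are assumptions for illustration.
 */
struct sketch_ctx {
	bool host_lo_to_ns_lo;		/* set by --host-lo-to-ns-lo */
	struct in_addr addr4_seen;	/* guest's observed IPv4 address */
	struct in6_addr addr6_seen;	/* guest's observed IPv6 address */
};

/* Destination for a spliced inbound IPv4 flow: the guest's external
 * address by default, loopback only if the deprecated option is set.
 */
struct in_addr splice_target4(const struct sketch_ctx *c)
{
	struct in_addr lo = { .s_addr = htonl(INADDR_LOOPBACK) };

	assert(c != NULL);

	if (c->host_lo_to_ns_lo)
		return lo;

	return c->addr4_seen;
}

/* Same choice for a spliced inbound IPv6 flow */
struct in6_addr splice_target6(const struct sketch_ctx *c)
{
	assert(c != NULL);

	if (c->host_lo_to_ns_lo)
		return in6addr_loopback;

	return c->addr6_seen;
}
```

With the option unset, both helpers return the guest's observed
address, so a guest service bound only to 127.0.0.1 or ::1 is no
longer the destination of the forward.  Setting host_lo_to_ns_lo
restores the old loopback destination.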