Date: Wed, 9 Oct 2024 22:44:33 +0200
From: Stefano Brivio
To: David Gibson
CC: passt-dev@passt.top
Subject: Re: [PATCH v3 4/4] fwd: Direct inbound spliced forwards to the guest's external address
Message-ID: <20241009224433.7fc28fc7@elisabeth>
In-Reply-To: <20241009150721.63af48f6@elisabeth>
References: <20241002054826.1812844-1-david@gibson.dropbear.id.au> <20241002054826.1812844-5-david@gibson.dropbear.id.au> <20241009150721.63af48f6@elisabeth>
Organization: Red Hat
List-Id: Development discussion and patches for passt

On Wed, 9 Oct 2024 15:07:21 +0200
Stefano Brivio wrote:

> On Wed, 2 Oct 2024 15:48:26 +1000
> David Gibson wrote:
>
> > In pasta mode, where addressing permits we "splice" connections, forwarding
> > directly from host socket to guest/container socket without any L2 or L3
> > processing. This gives us a very large performance improvement when it's
> > possible.
> >
> > Since the traffic is from a local socket within the guest, it will go over
> > the guest's 'lo' interface, and accordingly we set the guest side address
> > to be the loopback address. However this has a surprising side effect:
> > sometimes guests will run services that are only supposed to be used within
> > the guest and are therefore bound to only 127.0.0.1 and/or ::1. pasta's
> > forwarding exposes those services to the host, which isn't generally what
> > we want.
> >
> > Correct this by instead forwarding inbound "splice" flows to the guest's
> > external address.
> >
> > Link: https://github.com/containers/podman/issues/24045
> >
> > Signed-off-by: David Gibson
> > ---
> >  conf.c  |  9 +++++++++
> >  fwd.c   | 31 +++++++++++++++++++++++--------
> >  passt.1 | 23 +++++++++++++++++++----
> >  passt.h |  2 ++
> >  4 files changed, 53 insertions(+), 12 deletions(-)
> >
> > diff --git a/conf.c b/conf.c
> > index 6e62510..b5318f3 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -908,6 +908,9 @@ pasta_opts:
> >  		" -U, --udp-ns SPEC	UDP port forwarding to init namespace\n"
> >  		" SPEC is as described above\n"
> >  		" default: auto\n"
> > +		" --host-lo-to-ns-lo	DEPRECATED:\n"
> > +		"			Translate host-loopback forwards to\n"
> > +		"			namespace loopback\n"
> >  		" --userns NSPATH 	Target user namespace to join\n"
> >  		" --netns PATH|NAME	Target network namespace to join\n"
> >  		" --netns-only		Don't join existing user namespace\n"
> > @@ -1284,6 +1287,7 @@ void conf(struct ctx *c, int argc, char **argv)
> >  		{"netns-only",	no_argument,		NULL,		20 },
> >  		{"map-host-loopback", required_argument, NULL,		21 },
> >  		{"map-guest-addr", required_argument,	NULL,		22 },
> > +		{"host-lo-to-ns-lo", no_argument, 	NULL,		23 },
> >  		{ 0 },
> >  	};
> >  	const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt";
> > @@ -1461,6 +1465,11 @@ void conf(struct ctx *c, int argc, char **argv)
> >  			conf_nat(optarg, &c->ip4.map_guest_addr,
> >  				 &c->ip6.map_guest_addr, NULL);
> >  			break;
> > +		case 23:
> > +			if (c->mode != MODE_PASTA)
> > +				die("--host-lo-to-ns-lo is for pasta mode only");
> > +			c->host_lo_to_ns_lo = 1;
> > +			break;
> >  		case 'd':
> >  			c->debug = 1;
> >  			c->quiet = 0;
> > diff --git a/fwd.c b/fwd.c
> > index a505098..c71f5e1 100644
> > --- a/fwd.c
> > +++ b/fwd.c
> > @@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
> >  	 (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
> >  		/* spliceable */
> >
> > -		/* Preserve the specific loopback adddress used, but let the
> > -		 * kernel pick a source port on the target side
> > +		/* The traffic will go over the guest's 'lo' interface, but by
> > +		 * default use its external address, so we don't inadvertently
> > +		 * expose services that listen only on the guest's loopback
> > +		 * address. That can be overridden by --host-lo-to-ns-lo which
> > +		 * will instead forward to the loopback address in the guest.
> > +		 *
> > +		 * In either case, let the kernel pick the source address to
> > +		 * match.
> >  		 */
> > -		tgt->oaddr = ini->eaddr;
> > +		if (inany_v4(&ini->eaddr)) {
> > +			if (c->host_lo_to_ns_lo)
> > +				tgt->eaddr = inany_loopback4;
> > +			else
> > +				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> > +			tgt->oaddr = inany_any4;
> > +		} else {
> > +			if (c->host_lo_to_ns_lo)
> > +				tgt->eaddr = inany_loopback6;
> > +			else
> > +				tgt->eaddr.a6 = c->ip6.addr_seen;
>
> Either this...
>
> > +			tgt->oaddr = inany_any6;
>
> or this (and not something before this patch, up to 3/4) make the
> "TCP/IPv6: host to ns (spliced): big transfer" test in pasta/tcp hang,
> sometimes (about one in three/four runs), that's what I mistakenly
> reported as coming from Laurent's series at:
>
> https://archives.passt.top/passt-dev/20241002163238.1778ed19@elisabeth/
>
> It hangs like this (display with >= 240 columns):

Ouch, sorry, it looks like saving something in claws-mail as draft and
sending it later means lines will be forcefully wrapped. Here's the
original test output:

ns$ ip -j -4 addr show|jq -rM '.[] | select(.ifname == "enp9s0").addr_info[0].local' │...passed.
88.198.0.164 │
ns$ ip -j -4 route show|jq -rM '.[] | select(.dst == "default").gateway' │Starting test: TCP/IPv4: ns to host (spliced): big transfer
88.198.0.161 │? cmp /home/sbrivio/passt/test/big.bin /tmp/passt-tests-EsDdjG/pasta/tcp/test_big.bin
ns$ ip -j link show | jq -rM '.[] | select(.ifname == "enp9s0").mtu' │...passed.
65520 │
ns$ /sbin/dhclient -6 --no-pid enp9s0 │Starting test: TCP/IPv4: ns to host (via tap): big transfer
ns$ ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "enp9s0").addr_info[] | select(.prefixlen == 128).local] | .[0]' │? cmp /home/sbrivio/passt/test/big.bin /tmp/passt-tests-EsDdjG/pasta/tcp/test_big.bin
2a01:4f8:222:904::2 │...passed.
ns$ ip -j -6 route show|jq -rM '.[] | select(.dst == "default").gateway' │
fe80::1 │Starting test: TCP/IPv4: host to ns (spliced): small transfer
ns$ which socat ip jq >/dev/null │? cmp /home/sbrivio/passt/test/small.bin /tmp/passt-tests-EsDdjG/pasta/tcp/test_ns_small.bin
ns$ socat -u TCP4-LISTEN:10002 OPEN:/tmp/passt-tests-EsDdjG/pasta/tcp/test_ns_big.bin,create,trunc │...passed.
ns$ socat -u OPEN:/home/sbrivio/passt/test/big.bin TCP4:127.0.0.1:10003 │
ns$ ip -j -4 route show|jq -rM '.[] | select(.dst == "default").gateway' │Starting test: TCP/IPv4: ns to host (spliced): small transfer
88.198.0.161 │? cmp /home/sbrivio/passt/test/small.bin /tmp/passt-tests-EsDdjG/pasta/tcp/test_small.bin
ns$ socat -u OPEN:/home/sbrivio/passt/test/big.bin TCP4:88.198.0.161:10003 │...passed.
ns$ socat -u TCP4-LISTEN:10002 OPEN:/tmp/passt-tests-EsDdjG/pasta/tcp/test_ns_small.bin,create,trunc │
ns$ socat OPEN:/home/sbrivio/passt/test/small.bin TCP4:127.0.0.1:10003 │Starting test: TCP/IPv4: ns to host (via tap): small transfer
ns$ ip -j -4 route show|jq -rM '.[] | select(.dst == "default").gateway' │? cmp /home/sbrivio/passt/test/small.bin /tmp/passt-tests-EsDdjG/pasta/tcp/test_small.bin
88.198.0.161 │...passed.
ns$ socat -u OPEN:/home/sbrivio/passt/test/small.bin TCP4:88.198.0.161:10003 │
ns$ strace socat -u TCP6-LISTEN:10002 OPEN:/tmp/passt-tests-EsDdjG/pasta/tcp/test_ns_big.bin,create,trunc 2>/tmp/socat_server.strace │Starting test: TCP/IPv6: host to ns (spliced): big transfer
 │
──namespace────────────────────────────────────────┬──────────────────┴──pasta/tcp [7/12] - TCP/IPv6: host to ns (spliced): big transfer────────────────────
host$ ip -j -6 route show|jq -rM '[.[] | select(.dst == "default").gateway] | .[0]' │ router: 88.198.0.161
fe80::1 │DNS:
host$ which ip jq >/dev/null │ 185.12.64.1
host$ ip -j -4 addr show|jq -rM '.[] | select(.ifname == "enp9s0").addr_info[0].local' │ 185.12.64.2
88.198.0.164 │ NAT to host ::1: fe80::1
host$ ip -j -4 route show|jq -rM '[.[] | select(.dst == "default").gateway] | .[0]' │NDP/DHCPv6:
88.198.0.161 │ assign: 2a01:4f8:222:904::2
host$ ip -j -6 route show|jq -rM '[.[] | select(.dst == "default").dev] | .[0]' │ router: fe80::1
enp9s0 │ our link-local: fe80::1
host$ ip -j -6 addr show|jq -rM '[.[] | select(.ifname == "enp9s0").addr_info[] | select(.scope == "global" and .depreca│DNS:
ted != true).local] | .[0]' │ 2a01:4ff:ff00::add:2
2a01:4f8:222:904::2 │ 2a01:4ff:ff00::add:1
host$ ip -j -6 route show|jq -rM '[.[] | select(.dst == "default").gateway] | .[0]' │NDP: received RS, sending RA
fe80::1 │DHCP: offer to discover
host$ which socat ip jq >/dev/null │ from 1e:48:6f:6e:b6:50
host$ socat -u OPEN:/home/sbrivio/passt/test/big.bin TCP4:127.0.0.1:10002 │DHCP: ack to request
host$ socat -u TCP4-LISTEN:10003,bind=127.0.0.1 OPEN:/tmp/passt-tests-EsDdjG/pasta/tcp/test_big.bin,create,trunc │ from 1e:48:6f:6e:b6:50
host$ socat -u TCP4-LISTEN:10003 OPEN:/tmp/passt-tests-EsDdjG/pasta/tcp/test_big.bin,create,trunc │DHCPv6: received SOLICIT, sending ADVERTISE
host$ socat OPEN:/home/sbrivio/passt/test/small.bin TCP4:127.0.0.1:10002 │DHCPv6: received REQUEST/RENEW/CONFIRM, sending REPLY
host$ socat -u TCP4-LISTEN:10003,bind=127.0.0.1 OPEN:/tmp/passt-tests-EsDdjG/pasta/tcp/test_small.bin,create,trunc │NDP: received NS, sending NA
host$ socat -u TCP4-LISTEN:10003 OPEN:/tmp/passt-tests-EsDdjG/pasta/tcp/test_small.bin,create,trunc │NDP: received NS, sending NA
host$ strace socat -u OPEN:/home/sbrivio/passt/test/big.bin TCP6:[::1]:10002 2>/tmp/socat_client.strace │NDP: received NS, sending NA
host$ │
──host─────────────────────────────────────────────────────────────────┴──pasta────────────────────────────────────────
Testing commit: a056cfc fwd: Direct inbound spliced forwards to the guest's external address    PASS: 23 | FAIL: 0 | 2024-10-04T16:16:28+00:00

> ...even without strace. The client is done, the server hangs.
>
> If I unblock this manually by re-running the same client command, the
> server wakes up, writes the file, and terminates, and the test
> continues normally.
>
> Those three "received NS, sending NA" messages in the pasta pane are
> printed in a short time after the test starts.
>
> If I run this with TRACE=1 (which needs the patch I just sent), this
> is pasta's debugging output for this test:
>
> --
> 6.1401: pasta: epoll event on listening TCP socket 6 (events:
> 0x00000001) 6.1402: Flow 0 (NEW): FREE -> NEW
> 6.1402: Flow 0 (INI): NEW -> INI
> 6.1402: Flow 0 (INI): HOST [::1]:48910 -> [::]:10002 => ?
> 6.1402: Flow 0 (TGT): INI -> TGT
> 6.1402: Flow 0 (TGT): HOST [::1]:48910 -> [::]:10002 => SPLICE [::]:0
> -> [2a01:4f8:222:904::2]:10002 6.1402: Flow 0 (TCP connection
> (spliced)): TGT -> TYPED 6.1402: Flow 0 (TCP connection (spliced)):
> HOST [::1]:48910 -> [::]:10002 => SPLICE [::]:0 ->
> [2a01:4f8:222:904::2]:10002 6.1402: Flow 0 (TCP connection (spliced)):
> event at tcp_splice_connect:377 6.1402: Flow 0 (TCP connection
> (spliced)): SPLICE_CONNECT 6.1402: Flow 0 (TCP connection (spliced)):
> TYPED -> ACTIVE 6.1402: Flow 0 (TCP connection (spliced)): HOST
> [::1]:48910 -> [::]:10002 => SPLICE [::]:0 ->
> [2a01:4f8:222:904::2]:10002 6.1402: pasta: epoll event on /dev/net/tun
> device 13 (events: 0x00000001) 6.1402: NDP: received NS, sending NA
> 7.0006: pasta: epoll event on namespace timer watch 12 (events:
> 0x00000001) 7.0007: TCP (spliced): cannot set pool pipe size to 524288
> 7.0007: TCP (spliced): cannot set pool pipe size to 524288 7.0007: TCP
> (spliced): cannot set pool pipe size to 524288 7.0007: TCP (spliced):
> cannot set pool pipe size to 524288 7.0007: Flow 0 (TCP connection
> (spliced)): flag at tcp_splice_timer:766 7.0007: Flow 0 (TCP connection
> (spliced)): flag at tcp_splice_timer:766 7.1585: pasta: epoll event on
> /dev/net/tun device 13 (events: 0x00000001) 7.1585: NDP: received NS,
> sending NA 8.0006: pasta: epoll event on namespace timer watch 12
> (events: 0x00000001) 8.0006: Flow 0 (TCP connection (spliced)): flag at
> tcp_splice_timer:766 8.0006: Flow 0 (TCP connection (spliced)): flag at
> tcp_splice_timer:766 8.1825: pasta: epoll event on /dev/net/tun device
> 13 (events: 0x00000001) 8.1825: NDP: received NS, sending NA 9.0006:
> pasta: epoll event on namespace timer watch 12 (events: 0x00000001)
> 9.2065: pasta: epoll event on connected spliced TCP socket 118 (events:
> 0x0000001c) 9.2065: Flow 0 (TCP connection (spliced)): Error event on
> socket: No route to host 9.2065: Flow 0 (TCP connection (spliced)):
> flag at tcp_splice_sock_handler:624 9.2065: Flow 0 (TCP connection
> (spliced)): RCVLOWAT_ACT_1 9.2068: Flow 0 (TCP connection (spliced)):
> CLOSED 9.2068: Flow 0 (FREE): ACTIVE -> FREE 9.2068: Flow 0 (FREE):
> HOST [::1]:48910 -> [::]:10002 => SPLICE [::]:0 ->
> [2a01:4f8:222:904::2]:10002 10.0006: pasta: epoll event on namespace
> timer watch 12 (events: 0x00000001) 11.0006: pasta: epoll event on
> namespace timer watch 12 (events: 0x00000001) 12.0006: pasta: epoll
> event on namespace timer watch 12 (events: 0x00000001) 13.0006: pasta:
> epoll event on namespace timer watch 12 (events: 0x00000001) [...] --

This was:

6.1401: pasta: epoll event on listening TCP socket 6 (events: 0x00000001)
6.1402: Flow 0 (NEW): FREE -> NEW
6.1402: Flow 0 (INI): NEW -> INI
6.1402: Flow 0 (INI): HOST [::1]:48910 -> [::]:10002 => ?
6.1402: Flow 0 (TGT): INI -> TGT
6.1402: Flow 0 (TGT): HOST [::1]:48910 -> [::]:10002 => SPLICE [::]:0 -> [2a01:4f8:222:904::2]:10002
6.1402: Flow 0 (TCP connection (spliced)): TGT -> TYPED
6.1402: Flow 0 (TCP connection (spliced)): HOST [::1]:48910 -> [::]:10002 => SPLICE [::]:0 -> [2a01:4f8:222:904::2]:10002
6.1402: Flow 0 (TCP connection (spliced)): event at tcp_splice_connect:377
6.1402: Flow 0 (TCP connection (spliced)): SPLICE_CONNECT
6.1402: Flow 0 (TCP connection (spliced)): TYPED -> ACTIVE
6.1402: Flow 0 (TCP connection (spliced)): HOST [::1]:48910 -> [::]:10002 => SPLICE [::]:0 -> [2a01:4f8:222:904::2]:10002
6.1402: pasta: epoll event on /dev/net/tun device 13 (events: 0x00000001)
6.1402: NDP: received NS, sending NA
7.0006: pasta: epoll event on namespace timer watch 12 (events: 0x00000001)
7.0007: TCP (spliced): cannot set pool pipe size to 524288
7.0007: TCP (spliced): cannot set pool pipe size to 524288
7.0007: TCP (spliced): cannot set pool pipe size to 524288
7.0007: TCP (spliced): cannot set pool pipe size to 524288
7.0007: Flow 0 (TCP connection (spliced)): flag at tcp_splice_timer:766
7.0007: Flow 0 (TCP connection (spliced)): flag at tcp_splice_timer:766
7.1585: pasta: epoll event on /dev/net/tun device 13 (events: 0x00000001)
7.1585: NDP: received NS, sending NA
8.0006: pasta: epoll event on namespace timer watch 12 (events: 0x00000001)
8.0006: Flow 0 (TCP connection (spliced)): flag at tcp_splice_timer:766
8.0006: Flow 0 (TCP connection (spliced)): flag at tcp_splice_timer:766
8.1825: pasta: epoll event on /dev/net/tun device 13 (events: 0x00000001)
8.1825: NDP: received NS, sending NA
9.0006: pasta: epoll event on namespace timer watch 12 (events: 0x00000001)
9.2065: pasta: epoll event on connected spliced TCP socket 118 (events: 0x0000001c)
9.2065: Flow 0 (TCP connection (spliced)): Error event on socket: No route to host
9.2065: Flow 0 (TCP connection (spliced)): flag at tcp_splice_sock_handler:624
9.2065: Flow 0 (TCP connection (spliced)): RCVLOWAT_ACT_1
9.2068: Flow 0 (TCP connection (spliced)): CLOSED
9.2068: Flow 0 (FREE): ACTIVE -> FREE
9.2068: Flow 0 (FREE): HOST [::1]:48910 -> [::]:10002 => SPLICE [::]:0 -> [2a01:4f8:222:904::2]:10002
10.0006: pasta: epoll event on namespace timer watch 12 (events: 0x00000001)
11.0006: pasta: epoll event on namespace timer watch 12 (events: 0x00000001)
12.0006: pasta: epoll event on namespace timer watch 12 (events: 0x00000001)
13.0006: pasta: epoll event on namespace timer watch 12 (events: 0x00000001)
[...]

> Relevant parts of strace output from the client:
>
> --
> openat(AT_FDCWD, "/home/sbrivio/passt/test/big.bin", O_RDONLY) = 5
> ioctl(5, TCGETS, 0x7ffd600ae4a0) = -1 ENOTTY (Inappropriate
> ioctl for device) fcntl(5, F_SETFD, FD_CLOEXEC) = 0
> socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP) = 6
> fcntl(6, F_SETFD, FD_CLOEXEC) = 0
> connect(6, {sa_family=AF_INET6, sin6_port=htons(10002),
> sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr),
> sin6_scope_id=0}, 28) = 0 getsockname(6, {sa_family=AF_INET6,
> sin6_port=htons(39038), sin6_flowinfo=htonl(0), inet_pton(AF_INET6,
> "::1", &sin6_addr), sin6_scope_id=0}, [112 => 28]) = 0 pselect6(7, [5],
> [6], [], NULL, NULL) = 2 (in [5], out [6]) read(5,
> "\335>\210#\264\331\273\276\257['\357\365\361\2\262\\\255O\5L\302Q\231\16\234\266\307\32\362\206\333"...,
> 8192) = 8192 write(6,
> "\335>\210#\264\331\273\276\257['\357\365\361\2\262\\\255O\5L\302Q\231\16\234\266\307\32\362\206\333"...,
> 8192) = 8192 pselect6(7, [5], [6], [], NULL, NULL) = 2 (in [5], out
> [6]) read(5,
> "\343;H\320\177\323\245^\321%\\l\224\341R\235\337\33s\236\232\265\2608\312\257D\204\375\324\313\5"...,
> 8192) = 8192 write(6,
> "\343;H\320\177\323\245^\321%\\l\224\341R\235\337\33s\236\232\265\2608\312\257D\204\375\324\313\5"...,
> 8192) = 8192 pselect6(7, [5], [6], [], NULL, NULL) = 2 (in [5], out
> [6])

This was:

openat(AT_FDCWD, "/home/sbrivio/passt/test/big.bin", O_RDONLY) = 5
ioctl(5, TCGETS, 0x7ffd600ae4a0) = -1 ENOTTY (Inappropriate ioctl for device)
fcntl(5, F_SETFD, FD_CLOEXEC) = 0
socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP) = 6
fcntl(6, F_SETFD, FD_CLOEXEC) = 0
connect(6, {sa_family=AF_INET6, sin6_port=htons(10002), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
getsockname(6, {sa_family=AF_INET6, sin6_port=htons(39038), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [112 => 28]) = 0
pselect6(7, [5], [6], [], NULL, NULL) = 2 (in [5], out [6])
read(5, "\335>\210#\264\331\273\276\257['\357\365\361\2\262\\\255O\5L\302Q\231\16\234\266\307\32\362\206\333"..., 8192) = 8192
write(6, "\335>\210#\264\331\273\276\257['\357\365\361\2\262\\\255O\5L\302Q\231\16\234\266\307\32\362\206\333"..., 8192) = 8192
pselect6(7, [5], [6], [], NULL, NULL) = 2 (in [5], out [6])
read(5, "\343;H\320\177\323\245^\321%\\l\224\341R\235\337\33s\236\232\265\2608\312\257D\204\375\324\313\5"..., 8192) = 8192
write(6, "\343;H\320\177\323\245^\321%\\l\224\341R\235\337\33s\236\232\265\2608\312\257D\204\375\324\313\5"..., 8192) = 8192
pselect6(7, [5], [6], [], NULL, NULL) = 2 (in [5], out [6])

> [...]
>
> pselect6(7, [5], [6], [], NULL, NULL) = 2 (in [5], out [6])
> read(5, "", 8192) = 0
> shutdown(6, SHUT_WR) = 0
> shutdown(6, SHUT_RDWR) = 0
> exit_group(0) = ?
> +++ exited with 0 +++
> --
>
> and from the server:
>
> --
> socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP) = 6
> fcntl(6, F_SETFD, FD_CLOEXEC) = 0
> setsockopt(6, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> bind(6, {sa_family=AF_INET6, sin6_port=htons(10002),
> sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::", &sin6_addr),
> sin6_scope_id=0}, 28) = 0 listen(6, 5) = 0
> getsockname(6, {sa_family=AF_INET6, sin6_port=htons(10002),
> sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::", &sin6_addr),
> sin6_scope_id=0}, [28]) = 0 pselect6(7, [4 6], NULL, NULL, NULL, NULL --

And this was:

socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP) = 6
fcntl(6, F_SETFD, FD_CLOEXEC) = 0
setsockopt(6, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(6, {sa_family=AF_INET6, sin6_port=htons(10002), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_scope_id=0}, 28) = 0
listen(6, 5) = 0
getsockname(6, {sa_family=AF_INET6, sin6_port=htons(10002), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_scope_id=0}, [28]) = 0
pselect6(7, [4 6], NULL, NULL, NULL, NULL

> If I connect from the host without a server in the namespace (but
> with the port forwarded by pasta), I get a connection reset, and
> if the port is not forwarded by pasta, connection refused.
>
> But this is another case: we start connecting and accept the
> connection (probably we shouldn't). Note the "No route to host"
> error on the socket.
>
> It looks somehow similar to the race I fixed with commit
> f4e9f26480ef ("pasta: Disable neighbour solicitations on device
> up to prevent DAD"), but it doesn't look like an invalid
> c->ip6.addr_seen, because otherwise pasta would reset the
> connection, I suppose.
>
> I haven't debugged further yet. This looks like an existing
> issue in pasta rather than in this series or in the tests,
> but it blocks tests, so I haven't applied this yet.

-- 
Stefano
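
A note for reading the TRACE output: the "events:" values in the epoll
lines are plain epoll bitmasks. The sketch below decodes them into flag
names. It is not part of passt; epoll_mask_str() is a made-up helper name,
and it assumes Linux with the usual <sys/epoll.h> definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>

/* Hypothetical helper, not from passt: write the names of the epoll flags
 * set in 'events' into 'buf', separated by '|'. Bits that are not in the
 * table below are ignored. Returns 'buf'.
 */
const char *epoll_mask_str(uint32_t events, char *buf, size_t len)
{
	static const struct {
		uint32_t bit;
		const char *name;
	} flags[] = {
		{ EPOLLIN,	"EPOLLIN" },
		{ EPOLLPRI,	"EPOLLPRI" },
		{ EPOLLOUT,	"EPOLLOUT" },
		{ EPOLLERR,	"EPOLLERR" },
		{ EPOLLHUP,	"EPOLLHUP" },
		{ EPOLLRDHUP,	"EPOLLRDHUP" },
	};
	size_t i, used = 0;

	assert(buf && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		int n;

		if (!(events & flags[i].bit))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, stop here */

		used += (size_t)n;
	}

	return buf;
}
```

With this, 0x00000001 (listening socket, tun device, timer) decodes to
EPOLLIN, and 0x0000001c on spliced socket 118 decodes to
EPOLLOUT|EPOLLERR|EPOLLHUP, which matches the "Error event on socket"
line logged right after it.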