From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id C46BE5A0269 for ; Tue, 8 Nov 2022 02:02:13 +0100 (CET) Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4N5qZB1j3jz4xx0; Tue, 8 Nov 2022 12:02:10 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1667869330; bh=7rJ7wDw38UICwc5mYoXZU35mZv+fezq4fLx7YXUscsQ=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=hUUxMwBn8R8ayS/9pdgzSQb47p1iacy+euP3qKUv6DQTzUK9bx+Ikk37fE94dTC+x PqvxX41vEp16MkuLtzOWGMT97jQ2nphAhYf2Q5v5O4eeVADlFn8/bgdnfKzSzDzCI2 wdx8GH33C+N1UWqp2ufqfoi74xdgE5LIUO3zPEXo= Date: Tue, 8 Nov 2022 11:35:52 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH 01/10] tcp: no v6 flag in ref Message-ID: References: <20221104084333.3761760-1-david@gibson.dropbear.id.au> <20221104084333.3761760-2-david@gibson.dropbear.id.au> <20221107190753.5e1cf0de@elisabeth> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="+oSP1YzV6f7wa9mf" Content-Disposition: inline In-Reply-To: <20221107190753.5e1cf0de@elisabeth> Message-ID-Hash: 6I567NLHKJJAE7AEWLHWO74V5M4O226P X-Message-ID-Hash: 6I567NLHKJJAE7AEWLHWO74V5M4O226P X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.3 Precedence: list List-Id: Development discussion and patches for passt Archived-At: <> Archived-At: List-Archive: <> List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --+oSP1YzV6f7wa9mf Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Nov 07, 2022 at 07:07:53PM +0100, Stefano Brivio wrote: > On Fri, 4 Nov 2022 19:43:24 +1100 > David Gibson wrote: >=20 > > --- > > tcp.c | 9 ++++----- > > tcp.h | 1 - > > tcp_splice.c | 13 +++++++------ > > 3 files changed, 11 insertions(+), 12 deletions(-) > >=20 > > diff --git a/tcp.c b/tcp.c > > index 713248f..3d48d6e 100644 > > --- a/tcp.c > > +++ b/tcp.c > > @@ -761,8 +761,7 @@ static int tcp_epoll_ctl(const struct ctx *c, struc= t tcp_conn *conn) > > { > > int m =3D (conn->flags & IN_EPOLL) ? EPOLL_CTL_MOD : EPOLL_CTL_ADD; > > union epoll_ref ref =3D { .r.proto =3D IPPROTO_TCP, .r.s =3D conn->so= ck, > > - .r.p.tcp.tcp.index =3D conn - tc, > > - .r.p.tcp.tcp.v6 =3D CONN_V6(conn) }; > > + .r.p.tcp.tcp.index =3D conn - tc, }; > > struct epoll_event ev =3D { .data.u64 =3D ref.u64 }; > > =20 > > if (conn->events =3D=3D CLOSED) { > > @@ -2857,6 +2856,7 @@ static void tcp_conn_from_sock(struct ctx *c, uni= on epoll_ref ref, > > if (c->tcp.conn_count >=3D TCP_MAX_CONNS) > > return; > > =20 > > + sa.ss_family =3D AF_UNSPEC; /* FIXME: stop clang-tidy complaining */ >=20 > No need for /* FIXME */ here in my opinion, "FIXME" maybe isn't quite the right sentiment. The issue here is I don't think this should be necessary. AFAIK sa.ss_family will be written on any succesful return from accept4(), it's just the clang-tidy doesn't seem to know that. > valgrind should also hit > this I'm not sure. Depends whether valgrind's knowledge of syscall behaviours has the same bug or not. > -- perhaps move to initialiser though. >=20 > > sl =3D sizeof(sa); > > s =3D accept4(ref.r.s, (struct sockaddr *)&sa, &sl, SOCK_NONBLOCK); > > if (s < 0) > > @@ -2868,7 +2868,7 @@ static void tcp_conn_from_sock(struct ctx *c, uni= on epoll_ref ref, > > conn->ws_to_tap =3D conn->ws_from_tap =3D 0; > > conn_event(c, conn, SOCK_ACCEPTED); > > =20 > > - if (ref.r.p.tcp.tcp.v6) { > > + if (sa.ss_family =3D=3D AF_INET6) { > > struct sockaddr_in6 sa6; > > =20 > > memcpy(&sa6, &sa, sizeof(sa6)); > > @@ -3147,8 +3147,7 @@ static void tcp_sock_init6(const struct ctx *c, i= nt ns, > > const struct in6_addr *addr, const char *ifname, > > in_port_t port) > > { > > - union tcp_epoll_ref tref =3D { .tcp.listen =3D 1, .tcp.outbound =3D n= s, > > - .tcp.v6 =3D 1 }; > > + union tcp_epoll_ref tref =3D { .tcp.listen =3D 1, .tcp.outbound =3D n= s, }; >=20 > Undocumented style rule: no comma at the end, it's post-modern C anyway. Fixed. > > bool spliced =3D false, tap =3D true; > > int s; > > =20 > > diff --git a/tcp.h b/tcp.h > > index 3fabb5a..85ac750 100644 > > --- a/tcp.h > > +++ b/tcp.h > > @@ -45,7 +45,6 @@ union tcp_epoll_ref { > > uint32_t listen:1, > > splice:1, > > outbound:1, > > - v6:1, >=20 > Don't forget to update the comment above. Fixed. > > timer:1, > > index:20; > > } tcp; > > diff --git a/tcp_splice.c b/tcp_splice.c > > index 99c5fa7..1f26ff4 100644 > > --- a/tcp_splice.c > > +++ b/tcp_splice.c > > @@ -211,12 +211,10 @@ static int tcp_splice_epoll_ctl(const struct ctx = *c, > > int m =3D (conn->flags & IN_EPOLL) ? EPOLL_CTL_MOD : EPOLL_CTL_ADD; > > union epoll_ref ref_a =3D { .r.proto =3D IPPROTO_TCP, .r.s =3D conn->= a, > > .r.p.tcp.tcp.splice =3D 1, > > - .r.p.tcp.tcp.index =3D conn - tc, > > - .r.p.tcp.tcp.v6 =3D CONN_V6(conn) }; > > + .r.p.tcp.tcp.index =3D conn - tc, }; > > union epoll_ref ref_b =3D { .r.proto =3D IPPROTO_TCP, .r.s =3D conn->= b, > > .r.p.tcp.tcp.splice =3D 1, > > - .r.p.tcp.tcp.index =3D conn - tc, > > - .r.p.tcp.tcp.v6 =3D CONN_V6(conn) }; > > + .r.p.tcp.tcp.index =3D conn - tc, }; >=20 > Commas. Fixed. >=20 > > struct epoll_event ev_a =3D { .data.u64 =3D ref_a.u64 }; > > struct epoll_event ev_b =3D { .data.u64 =3D ref_b.u64 }; > > uint32_t events_a, events_b; > > @@ -579,12 +577,15 @@ void tcp_sock_handler_splice(struct ctx *c, union= epoll_ref ref, > > struct tcp_splice_conn *conn; > > =20 > > if (ref.r.p.tcp.tcp.listen) { > > + struct sockaddr sa; >=20 > Should this be sockaddr_storage in this case and then cast? I never > remember. So, for this patch, no it doesn't, but it will have to change later, in fact. The trick here is that we only want the sa_family field, we don't need the full socket address. struct sockaddr has that (and maybe a little extra, but not much). accept4() will truncate the socket address when it writes, but that's ok for our purposes. Using the short form may mitigate the concern below because there's less data to copy_to_user(). Of course, when I change this later to add different handling for IPv4-mapped IPv6 source addresses we will need the full address, so this will need to be a sockaddr_in6. > > + socklen_t sl =3D sizeof(sa); > > int s; > > =20 > > if (c->tcp.splice_conn_count >=3D TCP_SPLICE_MAX_CONNS) > > return; > > =20 > > - if ((s =3D accept4(ref.r.s, NULL, NULL, SOCK_NONBLOCK)) < 0) > > + sa.sa_family =3D AF_UNSPEC; /* FIXME: stop clang-tidy complaining */ > > + if ((s =3D accept4(ref.r.s, &sa, &sl, SOCK_NONBLOCK)) < 0) >=20 > Same comment about /* FIXME */ as above. >=20 > This adds a copy_to_user() and microseconds, but it's unavoidable. I > tried harder to optimise for latency on the spliced path, that's why it > was NULL here and not above. >=20 > > return; > > =20 > > if (setsockopt(s, SOL_TCP, TCP_QUICKACK, &((int){ 1 }), > > @@ -595,7 +596,7 @@ void tcp_sock_handler_splice(struct ctx *c, union e= poll_ref ref, > > =20 > > conn =3D CONN(c->tcp.splice_conn_count++); > > conn->a =3D s; > > - conn->flags =3D ref.r.p.tcp.tcp.v6 ? SOCK_V6 : 0; > > + conn->flags =3D (sa.sa_family =3D=3D AF_INET6) ? SOCK_V6 : 0; > > =20 > > if (tcp_splice_new(c, conn, ref.r.p.tcp.tcp.index, > > ref.r.p.tcp.tcp.outbound)) >=20 --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --+oSP1YzV6f7wa9mf Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEoULxWu4/Ws0dB+XtgypY4gEwYSIFAmNppEsACgkQgypY4gEw YSJQiQ//dsNNpJH/aAjCj0+wbau7WHZtY0NvOr6GYpJzQB47SvMFnPTy9HHNAO/+ XQtUziegeJ2G+nfPts9VmMVig4oD2Oz0qU4Y22t8s+bD1O4EXVkTfZu6QQs/2eZK MCpCdlCRy4X207FTy+KwDQ9ln1hSetzY9WtdSrwmz/zSe7xkZX8lhf7qW6P3HETM YfkSSiODwI6/yCSh1MlusY/tFI7C3u4iw6tCvcm3xKIqQoBe8lZPnGzWYDDyEdVy RwBZC8zO2igevoEE4+sfPEh3V5lI6gsA881RYOnt8jEw8hT7V2vgIKsxEzcQzT/U oUAcgQ17MUmndyvFh9OwM5jiwBF3VCIWAXdDuIxe/vcY2cdit6LUKvVJ+1dJ8q6n vDaO9JGHtxydVKR6Gj+Qs6Wd/xHyVTqO8GwM/Uupt8jRzxYhHC13sSc0Fm8HtMc8 yN1Hupcl+CP1ApeB60zjpx9lDCVySf/BCWOf2GXraPUhFXVbEz128wVBK4zp7iAS 98U69a6VokTRNF2CCdpbV90v2JHIw2Kf19pG/0nRW0ysSo3aWTnvx8GeoM7qtiek jsyBjsr3s9QRvYiIbsMo1XQjSx4r6nJqRnbLjZ8tGWMxGjDBPOsGybTl6rW7ihXy M+taZ/cB+VURjjl1v05w/rpeyJlACLV31Mc3cGXZfy8DlubwFTk= =n6iU -----END PGP SIGNATURE----- --+oSP1YzV6f7wa9mf--