From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id 806235A0265 for ; Thu, 13 Oct 2022 06:09:49 +0200 (CEST) Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4Mnwyb3chwz4xGw; Thu, 13 Oct 2022 15:09:43 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1665634183; bh=BwvZJph4I2cTQiLKesgMRTTGdm7fembjIRqn3BBTevE=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=l3wI2xspo1Ai0/rC/tJbtEKWAQwyP+8C9DSYSjM39S5FNx9g18s96SxcGmiBLuFoU mehhVKuTzVutV1USxed3Itasl36lj0V1mqCYY/1m6v9wqhEJ6znorqCzQw13u8ytmn HQc4JoXCXU3+HTfCkAOOLFu5DVsGv0WOEGL8XoxI= Date: Thu, 13 Oct 2022 12:19:49 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH v2 2/3] tcp, tcp_splice: Fix port remapping for inbound, spliced connections Message-ID: References: <20221012154554.1650667-1-sbrivio@redhat.com> <20221012154554.1650667-3-sbrivio@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="jtWIZV9tbHpZoGC2" Content-Disposition: inline In-Reply-To: <20221012154554.1650667-3-sbrivio@redhat.com> Message-ID-Hash: QNPVCVCQ3MZP5LMT2SG27SF7C4UGDDGQ X-Message-ID-Hash: QNPVCVCQ3MZP5LMT2SG27SF7C4UGDDGQ X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.3 Precedence: list List-Id: Development discussion and patches for passt Archived-At: <> Archived-At: List-Archive: <> List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --jtWIZV9tbHpZoGC2 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Oct 12, 2022 at 05:45:53PM +0200, Stefano Brivio wrote: > In pasta mode, when we receive a new inbound connection, we need to > select a socket that was created in the namespace to proceed and > connect() it to its final destination. >=20 > The existing condition might pick a wrong socket, though, if the > destination port is remapped, because we'll check the bitmap of > inbound ports using the remapped port (stored in the epoll reference) > as index, and not the original port. >=20 > Instead of using the port bitmap for this purpose, store this > information in the epoll reference itself, by adding a new 'outbound' > bit, that's set if the listening socket was created the namespace, > and unset otherwise. >=20 > Then, use this bit to pick a socket on the right side. >=20 > Suggested-by: David Gibson > Fixes: 33482d5bf293 ("passt: Add PASTA mode, major rework") > Signed-off-by: Stefano Brivio Reviewed-by: David Gibson > --- > v2: > - use a bit in epoll reference to indicate whether a socket is for > inbound or outbound connections, instead of trying (and failing) > to rebuild that information from maps (David Gibson) >=20 > tcp.c | 7 +++---- > tcp.h | 2 ++ > tcp_splice.c | 20 +++++++++++++------- > 3 files changed, 18 insertions(+), 11 deletions(-) >=20 > diff --git a/tcp.c b/tcp.c > index 63153b6..0d4ce57 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -3084,15 +3084,14 @@ void tcp_sock_handler(struct ctx *c, union epoll_= ref ref, uint32_t events, > void tcp_sock_init(const struct ctx *c, int ns, sa_family_t af, > const void *addr, const char *ifname, in_port_t port) > { > - union tcp_epoll_ref tref =3D { .tcp.listen =3D 1 }; > + union tcp_epoll_ref tref =3D { .tcp.listen =3D 1, .tcp.outbound =3D ns = }; > const void *bind_addr; > int s; > =20 > - if (ns) { > + if (ns) > tref.tcp.index =3D (in_port_t)(port + c->tcp.fwd_out.delta[port]); > - } else { > + else > tref.tcp.index =3D (in_port_t)(port + c->tcp.fwd_in.delta[port]); > - } > =20 > if (af =3D=3D AF_INET || af =3D=3D AF_UNSPEC) { > if (!addr && c->mode =3D=3D MODE_PASTA) > diff --git a/tcp.h b/tcp.h > index 7ba7ab7..72e7815 100644 > --- a/tcp.h > +++ b/tcp.h > @@ -34,6 +34,7 @@ void tcp_update_l2_buf(const unsigned char *eth_d, cons= t unsigned char *eth_s, > * union tcp_epoll_ref - epoll reference portion for TCP connections > * @listen: Set if this file descriptor is a listening socket > * @splice: Set if descriptor is associated to a spliced connection > + * @outbound: Listening socket maps to outbound, spliced connection > * @v6: Set for IPv6 sockets or connections > * @timer: Reference is a timerfd descriptor for connection > * @index: Index of connection in table, or port for bound sockets > @@ -43,6 +44,7 @@ union tcp_epoll_ref { > struct { > uint32_t listen:1, > splice:1, > + outbound:1, > v6:1, > timer:1, > index:20; > diff --git a/tcp_splice.c b/tcp_splice.c > index 96c31c8..99c5fa7 100644 > --- a/tcp_splice.c > +++ b/tcp_splice.c > @@ -35,6 +35,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -506,19 +507,19 @@ static int tcp_splice_connect_ns(void *arg) > * @c: Execution context > * @conn: Connection pointer > * @port: Destination port, host order > + * @outbound: Connection request coming from namespace > * > * Return: return code from connect() > */ > static int tcp_splice_new(const struct ctx *c, struct tcp_splice_conn *c= onn, > - in_port_t port) > + in_port_t port, int outbound) > { > - struct tcp_splice_connect_ns_arg ns_arg =3D { c, conn, port, 0 }; > int *p, i, s =3D -1; > =20 > - if (bitmap_isset(c->tcp.fwd_in.map, port)) > - p =3D CONN_V6(conn) ? ns_sock_pool6 : ns_sock_pool4; > - else > + if (outbound) > p =3D CONN_V6(conn) ? init_sock_pool6 : init_sock_pool4; > + else > + p =3D CONN_V6(conn) ? ns_sock_pool6 : ns_sock_pool4; > =20 > for (i =3D 0; i < TCP_SOCK_POOL_SIZE; i++, p++) { > SWAP(s, *p); > @@ -526,11 +527,15 @@ static int tcp_splice_new(const struct ctx *c, stru= ct tcp_splice_conn *conn, > break; > } > =20 > - if (s < 0 && bitmap_isset(c->tcp.fwd_in.map, port)) { > + /* No socket available in namespace: create a new one for connect() */ > + if (s < 0 && !outbound) { > + struct tcp_splice_connect_ns_arg ns_arg =3D { c, conn, port, 0 }; > + > NS_CALL(tcp_splice_connect_ns, &ns_arg); > return ns_arg.ret; > } > =20 > + /* Otherwise, the socket will connect on the side it was created on */ > return tcp_splice_connect(c, conn, s, port); > } > =20 > @@ -592,7 +597,8 @@ void tcp_sock_handler_splice(struct ctx *c, union epo= ll_ref ref, > conn->a =3D s; > conn->flags =3D ref.r.p.tcp.tcp.v6 ? SOCK_V6 : 0; > =20 > - if (tcp_splice_new(c, conn, ref.r.p.tcp.tcp.index)) > + if (tcp_splice_new(c, conn, ref.r.p.tcp.tcp.index, > + ref.r.p.tcp.tcp.outbound)) > conn_flag(c, conn, CLOSING); > =20 > return; --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --jtWIZV9tbHpZoGC2 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEoULxWu4/Ws0dB+XtgypY4gEwYSIFAmNHZ68ACgkQgypY4gEw YSJM7w//Z8iLEZMoCtdXNzKNkRzg1ffIMxTAqYJVkZuXiVsfxhrKgrhCsz43XP3g mglDXzmfUar3Lee5uY/vgGX6XsIxy6yKAB/XMcOe1hznc2NrLmCCn2475eaPvxyG VUvtxR3Y/XT3g3XzNo3f7dInSdxMiO6VArNHmiMJF1wQpyyC/QRLe74Tj0QuopXz bx5e72xwLWDMBBf0c2zvwND0EZ1nmkdgur0poRAw6p4P0FLHUDUecsiffHdOCvxj VdO+fPGtjGULJL7GPDyXmVF9qXL9LOZE02fhLHOZV12gq/bAvjBxkgAT7W5TD6Zg BaW+k59GZgclQ5fzjHFPPGpOCNTYn06PiRcP9hWLnSYaiQiRLn2DujQEwrRKjvUU +3KDyumO8M6JUBBWeB2vMsOww/1PXwxXA0SNxiP/CIPyyCl9jc8EAKGGh7pNrN8J UOCSVO0N+w0Zze+TDcQlaQzY/aaW16arDPlNWiczYfV5wrYy+SzBswG3PN4vPi82 E1gtZeEnPD0iDLDjgv40MMSzjfRDPGVVdGKl2DlPBOZg8tlskE5l+zRPAyvQYAoO kwzsx+q1IIMnVQDaMI6HRCb23VMFTDAjkzGej6pue+BwWBKe/x3ETx042Oz/iQgz WErB2vWzeKKBWZDmfIkvBkPAHxq/Pxg9p/Vq3JY8AoflqUp4D/c= =WsRW -----END PGP SIGNATURE----- --jtWIZV9tbHpZoGC2--