From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202410 header.b=O8cIrN/A; dkim-atps=neutral Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 72B7A5A061A for ; Wed, 06 Nov 2024 02:48:01 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202410; t=1730857670; bh=FakOjvK5wyApopcZFt2EG+HoeA8S48L2oValcUnf3bU=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=O8cIrN/AhvJiPrMoWjV3wz0wKb7UqnNScMc0bx5nxFcBcM2l3HQ7eeTSJMv8i1+wN j5ClEENfdoFYvF6AGJCicOIlsqMTf5WNOXMv/+p34fjpEtMXF/Z3j3C/ECBRHWLjyt Em5mS4bH0H3IvGDMiTOmvWR+ESpPGxrd3WK/ojzm65bmMJNYTmMyatdKbOPc3rrm2L Sn3235f3eR28xi+81OVakSRQ0wDo3BE8joWTib3eiY5PsO3uJr2HBamz0hRlcQ2J/x VFayvmyHgxNrPnoaaByv5hpBYmto1QKM3YBNHAaDoNmuu9c/Cd9wcW6YZU0pDNv2RK SWIS7Lnjm/M9g== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4Xjp4Q124Pz4wZx; Wed, 6 Nov 2024 12:47:50 +1100 (AEDT) Date: Wed, 6 Nov 2024 12:47:47 +1100 From: David Gibson To: Jon Maloy Subject: Re: [PATCH v3] tcp: unify payload and flags l2 frames array Message-ID: References: <20241106010744.2082849-1-jmaloy@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="B65brLn059YrACkp" Content-Disposition: inline In-Reply-To: <20241106010744.2082849-1-jmaloy@redhat.com> Message-ID-Hash: 6QWUTVHHAB6M66BXMZ6DPMBPK75CYD6G X-Message-ID-Hash: 6QWUTVHHAB6M66BXMZ6DPMBPK75CYD6G X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, sbrivio@redhat.com, lvivier@redhat.com, dgibson@redhat.com X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --B65brLn059YrACkp Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Nov 05, 2024 at 08:07:44PM -0500, Jon Maloy wrote: > In order to reduce static memory and code footprint, we merge > the array for l2 flag frames into the one for payload frames. >=20 > This change also ensures that no flag message will be sent out > over the l2 media bypassing already queued payload messages. >=20 > Performance measurements with iperf3, where we force all > traffic via the tap queue, show no significant difference: >=20 > Dual traffic both directions sinmultaneously, with patch: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D > host->ns: > -------- > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-100.00 sec 36.3 GBytes 3.12 Gbits/sec 4759 sender > [ 5] 0.00-100.04 sec 36.3 GBytes 3.11 Gbits/sec receiver >=20 > ns->host: > --------- > [ ID] Interval Transfer Bitrate > [ 5] 0.00-100.00 sec 321 GBytes 27.6 Gbits/sec receiver >=20 > Dual traffic both directions sinmultaneously, without patch: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > host->ns: > -------- > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-100.00 sec 35.0 GBytes 3.01 Gbits/sec 6001 sender > [ 5] 0.00-100.04 sec 34.8 GBytes 2.99 Gbits/sec receiver >=20 > ns->host > -------- > [ ID] Interval Transfer Bitrate > [ 5] 0.00-100.00 sec 345 GBytes 29.6 Gbits/sec receiver >=20 > Single connection, with patch: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D > host->ns: > --------- > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-100.00 sec 138 GBytes 11.8 Gbits/sec 922 sender > [ 5] 0.00-100.04 sec 138 GBytes 11.8 Gbits/sec receiver >=20 > ns->host: > ----------- > [ ID] Interval Transfer Bitrate > [ 5] 0.00-100.00 sec 430 GBytes 36.9 Gbits/sec receiver >=20 > Single connection, without patch: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D > host->ns: > ------------ > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-100.00 sec 139 GBytes 11.9 Gbits/sec 900 sender > [ 5] 0.00-100.04 sec 139 GBytes 11.9 Gbits/sec receiver >=20 > ns->host: > --------- > [ ID] Interval Transfer Bitrate > [ 5] 0.00-100.00 sec 440 GBytes 37.8 Gbits/sec receiver >=20 > Signed-off-by: Jon Maloy Reviewed-by: David Gibson >=20 > ---- > v2: - Adapted to and rebased on latest release. > - Removed redundant tcp_flags_push() function. > - Added measurement results. > v3: - Rebased on latest release. > - Fixes based on comments from D. Gibson > --- > tcp.c | 1 - > tcp_buf.c | 70 ++++++++++++-------------------------------------- > tcp_buf.h | 1 - > tcp_internal.h | 15 ----------- > 4 files changed, 17 insertions(+), 70 deletions(-) >=20 > diff --git a/tcp.c b/tcp.c > index 4e0a17e..b652900 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -937,7 +937,6 @@ bool tcp_flow_defer(const struct tcp_tap_conn *conn) > /* cppcheck-suppress [constParameterPointer, unmatchedSuppression] */ > void tcp_defer_handler(struct ctx *c) > { > - tcp_flags_flush(c); > tcp_payload_flush(c); > } > =20 > diff --git a/tcp_buf.c b/tcp_buf.c > index 274e313..d29c1a9 100644 > --- a/tcp_buf.c > +++ b/tcp_buf.c > @@ -20,7 +20,7 @@ > =20 > #include > =20 > -#include > +#include Oops, thought I'd changed that one already. > =20 > #include "util.h" > #include "ip.h" > @@ -59,22 +59,10 @@ static_assert(MSS6 <=3D sizeof(tcp_payload[0].data), = "MSS6 is greater than 65516") > static struct tcp_tap_conn *tcp_frame_conns[TCP_FRAMES_MEM]; > static unsigned int tcp_payload_used; > =20 > -static struct tap_hdr tcp_flags_tap_hdr[TCP_FRAMES_MEM]; > -/* IPv4 headers for TCP segment without payload */ > -static struct iphdr tcp4_flags_ip[TCP_FRAMES_MEM]; > -/* TCP segments without payload for IPv4 frames */ > -static struct tcp_flags_t tcp_flags[TCP_FRAMES_MEM]; > - > -static unsigned int tcp_flags_used; > - > -/* IPv6 headers for TCP segment without payload */ > -static struct ipv6hdr tcp6_flags_ip[TCP_FRAMES_MEM]; > - > /* recvmsg()/sendmsg() data for tap */ > static struct iovec iov_sock [TCP_FRAMES_MEM + 1]; > =20 > static struct iovec tcp_l2_iov[TCP_FRAMES_MEM][TCP_NUM_IOVS]; > -static struct iovec tcp_l2_flags_iov[TCP_FRAMES_MEM][TCP_NUM_IOVS]; > =20 > /** > * tcp_update_l2_buf() - Update Ethernet header buffers with addresses > @@ -103,15 +91,6 @@ void tcp_sock_iov_init(const struct ctx *c) > for (i =3D 0; i < ARRAY_SIZE(tcp_payload); i++) { > tcp6_payload_ip[i] =3D ip6; > tcp4_payload_ip[i] =3D iph; > - tcp_payload[i].th.doff =3D sizeof(struct tcphdr) / 4; > - tcp_payload[i].th.ack =3D 1; > - } > - > - for (i =3D 0; i < ARRAY_SIZE(tcp_flags); i++) { > - tcp6_flags_ip[i] =3D ip6; > - tcp4_flags_ip[i] =3D iph; > - tcp_flags[i].th.doff =3D sizeof(struct tcphdr) / 4; > - tcp_flags[i].th.ack =3D 1; > } > =20 > for (i =3D 0; i < TCP_FRAMES_MEM; i++) { > @@ -121,25 +100,6 @@ void tcp_sock_iov_init(const struct ctx *c) > iov[TCP_IOV_ETH].iov_len =3D sizeof(struct ethhdr); > iov[TCP_IOV_PAYLOAD].iov_base =3D &tcp_payload[i]; > } > - > - for (i =3D 0; i < TCP_FRAMES_MEM; i++) { > - struct iovec *iov =3D tcp_l2_flags_iov[i]; > - > - iov[TCP_IOV_TAP] =3D tap_hdr_iov(c, &tcp_flags_tap_hdr[i]); > - iov[TCP_IOV_ETH].iov_len =3D sizeof(struct ethhdr); > - iov[TCP_IOV_PAYLOAD].iov_base =3D &tcp_flags[i]; > - } > -} > - > -/** > - * tcp_flags_flush() - Send out buffers for segments with no data (flags) > - * @c: Execution context > - */ > -void tcp_flags_flush(const struct ctx *c) > -{ > - tap_send_frames(c, &tcp_l2_flags_iov[0][0], TCP_NUM_IOVS, > - tcp_flags_used); > - tcp_flags_used =3D 0; > } > =20 > /** > @@ -171,7 +131,7 @@ static void tcp_revert_seq(const struct ctx *c, struc= t tcp_tap_conn **conns, > } > =20 > /** > - * tcp_payload_flush() - Send out buffers for segments with data > + * tcp_payload_flush() - Send out buffers for segments with data or flags > * @c: Execution context > */ > void tcp_payload_flush(const struct ctx *c) > @@ -197,37 +157,35 @@ void tcp_payload_flush(const struct ctx *c) > */ > int tcp_buf_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, in= t flags) > { > - struct tcp_flags_t *payload; > + struct tcp_payload_t *payload; > struct iovec *iov; > size_t optlen; > size_t l4len; > uint32_t seq; > int ret; > =20 > - iov =3D tcp_l2_flags_iov[tcp_flags_used]; > + iov =3D tcp_l2_iov[tcp_payload_used]; > if (CONN_V4(conn)) { > - iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp4_flags_ip[tcp_flags_used]); > + iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp4_payload_ip[tcp_payload_used]); > iov[TCP_IOV_ETH].iov_base =3D &tcp4_eth_src; > } else { > - iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp6_flags_ip[tcp_flags_used]); > + iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp6_payload_ip[tcp_payload_used]); > iov[TCP_IOV_ETH].iov_base =3D &tcp6_eth_src; > } > =20 > payload =3D iov[TCP_IOV_PAYLOAD].iov_base; > seq =3D conn->seq_to_tap; > ret =3D tcp_prepare_flags(c, conn, flags, &payload->th, > - &payload->opts, &optlen); > + (struct tcp_syn_opts *)&payload->data, &optlen); > if (ret <=3D 0) > return ret; > =20 > - tcp_flags_used++; > + tcp_payload_used++; > l4len =3D tcp_l2_buf_fill_headers(conn, iov, optlen, NULL, seq, false); > iov[TCP_IOV_PAYLOAD].iov_len =3D l4len; > - > if (flags & DUP_ACK) { > - struct iovec *dup_iov; > + struct iovec *dup_iov =3D tcp_l2_iov[tcp_payload_used++]; > =20 > - dup_iov =3D tcp_l2_flags_iov[tcp_flags_used++]; > memcpy(dup_iov[TCP_IOV_TAP].iov_base, iov[TCP_IOV_TAP].iov_base, > iov[TCP_IOV_TAP].iov_len); > dup_iov[TCP_IOV_ETH].iov_base =3D iov[TCP_IOV_ETH].iov_base; > @@ -237,8 +195,8 @@ int tcp_buf_send_flag(const struct ctx *c, struct tcp= _tap_conn *conn, int flags) > dup_iov[TCP_IOV_PAYLOAD].iov_len =3D l4len; > } > =20 > - if (tcp_flags_used > TCP_FRAMES_MEM - 2) > - tcp_flags_flush(c); > + if (tcp_payload_used > TCP_FRAMES_MEM - 2) > + tcp_payload_flush(c); > =20 > return 0; > } > @@ -254,6 +212,7 @@ int tcp_buf_send_flag(const struct ctx *c, struct tcp= _tap_conn *conn, int flags) > static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *co= nn, > ssize_t dlen, int no_csum, uint32_t seq) > { > + struct tcp_payload_t *payload; > const uint16_t *check =3D NULL; > struct iovec *iov; > size_t l4len; > @@ -274,6 +233,11 @@ static void tcp_data_to_tap(const struct ctx *c, str= uct tcp_tap_conn *conn, > iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp6_payload_ip[tcp_payload_used]); > iov[TCP_IOV_ETH].iov_base =3D &tcp6_eth_src; > } > + payload =3D iov[TCP_IOV_PAYLOAD].iov_base; > + payload->th.th_off =3D sizeof(struct tcphdr) / 4; > + payload->th.th_x2 =3D 0; > + payload->th.th_flags =3D 0; > + payload->th.ack =3D 1; > l4len =3D tcp_l2_buf_fill_headers(conn, iov, dlen, check, seq, false); > iov[TCP_IOV_PAYLOAD].iov_len =3D l4len; > if (++tcp_payload_used > TCP_FRAMES_MEM - 1) > diff --git a/tcp_buf.h b/tcp_buf.h > index 49c04d4..54f5e53 100644 > --- a/tcp_buf.h > +++ b/tcp_buf.h > @@ -7,7 +7,6 @@ > #define TCP_BUF_H > =20 > void tcp_sock_iov_init(const struct ctx *c); > -void tcp_flags_flush(const struct ctx *c); > void tcp_payload_flush(const struct ctx *c); > int tcp_buf_data_from_sock(const struct ctx *c, struct tcp_tap_conn *con= n); > int tcp_buf_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, in= t flags); > diff --git a/tcp_internal.h b/tcp_internal.h > index a5a47df..c846f60 100644 > --- a/tcp_internal.h > +++ b/tcp_internal.h > @@ -134,21 +134,6 @@ struct tcp_syn_opts { > .ws =3D TCP_OPT_WS(ws_), \ > }) > =20 > -/** > - * struct tcp_flags_t - TCP header and data to send zero-length > - * segments (flags) > - * @th: TCP header > - * @opts TCP options > - */ > -struct tcp_flags_t { > - struct tcphdr th; > - struct tcp_syn_opts opts; > -#ifdef __AVX2__ > -} __attribute__ ((packed, aligned(32))); > -#else > -} __attribute__ ((packed, aligned(__alignof__(unsigned int)))); > -#endif > - > extern char tcp_buf_discard [MAX_WINDOW]; > =20 > void conn_flag_do(const struct ctx *c, struct tcp_tap_conn *conn, --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --B65brLn059YrACkp Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmcqysIACgkQzQJF27ox 2GcyyQ/8CMo6Yk5fOyKmZJYLaL9KNvB/+fsGFQV4kBKKbubzewUJAr74oPmZUDkT 7FN8HHPgVMhpLiiOGWUesYGcNguSTmMcgG2kKyiVf+ahKQNl1fTQ7L27MOT/XWsN xYxQI0+fAgkEWnOLvg78YJLi7BDqk7VfeBL8EiEYQxLrDQDJYc/2MYHh/Ld0xAOA XPlD2jTkZrvSDjtg6NoYKIgBR11N3VnDkbkYeCvCw3Qsx2jLyYJ4ZwFL6bOchFeF esSN9T89OS9MhGV4+OsxXNfMe6H5W2PABrsOa6iJW2FRUHkRujbjz/Dxb8ecY1QU wWxmkj9JvtmCgRG2KdfmUzIlWzBe+f3GJW03sYF2+Z4Y2bEAVH4GIC7Ju7b8fCUc kkOyePi7mshIe07ATt4szvbhnv3qj2raubRQP2nyB4teVkMLd1lzCGa3GAWXQ3Xl R07ifisnfs9K8W7nT7GT9pRwrwQNqo1E5aJ5oAxmc6fMpH1yr91yEKvdGfbnx5KJ cJVwrwonk77TGDB/dzusFN6FnC+j3Tw3rJzW6NpH6iFozdPENHlG0x1qIlAnvlkh Bspb3aKtRjQLy+OylPP55EFlr0EBWPAM6yk8FdsDzNXzdap86Mf6vszM+rvxWGHV /fMLQ78ZzT48c3Vi4UpQVyIvJlo8RgR/Uc1nzkAQ7K+rLE+DZxs= =DiQ1 -----END PGP SIGNATURE----- --B65brLn059YrACkp--