From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=dw9T8fyA; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by passt.top (Postfix) with ESMTP id 6C11C5A004E for ; Sun, 22 Sep 2024 08:45:19 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1726987518; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=alfL/24ECRfpxkxUIiO9E3nK/tzoDWAaxhTP8Vdyb4Y=; b=dw9T8fyAGOhK0pnXtQwvZH4DDf8YYVTD3qkskNP4A7ydjz3lls8yZdcZOGRooOhN6HNB5i fsgZYXNz2A28kSsOhjo9GZg5t2mgBPzPBb50IXRXUvhUDgwJkoHxsixV4P1KPaaZiJ/p9L +HBeTjfXHVpIMrxu9F4N4zoLMxYYLUI= Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-98-EXeg2X-YN0S_FqNd0xFGTg-1; Sun, 22 Sep 2024 02:45:16 -0400 X-MC-Unique: EXeg2X-YN0S_FqNd0xFGTg-1 Received: by mail-wm1-f69.google.com with SMTP id 5b1f17b1804b1-42cae209243so20262335e9.1 for ; Sat, 21 Sep 2024 23:45:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726987514; x=1727592314; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=FwglmlUp/IJqtjXL+c3j+AfXhcw3d5bUZZDuP6+4rmU=; b=JO6ucZdOQokf7FS0AIu/sjoM05T0zHjfWpbTVM7ezAdoOTfQ8Ejg+cqwGouU9/qTy9 5mL6g1R5H6kRUrYT9HL3x7J1drn/Iy0rx8l7mbH3pNDqZya6NADvr4UJxSDFIYr3TLgP Xi+QJ8E4n2J96iMz+CtBn99sTvNQszYv77Fa6EdMXAVikJ51XSTOv3C9ryYfCs6NDg4e wjTV8Y08DJYWAQRwJdvK6YnM1ijpfRQ+l/Slv70NKPu4wmW/L/ni340D9aP6cjUpT8Wu +IIyrSVx+uxAC1lhFHj5rWHh3TziGByg70BQf/U+9GUraWwzf52btbodWZql0xf8QUkn Co0Q== X-Gm-Message-State: AOJu0YyuzwkI1Jl9kOxjk8ZFLYsrie0TBWLczqVxZLTpwViFU/+MCPfQ QOs5PHHXyC75cADldt0tL26C8EINN1RLIowuNAE79R3r/PCTpOW0epoe33Cr0pAepYbugKFZjPo a7yhjvtUgQ/0oHiqm0sKf8fBsQLJhvP41sjRWJL3clSNeN3Qaj7usq3BkY568LmIcBb5sttBEXF OZnLk0aSDKdAgN/bgrSG3RcJpBDV+u0cSw X-Received: by 2002:adf:ec4d:0:b0:374:ce8b:bc65 with SMTP id ffacd0b85a97d-37a42380fc2mr3735541f8f.56.1726987514239; Sat, 21 Sep 2024 23:45:14 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHyHGXGisQSamOlKq+ACXvW33n0dXzZxx5wxkgGfXwD6Iy9UOKNgzPKW1EFEpn21xHJmvHGRw== X-Received: by 2002:adf:ec4d:0:b0:374:ce8b:bc65 with SMTP id ffacd0b85a97d-37a42380fc2mr3735523f8f.56.1726987513599; Sat, 21 Sep 2024 23:45:13 -0700 (PDT) Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [176.103.220.4]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-378e72e4b60sm21424332f8f.24.2024.09.21.23.45.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 21 Sep 2024 23:45:11 -0700 (PDT) Date: Sun, 22 Sep 2024 08:45:08 +0200 From: Stefano Brivio To: Jon Maloy Subject: Re: [PATCH] tcp: unify payload and flags l2 frames array Message-ID: <20240922084508.04fe7206@elisabeth> In-Reply-To: <20240921194333.3938826-1-jmaloy@redhat.com> References: <20240921194333.3938826-1-jmaloy@redhat.com> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.41; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Message-ID-Hash: NU4XN2KHZBWCZPYGCEMEF6TWSYBWID2P X-Message-ID-Hash: NU4XN2KHZBWCZPYGCEMEF6TWSYBWID2P X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, lvivier@redhat.com, dgibson@redhat.com X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Sat, 21 Sep 2024 15:43:33 -0400 Jon Maloy wrote: > In order to reduce memory and code footprint, we merge the array As it is, with this one, I think we're actually wasting more memory, because, if I understand this patch correctly, every ACK we send needs its own 64k unused buffer. > for l2 flag frames into the one for payload frames. >=20 > This change also ensures that no flag message will be sent out > over the l2 media bypassing already queued payload messages. >=20 > Signed-off-by: Jon Maloy > --- > tcp.c | 3 +-- > tcp_buf.c | 70 ++++++++++++------------------------------------------- > 2 files changed, 16 insertions(+), 57 deletions(-) If I apply this on top of "[PATCH v3 0/2] tcp: unify IPv4 and IPv6 tap queues" (it doesn't apply otherwise), it doesn't build: tcp.c:1149:5: error: conflicting types for =E2=80=98tcp_prepare_flags=E2=80= =99; have =E2=80=98int(const struct ctx *, struct tcp_tap_conn *, int, str= uct tcphdr *, uint8_t *, size_t *)=E2=80=99 {aka =E2=80=98int(const struct = ctx *, struct tcp_tap_conn *, int, struct tcphdr *, unsigned char *, long = unsigned int *)=E2=80=99} 1149 | int tcp_prepare_flags(const struct ctx *c, struct tcp_tap_conn *con= n, | ^~~~~~~~~~~~~~~~~ In file included from tcp.c:305: tcp_internal.h:98:5: note: previous declaration of =E2=80=98tcp_prepare_fla= gs=E2=80=99 with type =E2=80=98int(const struct ctx *, struct tcp_tap_conn = *, int, struct tcphdr *, char *, size_t *)=E2=80=99 {aka =E2=80=98int(cons= t struct ctx *, struct tcp_tap_conn *, int, struct tcphdr *, char *, long = unsigned int *)=E2=80=99} 98 | int tcp_prepare_flags(const struct ctx *c, struct tcp_tap_conn *con= n, int flags, | ^~~~~~~~~~~~~~~~~ tcp_buf.c: In function =E2=80=98tcp_buf_send_flag=E2=80=99: tcp_buf.c:204:40: warning: pointer targets in passing argument 5 of =E2=80= =98tcp_prepare_flags=E2=80=99 differ in signedness [-Wpointer-sign] 204 | payload->data, &optlen); | ~~~~~~~^~~~~~ | | | uint8_t * {aka unsigned char= *} In file included from tcp_buf.c:33: tcp_internal.h:99:48: note: expected =E2=80=98char *=E2=80=99 but argument = is of type =E2=80=98uint8_t *=E2=80=99 {aka =E2=80=98unsigned char *=E2=80= =99} 99 | struct tcphdr *th, char *data, size_t *optlen= ); | ~~~~~~^~~~ make: *** [Makefile:122: passt] Error 1 > diff --git a/tcp.c b/tcp.c > index 12e864f..dd060d0 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -868,7 +868,6 @@ bool tcp_flow_defer(const struct tcp_tap_conn *conn) > /* cppcheck-suppress [constParameterPointer, unmatchedSuppression] */ > void tcp_defer_handler(struct ctx *c) > { > -=09tcp_flags_flush(c); > =09tcp_payload_flush(c); > } > =20 > @@ -1148,7 +1147,7 @@ static void tcp_update_seqack_from_tap(const struct= ctx *c, > *=09 1 otherwise > */ > int tcp_prepare_flags(const struct ctx *c, struct tcp_tap_conn *conn, > -=09=09 int flags, struct tcphdr *th, char *data, > +=09=09 int flags, struct tcphdr *th, uint8_t *data, > =09=09 size_t *optlen) > { > =09struct tcp_info tinfo =3D { 0 }; > diff --git a/tcp_buf.c b/tcp_buf.c > index 625c29c..58636d1 100644 > --- a/tcp_buf.c > +++ b/tcp_buf.c > @@ -33,6 +33,7 @@ > #include "tcp_internal.h" > #include "tcp_buf.h" > =20 > +#define PAYLOAD_FLAGS htons(0x5010) /* doff =3D 5, ack =3D 1 */ > #define TCP_FRAMES_MEM=09=09=09128 > #define TCP_FRAMES=09=09=09=09=09=09=09 \ > =09(c->mode =3D=3D MODE_PASTA ? 1 : TCP_FRAMES_MEM) > @@ -52,21 +53,6 @@ struct tcp_payload_t { > } __attribute__ ((packed, aligned(__alignof__(unsigned int)))); > #endif > =20 > -/** > - * struct tcp_flags_t - TCP header and data to send zero-length > - * segments (flags) > - * @th:=09=09TCP header > - * @opts=09TCP options > - */ > -struct tcp_flags_t { > -=09struct tcphdr th; > -=09char opts[OPT_MSS_LEN + OPT_WS_LEN + 1]; > -#ifdef __AVX2__ > -} __attribute__ ((packed, aligned(32))); > -#else > -} __attribute__ ((packed, aligned(__alignof__(unsigned int)))); > -#endif > - > /* Ethernet header for IPv4 and IPv6 frames */ > static struct ethhdr=09=09tcp4_eth_src; > static struct ethhdr=09=09tcp6_eth_src; > @@ -87,22 +73,10 @@ static_assert(MSS6 <=3D sizeof(tcp_payload[0].data), = "MSS6 is greater than 65516") > static struct tcp_tap_conn *tcp_frame_conns[TCP_FRAMES_MEM]; > static unsigned int tcp_payload_used; > =20 > -static struct tap_hdr=09=09tcp_flags_tap_hdr[TCP_FRAMES_MEM]; > -/* IPv4 headers for TCP segment without payload */ > -static struct iphdr=09=09tcp4_flags_ip[TCP_FRAMES_MEM]; > -/* TCP segments without payload for IPv4 frames */ > -static struct tcp_flags_t=09tcp_flags[TCP_FRAMES_MEM]; > - > -static unsigned int tcp_flags_used; > - > -/* IPv6 headers for TCP segment without payload */ > -static struct ipv6hdr=09=09tcp6_flags_ip[TCP_FRAMES_MEM]; > - > /* recvmsg()/sendmsg() data for tap */ > static struct iovec=09iov_sock=09=09[TCP_FRAMES_MEM + 1]; > =20 > static struct iovec=09tcp_l2_iov[TCP_FRAMES_MEM][TCP_NUM_IOVS]; > -static struct iovec=09tcp_l2_flags_iov[TCP_FRAMES_MEM][TCP_NUM_IOVS]; > =20 > /** > * tcp_update_l2_buf() - Update Ethernet header buffers with addresses > @@ -135,13 +109,6 @@ void tcp_sock_iov_init(const struct ctx *c) > =09=09tcp_payload[i].th.ack =3D 1; > =09} > =20 > -=09for (i =3D 0; i < ARRAY_SIZE(tcp_flags); i++) { > -=09=09tcp6_flags_ip[i] =3D ip6; > -=09=09tcp4_flags_ip[i] =3D iph; > -=09=09tcp_flags[i].th.doff =3D sizeof(struct tcphdr) / 4; > -=09=09tcp_flags[i].th.ack =3D 1; > -=09} > - > =09for (i =3D 0; i < TCP_FRAMES_MEM; i++) { > =09=09struct iovec *iov =3D tcp_l2_iov[i]; > =20 > @@ -149,14 +116,6 @@ void tcp_sock_iov_init(const struct ctx *c) > =09=09iov[TCP_IOV_ETH].iov_len =3D sizeof(struct ethhdr); > =09=09iov[TCP_IOV_PAYLOAD].iov_base =3D &tcp_payload[i]; > =09} > - > -=09for (i =3D 0; i < TCP_FRAMES_MEM; i++) { > -=09=09struct iovec *iov =3D tcp_l2_flags_iov[i]; > - > -=09=09iov[TCP_IOV_TAP] =3D tap_hdr_iov(c, &tcp_flags_tap_hdr[i]); > -=09=09iov[TCP_IOV_ETH].iov_len =3D sizeof(struct ethhdr); > -=09=09iov[TCP_IOV_PAYLOAD].iov_base =3D &tcp_flags[i]; > -=09} > } > =20 > /** > @@ -165,9 +124,7 @@ void tcp_sock_iov_init(const struct ctx *c) > */ > void tcp_flags_flush(const struct ctx *c) > { > -=09tap_send_frames(c, &tcp_l2_flags_iov[0][0], TCP_NUM_IOVS, > -=09=09=09tcp_flags_used); > -=09tcp_flags_used =3D 0; > +=09tcp_payload_flush(c); Should this still be called tcp_payload_flush() instead of, say, tcp_packets_flush() or tcp_frames_flush()? Does it make sense to keep tcp_flags_flush() at all? > } > =20 > /** > @@ -225,37 +182,35 @@ void tcp_payload_flush(const struct ctx *c) > */ > int tcp_buf_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, in= t flags) > { > -=09struct tcp_flags_t *payload; > +=09struct tcp_payload_t *payload; Should the struct still be defined as tcp_payload_t (instead of tcp_packet_t or tcp_frame_t) and this still be called payload? > =09struct iovec *iov; > =09size_t optlen; > =09size_t l4len; > =09uint32_t seq; > =09int ret; > =20 > -=09iov =3D tcp_l2_flags_iov[tcp_flags_used]; > +=09iov =3D tcp_l2_iov[tcp_payload_used]; > =09if (CONN_V4(conn)) { > -=09=09iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp4_flags_ip[tcp_flags_used]); > +=09=09iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp4_payload_ip[tcp_payload_used= ]); Same with this counter. > =09=09iov[TCP_IOV_ETH].iov_base =3D &tcp4_eth_src; > =09} else { > -=09=09iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp6_flags_ip[tcp_flags_used]); > +=09=09iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp6_payload_ip[tcp_payload_used= ]); > =09=09iov[TCP_IOV_ETH].iov_base =3D &tcp6_eth_src; > =09} > =20 > =09payload =3D iov[TCP_IOV_PAYLOAD].iov_base; > =09seq =3D conn->seq_to_tap; > =09ret =3D tcp_prepare_flags(c, conn, flags, &payload->th, > -=09=09=09=09payload->opts, &optlen); > +=09=09=09=09payload->data, &optlen); > =09if (ret <=3D 0) > =09=09return ret; > =20 > -=09tcp_flags_used++; > +=09tcp_payload_used++; > =09l4len =3D tcp_l2_buf_fill_headers(conn, iov, optlen, NULL, seq, false= ); > =09iov[TCP_IOV_PAYLOAD].iov_len =3D l4len; ...with this, we still have 64k reserved just to write (typically) 20 bytes. I think that, for these frames, we could avoid wasting all this memory by using a separate set of iov items and setting pointers as needed. > - > =09if (flags & DUP_ACK) { > -=09=09struct iovec *dup_iov; > +=09=09struct iovec *dup_iov =3D tcp_l2_iov[tcp_payload_used++]; > =20 > -=09=09dup_iov =3D tcp_l2_flags_iov[tcp_flags_used++]; > =09=09memcpy(dup_iov[TCP_IOV_TAP].iov_base, iov[TCP_IOV_TAP].iov_base, > =09=09 iov[TCP_IOV_TAP].iov_len); > =09=09dup_iov[TCP_IOV_ETH].iov_base =3D iov[TCP_IOV_ETH].iov_base; > @@ -265,7 +220,7 @@ int tcp_buf_send_flag(const struct ctx *c, struct tcp= _tap_conn *conn, int flags) > =09=09dup_iov[TCP_IOV_PAYLOAD].iov_len =3D l4len; > =09} > =20 > -=09if (tcp_flags_used > TCP_FRAMES_MEM - 2) > +=09if (tcp_payload_used > TCP_FRAMES_MEM - 2) > =09=09tcp_flags_flush(c); > =20 > =09return 0; > @@ -282,8 +237,10 @@ int tcp_buf_send_flag(const struct ctx *c, struct tc= p_tap_conn *conn, int flags) > static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *co= nn, > =09=09=09 ssize_t dlen, int no_csum, uint32_t seq) > { > +=09struct tcp_payload_t *payload; > =09const uint16_t *check =3D NULL; > =09struct iovec *iov; > +=09uint16_t *flags; > =09size_t l4len; > =20 > =09conn->seq_to_tap =3D seq + dlen; > @@ -302,6 +259,9 @@ static void tcp_data_to_tap(const struct ctx *c, stru= ct tcp_tap_conn *conn, > =09=09iov[TCP_IOV_IP] =3D IOV_OF_LVALUE(tcp6_payload_ip[tcp_payload_used= ]); > =09=09iov[TCP_IOV_ETH].iov_base =3D &tcp6_eth_src; > =09} > +=09payload =3D iov[TCP_IOV_PAYLOAD].iov_base; > +=09flags =3D &payload->th.window - 1; I guess using offsetof(struct tcp_hdr, doff) is a bit less mysterious. > +=09*(flags) =3D PAYLOAD_FLAGS; Excess parentheses. > =09l4len =3D tcp_l2_buf_fill_headers(conn, iov, dlen, check, seq, false)= ; > =09iov[TCP_IOV_PAYLOAD].iov_len =3D l4len; > =09if (++tcp_payload_used > TCP_FRAMES_MEM - 1) --=20 Stefano