From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76])
	by passt.top (Postfix) with ESMTPS id B0EEF5A027A
	for ; Tue, 22 Aug 2023 07:30:09 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gibson.dropbear.id.au; s=201602; t=1692682203;
	bh=O2IfG3dnG9OeoupvbO7h7fpUTo9wPiU4a7ViFeib83s=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=iug9G2/5Scy4BttCXeOke8yFRpn2S6jrqdiW7Sz+M/gtw/UNkeR1g+UvI8ySVaj97
	 tBdAIIRg0ZDZX9USK/NwRCcMNTBgpYSRr4MHImFef1jtUyVvIhPjzEamS36Yu+zlgL
	 qmO6nK/wUOCEXoPrP1D8wBtYAHnD0DT5UhLQMs7Y=
Received: by gandalf.ozlabs.org (Postfix, from userid 1007)
	id 4RVHwq4RwWz4x2J; Tue, 22 Aug 2023 15:30:03 +1000 (AEST)
From: David Gibson
To: Stefano Brivio , passt-dev@passt.top
Subject: [PATCH v4 9/9] tcp: Remove broken pressure calculations for tcp_defer_handler()
Date: Tue, 22 Aug 2023 15:30:00 +1000
Message-ID: <20230822053000.1118063-10-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230822053000.1118063-1-david@gibson.dropbear.id.au>
References: <20230822053000.1118063-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Message-ID-Hash: RIZV2T2S53GNEC3OAHXG6MJI24OEBHEB
X-Message-ID-Hash: RIZV2T2S53GNEC3OAHXG6MJI24OEBHEB
X-MailFrom: dgibson@gandalf.ozlabs.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
	loop; banned-address; member-moderation; nonmember-moderation;
	administrivia; implicit-dest; max-recipients; max-size;
	news-moderation; no-subject; digests; suspicious-header
CC: David Gibson
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches for passt
Archived-At:
Archived-At:
List-Archive:
List-Archive:
List-Help:
List-Owner:
List-Post:
List-Subscribe:
List-Unsubscribe:

tcp_defer_handler() performs a potentially expensive linear scan of the
connection table.  So, to mitigate the cost of that we skip it if we're not
under at least moderate pressure: either 30% of available connections or
30% (estimated) of available fds used.

But, the calculation for this has been broken since it was introduced: we
calculate "max_conns" based on c->tcp.conn_count, not TCP_MAX_CONNS,
meaning we only exit early if conn_count is less than 30% of itself, i.e.
never.

If that calculation is "corrected" to be based on TCP_MAX_CONNS, it
completely tanks the TCP CRR times for passt - from ~60ms to >1000ms on my
laptop.  My guess is that this is because in the case of many short lived
connections, we're letting the table become much fuller before compacting
it.  That means that other places which perform a table scan now have to do
much, much more.

For the time being, simply remove the tests, since they're not doing
anything useful.  We can reintroduce them more carefully if we see a need
for them.

This also removes the only user of c->tcp.splice_conn_count, so that can
be removed as well.

Signed-off-by: David Gibson
---
 tcp.c        | 9 ---------
 tcp.h        | 2 --
 tcp_splice.c | 2 --
 3 files changed, 13 deletions(-)

diff --git a/tcp.c b/tcp.c
index f396ede..c89e6e4 100644
--- a/tcp.c
+++ b/tcp.c
@@ -309,9 +309,6 @@
 #define TCP_FRAMES							\
 	(c->mode == MODE_PASST ? TCP_FRAMES_MEM : 1)
 
-#define TCP_FILE_PRESSURE		30	/* % of c->nofile */
-#define TCP_CONN_PRESSURE		30	/* % of c->tcp.conn_count */
-
 #define TCP_HASH_TABLE_LOAD		70		/* % */
 #define TCP_HASH_TABLE_SIZE		(TCP_MAX_CONNS * 100 /	\
 					 TCP_HASH_TABLE_LOAD)
@@ -1385,17 +1382,11 @@ static void tcp_l2_data_buf_flush(struct ctx *c)
  */
 void tcp_defer_handler(struct ctx *c)
 {
-	int max_conns = c->tcp.conn_count / 100 * TCP_CONN_PRESSURE;
-	int max_files = c->nofile / 100 * TCP_FILE_PRESSURE;
 	union tcp_conn *conn;
 
 	tcp_l2_flags_buf_flush(c);
 	tcp_l2_data_buf_flush(c);
 
-	if ((c->tcp.conn_count < MIN(max_files, max_conns)) &&
-	    (c->tcp.splice_conn_count < MIN(max_files / 6, max_conns)))
-		return;
-
 	for (conn = tc + c->tcp.conn_count - 1; conn >= tc; conn--) {
 		if (conn->c.spliced) {
 			if (conn->splice.flags & CLOSING)
diff --git a/tcp.h b/tcp.h
index 1608d58..9eaec3f 100644
--- a/tcp.h
+++ b/tcp.h
@@ -56,7 +56,6 @@ union tcp_listen_epoll_ref {
  * struct tcp_ctx - Execution context for TCP routines
  * @hash_secret:	128-bit secret for hash functions, ISN and hash table
  * @conn_count:		Count of total connections in connection table
- * @splice_conn_count:	Count of spliced connections in connection table
  * @port_to_tap:	Ports bound host-side, packets to tap or spliced
  * @fwd_in:		Port forwarding configuration for inbound packets
  * @fwd_out:		Port forwarding configuration for outbound packets
@@ -67,7 +66,6 @@ union tcp_listen_epoll_ref {
 struct tcp_ctx {
 	uint64_t hash_secret[2];
 	int conn_count;
-	int splice_conn_count;
 	struct port_fwd fwd_in;
 	struct port_fwd fwd_out;
 	struct timespec timer_run;
diff --git a/tcp_splice.c b/tcp_splice.c
index 1f89d6a..5b36975 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -295,7 +295,6 @@ void tcp_splice_destroy(struct ctx *c, union tcp_conn *conn_union)
 	conn->flags = 0;
 	debug("TCP (spliced): index %li, CLOSED", CONN_IDX(conn));
 
-	c->tcp.splice_conn_count--;
 	tcp_table_compact(c, conn_union);
 }
 
@@ -513,7 +512,6 @@ bool tcp_splice_conn_from_sock(struct ctx *c, union tcp_listen_epoll_ref ref,
 		trace("TCP (spliced): failed to set TCP_QUICKACK on %i", s);
 
 	conn->c.spliced = true;
-	c->tcp.splice_conn_count++;
 	conn->a = s;
 
 	if (tcp_splice_new(c, conn, ref.port, ref.ns))
-- 
2.41.0
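
Illustrative note, not part of the patch: the sketch below isolates the removed arithmetic to show why the early return in tcp_defer_handler() could never trigger. The helper names would_skip() and would_skip_fixed() and the max_table parameter are hypothetical and exist only for this example; max_table stands in for TCP_MAX_CONNS, whose value is not shown in this message.

```c
#define TCP_FILE_PRESSURE	30	/* % of nofile */
#define TCP_CONN_PRESSURE	30	/* % of the connection limit */
#define MIN(a, b)		((a) < (b) ? (a) : (b))

/* Hypothetical helper mirroring the removed test: returns 1 if the
 * table scan would have been skipped.  max_conns is derived from
 * conn_count itself, so conn_count < max_conns cannot hold for any
 * non-negative conn_count.
 */
static int would_skip(int conn_count, int splice_conn_count, int nofile)
{
	int max_conns = conn_count / 100 * TCP_CONN_PRESSURE;
	int max_files = nofile / 100 * TCP_FILE_PRESSURE;

	return conn_count < MIN(max_files, max_conns) &&
	       splice_conn_count < MIN(max_files / 6, max_conns);
}

/* Hypothetical "corrected" variant described in the commit message:
 * the threshold is derived from the table size (max_table, assumed
 * here to play the role of TCP_MAX_CONNS) instead of conn_count.
 */
static int would_skip_fixed(int conn_count, int splice_conn_count,
			    int nofile, int max_table)
{
	int max_conns = max_table / 100 * TCP_CONN_PRESSURE;
	int max_files = nofile / 100 * TCP_FILE_PRESSURE;

	return conn_count < MIN(max_files, max_conns) &&
	       splice_conn_count < MIN(max_files / 6, max_conns);
}
```

With these definitions, would_skip(n, 0, nofile) returns 0 for every n >= 0, which matches the "i.e. never" above. The would_skip_fixed() variant does skip the scan at low occupancy, which is the behaviour the commit message reports as badly hurting CRR times.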