From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from gandalf.ozlabs.org (gandalf.ozlabs.org [150.107.74.76])
        by passt.top (Postfix) with ESMTPS id A41A45A026B
        for ; Thu, 17 Nov 2022 06:59:16 +0100 (CET)
Received: by gandalf.ozlabs.org (Postfix, from userid 1007)
        id 4NCTkl1FCxz4xZ3; Thu, 17 Nov 2022 16:59:11 +1100 (AEDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gibson.dropbear.id.au; s=201602; t=1668664751;
        bh=Rek/Ri32OgZeG6He3tCjx3B160sOFYfmKMRPmcx+mwE=;
        h=From:To:Cc:Subject:Date:From;
        b=aOyv9r6LIkavv6vnxQrPm6fcVKnZVZ9u38oF3r3IgkJWDTNfufcX5hDm2AgTUnAde
         p+KCbpfbCoQAMAZUrwFe73odzjm/qw80XBv9lEpwfo7qLDVtb18fcHIlWgsPAW8VLg
         yOJKv76N6CA0Uisz0BPQBI2YlCQQNj7EjlBeFAyk=
From: David Gibson
To: passt-dev@passt.top, Stefano Brivio
Subject: [PATCH v2 00/32] Use dual stack sockets to listen for inbound TCP connections
Date: Thu, 17 Nov 2022 16:58:36 +1100
Message-Id: <20221117055908.2782981-1-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.38.1
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Message-ID-Hash: DDXMKSVFUBG6RRHC7UOQZK6LSVPFAEQQ
X-Message-ID-Hash: DDXMKSVFUBG6RRHC7UOQZK6LSVPFAEQQ
X-MailFrom: dgibson@gandalf.ozlabs.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
        loop; banned-address; member-moderation; nonmember-moderation;
        administrivia; implicit-dest; max-recipients; max-size;
        news-moderation; no-subject; digests; suspicious-header
CC: David Gibson
X-Mailman-Version: 3.3.3
Precedence: list
List-Id: Development discussion and patches for passt
Archived-At:
Archived-At:
List-Archive:
List-Archive:
List-Help:
List-Owner:
List-Post:
List-Subscribe:
List-Unsubscribe:

When forwarding many ports, passt can consume a lot of kernel memory
because of the many listening sockets it opens.  There are not a lot
of ways we can reduce that, but here's one.

Currently we create separate listening sockets for each port for both
IPv4 and IPv6.
However, on Linux (and probably other platforms), it's possible to
listen for both IPv4 and IPv6 connections on an IPv6 socket.  This
series uses such dual stack sockets to halve the number of listening
sockets needed for TCP.  When forwarding all TCP and UDP ports, this
reduces the kernel memory used from around 677 MiB to around 487 MiB
(kernel 6.0.8 on an x86_64 Fedora 37 machine).

This should also be possible for UDP, but that will require a mostly
separate implementation.

Changes since v1:
 * Assorted minor polishing based on Stefano's review

David Gibson (32):
  clang-tidy: Suppress warning about assignments in if statements
  style: Minor corrections to function comments
  tcp_splice: #include tcp_splice.h in tcp_splice.c
  tcp: Remove unused TCP_MAX_SOCKS constant
  tcp: Better helpers for converting between connection pointer and index
  tcp_splice: Helpers for converting from index to/from tcp_splice_conn
  tcp: Move connection state structures into a shared header
  tcp: Add connection union type
  tcp: Improved helpers to update connections after moving
  tcp: Unify spliced and non-spliced connection tables
  tcp: Unify tcp_defer_handler and tcp_splice_defer_handler()
  tcp: Partially unify tcp_timer() and tcp_splice_timer()
  tcp: Unify the IN_EPOLL flag
  tcp: Separate helpers to create ns listening sockets
  tcp: Unify part of spliced and non-spliced conn_from_sock path
  tcp: Use the same sockets to listen for spliced and non-spliced connections
  tcp: Remove splice from tcp_epoll_ref
  tcp: Don't store hash bucket in connection structures
  inany: Helper functions for handling addresses which could be IPv4 or IPv6
  tcp: Hash IPv4 and IPv4-mapped-IPv6 addresses the same
  tcp: Take tcp_hash_insert() address from struct tcp_conn
  tcp: Simplify tcp_hash_match() to take an inany_addr
  tcp: Unify initial sequence number calculation for IPv4 and IPv6
  tcp: Have tcp_seq_init() take its parameters from struct tcp_conn
  tcp: Fix small errors in tcp_seq_init() time handling
  tcp: Remove v6 flag from tcp_epoll_ref
  tcp: NAT IPv4-mapped IPv6 addresses like IPv4 addresses
  tcp_splice: Allow splicing of connections from IPv4-mapped loopback
  tcp: Consolidate tcp_sock_init[46]
  util: Allow sock_l4() to open dual stack sockets
  util: Always return -1 on error in sock_l4()
  tcp: Use dual stack sockets for port forwarding when possible

 Makefile     |  15 +-
 conf.c       |  12 +-
 inany.h      |  94 +++++
 siphash.c    |   2 +
 tap.c        |   6 +-
 tcp.c        | 981 ++++++++++++++++++++++-----------------------------
 tcp.h        |  11 +-
 tcp_conn.h   | 192 ++++++++++
 tcp_splice.c | 333 +++++++----------
 tcp_splice.h |  12 +-
 util.c       |  19 +-
 11 files changed, 891 insertions(+), 786 deletions(-)
 create mode 100644 inany.h
 create mode 100644 tcp_conn.h

-- 
2.38.1
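
For readers unfamiliar with dual stack sockets, the idea the cover
letter relies on can be sketched roughly as below.  This is only an
illustration, not code from the series: listen_dual_stack() and
mapped_to_v4() are hypothetical names, the backlog of 128 is an
arbitrary choice, and the series itself does this work through
sock_l4() and the new inany helpers.  The sketch assumes a host with
IPv6 enabled.  Clearing IPV6_V6ONLY on an AF_INET6 socket bound to
"::" lets a single listener accept both address families, and IPv4
peers then appear with IPv4-mapped addresses (::ffff:a.b.c.d).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the series: open one TCP listening
 * socket that accepts both IPv4 and IPv6 connections on @port.
 * Returns the socket fd, or -1 on error.
 */
int listen_dual_stack(uint16_t port)
{
	struct sockaddr_in6 addr;
	int off = 0;
	int fd;

	fd = socket(AF_INET6, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* Clearing IPV6_V6ONLY lets the IPv6 socket take IPv4 traffic too */
	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off)))
		goto fail;

	memset(&addr, 0, sizeof(addr));
	addr.sin6_family = AF_INET6;
	addr.sin6_addr = in6addr_any;	/* "::" also covers 0.0.0.0 here */
	addr.sin6_port = htons(port);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)))
		goto fail;
	if (listen(fd, 128))		/* backlog value is arbitrary */
		goto fail;

	return fd;

fail:
	close(fd);
	return -1;
}

/* Hypothetical helper: if @a6 is an IPv4-mapped address, copy the
 * embedded IPv4 address (network byte order) to @v4 and return 1.
 * Otherwise leave @v4 untouched and return 0.
 */
int mapped_to_v4(const struct in6_addr *a6, struct in_addr *v4)
{
	if (!IN6_IS_ADDR_V4MAPPED(a6))
		return 0;

	memcpy(v4, &a6->s6_addr[12], sizeof(*v4));
	return 1;
}
```

A caller would use listen_dual_stack() once per forwarded port where
it previously needed one AF_INET and one AF_INET6 socket, and could
pass the peer address returned by accept() through mapped_to_v4() to
decide whether the connection should be treated as IPv4.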