From: David Gibson
To: Stefano Brivio, passt-dev@passt.top
Cc: David Gibson
Subject: [PATCH v5 00/19] RFC: Unified flow table
Date: Tue, 14 May 2024 11:03:18 +1000
Message-ID: <20240514010337.1104606-1-david@gibson.dropbear.id.au>
List-Id: Development discussion and patches for passt

This is a fifth draft of the first steps in implementing more general
"connection" tracking, as described at:
    https://pad.passt.top/p/NewForwardingModel

This series changes the TCP connection table and hash table into a
more general flow table that can track other protocols as well.  Each
flow uniformly keeps track of all the relevant addresses and ports,
which will allow for more robust control of NAT and port forwarding.

ICMP is converted to use the new flow table.

This doesn't include UDP, but I'm working on it right now and making
progress.  I'm posting this to give a head start on the review :)

Caveats:
 * We significantly increase the size of a connection/flow entry

Changes since v4:
 * flowside_from_af() no longer fills in unspecified addresses when
   passed NULL
 * Split and rename flow hash lookup function
 * Clarified flow state transitions, and enforced where practical
 * Made side 0 always the initiating side of a flow, rather than
   letting the protocol specific code decide
 * Separated pifs from flowside addresses to allow better structure
   packing

Changes since v3:
 * Complex rebase on top of the many things that have happened
   upstream since v2.
 * Assorted other changes.
 * Replace TAPFSIDE() and SOCKFSIDE() macros with local variables.
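For readers new to the design, the v4 items above (side 0 is always the
initiating side, pifs are kept apart from the per-side addresses, and
state transitions are enforced) can be pictured with the rough sketch
below.  This is not code from the series: every demo_* name, the state
list and the field layout are assumptions made up for illustration,
including the "forwarding address" naming.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SIDES   2
#define DEMO_INISIDE 0	/* side 0: always the side that initiated the flow */
#define DEMO_TGTSIDE 1	/* side 1: the side the flow is forwarded to */

/* Hypothetical states; a flow may only step forward one state at a time */
enum demo_flow_state {
	DEMO_FREE,	/* entry not in use */
	DEMO_NEW,	/* allocated, no address information yet */
	DEMO_INI,	/* initiating side recorded */
	DEMO_TGT,	/* target side recorded as well */
};

/* Addresses and ports seen on one side of a flow (16 bytes per address,
 * so an IPv4 address could be stored in IPv4-mapped form) */
struct demo_flowside {
	uint8_t  faddr[16];	/* our own ("forwarding") address, assumed name */
	uint8_t  eaddr[16];	/* endpoint (peer) address */
	uint16_t fport;
	uint16_t eport;
};

struct demo_flow {
	uint8_t state;			/* enum demo_flow_state */
	uint8_t pif[DEMO_SIDES];	/* interface per side, stored next to
					 * the other one-byte fields instead
					 * of inside each flowside */
	struct demo_flowside side[DEMO_SIDES];
};

/* Move to the next state; any other transition is a bug, so assert on it */
static void demo_set_state(struct demo_flow *f, enum demo_flow_state next)
{
	assert((int)next == (int)f->state + 1);
	f->state = (uint8_t)next;
}

/* Claim an entry: a zeroed entry is DEMO_FREE, then it becomes DEMO_NEW */
void demo_flow_alloc(struct demo_flow *f)
{
	memset(f, 0, sizeof(*f));
	demo_set_state(f, DEMO_NEW);
}

/* Record the initiating side; it always lands in side 0 */
void demo_flow_initiate(struct demo_flow *f, uint8_t pif,
			const struct demo_flowside *ini)
{
	demo_set_state(f, DEMO_INI);
	f->pif[DEMO_INISIDE] = pif;
	f->side[DEMO_INISIDE] = *ini;
}

/* Record the target side; only valid once the initiating side is known */
void demo_flow_target(struct demo_flow *f, uint8_t pif,
		      const struct demo_flowside *tgt)
{
	demo_set_state(f, DEMO_TGT);
	f->pif[DEMO_TGTSIDE] = pif;
	f->side[DEMO_TGTSIDE] = *tgt;
}
```

Calling demo_flow_target() before demo_flow_initiate() trips the
assertion, which is one simple way to enforce an ordering "where
practical".  The real state machine in the series may well differ.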
Changes since v2:
 * Cosmetic fixes based on review
 * Extra doc comments for enum flow_type
 * Rename flowside to flowaddrs, which turns out to make more sense in
   light of future changes
 * Fix bug where the socket flowaddrs for tap initiated connections
   wasn't initialised to match the socket address we were using in the
   case of map-gw NAT
 * New flowaddrs_from_sock() helper used in most cases, which is
   cleaner and should avoid bugs like the above
 * Using newer centralised workarounds for clang-tidy issue 58992
 * Remove duplicate definition of FLOW_MAX as maximum flow type and
   maximum number of tracked flows
 * Rebased on newer versions of preliminary work (ICMP, flow based
   dispatch and allocation, bind/address cleanups)
 * Unified hash table as well as base flow table
 * Integrated ICMP

Changes since v1:
 * Terminology changes
   - "Endpoint" address/port instead of "correspondent" address/port
   - "flowside" instead of "demiflow"
 * Actually move the connection table to a new flow table structure in
   new files
 * Significant rearrangement of earlier patches on top of that new
   table, to reduce churn

David Gibson (19):
  flow: Clarify and enforce flow state transitions
  flow: Make side 0 always be the initiating side
  flow: Record the pifs for each side of each flow
  tcp: Remove interim 'tapside' field from connection
  flow: Common data structures for tracking flow addresses
  flow: Populate address information for initiating side
  flow: Populate address information for non-initiating side
  tcp, flow: Remove redundant information, repack connection structures
  tcp: Obtain guest address from flowside
  tcp: Simplify endpoint validation using flowside information
  tcp_splice: Eliminate SPLICE_V6 flag
  tcp, flow: Replace TCP specific hash function with general flow hash
  flow, tcp: Generalise TCP hash table to general flow hash table
  tcp: Re-use flow hash for initial sequence number generation
  icmp: Use flowsides as the source of truth wherever possible
  icmp: Look up ping flows using flow hash
  icmp: Eliminate icmp_id_map
  flow, tcp: Flow based NAT and port forwarding for TCP
  flow, icmp: Use general flow forwarding rules for ICMP

 flow.c       | 538 +++++++++++++++++++++++++++++++++++++++++++++------
 flow.h       | 149 +++++++++++++-
 flow_table.h |  21 ++
 fwd.c        | 110 +++++++++++
 fwd.h        |  12 ++
 icmp.c       |  98 ++++++----
 icmp_flow.h  |   1 -
 inany.h      |  29 ++-
 passt.h      |   3 +
 pif.h        |   1 -
 tap.c        |  11 --
 tap.h        |   1 -
 tcp.c        | 484 ++++++++++++---------------------------------
 tcp_conn.h   |  36 ++--
 tcp_splice.c |  97 ++--------
 tcp_splice.h |   5 +-
 16 files changed, 999 insertions(+), 597 deletions(-)

-- 
2.45.0
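As a rough picture of what a protocol-independent flow hash table means
(the "general flow hash" patches in the list above), here is a small
sketch.  It is not taken from the series: all demo_* names are
hypothetical, the key layout is an assumption, and FNV-1a stands in for
whatever hash the real code uses.  FNV-1a is not a secure keyed hash;
it is used here only because it is short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SLOTS   64
#define DEMO_NO_FLOW (-1)

/* Lookup key: flow type and interface plus the addresses and ports of
 * one side.  Nothing here is specific to TCP. */
struct demo_key {
	uint8_t  type;		/* protocol / flow type */
	uint8_t  pif;		/* interface the packet arrived on */
	uint8_t  faddr[16];	/* our own address */
	uint8_t  eaddr[16];	/* endpoint (peer) address */
	uint16_t fport;
	uint16_t eport;
};

struct demo_table {
	uint64_t secret;		/* per-instance hash seed */
	int flow[DEMO_SLOTS];		/* flow index or DEMO_NO_FLOW */
	struct demo_key key[DEMO_SLOTS];
};

/* One FNV-1a pass over a buffer, continuing from state h */
static uint64_t demo_mix(uint64_t h, const void *data, size_t len)
{
	const uint8_t *p = data;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Hash every key field in turn, so struct padding never matters */
uint64_t demo_flow_hash(const struct demo_key *k, uint64_t secret)
{
	uint64_t h = 0xcbf29ce484222325ULL ^ secret;

	h = demo_mix(h, &k->type, sizeof(k->type));
	h = demo_mix(h, &k->pif, sizeof(k->pif));
	h = demo_mix(h, k->faddr, sizeof(k->faddr));
	h = demo_mix(h, k->eaddr, sizeof(k->eaddr));
	h = demo_mix(h, &k->fport, sizeof(k->fport));
	h = demo_mix(h, &k->eport, sizeof(k->eport));
	return h;
}

static int demo_key_eq(const struct demo_key *a, const struct demo_key *b)
{
	return a->type == b->type && a->pif == b->pif &&
	       a->fport == b->fport && a->eport == b->eport &&
	       !memcmp(a->faddr, b->faddr, sizeof(a->faddr)) &&
	       !memcmp(a->eaddr, b->eaddr, sizeof(a->eaddr));
}

void demo_table_init(struct demo_table *t, uint64_t secret)
{
	size_t i;

	t->secret = secret;
	for (i = 0; i < DEMO_SLOTS; i++)
		t->flow[i] = DEMO_NO_FLOW;
}

/* Linear probing insert; returns 0 on success, -1 if the table is full.
 * Duplicate keys are not detected in this sketch. */
int demo_table_insert(struct demo_table *t, const struct demo_key *k,
		      int flow_idx)
{
	size_t b = demo_flow_hash(k, t->secret) % DEMO_SLOTS;
	size_t i;

	assert(flow_idx >= 0);
	for (i = 0; i < DEMO_SLOTS; i++, b = (b + 1) % DEMO_SLOTS) {
		if (t->flow[b] == DEMO_NO_FLOW) {
			t->flow[b] = flow_idx;
			t->key[b] = *k;
			return 0;
		}
	}
	return -1;
}

/* Probe from the hashed bucket until a match or an empty slot */
int demo_table_lookup(const struct demo_table *t, const struct demo_key *k)
{
	size_t b = demo_flow_hash(k, t->secret) % DEMO_SLOTS;
	size_t i;

	for (i = 0; i < DEMO_SLOTS && t->flow[b] != DEMO_NO_FLOW;
	     i++, b = (b + 1) % DEMO_SLOTS) {
		if (demo_key_eq(&t->key[b], k))
			return t->flow[b];
	}
	return DEMO_NO_FLOW;
}
```

Because the key carries a type field, entries for different protocols
can share one table.  Entry removal is left out to keep the sketch
short.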