public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top, jmaloy@redhat.com
Subject: Re: [PATCH v6 01/26] flow: Common address information for initiating side
Date: Wed, 26 Jun 2024 00:23:28 +0200	[thread overview]
Message-ID: <20240626002328.74900b0d@elisabeth> (raw)
In-Reply-To: <20240614061348.3814736-2-david@gibson.dropbear.id.au>

On Fri, 14 Jun 2024 16:13:23 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> Handling of each protocol needs some degree of tracking of the
> addresses and ports at the end of each connection or flow.  Sometimes
> that's explicit (as in the guest visible addresses for TCP
> connections), sometimes implicit (the bound and connected addresses of
> sockets).
> 
> To allow more consistent handling across protocols we want to
> uniformly track the address and port at each end of the connection.
> Furthermore, because we allow port remapping, and we sometimes need to
> apply NAT, the addresses and ports can be different as seen by the
> guest/namespace and as by the host.
> 
> Introduce 'struct flowside' to keep track of address and port
> information related to one side of a flow. Store two of these in the
> common fields of a flow to track that information for both sides.
> 
> For now we only populate the initiating side, requiring that
> information be completed when a flows enter INI.  Later patches will
> populate the target side.
> 
> For now this leaves some information redundantly recorded in both generic
> and type specific fields.  We'll fix that in later patches.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  flow.c       | 96 +++++++++++++++++++++++++++++++++++++++++++++++++---
>  flow.h       | 16 +++++++++
>  flow_table.h |  8 ++++-
>  icmp.c       |  9 +++--
>  passt.h      |  3 ++
>  tcp.c        |  6 ++--
>  6 files changed, 127 insertions(+), 11 deletions(-)
> 
> diff --git a/flow.c b/flow.c
> index d05aa495..1819111d 100644
> --- a/flow.c
> +++ b/flow.c
> @@ -108,6 +108,31 @@ static const union flow *flow_new_entry; /* = NULL */
>  /* Last time the flow timers ran */
>  static struct timespec flow_timer_run;
>  
> +/** flowside_from_af() - Initialise flowside from addresses
> + * @fside:	flowside to initialise
> + * @af:		Address family (AF_INET or AF_INET6)
> + * @eaddr:	Endpoint address (pointer to in_addr or in6_addr)
> + * @eport:	Endpoint port
> + * @faddr:	Forwarding address (pointer to in_addr or in6_addr)
> + * @fport:	Forwarding port
> + */
> +static void flowside_from_af(struct flowside *fside, sa_family_t af,
> +			     const void *eaddr, in_port_t eport,
> +			     const void *faddr, in_port_t fport)
> +{
> +	if (faddr)
> +		inany_from_af(&fside->faddr, af, faddr);
> +	else
> +		fside->faddr = inany_any6;

I kind of wonder if, for clarity, we should perhaps:

#define inany_any		inany_any6

?

And... I don't actually have in mind how IP versions are stored in the
flow table for unspecified addresses, but I wonder if we're losing some
information this way: if this is called with AF_INET as 'af', we're not
saying anywhere in the flowside information that this is going to be an
IPv4 flowside, right?

> +	fside->fport = fport;
> +
> +	if (eaddr)
> +		inany_from_af(&fside->eaddr, af, eaddr);
> +	else
> +		fside->eaddr = inany_any6;
> +	fside->eport = eport;
> +}
> +
>  /** flow_log_ - Log flow-related message
>   * @f:		flow the message is related to
>   * @pri:	Log priority
> @@ -140,6 +165,8 @@ void flow_log_(const struct flow_common *f, int pri, const char *fmt, ...)
>   */
>  static void flow_set_state(struct flow_common *f, enum flow_state state)
>  {
> +	char estr[INANY_ADDRSTRLEN], fstr[INANY_ADDRSTRLEN];
> +	const struct flowside *ini = &f->side[INISIDE];
>  	uint8_t oldstate = f->state;
>  
>  	ASSERT(state < FLOW_NUM_STATES);
> @@ -150,18 +177,28 @@ static void flow_set_state(struct flow_common *f, enum flow_state state)
>  		  FLOW_STATE(f));
>  
>  	if (MAX(state, oldstate) >= FLOW_STATE_TGT)
> -		flow_log_(f, LOG_DEBUG, "%s => %s", pif_name(f->pif[INISIDE]),
> -			                            pif_name(f->pif[TGTSIDE]));
> +		flow_log_(f, LOG_DEBUG, "%s [%s]:%hu -> [%s]:%hu => %s",
> +			  pif_name(f->pif[INISIDE]),
> +			  inany_ntop(&ini->eaddr, estr, sizeof(estr)),
> +			  ini->eport,
> +			  inany_ntop(&ini->faddr, fstr, sizeof(fstr)),
> +			  ini->fport,
> +			  pif_name(f->pif[TGTSIDE]));
>  	else if (MAX(state, oldstate) >= FLOW_STATE_INI)
> -		flow_log_(f, LOG_DEBUG, "%s => ?", pif_name(f->pif[INISIDE]));
> +		flow_log_(f, LOG_DEBUG, "%s [%s]:%hu -> [%s]:%hu => ?",
> +			  pif_name(f->pif[INISIDE]),
> +			  inany_ntop(&ini->eaddr, estr, sizeof(estr)),
> +			  ini->eport,
> +			  inany_ntop(&ini->faddr, fstr, sizeof(fstr)),
> +			  ini->fport);
>  }
>  
>  /**
> - * flow_initiate() - Move flow to INI, setting INISIDE details
> + * flow_initiate_() - Move flow to INI, setting setting pif[INISIDE]

s/setting setting/setting/

>   * @flow:	Flow to change state
>   * @pif:	pif of the initiating side
>   */
> -void flow_initiate(union flow *flow, uint8_t pif)
> +static void flow_initiate_(union flow *flow, uint8_t pif)
>  {
>  	struct flow_common *f = &flow->f;
>  
> @@ -174,6 +211,55 @@ void flow_initiate(union flow *flow, uint8_t pif)
>  	flow_set_state(f, FLOW_STATE_INI);
>  }
>  
> +/**
> + * flow_initiate_af() - Move flow to INI, setting INISIDE details
> + * @flow:	Flow to change state
> + * @pif:	pif of the initiating side
> + * @af:		Address family of @eaddr and @faddr
> + * @saddr:	Source address (pointer to in_addr or in6_addr)
> + * @sport:	Endpoint port
> + * @daddr:	Destination address (pointer to in_addr or in6_addr)
> + * @dport:	Destination port
> + *
> + * Return: pointer to the initiating flowside information
> + */
> +const struct flowside *flow_initiate_af(union flow *flow, uint8_t pif,
> +					sa_family_t af,
> +					const void *saddr, in_port_t sport,
> +					const void *daddr, in_port_t dport)
> +{
> +	struct flowside *ini = &flow->f.side[INISIDE];
> +
> +	flowside_from_af(ini, af, saddr, sport, daddr, dport);
> +	flow_initiate_(flow, pif);
> +	return ini;
> +}
> +
> +/**
> + * flow_initiate_sa() - Move flow to INI, setting INISIDE details
> + * @flow:	Flow to change state
> + * @pif:	pif of the initiating side
> + * @ssa:	Source socket address
> + * @dport:	Destination port
> + *
> + * Return: pointer to the initiating flowside information
> + */
> +const struct flowside *flow_initiate_sa(union flow *flow, uint8_t pif,
> +					const union sockaddr_inany *ssa,
> +					in_port_t dport)
> +{
> +	struct flowside *ini = &flow->f.side[INISIDE];
> +
> +	inany_from_sockaddr(&ini->eaddr, &ini->eport, ssa);
> +	if (inany_v4(&ini->eaddr))
> +		ini->faddr = inany_any4;
> +	else
> +		ini->faddr = inany_any6;
> +	ini->fport = dport;
> +	flow_initiate_(flow, pif);
> +	return ini;
> +}
> +
>  /**
>   * flow_target() - Move flow to TGT, setting TGTSIDE details
>   * @flow:	Flow to change state
> diff --git a/flow.h b/flow.h
> index 29ef9f12..b873ea7f 100644
> --- a/flow.h
> +++ b/flow.h
> @@ -135,11 +135,26 @@ extern const uint8_t flow_proto[];
>  #define INISIDE			0	/* Initiating side */
>  #define TGTSIDE			1	/* Target side */
>  
> +/**
> + * struct flowside - Address information for one side of a flow
> + * @eaddr:	Endpoint address (remote address from passt's PoV)
> + * @faddr:	Forwarding address (local address from passt's PoV)
> + * @eport:	Endpoint port
> + * @fport:	Forwarding port
> + */
> +struct flowside {
> +	union inany_addr	faddr;
> +	union inany_addr	eaddr;
> +	in_port_t		fport;
> +	in_port_t		eport;
> +};
> +
>  /**
>   * struct flow_common - Common fields for packet flows
>   * @state:	State of the flow table entry
>   * @type:	Type of packet flow
>   * @pif[]:	Interface for each side of the flow
> + * @side[]:	Information for each side of the flow
>   */
>  struct flow_common {
>  #ifdef __GNUC__
> @@ -154,6 +169,7 @@ struct flow_common {
>  		      "Not enough bits for type field");
>  #endif
>  	uint8_t		pif[SIDES];
> +	struct flowside	side[SIDES];
>  };
>  
>  #define FLOW_INDEX_BITS		17	/* 128k - 1 */
> diff --git a/flow_table.h b/flow_table.h
> index 1b163491..2e912532 100644
> --- a/flow_table.h
> +++ b/flow_table.h
> @@ -107,7 +107,13 @@ static inline flow_sidx_t flow_sidx(const struct flow_common *f,
>  union flow *flow_alloc(void);
>  void flow_alloc_cancel(union flow *flow);
>  
> -void flow_initiate(union flow *flow, uint8_t pif);
> +const struct flowside *flow_initiate_af(union flow *flow, uint8_t pif,
> +					sa_family_t af,
> +					const void *saddr, in_port_t sport,
> +					const void *daddr, in_port_t dport);
> +const struct flowside *flow_initiate_sa(union flow *flow, uint8_t pif,
> +					const union sockaddr_inany *ssa,
> +					in_port_t dport);
>  void flow_target(union flow *flow, uint8_t pif);
>  
>  union flow *flow_set_type(union flow *flow, enum flow_type type);
> diff --git a/icmp.c b/icmp.c
> index 80330f6f..38d1deeb 100644
> --- a/icmp.c
> +++ b/icmp.c
> @@ -146,12 +146,15 @@ static void icmp_ping_close(const struct ctx *c,
>   * @id_sock:	Pointer to ping flow entry slot in icmp_id_map[] to update
>   * @af:		Address family, AF_INET or AF_INET6
>   * @id:		ICMP id for the new socket
> + * @saddr:	Source address
> + * @daddr:	Destination address
>   *
>   * Return: Newly opened ping flow, or NULL on failure
>   */
>  static struct icmp_ping_flow *icmp_ping_new(const struct ctx *c,
>  					    struct icmp_ping_flow **id_sock,
> -					    sa_family_t af, uint16_t id)
> +					    sa_family_t af, uint16_t id,
> +					    const void *saddr, const void *daddr)
>  {
>  	uint8_t flowtype = af == AF_INET ? FLOW_PING4 : FLOW_PING6;
>  	union epoll_ref ref = { .type = EPOLL_TYPE_PING };
> @@ -163,7 +166,7 @@ static struct icmp_ping_flow *icmp_ping_new(const struct ctx *c,
>  	if (!flow)
>  		return NULL;
>  
> -	flow_initiate(flow, PIF_TAP);
> +	flow_initiate_af(flow, PIF_TAP, af, saddr, id, daddr, id);
>  	flow_target(flow, PIF_HOST);
>  	pingf = FLOW_SET_TYPE(flow, flowtype, ping);
>  
> @@ -269,7 +272,7 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
>  	}
>  
>  	if (!(pingf = *id_sock))
> -		if (!(pingf = icmp_ping_new(c, id_sock, af, id)))
> +		if (!(pingf = icmp_ping_new(c, id_sock, af, id, saddr, daddr)))
>  			return 1;
>  
>  	pingf->ts = now->tv_sec;
> diff --git a/passt.h b/passt.h
> index 46d073a2..76810867 100644
> --- a/passt.h
> +++ b/passt.h
> @@ -17,6 +17,9 @@ union epoll_ref;
>  
>  #include "pif.h"
>  #include "packet.h"
> +#include "siphash.h"
> +#include "ip.h"
> +#include "inany.h"
>  #include "flow.h"
>  #include "icmp.h"
>  #include "fwd.h"
> diff --git a/tcp.c b/tcp.c
> index 68524235..07a2eb1c 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -1629,7 +1629,7 @@ static void tcp_conn_from_tap(struct ctx *c, sa_family_t af,
>  	if (!(flow = flow_alloc()))
>  		return;
>  
> -	flow_initiate(flow, PIF_TAP);
> +	flow_initiate_af(flow, PIF_TAP, af, saddr, srcport, daddr, dstport);
>  
>  	if (af == AF_INET) {
>  		if (IN4_IS_ADDR_UNSPECIFIED(saddr) ||
> @@ -2288,7 +2288,9 @@ void tcp_listen_handler(struct ctx *c, union epoll_ref ref,
>  	if (s < 0)
>  		goto cancel;
>  
> -	flow_initiate(flow, ref.tcp_listen.pif);
> +	/* FIXME: When listening port has a specific bound address, record that
> +	 * as the forwarding address */
> +	flow_initiate_sa(flow, ref.tcp_listen.pif, &sa, ref.tcp_listen.port);
>  
>  	if (sa.sa_family == AF_INET) {
>  		const struct in_addr *addr = &sa.sa4.sin_addr;

-- 
Stefano


  reply	other threads:[~2024-06-25 22:24 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-14  6:13 [PATCH v6 00/26] RFC: Unified flow table David Gibson
2024-06-14  6:13 ` [PATCH v6 01/26] flow: Common address information for initiating side David Gibson
2024-06-25 22:23   ` Stefano Brivio [this message]
2024-06-26  0:19     ` David Gibson
2024-06-14  6:13 ` [PATCH v6 02/26] flow: Common address information for target side David Gibson
2024-06-25 22:23   ` Stefano Brivio
2024-06-26  0:25     ` David Gibson
2024-06-14  6:13 ` [PATCH v6 03/26] tcp, flow: Remove redundant information, repack connection structures David Gibson
2024-06-25 22:25   ` Stefano Brivio
2024-06-26  0:23     ` David Gibson
2024-06-14  6:13 ` [PATCH v6 04/26] tcp: Obtain guest address from flowside David Gibson
2024-06-14  6:13 ` [PATCH v6 05/26] tcp: Manage outbound address via flow table David Gibson
2024-06-14  6:13 ` [PATCH v6 06/26] tcp: Simplify endpoint validation using flowside information David Gibson
2024-06-14  6:13 ` [PATCH v6 07/26] tcp_splice: Eliminate SPLICE_V6 flag David Gibson
2024-06-14  6:13 ` [PATCH v6 08/26] tcp, flow: Replace TCP specific hash function with general flow hash David Gibson
2024-06-14  6:13 ` [PATCH v6 09/26] flow, tcp: Generalise TCP hash table to general flow hash table David Gibson
2024-06-14  6:13 ` [PATCH v6 10/26] tcp: Re-use flow hash for initial sequence number generation David Gibson
2024-06-14  6:13 ` [PATCH v6 11/26] icmp: Remove redundant id field from flow table entry David Gibson
2024-06-14  6:13 ` [PATCH v6 12/26] icmp: Obtain destination addresses from the flowsides David Gibson
2024-06-14  6:13 ` [PATCH v6 13/26] icmp: Look up ping flows using flow hash David Gibson
2024-06-14  6:13 ` [PATCH v6 14/26] icmp: Eliminate icmp_id_map David Gibson
2024-06-14  6:13 ` [PATCH v6 15/26] icmp: Manage outbound socket address via flow table David Gibson
2024-06-14  6:13 ` [PATCH v6 16/26] flow, tcp: Flow based NAT and port forwarding for TCP David Gibson
2024-06-26 22:49   ` Stefano Brivio
2024-06-27  5:55     ` David Gibson
2024-06-14  6:13 ` [PATCH v6 17/26] flow, icmp: Use general flow forwarding rules for ICMP David Gibson
2024-06-14  6:13 ` [PATCH v6 18/26] fwd: Update flow forwarding logic for UDP David Gibson
2024-06-14  6:13 ` [PATCH v6 19/26] udp: Create flow table entries " David Gibson
2024-06-14  6:13 ` [PATCH v6 20/26] udp: Direct traffic from tap according to flow table David Gibson
2024-06-14  6:13 ` [PATCH v6 21/26] udp: Direct traffic from host to guest " David Gibson
2024-06-14  6:13 ` [PATCH v6 22/26] udp: Direct spliced traffic " David Gibson
2024-06-14  6:13 ` [PATCH v6 23/26] udp: Remove 'splicesrc' tracking David Gibson
2024-06-14  6:13 ` [PATCH v6 24/26] udp: Remove tap port flags field David Gibson
2024-06-14  6:13 ` [PATCH v6 25/26] udp: Remove rdelta port forwarding maps David Gibson
2024-06-14  6:13 ` [PATCH v6 26/26] udp: Eliminate 'splice' flag from epoll reference David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240626002328.74900b0d@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=jmaloy@redhat.com \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).