From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v4 01/16] flow: Common data structures for tracking flow addresses
Date: Tue, 14 May 2024 10:11:59 +1000 [thread overview]
Message-ID: <ZkKsT2ssCdxoTzfR@zatzit> (raw)
In-Reply-To: <20240513200700.07fb2518@elisabeth>
[-- Attachment #1: Type: text/plain, Size: 10733 bytes --]
On Mon, May 13, 2024 at 08:07:00PM +0200, Stefano Brivio wrote:
> Minor comments/nits only:
>
> On Fri, 3 May 2024 11:11:20 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > Handling of each protocol needs some degree of tracking of the addresses
> > and ports at the end of each connection or flow. Sometimes that's explicit
> > (as in the guest visible addresses for TCP connections), sometimes implicit
> > (the bound and connected addresses of sockets).
> >
> > To allow more general and robust handling, and more consistency across
> > protocols we want to uniformly track the address and port at each end of
> > the connection. Furthermore, because we allow port remapping, and we
> > sometimes need to apply NAT, the addresses and ports can be different as
> > seen by the guest/namespace and as by the host.
> >
> > Introduce 'struct flowside' to keep track of common information
> > related to one side of each flow. For now that's the addresses, ports
> > and the pif id. Store two of these in the common fields of a flow to
> > track that information for both sides. For now we just introduce the
> > structure itself, helpers to populate it, and logging of the contents
> > when starting and ending flows. Later patches will actually put
> > something useful there.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > flow.c | 28 ++++++++++++++++++--
> > flow.h | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > passt.h | 3 +++
> > pif.h | 1 -
> > tcp_conn.h | 1 -
> > 5 files changed, 104 insertions(+), 4 deletions(-)
> >
> > diff --git a/flow.c b/flow.c
> > index 80dd269..02d6008 100644
> > --- a/flow.c
> > +++ b/flow.c
> > @@ -51,10 +51,11 @@ static_assert(ARRAY_SIZE(flow_proto) == FLOW_NUM_TYPES,
> > *
> > * ALLOC - A tentatively allocated entry
> > * Operations:
> > + * - Common flow fields other than type may be accessed
> > * - flow_alloc_cancel() returns the entry to FREE state
> > * - FLOW_START() set the entry's type and moves to START state
> > * Caveats:
> > - * - It's not safe to write fields in the flow entry
> > + * - It's not safe to write flow type specific fields in the entry
> > * - It's not safe to allocate further entries with flow_alloc()
> > * - It's not safe to return to the main epoll loop (use FLOW_START()
> > * to move to START state before doing so)
> > @@ -62,6 +63,7 @@ static_assert(ARRAY_SIZE(flow_proto) == FLOW_NUM_TYPES,
> > *
> > * START - An entry being prepared by flow type specific code
> > * Operations:
> > + * - Common flow fields other than type may be accessed
> > * - Flow type specific fields may be accessed
> > * - flow_*() logging functions
> > * - flow_alloc_cancel() returns the entry to FREE state
> > @@ -168,9 +170,21 @@ void flow_log_(const struct flow_common *f, int pri, const char *fmt, ...)
> > union flow *flow_start(union flow *flow, enum flow_type type,
> > unsigned iniside)
> > {
> > - (void)iniside;
> > + char ebuf[INANY_ADDRSTRLEN], fbuf[INANY_ADDRSTRLEN];
> > + const struct flowside *a = &flow->f.side[iniside];
>
> As long as iniside is used as a binary value (I guess it's unsigned
> because you have in mind that it could eventually be extended, right?),
Not really. My intention is that it's fundamentally a two value
variable. However, it's used as an array index and doesn't represent
true/false values, so bool didn't seem right. Signs added extra
complications in some cases, hence unsigned. I'd use uint1_t if that
were a thing...
> I think '!!iniside' would be clearer and perhaps more robust here.
Hm. I don't really like that. If iniside ever has a value other than
0 or 1, that's a bug. Fwiw, this particular instance is gone in the
latest version and there are more places where we use just constants,
but it's not all of them. I guess see what you think on the new
version.
> > + const struct flowside *b = &flow->f.side[!iniside];
> > +
> > flow->f.type = type;
> > flow_dbg(flow, "START %s", flow_type_str[flow->f.type]);
> > + flow_dbg(flow, " from side %u (%s): [%s]:%hu -> [%s]:%hu",
> > + iniside, pif_name(a->pif),
> > + inany_ntop(&a->eaddr, ebuf, sizeof(ebuf)), a->eport,
> > + inany_ntop(&a->faddr, fbuf, sizeof(fbuf)), a->fport);
> > + flow_dbg(flow, " to side %u (%s): [%s]:%hu -> [%s]:%hu",
> > + !iniside, pif_name(b->pif),
> > + inany_ntop(&b->faddr, fbuf, sizeof(fbuf)), b->fport,
> > + inany_ntop(&b->eaddr, ebuf, sizeof(ebuf)), b->eport);
> > +
> > return flow;
> > }
> >
> > @@ -180,10 +194,20 @@ union flow *flow_start(union flow *flow, enum flow_type type,
> > */
> > static void flow_end(union flow *flow)
> > {
> > + char ebuf[INANY_ADDRSTRLEN], fbuf[INANY_ADDRSTRLEN];
> > + const struct flowside *a = &flow->f.side[0];
> > + const struct flowside *b = &flow->f.side[1];
> > +
> > if (flow->f.type == FLOW_TYPE_NONE)
> > return; /* Nothing to do */
> >
> > flow_dbg(flow, "END %s", flow_type_str[flow->f.type]);
> > + flow_dbg(flow, " side 0 (%s): [%s]:%hu <-> [%s]:%hu", pif_name(a->pif),
> > + inany_ntop(&a->faddr, fbuf, sizeof(fbuf)), a->fport,
> > + inany_ntop(&a->eaddr, ebuf, sizeof(ebuf)), a->eport);
> > + flow_dbg(flow, " side 1 (%s): [%s]:%hu <-> [%s]:%hu", pif_name(b->pif),
> > + inany_ntop(&b->faddr, fbuf, sizeof(fbuf)), b->fport,
> > + inany_ntop(&b->eaddr, ebuf, sizeof(ebuf)), b->eport);
> > flow->f.type = FLOW_TYPE_NONE;
> > }
> >
> > diff --git a/flow.h b/flow.h
> > index c943c44..f7fb537 100644
> > --- a/flow.h
> > +++ b/flow.h
> > @@ -35,11 +35,86 @@ extern const uint8_t flow_proto[];
> > #define FLOW_PROTO(f) \
> > ((f)->type < FLOW_NUM_TYPES ? flow_proto[(f)->type] : 0)
> >
> > +/**
> > + * struct flowside - Common information for one side of a flow
> > + * @eaddr: Endpoint address (remote address from passt's PoV)
> > + * @faddr: Forwarding address (local address from passt's PoV)
> > + * @eport: Endpoint port
> > + * @fport: Forwarding port
> > + * @pif: pif ID on which this side of the flow exists
> > + */
> > +struct flowside {
> > + union inany_addr faddr;
> > + union inany_addr eaddr;
> > + in_port_t fport;
> > + in_port_t eport;
> > + uint8_t pif;
> > +};
> > +static_assert(_Alignof(struct flowside) == _Alignof(uint32_t),
> > + "Unexpected alignment for struct flowside");
>
> I'm too thick to understand the reason behind this assert.
I guess there isn't a particularly strong reason. This was mostly so
I didn't get surprised by some weird alignment padding.
> > +
> > +/** flowside_from_inany - Initialize flowside from inany addresses
>
> flowside_from_inany(), it's a function.
Gone in the latest version anyway.
>
> > + * @fside: flowside to initialize
> > + * @pif: pif id of this flowside
> > + * @faddr: Forwarding address (inany)
> > + * @fport: Forwarding port
> > + * @eaddr: Endpoint address (inany)
> > + * @eport: Endpoint port
> > + */
> > +/* cppcheck-suppress unusedFunction */
> > +static inline void flowside_from_inany(struct flowside *fside, uint8_t pif,
> > + const union inany_addr *faddr, in_port_t fport,
> > + const union inany_addr *eaddr, in_port_t eport)
> > +{
> > + fside->pif = pif;
> > + fside->faddr = *faddr;
> > + fside->eaddr = *eaddr;
> > + fside->fport = fport;
> > + fside->eport = eport;
> > +}
> > +
> > +/** flowside_from_af - Initialize flowside from addresses
>
> flowside_from_af()
Fixed. I changed to the british spelling of initialise while I was at
it.
> > + * @fside: flowside to initialize
> > + * @pif: pif id of this flowside
> > + * @af: Address family (AF_INET or AF_INET6)
> > + * @faddr: Forwarding address (pointer to in_addr or in6_addr, or NULL)
> > + * @fport: Forwarding port
> > + * @eaddr: Endpoint address (pointer to in_addr or in6_addr, or NULL)
> > + * @eport: Endpoint port
> > + *
> > + * If NULL is given for either address, the appropriate unspecified/any address
>
> s/any/wildcard/ makes it a bit easier to follow, I guess.
That behaviour and comment is gone in the latest version.
> > + * for the address family is substituted.
> > + */
> > +/* cppcheck-suppress unusedFunction */
> > +static inline void flowside_from_af(struct flowside *fside,
> > + uint8_t pif, sa_family_t af,
> > + const void *faddr, in_port_t fport,
> > + const void *eaddr, in_port_t eport)
> > +{
> > + const union inany_addr *any = af == AF_INET ? &inany_any4 : &inany_any6;
> > +
> > + fside->pif = pif;
> > + if (faddr)
> > + inany_from_af(&fside->faddr, af, faddr);
> > + else
> > + fside->faddr = *any;
> > + if (eaddr)
> > + inany_from_af(&fside->eaddr, af, eaddr);
> > + else
> > + fside->eaddr = *any;
> > + fside->fport = fport;
> > + fside->eport = eport;
> > +}
> > +
> > +#define SIDES 2
> > +
> > /**
> > * struct flow_common - Common fields for packet flows
> > + * @side[]: Information for each side of the flow
> > * @type: Type of packet flow
> > */
> > struct flow_common {
> > + struct flowside side[SIDES];
> > uint8_t type;
> > };
> >
> > diff --git a/passt.h b/passt.h
> > index bc58d64..3db0b8e 100644
> > --- a/passt.h
> > +++ b/passt.h
> > @@ -17,6 +17,9 @@ union epoll_ref;
> >
> > #include "pif.h"
> > #include "packet.h"
> > +#include "siphash.h"
> > +#include "ip.h"
> > +#include "inany.h"
> > #include "flow.h"
> > #include "icmp.h"
> > #include "fwd.h"
> > diff --git a/pif.h b/pif.h
> > index bd52936..ca85b34 100644
> > --- a/pif.h
> > +++ b/pif.h
> > @@ -38,7 +38,6 @@ static inline const char *pif_type(enum pif_type pt)
> > return "?";
> > }
> >
> > -/* cppcheck-suppress unusedFunction */
> > static inline const char *pif_name(uint8_t pif)
> > {
> > return pif_type(pif);
> > diff --git a/tcp_conn.h b/tcp_conn.h
> > index d280b22..1a07dd5 100644
> > --- a/tcp_conn.h
> > +++ b/tcp_conn.h
> > @@ -106,7 +106,6 @@ struct tcp_tap_conn {
> > uint32_t seq_init_from_tap;
> > };
> >
> > -#define SIDES 2
> > /**
> > * struct tcp_splice_conn - Descriptor for a spliced TCP connection
> > * @f: Generic flow information
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-05-14 0:18 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-03 1:11 [PATCH v4 00/16] RFC: Unified flow table David Gibson
2024-05-03 1:11 ` [PATCH v4 01/16] flow: Common data structures for tracking flow addresses David Gibson
2024-05-13 18:07 ` Stefano Brivio
2024-05-14 0:11 ` David Gibson [this message]
2024-05-03 1:11 ` [PATCH v4 02/16] tcp: Maintain flowside information for "tap" connections David Gibson
2024-05-13 18:07 ` Stefano Brivio
2024-05-14 0:15 ` David Gibson
2024-05-03 1:11 ` [PATCH v4 03/16] tcp_splice: Maintain flowside information for spliced connections David Gibson
2024-05-03 1:11 ` [PATCH v4 04/16] tcp: Obtain guest address from flowside David Gibson
2024-05-13 18:07 ` Stefano Brivio
2024-05-14 0:18 ` David Gibson
2024-05-03 1:11 ` [PATCH v4 05/16] tcp: Simplify endpoint validation using flowside information David Gibson
2024-05-03 1:11 ` [PATCH v4 06/16] tcp, tcp_splice: Construct sockaddrs for connect() from flowside David Gibson
2024-05-03 1:11 ` [PATCH v4 07/16] tcp_splice: Eliminate SPLICE_V6 flag David Gibson
2024-05-03 1:11 ` [PATCH v4 08/16] tcp, flow: Replace TCP specific hash function with general flow hash David Gibson
2024-05-03 1:11 ` [PATCH v4 09/16] flow, tcp: Generalise TCP hash table to general flow hash table David Gibson
2024-05-03 1:11 ` [PATCH v4 10/16] tcp: Re-use flow hash for initial sequence number generation David Gibson
2024-05-03 1:11 ` [PATCH v4 11/16] icmp: Populate flowside information David Gibson
2024-05-03 1:11 ` [PATCH v4 12/16] icmp: Use flowsides as the source of truth wherever possible David Gibson
2024-05-03 1:11 ` [PATCH v4 13/16] icmp: Look up ping flows using flow hash David Gibson
2024-05-03 1:11 ` [PATCH v4 14/16] icmp: Eliminate icmp_id_map David Gibson
2024-05-03 1:11 ` [PATCH v4 15/16] flow, tcp: flow based NAT and port forwarding for TCP David Gibson
2024-05-03 1:11 ` [PATCH v4 16/16] flow, icmp: Use general flow forwarding rules for ICMP David Gibson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZkKsT2ssCdxoTzfR@zatzit \
--to=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).