From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTP id B1CA55A004E for ; Wed, 26 Jun 2024 00:24:11 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1719354250; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FGEc6Mdci2K/UDImz592HK1wpq5NWI/giQ6dA+WTqPQ=; b=Jb11Uuh4FzCFyUBjBCdyNoQvXIQOIAQcz2cc7nnz6oSKUqybwJXlUS70QlPHuli+vPt+ZM 5xTYMi+OXwgvklQIfq+09lmniPVVDUt8dWnMcOZtzoBwads0bm+AM09tXKmpvxjd5HLiol /G6PePYmtsf6gFsX4uTaqi+lGAcjPjY= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-147-TmNbt9b6ODi7mxXMGlAXFA-1; Tue, 25 Jun 2024 18:24:06 -0400 X-MC-Unique: TmNbt9b6ODi7mxXMGlAXFA-1 Received: by mail-qt1-f198.google.com with SMTP id d75a77b69052e-444ff2c9a07so17282121cf.0 for ; Tue, 25 Jun 2024 15:24:05 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1719354245; x=1719959045; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=FGEc6Mdci2K/UDImz592HK1wpq5NWI/giQ6dA+WTqPQ=; b=bRI7zZPwFa0AFp9R8j60nYKp7+H5yRRwhu+nAkbzOXSRMsb0Sgiru7sKvkRvUwQiP0 lWQZJ9d2lNuSjB3+V9dqqyd77cxR9rNBp0jOEgSCoOpbwfHdE8pR+RDGdIW91sGjm7gz WTPVeGb/7Qy7O+xucyN88bzIdwds989IdN9qWJd8VeIXkhNb+8D8abcwsF0zSYbN8omH VgqSn/0eJpcvTjozHJbZCjFKfRIV4mD84PG+iI8Xz8+8sleS3CqxuX+YtrgabNr0ITBQ V+v838cJg2R8n7UZaDSnGE9gQcURQrGr+bf0dwoo6k5kOrcANcHYrjjn0o2H9IwUz838 VaKg== X-Gm-Message-State: AOJu0YzBJ3oV3CUgYMGr+4LqvxeVU211+xXfp/ly5fh9FJiOjhWYGASq oCncyO17DDy2YIsoWb16ODMCS6j+wgCaFSEkAE6KEQ0DlIVwzu64rLqKVokjlSuCyrvrryT6Oiz n6MWd3TbHq1OwdXQ978isApejQWgBN0atZUYqGrA76T7Teo0D8w== X-Received: by 2002:a05:622a:110f:b0:444:fbb5:ca10 with SMTP id d75a77b69052e-444fbb5cf5amr27211121cf.40.1719354245312; Tue, 25 Jun 2024 15:24:05 -0700 (PDT) X-Google-Smtp-Source: AGHT+IECSTU8/eO7eKTNHy2EcKZXUrFGr7ofgtteGDd8eGKihQVpIVUySjhhDTkhnxi5Xhf0wklfuA== X-Received: by 2002:a05:622a:110f:b0:444:fbb5:ca10 with SMTP id d75a77b69052e-444fbb5cf5amr27210501cf.40.1719354243824; Tue, 25 Jun 2024 15:24:03 -0700 (PDT) Received: from maya.cloud.tilaa.com (maya.cloud.tilaa.com. [164.138.29.33]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-444c2b4ca60sm60238001cf.8.2024.06.25.15.24.03 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 25 Jun 2024 15:24:03 -0700 (PDT) Date: Wed, 26 Jun 2024 00:23:28 +0200 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH v6 01/26] flow: Common address information for initiating side Message-ID: <20240626002328.74900b0d@elisabeth> In-Reply-To: <20240614061348.3814736-2-david@gibson.dropbear.id.au> References: <20240614061348.3814736-1-david@gibson.dropbear.id.au> <20240614061348.3814736-2-david@gibson.dropbear.id.au> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.41; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: GK3EDNQ3TKPOODFH3LI4OD36M47ZDEMQ X-Message-ID-Hash: GK3EDNQ3TKPOODFH3LI4OD36M47ZDEMQ X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, jmaloy@redhat.com X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Fri, 14 Jun 2024 16:13:23 +1000 David Gibson wrote: > Handling of each protocol needs some degree of tracking of the > addresses and ports at the end of each connection or flow. Sometimes > that's explicit (as in the guest visible addresses for TCP > connections), sometimes implicit (the bound and connected addresses of > sockets). > > To allow more consistent handling across protocols we want to > uniformly track the address and port at each end of the connection. > Furthermore, because we allow port remapping, and we sometimes need to > apply NAT, the addresses and ports can be different as seen by the > guest/namespace and as by the host. > > Introduce 'struct flowside' to keep track of address and port > information related to one side of a flow. Store two of these in the > common fields of a flow to track that information for both sides. > > For now we only populate the initiating side, requiring that > information be completed when a flows enter INI. Later patches will > populate the target side. > > For now this leaves some information redundantly recorded in both generic > and type specific fields. We'll fix that in later patches. > > Signed-off-by: David Gibson > --- > flow.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++++--- > flow.h | 16 +++++++++ > flow_table.h | 8 ++++- > icmp.c | 9 +++-- > passt.h | 3 ++ > tcp.c | 6 ++-- > 6 files changed, 127 insertions(+), 11 deletions(-) > > diff --git a/flow.c b/flow.c > index d05aa495..1819111d 100644 > --- a/flow.c > +++ b/flow.c > @@ -108,6 +108,31 @@ static const union flow *flow_new_entry; /* = NULL */ > /* Last time the flow timers ran */ > static struct timespec flow_timer_run; > > +/** flowside_from_af() - Initialise flowside from addresses > + * @fside: flowside to initialise > + * @af: Address family (AF_INET or AF_INET6) > + * @eaddr: Endpoint address (pointer to in_addr or in6_addr) > + * @eport: Endpoint port > + * @faddr: Forwarding address (pointer to in_addr or in6_addr) > + * @fport: Forwarding port > + */ > +static void flowside_from_af(struct flowside *fside, sa_family_t af, > + const void *eaddr, in_port_t eport, > + const void *faddr, in_port_t fport) > +{ > + if (faddr) > + inany_from_af(&fside->faddr, af, faddr); > + else > + fside->faddr = inany_any6; I kind of wonder if, for clarity, we should perhaps: #define inany_any inany_any6 ? And... I don't actually have in mind how IP versions are stored in the flow table for unspecified addresses, but I wonder if we're losing some information this way: if this is called with AF_INET as 'af', we're not saying anywhere in the flowside information that this is going to be an IPv4 flowside, right? > + fside->fport = fport; > + > + if (eaddr) > + inany_from_af(&fside->eaddr, af, eaddr); > + else > + fside->eaddr = inany_any6; > + fside->eport = eport; > +} > + > /** flow_log_ - Log flow-related message > * @f: flow the message is related to > * @pri: Log priority > @@ -140,6 +165,8 @@ void flow_log_(const struct flow_common *f, int pri, const char *fmt, ...) > */ > static void flow_set_state(struct flow_common *f, enum flow_state state) > { > + char estr[INANY_ADDRSTRLEN], fstr[INANY_ADDRSTRLEN]; > + const struct flowside *ini = &f->side[INISIDE]; > uint8_t oldstate = f->state; > > ASSERT(state < FLOW_NUM_STATES); > @@ -150,18 +177,28 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) > FLOW_STATE(f)); > > if (MAX(state, oldstate) >= FLOW_STATE_TGT) > - flow_log_(f, LOG_DEBUG, "%s => %s", pif_name(f->pif[INISIDE]), > - pif_name(f->pif[TGTSIDE])); > + flow_log_(f, LOG_DEBUG, "%s [%s]:%hu -> [%s]:%hu => %s", > + pif_name(f->pif[INISIDE]), > + inany_ntop(&ini->eaddr, estr, sizeof(estr)), > + ini->eport, > + inany_ntop(&ini->faddr, fstr, sizeof(fstr)), > + ini->fport, > + pif_name(f->pif[TGTSIDE])); > else if (MAX(state, oldstate) >= FLOW_STATE_INI) > - flow_log_(f, LOG_DEBUG, "%s => ?", pif_name(f->pif[INISIDE])); > + flow_log_(f, LOG_DEBUG, "%s [%s]:%hu -> [%s]:%hu => ?", > + pif_name(f->pif[INISIDE]), > + inany_ntop(&ini->eaddr, estr, sizeof(estr)), > + ini->eport, > + inany_ntop(&ini->faddr, fstr, sizeof(fstr)), > + ini->fport); > } > > /** > - * flow_initiate() - Move flow to INI, setting INISIDE details > + * flow_initiate_() - Move flow to INI, setting setting pif[INISIDE] s/setting setting/setting/ > * @flow: Flow to change state > * @pif: pif of the initiating side > */ > -void flow_initiate(union flow *flow, uint8_t pif) > +static void flow_initiate_(union flow *flow, uint8_t pif) > { > struct flow_common *f = &flow->f; > > @@ -174,6 +211,55 @@ void flow_initiate(union flow *flow, uint8_t pif) > flow_set_state(f, FLOW_STATE_INI); > } > > +/** > + * flow_initiate_af() - Move flow to INI, setting INISIDE details > + * @flow: Flow to change state > + * @pif: pif of the initiating side > + * @af: Address family of @eaddr and @faddr > + * @saddr: Source address (pointer to in_addr or in6_addr) > + * @sport: Endpoint port > + * @daddr: Destination address (pointer to in_addr or in6_addr) > + * @dport: Destination port > + * > + * Return: pointer to the initiating flowside information > + */ > +const struct flowside *flow_initiate_af(union flow *flow, uint8_t pif, > + sa_family_t af, > + const void *saddr, in_port_t sport, > + const void *daddr, in_port_t dport) > +{ > + struct flowside *ini = &flow->f.side[INISIDE]; > + > + flowside_from_af(ini, af, saddr, sport, daddr, dport); > + flow_initiate_(flow, pif); > + return ini; > +} > + > +/** > + * flow_initiate_sa() - Move flow to INI, setting INISIDE details > + * @flow: Flow to change state > + * @pif: pif of the initiating side > + * @ssa: Source socket address > + * @dport: Destination port > + * > + * Return: pointer to the initiating flowside information > + */ > +const struct flowside *flow_initiate_sa(union flow *flow, uint8_t pif, > + const union sockaddr_inany *ssa, > + in_port_t dport) > +{ > + struct flowside *ini = &flow->f.side[INISIDE]; > + > + inany_from_sockaddr(&ini->eaddr, &ini->eport, ssa); > + if (inany_v4(&ini->eaddr)) > + ini->faddr = inany_any4; > + else > + ini->faddr = inany_any6; > + ini->fport = dport; > + flow_initiate_(flow, pif); > + return ini; > +} > + > /** > * flow_target() - Move flow to TGT, setting TGTSIDE details > * @flow: Flow to change state > diff --git a/flow.h b/flow.h > index 29ef9f12..b873ea7f 100644 > --- a/flow.h > +++ b/flow.h > @@ -135,11 +135,26 @@ extern const uint8_t flow_proto[]; > #define INISIDE 0 /* Initiating side */ > #define TGTSIDE 1 /* Target side */ > > +/** > + * struct flowside - Address information for one side of a flow > + * @eaddr: Endpoint address (remote address from passt's PoV) > + * @faddr: Forwarding address (local address from passt's PoV) > + * @eport: Endpoint port > + * @fport: Forwarding port > + */ > +struct flowside { > + union inany_addr faddr; > + union inany_addr eaddr; > + in_port_t fport; > + in_port_t eport; > +}; > + > /** > * struct flow_common - Common fields for packet flows > * @state: State of the flow table entry > * @type: Type of packet flow > * @pif[]: Interface for each side of the flow > + * @side[]: Information for each side of the flow > */ > struct flow_common { > #ifdef __GNUC__ > @@ -154,6 +169,7 @@ struct flow_common { > "Not enough bits for type field"); > #endif > uint8_t pif[SIDES]; > + struct flowside side[SIDES]; > }; > > #define FLOW_INDEX_BITS 17 /* 128k - 1 */ > diff --git a/flow_table.h b/flow_table.h > index 1b163491..2e912532 100644 > --- a/flow_table.h > +++ b/flow_table.h > @@ -107,7 +107,13 @@ static inline flow_sidx_t flow_sidx(const struct flow_common *f, > union flow *flow_alloc(void); > void flow_alloc_cancel(union flow *flow); > > -void flow_initiate(union flow *flow, uint8_t pif); > +const struct flowside *flow_initiate_af(union flow *flow, uint8_t pif, > + sa_family_t af, > + const void *saddr, in_port_t sport, > + const void *daddr, in_port_t dport); > +const struct flowside *flow_initiate_sa(union flow *flow, uint8_t pif, > + const union sockaddr_inany *ssa, > + in_port_t dport); > void flow_target(union flow *flow, uint8_t pif); > > union flow *flow_set_type(union flow *flow, enum flow_type type); > diff --git a/icmp.c b/icmp.c > index 80330f6f..38d1deeb 100644 > --- a/icmp.c > +++ b/icmp.c > @@ -146,12 +146,15 @@ static void icmp_ping_close(const struct ctx *c, > * @id_sock: Pointer to ping flow entry slot in icmp_id_map[] to update > * @af: Address family, AF_INET or AF_INET6 > * @id: ICMP id for the new socket > + * @saddr: Source address > + * @daddr: Destination address > * > * Return: Newly opened ping flow, or NULL on failure > */ > static struct icmp_ping_flow *icmp_ping_new(const struct ctx *c, > struct icmp_ping_flow **id_sock, > - sa_family_t af, uint16_t id) > + sa_family_t af, uint16_t id, > + const void *saddr, const void *daddr) > { > uint8_t flowtype = af == AF_INET ? FLOW_PING4 : FLOW_PING6; > union epoll_ref ref = { .type = EPOLL_TYPE_PING }; > @@ -163,7 +166,7 @@ static struct icmp_ping_flow *icmp_ping_new(const struct ctx *c, > if (!flow) > return NULL; > > - flow_initiate(flow, PIF_TAP); > + flow_initiate_af(flow, PIF_TAP, af, saddr, id, daddr, id); > flow_target(flow, PIF_HOST); > pingf = FLOW_SET_TYPE(flow, flowtype, ping); > > @@ -269,7 +272,7 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af, > } > > if (!(pingf = *id_sock)) > - if (!(pingf = icmp_ping_new(c, id_sock, af, id))) > + if (!(pingf = icmp_ping_new(c, id_sock, af, id, saddr, daddr))) > return 1; > > pingf->ts = now->tv_sec; > diff --git a/passt.h b/passt.h > index 46d073a2..76810867 100644 > --- a/passt.h > +++ b/passt.h > @@ -17,6 +17,9 @@ union epoll_ref; > > #include "pif.h" > #include "packet.h" > +#include "siphash.h" > +#include "ip.h" > +#include "inany.h" > #include "flow.h" > #include "icmp.h" > #include "fwd.h" > diff --git a/tcp.c b/tcp.c > index 68524235..07a2eb1c 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -1629,7 +1629,7 @@ static void tcp_conn_from_tap(struct ctx *c, sa_family_t af, > if (!(flow = flow_alloc())) > return; > > - flow_initiate(flow, PIF_TAP); > + flow_initiate_af(flow, PIF_TAP, af, saddr, srcport, daddr, dstport); > > if (af == AF_INET) { > if (IN4_IS_ADDR_UNSPECIFIED(saddr) || > @@ -2288,7 +2288,9 @@ void tcp_listen_handler(struct ctx *c, union epoll_ref ref, > if (s < 0) > goto cancel; > > - flow_initiate(flow, ref.tcp_listen.pif); > + /* FIXME: When listening port has a specific bound address, record that > + * as the forwarding address */ > + flow_initiate_sa(flow, ref.tcp_listen.pif, &sa, ref.tcp_listen.port); > > if (sa.sa_family == AF_INET) { > const struct in_addr *addr = &sa.sa4.sin_addr; -- Stefano