From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by passt.top (Postfix) with ESMTP id AE2365A026F for ; Thu, 7 Mar 2024 09:53:55 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709801634; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=T6RYq+RSEIbNuhBl8jy0BXnuvFTuR3/cOE6T4tw6+YY=; b=W0Yc/G1iGy9uyiGsDB6/OC0JBbMLxHjjt4NZnTNASx53n8qvrg4M5JSizb805vdT4wT4uV AmYwJ6MEJtPZr5NNpUMQWwMN8nuWa0fkEh6mea0A6Tj88M+i7E2CQ1MfWVeldn5mHYw90q Gy0x+IRzFWBcqlnOIGE24LA/fBDPoJE= Received: from mail-ed1-f71.google.com (mail-ed1-f71.google.com [209.85.208.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-402-Yx8qcdimOpaxO5-aKV1nKQ-1; Thu, 07 Mar 2024 03:53:53 -0500 X-MC-Unique: Yx8qcdimOpaxO5-aKV1nKQ-1 Received: by mail-ed1-f71.google.com with SMTP id 4fb4d7f45d1cf-559555e38b0so677757a12.3 for ; Thu, 07 Mar 2024 00:53:52 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709801630; x=1710406430; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=T6RYq+RSEIbNuhBl8jy0BXnuvFTuR3/cOE6T4tw6+YY=; b=jgkd75MIeRoU1OXbDpgjAqFXmm7+Kkrrh48fDBdyOgSlr3AlAkin5SEtXGLgCF56HS cAjAsOy8tP+VXe7H4jW7l/Yhuj3p48lGP3BbfRvcBuRaMzZB45FH5IgfQqUCcipD/IM5 occhFKU8cY3D7SUeQV4Fo1ML3XzxuNHovMul4z+hn/zzheJRasF1tcYvbeoo9T5d2XEG ixmaT4p3G1skXwulw/itswqFpGXy8ihZiO8iGILvf3t00v2odafzhbp4y0wTplCfpK3R /41XvnS23rrRgLEMaIxZIdo1i5Qon127QTIi4EypMqYd7drMWfdu+kYDpXaUJ1cApeZG tJyQ== X-Gm-Message-State: AOJu0YxmKIIKzBgmwU7VUnuGtQGSv+NDQUBlAWlcXU2URAw+2Q6DHug1 3DgKBPLGwAvnXa76Zd12iH5dEF6MWKqMvHqb4t7LkY1cXTx1i9fCwyLKD8y9cbbp79W5qrrFp5+ pDp93ApO4Ek1eOTKApQ2cJ2YUNKRxhqsTQhpLMd3n83KoFDEzpU7eEr8Bi5dZ X-Received: by 2002:a17:906:4818:b0:a45:2fc4:f25d with SMTP id w24-20020a170906481800b00a452fc4f25dmr7947024ejq.12.1709801630361; Thu, 07 Mar 2024 00:53:50 -0800 (PST) X-Google-Smtp-Source: AGHT+IG76vEkPACk+uiCESsmg6/SkFm/u2fQ17q1rQTJ6LxGkzjWny79ndEbXKtnhq5o7Vvfa67vnw== X-Received: by 2002:a17:906:4818:b0:a45:2fc4:f25d with SMTP id w24-20020a170906481800b00a452fc4f25dmr7947006ejq.12.1709801629778; Thu, 07 Mar 2024 00:53:49 -0800 (PST) Received: from maya.cloud.tilaa.com (maya.cloud.tilaa.com. [164.138.29.33]) by smtp.gmail.com with ESMTPSA id lz19-20020a170906fb1300b00a45b7f9e822sm1642289ejb.118.2024.03.07.00.53.48 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 Mar 2024 00:53:49 -0800 (PST) Date: Thu, 7 Mar 2024 09:53:15 +0100 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH 1/3] icmp: Store ping socket information in flow table Message-ID: <20240307095315.36eeef59@elisabeth> In-Reply-To: <20240229041534.2573559-2-david@gibson.dropbear.id.au> References: <20240229041534.2573559-1-david@gibson.dropbear.id.au> <20240229041534.2573559-2-david@gibson.dropbear.id.au> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.36; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: FYKHED6KJFRK55Z6Z7DRKR3VSXYI45EG X-Message-ID-Hash: FYKHED6KJFRK55Z6Z7DRKR3VSXYI45EG X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: Apologies for the delay, I'm still reviewing 3/3, but I have a question meanwhile: On Thu, 29 Feb 2024 15:15:32 +1100 David Gibson wrote: > Currently icmp_id_map[][] stores information about ping sockets in a > bespoke structure. Move the same information into new types of flow > in the flow table. To match that change, replace the existing ICMP > timer with a flow-based timer for expiring ping sockets. This has the > advantage that we only need to scan the active flows, not all possible > ids. > > We convert icmp_id_map[][] to point to the flow table entries, rather > than containing its own information. We do still use that array for > locating the right ping flows, rather than using a "flow native" form > of lookup for the time being. > > Signed-off-by: David Gibson > --- > Makefile | 6 +-- > flow.c | 9 ++++ > flow.h | 4 ++ > flow_table.h | 2 + > icmp.c | 143 +++++++++++++++++++++++---------------------------- > icmp.h | 2 +- > icmp_flow.h | 31 +++++++++++ > passt.c | 5 -- > 8 files changed, 115 insertions(+), 87 deletions(-) > create mode 100644 icmp_flow.h > > diff --git a/Makefile b/Makefile > index 2d6a5155..47fc5448 100644 > --- a/Makefile > +++ b/Makefile > @@ -54,9 +54,9 @@ SRCS = $(PASST_SRCS) $(QRAP_SRCS) > MANPAGES = passt.1 pasta.1 qrap.1 > > PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h flow.h fwd.h \ > - flow_table.h icmp.h inany.h iov.h isolation.h lineread.h log.h ndp.h \ > - netlink.h packet.h passt.h pasta.h pcap.h pif.h siphash.h tap.h tcp.h \ > - tcp_conn.h tcp_splice.h udp.h util.h > + flow_table.h icmp.h icmp_flow.h inany.h iov.h isolation.h lineread.h \ > + log.h ndp.h netlink.h packet.h passt.h pasta.h pcap.h pif.h siphash.h \ > + tap.h tcp.h tcp_conn.h tcp_splice.h udp.h util.h > HEADERS = $(PASST_HEADERS) seccomp.h > > C := \#include \nstruct tcp_info x = { .tcpi_snd_wnd = 0 }; > diff --git a/flow.c b/flow.c > index d7974d59..5835d6c0 100644 > --- a/flow.c > +++ b/flow.c > @@ -21,6 +21,8 @@ const char *flow_type_str[] = { > [FLOW_TYPE_NONE] = "", > [FLOW_TCP] = "TCP connection", > [FLOW_TCP_SPLICE] = "TCP connection (spliced)", > + [FLOW_PING4] = "ICMP ping sequence", > + [FLOW_PING6] = "ICMPv6 ping sequence", > }; > static_assert(ARRAY_SIZE(flow_type_str) == FLOW_NUM_TYPES, > "flow_type_str[] doesn't match enum flow_type"); > @@ -28,6 +30,8 @@ static_assert(ARRAY_SIZE(flow_type_str) == FLOW_NUM_TYPES, > const uint8_t flow_proto[] = { > [FLOW_TCP] = IPPROTO_TCP, > [FLOW_TCP_SPLICE] = IPPROTO_TCP, > + [FLOW_PING4] = IPPROTO_ICMP, > + [FLOW_PING6] = IPPROTO_ICMPV6, > }; > static_assert(ARRAY_SIZE(flow_proto) == FLOW_NUM_TYPES, > "flow_proto[] doesn't match enum flow_type"); > @@ -294,6 +298,11 @@ void flow_defer_handler(const struct ctx *c, const struct timespec *now) > if (!closed && timer) > tcp_splice_timer(c, flow); > break; > + case FLOW_PING4: > + case FLOW_PING6: > + if (timer) > + closed = icmp_ping_timer(c, flow, now); > + break; > default: > /* Assume other flow types don't need any handling */ > ; > diff --git a/flow.h b/flow.h > index 8b66751b..c943c441 100644 > --- a/flow.h > +++ b/flow.h > @@ -19,6 +19,10 @@ enum flow_type { > FLOW_TCP, > /* A TCP connection between a host socket and ns socket */ > FLOW_TCP_SPLICE, > + /* ICMP echo requests from guest to host and matching replies back */ > + FLOW_PING4, > + /* ICMPv6 echo requests from guest to host and matching replies back */ > + FLOW_PING6, > > FLOW_NUM_TYPES, > }; > diff --git a/flow_table.h b/flow_table.h > index eecf8844..b7e5529a 100644 > --- a/flow_table.h > +++ b/flow_table.h > @@ -8,6 +8,7 @@ > #define FLOW_TABLE_H > > #include "tcp_conn.h" > +#include "icmp_flow.h" > > /** > * struct flow_free_cluster - Information about a cluster of free entries > @@ -33,6 +34,7 @@ union flow { > struct flow_free_cluster free; > struct tcp_tap_conn tcp; > struct tcp_splice_conn tcp_splice; > + struct icmp_ping_flow ping; > }; > > /* Global Flow Table */ > diff --git a/icmp.c b/icmp.c > index fb2fcafc..1caf791d 100644 > --- a/icmp.c > +++ b/icmp.c > @@ -39,24 +39,17 @@ > #include "siphash.h" > #include "inany.h" > #include "icmp.h" > +#include "flow_table.h" > > #define ICMP_ECHO_TIMEOUT 60 /* s, timeout for ICMP socket activity */ > #define ICMP_NUM_IDS (1U << 16) > > -/** > - * struct icmp_id_sock - Tracking information for single ICMP echo identifier > - * @sock: Bound socket for identifier > - * @seq: Last sequence number sent to tap, host order, -1: not sent yet > - * @ts: Last associated activity from tap, seconds > - */ > -struct icmp_id_sock { > - int sock; > - int seq; > - time_t ts; > -}; > +/* Sides of a flow as we use them for ping streams */ > +#define SOCKSIDE 0 > +#define TAPSIDE 1 > > /* Indexed by ICMP echo identifier */ > -static struct icmp_id_sock icmp_id_map[IP_VERSIONS][ICMP_NUM_IDS]; > +static struct icmp_ping_flow *icmp_id_map[IP_VERSIONS][ICMP_NUM_IDS]; > > /** > * icmp_sock_handler() - Handle new data from ICMP or ICMPv6 socket > @@ -66,8 +59,8 @@ static struct icmp_id_sock icmp_id_map[IP_VERSIONS][ICMP_NUM_IDS]; > */ > void icmp_sock_handler(const struct ctx *c, sa_family_t af, union epoll_ref ref) > { > - struct icmp_id_sock *const id_sock = af == AF_INET > - ? &icmp_id_map[V4][ref.icmp.id] : &icmp_id_map[V6][ref.icmp.id]; > + struct icmp_ping_flow *pingf = af == AF_INET > + ? icmp_id_map[V4][ref.icmp.id] : icmp_id_map[V6][ref.icmp.id]; > const char *const pname = af == AF_INET ? "ICMP" : "ICMPv6"; > union sockaddr_inany sr; > socklen_t sl = sizeof(sr); > @@ -78,6 +71,8 @@ void icmp_sock_handler(const struct ctx *c, sa_family_t af, union epoll_ref ref) > if (c->no_icmp) > return; > > + ASSERT(pingf); > + > n = recvfrom(ref.fd, buf, sizeof(buf), 0, &sr.sa, &sl); > if (n < 0) { > warn("%s: recvfrom() error on ping socket: %s", > @@ -112,10 +107,10 @@ void icmp_sock_handler(const struct ctx *c, sa_family_t af, union epoll_ref ref) > > /* In PASTA mode, we'll get any reply we send, discard them. */ > if (c->mode == MODE_PASTA) { > - if (id_sock->seq == seq) > + if (pingf->seq == seq) > return; > > - id_sock->seq = seq; > + pingf->seq = seq; > } > > debug("%s: echo reply to tap, ID: %"PRIu16", seq: %"PRIu16, pname, > @@ -132,16 +127,22 @@ unexpected: > } > > /** > - * icmp_ping_close() - Close and clean up a ping socket > + * icmp_ping_close() - Close and clean up a ping flow > * @c: Execution context > - * @id_sock: Socket number and other info > + * @pingf: ping flow entry to close > */ > -static void icmp_ping_close(const struct ctx *c, struct icmp_id_sock *id_sock) > +static void icmp_ping_close(const struct ctx *c, > + const struct icmp_ping_flow *pingf) > { > - epoll_ctl(c->epollfd, EPOLL_CTL_DEL, id_sock->sock, NULL); > - close(id_sock->sock); > - id_sock->sock = -1; > - id_sock->seq = -1; > + uint16_t id = pingf->id; > + > + epoll_ctl(c->epollfd, EPOLL_CTL_DEL, pingf->sock, NULL); > + close(pingf->sock); > + > + if (pingf->f.type == FLOW_PING4) > + icmp_id_map[V4][id] = NULL; > + else > + icmp_id_map[V6][id] = NULL; > } > > /** > @@ -151,17 +152,27 @@ static void icmp_ping_close(const struct ctx *c, struct icmp_id_sock *id_sock) > * @af: Address family, AF_INET or AF_INET6 > * @id: ICMP id for the new socket > * > - * Return: Newly opened ping socket fd, or -1 on failure > + * Return: Newly opened ping flow, or NULL on failure > */ > -static int icmp_ping_new(const struct ctx *c, struct icmp_id_sock *id_sock, > - sa_family_t af, uint16_t id) > +static struct icmp_ping_flow *icmp_ping_new(const struct ctx *c, > + struct icmp_ping_flow **id_sock, I'm not quite sure why we still need id_sock passed as parameter, and what it's supposed to contain (you haven't updated the function comment). Now that all the information is encapsulated in the flow, and you return the new flow, with a trivial change in icmp_tap_handler(), couldn't we just drop id_sock here? -- Stefano