From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by passt.top (Postfix) with ESMTP id 9C4725A0278 for ; Sun, 18 Feb 2024 22:01:08 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1708290067; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=thtf1Nv08l+A+9EFPCOUW1geee1wDGM1AjmQbc5fAXE=; b=d71DUm1T7p4tHKLgEQZeRoZxcyDkOH/LFB16OMaWg2u7/KxqcnAoT3JVmNDKGRCGSQZMIE k/BsZ+GqJnoQ4DTb8YifgD64NCchnfg6JSfzQ88QqWw2s1wa4g0bDvMAarSN6GbAiaREqA rYFWBwMn+odJ1roSsfkqV20zD/IMpzg= Received: from mail-ej1-f71.google.com (mail-ej1-f71.google.com [209.85.218.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-616-WSaBCRsRMgqlo41PcFCKhQ-1; Sun, 18 Feb 2024 16:01:05 -0500 X-MC-Unique: WSaBCRsRMgqlo41PcFCKhQ-1 Received: by mail-ej1-f71.google.com with SMTP id a640c23a62f3a-a3e42733561so43374466b.0 for ; Sun, 18 Feb 2024 13:01:05 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708290064; x=1708894864; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=thtf1Nv08l+A+9EFPCOUW1geee1wDGM1AjmQbc5fAXE=; b=QlihScWHM7FLQtatgH3Vw7TlWXLDts+Lr4/08xmDya5HRg4/IyQPEQjE3XnKW2zdRI /CUM8x0p+N5/T+UFljwyUDTJRbpuc3bR4HWcsg3XcG46pGhsiUnC56SwsH8e0QBgPWAk 2C1eTJZ97TgH+jA88TgirPq7dzZc7kBpdRN6hZMKB2E1+N0nemHowDOmuPde3CsbyD23 EibG+svJ+Zb/Zll0Pg3RW+/cM2BF1Omd01MHKYGD4fEb2gqoBwe3NAPz0kt8YkzOsRGZ dm+x59TTFV7NjA6xrueN3HQLquti8nyJg/DANKVwfP/fF2crQbY2sd9Osmbh4IgNzj5G meuw== X-Gm-Message-State: AOJu0Yx2USkjxRTvqUvjjrDMDF9nkVI2+co1hvO3LjpRbK98lZB2trYm lyKvTnKIOTmK7ocOw7QYYPkB4uRHAI8x6GUAOO3LusfIfCVxQvT/Na42votw3mXyfKzZM0xQO24 RlFgyw78233hr0sv4xTTvQPnF10Uqpktq1beXlk1A9B+5q0EliQ== X-Received: by 2002:a05:6402:3d9:b0:564:6bf1:e804 with SMTP id t25-20020a05640203d900b005646bf1e804mr980648edw.42.1708290064482; Sun, 18 Feb 2024 13:01:04 -0800 (PST) X-Google-Smtp-Source: AGHT+IFvywC5kJCyWnU67WqWRGlTGAYZEbVb0D1DOOXQnLwV2Yc61nPo4WH6W8tRPKeI9BCUmn6F7g== X-Received: by 2002:a05:6402:3d9:b0:564:6bf1:e804 with SMTP id t25-20020a05640203d900b005646bf1e804mr980640edw.42.1708290064136; Sun, 18 Feb 2024 13:01:04 -0800 (PST) Received: from maya.cloud.tilaa.com (maya.cloud.tilaa.com. [164.138.29.33]) by smtp.gmail.com with ESMTPSA id w16-20020a056402129000b0056470bf320asm475390edv.43.2024.02.18.13.01.03 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 18 Feb 2024 13:01:03 -0800 (PST) Date: Sun, 18 Feb 2024 22:00:29 +0100 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH v2 12/22] tcp, tcp_splice: Helpers for getting sockets from the pools Message-ID: <20240218220029.57acacf8@elisabeth> In-Reply-To: <20240206011734.884138-13-david@gibson.dropbear.id.au> References: <20240206011734.884138-1-david@gibson.dropbear.id.au> <20240206011734.884138-13-david@gibson.dropbear.id.au> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.36; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: IJH2EFO6SREDTWHVYNCJTNTF3B5CUV4V X-Message-ID-Hash: IJH2EFO6SREDTWHVYNCJTNTF3B5CUV4V X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Tue, 6 Feb 2024 12:17:24 +1100 David Gibson wrote: > We maintain pools of ready-to-connect sockets in both the original and > (for pasta) guest namespace to reduce latency when starting new TCP > connections. If we exhaust those pools we have to take a higher > latency path to get a new socket. > > Currently we open-code that fallback in the places we need it. To > improve clarity encapsulate that into helper functions. > > Signed-off-by: David Gibson > --- > tcp.c | 29 ++++++++++++++++++++++++----- > tcp_conn.h | 2 +- > tcp_splice.c | 46 +++++++++++++++++++++++++--------------------- > 3 files changed, 50 insertions(+), 27 deletions(-) > > diff --git a/tcp.c b/tcp.c > index e15b932f..c06d1cc4 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -1792,7 +1792,7 @@ int tcp_conn_pool_sock(int pool[]) > * > * Return: socket number on success, negative code if socket creation failed > */ > -int tcp_conn_new_sock(const struct ctx *c, sa_family_t af) > +static int tcp_conn_new_sock(const struct ctx *c, sa_family_t af) > { > int s; > > @@ -1811,6 +1811,27 @@ int tcp_conn_new_sock(const struct ctx *c, sa_family_t af) > return s; > } > > +/** > + * tcp_conn_sock() - Obtain a connectable socket in the host/init namespace > + * @c: Execution context > + * @af: Address family (AF_INET or AF_INET6) > + * > + * Return: Socket fd on success, -errno on failure > + */ > +int tcp_conn_sock(const struct ctx *c, sa_family_t af) > +{ > + int *pool = af == AF_INET6 ? init_sock_pool6 : init_sock_pool4; > + int s; > + > + if ((s = tcp_conn_pool_sock(pool)) >= 0) > + return s; > + > + /* If the pool is empty we just open a new one without refilling the > + * pool to keep latency down. > + */ > + return tcp_conn_new_sock(c, af); > +} > + > /** > * tcp_conn_tap_mss() - Get MSS value advertised by tap/guest > * @conn: Connection pointer > @@ -1909,7 +1930,6 @@ static void tcp_conn_from_tap(struct ctx *c, sa_family_t af, > const struct tcphdr *th, const char *opts, > size_t optlen, const struct timespec *now) > { > - int *pool = af == AF_INET6 ? init_sock_pool6 : init_sock_pool4; > struct sockaddr_in addr4 = { > .sin_family = AF_INET, > .sin_port = th->dest, > @@ -1931,9 +1951,8 @@ static void tcp_conn_from_tap(struct ctx *c, sa_family_t af, > if (!(flow = flow_alloc())) > return; > > - if ((s = tcp_conn_pool_sock(pool)) < 0) > - if ((s = tcp_conn_new_sock(c, af)) < 0) > - goto cancel; > + if ((s = tcp_conn_sock(c, af)) < 0) > + goto cancel; > > if (!c->no_map_gw) { > if (af == AF_INET && IN4_ARE_ADDR_EQUAL(daddr, &c->ip4.gw)) > diff --git a/tcp_conn.h b/tcp_conn.h > index 20c7cb8b..e55edafe 100644 > --- a/tcp_conn.h > +++ b/tcp_conn.h > @@ -159,7 +159,7 @@ bool tcp_flow_defer(union flow *flow); > bool tcp_splice_flow_defer(union flow *flow); > void tcp_splice_timer(const struct ctx *c, union flow *flow); > int tcp_conn_pool_sock(int pool[]); > -int tcp_conn_new_sock(const struct ctx *c, sa_family_t af); > +int tcp_conn_sock(const struct ctx *c, sa_family_t af); > void tcp_sock_refill_pool(const struct ctx *c, int pool[], sa_family_t af); > void tcp_splice_refill(const struct ctx *c); > > diff --git a/tcp_splice.c b/tcp_splice.c > index 576fe9be..609f3242 100644 > --- a/tcp_splice.c > +++ b/tcp_splice.c > @@ -90,7 +90,7 @@ static const char *tcp_splice_flag_str[] __attribute((__unused__)) = { > }; > > /* Forward declaration */ > -static int tcp_sock_refill_ns(void *arg); > +static int tcp_conn_sock_ns(const struct ctx *c, sa_family_t af); > > /** > * tcp_splice_conn_epoll_events() - epoll events masks for given state > @@ -380,36 +380,19 @@ static int tcp_splice_connect(const struct ctx *c, struct tcp_splice_conn *conn, > static int tcp_splice_new(const struct ctx *c, struct tcp_splice_conn *conn, > in_port_t port, uint8_t pif) > { > + sa_family_t af = CONN_V6(conn) ? AF_INET6 : AF_INET; > int s = -1; > > - /* If the pool is empty we take slightly different approaches > - * for init or ns sockets. For init sockets we just open a > - * new one without refilling the pool to keep latency down. > - * For ns sockets, we're going to incur the latency of > - * entering the ns anyway, so we might as well refill the > - * pool. > - */ > if (pif == PIF_SPLICE) { > - int *p = CONN_V6(conn) ? init_sock_pool6 : init_sock_pool4; > - sa_family_t af = CONN_V6(conn) ? AF_INET6 : AF_INET; > - > port += c->tcp.fwd_out.delta[port]; > > - s = tcp_conn_pool_sock(p); > - if (s < 0) > - s = tcp_conn_new_sock(c, af); > + s = tcp_conn_sock(c, af); > } else { > - int *p = CONN_V6(conn) ? ns_sock_pool6 : ns_sock_pool4; > - > ASSERT(pif == PIF_HOST); > > port += c->tcp.fwd_in.delta[port]; > > - /* If pool is empty, refill it first */ > - if (p[TCP_SOCK_POOL_SIZE-1] < 0) > - NS_CALL(tcp_sock_refill_ns, c); > - > - s = tcp_conn_pool_sock(p); > + s = tcp_conn_sock_ns(c, af); > } > > if (s < 0) { > @@ -709,6 +692,27 @@ static int tcp_sock_refill_ns(void *arg) > return 0; > } > > +/** > + * tcp_conn_sock_ns() - Obtain a connectable socket in the namespace > + * @c: Execution context > + * @af: Address family (AF_INET or AF_INET6) > + * > + * Return: Socket fd in the namespace on success, -errno on failure > + */ > +static int tcp_conn_sock_ns(const struct ctx *c, sa_family_t af) > +{ > + int *p = af == AF_INET6 ? ns_sock_pool6 : ns_sock_pool4; > + > + /* If the pool is empty we have to incur the latency of entering the ns. > + * Therefore, we might as well refill the whole pool while we're at it, > + * which differs from tcp_conn_sock(). > + */ > + if (p[TCP_SOCK_POOL_SIZE-1] < 0) Nit, for consistency (but yes, it was already like this): if (p[TCP_SOCK_POOL_SIZE - 1] < 0) -- Stefano