Date: Tue, 7 Nov 2023 09:33:47 +0100
From: Stefano Brivio
To: David Gibson
Subject: Re: [PATCH 1/2] udp: Consistently use -1 to indicate un-opened sockets in maps
Message-ID: <20231107093347.3e9286d9@elisabeth>
In-Reply-To: <20231106021709.603571-2-david@gibson.dropbear.id.au>
References: <20231106021709.603571-1-david@gibson.dropbear.id.au> <20231106021709.603571-2-david@gibson.dropbear.id.au>
Organization: Red Hat
CC: passt-dev@passt.top, bugs.passt.top@bitsbetwixt.com
List-Id: Development discussion and patches for passt

On Mon, 6 Nov 2023 13:17:08 +1100
David Gibson wrote:

> udp uses the udp_tap_map, udp_splice_ns and udp_splice_init tables to keep
> track of already opened sockets bound to specific ports. We need a way to
> indicate entries where a socket hasn't been opened, but the code isn't
> consistent if this is indicated by a 0 or a -1:
>   * udp_splice_sendfrom() and udp_tap_handler() assume that 0 indicates
>     an unopened socket
>   * udp_sock_init() fills in -1 for a failure to open a socket
>   * udp_timer_one() is somewhere in between, treating only strictly
>     positive fds as valid
>
> -1 (or, at least, negative) is really the correct choice here, since 0 is
> a theoretically valid fd value (if very unlikely in practice).

Not so unlikely, actually (see also commit 6943d41d6cd0, where I failed
to fix the UDP equivalents). By default we close standard input after
initialising the "tap" file descriptor, so, depending on configuration
options, zero might very well happen to be a UDP socket.
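
The point about standard input can be illustrated with a minimal sketch. This is not passt code: the function name is hypothetical, and it assumes the usual POSIX/Linux behaviour of handing out the lowest unused descriptor number.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustration only, hypothetical name: close standard input, then open a
 * UDP socket. Descriptor 0 is now the lowest unused number, so the new
 * socket is expected to get it, and a "0 means unopened" check would
 * wrongly treat this socket as missing.
 */
int udp_socket_after_closing_stdin(void)
{
	/* May fail if stdin is already closed; fd 0 is free either way */
	close(STDIN_FILENO);

	return socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
}
```

A caller that tests the result with `if (s)` or `if (s > 0)` would discard a perfectly valid socket here, which is why only a negative value is a safe "not open" marker.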

I even considered, for a while, opening a dummy file descriptor after
closing standard input just for the sake of having zero as a "reserved"
value, but that's not guaranteed to work.

> Change to use that consistently throughout.
>
> The table does need to be initialised to all -1 values before any calls to
> udp_sock_init() which can happen from conf_ports(). Because C doesn't make
> it easy to statically initialise non zero values in large tables, this does
> require a somewhat awkward call to initialise the table from conf(). This
> is the best approach I could see for the short term, with any luck it will
> go away at some point when those socket tables are replaced by a unified
> flow table.
>
> Signed-off-by: David Gibson
> ---
>  conf.c |  1 +
>  udp.c  | 26 +++++++++++++++++++++-----
>  udp.h  |  1 +
>  3 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/conf.c b/conf.c
> index a235b31..95b3e4b 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -1740,6 +1740,7 @@ void conf(struct ctx *c, int argc, char **argv)
>  		c->no_map_gw = 1;
>
>  	/* Inbound port options can be parsed now (after IPv4/IPv6 settings) */
> +	udp_portmap_clear();
>  	optind = 1;
>  	do {
>  		name = getopt_long(argc, argv, optstring, options, NULL);
> diff --git a/udp.c b/udp.c
> index cadf393..a8473e3 100644
> --- a/udp.c
> +++ b/udp.c
> @@ -238,6 +238,20 @@ static struct sockaddr_in6 udp6_localname = {
>  static struct mmsghdr udp4_mh_splice [UDP_MAX_FRAMES];
>  static struct mmsghdr udp6_mh_splice [UDP_MAX_FRAMES];
>
> +/**
> + * udp_portmap_clear() - Clear UDP port map before configuration
> + */
> +void udp_portmap_clear(void)
> +{
> +	unsigned i;
> +
> +	for (i = 0; i < NUM_PORTS; i++) {
> +		udp_tap_map[V4][i].sock = udp_tap_map[V6][i].sock = -1;
> +		udp_splice_ns[V4][i].sock = udp_splice_ns[V6][i].sock = -1;
> +		udp_splice_init[V4][i].sock = udp_splice_init[V6][i].sock = -1;
> +	}
> +}

For TCP we do:

$ grep memset\(.*0xff tcp.c tcp_splice.c
tcp.c:	memset(init_sock_pool4, 0xff, sizeof(init_sock_pool4));
tcp.c:	memset(init_sock_pool6, 0xff, sizeof(init_sock_pool6));
tcp.c:	memset(tcp_sock_init_ext, 0xff, sizeof(tcp_sock_init_ext));
tcp.c:	memset(tcp_sock_ns, 0xff, sizeof(tcp_sock_ns));
tcp_splice.c:	memset(splice_pipe_pool, 0xff, sizeof(splice_pipe_pool));
tcp_splice.c:	memset(&ns_sock_pool4, 0xff, sizeof(ns_sock_pool4));
tcp_splice.c:	memset(&ns_sock_pool6, 0xff, sizeof(ns_sock_pool6));

...given how common this is, perhaps we could introduce a helper. In any
case, I'll go ahead and apply this now, as the issue is quite bad; we
can change this detail later.

-- 
Stefano
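
As a rough sketch of what such a helper might look like (the names `fd_table_clear()` and `fd_table_count_unopened()` are hypothetical and appear neither in the patch nor, as far as this message shows, in passt), assuming tables of plain `int` descriptors on a two's complement machine:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: mark every entry of a table of file descriptors as
 * "not open". Setting every byte to 0xff yields -1 for each int on two's
 * complement machines, the same effect the memset() calls in tcp.c rely on.
 */
void fd_table_clear(int *table, size_t count)
{
	assert(table != NULL);
	memset(table, 0xff, count * sizeof(*table));
}

/* Hypothetical check: count entries still marked as unopened (negative) */
size_t fd_table_count_unopened(const int *table, size_t count)
{
	size_t i, n = 0;

	for (i = 0; i < count; i++)
		if (table[i] < 0)
			n++;

	return n;
}
```

This only suits arrays whose elements are bare descriptors. For tables of structures accessed through a `.sock` member, as in the quoted patch, a byte-wise fill would also overwrite any other members, so an explicit loop like the one in udp_portmap_clear() may remain the safer choice there.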