From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id F318C5A026F for ; Fri, 5 Jan 2024 10:40:03 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202312; t=1704447600; bh=urukU5gGHUjbbbbAI/t9yoPMzCegmsuDLCWzAQqD17Y=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=ngF7FThK4jLMRqhSXpyC3qosGCMKR/Y32dodcg+MAlA+p7vTJVt/mwSEJYUXZ2DhL LRGK8K/afcMd2JtwdZ3/NbdXgmSzSjej72nOwyclU2RQwmMG4bQktWhXYA7X2vq8ss deet64Y2TyU+Q8CDoZ4wgdQfCvW1a/wIOSDsxo9fIBO2LVrC7DIXe/CqrEuqrfOTdU wZiQgLjFjobnezGh9N7W4NbldfaSmYhJwM9BpY6o4yIEVBk4l7k/jCwPASIxNZmFVb 8JGxyzfhid9z2fF+6GF8uKD4nlPYbz+13v9IbADseQ06QmtiPwTJbr+W17TntqMmB8 fL4KyWZGGunXA== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4T5z2S1zbhz4wjB; Fri, 5 Jan 2024 20:40:00 +1100 (AEDT) Date: Fri, 5 Jan 2024 20:39:50 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH v3 13/13] flow: Avoid moving flow entries to compact table Message-ID: References: <20231221061549.976358-1-david@gibson.dropbear.id.au> <20231221061549.976358-14-david@gibson.dropbear.id.au> <20231228192525.7ba1ee48@elisabeth> <20231230113304.37c60a9a@elisabeth> <20240102191341.7c91dd44@elisabeth> <20240105093335.0c725692@elisabeth> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="xrb5880aOS+QD+Jp" Content-Disposition: inline In-Reply-To: <20240105093335.0c725692@elisabeth> Message-ID-Hash: BCUUCP3342H4GYH3Z7B6WTG4VWCZ5ERS X-Message-ID-Hash: BCUUCP3342H4GYH3Z7B6WTG4VWCZ5ERS X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --xrb5880aOS+QD+Jp Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Jan 05, 2024 at 09:33:35AM +0100, Stefano Brivio wrote: > On Thu, 4 Jan 2024 21:02:19 +1100 > David Gibson wrote: >=20 > > On Tue, Jan 02, 2024 at 07:13:41PM +0100, Stefano Brivio wrote: > > > On Mon, 1 Jan 2024 23:01:17 +1100 > > > David Gibson wrote: > > > =20 > > > > On Sat, Dec 30, 2023 at 11:33:04AM +0100, Stefano Brivio wrote: =20 > > > > > On Thu, 28 Dec 2023 19:25:25 +0100 > > > > > Stefano Brivio wrote: > > > > > =20 > > > > > > > On Thu, 21 Dec 2023 17:15:49 +1100 > > > > > > > David Gibson wrote: > > > > > > >=20 > > > > > > > [...] =20 > > > > > > > > > > > > [...] > > > > > > > > > > > > I wonder if we really have to keep track of the number of (non-= )entries > > > > > > in the free "block", and if we have to be explicit about the tw= o cases. > > > > > >=20 > > > > > > I'm trying to find out if we can simplify the whole thing with = slightly > > > > > > different variants, for example: =20 > > > > >=20 > > > > > So... I think the version with (explicit) blocks has this fundame= ntal > > > > > advantage, on deletion: > > > > > =20 > > > > > > > + flow->f.type =3D FLOW_TYPE_NONE; > > > > > > > + /* Put it back in a length 1 free block, don't attempt to f= ully reverse > > > > > > > + * flow_alloc()s steps. This will get folded together the = next time > > > > > > > + * flow_defer_handler runs anyway() */ > > > > > > > + flow->free.n =3D 1; > > > > > > > + flow->free.next =3D flow_first_free; > > > > > > > + flow_first_free =3D FLOW_IDX(flow); =20 > > > > >=20 > > > > > which is doable even without explicit blocks, but much harder to > > > > > follow. =20 > > > >=20 > > > > Remember this is not a general deletion, only a "cancel" of the most > > > > recent allocation. =20 > > >=20 > > > Oh, I thought that was only the case for this series and you would use > > > that as actual deletion in another pending series (which I haven't > > > finished reviewing yet). =20 > >=20 > > No. Not allowing deletion of any entry at any time is what I'm > > trading off to get both O(1) allocation and (effectively) O(1) > > deletion. > >=20 > > > But now I'm not sure anymore why I was thinking this... > > >=20 > > > Anyway... do we really need it, then? Can't we just mark the "failed" > > > flows as whatever means "closed" for a specific protocol, and clean > > > them up later, instead of calling cancel() right away? =20 > >=20 > > We could, but I'm not sure we want to. For starters, that requires > > protocol-specific behaviour whenever we need to back out an allocation > > like this. Not a big deal, since that's in protocol specific code > > already, but I think it's uglier than calling cancel. > >=20 > > It also requires that the protocol specific deferred cleanup functions > > (e.g. tcp_flow_defer()) handle partially initialised entries. With > > 'cancel' we can back out just the initialisation steps we've already > > done (because we know where we've failed during init), then remove the > > entry. The deferred cleanup function only needs to deal with > > "complete" entries. Again, certainly possible, but IMO uglier than > > having 'cancel'. >=20 > Okay, yes, I see now. >=20 > Another doubt that comes to me now is: if you don't plan to use this > alloc_cancel() thing anywhere else, the only reason why you are adding > it is to replace the (flow_count >=3D FLOW_MAX) check with a flow_alloc() > version that can fail. >=20 > But at this point, speaking of ugliness, couldn't we just have a > bool flow_may_alloc() { return flow_first_free < FLOW_MAX }; the caller > can use to decide to abort earlier? To me it looks so much simpler and > more robust. Well, we could, but there are a couple of reasons I don't love it. The first is abstraction: this returns explicit handling of the layout of the table to the protocol specific callers. It's not a huge deal right now, but once we have 4 or 5 protocols doing this, having to change all of them if we make any tiny change to the semantics of flow_first_free isn't great. The other issue is that to do this (without a bunch of fairly large and ugly temporaries) means we'd populate at least some of the fields in flow_common before we have officially "allocated" the entry. At that point it becomes a bit fuzzy as to when that allocation really occurs. Is it when we do the FLOW_MAX tesT? Is it when we write to f.type? Is it when we update flow_first_free? If we fail somewhere in the middle of that, what steps do we need to reverse? For those reasons I prefer the scheme presented. Fwiw, in an earlier draft I did this differently with a "flow_prealloc()", which was essentially the check against FLOW_MAX, then a later flow_alloc_commit(). I thought it turned out pretty confusing compared to the alloc/cancel approach. --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --xrb5880aOS+QD+Jp Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmWXzmEACgkQzQJF27ox 2Gf7gA//aOlX99ReYMwdVaE8dlVuOdV+7nONhDtvjdu/LS4cF6SLPMP1KlbkPCtm WSPh+nIT7IB4QH0p29pqFcDk6ZJ+4wYX7GtgfcYRvq7zK6+p334PWlKvSelRlPAI 2v/PmSbzcrOf+LgeSCeysozLZlUT4cv7jxesLl9dFr6Qe0+o2S10KQHL68TpV1Yc DDQnvlCIP6utoYzYAZ89wiOQV7Bea7qEnBbXNCcsqJKDDrlAM1xchM1Zgppcs7do G5BW27vMAoVggmiCRMtTIKyM0YZIFBx8Yp1htJhNaE3WlpA2NYD0IgwQbP3drIgn JispGI4DdY4E+WRVToBA8IYSQdy/29swsjqxfmyYcmkYy8RRJ4aHo8O2URX7Yu47 4G/hj3crtQy3UPmEEgfqz3DU1auRXGcEhyfFPq+jg1mUf1BvjwJV/9JvgZ4mAPod p57hmdwpdCLhCOqj7UKQv41fAo+9HH/axHMVuTbtrhnvArO+/tRgPi/5MuMIjR/k L4K0eePrVsbRfYs3xV7B+z85PeDLYOiKBEMVOBK0+Nw0+e2iypdINHqoU7XVS6zF he62jINaUJsx4q2/TocFrRe2SRo8DD6Maye7rP+dNFsqjkvgxVlRM5PNFkyxXU/s pEhdJe+zgz5qm4Cji/iTEWLK3RlcasakQPmWTGVf3vvS+xY9jl8= =xIZD -----END PGP SIGNATURE----- --xrb5880aOS+QD+Jp--