From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v3 13/13] flow: Avoid moving flow entries to compact table
Date: Fri, 5 Jan 2024 20:39:50 +1100
Message-ID: <ZZfOZgDeVlzQU6PI@zatzit>
In-Reply-To: <20240105093335.0c725692@elisabeth>
On Fri, Jan 05, 2024 at 09:33:35AM +0100, Stefano Brivio wrote:
> On Thu, 4 Jan 2024 21:02:19 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Tue, Jan 02, 2024 at 07:13:41PM +0100, Stefano Brivio wrote:
> > > On Mon, 1 Jan 2024 23:01:17 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > On Sat, Dec 30, 2023 at 11:33:04AM +0100, Stefano Brivio wrote:
> > > > > On Thu, 28 Dec 2023 19:25:25 +0100
> > > > > Stefano Brivio <sbrivio@redhat.com> wrote:
> > > > >
> > > > > > > On Thu, 21 Dec 2023 17:15:49 +1100
> > > > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > > > >
> > > > > > > [...]
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > I wonder if we really have to keep track of the number of (non-)entries
> > > > > > in the free "block", and if we have to be explicit about the two cases.
> > > > > >
> > > > > > I'm trying to find out if we can simplify the whole thing with slightly
> > > > > > different variants, for example:
> > > > >
> > > > > So... I think the version with (explicit) blocks has this fundamental
> > > > > advantage, on deletion:
> > > > >
> > > > > > > + flow->f.type = FLOW_TYPE_NONE;
> > > > > > > + /* Put it back in a length 1 free block, don't attempt to fully reverse
> > > > > > > > + * flow_alloc()'s steps. This will get folded together the next time
> > > > > > > > + * flow_defer_handler() runs anyway. */
> > > > > > > + flow->free.n = 1;
> > > > > > > + flow->free.next = flow_first_free;
> > > > > > > + flow_first_free = FLOW_IDX(flow);
> > > > >
> > > > > which is doable even without explicit blocks, but much harder to
> > > > > follow.
> > > >
> > > > Remember this is not a general deletion, only a "cancel" of the most
> > > > recent allocation.
> > >
> > > Oh, I thought that was only the case for this series and you would use
> > > that as actual deletion in another pending series (which I haven't
> > > finished reviewing yet).
> >
> > No. Not allowing deletion of any entry at any time is what I'm
> > trading off to get both O(1) allocation and (effectively) O(1)
> > deletion.
> >
> > > But now I'm not sure anymore why I was thinking this...
> > >
> > > Anyway... do we really need it, then? Can't we just mark the "failed"
> > > flows as whatever means "closed" for a specific protocol, and clean
> > > them up later, instead of calling cancel() right away?
> >
> > We could, but I'm not sure we want to. For starters, that requires
> > protocol-specific behaviour whenever we need to back out an allocation
> > like this. Not a big deal, since that's in protocol specific code
> > already, but I think it's uglier than calling cancel.
> >
> > It also requires that the protocol specific deferred cleanup functions
> > (e.g. tcp_flow_defer()) handle partially initialised entries. With
> > 'cancel' we can back out just the initialisation steps we've already
> > done (because we know where we've failed during init), then remove the
> > entry. The deferred cleanup function only needs to deal with
> > "complete" entries. Again, certainly possible, but IMO uglier than
> > having 'cancel'.
>
> Okay, yes, I see now.
>
> Another doubt that comes to me now is: if you don't plan to use this
> alloc_cancel() thing anywhere else, the only reason why you are adding
> it is to replace the (flow_count >= FLOW_MAX) check with a flow_alloc()
> version that can fail.
>
> But at this point, speaking of ugliness, couldn't we just have a
> bool flow_may_alloc() { return flow_first_free < FLOW_MAX; } the caller
> can use to decide to abort earlier? To me it looks so much simpler and
> more robust.

Well, we could, but there are a couple of reasons I don't love it.

The first is abstraction: this returns explicit handling of the layout
of the table to the protocol specific callers. It's not a huge deal
right now, but once we have 4 or 5 protocols doing this, having to
change all of them if we make any tiny change to the semantics of
flow_first_free isn't great.
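
To make the abstraction concern concrete, here is a minimal sketch (not
code from the series). It assumes a simplified table in which
flow_first_free is a plain high-water mark; struct flow, flowtab,
tcp_new_checked() and tcp_new_alloc() are hypothetical names used only
for illustration. In the first caller the table indexing lives in
protocol code; in the second it stays behind flow_alloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLOW_MAX 4	/* tiny on purpose, illustration only */

enum flow_type { FLOW_TYPE_NONE, FLOW_TYPE_TCP };

/* Hypothetical, much reduced flow entry */
struct flow {
	enum flow_type type;
	int sock;
};

static struct flow flowtab[FLOW_MAX];
static unsigned flow_first_free;	/* assumption: simple high-water mark */

/* The check suggested in the quoted text */
static bool flow_may_alloc(void)
{
	return flow_first_free < FLOW_MAX;
}

/* Variant 1: the protocol caller checks, then touches the table itself.
 * Any change to what flow_first_free means has to be repeated in every
 * caller written this way.
 */
static struct flow *tcp_new_checked(int sock)
{
	struct flow *flow;

	if (!flow_may_alloc())
		return NULL;

	flow = &flowtab[flow_first_free++];	/* layout knowledge leaks here */
	flow->type = FLOW_TYPE_TCP;
	flow->sock = sock;
	return flow;
}

/* Variant 2: an allocator that can fail keeps the layout in one place */
static struct flow *flow_alloc(void)
{
	assert(flow_first_free <= FLOW_MAX);

	if (flow_first_free >= FLOW_MAX)
		return NULL;

	return &flowtab[flow_first_free++];
}

static struct flow *tcp_new_alloc(int sock)
{
	struct flow *flow = flow_alloc();

	if (!flow)
		return NULL;

	flow->type = FLOW_TYPE_TCP;
	flow->sock = sock;
	return flow;
}
```

Both variants behave the same from the outside; the difference is only
in which code would need editing if the free-entry bookkeeping changed.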

The other issue is that to do this (without a bunch of fairly large
and ugly temporaries) means we'd populate at least some of the fields
in flow_common before we have officially "allocated" the entry. At
that point it becomes a bit fuzzy as to when that allocation really
occurs. Is it when we do the FLOW_MAX test? Is it when we write to
f.type? Is it when we update flow_first_free? If we fail somewhere
in the middle of that, what steps do we need to reverse?

For those reasons I prefer the scheme presented. Fwiw, in an earlier
draft I did this differently with a "flow_prealloc()", which was
essentially the check against FLOW_MAX, then a later
flow_alloc_commit(). I thought it turned out pretty confusing
compared to the alloc/cancel approach.

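
For readers following along, the alloc/cancel pairing can be sketched
roughly as below. This is a simplified model under stated assumptions,
not the code from the patch: the free-block fields (n, next) and the
body of flow_alloc_cancel() follow the hunk quoted earlier in this
thread, while flow_init(), the body of flow_alloc() and demo_flow_new()
are hypothetical reconstructions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define FLOW_MAX 8	/* tiny on purpose, illustration only */

enum flow_type { FLOW_TYPE_NONE, FLOW_TYPE_TCP };

struct flow_common { enum flow_type type; };

/* Head entry of a run of free slots; shares 'type' with flow_common */
struct flow_free_block {
	enum flow_type type;	/* FLOW_TYPE_NONE while free */
	unsigned n;		/* number of entries in this free block */
	unsigned next;		/* index of next free block, FLOW_MAX if none */
};

union flow {
	struct flow_common f;
	struct flow_free_block free;
};

static union flow flowtab[FLOW_MAX];
static unsigned flow_first_free;

#define FLOW_IDX(flow)	((unsigned)((flow) - flowtab))

/* Assumed initial state: one free block covering the whole table */
static void flow_init(void)
{
	flowtab[0].free.type = FLOW_TYPE_NONE;
	flowtab[0].free.n = FLOW_MAX;
	flowtab[0].free.next = FLOW_MAX;
	flow_first_free = 0;
}

/* O(1): take the first entry of the first free block, NULL if full */
static union flow *flow_alloc(void)
{
	union flow *flow;

	if (flow_first_free >= FLOW_MAX)
		return NULL;

	flow = &flowtab[flow_first_free];
	assert(flow->free.type == FLOW_TYPE_NONE && flow->free.n >= 1);

	if (flow->free.n > 1) {
		/* Remainder of the block becomes the new first free block */
		union flow *rest = flow + 1;

		rest->free.type = FLOW_TYPE_NONE;
		rest->free.n = flow->free.n - 1;
		rest->free.next = flow->free.next;
		flow_first_free = FLOW_IDX(rest);
	} else {
		flow_first_free = flow->free.next;
	}
	return flow;
}

/* Only valid for the most recent allocation: push it back as a
 * length 1 free block, as in the quoted hunk.
 */
static void flow_alloc_cancel(union flow *flow)
{
	flow->f.type = FLOW_TYPE_NONE;
	flow->free.n = 1;
	flow->free.next = flow_first_free;
	flow_first_free = FLOW_IDX(flow);
}

/* Hypothetical protocol caller: 'fail' simulates a failed init step */
static union flow *demo_flow_new(int fail)
{
	union flow *flow = flow_alloc();

	if (!flow)
		return NULL;

	flow->f.type = FLOW_TYPE_TCP;
	if (fail) {
		/* Undo only the init steps done so far, then drop the entry */
		flow_alloc_cancel(flow);
		return NULL;
	}
	return flow;
}
```

In this model the allocation point is unambiguous: it is the return
from flow_alloc(), and a failure anywhere after it is handled by one
call to flow_alloc_cancel().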
--
David Gibson                   | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
                               | _way_ _around_!
http://www.ozlabs.org/~dgibson