public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v3 13/13] flow: Avoid moving flow entries to compact table
Date: Fri, 5 Jan 2024 20:39:50 +1100	[thread overview]
Message-ID: <ZZfOZgDeVlzQU6PI@zatzit> (raw)
In-Reply-To: <20240105093335.0c725692@elisabeth>

[-- Attachment #1: Type: text/plain, Size: 5104 bytes --]

On Fri, Jan 05, 2024 at 09:33:35AM +0100, Stefano Brivio wrote:
> On Thu, 4 Jan 2024 21:02:19 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Jan 02, 2024 at 07:13:41PM +0100, Stefano Brivio wrote:
> > > On Mon, 1 Jan 2024 23:01:17 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >   
> > > > On Sat, Dec 30, 2023 at 11:33:04AM +0100, Stefano Brivio wrote:  
> > > > > On Thu, 28 Dec 2023 19:25:25 +0100
> > > > > Stefano Brivio <sbrivio@redhat.com> wrote:
> > > > >     
> > > > > > > On Thu, 21 Dec 2023 17:15:49 +1100
> > > > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > > > > 
> > > > > > > [...]    
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > I wonder if we really have to keep track of the number of (non-)entries
> > > > > > in the free "block", and if we have to be explicit about the two cases.
> > > > > > 
> > > > > > I'm trying to find out if we can simplify the whole thing with slightly
> > > > > > different variants, for example:    
> > > > > 
> > > > > So... I think the version with (explicit) blocks has this fundamental
> > > > > advantage, on deletion:
> > > > >     
> > > > > > > +	flow->f.type = FLOW_TYPE_NONE;
> > > > > > > +	/* Put it back in a length 1 free block, don't attempt to fully reverse
> > > > > > > +	 * flow_alloc()s steps.  This will get folded together the next time
> > > > > > > +	 * flow_defer_handler runs anyway() */
> > > > > > > +	flow->free.n = 1;
> > > > > > > +	flow->free.next = flow_first_free;
> > > > > > > +	flow_first_free = FLOW_IDX(flow);    
> > > > > 
> > > > > which is doable even without explicit blocks, but much harder to
> > > > > follow.    
> > > > 
> > > > Remember this is not a general deletion, only a "cancel" of the most
> > > > recent allocation.  
> > > 
> > > Oh, I thought that was only the case for this series and you would use
> > > that as actual deletion in another pending series (which I haven't
> > > finished reviewing yet).  
> > 
> > No.  Not allowing deletion of any entry at any time is what I'm
> > trading off to get both O(1) allocation and (effectively) O(1)
> > deletion.
> > 
> > > But now I'm not sure anymore why I was thinking this...
> > > 
> > > Anyway... do we really need it, then? Can't we just mark the "failed"
> > > flows as whatever means "closed" for a specific protocol, and clean
> > > them up later, instead of calling cancel() right away?  
> > 
> > We could, but I'm not sure we want to.  For starters, that requires
> > protocol-specific behaviour whenever we need to back out an allocation
> > like this.  Not a big deal, since that's in protocol specific code
> > already, but I think it's uglier than calling cancel.
> > 
> > It also requires that the protocol specific deferred cleanup functions
> > (e.g. tcp_flow_defer()) handle partially initialised entries.  With
> > 'cancel' we can back out just the initialisation steps we've already
> > done (because we know where we've failed during init), then remove the
> > entry.  The deferred cleanup function only needs to deal with
> > "complete" entries.  Again, certainly possible, but IMO uglier than
> > having 'cancel'.
> 
> Okay, yes, I see now.
> 
> Another doubt that comes to me now is: if you don't plan to use this
> alloc_cancel() thing anywhere else, the only reason why you are adding
> it is to replace the (flow_count >= FLOW_MAX) check with a flow_alloc()
> version that can fail.
> 
> But at this point, speaking of ugliness, couldn't we just have a
> bool flow_may_alloc() { return flow_first_free < FLOW_MAX }; the caller
> can use to decide to abort earlier? To me it looks so much simpler and
> more robust.

Well, we could, but there are a couple of reasons I don't love it.
The first is abstraction: this returns explicit handling of the layout
of the table to the protocol specific callers.  It's not a huge deal
right now, but once we have 4 or 5 protocols doing this, having to
change all of them if we make any tiny change to the semantics of
flow_first_free isn't great.

The other issue is that to do this (without a bunch of fairly large
and ugly temporaries) means we'd populate at least some of the fields
in flow_common before we have officially "allocated" the entry.  At
that point it becomes a bit fuzzy as to when that allocation really
occurs.  Is it when we do the FLOW_MAX tesT?  Is it when we write to
f.type?  Is it when we update flow_first_free?  If we fail somewhere
in the middle of that, what steps do we need to reverse?

For those reasons I prefer the scheme presented.  Fwiw, in an earlier
draft I did this differently with a "flow_prealloc()", which was
essentially the check against FLOW_MAX, then a later
flow_alloc_commit().  I thought it turned out pretty confusing
compared to the alloc/cancel approach.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-01-05  9:40 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-21  6:15 [PATCH v3 00/13] Manage more flow related things from generic flow code David Gibson
2023-12-21  6:15 ` [PATCH v3 01/13] flow: Make flow_table.h #include the protocol specific headers it needs David Gibson
2023-12-21  6:15 ` [PATCH v3 02/13] treewide: Standardise on 'now' for current timestamp variables David Gibson
2023-12-21  6:15 ` [PATCH v3 03/13] tcp, tcp_splice: Remove redundant handling from tcp_timer() David Gibson
2023-12-21  6:15 ` [PATCH v3 04/13] tcp, tcp_splice: Move per-type cleanup logic into per-type helpers David Gibson
2023-12-21  6:15 ` [PATCH v3 05/13] flow, tcp: Add flow-centric dispatch for deferred flow handling David Gibson
2023-12-28 18:24   ` Stefano Brivio
2023-12-31  5:56     ` David Gibson
2024-01-02 18:13       ` Stefano Brivio
2024-01-03  3:45         ` David Gibson
2023-12-21  6:15 ` [PATCH v3 06/13] flow, tcp: Add handling for per-flow timers David Gibson
2023-12-21  6:15 ` [PATCH v3 07/13] epoll: Better handling of number of epoll types David Gibson
2023-12-21  6:15 ` [PATCH v3 08/13] tcp, tcp_splice: Avoid double layered dispatch for connected TCP sockets David Gibson
2023-12-21  6:15 ` [PATCH v3 09/13] flow: Move flow_log_() to near top of flow.c David Gibson
2023-12-21  6:15 ` [PATCH v3 10/13] flow: Move flow_count from context structure to a global David Gibson
2023-12-28 18:25   ` Stefano Brivio
2023-12-31  5:58     ` David Gibson
2024-01-02 18:13       ` Stefano Brivio
2024-01-03  3:54         ` David Gibson
2024-01-03  7:08           ` Stefano Brivio
2024-01-04  9:51             ` David Gibson
2024-01-05  7:55               ` Stefano Brivio
2024-01-07  5:23                 ` David Gibson
2023-12-21  6:15 ` [PATCH v3 11/13] flow: Abstract allocation of new flows with helper function David Gibson
2023-12-21  6:15 ` [PATCH v3 12/13] flow: Enforce that freeing of closed flows must happen in deferred handlers David Gibson
2023-12-21  6:15 ` [PATCH v3 13/13] flow: Avoid moving flow entries to compact table David Gibson
2023-12-28 18:25   ` Stefano Brivio
2023-12-30 10:33     ` Stefano Brivio
2024-01-01 12:01       ` David Gibson
2024-01-02 18:13         ` Stefano Brivio
2024-01-04 10:02           ` David Gibson
2024-01-05  8:33             ` Stefano Brivio
2024-01-05  9:39               ` David Gibson [this message]
2024-01-05 10:27                 ` Stefano Brivio
2024-01-06 11:32                   ` David Gibson
2024-01-06 13:02                     ` Stefano Brivio
2024-01-07  5:20                       ` David Gibson
2024-01-01 10:44     ` David Gibson
2024-01-02 18:13       ` Stefano Brivio
2024-01-05  9:45         ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZZfOZgDeVlzQU6PI@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).