public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v3 16/16] tcp: Don't defer hash table removal
Date: Fri, 1 Dec 2023 11:07:50 +1100	[thread overview]
Message-ID: <ZWkj1udgR2Le8YZ1@zatzit> (raw)
In-Reply-To: <20231130134532.7c5b959e@elisabeth>

[-- Attachment #1: Type: text/plain, Size: 4301 bytes --]

On Thu, Nov 30, 2023 at 01:45:32PM +0100, Stefano Brivio wrote:
> On Thu, 30 Nov 2023 13:02:22 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > When a TCP connection is closed, we mark it by setting events to CLOSED,
> > then some time later we do final cleanups: closing sockets, removing from
> > the hash table and so forth.
> > 
> > This does mean that when making a hash lookup we need to exclude any
> > apparent matches that are CLOSED, since they represent a stale connection.
> > This can happen in practice if one connection closes and a new one with the
> > same endpoints is started shortly afterward.
> > 
> > Checking for CLOSED is quite specific to TCP however, and won't work when
> > we extend the hash table to more general flows.  So, alter the code to
> > immediately remove the connection from the hash table when CLOSED, although
> > we still defer closing sockets and other cleanup.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  tcp.c | 10 +++++++---
> >  1 file changed, 7 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tcp.c b/tcp.c
> > index 74d06bf..17c7cba 100644
> > --- a/tcp.c
> > +++ b/tcp.c
> > @@ -781,6 +781,9 @@ static void conn_flag_do(const struct ctx *c, struct tcp_tap_conn *conn,
> >  		tcp_timer_ctl(c, conn);
> >  }
> >  
> > +static void tcp_hash_remove(const struct ctx *c,
> > +			    const struct tcp_tap_conn *conn);
> > +
> >  /**
> >   * conn_event_do() - Set and log connection events, update epoll state
> >   * @c:		Execution context
> > @@ -825,7 +828,9 @@ static void conn_event_do(const struct ctx *c, struct tcp_tap_conn *conn,
> >  		flow_dbg(conn, "%s",
> >  			 num == -1 	       ? "CLOSED" : tcp_event_str[num]);
> >  
> > -	if ((event == TAP_FIN_RCVD) && !(conn->events & SOCK_FIN_RCVD))
> > +	if (event == CLOSED)
> > +		tcp_hash_remove(c, conn);
> > +	else if ((event == TAP_FIN_RCVD) && !(conn->events & SOCK_FIN_RCVD))
> >  		conn_flag(c, conn, ACTIVE_CLOSE);
> >  	else
> >  		tcp_epoll_ctl(c, conn);
> > @@ -1150,7 +1155,7 @@ static int tcp_hash_match(const struct tcp_tap_conn *conn,
> >  			  const union inany_addr *faddr,
> >  			  in_port_t eport, in_port_t fport)
> >  {
> > -	if (conn->events != CLOSED && inany_equals(&conn->faddr, faddr) &&
> > +	if (inany_equals(&conn->faddr, faddr) &&
> >  	    conn->eport == eport && conn->fport == fport)
> >  		return 1;
> >  
> > @@ -1308,7 +1313,6 @@ static void tcp_conn_destroy(struct ctx *c, union flow *flow)
> >  	if (conn->timer != -1)
> >  		close(conn->timer);
> >  
> > -	tcp_hash_remove(c, conn);
> >  	flow_table_compact(c, flow);
> 
> I was pretty sure, due to the way I originally implemented this, that
> removing an entry from the hash table without compacting the table
> afterwards, with an event possibly coming between the two, would
> present some inconsistency while we're handling that event.
> 
> But looking at it now, I don't see any issue with this. I just wanted
> to raise it in case you're aware of (but didn't think about) some
> concern in this sense.

I think it's ok.  The thing is that compacting the connection table
itself is largely independent of the hash table, whose buckets are
separately indexed.  A hash remove shuffles things around in the hash
buckets, but doesn't change where connections sit in the connection
table.  A connection table compaction changes the indices in the
connection table, which requires updating things in the hash buckets,
but not moving things around in the buckets - exactly the same entries
are in every hash bucket, it's just that one of them has a new "name"
now.

> By the way, the reason why I deferred tcp_hash_remove() back then was
> to save cycles between epoll events and get higher CRR rates, but I
> think the effect is negligible anyway.

Right.. to process a FIN and the next SYN at once, I guess?  I figured
this might make a difference, but probably not much.  There's no
syscall here, and batching doesn't reduce the total amount of work in
this case.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-12-01  0:08 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-30  2:02 [PATCH v3 00/16] Introduce unified flow table, first steps David Gibson
2023-11-30  2:02 ` [PATCH v3 01/16] treewide: Add messages to static_assert() calls David Gibson
2023-11-30  2:02 ` [PATCH v3 02/16] flow, tcp: Generalise connection types David Gibson
2023-11-30  2:02 ` [PATCH v3 03/16] flow, tcp: Move TCP connection table to unified flow table David Gibson
2023-11-30  2:02 ` [PATCH v3 04/16] flow, tcp: Consolidate flow pointer<->index helpers David Gibson
2023-11-30  2:02 ` [PATCH v3 05/16] util: MAX_FROM_BITS() should be unsigned David Gibson
2023-11-30  2:02 ` [PATCH v3 06/16] flow: Make unified version of flow table compaction David Gibson
2023-11-30  2:02 ` [PATCH v3 07/16] flow, tcp: Add logging helpers for connection related messages David Gibson
2023-11-30  2:02 ` [PATCH v3 08/16] flow: Introduce 'sidx' type to represent one side of one flow David Gibson
2023-12-02  4:35   ` Stefano Brivio
2023-11-30  2:02 ` [PATCH v3 09/16] tcp: Remove unneccessary bounds check in tcp_timer_handler() David Gibson
2023-11-30  2:02 ` [PATCH v3 10/16] flow,tcp: Generalise TCP epoll_ref to generic flows David Gibson
2023-11-30  2:02 ` [PATCH v3 11/16] tcp_splice: Use unsigned to represent side David Gibson
2023-11-30  2:02 ` [PATCH v3 12/16] flow,tcp: Use epoll_ref type including flow and side David Gibson
2023-11-30  2:02 ` [PATCH v3 13/16] test: Avoid hitting guestfish command length limits David Gibson
2023-11-30  2:02 ` [PATCH v3 14/16] pif: Add helpers to get the name of a pif David Gibson
2023-11-30 12:45   ` Stefano Brivio
2023-12-01  0:08     ` David Gibson
2023-11-30  2:02 ` [PATCH v3 15/16] tcp: "TCP" hash secret doesn't need to be TCP specific David Gibson
2023-11-30  2:02 ` [PATCH v3 16/16] tcp: Don't defer hash table removal David Gibson
2023-11-30 12:45   ` Stefano Brivio
2023-12-01  0:07     ` David Gibson [this message]
2023-12-02  4:34       ` Stefano Brivio
2023-12-04  0:43         ` David Gibson
2023-12-04  9:54 ` [PATCH v3 00/16] Introduce unified flow table, first steps Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZWkj1udgR2Le8YZ1@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).