public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v3 13/14] fwd, tcp, udp: Add forwarding rule to listening socket epoll references
Date: Wed, 14 Jan 2026 11:37:41 +1100
Message-ID: <aWblVcRGLUrKvBu7@zatzit>
In-Reply-To: <20260113231235.544d72f8@elisabeth>

On Tue, Jan 13, 2026 at 11:12:35PM +0100, Stefano Brivio wrote:
> On Thu,  8 Jan 2026 13:29:47 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > Now that we have a table of all our forwarding rules, every listening
> > socket can be associated with a specific rule.  Add an index allowing us to
> > locate that rule from the socket's epoll reference.  We don't use it yet,
> > but we'll use it to optimise rule lookup when forwarding new flows.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  fwd.c | 15 ++++++++++-----
> >  fwd.h |  5 +++++
> >  tcp.c |  4 +++-
> >  tcp.h |  5 ++---
> >  udp.c |  4 +++-
> >  udp.h |  5 ++---
> >  6 files changed, 25 insertions(+), 13 deletions(-)
> > 
> > diff --git a/fwd.c b/fwd.c
> > index 7c4575ff..6727d26f 100644
> > --- a/fwd.c
> > +++ b/fwd.c
> > @@ -474,6 +474,7 @@ void fwd_rules_print(const struct fwd_ports *fwd)
> >  
> >  /** fwd_sync_one() - Create or remove listening sockets for a forward entry
> >   * @c:		Execution context
> > + * @fwd:	Forwarding table
> >   * @rule:	Forwarding rule
> >   * @pif:	Interface to create listening sockets for
> >   * @proto:	Protocol to listen for
> > @@ -481,19 +482,23 @@ void fwd_rules_print(const struct fwd_ports *fwd)
> >   *
> >   * Return: 0 on success, -1 on failure
> >   */
> > -static int fwd_sync_one(const struct ctx *c, const struct fwd_rule *rule,
> > +static int fwd_sync_one(const struct ctx *c,
> > +			const struct fwd_ports *fwd, const struct fwd_rule *rule,
> >  			uint8_t pif, uint8_t proto, const uint8_t *scanmap)
> >  {
> >  	const union inany_addr *addr = fwd_rule_addr(rule);
> >  	const char *ifname = rule->ifname;
> >  	bool bound_one = false;
> > -	unsigned port;
> > +	unsigned port, idx;
> >  
> >  	ASSERT(pif_is_socket(pif));
> >  
> >  	if (!*ifname)
> >  		ifname = NULL;
> >  
> > +	idx = rule - fwd->rules;
> > +	ASSERT(idx < MAX_FWD_RULES);
> > +	
> >  	for (port = rule->first; port <= rule->last; port++) {
> >  		int fd = rule->socks[port - rule->first];
> >  
> > @@ -514,9 +519,9 @@ static int fwd_sync_one(const struct ctx *c, const struct fwd_rule *rule,
> >  		}
> >  
> >  		if (proto == IPPROTO_TCP)
> > -			fd = tcp_listen(c, pif, addr, ifname, port);
> > +			fd = tcp_listen(c, pif, idx, addr, ifname, port);
> >  		else if (proto == IPPROTO_UDP)
> > -			fd = udp_listen(c, pif, addr, ifname, port);
> > +			fd = udp_listen(c, pif, idx, addr, ifname, port);
> >  		else
> >  			ASSERT(0);
> >  
> > @@ -588,7 +593,7 @@ static int fwd_listen_sync_(void *arg)
> >  		ns_enter(a->c);
> >  
> >  	for (i = 0; i < a->fwd->count; i++) {
> > -		a->ret = fwd_sync_one(a->c, &a->fwd->rules[i],
> > +		a->ret = fwd_sync_one(a->c, a->fwd, &a->fwd->rules[i],
> >  				      a->pif, a->proto, a->fwd->map);
> >  		if (a->ret < 0)
> >  			break;
> > diff --git a/fwd.h b/fwd.h
> > index cfe9ed46..435f422a 100644
> > --- a/fwd.h
> > +++ b/fwd.h
> > @@ -48,14 +48,19 @@ struct fwd_rule {
> >   * union fwd_listen_ref - information about a single listening socket
> >   * @port:	Bound port number of the socket
> >   * @pif:	pif in which the socket is listening
> > + * @rule:	Index of forwarding rule
> >   */
> >  union fwd_listen_ref {
> >  	struct {
> >  		in_port_t	port;
> >  		uint8_t		pif;
> > +#define FWD_RULE_BITS	8
> > +		unsigned	rule :FWD_RULE_BITS;
> >  	};
> >  	uint32_t u32;
> >  };
> > +static_assert(sizeof(union fwd_listen_ref) == sizeof(uint32_t));
> 
> Why do we need this, specifically?

It goes into the data field of the epoll_ref, so it has to be exactly
32 bits.  With the bitfields, it's maybe not instantly obvious that
the structure isn't larger than that.  In particular, this relies on
the compiler not inserting padding between @pif and @rule; since
alignof(unsigned) == 4, typically, I was concerned it might.  Even if
that is guaranteed by the C standard, I think it's nicer not to
require the reader to know that.
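
As a standalone illustration of the size check described above, here is
a minimal sketch that reproduces the union from the patch outside the
passt tree.  The helpers listen_ref_pack() and listen_ref_unpack() are
hypothetical names introduced only for this example, and the message
string on static_assert() is added so the block builds as plain C11.
It assumes a compiler that packs the 8-bit field into the byte after
@pif, as GCC and Clang do on common Linux ABIs.

```c
#include <assert.h>
#include <stdint.h>
#include <netinet/in.h>		/* in_port_t */

#define FWD_RULE_BITS	8

/* Same layout as in the patch: 16 + 8 + 8 bits overlaid on a uint32_t */
union fwd_listen_ref {
	struct {
		in_port_t	port;
		uint8_t		pif;
		unsigned	rule :FWD_RULE_BITS;
	};
	uint32_t u32;
};

/* Fails the build if the compiler pads between @pif and @rule */
static_assert(sizeof(union fwd_listen_ref) == sizeof(uint32_t),
	      "fwd_listen_ref must fit the 32-bit epoll reference data");

/* Hypothetical helper: build the 32-bit value stored with the socket */
static uint32_t listen_ref_pack(in_port_t port, uint8_t pif, unsigned rule)
{
	union fwd_listen_ref ref = {
		.port = port,
		.pif = pif,
		.rule = rule,
	};

	return ref.u32;
}

/* Hypothetical helper: recover the fields from the stored value */
static union fwd_listen_ref listen_ref_unpack(uint32_t data)
{
	union fwd_listen_ref ref = { .u32 = data };

	return ref;
}
```

If the bitfield were instead placed in its own 4-byte unit, the union
would grow to 8 bytes and the assertion would stop the build rather
than silently truncating the reference.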

> > +static_assert(MAX_FWD_RULES <= (1U << FWD_RULE_BITS));
> 
> I start wondering if instead of having a 'rule' field supporting 256
> rules, with 128 as maximum number of rules, we could just have 256 as
> maximum number of rules and use the usual MAX_FROM_BITS() macro to keep
> things simpler.

Good idea, done.
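
For readers following along, a rough sketch of how the table limit
could be derived from the bit width, so that the two can't drift apart.
This is an assumption about the shape of the change, not the code from
the next revision: MAX_FROM_BITS() is redefined locally as a stand-in
for the project's macro, the "+ 1" (indices 0..255, hence 256 rules) is
one possible reading of the suggestion, and struct rule_stub,
struct table_stub and rule_index() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in (assumption): largest value representable in n bits */
#define MAX_FROM_BITS(n)	((1U << (n)) - 1)

#define FWD_RULE_BITS	8
/* Assumption: every index 0..MAX_FROM_BITS() is usable, giving 256 rules */
#define MAX_FWD_RULES	(MAX_FROM_BITS(FWD_RULE_BITS) + 1)

/* Hypothetical, cut-down stand-ins for struct fwd_rule / struct fwd_ports */
struct rule_stub {
	unsigned short first, last;
};

struct table_stub {
	unsigned count;
	struct rule_stub rules[MAX_FWD_RULES];
};

/* Index of @rule within @t, derived by pointer difference as in the patch */
static unsigned rule_index(const struct table_stub *t,
			   const struct rule_stub *rule)
{
	ptrdiff_t idx = rule - t->rules;

	/* Any in-table index fits the FWD_RULE_BITS-wide bitfield */
	assert(idx >= 0 && (size_t)idx < MAX_FWD_RULES);
	return (unsigned)idx;
}
```

With the limit tied to the field width this way, a separate
static_assert() comparing MAX_FWD_RULES against 1U << FWD_RULE_BITS
would no longer be needed.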

Btw, as a later change, I'm considering merging the four forwarding
tables into one.  If that's done, we don't need @pif in the epoll_ref
any more (it will be in the rule), and we'll have 16 bits of space if
we need to expand the rule table.

> After all, it's not really rules[] taking space:

Certainly.

> struct fwd_ports {
>         enum fwd_ports_mode        mode;                 /*     0     4 */
>         int                        scan4;                /*     4     4 */
>         int                        scan6;                /*     8     4 */
>         unsigned int               count;                /*    12     4 */
>         struct fwd_rule            rules[128];           /*    16  7168 */
>         /* --- cacheline 112 boundary (7168 bytes) was 16 bytes ago --- */
>         uint8_t                    map[8192];            /*  7184  8192 */
>         /* --- cacheline 240 boundary (15360 bytes) was 16 bytes ago --- */
>         unsigned int               listen_sock_count;    /* 15376     4 */
>         int                        listen_socks[196608]; /* 15380 786432 */
> 
>         /* size: 801816, cachelines: 12529, members: 8 */
>         /* padding: 4 */
>         /* last cacheline: 24 bytes */
> };
> 
> >  enum fwd_ports_mode {
> >  	FWD_UNSET = 0,
> > diff --git a/tcp.c b/tcp.c
> > index e9b440da..fc03e38f 100644
> > --- a/tcp.c
> > +++ b/tcp.c
> > @@ -2672,18 +2672,20 @@ void tcp_sock_handler(const struct ctx *c, union epoll_ref ref,
> >   * tcp_listen() - Create listening socket
> >   * @c:		Execution context
> >   * @pif:	Interface to open the socket for (PIF_HOST or PIF_SPLICE)
> > + * @rule:	Index of relevant forwarding rule
> >   * @addr:	Pointer to address for binding, NULL for any
> >   * @ifname:	Name of interface to bind to, NULL for any
> >   * @port:	Port, host order
> >   *
> >   * Return: Socket fd on success, negative error code on failure
> >   */
> > -int tcp_listen(const struct ctx *c, uint8_t pif,
> > +int tcp_listen(const struct ctx *c, uint8_t pif, unsigned rule,
> >  	       const union inany_addr *addr, const char *ifname, in_port_t port)
> >  {
> >  	union fwd_listen_ref ref = {
> >  		.port = port,
> >  		.pif = pif,
> > +		.rule = rule,
> >  	};
> >  	int s;
> >  
> > diff --git a/tcp.h b/tcp.h
> > index 45f97d93..24b90870 100644
> > --- a/tcp.h
> > +++ b/tcp.h
> > @@ -18,9 +18,8 @@ void tcp_sock_handler(const struct ctx *c, union epoll_ref ref,
> >  int tcp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
> >  		    const void *saddr, const void *daddr, uint32_t flow_lbl,
> >  		    const struct pool *p, int idx, const struct timespec *now);
> > -int tcp_listen(const struct ctx *c, uint8_t pif,
> > -	       const union inany_addr *addr, const char *ifname,
> > -	       in_port_t port);
> > +int tcp_listen(const struct ctx *c, uint8_t pif, unsigned rule,
> > +	       const union inany_addr *addr, const char *ifname, in_port_t port);
> >  int tcp_init(struct ctx *c);
> >  void tcp_timer(const struct ctx *c, const struct timespec *now);
> >  void tcp_defer_handler(struct ctx *c);
> > diff --git a/udp.c b/udp.c
> > index 92a87198..761221f6 100644
> > --- a/udp.c
> > +++ b/udp.c
> > @@ -1115,18 +1115,20 @@ int udp_tap_handler(const struct ctx *c, uint8_t pif,
> >   * udp_listen() - Initialise listening socket for a given port
> >   * @c:		Execution context
> >   * @pif:	Interface to open the socket for (PIF_HOST or PIF_SPLICE)
> > + * @rule:	Index of relevant forwarding rule
> >   * @addr:	Pointer to address for binding, NULL if not configured
> >   * @ifname:	Name of interface to bind to, NULL if not configured
> >   * @port:	Port, host order
> >   *
> >   * Return: Socket fd on success, negative error code on failure
> >   */
> > -int udp_listen(const struct ctx *c, uint8_t pif,
> > +int udp_listen(const struct ctx *c, uint8_t pif, unsigned rule,
> >  	       const union inany_addr *addr, const char *ifname, in_port_t port)
> >  {
> >  	union fwd_listen_ref ref = {
> >  		.pif = pif,
> >  		.port = port,
> > +		.rule = rule,
> >  	};
> >  	int s;
> >  
> > diff --git a/udp.h b/udp.h
> > index 3c6f90a9..2b91d728 100644
> > --- a/udp.h
> > +++ b/udp.h
> > @@ -14,9 +14,8 @@ int udp_tap_handler(const struct ctx *c, uint8_t pif,
> >  		    sa_family_t af, const void *saddr, const void *daddr,
> >  		    uint8_t ttl, const struct pool *p, int idx,
> >  		    const struct timespec *now);
> > -int udp_listen(const struct ctx *c, uint8_t pif,
> > -	       const union inany_addr *addr, const char *ifname,
> > -	       in_port_t port);
> > +int udp_listen(const struct ctx *c, uint8_t pif, unsigned rule,
> > +	       const union inany_addr *addr, const char *ifname, in_port_t port);
> >  int udp_init(struct ctx *c);
> >  void udp_update_l2_buf(const unsigned char *eth_d);
> >  
> 
> -- 
> Stefano
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
