From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v3 05/14] fwd: Make space to store listening sockets in forward table
Date: Tue, 13 Jan 2026 16:28:27 +1100
Message-ID: <aWXX-wwEDo5Cx1lN@zatzit>
In-Reply-To: <20260113002622.48f32d54@elisabeth>
On Tue, Jan 13, 2026 at 12:26:22AM +0100, Stefano Brivio wrote:
> On Thu, 8 Jan 2026 13:29:39 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > At present, we don't keep track of the fds for listening sockets (except
> > for "auto" ones). Since the fd is stored in the epoll reference, we didn't
> > need an alternative source of it for the various handlers.
> >
> > However, we're intending to allow dynamic changes to forwarding
> > configuration in future. That means we need a way to enumerate sockets so
> > we can close them on removal of a forward.
> >
> > Extend our forwarding table data structure to make space for all the
> > listening sockets. To avoid allocation, this imposes another limit:
> > we could run out of space for socket fds before we run out of slots
> > for forwarding rules.
> >
> > We don't actually do anything with the allocated space yet. For
> > "auto" forwards it's redundant with existing arrays. We'll fix both
> > of those in later patches.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > fwd.c | 10 +++++++++-
> > fwd.h | 13 +++++++++++++
> > 2 files changed, 22 insertions(+), 1 deletion(-)
> >
> > diff --git a/fwd.c b/fwd.c
> > index 69aca441..f27a4220 100644
> > --- a/fwd.c
> > +++ b/fwd.c
> > @@ -345,6 +345,7 @@ void fwd_rule_add(struct fwd_ports *fwd, uint8_t flags,
> > {
> > /* Flags which can be set from the caller */
> > const uint8_t allowed_flags = FWD_WEAK | FWD_SCAN;
> > + unsigned num = (unsigned)last - first + 1;
> > struct fwd_rule *new;
> > unsigned port;
> >
> > @@ -352,6 +353,8 @@ void fwd_rule_add(struct fwd_ports *fwd, uint8_t flags,
> >
> > if (fwd->count >= ARRAY_SIZE(fwd->rules))
> > die("Too many port forwarding ranges");
> > + if ((fwd->listen_sock_count + num) > ARRAY_SIZE(fwd->listen_socks))
> > + die("Too many listening sockets");
>
> Here, and above: we plan to trigger this from a client at runtime, and
> if there are too many listening sockets, or too many rules/ranges, we
> should fail and report failure (ENOBUFS I guess) instead of quitting.
I'm aware, but for now, these errors are always fatal, so I was going
to defer the error plumbing until later. It's not really any harder,
and this keeps things a bit simpler while we establish the basic
structure of the table.
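
For illustration only, here is a rough sketch of what the deferred error plumbing might look like: the two limit checks return -ENOBUFS (the code suggested above) instead of calling die(), and the table is left untouched on failure. The sketch_* names and the scaled-down limits are hypothetical stand-ins, not the real fwd_rule_add() or its constants, and this is not what any later patch actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, scaled-down stand-ins for the real passt definitions */
#define SKETCH_MAX_RULES	4
#define SKETCH_MAX_SOCKS	16

struct sketch_rule {
	uint16_t first, last;
	int *socks;		/* Points into the table's listen_socks[] */
};

struct sketch_table {
	unsigned count;
	struct sketch_rule rules[SKETCH_MAX_RULES];
	unsigned listen_sock_count;
	int listen_socks[SKETCH_MAX_SOCKS];
};

/* Returns 0 on success, -ENOBUFS if the rule array or the socket array
 * would overflow. Nothing in the table is modified on failure.
 */
static int sketch_rule_add(struct sketch_table *t,
			   uint16_t first, uint16_t last)
{
	struct sketch_rule *new;
	unsigned num, i;

	assert(first <= last);
	num = (unsigned)last - first + 1;

	if (t->count >= SKETCH_MAX_RULES)
		return -ENOBUFS;
	if (t->listen_sock_count + num > SKETCH_MAX_SOCKS)
		return -ENOBUFS;

	new = &t->rules[t->count++];
	new->first = first;
	new->last = last;

	/* Carve this rule's subarray out of the shared socket array */
	new->socks = &t->listen_socks[t->listen_sock_count];
	t->listen_sock_count += num;

	/* No socket is open yet for any port in the range */
	for (i = 0; i < num; i++)
		new->socks[i] = -1;

	return 0;
}
```

Doing both checks before touching the table means a caller that gets -ENOBUFS can report the failure to the client and carry on with the existing configuration.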
> > new = &fwd->rules[fwd->count++];
> > new->flags = flags;
> > @@ -373,8 +376,13 @@ void fwd_rule_add(struct fwd_ports *fwd, uint8_t flags,
> >
> > new->to = to;
> >
> > + new->socks = &fwd->listen_socks[fwd->listen_sock_count];
> > + fwd->listen_sock_count += num;
> > +
> > for (port = new->first; port <= new->last; port++) {
> > - /* Fill in the legacy data structures to match the table */
> > + new->socks[port - new->first] = -1;
>
> It's probably saner to initialise these, but just to confirm my
> understanding: this is not needed as (unlike what you have for rules)
> the array is always compacted and its used length described, right?
It is needed. Later patches use the fd array to determine if a
specific port is _actually_ listening, or just maybe listening. This
matters for FWD_WEAK and FWD_SCAN entries. It's nicer to keep it
consistent for regular mappings too - that allows fwd_sync_one() to
know if it has to do anything or not.
As for compaction: I do intend to keep the subarrays used for each
rule compacted together, although we don't yet allow rules to be
deleted, so there's nothing to be done for that yet. I'm not
compacting within the subarray, and have no plans to do so at present.
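
To make the role of the -1 initialisation concrete, here is a minimal sketch of how a per-rule fd subarray can answer "is this port actually listening?". The demo_* names and the trimmed-down struct are hypothetical; this is not the code from the later patches, just the idea that a slot holds -1 until a socket is really open for that port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified rule: only the fields this sketch needs */
struct demo_rule {
	uint16_t first, last;	/* Port range covered by the rule */
	int *socks;		/* One slot per port, -1 if not listening */
};

/* True if @port, which must lie within @rule, has a listening socket */
static bool demo_port_listening(const struct demo_rule *rule, uint16_t port)
{
	assert(port >= rule->first && port <= rule->last);
	return rule->socks[port - rule->first] >= 0;
}

/* Number of ports in @rule which don't have a socket open */
static unsigned demo_ports_pending(const struct demo_rule *rule)
{
	unsigned port, n = 0;

	/* unsigned counter: a uint16_t would wrap if last == 65535 */
	for (port = rule->first; port <= rule->last; port++)
		if (!demo_port_listening(rule, port))
			n++;

	return n;
}
```

Slots are indexed by (port - first) and never shuffled, which is why nothing is compacted within a rule's subarray: an unused slot in the middle simply stays at -1.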
> > +
> > + /* Fill in the legacy forwarding data structures to match the table */
> > if (!(new->flags & FWD_SCAN))
> > bitmap_set(fwd->map, port);
> > fwd->delta[port] = new->to - new->first;
> > diff --git a/fwd.h b/fwd.h
> > index 94869c2a..3ddcb91d 100644
> > --- a/fwd.h
> > +++ b/fwd.h
> > @@ -23,6 +23,7 @@ bool fwd_port_is_ephemeral(in_port_t port);
> > * @first: First port number to forward
> > * @last: Last port number to forward
> > * @to: Port number to forward port @first to.
> > + * @socks: Array of listening sockets for this entry
> > * @flags: Flag mask
> > * FWD_DUAL_STACK_ANY - match any IPv4 or IPv6 address (@addr should be ::)
> > * FWD_WEAK - Don't give an error if binds fail for some forwards
> > @@ -34,6 +35,7 @@ struct fwd_rule {
> > union inany_addr addr;
> > char ifname[IFNAMSIZ];
> > in_port_t first, last, to;
> > + int *socks;
> > #define FWD_DUAL_STACK_ANY BIT(0)
> > #define FWD_WEAK BIT(1)
> > #define FWD_SCAN BIT(2)
> > @@ -65,6 +67,13 @@ enum fwd_ports_mode {
> >
> > #define PORT_BITMAP_SIZE DIV_ROUND_UP(NUM_PORTS, 8)
> >
> > +/* Maximum number of listening sockets (per pif & protocol)
> > + *
> > + * Rationale: This lets us listen on every port for two addresses (which we need
> > + * for -T auto without SO_BINDTODEVICE), plus a comfortable number of extras.
> > + */
> > +#define MAX_LISTEN_SOCKS (NUM_PORTS * 3)
> > +
> > /**
> > * fwd_ports() - Describes port forwarding for one protocol and direction
> > * @mode: Overall forwarding mode (all, none, auto, specific ports)
> > @@ -74,6 +83,8 @@ enum fwd_ports_mode {
> > * @rules: Array of forwarding rules
> > * @map: Bitmap describing which ports are forwarded
> > * @delta: Offset between the original destination and mapped port number
> > + * @listen_sock_count: Number of entries used in @listen_socks
> > + * @listen_socks: Listening sockets for forwarding
>
> To keep those aligned:
>
> /**
> * fwd_ports() - Describes port forwarding for one protocol and direction
> * @mode: Overall forwarding mode (all, none, auto, some ports)
> * @scan4: /proc/net fd to scan for IPv4 ports when in AUTO mode
> * @scan6: /proc/net fd to scan for IPv6 ports when in AUTO mode
> * @count: Number of forwarding rules
> * @rules: Array of forwarding rules
> * @map: Bitmap describing which ports are forwarded
> * @delta: Offset between original and mapped destination port
> * @listen_sock_count: Number of entries used in @listen_socks
> * @listen_socks: Listening sockets for forwarding
> */
Done. I'd also be very open to more succinct names for these fields,
but they haven't occurred to me yet.
> > */
> > struct fwd_ports {
> > enum fwd_ports_mode mode;
> > @@ -83,6 +94,8 @@ struct fwd_ports {
> > struct fwd_rule rules[MAX_FWD_RULES];
> > uint8_t map[PORT_BITMAP_SIZE];
> > in_port_t delta[NUM_PORTS];
> > + unsigned listen_sock_count;
> > + int listen_socks[MAX_LISTEN_SOCKS];
> > };
> >
> > #define FWD_PORT_SCAN_INTERVAL 1000 /* ms */
>
> --
> Stefano
>
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson