public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: Jon Maloy <jmaloy@redhat.com>, passt-dev@passt.top
Subject: Re: [PATCH v7 08/13] conf, pasta: Track observed guest IPv4 addresses in unified address array
Date: Mon, 22 Jun 2026 11:46:45 +1000
Message-ID: <ajiUBRLRHMQqGKjv@zatzit>
In-Reply-To: <20260620001040.76c9d2b1@elisabeth>

On Sat, Jun 20, 2026 at 12:10:41AM +0200, Stefano Brivio wrote:
> On Sun, 12 Apr 2026 20:53:14 -0400
> Jon Maloy <jmaloy@redhat.com> wrote:
> 
> > We remove the addr_seen field in struct ip4_ctx and replace it by
> > setting a new CONF_ADDR_OBSERVED flag in the corresponding entry
> > in the unified address array.
> > 
> > The observed IPv4 address is always added at or moved to position 0,
> > increasing chances for a fast lookup.
> > 
> > Signed-off-by: Jon Maloy <jmaloy@redhat.com>
> > 
> > ---
> > v4: - Removed migration protocol update, to be added in later commit
> >     - Allow only one OBSERVED address at a time
> >     - Some other changes based on feedback from David G
> > v5: - Allowing multiple observed IPv4 addresses
> > v6: - Refactored fwd_set_addr(), notably:
> >       o Limited number of allowed observed addresses to four per protocol
> >       o I kept the memmove() calls, since I find no more elegant way to
> >         do this. Performance cost should be minimal, since these parts
> >         of the code will execute only very exceptionally. Note that
> >         removing the 'oldest' entry implicitly means removing the least
> >         used one, since the latter will migrate to the highest position
> >         after a few iterations of remove/add.
> >       o Also kept the prefix_len update. Not sure about this, but I
> >         cannot see how the current approach can cause any harm.
> >     - Other changes suggested by David G, notably reversing some
> >       residues after an accidental merge/re-split with the next
> >       commit.
> > v7: - Changed fwd_set_addr() to only accept keeping one observed-only
> >       address per protocol, as suggested by David.
> 
> Sorry, I just spotted this in David's review of v6. Actually, I
> think that keeping track of a few multiple observed addresses
> (especially with different scope) might be convenient and it would
> already be useful here together with 4/13 to avoid resolving via ARP
> any of a few addresses recently seen from the guest.

So.. not resolving ARPs is the one thing where we could actually use
multiple guest observed addresses - mostly we use it for directing
traffic to the guest, for which we need a single address.

But.. I feel like switching the ARP resolution from "everything
except" to "only these" would be a better solution.  That also lets
the guest move to a brand new unused address without getting bogus DAD
failures.
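
To make the difference between the two policies concrete, here's a
rough sketch. It is not the code in arp.c: the function names and the
plain uint32_t address lists are made up for illustration. With
"everything except", a probe for an address the guest hasn't used yet
gets answered, which the guest reads as a duplicate. With "only these",
the same probe goes unanswered.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is in list[0..n) */
static bool addr_in_list(uint32_t addr, const uint32_t *list, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (list[i] == addr)
			return true;
	}
	return false;
}

/* "Everything except": answer unless the target is a known guest address */
static bool arp_reply_except(uint32_t target, const uint32_t *guest, size_t n)
{
	return !addr_in_list(target, guest, n);
}

/* "Only these": answer only for addresses we explicitly resolve */
static bool arp_reply_only(uint32_t target, const uint32_t *ours, size_t n)
{
	return addr_in_list(target, ours, n);
}
```

In the second variant there's no need to know which addresses the guest
holds, so nothing has to be tracked just to keep ARP quiet.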

> While ARP probes for duplicate addresses are usually coming from DHCP
> clients, there might be other mechanisms to assign addresses using those.

> Besides, I think David's suggestion was to keep a single observed
> address per IP version _and_ scope, not just one per IP version.

Ah, yes, that is a good point.  That was absolutely my intention.
There's no question we need separate observed addresses for IPv6
link-local and IPv6 global.  (or at least no question as long as we
accept the need for observed addresses at all, which is a different
discussion).
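
Roughly what I have in mind, as a standalone sketch rather than
something that fits the unified array as is: one slot per IP version
and scope, so a new global address never displaces the link-local one.
The obs_* names are hypothetical, IPv4 is assumed to be stored
v4-mapped, and IPv4 gets a single slot here purely to keep the example
short.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical slots: one observed address per IP version and scope */
enum obs_slot { OBS_V4, OBS_V6_LL, OBS_V6_GLOBAL, OBS_SLOT_MAX };

struct obs_table {
	struct in6_addr addr[OBS_SLOT_MAX];	/* IPv4 stored as v4-mapped */
	bool valid[OBS_SLOT_MAX];
};

/* Pick the slot an address belongs to, from its version and scope */
static enum obs_slot obs_slot_of(const struct in6_addr *a)
{
	if (IN6_IS_ADDR_V4MAPPED(a))
		return OBS_V4;
	if (IN6_IS_ADDR_LINKLOCAL(a))
		return OBS_V6_LL;
	return OBS_V6_GLOBAL;
}

/* Record an observed address, replacing only the entry in its own slot */
static void obs_record(struct obs_table *t, const struct in6_addr *a)
{
	enum obs_slot s = obs_slot_of(a);

	t->addr[s] = *a;
	t->valid[s] = true;
}
```

Seeing fe80::1 and then 2001:db8::1 fills two different slots, whereas
a single slot per version would flip between them.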

> If we
> just keep one per version, regardless of the scope, we'll now cycle
> between one link-local and one global unicast address (in most cases),
> right?
> 
> >     - Eliminated redundant tap_check_src_addr4() call level.
> >     - I keep fwd_select_addr() for the same pragmatic reason it was
> >       introduced: to avoid ugly, deeply indented code that tends
> >       to wrap across several lines.
> > ---
> >  conf.c    |   6 ---
> >  fwd.c     | 124 +++++++++++++++++++++++++++++++++++++++++++++++-------
> >  fwd.h     |   4 ++
> >  migrate.c |  17 +++++++-
> >  passt.h   |   6 +--
> >  tap.c     |   8 +++-
> >  6 files changed, 136 insertions(+), 29 deletions(-)
> > 
> > diff --git a/conf.c b/conf.c
> > index 924ade2..f503d0f 100644
> > --- a/conf.c
> > +++ b/conf.c
> > @@ -767,13 +767,8 @@ static unsigned int conf_ip4(struct ctx *c, unsigned int ifi)
> >  		}
> >  		if (!rc || !fwd_get_addr(c, AF_INET, 0, 0))
> >  			return 0;
> > -
> > -		a = fwd_get_addr(c, AF_INET, CONF_ADDR_HOST, 0);
> >  	}
> >  
> > -	if (a)
> > -		ip4->addr_seen = *inany_v4(&a->addr);
> > -
> >  	ip4->our_tap_addr = ip4->guest_gw;
> >  
> >  	return ifi;
> > @@ -787,7 +782,6 @@ static void conf_ip4_local(struct ctx *c)
> >  {
> >  	struct ip4_ctx *ip4 = &c->ip4;
> >  
> > -	ip4->addr_seen = IP4_LL_GUEST_ADDR;
> >  	ip4->our_tap_addr = ip4->guest_gw = IP4_LL_GUEST_GW;
> >  	ip4->no_copy_addrs = ip4->no_copy_routes = true;
> >  	fwd_set_addr(c, &inany_from_v4(IP4_LL_GUEST_ADDR),
> > diff --git a/fwd.c b/fwd.c
> > index d3f576a..8c7bf91 100644
> > --- a/fwd.c
> > +++ b/fwd.c
> > @@ -28,6 +28,7 @@
> >  #include "inany.h"
> >  #include "fwd.h"
> >  #include "passt.h"
> > +#include "conf.h"
> >  #include "lineread.h"
> >  #include "flow_table.h"
> >  #include "netlink.h"
> > @@ -260,21 +261,68 @@ void fwd_neigh_table_init(const struct ctx *c)
> >  void fwd_set_addr(struct ctx *c, const union inany_addr *addr,
> >  		  uint8_t flags, int prefix_len)
> >  {
> > -	struct guest_addr *a;
> > +	struct guest_addr *a, *arr = &c->addrs[0], *rm = NULL;
> > +	int count = c->addr_count;
> > +	int af_cnt = 0;
> >  
> > -	for_each_addr(a, c->addrs, c->addr_count, inany_af(addr)) {
> > -		goto found;
> > +	for_each_addr(a, c->addrs, c->addr_count, AF_UNSPEC) {
> > +		if (!inany_equals(&a->addr, addr))
> > +			continue;
> > +
> > +		/* Adjust and update prefix_len if provided and applicable */
> > +		if (prefix_len && !(a->flags & CONF_ADDR_USER))
> > +			a->prefix_len = inany_prefix_len(addr, prefix_len);
> > +
> > +		/* Nothing more to change */
> > +		if ((a->flags & flags) == flags)
> > +			return;
> > +
> > +		a->flags |= flags;
> > +		if (!(flags & CONF_ADDR_OBSERVED))
> > +			return;
> > +
> > +		/* Observed address moves to position 0: remove, re-add later */
> > +		prefix_len = a->prefix_len;
> > +		memmove(a, a + 1, (&arr[count] - (a + 1)) * sizeof(*a));
> > +		c->addr_count = --count;
> > +		break;
> >  	}
> >  
> > -	if (c->addr_count >= MAX_GUEST_ADDRS)
> > +	if (count >= MAX_GUEST_ADDRS) {
> > +		debug("Address table full, can't add address");
> >  		return;
> > +	}
> >  
> > -	a = &c->addrs[c->addr_count++];
> > -
> > -found:
> > +	/* Add to head or tail, depending on flag */
> > +	if (flags & CONF_ADDR_OBSERVED) {
> > +		a = &arr[0];
> > +		memmove(&arr[1], a, count * sizeof(*a));
> > +	} else {
> > +		a = &arr[count];
> > +	}
> > +	c->addr_count = ++count;
> >  	a->addr = *addr;
> >  	a->prefix_len = inany_prefix_len6(addr, prefix_len);
> >  	a->flags = flags;
> > +
> > +	if (!(flags & CONF_ADDR_OBSERVED))
> > +		return;
> > +
> > +	/* Remove excess observed-only address if more than one */
> > +	for (int i = count - 1; i >= 0; i--) {
> > +		a = &arr[i];
> > +		if (inany_af(&a->addr) != inany_af(addr))
> > +			continue;
> > +		if (a->flags != CONF_ADDR_OBSERVED)
> > +			continue;
> > +		if (!rm)
> > +			rm = a;
> > +		af_cnt++;
> > +	}
> > +	if (af_cnt > 1) {
> > +		memmove(rm, rm + 1, (&arr[count] - (rm + 1)) * sizeof(*rm));
> > +		c->addr_count--;
> > +	}
> >  }
> >  
> >  /**
> > @@ -985,6 +1033,38 @@ static bool is_dns_flow(uint8_t proto, const struct flowside *ini)
> >  		((ini->oport == 53) || (ini->oport == 853));
> >  }
> >  
> > +/**
> > + * fwd_select_addr() - Select address with priority-based search
> > + * @c:		Execution context
> > + * @af:		Address family (AF_INET or AF_INET6)
> > + * @primary:	Primary flags to match (or 0 to skip)
> > + * @secondary:	Secondary flags to match (or 0 to skip)
> > + * @skip:	Flags to exclude from search
> > + *
> > + * Search for address entries in priority order.
> > + *
> > + * Return: pointer to selected address entry, or NULL if none found
> > + */
> > +const struct guest_addr *fwd_select_addr(const struct ctx *c, int af,
> > +					 int primary, int secondary, int skip)
> > +{
> > +	const struct guest_addr *a;
> > +
> > +	if (primary) {
> > +		a = fwd_get_addr(c, af, primary, skip);
> > +		if (a)
> > +			return a;
> > +	}
> > +
> > +	if (secondary) {
> > +		a = fwd_get_addr(c, af, secondary, skip);
> > +		if (a)
> > +			return a;
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> >  /**
> >   * fwd_guest_accessible() - Is address guest-accessible
> >   * @c:		Execution context
> > @@ -1014,11 +1094,6 @@ static bool fwd_guest_accessible(const struct ctx *c,
> >  		if (inany_equals(addr, &a->addr))
> >  			return false;
> >  	}
> > -	/* Also check addr_seen: it tracks the address the guest is actually
> > -	 * using, which may differ from configured addresses.
> > -	 */
> > -	if (inany_equals4(addr, &c->ip4.addr_seen))
> > -		return false;
> >  
> >  	/* For IPv6, addr_seen starts unspecified, because we don't know what LL
> >  	 * address the guest will take until we see it.  Only check against it
> > @@ -1214,10 +1289,20 @@ uint8_t fwd_nat_from_host(const struct ctx *c,
> >  		 * match.
> >  		 */
> >  		if (inany_v4(&ini->eaddr)) {
> > -			if (c->host_lo_to_ns_lo)
> > +			if (c->host_lo_to_ns_lo) {
> >  				tgt->eaddr = inany_loopback4;
> > -			else
> > -				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> > +			} else {
> > +				const struct guest_addr *a;
> > +
> > +				a = fwd_select_addr(c, AF_INET,
> > +						    CONF_ADDR_OBSERVED,
> > +						    CONF_ADDR_USER |
> > +						    CONF_ADDR_HOST, 0);
> > +				if (!a)
> > +					return PIF_NONE;
> > +
> > +				tgt->eaddr = a->addr;
> > +			}
> >  			tgt->oaddr = inany_any4;
> >  		} else {
> >  			if (c->host_lo_to_ns_lo)
> > @@ -1252,7 +1337,14 @@ uint8_t fwd_nat_from_host(const struct ctx *c,
> >  	tgt->oport = ini->eport;
> >  
> >  	if (inany_v4(&tgt->oaddr)) {
> > -		tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> > +		const struct guest_addr *a;
> > +
> > +		a = fwd_select_addr(c, AF_INET, CONF_ADDR_OBSERVED,
> > +				    CONF_ADDR_USER | CONF_ADDR_HOST, 0);
> > +		if (!a)
> > +			return PIF_NONE;
> > +
> > +		tgt->eaddr = a->addr;
> >  	} else {
> >  		if (inany_is_linklocal6(&tgt->oaddr))
> >  			tgt->eaddr.a6 = c->ip6.addr_ll_seen;
> > diff --git a/fwd.h b/fwd.h
> > index c5a1068..9893856 100644
> > --- a/fwd.h
> > +++ b/fwd.h
> > @@ -25,6 +25,10 @@ void fwd_probe_ephemeral(void);
> >  bool fwd_port_is_ephemeral(in_port_t port);
> >  const struct guest_addr *fwd_get_addr(const struct ctx *c, sa_family_t af,
> >  				      uint8_t incl, uint8_t excl);
> > +const struct guest_addr *fwd_select_addr(const struct ctx *c, int af,
> > +					 int primary, int secondary, int skip);
> > +void fwd_set_addr(struct ctx *c, const union inany_addr *addr,
> > +		  uint8_t flags, int prefix_len);
> >  
> >  /**
> >   * struct fwd_rule - Forwarding rule governing a range of ports
> > diff --git a/migrate.c b/migrate.c
> > index 1e8858a..1e02720 100644
> > --- a/migrate.c
> > +++ b/migrate.c
> > @@ -18,6 +18,8 @@
> >  #include "util.h"
> >  #include "ip.h"
> >  #include "passt.h"
> > +#include "conf.h"
> > +#include "fwd.h"
> >  #include "inany.h"
> >  #include "flow.h"
> >  #include "flow_table.h"
> > @@ -57,11 +59,18 @@ static int seen_addrs_source_v2(struct ctx *c,
> >  	struct migrate_seen_addrs_v2 addrs = {
> >  		.addr6 = c->ip6.addr_seen,
> >  		.addr6_ll = c->ip6.addr_ll_seen,
> > -		.addr4 = c->ip4.addr_seen,
> >  	};
> > +	const struct guest_addr *a;
> >  
> >  	(void)stage;
> >  
> > +	/* IPv4 observed address, with fallback to configured address */
> > +	a = fwd_select_addr(c, AF_INET, CONF_ADDR_OBSERVED,
> > +			    CONF_ADDR_USER | CONF_ADDR_HOST,
> > +			    CONF_ADDR_LINKLOCAL);
> > +	if (a)
> > +		addrs.addr4 = *inany_v4(&a->addr);
> > +
> >  	memcpy(addrs.mac, c->guest_mac, sizeof(addrs.mac));
> >  
> >  	if (write_all_buf(fd, &addrs, sizeof(addrs)))
> > @@ -90,7 +99,11 @@ static int seen_addrs_target_v2(struct ctx *c,
> >  
> >  	c->ip6.addr_seen = addrs.addr6;
> >  	c->ip6.addr_ll_seen = addrs.addr6_ll;
> > -	c->ip4.addr_seen = addrs.addr4;
> > +
> > +	if (addrs.addr4.s_addr)
> > +		fwd_set_addr(c, &inany_from_v4(addrs.addr4),
> > +			     CONF_ADDR_OBSERVED, 0);
> > +
> >  	memcpy(c->guest_mac, addrs.mac, sizeof(c->guest_mac));
> >  
> >  	return 0;
> > diff --git a/passt.h b/passt.h
> > index f75656d..5da1d55 100644
> > --- a/passt.h
> > +++ b/passt.h
> > @@ -64,8 +64,9 @@ enum passt_modes {
> >  	MODE_VU,
> >  };
> >  
> > -/* Maximum number of addresses in context address array */
> > +/* Limits on number of addresses in context address array */
> >  #define MAX_GUEST_ADDRS		32
> > +#define MAX_OBSERVED_ADDRS	4
> >  
> >  /**
> >   * struct guest_addr - Unified IPv4/IPv6 address entry
> > @@ -81,11 +82,11 @@ struct guest_addr {
> >  #define CONF_ADDR_HOST		BIT(1)		/* From host interface */
> >  #define CONF_ADDR_GENERATED	BIT(2)		/* Generated by PASST/PASTA */
> >  #define CONF_ADDR_LINKLOCAL	BIT(3)		/* Link-local address */
> > +#define CONF_ADDR_OBSERVED	BIT(4)		/* Seen in guest traffic */
> >  };
> >  
> >  /**
> >   * struct ip4_ctx - IPv4 execution context
> > - * @addr_seen:		Latest IPv4 address seen as source from tap
> >   * @guest_gw:		IPv4 gateway as seen by the guest
> >   * @map_host_loopback:	Outbound connections to this address are NATted to the
> >   *                      host's 127.0.0.1
> > @@ -101,7 +102,6 @@ struct guest_addr {
> >   * @no_copy_addrs:	Don't copy all addresses when configuring namespace
> >   */
> >  struct ip4_ctx {
> > -	struct in_addr addr_seen;
> >  	struct in_addr guest_gw;
> >  	struct in_addr map_host_loopback;
> >  	struct in_addr map_guest_addr;
> > diff --git a/tap.c b/tap.c
> > index eb93f74..7f04e12 100644
> > --- a/tap.c
> > +++ b/tap.c
> > @@ -47,6 +47,7 @@
> >  #include "ip.h"
> >  #include "iov.h"
> >  #include "passt.h"
> > +#include "fwd.h"
> >  #include "arp.h"
> >  #include "dhcp.h"
> >  #include "ndp.h"
> > @@ -756,9 +757,12 @@ resume:
> >  			continue;
> >  		}
> >  
> > -		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr)
> > -			c->ip4.addr_seen.s_addr = iph->saddr;
> > +		if (iph->saddr) {
> > +			const union inany_addr *addr;
> >  
> > +			addr = &inany_from_v4(*(struct in_addr *) &iph->saddr);
> > +			fwd_set_addr(c, addr, CONF_ADDR_OBSERVED, 0);
> > +		}
> >  		if (!iov_drop_header(&data, hlen))
> >  			continue;
> >  		if (iov_tail_size(&data) != l4len)
> 
> -- 
> Stefano
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
