public inbox for passt-dev@passt.top
* [PATCH] Send an initial ARP request to resolve the guest IP address
@ 2025-09-07 11:01 Volker Diels-Grabsch
  2025-09-08  4:00 ` David Gibson
  0 siblings, 1 reply; 14+ messages in thread
From: Volker Diels-Grabsch @ 2025-09-07 11:01 UTC (permalink / raw)
  To: passt-dev; +Cc: Volker Diels-Grabsch

When restarting passt while QEMU keeps running with a configured
"reconnect-ms" setting, the port forwardings will stop working until
the guest sends some outgoing network traffic.

Reason: Although QEMU reconnects successfully to the unix domain
socket of the new passt process, that one no longer knows the guest's
MAC address and uses instead a broadcast MAC address.  However, this
is ignored by the guest, at least if the guest runs Linux.  Only after
the guest sends some network package on its own initiative, passt will
know the MAC address and will be able to establish forwarded
connections.

This change fixes this issue by sending an ARP request to resolve the
guest's MAC address via its IP address, which we do know, right after
the unix domain socket (re)connection.

The only case where the IP is "wrong" would be if the configuration
changed, and/or on the very first start right after qemu started.  But
in those cases, we just wouldn't get an ARP response, and can't do
anything until we receive the guest's DHCP request - just as before.
In other words, in the worst case an ARP request would be harmless.

Signed-off-by: Volker Diels-Grabsch <v@njh.eu>
---
 arp.c | 34 ++++++++++++++++++++++++++++++++++
 arp.h |  1 +
 tap.c | 15 +++++++++++++--
 3 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/arp.c b/arp.c
index 44677ad..561581a 100644
--- a/arp.c
+++ b/arp.c
@@ -112,3 +112,37 @@ int arp(const struct ctx *c, struct iov_tail *data)
 
 	return 1;
 }
+
+/**
+ * send_initial_arp_req() - Send initial ARP request to retrieve guest MAC address
+ * @c:		Execution context
+ */
+void send_initial_arp_req(const struct ctx *c)
+{
+	struct {
+		struct ethhdr eh;
+		struct arphdr ah;
+		struct arpmsg am;
+	} __attribute__((__packed__)) req;
+
+	/* Ethernet header */
+	req.eh.h_proto = htons(ETH_P_ARP);
+	memcpy(req.eh.h_dest, c->guest_mac, sizeof(req.eh.h_dest));
+	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
+
+	/* ARP header */
+	req.ah.ar_op = htons(ARPOP_REQUEST);
+	req.ah.ar_hrd = htons(ARPHRD_ETHER);
+	req.ah.ar_pro = htons(ETH_P_IP);
+	req.ah.ar_hln = ETH_ALEN;
+	req.ah.ar_pln = 4;
+
+	/* ARP message */
+	memcpy(req.am.sha,	c->our_tap_mac,		sizeof(req.am.sha));
+	memcpy(req.am.sip,	&c->ip4.our_tap_addr,	sizeof(req.am.sip));
+	memcpy(req.am.tha,	c->guest_mac,		sizeof(req.am.tha));
+	memcpy(req.am.tip,	&c->ip4.addr,		sizeof(req.am.tip));
+
+	info("sending initial ARP request to retrieve guest MAC address after reconnect");
+	tap_send_single(c, &req, sizeof(req));
+}
diff --git a/arp.h b/arp.h
index 86bcbf8..5490144 100644
--- a/arp.h
+++ b/arp.h
@@ -21,5 +21,6 @@ struct arpmsg {
 } __attribute__((__packed__));
 
 int arp(const struct ctx *c, struct iov_tail *data);
+void send_initial_arp_req(const struct ctx *c);
 
 #endif /* ARP_H */
diff --git a/tap.c b/tap.c
index 7ba6399..47dedd5 100644
--- a/tap.c
+++ b/tap.c
@@ -1355,6 +1355,16 @@ static void tap_start_connection(const struct ctx *c)
 	ev.events = EPOLLIN | EPOLLRDHUP;
 	ev.data.u64 = ref.u64;
 	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
+
+	switch (c->mode) {
+	case MODE_PASST:
+		send_initial_arp_req(c);
+		break;
+	case MODE_PASTA:
+		break;
+	case MODE_VU:
+		break;
+	}
 }
 
 /**
@@ -1504,8 +1514,9 @@ void tap_backend_init(struct ctx *c)
 		tap_sock_unix_init(c);
 
 		/* In passt mode, we don't know the guest's MAC address until it
-		 * sends us packets.  Use the broadcast address so that our
-		 * first packets will reach it.
+		 * sends us packets (e.g. responds to our initial ARP request).
+		 * Until then, use the broadcast address so that our first
+		 * packets will reach it.
 		 */
 		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
 		break;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] Send an initial ARP request to resolve the guest IP address
  2025-09-07 11:01 [PATCH] Send an initial ARP request to resolve the guest IP address Volker Diels-Grabsch
@ 2025-09-08  4:00 ` David Gibson
  2025-09-08  9:12   ` [PATCH v2] " Volker Diels-Grabsch
  0 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2025-09-08  4:00 UTC (permalink / raw)
  To: Volker Diels-Grabsch; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 5579 bytes --]

On Sun, Sep 07, 2025 at 01:01:08PM +0200, Volker Diels-Grabsch wrote:
> When restarting passt while QEMU keeps running with a configured
> "reconnect-ms" setting, the port forwardings will stop working until
> the guest sends some outgoing network traffic.
> 
> Reason: Although QEMU reconnects successfully to the unix domain
> socket of the new passt process, that one no longer knows the guest's
> MAC address and uses instead a broadcast MAC address.  However, this
> is ignored by the guest, at least if the guest runs Linux.

Huh... I thought Linux would respond to that.  I wonder if there's
some sysctl config or version difference going on here.

> Only after
> the guest sends some network package on its own initiative, passt will

[Aside: Although "packet" and "package" usually mean the same thing in
 English, it's always "network packets" not "network packages" and
 always "software packages" not "software packets"]

> know the MAC address and will be able to establish forwarded
> connections.
> 
> This change fixes this issue by sending an ARP request to resolve the
> guest's MAC address via its IP address, which we do know, right after
> the unix domain socket (re)connection.

Right.  The guest doesn't necessarily use the IP we give it, but it
usually will, so this will work most of the time.

> The only case where the IP is "wrong" would be if the configuration
> changed, and/or on the very first start right after qemu started.  But
> in those cases, we just wouldn't get an ARP response, and can't do
> anything until we receive the guest's DHCP request - just as before.
> In other words, in the worst case an ARP request would be harmless.

Right.  This seems like a good idea to me.  It won't help in all
cases, but it will help in some, and I can't see any harm it could do.

Some comments on the implementation details below.

> Signed-off-by: Volker Diels-Grabsch <v@njh.eu>
> ---
>  arp.c | 34 ++++++++++++++++++++++++++++++++++
>  arp.h |  1 +
>  tap.c | 15 +++++++++++++--
>  3 files changed, 48 insertions(+), 2 deletions(-)
> 
> diff --git a/arp.c b/arp.c
> index 44677ad..561581a 100644
> --- a/arp.c
> +++ b/arp.c
> @@ -112,3 +112,37 @@ int arp(const struct ctx *c, struct iov_tail *data)
>  
>  	return 1;
>  }
> +
> +/**
> + * send_initial_arp_req() - Send initial ARP request to retrieve guest MAC address

Wrap to 80 columns, please.

Also, since this is exposed from this module the name should start
with 'arp_'.

> + * @c:		Execution context
> + */
> +void send_initial_arp_req(const struct ctx *c)
> +{
> +	struct {
> +		struct ethhdr eh;
> +		struct arphdr ah;
> +		struct arpmsg am;
> +	} __attribute__((__packed__)) req;
> +
> +	/* Ethernet header */
> +	req.eh.h_proto = htons(ETH_P_ARP);
> +	memcpy(req.eh.h_dest, c->guest_mac, sizeof(req.eh.h_dest));

At this point we expect guest_mac to be the broadcast address, but it
seems like explicitly using broadcast would be a little more robust if
that changes at some point.

> +	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
> +
> +	/* ARP header */
> +	req.ah.ar_op = htons(ARPOP_REQUEST);
> +	req.ah.ar_hrd = htons(ARPHRD_ETHER);
> +	req.ah.ar_pro = htons(ETH_P_IP);
> +	req.ah.ar_hln = ETH_ALEN;
> +	req.ah.ar_pln = 4;
> +
> +	/* ARP message */
> +	memcpy(req.am.sha,	c->our_tap_mac,		sizeof(req.am.sha));
> +	memcpy(req.am.sip,	&c->ip4.our_tap_addr,	sizeof(req.am.sip));
> +	memcpy(req.am.tha,	c->guest_mac,		sizeof(req.am.tha));
> +	memcpy(req.am.tip,	&c->ip4.addr,		sizeof(req.am.tip));
> +
> +	info("sending initial ARP request to retrieve guest MAC address after reconnect");
> +	tap_send_single(c, &req, sizeof(req));
> +}
> diff --git a/arp.h b/arp.h
> index 86bcbf8..5490144 100644
> --- a/arp.h
> +++ b/arp.h
> @@ -21,5 +21,6 @@ struct arpmsg {
>  } __attribute__((__packed__));
>  
>  int arp(const struct ctx *c, struct iov_tail *data);
> +void send_initial_arp_req(const struct ctx *c);
>  
>  #endif /* ARP_H */
> diff --git a/tap.c b/tap.c
> index 7ba6399..47dedd5 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1355,6 +1355,16 @@ static void tap_start_connection(const struct ctx *c)
>  	ev.events = EPOLLIN | EPOLLRDHUP;
>  	ev.data.u64 = ref.u64;
>  	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
> +
> +	switch (c->mode) {
> +	case MODE_PASST:
> +		send_initial_arp_req(c);
> +		break;
> +	case MODE_PASTA:
> +		break;
> +	case MODE_VU:
> +		break;
> +	}

I don't think we want to make this conditional on MODE_PASST.  I think
MODE_VU could suffer from the same problem.  So could MODE_PASTA,
although it would take a more unusual setup to trigger it.  In any
case, sending it unconditionally should be at worst harmless and is
simpler.

>  }
>  
>  /**
> @@ -1504,8 +1514,9 @@ void tap_backend_init(struct ctx *c)
>  		tap_sock_unix_init(c);
>  
>  		/* In passt mode, we don't know the guest's MAC address until it
> -		 * sends us packets.  Use the broadcast address so that our
> -		 * first packets will reach it.
> +		 * sends us packets (e.g. responds to our initial ARP request).
> +		 * Until then, use the broadcast address so that our first
> +		 * packets will reach it.
>  		 */
>  		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
>  		break;
> -- 
> 2.47.2
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2] Send an initial ARP request to resolve the guest IP address
  2025-09-08  4:00 ` David Gibson
@ 2025-09-08  9:12   ` Volker Diels-Grabsch
  2025-09-08  9:22     ` Volker Diels-Grabsch
  0 siblings, 1 reply; 14+ messages in thread
From: Volker Diels-Grabsch @ 2025-09-08  9:12 UTC (permalink / raw)
  To: passt-dev

Dear David,

Thanks a lot for your timely review.  I applied all of your suggested
improvements, and moreover introduced MAC_BROADCAST, analogous to the
already existing MAC_ZERO, to improve readability.

Regarding the maximum of 80 columns per line, I did fix it at the one
place you asked for, but noticed that there are two other places where
my patch hits that limit.  However, I'm not sure about the expected
code formatting when breaking those lines.  So if those need to be
broken as well, I'd be grateful for coding style suggestions.
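
For illustration only, here is a generic C sketch and not necessarily the
project's preferred style: adjacent string literals are joined by the
compiler, so an over-long message can be split across lines without
changing the resulting string.  The name long_log_message() is
hypothetical.

```c
#include <assert.h>

/* Adjacent literals form a single string: "ab" "cd" is "abcd" plus NUL */
static_assert(sizeof("ab" "cd") == 5, "adjacent literals are concatenated");

/* Hypothetical helper, for illustration: the two literals below are
 * joined at translation time, so each source line stays under 80 columns
 * while the message itself is unchanged.
 */
static const char *long_log_message(void)
{
	return "sending initial ARP request to retrieve"
	       " guest MAC address after reconnect";
}
```

Note the leading space on the second literal: nothing is inserted
between the two parts automatically.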

Best regards,
Volker

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2] Send an initial ARP request to resolve the guest IP address
  2025-09-08  9:12   ` [PATCH v2] " Volker Diels-Grabsch
@ 2025-09-08  9:22     ` Volker Diels-Grabsch
  2025-09-09  2:52       ` David Gibson
  0 siblings, 1 reply; 14+ messages in thread
From: Volker Diels-Grabsch @ 2025-09-08  9:22 UTC (permalink / raw)
  To: passt-dev; +Cc: Volker Diels-Grabsch

When restarting passt while QEMU keeps running with a configured
"reconnect-ms" setting, the port forwardings will stop working until
the guest sends some outgoing network traffic.

Reason: Although QEMU reconnects successfully to the unix domain
socket of the new passt process, that one no longer knows the guest's
MAC address and uses instead a broadcast MAC address.  However, this
is ignored by the guest, at least if the guest runs Linux.  Only after
the guest sends some network package on its own initiative, passt will
know the MAC address and will be able to establish forwarded
connections.

This change fixes this issue by sending an ARP request to resolve the
guest's MAC address via its IP address, which we do know, right after
the unix domain socket (re)connection.

The only case where the IP is "wrong" would be if the configuration
changed, and/or on the very first start right after qemu started.  But
in those cases, we just wouldn't get an ARP response, and can't do
anything until we receive the guest's DHCP request - just as before.
In other words, in the worst case an ARP request would be harmless.

Signed-off-by: Volker Diels-Grabsch <v@njh.eu>
---
 arp.c  | 34 ++++++++++++++++++++++++++++++++++
 arp.h  |  1 +
 tap.c  |  9 ++++++---
 util.h |  1 +
 4 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/arp.c b/arp.c
index 44677ad..c217356 100644
--- a/arp.c
+++ b/arp.c
@@ -112,3 +112,37 @@ int arp(const struct ctx *c, struct iov_tail *data)
 
 	return 1;
 }
+
+/**
+ * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address
+ * @c:		Execution context
+ */
+void arp_send_init_req(const struct ctx *c)
+{
+	struct {
+		struct ethhdr eh;
+		struct arphdr ah;
+		struct arpmsg am;
+	} __attribute__((__packed__)) req;
+
+	/* Ethernet header */
+	req.eh.h_proto = htons(ETH_P_ARP);
+	memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
+	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
+
+	/* ARP header */
+	req.ah.ar_op = htons(ARPOP_REQUEST);
+	req.ah.ar_hrd = htons(ARPHRD_ETHER);
+	req.ah.ar_pro = htons(ETH_P_IP);
+	req.ah.ar_hln = ETH_ALEN;
+	req.ah.ar_pln = 4;
+
+	/* ARP message */
+	memcpy(req.am.sha,	c->our_tap_mac,		sizeof(req.am.sha));
+	memcpy(req.am.sip,	&c->ip4.our_tap_addr,	sizeof(req.am.sip));
+	memcpy(req.am.tha,	MAC_BROADCAST,		sizeof(req.am.tha));
+	memcpy(req.am.tip,	&c->ip4.addr,		sizeof(req.am.tip));
+
+	info("sending initial ARP request to retrieve guest MAC address after reconnect");
+	tap_send_single(c, &req, sizeof(req));
+}
diff --git a/arp.h b/arp.h
index 86bcbf8..d5ad0e1 100644
--- a/arp.h
+++ b/arp.h
@@ -21,5 +21,6 @@ struct arpmsg {
 } __attribute__((__packed__));
 
 int arp(const struct ctx *c, struct iov_tail *data);
+void arp_send_init_req(const struct ctx *c);
 
 #endif /* ARP_H */
diff --git a/tap.c b/tap.c
index 7ba6399..4249353 100644
--- a/tap.c
+++ b/tap.c
@@ -1355,6 +1355,8 @@ static void tap_start_connection(const struct ctx *c)
 	ev.events = EPOLLIN | EPOLLRDHUP;
 	ev.data.u64 = ref.u64;
 	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
+
+	arp_send_init_req(c);
 }
 
 /**
@@ -1504,10 +1506,11 @@ void tap_backend_init(struct ctx *c)
 		tap_sock_unix_init(c);
 
 		/* In passt mode, we don't know the guest's MAC address until it
-		 * sends us packets.  Use the broadcast address so that our
-		 * first packets will reach it.
+		 * sends us packets (e.g. responds to our initial ARP request).
+		 * Until then, use the broadcast address so that our first
+		 * packets will reach it.
 		 */
-		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
+		memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
 		break;
 	}
 
diff --git a/util.h b/util.h
index 2a8c38f..3719f0c 100644
--- a/util.h
+++ b/util.h
@@ -97,6 +97,7 @@ void abort_with_msg(const char *fmt, ...)
 #define FD_PROTO(x, proto)						\
 	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
 
+#define MAC_BROADCAST		((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
 #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
 #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
 
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] Send an initial ARP request to resolve the guest IP address
  2025-09-08  9:22     ` Volker Diels-Grabsch
@ 2025-09-09  2:52       ` David Gibson
  2025-09-09 10:10         ` Volker Diels-Grabsch
  0 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2025-09-09  2:52 UTC (permalink / raw)
  To: Volker Diels-Grabsch; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 5135 bytes --]

On Mon, Sep 08, 2025 at 11:22:44AM +0200, Volker Diels-Grabsch wrote:
> When restarting passt while QEMU keeps running with a configured
> "reconnect-ms" setting, the port forwardings will stop working until
> the guest sends some outgoing network traffic.
> 
> Reason: Although QEMU reconnects successfully to the unix domain
> socket of the new passt process, that one no longer knows the guest's
> MAC address and uses instead a broadcast MAC address.  However, this
> is ignored by the guest, at least if the guest runs Linux.  Only after
> the guest sends some network package on its own initiative, passt will
> know the MAC address and will be able to establish forwarded
> connections.
> 
> This change fixes this issue by sending an ARP request to resolve the
> guest's MAC address via its IP address, which we do know, right after
> the unix domain socket (re)connection.
> 
> The only case where the IP is "wrong" would be if the configuration
> changed, and/or on the very first start right after qemu started.  But
> in those cases, we just wouldn't get an ARP response, and can't do
> anything until we receive the guest's DHCP request - just as before.
> In other words, in the worst case an ARP request would be harmless.
> 
> Signed-off-by: Volker Diels-Grabsch <v@njh.eu>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

Looks good to me.  It would be nice to also send a Neighbour
Discovery request to accomplish the same thing for IPv6-only guests,
but that could be a separate patch.

Note that even with this patch, active TCP connections (and in some
cases UDP flows) will be broken by a passt restart.

> ---
>  arp.c  | 34 ++++++++++++++++++++++++++++++++++
>  arp.h  |  1 +
>  tap.c  |  9 ++++++---
>  util.h |  1 +
>  4 files changed, 42 insertions(+), 3 deletions(-)
> 
> diff --git a/arp.c b/arp.c
> index 44677ad..c217356 100644
> --- a/arp.c
> +++ b/arp.c
> @@ -112,3 +112,37 @@ int arp(const struct ctx *c, struct iov_tail *data)
>  
>  	return 1;
>  }
> +
> +/**
> + * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address
> + * @c:		Execution context
> + */
> +void arp_send_init_req(const struct ctx *c)
> +{
> +	struct {
> +		struct ethhdr eh;
> +		struct arphdr ah;
> +		struct arpmsg am;
> +	} __attribute__((__packed__)) req;
> +
> +	/* Ethernet header */
> +	req.eh.h_proto = htons(ETH_P_ARP);
> +	memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
> +	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
> +
> +	/* ARP header */
> +	req.ah.ar_op = htons(ARPOP_REQUEST);
> +	req.ah.ar_hrd = htons(ARPHRD_ETHER);
> +	req.ah.ar_pro = htons(ETH_P_IP);
> +	req.ah.ar_hln = ETH_ALEN;
> +	req.ah.ar_pln = 4;
> +
> +	/* ARP message */
> +	memcpy(req.am.sha,	c->our_tap_mac,		sizeof(req.am.sha));
> +	memcpy(req.am.sip,	&c->ip4.our_tap_addr,	sizeof(req.am.sip));
> +	memcpy(req.am.tha,	MAC_BROADCAST,		sizeof(req.am.tha));
> +	memcpy(req.am.tip,	&c->ip4.addr,		sizeof(req.am.tip));
> +
> +	info("sending initial ARP request to retrieve guest MAC address after reconnect");
> +	tap_send_single(c, &req, sizeof(req));
> +}
> diff --git a/arp.h b/arp.h
> index 86bcbf8..d5ad0e1 100644
> --- a/arp.h
> +++ b/arp.h
> @@ -21,5 +21,6 @@ struct arpmsg {
>  } __attribute__((__packed__));
>  
>  int arp(const struct ctx *c, struct iov_tail *data);
> +void arp_send_init_req(const struct ctx *c);
>  
>  #endif /* ARP_H */
> diff --git a/tap.c b/tap.c
> index 7ba6399..4249353 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1355,6 +1355,8 @@ static void tap_start_connection(const struct ctx *c)
>  	ev.events = EPOLLIN | EPOLLRDHUP;
>  	ev.data.u64 = ref.u64;
>  	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
> +
> +	arp_send_init_req(c);
>  }
>  
>  /**
> @@ -1504,10 +1506,11 @@ void tap_backend_init(struct ctx *c)
>  		tap_sock_unix_init(c);
>  
>  		/* In passt mode, we don't know the guest's MAC address until it
> -		 * sends us packets.  Use the broadcast address so that our
> -		 * first packets will reach it.
> +		 * sends us packets (e.g. responds to our initial ARP request).
> +		 * Until then, use the broadcast address so that our first
> +		 * packets will reach it.
>  		 */
> -		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
> +		memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
>  		break;
>  	}
>  
> diff --git a/util.h b/util.h
> index 2a8c38f..3719f0c 100644
> --- a/util.h
> +++ b/util.h
> @@ -97,6 +97,7 @@ void abort_with_msg(const char *fmt, ...)
>  #define FD_PROTO(x, proto)						\
>  	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
>  
> +#define MAC_BROADCAST		((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
>  #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
>  #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
>  
> -- 
> 2.47.3
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] Send an initial ARP request to resolve the guest IP address
  2025-09-09  2:52       ` David Gibson
@ 2025-09-09 10:10         ` Volker Diels-Grabsch
  2025-09-09 14:49           ` Volker Diels-Grabsch
  2025-09-09 15:55           ` [PATCH v2] Send an initial ARP " Stefano Brivio
  0 siblings, 2 replies; 14+ messages in thread
From: Volker Diels-Grabsch @ 2025-09-09 10:10 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

David Gibson wrote:
> Looks good to me.  It would be nice to also send a Neighbour
> Discovery request to accomplish the same thing for IPv6-only guests,
> but that could be a separate patch.

Good point!  I'd prefer to do it in the same patch, as I'd also like
to fix another minor detail in this one.

Just for the sake of clarity: I had a look at RFC4861 and there are
many types of Neighbour Discovery requests.  For our purpose, I believe
that we want to send a "Neighbor Solicitation Message" described in
section 4.3:

https://datatracker.ietf.org/doc/html/rfc4861#section-4.3

Do you agree, or did you have something else in mind?
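
For reference, a minimal sketch of the message layout from that
section, assuming an Ethernet link and a single Source Link-Layer
Address option (RFC 4861 section 4.6.1).  The struct and function names
are hypothetical and are not taken from passt's ndp.c.  The checksum is
left at zero here because it has to be computed over the IPv6
pseudo-header by whatever sends the packet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ICMP6_TYPE_NS	135	/* Neighbor Solicitation, RFC 4861, 4.3 */
#define NDP_OPT_SLLA	1	/* Source Link-Layer Address, RFC 4861, 4.6.1 */

/* Hypothetical layout: ICMPv6 NS followed by one SLLA option.
 * The packed attribute is a GCC/Clang extension.
 */
struct ns_sketch {
	uint8_t type;		/* 135 */
	uint8_t code;		/* 0 */
	uint16_t checksum;	/* filled in by the sender */
	uint32_t reserved;	/* must be zero */
	uint8_t target[16];	/* address being resolved */
	uint8_t opt_type;	/* 1 */
	uint8_t opt_len;	/* in units of 8 octets */
	uint8_t opt_lladdr[6];	/* sender's MAC address */
} __attribute__((__packed__));

static_assert(sizeof(struct ns_sketch) == 32, "24-byte NS + 8-byte option");

/* Fill everything except the checksum */
static void ns_sketch_fill(struct ns_sketch *ns, const uint8_t target[16],
			   const uint8_t our_mac[6])
{
	memset(ns, 0, sizeof(*ns));
	ns->type = ICMP6_TYPE_NS;
	memcpy(ns->target, target, sizeof(ns->target));
	ns->opt_type = NDP_OPT_SLLA;
	ns->opt_len = 1;
	memcpy(ns->opt_lladdr, our_mac, sizeof(ns->opt_lladdr));
}
```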

> Note that even with this patch, active TCP connections (and in some
> cases UDP flows) will be broken by a passt restart.

Indeed, that is unavoidable for a user-space tool opening TCP and UDP
connections, I guess, unless "passt" itself is wrapped into another
process or system tool that keeps those connections open.  But let's
not go into that.

If I really needed that level of independence, there is still the
option to use a VPN like Wireguard, and then have a fixed passt
process that is never restarted, and forwards only the wireguard UDP
port.  The service discrimination would then happen on IP rather than
port level, so that (re)configuration of the service assignments to VM
could happen always on-the-fly directly by the Cryptokey Routing table.

(And maybe there are even better options for that use case, this is
just the first simple solution that came to my mind, even though it
would incur some crypto overhead unless you are already using
wireguard anyways.)


Best regards,
Volker

-- 
.---<<<((()))>>>---.
|      [[||]]      |
'---<<<((()))>>>---'

^ permalink raw reply	[flat|nested] 14+ messages in thread

* (no subject)
  2025-09-09 10:10         ` Volker Diels-Grabsch
@ 2025-09-09 14:49           ` Volker Diels-Grabsch
  2025-09-09 14:49             ` [PATCH] Send an initial ARP and NDP request to resolve the guest IP address Volker Diels-Grabsch
  2025-09-09 15:55           ` [PATCH v2] Send an initial ARP " Stefano Brivio
  1 sibling, 1 reply; 14+ messages in thread
From: Volker Diels-Grabsch @ 2025-09-09 14:49 UTC (permalink / raw)
  To: passt-dev; +Cc: Volker Diels-Grabsch

Okay, so here is the improved patch, which now performs NDP in
addition to ARP.  I tested everything locally with QEMU VMs running a
minimal Debian system, and it all worked flawlessly.

Moreover, the log messages are improved, and I added a new log message
"Guest MAC address: ..." that is printed as soon as we discover the
guest's MAC address or notice that it has changed.
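
As a rough sketch of what the formatting for such a log line involves:
the patch itself uses the existing eth_ntop() helper, whereas
mac_to_str() and MAC_STRLEN below are hypothetical stand-ins written
only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAC_STRLEN	18	/* "xx:xx:xx:xx:xx:xx" plus terminating NUL */

static_assert(MAC_STRLEN == 6 * 2 + 5 + 1, "six hex pairs, five colons, NUL");

/* Hypothetical helper: format a 6-byte MAC address into buf as
 * lowercase, colon-separated hex, and return buf for use in a log call.
 */
static const char *mac_to_str(const uint8_t mac[6], char *buf, size_t size)
{
	snprintf(buf, size, "%02x:%02x:%02x:%02x:%02x:%02x",
		 mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
	return buf;
}
```

The caller provides the buffer, so the helper needs no static storage
and can be used from several log statements at once.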

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] Send an initial ARP and NDP request to resolve the guest IP address
  2025-09-09 14:49           ` Volker Diels-Grabsch
@ 2025-09-09 14:49             ` Volker Diels-Grabsch
  2025-09-10  3:32               ` David Gibson
  2025-09-10  9:29               ` Stefano Brivio
  0 siblings, 2 replies; 14+ messages in thread
From: Volker Diels-Grabsch @ 2025-09-09 14:49 UTC (permalink / raw)
  To: passt-dev; +Cc: Volker Diels-Grabsch

When restarting passt while QEMU keeps running with a configured
"reconnect-ms" setting, the port forwardings will stop working until
the guest sends some outgoing network traffic.

Reason: Although QEMU reconnects successfully to the unix domain
socket of the new passt process, that one no longer knows the guest's
MAC address and uses instead the broadcast MAC address.  However, this
is ignored by the guest, at least if the guest runs Linux.  Only after
the guest sends some network package on its own initiative, passt will
know the MAC address and will be able to establish forwarded
connections.

This change fixes this issue by sending an ARP and an NDP request to
resolve the guest's MAC address via its IPv4 and IPv6 address, which
we do know, right after the unix domain socket (re)connection.

The only case where the IP is "wrong" would be if the configuration
changed, or on the very first start right after qemu started.  But in
those cases, we just wouldn't get an ARP/NDP response, and can't do
anything until we receive the guest's DHCP request - just as before.
In other words, in the worst case the ARP/NDP requests would be
harmless.

Signed-off-by: Volker Diels-Grabsch <v@njh.eu>
---
 arp.c  | 33 +++++++++++++++++++++++++++++++++
 arp.h  |  1 +
 ndp.c  | 19 +++++++++++++++++++
 ndp.h  |  1 +
 tap.c  | 16 ++++++++++++----
 util.h |  1 +
 6 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/arp.c b/arp.c
index 44677ad..c1bd63b 100644
--- a/arp.c
+++ b/arp.c
@@ -112,3 +112,36 @@ int arp(const struct ctx *c, struct iov_tail *data)
 
 	return 1;
 }
+
+/**
+ * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address
+ * @c:		Execution context
+ */
+void arp_send_init_req(const struct ctx *c)
+{
+	struct {
+		struct ethhdr eh;
+		struct arphdr ah;
+		struct arpmsg am;
+	} __attribute__((__packed__)) req;
+
+	/* Ethernet header */
+	req.eh.h_proto = htons(ETH_P_ARP);
+	memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
+	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
+
+	/* ARP header */
+	req.ah.ar_op = htons(ARPOP_REQUEST);
+	req.ah.ar_hrd = htons(ARPHRD_ETHER);
+	req.ah.ar_pro = htons(ETH_P_IP);
+	req.ah.ar_hln = ETH_ALEN;
+	req.ah.ar_pln = 4;
+
+	/* ARP message */
+	memcpy(req.am.sha,	c->our_tap_mac,		sizeof(req.am.sha));
+	memcpy(req.am.sip,	&c->ip4.our_tap_addr,	sizeof(req.am.sip));
+	memcpy(req.am.tha,	MAC_BROADCAST,		sizeof(req.am.tha));
+	memcpy(req.am.tip,	&c->ip4.addr,		sizeof(req.am.tip));
+
+	tap_send_single(c, &req, sizeof(req));
+}
diff --git a/arp.h b/arp.h
index 86bcbf8..d5ad0e1 100644
--- a/arp.h
+++ b/arp.h
@@ -21,5 +21,6 @@ struct arpmsg {
 } __attribute__((__packed__));
 
 int arp(const struct ctx *c, struct iov_tail *data);
+void arp_send_init_req(const struct ctx *c);
 
 #endif /* ARP_H */
diff --git a/ndp.c b/ndp.c
index eb090cd..b3bdedb 100644
--- a/ndp.c
+++ b/ndp.c
@@ -438,3 +438,22 @@ void ndp_timer(const struct ctx *c, const struct timespec *now)
 first:
 	next_ra = now->tv_sec + interval;
 }
+
+/**
+ * ndp_send_init_req() - Send initial NDP NS to retrieve guest MAC address
+ * @c:		Execution context
+ */
+void ndp_send_init_req(const struct ctx *c)
+{
+	struct ndp_ns ns = {
+		.ih = {
+			.icmp6_type		= NS,
+			.icmp6_code		= 0,
+			.icmp6_router		= 0, /* Reserved */
+			.icmp6_solicited	= 0, /* Reserved */
+			.icmp6_override		= 0, /* Reserved */
+		},
+		.target_addr = c->ip6.addr
+	};
+	ndp_send(c, &c->ip6.addr, &ns, sizeof(ns));
+}
diff --git a/ndp.h b/ndp.h
index b1dd5e8..781ea86 100644
--- a/ndp.h
+++ b/ndp.h
@@ -11,5 +11,6 @@ struct icmp6hdr;
 int ndp(const struct ctx *c, const struct in6_addr *saddr,
 	struct iov_tail *data);
 void ndp_timer(const struct ctx *c, const struct timespec *now);
+void ndp_send_init_req(const struct ctx *c);
 
 #endif /* NDP_H */
diff --git a/tap.c b/tap.c
index 7ba6399..ea61eae 100644
--- a/tap.c
+++ b/tap.c
@@ -1088,6 +1088,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
 {
 	struct ethhdr eh_storage;
 	const struct ethhdr *eh;
+	char bufmac[ETH_ADDRSTRLEN];
 
 	pcap_iov(data->iov, data->cnt, data->off);
 
@@ -1097,6 +1098,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
 
 	if (memcmp(c->guest_mac, eh->h_source, ETH_ALEN)) {
 		memcpy(c->guest_mac, eh->h_source, ETH_ALEN);
+		info("Guest MAC address: %s", eth_ntop(c->guest_mac, bufmac, sizeof(bufmac)));
 		proto_update_l2_buf(c->guest_mac, NULL);
 	}
 
@@ -1355,6 +1357,11 @@ static void tap_start_connection(const struct ctx *c)
 	ev.events = EPOLLIN | EPOLLRDHUP;
 	ev.data.u64 = ref.u64;
 	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
+
+	info("Sending initial ARP and NDP request to retrieve"
+	     " guest MAC address after reconnect");
+	arp_send_init_req(c);
+	ndp_send_init_req(c);
 }
 
 /**
@@ -1503,11 +1510,12 @@ void tap_backend_init(struct ctx *c)
 	case MODE_PASST:
 		tap_sock_unix_init(c);
 
-		/* In passt mode, we don't know the guest's MAC address until it
-		 * sends us packets.  Use the broadcast address so that our
-		 * first packets will reach it.
+		/* In passt mode, we don't know the guest's MAC address until
+		 * it sends us packets (e.g. responds to our initial ARP or
+		 * NDP request).  Until then, use the broadcast address so
+		 * that our first packets will have a chance to reach it.
 		 */
-		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
+		memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
 		break;
 	}
 
diff --git a/util.h b/util.h
index 2a8c38f..3719f0c 100644
--- a/util.h
+++ b/util.h
@@ -97,6 +97,7 @@ void abort_with_msg(const char *fmt, ...)
 #define FD_PROTO(x, proto)						\
 	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
 
+#define MAC_BROADCAST		((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
 #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
 #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
 
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] Send an initial ARP request to resolve the guest IP address
  2025-09-09 10:10         ` Volker Diels-Grabsch
  2025-09-09 14:49           ` Volker Diels-Grabsch
@ 2025-09-09 15:55           ` Stefano Brivio
  2025-09-10  3:33             ` David Gibson
  1 sibling, 1 reply; 14+ messages in thread
From: Stefano Brivio @ 2025-09-09 15:55 UTC (permalink / raw)
  To: Volker Diels-Grabsch; +Cc: David Gibson, passt-dev

On Tue, 9 Sep 2025 12:10:45 +0200
Volker Diels-Grabsch <v@njh.eu> wrote:

> David Gibson wrote:
> > Looks good to me.  It would be nice to also send a Neighbour
> > Discovery request to accomplish the same thing for IPv6 only guests,
> > but that could be a separate patch.  
> 
> Good point!  I'd prefer to do it in the same patch, as I'd also like
> to fix another minor detail in this one.
> 
> Just for the sake of clarity: I had a look at RFC4861 and there are
> many types of Neighbour Discovery requests.  For our purpose, I believe
> that we want to send a "Neighbor Solicitation Message" described in
> section 4.3:
> 
> https://datatracker.ietf.org/doc/html/rfc4861#section-4.3
> 
> Do you agree, or did you have something else in mind?

I was about to comment on this, but from the new patch you sent, I see
you already figured out it's a Neighbour Solicitation and that we
already have some bits of code for that. :)

> > Note that even with this patch, active TCP connections (and in some
> > cases UDP flows) will be broken by a passt restart.  
> 
> Indeed, that is unavoidable for a user-space tool opening TCP and UDP
> connections, I guess, unless "passt" itself is wrapped into another
> process or system tool that keeps those connections open.  But let's
> not go into that.

Actually, it's not really unavoidable, in the sense that we recently
added (see migrate.c, repair.c, passt-repair.c, and passt-repair(1))
support for migration of live TCP connections triggered by vhost-user
commands:

  https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#migrating-back-end-state

which is based on the TCP_REPAIR socket option in the Linux kernel,
which was in turn added to support a similar feature in CRIU:

  https://criu.org/TCP_connection

and while this was done with KubeVirt in mind:

  https://github.com/kubevirt/enhancements/blob/main/veps/sig-network/passt/passt-migration-proposal.md

that is, migration between two different nodes / hosts, there's nothing
that really prevents migration between two instances of passt via, for
example, load/dump from/to a binary file.

Actually, we initially wanted to add the file option for testing
purposes, but we skipped it eventually and went straight ahead for the
direct implementation.

Some bits of "documentation":

  git log migrate.c
  test/migrate/basic

...yes, a new website with some space for this stuff is in
(infinitesimally slow) progress.

-- 
Stefano


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Send an initial ARP and NDP request to resolve the guest IP address
  2025-09-09 14:49             ` [PATCH] Send an initial ARP and NDP request to resolve the guest IP address Volker Diels-Grabsch
@ 2025-09-10  3:32               ` David Gibson
  2025-09-10  9:29               ` Stefano Brivio
  1 sibling, 0 replies; 14+ messages in thread
From: David Gibson @ 2025-09-10  3:32 UTC (permalink / raw)
  To: Volker Diels-Grabsch; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 4160 bytes --]

On Tue, Sep 09, 2025 at 04:49:20PM +0200, Volker Diels-Grabsch wrote:
> When restarting passt while QEMU keeps running with a configured
> "reconnect-ms" setting, the port forwardings will stop working until
> the guest sends some outgoing network traffic.
> 
> Reason: Although QEMU reconnects successfully to the unix domain
> socket of the new passt process, that one no longer knows the guest's
> MAC address and uses instead the broadcast MAC address.  However, this
> is ignored by the guest, at least if the guest runs Linux.  Only after
> the guest sends some network packet on its own initiative, passt will
> know the MAC address and will be able to establish forwarded
> connections.
> 
> This change fixes this issue by sending an ARP and an NDP request to
> resolve the guest's MAC address via its IPv4 and IPv6 address, which
> we do know, right after the unix domain socket (re)connection.
> 
> The only case where the IP is "wrong" would be if the configuration
> changed, or on the very first start right after qemu started.  But in
> those cases, we just wouldn't get an ARP/NDP response, and can't do
> anything until we receive the guest's DHCP request - just as before.
> In other words, in the worst case the ARP/NDP requests would be
> harmless.
> 
> Signed-off-by: Volker Diels-Grabsch <v@njh.eu>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

Two tiny nits that aren't worth a respin, but maybe Stefano will want
to change on merge:

[snip]
> diff --git a/tap.c b/tap.c
> index 7ba6399..ea61eae 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1088,6 +1088,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
>  {
>  	struct ethhdr eh_storage;
>  	const struct ethhdr *eh;
> +	char bufmac[ETH_ADDRSTRLEN];

We'd generally prefer to move this local to the if block where it's
used.


>  
>  	pcap_iov(data->iov, data->cnt, data->off);
>  
> @@ -1097,6 +1098,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
>  
>  	if (memcmp(c->guest_mac, eh->h_source, ETH_ALEN)) {
>  		memcpy(c->guest_mac, eh->h_source, ETH_ALEN);
> +		info("Guest MAC address: %s", eth_ntop(c->guest_mac, bufmac, sizeof(bufmac)));
>  		proto_update_l2_buf(c->guest_mac, NULL);
>  	}
>  
> @@ -1355,6 +1357,11 @@ static void tap_start_connection(const struct ctx *c)
>  	ev.events = EPOLLIN | EPOLLRDHUP;
>  	ev.data.u64 = ref.u64;
>  	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
> +
> +	info("Sending initial ARP and NDP request to retrieve"
> +	     " guest MAC address after reconnect");

I think it's going to be rare that we care about this, so I'd demote
it to a debug().

> +	arp_send_init_req(c);
> +	ndp_send_init_req(c);
>  }
>  
>  /**
> @@ -1503,11 +1510,12 @@ void tap_backend_init(struct ctx *c)
>  	case MODE_PASST:
>  		tap_sock_unix_init(c);
>  
> -		/* In passt mode, we don't know the guest's MAC address until it
> -		 * sends us packets.  Use the broadcast address so that our
> -		 * first packets will reach it.
> +		/* In passt mode, we don't know the guest's MAC address until
> +		 * it sends us packets (e.g. responds to our initial ARP or
> +		 * NDP request).  Until then, use the broadcast address so
> +		 * that our first packets will have a chance to reach it.
>  		 */
> -		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
> +		memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
>  		break;
>  	}
>  
> diff --git a/util.h b/util.h
> index 2a8c38f..3719f0c 100644
> --- a/util.h
> +++ b/util.h
> @@ -97,6 +97,7 @@ void abort_with_msg(const char *fmt, ...)
>  #define FD_PROTO(x, proto)						\
>  	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
>  
> +#define MAC_BROADCAST		((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
>  #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
>  #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
>  
> -- 
> 2.47.3
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] Send an initial ARP request to resolve the guest IP address
  2025-09-09 15:55           ` [PATCH v2] Send an initial ARP " Stefano Brivio
@ 2025-09-10  3:33             ` David Gibson
  0 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2025-09-10  3:33 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: Volker Diels-Grabsch, passt-dev

[-- Attachment #1: Type: text/plain, Size: 2049 bytes --]

On Tue, Sep 09, 2025 at 05:55:30PM +0200, Stefano Brivio wrote:
> On Tue, 9 Sep 2025 12:10:45 +0200
> Volker Diels-Grabsch <v@njh.eu> wrote:
> 
> > David Gibson wrote:
> > > Looks good to me.  It would be nice to also send a Neighbour
> > > Discovery request to accomplish the same thing for IPv6 only guests,
> > > but that could be a separate patch.  
> > 
> > Good point!  I'd prefer to do it in the same patch, as I'd also like
> > to fix another minor detail in this one.
> > 
> > Just for the sake of clarity: I had a look at RFC4861 and there are
> > many types of Neighbour Discovery requests.  For our purpose, I believe
> > that we want to send a "Neighbor Solicitation Message" described in
> > section 4.3:
> > 
> > https://datatracker.ietf.org/doc/html/rfc4861#section-4.3
> > 
> > Do you agree, or did you have something else in mind?
> 
> I was about to comment on this, but from the new patch you sent, I see
> you already figured out it's a Neighbour Solicitation and that we
> already have some bits of code for that. :)
> 
> > > Note that even with this patch, active TCP connections (and in some
> > > cases UDP flows) will be broken by a passt restart.  
> > 
> > Indeed, that is unavoidable for a user-space tool opening TCP and UDP
> > connections, I guess, unless "passt" itself is wrapped into another
> > process or system tool that keeps those connections open.  But let's
> > not go into that.
> 
> Actually, it's not really unavoidable, in the sense that we recently
> added (see migrate.c, repair.c, passt-repair.c, and passt-repair(1))
> support for migration of live TCP connections triggered by vhost-user
> commands:

Right, but doing this requires knowing in advance, support from qemu
and a bunch of infrastructure.  It's not going to work if you just
kill and restart passt.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Send an initial ARP and NDP request to resolve the guest IP address
  2025-09-09 14:49             ` [PATCH] Send an initial ARP and NDP request to resolve the guest IP address Volker Diels-Grabsch
  2025-09-10  3:32               ` David Gibson
@ 2025-09-10  9:29               ` Stefano Brivio
  2025-09-10 10:33                 ` Volker Diels-Grabsch
  1 sibling, 1 reply; 14+ messages in thread
From: Stefano Brivio @ 2025-09-10  9:29 UTC (permalink / raw)
  To: Volker Diels-Grabsch; +Cc: passt-dev, David Gibson

On Tue,  9 Sep 2025 16:49:20 +0200
Volker Diels-Grabsch <v@njh.eu> wrote:

> When restarting passt while QEMU keeps running with a configured
> "reconnect-ms" setting, the port forwardings will stop working until
> the guest sends some outgoing network traffic.
> 
> Reason: Although QEMU reconnects successfully to the unix domain
> socket of the new passt process, that one no longer knows the guest's
> MAC address and uses instead the broadcast MAC address.  However, this
> is ignored by the guest, at least if the guest runs Linux.  Only after
> the guest sends some network packet on its own initiative, passt will
> know the MAC address and will be able to establish forwarded
> connections.
> 
> This change fixes this issue by sending an ARP and an NDP request to
> resolve the guest's MAC address via its IPv4 and IPv6 address, which
> we do know, right after the unix domain socket (re)connection.
> 
> The only case where the IP is "wrong" would be if the configuration
> changed, or on the very first start right after qemu started.  But in
> those cases, we just wouldn't get an ARP/NDP response, and can't do
> anything until we receive the guest's DHCP request - just as before.
> In other words, in the worst case the ARP/NDP requests would be
> harmless.

Thanks for the implementation, this looks like a small but quite
relevant feature we missed until now. I have a couple of comments on
top of David's ones:

> Signed-off-by: Volker Diels-Grabsch <v@njh.eu>
> ---
>  arp.c  | 33 +++++++++++++++++++++++++++++++++
>  arp.h  |  1 +
>  ndp.c  | 19 +++++++++++++++++++
>  ndp.h  |  1 +
>  tap.c  | 16 ++++++++++++----
>  util.h |  1 +
>  6 files changed, 67 insertions(+), 4 deletions(-)
> 
> diff --git a/arp.c b/arp.c
> index 44677ad..c1bd63b 100644
> --- a/arp.c
> +++ b/arp.c
> @@ -112,3 +112,36 @@ int arp(const struct ctx *c, struct iov_tail *data)
>  
>  	return 1;
>  }
> +
> +/**
> + * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address
> + * @c:		Execution context
> + */
> +void arp_send_init_req(const struct ctx *c)
> +{
> +	struct {
> +		struct ethhdr eh;
> +		struct arphdr ah;
> +		struct arpmsg am;
> +	} __attribute__((__packed__)) req;
> +
> +	/* Ethernet header */
> +	req.eh.h_proto = htons(ETH_P_ARP);
> +	memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
> +	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
> +
> +	/* ARP header */
> +	req.ah.ar_op = htons(ARPOP_REQUEST);
> +	req.ah.ar_hrd = htons(ARPHRD_ETHER);
> +	req.ah.ar_pro = htons(ETH_P_IP);
> +	req.ah.ar_hln = ETH_ALEN;
> +	req.ah.ar_pln = 4;
> +
> +	/* ARP message */
> +	memcpy(req.am.sha,	c->our_tap_mac,		sizeof(req.am.sha));
> +	memcpy(req.am.sip,	&c->ip4.our_tap_addr,	sizeof(req.am.sip));
> +	memcpy(req.am.tha,	MAC_BROADCAST,		sizeof(req.am.tha));
> +	memcpy(req.am.tip,	&c->ip4.addr,		sizeof(req.am.tip));
> +
> +	tap_send_single(c, &req, sizeof(req));
> +}
> diff --git a/arp.h b/arp.h
> index 86bcbf8..d5ad0e1 100644
> --- a/arp.h
> +++ b/arp.h
> @@ -21,5 +21,6 @@ struct arpmsg {
>  } __attribute__((__packed__));
>  
>  int arp(const struct ctx *c, struct iov_tail *data);
> +void arp_send_init_req(const struct ctx *c);
>  
>  #endif /* ARP_H */
> diff --git a/ndp.c b/ndp.c
> index eb090cd..b3bdedb 100644
> --- a/ndp.c
> +++ b/ndp.c
> @@ -438,3 +438,22 @@ void ndp_timer(const struct ctx *c, const struct timespec *now)
>  first:
>  	next_ra = now->tv_sec + interval;
>  }
> +
> +/**
> + * ndp_send_init_req() - Send initial NDP NS to retrieve guest MAC address
> + * @c:		Execution context
> + */
> +void ndp_send_init_req(const struct ctx *c)
> +{
> +	struct ndp_ns ns = {
> +		.ih = {
> +			.icmp6_type		= NS,
> +			.icmp6_code		= 0,
> +			.icmp6_router		= 0, /* Reserved */
> +			.icmp6_solicited	= 0, /* Reserved */
> +			.icmp6_override		= 0, /* Reserved */
> +		},
> +		.target_addr = c->ip6.addr
> +	};
> +	ndp_send(c, &c->ip6.addr, &ns, sizeof(ns));
> +}
> diff --git a/ndp.h b/ndp.h
> index b1dd5e8..781ea86 100644
> --- a/ndp.h
> +++ b/ndp.h
> @@ -11,5 +11,6 @@ struct icmp6hdr;
>  int ndp(const struct ctx *c, const struct in6_addr *saddr,
>  	struct iov_tail *data);
>  void ndp_timer(const struct ctx *c, const struct timespec *now);
> +void ndp_send_init_req(const struct ctx *c);
>  
>  #endif /* NDP_H */
> diff --git a/tap.c b/tap.c
> index 7ba6399..ea61eae 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1088,6 +1088,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
>  {
>  	struct ethhdr eh_storage;
>  	const struct ethhdr *eh;
> +	char bufmac[ETH_ADDRSTRLEN];
>  
>  	pcap_iov(data->iov, data->cnt, data->off);
>  
> @@ -1097,6 +1098,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
>  
>  	if (memcmp(c->guest_mac, eh->h_source, ETH_ALEN)) {
>  		memcpy(c->guest_mac, eh->h_source, ETH_ALEN);
> +		info("Guest MAC address: %s", eth_ntop(c->guest_mac, bufmac, sizeof(bufmac)));
>  		proto_update_l2_buf(c->guest_mac, NULL);
>  	}
>  
> @@ -1355,6 +1357,11 @@ static void tap_start_connection(const struct ctx *c)
>  	ev.events = EPOLLIN | EPOLLRDHUP;
>  	ev.data.u64 = ref.u64;
>  	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
> +
> +	info("Sending initial ARP and NDP request to retrieve"
> +	     " guest MAC address after reconnect");
> +	arp_send_init_req(c);

This should be conditional on whether we have IPv4 support enabled or
not, and the check would need to be analogous to the one from
tap4_handler() (sorry, it's a bit hidden):

	if (!c->ifi4 || ...)
		return ...;

> +	ndp_send_init_req(c);

And this should only happen if IPv6 is enabled, see tap6_handler():

	if (!c->ifi6 || ...)
		return ...;

and also, arguably, iff NDP support is not disabled by means of
--no-ndp (c->no_ndp).

Strictly speaking, we could send this anyway and still fit the current
documentation of --no-ndp:

       --no-ndp
              Disable NDP responses. NDP messages coming from guest or  target
              namespace will be ignored.

but this would make --no-ndp a misnomer, and given that we'll ignore
neighbour advertisements, it makes no sense to send a solicitation
anyway.

All in all, I would just not do this on c->no_ndp. If you can think
of a terse way of updating the man page to reflect this, that would
be appreciated, but I think it's also fine like it is.

By the way, we'll also ignore responses on --no-icmp. I just realised
that the man page is currently inaccurate, because it refers to echo
messages only, but in tap6_handler() we have:

		if (proto == IPPROTO_ICMPV6) {
			...

			if (c->no_icmp)
				continue;

			...

			if (ndp(c, saddr, &ndp_data))
				continue;

			...
		}

So I think we should update the man page to mention that --no-icmp
means no ICMP and no ICMPv6, and also skip sending the NDP solicitation
in that case.

Or update the code to reflect what the man page says, but then the
option could be considered a misnomer, so I wouldn't go this way.
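
The gating described above could be sketched roughly as follows. This is
only an illustration under stated assumptions: struct ctx_sketch and the
two want_init_*() helpers are hypothetical names, not existing passt
code, and the struct only mirrors the ifi4, ifi6, no_ndp and no_icmp
fields mentioned in this review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct ctx fields discussed in this
 * review; the real struct ctx in passt has many more members.
 */
struct ctx_sketch {
	int ifi4;	/* 0 if IPv4 support is disabled */
	int ifi6;	/* 0 if IPv6 support is disabled */
	bool no_ndp;	/* set by --no-ndp */
	bool no_icmp;	/* set by --no-icmp */
};

/* An initial ARP request only makes sense if IPv4 is enabled */
static bool want_init_arp(const struct ctx_sketch *c)
{
	assert(c != NULL);

	return c->ifi4 != 0;
}

/* An initial neighbour solicitation only makes sense if IPv6 is enabled
 * and the resulting advertisement would not be ignored anyway, that is,
 * neither --no-ndp nor --no-icmp was given.
 */
static bool want_init_ndp(const struct ctx_sketch *c)
{
	assert(c != NULL);

	return c->ifi6 != 0 && !c->no_ndp && !c->no_icmp;
}
```

With helpers along these lines, the call site would send each request
only if the matching check returns true.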

>  }
>  
>  /**
> @@ -1503,11 +1510,12 @@ void tap_backend_init(struct ctx *c)
>  	case MODE_PASST:
>  		tap_sock_unix_init(c);
>  
> -		/* In passt mode, we don't know the guest's MAC address until it
> -		 * sends us packets.  Use the broadcast address so that our
> -		 * first packets will reach it.
> +		/* In passt mode, we don't know the guest's MAC address until
> +		 * it sends us packets (e.g. responds to our initial ARP or

I don't think the response is an example, so I wouldn't use "e.g."
here, rather "i.e." / "that is", if that's the expected behaviour.

> +		 * NDP request).  Until then, use the broadcast address so
> +		 * that our first packets will have a chance to reach it.
>  		 */
> -		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
> +		memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
>  		break;
>  	}
>  
> diff --git a/util.h b/util.h
> index 2a8c38f..3719f0c 100644
> --- a/util.h
> +++ b/util.h
> @@ -97,6 +97,7 @@ void abort_with_msg(const char *fmt, ...)
>  #define FD_PROTO(x, proto)						\
>  	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
>  
> +#define MAC_BROADCAST		((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })

This can be easily wrapped to fit 80 columns without otherwise
affecting readability, see examples just above and below:

#define MAC_BROADCAST							\
	((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
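
As a quick sanity check that the memcpy() from the wrapped macro is
equivalent to the old memset(), here is a minimal standalone sketch.
ETH_ALEN is redefined locally so that it builds without Linux headers.
MAC_IS_BROADCAST() and mac_set_broadcast() are hypothetical helpers,
modelled on MAC_IS_ZERO(), and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* same value as in <linux/if_ether.h> */

#define MAC_BROADCAST							\
	((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })

/* Hypothetical helper, modelled on MAC_IS_ZERO() from util.h */
#define MAC_IS_BROADCAST(addr)						\
	(!memcmp((addr), MAC_BROADCAST, ETH_ALEN))

/* Hypothetical helper: same effect as memset(mac, 0xff, ETH_ALEN) */
static void mac_set_broadcast(uint8_t mac[ETH_ALEN])
{
	assert(mac != NULL);

	memcpy(mac, MAC_BROADCAST, ETH_ALEN);
}
```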

>  #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
>  #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
>  

The rest looks good to me!

-- 
Stefano


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Send an initial ARP and NDP request to resolve the guest IP address
  2025-09-10  9:29               ` Stefano Brivio
@ 2025-09-10 10:33                 ` Volker Diels-Grabsch
  2025-09-10 14:01                   ` Stefano Brivio
  0 siblings, 1 reply; 14+ messages in thread
From: Volker Diels-Grabsch @ 2025-09-10 10:33 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, David Gibson

Dear Stefano,

Thanks for your timely review. I agree with almost all of it.

Just a small clarification on this one:

Stefano Brivio wrote:
> > @@ -1503,11 +1510,12 @@ void tap_backend_init(struct ctx *c)
> >  	case MODE_PASST:
> >  		tap_sock_unix_init(c);
> >  
> > -		/* In passt mode, we don't know the guest's MAC address until it
> > -		 * sends us packets.  Use the broadcast address so that our
> > -		 * first packets will reach it.
> > +		/* In passt mode, we don't know the guest's MAC address until
> > +		 * it sends us packets (e.g. responds to our initial ARP or
> 
> I don't think the response is an example, so I wouldn't use "e.g."
> here, rather "i.e." / "that is", if that's the expected behaviour.

The reason for using "e.g." is the following: There is still the
"usual" case where the passt client (QEMU) was freshly started
together with passt.

In that case, it *will not* respond to either our ARP or our NDP request,
simply because it won't recognize the IPv4/6 addresses, because it
doesn't yet have any.  In that situation, we'll learn about the guest's
MAC address only after it sends a DHCP request, or NDP, or similar to
us, on its own initiative - not as a response to anything we might
have sent.

So I'd propose to either keep the "e.g." wording, or to extend the
comment by enumerating all possible cases.

(Regarding the latter, I don't feel confident enough to be able to
really enumerate them all.  In addition to DHCP, NDP NS and DHCPv6, a
sufficiently strange client network stack might even broadcast nonsense
packets to us, which we might not process further, but still learn the
guest's MAC address from.)
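
To illustrate why any frame is enough, here is a minimal standalone
sketch of the learning step, loosely modelled on the memcmp()/memcpy()
pair in tap_add_packet().  learn_guest_mac() is a hypothetical name, and
the sketch only assumes a plain Ethernet II frame layout (6 bytes
destination, 6 bytes source, 2 bytes EtherType):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* octets in one Ethernet address */
#define ETH_HLEN 14	/* destination + source + EtherType */

/* Hypothetical helper: update guest_mac from the source address of any
 * Ethernet frame, regardless of its EtherType or payload.
 *
 * Return: 1 if guest_mac changed, 0 if unchanged or frame too short
 */
static int learn_guest_mac(uint8_t guest_mac[ETH_ALEN],
			   const uint8_t *frame, size_t len)
{
	const uint8_t *src;

	assert(guest_mac != NULL && frame != NULL);

	if (len < ETH_HLEN)
		return 0;

	src = frame + ETH_ALEN;	/* source follows destination */
	if (!memcmp(guest_mac, src, ETH_ALEN))
		return 0;

	memcpy(guest_mac, src, ETH_ALEN);
	return 1;
}
```

The payload is never inspected here, so even a frame we would otherwise
drop still reveals the sender's address.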


Best regards,
Volker

-- 
.---<<<((()))>>>---.
|      [[||]]      |
'---<<<((()))>>>---'

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Send an initial ARP and NDP request to resolve the guest IP address
  2025-09-10 10:33                 ` Volker Diels-Grabsch
@ 2025-09-10 14:01                   ` Stefano Brivio
  0 siblings, 0 replies; 14+ messages in thread
From: Stefano Brivio @ 2025-09-10 14:01 UTC (permalink / raw)
  To: Volker Diels-Grabsch; +Cc: passt-dev, David Gibson

On Wed, 10 Sep 2025 12:33:44 +0200
Volker Diels-Grabsch <v@njh.eu> wrote:

> Dear Stefano,
> 
> Thanks for your timely review. I agree with almost all of it.
> 
> Just a small clarification on this one:
> 
> Stefano Brivio wrote:
> > > @@ -1503,11 +1510,12 @@ void tap_backend_init(struct ctx *c)
> > >  	case MODE_PASST:
> > >  		tap_sock_unix_init(c);
> > >  
> > > -		/* In passt mode, we don't know the guest's MAC address until it
> > > -		 * sends us packets.  Use the broadcast address so that our
> > > -		 * first packets will reach it.
> > > +		/* In passt mode, we don't know the guest's MAC address until
> > > +		 * it sends us packets (e.g. responds to our initial ARP or  
> > 
> > I don't think the response is an example, so I wouldn't use "e.g."
> > here, rather "i.e." / "that is", if that's the expected behaviour.  
> 
> The reason for using "e.g." is the following: There is still the
> "usual" case where the passt client (QEMU) was freshly started
> together with passt.
> 
> In that case, it *will not* respond to either our ARP or our NDP request,
> simply because it won't recognize the IPv4/6 addresses, because it
> doesn't yet have any.  In that situation, we'll learn about the guest's
> MAC address only after it sends a DHCP request, or NDP, or similar to
> us, on its own initiative - not as a response to anything we might
> have sent.
> 
> So I'd propose to either keep the "e.g." wording, or to extend the
> comment by enumerating all possible cases.

Ah, sorry, I see what you meant now. Of course, it makes sense.

> (Regarding the latter, I don't feel confident enough to be able to
> really enumerate them all.  In addition to DHCP, NDP NS and DHCPv6, a
> sufficiently strange client network stack might even broadcast nonsense
> packets to us, which we might not process further, but still learn the
> guest's MAC address from.)

I guess the version in v4 is more terse without any real loss of
generality or clarity.

-- 
Stefano


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-09-10 14:02 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-09-07 11:01 [PATCH] Send an initial ARP request to resolve the guest IP address Volker Diels-Grabsch
2025-09-08  4:00 ` David Gibson
2025-09-08  9:12   ` [PATCH v2] " Volker Diels-Grabsch
2025-09-08  9:22     ` Volker Diels-Grabsch
2025-09-09  2:52       ` David Gibson
2025-09-09 10:10         ` Volker Diels-Grabsch
2025-09-09 14:49           ` Volker Diels-Grabsch
2025-09-09 14:49             ` [PATCH] Send an initial ARP and NDP request to resolve the guest IP address Volker Diels-Grabsch
2025-09-10  3:32               ` David Gibson
2025-09-10  9:29               ` Stefano Brivio
2025-09-10 10:33                 ` Volker Diels-Grabsch
2025-09-10 14:01                   ` Stefano Brivio
2025-09-09 15:55           ` [PATCH v2] Send an initial ARP " Stefano Brivio
2025-09-10  3:33             ` David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt