From: Stefano Brivio <sbrivio@redhat.com>
To: Volker Diels-Grabsch <v@njh.eu>
Cc: passt-dev@passt.top, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH] Send an initial ARP and NDP request to resolve the guest IP address
Date: Wed, 10 Sep 2025 11:29:06 +0200
Message-ID: <20250910112906.5e1b7e5a@elisabeth>
In-Reply-To: <20250909145516.762957-2-v@njh.eu>
On Tue, 9 Sep 2025 16:49:20 +0200
Volker Diels-Grabsch <v@njh.eu> wrote:
> When restarting passt while QEMU keeps running with a configured
> "reconnect-ms" setting, the port forwardings will stop working until
> the guest sends some outgoing network traffic.
>
> Reason: Although QEMU reconnects successfully to the unix domain
> socket of the new passt process, that one no longer knows the guest's
> MAC address and uses instead the broadcast MAC address. However, this
> is ignored by the guest, at least if the guest runs Linux. Only after
> the guest sends some network packet on its own initiative, passt will
> know the MAC address and will be able to establish forwarded
> connections.
>
> This change fixes this issue by sending an ARP and an NDP request to
> resolve the guest's MAC address via its IPv4 and IPv6 address, which
> we do know, right after the unix domain socket (re)connection.
>
> The only case where the IP is "wrong" would be if the configuration
> changed, or on the very first start right after qemu started. But in
> those cases, we just wouldn't get an ARP/NDP response, and can't do
> anything until we receive the guest's DHCP request - just as before.
> In other words, in the worst case the ARP/NDP requests would be
> harmless.

Thanks for the implementation. This looks like a small but quite
relevant feature we missed until now. I have a couple of comments on
top of David's:

> Signed-off-by: Volker Diels-Grabsch <v@njh.eu>
> ---
> arp.c | 33 +++++++++++++++++++++++++++++++++
> arp.h | 1 +
> ndp.c | 19 +++++++++++++++++++
> ndp.h | 1 +
> tap.c | 16 ++++++++++++----
> util.h | 1 +
> 6 files changed, 67 insertions(+), 4 deletions(-)
>
> diff --git a/arp.c b/arp.c
> index 44677ad..c1bd63b 100644
> --- a/arp.c
> +++ b/arp.c
> @@ -112,3 +112,36 @@ int arp(const struct ctx *c, struct iov_tail *data)
>
> return 1;
> }
> +
> +/**
> + * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address
> + * @c: Execution context
> + */
> +void arp_send_init_req(const struct ctx *c)
> +{
> + struct {
> + struct ethhdr eh;
> + struct arphdr ah;
> + struct arpmsg am;
> + } __attribute__((__packed__)) req;
> +
> + /* Ethernet header */
> + req.eh.h_proto = htons(ETH_P_ARP);
> + memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
> + memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
> +
> + /* ARP header */
> + req.ah.ar_op = htons(ARPOP_REQUEST);
> + req.ah.ar_hrd = htons(ARPHRD_ETHER);
> + req.ah.ar_pro = htons(ETH_P_IP);
> + req.ah.ar_hln = ETH_ALEN;
> + req.ah.ar_pln = 4;
> +
> + /* ARP message */
> + memcpy(req.am.sha, c->our_tap_mac, sizeof(req.am.sha));
> + memcpy(req.am.sip, &c->ip4.our_tap_addr, sizeof(req.am.sip));
> + memcpy(req.am.tha, MAC_BROADCAST, sizeof(req.am.tha));
> + memcpy(req.am.tip, &c->ip4.addr, sizeof(req.am.tip));
> +
> + tap_send_single(c, &req, sizeof(req));
> +}
> diff --git a/arp.h b/arp.h
> index 86bcbf8..d5ad0e1 100644
> --- a/arp.h
> +++ b/arp.h
> @@ -21,5 +21,6 @@ struct arpmsg {
> } __attribute__((__packed__));
>
> int arp(const struct ctx *c, struct iov_tail *data);
> +void arp_send_init_req(const struct ctx *c);
>
> #endif /* ARP_H */
> diff --git a/ndp.c b/ndp.c
> index eb090cd..b3bdedb 100644
> --- a/ndp.c
> +++ b/ndp.c
> @@ -438,3 +438,22 @@ void ndp_timer(const struct ctx *c, const struct timespec *now)
> first:
> next_ra = now->tv_sec + interval;
> }
> +
> +/**
> + * ndp_send_init_req() - Send initial NDP NS to retrieve guest MAC address
> + * @c: Execution context
> + */
> +void ndp_send_init_req(const struct ctx *c)
> +{
> + struct ndp_ns ns = {
> + .ih = {
> + .icmp6_type = NS,
> + .icmp6_code = 0,
> + .icmp6_router = 0, /* Reserved */
> + .icmp6_solicited = 0, /* Reserved */
> + .icmp6_override = 0, /* Reserved */
> + },
> + .target_addr = c->ip6.addr
> + };
> + ndp_send(c, &c->ip6.addr, &ns, sizeof(ns));
> +}
> diff --git a/ndp.h b/ndp.h
> index b1dd5e8..781ea86 100644
> --- a/ndp.h
> +++ b/ndp.h
> @@ -11,5 +11,6 @@ struct icmp6hdr;
> int ndp(const struct ctx *c, const struct in6_addr *saddr,
> struct iov_tail *data);
> void ndp_timer(const struct ctx *c, const struct timespec *now);
> +void ndp_send_init_req(const struct ctx *c);
>
> #endif /* NDP_H */
> diff --git a/tap.c b/tap.c
> index 7ba6399..ea61eae 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1088,6 +1088,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
> {
> struct ethhdr eh_storage;
> const struct ethhdr *eh;
> + char bufmac[ETH_ADDRSTRLEN];
>
> pcap_iov(data->iov, data->cnt, data->off);
>
> @@ -1097,6 +1098,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
>
> if (memcmp(c->guest_mac, eh->h_source, ETH_ALEN)) {
> memcpy(c->guest_mac, eh->h_source, ETH_ALEN);
> + info("Guest MAC address: %s", eth_ntop(c->guest_mac, bufmac, sizeof(bufmac)));
> proto_update_l2_buf(c->guest_mac, NULL);
> }
>
> @@ -1355,6 +1357,11 @@ static void tap_start_connection(const struct ctx *c)
> ev.events = EPOLLIN | EPOLLRDHUP;
> ev.data.u64 = ref.u64;
> epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
> +
> + info("Sending initial ARP and NDP request to retrieve"
> + " guest MAC address after reconnect");
> + arp_send_init_req(c);

This should be conditional on whether we have IPv4 support enabled,
and the check would need to be analogous to the one in
tap4_handler() (sorry, it's a bit hidden):

	if (!c->ifi4 || ...)
		return ...;

> + ndp_send_init_req(c);

And this should only happen if IPv6 is enabled, see tap6_handler():

	if (!c->ifi6 || ...)
		return ...;

and also, arguably, iff NDP support is not disabled by means of
--no-ndp (c->no_ndp).

Strictly speaking, we could send this anyway and still fit the current
documentation of --no-ndp:

	--no-ndp
		Disable NDP responses. NDP messages coming from guest or target
		namespace will be ignored.

but this would make --no-ndp a misnomer, and given that we'll ignore
neighbour advertisements, it makes no sense to send a solicitation
anyway.

All in all, I would just skip this if c->no_ndp is set. If you can
think of a terse way of updating the man page to reflect this, that
would be appreciated, but I think it's also fine as it is.

By the way, we'll also ignore responses on --no-icmp. I just realised
that the man page is currently inaccurate, because it refers to echo
messages only, but in tap6_handler() we have:

	if (proto == IPPROTO_ICMPV6) {
		...
		if (c->no_icmp)
			continue;
		...
		if (ndp(c, saddr, &ndp_data))
			continue;
		...
	}

So I think we should update the man page to mention that --no-icmp
means no ICMP and no ICMPv6, and also skip sending the NDP solicitation
in that case.

Or update the code to reflect what the man page says, but then the
option could be considered a misnomer, so I wouldn't go this way.
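
To summarise the conditions discussed above, here is a minimal
standalone sketch, not the actual passt code. It assumes a stripped-down
stand-in for struct ctx (struct ctx_sketch) carrying only the four
fields mentioned in this review, and the helper names want_init_arp()
and want_init_ndp() are hypothetical. The exact checks in
tap4_handler() and tap6_handler() have further terms that are elided
above and are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields of struct ctx used in this review */
struct ctx_sketch {
	unsigned int ifi4;	/* 0 if IPv4 support is disabled */
	unsigned int ifi6;	/* 0 if IPv6 support is disabled */
	int no_ndp;		/* Non-zero if --no-ndp was given */
	int no_icmp;		/* Non-zero if --no-icmp was given */
};

/* The initial ARP request only makes sense with IPv4 enabled */
static bool want_init_arp(const struct ctx_sketch *c)
{
	return c->ifi4 != 0;
}

/* The neighbour advertisement sent in reply would be ignored with
 * --no-ndp or --no-icmp, so don't send a solicitation in those cases
 */
static bool want_init_ndp(const struct ctx_sketch *c)
{
	return c->ifi6 != 0 && !c->no_ndp && !c->no_icmp;
}
```

In tap_start_connection(), the calls to arp_send_init_req() and
ndp_send_init_req() from the patch could then each be wrapped in the
corresponding check, or the checks could simply be open-coded there.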
> }
>
> /**
> @@ -1503,11 +1510,12 @@ void tap_backend_init(struct ctx *c)
> case MODE_PASST:
> tap_sock_unix_init(c);
>
> - /* In passt mode, we don't know the guest's MAC address until it
> - * sends us packets. Use the broadcast address so that our
> - * first packets will reach it.
> + /* In passt mode, we don't know the guest's MAC address until
> + * it sends us packets (e.g. responds to our initial ARP or

I don't think the response is an example, so I wouldn't use "e.g."
here, rather "i.e." / "that is", if that's the expected behaviour.

> + * NDP request). Until then, use the broadcast address so
> + * that our first packets will have a chance to reach it.
> */
> - memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
> + memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
> break;
> }
>
> diff --git a/util.h b/util.h
> index 2a8c38f..3719f0c 100644
> --- a/util.h
> +++ b/util.h
> @@ -97,6 +97,7 @@ void abort_with_msg(const char *fmt, ...)
> #define FD_PROTO(x, proto) \
> (IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
>
> +#define MAC_BROADCAST ((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })

This can be easily wrapped to fit 80 columns without otherwise
affecting readability, see examples just above and below:

	#define MAC_BROADCAST							\
		((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })

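
For illustration only, a standalone sketch of the wrapped macro used
with memcpy(), as in the tap_backend_init() hunk above. ETH_ALEN is
defined locally here as an assumption to keep the sketch
self-contained (it normally comes from <linux/if_ether.h>), and
mac_set_broadcast() is a hypothetical helper, not part of the patch.

```c
#include <stdint.h>
#include <string.h>

#ifndef ETH_ALEN
#define ETH_ALEN 6	/* Assumption: normally from <linux/if_ether.h> */
#endif

#define MAC_BROADCAST							\
	((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })

/* Hypothetical helper: set a MAC address to the broadcast address */
static void mac_set_broadcast(uint8_t mac[ETH_ALEN])
{
	memcpy(mac, MAC_BROADCAST, ETH_ALEN);
}
```

The compound literal is a C99 construct, so this works in C but not
necessarily in C++.
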
> #define MAC_ZERO ((uint8_t [ETH_ALEN]){ 0 })
> #define MAC_IS_ZERO(addr) (!memcmp((addr), MAC_ZERO, ETH_ALEN))
>

The rest looks good to me!

--
Stefano