From: Stefano Brivio <sbrivio@redhat.com>
To: Volker Diels-Grabsch <v@njh.eu>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v4] Send an initial ARP and NDP request to resolve the guest IP address
Date: Wed, 10 Sep 2025 16:32:18 +0200
Message-ID: <20250910163218.620bfef7@elisabeth>
In-Reply-To: <20250910113632.80620-4-v@njh.eu>
On Wed, 10 Sep 2025 13:35:42 +0200
Volker Diels-Grabsch <v@njh.eu> wrote:
> When restarting passt while QEMU keeps running with a configured
> "reconnect-ms" setting, the port forwardings will stop working until
> the guest sends some outgoing network traffic.
>
> Reason: Although QEMU reconnects successfully to the unix domain
> socket of the new passt process, that one no longer knows the guest's
> MAC address and uses instead the broadcast MAC address. However, this
> is ignored by the guest, at least if the guest runs Linux. Only after
> the guest sends some network packet on its own initiative, passt will
> know the MAC address and will be able to establish forwarded
> connections.
>
> This change fixes this issue by sending an ARP and an NDP request to
> resolve the guest's MAC address via its IPv4 and IPv6 address, which
> we do know, right after the unix domain socket (re)connection.
>
> The only case where the IP is "wrong" would be if the configuration
> changed, or on the very first start right after qemu started. But in
> those cases, we just wouldn't get an ARP/NDP response, and can't do
> anything until we receive the guest's DHCP request - just as before.
> In other words, in the worst case the ARP/NDP requests would be
> harmless.
>
> Signed-off-by: Volker Diels-Grabsch <v@njh.eu>
> ---
> arp.c | 35 +++++++++++++++++++++++++++++++++++
> arp.h | 1 +
> ndp.c | 21 +++++++++++++++++++++
> ndp.h | 1 +
> passt.1 | 9 +++++----
> tap.c | 15 +++++++++++----
> util.h | 2 ++
> 7 files changed, 76 insertions(+), 8 deletions(-)
>
> diff --git a/arp.c b/arp.c
> index 44677ad..b263419 100644
> --- a/arp.c
> +++ b/arp.c
> @@ -112,3 +112,38 @@ int arp(const struct ctx *c, struct iov_tail *data)
>
> return 1;
> }
> +
> +/**
> + * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address
> + * @c: Execution context
> + */
> +void arp_send_init_req(const struct ctx *c)
> +{
> + struct {
> + struct ethhdr eh;
> + struct arphdr ah;
> + struct arpmsg am;
> + } __attribute__((__packed__)) req;
> +
> + /* Ethernet header */
> + req.eh.h_proto = htons(ETH_P_ARP);
> + memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
> + memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
> +
> + /* ARP header */
> + req.ah.ar_op = htons(ARPOP_REQUEST);
> + req.ah.ar_hrd = htons(ARPHRD_ETHER);
> + req.ah.ar_pro = htons(ETH_P_IP);
> + req.ah.ar_hln = ETH_ALEN;
> + req.ah.ar_pln = 4;
> +
> + /* ARP message */
> + memcpy(req.am.sha, c->our_tap_mac, sizeof(req.am.sha));
> + memcpy(req.am.sip, &c->ip4.our_tap_addr, sizeof(req.am.sip));
> + memcpy(req.am.tha, MAC_BROADCAST, sizeof(req.am.tha));
> + memcpy(req.am.tip, &c->ip4.addr, sizeof(req.am.tip));
> +
> + debug("Sending initial ARP request to retrieve"
> + " guest MAC address after reconnect");
I should have mentioned this along with an earlier comment of mine, I
guess: user-visible strings are an exception to the 80-column limit, so
that they remain greppable. Rationale:

  https://docs.kernel.org/process/coding-style.html#breaking-long-lines-and-strings

By the way, note that this doesn't necessarily happen after a
reconnection: if the user specifies a given address for the guest
(-a 192.0.2.1), the ARP request here makes sense even on the first
connection. So maybe you could just say something like:

	debug("Sending initial ARP request for guest MAC address");

?
> + tap_send_single(c, &req, sizeof(req));
> +}
> diff --git a/arp.h b/arp.h
> index 86bcbf8..d5ad0e1 100644
> --- a/arp.h
> +++ b/arp.h
> @@ -21,5 +21,6 @@ struct arpmsg {
> } __attribute__((__packed__));
>
> int arp(const struct ctx *c, struct iov_tail *data);
> +void arp_send_init_req(const struct ctx *c);
>
> #endif /* ARP_H */
> diff --git a/ndp.c b/ndp.c
> index eb090cd..a47fe42 100644
> --- a/ndp.c
> +++ b/ndp.c
> @@ -438,3 +438,24 @@ void ndp_timer(const struct ctx *c, const struct timespec *now)
> first:
> next_ra = now->tv_sec + interval;
> }
> +
> +/**
> + * ndp_send_init_req() - Send initial NDP NS to retrieve guest MAC address
> + * @c: Execution context
> + */
> +void ndp_send_init_req(const struct ctx *c)
> +{
> + struct ndp_ns ns = {
> + .ih = {
> + .icmp6_type = NS,
> + .icmp6_code = 0,
> + .icmp6_router = 0, /* Reserved */
> + .icmp6_solicited = 0, /* Reserved */
> + .icmp6_override = 0, /* Reserved */
> + },
> + .target_addr = c->ip6.addr
> + };
> + debug("Sending initial NDP NS request to retrieve"
> + " guest MAC address after reconnect");
Same as above: keep the string on a single line, and drop "after
reconnect".
> + ndp_send(c, &c->ip6.addr, &ns, sizeof(ns));
> +}
> diff --git a/ndp.h b/ndp.h
> index b1dd5e8..781ea86 100644
> --- a/ndp.h
> +++ b/ndp.h
> @@ -11,5 +11,6 @@ struct icmp6hdr;
> int ndp(const struct ctx *c, const struct in6_addr *saddr,
> struct iov_tail *data);
> void ndp_timer(const struct ctx *c, const struct timespec *now);
> +void ndp_send_init_req(const struct ctx *c);
>
> #endif /* NDP_H */
> diff --git a/passt.1 b/passt.1
> index cef98b2..7000377 100644
> --- a/passt.1
> +++ b/passt.1
> @@ -319,8 +319,9 @@ silently dropped.
>
> .TP
> .BR \-\-no-icmp
> -Disable the ICMP/ICMPv6 echo handler. ICMP and ICMPv6 echo requests coming from
> -guest or target namespace will be silently dropped.
> +Disable the ICMP/ICMPv6 handler, and hence also NDP.
Maybe we could make the code directly reflect this and simplify the
checks? That is, in conf() (conf.c), you could set c->no_ndp if
c->no_icmp is set (after all options have been parsed).
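Something along these lines, as a minimal sketch only: struct ctx_flags
is a reduced, hypothetical stand-in for struct ctx, and both helper
names are made up for illustration, they don't exist in passt:

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for passt's struct ctx: only the two
 * flags discussed here
 */
struct ctx_flags {
	bool no_icmp;	/* --no-icmp given */
	bool no_ndp;	/* --no-ndp given, or implied by --no-icmp */
};

/* Meant to run once, after all options have been parsed: NDP is carried
 * over ICMPv6, so disabling ICMP implies disabling NDP
 */
static void conf_imply_no_ndp(struct ctx_flags *c)
{
	if (c->no_icmp)
		c->no_ndp = true;
}

/* With the implication applied, callers only need to test one flag */
static bool ndp_enabled(const struct ctx_flags *c)
{
	return !c->no_ndp;
}
```

This way, any later check of the form "!c->no_ndp && !c->no_icmp"
reduces to "!c->no_ndp".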
> ICMP and ICMPv6 requests
> +coming from guest or target namespace will be silently dropped. No initial NDP
> +message will be sent.
>
> .TP
> .BR \-\-no-dhcp
> @@ -330,8 +331,8 @@ selected IPv4 default route.
>
> .TP
> .BR \-\-no-ndp
> -Disable NDP responses. NDP messages coming from guest or target namespace will
> -be ignored.
> +Disable Neighbor Discovery. NDP messages coming from guest or target
> +namespace will be ignored. No initial NDP message will be sent.
>
> .TP
> .BR \-\-no-dhcpv6
> diff --git a/tap.c b/tap.c
> index 7ba6399..25c32f9 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1096,7 +1096,9 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
> return;
>
> if (memcmp(c->guest_mac, eh->h_source, ETH_ALEN)) {
> + char bufmac[ETH_ADDRSTRLEN];
Customary, for readability: one blank line here (between declaration
and code).
> memcpy(c->guest_mac, eh->h_source, ETH_ALEN);
> + info("Guest MAC address: %s", eth_ntop(c->guest_mac, bufmac, sizeof(bufmac)));
This is one line you can break easily instead. But I don't think it
should be info(), otherwise the guest can happily spam system logs on
the host.

I'm undecided between suggesting debug() or trace(): on one hand, it's
rather relevant information (maybe say "New guest MAC address"?), on
the other hand, nothing prevents a guest from using a different MAC
address every other frame, making debugging with --debug impossible if
this happens.

We don't have (yet) a generic rate limiting mechanism for prints.
Maybe we could go with trace() for the moment, and the day we have it,
switch this to a rate-limited debug() (or even info(), at that point).
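For what it's worth, such a mechanism could probably be as simple as a
fixed-window counter. A minimal sketch, not a proposal for this patch:
struct print_ratelimit and print_ratelimit_ok() are hypothetical names,
nothing like them exists in passt today, and the burst and interval
values would need to be picked by the caller:

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical limiter state, one instance per message type */
struct print_ratelimit {
	time_t window_start;	/* Start of the current window, seconds */
	unsigned int count;	/* Messages let through in this window */
	unsigned int burst;	/* Maximum messages allowed per window */
	time_t interval;	/* Window length, seconds */
};

/* Return true if a message may be printed at time @now, false if the
 * budget for the current window is already used up
 */
static bool print_ratelimit_ok(struct print_ratelimit *rl, time_t now)
{
	/* Window expired: start a new one with a fresh budget */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->count = 0;
	}

	if (rl->count >= rl->burst)
		return false;

	rl->count++;
	return true;
}
```

The caller would then print only if print_ratelimit_ok() returns true
for the current timestamp, so that a guest changing MAC address on
every frame gets at most 'burst' lines per 'interval' seconds.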
> proto_update_l2_buf(c->guest_mac, NULL);
> }
>
> @@ -1355,6 +1357,11 @@ static void tap_start_connection(const struct ctx *c)
> ev.events = EPOLLIN | EPOLLRDHUP;
> ev.data.u64 = ref.u64;
> epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
> +
> + if (c->ifi4)
> + arp_send_init_req(c);
> + if (c->ifi6 && !c->no_ndp && !c->no_icmp)
...here you could just check for c->no_ndp, if you set it on
c->no_icmp.
> + ndp_send_init_req(c);
> }
>
> /**
> @@ -1503,11 +1510,11 @@ void tap_backend_init(struct ctx *c)
> case MODE_PASST:
> tap_sock_unix_init(c);
>
> - /* In passt mode, we don't know the guest's MAC address until it
> - * sends us packets. Use the broadcast address so that our
> - * first packets will reach it.
> + /* In passt mode, we don't know the guest's MAC address until
> + * it sends us packets. Until then, use the broadcast address
> + * so that our first packets will have a chance to reach it.
> */
> - memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
> + memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
> break;
> }
>
> diff --git a/util.h b/util.h
> index 2a8c38f..22eaac5 100644
> --- a/util.h
> +++ b/util.h
> @@ -97,6 +97,8 @@ void abort_with_msg(const char *fmt, ...)
> #define FD_PROTO(x, proto) \
> (IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
>
> +#define MAC_BROADCAST \
> + ((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
> #define MAC_ZERO ((uint8_t [ETH_ALEN]){ 0 })
> #define MAC_IS_ZERO(addr) (!memcmp((addr), MAC_ZERO, ETH_ALEN))
>
The rest all looks good to me.
--
Stefano