From: Jon Maloy
To: David Gibson
CC: sbrivio@redhat.com, dgibson@redhat.com, passt-dev@passt.top
Subject: Re: [PATCH v9 9/9] arp/ndp: send gratuitous ARP / unsolicitated NA when MAC cache entry added
Date: Wed, 24 Sep 2025 18:18:52 -0400
Message-ID: <5dda48fc-d854-436d-acd1-734d461efd59@redhat.com>
References: <20250924011330.1168921-1-jmaloy@redhat.com> <20250924011330.1168921-10-jmaloy@redhat.com>
List-Id: Development discussion and patches for passt

On 2025-09-23 23:22, David Gibson wrote:
> On Tue, Sep 23, 2025 at 09:13:30PM -0400, Jon Maloy wrote:
>> Gratuitious ARP and unsolicitated NA should be handled with caution
>> because of the risk of malignant users emitting them to disturb
>> network communication.
>>
>> There is however one case we where we know it is legitimate
>> and safe for us to send out such messages: The one time we switch
>> from using ctx->own_tap_mac to a MAC address received via the
>> recently added neigbour subscription function. Later changes to
>> the MAC address of a host in an existing entry cannot be fully
>> trusted, so we abstain from doing it in such cases.
>
> So, I think you're right that the gratuitous ARP is safe in this case.
>
> But it concerns me that (other that some edge cases) we're sending
> data to the guest under own_tap_mac before we get the real MAC. At
> the point we send data from a flow to the guest, I would have expected
> to already have an entry in the host neighbour table (because by
> definition the host is talking to the peer), therefore in our cache,
> by the subscriber.
>
> I'm wondering if it could be as simple as both the neighbour update
> and the actual data coming in the same epoll batch, and we could avoid
> the temporary use of own_tap_mac by prioritising processing of the
> neighbour events.
>

I experimented a bit with this. My test program is a simple UDP
client-server pair that first exchanges 3 UDP messages client->server,
followed by 3 messages server->client.

First, I changed the main() loop a bit, so that netlink events are
handled before all other events, if any. (Basically, I added an extra
loop before the main loop that handles only netlink events, and
excluded netlink events from the main loop itself.) This should give
netlink events absolute priority over any other events.

As you will see below, this made no difference to the scenarios I
describe.

1: When starting the container, I notice that there is no subscription
   event in PASTA, even though I can see that the entry for the remote
   host is present in the host's ARP table. No event ever comes up,
   even if I wait for 10+ minutes.

2: The guest attempts to send the first UDP message. An ARP request is
   sent to PASTA, and responded to with the 9a:9a: address.

3: That UDP message, and two more, are sent via PASTA to the remote
   host. Those are responded to, and the responses are sent back to
   the guest.

4: I now receive a neighbour event, and can update my cache, but since
   there is still no new ARP request from the guest, even if I wait
   for many minutes, it continues to believe that the old address is
   confirmed.

5: If I run the same test again after a few minutes, the guest *does*
   send out an ARP request a few seconds after the message exchange,
   and is now updated with the correct address.

- If I run this sequence in the opposite direction everything seems to
  work ok, at least if the ARP entry is already present on the local
  host.

- When I delete that ARP entry before running the sequence, a
  neighbour event shows up after some seconds, but this can take up to
  a minute, if not longer. If I run my sequence from the remote host
  before that happens, there will be an ARP request from the guest
  (for the response UDPs), responded to with the default tap MAC, and
  it will remain like that for a long time, since the guest considers
  the MAC address confirmed. It doesn't help much that a neighbour
  event shows up some seconds after the exchange.

In brief, the guest *will* be updated eventually, but depending on luck
and timing it may take a long time, at least several minutes.
My gratuitous ARPs / unsolicited NAs don't completely solve this
issue, but they significantly reduce the potential time gap before the
guest gets properly updated.

>> When sending this type of messages, we notice that the guest accepts
>> the update, but also asks for a confirmation in the form of a regular
>> ARP/NS request. This is responded to with the new value, and we have
>> exactly the effect we wanted.
>>
>> This commit adds this functionality.
>>
>> Signed-off-by: Jon Maloy
>> ---
>>  arp.c | 39 +++++++++++++++++++++++++++++++++++++++
>>  arp.h |  2 ++
>>  fwd.c | 11 +++++++++++
>>  ndp.c | 10 ++++++++++
>>  ndp.h |  1 +
>>  5 files changed, 63 insertions(+)
>>
>> diff --git a/arp.c b/arp.c
>> index 442faff..259f736 100644
>> --- a/arp.c
>> +++ b/arp.c
>> @@ -151,3 +151,42 @@ void arp_send_init_req(const struct ctx *c)
>>  	debug("Sending initial ARP request for guest MAC address");
>>  	tap_send_single(c, &req, sizeof(req));
>>  }
>> +
>> +/**
>> + * arp_send_gratuitous() - Send a gratuitous ARP announcement for an IPv4 host
>> + * @c:		Execution context
>> + * @ip:	IPv4 address we announce as owned by @mac
>> + * @mac:	MAC address to advertise for @ip
>> + */
>> +void arp_send_gratuitous(const struct ctx *c, struct in_addr ip,
>> +			 const unsigned char *mac)
>> +{
>> +	char ip_str[INET_ADDRSTRLEN];
>> +	struct {
>> +		struct ethhdr eh;
>> +		struct arphdr ah;
>> +		struct arpmsg am;
>> +	} __attribute__((__packed__)) req;
>
> 'req' is not a great name, since this is an ARP response, not a
> request (but see below).
>
>> +	/* Ethernet header */
>> +	req.eh.h_proto = htons(ETH_P_ARP);
>> +	memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
>> +	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
>> +
>> +	/* ARP header */
>> +	req.ah.ar_op = htons(ARPOP_REPLY);
>> +	req.ah.ar_hrd = htons(ARPHRD_ETHER);
>> +	req.ah.ar_pro = htons(ETH_P_IP);
>> +	req.ah.ar_hln = ETH_ALEN;
>> +	req.ah.ar_pln = 4;
>> +
>> +	/* ARP message */
>> +	memcpy(req.am.sha, mac, sizeof(req.am.sha));
>> +	memcpy(req.am.sip, &ip, sizeof(req.am.sip));
>> +	memcpy(req.am.tha, MAC_BROADCAST, sizeof(req.am.tha));
>> +	memcpy(req.am.tip, &ip, sizeof(req.am.tip));
>
> So, I was trying to check if it made sense to use the same IP for both
> source and target here, and came across
>     https://www.rfc-editor.org/rfc/rfc5227#section-3
>
> Which suggests we should (counter intuitively) be using ARP requests,
> not ARP replies for announcements.

Instead of gratuitous ARP, you mean? I can try it.

///jon

>
>> +	inet_ntop(AF_INET, &ip, ip_str, sizeof(ip_str));
>> +	debug("Sending gratuitous ARP for %s", ip_str);
>> +	tap_send_single(c, &req, sizeof(req));
>> +}
>> diff --git a/arp.h b/arp.h
>> index d5ad0e1..b0dbb56 100644
>> --- a/arp.h
>> +++ b/arp.h
>> @@ -22,5 +22,7 @@ struct arpmsg {
>>
>>  int arp(const struct ctx *c, struct iov_tail *data);
>>  void arp_send_init_req(const struct ctx *c);
>> +void arp_send_gratuitous(const struct ctx *c, struct in_addr ip,
>> +			 const unsigned char *mac);
>>
>>  #endif /* ARP_H */
>> diff --git a/fwd.c b/fwd.c
>> index c6348ab..879a351 100644
>> --- a/fwd.c
>> +++ b/fwd.c
>> @@ -26,6 +26,8 @@
>>  #include "passt.h"
>>  #include "lineread.h"
>>  #include "flow_table.h"
>> +#include "arp.h"
>> +#include "ndp.h"
>>
>>  /* Empheral port range: values from RFC 6335 */
>>  static in_port_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
>> @@ -129,6 +131,15 @@ void fwd_neigh_mac_cache_alloc(const struct ctx *c,
>>
>>  	memcpy(&e->addr, addr, sizeof(*addr));
>>  	memcpy(e->mac, mac, ETH_ALEN);
>> +
>> +	/* Send gratuitous ARP / unsolicited NA for the new mapping */
>
> AFAICT this doesn't actually implement what the commit message
> describes - it seems to always send an ARP/NA when the neighbour table
> is updated.
>
>> +	if (inany_v4(addr)) {
>> +		struct in_addr ip4 = *inany_v4(addr);
>> +
>> +		arp_send_gratuitous(c, ip4, e->mac);
>> +	} else {
>> +		ndp_send_unsolicited_na(c, &addr->a6);
>> +	}
>>  }
>>
>>  /**
>> diff --git a/ndp.c b/ndp.c
>> index 70b68aa..8914f31 100644
>> --- a/ndp.c
>> +++ b/ndp.c
>> @@ -226,6 +226,16 @@ static void ndp_na(const struct ctx *c, const struct in6_addr *dst,
>>  	ndp_send(c, dst, &na, sizeof(na));
>>  }
>>
>> +/**
>> + * ndp_send_unsolicited_na() - Send unsolicited NA
>> + * @c:		Execution context
>> + * @addr:	IPv6 address to advertise
>> + */
>> +void ndp_send_unsolicited_na(const struct ctx *c, const struct in6_addr *addr)
>> +{
>> +	ndp_na(c, &in6addr_ll_all_nodes, addr);
>> +}
>> +
>>  /**
>>   * ndp_ra() - Send an NDP Router Advertisement (RA) message
>>   * @c:		Execution context
>> diff --git a/ndp.h b/ndp.h
>> index 781ea86..320009c 100644
>> --- a/ndp.h
>> +++ b/ndp.h
>> @@ -12,5 +12,6 @@ int ndp(const struct ctx *c, const struct in6_addr *saddr,
>>  	struct iov_tail *data);
>>  void ndp_timer(const struct ctx *c, const struct timespec *now);
>>  void ndp_send_init_req(const struct ctx *c);
>> +void ndp_send_unsolicited_na(const struct ctx *c, const struct in6_addr *addr);
>>
>>  #endif /* NDP_H */
>> --
>> 2.50.1
>>
>
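
For reference, the RFC 5227 variant raised in the review above (announce
with an ARP request rather than a reply) could look roughly like the
sketch below. It is standalone and only illustrative: struct
arp_announce_frame and arp_announce_build() are hypothetical names, not
passt code, the protocol constants are written out numerically instead
of using the kernel headers, and the all-zero target hardware address
follows my reading of RFC 5227 rather than anything tested against a
guest.

```c
#include <arpa/inet.h>	/* htons() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical standalone layout, NOT the passt structs from arp.h:
 * Ethernet header (14 bytes) + ARP header (8) + ARP payload (20)
 */
struct arp_announce_frame {
	uint8_t  eth_dst[6];
	uint8_t  eth_src[6];
	uint16_t eth_proto;	/* 0x0806: ARP */
	uint16_t ar_hrd;	/* 1: Ethernet */
	uint16_t ar_pro;	/* 0x0800: IPv4 */
	uint8_t  ar_hln;	/* hardware address length, 6 */
	uint8_t  ar_pln;	/* protocol address length, 4 */
	uint16_t ar_op;		/* 1: request, 2: reply */
	uint8_t  sha[6];	/* sender hardware address */
	uint8_t  sip[4];	/* sender IPv4 address */
	uint8_t  tha[6];	/* target hardware address */
	uint8_t  tip[4];	/* target IPv4 address */
} __attribute__((__packed__));

/* Fill @f with an ARP announcement sent as a *request*: broadcast on the
 * link, sender and target IP both set to @ip, sender hardware address set
 * to the @mac being announced, target hardware address left all-zero.
 * Returns the frame length in bytes.
 */
size_t arp_announce_build(struct arp_announce_frame *f,
			  const uint8_t *src_mac,
			  const uint8_t *mac, const uint8_t *ip)
{
	assert(f && src_mac && mac && ip);

	memset(f, 0, sizeof(*f));		/* also zeroes tha */

	memset(f->eth_dst, 0xff, sizeof(f->eth_dst));	/* broadcast */
	memcpy(f->eth_src, src_mac, sizeof(f->eth_src));
	f->eth_proto = htons(0x0806);

	f->ar_hrd = htons(1);
	f->ar_pro = htons(0x0800);
	f->ar_hln = 6;
	f->ar_pln = 4;
	f->ar_op  = htons(1);			/* request, not reply */

	memcpy(f->sha, mac, sizeof(f->sha));
	memcpy(f->sip, ip, sizeof(f->sip));
	memcpy(f->tip, ip, sizeof(f->tip));	/* same IP as sender */

	return sizeof(*f);
}
```

Compared with the quoted arp_send_gratuitous(), only two fields differ:
the opcode is a request instead of ARPOP_REPLY, and the target hardware
address is zeroed instead of MAC_BROADCAST. In passt the resulting
buffer would presumably still be handed to tap_send_single() as in the
patch.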