From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 7 Oct 2025 00:33:06 +0200
From: Stefano Brivio
To: Jon Maloy
Subject: Re: [PATCH v12 1/9] netlink: add subsciption on changes in NDP/ARP table
Message-ID: <20251007003306.1d347bca@elisabeth>
In-Reply-To: <20251003003412.588801-2-jmaloy@redhat.com>
References: <20251003003412.588801-1-jmaloy@redhat.com> <20251003003412.588801-2-jmaloy@redhat.com>
Organization: Red Hat
CC: dgibson@redhat.com, david@gibson.dropbear.id.au, passt-dev@passt.top
List-Id: Development discussion and patches for passt

In the title: s/subsciption/subscription/

On Thu, 2 Oct 2025 20:34:04 -0400
Jon Maloy wrote:

> The solution to bug https://bugs.passt.top/show_bug.cgi?id=120
> requires the ability to translate from an IP address to its
> corresponding MAC address in cases where those are present in
> the ARP or NDP tables.
>
> To keep track of the contents of these tables we add a netlink
> based neighbour subscription feature.
>
> Signed-off-by: Jon Maloy
> Reviewed-by: David Gibson
>
> ---
> v3: - Added an attribute contianing NDA_DST to sent message, so
>       that we let the kernel do the filtering of the IP address
>       and return only one entry.
>     - Added interface index to the call signature. Since the only
>       interface we know is the template interface, this limits
>       the number of hosts that will be seen as 'network segment
>       local' from a PASST viewpoint.
> v4: - Made loop independent of attribute order.
>     - Ignoring L2 addresses which are not of size ETH_ALEN.
> v5: - Changed return value of new function, so caller can know if
>       a MAC address really was found.
> v6: - Removed warning printout which had ended up in the wrong
>       commit.
> v8: - Changed to neighbour event subscription model
>     - netlink: arp/ndp table subscription
> v10:- Updated according to David's latest comments on v8
>     - Added functionaly where we initially read current
>       state of ARP/NDP tables
> v12:- Updates based on feedback from David and Stefano
> ---
>  epoll_type.h | 2 +
>  netlink.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  netlink.h | 4 +
>  passt.c | 7 ++
>  4 files changed, 214 insertions(+), 3 deletions(-)
>
> diff --git a/epoll_type.h b/epoll_type.h
> index 12ac59b..a90ffb6 100644
> --- a/epoll_type.h
> +++ b/epoll_type.h
> @@ -44,6 +44,8 @@ enum epoll_type {
>  	EPOLL_TYPE_REPAIR_LISTEN,
>  	/* TCP_REPAIR helper socket */
>  	EPOLL_TYPE_REPAIR,
> +	/* Netlink neighbour subscription socket */
> +	EPOLL_TYPE_NL_NEIGH,
>
>  	EPOLL_NUM_TYPES,
>  };
> diff --git a/netlink.c b/netlink.c
> index c436780..3fe2fdd 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -26,6 +26,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #include
> @@ -40,6 +41,10 @@
>  #define RTNH_NEXT_AND_DEC(rtnh, attrlen) \
>  	((attrlen) -= RTNH_ALIGN((rtnh)->rtnh_len), RTNH_NEXT(rtnh))
>
> +/* Convenience macro borrowed from kernel */
> +#define NUD_VALID \
> +	(NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE | NUD_PROBE | NUD_STALE)
> +
>  /* Netlink expects a buffer of at least 8kiB or the system page size,
>   * whichever is larger. 32kiB is recommended for more efficient.
>   * Since the largest page size on any remotely common Linux setup is
> @@ -50,9 +55,10 @@
>  #define NLBUFSIZ 65536
>
>  /* Socket in init, in target namespace, sequence (just needs to be monotonic) */
> -int nl_sock = -1;
> -int nl_sock_ns = -1;
> -static int nl_seq = 1;
> +int nl_sock = -1;
> +int nl_sock_ns = -1;
> +static int nl_sock_neigh = -1;
> +static int nl_seq = 1;
>
>  /**
>   * nl_sock_init_do() - Set up netlink sockets in init or target namespace
> @@ -1103,3 +1109,195 @@ int nl_link_set_flags(int s, unsigned int ifi,
>
>  	return nl_do(s, &req, RTM_NEWLINK, 0, sizeof(req));
>  }
> +
> +/**
> + * nl_neigh_msg_read() - Interpret a neigbour state message from netlink
> + * @c: Execution context
> + * @nh: Message to be read
> + *

Nit: excess newline.

> + */
> +static void nl_neigh_msg_read(const struct ctx *c, struct nlmsghdr *nh)
> +{
> +	struct ndmsg *ndm = NLMSG_DATA(nh);
> +	struct rtattr *rta = (struct rtattr *)(ndm + 1);
> +	size_t na = NLMSG_PAYLOAD(nh, sizeof(*ndm));
> +	char ip_str[INET6_ADDRSTRLEN];
> +	char mac_str[ETH_ADDRSTRLEN];
> +	const uint8_t *lladdr = NULL;
> +	const void *dst = NULL;
> +	size_t lladdr_len = 0;
> +	uint8_t mac[ETH_ALEN];
> +	union inany_addr addr;
> +	size_t dstlen = 0;
> +
> +	if (nh->nlmsg_type == NLMSG_DONE)
> +		return;
> +
> +	if (nh->nlmsg_type == NLMSG_ERROR) {
> +		warn_perror("nlmsg_type error at msg read");
> +		return;
> +	}
> +
> +	if (nh->nlmsg_type != RTM_NEWNEIGH &&
> +	    nh->nlmsg_type != RTM_DELNEIGH)
> +		return;
> +
> +	for (; RTA_OK(rta, na); rta = RTA_NEXT(rta, na)) {
> +		switch (rta->rta_type) {
> +		case NDA_DST:
> +			dst = RTA_DATA(rta);
> +			dstlen = RTA_PAYLOAD(rta);
> +			break;
> +		case NDA_LLADDR:
> +			lladdr = RTA_DATA(rta);
> +			lladdr_len = RTA_PAYLOAD(rta);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +
> +	if (!dst)
> +		return;
> +
> +	if (!lladdr || lladdr_len != ETH_ALEN)
> +		return;
> +
> +	if (ndm->ndm_type != ARPHRD_ETHER)
> +		return;
> +
> +	memcpy(mac, lladdr, ETH_ALEN);
> +	eth_ntop(mac, mac_str, sizeof(mac_str));
> +
> +	if (ndm->ndm_family == AF_INET &&
> +	    ndm->ndm_ifindex != c->ifi4)
> +		return;
> +
> +	if (ndm->ndm_family == AF_INET6 &&
> +	    ndm->ndm_ifindex != c->ifi6)
> +		return;
> +
> +	if (ndm->ndm_family != AF_INET &&
> +	    ndm->ndm_family != AF_INET6)
> +		return;
> +
> +	if (ndm->ndm_family == AF_INET &&
> +	    dstlen != sizeof(struct in_addr))
> +		return;
> +
> +	if (ndm->ndm_family == AF_INET6 &&
> +	    dstlen != sizeof(struct in6_addr))
> +		return;
> +
> +	inany_from_af(&addr, ndm->ndm_family, dst);
> +	inany_ntop(dst, ip_str, sizeof(ip_str));
> +
> +	if (nh->nlmsg_type == RTM_NEWNEIGH && ndm->ndm_state & NUD_VALID)
> +		trace("neigh table update: %s / %s", ip_str, mac_str);
> +	else
> +		trace("neigh table delete: %s / %s", ip_str, mac_str);
> +}
> +
> +/**
> + * nl_neigh_sync() - Read current contents ARP/NDP tables
> + * @c: Execution context
> + * @proto: Protocol, AF_INET or AF_INET6
> + * @ifi: Interface index
> + *
> + */
> +static void nl_neigh_sync(const struct ctx *c, int proto, int ifi)
> +{
> +	struct {
> +		struct nlmsghdr nlh;
> +		struct ndmsg ndm;
> +	} req = {
> +		.nlh = {0},
> +		.ndm.ndm_family = proto,
> +		.ndm.ndm_ifindex = ifi,
> +		.ndm.ndm_state = 0,
> +		.ndm.ndm_flags = 0,
> +		.ndm.ndm_type = 0
> +	};
> +	struct nlmsghdr *nh;
> +	char buf[NLBUFSIZ];
> +	ssize_t status;
> +	uint32_t seq;
> +
> +	seq = nl_send(nl_sock_neigh, &req, RTM_GETNEIGH, NLM_F_DUMP, sizeof(req));
> +	nl_foreach_oftype(nh, status, nl_sock_neigh, buf, seq, RTM_NEWNEIGH)
> +		nl_neigh_msg_read(c, nh);
> +	if (status < 0)
> +		warn("Failed to read netlink message. status == %li", status);

That should be a human-readable message, so I'm not sure if the C
notation for equality belongs there.
In general we use the same format as perror() (you can't use
warn_perror() here), that is:

	"Failed to read netlink message: %s", strerror_(status));

-- 
Stefano
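
For illustration only, the perror()-like format suggested above could be sketched in standalone C roughly as follows. This is not passt code: format_nl_error() is a hypothetical helper, plain strerror() stands in for the project's strerror_() wrapper, and it assumes that status carries a negative errno-style value (hence the negation). If status already holds a positive error code, the negation would have to go.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of passt): build a message shaped like
 * perror() output, that is, "<description>: <reason>".
 *
 * Assumption: @status is a negative errno-style code, so it is negated
 * before being passed to strerror().
 */
int format_nl_error(char *buf, size_t len, long status)
{
	return snprintf(buf, len, "Failed to read netlink message: %s",
			strerror((int)-status));
}
```

A caller would fill a local buffer, for example with format_nl_error(msg, sizeof(msg), -ENOBUFS), and hand the result to whatever logging function is available. The reason text itself comes from the C library and depends on the platform and locale.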