public inbox for passt-dev@passt.top
* [PATCH 0/7] Prevent DAD for link-local addresses in containers
@ 2024-08-14 22:54 Stefano Brivio
  2024-08-14 22:54 ` [PATCH 1/7] netlink: Fix typo in function comment for nl_addr_get() Stefano Brivio
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Stefano Brivio @ 2024-08-14 22:54 UTC (permalink / raw)
  To: passt-dev; +Cc: Paul Holzinger

There's no point in letting a container perform duplicate address
detection as we'll silently discard neighbour solicitations with
unspecified source addresses anyway, without relaying them to anybody.

And we realised that it's not harmless, see the whole discussion around
https://github.com/containers/podman/pull/23561#discussion_r1711639663:
we can't communicate with the container right away because of that,
which is surely annoying for tests, but it could also be an issue for
use cases with very short-lived containers or namespaces.

Disabling DAD via procfs configuration would be simpler than all this,
but we don't own the namespace (unless we spawn a shell), so we
shouldn't mess with procfs entries, assuming it's even possible.

Set the nodad attribute, and prevent DAD from being triggered on link
up, before we can set that attribute.
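
For illustration only, and not part of the series: a minimal sketch of
what "setting the nodad attribute" on a dumped RTM_NEWADDR payload
amounts to, assuming the Linux UAPI netlink headers are available. The
helper name addr_msg_set_nodad() is hypothetical; patch 4/7 does the
equivalent inline in nl_addr_set_ll_nodad().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>

/* Hypothetical helper (not in passt): flag one address message as "no DAD".
 * @ifa:	Address message header, attributes follow it in memory
 * @attr_len:	Total length of the attributes following @ifa
 */
static void addr_msg_set_nodad(struct ifaddrmsg *ifa, size_t attr_len)
{
	struct rtattr *rta;

	/* Legacy 8-bit flags in the message header */
	ifa->ifa_flags |= IFA_F_NODAD;

	/* If the kernel also reported 32-bit flags, set the bit there too */
	for (rta = IFA_RTA(ifa); RTA_OK(rta, attr_len);
	     rta = RTA_NEXT(rta, attr_len)) {
		uint32_t flags;

		if (rta->rta_type != IFA_FLAGS)
			continue;

		memcpy(&flags, RTA_DATA(rta), sizeof(flags));
		flags |= IFA_F_NODAD;
		memcpy(RTA_DATA(rta), &flags, sizeof(flags));
	}
}
```

The modified message would then be sent back as RTM_NEWADDR with
NLM_F_REPLACE, which is what the series does for link-scoped addresses.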


Stefano Brivio (7):
  netlink: Fix typo in function comment for nl_addr_get()
  netlink, pasta: Split MTU setting functionality out of nl_link_up()
  netlink, pasta: Turn nl_link_up() into a generic function to set link
    flags
  netlink, pasta: Disable DAD for link-local addresses on namespace
    interface
  netlink, pasta: Fetch link-local address from namespace interface once
    it's up
  pasta: Disable neighbour solicitations on device up to prevent DAD
  netlink: Fix typo in function comment for nl_addr_set()

 netlink.c | 186 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 netlink.h |   6 +-
 pasta.c   |  29 ++++++++-
 3 files changed, 206 insertions(+), 15 deletions(-)

-- 
2.43.0



* [PATCH 1/7] netlink: Fix typo in function comment for nl_addr_get()
  2024-08-14 22:54 [PATCH 0/7] Prevent DAD for link-local addresses in containers Stefano Brivio
@ 2024-08-14 22:54 ` Stefano Brivio
  2024-08-15  2:39   ` David Gibson
  2024-08-14 22:54 ` [PATCH 2/7] netlink, pasta: Split MTU setting functionality out of nl_link_up() Stefano Brivio
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Stefano Brivio @ 2024-08-14 22:54 UTC (permalink / raw)
  To: passt-dev; +Cc: Paul Holzinger

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/netlink.c b/netlink.c
index 093de26..e6a315e 100644
--- a/netlink.c
+++ b/netlink.c
@@ -682,7 +682,7 @@ int nl_route_dup(int s_src, unsigned int ifi_src,
  * @prefix_len:	Mask or prefix length, to fill (for IPv4)
  * @addr_l:	Link-scoped address to fill (for IPv6)
  *
- * Return: 9 on success, negative error code on failure
+ * Return: 0 on success, negative error code on failure
  */
 int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
 		void *addr, int *prefix_len, void *addr_l)
-- 
2.43.0



* [PATCH 2/7] netlink, pasta: Split MTU setting functionality out of nl_link_up()
  2024-08-14 22:54 [PATCH 0/7] Prevent DAD for link-local addresses in containers Stefano Brivio
  2024-08-14 22:54 ` [PATCH 1/7] netlink: Fix typo in function comment for nl_addr_get() Stefano Brivio
@ 2024-08-14 22:54 ` Stefano Brivio
  2024-08-15  2:41   ` David Gibson
  2024-08-14 22:54 ` [PATCH 3/7] netlink, pasta: Turn nl_link_up() into a generic function to set link flags Stefano Brivio
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Stefano Brivio @ 2024-08-14 22:54 UTC (permalink / raw)
  To: passt-dev; +Cc: Paul Holzinger

As we'll use nl_link_up() for more than just bringing up devices, it
will become awkward to carry empty MTU values around whenever we call
it.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 netlink.c | 35 +++++++++++++++++++++++++----------
 netlink.h |  3 ++-
 pasta.c   |  7 +++++--
 3 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/netlink.c b/netlink.c
index e6a315e..e33765e 100644
--- a/netlink.c
+++ b/netlink.c
@@ -942,14 +942,14 @@ int nl_link_set_mac(int s, unsigned int ifi, const void *mac)
 }
 
 /**
- * nl_link_up() - Bring link up
+ * nl_link_set_mtu() - Set link MTU
  * @s:		Netlink socket
  * @ifi:	Interface index
- * @mtu:	If non-zero, set interface MTU
+ * @mtu:	Interface MTU
  *
  * Return: 0 on success, negative error code on failure
  */
-int nl_link_up(int s, unsigned int ifi, int mtu)
+int nl_link_set_mtu(int s, unsigned int ifi, int mtu)
 {
 	struct req_t {
 		struct nlmsghdr nlh;
@@ -959,17 +959,32 @@ int nl_link_up(int s, unsigned int ifi, int mtu)
 	} req = {
 		.ifm.ifi_family	  = AF_UNSPEC,
 		.ifm.ifi_index	  = ifi,
-		.ifm.ifi_flags	  = IFF_UP,
-		.ifm.ifi_change	  = IFF_UP,
 		.rta.rta_type	  = IFLA_MTU,
 		.rta.rta_len	  = RTA_LENGTH(sizeof(unsigned int)),
 		.mtu		  = mtu,
 	};
-	ssize_t len = sizeof(req);
 
-	if (!mtu)
-		/* Shorten request to drop MTU attribute */
-		len = offsetof(struct req_t, rta);
+	return nl_do(s, &req, RTM_NEWLINK, 0, sizeof(req));
+}
+
+/**
+ * nl_link_up() - Bring link up
+ * @s:		Netlink socket
+ * @ifi:	Interface index
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int nl_link_up(int s, unsigned int ifi)
+{
+	struct req_t {
+		struct nlmsghdr nlh;
+		struct ifinfomsg ifm;
+	} req = {
+		.ifm.ifi_family	  = AF_UNSPEC,
+		.ifm.ifi_index	  = ifi,
+		.ifm.ifi_flags	  = IFF_UP,
+		.ifm.ifi_change	  = IFF_UP,
+	};
 
-	return nl_do(s, &req, RTM_NEWLINK, 0, len);
+	return nl_do(s, &req, RTM_NEWLINK, 0, sizeof(req));
 }
diff --git a/netlink.h b/netlink.h
index 3a1f0de..87d27ae 100644
--- a/netlink.h
+++ b/netlink.h
@@ -23,6 +23,7 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
 		int s_dst, unsigned int ifi_dst, sa_family_t af);
 int nl_link_get_mac(int s, unsigned int ifi, void *mac);
 int nl_link_set_mac(int s, unsigned int ifi, const void *mac);
-int nl_link_up(int s, unsigned int ifi, int mtu);
+int nl_link_set_mtu(int s, unsigned int ifi, int mtu);
+int nl_link_up(int s, unsigned int ifi);
 
 #endif /* NETLINK_H */
diff --git a/pasta.c b/pasta.c
index 615ff7b..3a0652e 100644
--- a/pasta.c
+++ b/pasta.c
@@ -288,7 +288,7 @@ void pasta_ns_conf(struct ctx *c)
 {
 	int rc = 0;
 
-	rc = nl_link_up(nl_sock_ns, 1 /* lo */, 0);
+	rc = nl_link_up(nl_sock_ns, 1 /* lo */);
 	if (rc < 0)
 		die("Couldn't bring up loopback interface in namespace: %s",
 		    strerror(-rc));
@@ -303,7 +303,10 @@ void pasta_ns_conf(struct ctx *c)
 		    strerror(-rc));
 
 	if (c->pasta_conf_ns) {
-		nl_link_up(nl_sock_ns, c->pasta_ifi, c->mtu);
+		if (c->mtu != -1)
+			nl_link_set_mtu(nl_sock_ns, c->pasta_ifi, c->mtu);
+
+		nl_link_up(nl_sock_ns, c->pasta_ifi);
 
 		if (c->ifi4) {
 			if (c->ip4.no_copy_addrs) {
-- 
2.43.0



* [PATCH 3/7] netlink, pasta: Turn nl_link_up() into a generic function to set link flags
  2024-08-14 22:54 [PATCH 0/7] Prevent DAD for link-local addresses in containers Stefano Brivio
  2024-08-14 22:54 ` [PATCH 1/7] netlink: Fix typo in function comment for nl_addr_get() Stefano Brivio
  2024-08-14 22:54 ` [PATCH 2/7] netlink, pasta: Split MTU setting functionality out of nl_link_up() Stefano Brivio
@ 2024-08-14 22:54 ` Stefano Brivio
  2024-08-15  2:42   ` David Gibson
  2024-08-14 22:54 ` [PATCH 4/7] netlink, pasta: Disable DAD for link-local addresses on namespace interface Stefano Brivio
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Stefano Brivio @ 2024-08-14 22:54 UTC (permalink / raw)
  To: passt-dev; +Cc: Paul Holzinger

In the next patches, we'll reuse it to set flags other than IFF_UP.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 netlink.c | 11 +++++++----
 netlink.h |  3 ++-
 pasta.c   |  4 ++--
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/netlink.c b/netlink.c
index e33765e..873e6c7 100644
--- a/netlink.c
+++ b/netlink.c
@@ -968,13 +968,16 @@ int nl_link_set_mtu(int s, unsigned int ifi, int mtu)
 }
 
 /**
- * nl_link_up() - Bring link up
+ * nl_link_set_flags() - Set link flags
  * @s:		Netlink socket
  * @ifi:	Interface index
+ * @set:	Device flags to set
+ * @change:	Mask of device flag changes
  *
  * Return: 0 on success, negative error code on failure
  */
-int nl_link_up(int s, unsigned int ifi)
+int nl_link_set_flags(int s, unsigned int ifi,
+		      unsigned int set, unsigned int change)
 {
 	struct req_t {
 		struct nlmsghdr nlh;
@@ -982,8 +985,8 @@ int nl_link_up(int s, unsigned int ifi)
 	} req = {
 		.ifm.ifi_family	  = AF_UNSPEC,
 		.ifm.ifi_index	  = ifi,
-		.ifm.ifi_flags	  = IFF_UP,
-		.ifm.ifi_change	  = IFF_UP,
+		.ifm.ifi_flags	  = set,
+		.ifm.ifi_change	  = change,
 	};
 
 	return nl_do(s, &req, RTM_NEWLINK, 0, sizeof(req));
diff --git a/netlink.h b/netlink.h
index 87d27ae..178f8ae 100644
--- a/netlink.h
+++ b/netlink.h
@@ -24,6 +24,7 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
 int nl_link_get_mac(int s, unsigned int ifi, void *mac);
 int nl_link_set_mac(int s, unsigned int ifi, const void *mac);
 int nl_link_set_mtu(int s, unsigned int ifi, int mtu);
-int nl_link_up(int s, unsigned int ifi);
+int nl_link_set_flags(int s, unsigned int ifi,
+		      unsigned int set, unsigned int change);
 
 #endif /* NETLINK_H */
diff --git a/pasta.c b/pasta.c
index 3a0652e..96545b1 100644
--- a/pasta.c
+++ b/pasta.c
@@ -288,7 +288,7 @@ void pasta_ns_conf(struct ctx *c)
 {
 	int rc = 0;
 
-	rc = nl_link_up(nl_sock_ns, 1 /* lo */);
+	rc = nl_link_set_flags(nl_sock_ns, 1 /* lo */, IFF_UP, IFF_UP);
 	if (rc < 0)
 		die("Couldn't bring up loopback interface in namespace: %s",
 		    strerror(-rc));
@@ -306,7 +306,7 @@ void pasta_ns_conf(struct ctx *c)
 		if (c->mtu != -1)
 			nl_link_set_mtu(nl_sock_ns, c->pasta_ifi, c->mtu);
 
-		nl_link_up(nl_sock_ns, c->pasta_ifi);
+		nl_link_set_flags(nl_sock_ns, c->pasta_ifi, IFF_UP, IFF_UP);
 
 		if (c->ifi4) {
 			if (c->ip4.no_copy_addrs) {
-- 
2.43.0



* [PATCH 4/7] netlink, pasta: Disable DAD for link-local addresses on namespace interface
  2024-08-14 22:54 [PATCH 0/7] Prevent DAD for link-local addresses in containers Stefano Brivio
                   ` (2 preceding siblings ...)
  2024-08-14 22:54 ` [PATCH 3/7] netlink, pasta: Turn nl_link_up() into a generic function to set link flags Stefano Brivio
@ 2024-08-14 22:54 ` Stefano Brivio
  2024-08-15  3:01   ` David Gibson
  2024-08-14 22:54 ` [PATCH 5/7] netlink, pasta: Fetch link-local address from namespace interface once it's up Stefano Brivio
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Stefano Brivio @ 2024-08-14 22:54 UTC (permalink / raw)
  To: passt-dev; +Cc: Paul Holzinger

It makes no sense for a container or a guest to try and perform
duplicate address detection for their link-local address, as we won't
relay neighbour solicitations with an unspecified source address
anyway.

While they perform duplicate address detection, the link-local address
is not usable, which prevents us from bringing up containers, in
particular, and communicating with them right away via IPv6.

This is not enough to prevent DAD and reach the container right away:
we'll need a couple more patches.

A large part of the function setting the nodad attribute is copied^W
vendored from nl_route_dup(), and we could probably refactor things
to avoid code duplication, eventually, but keep this simple for the
moment.

Link: https://github.com/containers/podman/pull/23561#discussion_r1711639663
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 netlink.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 netlink.h |  1 +
 pasta.c   |  6 ++++
 3 files changed, 104 insertions(+)

diff --git a/netlink.c b/netlink.c
index 873e6c7..4b49de1 100644
--- a/netlink.c
+++ b/netlink.c
@@ -673,6 +673,103 @@ int nl_route_dup(int s_src, unsigned int ifi_src,
 	return 0;
 }
 
+/**
+ * nl_addr_set_ll_nodad() - Set IFA_F_NODAD on IPv6 link-local addresses
+ * @s:		Netlink socket
+ * @ifi:	Interface index in target namespace
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int nl_addr_set_ll_nodad(int s, unsigned int ifi)
+{
+	struct req_t {
+		struct nlmsghdr nlh;
+		struct ifaddrmsg ifa;
+	} req = {
+		.ifa.ifa_family    = AF_INET6,
+		.ifa.ifa_index     = ifi,
+	};
+	ssize_t nlmsgs_size, left, status;
+	unsigned ll_addrs = 0;
+	struct nlmsghdr *nh;
+	char buf[NLBUFSIZ];
+	uint32_t seq;
+	unsigned i;
+
+	seq = nl_send(s, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
+
+	/* nl_foreach() will step through multiple response datagrams,
+	 * which we don't want here because we need to have all the
+	 * addresses in the buffer at once. See also nl_route_dup().
+	 */
+	nh = nl_next(s, buf, NULL, &nlmsgs_size);
+	for (left = nlmsgs_size;
+	     NLMSG_OK(nh, left) && (status = nl_status(nh, left, seq)) > 0;
+	     nh = NLMSG_NEXT(nh, left)) {
+		struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
+		bool discard = false;
+		struct rtattr *rta;
+		size_t na;
+
+		if (nh->nlmsg_type != RTM_NEWADDR)
+			continue;
+
+		if (ifa->ifa_index != ifi || ifa->ifa_scope != RT_SCOPE_LINK)
+			discard = true;
+
+		ifa->ifa_flags |= IFA_F_NODAD;
+
+		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
+		     rta = RTA_NEXT(rta, na)) {
+			/* If 32-bit flags are used, add IFA_F_NODAD there */
+			if (rta->rta_type == IFA_FLAGS)
+				*(uint32_t *)RTA_DATA(rta) |= IFA_F_NODAD;
+		}
+
+		if (discard)
+			nh->nlmsg_type = NLMSG_NOOP;
+		else
+			ll_addrs++;
+	}
+
+	if (!NLMSG_OK(nh, left)) {
+		/* Process any remaining datagrams in a different
+		 * buffer so we don't overwrite the first one.
+		 */
+		char tail[NLBUFSIZ];
+		unsigned extra = 0;
+
+		nl_foreach_oftype(nh, status, s, tail, seq, RTM_NEWADDR)
+			extra++;
+
+		if (extra) {
+			err("netlink: Too many link-local addresses");
+			return -E2BIG;
+		}
+	}
+
+	if (status < 0)
+		return status;
+
+	for (i = 0; i < ll_addrs; i++) {
+		for (nh = (struct nlmsghdr *)buf, left = nlmsgs_size;
+		     NLMSG_OK(nh, left);
+		     nh = NLMSG_NEXT(nh, left)) {
+			int rc;
+
+			if (nh->nlmsg_type != RTM_NEWADDR)
+				continue;
+
+			rc = nl_do(s, nh, RTM_NEWADDR, NLM_F_REPLACE,
+				nh->nlmsg_len);
+			if (rc < 0)
+				return rc;
+		}
+	}
+
+	return 0;
+}
+
 /**
  * nl_addr_get() - Get most specific global address, given interface and family
  * @s:		Netlink socket
diff --git a/netlink.h b/netlink.h
index 178f8ae..66a44ad 100644
--- a/netlink.h
+++ b/netlink.h
@@ -19,6 +19,7 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
 		void *addr, int *prefix_len, void *addr_l);
 int nl_addr_set(int s, unsigned int ifi, sa_family_t af,
 		const void *addr, int prefix_len);
+int nl_addr_set_ll_nodad(int s, unsigned int ifi);
 int nl_addr_dup(int s_src, unsigned int ifi_src,
 		int s_dst, unsigned int ifi_dst, sa_family_t af);
 int nl_link_get_mac(int s, unsigned int ifi, void *mac);
diff --git a/pasta.c b/pasta.c
index 96545b1..838bbb3 100644
--- a/pasta.c
+++ b/pasta.c
@@ -340,6 +340,12 @@ void pasta_ns_conf(struct ctx *c)
 		}
 
 		if (c->ifi6) {
+			rc = nl_addr_set_ll_nodad(nl_sock_ns, c->pasta_ifi);
+			if (rc < 0) {
+				die("Can't disable DAD for LL in namespace: %s",
+				    strerror(-rc));
+			}
+
 			if (c->ip6.no_copy_addrs) {
 				rc = nl_addr_set(nl_sock_ns, c->pasta_ifi,
 						 AF_INET6, &c->ip6.addr, 64);
-- 
2.43.0



* [PATCH 5/7] netlink, pasta: Fetch link-local address from namespace interface once it's up
  2024-08-14 22:54 [PATCH 0/7] Prevent DAD for link-local addresses in containers Stefano Brivio
                   ` (3 preceding siblings ...)
  2024-08-14 22:54 ` [PATCH 4/7] netlink, pasta: Disable DAD for link-local addresses on namespace interface Stefano Brivio
@ 2024-08-14 22:54 ` Stefano Brivio
  2024-08-15  3:04   ` David Gibson
  2024-08-14 22:54 ` [PATCH 6/7] pasta: Disable neighbour solicitations on device up to prevent DAD Stefano Brivio
  2024-08-14 22:54 ` [PATCH 7/7] netlink: Fix typo in function comment for nl_addr_set() Stefano Brivio
  6 siblings, 1 reply; 17+ messages in thread
From: Stefano Brivio @ 2024-08-14 22:54 UTC (permalink / raw)
  To: passt-dev; +Cc: Paul Holzinger

As soon as we bring up the interface, the Linux kernel will set up a
link-local address for it, so we can fetch it and start using it right
away, if we need a link-local address to communicate with the
container before we see any traffic coming from it.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 netlink.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 netlink.h |  1 +
 pasta.c   |  7 +++++++
 3 files changed, 55 insertions(+)

diff --git a/netlink.c b/netlink.c
index 4b49de1..3b37087 100644
--- a/netlink.c
+++ b/netlink.c
@@ -836,6 +836,53 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
 	return status;
 }
 
+/**
+ * nl_addr_get_ll() - Get first IPv6 link-local address for a given interface
+ * @s:		Netlink socket
+ * @ifi:	Interface index in outer network namespace
+ * @addr:	Link-local address to fill
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int nl_addr_get_ll(int s, unsigned int ifi, void *addr)
+{
+	struct req_t {
+		struct nlmsghdr nlh;
+		struct ifaddrmsg ifa;
+	} req = {
+		.ifa.ifa_family		= AF_INET6,
+		.ifa.ifa_index		= ifi,
+	};
+	struct nlmsghdr *nh;
+	bool found = false;
+	char buf[NLBUFSIZ];
+	ssize_t status;
+	uint32_t seq;
+
+	seq = nl_send(s, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
+	nl_foreach_oftype(nh, status, s, buf, seq, RTM_NEWADDR) {
+		struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
+		struct rtattr *rta;
+		size_t na;
+
+		if (ifa->ifa_index != ifi || ifa->ifa_scope != RT_SCOPE_LINK ||
+		    found)
+			continue;
+
+		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
+		     rta = RTA_NEXT(rta, na)) {
+			if (rta->rta_type != IFA_ADDRESS)
+				continue;
+
+			if (!found) {
+				memcpy(addr, RTA_DATA(rta), RTA_PAYLOAD(rta));
+				found = true;
+			}
+		}
+	}
+	return status;
+}
+
 /**
  * nl_add_set() - Set IP addresses for given interface and address family
  * @s:		Netlink socket
diff --git a/netlink.h b/netlink.h
index 66a44ad..bdfdef0 100644
--- a/netlink.h
+++ b/netlink.h
@@ -19,6 +19,7 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
 		void *addr, int *prefix_len, void *addr_l);
 int nl_addr_set(int s, unsigned int ifi, sa_family_t af,
 		const void *addr, int prefix_len);
+int nl_addr_get_ll(int s, unsigned int ifi, void *addr);
 int nl_addr_set_ll_nodad(int s, unsigned int ifi);
 int nl_addr_dup(int s_src, unsigned int ifi_src,
 		int s_dst, unsigned int ifi_dst, sa_family_t af);
diff --git a/pasta.c b/pasta.c
index 838bbb3..cebf54f 100644
--- a/pasta.c
+++ b/pasta.c
@@ -340,6 +340,13 @@ void pasta_ns_conf(struct ctx *c)
 		}
 
 		if (c->ifi6) {
+			rc = nl_addr_get_ll(nl_sock_ns, c->pasta_ifi,
+					    &c->ip6.addr_ll_seen);
+			if (rc < 0) {
+				die("Can't fetch LL address from namespace: %s",
+				    strerror(-rc));
+			}
+
 			rc = nl_addr_set_ll_nodad(nl_sock_ns, c->pasta_ifi);
 			if (rc < 0) {
 				die("Can't disable DAD for LL in namespace: %s",
-- 
2.43.0



* [PATCH 6/7] pasta: Disable neighbour solicitations on device up to prevent DAD
  2024-08-14 22:54 [PATCH 0/7] Prevent DAD for link-local addresses in containers Stefano Brivio
                   ` (4 preceding siblings ...)
  2024-08-14 22:54 ` [PATCH 5/7] netlink, pasta: Fetch link-local address from namespace interface once it's up Stefano Brivio
@ 2024-08-14 22:54 ` Stefano Brivio
  2024-08-15  3:06   ` David Gibson
  2024-08-14 22:54 ` [PATCH 7/7] netlink: Fix typo in function comment for nl_addr_set() Stefano Brivio
  6 siblings, 1 reply; 17+ messages in thread
From: Stefano Brivio @ 2024-08-14 22:54 UTC (permalink / raw)
  To: passt-dev; +Cc: Paul Holzinger

As soon as the kernel notifier for IPv6 address configuration
(addrconf_notify()) sees that we bring the target interface up
(NETDEV_UP), it will schedule duplicate address detection, so, by
itself, setting the nodad flag later is useless, because that won't
stop a detection that's already in progress.

However, if we disable neighbour solicitations with IFF_NOARP (which
is a misnomer for IPv6 interfaces, but there's no possibility of
mixing things up), the notifier will not trigger DAD, because it can't
be done, of course, without neighbour solicitations.

Set IFF_NOARP as we bring up the device, and drop it after we've had a
chance to set the nodad attribute on the link-local address.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 pasta.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/pasta.c b/pasta.c
index cebf54f..babbfd5 100644
--- a/pasta.c
+++ b/pasta.c
@@ -303,10 +303,15 @@ void pasta_ns_conf(struct ctx *c)
 		    strerror(-rc));
 
 	if (c->pasta_conf_ns) {
+		unsigned int flags = IFF_UP;
+
 		if (c->mtu != -1)
 			nl_link_set_mtu(nl_sock_ns, c->pasta_ifi, c->mtu);
 
-		nl_link_set_flags(nl_sock_ns, c->pasta_ifi, IFF_UP, IFF_UP);
+		if (c->ifi6) /* Avoid duplicate address detection on link up */
+			flags |= IFF_NOARP;
+
+		nl_link_set_flags(nl_sock_ns, c->pasta_ifi, flags, flags);
 
 		if (c->ifi4) {
 			if (c->ip4.no_copy_addrs) {
@@ -353,6 +358,10 @@ void pasta_ns_conf(struct ctx *c)
 				    strerror(-rc));
 			}
 
+			/* We dodged DAD: re-enable neighbour solicitations */
+			nl_link_set_flags(nl_sock_ns, c->pasta_ifi,
+					  0, IFF_NOARP);
+
 			if (c->ip6.no_copy_addrs) {
 				rc = nl_addr_set(nl_sock_ns, c->pasta_ifi,
 						 AF_INET6, &c->ip6.addr, 64);
-- 
2.43.0



* [PATCH 7/7] netlink: Fix typo in function comment for nl_addr_set()
  2024-08-14 22:54 [PATCH 0/7] Prevent DAD for link-local addresses in containers Stefano Brivio
                   ` (5 preceding siblings ...)
  2024-08-14 22:54 ` [PATCH 6/7] pasta: Disable neighbour solicitations on device up to prevent DAD Stefano Brivio
@ 2024-08-14 22:54 ` Stefano Brivio
  2024-08-15  3:07   ` David Gibson
  6 siblings, 1 reply; 17+ messages in thread
From: Stefano Brivio @ 2024-08-14 22:54 UTC (permalink / raw)
  To: passt-dev; +Cc: Paul Holzinger

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/netlink.c b/netlink.c
index 3b37087..142f110 100644
--- a/netlink.c
+++ b/netlink.c
@@ -884,7 +884,7 @@ int nl_addr_get_ll(int s, unsigned int ifi, void *addr)
 }
 
 /**
- * nl_add_set() - Set IP addresses for given interface and address family
+ * nl_addr_set() - Set IP addresses for given interface and address family
  * @s:		Netlink socket
  * @ifi:	Interface index
  * @af:		Address family
-- 
2.43.0



* Re: [PATCH 1/7] netlink: Fix typo in function comment for nl_addr_get()
  2024-08-14 22:54 ` [PATCH 1/7] netlink: Fix typo in function comment for nl_addr_get() Stefano Brivio
@ 2024-08-15  2:39   ` David Gibson
  0 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2024-08-15  2:39 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Paul Holzinger

On Thu, Aug 15, 2024 at 12:54:23AM +0200, Stefano Brivio wrote:
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Heh.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  netlink.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/netlink.c b/netlink.c
> index 093de26..e6a315e 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -682,7 +682,7 @@ int nl_route_dup(int s_src, unsigned int ifi_src,
>   * @prefix_len:	Mask or prefix length, to fill (for IPv4)
>   * @addr_l:	Link-scoped address to fill (for IPv6)
>   *
> - * Return: 9 on success, negative error code on failure
> + * Return: 0 on success, negative error code on failure
>   */
>  int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
>  		void *addr, int *prefix_len, void *addr_l)

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson


* Re: [PATCH 2/7] netlink, pasta: Split MTU setting functionality out of nl_link_up()
  2024-08-14 22:54 ` [PATCH 2/7] netlink, pasta: Split MTU setting functionality out of nl_link_up() Stefano Brivio
@ 2024-08-15  2:41   ` David Gibson
  0 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2024-08-15  2:41 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Paul Holzinger

On Thu, Aug 15, 2024 at 12:54:24AM +0200, Stefano Brivio wrote:
> As we'll use nl_link_up() for more than just bringing up devices, it
> will become awkward to carry empty MTU values around whenever we call
> it.

I prefer clearer single purpose functions too.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  netlink.c | 35 +++++++++++++++++++++++++----------
>  netlink.h |  3 ++-
>  pasta.c   |  7 +++++--
>  3 files changed, 32 insertions(+), 13 deletions(-)
> 
> diff --git a/netlink.c b/netlink.c
> index e6a315e..e33765e 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -942,14 +942,14 @@ int nl_link_set_mac(int s, unsigned int ifi, const void *mac)
>  }
>  
>  /**
> - * nl_link_up() - Bring link up
> + * nl_link_set_mtu() - Set link MTU
>   * @s:		Netlink socket
>   * @ifi:	Interface index
> - * @mtu:	If non-zero, set interface MTU
> + * @mtu:	Interface MTU
>   *
>   * Return: 0 on success, negative error code on failure
>   */
> -int nl_link_up(int s, unsigned int ifi, int mtu)
> +int nl_link_set_mtu(int s, unsigned int ifi, int mtu)
>  {
>  	struct req_t {
>  		struct nlmsghdr nlh;
> @@ -959,17 +959,32 @@ int nl_link_up(int s, unsigned int ifi, int mtu)
>  	} req = {
>  		.ifm.ifi_family	  = AF_UNSPEC,
>  		.ifm.ifi_index	  = ifi,
> -		.ifm.ifi_flags	  = IFF_UP,
> -		.ifm.ifi_change	  = IFF_UP,
>  		.rta.rta_type	  = IFLA_MTU,
>  		.rta.rta_len	  = RTA_LENGTH(sizeof(unsigned int)),
>  		.mtu		  = mtu,
>  	};
> -	ssize_t len = sizeof(req);
>  
> -	if (!mtu)
> -		/* Shorten request to drop MTU attribute */
> -		len = offsetof(struct req_t, rta);
> +	return nl_do(s, &req, RTM_NEWLINK, 0, sizeof(req));
> +}
> +
> +/**
> + * nl_link_up() - Bring link up
> + * @s:		Netlink socket
> + * @ifi:	Interface index
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int nl_link_up(int s, unsigned int ifi)
> +{
> +	struct req_t {
> +		struct nlmsghdr nlh;
> +		struct ifinfomsg ifm;
> +	} req = {
> +		.ifm.ifi_family	  = AF_UNSPEC,
> +		.ifm.ifi_index	  = ifi,
> +		.ifm.ifi_flags	  = IFF_UP,
> +		.ifm.ifi_change	  = IFF_UP,
> +	};
>  
> -	return nl_do(s, &req, RTM_NEWLINK, 0, len);
> +	return nl_do(s, &req, RTM_NEWLINK, 0, sizeof(req));
>  }
> diff --git a/netlink.h b/netlink.h
> index 3a1f0de..87d27ae 100644
> --- a/netlink.h
> +++ b/netlink.h
> @@ -23,6 +23,7 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
>  		int s_dst, unsigned int ifi_dst, sa_family_t af);
>  int nl_link_get_mac(int s, unsigned int ifi, void *mac);
>  int nl_link_set_mac(int s, unsigned int ifi, const void *mac);
> -int nl_link_up(int s, unsigned int ifi, int mtu);
> +int nl_link_set_mtu(int s, unsigned int ifi, int mtu);
> +int nl_link_up(int s, unsigned int ifi);
>  
>  #endif /* NETLINK_H */
> diff --git a/pasta.c b/pasta.c
> index 615ff7b..3a0652e 100644
> --- a/pasta.c
> +++ b/pasta.c
> @@ -288,7 +288,7 @@ void pasta_ns_conf(struct ctx *c)
>  {
>  	int rc = 0;
>  
> -	rc = nl_link_up(nl_sock_ns, 1 /* lo */, 0);
> +	rc = nl_link_up(nl_sock_ns, 1 /* lo */);
>  	if (rc < 0)
>  		die("Couldn't bring up loopback interface in namespace: %s",
>  		    strerror(-rc));
> @@ -303,7 +303,10 @@ void pasta_ns_conf(struct ctx *c)
>  		    strerror(-rc));
>  
>  	if (c->pasta_conf_ns) {
> -		nl_link_up(nl_sock_ns, c->pasta_ifi, c->mtu);
> +		if (c->mtu != -1)
> +			nl_link_set_mtu(nl_sock_ns, c->pasta_ifi, c->mtu);
> +
> +		nl_link_up(nl_sock_ns, c->pasta_ifi);
>  
>  		if (c->ifi4) {
>  			if (c->ip4.no_copy_addrs) {

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson


* Re: [PATCH 3/7] netlink, pasta: Turn nl_link_up() into a generic function to set link flags
  2024-08-14 22:54 ` [PATCH 3/7] netlink, pasta: Turn nl_link_up() into a generic function to set link flags Stefano Brivio
@ 2024-08-15  2:42   ` David Gibson
  0 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2024-08-15  2:42 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Paul Holzinger

On Thu, Aug 15, 2024 at 12:54:25AM +0200, Stefano Brivio wrote:
> In the next patches, we'll reuse it to set flags other than IFF_UP.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

If we had more call sites, it might be nice to have a wrapper that just
brings a link up, but there are only two callers, so:

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  netlink.c | 11 +++++++----
>  netlink.h |  3 ++-
>  pasta.c   |  4 ++--
>  3 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/netlink.c b/netlink.c
> index e33765e..873e6c7 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -968,13 +968,16 @@ int nl_link_set_mtu(int s, unsigned int ifi, int mtu)
>  }
>  
>  /**
> - * nl_link_up() - Bring link up
> + * nl_link_set_flags() - Set link flags
>   * @s:		Netlink socket
>   * @ifi:	Interface index
> + * @set:	Device flags to set
> + * @change:	Mask of device flag changes
>   *
>   * Return: 0 on success, negative error code on failure
>   */
> -int nl_link_up(int s, unsigned int ifi)
> +int nl_link_set_flags(int s, unsigned int ifi,
> +		      unsigned int set, unsigned int change)
>  {
>  	struct req_t {
>  		struct nlmsghdr nlh;
> @@ -982,8 +985,8 @@ int nl_link_up(int s, unsigned int ifi)
>  	} req = {
>  		.ifm.ifi_family	  = AF_UNSPEC,
>  		.ifm.ifi_index	  = ifi,
> -		.ifm.ifi_flags	  = IFF_UP,
> -		.ifm.ifi_change	  = IFF_UP,
> +		.ifm.ifi_flags	  = set,
> +		.ifm.ifi_change	  = change,
>  	};
>  
>  	return nl_do(s, &req, RTM_NEWLINK, 0, sizeof(req));
> diff --git a/netlink.h b/netlink.h
> index 87d27ae..178f8ae 100644
> --- a/netlink.h
> +++ b/netlink.h
> @@ -24,6 +24,7 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
>  int nl_link_get_mac(int s, unsigned int ifi, void *mac);
>  int nl_link_set_mac(int s, unsigned int ifi, const void *mac);
>  int nl_link_set_mtu(int s, unsigned int ifi, int mtu);
> -int nl_link_up(int s, unsigned int ifi);
> +int nl_link_set_flags(int s, unsigned int ifi,
> +		      unsigned int set, unsigned int change);
>  
>  #endif /* NETLINK_H */
> diff --git a/pasta.c b/pasta.c
> index 3a0652e..96545b1 100644
> --- a/pasta.c
> +++ b/pasta.c
> @@ -288,7 +288,7 @@ void pasta_ns_conf(struct ctx *c)
>  {
>  	int rc = 0;
>  
> -	rc = nl_link_up(nl_sock_ns, 1 /* lo */);
> +	rc = nl_link_set_flags(nl_sock_ns, 1 /* lo */, IFF_UP, IFF_UP);
>  	if (rc < 0)
>  		die("Couldn't bring up loopback interface in namespace: %s",
>  		    strerror(-rc));
> @@ -306,7 +306,7 @@ void pasta_ns_conf(struct ctx *c)
>  		if (c->mtu != -1)
>  			nl_link_set_mtu(nl_sock_ns, c->pasta_ifi, c->mtu);
>  
> -		nl_link_up(nl_sock_ns, c->pasta_ifi);
> +		nl_link_set_flags(nl_sock_ns, c->pasta_ifi, IFF_UP, IFF_UP);
>  
>  		if (c->ifi4) {
>  			if (c->ip4.no_copy_addrs) {

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson


* Re: [PATCH 4/7] netlink, pasta: Disable DAD for link-local addresses on namespace interface
  2024-08-14 22:54 ` [PATCH 4/7] netlink, pasta: Disable DAD for link-local addresses on namespace interface Stefano Brivio
@ 2024-08-15  3:01   ` David Gibson
  2024-08-15  6:52     ` Stefano Brivio
  0 siblings, 1 reply; 17+ messages in thread
From: David Gibson @ 2024-08-15  3:01 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Paul Holzinger

On Thu, Aug 15, 2024 at 12:54:26AM +0200, Stefano Brivio wrote:
> It makes no sense for a container or a guest to try and perform
> duplicate address detection for their link-local address, as we'll
> anyway not relay neighbour solicitations with an unspecified source
> address.
> 
> While they perform duplicate address detection, the link-local address
> is not usable, which prevents us from bringing up containers, in
> particular, and communicating with them right away via IPv6.
> 
> This is not enough to prevent DAD and reach the container right away:
> we'll need a couple more patches.
> 
> A large part of the function setting the nodad attribute is copied^W
> vendored from nl_route_dup(), and we could probably refactor things
> to avoid code duplication, eventually, but keep this simple for the
> moment.

I don't really care about the duplication, but I'm not sure
nl_route_dup() was the right thing to vendor.

> Link: https://github.com/containers/podman/pull/23561#discussion_r1711639663
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  netlink.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  netlink.h |  1 +
>  pasta.c   |  6 ++++
>  3 files changed, 104 insertions(+)
> 
> diff --git a/netlink.c b/netlink.c
> index 873e6c7..4b49de1 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -673,6 +673,103 @@ int nl_route_dup(int s_src, unsigned int ifi_src,
>  	return 0;
>  }
>  
> +/**
> + * nl_addr_set_ll_nodad() - Set IFA_F_NODAD on IPv6 link-local addresses
> + * @s:		Netlink socket
> + * @ifi:	Interface index in target namespace
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int nl_addr_set_ll_nodad(int s, unsigned int ifi)
> +{
> +	struct req_t {
> +		struct nlmsghdr nlh;
> +		struct ifaddrmsg ifa;
> +	} req = {
> +		.ifa.ifa_family    = AF_INET6,
> +		.ifa.ifa_index     = ifi,
> +	};
> +	ssize_t nlmsgs_size, left, status;
> +	unsigned ll_addrs = 0;
> +	struct nlmsghdr *nh;
> +	char buf[NLBUFSIZ];
> +	uint32_t seq;
> +	unsigned i;
> +
> +	seq = nl_send(s, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
> +
> +	/* nl_foreach() will step through multiple response datagrams,
> +	 * which we don't want here because we need to have all the
> +	 * addresses in the buffer at once. See also nl_route_dup().

Hmm.. do we need them all in the buffer at once, though?  For
nl_route_dup() we needed that because we take multiple passes through
the whole list, and that's not the case here.  I guess we can't do an
nl_do() within the loop, because that will expect the response to its
own command while we're still getting responses from the original
NLM_F_DUMP.  nl_addr_dup() gets away with it because the nl_do()s are
on a different netlink socket.

But.. I think we could nl_send() each NODAD request as we construct
it, keep a count, then wait for all the queued responses.  It means we
can't easily match an error response to the request that caused it,
but it doesn't look like we were reporting in that much detail anyway.
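
A rough sketch of the "keep a count, then wait for all the queued
responses" part, for illustration only: drain_acks() is a hypothetical
helper, not part of passt, and it assumes every queued request asked for
an acknowledgement, so that each one is answered by an NLMSG_ERROR
message (error 0 on success). It reports only the first failure, as it
can't tell which request caused it.

```c
#include <assert.h>
#include <errno.h>
#include <linux/netlink.h>

/* Hypothetical helper, not part of passt: walk @len bytes of received
 * netlink messages, expecting @expected NLMSG_ERROR acknowledgements.
 *
 * Return: 0 if all of them report success, the first negative error code
 * otherwise, -EPROTO if fewer acknowledgements than expected are found
 */
static int drain_acks(const void *buf, int len, unsigned int expected)
{
	const struct nlmsghdr *nh = buf;
	unsigned int seen = 0;
	int first_err = 0;

	assert(buf);

	for (; seen < expected && NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
		const struct nlmsgerr *e;

		if (nh->nlmsg_type != NLMSG_ERROR)
			continue;

		if (nh->nlmsg_len < NLMSG_LENGTH(sizeof(*e)))
			return -EPROTO;

		e = NLMSG_DATA(nh);
		if (e->error < 0 && !first_err)
			first_err = e->error;	/* No idea which request failed */

		seen++;
	}

	return seen == expected ? first_err : -EPROTO;
}
```

The caller would bump a counter for each queued request, and pass that
count as @expected once the dump has been fully read.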

> +	 */
> +	nh = nl_next(s, buf, NULL, &nlmsgs_size);
> +	for (left = nlmsgs_size;
> +	     NLMSG_OK(nh, left) && (status = nl_status(nh, left, seq)) > 0;
> +	     nh = NLMSG_NEXT(nh, left)) {
> +		struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
> +		bool discard = false;
> +		struct rtattr *rta;
> +		size_t na;
> +
> +		if (nh->nlmsg_type != RTM_NEWADDR)
> +			continue;
> +
> +		if (ifa->ifa_index != ifi || ifa->ifa_scope != RT_SCOPE_LINK)
> +			discard = true;
> +
> +		ifa->ifa_flags |= IFA_F_NODAD;
> +
> +		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
> +		     rta = RTA_NEXT(rta, na)) {
> +			/* If 32-bit flags are used, add IFA_F_NODAD there */
> +			if (rta->rta_type == IFA_FLAGS)
> +				*(uint32_t *)RTA_DATA(rta) |= IFA_F_NODAD;
> +		}
> +
> +		if (discard)
> +			nh->nlmsg_type = NLMSG_NOOP;
> +		else
> +			ll_addrs++;
> +	}
> +
> +	if (!NLMSG_OK(nh, left)) {
> +		/* Process any remaining datagrams in a different
> +		 * buffer so we don't overwrite the first one.
> +		 */
> +		char tail[NLBUFSIZ];
> +		unsigned extra = 0;
> +
> +		nl_foreach_oftype(nh, status, s, tail, seq, RTM_NEWADDR)
> +			extra++;
> +
> +		if (extra) {
> +			err("netlink: Too many link-local addresses");
> +			return -E2BIG;
> +		}
> +	}
> +
> +	if (status < 0)
> +		return status;
> +
> +	for (i = 0; i < ll_addrs; i++) {
> +		for (nh = (struct nlmsghdr *)buf, left = nlmsgs_size;
> +		     NLMSG_OK(nh, left);
> +		     nh = NLMSG_NEXT(nh, left)) {
> +			int rc;
> +
> +			if (nh->nlmsg_type != RTM_NEWADDR)
> +				continue;
> +
> +			rc = nl_do(s, nh, RTM_NEWADDR, NLM_F_REPLACE,
> +				nh->nlmsg_len);
> +			if (rc < 0)
> +				return rc;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * nl_addr_get() - Get most specific global address, given interface and family
>   * @s:		Netlink socket
> diff --git a/netlink.h b/netlink.h
> index 178f8ae..66a44ad 100644
> --- a/netlink.h
> +++ b/netlink.h
> @@ -19,6 +19,7 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
>  		void *addr, int *prefix_len, void *addr_l);
>  int nl_addr_set(int s, unsigned int ifi, sa_family_t af,
>  		const void *addr, int prefix_len);
> +int nl_addr_set_ll_nodad(int s, unsigned int ifi);
>  int nl_addr_dup(int s_src, unsigned int ifi_src,
>  		int s_dst, unsigned int ifi_dst, sa_family_t af);
>  int nl_link_get_mac(int s, unsigned int ifi, void *mac);
> diff --git a/pasta.c b/pasta.c
> index 96545b1..838bbb3 100644
> --- a/pasta.c
> +++ b/pasta.c
> @@ -340,6 +340,12 @@ void pasta_ns_conf(struct ctx *c)
>  		}
>  
>  		if (c->ifi6) {
> +			rc = nl_addr_set_ll_nodad(nl_sock_ns, c->pasta_ifi);
> +			if (rc < 0) {
> +				die("Can't disable DAD for LL in namespace: %s",
> +				    strerror(-rc));

So... I'm usually the one arguing *for* ASSERT()s and die()s, but in
this case it seems overly drastic.  If we're unable to disable DAD it
will slow things down, but mostly things should still work.  I'd prefer
to see this as just a warn().

> +			}
> +
>  			if (c->ip6.no_copy_addrs) {
>  				rc = nl_addr_set(nl_sock_ns, c->pasta_ifi,
>  						 AF_INET6, &c->ip6.addr, 64);
> -- 
> 2.43.0
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson


* Re: [PATCH 5/7] netlink, pasta: Fetch link-local address from namespace interface once it's up
  2024-08-14 22:54 ` [PATCH 5/7] netlink, pasta: Fetch link-local address from namespace interface once it's up Stefano Brivio
@ 2024-08-15  3:04   ` David Gibson
  2024-08-15  6:53     ` Stefano Brivio
  0 siblings, 1 reply; 17+ messages in thread
From: David Gibson @ 2024-08-15  3:04 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Paul Holzinger

On Thu, Aug 15, 2024 at 12:54:27AM +0200, Stefano Brivio wrote:
> As soon as we bring up the interface, the Linux kernel will set up a
> link-local address for it, so we can fetch it and start using it right
> away, if we need a link-local address to communicate with the container
> before we see any traffic coming from it.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  netlink.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>  netlink.h |  1 +
>  pasta.c   |  7 +++++++
>  3 files changed, 55 insertions(+)
> 
> diff --git a/netlink.c b/netlink.c
> index 4b49de1..3b37087 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -836,6 +836,53 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
>  	return status;
>  }
>  
> +/**
> + * nl_addr_get_ll() - Get first IPv6 link-local address for a given interface
> + * @s:		Netlink socket
> + * @ifi:	Interface index in outer network namespace
> + * @addr:	Link-local address to fill
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int nl_addr_get_ll(int s, unsigned int ifi, void *addr)

Given this is explicitly for IPv6, I don't see a reason not to use
(struct in6_addr *addr) for greater type safety.
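
To illustrate what the typed parameter buys, here is a minimal sketch,
not code from the series: ll_from_payload() is a hypothetical helper
that copies an address attribute payload into a struct in6_addr. With a
typed destination, the expected size is known at compile time, so a
short or oversized payload can be rejected instead of being copied
blindly. The link-local check is an extra assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical helper, not from the patch: copy an address attribute
 * payload of @len bytes into a typed IPv6 address.
 *
 * Return: 0 on success, -EINVAL on unexpected payload size,
 *	   -EADDRNOTAVAIL if the address isn't link-local
 */
static int ll_from_payload(struct in6_addr *addr, const void *data, size_t len)
{
	struct in6_addr tmp;

	assert(addr && data);

	/* The destination type tells us exactly how much we can copy */
	if (len != sizeof(tmp))
		return -EINVAL;

	memcpy(&tmp, data, sizeof(tmp));
	if (!IN6_IS_ADDR_LINKLOCAL(&tmp))
		return -EADDRNOTAVAIL;

	*addr = tmp;
	return 0;
}
```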

> +{
> +	struct req_t {
> +		struct nlmsghdr nlh;
> +		struct ifaddrmsg ifa;
> +	} req = {
> +		.ifa.ifa_family		= AF_INET6,
> +		.ifa.ifa_index		= ifi,
> +	};
> +	struct nlmsghdr *nh;
> +	bool found = false;
> +	char buf[NLBUFSIZ];
> +	ssize_t status;
> +	uint32_t seq;
> +
> +	seq = nl_send(s, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
> +	nl_foreach_oftype(nh, status, s, buf, seq, RTM_NEWADDR) {
> +		struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
> +		struct rtattr *rta;
> +		size_t na;
> +
> +		if (ifa->ifa_index != ifi || ifa->ifa_scope != RT_SCOPE_LINK ||
> +		    found)
> +			continue;
> +
> +		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
> +		     rta = RTA_NEXT(rta, na)) {
> +			if (rta->rta_type != IFA_ADDRESS)
> +				continue;
> +
> +			if (!found) {
> +				memcpy(addr, RTA_DATA(rta), RTA_PAYLOAD(rta));
> +				found = true;
> +			}
> +		}
> +	}
> +	return status;
> +}
> +
>  /**
>   * nl_add_set() - Set IP addresses for given interface and address family
>   * @s:		Netlink socket
> diff --git a/netlink.h b/netlink.h
> index 66a44ad..bdfdef0 100644
> --- a/netlink.h
> +++ b/netlink.h
> @@ -19,6 +19,7 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
>  		void *addr, int *prefix_len, void *addr_l);
>  int nl_addr_set(int s, unsigned int ifi, sa_family_t af,
>  		const void *addr, int prefix_len);
> +int nl_addr_get_ll(int s, unsigned int ifi, void *addr);
>  int nl_addr_set_ll_nodad(int s, unsigned int ifi);
>  int nl_addr_dup(int s_src, unsigned int ifi_src,
>  		int s_dst, unsigned int ifi_dst, sa_family_t af);
> diff --git a/pasta.c b/pasta.c
> index 838bbb3..cebf54f 100644
> --- a/pasta.c
> +++ b/pasta.c
> @@ -340,6 +340,13 @@ void pasta_ns_conf(struct ctx *c)
>  		}
>  
>  		if (c->ifi6) {
> +			rc = nl_addr_get_ll(nl_sock_ns, c->pasta_ifi,
> +					    &c->ip6.addr_ll_seen);
> +			if (rc < 0) {
> +				die("Can't fetch LL address from namespace: %s",
> +				    strerror(-rc));

Again, we can generally cope with not having an addr_ll_seen
initially, so I think a warn() would make more sense.

> +			}
> +
>  			rc = nl_addr_set_ll_nodad(nl_sock_ns, c->pasta_ifi);
>  			if (rc < 0) {
>  				die("Can't disable DAD for LL in namespace: %s",

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson


* Re: [PATCH 6/7] pasta: Disable neighbour solicitations on device up to prevent DAD
  2024-08-14 22:54 ` [PATCH 6/7] pasta: Disable neighbour solicitations on device up to prevent DAD Stefano Brivio
@ 2024-08-15  3:06   ` David Gibson
  0 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2024-08-15  3:06 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Paul Holzinger

On Thu, Aug 15, 2024 at 12:54:28AM +0200, Stefano Brivio wrote:
> As soon as the kernel notifier for IPv6 address configuration
> (addrconf_notify()) sees that we bring the target interface up
> (NETDEV_UP), it will schedule duplicate address detection, so, by
> itself, setting the nodad flag later is useless, because that won't
> stop a detection that's already in progress.

Ah, I did wonder about that on the earlier patch.

> However, if we disable neighbour solicitations with IFF_NOARP (which
> is a misnomer for IPv6 interfaces, but there's no possibility of
> mixing things up), the notifier will not trigger DAD, because it can't
> be done, of course, without neighbour solicitations.
> 
> Set IFF_NOARP as we bring up the device, and drop it after we had a
> chance to set the nodad attribute on the link.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
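
For readers following the set/change pairs in the diff below, a small
model of the flag arithmetic, as a sketch under assumptions rather than
kernel code: apply_link_flags() and noarp_window() are hypothetical
names, and the model assumes the usual rtnetlink convention that, with
a non-zero ifi_change mask, only the bits in the mask are taken from
ifi_flags while all other bits keep their current value.

```c
#include <assert.h>
#include <linux/if.h>

/* Hypothetical model of a flags/change pair: bits in @change are taken
 * from @set, all other bits of @cur are preserved. Assumes a non-zero
 * @change mask.
 */
static unsigned int apply_link_flags(unsigned int cur, unsigned int set,
				     unsigned int change)
{
	assert(change != 0);

	return (cur & ~change) | (set & change);
}

/* Replay the sequence from the patch on a link that starts down */
static unsigned int noarp_window(void)
{
	unsigned int flags = 0;

	/* Bring the link up with neighbour solicitations disabled */
	flags = apply_link_flags(flags, IFF_UP | IFF_NOARP,
				 IFF_UP | IFF_NOARP);
	assert(flags & IFF_NOARP);

	/* The nodad attribute would be set at this point */

	/* Clear IFF_NOARP only: IFF_UP isn't in the mask, so it stays set */
	flags = apply_link_flags(flags, 0, IFF_NOARP);

	return flags;
}
```

In this model, the final call with (0, IFF_NOARP) leaves the link up
and only re-enables neighbour solicitations.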

> ---
>  pasta.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/pasta.c b/pasta.c
> index cebf54f..babbfd5 100644
> --- a/pasta.c
> +++ b/pasta.c
> @@ -303,10 +303,15 @@ void pasta_ns_conf(struct ctx *c)
>  		    strerror(-rc));
>  
>  	if (c->pasta_conf_ns) {
> +		unsigned int flags = IFF_UP;
> +
>  		if (c->mtu != -1)
>  			nl_link_set_mtu(nl_sock_ns, c->pasta_ifi, c->mtu);
>  
> -		nl_link_set_flags(nl_sock_ns, c->pasta_ifi, IFF_UP, IFF_UP);
> +		if (c->ifi6) /* Avoid duplicate address detection on link up */
> +			flags |= IFF_NOARP;
> +
> +		nl_link_set_flags(nl_sock_ns, c->pasta_ifi, flags, flags);
>  
>  		if (c->ifi4) {
>  			if (c->ip4.no_copy_addrs) {
> @@ -353,6 +358,10 @@ void pasta_ns_conf(struct ctx *c)
>  				    strerror(-rc));
>  			}
>  
> +			/* We dodged DAD: re-enable neighbour solicitations */
> +			nl_link_set_flags(nl_sock_ns, c->pasta_ifi,
> +					  0, IFF_NOARP);
> +
>  			if (c->ip6.no_copy_addrs) {
>  				rc = nl_addr_set(nl_sock_ns, c->pasta_ifi,
>  						 AF_INET6, &c->ip6.addr, 64);

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson


* Re: [PATCH 7/7] netlink: Fix typo in function comment for nl_addr_set()
  2024-08-14 22:54 ` [PATCH 7/7] netlink: Fix typo in function comment for nl_addr_set() Stefano Brivio
@ 2024-08-15  3:07   ` David Gibson
  0 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2024-08-15  3:07 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Paul Holzinger

On Thu, Aug 15, 2024 at 12:54:29AM +0200, Stefano Brivio wrote:
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  netlink.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/netlink.c b/netlink.c
> index 3b37087..142f110 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -884,7 +884,7 @@ int nl_addr_get_ll(int s, unsigned int ifi, void *addr)
>  }
>  
>  /**
> - * nl_add_set() - Set IP addresses for given interface and address family
> + * nl_addr_set() - Set IP addresses for given interface and address family
>   * @s:		Netlink socket
>   * @ifi:	Interface index
>   * @af:		Address family

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson


* Re: [PATCH 4/7] netlink, pasta: Disable DAD for link-local addresses on namespace interface
  2024-08-15  3:01   ` David Gibson
@ 2024-08-15  6:52     ` Stefano Brivio
  0 siblings, 0 replies; 17+ messages in thread
From: Stefano Brivio @ 2024-08-15  6:52 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev, Paul Holzinger

On Thu, 15 Aug 2024 13:01:08 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Aug 15, 2024 at 12:54:26AM +0200, Stefano Brivio wrote:
> > It makes no sense for a container or a guest to try and perform
> > duplicate address detection for their link-local address, as we'll
> > anyway not relay neighbour solicitations with an unspecified source
> > address.
> > 
> > While they perform duplicate address detection, the link-local address
> > is not usable, which prevents us from bringing up containers, in
> > particular, and communicating with them right away via IPv6.
> > 
> > This is not enough to prevent DAD and reach the container right away:
> > we'll need a couple more patches.
> > 
> > A large part of the function setting the nodad attribute is copied^W
> > vendored from nl_route_dup(), and we could probably refactor things
> > to avoid code duplication, eventually, but keep this simple for the
> > moment.  
> 
> I don't really care about the duplication, but I'm not sure
> nl_route_dup() was the right thing to vendor.
> 
> > Link: https://github.com/containers/podman/pull/23561#discussion_r1711639663
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > ---
> >  netlink.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  netlink.h |  1 +
> >  pasta.c   |  6 ++++
> >  3 files changed, 104 insertions(+)
> > 
> > diff --git a/netlink.c b/netlink.c
> > index 873e6c7..4b49de1 100644
> > --- a/netlink.c
> > +++ b/netlink.c
> > @@ -673,6 +673,103 @@ int nl_route_dup(int s_src, unsigned int ifi_src,
> >  	return 0;
> >  }
> >  
> > +/**
> > + * nl_addr_set_ll_nodad() - Set IFA_F_NODAD on IPv6 link-local addresses
> > + * @s:		Netlink socket
> > + * @ifi:	Interface index in target namespace
> > + *
> > + * Return: 0 on success, negative error code on failure
> > + */
> > +int nl_addr_set_ll_nodad(int s, unsigned int ifi)
> > +{
> > +	struct req_t {
> > +		struct nlmsghdr nlh;
> > +		struct ifaddrmsg ifa;
> > +	} req = {
> > +		.ifa.ifa_family    = AF_INET6,
> > +		.ifa.ifa_index     = ifi,
> > +	};
> > +	ssize_t nlmsgs_size, left, status;
> > +	unsigned ll_addrs = 0;
> > +	struct nlmsghdr *nh;
> > +	char buf[NLBUFSIZ];
> > +	uint32_t seq;
> > +	unsigned i;
> > +
> > +	seq = nl_send(s, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
> > +
> > +	/* nl_foreach() will step through multiple response datagrams,
> > +	 * which we don't want here because we need to have all the
> > +	 * addresses in the buffer at once. See also nl_route_dup().  
> 
> Hmm.. do we need them all in the buffer at once, though?  For
> nl_route_dup() we needed that because we take multiple passes through
> the whole list, and that's not the case here.

Right, we don't need that, I shouldn't have vendored the comment as it
was.

> I guess we can't do an nl_do() within the loop, because that will
> expect the response to its own command while we're still getting
> responses from the original NLM_F_DUMP.

Exactly, that's why.

> nl_addr_dup() gets away with it because the nl_do()s are on a
> different netlink socket.

Correct.

> But.. I think we could nl_send() each NODAD request as we construct
> it, keep a count, then wait for all the queued responses. 

Oh, I didn't think of doing that. It's definitely worth a try.

> It means we can't easily match an error response to the request that
> caused it, but it doesn't look like we were reporting in that much
> detail anyway.

Right, I don't think we should care about that here.

> 
> > +	 */
> > +	nh = nl_next(s, buf, NULL, &nlmsgs_size);
> > +	for (left = nlmsgs_size;
> > +	     NLMSG_OK(nh, left) && (status = nl_status(nh, left, seq)) > 0;
> > +	     nh = NLMSG_NEXT(nh, left)) {
> > +		struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
> > +		bool discard = false;
> > +		struct rtattr *rta;
> > +		size_t na;
> > +
> > +		if (nh->nlmsg_type != RTM_NEWADDR)
> > +			continue;
> > +
> > +		if (ifa->ifa_index != ifi || ifa->ifa_scope != RT_SCOPE_LINK)
> > +			discard = true;
> > +
> > +		ifa->ifa_flags |= IFA_F_NODAD;
> > +
> > +		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
> > +		     rta = RTA_NEXT(rta, na)) {
> > +			/* If 32-bit flags are used, add IFA_F_NODAD there */
> > +			if (rta->rta_type == IFA_FLAGS)
> > +				*(uint32_t *)RTA_DATA(rta) |= IFA_F_NODAD;
> > +		}
> > +
> > +		if (discard)
> > +			nh->nlmsg_type = NLMSG_NOOP;
> > +		else
> > +			ll_addrs++;
> > +	}
> > +
> > +	if (!NLMSG_OK(nh, left)) {
> > +		/* Process any remaining datagrams in a different
> > +		 * buffer so we don't overwrite the first one.
> > +		 */
> > +		char tail[NLBUFSIZ];
> > +		unsigned extra = 0;
> > +
> > +		nl_foreach_oftype(nh, status, s, tail, seq, RTM_NEWADDR)
> > +			extra++;
> > +
> > +		if (extra) {
> > +			err("netlink: Too many link-local addresses");
> > +			return -E2BIG;
> > +		}
> > +	}
> > +
> > +	if (status < 0)
> > +		return status;
> > +
> > +	for (i = 0; i < ll_addrs; i++) {
> > +		for (nh = (struct nlmsghdr *)buf, left = nlmsgs_size;
> > +		     NLMSG_OK(nh, left);
> > +		     nh = NLMSG_NEXT(nh, left)) {
> > +			int rc;
> > +
> > +			if (nh->nlmsg_type != RTM_NEWADDR)
> > +				continue;
> > +
> > +			rc = nl_do(s, nh, RTM_NEWADDR, NLM_F_REPLACE,
> > +				nh->nlmsg_len);
> > +			if (rc < 0)
> > +				return rc;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  /**
> >   * nl_addr_get() - Get most specific global address, given interface and family
> >   * @s:		Netlink socket
> > diff --git a/netlink.h b/netlink.h
> > index 178f8ae..66a44ad 100644
> > --- a/netlink.h
> > +++ b/netlink.h
> > @@ -19,6 +19,7 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
> >  		void *addr, int *prefix_len, void *addr_l);
> >  int nl_addr_set(int s, unsigned int ifi, sa_family_t af,
> >  		const void *addr, int prefix_len);
> > +int nl_addr_set_ll_nodad(int s, unsigned int ifi);
> >  int nl_addr_dup(int s_src, unsigned int ifi_src,
> >  		int s_dst, unsigned int ifi_dst, sa_family_t af);
> >  int nl_link_get_mac(int s, unsigned int ifi, void *mac);
> > diff --git a/pasta.c b/pasta.c
> > index 96545b1..838bbb3 100644
> > --- a/pasta.c
> > +++ b/pasta.c
> > @@ -340,6 +340,12 @@ void pasta_ns_conf(struct ctx *c)
> >  		}
> >  
> >  		if (c->ifi6) {
> > +			rc = nl_addr_set_ll_nodad(nl_sock_ns, c->pasta_ifi);
> > +			if (rc < 0) {
> > +				die("Can't disable DAD for LL in namespace: %s",
> > +				    strerror(-rc));  
> 
> So... I'm usually the one arguing *for* ASSERT()s and die()s, but in
> this case it seems overly drastic.  If we're unable to disable DAD it
> will slow things down, but mostly things should still work.  I'd prefer
> to see this as just a warn().

Definitely, yeah.

-- 
Stefano



* Re: [PATCH 5/7] netlink, pasta: Fetch link-local address from namespace interface once it's up
  2024-08-15  3:04   ` David Gibson
@ 2024-08-15  6:53     ` Stefano Brivio
  0 siblings, 0 replies; 17+ messages in thread
From: Stefano Brivio @ 2024-08-15  6:53 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev, Paul Holzinger

On Thu, 15 Aug 2024 13:04:42 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Aug 15, 2024 at 12:54:27AM +0200, Stefano Brivio wrote:
> > As soon as we bring up the interface, the Linux kernel will set up a
> > link-local address for it, so we can fetch it and start using it right
> > away, if we need a link-local address to communicate with the container
> > before we see any traffic coming from it.
> > 
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > ---
> >  netlink.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> >  netlink.h |  1 +
> >  pasta.c   |  7 +++++++
> >  3 files changed, 55 insertions(+)
> > 
> > diff --git a/netlink.c b/netlink.c
> > index 4b49de1..3b37087 100644
> > --- a/netlink.c
> > +++ b/netlink.c
> > @@ -836,6 +836,53 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
> >  	return status;
> >  }
> >  
> > +/**
> > + * nl_addr_get_ll() - Get first IPv6 link-local address for a given interface
> > + * @s:		Netlink socket
> > + * @ifi:	Interface index in outer network namespace
> > + * @addr:	Link-local address to fill
> > + *
> > + * Return: 0 on success, negative error code on failure
> > + */
> > +int nl_addr_get_ll(int s, unsigned int ifi, void *addr)  
> 
> Given this is explicitly for IPv6, I don't see a reason not to use
> (struct in6_addr *addr) for greater type safety.

I'll change that.

> > +{
> > +	struct req_t {
> > +		struct nlmsghdr nlh;
> > +		struct ifaddrmsg ifa;
> > +	} req = {
> > +		.ifa.ifa_family		= AF_INET6,
> > +		.ifa.ifa_index		= ifi,
> > +	};
> > +	struct nlmsghdr *nh;
> > +	bool found = false;
> > +	char buf[NLBUFSIZ];
> > +	ssize_t status;
> > +	uint32_t seq;
> > +
> > +	seq = nl_send(s, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
> > +	nl_foreach_oftype(nh, status, s, buf, seq, RTM_NEWADDR) {
> > +		struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
> > +		struct rtattr *rta;
> > +		size_t na;
> > +
> > +		if (ifa->ifa_index != ifi || ifa->ifa_scope != RT_SCOPE_LINK ||
> > +		    found)
> > +			continue;
> > +
> > +		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
> > +		     rta = RTA_NEXT(rta, na)) {
> > +			if (rta->rta_type != IFA_ADDRESS)
> > +				continue;
> > +
> > +			if (!found) {
> > +				memcpy(addr, RTA_DATA(rta), RTA_PAYLOAD(rta));
> > +				found = true;
> > +			}
> > +		}
> > +	}
> > +	return status;
> > +}
> > +
> >  /**
> >   * nl_add_set() - Set IP addresses for given interface and address family
> >   * @s:		Netlink socket
> > diff --git a/netlink.h b/netlink.h
> > index 66a44ad..bdfdef0 100644
> > --- a/netlink.h
> > +++ b/netlink.h
> > @@ -19,6 +19,7 @@ int nl_addr_get(int s, unsigned int ifi, sa_family_t af,
> >  		void *addr, int *prefix_len, void *addr_l);
> >  int nl_addr_set(int s, unsigned int ifi, sa_family_t af,
> >  		const void *addr, int prefix_len);
> > +int nl_addr_get_ll(int s, unsigned int ifi, void *addr);
> >  int nl_addr_set_ll_nodad(int s, unsigned int ifi);
> >  int nl_addr_dup(int s_src, unsigned int ifi_src,
> >  		int s_dst, unsigned int ifi_dst, sa_family_t af);
> > diff --git a/pasta.c b/pasta.c
> > index 838bbb3..cebf54f 100644
> > --- a/pasta.c
> > +++ b/pasta.c
> > @@ -340,6 +340,13 @@ void pasta_ns_conf(struct ctx *c)
> >  		}
> >  
> >  		if (c->ifi6) {
> > +			rc = nl_addr_get_ll(nl_sock_ns, c->pasta_ifi,
> > +					    &c->ip6.addr_ll_seen);
> > +			if (rc < 0) {
> > +				die("Can't fetch LL address from namespace: %s",
> > +				    strerror(-rc));  
> 
> Again, we can generally cope with not having an addr_ll_seen
> initially, so I think a warn() would make more sense.

Of course. I'll fix this as well.

-- 
Stefano



end of thread, other threads:[~2024-08-15  6:53 UTC | newest]

Thread overview: 17+ messages
2024-08-14 22:54 [PATCH 0/7] Prevent DAD for link-local addresses in containers Stefano Brivio
2024-08-14 22:54 ` [PATCH 1/7] netlink: Fix typo in function comment for nl_addr_get() Stefano Brivio
2024-08-15  2:39   ` David Gibson
2024-08-14 22:54 ` [PATCH 2/7] netlink, pasta: Split MTU setting functionality out of nl_link_up() Stefano Brivio
2024-08-15  2:41   ` David Gibson
2024-08-14 22:54 ` [PATCH 3/7] netlink, pasta: Turn nl_link_up() into a generic function to set link flags Stefano Brivio
2024-08-15  2:42   ` David Gibson
2024-08-14 22:54 ` [PATCH 4/7] netlink, pasta: Disable DAD for link-local addresses on namespace interface Stefano Brivio
2024-08-15  3:01   ` David Gibson
2024-08-15  6:52     ` Stefano Brivio
2024-08-14 22:54 ` [PATCH 5/7] netlink, pasta: Fetch link-local address from namespace interface once it's up Stefano Brivio
2024-08-15  3:04   ` David Gibson
2024-08-15  6:53     ` Stefano Brivio
2024-08-14 22:54 ` [PATCH 6/7] pasta: Disable neighbour solicitations on device up to prevent DAD Stefano Brivio
2024-08-15  3:06   ` David Gibson
2024-08-14 22:54 ` [PATCH 7/7] netlink: Fix typo in function comment for nl_addr_set() Stefano Brivio
2024-08-15  3:07   ` David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt
