public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH v2 0/4] Translate source addresses for ICMP errors
@ 2025-04-17  1:55 David Gibson
  2025-04-17  1:55 ` [PATCH v2 1/4] fwd: Split out helpers for port-independent NAT David Gibson
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: David Gibson @ 2025-04-17  1:55 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev, Jon Maloy; +Cc: David Gibson

We now propagate ICMP errors on UDP flows back into ICMP packets on
the tap interface.  However, we don't always get the source address
right for the synthesized message.  Because ICMPs can be generated by
intermediate routers, that source address might not be one of the
endpoints, so the address translation we already have isn't
sufficient.

Implement properly translating ICMP addresses when we need to.  This
ended up a bit messier than I hoped, but it seems to work.  A simple
case to test this is:

	pasta --config-net --map-host-loopback=172.16.1.1 -- \
		sh -c "echo hello | socat STDIO UDP4:172.16.1.1:10001"

where 10001 is a port where nothing is listening on the host.

Without this series, this will just time out.  pasta sends an ICMP
Port Unreachable message, but it's sent with source address 127.0.0.1
and so discarded by the guest.  With this series, the address is
properly translated and we correctly get the error from socat:

2025/04/16 19:02:37 socat[3] E read(5, 0x555c3dbf2000, 8192): Connection refused

v2:
 * Fix a (bogus) coverity warning
 * Minor cosmetic changes based on Stefano's review

David Gibson (4):
  fwd: Split out helpers for port-independent NAT
  treewide: Improve robustness against sockaddrs of unexpected family
  udp: Rework offender address handling in udp_sock_recverr()
  udp: Translate offender addresses for ICMP messages

 flow.c     | 16 ++++++++--
 fwd.c      | 87 ++++++++++++++++++++++++++++++++++++++----------------
 fwd.h      |  3 ++
 inany.h    | 28 +++++++++++-------
 tcp.c      | 10 +++----
 udp.c      | 81 ++++++++++++++++++++++++++++++++++++--------------
 udp_flow.c |  6 ++--
 7 files changed, 162 insertions(+), 69 deletions(-)

-- 
2.49.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/4] fwd: Split out helpers for port-independent NAT
  2025-04-17  1:55 [PATCH v2 0/4] Translate source addresses for ICMP errors David Gibson
@ 2025-04-17  1:55 ` David Gibson
  2025-04-17  1:55 ` [PATCH v2 2/4] treewide: Improve robustness against sockaddrs of unexpected family David Gibson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2025-04-17  1:55 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev, Jon Maloy; +Cc: David Gibson

Currently the functions fwd_nat_from_*() make some address translations
based on both the IP address and protocol port numbers, and others based
only on the address.  We have some upcoming cases where it's useful to use
the IP-address-only translations separately, so split them out into helper
functions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 fwd.c | 87 ++++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 62 insertions(+), 25 deletions(-)

diff --git a/fwd.c b/fwd.c
index 2829cd24..5c70e834 100644
--- a/fwd.c
+++ b/fwd.c
@@ -323,6 +323,30 @@ static bool fwd_guest_accessible(const struct ctx *c,
 	return fwd_guest_accessible6(c, &addr->a6);
 }
 
+/**
+ * nat_outbound() - Apply address translation for outbound (TAP to HOST)
+ * @c:		Execution context
+ * @addr:	Input address (as seen on TAP interface)
+ * @translated:	Output address (as seen on HOST interface)
+ *
+ * Only handles translations that depend *only* on the address.  Anything
+ * related to specific ports or flows is handled elsewhere.
+ */
+static void nat_outbound(const struct ctx *c, const union inany_addr *addr,
+			 union inany_addr *translated)
+{
+	if (inany_equals4(addr, &c->ip4.map_host_loopback))
+		*translated = inany_loopback4;
+	else if (inany_equals6(addr, &c->ip6.map_host_loopback))
+		*translated = inany_loopback6;
+	else if (inany_equals4(addr, &c->ip4.map_guest_addr))
+		*translated = inany_from_v4(c->ip4.addr);
+	else if (inany_equals6(addr, &c->ip6.map_guest_addr))
+		translated->a6 = c->ip6.addr;
+	else
+		*translated = *addr;
+}
+
 /**
  * fwd_nat_from_tap() - Determine to forward a flow from the tap interface
  * @c:		Execution context
@@ -342,16 +366,8 @@ uint8_t fwd_nat_from_tap(const struct ctx *c, uint8_t proto,
 	else if (is_dns_flow(proto, ini) &&
 		   inany_equals6(&ini->oaddr, &c->ip6.dns_match))
 		tgt->eaddr.a6 = c->ip6.dns_host;
-	else if (inany_equals4(&ini->oaddr, &c->ip4.map_host_loopback))
-		tgt->eaddr = inany_loopback4;
-	else if (inany_equals6(&ini->oaddr, &c->ip6.map_host_loopback))
-		tgt->eaddr = inany_loopback6;
-	else if (inany_equals4(&ini->oaddr, &c->ip4.map_guest_addr))
-		tgt->eaddr = inany_from_v4(c->ip4.addr);
-	else if (inany_equals6(&ini->oaddr, &c->ip6.map_guest_addr))
-		tgt->eaddr.a6 = c->ip6.addr;
 	else
-		tgt->eaddr = ini->oaddr;
+		nat_outbound(c, &ini->oaddr, &tgt->eaddr);
 
 	tgt->eport = ini->oport;
 
@@ -423,6 +439,42 @@ uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
 	return PIF_HOST;
 }
 
+/**
+ * nat_inbound() - Apply address translation for outbound (HOST to TAP)
+ * @c:		Execution context
+ * @addr:	Input address (as seen on HOST interface)
+ * @translated:	Output address (as seen on TAP interface)
+ *
+ * Return: true on success, false if it couldn't translate the address
+ *
+ * Only handles translations that depend *only* on the address.  Anything
+ * related to specific ports or flows is handled elsewhere.
+ */
+static bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
+			 union inany_addr *translated)
+{
+	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
+	    inany_equals4(addr, &in4addr_loopback)) {
+		/* Specifically 127.0.0.1, not 127.0.0.0/8 */
+		*translated = inany_from_v4(c->ip4.map_host_loopback);
+	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_host_loopback) &&
+		   inany_equals6(addr, &in6addr_loopback)) {
+		translated->a6 = c->ip6.map_host_loopback;
+	} else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) &&
+		   inany_equals4(addr, &c->ip4.addr)) {
+		*translated = inany_from_v4(c->ip4.map_guest_addr);
+	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) &&
+		   inany_equals6(addr, &c->ip6.addr)) {
+		translated->a6 = c->ip6.map_guest_addr;
+	} else if (fwd_guest_accessible(c, addr)) {
+		*translated = *addr;
+	} else {
+		return false;
+	}
+
+	return true;
+}
+
 /**
  * fwd_nat_from_host() - Determine to forward a flow from the host interface
  * @c:		Execution context
@@ -479,20 +531,7 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
 		return PIF_SPLICE;
 	}
 
-	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
-	    inany_equals4(&ini->eaddr, &in4addr_loopback)) {
-		/* Specifically 127.0.0.1, not 127.0.0.0/8 */
-		tgt->oaddr = inany_from_v4(c->ip4.map_host_loopback);
-	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_host_loopback) &&
-		   inany_equals6(&ini->eaddr, &in6addr_loopback)) {
-		tgt->oaddr.a6 = c->ip6.map_host_loopback;
-	} else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) &&
-		   inany_equals4(&ini->eaddr, &c->ip4.addr)) {
-		tgt->oaddr = inany_from_v4(c->ip4.map_guest_addr);
-	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) &&
-		   inany_equals6(&ini->eaddr, &c->ip6.addr)) {
-		tgt->oaddr.a6 = c->ip6.map_guest_addr;
-	} else if (!fwd_guest_accessible(c, &ini->eaddr)) {
+	if (!nat_inbound(c, &ini->eaddr, &tgt->oaddr)) {
 		if (inany_v4(&ini->eaddr)) {
 			if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.our_tap_addr))
 				/* No source address we can use */
@@ -501,8 +540,6 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
 		} else {
 			tgt->oaddr.a6 = c->ip6.our_tap_ll;
 		}
-	} else {
-		tgt->oaddr = ini->eaddr;
 	}
 	tgt->oport = ini->eport;
 
-- 
@@ -323,6 +323,30 @@ static bool fwd_guest_accessible(const struct ctx *c,
 	return fwd_guest_accessible6(c, &addr->a6);
 }
 
+/**
+ * nat_outbound() - Apply address translation for outbound (TAP to HOST)
+ * @c:		Execution context
+ * @addr:	Input address (as seen on TAP interface)
+ * @translated:	Output address (as seen on HOST interface)
+ *
+ * Only handles translations that depend *only* on the address.  Anything
+ * related to specific ports or flows is handled elsewhere.
+ */
+static void nat_outbound(const struct ctx *c, const union inany_addr *addr,
+			 union inany_addr *translated)
+{
+	if (inany_equals4(addr, &c->ip4.map_host_loopback))
+		*translated = inany_loopback4;
+	else if (inany_equals6(addr, &c->ip6.map_host_loopback))
+		*translated = inany_loopback6;
+	else if (inany_equals4(addr, &c->ip4.map_guest_addr))
+		*translated = inany_from_v4(c->ip4.addr);
+	else if (inany_equals6(addr, &c->ip6.map_guest_addr))
+		translated->a6 = c->ip6.addr;
+	else
+		*translated = *addr;
+}
+
 /**
  * fwd_nat_from_tap() - Determine to forward a flow from the tap interface
  * @c:		Execution context
@@ -342,16 +366,8 @@ uint8_t fwd_nat_from_tap(const struct ctx *c, uint8_t proto,
 	else if (is_dns_flow(proto, ini) &&
 		   inany_equals6(&ini->oaddr, &c->ip6.dns_match))
 		tgt->eaddr.a6 = c->ip6.dns_host;
-	else if (inany_equals4(&ini->oaddr, &c->ip4.map_host_loopback))
-		tgt->eaddr = inany_loopback4;
-	else if (inany_equals6(&ini->oaddr, &c->ip6.map_host_loopback))
-		tgt->eaddr = inany_loopback6;
-	else if (inany_equals4(&ini->oaddr, &c->ip4.map_guest_addr))
-		tgt->eaddr = inany_from_v4(c->ip4.addr);
-	else if (inany_equals6(&ini->oaddr, &c->ip6.map_guest_addr))
-		tgt->eaddr.a6 = c->ip6.addr;
 	else
-		tgt->eaddr = ini->oaddr;
+		nat_outbound(c, &ini->oaddr, &tgt->eaddr);
 
 	tgt->eport = ini->oport;
 
@@ -423,6 +439,42 @@ uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
 	return PIF_HOST;
 }
 
+/**
+ * nat_inbound() - Apply address translation for outbound (HOST to TAP)
+ * @c:		Execution context
+ * @addr:	Input address (as seen on HOST interface)
+ * @translated:	Output address (as seen on TAP interface)
+ *
+ * Return: true on success, false if it couldn't translate the address
+ *
+ * Only handles translations that depend *only* on the address.  Anything
+ * related to specific ports or flows is handled elsewhere.
+ */
+static bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
+			 union inany_addr *translated)
+{
+	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
+	    inany_equals4(addr, &in4addr_loopback)) {
+		/* Specifically 127.0.0.1, not 127.0.0.0/8 */
+		*translated = inany_from_v4(c->ip4.map_host_loopback);
+	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_host_loopback) &&
+		   inany_equals6(addr, &in6addr_loopback)) {
+		translated->a6 = c->ip6.map_host_loopback;
+	} else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) &&
+		   inany_equals4(addr, &c->ip4.addr)) {
+		*translated = inany_from_v4(c->ip4.map_guest_addr);
+	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) &&
+		   inany_equals6(addr, &c->ip6.addr)) {
+		translated->a6 = c->ip6.map_guest_addr;
+	} else if (fwd_guest_accessible(c, addr)) {
+		*translated = *addr;
+	} else {
+		return false;
+	}
+
+	return true;
+}
+
 /**
  * fwd_nat_from_host() - Determine to forward a flow from the host interface
  * @c:		Execution context
@@ -479,20 +531,7 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
 		return PIF_SPLICE;
 	}
 
-	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
-	    inany_equals4(&ini->eaddr, &in4addr_loopback)) {
-		/* Specifically 127.0.0.1, not 127.0.0.0/8 */
-		tgt->oaddr = inany_from_v4(c->ip4.map_host_loopback);
-	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_host_loopback) &&
-		   inany_equals6(&ini->eaddr, &in6addr_loopback)) {
-		tgt->oaddr.a6 = c->ip6.map_host_loopback;
-	} else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) &&
-		   inany_equals4(&ini->eaddr, &c->ip4.addr)) {
-		tgt->oaddr = inany_from_v4(c->ip4.map_guest_addr);
-	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) &&
-		   inany_equals6(&ini->eaddr, &c->ip6.addr)) {
-		tgt->oaddr.a6 = c->ip6.map_guest_addr;
-	} else if (!fwd_guest_accessible(c, &ini->eaddr)) {
+	if (!nat_inbound(c, &ini->eaddr, &tgt->oaddr)) {
 		if (inany_v4(&ini->eaddr)) {
 			if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.our_tap_addr))
 				/* No source address we can use */
@@ -501,8 +540,6 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
 		} else {
 			tgt->oaddr.a6 = c->ip6.our_tap_ll;
 		}
-	} else {
-		tgt->oaddr = ini->eaddr;
 	}
 	tgt->oport = ini->eport;
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/4] treewide: Improve robustness against sockaddrs of unexpected family
  2025-04-17  1:55 [PATCH v2 0/4] Translate source addresses for ICMP errors David Gibson
  2025-04-17  1:55 ` [PATCH v2 1/4] fwd: Split out helpers for port-independent NAT David Gibson
@ 2025-04-17  1:55 ` David Gibson
  2025-04-17  1:55 ` [PATCH v2 3/4] udp: Rework offender address handling in udp_sock_recverr() David Gibson
  2025-04-17  1:55 ` [PATCH v2 4/4] udp: Translate offender addresses for ICMP messages David Gibson
  3 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2025-04-17  1:55 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev, Jon Maloy; +Cc: David Gibson

inany_from_sockaddr() expects a socket address of family AF_INET or
AF_INET6 and ASSERT()s if it gets anything else.  In many of the callers we
can handle an unexpected family more gracefully, though, e.g. by failing
a single flow rather than killing passt.

Change inany_from_sockaddr() to return an error instead of ASSERT()ing,
and handle those errors in the callers.  Improve the reporting of any such
errors while we're at it.

With this greater robustness, allow inany_from_sockaddr() to take a void *
rather than specifically a union sockaddr_inany *.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 flow.c     | 16 ++++++++++++++--
 inany.h    | 28 +++++++++++++++++-----------
 tcp.c      | 10 ++++------
 udp_flow.c |  6 +++---
 4 files changed, 38 insertions(+), 22 deletions(-)

diff --git a/flow.c b/flow.c
index 3c81cb42..447c0211 100644
--- a/flow.c
+++ b/flow.c
@@ -408,7 +408,12 @@ struct flowside *flow_initiate_sa(union flow *flow, uint8_t pif,
 {
 	struct flowside *ini = &flow->f.side[INISIDE];
 
-	inany_from_sockaddr(&ini->eaddr, &ini->eport, ssa);
+	if (inany_from_sockaddr(&ini->eaddr, &ini->eport, ssa) < 0) {
+		char str[SOCKADDR_STRLEN];
+
+		ASSERT_WITH_MSG(0, "Bad socket address %s",
+				sockaddr_ntop(ssa, str, sizeof(str)));
+	}
 	if (daddr)
 		ini->oaddr = *daddr;
 	else if (inany_v4(&ini->eaddr))
@@ -768,7 +773,14 @@ flow_sidx_t flow_lookup_sa(const struct ctx *c, uint8_t proto, uint8_t pif,
 		.oport = oport,
 	};
 
-	inany_from_sockaddr(&side.eaddr, &side.eport, esa);
+	if (inany_from_sockaddr(&side.eaddr, &side.eport, esa) < 0) {
+		char str[SOCKADDR_STRLEN];
+
+		warn("Flow lookup on bad socket address %s",
+		     sockaddr_ntop(esa, str, sizeof(str)));
+		return FLOW_SIDX_NONE;
+	}
+
 	if (oaddr)
 		side.oaddr = *oaddr;
 	else if (inany_v4(&side.eaddr))
diff --git a/inany.h b/inany.h
index 1c247e1e..7ca5cbd3 100644
--- a/inany.h
+++ b/inany.h
@@ -237,24 +237,30 @@ static inline void inany_from_af(union inany_addr *aa,
 }
 
 /** inany_from_sockaddr - Extract IPv[46] address and port number from sockaddr
- * @aa:		Pointer to store IPv[46] address
+ * @dst:	Pointer to store IPv[46] address (output)
  * @port:	Pointer to store port number, host order
- * @addr:	AF_INET or AF_INET6 socket address
+ * @addr:	Socket address
+ *
+ * Return: 0 on success, -1 on error (bad address family)
  */
-static inline void inany_from_sockaddr(union inany_addr *aa, in_port_t *port,
-				       const union sockaddr_inany *sa)
+static inline int inany_from_sockaddr(union inany_addr *dst, in_port_t *port,
+				      const void *addr)
 {
+	const union sockaddr_inany *sa = (const union sockaddr_inany *)addr;
+
 	if (sa->sa_family == AF_INET6) {
-		inany_from_af(aa, AF_INET6, &sa->sa6.sin6_addr);
+		inany_from_af(dst, AF_INET6, &sa->sa6.sin6_addr);
 		*port = ntohs(sa->sa6.sin6_port);
-	} else if (sa->sa_family == AF_INET) {
-		inany_from_af(aa, AF_INET, &sa->sa4.sin_addr);
+		return 0;
+	}
+
+	if (sa->sa_family == AF_INET) {
+		inany_from_af(dst, AF_INET, &sa->sa4.sin_addr);
 		*port = ntohs(sa->sa4.sin_port);
-	} else {
-		/* Not valid to call with other address families */
-		ASSERT_WITH_MSG(0, "Unexpected sockaddr family: %u",
-				sa->sa_family);
+		return 0;
 	}
+
+	return -1;
 }
 
 /** inany_siphash_feed- Fold IPv[46] address into an in-progress siphash
diff --git a/tcp.c b/tcp.c
index 9c6bc529..0ac298a7 100644
--- a/tcp.c
+++ b/tcp.c
@@ -1546,9 +1546,8 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af,
 
 	if (c->mode == MODE_VU) { /* To rebind to same oport after migration */
 		sl = sizeof(sa);
-		if (!getsockname(s, &sa.sa, &sl))
-			inany_from_sockaddr(&tgt->oaddr, &tgt->oport, &sa);
-		else
+		if (getsockname(s, &sa.sa, &sl) ||
+		    inany_from_sockaddr(&tgt->oaddr, &tgt->oport, &sa) < 0)
 			err_perror("Can't get local address for socket %i", s);
 	}
 
@@ -2204,9 +2203,8 @@ void tcp_listen_handler(const struct ctx *c, union epoll_ref ref,
 			       NULL, ref.tcp_listen.port);
 
 	if (c->mode == MODE_VU) { /* Rebind to same address after migration */
-		if (!getsockname(s, &sa.sa, &sl))
-			inany_from_sockaddr(&ini->oaddr, &ini->oport, &sa);
-		else
+		if (getsockname(s, &sa.sa, &sl) ||
+		    inany_from_sockaddr(&ini->oaddr, &ini->oport, &sa) < 0)
 			err_perror("Can't get local address for socket %i", s);
 	}
 
diff --git a/udp_flow.c b/udp_flow.c
index ef2cbb06..fea1cf3c 100644
--- a/udp_flow.c
+++ b/udp_flow.c
@@ -158,12 +158,12 @@ static flow_sidx_t udp_flow_new(const struct ctx *c, union flow *flow,
 		socklen_t sl = sizeof(sa);
 		in_port_t port;
 
-		if (getsockname(uflow->s[TGTSIDE], &sa.sa, &sl) < 0) {
+		if (getsockname(uflow->s[TGTSIDE], &sa.sa, &sl) < 0 ||
+		    inany_from_sockaddr(&uflow->f.side[TGTSIDE].oaddr,
+					&port, &sa) < 0) {
 			flow_perror(uflow, "Unable to determine local address");
 			goto cancel;
 		}
-		inany_from_sockaddr(&uflow->f.side[TGTSIDE].oaddr,
-				    &port, &sa);
 		if (port != tgt->oport) {
 			flow_err(uflow, "Unexpected local port");
 			goto cancel;
-- 
@@ -158,12 +158,12 @@ static flow_sidx_t udp_flow_new(const struct ctx *c, union flow *flow,
 		socklen_t sl = sizeof(sa);
 		in_port_t port;
 
-		if (getsockname(uflow->s[TGTSIDE], &sa.sa, &sl) < 0) {
+		if (getsockname(uflow->s[TGTSIDE], &sa.sa, &sl) < 0 ||
+		    inany_from_sockaddr(&uflow->f.side[TGTSIDE].oaddr,
+					&port, &sa) < 0) {
 			flow_perror(uflow, "Unable to determine local address");
 			goto cancel;
 		}
-		inany_from_sockaddr(&uflow->f.side[TGTSIDE].oaddr,
-				    &port, &sa);
 		if (port != tgt->oport) {
 			flow_err(uflow, "Unexpected local port");
 			goto cancel;
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 3/4] udp: Rework offender address handling in udp_sock_recverr()
  2025-04-17  1:55 [PATCH v2 0/4] Translate source addresses for ICMP errors David Gibson
  2025-04-17  1:55 ` [PATCH v2 1/4] fwd: Split out helpers for port-independent NAT David Gibson
  2025-04-17  1:55 ` [PATCH v2 2/4] treewide: Improve robustness against sockaddrs of unexpected family David Gibson
@ 2025-04-17  1:55 ` David Gibson
  2025-04-17  1:55 ` [PATCH v2 4/4] udp: Translate offender addresses for ICMP messages David Gibson
  3 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2025-04-17  1:55 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev, Jon Maloy; +Cc: David Gibson

Make a number of changes to udp_sock_recverr() to improve the robustness
of how we handle addresses.

 * Get the "offender" address (source of the ICMP packet) using the
   SO_EE_OFFENDER() macro, reducing assumptions about structure layout.
 * Parse the offender sockaddr using inany_from_sockaddr()
 * Check explicitly that the source and destination pifs are what we
   expect.  Previously we checked something that was probably equivalent
   in practice, but isn't strictly speaking what we require for the rest
   of the code.
 * Verify that for an ICMPv4 error we also have an IPv4 source/offender
   and destination/endpoint address
 * Verify that for an ICMPv6 error we have an IPv6 endpoint
 * Improve debug reporting of any failures

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 udp.c | 69 +++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 48 insertions(+), 21 deletions(-)

diff --git a/udp.c b/udp.c
index 57769d06..d09b3eb9 100644
--- a/udp.c
+++ b/udp.c
@@ -159,6 +159,12 @@ udp_meta[UDP_MAX_FRAMES];
 	MAX(CMSG_SPACE(sizeof(struct in_pktinfo)),	\
 	    CMSG_SPACE(sizeof(struct in6_pktinfo)))
 
+#define RECVERR_SPACE							\
+	MAX(CMSG_SPACE(sizeof(struct sock_extended_err) +		\
+		       sizeof(struct sockaddr_in)),			\
+	    CMSG_SPACE(sizeof(struct sock_extended_err) +		\
+		       sizeof(struct sockaddr_in6)))
+
 /**
  * enum udp_iov_idx - Indices for the buffers making up a single UDP frame
  * @UDP_IOV_TAP         tap specific header
@@ -516,12 +522,8 @@ static int udp_pktinfo(struct msghdr *msg, union inany_addr *dst)
 static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 			    uint8_t pif, in_port_t port)
 {
-	struct errhdr {
-		struct sock_extended_err ee;
-		union sockaddr_inany saddr;
-	};
-	char buf[PKTINFO_SPACE + CMSG_SPACE(sizeof(struct errhdr))];
-	const struct errhdr *eh = NULL;
+	char buf[PKTINFO_SPACE + RECVERR_SPACE];
+	const struct sock_extended_err *ee;
 	char data[ICMP6_MAX_DLEN];
 	struct cmsghdr *hdr;
 	struct iovec iov = {
@@ -538,7 +540,13 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 		.msg_controllen = sizeof(buf),
 	};
 	const struct flowside *toside;
-	flow_sidx_t tosidx;
+	char astr[INANY_ADDRSTRLEN];
+	char sastr[SOCKADDR_STRLEN];
+	union inany_addr offender;
+	const struct in_addr *o4;
+	in_port_t offender_port;
+	struct udp_flow *uflow;
+	uint8_t topif;
 	size_t dlen;
 	ssize_t rc;
 
@@ -569,10 +577,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 		return -1;
 	}
 
-	eh = (const struct errhdr *)CMSG_DATA(hdr);
+	ee = (const struct sock_extended_err *)CMSG_DATA(hdr);
 
 	debug("%s error on UDP socket %i: %s",
-	      str_ee_origin(&eh->ee), s, strerror_(eh->ee.ee_errno));
+	      str_ee_origin(ee), s, strerror_(ee->ee_errno));
 
 	if (!flow_sidx_valid(sidx)) {
 		/* No hint from the socket, determine flow from addresses */
@@ -588,25 +596,44 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 			debug("Ignoring UDP error without flow");
 			return 1;
 		}
+	} else {
+		pif = pif_at_sidx(sidx);
 	}
 
-	tosidx = flow_sidx_opposite(sidx);
-	toside = flowside_at_sidx(tosidx);
+	uflow = udp_at_sidx(sidx);
+	ASSERT(uflow);
+	toside = &uflow->f.side[!sidx.sidei];
+	topif = uflow->f.pif[!sidx.sidei];
 	dlen = rc;
 
-	if (pif_is_socket(pif_at_sidx(tosidx))) {
-		/* XXX Is there any way to propagate ICMPs from socket to
-		 * socket? */
-	} else if (hdr->cmsg_level == IPPROTO_IP) {
+	if (inany_from_sockaddr(&offender, &offender_port,
+				SO_EE_OFFENDER(ee)) < 0)
+		goto fail;
+
+	if (pif != PIF_HOST || topif != PIF_TAP)
+		/* XXX Can we support any other cases? */
+		goto fail;
+
+	if (hdr->cmsg_level == IPPROTO_IP &&
+	    (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
 		dlen = MIN(dlen, ICMP4_MAX_DLEN);
-		udp_send_tap_icmp4(c, &eh->ee, toside,
-				   eh->saddr.sa4.sin_addr, data, dlen);
-	} else if (hdr->cmsg_level == IPPROTO_IPV6) {
-		udp_send_tap_icmp6(c, &eh->ee, toside,
-				   &eh->saddr.sa6.sin6_addr, data,
-				   dlen, sidx.flowi);
+		udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
+		return 1;
 	}
 
+	if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
+		udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
+				   FLOW_IDX(uflow));
+		return 1;
+	}
+
+fail:
+	flow_dbg(uflow, "Can't propagate %s error from %s %s to %s %s",
+		 str_ee_origin(ee),
+		 pif_name(pif),
+		 sockaddr_ntop(SO_EE_OFFENDER(ee), sastr, sizeof(sastr)),
+		 pif_name(topif),
+		 inany_ntop(&toside->eaddr, astr, sizeof(astr)));
 	return 1;
 }
 
-- 
@@ -159,6 +159,12 @@ udp_meta[UDP_MAX_FRAMES];
 	MAX(CMSG_SPACE(sizeof(struct in_pktinfo)),	\
 	    CMSG_SPACE(sizeof(struct in6_pktinfo)))
 
+#define RECVERR_SPACE							\
+	MAX(CMSG_SPACE(sizeof(struct sock_extended_err) +		\
+		       sizeof(struct sockaddr_in)),			\
+	    CMSG_SPACE(sizeof(struct sock_extended_err) +		\
+		       sizeof(struct sockaddr_in6)))
+
 /**
  * enum udp_iov_idx - Indices for the buffers making up a single UDP frame
  * @UDP_IOV_TAP         tap specific header
@@ -516,12 +522,8 @@ static int udp_pktinfo(struct msghdr *msg, union inany_addr *dst)
 static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 			    uint8_t pif, in_port_t port)
 {
-	struct errhdr {
-		struct sock_extended_err ee;
-		union sockaddr_inany saddr;
-	};
-	char buf[PKTINFO_SPACE + CMSG_SPACE(sizeof(struct errhdr))];
-	const struct errhdr *eh = NULL;
+	char buf[PKTINFO_SPACE + RECVERR_SPACE];
+	const struct sock_extended_err *ee;
 	char data[ICMP6_MAX_DLEN];
 	struct cmsghdr *hdr;
 	struct iovec iov = {
@@ -538,7 +540,13 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 		.msg_controllen = sizeof(buf),
 	};
 	const struct flowside *toside;
-	flow_sidx_t tosidx;
+	char astr[INANY_ADDRSTRLEN];
+	char sastr[SOCKADDR_STRLEN];
+	union inany_addr offender;
+	const struct in_addr *o4;
+	in_port_t offender_port;
+	struct udp_flow *uflow;
+	uint8_t topif;
 	size_t dlen;
 	ssize_t rc;
 
@@ -569,10 +577,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 		return -1;
 	}
 
-	eh = (const struct errhdr *)CMSG_DATA(hdr);
+	ee = (const struct sock_extended_err *)CMSG_DATA(hdr);
 
 	debug("%s error on UDP socket %i: %s",
-	      str_ee_origin(&eh->ee), s, strerror_(eh->ee.ee_errno));
+	      str_ee_origin(ee), s, strerror_(ee->ee_errno));
 
 	if (!flow_sidx_valid(sidx)) {
 		/* No hint from the socket, determine flow from addresses */
@@ -588,25 +596,44 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 			debug("Ignoring UDP error without flow");
 			return 1;
 		}
+	} else {
+		pif = pif_at_sidx(sidx);
 	}
 
-	tosidx = flow_sidx_opposite(sidx);
-	toside = flowside_at_sidx(tosidx);
+	uflow = udp_at_sidx(sidx);
+	ASSERT(uflow);
+	toside = &uflow->f.side[!sidx.sidei];
+	topif = uflow->f.pif[!sidx.sidei];
 	dlen = rc;
 
-	if (pif_is_socket(pif_at_sidx(tosidx))) {
-		/* XXX Is there any way to propagate ICMPs from socket to
-		 * socket? */
-	} else if (hdr->cmsg_level == IPPROTO_IP) {
+	if (inany_from_sockaddr(&offender, &offender_port,
+				SO_EE_OFFENDER(ee)) < 0)
+		goto fail;
+
+	if (pif != PIF_HOST || topif != PIF_TAP)
+		/* XXX Can we support any other cases? */
+		goto fail;
+
+	if (hdr->cmsg_level == IPPROTO_IP &&
+	    (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
 		dlen = MIN(dlen, ICMP4_MAX_DLEN);
-		udp_send_tap_icmp4(c, &eh->ee, toside,
-				   eh->saddr.sa4.sin_addr, data, dlen);
-	} else if (hdr->cmsg_level == IPPROTO_IPV6) {
-		udp_send_tap_icmp6(c, &eh->ee, toside,
-				   &eh->saddr.sa6.sin6_addr, data,
-				   dlen, sidx.flowi);
+		udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
+		return 1;
 	}
 
+	if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
+		udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
+				   FLOW_IDX(uflow));
+		return 1;
+	}
+
+fail:
+	flow_dbg(uflow, "Can't propagate %s error from %s %s to %s %s",
+		 str_ee_origin(ee),
+		 pif_name(pif),
+		 sockaddr_ntop(SO_EE_OFFENDER(ee), sastr, sizeof(sastr)),
+		 pif_name(topif),
+		 inany_ntop(&toside->eaddr, astr, sizeof(astr)));
 	return 1;
 }
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 4/4] udp: Translate offender addresses for ICMP messages
  2025-04-17  1:55 [PATCH v2 0/4] Translate source addresses for ICMP errors David Gibson
                   ` (2 preceding siblings ...)
  2025-04-17  1:55 ` [PATCH v2 3/4] udp: Rework offender address handling in udp_sock_recverr() David Gibson
@ 2025-04-17  1:55 ` David Gibson
  3 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2025-04-17  1:55 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev, Jon Maloy; +Cc: David Gibson

We've recently added support for propagating ICMP errors related to a UDP
flow from the host to the guest, by handling the extended UDP error on the
socket and synthesizing a suitable ICMP on the tap interface.

Currently we create that ICMP with a source address of the "offender" from
the extended error information - the source of the ICMP error received on
the host.  However, we don't translate this address for cases where we NAT
between host and guest.  This means (amongst other things) that we won't
get a "Connection refused" error as expected if send data from the guest to
the --map-host-loopback address.  The error comes from 127.0.0.1 on the
host, which doesn't make sense on the tap interface and will be discarded
by the guest.

Because ICMP errors can be sent by an intermediate host, not just by the
endpoints of the flow, we can't handle this translation purely with the
information in the flow table entry.  We need to explicitly translate this
address by our NAT rules, which we can do with the nat_inbound() helper.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 fwd.c |  4 ++--
 fwd.h |  3 +++
 udp.c | 18 ++++++++++++++----
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/fwd.c b/fwd.c
index 5c70e834..b73c2c8d 100644
--- a/fwd.c
+++ b/fwd.c
@@ -450,8 +450,8 @@ uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
  * Only handles translations that depend *only* on the address.  Anything
  * related to specific ports or flows is handled elsewhere.
  */
-static bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
-			 union inany_addr *translated)
+bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
+		 union inany_addr *translated)
 {
 	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
 	    inany_equals4(addr, &in4addr_loopback)) {
diff --git a/fwd.h b/fwd.h
index 3562f3ca..0458a3c5 100644
--- a/fwd.h
+++ b/fwd.h
@@ -7,6 +7,7 @@
 #ifndef FWD_H
 #define FWD_H
 
+union inany_addr;
 struct flowside;
 
 /* Number of ports for both TCP and UDP */
@@ -47,6 +48,8 @@ void fwd_scan_ports_udp(struct fwd_ports *fwd, const struct fwd_ports *rev,
 			const struct fwd_ports *tcp_rev);
 void fwd_scan_ports_init(struct ctx *c);
 
+bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
+		 union inany_addr *translated);
 uint8_t fwd_nat_from_tap(const struct ctx *c, uint8_t proto,
 			 const struct flowside *ini, struct flowside *tgt);
 uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
diff --git a/udp.c b/udp.c
index d09b3eb9..f5a5cd12 100644
--- a/udp.c
+++ b/udp.c
@@ -539,10 +539,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 		.msg_control = buf,
 		.msg_controllen = sizeof(buf),
 	};
-	const struct flowside *toside;
+	const struct flowside *fromside, *toside;
+	union inany_addr offender, otap;
 	char astr[INANY_ADDRSTRLEN];
 	char sastr[SOCKADDR_STRLEN];
-	union inany_addr offender;
 	const struct in_addr *o4;
 	in_port_t offender_port;
 	struct udp_flow *uflow;
@@ -602,6 +602,7 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 
 	uflow = udp_at_sidx(sidx);
 	ASSERT(uflow);
+	fromside = &uflow->f.side[sidx.sidei];
 	toside = &uflow->f.side[!sidx.sidei];
 	topif = uflow->f.pif[!sidx.sidei];
 	dlen = rc;
@@ -614,15 +615,24 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 		/* XXX Can we support any other cases? */
 		goto fail;
 
+	/* If the offender *is* the endpoint, make sure our translation is
+	 * consistent with the flow's translation.  This matters if the flow
+	 * endpoint has a port specific translation (like --dns-match).
+	 */
+	if (inany_equals(&offender, &fromside->eaddr))
+		otap = toside->oaddr;
+	else if (!nat_inbound(c, &offender, &otap))
+		goto fail;
+
 	if (hdr->cmsg_level == IPPROTO_IP &&
-	    (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
+	    (o4 = inany_v4(&otap)) && inany_v4(&toside->eaddr)) {
 		dlen = MIN(dlen, ICMP4_MAX_DLEN);
 		udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
 		return 1;
 	}
 
 	if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
-		udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
+		udp_send_tap_icmp6(c, ee, toside, &otap.a6, data, dlen,
 				   FLOW_IDX(uflow));
 		return 1;
 	}
-- 
@@ -539,10 +539,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 		.msg_control = buf,
 		.msg_controllen = sizeof(buf),
 	};
-	const struct flowside *toside;
+	const struct flowside *fromside, *toside;
+	union inany_addr offender, otap;
 	char astr[INANY_ADDRSTRLEN];
 	char sastr[SOCKADDR_STRLEN];
-	union inany_addr offender;
 	const struct in_addr *o4;
 	in_port_t offender_port;
 	struct udp_flow *uflow;
@@ -602,6 +602,7 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 
 	uflow = udp_at_sidx(sidx);
 	ASSERT(uflow);
+	fromside = &uflow->f.side[sidx.sidei];
 	toside = &uflow->f.side[!sidx.sidei];
 	topif = uflow->f.pif[!sidx.sidei];
 	dlen = rc;
@@ -614,15 +615,24 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
 		/* XXX Can we support any other cases? */
 		goto fail;
 
+	/* If the offender *is* the endpoint, make sure our translation is
+	 * consistent with the flow's translation.  This matters if the flow
+	 * endpoint has a port specific translation (like --dns-match).
+	 */
+	if (inany_equals(&offender, &fromside->eaddr))
+		otap = toside->oaddr;
+	else if (!nat_inbound(c, &offender, &otap))
+		goto fail;
+
 	if (hdr->cmsg_level == IPPROTO_IP &&
-	    (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
+	    (o4 = inany_v4(&otap)) && inany_v4(&toside->eaddr)) {
 		dlen = MIN(dlen, ICMP4_MAX_DLEN);
 		udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
 		return 1;
 	}
 
 	if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
-		udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
+		udp_send_tap_icmp6(c, ee, toside, &otap.a6, data, dlen,
 				   FLOW_IDX(uflow));
 		return 1;
 	}
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-04-17  1:56 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-17  1:55 [PATCH v2 0/4] Translate source addresses for ICMP errors David Gibson
2025-04-17  1:55 ` [PATCH v2 1/4] fwd: Split out helpers for port-independent NAT David Gibson
2025-04-17  1:55 ` [PATCH v2 2/4] treewide: Improve robustness against sockaddrs of unexpected family David Gibson
2025-04-17  1:55 ` [PATCH v2 3/4] udp: Rework offender address handling in udp_sock_recverr() David Gibson
2025-04-17  1:55 ` [PATCH v2 4/4] udp: Translate offender addresses for ICMP messages David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).