* [PATCH 0/4] Translate source addresses for ICMP errors
@ 2025-04-16 9:07 David Gibson
2025-04-16 9:07 ` [PATCH 1/4] fwd: Split out helpers for port-independent NAT David Gibson
` (4 more replies)
0 siblings, 5 replies; 10+ messages in thread
From: David Gibson @ 2025-04-16 9:07 UTC (permalink / raw)
To: Stefano Brivio, Jon Maloy, passt-dev; +Cc: David Gibson
We now propagate ICMP errors on UDP flows back into ICMP packets on
the tap interface. However, we don't always get the source address
right for the synthesized message. Because ICMPs can be generated by
intermediate routers, that source address might not be one of the
endpoints, so the address translation we already have isn't
sufficient.
Implement properly translating ICMP addresses when we need to. This
ended up a bit messier than I hoped, but it seems to work. A simple
case to test this is:
pasta --config-net --map-host-loopback=172.16.1.1 -- \
sh -c "echo hello | socat STDIO UDP4:172.16.1.1:10001"
where 10001 is a port where nothing is listening on the host.
Without this series, this will just time out. pasta sends an ICMP
Port Unreachable message, but it's sent with source address 127.0.0.1
and so discarded by the guest. With this series, the address is
properly translated and we correctly get the error from socat:
2025/04/16 19:02:37 socat[3] E read(5, 0x555c3dbf2000, 8192): Connection refused
David Gibson (4):
fwd: Split out helpers for port-independent NAT
treewide: Improve robustness against sockaddrs of unexpected family
udp: Rework offender address handling in udp_sock_recverr()
udp: Translate offender addresses for ICMP messages
flow.c | 16 ++++++++--
fwd.c | 87 ++++++++++++++++++++++++++++++++++++++----------------
fwd.h | 3 ++
inany.h | 22 +++++++++-----
tcp.c | 10 +++----
udp.c | 79 +++++++++++++++++++++++++++++++++++--------------
udp_flow.c | 6 ++--
7 files changed, 157 insertions(+), 66 deletions(-)
--
2.49.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/4] fwd: Split out helpers for port-independent NAT
2025-04-16 9:07 [PATCH 0/4] Translate source addresses for ICMP errors David Gibson
@ 2025-04-16 9:07 ` David Gibson
2025-04-16 9:07 ` [PATCH 2/4] treewide: Improve robustness against sockaddrs of unexpected family David Gibson
` (3 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2025-04-16 9:07 UTC (permalink / raw)
To: Stefano Brivio, Jon Maloy, passt-dev; +Cc: David Gibson
Currently the functions fwd_nat_from_*() make some address translations
based on both the IP address and protocol port numbers, and others based
only on the address. We have some upcoming cases where it's useful to use
the IP-address-only translations separately, so split them out into helper
functions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
fwd.c | 87 ++++++++++++++++++++++++++++++++++++++++++-----------------
1 file changed, 62 insertions(+), 25 deletions(-)
diff --git a/fwd.c b/fwd.c
index 2829cd24..5c70e834 100644
--- a/fwd.c
+++ b/fwd.c
@@ -323,6 +323,30 @@ static bool fwd_guest_accessible(const struct ctx *c,
return fwd_guest_accessible6(c, &addr->a6);
}
+/**
+ * nat_outbound() - Apply address translation for outbound (TAP to HOST)
+ * @c: Execution context
+ * @addr: Input address (as seen on TAP interface)
+ * @translated: Output address (as seen on HOST interface)
+ *
+ * Only handles translations that depend *only* on the address. Anything
+ * related to specific ports or flows is handled elsewhere.
+ */
+static void nat_outbound(const struct ctx *c, const union inany_addr *addr,
+ union inany_addr *translated)
+{
+ if (inany_equals4(addr, &c->ip4.map_host_loopback))
+ *translated = inany_loopback4;
+ else if (inany_equals6(addr, &c->ip6.map_host_loopback))
+ *translated = inany_loopback6;
+ else if (inany_equals4(addr, &c->ip4.map_guest_addr))
+ *translated = inany_from_v4(c->ip4.addr);
+ else if (inany_equals6(addr, &c->ip6.map_guest_addr))
+ translated->a6 = c->ip6.addr;
+ else
+ *translated = *addr;
+}
+
/**
* fwd_nat_from_tap() - Determine to forward a flow from the tap interface
* @c: Execution context
@@ -342,16 +366,8 @@ uint8_t fwd_nat_from_tap(const struct ctx *c, uint8_t proto,
else if (is_dns_flow(proto, ini) &&
inany_equals6(&ini->oaddr, &c->ip6.dns_match))
tgt->eaddr.a6 = c->ip6.dns_host;
- else if (inany_equals4(&ini->oaddr, &c->ip4.map_host_loopback))
- tgt->eaddr = inany_loopback4;
- else if (inany_equals6(&ini->oaddr, &c->ip6.map_host_loopback))
- tgt->eaddr = inany_loopback6;
- else if (inany_equals4(&ini->oaddr, &c->ip4.map_guest_addr))
- tgt->eaddr = inany_from_v4(c->ip4.addr);
- else if (inany_equals6(&ini->oaddr, &c->ip6.map_guest_addr))
- tgt->eaddr.a6 = c->ip6.addr;
else
- tgt->eaddr = ini->oaddr;
+ nat_outbound(c, &ini->oaddr, &tgt->eaddr);
tgt->eport = ini->oport;
@@ -423,6 +439,42 @@ uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
return PIF_HOST;
}
+/**
+ * nat_inbound() - Apply address translation for outbound (HOST to TAP)
+ * @c: Execution context
+ * @addr: Input address (as seen on HOST interface)
+ * @translated: Output address (as seen on TAP interface)
+ *
+ * Return: true on success, false if it couldn't translate the address
+ *
+ * Only handles translations that depend *only* on the address. Anything
+ * related to specific ports or flows is handled elsewhere.
+ */
+static bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
+ union inany_addr *translated)
+{
+ if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
+ inany_equals4(addr, &in4addr_loopback)) {
+ /* Specifically 127.0.0.1, not 127.0.0.0/8 */
+ *translated = inany_from_v4(c->ip4.map_host_loopback);
+ } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_host_loopback) &&
+ inany_equals6(addr, &in6addr_loopback)) {
+ translated->a6 = c->ip6.map_host_loopback;
+ } else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) &&
+ inany_equals4(addr, &c->ip4.addr)) {
+ *translated = inany_from_v4(c->ip4.map_guest_addr);
+ } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) &&
+ inany_equals6(addr, &c->ip6.addr)) {
+ translated->a6 = c->ip6.map_guest_addr;
+ } else if (fwd_guest_accessible(c, addr)) {
+ *translated = *addr;
+ } else {
+ return false;
+ }
+
+ return true;
+}
+
/**
* fwd_nat_from_host() - Determine to forward a flow from the host interface
* @c: Execution context
@@ -479,20 +531,7 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
return PIF_SPLICE;
}
- if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
- inany_equals4(&ini->eaddr, &in4addr_loopback)) {
- /* Specifically 127.0.0.1, not 127.0.0.0/8 */
- tgt->oaddr = inany_from_v4(c->ip4.map_host_loopback);
- } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_host_loopback) &&
- inany_equals6(&ini->eaddr, &in6addr_loopback)) {
- tgt->oaddr.a6 = c->ip6.map_host_loopback;
- } else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) &&
- inany_equals4(&ini->eaddr, &c->ip4.addr)) {
- tgt->oaddr = inany_from_v4(c->ip4.map_guest_addr);
- } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) &&
- inany_equals6(&ini->eaddr, &c->ip6.addr)) {
- tgt->oaddr.a6 = c->ip6.map_guest_addr;
- } else if (!fwd_guest_accessible(c, &ini->eaddr)) {
+ if (!nat_inbound(c, &ini->eaddr, &tgt->oaddr)) {
if (inany_v4(&ini->eaddr)) {
if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.our_tap_addr))
/* No source address we can use */
@@ -501,8 +540,6 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
} else {
tgt->oaddr.a6 = c->ip6.our_tap_ll;
}
- } else {
- tgt->oaddr = ini->eaddr;
}
tgt->oport = ini->eport;
--
@@ -323,6 +323,30 @@ static bool fwd_guest_accessible(const struct ctx *c,
return fwd_guest_accessible6(c, &addr->a6);
}
+/**
+ * nat_outbound() - Apply address translation for outbound (TAP to HOST)
+ * @c: Execution context
+ * @addr: Input address (as seen on TAP interface)
+ * @translated: Output address (as seen on HOST interface)
+ *
+ * Only handles translations that depend *only* on the address. Anything
+ * related to specific ports or flows is handled elsewhere.
+ */
+static void nat_outbound(const struct ctx *c, const union inany_addr *addr,
+ union inany_addr *translated)
+{
+ if (inany_equals4(addr, &c->ip4.map_host_loopback))
+ *translated = inany_loopback4;
+ else if (inany_equals6(addr, &c->ip6.map_host_loopback))
+ *translated = inany_loopback6;
+ else if (inany_equals4(addr, &c->ip4.map_guest_addr))
+ *translated = inany_from_v4(c->ip4.addr);
+ else if (inany_equals6(addr, &c->ip6.map_guest_addr))
+ translated->a6 = c->ip6.addr;
+ else
+ *translated = *addr;
+}
+
/**
* fwd_nat_from_tap() - Determine to forward a flow from the tap interface
* @c: Execution context
@@ -342,16 +366,8 @@ uint8_t fwd_nat_from_tap(const struct ctx *c, uint8_t proto,
else if (is_dns_flow(proto, ini) &&
inany_equals6(&ini->oaddr, &c->ip6.dns_match))
tgt->eaddr.a6 = c->ip6.dns_host;
- else if (inany_equals4(&ini->oaddr, &c->ip4.map_host_loopback))
- tgt->eaddr = inany_loopback4;
- else if (inany_equals6(&ini->oaddr, &c->ip6.map_host_loopback))
- tgt->eaddr = inany_loopback6;
- else if (inany_equals4(&ini->oaddr, &c->ip4.map_guest_addr))
- tgt->eaddr = inany_from_v4(c->ip4.addr);
- else if (inany_equals6(&ini->oaddr, &c->ip6.map_guest_addr))
- tgt->eaddr.a6 = c->ip6.addr;
else
- tgt->eaddr = ini->oaddr;
+ nat_outbound(c, &ini->oaddr, &tgt->eaddr);
tgt->eport = ini->oport;
@@ -423,6 +439,42 @@ uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
return PIF_HOST;
}
+/**
+ * nat_inbound() - Apply address translation for outbound (HOST to TAP)
+ * @c: Execution context
+ * @addr: Input address (as seen on HOST interface)
+ * @translated: Output address (as seen on TAP interface)
+ *
+ * Return: true on success, false if it couldn't translate the address
+ *
+ * Only handles translations that depend *only* on the address. Anything
+ * related to specific ports or flows is handled elsewhere.
+ */
+static bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
+ union inany_addr *translated)
+{
+ if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
+ inany_equals4(addr, &in4addr_loopback)) {
+ /* Specifically 127.0.0.1, not 127.0.0.0/8 */
+ *translated = inany_from_v4(c->ip4.map_host_loopback);
+ } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_host_loopback) &&
+ inany_equals6(addr, &in6addr_loopback)) {
+ translated->a6 = c->ip6.map_host_loopback;
+ } else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) &&
+ inany_equals4(addr, &c->ip4.addr)) {
+ *translated = inany_from_v4(c->ip4.map_guest_addr);
+ } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) &&
+ inany_equals6(addr, &c->ip6.addr)) {
+ translated->a6 = c->ip6.map_guest_addr;
+ } else if (fwd_guest_accessible(c, addr)) {
+ *translated = *addr;
+ } else {
+ return false;
+ }
+
+ return true;
+}
+
/**
* fwd_nat_from_host() - Determine to forward a flow from the host interface
* @c: Execution context
@@ -479,20 +531,7 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
return PIF_SPLICE;
}
- if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
- inany_equals4(&ini->eaddr, &in4addr_loopback)) {
- /* Specifically 127.0.0.1, not 127.0.0.0/8 */
- tgt->oaddr = inany_from_v4(c->ip4.map_host_loopback);
- } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_host_loopback) &&
- inany_equals6(&ini->eaddr, &in6addr_loopback)) {
- tgt->oaddr.a6 = c->ip6.map_host_loopback;
- } else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) &&
- inany_equals4(&ini->eaddr, &c->ip4.addr)) {
- tgt->oaddr = inany_from_v4(c->ip4.map_guest_addr);
- } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) &&
- inany_equals6(&ini->eaddr, &c->ip6.addr)) {
- tgt->oaddr.a6 = c->ip6.map_guest_addr;
- } else if (!fwd_guest_accessible(c, &ini->eaddr)) {
+ if (!nat_inbound(c, &ini->eaddr, &tgt->oaddr)) {
if (inany_v4(&ini->eaddr)) {
if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.our_tap_addr))
/* No source address we can use */
@@ -501,8 +540,6 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
} else {
tgt->oaddr.a6 = c->ip6.our_tap_ll;
}
- } else {
- tgt->oaddr = ini->eaddr;
}
tgt->oport = ini->eport;
--
2.49.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/4] treewide: Improve robustness against sockaddrs of unexpected family
2025-04-16 9:07 [PATCH 0/4] Translate source addresses for ICMP errors David Gibson
2025-04-16 9:07 ` [PATCH 1/4] fwd: Split out helpers for port-independent NAT David Gibson
@ 2025-04-16 9:07 ` David Gibson
2025-04-16 9:41 ` Stefano Brivio
2025-04-16 9:07 ` [PATCH 3/4] udp: Rework offender address handling in udp_sock_recverr() David Gibson
` (2 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2025-04-16 9:07 UTC (permalink / raw)
To: Stefano Brivio, Jon Maloy, passt-dev; +Cc: David Gibson
inany_from_sockaddr() expects a socket address of family AF_INET or
AF_INET6 and ASSERT()s if it gets anything else. In many of the callers we
can handle an unexpected family more gracefully, though, e.g. by failing
a single flow rather than killing passt.
Change inany_from_sockaddr() to return an error instead of ASSERT()ing,
and handle those errors in the callers. Improve the reporting of any such
errors while we're at it.
With this greater robustness, allow inany_from_sockaddr() to take a void *
rather than specifically a union sockaddr_inany *.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
flow.c | 16 ++++++++++++++--
inany.h | 22 ++++++++++++++--------
tcp.c | 10 ++++------
udp_flow.c | 6 +++---
4 files changed, 35 insertions(+), 19 deletions(-)
diff --git a/flow.c b/flow.c
index 3c81cb42..447c0211 100644
--- a/flow.c
+++ b/flow.c
@@ -408,7 +408,12 @@ struct flowside *flow_initiate_sa(union flow *flow, uint8_t pif,
{
struct flowside *ini = &flow->f.side[INISIDE];
- inany_from_sockaddr(&ini->eaddr, &ini->eport, ssa);
+ if (inany_from_sockaddr(&ini->eaddr, &ini->eport, ssa) < 0) {
+ char str[SOCKADDR_STRLEN];
+
+ ASSERT_WITH_MSG(0, "Bad socket address %s",
+ sockaddr_ntop(ssa, str, sizeof(str)));
+ }
if (daddr)
ini->oaddr = *daddr;
else if (inany_v4(&ini->eaddr))
@@ -768,7 +773,14 @@ flow_sidx_t flow_lookup_sa(const struct ctx *c, uint8_t proto, uint8_t pif,
.oport = oport,
};
- inany_from_sockaddr(&side.eaddr, &side.eport, esa);
+ if (inany_from_sockaddr(&side.eaddr, &side.eport, esa) < 0) {
+ char str[SOCKADDR_STRLEN];
+
+ warn("Flow lookup on bad socket address %s",
+ sockaddr_ntop(esa, str, sizeof(str)));
+ return FLOW_SIDX_NONE;
+ }
+
if (oaddr)
side.oaddr = *oaddr;
else if (inany_v4(&side.eaddr))
diff --git a/inany.h b/inany.h
index 1c247e1e..f5c618bf 100644
--- a/inany.h
+++ b/inany.h
@@ -239,22 +239,28 @@ static inline void inany_from_af(union inany_addr *aa,
/** inany_from_sockaddr - Extract IPv[46] address and port number from sockaddr
* @aa: Pointer to store IPv[46] address
* @port: Pointer to store port number, host order
- * @addr: AF_INET or AF_INET6 socket address
+ * @addr: Socket address
+ *
+ * Return: 0 on success, -1 on error (bad address family)
*/
-static inline void inany_from_sockaddr(union inany_addr *aa, in_port_t *port,
- const union sockaddr_inany *sa)
+static inline int inany_from_sockaddr(union inany_addr *aa, in_port_t *port,
+ const void *sa_)
{
+ const union sockaddr_inany *sa = (const union sockaddr_inany *)sa_;
+
if (sa->sa_family == AF_INET6) {
inany_from_af(aa, AF_INET6, &sa->sa6.sin6_addr);
*port = ntohs(sa->sa6.sin6_port);
- } else if (sa->sa_family == AF_INET) {
+ return 0;
+ }
+
+ if (sa->sa_family == AF_INET) {
inany_from_af(aa, AF_INET, &sa->sa4.sin_addr);
*port = ntohs(sa->sa4.sin_port);
- } else {
- /* Not valid to call with other address families */
- ASSERT_WITH_MSG(0, "Unexpected sockaddr family: %u",
- sa->sa_family);
+ return 0;
}
+
+ return -1;
}
/** inany_siphash_feed- Fold IPv[46] address into an in-progress siphash
diff --git a/tcp.c b/tcp.c
index 9c6bc529..0ac298a7 100644
--- a/tcp.c
+++ b/tcp.c
@@ -1546,9 +1546,8 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af,
if (c->mode == MODE_VU) { /* To rebind to same oport after migration */
sl = sizeof(sa);
- if (!getsockname(s, &sa.sa, &sl))
- inany_from_sockaddr(&tgt->oaddr, &tgt->oport, &sa);
- else
+ if (getsockname(s, &sa.sa, &sl) ||
+ inany_from_sockaddr(&tgt->oaddr, &tgt->oport, &sa) < 0)
err_perror("Can't get local address for socket %i", s);
}
@@ -2204,9 +2203,8 @@ void tcp_listen_handler(const struct ctx *c, union epoll_ref ref,
NULL, ref.tcp_listen.port);
if (c->mode == MODE_VU) { /* Rebind to same address after migration */
- if (!getsockname(s, &sa.sa, &sl))
- inany_from_sockaddr(&ini->oaddr, &ini->oport, &sa);
- else
+ if (getsockname(s, &sa.sa, &sl) ||
+ inany_from_sockaddr(&ini->oaddr, &ini->oport, &sa) < 0)
err_perror("Can't get local address for socket %i", s);
}
diff --git a/udp_flow.c b/udp_flow.c
index ef2cbb06..fea1cf3c 100644
--- a/udp_flow.c
+++ b/udp_flow.c
@@ -158,12 +158,12 @@ static flow_sidx_t udp_flow_new(const struct ctx *c, union flow *flow,
socklen_t sl = sizeof(sa);
in_port_t port;
- if (getsockname(uflow->s[TGTSIDE], &sa.sa, &sl) < 0) {
+ if (getsockname(uflow->s[TGTSIDE], &sa.sa, &sl) < 0 ||
+ inany_from_sockaddr(&uflow->f.side[TGTSIDE].oaddr,
+ &port, &sa) < 0) {
flow_perror(uflow, "Unable to determine local address");
goto cancel;
}
- inany_from_sockaddr(&uflow->f.side[TGTSIDE].oaddr,
- &port, &sa);
if (port != tgt->oport) {
flow_err(uflow, "Unexpected local port");
goto cancel;
--
@@ -158,12 +158,12 @@ static flow_sidx_t udp_flow_new(const struct ctx *c, union flow *flow,
socklen_t sl = sizeof(sa);
in_port_t port;
- if (getsockname(uflow->s[TGTSIDE], &sa.sa, &sl) < 0) {
+ if (getsockname(uflow->s[TGTSIDE], &sa.sa, &sl) < 0 ||
+ inany_from_sockaddr(&uflow->f.side[TGTSIDE].oaddr,
+ &port, &sa) < 0) {
flow_perror(uflow, "Unable to determine local address");
goto cancel;
}
- inany_from_sockaddr(&uflow->f.side[TGTSIDE].oaddr,
- &port, &sa);
if (port != tgt->oport) {
flow_err(uflow, "Unexpected local port");
goto cancel;
--
2.49.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/4] udp: Rework offender address handling in udp_sock_recverr()
2025-04-16 9:07 [PATCH 0/4] Translate source addresses for ICMP errors David Gibson
2025-04-16 9:07 ` [PATCH 1/4] fwd: Split out helpers for port-independent NAT David Gibson
2025-04-16 9:07 ` [PATCH 2/4] treewide: Improve robustness against sockaddrs of unexpected family David Gibson
@ 2025-04-16 9:07 ` David Gibson
2025-04-16 14:27 ` Stefano Brivio
2025-04-16 9:07 ` [PATCH 4/4] udp: Translate offender addresses for ICMP messages David Gibson
2025-04-16 14:27 ` [PATCH 0/4] Translate source addresses for ICMP errors Stefano Brivio
4 siblings, 1 reply; 10+ messages in thread
From: David Gibson @ 2025-04-16 9:07 UTC (permalink / raw)
To: Stefano Brivio, Jon Maloy, passt-dev; +Cc: David Gibson
Make a number of changes to udp_sock_recverr() to improve the robustness
of how we handle addresses.
* Get the "offender" address (source of the ICMP packet) using the
SO_EE_OFFENDER() macro, reducing assumptions about structure layout.
* Parse the offender sockaddr using inany_from_sockaddr()
* Check explicitly that the source and destination pifs are what we
expect. Previously we checked something that was probably equivalent
in practice, but isn't strictly speaking what we require for the rest
of the code.
* Verify that for an ICMPv4 error we also have an IPv4 source/offender
and destination/endpoint address
* Verify that for an ICMPv6 error we have an IPv6 endpoint
* Improve debug reporting of any failures
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 67 ++++++++++++++++++++++++++++++++++++++++-------------------
1 file changed, 46 insertions(+), 21 deletions(-)
diff --git a/udp.c b/udp.c
index 57769d06..4352520e 100644
--- a/udp.c
+++ b/udp.c
@@ -159,6 +159,12 @@ udp_meta[UDP_MAX_FRAMES];
MAX(CMSG_SPACE(sizeof(struct in_pktinfo)), \
CMSG_SPACE(sizeof(struct in6_pktinfo)))
+#define RECVERR_SPACE \
+ MAX(CMSG_SPACE(sizeof(struct sock_extended_err) + \
+ sizeof(struct sockaddr_in)), \
+ CMSG_SPACE(sizeof(struct sock_extended_err) + \
+ sizeof(struct sockaddr_in6)))
+
/**
* enum udp_iov_idx - Indices for the buffers making up a single UDP frame
* @UDP_IOV_TAP tap specific header
@@ -516,12 +522,8 @@ static int udp_pktinfo(struct msghdr *msg, union inany_addr *dst)
static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
uint8_t pif, in_port_t port)
{
- struct errhdr {
- struct sock_extended_err ee;
- union sockaddr_inany saddr;
- };
- char buf[PKTINFO_SPACE + CMSG_SPACE(sizeof(struct errhdr))];
- const struct errhdr *eh = NULL;
+ char buf[PKTINFO_SPACE + RECVERR_SPACE];
+ const struct sock_extended_err *ee;
char data[ICMP6_MAX_DLEN];
struct cmsghdr *hdr;
struct iovec iov = {
@@ -538,7 +540,12 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
.msg_controllen = sizeof(buf),
};
const struct flowside *toside;
- flow_sidx_t tosidx;
+ char astr[INANY_ADDRSTRLEN];
+ char sastr[SOCKADDR_STRLEN];
+ union inany_addr offender;
+ const struct in_addr *o4;
+ in_port_t offender_port;
+ uint8_t topif;
size_t dlen;
ssize_t rc;
@@ -569,10 +576,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
return -1;
}
- eh = (const struct errhdr *)CMSG_DATA(hdr);
+ ee = (const struct sock_extended_err *)CMSG_DATA(hdr);
debug("%s error on UDP socket %i: %s",
- str_ee_origin(&eh->ee), s, strerror_(eh->ee.ee_errno));
+ str_ee_origin(ee), s, strerror_(ee->ee_errno));
if (!flow_sidx_valid(sidx)) {
/* No hint from the socket, determine flow from addresses */
@@ -588,25 +595,43 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
debug("Ignoring UDP error without flow");
return 1;
}
+ } else {
+ pif = pif_at_sidx(sidx);
}
- tosidx = flow_sidx_opposite(sidx);
- toside = flowside_at_sidx(tosidx);
+ toside = flowside_at_sidx(flow_sidx_opposite(sidx));
+ topif = pif_at_sidx(flow_sidx_opposite(sidx));
dlen = rc;
- if (pif_is_socket(pif_at_sidx(tosidx))) {
- /* XXX Is there any way to propagate ICMPs from socket to
- * socket? */
- } else if (hdr->cmsg_level == IPPROTO_IP) {
+ if (inany_from_sockaddr(&offender, &offender_port,
+ SO_EE_OFFENDER(ee)) < 0)
+ goto fail;
+
+ if (pif != PIF_HOST || topif != PIF_TAP)
+ /* XXX Can we support any other cases? */
+ goto fail;
+
+ if (hdr->cmsg_level == IPPROTO_IP &&
+ (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
dlen = MIN(dlen, ICMP4_MAX_DLEN);
- udp_send_tap_icmp4(c, &eh->ee, toside,
- eh->saddr.sa4.sin_addr, data, dlen);
- } else if (hdr->cmsg_level == IPPROTO_IPV6) {
- udp_send_tap_icmp6(c, &eh->ee, toside,
- &eh->saddr.sa6.sin6_addr, data,
- dlen, sidx.flowi);
+ udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
+ return 1;
+ }
+
+ if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
+ udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
+ sidx.flowi);
+ return 1;
}
+fail:
+ flow_dbg(flow_at_sidx(sidx),
+ "Can't propagate %s error from %s %s to %s %s",
+ str_ee_origin(ee),
+ pif_name(pif),
+ sockaddr_ntop(SO_EE_OFFENDER(ee), sastr, sizeof(sastr)),
+ pif_name(topif),
+ inany_ntop(&toside->eaddr, astr, sizeof(astr)));
return 1;
}
--
@@ -159,6 +159,12 @@ udp_meta[UDP_MAX_FRAMES];
MAX(CMSG_SPACE(sizeof(struct in_pktinfo)), \
CMSG_SPACE(sizeof(struct in6_pktinfo)))
+#define RECVERR_SPACE \
+ MAX(CMSG_SPACE(sizeof(struct sock_extended_err) + \
+ sizeof(struct sockaddr_in)), \
+ CMSG_SPACE(sizeof(struct sock_extended_err) + \
+ sizeof(struct sockaddr_in6)))
+
/**
* enum udp_iov_idx - Indices for the buffers making up a single UDP frame
* @UDP_IOV_TAP tap specific header
@@ -516,12 +522,8 @@ static int udp_pktinfo(struct msghdr *msg, union inany_addr *dst)
static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
uint8_t pif, in_port_t port)
{
- struct errhdr {
- struct sock_extended_err ee;
- union sockaddr_inany saddr;
- };
- char buf[PKTINFO_SPACE + CMSG_SPACE(sizeof(struct errhdr))];
- const struct errhdr *eh = NULL;
+ char buf[PKTINFO_SPACE + RECVERR_SPACE];
+ const struct sock_extended_err *ee;
char data[ICMP6_MAX_DLEN];
struct cmsghdr *hdr;
struct iovec iov = {
@@ -538,7 +540,12 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
.msg_controllen = sizeof(buf),
};
const struct flowside *toside;
- flow_sidx_t tosidx;
+ char astr[INANY_ADDRSTRLEN];
+ char sastr[SOCKADDR_STRLEN];
+ union inany_addr offender;
+ const struct in_addr *o4;
+ in_port_t offender_port;
+ uint8_t topif;
size_t dlen;
ssize_t rc;
@@ -569,10 +576,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
return -1;
}
- eh = (const struct errhdr *)CMSG_DATA(hdr);
+ ee = (const struct sock_extended_err *)CMSG_DATA(hdr);
debug("%s error on UDP socket %i: %s",
- str_ee_origin(&eh->ee), s, strerror_(eh->ee.ee_errno));
+ str_ee_origin(ee), s, strerror_(ee->ee_errno));
if (!flow_sidx_valid(sidx)) {
/* No hint from the socket, determine flow from addresses */
@@ -588,25 +595,43 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
debug("Ignoring UDP error without flow");
return 1;
}
+ } else {
+ pif = pif_at_sidx(sidx);
}
- tosidx = flow_sidx_opposite(sidx);
- toside = flowside_at_sidx(tosidx);
+ toside = flowside_at_sidx(flow_sidx_opposite(sidx));
+ topif = pif_at_sidx(flow_sidx_opposite(sidx));
dlen = rc;
- if (pif_is_socket(pif_at_sidx(tosidx))) {
- /* XXX Is there any way to propagate ICMPs from socket to
- * socket? */
- } else if (hdr->cmsg_level == IPPROTO_IP) {
+ if (inany_from_sockaddr(&offender, &offender_port,
+ SO_EE_OFFENDER(ee)) < 0)
+ goto fail;
+
+ if (pif != PIF_HOST || topif != PIF_TAP)
+ /* XXX Can we support any other cases? */
+ goto fail;
+
+ if (hdr->cmsg_level == IPPROTO_IP &&
+ (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
dlen = MIN(dlen, ICMP4_MAX_DLEN);
- udp_send_tap_icmp4(c, &eh->ee, toside,
- eh->saddr.sa4.sin_addr, data, dlen);
- } else if (hdr->cmsg_level == IPPROTO_IPV6) {
- udp_send_tap_icmp6(c, &eh->ee, toside,
- &eh->saddr.sa6.sin6_addr, data,
- dlen, sidx.flowi);
+ udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
+ return 1;
+ }
+
+ if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
+ udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
+ sidx.flowi);
+ return 1;
}
+fail:
+ flow_dbg(flow_at_sidx(sidx),
+ "Can't propagate %s error from %s %s to %s %s",
+ str_ee_origin(ee),
+ pif_name(pif),
+ sockaddr_ntop(SO_EE_OFFENDER(ee), sastr, sizeof(sastr)),
+ pif_name(topif),
+ inany_ntop(&toside->eaddr, astr, sizeof(astr)));
return 1;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 4/4] udp: Translate offender addresses for ICMP messages
2025-04-16 9:07 [PATCH 0/4] Translate source addresses for ICMP errors David Gibson
` (2 preceding siblings ...)
2025-04-16 9:07 ` [PATCH 3/4] udp: Rework offender address handling in udp_sock_recverr() David Gibson
@ 2025-04-16 9:07 ` David Gibson
2025-04-16 14:27 ` [PATCH 0/4] Translate source addresses for ICMP errors Stefano Brivio
4 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2025-04-16 9:07 UTC (permalink / raw)
To: Stefano Brivio, Jon Maloy, passt-dev; +Cc: David Gibson
We've recently added support for propagating ICMP errors related to a UDP
flow from the host to the guest, by handling the extended UDP error on the
socket and synthesizing a suitable ICMP on the tap interface.
Currently we create that ICMP with a source address of the "offender" from
the extended error information - the source of the ICMP error received on
the host. However, we don't translate this address for cases where we NAT
between host and guest. This means (amongst other things) that we won't
get a "Connection refused" error as expected if send data from the guest to
the --map-host-loopback address. The error comes from 127.0.0.1 on the
host, which doesn't make sense on the tap interface and will be discarded
by the guest.
Because ICMP errors can be sent by an intermediate host, not just by the
endpoints of the flow, we can't handle this translation purely with the
information in the flow table entry. We need to explicitly translate this
address by our NAT rules, which we can do with the nat_inbound() helper.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
fwd.c | 4 ++--
fwd.h | 3 +++
udp.c | 18 ++++++++++++++----
3 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/fwd.c b/fwd.c
index 5c70e834..b73c2c8d 100644
--- a/fwd.c
+++ b/fwd.c
@@ -450,8 +450,8 @@ uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
* Only handles translations that depend *only* on the address. Anything
* related to specific ports or flows is handled elsewhere.
*/
-static bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
- union inany_addr *translated)
+bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
+ union inany_addr *translated)
{
if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_host_loopback) &&
inany_equals4(addr, &in4addr_loopback)) {
diff --git a/fwd.h b/fwd.h
index 3562f3ca..0458a3c5 100644
--- a/fwd.h
+++ b/fwd.h
@@ -7,6 +7,7 @@
#ifndef FWD_H
#define FWD_H
+union inany_addr;
struct flowside;
/* Number of ports for both TCP and UDP */
@@ -47,6 +48,8 @@ void fwd_scan_ports_udp(struct fwd_ports *fwd, const struct fwd_ports *rev,
const struct fwd_ports *tcp_rev);
void fwd_scan_ports_init(struct ctx *c);
+bool nat_inbound(const struct ctx *c, const union inany_addr *addr,
+ union inany_addr *translated);
uint8_t fwd_nat_from_tap(const struct ctx *c, uint8_t proto,
const struct flowside *ini, struct flowside *tgt);
uint8_t fwd_nat_from_splice(const struct ctx *c, uint8_t proto,
diff --git a/udp.c b/udp.c
index 4352520e..ebee1d14 100644
--- a/udp.c
+++ b/udp.c
@@ -539,10 +539,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
.msg_control = buf,
.msg_controllen = sizeof(buf),
};
- const struct flowside *toside;
+ const struct flowside *fromside, *toside;
+ union inany_addr offender, otap;
char astr[INANY_ADDRSTRLEN];
char sastr[SOCKADDR_STRLEN];
- union inany_addr offender;
const struct in_addr *o4;
in_port_t offender_port;
uint8_t topif;
@@ -599,6 +599,7 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
pif = pif_at_sidx(sidx);
}
+ fromside = flowside_at_sidx(sidx);
toside = flowside_at_sidx(flow_sidx_opposite(sidx));
topif = pif_at_sidx(flow_sidx_opposite(sidx));
dlen = rc;
@@ -611,15 +612,24 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
/* XXX Can we support any other cases? */
goto fail;
+ /* If the offender *is* the endpoint, make sure our translation is
+ * consistent with the flow's translation. This matters if the flow
+ * endpoint has a port specific translation (like --dns-match).
+ */
+ if (inany_equals(&offender, &fromside->eaddr))
+ otap = toside->oaddr;
+ else if (!nat_inbound(c, &offender, &otap))
+ goto fail;
+
if (hdr->cmsg_level == IPPROTO_IP &&
- (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
+ (o4 = inany_v4(&otap)) && inany_v4(&toside->eaddr)) {
dlen = MIN(dlen, ICMP4_MAX_DLEN);
udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
return 1;
}
if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
- udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
+ udp_send_tap_icmp6(c, ee, toside, &otap.a6, data, dlen,
sidx.flowi);
return 1;
}
--
@@ -539,10 +539,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
.msg_control = buf,
.msg_controllen = sizeof(buf),
};
- const struct flowside *toside;
+ const struct flowside *fromside, *toside;
+ union inany_addr offender, otap;
char astr[INANY_ADDRSTRLEN];
char sastr[SOCKADDR_STRLEN];
- union inany_addr offender;
const struct in_addr *o4;
in_port_t offender_port;
uint8_t topif;
@@ -599,6 +599,7 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
pif = pif_at_sidx(sidx);
}
+ fromside = flowside_at_sidx(sidx);
toside = flowside_at_sidx(flow_sidx_opposite(sidx));
topif = pif_at_sidx(flow_sidx_opposite(sidx));
dlen = rc;
@@ -611,15 +612,24 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
/* XXX Can we support any other cases? */
goto fail;
+ /* If the offender *is* the endpoint, make sure our translation is
+ * consistent with the flow's translation. This matters if the flow
+ * endpoint has a port specific translation (like --dns-match).
+ */
+ if (inany_equals(&offender, &fromside->eaddr))
+ otap = toside->oaddr;
+ else if (!nat_inbound(c, &offender, &otap))
+ goto fail;
+
if (hdr->cmsg_level == IPPROTO_IP &&
- (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
+ (o4 = inany_v4(&otap)) && inany_v4(&toside->eaddr)) {
dlen = MIN(dlen, ICMP4_MAX_DLEN);
udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
return 1;
}
if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
- udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
+ udp_send_tap_icmp6(c, ee, toside, &otap.a6, data, dlen,
sidx.flowi);
return 1;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/4] treewide: Improve robustness against sockaddrs of unexpected family
2025-04-16 9:07 ` [PATCH 2/4] treewide: Improve robustness against sockaddrs of unexpected family David Gibson
@ 2025-04-16 9:41 ` Stefano Brivio
2025-04-17 1:14 ` David Gibson
0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2025-04-16 9:41 UTC (permalink / raw)
To: David Gibson; +Cc: Jon Maloy, passt-dev
On Wed, 16 Apr 2025 19:07:05 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> @@ -239,22 +239,28 @@ static inline void inany_from_af(union inany_addr *aa,
> /** inany_from_sockaddr - Extract IPv[46] address and port number from sockaddr
> * @aa: Pointer to store IPv[46] address
> * @port: Pointer to store port number, host order
> - * @addr: AF_INET or AF_INET6 socket address
> + * @addr: Socket address
This is actually sa_ now but... can we do something for argument names
in general, here? What about dst, port, sa, or dst, port, addr?
> + *
> + * Return: 0 on success, -1 on error (bad address family)
> */
> -static inline void inany_from_sockaddr(union inany_addr *aa, in_port_t *port,
> - const union sockaddr_inany *sa)
> +static inline int inany_from_sockaddr(union inany_addr *aa, in_port_t *port,
> + const void *sa_)
> {
> + const union sockaddr_inany *sa = (const union sockaddr_inany *)sa_;
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] udp: Rework offender address handling in udp_sock_recverr()
2025-04-16 9:07 ` [PATCH 3/4] udp: Rework offender address handling in udp_sock_recverr() David Gibson
@ 2025-04-16 14:27 ` Stefano Brivio
2025-04-17 1:33 ` David Gibson
0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2025-04-16 14:27 UTC (permalink / raw)
To: David Gibson; +Cc: Jon Maloy, passt-dev
On Wed, 16 Apr 2025 19:07:06 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> Make a number of changes to udp_sock_recverr() to improve the robustness
> of how we handle addresses.
>
> * Get the "offender" address (source of the ICMP packet) using the
> SO_EE_OFFENDER() macro, reducing assumptions about structure layout.
> * Parse the offender sockaddr using inany_from_sockaddr()
> * Check explicitly that the source and destination pifs are what we
> expect. Previously we checked something that was probably equivalent
> in practice, but isn't strictly speaking what we require for the rest
> of the code.
> * Verify that for an ICMPv4 error we also have an IPv4 source/offender
> and destination/endpoint address
> * Verify that for an ICMPv6 error we have an IPv6 endpoint
> * Improve debug reporting of any failures
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> udp.c | 67 ++++++++++++++++++++++++++++++++++++++++-------------------
> 1 file changed, 46 insertions(+), 21 deletions(-)
>
> diff --git a/udp.c b/udp.c
> index 57769d06..4352520e 100644
> --- a/udp.c
> +++ b/udp.c
> @@ -159,6 +159,12 @@ udp_meta[UDP_MAX_FRAMES];
> MAX(CMSG_SPACE(sizeof(struct in_pktinfo)), \
> CMSG_SPACE(sizeof(struct in6_pktinfo)))
>
> +#define RECVERR_SPACE \
> + MAX(CMSG_SPACE(sizeof(struct sock_extended_err) + \
> + sizeof(struct sockaddr_in)), \
> + CMSG_SPACE(sizeof(struct sock_extended_err) + \
> + sizeof(struct sockaddr_in6)))
> +
> /**
> * enum udp_iov_idx - Indices for the buffers making up a single UDP frame
> * @UDP_IOV_TAP tap specific header
> @@ -516,12 +522,8 @@ static int udp_pktinfo(struct msghdr *msg, union inany_addr *dst)
> static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
> uint8_t pif, in_port_t port)
> {
> - struct errhdr {
> - struct sock_extended_err ee;
> - union sockaddr_inany saddr;
> - };
> - char buf[PKTINFO_SPACE + CMSG_SPACE(sizeof(struct errhdr))];
> - const struct errhdr *eh = NULL;
> + char buf[PKTINFO_SPACE + RECVERR_SPACE];
> + const struct sock_extended_err *ee;
> char data[ICMP6_MAX_DLEN];
> struct cmsghdr *hdr;
> struct iovec iov = {
> @@ -538,7 +540,12 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
> .msg_controllen = sizeof(buf),
> };
> const struct flowside *toside;
> - flow_sidx_t tosidx;
> + char astr[INANY_ADDRSTRLEN];
> + char sastr[SOCKADDR_STRLEN];
> + union inany_addr offender;
> + const struct in_addr *o4;
> + in_port_t offender_port;
> + uint8_t topif;
> size_t dlen;
> ssize_t rc;
>
> @@ -569,10 +576,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
> return -1;
> }
>
> - eh = (const struct errhdr *)CMSG_DATA(hdr);
> + ee = (const struct sock_extended_err *)CMSG_DATA(hdr);
>
> debug("%s error on UDP socket %i: %s",
> - str_ee_origin(&eh->ee), s, strerror_(eh->ee.ee_errno));
> + str_ee_origin(ee), s, strerror_(ee->ee_errno));
>
> if (!flow_sidx_valid(sidx)) {
> /* No hint from the socket, determine flow from addresses */
> @@ -588,25 +595,43 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
> debug("Ignoring UDP error without flow");
> return 1;
> }
> + } else {
> + pif = pif_at_sidx(sidx);
Two stray trailing tabs here.
> }
>
> - tosidx = flow_sidx_opposite(sidx);
> - toside = flowside_at_sidx(tosidx);
> + toside = flowside_at_sidx(flow_sidx_opposite(sidx));
> + topif = pif_at_sidx(flow_sidx_opposite(sidx));
> dlen = rc;
>
> - if (pif_is_socket(pif_at_sidx(tosidx))) {
> - /* XXX Is there any way to propagate ICMPs from socket to
> - * socket? */
> - } else if (hdr->cmsg_level == IPPROTO_IP) {
> + if (inany_from_sockaddr(&offender, &offender_port,
> + SO_EE_OFFENDER(ee)) < 0)
> + goto fail;
> +
> + if (pif != PIF_HOST || topif != PIF_TAP)
> + /* XXX Can we support any other cases? */
> + goto fail;
> +
> + if (hdr->cmsg_level == IPPROTO_IP &&
> + (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
> dlen = MIN(dlen, ICMP4_MAX_DLEN);
> - udp_send_tap_icmp4(c, &eh->ee, toside,
> - eh->saddr.sa4.sin_addr, data, dlen);
> - } else if (hdr->cmsg_level == IPPROTO_IPV6) {
> - udp_send_tap_icmp6(c, &eh->ee, toside,
> - &eh->saddr.sa6.sin6_addr, data,
> - dlen, sidx.flowi);
> + udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
> + return 1;
> + }
> +
> + if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
> + udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
> + sidx.flowi);
> + return 1;
> }
>
> +fail:
> + flow_dbg(flow_at_sidx(sidx),
Coverity Scan seems to hallucinate here and says that flow_at_sidx()
could return NULL, with its return value later dereferenced by
flow_log(), even if you're explicitly checking flow_sidx_valid() in all
the paths reaching to this point.
Calling this conditionally only if flow_sidx_valid() doesn't mask the
false positive either (I guess that's the part that goes wrong
somehow), we really need to check if (flow_at_sidx(sidx)) flow_dbg(...).
Would it be possible to add the useless check just for my own sanity?
> + "Can't propagate %s error from %s %s to %s %s",
> + str_ee_origin(ee),
> + pif_name(pif),
> + sockaddr_ntop(SO_EE_OFFENDER(ee), sastr, sizeof(sastr)),
> + pif_name(topif),
> + inany_ntop(&toside->eaddr, astr, sizeof(astr)));
> return 1;
> }
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/4] Translate source addresses for ICMP errors
2025-04-16 9:07 [PATCH 0/4] Translate source addresses for ICMP errors David Gibson
` (3 preceding siblings ...)
2025-04-16 9:07 ` [PATCH 4/4] udp: Translate offender addresses for ICMP messages David Gibson
@ 2025-04-16 14:27 ` Stefano Brivio
4 siblings, 0 replies; 10+ messages in thread
From: Stefano Brivio @ 2025-04-16 14:27 UTC (permalink / raw)
To: David Gibson; +Cc: Jon Maloy, passt-dev
On Wed, 16 Apr 2025 19:07:03 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> We now propagate ICMP errors on UDP flows back into ICMP packets on
> the tap interface. However, we don't always get the source address
> right for the synthesized message. Because ICMPs can be generated by
> intermediate routers, that source address might not be one of the
> endpoints, so the address translation we already have isn't
> sufficient.
>
> Implement properly translating ICMP addresses when we need to. This
> ended up a bit messier than I hoped, but it seems to work. A simple
> case to test this is:
>
> pasta --config-net --map-host-loopback=172.16.1.1 -- \
> sh -c "echo hello | socat STDIO UDP4:172.16.1.1:10001"
>
> where 10001 is a port where nothing is listening on the host.
Oh, that's convenient.
I also checked this against the "bad resolver address" case I reported
previously, everything "works":
# nslookup passt.top 169.254.1.1
;; communications error to 169.254.1.1#53: connection refused
;; communications error to 169.254.1.1#53: connection refused
;; communications error to 169.254.1.1#53: connection refused
;; no servers could be reached
Except for those few comments to 2/4 and 3/4, everything else looks
good to me.
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/4] treewide: Improve robustness against sockaddrs of unexpected family
2025-04-16 9:41 ` Stefano Brivio
@ 2025-04-17 1:14 ` David Gibson
0 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2025-04-17 1:14 UTC (permalink / raw)
To: Stefano Brivio; +Cc: Jon Maloy, passt-dev
[-- Attachment #1: Type: text/plain, Size: 1298 bytes --]
On Wed, Apr 16, 2025 at 11:41:31AM +0200, Stefano Brivio wrote:
> On Wed, 16 Apr 2025 19:07:05 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > @@ -239,22 +239,28 @@ static inline void inany_from_af(union inany_addr *aa,
> > /** inany_from_sockaddr - Extract IPv[46] address and port number from sockaddr
> > * @aa: Pointer to store IPv[46] address
> > * @port: Pointer to store port number, host order
> > - * @addr: AF_INET or AF_INET6 socket address
> > + * @addr: Socket address
>
> This is actually sa_ now but... can we do something for argument names
> in general, here? What about dst, port, sa, or dst, port, addr?
Sure, done.
>
> > + *
> > + * Return: 0 on success, -1 on error (bad address family)
> > */
> > -static inline void inany_from_sockaddr(union inany_addr *aa, in_port_t *port,
> > - const union sockaddr_inany *sa)
> > +static inline int inany_from_sockaddr(union inany_addr *aa, in_port_t *port,
> > + const void *sa_)
> > {
> > + const union sockaddr_inany *sa = (const union sockaddr_inany *)sa_;
>
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/4] udp: Rework offender address handling in udp_sock_recverr()
2025-04-16 14:27 ` Stefano Brivio
@ 2025-04-17 1:33 ` David Gibson
0 siblings, 0 replies; 10+ messages in thread
From: David Gibson @ 2025-04-17 1:33 UTC (permalink / raw)
To: Stefano Brivio; +Cc: Jon Maloy, passt-dev
[-- Attachment #1: Type: text/plain, Size: 6465 bytes --]
On Wed, Apr 16, 2025 at 04:27:36PM +0200, Stefano Brivio wrote:
> On Wed, 16 Apr 2025 19:07:06 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > Make a number of changes to udp_sock_recverr() to improve the robustness
> > of how we handle addresses.
> >
> > * Get the "offender" address (source of the ICMP packet) using the
> > SO_EE_OFFENDER() macro, reducing assumptions about structure layout.
> > * Parse the offender sockaddr using inany_from_sockaddr()
> > * Check explicitly that the source and destination pifs are what we
> > expect. Previously we checked something that was probably equivalent
> > in practice, but isn't strictly speaking what we require for the rest
> > of the code.
> > * Verify that for an ICMPv4 error we also have an IPv4 source/offender
> > and destination/endpoint address
> > * Verify that for an ICMPv6 error we have an IPv6 endpoint
> > * Improve debug reporting of any failures
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > udp.c | 67 ++++++++++++++++++++++++++++++++++++++++-------------------
> > 1 file changed, 46 insertions(+), 21 deletions(-)
> >
> > diff --git a/udp.c b/udp.c
> > index 57769d06..4352520e 100644
> > --- a/udp.c
> > +++ b/udp.c
> > @@ -159,6 +159,12 @@ udp_meta[UDP_MAX_FRAMES];
> > MAX(CMSG_SPACE(sizeof(struct in_pktinfo)), \
> > CMSG_SPACE(sizeof(struct in6_pktinfo)))
> >
> > +#define RECVERR_SPACE \
> > + MAX(CMSG_SPACE(sizeof(struct sock_extended_err) + \
> > + sizeof(struct sockaddr_in)), \
> > + CMSG_SPACE(sizeof(struct sock_extended_err) + \
> > + sizeof(struct sockaddr_in6)))
> > +
> > /**
> > * enum udp_iov_idx - Indices for the buffers making up a single UDP frame
> > * @UDP_IOV_TAP tap specific header
> > @@ -516,12 +522,8 @@ static int udp_pktinfo(struct msghdr *msg, union inany_addr *dst)
> > static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
> > uint8_t pif, in_port_t port)
> > {
> > - struct errhdr {
> > - struct sock_extended_err ee;
> > - union sockaddr_inany saddr;
> > - };
> > - char buf[PKTINFO_SPACE + CMSG_SPACE(sizeof(struct errhdr))];
> > - const struct errhdr *eh = NULL;
> > + char buf[PKTINFO_SPACE + RECVERR_SPACE];
> > + const struct sock_extended_err *ee;
> > char data[ICMP6_MAX_DLEN];
> > struct cmsghdr *hdr;
> > struct iovec iov = {
> > @@ -538,7 +540,12 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
> > .msg_controllen = sizeof(buf),
> > };
> > const struct flowside *toside;
> > - flow_sidx_t tosidx;
> > + char astr[INANY_ADDRSTRLEN];
> > + char sastr[SOCKADDR_STRLEN];
> > + union inany_addr offender;
> > + const struct in_addr *o4;
> > + in_port_t offender_port;
> > + uint8_t topif;
> > size_t dlen;
> > ssize_t rc;
> >
> > @@ -569,10 +576,10 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
> > return -1;
> > }
> >
> > - eh = (const struct errhdr *)CMSG_DATA(hdr);
> > + ee = (const struct sock_extended_err *)CMSG_DATA(hdr);
> >
> > debug("%s error on UDP socket %i: %s",
> > - str_ee_origin(&eh->ee), s, strerror_(eh->ee.ee_errno));
> > + str_ee_origin(ee), s, strerror_(ee->ee_errno));
> >
> > if (!flow_sidx_valid(sidx)) {
> > /* No hint from the socket, determine flow from addresses */
> > @@ -588,25 +595,43 @@ static int udp_sock_recverr(const struct ctx *c, int s, flow_sidx_t sidx,
> > debug("Ignoring UDP error without flow");
> > return 1;
> > }
> > + } else {
> > + pif = pif_at_sidx(sidx);
>
> Two stray trailing tabs here.
Oops, fixed.
> > }
> >
> > - tosidx = flow_sidx_opposite(sidx);
> > - toside = flowside_at_sidx(tosidx);
> > + toside = flowside_at_sidx(flow_sidx_opposite(sidx));
> > + topif = pif_at_sidx(flow_sidx_opposite(sidx));
> > dlen = rc;
> >
> > - if (pif_is_socket(pif_at_sidx(tosidx))) {
> > - /* XXX Is there any way to propagate ICMPs from socket to
> > - * socket? */
> > - } else if (hdr->cmsg_level == IPPROTO_IP) {
> > + if (inany_from_sockaddr(&offender, &offender_port,
> > + SO_EE_OFFENDER(ee)) < 0)
> > + goto fail;
> > +
> > + if (pif != PIF_HOST || topif != PIF_TAP)
> > + /* XXX Can we support any other cases? */
> > + goto fail;
> > +
> > + if (hdr->cmsg_level == IPPROTO_IP &&
> > + (o4 = inany_v4(&offender)) && inany_v4(&toside->eaddr)) {
> > dlen = MIN(dlen, ICMP4_MAX_DLEN);
> > - udp_send_tap_icmp4(c, &eh->ee, toside,
> > - eh->saddr.sa4.sin_addr, data, dlen);
> > - } else if (hdr->cmsg_level == IPPROTO_IPV6) {
> > - udp_send_tap_icmp6(c, &eh->ee, toside,
> > - &eh->saddr.sa6.sin6_addr, data,
> > - dlen, sidx.flowi);
> > + udp_send_tap_icmp4(c, ee, toside, *o4, data, dlen);
> > + return 1;
> > + }
> > +
> > + if (hdr->cmsg_level == IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) {
> > + udp_send_tap_icmp6(c, ee, toside, &offender.a6, data, dlen,
> > + sidx.flowi);
> > + return 1;
> > }
> >
> > +fail:
> > + flow_dbg(flow_at_sidx(sidx),
>
> Coverity Scan seems to hallucinate here and says that flow_at_sidx()
> could return NULL, with its return value later dereferenced by
> flow_log(), even if you're explicitly checking flow_sidx_valid() in all
> the paths reaching to this point.
>
> Calling this conditionally only if flow_sidx_valid() doesn't mask the
> false positive either (I guess that's the part that goes wrong
> somehow), we really need to check if (flow_at_sidx(sidx)) flow_dbg(...).
>
> Would it be possible to add the useless check just for my own sanity?
Sure. I was already borderline on whether it was clearer to introduce
an explicit uflow variable, so I've done that now, and asserted it's
non-NULL. I've checked that removes the coverity whinge, at least
running locally.
> > + "Can't propagate %s error from %s %s to %s %s",
> > + str_ee_origin(ee),
> > + pif_name(pif),
> > + sockaddr_ntop(SO_EE_OFFENDER(ee), sastr, sizeof(sastr)),
> > + pif_name(topif),
> > + inany_ntop(&toside->eaddr, astr, sizeof(astr)));
> > return 1;
> > }
>
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-04-17 1:56 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-16 9:07 [PATCH 0/4] Translate source addresses for ICMP errors David Gibson
2025-04-16 9:07 ` [PATCH 1/4] fwd: Split out helpers for port-independent NAT David Gibson
2025-04-16 9:07 ` [PATCH 2/4] treewide: Improve robustness against sockaddrs of unexpected family David Gibson
2025-04-16 9:41 ` Stefano Brivio
2025-04-17 1:14 ` David Gibson
2025-04-16 9:07 ` [PATCH 3/4] udp: Rework offender address handling in udp_sock_recverr() David Gibson
2025-04-16 14:27 ` Stefano Brivio
2025-04-17 1:33 ` David Gibson
2025-04-16 9:07 ` [PATCH 4/4] udp: Translate offender addresses for ICMP messages David Gibson
2025-04-16 14:27 ` [PATCH 0/4] Translate source addresses for ICMP errors Stefano Brivio
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).