public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH] udp: Actually discard datagrams we can't forward
@ 2025-05-02 21:06 Stefano Brivio
  2025-05-03  2:44 ` David Gibson
  0 siblings, 1 reply; 2+ messages in thread
From: Stefano Brivio @ 2025-05-02 21:06 UTC (permalink / raw)
  To: passt-dev

Given that udp_sock_fwd() now loops on udp_peek_addr() to get endpoint
addresses for datagrams, if we can't forward one of these datagrams,
we need to make sure we actually discard it. Otherwise, with MSG_PEEK,
we won't dequeue and loop on it forever.

For example, if we fail to create a socket for a new flow, because,
say, the destination of an inbound packet is multicast, and we can't
bind() to a multicast address, the loop will look like this:

18.0563: Flow 0 (NEW): FREE -> NEW
18.0563: Flow 0 (INI): NEW -> INI
18.0563: Flow 0 (INI): HOST [127.0.0.1]:42487 -> [127.0.0.1]:9997 => ?
18.0563: Flow 0 (TGT): INI -> TGT
18.0563: Flow 0 (TGT): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
18.0563: Flow 0 (UDP flow): TGT -> TYPED
18.0564: Flow 0 (UDP flow): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
18.0564: Flow 0 (UDP flow): Couldn't open flow specific socket: Invalid argument
18.0564: Flow 0 (FREE): TYPED -> FREE
18.0564: Flow 0 (FREE): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
18.0564: Discarding datagram without flow
18.0564: Flow 0 (NEW): FREE -> NEW
18.0564: Flow 0 (INI): NEW -> INI
18.0564: Flow 0 (INI): HOST [127.0.0.1]:42487 -> [127.0.0.1]:9997 => ?
18.0564: Flow 0 (TGT): INI -> TGT
18.0564: Flow 0 (TGT): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
18.0564: Flow 0 (UDP flow): TGT -> TYPED
18.0564: Flow 0 (UDP flow): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
18.0564: Flow 0 (UDP flow): Couldn't open flow specific socket: Invalid argument
18.0564: Flow 0 (FREE): TYPED -> FREE
18.0564: Flow 0 (FREE): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
18.0564: Discarding datagram without flow

and seen from strace:

epoll_wait(3, [{events=EPOLLIN, data=0x1076c00000705}], 8, 1000) = 1
recvmsg(7, {msg_name={sa_family=AF_INET6, sin6_port=htons(55899), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fe80::26e8:53ff:fef3:13b6", &sin6_addr), sin6_scope_id=if_nametoindex("wlp4s0")}, msg_namelen=28, msg_iov=NULL, msg_iovlen=0, msg_control=[{cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=0x32, cmsg_data="\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x03\x00\x00\x00"}], msg_controllen=40, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_DONTWAIT) = 0
socket(AF_INET6, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_UDP) = 12
setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
setsockopt(12, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
setsockopt(12, SOL_IPV6, IPV6_RECVERR, [1], 4) = 0
setsockopt(12, SOL_IPV6, IPV6_RECVPKTINFO, [1], 4) = 0
bind(12, {sa_family=AF_INET6, sin6_port=htons(1900), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "ff02::c", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument)
close(12)                               = 0
recvmsg(7, {msg_name={sa_family=AF_INET6, sin6_port=htons(55899), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fe80::26e8:53ff:fef3:13b6", &sin6_addr), sin6_scope_id=if_nametoindex("wlp4s0")}, msg_namelen=28, msg_iov=NULL, msg_iovlen=0, msg_control=[{cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=0x32, cmsg_data="\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x03\x00\x00\x00"}], msg_controllen=40, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_DONTWAIT) = 0
socket(AF_INET6, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_UDP) = 12
setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
setsockopt(12, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
setsockopt(12, SOL_IPV6, IPV6_RECVERR, [1], 4) = 0
setsockopt(12, SOL_IPV6, IPV6_RECVPKTINFO, [1], 4) = 0
bind(12, {sa_family=AF_INET6, sin6_port=htons(1900), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "ff02::c", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument)
close(12)                               = 0

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 udp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/udp.c b/udp.c
index f5a5cd1..ca28b37 100644
--- a/udp.c
+++ b/udp.c
@@ -828,6 +828,7 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif,
 	int rc;
 
 	while ((rc = udp_peek_addr(s, &src, &dst)) != 0) {
+		bool discard = false;
 		flow_sidx_t tosidx;
 		uint8_t topif;
 
@@ -861,8 +862,17 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif,
 			flow_err(uflow,
 				 "No support for forwarding UDP from %s to %s",
 				 pif_name(frompif), pif_name(topif));
+			discard = true;
 		} else {
 			debug("Discarding datagram without flow");
+			discard = true;
+		}
+
+		if (discard) {
+			struct msghdr msg = { 0 };
+
+			if (recvmsg(s, &msg, MSG_DONTWAIT) < 0)
+				debug_perror("Failed to discard datagram");
 		}
 	}
 }
-- 
@@ -828,6 +828,7 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif,
 	int rc;
 
 	while ((rc = udp_peek_addr(s, &src, &dst)) != 0) {
+		bool discard = false;
 		flow_sidx_t tosidx;
 		uint8_t topif;
 
@@ -861,8 +862,17 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif,
 			flow_err(uflow,
 				 "No support for forwarding UDP from %s to %s",
 				 pif_name(frompif), pif_name(topif));
+			discard = true;
 		} else {
 			debug("Discarding datagram without flow");
+			discard = true;
+		}
+
+		if (discard) {
+			struct msghdr msg = { 0 };
+
+			if (recvmsg(s, &msg, MSG_DONTWAIT) < 0)
+				debug_perror("Failed to discard datagram");
 		}
 	}
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] udp: Actually discard datagrams we can't forward
  2025-05-02 21:06 [PATCH] udp: Actually discard datagrams we can't forward Stefano Brivio
@ 2025-05-03  2:44 ` David Gibson
  0 siblings, 0 replies; 2+ messages in thread
From: David Gibson @ 2025-05-03  2:44 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 5402 bytes --]

On Fri, May 02, 2025 at 11:06:40PM +0200, Stefano Brivio wrote:
> Given that udp_sock_fwd() now loops on udp_peek_addr() to get endpoint
> addresses for datagrams, if we can't forward one of these datagrams,
> we need to make sure we actually discard it. Otherwise, with MSG_PEEK,
> we won't dequeue and loop on it forever.
> 
> For example, if we fail to create a socket for a new flow, because,
> say, the destination of an inbound packet is multicast, and we can't
> bind() to a multicast address, the loop will look like this:
> 
> 18.0563: Flow 0 (NEW): FREE -> NEW
> 18.0563: Flow 0 (INI): NEW -> INI
> 18.0563: Flow 0 (INI): HOST [127.0.0.1]:42487 -> [127.0.0.1]:9997 => ?
> 18.0563: Flow 0 (TGT): INI -> TGT
> 18.0563: Flow 0 (TGT): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
> 18.0563: Flow 0 (UDP flow): TGT -> TYPED
> 18.0564: Flow 0 (UDP flow): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
> 18.0564: Flow 0 (UDP flow): Couldn't open flow specific socket: Invalid argument
> 18.0564: Flow 0 (FREE): TYPED -> FREE
> 18.0564: Flow 0 (FREE): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
> 18.0564: Discarding datagram without flow
> 18.0564: Flow 0 (NEW): FREE -> NEW
> 18.0564: Flow 0 (INI): NEW -> INI
> 18.0564: Flow 0 (INI): HOST [127.0.0.1]:42487 -> [127.0.0.1]:9997 => ?
> 18.0564: Flow 0 (TGT): INI -> TGT
> 18.0564: Flow 0 (TGT): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
> 18.0564: Flow 0 (UDP flow): TGT -> TYPED
> 18.0564: Flow 0 (UDP flow): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
> 18.0564: Flow 0 (UDP flow): Couldn't open flow specific socket: Invalid argument
> 18.0564: Flow 0 (FREE): TYPED -> FREE
> 18.0564: Flow 0 (FREE): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997
> 18.0564: Discarding datagram without flow
> 
> and seen from strace:
> 
> epoll_wait(3, [{events=EPOLLIN, data=0x1076c00000705}], 8, 1000) = 1
> recvmsg(7, {msg_name={sa_family=AF_INET6, sin6_port=htons(55899), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fe80::26e8:53ff:fef3:13b6", &sin6_addr), sin6_scope_id=if_nametoindex("wlp4s0")}, msg_namelen=28, msg_iov=NULL, msg_iovlen=0, msg_control=[{cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=0x32, cmsg_data="\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x03\x00\x00\x00"}], msg_controllen=40, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_DONTWAIT) = 0
> socket(AF_INET6, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_UDP) = 12
> setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
> setsockopt(12, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> setsockopt(12, SOL_IPV6, IPV6_RECVERR, [1], 4) = 0
> setsockopt(12, SOL_IPV6, IPV6_RECVPKTINFO, [1], 4) = 0
> bind(12, {sa_family=AF_INET6, sin6_port=htons(1900), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "ff02::c", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument)
> close(12)                               = 0
> recvmsg(7, {msg_name={sa_family=AF_INET6, sin6_port=htons(55899), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fe80::26e8:53ff:fef3:13b6", &sin6_addr), sin6_scope_id=if_nametoindex("wlp4s0")}, msg_namelen=28, msg_iov=NULL, msg_iovlen=0, msg_control=[{cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=0x32, cmsg_data="\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x03\x00\x00\x00"}], msg_controllen=40, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_DONTWAIT) = 0
> socket(AF_INET6, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_UDP) = 12
> setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
> setsockopt(12, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> setsockopt(12, SOL_IPV6, IPV6_RECVERR, [1], 4) = 0
> setsockopt(12, SOL_IPV6, IPV6_RECVPKTINFO, [1], 4) = 0
> bind(12, {sa_family=AF_INET6, sin6_port=htons(1900), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "ff02::c", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument)
> close(12)                               = 0

Oof, yes, that's a somewhat embarrassing oversight.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  udp.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/udp.c b/udp.c
> index f5a5cd1..ca28b37 100644
> --- a/udp.c
> +++ b/udp.c
> @@ -828,6 +828,7 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif,
>  	int rc;
>  
>  	while ((rc = udp_peek_addr(s, &src, &dst)) != 0) {
> +		bool discard = false;
>  		flow_sidx_t tosidx;
>  		uint8_t topif;
>  
> @@ -861,8 +862,17 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif,
>  			flow_err(uflow,
>  				 "No support for forwarding UDP from %s to %s",
>  				 pif_name(frompif), pif_name(topif));
> +			discard = true;
>  		} else {
>  			debug("Discarding datagram without flow");
> +			discard = true;
> +		}
> +
> +		if (discard) {
> +			struct msghdr msg = { 0 };
> +
> +			if (recvmsg(s, &msg, MSG_DONTWAIT) < 0)
> +				debug_perror("Failed to discard datagram");
>  		}
>  	}
>  }

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2025-05-03  4:12 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-05-02 21:06 [PATCH] udp: Actually discard datagrams we can't forward Stefano Brivio
2025-05-03  2:44 ` David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).