On Fri, May 02, 2025 at 11:06:40PM +0200, Stefano Brivio wrote: > Given that udp_sock_fwd() now loops on udp_peek_addr() to get endpoint > addresses for datagrams, if we can't forward one of these datagrams, > we need to make sure we actually discard it. Otherwise, with MSG_PEEK, > we won't dequeue and loop on it forever. > > For example, if we fail to create a socket for a new flow, because, > say, the destination of an inbound packet is multicast, and we can't > bind() to a multicast address, the loop will look like this: > > 18.0563: Flow 0 (NEW): FREE -> NEW > 18.0563: Flow 0 (INI): NEW -> INI > 18.0563: Flow 0 (INI): HOST [127.0.0.1]:42487 -> [127.0.0.1]:9997 => ? > 18.0563: Flow 0 (TGT): INI -> TGT > 18.0563: Flow 0 (TGT): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997 > 18.0563: Flow 0 (UDP flow): TGT -> TYPED > 18.0564: Flow 0 (UDP flow): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997 > 18.0564: Flow 0 (UDP flow): Couldn't open flow specific socket: Invalid argument > 18.0564: Flow 0 (FREE): TYPED -> FREE > 18.0564: Flow 0 (FREE): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997 > 18.0564: Discarding datagram without flow > 18.0564: Flow 0 (NEW): FREE -> NEW > 18.0564: Flow 0 (INI): NEW -> INI > 18.0564: Flow 0 (INI): HOST [127.0.0.1]:42487 -> [127.0.0.1]:9997 => ? > 18.0564: Flow 0 (TGT): INI -> TGT > 18.0564: Flow 0 (TGT): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997 > 18.0564: Flow 0 (UDP flow): TGT -> TYPED > 18.0564: Flow 0 (UDP flow): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997 > 18.0564: Flow 0 (UDP flow): Couldn't open flow specific socket: Invalid argument > 18.0564: Flow 0 (FREE): TYPED -> FREE > 18.0564: Flow 0 (FREE): HOST [127.0.0.1]:42487 -> [ff02::c]:9997 => SPLICE [0.0.0.0]:42487 -> [88.198.0.164]:9997 > 18.0564: Discarding datagram without flow > > and seen from strace: > > epoll_wait(3, [{events=EPOLLIN, data=0x1076c00000705}], 8, 1000) = 1 > recvmsg(7, {msg_name={sa_family=AF_INET6, sin6_port=htons(55899), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fe80::26e8:53ff:fef3:13b6", &sin6_addr), sin6_scope_id=if_nametoindex("wlp4s0")}, msg_namelen=28, msg_iov=NULL, msg_iovlen=0, msg_control=[{cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=0x32, cmsg_data="\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x03\x00\x00\x00"}], msg_controllen=40, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_DONTWAIT) = 0 > socket(AF_INET6, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_UDP) = 12 > setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0 > setsockopt(12, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 > setsockopt(12, SOL_IPV6, IPV6_RECVERR, [1], 4) = 0 > setsockopt(12, SOL_IPV6, IPV6_RECVPKTINFO, [1], 4) = 0 > bind(12, {sa_family=AF_INET6, sin6_port=htons(1900), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "ff02::c", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument) > close(12) = 0 > recvmsg(7, {msg_name={sa_family=AF_INET6, sin6_port=htons(55899), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fe80::26e8:53ff:fef3:13b6", &sin6_addr), sin6_scope_id=if_nametoindex("wlp4s0")}, msg_namelen=28, msg_iov=NULL, msg_iovlen=0, msg_control=[{cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=0x32, cmsg_data="\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x03\x00\x00\x00"}], msg_controllen=40, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_DONTWAIT) = 0 > socket(AF_INET6, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_UDP) = 12 > setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0 > setsockopt(12, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 > setsockopt(12, SOL_IPV6, IPV6_RECVERR, [1], 4) = 0 > setsockopt(12, SOL_IPV6, IPV6_RECVPKTINFO, [1], 4) = 0 > bind(12, {sa_family=AF_INET6, sin6_port=htons(1900), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "ff02::c", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINVAL (Invalid argument) > close(12) = 0 Oof, yes, that's a somewhat embarrassing oversight. Reviewed-by: David Gibson > Signed-off-by: Stefano Brivio > --- > udp.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/udp.c b/udp.c > index f5a5cd1..ca28b37 100644 > --- a/udp.c > +++ b/udp.c > @@ -828,6 +828,7 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif, > int rc; > > while ((rc = udp_peek_addr(s, &src, &dst)) != 0) { > + bool discard = false; > flow_sidx_t tosidx; > uint8_t topif; > > @@ -861,8 +862,17 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif, > flow_err(uflow, > "No support for forwarding UDP from %s to %s", > pif_name(frompif), pif_name(topif)); > + discard = true; > } else { > debug("Discarding datagram without flow"); > + discard = true; > + } > + > + if (discard) { > + struct msghdr msg = { 0 }; > + > + if (recvmsg(s, &msg, MSG_DONTWAIT) < 0) > + debug_perror("Failed to discard datagram"); > } > } > } -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson