From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTP id 121EA5A004F for ; Tue, 16 Jul 2024 09:26:35 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1721114794; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RPMZq15W5CBwRvfaQhWPy6aeQjDajnnTYR2LPsCEZ0E=; b=QFCOPCfLG3Aolg07X9m+ZkjRN5V0z4ux1E6k8tH7RfCtdCNF5fa55BV+C+t6VHLu9yXbcn YsE6hHOE2m/hWIGXWcR3YwZ3lj+kXdvtH4wNDZ8eW+x4MTFKFjwiP8yemjzuZVth2msESE fDa6GYavo9+gsO8HyHeoh4Do1LfMUSc= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-505-S1kgGn63OLudEkXGTSwX6Q-1; Tue, 16 Jul 2024 03:26:33 -0400 X-MC-Unique: S1kgGn63OLudEkXGTSwX6Q-1 Received: by mail-qv1-f69.google.com with SMTP id 6a1803df08f44-6b75e880a12so52995966d6.3 for ; Tue, 16 Jul 2024 00:26:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721114792; x=1721719592; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=RPMZq15W5CBwRvfaQhWPy6aeQjDajnnTYR2LPsCEZ0E=; b=s7fuwtEmBcoOND82fwifC+jRqUExFrrJo+/Yxk2eb1c075dpwahv9Ngr7zAblgRtB1 VKmG6Y5rCKHO0lxE9xMI3nZu1b00MmcEM7T5b9xSFlWlAgPmNHqwWNneh44YcoKfaG06 lQKyNdVSf+RNn/nE0sUEE4DuDj0nm0ssIvqCq7FQPkUXWV10BzHoNj0RPJBbg1NXBMme HzyrgGZmdVfCBZ8Cqkw7uo3/oek4lmLmzYo8qAuNAILxZusHY44Xkb4K1ym6qL6Wj406 UpdJXtkCVnZaXFTxjQYdkUhKeKtmJ781HdVPTsUMHFKdwu4UIOq0d7WPrxuUfkBFWCGq xJ/w== X-Gm-Message-State: AOJu0YygQoMR59+ZJKz4uaYGsCQMudkuQnLf+a/oKZ/OKUZK4Ank7ZEB EaOrTyEMhDJ1rzIP+hbNmU0Cwb7A3pOPtnljmw7L8/c/peKAw7K+bcDiiZuM+Li/BQu1UuthMf1 DDDbZ2dqYq+jFLlM3SnKdK9Y1uuhiT/l+F2cMLH6NxSwXA97oBd/3dYAOakDO X-Received: by 2002:a05:6214:27e4:b0:6b5:a8b9:e8ab with SMTP id 6a1803df08f44-6b77f5e177amr14951466d6.49.1721114792263; Tue, 16 Jul 2024 00:26:32 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE67w2bVHekvC8GZaxnx/fprDp5U/aeamZg6dS4zTnEjfq5c+jEqBzgys7FFlu1iqNekQCIVQ== X-Received: by 2002:a05:6214:27e4:b0:6b5:a8b9:e8ab with SMTP id 6a1803df08f44-6b77f5e177amr14951346d6.49.1721114791788; Tue, 16 Jul 2024 00:26:31 -0700 (PDT) Received: from maya.cloud.tilaa.com (maya.cloud.tilaa.com. [164.138.29.33]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6b76199d5a2sm28216956d6.60.2024.07.16.00.26.30 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 16 Jul 2024 00:26:31 -0700 (PDT) Date: Tue, 16 Jul 2024 09:25:56 +0200 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH 5/5] udp: Handle errors on UDP sockets Message-ID: <20240716092556.432cf763@elisabeth> In-Reply-To: <20240716052936.1204164-6-david@gibson.dropbear.id.au> References: <20240716052936.1204164-1-david@gibson.dropbear.id.au> <20240716052936.1204164-6-david@gibson.dropbear.id.au> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.41; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: JMZVTCV2YUU3H5L7KER6WKYBCNWQEVUN X-Message-ID-Hash: JMZVTCV2YUU3H5L7KER6WKYBCNWQEVUN X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: Nits only (the rest of the series looks good to me): On Tue, 16 Jul 2024 15:29:36 +1000 David Gibson wrote: > Currently we ignore all events other than EPOLLIN on UDP sockets. This > means that if we ever receive an EPOLLERR event, we'll enter an infinite > loop on epoll, because we'll never do anything to clear the error. > > Luckily that doesn't seem to have happened in practice, but it's certainly > fragile. Furthermore changes in how we handle UDP sockets with the flow > table mean we will start receiving error events. > > Add handling of EPOLLERR events. For now we just read the error from the > error queue (thereby clearing the error state) and print a debug message. > We can add more substantial handling of specific events in future if we > want to. > > Signed-off-by: David Gibson > --- > udp.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > util.c | 29 +++++++++++++++++++++++++++++ > util.h | 3 +++ > 3 files changed, 91 insertions(+) > > diff --git a/udp.c b/udp.c > index fbf3ce73..d761717a 100644 > --- a/udp.c > +++ b/udp.c > @@ -110,6 +110,7 @@ > #include > #include > #include > +#include > > #include "checksum.h" > #include "util.h" > @@ -728,6 +729,59 @@ static void udp_tap_prepare(const struct ctx *c, const struct mmsghdr *mmh, > (*tap_iov)[UDP_IOV_PAYLOAD].iov_len = l4len; > } > > +/** > + * udp_sock_recverr() - Receive and clear an error from a socket > + * @s: Socket to receive from > + * > + * Return: true if an error received and processed, false if no more errors It took me a while to realise there's an implied "was" between "error" and "received". Maybe: * Return: true: errors received and processed, false: no more errors ? > + * > + * #syscalls recvmsg > + */ > +static bool udp_sock_recverr(int s) > +{ > + const struct sock_extended_err *ee; > + const struct cmsghdr *hdr; > + char buf[CMSG_SPACE(sizeof(*ee))]; > + struct msghdr mh = { > + .msg_name = NULL, > + .msg_namelen = 0, > + .msg_iov = NULL, > + .msg_iovlen = 0, > + .msg_control = buf, > + .msg_controllen = sizeof(buf), > + }; > + ssize_t rc; > + > + rc = recvmsg(s, &mh, MSG_ERRQUEUE); > + if (rc < 0) { > + if (errno != EAGAIN) There are some existing cases where we forgot about this, but EAGAIN and EWOULDBLOCK are not guaranteed to be the same value, and as recvmsg(2) says: POSIX.1 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities. > + err_perror("Failed to read error queue"); > + return false; > + } > + > + if (!(mh.msg_flags & MSG_ERRQUEUE)) { > + err("Missing MSG_ERRQUEUE flag reading error queue"); > + return false; > + } > + > + hdr = CMSG_FIRSTHDR(&mh); > + if (!((hdr->cmsg_level == IPPROTO_IP && > + hdr->cmsg_type == IP_RECVERR) || > + (hdr->cmsg_level == IPPROTO_IPV6 && > + hdr->cmsg_type == IPV6_RECVERR))) { > + err("Unexpected cmsg reading error queue"); > + return false; > + } > + > + ee = (const struct sock_extended_err *)CMSG_DATA(hdr); > + > + /* TODO: When possible propagate and otherwise handle errors */ > + debug("%s error on UDP socket %i: %s", > + str_ee_origin(ee), s, strerror(ee->ee_errno)); > + > + return true; > +} > + > /** > * udp_sock_recv() - Receive datagrams from a socket > * @c: Execution context > @@ -751,6 +805,11 @@ static int udp_sock_recv(const struct ctx *c, int s, uint32_t events, > > ASSERT(!c->no_udp); > > + /* Clear any errors first */ > + if (events & EPOLLERR) For consistency: curly brackets around multi-line block (or simply change that to "while (udp_sock_recverr(s));"? No preference from me). > + while (udp_sock_recverr(s)) > + ; > + > if (!(events & EPOLLIN)) > return 0; > > diff --git a/util.c b/util.c > index 428847d2..70de1e16 100644 > --- a/util.c > +++ b/util.c > @@ -25,6 +25,7 @@ > #include > #include > #include > +#include > > #include "util.h" > #include "iov.h" > @@ -97,6 +98,14 @@ static int sock_l4_sa(const struct ctx *c, enum epoll_type type, > if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &y, sizeof(y))) > debug("Failed to set SO_REUSEADDR on socket %i", fd); > > + if (proto == IPPROTO_UDP) { > + int level = af == AF_INET ? IPPROTO_IP : IPPROTO_IPV6; > + int opt = af == AF_INET ? IP_RECVERR : IPV6_RECVERR; > + > + if (setsockopt(fd, level, opt, &y, sizeof(y))) > + die("Failed to set RECVERR on socket %i", fd); There's also a die_perror() which would be practical here. > + } > + > if (ifname && *ifname) { > /* Supported since kernel version 5.7, commit c427bfec18f2 > * ("net: core: enable SO_BINDTODEVICE for non-root users"). If > @@ -654,3 +663,23 @@ const char *sockaddr_ntop(const void *sa, char *dst, socklen_t size) > > return dst; > } > + > +/** str_ee_origin() - Convert socket extended error origin to a string > + * @ee: Socket extended error structure > + * > + * Return: Static string describing error origin > + */ > +const char *str_ee_origin(const struct sock_extended_err *ee) > +{ > + const char *const desc[] = { > + [SO_EE_ORIGIN_NONE] = "", > + [SO_EE_ORIGIN_LOCAL] = "Local", > + [SO_EE_ORIGIN_ICMP] = "ICMP", > + [SO_EE_ORIGIN_ICMP6] = "ICMPv6", > + }; > + > + if (ee->ee_origin < ARRAY_SIZE(desc)) > + return desc[ee->ee_origin]; > + > + return ""; > +} > diff --git a/util.h b/util.h > index d0150396..1d479ddf 100644 > --- a/util.h > +++ b/util.h > @@ -194,7 +194,10 @@ static inline const char *af_name(sa_family_t af) > > #define SOCKADDR_STRLEN MAX(SOCKADDR_INET_STRLEN, SOCKADDR_INET6_STRLEN) > > +struct sock_extended_err; > + > const char *sockaddr_ntop(const void *sa, char *dst, socklen_t size); > +const char *str_ee_origin(const struct sock_extended_err *ee); > > /** > * mod_sub() - Modular arithmetic subtraction -- Stefano