From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 68A9C5A004F for ; Wed, 17 Jul 2024 02:20:49 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202312; t=1721175639; bh=2GUYmT2SNNC6ju9r5djOStnskkh+rMoITgHFgPT91SI=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=N1gagIYUl77BpAf3JEfFh3xEzfg3Z6t8id/jEDGdLeA2wQhUZGHmXS5h5WFm0RY5i hRxZgs0CA2hlCFZR1rl/pgfOh2iBeJyqU4HiDPhS0I7n/Y8BZMivX3Hhp43kFix6ss f6roSN5UD2P1VhEdfsAxhaBKYykyAGihnLU3OAOlCS42PdS/wpUjj2mHb8LGnyfU8a fiWiYG3XnXzq697eUW9lqD67PsLMy1yT3jekUcxNP1EKpx7fv9ZU6449SRXf2tAf41 WbWoHEX0A3NuhfG83mISyMRg/N2FrJQ1guRVX6URIp9jAPu3j0sbUHqa0oTBlBqKMp NeQFvQB9J9d5g== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4WNxRW6SV6z4wym; Wed, 17 Jul 2024 10:20:39 +1000 (AEST) Date: Wed, 17 Jul 2024 10:04:32 +1000 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH 5/5] udp: Handle errors on UDP sockets Message-ID: References: <20240716052936.1204164-1-david@gibson.dropbear.id.au> <20240716052936.1204164-6-david@gibson.dropbear.id.au> <20240716092556.432cf763@elisabeth> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="MAuEACu1lE80PuTG" Content-Disposition: inline In-Reply-To: <20240716092556.432cf763@elisabeth> Message-ID-Hash: OR2J3PH2XYIIFLLBRLKHVT6DRV5E6KEW X-Message-ID-Hash: OR2J3PH2XYIIFLLBRLKHVT6DRV5E6KEW X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --MAuEACu1lE80PuTG Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jul 16, 2024 at 09:25:56AM +0200, Stefano Brivio wrote: > Nits only (the rest of the series looks good to me): >=20 > On Tue, 16 Jul 2024 15:29:36 +1000 > David Gibson wrote: >=20 > > Currently we ignore all events other than EPOLLIN on UDP sockets. This > > means that if we ever receive an EPOLLERR event, we'll enter an infinite > > loop on epoll, because we'll never do anything to clear the error. > >=20 > > Luckily that doesn't seem to have happened in practice, but it's certai= nly > > fragile. Furthermore changes in how we handle UDP sockets with the flow > > table mean we will start receiving error events. > >=20 > > Add handling of EPOLLERR events. For now we just read the error from t= he > > error queue (thereby clearing the error state) and print a debug messag= e. > > We can add more substantial handling of specific events in future if we > > want to. > >=20 > > Signed-off-by: David Gibson > > --- > > udp.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > util.c | 29 +++++++++++++++++++++++++++++ > > util.h | 3 +++ > > 3 files changed, 91 insertions(+) > >=20 > > diff --git a/udp.c b/udp.c > > index fbf3ce73..d761717a 100644 > > --- a/udp.c > > +++ b/udp.c > > @@ -110,6 +110,7 @@ > > #include > > #include > > #include > > +#include > > =20 > > #include "checksum.h" > > #include "util.h" > > @@ -728,6 +729,59 @@ static void udp_tap_prepare(const struct ctx *c, c= onst struct mmsghdr *mmh, > > (*tap_iov)[UDP_IOV_PAYLOAD].iov_len =3D l4len; > > } > > =20 > > +/** > > + * udp_sock_recverr() - Receive and clear an error from a socket > > + * @s: Socket to receive from > > + * > > + * Return: true if an error received and processed, false if no more e= rrors >=20 > It took me a while to realise there's an implied "was" between "error" > and "received". Maybe: >=20 > * Return: true: errors received and processed, false: no more errors >=20 > ? Adjusted. >=20 > > + * > > + * #syscalls recvmsg > > + */ > > +static bool udp_sock_recverr(int s) > > +{ > > + const struct sock_extended_err *ee; > > + const struct cmsghdr *hdr; > > + char buf[CMSG_SPACE(sizeof(*ee))]; > > + struct msghdr mh =3D { > > + .msg_name =3D NULL, > > + .msg_namelen =3D 0, > > + .msg_iov =3D NULL, > > + .msg_iovlen =3D 0, > > + .msg_control =3D buf, > > + .msg_controllen =3D sizeof(buf), > > + }; > > + ssize_t rc; > > + > > + rc =3D recvmsg(s, &mh, MSG_ERRQUEUE); > > + if (rc < 0) { > > + if (errno !=3D EAGAIN) >=20 > There are some existing cases where we forgot about this, but EAGAIN > and EWOULDBLOCK are not guaranteed to be the same value, and as > recvmsg(2) says: >=20 > POSIX.1 allows either error to be returned for this case, and does > not require these constants to have the same value, so a portable > application should check for both possibilities. Fair point. We're Linux only for now where they certainly do have the same value, but no reason not to be robust against portability in future. > > + err_perror("Failed to read error queue"); > > + return false; > > + } > > + > > + if (!(mh.msg_flags & MSG_ERRQUEUE)) { > > + err("Missing MSG_ERRQUEUE flag reading error queue"); > > + return false; > > + } > > + > > + hdr =3D CMSG_FIRSTHDR(&mh); > > + if (!((hdr->cmsg_level =3D=3D IPPROTO_IP && > > + hdr->cmsg_type =3D=3D IP_RECVERR) || > > + (hdr->cmsg_level =3D=3D IPPROTO_IPV6 && > > + hdr->cmsg_type =3D=3D IPV6_RECVERR))) { > > + err("Unexpected cmsg reading error queue"); > > + return false; > > + } > > + > > + ee =3D (const struct sock_extended_err *)CMSG_DATA(hdr); > > + > > + /* TODO: When possible propagate and otherwise handle errors */ > > + debug("%s error on UDP socket %i: %s", > > + str_ee_origin(ee), s, strerror(ee->ee_errno)); > > + > > + return true; > > +} > > + > > /** > > * udp_sock_recv() - Receive datagrams from a socket > > * @c: Execution context > > @@ -751,6 +805,11 @@ static int udp_sock_recv(const struct ctx *c, int = s, uint32_t events, > > =20 > > ASSERT(!c->no_udp); > > =20 > > + /* Clear any errors first */ > > + if (events & EPOLLERR) >=20 > For consistency: curly brackets around multi-line block (or simply > change that to "while (udp_sock_recverr(s));"? No preference from me). Done. > > + while (udp_sock_recverr(s)) > > + ; > > + > > if (!(events & EPOLLIN)) > > return 0; > > =20 > > diff --git a/util.c b/util.c > > index 428847d2..70de1e16 100644 > > --- a/util.c > > +++ b/util.c > > @@ -25,6 +25,7 @@ > > #include > > #include > > #include > > +#include > > =20 > > #include "util.h" > > #include "iov.h" > > @@ -97,6 +98,14 @@ static int sock_l4_sa(const struct ctx *c, enum epol= l_type type, > > if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &y, sizeof(y))) > > debug("Failed to set SO_REUSEADDR on socket %i", fd); > > =20 > > + if (proto =3D=3D IPPROTO_UDP) { > > + int level =3D af =3D=3D AF_INET ? IPPROTO_IP : IPPROTO_IPV6; > > + int opt =3D af =3D=3D AF_INET ? IP_RECVERR : IPV6_RECVERR; > > + > > + if (setsockopt(fd, level, opt, &y, sizeof(y))) > > + die("Failed to set RECVERR on socket %i", fd); >=20 > There's also a die_perror() which would be practical here. Oops, I think I meant to use that. Done. > > + } > > + > > if (ifname && *ifname) { > > /* Supported since kernel version 5.7, commit c427bfec18f2 > > * ("net: core: enable SO_BINDTODEVICE for non-root users"). If > > @@ -654,3 +663,23 @@ const char *sockaddr_ntop(const void *sa, char *ds= t, socklen_t size) > > =20 > > return dst; > > } > > + > > +/** str_ee_origin() - Convert socket extended error origin to a string > > + * @ee: Socket extended error structure > > + * > > + * Return: Static string describing error origin > > + */ > > +const char *str_ee_origin(const struct sock_extended_err *ee) > > +{ > > + const char *const desc[] =3D { > > + [SO_EE_ORIGIN_NONE] =3D "", > > + [SO_EE_ORIGIN_LOCAL] =3D "Local", > > + [SO_EE_ORIGIN_ICMP] =3D "ICMP", > > + [SO_EE_ORIGIN_ICMP6] =3D "ICMPv6", > > + }; > > + > > + if (ee->ee_origin < ARRAY_SIZE(desc)) > > + return desc[ee->ee_origin]; > > + > > + return ""; > > +} > > diff --git a/util.h b/util.h > > index d0150396..1d479ddf 100644 > > --- a/util.h > > +++ b/util.h > > @@ -194,7 +194,10 @@ static inline const char *af_name(sa_family_t af) > > =20 > > #define SOCKADDR_STRLEN MAX(SOCKADDR_INET_STRLEN, SOCKADDR_INET6_STRL= EN) > > =20 > > +struct sock_extended_err; > > + > > const char *sockaddr_ntop(const void *sa, char *dst, socklen_t size); > > +const char *str_ee_origin(const struct sock_extended_err *ee); > > =20 > > /** > > * mod_sub() - Modular arithmetic subtraction >=20 --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --MAuEACu1lE80PuTG Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmaXCosACgkQzQJF27ox 2Gc6VxAAi6XWWBVOh0Xjq+b91NRpqfHcsHJTk/uk099JfH9d2JSzH/I5uIrJpRSZ UQ82JKLz4YqxgzSykMIHD25dZw5NM9FvFRtMS2jtSvSIgL/hDQSiN06OOwupQXi8 8rdZPl9anrhxeRvgzP7oEPd/tukg/EO8Uq6j7mLSezZV3J5eHCxv1pWA0AdH2eyW o0wBggB6iA1gTtG95zhOEOjWmHzAD1nIL9hCRFbmpOiPzBA2S5Zayq6OQtJbCXwc Tnb98eeAYr2mQVcsUrf6awAu7b34dTMcrCq+eZzHdLCcBLm/T4B4iedToT4Ux7s3 bqG77ibpYlECkngFMfdXBxb76FrsMbEMpOiQCP4HwZ9U6964XHLc1zqPFTcJAAvR j7RSM0fQ7/9U1MLvNOw5W+cBq3g4jmIM+GZCpkd1cyRKQPrnUr+iM3D6g65xbNUg pAR7e1C4AI8H8e4F4aQcVJ9061LTn/pN5UPYzVpXxSIGo54YKF3kp22S9AwZ0wkE 8JefFZa3ZrbCP3KCkt6lQsPwvCLky9sYBXOKRfOYMkDQ8WPfk1g1QvLRvRondDUB tAMuQ+gtkVLMh1h42PmJIQEuqza9ni7RKgsKd3WyImAkkZMhPm6h/WpSPWn/DFE2 NAYyNAT0W6KIlirFVYz3ZU6mg6QQ4o+7iReRDVDq4fkTDjzIGKU= =98J7 -----END PGP SIGNATURE----- --MAuEACu1lE80PuTG--