From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202502 header.b=K2NHpUab; dkim-atps=neutral Received: from mail.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id 0FEC75A0008 for ; Fri, 28 Feb 2025 03:13:49 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202502; t=1740708824; bh=KWlWYXa4iz1g2JeaEJ1FbQaihIecRUxgp1IZqfxMjQk=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=K2NHpUab226fprErhV5b+8IqY8q3R5YW/g4L3Vj2qUVUlbO0VtFHCceqeEoq+uGvU e/38KePkPf2+9Rm//M9UPGnU8wPWncEIjAJNx00nKIE+TtJjRuF0QMluIhLL9I9hrW xpCiy7nBsXfgDIhT/4//NnbS8F/AjGNLJPxa+5kVcEu1V6E0X1CbxcICZ0onWdcPMq O/ZY/X77YL3mHYzLbJ4ZbKFHBH1mMIFAyQHEueCBWWjIRQBOwyNFMMhx/EzhqX/yYV U4fmdTZEV+A4WT2DLrwRSXcofUGx9T0OjxafIaJY5LqaGvV8OmgXlC6BrF5NiYtgLA mWoH1GzWwL/KQ== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4Z3sFh5M1Zz4x5h; Fri, 28 Feb 2025 13:13:44 +1100 (AEDT) Date: Fri, 28 Feb 2025 13:08:33 +1100 From: David Gibson To: Jon Maloy Subject: Re: [PATCH v7 2/4] udp: create and send ICMPv4 to local peer when applicable Message-ID: References: <20250227213518.506955-1-jmaloy@redhat.com> <20250227213518.506955-3-jmaloy@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="AReVVmRipdid4PwS" Content-Disposition: inline In-Reply-To: <20250227213518.506955-3-jmaloy@redhat.com> Message-ID-Hash: LWTP2OFGAPQZY5SLFJHLONU7GZFH5X53 X-Message-ID-Hash: LWTP2OFGAPQZY5SLFJHLONU7GZFH5X53 X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, sbrivio@redhat.com, lvivier@redhat.com, dgibson@redhat.com X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --AReVVmRipdid4PwS Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Feb 27, 2025 at 04:35:16PM -0500, Jon Maloy wrote: > When a local peer sends a UDP message to a non-existing port on an > existing remote host, that host will return an ICMP message containing > the error code ICMP_PORT_UNREACH, plus the header and the first eight > bytes of the original message. If the sender socket has been connected, > it uses this message to issue a "Connection Refused" event to the user. >=20 > Until now, we have only read such events from the externally facing > socket, but we don't forward them back to the local sender because > we cannot read the ICMP message directly to user space. Because of > this, the local peer will hang and wait for a response that never > arrives. >=20 > We now fix this for IPv4 by recreating and forwarding a correct ICMP > message back to the internal sender. We synthesize the message based > on the information in the extended error structure, plus the returned > part of the original message body. >=20 > Note that for the sake of completeness, we even produce ICMP messages > for other error codes. We have noticed that at least ICMP_PROT_UNREACH > is propagated as an error event back to the user. >=20 > Reviewed-by: David Gibson > Signed-off-by: Jon Maloy I know I reviewed this already, but I did spot a couple of extra things with the context of the later patches. [snip] > +/** > + * udp_send_conn_fail_icmp4() - Construct and send ICMPv4 to local peer > + * @c: Execution context > + * @ee: Extended error descriptor > + * @ref: epoll reference > + * @in: First bytes (max 8) of original UDP message body > + * @dlen: Length of the read part of original UDP message body > + */ > +static void udp_send_conn_fail_icmp4(const struct ctx *c, > + const struct sock_extended_err *ee, > + const struct flowside *toside, > + void *in, size_t dlen) > +{ > + struct in_addr oaddr =3D toside->oaddr.v4mapped.a4; > + struct in_addr eaddr =3D toside->eaddr.v4mapped.a4; > + in_port_t eport =3D toside->eport; > + in_port_t oport =3D toside->oport; > + struct { > + struct icmphdr icmp4h; > + struct iphdr ip4h; > + struct udphdr uh; > + char data[ICMP4_MAX_DLEN]; > + } __attribute__((packed, aligned(__alignof__(max_align_t)))) msg; > + size_t msglen =3D sizeof(msg) - sizeof(msg.data) + dlen; > + Having an ASSERT(dlen < ICMP_MAX_DLEN) would make me (and maybe some static checkers) a bit less nervous :). > + memset(&msg, 0, sizeof(msg)); > + msg.icmp4h.type =3D ee->ee_type; > + msg.icmp4h.code =3D ee->ee_code; Do you need any extra info here in the case of an ICMP Would Fragment message, as you do for the sort of equivalent ICMPv6 Packet Too Big? Not that I consider supporting MTU discovery essential in this round of patches, but it would be a little odd to implement it for IPv6 but not IPv4. [snip] > - /* TODO: When possible propagate and otherwise handle errors */ > + udp_send_conn_fail_icmp4(c, ee, toside, data, rc); > + } else { > + trace("Ignoring received cmsg on listener socket"); Nit: I think you can be more specific that this is an IP_RECVERR cmsg. > + } > debug("%s error on UDP socket %i: %s", > str_ee_origin(ee), s, strerror_(ee->ee_errno)); > =20 > @@ -461,15 +515,16 @@ static int udp_sock_recverr(int s) > /** > * udp_sock_errs() - Process errors on a socket > * @c: Execution context > - * @s: Socket to receive from > + * @ref: epoll reference > * @events: epoll events bitmap > * > * Return: Number of errors handled, or < 0 if we have an unrecoverable = error > */ > -int udp_sock_errs(const struct ctx *c, int s, uint32_t events) > +int udp_sock_errs(const struct ctx *c, union epoll_ref ref, uint32_t eve= nts) > { > unsigned n_err =3D 0; > socklen_t errlen; > + int s =3D ref.fd; > int rc, err; > =20 > ASSERT(!c->no_udp); > @@ -478,7 +533,7 @@ int udp_sock_errs(const struct ctx *c, int s, uint32_= t events) > return 0; /* Nothing to do */ > =20 > /* Empty the error queue */ > - while ((rc =3D udp_sock_recverr(s)) > 0) > + while ((rc =3D udp_sock_recverr(c, ref)) > 0) > n_err +=3D rc; > =20 > if (rc < 0) > @@ -510,7 +565,7 @@ int udp_sock_errs(const struct ctx *c, int s, uint32_= t events) > * @c: Execution context > * @s: Socket to receive from > * @events: epoll events bitmap > - * @mmh mmsghdr array to receive into > + * @mmh mmsghdr array to receive into This appears to be a spurious whitespace change. > * > * Return: Number of datagrams received > * > @@ -558,7 +613,7 @@ static void udp_buf_listen_sock_handler(const struct = ctx *c, > const socklen_t sasize =3D sizeof(udp_meta[0].s_in); > int n, i; > =20 > - if (udp_sock_errs(c, ref.fd, events) < 0) { > + if (udp_sock_errs(c, ref, events) < 0) { > err("UDP: Unrecoverable error on listening socket:" > " (%s port %hu)", pif_name(ref.udp.pif), ref.udp.port); > /* FIXME: what now? close/re-open socket? */ > @@ -661,7 +716,7 @@ static void udp_buf_reply_sock_handler(const struct c= tx *c, union epoll_ref ref, > =20 > from_s =3D uflow->s[ref.flowside.sidei]; > =20 > - if (udp_sock_errs(c, from_s, events) < 0) { > + if (udp_sock_errs(c, ref, events) < 0) { > flow_err(uflow, "Unrecoverable error on reply socket"); > flow_err_details(uflow); > udp_flow_close(c, uflow); > diff --git a/udp_internal.h b/udp_internal.h > index cc80e30..3b081f5 100644 > --- a/udp_internal.h > +++ b/udp_internal.h > @@ -30,5 +30,5 @@ size_t udp_update_hdr4(struct iphdr *ip4h, struct udp_p= ayload_t *bp, > size_t udp_update_hdr6(struct ipv6hdr *ip6h, struct udp_payload_t *bp, > const struct flowside *toside, size_t dlen, > bool no_udp_csum); > -int udp_sock_errs(const struct ctx *c, int s, uint32_t events); > +int udp_sock_errs(const struct ctx *c, union epoll_ref ref, uint32_t eve= nts); > #endif /* UDP_INTERNAL_H */ > diff --git a/udp_vu.c b/udp_vu.c > index 4123510..c26a223 100644 > --- a/udp_vu.c > +++ b/udp_vu.c > @@ -227,7 +227,7 @@ void udp_vu_listen_sock_handler(const struct ctx *c, = union epoll_ref ref, > struct vu_virtq *vq =3D &vdev->vq[VHOST_USER_RX_QUEUE]; > int i; > =20 > - if (udp_sock_errs(c, ref.fd, events) < 0) { > + if (udp_sock_errs(c, ref, events) < 0) { > err("UDP: Unrecoverable error on listening socket:" > " (%s port %hu)", pif_name(ref.udp.pif), ref.udp.port); > return; > @@ -302,7 +302,7 @@ void udp_vu_reply_sock_handler(const struct ctx *c, u= nion epoll_ref ref, > =20 > ASSERT(!c->no_udp); > =20 > - if (udp_sock_errs(c, from_s, events) < 0) { > + if (udp_sock_errs(c, ref, events) < 0) { > flow_err(uflow, "Unrecoverable error on reply socket"); > flow_err_details(uflow); > udp_flow_close(c, uflow); --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --AReVVmRipdid4PwS Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmfBGpwACgkQzQJF27ox 2GcSQRAAibVHoBY1+mkY9C8Gv0Vdr6s83JgPKCUAz/xS8GVcaesmY6skvf7C5IXz zsNI14qA6CIdkjqefd3KdxjxfDEdqWVIrkDH/MN2z5yUKzd2URkMJcGS4HXqos0d O9HlyrhR62SIHY6q0J18A/ii3CwVdqV1NfAovLfN+4L9elHefGa7Hk/6L6sMtnkx PZkHRx71UrBzPX0uKdmUNGuOgO8cjWtz8yBWuX8WFjgJrlJ4egpCv3mKYHGQ7vZv XfXNZLDAtuvf0ZpNhptBZ/SqVhE7C2Jmpvuc0nFKD1iQriaZxIwlodUynwCPNCqF relD/W11sEJ66SsenOpQgAgWjgF4eBap0gB62VVm55Cu79ihJ3Nx9kGsC33pHcIe tfa3fto+XtwFo0QS8ESgW0tFbTkZWJOEi0eF/XxjdYeBhkHzymi1Jkm1cFiHwbrk o7wSrYkolY22vP4SYQfheV1dhY23nBxzI8pcmnMkl1A/lO4K3NeMQvw/RHTzcAO/ B93qFI8/IkdS1ucrQUreqgXJpzvikIairw2gdFSMVUSUkebQnYlDpKErHu2cU2x1 8my4tfZNb02r35JX5laKcTR7EIIjOx6NoAOoHAt7dQmcUTESHdI3faGr2jftQcRK fyW0bXK0LgXLwzYOk1eorNAORaaBEv88Z+hOa8s6sCD86SLmEYQ= =/aBj -----END PGP SIGNATURE----- --AReVVmRipdid4PwS--