From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202502 header.b=fPu3ncbg; dkim-atps=neutral Received: from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id CE4905A0777 for ; Mon, 24 Feb 2025 03:21:54 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202502; t=1740363705; bh=dsxsYKkRZptXkV8F0dk+yTVn7sN9KmQA2wLDxVB2P6I=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=fPu3ncbgZIfURXTxcxOPUT5pLdKc75XwpljL/eFiDtjgmv8Px1BQtHfTXDnGYDVXj BhjIjE0XY8oilrn7HQzgPVl9xOAElbUAn7IST5gEODg2/YK9gELX/AaqXhx9b5ZnD7 h/8SVVZ5CpOGUpQoDSF/Fq+aVx5yGRbRtRCPhdpmCQPfLrw3ahhBnmyl19u3RCZiHZ LlrBoCh5HMcSoN3/znYTnWUHm+828Xp3OQxpyl+9mHm3vKUtX2zByRjws0To1Yymxl zIqWvNZN4WyGhHWOxgcZvcyp4XKdYb4gjqE7ymoncLEwDfM22mLmcOjuaoIUU5GvLz F0ulanfGCk5Mg== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4Z1Pcn3RLbz4wcj; Mon, 24 Feb 2025 13:21:45 +1100 (AEDT) Date: Mon, 24 Feb 2025 13:21:40 +1100 From: David Gibson To: Jon Maloy Subject: Re: [PATCH v5 4/4] udp: create and send ICMPv6 to local peer when applicable Message-ID: References: <20250223225951.3534010-1-jmaloy@redhat.com> <20250223225951.3534010-5-jmaloy@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="neaVT/V4xkD764Qb" Content-Disposition: inline In-Reply-To: <20250223225951.3534010-5-jmaloy@redhat.com> Message-ID-Hash: N24PLNXFHM6VOJ6APHMSJSVLBMVTYGF4 X-Message-ID-Hash: N24PLNXFHM6VOJ6APHMSJSVLBMVTYGF4 X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, sbrivio@redhat.com, lvivier@redhat.com, dgibson@redhat.com X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --neaVT/V4xkD764Qb Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Feb 23, 2025 at 05:59:51PM -0500, Jon Maloy wrote: > When a local peer sends a UDP message to a non-existing port on an > existing remote host, that host will return an ICMP message containing > the error code ICMP_PORT_UNREACH, plus the header and the first eight > bytes of the original message. If the sender socket has been connected, > it uses this message to issue a "Connection Refused" event to the user. >=20 > Until now, we have only read such events from the externally facing > socket, but we don't forward them back to the local sender because > we cannot read the ICMP message directly to user space. Because of > this, the local peer will hang and wait for a response that never > arrives. >=20 > We now fix this for IPv6 by recreating and forwarding a correct ICMP > message back to the internal sender. We synthesize the message based > on the information in the extended error structure, plus the returned > part of the original message body. >=20 > Note that for the sake of completeness, we even produce ICMP messages > for other error codes. We have noticed that at least ICMP_PROT_UNREACH > is propagated as an error event back to the user. >=20 > Signed-off-by: Jon Maloy > --- > tap.c | 8 ++++---- > tap.h | 4 ++++ > udp.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++- > 3 files changed, 54 insertions(+), 5 deletions(-) >=20 > diff --git a/tap.c b/tap.c > index 380a7b9..31aec62 100644 > --- a/tap.c > +++ b/tap.c > @@ -247,10 +247,10 @@ void tap_icmp4_send(const struct ctx *c, struct in_= addr src, struct in_addr dst, > * > * Return: pointer at which to write the packet's payload > */ > -static void *tap_push_ip6h(struct ipv6hdr *ip6h, > - const struct in6_addr *src, > - const struct in6_addr *dst, > - size_t l4len, uint8_t proto, uint32_t flow) > +void *tap_push_ip6h(struct ipv6hdr *ip6h, > + const struct in6_addr *src, > + const struct in6_addr *dst, > + size_t l4len, uint8_t proto, uint32_t flow) > { > ip6h->payload_len =3D htons(l4len); > ip6h->priority =3D 0; > diff --git a/tap.h b/tap.h > index b7b8cef..67aa9e6 100644 > --- a/tap.h > +++ b/tap.h > @@ -53,6 +53,10 @@ void *tap_push_uh6(struct udphdr *uh, > void *in, size_t dlen); > void *tap_push_ip4h(struct iphdr *ip4h, struct in_addr src, > struct in_addr dst, size_t l4len, uint8_t proto); > +void *tap_push_ip6h(struct ipv6hdr *ip6h, > + const struct in6_addr *src, > + const struct in6_addr *dst, > + size_t l4len, uint8_t proto, uint32_t flow); > void tap_udp4_send(const struct ctx *c, struct in_addr src, in_port_t sp= ort, > struct in_addr dst, in_port_t dport, > const void *in, size_t dlen); > diff --git a/udp.c b/udp.c > index 3dfd478..0e37fb3 100644 > --- a/udp.c > +++ b/udp.c > @@ -88,6 +88,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -440,6 +441,46 @@ static void udp_send_conn_fail_icmp4(const struct ct= x *c, > tap_icmp4_send(c, oaddr, eaddr, &msg, msglen); > } > =20 > + > +/** > + * udp_send_conn_fail_icmp6() - Construct and send ICMP to local peer s/ICMP/ICMPv6/ > + * @c: Execution context > + * @ee: Extended error descriptor > + * @ref: epoll reference > + * @in: First bytes (max 8) of original UDP message body > + * @dlen: Length of the read part of original UDP message body > + * @flow: IPv6 flow identifier As with 2/4 a bunch of these function comments seem to use inconsistent tabs/spaces before the parameter descriptions. > + */ > +static void udp_send_conn_fail_icmp6(const struct ctx *c, > + const struct sock_extended_err *ee, > + const struct flowside *toside, > + void *in, size_t dlen, uint32_t flow) > +{ > + const struct in6_addr *oaddr =3D &toside->oaddr.a6; > + const struct in6_addr *eaddr =3D &toside->eaddr.a6; > + in_port_t eport =3D toside->eport; > + in_port_t oport =3D toside->oport; > + struct { > + struct icmp6_hdr icmp6h; > + struct ipv6hdr ip6h; > + struct udphdr uh; > + char data[8]; > + } __attribute__((packed, aligned(__alignof__(max_align_t)))) msg; > + size_t msglen =3D sizeof(msg) - sizeof(msg.data) + dlen; > + size_t l4len =3D dlen + sizeof(struct udphdr); > + > + memset(&msg, 0, sizeof(msg)); > + msg.icmp6h.icmp6_type =3D ee->ee_type; > + msg.icmp6h.icmp6_code =3D ee->ee_code; > + > + /* Reconstruct the original headers as returned in the ICMP message */ > + tap_push_ip6h(&msg.ip6h, eaddr, oaddr, l4len, IPPROTO_UDP, flow); > + tap_push_uh6(&msg.uh, eaddr, eport, oaddr, oport, in, dlen); > + memcpy(&msg.data, in, dlen); > + > + tap_icmp6_send(c, oaddr, eaddr, &msg, msglen); > +} > + > /** > * udp_sock_recverr() - Receive and clear an error from a socket > * @c: Execution context > @@ -498,8 +539,12 @@ static int udp_sock_recverr(const struct ctx *c, uni= on epoll_ref ref) > if (ref.type =3D=3D EPOLL_TYPE_UDP_REPLY) { > flow_sidx_t sidx =3D flow_sidx_opposite(ref.flowside); > const struct flowside *toside =3D flowside_at_sidx(sidx); > + uint32_t flow =3D sidx.flowi; > =20 > - udp_send_conn_fail_icmp4(c, ee, toside, data, rc); > + if (ee->ee_type =3D=3D ICMP_DEST_UNREACH) > + udp_send_conn_fail_icmp4(c, ee, toside, data, rc); > + else if (ee->ee_type =3D=3D ICMP6_DST_UNREACH) > + udp_send_conn_fail_icmp6(c, ee, toside, data, rc, flow); Ah... there's another little wrinkle I didn't consider looking at 2/4. This will work or now, but at some point we want to allow passt to be configured to NAT between IPv4 and IPv6. That means that we shouldn't base whether we sent an ICMP or ICMPv6 to the guest on whether we received an ICMP or ICMPv6 on the socket side. Instead we should look at whether the guest is using IPv6 for this flow, so using inany_v4() the addresses in toside. > } else { > trace("Ignoring received cmsg on listener socket"); > } --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --neaVT/V4xkD764Qb Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAme717MACgkQzQJF27ox 2GdeZw/+LBtU5BvfLb80ce/GaSyJf5MYowGlKRl3l2mTrEf783S/YSF6r0OzoRl3 TezwBdV933JrgE3v1uCMK2sz8VsSUt62/FKu+qJGtdf6xUtCXX/brnwyhh4fBA8I AufTveU66793h3ADD7RJQ9Qp03XVpd/y9lvdw8s/wUz/ERUskDVxlcP+6CewC4+c 1DtliPtWKyN1wDvImb1bi7fvPeg7s1KLDb3uXw3Gdu5qu3JOo2ug7r8mZgvuxnWX St/voaxOWAVDURpQv76LKSz0LuFXKf+Ez9JyEe9e9GsX8LbC9VBZSKFZbL2saC/f qhu6eP1h5L3pgcwMM8W2wM2nSIAWTGi38cfufK45swhR/vbdSZ+8dUxRtaVQQPO5 v1RTlnnto6SgwRFyxWI2hHRiGg1BNbP5/KhCJyzISJjwXGU/QjC1cW8UVZdgKCX0 4dcjenKMnb6XfWp5RV4CT6PJGK1/yBx/dvKOkM7ur/dAsAhzCD17lUP+rCcZcmw+ FYikDgt42fh98NtCM6e4NTzTHlrP6yq1Xeb/hAlkclWzNW6xH+ujBT2MI14L4Lx3 uU+6MDlLa3XSuWL0wIVJW5cofb0LkhhDk2v05lqWuAU3S7vJ0LcrfluSwxmyCw6L p145c59Ag60DiPKX7xsrcjelzCGIf/oEUIY4PP/PvJnD321gPqs= =Cxc2 -----END PGP SIGNATURE----- --neaVT/V4xkD764Qb--