From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id 057705A0272 for ; Fri, 23 Feb 2024 05:42:12 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202312; t=1708663328; bh=wJIVQyg2R1TcSFIn5MIOrS+Y1RBkd0mS9WuP6/jtwqk=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=qOpF323RWxEvrzY1Q+5ebTRBWH5CYCxmC/ZALc1TxCcCqq3rb5B9vQh+SuMRh3hLK zdtXcYKHERSaj/Pzb++OATQMB3LfNwQ0jdfECngva/iC5Dy7Rbk7j+OHW1z6eD3IJH sKHYwqJaI2V8n10hQ9htOP3idOGFBYnntzi3N14J1EArlBw695FrB/c35evcSP184L 4Z7B9Kzrr8zWW0UNptunDDrj1Cj8R2hECy4vZCujRCLUYZQYbeiSeBEwNjB4x4M68r WZ3brqUPfn+1tMhKGsrBI/GTf5XHIF08HS7lpU5mq/5jV7LlOyulfHH/G6w5pmetgq VEin0j2UODK1g== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4Tgy6846Lbz4wcR; Fri, 23 Feb 2024 15:42:08 +1100 (AEDT) Date: Fri, 23 Feb 2024 14:56:46 +1100 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH v2 19/22] tcp: Validate TCP endpoint addresses Message-ID: References: <20240206011734.884138-1-david@gibson.dropbear.id.au> <20240206011734.884138-20-david@gibson.dropbear.id.au> <20240222134544.0c5a8bc3@elisabeth> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="iWVGeeDFgCqeYVHj" Content-Disposition: inline In-Reply-To: <20240222134544.0c5a8bc3@elisabeth> Message-ID-Hash: 52GS5HOC22WFSF57LQFDJAUN3DTRJPZ2 X-Message-ID-Hash: 52GS5HOC22WFSF57LQFDJAUN3DTRJPZ2 X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --iWVGeeDFgCqeYVHj Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Feb 22, 2024 at 01:45:44PM +0100, Stefano Brivio wrote: > On Tue, 6 Feb 2024 12:17:31 +1100 > David Gibson wrote: >=20 > > TCP connections should typically not have wildcard addresses (0.0.0.0 or > > ::) nor a zero port number for either endpoint. >=20 > This is only true for non-local connections, see also: > https://archives.passt.top/passt-dev/20240119105630.089c5d34@elisabeth/ >=20 > Just looking at RFC 6890, which should be sufficient: >=20 > +----------------------+----------------------------+ > | Attribute | Value | > +----------------------+----------------------------+ > | Address Block | 0.0.0.0/8 | > | Name | "This host on this network"| > | RFC | [RFC1122], Section 3.2.1.3 | > | Allocation Date | September 1981 | > | Termination Date | N/A | > | Source | True | > | Destination | False | > | Forwardable | False | > | Global | False | > | Reserved-by-Protocol | True | > +----------------------+----------------------------+ It's unclear to me from either rfc 6890 or rfc 1122 whether this is describing something meaningful on the wire, or only something meaningful in APIs. > ...it can be used as source, but surely not by a non-loopback > interface, so not by the guest. >=20 > About using it as destination: the table says it can't be done, but: >=20 > $ nc -l -p 8818 & echo abcd | nc 0.0.0.0 8818 > [2] 2624647 > abcd > > and: >=20 > # tcpdump -ni lo tcp port 8818 > tcpdump: verbose output suppressed, use -v[v]... for full protocol deco= de > listening on lo, link-type EN10MB (Ethernet), snapshot length 262144 by= tes > 13:24:51.379436 IP 127.0.0.1.58574 > 127.0.0.1.8818: Flags [S], seq 328= 5989360, win 65495, options [mss 65495,sackOK,TS val 1253719321 ecr 0,nop,w= scale 7], length 0 > 13:24:51.379447 IP 127.0.0.1.8818 > 127.0.0.1.58574: Flags [S.], seq 31= 28422158, ack 3285989361, win 65483, options [mss 65495,sackOK,TS val 12537= 19321 ecr 1253719321,nop,wscale 7], length 0 > 13:24:51.379457 IP 127.0.0.1.58574 > 127.0.0.1.8818: Flags [.], ack 1, = win 512, options [nop,nop,TS val 1253719321 ecr 1253719321], length 0 > 13:24:51.379508 IP 127.0.0.1.58574 > 127.0.0.1.8818: Flags [P.], seq 1:= 6, ack 1, win 512, options [nop,nop,TS val 1253719321 ecr 1253719321], leng= th 5 > 13:24:51.379513 IP 127.0.0.1.8818 > 127.0.0.1.58574: Flags [.], ack 6, = win 512, options [nop,nop,TS val 1253719321 ecr 1253719321], length 0 >=20 > it's not effectively used, but the kernel understands what I mean by > 0.0.0.0: Right: the 0.0.0.0 means something at the API level, but not AFAICT, at the wire level. At least for the "from tap" side of this change, it's the wire level that I'm checking. > connect(3, {sa_family=3DAF_INET, sin_port=3Dhtons(8818), > sin_addr=3Dinet_addr("0.0.0.0")}, 16) =3D -1 EINPROGRESS (Operation now= in progress) >=20 > So I wouldn't exclude its usage in tcp_listen_handler() -- even though > sure, the kernel translates that, so I don't think it can ever become a > practical issue. Hmm. In listen_handler() it's not "on the wire", but reading from an API still seems different from sending to an API to me. At the very least 0.0.0.0 doesn't have the same meaning in all API contexts, which makes it, IMO, unsafe to carry around from one API to another. > If it annoys you in the perspective of the flow table, maybe we can > turn it into a loopback address, when we see it's used as source or > destination? It appears the kernel already does that (IMO, correctly). If we do see a 0.0.0.0 here, I think that would be the kernel indicating some sort of special case that would need special handling. So, again, I don't think we should just blindly copy this address to other APIs. > If that's problematic as well, I can totally live with this patch, > though. >=20 > About IPv6: >=20 >=20 > +----------------------+---------------------+ > | Attribute | Value | > +----------------------+---------------------+ > | Address Block | ::/128 | > | Name | Unspecified Address | > | RFC | [RFC4291] | > | Allocation Date | February 2006 | > | Termination Date | N/A | > | Source | True | > | Destination | False | > | Forwardable | False | > | Global | False | > | Reserved-by-Protocol | True | > +----------------------+---------------------+ >=20 > ...this is more specific, and its usage as source address is exemplified > in RFC 4291, 2.5.2: >=20 > One example of its use is in the Source Address field of > any IPv6 packets sent by an initializing host before it has learned > its own address. >=20 > ...which should never be the case for any flow passt has to handle, so > I think it's fine to refuse it in tcp_listen_handler(). Well, depends how broadly you define "flow". We may need to handle this for NDP and/or DHCPv6. But this is specifically for TCP. > The rest of the series, up to 22/22, looks good to me. >=20 > > It's not entirely clear > > (at least to me) if it's strictly against the RFCs to do so, but at any > > rate the socket interfaces often treat those values specially[1], so it= 's > > not really possible to manipulate such connections. Likewise they shou= ld > > not have broadcast or multicast addresses for either endpoint. > >=20 > > However, nothing prevents a guest from creating a SYN packet with such > > values, and it's not entirely clear what the effect on passt would be. = To > > ensure sane behaviour, explicitly check for this case and drop such > > packets, logging a debug warning (we don't want a higher level, because > > that would allow a guest to spam the logs). > >=20 > > We never expect such an address on an accept()ed socket either, but just > > in case, check for it as well. > >=20 > > [1] Depending on context as "unknown", "match any" or "kernel, pick > > something for me" > >=20 > > Signed-off-by: David Gibson > > --- > > tcp.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++------ > > 1 file changed, 67 insertions(+), 7 deletions(-) > >=20 > > diff --git a/tcp.c b/tcp.c > > index 31d4e87b..236a8d23 100644 > > --- a/tcp.c > > +++ b/tcp.c > > @@ -284,6 +284,7 @@ > > #include > > #include > > #include > > +#include > > =20 > > #include /* For struct tcp_info */ > > =20 > > @@ -1930,27 +1931,59 @@ static void tcp_conn_from_tap(struct ctx *c, sa= _family_t af, > > const struct tcphdr *th, const char *opts, > > size_t optlen, const struct timespec *now) > > { > > + in_port_t srcport =3D ntohs(th->source); > > + in_port_t dstport =3D ntohs(th->dest); > > struct sockaddr_in addr4 =3D { > > .sin_family =3D AF_INET, > > - .sin_port =3D th->dest, > > + .sin_port =3D htons(dstport), > > .sin_addr =3D *(struct in_addr *)daddr, > > }; > > struct sockaddr_in6 addr6 =3D { > > .sin6_family =3D AF_INET6, > > - .sin6_port =3D th->dest, > > + .sin6_port =3D htons(dstport), > > .sin6_addr =3D *(struct in6_addr *)daddr, > > }; > > const struct sockaddr *sa; > > struct tcp_tap_conn *conn; > > union flow *flow; > > + int s =3D -1, mss; > > socklen_t sl; > > - int s, mss; > > - > > - (void)saddr; > > =20 > > if (!(flow =3D flow_alloc())) > > return; > > =20 > > + if (af =3D=3D AF_INET) { > > + if (IN4_IS_ADDR_UNSPECIFIED(saddr) || > > + IN4_IS_ADDR_BROADCAST(saddr) || > > + IN4_IS_ADDR_MULTICAST(saddr) || srcport =3D=3D 0 || > > + IN4_IS_ADDR_UNSPECIFIED(daddr) || > > + IN4_IS_ADDR_BROADCAST(daddr) || > > + IN4_IS_ADDR_MULTICAST(daddr) || dstport =3D=3D 0) { > > + char sstr[INET_ADDRSTRLEN], dstr[INET_ADDRSTRLEN]; > > + > > + debug("Invalid endpoint in TCP SYN: %s:%hu -> %s:%hu", > > + inet_ntop(AF_INET, saddr, sstr, sizeof(sstr)), > > + srcport, > > + inet_ntop(AF_INET, daddr, dstr, sizeof(dstr)), > > + dstport); > > + goto cancel; > > + } > > + } else if (af =3D=3D AF_INET6) { > > + if (IN6_IS_ADDR_UNSPECIFIED(saddr) || > > + IN6_IS_ADDR_MULTICAST(saddr) || srcport =3D=3D 0 || > > + IN6_IS_ADDR_UNSPECIFIED(daddr) || > > + IN6_IS_ADDR_MULTICAST(daddr) || dstport =3D=3D 0) { > > + char sstr[INET6_ADDRSTRLEN], dstr[INET6_ADDRSTRLEN]; > > + > > + debug("Invalid endpoint in TCP SYN: %s:%hu -> %s:%hu", > > + inet_ntop(AF_INET6, saddr, sstr, sizeof(sstr)), > > + srcport, > > + inet_ntop(AF_INET6, daddr, dstr, sizeof(dstr)), > > + dstport); > > + goto cancel; > > + } > > + } > > + > > if ((s =3D tcp_conn_sock(c, af)) < 0) > > goto cancel; > > =20 > > @@ -2001,8 +2034,8 @@ static void tcp_conn_from_tap(struct ctx *c, sa_f= amily_t af, > > sl =3D sizeof(addr6); > > } > > =20 > > - conn->fport =3D ntohs(th->dest); > > - conn->eport =3D ntohs(th->source); > > + conn->fport =3D dstport; > > + conn->eport =3D srcport; > > =20 > > conn->seq_init_from_tap =3D ntohl(th->seq); > > conn->seq_from_tap =3D conn->seq_init_from_tap + 1; > > @@ -2732,6 +2765,33 @@ void tcp_listen_handler(struct ctx *c, union epo= ll_ref ref, > > if (s < 0) > > goto cancel; > > =20 > > + if (sa.sa_family =3D=3D AF_INET) { > > + const struct in_addr *addr =3D &sa.sa4.sin_addr; > > + in_port_t port =3D sa.sa4.sin_port; > > + > > + if (IN4_IS_ADDR_UNSPECIFIED(addr) || > > + IN4_IS_ADDR_BROADCAST(addr) || > > + IN4_IS_ADDR_MULTICAST(addr) || port =3D=3D 0) { > > + char str[INET_ADDRSTRLEN]; > > + > > + err("Invalid endpoint from TCP accept(): %s:%hu", > > + inet_ntop(AF_INET, addr, str, sizeof(str)), port); > > + goto cancel; > > + } > > + } else if (sa.sa_family =3D=3D AF_INET6) { > > + const struct in6_addr *addr =3D &sa.sa6.sin6_addr; > > + in_port_t port =3D sa.sa6.sin6_port; > > + > > + if (IN6_IS_ADDR_UNSPECIFIED(addr) || > > + IN6_IS_ADDR_MULTICAST(addr) || port =3D=3D 0) { > > + char str[INET6_ADDRSTRLEN]; > > + > > + err("Invalid endpoint from TCP accept(): %s:%hu", > > + inet_ntop(AF_INET6, addr, str, sizeof(str)), port); > > + goto cancel; > > + } > > + } > > + > > if (tcp_splice_conn_from_sock(c, ref.tcp_listen.pif, > > ref.tcp_listen.port, flow, s, &sa)) > > return; >=20 --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --iWVGeeDFgCqeYVHj Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmXYF20ACgkQzQJF27ox 2GfJ6g/9HStF3g1kPbS8wloH77oqAAggb1V6hD27ieMIj2SbgXWToeXTGzOf+gCw V+fi8ZBENOlvY4OXrYwa5q0P4GY0LsPOq12A0IcYj/gBv+oPYgTFOOfgMlkQCD0v n24w80Gp+vYGZKjosi+w5+SjASNivF08943MZb3CQJcV4kyrH/aZ2VUDCdRWV9Go y0y9Is/SdLZqlq70MX+YIN4FHIiW7O4M+ahRuOh8WbzrdibE8dvgDXuTmhcb8ZGR cIsQK3K5uALSZvxtNijHFEhF34uDeihwGf8MVXcpwENyjmUCHGTZJyEPtD+4zO0L WvcdeWdxt6PB4kkG31rXVfXvjTInvO8GAQ26zKrBNmOt2ifr/olGF0LAxGP5aLFB G4YBniG2XpUPMRJlCpclBtaJwG7kn+lL7IQCS97tVn7zeriGCFBn1oygmSUCuMVH dHnp0U9OpmLTH98m1mqWhU2wZ/Ag4sisrcJ6tzajzhCZuh1UdyLpo0oHKKFelBYM H3Fe1E7d1nwePvJ6Am6ichnIzsnmtswXcMMJaX7m3rWD8Z8PmJH7Rk0beIuxnrPJ tSJ2MmJUudGkNp/NjaALIWha24UXog/0Vm2UP0tuu01TXCGlIdIzzMSOLiWQPUG/ uX1a84OJ5o8NOUGG/XVCz8eoF829de8gWfebtHlzSqrSui0seYI= =A7rH -----END PGP SIGNATURE----- --iWVGeeDFgCqeYVHj--