Date: Thu, 2 Jan 2025 12:00:30 +1100
From: David Gibson
To: Stefano Brivio
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2 06/12] packet: Don't hard code maximum packet size to UINT16_MAX
In-Reply-To: <20250101225433.45f52b86@elisabeth>

On Wed, Jan 01, 2025 at 10:54:33PM +0100, Stefano Brivio wrote:
> On Fri, 20 Dec 2024 19:35:29 +1100
> David Gibson wrote:
> 
> > We verify that every packet we store in a pool - and every partial
> > packet we retrieve from it - has a length no longer than UINT16_MAX.
> > This originated in the older packet pool implementation, which stored
> > packet lengths in a uint16_t.  Now that packets are represented by a
> > struct iovec with its size_t length, this check serves only as a
> > sanity / security check that we don't have some wildly out-of-range
> > length due to a bug elsewhere.
> > 
> > However, UINT16_MAX (65535) isn't quite enough, because the "packet"
> > as stored in the pool is in fact an entire frame, including both L2
> > and any backend-specific headers.  We can exceed this in passt mode,
> > even with the default MTU: 65520 bytes of IP datagram + 14 bytes of
> > Ethernet header + 4 bytes of qemu stream length header = 65538 bytes.
> > 
> > Introduce our own define for the maximum length of a packet in the
> > pool and set it slightly larger, allowing 128 bytes for L2 and/or
> > other backend-specific headers.  We'll use different amounts of that
> > depending on the tap backend, but since this is just a sanity check,
> > the bound doesn't need to be 100% tight.
> 
> I couldn't find the time to check what's the maximum amount of bytes we
> can get here depending on hypervisor and interface, but if this patch

So, it's a separate calculation for each backend type, and some of
them are pretty tricky.

For anything based on the kernel tap device it's 65535, because tap has
an internal frame size limit of 65535 that already includes any L2
headers (it explicitly limits the MTU to 65535 - hard_header_len).
There is no "hardware" header.

For the qemu stream protocol it gets pretty complicated, because there
are multiple layers which could clamp the maximum size.  It doesn't
look like the socket protocol code itself imposes a limit beyond the
structural one of (2^32-1 + 4) (well, and putting it into an ssize_t,
which could be less on 32-bit systems).  AFAICT, it's not theoretically
impossible to have gigabyte frames with a weird virtual NIC model...
though obviously that wouldn't be IP, and probably not even Ethernet.

Each virtual NIC could have its own limit.  I suspect that's going to
be in the vicinity of 64k.  But I'm really struggling to figure out
what it is just for virtio-net, so I really don't want to try to figure
it out for all of them.

With a virtio-net NIC, I seem to be able to set the MTU all the way up
to 65535 successfully, which implies a maximum frame size of at least
65535 + 14 (L2 header) + 4 (stream protocol header) = 65553.

It's a similar situation for vhost-user, where I'm finding it even more
inscrutable to figure out what limits are imposed at the sub-IP levels.
At the moment the "hardware" header (virtio_net_hdr_mrg_rxbuf) doesn't
count towards what we store at the packet.c layer, but we might have
reasons to change that.

So, for qemu, I'm basically not managing to find any sub-IP limits.
However, we (more or less) only care about IP, which imposes a more
practical limit of 65535 + L2 header size + "hardware" header size.
At present that maxes out at 65553, as above, but if we ever support
other L2 encapsulations, or other backend protocols with larger
"hardware" headers, that could change.
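Putting those numbers together, roughly (purely illustrative; the names
below are made up for this sketch, not identifiers from the passt tree):

/* Illustrative sketch of the frame-size bounds discussed above;
 * all of these names are hypothetical, not passt identifiers. */
#define ETH_HDR_SIZE     14    /* Ethernet header, no VLAN tag */
#define QEMU_STREAM_HDR   4    /* qemu stream protocol length word */

/* Kernel tap: whole frame, L2 headers included, capped at 65535
 * (MTU is limited to 65535 - hard_header_len) */
#define TAP_FRAME_MAX    65535

/* passt mode, default 65520-byte MTU: 65520 + 14 + 4 = 65538 */
#define PASST_FRAME_DFLT (65520 + ETH_HDR_SIZE + QEMU_STREAM_HDR)

/* passt mode, virtio-net MTU pushed to 65535: 65535 + 14 + 4 = 65553 */
#define PASST_FRAME_MAX  (65535 + ETH_HDR_SIZE + QEMU_STREAM_HDR)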
> fixes an actual issue as you seem to imply, actually checking that with
> QEMU and muvm would be nice.
> 
> By the way, as you mention a specific calculation, does it really make
> sense to use a "good enough" value here?  Can we ever exceed 65538
> bytes, or can we use that as the limit?  It would be good to find out,
> while at it.

So, yes, I think we can exceed 65538.  But more significantly, trying
to make the limit tight here feels like a borderline layering
violation.  The packet layer doesn't really care about the frame size,
as long as it's "sane".

Fwiw, in the draft changes I have improving MTU handling, my intention
is that individual backends calculate and/or enforce tighter limits of
their own where practical, and BUILD_ASSERT() that those fit within
the packet layer's frame size limit.
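Very roughly, and with made-up names and numbers (this is just the
shape of the idea, not what the draft actually does):

#include <assert.h>
#include <stdint.h>

/* Stand-in for the real BUILD_ASSERT(), so this sketch stands alone */
#define BUILD_ASSERT(x)  static_assert((x), #x)

/* Loose sanity bound at the packet pool layer (example value only) */
#define PACKET_MAX_LEN   (UINT16_MAX + 128)

/* A backend's own, tighter bound, e.g. the stream case from above:
 * 65535 MTU + 14 Ethernet + 4 length word = 65553 */
#define PASST_FRAME_MAX  (65535 + 14 + 4)

/* The backend asserts at build time that its bound fits the pool's */
BUILD_ASSERT(PASST_FRAME_MAX <= PACKET_MAX_LEN);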
-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson