From mboxrd@z Thu Jan 1 00:00:00 1970
Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au
Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202410 header.b=Tne9eHdE; dkim-atps=neutral
Received: from mail.ozlabs.org (gandalf.ozlabs.org [150.107.74.76])
 by passt.top (Postfix) with ESMTPS id CF88F5A061A
 for ; Fri, 08 Nov 2024 05:19:48 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gibson.dropbear.id.au; s=202410; t=1731039570;
 bh=r0KVJlujz4wtktW4VU7tc1c4MQNtFZVLZ8xkY/9aEyE=;
 h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
 b=Tne9eHdEzbG8OEH0WbUI3L72aiap+qkd9Uj7/ydadvGCQGddTfnlUy07x3RZn2JZC
 d4JY2lLvqBvL6gA0TqclTKG723JDhsCvUP/otq6IpWJJ8bpV+8TUAZw/S0+D2c0x4N
 x7Nk7Q74uj5XuZFbKK5UoentlqRcBoopzonmU1vg8OAIj4bDpgRuI9ceHmwH1TaD4R
 ACUem5qF9lDT4PBx2NrGqmaKsOEe/5ld41PqatTR7LXVeMXISFDTK259LCkD1jiKjf
 /fmjCs4SkVPMXoOk7a01u0CXUBqQOO8sILoNcS1Qf3tag/gLR/dgqjRN1E0hfqj2QP
 QhjSexxDWbb/w==
Received: by gandalf.ozlabs.org (Postfix, from userid 1007)
 id 4Xl5LV0p2hz4wxx; Fri, 8 Nov 2024 15:19:30 +1100 (AEDT)
Date: Fri, 8 Nov 2024 15:18:32 +1100
From: David Gibson
To: Stefano Brivio
Subject: Re: [PATCH 1/1] iov: iov tail helpers
Message-ID:
References: <20241105023222.698658-1-david@gibson.dropbear.id.au>
 <20241105023222.698658-2-david@gibson.dropbear.id.au>
 <20241106015631.53041587@elisabeth>
 <20241106110618.333e3c8e@elisabeth>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="xfxIHJI4AfQ64x/h"
Content-Disposition: inline
In-Reply-To: <20241106110618.333e3c8e@elisabeth>
Message-ID-Hash: I5BLLXOQN4KCXX45HURWXJFH7TGTUTVG
X-Message-ID-Hash: I5BLLXOQN4KCXX45HURWXJFH7TGTUTVG
X-MailFrom: dgibson@gandalf.ozlabs.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation;
 administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: passt-dev@passt.top
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches for passt
Archived-At:
Archived-At:
List-Archive:
List-Archive:
List-Help:
List-Owner:
List-Post:
List-Subscribe:
List-Unsubscribe:

--xfxIHJI4AfQ64x/h
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Nov 06, 2024 at 11:06:18AM +0100, Stefano Brivio wrote:
> On Wed, 6 Nov 2024 13:38:38 +1100
> David Gibson wrote:
>
> > On Wed, Nov 06, 2024 at 01:56:31AM +0100, Stefano Brivio wrote:
> > > On Tue, 5 Nov 2024 13:32:22 +1100
> > > David Gibson wrote:
> > >
> > > > In the vhost-user code we have a number of places where we need to locate
> > > > a particular header within the guest-supplied IO vector. We need to work
> > > > out which buffer the header is in, and verify that it's contiguous and
> > > > aligned as we need. At the moment this is open-coded, but introduce a
> > > > helper to make this more straightforward.
> > > >
> > > > We add a new datatype 'struct iov_tail' representing an IO vector from
> > > > which we've logically consumed some number of headers. The IOV_PULL_HEADER
> > > > macro consumes a new header from the vector, returning a pointer and
> > > > updating the iov_tail.
> > >
> > > The interfaces look usable and straightforward to me. I find some names
> > > and comments a bit obscure, though.
> >
> > Yeah, I don't love the names either.
> >
> > > First off, I would intuitively say that the "tail" is always at the
> > > end, and if we already consumed something, that's always at the "head".
> >
> > Right.. which is true in a sense. The idea is you'd set one of these
> > up, to cover a whole (say) frame, then pull bits off the front as you
> > need it.
> > So, the iov_tail does represent the "tail", as in the
> > unprocessed bit of the frame at each point...
> >
> > > If we call the whole abstraction "tail", we risk ending up talking about
> > > the tail of the tail, and the head of the tail. Consider this part from
> > > the cover letter:
> >
> > .. but, yeah, heads of tails and tails of tails gets confusing.
> > Unless we rewrite in LISP, I guess.
> >
> > > > "iov tail", that is an iov from which you've
> > > > already consumed (in some sense) some data from the beginning.
> > >
> > > ...in other words, that's an IO vector called tail, and we
> > > already consumed some data from its head.
> > >
> > > What about (iov-based) "batch"?
> >
> > I don't think "batch" is really any better, it's just unclear on a
> > different set of axes.
> >
> > Would "iov remainder" be any better?
>
> Now that I read your reply, the "tail" name becomes clearer as it's
> clear that we just use it for left-overs. I don't think "remainder"
> makes it much clearer, so I would rather stick to "tail" and try to
> explain the purpose of struct iov_tail a bit more specifically (see
> below).

> > > > Signed-off-by: David Gibson
> > > > ---
> > > >  iov.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  iov.h | 24 +++++++++++++++++
> > > >  2 files changed, 107 insertions(+)
> > > >
> > > > diff --git a/iov.c b/iov.c
> > > > index 3f9e229..3d384ae 100644
> > > > --- a/iov.c
> > > > +++ b/iov.c
> > > > @@ -156,3 +156,86 @@ size_t iov_size(const struct iovec *iov, size_t iov_cnt)
> > > >
> > > >  	return len;
> > > >  }
> > > > +
> > > > +/**
> > > > + * iov_tail_shorten() - Remove any buffers from an IOV tail that are wholly consumed
> > >
> > > "Remove" is a bit difficult to interpret (does it deallocate? Throw
> > > data away?), I would rather say that we... detach (?) those buffers from
> > > the batch/tail.
> >
> > Yeah, I'm not sure how to express this either. In a sense this
> > operation is a logical no-op: it shouldn't change the results of any
> > future operation, but it might make them slightly faster.
>
> Oh, this wasn't clear to me. I guess that "reduce", "minimise", or
> "prune" would convey this better.

Good idea. I've gone with "prune".

> > > This operation itself raises a question though: if the batch already
> > > carries the information that some buffers were completely consumed,
> > > should it ever be in a state where we want to drop these buffers from
> > > it?
> > >
> > > That is, it sounds like we have some other operation that allows it to
> > > be in an inconsistent state.
> >
> > Sort of, yes, but I think this is the right design choice. It means
> > we can trivially construct one of these things with an arbitrary
> > offset, and we only do the work of stepping through the buffers when
> > we actually have to. We can discard bytes simply by adding to the
> > offset, without having to look at the actual buffers until later.
> >
> > Basically if we want to ensure that the representation is always
> > "minimal", in the sense that the offset lies within the first buffer,
> > then a bunch more operations need to do work to maintain that. Plus,
> > this approach is naturally robust: if we somehow get an iov_tail in
> > non-minimal form, peek/pull will just handle it with no extra logic.
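As a rough illustration of the lazy-offset design described in the quoted paragraphs above, here is a minimal standalone sketch. It is not the code from the patch: struct tail_sketch and the tail_sketch_*() functions are hypothetical names, and the loop in tail_sketch_prune() only assumes what iov_skip_bytes() does in iov.c. It shows that bytes can be discarded by adding to the offset alone, and that pruning later changes the representation but not the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for struct iov_tail from the patch */
struct tail_sketch {
	const struct iovec *iov;	/* First buffer not yet pruned */
	size_t cnt;			/* Number of entries in @iov */
	size_t off;			/* Offset of the tail, may exceed iov[0] */
};

/* Step past buffers wholly covered by @off. Logically a no-op: the bytes
 * the tail refers to stay the same, only the representation is minimised.
 */
static bool tail_sketch_prune(struct tail_sketch *t)
{
	while (t->cnt && t->off >= t->iov[0].iov_len) {
		t->off -= t->iov[0].iov_len;
		t->iov++;
		t->cnt--;
	}

	return t->cnt != 0;
}

/* Remaining bytes in the tail, pruning first so the subtraction is safe */
static size_t tail_sketch_size(struct tail_sketch *t)
{
	size_t len = 0, i;

	if (!tail_sketch_prune(t))
		return 0;

	for (i = 0; i < t->cnt; i++)
		len += t->iov[i].iov_len;

	return len - t->off;
}

/* Demo: buffers of 4, 6 and 10 bytes, with 11 bytes consumed in total */
static size_t tail_sketch_demo(void)
{
	static char a[4], b[6], c[10];
	const struct iovec iov[] = {
		{ .iov_base = a, .iov_len = sizeof(a) },
		{ .iov_base = b, .iov_len = sizeof(b) },
		{ .iov_base = c, .iov_len = sizeof(c) },
	};
	struct tail_sketch t = { .iov = iov, .cnt = 3, .off = 2 };
	size_t remaining;

	t.off += 9;	/* Discard 9 bytes without touching the buffers */

	remaining = tail_sketch_size(&t);	/* Prunes a and b as a side effect */
	assert(t.iov == &iov[2] && t.cnt == 1 && t.off == 1);

	return remaining;
}
```

In this sketch the tail starts in non-minimal form (offset 11 with a 4-byte first buffer), and the size computation still comes out as 20 - 11 = 9 bytes, which is the robustness property the quoted text is arguing for.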
> >
> > > > + * @tail: IO vector tail (modified)
> > > > + *
> > > > + * Return: true if the tail still contains any bytes, otherwise false
> > > > + */
> > > > +bool iov_tail_shorten(struct iov_tail *tail)
> > > > +{
> > > > +	size_t i;
> > > > +
> > > > +	i = iov_skip_bytes(tail->iov, tail->cnt, tail->off, &tail->off);
> > > > +	tail->iov += i;
> > > > +	tail->cnt -= i;
> > > > +
> > > > +	return !!tail->cnt;
> > > > +}
> > > > +
> > > > +/**
> > > > + * iov_tail_size - Calculate the total size of an IO vector tail
> > > > + * @tail: IO vector tail
> > > > + *
> > > > + * Returns: The total size in bytes.
> > > > + */
> > > > +/* cppcheck-suppress unusedFunction */
> > > > +size_t iov_tail_size(struct iov_tail *tail)
> > > > +{
> > > > +	iov_tail_shorten(tail);
> > > > +	return iov_size(tail->iov, tail->cnt) - tail->off;
> > > > +}
> > > > +
> > > > +/**
> > > > + * iov_peek_header_() - Get pointer to header from an IOV tail
> > >
> > > I think that this needs to be more generic than "header", because yes,
> > > we're using it for headers, but that word doesn't really help in this
> > > context.
> >
> > Hm. Well, we are using it for headers, and we're also always pulling
> > from the "head" of whatever we have left.
> >
> > > What about "aligned block", or just "block"?
> >
> > Maybe... but it really does have to be from the start of the current
> > tail.
>
> "Head block"? Or forget about it, "header" is actually fine.
>
> > >
> > > > + * @tail: IO vector tail to get header from
> > > > + * @len: Length of header to remove in bytes
> > >
> > > to remove, in bytes
> > >
> > > > + * @align: Required alignment of header in bytes
> > >
> > > Judging from this comment alone, it's not clear if 0 or 1 should be
> > > used to get freely aligned blocks.
> > >
> > > > + *
> > > > + * @tail may be modified, but will be semantically equivalent.
> > > > + *
> > > > + * Returns: Pointer to the removed header, NULL if it overruns the IO
> > > > + * vector, is not contiguous or is misaligned.
> > > > + */
> > > > +void *iov_peek_header_(struct iov_tail *tail, size_t len, size_t align)
> > > > +{
> > > > +	char *p;
> > > > +
> > > > +	if (!iov_tail_shorten(tail))
> > > > +		return NULL; /* Nothing left */
> > > > +
> > > > +	if (tail->off + len < tail->off)
> > > > +		return NULL; /* Overflow */
> > > > +
> > > > +	if (tail->off + len > tail->iov[0].iov_len)
> > > > +		return NULL; /* Not contiguous */
> > >
> > > I'm not sure if this observation is useful in some cases, but this
> > > doesn't necessarily mean that the header/block is not contiguous: if
> > > tail->iov[0].iov_base + tail->iov[0].iov_len == tail->iov[1].iov_base,
> > > it actually is.
> >
> > Hm, true. Not sure if it's worth the tests handle that case though.
>
> I guess it's worth it if there's any chance we'll ever want to support
> split headers.

Eh, barely, even then. We'd still need to have a fall back path for
when the header is *really* not contiguous. So it's a tiny
optimization for a case that will only happen if we're improbably
lucky about where the buffers end up.

Incidentally it would be quite easy to extend these helpers to handle
the split header case. Add a parameter with a "spare" buffer of the
necessary size/type. If the header is contiguous we just return it as
now, otherwise we linearize it into the spare buffer and return a
pointer to that. Not optimal, but might be worth it for the
simplicity.

> > > > +	p = (char *)tail->iov[0].iov_base + tail->off;
> > > > +	if ((uintptr_t)p % align)
> > > > +		return NULL; /* not aligned */
> > > > +
> > > > +	return p;
> > > > +}
> > > > +/**
> > > > + * iov_pull_header_() - Remove a header from an IOV tail
> > >
> > > I know that "pulling" is widely used, but it's sometimes ambiguous (I
> > > guess we already had a discussion about that in the past).
> > > What about
> > > "remove", or "drop"?
> >
> > "remove" might work. I don't like "drop" because that implies to me
> > it's just gone, rather than returned.

I'm going with "remove" for now.

> > > > + * @tail: IO vector tail to remove header from (modified)
> > > > + * @len: Length of header to remove in bytes
> > > > + * @align: Required alignment of header in bytes
> > > > + *
> > > > + * @tail is updated so that it no longer includes the extracted header
> > > > + *
> > > > + * Returns: Pointer to the removed header, NULL if it overruns the IO
> > > > + * vector, is not contiguous or is misaligned.
> > > > + */
> > > > +/* cppcheck-suppress unusedFunction */
> > > > +void *iov_pull_header_(struct iov_tail *tail, size_t len, size_t align)
> > > > +{
> > > > +	char *p = iov_peek_header_(tail, len, align);
> > > > +
> > > > +	if (!p)
> > > > +		return NULL;
> > > > +
> > > > +	tail->off = tail->off + len;
> > >
> > > This could just be += I guess.
> >
> > True.
> >
> > > > +	return p;
> > > > +}
> > > > diff --git a/iov.h b/iov.h
> > > > index a9e1722..a2f449c 100644
> > > > --- a/iov.h
> > > > +++ b/iov.h
> > > > @@ -28,4 +28,28 @@ size_t iov_from_buf(const struct iovec *iov, size_t iov_cnt,
> > > >  size_t iov_to_buf(const struct iovec *iov, size_t iov_cnt,
> > > >  		  size_t offset, void *buf, size_t bytes);
> > > >  size_t iov_size(const struct iovec *iov, size_t iov_cnt);
> > > > +
> > > > +/**
> > > > + * struct iov_tail - Represents the fail portion of an IO vector
>
> ...so, it's not just a tail portion, it's a tail portion in a rather
> specific and constrained context, and once that's revealed these
> helpers are easier (for me) to understand. What about:
>
>  * struct iov_tail - Remaining, not consumed portion (tail) of IO vector
>
> ?

Ok. I've added a "theory of operation" comment, as well as revising a
bunch of other comments, which I hope will help.
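To make the split-header idea from earlier in this reply more concrete, here is a minimal sketch under stated assumptions. None of these names exist in passt: struct tail_sketch, peek_or_linearize(), pull_or_linearize() and linearize_demo() are hypothetical, the caller is assumed to supply a spare buffer of at least len bytes with suitable alignment, and falling back to the copy for a misaligned header is a choice made here for illustration, not something the thread settles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for struct iov_tail from the patch */
struct tail_sketch {
	const struct iovec *iov;
	size_t cnt, off;
};

/* Return a pointer to @len bytes at the front of @t. If they are contiguous
 * and aligned to @align (which must be at least 1), point into the buffer
 * itself, otherwise copy them into @spare. NULL if @t is too short.
 */
static const void *peek_or_linearize(struct tail_sketch *t, size_t len,
				     size_t align, void *spare)
{
	const struct iovec *iov;
	size_t cnt, off, done = 0;

	/* Prune wholly consumed buffers so that @off lies within iov[0] */
	while (t->cnt && t->off >= t->iov[0].iov_len) {
		t->off -= t->iov[0].iov_len;
		t->iov++;
		t->cnt--;
	}
	if (!t->cnt)
		return NULL;

	iov = t->iov;
	cnt = t->cnt;
	off = t->off;

	/* Fast path: header sits in one buffer with the right alignment */
	if (len <= iov[0].iov_len - off) {
		char *p = (char *)iov[0].iov_base + off;

		if (!((uintptr_t)p % align))
			return p;
	}

	/* Slow path: gather the header piece by piece into @spare */
	while (done < len && cnt) {
		size_t n = iov->iov_len - off;

		if (n > len - done)
			n = len - done;
		if (n)
			memcpy((char *)spare + done,
			       (char *)iov->iov_base + off, n);
		done += n;
		off = 0;
		iov++;
		cnt--;
	}

	return done == len ? spare : NULL;
}

/* As above, but also consume the header by advancing the offset */
static const void *pull_or_linearize(struct tail_sketch *t, size_t len,
				     size_t align, void *spare)
{
	const void *p = peek_or_linearize(t, len, align, spare);

	if (p)
		t->off += len;

	return p;
}

/* Demo: a 4-byte header split across a 2-byte and a 4-byte buffer */
static int linearize_demo(void)
{
	static uint8_t a[] = { 1, 2 }, b[] = { 3, 4, 5, 6 };
	const struct iovec iov[] = {
		{ .iov_base = a, .iov_len = sizeof(a) },
		{ .iov_base = b, .iov_len = sizeof(b) },
	};
	struct tail_sketch t = { .iov = iov, .cnt = 2, .off = 0 };
	uint8_t spare[4];
	const uint8_t *p;

	/* Split across a and b: expect a linearized copy in spare */
	p = pull_or_linearize(&t, sizeof(spare), 1, spare);
	assert(p == spare && p[0] == 1 && p[1] == 2 && p[2] == 3 && p[3] == 4);

	/* Next two bytes are contiguous in b: expect a pointer into b */
	p = pull_or_linearize(&t, 2, 1, spare);
	assert(p == &b[2]);

	/* Nothing left after that */
	assert(!pull_or_linearize(&t, 1, 1, spare));

	return 0;
}
```

The pull variant relies on the same lazy-offset property discussed above: after a linearized read it only adds len to the offset, and the next call prunes whatever buffers that offset has passed.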
> > > s/fail/tail/
> > >
> > > > + * @iov: IO vector
> > > > + * @cnt: Number of entries in @iov
> > > > + * @off: Current offset in @iov
> > > > + */
> > > > +struct iov_tail {
> > > > +	const struct iovec *iov;
> > > > +	size_t cnt, off;
> > > > +};
> > > > +
> > > > +#define IOV_TAIL(iov_, cnt_, off_) \
> > > > +	(struct iov_tail){ .iov = (iov_), .cnt = (cnt_), .off = (off_) }
> > > > +
> > > > +bool iov_tail_shorten(struct iov_tail *tail);
> > > > +size_t iov_tail_size(struct iov_tail *tail);
> > > > +void *iov_peek_header_(struct iov_tail *tail, size_t len, size_t align);
> > > > +#define IOV_PEEK_HEADER(tail_, ty_) \
> > > > +	((ty_ *)(iov_peek_header_((tail_), sizeof(ty_), __alignof__(ty_))))
> > >
> > > I guess 'x' would be as clear as 'ty_' (actually, I'm failing to guess
> > > what it stands for).
> >
> > "type"
>
> Oh. That's kind of obscure. I guess it would be better to actually
> spell that out and define this on multiple lines.
>
> > > > +void *iov_pull_header_(struct iov_tail *tail, size_t len, size_t align);
> > > > +#define IOV_PULL_HEADER(tail_, ty_) \
> > > > +	((ty_ *)(iov_pull_header_((tail_), sizeof(ty_), __alignof__(ty_))))
> > > > +
> > > >  #endif /* IOVEC_H */
> > >
> > > I would have expected some functions to add data or build those
> > > tails... or we don't need them for some reason?
> >
> > The IOV_TAIL() macro builds one. Adding data doesn't make sense; the
> > idea is this is a view into part of an existing IO vector, which
> > doesn't require copying the struct iovecs of that original vector.
>
> I see now, I didn't pay much attention to that. Perhaps some one-liner
> function-like comments around IOV_TAIL(), IOV_PEEK_HEADER() and
> IOV_PULL_HEADER() would help.

I've actually given them full function-style comments.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

--xfxIHJI4AfQ64x/h
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmctkRQACgkQzQJF27ox
2GdKxQ/+JG1waqasdMV2IZUPZzdBDkBr6LXdmMrcST5bXB6pqAqyjgiRMla7XzeH
Ng0S4aegaF6P6ginzzG9Hg1LkKFgEk2RIYDSKvwIkXD+dBXjrobfGLspsO6wirKO
0GSJyBaSW29ewNvRCJoFOoN/LjCkA/Kc+I8n0bMQej+KkexgS4/PHbd3+A0YCw/R
uGfdJsLuOwLkMvUeJo6jIS3m5XGaFt2Ey5TLzjQMvBUNsr0NUlApkhKueELyfX8t
N95pqCKLAi9u5TxETOYTRW2shQ6ojsBzaK2jU18wOrEEUaiu3nd4TN0PV2hwAVnP
/O95jh4VPGjP5WkWkIyotYCYFzs3LNUXiH2/esdkfx3PlWVfdXd1FoRfYgdy9E9x
HaGgQAiF2Q4kgF1hmkqpvUQAD6ejocVy7F7rGzMDVUrfFgimwa+tC2XSFev5K4PO
85sogqGtLjCG+i6Lp2jD4blUNGZpBUJ6oBBKUqk5EcFE+7m0Srlpo76hCnmxccFI
ip/YwMwde+N7CjzobX3OCzP2J7uAOXkOXTjl5UuKCRWKRkKSjqATb4NqbkS9Tfcx
Qq9ebviXbNl9VNWsehYyrgfqD+MoIt8O0SQxyp0p9jR9moxsXPEUWe7gYgmf7Qvf
J385AzwyjpJxX732pQaE4gHVx3RicuQc8I/o5i2CE+UdSFhHv2I=
=hjDF
-----END PGP SIGNATURE-----

--xfxIHJI4AfQ64x/h--