On Tue, Aug 05, 2025 at 05:46:28PM +0200, Laurent Vivier wrote:
> The packet pool was previously limited to handling packets contained
> within a single buffer.
>
> This patch extends the packet pool to support iovec array,
> allowing a single logical packet to be composed of multiple iovec.
>
> To accommodate this, the storage format within the pool is modified.
> For a multi-vector packet, a header entry is now stored first with
> iov_base = NULL and iov_len holding the number of subsequent
> vectors. The actual data vectors are then stored in the following
> pool slots.
>
> The packet_add_do() and packet_get_do() functions are updated to
> manage this new format for storing and retrieving packets. The
> pool_full() check is also adjusted to ensure there is enough
> space for all vectors of a new packet before adding it.
>
> Signed-off-by: Laurent Vivier
> ---
>  packet.c | 50 +++++++++++++++++++++++++++++++++-----------------
>  packet.h |  2 +-
>  tap.c    |  4 ++--
>  3 files changed, 36 insertions(+), 20 deletions(-)
>
> diff --git a/packet.c b/packet.c
> index 4b93688509a4..d697232d951a 100644
> --- a/packet.c
> +++ b/packet.c
> @@ -90,12 +90,13 @@ static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
>  /**
>   * pool_full() - Is a packet pool full?
>   * @p: Pointer to packet pool
> + * @data: check data can fit in the pool
>   *
> - * Return: true if the pool is full, false if more packets can be added
> + * Return: true if the pool is full, false if data can be added
>   */
> -bool pool_full(const struct pool *p)
> +bool pool_full(const struct pool *p, const struct iov_tail *data)

Given the slightly changed semantics, I wonder if 'pool_can_fit()'
might be a better name now.

>  {
> -	return p->count >= p->size;
> +	return p->count + data->cnt + (data->cnt > 1) >= p->size;

This test is only correct if data is already pruned.
As I've said elsewhere, it might be worth changing to the assumption
that iov_tails are pruned everywhere outside the iov_tail internal
handling.

Oh.. also I think the new check is off by one (in the relatively safe
direction).  It will say there's no room when there is just exactly
enough room.

>  }
>
>  /**
> @@ -108,11 +109,9 @@ bool pool_full(const struct pool *p)
>  void packet_add_do(struct pool *p, struct iov_tail *data,
>  		   const char *func, int line)
>  {
> -	size_t idx = p->count;
> -	const char *start;
> -	size_t len;
> +	size_t idx = p->count, i, offset;
>
> -	if (pool_full(p)) {
> +	if (pool_full(p, data)) {
>  		debug("add packet index %zu to pool with size %zu, %s:%i",
>  		      idx, p->size, func, line);
>  		return;
> @@ -121,18 +120,30 @@ void packet_add_do(struct pool *p, struct iov_tail *data,
>  	if (!iov_tail_prune(data))
>  		return;
>
> -	ASSERT(data->cnt == 1); /* we don't support iovec */
> +	if (data->cnt > 1) {
> +		p->pkt[idx].iov_base = NULL;
> +		p->pkt[idx].iov_len = data->cnt;
> +		idx++;
> +	}
>
> -	len = data->iov[0].iov_len - data->off;
> -	start = (char *)data->iov[0].iov_base + data->off;
> +	offset = data->off;
> +	for (i = 0; i < data->cnt; i++) {
> +		const char *start;
> +		size_t len;
>
> -	if (packet_check_range(p, start, len, func, line))
> -		return;
> +		len = data->iov[i].iov_len - offset;
> +		start = (char *)data->iov[i].iov_base + offset;
> +		offset = 0;
>
> -	p->pkt[idx].iov_base = (void *)start;
> -	p->pkt[idx].iov_len = len;
> +		if (packet_check_range(p, start, len, func, line))
> +			return;
>
> -	p->count++;
> +		p->pkt[idx].iov_base = (void *)start;
> +		p->pkt[idx].iov_len = len;
> +		idx++;

Hm.  Isn't the above equivalent to iov_tail_clone()?  Is calling
packet_check_range() on each chunk the only reason for open-coding it
here?
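
To make the off-by-one in the pool_full() check concrete, here is a
minimal standalone sketch.  It assumes the data is already pruned, and
it works on plain counts rather than struct pool / struct iov_tail.
slots_needed(), full_as_posted(), full_strict() and exact_fit_example()
are hypothetical names used only for illustration, they are not
functions in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: pool slots used by a pruned packet made of
 * @cnt vectors.  A multi-vector packet needs one extra header slot.
 */
static size_t slots_needed(size_t cnt)
{
	return cnt + (cnt > 1);
}

/* Same comparison as in the posted patch: ">=" reports "full" even
 * when the packet would fit exactly into the remaining slots.
 */
static bool full_as_posted(size_t count, size_t size, size_t cnt)
{
	return count + slots_needed(cnt) >= size;
}

/* Possible alternative (an assumption, not the author's code): only
 * report "full" when the packet really would not fit.
 */
static bool full_strict(size_t count, size_t size, size_t cnt)
{
	return count + slots_needed(cnt) > size;
}

/* Worked case: an 8-slot pool holding 4 entries, and an incoming
 * 3-vector packet that needs 4 slots, so it fits exactly.
 */
static void exact_fit_example(void)
{
	assert(full_as_posted(4, 8, 3));	/* rejected despite exact fit */
	assert(!full_strict(4, 8, 3));		/* accepted */
}
```

The single-vector case behaves the same way: with count == size - 1
the ">=" form reports full although one slot is still free, whereas
the old "p->count >= p->size" check would have accepted the packet.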
> +	}
> +
> +	p->count = idx;
>  }
>
>  /**
> @@ -162,9 +173,14 @@ bool packet_get_do(const struct pool *p, size_t idx,
>  		return false;
>  	}
>
> -	data->cnt = 1;
> +	if (p->pkt[idx].iov_base) {
> +		data->cnt = 1;
> +		data->iov = &p->pkt[idx];
> +	} else {
> +		data->cnt = p->pkt[idx].iov_len;
> +		data->iov = &p->pkt[idx + 1];
> +	}
>  	data->off = 0;
> -	data->iov = &p->pkt[idx];
>
>  	for (i = 0; i < data->cnt; i++) {
>  		ASSERT_WITH_MSG(!packet_check_range(p, data->iov[i].iov_base,
> diff --git a/packet.h b/packet.h
> index e51cbd19fdc4..67dc7deb17db 100644
> --- a/packet.h
> +++ b/packet.h
> @@ -37,7 +37,7 @@ void packet_add_do(struct pool *p, struct iov_tail *data,
>  		   const char *func, int line);
>  bool packet_get_do(const struct pool *p, const size_t idx,
>  		   struct iov_tail *data, const char *func, int line);
> -bool pool_full(const struct pool *p);
> +bool pool_full(const struct pool *p, const struct iov_tail *data);
>  void pool_flush(struct pool *p);
>
>  #define packet_add(p, data) \
> diff --git a/tap.c b/tap.c
> index 9fd00915bb01..95688b22fcb7 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -1103,14 +1103,14 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
>  	switch (ntohs(eh->h_proto)) {
>  	case ETH_P_ARP:
>  	case ETH_P_IP:
> -		if (pool_full(pool_tap4)) {
> +		if (pool_full(pool_tap4, data)) {
>  			tap4_handler(c, pool_tap4, now);
>  			pool_flush(pool_tap4);
>  		}
>  		packet_add(pool_tap4, data);
>  		break;
>  	case ETH_P_IPV6:
> -		if (pool_full(pool_tap6)) {
> +		if (pool_full(pool_tap6, data)) {
>  			tap6_handler(c, pool_tap6, now);
>  			pool_flush(pool_tap6);
>  		}

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson