From: Laurent Vivier <lvivier@redhat.com>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v7 00/31] Introduce discontiguous frames management
Date: Fri, 25 Jul 2025 13:45:36 +0200	[thread overview]
Message-ID: <a7496d32-c6a1-4f48-979f-16c9e8a7b611@redhat.com> (raw)
In-Reply-To: <a379a63f-4c29-48ce-b065-e3cb3fe94fdb@redhat.com>

On 24/07/2025 15:01, Laurent Vivier wrote:
> On 18/07/2025 20:45, Stefano Brivio wrote:
>> On Mon, 23 Jun 2025 13:06:04 +0200
>> Laurent Vivier <lvivier@redhat.com> wrote:
>>
>>> This series introduces iov_tail to convey frame information
>>> between functions.
>>>
>>> This is only an API change: for the moment the memory pool
>>> can only store contiguous buffers, so, except for a special
>>> case with vhost-user, we only deal with iovec arrays that
>>> have a single entry.
>>>
>>> v7:
>>>      - Add a patch to fix comment style of 'Return:'
>>>      - Fix ignore_arp()/accept_arp()
>>>      - Fix coverity error
>>>      - Fix several comments
>>
>> I was about to apply this without 1/31 (I applied the v2 of it you sent
>> outside of this series instead, which is actually up to date) and with
>> the minor comment fix to 31/31... but the test perf/passt_vu_tcp now
>> fails rather consistently (and I triple-checked that it doesn't happen
>> without this series):
>>
>> - "TCP throughput over IPv6: guest to host" with MTU 1500 and 9000
>>    bytes now reports between 0 and 0.6 Gbps. The guest kernel prints a
>>    series of two messages with ~1-10 µs interval:
>>
>> [   21.159827] TCP: out of memory -- consider tuning tcp_mem
>> [   21.159831] TCP: out of memory -- consider tuning tcp_mem
>>
>> - "TCP throughput over IPv4: guest to host" never reports 0 Gbps, but
>>    the throughput figure for large MTU (65520 bytes) is very low (5.4
>>    Gbps in the last run). Here I'm getting four messages:
>>
>> [   40.807818] TCP: out of memory -- consider tuning tcp_mem
>> [   40.807829] TCP: out of memory -- consider tuning tcp_mem
>> [   40.807829] TCP: out of memory -- consider tuning tcp_mem
>> [   40.807830] TCP: out of memory -- consider tuning tcp_mem
>>
>> - in the reverse direction, "TCP throughput over IPv4: host to guest"
>>    (but not with IPv6), the iperf3 client gets a SIGSEGV, though not
>>    consistently: it happened once in five runs.
>>
>> To me it smells a bit like we're leaking virtqueue slots but I looked
>> again at the whole series and I couldn't find anything obvious... at
>> least not yet.
>>
>> UDP tests never fail and the throughput is the same as before.
>>
> 
> I think the problem is the way we use the iovec array.
> 
> In tap4_handler() we have a packet_get() that returns a pointer into the iovec array 
> stored in the pool's buffer. Idx is 0, iovec idx is 0.
> 
> Then we have a pool_flush(), so the first available idx is 0 again.
> 
> And then we have a packet_add() with the iovec idx (in "data") from the previous 
> packet_get(), which we try to add at the same index (as the pool is empty again, the 
> first available idx is 0).
> 
> When I wrote this patch I assumed that once a packet is released (pool_flush()) its 
> iovec array in the pool is no longer used; that turns out not to be true.
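
To make this concrete, here is a minimal sketch of the sequence described above, using a 
single hypothetical pool "p" and the packet_get()/packet_add()/pool_flush() signatures 
from packet.h (not the actual tap4_handler() code):

struct iov_tail data;

/* data.iov ends up pointing at slot 0 of the iovec array stored in p->buf */
packet_get(p, 0, &data);

/* the pool is now empty again: the first free idx is 0 */
pool_flush(p);

/* stores new iovec entries at slot 0, overwriting what data.iov points to */
packet_add(p, &data);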

Could you try the following patch? (My iperf3 keeps crashing (<defunct>) on my host 
system, but that's unrelated to this problem.) It's a little bit ugly (it uses alloca()), 
but it's an easy fix.

diff --git a/packet.c b/packet.c
index 8dbe00af12c6..6d187192dd3a 100644
--- a/packet.c
+++ b/packet.c
@@ -155,18 +155,23 @@ static size_t packet_iov_next_idx(const struct pool *p, size_t idx,
   * @func:	For tracing: name of calling function
   * @line:	For tracing: caller line of function call
   */
-static void packet_iov_data(const struct pool *p, size_t idx,
-			    struct iov_tail *data,
-			    const char *func, int line)
+static size_t packet_iov_data(const struct pool *p, size_t idx,
+			      struct iovec *iov, size_t cnt,
+			      const char *func, int line)
  {
-	struct iovec *iov = (struct iovec *)p->buf;
-	size_t iov_idx, iov_cnt;
+	const struct iovec *src_iov = (struct iovec *)p->buf;
+	size_t src_idx, src_cnt;
+	size_t i;

-	iov_idx = packet_iov_idx(p, idx, &iov_cnt, func, line);
+	src_idx = packet_iov_idx(p, idx, &src_cnt, func, line);

-	data->iov = &iov[iov_idx];
-	data->cnt = iov_cnt;
-	data->off = 0;
+	if (cnt < src_cnt)
+		return 0;
+
+	for (i = 0; i < src_cnt; i++)
+		iov[i] = src_iov[src_idx + i];
+
+	return i;
  }

  /**
@@ -268,6 +273,7 @@ void packet_add_do(struct pool *p, struct iov_tail *data,
   */
  bool packet_get_do(const struct pool *p, size_t idx,
  		   struct iov_tail *data,
+		   struct iovec *iov, size_t cnt,
  		   const char *func, int line)
  {
  	ASSERT_WITH_MSG(p->count <= p->size,
@@ -281,12 +287,13 @@ bool packet_get_do(const struct pool *p, size_t idx,
  	}

  	if (p->memory) {
-		packet_iov_data(p, idx, data, func, line);
+		data->cnt = packet_iov_data(p, idx, iov, cnt, func, line);
+		data->iov = iov;
  	} else {
  		data->cnt = 1;
-		data->off = 0;
  		data->iov = &p->pkt[idx];
  	}
+	data->off = 0;

  	ASSERT_WITH_MSG(!packet_iov_check_range(p, data, func, line),
  			"Corrupt packet pool, %s:%i", func, line);
diff --git a/packet.h b/packet.h
index 45843a6775ab..f1e28b3f31d3 100644
--- a/packet.h
+++ b/packet.h
@@ -35,14 +35,16 @@ int vu_packet_check_range(struct vdev_memory *memory,
  void packet_add_do(struct pool *p, struct iov_tail *data,
  		   const char *func, int line);
  bool packet_get_do(const struct pool *p, const size_t idx,
-		   struct iov_tail *data, const char *func, int line);
+		   struct iov_tail *data,
+		   struct iovec *iov, size_t cnt,
+		   const char *func, int line);
  bool pool_full(const struct pool *p);
  void pool_flush(struct pool *p);

  #define packet_add(p, data)					\
  	packet_add_do(p, data, __func__, __LINE__)
  #define packet_get(p, idx, data)					\
-	packet_get_do(p, idx, data, __func__, __LINE__)
+	packet_get_do(p, idx, data, (struct iovec *)alloca(sizeof(struct iovec) * 10), 10, __func__, __LINE__)

  #define PACKET_POOL_DECL(_name, _size)					\
  struct _name ## _t {							\
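
With this change, the iovec entries returned by packet_get() are copied into an alloca()'d 
array in the caller's stack frame, so flushing the pool and adding a new packet no longer 
touches them. Roughly, for the same hypothetical sequence as above:

struct iov_tail data;

/* data.iov now points to a stack copy of (up to 10) iovec entries, not into p->buf */
packet_get(p, 0, &data);

/* rewriting slot 0 of the pool's iovec array doesn't affect the copy behind data.iov */
pool_flush(p);
packet_add(p, &data);

Note that with the fixed cap of 10 entries here, packet_iov_data() returns 0 (an empty 
tail) if a packet spans more segments than that.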


