public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH v2 00/12] Cleanups to packet pool handling and sizing
@ 2024-12-20  8:35 David Gibson
  2024-12-20  8:35 ` [PATCH v2 01/12] test focus David Gibson
                   ` (12 more replies)
  0 siblings, 13 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

This... is not any of the things I said I would be working on.  I can
only say that a herd of very hairy yaks led me astray.  Looking at bug
66 I spotted some problems with our handling of MTUs / maximum frame
sizes.  Looking at that I found some weirdness and some real, if
minor, bugs in the sizing and handling of the packet pools.

Changes in v2:
 * Stefano convinced me that packet_check_range() is still worthwhile.
   * So don't remove it... but in looking at it I spotted various
     flaws in the checks, so address those in a number of new patches.

David Gibson (12):
  test focus
  hack: stop on fail, but not perf fail
  make passt dumpable
  packet: Use flexible array member in struct pool
  packet: Don't pass start and offset separately too
    packet_check_range()
  packet: Don't hard code maximum packet size to UINT16_MAX
  packet: Remove unhelpful packet_get_try() macro
  util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers
  packet: Distinguish severities of different packet_{add,git}_do()
    errors
  packet: Move packet length checks into packet_check_range()
  tap: Don't size pool_tap[46] for the maximum number of packets
  packet: More cautious checks to avoid pointer arithmetic UB

 dhcpv6.c      |   2 +-
 ip.c          |   2 +-
 isolation.c   |   2 +-
 packet.c      | 106 ++++++++++++++++++++++----------------------------
 packet.h      |  19 ++++++---
 passt.h       |   2 -
 tap.c         |  18 +++++++--
 tap.h         |   3 +-
 test/lib/term |   1 +
 test/lib/test |   4 +-
 test/run      |  38 +++++++++---------
 util.c        |  19 +++++++++
 util.h        |  25 +++++-------
 vu_common.c   |  34 ++++++++++------
 14 files changed, 153 insertions(+), 122 deletions(-)

-- 
2.47.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 01/12] test focus
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 02/12] hack: stop on fail, but not perf fail David Gibson
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

---
 test/run | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/test/run b/test/run
index f188d8ea..fc710475 100755
--- a/test/run
+++ b/test/run
@@ -81,9 +81,9 @@ run() {
 	test passt/shutdown
 	teardown pasta
 
-	setup pasta_options
-	test pasta_options/log_to_file
-	teardown pasta_options
+	#setup pasta_options
+	#test pasta_options/log_to_file
+	#teardown pasta_options
 
 	setup build
 	test pasta_podman/bats
@@ -132,24 +132,24 @@ run() {
 
 	VALGRIND=0
 	VHOST_USER=0
-	setup passt_in_ns
-	test passt/ndp
-	test passt_in_ns/dhcp
-	test perf/passt_tcp
-	test perf/passt_udp
-	test perf/pasta_tcp
-	test perf/pasta_udp
-	test passt_in_ns/shutdown
-	teardown passt_in_ns
+	#setup passt_in_ns
+	#test passt/ndp
+	#test passt_in_ns/dhcp
+	#test perf/passt_tcp
+	#test perf/passt_udp
+	#test perf/pasta_tcp
+	#test perf/pasta_udp
+	#test passt_in_ns/shutdown
+	#teardown passt_in_ns
 
 	VHOST_USER=1
-	setup passt_in_ns
-	test passt_vu/ndp
-	test passt_vu_in_ns/dhcp
-	test perf/passt_vu_tcp
-	test perf/passt_vu_udp
-	test passt_vu_in_ns/shutdown
-	teardown passt_in_ns
+	#setup passt_in_ns
+	#test passt_vu/ndp
+	#test passt_vu_in_ns/dhcp
+	#test perf/passt_vu_tcp
+	#test perf/passt_vu_udp
+	#test passt_vu_in_ns/shutdown
+	#teardown passt_in_ns
 
 	# TODO: Make those faster by at least pre-installing gcc and make on
 	# non-x86 images, then re-enable.
-- 
@@ -81,9 +81,9 @@ run() {
 	test passt/shutdown
 	teardown pasta
 
-	setup pasta_options
-	test pasta_options/log_to_file
-	teardown pasta_options
+	#setup pasta_options
+	#test pasta_options/log_to_file
+	#teardown pasta_options
 
 	setup build
 	test pasta_podman/bats
@@ -132,24 +132,24 @@ run() {
 
 	VALGRIND=0
 	VHOST_USER=0
-	setup passt_in_ns
-	test passt/ndp
-	test passt_in_ns/dhcp
-	test perf/passt_tcp
-	test perf/passt_udp
-	test perf/pasta_tcp
-	test perf/pasta_udp
-	test passt_in_ns/shutdown
-	teardown passt_in_ns
+	#setup passt_in_ns
+	#test passt/ndp
+	#test passt_in_ns/dhcp
+	#test perf/passt_tcp
+	#test perf/passt_udp
+	#test perf/pasta_tcp
+	#test perf/pasta_udp
+	#test passt_in_ns/shutdown
+	#teardown passt_in_ns
 
 	VHOST_USER=1
-	setup passt_in_ns
-	test passt_vu/ndp
-	test passt_vu_in_ns/dhcp
-	test perf/passt_vu_tcp
-	test perf/passt_vu_udp
-	test passt_vu_in_ns/shutdown
-	teardown passt_in_ns
+	#setup passt_in_ns
+	#test passt_vu/ndp
+	#test passt_vu_in_ns/dhcp
+	#test perf/passt_vu_tcp
+	#test perf/passt_vu_udp
+	#test passt_vu_in_ns/shutdown
+	#teardown passt_in_ns
 
 	# TODO: Make those faster by at least pre-installing gcc and make on
 	# non-x86 images, then re-enable.
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 02/12] hack: stop on fail, but not perf fail
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
  2024-12-20  8:35 ` [PATCH v2 01/12] test focus David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 03/12] make passt dumpable David Gibson
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

---
 test/lib/term | 1 +
 test/lib/test | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/test/lib/term b/test/lib/term
index ed690de8..81191714 100755
--- a/test/lib/term
+++ b/test/lib/term
@@ -413,6 +413,7 @@ info_failed() {
 
 	[ ${FAST} -eq 1 ] || status_bar_blink
 
+	exit 1
 	pause_continue \
 		"Press any key to pause test session"		\
 		"Resuming in "					\
diff --git a/test/lib/test b/test/lib/test
index e6726bea..91729af7 100755
--- a/test/lib/test
+++ b/test/lib/test
@@ -101,7 +101,7 @@ test_one_line() {
 		IFS="${__ifs}"
 		;;
 	"test")
-		[ ${TEST_ONE_perf_nok} -eq 0 ] || TEST_ONE_nok=1
+		# [ ${TEST_ONE_perf_nok} -eq 0 ] || TEST_ONE_nok=1
 		[ ${TEST_ONE_nok} -eq 1 ] && status_test_fail
 		[ ${TEST_ONE_nok} -eq 0 ] && status_test_ok
 
@@ -374,7 +374,7 @@ test_one() {
 	[ ${DEMO} -eq 1 ] && return
 
 	[ ${TEST_ONE_skip} -eq 1 ] && status_test_skip && return
-	[ ${TEST_ONE_perf_nok} -eq 0 ] || TEST_ONE_nok=1
+	# [ ${TEST_ONE_perf_nok} -eq 0 ] || TEST_ONE_nok=1
 	[ ${TEST_ONE_nok} -eq 0 ] && status_test_ok || status_test_fail
 }
 
-- 
@@ -101,7 +101,7 @@ test_one_line() {
 		IFS="${__ifs}"
 		;;
 	"test")
-		[ ${TEST_ONE_perf_nok} -eq 0 ] || TEST_ONE_nok=1
+		# [ ${TEST_ONE_perf_nok} -eq 0 ] || TEST_ONE_nok=1
 		[ ${TEST_ONE_nok} -eq 1 ] && status_test_fail
 		[ ${TEST_ONE_nok} -eq 0 ] && status_test_ok
 
@@ -374,7 +374,7 @@ test_one() {
 	[ ${DEMO} -eq 1 ] && return
 
 	[ ${TEST_ONE_skip} -eq 1 ] && status_test_skip && return
-	[ ${TEST_ONE_perf_nok} -eq 0 ] || TEST_ONE_nok=1
+	# [ ${TEST_ONE_perf_nok} -eq 0 ] || TEST_ONE_nok=1
 	[ ${TEST_ONE_nok} -eq 0 ] && status_test_ok || status_test_fail
 }
 
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 03/12] make passt dumpable
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
  2024-12-20  8:35 ` [PATCH v2 01/12] test focus David Gibson
  2024-12-20  8:35 ` [PATCH v2 02/12] hack: stop on fail, but not perf fail David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 04/12] packet: Use flexible array member in struct pool David Gibson
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

---
 isolation.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/isolation.c b/isolation.c
index c944fb35..df58bb87 100644
--- a/isolation.c
+++ b/isolation.c
@@ -377,7 +377,7 @@ void isolate_postfork(const struct ctx *c)
 {
 	struct sock_fprog prog;
 
-	prctl(PR_SET_DUMPABLE, 0);
+//	prctl(PR_SET_DUMPABLE, 0);
 
 	switch (c->mode) {
 	case MODE_PASST:
-- 
@@ -377,7 +377,7 @@ void isolate_postfork(const struct ctx *c)
 {
 	struct sock_fprog prog;
 
-	prctl(PR_SET_DUMPABLE, 0);
+//	prctl(PR_SET_DUMPABLE, 0);
 
 	switch (c->mode) {
 	case MODE_PASST:
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 04/12] packet: Use flexible array member in struct pool
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (2 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 03/12] make passt dumpable David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 05/12] packet: Don't pass start and offset separately too packet_check_range() David Gibson
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Currently we have a dummy pkt[1] array, which we alias with an array of
a different size via various macros.  However, we already require C11 which
includes flexible array members, so we can do better.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 packet.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/packet.h b/packet.h
index 3f70e949..85ee5508 100644
--- a/packet.h
+++ b/packet.h
@@ -21,7 +21,7 @@ struct pool {
 	size_t buf_size;
 	size_t size;
 	size_t count;
-	struct iovec pkt[1];
+	struct iovec pkt[];
 };
 
 int vu_packet_check_range(void *buf, size_t offset, size_t len,
-- 
@@ -21,7 +21,7 @@ struct pool {
 	size_t buf_size;
 	size_t size;
 	size_t count;
-	struct iovec pkt[1];
+	struct iovec pkt[];
 };
 
 int vu_packet_check_range(void *buf, size_t offset, size_t len,
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 05/12] packet: Don't pass start and offset separately too packet_check_range()
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (3 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 04/12] packet: Use flexible array member in struct pool David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 06/12] packet: Don't hard code maximum packet size to UINT16_MAX David Gibson
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Fundamentally what packet_check_range() does is to check whether a given
memory range is within the allowed / expected memory set aside for packets
from a particular pool.  That range could represent a whole packet (from
packet_add_do()) or part of a packet (from packet_get_do()), but it doesn't
really matter which.

However, we pass the start of the range as two parameters: @start which is
the start of the packet, and @offset which is the offset within the packet
of the range we're interested in.  We never use these separately, only as
(start + offset).  Simplify the interface of packet_check_range() and
vu_packet_check_range() to directly take the start of the relevant range.
This will allow some additional future improvements.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 packet.c    | 36 +++++++++++++++++++-----------------
 packet.h    |  3 +--
 vu_common.c | 11 ++++-------
 3 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/packet.c b/packet.c
index 03a11e6a..0330b548 100644
--- a/packet.c
+++ b/packet.c
@@ -23,23 +23,22 @@
 #include "log.h"
 
 /**
- * packet_check_range() - Check if a packet memory range is valid
+ * packet_check_range() - Check if a memory range is valid for a pool
  * @p:		Packet pool
- * @offset:	Offset of data range in packet descriptor
+ * @ptr:	Start of desired data range
  * @len:	Length of desired data range
- * @start:	Start of the packet descriptor
  * @func:	For tracing: name of calling function
  * @line:	For tracing: caller line of function call
  *
  * Return: 0 if the range is valid, -1 otherwise
  */
-static int packet_check_range(const struct pool *p, size_t offset, size_t len,
-			      const char *start, const char *func, int line)
+static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
+			      const char *func, int line)
 {
 	if (p->buf_size == 0) {
 		int ret;
 
-		ret = vu_packet_check_range((void *)p->buf, offset, len, start);
+		ret = vu_packet_check_range((void *)p->buf, ptr, len);
 
 		if (ret == -1)
 			trace("cannot find region, %s:%i", func, line);
@@ -47,16 +46,16 @@ static int packet_check_range(const struct pool *p, size_t offset, size_t len,
 		return ret;
 	}
 
-	if (start < p->buf) {
-		trace("packet start %p before buffer start %p, "
-		      "%s:%i", (void *)start, (void *)p->buf, func, line);
+	if (ptr < p->buf) {
+		trace("packet range start %p before buffer start %p, %s:%i",
+		      (void *)ptr, (void *)p->buf, func, line);
 		return -1;
 	}
 
-	if (start + len + offset > p->buf + p->buf_size) {
-		trace("packet offset plus length %zu from size %zu, "
-		      "%s:%i", start - p->buf + len + offset,
-		      p->buf_size, func, line);
+	if (ptr + len > p->buf + p->buf_size) {
+		trace("packet range end %p after buffer end %p, %s:%i",
+		      (void *)(ptr + len), (void *)(p->buf + p->buf_size),
+		      func, line);
 		return -1;
 	}
 
@@ -81,7 +80,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 		return;
 	}
 
-	if (packet_check_range(p, 0, len, start, func, line))
+	if (packet_check_range(p, start, len, func, line))
 		return;
 
 	if (len > UINT16_MAX) {
@@ -110,6 +109,8 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 		    size_t len, size_t *left, const char *func, int line)
 {
+	char *ptr;
+
 	if (idx >= p->size || idx >= p->count) {
 		if (func) {
 			trace("packet %zu from pool size: %zu, count: %zu, "
@@ -135,14 +136,15 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 		return NULL;
 	}
 
-	if (packet_check_range(p, offset, len, p->pkt[idx].iov_base,
-			       func, line))
+	ptr = (char *)p->pkt[idx].iov_base + offset;
+
+	if (packet_check_range(p, ptr, len, func, line))
 		return NULL;
 
 	if (left)
 		*left = p->pkt[idx].iov_len - offset - len;
 
-	return (char *)p->pkt[idx].iov_base + offset;
+	return ptr;
 }
 
 /**
diff --git a/packet.h b/packet.h
index 85ee5508..bdc07fef 100644
--- a/packet.h
+++ b/packet.h
@@ -24,8 +24,7 @@ struct pool {
 	struct iovec pkt[];
 };
 
-int vu_packet_check_range(void *buf, size_t offset, size_t len,
-			  const char *start);
+int vu_packet_check_range(void *buf, const char *ptr, size_t len);
 void packet_add_do(struct pool *p, size_t len, const char *start,
 		   const char *func, int line);
 void *packet_get_do(const struct pool *p, const size_t idx,
diff --git a/vu_common.c b/vu_common.c
index 299b5a32..531f8786 100644
--- a/vu_common.c
+++ b/vu_common.c
@@ -22,14 +22,12 @@
  * vu_packet_check_range() - Check if a given memory zone is contained in
  * 			     a mapped guest memory region
  * @buf:	Array of the available memory regions
- * @offset:	Offset of data range in packet descriptor
+ * @ptr:	Start of desired data range
  * @size:	Length of desired data range
- * @start:	Start of the packet descriptor
  *
  * Return: 0 if the zone is in a mapped memory region, -1 otherwise
  */
-int vu_packet_check_range(void *buf, size_t offset, size_t len,
-			  const char *start)
+int vu_packet_check_range(void *buf, const char *ptr, size_t len)
 {
 	struct vu_dev_region *dev_region;
 
@@ -37,9 +35,8 @@ int vu_packet_check_range(void *buf, size_t offset, size_t len,
 		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
 		char *m = (char *)(uintptr_t)dev_region->mmap_addr;
 
-		if (m <= start &&
-		    start + offset + len <= m + dev_region->mmap_offset +
-					       dev_region->size)
+		if (m <= ptr &&
+		    ptr + len <= m + dev_region->mmap_offset + dev_region->size)
 			return 0;
 	}
 
-- 
@@ -22,14 +22,12 @@
  * vu_packet_check_range() - Check if a given memory zone is contained in
  * 			     a mapped guest memory region
  * @buf:	Array of the available memory regions
- * @offset:	Offset of data range in packet descriptor
+ * @ptr:	Start of desired data range
  * @size:	Length of desired data range
- * @start:	Start of the packet descriptor
  *
  * Return: 0 if the zone is in a mapped memory region, -1 otherwise
  */
-int vu_packet_check_range(void *buf, size_t offset, size_t len,
-			  const char *start)
+int vu_packet_check_range(void *buf, const char *ptr, size_t len)
 {
 	struct vu_dev_region *dev_region;
 
@@ -37,9 +35,8 @@ int vu_packet_check_range(void *buf, size_t offset, size_t len,
 		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
 		char *m = (char *)(uintptr_t)dev_region->mmap_addr;
 
-		if (m <= start &&
-		    start + offset + len <= m + dev_region->mmap_offset +
-					       dev_region->size)
+		if (m <= ptr &&
+		    ptr + len <= m + dev_region->mmap_offset + dev_region->size)
 			return 0;
 	}
 
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 06/12] packet: Don't hard code maximum packet size to UINT16_MAX
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (4 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 05/12] packet: Don't pass start and offset separately too packet_check_range() David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 07/12] packet: Remove unhelpful packet_get_try() macro David Gibson
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

We verify that every packet we store in a pool - and every partial packet
we retreive from it has a length no longer than UINT16_MAX.  This
originated in the older packet pool implementation which stored packet
lengths in a uint16_t.  Now, that packets are represented by a struct
iovec with its size_t length, this check serves only as a sanity / security
check that we don't have some wildly out of range length due to a bug
elsewhere.

However, UINT16_MAX (65535) isn't quite enough, because the "packet" as
stored in the pool is in fact an entire frame including both L2 and any
backend specific headers.  We can exceed this in passt mode, even with the
default MTU: 65520 bytes of IP datagram + 14 bytes of Ethernet header +
4 bytes of qemu stream length header = 65538 bytes.

Introduce our own define for the maximum length of a packet in the pool and
set it slightly larger, allowing 128 bytes for L2 and/or other backend
specific headers.  We'll use different amounts of that depending on the
tap backend, but since this is just a sanity check, the bound doesn't need
to be 100% tight.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 packet.c | 4 ++--
 packet.h | 7 +++++++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/packet.c b/packet.c
index 0330b548..bcac0375 100644
--- a/packet.c
+++ b/packet.c
@@ -83,7 +83,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 	if (packet_check_range(p, start, len, func, line))
 		return;
 
-	if (len > UINT16_MAX) {
+	if (len > PACKET_MAX_LEN) {
 		trace("add packet length %zu, %s:%i", len, func, line);
 		return;
 	}
@@ -119,7 +119,7 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 		return NULL;
 	}
 
-	if (len > UINT16_MAX) {
+	if (len > PACKET_MAX_LEN) {
 		if (func) {
 			trace("packet data length %zu, %s:%i",
 			      len, func, line);
diff --git a/packet.h b/packet.h
index bdc07fef..05d37695 100644
--- a/packet.h
+++ b/packet.h
@@ -6,6 +6,13 @@
 #ifndef PACKET_H
 #define PACKET_H
 
+/* Maximum size of a single packet in a pool, including all headers.  Sized to
+ * allow a maximum size IP datagram (65535) plus some extra space or L2 and
+ * backend specific headers.  This is just for sanity checking, so doesn't need
+ * to be a tight limit.
+ */
+#define PACKET_MAX_LEN	ROUND_UP(IP_MAX_MTU + 128, sizeof(unsigned long))
+
 /**
  * struct pool - Generic pool of packets stored in a buffer
  * @buf:	Buffer storing packet descriptors,
-- 
@@ -6,6 +6,13 @@
 #ifndef PACKET_H
 #define PACKET_H
 
+/* Maximum size of a single packet in a pool, including all headers.  Sized to
+ * allow a maximum size IP datagram (65535) plus some extra space or L2 and
+ * backend specific headers.  This is just for sanity checking, so doesn't need
+ * to be a tight limit.
+ */
+#define PACKET_MAX_LEN	ROUND_UP(IP_MAX_MTU + 128, sizeof(unsigned long))
+
 /**
  * struct pool - Generic pool of packets stored in a buffer
  * @buf:	Buffer storing packet descriptors,
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 07/12] packet: Remove unhelpful packet_get_try() macro
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (5 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 06/12] packet: Don't hard code maximum packet size to UINT16_MAX David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 08/12] util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers David Gibson
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Two places in the code use the packet_get_try() variant on packet_get().
The difference is that packet_get_try() passes a NULL 'func' to
packet_get_do(), which suppresses log messages.  The places we use this
are ones where we expect to sometimes have a failure retreiving the packet
range, even in normal cases.  So, suppressing the log messages seems like
it makes sense, except:

1) It suppresses log messages on all errors.  We (somewhat) expect to hit
   the case where the requested range is not within the received packet.
   However, it also suppresses message in cases where the requested packet
   index doesn't exist, the requested range has a nonsensical length or
   doesn't lie in even the right vague part of memory.  None of those
   latter cases are expected, and the message would be helpful if we ever
   actually hit them.

2) The suppressed messages aren't actually that disruptive.  For the case
   in ip.c, we'd log only if we run out of IPv6 packet before reaching a
   (non-option) L4 header.  That shouldn't be the case in normal operation
   so getting a message (at trave level) is not unreasonable.
   For the case in dhcpv6.c we do suppress a message every time we look for
   but don't find a specific DHCPv6 option.  That can happen in perfectly
   ok cases, but, again these are trace() level and DHCPv6 transactions
   aren't that common.  Suppressing the message for this case doesn't
   seem worth the drawbacks of (1).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 dhcpv6.c |  2 +-
 ip.c     |  2 +-
 packet.c | 18 +++++-------------
 packet.h |  3 ---
 4 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/dhcpv6.c b/dhcpv6.c
index 0523bba8..c0d05131 100644
--- a/dhcpv6.c
+++ b/dhcpv6.c
@@ -271,7 +271,7 @@ static struct opt_hdr *dhcpv6_opt(const struct pool *p, size_t *offset,
 	if (!*offset)
 		*offset = sizeof(struct udphdr) + sizeof(struct msg_hdr);
 
-	while ((o = packet_get_try(p, 0, *offset, sizeof(*o), &left))) {
+	while ((o = packet_get(p, 0, *offset, sizeof(*o), &left))) {
 		unsigned int opt_len = ntohs(o->l) + sizeof(*o);
 
 		if (ntohs(o->l) > left)
diff --git a/ip.c b/ip.c
index 2cc7f654..e391f81b 100644
--- a/ip.c
+++ b/ip.c
@@ -51,7 +51,7 @@ char *ipv6_l4hdr(const struct pool *p, int idx, size_t offset, uint8_t *proto,
 	if (!IPV6_NH_OPT(nh))
 		goto found;
 
-	while ((o = packet_get_try(p, idx, offset, sizeof(*o), dlen))) {
+	while ((o = packet_get(p, idx, offset, sizeof(*o), dlen))) {
 		nh = o->nexthdr;
 		hdrlen = (o->hdrlen + 1) * 8;
 
diff --git a/packet.c b/packet.c
index bcac0375..c921aa15 100644
--- a/packet.c
+++ b/packet.c
@@ -112,27 +112,19 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 	char *ptr;
 
 	if (idx >= p->size || idx >= p->count) {
-		if (func) {
-			trace("packet %zu from pool size: %zu, count: %zu, "
-			      "%s:%i", idx, p->size, p->count, func, line);
-		}
+		trace("packet %zu from pool size: %zu, count: %zu, %s:%i",
+		      idx, p->size, p->count, func, line);
 		return NULL;
 	}
 
 	if (len > PACKET_MAX_LEN) {
-		if (func) {
-			trace("packet data length %zu, %s:%i",
-			      len, func, line);
-		}
+		trace("packet data length %zu, %s:%i", len, func, line);
 		return NULL;
 	}
 
 	if (len + offset > p->pkt[idx].iov_len) {
-		if (func) {
-			trace("data length %zu, offset %zu from length %zu, "
-			      "%s:%i", len, offset, p->pkt[idx].iov_len,
-			      func, line);
-		}
+		trace("data length %zu, offset %zu from length %zu, %s:%i",
+		      len, offset, p->pkt[idx].iov_len, func, line);
 		return NULL;
 	}
 
diff --git a/packet.h b/packet.h
index 05d37695..f95cda08 100644
--- a/packet.h
+++ b/packet.h
@@ -45,9 +45,6 @@ void pool_flush(struct pool *p);
 #define packet_get(p, idx, offset, len, left)				\
 	packet_get_do(p, idx, offset, len, left, __func__, __LINE__)
 
-#define packet_get_try(p, idx, offset, len, left)			\
-	packet_get_do(p, idx, offset, len, left, NULL, 0)
-
 #define PACKET_POOL_DECL(_name, _size, _buf)				\
 struct _name ## _t {							\
 	char *buf;							\
-- 
@@ -45,9 +45,6 @@ void pool_flush(struct pool *p);
 #define packet_get(p, idx, offset, len, left)				\
 	packet_get_do(p, idx, offset, len, left, __func__, __LINE__)
 
-#define packet_get_try(p, idx, offset, len, left)			\
-	packet_get_do(p, idx, offset, len, left, NULL, 0)
-
 #define PACKET_POOL_DECL(_name, _size, _buf)				\
 struct _name ## _t {							\
 	char *buf;							\
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 08/12] util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (6 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 07/12] packet: Remove unhelpful packet_get_try() macro David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 09/12] packet: Distinguish severities of different packet_{add,git}_do() errors David Gibson
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

We already have the ASSERT() macro which will abort() passt based on a
condition.  It always has a fixed error message based on its location and
the asserted expression.  We have some upcoming cases where we want to
customise the message when hitting an assert.

Add abort_with_msg() and ASSERT_WITH_MSG() helpers to allow this.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 util.c | 19 +++++++++++++++++++
 util.h | 25 ++++++++++---------------
 2 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/util.c b/util.c
index 11973c44..a62c255b 100644
--- a/util.c
+++ b/util.c
@@ -837,3 +837,22 @@ void raw_random(void *buf, size_t buflen)
 	if (random_read < buflen)
 		die("Unexpected EOF on random data source");
 }
+
+/**
+ * abort_with_msg() - Print error message and abort
+ * @fmt:	Format string
+ * @...:	Format parameters
+ */
+void abort_with_msg(const char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	vlogmsg(true, false, LOG_CRIT, fmt, ap);
+	va_end(ap);
+
+	/* This may actually cause a SIGSYS instead of SIGABRT, due to seccomp,
+	 * but that will still get the job done.
+	 */
+	abort();
+}
diff --git a/util.h b/util.h
index 3fa1d125..020e75a5 100644
--- a/util.h
+++ b/util.h
@@ -67,27 +67,22 @@
 #define STRINGIFY(x)	#x
 #define STR(x)		STRINGIFY(x)
 
-#ifdef CPPCHECK_6936
+void abort_with_msg(const char *fmt, ...)
+	__attribute__((format(printf, 1, 2), noreturn));
+
 /* Some cppcheck versions get confused by aborts inside a loop, causing
  * it to give false positive uninitialised variable warnings later in
  * the function, because it doesn't realise the non-initialising path
  * already exited.  See https://trac.cppcheck.net/ticket/13227
+ *
+ * Therefore, avoid using the usual do while wrapper we use to force the macro
+ * to act like a single statement requiring a ';'.
  */
-#define ASSERT(expr)		\
-	((expr) ? (void)0 : abort())
-#else
+#define ASSERT_WITH_MSG(expr, ...)					\
+	((expr) ? (void)0 : abort_with_msg(__VA_ARGS__))
 #define ASSERT(expr)							\
-	do {								\
-		if (!(expr)) {						\
-			err("ASSERTION FAILED in %s (%s:%d): %s",	\
-			    __func__, __FILE__, __LINE__, STRINGIFY(expr)); \
-			/* This may actually SIGSYS, due to seccomp,	\
-			 * but that will still get the job done		\
-			 */						\
-			abort();					\
-		}							\
-	} while (0)
-#endif
+	ASSERT_WITH_MSG((expr), "ASSSERTION FAILED in %s (%s:%d): %s",	\
+			__func__, __FILE__, __LINE__, STRINGIFY(expr))
 
 #ifdef P_tmpdir
 #define TMPDIR		P_tmpdir
-- 
@@ -67,27 +67,22 @@
 #define STRINGIFY(x)	#x
 #define STR(x)		STRINGIFY(x)
 
-#ifdef CPPCHECK_6936
+void abort_with_msg(const char *fmt, ...)
+	__attribute__((format(printf, 1, 2), noreturn));
+
 /* Some cppcheck versions get confused by aborts inside a loop, causing
  * it to give false positive uninitialised variable warnings later in
  * the function, because it doesn't realise the non-initialising path
  * already exited.  See https://trac.cppcheck.net/ticket/13227
+ *
+ * Therefore, avoid using the usual do while wrapper we use to force the macro
+ * to act like a single statement requiring a ';'.
  */
-#define ASSERT(expr)		\
-	((expr) ? (void)0 : abort())
-#else
+#define ASSERT_WITH_MSG(expr, ...)					\
+	((expr) ? (void)0 : abort_with_msg(__VA_ARGS__))
 #define ASSERT(expr)							\
-	do {								\
-		if (!(expr)) {						\
-			err("ASSERTION FAILED in %s (%s:%d): %s",	\
-			    __func__, __FILE__, __LINE__, STRINGIFY(expr)); \
-			/* This may actually SIGSYS, due to seccomp,	\
-			 * but that will still get the job done		\
-			 */						\
-			abort();					\
-		}							\
-	} while (0)
-#endif
+	ASSERT_WITH_MSG((expr), "ASSSERTION FAILED in %s (%s:%d): %s",	\
+			__func__, __FILE__, __LINE__, STRINGIFY(expr))
 
 #ifdef P_tmpdir
 #define TMPDIR		P_tmpdir
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 09/12] packet: Distinguish severities of different packet_{add,git}_do() errors
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (7 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 08/12] util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 10/12] packet: Move packet length checks into packet_check_range() David Gibson
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

packet_add() and packet_get() can fail for a number of reasons, and we
currently treat them all basically the same: we log a trace() level message
and for packet_get() we return NULL.  However the different causes of
error are quite different and suggest different handling:

1) If we run out of space in the pool on add, that's (rightly) non-fatal,
   but is an unusual situation which we might want to know about.  Promote
   it's message to debug() level.

2) All packet_check_range() errors and the checks on packet length indicate
   a serious problem.  Due to either a bug in calling code, or some sort
   of memory-clobbering, we've been given addresses or sizes of packets
   that are nonsensical.  If things are this bad, we shouldn't try to carry
   on replace these with asserts.

3) Requesting a packet index that doesn't exist in the pool in packet_get()
   indicates a bug in the caller - it should always check the index first.
   Replace this with an assert.

4) On packet_get() requesting a section of the packet beyond its bounds is
   correct already.  This happens routinely from some callers when they
   probe to see if certain parts of the packet are present.  At worst it
   indicates that the guest has generate a malformed packet, which we
   should quietly drop as we already do.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 packet.c    | 71 ++++++++++++++++++++---------------------------------
 packet.h    |  3 ++-
 vu_common.c | 13 +++++++---
 3 files changed, 38 insertions(+), 49 deletions(-)

diff --git a/packet.c b/packet.c
index c921aa15..24f12448 100644
--- a/packet.c
+++ b/packet.c
@@ -30,37 +30,27 @@
  * @func:	For tracing: name of calling function
  * @line:	For tracing: caller line of function call
  *
- * Return: 0 if the range is valid, -1 otherwise
+ * ASSERT()s if the given range isn't valid (within the expected buffer(s) for
+ * the pool).
  */
-static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
-			      const char *func, int line)
+static void packet_check_range(const struct pool *p,
+			       const char *ptr, size_t len,
+			       const char *func, int line)
 {
 	if (p->buf_size == 0) {
-		int ret;
-
-		ret = vu_packet_check_range((void *)p->buf, ptr, len);
-
-		if (ret == -1)
-			trace("cannot find region, %s:%i", func, line);
-
-		return ret;
-	}
-
-	if (ptr < p->buf) {
-		trace("packet range start %p before buffer start %p, %s:%i",
-		      (void *)ptr, (void *)p->buf, func, line);
-		return -1;
-	}
-
-	if (ptr + len > p->buf + p->buf_size) {
-		trace("packet range end %p after buffer end %p, %s:%i",
-		      (void *)(ptr + len), (void *)(p->buf + p->buf_size),
-		      func, line);
-		return -1;
+		vu_packet_check_range((void *)p->buf, ptr, len, func, line);
+		return;
 	}
 
-	return 0;
+	ASSERT_WITH_MSG(ptr >= p->buf,
+			"packet range start %p before buffer start %p, %s:%i",
+			(void *)ptr, (void *)p->buf, func, line);
+	ASSERT_WITH_MSG(ptr + len <= p->buf + p->buf_size,
+			"packet range end %p after buffer end %p, %s:%i",
+			(void *)(ptr + len), (void *)(p->buf + p->buf_size),
+			func, line);
 }
+
 /**
  * packet_add_do() - Add data as packet descriptor to given pool
  * @p:		Existing pool
@@ -75,18 +65,16 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 	size_t idx = p->count;
 
 	if (idx >= p->size) {
-		trace("add packet index %zu to pool with size %zu, %s:%i",
+		debug("add packet index %zu to pool with size %zu, %s:%i",
 		      idx, p->size, func, line);
 		return;
 	}
 
-	if (packet_check_range(p, start, len, func, line))
-		return;
+	packet_check_range(p, start, len, func, line);
 
-	if (len > PACKET_MAX_LEN) {
-		trace("add packet length %zu, %s:%i", len, func, line);
-		return;
-	}
+	ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
+			"add packet length %zu (max %zu), %s:%i",
+			len, PACKET_MAX_LEN, func, line);
 
 	p->pkt[idx].iov_base = (void *)start;
 	p->pkt[idx].iov_len = len;
@@ -111,16 +99,12 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 {
 	char *ptr;
 
-	if (idx >= p->size || idx >= p->count) {
-		trace("packet %zu from pool size: %zu, count: %zu, %s:%i",
-		      idx, p->size, p->count, func, line);
-		return NULL;
-	}
-
-	if (len > PACKET_MAX_LEN) {
-		trace("packet data length %zu, %s:%i", len, func, line);
-		return NULL;
-	}
+	ASSERT_WITH_MSG(idx < p->size && idx < p->count,
+			"packet %zu from pool size: %zu, count: %zu, %s:%i",
+			idx, p->size, p->count, func, line);
+	ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
+			"packet range length %zu (max %zu), %s:%i",
+			len, PACKET_MAX_LEN, func, line);
 
 	if (len + offset > p->pkt[idx].iov_len) {
 		trace("data length %zu, offset %zu from length %zu, %s:%i",
@@ -130,8 +114,7 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 
 	ptr = (char *)p->pkt[idx].iov_base + offset;
 
-	if (packet_check_range(p, ptr, len, func, line))
-		return NULL;
+	packet_check_range(p, ptr, len, func, line);
 
 	if (left)
 		*left = p->pkt[idx].iov_len - offset - len;
diff --git a/packet.h b/packet.h
index f95cda08..b164f77e 100644
--- a/packet.h
+++ b/packet.h
@@ -31,7 +31,8 @@ struct pool {
 	struct iovec pkt[];
 };
 
-int vu_packet_check_range(void *buf, const char *ptr, size_t len);
+void vu_packet_check_range(void *buf, const char *ptr, size_t len,
+			   const char *func, int line);
 void packet_add_do(struct pool *p, size_t len, const char *start,
 		   const char *func, int line);
 void *packet_get_do(const struct pool *p, const size_t idx,
diff --git a/vu_common.c b/vu_common.c
index 531f8786..528b9b08 100644
--- a/vu_common.c
+++ b/vu_common.c
@@ -24,10 +24,13 @@
  * @buf:	Array of the available memory regions
  * @ptr:	Start of desired data range
  * @size:	Length of desired data range
+ * @func:	For tracing: name of calling function
+ * @line:	For tracing: caller line of function call
  *
- * Return: 0 if the zone is in a mapped memory region, -1 otherwise
+ * Aborts if the zone isn't in any of the device regions
  */
-int vu_packet_check_range(void *buf, const char *ptr, size_t len)
+void vu_packet_check_range(void *buf, const char *ptr, size_t len,
+			   const char *func, int line)
 {
 	struct vu_dev_region *dev_region;
 
@@ -37,10 +40,12 @@ int vu_packet_check_range(void *buf, const char *ptr, size_t len)
 
 		if (m <= ptr &&
 		    ptr + len <= m + dev_region->mmap_offset + dev_region->size)
-			return 0;
+			return;
 	}
 
-	return -1;
+	abort_with_msg(
+		"package range at %p, length %zd not within dev region %s:%i",
+		(void *)ptr, len, func, line);
 }
 
 /**
-- 
@@ -24,10 +24,13 @@
  * @buf:	Array of the available memory regions
  * @ptr:	Start of desired data range
  * @size:	Length of desired data range
+ * @func:	For tracing: name of calling function
+ * @line:	For tracing: caller line of function call
  *
- * Return: 0 if the zone is in a mapped memory region, -1 otherwise
+ * Aborts if the zone isn't in any of the device regions
  */
-int vu_packet_check_range(void *buf, const char *ptr, size_t len)
+void vu_packet_check_range(void *buf, const char *ptr, size_t len,
+			   const char *func, int line)
 {
 	struct vu_dev_region *dev_region;
 
@@ -37,10 +40,12 @@ int vu_packet_check_range(void *buf, const char *ptr, size_t len)
 
 		if (m <= ptr &&
 		    ptr + len <= m + dev_region->mmap_offset + dev_region->size)
-			return 0;
+			return;
 	}
 
-	return -1;
+	abort_with_msg(
+		"package range at %p, length %zd not within dev region %s:%i",
+		(void *)ptr, len, func, line);
 }
 
 /**
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 10/12] packet: Move packet length checks into packet_check_range()
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (8 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 09/12] packet: Distinguish severities of different packet_{add,git}_do() errors David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 11/12] tap: Don't size pool_tap[46] for the maximum number of packets David Gibson
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Both packet_add_do() and packet_get_do() have a check on the given length,
essentially sanity checking it before validating that it's in an expected
memory region.  This can be folded into packet_check_range() which performs
similar checks for both functions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 packet.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/packet.c b/packet.c
index 24f12448..9e0e6555 100644
--- a/packet.c
+++ b/packet.c
@@ -37,6 +37,10 @@ static void packet_check_range(const struct pool *p,
 			       const char *ptr, size_t len,
 			       const char *func, int line)
 {
+	ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
+			"packet_check_range length %zu (max %zu), %s:%i",
+			len, PACKET_MAX_LEN, func, line);
+
 	if (p->buf_size == 0) {
 		vu_packet_check_range((void *)p->buf, ptr, len, func, line);
 		return;
@@ -72,10 +76,6 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 
 	packet_check_range(p, start, len, func, line);
 
-	ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
-			"add packet length %zu (max %zu), %s:%i",
-			len, PACKET_MAX_LEN, func, line);
-
 	p->pkt[idx].iov_base = (void *)start;
 	p->pkt[idx].iov_len = len;
 
@@ -102,9 +102,6 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 	ASSERT_WITH_MSG(idx < p->size && idx < p->count,
 			"packet %zu from pool size: %zu, count: %zu, %s:%i",
 			idx, p->size, p->count, func, line);
-	ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
-			"packet range length %zu (max %zu), %s:%i",
-			len, PACKET_MAX_LEN, func, line);
 
 	if (len + offset > p->pkt[idx].iov_len) {
 		trace("data length %zu, offset %zu from length %zu, %s:%i",
-- 
@@ -37,6 +37,10 @@ static void packet_check_range(const struct pool *p,
 			       const char *ptr, size_t len,
 			       const char *func, int line)
 {
+	ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
+			"packet_check_range length %zu (max %zu), %s:%i",
+			len, PACKET_MAX_LEN, func, line);
+
 	if (p->buf_size == 0) {
 		vu_packet_check_range((void *)p->buf, ptr, len, func, line);
 		return;
@@ -72,10 +76,6 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 
 	packet_check_range(p, start, len, func, line);
 
-	ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
-			"add packet length %zu (max %zu), %s:%i",
-			len, PACKET_MAX_LEN, func, line);
-
 	p->pkt[idx].iov_base = (void *)start;
 	p->pkt[idx].iov_len = len;
 
@@ -102,9 +102,6 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 	ASSERT_WITH_MSG(idx < p->size && idx < p->count,
 			"packet %zu from pool size: %zu, count: %zu, %s:%i",
 			idx, p->size, p->count, func, line);
-	ASSERT_WITH_MSG(len <= PACKET_MAX_LEN,
-			"packet range length %zu (max %zu), %s:%i",
-			len, PACKET_MAX_LEN, func, line);
 
 	if (len + offset > p->pkt[idx].iov_len) {
 		trace("data length %zu, offset %zu from length %zu, %s:%i",
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 11/12] tap: Don't size pool_tap[46] for the maximum number of packets
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (9 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 10/12] packet: Move packet length checks into packet_check_range() David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  8:35 ` [PATCH v2 12/12] packet: More cautious checks to avoid pointer arithmetic UB David Gibson
  2024-12-20  9:00 ` [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Currently we attempt to size pool_tap[46] so they have room for the maximum
possible number of packets that could fit in pkt_buf, TAP_MSGS.  However,
the calculation isn't quite correct: TAP_MSGS is based on ETH_ZLEN (60) as
the minimum possible L2 frame size.  But, we don't enforce that L2 frames
are at least ETH_ZLEN when we receive them from the tap backend, and since
we're dealing with virtual interfaces we don't have the physical Ethernet
limitations requiring that length.  Indeed it is possible to generate a
legitimate frame smaller than that (e.g. a zero-payload UDP/IPv4 frame on
the 'pasta' backend is only 42 bytes long).

It's also unclear if this limit is sufficient for vhost-user which isn't
limited by the size of pkt_buf as the other modes are.

We could attempt to correct the calculation, but that would leave us with
even larger arrays, which in practice rarely accumulate more than a handful
of packets.  So, instead, put an arbitrary cap on the number of packets we
can put in a batch, and if we run out of space, process and flush the
batch.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 packet.c    | 13 ++++++++++++-
 packet.h    |  3 +++
 passt.h     |  2 --
 tap.c       | 18 +++++++++++++++---
 tap.h       |  3 ++-
 vu_common.c |  3 ++-
 6 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/packet.c b/packet.c
index 9e0e6555..c3b80924 100644
--- a/packet.c
+++ b/packet.c
@@ -55,6 +55,17 @@ static void packet_check_range(const struct pool *p,
 			func, line);
 }
 
+/**
+ * pool_full() - Is a packet pool full?
+ * @p:		Pointer to packet pool
+ *
+ * Return: true if the pool is full, false if more packets can be added
+ */
+bool pool_full(const struct pool *p)
+{
+	return p->count >= p->size;
+}
+
 /**
  * packet_add_do() - Add data as packet descriptor to given pool
  * @p:		Existing pool
@@ -68,7 +79,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 {
 	size_t idx = p->count;
 
-	if (idx >= p->size) {
+	if (pool_full(p)) {
 		debug("add packet index %zu to pool with size %zu, %s:%i",
 		      idx, p->size, func, line);
 		return;
diff --git a/packet.h b/packet.h
index b164f77e..8d78841b 100644
--- a/packet.h
+++ b/packet.h
@@ -6,6 +6,8 @@
 #ifndef PACKET_H
 #define PACKET_H
 
+#include <stdbool.h>
+
 /* Maximum size of a single packet in a pool, including all headers.  Sized to
  * allow a maximum size IP datagram (65535) plus some extra space or L2 and
  * backend specific headers.  This is just for sanity checking, so doesn't need
@@ -38,6 +40,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 void *packet_get_do(const struct pool *p, const size_t idx,
 		    size_t offset, size_t len, size_t *left,
 		    const char *func, int line);
+bool pool_full(const struct pool *p);
 void pool_flush(struct pool *p);
 
 #define packet_add(p, len, start)					\
diff --git a/passt.h b/passt.h
index 0dd4efa0..81b2787f 100644
--- a/passt.h
+++ b/passt.h
@@ -70,8 +70,6 @@ static_assert(sizeof(union epoll_ref) <= sizeof(union epoll_data),
 
 #define TAP_BUF_BYTES							\
 	ROUND_DOWN(((ETH_MAX_MTU + sizeof(uint32_t)) * 128), PAGE_SIZE)
-#define TAP_MSGS							\
-	DIV_ROUND_UP(TAP_BUF_BYTES, ETH_ZLEN - 2 * ETH_ALEN + sizeof(uint32_t))
 
 #define PKT_BUF_BYTES		MAX(TAP_BUF_BYTES, 0)
 extern char pkt_buf		[PKT_BUF_BYTES];
diff --git a/tap.c b/tap.c
index cd32a901..4421ce4d 100644
--- a/tap.c
+++ b/tap.c
@@ -61,6 +61,8 @@
 #include "vhost_user.h"
 #include "vu_common.h"
 
+#define TAP_MSGS		256
+
 /* IPv4 (plus ARP) and IPv6 message batches from tap/guest to IP handlers */
 static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS, pkt_buf);
 static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
@@ -966,8 +968,10 @@ void tap_handler(struct ctx *c, const struct timespec *now)
  * @c:		Execution context
  * @l2len:	Total L2 packet length
  * @p:		Packet buffer
+ * @now:	Current timestamp
  */
-void tap_add_packet(struct ctx *c, ssize_t l2len, char *p)
+void tap_add_packet(struct ctx *c, ssize_t l2len, char *p,
+		    const struct timespec *now)
 {
 	const struct ethhdr *eh;
 
@@ -983,9 +987,17 @@ void tap_add_packet(struct ctx *c, ssize_t l2len, char *p)
 	switch (ntohs(eh->h_proto)) {
 	case ETH_P_ARP:
 	case ETH_P_IP:
+		if (pool_full(pool_tap4)) {
+			tap4_handler(c, pool_tap4, now);
+			pool_flush(pool_tap4);
+		}
 		packet_add(pool_tap4, l2len, p);
 		break;
 	case ETH_P_IPV6:
+		if (pool_full(pool_tap6)) {
+			tap6_handler(c, pool_tap6, now);
+			pool_flush(pool_tap6);
+		}
 		packet_add(pool_tap6, l2len, p);
 		break;
 	default:
@@ -1066,7 +1078,7 @@ static void tap_passt_input(struct ctx *c, const struct timespec *now)
 		p += sizeof(uint32_t);
 		n -= sizeof(uint32_t);
 
-		tap_add_packet(c, l2len, p);
+		tap_add_packet(c, l2len, p, now);
 
 		p += l2len;
 		n -= l2len;
@@ -1129,7 +1141,7 @@ static void tap_pasta_input(struct ctx *c, const struct timespec *now)
 		    len > (ssize_t)ETH_MAX_MTU)
 			continue;
 
-		tap_add_packet(c, len, pkt_buf + n);
+		tap_add_packet(c, len, pkt_buf + n, now);
 	}
 
 	tap_handler(c, now);
diff --git a/tap.h b/tap.h
index dfbd8b9e..a1b18430 100644
--- a/tap.h
+++ b/tap.h
@@ -74,6 +74,7 @@ void tap_sock_update_pool(void *base, size_t size);
 void tap_backend_init(struct ctx *c);
 void tap_flush_pools(void);
 void tap_handler(struct ctx *c, const struct timespec *now);
-void tap_add_packet(struct ctx *c, ssize_t l2len, char *p);
+void tap_add_packet(struct ctx *c, ssize_t l2len, char *p,
+		    const struct timespec *now);
 
 #endif /* TAP_H */
diff --git a/vu_common.c b/vu_common.c
index 528b9b08..6f49f873 100644
--- a/vu_common.c
+++ b/vu_common.c
@@ -187,7 +187,8 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
 
 		tap_add_packet(vdev->context,
 			       elem[count].out_sg[0].iov_len - hdrlen,
-			       (char *)elem[count].out_sg[0].iov_base + hdrlen);
+			       (char *)elem[count].out_sg[0].iov_base + hdrlen,
+			       now);
 		count++;
 	}
 	tap_handler(vdev->context, now);
-- 
@@ -187,7 +187,8 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
 
 		tap_add_packet(vdev->context,
 			       elem[count].out_sg[0].iov_len - hdrlen,
-			       (char *)elem[count].out_sg[0].iov_base + hdrlen);
+			       (char *)elem[count].out_sg[0].iov_base + hdrlen,
+			       now);
 		count++;
 	}
 	tap_handler(vdev->context, now);
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 12/12] packet: More cautious checks to avoid pointer arithmetic UB
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (10 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 11/12] tap: Don't size pool_tap[46] for the maximum number of packets David Gibson
@ 2024-12-20  8:35 ` David Gibson
  2024-12-20  9:00 ` [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
  12 siblings, 0 replies; 15+ messages in thread
From: David Gibson @ 2024-12-20  8:35 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

packet_check_range (and vu_packet_check_range()) verify that the packet or
section of packet we're interested in lies in the packet buffer pool we
expect it to.  However, in doing so it doesn't avoid the possibility of
an integer overflow while performing pointer arithmetic, with is UB.  In
fact, AFAICT it's UB even to use arbitrary pointer arithmetic to construct
a pointer outside of a known valid buffer.

To do this safely, we can't calculate the end of a memory region with
pointer addition, at least if we're treating the length as untrusted.
Instead we must work out the offset of one memory region within another
using pointer subtraction, then do integer checks against the length of
the outer region.  We then need to be careful about the order of checks so
that those integer checks can't themselves overflow.

While addressing this, correct a flaw in vu_packet_check_range() where it
only checked that the given packet piece was above the region's mmap
address, not the base of the specific section of the mmap we expect to find
it in (dev_region->mmap_addr + dev_region->mmap_offset).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 packet.c    |  9 ++++++---
 vu_common.c | 15 +++++++++++----
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/packet.c b/packet.c
index c3b80924..04a7a829 100644
--- a/packet.c
+++ b/packet.c
@@ -49,9 +49,12 @@ static void packet_check_range(const struct pool *p,
 	ASSERT_WITH_MSG(ptr >= p->buf,
 			"packet range start %p before buffer start %p, %s:%i",
 			(void *)ptr, (void *)p->buf, func, line);
-	ASSERT_WITH_MSG(ptr + len <= p->buf + p->buf_size,
-			"packet range end %p after buffer end %p, %s:%i",
-			(void *)(ptr + len), (void *)(p->buf + p->buf_size),
+	ASSERT_WITH_MSG(len <= p->buf_size,
+			"packet range length %zu larger than buffer %zu, %s:%i",
+			len, p->buf_size, func, line);
+	ASSERT_WITH_MSG((size_t)(ptr - p->buf) <= p->buf_size - len,
+"packet range %p, len %zu extends beyond buffer %p, len %zu, %s:%i",
+			(void *)ptr, len, (void *)p->buf, p->buf_size,
 			func, line);
 }
 
diff --git a/vu_common.c b/vu_common.c
index 6f49f873..3f7fb29d 100644
--- a/vu_common.c
+++ b/vu_common.c
@@ -35,16 +35,23 @@ void vu_packet_check_range(void *buf, const char *ptr, size_t len,
 	struct vu_dev_region *dev_region;
 
 	for (dev_region = buf; dev_region->mmap_addr; dev_region++) {
+		uintptr_t base_addr = dev_region->mmap_addr +
+			dev_region->mmap_offset;
 		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
-		char *m = (char *)(uintptr_t)dev_region->mmap_addr;
+		const char *base = (const char *)base_addr;
 
-		if (m <= ptr &&
-		    ptr + len <= m + dev_region->mmap_offset + dev_region->size)
+		ASSERT(base_addr >= dev_region->mmap_addr);
+
+		if (len > dev_region->size)
+			continue;
+
+		if (base <= ptr &&
+		    (size_t)(ptr - base) <= dev_region->size - len)
 			return;
 	}
 
 	abort_with_msg(
-		"package range at %p, length %zd not within dev region %s:%i",
+		"packet range at %p, length %zd not within dev region, %s:%i",
 		(void *)ptr, len, func, line);
 }
 
-- 
@@ -35,16 +35,23 @@ void vu_packet_check_range(void *buf, const char *ptr, size_t len,
 	struct vu_dev_region *dev_region;
 
 	for (dev_region = buf; dev_region->mmap_addr; dev_region++) {
+		uintptr_t base_addr = dev_region->mmap_addr +
+			dev_region->mmap_offset;
 		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
-		char *m = (char *)(uintptr_t)dev_region->mmap_addr;
+		const char *base = (const char *)base_addr;
 
-		if (m <= ptr &&
-		    ptr + len <= m + dev_region->mmap_offset + dev_region->size)
+		ASSERT(base_addr >= dev_region->mmap_addr);
+
+		if (len > dev_region->size)
+			continue;
+
+		if (base <= ptr &&
+		    (size_t)(ptr - base) <= dev_region->size - len)
 			return;
 	}
 
 	abort_with_msg(
-		"package range at %p, length %zd not within dev region %s:%i",
+		"packet range at %p, length %zd not within dev region, %s:%i",
 		(void *)ptr, len, func, line);
 }
 
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 00/12] Cleanups to packet pool handling and sizing
  2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
                   ` (11 preceding siblings ...)
  2024-12-20  8:35 ` [PATCH v2 12/12] packet: More cautious checks to avoid pointer arithmetic UB David Gibson
@ 2024-12-20  9:00 ` David Gibson
  2024-12-20 10:06   ` Stefano Brivio
  12 siblings, 1 reply; 15+ messages in thread
From: David Gibson @ 2024-12-20  9:00 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev

[-- Attachment #1: Type: text/plain, Size: 2211 bytes --]

On Fri, Dec 20, 2024 at 07:35:23PM +1100, David Gibson wrote:
> This... is not any of the things I said I would be working on.  I can
> only say that a herd of very hairy yaks led me astray.  Looking at bug
> 66 I spotted some problems with our handling of MTUs / maximum frame
> sizes.  Looking at that I found some weirdness and some real, if
> minor, bugs in the sizing and handling of the packet pools.
> 
> Changes in v2:
>  * Stefano convinced me that packet_check_range() is still worthwhile.
>    * So don't remove it... but in looking at it I spotted various
>      flaws in the checks, so address those in a number of new patches.
> 
> David Gibson (12):
>   test focus
>   hack: stop on fail, but not perf fail
>   make passt dumpable

Argh, oops.  Of course, 1..3 are test/debug patches that should not be
included in the series.

>   packet: Use flexible array member in struct pool
>   packet: Don't pass start and offset separately too
>     packet_check_range()
>   packet: Don't hard code maximum packet size to UINT16_MAX
>   packet: Remove unhelpful packet_get_try() macro
>   util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers
>   packet: Distinguish severities of different packet_{add,git}_do()
>     errors
>   packet: Move packet length checks into packet_check_range()
>   tap: Don't size pool_tap[46] for the maximum number of packets
>   packet: More cautious checks to avoid pointer arithmetic UB
> 
>  dhcpv6.c      |   2 +-
>  ip.c          |   2 +-
>  isolation.c   |   2 +-
>  packet.c      | 106 ++++++++++++++++++++++----------------------------
>  packet.h      |  19 ++++++---
>  passt.h       |   2 -
>  tap.c         |  18 +++++++--
>  tap.h         |   3 +-
>  test/lib/term |   1 +
>  test/lib/test |   4 +-
>  test/run      |  38 +++++++++---------
>  util.c        |  19 +++++++++
>  util.h        |  25 +++++-------
>  vu_common.c   |  34 ++++++++++------
>  14 files changed, 153 insertions(+), 122 deletions(-)
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 00/12] Cleanups to packet pool handling and sizing
  2024-12-20  9:00 ` [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
@ 2024-12-20 10:06   ` Stefano Brivio
  0 siblings, 0 replies; 15+ messages in thread
From: Stefano Brivio @ 2024-12-20 10:06 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Fri, 20 Dec 2024 20:00:13 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, Dec 20, 2024 at 07:35:23PM +1100, David Gibson wrote:
> > This... is not any of the things I said I would be working on.  I can
> > only say that a herd of very hairy yaks led me astray.  Looking at bug
> > 66 I spotted some problems with our handling of MTUs / maximum frame
> > sizes.  Looking at that I found some weirdness and some real, if
> > minor, bugs in the sizing and handling of the packet pools.
> > 
> > Changes in v2:
> >  * Stefano convinced me that packet_check_range() is still worthwhile.
> >    * So don't remove it... but in looking at it I spotted various
> >      flaws in the checks, so address those in a number of new patches.
> > 
> > David Gibson (12):
> >   test focus
> >   hack: stop on fail, but not perf fail
> >   make passt dumpable  
> 
> Argh, oops.  Of course, 1..3 are test/debug patches that should not be
> included in the series.

Yeah, it's fine, 4/12 to 12/12 apply cleanly anyway.

-- 
Stefano


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-12-20 10:06 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-12-20  8:35 [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
2024-12-20  8:35 ` [PATCH v2 01/12] test focus David Gibson
2024-12-20  8:35 ` [PATCH v2 02/12] hack: stop on fail, but not perf fail David Gibson
2024-12-20  8:35 ` [PATCH v2 03/12] make passt dumpable David Gibson
2024-12-20  8:35 ` [PATCH v2 04/12] packet: Use flexible array member in struct pool David Gibson
2024-12-20  8:35 ` [PATCH v2 05/12] packet: Don't pass start and offset separately too packet_check_range() David Gibson
2024-12-20  8:35 ` [PATCH v2 06/12] packet: Don't hard code maximum packet size to UINT16_MAX David Gibson
2024-12-20  8:35 ` [PATCH v2 07/12] packet: Remove unhelpful packet_get_try() macro David Gibson
2024-12-20  8:35 ` [PATCH v2 08/12] util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers David Gibson
2024-12-20  8:35 ` [PATCH v2 09/12] packet: Distinguish severities of different packet_{add,git}_do() errors David Gibson
2024-12-20  8:35 ` [PATCH v2 10/12] packet: Move packet length checks into packet_check_range() David Gibson
2024-12-20  8:35 ` [PATCH v2 11/12] tap: Don't size pool_tap[46] for the maximum number of packets David Gibson
2024-12-20  8:35 ` [PATCH v2 12/12] packet: More cautious checks to avoid pointer arithmetic UB David Gibson
2024-12-20  9:00 ` [PATCH v2 00/12] Cleanups to packet pool handling and sizing David Gibson
2024-12-20 10:06   ` Stefano Brivio

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).