public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH 0/6] Of course there are more flow table preliminaries
@ 2024-07-17  4:52 David Gibson
  2024-07-17  4:52 ` [PATCH 1/6] flow, icmp, tcp: Clean up helpers for getting flow from index David Gibson
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: David Gibson @ 2024-07-17  4:52 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

This series has some further preliminaries for the flow table.
Specifically this is focused on some clean ups to handling of indicies
for flows and sides.  We also add some additional test programs for
platform requirements.

This is based on the series adding UDP error handling.

David Gibson (6):
  flow, icmp, tcp: Clean up helpers for getting flow from index
  flow, tcp_splice: Prefer 'sidei' for variables referring to side index
  flow: Introduce flow_foreach_sidei() macro
  tcp_splice: Use parameterised macros for per-side event/flag bits
  doc: Test behaviour of closing duplicate UDP sockets
  doc: Extend zero-recv test with methods using msghdr

 doc/platform-requirements/.gitignore      |   1 +
 doc/platform-requirements/Makefile        |   4 +-
 doc/platform-requirements/recv-zero.c     |  60 +++++++--
 doc/platform-requirements/udp-close-dup.c | 105 +++++++++++++++
 flow.h                                    |  16 +--
 flow_table.h                              |  48 +++++--
 icmp.c                                    |  22 +++-
 tcp.c                                     |  28 +++-
 tcp_conn.h                                |  15 +--
 tcp_splice.c                              | 152 ++++++++++++----------
 10 files changed, 332 insertions(+), 119 deletions(-)
 create mode 100644 doc/platform-requirements/udp-close-dup.c

-- 
2.45.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] flow, icmp, tcp: Clean up helpers for getting flow from index
  2024-07-17  4:52 [PATCH 0/6] Of course there are more flow table preliminaries David Gibson
@ 2024-07-17  4:52 ` David Gibson
  2024-07-17  4:52 ` [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index David Gibson
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-07-17  4:52 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

TCP (both regular and spliced) and ICMP both have macros to retrieve the
relevant protcol specific flow structure from a flow index.  In most cases
what we actually want is to get the specific flow from a sidx.  Replace
those simple macros with a more precise inline, which also asserts that
the flow is of the type we expect.

While we're they're also add a pif_at_sidx() helper to get the interface of
a specific flow & side, which is useful in some places.

Finally, fix some minor style issues in the comments on some of the
existing sidx related helpers.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 flow_table.h | 26 ++++++++++++++++++++------
 icmp.c       | 22 +++++++++++++++++++---
 tcp.c        | 28 ++++++++++++++++++++++------
 tcp_splice.c | 21 +++++++++++++++++++--
 4 files changed, 80 insertions(+), 17 deletions(-)

diff --git a/flow_table.h b/flow_table.h
index 226ddbdd..ab73e44a 100644
--- a/flow_table.h
+++ b/flow_table.h
@@ -42,7 +42,7 @@ extern unsigned flow_first_free;
 extern union flow flowtab[];
 
 
-/** flow_idx - Index of flow from common structure
+/** flow_idx() - Index of flow from common structure
  * @f:	Common flow fields pointer
  *
  * Return: index of @f in the flow table
@@ -52,21 +52,21 @@ static inline unsigned flow_idx(const struct flow_common *f)
 	return (union flow *)f - flowtab;
 }
 
-/** FLOW_IDX - Find the index of a flow
+/** FLOW_IDX() - Find the index of a flow
  * @f_:	Flow pointer, either union flow * or protocol specific
  *
  * Return: index of @f in the flow table
  */
 #define FLOW_IDX(f_)		(flow_idx(&(f_)->f))
 
-/** FLOW - Flow entry at a given index
+/** FLOW() - Flow entry at a given index
  * @idx:	Flow index
  *
  * Return: pointer to entry @idx in the flow table
  */
 #define FLOW(idx)		(&flowtab[(idx)])
 
-/** flow_at_sidx - Flow entry for a given sidx
+/** flow_at_sidx() - Flow entry for a given sidx
  * @sidx:	Flow & side index
  *
  * Return: pointer to the corresponding flow entry, or NULL
@@ -78,7 +78,21 @@ static inline union flow *flow_at_sidx(flow_sidx_t sidx)
 	return FLOW(sidx.flow);
 }
 
-/** flow_sidx_t - Index of one side of a flow from common structure
+/** pif_at_sidx() - Interface for a given flow and side
+ * @sidx:    Flow & side index
+ *
+ * Return: pif for the flow & side given by @sidx
+ */
+static inline uint8_t pif_at_sidx(flow_sidx_t sidx)
+{
+	const union flow *flow = flow_at_sidx(sidx);
+
+	if (!flow)
+		return PIF_NONE;
+	return flow->f.pif[sidx.side];
+}
+
+/** flow_sidx() - Index of one side of a flow from common structure
  * @f:		Common flow fields pointer
  * @side:	Which side to refer to (0 or 1)
  *
@@ -96,7 +110,7 @@ static inline flow_sidx_t flow_sidx(const struct flow_common *f,
 	};
 }
 
-/** FLOW_SIDX - Find the index of one side of a flow
+/** FLOW_SIDX() - Find the index of one side of a flow
  * @f_:		Flow pointer, either union flow * or protocol specific
  * @side:	Which side to index (0 or 1)
  *
diff --git a/icmp.c b/icmp.c
index d4ccc722..7cf31e60 100644
--- a/icmp.c
+++ b/icmp.c
@@ -45,11 +45,27 @@
 #define ICMP_ECHO_TIMEOUT	60 /* s, timeout for ICMP socket activity */
 #define ICMP_NUM_IDS		(1U << 16)
 
-#define PINGF(idx)		(&(FLOW(idx)->ping))
-
 /* Indexed by ICMP echo identifier */
 static struct icmp_ping_flow *icmp_id_map[IP_VERSIONS][ICMP_NUM_IDS];
 
+/**
+ * ping_at_sidx() - Get ping specific flow at given sidx
+ * @sidx:	Flow and side to retrieve
+ *
+ * Return: ping specific flow at @sidx, or NULL of @sidx is invalid.  Asserts if
+ *         the flow at @sidx is not FLOW_PING4 or FLOW_PING6
+ */
+static struct icmp_ping_flow *ping_at_sidx(flow_sidx_t sidx)
+{
+	union flow *flow = flow_at_sidx(sidx);
+
+	if (!flow)
+		return NULL;
+
+	ASSERT(flow->f.type == FLOW_PING4 || flow->f.type == FLOW_PING6);
+	return &flow->ping;
+}
+
 /**
  * icmp_sock_handler() - Handle new data from ICMP or ICMPv6 socket
  * @c:		Execution context
@@ -57,7 +73,7 @@ static struct icmp_ping_flow *icmp_id_map[IP_VERSIONS][ICMP_NUM_IDS];
  */
 void icmp_sock_handler(const struct ctx *c, union epoll_ref ref)
 {
-	struct icmp_ping_flow *pingf = PINGF(ref.flowside.flow);
+	struct icmp_ping_flow *pingf = ping_at_sidx(ref.flowside);
 	union sockaddr_inany sr;
 	socklen_t sl = sizeof(sr);
 	char buf[USHRT_MAX];
diff --git a/tcp.c b/tcp.c
index de3f9f64..04ea1f91 100644
--- a/tcp.c
+++ b/tcp.c
@@ -376,8 +376,6 @@ char		tcp_buf_discard		[MAX_WINDOW];
 /* sendmsg() to socket */
 static struct iovec	tcp_iov			[UIO_MAXIOV];
 
-#define CONN(idx)		(&(FLOW(idx)->tcp))
-
 /* Table for lookup from remote address, local port, remote port */
 static flow_sidx_t tc_hash[TCP_HASH_TABLE_SIZE];
 
@@ -388,6 +386,24 @@ static_assert(ARRAY_SIZE(tc_hash) >= FLOW_MAX,
 int init_sock_pool4		[TCP_SOCK_POOL_SIZE];
 int init_sock_pool6		[TCP_SOCK_POOL_SIZE];
 
+/**
+ * conn_at_sidx() - Get TCP connection specific flow at given sidx
+ * @sidx:	Flow and side to retrieve
+ *
+ * Return: TCP connection at @sidx, or NULL of @sidx is invalid.  Asserts if the
+ *         flow at @sidx is not FLOW_TCP.
+ */
+static struct tcp_tap_conn *conn_at_sidx(flow_sidx_t sidx)
+{
+	union flow *flow = flow_at_sidx(sidx);
+
+	if (!flow)
+		return NULL;
+
+	ASSERT(flow->f.type == FLOW_TCP);
+	return &flow->tcp;
+}
+
 /**
  * tcp_conn_epoll_events() - epoll events mask for given connection state
  * @events:	Current connection events
@@ -2339,9 +2355,10 @@ cancel:
 void tcp_timer_handler(struct ctx *c, union epoll_ref ref)
 {
 	struct itimerspec check_armed = { { 0 }, { 0 } };
-	struct tcp_tap_conn *conn = CONN(ref.flow);
+	struct tcp_tap_conn *conn = &FLOW(ref.flow)->tcp;
 
 	ASSERT(!c->no_tcp);
+	ASSERT(conn->f.type == FLOW_TCP);
 
 	/* We don't reset timers on ~ACK_FROM_TAP_DUE, ~ACK_TO_TAP_DUE. If the
 	 * timer is currently armed, this event came from a previous setting,
@@ -2397,11 +2414,10 @@ void tcp_timer_handler(struct ctx *c, union epoll_ref ref)
  */
 void tcp_sock_handler(struct ctx *c, union epoll_ref ref, uint32_t events)
 {
-	struct tcp_tap_conn *conn = CONN(ref.flowside.flow);
+	struct tcp_tap_conn *conn = conn_at_sidx(ref.flowside);
 
 	ASSERT(!c->no_tcp);
-	ASSERT(conn->f.type == FLOW_TCP);
-	ASSERT(conn->f.pif[ref.flowside.side] != PIF_TAP);
+	ASSERT(pif_at_sidx(ref.flowside) != PIF_TAP);
 
 	if (conn->events == CLOSED)
 		return;
diff --git a/tcp_splice.c b/tcp_splice.c
index f2d4fc60..9fc56565 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -76,7 +76,6 @@ static int splice_pipe_pool		[TCP_SPLICE_PIPE_POOL_SIZE][2];
 #define CONN_V6(x)			((x)->flags & SPLICE_V6)
 #define CONN_V4(x)			(!CONN_V6(x))
 #define CONN_HAS(conn, set)		(((conn)->events & (set)) == (set))
-#define CONN(idx)			(&FLOW(idx)->tcp_splice)
 
 /* Display strings for connection events */
 static const char *tcp_splice_event_str[] __attribute((__unused__)) = {
@@ -94,6 +93,24 @@ static const char *tcp_splice_flag_str[] __attribute((__unused__)) = {
 static int tcp_sock_refill_ns(void *arg);
 static int tcp_conn_sock_ns(const struct ctx *c, sa_family_t af);
 
+/**
+ * conn_at_sidx() - Get spliced TCP connection specific flow at given sidx
+ * @sidx:	Flow and side to retrieve
+ *
+ * Return: Spliced TCP connection at @sidx, or NULL of @sidx is invalid.
+ *         Asserts if the flow at @sidx is not FLOW_TCP_SPLICE.
+ */
+static struct tcp_splice_conn *conn_at_sidx(flow_sidx_t sidx)
+{
+	union flow *flow = flow_at_sidx(sidx);
+
+	if (!flow)
+		return NULL;
+
+	ASSERT(flow->f.type == FLOW_TCP_SPLICE);
+	return &flow->tcp_splice;
+}
+
 /**
  * tcp_splice_conn_epoll_events() - epoll events masks for given state
  * @events:	Connection event flags
@@ -502,7 +519,7 @@ bool tcp_splice_conn_from_sock(const struct ctx *c,
 void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 			     uint32_t events)
 {
-	struct tcp_splice_conn *conn = CONN(ref.flowside.flow);
+	struct tcp_splice_conn *conn = conn_at_sidx(ref.flowside);
 	unsigned side = ref.flowside.side, fromside;
 	uint8_t lowat_set_flag, lowat_act_flag;
 	int eof, never_read;
-- 
@@ -76,7 +76,6 @@ static int splice_pipe_pool		[TCP_SPLICE_PIPE_POOL_SIZE][2];
 #define CONN_V6(x)			((x)->flags & SPLICE_V6)
 #define CONN_V4(x)			(!CONN_V6(x))
 #define CONN_HAS(conn, set)		(((conn)->events & (set)) == (set))
-#define CONN(idx)			(&FLOW(idx)->tcp_splice)
 
 /* Display strings for connection events */
 static const char *tcp_splice_event_str[] __attribute((__unused__)) = {
@@ -94,6 +93,24 @@ static const char *tcp_splice_flag_str[] __attribute((__unused__)) = {
 static int tcp_sock_refill_ns(void *arg);
 static int tcp_conn_sock_ns(const struct ctx *c, sa_family_t af);
 
+/**
+ * conn_at_sidx() - Get spliced TCP connection specific flow at given sidx
+ * @sidx:	Flow and side to retrieve
+ *
+ * Return: Spliced TCP connection at @sidx, or NULL of @sidx is invalid.
+ *         Asserts if the flow at @sidx is not FLOW_TCP_SPLICE.
+ */
+static struct tcp_splice_conn *conn_at_sidx(flow_sidx_t sidx)
+{
+	union flow *flow = flow_at_sidx(sidx);
+
+	if (!flow)
+		return NULL;
+
+	ASSERT(flow->f.type == FLOW_TCP_SPLICE);
+	return &flow->tcp_splice;
+}
+
 /**
  * tcp_splice_conn_epoll_events() - epoll events masks for given state
  * @events:	Connection event flags
@@ -502,7 +519,7 @@ bool tcp_splice_conn_from_sock(const struct ctx *c,
 void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 			     uint32_t events)
 {
-	struct tcp_splice_conn *conn = CONN(ref.flowside.flow);
+	struct tcp_splice_conn *conn = conn_at_sidx(ref.flowside);
 	unsigned side = ref.flowside.side, fromside;
 	uint8_t lowat_set_flag, lowat_act_flag;
 	int eof, never_read;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index
  2024-07-17  4:52 [PATCH 0/6] Of course there are more flow table preliminaries David Gibson
  2024-07-17  4:52 ` [PATCH 1/6] flow, icmp, tcp: Clean up helpers for getting flow from index David Gibson
@ 2024-07-17  4:52 ` David Gibson
  2024-07-17  5:20   ` Stefano Brivio
  2024-07-17  4:52 ` [PATCH 3/6] flow: Introduce flow_foreach_sidei() macro David Gibson
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2024-07-17  4:52 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

In various places we have variables named 'side' or similar which always
have the value 0 or 1 (INISIDE or TGTSIDE).  Given a flow, this refers to
a specific side of it.  Upcoming flow table work will make it more useful
for "side" to refer to a specific side of a specific flow.  To make things
less confusing then, prefer the name term "side index" and name 'sidei' for
variables with just the 0 or 1 value.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 flow.h       | 16 +++++-----
 flow_table.h | 18 +++++------
 tcp_splice.c | 89 ++++++++++++++++++++++++++--------------------------
 3 files changed, 62 insertions(+), 61 deletions(-)

diff --git a/flow.h b/flow.h
index d1f49c65..5d2c34af 100644
--- a/flow.h
+++ b/flow.h
@@ -132,8 +132,8 @@ extern const uint8_t flow_proto[];
 
 #define SIDES			2
 
-#define INISIDE			0	/* Initiating side */
-#define TGTSIDE			1	/* Target side */
+#define INISIDE			0	/* Initiating side index */
+#define TGTSIDE			1	/* Target side index */
 
 /**
  * struct flow_common - Common fields for packet flows
@@ -164,17 +164,17 @@ struct flow_common {
 
 /**
  * struct flow_sidx - ID for one side of a specific flow
- * @side:	Side referenced (0 or 1)
+ * @side:	Index of side referenced (0 or 1)
  * @flow:	Index of flow referenced
  */
 typedef struct flow_sidx {
-	unsigned	side :1;
-	unsigned	flow :FLOW_INDEX_BITS;
+	unsigned	sidei :1;
+	unsigned	flowi :FLOW_INDEX_BITS;
 } flow_sidx_t;
 static_assert(sizeof(flow_sidx_t) <= sizeof(uint32_t),
 	      "flow_sidx_t must fit within 32 bits");
 
-#define FLOW_SIDX_NONE ((flow_sidx_t){ .flow = FLOW_MAX })
+#define FLOW_SIDX_NONE ((flow_sidx_t){ .flowi = FLOW_MAX })
 
 /**
  * flow_sidx_valid() - Test if a sidx is valid
@@ -184,7 +184,7 @@ static_assert(sizeof(flow_sidx_t) <= sizeof(uint32_t),
  */
 static inline bool flow_sidx_valid(flow_sidx_t sidx)
 {
-	return sidx.flow < FLOW_MAX;
+	return sidx.flowi < FLOW_MAX;
 }
 
 /**
@@ -195,7 +195,7 @@ static inline bool flow_sidx_valid(flow_sidx_t sidx)
  */
 static inline bool flow_sidx_eq(flow_sidx_t a, flow_sidx_t b)
 {
-	return (a.flow == b.flow) && (a.side == b.side);
+	return (a.flowi == b.flowi) && (a.sidei == b.sidei);
 }
 
 union flow;
diff --git a/flow_table.h b/flow_table.h
index ab73e44a..8a5f8f25 100644
--- a/flow_table.h
+++ b/flow_table.h
@@ -75,7 +75,7 @@ static inline union flow *flow_at_sidx(flow_sidx_t sidx)
 {
 	if (!flow_sidx_valid(sidx))
 		return NULL;
-	return FLOW(sidx.flow);
+	return FLOW(sidx.flowi);
 }
 
 /** pif_at_sidx() - Interface for a given flow and side
@@ -89,34 +89,34 @@ static inline uint8_t pif_at_sidx(flow_sidx_t sidx)
 
 	if (!flow)
 		return PIF_NONE;
-	return flow->f.pif[sidx.side];
+	return flow->f.pif[sidx.sidei];
 }
 
 /** flow_sidx() - Index of one side of a flow from common structure
  * @f:		Common flow fields pointer
- * @side:	Which side to refer to (0 or 1)
+ * @sidei:	Which side to refer to (0 or 1)
  *
  * Return: index of @f and @side in the flow table
  */
 static inline flow_sidx_t flow_sidx(const struct flow_common *f,
-				    int side)
+				    unsigned sidei)
 {
 	/* cppcheck-suppress [knownConditionTrueFalse, unmatchedSuppression] */
-	ASSERT(side == !!side);
+	ASSERT(sidei == !!sidei);
 
 	return (flow_sidx_t){
-		.side = side,
-		.flow = flow_idx(f),
+		.sidei = sidei,
+		.flowi = flow_idx(f),
 	};
 }
 
 /** FLOW_SIDX() - Find the index of one side of a flow
  * @f_:		Flow pointer, either union flow * or protocol specific
- * @side:	Which side to index (0 or 1)
+ * @sidei:	Which side to index (0 or 1)
  *
  * Return: index of @f and @side in the flow table
  */
-#define FLOW_SIDX(f_, side)	(flow_sidx(&(f_)->f, (side)))
+#define FLOW_SIDX(f_, sidei)	(flow_sidx(&(f_)->f, (sidei)))
 
 union flow *flow_alloc(void);
 void flow_alloc_cancel(union flow *flow);
diff --git a/tcp_splice.c b/tcp_splice.c
index 9fc56565..8bc68a1a 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -258,25 +258,25 @@ static void conn_event_do(const struct ctx *c, struct tcp_splice_conn *conn,
  */
 bool tcp_splice_flow_defer(struct tcp_splice_conn *conn)
 {
-	unsigned side;
+	unsigned sidei;
 
 	if (!(conn->flags & CLOSING))
 		return false;
 
-	for (side = 0; side < SIDES; side++) {
+	for (sidei = 0; sidei < SIDES; sidei++) {
 		/* Flushing might need to block: don't recycle them. */
-		if (conn->pipe[side][0] >= 0) {
-			close(conn->pipe[side][0]);
-			close(conn->pipe[side][1]);
-			conn->pipe[side][0] = conn->pipe[side][1] = -1;
+		if (conn->pipe[sidei][0] >= 0) {
+			close(conn->pipe[sidei][0]);
+			close(conn->pipe[sidei][1]);
+			conn->pipe[sidei][0] = conn->pipe[sidei][1] = -1;
 		}
 
-		if (conn->s[side] >= 0) {
-			close(conn->s[side]);
-			conn->s[side] = -1;
+		if (conn->s[sidei] >= 0) {
+			close(conn->s[sidei]);
+			conn->s[sidei] = -1;
 		}
 
-		conn->read[side] = conn->written[side] = 0;
+		conn->read[sidei] = conn->written[sidei] = 0;
 	}
 
 	conn->events = SPLICE_CLOSED;
@@ -296,33 +296,33 @@ bool tcp_splice_flow_defer(struct tcp_splice_conn *conn)
 static int tcp_splice_connect_finish(const struct ctx *c,
 				     struct tcp_splice_conn *conn)
 {
-	unsigned side;
+	unsigned sidei;
 	int i = 0;
 
-	for (side = 0; side < SIDES; side++) {
+	for (sidei = 0; sidei < SIDES; sidei++) {
 		for (; i < TCP_SPLICE_PIPE_POOL_SIZE; i++) {
 			if (splice_pipe_pool[i][0] >= 0) {
-				SWAP(conn->pipe[side][0],
+				SWAP(conn->pipe[sidei][0],
 				     splice_pipe_pool[i][0]);
-				SWAP(conn->pipe[side][1],
+				SWAP(conn->pipe[sidei][1],
 				     splice_pipe_pool[i][1]);
 				break;
 			}
 		}
 
-		if (conn->pipe[side][0] < 0) {
-			if (pipe2(conn->pipe[side], O_NONBLOCK | O_CLOEXEC)) {
+		if (conn->pipe[sidei][0] < 0) {
+			if (pipe2(conn->pipe[sidei], O_NONBLOCK | O_CLOEXEC)) {
 				flow_err(conn, "cannot create %d->%d pipe: %s",
-					 side, !side, strerror(errno));
+					 sidei, !sidei, strerror(errno));
 				conn_flag(c, conn, CLOSING);
 				return -EIO;
 			}
 
-			if (fcntl(conn->pipe[side][0], F_SETPIPE_SZ,
+			if (fcntl(conn->pipe[sidei][0], F_SETPIPE_SZ,
 				  c->tcp.pipe_size)) {
 				flow_trace(conn,
 					   "cannot set %d->%d pipe size to %zu",
-					   side, !side, c->tcp.pipe_size);
+					   sidei, !sidei, c->tcp.pipe_size);
 			}
 		}
 	}
@@ -520,7 +520,7 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 			     uint32_t events)
 {
 	struct tcp_splice_conn *conn = conn_at_sidx(ref.flowside);
-	unsigned side = ref.flowside.side, fromside;
+	unsigned evsidei = ref.flowside.sidei, fromsidei;
 	uint8_t lowat_set_flag, lowat_act_flag;
 	int eof, never_read;
 
@@ -552,30 +552,31 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 	}
 
 	if (events & EPOLLOUT) {
-		fromside = !side;
-		conn_event(c, conn, side == 0 ? ~OUT_WAIT_0 : ~OUT_WAIT_1);
+		fromsidei = !evsidei;
+		conn_event(c, conn, evsidei == 0 ? ~OUT_WAIT_0 : ~OUT_WAIT_1);
 	} else {
-		fromside = side;
+		fromsidei = evsidei;
 	}
 
 	if (events & EPOLLRDHUP)
 		/* For side 0 this is fake, but implied */
-		conn_event(c, conn, side == 0 ? FIN_RCVD_0 : FIN_RCVD_1);
+		conn_event(c, conn, evsidei == 0 ? FIN_RCVD_0 : FIN_RCVD_1);
 
 swap:
 	eof = 0;
 	never_read = 1;
 
-	lowat_set_flag = fromside == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
-	lowat_act_flag = fromside == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
+	lowat_set_flag = fromsidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
+	lowat_act_flag = fromsidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
 
 	while (1) {
 		ssize_t readlen, to_write = 0, written;
 		int more = 0;
 
 retry:
-		readlen = splice(conn->s[fromside], NULL,
-				 conn->pipe[fromside][1], NULL, c->tcp.pipe_size,
+		readlen = splice(conn->s[fromsidei], NULL,
+				 conn->pipe[fromsidei][1], NULL,
+				 c->tcp.pipe_size,
 				 SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
 		flow_trace(conn, "%zi from read-side call", readlen);
 		if (readlen < 0) {
@@ -600,8 +601,8 @@ retry:
 		}
 
 eintr:
-		written = splice(conn->pipe[fromside][0], NULL,
-				 conn->s[!fromside], NULL, to_write,
+		written = splice(conn->pipe[fromsidei][0], NULL,
+				 conn->s[!fromsidei], NULL, to_write,
 				 SPLICE_F_MOVE | more | SPLICE_F_NONBLOCK);
 		flow_trace(conn, "%zi from write-side call (passed %zi)",
 			   written, to_write);
@@ -615,7 +616,7 @@ eintr:
 			    readlen > (long)c->tcp.pipe_size / 10) {
 				int lowat = c->tcp.pipe_size / 4;
 
-				if (setsockopt(conn->s[fromside], SOL_SOCKET,
+				if (setsockopt(conn->s[fromsidei], SOL_SOCKET,
 					       SO_RCVLOWAT,
 					       &lowat, sizeof(lowat))) {
 					flow_trace(conn,
@@ -630,8 +631,8 @@ eintr:
 			break;
 		}
 
-		conn->read[fromside]    += readlen > 0 ? readlen : 0;
-		conn->written[fromside] += written > 0 ? written : 0;
+		conn->read[fromsidei]    += readlen > 0 ? readlen : 0;
+		conn->written[fromsidei] += written > 0 ? written : 0;
 
 		if (written < 0) {
 			if (errno == EINTR)
@@ -640,11 +641,11 @@ eintr:
 			if (errno != EAGAIN)
 				goto close;
 
-			if (conn->read[fromside] == conn->written[fromside])
+			if (conn->read[fromsidei] == conn->written[fromsidei])
 				break;
 
 			conn_event(c, conn,
-				   fromside == 0 ? OUT_WAIT_1 : OUT_WAIT_0);
+				   fromsidei == 0 ? OUT_WAIT_1 : OUT_WAIT_0);
 			break;
 		}
 
@@ -661,14 +662,14 @@ eintr:
 	}
 
 	if ((conn->events & FIN_RCVD_0) && !(conn->events & FIN_SENT_1)) {
-		if (conn->read[fromside] == conn->written[fromside] && eof) {
+		if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
 			shutdown(conn->s[1], SHUT_WR);
 			conn_event(c, conn, FIN_SENT_1);
 		}
 	}
 
 	if ((conn->events & FIN_RCVD_1) && !(conn->events & FIN_SENT_0)) {
-		if (conn->read[fromside] == conn->written[fromside] && eof) {
+		if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
 			shutdown(conn->s[0], SHUT_WR);
 			conn_event(c, conn, FIN_SENT_0);
 		}
@@ -680,7 +681,7 @@ eintr:
 	if ((events & (EPOLLIN | EPOLLOUT)) == (EPOLLIN | EPOLLOUT)) {
 		events = EPOLLIN;
 
-		fromside = !fromside;
+		fromsidei = !fromsidei;
 		goto swap;
 	}
 
@@ -815,19 +816,19 @@ void tcp_splice_init(struct ctx *c)
  */
 void tcp_splice_timer(const struct ctx *c, struct tcp_splice_conn *conn)
 {
-	int side;
+	unsigned sidei;
 
 	ASSERT(!(conn->flags & CLOSING));
 
-	for (side = 0; side < SIDES; side++) {
-		uint8_t set = side == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
-		uint8_t act = side == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
+	for (sidei = 0; sidei < SIDES; sidei++) {
+		uint8_t set = sidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
+		uint8_t act = sidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
 
 		if ((conn->flags & set) && !(conn->flags & act)) {
-			if (setsockopt(conn->s[side], SOL_SOCKET, SO_RCVLOWAT,
+			if (setsockopt(conn->s[sidei], SOL_SOCKET, SO_RCVLOWAT,
 				       &((int){ 1 }), sizeof(int))) {
 				flow_trace(conn, "can't set SO_RCVLOWAT on %d",
-					   conn->s[side]);
+					   conn->s[sidei]);
 			}
 			conn_flag(c, conn, ~set);
 		}
-- 
@@ -258,25 +258,25 @@ static void conn_event_do(const struct ctx *c, struct tcp_splice_conn *conn,
  */
 bool tcp_splice_flow_defer(struct tcp_splice_conn *conn)
 {
-	unsigned side;
+	unsigned sidei;
 
 	if (!(conn->flags & CLOSING))
 		return false;
 
-	for (side = 0; side < SIDES; side++) {
+	for (sidei = 0; sidei < SIDES; sidei++) {
 		/* Flushing might need to block: don't recycle them. */
-		if (conn->pipe[side][0] >= 0) {
-			close(conn->pipe[side][0]);
-			close(conn->pipe[side][1]);
-			conn->pipe[side][0] = conn->pipe[side][1] = -1;
+		if (conn->pipe[sidei][0] >= 0) {
+			close(conn->pipe[sidei][0]);
+			close(conn->pipe[sidei][1]);
+			conn->pipe[sidei][0] = conn->pipe[sidei][1] = -1;
 		}
 
-		if (conn->s[side] >= 0) {
-			close(conn->s[side]);
-			conn->s[side] = -1;
+		if (conn->s[sidei] >= 0) {
+			close(conn->s[sidei]);
+			conn->s[sidei] = -1;
 		}
 
-		conn->read[side] = conn->written[side] = 0;
+		conn->read[sidei] = conn->written[sidei] = 0;
 	}
 
 	conn->events = SPLICE_CLOSED;
@@ -296,33 +296,33 @@ bool tcp_splice_flow_defer(struct tcp_splice_conn *conn)
 static int tcp_splice_connect_finish(const struct ctx *c,
 				     struct tcp_splice_conn *conn)
 {
-	unsigned side;
+	unsigned sidei;
 	int i = 0;
 
-	for (side = 0; side < SIDES; side++) {
+	for (sidei = 0; sidei < SIDES; sidei++) {
 		for (; i < TCP_SPLICE_PIPE_POOL_SIZE; i++) {
 			if (splice_pipe_pool[i][0] >= 0) {
-				SWAP(conn->pipe[side][0],
+				SWAP(conn->pipe[sidei][0],
 				     splice_pipe_pool[i][0]);
-				SWAP(conn->pipe[side][1],
+				SWAP(conn->pipe[sidei][1],
 				     splice_pipe_pool[i][1]);
 				break;
 			}
 		}
 
-		if (conn->pipe[side][0] < 0) {
-			if (pipe2(conn->pipe[side], O_NONBLOCK | O_CLOEXEC)) {
+		if (conn->pipe[sidei][0] < 0) {
+			if (pipe2(conn->pipe[sidei], O_NONBLOCK | O_CLOEXEC)) {
 				flow_err(conn, "cannot create %d->%d pipe: %s",
-					 side, !side, strerror(errno));
+					 sidei, !sidei, strerror(errno));
 				conn_flag(c, conn, CLOSING);
 				return -EIO;
 			}
 
-			if (fcntl(conn->pipe[side][0], F_SETPIPE_SZ,
+			if (fcntl(conn->pipe[sidei][0], F_SETPIPE_SZ,
 				  c->tcp.pipe_size)) {
 				flow_trace(conn,
 					   "cannot set %d->%d pipe size to %zu",
-					   side, !side, c->tcp.pipe_size);
+					   sidei, !sidei, c->tcp.pipe_size);
 			}
 		}
 	}
@@ -520,7 +520,7 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 			     uint32_t events)
 {
 	struct tcp_splice_conn *conn = conn_at_sidx(ref.flowside);
-	unsigned side = ref.flowside.side, fromside;
+	unsigned evsidei = ref.flowside.sidei, fromsidei;
 	uint8_t lowat_set_flag, lowat_act_flag;
 	int eof, never_read;
 
@@ -552,30 +552,31 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 	}
 
 	if (events & EPOLLOUT) {
-		fromside = !side;
-		conn_event(c, conn, side == 0 ? ~OUT_WAIT_0 : ~OUT_WAIT_1);
+		fromsidei = !evsidei;
+		conn_event(c, conn, evsidei == 0 ? ~OUT_WAIT_0 : ~OUT_WAIT_1);
 	} else {
-		fromside = side;
+		fromsidei = evsidei;
 	}
 
 	if (events & EPOLLRDHUP)
 		/* For side 0 this is fake, but implied */
-		conn_event(c, conn, side == 0 ? FIN_RCVD_0 : FIN_RCVD_1);
+		conn_event(c, conn, evsidei == 0 ? FIN_RCVD_0 : FIN_RCVD_1);
 
 swap:
 	eof = 0;
 	never_read = 1;
 
-	lowat_set_flag = fromside == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
-	lowat_act_flag = fromside == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
+	lowat_set_flag = fromsidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
+	lowat_act_flag = fromsidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
 
 	while (1) {
 		ssize_t readlen, to_write = 0, written;
 		int more = 0;
 
 retry:
-		readlen = splice(conn->s[fromside], NULL,
-				 conn->pipe[fromside][1], NULL, c->tcp.pipe_size,
+		readlen = splice(conn->s[fromsidei], NULL,
+				 conn->pipe[fromsidei][1], NULL,
+				 c->tcp.pipe_size,
 				 SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
 		flow_trace(conn, "%zi from read-side call", readlen);
 		if (readlen < 0) {
@@ -600,8 +601,8 @@ retry:
 		}
 
 eintr:
-		written = splice(conn->pipe[fromside][0], NULL,
-				 conn->s[!fromside], NULL, to_write,
+		written = splice(conn->pipe[fromsidei][0], NULL,
+				 conn->s[!fromsidei], NULL, to_write,
 				 SPLICE_F_MOVE | more | SPLICE_F_NONBLOCK);
 		flow_trace(conn, "%zi from write-side call (passed %zi)",
 			   written, to_write);
@@ -615,7 +616,7 @@ eintr:
 			    readlen > (long)c->tcp.pipe_size / 10) {
 				int lowat = c->tcp.pipe_size / 4;
 
-				if (setsockopt(conn->s[fromside], SOL_SOCKET,
+				if (setsockopt(conn->s[fromsidei], SOL_SOCKET,
 					       SO_RCVLOWAT,
 					       &lowat, sizeof(lowat))) {
 					flow_trace(conn,
@@ -630,8 +631,8 @@ eintr:
 			break;
 		}
 
-		conn->read[fromside]    += readlen > 0 ? readlen : 0;
-		conn->written[fromside] += written > 0 ? written : 0;
+		conn->read[fromsidei]    += readlen > 0 ? readlen : 0;
+		conn->written[fromsidei] += written > 0 ? written : 0;
 
 		if (written < 0) {
 			if (errno == EINTR)
@@ -640,11 +641,11 @@ eintr:
 			if (errno != EAGAIN)
 				goto close;
 
-			if (conn->read[fromside] == conn->written[fromside])
+			if (conn->read[fromsidei] == conn->written[fromsidei])
 				break;
 
 			conn_event(c, conn,
-				   fromside == 0 ? OUT_WAIT_1 : OUT_WAIT_0);
+				   fromsidei == 0 ? OUT_WAIT_1 : OUT_WAIT_0);
 			break;
 		}
 
@@ -661,14 +662,14 @@ eintr:
 	}
 
 	if ((conn->events & FIN_RCVD_0) && !(conn->events & FIN_SENT_1)) {
-		if (conn->read[fromside] == conn->written[fromside] && eof) {
+		if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
 			shutdown(conn->s[1], SHUT_WR);
 			conn_event(c, conn, FIN_SENT_1);
 		}
 	}
 
 	if ((conn->events & FIN_RCVD_1) && !(conn->events & FIN_SENT_0)) {
-		if (conn->read[fromside] == conn->written[fromside] && eof) {
+		if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
 			shutdown(conn->s[0], SHUT_WR);
 			conn_event(c, conn, FIN_SENT_0);
 		}
@@ -680,7 +681,7 @@ eintr:
 	if ((events & (EPOLLIN | EPOLLOUT)) == (EPOLLIN | EPOLLOUT)) {
 		events = EPOLLIN;
 
-		fromside = !fromside;
+		fromsidei = !fromsidei;
 		goto swap;
 	}
 
@@ -815,19 +816,19 @@ void tcp_splice_init(struct ctx *c)
  */
 void tcp_splice_timer(const struct ctx *c, struct tcp_splice_conn *conn)
 {
-	int side;
+	unsigned sidei;
 
 	ASSERT(!(conn->flags & CLOSING));
 
-	for (side = 0; side < SIDES; side++) {
-		uint8_t set = side == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
-		uint8_t act = side == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
+	for (sidei = 0; sidei < SIDES; sidei++) {
+		uint8_t set = sidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
+		uint8_t act = sidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
 
 		if ((conn->flags & set) && !(conn->flags & act)) {
-			if (setsockopt(conn->s[side], SOL_SOCKET, SO_RCVLOWAT,
+			if (setsockopt(conn->s[sidei], SOL_SOCKET, SO_RCVLOWAT,
 				       &((int){ 1 }), sizeof(int))) {
 				flow_trace(conn, "can't set SO_RCVLOWAT on %d",
-					   conn->s[side]);
+					   conn->s[sidei]);
 			}
 			conn_flag(c, conn, ~set);
 		}
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/6] flow: Introduce flow_foreach_sidei() macro
  2024-07-17  4:52 [PATCH 0/6] Of course there are more flow table preliminaries David Gibson
  2024-07-17  4:52 ` [PATCH 1/6] flow, icmp, tcp: Clean up helpers for getting flow from index David Gibson
  2024-07-17  4:52 ` [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index David Gibson
@ 2024-07-17  4:52 ` David Gibson
  2024-07-17  4:52 ` [PATCH 4/6] tcp_splice: Use parameterised macros for per-side event/flag bits David Gibson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-07-17  4:52 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

We have a handful of places where we use a loop to step through each side
of a flow or flows, and we're probably going to have mroe in future.
Introduce a macro to implement this loop for convenience.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 flow_table.h | 6 ++++++
 tcp_splice.c | 6 +++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/flow_table.h b/flow_table.h
index 8a5f8f25..4f5d041c 100644
--- a/flow_table.h
+++ b/flow_table.h
@@ -41,6 +41,12 @@ union flow {
 extern unsigned flow_first_free;
 extern union flow flowtab[];
 
+/**
+ * flow_foreach_sidei() - 'for' type macro to step through each side of flow
+ * @sidei_:	Takes value INISIDE, then TGTSIDE
+ */
+#define flow_foreach_sidei(sidei_) \
+	for ((sidei_) = INISIDE; (sidei_) < SIDES; (sidei_)++)
 
 /** flow_idx() - Index of flow from common structure
  * @f:	Common flow fields pointer
diff --git a/tcp_splice.c b/tcp_splice.c
index 8bc68a1a..5a9325b1 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -263,7 +263,7 @@ bool tcp_splice_flow_defer(struct tcp_splice_conn *conn)
 	if (!(conn->flags & CLOSING))
 		return false;
 
-	for (sidei = 0; sidei < SIDES; sidei++) {
+	flow_foreach_sidei(sidei) {
 		/* Flushing might need to block: don't recycle them. */
 		if (conn->pipe[sidei][0] >= 0) {
 			close(conn->pipe[sidei][0]);
@@ -299,7 +299,7 @@ static int tcp_splice_connect_finish(const struct ctx *c,
 	unsigned sidei;
 	int i = 0;
 
-	for (sidei = 0; sidei < SIDES; sidei++) {
+	flow_foreach_sidei(sidei) {
 		for (; i < TCP_SPLICE_PIPE_POOL_SIZE; i++) {
 			if (splice_pipe_pool[i][0] >= 0) {
 				SWAP(conn->pipe[sidei][0],
@@ -820,7 +820,7 @@ void tcp_splice_timer(const struct ctx *c, struct tcp_splice_conn *conn)
 
 	ASSERT(!(conn->flags & CLOSING));
 
-	for (sidei = 0; sidei < SIDES; sidei++) {
+	flow_foreach_sidei(sidei) {
 		uint8_t set = sidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
 		uint8_t act = sidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
 
-- 
@@ -263,7 +263,7 @@ bool tcp_splice_flow_defer(struct tcp_splice_conn *conn)
 	if (!(conn->flags & CLOSING))
 		return false;
 
-	for (sidei = 0; sidei < SIDES; sidei++) {
+	flow_foreach_sidei(sidei) {
 		/* Flushing might need to block: don't recycle them. */
 		if (conn->pipe[sidei][0] >= 0) {
 			close(conn->pipe[sidei][0]);
@@ -299,7 +299,7 @@ static int tcp_splice_connect_finish(const struct ctx *c,
 	unsigned sidei;
 	int i = 0;
 
-	for (sidei = 0; sidei < SIDES; sidei++) {
+	flow_foreach_sidei(sidei) {
 		for (; i < TCP_SPLICE_PIPE_POOL_SIZE; i++) {
 			if (splice_pipe_pool[i][0] >= 0) {
 				SWAP(conn->pipe[sidei][0],
@@ -820,7 +820,7 @@ void tcp_splice_timer(const struct ctx *c, struct tcp_splice_conn *conn)
 
 	ASSERT(!(conn->flags & CLOSING));
 
-	for (sidei = 0; sidei < SIDES; sidei++) {
+	flow_foreach_sidei(sidei) {
 		uint8_t set = sidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
 		uint8_t act = sidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/6] tcp_splice: Use parameterised macros for per-side event/flag bits
  2024-07-17  4:52 [PATCH 0/6] Of course there are more flow table preliminaries David Gibson
                   ` (2 preceding siblings ...)
  2024-07-17  4:52 ` [PATCH 3/6] flow: Introduce flow_foreach_sidei() macro David Gibson
@ 2024-07-17  4:52 ` David Gibson
  2024-07-17  4:52 ` [PATCH 5/6] doc: Test behaviour of closing duplicate UDP sockets David Gibson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-07-17  4:52 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

Both the events and flags fields in tcp_splice_conn have several bits
which are per-side, e.g. OUT_WAIT_0 for side 0 and OUT_WAIT_1 for side 1.
This necessitates some rather awkward ternary expressions when we need
to get the relevant bit for a particular side.

Simplify this by using a parameterised macro for the bit values.  This
needs a ternary expression inside the macros, but makes the places we use
it substantially clearer.

That simplification in turn allows us to use a loop across each side to
implement several things which are currently open coded to do equivalent
things for each side in turn.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tcp_conn.h   | 15 +++++--------
 tcp_splice.c | 60 +++++++++++++++++++++++++---------------------------
 2 files changed, 34 insertions(+), 41 deletions(-)

diff --git a/tcp_conn.h b/tcp_conn.h
index 5f8c8fb6..f80ef67b 100644
--- a/tcp_conn.h
+++ b/tcp_conn.h
@@ -129,19 +129,14 @@ struct tcp_splice_conn {
 #define SPLICE_CLOSED			0
 #define SPLICE_CONNECT			BIT(0)
 #define SPLICE_ESTABLISHED		BIT(1)
-#define OUT_WAIT_0			BIT(2)
-#define OUT_WAIT_1			BIT(3)
-#define FIN_RCVD_0			BIT(4)
-#define FIN_RCVD_1			BIT(5)
-#define FIN_SENT_0			BIT(6)
-#define FIN_SENT_1			BIT(7)
+#define OUT_WAIT(sidei_)		((sidei_) ? BIT(3) : BIT(2))
+#define FIN_RCVD(sidei_)		((sidei_) ? BIT(5) : BIT(4))
+#define FIN_SENT(sidei_)		((sidei_) ? BIT(7) : BIT(6))
 
 	uint8_t flags;
 #define SPLICE_V6			BIT(0)
-#define RCVLOWAT_SET_0			BIT(1)
-#define RCVLOWAT_SET_1			BIT(2)
-#define RCVLOWAT_ACT_0			BIT(3)
-#define RCVLOWAT_ACT_1			BIT(4)
+#define RCVLOWAT_SET(sidei_)		((sidei_) ? BIT(2) : BIT(1))
+#define RCVLOWAT_ACT(sidei_)		((sidei_) ? BIT(4) : BIT(3))
 #define CLOSING				BIT(5)
 
 	uint32_t read[SIDES];
diff --git a/tcp_splice.c b/tcp_splice.c
index 5a9325b1..ffddc853 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -119,19 +119,22 @@ static struct tcp_splice_conn *conn_at_sidx(flow_sidx_t sidx)
 static void tcp_splice_conn_epoll_events(uint16_t events,
 					 struct epoll_event ev[])
 {
-	ev[0].events = ev[1].events = 0;
+	unsigned sidei;
+
+	flow_foreach_sidei(sidei)
+		ev[sidei].events = 0;
 
 	if (events & SPLICE_ESTABLISHED) {
-		if (!(events & FIN_SENT_1))
-			ev[0].events = EPOLLIN | EPOLLRDHUP;
-		if (!(events & FIN_SENT_0))
-			ev[1].events = EPOLLIN | EPOLLRDHUP;
+		flow_foreach_sidei(sidei) {
+			if (!(events & FIN_SENT(!sidei)))
+				ev[sidei].events = EPOLLIN | EPOLLRDHUP;
+		}
 	} else if (events & SPLICE_CONNECT) {
 		ev[1].events = EPOLLOUT;
 	}
 
-	ev[0].events |= (events & OUT_WAIT_0) ? EPOLLOUT : 0;
-	ev[1].events |= (events & OUT_WAIT_1) ? EPOLLOUT : 0;
+	flow_foreach_sidei(sidei)
+		ev[sidei].events |= (events & OUT_WAIT(sidei)) ? EPOLLOUT : 0;
 }
 
 /**
@@ -553,21 +556,21 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 
 	if (events & EPOLLOUT) {
 		fromsidei = !evsidei;
-		conn_event(c, conn, evsidei == 0 ? ~OUT_WAIT_0 : ~OUT_WAIT_1);
+		conn_event(c, conn, ~OUT_WAIT(evsidei));
 	} else {
 		fromsidei = evsidei;
 	}
 
 	if (events & EPOLLRDHUP)
 		/* For side 0 this is fake, but implied */
-		conn_event(c, conn, evsidei == 0 ? FIN_RCVD_0 : FIN_RCVD_1);
+		conn_event(c, conn, FIN_RCVD(evsidei));
 
 swap:
 	eof = 0;
 	never_read = 1;
 
-	lowat_set_flag = fromsidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
-	lowat_act_flag = fromsidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
+	lowat_set_flag = RCVLOWAT_SET(fromsidei);
+	lowat_act_flag = RCVLOWAT_ACT(fromsidei);
 
 	while (1) {
 		ssize_t readlen, to_write = 0, written;
@@ -644,8 +647,7 @@ eintr:
 			if (conn->read[fromsidei] == conn->written[fromsidei])
 				break;
 
-			conn_event(c, conn,
-				   fromsidei == 0 ? OUT_WAIT_1 : OUT_WAIT_0);
+			conn_event(c, conn, OUT_WAIT(fromsidei));
 			break;
 		}
 
@@ -661,21 +663,19 @@ eintr:
 			break;
 	}
 
-	if ((conn->events & FIN_RCVD_0) && !(conn->events & FIN_SENT_1)) {
-		if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
-			shutdown(conn->s[1], SHUT_WR);
-			conn_event(c, conn, FIN_SENT_1);
-		}
-	}
+	if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
+		unsigned sidei;
 
-	if ((conn->events & FIN_RCVD_1) && !(conn->events & FIN_SENT_0)) {
-		if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
-			shutdown(conn->s[0], SHUT_WR);
-			conn_event(c, conn, FIN_SENT_0);
+		flow_foreach_sidei(sidei) {
+			if ((conn->events & FIN_RCVD(sidei)) &&
+			    !(conn->events & FIN_SENT(!sidei))) {
+				shutdown(conn->s[!sidei], SHUT_WR);
+				conn_event(c, conn, FIN_SENT(!sidei));
+			}
 		}
 	}
 
-	if (CONN_HAS(conn, FIN_SENT_0 | FIN_SENT_1))
+	if (CONN_HAS(conn, FIN_SENT(0) | FIN_SENT(1)))
 		goto close;
 
 	if ((events & (EPOLLIN | EPOLLOUT)) == (EPOLLIN | EPOLLOUT)) {
@@ -821,19 +821,17 @@ void tcp_splice_timer(const struct ctx *c, struct tcp_splice_conn *conn)
 	ASSERT(!(conn->flags & CLOSING));
 
 	flow_foreach_sidei(sidei) {
-		uint8_t set = sidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
-		uint8_t act = sidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
-
-		if ((conn->flags & set) && !(conn->flags & act)) {
+		if ((conn->flags & RCVLOWAT_SET(sidei)) &&
+		    !(conn->flags & RCVLOWAT_ACT(sidei))) {
 			if (setsockopt(conn->s[sidei], SOL_SOCKET, SO_RCVLOWAT,
 				       &((int){ 1 }), sizeof(int))) {
 				flow_trace(conn, "can't set SO_RCVLOWAT on %d",
 					   conn->s[sidei]);
 			}
-			conn_flag(c, conn, ~set);
+			conn_flag(c, conn, ~RCVLOWAT_SET(sidei));
 		}
 	}
 
-	conn_flag(c, conn, ~RCVLOWAT_ACT_0);
-	conn_flag(c, conn, ~RCVLOWAT_ACT_1);
+	flow_foreach_sidei(sidei)
+		conn_flag(c, conn, ~RCVLOWAT_ACT(sidei));
 }
-- 
@@ -119,19 +119,22 @@ static struct tcp_splice_conn *conn_at_sidx(flow_sidx_t sidx)
 static void tcp_splice_conn_epoll_events(uint16_t events,
 					 struct epoll_event ev[])
 {
-	ev[0].events = ev[1].events = 0;
+	unsigned sidei;
+
+	flow_foreach_sidei(sidei)
+		ev[sidei].events = 0;
 
 	if (events & SPLICE_ESTABLISHED) {
-		if (!(events & FIN_SENT_1))
-			ev[0].events = EPOLLIN | EPOLLRDHUP;
-		if (!(events & FIN_SENT_0))
-			ev[1].events = EPOLLIN | EPOLLRDHUP;
+		flow_foreach_sidei(sidei) {
+			if (!(events & FIN_SENT(!sidei)))
+				ev[sidei].events = EPOLLIN | EPOLLRDHUP;
+		}
 	} else if (events & SPLICE_CONNECT) {
 		ev[1].events = EPOLLOUT;
 	}
 
-	ev[0].events |= (events & OUT_WAIT_0) ? EPOLLOUT : 0;
-	ev[1].events |= (events & OUT_WAIT_1) ? EPOLLOUT : 0;
+	flow_foreach_sidei(sidei)
+		ev[sidei].events |= (events & OUT_WAIT(sidei)) ? EPOLLOUT : 0;
 }
 
 /**
@@ -553,21 +556,21 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 
 	if (events & EPOLLOUT) {
 		fromsidei = !evsidei;
-		conn_event(c, conn, evsidei == 0 ? ~OUT_WAIT_0 : ~OUT_WAIT_1);
+		conn_event(c, conn, ~OUT_WAIT(evsidei));
 	} else {
 		fromsidei = evsidei;
 	}
 
 	if (events & EPOLLRDHUP)
 		/* For side 0 this is fake, but implied */
-		conn_event(c, conn, evsidei == 0 ? FIN_RCVD_0 : FIN_RCVD_1);
+		conn_event(c, conn, FIN_RCVD(evsidei));
 
 swap:
 	eof = 0;
 	never_read = 1;
 
-	lowat_set_flag = fromsidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
-	lowat_act_flag = fromsidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
+	lowat_set_flag = RCVLOWAT_SET(fromsidei);
+	lowat_act_flag = RCVLOWAT_ACT(fromsidei);
 
 	while (1) {
 		ssize_t readlen, to_write = 0, written;
@@ -644,8 +647,7 @@ eintr:
 			if (conn->read[fromsidei] == conn->written[fromsidei])
 				break;
 
-			conn_event(c, conn,
-				   fromsidei == 0 ? OUT_WAIT_1 : OUT_WAIT_0);
+			conn_event(c, conn, OUT_WAIT(fromsidei));
 			break;
 		}
 
@@ -661,21 +663,19 @@ eintr:
 			break;
 	}
 
-	if ((conn->events & FIN_RCVD_0) && !(conn->events & FIN_SENT_1)) {
-		if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
-			shutdown(conn->s[1], SHUT_WR);
-			conn_event(c, conn, FIN_SENT_1);
-		}
-	}
+	if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
+		unsigned sidei;
 
-	if ((conn->events & FIN_RCVD_1) && !(conn->events & FIN_SENT_0)) {
-		if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
-			shutdown(conn->s[0], SHUT_WR);
-			conn_event(c, conn, FIN_SENT_0);
+		flow_foreach_sidei(sidei) {
+			if ((conn->events & FIN_RCVD(sidei)) &&
+			    !(conn->events & FIN_SENT(!sidei))) {
+				shutdown(conn->s[!sidei], SHUT_WR);
+				conn_event(c, conn, FIN_SENT(!sidei));
+			}
 		}
 	}
 
-	if (CONN_HAS(conn, FIN_SENT_0 | FIN_SENT_1))
+	if (CONN_HAS(conn, FIN_SENT(0) | FIN_SENT(1)))
 		goto close;
 
 	if ((events & (EPOLLIN | EPOLLOUT)) == (EPOLLIN | EPOLLOUT)) {
@@ -821,19 +821,17 @@ void tcp_splice_timer(const struct ctx *c, struct tcp_splice_conn *conn)
 	ASSERT(!(conn->flags & CLOSING));
 
 	flow_foreach_sidei(sidei) {
-		uint8_t set = sidei == 0 ? RCVLOWAT_SET_0 : RCVLOWAT_SET_1;
-		uint8_t act = sidei == 0 ? RCVLOWAT_ACT_0 : RCVLOWAT_ACT_1;
-
-		if ((conn->flags & set) && !(conn->flags & act)) {
+		if ((conn->flags & RCVLOWAT_SET(sidei)) &&
+		    !(conn->flags & RCVLOWAT_ACT(sidei))) {
 			if (setsockopt(conn->s[sidei], SOL_SOCKET, SO_RCVLOWAT,
 				       &((int){ 1 }), sizeof(int))) {
 				flow_trace(conn, "can't set SO_RCVLOWAT on %d",
 					   conn->s[sidei]);
 			}
-			conn_flag(c, conn, ~set);
+			conn_flag(c, conn, ~RCVLOWAT_SET(sidei));
 		}
 	}
 
-	conn_flag(c, conn, ~RCVLOWAT_ACT_0);
-	conn_flag(c, conn, ~RCVLOWAT_ACT_1);
+	flow_foreach_sidei(sidei)
+		conn_flag(c, conn, ~RCVLOWAT_ACT(sidei));
 }
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/6] doc: Test behaviour of closing duplicate UDP sockets
  2024-07-17  4:52 [PATCH 0/6] Of course there are more flow table preliminaries David Gibson
                   ` (3 preceding siblings ...)
  2024-07-17  4:52 ` [PATCH 4/6] tcp_splice: Use parameterised macros for per-side event/flag bits David Gibson
@ 2024-07-17  4:52 ` David Gibson
  2024-07-17  4:52 ` [PATCH 6/6] doc: Extend zero-recv test with methods using msghdr David Gibson
  2024-07-17 15:07 ` [PATCH 0/6] Of course there are more flow table preliminaries Stefano Brivio
  6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-07-17  4:52 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

To simplify lifetime management of "listening" UDP sockets, UDP flow
support needs to duplicate existing bound sockets.  Those duplicates will
be close()d when their corresponding flow expires, but we expect the
original to still receive datagrams as always.  That is, we expect the
close() on the duplicate to remove the duplicated fd, but not to close the
underlying UDP socket.

Add a test program to doc/platform-requirements to verify this requirement.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 doc/platform-requirements/.gitignore      |   1 +
 doc/platform-requirements/Makefile        |   4 +-
 doc/platform-requirements/udp-close-dup.c | 105 ++++++++++++++++++++++
 3 files changed, 108 insertions(+), 2 deletions(-)
 create mode 100644 doc/platform-requirements/udp-close-dup.c

diff --git a/doc/platform-requirements/.gitignore b/doc/platform-requirements/.gitignore
index 555031d8..3b5a10ac 100644
--- a/doc/platform-requirements/.gitignore
+++ b/doc/platform-requirements/.gitignore
@@ -1,2 +1,3 @@
 /reuseaddr-priority
 /recv-zero
+/udp-close-dup
diff --git a/doc/platform-requirements/Makefile b/doc/platform-requirements/Makefile
index 82aaac29..6a7d3748 100644
--- a/doc/platform-requirements/Makefile
+++ b/doc/platform-requirements/Makefile
@@ -3,8 +3,8 @@
 # Copyright Red Hat
 # Author: David Gibson <david@gibson.dropbear.id.au>
 
-TARGETS = reuseaddr-priority recv-zero
-SRCS = reuseaddr-priority.c recv-zero.c
+TARGETS = reuseaddr-priority recv-zero udp-close-dup
+SRCS = reuseaddr-priority.c recv-zero.c udp-close-dup.c
 CFLAGS = -Wall
 
 all: cppcheck clang-tidy $(TARGETS:%=check-%)
diff --git a/doc/platform-requirements/udp-close-dup.c b/doc/platform-requirements/udp-close-dup.c
new file mode 100644
index 00000000..99060fcb
--- /dev/null
+++ b/doc/platform-requirements/udp-close-dup.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+/* udp-close-dup.c
+ *
+ * Verify that closing one dup() of a UDP socket won't stop other dups from
+ * receiving packets.
+ *
+ * Copyright Red Hat
+ * Author: David Gibson <david@gibson.dropbear.id.au>
+ */
+
+#include <arpa/inet.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <net/if.h>
+#include <netinet/in.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "common.h"
+
+#define DSTPORT	13257U
+
+/* 127.0.0.1:DSTPORT */
+static const struct sockaddr_in lo_dst = SOCKADDR_INIT(INADDR_LOOPBACK, DSTPORT);
+
+enum dup_method {
+	DUP_DUP,
+	DUP_FCNTL,
+	NUM_METHODS,
+};
+
+static void test_close_dup(enum dup_method method)
+{
+	long token;
+	int s1, s2, send_s;
+	ssize_t rc;
+
+	s1 = sock_reuseaddr();
+	if (bind(s1, (struct sockaddr *)&lo_dst, sizeof(lo_dst)) < 0)
+		die("bind(): %s\n", strerror(errno));
+
+	send_s = sock_reuseaddr();
+	if (connect(send_s, (struct sockaddr *)&lo_dst, sizeof(lo_dst)) < 0)
+		die("connect(): %s\n", strerror(errno));
+
+	/* Receive before duplicating */
+	token = random();
+	send_token(send_s, token);
+	recv_token(s1, token);
+
+	switch (method) {
+	case DUP_DUP:
+		/* NOLINTNEXTLINE(android-cloexec-dup) */
+		s2 = dup(s1);
+		if (s2 < 0)
+			die("dup(): %s\n", strerror(errno));
+		break;
+	case DUP_FCNTL:
+		s2 = fcntl(s1, F_DUPFD_CLOEXEC, 0);
+		if (s2 < 0)
+			die("F_DUPFD_CLOEXEC: %s\n", strerror(errno));
+		break;
+	default:
+		die("Bad method\n");
+	}
+
+	/* Receive via original handle */
+	token = random();
+	send_token(send_s, token);
+	recv_token(s1, token);
+
+	/* Receive via duplicated handle */
+	token = random();
+	send_token(send_s, token);
+	recv_token(s2, token);
+
+	/* Close duplicate */
+	rc = close(s2);
+	if (rc < 0)
+		die("close() dup: %s\n", strerror(errno));
+
+	/* Receive after closing duplicate */
+	token = random();
+	send_token(send_s, token);
+	recv_token(s1, token);
+}
+
+int main(int argc, char *argv[])
+{
+	enum dup_method method;
+
+	(void)argc;
+	(void)argv;
+
+	for (method = 0; method < NUM_METHODS; method++)
+		test_close_dup(method);
+
+	printf("Closing dup()ed UDP sockets seems to work as expected\n");
+
+	exit(0);
+}
-- 
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+/* udp-close-dup.c
+ *
+ * Verify that closing one dup() of a UDP socket won't stop other dups from
+ * receiving packets.
+ *
+ * Copyright Red Hat
+ * Author: David Gibson <david@gibson.dropbear.id.au>
+ */
+
+#include <arpa/inet.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <net/if.h>
+#include <netinet/in.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "common.h"
+
+#define DSTPORT	13257U
+
+/* 127.0.0.1:DSTPORT */
+static const struct sockaddr_in lo_dst = SOCKADDR_INIT(INADDR_LOOPBACK, DSTPORT);
+
+enum dup_method {
+	DUP_DUP,
+	DUP_FCNTL,
+	NUM_METHODS,
+};
+
+static void test_close_dup(enum dup_method method)
+{
+	long token;
+	int s1, s2, send_s;
+	ssize_t rc;
+
+	s1 = sock_reuseaddr();
+	if (bind(s1, (struct sockaddr *)&lo_dst, sizeof(lo_dst)) < 0)
+		die("bind(): %s\n", strerror(errno));
+
+	send_s = sock_reuseaddr();
+	if (connect(send_s, (struct sockaddr *)&lo_dst, sizeof(lo_dst)) < 0)
+		die("connect(): %s\n", strerror(errno));
+
+	/* Receive before duplicating */
+	token = random();
+	send_token(send_s, token);
+	recv_token(s1, token);
+
+	switch (method) {
+	case DUP_DUP:
+		/* NOLINTNEXTLINE(android-cloexec-dup) */
+		s2 = dup(s1);
+		if (s2 < 0)
+			die("dup(): %s\n", strerror(errno));
+		break;
+	case DUP_FCNTL:
+		s2 = fcntl(s1, F_DUPFD_CLOEXEC, 0);
+		if (s2 < 0)
+			die("F_DUPFD_CLOEXEC: %s\n", strerror(errno));
+		break;
+	default:
+		die("Bad method\n");
+	}
+
+	/* Receive via original handle */
+	token = random();
+	send_token(send_s, token);
+	recv_token(s1, token);
+
+	/* Receive via duplicated handle */
+	token = random();
+	send_token(send_s, token);
+	recv_token(s2, token);
+
+	/* Close duplicate */
+	rc = close(s2);
+	if (rc < 0)
+		die("close() dup: %s\n", strerror(errno));
+
+	/* Receive after closing duplicate */
+	token = random();
+	send_token(send_s, token);
+	recv_token(s1, token);
+}
+
+int main(int argc, char *argv[])
+{
+	enum dup_method method;
+
+	(void)argc;
+	(void)argv;
+
+	for (method = 0; method < NUM_METHODS; method++)
+		test_close_dup(method);
+
+	printf("Closing dup()ed UDP sockets seems to work as expected\n");
+
+	exit(0);
+}
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/6] doc: Extend zero-recv test with methods using msghdr
  2024-07-17  4:52 [PATCH 0/6] Of course there are more flow table preliminaries David Gibson
                   ` (4 preceding siblings ...)
  2024-07-17  4:52 ` [PATCH 5/6] doc: Test behaviour of closing duplicate UDP sockets David Gibson
@ 2024-07-17  4:52 ` David Gibson
  2024-07-17 15:06   ` Stefano Brivio
  2024-07-17 15:07 ` [PATCH 0/6] Of course there are more flow table preliminaries Stefano Brivio
  6 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2024-07-17  4:52 UTC (permalink / raw)
  To: passt-dev, Stefano Brivio; +Cc: David Gibson

This test program verifies that we can receive and discard datagrams by
using recv() with a NULL buffer and zero-length.  Extend it to verify it
also works using recvmsg() and either an iov with a zero-length NULL
buffer or an iov that itself is NULL and zero-length.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 doc/platform-requirements/recv-zero.c | 60 +++++++++++++++++++++++----
 1 file changed, 52 insertions(+), 8 deletions(-)

diff --git a/doc/platform-requirements/recv-zero.c b/doc/platform-requirements/recv-zero.c
index f161e5c2..ab27704d 100644
--- a/doc/platform-requirements/recv-zero.c
+++ b/doc/platform-requirements/recv-zero.c
@@ -23,11 +23,27 @@
 
 #define DSTPORT	13257U
 
+enum discard_method {
+	DISCARD_NULL_BUF,
+	DISCARD_ZERO_IOV,
+	DISCARD_NULL_IOV,
+	NUM_METHODS,
+};
+
 /* 127.0.0.1:DSTPORT */
 static const struct sockaddr_in lo_dst = SOCKADDR_INIT(INADDR_LOOPBACK, DSTPORT);
 
-static void test_discard(void)
+static void test_discard(enum discard_method method)
 {
+	struct iovec zero_iov = { .iov_base = NULL, .iov_len = 0, };
+	struct msghdr mh_zero = {
+		.msg_iov = &zero_iov,
+		.msg_iovlen = 1,
+	};
+	struct msghdr mh_null = {
+		.msg_iov = NULL,
+		.msg_iovlen = 0,
+	};
 	long token1, token2;
 	int recv_s, send_s;
 	ssize_t rc;
@@ -46,11 +62,36 @@ static void test_discard(void)
 	send_token(send_s, token1);
 	send_token(send_s, token2);
 
-	/* cppcheck-suppress nullPointer */
-	rc = recv(recv_s, NULL, 0, MSG_DONTWAIT);
-	if (rc < 0)
-		die("discarding recv(): %s\n", strerror(errno));
-	
+	switch (method) {
+	case DISCARD_NULL_BUF:
+		/* cppcheck-suppress nullPointer */
+		rc = recv(recv_s, NULL, 0, MSG_DONTWAIT);
+		if (rc < 0)
+			die("discarding recv(): %s\n", strerror(errno));
+		break;
+
+	case DISCARD_ZERO_IOV:
+		rc = recvmsg(recv_s, &mh_zero, MSG_DONTWAIT);
+		if (rc < 0)
+			die("recvmsg() with zero-length buffer: %s\n",
+			    strerror(errno));
+		if (!((unsigned)mh_zero.msg_flags & MSG_TRUNC))
+			die("Missing MSG_TRUNC flag\n");
+		break;
+
+	case DISCARD_NULL_IOV:
+		rc = recvmsg(recv_s, &mh_null, MSG_DONTWAIT);
+		if (rc < 0)
+			die("recvmsg() with zero-length iov: %s\n",
+			    strerror(errno));
+		if (!((unsigned)mh_null.msg_flags & MSG_TRUNC))
+			die("Missing MSG_TRUNC flag\n");
+		break;
+
+	default:
+		die("Bad method\n");
+	}
+
 	recv_token(recv_s, token2);
 
 	/* cppcheck-suppress nullPointer */
@@ -63,12 +104,15 @@ static void test_discard(void)
 
 int main(int argc, char *argv[])
 {
+	enum discard_method method;
+
 	(void)argc;
 	(void)argv;
 
-	test_discard();
+	for (method = 0; method < NUM_METHODS; method++)
+		test_discard(method);
 
-	printf("Discarding datagrams with a 0-length recv() seems to work\n");
+	printf("Discarding datagrams with a 0-length receives seems to work\n");
 
 	exit(0);
 }
-- 
@@ -23,11 +23,27 @@
 
 #define DSTPORT	13257U
 
+enum discard_method {
+	DISCARD_NULL_BUF,
+	DISCARD_ZERO_IOV,
+	DISCARD_NULL_IOV,
+	NUM_METHODS,
+};
+
 /* 127.0.0.1:DSTPORT */
 static const struct sockaddr_in lo_dst = SOCKADDR_INIT(INADDR_LOOPBACK, DSTPORT);
 
-static void test_discard(void)
+static void test_discard(enum discard_method method)
 {
+	struct iovec zero_iov = { .iov_base = NULL, .iov_len = 0, };
+	struct msghdr mh_zero = {
+		.msg_iov = &zero_iov,
+		.msg_iovlen = 1,
+	};
+	struct msghdr mh_null = {
+		.msg_iov = NULL,
+		.msg_iovlen = 0,
+	};
 	long token1, token2;
 	int recv_s, send_s;
 	ssize_t rc;
@@ -46,11 +62,36 @@ static void test_discard(void)
 	send_token(send_s, token1);
 	send_token(send_s, token2);
 
-	/* cppcheck-suppress nullPointer */
-	rc = recv(recv_s, NULL, 0, MSG_DONTWAIT);
-	if (rc < 0)
-		die("discarding recv(): %s\n", strerror(errno));
-	
+	switch (method) {
+	case DISCARD_NULL_BUF:
+		/* cppcheck-suppress nullPointer */
+		rc = recv(recv_s, NULL, 0, MSG_DONTWAIT);
+		if (rc < 0)
+			die("discarding recv(): %s\n", strerror(errno));
+		break;
+
+	case DISCARD_ZERO_IOV:
+		rc = recvmsg(recv_s, &mh_zero, MSG_DONTWAIT);
+		if (rc < 0)
+			die("recvmsg() with zero-length buffer: %s\n",
+			    strerror(errno));
+		if (!((unsigned)mh_zero.msg_flags & MSG_TRUNC))
+			die("Missing MSG_TRUNC flag\n");
+		break;
+
+	case DISCARD_NULL_IOV:
+		rc = recvmsg(recv_s, &mh_null, MSG_DONTWAIT);
+		if (rc < 0)
+			die("recvmsg() with zero-length iov: %s\n",
+			    strerror(errno));
+		if (!((unsigned)mh_null.msg_flags & MSG_TRUNC))
+			die("Missing MSG_TRUNC flag\n");
+		break;
+
+	default:
+		die("Bad method\n");
+	}
+
 	recv_token(recv_s, token2);
 
 	/* cppcheck-suppress nullPointer */
@@ -63,12 +104,15 @@ static void test_discard(void)
 
 int main(int argc, char *argv[])
 {
+	enum discard_method method;
+
 	(void)argc;
 	(void)argv;
 
-	test_discard();
+	for (method = 0; method < NUM_METHODS; method++)
+		test_discard(method);
 
-	printf("Discarding datagrams with a 0-length recv() seems to work\n");
+	printf("Discarding datagrams with a 0-length receives seems to work\n");
 
 	exit(0);
 }
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index
  2024-07-17  4:52 ` [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index David Gibson
@ 2024-07-17  5:20   ` Stefano Brivio
  2024-07-17  5:28     ` David Gibson
  0 siblings, 1 reply; 14+ messages in thread
From: Stefano Brivio @ 2024-07-17  5:20 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Wed, 17 Jul 2024 14:52:19 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> @@ -164,17 +164,17 @@ struct flow_common {
>  
>  /**
>   * struct flow_sidx - ID for one side of a specific flow
> - * @side:	Side referenced (0 or 1)
> + * @side:	Index of side referenced (0 or 1)
>   * @flow:	Index of flow referenced

Nit (I can fix this up on merge, and I didn't finish reviewing yet):
the comment to the struct should also use sidei/flowi.

>   */
>  typedef struct flow_sidx {
> -	unsigned	side :1;
> -	unsigned	flow :FLOW_INDEX_BITS;
> +	unsigned	sidei :1;
> +	unsigned	flowi :FLOW_INDEX_BITS;
>  } flow_sidx_t;

-- 
Stefano


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index
  2024-07-17  5:20   ` Stefano Brivio
@ 2024-07-17  5:28     ` David Gibson
  2024-07-17  5:50       ` David Gibson
  0 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2024-07-17  5:28 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 1064 bytes --]

On Wed, Jul 17, 2024 at 07:20:17AM +0200, Stefano Brivio wrote:
> On Wed, 17 Jul 2024 14:52:19 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > @@ -164,17 +164,17 @@ struct flow_common {
> >  
> >  /**
> >   * struct flow_sidx - ID for one side of a specific flow
> > - * @side:	Side referenced (0 or 1)
> > + * @side:	Index of side referenced (0 or 1)
> >   * @flow:	Index of flow referenced
> 
> Nit (I can fix this up on merge, and I didn't finish reviewing yet):
> the comment to the struct should also use sidei/flowi.

If this is the only issue, go ahead and fix on merge.  If there are
any more nits I'll make a new spin.

> 
> >   */
> >  typedef struct flow_sidx {
> > -	unsigned	side :1;
> > -	unsigned	flow :FLOW_INDEX_BITS;
> > +	unsigned	sidei :1;
> > +	unsigned	flowi :FLOW_INDEX_BITS;
> >  } flow_sidx_t;
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index
  2024-07-17  5:28     ` David Gibson
@ 2024-07-17  5:50       ` David Gibson
  2024-07-17  7:20         ` Stefano Brivio
  0 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2024-07-17  5:50 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 1301 bytes --]

On Wed, Jul 17, 2024 at 03:28:05PM +1000, David Gibson wrote:
> On Wed, Jul 17, 2024 at 07:20:17AM +0200, Stefano Brivio wrote:
> > On Wed, 17 Jul 2024 14:52:19 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> > 
> > > @@ -164,17 +164,17 @@ struct flow_common {
> > >  
> > >  /**
> > >   * struct flow_sidx - ID for one side of a specific flow
> > > - * @side:	Side referenced (0 or 1)
> > > + * @side:	Index of side referenced (0 or 1)
> > >   * @flow:	Index of flow referenced
> > 
> > Nit (I can fix this up on merge, and I didn't finish reviewing yet):
> > the comment to the struct should also use sidei/flowi.
> 
> If this is the only issue, go ahead and fix on merge.  If there are
> any more nits I'll make a new spin.

Actually, never mind.  I discovered I hadn't rebased onto the latest
and there's a conflict, so I'll respin anyway.

> > >   */
> > >  typedef struct flow_sidx {
> > > -	unsigned	side :1;
> > > -	unsigned	flow :FLOW_INDEX_BITS;
> > > +	unsigned	sidei :1;
> > > +	unsigned	flowi :FLOW_INDEX_BITS;
> > >  } flow_sidx_t;
> > 
> 



-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index
  2024-07-17  5:50       ` David Gibson
@ 2024-07-17  7:20         ` Stefano Brivio
  2024-07-17  7:27           ` David Gibson
  0 siblings, 1 reply; 14+ messages in thread
From: Stefano Brivio @ 2024-07-17  7:20 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Wed, 17 Jul 2024 15:50:32 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Jul 17, 2024 at 03:28:05PM +1000, David Gibson wrote:
> > On Wed, Jul 17, 2024 at 07:20:17AM +0200, Stefano Brivio wrote:  
> > > On Wed, 17 Jul 2024 14:52:19 +1000
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >   
> > > > @@ -164,17 +164,17 @@ struct flow_common {
> > > >  
> > > >  /**
> > > >   * struct flow_sidx - ID for one side of a specific flow
> > > > - * @side:	Side referenced (0 or 1)
> > > > + * @side:	Index of side referenced (0 or 1)
> > > >   * @flow:	Index of flow referenced  
> > > 
> > > Nit (I can fix this up on merge, and I didn't finish reviewing yet):
> > > the comment to the struct should also use sidei/flowi.  
> > 
> > If this is the only issue, go ahead and fix on merge.  If there are
> > any more nits I'll make a new spin.  
> 
> Actually, never mind.  I discovered I hadn't rebased onto the latest
> and there's a conflict, so I'll respin anyway.

It's trivial, just a bit of fuzz on 1/6, no need as far as I'm
concerned.

-- 
Stefano


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index
  2024-07-17  7:20         ` Stefano Brivio
@ 2024-07-17  7:27           ` David Gibson
  0 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-07-17  7:27 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 1502 bytes --]

On Wed, Jul 17, 2024 at 09:20:42AM +0200, Stefano Brivio wrote:
> On Wed, 17 Jul 2024 15:50:32 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Wed, Jul 17, 2024 at 03:28:05PM +1000, David Gibson wrote:
> > > On Wed, Jul 17, 2024 at 07:20:17AM +0200, Stefano Brivio wrote:  
> > > > On Wed, 17 Jul 2024 14:52:19 +1000
> > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > >   
> > > > > @@ -164,17 +164,17 @@ struct flow_common {
> > > > >  
> > > > >  /**
> > > > >   * struct flow_sidx - ID for one side of a specific flow
> > > > > - * @side:	Side referenced (0 or 1)
> > > > > + * @side:	Index of side referenced (0 or 1)
> > > > >   * @flow:	Index of flow referenced  
> > > > 
> > > > Nit (I can fix this up on merge, and I didn't finish reviewing yet):
> > > > the comment to the struct should also use sidei/flowi.  
> > > 
> > > If this is the only issue, go ahead and fix on merge.  If there are
> > > any more nits I'll make a new spin.  
> > 
> > Actually, never mind.  I discovered I hadn't rebased onto the latest
> > and there's a conflict, so I'll respin anyway.
> 
> It's trivial, just a bit of fuzz on 1/6, no need as far as I'm
> concerned.

Ok.  Well go ahead and apply then, if you have no additional revisions
to suggest.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/6] doc: Extend zero-recv test with methods using msghdr
  2024-07-17  4:52 ` [PATCH 6/6] doc: Extend zero-recv test with methods using msghdr David Gibson
@ 2024-07-17 15:06   ` Stefano Brivio
  0 siblings, 0 replies; 14+ messages in thread
From: Stefano Brivio @ 2024-07-17 15:06 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Wed, 17 Jul 2024 14:52:23 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> This test program verifies that we can receive and discard datagrams by
> using recv() with a NULL buffer and zero-length.  Extend it to verify it
> also works using recvmsg() and either an iov with a zero-length NULL
> buffer or an iov that itself is NULL and zero-length.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  doc/platform-requirements/recv-zero.c | 60 +++++++++++++++++++++++----
>  1 file changed, 52 insertions(+), 8 deletions(-)
> 
> diff --git a/doc/platform-requirements/recv-zero.c b/doc/platform-requirements/recv-zero.c
> index f161e5c2..ab27704d 100644
> --- a/doc/platform-requirements/recv-zero.c
> +++ b/doc/platform-requirements/recv-zero.c
> @@ -23,11 +23,27 @@
>  
>  #define DSTPORT	13257U
>  
> +enum discard_method {
> +	DISCARD_NULL_BUF,
> +	DISCARD_ZERO_IOV,
> +	DISCARD_NULL_IOV,
> +	NUM_METHODS,
> +};
> +
>  /* 127.0.0.1:DSTPORT */
>  static const struct sockaddr_in lo_dst = SOCKADDR_INIT(INADDR_LOOPBACK, DSTPORT);
>  
> -static void test_discard(void)
> +static void test_discard(enum discard_method method)
>  {
> +	struct iovec zero_iov = { .iov_base = NULL, .iov_len = 0, };
> +	struct msghdr mh_zero = {
> +		.msg_iov = &zero_iov,
> +		.msg_iovlen = 1,
> +	};
> +	struct msghdr mh_null = {
> +		.msg_iov = NULL,
> +		.msg_iovlen = 0,
> +	};
>  	long token1, token2;
>  	int recv_s, send_s;
>  	ssize_t rc;
> @@ -46,11 +62,36 @@ static void test_discard(void)
>  	send_token(send_s, token1);
>  	send_token(send_s, token2);
>  
> -	/* cppcheck-suppress nullPointer */
> -	rc = recv(recv_s, NULL, 0, MSG_DONTWAIT);
> -	if (rc < 0)
> -		die("discarding recv(): %s\n", strerror(errno));
> -	
> +	switch (method) {
> +	case DISCARD_NULL_BUF:
> +		/* cppcheck-suppress nullPointer */
> +		rc = recv(recv_s, NULL, 0, MSG_DONTWAIT);
> +		if (rc < 0)
> +			die("discarding recv(): %s\n", strerror(errno));
> +		break;
> +
> +	case DISCARD_ZERO_IOV:
> +		rc = recvmsg(recv_s, &mh_zero, MSG_DONTWAIT);
> +		if (rc < 0)
> +			die("recvmsg() with zero-length buffer: %s\n",
> +			    strerror(errno));
> +		if (!((unsigned)mh_zero.msg_flags & MSG_TRUNC))
> +			die("Missing MSG_TRUNC flag\n");
> +		break;
> +
> +	case DISCARD_NULL_IOV:
> +		rc = recvmsg(recv_s, &mh_null, MSG_DONTWAIT);
> +		if (rc < 0)
> +			die("recvmsg() with zero-length iov: %s\n",
> +			    strerror(errno));
> +		if (!((unsigned)mh_null.msg_flags & MSG_TRUNC))
> +			die("Missing MSG_TRUNC flag\n");
> +		break;
> +
> +	default:
> +		die("Bad method\n");
> +	}
> +
>  	recv_token(recv_s, token2);
>  
>  	/* cppcheck-suppress nullPointer */
> @@ -63,12 +104,15 @@ static void test_discard(void)
>  
>  int main(int argc, char *argv[])
>  {
> +	enum discard_method method;
> +
>  	(void)argc;
>  	(void)argv;
>  
> -	test_discard();
> +	for (method = 0; method < NUM_METHODS; method++)
> +		test_discard(method);
>  
> -	printf("Discarding datagrams with a 0-length recv() seems to work\n");
> +	printf("Discarding datagrams with a 0-length receives seems to work\n");

Nit: also fixed up on merge: "with 0-length receives".

>  
>  	exit(0);
>  }

-- 
Stefano


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] Of course there are more flow table preliminaries
  2024-07-17  4:52 [PATCH 0/6] Of course there are more flow table preliminaries David Gibson
                   ` (5 preceding siblings ...)
  2024-07-17  4:52 ` [PATCH 6/6] doc: Extend zero-recv test with methods using msghdr David Gibson
@ 2024-07-17 15:07 ` Stefano Brivio
  6 siblings, 0 replies; 14+ messages in thread
From: Stefano Brivio @ 2024-07-17 15:07 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Wed, 17 Jul 2024 14:52:17 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> This series has some further preliminaries for the flow table.
> Specifically this is focused on some clean ups to handling of indicies
> for flows and sides.  We also add some additional test programs for
> platform requirements.
> 
> This is based on the series adding UDP error handling.
> 
> David Gibson (6):
>   flow, icmp, tcp: Clean up helpers for getting flow from index
>   flow, tcp_splice: Prefer 'sidei' for variables referring to side index
>   flow: Introduce flow_foreach_sidei() macro
>   tcp_splice: Use parameterised macros for per-side event/flag bits
>   doc: Test behaviour of closing duplicate UDP sockets
>   doc: Extend zero-recv test with methods using msghdr

Applied.

-- 
Stefano


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-07-17 15:08 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-17  4:52 [PATCH 0/6] Of course there are more flow table preliminaries David Gibson
2024-07-17  4:52 ` [PATCH 1/6] flow, icmp, tcp: Clean up helpers for getting flow from index David Gibson
2024-07-17  4:52 ` [PATCH 2/6] flow, tcp_splice: Prefer 'sidei' for variables referring to side index David Gibson
2024-07-17  5:20   ` Stefano Brivio
2024-07-17  5:28     ` David Gibson
2024-07-17  5:50       ` David Gibson
2024-07-17  7:20         ` Stefano Brivio
2024-07-17  7:27           ` David Gibson
2024-07-17  4:52 ` [PATCH 3/6] flow: Introduce flow_foreach_sidei() macro David Gibson
2024-07-17  4:52 ` [PATCH 4/6] tcp_splice: Use parameterised macros for per-side event/flag bits David Gibson
2024-07-17  4:52 ` [PATCH 5/6] doc: Test behaviour of closing duplicate UDP sockets David Gibson
2024-07-17  4:52 ` [PATCH 6/6] doc: Extend zero-recv test with methods using msghdr David Gibson
2024-07-17 15:06   ` Stefano Brivio
2024-07-17 15:07 ` [PATCH 0/6] Of course there are more flow table preliminaries Stefano Brivio

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).