* [PATCH 1/6] tcp_splice: Improve error reporting
2026-05-20 13:08 [PATCH 0/6] Fix race condition while closing spliced connections David Gibson
@ 2026-05-20 13:08 ` David Gibson
2026-05-20 14:31 ` Stefano Brivio
2026-05-20 13:08 ` [PATCH 2/6] tcp_splice: Avoid missing EOF recognition while forwarding David Gibson
` (4 subsequent siblings)
5 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2026-05-20 13:08 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: Paul Holzinger, David Gibson
A number of things can, at least theoretically, go wrong when forwarding
data across a spliced connection. We generally handle this by resetting
the connection on both sides. However, in many cases we don't log any
message about why the connection was reset, which can make it hard to
debug why this is happening.
Add a bunch of debug and error logging to make this easier to figure out.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp_splice.c | 31 +++++++++++++++++++++++--------
1 file changed, 23 insertions(+), 8 deletions(-)
diff --git a/tcp_splice.c b/tcp_splice.c
index 42ee8abc..1359d6b8 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -502,15 +502,18 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
if (rc)
flow_perror(conn, "Error retrieving SO_ERROR");
else
- flow_trace(conn, "Error event on socket: %s",
- strerror_(err));
-
+ flow_dbg(conn, "Error event on %s socket: %s",
+ pif_name(conn->f.pif[evsidei]),
+ strerror_(err));
goto reset;
}
if (conn->events == SPLICE_CONNECT) {
- if (!(events & EPOLLOUT))
+ if (!(events & EPOLLOUT)) {
+ flow_err(conn, "Unexpected events 0x%x during connect",
+ events);
goto reset;
+ }
if (tcp_splice_connect_finish(c, conn))
goto reset;
}
@@ -545,8 +548,11 @@ retry:
SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
while (readlen < 0 && errno == EINTR);
- if (readlen < 0 && errno != EAGAIN)
+ if (readlen < 0 && errno != EAGAIN) {
+ flow_perror(conn, "Splicing from %s socket",
+ pif_name(conn->f.pif[fromsidei]));
goto reset;
+ }
flow_trace(conn, "%zi from read-side call", readlen);
@@ -569,8 +575,11 @@ retry:
SPLICE_F_MOVE | more | SPLICE_F_NONBLOCK);
while (written < 0 && errno == EINTR);
- if (written < 0 && errno != EAGAIN)
+ if (written < 0 && errno != EAGAIN) {
+ flow_perror(conn, "Splicing to %s socket",
+ pif_name(conn->f.pif[!fromsidei]));
goto reset;
+ }
flow_trace(conn, "%zi from write-side call (passed %zi)",
written, c->tcp.pipe_size);
@@ -627,8 +636,11 @@ retry:
flow_foreach_sidei(sidei) {
if ((conn->events & FIN_RCVD(sidei)) &&
!(conn->events & FIN_SENT(!sidei))) {
- if (shutdown(conn->s[!sidei], SHUT_WR) < 0)
+ if (shutdown(conn->s[!sidei], SHUT_WR) < 0) {
+ flow_perror(conn, "shutdown() on %s",
+ pif_name(conn->f.pif[!sidei]));
goto reset;
+ }
conn_event(conn, FIN_SENT(!sidei));
}
}
@@ -647,8 +659,11 @@ retry:
goto swap;
}
- if (events & EPOLLHUP)
+ if (events & EPOLLHUP) {
+ flow_dbg(conn, "Hangup from %s socket",
+ pif_name(conn->f.pif[evsidei]));
goto reset;
+ }
return;
--
2.54.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/6] tcp_splice: Improve error reporting
2026-05-20 13:08 ` [PATCH 1/6] tcp_splice: Improve error reporting David Gibson
@ 2026-05-20 14:31 ` Stefano Brivio
0 siblings, 0 replies; 8+ messages in thread
From: Stefano Brivio @ 2026-05-20 14:31 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev, Paul Holzinger, Anshu Kumari
On Wed, 20 May 2026 23:08:46 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> A number of things can, at least theoretically, go wrong when forwarding
> data across a spliced connection. We generally handle this by resetting
> the connection on both sides. However, in many cases we don't log any
> message about why the connection was reset, which can make it hard to
> debug why this is happening.
>
> Add a bunch of debug and error logging to make this easier to figure out.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> tcp_splice.c | 31 +++++++++++++++++++++++--------
> 1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/tcp_splice.c b/tcp_splice.c
> index 42ee8abc..1359d6b8 100644
> --- a/tcp_splice.c
> +++ b/tcp_splice.c
> @@ -502,15 +502,18 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
> if (rc)
> flow_perror(conn, "Error retrieving SO_ERROR");
> else
> - flow_trace(conn, "Error event on socket: %s",
> - strerror_(err));
> -
> + flow_dbg(conn, "Error event on %s socket: %s",
> + pif_name(conn->f.pif[evsidei]),
> + strerror_(err));
> goto reset;
> }
>
> if (conn->events == SPLICE_CONNECT) {
> - if (!(events & EPOLLOUT))
> + if (!(events & EPOLLOUT)) {
> + flow_err(conn, "Unexpected events 0x%x during connect",
> + events);
Shouldn't all the flow_err() and flow_perror() calls here be
ratelimited, that is, eventually calling the err_ratelimit() function
Anshu introduced recently?
We don't have helpers ready for flow_err() and flow_perror(), I was
about to post a patch that would go before this series but I'm not sure
if there's a specific reason to avoid those.
> goto reset;
> + }
> if (tcp_splice_connect_finish(c, conn))
> goto reset;
> }
> @@ -545,8 +548,11 @@ retry:
> SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
> while (readlen < 0 && errno == EINTR);
>
> - if (readlen < 0 && errno != EAGAIN)
> + if (readlen < 0 && errno != EAGAIN) {
> + flow_perror(conn, "Splicing from %s socket",
> + pif_name(conn->f.pif[fromsidei]));
> goto reset;
> + }
>
> flow_trace(conn, "%zi from read-side call", readlen);
>
> @@ -569,8 +575,11 @@ retry:
> SPLICE_F_MOVE | more | SPLICE_F_NONBLOCK);
> while (written < 0 && errno == EINTR);
>
> - if (written < 0 && errno != EAGAIN)
> + if (written < 0 && errno != EAGAIN) {
> + flow_perror(conn, "Splicing to %s socket",
> + pif_name(conn->f.pif[!fromsidei]));
> goto reset;
> + }
>
> flow_trace(conn, "%zi from write-side call (passed %zi)",
> written, c->tcp.pipe_size);
> @@ -627,8 +636,11 @@ retry:
> flow_foreach_sidei(sidei) {
> if ((conn->events & FIN_RCVD(sidei)) &&
> !(conn->events & FIN_SENT(!sidei))) {
> - if (shutdown(conn->s[!sidei], SHUT_WR) < 0)
> + if (shutdown(conn->s[!sidei], SHUT_WR) < 0) {
> + flow_perror(conn, "shutdown() on %s",
> + pif_name(conn->f.pif[!sidei]));
> goto reset;
> + }
> conn_event(conn, FIN_SENT(!sidei));
> }
> }
> @@ -647,8 +659,11 @@ retry:
> goto swap;
> }
>
> - if (events & EPOLLHUP)
> + if (events & EPOLLHUP) {
> + flow_dbg(conn, "Hangup from %s socket",
> + pif_name(conn->f.pif[evsidei]));
> goto reset;
> + }
>
> return;
>
--
Stefano
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/6] tcp_splice: Avoid missing EOF recognition while forwarding
2026-05-20 13:08 [PATCH 0/6] Fix race condition while closing spliced connections David Gibson
2026-05-20 13:08 ` [PATCH 1/6] tcp_splice: Improve error reporting David Gibson
@ 2026-05-20 13:08 ` David Gibson
2026-05-20 13:08 ` [PATCH 3/6] tcp_splice: Clean up flow control path for splice forwarding David Gibson
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2026-05-20 13:08 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: Paul Holzinger, David Gibson
tcp_splice_sock_handler() has an optimised path for the common case where
the amount we splice(2) into the pipe is exactly the same as the amount we
splice(2) out again. If the pipe is empty at that point, we stop
forwarding until we get another epoll event.
However, via a subtle chain of events, this can cause a bug for a
half-closed connection. Suppose the connection is already half-closed in
the other direction - that is, we've already called shutdown(SHUT_WR) on
the socket for which we're getting the event. In this event we're getting
the last batch of data in the other direction, and also a FIN. This can
result in EPOLLIN, EPOLLRDHUP and EPOLLHUP events simultaneously.
We read the last data from the socket and successfully splice it to the
other side. Since there is no data in the pipe, we exit the forwarding
loop. However, because we did read data, we don't set the eof flag.
Because we don't set eof, we don't (yet) propagate the FIN to the other
side, or set FIN_SENT(!fromsidei). Therefore we don't (yet) recognize
this as a clean termination and set the CLOSING flag. We would correct
this when we get our next event, however before we can do so we process
the EPOLLHUP event. Because we haven't recognized this as a clean close
we assume it is an abrupt close and send an RST to the other side.
To avoid this, don't stop attempting to forward data on this path.
Continue for at least one more loop. If we're at EOF, we'll recognize it
on the next splice(2). If not, it gives us an opportunity to forward more
data without returning to the main epoll loop.
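The core of the problem described above is that a read which returns data
says nothing about EOF: the zero-length read only shows up on the following
call. A minimal standalone sketch of that behaviour, assuming an AF_UNIX
socket pair and plain read(2) in place of TCP sockets and splice(2). The
names reads_until_eof() and demo() are hypothetical and not part of passt:

```c
#include <sys/socket.h>
#include <unistd.h>

/* Read from fd at most max_reads times. Return the number of read() calls
 * it took to observe EOF (a zero-length read), or -1 if we stopped before
 * seeing it (or hit an error). */
int reads_until_eof(int fd, int max_reads)
{
	char buf[64];
	int i;

	for (i = 1; i <= max_reads; i++) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n == 0)
			return i;	/* EOF: the peer's FIN has been consumed */
		if (n < 0)
			return -1;	/* read error */
	}
	return -1;			/* gave up too early to notice EOF */
}

/* Queue 5 bytes followed by a FIN on one end of a socket pair, then read
 * from the other end. Returns -2 if the setup itself fails. */
int demo(int max_reads)
{
	int sv[2], ret;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -2;

	if (write(sv[0], "hello", 5) != 5 || shutdown(sv[0], SHUT_WR) < 0)
		ret = -2;
	else
		ret = reads_until_eof(sv[1], max_reads);

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

With max_reads of 1 the loop consumes the data but never sees EOF, which
mirrors the old "break" path. Allowing one more iteration, as the "continue"
does, lets the second call return 0.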
Link: https://bugs.passt.top/show_bug.cgi?id=202
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp_splice.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tcp_splice.c b/tcp_splice.c
index 1359d6b8..34ffea73 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -605,7 +605,7 @@ retry:
}
}
- break;
+ continue;
}
conn->read[fromsidei] += readlen > 0 ? readlen : 0;
--
2.54.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 3/6] tcp_splice: Clean up flow control path for splice forwarding
2026-05-20 13:08 [PATCH 0/6] Fix race condition while closing spliced connections David Gibson
2026-05-20 13:08 ` [PATCH 1/6] tcp_splice: Improve error reporting David Gibson
2026-05-20 13:08 ` [PATCH 2/6] tcp_splice: Avoid missing EOF recognition while forwarding David Gibson
@ 2026-05-20 13:08 ` David Gibson
2026-05-20 13:08 ` [PATCH 4/6] tcp_splice: Simplify tracking of read/written bytes David Gibson
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2026-05-20 13:08 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: Paul Holzinger, David Gibson
Splice forwarding can be blocked either waiting for data from one side
or waiting for space on the other. For that reason,
tcp_splice_sock_handler() on either socket can forward data in either or
both directions, depending on whether we have EPOLLIN, EPOLLOUT or both
events.
The flow control for this is quite hard to follow though, since we forward
in one direction, then sometimes loop back with a goto to do it in the
other direction. Simplify this by adding a tcp_splice_forward() function
with the logic to forward in one direction and calling it either once or
twice from tcp_splice_sock_handler().
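The mapping from epoll events to forwarding directions can be sketched in
isolation. This is only an illustration of the dispatch described above:
splice_directions() is a hypothetical helper that does not exist in passt,
and it only lists the sides to forward from rather than moving any data:

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical helper: for an event on side evsidei, fill from[] with the
 * sides to forward data *from*, in the order the handler would process
 * them. Returns the number of directions (0, 1 or 2). */
int splice_directions(uint32_t events, unsigned evsidei, unsigned from[2])
{
	int n = 0;

	/* Space to write on evsidei: pull data from the other side */
	if (events & EPOLLOUT)
		from[n++] = !evsidei;

	/* Data readable on evsidei: push it towards the other side */
	if (events & EPOLLIN)
		from[n++] = evsidei;

	return n;
}
```

Each entry in from[] corresponds to one call of tcp_splice_forward(), so an
event with both EPOLLIN and EPOLLOUT results in two calls, with no goto
needed to swap directions.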
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp_splice.c | 137 ++++++++++++++++++++++++++-------------------------
1 file changed, 71 insertions(+), 66 deletions(-)
diff --git a/tcp_splice.c b/tcp_splice.c
index 34ffea73..18e8b303 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -474,67 +474,20 @@ void tcp_splice_conn_from_sock(const struct ctx *c, union flow *flow, int s0)
}
/**
- * tcp_splice_sock_handler() - Handler for socket mapped to spliced connection
+ * tcp_splice_forward() - Forward data in one direction using splice()
* @c: Execution context
- * @ref: epoll reference
- * @events: epoll events bitmap
+ * @conn: Connection to forward data for
+ * @fromsidei: Side to forward data from
*
* #syscalls:pasta splice
*/
-void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
- uint32_t events)
+static int tcp_splice_forward(struct ctx *c, struct
+ tcp_splice_conn *conn, unsigned fromsidei)
{
- struct tcp_splice_conn *conn = conn_at_sidx(ref.flowside);
- unsigned evsidei = ref.flowside.sidei, fromsidei;
- uint8_t lowat_set_flag, lowat_act_flag;
- int eof, never_read;
-
- assert(conn->f.type == FLOW_TCP_SPLICE);
-
- if (conn->events == SPLICE_CLOSED)
- return;
-
- if (events & EPOLLERR) {
- int err, rc;
- socklen_t sl = sizeof(err);
-
- rc = getsockopt(ref.fd, SOL_SOCKET, SO_ERROR, &err, &sl);
- if (rc)
- flow_perror(conn, "Error retrieving SO_ERROR");
- else
- flow_dbg(conn, "Error event on %s socket: %s",
- pif_name(conn->f.pif[evsidei]),
- strerror_(err));
- goto reset;
- }
-
- if (conn->events == SPLICE_CONNECT) {
- if (!(events & EPOLLOUT)) {
- flow_err(conn, "Unexpected events 0x%x during connect",
- events);
- goto reset;
- }
- if (tcp_splice_connect_finish(c, conn))
- goto reset;
- }
-
- if (events & EPOLLOUT) {
- fromsidei = !evsidei;
- conn_event(conn, ~OUT_WAIT(evsidei));
- } else {
- fromsidei = evsidei;
- }
-
- if (events & EPOLLRDHUP)
- /* For side 0 this is fake, but implied */
- conn_event(conn, FIN_RCVD(evsidei));
-
-swap:
- eof = 0;
- never_read = 1;
-
- lowat_set_flag = RCVLOWAT_SET(fromsidei);
- lowat_act_flag = RCVLOWAT_ACT(fromsidei);
+ uint8_t lowat_set_flag = RCVLOWAT_SET(fromsidei);
+ uint8_t lowat_act_flag = RCVLOWAT_ACT(fromsidei);
+ int never_read = 1;
+ int eof = 0;
while (1) {
ssize_t readlen, written, pending;
@@ -551,7 +504,7 @@ retry:
if (readlen < 0 && errno != EAGAIN) {
flow_perror(conn, "Splicing from %s socket",
pif_name(conn->f.pif[fromsidei]));
- goto reset;
+ return -1;
}
flow_trace(conn, "%zi from read-side call", readlen);
@@ -578,7 +531,7 @@ retry:
if (written < 0 && errno != EAGAIN) {
flow_perror(conn, "Splicing to %s socket",
pif_name(conn->f.pif[!fromsidei]));
- goto reset;
+ return -1;
}
flow_trace(conn, "%zi from write-side call (passed %zi)",
@@ -639,24 +592,76 @@ retry:
if (shutdown(conn->s[!sidei], SHUT_WR) < 0) {
flow_perror(conn, "shutdown() on %s",
pif_name(conn->f.pif[!sidei]));
- goto reset;
+ return -1;
}
conn_event(conn, FIN_SENT(!sidei));
}
}
}
- if (CONN_HAS(conn, FIN_SENT(0) | FIN_SENT(1))) {
- /* Clean close, no reset */
- conn_flag(conn, CLOSING);
+ return 0;
+}
+
+/**
+ * tcp_splice_sock_handler() - Handler for socket mapped to spliced connection
+ * @c: Execution context
+ * @ref: epoll reference
+ * @events: epoll events bitmap
+ */
+void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
+ uint32_t events)
+{
+ struct tcp_splice_conn *conn = conn_at_sidx(ref.flowside);
+ unsigned evsidei = ref.flowside.sidei;
+
+ assert(conn->f.type == FLOW_TCP_SPLICE);
+
+ if (conn->events == SPLICE_CLOSED)
return;
+
+ if (events & EPOLLERR) {
+ int err, rc;
+ socklen_t sl = sizeof(err);
+
+ rc = getsockopt(ref.fd, SOL_SOCKET, SO_ERROR, &err, &sl);
+ if (rc)
+ flow_perror(conn, "Error retrieving SO_ERROR");
+ else
+ flow_dbg(conn, "Error event on %s socket: %s",
+ pif_name(conn->f.pif[evsidei]),
+ strerror_(err));
+ goto reset;
+ }
+
+ if (conn->events == SPLICE_CONNECT) {
+ if (!(events & EPOLLOUT)) {
+ flow_err(conn, "Unexpected events 0x%x during connect",
+ events);
+ goto reset;
+ }
+ if (tcp_splice_connect_finish(c, conn))
+ goto reset;
+ }
+
+ if (events & EPOLLRDHUP)
+ /* For side 0 this is fake, but implied */
+ conn_event(conn, FIN_RCVD(evsidei));
+
+ if (events & EPOLLOUT) {
+ if (tcp_splice_forward(c, conn, !evsidei))
+ goto reset;
+ conn_event(conn, ~OUT_WAIT(evsidei));
}
- if ((events & (EPOLLIN | EPOLLOUT)) == (EPOLLIN | EPOLLOUT)) {
- events = EPOLLIN;
+ if (events & EPOLLIN) {
+ if (tcp_splice_forward(c, conn, evsidei))
+ goto reset;
+ }
- fromsidei = !fromsidei;
- goto swap;
+ if (CONN_HAS(conn, FIN_SENT(0) | FIN_SENT(1))) {
+ /* Clean close, no reset */
+ conn_flag(conn, CLOSING);
+ return;
}
if (events & EPOLLHUP) {
--
2.54.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 4/6] tcp_splice: Simplify tracking of read/written bytes
2026-05-20 13:08 [PATCH 0/6] Fix race condition while closing spliced connections David Gibson
` (2 preceding siblings ...)
2026-05-20 13:08 ` [PATCH 3/6] tcp_splice: Clean up flow control path for splice forwarding David Gibson
@ 2026-05-20 13:08 ` David Gibson
2026-05-20 13:08 ` [PATCH 5/6] tcp_splice: Simplify EPOLLRDHUP / eof / FIN handling David Gibson
2026-05-20 13:08 ` [PATCH 6/6] tcp_splice: Simplify shutdown(2) handling David Gibson
5 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2026-05-20 13:08 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: Paul Holzinger, David Gibson
For each direction of each spliced connection, we keep track of how
many bytes we've read from one socket and written to the other. However,
we never actually care about the absolute values of these, only the
difference between them, which represents how much data is currently "in
flight" in the splicing pipe.
Simplify the handling by having a single variable tracking the number of
bytes in the pipe.
As a bonus, the new scheme makes it clearer that we don't need to worry
about overflows: pending can never become larger than the maximum pipe
buffer size, well within 32 bits.
I _think_ the old scheme was safe in the case of overflow - again under
the assumption that read/written can never be further apart than the pipe
buffer size. However, it's much harder to reason about this case. It's
certainly plausible that an overflow could occur - sending 4GiB through
a local socket is entirely achievable.
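To illustrate the reasoning about overflow, here is a small standalone
sketch comparing the two schemes. pending_old() and pending_new() are
hypothetical names used only here. The sketch relies on unsigned 32-bit
arithmetic wrapping modulo 2^32, which C guarantees:

```c
#include <stdint.h>

/* Old scheme: two free-running counters, the pipe content being their
 * difference. Unsigned subtraction wraps modulo 2^32, so the result is
 * still right after 'read' has wrapped and 'written' has not, provided
 * the true difference fits in 32 bits. */
uint32_t pending_old(uint32_t read, uint32_t written)
{
	return read - written;
}

/* New scheme: a single counter adjusted by the result of each splice()
 * call, ignoring negative (error) return values. */
uint32_t pending_new(uint32_t pending, long readlen, long written)
{
	pending += readlen > 0 ? readlen : 0;
	pending -= written > 0 ? written : 0;
	return pending;
}
```

Both give the same answer, but with the single counter there is no wrapped
intermediate value to reason about in the first place.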
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp_conn.h | 6 ++----
tcp_splice.c | 18 +++++++++---------
2 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/tcp_conn.h b/tcp_conn.h
index 9f5bee03..c8381aa7 100644
--- a/tcp_conn.h
+++ b/tcp_conn.h
@@ -206,8 +206,7 @@ struct tcp_tap_transfer_ext {
* @f: Generic flow information
* @s: File descriptor for sockets
* @pipe: File descriptors for pipes
- * @read: Bytes read (not fully written to other side in one shot)
- * @written: Bytes written (not fully written from one other side read)
+ * @pending: Bytes currently in each pipe
* @events: Events observed/actions performed on connection
* @flags: Connection flags (attributes, not events)
*/
@@ -218,8 +217,7 @@ struct tcp_splice_conn {
int s[SIDES];
int pipe[SIDES][2];
- uint32_t read[SIDES];
- uint32_t written[SIDES];
+ uint32_t pending[SIDES];
uint8_t events;
#define SPLICE_CLOSED 0
diff --git a/tcp_splice.c b/tcp_splice.c
index 18e8b303..8fbd490f 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -292,7 +292,7 @@ bool tcp_splice_flow_defer(struct tcp_splice_conn *conn)
conn->s[sidei] = -1;
}
- conn->read[sidei] = conn->written[sidei] = 0;
+ conn->pending[sidei] = 0;
}
conn->events = SPLICE_CLOSED;
@@ -490,7 +490,7 @@ static int tcp_splice_forward(struct ctx *c, struct
int eof = 0;
while (1) {
- ssize_t readlen, written, pending;
+ ssize_t readlen, written;
int more = 0;
retry:
@@ -537,7 +537,7 @@ retry:
flow_trace(conn, "%zi from write-side call (passed %zi)",
written, c->tcp.pipe_size);
- /* Most common case: skip updating counters. */
+ /* Most common case: skip updating pending. */
if (readlen > 0 && readlen == written) {
if (readlen >= (long)c->tcp.pipe_size * 10 / 100)
continue;
@@ -561,11 +561,11 @@ retry:
continue;
}
- conn->read[fromsidei] += readlen > 0 ? readlen : 0;
- conn->written[fromsidei] += written > 0 ? written : 0;
+ conn->pending[fromsidei] += readlen > 0 ? readlen : 0;
+ conn->pending[fromsidei] -= written > 0 ? written : 0;
if (written < 0) {
- if (conn->read[fromsidei] == conn->written[fromsidei])
+ if (!conn->pending[fromsidei])
break;
conn_event(conn, OUT_WAIT(!fromsidei));
@@ -575,15 +575,15 @@ retry:
if (never_read && written == (long)(c->tcp.pipe_size))
goto retry;
- pending = conn->read[fromsidei] - conn->written[fromsidei];
- if (!never_read && written > 0 && written < pending)
+ if (!never_read && written > 0 &&
+ written < conn->pending[fromsidei])
goto retry;
if (eof)
break;
}
- if (conn->read[fromsidei] == conn->written[fromsidei] && eof) {
+ if (!conn->pending[fromsidei] && eof) {
unsigned sidei;
flow_foreach_sidei(sidei) {
--
2.54.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 5/6] tcp_splice: Simplify EPOLLRDHUP / eof / FIN handling
2026-05-20 13:08 [PATCH 0/6] Fix race condition while closing spliced connections David Gibson
` (3 preceding siblings ...)
2026-05-20 13:08 ` [PATCH 4/6] tcp_splice: Simplify tracking of read/written bytes David Gibson
@ 2026-05-20 13:08 ` David Gibson
2026-05-20 13:08 ` [PATCH 6/6] tcp_splice: Simplify shutdown(2) handling David Gibson
5 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2026-05-20 13:08 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: Paul Holzinger, David Gibson
There are two ways we can tell one of our sockets has received a FIN. We
can either see an EPOLLRDHUP epoll event, or we can get a zero-length read
(EOF) on the socket. We currently use both, in a mildly confusing way:
we only set the FIN_RCVD() flag based on the EPOLLRDHUP event, but then
some other close out logic is based on seeing an EOF.
Simplify this by setting the flag based on only the EOF. To make sure we
don't miss an event if we get an EPOLLRDHUP with no data, we trigger the
forwarding path for EPOLLRDHUP as well as EPOLLIN.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp_splice.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/tcp_splice.c b/tcp_splice.c
index 8fbd490f..b45f0060 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -487,7 +487,6 @@ static int tcp_splice_forward(struct ctx *c, struct
uint8_t lowat_set_flag = RCVLOWAT_SET(fromsidei);
uint8_t lowat_act_flag = RCVLOWAT_ACT(fromsidei);
int never_read = 1;
- int eof = 0;
while (1) {
ssize_t readlen, written;
@@ -510,7 +509,7 @@ retry:
flow_trace(conn, "%zi from read-side call", readlen);
if (!readlen) {
- eof = 1;
+ conn_event(conn, FIN_RCVD(fromsidei));
} else if (readlen > 0) {
never_read = 0;
@@ -579,11 +578,12 @@ retry:
written < conn->pending[fromsidei])
goto retry;
- if (eof)
+ if (conn->events & FIN_RCVD(fromsidei))
break;
}
- if (!conn->pending[fromsidei] && eof) {
+ if (!conn->pending[fromsidei] &&
+ conn->events & FIN_RCVD(fromsidei)) {
unsigned sidei;
flow_foreach_sidei(sidei) {
@@ -643,17 +643,13 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
goto reset;
}
- if (events & EPOLLRDHUP)
- /* For side 0 this is fake, but implied */
- conn_event(conn, FIN_RCVD(evsidei));
-
if (events & EPOLLOUT) {
if (tcp_splice_forward(c, conn, !evsidei))
goto reset;
conn_event(conn, ~OUT_WAIT(evsidei));
}
- if (events & EPOLLIN) {
+ if (events & (EPOLLIN | EPOLLRDHUP)) {
if (tcp_splice_forward(c, conn, evsidei))
goto reset;
}
--
2.54.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 6/6] tcp_splice: Simplify shutdown(2) handling
2026-05-20 13:08 [PATCH 0/6] Fix race condition while closing spliced connections David Gibson
` (4 preceding siblings ...)
2026-05-20 13:08 ` [PATCH 5/6] tcp_splice: Simplify EPOLLRDHUP / eof / FIN handling David Gibson
@ 2026-05-20 13:08 ` David Gibson
5 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2026-05-20 13:08 UTC (permalink / raw)
To: passt-dev, Stefano Brivio; +Cc: Paul Holzinger, David Gibson
At the end of tcp_splice_forward(), we check for half-closed connections
and propagate the FIN to the other side with a shutdown(2). Currently we
check for a half-closed connection in either direction. That's unnecessary
here, because tcp_splice_forward() will already be called for each
direction if there are any relevant events.
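The resulting single-direction check can be written as a pure predicate.
This is a sketch only: fin_should_propagate() is a hypothetical name, and
the FIN_RCVD()/FIN_SENT() bit layout below is an assumption made for the
example, which may differ from the actual definitions in tcp_conn.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag layout, for this sketch only */
#define FIN_RCVD(sidei)	((uint8_t)(1u << (sidei)))
#define FIN_SENT(sidei)	((uint8_t)(1u << (2 + (sidei))))

/* True if a FIN received on fromsidei should now be passed on to the other
 * side with shutdown(SHUT_WR): we have seen the FIN, we have not already
 * forwarded it, and the pipe in this direction has been drained. */
bool fin_should_propagate(uint8_t events, uint32_t pending,
			  unsigned fromsidei)
{
	return (events & FIN_RCVD(fromsidei)) &&
	       !(events & FIN_SENT(!fromsidei)) &&
	       !pending;
}
```

Only the direction being forwarded is examined. The opposite direction gets
the same check when tcp_splice_forward() is called for it.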
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp_splice.c | 22 ++++++++--------------
1 file changed, 8 insertions(+), 14 deletions(-)
diff --git a/tcp_splice.c b/tcp_splice.c
index b45f0060..e5018f2e 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -582,21 +582,15 @@ retry:
break;
}
- if (!conn->pending[fromsidei] &&
- conn->events & FIN_RCVD(fromsidei)) {
- unsigned sidei;
-
- flow_foreach_sidei(sidei) {
- if ((conn->events & FIN_RCVD(sidei)) &&
- !(conn->events & FIN_SENT(!sidei))) {
- if (shutdown(conn->s[!sidei], SHUT_WR) < 0) {
- flow_perror(conn, "shutdown() on %s",
- pif_name(conn->f.pif[!sidei]));
- return -1;
- }
- conn_event(conn, FIN_SENT(!sidei));
- }
+ if ((conn->events & FIN_RCVD(fromsidei)) &&
+ !(conn->events & FIN_SENT(!fromsidei)) &&
+ !conn->pending[fromsidei]) {
+ if (shutdown(conn->s[!fromsidei], SHUT_WR) < 0) {
+ flow_perror(conn, "shutdown() on %s",
+ pif_name(conn->f.pif[!fromsidei]));
+ return -1;
}
+ conn_event(conn, FIN_SENT(!fromsidei));
}
return 0;
--
2.54.0
^ permalink raw reply [flat|nested] 8+ messages in thread