On Wed, May 20, 2026 at 04:31:34PM +0200, Stefano Brivio wrote: > On Wed, 20 May 2026 23:08:46 +1000 > David Gibson wrote: > > > A number of things can, at least theoretically, go wrong when forwarding > > data across a spliced connection. We generally handle this by resetting > > the connection on both sides. However, in many cases we don't log any > > message about why the connection was reset, which can make it hard to > > debug why this is happening. > > > > Add a bunch of debug and error logging to make this easier to figure out. > > > > Signed-off-by: David Gibson > > --- > > tcp_splice.c | 31 +++++++++++++++++++++++-------- > > 1 file changed, 23 insertions(+), 8 deletions(-) > > > > diff --git a/tcp_splice.c b/tcp_splice.c > > index 42ee8abc..1359d6b8 100644 > > --- a/tcp_splice.c > > +++ b/tcp_splice.c > > @@ -502,15 +502,18 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref, > > if (rc) > > flow_perror(conn, "Error retrieving SO_ERROR"); > > else > > - flow_trace(conn, "Error event on socket: %s", > > - strerror_(err)); > > - > > + flow_dbg(conn, "Error event on %s socket: %s", > > + pif_name(conn->f.pif[evsidei]), > > + strerror_(err)); > > goto reset; > > } > > > > if (conn->events == SPLICE_CONNECT) { > > - if (!(events & EPOLLOUT)) > > + if (!(events & EPOLLOUT)) { > > + flow_err(conn, "Unexpected events 0x%x during connect", > > + events); > > Shouldn't all the flow_err() and flow_perror() calls here be > ratelimited, that is, eventually calling the err_ratelimit() function > Anshu introduced recently? I did think about that, I concluded it wasn't necessary here because it indicates that something has gone unexpectedly wrong at the kernel level, it's not guest triggerable. I can put in ratelimits if you still think they're necessary. > We don't have helpers ready for flow_err() and flow_perror(), I was > about to post a patch that would go before this series but I'm not sure > if there's a specific reason to avoid those. > > > goto reset; > > + } > > if (tcp_splice_connect_finish(c, conn)) > > goto reset; > > } > > @@ -545,8 +548,11 @@ retry: > > SPLICE_F_MOVE | SPLICE_F_NONBLOCK); > > while (readlen < 0 && errno == EINTR); > > > > - if (readlen < 0 && errno != EAGAIN) > > + if (readlen < 0 && errno != EAGAIN) { > > + flow_perror(conn, "Splicing from %s socket", > > + pif_name(conn->f.pif[fromsidei])); > > goto reset; > > + } > > > > flow_trace(conn, "%zi from read-side call", readlen); > > > > @@ -569,8 +575,11 @@ retry: > > SPLICE_F_MOVE | more | SPLICE_F_NONBLOCK); > > while (written < 0 && errno == EINTR); > > > > - if (written < 0 && errno != EAGAIN) > > + if (written < 0 && errno != EAGAIN) { > > + flow_perror(conn, "Splicing to %s socket", > > + pif_name(conn->f.pif[!fromsidei])); > > goto reset; > > + } > > > > flow_trace(conn, "%zi from write-side call (passed %zi)", > > written, c->tcp.pipe_size); > > @@ -627,8 +636,11 @@ retry: > > flow_foreach_sidei(sidei) { > > if ((conn->events & FIN_RCVD(sidei)) && > > !(conn->events & FIN_SENT(!sidei))) { > > - if (shutdown(conn->s[!sidei], SHUT_WR) < 0) > > + if (shutdown(conn->s[!sidei], SHUT_WR) < 0) { > > + flow_perror(conn, "shutdown() on %s", > > + pif_name(conn->f.pif[!sidei])); > > goto reset; > > + } > > conn_event(conn, FIN_SENT(!sidei)); > > } > > } > > @@ -647,8 +659,11 @@ retry: > > goto swap; > > } > > > > - if (events & EPOLLHUP) > > + if (events & EPOLLHUP) { > > + flow_dbg(conn, "Hangup from %s socket", > > + pif_name(conn->f.pif[evsidei])); Except for this one, which is debug level for that reason. > > goto reset; > > + } > > > > return; > > > > -- > Stefano > -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson