On Wed, May 20, 2026 at 10:28:36PM +0200, Stefano Brivio wrote:
> On Wed, 20 May 2026 23:08:47 +1000
> David Gibson wrote:
>
> > tcp_splice_sock_handler() has an optimised path for the common case where
> > the amount we splice(2) into the pipe is exactly the same as the amount we
> > splice(2) out again. If the pipe is empty at that point, we stop
> > forwarding until we get another epoll event.
> >
> > However, via a subtle chain of events, this can cause a bug for a
> > half-closed connection. Suppose the connection is already half-closed in
> > the other direction - that is, we've already called shutdown(SHUT_WR) on
> > the socket for which we're getting the event. In this event we're getting
> > the last batch of data in the other direction, and also a FIN. This can
> > result in EPOLLIN, EPOLLRDHUP and EPOLLHUP events simultaneously.
> >
> > We read the last data from the socket and successfully splice it to the
> > other side. Since there is no data in the pipe, we exit the forwarding
> > loop. However, because we did read data, we don't set the eof flag.
> >
> > Because we don't set eof, we don't (yet) propagate the FIN to the other
> > side, or set FIN_SENT_(!fromsidei). Therefore we don't (yet) recognize
> > this as a clean termination and set the CLOSING flag. We would correct
> > this when we get our next event, however before we can do so we process
> > the EPOLLHUP event. Because we haven't recognized this as a clean close
> > we assume it is an abrupt close and send an RST to the other side.
> >
> > To avoid this, don't stop attempting to forward data on this path.
> > Continue for at least one more loop. If we're at EOF, we'll recognize it
> > on the next splice(2). If not it gives us an opportunity to forward more
> > data without returning to the main epoll loop.
>
> Oops. The fix looks correct to me, but I wonder: is it clear to you why
> the issue only started occurring in this release? This code had "always"
> been there.

Because we didn't use to force resets on abnormal connection
terminations, so it still worked by accident.

> I see a few possible directions but I'm not quite sure. Not that
> important anyway, if you could reproduce the issue and this fixes it.

Ah, actually, I do still need to test with the original reproducer. It
fixes it for my reproducer, which I'm maybe 90% confident is exercising
the same bug.

> Just one nit:
>
> > Link: https://bugs.passt.top/show_bug.cgi?id=202
> > Signed-off-by: David Gibson
> > Reported-by: Paul Holzinger

Good point, fixed.

>
> > ---
> > tcp_splice.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tcp_splice.c b/tcp_splice.c
> > index 1359d6b8..34ffea73 100644
> > --- a/tcp_splice.c
> > +++ b/tcp_splice.c
> > @@ -605,7 +605,7 @@ retry:
> >  }
> >  }
> >
> > - break;
> > + continue;
> >  }
> >
> >  conn->read[fromsidei] += readlen > 0 ? readlen : 0;
> > --
> Stefano
>

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
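
For anyone following the thread, here is a toy model of why the one-word change matters. It is only a sketch and not the passt code: struct fake_sock, fake_read() and forward_event() are hypothetical names made up for illustration, the "socket" is just an array of chunk sizes, and the write side is assumed to always accept everything that was read (the "pipe is empty" fast path described in the commit message).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the reading socket (not a passt structure) */
struct fake_sock {
	const int *chunks;	/* sizes of the pending data chunks */
	size_t n;		/* number of chunks */
	size_t next;		/* index of the next chunk to deliver */
	bool fin;		/* true if a FIN is queued behind the data */
};

/* Mimics a non-blocking read: >0 is data, 0 is EOF, -1 is "would block" */
static int fake_read(struct fake_sock *s)
{
	if (s->next < s->n)
		return s->chunks[s->next++];

	return s->fin ? 0 : -1;
}

/*
 * Handle a single event on the fake socket.  Returns true if EOF was
 * noticed while handling this event.
 *
 * stop_when_drained == true  models the old "break" on the fast path
 * stop_when_drained == false models the new "continue"
 */
static bool forward_event(struct fake_sock *s, bool stop_when_drained)
{
	for (;;) {
		int readlen = fake_read(s);

		if (readlen < 0)	/* nothing more for now */
			return false;

		if (readlen == 0)	/* EOF: the FIN can be propagated */
			return true;

		/* Assumption: everything read was written out again, so
		 * the pipe is empty at this point.
		 */
		if (stop_when_drained)
			return false;	/* old: wait for another event */

		/* new: go around again, the next read may report EOF */
	}
}
```

With one chunk of data followed by a FIN, forward_event(&s, true) returns false (EOF is missed within the event, which in the scenario above is where the EPOLLHUP handling gets in first), while forward_event(&s, false) returns true. Without a FIN, the second read reports "would block" and the loop still terminates.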