On Wed, Oct 08, 2025 at 12:42:37AM +0200, Stefano Brivio wrote: > On Tue, 7 Oct 2025 10:31:11 +1100 > David Gibson wrote: > > > On Tue, Oct 07, 2025 at 12:32:19AM +0200, Stefano Brivio wrote: > > > On Fri, 3 Oct 2025 13:19:17 +1000 > > > David Gibson wrote: > > > > > > > On Thu, Oct 02, 2025 at 01:58:41PM +0200, Stefano Brivio wrote: > > > > > On Thu, 2 Oct 2025 12:41:08 +1000 > > > > > David Gibson wrote: > > > > > > > > > > > On Thu, Oct 02, 2025 at 02:06:43AM +0200, Stefano Brivio wrote: > > > > > > > If we reach end-of-file on a socket (or get EPOLLRDHUP / EPOLLHUP) and > > > > > > > send a FIN segment to the guest / container acknowledging a sequence > > > > > > > number that's behind what we received so far, we won't have any > > > > > > > further trigger to send an updated ACK segment, as we are now > > > > > > > switching the epoll socket monitoring to edge-triggered mode. > > > > > > > > > > > > > > To avoid this situation, in tcp_update_seqack_wnd(), we set the next > > > > > > > acknowledgement sequence to the current observed sequence, regardless > > > > > > > of what was acknowledged socket-side. > > > > > > > > > > > > To double check my understanding: things should work if we always > > > > > > acknowledged everything we've received. Acknowledging only what the > > > > > > peer has acked is a refinement to give the guest a view that's closer > > > > > > to what it would be end-to-end with the peer (which might improve the > > > > > > operation of flow control). > > > > > > > > > > Right. > > > > > > > > > > > We can't use that refined mechanism when the socket is closing > > > > > > (amongst other cases), because while we can get the peer acked bytes > > > > > > from TCP_INFO, we can't get events when that changes, so we have no > > > > > > mechanism to provide updates to the guest at the right time. So we > > > > > > fall back to the simpler method. > > > > > > > > > > > > Is that correct? > > > > > > > > > > Also correct, yes. If you have a better idea to summarise this in the > > > > > comment in tcp_buf_data_from_sock() let me know. > > > > > > > > Hm, I might. Or actually a way to reorganise the code that I think > > > > will be a bit clearer and probably allow a clearer comment too. > > > > > > I would keep reworks for a later moment. Right now, it's already taking > > > me long enough to find a moment to investigate these issues, write these > > > fixes, and test them. > > > > I mean... the change I'm proposing reduces lines of code (excepting > > the big new comment), makes it easier to reason about and is localised > > to the immediately surrounding code. But whatever, I don't > > particularly care about the order we do things. > > Sure, I don't think that other patch is particularly complicated or > might ever become problematic at all, but, from your earlier comment > ("reorganise the code", as I mentioned the comment in > tcp_buf_data_from_sock()), I thought you wanted to rework this > particular code in tcp_buf_data_from_sock() at the same time. Yeah, I guess "reorganise" gives an impression of a wider ranging change than it actually was. > I guess it's not the case judging from your most recent reply, though. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson