From mboxrd@z Thu Jan 1 00:00:00 1970 Received: by passt.top (Postfix, from userid 1000) id 8E9235A026F; Thu, 02 Oct 2025 02:06:46 +0200 (CEST) From: Stefano Brivio To: passt-dev@passt.top Subject: [PATCH 1/4] tcp: Fix ACK sequence on FIN to tap Date: Thu, 2 Oct 2025 02:06:43 +0200 Message-ID: <20251002000646.2136202-2-sbrivio@redhat.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20251002000646.2136202-1-sbrivio@redhat.com> References: <20251002000646.2136202-1-sbrivio@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-ID-Hash: 6SYSWSDFDPHLBEBFGEADW6UUEQCGJ7K6 X-Message-ID-Hash: 6SYSWSDFDPHLBEBFGEADW6UUEQCGJ7K6 X-MailFrom: sbrivio@passt.top X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: If we reach end-of-file on a socket (or get EPOLLRDHUP / EPOLLHUP) and send a FIN segment to the guest / container acknowledging a sequence number that's behind what we received so far, we won't have any further trigger to send an updated ACK segment, as we are now switching the epoll socket monitoring to edge-triggered mode. To avoid this situation, in tcp_update_seqack_wnd(), we set the next acknowledgement sequence to the current observed sequence, regardless of what was acknowledged socket-side. However, we don't necessarily call tcp_update_seqack_wnd() before sending the FIN segment, which might potentially lead to a situation, not observed in practice, where we unnecessarily cause a retransmission at some point after our FIN segment. Avoid that by setting the ACK sequence to whatever we received from the container / guest, before sending a FIN segment and switching to EPOLLET. Signed-off-by: Stefano Brivio --- tcp_buf.c | 14 +++++++++++++- tcp_vu.c | 7 ++++++- 2 files changed, 19 insertions(+), 2 deletions(-) diff --git a/tcp_buf.c b/tcp_buf.c index a493b5a..cc106bc 100644 --- a/tcp_buf.c +++ b/tcp_buf.c @@ -368,7 +368,19 @@ int tcp_buf_data_from_sock(const struct ctx *c, struct tcp_tap_conn *conn) conn_flag(c, conn, STALLED); } else if ((conn->events & (SOCK_FIN_RCVD | TAP_FIN_SENT)) == SOCK_FIN_RCVD) { - int ret = tcp_buf_send_flag(c, conn, FIN | ACK); + int ret; + + /* On TAP_FIN_SENT, we won't get further data events + * from the socket, and this might be the last ACK + * segment we send to the tap, so update its sequence to + * include everything we received until now. + * + * See also the special handling on CONN_IS_CLOSING() in + * tcp_update_seqack_wnd(). + */ + conn->seq_ack_to_tap = conn->seq_from_tap; + + ret = tcp_buf_send_flag(c, conn, FIN | ACK); if (ret) { tcp_rst(c, conn); return ret; diff --git a/tcp_vu.c b/tcp_vu.c index ebd3a1e..3ec3538 100644 --- a/tcp_vu.c +++ b/tcp_vu.c @@ -410,7 +410,12 @@ int tcp_vu_data_from_sock(const struct ctx *c, struct tcp_tap_conn *conn) conn_flag(c, conn, STALLED); } else if ((conn->events & (SOCK_FIN_RCVD | TAP_FIN_SENT)) == SOCK_FIN_RCVD) { - int ret = tcp_vu_send_flag(c, conn, FIN | ACK); + int ret; + + /* See tcp_buf_data_from_sock() */ + conn->seq_ack_to_tap = conn->seq_from_tap; + + ret = tcp_vu_send_flag(c, conn, FIN | ACK); if (ret) { tcp_rst(c, conn); return ret; -- 2.43.0