public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 3/4] tcp: Don't consider FIN flags with mismatching sequence
Date: Thu, 2 Oct 2025 12:52:31 +1000
Message-ID: <aN3o71IS8HQLyC7z@zatzit>
In-Reply-To: <20251002000646.2136202-4-sbrivio@redhat.com>

On Thu, Oct 02, 2025 at 02:06:45AM +0200, Stefano Brivio wrote:
> If a guest or container sends us a FIN segment but its sequence number
> doesn't match the highest sequence of data we *accepted* (not
> necessarily the highest sequence we received), that is,
> conn->seq_from_tap, plus any data we're accepting in the current
> batch, we should discard the flag (not necessarily the segment),
> because there's still data we need to receive (again) before the end
> of the stream.
> 
> If we consider those FIN flags as such, we'll end up in the
> situation described below.
> 
> Here, 192.168.10.102 is a HTTP server in a Podman container, and
> 192.168.10.44 is a client fetching approximately 121 KB of data from
> it:
> 
>    82   2.026811 192.168.10.102 → 192.168.10.44 54 TCP 55414 → 44992 [FIN, ACK] Seq=121441 Ack=143 Win=65536 Len=0
> 
> the server is done sending
> 
>    83   2.026898 192.168.10.44 → 192.168.10.102 54 TCP 44992 → 55414 [ACK] Seq=143 Ack=114394 Win=216192 Len=0
> 
> pasta (client) acknowledges a previous sequence, because of
> a short sendmsg()
> 
>    84   2.027324 192.168.10.44 → 192.168.10.102 54 TCP 44992 → 55414 [FIN, ACK] Seq=143 Ack=114394 Win=216192 Len=0
> 
> pasta (client) sends FIN, ACK as the client has no more data to
> send (a single GET request), while still acknowledging a previous
> sequence, because the retransmission didn't happen yet
> 
>    85   2.027349 192.168.10.102 → 192.168.10.44 54 TCP 55414 → 44992 [ACK] Seq=121442 Ack=144 Win=65536 Len=0
> 
> the server acknowledges the FIN, ACK
> 
>    86   2.224125 192.168.10.102 → 192.168.10.44 4150 TCP [TCP Retransmission] 55414 → 44992 [ACK] Seq=114394 Ack=144 Win=65536 Len=4096 [TCP segment of a reassembled PDU]
> 
> and finally a retransmission comes, but as we wrongly switched to
> the CLOSE-WAIT state,
> 
>    87   2.224202 192.168.10.44 → 192.168.10.102 54 TCP 44992 → 55414 [RST] Seq=144 Win=0 Len=0
> 
> we consider frame #86 as an acknowledgement for the FIN segment we
> sent, and close the connection, while we still had to re-receive
> (and finally send) the missing data segment, instead.
> 
> Link: https://github.com/containers/podman/issues/27179
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  tcp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tcp.c b/tcp.c
> index 3f7dc82..5a7a607 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -1769,7 +1769,7 @@ static int tcp_data_from_tap(const struct ctx *c, struct tcp_tap_conn *conn,
>  			}
>  		}
>  
> -		if (th->fin)
> +		if (th->fin && seq == seq_from_tap)
>  			fin = 1;

Can a FIN segment also contain data?  My quick googling suggests yes.
If so, doesn't this check need to come after the data processing, so
that seq_from_tap points to the end of the packet's data rather than
the beginning?  (The handling of zero-length packets would also need
revising to match.)
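
To make the concern concrete, here's a rough sketch of the ordering I
have in mind.  It's not the actual tcp_data_from_tap() code: struct seg
and seg_accept() are made-up names, and it assumes the payload of an
in-order segment is accepted in full.  The point is only that the FIN
occupies sequence number seq + len, so the comparison has to be made
against seq_from_tap after the payload has been accounted for:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one TCP segment from the tap side */
struct seg {
	uint32_t seq;	/* sequence number of the first payload byte */
	uint32_t len;	/* payload length in bytes, may be zero */
	bool fin;	/* FIN flag set */
};

/*
 * Accept in-order payload first, then decide whether to honour the FIN.
 * *seq_from_tap is the next sequence we expect (end of accepted data).
 * Returns true only if the FIN sits right at the end of accepted data.
 */
static bool seg_accept(uint32_t *seq_from_tap, const struct seg *s)
{
	/* Bytes of this segment we already have; wraps to a huge value on a gap */
	uint32_t seen = *seq_from_tap - s->seq;

	if (seen <= s->len)	/* starts at or before what we expect */
		*seq_from_tap = s->seq + s->len;

	/* The FIN itself occupies sequence number seq + len */
	return s->fin && s->seq + s->len == *seq_from_tap;
}
```

With the numbers from the capture above, a zero-length FIN at
Seq=121441 while we have only accepted up to 114394 is not honoured,
whereas a FIN carried on a data segment that starts exactly at
seq_from_tap is.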

>  
>  		if (!len)
> -- 
> 2.43.0
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson



Thread overview: 18+ messages
2025-10-02  0:06 [PATCH 0/4] tcp: Fix bad switch to CLOSE-WAIT state and surrounding issues Stefano Brivio
2025-10-02  0:06 ` [PATCH 1/4] tcp: Fix ACK sequence on FIN to tap Stefano Brivio
2025-10-02  2:41   ` David Gibson
2025-10-02 11:58     ` Stefano Brivio
2025-10-03  3:19       ` David Gibson
2025-10-06 22:32         ` Stefano Brivio
2025-10-06 23:31           ` David Gibson
2025-10-02  0:06 ` [PATCH 2/4] tcp: Completely ignore data segment in CLOSE-WAIT state, log a message Stefano Brivio
2025-10-02  2:44   ` David Gibson
2025-10-02  0:06 ` [PATCH 3/4] tcp: Don't consider FIN flags with mismatching sequence Stefano Brivio
2025-10-02  2:52   ` David Gibson [this message]
2025-10-02  3:02     ` David Gibson
2025-10-02 11:51       ` Stefano Brivio
2025-10-03  3:43         ` David Gibson
2025-10-06 22:32           ` Stefano Brivio
2025-10-06 23:34             ` David Gibson
2025-10-02  0:06 ` [PATCH 4/4] tcp: On partial send (incomplete sendmsg()), request a retransmission right away Stefano Brivio
2025-10-02  3:00   ` David Gibson
