public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 3/4] tcp: Don't consider FIN flags with mismatching sequence
Date: Fri, 3 Oct 2025 13:43:32 +1000	[thread overview]
Message-ID: <aN9GZNswivXxOwr3@zatzit> (raw)
In-Reply-To: <20251002135104.1a662029@elisabeth>

[-- Attachment #1: Type: text/plain, Size: 4869 bytes --]

On Thu, Oct 02, 2025 at 01:51:04PM +0200, Stefano Brivio wrote:
> On Thu, 2 Oct 2025 13:02:09 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> > On Thu, Oct 02, 2025 at 12:52:31PM +1000, David Gibson wrote:
> > > On Thu, Oct 02, 2025 at 02:06:45AM +0200, Stefano Brivio wrote:
[snip]
> > > > diff --git a/tcp.c b/tcp.c
> > > > index 3f7dc82..5a7a607 100644
> > > > --- a/tcp.c
> > > > +++ b/tcp.c
> > > > @@ -1769,7 +1769,7 @@ static int tcp_data_from_tap(const struct ctx *c, struct tcp_tap_conn *conn,
> > > >  			}
> > > >  		}
> > > >  
> > > > -		if (th->fin)
> > > > +		if (th->fin && seq == seq_from_tap)
> > > >  			fin = 1;  
> > > 
> > > Can a FIN segment also contain data?  My quick googling suggests yes.
> 
> Yes, absolutely, my slow wiresharking over the years also confirms, and
> it's so often the case that (I think) this issue doesn't happen so
> frequently as it only occurs if we have a FIN segment without data.

Makes sense.

> If we have a data segment, with FIN set, that we can't fully transmit,
> we already set 'partial_send' and won't set TAP_FIN_RCVD as a result.
> 
> Another case where we want to ignore a FIN segment with data is if we
> have a gap before it, but in that case we'll eventually set 'keep' and
> return early.

Ah, right.  I'd noticed we set fin = 1 in that case, but forgotten
about the exit before setting TAP_FIN_RCVD if keep is set.

> > > If so, doesn't this logic need to go after we process the data
> > > processing, so that seq_from_tap points to the end of the packet's
> > > data, rather than the beginning?  (And the handling of zero-length
> > > packets would also need revision to match).  
> 
> This made sense to me for a moment but now I'm struggling to understand
> or remember why. What I want to check here is that a FIN segment
> without data (I should have specified in the commit message) is
> acceptable because its sequence is as expected.

Right.  This is correct for zero-data FIN segments, but I think as a
side-effect you've made it ignore certain FIN segments _with_ data.
It will work in the common case where the data exactly follows on from
what we already have.  But in the case where the segment has some data
we already have and some new data, the fin = 1 won't trip because seq
!= seq_from_tap.  There isn't another place that will catch it
instead, AFAICT.

I guess it will be fine in the end, because with all the data acked,
the guest should retransmit the FIN with no data.

> But going back to FIN segments with data: why should we sum the length
> to seq_from_tap before comparing the sequence? I don't understand what
> additional check you want to introduce, or what case you want to cover.

I was thinking about the case above, but I didn't explain it very
well.

> > Following on from that, it seems to me like it would make sense for
> > FIN segments to also participate in the 'keep' mechanism.  It should
> > work eventually, but I expect it would be smoother in the case that we
> > get a final burst of packets in a stream out of order.
> 
> FIN segments with data already go through that dance.
> 
> Without data, I guess you're right, we might have in the same batch
> (not that I've ever seen it happening in practice) a FIN segment
> without data that we process first (and now discard because of the
> sequence number), and some data before that we process later, so we
> shouldn't throw away the FIN segment because of that. We should,
> conceptually, reorder it as well.
> 
> It probably makes things more complicated for a reason that's not so
> critical (ignoring a FIN is fine, we'll get another one), and I wanted
> to have the simplest possible fix here.
> 
> Let me see if I can make this entirely correct without a substantially
> bigger change, I haven't really tried.

How about this:

diff --git a/tcp.c b/tcp.c
index 7da41797..42e576b4 100644
--- a/tcp.c
+++ b/tcp.c
@@ -1774,10 +1774,7 @@ static int tcp_data_from_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 			}
 		}
 
-		if (th->fin)
-			fin = 1;
-
-		if (!len)
+		if (!len && !th->fin)
 			continue;
 
 		seq_offset = seq_from_tap - seq;
@@ -1820,6 +1817,8 @@ static int tcp_data_from_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 			break;
 		seq_from_tap += size;
 		iov_i += count;
+		if (th->fin)
+			fin = 1;
 
 		if (keep == i)
 			keep = -1;


We'd need to double check that the "accept data segment" path is safe
with len == 0, of course.  But I think that will treat dataless and
with-data FINs the same way, and let them use the keep mechanism.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2025-10-03  3:43 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-02  0:06 [PATCH 0/4] tcp: Fix bad switch to CLOSE-WAIT state and surrounding issues Stefano Brivio
2025-10-02  0:06 ` [PATCH 1/4] tcp: Fix ACK sequence on FIN to tap Stefano Brivio
2025-10-02  2:41   ` David Gibson
2025-10-02 11:58     ` Stefano Brivio
2025-10-03  3:19       ` David Gibson
2025-10-06 22:32         ` Stefano Brivio
2025-10-06 23:31           ` David Gibson
2025-10-02  0:06 ` [PATCH 2/4] tcp: Completely ignore data segment in CLOSE-WAIT state, log a message Stefano Brivio
2025-10-02  2:44   ` David Gibson
2025-10-02  0:06 ` [PATCH 3/4] tcp: Don't consider FIN flags with mismatching sequence Stefano Brivio
2025-10-02  2:52   ` David Gibson
2025-10-02  3:02     ` David Gibson
2025-10-02 11:51       ` Stefano Brivio
2025-10-03  3:43         ` David Gibson [this message]
2025-10-06 22:32           ` Stefano Brivio
2025-10-06 23:34             ` David Gibson
2025-10-02  0:06 ` [PATCH 4/4] tcp: On partial send (incomplete sendmsg()), request a retransmission right away Stefano Brivio
2025-10-02  3:00   ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aN9GZNswivXxOwr3@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).