public inbox for passt-dev@passt.top
* [PATCH 0/2] tcp: Correct handling of first ACK (or SYN-ACK) packet
@ 2023-03-27  3:56 David Gibson
  2023-03-27  3:56 ` [PATCH 1/2] tcp: Clarify allowed state for tcp_data_from_tap() David Gibson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: David Gibson @ 2023-03-27  3:56 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

We have a subtle problem in the handling of the very first ack-flagged
packet (either the SYN-ACK or ACK from the three way handshake).
Stefano has posted a couple of versions of a patch addressing this,
however I think this is a better approach.  From the TCP logical point
of view, that first ACK does advance the sequence number, and if we
treat it as doing so, then the logic we already had in
tcp_update_seqack_from_tap() is correct.

David Gibson (2):
  tcp: Clarify allowed state for tcp_data_from_tap()
  tcp: Don't special case the handling of the ack of a syn

 tcp.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

-- 
2.39.2



* [PATCH 1/2] tcp: Clarify allowed state for tcp_data_from_tap()
  2023-03-27  3:56 [PATCH 0/2] tcp: Correct handling of first ACK (or SYN-ACK) packet David Gibson
@ 2023-03-27  3:56 ` David Gibson
  2023-03-27  3:56 ` [PATCH 2/2] tcp: Don't special case the handling of the ack of a syn David Gibson
  2023-03-27  9:15 ` [PATCH 0/2] tcp: Correct handling of first ACK (or SYN-ACK) packet Stefano Brivio
  2 siblings, 0 replies; 4+ messages in thread
From: David Gibson @ 2023-03-27  3:56 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

Comments suggest that this should only be called for an ESTABLISHED
connection.  However, it's non-trivial to ascertain that from the actual
control flow in the caller.  Add an ASSERT() to make it very clear that
this is only called in ESTABLISHED state.

In fact, there were some circumstances where it could be called on a CLOSED
connection.  In a sense that is "established", but with that assert this
does require specific (trivial) handling to avoid a spurious abort().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tcp.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tcp.c b/tcp.c
index f156287..d82c62e 100644
--- a/tcp.c
+++ b/tcp.c
@@ -2337,6 +2337,11 @@ static void tcp_data_from_tap(struct ctx *c, struct tcp_tap_conn *conn,
 	size_t len;
 	ssize_t n;
 
+	if (conn->events == CLOSED)
+		return;
+
+	ASSERT(conn->events & ESTABLISHED);
+
 	for (i = 0, iov_i = 0; i < (int)p->count; i++) {
 		uint32_t seq, seq_offset, ack_seq;
 		struct tcphdr *th;
-- 
2.39.2



* [PATCH 2/2] tcp: Don't special case the handling of the ack of a syn
  2023-03-27  3:56 [PATCH 0/2] tcp: Correct handling of first ACK (or SYN-ACK) packet David Gibson
  2023-03-27  3:56 ` [PATCH 1/2] tcp: Clarify allowed state for tcp_data_from_tap() David Gibson
@ 2023-03-27  3:56 ` David Gibson
  2023-03-27  9:15 ` [PATCH 0/2] tcp: Correct handling of first ACK (or SYN-ACK) packet Stefano Brivio
  2 siblings, 0 replies; 4+ messages in thread
From: David Gibson @ 2023-03-27  3:56 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

TCP treats the SYN packets as though they occupied 1 byte in the logical
data stream described by the sequence numbers.  That is, the very first ACK
(or SYN-ACK) each side sends should acknowledge a sequence number one
greater than the initial sequence number given in the SYN or SYN-ACK it's
responding to.
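
As a minimal illustration of that arithmetic (not code from passt;
expected_first_ack() is a hypothetical helper introduced only for this
sketch), the acknowledgement number that answers a SYN is the peer's ISN
plus one, computed modulo 2^32:

```c
#include <stdint.h>

/* Hypothetical helper, not part of passt: the ack number that acknowledges
 * a SYN (or SYN-ACK) carrying the given initial sequence number.  The SYN
 * occupies one position in sequence space, and unsigned 32-bit arithmetic
 * wraps modulo 2^32 just as TCP sequence numbers do.
 */
static uint32_t expected_first_ack(uint32_t isn)
{
	return isn + 1;
}
```

For example, an ISN of 1000 is acknowledged with 1001, and an ISN of
UINT32_MAX is acknowledged with 0.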

In passt we were tracking that by advancing conn->seq_to_tap by one when
we send a SYN or SYN-ACK (in tcp_send_flag()).  However, we also
initialized conn->seq_ack_from_tap, representing the acks we've already
seen from the tap side, to ISN+1, meaning we treated it as having
acknowledged the SYN before it actually did.

There were apparently reasons for this in earlier versions, but it causes
problems now.  Because of this, when we actually did receive the initial ACK
or SYN-ACK, we wouldn't see the acknowledged sequence number as advancing,
and so wouldn't clear the ACK_FROM_TAP_DUE flag.

In most cases we'd get away with it because subsequent packets would clear
the flag.  However, if one (or both) sides didn't send any data, the other
side would (correctly) keep sending ISN+1 as the acknowledged sequence
number, meaning we would never clear the ACK_FROM_TAP_DUE flag.  That would
mean we'd treat the connection as if we needed to retransmit (although we
had 0 bytes to retransmit), and eventually (after around 30s) reset the
connection due to too many retransmits.  Specifically, this could cause the
iperf3 throughput tests in the testsuite to fail if set for a long enough
test period.
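
The failure mode can be sketched with a reduced model.  This is an
illustration under stated assumptions, not the code in tcp.c: seq_gt() is a
simplified stand-in for SEQ_GT(), and struct ack_model, model_init() and
model_ack() are hypothetical names that keep only the ack counter and a
boolean standing in for ACK_FROM_TAP_DUE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-tolerant "a is after b" test.  Simplified stand-in for
 * SEQ_GT(); relies on the usual two's complement conversion to int32_t.
 */
static bool seq_gt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Hypothetical reduced model of the tap-side ack tracking */
struct ack_model {
	uint32_t seq_ack_from_tap;	/* highest ack seen from the tap */
	bool ack_due;			/* stands in for ACK_FROM_TAP_DUE */
};

/* Begin tracking right after a SYN or SYN-ACK was sent to the tap */
static void model_init(struct ack_model *m, uint32_t initial_ack)
{
	m->seq_ack_from_tap = initial_ack;
	m->ack_due = true;
}

/* Handle an ack number from the tap: the flag is cleared only when the
 * ack moves past what was already recorded.
 */
static void model_ack(struct ack_model *m, uint32_t ack)
{
	if (seq_gt(ack, m->seq_ack_from_tap)) {
		m->seq_ack_from_tap = ack;
		m->ack_due = false;
	}
}
```

With an ISN of 1000, calling model_init() with 1001 (the old behaviour) and
then model_ack() with 1001 leaves ack_due set, however many times that same
ack is repeated.  Calling model_init() with 1000 instead lets the first
model_ack() with 1001 count as progress and clear the flag.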

Correct this by initializing conn->seq_ack_from_tap to the ISN and only
advancing it when we actually get the first ACK (or SYN-ACK).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tcp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tcp.c b/tcp.c
index d82c62e..bbdee60 100644
--- a/tcp.c
+++ b/tcp.c
@@ -2096,7 +2096,7 @@ static void tcp_conn_from_tap(struct ctx *c, int af, const void *addr,
 	conn->seq_ack_to_tap = conn->seq_from_tap;
 
 	tcp_seq_init(c, conn, now);
-	conn->seq_ack_from_tap = conn->seq_to_tap + 1;
+	conn->seq_ack_from_tap = conn->seq_to_tap;
 
 	tcp_hash_insert(c, conn);
 
@@ -2754,7 +2754,7 @@ static void tcp_tap_conn_from_sock(struct ctx *c, union epoll_ref ref,
 	tcp_seq_init(c, conn, now);
 	tcp_hash_insert(c, conn);
 
-	conn->seq_ack_from_tap = conn->seq_to_tap + 1;
+	conn->seq_ack_from_tap = conn->seq_to_tap;
 
 	conn->wnd_from_tap = WINDOW_DEFAULT;
 
-- 
2.39.2



* Re: [PATCH 0/2] tcp: Correct handling of first ACK (or SYN-ACK) packet
  2023-03-27  3:56 [PATCH 0/2] tcp: Correct handling of first ACK (or SYN-ACK) packet David Gibson
  2023-03-27  3:56 ` [PATCH 1/2] tcp: Clarify allowed state for tcp_data_from_tap() David Gibson
  2023-03-27  3:56 ` [PATCH 2/2] tcp: Don't special case the handling of the ack of a syn David Gibson
@ 2023-03-27  9:15 ` Stefano Brivio
  2 siblings, 0 replies; 4+ messages in thread
From: Stefano Brivio @ 2023-03-27  9:15 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Mon, 27 Mar 2023 14:56:32 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> We have a subtle problem in the handling of the very first ack-flagged
> packet (either the SYN-ACK or ACK from the three way handshake).
> Stefano has posted a couple of versions of a patch addressing this,
> however I think this is a better approach.  From the TCP logical point
> of view, that first ACK does advance the sequence number, and if we
> treat it as doing so, then the logic we already had in
> tcp_update_seqack_from_tap() is correct.
> 
> David Gibson (2):
>   tcp: Clarify allowed state for tcp_data_from_tap()
>   tcp: Don't special case the handling of the ack of a syn

Thanks for fixing this, the series looks good to me.

I'm wondering if we should still apply the v2 of the patch I sent, with
an adjusted commit message, because resetting ACK_FROM_TAP_DUE only on
SEQ_GT(seq, conn->seq_ack_from_tap) doesn't really follow any logic,
even though it wouldn't be a problem at this point.

-- 
Stefano



Code repositories for project(s) associated with this public inbox

	https://passt.top/passt
