From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from gandalf.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3]) by passt.top (Postfix) with ESMTPS id EC1F85A026D for ; Mon, 27 Mar 2023 05:56:42 +0200 (CEST) Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4PlJsK07K9z4xFL; Mon, 27 Mar 2023 14:56:36 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=201602; t=1679889397; bh=amhxaRLQd8aoIE7i8KTVuUADzjuViO/UXBTtnvGbt1s=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Biwtg9THgzfbZ9F4LE+0DCyQmLpMzFBRDuI4969MoBMRfDP96ct0PQJD0vsAPUxO3 tpdFWba/k9Ura/UrHg8jHXaEJhqEANzVBTcvZb32a89FlvV6FZP6xbrEkJNPyVjPMe 0o72UD4Y38b7W0xMceDwWxjBM1wI0LVCbqcRJVYw= From: David Gibson To: Stefano Brivio , passt-dev@passt.top Subject: [PATCH 2/2] tcp: Don't special case the handling of the ack of a syn Date: Mon, 27 Mar 2023 14:56:34 +1100 Message-Id: <20230327035634.1432064-3-david@gibson.dropbear.id.au> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230327035634.1432064-1-david@gibson.dropbear.id.au> References: <20230327035634.1432064-1-david@gibson.dropbear.id.au> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-ID-Hash: QMZVVJFL6NWNWDWJAYVMORJ5FV4JUSET X-Message-ID-Hash: QMZVVJFL6NWNWDWJAYVMORJ5FV4JUSET X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: David Gibson X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: TCP treats the SYN packets as though they occupied 1 byte in the logical data stream described by the sequence numbers. That is, the very first ACK (or SYN-ACK) each side sends should acknowledge a sequence number one greater than the initial sequence number given in the SYN or SYN-ACK it's responding to. In passt we were tracking that by advancing conn->seq_to_tap by one when we send a SYN or SYN-ACK (in tcp_send_flag()). However, we also initialized conn->seq_ack_from_tap, representing the acks we've already seen from the tap side, to ISN+1, meaning we treated it has having acknowledged the SYN before it actually did. There were apparently reasons for this in earlier versions, but it causes problems now. Because of this when we actually did receive the initial ACK or SYN-ACK, we wouldn't see the acknoweldged serial number as advancing, and so wouldn't clear the ACK_FROM_TAP_DUE flag. In most cases we'd get away because subsequent packets would clear the flag. However if one (or both) sides didn't send any data, the other side would (correctly) keep sending ISN+1 as the acknowledged sequence number, meaning we would never clear the ACK_FROM_TAP_DUE flag. That would mean we'd treat the connection as if we needed to retransmit (although we had 0 bytes to retransmit), and eventaully (after around 30s) reset the connection due to too many retransmits. Specifically this could cause the iperf3 throughput tests in the testsuite to fail if set for a long enough test period. Correct this by initializing conn->seq_ack_from_tap to the ISN and only advancing it when we actually get the first ACK (or SYN-ACK). Signed-off-by: David Gibson --- tcp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tcp.c b/tcp.c index d82c62e..bbdee60 100644 --- a/tcp.c +++ b/tcp.c @@ -2096,7 +2096,7 @@ static void tcp_conn_from_tap(struct ctx *c, int af, const void *addr, conn->seq_ack_to_tap = conn->seq_from_tap; tcp_seq_init(c, conn, now); - conn->seq_ack_from_tap = conn->seq_to_tap + 1; + conn->seq_ack_from_tap = conn->seq_to_tap; tcp_hash_insert(c, conn); @@ -2754,7 +2754,7 @@ static void tcp_tap_conn_from_sock(struct ctx *c, union epoll_ref ref, tcp_seq_init(c, conn, now); tcp_hash_insert(c, conn); - conn->seq_ack_from_tap = conn->seq_to_tap + 1; + conn->seq_ack_from_tap = conn->seq_to_tap; conn->wnd_from_tap = WINDOW_DEFAULT; -- 2.39.2