From: Stefano Brivio <sbrivio@redhat.com>
To: passt-dev@passt.top
Cc: Max Chernoff <git@maxchernoff.ca>,
David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH 0/8] tcp: Fix throughput issues with non-local peers
Date: Thu, 4 Dec 2025 08:45:33 +0100
Message-ID: <20251204074542.2156548-1-sbrivio@redhat.com>

Patch 1/8 is the most relevant fix here, as we currently advertise a
window that might be too big for what we can write to the socket,
causing retransmissions right away and occasional high latency on
short transfers to non-local peers.
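
As a rough illustration of the idea behind patch 1/8 (a sketch under
assumptions, not the code from the patch): the advertised window is capped
to the part of the sending buffer that is still free, rather than to its
total size. The helper names wnd_limit() and sock_wnd_limit() are
hypothetical, and treating SO_SNDBUF minus SIOCOUTQ as the free space is a
simplification, as the kernel's buffer accounting also includes bookkeeping
overhead.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>	/* SIOCOUTQ (Linux-specific) */

/* Hypothetical helper, not from the patch: cap a candidate window 'wnd' to
 * the free part of a sending buffer of 'sndbuf' bytes with 'queued' bytes
 * still pending in it.
 */
uint32_t wnd_limit(uint32_t sndbuf, uint32_t queued, uint32_t wnd)
{
	uint32_t avail = queued < sndbuf ? sndbuf - queued : 0;

	return wnd < avail ? wnd : avail;
}

/* Hypothetical helper: same as above, with sizes read from TCP socket 's'.
 * Assumes that SO_SNDBUF minus SIOCOUTQ approximates the free space, which
 * ignores kernel bookkeeping overhead. Returns 0 on success, -1 on failure.
 */
int sock_wnd_limit(int s, uint32_t wnd, uint32_t *limited)
{
	socklen_t sl = sizeof(int);
	int sndbuf, queued;

	if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, &sndbuf, &sl))
		return -1;

	if (ioctl(s, SIOCOUTQ, &queued))
		return -1;

	if (sndbuf < 0 || queued < 0)
		return -1;

	*limited = wnd_limit((uint32_t)sndbuf, (uint32_t)queued, wnd);
	return 0;
}
```

For example, with a 64 KiB buffer of which 16 KiB is still queued, a
65535-byte candidate window is cut down to 49152 bytes.
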
Mostly as a consequence of fixing that, we now need several
improvements and small fixes, including, most notably, an adaptive
approach to pick the interval between checks for socket-side ACKs
(patch 2/8), and several tricks to reliably trigger TCP buffer size
auto-tuning as implemented by the Linux kernel (patches 4/8 and 6/8).
These changes make some existing issues more visible; the other
patches fix those.
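
To make the adaptive interval from patch 2/8 more concrete, here is a
minimal sketch, not the patch's implementation: the smoothed RTT reported
by the kernel via TCP_INFO is scaled and clamped to pick how long to wait
before checking for socket-side ACKs again. The function names, the
divisor, and the bounds are all assumptions chosen for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* struct tcp_info with strict -std= modes */
#endif

#include <stdint.h>
#include <netinet/in.h>		/* IPPROTO_TCP */
#include <netinet/tcp.h>	/* TCP_INFO, struct tcp_info */
#include <sys/socket.h>

/* Illustrative bounds: assumptions, not values taken from the series */
#define ACK_INTERVAL_MIN_US	100
#define ACK_INTERVAL_MAX_US	10000

/* Hypothetical helper: derive a check interval, in microseconds, as half of
 * the given RTT, clamped between the two bounds above.
 */
uint32_t ack_interval_us(uint32_t rtt_us)
{
	uint32_t interval = rtt_us / 2;

	if (interval < ACK_INTERVAL_MIN_US)
		return ACK_INTERVAL_MIN_US;
	if (interval > ACK_INTERVAL_MAX_US)
		return ACK_INTERVAL_MAX_US;

	return interval;
}

/* Hypothetical helper: same as above, using the smoothed RTT the kernel
 * reports for TCP socket 's' (tcpi_rtt, in microseconds). Returns 0 on
 * success, -1 on failure.
 */
int sock_ack_interval_us(int s, uint32_t *interval)
{
	struct tcp_info tinfo;
	socklen_t sl = sizeof(tinfo);

	if (getsockopt(s, IPPROTO_TCP, TCP_INFO, &tinfo, &sl))
		return -1;

	*interval = ack_interval_us(tinfo.tcpi_rtt);
	return 0;
}
```

With these illustrative constants, a ~600 us RTT gives a 300 us interval,
while a ~35 ms RTT is capped at 10 ms.
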
With this series, I'm getting the expected (wirespeed) throughput for
transfers between peers with varying non-local RTTs: I checked
different guests bridged on the same machine (~600 us) and hosts with
increasing distance (approximately 100 to 600 km, ~4 to ~35 ms), using
iperf3 as well as HTTP transfers.

For short transfers, we strictly stick to the available sending buffer
size to (almost) make sure we avoid local retransmissions, and
significantly decrease transfer time as a result: from 1.2 s to 60 ms
for a 5 MB HTTP transfer from a container hosted in a virtual machine
to another guest.

Stefano Brivio (8):
  tcp: Limit advertised window to available, not total sending buffer
    size
  tcp: Adaptive interval based on RTT for socket-side acknowledgement
    checks
  tcp: Don't clear ACK_TO_TAP_DUE if we're advertising a zero-sized
    window
  tcp: Acknowledge everything if sending buffer is less than SNDBUF_BIG
  tcp: Don't limit window to less-than-MSS values, use zero instead
  tcp: Allow exceeding the available sending buffer size in window
    advertisements
  tcp: Send a duplicate ACK also on complete sendmsg() failure
  tcp: Skip redundant ACK on partial sendmsg() failure

 README.md  |  2 +-
 tcp.c      | 85 ++++++++++++++++++++++++++++++++++++++++++------------
 tcp_conn.h |  9 ++++++
 util.c     | 14 +++++++++
 util.h     |  1 +
 5 files changed, 92 insertions(+), 19 deletions(-)
--
2.43.0