From: Stefano Brivio <sbrivio@redhat.com>
To: passt-dev@passt.top
Cc: Max Chernoff <git@maxchernoff.ca>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH 0/8] tcp: Fix throughput issues with non-local peers
Date: Thu,  4 Dec 2025 08:45:33 +0100
Message-ID: <20251204074542.2156548-1-sbrivio@redhat.com>

Patch 1/8 is the most relevant fix here, as we currently advertise a
window that might be too big for what we can write to the socket,
causing retransmissions right away and occasional high latency on
short transfers to non-local peers.
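As a rough illustration of the idea behind patch 1/8 (not the code from the patch itself), the sketch below derives a window from the space still available in the sending buffer rather than from its total size. The function names and the wnd_max parameter are hypothetical; SO_SNDBUF and SIOCOUTQ are the standard Linux interfaces assumed here for reading the buffer size and the bytes still queued.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>	/* SIOCOUTQ */

/* Hypothetical helper: window to advertise, given the total sending buffer
 * size, the bytes still queued in it, and an upper bound for the window.
 */
uint32_t wnd_from_available(uint32_t sndbuf, uint32_t queued, uint32_t wnd_max)
{
	/* Space left in the buffer, never negative */
	uint32_t avail = queued < sndbuf ? sndbuf - queued : 0;

	return avail < wnd_max ? avail : wnd_max;
}

/* Hypothetical wrapper: query socket s and return the window to advertise,
 * or -1 on failure. SO_SNDBUF includes kernel bookkeeping overhead, so the
 * result is an estimate, not an exact figure.
 */
int64_t sock_wnd(int s, uint32_t wnd_max)
{
	socklen_t sl = sizeof(int);
	int sndbuf, queued;

	if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, &sndbuf, &sl))
		return -1;

	if (ioctl(s, SIOCOUTQ, &queued))
		return -1;

	if (sndbuf < 0 || queued < 0)
		return -1;

	return wnd_from_available((uint32_t)sndbuf, (uint32_t)queued, wnd_max);
}
```

With a 64 KiB buffer that already holds 16 KiB of queued data, this helper would advertise at most 48 KiB, where a calculation based on the total size would advertise the full 64 KiB.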

Mostly as a consequence of fixing that, we now need several
improvements and small fixes, including, most notably, an adaptive
approach to pick the interval between checks for socket-side ACKs
(patch 2/8), and several tricks to reliably trigger TCP buffer size
auto-tuning as implemented by the Linux kernel (patches 4/8 and 6/8).

These changes, in turn, make some existing issues more relevant. The
remaining patches (3/8, 5/8, 7/8 and 8/8) fix those.

With this series, I'm getting the expected (wirespeed) throughput for
transfers between peers with varying non-local RTTs: I checked
different guests bridged on the same machine (~600 us) and hosts with
increasing distance (approximately 100 to 600 km, ~4 to ~35 ms), using
iperf3 as well as HTTP transfers.

For short transfers, we strictly stick to the available sending buffer
size to (almost) make sure we avoid local retransmissions, and
significantly decrease transfer time as a result: from 1.2 s to 60 ms
for a 5 MB HTTP transfer from a container hosted in a virtual machine
to another guest.

Stefano Brivio (8):
  tcp: Limit advertised window to available, not total sending buffer
    size
  tcp: Adaptive interval based on RTT for socket-side acknowledgement
    checks
  tcp: Don't clear ACK_TO_TAP_DUE if we're advertising a zero-sized
    window
  tcp: Acknowledge everything if sending buffer is less than SNDBUF_BIG
  tcp: Don't limit window to less-than-MSS values, use zero instead
  tcp: Allow exceeding the available sending buffer size in window
    advertisements
  tcp: Send a duplicate ACK also on complete sendmsg() failure
  tcp: Skip redundant ACK on partial sendmsg() failure

 README.md  |  2 +-
 tcp.c      | 85 ++++++++++++++++++++++++++++++++++++++++++------------
 tcp_conn.h |  9 ++++++
 util.c     | 14 +++++++++
 util.h     |  1 +
 5 files changed, 92 insertions(+), 19 deletions(-)

-- 
2.43.0


Thread overview: 25+ messages
2025-12-04  7:45 Stefano Brivio [this message]
2025-12-04  7:45 ` [PATCH 1/8] tcp: Limit advertised window to available, not total sending buffer size Stefano Brivio
2025-12-04 23:10   ` David Gibson
2025-12-04  7:45 ` [PATCH 2/8] tcp: Adaptive interval based on RTT for socket-side acknowledgement checks Stefano Brivio
2025-12-04 23:48   ` David Gibson
2025-12-05  1:20     ` Stefano Brivio
2025-12-05  2:49       ` David Gibson
2025-12-04  7:45 ` [PATCH 3/8] tcp: Don't clear ACK_TO_TAP_DUE if we're advertising a zero-sized window Stefano Brivio
2025-12-04 23:50   ` David Gibson
2025-12-04  7:45 ` [PATCH 4/8] tcp: Acknowledge everything if sending buffer is less than SNDBUF_BIG Stefano Brivio
2025-12-05  0:08   ` David Gibson
2025-12-05  1:20     ` Stefano Brivio
2025-12-05  2:50       ` David Gibson
2025-12-08  0:19         ` Stefano Brivio
2025-12-04  7:45 ` [PATCH 5/8] tcp: Don't limit window to less-than-MSS values, use zero instead Stefano Brivio
2025-12-05  0:35   ` David Gibson
2025-12-05  1:20     ` Stefano Brivio
2025-12-05  2:53       ` David Gibson
2025-12-04  7:45 ` [PATCH 6/8] tcp: Allow exceeding the available sending buffer size in window advertisements Stefano Brivio
2025-12-05  2:34   ` David Gibson
2025-12-08  0:20     ` Stefano Brivio
2025-12-04  7:45 ` [PATCH 7/8] tcp: Send a duplicate ACK also on complete sendmsg() failure Stefano Brivio
2025-12-05  2:35   ` David Gibson
2025-12-04  7:45 ` [PATCH 8/8] tcp: Skip redundant ACK on partial " Stefano Brivio
2025-12-05  2:36   ` David Gibson
