public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top, Max Chernoff <git@maxchernoff.ca>
Subject: Re: [PATCH 6/8] tcp: Allow exceeding the available sending buffer size in window advertisements
Date: Fri, 5 Dec 2025 13:34:07 +1100
Message-ID: <aTJEn_K_7G9SH0mY@zatzit>
In-Reply-To: <20251204074542.2156548-7-sbrivio@redhat.com>


On Thu, Dec 04, 2025 at 08:45:39AM +0100, Stefano Brivio wrote:
> ...under two conditions:
> 
> - the remote peer is advertising a bigger value to us, meaning that a
>   bigger sending buffer is likely to benefit throughput, AND

I think this condition is redundant: if the remote peer is advertising
less, we'll clamp new_wnd_to_tap to that value anyway.
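To illustrate the clamping mentioned above, here is a minimal sketch. It assumes the window finally advertised to the guest is the smaller of the peer's advertised window and the locally computed limit. The helper wnd_to_tap() and its parameters are hypothetical names for illustration, not the code in tcp.c.

```c
#include <stdint.h>

/* Hypothetical helper: clamp the locally computed limit to the window
 * the remote peer advertises. A non-positive limit yields a zero window.
 */
static uint32_t wnd_to_tap(uint32_t peer_wnd, int64_t limit)
{
	if (limit <= 0)
		return 0;

	return (uint64_t)limit < peer_wnd ? (uint32_t)limit : peer_wnd;
}
```

Under that assumption, whatever the boosted limit is, the result never exceeds what the peer advertises: wnd_to_tap(1000, 1500) gives 1000.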

> - this is not a short-lived connection, where the latency cost of
>   retransmissions would be otherwise unacceptable.
> 
> By doing this, we can reliably trigger TCP buffer size auto-tuning (as
> long as it's available) on bulk data transfers.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  tcp.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/tcp.c b/tcp.c
> index 2220059..454df69 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -353,6 +353,13 @@ enum {
>  #define LOW_RTT_TABLE_SIZE		8
>  #define LOW_RTT_THRESHOLD		10 /* us */
>  
> +/* Try to avoid retransmissions to improve latency on short-lived connections */
> +#define SHORT_CONN_BYTES		(16ULL * 1024 * 1024)
> +
> +/* Temporarily exceed available sending buffer to force TCP auto-tuning */
> +#define SNDBUF_BOOST_FACTOR		150 /* % */
> +#define SNDBUF_BOOST(x)			((x) * SNDBUF_BOOST_FACTOR / 100)

For the short term, the fact this works empirically is enough.  For
the longer term, it would be nice to have a better understanding of
what this "overcommit" amount is actually estimating.

I think what we're looking for is an estimate of the number of bytes
that will have left the buffer by the time the guest gets back to us.  So:
	<connection throughput> * <guest-side RTT>

Alas, I don't see a way to estimate either of those from the
information we already track - we'd need additional bookkeeping.
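As a rough sketch of that estimate, assuming we somehow had both quantities available (which, as noted, we currently don't): overcommit_estimate(), bytes_per_sec and rtt_us are hypothetical names, and the units (bytes per second, microseconds) are an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical estimate of the bytes that will have left the sending
 * buffer by the time the guest responds:
 *	<connection throughput> * <guest-side RTT>
 * bytes_per_sec: assumed throughput of the connection, in bytes/s
 * rtt_us:	  assumed guest-side round-trip time, in microseconds
 */
static uint64_t overcommit_estimate(uint64_t bytes_per_sec, uint32_t rtt_us)
{
	return bytes_per_sec * rtt_us / 1000000ULL;
}
```

For instance, at 125000000 bytes/s (1 Gbps) and a 200 us guest-side RTT, this gives 25000 bytes.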

>  #define ACK_IF_NEEDED	0		/* See tcp_send_flag() */
>  
>  #define CONN_IS_CLOSING(conn)						\
> @@ -1137,6 +1144,9 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>  
>  		if ((int)sendq > SNDBUF_GET(conn)) /* Due to memory pressure? */
>  			limit = 0;
> +		else if ((int)tinfo->tcpi_snd_wnd > SNDBUF_GET(conn) &&
> +			 tinfo->tcpi_bytes_acked > SHORT_CONN_BYTES)

This is pretty subtle; I think it would be worth having some rationale
in a comment, not just in the commit message.

> +			limit = SNDBUF_BOOST(SNDBUF_GET(conn)) - (int)sendq;
>  		else
>  			limit = SNDBUF_GET(conn) - (int)sendq;
>  
> -- 
> 2.43.0
> 
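For illustration only, here is a standalone sketch of how the hunk above could carry its rationale in a comment. The macros are copied from the patch; sndbuf_limit() and its parameters are hypothetical stand-ins for SNDBUF_GET(conn), sendq, tinfo->tcpi_snd_wnd and tinfo->tcpi_bytes_acked, and the 64-bit types are an assumption of the sketch, not what tcp.c uses.

```c
#include <stdint.h>

#define SHORT_CONN_BYTES	(16ULL * 1024 * 1024)
#define SNDBUF_BOOST_FACTOR	150 /* % */
#define SNDBUF_BOOST(x)		((x) * SNDBUF_BOOST_FACTOR / 100)

/* Hypothetical standalone version of the limit selection in the hunk */
static int64_t sndbuf_limit(int64_t sndbuf, int64_t sendq,
			    uint32_t peer_wnd, uint64_t bytes_acked)
{
	if (sendq > sndbuf) /* Due to memory pressure? */
		return 0;

	/* Advertise more than the sending buffer can currently hold, to
	 * trigger TCP buffer size auto-tuning, but only if:
	 * - the peer advertises a bigger window than our buffer, so a
	 *   bigger buffer is likely to benefit throughput, and
	 * - the connection is not short-lived, where the latency cost of
	 *   retransmissions would be otherwise unacceptable
	 */
	if ((int64_t)peer_wnd > sndbuf && bytes_acked > SHORT_CONN_BYTES)
		return SNDBUF_BOOST(sndbuf) - sendq;

	return sndbuf - sendq;
}
```

With a 1000-byte buffer, 400 bytes queued, a peer window of 5000 and 32 MiB acknowledged, this returns 1100 instead of 600.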

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson



Thread overview: 25+ messages
2025-12-04  7:45 [PATCH 0/8] tcp: Fix throughput issues with non-local peers Stefano Brivio
2025-12-04  7:45 ` [PATCH 1/8] tcp: Limit advertised window to available, not total sending buffer size Stefano Brivio
2025-12-04 23:10   ` David Gibson
2025-12-04  7:45 ` [PATCH 2/8] tcp: Adaptive interval based on RTT for socket-side acknowledgement checks Stefano Brivio
2025-12-04 23:48   ` David Gibson
2025-12-05  1:20     ` Stefano Brivio
2025-12-05  2:49       ` David Gibson
2025-12-04  7:45 ` [PATCH 3/8] tcp: Don't clear ACK_TO_TAP_DUE if we're advertising a zero-sized window Stefano Brivio
2025-12-04 23:50   ` David Gibson
2025-12-04  7:45 ` [PATCH 4/8] tcp: Acknowledge everything if sending buffer is less than SNDBUF_BIG Stefano Brivio
2025-12-05  0:08   ` David Gibson
2025-12-05  1:20     ` Stefano Brivio
2025-12-05  2:50       ` David Gibson
2025-12-08  0:19         ` Stefano Brivio
2025-12-04  7:45 ` [PATCH 5/8] tcp: Don't limit window to less-than-MSS values, use zero instead Stefano Brivio
2025-12-05  0:35   ` David Gibson
2025-12-05  1:20     ` Stefano Brivio
2025-12-05  2:53       ` David Gibson
2025-12-04  7:45 ` [PATCH 6/8] tcp: Allow exceeding the available sending buffer size in window advertisements Stefano Brivio
2025-12-05  2:34   ` David Gibson [this message]
2025-12-08  0:20     ` Stefano Brivio
2025-12-04  7:45 ` [PATCH 7/8] tcp: Send a duplicate ACK also on complete sendmsg() failure Stefano Brivio
2025-12-05  2:35   ` David Gibson
2025-12-04  7:45 ` [PATCH 8/8] tcp: Skip redundant ACK on partial " Stefano Brivio
2025-12-05  2:36   ` David Gibson
