public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH] tcp: Fix rounding issue in check for approximating window to zero
Date: Mon, 12 Jan 2026 15:02:52 +1100
Message-ID: <aWRybK2pqTjo5vBI@zatzit>
In-Reply-To: <20260109141443.3541507-1-sbrivio@redhat.com>

On Fri, Jan 09, 2026 at 03:14:43PM +0100, Stefano Brivio wrote:
> In general, we approximate the advertised window to zero if we would
> otherwise advertise less than a MSS worth, and the reasoning behind
> that is explained in cf1925fb7b77 ("tcp: Don't limit window to
> less-than-MSS values, use zero instead").
> 
> Then, in commit b40f5cd8c8e1 ("tcp: Use less-than-MSS window on no
> queued data, or no data sent recently"), I introduced some conditions
> under which we won't do that, including a check on whether any data
> was sent recently.
> 
> As an arbitrary but probably reasonable threshold, we consider data to
> have recently been sent if that occurred less than ten times the
> round-trip time (RTT) ago.
> 
> The time elapsed since the last data transmission is reported by the
> kernel in milliseconds, in the tcpi_last_data_sent field of struct
> tcp_info, and the RTT is reported in microseconds instead, in
> tcpi_rtt.
> 
> To avoid the risk of overflow in a simple way, for the purpose of this
> comparison, I converted tcpi_rtt to milliseconds first, but this means
> that the check will always be false (and we'll never approximate the
> window to zero) if the RTT is below one millisecond.
> 
> This, in turn, reintroduces nasty delay issues in transfers over
> non-local connections that nevertheless have almost-local (low) latency.
> 
> Given that we want to use ten times the RTT as an arbitrary "long
> enough" upper bound, round the RTT up while converting it to
> milliseconds.
> 
> As an alternative, we could perform the comparison in microseconds,
> but we would need a slightly more complicated implementation to
> exclude overflows, and it's definitely not worth it given the nature
> of this threshold.
> 
> Fixes: b40f5cd8c8e1 ("tcp: Use less-than-MSS window on no queued data, or no data sent recently")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  tcp.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tcp.c b/tcp.c
> index e7fa85f..9b7f505 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -1182,6 +1182,7 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>  	if ((conn->flags & LOCAL) || tcp_rtt_dst_low(conn)) {
>  		new_wnd_to_tap = tinfo->tcpi_snd_wnd;
>  	} else {
> +		unsigned rtt_ms_ceiling = DIV_ROUND_UP(tinfo->tcpi_rtt, 1000);
>  		uint32_t sendq;
>  		int limit;
>  
> @@ -1225,7 +1226,7 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>  		 *   with pending data in the outbound queue
>  		 */
>  		if (limit < MSS_GET(conn) && sendq &&
> -		    tinfo->tcpi_last_data_sent < tinfo->tcpi_rtt / 1000 * 10)
> +		    tinfo->tcpi_last_data_sent < rtt_ms_ceiling * 10)
>  			limit = 0;
>  
>  		new_wnd_to_tap = MIN((int)tinfo->tcpi_snd_wnd, limit);
> -- 
> 2.43.0
> 
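
For illustration only, here is a minimal standalone sketch of the rounding
issue described in the commit message. It is not code from tcp.c: the
helper names recent_trunc(), recent_ceil() and recent_us64() are
hypothetical, and DIV_ROUND_UP is defined locally with the usual round-up
division idiom, which is assumed to match the macro the patch relies on.
The third helper sketches the microsecond comparison that the commit
message mentions as an alternative, using 64-bit arithmetic as one
possible way to exclude overflow.

```c
#include <stdint.h>

/* Assumed definition: the common round-up integer division idiom */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Check before the patch: RTT truncated to milliseconds, then scaled.
 * For rtt_us < 1000 the threshold is 0, so this is always false.
 */
static int recent_trunc(uint32_t last_sent_ms, uint32_t rtt_us)
{
	return last_sent_ms < rtt_us / 1000 * 10;
}

/* Check after the patch: RTT rounded up to milliseconds, then scaled.
 * Any non-zero RTT gives a threshold of at least 10 ms.
 */
static int recent_ceil(uint32_t last_sent_ms, uint32_t rtt_us)
{
	unsigned rtt_ms_ceiling = DIV_ROUND_UP(rtt_us, 1000);

	return last_sent_ms < rtt_ms_ceiling * 10;
}

/* Hypothetical alternative: compare in microseconds, widening both
 * sides to 64 bits so that neither multiplication can overflow.
 */
static int recent_us64(uint32_t last_sent_ms, uint32_t rtt_us)
{
	return (uint64_t)last_sent_ms * 1000 < (uint64_t)rtt_us * 10;
}
```

With an RTT of 300 microseconds and data last sent 2 ms ago,
recent_trunc() reports "not recent" while recent_ceil() and
recent_us64() both report "recent". At 5 ms, recent_ceil() still reports
"recent" and recent_us64() does not, which is consistent with treating
ten times the RTT as a loose upper bound rather than an exact limit.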

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson