public inbox for passt-dev@passt.top
* [PATCH] tcp: Fix rounding issue in check for approximating window to zero
@ 2026-01-09 14:14 Stefano Brivio
  2026-01-12  4:02 ` David Gibson
  0 siblings, 1 reply; 2+ messages in thread
From: Stefano Brivio @ 2026-01-09 14:14 UTC (permalink / raw)
  To: passt-dev; +Cc: David Gibson

In general, we approximate the advertised window to zero if we would
otherwise advertise less than a MSS worth, and the reasoning behind
that is explained in cf1925fb7b77 ("tcp: Don't limit window to
less-than-MSS values, use zero instead").

Then, in commit b40f5cd8c8e1 ("tcp: Use less-than-MSS window on no
queued data, or no data sent recently"), I introduced some conditions
under which we won't do that, including a check on whether any data
was sent recently.

As an arbitrary but probably reasonable threshold, we consider data to
have recently been sent if that occurred less than ten times the
round-trip time (RTT) ago.

The time elapsed since the last data transmission is reported by the
kernel in milliseconds, in the tcpi_last_data_sent field of struct
tcp_info, and the RTT is reported in microseconds instead, in
tcpi_rtt.

To avoid the risk of overflow in a simple way, for the purpose of this
comparison, I converted tcpi_rtt to milliseconds first. However, integer
division truncates any RTT below one millisecond to zero, so the check
is always false in that case (and we never approximate the window to
zero).

This, in turn, reintroduces nasty delay issues in transfers over
non-local connections that nevertheless have almost-local (low) latency.

Given that we want to use ten times the RTT as an arbitrary "long
enough" upper bound, round the RTT up while converting it to
milliseconds.
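
For illustration only (this is not part of the patch), the difference
between the two conversions can be sketched as below. The helper names
recent_truncating() and recent_ceiling() are hypothetical, and
DIV_ROUND_UP is assumed here to be the usual round-up division idiom;
the macro actually used in passt may be defined differently.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the common round-up integer division idiom */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Old check: the RTT (microseconds) is truncated to milliseconds, so any
 * RTT below 1000 us yields a threshold of 0 ms and the unsigned
 * comparison can never be true.
 */
static int recent_truncating(uint32_t last_data_sent_ms, uint32_t rtt_us)
{
	return last_data_sent_ms < rtt_us / 1000 * 10;
}

/* New check: the RTT is rounded up to the next millisecond, so the
 * threshold is at least 10 ms for any non-zero RTT.
 */
static int recent_ceiling(uint32_t last_data_sent_ms, uint32_t rtt_us)
{
	unsigned rtt_ms_ceiling = DIV_ROUND_UP(rtt_us, 1000);

	return last_data_sent_ms < rtt_ms_ceiling * 10;
}
```

With an RTT of 300 microseconds and data last sent 2 ms ago,
recent_truncating(2, 300) is 0 because the threshold collapses to 0 ms,
while recent_ceiling(2, 300) is 1 because the threshold becomes 10 ms.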

As an alternative, we could perform the comparison in microseconds,
but we would need a slightly more complicated implementation to
exclude overflows, and it's definitely not worth it given the nature
of this threshold.
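
As a rough sketch of that alternative, and not something this patch
implements, one way to compare in microseconds without overflow is to
widen both sides to 64 bits first. The function name recent_us() is
hypothetical, and both inputs are assumed to be 32-bit values as in
struct tcp_info.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compare in microseconds, widening to 64 bits so
 * that neither multiplication can overflow for any 32-bit input.
 */
static int recent_us(uint32_t last_data_sent_ms, uint32_t rtt_us)
{
	uint64_t elapsed_us = (uint64_t)last_data_sent_ms * 1000;
	uint64_t threshold_us = (uint64_t)rtt_us * 10;

	return elapsed_us < threshold_us;
}
```

Note that this is stricter than the rounded-up check: with an RTT of
300 microseconds the threshold here is 3 ms rather than 10 ms, so
recent_us(2, 300) is 1 but recent_us(3, 300) is 0.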

Fixes: b40f5cd8c8e1 ("tcp: Use less-than-MSS window on no queued data, or no data sent recently")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 tcp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tcp.c b/tcp.c
index e7fa85f..9b7f505 100644
--- a/tcp.c
+++ b/tcp.c
@@ -1182,6 +1182,7 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
 	if ((conn->flags & LOCAL) || tcp_rtt_dst_low(conn)) {
 		new_wnd_to_tap = tinfo->tcpi_snd_wnd;
 	} else {
+		unsigned rtt_ms_ceiling = DIV_ROUND_UP(tinfo->tcpi_rtt, 1000);
 		uint32_t sendq;
 		int limit;
 
@@ -1225,7 +1226,7 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
 		 *   with pending data in the outbound queue
 		 */
 		if (limit < MSS_GET(conn) && sendq &&
-		    tinfo->tcpi_last_data_sent < tinfo->tcpi_rtt / 1000 * 10)
+		    tinfo->tcpi_last_data_sent < rtt_ms_ceiling * 10)
 			limit = 0;
 
 		new_wnd_to_tap = MIN((int)tinfo->tcpi_snd_wnd, limit);
-- 
2.43.0



* Re: [PATCH] tcp: Fix rounding issue in check for approximating window to zero
  2026-01-09 14:14 [PATCH] tcp: Fix rounding issue in check for approximating window to zero Stefano Brivio
@ 2026-01-12  4:02 ` David Gibson
  0 siblings, 0 replies; 2+ messages in thread
From: David Gibson @ 2026-01-12  4:02 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev


On Fri, Jan 09, 2026 at 03:14:43PM +0100, Stefano Brivio wrote:
> In general, we approximate the advertised window to zero if we would
> otherwise advertise less than a MSS worth, and the reasoning behind
> that is explained in cf1925fb7b77 ("tcp: Don't limit window to
> less-than-MSS values, use zero instead").
> 
> Then, in commit b40f5cd8c8e1 ("tcp: Use less-than-MSS window on no
> queued data, or no data sent recently"), I introduced some conditions
> under which we won't do that, including a check on whether any data
> was sent recently.
> 
> As an arbitrary but probably reasonable threshold, we consider data to
> have recently been sent if that occurred less than ten times the
> round-trip time (RTT) ago.
> 
> The time elapsed since the last data transmission is reported by the
> kernel in milliseconds, in the tcpi_last_data_sent field of struct
> tcp_info, and the RTT is reported in microseconds instead, in
> tcpi_rtt.
> 
> To avoid the risk of overflow in a simple way, for the purpose of this
> comparison, I converted tcpi_rtt to milliseconds first, but this means
> that the check will always be false (and we'll never approximate the
> window to zero) if the RTT is below one millisecond.
> 
> This, in turn, reintroduces nasty delay issues in transfers in
> non-local connections which have however almost-local (low) latency.
> 
> Given that we want to use ten times the RTT as an arbitrary "long
> enough" upper bound, round the RTT up while converting it to
> milliseconds.
> 
> As an alternative, we could perform the comparison in microseconds,
> but we would need a slightly more complicated implementation to
> exclude overflows, and it's definitely not worth it given the nature
> of this threshold.
> 
> Fixes: b40f5cd8c8e1 ("tcp: Use less-than-MSS window on no queued data, or no data sent recently")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  tcp.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tcp.c b/tcp.c
> index e7fa85f..9b7f505 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -1182,6 +1182,7 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>  	if ((conn->flags & LOCAL) || tcp_rtt_dst_low(conn)) {
>  		new_wnd_to_tap = tinfo->tcpi_snd_wnd;
>  	} else {
> +		unsigned rtt_ms_ceiling = DIV_ROUND_UP(tinfo->tcpi_rtt, 1000);
>  		uint32_t sendq;
>  		int limit;
>  
> @@ -1225,7 +1226,7 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>  		 *   with pending data in the outbound queue
>  		 */
>  		if (limit < MSS_GET(conn) && sendq &&
> -		    tinfo->tcpi_last_data_sent < tinfo->tcpi_rtt / 1000 * 10)
> +		    tinfo->tcpi_last_data_sent < rtt_ms_ceiling * 10)
>  			limit = 0;
>  
>  		new_wnd_to_tap = MIN((int)tinfo->tcpi_snd_wnd, limit);
> -- 
> 2.43.0
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson



Code repositories for project(s) associated with this public inbox

	https://passt.top/passt
