public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top, Max Chernoff <git@maxchernoff.ca>
Subject: Re: [PATCH v2 1/9] tcp, util: Add function for scaling to linearly interpolated factor, use it
Date: Mon, 8 Dec 2025 16:33:17 +1100
Message-ID: <aTZjHQWWEwVNAnjX@zatzit>
In-Reply-To: <20251208002229.391162-2-sbrivio@redhat.com>

On Mon, Dec 08, 2025 at 01:22:09AM +0100, Stefano Brivio wrote:
> Right now, the only need for this kind of function comes from
> tcp_get_sndbuf(), which calculates the amount of sending buffer we
> want to use depending on its own size: we want to use more of it
> if it's smaller, as bookkeeping overhead is usually lower and we rely
> on auto-tuning there, and use less of it when it's bigger.
> 
> For this purpose, the new function is overly generic and its name is
> a mouthful: @x is the same as @y, that is, we want to use more or
> less of the buffer depending on the size of the buffer itself.
> 
> However, an upcoming change will need that generality, as we'll want
> to scale the amount of sending buffer we use depending on another
> (scaled) factor.
> 
> While at it, now that we have this new function, which makes it simple
> to specify a precise usage factor, change the amount of sending buffer
> we want to use at and above 4 MiB: 75% looks perfectly safe.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

In that it looks correct.  Several suggestions for improvements in
clarity below, either as followups, or in case a respin is needed
anyway.

> ---
>  tcp.c  |  8 ++------
>  util.c | 38 ++++++++++++++++++++++++++++++++++++++
>  util.h |  1 +
>  3 files changed, 41 insertions(+), 6 deletions(-)
> 
> diff --git a/tcp.c b/tcp.c
> index bb661ee..37012cc 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -773,7 +773,7 @@ static void tcp_rtt_dst_check(const struct tcp_tap_conn *conn,
>  }
>  
>  /**
> - * tcp_get_sndbuf() - Get, scale SO_SNDBUF between thresholds (1 to 0.5 usage)
> + * tcp_get_sndbuf() - Get, scale SO_SNDBUF between thresholds (1 to 0.75 usage)

I'd slightly prefer the change to 0.75 be in a separate patch, just so
it's easier to tell that the change to the helper function itself
doesn't change behaviour here.

>   * @conn:	Connection pointer
>   */
>  static void tcp_get_sndbuf(struct tcp_tap_conn *conn)
> @@ -788,11 +788,7 @@ static void tcp_get_sndbuf(struct tcp_tap_conn *conn)
>  		return;
>  	}
>  
> -	v = sndbuf;
> -	if (v >= SNDBUF_BIG)
> -		v /= 2;
> -	else if (v > SNDBUF_SMALL)
> -		v -= v * (v - SNDBUF_SMALL) / (SNDBUF_BIG - SNDBUF_SMALL) / 2;
> +	v = scale_x_to_y_slope(sndbuf, sndbuf, SNDBUF_SMALL, SNDBUF_BIG, 75);
>  
>  	SNDBUF_SET(conn, MIN(INT_MAX, v));
>  }
> diff --git a/util.c b/util.c
> index f32c9cb..ff0ba01 100644
> --- a/util.c
> +++ b/util.c
> @@ -1223,3 +1223,41 @@ void fsync_pcap_and_log(void)
>  	if (log_file != -1)
>  		(void)fsync(log_file);
>  }
> +
> +/**
> + * scale_x_to_y_slope() - Scale @x from 100% to f% depending on @y's value

Would "clamped_scale" work as a more descriptive name?

> + * @x:		Value to scale
> + * @y:		Value determining scaling
> + * @lo:		Lower bound for @y (start of y-axis slope)
> + * @hi:		Upper bound for @y (end of y-axis slope)
> + * @f:		Scaling factor, percent

Maybe worth clarifying that this can be less than or more than 100% -
the example below uses @f > 100%, but the usage above has @f < 100%.

> + *
> + * Return: @x scaled by @f * linear interpolation of @y between @lo and @hi
> + *
> + * In pictures:
> + *
> + *                f % -> ,----   * If @y < lo (for example, @y is y0), return @x
> + *                      /|   |
> + *                     / |   |   * If @lo < @y < @hi (for example, @y is y1),
> + *                    /  |   |     return @x scaled by a factor linearly
> + * (100 + f) / 2 % ->/   |   |     interpolated between 100% and f% depending on
> + *                  /|   |   |     @y's position between @lo (100%) and @hi (f%)
> + *                 / |   |   |
> + *                /  |   |   |   * If @y > @hi (for example, @y is y2), return
> + * 100 % -> -----'   |   |   |     @x * @f / 100
> + *           |   |   |   |   |
> + *          y0  lo  y1  hi  y2   Example: @f = 150, @lo = 10, @hi = 20, @y = 15,
> + *                                        @x = 1000
> + *                                        -> interpolated factor is 125%
> + *                                        -> return 1250
> + */
> +long scale_x_to_y_slope(long x, long y, long lo, long hi, long f)
> +{
> +	if (y < lo)
> +		return x;
> +
> +	if (y > hi)
> +		return x * f / 100;
> +
> +	return x - (x * (y - lo) / (hi - lo)) * (100 - f) / 100;

There's a subtle tradeoff here.  Dividing by (hi - lo) before
multiplying by the factor loses some precision in the final result.
On the other hand, doing all the multiplies first would increase the
risk of an overflow.
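
To make the tradeoff concrete, here's a minimal sketch comparing the
two orderings.  scale_div_first() repeats the arithmetic from the
patch; scale_mul_first() is a hypothetical variant, not something
proposed for merging, and the assert() on hi > lo is an assumption
added only for illustration:

```c
#include <assert.h>

/* Same arithmetic as the patch: divide by (hi - lo) before applying @f */
static long scale_div_first(long x, long y, long lo, long hi, long f)
{
	assert(hi > lo);	/* assumption: callers pass a non-empty range */

	if (y < lo)
		return x;

	if (y > hi)
		return x * f / 100;

	return x - (x * (y - lo) / (hi - lo)) * (100 - f) / 100;
}

/* Hypothetical variant: all multiplications first, a single division last.
 * Only one truncation happens, but the intermediate product is larger, so
 * it overflows sooner for big inputs.
 */
static long scale_mul_first(long x, long y, long lo, long hi, long f)
{
	assert(hi > lo);	/* assumption: callers pass a non-empty range */

	if (y < lo)
		return x;

	if (y > hi)
		return x * f / 100;

	return x - x * (y - lo) * (100 - f) / ((hi - lo) * 100);
}
```

With x = 3, y = 11, lo = 10, hi = 12, f = 20 the exact answer is 1.8:
scale_div_first() returns 3, scale_mul_first() returns 2.  Both agree
on the example from the comment (1250).  The price is that the
intermediate product in scale_mul_first() is up to 100 times larger
in magnitude, which matters most where long is 32 bits wide.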


Possible different way of organising this that _might_ be slightly
easier to describe:  rather than including a scaling factor, instead
give upper and lower bounds of the output, so something like:

long clamped_scale(long a, long b, long s, long sa, long sb)

=> returns a value between @a and @b, matching where @s lies between
@sa and @sb.
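
A rough sketch of what I have in mind, untested against the tree.
The body and the assert() on sb > sa are assumptions for illustration
only; just the prototype comes from the suggestion above:

```c
#include <assert.h>

/* Sketch only: return @a if @s is at or below @sa, @b if @s is at or above
 * @sb, and otherwise a value linearly interpolated between @a and @b
 * according to where @s lies between @sa and @sb.
 */
long clamped_scale(long a, long b, long s, long sa, long sb)
{
	assert(sb > sa);	/* assumption: non-empty input range */

	if (s <= sa)
		return a;

	if (s >= sb)
		return b;

	/* Multiply before dividing: a single truncation, but (b - a) times
	 * (s - sa) has to fit in a long
	 */
	return a + (b - a) * (s - sa) / (sb - sa);
}
```

The example from the comment would become
clamped_scale(1000, 1500, 15, 10, 20), which gives 1250.  The caller
in tcp_get_sndbuf() would then presumably pass sndbuf and
sndbuf * 75 / 100 as @a and @b, but that's a guess at how it would be
wired up.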



> +}
> diff --git a/util.h b/util.h
> index 17f5ae0..ec75453 100644
> --- a/util.h
> +++ b/util.h
> @@ -242,6 +242,7 @@ int read_remainder(int fd, const struct iovec *iov, size_t cnt, size_t skip);
>  void close_open_files(int argc, char **argv);
>  bool snprintf_check(char *str, size_t size, const char *format, ...);
>  void fsync_pcap_and_log(void);
> +long scale_x_to_y_slope(long x, long y, long lo, long hi, long f);
>  
>  /**
>   * af_name() - Return name of an address family
> -- 
> 2.43.0
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
