From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: Matej Hrica <mhrica@redhat.com>, passt-dev@passt.top
Subject: Re: [PATCH RFT 5/5] passt.1: Add note about tuning rmem_max and wmem_max for throughput
Date: Mon, 25 Sep 2023 14:57:40 +1000
Message-ID: <ZRETRN2PTCiEWeWd@zatzit>
In-Reply-To: <20230922220610.58767-6-sbrivio@redhat.com>
On Sat, Sep 23, 2023 at 12:06:10AM +0200, Stefano Brivio wrote:
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
> passt.1 | 33 +++++++++++++++++++++++++++++++++
> 1 file changed, 33 insertions(+)
>
> diff --git a/passt.1 b/passt.1
> index 1ad4276..bcbe6fd 100644
> --- a/passt.1
> +++ b/passt.1
> @@ -926,6 +926,39 @@ If the sending window cannot be queried, it will always be announced as the
> current sending buffer size to guest or target namespace. This might affect
> throughput of TCP connections.
>
> +.SS Tuning for high throughput
> +
> +On Linux, by default, the maximum memory that can be set for receive and send
> +socket buffers is 208 KiB. Those limits are set by the
> +\fI/proc/sys/net/core/rmem_max\fR and \fI/proc/sys/net/core/wmem_max\fR files,
> +see \fBsocket\fR(7).
> +
> +As of Linux 6.5, while the TCP implementation can dynamically shrink buffers
> +depending on utilisation even above those limits, such a small limit will
"shrink buffers" and "even above those limits" don't seem to quite
work together.
> +reflect on the advertised TCP window at the beginning of a connection, and the
Hmm... while [rw]mem_max might limit that initial window size, I
wouldn't expect increasing the limits alone to increase that initial
window size: wouldn't that instead be affected by the TCP default
buffer size, i.e. the middle value in net.ipv4.tcp_rmem?
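For illustration, here is a minimal Python sketch of where I'd look on a
given host. The helper names are made up for this example and nothing
here comes from passt. It reads net.ipv4.tcp_rmem, whose three fields are
minimum, default and maximum according to tcp(7), alongside
net.core.rmem_max, so that the default can be compared against the limit.
It only reports the values and makes no claim about how the kernel
combines them.

```python
def sysctl_path(name):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return "/proc/sys/" + name.replace(".", "/")


def parse_tcp_mem(text):
    """Split a tcp_rmem/tcp_wmem line into (min, default, max), in bytes."""
    lo, default, hi = (int(field) for field in text.split())
    return lo, default, hi


def read_sysctl(name):
    """Return the sysctl's contents as a string, or None if unreadable."""
    try:
        with open(sysctl_path(name)) as f:
            return f.read().strip()
    except OSError:
        return None


def report():
    """Print the TCP default receive buffer next to the core limit."""
    tcp_rmem = read_sysctl("net.ipv4.tcp_rmem")
    rmem_max = read_sysctl("net.core.rmem_max")
    if tcp_rmem is None or rmem_max is None:
        print("sysctls not readable on this system")
        return
    _, default, tcp_max = parse_tcp_mem(tcp_rmem)
    print(f"net.ipv4.tcp_rmem default: {default} bytes (max {tcp_max})")
    print(f"net.core.rmem_max:         {int(rmem_max)} bytes")


if __name__ == "__main__":
    report()
```

Running it before and after changing rmem_max would show whether the
default in tcp_rmem moves at all, which is the part I'm doubtful about.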
> +buffer size of the UNIX domain socket buffer used by \fBpasst\fR cannot exceed
> +these limits anyway.
> +
> +Further, as of Linux 6.5, using socket options \fBSO_RCVBUF\fR and
> +\fBSO_SNDBUF\fR will prevent TCP buffers to expand above the \fIrmem_max\fR and
> +\fIwmem_max\fR limits because the automatic adjustment provided by the TCP
> +implementation is then disabled.
> +
> +As a consequence, \fBpasst\fR and \fBpasta\fR probe these limits at start-up and
> +will not set TCP socket buffer sizes if they are lower than 2 MiB, because this
> +would affect the maximum size of TCP buffers for the whole duration of a
> +connection.
> +
> +Note that 208 KiB is, accounting for kernel overhead, enough to fit less than
> +three TCP packets at the default MSS. In applications where high throughput is
> +expected, it is therefore advisable to increase those limits to at least 2 MiB,
> +or even 16 MiB:
> +
> +.nf
> + sysctl -w net.core.rmem_max=$((16 << 20))
> + sysctl -w net.core.wmem_max=$((16 << 20))
> +.fi
As noted in a previous mail, empirically, this doesn't necessarily
seem to work better for me. I'm wondering if we'd be better off never
touching SO_RCVBUF and SO_SNDBUF for TCP sockets, and letting the kernel
do its adaptive thing. We probably still want to expand the buffers as
much as we can for the Unix socket, though. And we likely still want
expanded limits for the tests so that iperf3 can use large buffers.
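
Roughly what I have in mind, as a Python sketch rather than actual passt
code (the function name and the requested size are assumptions for
illustration): request large buffers on the Unix socket only, read back
what the kernel granted, and leave the TCP socket untouched so that
autotuning stays enabled. Per socket(7), Linux caps the request at
rmem_max/wmem_max and reports a doubled value via getsockopt().

```python
import socket


def grow_unix_buffers(sock, wanted):
    """Request `wanted` bytes for both buffers; return (rcvbuf, sndbuf) as granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, wanted)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, wanted)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    )


if __name__ == "__main__":
    # Unix domain socket pair standing in for the passt <-> hypervisor link.
    unix_a, unix_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # TCP socket: deliberately no SO_RCVBUF/SO_SNDBUF, only read the default.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        rcv, snd = grow_unix_buffers(unix_a, 16 << 20)
        print(f"Unix socket: rcvbuf={rcv} sndbuf={snd}")
        tcp_rcv = tcp.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        print(f"TCP socket (untouched): rcvbuf={tcp_rcv}")
    finally:
        unix_a.close()
        unix_b.close()
        tcp.close()
```

With the default limits the Unix socket values should come back well
below the 16 MiB requested, which is why the raised limits still matter
there even if we stop setting the options for TCP.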
> +
> .SH LIMITATIONS
>
> Currently, IGMP/MLD proxying (RFC 4605) and support for SCTP (RFC 4960) are not
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson