From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: tcp_splice SO_RCVLOWAT code; never invoked?
Date: Fri, 11 Apr 2025 08:13:23 +0200 [thread overview]
Message-ID: <20250411081323.5ac96909@elisabeth> (raw)
In-Reply-To: <Z_igmtlHmRQpOIZu@zatzit>
On Fri, 11 Apr 2025 14:54:50 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> Hi Stefano,
>
> When debugging the splice EINTR bug I fixed the other day, I found the
> whole tcp_splice_sock_handler() pretty confusing to follow. So, I was
> working on some cleanups. But then I noticed something more
> specifically odd here.
>
> We've discussed the use of SO_RCVLOWAT previously. AIUI, you found it
> essential to achieve reasonable throughput and load for spliced
> connections.
From my tests back then (never on what I ended up committing, it seems)
it wasn't needed in general, it helped only with bulk transfers that
never feel the pipe for some reason. With iperf3, I needed to play with
parameters quite a bit to reproduce something like that. You would need
(at least) to disable Nagle's algorithm (-N) and send small messages
(say, -l 4k instead of -l 1M).
> I think we've agreed before that it's not entirely the
> right tool for the job; just the only one available.
>
> Except... as far as I can tell, it's never invoked. AFAICT the only
> place we enable the RCVLOWAT stuff is in a block under this if:
>
> if (!(conn->flags & lowat_set_flag) &&
> readlen > (long)c->tcp.pipe_size / 10) {
>
> But... this occurs immediately after:
> if (readlen >= (long)c->tcp.pipe_size * 10 / 100)
> continue;
>
> .. which is a strictly more inclusive condition, so we'll never reach
> the RCVLOWAT block.
Right, yes, I think we noticed a while ago a bit after trying to
restore the functionality with 01b6a164d94f ("tcp_splice: A typo three
years ago and SO_RCVLOWAT is gone"). That wasn't sufficient.
> To confirm, I tried putting an ASSERT(0) in that block, and didn't hit
> it with spliced iperf3 runs.
>
> Am I missing something?
No, not really, and that error has been there forever, since I "added"
(not really) the feature in 904b86ade7db ("tcp: Rework window handling,
timers, add SO_RCVLOWAT and pools for sockets/pipes").
My intention was actually:
if (read >= (long)c->tcp.pipe_size * 90 / 100)
continue;
or something like that, perhaps 50% even. The idea behind it was: if we
already have good pipe utilisation, there's no need for SO_RCVLOWAT, and
we should retry calling splice() right away. But at this point we should
try again with iperf3 and smaller messages. Perhaps even limiting the
throughput (-b) with multiple flows...
--
Stefano
next prev parent reply other threads:[~2025-04-11 6:13 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-11 4:54 tcp_splice SO_RCVLOWAT code; never invoked? David Gibson
2025-04-11 6:13 ` Stefano Brivio [this message]
2025-04-11 6:15 ` Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250411081323.5ac96909@elisabeth \
--to=sbrivio@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).