From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: tcp_splice SO_RCVLOWAT code; never invoked?
Date: Fri, 11 Apr 2025 08:13:23 +0200
Message-ID: <20250411081323.5ac96909@elisabeth>
In-Reply-To: <Z_igmtlHmRQpOIZu@zatzit>

On Fri, 11 Apr 2025 14:54:50 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> Hi Stefano,
> 
> When debugging the splice EINTR bug I fixed the other day, I found the
> whole tcp_splice_sock_handler() pretty confusing to follow.  So, I was
> working on some cleanups.  But then I noticed something more
> specifically odd here.
> 
> We've discussed the use of SO_RCVLOWAT previously.  AIUI, you found it
> essential to achieve reasonable throughput and load for spliced
> connections.

From my tests back then (never on what I ended up committing, it seems)
it wasn't needed in general: it helped only with bulk transfers that
never fill the pipe for some reason. With iperf3, I needed to play with
parameters quite a bit to reproduce something like that. You would need
(at least) to disable Nagle's algorithm (-N) and send small messages
(say, -l 4k instead of -l 1M).

> I think we've agreed before that it's not entirely the
> right tool for the job; just the only one available.
> 
> Except... as far as I can tell, it's never invoked.  AFAICT the only
> place we enable the RCVLOWAT stuff is in a block under this if:
> 
> 	if (!(conn->flags & lowat_set_flag) &&
> 	    readlen > (long)c->tcp.pipe_size / 10) {
> 
> But... this occurs immediately after:
> 	if (readlen >= (long)c->tcp.pipe_size * 10 / 100)
> 		continue;
> 
> .. which is a strictly more inclusive condition, so we'll never reach
> the RCVLOWAT block.

Right, yes, I think we noticed that a while ago, a bit after trying to
restore the functionality with 01b6a164d94f ("tcp_splice: A typo three
years ago and SO_RCVLOWAT is gone"). That wasn't sufficient.
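
As a sanity check outside passt, here is a minimal sketch that
brute-forces every read length against the two conditions you quoted.
The helper names and the skip_pct parameter are made up for
illustration, they are not in the tree, and pipe_size is passed as a
plain long instead of c->tcp.pipe_size:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not passt code: true if a read of 'readlen' bytes
 * gets past the early 'continue' and into the SO_RCVLOWAT block.
 * 'skip_pct' is the pipe fill percentage at which we 'continue'.
 */
bool lowat_block_reached(long readlen, long pipe_size, int skip_pct)
{
	assert(pipe_size > 0 && skip_pct >= 0 && skip_pct <= 100);

	if (readlen >= pipe_size * skip_pct / 100)
		return false;		/* this is the 'continue' */

	return readlen > pipe_size / 10;	/* condition guarding the block */
}

/* Count the read lengths in [0, pipe_size] that reach the block */
long lowat_reachable_count(long pipe_size, int skip_pct)
{
	long n = 0, readlen;

	for (readlen = 0; readlen <= pipe_size; readlen++)
		n += lowat_block_reached(readlen, pipe_size, skip_pct);

	return n;
}
```

With skip_pct at 10, as in the current code, lowat_reachable_count()
returns 0 for any pipe size, because pipe_size * 10 / 100 and
pipe_size / 10 are the same value in integer arithmetic. With a higher
percentage, some read lengths get through.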

> To confirm, I tried putting an ASSERT(0) in that block, and didn't hit
> it with spliced iperf3 runs.
> 
> Am I missing something?

No, not really, and that error has been there forever, since I "added"
(not really) the feature in 904b86ade7db ("tcp: Rework window handling,
timers, add SO_RCVLOWAT and pools for sockets/pipes").

My intention was actually:

			if (readlen >= (long)c->tcp.pipe_size * 90 / 100)
				continue;

or something like that, perhaps 50% even. The idea behind it was: if we
already have good pipe utilisation, there's no need for SO_RCVLOWAT, and
we should retry calling splice() right away. But at this point we should
try again with iperf3 and smaller messages. Perhaps even limiting the
throughput (-b) with multiple flows...
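
Roughly, that logic could look like the sketch below. This is only an
illustration: maybe_set_rcvlowat(), SKIP_PCT and the pipe_size / 4
low-water mark are assumptions, not what is in the tree. The only real
interface used is setsockopt() with SOL_SOCKET and SO_RCVLOWAT:

```c
#include <assert.h>
#include <sys/socket.h>

#define SKIP_PCT	90	/* assumption: could just as well be 50 */

/* Hypothetical helper, not passt code.
 *
 * Return: 1 if SO_RCVLOWAT was set on socket 's', 0 if we left it alone,
 *	   -1 if setsockopt() failed
 */
int maybe_set_rcvlowat(int s, long readlen, long pipe_size)
{
	int lowat;

	assert(pipe_size > 0);

	/* Pipe utilisation is already good: caller retries splice() now */
	if (readlen >= pipe_size * SKIP_PCT / 100)
		return 0;

	/* Very small read: probably not a bulk transfer, leave it alone */
	if (readlen <= pipe_size / 10)
		return 0;

	/* In between: ask the kernel to wake us up only once a reasonable
	 * amount of data is queued (assumed value: a quarter of the pipe)
	 */
	lowat = (int)(pipe_size / 4);
	if (setsockopt(s, SOL_SOCKET, SO_RCVLOWAT, &lowat, sizeof(lowat)))
		return -1;

	return 1;
}
```

The existing per-connection flag check would still need to wrap this,
so that we don't call setsockopt() on every read.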

-- 
Stefano

