public inbox for passt-dev@passt.top
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH] tcp_splice: A typo three years ago and SO_RCVLOWAT is gone
Date: Mon, 17 Feb 2025 08:12:10 +0100
Message-ID: <20250217081210.52b3ba3b@elisabeth>
In-Reply-To: <Z7Kx0x8YMYRqIZ06@zatzit>

On Mon, 17 Feb 2025 14:49:39 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Sun, Feb 16, 2025 at 11:12:15PM +0100, Stefano Brivio wrote:
> > In commit e5eefe77435a ("tcp: Refactor to use events instead of
> > states, split out spliced implementation"), this:
> > 
> > 			if (!bitmap_isset(rcvlowat_set, conn - ts) &&
> > 			    readlen > (long)c->tcp.pipe_size / 10) {
> > 
> > (note the !) became:
> > 
> > 			if (conn->flags & lowat_set_flag &&
> > 			    readlen > (long)c->tcp.pipe_size / 10) {
> > 
> > in the new tcp_splice_sock_handler().
> > 
> > We want to check, there, if we should set SO_RCVLOWAT, only if we
> > haven't set it already.
> > 
> > But, instead, we're checking if it's already set before we set it, so
> > we'll never set it, of course.
> > 
> > Fix the check and re-enable the functionality, which should give us
> > improved CPU utilisation in non-interactive cases where we are not
> > transferring at full pipe capacity.
> > 
> > Fixes: e5eefe77435a ("tcp: Refactor to use events instead of states, split out spliced implementation")
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>  
> 
> Ouch.
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> 
> At least insofar as this clearly corrects towards the intended
> behaviour.  Given that we have inadvertently not been using RCVLOWAT
> for so long, I am a bit worried that this might expose deadlocks or
> stalls.  But, I guess we debug that when we come to it.

Yeah, I was undecided as well, then I tested and tested, and I
realised that commit 904b86ade7db ("tcp: Rework window handling, timers,
add SO_RCVLOWAT and pools for sockets/pipes") added this gem, still there:

			if (read >= (long)c->tcp.pipe_size * 10 / 100)
				continue;

			if (!bitmap_isset(rcvlowat_set, conn - ts) &&
			    read > (long)c->tcp.pipe_size / 10) {
				int lowat = c->tcp.pipe_size / 4;

				setsockopt(move_from, SOL_SOCKET, SO_RCVLOWAT,
					   &lowat, sizeof(lowat));

which means that we'll not set SO_RCVLOWAT anyway: if
read >= c->tcp.pipe_size / 10, we hit the continue and never reach the
second block, and if not, its read > c->tcp.pipe_size / 10 condition is
false, so we skip it anyway.
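
To spell that out, here is a minimal standalone sketch of the two checks.
The helper names are made up for illustration and are not part of passt,
and the bitmap lookup is reduced to a plain boolean:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not passt code: mirrors the two checks quoted
 * above. Returns true if the setsockopt(SO_RCVLOWAT) branch would run.
 */
static bool lowat_branch_reached(long read, long pipe_size, bool lowat_set)
{
	assert(pipe_size > 0);

	/* First check: the 'continue' in the original loop */
	if (read >= pipe_size * 10 / 100)
		return false;

	/* Second check: here read is already below pipe_size / 10 */
	return !lowat_set && read > pipe_size / 10;
}

/* Count the read sizes in [0, 2 * pipe_size] that reach the branch */
static long lowat_reachable_count(long pipe_size)
{
	long read, n = 0;

	for (read = 0; read <= 2 * pipe_size; read++)
		n += lowat_branch_reached(read, pipe_size, false);

	return n;
}
```

Both thresholds are the same integer (pipe_size * 10 / 100 and
pipe_size / 10 truncate to the same value), so lowat_reachable_count()
comes out as zero for any pipe size that doesn't overflow a long.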

Now, I have a clear memory of characterising those 10% and 25% values
over a wide range of pipe sizes, message sizes, etc. Other than using
SSH and installing packages in a container (and checking that nothing
gets stuck for about one second), my basic test idea was:

  $ iperf3 -c localhost -p 5202 -l 100

with port 5202 forwarded to the network namespace and iperf3 server
listening there.

I think I added another typo while cleaning up. I probably meant:

			    [...]
			    read < (long)c->tcp.pipe_size / 10) {

...and if I do that, it finally works.
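
As a rough sketch of what the flipped comparison does (again with a
made-up helper name and a boolean instead of the bitmap, and not a tested
fix), the decision becomes:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not passt code: returns the SO_RCVLOWAT value
 * that would be set with the comparison flipped to '<', or 0 if the
 * option would be left alone.
 */
static int lowat_to_set(long read, long pipe_size, bool lowat_set)
{
	assert(pipe_size > 0);

	/* Unchanged first check: reads at or above 10% are skipped */
	if (read >= pipe_size * 10 / 100)
		return 0;

	/* Flipped check: small read, and the option isn't set yet */
	if (!lowat_set && read < pipe_size / 10)
		return (int)(pipe_size / 4);

	return 0;
}
```

With a 64 KiB pipe, a 100-byte read would ask for a 16 KiB low-water
mark, once, while reads at or above 10% of the pipe size still take the
early exit.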

But anyway, that's too many typos at this point and, more importantly,
we never had a release with it enabled, so I'm not "fixing" this for the
moment: it needs a lot more testing than I can do now.

-- 
Stefano

