From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH] tcp_splice: A typo three years ago and SO_RCVLOWAT is gone
Date: Mon, 17 Feb 2025 08:12:10 +0100
Message-ID: <20250217081210.52b3ba3b@elisabeth>
In-Reply-To: <Z7Kx0x8YMYRqIZ06@zatzit>

On Mon, 17 Feb 2025 14:49:39 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Sun, Feb 16, 2025 at 11:12:15PM +0100, Stefano Brivio wrote:
> > In commit e5eefe77435a ("tcp: Refactor to use events instead of
> > states, split out spliced implementation"), this:
> >
> >     if (!bitmap_isset(rcvlowat_set, conn - ts) &&
> >         readlen > (long)c->tcp.pipe_size / 10) {
> >
> > (note the !) became:
> >
> >     if (conn->flags & lowat_set_flag &&
> >         readlen > (long)c->tcp.pipe_size / 10) {
> >
> > in the new tcp_splice_sock_handler().
> >
> > We want to check, there, if we should set SO_RCVLOWAT, only if we
> > haven't set it already.
> >
> > But, instead, we're checking if it's already set before we set it, so
> > we'll never set it, of course.
> >
> > Fix the check and re-enable the functionality, which should give us
> > improved CPU utilisation in non-interactive cases where we are not
> > transferring at full pipe capacity.
> >
> > Fixes: e5eefe77435a ("tcp: Refactor to use events instead of states, split out spliced implementation")
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
>
> Ouch.
>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>
> At least insofar as this clearly corrects towards the intended
> behaviour. Given that we have inadvertently not been using RCVLOWAT for so
> long, I am a bit worried that this might expose deadlocks or stalls.
> But, I guess we debug that when we come to it.

Yeah, I was undecided as well, then I tested and tested, and I
realised that commit 904b86ade7db ("tcp: Rework window handling, timers,
add SO_RCVLOWAT and pools for sockets/pipes") added this gem, still there:

    if (read >= (long)c->tcp.pipe_size * 10 / 100)
        continue;

    if (!bitmap_isset(rcvlowat_set, conn - ts) &&
        read > (long)c->tcp.pipe_size / 10) {
        int lowat = c->tcp.pipe_size / 4;

        setsockopt(move_from, SOL_SOCKET, SO_RCVLOWAT,
                   &lowat, sizeof(lowat));

which means that we'll not set SO_RCVLOWAT anyway, because if
read >= c->tcp.pipe_size / 10, we'll hit the continue and never reach the
second block, and if not, its read > c->tcp.pipe_size / 10 condition is
false, so we'll skip it anyway.
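
The dead branch can be checked mechanically. This is a minimal sketch, not code from passt: reaches_setsockopt() and count_reachable() are hypothetical helpers that only mirror the two comparisons quoted above, with a plain boolean standing in for the rcvlowat_set bitmap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the two checks quoted above: returns true
 * if, for a given read size, execution would reach the setsockopt() call. */
bool reaches_setsockopt(long read, long pipe_size, bool lowat_set)
{
    assert(pipe_size > 0);

    /* First check: at 10% of the pipe size or more, the loop continues */
    if (read >= pipe_size * 10 / 100)
        return false;

    /* Second check, as written: needs read > 10%, already excluded above */
    if (!lowat_set && read > pipe_size / 10)
        return true;

    return false;
}

/* Count how many read sizes in [0, 2 * pipe_size] reach setsockopt() */
long count_reachable(long pipe_size)
{
    long n = 0;

    for (long read = 0; read <= 2 * pipe_size; read++)
        n += reaches_setsockopt(read, pipe_size, false);

    return n;
}
```

For any positive pipe size, count_reachable() should come out as zero, since the two conditions are mutually exclusive.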

Now, I have a clear memory of characterising those 10% and 25% values
over a wide range of pipe sizes, message sizes, etc. Other than SSH and
installing packages in a container (and checking that nothing gets stuck
for ~one second), my basic test idea was:

    $ iperf3 -c localhost -p 5202 -l 100

with port 5202 forwarded to the network namespace and an iperf3 server
listening there.

I think I added another typo while cleaning up. I probably meant:

    [...]
        read < (long)c->tcp.pipe_size / 10) {

...and if I do that, it finally works.
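
A minimal sketch of what the corrected comparison would do, under stated assumptions: should_set_rcvlowat() and maybe_set_rcvlowat() are hypothetical names, a plain boolean replaces the rcvlowat_set bitmap, and the earlier continue check and the rest of the handler are left out. It is untested against real traffic and only illustrates the 10% / 25% thresholds mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical helper: with the comparison flipped to '<', SO_RCVLOWAT is
 * wanted when it is not set yet and the read was below 10% of the pipe. */
bool should_set_rcvlowat(long read, long pipe_size, bool lowat_set)
{
    assert(pipe_size > 0);

    return !lowat_set && read < pipe_size / 10;
}

/* Hypothetical wrapper: set SO_RCVLOWAT to 25% of the pipe size.
 * Returns 1 if set, 0 if nothing to do, -1 if setsockopt() failed. */
int maybe_set_rcvlowat(int s, long read, long pipe_size, bool *lowat_set)
{
    int lowat = (int)(pipe_size / 4);

    if (!should_set_rcvlowat(read, pipe_size, *lowat_set))
        return 0;

    if (setsockopt(s, SOL_SOCKET, SO_RCVLOWAT, &lowat, sizeof(lowat)))
        return -1;

    /* Remember that the option is set, so it is not set again */
    *lowat_set = true;
    return 1;
}
```

Keeping the decision in a separate pure function makes the threshold easy to exercise without a socket; the flag is only updated once setsockopt() succeeds.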

But anyway, it's too many typos at this point and, especially, we never
had a release with it enabled, so I'm not "fixing" this for the moment:
it needs a lot more testing than I can do now.

-- 
Stefano