public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH] tcp, tcp_splice: Don't set SO_SNDBUF and SO_RCVBUF to maximum values
Date: Fri, 14 Feb 2025 13:24:40 +1100
Message-ID: <Z66paE6mDyDtfLBR@zatzit>
In-Reply-To: <Z656ICs2Vddh2nv9@zatzit>

On Fri, Feb 14, 2025 at 10:02:56AM +1100, David Gibson wrote:
> On Thu, Feb 13, 2025 at 11:16:50PM +0100, Stefano Brivio wrote:
> > I added this a long long time ago because it dramatically improved
> > throughput back then: with rmem_max and wmem_max >= 4 MiB, we would
> > force send and receive buffer sizes for TCP sockets to the maximum
> > allowed value.
> > 
> > This effectively disables TCP auto-tuning, which would otherwise allow
> > us to exceed those limits, as crazy as it might sound. But in any
> > case, it made sense.
> > 
> > Now that we have zero (internal) copies on every path, plus vhost-user
> > support, it turns out that these settings are entirely obsolete. I get
> > substantially the same throughput in every test we perform, even with
> > very short durations (one second).
> > 
> > The settings are not just useless: they actually cause us quite some
> > trouble on guest state migration, because they lead to huge queues
> > that need to be moved as well.
> > 
> > Drop those settings.
> > 
> > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> 
> Hooray!
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

So, I still think this is a good idea in general, but it does cause a
new issue for migration.  On the source, our buffers may have been
auto-tuned up, but on the target, of course, we have a new socket with
the default initial buffer size.  So, when we try to put the source's
buffer data into the target's buffer we can hit the (current) limit.
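
To make the mismatch concrete, here is a minimal sketch of one conceivable
way to narrow it: read the send buffer size on the source and request the
same size on the target before refilling. This is an illustration only, not
something the patch or passt does, and sock_get_sndbuf() and
sock_match_sndbuf() are hypothetical names. It relies on two Linux
behaviours: setsockopt() doubles the requested SO_SNDBUF value, and
getsockopt() reports the doubled value, hence the halving below.

```c
#include <sys/socket.h>

/* Return the current send buffer size of socket s in bytes, -1 on error */
int sock_get_sndbuf(int s)
{
	int v;
	socklen_t sl = sizeof(v);

	if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, &v, &sl))
		return -1;

	return v;
}

/* Hypothetical helper: ask for a send buffer that getsockopt() would report
 * as 'want' bytes. Linux doubles the value passed to setsockopt(), so we
 * pass half. Returns the size actually in effect afterwards, -1 on error.
 */
int sock_match_sndbuf(int s, int want)
{
	int req = want / 2;

	if (setsockopt(s, SOL_SOCKET, SO_SNDBUF, &req, sizeof(req)))
		return -1;

	return sock_get_sndbuf(s);
}
```

Two caveats apply: the kernel caps the request at wmem_max, so the result
can still be smaller than an auto-tuned source buffer, and setting SO_SNDBUF
explicitly turns off auto-tuning for that socket, which is what the patch
set out to avoid in the first place.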

I'm currently seeing what I think is that problem with iperf3_bidir6:
I'm getting EAGAIN on the target while filling the sndbuf, this time
for the already-sent portion (repair mode).  It might have been one of
the intermittent problems you hit as well.
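
For illustration, the failure mode looks roughly like the sketch below: a
non-blocking writer keeps queueing data until the kernel refuses more and
send() fails with EAGAIN. This is a generic stream-socket example, not the
repair-mode path itself, and fill_until_refused() is a hypothetical name.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Queue zero-filled chunks on s without blocking until the kernel refuses
 * more data. Returns the number of bytes accepted; errno then holds the
 * reason for stopping (EAGAIN or EWOULDBLOCK if the buffer is full).
 */
long fill_until_refused(int s)
{
	static const char chunk[4096];
	long total = 0;
	ssize_t n;

	while ((n = send(s, chunk, sizeof(chunk),
			 MSG_DONTWAIT | MSG_NOSIGNAL)) > 0)
		total += n;

	return total;
}
```

The byte count returned is the effective limit for that socket at that
moment, which is what the source's queue has to fit into on the target.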

Some early experiments suggest we might be able to deal with this by
moving the socket into blocking mode during the migration, although
I'm certainly concerned that might let us block indefinitely while
in the migration window if the peer isn't recv()ing for a while.
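
A rough sketch of what that could look like, assuming we also want to bound
how long we can block: clear O_NONBLOCK for the duration of the migration
and set SO_SNDTIMEO, so that a blocked send() gives up after the timeout
instead of waiting forever. The helper names are hypothetical, and the
send timeout is my own assumption, not part of the experiments above.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Switch socket s to blocking (non-zero) or non-blocking (zero) mode.
 * Returns 0 on success, -1 on error.
 */
int sock_set_blocking(int s, int blocking)
{
	int flags = fcntl(s, F_GETFL);

	if (flags < 0)
		return -1;

	if (blocking)
		flags &= ~O_NONBLOCK;
	else
		flags |= O_NONBLOCK;

	return fcntl(s, F_SETFL, flags) < 0 ? -1 : 0;
}

/* Limit how long a blocking send on s may wait, in milliseconds.
 * Returns 0 on success, -1 on error.
 */
int sock_set_send_timeout(int s, long ms)
{
	struct timeval tv = {
		.tv_sec = ms / 1000,
		.tv_usec = (ms % 1000) * 1000,
	};

	return setsockopt(s, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}
```

With the timeout set, a blocked send() returns -1 with EAGAIN or
EWOULDBLOCK if nothing could be queued in time, so the caller still has to
decide what to do with a migration that can't complete.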

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson