From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH] tcp_splice: Don't wake up on input data if we can't write it anywhere
Date: Mon, 17 Feb 2025 14:51:20 +1100
Message-ID: <Z7KyOKhTT9rKiCqQ@zatzit>
In-Reply-To: <20250216221216.2014593-2-sbrivio@redhat.com>

On Sun, Feb 16, 2025 at 11:12:16PM +0100, Stefano Brivio wrote:
> If we set the OUT_WAIT_* flag (waiting on EPOLLOUT) for a side of a
> given flow, it means that we're blocked, waiting for the receiver to
> actually receive data, with a full pipe.
>
> In that case, if we keep EPOLLIN set for the socket on the other side
> (our receiving side), we'll get into a loop such as:
>
> 41.0230: pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
> 41.0230: Flow 1 (TCP connection (spliced)): -1 from read-side call
> 41.0230: Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
> 41.0230: Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
> 41.0230: pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
> 41.0230: Flow 1 (TCP connection (spliced)): -1 from read-side call
> 41.0230: Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
> 41.0230: Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
>
> leading to 100% CPU usage, of course.
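
For anyone following along, the spin described above comes from epoll's
level-triggered behaviour: as long as unread data sits on a socket we
watch for EPOLLIN, every epoll_wait() returns immediately. Here is a
minimal sketch of that effect in isolation. It uses an AF_UNIX socket
pair rather than a spliced TCP flow, and wakeups_with_mask() is a
hypothetical helper for illustration, not anything from passt.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns how many of 'rounds' non-blocking
 * epoll_wait() calls report the socket as ready while one unread byte
 * sits in its receive queue, or -1 on setup failure.
 */
int wakeups_with_mask(uint32_t mask, int rounds)
{
	struct epoll_event ev = { .events = mask }, got;
	int sv[2], epfd, i, n = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return -1;

	epfd = epoll_create1(0);
	if (epfd < 0 ||
	    epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) ||
	    write(sv[1], "x", 1) != 1) {	/* queue data, never read it */
		n = -1;
		goto done;
	}

	/* Timeout 0: poll without blocking, as a busy event loop would */
	for (i = 0; i < rounds; i++)
		n += epoll_wait(epfd, &got, 1, 0) == 1;

done:
	if (epfd >= 0)
		close(epfd);
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

Calling wakeups_with_mask(EPOLLIN, 5) reports a wakeup on every round,
while wakeups_with_mask(0, 5) reports none, even though the same byte is
still pending.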
>
> Drop EPOLLIN on our receiving side as long as we're waiting for
> output readiness on the other side.
>
> Link: https://github.com/containers/podman/issues/23686#issuecomment-2661036584
> Link: https://www.reddit.com/r/podman/comments/1iph50j/pasta_high_cpu_on_podman_rootless_container/
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> tcp_splice.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/tcp_splice.c b/tcp_splice.c
> index f1a9223..8a39a6f 100644
> --- a/tcp_splice.c
> +++ b/tcp_splice.c
> @@ -131,8 +131,12 @@ static void tcp_splice_conn_epoll_events(uint16_t events,
>  		ev[1].events = EPOLLOUT;
>  	}
>  
> -	flow_foreach_sidei(sidei)
> -		ev[sidei].events |= (events & OUT_WAIT(sidei)) ? EPOLLOUT : 0;
> +	flow_foreach_sidei(sidei) {
> +		if (events & OUT_WAIT(sidei)) {
> +			ev[sidei].events |= EPOLLOUT;
> +			ev[!sidei].events &= ~EPOLLIN;
> +		}
> +	}
>  }
>
> /**
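
As a rough standalone illustration of the mask logic in the hunk above,
something like the following can be compiled and poked at outside the
tree. The OUT_WAIT_* bit values, the splice_epoll_masks() name and the
assumption that both sides start out with EPOLLIN are made up for the
sketch; the real function derives its starting masks from the
connection state.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical flag values, for illustration only */
#define OUT_WAIT_0	(1U << 0)
#define OUT_WAIT_1	(1U << 1)
#define OUT_WAIT(sidei)	((sidei) ? OUT_WAIT_1 : OUT_WAIT_0)

/* Compute the epoll masks for the two sides of a flow.
 * Assumption: both sides start out listening for input.
 */
void splice_epoll_masks(uint32_t flags, uint32_t ev[2])
{
	unsigned sidei;

	ev[0] = ev[1] = EPOLLIN;

	for (sidei = 0; sidei < 2; sidei++) {
		if (flags & OUT_WAIT(sidei)) {
			/* Blocked writing to this side: wait until it is
			 * writable...
			 */
			ev[sidei] |= EPOLLOUT;
			/* ...and stop waking up for input on the other
			 * side, as we have nowhere to put it right now.
			 */
			ev[!sidei] &= ~(uint32_t)EPOLLIN;
		}
	}
}
```

With flags set to OUT_WAIT_0, side 0 ends up with EPOLLIN | EPOLLOUT and
side 1 with an empty mask, so pending input on side 1 no longer wakes
the event loop until side 0 becomes writable again.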
--
David Gibson (he or they)      | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
                               | around.
http://www.ozlabs.org/~dgibson