Date: Fri, 11 Apr 2025 08:13:23 +0200
From: Stefano Brivio
To: David Gibson
Subject: Re: tcp_splice SO_RCVLOWAT code; never invoked?
Message-ID: <20250411081323.5ac96909@elisabeth>
Organization: Red Hat
CC: passt-dev@passt.top
List-Id: Development discussion and patches for passt

On Fri, 11 Apr 2025 14:54:50 +1000
David Gibson wrote:

> Hi Stefano,
>
> When debugging the splice EINTR bug I fixed the other day, I found the
> whole tcp_splice_sock_handler() pretty confusing to follow.  So, I was
> working on some cleanups.  But then I noticed something more
> specifically odd here.
>
> We've discussed the use of SO_RCVLOWAT previously.  AIUI, you found it
> essential to achieve reasonable throughput and load for spliced
> connections.

From my tests back then (never on what I ended up committing, it seems)
it wasn't needed in general, it helped only with bulk transfers that
never fill the pipe for some reason.

With iperf3, I needed to play with parameters quite a bit to reproduce
something like that. You would need (at least) to disable Nagle's
algorithm (-N) and send small messages (say, -l 4k instead of -l 1M).

> I think we've agreed before that it's not entirely the
> right tool for the job; just the only one available.
>
> Except... as far as I can tell, it's never invoked.  AFAICT the only
> place we enable the RCVLOWAT stuff is in a block under this if:
>
> 	if (!(conn->flags & lowat_set_flag) &&
> 	    readlen > (long)c->tcp.pipe_size / 10) {
>
> But... this occurs immediately after:
> 	if (readlen >= (long)c->tcp.pipe_size * 10 / 100)
> 		continue;
>
> .. which is a strictly more inclusive condition, so we'll never reach
> the RCVLOWAT block.

Right, yes, I think we noticed a while ago, a bit after trying to
restore the functionality with 01b6a164d94f ("tcp_splice: A typo three
years ago and SO_RCVLOWAT is gone"). That wasn't sufficient.

> To confirm, I tried putting an ASSERT(0) in that block, and didn't hit
> it with spliced iperf3 runs.
>
> Am I missing something?

No, not really, and that error has been there forever, since I "added"
(not really) the feature in 904b86ade7db ("tcp: Rework window handling,
timers, add SO_RCVLOWAT and pools for sockets/pipes").

My intention was actually:

			if (read >= (long)c->tcp.pipe_size * 90 / 100)
				continue;

or something like that, perhaps 50% even. The idea behind it was: if we
already have good pipe utilisation, there's no need for SO_RCVLOWAT, and
we should retry calling splice() right away.

But at this point we should try again with iperf3 and smaller messages.
Perhaps even limiting the throughput (-b) with multiple flows...

-- 
Stefano
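
For reference, the dead branch discussed above can be checked with a
small standalone sketch. This is not passt code: skips_lowat(),
wants_lowat() and reachable() are hypothetical helpers that only mirror
the two quoted conditions, with the early-continue threshold passed as a
percentage so that the 10% and 90% variants can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the early "continue" check, with the threshold
 * expressed as a percentage of the pipe size. Not passt code.
 */
static bool skips_lowat(long readlen, long pipe_size, int percent)
{
	return readlen >= pipe_size * percent / 100;
}

/* Hypothetical mirror of the readlen part of the condition guarding the
 * SO_RCVLOWAT block (the lowat_set_flag test is left out).
 */
static bool wants_lowat(long readlen, long pipe_size)
{
	return readlen > pipe_size / 10;
}

/* Count the readlen values in [0, pipe_size] that get past the early
 * "continue" and still satisfy the SO_RCVLOWAT condition.
 */
static long reachable(long pipe_size, int percent)
{
	long readlen, n = 0;

	assert(pipe_size >= 0 && percent >= 0 && percent <= 100);

	for (readlen = 0; readlen <= pipe_size; readlen++) {
		if (skips_lowat(readlen, pipe_size, percent))
			continue;	/* good utilisation: retry splice() */

		if (wants_lowat(readlen, pipe_size))
			n++;		/* SO_RCVLOWAT block would run here */
	}

	return n;
}
```

With a threshold of 10, reachable() returns 0 for any non-negative pipe
size, because pipe_size * 10 / 100 and pipe_size / 10 truncate to the
same value, so no readlen can be below the first and above the second.
With a threshold of 90 the count is non-zero for typical pipe sizes,
which matches the intention described above.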