Date: Thu, 31 Jul 2025 15:59:33 +1000
From: David Gibson
To: Eugenio Perez Martin
Cc: passt-dev@passt.top, jasowang@redhat.com
Subject: Re: [RFC v2 10/11] tap: add poll(2) to used_idx
References: <20250709174748.3514693-1-eperezma@redhat.com> <20250709174748.3514693-11-eperezma@redhat.com>
List-Id: Development discussion and patches for passt

On Wed, Jul 30, 2025 at 08:11:20AM +0200, Eugenio Perez Martin wrote:
> On Wed, Jul 30, 2025 at 2:34 AM David Gibson wrote:
> >
> > On Tue, Jul 29, 2025 at 09:04:19AM +0200, Eugenio Perez Martin wrote:
> > > On Tue, Jul 29, 2025 at 2:33 AM David Gibson wrote:
> > > >
> > > > On Mon, Jul 28, 2025 at 07:03:12PM +0200, Eugenio Perez Martin wrote:
> > > > > On Thu, Jul 24, 2025 at 3:21 AM David Gibson wrote:
> > > > > >
> > > > > > On Wed, Jul 09, 2025 at 07:47:47PM +0200, Eugenio Pérez wrote:
> > > > > > > From ~13Gbit/s to ~11.5Gbit/s.
> > > > > >
> > > > > > Again, I really don't know what you're comparing to what here.
> > > > > >
> > > > >
> > > > > When the buffer is full I'm using poll() to wait until vhost frees some
> > > > > buffers, instead of actively checking the used index. This is the cost
> > > > > of the syscall.
> > > >
> > > > Ah, right.  So.. I'm not sure if it's so much the cost of the syscall
> > > > itself, as the fact that you're actively waiting for free buffers,
> > > > rather than returning to the main epoll loop so you can maybe make
> > > > progress on something else before returning to the Tx path.
> > > >
> > >
> > > The previous patch also waits for free buffers, but it does it burning a
> > > CPU for that.
> >
> > Ah, ok.  Hrm.  I still find it hard to believe that it's the cost of
> > the syscall per se that's causing the slowdown.  My guess is that the
> > cost is because having the poll() leads to a higher latency between
> > the buffer being released and us detecting it and re-using it.
> >
> > > The next patch is the one that allows progress to continue as long as
> > > there are enough free buffers, instead of always waiting until the whole
> > > buffer has been sent. But there are situations where this conversion
> > > needs other code changes. In particular, all the calls to
> > > tcp_payload_flush after checking that we have enough buffers like:
> > >
> > > if (tcp_payload_sock_used > TCP_FRAMES_MEM - 2) {
> > >     tcp_buf_free_old_tap_xmit(c, 2);
> > >     tcp_payload_flush(c);
> > >     ...
> > > }
> > >
> > > Seems like coroutines would be a good fix here, but maybe there are
> > > simpler ways to go back to the main loop while keeping the tcp socket
> > > "ready to read" from epoll's POV. Out of curiosity, what do you think
> > > about setjmp()? :).
> >
> > I think it has its uses, but deciding to go with it is a big
> > architectural decision not to be entered into lightly.
> >
>
> Got it,
>
> Another idea is to add the flows that are being processed but had
> no space available in the virtqueue to a "pending" list. When the
> kernel tells pasta that new buffers are available, pasta checks that
> pending list. Maybe it can consist of only one element.

I think this makes sense.  We already kind of want the same thing for
the (rare) cases where the tap buffer fills up (or the pipe buffer for
passt/qemu).  This is part of what we'd need to make the event
handling simpler (if we have proper wakeups on tap side writability we
can always use EPOLLET on the socket side, instead of turning it on
and off).

I'm not actually sure if we need an explicit list.  It might be
adequate to just have a pending flag (or even derive it from existing
state) and poll the entire flow list.  Might be more expensive, but
could well be good enough (we already scan the entire flow list on
every epoll cycle).

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
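
For illustration only, the pending-flag idea from the reply above could be sketched roughly like this. This is not passt code: struct flow_sketch, FLOW_MAX, flow_try_send() and flows_retry_pending() are all hypothetical names made up for the example, and the "virtqueue" is reduced to a simple count of free buffers. It only shows the shape of "mark the flow pending when there is no space, then scan the whole table when buffers come back".

```c
#include <stdbool.h>
#include <stddef.h>

#define FLOW_MAX 8	/* hypothetical table size, for illustration only */

/* Hypothetical stand-in for a flow table entry, not passt's real struct */
struct flow_sketch {
	bool in_use;	/* slot holds a live flow */
	bool pending;	/* last send attempt ran out of free buffers */
	int queued;	/* frames still waiting to go out */
};

static struct flow_sketch flows[FLOW_MAX];

/* Send as many queued frames as *vq_free allows; flag the flow as
 * pending if some frames are left over.
 */
static void flow_try_send(struct flow_sketch *f, int *vq_free)
{
	int n = f->queued < *vq_free ? f->queued : *vq_free;

	f->queued -= n;
	*vq_free -= n;
	f->pending = f->queued > 0;
}

/* To be called once the kernel reports released buffers: scan the whole
 * table and retry only the flows flagged as pending.  Returns how many
 * flows are still pending afterwards.
 */
static int flows_retry_pending(int vq_free)
{
	int still_pending = 0;
	size_t i;

	for (i = 0; i < FLOW_MAX; i++) {
		struct flow_sketch *f = &flows[i];

		if (!f->in_use || !f->pending)
			continue;

		flow_try_send(f, &vq_free);
		if (f->pending)
			still_pending++;
	}

	return still_pending;
}
```

The scan is linear in the table size on every wakeup, which is the "might be more expensive" part. An explicit list would avoid visiting idle flows, at the price of extra bookkeeping whenever a flow is closed while still queued.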