public inbox for passt-dev@passt.top
From: David Gibson <david@gibson.dropbear.id.au>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: passt-dev@passt.top, jasowang@redhat.com
Subject: Re: [RFC v2 09/11] tcp: start conversion to circular buffer
Date: Tue, 29 Jul 2025 10:30:43 +1000
Message-ID: <aIgWM1Dp6SwMwwmU@zatzit>
In-Reply-To: <CAJaqyWcmwH+70p=U97t0DAQ-TD24A+1VwYNtEjoxMnUO9kefcg@mail.gmail.com>

On Mon, Jul 28, 2025 at 06:55:50PM +0200, Eugenio Perez Martin wrote:
> On Thu, Jul 24, 2025 at 3:03 AM David Gibson
> <david@gibson.dropbear.id.au> wrote:
> >
> > On Wed, Jul 09, 2025 at 07:47:46PM +0200, Eugenio Pérez wrote:
> > > The vhost-kernel module is async by nature: the driver (pasta) places a
> > > few buffers in the virtqueue and the device (vhost-kernel) trust the
> >
> > s/trust/trusts/
> >
> 
> Fixing in the next version.
> 
> > > > driver will not modify them until it uses them.  This is not possible
> > > > to implement with TCP at the moment, as tcp_buf trusts it can reuse the
> > > > buffers as soon as tcp_payload_flush() finishes.
> >
> >
> >
> > >
> > > To achieve async let's make tcp_buf work with a circular ring, so vhost
> > > > can transmit at the same time pasta is queueing more data.  When a buffer
> > > is received from a TCP socket, the element is placed in the ring and
> > > sock_head is moved:
> > >                              [][][][]
> > >                              ^ ^
> > >                              | |
> > >                              | sock_head
> > >                              |
> > >                              tail
> > >                              tap_head
> > >
> > > When the data is sent to vhost through the tx queue, tap_head is moved
> > > forward:
> > >                              [][][][]
> > >                              ^ ^
> > >                              | |
> > >                              | sock_head
> > >                              | tap_head
> > >                              |
> > >                            tail
> > >
> > > > Finally, the tail moves forward when vhost has used the tx buffers, so
> > > tcp_payload (and all lower protocol buffers) can be reused.
> > >                              [][][][]
> > >                                ^
> > >                                |
> > >                                sock_head
> > >                                tap_head
> > >                              tail
> >
> > This all sounds good.  I wonder if it might be clearer to do this
> > circular queue conversion as a separate patch series.  I think it
> > makes sense even without the context of vhost (it's closer to how most
> > network things work).
> >
> 
> Sure it can be done.
> 
> > > In the case of error queueing to the vhost virtqueue, sock_head moves
> > > backwards.  The only possible error is that the queue is full, as
> >
> > sock_head moves backwards?  Or tap_head moves backwards?
> >
> 
> sock_head moves backwards.  tap_head cannot move backwards, as vhost
> does not have a way to report "the last X packets have not been sent".

Right, I realised that as I read further.
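
For anyone following along, here is a rough sketch of how I read the
three cursors and the rollback.  It is not taken from the series: the
struct, the function names and RING_SIZE are all made up for
illustration, and it assumes that slots which do not fit in the tx
virtqueue are simply released by moving sock_head back to tap_head.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4u	/* hypothetical; power of two so counters can wrap */

/* Free-running counters, a slot index would be (counter % RING_SIZE).
 * Invariant: tail <= tap_head <= sock_head <= tail + RING_SIZE
 */
struct tx_ring {
	unsigned int tail;	/* oldest slot the device may still read */
	unsigned int tap_head;	/* next slot to queue to the tx virtqueue */
	unsigned int sock_head;	/* next slot to fill from a TCP socket */
};

static void ring_check(const struct tx_ring *r)
{
	assert(r->tap_head - r->tail <= r->sock_head - r->tail);
	assert(r->sock_head - r->tail <= RING_SIZE);
}

/* Data arrived from a socket: claim one slot, fail if the ring is full */
static bool ring_fill(struct tx_ring *r)
{
	if (r->sock_head - r->tail == RING_SIZE)
		return false;
	r->sock_head++;
	ring_check(r);
	return true;
}

/* Queue pending slots to the device.  @room is how many the tx virtqueue
 * can take right now.  Whatever does not fit is given up by moving
 * sock_head backwards; tap_head never moves backwards.
 * Returns the number of slots queued.
 */
static unsigned int ring_submit(struct tx_ring *r, unsigned int room)
{
	unsigned int pending = r->sock_head - r->tap_head;
	unsigned int n = pending < room ? pending : room;

	r->tap_head += n;
	r->sock_head = r->tap_head;	/* roll back anything not queued */
	ring_check(r);
	return n;
}

/* The device reported @used buffers as consumed: their slots are free */
static void ring_reclaim(struct tx_ring *r, unsigned int used)
{
	assert(used <= r->tap_head - r->tail);
	r->tail += used;
	ring_check(r);
}
```

In this reading a slot is only reusable once tail has passed it, which
is the property the current "reuse everything after flush" logic does
not give you.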

> > > virtio-net does not report success on packet sending.
> > >
> > > > Start as simple as possible: this patch only implements the count
> > > > variables, so it keeps working as before.  The circular
> > > > behavior will be added on top.
> > >
> > > > From ~16Gbit/s to ~13Gbit/s compared with write(2) to the tap.
> >
> > I don't really understand what you're comparing here.
> 
> Sending through vhost-net vs write(2) to tap device.

Ok.  That's a bit disappointing.

> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > ---
> > >  tcp_buf.c | 63 +++++++++++++++++++++++++++++++++++--------------------
> > >  1 file changed, 40 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/tcp_buf.c b/tcp_buf.c
> > > index 242086d..0437120 100644
> > > --- a/tcp_buf.c
> > > +++ b/tcp_buf.c
> > > @@ -53,7 +53,12 @@ static_assert(MSS6 <= sizeof(tcp_payload[0].data), "MSS6 is greater than 65516")
> > >
> > >  /* References tracking the owner connection of frames in the tap outqueue */
> > >  static struct tcp_tap_conn *tcp_frame_conns[TCP_FRAMES_MEM];
> > > -static unsigned int tcp_payload_used;
> > > +static unsigned int tcp_payload_sock_used, tcp_payload_tap_used;
> >
> > I think the "payload" here is a hangover from when we had separate
> > queues for flags-only and data-containing packets.  We can probably
> > drop it and make a bunch of names shorter.
> 
> Maybe we can shorten them even more if we isolate this in its own
> circular_buffer.h or equivalent. UDP will also need it.

Maybe, yes.
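
If we did go that way, I imagine something along these lines.  To be
clear, this is only a sketch: struct cbuf and the cbuf_*() helpers are
hypothetical names, nothing like them exists in the tree today, and it
assumes the slot count is a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor set that TCP and UDP could both embed */
struct cbuf {
	unsigned int size;	/* number of slots, power of two */
	unsigned int tail, tap_head, sock_head;	/* free-running counters */
};

/* Slots filled from sockets but not yet queued to the tap */
static inline unsigned int cbuf_pending(const struct cbuf *c)
{
	return c->sock_head - c->tap_head;
}

/* Slots queued to the tap that the device has not released yet */
static inline unsigned int cbuf_in_flight(const struct cbuf *c)
{
	return c->tap_head - c->tail;
}

/* Slots that can still be filled from sockets */
static inline unsigned int cbuf_free(const struct cbuf *c)
{
	return c->size - (c->sock_head - c->tail);
}

/* Map a free-running counter to an index in the caller's frame array */
static inline size_t cbuf_slot(const struct cbuf *c, unsigned int counter)
{
	assert(c->size && !(c->size & (c->size - 1)));
	return counter & (c->size - 1);
}
```

The frame arrays themselves would stay in tcp_buf.c (and the UDP
equivalent), so the header only has to know about counters, not about
frame layout.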

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson