On Mon, Jul 28, 2025 at 06:55:50PM +0200, Eugenio Perez Martin wrote: > On Thu, Jul 24, 2025 at 3:03 AM David Gibson > wrote: > > > > On Wed, Jul 09, 2025 at 07:47:46PM +0200, Eugenio Pérez wrote: > > > The vhost-kernel module is async by nature: the driver (pasta) places a > > > few buffers in the virtqueue and the device (vhost-kernel) trust the > > > > s/trust/trusts/ > > > > Fixing in the next version. > > > > driver will not modify them until it uses them. To implement it is not > > > possible with TCP at the moment, as tcp_buf trust it can reuse the > > > buffers as soon as tcp_payload_flush() finish. > > > > > > > > > > > > To achieve async let's make tcp_buf work with a circular ring, so vhost > > > can transmit at the same time pasta is queing more data. When a buffer > > > is received from a TCP socket, the element is placed in the ring and > > > sock_head is moved: > > > [][][][] > > > ^ ^ > > > | | > > > | sock_head > > > | > > > tail > > > tap_head > > > > > > When the data is sent to vhost through the tx queue, tap_head is moved > > > forward: > > > [][][][] > > > ^ ^ > > > | | > > > | sock_head > > > | tap_head > > > | > > > tail > > > > > > Finally, the tail move forward when vhost has used the tx buffers, so > > > tcp_payload (and all lower protocol buffers) can be reused. > > > [][][][] > > > ^ > > > | > > > sock_head > > > tap_head > > > tail > > > > This all sounds good. I wonder if it might be clearer to do this > > circular queue conversion as a separate patch series. I think it > > makes sense even without the context of vhost (it's closer to how most > > network things work). > > > > Sure it can be done. > > > > In the case of error queueing to the vhost virtqueue, sock_head moves > > > backwards. The only possible error is that the queue is full, as > > > > sock_head moves backwards? Or tap_head moves backwards? > > > > Sock head moves backwards. Tap_head cannot move backwards as vhost > does not have a way to report "the last X packets has not been sent". Right, I realised that as I read further. > > > virtio-net does not report success on packet sending. > > > > > > Starting as simple as possible, and only implementing the count > > > variables in this patch so it keeps working as previously. The circular > > > behavior will be added on top. > > > > > > From ~16BGbit/s to ~13Gbit/s compared with write(2) to the tap. > > > > I don't really understand what you're comparing here. > > Sending through vhost-net vs write(2) to tap device. Ok. That's a bit dissapointing. > > > Signed-off-by: Eugenio Pérez > > > --- > > > tcp_buf.c | 63 +++++++++++++++++++++++++++++++++++-------------------- > > > 1 file changed, 40 insertions(+), 23 deletions(-) > > > > > > diff --git a/tcp_buf.c b/tcp_buf.c > > > index 242086d..0437120 100644 > > > --- a/tcp_buf.c > > > +++ b/tcp_buf.c > > > @@ -53,7 +53,12 @@ static_assert(MSS6 <= sizeof(tcp_payload[0].data), "MSS6 is greater than 65516") > > > > > > /* References tracking the owner connection of frames in the tap outqueue */ > > > static struct tcp_tap_conn *tcp_frame_conns[TCP_FRAMES_MEM]; > > > -static unsigned int tcp_payload_used; > > > +static unsigned int tcp_payload_sock_used, tcp_payload_tap_used; > > > > I think the "payload" here is a hangover from when we had separate > > queues for flags-only and data-containing packets. We can probably > > drop it and make a bunch of names shorter. > > Maybe we can short even more if we isolate this in its own > circular_buffer.h or equivalent. UDP will also need it. Maybe, yes. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson