public inbox for passt-dev@passt.top
From: Stefano Brivio <sbrivio@redhat.com>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: passt-dev@passt.top, Jason Wang <jasowang@redhat.com>,
	Jeff Nelson <jenelson@redhat.com>
Subject: Re: vhost-kernel net on pasta: from 26 to 37Gbit/s
Date: Fri, 6 Jun 2025 18:37:02 +0200
Message-ID: <20250606183702.0ff9a3c7@elisabeth>
In-Reply-To: <CAJaqyWe38_kR1=nC-fok0tBHE+z-QpBs+kcsT8Z1OgUEGzA-uw@mail.gmail.com>

On Fri, 6 Jun 2025 16:32:38 +0200
Eugenio Perez Martin <eperezma@redhat.com> wrote:

> On Wed, May 21, 2025 at 12:35 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Wed, May 21, 2025 at 12:09 PM Stefano Brivio <sbrivio@redhat.com> wrote:  
> > >
> > > On Tue, 20 May 2025 17:09:44 +0200
> > > Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > >  
> > > > [...]
> > > >
> > > > Now if I isolate the vhost kernel thread [1] I get way more
> > > > performance as expected:
> > > > - - - - - - - - - - - - - - - - - - - - - - - - -
> > > > [ ID] Interval           Transfer     Bitrate         Retr
> > > > [  5]   0.00-10.00  sec  43.1 GBytes  37.1 Gbits/sec    0             sender
> > > > [  5]   0.00-10.04  sec  43.1 GBytes  36.9 Gbits/sec                  receiver
> > > >
> > > > After analyzing perf output, rep_movs_alternative is the most called
> > > > function in the three iperf3 (~20%Self), passt.avx2 (~15%Self) and
> > > > vhost (~15%Self)  
> > >
> > > Interesting... s/most called function/function using the most cycles/, I
> > > suppose.
> > >  
> >
> > Right!
> >  
> > > So it looks somewhat similar to
> > >
> > >   https://archives.passt.top/passt-dev/20241017021027.2ac9ea53@elisabeth/
> > >
> > > now?
> > >  
> >
> > Kind of. Below tcp_sendmsg_locked I don't see sk_page_frag_refill but
> > skb_do_copy_data_nocache. Not sure if that means something, as it
> > should not be affected by vhost.
> >  
> > > > But I don't see any of them consuming 100% of CPU in
> > > > top: pasta consumes ~85% %CPU, both iperf3 client and server consume
> > > > 60%, and vhost consumes ~53%.
> > > >
> > > > So... I have mixed feelings about this :). By "default" it seems to
> > > > have less performance, but my test is maybe too synthetic.  
> > >
> > > Well, surely we can't ask Podman users to pin specific stuff to given
> > > CPU threads. :)
> > >  
> >
> > Yes but maybe the result changes under the right schedule? I'm
> > isolating the CPUs entirely, which is not the usual case for pasta for
> > sure :).
> >  
> > > > There is room for improvement with the mentioned optimizations so I'd
> > > > continue applying them, continuing with UDP and TCP zerocopy, and
> > > > developing zerocopy vhost rx.  
> > >
> > > That definitely makes sense to me.
> > >  
> >
> > Good!
> >  
> > > > With these numbers I think the series should not be
> > > > merged at the moment. I could send it as RFC if you want but I've not
> > > > applied the comments the first one received, POC style :).  
> > >
> > > I don't think it's really needed for you to spend time on
> > > semi-polishing something just to have an RFC if you're still working on
> > > it. I guess the implementation will change substantially anyway once
> > > you factor in further optimisations.
> > >  
> >
> > Agree! I'll keep iterating on this then.
> >  
> 
> Actually, if I remove all the taskset etc, and trust the kernel
> scheduler, vanilla pasta gives me:
> [pasta@virtlab716 ~]$ /home/passt/pasta --config-net iperf3 -c 10.6.68.254 -w 8M
> Connecting to host 10.6.68.254, port 5201
> [  5] local 10.6.68.20 port 40408 connected to 10.6.68.254 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  3.11 GBytes  26.7 Gbits/sec    0   25.4 MBytes
> [  5]   1.00-2.00   sec  3.11 GBytes  26.7 Gbits/sec    0   25.4 MBytes
> [  5]   2.00-3.00   sec  3.12 GBytes  26.8 Gbits/sec    0   25.4 MBytes
> [  5]   3.00-4.00   sec  3.11 GBytes  26.7 Gbits/sec    0   25.4 MBytes
> [  5]   4.00-5.00   sec  3.10 GBytes  26.6 Gbits/sec    0   25.4 MBytes
> [  5]   5.00-6.00   sec  3.11 GBytes  26.7 Gbits/sec    0   25.4 MBytes
> [  5]   6.00-7.00   sec  3.11 GBytes  26.7 Gbits/sec    0   25.4 MBytes
> [  5]   7.00-8.00   sec  3.09 GBytes  26.6 Gbits/sec    0   25.4 MBytes
> [  5]   8.00-9.00   sec  3.08 GBytes  26.5 Gbits/sec    0   25.4 MBytes
> [  5]   9.00-10.00  sec  3.10 GBytes  26.6 Gbits/sec    0   25.4 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  31.0 GBytes  26.7 Gbits/sec    0             sender
> [  5]   0.00-10.04  sec  31.0 GBytes  26.5 Gbits/sec                  receiver
> 
> And with vhost-net:
> [pasta@virtlab716 ~]$ /home/passt/pasta --config-net iperf3 -c 10.6.68.254 -w 8M
> ...
> Connecting to host 10.6.68.254, port 5201
> [  5] local 10.6.68.20 port 46720 connected to 10.6.68.254 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  4.17 GBytes  35.8 Gbits/sec    0   11.9 MBytes
> [  5]   1.00-2.00   sec  4.17 GBytes  35.9 Gbits/sec    0   11.9 MBytes
> [  5]   2.00-3.00   sec  4.16 GBytes  35.7 Gbits/sec    0   11.9 MBytes
> [  5]   3.00-4.00   sec  4.14 GBytes  35.6 Gbits/sec    0   11.9 MBytes
> [  5]   4.00-5.00   sec  4.16 GBytes  35.7 Gbits/sec    0   11.9 MBytes
> [  5]   5.00-6.00   sec  4.16 GBytes  35.8 Gbits/sec    0   11.9 MBytes
> [  5]   6.00-7.00   sec  4.18 GBytes  35.9 Gbits/sec    0   11.9 MBytes
> [  5]   7.00-8.00   sec  4.19 GBytes  35.9 Gbits/sec    0   11.9 MBytes
> [  5]   8.00-9.00   sec  4.18 GBytes  35.9 Gbits/sec    0   11.9 MBytes
> [  5]   9.00-10.00  sec  4.18 GBytes  35.9 Gbits/sec    0   11.9 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  41.7 GBytes  35.8 Gbits/sec    0             sender
> [  5]   0.00-10.04  sec  41.7 GBytes  35.7 Gbits/sec                  receiver
> 
> If I go the extra mile and disable notifications (it might be just
> noise, but...):
> [pasta@virtlab716 ~]$ /home/passt/pasta --config-net iperf3 -c 10.6.68.254 -w 8M
> ...
> Connecting to host 10.6.68.254, port 5201
> [  5] local 10.6.68.20 port 56590 connected to 10.6.68.254 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  4.19 GBytes  36.0 Gbits/sec    0   12.4 MBytes
> [  5]   1.00-2.00   sec  4.18 GBytes  35.9 Gbits/sec    0   12.4 MBytes
> [  5]   2.00-3.00   sec  4.18 GBytes  35.9 Gbits/sec    0   12.4 MBytes
> [  5]   3.00-4.00   sec  4.20 GBytes  36.1 Gbits/sec    0   12.4 MBytes
> [  5]   4.00-5.00   sec  4.21 GBytes  36.2 Gbits/sec    0   12.4 MBytes
> [  5]   5.00-6.00   sec  4.21 GBytes  36.1 Gbits/sec    0   12.4 MBytes
> [  5]   6.00-7.00   sec  4.20 GBytes  36.1 Gbits/sec    0   12.4 MBytes
> [  5]   7.00-8.00   sec  4.23 GBytes  36.4 Gbits/sec    0   12.4 MBytes
> [  5]   8.00-9.00   sec  4.24 GBytes  36.4 Gbits/sec    0   12.4 MBytes
> [  5]   9.00-10.00  sec  4.21 GBytes  36.2 Gbits/sec    0   12.4 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  42.1 GBytes  36.1 Gbits/sec    0             sender
> [  5]   0.00-10.04  sec  42.1 GBytes  36.0 Gbits/sec                  receiver
> 
> So I guess the best is to actually run performance tests closer to
> real-world workload against the new version and see if it works
> better?

Well, that's certainly a possibility.

I'd say the biggest value for vhost-net usage in pasta is reaching
throughput figures that are comparable with veth, with or without
multithreading (keeping an eye on bytes per cycle, of course), with or
without kernel changes, so that users won't need to choose between
rootless and performance anymore.
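
For illustration, here is a rough sketch of how a bytes-per-cycle figure
could be derived from numbers like the ones quoted above (37.1 Gbit/s with
the pinned vhost thread, plus the %CPU values from top). The clock
frequency is a hypothetical value, not something reported in this thread,
and counting the iperf3 processes alongside pasta and vhost is just one
possible choice:

```python
# Rough bytes-per-cycle estimate from throughput and per-task CPU usage.
# CLOCK_HZ is a hypothetical value: the thread does not state the CPU clock.
CLOCK_HZ = 3.0e9


def bytes_per_cycle(gbits_per_sec, cpu_percent_by_task, clock_hz=CLOCK_HZ):
    """Payload bytes moved per CPU cycle spent, summed over all tasks."""
    bytes_per_sec = gbits_per_sec * 1e9 / 8
    # 100% in top corresponds to one fully busy CPU thread
    busy_cpus = sum(cpu_percent_by_task.values()) / 100.0
    cycles_per_sec = busy_cpus * clock_hz
    return bytes_per_sec / cycles_per_sec


# %CPU figures quoted earlier in the thread for the pinned vhost-net run
pinned_vhost = {
    "pasta": 85.0,
    "iperf3 client": 60.0,
    "iperf3 server": 60.0,
    "vhost": 53.0,
}

result = bytes_per_cycle(37.1, pinned_vhost)
print(f"{result:.2f} bytes/cycle (assuming a {CLOCK_HZ / 1e9:.1f} GHz clock)")
```

The absolute value depends entirely on the assumed clock, so it is only
meaningful when comparing runs on the same machine.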

It would also simplify things in Podman quite a lot (and to some extent
in rootlesskit / Docker as well). We're pretty much there with virtual
machines, just not quite with containers (which is somewhat ironic, but
of course there's a good reason for that).

If we're clearly wasting cycles in vhost-net (because of the bounce
buffer, plus something else perhaps?) *and* there's a somewhat possible
solution for that in sight *and* the interface would change anyway,
running throughput tests and polishing up the current version with a
half-baked solution at the moment sounds a bit wasteful to me.

But if one of those assumptions doesn't hold, or if you feel the need to
consolidate the current status, then polishing up the current version
right now and actually evaluating throughput (as well as overhead) makes
sense to me, yes.
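
As a minimal sketch of the throughput side of such an evaluation, the
sender summary lines from the three unpinned runs quoted above can be
compared directly. The helper names below are hypothetical, and this only
covers throughput, not overhead:

```python
import re

# Sender summary lines copied from the three runs quoted in this thread
SUMMARIES = {
    "vanilla pasta":
        "[  5]   0.00-10.00  sec  31.0 GBytes  26.7 Gbits/sec    0   sender",
    "vhost-net":
        "[  5]   0.00-10.00  sec  41.7 GBytes  35.8 Gbits/sec    0   sender",
    "vhost-net, notifications disabled":
        "[  5]   0.00-10.00  sec  42.1 GBytes  36.1 Gbits/sec    0   sender",
}


def sender_gbits(line):
    """Extract the bitrate in Gbit/s from an iperf3 summary line."""
    match = re.search(r"([\d.]+) Gbits/sec", line)
    if match is None:
        raise ValueError(f"no Gbits/sec figure in: {line!r}")
    return float(match.group(1))


def relative_gains(summaries, baseline):
    """Percentage gain of each run over the baseline run."""
    base = sender_gbits(summaries[baseline])
    return {
        name: (sender_gbits(line) / base - 1.0) * 100.0
        for name, line in summaries.items()
        if name != baseline
    }


gains = relative_gains(SUMMARIES, "vanilla pasta")
for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}% over vanilla pasta")
```

With the figures above, both vhost-net runs come out roughly a third
faster than vanilla pasta, and the gap between them is small enough that
it might indeed be noise.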

-- 
Stefano



