public inbox for passt-dev@passt.top
* Testing vhost-user with "virtio-net: tweak for better TX performance in NAPI mode"
@ 2025-03-18 11:07 Laurent Vivier
  2025-03-18 16:33 ` Stefano Brivio
From: Laurent Vivier @ 2025-03-18 11:07 UTC
  To: passt-dev

Hi,

as reported by Stefano, there is an asymmetry in throughput between host and guest with
vhost-user.

I've tested the following kernel patch from Jason to see if it can improve the performance:

------------------------------------------------------------------------------
commit e13b6da7045f997e1a5a5efd61d40e63c4fc20e8
Author: Jason Wang <jasowang@redhat.com>
Date:   Tue Feb 18 10:39:08 2025 +0800

     virtio-net: tweak for better TX performance in NAPI mode

     There are several issues existed in start_xmit():

     - Transmitted packets need to be freed before sending a packet, this
       introduces delay and increases the average packets transmit
       time. This also increase the time that spent in holding the TX lock.
     - Notification is enabled after free_old_xmit_skbs() which will
       introduce unnecessary interrupts if TX notification happens on the
       same CPU that is doing the transmission now (actually, virtio-net
       driver are optimized for this case).

     So this patch tries to avoid those issues by not cleaning transmitted
     packets in start_xmit() when TX NAPI is enabled and disable
     notifications even more aggressively. Notification will be since the
     beginning of the start_xmit(). But we can't enable delayed
     notification after TX is stopped as we will lose the
     notifications. Instead, the delayed notification needs is enabled
     after the virtqueue is kicked for best performance.

     Performance numbers:

     1) single queue 2 vcpus guest with pktgen_sample03_burst_single_flow.sh
        (burst 256) + testpmd (rxonly) on the host:

     - When pinning TX IRQ to pktgen VCPU: split virtqueue PPS were
       increased 55% from 6.89 Mpps to 10.7 Mpps and 32% TX interrupts were
       eliminated. Packed virtqueue PPS were increased 50% from 7.09 Mpps to
       10.7 Mpps, 99% TX interrupts were eliminated.

     - When pinning TX IRQ to VCPU other than pktgen: split virtqueue PPS
       were increased 96% from 5.29 Mpps to 10.4 Mpps and 45% TX interrupts
       were eliminated; Packed virtqueue PPS were increased 78% from 6.12
       Mpps to 10.9 Mpps and 99% TX interrupts were eliminated.

     2) single queue 1 vcpu guest + vhost-net/TAP on the host: single
        session netperf from guest to host shows 82% improvement from
        31Gb/s to 58Gb/s, %stddev were reduced from 34.5% to 1.9% and 88%
        of TX interrupts were eliminated.

     Signed-off-by: Jason Wang <jasowang@redhat.com>
     Acked-by: Michael S. Tsirkin <mst@redhat.com>
     Signed-off-by: David S. Miller <davem@davemloft.net>
------------------------------------------------------------------------------

systemctl stop firewalld.service || service iptables stop || iptables -Ft
/sbin/sysctl -w net.core.rmem_max=536870912
/sbin/sysctl -w net.core.wmem_max=536870912

____ I ran my tests using a 6.14-rc7 kernel:

   From guest:

   iperf3 -c 10.6.68.254  -P2 -Z -t5  -l 1M -w 16M
   [SUM]   0.00-5.00   sec  14.5 GBytes  24.9 Gbits/sec    0             sender
   [SUM]   0.00-5.00   sec  14.5 GBytes  24.9 Gbits/sec                  receiver

   From host:

   iperf3 -c localhost -P2 -Z -t5  -p 10001 -l 1M -w 16M
   [SUM]   0.00-5.00   sec  28.9 GBytes  49.6 Gbits/sec    0             sender
   [SUM]   0.00-5.03   sec  28.8 GBytes  49.2 Gbits/sec                  receiver

____ The results with a 6.14-rc7 + e13b6da7045f:

   From guest:

   iperf3 -c 10.6.68.254  -P2 -Z -t5  -l 1M -w 16M
   [SUM]   0.00-5.00   sec  14.8 GBytes  25.4 Gbits/sec    0             sender
   [SUM]   0.00-5.01   sec  14.8 GBytes  25.4 Gbits/sec                  receiver

   From host:

   iperf3 -c localhost -P2 -Z -t5  -p 10001 -l 1M -w 16M
   [SUM]   0.00-5.00   sec  28.5 GBytes  48.9 Gbits/sec    0             sender
   [SUM]   0.00-5.03   sec  28.4 GBytes  48.6 Gbits/sec                  receiver

We have only a 2% improvement.
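
For reference, the 2% figure matches the change in the "From guest"
sender throughput. Here is a minimal sketch of the arithmetic in Python,
using only the [SUM] sender values reported above. The helper name
percent_change and the dictionary layout are illustrative assumptions,
not part of any tool used in the test:

```python
def percent_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


# Sender-side [SUM] throughput in Gbits/sec, copied from the iperf3 runs above.
baseline = {"guest": 24.9, "host": 49.6}  # 6.14-rc7
patched = {"guest": 25.4, "host": 48.9}   # 6.14-rc7 + e13b6da7045f

for side in ("guest", "host"):
    change = percent_change(baseline[side], patched[side])
    print(f"from {side}: {change:+.1f}%")

# Host/guest ratio as a rough measure of the asymmetry, before and after.
ratio_before = baseline["host"] / baseline["guest"]
ratio_after = patched["host"] / patched["guest"]
print(f"host/guest ratio: {ratio_before:.2f} -> {ratio_after:.2f}")
```

With these numbers, the guest-side figure goes up by about 2% while the
host-side figure is slightly lower, so the host remains roughly twice as
fast as the guest in both runs.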

Thanks,
Laurent




* Re: Testing vhost-user with "virtio-net: tweak for better TX performance in NAPI mode"
  2025-03-18 11:07 Testing vhost-user with "virtio-net: tweak for better TX performance in NAPI mode" Laurent Vivier
@ 2025-03-18 16:33 ` Stefano Brivio
From: Stefano Brivio @ 2025-03-18 16:33 UTC
  To: Laurent Vivier; +Cc: passt-dev

On Tue, 18 Mar 2025 12:07:09 +0100
Laurent Vivier <lvivier@redhat.com> wrote:

> ____ The results with a 6.14-rc7 + e13b6da7045f:

Thanks for checking!

>    From guest:
> 
>    iperf3 -c 10.6.68.254  -P2 -Z -t5  -l 1M -w 16M
>    [SUM]   0.00-5.00   sec  14.8 GBytes  25.4 Gbits/sec    0             sender
>    [SUM]   0.00-5.01   sec  14.8 GBytes  25.4 Gbits/sec                  receiver
> 
>    From host:
> 
>    iperf3 -c localhost -P2 -Z -t5  -p 10001 -l 1M -w 16M
>    [SUM]   0.00-5.00   sec  28.5 GBytes  48.9 Gbits/sec    0             sender
>    [SUM]   0.00-5.03   sec  28.4 GBytes  48.6 Gbits/sec                  receiver
> 
> We have only a 2% improvement.

Ouch. :( Then there's something else... by the way, for reference, my
investigation back then stopped a bit after:

  https://archives.passt.top/passt-dev/20241010090801.23da8bff@elisabeth/

that is, I tried zeroing pages "fast":

  https://archives.passt.top/passt-dev/20241017021027.2ac9ea53@elisabeth/

but it didn't really change the asymmetry. I was getting the same
numbers you're getting now.

Whatever, I guess it's not so important, just... one day we should
figure it out. :)

-- 
Stefano



Code repositories for project(s) associated with this public inbox

	https://passt.top/passt
