From: Stefano Brivio <sbrivio@redhat.com>
To: Laurent Vivier <lvivier@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v7 0/8] Add vhost-user support to passt. (part 3)
Date: Fri, 11 Oct 2024 20:07:30 +0200
Message-ID: <20241011200730.63c97dc7@elisabeth>
In-Reply-To: <20241010090801.23da8bff@elisabeth>
On Thu, 10 Oct 2024 09:08:01 +0200
Stefano Brivio <sbrivio@redhat.com> wrote:
> For outbound traffic (I tried with IPv4), which is much slower for some
> reason (~25 Gbps):
>
> --
> Samples: 79K of event 'cycles', Event count (approx.): 73661070737
>   Children      Self  Command     Shared Object      Symbol
> -   91.00%     0.23%  passt.avx2  [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
>      90.78% entry_SYSCALL_64_after_hwframe
>       - do_syscall_64
>          - 78.75% __sys_sendmsg
>             - 78.58% ___sys_sendmsg
>                - 78.06% ____sys_sendmsg
>                   - sock_sendmsg
>                      - 77.58% tcp_sendmsg
>                         - 68.63% tcp_sendmsg_locked
>                            - 26.24% sk_page_frag_refill
>                               - skb_page_frag_refill
>                                  - 25.87% __alloc_pages
>                                     - 25.61% get_page_from_freelist
>                                          24.51% clear_page_rep
>                            - 23.08% _copy_from_iter
>                                 22.88% copy_user_generic_string
>                            - 8.77% tcp_write_xmit
>                               - 8.19% __tcp_transmit_skb
>                                  - 7.86% __ip_queue_xmit
>                                     - 7.13% ip_finish_output2
>                                        - 6.65% __local_bh_enable_ip
>                                           - 6.60% do_softirq.part.0
>                                              - 6.51% __softirqentry_text_start
>                                                 - 6.40% net_rx_action
>                                                    - 5.43% __napi_poll
>                                                       + process_backlog
>                                                      0.50% napi_consume_skb
>                            + 5.39% __tcp_push_pending_frames
>                            + 2.03% tcp_stream_alloc_skb
>                            + 1.48% tcp_wmem_schedule
>                         + 8.58% release_sock
>          - 4.57% ksys_write
>             - 4.41% vfs_write
>                - 3.96% eventfd_write
>                   - 3.46% __wake_up_common
>                      - irqfd_wakeup
>                         - 3.15% kvm_arch_set_irq_inatomic
>                            - 3.11% kvm_irq_delivery_to_apic_fast
>                               - 2.01% __apic_accept_irq
>                                    0.93% svm_complete_interrupt_delivery
>          + 3.91% __x64_sys_epoll_wait
>          + 1.20% __x64_sys_getsockopt
>          + 0.78% syscall_trace_enter.constprop.0
>            0.71% syscall_exit_to_user_mode
>          + 0.61% ksys_read
> --
>
> ...there are no users of more than 1% of cycles in passt itself. The bulk of
> it is sendmsg(), as expected. One notable thing is that the kernel spends
> an awful amount of cycles zeroing pages so that we can fill them. I looked
> into that "issue" a long time ago:
>
> https://github.com/netoptimizer/prototype-kernel/pull/39/commits/2c8223c30d7f280a9e456d8e690adb0869ed8c5c
>
> ...maybe I can try out a kernel with a version of that as
> clear_page_rep() and see what happens.
...so I tried, it looks like this, but it doesn't boot for some reason:
--
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index f3d257c45225..4079012ce765 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -44,6 +44,17 @@ void clear_page_orig(void *page);
 void clear_page_rep(void *page);
 void clear_page_erms(void *page);
 
+#define MEMSET_AVX2_ZERO(reg) \
+	asm volatile("vpxor %ymm" #reg ", %ymm" #reg ", %ymm" #reg)
+#define MEMSET_AVX2_STORE(loc, reg) \
+	asm volatile("vmovdqa %%ymm" #reg ", %0" : "=m" (loc))
+
+#define YMM_BYTES (256 / 8)
+#define BYTES_TO_YMM(x) ((x) / YMM_BYTES)
+extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
+extern void kernel_fpu_end(void);
+extern bool irq_fpu_usable(void);
+
 static inline void clear_page(void *page)
 {
 	/*
@@ -51,6 +62,18 @@ static inline void clear_page(void *page)
 	 * below clobbers @page, so we perform unpoisoning before it.
 	 */
 	kmsan_unpoison_memory(page, PAGE_SIZE);
+
+	if (irq_fpu_usable()) {
+		int i;
+
+		kernel_fpu_begin();
+		MEMSET_AVX2_ZERO(0);
+		for (i = 0; i < BYTES_TO_YMM(PAGE_SIZE); i++)
+			MEMSET_AVX2_STORE(((unsigned char *)page)[YMM_BYTES * i], 0);
+		kernel_fpu_end();
+		return;
+	}
+
 	alternative_call_2(clear_page_orig,
 			   clear_page_rep, X86_FEATURE_REP_GOOD,
 			   clear_page_erms, X86_FEATURE_ERMS,
--
...I'm not sure if that's something we can do at early boot, so perhaps
I should add something specific in skb_page_frag_refill() instead. But
that's for another day/week/month...
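
For anybody who wants to play with the same idea without rebooting a
kernel, the store loop can be tried out in userspace. What follows is only
a minimal sketch under a few assumptions: the names PAGE_BYTES,
zero_page_avx2(), zero_page() and page_is_zero() are made up for
illustration, a 4096-byte page is assumed, and the
__builtin_cpu_supports() check is a userspace stand-in for the
irq_fpu_usable() gate in the patch, not an equivalent of it. It says
nothing about why the patched kernel fails to boot, or about how the
kernel version would perform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86 1
#else
#define HAVE_X86 0
#endif

#define PAGE_BYTES	4096		/* assumption: x86-64 base page size */
#define YMM_BYTES	(256 / 8)	/* one YMM register holds 32 bytes */
#define BYTES_TO_YMM(x)	((x) / YMM_BYTES)

#if HAVE_X86
/* Zero one page with aligned 256-bit stores: 4096 / 32 = 128 of them */
__attribute__((target("avx2")))
static void zero_page_avx2(void *page)
{
	const __m256i zero = _mm256_setzero_si256();
	unsigned char *p = page;
	size_t i;

	for (i = 0; i < BYTES_TO_YMM(PAGE_BYTES); i++)
		_mm256_store_si256((__m256i *)(p + YMM_BYTES * i), zero);
}
#endif

/* Zero a page: AVX2 stores if the CPU reports support, memset() otherwise */
void zero_page(void *page)
{
	/* Aligned vector stores fault on addresses not aligned to 32 bytes */
	assert((uintptr_t)page % YMM_BYTES == 0);

#if HAVE_X86
	if (__builtin_cpu_supports("avx2")) {
		zero_page_avx2(page);
		return;
	}
#endif
	memset(page, 0, PAGE_BYTES);
}

/* Hypothetical checker: return 1 if every byte of the page is zero */
int page_is_zero(const void *page)
{
	const unsigned char *p = page;
	size_t i;

	for (i = 0; i < PAGE_BYTES; i++)
		if (p[i])
			return 0;

	return 1;
}
```

Userspace needs nothing like kernel_fpu_begin()/kernel_fpu_end(), because
vector registers are already saved and restored per task there. The buffer
passed to zero_page() has to be 32-byte aligned, for example one obtained
from aligned_alloc(PAGE_BYTES, PAGE_BYTES).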
--
Stefano