From: Eugenio Perez Martin
Date: Tue, 20 May 2025 17:09:44 +0200
Subject: vhost-kernel net on pasta: from 26 to 37Gbit/s
To: passt-dev@passt.top
CC: Jason Wang, Jeff Nelson
List-Id: Development discussion and patches for passt

Hi!

Some updates on the integration. The main culprit was that pasta was
still allowed to read packets using the regular read() on the tap
device. I thought that part was completely disabled, but I guess the
kernel is able to omit the notification on tap as long as userspace
does not read it.

My scenario: everything runs on different CPUs, all in the same NUMA
node. I run the iperf server on CPU 11 with "iperf3 -A 11 -s". All odd
CPUs are isolated with isolcpus=1,3,... nohz=on nohz_full=1,3,...

With vanilla pasta isolated to CPUs 1,3 with taskset, with just the
--config-net option, and running iperf with
"iperf3 -A 5 -c 10.6.68.254 -w 8M":

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec    0    sender
[  5]   0.00-10.04  sec  30.7 GBytes  26.3 Gbits/sec         receiver

Now, with the vhost patches, we get somewhat worse performance:

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  25.5 GBytes  21.9 Gbits/sec    0    sender
[  5]   0.00-10.04  sec  25.5 GBytes  21.8 Gbits/sec         receiver

The vhost patch still lacks optimizations such as disabling
notifications or batching the rx available buffer notifications. At the
moment it refills the rx buffers in each iteration, and it does not set
the no_notify bit, which would make the kernel skip the used buffer
notifications while pasta is actively checking the queue. That is not
optimal.
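
To illustrate why the missing suppression bit matters, here is a toy
model in Python. It is only a sketch and not the pasta or vhost code:
ToyQueue, NO_NOTIFY and run() are names made up for this example, and
the "kernel" is just a method call. It counts how many used buffer
notifications are sent when the consumer polls with and without the
flag set.

```python
# Toy model only: names and flag value are assumptions for illustration,
# not the real virtio/vhost data structures.
from dataclasses import dataclass

NO_NOTIFY = 1  # hypothetical flag bit for this model


@dataclass
class ToyQueue:
    flags: int = 0          # written by the consumer (pasta in this analogy)
    used: int = 0           # buffers completed by the producer (the kernel)
    notifications: int = 0  # wakeups the producer had to send

    def producer_complete(self):
        """Producer marks one buffer as used and notifies unless suppressed."""
        self.used += 1
        if not (self.flags & NO_NOTIFY):
            self.notifications += 1


def run(bursts, suppress):
    """Process bursts of completions; return (notifications, buffers handled)."""
    q = ToyQueue()
    handled = 0
    for burst in bursts:
        if suppress:
            q.flags |= NO_NOTIFY   # consumer is polling, no wakeup needed
        for _ in range(burst):
            q.producer_complete()
        handled = q.used           # poll: pick up everything completed so far
        # Re-enable notifications before going idle. A real implementation
        # must re-check the queue after this step to avoid a lost wakeup.
        q.flags &= ~NO_NOTIFY
    return q.notifications, handled


print(run([4, 4, 4], suppress=False))
print(run([4, 4, 4], suppress=True))
```

In this model the same 12 buffers are handled in both cases, but with
suppression the producer never has to send a notification while the
consumer is already looking at the queue.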

If I isolate the vhost kernel thread [1], I get much better
performance, as expected:

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  43.1 GBytes  37.1 Gbits/sec    0    sender
[  5]   0.00-10.04  sec  43.1 GBytes  36.9 Gbits/sec         receiver

After analyzing the perf output, rep_movs_alternative is the most
called function in all three: iperf3 (~20% Self), passt.avx2 (~15%
Self) and vhost (~15% Self). But I don't see any of them consuming 100%
of CPU in top: pasta consumes ~85% CPU, the iperf3 client and server
each consume ~60%, and vhost consumes ~53%.

So... I have mixed feelings about this :). By "default" it seems to
perform worse, but my test is maybe too synthetic. There is room for
improvement with the optimizations mentioned above, so I'd continue
applying them, then continue with UDP and TCP zerocopy, and with
developing zerocopy vhost rx.

With these numbers I think the series should not be merged at the
moment. I could send it as an RFC if you want, but I have not applied
the comments the first one received, POC style :).

Thanks!

[1] Notes to reproduce it: I'm able to see the thread with top -H and
then pin it with taskset. Either the latest changes in the module or
the way pasta behaves does not allow me to see it in the classical ps
output.
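
The manual steps in [1] could be scripted roughly as below. This is a
minimal sketch under assumptions: it assumes the vhost worker shows up
as a thread of the pasta process under /proc/<pid>/task with a name
starting with "vhost-" (I have not checked this on every kernel), and
find_threads() and pin_thread() are helper names invented for this
example. os.sched_setaffinity() is Linux only.

```python
import os
from pathlib import Path


def find_threads(pid, prefix="vhost-", proc_root="/proc"):
    """Return [(tid, name)] for threads of pid whose comm starts with prefix.

    The "vhost-" prefix is an assumption about how the worker is named.
    """
    matches = []
    task_dir = Path(proc_root) / str(pid) / "task"
    for entry in sorted(task_dir.iterdir(), key=lambda p: int(p.name)):
        name = (entry / "comm").read_text().strip()
        if name.startswith(prefix):
            matches.append((int(entry.name), name))
    return matches


def pin_thread(tid, cpus):
    """Pin a single thread to the given CPUs, like taskset -p on a TID."""
    os.sched_setaffinity(tid, set(cpus))
```

Usage would be something like
"for tid, _ in find_threads(pasta_pid): pin_thread(tid, {7})", where
pasta_pid and CPU 7 are placeholders, not the values from my setup.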