Date: Wed, 1 Jan 2025 22:54:44 +0100
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson
Cc: passt-dev@passt.top
Subject: Re: [PATCH v2 11/12] tap: Don't size pool_tap[46] for the maximum number of packets
Message-ID: <20250101225444.130c1034@elisabeth>
In-Reply-To: <20241220083535.1372523-12-david@gibson.dropbear.id.au>
References: <20241220083535.1372523-1-david@gibson.dropbear.id.au>
 <20241220083535.1372523-12-david@gibson.dropbear.id.au>

On Fri, 20 Dec 2024 19:35:34 +1100
David Gibson wrote:

> Currently we attempt to size pool_tap[46] so they have room for the
> maximum possible number of packets that could fit in pkt_buf, TAP_MSGS.
> However, the calculation isn't quite correct: TAP_MSGS is based on
> ETH_ZLEN (60) as the minimum possible L2 frame size.
> But, we don't enforce that L2 frames are at least ETH_ZLEN when we
> receive them from the tap backend, and since we're dealing with virtual
> interfaces we don't have the physical Ethernet limitations requiring
> that length. Indeed it is possible to generate a legitimate frame
> smaller than that (e.g. a zero-payload UDP/IPv4 frame on the 'pasta'
> backend is only 42 bytes long).
>
> It's also unclear if this limit is sufficient for vhost-user, which
> isn't limited by the size of pkt_buf as the other modes are.
>
> We could attempt to correct the calculation, but that would leave us
> with even larger arrays, which in practice rarely accumulate more than
> a handful of packets. So, instead, put an arbitrary cap on the number
> of packets we can put in a batch, and if we run out of space, process
> and flush the batch.

I ran a few more tests with this, keeping TAP_MSGS at 256, and in general
I couldn't really see a difference in latency (especially for UDP streams
with small packets) or throughput. Figures from short throughput tests
(such as the ones from the test suite) look a bit more variable, but I
don't have any statistically meaningful data.

Then I looked into how many messages we might have in the array without
this change, and I realised that, with the throughput tests from the
suite, we very easily exceed the 256 limit. Perhaps surprisingly, we get
the highest buffer counts with TCP transfers and intermediate MTUs: we're
at about 4000-5000 buffers with a 1500-byte MTU (and more like ~1000 with
1280 bytes), meaning that we move 6 to 8 megabytes in one shot, every
5-10 ms (at 8 Gbps). With that kind of time interval, the extra system
call overhead from forcibly flushing batches might become rather
relevant. With lower MTUs, it looks like we have a lower CPU load and
transmissions are scheduled differently (resulting in smaller batches),
but I didn't really trace things.
So I'm starting to think that this has the *potential* to introduce a
performance regression in some cases, and that we shouldn't just assume
some arbitrary 256 limit is good enough. I didn't check with perf(1),
though.

Right now that array takes, effectively, less than 100 KiB (it's ~5000
copies of struct iovec, 16 bytes each), and in theory it could grow to
~2.5 MiB (at 161319 items). Even if we double or triple that (let's
assume we use 2 * ETH_ALEN to keep it simple), it's not much... and it
will have no practical effect anyway.

All in all, I think we shouldn't change this limit without a deeper
understanding of the practical impact. While this change doesn't bring
any practical advantage, the current behaviour is somewhat tested by now,
and a small limit isn't.

-- 
Stefano