From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTP id 956485A0272 for ; Wed, 7 Feb 2024 15:57:42 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707317861; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pBhdT2TvZq7gxu1jdWKQU2oZMDXJPRdXg/y15wIm5dg=; b=Mo4GTK7/cZ6LDZ8fhbi0mlkjzGIQKyukEnzL22x6Csw8ydqJuYJOhh0KxpgJU21n+jl8W4 8BeNxV1esF35F5bZlzqrpke/Pg6Yii6HEahAFIA+TKuVkDC4BgQnuIIJ/3ENtl7jobIPjj I7XKcB2eKGBDFeDnPVaDEBUALAJkss8= Received: from mail-ed1-f71.google.com (mail-ed1-f71.google.com [209.85.208.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-681-WUP9LNC3M2yY5G71vVrYpQ-1; Wed, 07 Feb 2024 09:57:39 -0500 X-MC-Unique: WUP9LNC3M2yY5G71vVrYpQ-1 Received: by mail-ed1-f71.google.com with SMTP id 4fb4d7f45d1cf-559391de41bso465147a12.3 for ; Wed, 07 Feb 2024 06:57:39 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707317858; x=1707922658; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=LAwJsO2bmNf3WF/LQa0Oynssg2VRuUQ4D0Jomv0bwhg=; b=qxXOelbdsFuuJcBqxByjRYFvm2D4vZ5c81849cIsH9EgLn6ZRqPoLAXeiFBxg4wjSR SzXg/g680sh1wZdFaYuiEfR/QBtOXFtzQ42z1Idt1TEqqGjJniH8EPqg9phqZc8XZ3NU oMzcdu9YiEddgbXclMBSvVj8Ah5O6EODCTNs1wE0h8RtWblulun6N8CTNiE2WenB3wb7 3HlJLZmGOQ1mzVvBs43tKq9AZv0lu0oQXX5f55Jx2XQ/Uk93IlugvPZPjnZiIriOkcSs bSKz/c/xs9O/4QVxdjpFrpJYwyp1r3vgxa6dt2ESmFeyfQMCKTqqMHqggDBVcREbfhO5 ABlw== X-Gm-Message-State: AOJu0Yzcdrc8skFkDnuv7SmemxXGu9lOPlBx2JT5/zRmo3H3mqp8bbKL 8Aa3mipH0D2JqMl3cMmVElxsCmmSORJns22b1WzsvTxmi8HtzN2vMZrUYkgbSlkWRDWzbAFMKos DJwCvy/vDp4hkQKV6w9NOe7vwzUIJYGy+glXEO9nOr2qNF8Lru/I+238PWvDfLeLoal1b3/xodC Sa6SgYAzltetiB7v5BstiPTFEqnpF5x3iL/4I= X-Received: by 2002:aa7:c2c5:0:b0:55f:fe85:ed47 with SMTP id m5-20020aa7c2c5000000b0055ffe85ed47mr4352926edp.12.1707317857915; Wed, 07 Feb 2024 06:57:37 -0800 (PST) X-Google-Smtp-Source: AGHT+IHmnjT2xfdojv+kcZaeDH4jcxHCggbrR5eodr4ScQV6a8yYI4HaUa1CuF5GCnbNXara53E5dA== X-Received: by 2002:aa7:c2c5:0:b0:55f:fe85:ed47 with SMTP id m5-20020aa7c2c5000000b0055ffe85ed47mr4352912edp.12.1707317857506; Wed, 07 Feb 2024 06:57:37 -0800 (PST) Received: from maya.cloud.tilaa.com (maya.cloud.tilaa.com. [164.138.29.33]) by smtp.gmail.com with ESMTPSA id ef5-20020a05640228c500b005607f899175sm747874edb.70.2024.02.07.06.57.36 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 07 Feb 2024 06:57:36 -0800 (PST) Date: Wed, 7 Feb 2024 15:57:02 +0100 From: Stefano Brivio To: Laurent Vivier Subject: Re: [PATCH 01/24] iov: add some functions to manage iovec Message-ID: <20240207155702.149cb870@elisabeth> In-Reply-To: <854a77ba-4d9e-44e4-b4c2-29a23b3af695@redhat.com> References: <20240202141151.3762941-1-lvivier@redhat.com> <20240202141151.3762941-2-lvivier@redhat.com> <20240206171019.7aaec06d@elisabeth> <854a77ba-4d9e-44e4-b4c2-29a23b3af695@redhat.com> Organization: Red Hat X-Mailer: Claws Mail 4.1.1 (GTK 3.24.36; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Message-ID-Hash: KSTWTVL2VFHCYZ4UUI2C4UKAZZSHHWPA X-Message-ID-Hash: KSTWTVL2VFHCYZ4UUI2C4UKAZZSHHWPA X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Wed, 7 Feb 2024 15:02:42 +0100 Laurent Vivier wrote: > On 2/6/24 17:10, Stefano Brivio wrote: > > On Fri, 2 Feb 2024 15:11:28 +0100 > > Laurent Vivier wrote: > > ... > > diff --git a/iov.h b/iov.h > > new file mode 100644 > > index 000000000000..31fbf6d0e1cf > > --- /dev/null > > +++ b/iov.h > > @@ -0,0 +1,46 @@ > > +// SPDX-License-Identifier: GPL-2.0-or-later > > + > > +/* some parts copied from QEMU include/qemu/iov.h */ > > + > > +#ifndef IOVEC_H > > +#define IOVEC_H > > + > > +#include > > +#include > > + > > +size_t iov_from_buf_full(const struct iovec *iov, unsigned int iov_cnt= , > > +=09=09=09 size_t offset, const void *buf, size_t bytes); > > +size_t iov_to_buf_full(const struct iovec *iov, const unsigned int iov= _cnt, > > +=09=09 size_t offset, void *buf, size_t bytes); > > + > > +static inline size_t iov_from_buf(const struct iovec *iov, > > +=09=09=09=09 unsigned int iov_cnt, size_t offset, > > +=09=09=09=09 const void *buf, size_t bytes) > > +{ > > Is there a particular reason to include these two in a header? The > > compiler will inline as needed if they are in a source file. > > =20 > This code has been introduced in QEMU by: >=20 > commit ad523bca56a7202d2498c550a41be5c986c4d33c > Author: Paolo Bonzini > Date:=C2=A0=C2=A0 Tue Dec 22 12:03:33 2015 +0100 >=20 > =C2=A0=C2=A0=C2=A0 iov: avoid memcpy for "simple" iov_from_buf/iov_to_bu= f >=20 > =C2=A0=C2=A0=C2=A0 memcpy can take a large amount of time for small read= s and writes. > =C2=A0=C2=A0=C2=A0 For virtio it is a common case that the first iovec c= an satisfy the > =C2=A0=C2=A0=C2=A0 whole read or write.=C2=A0 In that case, and if bytes= is a constant to > =C2=A0=C2=A0=C2=A0 avoid excessive growth of code, inline the first iter= ation > =C2=A0=C2=A0=C2=A0 into the caller. >=20 > =C2=A0=C2=A0=C2=A0 Signed-off-by: Paolo Bonzini > =C2=A0=C2=A0=C2=A0 Message-id: 1450782213-14227-1-git-send-email-pbonzin= i@redhat.com > =C2=A0=C2=A0=C2=A0 Signed-off-by: Stefan Hajnoczi >=20 > Is the compiler able to check "bytes" is a constant and inline the functi= on if the=20 > definition is in a .c file and not in a .h ? Well, those are two different aspects, but anyway, yes, both. Having it in a header file implies that the compiler considers the problem separately for every compilation unit (roughly every .c file, here). If you move it in a source file, the compiler will instead apply some heuristics to decide if it makes sense to inline and, if, yes, you end up with essentially the same result. If I apply this on top of your series: --- diff --git a/iov.c b/iov.c index 38a8e75..cabd6d0 100644 --- a/iov.c +++ b/iov.c @@ -76,3 +76,27 @@ unsigned iov_copy(struct iovec *dst_iov, unsigned int dst_iov_cnt, } =09return j; } + +size_t iov_from_buf(const struct iovec *iov, unsigned int iov_cnt, +=09=09 size_t offset, const void *buf, size_t bytes) +{ +=09if (__builtin_constant_p(bytes) && iov_cnt && +=09=09offset <=3D iov[0].iov_len && bytes <=3D iov[0].iov_len - offset) { +=09=09memcpy((char *)iov[0].iov_base + offset, buf, bytes); +=09=09return bytes; +=09} else { +=09=09return iov_from_buf_full(iov, iov_cnt, offset, buf, bytes); +=09} +} + +size_t iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt, +=09=09 size_t offset, void *buf, size_t bytes) +{ +=09if (__builtin_constant_p(bytes) && iov_cnt && +=09=09offset <=3D iov[0].iov_len && bytes <=3D iov[0].iov_len - offset) { +=09=09memcpy(buf, (char *)iov[0].iov_base + offset, bytes); +=09=09return bytes; +=09} else { +=09=09return iov_to_buf_full(iov, iov_cnt, offset, buf, bytes); +=09} +} diff --git a/iov.h b/iov.h index 31fbf6d..598c2ba 100644 --- a/iov.h +++ b/iov.h @@ -12,35 +12,14 @@ size_t iov_from_buf_full(const struct iovec *iov, unsigned int iov_cnt, size_t offset, const void *buf, size_t bytes); size_t iov_to_buf_full(const struct iovec *iov, const unsigned int iov_cnt, size_t offset, void *buf, size_t bytes); - -static inline size_t iov_from_buf(const struct iovec *iov, -=09=09=09=09 unsigned int iov_cnt, size_t offset, -=09=09=09=09 const void *buf, size_t bytes) -{ -=09if (__builtin_constant_p(bytes) && iov_cnt && -=09=09offset <=3D iov[0].iov_len && bytes <=3D iov[0].iov_len - offset) { -=09=09memcpy((char *)iov[0].iov_base + offset, buf, bytes); -=09=09return bytes; -=09} else { -=09=09return iov_from_buf_full(iov, iov_cnt, offset, buf, bytes); -=09} -} - -static inline size_t iov_to_buf(const struct iovec *iov, -=09=09=09=09const unsigned int iov_cnt, size_t offset, -=09=09=09=09void *buf, size_t bytes) -{ -=09if (__builtin_constant_p(bytes) && iov_cnt && -=09=09offset <=3D iov[0].iov_len && bytes <=3D iov[0].iov_len - offset) { -=09=09memcpy(buf, (char *)iov[0].iov_base + offset, bytes); -=09=09return bytes; -=09} else { -=09=09return iov_to_buf_full(iov, iov_cnt, offset, buf, bytes); -=09} -} - size_t iov_size(const struct iovec *iov, const unsigned int iov_cnt); unsigned iov_copy(struct iovec *dst_iov, unsigned int dst_iov_cnt, =09=09 const struct iovec *iov, unsigned int iov_cnt, =09=09 size_t offset, size_t bytes); + +size_t iov_from_buf(const struct iovec *iov, unsigned int iov_cnt, +=09=09 size_t offset, const void *buf, size_t bytes); + +size_t iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt, +=09=09 size_t offset, void *buf, size_t bytes); #endif --- and have a look at it: $ CFLAGS=3D"-g" make $ objdump -DxSs passt | less you'll see it's not inlined, but the function simply resolves to a jump to iov_from_buf_full(): 0000000000020ac0 : return iov_from_buf_full(iov, iov_cnt, offset, buf, bytes); 20ac0: e9 db fd ff ff jmp 208a0 20ac5: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) 20acc: 00 00 00 00=20 because in the usage you make of it (vu_send()), elem->in_sg is never a constant. If we look at the AVX2 version instead: $ objdump -DxSs passt.avx2 | less iov_from_buf() directly becomes iov_from_buf_full() (inlined): 000000000002dfa0 : { 2dfa0: 41 57 push %r15 2dfa2: 41 56 push %r14 2dfa4: 41 89 f6 mov %esi,%r14d for (i =3D 0, done =3D 0; (offset || done < bytes) && i < iov_cnt; = i++) { 2dfa7: 4c 89 c6 mov %r8,%rsi and it's still called from vu_send() -- not inlined there, probably because iov_from_buf_full() is too big. In the end, the compiler decides to inline iov_from_buf_full() into iov_from_buf(), to drop the "constant" implementation from it (because it wouldn't be used), and I guess that makes sense. --=20 Stefano