From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=BxGnwM6Y; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by passt.top (Postfix) with ESMTPS id 82FB55A0278 for ; Mon, 28 Jul 2025 18:42:00 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1753720919; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EiGa6w1FT4X/7EAhJmS6YMqiHn2mcr3gd2WIJd4bopY=; b=BxGnwM6Y+/kPFJ0pZ+Fe3dDwAjNSYef6a9sJuFFX/qN1KMZyQqJNnHvyjlakVgXGPGjvFs G+NELu07QZKya1UVqbW0y3qc3HymnUA6LGS4BeTF8xB6XaWXSU478wOvVYTNqd1h1kHZrk 5e0fPp4PJ1jJqfsOb5JYyzqXpM8m2uY= Received: from mail-pj1-f70.google.com (mail-pj1-f70.google.com [209.85.216.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-526-aksdrn19Om-OslonstGwxw-1; Mon, 28 Jul 2025 12:41:57 -0400 X-MC-Unique: aksdrn19Om-OslonstGwxw-1 X-Mimecast-MFC-AGG-ID: aksdrn19Om-OslonstGwxw_1753720917 Received: by mail-pj1-f70.google.com with SMTP id 98e67ed59e1d1-311d670ad35so4341486a91.3 for ; Mon, 28 Jul 2025 09:41:57 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753720917; x=1754325717; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=EiGa6w1FT4X/7EAhJmS6YMqiHn2mcr3gd2WIJd4bopY=; b=euaFhHOsXhzv1eaF/8TyVWDTSTAxujC62ZaeyUsth7JyVLZVzfqXlLCuiKiQiP7FRI PRy0VaESZd7a/bqXkrJe+ByVPrS9Jn/ZvBVWXUWH0xMsmDoLUNrJV8Kf/sjjSi3uq82l xLH4HKS8F9Cr47b7jY5FUqC2Jn+6KWD3bhZka+JeLQ2tst7XgbI4zcFZZVCCNklzDs0n hp8DdniKLJRl88wH1wTaBvS06IagDcGY0uVal3Bz/K8vmLbatJGFEDXokk0E+OZETYDJ Uvuykl6hBSYl0Sv+3UH7frXTbADHy8dwHwsIkaTVzguvJ1YGpe3bxITbBa6ab6GMXS34 I25w== X-Gm-Message-State: AOJu0Yy741AFmd7wR+ZCSrrNvZ2h00Vk0KxhkMqecTczrW5zM6hTtp3N ONgRctZ1GBtId35ROVlDwVKeFBRrXCBxkBB10byDW4yDUa9NDI+bhkba60ZX8bUGQKTohQ1LVDB 1lNQSbdbd0l6IPPaX3usiP7Zj5WMM0AZA9b8PTIj6EruzP3u+BH+MOcgRUwTvGGC0M3NjO49Wnw e7Tr6aUKeyHf/bDV1iulYiZSrZU0uT X-Gm-Gg: ASbGncs0eEDYtkWDYWHwD2ZBeGK7xE7jMSkjmvBkKtm609ewcQu9bdFf2NZPrEUywym uarDE4i0dRnafWe8lL054TjZ/fq2Ko0rQmX9QD8rO2zxb+XiS0ZKH9ahX6AEREfXqRlv5SVkZfb OU/LOf7AKKku20+lZMXWEz X-Received: by 2002:a17:90b:4d0d:b0:31c:15da:2175 with SMTP id 98e67ed59e1d1-31e77864aa9mr16758056a91.9.1753720916491; Mon, 28 Jul 2025 09:41:56 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHqsEp+7TElm0iN59TZ/kAr77wcV6Rs/1Q21JevjZw79PdsIxPIldBVZPBlr1kKm1fGVGLzYosX52wKNYE/JZk= X-Received: by 2002:a17:90b:4d0d:b0:31c:15da:2175 with SMTP id 98e67ed59e1d1-31e77864aa9mr16758027a91.9.1753720916026; Mon, 28 Jul 2025 09:41:56 -0700 (PDT) MIME-Version: 1.0 References: <20250709174748.3514693-1-eperezma@redhat.com> <20250709174748.3514693-5-eperezma@redhat.com> In-Reply-To: From: Eugenio Perez Martin Date: Mon, 28 Jul 2025 18:41:20 +0200 X-Gm-Features: Ac12FXyXNmCMgkF6mycW7IX-rAiojmhOlii5LaqaH4v_-_QxZgwPZz5iImg0Yb8 Message-ID: Subject: Re: [RFC v2 04/11] tcp: export memory regions to vhost To: David Gibson X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: 38wxvC1558ABH-ifzj70YxWpUEvW-mh1zZ75SZUNYy0_1753720917 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Message-ID-Hash: F2DXZ4G6OU3IAQNT2O4YTKZZGV4ZRVLZ X-Message-ID-Hash: F2DXZ4G6OU3IAQNT2O4YTKZZGV4ZRVLZ X-MailFrom: eperezma@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, jasowang@redhat.com X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Wed, Jul 23, 2025 at 9:06=E2=80=AFAM David Gibson wrote: > > On Wed, Jul 09, 2025 at 07:47:41PM +0200, Eugenio P=C3=A9rez wrote: > > So vhost kernel is able to access the TCP buffers. > > > > Signed-off-by: Eugenio P=C3=A9rez > > --- > > tap.c | 14 +++++++++++--- > > tcp_buf.c | 14 ++++---------- > > tcp_buf.h | 19 +++++++++++++++++++ > > 3 files changed, 34 insertions(+), 13 deletions(-) > > > > diff --git a/tap.c b/tap.c > > index 0656294..8b3ec45 100644 > > --- a/tap.c > > +++ b/tap.c > > @@ -63,6 +63,8 @@ > > #include "vhost_user.h" > > #include "vu_common.h" > > > > +#include "tcp_buf.h" > > + > > I don't love including the pretty specific content of tcp_buf.h into > the mostly protocol unaware tap.c. Though I do realise that avoiding > it will probably have other tradeoffs. > I was more focused on testing the performance of the solution than making it pretty, but I'm a big fan of hiding the buffers somehow :). I just didn't explore the options. > > /* Maximum allowed frame lengths (including L2 header) */ > > > > /* Verify that an L2 frame length limit is large enough to contain the= header, > > @@ -136,8 +138,8 @@ static union { > > char buf[offsetof(struct vring_used, ring[VHOST_NDESCS])]; > > } vring_used_0 __attribute__((aligned(PAGE_SIZE))), vring_used_1 __att= ribute__((aligned(PAGE_SIZE))); > > > > -/* all descs ring + 2rings * 2vqs + tx pkt buf + rx pkt buf */ > > -#define N_VHOST_REGIONS 6 > > +/* all descs ring + 2rings * 2vqs + tx pkt buf + rx pkt buf + TCP virt= io hdr + TCP eth(src,dst) + TCP ip hdr */ > > +#define N_VHOST_REGIONS 12 > > Hmm. Keeping this and the region initialisation in sync is pretty > clunky. I recall that before I went on leave we were discussing just > exposing pasta's whole data segment to vhost; was there a reason for > changing that plan, or just that you haven't implemented it so far? > > > union { > > struct vhost_memory mem; > > char buf[offsetof(struct vhost_memory, regions[N_VHOST_REGIONS])]= ; > > @@ -1635,7 +1637,13 @@ static int tap_ns_tun(void *arg) > > vhost_memory.mem.regions[3] =3D VHOST_MEMORY_REGION(vring_used_0)= ; > > vhost_memory.mem.regions[4] =3D VHOST_MEMORY_REGION(vring_used_1)= ; > > vhost_memory.mem.regions[5] =3D VHOST_MEMORY_REGION(pkt_buf); > > - static_assert(5 < N_VHOST_REGIONS); > > + vhost_memory.mem.regions[6] =3D VHOST_MEMORY_REGION(tcp_payload_t= ap_hdr); > > + vhost_memory.mem.regions[7] =3D VHOST_MEMORY_REGION(tcp4_eth_src)= ; > > + vhost_memory.mem.regions[8] =3D VHOST_MEMORY_REGION(tcp6_eth_src)= ; > > + vhost_memory.mem.regions[9] =3D VHOST_MEMORY_REGION(tcp4_payload_= ip); > > + vhost_memory.mem.regions[10] =3D VHOST_MEMORY_REGION(tcp6_payload= _ip); > > + vhost_memory.mem.regions[11] =3D VHOST_MEMORY_REGION(tcp_payload)= ; > > + static_assert(11 < N_VHOST_REGIONS); > > If all the regions are global variables, you could put them into a > static const array, then define N_VHOST_REGIONS via ARRAY_SIZE(). > I'm ok with that, just trying to make all the variables as narrow scoped as possible. The array will not be used after the syscall. > > #undef VHOST_MEMORY_REGION > > #undef VHOST_MEMORY_REGION_PTR > > > > diff --git a/tcp_buf.c b/tcp_buf.c > > index 2fbd056..c999d2e 100644 > > --- a/tcp_buf.c > > +++ b/tcp_buf.c > > @@ -22,8 +22,6 @@ > > > > #include > > > > -#include > > - > > #include "util.h" > > #include "ip.h" > > #include "iov.h" > > @@ -35,24 +33,20 @@ > > #include "tcp_internal.h" > > #include "tcp_buf.h" > > > > -#define TCP_FRAMES_MEM 128 > > -#define TCP_FRAMES = \ > > - (c->mode =3D=3D MODE_PASTA ? 1 : TCP_FRAMES_MEM) > > - > > /* Static buffers */ > > > > /* Ethernet header for IPv4 and IPv6 frames */ > > -static struct ethhdr tcp4_eth_src; > > -static struct ethhdr tcp6_eth_src; > > +struct ethhdr tcp4_eth_src; > > +struct ethhdr tcp6_eth_src; > > > > -static struct virtio_net_hdr_mrg_rxbuf tcp_payload_tap_hdr[TCP_FRAMES_= MEM]; > > +struct virtio_net_hdr_mrg_rxbuf tcp_payload_tap_hdr[TCP_FRAMES_MEM]; > > > > /* IP headers for IPv4 and IPv6 */ > > struct iphdr tcp4_payload_ip[TCP_FRAMES_MEM]; > > struct ipv6hdr tcp6_payload_ip[TCP_FRAMES_MEM]; > > > > /* TCP segments with payload for IPv4 and IPv6 frames */ > > -static struct tcp_payload_t tcp_payload[TCP_FRAMES_MEM]; > > +struct tcp_payload_t tcp_payload[TCP_FRAMES_MEM]; > > > > static_assert(MSS4 <=3D sizeof(tcp_payload[0].data), "MSS4 is greater = than 65516"); > > static_assert(MSS6 <=3D sizeof(tcp_payload[0].data), "MSS6 is greater = than 65516"); > > diff --git a/tcp_buf.h b/tcp_buf.h > > index 54f5e53..7ae2536 100644 > > --- a/tcp_buf.h > > +++ b/tcp_buf.h > > @@ -6,9 +6,28 @@ > > #ifndef TCP_BUF_H > > #define TCP_BUF_H > > > > +#include > > + > > +#include "tcp_conn.h" > > +#include "tcp_internal.h" > > + > > void tcp_sock_iov_init(const struct ctx *c); > > void tcp_payload_flush(const struct ctx *c); > > +struct tcp_tap_conn; > > int tcp_buf_data_from_sock(const struct ctx *c, struct tcp_tap_conn *c= onn); > > int tcp_buf_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, = int flags); > > > > +#define TCP_FRAMES_MEM 128 > > +#define TCP_FRAMES = \ > > +(c->mode =3D=3D MODE_PASTA ? 1 : TCP_FRAMES_MEM) > > + > > +extern struct virtio_net_hdr_mrg_rxbuf tcp_payload_tap_hdr[TCP_FRAMES_= MEM]; > > +extern struct tcp_payload_t tcp_payload[TCP_FRAMES_MEM]; > > + > > +extern struct ethhdr tcp4_eth_src; > > +extern struct ethhdr tcp6_eth_src; > > + > > +extern struct iphdr tcp4_payload_ip[TCP_FRAMES_MEM]; > > +extern struct ipv6hdr tcp6_payload_ip[TCP_FRAMES_MEM]; > > + > > #endif /*TCP_BUF_H */ > > -- > David Gibson (he or they) | I'll have my music baroque, and my code > david AT gibson.dropbear.id.au | minimalist, thank you, not the other wa= y > | around. > http://www.ozlabs.org/~dgibson