From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 14 Mar 2024 16:47:07 +0100
From: Stefano Brivio <sbrivio@redhat.com>
To: Laurent Vivier
Cc: passt-dev@passt.top
Subject: Re: [RFC] tcp: Replace TCP buffer structure by an iovec array
Message-ID: <20240314164707.75ee6501@elisabeth>
In-Reply-To: <84cadd0b-4102-4bde-bad6-45705cca34ce@redhat.com>
References: <20240311133356.1405001-1-lvivier@redhat.com>
	<20240313123725.7a37f311@elisabeth>
	<84cadd0b-4102-4bde-bad6-45705cca34ce@redhat.com>
Organization: Red Hat
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
On Thu, 14 Mar 2024 15:07:48 +0100
Laurent Vivier wrote:

> On 3/13/24 12:37, Stefano Brivio wrote:
> ...
> >> @@ -390,6 +414,42 @@ static size_t tap_send_frames_passt(const struct ctx *c,
> >>  	return i;
> >>  }
> >>  
> >> +/**
> >> + * tap_send_iov_passt() - Send out multiple prepared frames
> > 
> > ...I would argue that this function prepares frames as well. Maybe:
> > 
> >  * tap_send_iov_passt() - Prepare TCP_IOV_VNET parts and send multiple frames
> > 
> >> + * @c:		Execution context
> >> + * @iov:	Array of frames, each frames is divided in an array of iovecs.
> >> + *		The first entry of the iovec is updated to point to an
> >> + *		uint32_t storing the frame length.
> > 
> >  * @iov:	Array of frames, each one a vector of parts, TCP_IOV_VNET blank
> > 
> >> + * @n:		Number of frames in @iov
> >> + *
> >> + * Return: number of frames actually sent
> >> + */
> >> +static size_t tap_send_iov_passt(const struct ctx *c,
> >> +				 struct iovec iov[][TCP_IOV_NUM],
> >> +				 size_t n)
> >> +{
> >> +	unsigned int i;
> >> +
> >> +	for (i = 0; i < n; i++) {
> >> +		uint32_t vnet_len;
> >> +		int j;
> >> +
> >> +		vnet_len = 0;
> > 
> > This could be initialised in the declaration (yes, it's "reset" at
> > every loop iteration).
> > 
> >> +		for (j = TCP_IOV_ETH; j < TCP_IOV_NUM; j++)
> >> +			vnet_len += iov[i][j].iov_len;
> >> +
> >> +		vnet_len = htonl(vnet_len);
> >> +		iov[i][TCP_IOV_VNET].iov_base = &vnet_len;
> >> +		iov[i][TCP_IOV_VNET].iov_len = sizeof(vnet_len);
> >> +
> >> +		if (!tap_send_frames_passt(c, iov[i], TCP_IOV_NUM))
> > 
> > ...which would now send a single frame at a time, but actually it can
> > already send everything in one shot because it's using sendmsg(), if you
> > move it outside of the loop and do something like (untested):
> > 
> > 	return tap_send_frames_passt(c, iov, TCP_IOV_NUM * n);
> > 
> >> +			break;
> >> +	}
> >> +
> >> +	return i;
> >> +
> >> +}
> >> +
> 
> I tried to do something like that but I have a performance drop:
> 
> static size_t tap_send_iov_passt(const struct ctx *c,
> 				 struct iovec iov[][TCP_IOV_NUM],
> 				 size_t n)
> {
> 	unsigned int i;
> 	uint32_t vnet_len[n];
> 
> 	for (i = 0; i < n; i++) {
> 		int j;
> 
> 		vnet_len[i] = 0;
> 		for (j = TCP_IOV_ETH; j < TCP_IOV_NUM; j++)
> 			vnet_len[i] += iov[i][j].iov_len;
> 
> 		vnet_len[i] = htonl(vnet_len[i]);
> 		iov[i][TCP_IOV_VNET].iov_base = &vnet_len[i];
> 		iov[i][TCP_IOV_VNET].iov_len = sizeof(uint32_t);
> 	}
> 
> 	return tap_send_frames_passt(c, &iov[0][0], TCP_IOV_NUM * n) / TCP_IOV_NUM;
> }
> 
> iperf3 -c localhost -p 10001 -t 60 -4
> 
> before:
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-60.00  sec  33.0 GBytes  4.72 Gbits/sec    1	sender
> [  5]   0.00-60.06  sec  33.0 GBytes  4.72 Gbits/sec		receiver
> 
> after:
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-60.00  sec  18.2 GBytes  2.60 Gbits/sec    0	sender
> [  5]   0.00-60.07  sec  18.2 GBytes  2.60 Gbits/sec		receiver

Weird, it looks like doing one sendmsg() per frame results in a higher
throughput than one sendmsg() per multiple frames, which sounds rather
absurd. Perhaps we should start looking into what perf(1) reports, in
terms of both syscall overhead and cache misses. I'll have a look later
today or tomorrow -- unless you have other ideas as to why this might
happen...

-- 
Stefano