From: Stefano Brivio <sbrivio@redhat.com>
To: Laurent Vivier <lvivier@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [RFC] tcp: Replace TCP buffer structure by an iovec array
Date: Thu, 14 Mar 2024 17:26:17 +0100 [thread overview]
Message-ID: <20240314172617.22c28caa@elisabeth> (raw)
In-Reply-To: <893d5b17-cb92-49bf-8752-7ba1d798ceeb@redhat.com>
On Thu, 14 Mar 2024 16:54:02 +0100
Laurent Vivier <lvivier@redhat.com> wrote:
> On 3/14/24 16:47, Stefano Brivio wrote:
> > On Thu, 14 Mar 2024 15:07:48 +0100
> > Laurent Vivier <lvivier@redhat.com> wrote:
> >
> >> On 3/13/24 12:37, Stefano Brivio wrote:
> >> ...
> >>>> @@ -390,6 +414,42 @@ static size_t tap_send_frames_passt(const struct ctx *c,
> >>>> return i;
> >>>> }
> >>>>
> >>>> +/**
> >>>> + * tap_send_iov_passt() - Send out multiple prepared frames
> >>>
> >>> ...I would argue that this function prepares frames as well. Maybe:
> >>>
> >>> * tap_send_iov_passt() - Prepare TCP_IOV_VNET parts and send multiple frames
> >>>
> >>>> + * @c: Execution context
> >>>> + * @iov: Array of frames, each frames is divided in an array of iovecs.
> >>>> + * The first entry of the iovec is updated to point to an
> >>>> + * uint32_t storing the frame length.
> >>>
> >>> * @iov: Array of frames, each one a vector of parts, TCP_IOV_VNET blank
> >>>
> >>>> + * @n: Number of frames in @iov
> >>>> + *
> >>>> + * Return: number of frames actually sent
> >>>> + */
> >>>> +static size_t tap_send_iov_passt(const struct ctx *c,
> >>>> + struct iovec iov[][TCP_IOV_NUM],
> >>>> + size_t n)
> >>>> +{
> >>>> + unsigned int i;
> >>>> +
> >>>> + for (i = 0; i < n; i++) {
> >>>> + uint32_t vnet_len;
> >>>> + int j;
> >>>> +
> >>>> + vnet_len = 0;
> >>>
> >>> This could be initialised in the declaration (yes, it's "reset" at
> >>> every loop iteration).
> >>>
> >>>> + for (j = TCP_IOV_ETH; j < TCP_IOV_NUM; j++)
> >>>> + vnet_len += iov[i][j].iov_len;
> >>>> +
> >>>> + vnet_len = htonl(vnet_len);
> >>>> + iov[i][TCP_IOV_VNET].iov_base = &vnet_len;
> >>>> + iov[i][TCP_IOV_VNET].iov_len = sizeof(vnet_len);
> >>>> +
> >>>> + if (!tap_send_frames_passt(c, iov[i], TCP_IOV_NUM))
> >>>
> >>> ...which would now send a single frame at a time, but actually it can
> >>> already send everything in one shot because it's using sendmsg(), if you
> >>> move it outside of the loop and do something like (untested):
> >>>
> >>> return tap_send_frames_passt(c, iov, TCP_IOV_NUM * n);
> >>>
> >>>> + break;
> >>>> + }
> >>>> +
> >>>> + return i;
> >>>> +
> >>>> +}
> >>>> +
> >>
> >> I tried to do something like that but I have a performance drop:
> >>
> >> static size_t tap_send_iov_passt(const struct ctx *c,
> >> struct iovec iov[][TCP_IOV_NUM],
> >> size_t n)
> >> {
> >> unsigned int i;
> >> uint32_t vnet_len[n];
> >>
> >> for (i = 0; i < n; i++) {
> >> int j;
> >>
> >> vnet_len[i] = 0;
> >> for (j = TCP_IOV_ETH; j < TCP_IOV_NUM; j++)
> >> vnet_len[i] += iov[i][j].iov_len;
> >>
> >> vnet_len[i] = htonl(vnet_len[i]);
> >> iov[i][TCP_IOV_VNET].iov_base = &vnet_len[i];
> >> iov[i][TCP_IOV_VNET].iov_len = sizeof(uint32_t);
> >> }
> >>
> >> return tap_send_frames_passt(c, &iov[0][0], TCP_IOV_NUM * n) / TCP_IOV_NUM;
> >> }
> >>
> >> iperf3 -c localhost -p 10001 -t 60 -4
> >>
> >> before:
> >> [ ID] Interval Transfer Bitrate Retr
> >> [ 5] 0.00-60.00 sec 33.0 GBytes 4.72 Gbits/sec 1 sender
> >> [ 5] 0.00-60.06 sec 33.0 GBytes 4.72 Gbits/sec receiver
> >>
> >> after:
> >> [ ID] Interval Transfer Bitrate Retr
> >> [ 5] 0.00-60.00 sec 18.2 GBytes 2.60 Gbits/sec 0 sender
> >> [ 5] 0.00-60.07 sec 18.2 GBytes 2.60 Gbits/sec receiver
> >
> > Weird, it looks like doing one sendmsg() per frame results in a higher
> > throughput than one sendmsg() per multiple frames, which sounds rather
> > absurd. Perhaps we should start looking into what perf(1) reports, in
> > terms of both syscall overhead and cache misses.
> >
> > I'll have a look later today or tomorrow -- unless you have other
> > ideas as to why this might happen...
>
> Perhaps in the first case we only update a single vnet_len, while in the
> second case we have to update a whole array of vnet_len values, so more
> cache lines are touched?
Yes, I'm wondering if for example this:
iov[i][TCP_IOV_VNET].iov_base = &vnet_len[i];
causes a prefetch of everything pointed to by iov[i][...], so we would
prefetch (and throw away) each buffer, one by one.
Another interesting experiment to verify whether this is the case could be
to "flush" a few frames at a time (say, 4), with something like this on
top of your original change (completely untested; note that vnet_len then
needs to be an array of at least 4 entries that stays valid until the
flush):
[...]
		if (!((i + 1) % 4) &&
		    !tap_send_frames_passt(c, &iov[i - 3][0],
					   TCP_IOV_NUM * 4))
			break;
	}

	if (i % 4) {
		tap_send_frames_passt(c, &iov[i - i % 4][0],
				      TCP_IOV_NUM * (i % 4));
	}
Or maybe we could set vnet_len right after we receive data in the
buffers.
--
Stefano
Thread overview: 13+ messages
2024-03-11 13:33 [RFC] tcp: Replace TCP buffer structure by an iovec array Laurent Vivier
2024-03-12 22:56 ` Stefano Brivio
2024-03-13 11:37 ` Stefano Brivio
2024-03-13 14:42 ` Laurent Vivier
2024-03-13 15:27 ` Stefano Brivio
2024-03-13 15:20 ` Laurent Vivier
2024-03-13 16:58 ` Stefano Brivio
2024-03-14 14:07 ` Laurent Vivier
2024-03-14 15:47 ` Stefano Brivio
2024-03-14 15:54 ` Laurent Vivier
2024-03-14 16:26 ` Stefano Brivio [this message]
2024-03-15 0:46 ` David Gibson
2024-03-14 4:22 ` David Gibson