public inbox for passt-dev@passt.top
From: Laurent Vivier <lvivier@redhat.com>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [RFC] tcp: Replace TCP buffer structure by an iovec array
Date: Thu, 14 Mar 2024 16:54:02 +0100	[thread overview]
Message-ID: <893d5b17-cb92-49bf-8752-7ba1d798ceeb@redhat.com> (raw)
In-Reply-To: <20240314164707.75ee6501@elisabeth>

On 3/14/24 16:47, Stefano Brivio wrote:
> On Thu, 14 Mar 2024 15:07:48 +0100
> Laurent Vivier <lvivier@redhat.com> wrote:
> 
>> On 3/13/24 12:37, Stefano Brivio wrote:
>> ...
>>>> @@ -390,6 +414,42 @@ static size_t tap_send_frames_passt(const struct ctx *c,
>>>>    	return i;
>>>>    }
>>>>    
>>>> +/**
>>>> + * tap_send_iov_passt() - Send out multiple prepared frames
>>>
>>> ...I would argue that this function prepares frames as well. Maybe:
>>>
>>>    * tap_send_iov_passt() - Prepare TCP_IOV_VNET parts and send multiple frames
>>>    
>>>> + * @c:		Execution context
>>>> + * @iov:	Array of frames, each frames is divided in an array of iovecs.
>>>> + *              The first entry of the iovec is updated to point to an
>>>> + *              uint32_t storing the frame length.
>>>
>>>    * @iov:	Array of frames, each one a vector of parts, TCP_IOV_VNET blank
>>>    
>>>> + * @n:		Number of frames in @iov
>>>> + *
>>>> + * Return: number of frames actually sent
>>>> + */
>>>> +static size_t tap_send_iov_passt(const struct ctx *c,
>>>> +				 struct iovec iov[][TCP_IOV_NUM],
>>>> +				 size_t n)
>>>> +{
>>>> +	unsigned int i;
>>>> +
>>>> +	for (i = 0; i < n; i++) {
>>>> +		uint32_t vnet_len;
>>>> +		int j;
>>>> +
>>>> +		vnet_len = 0;
>>>
>>> This could be initialised in the declaration (yes, it's "reset" at
>>> every loop iteration).
>>>    
>>>> +		for (j = TCP_IOV_ETH; j < TCP_IOV_NUM; j++)
>>>> +			vnet_len += iov[i][j].iov_len;
>>>> +
>>>> +		vnet_len = htonl(vnet_len);
>>>> +		iov[i][TCP_IOV_VNET].iov_base = &vnet_len;
>>>> +		iov[i][TCP_IOV_VNET].iov_len = sizeof(vnet_len);
>>>> +
>>>> +		if (!tap_send_frames_passt(c, iov[i], TCP_IOV_NUM))
>>>
>>> ...which would now send a single frame at a time, but actually it can
>>> already send everything in one shot because it's using sendmsg(), if you
>>> move it outside of the loop and do something like (untested):
>>>
>>> 	return tap_send_frames_passt(c, iov, TCP_IOV_NUM * n);
>>>    
>>>> +			break;
>>>> +	}
>>>> +
>>>> +	return i;
>>>> +
>>>> +}
>>>> +
>>
>> I tried to do something like that, but I see a performance drop:
>>
>> static size_t tap_send_iov_passt(const struct ctx *c,
>>                                    struct iovec iov[][TCP_IOV_NUM],
>>                                    size_t n)
>> {
>>           unsigned int i;
>>           uint32_t vnet_len[n];
>>
>>           for (i = 0; i < n; i++) {
>>                   int j;
>>
>>                   vnet_len[i] = 0;
>>                   for (j = TCP_IOV_ETH; j < TCP_IOV_NUM; j++)
>>                           vnet_len[i] += iov[i][j].iov_len;
>>
>>                   vnet_len[i] = htonl(vnet_len[i]);
>>                   iov[i][TCP_IOV_VNET].iov_base = &vnet_len[i];
>>                   iov[i][TCP_IOV_VNET].iov_len = sizeof(uint32_t);
>>           }
>>
>>           return tap_send_frames_passt(c, &iov[0][0], TCP_IOV_NUM * n) / TCP_IOV_NUM;
>> }
>>
>> iperf3 -c localhost -p 10001  -t 60  -4
>>
>> before:
>> [ ID] Interval           Transfer     Bitrate         Retr
>> [  5]   0.00-60.00  sec  33.0 GBytes  4.72 Gbits/sec    1             sender
>> [  5]   0.00-60.06  sec  33.0 GBytes  4.72 Gbits/sec                  receiver
>>
>> after:
>> [ ID] Interval           Transfer     Bitrate         Retr
>> [  5]   0.00-60.00  sec  18.2 GBytes  2.60 Gbits/sec    0             sender
>> [  5]   0.00-60.07  sec  18.2 GBytes  2.60 Gbits/sec                  receiver
> 
> Weird, it looks like doing one sendmsg() per frame results in a higher
> throughput than one sendmsg() per multiple frames, which sounds rather
> absurd. Perhaps we should start looking into what perf(1) reports, in
> terms of both syscall overhead and cache misses.
> 
> I'll have a look later today or tomorrow -- unless you have other
> ideas as to why this might happen...
> 

Perhaps in the first case we only update a single vnet_len, while in the second 
case we have to update an array of vnet_len entries, so more cache lines are touched?
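
For instance, something like this (untested, run as root, and assuming the 
process is still simply named "passt") could show whether the batched version 
really pays in cache misses or in syscall count while iperf3 runs:

    # count cache activity and syscall entries over a 60 s window
    perf stat -e cache-references,cache-misses,raw_syscalls:sys_enter \
            -p "$(pgrep passt)" -- sleep 60

    # call-graph profile of the same window
    perf record -g -p "$(pgrep passt)" -- sleep 60
    perf report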

Thanks,
Laurent



Thread overview: 13+ messages
2024-03-11 13:33 [RFC] tcp: Replace TCP buffer structure by an iovec array Laurent Vivier
2024-03-12 22:56 ` Stefano Brivio
2024-03-13 11:37 ` Stefano Brivio
2024-03-13 14:42   ` Laurent Vivier
2024-03-13 15:27     ` Stefano Brivio
2024-03-13 15:20   ` Laurent Vivier
2024-03-13 16:58     ` Stefano Brivio
2024-03-14 14:07   ` Laurent Vivier
2024-03-14 15:47     ` Stefano Brivio
2024-03-14 15:54       ` Laurent Vivier [this message]
2024-03-14 16:26         ` Stefano Brivio
2024-03-15  0:46           ` David Gibson
2024-03-14  4:22 ` David Gibson
