From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>, passt-dev@passt.top
Subject: Re: [PATCH v2 4/4] tcp: Update TCP checksum using an iovec array
Date: Thu, 26 Sep 2024 11:56:49 +1000 [thread overview]
Message-ID: <ZvS_YcPj8-CYkmD5@zatzit.fritz.box> (raw)
In-Reply-To: <20240925193919.6bfe0df4@elisabeth>
[-- Attachment #1: Type: text/plain, Size: 4745 bytes --]
On Wed, Sep 25, 2024 at 07:39:19PM +0200, Stefano Brivio wrote:
> On Wed, 25 Sep 2024 10:11:25 +0200
> Laurent Vivier <lvivier@redhat.com> wrote:
>
> > TCP header and payload are supposed to be in the same buffer,
> > and tcp_update_check_tcp4()/tcp_update_check_tcp6() compute
> > the checksum from the base address of the header using the
> > length of the IP payload.
> >
> > In the future (for vhost-user) we need to dispatch the TCP header and
> > the TCP payload through several buffers. To be able to manage that, we
> > provide an iovec array that points to the data of the TCP frame.
> > We provide also an offset to be able to provide an array that contains
> > the TCP frame embedded in an lower level frame, and this offset points
> > to the TCP header inside the iovec array.
> >
> > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> > ---
> >
> > Notes:
> > v2:
> > - s/payload_offset/l4offset/
> > - check memory address of the checksum (alignment, iovec boundaries)
> >
> > checksum.c | 1 -
> > tcp.c | 116 ++++++++++++++++++++++++++++++++++++++++-------------
> > 2 files changed, 88 insertions(+), 29 deletions(-)
> >
> > diff --git a/checksum.c b/checksum.c
> > index 68ffaddb5bb0..4854c1937c39 100644
> > --- a/checksum.c
> > +++ b/checksum.c
> > @@ -503,7 +503,6 @@ uint16_t csum(const void *buf, size_t len, uint32_t init)
> > *
> > * Return: 16-bit folded, complemented checksum
> > */
> > -/* cppcheck-suppress unusedFunction */
> > uint16_t csum_iov(const struct iovec *iov, size_t n, size_t offset,
> > uint32_t init)
> > {
> > diff --git a/tcp.c b/tcp.c
> > index c9472d905520..f0a6f7a507a7 100644
> > --- a/tcp.c
> > +++ b/tcp.c
> > @@ -755,36 +755,81 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s)
> > }
> >
> > /**
> > - * tcp_update_check_tcp4() - Update TCP checksum from stored one
> > - * @iph: IPv4 header
> > - * @bp: TCP header followed by TCP payload
> > - */
> > -static void tcp_update_check_tcp4(const struct iphdr *iph,
> > - struct tcp_payload_t *bp)
> > + * tcp_update_check_tcp4() - Calculate TCP checksum for IPv6
> > + * @src: IPv4 source address
> > + * @dst: IPv4 destination address
> > + * @iov: Pointer to the array of IO vectors
> > + * @iov_cnt: Length of the array
> > + * @l4offset: IPv4 payload offset in the iovec array
> > + */
> > +void tcp_update_check_tcp4(struct in_addr src,
> > + struct in_addr dst,
> > + const struct iovec *iov, int iov_cnt,
> > + size_t l4offset)
> > {
> > - uint16_t l4len = ntohs(iph->tot_len) - sizeof(struct iphdr);
> > - struct in_addr saddr = { .s_addr = iph->saddr };
> > - struct in_addr daddr = { .s_addr = iph->daddr };
> > - uint32_t sum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, saddr, daddr);
> > + size_t check_ofs;
> > + __sum16 *check;
> > + int check_idx;
> > + uint32_t sum;
> > +
> > + sum = proto_ipv4_header_psum(iov_size(iov, iov_cnt) - l4offset,
> > + IPPROTO_TCP, src, dst);
> > +
> > + check_idx = iov_skip_bytes(iov, iov_cnt,
> > + l4offset + offsetof(struct tcphdr, check),
> > + &check_ofs);
> > +
> > + if (check_idx >= iov_cnt)
> > + die("TCP4 buffer is too small");
> > + if (check_ofs + sizeof(*check) > iov[check_idx].iov_len)
> > + die("TCP4 checksum field memory is not contiguous");
>
> I'm not really fond of those die() calls. First off, they should report
> a couple more details (at least check_idx, iov_cnt).
>
> Second, we could fail gracefully (hence, we should) instead of aborting
> the whole thing: those could be err() calls.
It's a question of how plausible a graceful recovery is at this point.
If we hit this, the guest has given us buffers that aren't even 2-byte
aligned. Is it reasonable to keep trying to working with a guest that
does that?
> If we fail to calculate checksums, we can leave them as zero, and the
> receiver will drop those frames anyway, if you don't want to add
> complexity (propagation of return values) for something that should
> never happen.
I think the checksum is (in the general vhost-user case)
uninitialised, not zero, at this point. To set it to zero, we'd still
need to get a usable pointer to it. Not that that really changes what
would happen - 0 has the same chance of being right by accident as an
uninitialised value.
> The rest looks good to me, but I didn't go through David's comments so
> I'll wait for his review before applying.
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-09-26 3:42 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-25 8:11 [PATCH v2 0/4] tcp: use csum_iov() in tcp_update_check_tcp[4|6]() Laurent Vivier
2024-09-25 8:11 ` [PATCH v2 1/4] tcp: Use tcp_payload_t rather than tcphdr Laurent Vivier
2024-09-25 8:11 ` [PATCH v2 2/4] pcap: Add an offset argument in pcap_iov() Laurent Vivier
2024-09-25 8:11 ` [PATCH v2 3/4] checksum: Add an offset argument in csum_iov() Laurent Vivier
2024-09-25 17:39 ` Stefano Brivio
2024-09-26 1:31 ` David Gibson
2024-09-25 8:11 ` [PATCH v2 4/4] tcp: Update TCP checksum using an iovec array Laurent Vivier
2024-09-25 17:39 ` Stefano Brivio
2024-09-26 1:56 ` David Gibson [this message]
2024-09-26 3:54 ` Stefano Brivio
2024-09-26 1:45 ` David Gibson
2024-09-27 13:49 ` Laurent Vivier
2024-09-30 2:59 ` David Gibson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZvS_YcPj8-CYkmD5@zatzit.fritz.box \
--to=david@gibson.dropbear.id.au \
--cc=lvivier@redhat.com \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).