public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Laurent Vivier <lvivier@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 4/4] tcp: Update TCP checksum using an iovec array
Date: Wed, 25 Sep 2024 17:01:52 +1000	[thread overview]
Message-ID: <ZvO1YG_Su82EHrMX@zatzit.fritz.box> (raw)
In-Reply-To: <ec983275-bd51-4b6e-9ce1-61d773a99e63@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 6092 bytes --]

On Wed, Sep 25, 2024 at 08:40:28AM +0200, Laurent Vivier wrote:
> On 25/09/2024 03:12, David Gibson wrote:
> > On Tue, Sep 24, 2024 at 05:46:42PM +0200, Laurent Vivier wrote:
> > > TCP header and payload are supposed to be in the same buffer,
> > > and tcp_update_check_tcp4()/tcp_update_check_tcp6() compute
> > > the checksum from the base address of the header using the
> > > length of the IP payload.
> > > 
> > > In the future (for vhost-user) we need to dispatch the TCP header and
> > > the TCP payload through several buffers. To be able to manage that, we
> > > provide an iovec array that points to the data of the TCP frame.
> > > We provide also an offset to be able to provide an array that contains
> > > the TCP frame embedded in an lower level frame, and this offset points
> > > to the TCP header inside the iovec array.
> > > 
> > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> > > ---
> > >   checksum.c |   1 -
> > >   tcp.c      | 100 ++++++++++++++++++++++++++++++++++++++---------------
> > >   2 files changed, 72 insertions(+), 29 deletions(-)
> > > 
> > > diff --git a/checksum.c b/checksum.c
> > > index f80db4d309a2..96ccfe2af50b 100644
> > > --- a/checksum.c
> > > +++ b/checksum.c
> > > @@ -503,7 +503,6 @@ uint16_t csum(const void *buf, size_t len, uint32_t init)
> > >    *
> > >    * Return: 16-bit folded, complemented checksum
> > >    */
> > > -/* cppcheck-suppress unusedFunction */
> > >   uint16_t csum_iov(const struct iovec *iov, size_t n, size_t offset,
> > >   		  uint32_t init)
> > >   {
> > > diff --git a/tcp.c b/tcp.c
> > > index c9472d905520..efd4037ed008 100644
> > > --- a/tcp.c
> > > +++ b/tcp.c
> > > @@ -755,36 +755,65 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s)
> > >   }
> > >   /**
> > > - * tcp_update_check_tcp4() - Update TCP checksum from stored one
> > > - * @iph:	IPv4 header
> > > - * @bp:		TCP header followed by TCP payload
> > > - */
> > > -static void tcp_update_check_tcp4(const struct iphdr *iph,
> > > -				  struct tcp_payload_t *bp)
> > > + * tcp_update_check_tcp4() - Calculate TCP checksum for IPv6
> > > + * @src:		IPv4 source address
> > > + * @dst:		IPv4 destination address
> > > + * @iov:		Pointer to the array of IO vectors
> > > + * @iov_cnt:		Length of the array
> > > + * @payload_offset:	IPv4 payload offset in the iovec array
> > 
> > You explain it here, but "payload_offset" is a bit unclear if you're
> > not sure which layer it's talking about.  "l4offset" maybe?
> > 
> > > + */
> > > +void tcp_update_check_tcp4(struct in_addr src,
> > > +			   struct in_addr dst,
> > > +			   const struct iovec *iov, int iov_cnt,
> > > +			   size_t payload_offset)
> > >   {
> > > -	uint16_t l4len = ntohs(iph->tot_len) - sizeof(struct iphdr);
> > > -	struct in_addr saddr = { .s_addr = iph->saddr };
> > > -	struct in_addr daddr = { .s_addr = iph->daddr };
> > > -	uint32_t sum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, saddr, daddr);
> > > +	size_t check_ofs;
> > > +	__sum16 *check;
> > 
> > What's a __sum16?
> 
> It's the type of "check" in struct tcphdr.

Huh, ok.

> > > +	int check_idx;
> > > +	uint32_t sum;
> > > +
> > > +	sum = proto_ipv4_header_psum(iov_size(iov, iov_cnt) - payload_offset,
> > > +				     IPPROTO_TCP, src, dst);
> > > +
> > > +	check_idx = iov_skip_bytes(iov, iov_cnt,
> > > +				payload_offset + offsetof(struct tcphdr, check),
> > > +				&check_ofs);
> > > +
> > > +	check = (__sum16 *)((char *)iov[check_idx].iov_base + check_ofs);
> > 
> > So.. it's not likely, but it's possible for the first byte of the
> > checksum to be in one iovec and the second byte in another.  This
> > whole construction is a bit awkward too.
> 
> I'm guessing iov_base and iov_len are 32bit aligned, and address of "check"
> too (as it is from tcphdr in memory, that should be 32bit aligned, and
> offset of "check" is 16), so a 16 bit value cannot be shared between two
> iovecs. I'm guessing that any 32bit value we take from a structure will not
> cross boundary of an iovec.

That's probably true.. but I think we should actually verify / assert
that somewhere.

> > I think we want another helper on top of iov_skip_bytes().  It would
> > retreive a pointer to a field of a given length and offset within the
> > IOV, returning NULL if that can't be found contiguously.  It could
> > have a macro wrapper that fills in some of the details based on a
> > type.  For now I'd imagine we just give up if it returns NULL, but
> > that's enough to reduce a potential out of bounds memory access to
> > merely breaking one connection.  If we ever need it, we can add a slow
> > path to handle that case.
> > 
> > There are a couple of other curly cases to consider too, alas: what if
> > the field you request does exist contiguously, but isn't properly
> > aligned for the type we want to access it as?  Then there's the
> > question of whether doing this will run afoul of the type-based
> > aliasing rules.
> 
> In our case, "check" is a field of "struct tcphdr", I think it's sane to
> think it is correctly aligned.

It will be correctly aligned as long as the struct tcphdr itself is
correctly aligned.  We can probably count on that being 2-byte aligned
w.r.t. the start of the frame, but probably no more since the L2
header is 14 bytes long.  If there were odd-length iovs, and the TCP
header wasn't in the first, that could destroy its alignment though.
Actually... the same thing could happen in the first IOV if iov_base
wasn't aligned, which is admittedly probably even less likely.

> I don't want to write complicated code only to write the checksum of
> the tcp header.

I agree, but I think we should at least test and bail with an error
message if our assumptions about the alignments of the IOVs were given
aren't true.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-09-25  7:02 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-24 15:46 [PATCH 0/4] tcp: use csum_iov() in tcp_update_check_tcp[4|6]() Laurent Vivier
2024-09-24 15:46 ` [PATCH 1/4] tcp: Use tcp_payload_t rather than tcphdr Laurent Vivier
2024-09-24 15:46 ` [PATCH 2/4] pcap: Add an offset argument in pcap_iov() Laurent Vivier
2024-09-25  1:21   ` David Gibson
2024-09-24 15:46 ` [PATCH 3/4] checksum: Add an offset argument in csum_iov() Laurent Vivier
2024-09-25  0:51   ` David Gibson
2024-09-24 15:46 ` [PATCH 4/4] tcp: Update TCP checksum using an iovec array Laurent Vivier
2024-09-25  1:12   ` David Gibson
2024-09-25  6:40     ` Laurent Vivier
2024-09-25  7:01       ` David Gibson [this message]
2024-09-25  7:27         ` Laurent Vivier
2024-09-25  7:31           ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZvO1YG_Su82EHrMX@zatzit.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=lvivier@redhat.com \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).