On Wed, Sep 25, 2024 at 07:39:19PM +0200, Stefano Brivio wrote: > On Wed, 25 Sep 2024 10:11:25 +0200 > Laurent Vivier wrote: > > > TCP header and payload are supposed to be in the same buffer, > > and tcp_update_check_tcp4()/tcp_update_check_tcp6() compute > > the checksum from the base address of the header using the > > length of the IP payload. > > > > In the future (for vhost-user) we need to dispatch the TCP header and > > the TCP payload through several buffers. To be able to manage that, we > > provide an iovec array that points to the data of the TCP frame. > > We provide also an offset to be able to provide an array that contains > > the TCP frame embedded in an lower level frame, and this offset points > > to the TCP header inside the iovec array. > > > > Signed-off-by: Laurent Vivier > > --- > > > > Notes: > > v2: > > - s/payload_offset/l4offset/ > > - check memory address of the checksum (alignment, iovec boundaries) > > > > checksum.c | 1 - > > tcp.c | 116 ++++++++++++++++++++++++++++++++++++++++------------- > > 2 files changed, 88 insertions(+), 29 deletions(-) > > > > diff --git a/checksum.c b/checksum.c > > index 68ffaddb5bb0..4854c1937c39 100644 > > --- a/checksum.c > > +++ b/checksum.c > > @@ -503,7 +503,6 @@ uint16_t csum(const void *buf, size_t len, uint32_t init) > > * > > * Return: 16-bit folded, complemented checksum > > */ > > -/* cppcheck-suppress unusedFunction */ > > uint16_t csum_iov(const struct iovec *iov, size_t n, size_t offset, > > uint32_t init) > > { > > diff --git a/tcp.c b/tcp.c > > index c9472d905520..f0a6f7a507a7 100644 > > --- a/tcp.c > > +++ b/tcp.c > > @@ -755,36 +755,81 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s) > > } > > > > /** > > - * tcp_update_check_tcp4() - Update TCP checksum from stored one > > - * @iph: IPv4 header > > - * @bp: TCP header followed by TCP payload > > - */ > > -static void tcp_update_check_tcp4(const struct iphdr *iph, > > - struct tcp_payload_t *bp) > > + * tcp_update_check_tcp4() - Calculate TCP checksum for IPv6 > > + * @src: IPv4 source address > > + * @dst: IPv4 destination address > > + * @iov: Pointer to the array of IO vectors > > + * @iov_cnt: Length of the array > > + * @l4offset: IPv4 payload offset in the iovec array > > + */ > > +void tcp_update_check_tcp4(struct in_addr src, > > + struct in_addr dst, > > + const struct iovec *iov, int iov_cnt, > > + size_t l4offset) > > { > > - uint16_t l4len = ntohs(iph->tot_len) - sizeof(struct iphdr); > > - struct in_addr saddr = { .s_addr = iph->saddr }; > > - struct in_addr daddr = { .s_addr = iph->daddr }; > > - uint32_t sum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, saddr, daddr); > > + size_t check_ofs; > > + __sum16 *check; > > + int check_idx; > > + uint32_t sum; > > + > > + sum = proto_ipv4_header_psum(iov_size(iov, iov_cnt) - l4offset, > > + IPPROTO_TCP, src, dst); > > + > > + check_idx = iov_skip_bytes(iov, iov_cnt, > > + l4offset + offsetof(struct tcphdr, check), > > + &check_ofs); > > + > > + if (check_idx >= iov_cnt) > > + die("TCP4 buffer is too small"); > > + if (check_ofs + sizeof(*check) > iov[check_idx].iov_len) > > + die("TCP4 checksum field memory is not contiguous"); > > I'm not really fond of those die() calls. First off, they should report > a couple more details (at least check_idx, iov_cnt). > > Second, we could fail gracefully (hence, we should) instead of aborting > the whole thing: those could be err() calls. It's a question of how plausible a graceful recovery is at this point. If we hit this, the guest has given us buffers that aren't even 2-byte aligned. Is it reasonable to keep trying to working with a guest that does that? > If we fail to calculate checksums, we can leave them as zero, and the > receiver will drop those frames anyway, if you don't want to add > complexity (propagation of return values) for something that should > never happen. I think the checksum is (in the general vhost-user case) uninitialised, not zero, at this point. To set it to zero, we'd still need to get a usable pointer to it. Not that that really changes what would happen - 0 has the same chance of being right by accident as an uninitialised value. > The rest looks good to me, but I didn't go through David's comments so > I'll wait for his review before applying. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson