Date: Thu, 26 Sep 2024 05:54:00 +0200
From: Stefano Brivio
To: David Gibson
CC: Laurent Vivier, passt-dev@passt.top
Subject: Re: [PATCH v2 4/4] tcp: Update TCP checksum using an iovec array
Message-ID: <20240926055400.47d0adeb@elisabeth>
References: <20240925081125.205974-1-lvivier@redhat.com> <20240925081125.205974-5-lvivier@redhat.com> <20240925193919.6bfe0df4@elisabeth>
Organization: Red Hat
List-Id: Development discussion and patches for passt

On Thu, 26 Sep 2024 11:56:49 +1000
David Gibson wrote:

> On Wed, Sep 25, 2024 at 07:39:19PM +0200, Stefano Brivio wrote:
> > On Wed, 25 Sep 2024 10:11:25 +0200
> > Laurent Vivier wrote:
> >
> > > TCP header and payload are supposed to be in the same buffer,
> > > and tcp_update_check_tcp4()/tcp_update_check_tcp6() compute
> > > the checksum from the base address of the header using the
> > > length of the IP payload.
> > >
> > > In the future (for vhost-user) we need to dispatch the TCP header and
> > > the TCP payload through several buffers.
> > > To be able to manage that, we
> > > provide an iovec array that points to the data of the TCP frame.
> > > We provide also an offset to be able to provide an array that contains
> > > the TCP frame embedded in an lower level frame, and this offset points
> > > to the TCP header inside the iovec array.
> > >
> > > Signed-off-by: Laurent Vivier
> > > ---
> > >
> > > Notes:
> > >     v2:
> > >       - s/payload_offset/l4offset/
> > >       - check memory address of the checksum (alignment, iovec boundaries)
> > >
> > >  checksum.c |   1 -
> > >  tcp.c      | 116 ++++++++++++++++++++++++++++++++++++++++-------------
> > >  2 files changed, 88 insertions(+), 29 deletions(-)
> > >
> > > diff --git a/checksum.c b/checksum.c
> > > index 68ffaddb5bb0..4854c1937c39 100644
> > > --- a/checksum.c
> > > +++ b/checksum.c
> > > @@ -503,7 +503,6 @@ uint16_t csum(const void *buf, size_t len, uint32_t init)
> > >   *
> > >   * Return: 16-bit folded, complemented checksum
> > >   */
> > > -/* cppcheck-suppress unusedFunction */
> > >  uint16_t csum_iov(const struct iovec *iov, size_t n, size_t offset,
> > >  		  uint32_t init)
> > >  {
> > > diff --git a/tcp.c b/tcp.c
> > > index c9472d905520..f0a6f7a507a7 100644
> > > --- a/tcp.c
> > > +++ b/tcp.c
> > > @@ -755,36 +755,81 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s)
> > >  }
> > >
> > >  /**
> > > - * tcp_update_check_tcp4() - Update TCP checksum from stored one
> > > - * @iph:	IPv4 header
> > > - * @bp:	TCP header followed by TCP payload
> > > - */
> > > -static void tcp_update_check_tcp4(const struct iphdr *iph,
> > > -				  struct tcp_payload_t *bp)
> > > + * tcp_update_check_tcp4() - Calculate TCP checksum for IPv6
> > > + * @src:	IPv4 source address
> > > + * @dst:	IPv4 destination address
> > > + * @iov:	Pointer to the array of IO vectors
> > > + * @iov_cnt:	Length of the array
> > > + * @l4offset:	IPv4 payload offset in the iovec array
> > > + */
> > > +void tcp_update_check_tcp4(struct in_addr src,
> > > +			   struct in_addr dst,
> > > +			   const struct iovec *iov, int iov_cnt,
> > > +			   size_t l4offset)
> > >  {
> > > -	uint16_t l4len = ntohs(iph->tot_len) - sizeof(struct iphdr);
> > > -	struct in_addr saddr = { .s_addr = iph->saddr };
> > > -	struct in_addr daddr = { .s_addr = iph->daddr };
> > > -	uint32_t sum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, saddr, daddr);
> > > +	size_t check_ofs;
> > > +	__sum16 *check;
> > > +	int check_idx;
> > > +	uint32_t sum;
> > > +
> > > +	sum = proto_ipv4_header_psum(iov_size(iov, iov_cnt) - l4offset,
> > > +				     IPPROTO_TCP, src, dst);
> > > +
> > > +	check_idx = iov_skip_bytes(iov, iov_cnt,
> > > +				   l4offset + offsetof(struct tcphdr, check),
> > > +				   &check_ofs);
> > > +
> > > +	if (check_idx >= iov_cnt)
> > > +		die("TCP4 buffer is too small");
> > > +	if (check_ofs + sizeof(*check) > iov[check_idx].iov_len)
> > > +		die("TCP4 checksum field memory is not contiguous");
> >
> > I'm not really fond of those die() calls. First off, they should report
> > a couple more details (at least check_idx, iov_cnt).
> >
> > Second, we could fail gracefully (hence, we should) instead of aborting
> > the whole thing: those could be err() calls.
>
> It's a question of how plausible a graceful recovery is at this point.
> If we hit this, the guest has given us buffers that aren't even 2-byte
> aligned. Is it reasonable to keep trying to working with a guest that
> does that?

Could it happen for a single buffer, if the hypervisor has some issue?
Or, say, for every second buffer? If that's the case, then we could
kind of work with it.

Looking at it again, true, that would mean there's some fundamental
issue with it and it doesn't make a lot of sense to try to recover. I
still think we probably should, but it's not a strong preference.

> > If we fail to calculate checksums, we can leave them as zero, and the
> > receiver will drop those frames anyway, if you don't want to add
> > complexity (propagation of return values) for something that should
> > never happen.
>
> I think the checksum is (in the general vhost-user case)
> uninitialised, not zero, at this point. To set it to zero, we'd still
> need to get a usable pointer to it. Not that that really changes what
> would happen - 0 has the same chance of being right by accident as an
> uninitialised value.

Ah, right. Well, we could just leave it uninitialised then.

-- 
Stefano
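
For illustration only, here is a minimal sketch of the graceful variant
discussed above: locate the checksum field inside an iovec array and, if
it can't be used (buffers too small, field split across two buffers, or
misaligned address), return an error and leave the field untouched
instead of calling die(). Both iov_u16_at() and iov_store_csum() are
hypothetical names made up for this sketch, not passt functions, and the
lookup is open-coded rather than relying on iov_skip_bytes().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical helper, not part of passt: locate a 16-bit field at byte
 * offset @off in the data described by @iov. Returns NULL, instead of
 * aborting, if the offset is out of range, the field is split across two
 * buffers, or the address is not suitably aligned.
 */
static uint16_t *iov_u16_at(const struct iovec *iov, size_t iov_cnt,
			    size_t off)
{
	uintptr_t addr;
	size_t i;

	/* Skip whole buffers until @off falls inside iov[i] */
	for (i = 0; i < iov_cnt && off >= iov[i].iov_len; i++)
		off -= iov[i].iov_len;

	if (i == iov_cnt)
		return NULL;		/* buffers are too small */
	if (off + sizeof(uint16_t) > iov[i].iov_len)
		return NULL;		/* field is not contiguous */

	addr = (uintptr_t)iov[i].iov_base + off;
	if (addr % _Alignof(uint16_t))
		return NULL;		/* field is misaligned */

	return (uint16_t *)addr;
}

/* Hypothetical caller: store @csum if possible, otherwise leave the field
 * untouched and return -1 so that the caller can log the failure.
 */
static int iov_store_csum(const struct iovec *iov, size_t iov_cnt,
			  size_t off, uint16_t csum)
{
	uint16_t *check = iov_u16_at(iov, iov_cnt, off);

	if (!check)
		return -1;

	*check = csum;
	return 0;
}
```

With something along these lines, a caller could report the failure
(including details such as the offset and the number of buffers) and
carry on, and the frame would go out with whatever was in the checksum
field, which the receiver is expected to drop.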