From: Jon Maloy
To: Laurent Vivier, passt-dev@passt.top
CC: David Gibson
Subject: Re: [PATCH v6 1/4] tcp: Encode checksum computation flags in a single parameter
Date: Sat, 9 May 2026 19:45:05 -0400
Message-ID: <6356e669-2b37-422a-bf8d-9bddbcb8c987@redhat.com>
In-Reply-To: <20260416161618.3826904-2-lvivier@redhat.com>
References: <20260416161618.3826904-1-lvivier@redhat.com> <20260416161618.3826904-2-lvivier@redhat.com>
List-Id: Development discussion and patches for passt

On 2026-04-16 12:16, Laurent Vivier wrote:
> tcp_fill_headers() takes a pointer to a previously computed IPv4 header
> checksum to avoid recalculating it when the payload length doesn't
> change, and a separate bool to skip TCP checksum computation.
>
> Replace both parameters with a single uint32_t csum_flags that encodes:
> - IP4_CSUM (bit 31): compute IPv4 header checksum from scratch
> - TCP_CSUM (bit 30): compute TCP checksum
> - IP4_CMASK (low 16 bits): cached IPv4 header checksum value
>
> When IP4_CSUM is not set, the cached checksum is extracted from the low
> 16 bits. This is cleaner than the pointer-based approach, and also
> avoids a potential dangling pointer issue: a subsequent patch makes
> tcp_fill_headers() access ip4h via with_header(), which scopes it to a
> temporary variable, so a pointer to ip4h->check would become invalid
> after the with_header() block.
>
> Suggested-by: David Gibson
> Signed-off-by: Laurent Vivier

Reviewed-by: Jon Maloy

But see comment below.

> ---
>  tcp.c          | 25 +++++++++++++------------
>  tcp_buf.c      | 23 ++++++++++++-----------
>  tcp_internal.h |  7 +++++--
>  tcp_vu.c       | 28 +++++++++++++++++-----------
>  4 files changed, 47 insertions(+), 36 deletions(-)
>
> diff --git a/tcp.c b/tcp.c
> index 45bcc19375fe..de362290b034 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -946,9 +946,10 @@ static void tcp_fill_header(struct tcphdr *th,
>   * @th:		Pointer to TCP header
>   * @payload:	TCP payload
>   * @dlen:	TCP payload length
> - * @ip4_check:	IPv4 checksum, if already known
> + * @csum_flags:	TCP_CSUM if TCP checksum must be computed,
> + *		IP4_CSUM if IPv4 checksum must be computed,
> + *		otherwise IPv4 checksum is provided in IP4_CMASK
>   * @seq:	Sequence number for this segment
> - * @no_tcp_csum: Do not set TCP checksum
>   *
>   * Return: frame length (including L2 headers)
>   */
> @@ -956,8 +957,7 @@ size_t tcp_fill_headers(const struct ctx *c, struct tcp_tap_conn *conn,
>  			struct ethhdr *eh,
>  			struct iphdr *ip4h, struct ipv6hdr *ip6h,
>  			struct tcphdr *th, struct iov_tail *payload,
> -			size_t dlen, const uint16_t *ip4_check, uint32_t seq,
> -			bool no_tcp_csum)
> +			size_t dlen, uint32_t csum_flags, uint32_t seq)
>  {
>  	const struct flowside *tapside = TAPFLOW(conn);
>  	size_t l4len = dlen + sizeof(*th);
> @@ -977,13 +977,14 @@ size_t tcp_fill_headers(const struct ctx *c, struct tcp_tap_conn *conn,
>  		ip4h->saddr = src4->s_addr;
>  		ip4h->daddr = dst4->s_addr;
>
> -		if (ip4_check)
> -			ip4h->check = *ip4_check;
> -		else
> +		if (csum_flags & IP4_CSUM) {
>  			ip4h->check = csum_ip4_header(l3len, IPPROTO_TCP,
>  						      *src4, *dst4);
> +		} else {
> +			ip4h->check = csum_flags & IP4_CMASK;
> +		}
>
> -		if (!no_tcp_csum) {
> +		if (csum_flags & TCP_CSUM) {
>  			psum = proto_ipv4_header_psum(l4len, IPPROTO_TCP,
>  						      *src4, *dst4);
>  		}
> @@ -1003,7 +1004,7 @@ size_t tcp_fill_headers(const struct ctx *c, struct tcp_tap_conn *conn,
>
>  		ip6_set_flow_lbl(ip6h, conn->sock);
>
> -		if (!no_tcp_csum) {
> +		if (csum_flags & TCP_CSUM) {
>  			psum = proto_ipv6_header_psum(l4len, IPPROTO_TCP,
>  						      &ip6h->saddr,
>  						      &ip6h->daddr);
> @@ -1018,10 +1019,10 @@ size_t tcp_fill_headers(const struct ctx *c, struct tcp_tap_conn *conn,
>
>  	tcp_fill_header(th, conn, seq);
>
> -	if (no_tcp_csum)
> -		th->check = 0;
> -	else
> +	if (csum_flags & TCP_CSUM)
>  		tcp_update_csum(psum, th, payload, dlen);
> +	else
> +		th->check = 0;
>
>  	return MAX(l3len + sizeof(struct ethhdr), ETH_ZLEN);
>  }
> diff --git a/tcp_buf.c b/tcp_buf.c
> index 27151854033c..a27d9733616c 100644
> --- a/tcp_buf.c
> +++ b/tcp_buf.c
> @@ -166,14 +166,15 @@ static void tcp_l2_buf_pad(struct iovec *iov)
>   * @c:		Execution context
>   * @conn:	Connection pointer
>   * @iov:	Pointer to an array of iovec of TCP pre-cooked buffers
> - * @check:	Checksum, if already known
> + * @csum_flags:	TCP_CSUM if TCP checksum must be computed,
> + *		IP4_CSUM if IPv4 checksum must be computed,
> + *		otherwise IPv4 checksum is provided in IP4_CMASK
>   * @seq:	Sequence number for this segment
> - * @no_tcp_csum: Do not set TCP checksum
>   */
>  static void tcp_l2_buf_fill_headers(const struct ctx *c,
>  				    struct tcp_tap_conn *conn,
> -				    struct iovec *iov, const uint16_t *check,
> -				    uint32_t seq, bool no_tcp_csum)
> +				    struct iovec *iov, uint32_t csum_flags,
> +				    uint32_t seq)
>  {
>  	struct iov_tail tail = IOV_TAIL(&iov[TCP_IOV_PAYLOAD], 1, 0);
>  	struct tcphdr th_storage, *th = IOV_REMOVE_HEADER(&tail, th_storage);
> @@ -191,8 +192,7 @@ static void tcp_l2_buf_fill_headers(const struct ctx *c,
>  		ip6h = iov[TCP_IOV_IP].iov_base;
>
>  	l2len = tcp_fill_headers(c, conn, eh, ip4h, ip6h, th, &tail,
> -				 iov_tail_size(&tail), check, seq,
> -				 no_tcp_csum);
> +				 iov_tail_size(&tail), csum_flags, seq);
>  	tap_hdr_update(taph, l2len);
>  }
>
> @@ -234,7 +234,7 @@ int tcp_buf_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
>  	if (flags & KEEPALIVE)
>  		seq--;
>
> -	tcp_l2_buf_fill_headers(c, conn, iov, NULL, seq, false);
> +	tcp_l2_buf_fill_headers(c, conn, iov, IP4_CSUM | TCP_CSUM, seq);
>
>  	tcp_l2_buf_pad(iov);
>
> @@ -271,7 +271,7 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
>  			    ssize_t dlen, int no_csum, uint32_t seq, bool push)
>  {
>  	struct tcp_payload_t *payload;
> -	const uint16_t *check = NULL;
> +	uint32_t check = IP4_CSUM;
>  	struct iovec *iov;
>
>  	conn->seq_to_tap = seq + dlen;
> @@ -280,9 +280,10 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
>  	if (CONN_V4(conn)) {
>  		if (no_csum) {
>  			struct iovec *iov_prev = tcp_l2_iov[tcp_payload_used - 1];
> -			struct iphdr *iph = iov_prev[TCP_IOV_IP].iov_base;
> +			const struct iphdr *iph = iov_prev[TCP_IOV_IP].iov_base;
>
> -			check = &iph->check;
> +			/* overwrite IP4_CSUM flag as we set the checksum */
> +			check = iph->check;

Subtle, but looks correct. Maybe a little better comment?

/jon

>  		}
>  		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp4_payload_ip[tcp_payload_used]);
>  	} else if (CONN_V6(conn)) {
> @@ -296,7 +297,7 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
>  	payload->th.ack = 1;
>  	payload->th.psh = push;
>  	iov[TCP_IOV_PAYLOAD].iov_len = dlen + sizeof(struct tcphdr);
> -	tcp_l2_buf_fill_headers(c, conn, iov, check, seq, false);
> +	tcp_l2_buf_fill_headers(c, conn, iov, TCP_CSUM | check, seq);
>
>  	tcp_l2_buf_pad(iov);
>
> diff --git a/tcp_internal.h b/tcp_internal.h
> index a0fa19f4ed11..40472c9973c8 100644
> --- a/tcp_internal.h
> +++ b/tcp_internal.h
> @@ -183,12 +183,15 @@ void tcp_rst_do(const struct ctx *c, struct tcp_tap_conn *conn);
>
>  struct tcp_info_linux;
>
> +#define IP4_CSUM		0x80000000
> +#define IP4_CMASK		0x0000FFFF
> +#define TCP_CSUM		0x40000000
> +
>  size_t tcp_fill_headers(const struct ctx *c, struct tcp_tap_conn *conn,
>  			struct ethhdr *eh,
>  			struct iphdr *ip4h, struct ipv6hdr *ip6h,
>  			struct tcphdr *th, struct iov_tail *payload,
> -			size_t dlen, const uint16_t *ip4_check, uint32_t seq,
> -			bool no_tcp_csum);
> +			size_t dlen, uint32_t csum_flags, uint32_t seq);
>
>  int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>  			  bool force_seq, struct tcp_info_linux *tinfo);
> diff --git a/tcp_vu.c b/tcp_vu.c
> index 2dfe14485eee..3e399c20f0d7 100644
> --- a/tcp_vu.c
> +++ b/tcp_vu.c
> @@ -134,7 +134,7 @@ int tcp_vu_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
>  		seq--;
>
>  	tcp_fill_headers(c, conn, eh, ip4h, ip6h, th, &payload,
> -			 optlen, NULL, seq, !*c->pcap);
> +			 optlen, IP4_CSUM | (*c->pcap ? TCP_CSUM : 0), seq);
>
>  	vu_pad(flags_elem[0].in_sg, 1, hdrlen + optlen);
>  	vu_flush(vdev, vq, flags_elem, 1, hdrlen + optlen);
> @@ -282,13 +282,15 @@ static ssize_t tcp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq,
>   * @iov:	Pointer to the array of IO vectors
>   * @iov_cnt:	Number of entries in @iov
>   * @dlen:	Data length
> - * @check:	Checksum, if already known
> - * @no_tcp_csum: Do not set TCP checksum
> + * @csum_flags:	Pointer to checksum flags (input/output)
> + *		TCP_CSUM if TCP checksum must be computed,
> + *		IP4_CSUM if IPv4 checksum must be computed,
> + *		otherwise IPv4 checksum is provided in IP4_CMASK
>   * @push:	Set PSH flag, last segment in a batch
>   */
>  static void tcp_vu_prepare(const struct ctx *c, struct tcp_tap_conn *conn,
>  			   struct iovec *iov, size_t iov_cnt, size_t dlen,
> -			   const uint16_t **check, bool no_tcp_csum, bool push)
> +			   uint32_t *csum_flags, bool push)
>  {
>  	const struct flowside *toside = TAPFLOW(conn);
>  	bool v6 = !(inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr));
> @@ -332,9 +334,11 @@ static void tcp_vu_prepare(const struct ctx *c, struct tcp_tap_conn *conn,
>  	th->psh = push;
>
>  	tcp_fill_headers(c, conn, eh, ip4h, ip6h, th, &payload, dlen,
> -			 *check, conn->seq_to_tap, no_tcp_csum);
> +			 *csum_flags, conn->seq_to_tap);
> +
> +	/* Preserve TCP_CSUM, overwrite IP4_CSUM as we set the checksum */
>  	if (ip4h)
> -		*check = &ip4h->check;
> +		*csum_flags = (*csum_flags & TCP_CSUM) | ip4h->check;
>  }
>
>  /**
> @@ -350,12 +354,11 @@ int tcp_vu_data_from_sock(const struct ctx *c, struct tcp_tap_conn *conn)
>  	uint32_t wnd_scaled = conn->wnd_from_tap << conn->ws_from_tap;
>  	struct vu_dev *vdev = c->vdev;
>  	struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE];
> +	uint32_t already_sent, check;
>  	ssize_t len, previous_dlen;
>  	int i, iov_cnt, head_cnt;
>  	size_t hdrlen, fillsize;
>  	int v6 = CONN_V6(conn);
> -	uint32_t already_sent;
> -	const uint16_t *check;
>
>  	if (!vu_queue_enabled(vq) || !vu_queue_started(vq)) {
>  		debug("Got packet, but RX virtqueue not usable yet");
> @@ -442,7 +445,10 @@ int tcp_vu_data_from_sock(const struct ctx *c, struct tcp_tap_conn *conn)
>  	 */
>
>  	hdrlen = tcp_vu_hdrlen(v6);
> -	for (i = 0, previous_dlen = -1, check = NULL; i < head_cnt; i++) {
> +	check = IP4_CSUM;
> +	if (*c->pcap)
> +		check |= TCP_CSUM;
> +	for (i = 0, previous_dlen = -1; i < head_cnt; i++) {
>  		struct iovec *iov = &elem[head[i]].in_sg[0];
>  		int buf_cnt = head[i + 1] - head[i];
>  		size_t frame_size = iov_size(iov, buf_cnt);
> @@ -458,10 +464,10 @@ int tcp_vu_data_from_sock(const struct ctx *c, struct tcp_tap_conn *conn)
>
>  		/* The IPv4 header checksum varies only with dlen */
>  		if (previous_dlen != dlen)
> -			check = NULL;
> +			check |= IP4_CSUM;
>  		previous_dlen = dlen;
>
> -		tcp_vu_prepare(c, conn, iov, buf_cnt, dlen, &check, !*c->pcap, push);
> +		tcp_vu_prepare(c, conn, iov, buf_cnt, dlen, &check, push);
>
>  		vu_pad(elem[head[i]].in_sg, buf_cnt, dlen + hdrlen);
>  		vu_flush(vdev, vq, &elem[head[i]], buf_cnt, dlen + hdrlen);
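
For readers following the review, the csum_flags encoding described in the
commit message can be sketched as a small standalone C example. The three
macro values are copied from the patch. Everything else is an assumption made
for illustration: csum_flags_cache(), ip4_check_from_flags() and
csum_flags_demo() are hypothetical helpers that do not exist in passt, the
"computed" argument stands in for a call to csum_ip4_header(), and 0x1c46 is
an arbitrary placeholder, not a real checksum.

```c
#include <assert.h>
#include <stdint.h>

/* Values as added to tcp_internal.h by the patch */
#define IP4_CSUM	0x80000000	/* bit 31: compute IPv4 header checksum */
#define IP4_CMASK	0x0000FFFF	/* low 16 bits: cached IPv4 checksum */
#define TCP_CSUM	0x40000000	/* bit 30: compute TCP checksum */

/* Hypothetical helper: keep TCP_CSUM, clear IP4_CSUM, and store an already
 * known IPv4 header checksum in the low 16 bits.
 */
static uint32_t csum_flags_cache(uint32_t flags, uint16_t check)
{
	return (flags & TCP_CSUM) | check;
}

/* Hypothetical helper mirroring the IPv4 branch of tcp_fill_headers():
 * @computed stands in for the result of csum_ip4_header().
 */
static uint16_t ip4_check_from_flags(uint32_t csum_flags, uint16_t computed)
{
	if (csum_flags & IP4_CSUM)
		return computed;

	return csum_flags & IP4_CMASK;
}

/* Two consecutive frames with the same payload length. Returns 1 if the
 * second frame picks up the checksum cached from the first one.
 */
int csum_flags_demo(void)
{
	/* Frame 1: nothing cached yet, so request both checksums */
	uint32_t flags = IP4_CSUM | TCP_CSUM;
	uint16_t check = ip4_check_from_flags(flags, 0x1c46);
	uint32_t plain;

	/* Frame 2: reuse the checksum, still compute the TCP checksum */
	flags = csum_flags_cache(flags, check);
	assert(!(flags & IP4_CSUM) && (flags & TCP_CSUM));

	/* Assigning a 16-bit value to a uint32_t zeroes bits 16 to 31,
	 * so IP4_CSUM is cleared without an explicit mask.
	 */
	plain = IP4_CSUM;
	plain = check;
	assert(!(plain & IP4_CSUM));

	return ip4_check_from_flags(flags, 0) == 0x1c46 &&
	       ip4_check_from_flags(TCP_CSUM | plain, 0) == 0x1c46;
}
```

The plain assignment near the end of csum_flags_demo() corresponds to the
line commented on in the review: storing the 16-bit checksum into the
uint32_t variable drops IP4_CSUM implicitly, which is why it is easy to miss
when reading the code.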