From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	by passt.top (Postfix) with ESMTP id 335405A027B
	for <passt-dev@passt.top>; Wed, 14 Feb 2024 09:56:32 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1707900991;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=2yHb0HVare9a8oHBKPAwPPFHETXxPK5RHqGO4rWH6Xk=;
	b=VY/BX7D/JwB/iOfA0SF3Ci8xnkJgp/165SeqD/VGz0FcKzJF/zQtJW2n2uX2F51xVIxKzu
	33aeloi6VJOPW2SiIgt6PdJ/5bBGqIdn4tFNPc5Z+jEV2431LvJZVVR5Veu87ZBSUbLrTJ
	A0fM3Bpa2xNkwEJ4d7nlmQg9shUOS0I=
Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com
 [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id
 us-mta-684-KNAPOIqCNJydHAyU-LZe4g-1; Wed, 14 Feb 2024 03:56:29 -0500
X-MC-Unique: KNAPOIqCNJydHAyU-LZe4g-1
Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9BB9D84B04A
	for <passt-dev@passt.top>; Wed, 14 Feb 2024 08:56:29 +0000 (UTC)
Received: from virtlab218.virt.lab.eng.bos.redhat.com (virtlab218.virt.lab.eng.bos.redhat.com [10.19.152.190])
	by smtp.corp.redhat.com (Postfix) with ESMTP id 81403112131D;
	Wed, 14 Feb 2024 08:56:29 +0000 (UTC)
From: Laurent Vivier <lvivier@redhat.com>
To: passt-dev@passt.top
Subject: [PATCH v2 3/8] checksum: align buffers
Date: Wed, 14 Feb 2024 09:56:23 +0100
Message-ID: <20240214085628.210783-4-lvivier@redhat.com>
In-Reply-To: <20240214085628.210783-1-lvivier@redhat.com>
References: <20240214085628.210783-1-lvivier@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="US-ASCII"; x-default=true
Message-ID-Hash: CXI3XPPXK7S6JUSGJD2DJNAUAFQ3GEBD
X-Message-ID-Hash: CXI3XPPXK7S6JUSGJD2DJNAUAFQ3GEBD
X-MailFrom: lvivier@redhat.com
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: Laurent Vivier <lvivier@redhat.com>
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches for passt <passt-dev.passt.top>
Archived-At: <https://archives.passt.top/passt-dev/20240214085628.210783-4-lvivier@redhat.com/>
Archived-At: <https://passt.top/hyperkitty/list/passt-dev@passt.top/message/CXI3XPPXK7S6JUSGJD2DJNAUAFQ3GEBD/>
List-Archive: <https://archives.passt.top/passt-dev/>
List-Archive: <https://passt.top/hyperkitty/list/passt-dev@passt.top/>
List-Help: <mailto:passt-dev-request@passt.top?subject=help>
List-Owner: <mailto:passt-dev-owner@passt.top>
List-Post: <mailto:passt-dev@passt.top>
List-Subscribe: <mailto:passt-dev-join@passt.top>
List-Unsubscribe: <mailto:passt-dev-leave@passt.top>

If the buffer is not aligned to a 32-byte boundary, use sum_16b() only
on the unaligned leading part, then use csum_avx2() on the remaining,
aligned part.

Remove csum_unaligned(), which is no longer needed: callers now use
csum() instead.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---

Notes:
    v2:
      - use ROUND_UP() and sizeof(__m256i)
      - fix function comment
      - remove csum_unaligned() and use csum() instead
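
For reviewers, the head/remainder split described in the commit message can
be illustrated with a small portable sketch. This is not the code in the
patch: ALIGNMENT, sum16_sketch(), fold_sketch(), csum_split_sketch() and
csum_plain_sketch() are hypothetical names, and a plain 16-bit word loop
stands in for csum_avx2(). The sketch assumes the buffer starts at an even
address, since an odd-length head would shift the byte pairing of the
remainder, and it assumes inputs small enough that the 32-bit accumulator
cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIGNMENT 32	/* assumed to match sizeof(__m256i) */

/* Sum native-endian 16-bit words; an odd trailing byte is zero-padded */
static uint32_t sum16_sketch(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t sum = 0;
	uint16_t w;

	for (; len > 1; p += 2, len -= 2) {
		memcpy(&w, p, sizeof(w));	/* no alignment requirement */
		sum += w;
	}
	if (len) {
		w = 0;
		memcpy(&w, p, 1);
		sum += w;
	}
	return sum;
}

/* Fold the carries of a 32-bit sum back into 16 bits */
static uint16_t fold_sketch(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Reference: checksum the whole buffer in one pass */
static uint16_t csum_plain_sketch(const void *buf, size_t len, uint32_t init)
{
	return (uint16_t)~fold_sketch(sum16_sketch(buf, len) + init);
}

/* Split: sum the unaligned head first, then the aligned remainder */
static uint16_t csum_split_sketch(const void *buf, size_t len, uint32_t init)
{
	uintptr_t start = (uintptr_t)buf;
	uintptr_t aligned = (start + ALIGNMENT - 1) & ~(uintptr_t)(ALIGNMENT - 1);
	size_t pad = aligned - start;	/* bytes before the next boundary */

	if (pad > len)
		pad = len;

	/* Sketch assumption: the head is even unless it is the whole buffer */
	assert(pad == len || pad % 2 == 0);

	init += sum16_sketch(buf, pad);
	if (len > pad)	/* stand-in for the vectorised routine */
		init += sum16_sketch((const uint8_t *)buf + pad, len - pad);

	return (uint16_t)~fold_sketch(init);
}
```

Under those assumptions, csum_split_sketch() and csum_plain_sketch() return
the same value for any even starting offset, because splitting a run of
16-bit words at a word boundary does not change their sum.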

 checksum.c | 47 ++++++++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/checksum.c b/checksum.c
index f21c9b7a14d1..65486b4625ba 100644
--- a/checksum.c
+++ b/checksum.c
@@ -56,6 +56,8 @@
 #include <linux/udp.h>
 #include <linux/icmpv6.h>
 
+#include "util.h"
+
 /* Checksums are optional for UDP over IPv4, so we usually just set
  * them to 0.  Change this to 1 to calculate real UDP over IPv4
  * checksums
@@ -110,20 +112,7 @@ uint16_t csum_fold(uint32_t sum)
 	return sum;
 }
 
-/**
- * csum_unaligned() - Compute TCP/IP-style checksum for not 32-byte aligned data
- * @buf:	Input data
- * @len:	Input length
- * @init:	Initial 32-bit checksum, 0 for no pre-computed checksum
- *
- * Return: 16-bit IPv4-style checksum
- */
-/* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
-__attribute__((optimize("-fno-strict-aliasing")))	/* See csum_16b() */
-uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init)
-{
-	return (uint16_t)~csum_fold(sum_16b(buf, len) + init);
-}
+uint16_t csum(const void *buf, size_t len, uint32_t init);
 
 /**
  * csum_ip4_header() - Calculate and set IPv4 header checksum
@@ -132,7 +121,7 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init)
 void csum_ip4_header(struct iphdr *ip4h)
 {
 	ip4h->check = 0;
-	ip4h->check = csum_unaligned(ip4h, (size_t)ip4h->ihl * 4, 0);
+	ip4h->check = csum(ip4h, (size_t)ip4h->ihl * 4, 0);
 }
 
 /**
@@ -159,7 +148,7 @@ void csum_udp4(struct udphdr *udp4hr,
 			+ htons(IPPROTO_UDP);
 		/* Add in partial checksum for the UDP header alone */
 		psum += sum_16b(udp4hr, sizeof(*udp4hr));
-		udp4hr->check = csum_unaligned(payload, len, psum);
+		udp4hr->check = csum(payload, len, psum);
 	}
 }
 
@@ -178,7 +167,7 @@ void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)
 	/* Partial checksum for ICMP header alone */
 	psum = sum_16b(icmp4hr, sizeof(*icmp4hr));
 
-	icmp4hr->checksum = csum_unaligned(payload, len, psum);
+	icmp4hr->checksum = csum(payload, len, psum);
 }
 
 /**
@@ -199,7 +188,7 @@ void csum_udp6(struct udphdr *udp6hr,
 	udp6hr->check = 0;
 	/* Add in partial checksum for the UDP header alone */
 	psum += sum_16b(udp6hr, sizeof(*udp6hr));
-	udp6hr->check = csum_unaligned(payload, len, psum);
+	udp6hr->check = csum(payload, len, psum);
 }
 
 /**
@@ -222,7 +211,7 @@ void csum_icmp6(struct icmp6hdr *icmp6hr,
 	icmp6hr->icmp6_cksum = 0;
 	/* Add in partial checksum for the ICMPv6 header alone */
 	psum += sum_16b(icmp6hr, sizeof(*icmp6hr));
-	icmp6hr->icmp6_cksum = csum_unaligned(payload, len, psum);
+	icmp6hr->icmp6_cksum = csum(payload, len, psum);
 }
 
 #ifdef __AVX2__
@@ -397,17 +386,29 @@ less_than_128_bytes:
 
 /**
  * csum() - Compute TCP/IP-style checksum
- * @buf:	Input buffer, must be aligned to 32-byte boundary
+ * @buf:	Input buffer
  * @len:	Input length
  * @init:	Initial 32-bit checksum, 0 for no pre-computed checksum
  *
- * Return: 16-bit folded, complemented checksum sum
+ * Return: 16-bit folded, complemented checksum
  */
 /* NOLINTNEXTLINE(clang-diagnostic-unknown-attributes) */
 __attribute__((optimize("-fno-strict-aliasing")))	/* See csum_16b() */
 uint16_t csum(const void *buf, size_t len, uint32_t init)
 {
-	return (uint16_t)~csum_fold(csum_avx2(buf, len, init));
+	intptr_t align = ROUND_UP((intptr_t)buf, sizeof(__m256i));
+	unsigned int pad = align - (intptr_t)buf;
+
+	if (len < pad)
+		pad = len;
+
+	if (pad)
+		init += sum_16b(buf, pad);
+
+	if (len > pad)
+		init = csum_avx2((void *)align, len - pad, init);
+
+	return (uint16_t)~csum_fold(init);
 }
 
 #else /* __AVX2__ */
@@ -424,7 +425,7 @@ uint16_t csum(const void *buf, size_t len, uint32_t init)
 __attribute__((optimize("-fno-strict-aliasing")))	/* See csum_16b() */
 uint16_t csum(const void *buf, size_t len, uint32_t init)
 {
-	return csum_unaligned(buf, len, init);
+	return (uint16_t)~csum_fold(sum_16b(buf, len) + init);
 }
 
 #endif /* !__AVX2__ */
-- 
2.42.0