On Fri, Feb 02, 2024 at 03:11:30PM +0100, Laurent Vivier wrote:
> if buffer is not aligned use sum_16b() only on the not aligned
> part, and then use csum() on the remaining part
>
> Signed-off-by: Laurent Vivier

Reviewed-by: David Gibson

> ---
>  checksum.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/checksum.c b/checksum.c
> index f21c9b7a14d1..c94980771c63 100644
> --- a/checksum.c
> +++ b/checksum.c
> @@ -407,7 +407,19 @@ less_than_128_bytes:
>  __attribute__((optimize("-fno-strict-aliasing")))	/* See csum_16b() */
>  uint16_t csum(const void *buf, size_t len, uint32_t init)
>  {
> -	return (uint16_t)~csum_fold(csum_avx2(buf, len, init));
> +	intptr_t align = ((intptr_t)buf + 0x1f) & ~(intptr_t)0x1f;

Wonder if it's worth adding an ALIGN_UP macro.

> +	unsigned int pad = align - (intptr_t)buf;
> +
> +	if (len < pad)
> +		pad = len;
> +
> +	if (pad)
> +		init += sum_16b(buf, pad);
> +
> +	if (len > pad)
> +		init = csum_avx2((void *)align, len - pad, init);
> +
> +	return (uint16_t)~csum_fold(init);
>  }
>
>  #else /* __AVX2__ */

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
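
For illustration only, the ALIGN_UP idea raised in the review comment above might look roughly like the sketch below. Neither the macro nor the unaligned_head() helper is part of the posted patch; both names are hypothetical. The sketch also assumes the alignment is a power of two and uses uintptr_t for the address arithmetic, where the patch uses intptr_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the posted patch: round x up to the
 * next multiple of a. Only valid when a is a power of two.
 */
#define ALIGN_UP(x, a)	(((x) + ((a) - 1)) & ~((uintptr_t)(a) - 1))

/* Hypothetical helper: number of bytes at the start of buf that come
 * before the first 32-byte boundary, capped at len. This mirrors the
 * 'pad' computation in the quoted csum() hunk.
 */
static size_t unaligned_head(const void *buf, size_t len)
{
	uintptr_t start = (uintptr_t)buf;
	size_t pad = ALIGN_UP(start, 32) - start;

	return pad < len ? pad : len;
}
```

With something like this, the first line of the new csum() body could in principle be written in terms of ALIGN_UP(), with the rest of the function unchanged. Whether that is worth a macro for a single user is the open question.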