From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=U6k/hgQf; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTPS id 29F9E5A0271 for ; Fri, 10 Jan 2025 09:19:23 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1736497162; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:autocrypt:autocrypt; bh=mZ064QfcWKWotvp73PVJFveJDyLldLe9qAJZ7uBRokc=; b=U6k/hgQfqIeudn9nSx07wUF6iJi2Pbcw4DBoECaUNBLZ7pqDSGeXjBPym8TDl4LmdP5gGa SxsEXo+bWwfLWQOjt8NTenNi4UNG5lUwCIBhSUs3tjbE0fXR2Wc1gy8Kmena3rDCiTTa4F /TahcQ2oVq/02WyGhkRnlEBulNOLxn4= Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-259-nOobALF_Mgq-vnCaqj2a5A-1; Fri, 10 Jan 2025 03:19:20 -0500 X-MC-Unique: nOobALF_Mgq-vnCaqj2a5A-1 X-Mimecast-MFC-AGG-ID: nOobALF_Mgq-vnCaqj2a5A Received: by mail-wr1-f70.google.com with SMTP id ffacd0b85a97d-38a35a65575so1283508f8f.1 for ; Fri, 10 Jan 2025 00:19:20 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736497159; x=1737101959; h=content-transfer-encoding:in-reply-to:autocrypt:from :content-language:references:cc:to:subject:user-agent:mime-version :date:message-id:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=mZ064QfcWKWotvp73PVJFveJDyLldLe9qAJZ7uBRokc=; b=suoOgOzQiQniSCXT/FWfY8KSzpM4/CYiZfnPHAZM4UYp4RH66syVXEUlDJbe4GXvbl flxv/csPt/o5K5gj5OWQuOGPA5feNXOPi84rxXT8kf6IGzpYXjJtjrloSHK+FCvNpPcB Nic0tqEGeqFIIs4mUjHGNKM//MCDy97yMUjnP5ZPrUTHPPN+nFgA+b8ONMWscz75RIU6 g0LDlAEaAmzyu4hP14JW8CiESqB1cMZBITPEQ/iymDvkQOYKs+S2/hKeM8wLxtOb53IU VBUKA5l7IBNfSWntP4DVjAaUGAJTBga/1faQ7+9qsvwHefsWpbw4jXdKYxoGOjLyjB2g kkuQ== X-Gm-Message-State: AOJu0Yzr/Qpc7tZ/5CYIuRql5fNjAa5W6HWnDkbP52Gv5JnT21q/tPcx 0B5OFq5FNiFLwIJW6syuK6zz0vVv0fwv2DjeIWq2yW9zjXNmyO7hEJr8baQKT+ROAtzhNxKAyJW jFYInd4GTt3AADg03CXace2iE7AwDV1T3nPodYe7qQadL/zdqR04GxniFnQ== X-Gm-Gg: ASbGncsNGpzcjPet7ChTeX6G1M7knjWsnTIl8UDk6wnqMWCKq5shnwbssT2ucm3hfO2 ya1hM6WGEfcXU7aVpxFiAEMq82sasnYDRFT317VDPruf6h+kBKAG0qi/RDr7lAKL638BDYxEX1Q 1LXLrTgeHdU2zVXBYi3Gpae0tBm+KujYnKxUluz1KsnR9v8WhtC9Bqg4vU7IJ+Ru2wXkYvcquCF Byi3XrHi2PSvSvzuYgjU2HgSTBQBG8hP4uKoXiXnJYz+u4+ntsY+63Mbbae7bJkzEduST6uBcfG dw+CHvVetlToN37NJFI= X-Received: by 2002:a5d:64eb:0:b0:385:f349:fffb with SMTP id ffacd0b85a97d-38a8732c439mr9384221f8f.45.1736497159245; Fri, 10 Jan 2025 00:19:19 -0800 (PST) X-Google-Smtp-Source: AGHT+IE06t/Xo0Sy929XDfhB1pxwcBbGJ1H0SnTb1g6iLJiGssBx7mpV/49WDv8TlDugqvc1E+59Sg== X-Received: by 2002:a5d:64eb:0:b0:385:f349:fffb with SMTP id ffacd0b85a97d-38a8732c439mr9384195f8f.45.1736497158858; Fri, 10 Jan 2025 00:19:18 -0800 (PST) Received: from ?IPV6:2a01:e0a:e10:ef90:343a:68f:2e91:95c? ([2a01:e0a:e10:ef90:343a:68f:2e91:95c]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-436e2e8a326sm79379935e9.35.2025.01.10.00.19.18 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 10 Jan 2025 00:19:18 -0800 (PST) Message-ID: <2e4def34-e7ac-49f7-92f7-b88e494fb205@redhat.com> Date: Fri, 10 Jan 2025 09:19:17 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] checksum: fix checksum with odd base address To: David Gibson References: <20250109130648.326933-1-lvivier@redhat.com> From: Laurent Vivier Autocrypt: addr=lvivier@redhat.com; keydata= xsFNBFYFJhkBEAC2me7w2+RizYOKZM+vZCx69GTewOwqzHrrHSG07MUAxJ6AY29/+HYf6EY2 WoeuLWDmXE7A3oJoIsRecD6BXHTb0OYS20lS608anr3B0xn5g0BX7es9Mw+hV/pL+63EOCVm SUVTEQwbGQN62guOKnJJJfphbbv82glIC/Ei4Ky8BwZkUuXd7d5NFJKC9/GDrbWdj75cDNQx UZ9XXbXEKY9MHX83Uy7JFoiFDMOVHn55HnncflUncO0zDzY7CxFeQFwYRbsCXOUL9yBtqLer Ky8/yjBskIlNrp0uQSt9LMoMsdSjYLYhvk1StsNPg74+s4u0Q6z45+l8RAsgLw5OLtTa+ePM JyS7OIGNYxAX6eZk1+91a6tnqfyPcMbduxyBaYXn94HUG162BeuyBkbNoIDkB7pCByed1A7q q9/FbuTDwgVGVLYthYSfTtN0Y60OgNkWCMtFwKxRaXt1WFA5ceqinN/XkgA+vf2Ch72zBkJL RBIhfOPFv5f2Hkkj0MvsUXpOWaOjatiu0fpPo6Hw14UEpywke1zN4NKubApQOlNKZZC4hu6/ 8pv2t4HRi7s0K88jQYBRPObjrN5+owtI51xMaYzvPitHQ2053LmgsOdN9EKOqZeHAYG2SmRW LOxYWKX14YkZI5j/TXfKlTpwSMvXho+efN4kgFvFmP6WT+tPnwARAQABzSNMYXVyZW50IFZp dmllciA8bHZpdmllckByZWRoYXQuY29tPsLBeAQTAQIAIgUCVgVQgAIbAwYLCQgHAwIGFQgC CQoLBBYCAwECHgECF4AACgkQ8ww4vT8vvjwpgg//fSGy0Rs/t8cPFuzoY1cex4limJQfReLr SJXCANg9NOWy/bFK5wunj+h/RCFxIFhZcyXveurkBwYikDPUrBoBRoOJY/BHK0iZo7/WQkur 6H5losVZtrotmKOGnP/lJYZ3H6OWvXzdz8LL5hb3TvGOP68K8Bn8UsIaZJoeiKhaNR0sOJyI YYbgFQPWMHfVwHD/U+/gqRhD7apVysxv5by/pKDln1I5v0cRRH6hd8M8oXgKhF2+rAOL7gvh jEHSSWKUlMjC7YwwjSZmUkL+TQyE18e2XBk85X8Da3FznrLiHZFHQ/NzETYxRjnOzD7/kOVy gKD/o7asyWQVU65mh/ECrtjfhtCBSYmIIVkopoLaVJ/kEbVJQegT2P6NgERC/31kmTF69vn8 uQyW11Hk8tyubicByL3/XVBrq4jZdJW3cePNJbTNaT0d/bjMg5zCWHbMErUib2Nellnbg6bc 2HLDe0NLVPuRZhHUHM9hO/JNnHfvgiRQDh6loNOUnm9Iw2YiVgZNnT4soUehMZ7au8PwSl4I KYE4ulJ8RRiydN7fES3IZWmOPlyskp1QMQBD/w16o+lEtY6HSFEzsK3o0vuBRBVp2WKnssVH qeeV01ZHw0bvWKjxVNOksP98eJfWLfV9l9e7s6TaAeySKRRubtJ+21PRuYAxKsaueBfUE7ZT 7zfOwU0EVgUmGQEQALxSQRbl/QOnmssVDxWhHM5TGxl7oLNJms2zmBpcmlrIsn8nNz0rRyxT 460k2niaTwowSRK8KWVDeAW6ZAaWiYjLlTunoKwvF8vP3JyWpBz0diTxL5o+xpvy/Q6YU3BN efdq8Vy3rFsxgW7mMSrI/CxJ667y8ot5DVugeS2NyHfmZlPGE0Nsy7hlebS4liisXOrN3jFz asKyUws3VXek4V65lHwB23BVzsnFMn/bw/rPliqXGcwl8CoJu8dSyrCcd1Ibs0/Inq9S9+t0 VmWiQWfQkz4rvEeTQkp/VfgZ6z98JRW7S6l6eophoWs0/ZyRfOm+QVSqRfFZdxdP2PlGeIFM C3fXJgygXJkFPyWkVElr76JTbtSHsGWbt6xUlYHKXWo+xf9WgtLeby3cfSkEchACrxDrQpj+ Jt/JFP+q997dybkyZ5IoHWuPkn7uZGBrKIHmBunTco1+cKSuRiSCYpBIXZMHCzPgVDjk4viP brV9NwRkmaOxVvye0vctJeWvJ6KA7NoAURplIGCqkCRwg0MmLrfoZnK/gRqVJ/f6adhU1oo6 z4p2/z3PemA0C0ANatgHgBb90cd16AUxpdEQmOCmdNnNJF/3Zt3inzF+NFzHoM5Vwq6rc1JP jfC3oqRLJzqAEHBDjQFlqNR3IFCIAo4SYQRBdAHBCzkM4rWyRhuVABEBAAHCwV8EGAECAAkF AlYFJhkCGwwACgkQ8ww4vT8vvjwg9w//VQrcnVg3TsjEybxDEUBm8dBmnKqcnTBFmxN5FFtI WlEuY8+YMiWRykd8Ln9RJ/98/ghABHz9TN8TRo2b6WimV64FmlVn17Ri6FgFU3xNt9TTEChq AcNg88eYryKsYpFwegGpwUlaUaaGh1m9OrTzcQy+klVfZWaVJ9Nw0keoGRGb8j4XjVpL8+2x OhXKrM1fzzb8JtAuSbuzZSQPDwQEI5CKKxp7zf76J21YeRrEW4WDznPyVcDTa+tz++q2S/Bp P4W98bXCBIuQgs2m+OflERv5c3Ojldp04/S4NEjXEYRWdiCxN7ca5iPml5gLtuvhJMSy36gl U6IW9kn30IWuSoBpTkgV7rLUEhh9Ms82VWW/h2TxL8enfx40PrfbDtWwqRID3WY8jLrjKfTd R3LW8BnUDNkG+c4FzvvGUs8AvuqxxyHbXAfDx9o/jXfPHVRmJVhSmd+hC3mcQ+4iX5bBPBPM oDqSoLt5w9GoQQ6gDVP2ZjTWqwSRMLzNr37rJjZ1pt0DCMMTbiYIUcrhX8eveCJtY7NGWNyx FCRkhxRuGcpwPmRVDwOl39MB3iTsRighiMnijkbLXiKoJ5CDVvX5yicNqYJPKh5MFXN1bvsB kmYiStMRbrD0HoY1kx5/VozBtc70OU0EB8Wrv9hZD+Ofp0T3KOr1RUHvCZoLURfFhSQ= In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: QCquD0HXNz7Pw_3Fo6_IU4KjtJ95SaxBt5WkohgrUIw_1736497159 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Message-ID-Hash: 7P4RHP6IYSFEPZMMHISUWWZG6VP7GMPV X-Message-ID-Hash: 7P4RHP6IYSFEPZMMHISUWWZG6VP7GMPV X-MailFrom: lvivier@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On 10/01/2025 03:40, David Gibson wrote: > On Thu, Jan 09, 2025 at 02:06:48PM +0100, Laurent Vivier wrote: >> csum_unfolded() must call csum_avx2() with a 32byte aligned base address. >> >> To be able to do that if the buffer is not correctly aligned, >> it splits the buffers in 2 parts, the second part is 32byte aligned and >> can be used with csum_avx2(), the first part is the remaining part, that >> is not 32byte aligned and we use sum_16b() to compute the checksum. >> >> A problem appears if the length of the first part is odd because >> the checksum is using 16bit words to do the checksum. >> >> If the length is odd, when the second part is computed, all words are >> shifted by 1 byte, meaning weight of upper and lower byte is swapped. >> >> For instance a 13 bytes buffer: >> >> bytes: >> >> aa AA bb BB cc CC dd DD ee EE ff FF gg >> >> 16bit words: >> >> AAaa BBbb CCcc DDdd EEee FFff 00gg >> >> If we don't split the sequence, the checksum is: >> >> AAaa + BBbb + CCcc + DDdd + EEee + FFff + 00gg >> >> If we split the sequence with an even length for the first part: >> >> (AAaa + BBbb) + (CCcc + DDdd + EEee + FFff + 00gg) >> >> But if the first part has an odd length: >> >> (AAaa + BBbb + 00cc) + (ddCC + eeDD + ffEE + ggFF) >> >> To avoid the problem, do not call csum_avx2() if the first part cannot >> have an even length, and compute the checksum of all the buffer using >> sum_16b(). >> >> This is slower but it can only happen if the buffer base address is odd, >> and this can only happen if the binary is built using '-Os', and that >> means we have chosen to prioritize size over speed. >> >> Link: https://bugs.passt.top/show_bug.cgi?id=108 >> Signed-off-by: Laurent Vivier > > Reviewed-by: David Gibson > > In that it's a real bug and we need to fix it quickly. > > That said, I think we can do a bit better long term: I believe it > should be possible to correct the value from the of-by-one > csum_avx2(), I think with just an unconditional byteswap. The TCP/UDP > checksum has the curious property that it doesn't matter if you > compute it big-endian or little-endian, as long as you're consistent. > We already rely on this. Having one odd byte piece essentially means > we're using inconsistent endianness between the two pieces. > Yes, I spent my afternoon trying to understand that, but we must use same endianness between sum_16b() and csum_avx2(), and I found this: diff --git a/checksum.c b/checksum.c index 1c4354d35734..0543e86b0e67 100644 --- a/checksum.c +++ b/checksum.c @@ -458,8 +458,13 @@ uint32_t csum_unfolded(const void *buf, size_t len, uint32_t init) if (pad) init += sum_16b(buf, pad); - if (len > pad) - init = csum_avx2((void *)align, len - pad, init); + if (len > pad) { + if (pad & 1) + init = __bswap_32(csum_avx2((void *)align, len - pad, + __bswap_32(init))); + else + init = csum_avx2((void *)align, len - pad, init); + } return init; } Thanks, Laurent