From: Laurent Vivier
To: Stefano Brivio, David Gibson
CC: passt-dev@passt.top
Date: Tue, 22 Oct 2024 14:59:19 +0200
Subject: Re: [PATCH v8 7/8] vhost-user: add vhost-user
In-Reply-To: <20241017021031.1adb421e@elisabeth>
References: <20241010122903.1188992-1-lvivier@redhat.com> <20241010122903.1188992-8-lvivier@redhat.com> <20241015215438.1595b4d7@elisabeth> <20241017021031.1adb421e@elisabeth>
List-Id: Development discussion and patches for passt

On 17/10/2024 02:10, Stefano Brivio wrote:
> On Wed, 16 Oct 2024 11:41:34 +1100
> David Gibson wrote:
>
>> On Tue, Oct 15, 2024 at 09:54:38PM +0200, Stefano Brivio wrote:
>>> [Still partial review]
>> [snip]
>>>> +	if (peek_offset_cap)
>>>> +		already_sent = 0;
>>>> +
>>>> +	iov_vu[0].iov_base = tcp_buf_discard;
>>>> +	iov_vu[0].iov_len = already_sent;
>>>
>>> I think I had a similar comment to a previous revision.
>>> Now, I haven't tested this (yet) on a kernel with support for
>>> SO_PEEK_OFF on TCP, but I think this should eventually follow the
>>> same logic as the (updated) tcp_buf_data_from_sock(): we should use
>>> tcp_buf_discard only if (!peek_offset_cap).
>>>
>>> It's fine to always initialise VIRTQUEUE_MAX_SIZE iov_vu items,
>>> starting from 1, for simplicity. But I'm not sure if it's safe to pass a
>>> zero iov_len if (peek_offset_cap).
>>
>>> I'll test that (unless you already did) -- if it works, we can fix this
>>> up later as well.
>>
>> I believe I tested it at some point, and I think we're already using
>> it somewhere.
>
> I tested it again just to be sure on a recent net.git kernel: sometimes
> the first test in passt_vu_in_ns/tcp, "TCP/IPv4: host to guest: big
> transfer" hangs on my setup, sometimes it's the "TCP/IPv4: ns to guest
> (using loopback address): big transfer" test instead.
>
> I can reproduce at least one of the two issues consistently (tests
> stopped 5 times out of 5).
>
> The socat client completes the transfer, the server is still waiting
> for something. I haven't taken captures yet or tried to re-send from
> the client.
>
> It all works (consistently) with an older kernel without support for
> SO_PEEK_OFF on TCP, but also on this kernel if I force peek_offset_cap
> to false in tcp_init().
>

I have a fix for that, but there is an error I don't understand: when I run
the test twice, the second time I get:

guest:
  # socat -u TCP4-LISTEN:10001 OPEN:test_big.bin,create,trunc
  # socat -u TCP4-LISTEN:10001 OPEN:test_big.bin,create,trunc
  2024/10/22 08:51:58 socat[1485] E bind(5, {AF=2 0.0.0.0:10001}, 16): Address already in use

host:
  $ socat -u OPEN:test/big.bin TCP4:127.0.0.1:10001

If I wait a little, it works again several times and then fails again.

Any idea?

The patch is:

diff --git a/tcp_vu.c b/tcp_vu.c
index 78884c673215..83e40fb07a03 100644
--- a/tcp_vu.c
+++ b/tcp_vu.c
@@ -379,6 +379,10 @@ int tcp_vu_data_from_sock(const struct ctx *c, struct tcp_tap_conn *conn)
 			   conn->seq_ack_from_tap, conn->seq_to_tap);
 		conn->seq_to_tap = conn->seq_ack_from_tap;
 		already_sent = 0;
+		if (tcp_set_peek_offset(conn->sock, 0)) {
+			tcp_rst(c, conn);
+			return -1;
+		}
 	}
 
 	if (!wnd_scaled || already_sent >= wnd_scaled) {
@@ -389,14 +393,13 @@ int tcp_vu_data_from_sock(const struct ctx *c, struct tcp_tap_conn *conn)
 
 	/* Set up buffer descriptors we'll fill completely and partially. */
 
-	fillsize = wnd_scaled;
+	fillsize = wnd_scaled - already_sent;
 
 	if (peek_offset_cap)
 		already_sent = 0;
 
 	iov_vu[0].iov_base = tcp_buf_discard;
 	iov_vu[0].iov_len = already_sent;
-	fillsize -= already_sent;
 
 	/* collect the buffers from vhost-user and fill them with the
 	 * data from the socket
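
For reference, the discard-buffer approach discussed in this thread (the path taken when SO_PEEK_OFF is not available) can be sketched roughly as below. This is only an illustration, not passt code: peek_unsent() and demo_peek() are hypothetical names, and the demo uses an AF_UNIX socket pair rather than a TCP socket so that it runs without any network setup. The first iovec entry absorbs the bytes that were already sent, and the second one receives at most "window minus already sent" bytes, which mirrors the fillsize computation in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not from passt): peek the data queued on socket s
 * that follows the first already_sent bytes, limited to the window wnd.
 *
 * Without SO_PEEK_OFF, MSG_PEEK always starts at the head of the receive
 * queue, so the bytes that were already sent land in a scratch buffer.
 * Returns the number of new bytes copied to out, or -1 on error.
 */
static ssize_t peek_unsent(int s, size_t already_sent, size_t wnd,
			   char *discard, char *out)
{
	struct msghdr mh = { 0 };
	struct iovec iov[2];
	ssize_t n;

	assert(already_sent <= wnd);

	iov[0].iov_base = discard;		/* skipped, already sent */
	iov[0].iov_len = already_sent;
	iov[1].iov_base = out;			/* new data to forward */
	iov[1].iov_len = wnd - already_sent;	/* same idea as fillsize */

	mh.msg_iov = iov;
	mh.msg_iovlen = 2;

	n = recvmsg(s, &mh, MSG_PEEK | MSG_DONTWAIT);
	if (n < 0)
		return -1;

	return (size_t)n > already_sent ? n - (ssize_t)already_sent : 0;
}

/* Demo: queue 8 bytes, pretend 3 were already sent, peek the other 5.
 * Returns 0 if the peeked data is the expected tail, -1 otherwise.
 */
static int demo_peek(void)
{
	char discard[16], out[16];
	int sv[2], ret = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return -1;

	if (write(sv[0], "abcdefgh", 8) == 8 &&
	    peek_unsent(sv[1], 3, 8, discard, out) == 5 &&
	    !memcmp(out, "defgh", 5))
		ret = 0;

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

With SO_PEEK_OFF, the kernel would skip the already sent bytes itself, so the scratch entry would have a zero length, which is the case questioned in the review above. The sketch deliberately does not exercise that case.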