Date: Mon, 27 Jan 2025 11:27:12 +0100
From: Stefano Brivio
To: Eric Dumazet
Subject: Re: [net,v2] tcp: correct handling of extreme memory squeeze
Message-ID: <20250127112712.50bb6341@elisabeth>
References: <20250117214035.2414668-1-jmaloy@redhat.com> <20250127110121.1f53b27d@elisabeth>
Organization: Red Hat
CC: Jon Maloy, Neal Cardwell, netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org, passt-dev@passt.top, lvivier@redhat.com, dgibson@redhat.com, eric.dumazet@gmail.com, Menglong Dong
List-Id: Development discussion and patches for passt

On Mon, 27 Jan 2025 11:06:07 +0100
Eric Dumazet wrote:

> On Mon, Jan 27, 2025 at 11:01 AM Stefano Brivio wrote:
> >
> > On Fri, 24 Jan 2025 12:40:16 -0500
> > Jon Maloy wrote:
> >
> > > I can certainly clear tp->pred_flags and post it again, maybe with
> > > an improved and shortened log. Would that be acceptable?
> >
> > Talking about an improved log, what strikes me the most of the whole
> > problem is:
> >
> > $ tshark -r iperf3_jon_zero_window.pcap -td -Y 'frame.number in { 1064 .. 1068 }'
> >  1064   0.004416 192.168.122.1 → 192.168.122.198 TCP 65534 34482 → 5201 [ACK] Seq=1611679466 Ack=1 Win=36864 Len=65480
> >  1065   0.007334 192.168.122.1 → 192.168.122.198 TCP 65534 34482 → 5201 [ACK] Seq=1611744946 Ack=1 Win=36864 Len=65480
> >  1066   0.005104 192.168.122.1 → 192.168.122.198 TCP 56382 [TCP Window Full] 34482 → 5201 [ACK] Seq=1611810426 Ack=1 Win=36864 Len=56328
> >  1067   0.015226 192.168.122.198 → 192.168.122.1 TCP 54 [TCP ZeroWindow] 5201 → 34482 [ACK] Seq=1 Ack=1611090146 Win=0 Len=0
> >  1068   6.298138 fe80::44b3:f5ff:fe86:c529 → ff02::2 ICMPv6 70 Router Solicitation from 46:b3:f5:86:c5:29
> >
> > ...and then the silence, 192.168.122.198 never announces that its
> > window is not zero, so the peer gives up 15 seconds later:
> >
> > $ tshark -r iperf3_jon_zero_window_cut.pcap -td -Y 'frame.number in { 1069 .. 1070 }'
> >  1069   8.709313 192.168.122.1 → 192.168.122.198 TCP 55 34466 → 5201 [ACK] Seq=166 Ack=5 Win=36864 Len=1
> >  1070   0.008943 192.168.122.198 → 192.168.122.1 TCP 54 5201 → 34482 [FIN, ACK] Seq=1 Ack=1611090146 Win=778240 Len=0
> >
> > Data in frame #1069 is iperf3 ending the test.
> >
> > This didn't happen before e2142825c120 ("net: tcp: send zero-window
> > ACK when no memory") so it's a relatively recent (17 months) regression.
> >
> > It actually looks pretty simple (and rather serious) to me.
>
> With all that, it should be pretty easy to cook a packetdrill test, right?

Not really :( because to reproduce this exact condition you need to
somehow get the right amount of memory pressure so that you can
actually establish a connection, start the transfer, and then exhaust
the receive buffer at the right moment.

And packetdrill doesn't do that. Sure, it would be great if it did, and
it's probably a nice feature to implement... given enough time. Given
less time, I guess fixing regressions has a higher priority.

One could perhaps tweak sk->sk_rcvbuf as you suggested but that just
artificially reproduces one part of it. It's not a really fitting test.
For example: when would you increase it back?

> packetdrill tests are part of tools/testing/selftests/net/ already, we
> are not asking for something unreasonable.

I would agree, in general, except that I don't see a way to craft a
test like this with packetdrill. At least not trivially with the
current feature set.

On top of that, this is not a new feature, it's a fix for a regression
(that was introduced without adding any test, of course). And the fix
itself was definitely tested, just not with packetdrill.

Requesting that tests are 1. automated and 2. written with a specific
tool is something I can quite understand for general convenience, but I
don't think it always makes sense. Especially as this fix has been
blocked for about 9 months now because of the fact that automating a
test for it is quite hard.

-- 
Stefano