Date: Mon, 15 Jul 2024 10:34:23 +1000
From: David Gibson
To: Jon Maloy
CC: passt-dev@passt.top, sbrivio@redhat.com, lvivier@redhat.com, dgibson@redhat.com
Subject: Re: [PATCH v9 2/2] tcp: handle shrunk window advertisemenst from guest
References: <20240712190450.1261907-1-jmaloy@redhat.com> <20240712190450.1261907-3-jmaloy@redhat.com>
In-Reply-To: <20240712190450.1261907-3-jmaloy@redhat.com>
List-Id: Development discussion and patches for passt

On Fri, Jul 12, 2024 at 03:04:50PM -0400, Jon Maloy wrote:
> A bug in kernel TCP may lead to a deadlock where a zero window is sent
> from the guest peer, while it is unable to send out window updates even
> after socket reads have freed up enough buffer space to permit a larger
> window. In this situation, new window advertisements from the peer can
> only be triggered by data packets arriving from this side.
>
> However, currently such packets are never sent, because the zero-window
> condition prevents this side from sending out any packets whatsoever
> to the peer.
>
> We notice that the above bug is triggered *only* after the peer has
> dropped one or more arriving packets because of severe memory squeeze,
> and that we hence always enter a retransmission situation when this
> occurs. This also means that the implementation goes against the
> RFC-9293 recommendation that a previously advertised window never
> should shrink.
>
> RFC-9293 seems to permit that we can continue sending up to the right
> edge of the last advertised non-zero window in such situations, so that
> is what we do to resolve this situation.
>
> It turns out that this solution is extremely simple to implement in the
> code: We just omit to save the advertised zero-window when we see that
> it has shrunk, i.e., if the acknowledged sequence number in the
> advertisement message is lower than that of the last data byte sent
> from our side.
>
> When that is the case, the following happens:
> - The 'retr' flag in tcp_data_from_tap() will be 'false', so no
>   retransmission will occur at this occasion.
> - The data stream will soon reach the right edge of the previously
>   advertised window. In fact, in all observed cases we have seen that
>   it is already there when the zero-advertisement arrives.
> - At that moment, the flags STALLED and ACK_FROM_TAP_DUE will be set,
>   unless they already have been, meaning that only the next timer
>   expiration will open for data retransmission or transmission.
> - When that happens, the memory squeeze at the guest will normally have
>   abated, and the data flow can resume.
>
> It should be noted that although this solves the problem we have at
> hand, it is a work-around, and not a genuine solution to the described
> kernel bug.
>
> Suggested-by: Stefano Brivio
> Signed-off-by: Jon Maloy

I only half-understand the problem here, but the fix LGTM.

Reviewed-by: David Gibson

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
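
For reference, the check described in the quoted commit message ("omit to save the advertised zero-window when we see that it has shrunk") could be sketched roughly as below. This is an illustration only, not the code from the patch: the struct, its field names and both function names are hypothetical, and the comparison follows the commit message wording literally (acknowledged sequence number lower than that of the last data byte sent).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-connection state, names are NOT taken from passt */
struct wnd_sketch {
	uint32_t seq_last_sent;	/* sequence number of last data byte sent to peer */
	uint32_t wnd_from_peer;	/* last window advertisement we chose to record */
};

/* Wrap-safe "a is before b" for 32-bit TCP sequence numbers */
static bool seq_lt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Record the window advertised by the peer, unless it is a zero window
 * arriving while data we already sent is still unacknowledged. In that
 * case the window has shrunk, and we keep the previous value so that we
 * may still send up to the old right edge.
 *
 * Returns true if the advertisement was recorded, false if ignored.
 */
static bool window_update(struct wnd_sketch *c, uint32_t ack_seq, uint32_t wnd)
{
	if (wnd == 0 && seq_lt(ack_seq, c->seq_last_sent))
		return false;	/* shrunk to zero: do not save it */

	c->wnd_from_peer = wnd;
	return true;
}
```

With seq_last_sent at 5000, a zero-window advertisement acknowledging 4000 is ignored and the previous window is kept, while a zero window acknowledging 5000 (nothing outstanding) is recorded as usual.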