From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 21 Nov 2024 17:21:12 +1100
From: David Gibson
To: Stefano Brivio
Subject: Re: [PATCH 2/2] tcp: Acknowledge keep-alive segments, ignore them for the rest
References: <20241119195344.3056010-1-sbrivio@redhat.com> <20241119195344.3056010-3-sbrivio@redhat.com> <20241120074344.705523be@elisabeth> <20241121052617.50cf96ef@elisabeth>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="rowBE+xr1qAyHckw"
Content-Disposition: inline
In-Reply-To: <20241121052617.50cf96ef@elisabeth>
CC: passt-dev@passt.top, Tim Besard
List-Id: Development discussion and patches for passt

--rowBE+xr1qAyHckw
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 21, 2024 at 05:26:17AM +0100, Stefano Brivio wrote:
> On Thu, 21 Nov 2024 13:38:09 +1100
> David Gibson wrote:
>
> > On Wed, Nov 20, 2024 at 07:43:44AM +0100, Stefano Brivio wrote:
> > > On Wed, 20 Nov 2024 12:02:00 +1100
> > > David Gibson wrote:
> > >
> > > > On Tue, Nov 19, 2024 at 08:53:44PM +0100, Stefano Brivio wrote:
> > > > > RFC 9293, 3.8.4 says:
> > > > >
> > > > >   Implementers MAY include "keep-alives" in their TCP implementations
> > > > >   (MAY-5), although this practice is not universally accepted.  Some
> > > > >   TCP implementations, however, have included a keep-alive mechanism.
> > > > >   To confirm that an idle connection is still active, these
> > > > >   implementations send a probe segment designed to elicit a response
> > > > >   from the TCP peer.  Such a segment generally contains SEG.SEQ =
> > > > >   SND.NXT-1 and may or may not contain one garbage octet of data.  If
> > > > >   keep-alives are included, the application MUST be able to turn them
> > > > >   on or off for each TCP connection (MUST-24), and they MUST default to
> > > > >   off (MUST-25).
> > > > >
> > > > > but currently, tcp_data_from_tap() is not aware of this and will
> > > > > schedule a fast re-transmit on the second keep-alive (because it's
> > > > > also a duplicate ACK), ignoring the fact that the sequence number was
> > > > > rewinded to SND.NXT-1.
> > > > >
> > > > > ACK these keep-alive segments, reset the activity timeout, and ignore
> > > > > them for the rest.
> > > > >
> > > > > At some point, we could think of implementing an approximation of
> > > > > keep-alive segments on outbound sockets, for example by setting
> > > > > TCP_KEEPIDLE to 1, and a large TCP_KEEPINTVL, so that we send a single
> > > > > keep-alive segment at approximately the same time, and never reset the
> > > > > connection. That's beyond the scope of this fix, though.
> > > > >
> > > > > Reported-by: Tim Besard
> > > > > Link: https://github.com/containers/podman/discussions/24572
> > > > > Signed-off-by: Stefano Brivio
> > > > > ---
> > > > >  tcp.c | 14 ++++++++++++++
> > > > >  1 file changed, 14 insertions(+)
> > > > >
> > > > > diff --git a/tcp.c b/tcp.c
> > > > > index f357920..1eb85bb 100644
> > > > > --- a/tcp.c
> > > > > +++ b/tcp.c
> > > > > @@ -1763,6 +1763,20 @@ static int tcp_data_from_tap(const struct ctx *c, struct tcp_tap_conn *conn,
> > > > >  			continue;
> > > > >
> > > > >  		seq = ntohl(th->seq);
> > > > > +		if (SEQ_LT(seq, conn->seq_from_tap) && len <= 1) {
> > > > > +			flow_trace(conn,
> > > > > +				   "keep-alive sequence: %u, previous: %u",
> > > > > +				   seq, conn->seq_from_tap);
> > > > > +
> > > > > +			tcp_send_flag(c, conn, ACK);
> > > > > +			tcp_timer_ctl(c, conn);
> > > > > +
> > > > > +			if (p->count == 1)
> > > > > +				return 1;
> > > >
> > > > I'm not sure what this test is for.  Shouldn't the continue be sufficient?
> > >
> > > I don't think we want to go through tcp_update_seqack_from_tap(),
> > > tcp_tap_window_update() and the like on a keep-alive segment.
> >
> > Ah, I see.  But that is an optimisation, right?  It shouldn't be
> > necessary for correctness.
>
> *Shouldn't*.
>
> > > But if we receive something else in this batch, that's going to be a
> > > data segment that happened to arrive just after the keep-alive, so, in
> > > that case, we have to do the normal processing, by ignoring just this
> > > segment and hitting 'continue'.
> > >
> > > Strictly speaking, the 'continue' is enough and correct, but I think
> > > that returning early in the obviously common case is simpler and more
> > > robust.
> >
> > Hrm.  Doesn't seem simpler to me, but I can see the point of the
> > change so,
>
> The code itself is two lines longer, of course, with an additional
> early return. Considering all the possible side effects of looking at
> window values from a keep-alive segment looks to me more complicated
> than the alternative, though.

Except that we *will* consider them if there happen to be other data
packets in the batch.  That seems like it will just make any problems
from processing the keepalive sequence values harder to track down,
not make them go away.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

--rowBE+xr1qAyHckw
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmc+0VcACgkQzQJF27ox
2GeZOw//ReeHlvvEm7PvgzzZdofjMhr7Egspb2dGFCAx2CfvOM1AJiYHlQn00jad
mgVhlkjSnmkzQrlrV9QVl+mTbue6vsyoL6YIF+bJ3a31VHlvS366jJSfuGGsl3ZB
MzuPLC1IKbT0TZOUNHmUC2fCpSgVCxXWv0VsIGYbbabqZAGJi3DFPSpEZjCpR62j
Cc8OORf9D0sDBwCLFXa+yszMN2qh+bptuW6KFYugThnrsuutjtVCJwwN1Myaq9y2
W3AxD9bntyKOxMbLDYEIUbqZ4bhLb5UxDQkQnWjYpvIp6bRisF2l+VIFejijvdh4
GynmAtnBuo4fPooCRni2aSKl3Mut2WWai+IJZBB3oTTC1xeXOJFWa21cVE6AEWVU
PVpeV0ukZuIfbFKqO2bkqkj+NL7pi9cUqsjcwnJ1JHO8y5+oi3poUpauwjvMeAHk
/2Da3pCq3nRfqCpbmdamT6DnORSlxMBiKUJmq3mukfzqj+kTn0ppFDuTx2WrlpjH
ZI0ZoEswh1ic7rR141/kTYpCVhfKIo+WnwaMzOy3pqpB8k8slGk4nrpOimr87S5Q
xckn541j+WtQAbuR9NuElj/uFs6wg8xxaRH8Bi+8BNdw/jMiQUdNlrzLzIdi7zKe
Ac74WfdGwuFOeu242TTJam/BR/QCFIrugRZuRi3uQSvGJUGHPqE=
=KVX5
-----END PGP SIGNATURE-----

--rowBE+xr1qAyHckw--
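
The control flow debated in this thread can be sketched as a small toy model. This is only an illustration under stated assumptions, not passt's tcp_data_from_tap(): seq_lt(), struct seg, is_keepalive() and batch_returns_early() are hypothetical names, seq_lt() merely stands in for the SEQ_LT() macro in the quoted hunk, and the ACK is reduced to a counter. It shows the keep-alive condition from the hunk (sequence behind the expected one, at most one octet of payload) and that the early return only fires when the keep-alive is the sole segment in the batch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SEQ_LT(): wrap-safe "a precedes b" */
static int seq_lt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Toy segment: only the two fields the quoted check looks at */
struct seg {
	uint32_t seq;
	size_t len;
};

/* Same condition as the quoted hunk: rewound sequence, at most one octet */
static int is_keepalive(const struct seg *s, uint32_t seq_from_tap)
{
	return seq_lt(s->seq, seq_from_tap) && s->len <= 1;
}

/* Toy model of the loop's control flow only.
 * Returns 1 if the batch was a lone keep-alive (early return), 0 if the
 * code after the loop would still run. *acked counts keep-alives ACKed.
 */
static int batch_returns_early(const struct seg *segs, size_t count,
			       uint32_t seq_from_tap, unsigned *acked)
{
	size_t i;

	assert(acked);
	*acked = 0;

	for (i = 0; i < count; i++) {
		if (is_keepalive(&segs[i], seq_from_tap)) {
			(*acked)++;		/* stands in for sending an ACK */

			if (count == 1)
				return 1;	/* lone keep-alive: skip the rest */

			continue;		/* ignore just this segment */
		}

		/* Data segment: normal per-segment handling would go here */
	}

	return 0;	/* post-loop processing runs for the whole batch */
}
```

With a batch of one keep-alive the model returns early, while a keep-alive followed by a data segment in the same batch falls through to the post-loop path, which is the case the reply above is concerned about.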