Date: Thu, 21 Nov 2024 20:32:11 +1100
From: David Gibson
To: Stefano Brivio
CC: passt-dev@passt.top, Tim Besard
Subject: Re: [PATCH 2/2] tcp: Acknowledge keep-alive segments, ignore them for the rest
References: <20241119195344.3056010-1-sbrivio@redhat.com> <20241119195344.3056010-3-sbrivio@redhat.com> <20241120074344.705523be@elisabeth> <20241121052617.50cf96ef@elisabeth> <20241121102312.156af880@elisabeth>
In-Reply-To: <20241121102312.156af880@elisabeth>
List-Id: Development discussion and patches for passt

On Thu, Nov 21, 2024 at 10:23:12AM +0100, Stefano Brivio wrote:
> On Thu, 21 Nov 2024 17:21:12 +1100
> David Gibson wrote:
>
> > On Thu, Nov 21, 2024 at 05:26:17AM +0100, Stefano Brivio wrote:
> > > On Thu, 21 Nov 2024 13:38:09 +1100
> > > David Gibson wrote:
> > >
> > > > On Wed, Nov 20, 2024 at 07:43:44AM +0100, Stefano Brivio wrote:
> > > > > On Wed, 20 Nov 2024 12:02:00 +1100
> > > > > David Gibson wrote:
> > > > >
> > > > > > On Tue, Nov 19, 2024 at 08:53:44PM +0100, Stefano Brivio wrote:
> > > > > > > RFC 9293, 3.8.4 says:
> > > > > > >
> > > > > > >   Implementers MAY include "keep-alives" in their TCP implementations
> > > > > > >   (MAY-5), although this practice is not universally accepted.  Some
> > > > > > >   TCP implementations, however, have included a keep-alive mechanism.
> > > > > > >   To confirm that an idle connection is still active, these
> > > > > > >   implementations send a probe segment designed to elicit a response
> > > > > > >   from the TCP peer.  Such a segment generally contains SEG.SEQ =
> > > > > > >   SND.NXT-1 and may or may not contain one garbage octet of data.  If
> > > > > > >   keep-alives are included, the application MUST be able to turn them
> > > > > > >   on or off for each TCP connection (MUST-24), and they MUST default to
> > > > > > >   off (MUST-25).
> > > > > > >
> > > > > > > but currently, tcp_data_from_tap() is not aware of this and will
> > > > > > > schedule a fast re-transmit on the second keep-alive (because it's
> > > > > > > also a duplicate ACK), ignoring the fact that the sequence number was
> > > > > > > rewinded to SND.NXT-1.
> > > > > > >
> > > > > > > ACK these keep-alive segments, reset the activity timeout, and ignore
> > > > > > > them for the rest.
> > > > > > >
> > > > > > > At some point, we could think of implementing an approximation of
> > > > > > > keep-alive segments on outbound sockets, for example by setting
> > > > > > > TCP_KEEPIDLE to 1, and a large TCP_KEEPINTVL, so that we send a single
> > > > > > > keep-alive segment at approximately the same time, and never reset the
> > > > > > > connection. That's beyond the scope of this fix, though.
> > > > > > >
> > > > > > > Reported-by: Tim Besard
> > > > > > > Link: https://github.com/containers/podman/discussions/24572
> > > > > > > Signed-off-by: Stefano Brivio
> > > > > > > ---
> > > > > > >  tcp.c | 14 ++++++++++++++
> > > > > > >  1 file changed, 14 insertions(+)
> > > > > > >
> > > > > > > diff --git a/tcp.c b/tcp.c
> > > > > > > index f357920..1eb85bb 100644
> > > > > > > --- a/tcp.c
> > > > > > > +++ b/tcp.c
> > > > > > > @@ -1763,6 +1763,20 @@ static int tcp_data_from_tap(const struct ctx *c, struct tcp_tap_conn *conn,
> > > > > > >  			continue;
> > > > > > >
> > > > > > >  		seq = ntohl(th->seq);
> > > > > > > +		if (SEQ_LT(seq, conn->seq_from_tap) && len <= 1) {
> > > > > > > +			flow_trace(conn,
> > > > > > > +				   "keep-alive sequence: %u, previous: %u",
> > > > > > > +				   seq, conn->seq_from_tap);
> > > > > > > +
> > > > > > > +			tcp_send_flag(c, conn, ACK);
> > > > > > > +			tcp_timer_ctl(c, conn);
> > > > > > > +
> > > > > > > +			if (p->count == 1)
> > > > > > > +				return 1;
> > > > > >
> > > > > > I'm not sure what this test is for.  Shouldn't the continue be sufficient?
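
For readers following the hunk above, here is a minimal standalone sketch of the condition it adds. It is not the passt code: seq_lt() and is_keepalive() are hypothetical names standing in for the SEQ_LT() macro and the inline test, and the comparison assumes the usual wrapping 32-bit sequence arithmetic with the two values less than 2^31 apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SEQ_LT(): true if a comes before b in 32-bit
 * sequence space. Assumes the two values are less than 2^31 apart, so the
 * wrapped difference can be read as a signed number.
 */
static bool seq_lt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical helper mirroring the condition in the quoted hunk: the
 * segment starts before the next sequence number we expect from the peer
 * and carries at most the single garbage octet that RFC 9293 mentions.
 */
static bool is_keepalive(uint32_t seq, uint32_t expected, size_t len)
{
	return seq_lt(seq, expected) && len <= 1;
}
```

With expected at 1000, a segment at sequence 999 carrying zero or one byte matches, while in-order data at 1000 or a two-byte retransmission at 999 does not. The same holds across the wrap, for example sequence 0xffffffff against an expected value of 0.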
> > > > >
> > > > > I don't think we want to go through tcp_update_seqack_from_tap(),
> > > > > tcp_tap_window_update() and the like on a keep-alive segment.
> > > >
> > > > Ah, I see.  But that is an optimisation, right?  It shouldn't be
> > > > necessary for correctness.
> > >
> > > *Shouldn't*.
> > >
> > > > > But if we receive something else in this batch, that's going to be a
> > > > > data segment that happened to arrive just after the keep-alive, so, in
> > > > > that case, we have to do the normal processing, by ignoring just this
> > > > > segment and hitting 'continue'.
> > > > >
> > > > > Strictly speaking, the 'continue' is enough and correct, but I think
> > > > > that returning early in the obviously common case is simpler and more
> > > > > robust.
> > > >
> > > > Hrm.  Doesn't seem simpler to me, but I can see the point of the
> > > > change so,
> > >
> > > The code itself is two lines longer, of course, with an additional
> > > early return. Considering all the possible side effects of looking at
> > > window values from a keep-alive segment looks to me more complicated
> > > than the alternative, though.
> >
> > Except that we *will* consider them if there happen to be other data
> > packets in the batch.
>
> Eh, yes, we have to:
>
> > > > > But if we receive something else in this batch, that's going to be a
> > > > > data segment that happened to arrive just after the keep-alive, so, in
> > > > > that case, we have to do the normal processing, by ignoring just this
> > > > > segment and hitting 'continue'.
>
> but we'll use _those_ window values (because we 'continue' here).
>
> > That seems like it will just make any problems
> > from processing the keepalive sequence values harder to track down,
> > not make them go away.
>
> We tested the common case (perhaps we'll never get anything else) and
> my priority would be to make _that_ robust, because it's what matters
> to users. If we find the time to write a small keep-alive sending
> program, then I would feel more confident to drop that additional
> condition.

Eh, fair enough.

Reviewed-by: David Gibson

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
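
On the "small keep-alive sending program" mentioned above, one possible starting point is sketched below. It is an assumption, not something from the thread: it lets the Linux kernel generate the probes by enabling SO_KEEPALIVE with short timers on a TCP socket. enable_keepalive() is a hypothetical helper name. The socket options themselves (SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are the standard Linux ones.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the kernel to send keep-alive probes on the TCP
 * socket fd.
 *   idle:  seconds the connection must be idle before the first probe
 *   intvl: seconds between subsequent probes
 *   cnt:   unanswered probes before the kernel drops the connection
 * Returns 0 on success, -1 if any setsockopt() call fails (errno is set).
 */
int enable_keepalive(int fd, int idle, int intvl, int cnt)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) ||
	    setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) ||
	    setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) ||
	    setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)))
		return -1;

	return 0;
}
```

A test client could call enable_keepalive(fd, 1, 60, 3) on a connected socket and then stay idle, which should trigger a first probe after roughly one second. Whether a probe carries the garbage octet depends on the sending stack, so this would likely cover only one of the two forms that the len <= 1 check accepts.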