Date: Wed, 29 Oct 2025 20:37:25 +1100
From: David Gibson
To: Stefano Brivio
CC: Yumei Huang, passt-dev@passt.top
Subject: Re: [PATCH v3 4/4] tcp: Update data retransmission timeout
References: <20251017202812.173e9352@elisabeth> <20251020071107.42fd40e9@elisabeth> <20251029001330.579cc85a@elisabeth> <20251029055259.2ad35fde@elisabeth>
In-Reply-To: <20251029055259.2ad35fde@elisabeth>
List-Id: Development discussion and patches for passt

On Wed, Oct 29, 2025 at 05:52:59AM +0100, Stefano Brivio wrote:
> On Wed, 29 Oct 2025 11:35:29 +1100
> David Gibson wrote:
>
> > On Wed, Oct 29, 2025 at 12:13:30AM +0100, Stefano Brivio wrote:
> > > On Mon, 20 Oct 2025 20:17:10 +1100
> > > David Gibson wrote:
> > >
> > > > On Mon, Oct 20, 2025 at 07:11:07AM +0200, Stefano Brivio wrote:
> > > > > On Mon, 20 Oct 2025 11:20:19 +1100
> > > > > David Gibson wrote:
> > [snip]
> > > > > > > Rather than the local link I was thinking of whatever monitor or
> > > > > > > liveness probe in KubeVirt which might have a 60-second period, or some
> > > > > > > firewall agent, or how long it typically takes for guests to stop and
> > > > > > > resume again in KubeVirt.
> > > > > >
> > > > > > Right, I hadn't considered those.  Although.. do those actually re-use
> > > > > > a single connection?  I would have guessed they use a new connection
> > > > > > each time, making the timeouts here irrelevant.
> > > > >
> > > > > It depends on the definition of "each time", because we don't time out
> > > > > host-side connections immediately.
> > > >
> > > > Hm, ok.  Is your concern that getting a negative answer from the probe
> > > > will take too long?
> > >
> > > More like getting a positive answer taking too long, because we retry
> > > so infrequently.
> >
> > Right, but it will only be slow if we lose the first probe, which
> > should be very rare.
>
> No, because again, that might be due to the guest doing something with
> its firewall or stopping/resuming/getting online etc. It's not
> necessarily rare.

Hmmm... I'd think if interruption due to coming up / firewall frobbing
/ whatever is *not* rare, then that constitutes flaky availability
that arguably the probe *should* fail on.

> If that situation persists for at least 1 + 2 + 4 + 8 + 16 + 32 = 55
> seconds, without a clamp, we'll wait 119 seconds next, and 247 seconds
> after that. In this case, to me, it looks more reasonable to retry
> every minute instead.

Yeah, I guess so.

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
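
The retry schedule discussed above can be sketched in a few lines. This is
only an illustration, not code from the patch: it assumes a timeout that
starts at 1 second and doubles on every retry, and the 60-second cap stands
in for the "retry every minute" idea. The names backoff_intervals and
elapsed are made up for this sketch, and the actual initial timeout and
clamp used by passt may differ, so the totals it yields need not match the
figures quoted in the thread.

```python
from itertools import accumulate


def backoff_intervals(initial, retries, cap=None):
    """Return the wait before each retry, doubling from `initial` seconds.

    If `cap` is given, no single wait exceeds it (a hypothetical clamp,
    not a value taken from the patch).
    """
    intervals = []
    wait = initial
    for _ in range(retries):
        intervals.append(wait if cap is None else min(wait, cap))
        wait *= 2
    return intervals


def elapsed(intervals):
    """Return the total time elapsed after each retry."""
    return list(accumulate(intervals))


if __name__ == "__main__":
    # Compare the unclamped schedule with one capped at 60 seconds.
    for cap in (None, 60):
        waits = backoff_intervals(1, 8, cap=cap)
        print("cap:", cap, "waits:", waits, "elapsed:", elapsed(waits))
```

Under these assumptions, every unclamped wait after the sixth retry is
longer than a minute and keeps doubling, while the capped schedule settles
at one retry per minute.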