From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=Xg0Hx6FF; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by passt.top (Postfix) with ESMTPS id 62ABB5A0619 for ; Thu, 30 Oct 2025 09:51:17 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1761814276; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Jz/Q6P63cY9xUSXVZzyR8A8sNHtqltfBvL7cxka8QXE=; b=Xg0Hx6FFb1+P5b3Xa/YIrh4Bsd4gsbgnxdP/5j1OhgoifHteTB2dE0Jh/zKt0kihYld0YA X77ohsB0Yir7LOCgnWU9hMS8yazIa5+ronk9U05uX2+AOd2A9a2ZJ83MMKShcT8aAYnF3i FiGwpZcwEH2BxhGZ+q4sNttQUjkgQnk= Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-86-oFkpskFsOt28_AKx-k7k7w-1; Thu, 30 Oct 2025 04:51:14 -0400 X-MC-Unique: oFkpskFsOt28_AKx-k7k7w-1 X-Mimecast-MFC-AGG-ID: oFkpskFsOt28_AKx-k7k7w_1761814273 Received: by mail-wr1-f70.google.com with SMTP id ffacd0b85a97d-427015f62a7so406813f8f.0 for ; Thu, 30 Oct 2025 01:51:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761814273; x=1762419073; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=0lpyW28Y88YV6zleN7w+8g6fNp0tabs5bGxHlTCPQyA=; b=SMfLfdPbdHh/d3s5amDQz2qj6fTuFV9dSbvcWRsn9istdtIrcF1uej4y9VHZWDH9Jf Ye1vYju9WDtTJ4Kr62ZwOjaCx6SVnEAbuGfLBjbTiClbaiJrEoBWIOrLfyRsTUnF2Taz A4FED7FFfsNI1Q+BNgN1TwKk5XCiBfHJJmsLNXPSBZBq3IJprXgv5nzs3cyINFXRo0eU sVr2DcoHsh4CXS1iW2EABH+pdPQrxBLlE4GSEdyuvtobVxevnY4eC7/5wFUl/hA7HZIZ LFeob8pXMH3eNJiM9oDkjmD4rw9UdcMTK3BAewULt+U5ndH7PQOtVI6BIgPcE4fMRQpj 36SA== X-Gm-Message-State: AOJu0YyeyKqxIfi3D7gYnwhjs7vhOX5I4/eAWnPE563lnrKa0XCrm5Sf D+HUgNiqITodGgEYVmol2B6IJ6BDOhEa4UJR0+yFwSN2xj04PS5LFSXMLF/D423zswuL8UBayVq 9aPSmmhatRaiT5Rc0koy0cCVCHI+pB+NZAqiYz47Bva6VUT6oSTzCCQ== X-Gm-Gg: ASbGncvikeaoc6UyaOTau0vfpEP+Op24vVS4XQDVaw+yGybAQXYLc2buE7zTQTvaG0t mzXcUzf3US4jGaifomJJSdBCS9HxKftgDTw0OcIPyN9Lj83qiRaA5Q3kCBio/MqI8NG2iH/Tq3E /IawwGSNv/nUUdIsPGEdotSbA52TO8Z8Tb/966F0RBHEKgsTE5WcBu/UWhZKUmG/DAgUBRW5bCJ 9lH4s1EMVYDq5WKHRIeaaYcDT2OZijX3C1ND7OkFOu9p/0cB2vTxx5zaClfe57wEgYPi7AdbAQj On8DJ1qvYSglVczjD396vW8mhYPwh4VKRd7IwOdC1P6QA87pmlPui58NjdDaPhfk0ADC+SLmEsL FxZ7Il5DA1Q== X-Received: by 2002:a05:6000:1889:b0:429:8a71:d63 with SMTP id ffacd0b85a97d-429b4c987fcmr1669981f8f.37.1761814272614; Thu, 30 Oct 2025 01:51:12 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHEeZBBEAk9KzGR2F/fWqvy3zN6F2jsYbscM39nlAIgiklTCwWDp14ihBa2GjGbf+IpFcNg4w== X-Received: by 2002:a05:6000:1889:b0:429:8a71:d63 with SMTP id ffacd0b85a97d-429b4c987fcmr1669957f8f.37.1761814271931; Thu, 30 Oct 2025 01:51:11 -0700 (PDT) Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [2a10:fc81:a806:d6a9::1]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-429952db964sm30941211f8f.33.2025.10.30.01.51.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Oct 2025 01:51:11 -0700 (PDT) Date: Thu, 30 Oct 2025 09:51:10 +0100 From: Stefano Brivio To: Yumei Huang Subject: Re: [PATCH v6 4/4] tcp: Update data retransmission timeout Message-ID: <20251030095110.794ffe54@elisabeth> In-Reply-To: References: <20251017062838.21041-1-yuhuang@redhat.com> <20251017062838.21041-5-yuhuang@redhat.com> <20251024010438.16e19757@elisabeth> <20251028124426.534fd236@elisabeth> <20251029053804.711716aa@elisabeth> <20251029080953.4f671f56@elisabeth> <20251029083929.0dc52bc6@elisabeth> <20251029131808.3e2a70c9@elisabeth> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.49; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: zeZcWJ4OpCo0T-KyV7MrBBdRRtXqWHtZSCJkE3Scnj8_1761814273 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Message-ID-Hash: V7NMIHQ6XYJCJRX2O4HG6API33W4C3QH X-Message-ID-Hash: V7NMIHQ6XYJCJRX2O4HG6API33W4C3QH X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, david@gibson.dropbear.id.au X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Thu, 30 Oct 2025 16:25:53 +0800 Yumei Huang wrote: > On Wed, Oct 29, 2025 at 8:18=E2=80=AFPM Stefano Brivio wrote: > > > > On Wed, 29 Oct 2025 16:59:51 +0800 > > Yumei Huang wrote: > > =20 > > > On Wed, Oct 29, 2025 at 3:39=E2=80=AFPM Stefano Brivio wrote: =20 > > > > > > > > On Wed, 29 Oct 2025 15:32:33 +0800 > > > > Yumei Huang wrote: > > > > =20 > > > > > On Wed, Oct 29, 2025 at 3:10=E2=80=AFPM Stefano Brivio wrote: =20 > > > > > > > > > > > > On Wed, 29 Oct 2025 13:11:48 +0800 > > > > > > Yumei Huang wrote: > > > > > > =20 > > > > > > > On Wed, Oct 29, 2025 at 12:38=E2=80=AFPM Stefano Brivio wrote: =20 > > > > > > > > > > > > > > > > On Wed, 29 Oct 2025 11:06:44 +0800 > > > > > > > > Yumei Huang wrote: > > > > > > > > =20 > > > > > > > > > On Tue, Oct 28, 2025 at 7:44=E2=80=AFPM Stefano Brivio wrote: =20 > > > > > > > > > > > > > > > > > > > > On Tue, 28 Oct 2025 16:09:03 +0800 > > > > > > > > > > Yumei Huang wrote: > > > > > > > > > > =20 > > > > > > > > > > > On Fri, Oct 24, 2025 at 7:04=E2=80=AFAM Stefano Brivi= o wrote: =20 > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 17 Oct 2025 14:28:38 +0800 > > > > > > > > > > > > Yumei Huang wrote: > > > > > > > > > > > > =20 > > > > > > > > > > > > > Use an exponential backoff timeout for data retra= nsmission according > > > > > > > > > > > > > to RFC 2988 and RFC 6298. Set the initial RTO to = one second as discussed > > > > > > > > > > > > > in Appendix A of RFC 6298. > > > > > > > > > > > > > > > > > > > > > > > > > > Also combine the macros defining the initial RTO = for both SYN and ACK. > > > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Yumei Huang > > > > > > > > > > > > > Reviewed-by: David Gibson > > > > > > > > > > > > > --- > > > > > > > > > > > > > tcp.c | 27 ++++++++++++--------------- > > > > > > > > > > > > > 1 file changed, 12 insertions(+), 15 deletions(-= ) > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/tcp.c b/tcp.c > > > > > > > > > > > > > index 9385132..dc0ec6c 100644 > > > > > > > > > > > > > --- a/tcp.c > > > > > > > > > > > > > +++ b/tcp.c > > > > > > > > > > > > > @@ -179,16 +179,14 @@ > > > > > > > > > > > > > * > > > > > > > > > > > > > * Timeouts are implemented by means of timerfd = timers, set based on flags: > > > > > > > > > > > > > * > > > > > > > > > > > > > - * - SYN_TIMEOUT_INIT: if no ACK is received fro= m tap/guest during handshake > > > > > > > > > > > > > - * (flag ACK_FROM_TAP_DUE without ESTABLISHED = event) within this time, resend > > > > > > > > > > > > > - * SYN. It's the starting timeout for the firs= t SYN retry. If this persists > > > > > > > > > > > > > - * for more than TCP_MAX_RETRIES or (tcp_syn_r= etries + > > > > > > > > > > > > > - * tcp_syn_linear_timeouts) times in a row, re= set the connection > > > > > > > > > > > > > - * > > > > > > > > > > > > > - * - ACK_TIMEOUT: if no ACK segment was received= from tap/guest, after sending > > > > > > > > > > > > > - * data (flag ACK_FROM_TAP_DUE with ESTABLISHE= D event), re-send data from the > > > > > > > > > > > > > - * socket and reset sequence to what was ackno= wledged. If this persists for > > > > > > > > > > > > > - * more than TCP_MAX_RETRIES times in a row, r= eset the connection > > > > > > > > > > > > > + * - RTO_INIT: if no ACK segment was received fr= om tap/guest, either during > > > > > > > > > > > > > + * handshake (flag ACK_FROM_TAP_DUE without ES= TABLISHED event) or after > > > > > > > > > > > > > + * sending data (flag ACK_FROM_TAP_DUE with ES= TABLISHED event), re-send data > > > > > > > > > > > > > + * from the socket and reset sequence to what = was acknowledged. This is the > > > > > > > > > > > > > + * timeout for the first retry, in seconds. If= this persists too many times > > > > > > > > > > > > > + * in a row, reset the connection: TCP_MAX_RET= RIES for established > > > > > > > > > > > > > + * connections, or (tcp_syn_retries + tcp_syn_= linear_timeouts) during the > > > > > > > > > > > > > + * handshake. > > > > > > > > > > > > > * > > > > > > > > > > > > > * - FIN_TIMEOUT: if a FIN segment was sent to t= ap/guest (flag ACK_FROM_TAP_DUE > > > > > > > > > > > > > * with TAP_FIN_SENT event), and no ACK is rec= eived within this time, reset > > > > > > > > > > > > > @@ -342,8 +340,7 @@ enum { > > > > > > > > > > > > > #define WINDOW_DEFAULT 146= 00 /* RFC 6928 */ > > > > > > > > > > > > > > > > > > > > > > > > > > #define ACK_INTERVAL 10 = /* ms */ > > > > > > > > > > > > > -#define SYN_TIMEOUT_INIT 1 = /* s */ > > > > > > > > > > > > > -#define ACK_TIMEOUT 2 > > > > > > > > > > > > > +#define RTO_INIT 1 = /* s, RFC 6298 */ > > > > > > > > > > > > > #define FIN_TIMEOUT 60 > > > > > > > > > > > > > #define ACT_TIMEOUT 7200 > > > > > > > > > > > > > > > > > > > > > > > > > > @@ -588,13 +585,13 @@ static void tcp_timer_ctl(c= onst struct ctx *c, struct tcp_tap_conn *conn) > > > > > > > > > > > > > } else if (conn->flags & ACK_FROM_TAP_DUE) = { > > > > > > > > > > > > > if (!(conn->events & ESTABLISHED)) = { > > > > > > > > > > > > > if (conn->retries < c->tcp.= syn_linear_timeouts) > > > > > > > > > > > > > - it.it_value.tv_sec = =3D SYN_TIMEOUT_INIT; > > > > > > > > > > > > > + it.it_value.tv_sec = =3D RTO_INIT; > > > > > > > > > > > > > else > > > > > > > > > > > > > - it.it_value.tv_sec = =3D SYN_TIMEOUT_INIT << > > > > > > > > > > > > > + it.it_value.tv_sec = =3D RTO_INIT << > > > > > > > > > > > > > (conn->retr= ies - c->tcp.syn_linear_timeouts); > > > > > > > > > > > > > } > > > > > > > > > > > > > else > > > > > > > > > > > > > - it.it_value.tv_sec =3D ACK_= TIMEOUT; > > > > > > > > > > > > > + it.it_value.tv_sec =3D RTO_= INIT << conn->retries; =20 > > > > > > > > > > > > > > > > > > > > > > > > Same as on 3/4, but here it's clearly more convenie= nt: just assign > > > > > > > > > > > > RTO_INIT, and multiply as needed in the if / else c= lauses. =20 > > > > > > > > > > > > > > > > > > > > > > I guess we can't just assign RTO_INIT. Maybe assign = it only when > > > > > > > > > > > retries=3D=3D0, otherwise multiply as it.it_value.tv_= sec <<=3D1. =20 > > > > > > > > > > > > > > > > > > > > Why can't you do that? Say: > > > > > > > > > > > > > > > > > > > > it.it_value.tv_sec =3D RTO_INIT > > > > > > > > > > if (!(conn->events & ESTABLISHED)) { > > > > > > > > > > if (conn->retries >=3D c->tcp.syn_linea= r_timeouts) > > > > > > > > > > it.it_value.tv_sec <<=3D (conn-= >retries - > > > > > > > > > > c->tcp.= syn_linear_timeouts); > > > > > > > > > > > > > > > > > > > > but anyway, see below. > > > > > > > > > > =20 > > > > > > > > > > > But it seems more complicated. What do you think? =20 > > > > > > > > > > > > > > > > > > > > Or maybe, building on my latest comment to 3/4: > > > > > > > > > > > > > > > > > > > > int factor =3D conn->retries; > > > > > > > > > > > > > > > > > > > > if (!(conn->events & ESTABLISHED)) > > > > > > > > > > factor -=3D c->tcp.syn_linear_t= imeouts; > > > > > > > > > > > > > > > > > > > > it.it_value.tv_sec =3D RTO_INIT << MAX(= factor, 0); > > > > > > > > > > > > > > > > > > > > ? =20 > > > > > > > > > > > > > > > > > > Yeah, I understand this part now. > > > > > > > > > =20 > > > > > > > > > > =20 > > > > > > > > > > > > =20 > > > > > > > > > > > > > } else if (CONN_HAS(conn, SOCK_FIN_SENT | T= AP_FIN_ACKED)) { > > > > > > > > > > > > > it.it_value.tv_sec =3D FIN_TIMEOUT; > > > > > > > > > > > > > } else { =20 > > > > > > > > > > > > > > > > > > > > > > > > The rest of the series looks good to me. > > > > > > > > > > > > > > > > > > > > > > > > It might be slightly more practical to factor in di= rectly the RTO > > > > > > > > > > > > clamp, and I don't think it's complicated now that = you have the helper > > > > > > > > > > > > from 2/4, but it's not a strong preference from my = side, as the series > > > > > > > > > > > > makes sense in any case. =20 > > > > > > > > > > > > > > > > > > > > > > Reading tcp_rto_max_ms can be easy with the helper. M= y concern is > > > > > > > > > > > about the way we get the total time for retries. > > > > > > > > > > > > > > > > > > > > > > I used to do it like this in v2, > > > > > > > > > > > https://archives.passt.top/passt-dev/20251010074700.2= 2177-4-yuhuang@redhat.com/: > > > > > > > > > > > > > > > > > > > > > > +#define RETRY_ELAPSED(timeout_init, retries) \ > > > > > > > > > > > + ((timeout_init) * ((1 << ((retries) + 1)) - 2)) > > > > > > > > > > > > > > > > > > > > > > Though the formula is not quite right, we could refin= e it as below: > > > > > > > > > > > > > > > > > > > > > > #define RETRY_ELAPSED(retries) ((RTO_INIT) * ((1 << (= (retries) + 1)) - 1)) > > > > > > > > > > > > > > > > > > > > > > Does it make sense to get the time this way? =20 > > > > > > > > > > > > > > > > > > > > Well, it also depends on c->tcp.syn_linear_timeouts, ri= ght? =20 > > > > > > > > > > > > > > > > > > Not really, it's only used for data retransmission, so > > > > > > > > > syn_linear_timeouts is not relevant. =20 > > > > > > > > > > > > > > > > Hmm, no, why? RFC 6298 covers SYN retries as well, and that= 's the one > > > > > > > > stating: =20 > > > > > > > > > > > > > > I meant RETRY_ELAPSED was only used for data retransmission, = which > > > > > > > uses exponential backoff timeout directly, so syn_linear_time= outs was > > > > > > > not relevant. =20 > > > > > > > > > > > > Ah, okay. > > > > > > =20 > > > > > > > > > > > > > > > > (2.5) A maximum value MAY be placed on RTO provided it i= s at least 60 > > > > > > > > seconds. =20 > > > > > > > > > > > > > > For SYN retries, as we used linear backoff + exponential back= off, and > > > > > > > also limited by TCP_MAX_RETRIES, the possible max RTO is far = less than > > > > > > > 60s. So we didn't clamp it. Do you think we need to clamp it = as well? =20 > > > > > > > > > > > > If syn_linear_timeouts is 0 and tcp_syn_retries is 7, I guess w= e'll > > > > > > reach 2^0 + 2^1 + 2^2 + 2^3 + 2^4 + 2^5 + 2^6 + 2^7 =3D 247 sec= onds? Or > > > > > > just up to ... + 2^6, that is, 119 seconds? > > > > > > > > > > > > In any case, what is the difference compared to data retransmis= sions? =20 > > > > > > > > > > You are right. I assumed wrongly that syn_linear_timeouts is alwa= ys 4. > > > > > When it's 0, it's the same as data retransmissions. =20 > > > > > > > > > > > > Don't we have 3 bits to store the retry count as well, so we're= limited > > > > > > by TCP_MAX_RETRIES anyway? Looking at patch 1/4 I'd say it's th= e same > > > > > > counter. =20 > > > > > > > > > > Yes, it's the same counter. I guess you mean we should clamp it a= s well. =20 > > > > > > > > I didn't check this part, I thought we already did, but if we don't= , we > > > > should do that, yes. > > > > =20 > > > > > > > > ...the only thing that I don't see implemented in this vers= ion of the > > > > > > > > patch is paragraph 5.7: > > > > > > > > > > > > > > > > (5.7) If the timer expires awaiting the ACK of a SYN seg= ment and the > > > > > > > > TCP implementation is using an RTO less than 3 sec= onds, the RTO > > > > > > > > MUST be re-initialized to 3 seconds when data tran= smission > > > > > > > > begins (i.e., after the three-way handshake comple= tes). > > > > > > > > > > > > > > > > I missed that while reviewing earlier versions. I guess we = need to use > > > > > > > > a MAX(x, 3) clamp if (c->conn->events & ESTABLISHED). I thi= nk it's > > > > > > > > simpler than re-introducing separate starting values (one s= econd and > > > > > > > > three seconds). =20 > > > > > > Sorry I'd like to confirm one more thing. By "use MAX(x, 3) clamp if > > > (c->conn->events & ESTABLISHED)", I guess you mean: > > > > > > it.it_value.tv_sec =3D MAX(it.it_value.tv_sec, 3); =20 > > > > Right. > > =20 > > > Then the code would be: > > > > > > it.it_value.tv_sec =3D RTO_INIT; > > > if (!(conn->events & ESTABLISHED)) { > > > int exp =3D conn->retries - c->tcp.syn_linear_timeouts; > > > it.it_value.tv_sec <<=3D MAX(exp, 0); > > > } else { > > > it.it_value.tv_sec =3D MAX(it.it_value.tv_sec, 3); > > > it.it_value.tv_sec <<=3D conn->retries; =20 > > > > Hmm what's wrong with my previous suggestion of initialising 'exp' with > > conn->retries? Then it becomes (equivalent to your code snippet, but > > shorter): =20 >=20 > Sorry, I read the comments several times but still missed that. And > yes, it's shorter. >=20 > > =20 > > > } > > > it.it_value.tv_sec =3D MIN( > > > it.it_value.tv_sec, c->tcp.tcp_rto_max); =20 > > > > something like: > > > > int exp =3D conn->retries, min =3D 0, max =3D c->tcp.tc= p_rto_max; > > > > if (!(conn->events & ESTABLISHED)) > > exp -=3D c->tcp.syn_linear_timeouts; > > else > > min =3D /* add a constant for this */ 3; > > > > it.it_value.tv_sec =3D RTO_INIT << MAX(exp, 0); > > > > it.it_value.tv_sec =3D CLAMP(it.it_value.tv_sec, min, m= ax); =20 >=20 > But this seems not right. For data retransmission, only the first > timeout will be changed from 1 to 3s, then the timeout becomes 4, 8, > 16 ..., 120. Oops, sorry, you're right. I proposed that because I had for a moment the thought that it.it_value.tv_sec would be just multiplied from the current value in the next iterations, but it's of course not the case. > I feel we don't need CLAMP, the code would be: >=20 > #define RTO_INIT_ACK 3 > ... > int exp =3D conn->retries; > it.it_value.tv_sec =3D RTO_INIT; > if (!(conn->events & ESTABLISHED)) > exp -=3D c->tcp.syn_linear_timeouts; > else { > it.it_value.tv_sec =3D MAX(it.it_value.tv_sec, > RTO_INIT_ACK); > } > it.it_value.tv_sec <<=3D MAX(exp, 0); > it.it_value.tv_sec =3D MIN( > it.it_value.tv_sec, c->tcp.tcp_rto_max); >=20 > Please correct me if I got anything wrong. Thanks. It looks good to me now. Maybe try keeping 'min' or 'max' values assigned separately as I did and see if it helps readability, though. *Or* use a temporary 'timeout' variable and assign it.it_value.tv_sec only at the end, so that lines are shorter and more readable. Probably better than any 'min' or 'max' which force the reader to look at yet another place. But conceptually I think it should be like that, yes. > > ...we don't have a CLAMP() macro at the moment, just MIN() and MAX() in > > util.h, but now that we need one, I would add it, similar to the one > > from the Linux kernel but without type checking (as it's not really > > practical here). > > > > Inspired from include/linux/kernel.h (current Linux kernel tree): > > > > #define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi) > > > > we can do something similar with two differences: 1. all our macros are > > uppercase (we don't have as many as the kernel, and it's nice to know > > that they are macros from the name) and 2. as I mentioned we don't > > need/want type checking, so, say: > > > > #define CLAMP(x, min, max) MIN(MAX((x), (min)), (max)) > > > > ...or keep val / lo / hi if it's clearer, I don't really have a > > preference. > > > > I added parentheses around the arguments by the way because I think > > it's good practice, even though not needed in this case. It's the > > PRE02-C recommendation in CERT C (we generally stick to those > > recommendations whenever practical): > > > > https://wiki.sei.cmu.edu/confluence/display/c/PRE02-C.+Macro+replacem= ent+lists+should+be+parenthesized > > > > https://clang.llvm.org/extra/clang-tidy/checks/bugprone/macro-parenth= eses.html > > > > Note that the Linux kernel has a compatible license (it's the same, > > actually, GPLv2+), and, regardless of that, the implementation is > > trivial and the idea is obvious, so I don't think we need to give > > explicit attribution in this case. > > > > But in other cases we do, or I guess it's fair to the author anyway, > > see for example siphash.h. > > =20 > > > we basically set the starting value to 3 for data retransmissions no > > > matter if we have retried SYN. And the max total timeout would be > > > 3+6+12+24+48+96+120+120 =3D 429s. Is it what we want? =20 > > > > Ah, yes. In the discussion I previously assumed that the default clamp > > value was 60 seconds, but it's actually 120 seconds. It looks correct > > to me. > > =20 > > > Besides, I guess I > > > need to define a macro for "3" as well, like "ACK_TIMEOUT_INIT"? =20 > > > > Or RTO_INIT_ACK? Maybe RTO_INIT_DATA? I'm thinking that you already > > have a RTO_INIT at this point, and this constant is very closely relate= d > > to that one. > > =20 > > > > > > > > > > > > > > I'm not sure I understand here. If the timer expires, didn't = we reset > > > > > > > the connection directly? We would never get to the data trans= mission > > > > > > > phase? =20 > > > > > > > > > > > > In the RFC, "the timer expires" indicates one iteration of the = timeout > > > > > > algorithm, that is, it's the same as our timer (tcp_timer_handl= er()) > > > > > > triggering. =20 > > > > > > > > > > I see. > > > > > > > > > > If you agree, I can add a new patch in this series to address the= clamping. =20 > > > > > > > > I don't really have a preference, I guess it could be directly in 4= /4 or > > > > in a separate patch, neither option should complicate things too mu= ch. > > > > =20 > > > > > > The paragraph says "after the three-way handshake completes", s= o it > > > > > > looks like the connection wasn't reset. > > > > > > =20 > > > > > > > > > Probably I should name it more clearly. > > > > > > > > > =20 > > > > > > > > > > > > > > > > > > > > But in any case, do you really need to calculate that e= xplicitly? I was > > > > > > > > > > thinking that you can just clamp the value when you use= it in > > > > > > > > > > tcp_timer_ctl(). > > > > > > > > > > > > > > > > > > > > If c->tcp.rto_max is DIV_ROUND_CLOSEST(x, 1000), where = 'x' is the value > > > > > > > > > > you read from the kernel, then I guess it's just: > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > it.it_value.tv_sec =3D MIN(it.it_value.= tv_sec, c->tcp.rto_max); =20 > > > > > > > > > > > > > > > > > > After reading the comments in v3 when tcp_rto_max_ms was = first > > > > > > > > > mentioned again, I realized I got something wrong again. = I thought it > > > > > > > > > was for the total timeout for all retries, so I need to c= alculate that > > > > > > > > > and decide to reset the connection as in v2. =20 > > > > > > > > > > > > > > > > I think it actually applies to all the retries. > > > > > > > > =20 > > > > > > > > > Anyway, you are right. We don't need to do that. Thanks f= or your patience. > > > > > > > > > =20 > > > > > > > > > > } ... > > > > > > > > > > > > > > > > > > > > if (timerfd_settime(conn->timer, 0, &it, NULL)) > > > > > > > > > > flow_perror(conn, "failed to set timer"= ); > > > > > > > > > > --- =20 --=20 Stefano