From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 15 May 2024 22:20:58 +0200
From: Stefano Brivio
To: Jon Maloy
Subject: Re: [PATCH v4 1/3] tcp: move seq_to_tap update to when frame is queued
Message-ID: <20240515222058.5252e370@elisabeth>
In-Reply-To: <20240515153429.859185-2-jmaloy@redhat.com>
References: <20240515153429.859185-1-jmaloy@redhat.com> <20240515153429.859185-2-jmaloy@redhat.com>
Organization: Red Hat
CC: passt-dev@passt.top, lvivier@redhat.com, dgibson@redhat.com
List-Id: Development
 discussion and patches for passt

On Wed, 15 May 2024 11:34:27 -0400
Jon Maloy wrote:

> commit a469fc393fa1 ("tcp, tap: Don't increase tap-side sequence counter for dropped frames")
> delayed update of conn->seq_to_tap until the moment the corresponding
> frame has been successfully pushed out. This has the advantage that we
> immediately can make a new attempt to transmit a frame after a failed
> transmit, rather than waiting for the peer to later discover a gap and
> trigger the fast retransmit mechanism to solve the problem.
>
> This approach has turned out to cause a problem with spurious sequence
> number updates during peer-initiated retransmits, and we have realized
> it may not be the best way to solve the above issue.
>
> We now restore the previous method, by updating the said field at the
> moment a frame is added to the outqueue. To retain the advantage of
> having a quick re-attempt based on local failure detection, we now scan
> through the part of the outqueue that had to be dropped, and restore the
> sequence counter for each affected connection to the most appropriate
> value.
>
> Signed-off-by: Jon Maloy
>
> ---
> v2: - Re-spun loop in tcp_revert_seq() and some other changes based on
>       feedback from Stefano Brivio.
>     - Added paranoid test to avoid that seq_to_tap becomes lower than
>       seq_ack_from_tap.
>
> v3: - Identical to v2. Called v3 because it was embedded in a series
>       with that version.
>
> v4: - In tcp_revert_seq(), we read the sequence number from the TCP
>       header instead of keeping a copy in struct tcp_buf_seq_update.
>     - Since the only remaining field in struct tcp_buf_seq_update is
>       a pointer to struct tcp_tap_conn, we eliminate the struct
>       altogether, and make the tcp6/tcp4_seq_update arrays into
>       arrays of said pointer.
>     - Removed 'paranoid' test in tcp_revert_seq.
>       If it happens, it is not fatal, and will be caught by other code
>       anyway.
>     - Separated from the series again.
> ---
>  tcp.c | 59 +++++++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 37 insertions(+), 22 deletions(-)
>
> diff --git a/tcp.c b/tcp.c
> index 21d0af0..976dba8 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -410,16 +410,6 @@ static int tcp_sock_ns [NUM_PORTS][IP_VERSIONS];
>   */
>  static union inany_addr low_rtt_dst[LOW_RTT_TABLE_SIZE];
>
> -/**
> - * tcp_buf_seq_update - Sequences to update with length of frames once sent
> - * @seq:	Pointer to sequence number sent to tap-side, to be updated
> - * @len:	TCP payload length
> - */
> -struct tcp_buf_seq_update {
> -	uint32_t *seq;
> -	uint16_t len;
> -};
> -
>  /* Static buffers */
>  /**
>   * struct tcp_payload_t - TCP header and data to send segments with payload
> @@ -461,7 +451,8 @@ static struct tcp_payload_t tcp4_payload[TCP_FRAMES_MEM];
>
>  static_assert(MSS4 <= sizeof(tcp4_payload[0].data), "MSS4 is greater than 65516");
>
> -static struct tcp_buf_seq_update tcp4_seq_update[TCP_FRAMES_MEM];
> +/* References tracking the owner connection of frames in the tap outqueue */
> +static struct tcp_tap_conn *tcp4_frame_conns[TCP_FRAMES_MEM];
>  static unsigned int tcp4_payload_used;
>
>  static struct tap_hdr tcp4_flags_tap_hdr[TCP_FRAMES_MEM];
> @@ -483,7 +474,8 @@ static struct tcp_payload_t tcp6_payload[TCP_FRAMES_MEM];
>
>  static_assert(MSS6 <= sizeof(tcp6_payload[0].data), "MSS6 is greater than 65516");
>
> -static struct tcp_buf_seq_update tcp6_seq_update[TCP_FRAMES_MEM];
> +/* References tracking the owner connection of frames in the tap outqueue */
> +static struct tcp_tap_conn *tcp6_frame_conns[TCP_FRAMES_MEM];
>  static unsigned int tcp6_payload_used;
>
>  static struct tap_hdr tcp6_flags_tap_hdr[TCP_FRAMES_MEM];
> @@ -1261,25 +1253,49 @@ static void tcp_flags_flush(const struct ctx *c)
>  	tcp4_flags_used = 0;
>  }
>
> +/**
> + * tcp_revert_seq() - Revert affected conn->seq_to_tap after failed transmission
> + * @conns:	Array of connection pointers corresponding to queued frames
> + * @frames:	Two-dimensional array containing queued frames with sub-iovs
> + * @num_frames:	Number of entries in the two arrays to be compared
> + */
> +static void tcp_revert_seq(struct tcp_tap_conn **conns, struct iovec *frames,
> +			   int num_frames)
> +{
> +	int c, f;
> +
> +	for (c = 0, f = 0; c < num_frames; c++, f += TCP_NUM_IOVS) {
> +		struct tcp_tap_conn *conn = conns[c];
> +		struct tcphdr *th = frames[f + TCP_IOV_PAYLOAD].iov_base;
> +		uint32_t seq = ntohl(th->seq);
> +
> +		if (SEQ_LE(conn->seq_to_tap, seq))
> +			continue;
> +
> +		conn->seq_to_tap = seq;
> +	}
> +}
> +
>  /**
>   * tcp_payload_flush() - Send out buffers for segments with data
>   * @c:		Execution context
>   */
>  static void tcp_payload_flush(const struct ctx *c)
>  {
> -	unsigned i;
>  	size_t m;
>
>  	m = tap_send_frames(c, &tcp6_l2_iov[0][0], TCP_NUM_IOVS,
>  			    tcp6_payload_used);
> -	for (i = 0; i < m; i++)
> -		*tcp6_seq_update[i].seq += tcp6_seq_update[i].len;
> +	if (m != tcp6_payload_used)
> +		tcp_revert_seq(tcp6_frame_conns, &tcp6_l2_iov[m][0],
> +			       tcp6_payload_used - m);

Nit, not worth respinning, and I can fix this up on merge: we always use
curly brackets around multiple lines, even if it's a single statement,
consistently with the current Linux kernel coding style.

>  	tcp6_payload_used = 0;
>
>  	m = tap_send_frames(c, &tcp4_l2_iov[0][0], TCP_NUM_IOVS,
>  			    tcp4_payload_used);
> -	for (i = 0; i < m; i++)
> -		*tcp4_seq_update[i].seq += tcp4_seq_update[i].len;
> +	if (m != tcp4_payload_used)
> +		tcp_revert_seq(tcp4_frame_conns, &tcp4_l2_iov[m][0],
> +			       tcp4_payload_used - m);

Same here.

-- 
Stefano
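
For readers following along, here is a minimal, standalone sketch of the
two points discussed in this message: reverting a per-connection sequence
counter for frames that were queued but not sent, and the brace style
asked for above when a single statement spans multiple lines. Everything
prefixed with sketch_ is hypothetical and not taken from tcp.c. The
function in the patch reads the sequence number from the TCP header in the
frame's iovec, whereas this sketch takes it from a plain array. The
wrap-safe comparison uses the common signed-difference idiom as a stand-in
for SEQ_LE(), whose definition is not shown in the patch. The sketch also
offsets both arrays by the number of frames sent, which is an assumption
of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the connection state: only the counter
 * discussed in this thread is modelled.
 */
struct sketch_conn {
	uint32_t seq_to_tap;
};

/* Wrap-safe "a <= b" for 32-bit sequence numbers, using the common
 * signed-difference idiom. Assumed for this sketch, not copied from tcp.c.
 */
static int sketch_seq_le(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) <= 0;
}

/* For each frame that was queued but not sent, pull the owning
 * connection's counter back to that frame's sequence number, unless the
 * counter is already at or below it.
 */
static void sketch_revert_seq(struct sketch_conn **conns,
			      const uint32_t *frame_seq, size_t num_frames)
{
	size_t i;

	assert(conns && frame_seq);

	for (i = 0; i < num_frames; i++) {
		struct sketch_conn *conn = conns[i];

		if (sketch_seq_le(conn->seq_to_tap, frame_seq[i]))
			continue;

		conn->seq_to_tap = frame_seq[i];
	}
}

/* Caller, with curly brackets around the multi-line single statement.
 * Both arrays are offset by the number of frames actually sent.
 */
static void sketch_flush(struct sketch_conn **conns, const uint32_t *frame_seq,
			 size_t queued, size_t sent)
{
	if (sent != queued) {
		sketch_revert_seq(conns + sent, frame_seq + sent,
				  queued - sent);
	}
}
```

If a connection owns several of the unsent frames, its counter ends up at
the lowest sequence number among them, because each step can only move the
counter backwards. If all queued frames were sent, nothing is touched.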