From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 8 Oct 2025 00:42:37 +0200
From: Stefano Brivio
To: David Gibson
CC: passt-dev@passt.top
Subject: Re: [PATCH 1/4] tcp: Fix ACK sequence on FIN to tap
Message-ID: <20251008004237.6b50cb0d@elisabeth>
References: <20251002000646.2136202-1-sbrivio@redhat.com>
 <20251002000646.2136202-2-sbrivio@redhat.com>
 <20251002135841.112eb4d3@elisabeth>
 <20251007003219.3f286b1d@elisabeth>
Organization: Red Hat
List-Id: Development discussion and patches for passt

On Tue, 7 Oct 2025 10:31:11 +1100
David Gibson wrote:

> On Tue, Oct 07, 2025 at 12:32:19AM +0200, Stefano Brivio wrote:
> > On Fri, 3 Oct 2025 13:19:17 +1000
> > David Gibson wrote:
> >
> > > On Thu, Oct 02, 2025 at 01:58:41PM +0200, Stefano Brivio wrote:
> > > > On Thu, 2 Oct 2025 12:41:08 +1000
> > > > David Gibson wrote:
> > > >
> > > > > On Thu, Oct 02, 2025 at 02:06:43AM +0200, Stefano Brivio wrote:
> > > > > > If we reach end-of-file on a socket (or get EPOLLRDHUP / EPOLLHUP) and
> > > > > > send a FIN segment to the guest / container acknowledging a sequence
> > > > > > number that's behind what we received so far, we won't have any
> > > > > > further trigger to send an updated ACK segment, as we are now
> > > > > > switching the epoll socket monitoring to edge-triggered mode.
> > > > > >
> > > > > > To avoid this situation, in tcp_update_seqack_wnd(), we set the next
> > > > > > acknowledgement sequence to the current observed sequence, regardless
> > > > > > of what was acknowledged socket-side.
> > > > >
> > > > > To double check my understanding: things should work if we always
> > > > > acknowledged everything we've received. Acknowledging only what the
> > > > > peer has acked is a refinement to give the guest a view that's closer
> > > > > to what it would be end-to-end with the peer (which might improve the
> > > > > operation of flow control).
> > > >
> > > > Right.
> > > >
> > > > > We can't use that refined mechanism when the socket is closing
> > > > > (amongst other cases), because while we can get the peer acked bytes
> > > > > from TCP_INFO, we can't get events when that changes, so we have no
> > > > > mechanism to provide updates to the guest at the right time. So we
> > > > > fall back to the simpler method.
> > > > >
> > > > > Is that correct?
> > > >
> > > > Also correct, yes. If you have a better idea to summarise this in the
> > > > comment in tcp_buf_data_from_sock() let me know.
> > >
> > > Hm, I might. Or actually a way to reorganise the code that I think
> > > will be a bit clearer and probably allow a clearer comment too.
> >
> > I would keep reworks for a later moment. Right now, it's already taking
> > me long enough to find a moment to investigate these issues, write these
> > fixes, and test them.
>
> I mean... the change I'm proposing reduces lines of code (excepting
> the big new comment), makes it easier to reason about and is localised
> to the immediately surrounding code. But whatever, I don't
> particularly care about the order we do things.

Sure, I don't think that other patch is particularly complicated or
might ever become problematic at all, but, from your earlier comment
("reorganise the code", as I mentioned the comment in
tcp_buf_data_from_sock()), I thought you wanted to rework this
particular code in tcp_buf_data_from_sock() at the same time.

I guess it's not the case judging from your most recent reply, though.

-- 
Stefano
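
For readers following the thread, the fallback discussed above can be
sketched roughly as below. This is a simplified illustration under stated
assumptions, not the actual passt code: struct conn_sketch, its fields,
update_seqack() and the peer_bytes_acked parameter (standing in for the
peer-acked byte count read from TCP_INFO) are all hypothetical names, and
seq_init_from_tap is assumed to be the sequence number of the first data
byte from the guest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-connection state (not passt's real struct) */
struct conn_sketch {
	uint32_t seq_init_from_tap;	/* Sequence of first data byte from guest */
	uint32_t seq_from_tap;		/* Next sequence expected from guest */
	uint32_t seq_ack_to_tap;	/* Last sequence acknowledged to guest */
	bool sock_closing;		/* EOF or EPOLLRDHUP / EPOLLHUP was seen */
};

/* Pick the next sequence to acknowledge to the guest.
 * Returns true if the acknowledged sequence moved forward.
 */
bool update_seqack(struct conn_sketch *c, uint32_t peer_bytes_acked)
{
	uint32_t prev = c->seq_ack_to_tap;
	uint32_t next;

	if (c->sock_closing) {
		/* No further events will report peer ACK progress, so
		 * acknowledge everything received from the guest so far.
		 */
		next = c->seq_from_tap;
	} else {
		/* Refined mode: acknowledge only what the peer has acked */
		next = c->seq_init_from_tap + peer_bytes_acked;
	}

	/* Serial-number comparison, so that wrap-around is handled and the
	 * acknowledged sequence never moves backwards.
	 */
	if ((int32_t)(next - prev) > 0)
		c->seq_ack_to_tap = next;

	return c->seq_ack_to_tap != prev;
}
```

With this shape, a FIN sent to the guest once sock_closing is set always
carries an ACK for the latest observed sequence, so no later update is
needed.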