From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=B0yahCuL; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTPS id 77B195A026F for ; Thu, 09 Oct 2025 21:28:59 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1760038138; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mJMoNbIx4IDniYgQrEBYyBb55oz2zKLAnaGfaJh7DKM=; b=B0yahCuLdD55O+WuxNfGfntkysqAF20Kb3qU85nxDV2xHG8vZ9e0nIwwnbqVQT0+ZuhCKb thDgt67KkFQuzFY3K7Y4iLKx1syea0JjMOghPwuwxbor32jcG36cGyni/Atk4ADcdWOCoM e4vj3AtuI2tIRv/7Nt4x3kkM7dtkLyY= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-423-5G22VdQ5OBKn29rddvY7DQ-1; Thu, 09 Oct 2025 15:28:57 -0400 X-MC-Unique: 5G22VdQ5OBKn29rddvY7DQ-1 X-Mimecast-MFC-AGG-ID: 5G22VdQ5OBKn29rddvY7DQ_1760038135 Received: by mail-wm1-f70.google.com with SMTP id 5b1f17b1804b1-46e4fa584e7so6691405e9.0 for ; Thu, 09 Oct 2025 12:28:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1760038134; x=1760642934; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=mJMoNbIx4IDniYgQrEBYyBb55oz2zKLAnaGfaJh7DKM=; b=ZavvI3Z+cjhwJDkmM90kd6Jusd50MMoJ5AXJMTz7DvejhhmFGXa1//vHCxsNnMTmPx egQ8EnWO2143j0QBrDSYaMIWJ3RF44C3IbLSAjCKUTm+38hne44vJu3t7hUYIV0TVdZi Qt1jBug03ivRr7ehcN/Ckv2Vm7xQBGMK2dy867YJ4DYU6AgMH8fH+sVPoDqWWj311Zk+ hBTfGsZ5Dj3KXFgALLiR0ZSR1tA40+TVVrIbd94Pol2gZaqz+i+GS0LIdKHYPbKwWbRZ 732TJVdao8UiDawcRxSFDvkzxldIV9hBj8OGjZXmyuRpLOK9SEr6B4rJC0eyV6bMA7/3 Yb9A== X-Gm-Message-State: AOJu0YzhqhuLGQEdTlxtWtrYrfKektu1Uy5SfbYaLVGg1qPkVWbz8q6k pUzTFVJjkRMxYiRME/1ofHMs1VIkB04CbaOKseANxYj8c1moyvTYiBIgdwG2hY8Z+AtuwijreNt LvyGpcLARXPZ2WmPqJQGImaOjJ3sU9GJfOTvi6YL2KbG4bKI0Yf4zUvZ4rH0qdg== X-Gm-Gg: ASbGncse0+8Zh04vkc2PH+oEFz2PYlm9h4NfxQlGzf/CAfEAbQsbFcONr5GK4CVrpGD UIgWKQCuODEhBsHq4rWi93u31tWWd6j9U04FSSlpFnS/IIHnMOOQbaH2YTJAn0ZcWBn9GxVGMkm MXrqx7Nkxp5HZm+HWwwptXperHJoxA5rO2Pr48dbRUd7eBRhBqxIy/MoF8/t4Fm5wHv1I5J24RJ QcVQSa598iYWsRbAcbKYENVok2VMWI5ENnXzBYlE3/xit6aAb1w2Vg7FBBAJgovz2X/ai6Q+Ofp KLRX7Au+O+dXC8Ulx4cvfLAOIoX5vdOj+jrdFwN3gxHygGWPqLt3LcSw X-Received: by 2002:a05:600d:42f2:b0:46e:4341:7302 with SMTP id 5b1f17b1804b1-46fb15396d6mr20075335e9.34.1760038134284; Thu, 09 Oct 2025 12:28:54 -0700 (PDT) X-Google-Smtp-Source: AGHT+IErpgw9HL9qALNg6p8nGNs8c5WjphE3rWeHR31HUxPsY4X8z6ZCAD7rsYCskUsZ4NTLLnvjww== X-Received: by 2002:a05:600d:42f2:b0:46e:4341:7302 with SMTP id 5b1f17b1804b1-46fb15396d6mr20075225e9.34.1760038133709; Thu, 09 Oct 2025 12:28:53 -0700 (PDT) Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [2a10:fc81:a806:d6a9::1]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-46fb49bd977sm9187225e9.11.2025.10.09.12.28.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 09 Oct 2025 12:28:52 -0700 (PDT) Date: Thu, 9 Oct 2025 21:28:51 +0200 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH 3/4] tcp: Don't consider FIN flags with mismatching sequence Message-ID: <20251009212851.64af22d2@elisabeth> In-Reply-To: References: <20251002000646.2136202-4-sbrivio@redhat.com> <20251002135104.1a662029@elisabeth> <20251007003254.57ce5e26@elisabeth> <20251008004249.6dc894fa@elisabeth> <20251008120127.75aa7585@elisabeth> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.49; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: K6vl4xb9O9Qgw1AK3soD4-KDyO9t25foFCvqqXX6A0Y_1760038135 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: BKZ5AINCOUZCJS77YKIA55JC47PJUHJG X-Message-ID-Hash: BKZ5AINCOUZCJS77YKIA55JC47PJUHJG X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Thu, 9 Oct 2025 13:43:00 +1100 David Gibson wrote: > On Wed, Oct 08, 2025 at 12:01:27PM +0200, Stefano Brivio wrote: > > On Wed, 8 Oct 2025 11:41:23 +1100 > > David Gibson wrote: > > > > > On Wed, Oct 08, 2025 at 12:42:49AM +0200, Stefano Brivio wrote: > > > > On Tue, 7 Oct 2025 10:34:01 +1100 > > > > David Gibson wrote: > > > > > On Tue, Oct 07, 2025 at 12:32:54AM +0200, Stefano Brivio wrote: > > > > > > On Fri, 3 Oct 2025 13:43:32 +1000 > > > > > > David Gibson wrote: > > > [snip] > > > > > > > @@ -1820,6 +1817,8 @@ static int tcp_data_from_tap(const struct ctx *c, struct tcp_tap_conn *conn, > > > > > > > break; > > > > > > > seq_from_tap += size; > > > > > > > iov_i += count; > > > > > > > + if (th->fin) > > > > > > > + fin = 1; > > > > > > > > > > > > > > if (keep == i) > > > > > > > keep = -1; > > > > > > > > > > > > > > > > > > > > > We'd need to double check that the "accept data segment" path is safe > > > > > > > with len == 0, of course. > > > > > > > > > > > > For sure it's not before d2c33f45f7be ("tcp: Convert > > > > > > tcp_data_from_tap() to use iov_tail"), because we might add > > > > > > zero-length segments to the tcp_iov array, and that would make > > > > > > backporting an otherwise simple and critical fix to slightly older > > > > > > versions rather complicated. > > > > > > > > > > Kinda. It's not that complicated to deal with that case, by wrapping > > > > > the actual data processing in an `if (len) { ... }` > > > > > > > > That's needed for sure, but do we risk looping forever on a particular > > > > batch of FIN segments without data with a given series of sequence > > > > numbers? > > > > > > > > Right now the loop terminates because the sequence moves forward. I'm > > > > not sure what happens without data while moving 'keep' around. Maybe it > > > > takes a few minutes to figure out (I haven't tried) but I wouldn't call > > > > that trivial. > > > > > > That should be fine, because we need to advance the sequence by one > > > for the FIN anyway, so we will move forwards. I might have forgotten > > > that in my quick example, which is a bug, but an easy one to fix. > > > > But we don't, in that loop, and there's a specific reason for it. FIN > > segments are a special case in that, if you receive more than one, you > > don't advance the sequence by more than one, even if the segments > > themselves are in the expected sequence. > > Ok. *thinks*. So, if we both set fin = 1 and advance the sequence > inside an if (th->fin && !fin), that should do the trick, yes? Yes, but mind that we must then rewind the sequence on 'partial_send' as well as on any non-retried sendmsg() failure because we want to pretend, in those cases, that we never received a FIN (we won't set TAP_FIN_RCVD either). I think it will be less awkward and probably more robust altogether to make sure that the 'keep' mechanism can't be fooled by FIN segments with looping sequences. Actually, now that we have packet pools (I didn't have them when I first implemented this logic in dc169643a457 "tcp: Full batched processing for tap messages"), do we want to try and reorder segments first, separately? It's not great for cache locality but perhaps worth at least profiling, and it would probably make things simpler. In that case, though, maybe it's appropriate to have a complete, less intrusive fix for out-of-sequence FIN handling. If we advance the sequence there, and rewind it later, I think it's not that bad as an intermediate step. > > If you want to move the sequence increase into that loop (which might > > make sense with some extra care), perhaps it's worth doing that > > together with the whole https://bugs.passt.top/show_bug.cgi?id=125 at > > this point. > > Yeah, maybe. By the way a separate reordering step should make this easier as well, but still I'm not sure it's justified, I think it depends on what perf(1) says about it. -- Stefano