From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=WkvSAaxe; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by passt.top (Postfix) with ESMTPS id 354665A0624 for ; Wed, 21 Jan 2026 12:41:21 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1768995680; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=N+yU/jFWnJr5bbgs08KrgT6tTBYj6vye8S4gRrSX5/w=; b=WkvSAaxe5ZSTZGX2Hqs3jPhcaK2mpM2cXQJ6Ba2mZ6H1/TC8q8VDzMLXXH1rwKo1TX6C1j XLtegSrihCsBbiaQaDOfGW4cnvElkCcnFmt9HerqeOgafuQAuLBbcmnpKIGlb+m9XoRoYC Sp1IOMEVfvbstqyRYUYU/XbJr7/peMk= Received: from mail-wr1-f72.google.com (mail-wr1-f72.google.com [209.85.221.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-104-2E8RMRfVNsS89TLvJq13yw-1; Wed, 21 Jan 2026 06:41:18 -0500 X-MC-Unique: 2E8RMRfVNsS89TLvJq13yw-1 X-Mimecast-MFC-AGG-ID: 2E8RMRfVNsS89TLvJq13yw_1768995678 Received: by mail-wr1-f72.google.com with SMTP id ffacd0b85a97d-430fcfe4494so6300201f8f.2 for ; Wed, 21 Jan 2026 03:41:18 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1768995677; x=1769600477; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=N+yU/jFWnJr5bbgs08KrgT6tTBYj6vye8S4gRrSX5/w=; b=CMHo/XVh7wAvjo2KUel6yXDpHHddjJ+VUiRoEANg1CNZmUldWXY0dZjwldFwapNT1o AuaclK1PCRZMCE65e8Igt2pjtuJ6K4GxBsu4oxZtX0SV1tqplY6nhQ+XdQ+oL0NuAUcQ zKmbpYRdLfA83oSDa7Adauaq38Gt5ti7FlEUETf1mh27hn7gRXDlyFv/uHv6ypjbJlfI 5a4WXKRVCfWrwjWLCvuID0e2d18A9rLhPEtzPeYT2/HUKA6zz0qkVrDNr1VIbkcWYMWk GBCrO+lCSgYG9HynKnZcF5c5P2h5PT2OitdIWAxU2iPY8z4Qbu9+LSILz7RZl0QTmoSx T0BQ== X-Gm-Message-State: AOJu0YzHkYIE67bHXsZViyA+3skd45CTuMF0lZUnTit2WC71KcOM+7eF jEy3KwUYRxKtsDMCoJ271ww4FwGGXACRHGNmFguU1KXcCxtGqwiWh+Zz/zAJjwdbUDl+BjEfwl0 VBqydTe9Z8do93Zvr6bsxoMK1c1BhNZh6g8R12SyGisIYh+Bu5aXU4w== X-Gm-Gg: AZuq6aKMgo4TtG2+9szt9ebAA3ckN4USgHNIpkalTt2GghV0f1aWxDKLPbsHpLwFT48 qnX1LMtT08+bYvTxnajR7yuk/5CEvVADdoNZHbx9UCdyxGgC50f9UyGZwPpc33B36xxF86DozKn sTX1p+BuMxXJOBSxBoEdTDh4/0b4Kf5ZUbxRQk/k9ALFXOHgf4jOebyZ7O6FbyqVQmT3g9Y7r7A /CHZVvC+VlGAy4q9XZFHVQ6vgfv0jNShXvebP+gqwdUVC6RLDTGYXEwzkPfKc+XDHjtpay42g0g ee8I4xN12vzF2HILHczGEW91GhxxPYGHFT6D4kTH6J2fM+Vl/VSCw05TPHSwZavJIXZd0B8FEpz 1QBOLIda+OfyvC9vindfW2TvFMMSjSr59eywBJw== X-Received: by 2002:a05:6000:2282:b0:431:abb:942f with SMTP id ffacd0b85a97d-4356a083757mr26586943f8f.54.1768995677472; Wed, 21 Jan 2026 03:41:17 -0800 (PST) X-Received: by 2002:a05:6000:2282:b0:431:abb:942f with SMTP id ffacd0b85a97d-4356a083757mr26586883f8f.54.1768995676899; Wed, 21 Jan 2026 03:41:16 -0800 (PST) Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [176.103.220.4]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4358f138e26sm10937038f8f.17.2026.01.21.03.41.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jan 2026 03:41:16 -0800 (PST) Date: Wed, 21 Jan 2026 12:41:13 +0100 From: Stefano Brivio To: Laurent Vivier Subject: Re: [PATCH v2 2/3] tcp: Register fds with epoll at flow creation Message-ID: <20260121124113.56147c36@elisabeth> In-Reply-To: References: <20260119161915.4014677-1-lvivier@redhat.com> <20260119161915.4014677-3-lvivier@redhat.com> <20260121091321.22fbb246@elisabeth> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.49; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: RXldPfsuc3FejTpu8ifFivNhWv5ORthrbmDp2NwOAWU_1768995678 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: MPSVJ47RLOV6ZSV3OO4B2PKXT7B6D73Z X-Message-ID-Hash: MPSVJ47RLOV6ZSV3OO4B2PKXT7B6D73Z X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, David Gibson X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Wed, 21 Jan 2026 09:43:55 +0100 Laurent Vivier wrote: > On 1/21/26 09:13, Stefano Brivio wrote: > > On Mon, 19 Jan 2026 17:19:14 +0100 > > Laurent Vivier wrote: > > > >> Register connection sockets with epoll using empty events > >> (events=0) in tcp_conn_from_tap(), tcp_tap_conn_from_sock() > >> and tcp_flow_repair_socket(). > >> > >> This allows tcp_epoll_ctl() to always use EPOLL_CTL_MOD, removing > >> the need to check whether fds are already registered. As a result, the > >> conditional ADD/MOD logic is no longer needed, simplifying the function. > >> > >> Signed-off-by: Laurent Vivier > >> --- > >> flow.c | 1 + > >> tcp.c | 46 ++++++++++++++++++++++++---------------------- > >> 2 files changed, 25 insertions(+), 22 deletions(-) > >> > >> diff --git a/flow.c b/flow.c > >> index cefe6c8b5b24..532339ce7fe1 100644 > >> --- a/flow.c > >> +++ b/flow.c > >> @@ -357,6 +357,7 @@ static void flow_set_state(struct flow_common *f, enum flow_state state) > >> * > >> * Return: true if flow is registered with epoll, false otherwise > >> */ > >> +/* cppcheck-suppress unusedFunction */ > >> bool flow_in_epoll(const struct flow_common *f) > >> { > >> return f->epollid != EPOLLFD_ID_INVALID; > >> diff --git a/tcp.c b/tcp.c > >> index 1db861705ddb..29d69354bd94 100644 > >> --- a/tcp.c > >> +++ b/tcp.c > >> @@ -528,37 +528,22 @@ static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags) > >> static int tcp_epoll_ctl(struct tcp_tap_conn *conn) > >> { > >> uint32_t events; > >> - int m; > >> > >> if (conn->events == CLOSED) { > >> - if (flow_in_epoll(&conn->f)) { > >> - int epollfd = flow_epollfd(&conn->f); > >> + int epollfd = flow_epollfd(&conn->f); > >> > >> - epoll_del(epollfd, conn->sock); > >> - if (conn->timer != -1) > >> - epoll_del(epollfd, conn->timer); > >> - } > >> + epoll_del(epollfd, conn->sock); > >> + if (conn->timer != -1) > >> + epoll_del(epollfd, conn->timer); > >> > >> return 0; > >> } > >> > >> events = tcp_conn_epoll_events(conn->events, conn->flags); > >> > >> - if (flow_in_epoll(&conn->f)) { > >> - m = EPOLL_CTL_MOD; > >> - } else { > >> - flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > >> - m = EPOLL_CTL_ADD; > >> - } > >> - > >> - if (flow_epoll_set(&conn->f, m, events, conn->sock, > >> - !TAPSIDE(conn)) < 0) { > >> - int ret = -errno; > >> - > >> - if (m == EPOLL_CTL_ADD) > >> - flow_epollid_clear(&conn->f); > >> - return ret; > >> - } > >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events, conn->sock, > >> + !TAPSIDE(conn)) < 0) > >> + return -errno; > >> > >> return 0; > >> } > >> @@ -1710,6 +1695,11 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, > >> conn->sock = s; > >> conn->timer = -1; > >> conn->listening_sock = -1; > >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, TGTSIDE) < 0) { > >> + flow_perror(flow, "Can't register with epoll"); > >> + goto cancel; > >> + } > >> conn_event(c, conn, TAP_SYN_RCVD); > >> > >> conn->wnd_to_tap = WINDOW_DEFAULT; > >> @@ -2433,6 +2423,15 @@ static void tcp_tap_conn_from_sock(const struct ctx *c, union flow *flow, > >> conn->sock = s; > >> conn->timer = -1; > >> conn->ws_to_tap = conn->ws_from_tap = 0; > >> + > >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > >> + if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, INISIDE) < 0) { > >> + flow_perror(flow, "Can't register with epoll"); > >> + conn_flag(c, conn, CLOSING); > >> + FLOW_ACTIVATE(conn); > >> + return; > >> + } > >> + > >> conn_event(c, conn, SOCK_ACCEPTED); > >> > >> hash = flow_hash_insert(c, TAP_SIDX(conn)); > >> @@ -3825,6 +3824,9 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) > >> return 0; > >> } > >> > >> + flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT); > >> + flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->sock, !TAPSIDE(conn)); > > > > Oops, I overlooked this, because I missed re-checking on v2 after David > > pointed it out: this causes Coverity Scan to (reasonably, I think) > > complain that: > > > > /home/sbrivio/passt/tcp.c:3693:2: > > Type: Unchecked return value (CHECKED_RETURN) > > > > /home/sbrivio/passt/tcp.c:3648:2: Unchecked call to function > > 1. path: Condition "!(flow = flow_alloc())", taking false branch. > > /home/sbrivio/passt/tcp.c:3653:2: > > 2. path: Condition "read_all_buf(fd, &t, 112UL /* sizeof (t) */)", taking false branch. > > /home/sbrivio/passt/tcp.c:3685:2: > > 3. path: Condition "rc = tcp_flow_repair_socket(c, conn)", taking false branch. > > /home/sbrivio/passt/tcp.c:3693:2: > > 4. path: Condition "!(conn->f.pif[1] == PIF_TAP)", taking true branch. > > /home/sbrivio/passt/tcp.c:3693:2: > > 5. check_return: Calling "flow_epoll_set" without checking return value (as is done elsewhere 6 out of 7 times). > > /home/sbrivio/passt/icmp.c:213:2: Examples where return value from this function is checked > > 6. example_checked: Example 1: "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U)" has its value checked in "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U) < 0". > > /home/sbrivio/passt/tcp.c:1695:2: Examples where return value from this function is checked > > 7. example_checked: Example 2: "flow_epoll_set(&conn->f, 1, 0U, s, 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, s, 1U) < 0". > > /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked > > 8. example_checked: Example 3: "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)". > > /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked > > 9. example_checked: Example 4: "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)". > > /home/sbrivio/passt/udp_flow.c:87:2: Examples where return value from this function is checked > > 10. example_checked: Example 5: "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei)" has its value checked in "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei) < 0". > > > > I don't think it's overly problematic but epoll_ctl() can, in theory, > > fail regardless of the arguments. > > > > Yes, I agree, but the problem is catched later by tcp_epoll_ctl() within > tcp_flow_migrate_target_ext(). I see. It's just not really obvious, and other than that it adds static checkers noise that I'm trying to keep to a minimum. > Furthermore, at this point, to handle the error we have to return 0 (single point of > failure) and still execute FLOW_ACTIVATE() anyway. The result is therefore identical (only > flow_hash_insert() is not executed in the absence of an error). Well, not calling flow_hash_insert() in that case looks like a substantial simplification (for auditing / debugging purposes) to me. Given that we already have an early return above on failures from tcp_flow_repair_socket(), maybe that could become an 'out' goto label at the end, and we would have a similar comment for this other case? I'll send a patch in this sense in a bit, unless you see something wrong with it. -- Stefano