public inbox for passt-dev@passt.top
From: Laurent Vivier <lvivier@redhat.com>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v2 2/3] tcp: Register fds with epoll at flow creation
Date: Wed, 21 Jan 2026 13:18:34 +0100
Message-ID: <887bdb7b-ee0e-4723-889f-a90f158d0073@redhat.com>
In-Reply-To: <20260121124113.56147c36@elisabeth>

On 1/21/26 12:41, Stefano Brivio wrote:
> On Wed, 21 Jan 2026 09:43:55 +0100
> Laurent Vivier <lvivier@redhat.com> wrote:
> 
>> On 1/21/26 09:13, Stefano Brivio wrote:
>>> On Mon, 19 Jan 2026 17:19:14 +0100
>>> Laurent Vivier <lvivier@redhat.com> wrote:
>>>    
>>>> Register connection sockets with epoll using empty events
>>>> (events=0) in tcp_conn_from_tap(), tcp_tap_conn_from_sock()
>>>> and tcp_flow_repair_socket().
>>>>
>>>> This allows tcp_epoll_ctl() to always use EPOLL_CTL_MOD, removing
>>>> the need to check whether fds are already registered. As a result, the
>>>> conditional ADD/MOD logic is no longer needed, simplifying the function.
>>>>
>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>> ---
>>>>    flow.c |  1 +
>>>>    tcp.c  | 46 ++++++++++++++++++++++++----------------------
>>>>    2 files changed, 25 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/flow.c b/flow.c
>>>> index cefe6c8b5b24..532339ce7fe1 100644
>>>> --- a/flow.c
>>>> +++ b/flow.c
>>>> @@ -357,6 +357,7 @@ static void flow_set_state(struct flow_common *f, enum flow_state state)
>>>>     *
>>>>     * Return: true if flow is registered with epoll, false otherwise
>>>>     */
>>>> +/* cppcheck-suppress unusedFunction */
>>>>    bool flow_in_epoll(const struct flow_common *f)
>>>>    {
>>>>    	return f->epollid != EPOLLFD_ID_INVALID;
>>>> diff --git a/tcp.c b/tcp.c
>>>> index 1db861705ddb..29d69354bd94 100644
>>>> --- a/tcp.c
>>>> +++ b/tcp.c
>>>> @@ -528,37 +528,22 @@ static uint32_t tcp_conn_epoll_events(uint8_t events, uint8_t conn_flags)
>>>>    static int tcp_epoll_ctl(struct tcp_tap_conn *conn)
>>>>    {
>>>>    	uint32_t events;
>>>> -	int m;
>>>>    
>>>>    	if (conn->events == CLOSED) {
>>>> -		if (flow_in_epoll(&conn->f)) {
>>>> -			int epollfd = flow_epollfd(&conn->f);
>>>> +		int epollfd = flow_epollfd(&conn->f);
>>>>    
>>>> -			epoll_del(epollfd, conn->sock);
>>>> -			if (conn->timer != -1)
>>>> -				epoll_del(epollfd, conn->timer);
>>>> -		}
>>>> +		epoll_del(epollfd, conn->sock);
>>>> +		if (conn->timer != -1)
>>>> +			epoll_del(epollfd, conn->timer);
>>>>    
>>>>    		return 0;
>>>>    	}
>>>>    
>>>>    	events = tcp_conn_epoll_events(conn->events, conn->flags);
>>>>    
>>>> -	if (flow_in_epoll(&conn->f)) {
>>>> -		m = EPOLL_CTL_MOD;
>>>> -	} else {
>>>> -		flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT);
>>>> -		m = EPOLL_CTL_ADD;
>>>> -	}
>>>> -
>>>> -	if (flow_epoll_set(&conn->f, m, events, conn->sock,
>>>> -			   !TAPSIDE(conn)) < 0) {
>>>> -		int ret = -errno;
>>>> -
>>>> -		if (m == EPOLL_CTL_ADD)
>>>> -			flow_epollid_clear(&conn->f);
>>>> -		return ret;
>>>> -	}
>>>> +	if (flow_epoll_set(&conn->f, EPOLL_CTL_MOD, events, conn->sock,
>>>> +			   !TAPSIDE(conn)) < 0)
>>>> +		return -errno;
>>>>    
>>>>    	return 0;
>>>>    }
>>>> @@ -1710,6 +1695,11 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af,
>>>>    	conn->sock = s;
>>>>    	conn->timer = -1;
>>>>    	conn->listening_sock = -1;
>>>> +	flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT);
>>>> +	if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, TGTSIDE) < 0) {
>>>> +		flow_perror(flow, "Can't register with epoll");
>>>> +		goto cancel;
>>>> +	}
>>>>    	conn_event(c, conn, TAP_SYN_RCVD);
>>>>    
>>>>    	conn->wnd_to_tap = WINDOW_DEFAULT;
>>>> @@ -2433,6 +2423,15 @@ static void tcp_tap_conn_from_sock(const struct ctx *c, union flow *flow,
>>>>    	conn->sock = s;
>>>>    	conn->timer = -1;
>>>>    	conn->ws_to_tap = conn->ws_from_tap = 0;
>>>> +
>>>> +	flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT);
>>>> +	if (flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, s, INISIDE) < 0) {
>>>> +		flow_perror(flow, "Can't register with epoll");
>>>> +		conn_flag(c, conn, CLOSING);
>>>> +		FLOW_ACTIVATE(conn);
>>>> +		return;
>>>> +	}
>>>> +
>>>>    	conn_event(c, conn, SOCK_ACCEPTED);
>>>>    
>>>>    	hash = flow_hash_insert(c, TAP_SIDX(conn));
>>>> @@ -3825,6 +3824,9 @@ int tcp_flow_migrate_target(struct ctx *c, int fd)
>>>>    		return 0;
>>>>    	}
>>>>    
>>>> +	flow_epollid_set(&conn->f, EPOLLFD_ID_DEFAULT);
>>>> +	flow_epoll_set(&conn->f, EPOLL_CTL_ADD, 0, conn->sock, !TAPSIDE(conn));
>>>
>>> Oops, I overlooked this, because I missed re-checking on v2 after David
>>> pointed it out: this causes Coverity Scan to (reasonably, I think)
>>> complain that:
>>>
>>> /home/sbrivio/passt/tcp.c:3693:2:
>>>     Type: Unchecked return value (CHECKED_RETURN)
>>>
>>> /home/sbrivio/passt/tcp.c:3648:2: Unchecked call to function
>>>     1. path: Condition "!(flow = flow_alloc())", taking false branch.
>>> /home/sbrivio/passt/tcp.c:3653:2:
>>>     2. path: Condition "read_all_buf(fd, &t, 112UL /* sizeof (t) */)", taking false branch.
>>> /home/sbrivio/passt/tcp.c:3685:2:
>>>     3. path: Condition "rc = tcp_flow_repair_socket(c, conn)", taking false branch.
>>> /home/sbrivio/passt/tcp.c:3693:2:
>>>     4. path: Condition "!(conn->f.pif[1] == PIF_TAP)", taking true branch.
>>> /home/sbrivio/passt/tcp.c:3693:2:
>>>     5. check_return: Calling "flow_epoll_set" without checking return value (as is done elsewhere 6 out of 7 times).
>>> /home/sbrivio/passt/icmp.c:213:2: Examples where return value from this function is checked
>>>     6. example_checked: Example 1: "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U)" has its value checked in "flow_epoll_set(&pingf->f, 1, EPOLLIN, pingf->sock, 1U) < 0".
>>> /home/sbrivio/passt/tcp.c:1695:2: Examples where return value from this function is checked
>>>     7. example_checked: Example 2: "flow_epoll_set(&conn->f, 1, 0U, s, 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, s, 1U) < 0".
>>> /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked
>>>     8. example_checked: Example 3: "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[1], 1U)".
>>> /home/sbrivio/passt/tcp_splice.c:364:2: Examples where return value from this function is checked
>>>     9. example_checked: Example 4: "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)" has its value checked in "flow_epoll_set(&conn->f, 1, 0U, conn->s[0], 0U)".
>>> /home/sbrivio/passt/udp_flow.c:87:2: Examples where return value from this function is checked
>>>     10. example_checked: Example 5: "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei)" has its value checked in "flow_epoll_set(&uflow->f, 1, EPOLLIN, s, sidei) < 0".
>>>
>>> I don't think it's overly problematic but epoll_ctl() can, in theory,
>>> fail regardless of the arguments.
>>>    
>>
>> Yes, I agree, but the problem is caught later by tcp_epoll_ctl() within
>> tcp_flow_migrate_target_ext().
> 
> I see. It's just not really obvious, and other than that it adds static
> checkers noise that I'm trying to keep to a minimum.
> 
>> Furthermore, to handle the error at this point we would have to return 0
>> (single point of failure) and still execute FLOW_ACTIVATE() anyway. The
>> result would therefore be identical, except that flow_hash_insert() would
>> not be executed on error.
> 
> Well, not calling flow_hash_insert() in that case looks like a
> substantial simplification (for auditing / debugging purposes) to me.
> 
> Given that we already have an early return above on failures from
> tcp_flow_repair_socket(), maybe that could become an 'out' goto label at
> the end, and we would have a similar comment for this other case?
> 
> I'll send a patch in this sense in a bit, unless you see something
> wrong with it.
> 

Yes, that makes sense. Go ahead.

Thanks,
Laurent
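
For reference, the control flow agreed on above could look roughly like the sketch below: both failure cases jump to a shared 'out' label, so flow_hash_insert() is skipped on error while FLOW_ACTIVATE() and "return 0" still run. This is only an illustration, not the actual follow-up patch: struct sketch_conn, its fields, sketch_migrate_target() and the repair_rc / epoll_rc parameters are hypothetical stand-ins for the real passt types and calls, and the "failed" marker is an assumption about how the error would be recorded.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the connection state, not passt's real struct */
struct sketch_conn {
	bool failed;	/* assumption: some marker that the flow must be closed */
	bool hashed;	/* stands in for the effect of flow_hash_insert() */
	bool activated;	/* stands in for the effect of FLOW_ACTIVATE() */
};

/* repair_rc and epoll_rc model the results of tcp_flow_repair_socket() and
 * flow_epoll_set(); in real code they would be the calls themselves.
 */
static int sketch_migrate_target(struct sketch_conn *conn,
				 int repair_rc, int epoll_rc)
{
	if (repair_rc) {
		conn->failed = true;
		goto out;
	}

	if (epoll_rc < 0) {
		/* Checked here, so static checkers see the return value used */
		conn->failed = true;
		goto out;
	}

	conn->hashed = true;	/* only reached if nothing failed */

out:
	/* Single exit: activate the flow and report success to the caller
	 * in every case, as discussed in the thread.
	 */
	conn->activated = true;
	return 0;
}
```

With this shape, a failing epoll registration leaves the flow activated but never inserted in the hash table, which is the simplification discussed above.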

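The idea from the quoted commit message (register the socket with an empty event mask at creation, then only ever use EPOLL_CTL_MOD) can also be tried outside passt with plain epoll calls. The sketch below is a standalone illustration for Linux: sketch_register(), sketch_update() and sketch_demo() are hypothetical names, and a pipe stands in for the connection socket. It does not use passt's flow_epoll_set() wrapper.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: add fd at creation time with an empty event mask.
 * Returns 0 on success, negative errno on failure.
 */
static int sketch_register(int epollfd, int fd)
{
	struct epoll_event ev = { .events = 0, .data.fd = fd };

	if (epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &ev) < 0)
		return -errno;

	return 0;
}

/* Hypothetical helper: every later update can use EPOLL_CTL_MOD, because
 * the fd is known to be registered already.
 */
static int sketch_update(int epollfd, int fd, uint32_t events)
{
	struct epoll_event ev = { .events = events, .data.fd = fd };

	if (epoll_ctl(epollfd, EPOLL_CTL_MOD, fd, &ev) < 0)
		return -errno;

	return 0;
}

/* Returns 0 if registration with events=0 reports nothing for readable
 * data, and a later MOD to EPOLLIN makes the same data visible.
 */
static int sketch_demo(void)
{
	struct epoll_event out;
	int p[2], epollfd, rc = -1;

	if (pipe(p) < 0)
		return -1;

	epollfd = epoll_create1(0);
	if (epollfd < 0)
		goto close_pipe;

	if (write(p[1], "x", 1) != 1)
		goto close_all;

	if (sketch_register(epollfd, p[0]) < 0)
		goto close_all;

	/* Empty mask: pending data is not reported */
	if (epoll_wait(epollfd, &out, 1, 0) != 0)
		goto close_all;

	if (sketch_update(epollfd, p[0], EPOLLIN) < 0)
		goto close_all;

	/* After MOD, the pending byte shows up as one ready event */
	if (epoll_wait(epollfd, &out, 1, 0) != 1)
		goto close_all;

	rc = 0;

close_all:
	close(epollfd);
close_pipe:
	close(p[0]);
	close(p[1]);
	return rc;
}
```

Note that even with an empty mask, epoll still reports EPOLLERR and EPOLLHUP, so an fd registered this way is not completely silent.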
