public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 01/10] tcp: no v6 flag in ref
Date: Tue, 8 Nov 2022 11:35:52 +1100	[thread overview]
Message-ID: <Y2mkaMuVsqN7/c+I@yekko> (raw)
In-Reply-To: <20221107190753.5e1cf0de@elisabeth>

[-- Attachment #1: Type: text/plain, Size: 6087 bytes --]

On Mon, Nov 07, 2022 at 07:07:53PM +0100, Stefano Brivio wrote:
> On Fri,  4 Nov 2022 19:43:24 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > ---
> >  tcp.c        |  9 ++++-----
> >  tcp.h        |  1 -
> >  tcp_splice.c | 13 +++++++------
> >  3 files changed, 11 insertions(+), 12 deletions(-)
> > 
> > diff --git a/tcp.c b/tcp.c
> > index 713248f..3d48d6e 100644
> > --- a/tcp.c
> > +++ b/tcp.c
> > @@ -761,8 +761,7 @@ static int tcp_epoll_ctl(const struct ctx *c, struct tcp_conn *conn)
> >  {
> >  	int m = (conn->flags & IN_EPOLL) ? EPOLL_CTL_MOD : EPOLL_CTL_ADD;
> >  	union epoll_ref ref = { .r.proto = IPPROTO_TCP, .r.s = conn->sock,
> > -				.r.p.tcp.tcp.index = conn - tc,
> > -				.r.p.tcp.tcp.v6 = CONN_V6(conn) };
> > +				.r.p.tcp.tcp.index = conn - tc, };
> >  	struct epoll_event ev = { .data.u64 = ref.u64 };
> >  
> >  	if (conn->events == CLOSED) {
> > @@ -2857,6 +2856,7 @@ static void tcp_conn_from_sock(struct ctx *c, union epoll_ref ref,
> >  	if (c->tcp.conn_count >= TCP_MAX_CONNS)
> >  		return;
> >  
> > +	sa.ss_family = AF_UNSPEC; /* FIXME: stop clang-tidy complaining */
> 
> No need for /* FIXME */ here in my opinion,

"FIXME" maybe isn't quite the right sentiment.  The issue here is I
don't think this should be necessary.  AFAIK sa.ss_family will be
written on any succesful return from accept4(), it's just the
clang-tidy doesn't seem to know that.

> valgrind should also hit
> this

I'm not sure.  Depends whether valgrind's knowledge of syscall
behaviours has the same bug or not.

> -- perhaps move to initialiser though.
> 
> >  	sl = sizeof(sa);
> >  	s = accept4(ref.r.s, (struct sockaddr *)&sa, &sl, SOCK_NONBLOCK);
> >  	if (s < 0)
> > @@ -2868,7 +2868,7 @@ static void tcp_conn_from_sock(struct ctx *c, union epoll_ref ref,
> >  	conn->ws_to_tap = conn->ws_from_tap = 0;
> >  	conn_event(c, conn, SOCK_ACCEPTED);
> >  
> > -	if (ref.r.p.tcp.tcp.v6) {
> > +	if (sa.ss_family == AF_INET6) {
> >  		struct sockaddr_in6 sa6;
> >  
> >  		memcpy(&sa6, &sa, sizeof(sa6));
> > @@ -3147,8 +3147,7 @@ static void tcp_sock_init6(const struct ctx *c, int ns,
> >  			   const struct in6_addr *addr, const char *ifname,
> >  			   in_port_t port)
> >  {
> > -	union tcp_epoll_ref tref = { .tcp.listen = 1, .tcp.outbound = ns,
> > -				     .tcp.v6 = 1 };
> > +	union tcp_epoll_ref tref = { .tcp.listen = 1, .tcp.outbound = ns, };
> 
> Undocumented style rule: no comma at the end, it's post-modern C anyway.

Fixed.

> >  	bool spliced = false, tap = true;
> >  	int s;
> >  
> > diff --git a/tcp.h b/tcp.h
> > index 3fabb5a..85ac750 100644
> > --- a/tcp.h
> > +++ b/tcp.h
> > @@ -45,7 +45,6 @@ union tcp_epoll_ref {
> >  		uint32_t	listen:1,
> >  				splice:1,
> >  				outbound:1,
> > -				v6:1,
> 
> Don't forget to update the comment above.

Fixed.

> >  				timer:1,
> >  				index:20;
> >  	} tcp;
> > diff --git a/tcp_splice.c b/tcp_splice.c
> > index 99c5fa7..1f26ff4 100644
> > --- a/tcp_splice.c
> > +++ b/tcp_splice.c
> > @@ -211,12 +211,10 @@ static int tcp_splice_epoll_ctl(const struct ctx *c,
> >  	int m = (conn->flags & IN_EPOLL) ? EPOLL_CTL_MOD : EPOLL_CTL_ADD;
> >  	union epoll_ref ref_a = { .r.proto = IPPROTO_TCP, .r.s = conn->a,
> >  				  .r.p.tcp.tcp.splice = 1,
> > -				  .r.p.tcp.tcp.index = conn - tc,
> > -				  .r.p.tcp.tcp.v6 = CONN_V6(conn) };
> > +				  .r.p.tcp.tcp.index = conn - tc, };
> >  	union epoll_ref ref_b = { .r.proto = IPPROTO_TCP, .r.s = conn->b,
> >  				  .r.p.tcp.tcp.splice = 1,
> > -				  .r.p.tcp.tcp.index = conn - tc,
> > -				  .r.p.tcp.tcp.v6 = CONN_V6(conn) };
> > +				  .r.p.tcp.tcp.index = conn - tc, };
> 
> Commas.

Fixed.

> 
> >  	struct epoll_event ev_a = { .data.u64 = ref_a.u64 };
> >  	struct epoll_event ev_b = { .data.u64 = ref_b.u64 };
> >  	uint32_t events_a, events_b;
> > @@ -579,12 +577,15 @@ void tcp_sock_handler_splice(struct ctx *c, union epoll_ref ref,
> >  	struct tcp_splice_conn *conn;
> >  
> >  	if (ref.r.p.tcp.tcp.listen) {
> > +		struct sockaddr sa;
> 
> Should this be sockaddr_storage in this case and then cast? I never
> remember.

So, for this patch, no it doesn't, but it will have to change later,
in fact.  The trick here is that we only want the sa_family field, we
don't need the full socket address.  struct sockaddr has that (and
maybe a little extra, but not much).  accept4() will truncate the
socket address when it writes, but that's ok for our purposes.  Using
the short form may mitigate the concern below because there's less
data to copy_to_user().

Of course, when I change this later to add different handling for
IPv4-mapped IPv6 source addresses we will need the full address, so
this will need to be a sockaddr_in6.

> > +		socklen_t sl = sizeof(sa);
> >  		int s;
> >  
> >  		if (c->tcp.splice_conn_count >= TCP_SPLICE_MAX_CONNS)
> >  			return;
> >  
> > -		if ((s = accept4(ref.r.s, NULL, NULL, SOCK_NONBLOCK)) < 0)
> > +		sa.sa_family = AF_UNSPEC; /* FIXME: stop clang-tidy complaining */
> > +		if ((s = accept4(ref.r.s, &sa, &sl, SOCK_NONBLOCK)) < 0)
> 
> Same comment about /* FIXME */ as above.
> 
> This adds a copy_to_user() and microseconds, but it's unavoidable. I
> tried harder to optimise for latency on the spliced path, that's why it
> was NULL here and not above.
> 
> >  			return;
> >  
> >  		if (setsockopt(s, SOL_TCP, TCP_QUICKACK, &((int){ 1 }),
> > @@ -595,7 +596,7 @@ void tcp_sock_handler_splice(struct ctx *c, union epoll_ref ref,
> >  
> >  		conn = CONN(c->tcp.splice_conn_count++);
> >  		conn->a = s;
> > -		conn->flags = ref.r.p.tcp.tcp.v6 ? SOCK_V6 : 0;
> > +		conn->flags = (sa.sa_family == AF_INET6) ? SOCK_V6 : 0;
> >  
> >  		if (tcp_splice_new(c, conn, ref.r.p.tcp.tcp.index,
> >  				   ref.r.p.tcp.tcp.outbound))
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2022-11-08  1:02 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04  8:43 [PATCH 00/10] RFC: Preliminaries for using share IPv4 & IPv6 sockets David Gibson
2022-11-04  8:43 ` [PATCH 01/10] tcp: no v6 flag in ref David Gibson
2022-11-07 18:07   ` Stefano Brivio
2022-11-08  0:35     ` David Gibson [this message]
2022-11-04  8:43 ` [PATCH 02/10] tcp: Helper to encode IPv4-mapped IPv6 addresses David Gibson
2022-11-07 18:08   ` Stefano Brivio
2022-11-08  0:46     ` David Gibson
2022-11-04  8:43 ` [PATCH 03/10] tcp: Partially unify IPv4 and IPv6 paths in tcp_hash_match() David Gibson
2022-11-07 18:08   ` Stefano Brivio
2022-11-08  0:51     ` David Gibson
2022-11-04  8:43 ` [PATCH 04/10] tcp: Hash IPv4 and IPv4-mapped-IPv6 addresses the same David Gibson
2022-11-04  8:43 ` [PATCH 05/10] tcp: Take tcp_hash_insert() address from struct tcp_conn David Gibson
2022-11-04  8:43 ` [PATCH 06/10] tcp: Unify IPv4 and IPv6 paths for hashing and matching David Gibson
2022-11-04  8:43 ` [PATCH 07/10] tcp: Remove ugly address union from struct tcp_conn David Gibson
2022-11-07 18:08   ` Stefano Brivio
2022-11-08  0:54     ` David Gibson
2022-11-04  8:43 ` [PATCH 08/10] tcp: Unify initial sequence numbers for IPv4 and IPv6 David Gibson
2022-11-04  8:43 ` [PATCH 09/10] tcp: Have tcp_seq_init() take its parameters from struct tcp_conn David Gibson
2022-11-04  8:43 ` [PATCH 10/10] tcp: Fix small error in tcp_seq_init() time handling David Gibson
2022-11-07 18:08   ` Stefano Brivio
2022-11-08  0:59     ` David Gibson
2022-11-04  8:47 ` [PATCH 00/10] RFC: Preliminaries for using share IPv4 & IPv6 sockets Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y2mkaMuVsqN7/c+I@yekko \
    --to=david@gibson.dropbear.id.au \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).