From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 7/8] udp: Decide whether to "splice" per datagram rather than per socket
Date: Wed, 14 Dec 2022 11:35:54 +0100 [thread overview]
Message-ID: <20221214113554.29c0c196@elisabeth> (raw)
In-Reply-To: <Y5krLbRLTmtpJjaE@yekko>
On Wed, 14 Dec 2022 12:47:25 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Tue, Dec 13, 2022 at 11:49:18PM +0100, Stefano Brivio wrote:
> > On Mon, 5 Dec 2022 19:14:24 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > Currently we have special sockets for receiving datagrams from locahost
> > > which can use the optimized "splice" path rather than going across the tap
> > > interface.
> > >
> > > We want to loosen this so that sockets can receive sockets that will be
> > > forwarded by both the spliced and non-spliced paths. To do this, we alter
> > > the meaning of the @splice bit in the reference to mean that packets
> > > receieved on this socket *can* be spliced, not that they *will* be spliced.
> > > They'll only actually be spliced if they come from 127.0.0.1 or ::1.
> > >
> > > We can't (for now) remove the splice bit entirely, unlike with TCP. Our
> > > gateway mapping means that if the ns initiates communication to the gw
> > > address, we'll translate that to target 127.0.0.1 on the host side. Reply
> > > packets will therefore have source address 127.0.0.1 when received on the
> > > host, but these need to go via the tap path where that will be translated
> > > back to the gateway address. We need the @splice bit to distinguish that
> > > case from packets going from localhost to a port mapped explicitly with
> > > -u which should be spliced.
> > >
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > ---
> > > udp.c | 54 +++++++++++++++++++++++++++++++++++-------------------
> > > udp.h | 2 +-
> > > 2 files changed, 36 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/udp.c b/udp.c
> > > index 6ccfe8c..011a157 100644
> > > --- a/udp.c
> > > +++ b/udp.c
> > > @@ -513,16 +513,27 @@ static int udp_splice_new_ns(void *arg)
> > > }
> > >
> > > /**
> > > - * sa_port() - Determine port from a sockaddr_in or sockaddr_in6
> > > + * udp_mmh_splice_port() - Is source address of message suitable for splicing?
> > > * @v6: Is @sa a sockaddr_in6 (otherwise sockaddr_in)?
> > > - * @sa: Pointer to either sockaddr_in or sockaddr_in6
> > > + * @mmh: mmsghdr of incoming message
> > > + *
> > > + * Return: if @sa refers to localhost (127.0.0.1 or ::1) the port from
> > > + * @sa, otherwise 0.
> > > + *
> > > + * NOTE: this relies on the fact that it's not valid to use UDP port 0
> >
> > The port is reserved by IANA indeed, but... it can actually be used. On
> > Linux, you can bind() it and you can connect() to it. As far as I can
> > tell from the new version of udp_sock_handler() we would actually
> > misdirect packets in that case.
>
> Hm, ok. Given the IANA reservation, I think it would be acceptable to
> simply drop such packets - but if we were to make that choice we
> should do so explicitly, rather than misdirecting them.
Acceptable, sure, but... I don't know, it somehow doesn't look
desirable to me. The kernel doesn't enforce this, so I guess we
shouldn't either.
> > How bad would it be to use an int here?
>
> Pretty straightforward. Just means we have to use the somewhat
> abtruse "if (port <= USHRT_MAX)" or "if (port >= 0)" or something
> instead of just "if (port)". Should I go ahead and make that change?
Eh, I don't like it either, but... I guess it's better than the
alternative, so yes, thanks. Or pass port as a pointer, set on return.
I'm fine with both.
> > By the way, I think the comment should also mention that the port is
> > returned in host order.
>
> Ok, easily done. Generally I try to keep the endianness associated
> with the type, rather than attempting to document it for each variable
> (or even worse, each point in time for each variable).
Yes, I see, and it's a more valid approach than mine, but still mine
comes almost for free.
By the way, I got distracted by this and I forgot about two things:
> +static in_port_t udp_mmh_splice_port(bool v6, const struct mmsghdr *mmh)
> {
> - const struct sockaddr_in6 *sa6 = sa;
> - const struct sockaddr_in *sa4 = sa;
> + const struct sockaddr_in6 *sa6 = mmh->msg_hdr.msg_name;
> + const struct sockaddr_in *sa4 = mmh->msg_hdr.msg_name;;
Stray semicolon here.
> +
> + if (v6 && IN6_IS_ADDR_LOOPBACK(&sa6->sin6_addr))
> + return ntohs(sa6->sin6_port);
>
> - return v6 ? ntohs(sa6->sin6_port) : ntohs(sa4->sin_port);
> + if (ntohl(sa4->sin_addr.s_addr) == INADDR_LOOPBACK)
> + return ntohs(sa4->sin_port);
If it's IPv6, but not a loopback address, we'll check if
sa4->sin_addr.s_addr == INADDR_LOOPBACK -- which might actually be true for
an IPv6, non-loopback address.
Also, I think we can happily "splice" for any loopback address, not
just 127.0.0.1. What about something like:
if (v6 && IN6_IS_ADDR_LOOPBACK(&sa6->sin6_addr))
return ntohs(sa6->sin6_port);
if (!v4 && IN4_IS_ADDR_LOOPBACK(&sa4->sin_addr))
return ntohs(sa4->sin_port);
return -1;
?
--
Stefano
next prev parent reply other threads:[~2022-12-14 10:36 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-05 8:14 [PATCH 0/8] Don't use additional sockets for receiving "spliced" UDP communications David Gibson
2022-12-05 8:14 ` [PATCH 1/8] udp: Move sending pasta tap frames to the end of udp_sock_handler() David Gibson
2022-12-05 8:14 ` [PATCH 2/8] udp: Split sending to passt tap interface into separate function David Gibson
2022-12-05 8:14 ` [PATCH 3/8] udp: Split receive from preparation and send in udp_sock_handler() David Gibson
2022-12-05 8:14 ` [PATCH 4/8] udp: Receive multiple datagrams at once on the pasta sock->tap path David Gibson
2022-12-13 22:48 ` Stefano Brivio
2022-12-14 1:42 ` David Gibson
2022-12-14 10:35 ` Stefano Brivio
2022-12-20 10:42 ` Stefano Brivio
2022-12-21 6:00 ` David Gibson
2022-12-22 2:37 ` Stefano Brivio
2023-01-04 0:08 ` Stefano Brivio
2023-01-04 4:53 ` David Gibson
2022-12-05 8:14 ` [PATCH 5/8] udp: Pre-populate msg_names with local address David Gibson
2022-12-13 22:48 ` Stefano Brivio
2022-12-14 1:22 ` David Gibson
2022-12-05 8:14 ` [PATCH 6/8] udp: Unify udp_sock_handler_splice() with udp_sock_handler() David Gibson
2022-12-13 22:48 ` Stefano Brivio
2022-12-14 1:19 ` David Gibson
2022-12-14 10:35 ` Stefano Brivio
2022-12-05 8:14 ` [PATCH 7/8] udp: Decide whether to "splice" per datagram rather than per socket David Gibson
2022-12-13 22:49 ` Stefano Brivio
2022-12-14 1:47 ` David Gibson
2022-12-14 10:35 ` Stefano Brivio [this message]
2022-12-15 0:33 ` David Gibson
2022-12-05 8:14 ` [PATCH 8/8] udp: Don't use separate sockets to listen for spliced packets David Gibson
2022-12-06 6:45 ` [PATCH 0/8] Don't use additional sockets for receiving "spliced" UDP communications Stefano Brivio
2022-12-06 6:46 ` Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221214113554.29c0c196@elisabeth \
--to=sbrivio@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).