public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top, jmaloy@redhat.com
Subject: Re: [PATCH v7 21/27] udp: Handle "spliced" datagrams with per-flow sockets
Date: Mon, 15 Jul 2024 14:32:46 +1000	[thread overview]
Message-ID: <ZpSmbkLsWtjeWOm4@zatzit> (raw)
In-Reply-To: <20240712153407.4b894453@elisabeth>

[-- Attachment #1: Type: text/plain, Size: 3665 bytes --]

On Fri, Jul 12, 2024 at 03:34:07PM +0200, Stefano Brivio wrote:
> On Fri,  5 Jul 2024 12:07:18 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > When forwarding a datagram to a socket, we need to find a socket with a
> > suitable local address to send it.  Currently we keep track of such sockets
> > in an array indexed by local port, but this can't properly handle cases
> > where we have multiple local addresses in active use.
> > 
> > For "spliced" (socket to socket) cases, improve this by instead opening
> > a socket specifically for the target side of the flow.  We connect() as
> > well as bind()ing that socket, so that it will only receive the flow's
> > reply packets, not anything else.  We direct datagrams sent via that socket
> > using the addresses from the flow table, effectively replacing bespoke
> > addressing logic with the unified logic in fwd.c
> > 
> > When we create the flow, we also take a duplicate of the originating
> > socket, and use that to deliver reply datagrams back to the origin, again
> > using addresses from the flow table entry.
> 
> For some reason, after this patch (I bisected), I'm getting an EPOLLERR
> loop:
> 
>   pasta: epoll event on UDP socket 6 (events: 0x00000001)
>   Flow 0 (NEW): FREE -> NEW
>   Flow 0 (INI): NEW -> INI
>   Flow 0 (INI): HOST [127.0.0.1]:47041 -> [0.0.0.0]:10001 => ?
>   Flow 0 (TGT): INI -> TGT
>   Flow 0 (TGT): HOST [127.0.0.1]:47041 -> [0.0.0.0]:10001 => SPLICE [127.0.0.1]:47041 -> [127.0.0.1]:10001
>   Flow 0 (UDP flow): TGT -> TYPED
>   Flow 0 (UDP flow): HOST [127.0.0.1]:47041 -> [0.0.0.0]:10001 => SPLICE [127.0.0.1]:47041 -> [127.0.0.1]:10001
>   Flow 0 (UDP flow): Side 0 hash table insert: bucket: 97474
>   Flow 0 (UDP flow): TYPED -> ACTIVE
>   Flow 0 (UDP flow): HOST [127.0.0.1]:47041 -> [0.0.0.0]:10001 => SPLICE [127.0.0.1]:47041 -> [127.0.0.1]:10001
>   pasta: epoll event on UDP reply socket 116 (events: 0x00000008)
>   pasta: epoll event on UDP reply socket 116 (events: 0x00000008)
>   pasta: epoll event on UDP reply socket 116 (events: 0x00000008)
>   [...repeated until I terminate the process]
> 
> by sending one UDP datagram from the parent namespace with no
> "listening" process in the namespace, using the "spliced" path,
> something like this:
> 
>   echo a | nc -q1 -u localhost 10001
> 
> after running pasta with:
> 
>   ./pasta -u 10001 --trace -l /tmp/pasta.trace --log-size $((1 << 30))
> 
> I tried bigger/multiple datagrams, same result.

Ouch.  I see it too.  I'll debug...

> 
> Before this patch, I get something like this instead:
> 
>   5.1018: pasta: epoll event on UDP socket 6 (events: 0x00000001)
>   5.1018: Flow 0 (NEW): FREE -> NEW
>   5.1018: Flow 0 (INI): NEW -> INI
>   5.1019: Flow 0 (INI): HOST [127.0.0.1]:41245 -> [0.0.0.0]:10001 => ?
>   5.1019: Flow 0 (TGT): INI -> TGT
>   5.1019: Flow 0 (TGT): HOST [127.0.0.1]:41245 -> [0.0.0.0]:10001 => SPLICE [127.0.0.1]:41245 -> [127.0.0.1]:10001
>   5.1019: Flow 0 (UDP flow): TGT -> TYPED
>   5.1019: Flow 0 (UDP flow): HOST [127.0.0.1]:41245 -> [0.0.0.0]:10001 => SPLICE [127.0.0.1]:41245 -> [127.0.0.1]:10001
>   5.1019: Flow 0 (UDP flow): Side 0 hash table insert: bucket: 111174
>   5.1019: Flow 0 (UDP flow): TYPED -> ACTIVE
>   5.1019: Flow 0 (UDP flow): HOST [127.0.0.1]:41245 -> [0.0.0.0]:10001 => SPLICE [127.0.0.1]:41245 -> [127.0.0.1]:10001
> 
> I didn't really investigate, though.
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-07-15  4:33 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-05  2:06 [PATCH v7 00/27] Unified flow table David Gibson
2024-07-05  2:06 ` [PATCH v7 01/27] flow: Common address information for initiating side David Gibson
2024-07-05  2:06 ` [PATCH v7 02/27] flow: Common address information for target side David Gibson
2024-07-10 21:30   ` Stefano Brivio
2024-07-11  0:19     ` David Gibson
2024-07-05  2:07 ` [PATCH v7 03/27] tcp, flow: Remove redundant information, repack connection structures David Gibson
2024-07-05  2:07 ` [PATCH v7 04/27] tcp: Obtain guest address from flowside David Gibson
2024-07-05  2:07 ` [PATCH v7 05/27] tcp: Manage outbound address via flow table David Gibson
2024-07-05  2:07 ` [PATCH v7 06/27] tcp: Simplify endpoint validation using flowside information David Gibson
2024-07-05  2:07 ` [PATCH v7 07/27] tcp_splice: Eliminate SPLICE_V6 flag David Gibson
2024-07-05  2:07 ` [PATCH v7 08/27] tcp, flow: Replace TCP specific hash function with general flow hash David Gibson
2024-07-05  2:07 ` [PATCH v7 09/27] flow, tcp: Generalise TCP hash table to general flow hash table David Gibson
2024-07-05  2:07 ` [PATCH v7 10/27] tcp: Re-use flow hash for initial sequence number generation David Gibson
2024-07-05  2:07 ` [PATCH v7 11/27] icmp: Remove redundant id field from flow table entry David Gibson
2024-07-05  2:07 ` [PATCH v7 12/27] icmp: Obtain destination addresses from the flowsides David Gibson
2024-07-05  2:07 ` [PATCH v7 13/27] icmp: Look up ping flows using flow hash David Gibson
2024-07-05  2:07 ` [PATCH v7 14/27] icmp: Eliminate icmp_id_map David Gibson
2024-07-05  2:07 ` [PATCH v7 15/27] flow: Helper to create sockets based on flowside David Gibson
2024-07-10 21:32   ` Stefano Brivio
2024-07-11  0:21     ` David Gibson
2024-07-11  0:27     ` David Gibson
2024-07-05  2:07 ` [PATCH v7 16/27] icmp: Manage outbound socket address via flow table David Gibson
2024-07-05  2:07 ` [PATCH v7 17/27] flow, tcp: Flow based NAT and port forwarding for TCP David Gibson
2024-07-05  2:07 ` [PATCH v7 18/27] flow, icmp: Use general flow forwarding rules for ICMP David Gibson
2024-07-05  2:07 ` [PATCH v7 19/27] fwd: Update flow forwarding logic for UDP David Gibson
2024-07-08 21:26   ` Stefano Brivio
2024-07-09  0:19     ` David Gibson
2024-07-05  2:07 ` [PATCH v7 20/27] udp: Create flows for datagrams from originating sockets David Gibson
2024-07-09 22:32   ` Stefano Brivio
2024-07-09 23:59     ` David Gibson
2024-07-10 21:35       ` Stefano Brivio
2024-07-11  4:26         ` David Gibson
2024-07-11  8:20           ` Stefano Brivio
2024-07-11 22:58             ` David Gibson
2024-07-12  8:21               ` Stefano Brivio
2024-07-15  4:06                 ` David Gibson
2024-07-15 16:37                   ` Stefano Brivio
2024-07-17  0:49                     ` David Gibson
2024-07-05  2:07 ` [PATCH v7 21/27] udp: Handle "spliced" datagrams with per-flow sockets David Gibson
2024-07-09 22:32   ` Stefano Brivio
2024-07-10  0:23     ` David Gibson
2024-07-10 17:13       ` Stefano Brivio
2024-07-11  1:30         ` David Gibson
2024-07-11  8:23           ` Stefano Brivio
2024-07-11  2:48         ` David Gibson
2024-07-12 13:34   ` Stefano Brivio
2024-07-15  4:32     ` David Gibson [this message]
2024-07-05  2:07 ` [PATCH v7 22/27] udp: Remove obsolete splice tracking David Gibson
2024-07-10 21:36   ` Stefano Brivio
2024-07-11  0:43     ` David Gibson
2024-07-05  2:07 ` [PATCH v7 23/27] udp: Find or create flows for datagrams from tap interface David Gibson
2024-07-10 21:36   ` Stefano Brivio
2024-07-11  0:45     ` David Gibson
2024-07-05  2:07 ` [PATCH v7 24/27] udp: Direct datagrams from host to guest via flow table David Gibson
2024-07-10 21:37   ` Stefano Brivio
2024-07-11  0:46     ` David Gibson
2024-07-05  2:07 ` [PATCH v7 25/27] udp: Remove obsolete socket tracking David Gibson
2024-07-05  2:07 ` [PATCH v7 26/27] udp: Remove rdelta port forwarding maps David Gibson
2024-07-05  2:07 ` [PATCH v7 27/27] udp: Rename UDP listening sockets David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZpSmbkLsWtjeWOm4@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=jmaloy@redhat.com \
    --cc=passt-dev@passt.top \
    --cc=sbrivio@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).