public inbox for passt-dev@passt.top
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v3 4/4] fwd: Direct inbound spliced forwards to the guest's external address
Date: Thu, 17 Oct 2024 10:31:22 +0200
Message-ID: <20241017103122.29b1afb0@elisabeth>
In-Reply-To: <ZxBmPgKI5liJTRaS@zatzit.fritz.box>

On Thu, 17 Oct 2024 12:19:58 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Oct 16, 2024 at 05:26:48PM +0200, Stefano Brivio wrote:
> > On Wed, 16 Oct 2024 19:39:40 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Wed, Oct 16, 2024 at 04:46:52PM +1100, David Gibson wrote:  
> > > > On Wed, Oct 16, 2024 at 02:15:19PM +1100, David Gibson wrote:    
> > > > > On Thu, Oct 10, 2024 at 04:57:32PM +1100, David Gibson wrote:    
> > > > > > On Wed, Oct 09, 2024 at 10:44:33PM +0200, Stefano Brivio wrote:    
> > > > > > > On Wed, 9 Oct 2024 15:07:21 +0200
> > > > > > > Stefano Brivio <sbrivio@redhat.com> wrote:    
> > > > > [snip]    
> > > > > > > > > @@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
> > > > > > > > >  	    (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
> > > > > > > > >  		/* spliceable */
> > > > > > > > >  
> > > > > > > > > -		/* Preserve the specific loopback adddress used, but let the
> > > > > > > > > -		 * kernel pick a source port on the target side
> > > > > > > > > +		/* The traffic will go over the guest's 'lo' interface, but by
> > > > > > > > > +		 * default use its external address, so we don't inadvertently
> > > > > > > > > +		 * expose services that listen only on the guest's loopback
> > > > > > > > > +		 * address.  That can be overridden by --host-lo-to-ns-lo which
> > > > > > > > > +		 * will instead forward to the loopback address in the guest.
> > > > > > > > > +		 *
> > > > > > > > > +		 * In either case, let the kernel pick the source address to
> > > > > > > > > +		 * match.
> > > > > > > > >  		 */
> > > > > > > > > -		tgt->oaddr = ini->eaddr;
> > > > > > > > > +		if (inany_v4(&ini->eaddr)) {
> > > > > > > > > +			if (c->host_lo_to_ns_lo)
> > > > > > > > > +				tgt->eaddr = inany_loopback4;
> > > > > > > > > +			else
> > > > > > > > > +				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> > > > > > > > > +			tgt->oaddr = inany_any4;
> > > > > > > > > +		} else {
> > > > > > > > > +			if (c->host_lo_to_ns_lo)
> > > > > > > > > +				tgt->eaddr = inany_loopback6;
> > > > > > > > > +			else
> > > > > > > > > +				tgt->eaddr.a6 = c->ip6.addr_seen;      
> > > > > > > > 
> > > > > > > > Either this...
> > > > > > > >     
> > > > > > > > > +			tgt->oaddr = inany_any6;      
> > > > > > > > 
> > > > > > > > or this (and not something before this patch, up to 3/4) make the
> > > > > > > > "TCP/IPv6: host to ns (spliced): big transfer" test in pasta/tcp hang,
> > > > > > > > sometimes (about one in three/four runs), that's what I mistakenly
> > > > > > > > reported as coming from Laurent's series at:    
> > > > > > 
> > > > > > Huh, interesting.  Just got back from my leave and ran that group of
> > > > > > tests in a loop this afternoon, but didn't manage to reproduce.  I
> > > > > > have administrivia that will probably fill the rest of this week, but
> > > > > > I'll look into this as soon as I can.    
> > > > > 
> > > > > I reproduced the problem on passt.top, and I have a partial idea
> > > > > what's going on.  As you say it's seeming like the address (addr_seen
> > > > > == addr in this case) isn't properly ready.  This is over splice, but
> > > > > on the tap interface, I see the container sending NS messages for its
> > > > > own address - seems like it's doing DAD.  But more importantly, we're
> > > > > answering those NS messages with NA messages, because we answer all
> > > > > NS.  i.e. we're making the DAD fail.  What I'm not sure of is how this
> > > > > ever worked at all.  --config-net makes sense, since we disable DAD,
> > > > > but our test suite has always been using NDP+DHCP instead of
> > > > > --config-net.
> > > > > 
> > > > > > So, AFAICT, we'll always fail guest DAD attempts, both for IPv6, which
> > > > > > happens most of the time, and for IPv4 via ARP, which is used much more
> > > > > > rarely.  I think we need to be more selective in what NS or ARP
> > > > > > lookups we respond to.  The question is what approach to take:
> > > > 
> > > > Hmm... no.. there's more to this.
> > > > 
> > > > Usually DAD requests have :: as the source address, and we *do*
> > > > exclude those from getting replies.  In this case though, we're
> > > > getting NS requests for the assigned address from what looks like the
> > > > SLAAC address.  So, I do think it would be wise to explicitly exclude
> > > > these: we shouldn't be giving NA responses for an address that ought
> > > > to belong to the guest, even if it doesn't look like a DAD.
> > > > 
> > > > But, I'm not sure what's triggering this.  Is for some reason the DHCP
> > > > address not "taking", so the container is trying to locate it on the
> > > > network instead?  Or _is_ this DAD, but under some circumstances
> > > > rather than using :: as the source address it uses another configured
> > > > address.    
> > > 
> > > Ok.. I've understood a bit more.  While timing is a factor here, it
> > > looks like the main reason I wasn't seeing it on my machine is what
> > > I'd consider a bug in the Debian version of the dhclient-script:
> > > when adding an IPv6 address, it returns without waiting for DAD to
> > > complete (i.e. for the address to be non-tentative).  
> > 
> > Oops. On one hand, I would feel inclined to propose a fix for the
> > Debian and Ubuntu packages. On the other hand, I wonder if it's
> > universally considered a bug: the DHCPv6 client did its job at that
> > point, and it's debatable whether dhclient should wait for the address
> > to be usable before forking to background.
> > 
> > That is, arguably, dhclient's job is to request and configure an
> > address. It's not a network configuration daemon. There might be many
> > other reasons why that address is unusable, and yet dhclient is not
> > responsible for them.  
> 
> Hrm... I guess.  Counterpoints..
>  - Most other failures to get a usable address will result in a
>    visible error
>  - dhclient has a --dad-wait-time option which seems to imply that the
>    script should wait for DAD
>  - The upstream script version waits for DAD
> 
> In any case I filed a report for it
>     https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1085231
> 
> > By the way, I guess it's just an issue for test scripts like this one.  
> 
> Why do you guess that?

Because it's kind of rare that your address changes if you use DHCPv6,
I guess, so this would be relevant almost exclusively at boot.

And, at boot, if a remote peer/client happens to try to connect to the
machine where the client is running right after an address was
assigned, it almost certainly has a retry mechanism anyway.
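
For reference, the condition the script would be waiting for is that the
address is no longer tentative. Below is a minimal sketch of such a check,
not what dhclient-script or passt actually do: it parses lines in the
/proc/net/if_inet6 format and tests the tentative bit. The function names
and the TENTATIVE macro are made up for illustration (0x40 is the value of
IFA_F_TENTATIVE in <linux/if_addr.h>), and the address is assumed to be
given as 32 lowercase hex digits, the way the kernel prints it.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative name: same value as IFA_F_TENTATIVE, i.e. DAD not done yet */
#define TENTATIVE	0x40

/* Check one line in /proc/net/if_inet6 format:
 *   <32 hex digits> <ifindex> <prefix length> <scope> <flags> <ifname>
 * Return: 1 if the line matches @addr_hex on @ifname and the address is
 *	   still tentative, 0 if it matches and is not tentative, -1 if the
 *	   line doesn't match or can't be parsed
 */
int if_inet6_line_tentative(const char *line, const char *addr_hex,
			    const char *ifname)
{
	unsigned int ifindex, plen, scope, flags;
	char addr[33], name[32];

	if (sscanf(line, "%32s %x %x %x %x %31s",
		   addr, &ifindex, &plen, &scope, &flags, name) != 6)
		return -1;

	if (strcmp(addr, addr_hex) || strcmp(name, ifname))
		return -1;

	return !!(flags & TENTATIVE);
}

/* Scan an open stream (for example /proc/net/if_inet6) for the address.
 * Return: same as above, with -1 meaning the address wasn't found
 */
int addr_tentative(FILE *f, const char *addr_hex, const char *ifname)
{
	char line[128];
	int ret;

	while (fgets(line, sizeof(line), f)) {
		ret = if_inet6_line_tentative(line, addr_hex, ifname);
		if (ret >= 0)
			return ret;
	}

	return -1;
}
```

A caller that wants to wait for DAD could reopen /proc/net/if_inet6 and
call addr_tentative() in a loop, with a short sleep and a timeout, until
it returns 0.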

-- 
Stefano


