From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v3 4/4] fwd: Direct inbound spliced forwards to the guest's external address
Date: Wed, 16 Oct 2024 14:15:19 +1100
Message-ID: <Zw8vxxJQfRq-ARMl@zatzit.fritz.box>
In-Reply-To: <ZwdszOBQxWf1Njx0@zatzit.fritz.box>
On Thu, Oct 10, 2024 at 04:57:32PM +1100, David Gibson wrote:
> On Wed, Oct 09, 2024 at 10:44:33PM +0200, Stefano Brivio wrote:
> > On Wed, 9 Oct 2024 15:07:21 +0200
> > Stefano Brivio <sbrivio@redhat.com> wrote:
[snip]
> > > > @@ -447,20 +447,35 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
> > > > (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) {
> > > > /* spliceable */
> > > >
> > > > - /* Preserve the specific loopback adddress used, but let the
> > > > - * kernel pick a source port on the target side
> > > > + /* The traffic will go over the guest's 'lo' interface, but by
> > > > + * default use its external address, so we don't inadvertently
> > > > + * expose services that listen only on the guest's loopback
> > > > + * address. That can be overridden by --host-lo-to-ns-lo which
> > > > + * will instead forward to the loopback address in the guest.
> > > > + *
> > > > + * In either case, let the kernel pick the source address to
> > > > + * match.
> > > > */
> > > > - tgt->oaddr = ini->eaddr;
> > > > + if (inany_v4(&ini->eaddr)) {
> > > > + if (c->host_lo_to_ns_lo)
> > > > + tgt->eaddr = inany_loopback4;
> > > > + else
> > > > + tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
> > > > + tgt->oaddr = inany_any4;
> > > > + } else {
> > > > + if (c->host_lo_to_ns_lo)
> > > > + tgt->eaddr = inany_loopback6;
> > > > + else
> > > > + tgt->eaddr.a6 = c->ip6.addr_seen;
> > >
> > > Either this...
> > >
> > > > + tgt->oaddr = inany_any6;
> > >
> > > or this (and not something before this patch, up to 3/4) make the
> > > "TCP/IPv6: host to ns (spliced): big transfer" test in pasta/tcp hang,
> > > sometimes (about one in three/four runs), that's what I mistakenly
> > > reported as coming from Laurent's series at:
>
> Huh, interesting. Just got back from my leave and ran that group of
> tests in a loop this afternoon, but didn't manage to reproduce. I
> have administrivia that will probably fill the rest of this week, but
> I'll look into this as soon as I can.
I reproduced the problem on passt.top, and I have a partial idea of
what's going on. As you say, it seems like the address (addr_seen
== addr in this case) isn't properly ready. This is over splice, but
on the tap interface, I see the container sending NS messages for its
own address - it seems to be doing DAD. But more importantly, we're
answering those NS messages with NA messages, because we answer all
NS, i.e. we're making the DAD fail. What I'm not sure of is how this
ever worked at all. It makes sense that --config-net works, since we
disable DAD there, but our test suite has always been using NDP+DHCP
instead of --config-net.
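For reference, a rough sketch of how a DAD probe can be told apart from
an ordinary NS: per RFC 4862, DAD solicitations are sent from the
unspecified address (::). The helper name ns_is_dad_probe() is made up
for illustration and is not existing passt code.

```c
#include <netinet/in.h>
#include <stdbool.h>

/* Hypothetical helper, not part of passt: true if a Neighbour Solicitation
 * whose IPv6 source address is @src looks like a DAD probe.  RFC 4862 sends
 * DAD solicitations from the unspecified address (::).
 */
static bool ns_is_dad_probe(const struct in6_addr *src)
{
	return IN6_IS_ADDR_UNSPECIFIED(src);
}
```

Answering such a probe with an NA is what tells the guest its tentative
address is already taken.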
So, AFAICT, we'll always fail guest DAD attempts, both for IPv6, where
DAD happens most of the time, and for IPv4 via ARP, where it is used
much more rarely. I think we need to be more selective in which NS or
ARP lookups we respond to. The question is what approach to take:
Option 1: Answer everything apart from specific exceptions
========
Basically we'd explicitly exclude the guest/container's assigned
address but continue answering everything else. This is a bit ugly
and means we'd probably still have a similar problem if the guest just
picks an address instead of taking the assigned one. On the other
hand it means things will work in at least some cases where the guest
tries to contact remote hosts directly instead of via the default
gateway.
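A minimal sketch of what the Option 1 check could look like for NS,
assuming we have the NS target and the guest's assigned address at
hand. The function name and parameters are hypothetical, not taken
from the passt tree.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Option 1 sketch (hypothetical helper, not passt code): answer every NS
 * except one whose target is the address we assigned to the guest.
 */
static bool ns_should_answer_opt1(const struct in6_addr *target,
				  const struct in6_addr *guest_addr)
{
	/* Non-zero memcmp() means the target differs from the guest address */
	return memcmp(target, guest_addr, sizeof(*target)) != 0;
}
```

Note that this only covers the assigned address: an address the guest
picks for itself would still get an answer.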
Option 2: Only answer for specific addresses
=========
Reverse the logic above: by default *don't* answer NS queries, unless
they target a specific address that we know is our responsibility. That
would be, afaict, the gateway address, the "our_tap" addresses and the
--map-host-loopback and --map-guest-addr addresses (usually all the
same).
This might be less robust against certain guests that don't use a
default gateway when we expect them to. I guess that could happen when
we have non-gateway routes copied from the host, so we'd have to handle
that too.
It would remove one more barrier towards having multiple bridged
guests behind a single passt/pasta instance (they can't reasonably
communicate with each other if passt answers all their NS / ARP
requests).
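A minimal sketch of the Option 2 check, assuming the addresses we are
responsible for have been collected into an array. The function name
and the array are hypothetical, for illustration only; how passt would
actually track those addresses is left open here.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Option 2 sketch (hypothetical helper, not passt code): answer an NS only
 * if its target is one of the @n addresses in @ours, e.g. the gateway,
 * "our_tap" and mapped addresses.
 */
static bool ns_should_answer_opt2(const struct in6_addr *target,
				  const struct in6_addr *ours, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!memcmp(target, &ours[i], sizeof(*target)))
			return true;
	}

	/* Not an address we own: stay silent, so guest DAD can succeed */
	return false;
}
```

With an empty list this never answers, which is the opposite default
from what we do today.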
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson