From: Stefano Brivio <sbrivio@redhat.com>
To: Noah Gold <nkgold@google.com>
Cc: David Gibson <david@gibson.dropbear.id.au>, passt-dev@passt.top
Subject: Re: Improved handling of changing DNS resolvers
Date: Tue, 14 Feb 2023 16:06:53 +0100
Message-ID: <20230214160653.14192635@elisabeth>
In-Reply-To: <CAEJ_Dr_CjuDvaoBzUO-NmV3OB0P1mnjkJNrFK0byk7Sp90vBqw@mail.gmail.com>

On Mon, 13 Feb 2023 18:45:20 -0800
Noah Gold <nkgold@google.com> wrote:
> On Thu, Feb 2, 2023 at 3:09 AM Stefano Brivio <sbrivio@redhat.com> wrote:
> >
> > On Mon, 30 Jan 2023 16:11:38 -0800
> > Noah Gold <nkgold@google.com> wrote:
> >
> > > Sorry for the delay, I've been really busy this past week.
> > >
> > > On Sun, Jan 22, 2023 at 10:26 PM David Gibson
> > > <david@gibson.dropbear.id.au> wrote:
> > > >
> > > > On Sat, Jan 21, 2023 at 10:47:03AM +0100, Stefano Brivio wrote:
> > > > > Hi Noah,
> > > > >
> > > > > Sorry for the delay, I didn't check pending mailing list posts for a
> > > > > couple of days. Comments below:
> > > > >
> > > > > On Tue, 17 Jan 2023 11:50:50 -0800
> > > > > Noah Gold <nkgold@google.com> wrote:
> > > > >
> > > > > > Hi folks,
> > > > > >
> > > > > > libslirp and Passt have different approaches to sharing DNS resolvers with
> > > > > > the guest system, each with their own benefits & drawbacks. On the libslirp
> > > > > > project, we're discussing [1] how to support DNS failover. Passt already has
> > > > > > support for this, but there is a drawback to its solution which prevents us
> > > > > > from taking a similar approach: the resolvers are read exactly once, so if the
> > > > > > host changes networks at runtime, the guest will not receive the updated
> > > > > > resolvers and thus its connectivity will break.
> > > >
> > > > So, passt/pasta kinda-sorta binds itself to a particular host
> > > > interface, so DNS won't be the only issue if the host changes
> > > > network. For one thing, at least by default the guest gets the same
> > > > IP as the host, so if the host IP changes the guest will get out of
> > > > sync. We'll mostly cope with that ok, but there will be some edge
> > > > cases which will break (most obviously if after the network change the
> > > > guest wants to talk to something at the host's old address / its
> > > > current address).
> > > >
> > > > > Right -- the main motivation behind this (other than simplicity) is that
> > > > > we can close /etc/resolv.conf before sandboxing.
> > > > >
> > > > > However, we could keep a handle on it, just like we do for PID and pcap
> > > > > files, while still unmounting the filesystem.
> > > > >
> > > > > And we could also use inotify to detect changes I guess -- we do the
> > > > > same to monitor namespaces in pasta mode (see pasta_netns_quit_init()).
> > > >
> > > > All true, but I'm not sure those are actually the most pressing issues
> > > > we'll face with a host network change.
> > > >
> > > > > > libslirp's current approach is to DNAT a single address exposed to the guest
> > > > > > to one of the resolvers configured on the host. The problem here is that if that
> > > > > > one resolver goes down, the guest can't resolve DNS names. We're
> > > > > > considering changing so that instead of a single address, we expose a set of
> > > > > > MAXNS addresses, and DNAT those 1:1 to the DNS resolvers registered with
> > > > > > the host. Because the DNAT table lives on the host side, we can refresh the
> > > > > > guest's resolvers whenever the host's resolvers change, but without the need to
> > > > > > > expire a DHCP lease (even with short leases, the guest will still lose
> > > > > > > connectivity for a time).
> > > > > >
> > > > > > Does this sound like an approach Passt would be open to adopting as well?
> > > > >
> > > > > Yes, definitely, patches would be very welcome.
> > > >
> > > > Hm, that doesn't fit that easily into the passt model. For the most
> > > > part we don't NAT at all, we only have a couple of special cases where
> > > > we do. Because of that the problem with adding any extra NAT case is
> > > > address allocation. Currently we use the host's gateway address,
> > > > which mostly works but is a bit troublesome. I have some ideas I
> > > > think will work better, but those don't necessarily get us more
> > > > available addresses.
> > >
> > > For libslirp we have the guest on a private subnet, so pulling addresses from
> > > that pool is pretty easy. For passt is the issue that there is no address range,
> > > or that the infrastructure to allocate from the range just doesn't exist yet?
> >
> > [David is out this and next week]
> >
> > There's no address range because it's not designed with NAT in mind,
> > even though it can do NAT. From what we discussed with David in the
> > past, the idea, if I recall correctly, was that you could decide to, at
> > least, remap a particular address instead of the gateway address (more
> > on that below) -- and perhaps something more flexible with more
> > addresses, but not an arbitrary number of them, as passt doesn't do
> > dynamic memory allocation.
>
> Ah okay, it's sharing the host network by default? Or at least, doing
> its best to pretend that's the case?

Yes -- not sharing it, just pretending. Again, that's only by default.

> > > When you say "we use the host's gateway address", what is it used for
> > > exactly? (I didn't follow the loopback example below.)
> >
> > The host's default gateway address (for both IPv4 and IPv6) is
> > advertised, by default, as gateway address/next hop of default route,
> > to the guest, via DHCP/NDP.
> >
> > Again by default (unless --no-map-gw is used), the guest can then use
> > this address to refer to the host (and not its default gateway). See
> > also the "Handling of traffic with local destination and source
> > addresses" section in the NOTES of passt(1).
> >
> > However, this is, at the moment, unrelated to how DNS addresses are
> > mapped: right now you can specify --dns-forward zero to two times
> > (separately for IPv4 and IPv6) and that will forward DNS queries (with
> > reverse mapping) to the first configured resolver.
> >
> > So, if you are happy with this kind of solution (with a NAT), you pick
> > the addresses yourself, you don't need pools or ranges, and you would
> > "just" need, on top of what's already available, to change, at runtime,
> > the resolver passt forwards queries to (perhaps via inotify as I
> > mentioned).
>
> Makes sense. The trouble is when N > 2, see below.
>
> > > > > Note that David (Cc'ed) is currently working on a generalised/flexible
> > > > > address mapping mechanism, some kind of (simple) NAT table as far as I
> > > > > understood it.
> > > >
> > > > That's a bit overstating it. I'm making our current single NAT case
> > > > (translating host side loopback to gateway address on the guest) more
> > > > configurable. I have plans (or at least ideas) for a more generalized
> > > > NAT mechanism, but I'm really not implementing that yet. What I'm
> > > > doing now is kind of a soft prerequisite for that rework though (as
> > > > well as useful in its own right).
> > > >
> > > > > This might even address your DNS idea already, I'm not sure, I'd wait
> > > > > for him to comment.
> > > >
> > > > > Hadn't considered specifically that model, but it's a reasonably
> > > > natural extension of it (address allocation is still a complication).
> > > > I'll certainly consider this case when I do more on this.
> > >
> > > It sounds like there might be a path to using NAT, but it's not something
> > > that would be ready soon. Given that, would there be long term concerns
> > > with using NAT for DNS in the way proposed here? I understand we can't
> > > implement it now, but I'd like to understand if it's an approach we would
> > > still rather avoid, even long term.
> >
> > I don't really see an issue with it, also because, actually, we already
> > do it. :) ...even though it's for two address pairs only
> > (internal/external IPv4/IPv6 addresses). If that's enough for your use
> > case (more on that below), I think we can also implement a runtime
> > change of resolvers now.
>
> Got it. The problem with just two pairs is when the host has N DNS
> resolvers, and N-1 of them are broken (N > 3 is unfortunately possible
> on the non-Unix systems (Windows) libslirp supports). It sounds like
> the *future* approach for passt might be tricky if dynamic allocation
> is completely off the table. Is some dynamic allocation permitted at
> initialization time? If so, we could detect the # of resolvers and
> perhaps take a start address as an argument?

If really necessary, I think we could consider doing some dynamic
allocation before the seccomp profile is applied.

However, how many addresses of resolvers could we possibly want? 16?
32? Reading around a bit, it looks like Windows DHCP servers generally
support assigning 25 addresses. It's not the same thing, but it would
suggest that 32 is a reasonable choice -- and that's something we could
merrily use to size static buffers.
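
To make the sizing argument concrete, here is a rough sketch of a
capped resolver list combined with the 1:1 mapping and start address
proposed above. It is Python for brevity and is not passt or libslirp
code; the names MAX_RESOLVERS, parse_nameservers() and
build_dnat_table(), and the 10.0.2.3 start address in the usage note,
are all hypothetical and chosen only for illustration.

```python
import ipaddress

# Hypothetical cap, taken from the "32" figure discussed in this thread.
MAX_RESOLVERS = 32


def parse_nameservers(resolv_conf_text, limit=MAX_RESOLVERS):
    """Return at most `limit` nameserver addresses from resolv.conf-style text."""
    found = []
    for raw in resolv_conf_text.splitlines():
        if len(found) >= limit:
            break  # table is full, ignore any further entries
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            try:
                found.append(ipaddress.ip_address(fields[1]))
            except ValueError:
                continue  # skip entries that are not valid addresses
    return found


def build_dnat_table(first_guest_addr, resolvers):
    """Map consecutive guest-visible addresses 1:1 onto host resolvers.

    Only resolvers of the same IP version as the start address are used.
    """
    base = ipaddress.ip_address(first_guest_addr)
    same_family = [r for r in resolvers if r.version == base.version]
    return {base + i: r for i, r in enumerate(same_family)}
```

With this shape, a change of resolvers on the host only means calling
build_dnat_table("10.0.2.3", parse_nameservers(new_text)) again: the
guest keeps using the same addresses, so no DHCP lease needs to expire.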
> > > > --
> > > > David Gibson | I'll have my music baroque, and my code
> > > > david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
> > > > | _way_ _around_!
> > > > http://www.ozlabs.org/~dgibson
> > >
> > > On Wed, Jan 25, 2023 at 9:55 AM Stefano Brivio <sbrivio@redhat.com> wrote:
> > > >
> > > > Noah, by the way, if your usage for DNS failover is related to a
> > > > virtual machine being migrated to another host with different
> > > > addressing, mind that you could simply tell qemu to connect to a new
> > > > instance of passt. That's something you can't do with libslirp.
> > >
> > > It's not related to machine migration, though that's another interesting
> > > case with similar constraints. The use case I'm thinking about is for a
> > > mobile device that may experience network changes as part of its
> > > normal operation (e.g. changing wifi networks).
> >
> > So... I admit I have no idea what happens exactly when you change parts
> > of the host configuration, this kind of use case wasn't really a
> > priority for passt in the... past.
>
> For the use case I'm looking at (present, not passt), it's probably
> fine for the typical thing to happen (all open sockets time out or hit
> resets) since that's happening on the host anyways.
>
> > I expect it to mostly work. By default, we don't do NAT because (with
> > default options) the address of the guest matches the address of the
> > host. But once you change addresses and routes on the host, passt
> > should just start doing NAT, it's implicit and not something you need
> > to enable or disable.
> >
> > Would you have a chance to try it out in the use case you had in mind,
> > so that we can go through any issue you might hit?
>
> I'm working exclusively with Windows at the moment, so presently this
> is more to make sure the adjustments we make in libslirp could be
> applied to passt... in the future. (Time travel aside, my vague
> understanding is that passt may be the successor for libslirp, at
> least based on the interest from the maintainers in keeping some
> compatibility in terms of features. I'd be very curious if someone
> could clarify how the two projects relate beyond solving very similar
> problems.)

I guess we should eventually add a FAQ section to the project website.
Meanwhile, there's some kind of summary about that on slide 16 of a
presentation at KVM Forum last year:
https://static.sched.com/hosted_files/kvmforum2022/01/passt_kubevirt_kvm_forum_2022_final.pdf
(recording at: https://www.youtube.com/watch?v=U89bWP1HNgU).

By the way, a Win2k port is up for grabs at
https://bugs.passt.top/show_bug.cgi?id=8. I think the complexity might
be similar to a FreeBSD/Darwin port (tracked at
https://bugs.passt.top/show_bug.cgi?id=6).

> Conceptually though, I'll definitely keep this thread
> updated if we run into issues implementing first in libslirp, as they
> may apply to passt as well.

Thanks!

> > > > Would that solve your problem, or your issue is specifically related to
> > > > DNS failover without any VM migration playing a role?
> > >
> > > It's not related to migration, but I wonder whether there's an idea there
> > > which could be used. The approach I was taking was to make the
> > > network component resilient to network changes. But another option is
> > > to detect network changes and restart the network component. libslirp
> > > still needs a way to support exposing multiple servers though, and I
> > > wonder whether we would want to require library consumers to write
> > > network awareness into their applications as opposed to solving it
> > > for them.
> >
> > Restarting the network component has a single, fundamental advantage, I
> > think: it's a convenient way to reset a number of states and stored
> > information in an implicit way.
> >
> > For example, it's better to reset TCP connections (stop the process,
> > sockets close) than to let them hang. We could reset connections
> > explicitly, of course, but this adds a bit of complexity.
> >
> > Still, with some effort we could make an attempt at actually keeping
> > them alive. Maybe this even works with passt already.
> >
> > So I'm not really sure what would be the best approach. Making the
> > network component resilient to network changes, in the long term,
> > sounds more appropriate and elegant to me.
> >
> > I was just suggesting that, in the short term, restarting passt should
> > cover whatever use case you might have.
>
> Makes sense. I agree, long term resiliency seems like the cleaner solution.
>
--
Stefano