Date: Thu, 2 Feb 2023 12:09:40 +0100
From: Stefano Brivio
To: Noah Gold
Subject: Re: Improved handling of changing DNS resolvers
Message-ID: <20230202120940.2e044c4b@elisabeth>
References: <20230121104703.3ebcc753@elisabeth>
Organization: Red Hat
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-MailFrom: sbrivio@redhat.com
CC: David Gibson, passt-dev@passt.top
List-Id: Development discussion and patches for passt

On Mon, 30 Jan 2023 16:11:38 -0800
Noah Gold wrote:

> Sorry for the delay, I've been really busy this past week.
>
> On Sun, Jan 22, 2023 at 10:26 PM David Gibson
> wrote:
> >
> > On Sat, Jan 21, 2023 at 10:47:03AM +0100, Stefano Brivio wrote:
> > > Hi Noah,
> > >
> > > Sorry for the delay, I didn't check pending mailing list posts for a
> > > couple of days. Comments below:
> > >
> > > On Tue, 17 Jan 2023 11:50:50 -0800
> > > Noah Gold wrote:
> > >
> > > > Hi folks,
> > > >
> > > > libslirp and Passt have different approaches to sharing DNS resolvers with
> > > > the guest system, each with their own benefits & drawbacks. On the libslirp
> > > > project, we're discussing [1] how to support DNS failover. Passt already has
> > > > support for this, but there is a drawback to its solution which prevents us
> > > > from taking a similar approach: the resolvers are read exactly once, so if the
> > > > host changes networks at runtime, the guest will not receive the updated
> > > > resolvers and thus its connectivity will break.
> >
> > So, passt/pasta kinda-sorta binds itself to a particular host
> > interface, so DNS won't be the only issue if the host changes
> > network.
> > For one thing, at least by default the guest gets the same
> > IP as the host, so if the host IP changes the guest will get out of
> > sync. We'll mostly cope with that ok, but there will be some edge
> > cases which will break (most obviously if after the network change the
> > guest wants to talk to something at the host's old address / its
> > current address).
> >
> > > Right -- the main motivation behind this (other than simplicity) is that
> > > we can close /etc/resolv.conf before sandboxing.
> > >
> > > However, we could keep a handle on it, just like we do for PID and pcap
> > > files, while still unmounting the filesystem.
> > >
> > > And we could also use inotify to detect changes I guess -- we do the
> > > same to monitor namespaces in pasta mode (see pasta_netns_quit_init()).
> >
> > All true, but I'm not sure those are actually the most pressing issues
> > we'll face with a host network change.
> >
> > > > libslirp's current approach is to DNAT a single address exposed to the guest
> > > > to one of the resolvers configured on the host. The problem here is that if that
> > > > one resolver goes down, the guest can't resolve DNS names. We're
> > > > considering changing so that instead of a single address, we expose a set of
> > > > MAXNS addresses, and DNAT those 1:1 to the DNS resolvers registered with
> > > > the host. Because the DNAT table lives on the host side, we can refresh the
> > > > guest's resolvers whenever the host's resolvers change, but without the need to
> > > > expire a DHCP lease (even with short leases, the guest will still lose
> > > > connectivity for a time).
> > > >
> > > > Does this sound like an approach Passt would be open to adopting as well?
> > >
> > > Yes, definitely, patches would be very welcome.
> >
> > Hm, that doesn't fit that easily into the passt model. For the most
> > part we don't NAT at all, we only have a couple of special cases where
> > we do.
> > Because of that the problem with adding any extra NAT case is
> > address allocation. Currently we use the host's gateway address,
> > which mostly works but is a bit troublesome. I have some ideas I
> > think will work better, but those don't necessarily get us more
> > available addresses.
>
> For libslirp we have the guest on a private subnet, so pulling addresses from
> that pool is pretty easy. For passt is the issue that there is no address range,
> or that the infrastructure to allocate from the range just doesn't exist yet?

[David is out this and next week]

There's no address range because it's not designed with NAT in mind,
even though it can do NAT. From what we discussed with David in the
past, the idea, if I recall correctly, was that you could decide to, at
least, remap a particular address instead of the gateway address (more
on that below) -- and perhaps something more flexible with more
addresses, but not an arbitrary number of them, as passt doesn't do
dynamic memory allocation.

> When you say "we use the host's gateway address", what is it used for
> exactly? (I didn't follow the loopback example below.)

The host's default gateway address (for both IPv4 and IPv6) is
advertised, by default, as gateway address/next hop of default route, to
the guest, via DHCP/NDP.

Again by default (unless --no-map-gw is used), the guest can then use
this address to refer to the host (and not its default gateway). See
also the "Handling of traffic with local destination and source
addresses" section in the NOTES of passt(1).

However, this is, at the moment, unrelated to how DNS addresses are
mapped: right now you can specify --dns-forward up to twice (once each
for IPv4 and IPv6) and that will forward DNS queries (with reverse
mapping) to the first configured resolver.
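
As an illustration only (this is not passt code, which is C; the class,
function and attribute names below are made up for the example, and the
addresses come from documentation ranges), the forwarding described
above, together with a refresh of the resolver at runtime, could be
sketched in Python roughly like this:

```python
import ipaddress


def parse_resolvers(resolv_conf_text):
    """Return the nameserver addresses from resolv.conf-style text, in order."""
    resolvers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            try:
                resolvers.append(ipaddress.ip_address(fields[1]))
            except ValueError:
                pass  # skip entries that are not plain IP addresses
    return resolvers


class DnsForward:
    """Hypothetical model: one guest-visible DNS address per IP version."""

    def __init__(self, guest_v4=None, guest_v6=None):
        # The guest-visible addresses are chosen by the user, not taken from a pool.
        self.guest = {4: guest_v4 and ipaddress.ip_address(guest_v4),
                      6: guest_v6 and ipaddress.ip_address(guest_v6)}
        self.upstream = {4: None, 6: None}

    def refresh(self, resolv_conf_text):
        """Re-point each mapping at the first resolver of the matching IP version."""
        self.upstream = {4: None, 6: None}
        for addr in parse_resolvers(resolv_conf_text):
            if self.upstream[addr.version] is None:
                self.upstream[addr.version] = addr

    def map_query(self, dst):
        """Rewrite the destination of a guest query; other addresses pass through."""
        dst = ipaddress.ip_address(dst)
        upstream = self.upstream[dst.version]
        if upstream is not None and dst == self.guest[dst.version]:
            return upstream
        return dst

    def map_reply(self, src):
        """Reverse mapping: the reply appears to come from the guest-visible address."""
        src = ipaddress.ip_address(src)
        guest = self.guest[src.version]
        if guest is not None and src == self.upstream[src.version]:
            return guest
        return src
```

In this sketch, calling refresh() again with the new contents of
/etc/resolv.conf (for example after an inotify event) is the only step
needed to follow a resolver change: the guest keeps using the same
address throughout.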
So, if you are happy with this kind of solution (with a NAT), you pick
the addresses yourself, you don't need pools or ranges, and you would
"just" need, on top of what's already available, to change, at runtime,
the resolver passt forwards queries to (perhaps via inotify as I
mentioned).

> > > Note that David (Cc'ed) is currently working on a generalised/flexible
> > > address mapping mechanism, some kind of (simple) NAT table as far as I
> > > understood it.
> >
> > That's a bit overstating it. I'm making our current single NAT case
> > (translating host side loopback to gateway address on the guest) more
> > configurable. I have plans (or at least ideas) for a more generalized
> > NAT mechanism, but I'm really not implementing that yet. What I'm
> > doing now is kind of a soft prerequisite for that rework though (as
> > well as useful in its own right).
> >
> > > This might even address your DNS idea already, I'm not sure, I'd wait
> > > for him to comment.
> >
> > Hadn't considered specifically that model, but it's a reasonably
> > natural extension of it (address allocation is still a complication).
> > I'll certainly consider this case when I do more on this.
>
> It sounds like there might be a path to using NAT, but it's not something
> that would be ready soon. Given that, would there be long term concerns
> with using NAT for DNS in the way proposed here? I understand we can't
> implement it now, but I'd like to understand if it's an approach we would
> still rather avoid, even long term.

I don't really see an issue with it, also because, actually, we already
do it. :)

...even though it's for two address pairs only (internal/external
IPv4/IPv6 addresses). If that's enough for your use case (more on that
below), I think we can also implement a runtime change of resolvers now.

> >
> > --
> > David Gibson | I'll have my music baroque, and my code
> > david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
> > | _way_ _around_!
> > http://www.ozlabs.org/~dgibson
>
> On Wed, Jan 25, 2023 at 9:55 AM Stefano Brivio wrote:
> >
> > On Mon, 23 Jan 2023 17:20:13 +1100
> > David Gibson wrote:
> >
> > > On Sat, Jan 21, 2023 at 10:47:03AM +0100, Stefano Brivio wrote:
> > > > Hi Noah,
> > > >
> > > > Sorry for the delay, I didn't check pending mailing list posts for a
> > > > couple of days. Comments below:
> > > >
> > > > On Tue, 17 Jan 2023 11:50:50 -0800
> > > > Noah Gold wrote:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > libslirp and Passt have different approaches to sharing DNS resolvers with
> > > > > the guest system, each with their own benefits & drawbacks. On the libslirp
> > > > > project, we're discussing [1] how to support DNS failover. Passt already has
> > > > > support for this, but there is a drawback to its solution which prevents us
> > > > > from taking a similar approach: the resolvers are read exactly once, so if the
> > > > > host changes networks at runtime, the guest will not receive the updated
> > > > > resolvers and thus its connectivity will break.
> > >
> > > So, passt/pasta kinda-sorta binds itself to a particular host
> > > interface, so DNS won't be the only issue if the host changes
> > > network. For one thing, at least by default the guest gets the same
> > > IP as the host, so if the host IP changes the guest will get out of
> > > sync. We'll mostly cope with that ok, but there will be some edge
> > > cases which will break (most obviously if after the network change the
> > > guest wants to talk to something at the host's old address / its
> > > current address).
> >
> > Noah, by the way, if your usage for DNS failover is related to a
> > virtual machine being migrated to another host with different
> > addressing, mind that you could simply tell qemu to connect to a new
> > instance of passt. That's something you can't do with libslirp.
>
> It's not related to machine migration, though that's another interesting
> case with similar constraints.
> The use case I'm thinking about is for a
> mobile device that may experience network changes as part of its
> normal operation (e.g. changing wifi networks).

So... I admit I have no idea what happens exactly when you change parts
of the host configuration, this kind of use case wasn't really a
priority for passt in the... past. I expect it to mostly work.

By default, we don't do NAT because (with default options) the address
of the guest matches the address of the host. But once you change
addresses and routes on the host, passt should just start doing NAT,
it's implicit and not something you need to enable or disable.

Would you have a chance to try it out in the use case you had in mind,
so that we can go through any issue you might hit?

> > Would that solve your problem, or your issue is specifically related to
> > DNS failover without any VM migration playing a role?
>
> It's not related to migration, but I wonder whether there's an idea there
> which could be used. The approach I was taking was to make the
> network component resilient to network changes. But another option is
> to detect network changes and restart the network component. libslirp
> still needs a way to support exposing multiple servers though, and I
> wonder whether we would want to require library consumers to write
> network awareness into their applications as opposed to solving it
> for them.

Restarting the network component has a single, fundamental advantage, I
think: it's a convenient way to reset a number of states and stored
information in an implicit way. For example, it's better to reset TCP
connections (stop the process, sockets close) than to let them hang.

We could reset connections explicitly, of course, but this adds a bit
of complexity. Still, with some effort we could make an attempt at
actually keeping them alive. Maybe this even works with passt already.

So I'm not really sure what would be the best approach.
Making the network component resilient to network changes, in the long
term, sounds more appropriate and elegant to me. I was just suggesting
that, in the short term, restarting passt should cover whatever use
case you might have.

--
Stefano