Date: Tue, 19 Sep 2023 12:09:01 +1000
From: David Gibson
To: Nikolay Edigaryev
CC: Stefano Brivio, passt-dev@passt.top
Subject: Re: [PATCH] arp: only send ARP replies for --gateway address
References: <20230915142045.73457-1-edigaryev@gmail.com> <20230918160134.09d2b706@elisabeth>
List-Id: Development discussion and patches for passt

On Mon, Sep 18, 2023 at 07:52:23PM +0400, Nikolay Edigaryev wrote:
>
> Hello Stefano, I will try to clarify:
>
> I have a single host machine, a dedicated amd64 server, capable of
> running multiple Cloud Hypervisor virtual machines backed by /dev/kvm.
>
> I also have a daemon-less CLI software that can provision as many VM
> instances as the user wants, e.g. by running "mycli create --kernel
> ... --disk ... ubuntu".
>
> To run a VM, the user types "mycli run ubuntu", which results in the
> creation of two TAP interfaces: one is for passt, one is for Cloud
> Hypervisor
>
> "mycli run" then creates a bridge(8) interface, assigns a free IP from
> /29 network to it (for example, 10.0.0.3/29), and adds both the TAP
> interfaces to that bridge forming up a virtual switch, which allows
> passt <-> VM and host <-> communication.

Ok. So, to check my understanding: the VM only has a single virtual
NIC, which connects to this bridge, then you're connecting the bridge
to the outside world using passt. Is that correct?

> "mycli run ubuntu" also invokes the passt with the following arguments:
>
> >passt --foreground --address 10.0.0.2 --netmask 255.255.255.248 --gateway 10.0.0.1 --mac-addr 52:f1:18:34:28:0b -4 --mtu 1500 --tap-fd 3

What owns the address 10.0.0.1 here? I'm assuming that's an address
of the host, but is it on an external interface, or on this special
bridge? Or somewhere else?

[Btw, clamping the passt MTU to 1500 is probably going to be pretty
bad for TCP throughput]

> Now to the issue: if the user wants to access the VM, for provisioning
> purposes, e.g. by running "ssh 10.0.0.2", there's a race between the
> real ARP reply from that VM and an ARP reply from passt due to the
> code fixed in the patch above.
>
> And even if we add a static ARP entry for that VM on the host, there's
> still exist a race on the VM's side.
>
> Here the VM looks up the host's ethernet address and receives one
> reply from host (ba:46:4e:27:8b:93) and another from passt
> (52:f1:18:34:28:0b):
>
> 17:26:42.685718 5a:b7:e3:dc:bb:9f > ba:46:4e:27:8b:93, ethertype ARP
> (0x0806), length 42: Request who-has 10.0.0.3 tell 10.0.0.2, length 28
> 17:26:42.685744 ba:46:4e:27:8b:93 > 5a:b7:e3:dc:bb:9f, ethertype ARP
> (0x0806), length 42: Reply 10.0.0.3 is-at ba:46:4e:27:8b:93, length 28
> 17:26:42.685908 52:f1:18:34:28:0b > 5a:b7:e3:dc:bb:9f, ethertype ARP
> (0x0806), length 42: Reply 10.0.0.3 is-at 52:f1:18:34:28:0b, length 28

Right.

Ok, so Stefano mentioned that this change will break the case of a
guest not using the gateway it's supposed to. That's true, but
there's certainly a pretty strong case that no-one has any right to
expect that case to work anyway, so we need not consider it.

I believe there are some other rare but legitimate cases it can also
break, though. For now I think these can only occur with pasta, not
passt, but they'd still be affected:

 * Although it's not common, it's possible to have a default route
   with an interface, but no gateway (this can occur if the host has
   connectivity over a point to point link like a VPN). With pasta
   --config-net we'll copy that gateway-less default route to the
   namespace, and it will then ARP for *everything*. That will work
   now, because we'll answer all those ARPs, but would not if we only
   answered ARPs for the gateway address.

 * A lesser version of the same thing: even if we have a normal
   default gateway, we may also have specific subnet routes on the
   host which override it. With pasta --config-net again we will copy
   those routes to the namespace, and so packets routed that way will
   induce ARPs for something other than the default gateway (either
   for the destination address or for the route specific gateway).

Apart from the ARP issues, I think there's at least one other
fragility in the setup you've described.
This is what I was thinking about when I mentioned elsewhere that I
don't think ARP will be the only issue with having a non-trivial
broadcast domain on the guest side of passt:

If from the host you were to send packets on the bridge addressed to
passt's address, rather than the host, I believe that would cause
passt to update its 'addr_seen' to that of the host. That could then
cause packets which should be going to the guest to be sent to the
host instead. That could have a variety of effects from just a brief
interruption to essentially breaking connectivity.

> On Mon, Sep 18, 2023 at 6:01 PM Stefano Brivio wrote:
> >
> > On Mon, 18 Sep 2023 12:26:03 +1000
> > David Gibson wrote:
> >
> > > On Fri, Sep 15, 2023 at 06:20:45PM +0400, Nikolay Edigaryev wrote:
> > > > Problem: when passt/pasta are working in a broadcast domain with more
> > > > than one host machine,
> > >
> > > Oof. So, at present, passt/pasta is really not designed to have more
> > > than a single machine on the "tap" side. Changing the ARP behaviour
> > > is likely to be the least of the problems with that setup.
> >
> > Now I'm confused on which "side" this happens. :) Nikolay, can you
> > articulate the issue a bit better? Do you really have multiple *host*
> > machines? Does the passt process... move between them?
> >
> > By the way, the only concern I have with this change is that the guest
> > might ignore the gateway address it's being assigned, for whatever
> > reason, and by just resolving "almost everything" we guarantee the
> > traffic goes out anyway.
> >
> > If there's no other way to solve the issue you're facing, I would
> > rather propose to have this as an option, and perhaps have it off by
> > default.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson