Date: Tue, 17 Sep 2024 11:08:52 +1000
From: David Gibson
To: "Castelli, Anton"
Cc: "passt-user@passt.top"
Subject: Re: Rootless Podman with VRRP
In-Reply-To: <172649928722.151934.9874324737582181440@maja>
List-Id: "For passt users: support, questions and answers"

On Mon, Sep 16, 2024 at 02:46:51PM +0000, Castelli, Anton via user wrote:
> I'm trying to get a service in a rootless Podman container (BIND DNS
> server) to respond correctly when using VRRP (via keepalived) on
> the host. It seems like Pasta will forward the inbound traffic to
> the container from the VRRP address, but the responses will be from
> the regular IP address instead of the VRRP address, which causes the
> client to ignore the response. I've tried adding Pasta network
> options to the container, but the behavior seems to be the same.

I'm not familiar with VRRP beyond a quick skim of the Wikipedia article
just now. I think the (only) relevant thing is that the host has
multiple addresses, but it's possible there's some other complexity on
the host that I'm not factoring in yet.

> OS: CentOS Stream 9
> Podman: 5.2.2
> Pasta: 0^20240806.gee36266-2.el9.x86_64-pasta

This version includes UDP flow tracking, which fixes some of the worst
bugs with UDP forwarding / addressing, but there are still some edge
cases, as you've discovered.

I think I know what's going on here. With TCP, once we accept() a
connection, its local address on the host is part of the accept()ed
socket, so we'll always use the same address for reply packets. With
UDP, however, this doesn't happen automatically: the local address for
replies is controlled by the bound address of the socket we use to send
them out.
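
To make the TCP/UDP difference concrete, here is a minimal Python
sketch. It is not passt code: the function names are made up for
illustration, and it assumes a loopback interface with 127.0.0.1. In
both cases the socket is bound to the wildcard address, and we check
which local address it reports once traffic has arrived.

```python
import socket


def tcp_accepted_local_addr():
    """Listen on the wildcard address, accept one loopback connection,
    and return the local address recorded on the accepted socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("0.0.0.0", 0))  # wildcard address, ephemeral port
        listener.listen(1)
        port = listener.getsockname()[1]
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(("127.0.0.1", port))
            conn, _peer = listener.accept()
            with conn:
                # The accepted socket knows which address the client used.
                return conn.getsockname()[0]


def udp_local_addr_after_receive():
    """Bind a UDP socket to the wildcard address, receive one loopback
    datagram, and return the local address the socket still reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("0.0.0.0", 0))  # wildcard address, ephemeral port
        server.settimeout(5)
        port = server.getsockname()[1]
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
            client.sendto(b"query", ("127.0.0.1", port))
            server.recvfrom(1024)
        # Receiving a datagram does not change the bound address.
        return server.getsockname()[0]


print(tcp_accepted_local_addr())       # 127.0.0.1
print(udp_local_addr_after_receive())  # 0.0.0.0
```

The accepted TCP socket reports the specific address the client
connected to, while the UDP socket stays bound to 0.0.0.0, so the
kernel has to choose a source address for each datagram sent from it.
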
In this instance, the socket we reply from is the same one that listens
for the incoming requests. I'm assuming that socket is set to listen
for DNS traffic on any address, so it will be bound to 0.0.0.0:53 on
the host. It will see the traffic to the VRRP address, but when we go
to send the reply, the kernel picks the local address according to its
routing tables, and it seems to be picking the default address.

That is a bug, at least in principle: we should remember the local
address for a UDP flow and use the same local address for replies. With
the flow table we now have the means of tracking this correctly.
However, we haven't implemented it (yet) because it's kind of fiddly:
we'd need to use additional getsockopt() calls to determine and control
the local address for datagrams on a per-packet basis.

> Outside interface:
>
> ens18
> 10.1.1.1/24 (main IP)
> 10.1.1.2/32 (VRRP IP)
>
> TCPdump shows the problem (note that the reply packet has source as the
> main IP, not the VRRP IP):
>
> IP 10.2.2.2.37392 > 10.1.1.2.53: 60211+ [1au] A? www.example.com. (56)
> IP 10.1.1.1.53 > 10.2.2.2.37392: 60211*- 1/0/1 A 192.168.254.7 (88)
>
> Tried starting the container with non-default pasta options, but the
> result is the same:
>
> --network pasta:-I,tap0,-o,10.1.1.2,--ipv4-only,-a,10.0.2.0,-n,24,-g,10.0.2.2,--dns-forward,10.0.2.3,--no-ndp,--no-dhcpv6,--no-dhcp

Most of those options are not relevant. --dns-forward is set by podman
and won't change anything (it's about DNS requests _out_ from the
container). Similarly, -g, --ipv4-only and the --no-* options are
unlikely to be relevant here. -a is also probably not relevant: it
controls the guest's address, but this is a host-side addressing
problem. That said, -a 10.0.2.0 -n 24 is probably not a good idea,
since it sets the guest's address to the network address.

-o is the one that you'd think is relevant, since it's supposed to
control the outbound local address.
The catch here is that while -o controls the local address for new
sockets we create to handle flows initiated by the guest, here the flow
is initiated by the host and so goes through the listening socket,
which is still bound to 0.0.0.0:53.

> Any help with possible solutions would be greatly appreciated.

I believe I have a workaround. The trick is that we want the listening
socket to be bound to the VRRP address. This can be done by setting
that specific IP when publishing the port, for example:

    podman run --publish 10.1.1.2:53:53/udp ...

(which should translate to the pasta option "-u 10.1.1.2/53:53").

The tradeoff for this workaround is that the container will now *only*
respond to DNS traffic to the VRRP address, but I'm guessing that might
be what you want anyway. You could also explicitly publish any
additional addresses you want to respond on.

But, longer term, this is a bug that we'd like to fix when we have
time. Would you mind filing a ticket on bugs.passt.top so we don't
forget it?

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
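
As a footnote to the workaround above, the translation from the podman
publish syntax to the pasta port option can be sketched in a few lines
of Python. The helper name publish_to_pasta_option is hypothetical, and
it only handles the ADDR:HOSTPORT:GUESTPORT[/PROTO] form with an IPv4
address. It illustrates the mapping described in this mail and is not
podman's actual code.

```python
def publish_to_pasta_option(spec: str) -> str:
    """Convert 'ADDR:HOSTPORT:GUESTPORT[/PROTO]' into the equivalent
    pasta '-t' or '-u' option string (illustrative helper only)."""
    mapping, _, proto = spec.partition("/")
    proto = proto or "tcp"  # podman assumes TCP when no protocol is given
    flags = {"tcp": "-t", "udp": "-u"}
    if proto not in flags:
        raise ValueError(f"unsupported protocol: {proto!r}")

    parts = mapping.split(":")
    if len(parts) != 3:
        raise ValueError("expected ADDR:HOSTPORT:GUESTPORT")
    addr, host_port, guest_port = parts

    # pasta puts the bound address before a slash, then host:guest ports.
    return f"{flags[proto]} {addr}/{host_port}:{guest_port}"


print(publish_to_pasta_option("10.1.1.2:53:53/udp"))  # -u 10.1.1.2/53:53
```

Publishing a second address, for example 10.1.1.1:53:53/udp, would give
a second "-u" option in the same way, which is how the container could
answer on more than one host address.
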