From: David Gibson <david@gibson.dropbear.id.au>
To: "Castelli, Anton" <anton.castelli@siu.edu>
Cc: "passt-user@passt.top" <passt-user@passt.top>
Subject: Re: Rootless Podman with VRRP
Date: Tue, 17 Sep 2024 11:08:52 +1000
Message-ID: <ZujWpPGEOnw53OA7@zatzit.fritz.box>
In-Reply-To: <172649928722.151934.9874324737582181440@maja>

On Mon, Sep 16, 2024 at 02:46:51PM +0000, Castelli, Anton via user wrote:
> Date: Mon, 16 Sep 2024 14:46:51 +0000
> From: "Castelli, Anton" <anton.castelli@siu.edu>
> To: "passt-user@passt.top" <passt-user@passt.top>
> Subject: Rootless Podman with VRRP
> List-Id: "For passt users: support, questions and answers"
>  <passt-user.passt.top>
> 
> I'm trying to get a service in a rootless Podman container (BIND DNS
> server) to respond correctly when using VRRP (via keepalived) on
> the host. It seems like Pasta will forward the inbound traffic to
> the container from the VRRP address, but the responses will be from
> the regular IP address instead of the VRRP address, which causes the
> client to ignore the response. I've tried adding Pasta network
> options to the container, but the behavior seems to be the same.

I'm not familiar with VRRP beyond a quick skim of the Wikipedia
article just now.  I think the (only) relevant thing is that the host
has multiple addresses, but it's possible there's some other
complexity on the host that I'm not factoring in yet.

> OS: Centos Stream 9
> Podman: 5.2.2
> Pasta: 0^20240806.gee36266-2.el9.x86_64-pasta

This version includes UDP flow tracking, which fixes some of the worst
bugs with UDP forwarding / addressing, but there are still some edge
cases as you've discovered.  I think I know what's going on here.

With TCP, once we accept() a connection, its local address on the
host is part of the accept()ed socket, so we'll always use the same
address for reply packets.
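
As a rough illustration of the TCP case (a standalone sketch, not pasta's
code), the snippet below listens on the wildcard address, connects to a
specific address, and reads the local address back from the accept()ed
socket.  The function name and the use of 127.0.0.2 as a stand-in for a
second host address are assumptions made for the example; it relies on
Linux treating all of 127.0.0.0/8 as local.

```python
import socket


def accepted_local_address(dst_ip: str) -> str:
    """Connect to dst_ip over TCP; return the local IP of the accept()ed socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("0.0.0.0", 0))  # wildcard address, kernel-chosen port
        listener.listen(1)
        listener.settimeout(2)
        port = listener.getsockname()[1]
        with socket.create_connection((dst_ip, port), timeout=2):
            conn, _peer = listener.accept()
            with conn:
                # The accepted socket is pinned to the address the client
                # connected to, even though the listener is bound to 0.0.0.0.
                return conn.getsockname()[0]


if __name__ == "__main__":
    # 127.0.0.2 is a hypothetical stand-in for the VRRP address.
    print(accepted_local_address("127.0.0.2"))
```

Replies written to the accepted socket carry that pinned address as their
source, which is why TCP does not show the problem.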

With UDP, however, this doesn't happen automatically: the local
address for replies is controlled by the bound address of the socket
we use to send them out.  In this instance that will be the same
socket as is listening for the incoming requests.  I'm assuming that
socket will be set to listen to DNS traffic on any address, so it will
be bound to 0.0.0.0:53 on the host.  That will see the traffic to the
VRRP address, but when we go to send the reply the kernel will pick
the local address according to its routing tables and it seems it's
picking the default address.
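
The behaviour described above can be reproduced outside pasta with a small
sketch.  It assumes Linux loopback, with 127.0.0.2 standing in for the VRRP
address and 127.0.0.1 for the client; the function name is made up for the
example.

```python
import socket


def wildcard_reply_source(dst_ip: str, client_ip: str = "127.0.0.1") -> str:
    """Send a datagram to dst_ip; return the source IP of the reply sent by
    a server whose socket is bound to 0.0.0.0."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("0.0.0.0", 0))  # wildcard address, kernel-chosen port
        server.settimeout(2)
        client.bind((client_ip, 0))
        client.settimeout(2)
        port = server.getsockname()[1]

        client.sendto(b"query", (dst_ip, port))
        _data, peer = server.recvfrom(2048)
        # Plain sendto(): the socket has no record of which local address
        # the request arrived on, so the kernel picks the source address
        # from its routing tables.
        server.sendto(b"reply", peer)
        _reply, (src_ip, _src_port) = client.recvfrom(2048)
        return src_ip


if __name__ == "__main__":
    # The request goes to 127.0.0.2; the printed reply source need not match.
    print(wildcard_reply_source("127.0.0.2"))
```

On a typical Linux loopback setup the reply should come back from
127.0.0.1 rather than 127.0.0.2, mirroring the tcpdump output quoted
further down.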

That is a bug, at least in principle: we should remember the local
address for a UDP flow and use the same local address for replies.
With the flow table we now have the means of correctly tracking this.
However, we haven't implemented it (yet) because it's kind of
fiddly: we'd need to use additional getsockopt() calls to determine
and control the local address for datagrams on a per-packet basis.
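
One Linux mechanism for per-packet control of the local address is the
IP_PKTINFO socket option together with recvmsg()/sendmsg() ancillary data.
Whether pasta would use exactly this is an assumption; the sketch below
only shows the general technique.  The helper names and the loopback
addresses are hypothetical, and the fallback value 8 for IP_PKTINFO is the
Linux constant, used where the Python socket module does not export it.

```python
import socket
import struct

# Assumption: Linux.  Some Python versions do not export socket.IP_PKTINFO,
# so fall back to the Linux value from <linux/in.h>.
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)
# struct in_pktinfo: ipi_ifindex, ipi_spec_dst, ipi_addr
IN_PKTINFO = struct.Struct("=i4s4s")


def recv_with_dst(sock):
    """Receive one datagram; return (payload, peer, IP it was addressed to)."""
    data, ancdata, _flags, peer = sock.recvmsg(
        2048, socket.CMSG_SPACE(IN_PKTINFO.size))
    for level, ctype, cdata in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
            _ifindex, _spec_dst, addr = IN_PKTINFO.unpack(
                cdata[:IN_PKTINFO.size])
            return data, peer, socket.inet_ntoa(addr)
    return data, peer, None


def send_from(sock, payload, peer, src_ip):
    """Send one datagram to peer, asking the kernel to use src_ip as source."""
    # ipi_ifindex 0: no interface override; ipi_spec_dst: desired source.
    pktinfo = IN_PKTINFO.pack(0, socket.inet_aton(src_ip), bytes(4))
    sock.sendmsg([payload], [(socket.IPPROTO_IP, IP_PKTINFO, pktinfo)],
                 0, peer)


def pktinfo_reply_source(dst_ip, client_ip="127.0.0.1"):
    """Like a wildcard-bound server, but replying from the address the
    request was sent to; returns the reply's source IP seen by the client."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
        server.bind(("0.0.0.0", 0))
        server.settimeout(2)
        client.bind((client_ip, 0))
        client.settimeout(2)
        port = server.getsockname()[1]

        client.sendto(b"query", (dst_ip, port))
        _data, peer, local_ip = recv_with_dst(server)
        send_from(server, b"reply", peer, local_ip)
        _reply, (src_ip, _src_port) = client.recvfrom(2048)
        return src_ip


if __name__ == "__main__":
    print(pktinfo_reply_source("127.0.0.2"))
```

The extra bookkeeping (one option to enable, one control message to parse
on receive, one to build on send) gives a sense of why this is fiddly
compared with a plain recvfrom()/sendto() pair.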

> Outside interface:
> 
> ens18
>     10.1.1.1/24 (main IP)
>     10.1.1.2/32 (VRRP IP)
> 
> tcpdump shows the problem (note that the reply packet has the main IP as its source, not the VRRP IP):
> 
> IP 10.2.2.2.37392 > 10.1.1.2.53: 60211+ [1au] A? www.example.com. (56)
> IP 10.1.1.1.53 > 10.2.2.2.37392: 60211*- 1/0/1 A 192.168.254.7 (88)
> 
> Tried starting the container with non-default pasta options, but the result is the same:
> 
> --network pasta:-I,tap0,-o,10.1.1.2,--ipv4-only,-a,10.0.2.0,-n,24,-g,10.0.2.2,--dns-forward,10.0.2.3,--no-ndp,--no-dhcpv6,--no-dhcp

Most of those options are not relevant.  --dns-forward is set by
podman and won't change anything (it's about DNS requests _out_ from
the container).  Similarly -g, --ipv4-only and the --no-* options are
unlikely to be relevant here.

-a is also probably not relevant: it controls the guest's address, but
this is a host side addressing problem.  That said, -a 10.0.2.0 -n 24
is probably not a good idea, since it's setting the guest's address to
the network address.
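
The point about -a 10.0.2.0 -n 24 can be checked with Python's standard
ipaddress module.  This is only a quick sanity check; the helper name is
made up, and 10.0.2.15 is an arbitrary host address chosen for contrast,
not something from the original report.

```python
import ipaddress


def is_network_address(addr: str, prefix_len: int) -> bool:
    """Return True if addr is the network address of its own subnet."""
    iface = ipaddress.ip_interface(f"{addr}/{prefix_len}")
    return iface.ip == iface.network.network_address


if __name__ == "__main__":
    # Address and prefix length as passed via -a and -n in the report.
    print(is_network_address("10.0.2.0", 24))
    # An arbitrary host address in the same subnet, for comparison.
    print(is_network_address("10.0.2.15", 24))
```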

-o is the one that you'd think is relevant, since it's supposed to
control the outbound local address.  The catch here is that while it
will control the local address for new sockets we create to handle
flows initiated by the guest, here the flow is initiated by the host
and so goes through the listening socket, which is still bound to
0.0.0.0:53.

> Any help with possible solutions would be greatly appreciated.

I believe I have a workaround.  The trick here is that we want the
listening socket to be bound to the VRRP address.  This can be done by
setting that specific IP when publishing the port, for example:
	podman run --publish 10.1.1.2:53:53/udp ...
(which should translate to the pasta option "-u 10.1.1.2/53:53").

The tradeoff for this workaround is that the container will now *only*
respond to DNS traffic to the VRRP address, but I'm guessing that
might be what you want anyway.  You could also explicitly publish any
additional addresses you want to respond on.
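
At the socket level, the idea behind the workaround can be sketched as one
UDP socket per address instead of a single wildcard socket.  This is an
illustration only, not how podman or pasta implement port publishing; the
helper names are hypothetical and 127.0.0.2 / 127.0.0.3 are loopback
stand-ins for the addresses to answer on.

```python
import selectors
import socket


def bind_each(addresses, port=0):
    """Open one UDP socket per address instead of a single 0.0.0.0 socket."""
    socks = []
    for ip in addresses:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((ip, port))
        socks.append(s)
    return socks


def serve_ready(socks, timeout=2.0):
    """Echo one datagram on each readable socket; return how many were
    answered.  Each reply leaves through the socket the request arrived
    on, so its source is that socket's bound address."""
    answered = 0
    with selectors.DefaultSelector() as sel:
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        for key, _events in sel.select(timeout):
            data, peer = key.fileobj.recvfrom(2048)
            key.fileobj.sendto(data, peer)
            answered += 1
    return answered


if __name__ == "__main__":
    socks = bind_each(["127.0.0.2", "127.0.0.3"])
    for s in socks:
        print("listening on", s.getsockname())
        s.close()
```

Because each socket is bound to a specific address, the kernel has no
source address left to choose, and traffic to any address without its own
socket is simply not received.  That is the tradeoff described above.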

But, longer term, this is a bug that we'd like to fix when we have
time.  Would you mind filing a ticket on bugs.passt.top so we don't
forget it?

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
