public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: passt-dev@passt.top
Subject: Re: [PATCH 17/18] tests: Correct determination of host interface name in tests
Date: Wed, 20 Jul 2022 10:23:22 +0200	[thread overview]
Message-ID: <20220720102322.2d4d85f9@elisabeth> (raw)
In-Reply-To: <YtetooUM55/Ol2ZN@yekko>

[-- Attachment #1: Type: text/plain, Size: 6549 bytes --]

On Wed, 20 Jul 2022 17:24:18 +1000
David Gibson <david(a)gibson.dropbear.id.au> wrote:

> On Wed, Jul 20, 2022 at 12:47:57PM +1000, David Gibson wrote:
> > On Tue, Jul 19, 2022 at 09:05:52PM +0200, Stefano Brivio wrote:  
> > > On Tue, 19 Jul 2022 16:20:45 +1000
> > > David Gibson <david(a)gibson.dropbear.id.au> wrote:
> > >   
> > > > On Fri, Jul 15, 2022 at 03:21:40PM +1000, David Gibson wrote:  
> > > > > By default, passt itself attaches to the first host interface with a
> > > > > default route.  However, when determining the host interface name the tests
> > > > > implicitly select the *last* host interface: they use a jq expression which
> > > > > will list all interfaces with default routes, but the way output detection
> > > > > works in the scripts, it will only pick up the last line.
> > > > > 
> > > > > If there are multiple interfaces with default routes on the host, and they
> > > > > each have a different address, this can cause spurious test
> > > > > failures.    
> > > > 
> > > > It seems this change is not enough to always fix the tests when there
> > > > are multiple default routes.  I'm still sometimes getting failures,
> > > > now because passt itself doesn't seem to be picking the interface
> > > > with the first default route.
> > > > 
> > > > I'm wondering if this is because ip(8) is sorting the output, not just
> > > > presenting it in the same order that the underlying netlink interface
> > > > does.  
> > > 
> > > I don't see that happening at least in my environment (and I also can't
> > > see any code that would sort it, that pretty much comes from
> > > rtnl_dump_filter_l() in lib/libnetlink.c).
> > > 
> > > The netlink "filter", though, is slightly different. For IPv4:
> > > 
> > > $ strace -e sendto ip route show >/dev/null
> > > sendto(3, [{{len=36, type=RTM_GETROUTE, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1658257027, pid=0}, {rtm_family=AF_INET, rtm_dst_len=0, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_UNSPEC, rtm_protocol=RTPROT_UNSPEC, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNSPEC, rtm_flags=0}, {{nla_len=8, nla_type=RTA_TABLE}, RT_TABLE_MAIN}}, {len=0, type=0 /* NLMSG_??? */, flags=0, seq=0, pid=0}], 156, 0, NULL, 0) = 156
> > > 
> > > $ strace -e sendto ./passt -f
> > > sendto(5, {{len=28, type=RTM_GETROUTE, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=0, pid=0}, {rtm_family=AF_INET, rtm_dst_len=0, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_UNSPEC, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNICAST, rtm_flags=0}}, 28, 0, NULL, 0) = 28
> > > [...]
> > > 
> > > and, while I don't think the FIB trie is actually descended in a
> > > different way, we might still have a slightly different result from the
> > > kernel (if I recall correctly, I didn't check right now).  
> 
> So, I've been staring at this for most of the day.  I don't think the
> difference is because of the slightly different filters.  Instead it's
> an easy to miss subtletly in nl_get_ext_if().
> 
> In nl_get_ext_if() both first_v4 and first_v6 will be set to the
> interface which meets the criteria and comes first in the order of
> routes returned by netlink from the kernel.  However, there's this early return:
> 
> 	word = (long *)has_v4;
> 	for (i = 0; i < ARRAY_SIZE(has_v4) / sizeof(long); i++, word++) {
> 		tmp = *word;
> 		while ((n = ffsl(tmp))) {
> 			int ifi = i * sizeof(long) * 8 + n - 1;
> 
> 			if (!first_v4)
> 				first_v4 = ifi;
> 
> 			tmp &= ~(1UL << (n - 1));
> 			if (bitmap_isset(has_v6, ifi)) {
> 				*v4 = *v6 = IP_VERSION_ENABLED;
> 				return ifi;    <================ HERE
> 			}
> 		}
> 	}
> 
> Which means that if we find an interface with both a v4 and a v6
> default route, we'll return it immediately.  That loop is across the
> has_v4 bitmap, so we'll return the first one in *interface index order*,
> which might not necessarily be the same as the order the routes are
> returned from netlink.

Whoops :) ...right, sorry, I forgot about that.

> In the other cases: one of v4 or v6 is disabled, or there's no
> interface with both v4 and v6 default routes, then we'll end up
> returning the first interface in netlink route order.  Which makes
> passt's overall priority of selecting interfaces pretty hard to
> follow.
> 
> I've been thinking a bunch about what the order should be.  The
> obvious cleanup to the current logic would be:
> 	1. If there are >0 interfaces with both default v4 and v6
> 	   routes, pick the one with the lowest metric
> 	2. Otherwise if there are >0 interfaces with a default v4
> 	   route, pick the one with the lowest metric
> 	3. Otherwise if there are >0 interfaces with a default v6
> 	   route, pick the one with the lowest metric
> 	4. Otherwise fail.
> 
> ..but, I think that's the wrong choice.  For one thing it gets weird
> in the unlikely but possible case that there are two interfaces with
> both v4 and v6 routes, but one has a lower v4 metric and the other has
> a lower v6 metric.  But more importantly, I think it fails the "do
> what you want by default" test in the case of a host that has both v4
> and v6 routing, but not on the same interface.  That's quite a
> realistic situation if the v6 connectivity is coming from a tunnel
> (6in4 or 6rd or 6over4 or whatever).

Definitely, I didn't consider the tunneled IPv6 case, but it looks
still rather real even nowadays.

> Instead, I think we should separately pick an external interface for
> v4 and v6 connectivity.  Yes, it's weird to have a view of two
> different host interfaces on the same virtual interface.  It also
> means pasta has to arbitrarily pick one of the host interface names
> for the namespace, which will make the tests a bit uglier.  But it
> makes picking the right interfaces a lot simpler and clearer, and does
> what you probably want for the case of a host with tunnelled ipv6
> connectivity.

I totally agree. It's still less weird than the alternative.

We'll need to store two interface indices (say, "ifi4" and "ifi6"), but
their usage right now is almost entirely limited to configuration and
netlink functions.

One additional usage is to select a given interface for sockets bound
to link-local addresses by means of setting .sin6_scope_id in struct
sockaddr_in6 -- that happens in tcp_conn_from_tap() and sock_l4().
There, we just need to switch to the new "ifi6" index.

Let me know if I should start looking into this or if you can do it
instead.

-- 
Stefano


  reply	other threads:[~2022-07-20  8:23 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-15  5:21 [PATCH 00/18] Test fixes, batch 5 David Gibson
2022-07-15  5:21 ` [PATCH 01/18] tests: Remove no longer needed /usr/bin/bash link David Gibson
2022-07-15  5:21 ` [PATCH 02/18] tests: Let Fedora find dhclient-script in /usr/sbin David Gibson
2022-07-15  5:21 ` [PATCH 03/18] tests: Add rudimentary debugging to dhclient-script David Gibson
2022-07-15  5:21 ` [PATCH 04/18] tests: Add some extra dhclient support directories to mbuto.img David Gibson
2022-07-15  5:21 ` [PATCH 05/18] tests: More robust parsing of resolv.conf for DHCP tests David Gibson
2022-07-15  5:21 ` [PATCH 06/18] tests: Handle the case of a nameserver on host localhost David Gibson
2022-07-15  5:21 ` [PATCH 07/18] tests: Correctly handle domain search list in dhclient-script David Gibson
2022-07-15  5:21 ` [PATCH 08/18] tests: Fix detection of empty 'hout' responses in passt{,_in_ns} tests David Gibson
2022-07-15  5:21 ` [PATCH 09/18] tests: Fix creation of test file in udp passt tests David Gibson
2022-07-15  5:21 ` [PATCH 10/18] valgrind needs futex David Gibson
2022-07-15  5:21 ` [PATCH 11/18] tests: Use socat instead of netcat David Gibson
2022-07-15  5:21 ` [PATCH 12/18] tests: Remove unnecessary ^D in passt_in_ns teardown David Gibson
2022-07-15  5:21 ` [PATCH 13/18] tests: Remove unnecessary truncation of temporary files in udp tests David Gibson
2022-07-15  5:21 ` [PATCH 14/18] tests: Use dhclient --no-pid for namespaces in two_guests tests David Gibson
2022-07-15  5:21 ` [PATCH 15/18] tests: Clean up better after iperf tests David Gibson
2022-07-15  5:21 ` [PATCH 16/18] tests: No need to retrieve host ifname in ndp/pasta David Gibson
2022-07-15  5:21 ` [PATCH 17/18] tests: Correct determination of host interface name in tests David Gibson
2022-07-19  6:20   ` David Gibson
2022-07-19 19:05     ` Stefano Brivio
2022-07-20  2:47       ` David Gibson
2022-07-20  7:24         ` David Gibson
2022-07-20  8:23           ` Stefano Brivio [this message]
2022-07-20 10:33             ` David Gibson
2022-07-15  5:21 ` [PATCH 18/18] demo: Use git protocol downloads David Gibson
2022-07-21 12:13 ` [PATCH 00/18] Test fixes, batch 5 Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220720102322.2d4d85f9@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).