From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTP id A55B95A0272 for ; Wed, 20 Mar 2024 12:34:59 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1710934498; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tyEiMBzoBIvIgJmavhO/f7Pp1kzrf0qxvrbDxImhmBQ=; b=WHV2IeOb0G/F3zFO2yhkZgds1d6w3NR+Mu05834DsuG6J7sze1Pb4RXGdKmKibSanqpqDA k5gKG5m88bH73YtSNbC7VTElZhN1DQ5oAm7yfpn0ijw3Sz8IfaJwZYnM6dvuaokgii81v/ 1Gu+QnT9ghinwW92yy3hKP499XVOq24= Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-591-bpns14QsM3WpA2TJZ0lefg-1; Wed, 20 Mar 2024 07:34:57 -0400 X-MC-Unique: bpns14QsM3WpA2TJZ0lefg-1 Received: by mail-wm1-f71.google.com with SMTP id 5b1f17b1804b1-4146eed32bdso737225e9.1 for ; Wed, 20 Mar 2024 04:34:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710934496; x=1711539296; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=tyEiMBzoBIvIgJmavhO/f7Pp1kzrf0qxvrbDxImhmBQ=; b=YPmGoU8115xX2s33cX4jBH8rZN1J9G2poDLc70RVnkvEKutEkiWJT+VgoqfMCpa1OF 6iNY6fzfDuGKMhV67iafu+nu8LEkjGvBodZU1OAXEr4VtAS7yv1ml+sk3FD0GytGpF+A wiLigQ6P0BtUIhznFEmI9NodUuNIiTr9dO3tIlb6I4iQcEo8apYajmWFLIjDlhvWUjWN vEXuX0afnRWP/+I7o+pTpOmFpHm6rF9SlFPdeqAdmeq8wF1IlyqHt3CjntIggf/Z0uS5 ILWum6tlJwc4Lr+oNEpIwhKp9lIPqSjr1dNtiDoxVdz+Dj9hIG5C17Q0vilOUNtBGCQN HYEA== X-Forwarded-Encrypted: i=1; AJvYcCUKTZH0oRVN4iggHw0d87cjPE0m7PURDZfa1lJ1nYfPtE0Y88hC1eC8zqe/AhRmsB8Q135+ElAZFwXTPsCuNKk8d3XY X-Gm-Message-State: AOJu0YyMLYBEPY8Z6+Xpy1tfLZumRADOsP+0VQpd6RfUo5KCi5KRodzD 249c5ALF6Znb5Ms9AHrxGvKJ6aiS0Z/OHCWmboAiNXAlnOurHh8TWP/VX6M6ewgFOT2mvDzH3R7 DFvHJMEEj3cM8NdH1WiTbafw/YRWWze5/uNsE1z6Gx/bfo5Hvdg== X-Received: by 2002:a05:600c:5109:b0:414:204:df50 with SMTP id o9-20020a05600c510900b004140204df50mr1379836wms.4.1710934495847; Wed, 20 Mar 2024 04:34:55 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGWti+VnpCArglgb4survwQy+ZApjAdF02eD/2721I9Cmu87i0fLzB3ZjxvddTmo4JEKbAAVg== X-Received: by 2002:a05:600c:5109:b0:414:204:df50 with SMTP id o9-20020a05600c510900b004140204df50mr1379825wms.4.1710934495481; Wed, 20 Mar 2024 04:34:55 -0700 (PDT) Received: from [192.168.188.25] ([80.243.52.134]) by smtp.gmail.com with ESMTPSA id jh2-20020a05600ca08200b00413e63bb140sm1952278wmb.41.2024.03.20.04.34.54 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 20 Mar 2024 04:34:55 -0700 (PDT) Message-ID: Date: Wed, 20 Mar 2024 12:34:54 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] netlink: Fix selection of template interface To: Stefano Brivio References: <20240320053339.553418-1-david@gibson.dropbear.id.au> <80107179-5357-4b14-94ae-c82925a97ba8@redhat.com> <20240320120243.49516b90@elisabeth> From: Paul Holzinger In-Reply-To: <20240320120243.49516b90@elisabeth> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Message-ID-Hash: EZJVORF53Z5YPMQ7IHWKZULSVR4JJOAI X-Message-ID-Hash: EZJVORF53Z5YPMQ7IHWKZULSVR4JJOAI X-MailFrom: pholzing@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: David Gibson , passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On 20/03/2024 12:02, Stefano Brivio wrote: > On Wed, 20 Mar 2024 11:51:59 +0100 > Paul Holzinger wrote: > >> On 20/03/2024 06:33, David Gibson wrote: >>> Since f919dc7a4b1c ("conf, netlink: Don't require a default route to >>> start"), if there is only one host interface with routes, we will pick that >>> as the template interface, even if there are no default routes for an IP >>> version. Unfortunately this selection had a serious flaw: in some cases >>> it would 'return' in the middle of an nl_foreach() loop, meaning we >>> wouldn't consume all the netlink responses for our query. This could cause >>> later netlink operations to fail as we read leftover responses from the >>> aborted query. >>> >>> Rewrite the interface detection to avoid this problem. While we're there: >>> * Perform detection of both default and non-default routes in a single >>> pass, avoiding an ugly goto >>> * Give more detail on error and working but unusual paths about the >>> situation (no suitable interface, multiple possible candidates, etc.). >>> >>> Fixes: f919dc7a4b1c ("conf, netlink: Don't require a default route to start") >>> Link: https://bugs.passt.top/show_bug.cgi?id=83 >>> Link: https://github.com/containers/podman/issues/22052 >>> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2270257 >>> >>> Signed-off-by: David Gibson >>> --- >>> conf.c | 4 ++-- >>> netlink.c | 62 ++++++++++++++++++++++++++++++++++--------------------- >>> 2 files changed, 40 insertions(+), 26 deletions(-) >>> >>> diff --git a/conf.c b/conf.c >>> index 644752cc..9e0318a5 100644 >>> --- a/conf.c >>> +++ b/conf.c >>> @@ -584,7 +584,7 @@ static unsigned int conf_ip4(unsigned int ifi, >>> ifi = nl_get_ext_if(nl_sock, AF_INET); >>> >>> if (!ifi) { >>> - info("No interface with a route for IPv4: disabling IPv4"); >>> + info("Couldn't pick external interface: disabling IPv4"); >>> return 0; >>> } >>> >>> @@ -656,7 +656,7 @@ static unsigned int conf_ip6(unsigned int ifi, >>> ifi = nl_get_ext_if(nl_sock, AF_INET6); >>> >>> if (!ifi) { >>> - info("No interface with a route for IPv6: disabling IPv6"); >>> + info("Couldn't pick external interface: disabling IPv6"); >>> return 0; >>> } >>> >>> diff --git a/netlink.c b/netlink.c >>> index 632304c1..c0a5f158 100644 >>> --- a/netlink.c >>> +++ b/netlink.c >>> @@ -254,8 +254,8 @@ unsigned int nl_get_ext_if(int s, sa_family_t af) >>> .rtm.rtm_type = RTN_UNICAST, >>> .rtm.rtm_family = af, >>> }; >>> - bool default_only = true; >>> - unsigned int ifi = 0; >>> + unsigned defifi = 0, anyifi = 0; >>> + unsigned ndef = 0, nany = 0; >>> struct nlmsghdr *nh; >>> struct rtattr *rta; >>> char buf[NLBUFSIZ]; >>> @@ -263,7 +263,6 @@ unsigned int nl_get_ext_if(int s, sa_family_t af) >>> uint32_t seq; >>> size_t na; >>> >>> -again: >>> /* Look for an interface with a default route first, failing that, look >>> * for any interface with a route, and pick it only if it's the only >>> * interface with a route. >>> @@ -271,46 +270,61 @@ again: >>> seq = nl_send(s, &req, RTM_GETROUTE, NLM_F_DUMP, sizeof(req)); >>> nl_foreach_oftype(nh, status, s, buf, seq, RTM_NEWROUTE) { >>> struct rtmsg *rtm = (struct rtmsg *)NLMSG_DATA(nh); >>> + unsigned thisifi = 0; >>> >>> - if (default_only) { >>> - if (ifi || rtm->rtm_dst_len || rtm->rtm_family != af) >>> - continue; >>> - } else { >>> - if (rtm->rtm_family != af) >>> - continue; >>> - } >>> + if (rtm->rtm_family != af) >>> + continue; >>> >>> for (rta = RTM_RTA(rtm), na = RTM_PAYLOAD(nh); RTA_OK(rta, na); >>> rta = RTA_NEXT(rta, na)) { >>> if (rta->rta_type == RTA_OIF) { >>> - if (!default_only && ifi && >>> - ifi != *(unsigned int *)RTA_DATA(rta)) >>> - return 0; >>> - >>> - ifi = *(unsigned int *)RTA_DATA(rta); >>> + thisifi = *(unsigned int *)RTA_DATA(rta); >>> } else if (rta->rta_type == RTA_MULTIPATH) { >>> const struct rtnexthop *rtnh; >>> >>> rtnh = (struct rtnexthop *)RTA_DATA(rta); >>> + thisifi = rtnh->rtnh_ifindex; >>> + } >>> + } >>> >>> - if (!default_only && ifi && >>> - (int)ifi != rtnh->rtnh_ifindex) >>> - return 0; >>> + if (!thisifi) >>> + continue; /* No interface for this route */ >>> >>> - ifi = rtnh->rtnh_ifindex; >>> - } >>> + if (rtm->rtm_dst_len == 0) { >>> + /* Default route */ >>> + ndef++; >>> + if (!defifi) >>> + defifi = thisifi; >>> + } else { >>> + /* Non-default route */ >>> + nany++; >>> + if (!anyifi) >>> + anyifi = thisifi; >>> } >>> } >>> >>> if (status < 0) >>> warn("netlink: RTM_GETROUTE failed: %s", strerror(-status)); >>> >>> - if (!ifi && default_only) { >>> - default_only = false; >>> - goto again; >>> + if (defifi) { >>> + if (ndef > 1) >>> + info("Multiple default %s routes, picked first", >>> + af == AF_INET ? "IPv4" : "IPv6"); >>> + return defifi; >>> } >>> >>> - return ifi; >>> + if (anyifi) { >>> + if (nany == 1) >>> + return anyifi; >>> + >>> + warn("Multiple interfaces with %s routes, use -i to select one", >>> + af == AF_INET ? "IPv4" : "IPv6"); >> This should not be a warning, for me this always triggers because I have >> two interfaces with link local addresses and no global ipv6 route as I >> do not have any ipv6 connection. > I was about to reply as I just applied this with s/warn/info/ here :) The more I think about this I think this suggestion is just wrong. I have a valid ipv4 connection and no ipv6, offering me to use -i is just wrong as it will not help me, especially if one uses a different interface for ipv4/ipv6 connections. > >> Or maybe the correct fix is to never consider ipv6 link local routes for >> this logic? At least I cannot see the purpose of using a interface with >> only a link local route. >> >>> + } >>> + >>> + if (!nany) >>> + warn("No interfaces with %s routes", af == AF_INET ? "IPv4" : "IPv6"); > ...and here, because if one has no IPv6 routes we would reintroduce the > issue we just fixed in 338b6321ac0d ("conf: No routable interface for > IPv4 or IPv6 is informational, not a warning"). > > I think the purpose of picking interfaces based on routes for > link-local destinations is for practical test setups like the one > described in https://github.com/containers/podman/issues/21896. > > Functionally it doesn't make sense, but it shouldn't harm either > (right?). I am not sure how the issue is related to link local addresses, for ipv6 a link local route must explicitly never be routed anywhere else besides the current interface so picking this seems incorrect as it will be unusable not matter what. > >>> + >>> + return 0; >>> } >>> >>> /** >> I reproduced by having a second interface and confirm this patch fixes it. > Thanks for checking! Let's hope that was the case in the failing test. >