From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by passt.top (Postfix, from userid 1000) id A73FC5A0271; Sun, 14 May 2023 20:14:15 +0200 (CEST)
From: Stefano Brivio
To: passt-dev@passt.top
Subject: [PATCH 03/10] netlink: Add functionality to copy routes from outer namespace
Date: Sun, 14 May 2023 20:14:08 +0200
Message-Id: <20230514181415.313420-4-sbrivio@redhat.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230514181415.313420-1-sbrivio@redhat.com>
References: <20230514181415.313420-1-sbrivio@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
CC: Callum Parsey, me@yawnt.com, David Gibson, lemmi@nerd2nerd.org
List-Id: Development discussion and patches for passt

Instead of just fetching the default gateway and configuring a single
equivalent route in the target namespace, on 'pasta --config-net', it
might be desirable in some cases to copy the whole set of routes
corresponding to a given output interface.

For instance, in:
  https://github.com/containers/podman/issues/18539
  IPv4 Default Route Does Not Propagate to Pasta Containers on Hetzner VPSes

configuring the default gateway won't work without a gateway-less
route (specifying the output interface only), because the default
gateway is, somewhat dubiously, not on the same subnet as the
container.

This is a similar case to the one covered by commit 7656a6f88882
("conf: Adjust netmask on mismatch between IPv4 address/netmask and
gateway"), and I'm not exactly proud of that workaround.

We also have:
  https://bugs.passt.top/show_bug.cgi?id=49
  pasta does not work with tap-style interface

for which, eventually, we should be able to configure a gateway-less
route in the target namespace.

Introduce different operation modes for nl_route(), including a new
NL_DUP one, not exposed yet, which simply parrots back to the kernel
the route dump for a given interface from the outer namespace, fixing
up flags and interface indices on the way, and requesting to add the
same routes in the target namespace, on the interface we manage.

I'm not kidding, it actually works pretty well.

Link: https://github.com/containers/podman/issues/18539
Link: https://bugs.passt.top/show_bug.cgi?id=49
Signed-off-by: Stefano Brivio
---
 conf.c    |  4 ++--
 netlink.c | 59 +++++++++++++++++++++++++++++++++++++++----------------
 netlink.h |  9 ++++++++-
 pasta.c   |  6 ++++--
 4 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/conf.c b/conf.c
index 447b000..aad2b00 100644
--- a/conf.c
+++ b/conf.c
@@ -646,7 +646,7 @@ static unsigned int conf_ip4(unsigned int ifi,
 	}
 
 	if (IN4_IS_ADDR_UNSPECIFIED(&ip4->gw))
-		nl_route(0, ifi, AF_INET, &ip4->gw);
+		nl_route(NL_GET, ifi, 0, AF_INET, &ip4->gw);
 
 	if (IN4_IS_ADDR_UNSPECIFIED(&ip4->addr))
 		nl_addr(0, ifi, AF_INET, &ip4->addr, &ip4->prefix_len, NULL);
@@ -718,7 +718,7 @@ static unsigned int conf_ip6(unsigned int ifi,
 	}
 
 	if (IN6_IS_ADDR_UNSPECIFIED(&ip6->gw))
-		nl_route(0, ifi, AF_INET6, &ip6->gw);
+		nl_route(NL_GET, ifi, 0, AF_INET6, &ip6->gw);
 
 	nl_addr(0, ifi, AF_INET6,
 		IN6_IS_ADDR_UNSPECIFIED(&ip6->addr) ? &ip6->addr : NULL,
diff --git a/netlink.c b/netlink.c
index c07a13c..0ff94ae 100644
--- a/netlink.c
+++ b/netlink.c
@@ -185,16 +185,16 @@ unsigned int nl_get_ext_if(sa_family_t af)
 }
 
 /**
- * nl_route() - Get/set default gateway for given interface and address family
- * @ns:		Use netlink socket in namespace
- * @ifi:	Interface index
+ * nl_route() - Get/set/copy routes for given interface and address family
+ * @op:		Requested operation
+ * @ifi:	Interface index in outer network namespace
+ * @ifi_ns:	Interface index in target namespace for NL_SET, NL_DUP
  * @af:		Address family
- * @gw:		Default gateway to fill if zero, to set if not
+ * @gw:		Default gateway to fill on NL_GET, to set on NL_SET
  */
-void nl_route(int ns, unsigned int ifi, sa_family_t af, void *gw)
+void nl_route(enum nl_op op, unsigned int ifi, unsigned int ifi_ns,
+	      sa_family_t af, void *gw)
 {
-	int set = (af == AF_INET6 && !IN6_IS_ADDR_UNSPECIFIED(gw)) ||
-		  (af == AF_INET && *(uint32_t *)gw);
 	struct req_t {
 		struct nlmsghdr nlh;
 		struct rtmsg rtm;
@@ -215,7 +215,7 @@ void nl_route(int ns, unsigned int ifi, sa_family_t af, void *gw)
 			} r4;
 		} set;
 	} req = {
-		.nlh.nlmsg_type = set ? RTM_NEWROUTE : RTM_GETROUTE,
+		.nlh.nlmsg_type = op == NL_SET ? RTM_NEWROUTE : RTM_GETROUTE,
 		.nlh.nlmsg_flags = NLM_F_REQUEST,
 		.nlh.nlmsg_seq = nl_seq++,
 
@@ -228,14 +228,14 @@ void nl_route(int ns, unsigned int ifi, sa_family_t af, void *gw)
 		.rta.rta_len = RTA_LENGTH(sizeof(unsigned int)),
 		.ifi = ifi,
 	};
+	ssize_t n, nlmsgs_size;
 	struct nlmsghdr *nh;
 	struct rtattr *rta;
-	struct rtmsg *rtm;
 	char buf[NLBUFSIZ];
-	ssize_t n;
+	struct rtmsg *rtm;
 	size_t na;
 
-	if (set) {
+	if (op == NL_SET) {
 		if (af == AF_INET6) {
 			size_t rta_len = RTA_LENGTH(sizeof(req.set.r6.d));
 
@@ -269,31 +269,56 @@ void nl_route(int ns, unsigned int ifi, sa_family_t af, void *gw)
 		req.nlh.nlmsg_flags |= NLM_F_DUMP;
 	}
 
-	if ((n = nl_req(ns, buf, &req, req.nlh.nlmsg_len)) < 0 || set)
+	if ((n = nl_req(op == NL_SET, buf, &req, req.nlh.nlmsg_len)) < 0)
+		return;
+
+	if (op == NL_SET)
 		return;
 
 	nh = (struct nlmsghdr *)buf;
+	nlmsgs_size = n;
+
 	for ( ; NLMSG_OK(nh, n); nh = NLMSG_NEXT(nh, n)) {
 		if (nh->nlmsg_type != RTM_NEWROUTE)
 			goto next;
 
+		if (op == NL_DUP) {
+			nh->nlmsg_seq = nl_seq++;
+			nh->nlmsg_pid = 0;
+			nh->nlmsg_flags &= ~NLM_F_DUMP_FILTERED;
+			nh->nlmsg_flags |= NLM_F_REQUEST | NLM_F_ACK |
+					   NLM_F_CREATE;
+		}
+
 		rtm = (struct rtmsg *)NLMSG_DATA(nh);
-		if (rtm->rtm_dst_len)
+		if (op == NL_GET && rtm->rtm_dst_len)
 			continue;
 
 		for (rta = RTM_RTA(rtm), na = RTM_PAYLOAD(nh); RTA_OK(rta, na);
 		     rta = RTA_NEXT(rta, na)) {
-			if (rta->rta_type != RTA_GATEWAY)
-				continue;
+			if (op == NL_GET) {
+				if (rta->rta_type != RTA_GATEWAY)
+					continue;
 
-			memcpy(gw, RTA_DATA(rta), RTA_PAYLOAD(rta));
-			return;
+				memcpy(gw, RTA_DATA(rta), RTA_PAYLOAD(rta));
+				return;
+			}
+
+			if (op == NL_DUP && rta->rta_type == RTA_OIF)
+				*(unsigned int *)RTA_DATA(rta) = ifi_ns;
 		}
 
 next:
 		if (nh->nlmsg_type == NLMSG_DONE)
 			break;
 	}
+
+	if (op == NL_DUP) {
+		char resp[NLBUFSIZ];
+
+		nh = (struct nlmsghdr *)buf;
+		nl_req(1, resp, nh, nlmsgs_size);
+	}
 }
 
 /**
diff --git a/netlink.h b/netlink.h
index ca4d6ef..217cf1e 100644
--- a/netlink.h
+++ b/netlink.h
@@ -6,9 +6,16 @@
 #ifndef NETLINK_H
 #define NETLINK_H
 
+enum nl_op {
+	NL_GET,
+	NL_SET,
+	NL_DUP,
+};
+
 void nl_sock_init(const struct ctx *c, bool ns);
 unsigned int nl_get_ext_if(sa_family_t af);
-void nl_route(int ns, unsigned int ifi, sa_family_t af, void *gw);
+void nl_route(enum nl_op op, unsigned int ifi, unsigned int ifi_ns,
+	      sa_family_t af, void *gw);
 void nl_addr(int ns, unsigned int ifi, sa_family_t af,
 	     void *addr, int *prefix_len, void *addr_l);
 void nl_link(int ns, unsigned int ifi, void *mac, int up, int mtu);
diff --git a/pasta.c b/pasta.c
index 2fa0168..9161a1f 100644
--- a/pasta.c
+++ b/pasta.c
@@ -273,14 +273,16 @@ void pasta_ns_conf(struct ctx *c)
 		if (c->ifi4) {
 			nl_addr(1, c->pasta_ifi, AF_INET, &c->ip4.addr,
 				&c->ip4.prefix_len, NULL);
-			nl_route(1, c->pasta_ifi, AF_INET, &c->ip4.gw);
+			nl_route(NL_SET, c->ifi4, c->pasta_ifi, AF_INET,
+				 &c->ip4.gw);
 		}
 
 		if (c->ifi6) {
 			int prefix_len = 64;
 			nl_addr(1, c->pasta_ifi, AF_INET6, &c->ip6.addr,
 				&prefix_len, NULL);
-			nl_route(1, c->pasta_ifi, AF_INET6, &c->ip6.gw);
+			nl_route(NL_SET, c->ifi6, c->pasta_ifi, AF_INET6,
+				 &c->ip6.gw);
 		}
 	} else {
 		nl_link(1, c->pasta_ifi, c->mac_guest, 0, 0);
-- 
2.39.2
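
Illustration only, not part of the patch: a minimal sketch of the
per-message fix-up that NL_DUP applies to each dumped route before the
buffer is sent back to the kernel, pulled out into a standalone helper so
it can be read in isolation. The names route_dup_fixup() and struct
route_msg are hypothetical (the patch does this inline in nl_route()), and
the single-attribute message layout is an assumption made to keep the
example small; real dumps usually carry several attributes per route.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical layout: one route message carrying only an RTA_OIF
 * attribute, similar in shape to the request built in nl_route().
 */
struct route_msg {
	struct nlmsghdr nlh;
	struct rtmsg rtm;
	struct rtattr rta;
	unsigned int ifi;
};

/* Hypothetical helper: turn one RTM_NEWROUTE message taken from a route
 * dump into a request that adds the same route on interface ifi_ns.
 * Returns 0 on success, -1 if the message is not a route.
 */
static int route_dup_fixup(struct nlmsghdr *nh, uint32_t seq,
			   unsigned int ifi_ns)
{
	struct rtattr *rta;
	struct rtmsg *rtm;
	int na;

	if (nh->nlmsg_type != RTM_NEWROUTE)
		return -1;

	/* Make the dump reply look like a fresh request from us */
	nh->nlmsg_seq = seq;
	nh->nlmsg_pid = 0;
	nh->nlmsg_flags &= ~NLM_F_DUMP_FILTERED;
	nh->nlmsg_flags |= NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE;

	/* Point the route at the interface in the target namespace */
	rtm = (struct rtmsg *)NLMSG_DATA(nh);
	for (rta = RTM_RTA(rtm), na = RTM_PAYLOAD(nh); RTA_OK(rta, na);
	     rta = RTA_NEXT(rta, na)) {
		if (rta->rta_type == RTA_OIF)
			memcpy(RTA_DATA(rta), &ifi_ns, sizeof(ifi_ns));
	}

	return 0;
}
```

Under these assumptions, a caller would walk the dump with NLMSG_OK() and
NLMSG_NEXT(), call the helper on each message, and then send the whole
buffer through the netlink socket of the target namespace, which is what
the NL_DUP branch above does with nl_req().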