From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: passt.top; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=GPPesqhY; dkim-atps=neutral Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by passt.top (Postfix) with ESMTPS id 8793E5A0272 for ; Mon, 03 Feb 2025 10:45:10 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1738575909; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Kj2N7/LYbjAahsQyWjGDVzXX4PSrqwjC9VSo0rmp9U0=; b=GPPesqhY8U2+sfQu7lVE70ecPhca+RBUGb+giZcheA5DQ09AjqmzyeULPkEOevpVDheslq KahaiJtANjtSadjsrcEh/vER5mLgiS9XbgLXHKe7SXV6EhBzHRp/qgkkVFuXxmmKvcRb52 QtIqF/xP0KUHLhb+r4pQOqyypApAulM= Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-232-HWc6Zbt1OTyHc712ZEXUyQ-1; Mon, 03 Feb 2025 04:45:07 -0500 X-MC-Unique: HWc6Zbt1OTyHc712ZEXUyQ-1 X-Mimecast-MFC-AGG-ID: HWc6Zbt1OTyHc712ZEXUyQ Received: by mail-wm1-f69.google.com with SMTP id 5b1f17b1804b1-43625ceae52so21633505e9.0 for ; Mon, 03 Feb 2025 01:45:07 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738575906; x=1739180706; h=content-transfer-encoding:mime-version:organization:references :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=Kj2N7/LYbjAahsQyWjGDVzXX4PSrqwjC9VSo0rmp9U0=; b=wZ95tbh3V+3YsnD7L6+96WD/7JBu4qZ0206QMhh7V4/LNY3vM+62RD8tnTHrYzZ4vz eN+DGvg7wmGqT2WZ6cmWqpFtwGTx4YH22kwwyor6U38h8T+ljQ/BNoS95ATtQMYZ9Z5x Gvjgfx7VfP81D9vpO9zsELMUYAgfkVE7C4cjRkRH0x76hjWRbaE2bGfnmMxMoWLqReoX BTwNl+WKm4TE9R+NfMwRKeVk/PsHNgR0J7Xt2OcS0GBssy5ywEHBqFS+r6Av9VSBpC6m sA/3AZMAAjvDOPrhtlJQlx18jvzKkvLcIut5GxQfztaklGsDoZSFT5c4+SXHvrUw0p9u rJ3w== X-Gm-Message-State: AOJu0Yz3RLnZD1TDR4YasyyddTTelEqlxwRxOz+fnp0vRp8zJiSx4sJg T8m7XJcEHIJ7Br/iUZeL12iIJXjuovmPtSq7N0rbzkSVZZYWQRziw1wvQyWvl0ZS+t5vGzfoMj4 zMISUcC0q2RB5Hij5n7OkQiNSk+0Ht4OP4NFFKX/cKo1+vXaDxQ== X-Gm-Gg: ASbGncsmQJpGFUYqqygn6mdPlejH2Uefwf+bV+suQsyqwQwVtXYShw43GgDIi1NJp/q CDyZ3G2FVsjPPLEnyrAfWTHW0vJxKV6ELVEWysYfacPxo5NvgWVrHEVtAax5fayNMCheAp6WrSa /7AYThAOZmX6vk43pf4X0Doq0nYgU8w8Cda+u3HTg5ZCtOFacXFJyJM3nR6GPnb7lAQqQVAz4Qj tFwUfxzPvvxbd/5UjIH03fV8Xr/bpeawlrLGqPGaSiEVGh+I5oNcKfYmxzrMi8zLDjOhXppKVBn r7rQkMRtFqXJkWl71dFfL/lCOxTYN3h8mA== X-Received: by 2002:a05:600c:3d05:b0:434:fa55:eb56 with SMTP id 5b1f17b1804b1-438dc3ab80bmr197080765e9.7.1738575906406; Mon, 03 Feb 2025 01:45:06 -0800 (PST) X-Google-Smtp-Source: AGHT+IHmuBxRgQVRdsMV8GcNkFES10eFFGz6tv2ROGr9o4udlzbXduG7Ct+jyUxEJS5EsXvc4Xz3UQ== X-Received: by 2002:a05:600c:3d05:b0:434:fa55:eb56 with SMTP id 5b1f17b1804b1-438dc3ab80bmr197080425e9.7.1738575905959; Mon, 03 Feb 2025 01:45:05 -0800 (PST) Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [176.103.220.4]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-38c5c102bdbsm12091551f8f.28.2025.02.03.01.45.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 03 Feb 2025 01:45:03 -0800 (PST) Date: Mon, 3 Feb 2025 10:45:02 +0100 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH v3 18/20] tcp: Get our socket port using getsockname() when connecting from guest Message-ID: <20250203104502.7f1f649b@elisabeth> In-Reply-To: References: <20250131193953.3034031-1-sbrivio@redhat.com> <20250131193953.3034031-19-sbrivio@redhat.com> <20250203070937.61c30df9@elisabeth> Organization: Red Hat X-Mailer: Claws Mail 4.2.0 (GTK 3.24.41; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: rLLKjL1ZLcsoEbZPG-w18shHTlzM68v3ZbmzjHuQKKQ_1738575906 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: I7NUN43452UKNT2XGEI5J2P4PIKDFKZB X-Message-ID-Hash: I7NUN43452UKNT2XGEI5J2P4PIKDFKZB X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, Laurent Vivier X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Mon, 3 Feb 2025 19:59:12 +1100 David Gibson wrote: > On Mon, Feb 03, 2025 at 07:09:37AM +0100, Stefano Brivio wrote: > > On Mon, 3 Feb 2025 13:05:33 +1100 > > David Gibson wrote: > > > > > On Fri, Jan 31, 2025 at 08:39:51PM +0100, Stefano Brivio wrote: > > > > For migration only: we need to store 'oport', our socket-side port, > > > > as we establish a connection from the guest, so that we can bind the > > > > same oport as source port in the migration target. > > > > > > > > Use getsockname() to fetch that. > > > > > > > > Signed-off-by: Stefano Brivio > > > > --- > > > > flow.c | 4 ++-- > > > > flow_table.h | 4 ++-- > > > > tcp.c | 24 +++++++++++++++++++++++- > > > > 3 files changed, 27 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/flow.c b/flow.c > > > > index 5638ff1..506cbac 100644 > > > > --- a/flow.c > > > > +++ b/flow.c > > > > @@ -411,8 +411,8 @@ const struct flowside *flow_initiate_sa(union flow *flow, uint8_t pif, > > > > * > > > > * Return: pointer to the target flowside information > > > > */ > > > > -const struct flowside *flow_target(const struct ctx *c, union flow *flow, > > > > - uint8_t proto) > > > > +struct flowside *flow_target(const struct ctx *c, union flow *flow, > > > > + uint8_t proto) > > > > { > > > > char estr[INANY_ADDRSTRLEN], fstr[INANY_ADDRSTRLEN]; > > > > struct flow_common *f = &flow->f; > > > > diff --git a/flow_table.h b/flow_table.h > > > > index 633805d..b107107 100644 > > > > --- a/flow_table.h > > > > +++ b/flow_table.h > > > > @@ -178,8 +178,8 @@ const struct flowside *flow_target_af(union flow *flow, uint8_t pif, > > > > sa_family_t af, > > > > const void *saddr, in_port_t sport, > > > > const void *daddr, in_port_t dport); > > > > -const struct flowside *flow_target(const struct ctx *c, union flow *flow, > > > > - uint8_t proto); > > > > +struct flowside *flow_target(const struct ctx *c, union flow *flow, > > > > + uint8_t proto); > > > > > > > > union flow *flow_set_type(union flow *flow, enum flow_type type); > > > > #define FLOW_SET_TYPE(flow_, t_, var_) (&flow_set_type((flow_), (t_))->var_) > > > > diff --git a/tcp.c b/tcp.c > > > > index 0bd2a02..4fd405b 100644 > > > > --- a/tcp.c > > > > +++ b/tcp.c > > > > @@ -1471,6 +1471,8 @@ static void tcp_bind_outbound(const struct ctx *c, > > > > * @opts: Pointer to start of options > > > > * @optlen: Bytes in options: caller MUST ensure available length > > > > * @now: Current timestamp > > > > + * > > > > + * #syscalls:vu getsockname > > > > */ > > > > static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, > > > > const void *saddr, const void *daddr, > > > > @@ -1479,9 +1481,10 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, > > > > { > > > > in_port_t srcport = ntohs(th->source); > > > > in_port_t dstport = ntohs(th->dest); > > > > - const struct flowside *ini, *tgt; > > > > + const struct flowside *ini; > > > > struct tcp_tap_conn *conn; > > > > union sockaddr_inany sa; > > > > + struct flowside *tgt; > > > > union flow *flow; > > > > int s = -1, mss; > > > > uint64_t hash; > > > > @@ -1586,6 +1589,25 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, > > > > } > > > > > > > > tcp_epoll_ctl(c, conn); > > > > + > > > > + if (c->mode == MODE_VU) { /* To rebind to same oport after migration */ > > > > > > I suspect we'll want this local side information in more places in > > > future, but this is fine for now. > > > > > > > + if (af == AF_INET) { > > > > + struct sockaddr_in s_in; > > > > + socklen_t sl; > > > > + > > > > + sl = sizeof(s_in); > > > > + getsockname(s, (struct sockaddr *)&s_in, &sl); > > > > + tgt->oport = ntohs(s_in.sin_port); > > > > > > Since we're already doing the getsockname() we should also update > > > tgt->oaddr, and that might matter in cases where the host has multiple > > > local addresses. > > > > I had that in a previous version, because I was actually restoring it > > as I thought it was needed, then I dropped it. > > > > We expect the configuration of the target to be the same as the source, > > Eh... up to a point. I'm not sure about the kubevirt case, but in > general when migrating amongst managed hosts it would not surprise me > to find that the destination has multiple IPs. One is the "public" IP > used by the workload and will indeed be the same between the ends. > However, there could also be a "management" IP that's different > between them - after all the management system will need to talk to > both source and destination simultaneously for a little while. > > > so the same connect() should yield to the same source address being > > used (minus the fact that we don't set socket options yet (point 9. of > > the to-do list from cover letter). So should we really bind() to a > > specific source address just because we happen to know it? I'm not > > quite sure. > > Yes, we should. Note that right now, the outbound socket already has > a fixed bound address, assigned at connect() time. We don't track it > in our internal data structures, it's still there as part of the > socket state, and our behaviour reflects that. That forms part of the > socket state which we should maintain across migration. Okay, fine, then let me add that back. > We could do this lazily - one of the pre-migrate steps could be a > getsockname() on any flows which have an unassigned oaddr/oport. Sure, on the other hand that's an optimisation, and I already have code ready for this. -- Stefano