From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top, Laurent Vivier <lvivier@redhat.com>
Subject: Re: [PATCH v3 18/20] tcp: Get our socket port using getsockname() when connecting from guest
Date: Mon, 3 Feb 2025 10:45:02 +0100 [thread overview]
Message-ID: <20250203104502.7f1f649b@elisabeth> (raw)
In-Reply-To: <Z6CFYDZFUP4cyoO4@zatzit>
On Mon, 3 Feb 2025 19:59:12 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Mon, Feb 03, 2025 at 07:09:37AM +0100, Stefano Brivio wrote:
> > On Mon, 3 Feb 2025 13:05:33 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Fri, Jan 31, 2025 at 08:39:51PM +0100, Stefano Brivio wrote:
> > > > For migration only: we need to store 'oport', our socket-side port,
> > > > as we establish a connection from the guest, so that we can bind the
> > > > same oport as source port in the migration target.
> > > >
> > > > Use getsockname() to fetch that.
> > > >
> > > > Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> > > > ---
> > > > flow.c | 4 ++--
> > > > flow_table.h | 4 ++--
> > > > tcp.c | 24 +++++++++++++++++++++++-
> > > > 3 files changed, 27 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/flow.c b/flow.c
> > > > index 5638ff1..506cbac 100644
> > > > --- a/flow.c
> > > > +++ b/flow.c
> > > > @@ -411,8 +411,8 @@ const struct flowside *flow_initiate_sa(union flow *flow, uint8_t pif,
> > > > *
> > > > * Return: pointer to the target flowside information
> > > > */
> > > > -const struct flowside *flow_target(const struct ctx *c, union flow *flow,
> > > > - uint8_t proto)
> > > > +struct flowside *flow_target(const struct ctx *c, union flow *flow,
> > > > + uint8_t proto)
> > > > {
> > > > char estr[INANY_ADDRSTRLEN], fstr[INANY_ADDRSTRLEN];
> > > > struct flow_common *f = &flow->f;
> > > > diff --git a/flow_table.h b/flow_table.h
> > > > index 633805d..b107107 100644
> > > > --- a/flow_table.h
> > > > +++ b/flow_table.h
> > > > @@ -178,8 +178,8 @@ const struct flowside *flow_target_af(union flow *flow, uint8_t pif,
> > > > sa_family_t af,
> > > > const void *saddr, in_port_t sport,
> > > > const void *daddr, in_port_t dport);
> > > > -const struct flowside *flow_target(const struct ctx *c, union flow *flow,
> > > > - uint8_t proto);
> > > > +struct flowside *flow_target(const struct ctx *c, union flow *flow,
> > > > + uint8_t proto);
> > > >
> > > > union flow *flow_set_type(union flow *flow, enum flow_type type);
> > > > #define FLOW_SET_TYPE(flow_, t_, var_) (&flow_set_type((flow_), (t_))->var_)
> > > > diff --git a/tcp.c b/tcp.c
> > > > index 0bd2a02..4fd405b 100644
> > > > --- a/tcp.c
> > > > +++ b/tcp.c
> > > > @@ -1471,6 +1471,8 @@ static void tcp_bind_outbound(const struct ctx *c,
> > > > * @opts: Pointer to start of options
> > > > * @optlen: Bytes in options: caller MUST ensure available length
> > > > * @now: Current timestamp
> > > > + *
> > > > + * #syscalls:vu getsockname
> > > > */
> > > > static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af,
> > > > const void *saddr, const void *daddr,
> > > > @@ -1479,9 +1481,10 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af,
> > > > {
> > > > in_port_t srcport = ntohs(th->source);
> > > > in_port_t dstport = ntohs(th->dest);
> > > > - const struct flowside *ini, *tgt;
> > > > + const struct flowside *ini;
> > > > struct tcp_tap_conn *conn;
> > > > union sockaddr_inany sa;
> > > > + struct flowside *tgt;
> > > > union flow *flow;
> > > > int s = -1, mss;
> > > > uint64_t hash;
> > > > @@ -1586,6 +1589,25 @@ static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af,
> > > > }
> > > >
> > > > tcp_epoll_ctl(c, conn);
> > > > +
> > > > + if (c->mode == MODE_VU) { /* To rebind to same oport after migration */
> > >
> > > I suspect we'll want this local side information in more places in
> > > future, but this is fine for now.
> > >
> > > > + if (af == AF_INET) {
> > > > + struct sockaddr_in s_in;
> > > > + socklen_t sl;
> > > > +
> > > > + sl = sizeof(s_in);
> > > > + getsockname(s, (struct sockaddr *)&s_in, &sl);
> > > > + tgt->oport = ntohs(s_in.sin_port);
> > >
> > > Since we're already doing the getsockname() we should also update
> > > tgt->oaddr, and that might matter in cases where the host has multiple
> > > local addresses.
> >
> > I had that in a previous version, because I was actually restoring it
> > as I thought it was needed, then I dropped it.
> >
> > We expect the configuration of the target to be the same as the source,
>
> Eh... up to a point. I'm not sure about the kubevirt case, but in
> general when migrating amongst managed hosts it would not surprise me
> to find that the destination has multiple IPs. One is the "public" IP
> used by the workload and will indeed be the same between the ends.
> However, there could also be a "management" IP that's different
> between them - after all the management system will need to talk to
> both source and destination simultaneously for a little while.
>
> > so the same connect() should yield to the same source address being
> > used (minus the fact that we don't set socket options yet (point 9. of
> > the to-do list from cover letter). So should we really bind() to a
> > specific source address just because we happen to know it? I'm not
> > quite sure.
>
> Yes, we should. Note that right now, the outbound socket already has
> a fixed bound address, assigned at connect() time. We don't track it
> in our internal data structures, it's still there as part of the
> socket state, and our behaviour reflects that. That forms part of the
> socket state which we should maintain across migration.
Okay, fine, then let me add that back.
> We could do this lazily - one of the pre-migrate steps could be a
> getsockname() on any flows which have an unassigned oaddr/oport.
Sure, on the other hand that's an optimisation, and I already have code
ready for this.
--
Stefano
next prev parent reply other threads:[~2025-02-03 9:45 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-31 19:39 [PATCH v3 00/20] Draft, incomplete series introducing state migration Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 01/20] tcp: Always pass NULL event with EPOLL_CTL_DEL Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 02/20] util: Rename and make global vu_remove_watch() Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 03/20] icmp, udp: Pad time_t timestamp to 64-bit to ease state migration Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 04/20] flow, flow_table: Pad flow table entries to 128 bytes, hash entries to 32 bits Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 05/20] flow_table: Use size in extern declaration for flowtab Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 06/20] util: Add read_remainder() and read_all_buf() Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 07/20] Introduce facilities for guest migration on top of vhost-user infrastructure Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 08/20] Introduce passt-repair Stefano Brivio
2025-02-03 1:46 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 09/20] Add interfaces and configuration bits for passt-repair Stefano Brivio
2025-02-03 5:22 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 10/20] flow, tcp: Basic pre-migration source handler to dump sequence numbers Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 11/20] migrate: vu_migrate_{source,target}() aren't actually vu speciic Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 12/20] migrate: Move repair_sock_init() to vu_init() Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 13/20] migrate: Make more handling common rather than vhost-user specific Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 14/20] migrate: Don't handle the migration channel through epoll Stefano Brivio
2025-02-03 1:50 ` David Gibson
2025-02-03 5:38 ` Stefano Brivio
2025-02-03 8:45 ` David Gibson
2025-02-03 2:16 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 15/20] flow, flow_table: Export declaration of hash table Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 16/20] vhost_user: Turn vhost-user message reports to trace() Stefano Brivio
2025-02-03 3:11 ` David Gibson
2025-02-03 6:10 ` Stefano Brivio
2025-02-03 8:47 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 17/20] vhost_user: Make source quit after reporting migration state Stefano Brivio
2025-02-03 1:55 ` David Gibson
2025-02-03 6:09 ` Stefano Brivio
2025-02-03 8:52 ` David Gibson
2025-02-03 9:44 ` Stefano Brivio
2025-01-31 19:39 ` [PATCH v3 18/20] tcp: Get our socket port using getsockname() when connecting from guest Stefano Brivio
2025-02-03 2:05 ` David Gibson
2025-02-03 6:09 ` Stefano Brivio
2025-02-03 8:59 ` David Gibson
2025-02-03 9:45 ` Stefano Brivio [this message]
2025-01-31 19:39 ` [PATCH v3 19/20] tcp: Add HOSTSIDE(x), HOSTFLOW(x) macros Stefano Brivio
2025-02-03 2:06 ` David Gibson
2025-01-31 19:39 ` [PATCH v3 20/20] Implement target side of migration Stefano Brivio
2025-02-01 7:45 ` [PATCH v3 00/20] Draft, incomplete series introducing state migration Stefano Brivio
2025-02-03 2:18 ` David Gibson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250203104502.7f1f649b@elisabeth \
--to=sbrivio@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=lvivier@redhat.com \
--cc=passt-dev@passt.top \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).