Date: Wed, 29 Jan 2025 17:15:41 +1100
From: David Gibson
To: Stefano Brivio
Subject: Re: [PATCH v2 8/8] flow, tcp: Basic pre-migration source handler to dump sequence numbers
References: <20250128233940.1235855-1-sbrivio@redhat.com> <20250128233940.1235855-9-sbrivio@redhat.com>
In-Reply-To: <20250128233940.1235855-9-sbrivio@redhat.com>
Cc: passt-dev@passt.top, Laurent Vivier
List-Id: Development discussion and patches for passt

On Wed, Jan 29, 2025 at 12:39:40AM +0100, Stefano Brivio wrote:
> Very much draft quality, but it works. Ask passt-repair to switch
> TCP sockets to repair mode and dump their current sequence numbers to
> the flow table, which will be transferred and used by the target in
> the next step.
>
> Signed-off-by: Stefano Brivio
> ---
>  flow.c     | 43 +++++++++++++++++++++++++++++++++++++++++
>  flow.h     |  1 +
>  migrate.c  |  1 +
>  tcp.c      | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tcp_conn.h |  5 +++++
>  5 files changed, 106 insertions(+)
>
> diff --git a/flow.c b/flow.c
> index ee1221b..e7148b2 100644
> --- a/flow.c
> +++ b/flow.c
> @@ -19,6 +19,7 @@
>  #include "inany.h"
>  #include "flow.h"
>  #include "flow_table.h"
> +#include "repair.h"
>
>  const char *flow_state_str[] = {
>  	[FLOW_STATE_FREE]		= "FREE",
> @@ -874,6 +875,48 @@ void flow_defer_handler(const struct ctx *c, const struct timespec *now)
>  	*last_next = FLOW_MAX;
>  }
>
> +/**
> + * flow_migrate_source_pre() - Prepare all source flows for migration
> + * @c:		Execution context
> + * @m:		Migration metadata
> + *
> + * Return: 0 on success
> + */
> +int flow_migrate_source_pre(struct ctx *c, struct migrate_meta *m)
> +{
> +	unsigned i;
> +	int rc;
> +
> +	(void)m;
> +
> +	for (i = 0; i < FLOW_MAX; i++) {	/* TODO: iterator with skip */
> +		union flow *flow = &flowtab[i];
> +
> +		if (flow->f.state == FLOW_STATE_FREE)
> +			i += flow->free.n - 1;
> +		else if (flow->f.state == FLOW_STATE_ACTIVE &&

We should probably just abort any flows that are in pre-ACTIVE state
at migration time.

Wait... IIRC flows have to be in ACTIVE state (or already cancelled)
once we get to the next epoll cycle.  So we can possibly just assert
that state is either ACTIVE or FREE.

> +			 flow->f.type == FLOW_TCP)
> +			rc = tcp_flow_repair_on(c, &flow->tcp);
> +
> +		if (rc)
> +			return rc;		/* TODO: rollback */
> +	}
> +
> +	repair_flush(c);		/* TODO: move to TCP logic */
> +
> +	for (i = 0; i < FLOW_MAX; i++) {	/* TODO: iterator with skip */
> +		union flow *flow = &flowtab[i];
> +
> +		if (flow->f.state == FLOW_STATE_FREE)
> +			i += flow->free.n - 1;
> +		else if (flow->f.state == FLOW_STATE_ACTIVE &&
> +			 flow->f.type == FLOW_TCP)
> +			tcp_flow_dump_seq(c, &flow->tcp);
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * flow_init() - Initialise flow related data structures
>   */
> diff --git a/flow.h b/flow.h
> index 8eb5964..ff390a6 100644
> --- a/flow.h
> +++ b/flow.h
> @@ -255,6 +255,7 @@ union flow;
>
>  void flow_init(void);
>  void flow_defer_handler(const struct ctx *c, const struct timespec *now);
> +int flow_migrate_source_pre(struct ctx *c, struct migrate_meta *m);
>
>  void flow_log_(const struct flow_common *f, int pri, const char *fmt, ...)
>  	__attribute__((format(printf, 3, 4)));
> diff --git a/migrate.c b/migrate.c
> index b8b79e0..6707c02 100644
> --- a/migrate.c
> +++ b/migrate.c
> @@ -56,6 +56,7 @@ static struct migrate_data data_versions[] = {
>
>  /* Handlers to call in source before sending data */
>  struct migrate_handler handlers_source_pre[] = {
> +	{ flow_migrate_source_pre },
>  	{ 0 },
>  };
>
> diff --git a/tcp.c b/tcp.c
> index c89f323..3a3038b 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -299,6 +299,7 @@
>  #include "log.h"
>  #include "inany.h"
>  #include "flow.h"
> +#include "repair.h"
>  #include "linux_dep.h"
>
>  #include "flow_table.h"
> @@ -868,6 +869,61 @@ void tcp_defer_handler(struct ctx *c)
>  	tcp_payload_flush(c);
>  }
>
> +/**
> + * tcp_flow_repair_on() - Enable repair mode for a single TCP flow
> + * @c:		Execution context
> + * @conn:	Pointer to the TCP connection structure
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int tcp_flow_repair_on(struct ctx *c, const struct tcp_tap_conn *conn)
> +{
> +	int rc = 0;
> +
> +	if ((rc = repair_set(c, conn->sock, TCP_REPAIR_ON)))
> +		err("Failed to set TCP_REPAIR for socket %i", conn->sock);

Well.. except that the error could just as easily have been on a
previous socket that wasn't flushed yet.

> +
> +	return rc;
> +}
> +
> +/**
> + * tcp_flow_dump_seq() - Dump sequences for send and receive queues
> + * @c:		Execution context
> + * @conn:	Pointer to the TCP connection structure
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int tcp_flow_dump_seq(struct ctx *c, struct tcp_tap_conn *conn)
> +{
> +	int v, s = conn->sock;
> +	socklen_t vlen;
> +
> +	(void)c;
> +
> +	vlen = sizeof(v);
> +
> +	v = TCP_SEND_QUEUE;
> +	/* TODO: proper error management and prints */
> +	if (setsockopt(s, SOL_TCP, TCP_REPAIR_QUEUE, &v, vlen))
> +		return -errno;
> +
> +	if (getsockopt(s, SOL_TCP, TCP_QUEUE_SEQ, &conn->sock_seq_snd, &vlen))
> +		return -errno;
> +
> +	debug("Send queue sequence %u for socket %i", conn->sock_seq_snd, s);
> +
> +	v = TCP_RECV_QUEUE;
> +	if (setsockopt(s, SOL_TCP, TCP_REPAIR_QUEUE, &v, vlen))
> +		return -errno;
> +
> +	if (getsockopt(s, SOL_TCP, TCP_QUEUE_SEQ, &conn->sock_seq_rcv, &vlen))
> +		return -errno;
> +
> +	debug("Receive queue sequence %u for socket %i", conn->sock_seq_rcv, s);
> +
> +	return 0;
> +}
> +
>  /**
>   * tcp_fill_header() - Fill the TCP header fields for a given TCP segment.
>   *
> diff --git a/tcp_conn.h b/tcp_conn.h
> index d342680..0c3e197 100644
> --- a/tcp_conn.h
> +++ b/tcp_conn.h
> @@ -94,6 +94,9 @@ struct tcp_tap_conn {
>  	uint32_t	seq_from_tap;
>  	uint32_t	seq_ack_to_tap;
>  	uint32_t	seq_init_from_tap;
> +
> +	uint32_t	sock_seq_snd;
> +	uint32_t	sock_seq_rcv;
>  };
>
>  /**
> @@ -140,6 +143,8 @@ extern int init_sock_pool4	[TCP_SOCK_POOL_SIZE];
>  extern int init_sock_pool6	[TCP_SOCK_POOL_SIZE];
>
>  bool tcp_flow_defer(const struct tcp_tap_conn *conn);
> +int tcp_flow_repair_on(struct ctx *c, const struct tcp_tap_conn *conn);
> +int tcp_flow_dump_seq(struct ctx *c, struct tcp_tap_conn *conn);
>  bool tcp_splice_flow_defer(struct tcp_splice_conn *conn);
>  void tcp_splice_timer(const struct ctx *c, struct tcp_splice_conn *conn);
>  int tcp_conn_pool_sock(int pool[]);

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
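
For illustration only, the suggestion in the flow.c comments (treat anything that is neither ACTIVE nor FREE as unexpected at migration time, while still skipping runs of free entries) could look roughly like the sketch below. Everything here is a hypothetical stand-in: `struct sketch_flow`, the `SK_*` enums and `sketch_count_tcp()` are not passt's types or functions, and the sketch returns -1 where the real code might prefer an assertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the flow table: not passt's types */
enum sketch_state { SK_FREE, SK_NEW, SK_ACTIVE };
enum sketch_type  { SK_NONE, SK_TCP, SK_UDP };

struct sketch_flow {
	enum sketch_state state;
	enum sketch_type type;
	unsigned free_n;	/* First entry of a free run: length of the run */
};

/**
 * sketch_count_tcp() - Walk the table, skipping runs of free entries
 * @tab:	Flow table (assumed layout, see above)
 * @max:	Number of entries in @tab
 *
 * Return: number of active TCP flows, -1 if any visited entry is neither
 *	   FREE nor ACTIVE (where real code might assert instead)
 */
static int sketch_count_tcp(const struct sketch_flow *tab, size_t max)
{
	int count = 0;
	size_t i;

	for (i = 0; i < max; i++) {
		const struct sketch_flow *f = &tab[i];

		if (f->state == SK_FREE) {
			/* The loop increment accounts for the first entry */
			if (f->free_n > 1)
				i += f->free_n - 1;
			continue;
		}

		if (f->state != SK_ACTIVE)
			return -1;	/* Pre-ACTIVE flow: unexpected here */

		if (f->type == SK_TCP)
			count++;
	}

	return count;
}
```

With a helper shaped like this, the two loops in flow_migrate_source_pre() would share one traversal rule, and a flow caught in a pre-ACTIVE state would be reported in one place rather than silently skipped.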