From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Gibson
To: Stefano Brivio, passt-dev@passt.top
Subject: [PATCH v24 3/5] flow, migrate: Flow migration skeleton
Date: Sat, 15 Feb 2025 00:08:43 +1100
Message-ID: <20250214130845.3475757-4-david@gibson.dropbear.id.au>
In-Reply-To: <20250214130845.3475757-1-david@gibson.dropbear.id.au>
References: <20250214130845.3475757-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Implement an outline of how to transfer the state of individual flows
during migration.  We split the handling into three stages:

1) freeze

   This selects which flows are eligible to be migrated, marking them
   with a new flow state 'MIGRATING'.  It makes any preparations for the
   flow to be migrated, but writes no data to the state stream.  If
   migration fails, we roll back by executing step (3) on the source
   instead of on the target.

2) transfer

   This actually transfers data about the flows.  Flows are created on
   the target in MIGRATING state.

3) thaw

   This "activates" the transferred flows, letting them resume normal
   operation, but reads no additional data from the state stream.  If
   this fails for a flow, generally that flow is terminated, but we
   otherwise carry on.
Signed-off-by: David Gibson
---
 flow.c       | 186 +++++++++++++++++++++++++++++++++++++++++++++++++--
 flow.h       |   5 ++
 flow_table.h |  22 ++++++
 migrate.c    |  39 +++++++++--
 migrate.h    |   2 +
 5 files changed, 240 insertions(+), 14 deletions(-)

diff --git a/flow.c b/flow.c
index d9b888ce..90bff884 100644
--- a/flow.c
+++ b/flow.c
@@ -27,6 +27,7 @@ const char *flow_state_str[] = {
 	[FLOW_STATE_TGT]	= "TGT",
 	[FLOW_STATE_TYPED]	= "TYPED",
 	[FLOW_STATE_ACTIVE]	= "ACTIVE",
+	[FLOW_STATE_MIGRATING]	= "MIGRATING",
 };
 static_assert(ARRAY_SIZE(flow_state_str) == FLOW_NUM_STATES,
	      "flow_state_str[] doesn't match enum flow_state");
@@ -806,19 +807,18 @@ void flow_defer_handler(const struct ctx *c, const struct timespec *now)
 			continue;
 		}
 
-		case FLOW_STATE_NEW:
-		case FLOW_STATE_INI:
-		case FLOW_STATE_TGT:
-		case FLOW_STATE_TYPED:
-			/* Incomplete flow at end of cycle */
-			ASSERT(false);
-			break;
+		case FLOW_STATE_MIGRATING:
+			/* This can occur between the SET_DEVICE_STATE_FD and
+			 * CHECK_DEVICE_STATE vhost-user events, ignore.
+			 */
+			continue;
 
 		case FLOW_STATE_ACTIVE:
 			/* Nothing to do */
 			break;
 
 		default:
+			/* Bad flow state at end of cycle */
 			ASSERT(false);
 		}
 
@@ -874,6 +874,178 @@ void flow_defer_handler(const struct ctx *c, const struct timespec *now)
 	*last_next = FLOW_MAX;
 }
 
+/**
+ * flow_freeze() - Select and prepare flows for migration
+ * @c:		Execution context
+ * @stage:	Migration stage information, unused
+ * @fd:		Migration file descriptor, unused
+ *
+ * MUST NOT READ OR WRITE MIGRATION STREAM
+ *
+ * Return: 0 on success, positive error code on failure
+ */
+int flow_freeze(struct ctx *c, const struct migrate_stage *stage, int fd)
+{
+	union flow *flow;
+
+	(void)stage;
+	(void)fd;
+	(void)c;
+
+	flow_foreach(flow) {
+		/* rc == 0 : not a migration candidate
+		 * rc > 0  : migration candidate
+		 * rc < 0  : error, fail migration
+		 */
+		int rc;
+
+		switch (flow->f.type) {
+		default:
+			/* Otherwise assume it doesn't migrate */
+			rc = 0;
+		}
+
+		if (rc < 0) {
+			/* FIXME: rollback */
+			return -rc;
+		}
+		if (rc > 0)
+			flow_set_state(&flow->f, FLOW_STATE_MIGRATING);
+	}
+
+	return 0;
+}
+
+/**
+ * flow_thaw() - Resume flow operation after migration
+ * @c:		Execution context
+ * @stage:	Migration stage information, unused
+ * @fd:		Migration file descriptor, unused
+ *
+ * MUST NOT READ OR WRITE MIGRATION STREAM
+ *
+ * Return: 0 on success, positive error code on failure
+ */
+int flow_thaw(struct ctx *c, const struct migrate_stage *stage, int fd)
+{
+	struct flow_free_cluster *free_head = NULL;
+	unsigned *last_next = &flow_first_free;
+	union flow *flow;
+
+	(void)stage;
+	(void)fd;
+	(void)c;
+
+	/* FIXME: Share logic with flow_defer_handler to rebuild free list */
+	flow_foreach_slot(flow) {
+		int rc;
+
+		if (flow->f.state == FLOW_STATE_FREE) {
+			unsigned skip = flow->free.n;
+
+			/* First entry of a free cluster must have n >= 1 */
+			ASSERT(skip);
+
+			if (free_head) {
+				/* Merge into preceding free cluster */
+				free_head->n += flow->free.n;
+				flow->free.n = flow->free.next = 0;
+			} else {
+				/* New free cluster, add to chain */
+				free_head = &flow->free;
+				*last_next = FLOW_IDX(flow);
+				last_next = &free_head->next;
+			}
+
+			/* Skip remaining empty entries */
+			flow += skip - 1;
+			continue;
+		}
+
+		if (flow->f.state == FLOW_STATE_ACTIVE) {
+			free_head = NULL;
+			continue;
+		}
+
+		ASSERT(flow->f.state == FLOW_STATE_MIGRATING);
+
+		rc = 0;
+		switch (flow->f.type) {
+		default:
+			/* Bug.  We marked a flow as migrating, but we don't
+			 * know how to resume it */
+			ASSERT(0);
+		}
+
+		if (rc == 0) {
+			/* Successfully resumed flow */
+			flow_set_state(&flow->f, FLOW_STATE_ACTIVE);
+			free_head = NULL;
+			continue;
+		}
+
+		flow_err(flow, "Failed to resume flow: %s",
+			 strerror_(-rc));
+
+		flow_set_state(&flow->f, FLOW_STATE_FREE);
+		memset(flow, 0, sizeof(*flow));
+
+		if (free_head) {
+			/* Add slot to current free cluster */
+			ASSERT(FLOW_IDX(flow) ==
+			       FLOW_IDX(free_head) + free_head->n);
+			free_head->n++;
+			flow->free.n = flow->free.next = 0;
+		} else {
+			/* Create new free cluster */
+			free_head = &flow->free;
+			free_head->n = 1;
+			*last_next = FLOW_IDX(flow);
+			last_next = &free_head->next;
+		}
+	}
+
+	*last_next = FLOW_MAX;
+
+	return 0;
+}
+
+/**
+ * flow_migrate_source() - Transfer migrating flows to device state stream
+ * @c:		Execution context
+ * @stage:	Migration stage information, unused
+ * @fd:		Migration file descriptor
+ *
+ * Return: 0 on success, positive error code on failure
+ */
+int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage, int fd)
+{
+	(void)c;
+	(void)stage;
+	(void)fd;
+
+	/* FIXME: todo */
+	return ENOTSUP;
+}
+
+/**
+ * flow_migrate_target() - Build flows from device state stream
+ * @c:		Execution context
+ * @stage:	Migration stage information, unused
+ * @fd:		Migration file descriptor
+ *
+ * Return: 0 on success, positive error code on failure
+ */
+int flow_migrate_target(struct ctx *c, const struct migrate_stage *stage, int fd)
+{
+	(void)c;
+	(void)stage;
+	(void)fd;
+
+	/* FIXME: todo */
+	return ENOTSUP;
+}
+
 /**
  * flow_init() - Initialise flow related data structures
  */
diff --git a/flow.h b/flow.h
index 24ba3ef0..deb70eb1 100644
--- a/flow.h
+++ b/flow.h
@@ -82,6 +82,10 @@
  *	'true' from flow type specific deferred or timer handler
  * Caveats:
  *    - flow_alloc_cancel() may not be called on it
+ *
+ * MIGRATING - A flow in the process of live migration
+ * Caveats:
+ *    - Must only exist during live migration process
  */
 enum flow_state {
 	FLOW_STATE_FREE,
@@ -90,6 +94,7 @@ enum flow_state {
 	FLOW_STATE_TGT,
 	FLOW_STATE_TYPED,
 	FLOW_STATE_ACTIVE,
+	FLOW_STATE_MIGRATING,
 
 	FLOW_NUM_STATES,
 };
diff --git a/flow_table.h b/flow_table.h
index 6ff6d0f6..5ccf6644 100644
--- a/flow_table.h
+++ b/flow_table.h
@@ -89,11 +89,28 @@ static inline unsigned flow_idx(const struct flow_common *f)
 	flow_foreach_slot(flow_)					\
 		if ((flow_)->f.state == FLOW_STATE_FREE) {		\
 			(flow_) += (flow_)->free.n - 1;			\
+		} else if ((flow_)->f.state == FLOW_STATE_MIGRATING) {	\
+			continue;					\
 		} else if ((flow_)->f.state != FLOW_STATE_ACTIVE) {	\
 			flow_err((flow_), "BUG: Traversing non-active flow"); \
 			continue;					\
 		} else
 
+/**
+ * flow_foreach_migrating() - step through migrating flows
+ * @flow_:	Points to each migrating flow in order
+ */
+#define flow_foreach_migrating(flow_)					\
+	flow_foreach_slot(flow_)					\
+		if ((flow_)->f.state == FLOW_STATE_FREE) {		\
+			(flow_) += (flow_)->free.n - 1;			\
+		} else if ((flow_)->f.state == FLOW_STATE_ACTIVE) {	\
+			continue;					\
+		} else if ((flow_)->f.state != FLOW_STATE_MIGRATING) {	\
+			flow_err((flow_), "BUG: Traversing non-migrating flow"); \
+			continue;					\
+		} else
+
 /**
  * flow_foreach_of_type() - step through active flows of one type
  * @flow_:	Points to each active flow in order
@@ -209,4 +226,9 @@ void flow_activate(struct flow_common *f);
 #define FLOW_ACTIVATE(flow_)						\
 	(flow_activate(&(flow_)->f))
 
+int flow_freeze(struct ctx *c, const struct migrate_stage *stage, int fd);
+int flow_thaw(struct ctx *c, const struct migrate_stage *stage, int fd);
+int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage, int fd);
+int flow_migrate_target(struct ctx *c, const struct migrate_stage *stage, int fd);
+
 #endif /* FLOW_TABLE_H */
diff --git a/migrate.c b/migrate.c
index 1c590160..d802e2f9 100644
--- a/migrate.c
+++ b/migrate.c
@@ -98,11 +98,27 @@ static int seen_addrs_target_v1(struct ctx *c,
 
 /* Stages for version 1 */
 static const struct migrate_stage stages_v1[] = {
+	{
+		.name = "freeze flows",
+		.source = flow_freeze,
+		.rollback = flow_thaw,
+		.target = NULL,
+	},
 	{
 		.name = "observed addresses",
 		.source = seen_addrs_source_v1,
 		.target = seen_addrs_target_v1,
 	},
+	{
+		.name = "transfer flows",
+		.source = flow_migrate_source,
+		.target = flow_migrate_target,
+	},
+	{
+		.name = "thaw flows",
+		.source = NULL,
+		.target = flow_thaw,
+	},
 	{ 0 },
 };
 
@@ -145,14 +161,23 @@ static int migrate_source(struct ctx *c, int fd)
 
 		debug("Source side migration stage: %s", s->name);
 
-		if ((ret = s->source(c, s, fd))) {
-			err("Source migration stage: %s: %s, abort", s->name,
-			    strerror_(ret));
-			return ret;
-		}
+		if ((ret = s->source(c, s, fd)))
+			goto rollback;
 	}
 
 	return 0;
+
+rollback:
+	err("Source migration stage: %s: %s, aborting", s->name,
+	    strerror_(ret));
+
+	while (s-- > v->s) {
+		debug("Rolling back migration: %s", s->name);
+		if (s->rollback && s->rollback(c, s, fd))
+			die("Unable to roll back migration");
+	}
+
+	return ret;
 }
 
 /**
@@ -214,9 +239,9 @@ static int migrate_target(struct ctx *c, int fd)
 		debug("Target side migration stage: %s", s->name);
 
 		if ((ret = s->target(c, s, fd))) {
-			err("Target migration stage: %s: %s, abort", s->name,
+			/* FIXME: Less brutal failure */
+			die("Target migration stage: %s: %s, abort", s->name,
 			    strerror_(ret));
-			return ret;
 		}
 	}
 
diff --git a/migrate.h b/migrate.h
index 2c51cd97..eca65226 100644
--- a/migrate.h
+++ b/migrate.h
@@ -23,11 +23,13 @@ struct migrate_header {
  * struct migrate_stage - Callbacks and parameters for one stage of migration
  * @name:	Stage name (for debugging)
  * @source:	Callback to implement this stage on the source
+ * @rollback:	Callback to roll back this stage on the source
  * @target:	Callback to implement this stage on the target
  */
 struct migrate_stage {
 	const char *name;
 	int (*source)(struct ctx *c, const struct migrate_stage *stage, int fd);
+	int (*rollback)(struct ctx *c, const struct migrate_stage *stage, int fd);
 	int (*target)(struct ctx *c, const struct migrate_stage *stage, int fd);
 
 	/* Add here separate rollback callbacks if needed */
-- 
2.48.1