From: David Gibson
To: Stefano Brivio, passt-dev@passt.top
CC: David Gibson
Subject: [PATCH v4 0/5] Improve robustness of migration
Date: Thu, 27 Feb 2025 16:55:12 +1100
Message-ID: <20250227055517.497347-1-david@gibson.dropbear.id.au>
List-Id: Development discussion and patches for passt

From Red Hat internal testing we've had some reports that if attempting
to migrate without passt-repair, the failure mode is uglier than we'd
like.  The migration fails, which is somewhat expected, but we don't
correctly roll things back on the source, so it breaks networking there
as well.

Address this and several other fragilities in the migration.

Everything was tested with the basic test suite, plus some additional
testing for the specific functionality of the patches:

For patches 1..2:
 * I get a clean migration if there are no active flows
 * Migration completes, although connections are broken, if
   passt-repair isn't connected

For patches 3..5:
 * Migration completes ok if the source and destination hosts have
   different IPs
 * Target correctly sees the bind() failure and abandons the flow
 * Unfortunately, the target-side client doesn't get a reset, it just
   sits there not working.  This is because a) the RST we try to send
   is lost because the queue is still inactive over the migration and
   b) we don't send RSTs or ICMPs for packets from the guest which no
   longer match a flow (I hope to tackle this soon)
 * After manually quitting the stuck client on the target, other
   connectivity works

There are more fragile cases that I'm looking to fix, particularly the
die()s in flow_migrate_source_rollback() and elsewhere.

David Gibson (5):
  migrate, flow: Trivially succeed if migrating with no flows
  migrate, flow: Don't attempt to migrate TCP flows without passt-repair
  tcp: Correct error code handling from tcp_flow_repair_socket()
  tcp: Unconditionally move to CLOSED state on tcp_rst()
  migrate, tcp: Don't flow_alloc_cancel() during incoming migration

 flow.c | 17 +++++++++++++++--
 tcp.c  | 28 +++++++++++++++++++++-------
 2 files changed, 36 insertions(+), 9 deletions(-)

-- 
2.48.1