Date: Thu, 27 Feb 2025 12:57:15 +1100
From: David Gibson
To: Stefano Brivio
Cc: passt-dev@passt.top
Subject: Re: tcp_rst() complications
In-Reply-To: <20250226111507.166ed938@elisabeth>

On Wed, Feb 26, 2025 at 11:15:07AM +0100, Stefano Brivio wrote:
> On Wed, 26 Feb 2025 20:05:25 +1100
> David Gibson wrote:
> 
> > Amongst other things, I spotted some additional complications using
> > tcp_rst() in the migration path (some of which might also have
> > implications in other contexts). These might be things we can safely
> > ignore, at least for now, but I haven't thought through them enough to
> > be sure.
> > 
> > 1) Sending RST to guest during migration
> > 
> > The first issue is that tcp_rst() will send an actual RST to the guest
> > on the tap interface. During migration, that means we're sending to
> > the guest while it's suspended. At the very least that means we
> > probably have a much higher than usual chance of getting a queue full
> > failure writing to the tap interface, which could hit problem (2).
> > 
> > But, beyond that, with vhost-user that means we're writing to guest
> > memory while the guest is suspended. Kind of the whole point of the
> > suspended time is that the guest memory doesn't change during it, so
> > I'm not sure what the consequences will be.
> 
> If I recall correctly I checked this and something in the vhost-user
> code will tell us that the queue is not ready yet, done.

Ah, yes. I'd seen those "Got packet, but RX virtqueue is not usable
yet" messages, but didn't make the connection. I had been concerned
that might only happen when we tried to kick the queue, after we'd
touched guest memory, but that doesn't appear to be the case. Ok, now
I'm reasonably convinced this is fine.

> Ideally we want to make sure we queue those, but queue sizes are finite
> and I don't think we can guarantee we can pile up 128k RST segments.

Well that, plus with vhost-user we don't really have anywhere we can
queue them other than in guest memory.

> Right now I would check that the functionality is not spectacularly
> broken (I looked into that briefly, QEMU didn't crash, guest kept
> running, but I didn't check further than that). If we miss a RST, too
> bad, things will time out eventually.
> 
> As a second step we could perhaps introduce a post-migration stage and
> move calling tcp_rst() to there if the connection is in a given state?
> 
> > Now, at the moment I
> > think all our tcp_rst() calls are either on the source during rollback
> > (i.e. we're committed to resuming only on the source) or on the target
> > past the point of no return (i.e. we're committed to resuming only on
> > the target). I suspect that means we can get away with it, but I do
> > worry this could break something in qemu by violating that assumption.
> > 
> > 2) tcp_rst() failures
> > 
> > tcp_rst() can fail if tcp_send_flag() fails. In this case we *don't*
> > change the events to CLOSED. I _think_ that's a bug: even if we
> > weren't able to send the RST to the guest, we've already closed the
> > socket so the flow is dead. Moving to CLOSED state (and then removing
> > the flow entirely) should mean that we'll resend an RST if the guest
> > attempts to use the flow again later.
> > 
> > But... I was worried there might be some subtle reason for not changing
> > the event state in that case.
> 
> Not very subtle: my original idea was that if we fail to send the RST,
> we should note that (by not moving to CLOSED) and try again from a
> timer.

Ok. Makes sense, but I don't think it's necessary. We can treat a
failure here as if the RST was lost on the wire: we just forget the
connection, and if the guest doesn't drop it we'll send it another RST
if it tries to send a packet to the connection we no longer know about.

> In practice I've never observed a tcp_send_flag() failure so I'm not
> sure if that mechanism even works.

I don't think the mechanism even exists. I don't think there's any
timer that does this, and more to the point, if we don't change events
we basically don't have any record that we already reset the connection.

> Moving the socket to CLOSED sounds
> totally okay to me, surely simpler, and probably more robust.

Ok. I'll go ahead with that.
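Concretely, the change I have in mind has roughly the shape below. It's
written as a standalone toy rather than a patch against tcp.c, with names
invented for the sketch: mark the flow CLOSED whether or not the RST
actually made it out, i.e. treat a send failure exactly like an RST lost
on the wire.

/* Toy sketch, not passt code: names invented for illustration */
#include <stdio.h>

enum conn_event { ESTABLISHED, CLOSED };

struct conn {
	enum conn_event events;
};

/* Stand-in for tcp_send_flag(): pretend the tap queue is full */
static int send_rst_to_tap(struct conn *conn)
{
	(void)conn;
	return -1;
}

static void conn_rst(struct conn *conn)
{
	if (send_rst_to_tap(conn))
		fprintf(stderr, "couldn't send RST, dropping flow anyway\n");

	/* The socket side is already gone, so forget the flow regardless.
	 * If the guest later sends a segment for it, the now-unknown flow
	 * will get a fresh RST at that point.
	 */
	conn->events = CLOSED;
}

int main(void)
{
	struct conn conn = { .events = ESTABLISHED };

	conn_rst(&conn);
	printf("flow closed: %s\n", conn.events == CLOSED ? "yes" : "no");
	return 0;
}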
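And for the record, on point (1), the behaviour I was relying on has
roughly the following shape: again a made-up standalone sketch, not the
real vhost-user code. If the RX virtqueue isn't marked usable yet, we bail
out before touching anything guest-owned, so a suspended guest's memory is
never written to.

/* Toy sketch, not passt code: names invented for illustration */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct vqueue {
	bool usable;		/* set once the guest has set the queue up */
	char ring[256];		/* stand-in for guest-owned ring memory */
};

/* Returns true if the frame was queued, false if it was dropped */
static bool vq_send(struct vqueue *vq, const char *frame)
{
	if (!vq->usable) {
		fprintf(stderr, "RX virtqueue not usable yet, dropping\n");
		return false;	/* bail out before any write to the ring */
	}

	strncpy(vq->ring, frame, sizeof(vq->ring) - 1);
	return true;
}

int main(void)
{
	struct vqueue vq = { .usable = false };

	/* Guest suspended for migration: the RST is simply dropped */
	vq_send(&vq, "RST segment");

	vq.usable = true;	/* guest resumed and set the queue up again */
	if (vq_send(&vq, "RST segment"))
		printf("queued after resume\n");

	return 0;
}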
-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson