Amongst other things, I spotted some additional complications using
tcp_rst() in the migration path (some of which might also have
implications in other contexts).  These might be things we can safely
ignore, at least for now, but I haven't thought through them enough to
be sure.

1) Sending RST to guest during migration

The first issue is that tcp_rst() will send an actual RST to the guest
on the tap interface.  During migration, that means we're sending to
the guest while it's suspended.  At the very least that means we
probably have a much higher than usual chance of getting a queue full
failure writing to the tap interface, which could hit problem (2).

But, beyond that, with vhost-user that means we're writing to guest
memory while the guest is suspended.  Kind of the whole point of the
suspended time is that the guest memory doesn't change during it, so
I'm not sure what the consequences will be.

Now, at the moment I think all our tcp_rst() calls are either on the
source during rollback (i.e. we're committed to resuming only on the
source) or on the target past the point of no return (i.e. we're
committed to resuming only on the target).  I suspect that means we can
get away with it, but I do worry this could break something in qemu by
violating that assumption.

2) tcp_rst() failures

tcp_rst() can fail if tcp_send_flag() fails.  In this case we *don't*
change the events to CLOSED.  I _think_ that's a bug: even if we
weren't able to send the RST to the guest, we've already closed the
socket so the flow is dead.  Moving to CLOSED state (and then removing
the flow entirely) should mean that we'll resend an RST if the guest
attempts to use the flow again later.

But I was worried there might be some subtle reason for not changing
the event state in that case.

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
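
For illustration only, the behaviour suggested in (2) could look roughly like the sketch below. Everything in it (struct demo_conn, demo_send_rst(), demo_rst(), the DEMO_* values and the demo_tap_full switch) is a made-up stand-in, not passt's actual code or types. The only point it tries to show is the ordering: the flow is marked closed whether or not the RST reached the guest, and the send failure is still reported to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins: NOT passt's real types or values. */
enum demo_events { DEMO_CLOSED = 0, DEMO_ESTABLISHED = 1 };

struct demo_conn {
	int sock;                 /* host-side socket, -1 once closed */
	enum demo_events events;  /* connection event state */
};

/* Set to true to simulate a full tap queue (assumed failure mode). */
static bool demo_tap_full;

/* Stand-in for sending the RST to the guest: 0 on success, -1 on failure. */
static int demo_send_rst(const struct demo_conn *c)
{
	(void)c;
	return demo_tap_full ? -1 : 0;
}

/* Reset a flow: mark it closed even when the RST could not be sent. */
static int demo_rst(struct demo_conn *c)
{
	int rc;

	assert(c != NULL);

	c->sock = -1;              /* stands in for close(): socket is gone */
	rc = demo_send_rst(c);     /* best effort, may fail */
	c->events = DEMO_CLOSED;   /* flow is dead regardless of rc */

	return rc;
}
```

With demo_tap_full set, demo_rst() returns -1 but the connection still ends up in DEMO_CLOSED, so in this model a later packet from the guest on that flow would be treated as belonging to no connection. Whether the real code can safely do the same depends on the open question above about why the event state is left unchanged on failure.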