Continued investigating the problem with migration failing across a
bridge. Good news is I've found the problem... or at least one
problem. Bad news is we'll have to change the migration stream format
to fix it.

The packets are being dropped in tcp_validate_incoming() due to a
failed PAWS check (skb drop reason "TCP_RFC7323_PAWS"). That in turn
looks to be because we don't preserve TCP timestamp state across the
migration. We preserve _whether_ TCP timestamps are active on the
connection (TCPOPT_TIMESTAMP entry in TCP_REPAIR_OPTIONS), but we
don't preserve the current timestamp values (TCP_TIMESTAMP socket
option).

The equivalent CRIU code is
    https://github.com/checkpoint-restore/criu/blob/d18912fc88f3dc7bde5fdfa3575691977eb21753/soccr/soccr.c#L266
and
    https://github.com/checkpoint-restore/criu/blob/d18912fc88f3dc7bde5fdfa3575691977eb21753/soccr/soccr.c#L572

I'll work on writing a fix tomorrow.

Not yet sure why we didn't hit this with a local migration. I'm
guessing some part of being a local connection means we're bypassing
the PAWS check.

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
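
For illustration only, saving and restoring the current timestamp value
through the TCP_TIMESTAMP socket option might look roughly like the
sketch below. The helper names tcp_ts_save() and tcp_ts_restore() are
hypothetical, not taken from the existing code or from CRIU, and this
is not the actual fix. It assumes a Linux target and that the socket on
the target is already in TCP_REPAIR mode when the value is restored,
since the kernel is expected to refuse the setsockopt() with EPERM
otherwise.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Older libc headers may lack this; 24 is the value in Linux's uapi headers */
#ifndef TCP_TIMESTAMP
#define TCP_TIMESTAMP 24
#endif

/*
 * Hypothetical helper: on the migration source, read the socket's
 * current TCP timestamp value so it can be carried in the migration
 * stream.  Returns 0 on success, -1 with errno set on failure.
 */
static int tcp_ts_save(int s, uint32_t *ts)
{
	socklen_t len = sizeof(*ts);

	assert(ts);
	return getsockopt(s, IPPROTO_TCP, TCP_TIMESTAMP, ts, &len);
}

/*
 * Hypothetical helper: on the migration target, set the timestamp
 * value read on the source.  Assumption: the socket is already in
 * TCP_REPAIR mode; outside repair mode this is expected to fail.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int tcp_ts_restore(int s, uint32_t ts)
{
	return setsockopt(s, IPPROTO_TCP, TCP_TIMESTAMP, &ts, sizeof(ts));
}
```

The source side would call tcp_ts_save() and add the 32-bit value to
the per-connection migration data, and the target side would call
tcp_ts_restore() while the socket is still in repair mode. Carrying
that extra value is what would require the stream format change.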