Date: Fri, 31 Jan 2025 06:36:55 +0100
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
CC: passt-dev@passt.top, Laurent Vivier <lvivier@redhat.com>
Subject: Re: [PATCH 6/7] Introduce facilities for guest migration on top of vhost-user infrastructure
Message-ID: <20250131063655.41a5861b@elisabeth>
In-Reply-To: <20250130093236.117c3fd0@elisabeth>
References: <20250127231532.672363-1-sbrivio@redhat.com>
 <20250127231532.672363-7-sbrivio@redhat.com>
 <Z5g1fMgTmEUKBo_e@zatzit>
 <20250128075001.3557d398@elisabeth>
 <Z5mBik4kbm9GLjRG@zatzit>
 <20250129083350.220a7ab0@elisabeth>
 <Z5rMU0dVWJWSZ_ta@zatzit>
 <20250130055522.39acb265@elisabeth>
 <Z5ssbg6ID_Tqx6Eq@zatzit>
 <20250130093236.117c3fd0@elisabeth>
Organization: Red Hat
List-Id: Development discussion and patches for passt <passt-dev.passt.top>
Archived-At: <https://archives.passt.top/passt-dev/20250131063655.41a5861b@elisabeth/>
Archived-At: <https://passt.top/hyperkitty/list/passt-dev@passt.top/message/GZWFD43U6DWYOEYPHJYWI37FKIGUP64L/>

On Thu, 30 Jan 2025 09:32:36 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:

> I would like to quickly complete the whole flow first, because I think
> we can inform design and implementation decisions much better at that
> point

So, there seems to be a problem with (testing?) this. I haven't
understood the root cause yet, and it doesn't happen with the reference
source.c and target.c implementations I shared.

Let's assume I have a connection in the source guest to 127.0.0.1:9091,
from 127.0.0.1:56350. After the migration, in the target, I get:

---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 79
setsockopt(79, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(79, {sa_family=AF_INET, sin_port=htons(56350), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
sendmsg(72, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[79]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(72, "\1", 1, 0, NULL, NULL) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [1788468535], 4) = 0
write(2, "77.6923: ", 977.6923: ) = 9
write(2, "Set send queue sequence for sock"..., 51Set send queue sequence for socket 79 to 1788468535) = 51
write(2, "\n", 1 ) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [115288604], 4) = 0
write(2, "77.6924: ", 977.6924: ) = 9
write(2, "Set receive queue sequence for s"..., 53Set receive queue sequence for socket 79 to 115288604) = 53
write(2, "\n", 1 ) = 1
connect(79, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
---

EADDRNOTAVAIL, according to the documentation, which seems to be
consistent with a glance at the implementation (that is, I must be
missing some issue in the kernel), should be returned on connect() if:

       EADDRNOTAVAIL
              (Internet domain sockets) The socket referred to by
              sockfd had not previously been bound to an address and,
              upon attempting to bind it to an ephemeral port, it was
              determined that all port numbers in the ephemeral port
              range are currently in use. See the discussion of
              /proc/sys/net/ipv4/ip_local_port_range in ip(7).

but well, of course it was bound. To a port only, indeed, not to a full
address: that is, to the "any" address (0.0.0.0) plus a port, but I
think for the purposes of this description that bind() call is enough.

Is this related to SO_REUSEADDR? I need it (on both source and target)
because, at least in my tests, source and target are on the same
machine, in the same namespace. If I drop it:

---
bind(79, {sa_family=AF_INET, sin_port=htons(46280), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
---

as expected.

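The bind() behaviour described above can be reproduced outside passt with a few lines of C. This is only a sketch, not passt code: bind_any() and bound_port() are hypothetical helper names, and the assumption is a Linux host, where two TCP sockets can share a local port as long as both set SO_REUSEADDR and neither is listening.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: create a TCP socket bound to 0.0.0.0:port,
 * optionally setting SO_REUSEADDR first, as in the traces above.
 * Returns the socket, or a negative errno value.
 */
int bind_any(uint16_t port, int reuse)
{
	struct sockaddr_in a = {
		.sin_family = AF_INET,
		.sin_port = htons(port),
		.sin_addr.s_addr = htonl(INADDR_ANY),
	};
	int one = 1, err, s;

	s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (s < 0)
		return -errno;

	if (reuse &&
	    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)))
		goto fail;

	if (bind(s, (struct sockaddr *)&a, sizeof(a)))
		goto fail;

	return s;

fail:
	err = errno;
	close(s);
	return -err;
}

/* Hypothetical helper: local port of a bound socket, or negative errno */
int bound_port(int s)
{
	struct sockaddr_in a;
	socklen_t len = sizeof(a);

	if (getsockname(s, (struct sockaddr *)&a, &len))
		return -errno;

	return ntohs(a.sin_port);
}
```

With a first socket from bind_any(0, 1), a second bind_any() on the port reported by bound_port() should succeed with reuse set, and fail with -EADDRINUSE without it, matching the two bind() results shown above.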
However, in my reference implementation, with a connection from
127.0.0.1:9998 to 127.0.0.1:9091, this is what the target does:

---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(9998), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
socket(AF_UNIX, SOCK_STREAM, 0) = 4
unlink("/tmp/repair.sock") = 0
bind(4, {sa_family=AF_UNIX, sun_path="/tmp/repair.sock"}, 110) = 0
listen(4, 1) = 0
accept(4, NULL, NULL) = 5
sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[3]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(5, "\1", 1, 0, NULL, NULL) = 1
setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [1612504019], 4) = 0
setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [1756508956], 4) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
---

The only obvious difference is that, here, I'm not binding to an
ephemeral port: the source port (in both source and target "guests") is
9998.

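For readers following the traces: the sendmsg() call passes the TCP socket over a UNIX domain socket as SCM_RIGHTS ancillary data, together with a one-byte payload. A minimal sketch of that exchange follows. It is not taken from passt, source.c or target.c: send_fd() and recv_fd() are hypothetical names, and the meaning of the "\1" byte is not stated in this message, so the sketch treats it as an opaque command byte.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper: send one command byte plus descriptor 'fd' as
 * SCM_RIGHTS ancillary data over UNIX domain socket 'conn'.
 * Returns 0, or a negative errno value.
 */
int send_fd(int conn, int fd, char cmd)
{
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* Forces suitable alignment */
	} u;
	struct iovec iov = { .iov_base = &cmd, .iov_len = 1 };
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	if (sendmsg(conn, &msg, 0) != 1)
		return -errno;

	return 0;
}

/* Hypothetical helper: receive one command byte and one descriptor.
 * Returns the new descriptor, or a negative errno value.
 */
int recv_fd(int conn, char *cmd)
{
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct iovec iov = { .iov_base = cmd, .iov_len = 1 };
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;
	ssize_t n;
	int fd;

	n = recvmsg(conn, &msg, 0);
	if (n < 0)
		return -errno;
	if (n != 1)
		return -EIO;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -EBADMSG;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

On x86_64, CMSG_LEN(sizeof(int)) is 20 and CMSG_SPACE(sizeof(int)) is 24, which matches cmsg_len=20 and msg_controllen=24 in the traces. The receiver ends up with its own descriptor referring to the same socket, so the socket stays open for as long as either side holds a descriptor for it.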
Fine, so I tried forcing a lower port in passt (source) as well, and
this is what I get in the target now:

---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 79
setsockopt(79, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(79, {sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
sendmsg(72, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[79]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(72, "\1", 1, 0, NULL, NULL) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [-348109334], 4) = 0
write(2, "46.9751: ", 946.9751: ) = 9
write(2, "Set send queue sequence for sock"..., 51Set send queue sequence for socket 79 to 3946857962) = 51
write(2, "\n", 1 ) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [-1820322671], 4) = 0
write(2, "46.9752: ", 946.9752: ) = 9
write(2, "Set receive queue sequence for s"..., 54Set receive queue sequence for socket 79 to 2474644625) = 54
write(2, "\n", 1 ) = 1
connect(79, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
---

no obvious difference. I'll try binding to an explicit address, next,
but I have no idea why 1. we get EADDRNOTAVAIL after a bind() and 2. it
works with the reference implementation.

Yes, I explicitly close() the socket in the source passt now, but that
doesn't change things.

This is presumably just an issue with testing, because in real use
cases source and target guests would be on different machines.

Another idea could be separating the namespaces. I can't just run
source and target passt in two instances of pasta --config-net, because
pasta would run into the same issue, but I could isolate one namespace
with it, then add two network namespaces inside that, and connect them
with veth pairs.

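One case that might be worth ruling out, offered as an assumption and not established anywhere in this thread as the cause: on Linux, connect() on an already bound socket also fails with EADDRNOTAVAIL if another socket in the same namespace still holds the same local address, local port, remote address, remote port combination. The sketch below reproduces that without TCP_REPAIR and without passt; listen_loopback() and connect_from() are hypothetical helper names.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: listen on 127.0.0.1, on a port chosen by the
 * kernel, which is stored in *port.
 * Returns the listening socket, or a negative errno value.
 */
int listen_loopback(uint16_t *port)
{
	struct sockaddr_in a = {
		.sin_family = AF_INET,
		.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
	};
	socklen_t len = sizeof(a);
	int err, s;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s < 0)
		return -errno;

	if (bind(s, (struct sockaddr *)&a, sizeof(a)) || listen(s, 8) ||
	    getsockname(s, (struct sockaddr *)&a, &len)) {
		err = errno;
		close(s);
		return -err;
	}

	*port = ntohs(a.sin_port);
	return s;
}

/* Hypothetical helper: bind to 0.0.0.0:*sport with SO_REUSEADDR, as in
 * the traces, then connect to 127.0.0.1:dport. If *sport is 0, the
 * kernel picks the port at bind() time and *sport is updated.
 * Returns the connected socket, or a negative errno value.
 */
int connect_from(uint16_t *sport, uint16_t dport)
{
	struct sockaddr_in src = {
		.sin_family = AF_INET,
		.sin_port = htons(*sport),
		.sin_addr.s_addr = htonl(INADDR_ANY),
	};
	struct sockaddr_in dst = {
		.sin_family = AF_INET,
		.sin_port = htons(dport),
		.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
	};
	socklen_t len = sizeof(src);
	int one = 1, err, s;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s < 0)
		return -errno;

	if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) ||
	    bind(s, (struct sockaddr *)&src, sizeof(src)) ||
	    getsockname(s, (struct sockaddr *)&src, &len) ||
	    connect(s, (struct sockaddr *)&dst, sizeof(dst))) {
		err = errno;
		close(s);
		return -err;
	}

	*sport = ntohs(src.sin_port);
	return s;
}
```

Calling connect_from() twice with the same source and destination ports, while the first socket is still open, should make the second call return -EADDRNOTAVAIL even though its bind() succeeded. Whether something similar applies to the migrated socket here, for instance because a descriptor for the source socket is still open somewhere in the same namespace, would need to be checked separately.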
-- 
Stefano