From mboxrd@z Thu Jan  1 00:00:00 1970
Authentication-Results: passt.top; dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: passt.top;
	dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=AjC7F1BE;
	dkim-atps=neutral
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	by passt.top (Postfix) with ESMTPS id 739505A061E
	for <passt-dev@passt.top>; Fri, 31 Jan 2025 06:37:07 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1738301826;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=7P4o4igpyOkPoV3HYUh227x99By3bAgBJM9J/qZpV10=;
	b=AjC7F1BEt+Zw5LprdqHjk4DXqH4S/OgfTy9p7aawDp1wvgKTA/jiK+JM1JYp/8W5pkHFr7
	oYCIthSSQ43eiLz/6Xr7OXF+Ns4gBtjxsNarejduSyjay8tapDHA9P4Gc8HiWAMSaLkWA3
	tRyR0C9XSVSg4rOB42/u3h/fpOLwKBg=
Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com
 [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id
 us-mta-694-pI0sIrxaPkaEONytDSKC2Q-1; Fri, 31 Jan 2025 00:37:04 -0500
X-MC-Unique: pI0sIrxaPkaEONytDSKC2Q-1
X-Mimecast-MFC-AGG-ID: pI0sIrxaPkaEONytDSKC2Q
Received: by mail-wm1-f69.google.com with SMTP id 5b1f17b1804b1-43651b1ba8aso11291025e9.1
        for <passt-dev@passt.top>; Thu, 30 Jan 2025 21:37:03 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1738301823; x=1738906623;
        h=content-transfer-encoding:mime-version:organization:references
         :in-reply-to:message-id:subject:cc:to:from:date:x-gm-message-state
         :from:to:cc:subject:date:message-id:reply-to;
        bh=7P4o4igpyOkPoV3HYUh227x99By3bAgBJM9J/qZpV10=;
        b=U2q/hPtiQ/Ly0IqUPhW4t6Wz/C7hViOBCQOahkG3BC0tXwBfuSfP5rW/DO7Xpt/asM
         LVFoXaxkEtj2R8glduhGDs6jqYMDeT3kSFUfI6et9h2CcFmvQER6GXOAUG8CBfLMZgLb
         iEHYclNq8FEoVIpJhzjmFSlwGgu/3fJxubLsZUQjWog1+9kuMrkzTNidDfewlnXKsgWs
         ov7nXQrwqhuSiot86rCJQZJV7GGHbBqf5EttsmBwrrzA7jYhkCFw+ck6bXDGmoCClceL
         bfdRKl5dZPpIgjY2XVdtx8Oj6tw9rE9w7DYx+tSzjBp5pWZfYra+FrCd/5vKyjqPVJ4S
         Qntg==
X-Gm-Message-State: AOJu0YzjN/C3JoUdTLVtm1SIMpxjV0bZ/RM/dEL4Zis5V4bD/+wgdzhX
	2BsRsy1HXTYu2XzE6VC0nEDBw88S/jkOmIlnieT52uTlI79WVvoehDRpkf668uJyHbAMiiv2wv0
	JI7pUtUpu9iP9YTLP84Ek7YsbCkA4On0bsM+UsgtIcmfyo7RMvg==
X-Gm-Gg: ASbGncvIsAszXC7QWjZHGUQnBBKRR4CvStwt3p8R9VpBUyT04JrunwdKefPFeRFSnnB
	5PMwC6eyzb+rsQX1iY5Kq6nQBv/Yzv/n/ZCCQyVYrSN6vem3iZdalSoWg0VykPG2Cz94hOfJGK9
	1t43MEJNAskpAJa/ECE5zWxTZplXN4wK/GNeAvHk3W49kr4rwGN4oGPhBRwvG4Hz9t1X2FvbnY/
	4hugbt3XUcb27VY4mc6zBW3ofxJZw5ZH3z+GvEeq8yU9Z+MCxMBoH5VHJQwRJQGUVqCJ2w7amTB
	/oalmuxttqsAdZa2WqRBYJcAY+mbCuln+w==
X-Received: by 2002:a05:600c:470c:b0:434:fddf:5bfa with SMTP id 5b1f17b1804b1-438dc3bb395mr90931935e9.2.1738301822660;
        Thu, 30 Jan 2025 21:37:02 -0800 (PST)
X-Google-Smtp-Source: AGHT+IE+877ba3UdqIbLcMz2KOS5MX5T04u4OGqhmiqIYjlqRWbfaJpyQgX0Do02IItxz8FmmrkqcA==
X-Received: by 2002:a05:600c:470c:b0:434:fddf:5bfa with SMTP id 5b1f17b1804b1-438dc3bb395mr90931515e9.2.1738301821326;
        Thu, 30 Jan 2025 21:37:01 -0800 (PST)
Received: from maya.myfinge.rs (ifcgrfdd.trafficplex.cloud. [176.103.220.4])
        by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-38c5c1b59f6sm3784351f8f.69.2025.01.30.21.36.58
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Thu, 30 Jan 2025 21:37:00 -0800 (PST)
Date: Fri, 31 Jan 2025 06:36:55 +0100
From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH 6/7] Introduce facilities for guest migration on top of
 vhost-user infrastructure
Message-ID: <20250131063655.41a5861b@elisabeth>
In-Reply-To: <20250130093236.117c3fd0@elisabeth>
References: <20250127231532.672363-1-sbrivio@redhat.com>
	<20250127231532.672363-7-sbrivio@redhat.com>
	<Z5g1fMgTmEUKBo_e@zatzit>
	<20250128075001.3557d398@elisabeth>
	<Z5mBik4kbm9GLjRG@zatzit>
	<20250129083350.220a7ab0@elisabeth>
	<Z5rMU0dVWJWSZ_ta@zatzit>
	<20250130055522.39acb265@elisabeth>
	<Z5ssbg6ID_Tqx6Eq@zatzit>
	<20250130093236.117c3fd0@elisabeth>
Organization: Red Hat
X-Mailer: Claws Mail 4.2.0 (GTK 3.24.41; x86_64-pc-linux-gnu)
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-MFC-PROC-ID: t5ATUheR9z3GlP3o9CwbYWjPnT-dSbCWU5mBUl6Hppc_1738301823
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-ID-Hash: GZWFD43U6DWYOEYPHJYWI37FKIGUP64L
X-Message-ID-Hash: GZWFD43U6DWYOEYPHJYWI37FKIGUP64L
X-MailFrom: sbrivio@redhat.com
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: passt-dev@passt.top, Laurent Vivier <lvivier@redhat.com>
X-Mailman-Version: 3.3.8
Precedence: list
List-Id: Development discussion and patches for passt <passt-dev.passt.top>
Archived-At: <https://archives.passt.top/passt-dev/20250131063655.41a5861b@elisabeth/>
Archived-At: <https://passt.top/hyperkitty/list/passt-dev@passt.top/message/GZWFD43U6DWYOEYPHJYWI37FKIGUP64L/>
List-Archive: <https://archives.passt.top/passt-dev/>
List-Archive: <https://passt.top/hyperkitty/list/passt-dev@passt.top/>
List-Help: <mailto:passt-dev-request@passt.top?subject=help>
List-Owner: <mailto:passt-dev-owner@passt.top>
List-Post: <mailto:passt-dev@passt.top>
List-Subscribe: <mailto:passt-dev-join@passt.top>
List-Unsubscribe: <mailto:passt-dev-leave@passt.top>

On Thu, 30 Jan 2025 09:32:36 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:

> I would like to quickly complete the whole flow first, because I think
> we can inform design and implementation decisions much better at that
> point

So, there seems to be a problem with (testing?) this. I haven't quite
figured out the root cause yet, and it doesn't happen with the reference
source.c and target.c implementations I shared.

Let's assume I have a connection in the source guest to 127.0.0.1:9091,
from 127.0.0.1:56350. After the migration, in the target, I get:

---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 79
setsockopt(79, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(79, {sa_family=AF_INET, sin_port=htons(56350), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
sendmsg(72, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[79]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(72, "\1", 1, 0, NULL, NULL)    = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [1788468535], 4) = 0
write(2, "77.6923: ", 977.6923: )                = 9
write(2, "Set send queue sequence for sock"..., 51Set send queue sequence for socket 79 to 1788468535) = 51
write(2, "\n", 1
)                       = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [115288604], 4) = 0
write(2, "77.6924: ", 977.6924: )                = 9
write(2, "Set receive queue sequence for s"..., 53Set receive queue sequence for socket 79 to 115288604) = 53
write(2, "\n", 1
)                       = 1
connect(79, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
---
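
For reference, the sequence in the trace above roughly corresponds to
this minimal sketch. It's not the actual passt code: repair_connect()
and tcp_opt() are made-up names. It also sets TCP_REPAIR directly,
which needs CAP_NET_ADMIN, whereas in the trace the socket is passed
over a UNIX domain socket (the sendmsg() / recvfrom() pair), presumably
so that a privileged helper can switch repair mode on. Switching
TCP_REPAIR off right after connect() is an assumption of mine, too:

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Option values from linux/tcp.h, in case the libc headers lack them */
#ifndef TCP_REPAIR
#define TCP_REPAIR		19
#endif
#ifndef TCP_REPAIR_QUEUE
#define TCP_REPAIR_QUEUE	20
#endif
#ifndef TCP_QUEUE_SEQ
#define TCP_QUEUE_SEQ		21
#endif

/* Queue selectors, matching the [1] and [2] arguments in the trace */
#define REPAIR_RECV_QUEUE	1
#define REPAIR_SEND_QUEUE	2

/* Hypothetical helper: set a 32-bit TCP option, return 0 or -errno */
static int tcp_opt(int s, int opt, uint32_t v)
{
	return setsockopt(s, IPPROTO_TCP, opt, &v, sizeof(v)) ? -errno : 0;
}

/* Hypothetical sketch of the target side: bind to the original source
 * port, restore sequence numbers in repair mode, then connect().
 * Returns the connected socket, or a negative error code.
 */
int repair_connect(in_port_t sport, const char *daddr, in_port_t dport,
		   uint32_t snd_seq, uint32_t rcv_seq)
{
	struct sockaddr_in src = { .sin_family = AF_INET,
				   .sin_port = htons(sport),
				   .sin_addr = { .s_addr = htonl(INADDR_ANY) } };
	struct sockaddr_in dst = { .sin_family = AF_INET,
				   .sin_port = htons(dport) };
	int s, one = 1, ret;

	if (inet_pton(AF_INET, daddr, &dst.sin_addr) != 1)
		return -EINVAL;

	if ((s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0)
		return -errno;

	if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) ||
	    bind(s, (struct sockaddr *)&src, sizeof(src))) {
		ret = -errno;
		goto fail;
	}

	/* Needs CAP_NET_ADMIN: fails with EPERM otherwise */
	if ((ret = tcp_opt(s, TCP_REPAIR, 1))				||
	    (ret = tcp_opt(s, TCP_REPAIR_QUEUE, REPAIR_SEND_QUEUE))	||
	    (ret = tcp_opt(s, TCP_QUEUE_SEQ, snd_seq))			||
	    (ret = tcp_opt(s, TCP_REPAIR_QUEUE, REPAIR_RECV_QUEUE))	||
	    (ret = tcp_opt(s, TCP_QUEUE_SEQ, rcv_seq)))
		goto fail;

	/* In repair mode, connect() sets up the connection without a SYN */
	if (connect(s, (struct sockaddr *)&dst, sizeof(dst))) {
		ret = -errno;
		goto fail;
	}

	if ((ret = tcp_opt(s, TCP_REPAIR, 0)))
		goto fail;

	return s;

fail:
	close(s);
	return ret;
}
```

With the values from the trace, that would be something like
repair_connect(56350, "127.0.0.1", 9091, 1788468535, 115288604).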

According to the documentation, which seems to be consistent with a
glance at the implementation (so I must be missing something in the
kernel), connect() should return EADDRNOTAVAIL if:

       EADDRNOTAVAIL
              (Internet domain sockets) The socket referred to by
              sockfd had not previously been bound to an address
              and, upon attempting to bind it to an ephemeral
              port, it was determined that all port numbers in the
              ephemeral port range are currently in use. See the
              discussion of /proc/sys/net/ipv4/ip_local_port_range
              in ip(7).

but well, of course it was bound.

To a port only, admittedly, not to a full address: the address is the
wildcard one (0.0.0.0). But I think that, for the purposes of this
description, that bind() call is enough.

Is this related to SO_REUSEADDR? I need it (on both source and target)
because, at least in my tests, source and target are on the same
machine, in the same namespace. If I drop it:

---
bind(79, {sa_family=AF_INET, sin_port=htons(46280), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
---

as expected.
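
To check that part in isolation, here's a minimal sketch (bind_twice()
is a made-up name, nothing from passt): two TCP sockets bound to the
same wildcard address and port, with SO_REUSEADDR either on both or on
neither. On Linux, the second bind() should fail with EADDRINUSE
without the option, and succeed with it, as long as neither socket is
listening:

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind two TCP sockets to the same wildcard
 * address and port, setting SO_REUSEADDR on both if 'reuse' is set.
 * Returns 0 if the second bind() succeeds, the errno value otherwise.
 */
int bind_twice(int reuse)
{
	struct sockaddr_in a = { .sin_family = AF_INET,
				 .sin_addr = { .s_addr = htonl(INADDR_ANY) } };
	socklen_t sl = sizeof(a);
	int s1, s2, ret = 0;

	s1 = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	s2 = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (s1 < 0 || s2 < 0) {
		ret = errno;
		goto out;
	}

	if (reuse &&
	    (setsockopt(s1, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse)) ||
	     setsockopt(s2, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse)))) {
		ret = errno;
		goto out;
	}

	/* Port 0: let the kernel pick a free port, then read it back */
	if (bind(s1, (struct sockaddr *)&a, sizeof(a)) ||
	    getsockname(s1, (struct sockaddr *)&a, &sl)) {
		ret = errno;
		goto out;
	}

	/* Same address and port as the first socket */
	if (bind(s2, (struct sockaddr *)&a, sizeof(a)))
		ret = errno;

out:
	if (s1 >= 0)
		close(s1);
	if (s2 >= 0)
		close(s2);

	return ret;
}
```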

However, in my reference implementation, with a connection from
127.0.0.1:9998 to 127.0.0.1:9091, this is what the target does:

---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(9998), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
socket(AF_UNIX, SOCK_STREAM, 0)         = 4
unlink("/tmp/repair.sock")              = 0
bind(4, {sa_family=AF_UNIX, sun_path="/tmp/repair.sock"}, 110) = 0
listen(4, 1)                            = 0
accept(4, NULL, NULL)                   = 5
sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[3]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(5, "\1", 1, 0, NULL, NULL)     = 1
setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [1612504019], 4) = 0
setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [1756508956], 4) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
---
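
The sendmsg() call with SCM_RIGHTS in these traces is plain file
descriptor passing over a UNIX domain socket. As a minimal sketch of
that mechanism alone (send_fd(), recv_fd() and fd_pass_roundtrip() are
made-up names, and this is not the code from target.c): one byte of
payload carries the descriptor as ancillary data. On x86_64,
CMSG_LEN(sizeof(int)) is 20 and CMSG_SPACE(sizeof(int)) is 24, which
matches cmsg_len and msg_controllen in the trace:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer for one descriptor, aligned as struct cmsghdr */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send descriptor fd over UNIX domain socket s: 0 on success, -1 on failure */
int send_fd(int s, int fd)
{
	union fd_cmsg u;
	char byte = 1;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
			      .msg_control = u.buf,
			      .msg_controllen = sizeof(u.buf) };
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(s, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from socket s: the new descriptor, or -1 */
int recv_fd(int s)
{
	union fd_cmsg u;
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
			      .msg_control = u.buf,
			      .msg_controllen = sizeof(u.buf) };
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	if (recvmsg(s, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}

/* Pass the write end of a pipe over a socket pair, then check that the
 * received descriptor refers to the same pipe: 0 on success, -1 otherwise
 */
int fd_pass_roundtrip(void)
{
	int sv[2], p[2], fd, ret = -1;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return -1;

	if (pipe(p)) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	if (!send_fd(sv[0], p[1]) && (fd = recv_fd(sv[1])) >= 0) {
		if (write(fd, "x", 1) == 1 &&
		    read(p[0], &c, 1) == 1 && c == 'x')
			ret = 0;
		close(fd);
	}

	close(p[0]);
	close(p[1]);
	close(sv[0]);
	close(sv[1]);

	return ret;
}
```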

The only obvious difference is that, here, I'm not binding to an
ephemeral port: the source port (in both source and target "guests") is
9998.

Fine, so I tried forcing a lower port in passt (source) as well, and
this is what I get in the target now:

---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 79
setsockopt(79, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(79, {sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
sendmsg(72, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[79]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(72, "\1", 1, 0, NULL, NULL)    = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [-348109334], 4) = 0
write(2, "46.9751: ", 946.9751: )                = 9
write(2, "Set send queue sequence for sock"..., 51Set send queue sequence for socket 79 to 3946857962) = 51
write(2, "\n", 1
)                       = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [-1820322671], 4) = 0
write(2, "46.9752: ", 946.9752: )                = 9
write(2, "Set receive queue sequence for s"..., 54Set receive queue sequence for socket 79 to 2474644625) = 54
write(2, "\n", 1
)                       = 1
connect(79, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
---

No obvious difference. Next, I'll try binding to an explicit address,
but I have no idea why 1. we get EADDRNOTAVAIL after a bind() and 2. it
works with the reference implementation.
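
Side note on the values in this last trace: strace prints the
TCP_QUEUE_SEQ argument as a signed int (-348109334), while the log
message prints the same 32 bits as unsigned (3946857962), so those are
the same sequence number and not a further discrepancy. A quick way to
check, as a minimal sketch (seq_from_strace() is a made-up name):

```c
#include <stdint.h>

/* Convert a signed 32-bit value, as strace prints the setsockopt()
 * argument, to the unsigned sequence number it represents. Conversion
 * from int32_t to uint32_t is well defined (modulo 2^32).
 */
uint32_t seq_from_strace(int32_t v)
{
	return (uint32_t)v;
}
```

The same goes for the receive queue: -1820322671 is 2474644625.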

Yes, I explicitly close() the socket in the source passt now, but that
doesn't change things.

This is presumably just an issue with testing, because in real use
cases source and target guests would be on different machines. Another
idea could be separating the namespaces.

I can't just run source and target passt in two instances of pasta
--config-net, because pasta would run into the same issue, but I could
isolate one namespace with it, then add two network namespaces inside
that, and connect them with veth pairs.

-- 
Stefano