From: Stefano Brivio <sbrivio@redhat.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: passt-dev@passt.top, Laurent Vivier <lvivier@redhat.com>
Subject: Re: [PATCH 6/7] Introduce facilities for guest migration on top of vhost-user infrastructure
Date: Fri, 31 Jan 2025 06:36:55 +0100 [thread overview]
Message-ID: <20250131063655.41a5861b@elisabeth> (raw)
In-Reply-To: <20250130093236.117c3fd0@elisabeth>
On Thu, 30 Jan 2025 09:32:36 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:
> I would like to quickly complete the whole flow first, because I think
> we can inform design and implementation decisions much better at that
> point
So, there seems to be a problem with (testing?) this. I couldn't quite
understand the root cause yet, and it doesn't happen with the reference
source.c and target.c implementations I shared.
Let's assume I have a connection in the source guest to 127.0.0.1:9091,
from 127.0.0.1:56350. After the migration, in the target, I get:
---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 79
setsockopt(79, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(79, {sa_family=AF_INET, sin_port=htons(56350), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
sendmsg(72, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[79]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(72, "\1", 1, 0, NULL, NULL) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [1788468535], 4) = 0
write(2, "77.6923: ", 977.6923: ) = 9
write(2, "Set send queue sequence for sock"..., 51Set send queue sequence for socket 79 to 1788468535) = 51
write(2, "\n", 1
) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [115288604], 4) = 0
write(2, "77.6924: ", 977.6924: ) = 9
write(2, "Set receive queue sequence for s"..., 53Set receive queue sequence for socket 79 to 115288604) = 53
write(2, "\n", 1
) = 1
connect(79, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
---
EADDRNOTAVAIL, according to the documentation, which seems to be
consistent with a glance at the implementation (that is, I must be
missing some issue in the kernel), should be returned on connect() if:
EADDRNOTAVAIL
(Internet domain sockets) The socket referred to by
sockfd had not previously been bound to an address
and, upon attempting to bind it to an ephemeral
port, it was determined that all port numbers in the
ephemeral port range are currently in use. See the
discussion of /proc/sys/net/ipv4/ip_local_port_range
in ip(7).
but well, of course it was bound.
To a port, indeed, not a full address, that is, any (0.0.0.0) and
address port, but I think for the purposes of this description that
bind() call is enough.
Is this related to SO_REUSEADDR? I need it (on both source and target)
because, at least in my tests, source and target are on the same
machine, in the same namespace. If I drop it:
---
bind(79, {sa_family=AF_INET, sin_port=htons(46280), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
---
as expected.
However, in my reference implementation, with a connection from
127.0.0.1:9998 to 127.0.0.1:9091, this is what the target does:
---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(9998), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
socket(AF_UNIX, SOCK_STREAM, 0) = 4
unlink("/tmp/repair.sock") = 0
bind(4, {sa_family=AF_UNIX, sun_path="/tmp/repair.sock"}, 110) = 0
listen(4, 1) = 0
accept(4, NULL, NULL) = 5
sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[3]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(5, "\1", 1, 0, NULL, NULL) = 1
setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [1612504019], 4) = 0
setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [1756508956], 4) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
---
The only obvious difference is that, here, I'm not binding to an
ephemeral port: the source port (in both source and target "guests") is
9998.
Fine, so I tried forcing a lower port in passt (source) as well, and
this is what I get in the target now:
---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 79
setsockopt(79, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(79, {sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
sendmsg(72, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[79]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(72, "\1", 1, 0, NULL, NULL) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [-348109334], 4) = 0
write(2, "46.9751: ", 946.9751: ) = 9
write(2, "Set send queue sequence for sock"..., 51Set send queue sequence for socket 79 to 3946857962) = 51
write(2, "\n", 1
) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [-1820322671], 4) = 0
write(2, "46.9752: ", 946.9752: ) = 9
write(2, "Set receive queue sequence for s"..., 54Set receive queue sequence for socket 79 to 2474644625) = 54
write(2, "\n", 1
) = 1
connect(79, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
---
no obvious difference. I'll try binding to an explicit address, next,
but I have no idea why 1. we get EADDRNOTAVAIL after a bind() and 2. it
works with the reference implementation.
Yes, I explicitly close() the socket in the source passt now, but that
doesn't change things.
This is presumably just an issue with testing, because in real use
cases source and target guests would be on different machines. Another
idea could be separating the namespaces.
I can't just run source and target passt in two instances of pasta
--config-net, because pasta would run into the same issue, but I could
isolate one namespace with it, then add two network namespaces inside
that, and connect them with veth pairs.
--
Stefano
next prev parent reply other threads:[~2025-01-31 5:37 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-27 23:15 [PATCH 0/7] Draft, incomplete series introducing state migration Stefano Brivio
2025-01-27 23:15 ` [PATCH 1/7] icmp, udp: Pad time_t timestamp to 64-bit to ease " Stefano Brivio
2025-01-28 0:49 ` David Gibson
2025-01-28 6:48 ` Stefano Brivio
2025-01-27 23:15 ` [PATCH 2/7] flow, flow_table: Pad flow table entries to 128 bytes, hash entries to 32 bits Stefano Brivio
2025-01-28 0:50 ` David Gibson
2025-01-27 23:15 ` [PATCH 3/7] tcp_conn: Avoid 7-bit hole in struct tcp_splice_conn Stefano Brivio
2025-01-28 0:53 ` David Gibson
2025-01-28 6:48 ` Stefano Brivio
2025-01-29 1:02 ` David Gibson
2025-01-29 7:33 ` Stefano Brivio
2025-01-30 0:44 ` David Gibson
2025-01-30 4:55 ` Stefano Brivio
2025-01-30 7:27 ` David Gibson
2025-01-27 23:15 ` [PATCH 4/7] flow_table: Use size in extern declaration for flowtab Stefano Brivio
2025-01-27 23:15 ` [PATCH 5/7] util: Add read_remainder() and read_all_buf() Stefano Brivio
2025-01-28 0:59 ` David Gibson
2025-01-28 6:48 ` Stefano Brivio
2025-01-29 1:03 ` David Gibson
2025-01-29 7:33 ` Stefano Brivio
2025-01-30 0:44 ` David Gibson
2025-01-27 23:15 ` [PATCH 6/7] Introduce facilities for guest migration on top of vhost-user infrastructure Stefano Brivio
2025-01-28 1:40 ` David Gibson
2025-01-28 6:50 ` Stefano Brivio
2025-01-29 1:16 ` David Gibson
2025-01-29 7:33 ` Stefano Brivio
2025-01-30 0:48 ` David Gibson
2025-01-30 4:55 ` Stefano Brivio
2025-01-30 7:38 ` David Gibson
2025-01-30 8:32 ` Stefano Brivio
2025-01-30 8:54 ` David Gibson
2025-01-31 5:46 ` Stefano Brivio
2025-01-31 6:32 ` David Gibson
2025-01-31 9:09 ` Stefano Brivio
2025-02-03 0:47 ` David Gibson
2025-01-31 5:36 ` Stefano Brivio [this message]
2025-01-31 6:14 ` David Gibson
2025-01-31 9:09 ` Stefano Brivio
2025-02-03 0:46 ` David Gibson
2025-02-03 6:09 ` Stefano Brivio
2025-02-03 9:06 ` David Gibson
2025-02-03 9:45 ` Stefano Brivio
2025-02-03 9:57 ` David Gibson
2025-01-27 23:15 ` [PATCH 7/7] Introduce passt-repair Stefano Brivio
2025-01-27 23:31 ` Stefano Brivio
2025-01-28 1:51 ` David Gibson
2025-01-28 6:51 ` Stefano Brivio
2025-01-29 1:29 ` David Gibson
2025-01-29 7:04 ` Stefano Brivio
2025-01-30 0:53 ` David Gibson
2025-01-30 4:55 ` Stefano Brivio
2025-01-30 7:43 ` David Gibson
2025-01-30 7:56 ` Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250131063655.41a5861b@elisabeth \
--to=sbrivio@redhat.com \
--cc=david@gibson.dropbear.id.au \
--cc=lvivier@redhat.com \
--cc=passt-dev@passt.top \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).