From mboxrd@z Thu Jan  1 00:00:00 1970
From: Stefano Brivio
To: passt-user@passt.top
Subject: Re: qemu couldn't connect the unix domain socket
Date: Fri, 29 Oct 2021 13:52:31 +0200
Message-ID: <20211029135231.78c55afe@elisabeth>
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1711339715934055394=="

--===============1711339715934055394==
Content-Type: text/plain; charset="utf-8"

On Fri, 29 Oct 2021 19:02:15 +0800
Li Feng wrote:

> On Fri, Oct 29, 2021 at 5:34 PM Stefano Brivio wrote:
> >
> > On Fri, 29 Oct 2021 16:54:47 +0800
> > Li Feng wrote:
> >
> > > [...]
> > >
> > > Thanks for the detailed explanation.
> > > I finally found out that the `qrap` was the root cause.
> > > I patched the qemu, and it works well.
> >
> > Have you found out what was the offending syscall? I'll probably hit
> > this later too, but that would help me double checking what the problem
> > was.
>
> I made a mistake in the previous mail, using `./passt -f` works, but
> if running in background,
> it still exits without any output.
>
> This is the strace output.
>
> ```
> $ strace -f ./passt
> ...
> ...
> accept(6, NULL, NULL) = 7
> epoll_ctl(5, EPOLL_CTL_ADD, 7, {events=EPOLLIN|EPOLLRDHUP|EPOLLET,
> data={u32=7, u64=7}}) = 0
> getrandom("\x40\xfc\xc5\x4a\x29\x3e\xdb\xcd\x25\x92\xc6\xc3\xc7\xcb\x57\x5a",
> 16, GRND_RANDOM) = 16
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 8
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 9
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 10
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 11
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 12
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 13
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 14
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 15
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 16
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 17
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 18
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 19
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 20
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 21
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 22
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 23
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 24
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 25
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 26
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 27
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 28
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 29
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 30
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 31
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 32
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 33
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 34
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 35
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 36
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 37
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 38
> socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 39
> clone(child_stack=NULL,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process
> 172939 attached
> , child_tidptr=0x7f7b89b1da10) = 172939
> [pid 172921] exit_group(0
> [pid 172939] set_robust_list(0x7f7b89b1da20, 24
> [pid 172921] <... exit_group resumed>) = ?
> [pid 172921] +++ exited with 0 +++
> <... set_robust_list resumed>) = ?
> +++ killed by SIGSYS (core dumped) +++
> ```
> Which is the bad syscall?

Oh, it's set_robust_list(), it's normal that exit_group() doesn't
return. That new usage probably comes from:

	https://sourceware.org/git/?p=glibc.git;a=commit;h=9a7565403758f65c07fe3705e966381d9cfd35b6

and that code path is not really needed for passt, so I would have a
quick try at avoiding it rather than adding a syscall, perhaps with a
small replacement of daemon() using clone() instead of fork().

Meanwhile, this should work for you:

diff --git a/passt.c b/passt.c
index 6436a45..2a4ba8b 100644
--- a/passt.c
+++ b/passt.c
@@ -280,3 +280,3 @@ static void pid_file(struct ctx *c) {
  * #syscalls openat fstat fcntl lseek clone setsid exit_group getpid
- * #syscalls clock_gettime newfstatat
+ * #syscalls clock_gettime newfstatat set_robust_list
  * #syscalls:pasta rt_sigreturn

-- 
Stefano

--===============1711339715934055394==--