* Race condition with pasta COMMAND... @ 2023-01-30 18:08 Paul Holzinger 2023-01-31 13:02 ` Stefano Brivio 0 siblings, 1 reply; 4+ messages in thread From: Paul Holzinger @ 2023-01-30 18:08 UTC (permalink / raw) To: passt-dev Hi all, while debugging some things I used `./pasta --config-net -- nslookup google.com 1.1.1.1` to test dns. The problem is that does not work because the nslookup process will be executed before pasta is ready with the netns setup, i.e. compare `./pasta --config-net -- ip a`. So a workaround is to spawn a shell and sleep: `sh -c "sleep 1; nslookup google.com 1.1.1.1"` However this is ugly and does not ensure that the netns is ready after one second. As a user I would expect pasta to wait until the setup is finished before it calls exec(). I can send a patch if you agree and I find some time. --- Paul ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Race condition with pasta COMMAND... 2023-01-30 18:08 Race condition with pasta COMMAND Paul Holzinger @ 2023-01-31 13:02 ` Stefano Brivio 2023-01-31 18:00 ` Paul Holzinger 0 siblings, 1 reply; 4+ messages in thread From: Stefano Brivio @ 2023-01-31 13:02 UTC (permalink / raw) To: Paul Holzinger; +Cc: passt-dev Hi Paul, On Mon, 30 Jan 2023 19:08:14 +0100 Paul Holzinger <pholzing@redhat.com> wrote: > Hi all, > > while debugging some things I used `./pasta --config-net -- nslookup > google.com 1.1.1.1` to test dns. > The problem is that does not work because the nslookup process will be > executed before pasta is > ready with the netns setup, i.e. compare `./pasta --config-net -- ip a`. Thanks for the report. I also hit this a couple of months ago but I couldn't find yet the time to deal with it: https://bugs.passt.top/show_bug.cgi?id=37 > So a workaround is to spawn a shell and sleep: `sh -c "sleep 1; nslookup > google.com 1.1.1.1"` > However this is ugly and does not ensure that the netns is ready after > one second. As a user > I would expect pasta to wait until the setup is finished before it calls > exec(). Absolutely, yes. As I mentioned on that ticket, I *think* that the only way to make sure the setup is actually complete is to query back via netlink addresses and routes we configured -- simply waiting until we successfully sent netlink messages isn't enough, because it takes a (substantial) while until addresses and routes are actually available. > I can send a patch if you agree and I find some time. That would be great, thanks in advance! If you get to it, I think you could reuse nl_route() and nl_addr() from netlink.c to perform the checks -- they might need to be extended a bit, I'm not sure. -- Stefano ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Race condition with pasta COMMAND... 2023-01-31 13:02 ` Stefano Brivio @ 2023-01-31 18:00 ` Paul Holzinger 2023-01-31 18:43 ` Stefano Brivio 0 siblings, 1 reply; 4+ messages in thread From: Paul Holzinger @ 2023-01-31 18:00 UTC (permalink / raw) To: Stefano Brivio; +Cc: passt-dev Hi Stefano, On 31/01/2023 14:02, Stefano Brivio wrote: > Hi Paul, > > On Mon, 30 Jan 2023 19:08:14 +0100 > Paul Holzinger <pholzing@redhat.com> wrote: > >> Hi all, >> >> while debugging some things I used `./pasta --config-net -- nslookup >> google.com 1.1.1.1` to test dns. >> The problem is that does not work because the nslookup process will be >> executed before pasta is >> ready with the netns setup, i.e. compare `./pasta --config-net -- ip a`. > Thanks for the report. I also hit this a couple of months ago but I > couldn't find yet the time to deal with it: > > https://bugs.passt.top/show_bug.cgi?id=37 > >> So a workaround is to spawn a shell and sleep: `sh -c "sleep 1; nslookup >> google.com 1.1.1.1"` >> However this is ugly and does not ensure that the netns is ready after >> one second. As a user >> I would expect pasta to wait until the setup is finished before it calls >> exec(). > Absolutely, yes. > > As I mentioned on that ticket, I *think* that the only way to make sure > the setup is actually complete is to query back via netlink addresses > and routes we configured -- simply waiting until we successfully sent > netlink messages isn't enough, because it takes a (substantial) while > until addresses and routes are actually available. > Is there any reason why we would explicitly need to query netlink after the setup is done? With NLM_F_ACK it should wait long enough, no? We use it like that in podman and never experienced an problem with the network not being ready apart from ipv6 DAD and I don't think we need worry about this here. From a quick test, at least for my use case it seems to be working when I hold the exec until the isolate_prefork() call. >> I can send a patch if you agree and I find some time. > That would be great, thanks in advance! > > If you get to it, I think you could reuse nl_route() and nl_addr() from > netlink.c to perform the checks -- they might need to be extended a > bit, I'm not sure. > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Race condition with pasta COMMAND... 2023-01-31 18:00 ` Paul Holzinger @ 2023-01-31 18:43 ` Stefano Brivio 0 siblings, 0 replies; 4+ messages in thread From: Stefano Brivio @ 2023-01-31 18:43 UTC (permalink / raw) To: Paul Holzinger; +Cc: passt-dev On Tue, 31 Jan 2023 19:00:02 +0100 Paul Holzinger <pholzing@redhat.com> wrote: > Hi Stefano, > > On 31/01/2023 14:02, Stefano Brivio wrote: > > Hi Paul, > > > > On Mon, 30 Jan 2023 19:08:14 +0100 > > Paul Holzinger <pholzing@redhat.com> wrote: > > > >> Hi all, > >> > >> while debugging some things I used `./pasta --config-net -- nslookup > >> google.com 1.1.1.1` to test dns. > >> The problem is that does not work because the nslookup process will be > >> executed before pasta is > >> ready with the netns setup, i.e. compare `./pasta --config-net -- ip a`. > > Thanks for the report. I also hit this a couple of months ago but I > > couldn't find yet the time to deal with it: > > > > https://bugs.passt.top/show_bug.cgi?id=37 > > > >> So a workaround is to spawn a shell and sleep: `sh -c "sleep 1; nslookup > >> google.com 1.1.1.1"` > >> However this is ugly and does not ensure that the netns is ready after > >> one second. As a user > >> I would expect pasta to wait until the setup is finished before it calls > >> exec(). > > Absolutely, yes. > > > > As I mentioned on that ticket, I *think* that the only way to make sure > > the setup is actually complete is to query back via netlink addresses > > and routes we configured -- simply waiting until we successfully sent > > netlink messages isn't enough, because it takes a (substantial) while > > until addresses and routes are actually available. > > > Is there any reason why we would explicitly need to query netlink > after the setup is done? > With NLM_F_ACK it should wait long enough, no? We use it like that in > podman and never experienced an problem with the network not being > ready apart from ipv6 DAD and I don't think we need worry about this here. Perhaps in practice yes, but we don't have a formal guarantee that any given routing table/trie/structure is actually synchronised just because we get an acknowledgement for the netlink message (that's all NLM_F_ACK says). In terms of RCU, by the time we read NLM_F_ACK in passt, we know that any pending (from the address/routing update perspective) read-side critical section must have exited. However, routing tables have generation IDs and there might be other mechanisms (current or future) that prevent us from making this assumption. Also, given multiple CPU threads on a machine, I don't think it's really that straightforward. > From a quick test, at least for my use case it seems to be working when > I hold the exec until the isolate_prefork() call. I tried that too in the past and I had some "failures", I don't remember if that was on "ip route show" or "ip address show". Still, it's by all means an improvement on the original, and if it works for you reliably that's a good sign, so I'd definitely welcome that patch. I'm just not convinced it's a complete fix, so I would also welcome a more involved patch doing the check via netlink. :) -- Stefano ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-01-31 18:44 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-01-30 18:08 Race condition with pasta COMMAND Paul Holzinger 2023-01-31 13:02 ` Stefano Brivio 2023-01-31 18:00 ` Paul Holzinger 2023-01-31 18:43 ` Stefano Brivio
Code repositories for project(s) associated with this public inbox https://passt.top/passt This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for IMAP folder(s).