From: David Gibson
To: Stefano Brivio
CC: passt-dev@passt.top, Cleber Rosa
Date: Thu, 8 Aug 2024 11:28:50 +1000
Subject: Re: [PATCH v2 06/22] test: Add exeter+Avocado based build tests
References: <20240805123701.1720730-1-david@gibson.dropbear.id.au> <20240805123701.1720730-7-david@gibson.dropbear.id.au> <20240807001126.5e9a92d3@elisabeth> <20240807150644.5dc22f50@elisabeth>
In-Reply-To: <20240807150644.5dc22f50@elisabeth>
List-Id: Development discussion and patches for passt

On Wed, Aug 07, 2024 at 03:06:44PM +0200, Stefano Brivio wrote:
> On Wed, 7 Aug 2024 20:51:08 +1000
> David Gibson wrote:
>
> > On Wed, Aug 07, 2024 at 12:11:26AM +0200, Stefano Brivio wrote:
> > > On Mon, 5 Aug 2024 22:36:45 +1000
> > > David Gibson wrote:
> > >
> > > > Add a new test script to run the equivalent of the tests in build/all
> > > > using exeter and Avocado.  This new version of the tests is more robust
> > > > than the original, since it makes a temporary copy of the source tree so
> > > > will not be affected by concurrent manual builds.
> > >
> > > I think this is much more readable than the previous Python attempt.
> >
> > That's encouraging.
> >
> > > On the other hand, I guess it's not an ideal candidate for a fair
> > > comparison because this is exactly the kind of stuff where shell
> > > scripting shines: it's a simple test that needs a few basic shell
> > > commands.
> >
> > Right.
> >
> > > On that subject, the shell test is about half the lines of code (just
> > > skipping headers, it's 48 lines instead of 90... and yes, this version
> >
> > Even ignoring the fact that this case is particularly suited to shell,
> > I don't think that's really an accurate comparison, but getting to one
> > is pretty hard.
> >
> > The existing test isn't 48 lines of shell, but of "passt test DSL".
> > There are several hundred additional lines of shell to interpret that.
>
> Yeah, but the 48 lines is all I have to look at, which is what matters
> I would argue. That's exactly why I wrote that interpreter.
>
> Here, it's 90 lines of *test file*.

Fair point.  Fwiw, it's down to 77 so far for my next draft.

> > Now obviously we don't need all of that for just this test.  Likewise
> > the new Python test needs at least exeter - that's only a couple of
> > hundred lines - but also Avocado (huge, but only a small amount is
> > really relevant here).
> >
> > > now uses a copy of the source code, but that would be two lines).
> >
> > I feel like it would be a bit more than two lines, to copy exactly
> > what you want, and to clean up after yourself.
>
> host mkdir __STATEDIR__/sources
> host cp --parents $(git ls-files) __STATEDIR__/sources
>
> ...which is actually an improvement on the original as __STATEDIR__ can
> be handled in a centralised way, if one wants to keep that after the
> single test case, after the whole test run, or not at all.

Huh, I didn't know about cp --parents, which does exactly what's
needed.  In the Python library there are, alas, several things that do
almost but not quite what's needed.  I guess I could just invoke 'cp
--parents' myself.

> > > In terms of time overhead, dropping delays to make the display capture
> > > nice (a feature that we would anyway lose with exeter plus Avocado, if
> > > I understood correctly):
> >
> > Yes.  Unlike you, I'm really not convinced of the value of the display
> > capture versus log files, at least in the majority of cases.
>
> Well, but I use that...
>
> By the way, openQA nowadays takes periodic screenshots. That's certainly
> not as useful, but I'm indeed not the only one who benefits from
> _seeing_ tests as they run instead of correlating log files from
> different contexts, especially when you have a client, a server, and
> what you're testing in between.

If you have to correlate multiple logs that's a pain, yes.  My
approach here is, as much as possible, to have a single "log"
(actually stdout & stderr) from the top level test logic, so the
logical ordering is kind of built in.

> > I certainly don't think it's worth slowing down the test running in the
> > normal case.
>
> It doesn't significantly slow things down,

It does if you explicitly add delays to make the display capture nice
as mentioned above.

> but it certainly makes it
> more complicated to run test cases in parallel... which you can't do
> anyway for throughput and latency tests (which take 22 out of the 37
> minutes of a current CI run), unless you set up VMs with CPU pinning and
> cgroups, or a server farm.

So, yes, the perf tests take the majority of the runtime for CI, but
I'm less concerned about runtime for CI tests.  I'm more interested in
runtime for a subset of functional tests you can run repeatedly while
developing.  I routinely disable the perf and other slow tests, to get
a subset taking 5-7 minutes.  That's ok, but I'm pretty confident I
can get better coverage in significantly less time using parallel
tests.

> I mean, I see the value of running things in parallel in a general
> case, but I don't think you should just ignore everything else.
>
> > > $ time (make clean; make passt; make clean; make pasta; make clean; make qrap; make clean; make; d=$(mktemp -d); prefix=$d make install; prefix=$d make uninstall; )
> > > [...]
> > > real    0m17.449s
> > > user    0m15.616s
> > > sys     0m2.136s
> >
> > On my system:
> > [...]
> > real    0m20.325s
> > user    0m15.595s
> > sys     0m5.287s
> >
> > > compared to:
> > >
> > > $ time ./run
> > > [...]
> > > real    0m18.217s
> > > user    0m0.010s
> > > sys     0m0.001s
> > >
> > > ...which I would call essentially no overhead. I didn't try out this
> > > version yet, I suspect it would be somewhere in between.
> >
> > Well..
> >
> > $ time PYTHONPATH=test/exeter/py3 test/venv/bin/avocado run test/build/build.json
> > [...]
> > RESULTS    : PASS 5 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
> > JOB TIME   : 10.85 s
> >
> > real    0m11.000s
> > user    0m23.439s
> > sys     0m7.315s
> >
> > Because parallel.
> > It looks like the avocado start up time is
> > reasonably substantial too, so that should look better with a larger
> > set of tests.
>
> With the current set of tests, I doubt it's ever going to pay off. Even
> if you run the non-perf tests in 10% of the time, it's going to be 24
> minutes instead of 37.

Including the perf tests, probably not.  Excluding them (which is
extremely useful when actively coding) I think it will.

> I guess it will start making sense with larger matrices of network
> environments, or with more test cases (but really a lot of them).

We could certainly do with a lot more tests, though I expect it will
take a while to get them.

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
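
For reference, the "cp --parents $(git ls-files)" step discussed above
could be approximated in pure Python roughly as follows.  This is only a
sketch, not code from the patch series or from exeter: the function
names git_tracked_files() and copy_with_parents() are made up for
illustration, and it assumes the caller passes the top of a git working
tree with no tracked files missing from it.

```python
import os
import shutil
import subprocess
from pathlib import Path


def git_tracked_files(repo):
    """Return the paths 'git ls-files' reports for repo, relative to repo.

    Hypothetical helper: assumes 'git' is on PATH and repo is a work tree.
    """
    # -z separates names with NUL, so names with spaces or newlines survive
    out = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repo, check=True, capture_output=True,
    ).stdout
    return [os.fsdecode(name) for name in out.split(b"\0") if name]


def copy_with_parents(src_root, rel_paths, dest):
    """Copy each of rel_paths from src_root into dest, recreating parent dirs.

    Hypothetical helper: roughly what 'cp --parents FILE... DEST' does when
    run from src_root.  Raises if a listed file does not exist.
    """
    src_root, dest = Path(src_root), Path(dest)
    copied = []
    for rel in rel_paths:
        target = dest / rel
        # Create intermediate directories, as --parents would
        target.parent.mkdir(parents=True, exist_ok=True)
        # shutil.copy copies file contents and permission bits
        shutil.copy(src_root / rel, target)
        copied.append(target)
    return copied
```

A test could then call copy_with_parents(repo, git_tracked_files(repo),
tmpdir), with tmpdir coming from tempfile.TemporaryDirectory() so that
cleanup happens when the context exits.  Invoking 'cp --parents'
directly is shorter, at the cost of depending on GNU cp.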