Quality of progress bar and ETA in lit has always bothered me.
For example, given `./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv`
at 1%, it says it will take 10 more minutes,
at 25%, it says it will take 1.25 more minutes,
at 50%, it says it will take 30 more seconds,
and in the end finishes with `Testing Time: 39.49s`. That's rather wildly unprecise.
Currently, it assumes that every single test will take the same amount of time to run on average.
This is is a somewhat reasonable approximation overall, but it is quite clearly imprecise,
especially in the beginning.
But, we can do better now, after D98179! We now know how long the tests took to run last time.
So we can build a better ETA predictor, by accumulating the time spent already,
the time that will be spent on the tests for which we know the previous time,
and for the test for which we don't have previous time, again use the average time
over the tests for which we know current or previous run time.
It would be better to use median, but i'm wary of the cost that may incur.
Now, on **first** run of `./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv`
at 10%, it says it will take 30 seconds,
at 25%, it says it will take 50 more seconds,
at 50%, it says it will take 27 more seconds,
and in the end finishes with `Testing Time: 41.64s`. That's pretty reasonable.
And on second run of `./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv`
at 1%, it says it will take 1 minutes,
at 25%, it says it will take 30 more seconds,
at 50%, it says it will take 19 more seconds,
and in the end finishes with `Testing Time: 39.49s`. That's amazing i think!
I think people will love this :)
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D99073
Even though we have read the times before,
we intentionally forget about it for performance reasons.
But that means we also forget all the times for the tests
that weren't executed this time. This is mildly inconvenient.
So, when recording the new times, first re-read the old times,
and update times for the tests that were executed,
thus preserving all original times, too.
I.e. when you first run lit on a directory, and then on a single test,
the timing knowledge about anything else other than that single test
is lost. This isn't right.
All of these depend on the order of tests, so if one runs them twice,
the tests within them will naturally be reordered
using the previous run times, which breaks them.
If lit was run on a directory that contained no suites,
then naturally suite[0] will not be there,
and that line would cause python warnings.
So just predicate it with a check that it is there in the first place.
This reverts commit d09adfd399.
That commit caused failures in
clang-tidy/infrastructure/validate-check-names.cpp on windows
buildbots.
That change exposed a surprising issue, not directly related to
this change in itself, but in how TestRunner quotes command line
arguments that later are going to be interpreted by a msys based
tool (like grep.exe, when provided by Git for Windows). This
worked accidentally before, when grep was invoked via not.exe
which took a more conservative approach to windows argument quoting.
When running in a Windows Container, the Git for Windows Unix tools
(C:\Program Files\Git\usr\bin) just hang if this variable isn't
passed through.
Currently, running the LLVM/clang tests in a Windows Container fails
if that directory is added to the path, but succeeds after this change.
(After this change, the previously used GnuWin tools can be left out
entirely, too, as lit automatically picks up the Git for Windows tools
if necessary.)
Differential Revision: https://reviews.llvm.org/D98858
Keep running "not --crash" via the external "not" executable, but
for plain negations, and for cases that use the shell "!" operator,
just skip that argument and invert the return code.
The libcxx tests only use the shell operator "!" for negations,
never the "not" executable, because libcxx tests can be run without
having a fully built llvm tree available providing the "not"
executable.
This allows using the internal shell for libcxx tests.
Differential Revision: https://reviews.llvm.org/D98859
The "path" recorded for timing purposes is only used as a key into a dictionary. It is never used as an actual path to a filesystem API, therefore we should use '/' as the canonical separator so that Unix and Windows machines can share timing data. This also ensures that the lit testing works across platforms.
Reviewed By: jhenderson, jmorse
Differential Revision: https://reviews.llvm.org/D98767
The test file has embedded slashes. This is fine for normal users that
are just recording and reordering paths, but not great when the trace
data is committed back to a repository that should work on both Unix and
Windows.
Lit as it exists today has three hacks that allow users to run tests earlier:
1) An entire test suite can set the `is_early` boolean.
2) A very recently introduced "early_tests" feature.
3) The `--incremental` flag forces failing tests to run first.
All of these approaches have problems.
1) The `is_early` feature was until very recently undocumented. Nevertheless it still lacks testing and is a imprecise way of optimizing test starting times.
2) The `early_tests` feature requires manual updates and doesn't scale.
3) `--incremental` is undocumented, untested, and it requires modifying the *source* file system by "touching" the file. This "touch" based approach is arguably a hack because it confuses editors (because it looks like the test was modified behind the back of the editor) and "touching" the test source file doesn't work if the test suite is read only from the perspective of `lit` (via advanced filesystem/build tricks).
This patch attempts to simplify and address all of the above problems.
This patch formalizes, documents, tests, and defaults lit to recording the execution time of tests and then reordering all tests during the next execution. By reordering the tests, high core count machines run faster, sometimes significantly so.
This patch also always runs failing tests first, which is a positive user experience win for those that didn't know about the hidden `--incremental` flag.
Finally, if users want, they can _optionally_ commit the test timing data (or a subset thereof) back to the repository to accelerate bots and first-time runs of the test suite.
Reviewed By: jhenderson, yln
Differential Revision: https://reviews.llvm.org/D98179
Visual Studios implementation of the C++ Standard Library does not use strerror to produce a message for std::error_code unlike other standard libraries such as libstdc++ or libc++ that might be used.
This patch adds a cmake script that through running a C++ program gets the error messages for the POSIX error codes and passes them onto lit through an optional config parameter.
If the config parameter is not set, or getting the messages failed, due to say a cross compiling configuration without an emulator, it will fall back to using pythons strerror functions.
Differential Revision: https://reviews.llvm.org/D98278
This patch uses the errno python library to print out the correct error messages instead of hardcoding the error message per platform.
Reviewed By: jhenderson, ASDenysPetrov
Differential Revision: https://reviews.llvm.org/D97472
For some build configurations, `check-all` calls lit multiple times to
run multiple lit test suites. Most recently, I've found this to be
true when configuring openmp as part of `LLVM_ENABLE_RUNTIMES`, but
this is not the first time.
If one test suite fails, none of the remaining test suites run, so you
cannot determine if your patch has broken them. It can then be
frustrating to try to determine which `check-` targets will run the
remaining tests without getting stuck on the failing tests.
When such cases arise, it is probably best to adjust the cmake
configuration for `check-all` to run all test suites as part of one
lit invocation. Because that fix will likely not be implemented and
land immediately, this patch introduces `--ignore-fail` to serve as a
workaround for developers trying to see test results until it does
land:
```
$ LIT_OPTS=--ignore-fail ninja check-all
```
One problem with `--ignore-fail` is that it makes it challenging to
detect test failures in a script, perhaps in CI. This problem should
serve as motivation to actually fix the cmake configuration instead of
continuing to use `--ignore-fail` indefinitely.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D96371
In semi-automated environments, XFAILing or filtering out known regressions without actually committing changes or temporarily modifying the test suite can be quite useful.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D96662
With enough cores, the slowest tests can significantly change the total testing time if they happen to run late. With this change, a test suite can improve performance (for high-end systems) by listing just a few of the slowest tests up front.
Reviewed By: jdenny, jhenderson
Differential Revision: https://reviews.llvm.org/D96594
Some test systems do not use lit for test discovery but only for its
substitution and test selection because they use another way of managing
test collections, e.g. CTest. This forces those tests to be invoked with
lit --no-indirectly-run-check. When a mix of lit version is in use, it
requires to detect the availability of that option.
This commit provides a new config option standalone_tests to signal a
directory made of tests meant to run as standalone. When this option is
set, lit skips test discovery and the indirectly run check. It also adds
the missing documentation for --no-indirectly-run-check.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D94766
On z/OS, other error messages are not matched correctly in lit tests.
```
EDC5121I Invalid argument.
EDC5111I Permission denied.
```
This patch adds a lit substitution to fix it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D95808
On z/OS, the following error message is not matched correctly in lit tests.
```
EDC5129I No such file or directory.
```
This patch uses a lit config substitution to check for platform specific error messages.
Reviewed By: muiez, jhenderson
Differential Revision: https://reviews.llvm.org/D95246
The initial problem with the remaining bot config was resolved.
We can now use Python3. Let's use `os.cpu_count()` to cleanup this
helper.
Differential Revision: https://reviews.llvm.org/D94734
Update shebang to always use Python3 when executing `lit.py` directly.
A previous change of mine [1] revealed that we still use Python2 on some
bot configurations that invoke `llvm/utils/lit/lit.py` as a script
directly (instead of `python3 path/to/lit.py`).
[1] https://reviews.llvm.org/D94734
Differential Revision: https://reviews.llvm.org/D95393
In addition to consistency, we'll hit a wall when 11.1.0 gets released, because
we cannot represent it with lit versioning scheme.
Differential Revision: https://reviews.llvm.org/D94157
Historically, we have told contributors that GnuWin32 is a pre-requisite
because our tests depend on utilities such as sed, grep, diff, and more.
However, Git on Windows includes versions of these utilities in its
installation. Furthermore, GnuWin32 has not been updated in many years.
For these reasons, it makes sense to have the ability to run llvm tests
in a way that is both:
a) Easier on the user (less stuff to install)
b) More up-to-date (The verions that ship with git are at least as
new, if not newer, than the versions in GnuWin32.
We add support for this here by attempting to detect where Git is
installed using the Windows registry, confirming the existence of
several common Unix tools, and then adding this location to lit's PATH
environment.
Differential Revision: https://reviews.llvm.org/D84380
Don't produce or expect any output from the infinite looping test -
doing so is a recipe for racey flakyness without a longer timeout to
ensure the output is received first, even though that doesn't seem
integral/important to the test. Instead have a plain, no output infinite
loop and check that that is caught and handled.
If for some reason the output is valuable for test coverage - the
timeout should be increased from 1 second to give the process time to
output the text, flush, and for that text to be received and buffered
before the test is timed out.
On some of the slow or heavily-loaded bots, this test was failing
intermittently because the infinite_loop.py script might not emit
anything to stdout before the 1 second timeout, so the "Command Output"
line isn't present in the output. That output isn't really important to
this test, we just care that the process is killed, so we can just rmove
that check line from the test.
Differential revision: https://reviews.llvm.org/D92563
- pass required=False to use_clang(), as we don't need it
- fix required=False (which was unused and rotted):
- make derived substitutions conditional on it
- add a feature so we can disable tests that need it
- conditionally disable our one test that depends on %resource_dir.
This doesn't seem right from first principles, but isn't a big deal.
Differential Revision: https://reviews.llvm.org/D90528
Currently, a LIT test named directly (on the command line) will
be run even if the name of the test file does not meet the rules
to be considered a test in the LIT test configuration files for
its test suite. For example, if the test does not have a
recognised file extension.
This makes it relatively easy to write a LIT test that won't
actually be run. I did in: https://reviews.llvm.org/D82567
This patch adds an error to avoid users doing that. There is a
small performance overhead for this check. A command line option
has been added so that users can opt into the old behaviour.
Differential Revision: https://reviews.llvm.org/D83069
I did some profiling of lit while trying to optimize the libc++ test
startup for remote hosts and it turns out that there is a realpath() call
for every message printed and this shows up in the profile.
The inspect.getframeinfo() function calls realpath() internally and
moreover we don't need most of the other information returned from it.
This patch uses inspect.getsourcefile() and os.path.abspath() to remove
../ from the path instead. Not resolving symlinks reduces the startup time
for running a single test with lit by about 50ms for me.
Reviewed By: ldionne, yln
Differential Revision: https://reviews.llvm.org/D89186
I recently had to run the test suite with a debug Python which got started
warning about some invalid escape sequences in LLDB's Python code. They all
attempt to add a backslash by doing a single backslash instead of a double
backslash in a normal string. This seems to work fine for now, but Python says
this behaviour is deprecated, so this patch turns all those strings into raw
strings (where a single backslash is actually a single backslash)
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D88289
The tests previously relied on the `short.py` and `FirstTest.subTestA`
script being executed on a machine within a short time window (1 or 2
seconds). While this "seems to work" it can fail on resource constrained
machines. We could bump the timeout a little bit (bumping it too
much would mean the test would take a long time to execute) but it wouldn't
really solve the problem of the test being prone to failures.
This patch tries to remove this flakeyness by separating testing into
two separate parts:
1. Testing if a test can hit a timeout.
2. Testing if a test can run to completion in the presence of a
timeout.
This way we can give (1.) a really short timeout (to make the test run
as fast as possible) and (2.) a really long timeout. This means for (2.)
we are no longer trying to rely on the "short" test executing within
some short time window. Instead the window is now 3600 seconds which
should be long enough even for a heavily resource constrained machine to
execute the "short" test.
Thanks to Julian Lettner for suggesting this approach. This superseeds
my original approach in https://reviews.llvm.org/D88807.
This patch also changes the command line override test to run the quick
test rather than the slow one to make the test run faster.
Differential Revision: https://reviews.llvm.org/D89020
This reverts b3418cb4eb and d12ae042e1
This breaks some external bots, see discussion in https://reviews.llvm.org/D84380
In the meanwhile, please use `cmake -DLLVM_LIT_TOOLS_DIR="C:/Program Files/Git/usr/bin"` or add it to %PATH%.
Summary:
This patch adds an error to Clang that detects if OpenMP offloading is
used between two architectures with incompatible pointer sizes. This
ensures that the data mapping can be done correctly and solves an issue
in code generation generating the wrong size pointer. This patch adds a
new lit substitution, %omp_powerpc_triple that, if the system is 32-bit or
64-bit, sets the powerpc triple accordingly. This was required to fix
some OpenMP tests that automatically populated the target architecture.
Reviewers: jdoerfert
Subscribers: cfe-commits guansong sstefan1 yaxunl delcypher
Tags: OpenMP clang LLVM
Differential Revision: https://reviews.llvm.org/D88594