Summary:
Call the test target StreamExecutorCoreTests to prevent collisions with
targets from other modules.
Reviewers: jlebar, jprice
Subscribers: beanz, mgorny, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24949
llvm-svn: 282491
Summary:
It turns out CMake errors out if a processed directory contains source
files that are not used. This was causing an error with the CUDATest.cpp
file when configuring StreamExecutor with the CUDA platform disabled.
Moving CUDATest.cpp to its own directory fixes this problem.
Reviewers: jlebar, jprice
Subscribers: beanz, mgorny, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24618
llvm-svn: 281654
Summary:
Add proper handling for shared memory arguments in the CUDA platform. Also add
unit tests for the CUDA platform.
Reviewers: jlebar
Subscribers: beanz, mgorny, jprice, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24596
llvm-svn: 281635
Summary:
We were packing global device memory handles in
`PackedKernelArgumentArray`, but as I was implementing the CUDA
platform, I realized that CUDA wants the address of the handle, not the
handle itself. So this patch switches to packing the address of the
handle.
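A minimal sketch of the difference between packing handles and packing
their addresses. The names `DeviceHandle` and `packArgumentAddresses` are
hypothetical and are not the actual `PackedKernelArgumentArray` API. The
sketch assumes a CUDA-driver-style launch call that expects an array of
pointers, each pointing at the storage holding one argument value.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an opaque global device memory handle.
using DeviceHandle = void *;

// Packs the address of each handle rather than the handle itself.
// The returned pointers refer into Handles, so Handles must outlive the
// returned array and must not be resized while it is in use.
inline std::vector<void *>
packArgumentAddresses(std::vector<DeviceHandle> &Handles) {
  std::vector<void *> Packed;
  Packed.reserve(Handles.size());
  for (DeviceHandle &Handle : Handles)
    Packed.push_back(&Handle); // address of the handle, not its value
  return Packed;
}
```

Each packed entry can be cast back to `DeviceHandle *` and dereferenced to
recover the original handle, which is what a launch routine of this style
would do when copying the argument.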
Reviewers: jlebar
Subscribers: jprice, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24528
llvm-svn: 281424
Summary:
Before, the kernel spec would only return PTX for exactly the requested
compute capability. With this patch, it returns the PTX with the
largest compute capability that does not exceed the requested compute
capability.
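The selection rule can be sketched as follows. This is an illustration
only, not the kernel spec's actual code: `ComputeCapability`,
`findBestPTX`, and the use of a `std::map` keyed by (major, minor) are
all assumptions made for the example.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical key: (major, minor). std::pair compares lexicographically,
// which matches the natural ordering of compute capabilities.
using ComputeCapability = std::pair<int, int>;

// Returns the PTX registered for the largest compute capability that does
// not exceed Requested, or nullptr if every registered capability is larger.
inline const std::string *
findBestPTX(const std::map<ComputeCapability, std::string> &PTXByCapability,
            ComputeCapability Requested) {
  // upper_bound yields the first entry strictly greater than Requested, so
  // the entry just before it (if there is one) is the best match.
  auto It = PTXByCapability.upper_bound(Requested);
  if (It == PTXByCapability.begin())
    return nullptr;
  return &std::prev(It)->second;
}
```

With PTX registered for 3.0 and 5.0, a request for 3.5 would resolve to the
3.0 entry, whereas an exact-match lookup would find nothing.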
Reviewers: jlebar
Subscribers: jprice, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24531
llvm-svn: 281417
Summary:
The .clang-tidy file is copied from the top-level LLVM source directory.
Also fix warnings generated by clang-tidy:
* Moved SimpleHostPlatformDevice.h so its header include guard could
  have the right format.
* Changed signatures of methods taking llvm::Twine by value to take it
  by const ref instead.
* Added "noexcept" to some move constructors and assignment operators.
* Removed a bunch of places where single-statement loops and
  conditionals were surrounded with braces. (This was not found by the
  current clang-tidy, but with a local patch that I hope to upstream
  soon.)
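For illustration, the kinds of signature changes listed above might look
like the sketch below. `Buffer` is a hypothetical class, and `std::string`
stands in for `llvm::Twine` so that the example does not depend on LLVM
headers.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class, not taken from StreamExecutor.
class Buffer {
public:
  // Takes the name by const reference rather than by value.
  explicit Buffer(const std::string &Name) : Name(Name) {}

  // Move operations are marked noexcept.
  Buffer(Buffer &&Other) noexcept : Name(std::move(Other.Name)) {}
  Buffer &operator=(Buffer &&Other) noexcept {
    Name = std::move(Other.Name);
    return *this;
  }

  const std::string &name() const { return Name; }

private:
  std::string Name;
};

// The noexcept markers are visible to generic code through type traits.
static_assert(std::is_nothrow_move_constructible<Buffer>::value,
              "move constructor should be noexcept");
static_assert(std::is_nothrow_move_assignable<Buffer>::value,
              "move assignment should be noexcept");
```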
Reviewers: jlebar, jprice
Subscribers: parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24468
llvm-svn: 281374
Summary:
* Add LLVM_ATTRIBUTE_UNUSED_RESULT to slicing methods in order to
  emphasize that the slicing is not done in place.
* Change device memory slice function name from `drop_front` to `slice`
  in order to match the naming convention of `llvm::ArrayRef` and host
  memory slice.
* Change the parameter names of host memory slice functions to
  `DropCount` and `TakeCount` to match device memory slice declarations.
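A rough sketch of the slicing convention described above. `MemorySlice` is
a hypothetical type, not the StreamExecutor class, and the C++17
`[[nodiscard]]` attribute is used here as a stand-in for
LLVM_ATTRIBUTE_UNUSED_RESULT.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view over a contiguous range of elements.
template <typename T> class MemorySlice {
public:
  MemorySlice(T *Data, std::size_t Count) : Data(Data), Count(Count) {}

  T *data() const { return Data; }
  std::size_t size() const { return Count; }

  // Returns a new slice that skips DropCount elements and keeps the next
  // TakeCount elements. The original slice is left unchanged, so ignoring
  // the result is almost certainly a mistake.
  [[nodiscard]] MemorySlice slice(std::size_t DropCount,
                                  std::size_t TakeCount) const {
    assert(DropCount + TakeCount <= Count && "slice out of range");
    return MemorySlice(Data + DropCount, TakeCount);
  }

  // Returns a new slice that skips DropCount elements and keeps the rest.
  [[nodiscard]] MemorySlice slice(std::size_t DropCount) const {
    assert(DropCount <= Count && "slice out of range");
    return slice(DropCount, Count - DropCount);
  }

private:
  T *Data;
  std::size_t Count;
};
```

With the attribute in place, a statement such as `S.slice(1);` that
discards the result draws a compiler warning instead of silently doing
nothing.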
Reviewers: jlebar
Subscribers: jprice, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24464
llvm-svn: 281239
Summary:
Improve the error-prone interface that allows users to pass unregistered
host pointers to asynchronous copy methods. In CUDA, this is an extremely
easy error to make, and instead of failing at runtime, the call succeeds
and gives the right answers by turning the async copy into a sync copy.
So, you silently get a huge performance degradation if you misuse the old
interface. This new interface should prevent that.
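The message above does not say how the new interface prevents the mistake.
One possible approach, sketched here purely as an assumption, is to give
registered host memory its own type so that an unregistered pointer cannot
be passed to an asynchronous copy at all. `RegisteredHostMemory`,
`registerRange`, and `enqueueCopyToDevice` are hypothetical names, and
`std::memcpy` stands in for a real asynchronous device copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical wrapper marking host memory that has been registered.
template <typename T> class RegisteredHostMemory {
public:
  // The only way to obtain an instance. A real platform would register
  // (pin) the pages here; this sketch only records the range.
  static RegisteredHostMemory registerRange(T *Data, std::size_t Count) {
    return RegisteredHostMemory(Data, Count);
  }

  T *data() const { return Data; }
  std::size_t size() const { return Count; }

private:
  RegisteredHostMemory(T *Data, std::size_t Count)
      : Data(Data), Count(Count) {}

  T *Data;
  std::size_t Count;
};

// Accepts only registered host memory as the source. Passing a raw T*
// as Src does not compile.
template <typename T>
void enqueueCopyToDevice(const RegisteredHostMemory<T> &Src, T *DeviceDst) {
  // Stand-in for the real asynchronous copy.
  std::memcpy(DeviceDst, Src.data(), Src.size() * sizeof(T));
}

// A raw pointer does not convert to the registered type.
static_assert(!std::is_convertible<int *, RegisteredHostMemory<int>>::value,
              "raw pointers must be registered explicitly");
```

Under this design the misuse becomes a compile-time error rather than a
silent fallback to a synchronous copy.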
Reviewers: jlebar
Subscribers: jprice, beanz, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24353
llvm-svn: 281225
Summary:
With these changes, we can put parallel-libs within llvm/projects and
build as normal.
This is kind of the minimal change I could figure out how to make while
still making us compatible with llvm's build system. Some things I'm
not thrilled about include:
* The creation of a CoreTests directory (the macros really seemed to
  want this)
* Pulling SimpleHostPlatformDevice.h into CoreTests. It seems to me
  this should live inside unittests/include, or maybe tests/include,
  but I didn't want to make that change in this patch.
One important piece of work that remains to be done is to make
    $ ninja check-streamexecutor
run all the tests. Right now the only way I've figured out to run the
tests is
    $ ninja projects/parallel-libs/streamexecutor/unittests/StreamExecutorUnitTests
    $ projects/parallel-libs/streamexecutor/unittests/CoreTests/CoreTests
Reviewers: jhen
Subscribers: beanz, parallel_libs-commits, jprice
Differential Revision: https://reviews.llvm.org/D24368
llvm-svn: 281091