forked from OSchip/llvm-project
Summary: We were packing global device memory handles in `PackedKernelArgumentArray`, but as I was implementing the CUDA platform, I realized that CUDA wants the address of the handle, not the handle itself. So this patch switches to packing the address of the handle.

Reviewers: jlebar

Subscribers: jprice, jlebar, parallel_libs-commits

Differential Revision: https://reviews.llvm.org/D24528

llvm-svn: 281424
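To illustrate the distinction the summary draws, here is a minimal sketch of packing the address of each handle rather than the handle value. It assumes a launch interface that, like the `kernelParams` argument of CUDA's `cuLaunchKernel`, receives an array of pointers to the argument values. The names `DeviceHandle`, `packHandleAddresses`, and `readArgument` are hypothetical stand-ins for illustration and are not taken from StreamExecutor or from this patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an opaque global device memory handle.
using DeviceHandle = std::uintptr_t;

// Pack the ADDRESS of each handle, not the handle value itself.
// The returned pointers refer into `handles`, so `handles` must outlive
// the packed array and must not be resized while it is in use.
std::vector<void *> packHandleAddresses(std::vector<DeviceHandle> &handles) {
  std::vector<void *> packed;
  packed.reserve(handles.size());
  for (DeviceHandle &handle : handles)
    packed.push_back(&handle);
  return packed;
}

// Hypothetical consumer that mimics a launch API taking an array of
// pointers to argument values: it dereferences the i-th pointer to
// recover the handle value.
DeviceHandle readArgument(void **kernelParams, std::size_t index) {
  return *static_cast<DeviceHandle *>(kernelParams[index]);
}
```

If the handle values themselves were stored in the array, a consumer like `readArgument` would dereference a device address on the host, which is the kind of mismatch the summary describes.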
| Name |
|---|
| examples |
| include/streamexecutor |
| lib |
| tools/streamexecutor-config |
| unittests |
| CMakeLists.txt |
| Doxyfile.in |
| README.txt |
| customdoxygen.css |
README.txt
StreamExecutor
==============

StreamExecutor is a wrapper around CUDA and OpenCL (host-side) programming
models (runtimes). This abstraction cleanly permits host code to target either
CUDA or OpenCL devices with identically-functioning data parallel kernels. It
manages the execution of concurrent work targeting the accelerator, similar to a
host-side Executor.

This version of StreamExecutor can be built either as a sub-project of the LLVM
project or as a standalone project depending on LLVM as an external package.