llvm-project

Commit Graph

Author	SHA1	Message	Date
Joseph Huber	23bc343855	[Libomptarget] Change device free routines to accept the allocation kind Previous support for device memory allocators used a single free routine and did not provide the original kind of the allocation. This is problematic as some of these memory types required different handling. Previously this was worked around using a map in runtime to record the original kind of each pointer. Instead, this patch introduces new free routines similar to the existing allocation routines. This allows us to avoid a map traversal every time we free a device pointer. The only interfaces defined by the standard are `omp_target_alloc` and `omp_target_free`, these do not take a kind as `omp_alloc` does. The standard dictates the following: "The omp_target_alloc routine returns a device pointer that references the device address of a storage location of size bytes. The storage location is dynamically allocated in the device data environment of the device specified by device_num." Which suggests that these routines only allocate the default device memory for the kind. So this has been changed to reflect this. This change is somewhat breaking if users were using `omp_target_free` as previously shown in the tests. Reviewed By: JonChesterfield, tianshilei1992 Differential Revision: https://reviews.llvm.org/D133053	2022-09-14 12:14:07 -05:00
Joseph Huber	6e8d93e5c2	[Libomptarget] Implement OpenMP 5.2 semantics for device pointers In OpenMP 5.2, §5.8.6, page 160 line 32-33, when a device pointer allocated by omp_target_alloc has implicitly been included on a target construct as a zero-length array, the pointer initialisation should not find a matching mapped list item, and so should retain its value as a firstprivate variable. Previously, we would return a null pointer if the list item was not found. This patch updates the map handling to the OpenMP 5.2 semantics. Reviewed By: jdoerfert, ye-luo Differential Revision: https://reviews.llvm.org/D133447	2022-09-07 17:01:14 -05:00
Joseph Huber	dc400f8612	[libomptarget] Deprecate old method for setting the tripcount Previously, the tripcount was set by a push call. We moved away from this with the new interface that added the tripcount to the kernel arguments struct, but kept around the old interface for legacy purposes for the LLVM 15 release. This patch removes the support for the legacy method. This removes the support for the old method, but does not break backwards compatibility. This will result in applications using the old interface being slower when run on the device. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D132885	2022-08-29 20:08:26 -05:00
Joseph Huber	51bda3a0e7	[Libomptarget] Replace std::vector with llvm::SmallVector The runtime makes some use of `std::vector` data structures. We should be able to replace these trivially with `llvm::SmallVector` instead. This should allow us to avoid heap allocations in the majority of cases now. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130927	2022-08-01 15:59:15 -04:00
Joseph Huber	fbcb1ee7f3	[Libomptarget] Add support for offloading binaries in libomptarget The previous patch changed the linker wrapper to embed the offloading binary format inside the target image instead. This will allow us to more generically bundle metadata with these images, such as requires clauses or the target architecture it was compiled for. I wasn't sure how to handle this best, so I introduced a new type that replaces the old `__tgt_device_image` struct that we can expand inside the runtime library. I made the new `__tgt_device_binary` struct pretty much the same for now. In the future we could change this struct to pretty much be the `OffloadBinary` class. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D127432	2022-07-21 13:20:04 -04:00
Joseph Huber	d27d0a673c	[Libomptarget][NFC] Make Libomptarget use the LLVM naming convention Libomptarget grew out of a project that was originally not in LLVM. As we develop libomptarget this has led to an increasingly large clash between the naming conventions used. This patch fixes most of the variable names that did not conform to the LLVM standard, that is `VariableName` for variables and `functionName` for functions. This patch was primarily done using my editor's linting messages; if there are any issues I missed arising from the automation, let me know. Reviewed By: saiislam Differential Revision: https://reviews.llvm.org/D128997	2022-07-05 14:53:38 -04:00
Johannes Doerfert	b316126887	[OpenMP][FIX] Avoid races in the handling of to be deleted mapping entries If we decided to delete a mapping entry we did not act on it right away but first issued and waited for memory copies. In the meantime some other thread might reuse the entry. While there was some logic to avoid colliding on the actual "deletion" part, there were two races happening: 1) The data transfer back of the thread deleting the entry and the data transfer back of the thread taking over the entry raced. 2) The update to the shadow map happened regardless of whether the entry was actually reused by another thread, which left the shadow map in an inconsistent state. To fix both issues we will now update the shadow map and delete the entry only if we are sure the thread is responsible for deletion, hence no other thread took over the entry and reused it. We also wait for a potential former data transfer from the device to finish before we issue another one that would race with it. Fixes https://github.com/llvm/llvm-project/issues/54216 Differential Revision: https://reviews.llvm.org/D121058	2022-03-28 22:33:18 -05:00
Johannes Doerfert	4e34f061d6	[OpenMP][FIX] Ensure exclusive access to the HDTT map This patch solves two problems with the `HostDataToTargetMap` (HDTT map) which caused races and crashes before: 1) Any access to the HDTT map needs to be exclusive access. This was not the case for the "dump table" traversals that could collide with updates by other threads. The new `Accessor` and `ProtectedObject` wrappers will ensure we have a hard time introducing similar races in the future. Note that we could allow multiple concurrent read-accesses but that feature can be added to the `Accessor` API later. 2) The elements of the HDTT map were `HostDataToTargetTy` objects which meant that they could be copied/moved/deleted as the map was changed. However, we sometimes kept pointers to these elements around after we gave up the map lock which caused potential races again. The new indirection through `HostDataToTargetMapKeyTy` will allow us to modify the map while keeping the (interesting part of the) entries valid. To offset potential cost we duplicate the ordering key of the entry which avoids an additional indirect lookup. We should replace more objects with "protected objects" as we go. Differential Revision: https://reviews.llvm.org/D121057	2022-03-25 11:38:54 -05:00
Johannes Doerfert	10aa83ff74	[OpenMP] Allow to explicitly deinitialize device resources There are two problems this patch tries to address: 1) We currently free resources in a random order wrt. plugin and libomptarget destruction. This patch should ensure the CUDA plugin is less fragile if something during the deinitialization goes wrong. 2) We need to support (hard) pause runtime calls eventually. This patch allows us to free all associated resources, though we cannot reinitialize the device yet. Follow up patch will associate one event pool per device/context. Differential Revision: https://reviews.llvm.org/D120089	2022-03-07 23:43:04 -06:00
Johannes Doerfert	307bbd3c82	[OpenMP][NFCI] Use RAII lock guards in libomptarget where possible Differential Revision: https://reviews.llvm.org/D121060	2022-03-07 23:43:04 -06:00
Johannes Doerfert	7ead7e90fc	Revert "[OpenMP][NFCI] Use RAII lock guards in libomptarget where possible" This reverts commit `ff50e81b50` as it broke the buildbots, see https://reviews.llvm.org/D121060#3362737.	2022-03-06 21:27:41 -06:00
Johannes Doerfert	ff50e81b50	[OpenMP][NFCI] Use RAII lock guards in libomptarget where possible Differential Revision: https://reviews.llvm.org/D121060	2022-03-06 19:59:23 -06:00
Johannes Doerfert	b0789a1b12	[OpenMP] Avoid costly shadow map traversals whenever possible In the OpenMC app we saw `omp target update` spending an awful lot of time in the shadow map traversal without ever doing any update there. There are two cases that allow us to avoid the traversal completely. The simplest thing is that small updates cannot (reasonably) contain an attached pointer part. The other case requires tracking in the mapping table whether an entry might contain an attached pointer as a part. Given that we have a single location where shadow map entries are created, the latter is actually fairly easy as well. Differential Revision: https://reviews.llvm.org/D113124	2022-01-19 22:14:41 -06:00
Johannes Doerfert	1e447d03e2	[OpenMP] Introduce an environment variable to disable atomic map clauses Atomic handling of map clauses was introduced to comply with the OpenMP standard (see D104418). However, many apps won't need this feature which can be costly in certain situations. To allow for applications to opt-out we now introduce the `LIBOMPTARGET_MAP_FORCE_ATOMIC` environment flag that voids the atomicity guarantee of the standard for map clauses again, shifting the burden to the user. This patch also de-duplicates the code that introduces the events used to enforce atomicity as a cleanup. Differential Revision: https://reviews.llvm.org/D117627	2022-01-19 22:14:41 -06:00
Shilei Tian	9584c6fa2f	[OpenMP][Offloading] Fixed data race in libomptarget caused by async data movement The async data movement can cause a data race if the target supports it. Details can be found in [1]. This patch tries to fix this problem by attaching an event to the entry of the data mapping table. Here are the details. For each issued data movement, a new event is generated and returned to `libomptarget` by calling `createEvent`. The event will be attached to the corresponding mapping table entry. For each data mapping lookup, if there is no need for a data movement, the attached event has to be inserted into the queue to guarantee that all following operations in the queue can only be executed if the event is fulfilled. This design is to avoid synchronization on the host side. Note that we are using CUDA terminology here. A similar mechanism is assumed to be supported by other targets. Even if the target doesn't support it, it can be easily implemented in the following fallback way: - `Event` can be any kind of flag that has at least two states, 0 and 1. - `waitEvent` can directly busy loop if `Event` is still 0. My local test shows that `bug49334.cpp` can pass. Reference: [1] https://bugs.llvm.org/show_bug.cgi?id=49940 Reviewed By: grokos, JonChesterfield, ye-luo Differential Revision: https://reviews.llvm.org/D104418	2022-01-05 20:20:04 -05:00
Johannes Doerfert	73104ad65b	[OpenMP][NFC] Move headers into include folder	2021-12-28 23:53:28 -06:00

16 Commits
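
Commit 4e34f061d6 above describes `Accessor` and `ProtectedObject` wrappers that force every access to the HDTT map to be exclusive. The following is a minimal sketch of that general pattern, not the libomptarget implementation: the class layout, the `getExclusiveAccessor` name, and the `demoTableSize` helper are assumptions made for illustration, and a plain `std::map<int, int>` stands in for the real mapping table.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical sketch of a lock-guarded wrapper; names and layout are
// illustrative and are not taken from the libomptarget sources.
template <typename T> class ProtectedObject;

// Holds the lock for as long as the Accessor is alive, so the wrapped
// object can only be reached while the mutex is owned.
template <typename T> class Accessor {
  std::unique_lock<std::mutex> Lock;
  T *Obj;

  friend class ProtectedObject<T>;
  Accessor(std::mutex &Mtx, T &O) : Lock(Mtx), Obj(&O) {}

public:
  Accessor(Accessor &&) = default;
  T &operator*() { return *Obj; }
  T *operator->() { return Obj; }
};

// Owns the object together with its mutex; the object is private, so the
// only way to touch it is through an Accessor.
template <typename T> class ProtectedObject {
  std::mutex Mtx;
  T Obj;

public:
  ProtectedObject() = default;
  explicit ProtectedObject(T Init) : Obj(std::move(Init)) {}

  Accessor<T> getExclusiveAccessor() { return Accessor<T>(Mtx, Obj); }
};

// Small usage example: each scope takes the lock, works on the map, and
// releases the lock when the Accessor goes out of scope.
int demoTableSize() {
  ProtectedObject<std::map<int, int>> Table;
  {
    auto A = Table.getExclusiveAccessor();
    (*A)[1] = 10;
    (*A)[2] = 20;
  }
  {
    auto A = Table.getExclusiveAccessor();
    assert(A->at(2) == 20);
  }
  return static_cast<int>(Table.getExclusiveAccessor()->size());
}
```

Because the wrapped object is only reachable through an `Accessor`, a traversal such as the "dump table" path mentioned in the commit cannot forget to take the lock. This sketch always takes exclusive access; the shared read access that the commit message mentions as a possible later extension is not modeled here.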