llvm-project

Author	SHA1	Message	Date
Joseph Huber	586fc5999b	[Libomptarget][NFC] clang-format the libomptarget OpenMP tests Summary: Recent changes to clang-format improved the handling of OpenMP pragmas. Clean up the existing libomptarget tests.	2022-10-19 08:57:27 -05:00
Joseph Huber	1af7541741	[Libomptarget] Fix missing semicolon in exports	2022-10-14 09:02:42 -05:00
Joseph Huber	619dced0fc	[Libomptarget] Don't use full names for exported plugin symbols Summary: This patch changes the `exports` file to export all `__tgt_rtl` functions. This is a better option as not each plugin implements all of these functions, furthermore any new functions added will be automatically included.	2022-10-14 08:57:57 -05:00
Slava Zakharin	88da0de14f	Revert "[Libomp] Do not error on undefined version script symbols" This reverts commit `096f93e73d`. Revert "[Libomptarget] Make the plugins ignore undefined exported symbols" This reverts commit `3f62314c23`. Revert "[LLD] Enable --no-undefined-version by default." This reverts commit `7ec8b0d162`. Three commits are reverted because the omp build currently fails with GNU ld. See discussion here: https://reviews.llvm.org/rG096f93e73dc3	2022-10-13 14:12:07 -07:00
Joseph Huber	3f62314c23	[Libomptarget] Make the plugins ignore undefined exported symbols Summary: Recent changes made the default behaviour to error when given an undefined symbol in a version script. A previous patch fixed this for `libomptarget` by removing the single undefined symbol. However, the plugins are expected to only define a subset of the available functions so we shouldn't treat it as an error. This patch updates the build flags to work appropriately.	2022-10-13 08:13:03 -05:00
Joseph Huber	e801e8f3e7	[Libomptarget] Remove undefined 'omp_get_interop_rc_desc' symbol from exports list Summary: A recent patch made undefined symbols in version scripts cause errors by default. The `omp_get_interop_rc_desc` function is declared but not defined, so it is undefined in the final link unit. This patch removes it from the exports list, it should be added back in when actually defined and used.	2022-10-13 07:41:14 -05:00
Ye Luo	053e894106	[DeviceRTL] CMake fix using target-level dependency File-level dependency should not be used on files generated during the build. The next command may execute before the generating command finishes writing the file. Use add_custom_target and use target-level dependency. Differential Revision: https://reviews.llvm.org/D135630	2022-10-10 21:23:58 -05:00
Shilei Tian	395d261de7	[NFC] Remove trailing white space in openmp/libomptarget/src/CMakeLists.txt	2022-10-07 13:42:31 -04:00
Joseph Huber	defe072010	[Libomptarget] Remove debug definitions from DeviceRTL's CMake These debugging definitions are no longer used in the new runtime. The old runtime has been removed since Clang-14 so we can safely get rid of these leftover variables. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D135452	2022-10-07 11:31:18 -05:00
Joseph Huber	1bddb0fc23	[Libomptarget] Clean up DeviceRTL CMake and remove unused flags Summary: This patch just cleans up the unused flags in the DeviceRTL. These should no longer be necessary or are redundant. Also add the extract tool and packager to the check and error message if not found. This will make it easier to tell if they are not present.	2022-10-07 10:09:48 -05:00
Ye Luo	deba92d6c2	[DeviceRTL] Fix a CMake multi-step compilation dependency issue caused by `9223315903`	2022-10-06 19:07:39 -05:00
Shilei Tian	9dd0476293	[OpenMP][DeviceRTL] Fix build issue	2022-10-06 16:21:51 -04:00
Shilei Tian	32dc48094b	[OpenMP][DeviceRTL] Fix an issue that thread array might be corrupted The shared memory stack in the device runtime assumes no intertwined uses. D135037 breaks the assumption, potentially causing shared stack corruption. This patch moves the thread array to heap memory. Since it is already the slow path, it doesn't matter that much anyway. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D135391	2022-10-06 16:13:33 -04:00
Joseph Huber	9223315903	[DeviceRTL] Allow IsSPMDMode to be optimized out in LTO mode A previous patch merged the static and bitcode versions of the deviceRTL. We previously used the static library's separate compilation to set a special flag that kept `IsSPMDMode` out of the used list, where it would otherwise be prevented from being optimized out. When they were merged we could no longer do this separate compilation that allowed users of LTO to get more optimal code. This patch rearranges the code. The `IsSPMDMode` global is now transitively used by its inclusion in the changed `__keep_alive` function. This allows us to then manually delete the `__keep_alive` function from the module when building the static library via `llvm-extract`. The result is that the bitcode library will correctly maintain the needed shared state, while the static library will be able to internalize it and optimize it out. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D135280	2022-10-05 14:40:01 -05:00
Johannes Doerfert	a3a741c0bb	[OpenMP][FIX] Update device API to match recent changes	2022-10-05 08:07:38 -07:00
Johannes Doerfert	f8ee045c6d	[OpenMP] Eliminate the ThreadStates array in favor of indirection If we have thread states, the program is going to be rather slow. If we don't, we want to avoid wasting shared memory. This patch introduces a slight penalty (malloc + indirection) for the slow path and reduces resource usage for the fast path. Differential Revision: https://reviews.llvm.org/D135037	2022-10-04 20:27:34 -07:00
Johannes Doerfert	b113965073	[OpenMP] Introduce more atomic operations into the runtime We should use OpenMP atomics but they don't take variable orderings. Maybe we should expose all of this in the header but that solves only part of the problem anyway. Differential Revision: https://reviews.llvm.org/D135036	2022-10-04 20:20:55 -07:00
Johannes Doerfert	f85c1f3b7c	[OpenMP] Replace __ATOMIC_XYZ with atomic::xyz for style Also fixes one ordering argument not used. Differential Revision: https://reviews.llvm.org/D135035	2022-10-04 19:43:30 -07:00
Johannes Doerfert	abbc3fa17b	[OpenMP] Replace pointer comparison with `isSharedMemPtr` check The pointer comparison was causing confusion for capture tracking, let's avoid confusion. Differential Revision: https://reviews.llvm.org/D135160	2022-10-04 19:24:22 -07:00
Jennifer Yu	30cc712eb6	[Clang][OpenMP] Fix run time crash when use_device_addr is used. This is a data mapping ordering problem. According to the OpenMP spec: "If one or more map clauses are present, the list item conversions that are performed for any use_device_ptr or use_device_addr clause occur after all variables are mapped on entry to the region according to those map clauses." The change is to put the mapping data for use_device_addr at the end of the data mapping array. Differential Revision: https://reviews.llvm.org/D134556	2022-09-27 11:53:57 -07:00
Dan Palermo	db021abf33	[OpenMP][AMDGPU] Enable OpenMP device runtime build for gfx110[0123] Add OpenMP device runtime build support for the gfx1100, gfx1101, gfx1102, and gfx1103 targets. Differential Revision: https://reviews.llvm.org/D134465	2022-09-23 01:49:51 +00:00
Jennifer Yu	48ffd40ba2	[Clang][OpenMP] Codegen generation for has_device_addr clause. This patch adds codegen support for the has_device_addr clause. It uses the same logic as is_device_ptr, but passes &var instead of a pointer to var to the kernel. Differential Revision: https://reviews.llvm.org/D134268	2022-09-20 21:12:30 -07:00
Ron Lieberman	d5b5289561	Revert `684f76643` [Clang][OpenMP] Codegen generation for has_device_addr clause. Breaks amdgpu buildbot.	2022-09-20 01:37:27 +00:00
Jennifer Yu	a1df13ecd6	Fix test case which is not working for AMDGPU. This is for the change of Differential Revision: https://reviews.llvm.org/D134186	2022-09-19 17:07:01 -07:00
Jennifer Yu	684f766431	[Clang][OpenMP] Codegen generation for has_device_addr clause. Summary: This patch adds codegen support for the has_device_addr clause. It uses the same logic as is_device_ptr. Differential Revision: https://reviews.llvm.org/D134186	2022-09-19 16:14:57 -07:00
Joseph Huber	292cb114b0	[Libomptarget] Revert changes to AMDGPU plugin destructors These patches exposed a lot of problems in the AMD toolchain. Rather than keep it broken we should revert it to its old semi-functional state. This will prevent us from using device destructors but should remove some new bugs. In the future this interface should be changed once these problems are addressed more correctly. This reverts commit `ed0f218115`. This reverts commit `2b7203a359`. Fixes #57536 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133997	2022-09-16 06:55:51 -05:00
Joseph Huber	4b004a0b83	[Libomptarget] Embed bitcode library in static library instead. This patch changes the CMake to instead embed the already generated LLVM-IR bitcode library into an object file to create the static library. This is different from the previous method which generated them separately. This will make the build faster and allow us to perform the same internalization into a single library we do with the bitcode library. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133952	2022-09-15 14:05:18 -05:00
Dhruva Chakrabarti	839ac62c50	Revert "[OpenMP] Codegen aggregate for outlined function captures" This reverts commit `7539e9cf81`.	2022-09-15 03:08:46 +00:00
Giorgis Georgakoudis	7539e9cf81	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6, ABataev Differential Revision: https://reviews.llvm.org/D102107	2022-09-15 00:54:05 +00:00
Joseph Huber	23bc343855	[Libomptarget] Change device free routines to accept the allocation kind Previous support for device memory allocators used a single free routine and did not provide the original kind of the allocation. This is problematic as some of these memory types required different handling. Previously this was worked around using a map in runtime to record the original kind of each pointer. Instead, this patch introduces new free routines similar to the existing allocation routines. This allows us to avoid a map traversal every time we free a device pointer. The only interfaces defined by the standard are `omp_target_alloc` and `omp_target_free`, these do not take a kind as `omp_alloc` does. The standard dictates the following: "The omp_target_alloc routine returns a device pointer that references the device address of a storage location of size bytes. The storage location is dynamically allocated in the device data environment of the device specified by device_num." Which suggests that these routines only allocate the default device memory for the kind. So this has been changed to reflect this. This change is somewhat breaking if users were using `omp_target_free` as previously shown in the tests. Reviewed By: JonChesterfield, tianshilei1992 Differential Revision: https://reviews.llvm.org/D133053	2022-09-14 12:14:07 -05:00
Joseph Huber	c2acb1e5d3	[Libomptarget][NFC] Remove unused variable	2022-09-09 15:26:02 -05:00
Joseph Huber	86587f2891	[Libomptarget] Fix compiling with asserts using the bitcode library Summary: A previous patch introduces an `exports` file which contains all the symbol names that are not internalized in the bitcode library. This is done to reduce the size of the bitcode library and only export needed functions. This export file must contain all the functions expected to be called from the device. Since its introduction the `__assert_fail` function used to be provided but was mistakenly not included. This patch adds it. Fixes #57656 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133594	2022-09-09 15:25:24 -05:00
Joseph Huber	83fcba82cc	[Libomptarget] Add proper LLVM libraries now that the AMDGPU plugin uses them Summary: The AMDGPU and CUDA plugins now rely on the Object and Support libraries. This patch adds them explicitly rather than hoping that they share the symbols loaded from the standard `libomptarget`.	2022-09-09 10:33:26 -05:00
Joseph Huber	6e8d93e5c2	[Libomptarget] Implement OpenMP 5.2 semantics for device pointers In OpenMP 5.2, §5.8.6, page 160 line 32-33, when a device pointer allocated by omp_target_alloc has implicitly been included on a target construct as a zero-length array, the pointer initialisation should not find a matching mapped list item, and so should retain its value as a firstprivate variable. Previously, we would return a null pointer if the list item was not found. This patch updates the map handling to the OpenMP 5.2 semantics. Reviewed By: jdoerfert, ye-luo Differential Revision: https://reviews.llvm.org/D133447	2022-09-07 17:01:14 -05:00
Joseph Huber	8d2a447bf9	[Libomptarget] Remove leftover ELF header from x86 plugin Summary: We removed the linking support for `gelf.h` in a previous patch. This header was incorrectly leftover causing build problems on some systems.	2022-09-07 13:41:40 -05:00
Joseph Huber	300155911a	[Libomptarget] Replace libelf with LLVM's Elf libraries This patch replaces the dependency on `libelf` with LLVM's ELF support. With this patch the user no longer needs to have `libelf` on their system to build and configure OpenMP offloading. The replacement is mostly mechanical, with the exception of the hash table support which was added in D131309. Depends on D131309 Reviewed By: JonChesterfield, saiislam Differential Revision: https://reviews.llvm.org/D131401	2022-09-07 12:38:51 -05:00
Joseph Huber	894531f59b	[Libomptarget] Add utility functions for loading an ELF symbol by name The `SHT_HASH` sections in an ELF are used to look up a symbol in the symbol table using a symbol's name. This is done by obtaining the `SHT_HASH` section and using its `sh_link` attribute to access the associated symbol table, from which we can access the string table containing the associated name. We can then search for the symbol using the hash of the name and the buckets and chains in the hash table itself. This patch adds utility functions that allow us to look up a symbol in an ELF file by name. It will first attempt to look through the hash tables, and then search the section tables manually if that fails. This allows us to pull out constants necessary for setting up offloading without first loading the object. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131309	2022-09-07 12:38:50 -05:00
Joseph Huber	31f434ee3b	[Libomptarget][NFC] Clean up CUDA plugin and address warnings	2022-09-06 15:28:57 -05:00
Ye Luo	0e68f483d4	[OpenMP] Add an offload test involving std::complex Taken from the https://github.com/llvm/llvm-project/issues/57064 reproducer. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133258	2022-09-03 13:28:11 -05:00
Joseph Huber	f8b1f93f26	[libomptarget] Enable the device allocator for AMDGPU This patch adds support for the device memory type, this is currently equivalent to the default type so it should be treated as the same. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D133128	2022-09-01 12:40:59 -05:00
Joseph Huber	56cf3d626f	[Libomptarget] Remove old workaround for GCC 5,6 from libomptarget Some code previously needed the `used` attribute to prevent the GCC compiler versions 5 and 6 from removing it. This is no longer required as the minimum supported GCC version for LLVM 16 is >=7.1.0. Reviewed By: JonChesterfield, vzakhari Differential Revision: https://reviews.llvm.org/D132976	2022-08-30 19:13:48 -05:00
Joseph Huber	52556c3c0f	[Libomptarget] Make unified shared memory test unsupported on AMDGPU This test is an expected failure on AMDGPU. The expected failure is a GPU memory failure, which will typically result in the device totally failing. This isn't an issue for some GPU configurations that do not use the offloading device to also drive the display server. However, if the main GPU is used for testing it will reliably result in the user's display becoming unresponsive. This makes it difficult to run the GPU offloading tests on many systems. This patch simply makes this test unsupported so it no longer runs and freezes my computer when using `ninja check-openmp`. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D132891	2022-08-30 12:14:25 -05:00
Joseph Huber	dc400f8612	[libomptarget] Deprecate old method for setting the tripcount Previously, the tripcount was set by a push call. We moved away from this with the new interface that added the tripcount to the kernel arguments struct, but kept around the old interface for legacy purposes for the LLVM 15 release. This patch removes the support for the legacy method. This removes the support for the old method, but does not break backwards compatibility. This will result in applications using the old interface being slower when run on the device. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D132885	2022-08-29 20:08:26 -05:00
Joseph Huber	04ae35e592	[libomptarget] Always enable time tracing in libomptarget Previously time tracing features were hidden behind an optional CMake option. This was because `libomptarget` was not based on the LLVM libraries at that time. Now that `libomptarget` is an LLVM library we should be able to freely use the `LLVMSupport` library whenever we want and do not need to guard it in this way. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D132852	2022-08-29 14:49:03 -05:00
Joseph Huber	22d71e72c9	[Libomptarget] Do not check for valid binaries twice. The only RTLs that get added to the `UsedRTLs` list have already been checked if they were valid binaries. We shouldn't need to do this again when we unregister all the used binaries as they wouldn't have been used if they were invalid anyway. Let me know if I'm incorrect in this assumption. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D131443	2022-08-29 08:36:50 -05:00
Joseph Huber	47166968db	[OpenMP] Deprecate the old driver for OpenMP offloading Recently OpenMP has transitioned to using the "new" driver which primarily merges the device and host linking phases into a single wrapper that handles both at the same time. This replaced a few tools that were only used for OpenMP offloading, such as the `clang-offload-wrapper` and `clang-nvlink-wrapper`. The new driver carries some marked benefits compared to the old driver that is now being deprecated. Things like device-side LTO, static library support, and more compatible tooling. As such, we should be able to completely deprecate the old driver, at least for OpenMP. The old driver support will still exist for CUDA and HIP, although both of these can currently be compiled on Linux with `--offload-new-driver` to use the new method. Note that this does not deprecate the `clang-offload-bundler`, although it is unused by OpenMP now, it is still used by the HIP toolchain both as their device binary format and object format. When I proposed deprecating this code I heard some vendors voice concerns about needing to update their code in their fork. They should be able to just revert this commit if it lands. Reviewed By: jdoerfert, MaskRay, ye-luo Differential Revision: https://reviews.llvm.org/D130020	2022-08-26 13:47:09 -05:00
Jon Chesterfield	ffabe997a5	[openmp][amdgpu] Implement target_alloc_host as fine grain HSA memory The cuda plugin maps TARGET_ALLOC_HOST onto cuMemAllocHost which is page locked host memory. Fine grain HSA memory is not necessarily page locked but has the same read/write from host or device semantics. The cuda plugin does this per-gpu and this patch makes it accessible from any gpu, but it can be locked down to match the cuda behaviour if preferred. Enabling tests requires an equivalent to // RUN: %libomptarget-compile-run-and-check-nvptx64-nvidia-cuda for amdgpu which doesn't seem to be in use yet. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D132660	2022-08-25 16:27:52 +01:00
Ye Luo	322ea53144	[libomptarget][amdgpu] Enable tests whenever possible. if(TARGET amdgpu-arch) doesn't work when LLVM_ENABLE_PROJECTS=openmp because the openmp subdirectory is processed before the clang subdirectory. Adopt the same logic for enabling tests as the CUDA plugin. Differential Revision: https://reviews.llvm.org/D132579	2022-08-24 14:33:28 -05:00
Joseph Huber	540a13652f	[Libomptarget] Replace use of `dlopen` with LLVM's dynamic library support This patch replaces uses of `dlopen` and `dlsym` with LLVM's support with `loadPermanentLibrary` and `getSymbolAddress`. This allows us to remove the explicit dependency on the `dl` libraries in the CMake. This removes another explicit dependency and solves an issue encountered while building on Windows platforms. The one downside to this is that the LLVM library does not currently support `dlclose` functionality, but this could be added in the future. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131507	2022-08-24 10:46:21 -05:00
Joseph Huber	30efb459e0	[Libomptarget] Remove use of ELF link_address in x86_64 plugin We use the offloading entries array to determine the relative names and addresses of device-side kernel functions. The x86_64 plugin previously derived the device-side entry table by first identifying the `omp_offloading_entries` section offset in the loaded ELF. Then we would use the base offset of the loaded dynamic library to identify the entries array within the loaded image. This relied on some more unconventional methods which prevented us from using the LLVM dynamic library loader for this plugin. This patch simplifies this by instead copying the host-side entry and replacing its address with the device-side address looked up through `dlsym`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131516	2022-08-24 10:46:20 -05:00

1045 Commits
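
Commit 894531f59b in the listing describes finding a symbol by name through an ELF `SHT_HASH` section: hash the name, pick a bucket, then walk the chain until the name matches. The following is a minimal C++ sketch of that lookup scheme, not the code from that patch. `ToyHashTable`, `buildToyTable`, and `lookupSymbol` are hypothetical names chosen for illustration, and the table lives in plain vectors rather than being parsed from an ELF file. Only the hash function follows the standard System V ELF definition.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Classic System V ELF hash of a symbol name, as used by SHT_HASH sections.
inline uint32_t sysvElfHash(const std::string &Name) {
  uint32_t H = 0;
  for (unsigned char C : Name) {
    H = (H << 4) + C;
    uint32_t G = H & 0xf0000000u;
    if (G)
      H ^= G >> 24;
    H &= ~G;
  }
  return H;
}

// Hypothetical in-memory stand-in for a hash section and its symbol names.
// Symbol index 0 is reserved (the undefined symbol) and terminates chains.
struct ToyHashTable {
  std::vector<std::string> Names; // Names[I] is the name of symbol I.
  std::vector<uint32_t> Buckets;  // First symbol index in each bucket.
  std::vector<uint32_t> Chains;   // Next symbol index in the same bucket.
};

// Build the bucket and chain arrays for a list of symbol names.
inline ToyHashTable buildToyTable(const std::vector<std::string> &Symbols,
                                  uint32_t NumBuckets) {
  assert(NumBuckets > 0 && "need at least one bucket");
  ToyHashTable T;
  T.Names.push_back(""); // Reserved symbol 0.
  T.Names.insert(T.Names.end(), Symbols.begin(), Symbols.end());
  T.Buckets.assign(NumBuckets, 0);
  T.Chains.assign(T.Names.size(), 0);
  for (uint32_t I = 1; I < T.Names.size(); ++I) {
    uint32_t B = sysvElfHash(T.Names[I]) % NumBuckets;
    T.Chains[I] = T.Buckets[B]; // Link the previous head behind this symbol.
    T.Buckets[B] = I;
  }
  return T;
}

// Return the symbol index for Name, or 0 if it is not in the table.
inline uint32_t lookupSymbol(const ToyHashTable &T, const std::string &Name) {
  if (T.Buckets.empty())
    return 0;
  uint32_t B = sysvElfHash(Name) % T.Buckets.size();
  for (uint32_t I = T.Buckets[B]; I != 0; I = T.Chains[I])
    if (T.Names[I] == Name)
      return I;
  return 0;
}
```

In a real ELF file the names would come from the string table reached through the hash section's `sh_link`, as the commit message explains; the sketch replaces that indirection with a vector of strings to keep the bucket and chain walk visible.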