llvm-project

Commit Graph

Author	SHA1	Message	Date
Peyton, Jonathan L	4457565757	[OpenMP] Implement GOMP task reductions Implement the remaining GOMP_* functions to support task reductions in taskgroup, parallel, loop, and taskloop constructs. The unused mem argument to many of the work-sharing constructs has to do with the scan() directive/inscan() modifier. If mem is set, each function will call KMP_FATAL() and tell the user scan/inscan is unsupported. The GOMP reduction implementation is kept separate from our implementation because of how GOMP presents reduction data and computes the reductions. GOMP expects the privatized copies to be present even after a #pragma omp parallel reduction(task:...) region has ended so the data is stored inside GOMP's uintptr_t* data pseudo-structure. This style is tightly coupled with GCC compiler codegen. There also aren't any init(), combiner(), fini() functions in GOMP's codegen so the two implementations were too disparate to try to wrap GOMP's around our own. Differential Revision: https://reviews.llvm.org/D98806	2021-04-16 16:36:31 -05:00
Shilei Tian	2df65f87c1	[OpenMP] Fixed a crash in hidden helper thread It is reported that after enabling hidden helper thread, the program can sometimes hit the assertion `new_gtid < __kmp_threads_capacity`. The root cause is explained as follows. Let's say the default `__kmp_threads_capacity` is `N`. If hidden helper thread is enabled, `__kmp_threads_capacity` will be offset to `N+8` by default. If the number of threads we need exceeds `N+8`, e.g. via `num_threads` clause, we need to expand `__kmp_threads`. In `__kmp_expand_threads`, the expansion starts from `__kmp_threads_capacity` and repeatedly doubles it until the new capacity meets the requirement. Let's assume the new requirement is `Y`. If `Y` happens to meet the constraint `(N+8)*2^X=Y` where `X` is the number of iterations, the new capacity is not enough because we have 8 slots for hidden helper threads. Here is an example. ``` #include <vector> int main(int argc, char *argv[]) { constexpr const size_t N = 1344; std::vector<int> data(N); #pragma omp parallel for for (unsigned i = 0; i < N; ++i) { data[i] = i; } #pragma omp parallel for num_threads(N) for (unsigned i = 0; i < N; ++i) { data[i] += i; } return 0; } ``` My CPU is 20C40T, so `__kmp_threads_capacity` is 160. After offset, `__kmp_threads_capacity` becomes 168. `1344 = (160+8)*2^3`, so the assertion is hit. Reviewed By: protze.joachim Differential Revision: https://reviews.llvm.org/D98838	2021-03-18 18:25:36 -04:00
AndreyChurbanov	dab5d6c2eb	[OpenMP] fix race condition in test	2021-02-18 02:27:49 +03:00
Shilei Tian	3c31b78455	[OpenMP] Fixed an issue that taskwait doesn't work on detachable task D77609 mistakenly changed the behavior of waiting on detachable tasks so that a detachable task is not waited for, based on https://lists.llvm.org/pipermail/openmp-dev/2021-February/003836.html. This patch fixes it. Thanks to Raúl for the report. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95798	2021-02-03 13:12:43 -05:00
Shilei Tian	9d64275ae0	[OpenMP] Added the support for hidden helper task in RTL The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks. We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. In the wrapped function for this team, the main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. The other worker threads do nothing. The main thread needs to wait there because, in the current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team, which is not what we want. Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8. Here are some open issues to be discussed: 1. The main thread goes to sleep when the initialization is finished. As Andrey mentioned, we might need it to be awakened from time to time to do some work. What kind of update/check should be put here? Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D77609	2021-01-25 22:16:17 -05:00
Shilei Tian	9bf843bdc8	Revert "[OpenMP] Added the support for hidden helper task in RTL" This reverts commit `ed939f853d`.	2021-01-18 06:57:52 -05:00
Shilei Tian	ed939f853d	[OpenMP] Added the support for hidden helper task in RTL The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks. We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. In the wrapped function for this team, the main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. The other worker threads do nothing. The main thread needs to wait there because, in the current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team, which is not what we want. Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8. Here are some open issues to be discussed: 1. The main thread goes to sleep when the initialization is finished. As Andrey mentioned, we might need it to be awakened from time to time to do some work. What kind of update/check should be put here? Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D77609	2021-01-16 14:13:35 -05:00
Nawrin Sultana	540007b427	[OpenMP] Add strict mode in num_tasks and grainsize This patch adds new API __kmpc_taskloop_5 to accommodate the strict modifier (introduced in OpenMP 5.1) in the num_tasks and grainsize clauses. Differential Revision: https://reviews.llvm.org/D92352	2020-12-09 16:46:30 -06:00
Peyton, Jonathan L	ee1c04a926	[OpenMP] Fix if0 task with dependencies in the runtime The current GOMP interface for serialized tasks does not take into account task dependencies. Add the check and wait for dependencies. Fixes: https://bugs.llvm.org/show_bug.cgi?id=46573 Differential Revision: https://reviews.llvm.org/D87271	2020-09-24 09:47:53 -05:00
Peyton, Jonathan L	9089b4a5c5	[OpenMP] Introduce GOMP taskwait depend in the runtime This change introduces the GOMP_taskwait_depend() function. It implements the OpenMP 5.0 feature of #pragma omp taskwait with depend() clause by wrapping around __kmpc_omp_wait_deps(). Differential Revision: https://reviews.llvm.org/D87269	2020-09-24 09:45:14 -05:00
Peyton, Jonathan L	72ada5ae6c	[OpenMP] Introduce GOMP mutexinoutset in the runtime Encapsulate GOMP task dependencies in separate class and introduce the new mutexinoutset dependency type. This separate class allows future GOMP task APIs easier access to the task dependency functionality and better ability to propagate new dependency types to all existing GOMP task APIs which use task dependencies. Differential Revision: https://reviews.llvm.org/D87267	2020-09-24 09:45:13 -05:00
Joachim Protze	f226171429	[OpenMP][Tests][NFC] Mark compatibility with older versions of clang	2020-07-20 13:53:29 +02:00
Joachim Protze	8289f2891e	[OpenMP][Tests] Flag compatibility of OpenMP runtime tests with GCC versions If the compilation fails, the test is marked as unsupported. -> This will never change for a specific version of gcc If the linking fails, the test is marked as expected to fail. -> This might change as LLVM/OpenMP implements the missing GOMP interface function Reviewed by: Hahnfeld Differential Revision: https://reviews.llvm.org/D83077	2020-07-05 22:49:54 +02:00
Joachim Protze	9e5aefc5f9	[OpenMP][Tests] fix data race in an OpenMP runtime test Reviewed by: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D81804	2020-06-15 18:48:35 +02:00
AndreyChurbanov	5e111c5df8	[openmp] Fixed taskloop recursive splitting so that taskloop tasks have same parent tasks. Differential Revision: https://reviews.llvm.org/D80577	2020-06-01 17:51:02 +03:00
AndreyChurbanov	57d8b8d6f0	[openmp] Fixed hang if detached task was serialized. The patch fixes https://bugs.llvm.org/show_bug.cgi?id=45904. Differential Revision: https://reviews.llvm.org/D79944	2020-05-18 15:32:13 +03:00
Kazuaki Ishizaki	4201679110	[OpenMP] NFC: Fix trivial typo Differential Revision: https://reviews.llvm.org/D77430	2020-04-04 12:06:54 +09:00
Vitaly Buka	c9ae3c5e10	[openmp] Disable tests flaky on Debian https://bugs.llvm.org/show_bug.cgi?id=45397	2020-04-01 21:58:05 -07:00
Alexey Bataev	0fca766458	[OPENMP50]Fix PR45117: Orphaned task reduction should be allowed. Add support for orphaned task reductions.	2020-03-27 17:47:30 -04:00
AndreyChurbanov	ae044467ed	[openmp][runtime] Fixed hang for explicit task inside a taskloop. Added missed initialization of td_last_tied field for taskloop tasks. Differential Revision: https://reviews.llvm.org/D75673	2020-03-23 20:07:30 +03:00
Kelvin Li	ed5fe64581	[OpenMP] NFC: Fix trivial typos in comments Submitted by: kiszk Differential Revision: https://reviews.llvm.org/D72171	2020-01-03 22:03:42 -05:00
Michał Górny	6f8ee2c575	[openmp] [test] Skip one more test that kills NetBSD buildbot	2019-11-07 17:29:57 +01:00
Andrey Churbanov	de44f434e8	fixed test: eliminated race condition which might cause deadlock llvm-svn: 372887	2019-09-25 15:25:52 +00:00
Andrey Churbanov	a1639b9bba	Enable tasks dependencies hashmaps resizing. Patch by viroulep (Philippe Virouleau) Differential Revision: https://reviews.llvm.org/D67447 llvm-svn: 372879	2019-09-25 14:40:19 +00:00
Jonathan Peyton	aa5cdafa40	Remove REQUIRES OMP spec version within lit tests This is a follow up patch to D64534 (r365963) which removed all OMP spec versioning within the OpenMP runtime codebase. This patch removes REQUIRES: openmp-x.y lines from lit tests. llvm-svn: 366341	2019-07-17 15:41:00 +00:00
Jonathan Peyton	e4b4f994d2	[OpenMP] Remove OMP spec versioning Remove all older OMP spec versioning from the runtime and build system. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D64534 llvm-svn: 365963	2019-07-12 21:45:36 +00:00
Andrey Churbanov	a23806e67a	Create a runtime option to disable task throttling. Patch by viroulep (Philippe Virouleau) Differential Revision: https://reviews.llvm.org/D63196 llvm-svn: 364934	2019-07-02 15:10:20 +00:00
Andrey Churbanov	405037c4e6	New implementation of OpenMP 5.0 detached tasks. Patch by Alex Duran Differential Revision: https://reviews.llvm.org/D62485 llvm-svn: 363799	2019-06-19 13:23:28 +00:00
Michal Gorny	a815cbb010	[openmp] [test] Skip kernel-breaking tests on NetBSD The omp_taskloop_num_tasks and omp_taskwait have deadlooped on the NetBSD buildbot previously, practically hanging the host running it. Disable them until we can find a good solution, or make the kernel less fragile. llvm-svn: 361825	2019-05-28 14:10:47 +00:00
Jonathan Peyton	a8426ac8c2	[OpenMP] Implement task modifier for reduction clause Implemented task modifier in two versions - one without taking into account omp_orig variable (the omp_orig still can be processed by compiler without help of the library, but each reduction object will need separate initializer with global access to omp_orig), another with omp_orig variable included into interface (single initializer can be used for multiple reduction objects of the same type). Second version can be used when the omp_orig is not globally accessible, or to optimize code in case of multiple reduction objects of the same type. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D60976 llvm-svn: 359710	2019-05-01 17:54:01 +00:00
Dimitry Andric	956168c802	Ensure correct pthread flags and libraries are used On most platforms, certain compiler and linker flags have to be passed when using pthreads, otherwise linking against libomp.so might fail with undefined references to several pthread functions. Use CMake's `find_package(Threads)` to determine these for standalone builds, or take them (and optionally modify them) from the top-level LLVM cmake files. Also, on FreeBSD, ensure that libomp.so is linked against libm.so, similar to NetBSD. Adjust test cases with hardcoded `-lpthread` flag to use the common build flags, which should now have the required pthread flags. Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim, Hahnfeld Reviewed By: Hahnfeld Subscribers: AndreyChurbanov, tra, EricWF, Hahnfeld, jfb, jdoerfert, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D59451 llvm-svn: 357618	2019-04-03 18:11:36 +00:00
Roman Lebedev	781a0896b0	[OpenMP] Fixes for LIBOMP_OMP_VERSION=45/40 Summary: I have discovered this because i wanted to experiment with building static libomp (with openmp-4.0 support only) for debugging purposes. There are three kinds of problems here: 1. `__kmp_compare_and_store_acq()` simply does not exist. It was added in D47903 by @jlpeyton. I'm guessing `__kmp_atomic_compare_store_acq()` was meant. 2. In `__kmp_is_ticket_lock_initialized()`, `lck->lk.initialized` is `std::atomic<bool>`, while `lck` is `kmp_ticket_lock_t *`. Naturally, they can't be equality-compared. Either, it should return the value read from `lck->lk.initialized`, or do what `__kmp_is_queuing_lock_initialized()` does, compare the passed pointer with the field in the struct pointed by the pointer. I think the latter is correct-er choice here. 3. Tests were not versioned. They assume that `LIBOMP_OMP_VERSION` is at the latest version. This does not touch LIBOMP_OMP_VERSION=30. That is still broken. Reviewers: jlpeyton, Hahnfeld, AndreyChurbanov Reviewed By: AndreyChurbanov Subscribers: guansong, jfb, openmp-commits, jlpeyton Tags: #openmp Differential Revision: https://reviews.llvm.org/D55496 llvm-svn: 349260	2018-12-15 09:23:39 +00:00
Andrey Churbanov	74f98554f9	Fix for bugzilla https://bugs.llvm.org/show_bug.cgi?id=39970 Broken tests fixed Differential Revision: https://reviews.llvm.org/D55598 llvm-svn: 349017	2018-12-13 10:04:10 +00:00
Andrey Churbanov	c334434550	Implementation of OpenMP 5.0 mutexinoutset task dependency type. Differential Revision: https://reviews.llvm.org/D53380 llvm-svn: 346307	2018-11-07 12:19:57 +00:00
Jonas Hahnfeld	5b57eb4b09	[tests] Add annotations for taskloop features Only supported since GCC 6 and Intel 17.0. However GCC 6.3.0 is crashing on two of the tests, so disable them as well... Differential Revision: https://reviews.llvm.org/D50085 llvm-svn: 338720	2018-08-02 14:34:03 +00:00
Jonas Hahnfeld	51fc3cc628	[test] Convert test for PR36720 to c89 GCC 4.8.5 defaults to this old C standard. I think we should make the tests pass a newer -std=c99|c11 but that's too intrusive for now... Differential Revision: https://reviews.llvm.org/D50084 llvm-svn: 338490	2018-08-01 06:26:55 +00:00
Jonathan Peyton	28226e7d64	[OpenMP] Fix tasking + parallel bug From the bug report, the runtime needs to initialize the nproc variables (inside middle init) for each root when the task is encountered, otherwise, a segfault can occur. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36720 Differential Revision: https://reviews.llvm.org/D49996 llvm-svn: 338313	2018-07-30 21:47:56 +00:00
Jonathan Peyton	27a677fc95	Introduce GOMP_taskloop API This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our version symbols. Being a wrapper around __kmpc_taskloop, the function creates a task with the loop bounds properly nested in the shareds so that the GOMP task thunk will work properly. Also, the firstprivate copy constructors are properly handled using the __kmp_gomp_task_dup() auxiliary function. Currently, only linear spawning of tasks is supported for the GOMP_taskloop interface. Differential Revision: https://reviews.llvm.org/D45327 llvm-svn: 330282	2018-04-18 19:23:54 +00:00
Jonas Hahnfeld	fc473dee98	[CMake] Detect information about test compiler Perform a nested CMake invocation to avoid writing our own parser for compiler versions when we are not testing the in-tree compiler. Use the extracted information to mark a test as unsupported that hangs with Clang prior to version 4.0.1 and restrict tests for libomptarget to Clang version 6.0.0 and later. Differential Revision: https://reviews.llvm.org/D40083 llvm-svn: 319448	2017-11-30 17:08:31 +00:00
Jonathan Peyton	16a05bca9c	Add C++ support for testcases Patch by Simon Convent Differential Revision: https://reviews.llvm.org/D38878 llvm-svn: 316230	2017-10-20 19:42:32 +00:00
Jonas Hahnfeld	5872f1e97f	[test] Fix uninitialized memory in omp_taskloop_grainsize.c result was never initialized to zero which sometimes failed the test. llvm-svn: 314513	2017-09-29 13:53:03 +00:00
Jonathan Peyton	1c50ee64a2	Fix failing taskloop tests by omitting gcc We do not have GOMP interface support for taskloop yet. llvm-svn: 308351	2017-07-18 20:16:25 +00:00
Jonathan Peyton	93e17cfe6c	Add recursive task scheduling strategy to taskloop implementation Summary: Taskloop implementation is extended by using recursive task scheduling. Envirable KMP_TASKLOOP_MIN_TASKS added as a manual threshold for the user to switch from recursive to linear tasks scheduling. Details: * The calculations for the loop parameters are moved from __kmp_taskloop_linear upper level * Initial calculation is done in the __kmpc_taskloop, further range splitting is done in the __kmp_taskloop_recur. * Added threshold to switch from recursive to linear tasks scheduling; * One half of split range is scheduled as an internal task which just moves sub-range parameters to the stealing thread that continues recursive scheduling (if number of tasks still enough), the other half is processed recursively; * Internal task duplication routine fixed to assign parent task, that was not needed when all tasks were scheduled by same thread, but is needed now. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D35273 llvm-svn: 308338	2017-07-18 18:50:13 +00:00
Andrey Churbanov	72ba210916	Run-time library part of OpenMP 5.0 task reduction implementation. Added test kmp_task_reduction_nest.cpp which has an example of possible compiler codegen. Differential Revision: https://reviews.llvm.org/D29600 llvm-svn: 295343	2017-02-16 17:49:49 +00:00
Jonas Hahnfeld	479088eefa	Correct wrong comment in bug_nested_proxy_task.c The nested proxy task does not have dependencies. llvm-svn: 293472	2017-01-30 09:51:02 +00:00
Jonas Hahnfeld	ad0c42e3a9	kmp_gsupport: Fix library initialization with taskgroup Differential Revision: https://reviews.llvm.org/D23259 llvm-svn: 278003	2016-08-08 13:23:08 +00:00
Jonas Hahnfeld	ca32babfa7	Mark tests with task dependencies as unsupported with GCC llvm-svn: 277996	2016-08-08 11:52:49 +00:00
Jonas Hahnfeld	bedc371c9d	Do not block on explicit task depending on proxy task Consider the following code: int dep; #pragma omp target nowait depend(out: dep) { sleep(1); } #pragma omp task depend(in: dep) { printf("Task with dependency\n"); } printf("Doing some work...\n"); In its current state the runtime will block on the second task and not continue execution. Differential Revision: https://reviews.llvm.org/D23116 llvm-svn: 277992	2016-08-08 10:08:14 +00:00
Jonas Hahnfeld	69f8511f8f	__kmp_free_task: Fix for serial explicit tasks producing proxy tasks Consider the following code which may be executed by a serial team: int dep; #pragma omp target nowait depend(out: dep) { sleep(1); } #pragma omp task depend(in: dep) { #pragma omp target nowait { sleep(1); } } Here the explicit task may not be freed until the nested proxy task has finished. The current code hasn't considered this and called __kmp_free_task anyway which triggered an assert because of remaining incomplete children: KMP_DEBUG_ASSERT( TCR_4(taskdata->td_incomplete_child_tasks) == 0 ); Differential Revision: https://reviews.llvm.org/D23115 llvm-svn: 277991	2016-08-08 10:08:07 +00:00
Jonas Hahnfeld	d1f4b8f6e8	Add test case for nested creation of tasks For discussion in D23115 llvm-svn: 277730	2016-08-04 14:55:56 +00:00
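
The capacity arithmetic described in commit 2df65f87c1 can be checked with a small model. This is only a sketch under stated assumptions: `expand_by_doubling`, `usable_slots`, and `kHiddenHelperSlots` are hypothetical names introduced here for illustration and are not the libomp implementation of `__kmp_expand_threads`; the only facts taken from the commit message are the doubling strategy, the 8 reserved hidden helper slots, and the numbers 160, 168, and 1344.

```cpp
#include <cstddef>

// Hypothetical, simplified model of the arithmetic in the commit message.
// It is NOT the libomp source; all names below are illustrative assumptions.

// Default number of hidden helper threads, per the commit message.
const std::size_t kHiddenHelperSlots = 8;

// Double `capacity` until it is at least `required`, mirroring the
// "repeatedly doubles it until the new capacity meets the requirement" step.
inline std::size_t expand_by_doubling(std::size_t capacity,
                                      std::size_t required) {
  while (capacity < required) {
    capacity *= 2;
  }
  return capacity;
}

// Slots left for regular threads once the hidden helper slots are set aside.
inline std::size_t usable_slots(std::size_t capacity) {
  return capacity - kHiddenHelperSlots;
}
```

With the numbers from the message, `expand_by_doubling(168, 1344)` stops at exactly 1344 after three doublings (168, 336, 672, 1344), yet `usable_slots(1344)` is 1336, which is fewer than the 1344 threads requested by `num_threads(N)`. That shortfall is the condition the commit message identifies as triggering the assertion.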


56 Commits