Commit Graph

278 Commits

Author SHA1 Message Date
Peyton, Jonathan L 4457565757 [OpenMP] Implement GOMP task reductions
Implement the remaining GOMP_* functions to support task reductions
in taskgroup, parallel, loop, and taskloop constructs.  The unused mem
argument to many of the work-sharing constructs has to do with the
scan() directive/ inscan() modifier.  If mem is set, each function
will call KMP_FATAL() and tell the user scan/inscan is unsupported.  The
GOMP reduction implementation is kept separate from our implementation
because of how GOMP presents reduction data and computes the reductions.
GOMP expects the privatized copies to be present even after a #pragma
omp parallel reduction(task:...) region has ended so the data is stored
inside GOMP's uintptr_t* data pseudo-structure.  This style is tightly
coupled with GCC compiler codegen.  There also aren't any init(),
combiner(), or fini() functions in GOMP's codegen, so the two
implementations were too disparate to try to wrap GOMP's around our own.

Differential Revision: https://reviews.llvm.org/D98806
2021-04-16 16:36:31 -05:00
Peyton, Jonathan L 5ebbb366c4 [OpenMP] Allow affinity to re-detect for child processes
The current atfork() handler for child processes does not reset
the affinity masks array, which prevents users from setting their own
affinity in child processes.

Differential Revision: https://reviews.llvm.org/D99218
2021-04-16 16:34:02 -05:00
Hansang Bae 77dc7b4653 [OpenMP] Fix printing routine for OMP_TOOL_VERBOSE_INIT
Also fixed typo in the verbose message.

Differential Revision: https://reviews.llvm.org/D100414
2021-04-14 07:55:26 -05:00
Shilei Tian 2df65f87c1 [OpenMP] Fixed a crash in hidden helper thread
It is reported that after enabling hidden helper thread, the program
can hit the assertion `new_gtid < __kmp_threads_capacity` sometimes. The root
cause is explained as follows. Let's say the default `__kmp_threads_capacity` is
`N`. If hidden helper thread is enabled, `__kmp_threads_capacity` will be offset
to `N+8` by default. If the number of threads we need exceeds `N+8`, e.g. via
`num_threads` clause, we need to expand `__kmp_threads`. In
`__kmp_expand_threads`, the expansion starts from `__kmp_threads_capacity` and
repeatedly doubles it until the new capacity meets the requirement. Let's
assume the new requirement is `Y`.  If `Y` happens to meet the constraint
`(N+8)*2^X=Y`, where `X` is the number of iterations, the new capacity is not
enough because 8 of its slots are reserved for hidden helper threads.

Here is an example.
```
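#include <cstddef>
#include <vector>

int main(int argc, char *argv[]) {
  // Chosen so that the thread request equals (160 + 8) * 2^3.
  constexpr std::size_t N = 1344;
  std::vector<int> data(N);

  // First region: runs with the default number of threads.
#pragma omp parallel for
  for (unsigned i = 0; i < N; ++i) {
    data[i] = i;
  }

  // Second region: requesting N threads forces __kmp_threads to expand.
#pragma omp parallel for num_threads(N)
  for (unsigned i = 0; i < N; ++i) {
    data[i] += i;
  }

  return 0;
}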
```
My CPU is 20C40T, so `__kmp_threads_capacity` is 160. After the offset,
`__kmp_threads_capacity` becomes 168. Since `1344 = (160+8)*2^3`, the assertion
is hit.
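
The arithmetic above can be replayed with a small sketch. `expand_capacity` and `usable_slots` are hypothetical names used only for illustration; the real `__kmp_expand_threads` contains more logic than this doubling loop.

```cpp
// Hypothetical stand-in for the doubling loop described above:
// double the capacity until it is at least the requested size.
int expand_capacity(int capacity, int required) {
  while (capacity < required)
    capacity *= 2;
  return capacity;
}

// Slots left for regular threads once the hidden helper slots are set aside.
int usable_slots(int capacity, int reserved) { return capacity - reserved; }
```

With the numbers from the example, `expand_capacity(168, 1344)` stops at exactly 1344, and `usable_slots(1344, 8)` leaves only 1336 slots for the 1344 requested threads. In this sketch, asking for `1344 + 8` instead yields 2688; whether the patch fixes the problem this way is not stated here.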

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D98838
2021-03-18 18:25:36 -04:00
Hansang Bae a6f9cb6adc [OpenMP] Add runtime interface for OpenMP 5.1 error directive
The proposed new interface is for supporting `at(execution)` clause in the
error directive.

Differential Revision: https://reviews.llvm.org/D98448
2021-03-16 08:55:25 -05:00
Peyton, Jonathan L e2738b3758 [OpenMP] Fix potential integer overflow in dynamic schedule code
Restrict the chunk_size * chunk_num to only occur for valid
chunk_nums and reimplement calculating the limit to avoid overflow.
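
A minimal sketch of the idea, assuming an iteration space normalized to `[0, trip_count)`. `chunk_bounds` is a hypothetical helper, not the runtime's actual code: it validates `chunk_num` before multiplying and derives the upper limit from the remaining iterations instead of from `first + chunk_size`.

```cpp
#include <cstdint>

// Hypothetical helper: computes the inclusive bounds [first, last] of chunk
// `chunk_num`. Returns false if the chunk lies past the end of the loop.
bool chunk_bounds(std::uint64_t trip_count, std::uint64_t chunk_size,
                  std::uint64_t chunk_num, std::uint64_t *first,
                  std::uint64_t *last) {
  if (trip_count == 0 || chunk_size == 0)
    return false;
  // Number of chunks, computed without adding chunk_size - 1 to trip_count.
  const std::uint64_t num_chunks =
      trip_count / chunk_size + (trip_count % chunk_size != 0 ? 1 : 0);
  if (chunk_num >= num_chunks)
    return false;
  // chunk_num < num_chunks implies chunk_num * chunk_size < trip_count,
  // so this product cannot overflow.
  *first = chunk_num * chunk_size;
  // Derive the limit from what is left rather than from first + chunk_size.
  const std::uint64_t remaining = trip_count - *first;
  *last = *first + (remaining < chunk_size ? remaining : chunk_size) - 1;
  return true;
}
```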

Differential Revision: https://reviews.llvm.org/D96747
2021-03-08 09:43:05 -06:00
Joachim Protze 35ab6d6390 [OpenMP][Tests][NFC] rename macro to avoid naming clash
When including <ostream>, the register_callback macro of the OMPT callback.h
clashes with a function defined in ostream. This patch renames the macro
to include ompt in the macro name.
2021-02-24 18:03:54 +01:00
Peyton, Jonathan L 56223b1e91 [OpenMP] Help static loop code avoid over/underflow
This code alleviates some pathological loop parameters (lower,
upper, stride) within calculations involved in the static loop code.  It
bounds the chunk size to the trip count if it is greater than the trip
count and also minimizes problematic code for when trip count < nth.
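
As an illustration only, the sketch below assumes 32-bit loop bounds and does the arithmetic in 64 bits, so the extreme parameters cannot overflow. `trip_count` and `bound_chunk` are hypothetical helpers; the runtime's own code handles more types and cases.

```cpp
#include <cstdint>

// Hypothetical helper: trip count of a loop over [lower, upper] with the
// given stride, computed in 64 bits so extreme 32-bit bounds do not overflow.
std::int64_t trip_count(std::int32_t lower, std::int32_t upper,
                        std::int32_t stride) {
  if (stride > 0)
    return upper < lower
               ? 0
               : (static_cast<std::int64_t>(upper) - lower) / stride + 1;
  if (stride < 0)
    return lower < upper ? 0
                         : (static_cast<std::int64_t>(lower) - upper) /
                                   -static_cast<std::int64_t>(stride) +
                               1;
  return 0; // a zero stride is treated as an empty loop in this sketch
}

// Bound the chunk size to the trip count when it is larger than the trip count.
std::int64_t bound_chunk(std::int64_t chunk, std::int64_t trips) {
  if (chunk < 1)
    chunk = 1;
  return (trips > 0 && chunk > trips) ? trips : chunk;
}
```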

Differential Revision: https://reviews.llvm.org/D96426
2021-02-22 13:22:01 -06:00
Peyton, Jonathan L 8c73be9d86 [OpenMP] Limit number of dispatch buffers
This patch limits the number of dispatch buffers (used for
loop worksharing construct) to between 1 and 4096.

Differential Revision: https://reviews.llvm.org/D96749
2021-02-22 13:14:28 -06:00
AndreyChurbanov dab5d6c2eb [OpenMP] fix race condition in test 2021-02-18 02:27:49 +03:00
AndreyChurbanov 5631842d18 [OpenMP] NFC: fix test removing the target construct 2021-02-13 04:49:52 +03:00
AndreyChurbanov 091e8daa24 [OpenMP] fix test adding mapping of shared variables 2021-02-13 04:13:54 +03:00
Nawrin Sultana 4692bb4a8a [OpenMP] Add lower and upper bound in num_teams clause
This patch adds lower-bound and upper-bound to num_teams clause
according to OpenMP 5.1 specification. The initial number of teams
created is implementation defined, but it will be greater than or
equal to lower-bound and less than or equal to upper-bound. If
num_teams clause is not specified, the number of teams created is
implementation defined, but it will be greater than or equal to 1.
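
In source code the OpenMP 5.1 clause is written as `num_teams(lower-bound : upper-bound)`, for example `#pragma omp teams num_teams(4:16)`. The constraint on the resulting team count can be sketched as follows. `choose_num_teams` and its `preferred` argument are hypothetical and stand in for whatever implementation-defined choice the runtime makes.

```cpp
#include <cassert>

// Hypothetical helper: whatever count the implementation prefers, the number
// of teams created must end up within [lower, upper].
int choose_num_teams(int lower, int upper, int preferred) {
  assert(lower >= 1 && lower <= upper);
  if (preferred < lower)
    return lower;
  if (preferred > upper)
    return upper;
  return preferred;
}
```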

Differential Revision: https://reviews.llvm.org/D95820
2021-02-10 13:58:50 -06:00
Shilei Tian 3c31b78455 [OpenMP] Fixed an issue that taskwait doesn't work on detachable task
D77609 mistakenly changed the behavior of taskwait on a detachable task so that the detachable task is not waited for, according to https://lists.llvm.org/pipermail/openmp-dev/2021-February/003836.html. This patch fixes it. Thanks to Raúl for the report.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95798
2021-02-03 13:12:43 -05:00
AndreyChurbanov d7b12004bd [OpenMP] libomp: implement nteams-var and teams-thread-limit-var ICVs
The change includes OMP_NUM_TEAMS, OMP_TEAMS_THREAD_LIMIT env variables,
omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit,
omp_get_teams_thread_limit routines.

Differential Revision: https://reviews.llvm.org/D95003
2021-02-01 22:54:11 +03:00
Tobias Hieta c3c02d0d5a [OpenMP] Fix python3 compatibility in openmp's lit.cfg
Differential Revision: https://reviews.llvm.org/D95669
2021-02-01 08:20:26 +01:00
AndreyChurbanov ac70a53653 [OpenMP] NFC: disabled two flakey tests as the bug in libomp not fixed yet 2021-01-29 00:54:13 +03:00
Peyton, Jonathan L 8e67134364 [OpenMP] Fix misleading warning for OMP_PLACES
When OMP_PLACES contains an invalid value, the warning informs the user
that the fallback is OMP_PLACES=threads, but the actual internal setting
is OMP_PLACES=cores and is detected as such with KMP_SETTINGS=1.
This patch informs the user that OMP_PLACES=cores is being used instead
of OMP_PLACES=threads.

Differential Revision: https://reviews.llvm.org/D95170
2021-01-27 14:27:24 -06:00
Nawrin Sultana 927af4b3c5 [OpenMP] Modify OMP_ALLOCATOR environment variable
This patch sets the def-allocator-var ICV based on the value provided in
OMP_ALLOCATOR. Previously, the only allowed value for OMP_ALLOCATOR
was a predefined memory allocator. The OpenMP 5.1 specification allows a predefined
memory allocator, predefined mem space, or predefined mem space with traits in
OMP_ALLOCATOR. If an allocator cannot be created using the provided value,
the def-allocator-var is set to omp_default_mem_alloc.
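
The three accepted forms can be told apart with a rough classifier like the one below. It is a simplified sketch, not the runtime's parser: `AllocSpec` and `classify_omp_allocator` are hypothetical names, and the check looks only at the `omp_` prefix, the name suffix, and the presence of a trait list after `:`. It does not validate names against the predefined lists or parse individual traits.

```cpp
#include <string>

enum class AllocSpec { PredefinedAllocator, MemSpace, MemSpaceWithTraits, Invalid };

// Hypothetical, simplified classifier for an OMP_ALLOCATOR value.
AllocSpec classify_omp_allocator(const std::string &value) {
  auto ends_with = [](const std::string &s, const std::string &suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
  };
  const std::string::size_type colon = value.find(':');
  const std::string name = value.substr(0, colon);
  if (name.compare(0, 4, "omp_") != 0)
    return AllocSpec::Invalid;
  if (ends_with(name, "_mem_alloc"))
    return colon == std::string::npos ? AllocSpec::PredefinedAllocator
                                      : AllocSpec::Invalid;
  if (ends_with(name, "_mem_space")) {
    if (colon == std::string::npos)
      return AllocSpec::MemSpace;
    // A colon must be followed by at least one character of trait text.
    return colon + 1 < value.size() ? AllocSpec::MemSpaceWithTraits
                                    : AllocSpec::Invalid;
  }
  return AllocSpec::Invalid;
}
```

For instance, `omp_large_cap_mem_space:alignment=16,pinned=true` classifies as a mem space with traits. An `Invalid` result corresponds to the case where the text above falls back to omp_default_mem_alloc.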

Differential Revision: https://reviews.llvm.org/D94985
2021-01-26 18:27:39 -06:00
Shilei Tian 9d64275ae0 [OpenMP] Added the support for hidden helper task in RTL
The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks.  We first use `pthread_create` to create a new thread; let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`, as if the initial thread had encountered a parallel region. In the wrapped function for this team, the main thread (the initial thread that we create via `pthread_create` on Linux) waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. The other worker threads just do nothing. The main thread needs to wait there because, in the current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team, which is not what we want.

Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also added to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8.

Here are some open issues to be discussed:
1. The main thread goes to sleep when the initialization is finished. As Andrey mentioned, we might need it to be awakened from time to time to do some work. What kind of update/check should be put here?

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D77609
2021-01-25 22:16:17 -05:00
AndreyChurbanov a60bc55c69 [OpenMP] libomp: cleanup parsing of OMP_ALLOCATOR env variable.
Differential Revision: https://reviews.llvm.org/D94932
2021-01-19 16:21:22 +03:00
AndreyChurbanov aa3a59e0c6 [OpenMP][NFC] Fix test
The test fails if memkind library is accessible.
2021-01-19 00:05:34 +03:00
Shilei Tian 9bf843bdc8 Revert "[OpenMP] Added the support for hidden helper task in RTL"
This reverts commit ed939f853d.
2021-01-18 06:57:52 -05:00
Shilei Tian ed939f853d [OpenMP] Added the support for hidden helper task in RTL
The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks.  We first use `pthread_create` to create a new thread; let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`, as if the initial thread had encountered a parallel region. In the wrapped function for this team, the main thread (the initial thread that we create via `pthread_create` on Linux) waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. The other worker threads just do nothing. The main thread needs to wait there because, in the current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team, which is not what we want.

Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also added to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8.

Here are some open issues to be discussed:
1. The main thread goes to sleep when the initialization is finished. As Andrey mentioned, we might need it to be awakened from time to time to do some work. What kind of update/check should be put here?

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D77609
2021-01-16 14:13:35 -05:00
Terry Wilmarth 4fe17ada55 [OpenMP] Fix hierarchical barrier
Hierarchical barrier is an experimental barrier algorithm that uses aspects
of machine hierarchy to define the barrier tree structure. This patch fixes
offset calculation in hierarchical barrier. The offset is used to store info
on a flag about sleeping threads waiting on a location stored in the flag.
This commit also fixes a potential deadlock in hierarchical barrier when
using infinite blocktime by adjusting the offset value of leaf kids so that
it matches the value of leaf state. It also adds testing of default barriers
with infinite blocktime, and also tests hierarchical barrier algorithm with
both default and infinite blocktime.

Patch by Terry Wilmarth and Nawrin Sultana.

Differential Revision: https://reviews.llvm.org/D94241
2021-01-13 10:22:57 -06:00
Nawrin Sultana 540007b427 [OpenMP] Add strict mode in num_tasks and grainsize
This patch adds a new API, __kmpc_taskloop_5, to accommodate the strict
modifier (introduced in OpenMP 5.1) in the num_tasks and grainsize
clauses.
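
For context, a sketch of what the strict modifier means for grainsize, as I understand the OpenMP 5.1 wording: every generated task receives exactly grainsize iterations, except the task containing the last iteration, which may receive fewer. `strict_grainsize_partition` is a hypothetical helper that only models this split; it is not how __kmpc_taskloop_5 is implemented.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of grainsize(strict: g): returns the number of
// iterations assigned to each generated task, in order.
std::vector<std::uint64_t> strict_grainsize_partition(std::uint64_t trip_count,
                                                      std::uint64_t grainsize) {
  std::vector<std::uint64_t> sizes;
  if (grainsize == 0)
    return sizes; // invalid grainsize: nothing to partition in this sketch
  std::uint64_t remaining = trip_count;
  while (remaining > 0) {
    const std::uint64_t take = remaining < grainsize ? remaining : grainsize;
    sizes.push_back(take);
    remaining -= take;
  }
  return sizes;
}
```

For 10 iterations and a strict grainsize of 4, this gives tasks of 4, 4, and 2 iterations.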

Differential Revision: https://reviews.llvm.org/D92352
2020-12-09 16:46:30 -06:00
Joachim Protze d3ec512b1d [OpenMP][OMPT] Make sure that 0 is never used as ID in tests (NFC) 2020-12-04 18:41:56 +01:00
Joachim Protze fd3d1b09c1 [OpenMP][Tests][NFC] Use FileCheck from cmake config 2020-11-30 23:16:56 +01:00
Joachim Protze 723be4042a [OpenMP][OMPT][NFC] Fix failing test
The test would fail for gcc, when built with debug flag.
2020-11-29 19:07:42 +01:00
Joachim Protze cdf9401df8 [OpenMP][OMPT][NFC] Fix flaky test
The test had a chance to finish the first task before the second task is
created. In this case, the dependences-pair event would not trigger.
2020-11-29 19:07:41 +01:00
Joachim Protze 6d3b81664a [OpenMP][OMPT] Introduce a guard to handle OMPT return address
This is an alternative approach to address inconsistencies pointed out in: D90078
This patch makes sure that the return address is reset, when leaving the scope.
In some cases, I had to move the macro out of an if-statement to have it in the
right scope, in some cases I added an additional block to restrict the scope.
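
The scope-based reset can be pictured as an RAII guard. The sketch below is an assumption about the general shape, not the patch's code: `stored_return_address` and `ReturnAddressGuard` are hypothetical names, and the real runtime keeps the address in its own per-thread data and sets it through a macro.

```cpp
// Hypothetical per-thread slot standing in for the runtime's stored address.
thread_local void *stored_return_address = nullptr;

// Sets the return address on entry if none is set yet, and resets it when
// the enclosing scope is left, but only if this guard was the one to set it.
class ReturnAddressGuard {
public:
  explicit ReturnAddressGuard(void *addr)
      : owner_(stored_return_address == nullptr) {
    if (owner_)
      stored_return_address = addr;
  }
  ~ReturnAddressGuard() {
    if (owner_)
      stored_return_address = nullptr;
  }
  ReturnAddressGuard(const ReturnAddressGuard &) = delete;
  ReturnAddressGuard &operator=(const ReturnAddressGuard &) = delete;

private:
  bool owner_;
};
```

Because the reset happens in the destructor, the guard has to live in the scope that should own the address, which matches the note above about moving the macro out of an if-statement or adding an extra block.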

This patch does not handle inconsistencies, which might occur if the return
address is still set when we call into the application.

Test case (repeated_calls.c) provided by @hbae

Differential Revision: https://reviews.llvm.org/D91692
2020-11-25 18:17:44 +01:00
Isabel Thärigen b281a05dac [OpenMP][OMPT] Implement verbose tool loading
OpenMP 5.1 introduces the new env variable
OMP_TOOL_VERBOSE_INIT=(disabled|stdout|stderr|<filename>) to enable verbose
loading and initialization of OMPT tools.
This env variable helps to understand the cause when loading of a tool fails
(e.g., undefined symbols or dependency not in LD_LIBRARY_PATH)
Output of OMP_TOOL_VERBOSE_INIT is added for OMP_DISPLAY_ENV

Tests for this patch are integrated into the different existing tool loading
tests, making these tests more verbose. An Archer specific verbose test is
integrated into an existing Archer test.

Patch prepared by: Isabel Thärigen

Differential Revision: https://reviews.llvm.org/D91464
2020-11-25 18:17:44 +01:00
Nawrin Sultana 5439db05e7 [OpenMP] Add omp_realloc implementation
This patch adds omp_realloc function implementation according to
OpenMP 5.1 specification.

Differential Revision: https://reviews.llvm.org/D90971
2020-11-17 13:43:00 -06:00
Nawrin Sultana 938f1b8581 [OpenMP] Add omp_calloc implementation
This patch adds omp_calloc implementation according to OpenMP 5.1
specification.

Differential Revision: https://reviews.llvm.org/D90967
2020-11-13 14:35:46 -06:00
Shilei Tian 24d0ef0f50 [OpenMP] Fixed a bug when displaying affinity
Currently the affinity format string has an initial value. When users set
the format via OMP_AFFINITY_FORMAT, it will overwrite the format string. However,
when copying the format, the trailing null is missing. As a result, if the user
format string is shorter than the default value, the remaining part of the default
value still takes effect. This bug was not exposed because the test case doesn't
check the end of a string. It only checks whether the given output "contains" the
check string.
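
The effect of the missing terminator can be reproduced in isolation. `set_format` below is a hypothetical stand-in with a placeholder default string; it is not the runtime's copy routine.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in: the buffer starts out holding `defaults`, then the
// user's format is copied over it, with or without the terminating null.
std::string set_format(const char *defaults, const char *user,
                       bool copy_terminator) {
  char buffer[64] = {}; // zero-initialized so the result is always terminated
  std::snprintf(buffer, sizeof(buffer), "%s", defaults);
  const std::size_t len = std::min(std::strlen(user), sizeof(buffer) - 1);
  std::memcpy(buffer, user, len); // copies the characters only
  if (copy_terminator)
    buffer[len] = '\0'; // the fix: terminate right after the user string
  return std::string(buffer);
}
```

With a default of `"ABCDEFGH"` and a user format of `"xy"`, the unterminated copy yields `"xyCDEFGH"`, while the terminated copy yields `"xy"`. A test that only checks whether the output contains `"xy"` passes in both cases.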

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D91309
2020-11-12 22:27:32 -05:00
Joachim Protze ce0911b3e9 [OpenMP][Tests] Fix compiler warnings in OpenMP runtime tests
This patch allows the OpenMP runtime tests to pass after configuring with
`cmake . -DOPENMP_TEST_FLAGS:STRING="-Werror"`.
The warnings for OMPT tests are addressed in D90752.

Differential Revision: https://reviews.llvm.org/D91280
2020-11-11 20:13:21 +01:00
Joachim Protze 6213ed062b [OpenMP][OMPT] Update the omp-tools header file to reflect 5.1 changes
This doesn't add functionality, but just adds the new types and renames the
master callback to masked callback.

Differential Revision: https://reviews.llvm.org/D90752
2020-11-11 20:13:21 +01:00
Joachim Protze b0eb19bf8a [OpenMP][OMPT][NFC] Fix flaky test
As reported by @ronlieb, the test shows intermittent failures.
The test failed if the dependent task was already finished when the depending
task was to be created. We have other tests to check for the dependences pair.
2020-11-03 13:15:32 +01:00
Joachim Protze 34b34e90fc [OpenMP][Tests] NFC: fix flaky test failure caused by rare scheduling
The worker thread can start execution of the task before creation of the second task
Fixes the spurious failure reported in https://reviews.llvm.org/D61657
2020-10-05 16:55:32 +02:00
Joachim Protze 6104b30446 [OpenMP][OMPT] Update OMPT tests for newly added GOMP interface patches
This patch updates the expected results for the GOMP interface patches: D87267, D87269, and D87271.
The taskwait-depend test is changed to really use taskwait-depend and copied to a task_if0-depend test.

To pass the tests, the handling of the return address was fixed.

Differential Revision: https://reviews.llvm.org/D87680
2020-10-01 00:53:41 +02:00
Peyton, Jonathan L ee1c04a926 [OpenMP] Fix if0 task with dependencies in the runtime
The current GOMP interface for serialized tasks does not take into
account task dependencies. Add the check and wait for dependencies.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=46573

Differential Revision: https://reviews.llvm.org/D87271
2020-09-24 09:47:53 -05:00
Peyton, Jonathan L 9089b4a5c5 [OpenMP] Introduce GOMP taskwait depend in the runtime
This change introduces the GOMP_taskwait_depend() function. It implements
the OpenMP 5.0 feature of #pragma omp taskwait with depend() clause by
wrapping around __kmpc_omp_wait_deps().

Differential Revision: https://reviews.llvm.org/D87269
2020-09-24 09:45:14 -05:00
Peyton, Jonathan L 72ada5ae6c [OpenMP] Introduce GOMP mutexinoutset in the runtime
Encapsulate GOMP task dependencies in separate class and introduce the
new mutexinoutset dependency type. This separate class allows
future GOMP task APIs easier access to the task dependency functionality
and better ability to propagate new dependency types to all existing GOMP
task APIs which use task dependencies.

Differential Revision: https://reviews.llvm.org/D87267
2020-09-24 09:45:13 -05:00
Peyton, Jonathan L ea34d95e0a [OpenMP] Introduce GOMP teams support in runtime
Implement GOMP_teams_reg() function which enables GOMP support of the
standalone teams construct. The GOMP_parallel* functions were modified
to call __kmp_fork_call() unconditionally so that the teams-specific
code could be reused within __kmp_fork_call() instead of reproduced
inside the GOMP_* functions.

Differential Revision: https://reviews.llvm.org/D87167
2020-09-24 09:45:13 -05:00
Saiyedul Islam 741e55aeed [OpenMP] Temporarily disable failing runtime tests for clang-12
The following tests were disabled for clang-11 after upgrading to
version 5.0 in D82963:

1. openmp/runtime/test/env/kmp_set_dispatch_buf.c
2. openmp/runtime/test/worksharing/for/kmp_set_dispatch_buf.c

They are also failing for clang-12, so they are temporarily disabled
until they are fixed.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84241
2020-07-21 15:32:46 +00:00
AndreyChurbanov 617787ea77 [OpenMP] add missed REQUIRES:ompt for 2 OMPT tests 2020-07-21 16:31:17 +03:00
Joachim Protze f226171429 [OpenMP][Tests][NFC] Mark compatibility with older versions of clang 2020-07-20 13:53:29 +02:00
Joachim Protze 0fa0cf8638 [OpenMP][Tests] Update compatibility with GCC (NFC)
Commit 95a28df5c provided implementation for GOMP*_nonmonotonic*runtime*
functions. Now the tests succeed with gcc 9 and 10
2020-07-08 00:27:19 +02:00
Joachim Protze 6d9626d2da [OpenMP][Tests] Fix/Mark compatibility for GCC
Reviewed by: Hahnfeld, saiislam

Differential Revision: https://reviews.llvm.org/D82267
2020-07-06 23:56:09 +02:00
Saiyedul Islam 4c4bda1630 [OpenMP] Temporarily disable failing runtime tests for OpenMP 5.0
The following tests are failing after upgrading to version 5.0 but are passing
for version 4.5:
1. openmp/runtime/test/env/kmp_set_dispatch_buf.c
2. openmp/runtime/test/worksharing/for/kmp_set_dispatch_buf.c

To be enabled as soon as these tests are fixed.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D82963
2020-07-06 14:04:43 +00:00