Commit Graph

1858 Commits

Author SHA1 Message Date
Jon Chesterfield c9bc414840 [libomptarget][amdgpu] Let default number of teams equal number of CUs 2020-12-09 19:35:34 +00:00
Jon Chesterfield e191d31159 [libomptarget][amdgpu] Robust handling of device_environment symbol 2020-12-09 19:21:51 +00:00
Jon Chesterfield cab9f69235 [libomptarget][amdgpu] Improve diagnostics on arch mismatch 2020-12-09 18:55:53 +00:00
Giorgis Georgakoudis 18dff28958 [OpenMP] Add doxygen generation for the runtime
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D92779
2020-12-08 16:20:45 -08:00
AndreyChurbanov fff1abc406 [OpenMP] NFC: comment adjusted 2020-12-07 19:50:14 +03:00
AndreyChurbanov 22558c8501 [OpenMP] libomp: Fix possible NULL dereferences
Check pointer returned by strchr, as it can be NULL in case of broken
format of input string. Introduced new function __kmp_str_loc_numbers
for fast parsing of numbers only in the location string.
Also made some cleanup of __kmp_str_loc_init declaration and usage:
- changed type of init_fname parameter to bool;
- changed input from true to false in places where fname is not used.

Differential Revision: https://reviews.llvm.org/D90962
2020-12-07 19:09:07 +03:00
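
The NULL check described in this commit can be sketched as follows. This is a minimal illustration, not the runtime's code: the helper name `loc_field_number` is hypothetical, and the semicolon-separated layout of the location string is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the real __kmp_str_loc_numbers): return the number
   stored in the field that follows the `skip`-th ';' of `loc`, or -1 when the
   string is NULL or has fewer separators than expected. */
int loc_field_number(const char *loc, int skip) {
  const char *p = loc;
  if (p == NULL)
    return -1;
  for (int i = 0; i < skip; ++i) {
    p = strchr(p, ';');
    if (p == NULL) /* broken input: strchr found no further separator */
      return -1;
    ++p; /* step past the ';' */
  }
  return atoi(p);
}
```

Every result of `strchr` is tested before it is advanced or dereferenced, so a malformed string yields an error value instead of a crash.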
Jon Chesterfield 71f4693020 [libomptarget][amdgpu] Add plumbing to call into hostrpc lib, if linked 2020-12-07 15:24:01 +00:00
Jon Chesterfield e1b8e8a1f4 [libomptarget][amdgpu] Skip device_State allocation when using bss global 2020-12-06 12:13:56 +00:00
Joachim Protze a148216b31 [OpenMP][OMPT] Fix OMPT return address guard for gomp interface
D91692 missed various locations in kmp_gsupport, where the scope for
OMPT_STORE_RETURN_ADDRESS is too narrow, i.e. the scope ends before the OMPT
callback is called in some nested function.

This patch fixes the scoping issue, so that all OMPT tests pass, when the
tests are built with gcc.

Differential Revision: https://reviews.llvm.org/D92121
2020-12-05 19:06:28 +01:00
Joachim Protze d3ec512b1d [OpenMP][OMPT] Make sure that 0 is never used as ID in tests (NFC) 2020-12-04 18:41:56 +01:00
Jon Chesterfield f628eef98a [libomptarget][amdgpu] Fix latent race in load binary 2020-12-04 16:29:09 +00:00
Hansang Bae c4a22224d9 [OpenMP] Add __kmpc_omp_target_task_alloc to dllexport
This patch enables use of the entry on Windows.

Differential Revision: https://reviews.llvm.org/D92618
2020-12-04 08:11:14 -06:00
Jon Chesterfield ae9d96a656 [libomptarget][amdgpu] Address compiler warnings, drive by fixes
Initialize some variables, remove unused ones.
Changes the debug printing condition to align with the aomp test suite.

Differential Revision: https://reviews.llvm.org/D92559
2020-12-03 11:09:12 +00:00
Pushpinder Singh afc09c6fe4 [libomptarget][AMDGPU] Remove MaxParallelLevel
Removes MaxParallelLevel references from rtl.cpp and drops
resulting dead code.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D92463
2020-12-03 00:27:03 -05:00
Terry Wilmarth e0665a9050 [OpenMP] Add support for Intel's umonitor/umwait
These changes add support for Intel's umonitor/umwait usage in wait
code, for architectures that support those intrinsic functions. Usage of
umonitor/umwait is off by default, but can be turned on by setting the
KMP_USER_LEVEL_MWAIT environment variable.

Differential Revision: https://reviews.llvm.org/D91189
2020-12-01 14:07:46 -06:00
AndreyChurbanov 6bf84871e9 [OpenMP] libomp: add UNLIKELY hints to rarely executed branches
Added UNLIKELY hint to one-time or rarely executed branches.
This improves performance of the library on some tasking benchmarks.

Differential Revision: https://reviews.llvm.org/D92322
2020-12-01 16:53:21 +03:00
Joachim Protze fd3d1b09c1 [OpenMP][Tests][NFC] Use FileCheck from cmake config 2020-11-30 23:16:56 +01:00
Todd Erdner 9615890db5 [OpenMP] libomp: change shm name to include UID, call unregister_lib on SIGTERM
The switch to shared memory introduced a few problems that need to be fixed.
- The previous SHM filename used only the process id. Because the process id is
  usually a 16-bit number, this caused conflicts on some machines. The UID is now
  added to the name to prevent this.
- Under some conditions (SIGTERM, etc.) the shared memory files were not getting
  cleaned up. Added a call to clean up the shm files under those conditions. For this, the
  user needs to set the environment variable KMP_HANDLE_SIGNALS to true.

Patch by Erdner, Todd <todd.erdner@intel.com>

Differential Revision: https://reviews.llvm.org/D91869
2020-12-01 00:40:47 +03:00
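
A sketch of the naming idea, assuming a POSIX system. The prefix `/example_lib_` and the helper `make_shm_name` are hypothetical; the log does not give the actual name libomp uses.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: build a shared-memory name from both the process id
   and the user id, so that two users whose processes happen to share a pid
   do not collide. Returns 0 on success, -1 if the name does not fit. */
int make_shm_name(char *buf, size_t len, pid_t pid, uid_t uid) {
  int n = snprintf(buf, len, "/example_lib_%ld_%ld", (long)pid, (long)uid);
  return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

A caller would typically pass `getpid()` and `getuid()` from `<unistd.h>`.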
AndreyChurbanov f6f28b44ad [OpenMP] libomp: fix mutexinoutset dependence for proxy tasks
Since __kmp_task_finish is not executed for proxy tasks,
move mutexinoutset dependency code to __kmp_release_deps
which is executed for all task kinds.

Differential Revision: https://reviews.llvm.org/D92326
2020-12-01 00:13:31 +03:00
Joachim Protze 723be4042a [OpenMP][OMPT][NFC] Fix failing test
The test would fail for gcc, when built with debug flag.
2020-11-29 19:07:42 +01:00
Joachim Protze cdf9401df8 [OpenMP][OMPT][NFC] Fix flaky test
The test had a chance to finish the first task before the second task is
created. In this case, the dependences-pair event would not trigger.
2020-11-29 19:07:41 +01:00
Jon Chesterfield 89a0f48c58 [libomptarget][cuda] Detect missing symbols in plugin at build time
Passes -z,defs to the linker. Error on unresolved symbol references.

Otherwise, those unresolved symbols present as target code running on the host
as the plugin fails to load. This is significantly harder to debug than a link
time error. Flag matches that passed by amdgcn and ve plugins.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D92143
2020-11-27 15:39:41 +00:00
Martin Storsjö 6b429668de [OpenMP][OMPT] Fix building with OMPT disabled after 6d3b81664a 2020-11-26 10:09:32 +02:00
Johannes Doerfert 227c8ff189 [OpenMP][Docs] Add more content, call coordinates, FAQ entries, links 2020-11-25 11:52:35 -06:00
AndreyChurbanov 9e3e332d27 [OpenMP] libomp: fix non-X86, non-AARCH64 builds
Commit https://reviews.llvm.org/rG7b5254223acbf2ef9cd278070c5a84ab278d7e5f
broke the build for some architectures, because macro KMP_PREFIX_UNDERSCORE
was defined only for x86, x86_64 and aarch64. This patch defines it for other
architectures (as a no-op).

Differential Revision: https://reviews.llvm.org/D92027
2020-11-25 20:40:23 +03:00
Joachim Protze 6d3b81664a [OpenMP][OMPT] Introduce a guard to handle OMPT return address
This is an alternative approach to address inconsistencies pointed out in: D90078
This patch makes sure that the return address is reset, when leaving the scope.
In some cases, I had to move the macro out of an if-statement to have it in the
right scope, in some cases I added an additional block to restrict the scope.

This patch does not handle inconsistencies, which might occur if the return
address is still set when we call into the application.

Test case (repeated_calls.c) provided by @hbae

Differential Revision: https://reviews.llvm.org/D91692
2020-11-25 18:17:44 +01:00
Isabel Thärigen b281a05dac [OpenMP][OMPT] Implement verbose tool loading
OpenMP 5.1 introduces the new env variable
OMP_TOOL_VERBOSE_INIT=(disabled|stdout|stderr|<filename>) to enable verbose
loading and initialization of OMPT tools.
This env variable helps to understand the cause when loading of a tool fails
(e.g., undefined symbols or dependency not in LD_LIBRARY_PATH)
Output of OMP_TOOL_VERBOSE_INIT is added for OMP_DISPLAY_ENV

Tests for this patch are integrated into the different existing tool loading
tests, making these tests more verbose. An Archer specific verbose test is
integrated into an existing Archer test.

Patch prepared by: Isabel Thärigen

Differential Revision: https://reviews.llvm.org/D91464
2020-11-25 18:17:44 +01:00
AndreyChurbanov 7b5254223a [OpenMP] fix asm code for arm64 (AARCH64) for Darwin/macOS
Adjusted external reference for Darwin/AARCH64 link compatibility.
Made size directive conditional only if __ELF__ defined.

Patch by Michael_Pique <mpique@icloud.com>

Differential Revision: https://reviews.llvm.org/D88252
2020-11-24 13:08:24 +03:00
AndreyChurbanov 5644f734d6 Revert "[OpenMP] Add support for Intel's umonitor/umwait"
This reverts commit 9cfad5f9c5.
2020-11-20 12:16:34 +03:00
AndreyChurbanov 9cfad5f9c5 [OpenMP] Add support for Intel's umonitor/umwait
Patch by tlwilmar (Terry Wilmarth)

Differential Revision: https://reviews.llvm.org/D91189
2020-11-19 22:04:21 +03:00
cchen 7036fe8a0c [libomptarget] Add support for target update non-contiguous
This patch is the runtime support for https://reviews.llvm.org/D84192.

In order not to modify the tgt_target_data_update information but still be
able to pass the extra information for non-contiguous map item (offset,
count, and stride for each dimension), this patch overloads arg when
the maptype is set as OMP_TGT_MAPTYPE_DESCRIPTOR. The original arg is for
passing the pointer information; the overloaded arg, however, is an
array of descriptor_dim:

```
struct descriptor_dim {
  int64_t offset;
  int64_t count;
  int64_t stride;
};
```

and the array size is the dimension size. In addition, since we
have count and stride information in descriptor_dim, we can replace/overload the
arg_size parameter by using dimension size.

Reviewed By: grokos, tianshilei1992

Differential Revision: https://reviews.llvm.org/D82245
2020-11-19 11:33:27 -06:00
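
To illustrate how an array of `descriptor_dim` might be consumed, here is a small sketch. The two helper functions are hypothetical, and the reading of `offset` and `stride` as positions along one dimension is an assumption; the commit message does not state their units.

```c
#include <stdint.h>

struct descriptor_dim {
  int64_t offset;
  int64_t count;
  int64_t stride;
};

/* Hypothetical helper: total number of elements selected by all dimensions. */
int64_t descriptor_num_elements(const struct descriptor_dim *dims, int ndims) {
  int64_t total = 1;
  for (int i = 0; i < ndims; ++i)
    total *= dims[i].count;
  return total;
}

/* Hypothetical helper: position along one dimension of the k-th selected
   element, assuming offset and stride are measured in that dimension. */
int64_t descriptor_position(const struct descriptor_dim *dim, int64_t k) {
  return dim->offset + k * dim->stride;
}
```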
Joseph Huber da8bec47ab [OpenMP] Add Location Fields to Libomptarget Runtime for Debugging
Summary:
Add support for passing source locations to libomptarget runtime functions using the ident_t struct present in the rest of the libomp API. This will allow the runtime system to give much more insightful error messages and debugging values.

Reviewers: jdoerfert grokos

Differential Revision: https://reviews.llvm.org/D87946
2020-11-19 12:01:53 -05:00
Joseph Huber 5378c6a4bf [OpenMP] Add Support for Mapping Names in Libomptarget RTL
Summary:
This patch adds basic support for printing the source location and names for the mapped variables. This patch does not support names for custom mappers. This is based on D89802.

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D90172
2020-11-18 16:01:59 -05:00
Joseph Huber 97e55cfef5 [OpenMP] Add Passing in Original Declaration Names To Mapper API
Summary:
This patch adds support for passing in the original declaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped, otherwise it takes the name of the variable declaration as a fallback. The information is passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;"

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D89802
2020-11-18 15:28:39 -05:00
Hansang Bae 44a11c342c [OpenMP] Use explicit type casting in kmp_atomic.cpp
Differential Revision: https://reviews.llvm.org/D91105
2020-11-17 14:31:13 -06:00
Nawrin Sultana 5439db05e7 [OpenMP] Add omp_realloc implementation
This patch adds omp_realloc function implementation according to
OpenMP 5.1 specification.

Differential Revision: https://reviews.llvm.org/D90971
2020-11-17 13:43:00 -06:00
Peyton, Jonathan L 8647c669a4 [OpenMP] NFC: remove tabs in message catalog file 2020-11-17 10:15:04 -06:00
Peyton, Jonathan L 0454154efd [OpenMP][stats] reset serial state when re-entering serial region
Differential Revision: https://reviews.llvm.org/D90867
2020-11-17 10:09:56 -06:00
Joachim Protze fdc9dfc8e4 [OpenMP][Tool] Add Archer option to disable data race analysis for sequential part
This introduces the new `ARCHER_OPTIONS` flag `ignore_serial=0|1` to disable
analysis and logging of memory accesses in the sequential part of the OpenMP
application.

In the sequential part of an OpenMP program no data race is possible, unless
there is non-OpenMP concurrency (such as pthreads, MPI, ...). For the latter
reason, this is not active by default.

Besides reducing the runtime overhead for the sequential part of the program,
this reduces the memory overhead for sequential initialization. In combination
with `flush_shadow=1` this can allow analysis of applications, which run close
to the limit of available memory, but only access smaller parts of shared
memory during each OpenMP parallel region.

A problem for this approach is that Archer only gets active, when the OpenMP
runtime gets initialized, which might be after serial initialization of the
application. In such case, it helps to call for example `omp_get_max_threads()`
at the beginning of main.

Differential Revision: https://reviews.llvm.org/D90473
2020-11-16 10:45:21 +01:00
Martin Storsjö 9bcef58b63 [OpenMP] Fix building for windows after adding omp_calloc
Differential Revision: https://reviews.llvm.org/D91478
2020-11-15 21:32:38 +02:00
Nawrin Sultana 938f1b8581 [OpenMP] Add omp_calloc implementation
This patch adds omp_calloc implementation according to OpenMP 5.1
specification.

Differential Revision: https://reviews.llvm.org/D90967
2020-11-13 14:35:46 -06:00
Joachim Protze 96eaacc917 [OpenMP][Tool] Update archer to accept new OpenMP 5.1 enum values
OpenMP 5.1 adds an extra enum entry for ompt_scope_t, which makes the related
switch statement incomplete.
Also adding cases for newly added barrier variants.

Differential Revision: https://reviews.llvm.org/D90758
2020-11-13 16:09:05 +01:00
Shilei Tian 24d0ef0f50 [OpenMP] Fixed a bug when displaying affinity
Currently the affinity format string has an initial value. When users set
the format via OMP_AFFINITY_FORMAT, it overwrites the format string. However,
when copying the format, the trailing null is missing. As a result, if the user
format string is shorter than the default value, the remaining part of the default
value still takes effect. This bug was not exposed because the test case doesn't
check the end of a string. It only checks whether the given output "contains" the
check string.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D91309
2020-11-12 22:27:32 -05:00
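
The class of bug described here can be shown with a short sketch. `set_format` is a hypothetical helper, not the runtime's function; it copies a user string over a buffer that already holds a longer default and always writes the terminator.

```c
#include <string.h>

/* Hypothetical helper: copy `src` into `dst` (capacity `cap`), truncating if
   needed, and always write the terminating null. Without the final store, a
   short `src` would leave the tail of the previous contents visible. */
void set_format(char *dst, size_t cap, const char *src) {
  if (cap == 0)
    return;
  size_t n = strlen(src);
  if (n >= cap)
    n = cap - 1;
  memcpy(dst, src, n);
  dst[n] = '\0';
}
```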
Joseph Huber 292e898c16 [OpenMP] Begin Adding OpenMP Tool to Gather OpenMP Information
Summary:
This patch begins to add support for a set of scripts that can be used to get information from OpenMP programs to better describe problems and eventually show the data to the user in formatted output. Right now the only support is for formatting the register and memory usage reports from ptxas and nvlink. This is simply done as a wrapper around clang and clang++.

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D91085
2020-11-11 20:00:37 -05:00
Joachim Protze 25b3164bfb [OpenMP][Tools][Tests] Fix ompt multiplex test
With 6213ed0 the master callback was renamed to masked.
The multiplex tests must check for masked now.
2020-11-12 01:43:49 +01:00
Peyton, Jonathan L dd8723d348 [OpenMP] Fix shutdown hang/race bug
The deadlock/race happens when the primary thread gets the initz lock and tries to join
the worker thread, which waits for the same lock in the TLS key destructor.
The patch removes the lock and the code of setting TLS value which needed
the lock. Also removed setting TLS from __kmp_unregister_root_current_thread.

Differential Revision: https://reviews.llvm.org/D90647
2020-11-11 13:47:23 -06:00
Joachim Protze 3fa2e19338 [OpenMP][Tool] Fix possible NULL-pointer dereference in test
Avoid dereferencing a possibly uninitialized pointer as mentioned in D91280.
2020-11-11 20:13:22 +01:00
Joachim Protze ce0911b3e9 [OpenMP][Tests] Fix compiler warnings in OpenMP runtime tests
This patch allows the OpenMP runtime tests to pass after configuring with
`cmake . -DOPENMP_TEST_FLAGS:STRING="-Werror"`.
The warnings for OMPT tests are addressed in D90752.

Differential Revision: https://reviews.llvm.org/D91280
2020-11-11 20:13:21 +01:00
Joachim Protze 6213ed062b [OpenMP][OMPT] Update the omp-tools header file to reflect 5.1 changes
This doesn't add functionality, but just adds the new types and renames the
master callback to masked callback.

Differential Revision: https://reviews.llvm.org/D90752
2020-11-11 20:13:21 +01:00
AndreyChurbanov 33da6bd7f5 [OpenMP] Fixes for shared memory cleanup when aborts occur
Patch by Erdner, Todd <todd.erdner@intel.com>

Differential Revision: https://reviews.llvm.org/D90974
2020-11-11 00:16:23 +03:00
Alexey Bataev dcde6f17fd Revert "[libomptarget] Add support for target update non-contiguous"
This reverts commit 6847bcec1a. It breaks
the build of libomptarget.
2020-11-10 07:49:00 -08:00
Hansang Bae ef7738240c [OpenMP] Remove obsolete Fortran module file
Modern Fortran compilers support Fortran 90, so we do not need to use
the source code for Fortran compilers that do not support Fortran 90.

Differential Revision: https://reviews.llvm.org/D90077
2020-11-09 15:26:38 -06:00
cchen 6847bcec1a [libomptarget] Add support for target update non-contiguous
This patch is the runtime support for https://reviews.llvm.org/D84192.

In order not to modify the tgt_target_data_update information but still be
able to pass the extra information for non-contiguous map item (offset,
count, and stride for each dimension), this patch overloads arg when
the maptype is set as OMP_TGT_MAPTYPE_DESCRIPTOR. The original arg is for
passing the pointer information; the overloaded arg, however, is an
array of descriptor_dim:

```
struct descriptor_dim {
  int64_t offset;
  int64_t count;
  int64_t stride;
};
```

and the array size is the dimension size. In addition, since we
have count and stride information in descriptor_dim, we can replace/overload the
arg_size parameter by using dimension size.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D82245
2020-11-06 20:55:33 -06:00
Nawrin Sultana 082031949c [OpenMP] Fix potential division by 0
This patch fixes potential division by 0 in case hwloc does not
recognize cores (or architecture has no cores).

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D90954
2020-11-06 11:52:19 -06:00
Peyton, Jonathan L 5e34877480 [OpenMP] Add ident_t flags for compiler OpenMP version
This patch adds the mask and ident_t function to get the
openmp version. It also adds logic to force monotonic:dynamic
behavior when OpenMP version less than 5.0.

The OpenMP version is stored in the format:
major*10+minor e.g., OpenMP 5.0 = 50

Differential Revision: https://reviews.llvm.org/D90632
2020-11-05 11:14:25 -06:00
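
The version encoding stated above (major*10+minor) is simple enough to write out. The function names below are hypothetical and are not the runtime's ident_t accessors.

```c
/* Hypothetical helpers for the major*10+minor encoding, e.g. OpenMP 5.0 = 50. */
int version_encode(int major, int minor) { return major * 10 + minor; }
int version_major(int encoded) { return encoded / 10; }
int version_minor(int encoded) { return encoded % 10; }

/* True when the encoded version is older than OpenMP 5.0, the case in which
   the commit forces monotonic:dynamic behavior. */
int version_is_pre_50(int encoded) { return encoded < version_encode(5, 0); }
```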
Joachim Protze 7b0ca32b62 [OpenMP] avoid warning: equality comparison with extraneous parentheses
The macros are used in several places with an if(macro) pattern. This results
in several warnings about extraneous parentheses in equality comparison.

Placing the constant on the left-hand side of the comparison avoids this warning.

Differential Revision: https://reviews.llvm.org/D90756
2020-11-05 12:13:08 +01:00
Jon Chesterfield 93cbf622fc [libomptarget][nfc] Build amdgcn deviceRTL with nogpulib 2020-11-04 11:29:22 +00:00
Shilei Tian f5eebc25cc [OpenMP] Fixed an issue in the test case parallel_offloading_map
There is a non-conforming use of variable-sized array in the test case `parallel_offloading_map.c`. This patch fixed it.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D90642
2020-11-03 15:59:16 -05:00
Joachim Protze eaed9e6b56 [OpenMP][Tools] clang-format Archer (NFC) 2020-11-03 16:32:02 +01:00
Joachim Protze 71041a8b6b [OpenMP][libomptarget][Tests] fix failing test
D88149 updated `omp_get_initial_device` behavior to conform with OpenMP 5.1.
omp_get_initial_device() == omp_get_num_devices()
2020-11-03 13:15:33 +01:00
Joachim Protze b0eb19bf8a [OpenMP][OMPT][NFC] Fix flaky test
As reported by @ronlieb, the test shows intermittent fails.
The test failed, if the dependent task was already finished, when the depending
task was to be created. We have other tests to check for the dependences pair.
2020-11-03 13:15:32 +01:00
Joachim Protze e99207feb4 [OpenMP][Tool] Handle detached tasks in Archer
Since detached tasks are supported by clang and the OpenMP runtime, Archer
must expect to receive the corresponding callbacks.

This patch adds support to interpret the synchronization semantics of
omp_fulfill_event and cleans up the handling of task switches.
2020-11-03 13:15:32 +01:00
Atmn Patel a95b25b29e [Libomptarget][NFC] Move global Libomptarget state to a struct
Presently, there are a number of global variables in libomptarget (devices,
RTLs, tables, mutexes, etc.) that are not placed within a struct. This
patch places them into a struct ``PluginManager``. All of the functions
that act on this data remain free.

Differential Revision: https://reviews.llvm.org/D90519
2020-11-03 00:10:18 -05:00
Johannes Doerfert 30e818db91 [OpenMP][Docs] Structure and content for the OpenMP documentation
This adds some initial content as well as structure to the new OpenMP
Sphinx documentation hosted at http://openmp.llvm.org/docs/ .

The content contains some useful links but most pages are still empty.

This uses a "custom" theme which is a copy of the default "agogo" one
with minor modifications to get a nicer table of content in the sidebar.
This way we can also adjust the theme as we go.

Reviewed By: jhuber6, JonChesterfield

Differential Revision: https://reviews.llvm.org/D90256
2020-10-30 01:31:48 -05:00
Peyton, Jonathan L 771f0fb92d [OpenMP] Add NULL check in dispatcher debug output
Patch by Nawrin Sultana

Differential Revision: https://reviews.llvm.org/D90403
2020-10-29 14:08:03 -05:00
Jon Chesterfield dee7704829 [AMDGPU] Add __builtin_amdgcn_grid_size
Similar to D76772, loads the data from the dispatch pointer. Marked invariant.

Patch also updates the openmp devicertl to use this builtin.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D90251
2020-10-29 16:25:13 +00:00
Benjamin Kramer 207cf71fa9 Revert "[OpenMP] Add Passing in Original Declaration Names To Mapper API"
This reverts commit d981c7b758 and
a87d7b3d44. Test fails under msan.
2020-10-28 13:58:14 +01:00
Joseph Huber d981c7b758 [OpenMP] Add Support for Mapping Names in Libomptarget RTL
Summary:
This patch adds basic support for printing the source location and names for the
mapped variables. This patch does not support names for custom mappers. This is
based on D89802. The names information currently will be printed out only in
debug mode or using env LIBOMPTARGET_INFO during execution. But the information
is added when available to the Device and Private data structures. To get the
information out, the code must be built with debug symbols on using -g or
-Rpass=openmp-opt

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D90172
2020-10-27 16:53:05 -04:00
Joseph Huber a87d7b3d44 [OpenMP] Add Passing in Original Declaration Names To Mapper API
Summary:
This patch adds support for passing in the original declaration name in the
source file to the libomptarget runtime. This will allow the runtime to provide
more intelligent debugging messages. This patch takes the original expression
parsed from the OpenMP map / update clause and provides a textual
representation if it was explicitly mapped, otherwise it takes the name of the
variable declaration as a fallback. The information is passed to the runtime in
a global array of strings that matches the existing ident_t source location
strings using ";name;filename;column;row;;". See
clang/test/OpenMP/target_map_names.cpp for an example of the generated output
for a given map clause.

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D89802
2020-10-27 16:09:19 -04:00
Shilei Tian e20d64c3d9 [Clang][OpenMP] Fixed an issue of segment fault when using target nowait
The implementation of target nowait just wraps the target region into a task. The essential four parameters (base ptr, ptr, size, mapper) are taken as firstprivate such that they will be copied to the private location. When there is no user-defined mapper, the mapper variable will be nullptr. However, it will be still copied to the corresponding place. Therefore, a memcpy will be generated and the source pointer will be nullptr, causing a segmentation fault. The root cause is when calling `emitOffloadingArraysArgument`, the last argument `Options` has a field about whether it requires a task. It only takes depend clause into account. In this patch, the nowait clause is also included.

Two things will be done in other patches:
1. target data nowait has not been supported yet. D90099 added the support.
2. When there is no mapper, the mapper array can be nullptr no matter whether it requires outer task or not. It can avoid an unnecessary data copy. This is an optimization that is covered in D90101.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89844
2020-10-26 22:33:22 -04:00
AndreyChurbanov d6a0957467 [OpenMP] changing OMP rtl to use shared memory instead of env variable
Patch by Erdner, Todd <todd.erdner@intel.com>

Differential Revision: https://reviews.llvm.org/D89898
2020-10-26 19:02:21 +03:00
Shilei Tian 3091ed099f [OpenMP] Fixed a potential integer overflow
`size_t` has different width on 32- and 64-bit architecture, but the
computation to floor to power of two assumed it is 64-bit, which can cause an
integer overflow. In this patch, architecture detection is added so that the
operation for 64-bit `size_t` is used only where `size_t` is 64 bits wide. Thanks to Luke for reporting the issue.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89878
2020-10-22 21:22:19 -04:00
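
A sketch of a width-aware floor-to-power-of-two, assuming the problem is the final shift by 32, which is not valid when `size_t` is only 32 bits wide. `floor_pow2` is a hypothetical name, and using `SIZE_MAX` for the width check is this sketch's choice, not necessarily the detection used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: largest power of two that is <= n (0 for n == 0).
   The bits below the highest set bit are filled in, then everything except
   the highest bit is subtracted away. */
size_t floor_pow2(size_t n) {
  n |= n >> 1;
  n |= n >> 2;
  n |= n >> 4;
  n |= n >> 8;
  n |= n >> 16;
#if SIZE_MAX > 0xFFFFFFFFu
  n |= n >> 32; /* compiled only when size_t is wider than 32 bits */
#endif
  return n - (n >> 1);
}
```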
Jon Chesterfield 26790ed248 [libomptarget] Require LLVM source tree to build libomptarget
This is to permit reliably #including files from the LLVM tree in libomptarget,
as an improvement on the copy and paste that is currently in use. See D87841
for the first example of removing duplication given this new requirement.

The weekly openmp dev call reached consensus on this approach. See also D87841
for some alternatives that were considered. In the future, we may want to
introduce a new top level repo for shared constants, or start using the ADT
library within openmp.

This will break sufficiently exotic build systems, trivial fixes as below.

Building libomptarget as part of the monorepo will continue to work.
If openmp is built separately, it now requires a cmake macro indicating
where to find the LLVM source tree.

If openmp is built separately, without the llvm source tree already on disk,
the build machine will need a copy of a subset of the llvm source tree and
the cmake macro indicating where it is.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D89426
2020-10-21 18:53:00 +01:00
JonChesterfield 55dc123555 [libomptarget][amdgcn] Refactor memcpy to eliminate maps
Builds on D89776 to remove now dead code.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D89888
2020-10-21 16:59:33 +01:00
Pushpinder Singh aa616efbb3 [libomptarget][AMDGPU][NFC] Split atmi_memcpy for h2d and d2h
The calls to atmi_memcpy presently determine the direction of copy (host to
device or device to host) by storing pointers in a map during malloc and
looking up the pointers during memcpy. As each call site already knows the
direction, this stash+lookup can be eliminated.

This NFC will be followed by a functional one that deletes those map lookups.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D89776

Change-Id: I1d9089bc1e56b3a9a30e334735fa07dee1f84990
2020-10-20 06:29:32 -04:00
Jon Chesterfield d27b39ce11 [libomptarget][amdgcn] Implement missing symbols in deviceRTL
Malloc, wtime are stubs. Malloc needs a hostrpc implementation which is
a work in progress, wtime needs some experimentation to find out the
multiplier to get a time in seconds as documentation is scarce.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D89725
2020-10-20 00:24:15 +01:00
George Rokos 5adb3a6d86 [libomptarget] Fix copy-to motion for PTR_AND_OBJ entries where PTR is a struct member.
This patch fixes a problem whereby the pointee object of a PTR_AND_OBJ entry with a `map(to)` motion clause can be overwritten on the device even if its reference counter is >=1.

Currently, we check the reference counter of the parent struct in order to determine whether the motion clause should be respected, but since the pointee object is not part of the struct, it's got its own reference counter which should be used to enqueue the copy or discard it.

The same behavior has already been implemented in targetDataEnd (omptarget.cpp:539-540), but we somehow missed doing the same in targetDataBegin.

Differential Revision: https://reviews.llvm.org/D89597
2020-10-16 16:14:01 -07:00
JonChesterfield 7d2ecef5ed [openmp][libomptarget] Include header from LLVM source tree
The change is to the amdgpu plugin so is unlikely to break anything.

The point of contention is whether libomptarget can depend on LLVM.
A community discussion was cautiously not opposed yesterday.

This introduces a compile time dependency on the LLVM source tree, in this case
expressed as skipping the building of the plugin if LLVM_MAIN_INCLUDE_DIR is not
set. One of the source files will #include llvm/Frontend/OpenMP/OMPGridValues.h,
instead of copy&pasting the numbers across.

For users that download the monorepo, the llvm tree is already on disk. This will
inconvenience users who download only the openmp source as a tar, as they would
now also have to download (at least a file or two) from the llvm source, if they want
to build the parts of the openmp project that (post this patch) depend on llvm.

There was interest expressed in going further - using llvm tools as part of
building libomp, or linking against llvm libraries. That seems less clear cut
an improvement and worthy of further discussion. This patch seeks only to change
policy to support openmp depending on the llvm source tree. Including in the
other direction, or using libraries / tools etc, are purposefully out of scope.

Reviewers are a best guess at interested parties, please feel free to add others

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87841
2020-10-15 15:46:19 +01:00
JonChesterfield 8b6cd15242 [libomptarget][amdgcn] Implement partial barrier
named_sync is used to coordinate non-spmd kernels. This uses bar.sync on nvptx.
There is no corresponding ISA support on amdgcn, so this is implemented using
shared memory, one word initialized to zero.

Each wave increments the variable by one. Whichever wave is last is responsible
for resetting the variable to zero, at which point it and the others continue.

The race condition on a wave reaching the barrier before another wave has
noticed that it has been released is handled with a generation counter, packed
into the same word.

Uses a shared variable that is not needed on nvptx. Introduces a new hook,
kmpc_impl_target_init, to allow different targets to do extra initialization.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D88602
2020-10-12 21:27:32 +01:00
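
The counter-plus-generation scheme described in this commit can be sketched on the host with C11 atomics. This is not the amdgcn device code: the function name `partial_barrier`, the 16/16 bit split, and the spin loop are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

#define COUNT_BITS 16u
#define COUNT_MASK ((1u << COUNT_BITS) - 1u)

/* Hypothetical sketch: the low 16 bits of `word` count arrivals and the high
   16 bits hold a generation number. `word` must start at zero. */
void partial_barrier(_Atomic uint32_t *word, uint32_t participants) {
  uint32_t old = atomic_fetch_add(word, 1u);
  uint32_t generation = old >> COUNT_BITS;
  uint32_t arrived = (old & COUNT_MASK) + 1u;
  if (arrived == participants) {
    /* Last arrival: reset the count to zero and advance the generation in a
       single store, which releases the waiters. */
    atomic_store(word, (generation + 1u) << COUNT_BITS);
  } else {
    /* Wait for the generation to change. A participant that re-enters the
       barrier early increments the new generation's count, so it cannot be
       mistaken for an arrival at the old one. */
    while ((atomic_load(word) >> COUNT_BITS) == generation) {
    }
  }
}
```

Packing both fields into one word lets the last arrival publish the reset and the release together.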
Joseph Huber d564409946 [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default
Summary:
This patch changes the CMake files for Clang and Libomptarget to query the
system for its supported CUDA architecture. This makes it much easier for the
user to build optimal code without needing to set the flags manually. This
relies on the now deprecated FindCUDA method in CMake, but full support for
architecture detection is only available in CMake >3.18

Reviewers: jdoerfert ye-luo

Subscribers: cfe-commits guansong mgorny openmp-commits sstefan1 yaxunl

Tags: #clang #OpenMP

Differential Revision: https://reviews.llvm.org/D87946
2020-10-08 12:09:34 -04:00
Pushpinder Singh 3a12ff0dac [OpenMP][RTL] Remove dead code
RequiresDataSharing was always 0, resulting in dead code in the device runtime library.

Reviewed By: jdoerfert, JonChesterfield

Differential Revision: https://reviews.llvm.org/D88829
2020-10-06 05:43:47 -04:00
Joachim Protze 69f87400a8 [OpenMP][Archer][Tests] NFC: fix spurious test failure
The test disables suppression and therefore sometimes triggers a known false
positive in the openmp runtime. The test should only verify that the env
var is handled as expected.
2020-10-06 00:26:08 +02:00
Joachim Protze 34b34e90fc [OpenMP][Tests] NFC: fix flaky test failure caused by rare scheduling
The worker thread can start execution of the task before creation of the second task
Fixes the spurious failure reported in https://reviews.llvm.org/D61657
2020-10-05 16:55:32 +02:00
Joachim Protze 23419bfd1c [OpenMP][libarcher] Allow all possible argument separators in TSAN_OPTIONS
Currently, the parser used to tokenize the TSAN_OPTIONS in libomp uses
only spaces as separators, even though TSAN in compiler-rt supports
other separators like ':' or ','.
CTest uses ':' to separate sanitizer options by default.
The documentation for other sanitizers mentions ':' as separator,
but TSAN only lists spaces, which is probably where this mismatch originated.
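
A tokenizer that accepts several separators can be sketched as follows. This is
not the libarcher parser: `tokenize_options` is a hypothetical helper, and the
separator set (space, ':' and ',') is limited to the characters named in this
message, whereas TSAN itself may accept more.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split an options string on any of the assumed separators: ' ', ':' or ','.
// Runs of separators are skipped, so no empty tokens are produced.
std::vector<std::string> tokenize_options(const std::string &options) {
  const std::string separators = " :,";
  std::vector<std::string> tokens;
  std::size_t start = options.find_first_not_of(separators);
  while (start != std::string::npos) {
    std::size_t end = options.find_first_of(separators, start);
    tokens.push_back(options.substr(start, end - start));
    start = options.find_first_not_of(separators, end);
  }
  return tokens;
}
```

With this helper, `"a=1:b=2,c=3 d=4"` splits into four `key=value` tokens
regardless of which separator the caller used.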

Patch provided by upsj

Differential Revision: https://reviews.llvm.org/D87144
2020-10-01 01:10:13 +02:00
Joachim Protze 6104b30446 [OpenMP][OMPT] Update OMPT tests for newly added GOMP interface patches
This patch updates the expected results for the GOMP interface patches: D87267, D87269, and D87271.
The taskwait-depend test is changed to really use taskwait-depend and copied to a task_if0-depend test.

To pass the tests, the handling of the return address was fixed.

Differential Revision: https://reviews.llvm.org/D87680
2020-10-01 00:53:41 +02:00
Joachim Protze 55cff5b288 [OpenMP][libomptarget] make omp_get_initial_device 5.1 compliant
OpenMP 5.1 defines omp_get_initial_device to return the same value as omp_get_num_devices.
Since this change is also 5.0 compliant, no versioning is needed.

Differential Revision: https://reviews.llvm.org/D88149
2020-10-01 00:51:11 +02:00
JonChesterfield d256797c90 [nfc][libomptarget] Drop parameter to named_sync

named_sync has one call site (in sync.cu) where it always passed L1_BARRIER.
Folding this into the call site and dropping the macro is a simplification.

amdgpu doesn't have ptx' bar.sync instruction. A correct implementation of
__kmpc_impl_named_sync in terms of shared memory is much easier if it can
assume that the barrier argument is this constant. Said implementation is left
for a second patch.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D88474
2020-09-29 23:12:21 +01:00
Manoel Roemmer c816ee13ad [OpenMP][VE plugin] Fixing failure to build VE plugin with consolidated error handling in libomptarget
The libomptarget VE plugin [[
http://lab.llvm.org:8014/builders/clang-ve-ninja/builds/8937/steps/build-unified-tree/logs/stdio
| fails to build ]] after ae95ceeb8f.

Differential Revision: https://reviews.llvm.org/D88476
2020-09-29 17:38:01 +02:00
Joseph Huber 0103df7903 [OpenMP] Add Missing _static Director for OpenMP Documentation
Summary:
Adding a missing directory needed for generating Sphinx documentation without
errors. The directory currently contains a placeholder image just to populate the
directory.
2020-09-27 15:35:47 -04:00
Ye Luo ffd159d8e9 [OpenMP] cmake option LIBOMPTARGET_NVPTX_MAX_SM for nvptx device RTL
It allows customizing MAX_SM for non-flagship GPU and reduces graphic memory usage.

In addition, so far the size is hard-coded up to __CUDA_ARCH__ 700 and is already a hassle for 800.
Introduce MAX_SM for 800 and protect future arch

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D88185
2020-09-24 12:39:59 -04:00
Peyton, Jonathan L ee1c04a926 [OpenMP] Fix if0 task with dependencies in the runtime
The current GOMP interface for serialized tasks does not take into
account task dependencies. Add the check and wait for dependencies.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=46573

Differential Revision: https://reviews.llvm.org/D87271
2020-09-24 09:47:53 -05:00
Peyton, Jonathan L 9089b4a5c5 [OpenMP] Introduce GOMP taskwait depend in the runtime
This change introduces the GOMP_taskwait_depend() function. It implements
the OpenMP 5.0 feature of #pragma omp taskwait with depend() clause by
wrapping around __kmpc_omp_wait_deps().

Differential Revision: https://reviews.llvm.org/D87269
2020-09-24 09:45:14 -05:00
Peyton, Jonathan L 72ada5ae6c [OpenMP] Introduce GOMP mutexinoutset in the runtime
Encapsulate GOMP task dependencies in separate class and introduce the
new mutexinoutset dependency type. This separate class allows
future GOMP task APIs easier access to the task dependency functionality
and better ability to propagate new dependency types to all existing GOMP
task APIs which use task dependencies.

Differential Revision: https://reviews.llvm.org/D87267
2020-09-24 09:45:13 -05:00
Peyton, Jonathan L ea34d95e0a [OpenMP] Introduce GOMP teams support in runtime
Implement GOMP_teams_reg() function which enables GOMP support of the
standalone teams construct. The GOMP_parallel* functions were modified
to call __kmp_fork_call() unconditionally so that the teams-specific
code could be reused within __kmp_fork_call() instead of reproduced
inside the GOMP_* functions.

Differential Revision: https://reviews.llvm.org/D87167
2020-09-24 09:45:13 -05:00
Ye Luo 03111e5e7a [OpenMP] Protect unrecogonized CUDA error code
If an error code cannot be recognized by cuGetErrorString, errStr remains null and causes a crash when DP() prints it.
Protect against this case.

Reviewed By: jhuber6, tianshilei1992

Differential Revision: https://reviews.llvm.org/D87980
2020-09-21 13:43:08 -04:00
Joseph Huber 1c4c21489f [OpenMP] Initial Support for OpenMP Webpage Documentation
Summary:
Adding support for generated html documentation for OpenMP. Changing
Cmake files to build the documentation and adding the base templates for
future documentation to be added.

Reviewers: jdoerfert

Subscribers: aaron.ballman arphaman guansong mgorny openmp-commits sstefan1 yaxunl

Tags: #OpenMP

Differential Revision: https://reviews.llvm.org/D87797
2020-09-18 16:32:22 -04:00
JonChesterfield a9be2b5cb2 [libomptarget] Disable build of amdgpu plugin as it doesn't build with rocm. 2020-09-18 18:10:27 +01:00
Joseph Huber c3e6054b07 [OpenMP] Additional Information for Libomptarget Mappings
Summary:
This patch adds additional support for printing information from Libomptarget for
already existing maps and printing the final data mapped on the device at
device destruction.

Reviewers: jdoerfort gkistanova

Subscribers: guansong openmp-commits sstefan1 yaxunl

Tags: #OpenMP

Differential Revision: https://reviews.llvm.org/D87722
2020-09-15 18:12:57 -04:00
Raul Tambre c42f96cb23 [CMake][OpenMP] Simplify getting CUDA library directory
LLVM now requires CMake 3.13.4 so we can simplify this.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D87195
2020-09-11 21:19:11 +03:00
Joseph Huber ae209397b1 [OpenMP] Begin Printing Information Dumps In Libomptarget and Plugins
Summary:
This patch starts adding support for adding information dumps to libomptarget
and rtl plugins. The information printing is controlled by the
LIBOMPTARGET_INFO environment variable introduced in D86483. The goal of this
patch is to provide the user with additional information about the device
during kernel execution and providing the user with information dumps in the
case of failure. This patch added the ability to dump the pointer mapping table
as well as printing the number of blocks and threads in the cuda RTL.

Reviewers: jdoerfort gkistanova	ye-luo

Subscribers: guansong openmp-commits sstefan1 yaxunl ye-luo

Tags: #OpenMP

Differential Revision: https://reviews.llvm.org/D87165
2020-09-09 12:03:56 -04:00
Pushpinder Singh 7634c64b61 [OpenMP][AMDGPU] Use DS_Max_Warp_Number instead of WARPSIZE
The size of worker_rootS should have been DS_Max_Warp_Number.
This reduces memory usage by deviceRTL on AMDGPU from around 2.3GB
to around 770MB.

Reviewed By: JonChesterfield, jdoerfert

Differential Revision: https://reviews.llvm.org/D87084
2020-09-07 05:15:21 -04:00
Raul Tambre 21c0e74c9e [CMake][OpenMP] Remove old dead CMake code
LLVM requires CMake 3.13.4 so remove code behind checks for an older version.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D87191
2020-09-07 10:56:56 +03:00
Joseph Huber ae95ceeb8f [OpenMP] Consolidate error handling and debug messages in Libomptarget
Summary:

This patch consolidates the error handling and messaging routines to a single
file omptargetmessage. The goal is to simplify the error handling interface
prior to adding more error handling support

Reviewers: jdoerfert grokos ABataev AndreyChurbanov ronlieb JonChesterfield ye-luo tianshilei1992

Subscribers: danielkiss guansong jvesely kerbowa nhaehnle openmp-commits sstefan1 yaxunl
2020-09-01 15:28:19 -04:00
Alexey Bataev 6aa7228a62 [LIBOMPTARGET]Do not try to optimize bases for the next parameters.
PrivateArgumentManager shall immediately allocate firstprivates if they
are bases for the next parameters and the next parameters rely on the
fact that the base must be allocated already.

Differential Revision: https://reviews.llvm.org/D86781
2020-08-28 15:46:31 -04:00
Shilei Tian 46e0ced762 [OpenMP] Fixed wrong test command in the test private_mapping.c
The test command in `private_mapping.c` was set to expect failure by mistake. It is fixed in this patch.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D86758
2020-08-28 12:19:46 -04:00
Joseph Huber 7a5a74ea96 [OpenMP] Always emit debug messages that indicate offloading failure
Summary:

This patch changes the libomptarget runtime to always emit debug messages that
occur before offloading failure. The goal is to provide users with information
about why their application failed in the target region rather than a single
failure message. This is only done in regions that precede offloading failure
so this should not impact runtime performance. If the debug environment
variable is set then the message is forwarded to the debug output as usual.

A new environment variable was added for future use but does nothing in this
current patch. LIBOMPTARGET_INFO will be used to report runtime information to
the user if requested, such as grid size, SPMD usage, or data mapping. It will
take an integer indicating the level of information verbosity and a value of 0
will disable it.

Reviewers: jdoerfort

Subscribers: guansong sstefan1 yaxunl ye-luo

Tags: #OpenMP

Differential Revision: https://reviews.llvm.org/D86483
2020-08-26 19:30:41 -04:00
JonChesterfield 5d989fb37d [libomptarget][amdgpu] Improve thread safety, remove dead code 2020-08-26 22:04:03 +01:00
Jon Chesterfield 28fbf422f2 [libomptarget][amdgpu] Update plugin CMake to work with latest rocr library 2020-08-26 20:01:42 +01:00
AndreyChurbanov 1596ea80fd [OpenMP] Fix import library installation with MinGW
Patch by mati865@gmail.com

Differential Revision: https://reviews.llvm.org/D86552
2020-08-26 21:56:01 +03:00
AndreyChurbanov 09af378f49 [OpenMP] Fix build on macOS sdk 10.12 and newer
Patch by nihui (Ni Hui)

Differential Revision: https://reviews.llvm.org/D76755
2020-08-26 16:52:46 +03:00
Shilei Tian 0775c1dfbc [OpenMP] Pack first-private arguments to improve efficiency of data transfer
In this patch, we pack all small first-private arguments, then allocate and transfer them all at once to reduce the number of data transfers, which are very expensive.

Let's take the test case as example.
```
int main() {
  int data1[3] = {1}, data2[3] = {2}, data3[3] = {3};
  int sum[16] = {0};
#pragma omp target teams distribute parallel for map(tofrom: sum) firstprivate(data1, data2, data3)
  for (int i = 0; i < 16; ++i) {
    for (int j = 0; j < 3; ++j) {
      sum[i] += data1[j];
      sum[i] += data2[j];
      sum[i] += data3[j];
    }
  }
}
```
Here `data1`, `data2`, and `data3` are three first-private arguments of the target region. In the previous `libomptarget`, it called data allocation and data transfer three times, each of which allocated and transferred 12 bytes. With this patch, it only calls allocation and transfer once. The size is `(12+4)*3=48` where 12 is the size of each array and 4 is the padding to keep the address aligned with 8. It is implemented in this way:
1. First collect all information for those *first*-private arguments. _private_ arguments are not the case because private arguments don't need to be mapped to target device. It just needs a data allocation. With the patch for memory manager, the data allocation could be very cheap, especially for the small size. For each qualified argument, push a place holder pointer `nullptr` to the `vector` for kernel arguments, and we will update them later.
2. After we have all information, create a buffer that can accommodate all arguments plus their paddings. Copy the arguments to the buffer at the right place, i.e. aligned address.
3. Allocate a target memory with the same size as the host buffer, transfer the host buffer to target device, and finally update all place holder pointers in the arguments `vector`.

Only small arguments are packed because data transfer is asynchronous: for a large argument, the host can continue working while, hopefully, the data is being transferred. An argument is "small" if its size is less than a predefined value, currently 1024. I'm not sure whether that is a good value, and that is an open question. Another question is whether it should be configurable via an environment variable.
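
The packing arithmetic from steps 1 and 2 can be sketched on the host as follows. The names `HostArg`, `PackedArgs` and `pack_first_privates` are hypothetical and do not come from `libomptarget`; the sketch only shows how each argument is padded to an 8-byte boundary and copied into one staging buffer, and it leaves out the device allocation and transfer of step 3.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical description of one first-private argument on the host.
struct HostArg {
  const void *data;
  std::size_t size;
};

// Alignment taken from the description above (addresses aligned to 8).
constexpr std::size_t kAlignment = 8;

struct PackedArgs {
  std::vector<char> buffer;         // single staging buffer for all arguments
  std::vector<std::size_t> offsets; // start of each argument inside the buffer
};

PackedArgs pack_first_privates(const std::vector<HostArg> &args) {
  PackedArgs packed;
  std::size_t total = 0;
  for (const HostArg &arg : args) {
    packed.offsets.push_back(total);
    // Round each size up to the next multiple of the alignment.
    total += (arg.size + kAlignment - 1) / kAlignment * kAlignment;
  }
  packed.buffer.resize(total); // padding bytes are zero-initialized
  for (std::size_t i = 0; i < args.size(); ++i)
    if (args[i].size != 0)
      std::memcpy(packed.buffer.data() + packed.offsets[i], args[i].data,
                  args[i].size);
  return packed;
}
```

For three 12-byte arrays this yields offsets 0, 16 and 32 and a 48-byte buffer, matching `(12+4)*3=48`. After a single allocation and transfer, each placeholder pointer would become the device base address plus the corresponding offset.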

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D86307
2020-08-25 16:06:29 -04:00
Dimitry Andric 47b0262d3f Add <stdarg.h> include to kmp_os.h, to get the va_list type, required
after cde8f4c164. Sort system includes, while here.
2020-08-24 22:45:02 +02:00
Dimitry Andric cde8f4c164 Move special va_list handling to kmp_os.h
Instead of copying and pasting the same `#ifdef` expressions in multiple
places, define a type and a pair of macros in `kmp_os.h`, to handle
whether `va_list` is pointer-like or not:

* `kmp_va_list` is the type to use for `__kmp_fork_call()`
* `kmp_va_deref()` dereferences a `va_list`, if necessary
* `kmp_va_addr_of()` takes the address of a `va_list`, if necessary

Also add FreeBSD to the list of OSes that has a non pointer-like
va_list. This can now be easily extended to other OSes too.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D86397
2020-08-24 22:31:56 +02:00
AndreyChurbanov d0f4f5a182 [OpenMP] Check if _MSC_VER is defined before using it
Patch by mati865@gmail.com

Differential Revision: https://reviews.llvm.org/D86448
2020-08-24 17:50:38 +03:00
Shilei Tian f93b42a629 [NFC][OpenMP] Remove outdated comments about potential issues
The issue mentioned has been fixed in D84996
2020-08-24 01:21:06 +00:00
Shilei Tian 0289696751 [OpenMP] Introduce target memory manager
The target memory manager introduced in this patch manages target memory so
that buffers are not freed immediately when they are no longer used, because
the overhead of memory allocation and free is very large. For CUDA devices,
cuMemFree even blocks the context switch on the device, which affects
concurrent kernel execution.

The memory manager can be taken as a memory pool. It divides the pool into
multiple buckets according to the size such that memory allocation/free
distributed to different buckets will not affect each other.

In this version, we use the exact-equality policy to find a free buffer. This
is an open question: will best-fit work better here? IMO, best-fit is not good
for target memory management because computation on GPU usually requires GBs of
data. Best-fit might lead to a serious waste. For example, there is a free
buffer of size 1960MB, and now we need a buffer of size 1200MB. If best-fit,
the free buffer will be returned, leading to a 760MB waste.

The allocation will happen when there is no free memory left, and the memory
free on device will take place in the following two cases:
1. The program ends. Obviously. However, there is a little problem that plugin
library is destroyed before the memory manager is destroyed, leading to a fact
that the call to target plugin will not succeed.
2. Device is out of memory when we request a new memory. The manager will walk
through all free buffers from the bucket with largest base size, pick up one
buffer, free it, and try to allocate immediately. If it succeeds, it will
return right away rather than freeing all buffers in free list.

Update:
A threshold (8KB by default) is set such that users could control what size of memory
will be managed by the manager. It can also be configured by an environment variable
`LIBOMPTARGET_MEMORY_MANAGER_THRESHOLD`.
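
The policy described above can be sketched with a small host-side pool. This is
not the libomptarget implementation: the class `MemoryPool` is hypothetical,
`std::malloc`/`std::free` stand in for the device allocator, the buckets are
simplified to one free list per exact size, and the sketch assumes that only
requests at or below the threshold are managed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

class MemoryPool {
public:
  explicit MemoryPool(std::size_t threshold = 8 * 1024)
      : threshold_(threshold) {}
  MemoryPool(const MemoryPool &) = delete;
  MemoryPool &operator=(const MemoryPool &) = delete;
  ~MemoryPool() {
    for (auto &bucket : free_)
      for (void *ptr : bucket.second)
        std::free(ptr);
  }

  void *allocate(std::size_t size) {
    if (size > threshold_) // too large: bypass the pool
      return std::malloc(size);
    auto it = free_.find(size); // exact-equality policy
    if (it != free_.end() && !it->second.empty()) {
      void *ptr = it->second.back();
      it->second.pop_back();
      return ptr;
    }
    void *ptr = std::malloc(size);
    // Out of memory: release cached buffers starting from the largest size,
    // retrying after each release and stopping at the first success.
    for (auto b = free_.rbegin(); ptr == nullptr && b != free_.rend(); ++b) {
      while (ptr == nullptr && !b->second.empty()) {
        std::free(b->second.back());
        b->second.pop_back();
        ptr = std::malloc(size);
      }
    }
    return ptr;
  }

  void deallocate(void *ptr, std::size_t size) {
    if (size > threshold_) {
      std::free(ptr);
      return;
    }
    free_[size].push_back(ptr); // keep the buffer for reuse
  }

  std::size_t cached() const {
    std::size_t n = 0;
    for (const auto &bucket : free_)
      n += bucket.second.size();
    return n;
  }

private:
  std::size_t threshold_;
  std::map<std::size_t, std::vector<void *>> free_;
};
```

Freeing a 256-byte buffer and then requesting 256 bytes again returns the same
pointer without touching the underlying allocator, while a 16KB request is
passed straight through.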

Reviewed By: jdoerfert, ye-luo, JonChesterfield

Differential Revision: https://reviews.llvm.org/D81054
2020-08-19 23:12:23 -04:00
Shilei Tian 83c3d07994 [OpenMP] Refactored the function `DeviceTy::data_exchange`
This patch contains the following changes:
1. Renamed the function `DeviceTy::data_exchange` to `DeviceTy::dataExchange`;
2. Changed the second argument `DeviceTy DstDev` to `DeviceTy &DstDev`;
3. Renamed the last argument.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D86238
2020-08-19 16:08:14 -04:00
Jon Chesterfield 6e1b11087f [libomptarget][amdgpu] Support building with static rocm libraries 2020-08-19 15:44:30 +01:00
George Rokos 32ebdc70f3 [libomptarget][NFC] Sort list of plugins in chronological order
Differential Revision: https://reviews.llvm.org/D86082
2020-08-17 08:33:36 -07:00
Johannes Doerfert 5272d29e2c [OpenMP][CUDA] Keep one kernel list per device, not globally.
Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D86039
2020-08-16 14:38:35 -05:00
Johannes Doerfert aa27cfc1e7 [OpenMP][CUDA] Cache the maximal number of threads per block (per kernel)
Instead of calling `cuFuncGetAttribute` with
`CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK` for every kernel invocation,
we can do it for the first one and cache the result as part of the
`KernelInfo` struct. The only functional change is that we now expect
`cuFuncGetAttribute` to succeed and otherwise propagate the error.
Ignoring any error seems like a slippery slope...

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D86038
2020-08-16 14:38:33 -05:00
Jon Chesterfield d0b312955f [libomptarget] Implement host plugin for amdgpu

Replacement for D71384. Primary difference is inlining the dependency on atmi
followed by extensive simplification and bugfixes. This is the latest version
from https://github.com/ROCm-Developer-Tools/amd-llvm-project/tree/aomp12 with
minor patches and a rename from hsa to amdgpu, on the basis that this can't be
used by other implementations of hsa without additional work.

This will not build unless the ROCM_DIR variable is passed so won't break other
builds. That variable is used to locate two amdgpu specific libraries that ship
as part of rocm:
libhsakmt at https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface
libhsa-runtime64 at https://github.com/RadeonOpenCompute/ROCR-Runtime
These libraries build from source. The build scripts in those repos are for
shared libraries, but can be adapted to statically link both into this plugin.

There are caveats.
- This works well enough to run various tests and benchmarks, and will be used
  to support the current clang bring up
- It is adequately thread safe for the above but there will be races remaining
- It is not stylistically correct for llvm, though has had clang-format run
- It has suboptimal memory management and locking strategies
- The debug printing / error handling is inconsistent

I would like to contribute this pretty much as-is and then improve it in-tree.
This would be advantageous because the aomp12 branch that was in use for fixing
this codebase has just been joined with the amd internal rocm dev process.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85742
2020-08-15 23:58:28 +01:00
Joachim Protze 66a3575c28 [OpenMP] Fix releasing of stack memory
Starting with 787eb0c637 I got spurious segmentation faults for some testcases. I could nail it down to `brel` trying to release the "memory" of the node allocated on the stack of __kmpc_omp_wait_deps. With this patch, you will see the assertion triggering for some of the tests in the test suite.

My proposed solution for the issue is to just patch __kmpc_omp_wait_deps:
```
  __kmp_init_node(&node);
-  node.dn.on_stack = 1;
+  // the stack owns the node
+  __kmp_node_ref(&node);
```

What do you think?

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D84472
2020-08-14 10:32:53 +02:00
Joel E. Denny 518a27e559 [OpenMP] Fix ref count dec for implicit map of partial data
D85342 broke this case.  The new test case presents an example.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D85369
2020-08-06 11:39:29 -04:00
Joel E. Denny 8c8bb128df [OpenMP] Fix `target data` exit for array extension
For example:

```
 #pragma omp target data map(tofrom:arr[0:100])
 {
   #pragma omp target exit data map(delete:arr[0:100])
   #pragma omp target enter data map(alloc:arr[98:2])
 }
```

Without this patch, the transfer at the end of the target data region
is broken and fails depending on the target device.  According to my
read of the spec, the transfer shouldn't even be attempted because
`arr[0:100]` isn't (fully) present there.  To fix that, this patch
makes `DeviceTy::getTgtPtrBegin` return null for this case.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D85342
2020-08-05 16:51:25 -04:00
Joel E. Denny 41b1aefecb [OpenMP] Fix `present` diagnostic for array extension
For example, without this patch, the following fails as expected with
or without the `present` modifier, but the `present` modifier doesn't
produce its usual diagnostic:

```
 #pragma omp target data map(alloc: arr[0:2])
 {
   #pragma omp target map(present, tofrom: arr[0:100]) // not fully present
   ;
 }
```

Reviewed By: grokos, vzakhari

Differential Revision: https://reviews.llvm.org/D85320
2020-08-05 16:51:24 -04:00
George Rokos 40470eb27a [libomptarget][NFC] Replace `%ld` with PRId64 for data of type int64_t.
The standard way of printing `int64_t` data is via the PRId64 macro, `ld`
is for `long int` and int64_t is not guaranteed to be typedef'ed as `long int`
on all platforms. E.g. on Windows we get mismatch warnings.
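
A minimal illustration of the portable form follows; `format_size` is a
hypothetical helper and not part of libomptarget.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// PRId64 expands to the correct conversion specifier for int64_t on the
// current platform, so the format string never mismatches the argument.
std::string format_size(std::int64_t size) {
  char buffer[64];
  std::snprintf(buffer, sizeof(buffer), "size=%" PRId64, size);
  return buffer;
}
```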

Differential Revision: https://reviews.llvm.org/D85353
2020-08-05 13:28:35 -07:00
Alexey Bataev 6780d5675b [LIBOMPTARGET]Fix order of mapper data for targetDataEnd function.
targetDataMapper function fills arrays with the mapping data in the
direct order. When this function is called by targetDataBegin or
tgt_target_update functions, it works as expected. But targetDataEnd
function processes mapped data in reverse order. In this case, the base
pointer might be deleted before the associated data is deleted. Need to
reverse data, mapped by mapper, too, since it always adds data that must
be deleted at the end of the buffer.
Fixes the test declare_mapper_target_update.cpp.
Also, reduces the memory fragmentation by preallocating the memory
buffers.

Differential Revision: https://reviews.llvm.org/D85216
2020-08-05 13:42:24 -04:00
Joel E. Denny 5ab43989c3 [OpenMP] Fix `omp target update` for array extension
OpenMP TR8 sec. 2.15.6 "target update Construct", p. 183, L3-4 states:

> If the corresponding list item is not present in the device data
> environment and there is no present modifier in the clause, then no
> assignment occurs to or from the original list item.

L10-11 states:

> If a present modifier appears in the clause and the corresponding
> list item is not present in the device data environment then an
> error occurs and the program terminates.

(OpenMP 5.0 also has the first passage but without mention of the
present modifier of course.)

In both passages, I assume "is not present" includes the case of
partially but not entirely present.  However, without this patch, the
target update directive misbehaves in this case both with and without
the present modifier.  For example:

```
 #pragma omp target enter data map(to:arr[0:3])
 #pragma omp target update to(arr[0:5]) // might fail on data transfer
 #pragma omp target update to(present:arr[0:5]) // might fail on data transfer
```

The problem is that `DeviceTy::getTgtPtrBegin` does not return a null
pointer in that case, so `target_data_update` sees the data as fully
present, and the data transfer then might fail depending on the target
device.  However, without the present modifier, there should never be
a failure.  Moreover, with the present modifier, there should always
be a failure, and the diagnostic should mention the present modifier.

This patch fixes `DeviceTy::getTgtPtrBegin` to return null when
`target_data_update` is the caller.  I'm wondering if it should do the
same for more callers.
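
The intended distinction between fully and partially present data can be
sketched as a range-containment check. The `Mapping` struct and
`lookup_fully_present` below are hypothetical and much simpler than
`DeviceTy::getTgtPtrBegin`; they only show a lookup that returns null (0 here)
unless the whole requested range lies inside one existing mapping.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record of one host range mapped to the device.
struct Mapping {
  std::uintptr_t host_begin;   // first mapped host address
  std::uintptr_t host_end;     // one past the last mapped host address
  std::uintptr_t device_begin; // device address corresponding to host_begin
};

// Returns the device address for host_ptr, or 0 when [host_ptr, host_ptr+size)
// is absent or only partially covered by the mapping.
std::uintptr_t lookup_fully_present(const Mapping &m, std::uintptr_t host_ptr,
                                    std::size_t size) {
  bool contained = host_ptr >= m.host_begin && host_ptr <= m.host_end &&
                   size <= m.host_end - host_ptr;
  return contained ? m.device_begin + (host_ptr - m.host_begin) : 0;
}
```

With `arr[0:3]` mapped, a request for `arr[0:5]` extends past `host_end`, so the
lookup reports the data as not present instead of attempting a transfer.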

Reviewed By: grokos, jdoerfert

Differential Revision: https://reviews.llvm.org/D85246
2020-08-05 10:03:31 -04:00
Joel E. Denny 002d61db2b [OpenMP] Fix `present` for exit from `omp target data`
Without this patch, the following example fails but shouldn't
according to OpenMP TR8:

```
 #pragma omp target enter data map(alloc:i)
 #pragma omp target data map(present, alloc: i)
 {
   #pragma omp target exit data map(delete:i)
 } // fails presence check here
```

OpenMP TR8 sec. 2.22.7.1 "map Clause", p. 321, L23-26 states:

> If the map clause appears on a target, target data, target enter
> data or target exit data construct with a present map-type-modifier
> then on entry to the region if the corresponding list item does not
> appear in the device data environment an error occurs and the
> program terminates.

There is no corresponding statement about the exit from a region.
Thus, the `present` modifier should:

1. Check for presence upon entry into any region, including a `target
   exit data` region.  This behavior is already implemented correctly.

2. Not check for presence upon exit from any region, including
   a `target` or `target data` region.  Without this patch, this
   behavior is not implemented correctly, breaking the above example.

In the case of `target data`, this patch fixes the latter behavior by
removing the `present` modifier from the map types Clang generates for
the runtime call at the end of the region.

In the case of `target`, we have not found a valid OpenMP program for
which such a fix would matter.  It appears that, if a program can
guarantee that data is present at the beginning of a `target` region
so that there's no error there, that data is also guaranteed to be
present at the end.  This patch adds a comment to the runtime to
document this case.

Reviewed By: grokos, RaviNarayanaswamy, ABataev

Differential Revision: https://reviews.llvm.org/D84422
2020-08-05 10:03:31 -04:00
Adrian Pop bf2aa74e51 [OpenMP] support build on msys2/mingw with clang or gcc
RTM Adaptive Locks are supported on msys2/mingw for clang and gcc.

Differential Revision: https://reviews.llvm.org/D81776
2020-08-04 23:15:36 +03:00
AndreyChurbanov 4a04bc8995 [OpenMP] Don't use MSVC workaround with MinGW
Patch by mati865@gmail.com

Differential Revision: https://reviews.llvm.org/D85210
2020-08-04 18:48:25 +03:00
David Blaikie 0c938a8dd8 OpenMP: Fix typo variabls -> variables 2020-08-03 17:00:15 -07:00
Shilei Tian f2400f024d [OpenMP] Fixed the issue that target memory deallocation might be called when they're being used
This patch fixes the issue that target memory might be deallocated while
it is still being used or before it is used.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D84996
2020-07-31 18:54:18 -04:00
Joachim Protze 03116a9f8c [OpenMP] Use weak attribute in interface only for static library
This is to address the issue reported at:
https://bugs.llvm.org/show_bug.cgi?id=46863

Since weak is meaningless for a shared library interface function, this patch
disables the attribute, when the OpenMP library is built as shared library.

ompt_start_tool is not an interface function, but an internally called function
possibly implemented by an OMPT tool.
This function needs to be weak if possible to allow overwriting ompt_start_tool
with a function implementation built into the application.

Differential Revision: https://reviews.llvm.org/D84871
2020-07-31 12:29:05 +02:00
Shilei Tian 0f10165626 [OpenMP] Refactored the function `targetDataEnd`
Refactored the function `targetDataEnd` in preparation for fixing
the issue of ahead-of-time target memory deallocation. This patch only
renamed `targetDataEnd` related variables and functions to conform
with LLVM code standard.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D84991
2020-07-30 21:39:26 -04:00
Shilei Tian 8218eee269 [OpenMP] Refactored the function `target`
Refactored the function `target` in preparation for fixing the
issue of ahead-of-time device memory deallocation.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D84816
2020-07-30 21:05:55 -04:00
Alexey Bataev 622e46156d [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.
Need to map the base pointer for all directives, not only target
data-based ones.
The base pointer is mapped for array sections, array subscript, array
shaping and other array-like constructs with the base pointer. Also,
codegen for use_device_ptr clause was modified to correctly handle
mapping combination of array like constructs + use_device_ptr clause.
The data for use_device_ptr clause is emitted as the last records in the
data mapping array.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D84767
2020-07-30 11:18:33 -04:00
Alexey Bataev b69357c2f4 Revert "[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region."
This reverts commit 142d0d3ed8 to
investigate undefined behavior revealed by buildbots.
2020-07-30 10:57:56 -04:00
Alexey Bataev 142d0d3ed8 [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.
Need to map the base pointer for all directives, not only target
data-based ones.
The base pointer is mapped for array sections, array subscript, array
shaping and other array-like constructs with the base pointer. Also,
codegen for use_device_ptr clause was modified to correctly handle
mapping combination of array like constructs + use_device_ptr clause.
The data for use_device_ptr clause is emitted as the last records in the
data mapping array.
It applies only for global pointers.

Differential Revision: https://reviews.llvm.org/D84767
2020-07-30 09:40:05 -04:00
Joel E. Denny cee52dd026 [OpenMP] Implement TR8 `present` motion modifier in runtime (2/2)
This patch implements OpenMP runtime support for the OpenMP TR8
`present` motion modifier for `omp target update` directives.  The
previous patch in this series implements Clang front end support.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D84712
2020-07-29 12:18:50 -04:00
Shilei Tian 30440924d4 [OpenMP] Replaced mutex lock/unlock in `target` with `std::lock_guard`
Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D84799
2020-07-28 20:31:40 -04:00
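
As a rough illustration of the change described in 30440924d4, the sketch below contrasts manual `lock()`/`unlock()` calls with `std::lock_guard`. The `PendingList` class and its members are hypothetical and are not taken from libomptarget.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a shared table guarded by a mutex.
class PendingList {
public:
  // Before: lock()/unlock() must be paired by hand on every return path.
  bool pushManual(int Value) {
    Mtx.lock();
    if (Value < 0) {
      Mtx.unlock(); // easy to forget on an early return
      return false;
    }
    Items.push_back(Value);
    Mtx.unlock();
    return true;
  }

  // After: std::lock_guard releases the mutex when it goes out of scope.
  bool pushGuarded(int Value) {
    std::lock_guard<std::mutex> Guard(Mtx);
    if (Value < 0)
      return false;
    Items.push_back(Value);
    return true;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> Guard(Mtx);
    return Items.size();
  }

private:
  std::mutex Mtx;
  std::vector<int> Items;
};
```

Both functions behave the same; the guarded version simply cannot leave the mutex locked on an early return or an exception.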
Joel E. Denny 65564e5eaf Revert "[OpenMP] Implement TR8 `present` motion modifier in runtime (2/2)"
This reverts commit 2cb926a447.

It depends on 3c3faae497, which is being
reverted.
2020-07-28 20:30:05 -04:00
Shilei Tian 3ce69d4d50 [NFC][OpenMP] Renamed all variable and function names in `target` to conform with LLVM code standard
This patch only touched variables and functions in `target`.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D84797
2020-07-28 20:11:09 -04:00
Joel E. Denny 2cb926a447 [OpenMP] Implement TR8 `present` motion modifier in runtime (2/2)
This patch implements OpenMP runtime support for the OpenMP TR8
`present` motion modifier for `omp target update` directives.  The
previous patch in this series implements Clang front end support.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D84712
2020-07-28 19:15:18 -04:00
Jinsong Ji d28f86723f Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit bf544fa1c3.

Fixed the typo in PPCInstrInfo.cpp.
2020-07-28 14:00:11 +00:00
Joel E. Denny 9b4826d18b [OpenMP] Fix libomptarget negative tests to expect abort
On runtime failures, D83963 causes the runtime to abort instead of
merely exiting with a non-zero value, but many tests in the
libomptarget test suite still expect the former behavior.  This patch
updates the test suite and was discussed in post-commit comments on
D83963 and D84557.
2020-07-28 09:02:16 -04:00
Joachim Protze e2f5444c9c [OpenMP][Tests] Enable nvptx64 testing for most libomptarget tests
Also add $BUILD/lib to the LIBRARY_PATH to fix
https://bugs.llvm.org/show_bug.cgi?id=46836.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D84557
2020-07-28 11:08:24 +02:00
Jinsong Ji bf544fa1c3 Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit adffce7153.

This is breaking test-suite; revert while investigating.
2020-07-27 21:07:00 +00:00
Ye Luo 9323166601 [OpenMP] Add more pass-through functions in DeviceTy
Summary:
1. Add DeviceTy::data_alloc, DeviceTy::data_delete, DeviceTy::data_alloc, DeviceTy::synchronize pass-through functions. Avoid directly accessing Device.RTL
2. Fix the type of the first argument of synchronize_ty in rth.h, device id is int32_t which is consistent with other functions.

Reviewers: tianshilei1992, jdoerfert

Reviewed By: tianshilei1992

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D84487
2020-07-27 16:08:30 -04:00
Jinsong Ji adffce7153 [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support
Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html
no one is making use of QPX/A2Q/BGQ/BGP CNK anymore.

This patch removes the support of QPX/A2Q in llvm, BGQ/BGP in clang,
CNK support in openmp/polly.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83915
2020-07-27 19:24:39 +00:00
Johannes Doerfert 9c87466c39 [OpenMP] Use `abort` not `error` for fatal runtime exceptions
See PR46515 for the rationale but generally, we want to *really* abort
not gracefully shut down.

Reviewed By: grokos, ABataev

Differential Revision: https://reviews.llvm.org/D83963
2020-07-24 15:15:38 -05:00
David Truby bb099c87ab [openmp] Don't copy exports into the source folder by default.
Additionally fix the copy if enabled on multi-config targets.

Summary:
This changes the copy command for libomp.so to use the output of the target as
the source of the copy, rather than trying to find it based on
${LIBOMP_LIBRARY_DIR}, which appears to be incorrect in multi-config generator
builds.

Reviewers: jdoerfert

Subscribers: mgorny, yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D84148
2020-07-24 14:34:50 +01:00
Shilei Tian c0185dc7df Revert "[OpenMP] Wait for kernel prior to memory deallocation"
This reverts commit 9b2832c089.
2020-07-22 23:03:36 -04:00
Shilei Tian 9b2832c089 [OpenMP] Wait for kernel prior to memory deallocation
Summary:
In the function `target`, memory deallocation and `target_data_end` are called
immediately after returning from the kernel launch. This can cause a race
condition in which the memory is still being used by the kernel, or in which
the data a kernel requires has already been deallocated by the time it starts
executing, especially when multiple kernels run concurrently. Since the thread
issuing the target offloading blocks at the end of the target region anyway, we
simply move the synchronization earlier to ensure correctness.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D84381
2020-07-22 22:55:34 -04:00
Louis Dionne afa1afd410 [CMake] Bump CMake minimum version to 3.13.4
This upgrade should be friction-less because we've already been ensuring
that CMake >= 3.13.4 is used.

This is part of the effort discussed on llvm-dev here:

  http://lists.llvm.org/pipermail/llvm-dev/2020-April/140578.html

Differential Revision: https://reviews.llvm.org/D78648
2020-07-22 14:25:07 -04:00
Joel E. Denny 708752b2f6 [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)
This implements OpenMP runtime support for the OpenMP TR8 `present`
map type modifier.  The previous patch in this series implements Clang
front end support.  See that patch summary for behaviors that are not
yet supported.

Reviewed By: grokos, jdoerfert

Differential Revision: https://reviews.llvm.org/D83062
2020-07-22 14:04:58 -04:00
Joel E. Denny fc247c8f3c Revert "[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)"
This reverts commit 45b8f7ec35.

It attempts to use debug macros `DPxMOD` and `DPxPTR` in release
builds.  Will fix and reapply later.
2020-07-22 11:22:08 -04:00
Joel E. Denny 45b8f7ec35 [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)
This implements OpenMP runtime support for the OpenMP TR8 `present`
map type modifier.  The previous patch in this series implements Clang
front end support.  See that patch summary for behaviors that are not
yet supported.

Reviewed By: grokos, jdoerfert

Differential Revision: https://reviews.llvm.org/D83062
2020-07-22 10:15:32 -04:00
Joachim Protze ae31d7838c [OpenMP][NFC] pass on env variables to libomptarget tests 2020-07-22 12:14:45 +02:00
Saiyedul Islam 741e55aeed [OpenMP] Temporarily disable failing runtime tests for clang-12
Following tests were disabled for clang-11 after upgrading to
version 5.0 in D82963:

1. openmp/runtime/test/env/kmp_set_dispatch_buf.c
2. openmp/runtime/test/worksharing/for/kmp_set_dispatch_buf.c

They are also failing for clang-12. Thus this temporary disabling
until they are fixed.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84241
2020-07-21 15:32:46 +00:00
AndreyChurbanov 617787ea77 [OpenMP] add missed REQUIRES:ompt for 2 OMPT tests 2020-07-21 16:31:17 +03:00
AndreyChurbanov 5a8779169e [OpenMP] libomp build fix without OMPT_SUPPORT 2020-07-21 16:03:17 +03:00
AndreyChurbanov 917f842159 [OpenMP] libomp cleanup: add checks of bad memory access
Add check of frm to prevent array out-of-bound access;
add check of new_nproc to prevent access of unallocated hot_teams array;
add check of location info pointer to prevent NULL dereference;
add check of d_tn pointer to prevent NULL dereference in release build.
These checks make static analyzers happier.

This is second part of the patch from https://reviews.llvm.org/D84062.
2020-07-21 00:12:46 +03:00
AndreyChurbanov 787eb0c637 [OpenMP] libomp cleanup: add check of input global tid parameter
Add check of negative gtid before indexing __kmp_threads.
This makes static analyzers happier.
This is the first part of the patch split in two parts.

Differential Revision: https://reviews.llvm.org/D84062
2020-07-20 23:49:58 +03:00
Joachim Protze f226171429 [OpenMP][Tests][NFC] Mark compatibility with older versions of clang 2020-07-20 13:53:29 +02:00
AndreyChurbanov 86fb2db49b [OpenMP] libomp cleanup: check presence of hwloc objects CORE, PACKAGE
hwloc documentation guarantees the only object that is always present
in the topology is PU. We can check the presence of other objects
in the topology, just in case.

Differential Revision: https://reviews.llvm.org/D84065
2020-07-18 01:15:37 +03:00
AndreyChurbanov 62d88a1c79 [OpenMP] libomp: add itt notifications for teams construct on host
Add barrier/region notification for parallel inside teams construct
when number of teams is 1, as VTune only shows outer level regions for
simplicity.

Differential Revision: https://reviews.llvm.org/D84024
2020-07-17 21:10:25 +03:00
serge-sans-paille 515bc8c155 Harmonize Python shebang
Differential Revision: https://reviews.llvm.org/D83857
2020-07-16 21:53:45 +02:00
AndreyChurbanov ffd8f00931 [openmp] libomp: added itt notifications for task, taskwait, taskgroup
Add releasing->acquire edges for child task->taskwait and
child task->end of taskgroup.

Differential Revision: https://reviews.llvm.org/D83804
2020-07-16 14:28:46 +03:00
George Rokos 140ab574a1 [OpenMP][Offload] Declare mapper runtime implementation
Libomptarget patch adding runtime support for "declare mapper".
Patch co-developed by Lingda Li and George Rokos.

Differential revision: https://reviews.llvm.org/D68100
2020-07-15 18:11:43 -07:00
Johannes Doerfert 5937434677 [OpenMP] Silence unused symbol warning with proper ifdefs 2020-07-11 11:57:42 -05:00
Johannes Doerfert c98699582a [OpenMP][NFC] Remove unused (always fixed) arguments
There are various runtime calls in the device runtime with unused, or
always fixed, arguments. This is bad for all sorts of reasons. Clean up
two before as we match them in OpenMPOpt now.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D83268
2020-07-11 00:51:51 -05:00
Johannes Doerfert cd0ea03e6f [OpenMP][NFC] Remove unused and untested code from the device runtime
Summary:
We carried a lot of unused and untested code in the device runtime.
Among other reasons, we are planning major rewrites for which reduced
size is going to help a lot.

The number of code lines reduced by 14%!

Before:
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
CUDA                            13            489            841           2454
C/C++ Header                    14            322            493           1377
C                               12            117            124            559
CMake                            4             64             64            262
C++                              1              6              6             39
-------------------------------------------------------------------------------
SUM:                            44            998           1528           4691
-------------------------------------------------------------------------------

After:
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
CUDA                            13            366            733           1879
C/C++ Header                    14            317            484           1293
C                               12            117            124            559
CMake                            4             64             64            262
C++                              1              6              6             39
-------------------------------------------------------------------------------
SUM:                            44            870           1411           4032
-------------------------------------------------------------------------------

Reviewers: hfinkel, jhuber6, fghanim, JonChesterfield, grokos, AndreyChurbanov, ye-luo, tianshilei1992, ggeorgakoudis, Hahnfeld, ABataev, hbae, ronlieb, gregrodgers

Subscribers: jvesely, yaxunl, bollu, guansong, jfb, sstefan1, aaron.ballman, openmp-commits, cfe-commits

Tags: #clang, #openmp

Differential Revision: https://reviews.llvm.org/D83349
2020-07-10 19:09:41 -05:00
Joachim Protze 0fa0cf8638 [OpenMP][Tests] Update compatibility with GCC (NFC)
Commit 95a28df5c provided implementation for GOMP*_nonmonotonic*runtime*
functions. Now the tests succeed with gcc 9 and 10
2020-07-08 00:27:19 +02:00
Ye Luo c5348aecd7 [OpenMP] Use primary context in CUDA plugin
Summary:
Retaining per device primary context is preferred to creating a context owned by the plugin.

From CUDA documentation
1. "Note that the use of multiple CUcontexts per device within a single process will substantially degrade performance and is strongly discouraged. Instead, it is highly recommended that the implicit one-to-one device-to-context mapping for the process provided by the CUDA Runtime API be used." from https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DRIVER.html
2. Right under cuCtxCreate. In most cases it is recommended to use cuDevicePrimaryCtxRetain. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g65dc0012348bc84810e2103a40d8e2cf
3. The primary context is unique per device and shared with the CUDA runtime API. These functions allow integration with other libraries using CUDA.  https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PRIMARY__CTX.html#group__CUDA__PRIMARY__CTX

Two issues are addressed by this patch:
1. Not using the primary context caused interoperability issue with libraries like cublas, cusolver. CUBLAS_STATUS_EXECUTION_FAILED and cudaErrorInvalidResourceHandle
2. On OLCF summit, "Error returned from cuCtxCreate" and "CUDA error is: invalid device ordinal"

Regarding the flags of the primary context. If it is inactive, we set CU_CTX_SCHED_BLOCKING_SYNC. If it is already active, we respect the current flags.

Reviewers: grokos, ABataev, jdoerfert, protze.joachim, AndreyChurbanov, Hahnfeld

Reviewed By: jdoerfert

Subscribers: openmp-commits, yaxunl, guansong, sstefan1, tianshilei1992

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D82718
2020-07-07 10:14:51 -04:00
Saiyedul Islam 38d6640ba5 [libomptarget] Implement atomic inc and fence functions for AMDGCN using clang builtins
This function uses __builtin_amdgcn_atomic_inc32():
  uint32_t atomicInc(uint32_t *address, uint32_t max);

These functions use __builtin_amdgcn_fence():
__kmpc_impl_threadfence()
__kmpc_impl_threadfence_block()
__kmpc_impl_threadfence_system()

They will take place of current mechanism of directly calling IR functions.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D83132
2020-07-07 06:36:25 +00:00
Peyton, Jonathan L 95a28df5c4 [OpenMP] Add GOMP 5.0 loop entry points
This patch adds missing GOMP_5.0 loop entry points which incorporate
new non-monotonic default into entry point name.  Since monotonic
schedules are a subset of nonmonotonic, it is acceptable to use
monotonic as the implementation.  This patch simply has the nonmonotonic
(and possibly non-monotonic) versions of the loop entry points as
wrappers around the monotonic ones.

Differential Revision: https://reviews.llvm.org/D73922
2020-07-06 17:22:26 -05:00
Joachim Protze 6d9626d2da [OpenMP][Tests] Fix/Mark compatibility for GCC
Reviewed by: Hahnfeld, saiislam

Differential Revision: https://reviews.llvm.org/D82267
2020-07-06 23:56:09 +02:00
Saiyedul Islam 4c4bda1630 [OpenMP] Temporarily disable failing runtime tests for OpenMP 5.0
Following tests are failing after upgrading to version 5.0 but are passing
for version 4.5:
1. openmp/runtime/test/env/kmp_set_dispatch_buf.c
2. openmp/runtime/test/worksharing/for/kmp_set_dispatch_buf.c

To be enabled as soon as these tests are fixed.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D82963
2020-07-06 14:04:43 +00:00
Joachim Protze 8289f2891e [OpenMP][Tests] Flag compatibility of OpenMP runtime tests with GCC versions
If the compilation fails, the test is marked as unsupported.
-> This will never change for a specific version of gcc

If the linking fails, the test is marked as expected to fail.
-> This might change as LLVM/OpenMP implements the missing GOMP interface function

Reviewed by: Hahnfeld

Differential Revision: https://reviews.llvm.org/D83077
2020-07-05 22:49:54 +02:00
Joachim Protze 30205865d9 [OpenMP][OMPT] Fix ifdefs for OMPT code
Fixes build with LIBOMP_OMPT_SUPPORT=off

Reported by: Jason Edson

Reviewed by: Hahnfeld

Differential Revision: https://reviews.llvm.org/D83171
2020-07-05 22:39:25 +02:00
Fangrui Song 6ba4380ed6 [libomptarget][test] Fix text relocations by adding -fPIC 2020-07-05 12:51:28 -07:00
Joachim Protze 3fc97f9636 [OpenMP][Tests] NFC use type macro in printf 2020-07-05 09:17:18 +02:00
Joachim Protze 47cb8a0f0b [OpenMP][OMPT]Add event callbacks for taskwait with depend
This adds the missing event callbacks to express dependencies on included tasks
and taskwait with depend clause.

The test fails for GCC, see bug report:
https://bugs.llvm.org/show_bug.cgi?id=46573

Reviewed by: hbae

Differential Revision: https://reviews.llvm.org/D81891
2020-07-03 09:58:31 +02:00
Jonas Hahnfeld 0e0483bf5c [OpenMP][CMake] Fix version detection of testing compiler
When configuring in-tree, the correct names are LLVM_VERSION_MAJOR
and LLVM_VERSION_MINOR. This has been wrong since the code was added
in commits fc473dee98 and 821649229e.
2020-07-02 19:39:30 +02:00
Ye Luo 45bb073da8 [OpenMP] fix clang warning about printf format in CUDA plugin
Summary: Warnings are printed by clang when building with LIBOMPTARGET_ENABLE_DEBUG=ON due to an incorrect format string.

Reviewers: tianshilei1992, jdoerfert

Reviewed By: tianshilei1992

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D82789
2020-06-29 22:35:39 -04:00
AndreyChurbanov 7f3d9cc1c0 [openmp][NFC] Cleanup: guard __kmp_mic_type by KMP_MIC_SUPPORTED macro.
Differential Revision: https://reviews.llvm.org/D82301
2020-06-29 14:14:56 +03:00
Joachim Protze d4230c67bf [OpenMP][Tool] Fix buffer overflow in ompt-multiplex.h
Reviewed by: runlieb

Differential Revision: https://reviews.llvm.org/D82452
2020-06-29 12:44:33 +02:00
Han Zhu 1eaebe192f [openmp] Use config.test_extra_flags in archer and multiplex tests
Summary:
`config.test_extra_flags` is passed in from `lit.site.cfg.in` files, but they're not used in the LIT configs. This variable can be useful for distros which don't have the standard c/c++ headers in the default search paths. Since the tests run clang on c/c++ source code, we rely on `test_extra_flags` to pass in the necessary header files.

A similar setup is also used in libomptarget https://github.com/llvm/llvm-project/blob/master/openmp/libomptarget/test/lit.cfg#L42 and openmp/runtime.

Reviewers: jdoerfert, jdenny, protze.joachim

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D82516
2020-06-25 11:58:52 -07:00
Ye Luo 6e5f64c44f [OpenMP] Adopt std::set in HostDataToTargetMap
Summary:
lookupMapping took significant time due to linear complexity searching.
This is bad for offloading from multiple host threads because lookupMapping is protected by mutex.
Use std::set for logarithmic complexity searching.

Before my change.
libomptarget inclusive time 16.7 sec, exclusive time 8.6 sec.
After the change
libomptarget inclusive time 7.3 sec, exclusive time 0.4 sec.

Most of the overhead of libomptarget (exclusive time) is gone.

Reviewers: jdoerfert, grokos

Reviewed By: grokos

Subscribers: tianshilei1992, yaxunl, guansong, sstefan1

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D82264
2020-06-24 12:22:45 -04:00
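
To make the complexity argument in 6e5f64c44f concrete, here is a minimal sketch of a logarithmic address lookup in a `std::set` ordered by begin address. `HostRange`, `ByBegin`, `RangeSet` and `lookup` are hypothetical names, and the sketch assumes the stored ranges do not overlap; it is not the libomptarget implementation.

```cpp
#include <cstdint>
#include <set>

// Hypothetical, simplified mapping entry: a host address range [Begin, End).
struct HostRange {
  std::uintptr_t Begin;
  std::uintptr_t End;
};

// Order entries by their begin address only.
struct ByBegin {
  bool operator()(const HostRange &L, const HostRange &R) const {
    return L.Begin < R.Begin;
  }
};

using RangeSet = std::set<HostRange, ByBegin>;

// Returns the entry containing Addr, or nullptr if there is none.
// Runs in O(log n), assuming the stored ranges do not overlap.
const HostRange *lookup(const RangeSet &Map, std::uintptr_t Addr) {
  // First entry whose Begin is strictly greater than Addr.
  auto It = Map.upper_bound(HostRange{Addr, Addr});
  if (It == Map.begin())
    return nullptr; // every entry starts after Addr
  --It;             // last entry with Begin <= Addr
  return Addr < It->End ? &*It : nullptr;
}
```

A linear scan over a list would visit every entry while the mutex is held; the ordered set only needs one `upper_bound` call and a single bounds check.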
Joachim Protze 73b7ff4e16 [OpenMP] NFC: Create OpenMP release notes file 2020-06-24 13:42:32 +02:00
Joachim Protze 63a3c5925d [OpenMP][OMPT] Pass mutexinoutset to the tool
Adds OMPT support for the mutexinoutset dependency

Reviewed by: hbae

Differential Revision: https://reviews.llvm.org/D81890
2020-06-19 12:51:18 +02:00
Shilei Tian aaf50adb53 Revert "[OpenMP][NFC] Added DeviceID and Event pointer to __tgt_async_info"
This reverts commit ee1bf45e1d.
2020-06-17 15:01:16 -04:00
Shilei Tian ee1bf45e1d [OpenMP][NFC] Added DeviceID and Event pointer to __tgt_async_info
DeviceID is added for some cases that we only have the __tgt_async_info but do
not know its corresponding device id. However, to communicate with target
plugins, we need that information.

Event is added for another way to synchronize.
2020-06-17 14:29:09 -04:00
Alexey Bataev 08029595ca [OPENMP]Fix overflow during counting the number of iterations.
Summary:
The OpenMP loops are normalized and transformed into loops from 0 to the
max number of iterations. In some cases, the original scheme may lead to
overflow during calculation of the number of iterations. If it is unknown
whether overflow can occur (the bounds are not constant, so we cannot
determine whether there is an overflow), cast the original type to the
unsigned.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits, cfe-commits, caomhin

Tags: #clang, #openmp

Differential Revision: https://reviews.llvm.org/D81881
2020-06-17 08:47:01 -04:00
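
The overflow described in 08029595ca can be shown with a small worked example. The function below is a hypothetical sketch of computing a trip count in unsigned arithmetic for a loop `for (int32_t I = Lb; I < Ub; ++I)`; it is not the code Clang emits.

```cpp
#include <cstdint>

// Trip count of `for (int32_t I = Lb; I < Ub; ++I)`.
// Computing Ub - Lb in int32_t can overflow (for example Lb = INT32_MIN,
// Ub = INT32_MAX), which is undefined behavior. Unsigned subtraction wraps
// modulo 2^32, and the true difference always fits in uint32_t.
std::uint32_t tripCount(std::int32_t Lb, std::int32_t Ub) {
  if (Ub <= Lb)
    return 0; // the loop body never runs
  return static_cast<std::uint32_t>(Ub) - static_cast<std::uint32_t>(Lb);
}
```

With `Lb = INT32_MIN` and `Ub = INT32_MAX` the signed difference does not fit in `int32_t`, while the unsigned computation yields `UINT32_MAX`, which is the actual number of iterations.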
Joachim Protze 8580af3f7d subdirectories should not use cmake project command 2020-06-17 09:38:56 +02:00
Joachim Protze e9b8ed1fd7 [OpenMP][Tool] Header-only multiplexing of OMPT tools
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76012
2020-06-17 09:16:46 +02:00
Joachim Protze cbea36903e [OpenMP][OMPT] Add callbacks for doacross loops
Adds the callbacks for ordered with source/sink dependencies.

The test for task dependencies changed, because callback.h now actually prints
the passed dependencies and the test also checks for the address.

Reviewed by: hbae

Differential Revision: https://reviews.llvm.org/D81807
2020-06-16 16:53:40 +02:00
Joachim Protze 9e5aefc5f9 [OpenMP][Tests] fix data race in an OpenMP runtime test
Reviewed by: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D81804
2020-06-15 18:48:35 +02:00
Joachim Protze d056d7592a [OpenMP][Tool] Extend reuse of OMPT testing
This patch allows specifying a prefix (default: empty) to be included in the print-out
written by callback.h.
It also adds a cmake target to find the header file from other tests.

Reviewed by: jdoerfert

Differential Revision: https://reviews.llvm.org/D76008
2020-06-14 15:55:32 +02:00
Joachim Protze add8d90cb3 [OpenMP] support alloc of serialized tasks
Reviewed by: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D81497
2020-06-14 15:55:32 +02:00
Joachim Protze e7577d1d76 Remove mention of counter from Archer readme
The feature was removed before upstreaming Archer, so the documentation is wrong
2020-06-05 14:31:03 +02:00
Shilei Tian a014fbbc21 [OpenMP] Improve D2D memcpy to use more efficient driver API
Summary:
In the current implementation, D2D memcpy first copies data back to the host and then
copies it from host to device. This is very inefficient if the device supports D2D
memcpy, like CUDA.

In this patch, D2D memcpy will first try to use the natively supported driver API. If
it fails, it falls back to the original way. It is worth noting that D2D memcpy in this
scenario covers two cases:
- Same device: this is the D2D memcpy in the CUDA context.
- Different devices: this is the PeerToPeer memcpy in the CUDA context.
My implementation merges these two cases. It chooses the best API according to
the source device and destination device.

Reviewers: jdoerfert, AndreyChurbanov, grokos

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D80649
2020-06-04 16:59:06 -04:00
AndreyChurbanov abe64360ae [openmp] Fixed nonmonotonic schedule implementation.
Differential Revision: https://reviews.llvm.org/D80942
2020-06-04 15:39:45 +03:00
Joachim Protze 10995c77b4 [OpenMP][OMPT] Fix and add event callbacks for detached tasks
The OpenMP spec has the task-fulfill event for a call to omp_fulfill_event.
If the task did not yet finish execution, ompt_task_early_fulfill is used,
otherwise ompt_task_late_fulfill.
If a task does not complete, when the execution finishes (i.e., the task goes
in detached mode), ompt_task_detach instead of ompt_task_complete must be
used, when the next task is scheduled.

A test for both cases is included, which only works with clang-11+

Reviewed By: hbae

Differential revision: https://reviews.llvm.org/D80843
2020-06-02 09:52:40 +02:00
AndreyChurbanov 5e111c5df8 [openmp] Fixed taskloop recursive splitting so that taskloop tasks have
same parent tasks.

Differential Revision: https://reviews.llvm.org/D80577
2020-06-01 17:51:02 +03:00
Joachim Protze 3895148d7c [OpenMP] Fix a race in task queue reallocation
__kmp_realloc_task_deque implicitly assumes, that the task queue is full
(ntasks == size), therefore tail = size in line 319.
An assertion is added to document this assumption.

The first check for a full queue is before the locking and might not hold
when the lock is taken. So, we need to check again for this condition when
we have the lock.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D80480
2020-05-25 10:23:22 +02:00
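
The re-check described in 3895148d7c can be sketched with a small ring buffer. `TaskDeque` below is a hypothetical, simplified structure, not the libomp deque, and it assumes that only the owning thread pushes while other threads may pop concurrently.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical ring-buffer deque: one owner thread pushes, others may pop.
class TaskDeque {
public:
  explicit TaskDeque(std::size_t Size) : Buf(Size) {}

  void push(int Task) {
    // Unlocked pre-check: cheap, but possibly stale once the lock is held,
    // because another thread may have popped a task in the meantime.
    if (NTasks.load() == Buf.size()) {
      std::lock_guard<std::mutex> Guard(Lock);
      // Re-check with the lock held before relying on "queue is full".
      if (NTasks.load() == Buf.size())
        grow();
    }
    std::lock_guard<std::mutex> Guard(Lock);
    assert(NTasks.load() < Buf.size());
    Buf[Tail] = Task;
    Tail = (Tail + 1) % Buf.size();
    NTasks.fetch_add(1);
  }

  bool pop(int &Task) {
    std::lock_guard<std::mutex> Guard(Lock);
    if (NTasks.load() == 0)
      return false;
    Task = Buf[Head];
    Head = (Head + 1) % Buf.size();
    NTasks.fetch_sub(1);
    return true;
  }

private:
  // Doubles the buffer. Must be called with Lock held and a full queue.
  void grow() {
    assert(NTasks.load() == Buf.size() && "grow() assumes a full queue");
    std::size_t Old = Buf.size();
    std::vector<int> New(2 * Old);
    for (std::size_t I = 0; I < Old; ++I)
      New[I] = Buf[(Head + I) % Old];
    Buf.swap(New);
    Head = 0;
    Tail = Old; // valid only because the queue was full
  }

  std::mutex Lock;
  std::vector<int> Buf;
  std::size_t Head = 0;
  std::size_t Tail = 0;
  std::atomic<std::size_t> NTasks{0};
};
```

Setting `Tail = Old` in `grow()` is only correct for a full queue, which is why the fullness test is repeated after the lock is taken and documented with an assertion.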
AndreyChurbanov 57d8b8d6f0 [openmp] Fixed hang if detached task was serialized.
The patch fixes https://bugs.llvm.org/show_bug.cgi?id=45904.

Differential Revision: https://reviews.llvm.org/D79944
2020-05-18 15:32:13 +03:00
Joachim Protze d23131a3c0 [OpenMP] Fix race condition in the completion/freeing of detached tasks
Spurious assertion failures are symptoms of a race condition for the handling
of detached tasks:
Assertion failure at kmp_tasking.cpp(3744): taskdata->td_flags.complete == 1.
Assertion failure at kmp_tasking.cpp(710): taskdata->td_flags.executing == 0.

in the case of detach=true, all accesses to taskdata in __kmp_task_finish need
to happen before (~line 873):

taskdata->td_flags.proxy = TASK_PROXY;

This assignment signals to __kmp_fulfill_event, that the task will need to be
freed there. So, conceptually the ownership of taskdata is moved.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D79702
2020-05-17 12:28:38 +02:00
Manoel Roemmer 6b9e43c67e [Openmp][VE] Libomptarget plugin for NEC SX-Aurora
This patch adds a libomptarget plugin for the NEC SX-Aurora TSUBASA Vector
Engine (VE target).  The code is largely based on the existing generic-elf
plugin and uses the NEC VEO and VEOSINFO libraries for offloading.

Differential Revision: https://reviews.llvm.org/D76843
2020-05-12 10:47:30 +02:00
Joel E. Denny dd5ba4b585 [OpenMP][NFC] Fix `not` substitution in tests
D78566 introduced a `\bnot\b` lit substitution in OpenMP test suites.
However, that would corrupt a command like
`FileCheck -implicit-check-not` or any file name like `%t.not`.  We
could use lookbehind/lookahead assertions to avoid such cases, but
this patch switches to `%not` (suggested during the D78566 review) as
a safer option.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D79529
2020-05-11 14:53:48 -04:00
Shilei Tian cb038927ef [OpenMP] Fix an issue of wrong return type of DeviceRTLTy::getNumOfDevices
Summary: There is a typo in DeviceRTLTy::getNumOfDevices that the type of its return value is bool. It will lead to a problem of wrong device number returned from omp_get_num_devices.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D79255
2020-05-03 15:59:06 -04:00
Ron Lieberman ee9c53d271 [libomptarget] Initialize reference parameter IsNew within Device::getOrAllocTgtPtr
The two locals IsNew and Pointer_IsNew were uninitialized at declaration, and then passed by
reference to Device.getOrAllocTgtPtr which in turn did not assign on all
paths within the function. This resulted in occasional runtime failures in one application.
Device::getOrAllocTgtPtr will now initialize IsNew to false on entry to function.

Differential Revision: https://reviews.llvm.org/D78744
2020-04-24 15:33:37 -05:00
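
The fix in ee9c53d271 follows a general pattern: assign an output reference parameter on entry so that every return path leaves it defined. The sketch below uses a hypothetical `getOrAlloc` over a `std::map`; it does not reproduce the signature of `Device::getOrAllocTgtPtr`.

```cpp
#include <map>

// Hypothetical lookup-or-insert helper with an output flag.
int *getOrAlloc(std::map<int, int> &Table, int Key, bool AllowAlloc,
                bool &IsNew) {
  IsNew = false; // assign on entry so no return path leaves it unset
  auto It = Table.find(Key);
  if (It != Table.end())
    return &It->second; // existing entry
  if (!AllowAlloc)
    return nullptr; // lookup failed and allocation is not allowed
  IsNew = true;
  return &Table[Key]; // value-initialized new entry
}
```

Without the first assignment, a caller that declares `bool IsNew;` and hits one of the early returns would read an indeterminate value.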
Joel E. Denny 5f6aa9680c [OpenMP] target_data_begin: fail on device alloc fail
Without this patch, target_data_begin continues after an illegal
mapping or an out-of-memory error on the device.  With this patch, it
terminates the runtime with an error instead.

The new test exercises only illegal mappings.  I didn't think of a
good way to exercise out-of-memory errors from the test suite.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D78170
2020-04-21 17:10:50 -04:00
Joel E. Denny ba942610f6 [OpenMP] Add scaffolding for negative runtime tests
Without this patch, the openmp project's test suites do not appear to
have support for negative tests.  However, D78170 needs to add a test
that an expected runtime failure occurs.

This patch makes `not` visible in all of the openmp project's test
suites.  In all but `libomptarget/test`, it should be possible for a
test author to insert `not` before a use of the lit substitution for
running a test program.  In `libomptarget/test`, that substitution is
target-specific, and its value is `echo` when the target is not
available.  In that case, inserting `not` before a lit substitution
would expect an `echo` fail, so this patch instead defines a separate
lit substitution for expected runtime fails.

Reviewed By: jdoerfert, Hahnfeld

Differential Revision: https://reviews.llvm.org/D78566
2020-04-21 17:10:50 -04:00
Bryan Chan b86ff5f6ef [OpenMP] Sync writes to child thread's data before reduction
On systems with weak memory consistency, this patch fixes an intermittent crash
in the reduction function called by __kmp_hyper_barrier_gather, which suffers
from a race on a child thread's data.

Reviewed-By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D77603
2020-04-14 14:31:06 -04:00
Shilei Tian 4031bb982b [OpenMP] Refined CUDA plugin to put all CUDA operations into class
Summary: Current implementation mixed everything up so that there is almost no encapsulation. In this patch, all CUDA related operations are put into a new class DeviceRTLTy and only necessary functions are exposed. In addition, all C++ code now conforms with LLVM code standard, keeping those API functions following C style.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: jfb, yaxunl, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77951
2020-04-13 13:32:46 -04:00
Shilei Tian feed674dec [OpenMP] Introduce stream pool to make sure the correctness of device synchronization

Summary: In the previous patch, in order to optimize performance, we only synchronize once
for each target region. The synchronization is done via stream synchronization.
However, in an extreme situation, the performance might be bad. Consider the
following case: There is a task that requires transferring a huge amount of data
(it calls the data transfer function many times). It is scheduled to the first
stream. And then we have 255 very light tasks scheduled to the remaining 255
streams (by default we have 256 streams). They can be finished before we do
synchronization at the end of the first task. Next, we get another very huge
task. It will be scheduled again to the first stream. Now the first task
finishes its kernel launch and calls stream synchronization. Right now, the
stream already contains two kernels, and the synchronization will wait until the
two kernels finish instead of just the first one for the first task.

In this patch, we introduce a stream pool. After each synchronization, the stream
is returned to the pool so that each synchronization waits only for the
operations it is expected to wait for.
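
The idea can be illustrated with a minimal host-only sketch. It is not the plugin's actual code: `StreamHandle`, `StreamPool`, `acquire`, `release`, and `available` are hypothetical names, and a plain integer stands in for a real device stream. The point is only that a stream is owned exclusively until its owner has synchronized and returned it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device stream handle.
using StreamHandle = int;

// A minimal stream pool: streams are handed out exclusively and must be
// returned after synchronization before anyone else can reuse them.
class StreamPool {
  std::vector<StreamHandle> Free; // streams currently not in use
  StreamHandle NextId = 0;        // id for the next newly created stream

public:
  explicit StreamPool(std::size_t Initial) {
    for (std::size_t I = 0; I < Initial; ++I)
      Free.push_back(NextId++);
  }

  // Take a stream out of the pool, creating one if the pool is empty.
  StreamHandle acquire() {
    if (Free.empty())
      return NextId++;
    StreamHandle S = Free.back();
    Free.pop_back();
    return S;
  }

  // Return a stream once its owner has synchronized on it.
  void release(StreamHandle S) { Free.push_back(S); }

  // Number of streams that are idle right now.
  std::size_t available() const { return Free.size(); }
};
```

Because a second large task cannot obtain the first task's stream until it has been released, a synchronization never waits on kernels that belong to another task.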

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: gregrodgers, yaxunl, lildmh, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77412
2020-04-11 07:08:56 -04:00
Shilei Tian 03ff643d2e [OpenMP] Put old APIs back and added new _async series for backward compatibility
Summary: Following comments at the bi-weekly meeting, this patch puts back the old APIs and adds a new `_async` series.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77822
2020-04-09 22:40:58 -04:00
Shilei Tian 32ed29271f [OpenMP] Optimized stream selection by scheduling data mapping for the same target region into a same stream
Summary:
This patch introduces two things for offloading:
1. Asynchronous data transfer: these functions are suffixed with `_async`. They take one more argument than their synchronous counterparts: `__tgt_async_info*`, a new struct that has only one field, `void *Identifier`. This struct is for information exchange between different asynchronous operations. It can be used for stream selection, as in this case, or for operation synchronization, which is also used here. We may expect more uses in the future.
2. Optimization of stream selection for data mapping. The previous implementation used asynchronous device memory transfers but synchronized after each one. However, if kernel A needs four memory copies to the device and two memory copies back to the host, we can schedule these seven operations (four H2D, two D2H, and one kernel launch) into the same stream and synchronize only once, after the memory copies from device to host. This saves a large overhead compared with synchronizing after each operation.
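
The two points above can be sketched on the host as follows. Only the single `void *Identifier` field comes from the description; `FakeStream`, `AsyncInfo`, `selectStream`, `enqueue`, and `synchronize` are hypothetical names used for illustration, and no real device API is called.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a device stream: an ordered queue of operations.
struct FakeStream {
  std::vector<std::string> Ops;
  int SyncCount = 0;
};

// Mirrors the struct described above: a single opaque field that the
// plugin is free to interpret (here, as a pointer to a FakeStream).
struct AsyncInfo {
  void *Identifier = nullptr;
};

// Reuse the stream already recorded in AsyncInfo, or bind the given one
// on first use, so all operations of one target region share a stream.
inline FakeStream &selectStream(AsyncInfo &Info, FakeStream &Fallback) {
  if (!Info.Identifier)
    Info.Identifier = &Fallback;
  return *static_cast<FakeStream *>(Info.Identifier);
}

// Queue an operation without synchronizing.
inline void enqueue(AsyncInfo &Info, FakeStream &Fallback,
                    const std::string &Op) {
  selectStream(Info, Fallback).Ops.push_back(Op);
}

// One synchronization for everything queued under this AsyncInfo.
inline void synchronize(AsyncInfo &Info) {
  if (auto *S = static_cast<FakeStream *>(Info.Identifier)) {
    ++S->SyncCount;
    Info.Identifier = nullptr;
  }
}
```

With four H2D copies, one kernel launch, and two D2H copies queued through the same `AsyncInfo`, the sketch records seven operations and a single synchronization.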

Reviewers: jdoerfert, ye-luo

Reviewed By: jdoerfert

Subscribers: yaxunl, lildmh, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77005
2020-04-07 14:55:47 -04:00
Kazuaki Ishizaki 4201679110 [OpenMP] NFC: Fix trivial typo
Differential Revision: https://reviews.llvm.org/D77430
2020-04-04 12:06:54 +09:00
Vitaly Buka c9ae3c5e10 [openmp] Disable tests flaky on Debian
https://bugs.llvm.org/show_bug.cgi?id=45397
2020-04-01 21:58:05 -07:00
JonChesterfield 09834f9761 [libomptarget][nfc] Move non-freestanding headers out of common
Summary:
[libomptarget][nfc] Move non-freestanding headers out of common

Lowers the bar for building deviceRTL.
Drops math.h entirely as it wasn't used and libm is a big dependency.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77071
2020-03-31 23:43:18 +01:00
Alexey Bataev 0fca766458 [OPENMP50]Fix PR45117: Orphaned task reduction should be allowed.
Add support for orphaned task reductions.
2020-03-27 17:47:30 -04:00
Henry Kao 236ac68fa5 [OpenMP] Add memory barrier to solve data race
A data race occurs when acquiring the lock for a critical section,
triggering an assertion failure. Added a barrier to ensure
all memory is committed before checking the assertion.

Reviewed By: Hahnfeld

Differential Revision: https://reviews.llvm.org/D76780
2020-03-27 16:32:28 -04:00
Jon Chesterfield 856c995436 [libomptarget] Add missing elf_end call in elf_common.c
Summary:
[libomptarget] Add missing elf_end call in elf_common.c
Noticed when reviewing D76843.

Reviewers: simoll, jdoerfert, efocht, AndreyChurbanov, grokos, manorom

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D76874
2020-03-26 19:07:33 +00:00
JonChesterfield 0813f41005 [libomptarget][nfc] Explicitly static function scope shared variables
Summary:
[libomptarget][nfc] Explicitly static function scope shared variables

`__shared__` in CUDA implies static in function scope. See e.g. D.2.1.1
in CUDA_C_Programming_Guide.pdf,
http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/

This is surprising for non-cuda developers, see e.g. D73239 where I thought
local variables would be thread local.

Tested by IR diff of libomptarget.bc (no change), running in tree tests,
and binary diff of the nvcc static archives (no significant change).

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D76713
2020-03-24 18:51:50 +00:00
AndreyChurbanov ae044467ed [openmp][runtime] Fixed hang for explicit task inside a taskloop.
Added missed initialization of td_last_tied field for taskloop tasks.

Differential Revision: https://reviews.llvm.org/D75673
2020-03-23 20:07:30 +03:00
Sylvestre Ledru 72fd1033ea Doc: Links should use https 2020-03-22 22:49:33 +01:00
JonChesterfield 298527587c [libomptarget][nfc] Disable amdgcn rtl build. The cmake logic for finding llvm is misbehaving. 2020-03-21 00:01:03 +00:00
George Rokos 0a42c9bfe4 Enable CUDA offloading on aarch64 host
Differential Revision: https://reviews.llvm.org/D76469
2020-03-20 15:38:47 -07:00
Tom Scogland a23d7282ca openmp: fix memcpy memory leak
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72637
2020-03-12 23:24:16 -05:00
Alexey Bataev c422d69b1a [LIBOMPTARGET]Fix PR45139: Bug in mixing Python and OpenMP target offload.
Summary: Explicitly initialize data members of RTLsTy class upon construction.

Reviewers: grokos

Subscribers: guansong, openmp-commits, caomhin, kkwli0

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D75946
2020-03-11 09:12:02 -04:00
Jonas Hahnfeld f0689d2e62 archer: Remove superfluous dot from warning message 2020-03-06 15:19:30 +01:00
Jon Chesterfield 221ada654b [libomptarget] Implement locks for amdgcn
Summary:
[libomptarget] Implement locks for amdgcn

The nvptx implementation deadlocks on amdgcn. atomic_cas with multiple
active lanes can deadlock - if one lane succeeds, all the others are locked
out. The set_lock implementation therefore runs on a single lane.

Also uses a sleep intrinsic instead of the system clock for a probably
minor performance improvement. The unset/test implementations may be revised
later, based on code size / performance or similar concerns.

This implements the lock at a per-wavefront scope. That's not strictly as
specified, since openmp describes locks in terms of threads. I think the
nvptx implementation provides true per-thread locking on volta and the same
per-warp locking on other architectures.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D75546
2020-03-05 20:25:31 +00:00
Jon Chesterfield 918a1065be [libomptarget][nfc] Move GetWarp/LaneId functions into per arch code
Summary:
[libomptarget][nfc] Move GetWarp/LaneId functions into per arch code

No code change for nvptx. Amdgcn currently has two implementations of GetLaneId,
this patch keeps the one a colleague considered to be superior for our ISA.

GetWarpId is currently the same function for amdgcn and nvptx, but I think it's
cleaner to keep it grouped with all the others than to keep it in support.cu.

Reviewers: jdoerfert, grokos, ABataev

Reviewed By: jdoerfert

Subscribers: jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D75587
2020-03-05 17:05:58 +00:00
Jon Chesterfield 84ac0dffd4 [libomptarget][nfc][amdgcn] Replace magic number with named intrinsic 2020-03-05 11:50:30 +00:00
Jon Chesterfield 133db44996 [libomptarget] Implement most hip atomic functions in terms of intrinsics
Summary:
[libomptarget] Implement hip atomic functions in terms of intrinsics

All but atomicInc can be implemented using type generic clang intrinsics.
There is not yet a corresponding intrinsic for atomicInc in clang, only one in
LLVM. This patch leaves atomicInc as an unresolved symbol.

Reviewers: jdoerfert, ABataev, hfinkel, grokos, arsenm

Reviewed By: arsenm

Subscribers: sri, saiislam, wdng, jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D73076
2020-03-04 17:56:40 +00:00
AndreyChurbanov 95df6747cf [openmp] OpenMP 5.1 omp_display_env function implementation.
Patch by Michael Klemm.

Differential Revision: https://reviews.llvm.org/D74956
2020-03-04 18:15:05 +03:00
Jon Chesterfield ad3d021b9e [libomptarget][nfc][amdgcn] Simplify assert_fail implementation 2020-03-03 18:24:51 +00:00
Alexey Bataev c4a9d976c1 [LIBOMPTARGET]Lower priority of global constructor/destructor to silence the warning from gcc.
Summary: fixed the warning from gcc since prios 0-100 are reserved for the internal use.

Reviewers: grokos

Subscribers: kkwli0, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D75458
2020-03-02 15:15:11 -05:00
Alexey Bataev 63cef621f9 [LIBOMPTARGET]Fix PR44933: fix crash because of the too early deinitialization of libomptarget.
Summary:
Instead of using global variables with an unpredictable time of
deinitialization, use dynamically allocated variables with functions
explicitly marked as global constructor/destructor with a priority. This
prevents the crash caused by an incorrect order of dynamic
library deinitialization.
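
A minimal sketch of this pattern, assuming a GCC- or Clang-compatible compiler that supports the `constructor`/`destructor` attributes with a priority argument. `RuntimeState`, `State`, `initState`, `deinitState`, and `getState` are hypothetical names, not the library's own; the priority 101 is chosen because priorities 0-100 are reserved.

```cpp
#include <vector>

// Hypothetical runtime state; the real library keeps its own structures.
struct RuntimeState {
  std::vector<int> Devices;
};

// A plain pointer instead of a global object with a destructor, so the
// lifetime is controlled only by the two functions below.
static RuntimeState *State = nullptr;

// GCC/Clang extension: run before main() with an explicit priority.
// Priorities 0-100 are reserved, so 101 is the lowest usable value.
__attribute__((constructor(101))) static void initState() {
  State = new RuntimeState();
}

// Runs at program or library teardown, at the matching priority.
__attribute__((destructor(101))) static void deinitState() {
  delete State;
  State = nullptr;
}

// Accessor used by the rest of the (hypothetical) library.
inline RuntimeState *getState() { return State; }
```

The pointer itself is constant-initialized, so no C++ static destructor is registered for it and nothing tears the state down earlier than `deinitState`.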

Reviewers: grokos, hfinkel

Subscribers: caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D74837
2020-02-25 15:54:37 -05:00
Kelvin Li e16e267bb6 [OpenMP][cmake] ignore warning on unknown CUDA version
Differential Revision: https://reviews.llvm.org/D75001
2020-02-25 09:29:07 -05:00
Shoaib Meenai e34ddc09f4 [arcconfig] Delete subproject arcconfigs
From https://secure.phabricator.com/book/phabricator/article/arcanist_new_project/:

> An .arcconfig file is a JSON file which you check into your project's root.

I've done some experimentation, and it looks like the subproject
.arcconfigs just get ignored, as the documentation says. Given that
we're fully on the monorepo now, it's safe to remove them.

Differential Revision: https://reviews.llvm.org/D74996
2020-02-24 16:20:36 -08:00
serge-sans-paille 99b03c1c18 Detect and disable openmp tests that require multiple hardware processor to run
Team tests seem to require at least two physical cores, and using the same trick
as in https://reviews.llvm.org/D55598 doesn't work (why?) .
Using lit configuration instead.

Differential Revision: https://reviews.llvm.org/D74921
2020-02-21 14:02:12 +01:00
Yuanfang Chen c2c4f1c120 [openmp][cmake] passing option argument correctly
From the context, it looks like the test should not be run with `check-all`,
but it is. It turns out the option argument resolves to True/False, which
cannot be passed down as is. There is one such example in
AddLLVM.cmake.
2020-02-13 09:33:58 -08:00
Alexey Bataev 578c13d13c [OPENMP]Fix the test, NFC. 2020-02-13 10:40:06 -05:00
Ethan Stewart 190a11148b Changed omp_get_max_threads() implementation to more closely match spec description.
Summary: The 5.0 spec states, "The omp_get_max_threads routine returns an upper bound on the number of threads that could be used to form a new team if a parallel construct without a num_threads clause were encountered after execution returns from this routine." The attached test shows Max Threads: 96, Num Threads: 128 without the proposed change. The number of threads should not exceed the (max) nthreads ICV, hence we should return the higher SPMD thread number even when omp_get_max_threads() is called in a generic kernel. This change does fail the api test, max_threads.c, because now it would return 64 instead of 32.

Reviewers: jdoerfert, ABataev, grokos, JonChesterfield

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D74092
2020-02-12 23:29:34 +00:00
JonChesterfield c2ce9ea4e3 [libomptarget][nfc] Change enum values to match those in cuda/rtl
Summary:
[libomptarget][nfc] Change enum values to match those in cuda/rtl

support.h and cuda/rtl.cpp (and downstream hsa/rtl.cpp) have enums for execution
mode. These are actually independent - the numbers that are used within support, or
within the plugin, are never passed across the boundary.

Nevertheless, trying to work out why the values are different between the two
has generated a reasonable amount of confusion. This patch changes support to
match the values in plugin, on the basis that the plugin also has some comments
which I'd have to update if I changed that one instead. Credit to Ron for
working through this in our own fork. See rocm-developer-tools/aomp/issues/7
for that earlier diagnostic write up.

Also happy with generic = 0, spmd = 1 - provided it's the same in both places.

Reviewers: jdoerfert, grokos, ABataev, ronlieb

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D74503
2020-02-12 23:27:08 +00:00
Kelvin Li 4f1f2b7a5b [OpenMP] update strings output of libomp.so [NFC]
Change the string from "Intel(R) OMP" to "LLVM OMP" in libomp.so

Differential Revision: https://reviews.llvm.org/D74462
2020-02-12 15:45:55 -05:00
Johannes Doerfert a5153dbc36 [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D74145
2020-02-11 22:07:14 -06:00
Johannes Doerfert 3ff4e2eee8 [OpenMP] Switch default C++ standard to C++ 14
Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D74258
2020-02-11 17:11:54 -06:00
Jonas Devlieghere 4fe839ef3a [CMake] Rename EXCLUDE_FROM_ALL and make it an argument to add_lit_testsuite
EXCLUDE_FROM_ALL means something else for add_lit_testsuite as it does
for something like add_executable. Distinguish between the two by
renaming the variable and making it an argument to add_lit_testsuite.

Differential revision: https://reviews.llvm.org/D74168
2020-02-06 15:33:18 -08:00
Jon Chesterfield 6a82f0f0b9 [libomptarget] Implement wavefront functions for amdgcn
Summary: [libomptarget] Implement wavefront functions for amdgcn

Reviewers: jdoerfert, ABataev, grokos, arsenm

Reviewed By: arsenm

Subscribers: saiislam, wdng, arsenm, jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D73077
2020-02-04 21:55:29 +00:00
protze@itc.rwth-aachen.de 90e4ebdce5 [OpenMP][OMPT] fix reduction test for 32-bit x86
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44733 | TEST 'libomp :: ompt/synchronization/reduction/tree_reduce.c' FAILED on 32-bit x86 ]]

For 32-bit we need at least 3 variables to prevent atomic reduction from being
chosen by the runtime function `__kmp_determine_reduction_method`.
This patch adds reduction variables to the testcase.

Reviewers: mgorny, Hahnfeld

Differential Revision: https://reviews.llvm.org/D73850
2020-02-04 12:19:10 +01:00
Jon Chesterfield ab9762a9f5 Revert "[nfc][libomptarget] Remove SHARED annotation from local variables"
This reverts commit 0e9374e374.
Revert D73239. It fails some local testing, cause presently unknown
2020-01-27 20:05:17 +00:00
Michał Górny 3c545e4b73 [openmp] Disable archer if LIBOMP_OMPT_SUPPORT is off
This fixed build failures due to missing ompt headers.

See https://bugs.gentoo.org/700762.

Differential Revision: https://reviews.llvm.org/D73249
2020-01-23 19:26:18 +01:00
Kelvin Li ad24cf2a94 [OpenMP] change omp_atk_* and omp_atv_* enumerators to lowercase [NFC]
The OpenMP spec defines the OMP_ATK_* and OMP_ATV_* to be lowercase.

Differential Revision: https://reviews.llvm.org/D73248
2020-01-23 11:15:44 -05:00
Jon Chesterfield 0e9374e374 [nfc][libomptarget] Remove SHARED annotation from local variables
Summary:
[nfc][libomptarget] Remove SHARED annotation from local variables

A few local variables in reduction.cu were marked SHARED. This patch leaves
all per-kernel global state localised in omp_data.cu.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D73239
2020-01-23 00:00:23 +00:00
Alexey Bataev 9148b8b734 [OpenMP][Offloading] Fix the issue that omp_get_num_devices returns wrong number of devices, by Shiley Tian.
Summary:
This patch is to fix issue in the following simple case:

  #include <omp.h>
  #include <stdio.h>

  int main(int argc, char *argv[]) {
    int num = omp_get_num_devices();
    printf("%d\n", num);

    return 0;
  }

Currently it returns 0 even if devices exist. Since this file doesn't contain any
target region, the host entry is empty, so further actions like initialization
do not proceed, leading to the wrong device number being returned by the runtime
function call.

Reviewers: jdoerfert, ABataev, protze.joachim

Reviewed By: ABataev

Subscribers: protze.joachim

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72576
2020-01-21 13:25:18 -05:00
David Carlier ea99c09963 [OpenMP] affinity little fix for FreeBSD
- pthread affinity np has different semantics than its sched affinity counterpart. On success it returns strictly 0.

Reviewers: chandlerc, AndreyChurbanov, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72132
2020-01-20 18:52:10 +00:00
Jon Chesterfield 03c2a59cd6 [libomptarget] Implement smid for amdgcn
Summary:
[libomptarget] Implement smid for amdgcn

Implementation is in a new file as it uses an intrinsic with
complicated encoding that warranted substantial comments.

Reviewers: jdoerfert, grokos, ABataev, ronlieb

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72956
2020-01-20 14:52:17 +00:00
Joachim Protze 39f746d8de [OpenMP][Tool] Fix memory leak and double-allocation
Fix the memory leak pointed out in https://reviews.llvm.org/D70412.
And a second one due to double-allocation.

Reviewed by: Hahnfeld

Differential revision: https://reviews.llvm.org/D72779
2020-01-16 10:05:06 -10:00
George Rokos e244145ab0 [LIBOMPTARGET] Do not increment/decrement the refcount for "declare target" objects
The reference counter for global objects marked with declare target is INF. This patch prevents the runtime from incrementing/decrementing INF refcounts. Without it, the map(delete: global_object) directive actually deallocates the global on the device. With this patch, such a directive becomes a no-op.
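
The behaviour described above can be sketched as follows. This is an illustration only: `InfRefCount`, `MappedEntry`, `incRef`, `decRef`, and `deleteMapping` are hypothetical names, and the all-ones sentinel value is an assumption rather than the runtime's actual encoding.

```cpp
#include <cstdint>

// Sentinel meaning "never deallocate"; the concrete value is an assumption.
constexpr std::uint64_t InfRefCount = ~std::uint64_t{0};

struct MappedEntry {
  std::uint64_t RefCount;
};

// Incrementing leaves INF entries untouched.
inline void incRef(MappedEntry &E) {
  if (E.RefCount != InfRefCount)
    ++E.RefCount;
}

// Returns true when the entry should be deallocated.
inline bool decRef(MappedEntry &E) {
  if (E.RefCount == InfRefCount)
    return false;
  if (E.RefCount > 0)
    --E.RefCount;
  return E.RefCount == 0;
}

// Models map(delete: x): forces the count to zero, except for INF
// entries, where it is a no-op. Returns true if deallocation is due.
inline bool deleteMapping(MappedEntry &E) {
  if (E.RefCount == InfRefCount)
    return false;
  E.RefCount = 0;
  return true;
}
```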

Differential Revision: https://reviews.llvm.org/D72525
2020-01-14 16:30:38 -08:00
Joachim Protze 2d4571bf30 [OpenMP][Tool] Runtime warning for missing TSan-option
TSan spuriously reports for any OpenMP application a race on the initialization
of a runtime internal mutex:

```
Atomic read of size 1 at 0x7b6800005940 by thread T4:
  #0 pthread_mutex_lock <null> (a.out+0x43f39e)
  #1 __kmp_resume_64 <null> (libomp.so.5+0x84db4)

Previous write of size 1 at 0x7b6800005940 by thread T7:
  #0 pthread_mutex_init <null> (a.out+0x424793)
  #1 __kmp_suspend_initialize_thread <null> (libomp.so.5+0x8422e)
```

According to @AndreyChurbanov this is a false positive report, as the control
flow of the runtime guarantees the ordering of the mutex initialization and
the lock:
https://software.intel.com/en-us/forums/intel-open-source-openmp-runtime-library/topic/530363

To suppress this report, I suggest the use of
TSAN_OPTIONS='ignore_uninstrumented_modules=1'.
With this patch, a runtime warning is provided in case an OpenMP application
is built with Tsan and executed without this Tsan-option.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70412
2020-01-14 09:58:05 -10:00
Jon Chesterfield 2a43688a0a [nfc][libomptarget] Refactor nvptx/target_impl.cu
Summary:
[nfc][libomptarget] Refactor nvptx/target_impl.cu

Use __kmpc_impl_atomic_add instead of atomicAdd to match the rest of the file.
Alternatively, target_impl.cu could use the cuda functions directly. Using a mixture in this
file was an oversight, happy to resolve in either direction.

Removed some comments that look outdated.

Call __kmpc_impl_unset_lock directly to avoid a redundant diagnostic and remove an implicit
dependency on interface.h.

Reviewers: ABataev, grokos, jdoerfert

Reviewed By: jdoerfert

Subscribers: jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72719
2020-01-14 19:27:45 +00:00
Jon Chesterfield 2d287bec3c [nfc][libomptarget] Refactor amdgcn target_impl
Summary:
[nfc][libomptarget] Refactor amdgcn target_impl

Removes references to internal libraries from the header
Standardises on C++ mangling for all the target_impl functions
Update comment block
clang-format
Move some functions into a new target_impl.hip source file

This lays the groundwork for implementing the remaining unresolved
symbols in the target_impl.hip source.

Reviewers: jdoerfert, grokos, ABataev, ronlieb

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72712
2020-01-14 19:27:07 +00:00
Joachim Protze ed810da732 [OpenMP][Tool] Improving stack trace for Archer
The OpenMP runtime is not instrumented, so entering the runtime leaves no hint
on the source line of the pragma on ThreadSanitizer's function stack.

This patch adds function entry/exit annotations for OpenMP parallel regions,
and synchronization regions (barrier, taskwait, taskgroup).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70408
2020-01-13 22:14:06 -10:00
Joachim Protze 84637408f2 [OpenMP][Tool] Make tests for archer dependent on TSan
If the openmp project is built standalone, the test compiler is feature tested for an available -fsanitize=thread flag.
If the openmp project is built as part of llvm, the target tsan is needed to test archer.

An additional line (requires tsan) was introduced to the tests, this patch updates the line numbers for the race.

Follow-up for 77ad98c

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D71914
2020-01-13 21:47:58 -10:00
Alexey Bataev b19c0810e5 [LIBOMPTARGET]Ignore empty target descriptors.
Summary:
If the dynamically loaded module has been compiled with -fopenmp-targets
and has no target regions, it has empty target descriptor. It leads to a
crash at the runtime if another module has at least one target region
and at least one entry in its descriptor. The runtime library is unable
to load the empty binary descriptor and terminates the execution.
Caused by a clang-offload-wrapper.

Reviewers: grokos, jdoerfert

Subscribers: caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72472
2020-01-10 09:45:27 -05:00
Kazuaki Ishizaki 4c6a098ad5 [OpenMP] NFC: Fix trivial typos in comments
Reviewers: jdoerfert, Jim

Reviewed By: Jim

Subscribers: Jim, mgorny, guansong, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72285
2020-01-07 14:05:03 +08:00
Kelvin Li 19433b199d [OpenMP] Fix incorrect property of __has_attribute() macro
__has_attribute(fallthough) -> __has_attribute(fallthrough)

Submitted by: kiszk (Kazuaki Ishizaki <ishizaki@jp.ibm.com>)

Differential Revision: https://reviews.llvm.org/D72287
2020-01-06 15:00:10 -05:00
Kelvin Li ed5fe64581 [OpenMP] NFC: Fix trivial typos in comments
Submitted by: kiszk

Differential Revision: https://reviews.llvm.org/D72171
2020-01-03 22:03:42 -05:00
Jon Chesterfield bc48af8c57 [libomptarget][nfc] Change unintentional target_impl prefix to kmpc_impl 2019-12-30 20:50:23 +00:00
protze@itc.rwth-aachen.de 3356e268f6 [OpenMP] Implementation of OMPT reduction callbacks
Including two tests
These callbacks were added late to the 5.0 specification, and an implementation was missing.

Reviewed By: jdoerfert

Differential Review: https://reviews.llvm.org/D70395
2019-12-27 15:30:51 +01:00
Jon Chesterfield 63e2aa5658 [libomptarget][nfc] Provide target_impl malloc/free
Summary:
[libomptarget][nfc] Provide target_impl malloc/free

Sufficient to build support.cu for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71685
2019-12-19 16:54:28 +00:00
JonChesterfield b40822fc14 [libomptarget][nvptx] Fix build, second symbol reordering 2019-12-19 02:02:44 +00:00
Jon Chesterfield 89a2bef27a [libomptarget][nvptx] Fix build, symbol ordering in target_impl.h 2019-12-19 01:50:06 +00:00
JonChesterfield 9aefe5f65e [libomptarget][amdgcn] Correct return type of extern __clock64 to unsigned 2019-12-19 00:11:21 +00:00
Jon Chesterfield 2caeaf2f45 [libomptarget][nfc] Introduce atomic wrapper function
Summary:
[libomptarget][nfc] Introduce atomic wrapper function

Wraps atomic functions in a template prefixed __kmpc_atomic that
dispatches to cuda or hip atomic functions. Intended to be easily extended
to dispatch to OpenCL or C++ atomics for a third target.
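
As a rough illustration of the "third target" mentioned above, the following sketch shows type-generic wrappers backed by C++ `std::atomic`. The `sketchAtomic*` names are hypothetical and deliberately do not reuse the `__kmpc_atomic` prefix; the real wrappers dispatch to cuda or hip functions and may differ in signature.

```cpp
#include <atomic>
#include <cstdint>

// Add Val and return the previous value.
template <typename T> T sketchAtomicAdd(std::atomic<T> &Addr, T Val) {
  return Addr.fetch_add(Val);
}

// Store Val and return the previous value.
template <typename T> T sketchAtomicExchange(std::atomic<T> &Addr, T Val) {
  return Addr.exchange(Val);
}

// Compare-and-swap that returns the value observed before the operation.
template <typename T>
T sketchAtomicCas(std::atomic<T> &Addr, T Compare, T Val) {
  // On failure, Compare is overwritten with the observed value; on
  // success it already equals the old value.
  Addr.compare_exchange_strong(Compare, Val);
  return Compare;
}

// Atomic max built from a CAS loop; returns the previous value.
template <typename T> T sketchAtomicMax(std::atomic<T> &Addr, T Val) {
  T Old = Addr.load();
  while (Old < Val && !Addr.compare_exchange_weak(Old, Val)) {
  }
  return Old;
}
```

Callers see one template name per operation, so swapping the backend does not touch the call sites.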

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: Anastasia, jvesely, mgrang, dexonsmith, llvm-commits, mgorny, jfb, openmp-commits

Tags: #openmp, #llvm

Differential Revision: https://reviews.llvm.org/D71404
2019-12-18 20:06:17 +00:00
JonChesterfield 8adae6027c [libomptarget][nfc] Extract function from data_sharing, move to common
Summary:
[libomptarget][nfc] Extract function from data_sharing, move to common

Finding the first active thread in the warp is different on nvptx and amdgcn,
mostly due to warp size and the desire for efficiency.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71643
2019-12-18 19:39:35 +00:00
Alexey Bataev 15d47deedd [LIBOPENMP][NVPTX]Fix the build error in the runtime. 2019-12-17 14:46:04 -05:00
JonChesterfield 0c83f8ccc7 [libomptarget][nfc] Move three files under common, build them for amdgcn
Summary:
[libomptarget][nfc] Move three files under common, build them for amdgcn

Change to reduction.cu to remove two dead includes, otherwise no code change.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71601
2019-12-17 18:02:49 +00:00
JonChesterfield 3d3e4076cd [libomptarget][nfc] Move omp locks under target_impl
Summary:
[libomptarget][nfc] Move omp locks under target_impl

These are likely to be target specific, even down to the lock_t which is
correspondingly moved out of interface.h. The alternative is to include
interface.h in target_impl which substantially increases the scope of
those symbols.

The current nvptx implementation deadlocks on amdgcn. The preferred
implementation for that arch is still under discussion - this change
leaves declarations in target_impl.

The functions could be inline for nvptx. I'd prefer to keep the internals
hidden in the target_impl translation unit, but will add the (possibly renamed)
macros to target_impl.h if preferred.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71574
2019-12-17 12:18:57 +00:00
Jon Chesterfield ce12a523b0 [libomptarget][nfc] Move timer functions behind target_impl
Summary: [libomptarget][nfc] Move timer functions behind target_impl

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71584
2019-12-17 02:22:29 +00:00
Jon Chesterfield 53bcd1e141 [libomptarget][nfc] Wrap cuda min() in target_impl
Summary:
[libomptarget][nfc] Wrap cuda min() in target_impl

nvptx forwards to cuda min, amdgcn implements directly.
Sufficient to build parallel.cu for amdgcn, added to CMakeLists.

All call sites are homogenous except one that passes a uint32_t and an
int32_t. This could be smoothed over by taking two type parameters
and some care over the return type, but overall I think the inline
<uint32_t> calling attention to what was an implicit sign conversion
is cleaner.
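
A small sketch of the design choice, with hypothetical names (`implMin`, `clampThreads`) rather than the ones used in target_impl: a single-type-parameter min cannot deduce `T` from a `uint32_t` and an `int32_t`, so the mixed call site has to spell out the type.

```cpp
#include <cstdint>

// Hypothetical single-type min wrapper: both arguments must share type T.
template <typename T> T implMin(T X, T Y) { return X < Y ? X : Y; }

// Mixed-type call site: deduction alone would fail (uint32_t vs int32_t),
// so the caller names the type and the sign conversion becomes visible.
inline std::uint32_t clampThreads(std::uint32_t Requested,
                                  std::int32_t Limit) {
  return implMin<std::uint32_t>(Requested, Limit);
}
```

The explicit `<std::uint32_t>` also documents what happens to a negative `Limit`: it is converted to a large unsigned value, so the comparison returns `Requested`.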

Reviewers: ABataev, jdoerfert

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71580
2019-12-17 01:30:04 +00:00
JonChesterfield 69fcc6ecc1 Revert "Revert "[libomptarget] Move resource id functions into target specific code, implement for amdgcn""
Summary:
This reverts commit dd8a7fcdd7.

Alexey reports undefined symbols for the new inline functions defined in target_impl.h
This does not reproduce for me for nvptx, or amdgcn, under release or debug builds.

I believe the patch is fine, based on:
 - the semantics of an inline function in C++ (the cuda INLINE functions end
   up as linkonce_odr in IR), which are only legal to drop if they have no uses
 - the code generated from a debug build of clang 9 does not show these undef symbols
 - the tests pass
 - the code is trivial

To progress from here I either need:
 - A tie break - someone to play the role of CI in determining whether the patch works
 - Alexey to provide sufficient information about his build for me to reproduce the failure
 - Alexey to debug why the symbols are disappearing for him and report back

Reviewers: ABataev, jdoerfert, grokos

Subscribers: jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71502
2019-12-16 16:16:14 +00:00
Alexey Bataev dd8a7fcdd7 Revert "[libomptarget] Move resource id functions into target specific code, implement for amdgcn"
This reverts commit dbb3fec8ad since it
breaks the NVPTX tests.
2019-12-13 16:36:06 -05:00
Jon Chesterfield 40d72134fd [libomptarget] Build most of common/src for amdgcn
Summary:
[libomptarget] Build most of common/src for amdgcn

Excluding parallel.cu, which uses an integer min() from cuda,
Excluding support.cu, which calls malloc that is not yet available for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: gregrodgers, ronlieb, jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71446
2019-12-13 17:48:19 +00:00
Jon Chesterfield 56adcebfda [libomptarget][nfc] Add nop syncwarp function for amdgcn 2019-12-13 14:27:52 +00:00
Jon Chesterfield 479868646a [libomptarget][nfc] Add declarations of atomic functions for amdgcn
Summary:
[libomptarget][nfc] Add declarations of atomic functions for amdgcn

This enables building more source for amdgcn. The functions are usually available
in a hip runtime header, but are duplicated here to decouple the implementation

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71412
2019-12-12 22:56:14 +00:00
Jon Chesterfield dbb3fec8ad [libomptarget] Move resource id functions into target specific code, implement for amdgcn
Summary: [libomptarget] Move resource id functions into target specific code, implement for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71382
2019-12-12 22:49:02 +00:00
Jon Chesterfield b399252028 [libomptarget][nfc] Add missing header for amdgcn/target_impl 2019-12-12 09:36:57 +00:00
David Carlier 27535a1449 [OpenMP] Fix linkage issue on FreeBSD
needs kmp_set_thread_affinity_mask_initial implementation.
2019-12-06 15:47:50 +00:00
JonChesterfield 0dd62c5c2e [libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl
Summary:
[libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl

Part of building code under common/ without requiring a cuda compiler

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: jvesely, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71102
2019-12-06 15:41:18 +00:00
Jon Chesterfield cd90f49d70 [libomptarget][nfc] Move three more files to common
Summary: [libomptarget][nfc] Move three more files to common

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71103
2019-12-06 15:29:50 +00:00
Jon Chesterfield 4af84d2686 [libomptarget][nfc] Introduce SHARED, ALIGN macros
Summary:
[libomptarget][nfc] Introduce SHARED, ALIGN macros
Move remaining cuda attributes behind such macros

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: openmp-commits, jvesely

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71076
2019-12-05 21:57:58 +00:00
Jon Chesterfield d0b9ed5c49 [libomptarget][nfc] Move omptarget-nvptx under common
Summary:
[libomptarget][nfc] Move omptarget-nvptx under common

Almost all files require omptarget-nvptx, which no longer
contains any obviously architecture dependent code. Moving it under
common unblocks task/loop for amdgcn, and allows moving other code.

At some point there should probably be a widespread symbol renaming to
replace the nvptx string. I'd prefer to get things working first.

Building this (and task.cu, loop.cu) without a cuda library requires
some more refactoring, e.g. wrap threadfence(), use DEVICE macro more
consistently. Patches for that are orthogonal and will be posted shortly.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: ABataev

Subscribers: mgorny, fedor.sergeev, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71073
2019-12-05 20:34:15 +00:00
JonChesterfield 3ada8d2a87 [libomptarget] Build a minimal deviceRTL for amdgcn
Summary:
[libomptarget] Build a minimal deviceRTL for amdgcn

Repeat of D70414, with an include path fixed. Diff for sanity checking.

The CMakeLists.txt file is functionally identical to the one used in the aomp fork.
Whitespace changes were made based on nvptx/CMakeLists.txt, plus the
copyright notice updated to match (Greg was the original author so would
like his sign off on that here).

This change will build a small subset of the deviceRTL if an appropriate toolchain is
available, e.g. a local install of rocm. Support.h is moved from nvptx as a dependency
of debug.h.

Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: jvesely, mgorny, jfb, openmp-commits, jdoerfert

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70971
2019-12-04 16:43:37 +00:00
Alexey Bataev 02b9c5d963 Revert "[libomptarget] Build a minimal deviceRTL for amdgcn"
This reverts commit 877ffa716f because it
breaks the build.
2019-12-03 12:35:08 -05:00
Jon Chesterfield 877ffa716f [libomptarget] Build a minimal deviceRTL for amdgcn
Summary:
[libomptarget] Build a minimal deviceRTL for amdgcn

The CMakeLists.txt file is functionally identical to the one used in the aomp fork.
Whitespace changes were made based on nvptx/CMakeLists.txt, plus the
copyright notice updated to match (Greg was the original author so would
like his sign off on that here).

This change will build a small subset of the deviceRTL if an appropriate toolchain is
available, e.g. a local install of rocm. Support.h is moved from nvptx as a dependency
of debug.h.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Reviewed By: jdoerfert

Subscribers: jfb, Hahnfeld, jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70414
2019-12-03 15:18:41 +00:00
Bryan Chan 4d3198e243 [OpenMP] build offload plugins before testing them
Summary:
"make check-all" or "make check-libomptarget" would attempt to run offloading
tests before the offload plugins are built. This patch corrects that by adding
dependencies to the libomptarget CMake rules.

Reviewers: jdoerfert

Subscribers: mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70803
2019-11-28 17:43:56 -05:00
AndreyChurbanov bd2fb41c2d [openmp] Fixed nonmonotonic schedule when #threads > #chunks in a loop.
Differential Revision: https://reviews.llvm.org/D70713
2019-11-27 15:26:51 +03:00
AndreyChurbanov 5f8b8d2820 [openmp] Recognise ARMv7ve machine arch.
Patch by raj.khem (Khem Raj)

Differential Revision: https://reviews.llvm.org/D68543
2019-11-26 14:37:24 +03:00
protze@itc.rwth-aachen.de 77ad98c808 [OpenMP][Tool] archer tests require tsan
Testing for tsan capability in the test-compiler in follow-up review
2019-11-22 17:11:16 +01:00
protze@itc.rwth-aachen.de 6b2431e0c2 [OpenMP][Tool] disable archer tests in standalone build
Will be enabled after Build-Bots are fixed
2019-11-22 15:25:43 +01:00
protze@itc.rwth-aachen.de ac21de0d7e [OpenMP][Tool] Fix cmake variable in lit.site.cfg.in
As noted in D45890
2019-11-22 14:31:54 +01:00
JonChesterfield a84b48d01e [nfc][libomptarget] Remove casts of string literals to char* 2019-11-19 19:41:59 +00:00
JonChesterfield 4681e2e434 [nfc][libomptarget] Write amdgcn macros in terms of compiler intrinsics 2019-11-19 17:23:46 +00:00
AndreyChurbanov 3a76b8a538 Fix openmp on PowerPC64-BE-ELFv2 ABI on FreeBSD.
Patch by adalava (Alfredo Dal'Ava Júnior)

Differential Revision: https://reviews.llvm.org/D67190
2019-11-19 19:45:06 +03:00
Aaron Puchert b29c7fdb61 [OpenMP] Remove -Wl,-fini=__kmp_internal_end_fini
Summary:
The termination function duplicated the functionality of the
__attribute((destructor))-annotated function __kmp_internal_end_fini,
and we have no indication that this doesn't work.
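
For illustration only, a minimal sketch of the attribute-based mechanism this message refers to. It assumes a GCC- or Clang-compatible compiler; every `demo_` name is hypothetical and is not the libomp code.

```c
#include <stdio.h>

/* Hypothetical runtime state, only for this sketch. */
static int demo_runtime_ready = 0;

/* Runs before main() on GCC/Clang-compatible toolchains. */
__attribute__((constructor)) static void demo_runtime_init(void) {
  demo_runtime_ready = 1;
}

/* Runs at normal process exit. Because the compiler itself registers
   the function, no extra linker option is needed to keep it alive. */
__attribute__((destructor)) static void demo_runtime_fini(void) {
  demo_runtime_ready = 0;
  fputs("demo runtime: cleanup done\n", stderr);
}

/* Reports whether the constructor has already run. */
static int demo_runtime_is_ready(void) { return demo_runtime_ready; }
```

Calling `demo_runtime_is_ready()` from `main` should return 1, and the cleanup message is written when the process exits normally.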

The function might cause issues with link-time optimization turned on:
until very recently, none of the usual linkers was reporting functions
named in -Wl,-fini as used to the LTO plugin, so it might be dropped.
If the function is dropped, -Wl,-fini=__kmp_internal_end_fini doesn't
do what we want: with ld.bfd and lld it drops the FINI attribute from
.dynamic and with gold we get FINI = 0x0, which leads to a crash on
cleanup. This can be reproduced by building with

    -DLLVM_ENABLE_PROJECTS="clang;openmp" \
    -DLLVM_ENABLE_LTO=Thin \
    -DLLVM_USE_LINKER=gold

The issue in lld has been fixed in f95273f75a, but gold remains without
fix so far.

Fixes PR43927.

Reviewers: JonChesterfield, jdoerfert, AndreyChurbanov

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D69927
2019-11-19 00:54:58 +01:00
Jon Chesterfield 5a4a05d776 [libomptarget][nfc] Move some source into common from nvptx
Summary:
[libomptarget][nfc] Move some source into common from nvptx

Moves some source that compiles cleanly under amdgcn into a common subdirectory
Includes some non-trivial files and some headers. Keeps the cuda file extension.

The build systems for different architectures seem unlikely to have much in
common. The idea is therefore to set include paths such that files under
common/src compile as if they were under arch/src as the mechanism for sharing.
In particular, files under common/src need to be able to include target_impl.h.

The corresponding -Icommon is left out in favour of explicit includes on the
basis that it makes it clearer which files under common are used by a given
architecture.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: ABataev

Subscribers: jfb, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70328
2019-11-18 18:17:36 +00:00
protze@itc.rwth-aachen.de 2b8115b10b [OpenMP] Add implementation and tests of Archer tool
The tool provides TSAN annotations for OpenMP synchronization. The tool
is activated if no other OMPT tool is loaded.

The tool detects whether the application was built with TSan and rejects
activation according to the OMPT protocol if there is no TSan-rt.

Differential Revision: https://reviews.llvm.org/D45890
2019-11-18 14:45:34 +01:00
Sylvestre Ledru 9b40a7f3bf Remove +x permission on some files 2019-11-16 14:47:20 +01:00
JonChesterfield 32dfbd131d [libomptarget][nfc] Use cuda variable wrappers from support.h
Summary:
[libomptarget][nfc] Use cuda variable wrappers from support.h
Reimplementation of D69693, after the revert of D69885

Use the wrappers in support.h for cuda builtin variables at all call sites.
Localises use of cuda and removes WARPSIZE==32 assumption in debug.h.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70186
2019-11-14 12:45:09 +00:00
JonChesterfield fd9fa9995c [libomptarget] Move supporti.h to support.cu
Summary:
[libomptarget] Move supporti.h to support.cu
Reimplementation of D69652, without the unity build and refactors.
Will need a clean build of libomptarget as the cmakelists changed.

Reviewers: ABataev, jdoerfert

Reviewed By: jdoerfert

Subscribers: mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70131
2019-11-13 11:36:46 +00:00
Michał Górny 6f8ee2c575 [openmp] [test] Skip one more test that kills NetBSD buildbot 2019-11-07 17:29:57 +01:00
Jon Chesterfield 7cea0cea77 [libomptarget] Revert all improvements to support
Summary:
[libomptarget] Revert all improvements to support

The change to unity build for nvcc has broken the build for some developers.
This patch reverts to a known-working state.

There has been some confusion over exactly how the build broke. I think we
have reached a common understanding that the disappearing symbols are from
the bitcode library built by clang. The static archive built by nvcc may show the
same problem. Some of the confusion arose from building the deviceRTL twice
and using one or the other library based on various environmental factors.

I'm pretty sure the problem is clang expanding `__forceinline__` into both `__inline__`
and `attribute(("always_inline"))`. The `__inline__` attribute resolves to linkonce_odr
which is not safe for exporting symbols from translation units.

"always_inline" is the desired semantic for small functions defined in one translation
unit that are intended to be inlined at link time. "inline" is not.

This therefore reintroduces the dependency hazard of supporti.h and some code
duplication, and blocks progress separating deviceRTL into reusable components.

See also D69857, D69859 for attempts at a fix instead of a revert.

Reviewers: ABataev, jdoerfert, grokos, ikitayama, tianshilei1992

Reviewed By: ABataev

Subscribers: mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69885
2019-11-06 15:44:10 +00:00
Ron Lieberman dc34b1c94d Test commit: adds a . to comment. NFC 2019-11-04 16:51:03 -06:00
JonChesterfield 94c59ea8dd [libomptarget] Implement target_impl for amdgcn
Summary:
[libomptarget] Implement target_impl for amdgcn

Smallest atomic addition for a new target. Implements enough of the amdgcn
specific code that some of the source files under nvptx/src could be compiled,
without modification, to run on amdgcn.

This foreshadows a work in progress patch to move said source out of nvptx/src.
Patch based on fork at https://github.com/ROCm-Developer-Tools/llvm-project

Reviewers: ABataev, jdoerfert, grokos, ronlieb

Subscribers: jvesely, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69718
2019-11-01 15:46:35 +00:00
Alexey Bataev e57f8ad914 [LIBOMPTARGET]Call GetLaneId function, do not use its address in debug
log functions.
2019-11-01 09:43:47 -04:00
JonChesterfield 9b06ac98d0 [nfc][omptarget] Use builtin var abstraction. Second pass at D69476
Summary:
[nfc][omptarget] Use builtin var abstraction. Second pass at D69476

Use the wrappers in support.h for cuda builtin variables at all call sites.
Localises use of cuda and removes WARPSIZE==32 assumption in debug.h.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69693
2019-11-01 02:21:44 +00:00
JonChesterfield 764c8420e4 [nfc][libomptarget] Reorganise support header
Summary:
[nfc][libomptarget] Reorganise support header

All functions defined in support implementation are now declared in support.h
Reordered functions in support implementation to match the sequence in support.h
Added include guards to support.h
Added #include interface to support.h to provide kmp_Ident declaration
Move supporti.h to support.cu and s/INLINE/EXTERN/g
Add remaining includes to support.cu

A minor side effect is to change the name mangling of the support functions to
extern "C". If this matters another macro along the lines of INLINE/EXTERN
can be added - perhaps DEVICE as that's the obvious implementation.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69652
2019-10-31 17:15:02 +00:00
Jon Chesterfield e9f9dfab82 [libomptarget] Change nvcc compilation to use a unity build
Summary:
[libomptarget] Change nvcc compilation to use a unity build

This allows nvcc to inline functions between what would otherwise be distinct
translation units, which in turn removes any runtime cost from implementing
functions in source files (as opposed to inline in headers).

This will then allow the circular dependencies in deviceRTL to be readily
broken and individual components more easily shared between architectures.

Reviewers: ABataev, jdoerfert, grokos, RaviNarayanaswamy, hfinkel, ronlieb, gregrodgers

Reviewed By: jdoerfert

Subscribers: mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69489
2019-10-31 01:58:51 +00:00
Jon Chesterfield 8548e2f543 [nfc][libomptarget] Move named_sync() into target_impl
Summary: [nfc][libomptarget] Move named_sync() into target_impl

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69487
2019-10-30 16:25:05 +00:00
David Carlier 5069928487 [OpenMP] Reset affinity mask in the process child on FreeBSD
Reviewers: dim, chandlerc, jdoerfert

Reviewed By: dim

Differential Revision: https://reviews.llvm.org/D69047
2019-10-30 14:51:22 +00:00
Jon Chesterfield 74bb5ee674 [nfc][libomptarget] Move smid() into target_impl
Summary: [nfc][libomptarget] Move smid() into target_impl

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69485
2019-10-30 13:39:15 +00:00
Jon Chesterfield 62a161cc00 [libomptarget] Always call malloc, free via SafeMalloc, SafeFree wrapper
Summary:
[libomptarget] Always call malloc, free via SafeMalloc, SafeFree wrapper

NFC for release, adds some verbosity to debug printing. Motivation is to provide
one place where local modifications can be made to the behaviour of all heap
allocation or deallocation while debugging.
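
As a rough sketch of the idea (not the deviceRTL implementation), the wrappers below route all heap traffic through one pair of functions. The `demo_` names and the live-allocation counter are assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: number of blocks currently allocated. */
static long demo_live_blocks = 0;

/* Single entry point for allocation; 'why' labels the call site. */
static void *demo_safe_malloc(size_t size, const char *why) {
  void *ptr = malloc(size);
  if (ptr != NULL)
    ++demo_live_blocks;
  fprintf(stderr, "malloc(%lu) for %s -> %p\n", (unsigned long)size, why, ptr);
  return ptr;
}

/* Single entry point for deallocation. */
static void demo_safe_free(void *ptr, const char *why) {
  fprintf(stderr, "free(%p) for %s\n", ptr, why);
  if (ptr != NULL)
    --demo_live_blocks;
  free(ptr);
}

/* Lets a debugging session check for leaks at any point. */
static long demo_live_allocations(void) { return demo_live_blocks; }
```

With every call site going through these two functions, a local change such as poisoning freed memory or logging to a file only has to be made in one place.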

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69492
2019-10-30 13:35:34 +00:00
AndreyChurbanov 27f6eedc57 Enable OpenBSD support.
Patch by devnexen (David CARLIER)

Differential Revision: https://reviews.llvm.org/D69220
2019-10-30 12:37:44 +03:00
Alexey Bataev d7941a6ab9 [LIBOMPTARGET]Fix build, NFC.
Need to include nvptx_interface.h in target_impl.h, otherwise the build
fails because of the missing __kmpc_impl_lanemask_t type.
2019-10-28 10:43:00 -04:00
Jon Chesterfield 174967f153 [nfc][libomptarget] Decrease coupling between files
Summary:
[nfc][libomptarget] Decrease coupling between files

debug.h used the symbol omptarget_device_environment so implicitly required
an include of omptarget-nvptx.h to compile. Similarly interface.h uses size_t.

Moving this declaration to a new header means cancel, critical can now build
without omptarget-nvptx.h. After this change, debug.h, cancel.cu, critical.cu
could move under a common source directory.

Reviewers: ABataev, jdoerfert, grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69473
2019-10-27 14:27:54 +00:00
Jon Chesterfield ad4c42666d [nfc][libomptarget] Inline option into target_impl
Summary:
[nfc][libomptarget] Inline option into target_impl

Subset of D69423. The macros that were in option.h are all target dependent.
Inlining the header simplifies the dependency graph when looking to move code
into a common subdir.

Reviewers: ABataev, jdoerfert, grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69472
2019-10-27 14:26:55 +00:00
Jon Chesterfield f7c3c640af [NFC][libomptarget]Remove TRUE,FALSE macros from option.h
Summary:
[NFC][libomptarget]Remove TRUE,FALSE macros from option.h
Subset of D69423. Patch series ends with removing option.h.

Reviewers: ABataev, jdoerfert, grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69463
2019-10-27 01:31:12 +01:00
Jon Chesterfield 197b7b24c3 [NFC][libomptarget] move remaining device specific code out of omptarget-nvptx.h
Summary:
[NFC][libomptarget] move remaining device specific code out of omptarget-nvptx.h

Strictly there is one remaining difference wrt amdgcn - parallelLevel is
volatile qualified on amdgcn and not on nvptx. Determining whether this is
correct - and how to represent the different semantics of 'volatile' under
various conditions - is beyond the scope of this code motion patch.

Reviewers: ABataev, jdoerfert, grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69424
2019-10-25 18:58:31 +01:00
AndreyChurbanov be29d92854 OpenMP Tasks dependencies hash re-sizing fixed.
Details:
- nconflicts field initialized;
- formatting fix (moved declaration out of the long line);
- count conflicts in new hash as opposed to old one.
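
A simplified sketch of the last point, assuming a chained hash table keyed by address. The types and `demo_` functions are hypothetical and far smaller than the libomp task dependency hash.

```c
#include <stdint.h>
#include <stdlib.h>

struct demo_entry {
  uintptr_t key;
  struct demo_entry *next;
};

struct demo_hash {
  size_t size;       /* number of buckets */
  size_t nelements;  /* entries stored */
  size_t nconflicts; /* inserts that landed in a non-empty bucket */
  struct demo_entry **buckets;
};

static struct demo_hash *demo_hash_create(size_t size) {
  struct demo_hash *h = malloc(sizeof *h);
  if (h == NULL)
    return NULL;
  h->buckets = calloc(size, sizeof *h->buckets);
  if (h->buckets == NULL) {
    free(h);
    return NULL;
  }
  h->size = size;
  h->nelements = 0;
  h->nconflicts = 0; /* explicitly initialized */
  return h;
}

/* Link an existing entry into h, updating h's own counters. */
static void demo_hash_link(struct demo_hash *h, struct demo_entry *e) {
  size_t b = e->key % h->size;
  if (h->buckets[b] != NULL)
    h->nconflicts++;
  e->next = h->buckets[b];
  h->buckets[b] = e;
  h->nelements++;
}

static int demo_hash_insert(struct demo_hash *h, uintptr_t key) {
  struct demo_entry *e = malloc(sizeof *e);
  if (e == NULL)
    return -1;
  e->key = key;
  demo_hash_link(h, e);
  return 0;
}

/* Move every entry into a larger table; conflicts are counted against
   the new table, so the old count does not carry over. */
static struct demo_hash *demo_hash_resize(struct demo_hash *old, size_t new_size) {
  struct demo_hash *h = demo_hash_create(new_size);
  if (h == NULL)
    return old;
  for (size_t i = 0; i < old->size; ++i) {
    struct demo_entry *e = old->buckets[i];
    while (e != NULL) {
      struct demo_entry *next = e->next;
      demo_hash_link(h, e);
      e = next;
    }
  }
  free(old->buckets);
  free(old);
  return h;
}
```

Counting in the new table matters because a later resize decision based on `nconflicts` should reflect the current bucket layout rather than the one that was just discarded.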

Differential Revision: https://reviews.llvm.org/D68036
2019-10-25 16:04:46 +03:00
Stephan T. Lavavej 2e4f1e112d [www] Change URLs to HTTPS.
This changes most URLs in llvm's html files to HTTPS. Most changes were
search-and-replace with manual verification; some changes were manual.
For a few URLs, the websites were performing redirects or had changed
their anchors; I fixed those up manually. This consistently uses the
official https://wg21.link redirector. This also strips trailing
whitespace and fixes a couple of typos.

Fixes D69363.

There are a very small number of dead links for which I don't know any
replacements (they are equally dead as HTTP or HTTPS):

https://llvm.org/cmds/llvm2cpp.html
https://llvm.org/devmtg/2010-11/videos/Grosser_Polly-desktop.mp4
https://llvm.org/devmtg/2010-11/videos/Grosser_Polly-mobile.mp4
https://llvm.org/devmtg/2011-11/videos/Grosser_PollyOptimizations-desktop.mov
https://llvm.org/devmtg/2011-11/videos/Grosser_PollyOptimizations-mobile.mp4
https://llvm.org/perf/db_default/v4/nts/22463
https://polly.llvm.org/documentation/memaccess.html
2019-10-24 13:25:15 -07:00
Jon Chesterfield d69d1aa131 [libomptarget][nfc] Make interface.h target independent
Summary:
[libomptarget][nfc] Make interface.h target independent

Move interface.h under a top level include directory.
Remove #includes to avoid the interface depending on the implementation.

Reviewers: ABataev, jdoerfert, grokos, ronlieb, RaviNarayanaswamy

Reviewed By: jdoerfert

Subscribers: mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D68615

llvm-svn: 374919
2019-10-15 17:15:26 +00:00
David Carlier f61f13d4e7 [OpenMP] Enable thread affinity on FreeBSD
Reviewers: chandlerc, jlpeyton, jdoerfert, dim

Reviewed-By: dim

Differential Revision: https://reviews.llvm.org/D68580

llvm-svn: 374118
2019-10-08 21:25:30 +00:00
Andrey Churbanov ca2973bb20 Don't assume Type from `readelf -d` has parentheses
Patch by jbeich (Jan Beich)

Differential Revision: https://reviews.llvm.org/D68053

llvm-svn: 374038
2019-10-08 12:39:04 +00:00
Andrey Churbanov f34271d886 Don't link libm with -Wl,--as-needed on FreeBSD
Patch by jbeich (Jan Beich)

Differential Revision: https://reviews.llvm.org/D68051

llvm-svn: 374037
2019-10-08 12:23:25 +00:00
Jon Chesterfield 58fd6b5b9c [libomptarget][nfc] Update remaining uint32 to use lanemask_t
Summary:
[libomptarget][nfc] Update remaining uint32 to use lanemask_t

Update a few functions in the API to use lanemask_t instead of i32. NFC for
nvptx. Also update the ActiveThreads type in DataSharingStateTy.
This removes a lot of #ifdef from the downstream amdgcn implementation.
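
To illustrate why a dedicated mask type helps, here is a minimal sketch. `DEMO_WAVE64`, `demo_lanemask_t` and the helper functions are hypothetical stand-ins, not the actual target_impl.h definitions.

```c
#include <stdint.h>

#if defined(DEMO_WAVE64) /* hypothetical build flag for a 64-lane target */
typedef uint64_t demo_lanemask_t;
enum { DEMO_WARPSIZE = 64 };
#else
typedef uint32_t demo_lanemask_t;
enum { DEMO_WARPSIZE = 32 };
#endif

/* Named constant for "every lane", valid for either mask width. */
static const demo_lanemask_t DEMO_ALL_LANES = (demo_lanemask_t)~(demo_lanemask_t)0;

/* Count the lanes set in a mask (clears the lowest set bit each step). */
static int demo_count_lanes(demo_lanemask_t mask) {
  int n = 0;
  while (mask != 0) {
    mask &= mask - 1;
    ++n;
  }
  return n;
}

/* Test one lane; 'lane' is assumed to be below DEMO_WARPSIZE. */
static int demo_lane_is_active(demo_lanemask_t mask, unsigned lane) {
  return (int)((mask >> lane) & 1u);
}
```

Code written against `demo_lanemask_t` and `DEMO_ALL_LANES` compiles unchanged for both widths, whereas a hard-coded `uint32_t` and `0xFFFFFFFF` would silently drop the upper 32 lanes on a 64-wide target.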

Reviewers: ABataev, jdoerfert, grokos, ronlieb, RaviNarayanaswamy

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D68513

llvm-svn: 373806
2019-10-04 22:30:28 +00:00
Jon Chesterfield 4f75a73796 Use named constant to indicate all lanes, to handle 32 and 64 wide architectures
Summary: Use named constant to indicate all lanes, to handle 32 and 64 wide architectures

Reviewers: ABataev, jdoerfert, grokos, ronlieb

Reviewed By: grokos

Subscribers: ronlieb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D68369

llvm-svn: 373793
2019-10-04 21:39:22 +00:00
David Carlier fef62e1a68 [OpenMP] FreeBSD address check if mapped more native
/proc is not present unless the Linux compatibility layer (for CentOS) is activated,
thus relying on a more native method for checking the address.

Reviewers: Hahnfeld, kongyl, jdoerfert, jlpeyton, AndreyChurbanov, emaster, dim

Reviewed By: Hahnfeld

Differential Revision: https://reviews.llvm.org/D67326

llvm-svn: 373152
2019-09-28 19:01:59 +00:00
Sergey Dmitriev 4b343fd84c [Clang][OpenMP Offload] Create start/end symbols for the offloading entry table with a help of a linker
Linker automatically provides __start_<section name> and __stop_<section name> symbols to satisfy unresolved references if <section name> is representable as a C identifier (see https://sourceware.org/binutils/docs/ld/Input-Section-Example.html for details). These symbols indicate the start address and end address of the output section respectively. Therefore, renaming OpenMP offload entries section name from ".omp.offloading_entries" to "omp_offloading_entries" to use this feature.
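
The linker feature described above can be sketched as follows. This assumes an ELF target with GNU ld or lld and a GCC- or Clang-compatible compiler; the section name `demo_entries` and all `demo_` identifiers are made up for the example and are unrelated to the real offload entry layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entry record, 16 bytes on typical 64-bit targets. */
struct demo_entry {
  const char *name;
  long id;
};

/* Place an entry in a section whose name is a valid C identifier. */
#define DEMO_ENTRY(n, i)                                        \
  static const struct demo_entry demo_entry_##n                 \
      __attribute__((used, section("demo_entries"))) = {#n, i}

DEMO_ENTRY(foo, 1);
DEMO_ENTRY(bar, 2);

/* Never defined in source: the linker synthesizes these symbols
   because the section name is representable as a C identifier. */
extern const struct demo_entry __start_demo_entries[];
extern const struct demo_entry __stop_demo_entries[];

static size_t demo_entry_count(void) {
  return (size_t)(__stop_demo_entries - __start_demo_entries);
}

/* Returns the id of the named entry, or -1 if it is not in the table. */
static long demo_find_entry(const char *name) {
  for (const struct demo_entry *e = __start_demo_entries;
       e != __stop_demo_entries; ++e)
    if (strcmp(e->name, name) == 0)
      return e->id;
  return -1;
}
```

A section named `.demo.entries` would not get these symbols, which is why the rename drops the leading dot and the inner dot.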

This is the first part of the patch for eliminating OpenMP linker script (please see https://reviews.llvm.org/D64943).

Differential Revision: https://reviews.llvm.org/D68070

llvm-svn: 373118
2019-09-27 20:00:51 +00:00
Andrey Churbanov de44f434e8 fixed test: eliminated race condition which might cause deadlock
llvm-svn: 372887
2019-09-25 15:25:52 +00:00
Andrey Churbanov a1639b9bba Enable tasks dependencies hashmaps resizing.
Patch by viroulep (Philippe Virouleau)

Differential Revision: https://reviews.llvm.org/D67447

llvm-svn: 372879
2019-09-25 14:40:19 +00:00
Jonas Hahnfeld 673e5476a8 [OpenMP] Change initialization of __kmp_global
There's no need to initialize variables with static storage duration
because they're implicitly initialized to zero. See
https://en.cppreference.com/w/c/language/initialization#Implicit_initialization

I think that's already relied upon because the supplied 0 only sets
'kmp_time_global_t g_time;' in 'struct kmp_base_global'. The other fields
are not set in the code, but implicitly initialized by the compiler.
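
A small sketch of the language rule being relied on. The struct below is a hypothetical stand-in for the real `kmp_base_global`, with invented field names apart from `g_time`.

```c
/* Hypothetical stand-ins for the runtime's global state. */
struct demo_time {
  long t_value;
};

struct demo_global {
  struct demo_time g_time;
  int g_abort;
  int g_done;
};

/* No initializer: static storage duration means every member is zero. */
static struct demo_global demo_implicit;

/* The explicit 0 only names g_time; the remaining members are still
   zero-initialized by the compiler, so both objects end up identical. */
static struct demo_global demo_explicit = {{0}};

/* Compare member by member to avoid depending on padding bytes. */
static int demo_is_all_zero(const struct demo_global *g) {
  return g->g_time.t_value == 0 && g->g_abort == 0 && g->g_done == 0;
}
```

Both `demo_is_all_zero(&demo_implicit)` and `demo_is_all_zero(&demo_explicit)` return 1, which is the equivalence the change depends on.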

Differential Revision: https://reviews.llvm.org/D66292

llvm-svn: 370943
2019-09-04 17:47:37 +00:00
Alexey Bataev 4812941776 [OPENMP][NVPTX]Fix parallel level counter in non-SPMD mode.
Summary:
In non-SPMD mode we may end up with the divergent threads when trying to
increment/decrement parallel level counter. It may lead to incorrect
calculations of the parallel level and wrong results when threads are
divergent. We need to reconverge the threads before trying to modify the
parallel level counter.

Reviewers: grokos, jdoerfert

Subscribers: guansong, openmp-commits, caomhin, kkwli0

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66802

llvm-svn: 370803
2019-09-03 18:11:50 +00:00
Jon Chesterfield bbdd282371 [libomptarget] Refactor activemask macro to inline function
Summary:
[libomptarget] Refactor activemask macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Reviewed By: jdoerfert, ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66851

llvm-svn: 370781
2019-09-03 16:31:30 +00:00
Jon Chesterfield 3294421926 Use target_impl functions to replace more inline asm
Summary:
Use target_impl functions to replace more inline asm
Follow on from D65836. Removes remaining asm shuffles and lanemask accessors
Also changes the types of target_impl bitwise functions to unsigned.

Reviewers: jdoerfert, ABataev, grokos, Hahnfeld, gregrodgers, ronlieb, hfinkel

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66809

llvm-svn: 370216
2019-08-28 15:04:06 +00:00
Jon Chesterfield 80f9a38a76 [libomptarget] Refactor syncthreads macro to inline function
Summary:
[libomptarget] Refactor syncthreads macro to inline function
See also abandoned D66846, split into this diff and others.

Rev 2 of D66855

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66861

llvm-svn: 370210
2019-08-28 14:22:35 +00:00
Jon Chesterfield be3d487313 [libomptarget] Refactor syncwarp macro to inline function
Summary:
[libomptarget] Refactor syncwarp macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66857

llvm-svn: 370149
2019-08-28 02:02:53 +00:00
Jon Chesterfield e73e3013a6 Fix build break due to close brace lost in merge
llvm-svn: 370148
2019-08-28 01:56:26 +00:00
Jon Chesterfield 327aa81123 [libomptarget] Refactor shfl_down_sync macro to inline function
Summary:
[libomptarget] Refactor shfl_down_sync macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66853

llvm-svn: 370146
2019-08-28 01:47:41 +00:00
Jon Chesterfield b9b712df82 [libomptarget] Refactor shfl_sync macro to inline function
Summary:
[libomptarget] Refactor shfl_sync macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66852

llvm-svn: 370144
2019-08-28 01:31:04 +00:00
Alexey Bataev da8b5cc9f1 [OPENMP][NVPTX]Add __kmpc_syncwarp(int32_t) function.
Summary:
Added function void __kmpc_syncwarp(int32_t) to expose it to the
compiler. It is required to fix the problem with the critical regions in
Cuda9.0+. We cannot use a barrier in the critical region, but still need
to reconverge the threads in the warp afterwards. This function allows us
to do this.

Reviewers: grokos, jdoerfert

Subscribers: guansong, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66672

llvm-svn: 369933
2019-08-26 17:32:45 +00:00
Alexey Bataev 0366168f3a [OPENMP][NVPTX]Use __syncwarp() to reconverge the threads.
Summary:
In Cuda 9.0 it is not guaranteed that threads in the warps are
convergent. We need to use __syncwarp() function to reconverge
the threads and to guarantee the memory ordering among threads in the
warps.
This is the first patch to fix the problem with the test
libomptarget/deviceRTLs/nvptx/src/sync.cu on Cuda9+.
This patch just replaces calls to __shfl_sync() function with the call
of __syncwarp() function where we need to reconverge the threads when we
try to modify the value of the parallel level counter.

Reviewers: grokos

Subscribers: guansong, jfb, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65013

llvm-svn: 369796
2019-08-23 18:34:48 +00:00
Jonathan Peyton 57ae6b8e37 Force honoring nthreads-var and thread-limit-var inside teams construct on host
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=42906 by adjusting
the number of threads on entry to the teams construct on the host
according to user settings. This allows the checks to pass and avoids
assertions when the team of threads is created.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D66351

llvm-svn: 369430
2019-08-20 19:39:17 +00:00
Jonas Hahnfeld d2ae0c4f44 [OpenMP] Enable warning about "implicit fallthrough"
Fix last warned location in ittnotify_static.cpp using the defined
macro KMP_FALLTHROUGH().

Differential Revision: https://reviews.llvm.org/D65871

llvm-svn: 369003
2019-08-15 13:26:55 +00:00
Jonas Hahnfeld 4d77e50e6e [OpenMP] Remove 'unnecessary parentheses'
The variables in kmp_lock.cpp are really arrays of function pointers
that return void or int, not pointers to functions that return void*
or int*. The other changes are only cosmetic.
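
The declarator distinction can be sketched like this. The lock type and functions are hypothetical and only mirror the shape of the tables described, not the contents of kmp_lock.cpp.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
  int held;
} demo_lock_t;

static void demo_init(demo_lock_t *l) { l->held = 0; }
static void demo_destroy(demo_lock_t *l) { l->held = -1; }
static int demo_acquire(demo_lock_t *l) { l->held = 1; return 1; }
static int demo_test(demo_lock_t *l) { return l->held == 1; }

/* Array of pointers to functions returning void. */
static void (*demo_void_ops[])(demo_lock_t *) = {demo_init, demo_destroy};

/* Array of pointers to functions returning int. */
static int (*demo_int_ops[])(demo_lock_t *) = {demo_acquire, demo_test};

/* By contrast: a single pointer to a function returning void *. */
static void *(*demo_alloc)(size_t) = malloc;
```

Reading each declarator from the name outwards gives the type: `demo_void_ops` is an array of pointers to functions, while in `demo_alloc` the leading `*` belongs to the return type.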

Differential Revision: https://reviews.llvm.org/D65870

llvm-svn: 369002
2019-08-15 13:26:41 +00:00
Jonas Hahnfeld fb72a03f85 [OMPT] Resolve warnings because of ints in if conditions
The implementation status can only be one of
ompt_event_UNIMPLEMENTED = ompt_set_never = 1
ompt_event_MAY_ALWAYS = ompt_set_always = 5

In both cases, the condition was already true, so just remove
the check.

Differential Revision: https://reviews.llvm.org/D65869

llvm-svn: 369001
2019-08-15 13:26:29 +00:00
Jonas Hahnfeld dc23c832f4 [OpenMP] Turn on -Wall compiler warnings by default
Instead, maintain a list of disabled options to still build libomp and
libomptarget without warnings. This includes -Wno-error and -Wno-pedantic
to silence warnings that LLVM enables when building in-tree.

I tested the following compilers:
 * Clang 6.0, 7.0, 8.0
 * GCC 4.8.5 (CentOS 7), GCC 6, 7, 8, 9
 * Intel Compiler 16, 17, 18, 19

RFC thread on openmp-dev mailing list:
http://lists.llvm.org/pipermail/openmp-dev/2019-August/002668.html

Differential Revision: https://reviews.llvm.org/D65867

llvm-svn: 368999
2019-08-15 13:11:50 +00:00
Jon Chesterfield ed3324f6b6 Factor architecture dependent code out of loop.cu
Summary:
[libomptarget] Factor architecture dependent code out of loop.cu

Related to the patch series starting D64217. Added subscribers to said series as reviewers. This effort is smaller in scope.

This patch factors out just enough architecture dependent code from loop.cu to allow the same source to be used with amdgcn, given a different target_impl.h. Testing is that the same bitcode (modulo variable names) is generated for libomptarget before and after the refactor, for nvptx and the out of tree amdgcn.

Reviewers: jdoerfert, ABataev, bollu, jfb, tra, grokos, Hahnfeld, guansong, xtian, gregrodgers, ronlieb, hfinkel, gtbercea, guraypp, arpith-jacob

Reviewed By: jdoerfert, ABataev

Subscribers: dexonsmith, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65836

llvm-svn: 368751
2019-08-13 21:41:47 +00:00
Andrey Churbanov 5eec1a9d32 Cleanup unused variable.
This patch fixes problem raised in post-review comments of the
https://reviews.llvm.org/D65285. Developers of ittnotify confirmed
that dll_path_ptr field of the __itt_global structure is never used
by ittnotify library, so it is safe to remove the dll_path array.

Differential Revision: https://reviews.llvm.org/D65885

llvm-svn: 368559
2019-08-12 12:37:30 +00:00
Gheorghe-Teodor Bercea 6c7b882e52 [OpenMP][libomptarget] Add support for close map modifier
Summary:
This patch adds support for the close map modifier.

The close map modifier will overwrite the unified shared memory requirement and create a device copy of the data.

Reviewers: ABataev, Hahnfeld, caomhin, grokos, jdoerfert, AlexEichenberger

Reviewed By: Hahnfeld, AlexEichenberger

Subscribers: guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65340

llvm-svn: 368488
2019-08-09 21:32:57 +00:00
Jonas Hahnfeld 7a0f2dc5a4 [libomptarget] Remove duplicate RTLRequiresFlags per device
We have one global RTLs.RequiresFlags, I don't see a need to make a
copy per device that the runtime manages. This was problematic anyway
because the copy happened during the first __tgt_register_lib(). This
made it impossible to call __tgt_register_requires() from normal user
functions for testing.
Hence, this change also fixes unified_shared_memory/shared_update.c for
older versions of Clang that don't call __tgt_register_requires() before
__tgt_register_lib().

Differential Revision: https://reviews.llvm.org/D66019

llvm-svn: 368465
2019-08-09 19:20:39 +00:00
Gheorghe-Teodor Bercea a1d20506e7 [OpenMP][libomptarget] Add support for unified memory for regular maps
Summary:
This patch adds support for using unified memory in the case of regular maps that happen when a target region is offloaded to the device.

For cases where only a single version of the data is required then the host address can be used. When variables need to be privatized in any way or globalized, then the copy to the device is still required for correctness.

Reviewers: ABataev, jdoerfert, Hahnfeld, AlexEichenberger, caomhin, grokos

Reviewed By: Hahnfeld

Subscribers: mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65001

llvm-svn: 368192
2019-08-07 17:29:45 +00:00
Jon Chesterfield ae0178bee7 Use forceinline. Necessary for nvcc to inline small functions within the bitcode library
Summary:
[libomptarget] Use forceinline. Necessary for nvcc to inline small functions within the bitcode library
Suggested in D65836

Reviewers: ABataev, jdoerfert, grokos, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65876

llvm-svn: 368177
2019-08-07 15:24:12 +00:00
Alexey Bataev c10180ed8e [OPENMP][OFFLOADING]Fix the test, NFC.
llvm-svn: 368068
2019-08-06 18:13:39 +00:00
Jonathan Peyton 73d5abd809 [OpenMP] Add support for GOMP_*_nonmonotonic_* functions
Patch by Isuru Fernando

Differential Revision: https://reviews.llvm.org/D65714

llvm-svn: 367949
2019-08-05 23:23:52 +00:00
Hansang Bae dcdbe6515b [OpenMP] Fix broken build due to new OMPT tests
New OMPT tests with teams construct should be disabled for GCC as it
emits code with a GOMP entry not supported in the LLVM runtime.

Differential Revision: https://reviews.llvm.org/D65757

llvm-svn: 367939
2019-08-05 21:46:13 +00:00
Michael Kruse 78769ec403 [libomptarget] Harmonize emitting CUDA errors and general debug messages.
Ensures that CUDA fail reasons (such as "No CUDA-capable device detected")
are printed together with libomptarget's debug message
(e.g. "Error when setting CUDA context"). Previously, the former was
printed only in CMAKE_BUILD_TYPE=Debug builds while the latter was
enabled by LIBOMPTARGET_ENABLE_DEBUG.

With this change, also only call cuGetErrorString when the error will be
printed.

Suggested-by: Ye Luo <xw111luoye@gmail.com>

Differential Revision: https://reviews.llvm.org/D65687

llvm-svn: 367910
2019-08-05 19:12:10 +00:00
Michael Kruse 2c7a8eaf3d [OpenMP 5.0] libomptarget interface for declare mapper functions.
This patch implements the libomptarget runtime interface for OpenMP 5.0
declare mapper functions. The declare mapper functions generated by
Clang will call them to complete the mapping of members.
kmpc_mapper_num_components gets the current number of components for a
user-defined mapper; kmpc_push_mapper_component pushes back one
component for a user-defined mapper.

The design slides can be found at
https://github.com/lingda-li/public-sharing/blob/master/mapper_runtime_design.pptx

Patch by Lingda Li <lildmh@gmail.com>

Differential Revision: https://reviews.llvm.org/D60972

llvm-svn: 367772
2019-08-04 04:18:28 +00:00
Hansang Bae 67e93a1ae0 Add OMPT support for teams construct
This change adds OMPT support for events from teams construct.

Differential Revision: https://reviews.llvm.org/D64025

llvm-svn: 367746
2019-08-03 02:38:53 +00:00
Jonas Hahnfeld 52b87ac32f [OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS
All other files are already C++ and the build system has always
passed '-x c++' for C files, effectively compiling them as C++.

To stay warning free we need one fix in ittnotify_static.{c,cpp}:
The variable dll_path can be written to, so it must not be const.
GCC complained with -Wcast-qual and I think it's right.

Differential Revision: https://reviews.llvm.org/D65285

llvm-svn: 367343
2019-07-30 18:37:28 +00:00
Yi Kong 3d21a3af87 [openmp] Workaround bug in old Android pthread_attr_setstacksize
Round the stack size to a multiple of the page size. Older versions of
Android (until KitKat) would fail pthread_attr_setstacksize with
EINVAL if the stack size was not a multiple of the page size.
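
The rounding described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `round_up_to_page` is a hypothetical helper, and the page size is passed in as a parameter (in real code it would typically come from `sysconf(_SC_PAGESIZE)`, which is an assumption here).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round `size` up to the next multiple of `page_size`.
// A size that is already a multiple of the page size is returned unchanged.
std::size_t round_up_to_page(std::size_t size, std::size_t page_size) {
  assert(page_size != 0 && "page size must be non-zero");
  std::size_t remainder = size % page_size;
  return remainder == 0 ? size : size + (page_size - remainder);
}
```

The rounded value would then be the one handed to pthread_attr_setstacksize, so that older Android versions do not reject it with EINVAL.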

Patch by Dan Albert <danalbert@google.com>.

Test: Build, copied into the NDK, passed openmp test on ICS.
Bug: https://github.com/android-ndk/ndk/issues/9
llvm-svn: 367070
2019-07-25 22:29:55 +00:00
Jonas Hahnfeld baeab1fc44 [OpenMP] Fix build of stubs library, NFC.
Both Clang and GCC complained that they cannot initialize a return
object of type 'kmp_proc_bind_t' with an 'int'. While at it, also
fix a warning about missing parentheses thrown by Clang.

Differential Revision: https://reviews.llvm.org/D65284

llvm-svn: 367041
2019-07-25 17:51:24 +00:00
Alexey Bataev ca424d100c [OPENMP][NVPTX]Perform memory flush if number of threads to sync is 1 or less.
Summary:
According to the OpenMP standard, barrier operation must perform
implicit flush operation. Currently, if there is only one thread in the
team, barrier does not flush the memory. Patch fixes this problem.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62398

llvm-svn: 367024
2019-07-25 15:02:28 +00:00
Jonas Hahnfeld 2488ae9df1 [OpenMP] RISCV64 port
This is a port of libomp for the RISC-V 64-bit Linux target.

We have tested this port on a HiFive Unleashed development board
using a downstream LLVM that has support for the missing bits in
upstream. As of now, all tests are passing, including OMPT.

Patch by Ferran Pallarès!

Differential Revision: https://reviews.llvm.org/D59880

llvm-svn: 367021
2019-07-25 14:36:20 +00:00
Jonas Hahnfeld 6e40ae8f3d [libomptarget] Handle offload policy in push_tripcount
If the first target region in a program calls the push_tripcount
function, libomptarget didn't handle the offload policy correctly.
This could lead to unexpected error messages as seen in
http://lists.llvm.org/pipermail/openmp-dev/2019-June/002561.html

To solve this, add a check calling IsOffloadDisabled() as all other
entry points already do. If this method returns false, libomptarget
is effectively disabled.

Differential Revision: https://reviews.llvm.org/D64626

llvm-svn: 366810
2019-07-23 14:20:48 +00:00
Jonas Hahnfeld a2748c74d6 [OMPT] Cleanup reset of exit_frame pointer
This is done at call-site and does not need to be handled in
__kmp_invoke_microtask. It was already absent from the x86
and x86_64 assembly, this patch removes it from the generic
implementation in z_Linux_util.cpp and adds documentation for
AArch64 and PPC64 that it's actually not needed. I can't test
on these architectures, so I don't want to change the code just
because it looks right :)

While at it, rename some variables for consistency and add a
check in test/ompt/parallel/normal.c that the pointer was reset
before entering the barrier.

Differential Revision: https://reviews.llvm.org/D64442

llvm-svn: 366721
2019-07-22 18:46:02 +00:00
Jonas Hahnfeld 4138b2f167 Delete empty file
This is a left-over from r356288 which was reviewed in D58989.

llvm-svn: 366716
2019-07-22 18:11:06 +00:00
Alexey Bataev da43861b4a [OpenMP][libomptarget] Suppress C++ 11 related warnings when building libomptarget-nvptx bitcode library, by Doru Bercea.
Summary: Pass -std=c++11 flag to compiler to suppress C++ 11 related warnings when building NVPTX bitcode library.

Reviewers: ABataev, caomhin, Hahnfeld

Reviewed By: ABataev, Hahnfeld

Subscribers: jdoerfert, Hahnfeld, jholewinski, mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55772

llvm-svn: 366438
2019-07-18 13:54:01 +00:00
Ron Lieberman 59532488b1 [OPENMP] Resolve lost LoopTripCnt for subsequent loops in same thread.
Remove loopTripCnt from threaded device stack after consuming it.
Added a libomptarget DP message to aid in future debugging and to
validate the added testcase, which only runs in Debug build.

Differential Revision: https://reviews.llvm.org/D64808

llvm-svn: 366349
2019-07-17 17:07:52 +00:00
Jonathan Peyton aa5cdafa40 Remove REQUIRES OMP spec version within lit tests
This is a follow up patch to D64534 (r365963) which removed all OMP
spec versioning within the OpenMP runtime codebase.  This patch removes
REQUIRES: openmp-x.y lines from lit tests.

llvm-svn: 366341
2019-07-17 15:41:00 +00:00
Jonas Hahnfeld 1ff5535785 [OpenMP] Move header inclusion out of 'extern "C"'
This leads to problems when compiling C++ code with libc++ for Nvidia GPUs
because Clang now uses wrappers for math functions that might include
C++ templates not allowed in 'extern "C"'.

Differential Revision: https://reviews.llvm.org/D64625

llvm-svn: 366229
2019-07-16 17:16:43 +00:00
Alexey Bataev 85b9651edd [OPENMP][NVPTX]Fixed checks for cuda versions.
Summary:
We used CUDART_VERSION macro to check for the installed cuda version
but this macro is defined in cuda_runtime_api.h, which is not used by
project. Better to use CUDA_VERSION macro, which is defined in cuda.h.
Also, added the check if this macro is defined. If macro is undefined,
there is something wrong with the cuda configuration and we should not
continue the compilation.
This also fixes problems with runtime building in cuda 10+.

Reviewers: grokos

Subscribers: guansong, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64648

llvm-svn: 366224
2019-07-16 16:07:10 +00:00
Alexey Bataev 42816107f7 [OPENMP]Fix threadid in __kmpc_omp_taskwait call for dependent target calls.
Summary:
We used to call __kmpc_omp_taskwait function with global threadid set to
0. It may crash the application at the runtime if the thread executing
target region is not a master thread.

Reviewers: grokos, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64571

llvm-svn: 366220
2019-07-16 15:51:32 +00:00
Jonathan Peyton e4b4f994d2 [OpenMP] Remove OMP spec versioning
Remove all older OMP spec versioning from the runtime and build system.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D64534

llvm-svn: 365963
2019-07-12 21:45:36 +00:00
Jonas Hahnfeld aca476b296 [libomptarget] Fix typos and grammar in error messages, NFC.
llvm-svn: 365890
2019-07-12 10:21:55 +00:00
Jonas Hahnfeld 2dfc5179f6 [libomptarget-nvptx] Remove dead functions
These entry points are never called by Clang trunk nor clang-ykt. If
XL doesn't use them either, they can finally go away.

Differential Revision: https://reviews.llvm.org/D52700

llvm-svn: 365817
2019-07-11 20:12:51 +00:00
Andrey Churbanov 28f44040cc NFC: fixed typo #ifdef --> #if to allow macro set to 0 work correctly
llvm-svn: 365642
2019-07-10 15:09:37 +00:00
Alexey Bataev 4ad9286a57 [OPENMP]Rename loopTripCnt member data to LoopTripCnt, NFC.
Rename variable to follow LLVM coding standard.

llvm-svn: 365368
2019-07-08 18:45:48 +00:00
Alexey Bataev 060921dee7 [OPENMP]Make __kmpc_push_tripcount thread safe.
Summary:
__kmpc_push_tripcount function is not thread safe and may lead to data
race when the target regions are executed in parallel threads. The patch
makes loopTripCnt counter thread aware and stores the tripcount value
per thread in the map. Access to map is guarded by mutex to prevent
data race in the map itself.
Test is for NVPTX target because it does not work correctly on the
host. Seems to me, there is a problem in libomp with target regions in
the parallel threads.
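
A minimal sketch of the idea, assuming a map keyed by thread id and a single mutex. `TripCountTable` and its members are hypothetical names for illustration and are not the libomptarget data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>

// Hypothetical stand-in for per-device state; not the libomptarget type.
class TripCountTable {
public:
  // Record the trip count pushed by the calling thread.
  void push(std::uint64_t trip_count) {
    std::lock_guard<std::mutex> guard(mutex_);
    counts_[std::this_thread::get_id()] = trip_count;
  }

  // Return the calling thread's trip count, or 0 if it never pushed one.
  std::uint64_t lookup() {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = counts_.find(std::this_thread::get_id());
    return it == counts_.end() ? 0 : it->second;
  }

private:
  std::mutex mutex_;                                  // guards counts_
  std::map<std::thread::id, std::uint64_t> counts_;   // one entry per thread
};
```

Each thread only ever reads or writes its own entry, so a value pushed by one thread cannot be picked up by a target region launched from another; the mutex protects the map structure itself.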

Reviewers: grokos

Subscribers: guansong, jfb, jdoerfert, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64080

llvm-svn: 365332
2019-07-08 15:30:23 +00:00
Andrey Churbanov a23806e67a Create a runtime option to disable task throttling.
Patch by viroulep (Philippe Virouleau)

Differential Revision: https://reviews.llvm.org/D63196

llvm-svn: 364934
2019-07-02 15:10:20 +00:00
Andrey Churbanov e7b2c64a6e Cleanup of unused code
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D63891

llvm-svn: 364925
2019-07-02 13:45:40 +00:00
Alexey Bataev bb55ece269 [OPENMP][NVPTX]Relax flush directive.
Summary:
According to the OpenMP standard, flush makes a thread’s temporary view of memory consistent with memory and enforces an order on the memory operations of the variables explicitly specified or implied.

According to the CUDA toolkit documentation (https://docs.nvidia.com/cuda/archive/8.0/cuda-c-programming-guide/index.html#memory-fence-functions), the __threadfence() function provides the required functionality.

__threadfence_system() also provides the required functionality, but it
includes some extra functionality as well, like synchronization of
page-locked host memory, synchronization for the host, etc. This is not
required by the standard, so we can use the more relaxed version of the
memory fence operation.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62397

llvm-svn: 364572
2019-06-27 18:33:09 +00:00
Andrey Churbanov b7e6c37efe Fixed memory use-after-free problem.
Bug reported in https://bugs.llvm.org/show_bug.cgi?id=42269.
Freeing of the contention group (CG) structure by the master thread looks wrong,
because workers can leave the CG later on. Instead, the freeing
is now done by the last thread leaving the CG.
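
The "last one out frees it" pattern can be sketched like this. It is a simplified illustration under stated assumptions: `ContentionGroup` and `leave_group` are hypothetical names, and an atomic counter is used here for the member count, which is not necessarily how the runtime tracks it.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified contention group: only a member count.
struct ContentionGroup {
  explicit ContentionGroup(int n) : members(n) {}
  std::atomic<int> members;
};

// Called by each thread as it leaves the group. Returns true when the
// caller was the last member and therefore freed the structure.
bool leave_group(ContentionGroup *cg) {
  assert(cg != nullptr);
  // fetch_sub returns the value before the decrement, so 1 means the
  // caller is the last thread still in the group.
  if (cg->members.fetch_sub(1) == 1) {
    delete cg;
    return true;
  }
  return false;
}
```

Because the thread that observes the count reaching zero is the one that frees, no other member can still hold a live reference at that point.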

Differential Revision: https://reviews.llvm.org/D63599

llvm-svn: 364456
2019-06-26 18:11:26 +00:00
Gheorghe-Teodor Bercea aace6d285d [OpenMP][libomptarget] Add support for declare target to clause under unified memory
Summary:
This patch adds support for handling variables under the:

```
#pragma omp declare target to()
```

clause when the 

```
#pragma omp requires unified_shared_memory
```

is used.

The address of the host variable is copied into the device pointer just like for the declare target link case.

Reviewers: ABataev, caomhin, grokos, AlexEichenberger

Reviewed By: grokos

Subscribers: jcownie, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D63106

llvm-svn: 363825
2019-06-19 15:48:10 +00:00
Alexey Bataev 8a2bd361eb [OPENMP][CUDA]Use __syncthreads when compiled by nvcc and clang >= 9.0.
Summary:
The problems with __syncthreads() were fixed in clang >= 9.0 and the
original __syncthreads() can be used instead of the ptx instruction.

Reviewers: grokos

Subscribers: guansong, jdoerfert, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D63515

llvm-svn: 363807
2019-06-19 14:20:34 +00:00
Andrey Churbanov 405037c4e6 New implementation of OpenMP 5.0 detached tasks.
Patch by Alex Duran

Differential Revision: https://reviews.llvm.org/D62485

llvm-svn: 363799
2019-06-19 13:23:28 +00:00
Gheorghe-Teodor Bercea b48e44a65c [OpenMP] Add task alloc function
Summary: Add the target task allocation function to the interface.

Reviewers: ABataev, AlexEichenberger, caomhin, jlpeyton, AndreyChurbanov, RaviNarayanaswamy, hbae

Reviewed By: AlexEichenberger, hbae

Subscribers: hbae, RaviNarayanaswamy, cfe-commits, Hahnfeld, guansong, jdoerfert, openmp-commits

Tags: #openmp, #clang

Differential Revision: https://reviews.llvm.org/D63010

llvm-svn: 363449
2019-06-14 20:15:15 +00:00
Andrey Churbanov d47f5488cf Added propagation of not big initial stack size of master thread to workers.
Currently implemented only for non-Windows 64-bit platforms.

Differential Revision: https://reviews.llvm.org/D62488

llvm-svn: 362618
2019-06-05 16:14:47 +00:00
Gheorghe-Teodor Bercea c5fe030c16 [OpenMP][libomptarget] Enable usage of unified memory for declare target link variables
Summary: This patch enables the usage of a host variable on the device for declare target link variables when unified memory is available.

Reviewers: ABataev, caomhin, grokos

Reviewed By: grokos

Subscribers: Hahnfeld, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60884

llvm-svn: 362505
2019-06-04 15:05:53 +00:00
Andrey Churbanov 3f786dab0e Fixed build warning with -DLIBOMP_USE_HWLOC=1
Made the type of the depth of the hwloc object correspond to the
change from unsigned in hwloc 1.x to int in hwloc 2.x.
This eliminates the warning on signed-unsigned comparison.

Differential Revision: https://reviews.llvm.org/D62332

llvm-svn: 362401
2019-06-03 14:21:59 +00:00
Hansang Bae ec1b4d1f6f Fix OMP_TARGET_OFFLOAD parsing
Current parsing allows trailing string after the permitted value,
MANDATORY|DISABLED|DEFAULT -- e.g., "mandatorynot" is also recognized
as "MANDATORY". Such cases should be recognized as incorrect/unknown
value.
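
A minimal sketch of whole-string matching, assuming a case-insensitive comparison of the complete value. `OffloadPolicy` and `parse_offload_policy` are hypothetical names for illustration, not the runtime's parser.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class OffloadPolicy { Default, Disabled, Mandatory, Unknown };

// Case-insensitive comparison of two whole strings (not a prefix match).
static bool equals_ignore_case(const std::string &a, const std::string &b) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (std::tolower(static_cast<unsigned char>(a[i])) !=
        std::tolower(static_cast<unsigned char>(b[i])))
      return false;
  }
  return true;
}

// Hypothetical parser: anything other than an exact permitted value,
// including a permitted value followed by trailing text, is Unknown.
OffloadPolicy parse_offload_policy(const std::string &value) {
  if (equals_ignore_case(value, "MANDATORY"))
    return OffloadPolicy::Mandatory;
  if (equals_ignore_case(value, "DISABLED"))
    return OffloadPolicy::Disabled;
  if (equals_ignore_case(value, "DEFAULT"))
    return OffloadPolicy::Default;
  return OffloadPolicy::Unknown;
}
```

With this shape, "mandatorynot" is rejected because the length check fails before any character comparison succeeds.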

Differential Revision: https://reviews.llvm.org/D62431

llvm-svn: 362125
2019-05-30 18:35:07 +00:00
Hansang Bae 7c75ac0c60 Add checks before pointer dereferencing
This change adds checks before dereferencing a pointer returned from a
function.

Differential Revision: https://reviews.llvm.org/D62224

llvm-svn: 362111
2019-05-30 16:32:20 +00:00
Michal Gorny a815cbb010 [openmp] [test] Skip kernel-breaking tests on NetBSD
The omp_taskloop_num_tasks and omp_taskwait have deadlooped
on the NetBSD buildbot previously, practically hanging the host running
it.  Disable them until we can find a good solution, or make the kernel
less fragile.

llvm-svn: 361825
2019-05-28 14:10:47 +00:00
Alexey Bataev e1947b84c1 Revert "[OPENMP][NVPTX]Fix barriers and parallel level counters, NFC."
This reverts commit r361421 to split the patch into 3 parts.

llvm-svn: 361638
2019-05-24 14:06:47 +00:00
Alexey Bataev 9d9e406684 [OPENMP][NVPTX]Fix barriers and parallel level counters, NFC.
Summary:
Parallel level counter should be volatile to prevent some dangerous
optimizations by ptxas. Otherwise, ptxas optimizations lead to
undefined behaviour in some cases.
Also, use __threadfence() for #pragma omp flush and if the barrier
should not be used (we have only one thread in the team), still perform
flush operation since the standard requires implicit flush when
executing barriers.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62199

llvm-svn: 361421
2019-05-22 19:50:32 +00:00
Andrey Churbanov 184ef0a0a6 Fixed third issue reported in https://bugs.llvm.org/show_bug.cgi?id=41584.
Removed wrong debug assertion.

Differential Revision: https://reviews.llvm.org/D62251

llvm-svn: 361408
2019-05-22 16:48:05 +00:00
Jonathan Peyton 3057c3a092 [OpenMP] Add implementation to two OMPT API routines
This change adds implementation to ompt_finalize_tool() and
ompt_get_task_memory().

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D61657

llvm-svn: 361309
2019-05-21 20:51:05 +00:00
Gheorghe-Teodor Bercea 9e9c918259 [OpenMP][libomptarget] Enable requires flags for target libraries.
Summary:
Target link variables are currently implemented by creating a copy of the variables on the device side and unified memory never gets exploited.

When the program uses the:

```
#pragma omp requires unified_shared_memory
```

directive in conjunction with a declare target link, the linked variable is no longer allocated on the device and the host version is used instead.

This behavior is overridden by performing an explicit mapping.

A Clang side patch is required.

Reviewers: ABataev, AlexEichenberger, grokos, Hahnfeld

Reviewed By: AlexEichenberger, grokos, Hahnfeld

Subscribers: Hahnfeld, jfb, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60223

llvm-svn: 361294
2019-05-21 19:35:02 +00:00
Joachim Protze 4109d5606e [OpenMP][OMPT] Fix locking testcases for 32 bit architectures
https://reviews.llvm.org/D58454 did not fix the problem for a typical use
case of building LLVM with gcc or icc and then testing with the newly built
clang compiler.
The compilers do not agree on how to extend a 32-bit pointer to uint64, so
make the pointer unsigned first, before adjusting the size.
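
The conversion order described above can be sketched as below. `pointer_to_u64` is a hypothetical helper name; the point is only that the pointer goes through the unsigned, pointer-sized `uintptr_t` before being widened, so the widening step is a zero extension on every compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert a pointer to a 64-bit unsigned integer.
// Step 1: pointer -> uintptr_t (unsigned, same width as the pointer).
// Step 2: uintptr_t -> uint64_t (widening an unsigned value zero-extends).
std::uint64_t pointer_to_u64(const void *p) {
  return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(p));
}
```

Casting the pointer directly to a 64-bit type on a 32-bit target leaves the extension behavior to the compiler, which is where the disagreement mentioned above comes from.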

Patch by Joachim Protze

Differential Revision: https://reviews.llvm.org/D58506

llvm-svn: 361158
2019-05-20 14:21:42 +00:00
Joachim Protze 48b8a4b519 [OMPT] Handling of the events of initial-task-begin and initial-task-end
OpenMP 5.0 says that the callback for the events initial-task-begin and
initial-task-end has to be ompt_callback_implicit_task.

Patch by Tim Cramer

Differential Revision: https://reviews.llvm.org/D58776

llvm-svn: 361157
2019-05-20 14:21:36 +00:00
Andrey Churbanov f8f788b205 Fixed second issue reported in https://bugs.llvm.org/show_bug.cgi?id=41584.
Added synchronization for possible concurrent initialization of mutexes
by multiple threads. The need of synchronization caused by commit r357927
which added the use of mutexes at threads movement to/from common pool
(earlier the mutexes were used only at suspend/resume).

Patch by Johnny Peyton.

Differential Revision: https://reviews.llvm.org/D61995

llvm-svn: 360919
2019-05-16 17:52:53 +00:00
Paul Osmialowski 0732fcc7d5 Fix hwloc topology traversal code unable to handle situation where L2 cache is common for the packages
Currently cores within package that share the same L2 cache are grouped together.
The current logic behind this assumes that the L2 cache is always at deeper
(or the same) level than the package itself. In case when L2 cache is common
for all packages (and the packages are at deeper level than L2 cache) the whole of
the further topology discovery fails to find any computational units resulting in
following assertion:

Assertion failure at kmp_affinity.cpp(715): nActiveThreads == __kmp_avail_proc.
OMP: Error #13: Assertion failure at kmp_affinity.cpp(715).

This patch adds a bit of logic that prevents such a situation from occurring.

Differential Revision: https://reviews.llvm.org/D61796

llvm-svn: 360890
2019-05-16 13:16:24 +00:00
Andrey Churbanov 6ebb785bb1 Fixed https://bugs.llvm.org/show_bug.cgi?id=41584.
Removed unconditional and unsafe decrement of counter 
of active threads in pool at shutdown time.

Differential Revision: https://reviews.llvm.org/D61944

llvm-svn: 360784
2019-05-15 16:53:45 +00:00
Andrey Churbanov 22405f3097 Introduce new OpenMP 5.0 depend object type.
The implementation should be done by compiler, user can only declare
objects of this type and use them in OpenMP directives.

Differential Revision: https://reviews.llvm.org/D61860

llvm-svn: 360774
2019-05-15 13:45:36 +00:00
Eli Friedman 025df3b827 [OpenMP][AArch64] Fix compile with LLVM trunk.
The code is currently using the ambiguous instruction
"sub sp, sp, w9, lsl #4". The ARM reference manual says this isn't
valid, and it's not clear whether it's supposed to mean uxtw or uxtx.

It doesn't matter which instruction we use here, since the high
bits of the operand are zero anyway, so I arbitrarily choose uxtw, to
preserve the register name.

See https://reviews.llvm.org/D60840 for the LLVM patch.

Differential Revision: https://reviews.llvm.org/D61770

llvm-svn: 360711
2019-05-14 21:44:54 +00:00
Andrey Churbanov 1aaf2a3c18 fixed typo made by commit r360595
llvm-svn: 360602
2019-05-13 17:04:32 +00:00
Andrey Churbanov 7f63e8c0a6 Fixed creation of aliases in Windows build.
Changed file extension of the destination of the copy of libomp.lib
(it was mistakenly .dll, now it is .lib) in installation on Windows.

Differential Revision: https://reviews.llvm.org/D61673

llvm-svn: 360595
2019-05-13 16:07:37 +00:00
Alexey Bataev f9e00db818 [OPENMP][NVPTX]Simplify handling of thread limit, NFC.
Summary:
Patch improves performance of the full runtime mode by moving the
threads limit counter to the shared memory. It also saves
global memory.

Reviewers: grokos, kkwli0, gtbercea

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61801

llvm-svn: 360584
2019-05-13 14:21:46 +00:00
Alexey Bataev f62c266de7 [OPENMP][NVPTX]Improve number of threads counter, NFC.
Summary:
Patch improves performance of the full runtime mode by moving the
number-of-threads counter to the shared memory. It also saves
global memory.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61785

llvm-svn: 360457
2019-05-10 18:56:05 +00:00
Jonathan Peyton c107332583 [OpenMP] Workaround gfortran bugzilla build bug 41755
This patch provides workaround to allow gfortran to compile the
OpenMP Fortran modules.

From the gfortran manual:
https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gfortran/BOZ-literal-constants.html

"Note that initializing an INTEGER variable with a statement such as
DATA i/Z'FFFFFFFF'/ will give an integer overflow error rather than the desired
result of -1 when i is a 32-bit integer on a system that supports 64-bit
integers. The -fno-range-check option can be used as a workaround for legacy
code that initializes integers in this manner."

Bug filed: https://bugs.llvm.org/show_bug.cgi?id=41755

Differential Revision: https://reviews.llvm.org/D61603

llvm-svn: 360299
2019-05-08 23:12:31 +00:00
Dimitry Andric 181aff63fb Add non-SSE wrapper for __kmp_{load,store}_mxcsr
Summary:
To be able to successfully build OpenMP on FreeBSD/i386, which still
uses i486 as its default processor, I had to provide wrappers for the
`__kmp_load_mxcsr` and `__kmp_store_mxcsr` functions.

If the compiler signals that SSE is not available, loading and storing
mxcsr does not make sense anyway, so in that case the inline functions
are empty.  This gives the minimum amount of code churn.
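
A rough sketch of such wrappers, assuming the compiler signals SSE availability through the `__SSE__` macro (as GCC and Clang do) and using the `_mm_getcsr`/`_mm_setcsr` intrinsics from `<xmmintrin.h>`. The names `load_mxcsr` and `store_mxcsr` are hypothetical stand-ins, not the runtime's definitions.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

#if defined(__SSE__)
// SSE available: read and write the MXCSR control/status register.
static inline void load_mxcsr(const std::uint32_t *p) { _mm_setcsr(*p); }
static inline void store_mxcsr(std::uint32_t *p) { *p = _mm_getcsr(); }
#else
// No SSE: there is no MXCSR register, so both wrappers do nothing.
static inline void load_mxcsr(const std::uint32_t *) {}
static inline void store_mxcsr(std::uint32_t *) {}
#endif
```

Callers stay unchanged in both configurations, which is what keeps the code churn small.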

See also https://svnweb.freebsd.org/changeset/base/345283

Reviewers: emaste, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: hfinkel, krytarowski, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60916

llvm-svn: 360062
2019-05-06 17:58:03 +00:00
Alexey Bataev a857e31011 [OPENMP][NVPTX]Improve thread limit counter, NFC.
Summary:
Patch improves performance of the full runtime mode by moving the
thread-limit counter to the shared memory. It also saves
global memory.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61526

llvm-svn: 359922
2019-05-03 20:00:38 +00:00
Alexey Bataev e031e17919 [OPENMP][NVPTX]Improved several standard OpenMP functions, NFC.
Summary:
Used parallelLevel[] counter to simplify and improve implementation of
the existing standard OpenMP functions. Functions are tested already in
several tests, the patch is NFC.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61459

llvm-svn: 359892
2019-05-03 14:47:20 +00:00
Alexey Bataev 8ccb8f8647 [OPENMP][NVPTX]Improve code by using parallel level counter.
Summary:
Previously, whenever we needed the active/common parallel level with the
full runtime, we iterated over all the records to calculate it. Instead,
we can use the warp-based parallel level counters used in no-runtime
mode.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61395

llvm-svn: 359822
2019-05-02 20:05:01 +00:00
Alexey Bataev 4ad6dbc5fd [OPENMP][NVPTX]Improve omp_get_max_threads() function.
Summary:
Function omp_get_max_threads() can always return 1 if current execution
mode is SPMD.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61379

llvm-svn: 359792
2019-05-02 14:52:52 +00:00
Alexey Bataev 8e6bf88cf7 [OPENMP][NVPTX]Improved omp_get_thread_limit() function.
Summary:
Function omp_get_thread_limit() in SPMD mode can return the maximum
available number of threads as a result.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61378

llvm-svn: 359790
2019-05-02 14:46:32 +00:00
Dimitry Andric 147ce2334c Enable OpenMP build for 32-bit FreeBSD
Summary:
To be able to successfully build OpenMP on 32-bit FreeBSD, such as
FreeBSD/i386, I first had to provide a few wrappers (see D60916), and
then add `KMP_OS_FREEBSD` to the list of defines checked for 32-bit
architectures in `kmp_runtime.cpp`.

I have successfully built libomp.so and ran a bunch of test programs on
FreeBSD/i386 with this.

See also https://svnweb.freebsd.org/changeset/base/345283

Reviewers: emaste, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: krytarowski, guansong, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60917

llvm-svn: 359716
2019-05-01 19:32:58 +00:00
Jonathan Peyton a8426ac8c2 [OpenMP] Implement task modifier for reduction clause
Implemented task modifier in two versions - one without taking into account
omp_orig variable (the omp_orig still can be processed by compiler without help
of the library, but each reduction object will need separate initializer with
global access to omp_orig), another with omp_orig variable included into
interface (single initializer can be used for multiple reduction objects of
the same type). Second version can be used when the omp_orig is not globally
accessible, or to optimize code in case of multiple reduction objects
of the same type.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D60976

llvm-svn: 359710
2019-05-01 17:54:01 +00:00
Jonathan Peyton 71abe28e81 [OpenMP] Add OpenMP 5.0 nonmonotonic code
This patch adds:
* New omp_sched_monotonic flag to omp_sched_t which is handled within the runtime
* Parsing of monotonic/nonmonotonic in OMP_SCHEDULE
* Tests for the monotonic flag and envirable parsing
* Logic to force monotonic when hierarchical scheduling is used

Differential Revision: https://reviews.llvm.org/D60979

llvm-svn: 359601
2019-04-30 19:20:35 +00:00
Jonathan Peyton 1ca746170b [OpenMP] Eliminate some compiler warnings
* Remove accidental == for =
* Assign values to variables to appease compiler
* Surround debug code with KMP_DEBUG
* Remove unused local typedefs

Differential Revision: https://reviews.llvm.org/D60983

llvm-svn: 359599
2019-04-30 19:13:37 +00:00
Alexey Bataev c03fe73176 [OPENMP][NVPTX]Correctly handle L2 parallelism in SPMD mode.
Summary:
The parallelLevel counter must be on per-thread basis to fully support
L2+ parallelism, otherwise we may end up with undefined behavior.
Introduce the parallelLevel on per-warp basis using shared memory. It
avoids the problems with the synchronization and allows full
support for L2+ parallelism in SPMD mode with no runtime.

Reviewers: gtbercea, grokos

Subscribers: guansong, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60918

llvm-svn: 359341
2019-04-26 19:30:34 +00:00
Dimitry Andric 87e7f895bb Use correct way to test for MIPS arch after rOMP355687
Summary:
I ran into some issues after rOMP355687, where __atomic_fetch_add was
being used incorrectly on x86, and this turns out to be caused by the
following added conditionals:

```
#if defined(KMP_ARCH_MIPS)
```

The problem is, these macros are always defined, and are either 0 or 1
depending on the architecture.  E.g. the correct way to test for MIPS
is:

```
#if KMP_ARCH_MIPS
```
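
A small self-contained illustration of the difference, using hypothetical `DEMO_ARCH_*` macros in place of the runtime's `KMP_ARCH_*` ones (the values below model a non-MIPS build):

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's KMP_ARCH_* macros: each one is
// always defined, and its value (0 or 1) selects the architecture.
#define DEMO_ARCH_X86_64 1
#define DEMO_ARCH_MIPS 0

// Wrong: defined() only asks whether the macro exists, so this branch is
// taken on every architecture.
#if defined(DEMO_ARCH_MIPS)
constexpr bool mips_by_defined = true;
#else
constexpr bool mips_by_defined = false;
#endif

// Right: test the macro's value.
#if DEMO_ARCH_MIPS
constexpr bool mips_by_value = true;
#else
constexpr bool mips_by_value = false;
#endif

// Restates the two results for this non-MIPS configuration.
inline void check_arch_tests() {
  assert(mips_by_defined);  // misfires: the MIPS-only branch was selected
  assert(!mips_by_value);   // correct: the MIPS-only branch was skipped
}
```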

Reviewers: petarj, jlpeyton, Hahnfeld, AndreyChurbanov

Reviewed By: petarj, AndreyChurbanov

Subscribers: AndreyChurbanov, sdardis, arichardson, atanasyan, jfb, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60938

llvm-svn: 358911
2019-04-22 19:20:46 +00:00
Alexey Bataev 5de5d74c8d [OPENMP][NVPTX] Fix the test, NFC.
Fix the test so that it really runs in SPMD mode without the runtime.
Previously it ran in SPMD + full runtime mode, which did not allow checking
the functionality correctly.

llvm-svn: 358902
2019-04-22 17:25:31 +00:00
Andrey Churbanov cf5bdb83b0 Fixed memory leak reported in Bugzilla:
https://bugs.llvm.org/show_bug.cgi?id=41494

Freed th_cg_roots structure at exit from uber thread.

Differential Revision: https://reviews.llvm.org/D60729

llvm-svn: 358572
2019-04-17 10:44:28 +00:00
Alexey Bataev 13532ea623 [OPENMP][NVPTX]Fix dynamic scheduling in L2+ SPMD parallel regions.
Summary:
If the kernel is executed in SPMD mode and the L2+ parallel for region
with the dynamic scheduling is executed, dynamic scheduling functions
are called. They expect full runtime support, but SPMD kernels may be
executed without the full runtime. It leads to the runtime crash of the
compiled program. Patch fixes this problem + fixes handling of the
parallelism level in SPMD mode, which is required as part of this patch.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60578

llvm-svn: 358442
2019-04-15 20:15:20 +00:00
Jonathan Peyton 4f21f5f5ce [OpenMP] Exchange code in asm file for inline assembly
This change replaces some of the assembly functions in z_Linux_asm.S
with inline asm in kmp.h. This allows better interaction with compiler
tools and sanitizers.

Differential Revision: https://reviews.llvm.org/D60423

llvm-svn: 358438
2019-04-15 19:19:57 +00:00
Andrey Churbanov 705384be97 Fixed possible out of bound array access.
The check of index value moved to before the write to the array.
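
The general pattern, sketched with a hypothetical fixed-capacity buffer (`DemoBuffer` and `push` are illustrative names, not the code touched by this patch):

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kCapacity = 4;

// Hypothetical fixed-capacity buffer for this sketch.
struct DemoBuffer {
  int data[kCapacity] = {};
  std::size_t used = 0;
};

// Validate the index first, then write; returns false when the buffer is full.
inline bool push(DemoBuffer &b, int value) {
  assert(b.used <= kCapacity);  // invariant maintained by this function
  if (b.used >= kCapacity)
    return false;               // reject before touching the array
  b.data[b.used++] = value;
  return true;
}
```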

Differential Revision: https://reviews.llvm.org/D60471

llvm-svn: 358181
2019-04-11 15:03:44 +00:00
Jonathan Peyton ebf1830bb1 [OpenMP] Implement 5.0 memory management
* Replace HBWMALLOC API with more general MEMKIND API, new functions
  and variables added.
* Have libmemkind.so loaded when accessible.
* Redirect memspaces to default one except for high bandwidth which
  is processed separately.
* Ignore some allocator traits e.g., sync_hint, access, pinned, while
  others are processed normally e.g., alignment, pool_size, fallback,
  fb_data, partition.
* Add tests for memory management

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D59783

llvm-svn: 357929
2019-04-08 17:59:28 +00:00
Jonathan Peyton feac33ebb0 [OpenMP] Clean up load balancing dynamic mode
This patch cleans up the bookkeeping code for the load balancing dynamic mode.

When a thread is moved to or from the thread pool, the th_active_in_pool flag
and the __kmp_thread_pool_active_nth global counter are both updated. This
removes the need for the corrective code in the main wait loop. Another global
counter, __kmp_thread_pool_nth, was removed completely, as it was only used for
debugging, but was not under KMP_DEBUG.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D59508

llvm-svn: 357927
2019-04-08 17:50:02 +00:00
Dimitry Andric 8f2d1eb9e8 After rL357618, quote ${CMAKE_THREAD_LIBS_INIT} so CMake does not
complain when the variable is empty.  Fixes PR 41401.

llvm-svn: 357828
2019-04-05 22:19:40 +00:00
Jonathan Peyton b727d384a3 [OpenMP] Fix hang on Windows
Debug dump on large machine shows when many OpenMP threads (401 in total)
sleep on a barrier, one of the innermost nesting levels sleeps
on a child's b_arrived flag whose value is equal to 4 and is equal to
checker value. i.e., (1) sleep bit is 0, and (2) done_check() would
return true if called.

It is unclear how this might happen. It could be Windows Server 2016's
error of EnterCriticalSection / LeaveCriticalSection, or
error of WaitForSingleObject / SetEvent / ResetEvent, or
error in the library which is very difficult to find.

As a workaround, change the INFINITE wait to a timed wait, so that each
thread awakens every 5 seconds (the timeout was chosen arbitrarily so as not
to disturb other threads much), checks the flag condition under the lock, and
either goes back to sleep or stops sleeping as a result of the check.
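
A portable sketch of the same idea, written with the C++ standard library rather than the Win32 primitives named above. `DemoFlag`, `timed_wait` and `release` are hypothetical names, and the period is a parameter instead of the fixed 5 seconds.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical flag; not the runtime's own flag classes.
struct DemoFlag {
  std::mutex lock;
  std::condition_variable cv;
  bool done = false;
};

// Sleep until the flag is set, but wake at least once per `period` and
// re-check the condition under the lock, so a missed wakeup cannot hang the
// thread forever. Returns the number of timed-out wakeups.
inline int timed_wait(DemoFlag &f, std::chrono::milliseconds period) {
  assert(period.count() > 0);
  int timeouts = 0;
  std::unique_lock<std::mutex> guard(f.lock);
  while (!f.done) {
    if (f.cv.wait_for(guard, period) == std::cv_status::timeout)
      ++timeouts;  // nobody signalled; loop re-checks the flag
  }
  return timeouts;
}

// Set the flag under the lock, then wake every sleeper.
inline void release(DemoFlag &f) {
  {
    std::lock_guard<std::mutex> guard(f.lock);
    f.done = true;
  }
  f.cv.notify_all();
}
```

The cost of the workaround is one spurious wakeup per period for each sleeping thread, which is why a long period is acceptable.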

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D59793

llvm-svn: 357722
2019-04-04 20:35:29 +00:00
Jonathan Peyton d2b53cad18 [OpenMP][Stats] Fix stats gathering for distribute and team clause
The distribute clause needs an explicit push of a timer. The teams
clause needs a timer added and also, similarly to parallel, exchanged
with the serial timer when encountered so that serial regions are
counted properly.

Differential Revision: https://reviews.llvm.org/D59801

llvm-svn: 357621
2019-04-03 18:53:26 +00:00
Dimitry Andric 956168c802 Ensure correct pthread flags and libraries are used
On most platforms, certain compiler and linker flags have to be passed
when using pthreads, otherwise linking against libomp.so might fail with
undefined references to several pthread functions.

Use CMake's `find_package(Threads)` to determine these for standalone
builds, or take them (and optionally modify them) from the top-level
LLVM cmake files.

Also, on FreeBSD, ensure that libomp.so is linked against libm.so,
similar to NetBSD.

Adjust test cases with hardcoded `-lpthread` flag to use the common
build flags, which should now have the required pthread flags.

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim, Hahnfeld

Reviewed By: Hahnfeld

Subscribers: AndreyChurbanov, tra, EricWF, Hahnfeld, jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59451

llvm-svn: 357618
2019-04-03 18:11:36 +00:00
Michael Kruse d97d5ebcfa [libomptarget] Introduce LIBOMPTARGET_ENABLE_DEBUG cmake option.
At the moment, support for runtime debug output using the
OMPTARGET_DEBUG=1 environment variable is only available with
CMAKE_BUILD_TYPE=Debug builds. The patch allows setting it independently
using the LIBOMPTARGET_ENABLE_DEBUG option, which is enabled by default
depending on CMAKE_BUILD_TYPE. That is, unless this option is set
explicitly, nothing changes. This is the same mechanism used by LLVM for
LLVM_ENABLE_ASSERTIONS.

This patch also removes adding -g -O0 in debug builds, it should be
handled by cmake's CMAKE_{C|CXX}_FLAGS_DEBUG configuration option.

Idea by Hal Finkel

Differential Revision: https://reviews.llvm.org/D55952

llvm-svn: 356998
2019-03-26 15:19:15 +00:00
Jonathan Peyton 3bc703d538 [OpenMP] Add LLVM license header to file
This file was missing the LLVM license header

llvm-svn: 356962
2019-03-25 22:36:31 +00:00
Jonathan Peyton 7ca09056c7 [OpenMP] Add Intel 19.0 to list of compilers in kmp_version.cpp
llvm-svn: 356961
2019-03-25 22:31:00 +00:00
Dimitry Andric a70da7f29f Fix interoperability test compilation on FreeBSD
Summary:
While building the 8.0 releases on FreeBSD, I encountered the following
error in the regression tests, where ompt/misc/interoperability.cpp
failed to compile, with:

```
projects/openmp/runtime/test/ompt/misc/interoperability.cpp:7:10: fatal error: 'alloca.h' file not found
#include <alloca.h>
         ^~~~~~~~~~
```

Like on NetBSD, alloca(3) is defined in <stdlib.h> instead.
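
A sketch of a portable include for alloca, assuming the usual predefined platform macros `__FreeBSD__` and `__NetBSD__`; the exact guard used in the test is not reproduced here, and `sum_with_stack_scratch` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__FreeBSD__) || defined(__NetBSD__)
#include <stdlib.h>  // alloca(3) is declared here on these systems
#else
#include <alloca.h>
#endif

// Sums 1..n through a stack-allocated scratch array. n is kept small because
// alloca() cannot report failure.
inline int sum_with_stack_scratch(int n) {
  assert(n > 0 && n <= 1024);
  int *scratch =
      static_cast<int *>(alloca(static_cast<std::size_t>(n) * sizeof(int)));
  for (int i = 0; i < n; ++i)
    scratch[i] = i + 1;
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += scratch[i];
  return total;
}
```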

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim

Reviewed By: jlpeyton

Subscribers: jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59736

llvm-svn: 356936
2019-03-25 18:37:49 +00:00
Dimitry Andric dab9ed87c6 Fix gettid warnings on FreeBSD
Summary:
[Split off from D59451 to get this fix in separately]

While building the 8.0 releases on FreeBSD, I encountered the following
warnings in openmp quite a few times:

```
In file included from projects/openmp/runtime/src/kmp_settings.cpp:27:
projects/openmp/runtime/src/kmp_wrapper_getpid.h:35:2: warning: #warning is a language extension [-Wpedantic]
#warning No gettid found, use getpid instead
 ^
projects/openmp/runtime/src/kmp_wrapper_getpid.h:35:2: warning: No gettid found, use getpid instead [-W#warnings]
2 warnings generated.
```

I added a gettid wrapper that uses FreeBSD's pthread_getthreadid_np(3)
function for this.

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim

Reviewed By: jlpeyton

Subscribers: jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59735

llvm-svn: 356934
2019-03-25 18:37:14 +00:00
Jonathan Peyton 61708b1e94 [OpenMP] Fix pause check with version info
Add 5.0 guard to pause code for now.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D59428

llvm-svn: 356933
2019-03-25 18:17:55 +00:00
Jonathan Peyton 6622732d9a [OpenMP] Fix OMPT cancellation test for GOMP
The GOMP sections interface uses schedule(dynamic) dispatch so it cannot
be assumed which thread executes the cancel and which thread executes
the cancellation point.  This patch allows either thread to execute either
section.

llvm-svn: 356302
2019-03-15 21:24:45 +00:00
Jonathan Peyton 5af1c22d0b [OpenMP] Add missing parenthesis in Perl module
llvm-svn: 356289
2019-03-15 18:27:14 +00:00
Jonathan Peyton 44b476c141 [OpenMP] Remove deprecated taskq
Remove very old, unused, and deprecated taskq code.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58989

llvm-svn: 356288
2019-03-15 18:24:59 +00:00
Jonathan Peyton 529e0d2ea4 [OpenMP][stats] Update stats gathering macros
llvm-svn: 355739
2019-03-08 21:23:34 +00:00
Petar Jovanovic bc3cda1526 [mips] Use libatomic instead of GCC intrinsics for 64bit
The following GCC intrinsics are not available on MIPS32:

__sync_fetch_and_add_8
__sync_fetch_and_and_8
__sync_fetch_and_or_8
__sync_val_compare_and_swap_8

Replace these with appropriate libatomic implementation.

Patch by Miodrag Dinic.

Differential Revision: https://reviews.llvm.org/D45691

llvm-svn: 355687
2019-03-08 10:53:19 +00:00
Shoaib Meenai 5be71faf4b [build] Rename clang-headers to clang-resource-headers
Summary:
The current install-clang-headers target installs clang's resource
directory headers. This is different from the install-llvm-headers
target, which installs LLVM's API headers. We want to introduce the
corresponding target to clang, and the natural name for that new target
would be install-clang-headers. Rename the existing target to
install-clang-resource-headers to free up the install-clang-headers name
for the new target, following the discussion on cfe-dev [1].

I didn't find any bots on zorg referencing install-clang-headers. I'll
send out another PSA to cfe-dev to accompany this rename.

[1] http://lists.llvm.org/pipermail/cfe-dev/2019-February/061365.html

Reviewers: beanz, phosek, tstellar, rnk, dim, serge-sans-paille

Subscribers: mgorny, javed.absar, jdoerfert, #sanitizers, openmp-commits, lldb-commits, cfe-commits, llvm-commits

Tags: #clang, #sanitizers, #lldb, #openmp, #llvm

Differential Revision: https://reviews.llvm.org/D58791

llvm-svn: 355340
2019-03-04 21:19:53 +00:00
Stefan Pintilie a908829bf5 [OPENMP] Deal with additional store inserted by Clang under -fno-PIC for PowerPC.
Changing the default from -fPIC to -fno-PIC on PowerPC exposed an issue in
OpenMP for PowerPC.
The issue is reported here:
https://bugs.llvm.org/show_bug.cgi?id=40082

This is a fix for that issue.
Also removed the XFAIL from the two tests that were failing under -fno-PIC.

Differential Revision: https://reviews.llvm.org/D56286

llvm-svn: 355229
2019-03-01 21:16:45 +00:00
Jonathan Peyton ad1ad7ae8b [OpenMP][OMPT] Distinguish different barrier kinds
This change makes the runtime decide the intended use of each barrier
invocation, for the OMPT synchronization tool callbacks.  The OpenMP 5.0
specification defines four possible barrier kinds -- implicit, explicit,
implementation, and just normal barrier.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D58247

llvm-svn: 355140
2019-02-28 20:55:39 +00:00
Jonathan Peyton 76b45e874d [OpenMP 5.0] Deprecate nest-var and associated features
Nest-var, OMP_NESTED, omp_set_nested(), and omp_get_nested() have been
deprecated in the 5.0 spec. Initial nesting info is now derived from
OMP_MAX_ACTIVE_LEVELS, OMP_NUM_THREADS, and OMP_PROC_BIND.

This patch deprecates the internal ICV that corresponds to nest-var, and
replaces it with the max-active-levels-var ICV to determine nesting. The
change still allows for use of OMP_NESTED (according to 5.0 changes),
omp_get_nested, and omp_set_nested, which have had deprecation messages
added to them. The change allows certain settings of OMP_NUM_THREADS,
OMP_PROC_BIND, and OMP_MAX_ACTIVE_LEVELS to turn on nesting, but
OMP_NESTED=0 will still force nesting to be off.

The runtime now prints informative messages about deprecation of
OMP_NESTED, omp_set_nested(), and omp_get_nested(), when those
environment variables or routines are used. It also prints a deprecation
message in the output of KMP_SETTINGS and OMP_DISPLAY_ENV for OMP_NESTED.
This patch also fixes OMP_DISPLAY_ENV output for OMP_TARGET_OFFLOAD.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58408

llvm-svn: 355138
2019-02-28 20:47:21 +00:00
Jonathan Peyton e47d32f165 [OpenMP] Make use of sched_yield optional in runtime
This patch cleans up the yielding code and makes it optional. An
environment variable, KMP_USE_YIELD, was added. Yielding is still
on by default (KMP_USE_YIELD=1), but can be turned off completely
(KMP_USE_YIELD=0), or turned on only when oversubscription is detected
(KMP_USE_YIELD=2). Note that oversubscription cannot always be detected
by the runtime (for example, when the runtime is initialized and the
process forks, oversubscription cannot be detected currently over
multiple instances of the runtime).
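
The three settings can be pictured as follows. This is only a sketch: `YieldMode`, `parse_yield_mode` and `should_yield` are hypothetical names, treating unrecognized values as the default is an assumption of the sketch, and the oversubscription test (more runnable threads than available processors) is a simplification of whatever the runtime actually checks.

```cpp
#include <cassert>
#include <cstring>

enum class YieldMode { Off = 0, Always = 1, WhenOversubscribed = 2 };

// Map a KMP_USE_YIELD-style string to a mode. An unset variable selects the
// default (1); this sketch also maps unrecognized values to the default.
inline YieldMode parse_yield_mode(const char *value) {
  if (value == nullptr)
    return YieldMode::Always;
  if (std::strcmp(value, "0") == 0)
    return YieldMode::Off;
  if (std::strcmp(value, "2") == 0)
    return YieldMode::WhenOversubscribed;
  return YieldMode::Always;
}

// Decide whether a spinning thread should yield its processor.
inline bool should_yield(YieldMode mode, int runnable_threads,
                         int available_procs) {
  assert(available_procs > 0);
  switch (mode) {
  case YieldMode::Off:
    return false;
  case YieldMode::Always:
    return true;
  case YieldMode::WhenOversubscribed:
    return runnable_threads > available_procs;
  }
  return false;
}
```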

Because yielding can be controlled by user now, the library mode
settings (from KMP_LIBRARY) for throughput and turnaround have been
adjusted by altering blocktime, unless that was also explicitly set.

In the original code, there were a number of places where a double yield
might have been done under oversubscription. This version checks
oversubscription and if that's not going to yield, then it does
the spin check.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58148

llvm-svn: 355120
2019-02-28 19:11:29 +00:00
Jonas Hahnfeld db3025ad57 [OpenMP] Fix check-openmp after r354553
Calling add_openmp_testsuite will add the tests to check-openmp unless
EXCLUDE_FROM_ALL is set. This is problematic because the tests for OMPT
will be included twice which doesn't work if the same test is executed
concurrently by multiple threads.

See:
http://lab.llvm.org:8011/builders/openmp-gcc-x86_64-linux-debian/builds/163
http://lab.llvm.org:8011/builders/openmp-clang-x86_64-linux-debian/builds/184

http://lab.llvm.org:8011/builders/openmp-clang-ppc64le-linux-rhel/builds/133
(On PPC some failures are unrelated to r354553, the bot has been red before
and this commit is not expected to fix that. For a proper patch please see
https://reviews.llvm.org/D56286.)

llvm-svn: 354572
2019-02-21 12:00:57 +00:00
Joachim Protze 8b96fad85c [OpenMP][OMPT] Fix locking testcases for 32 bit architectures
Fix for the bug reported in:
https://bugs.llvm.org/show_bug.cgi?id=40531

The address is now cast the same way as in the runtime code.

Differential Revision: https://reviews.llvm.org/D58454

llvm-svn: 354553
2019-02-21 08:50:49 +00:00
Gheorghe-Teodor Bercea 06e08f0b0a [OpenMP][libomptarget] New reduction scheme for team reductions
Summary:
This patch adds a more sophisticated team reduction scheme to the OpenMP libomptarget-nvptx runtime.

The scheme uses a fixed size global memory buffer whose length can be adjusted via compiler flag:
```
-fopenmp-cuda-teams-reduction-recs-num=1024
```
The global buffer is a structure of arrays (with default size of 1024 each and controlled by the above flag), one array for each reduction variable.

Values in the buffer are processed by the last team to finish executing the body of the target region.

In addition to adding support for the new flag, the compiler also emits special functions used for the reduction of the intermediate reduction values. These changes will be added in a separate compiler patch following this one.
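
A simplified host-side analogue of the "last team to finish" step, for illustration only. It assumes one buffer slot per team and a single summed variable, whereas the real buffer has a fixed number of records per reduction variable; `DemoTeamReduction` and `finish_team` are hypothetical names, and the sketch is plain C++ rather than device code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the global reduction buffer.
struct DemoTeamReduction {
  std::vector<long> slots;               // one partial value per team
  std::atomic<std::size_t> finished{0};  // teams that have stored a value
  long result = 0;

  explicit DemoTeamReduction(std::size_t num_teams) : slots(num_teams, 0) {}

  // Called once per team with its partial value. Returns true only for the
  // last team to finish, which folds every slot into `result`.
  bool finish_team(std::size_t team, long partial) {
    assert(team < slots.size());
    slots[team] = partial;
    // acq_rel: publishes this team's slot and, for the last team, makes all
    // earlier slots visible before they are read.
    const std::size_t done =
        finished.fetch_add(1, std::memory_order_acq_rel) + 1;
    if (done != slots.size())
      return false;
    long sum = 0;
    for (long v : slots)
      sum += v;
    result = sum;
    return true;
  }
};
```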

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D58409

llvm-svn: 354471
2019-02-20 14:55:55 +00:00
Jonathan Peyton 7d2cfa1fd5 [OpenMP] Remove XFAIL for cancellation tests using gcc
llvm-svn: 354370
2019-02-19 19:00:29 +00:00
Jonathan Peyton 154ac075cd [OpenMP 5.0] Add omp_get_supported_active_levels()
This patch adds the new 5.0 API function omp_get_supported_active_levels().

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58211

llvm-svn: 354368
2019-02-19 18:51:11 +00:00
Jonathan Peyton 4fe5271fa0 [OpenMP] Adding GOMP compatible cancellation
Remove fatal error messages from the cancellation API for GOMP
Add __kmp_barrier_gomp_cancel() to implement cancellation of parallel regions.
This new function uses the linear barrier algorithm with a cancellable
nonsleepable wait loop.

Differential Revision: https://reviews.llvm.org/D57969

llvm-svn: 354367
2019-02-19 18:47:57 +00:00
Jonathan Peyton 511092cab0 [OpenMP] Fix broken link to browse sources
llvm-svn: 353858
2019-02-12 17:00:57 +00:00
Jonathan Peyton 2f744592a0 [OpenMP] Remove accidental commit to config-ix.cmake in r353747
llvm-svn: 353748
2019-02-11 21:09:15 +00:00
Jonathan Peyton 65ebfeecf8 [OpenMP] Fix thread_limits to work properly for teams construct
The thread-limit-var and omp_get_thread_limit API was not perfectly handled for
teams construct. Now, when modified by thread_limit clause, omp_get_thread_limit
reports the correct value. In addition, the value is restored when leaving the
teams construct to what it was in the encountering context.

This is done partly by creating the notion of a Contention Group root (CG root)
that keeps track of the thread at the root of each separate CG, the
thread-limit-var associated with the CG, and associated counter of active
threads within the contention group.

thread-limits are passed from master to worker threads via an entry in the ICV
data structure. When a "contention group switch" occurs, a new CG root record is
made and passed from master to worker. A thread could potentially have several
CG root records if it encounters multiple nested teams constructs (but at the
moment the spec doesn't allow for nested teams, so the most one could have
currently is 2). The master of the teams masters gets the thread-limit clause
value stored to its local ICV structure, and the other teams masters copy it
from the master. The thread-limit is set from that ICV copy and restored to the
ICV copy when entering and leaving the teams construct.

This change also fixes a bug when the top-level teams construct team gets
reused, and OMP_DYNAMIC was true, which can cause the expected size of this team
to be smaller than what was actually allocated. The fix updates the size of the
team after its threads were reserved.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D56804

llvm-svn: 353747
2019-02-11 21:04:23 +00:00
Jonas Hahnfeld f26d3e7185 [OMPT] Remove test output from source tree
%s refers to the test file in the source tree. This was accidentally added in
r351197 / 2b46d30 ("[OMPT] Second chunk of final OMPT 5.0 interface updates").

Differential Revision: https://reviews.llvm.org/D58002

llvm-svn: 353715
2019-02-11 16:14:51 +00:00
Taewook Oh 91c32fd8c8 Guard a feature that is unsupported by old GCC
Summary:
As @david2050 commented, changes introduced by https://reviews.llvm.org/D56397 break builds for older compilers
which don't support `__has(_cpp)_attribute`. This is a fix for the break.
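
The usual shape of such a guard, as a sketch: `DEMO_FALLTHROUGH` and `count_from` are hypothetical names rather than the macro added to the runtime, and the fallback definitions follow the common idiom of defining the missing feature-test macros to 0.

```cpp
#include <cassert>

// Older compilers do not provide these feature-test macros; define them to 0
// so the #if chain below stays well-formed.
#ifndef __has_cpp_attribute
#define __has_cpp_attribute(x) 0
#endif
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

// Pick the best available spelling, or expand to a no-op statement.
#if __has_cpp_attribute(fallthrough)
#define DEMO_FALLTHROUGH [[fallthrough]]
#elif __has_attribute(fallthrough)
#define DEMO_FALLTHROUGH __attribute__((fallthrough))
#else
#define DEMO_FALLTHROUGH ((void)0)
#endif

// Counts how many case bodies run when execution enters at `start`.
inline int count_from(int start) {
  int n = 0;
  switch (start) {
  case 0:
    ++n;
    DEMO_FALLTHROUGH;
  case 1:
    ++n;
    DEMO_FALLTHROUGH;
  default:
    ++n;
  }
  return n;
}
```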

Reviewers: protze.joachim, jlpeyton, AndreyChurbanov, Hahnfeld, david2050

Subscribers: openmp-commits, david2050

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D57851

llvm-svn: 353538
2019-02-08 17:15:50 +00:00
Joachim Protze 0c599c388d [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
The three switch fallthroughs generate warnings with -Wimplicit-fallthrough.
Two are documented as fallthrough, one is not, but I think the intention is to also fallthrough in kmp_tasking.cpp.

Not sure whether kmp.h is the best place to define the macro.

Reviewers: jlpeyton, AndreyChurbanov, Hahnfeld

Reviewed By: jlpeyton

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D56397

llvm-svn: 353052
2019-02-04 15:59:42 +00:00
Joachim Protze 32959e683a [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
Redo after revert by hans. The wrong include in one test is fixed.

Make sure that OMPT is enabled in runtime entry points that access internals
of the runtime. Else, return an appropriate value indicating an error or that
the data is not available.

Patch provided by @sconvent

Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze

Reviewed By: joachim.protze

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 352611
2019-01-30 08:41:06 +00:00
James Y Knight 5d71fc5d7b Adjust documentation for git migration.
This fixes most references to the paths:
 llvm.org/svn/
 llvm.org/git/
 llvm.org/viewvc/
 github.com/llvm-mirror/
 github.com/llvm-project/
 reviews.llvm.org/diffusion/

to instead point to https://github.com/llvm/llvm-project.

This is *not* a trivial substitution, because additionally, all the
checkout instructions had to be migrated to instruct users on how to
use the monorepo layout, setting LLVM_ENABLE_PROJECTS instead of
checking out various projects into various subdirectories.

I've attempted to not change any scripts here, only documentation. The
scripts will have to be addressed separately.

Additionally, I've deleted one document which appeared to be outdated
and unneeded:
  lldb/docs/building-with-debug-llvm.txt

Differential Revision: https://reviews.llvm.org/D57330

llvm-svn: 352514
2019-01-29 16:37:27 +00:00
Arnaud A. de Grandmaison f185823668 Remove no longer needed Arm specific words in the LICENSE.txt file.
As the codebase is now under the Apache 2.0 license with LLVM
Exceptions, and all Arm's contributions, past or future, are under that
new license, these Arm-specific words in LICENSE.txt are no longer
needed.

llvm-svn: 352377
2019-01-28 15:42:58 +00:00
Andrey Churbanov efa6b826b4 NFC: fixed formatting to be consistent across the file
llvm-svn: 351748
2019-01-21 16:11:43 +00:00
Andrey Churbanov b8e3643506 Fixed https://reviews.llvm.org/D55078 broken Fortran fixed form.
Long lines split in order to obey Fortran fixed form compilation.

Differential Revision: https://reviews.llvm.org/D57017

llvm-svn: 351745
2019-01-21 15:30:31 +00:00
Chandler Carruth 4a1b95bda0 Fix typos throughout the license files that somehow I and my reviewers
all missed!

Thanks to Alex Bradbury for pointing this out, and the fact that I never
added the intended `legacy` anchor to the developer policy. Add that
anchor too. With hope, this will cause the links to all resolve
successfully.

llvm-svn: 351731
2019-01-21 09:52:34 +00:00
Chandler Carruth 57b08b0944 Update more file headers across all of the LLVM projects in the monorepo
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351648
2019-01-19 10:56:40 +00:00
Chandler Carruth 469bdefd44 Install new LLVM license structure and new developer policy.
This installs the new developer policy and moves all of the license
files across all LLVM projects in the monorepo to the new license
structure. The remaining projects will be moved independently.

Note that I've left odd formatting and other idiosyncrasies of the
legacy license structure text alone to make the diff easier to read.
Critically, note that we do not in any case *remove* the old license
notice or terms, as that remains necessary until we finish the
relicensing process.

I've updated a few license files that refer to the LLVM license to
instead simply refer generically to whatever license the LLVM project is
under, basically trying to minimize confusion.

This is really the culmination of so many people. Chris led the
community discussions, drafted the policy update and organized the
multi-year string of meetings between lawyers across the community to
figure out the strategy. Numerous lawyers at companies in the community
spent their time figuring out initial answers, and then the Foundation's
lawyer Heather Meeker has done *so* much to help refine and get us ready
here. I could keep going on, but I just want to make sure everyone
realizes what a huge community effort this has been from the beginning.

Differential Revision: https://reviews.llvm.org/D56897

llvm-svn: 351631
2019-01-19 06:14:24 +00:00
Hans Wennborg 799b5dcbda Revert r351311 "[OMPT] Make sure that OMPT is enabled when accessing internals of the runtime"
and also the follow-up r351315.

The new test is failing on the buildbots.

> Make sure that OMPT is enabled in runtime entry points that access internals
> of the runtime. Else, return an appropriate value indicating an error or that
> the data is not available.
>
> Patch provided by @sconvent
>
> Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze
>
> Reviewed By: joachim.protze
>
> Tags: #openmp, #ompt
>
> Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 351431
2019-01-17 11:31:03 +00:00
Jonathan Peyton 9b8bb323c9 [OpenMP] Add omp_pause_resource* API
Add omp_pause_resource and omp_pause_resource_all API and enum, plus stub for
internal implementation. Implemented callable helper function to do local pause,
and added basic functionality for hard and soft pause.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D55078

llvm-svn: 351372
2019-01-16 20:07:39 +00:00
Joachim Protze c46bd682ac [OpenMP] Output written by tests should go to build directory
llvm-svn: 351332
2019-01-16 13:06:10 +00:00
Joachim Protze 6b840ccea9 [OpenMP] Remove compiler warning about unused value
The compiler warns about an unused variable/statement:

    runtime/src/kmp_affinity.cpp:4958:18: warning: statement has no effect [-Wunused-value]
       KA_TRACE(1000, ; {
                      ^
    runtime/src/kmp_debug.h:84:24: note: in definition of macro 'KA_TRACE'
         __kmp_debug_printf x;                                                      \
                            ^

Instead of the unused reference to this function, this patch now calls the function
with an empty string. The call to this function should have no effect.

Patch provided by joachim.protze

Reviewers: jlpeyton, hbae, AndreyChurbanov

Reviewed By: AndreyChurbanov

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D56775

llvm-svn: 351323
2019-01-16 11:35:11 +00:00
Joachim Protze c3716617df Fix compiler error in r351311
llvm-svn: 351315
2019-01-16 09:39:42 +00:00
Joachim Protze 582b183dda [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
Make sure that OMPT is enabled in runtime entry points that access internals
of the runtime. Else, return an appropriate value indicating an error or that
the data is not available.

Patch provided by @sconvent

Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze

Reviewed By: joachim.protze

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 351311
2019-01-16 08:58:17 +00:00
Jonathan Peyton 9355d0dc13 [OpenMP] Fix for nested proc_bind affinity bug
Using proc_bind clause on a nested #pragma omp parallel region
with KMP_AFFINITY set causes an assertion error. This assertion occurs because
the place-partition-var is not properly initialized in the nested master threads.
Trying to get an intuitive result with KMP_AFFINITY + proc_bind is difficult
because of how the KMP_AFFINITY gtid-to-place mapping occurs. This
patch creates an initial place list no matter what affinity mechanism is used.
For KMP_AFFINITY, the place-partition-var is initialized to all the places.

Differential Revision: https://reviews.llvm.org/D55795

llvm-svn: 351227
2019-01-15 19:39:32 +00:00
Jonathan Peyton fce3972553 [OpenMP] Add lock function definitions to fix Bug 40042
This change fixes the sanity issue reported in Bug 40042.
Lock function definitions for the three lock kinds were added
to disambiguate calls to the lock functions done directly and indirectly.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40042
Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D56103

llvm-svn: 351224
2019-01-15 19:14:00 +00:00
Jonathan Peyton 1c268554ba [OpenMP][Cmake] Allowed OpenMP testing detect test compiler with same generator
Fix test compiler detection failing for Ninja builds on Windows.

Patch by Peiyuan Song

Differential Revision: https://reviews.llvm.org/D53479

llvm-svn: 351223
2019-01-15 19:08:26 +00:00
Jonathan Peyton dc375486b0 [OpenMP] Fix performance regression in SPEC kdtree test
Make __ompt_implicit_task_end a static function and remove the inline part.  Remove
the unused pId variable.  This fixes a small regression in the SPEC kdtree benchmark.
Also reformat some of __ompt_implicit_task_end.

Differential Revision: https://reviews.llvm.org/D55788

llvm-svn: 351221
2019-01-15 18:57:24 +00:00
Joachim Protze 2b46d30fc7 [OMPT] Second chunk of final OMPT 5.0 interface updates
The omp-tools.h file is generated from the OpenMP spec to ensure that the interface
is implemented as specified.
The other changes are necessary to update the interface implementation to the
final version as published in 5.0.
The omp-tools.h header was previously called ompt.h, currently a copy under this name
is installed for legacy tools.

Patch partially prepared by @sconvent

Reviewers: AndreyChurbanov, hbae, Hahnfeld

Reviewed By: hbae

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D55579

llvm-svn: 351197
2019-01-15 15:36:53 +00:00
Hans Wennborg eb60fbfdb4 Update year in license files
In last year's update (D48219) it was suggested that the release manager
might want to do this, so here we go.

llvm-svn: 351194
2019-01-15 15:10:32 +00:00
Roman Lebedev 06e3950561 [OpenMP] Fix LIBOMP_USE_DEBUGGER=ON build (PR38612)
Summary:
Two things:
1. Those two variables had the wrong signedness, which was resulting in "sign mismatch in comparison" warning.
2. The whole `kmp_debugger.cpp` wasn't being built, or rather, it was being built as-if `USE_DEBUGGER` was off,
   thus, nothing provided the definition of `__kmp_omp_debug_struct_info`, `__kmp_debugging`.
   Makes sense, because `USE_DEBUGGER` is set in `kmp_config.h`, which is not included explicitly.
   It is included by `kmp.h`, but that one is only included inside of the `#if USE_DEBUGGER` block.
   I *think* this is the only source file with this issue,
   everything else seems to `#include` either `kmp.h` or `kmp_config.h`.
   The alternative solution would be to add `add_compile_options(-include kmp_config.h)` in CMake.

I did verify that `__kmp_omp_debug_struct_info` becomes available with this patch.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=38612 | PR38612 ]].

Reviewers: AndreyChurbanov, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: guansong, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55783

llvm-svn: 351019
2019-01-13 12:54:34 +00:00
Gheorghe-Teodor Bercea 1653633a1c [OpenMP][libomptarget] Use shared memory variable for tracking parallel level
Summary: Replace existing infrastructure for tracking parallel level using global memory with a per-team shared memory variable. This minimizes the impact of the overhead of tracking the parallel level for non-nested cases.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D55773

llvm-svn: 350747
2019-01-09 18:30:14 +00:00
Andrey Churbanov b7a8ab3417 Doc: fixed description of a parameter of the __kmpc_taskloop
Patch by sergi.mateo.bellido@gmail.com

Differential Revision: https://reviews.llvm.org/D56432

llvm-svn: 350713
2019-01-09 13:06:23 +00:00