Commit Graph

1858 Commits

Author SHA1 Message Date
Kelvin Li 4f1f2b7a5b [OpenMP] update strings output of libomp.so [NFC]
Change the string from "Intel(R) OMP" to "LLVM OMP" in libomp.so

Differential Revision: https://reviews.llvm.org/D74462
2020-02-12 15:45:55 -05:00
Johannes Doerfert a5153dbc36 [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D74145
2020-02-11 22:07:14 -06:00
Johannes Doerfert 3ff4e2eee8 [OpenMP] Switch default C++ standard to C++ 14
Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D74258
2020-02-11 17:11:54 -06:00
Jonas Devlieghere 4fe839ef3a [CMake] Rename EXCLUDE_FROM_ALL and make it an argument to add_lit_testsuite
EXCLUDE_FROM_ALL means something else for add_lit_testsuite as it does
for something like add_executable. Distinguish between the two by
renaming the variable and making it an argument to add_lit_testsuite.

Differential revision: https://reviews.llvm.org/D74168
2020-02-06 15:33:18 -08:00
Jon Chesterfield 6a82f0f0b9 [libomptarget] Implement wavefront functions for amdgcn
Summary: [libomptarget] Implement wavefront functions for amdgcn

Reviewers: jdoerfert, ABataev, grokos, arsenm

Reviewed By: arsenm

Subscribers: saiislam, wdng, arsenm, jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D73077
2020-02-04 21:55:29 +00:00
protze@itc.rwth-aachen.de 90e4ebdce5 [OpenMP][OMPT] fix reduction test for 32-bit x86
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44733 | TEST 'libomp :: ompt/synchronization/reduction/tree_reduce.c' FAILED on 32-bit x86 ]]

For 32-bit we need at least 3 variables to avoid atomic reduction to be
choosen by runtime function `__kmp_determine_reduction_method`.
This patch adds reduction variables to the testcase.

Reviewers: mgorny, Hahnfeld

Differential Revision: https://reviews.llvm.org/D73850
2020-02-04 12:19:10 +01:00
Jon Chesterfield ab9762a9f5 Revert "[nfc][libomptarget] Remove SHARED annotation from local variables"
This reverts commit 0e9374e374.
Revert D73239. It fails some local testing, cause presently unknown
2020-01-27 20:05:17 +00:00
Michał Górny 3c545e4b73 [openmp] Disable archer if LIBOMP_OMPT_SUPPORT is off
This fixed build failures due to missing ompt headers.

See https://bugs.gentoo.org/700762.

Differential Revision: https://reviews.llvm.org/D73249
2020-01-23 19:26:18 +01:00
Kelvin Li ad24cf2a94 [OpenMP] change omp_atk_* and omp_atv_* enumerators to lowercase [NFC]
The OpenMP spec defines the OMP_ATK_* and OMP_ATV_* to be lowercase.

Differential Revision: https://reviews.llvm.org/D73248
2020-01-23 11:15:44 -05:00
Jon Chesterfield 0e9374e374 [nfc][libomptarget] Remove SHARED annotation from local variables
Summary:
[nfc][libomptarget] Remove SHARED annotation from local variables

A few local variables in reduction.cu were marked SHARED. This patch leaves
all per-kernel global state localised in omp_data.cu.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D73239
2020-01-23 00:00:23 +00:00
Alexey Bataev 9148b8b734 [OpenMP][Offloading] Fix the issue that omp_get_num_devices returns wrong number of devices, by Shiley Tian.
Summary:
This patch is to fix issue in the following simple case:

  #include <omp.h>
  #include <stdio.h>

  int main(int argc, char *argv[]) {
    int num = omp_get_num_devices();
    printf("%d\n", num);

    return 0;
  }

Currently it returns 0 even devices exist. Since this file doesn't contain any
target region, the host entry is empty so further actions like initialization
will not be proceeded, leading to wrong device number returned by runtime
function call.

Reviewers: jdoerfert, ABataev, protze.joachim

Reviewed By: ABataev

Subscribers: protze.joachim

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72576
2020-01-21 13:25:18 -05:00
David Carlier ea99c09963 [OpenMP] affinity little fix for FreeBSD
- pthread affinity np has different semantic than sched affinity counterpart. On success returns strictly 0.

Reviewers: chandlerc, AndreyChurbanov, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72132
2020-01-20 18:52:10 +00:00
Jon Chesterfield 03c2a59cd6 [libomptarget] Implement smid for amdgcn
Summary:
[libomptarget] Implement smid for amdgcn

Implementation is in a new file as it uses an intrinsic with
complicated encoding that warranted substantial comments.

Reviewers: jdoerfert, grokos, ABataev, ronlieb

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72956
2020-01-20 14:52:17 +00:00
Joachim Protze 39f746d8de [OpenMP][Tool] Fix memory leak and double-allocation
Fix the memory leak pointed out in https://reviews.llvm.org/D70412.
And a second one due to double-allocation.

Reviewed by: Hahnfeld

Differential revision: https://reviews.llvm.org/D72779
2020-01-16 10:05:06 -10:00
George Rokos e244145ab0 [LIBOMPTARGET] Do not increment/decrement the refcount for "declare target" objects
The reference counter for global objects marked with declare target is INF. This patch prevents the runtime from incrementing /decrementing INF refcounts. Without it, the map(delete: global_object) directive actually deallocates the global on the device. With this patch, such a directive becomes a no-op.

Differential Revision: https://reviews.llvm.org/D72525
2020-01-14 16:30:38 -08:00
Joachim Protze 2d4571bf30 [OpenMP][Tool] Runtime warning for missing TSan-option
TSan spuriously reports for any OpenMP application a race on the initialization
of a runtime internal mutex:

```
Atomic read of size 1 at 0x7b6800005940 by thread T4:
  #0 pthread_mutex_lock <null> (a.out+0x43f39e)
  #1 __kmp_resume_64 <null> (libomp.so.5+0x84db4)

Previous write of size 1 at 0x7b6800005940 by thread T7:
  #0 pthread_mutex_init <null> (a.out+0x424793)
  #1 __kmp_suspend_initialize_thread <null> (libomp.so.5+0x8422e)
```

According to @AndreyChurbanov this is a false positive report, as the control
flow of the runtime guarantees the ordering of the mutex initialization and
the lock:
https://software.intel.com/en-us/forums/intel-open-source-openmp-runtime-library/topic/530363

To suppress this report, I suggest the use of
TSAN_OPTIONS='ignore_uninstrumented_modules=1'.
With this patch, a runtime warning is provided in case an OpenMP application
is built with Tsan and executed without this Tsan-option.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70412
2020-01-14 09:58:05 -10:00
Jon Chesterfield 2a43688a0a [nfc][libomptarget] Refactor nvptx/target_impl.cu
Summary:
[nfc][libomptarget] Refactor nxptx/target_impl.cu

Use __kmpc_impl_atomic_add instead of atomicAdd to match the rest of the file.
Alternatively, target_impl.cu could use the cuda functions directly. Using a mixture in this
file was an oversight, happy to resolve in either direction.

Removed some comments that look outdated.

Call __kmpc_impl_unset_lock directly to avoid a redundant diagnostic and remove an implict
dependency on interface.h.

Reviewers: ABataev, grokos, jdoerfert

Reviewed By: jdoerfert

Subscribers: jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72719
2020-01-14 19:27:45 +00:00
Jon Chesterfield 2d287bec3c [nfc][libomptarget] Refactor amdgcn target_impl
Summary:
[nfc][libomptarget] Refactor amdgcn target_impl

Removes references to internal libraries from the header
Standardises on C++ mangling for all the target_impl functions
Update comment block
clang-format
Move some functions into a new target_impl.hip source file

This lays the groundwork for implementing the remaining unresolved
symbols in the target_impl.hip source.

Reviewers: jdoerfert, grokos, ABataev, ronlieb

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72712
2020-01-14 19:27:07 +00:00
Joachim Protze ed810da732 [OpenMP][Tool] Improving stack trace for Archer
The OpenMP runtime is not instrumented, so entering the runtime leaves no hint
on the source line of the pragma on ThreadSanitizer's function stack.

This patch adds function entry/exit annotations for OpenMP parallel regions,
and synchronization regions (barrier, taskwait, taskgroup).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70408
2020-01-13 22:14:06 -10:00
Joachim Protze 84637408f2 [OpenMP][Tool] Make tests for archer dependent on TSan
If the openmp project is built standalone, the test compiler is feature tested for an available -fsanitize=thread flag.
If the openmp project is built as part of llvm, the target tsan is needed to test archer.

An additional line (requires tsan) was introduced to the tests, this patch updates the line numbers for the race.

Follow-up for 77ad98c

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D71914
2020-01-13 21:47:58 -10:00
Alexey Bataev b19c0810e5 [LIBOMPTARGET]Ignore empty target descriptors.
Summary:
If the dynamically loaded module has been compiled with -fopenmp-targets
and has no target regions, it has empty target descriptor. It leads to a
crash at the runtime if another module has at least one target region
and at least one entry in its descriptor. The runtime library is unable
to load the empty binary descriptor and terminates the execution.
Caused by a clang-offload-wrapper.

Reviewers: grokos, jdoerfert

Subscribers: caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72472
2020-01-10 09:45:27 -05:00
Kazuaki Ishizaki 4c6a098ad5 [OpenMP] NFC: Fix trivial typos in comments
Reviewers: jdoerfert, Jim

Reviewed By: Jim

Subscribers: Jim, mgorny, guansong, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72285
2020-01-07 14:05:03 +08:00
Kelvin Li 19433b199d [OpenMP] Fix incorrect property of __has_attribute() macro
__has_attribute(fallthough) -> __has_attribute(fallthrough)

Submitted by: kiszk (Kazuaki Ishizaki <ishizaki@jp.ibm.com>)

Differential Revision: https://reviews.llvm.org/D72287
2020-01-06 15:00:10 -05:00
Kelvin Li ed5fe64581 [OpenMP] NFC: Fix trivial typos in comments
Submitted by: kiszk

Differential Revision: https://reviews.llvm.org/D72171
2020-01-03 22:03:42 -05:00
Jon Chesterfield bc48af8c57 [libomptarget][nfc] Change unintentional target_impl prefix to kmpc_impl 2019-12-30 20:50:23 +00:00
protze@itc.rwth-aachen.de 3356e268f6 [OpenMP] Implementation of OMPT reduction callbacks
Including two tests
These callbacks were added late to the 5.0 specification, an implementation is missing.

Reviewed By: jdoerfert

Differential Review: https://reviews.llvm.org/D70395
2019-12-27 15:30:51 +01:00
Jon Chesterfield 63e2aa5658 [libomptarget][nfc] Provide target_impl malloc/free
Summary:
[libomptarget][nfc] Provide target_impl malloc/free

Sufficient to build support.cu for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71685
2019-12-19 16:54:28 +00:00
JonChesterfield b40822fc14 [libomptarget][nvptx] Fix build, second symbol reordering 2019-12-19 02:02:44 +00:00
Jon Chesterfield 89a2bef27a [libomptarget][nvptx] Fix build, symbol ordering in target_impl.h 2019-12-19 01:50:06 +00:00
JonChesterfield 9aefe5f65e [libomptarget][amdgcn] Correct return type of extern __clock64 to unsigned 2019-12-19 00:11:21 +00:00
Jon Chesterfield 2caeaf2f45 [libomptarget][nfc] Introduce atomic wrapper function
Summary:
[libomptarget][nfc] Introduce atomic wrapper function

Wraps atomic functions in a template prefixed __kmpc_atomic that
dispatches to cuda or hip atomic functions. Intended to be easily extended
to dispatch to OpenCL or C++ atomics for a third target.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: Anastasia, jvesely, mgrang, dexonsmith, llvm-commits, mgorny, jfb, openmp-commits

Tags: #openmp, #llvm

Differential Revision: https://reviews.llvm.org/D71404
2019-12-18 20:06:17 +00:00
JonChesterfield 8adae6027c [libomptarget][nfc] Extract function from data_sharing, move to common
Summary:
[libomptarget][nfc] Extract function from data_sharing, move to common

Finding the first active thread in the warp is different on nvptx and amdgcn,
mostly due to warp size and the desire for efficiency.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71643
2019-12-18 19:39:35 +00:00
Alexey Bataev 15d47deedd [LIBOPENMP][NVPTX]Fix the build error in the runtime. 2019-12-17 14:46:04 -05:00
JonChesterfield 0c83f8ccc7 [libomptarget][nfc] Move three files under common, build them for amdgcn
Summary:
[libomptarget][nfc] Move three files under common, build them for amdgcn

Change to reduction.cu to remove two dead includes, otherwise no code change.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71601
2019-12-17 18:02:49 +00:00
JonChesterfield 3d3e4076cd [libomptarget][nfc] Move omp locks under target_impl
Summary:
[libomptarget][nfc] Move omp locks under target_impl

These are likely to be target specific, even down to the lock_t which is
correspondingly moved out of interface.h. The alternative is to include
interface.h in target_impl which substantiatially increases the scope of
those symbols.

The current nvptx implementation deadlocks on amdgcn. The preferred
implementation for that arch is still under discussion - this change
leaves declarations in target_impl.

The functions could be inline for nvptx. I'd prefer to keep the internals
hidden in the target_impl translation unit, but will add the (possibly renamed)
macros to target_impl.h if preferred.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71574
2019-12-17 12:18:57 +00:00
Jon Chesterfield ce12a523b0 [libomptarget][nfc] Move timer functions behind target_impl
Summary: [libomptarget][nfc] Move timer functions behind target_impl

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71584
2019-12-17 02:22:29 +00:00
Jon Chesterfield 53bcd1e141 [libomptarget][nfc] Wrap cuda min() in target_impl
Summary:
[libomptarget][nfc] Wrap cuda min() in target_impl

nvptx forwards to cuda min, amdgcn implements directly.
Sufficient to build parallel.cu for amdgcn, added to CMakeLists.

All call sites are homogenous except one that passes a uint32_t and an
int32_t. This could be smoothed over by taking two type parameters
and some care over the return type, but overall I think the inline
<uint32_t> calling attention to what was an implicit sign conversion
is cleaner.

Reviewers: ABataev, jdoerfert

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71580
2019-12-17 01:30:04 +00:00
JonChesterfield 69fcc6ecc1 Revert "Revert "[libomptarget] Move resource id functions into target specific code, implement for amdgcn""
Summary:
This reverts commit dd8a7fcdd7.

Alexey reports undefined symbols for the new inline functions defined in target_impl.h
This does not reproduce for me for nvptx, or amdgcn, under release or debug builds.

I believe the patch is fine, based on:
 - the semantics of an inline function in C++ (the cuda INLINE functions end
   up as linkonce_odr in IR), which are only legal to drop if they have no uses
 - the code generated from a debug build of clang 9 does not show these undef symbols
 - the tests pass
 - the code is trivial

To progress from here I either need:
 - A tie break - someone to play the role of CI in determining whether the patch works
 - Alexey to provide sufficient information about his build for me to reproduce the failure
 - Alexey to debug why the symbols are disappearing for him and report back

Reviewers: ABataev, jdoerfert, grokos

Subscribers: jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71502
2019-12-16 16:16:14 +00:00
Alexey Bataev dd8a7fcdd7 Revert "[libomptarget] Move resource id functions into target specific code, implement for amdgcn"
This reverts commit dbb3fec8ad since it
breaks the NVPTX tests.
2019-12-13 16:36:06 -05:00
Jon Chesterfield 40d72134fd [libomptarget] Build most of common/src for amdgcn
Summary:
[libomptarget] Build most of common/src for amdgcn

Excluding parallel.cu, which uses an integer min() from cuda,
Excluding support.cu, which calls malloc that is not yet available for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: gregrodgers, ronlieb, jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71446
2019-12-13 17:48:19 +00:00
Jon Chesterfield 56adcebfda [libomptarget][nfc] Add nop syncwarp function for amdgcn 2019-12-13 14:27:52 +00:00
Jon Chesterfield 479868646a [libomptarget][nfc] Add declarations of atomic functions for amdgcn
Summary:
[libomptarget][nfc] Add declarations of atomic functions for amdgcn

This enables building more source for amdgcn. The functions are usually available
in a hip runtime header, but are duplicated here to decouple the implementation

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71412
2019-12-12 22:56:14 +00:00
Jon Chesterfield dbb3fec8ad [libomptarget] Move resource id functions into target specific code, implement for amdgcn
Summary: [libomptarget] Move resource id functions into target specific code, implement for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71382
2019-12-12 22:49:02 +00:00
Jon Chesterfield b399252028 [libomptarget][nfc] Add missing header for amdgcn/target_impl 2019-12-12 09:36:57 +00:00
David Carlier 27535a1449 [OpenMP] Fix linkage issue on FreeBSD
needs kmp_set_thread_affinity_mask_initial implementation.
2019-12-06 15:47:50 +00:00
JonChesterfield 0dd62c5c2e [libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl
Summary:
[libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl

Part of building code under common/ without requiring a cuda compiler

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: jvesely, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71102
2019-12-06 15:41:18 +00:00
Jon Chesterfield cd90f49d70 [libomptarget][nfc] Move three more files to common
Summary: [libomptarget][nfc] Move three more files to common

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71103
2019-12-06 15:29:50 +00:00
Jon Chesterfield 4af84d2686 [libomptarget][nfc] Introduce SHARED, ALIGN macros
Summary:
[libomptarget][nfc] Introduce SHARED, ALIGN macros
Move remaining cuda attributes behind such macros

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: openmp-commits, jvesely

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71076
2019-12-05 21:57:58 +00:00
Jon Chesterfield d0b9ed5c49 [libomptarget][nfc] Move omptarget-nvptx under common
Summary:
[libomptarget][nfc] Move omptarget-nvptx under common

Almost all files depend on require omptarget-nvptx, which no longer
contains any obviously architecture dependent code. Moving it under
common unblocks task/loop for amdgcn, and allows moving other code.

At some point there should probably be a widespread symbol renaming to
replace the nvptx string. I'd prefer to get things working first.

Building this (and task.cu, loop.cu) without a cuda library requires
some more refactoring, e.g. wrap threadfence(), use DEVICE macro more
consistently. Patches for that are orthogonal and will be posted shortly.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: ABataev

Subscribers: mgorny, fedor.sergeev, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71073
2019-12-05 20:34:15 +00:00
JonChesterfield 3ada8d2a87 [libomptarget] Build a minimal deviceRTL for amdgcn
Summary:
[libomptarget] Build a minimal deviceRTL for amdgcn

Repeat of D70414, with an include path fixed. Diff for sanity checking.

The CMakeLists.txt file is functionally identical to the one used in the aomp fork.
Whitespace changes were made based on nvptx/CMakeLists.txt, plus the
copyright notice updated to match (Greg was the original author so would
like his sign off on that here).

This change will build a small subset of the deviceRTL if an appropriate toolchain is
available, e.g. a local install of rocm. Support.h is moved from nvptx as a dependency
of debug.h.

Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: jvesely, mgorny, jfb, openmp-commits, jdoerfert

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70971
2019-12-04 16:43:37 +00:00
Alexey Bataev 02b9c5d963 Revert "[libomptarget] Build a minimal deviceRTL for amdgcn"
This reverts commit 877ffa716f because it
breaks the build.
2019-12-03 12:35:08 -05:00
Jon Chesterfield 877ffa716f [libomptarget] Build a minimal deviceRTL for amdgcn
Summary:
[libomptarget] Build a minimal deviceRTL for amdgcn

The CMakeLists.txt file is functionally identical to the one used in the aomp fork.
Whitespace changes were made based on nvptx/CMakeLists.txt, plus the
copyright notice updated to match (Greg was the original author so would
like his sign off on that here).

This change will build a small subset of the deviceRTL if an appropriate toolchain is
available, e.g. a local install of rocm. Support.h is moved from nvptx as a dependency
of debug.h.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Reviewed By: jdoerfert

Subscribers: jfb, Hahnfeld, jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70414
2019-12-03 15:18:41 +00:00
Bryan Chan 4d3198e243 [OpenMP] build offload plugins before testing them
Summary:
"make check-all" or "make check-libomptarget" would attempt to run offloading
tests before the offload plugins are built. This patch corrects that by adding
dependencies to the libomptarget CMake rules.

Reviewers: jdoerfert

Subscribers: mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70803
2019-11-28 17:43:56 -05:00
AndreyChurbanov bd2fb41c2d [openmp] Fixed nonmonotonic schedule when #threads > #chunks in a loop.
Differential Revision: https://reviews.llvm.org/D70713
2019-11-27 15:26:51 +03:00
AndreyChurbanov 5f8b8d2820 [openmp] Recognise ARMv7ve machine arch.
Patch by raj.khem (Khem Raj)

Differential Revision: https://reviews.llvm.org/D68543
2019-11-26 14:37:24 +03:00
protze@itc.rwth-aachen.de 77ad98c808 [OpenMP][Tool] archer tests require tsan
Testing for tsan capability in the test-compiler in follow-up review
2019-11-22 17:11:16 +01:00
protze@itc.rwth-aachen.de 6b2431e0c2 [OpenMP][Tool] disable archer tests in standalone build
Will be enabled after Build-Bots are fixed
2019-11-22 15:25:43 +01:00
protze@itc.rwth-aachen.de ac21de0d7e [OpenMP][Tool] Fix cmake variable in lit.site.cfg.in
As noted in D45890
2019-11-22 14:31:54 +01:00
JonChesterfield a84b48d01e [nfc][libomptarget] Remove casts of string literals to char* 2019-11-19 19:41:59 +00:00
JonChesterfield 4681e2e434 [nfc][libomptarget] Write amdgcn macros in terms of compiler intrinsics 2019-11-19 17:23:46 +00:00
AndreyChurbanov 3a76b8a538 Fix openmp on PowerPC64-BE-ELFv2 ABI on FreeBSD.
Patch by adalava (Alfredo Dal'Ava J.nior)

Differential Revision: https://reviews.llvm.org/D67190
2019-11-19 19:45:06 +03:00
Aaron Puchert b29c7fdb61 [OpenMP] Remove -Wl,-fini=__kmp_internal_end_fini
Summary:
The termination function duplicated the functionality of the
__attribute((destructor))-annotated function __kmp_internal_end_fini,
and we have no indication that this doesn't work.

The function might cause issues with link-time optimization turned on:
until very recently, none of the usual linkers was reporting functions
named in -Wl,-fini as used to the LTO plugin, so it might be dropped.
If the function is dropped, -Wl,-fini=__kmp_internal_end_fini doesn't
do what we want: with ld.bfd and lld it drops the FINI attribute from
.dynamic and with gold we get FINI = 0x0, which leads to a crash on
cleanup. This can be reproduced by building with

    -DLLVM_ENABLE_PROJECTS="clang;openmp" \
    -DLLVM_ENABLE_LTO=Thin \
    -DLLVM_USE_LINKER=gold

The issue in lld has been fixed in f95273f75a, but gold remains without
fix so far.

Fixes PR43927.

Reviewers: JonChesterfield, jdoerfert, AndreyChurbanov

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D69927
2019-11-19 00:54:58 +01:00
Jon Chesterfield 5a4a05d776 [libomptarget][nfc] Move some source into common from nvptx
Summary:
[libomptarget][nfc] Move some source into common from nvptx

Moves some source that compiles cleanly under amdgcn into a common subdirectory
Includes some non-trivial files and some headers. Keeps the cuda file extension.

The build systems for different architectures seem unlikely to have much in
common. The idea is therefore to set include paths such that files under
common/src compile as if they were under arch/src as the mechanism for sharing.
In particular, files under common/src need to be able to include target_impl.h.

The corresponding -Icommon is left out in favour of explicit includes on the
basis that the it makes it clearer which files under common are used by a given
architecture.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: ABataev

Subscribers: jfb, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70328
2019-11-18 18:17:36 +00:00
protze@itc.rwth-aachen.de 2b8115b10b [OpenMP] Add implementation and tests of Archer tool
The tool provides TSAN annotations for OpenMP synchronization. The tool
is activated if no other OMPT tool is loaded.

The tool detects whether the application was built with TSan and rejects
activation according to the OMPT protocol if there is no TSan-rt.

Differential Revision: https://reviews.llvm.org/D45890
2019-11-18 14:45:34 +01:00
Sylvestre Ledru 9b40a7f3bf Remove +x permission on some files 2019-11-16 14:47:20 +01:00
JonChesterfield 32dfbd131d [libomptarget][nfc] Use cuda variable wrappers from support.h
Summary:
[libomptarget][nfc] Use cuda variable wrappers from support.h
Reimplementation of D69693, after the revert of D69885

Use the wrappers in support.h for cuda builtin variables at all call sites.
Localises use of cuda and removes WARPSIZE==32 assumption in debug.h.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70186
2019-11-14 12:45:09 +00:00
JonChesterfield fd9fa9995c [libomptarget] Move supporti.h to support.cu
Summary:
[libomptarget] Move supporti.h to support.cu
Reimplementation of D69652, without the unity build and refactors.
Will need a clean build of libomptarget as the cmakelists changed.

Reviewers: ABataev, jdoerfert

Reviewed By: jdoerfert

Subscribers: mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70131
2019-11-13 11:36:46 +00:00
Michał Górny 6f8ee2c575 [openmp] [test] Skip one more test that kills NetBSD buildbot 2019-11-07 17:29:57 +01:00
Jon Chesterfield 7cea0cea77 [libomptarget] Revert all improvements to support
Summary:
[libomptarget] Revert all improvements to support

The change to unity build for nvcc has broken the build for some developers.
This patch reverts to a known-working state.

There has been some confusion over exactly how the build broke. I think we
have reached a common understanding that the disappearing symbols are from
the bitcode library built by clang. The static archive built by nvcc may show the
same problem. Some of the confusion arose from building the deviceRTL twice
and using one or the other library based on various environmental factors.

I'm pretty sure the problem is clang expanding `__forceinline__` into both `__inline__`
and `attribute(("always_inline"))`. The `__inline__` attribute resolves to linkonce_odr
which is not safe for exporting symbols from translation units.

"always_inline" is the desired semantic for small functions defined in one translation
unit that are intended to be inlined at link time. "inline" is not.

This therefore reintroduces the dependency hazard of supporti.h and some code
duplication, and blocks progress separating deviceRTL into reusable components.

See also D69857, D69859 for attempts at a fix instead of a revert.

Reviewers: ABataev, jdoerfert, grokos, ikitayama, tianshilei1992

Reviewed By: ABataev

Subscribers: mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69885
2019-11-06 15:44:10 +00:00
Ron Lieberman dc34b1c94d Test commit: adds a . to comment. NFC 2019-11-04 16:51:03 -06:00
JonChesterfield 94c59ea8dd [libomptarget] Implement target_impl for amdgcn
Summary:
[libomptarget] Implement target_impl for amdgcn

Smallest atomic addition for a new target. Implements enough of the amdgcn
specific code that some of the source files under nvptx/src could be compiled,
without modification, to run on amdgcn.

This foreshadows a work in progress patch to move said source out of nvptx/src.
Patch based on fork at https://github.com/ROCm-Developer-Tools/llvm-project

Reviewers: ABataev, jdoerfert, grokos, ronlieb

Subscribers: jvesely, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69718
2019-11-01 15:46:35 +00:00
Alexey Bataev e57f8ad914 [LIBOMPTARGET]Call GetLaneId function, do not use its address in debug
log functions.
2019-11-01 09:43:47 -04:00
JonChesterfield 9b06ac98d0 [nfc][omptarget] Use builtin var abstraction. Second pass at D69476
Summary:
[nfc][omptarget] Use builtin var abstraction. Second pass at D69476

Use the wrappers in support.h for cuda builtin variables at all call sites.
Localises use of cuda and removes WARPSIZE==32 assumption in debug.h.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69693
2019-11-01 02:21:44 +00:00
JonChesterfield 764c8420e4 [nfc][libomptarget] Reorganise support header
Summary:
[nfc][libomptarget] Reorganise support header

All functions defined in support implementation are now declared in support.h
Reordered functions in support implementation to match the sequence in support.h
Added include guards to support.h
Added #include interface to support.h to provide kmp_Ident declaration
Move supporti.h to support.cu and s/INLINE/EXTERN/g
Add remaining includes to support.cu

A minor side effect is to change the name mangling of the support functions to
extern "C". If this matters another macro along the lines of INLINE/EXTERN
can be added - perhaps DEVICE as that's the obvious implementation.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69652
2019-10-31 17:15:02 +00:00
Jon Chesterfield e9f9dfab82 [libomptarget] Change nvcc compilation to use a unity build
Summary:
[libomptarget] Change nvcc compilation to use a unity build

This allows nvcc to inline functions between what would otherwise be distinct
translation units, which in turn removes any runtime cost from implementing
functions in source files (as opposed to inline in headers).

This will then allow the circular dependencies in deviceRTL to be readily
broken and individual components more easily shared between architectures.

Reviewers: ABataev, jdoerfert, grokos, RaviNarayanaswamy, hfinkel, ronlieb, gregrodgers

Reviewed By: jdoerfert

Subscribers: mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69489
2019-10-31 01:58:51 +00:00
Jon Chesterfield 8548e2f543 [nfc][libomptarget] Move named_sync() into target_impl
Summary: [nfc][libomptarget] Move named_sync() into target_impl

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69487
2019-10-30 16:25:05 +00:00
David Carlier 5069928487 [OpenMP] Reset affinity mask in the process child on FreeBSD
Reviewers: dim, chandlerc, jdoerfert

Reviewed By: dim

Differential Revision: https://reviews.llvm.org/D69047
2019-10-30 14:51:22 +00:00
Jon Chesterfield 74bb5ee674 [nfc][libomptarget] Move smid() into target_impl
Summary: [nfc][libomptarget] Move smid() into target_impl

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69485
2019-10-30 13:39:15 +00:00
Jon Chesterfield 62a161cc00 [libomptarget] Always call malloc, free via SafeMalloc, SafeFree wrapper
Summary:
[libomptarget] Always call malloc, free via SafeMalloc, SafeFree wrapper

NFC for release, adds some verbosity to debug printing. Motivation is to provide
one place where local modifications can be made to the behaviour of all heap
allocation or deallocation while debugging.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69492
2019-10-30 13:35:34 +00:00
AndreyChurbanov 27f6eedc57 Enable OpenBSD support.
Patch by devnexen (David CARLIER)

Differential Revision: https://reviews.llvm.org/D69220
2019-10-30 12:37:44 +03:00
Alexey Bataev d7941a6ab9 [LIBOMPTARGET]Fix build, NFC.
Need to include nvptx_interface.h in target_impl.h, otherwise the build
is failed because of missing __kmpc_impl_lanemask_t type.
2019-10-28 10:43:00 -04:00
Jon Chesterfield 174967f153 [nfc][libomptarget] Decrease coupling between files
Summary:
[nfc][libomptarget] Decrease coupling between files

debug.h used the symbol omptarget_device_environment so implicitly required
an include of omptarget-nvptx.h to compile. Similarly interface.h uses size_t.

Moving this declaration to a new header means cancel, critical can now build
without omptarget-nvptx.h. After this change, debug.h, cancel.cu, critical.cu
could move under a common source directory.

Reviewers: ABataev, jdoerfert, grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69473
2019-10-27 14:27:54 +00:00
Jon Chesterfield ad4c42666d [nfc][libomptarget] Inline option into target_impl
Summary:
[nfc][libomptarget] Inline option into target_impl

Subset of D69423. The macros that were in option.h are all target dependent.
Inlining the header simplifies the dependency graph when looking to move code
into a common subdir.

Reviewers: ABataev, jdoerfert, grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69472
2019-10-27 14:26:55 +00:00
Jon Chesterfield f7c3c640af [NFC][libomptarget]Remove TRUE,FALSE macros from option.h
Summary:
[NFC][libomptarget]Remove TRUE,FALSE macros from option.h
Subset of D69423. Patch series ends with removing option.h.

Reviewers: ABataev, jdoerfert, grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69463
2019-10-27 01:31:12 +01:00
Jon Chesterfield 197b7b24c3 [NFC][libomptarget] move remaining device specific code out of omptarget-nvptx.h
Summary:
[NFC][libomptarget] move remaining device specific code out of omptarget-nvptx.h

Strictly there is one remaining difference wrt amdgcn - parallelLevel is
volatile qualified on amdgcn and not on nvptx. Determining whether this is
correct - and how to represent the different semantics of 'volatile' under
various conditions - is beyond the scope of this code motion patch.

Reviewers: ABataev, jdoerfert, grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69424
2019-10-25 18:58:31 +01:00
AndreyChurbanov be29d92854 OpenMP Tasks dependencies hash re-sizing fixed.
Details:
- nconflicts field initialized;
- formatting fix (moved declaration out of the long line);
- count conflicts in new hash as opposed to old one.

Differential Revision: https://reviews.llvm.org/D68036
2019-10-25 16:04:46 +03:00
Stephan T. Lavavej 2e4f1e112d [www] Change URLs to HTTPS.
This changes most URLs in llvm's html files to HTTPS. Most changes were
search-and-replace with manual verification; some changes were manual.
For a few URLs, the websites were performing redirects or had changed
their anchors; I fixed those up manually. This consistently uses the
official https://wg21.link redirector. This also strips trailing
whitespace and fixes a couple of typos.

Fixes D69363.

There are a very small number of dead links for which I don't know any
replacements (they are equally dead as HTTP or HTTPS):

https://llvm.org/cmds/llvm2cpp.html
https://llvm.org/devmtg/2010-11/videos/Grosser_Polly-desktop.mp4
https://llvm.org/devmtg/2010-11/videos/Grosser_Polly-mobile.mp4
https://llvm.org/devmtg/2011-11/videos/Grosser_PollyOptimizations-desktop.mov
https://llvm.org/devmtg/2011-11/videos/Grosser_PollyOptimizations-mobile.mp4
https://llvm.org/perf/db_default/v4/nts/22463
https://polly.llvm.org/documentation/memaccess.html
2019-10-24 13:25:15 -07:00
Jon Chesterfield d69d1aa131 [libomptarget][nfc] Make interface.h target independent
Summary:
[libomptarget][nfc] Make interface.h target independent

Move interface.h under a top level include directory.
Remove #includes to avoid the interface depending on the implementation.

Reviewers: ABataev, jdoerfert, grokos, ronlieb, RaviNarayanaswamy

Reviewed By: jdoerfert

Subscribers: mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D68615

llvm-svn: 374919
2019-10-15 17:15:26 +00:00
David Carlier f61f13d4e7 [OpenMP] Enable thread affinity on FreeBSD
Reviewers: chandlerc, jlpeyton, jdoerfert, dim

Reviewed-By: dim

Differential Revision: https://reviews.llvm.org/D68580

llvm-svn: 374118
2019-10-08 21:25:30 +00:00
Andrey Churbanov ca2973bb20 Don't assume Type from `readelf -d` has parentheses
Patch by jbeich (Jan Beich)

Differential Revision: https://reviews.llvm.org/D68053

llvm-svn: 374038
2019-10-08 12:39:04 +00:00
Andrey Churbanov f34271d886 Don't link libm with -Wl,--as-needed on FreeBSD
Patch by jbeich (Jan Beich)

Differential Revision: https://reviews.llvm.org/D68051

llvm-svn: 374037
2019-10-08 12:23:25 +00:00
Jon Chesterfield 58fd6b5b9c [libomptarget][nfc] Update remaining uint32 to use lanemask_t
Summary:
[libomptarget][nfc] Update remaining uint32 to use lanemask_t

Update a few functions in the API to use lanemask_t instead of i32. NFC for
nvptx. Also update the ActiveThreads type in DataSharingStateTy.
This removes a lot of #ifdef from the downsteam amdgcn implementation.

Reviewers: ABataev, jdoerfert, grokos, ronlieb, RaviNarayanaswamy

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D68513

llvm-svn: 373806
2019-10-04 22:30:28 +00:00
Jon Chesterfield 4f75a73796 Use named constant to indicate all lanes, to handle 32 and 64 wide architectures
Summary: Use named constant to indicate all lanes, to handle 32 and 64 wide architectures

Reviewers: ABataev, jdoerfert, grokos, ronlieb

Reviewed By: grokos

Subscribers: ronlieb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D68369

llvm-svn: 373793
2019-10-04 21:39:22 +00:00
David Carlier fef62e1a68 [OpenMP] FreeBSD address check if mapped more native
/proc unless Linux layer compatibility is activated for CentOS is activated is not present
thus relying on a more native for checking the address.

Reviewers: Hahnfeld, kongyl, jdoerfert, jlpeyton, AndreyChurbanov, emaster, dim

Reviewed By: Hahnfeld

Differential Revision: https://reviews.llvm.org/D67326

llvm-svn: 373152
2019-09-28 19:01:59 +00:00
Sergey Dmitriev 4b343fd84c [Clang][OpenMP Offload] Create start/end symbols for the offloading entry table with a help of a linker
Linker automatically provides __start_<section name> and __stop_<section name> symbols to satisfy unresolved references if <section name> is representable as a C identifier (see https://sourceware.org/binutils/docs/ld/Input-Section-Example.html for details). These symbols indicate the start address and end address of the output section respectively. Therefore, renaming OpenMP offload entries section name from ".omp.offloading_entries" to "omp_offloading_entries" to use this feature.

This is the first part of the patch for eliminating OpenMP linker script (please see https://reviews.llvm.org/D64943).

Differential Revision: https://reviews.llvm.org/D68070

llvm-svn: 373118
2019-09-27 20:00:51 +00:00
Andrey Churbanov de44f434e8 fixed test: eliminated race condition which might cause deadlock
llvm-svn: 372887
2019-09-25 15:25:52 +00:00
Andrey Churbanov a1639b9bba Enable tasks dependencies hashmaps resizing.
Patch by viroulep (Philippe Virouleau)

Differential Revision: https://reviews.llvm.org/D67447

llvm-svn: 372879
2019-09-25 14:40:19 +00:00
Jonas Hahnfeld 673e5476a8 [OpenMP] Change initialization of __kmp_global
There's no need to initialize variables with static storage duration
because they're implicitly initialized to zero. See
https://en.cppreference.com/w/c/language/initialization#Implicit_initialization

I think that's already relied upon because the supplied 0 only sets
'kmp_time_global_t g_time;' in 'struct kmp_base_global'. The other fields
are not set in the code, but implicitly initialized by the compiler.

Differential Revision: https://reviews.llvm.org/D66292

llvm-svn: 370943
2019-09-04 17:47:37 +00:00
Alexey Bataev 4812941776 [OPENMP][NVPTX]Fix parallel level counter in non-SPMD mode.
Summary:
In non-SPMD mode we may end up with the divergent threads when trying to
increment/decrement parallel level counter. It may lead to incorrect
calculations of the parallel level and wrong results when threads are
divergent. We need to reconverge the threads before trying to modify the
parallel level counter.

Reviewers: grokos, jdoerfert

Subscribers: guansong, openmp-commits, caomhin, kkwli0

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66802

llvm-svn: 370803
2019-09-03 18:11:50 +00:00
Jon Chesterfield bbdd282371 [libomptarget] Refactor activemask macro to inline function
Summary:
[libomptarget] Refactor activemask macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Reviewed By: jdoerfert, ABataev

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66851

llvm-svn: 370781
2019-09-03 16:31:30 +00:00
Jon Chesterfield 3294421926 Use target_impl functions to replace more inline asm
Summary:
Use target_impl functions to replace more inline asm
Follow on from D65836. Removes remaining asm shuffles and lanemask accessors
Also changes the types of target_impl bitwise functions to unsigned.

Reviewers: jdoerfert, ABataev, grokos, Hahnfeld, gregrodgers, ronlieb, hfinkel

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66809

llvm-svn: 370216
2019-08-28 15:04:06 +00:00
Jon Chesterfield 80f9a38a76 [libomptarget] Refactor syncthreads macro to inline function
Summary:
[libomptarget] Refactor syncthreads macro to inline function
See also abandoned D66846, split into this diff and others.

Rev 2 of D66855

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66861

llvm-svn: 370210
2019-08-28 14:22:35 +00:00
Jon Chesterfield be3d487313 [libomptarget] Refactor syncwarp macro to inline function
Summary:
[libomptarget] Refactor syncwarp macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66857

llvm-svn: 370149
2019-08-28 02:02:53 +00:00
Jon Chesterfield e73e3013a6 Fix build break due to close brace lost in merge
llvm-svn: 370148
2019-08-28 01:56:26 +00:00
Jon Chesterfield 327aa81123 [libomptarget] Refactor shfl_down_sync macro to inline function
Summary:
[libomptarget] Refactor shfl_down_sync macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66853

llvm-svn: 370146
2019-08-28 01:47:41 +00:00
Jon Chesterfield b9b712df82 [libomptarget] Refactor shfl_sync macro to inline function
Summary:
[libomptarget] Refactor shfl_sync macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66852

llvm-svn: 370144
2019-08-28 01:31:04 +00:00
Alexey Bataev da8b5cc9f1 [OPENMP][NVPTX]Add __kmpc_syncwarp(int32_t) function.
Summary:
Added function void __kmpc_syncwarp(int32_t) to expose it to the
compiler. It is required to fix the problem with the critical regions in
Cuda9.0+. We cannot use barrier in the critical region, but still need
to reconverge the threads in the warp after. This function allows to do
this.

Reviewers: grokos, jdoerfert

Subscribers: guansong, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66672

llvm-svn: 369933
2019-08-26 17:32:45 +00:00
Alexey Bataev 0366168f3a [OPENMP][NVPTX]Use __syncwarp() to reconverge the threads.
Summary:
In Cuda 9.0 it is not guaranteed that threads in the warps are
convergent. We need to use __syncwarp() function to reconverge
the threads and to guarantee the memory ordering among threads in the
warps.
This is the first patch to fix the problem with the test
libomptarget/deviceRTLs/nvptx/src/sync.cu on Cuda9+.
This patch just replaces calls to __shfl_sync() function with the call
of __syncwarp() function where we need to reconverge the threads when we
try to modify the value of the parallel level counter.

Reviewers: grokos

Subscribers: guansong, jfb, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65013

llvm-svn: 369796
2019-08-23 18:34:48 +00:00
Jonathan Peyton 57ae6b8e37 Force honoring nthreads-var and thread-limit-var inside teams construct on host
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=42906, via adding
adjustment of number of threads on enter to the teams construct on host
according to user settings. This allows to pass checks and avoid assertions
at time of team of threads creation.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D66351

llvm-svn: 369430
2019-08-20 19:39:17 +00:00
Jonas Hahnfeld d2ae0c4f44 [OpenMP] Enable warning about "implicit fallthrough"
Fix last warned location in ittnotify_static.cpp using the defined
macro KMP_FALLTHROUGH().

Differential Revision: https://reviews.llvm.org/D65871

llvm-svn: 369003
2019-08-15 13:26:55 +00:00
Jonas Hahnfeld 4d77e50e6e [OpenMP] Remove 'unnecessary parentheses'
The variables in kmp_lock.cpp are really arrays of function pointers
that return void or int, not pointers to functions that return void*
or int*. The other changes are only cosmetic.

Differential Revision: https://reviews.llvm.org/D65870

llvm-svn: 369002
2019-08-15 13:26:41 +00:00
Jonas Hahnfeld fb72a03f85 [OMPT] Resolve warnings because of ints in if conditions
The implementation status can only be one of
ompt_event_UNIMPLEMENTED = ompt_set_never = 1
ompt_event_MAY_ALWAYS = ompt_set_always = 5

In both cases, the condition was already true, so just remove
the check.

Differential Revision: https://reviews.llvm.org/D65869

llvm-svn: 369001
2019-08-15 13:26:29 +00:00
Jonas Hahnfeld dc23c832f4 [OpenMP] Turn on -Wall compiler warnings by default
Instead, maintain a list of disabled options to still build libomp and
libomptarget without warnings. This includes -Wno-error and -Wno-pedantic
to silence warnings that LLVM enables when building in-tree.

I tested the following compilers:
 * Clang 6.0, 7.0, 8.0
 * GCC 4.8.5 (CentOS 7), GCC 6, 7, 8, 9
 * Intel Compiler 16, 17, 18, 19

RFC thread on openmp-dev mailing list:
http://lists.llvm.org/pipermail/openmp-dev/2019-August/002668.html

Differential Revision: https://reviews.llvm.org/D65867

llvm-svn: 368999
2019-08-15 13:11:50 +00:00
Jon Chesterfield ed3324f6b6 Factor architecture dependent code out of loop.cu
Summary:
[libomptarget] Factor architecture dependent code out of loop.cu

Related to the patch series starting D64217. Added subscribers to said series as reviewers. This effort is smaller in scope.

This patch factors out just enough architecture dependent code from loop.cu to allow the same source to be used with amdgcn, given a different target_impl.h. Testing is that the same bitcode (modulo variable names) is generated for libomptarget before and after the refactor, for nvptx and the out of tree amdgcn.

Reviewers: jdoerfert, ABataev, bollu, jfb, tra, grokos, Hahnfeld, guansong, xtian, gregrodgers, ronlieb, hfinkel, gtbercea, guraypp, arpith-jacob

Reviewed By: jdoerfert, ABataev

Subscribers: dexonsmith, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65836

llvm-svn: 368751
2019-08-13 21:41:47 +00:00
Andrey Churbanov 5eec1a9d32 Cleanup unused variable.
This patch fixes problem raised in post-review comments of the
https://reviews.llvm.org/D65285. Developers of ittnotify confirmed
that dll_path_ptr field of the __itt_global structure is never used
by ittnotify library, so it is safe to remove the dll_path array.

Differential Revision: https://reviews.llvm.org/D65885

llvm-svn: 368559
2019-08-12 12:37:30 +00:00
Gheorghe-Teodor Bercea 6c7b882e52 [OpenMP][libomptarget] Add support for close map modifier
Summary:
This patch adds support for the close map modifier.

The close map modifier will overwrite the unified shared memory requirement and create a device copy of the data.

Reviewers: ABataev, Hahnfeld, caomhin, grokos, jdoerfert, AlexEichenberger

Reviewed By: Hahnfeld, AlexEichenberger

Subscribers: guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65340

llvm-svn: 368488
2019-08-09 21:32:57 +00:00
Jonas Hahnfeld 7a0f2dc5a4 [libomptarget] Remove duplicate RTLRequiresFlags per device
We have one global RTLs.RequiresFlags, I don't see a need to make a
copy per device that the runtime manages. This was problematic anyway
because the copy happened during the first __tgt_register_lib(). This
made it impossible to call __tgt_register_requires() from normal user
funtions for testing.
Hence, this change also fixes unified_shared_memory/shared_update.c for
older versions of Clang that don't call __tgt_register_requires() before
__tgt_register_lib().

Differential Revision: https://reviews.llvm.org/D66019

llvm-svn: 368465
2019-08-09 19:20:39 +00:00
Gheorghe-Teodor Bercea a1d20506e7 [OpenMP][libomptarget] Add support for unified memory for regular maps
Summary:
This patch adds support for using unified memory in the case of regular maps that happen when a target region is offloaded to the device.

For cases where only a single version of the data is required then the host address can be used. When variables need to be privatized in any way or globalized, then the copy to the device is still required for correctness.

Reviewers: ABataev, jdoerfert, Hahnfeld, AlexEichenberger, caomhin, grokos

Reviewed By: Hahnfeld

Subscribers: mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65001

llvm-svn: 368192
2019-08-07 17:29:45 +00:00
Jon Chesterfield ae0178bee7 Use forceinline. Necessary for nvcc to inline small functions within the bitcode library
Summary:
[libomptarget] Use forceinline. Necessary for nvcc to inline small functions within the bitcode library
Suggested in D65836

Reviewers: ABataev, jdoerfert, grokos, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65876

llvm-svn: 368177
2019-08-07 15:24:12 +00:00
Alexey Bataev c10180ed8e [OPENMP][OFFLOADING]Fix the test, NFC.
llvm-svn: 368068
2019-08-06 18:13:39 +00:00
Jonathan Peyton 73d5abd809 [OpenMP] Add support for GOMP_*_nonmonotonic_* functions
Patch by Isuru Fernando

Differential Revision: https://reviews.llvm.org/D65714

llvm-svn: 367949
2019-08-05 23:23:52 +00:00
Hansang Bae dcdbe6515b [OpenMP] Fix broken build due to new OMPT tests
New OMPT tests with teams construct should be disabled for GCC as it
emits code with a GOMP entry not supported in the LLVM runtime.

Differential Revision: https://reviews.llvm.org/D65757

llvm-svn: 367939
2019-08-05 21:46:13 +00:00
Michael Kruse 78769ec403 [libomptarget] Harmonize emitting CUDA errors and general debug messages.
Ensures that CUDA fail reasons (such as "No CUDA-capable device detected")
are printed together with libomptarget's debug message
(e.g. "Error when setting CUDA context"). Previously, the former was
printed only in CMAKE_BUILD_TYPE=Debug builds while the latter was
enabled by LIBOMPTARGET_ENABLE_DEBUG.

With this change, also only call cuGetErrorString when the error will be
printed.

Suggested-by: Ye Luo <xw111luoye@gmail.com>

Differential Revision: https://reviews.llvm.org/D65687

llvm-svn: 367910
2019-08-05 19:12:10 +00:00
Michael Kruse 2c7a8eaf3d [OpenMP 5.0] libomptarget interface for declare mapper functions.
This patch implements the libomptarget runtime interface for OpenMP 5.0
declare mapper functions. The declare mapper functions generated by
Clang will call them to complete the mapping of members.
kmpc_mapper_num_components gets the current number of components for a
user-defined mapper; kmpc_push_mapper_component pushes back one
component for a user-defined mapper.

The design slides can be found at
https://github.com/lingda-li/public-sharing/blob/master/mapper_runtime_design.pptx

Patch by Lingda Li <lildmh@gmail.com>

Differential Revision: https://reviews.llvm.org/D60972

llvm-svn: 367772
2019-08-04 04:18:28 +00:00
Hansang Bae 67e93a1ae0 Add OMPT support for teams construct
This change adds OMPT support for events from teams construct.

Differential Revision: https://reviews.llvm.org/D64025

llvm-svn: 367746
2019-08-03 02:38:53 +00:00
Jonas Hahnfeld 52b87ac32f [OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS
All other files are already C++ and the build system has always
passed '-x c++' for C files, effectively compiling them as C++.

To stay warning free we need one fix in ittnotify_static.{c,cpp}:
The variable dll_path can be written to, so it must not be const.
GCC complained with -Wcast-qual and I think it's right.

Differential Revision: https://reviews.llvm.org/D65285

llvm-svn: 367343
2019-07-30 18:37:28 +00:00
Yi Kong 3d21a3af87 [openmp] Workaround bug in old Android pthread_attr_setstacksize
Round the stack size to a multiple of the page size. Older versions of
Android (until KitKat) would fail pthread_attr_setstacksize with
EINVAL if the stack size was not a multiple of the page size.

Patch by Dan Albert <danalbert@google.com>.

Test: Build, copied into the NDK, passed openmp test on ICS.
Bug: https://github.com/android-ndk/ndk/issues/9
llvm-svn: 367070
2019-07-25 22:29:55 +00:00
Jonas Hahnfeld baeab1fc44 [OpenMP] Fix build of stubs library, NFC.
Both Clang and GCC complained that they cannot initialize a return
object of type 'kmp_proc_bind_t' with an 'int'. While at it, also
fix a warning about missing parentheses thrown by Clang.

Differential Revision: https://reviews.llvm.org/D65284

llvm-svn: 367041
2019-07-25 17:51:24 +00:00
Alexey Bataev ca424d100c [OPENMP][NVPTX]Perform memory flush if number of threads to sync is 1 or less.
Summary:
According to the OpenMP standard, barrier operation must perform
implicit flush operation. Currently, if there is only one thread in the
team, barrier does not flush the memory. Patch fixes this problem.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62398

llvm-svn: 367024
2019-07-25 15:02:28 +00:00
Jonas Hahnfeld 2488ae9df1 [OpenMP] RISCV64 port
This is a port of libomp for the RISC-V 64-bit Linux target.

We have tested this port on a HiFive Unleashed development board
using a downstream LLVM that has support for the missing bits in
upstream. As of now, all tests are passing, including OMPT.

Patch by Ferran Pallarès!

Differential Revision: https://reviews.llvm.org/D59880

llvm-svn: 367021
2019-07-25 14:36:20 +00:00
Jonas Hahnfeld 6e40ae8f3d [libomptarget] Handle offload policy in push_tripcount
If the first target region in a program calls the push_tripcount
function, libomptarget didn't handle the offload policy correctly.
This could lead to unexpected error messages as seen in
http://lists.llvm.org/pipermail/openmp-dev/2019-June/002561.html

To solve this, add a check calling IsOffloadDisabled() as all other
entry points already do. If this method returns false, libomptarget
is effectively disabled.

Differential Revision: https://reviews.llvm.org/D64626

llvm-svn: 366810
2019-07-23 14:20:48 +00:00
Jonas Hahnfeld a2748c74d6 [OMPT] Cleanup reset of exit_frame pointer
This is done at call-site and does not need to be handled in
__kmp_invoke_microtask. It was already absent from the x86
and x86_64 assembly, this patch removes it from the generic
implementation in z_Linux_util.cpp and adds documentation for
AArch64 and PPC64 that it's actually not needed. I can't test
on these architectures, so I don't want to change the code just
because it looks right :)

While at it, rename some variables for consistency and add a
check in test/ompt/parallel/normal.c that the pointer was reset
before entering the barrier.

Differential Revision: https://reviews.llvm.org/D64442

llvm-svn: 366721
2019-07-22 18:46:02 +00:00
Jonas Hahnfeld 4138b2f167 Delete empty file
This is a left-over from r356288 which was reviewed in D58989.

llvm-svn: 366716
2019-07-22 18:11:06 +00:00
Alexey Bataev da43861b4a [OpenMP][libomptarget] Suppress C++ 11 related warnings when building libomptarget-nvptx bitcode library, by Doru Bercea.
Summary: Pass -std=c++11 flag to compiler to suppress C++ 11 related warnings when building NVPTX bitcode library.

Reviewers: ABataev, caomhin, Hahnfeld

Reviewed By: ABataev, Hahnfeld

Subscribers: jdoerfert, Hahnfeld, jholewinski, mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55772

llvm-svn: 366438
2019-07-18 13:54:01 +00:00
Ron Lieberman 59532488b1 [OPENMP] Resolve lost LoopTripCnt for subsequent loops in same thread.
Remove loopTripCnt from threaded device stack after consuming it.
Added a libomptarget DP message to aid in future debugging and to
validate the added testcase, which only runs in Debug build.

Differential Revision: https://reviews.llvm.org/D64808

llvm-svn: 366349
2019-07-17 17:07:52 +00:00
Jonathan Peyton aa5cdafa40 Remove REQUIRES OMP spec version within lit tests
This is a follow up patch to D64534 (r365963) which removed all OMP
spec versioning within the OpenMP runtime codebase.  This patch removes
REQUIRES: openmp-x.y lines from lit tests.

llvm-svn: 366341
2019-07-17 15:41:00 +00:00
Jonas Hahnfeld 1ff5535785 [OpenMP] Move header inclusion out of 'extern "C"'
This leads to problems when compiling C++ code with libc++ for Nvidia GPUs
because Clang now uses wrappers for math functions that might include
C++ templates not allowed in 'extern "C"'.

Differentiel Revision: https://reviews.llvm.org/D64625

llvm-svn: 366229
2019-07-16 17:16:43 +00:00
Alexey Bataev 85b9651edd [OPENMP][NVPTX]Fixed checks for cuda versions.
Summary:
We used CUDART_VERSION macro to check for the installed cuda version
but this macro is defined in cuda_runtime_api.h, which is not used by
project. Better to use CUDA_VERSION macro, which is defined in cuda.h.
Also, added the check if this macro is defined. If macro is undefined,
there is something wrong with the cuda configuration and we should not
continue the compilation.
This also fixes problems with runtime building in cuda 10+.

Reviewers: grokos

Subscribers: guansong, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64648

llvm-svn: 366224
2019-07-16 16:07:10 +00:00
Alexey Bataev 42816107f7 [OPENMP]Fix threadid in __kmpc_omp_taskwait call for dependent target calls.
Summary:
We used to call __kmpc_omp_taskwait function with global threadid set to
0. It may crash the application at the runtime if the thread executing
 target region is not a master thread.

Reviewers: grokos, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64571

llvm-svn: 366220
2019-07-16 15:51:32 +00:00
Jonathan Peyton e4b4f994d2 [OpenMP] Remove OMP spec versioning
Remove all older OMP spec versioning from the runtime and build system.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D64534

llvm-svn: 365963
2019-07-12 21:45:36 +00:00
Jonas Hahnfeld aca476b296 [libomptarget] Fix typos and grammar in error messages, NFC.
llvm-svn: 365890
2019-07-12 10:21:55 +00:00
Jonas Hahnfeld 2dfc5179f6 [libomptarget-nvptx] Remove dead functions
These entry points are never called by Clang trunk nor clang-ykt. If
XL doesn't use them either, they can finally go away.

Differential Revision: https://reviews.llvm.org/D52700

llvm-svn: 365817
2019-07-11 20:12:51 +00:00
Andrey Churbanov 28f44040cc NFC: fixed typo #ifdef --> #if to allow macro set to 0 work correctly
llvm-svn: 365642
2019-07-10 15:09:37 +00:00
Alexey Bataev 4ad9286a57 [OPENMP]Rename loopTripCnt member data to LoopTripCnt, NFC.
Rename variable to follow LLVM coding standard.

llvm-svn: 365368
2019-07-08 18:45:48 +00:00
Alexey Bataev 060921dee7 [OPENMP]Make __kmpc_push_tripcount thread safe.
Summary:
__kmpc_push_tripcount function is not thread safe and may lead to data
race when the target regions are executed in parallel threads. The patch
makes loopTripCnt counter thread aware and stores the tripcount value
per thread in the map. Access to map is guarded by mutex to prevent
data race in the map itself.
Test is for NVPTX target because it does not work correctly on the
host. Seems to me, there is a problem in libomp with target regions in
the parallel threads.

Reviewers: grokos

Subscribers: guansong, jfb, jdoerfert, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64080

llvm-svn: 365332
2019-07-08 15:30:23 +00:00
Andrey Churbanov a23806e67a Create a runtime option to disable task throttling.
Patch by viroulep (Philippe Virouleau)

Differential Revision: https://reviews.llvm.org/D63196

llvm-svn: 364934
2019-07-02 15:10:20 +00:00
Andrey Churbanov e7b2c64a6e Cleanup of unused code
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D63891

llvm-svn: 364925
2019-07-02 13:45:40 +00:00
Alexey Bataev bb55ece269 [OPENMP][NVPTX]Relax flush directive.
Summary:
According to the OpenMP standard, flush  makes a thread’s temporary view of memory consistent with memory and enforces an order on the memory operations of the variables explicitly specified or implied.

According to the Cuda toolkit documentation (https://docs.nvidia.com/cuda/archive/8.0/cuda-c-programming-guide/index.html#memory-fence-functions), __threadfence() functions provides required functionality.

__threadfence_system() also provides required functionality, but it also
includes some extra functionality, like synchronization of page-locked
host memory, synchronization for the host, etc. It is not required per
the standard and we can use more relaxed version of memory fence
operation.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62397

llvm-svn: 364572
2019-06-27 18:33:09 +00:00
Andrey Churbanov b7e6c37efe Fixed memory use-after-free problem.
Bug reported in https://bugs.llvm.org/show_bug.cgi?id=42269.
Freeing of the contention group (CG) stucture by master thread looks wrong,
because workers can leave the CG later on. Intead the freeing
is now done by the last thread leaving the CG.

Differential Revision: https://reviews.llvm.org/D63599

llvm-svn: 364456
2019-06-26 18:11:26 +00:00
Gheorghe-Teodor Bercea aace6d285d [OpenMP][libomptarget] Add support for declare target to clause under unified memory
Summary:
This patch adds support for handling variables under the:

```
#pragma omp declare target to()
```

clause when the 

```
#pragma omp requires unified_shared_memory
```

is used.

The address of the host variable is copied into the device pointer just like for the declare target link case.

Reviewers: ABataev, caomhin, grokos, AlexEichenberger

Reviewed By: grokos

Subscribers: jcownie, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D63106

llvm-svn: 363825
2019-06-19 15:48:10 +00:00
Alexey Bataev 8a2bd361eb [OPENMP][CUDA]Use __syncthreads when compiled by nvcc and clang >= 9.0.
Summary:
The problems with __syncthreads() were fixed in clang >= 9.0 and the
original __syncthreads() can be used instead of the ptx instruction.

Reviewers: grokos

Subscribers: guansong, jdoerfert, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D63515

llvm-svn: 363807
2019-06-19 14:20:34 +00:00
Andrey Churbanov 405037c4e6 New implementation of OpenMP 5.0 detached tasks.
Patch by Alex Duran

Differential Revision: https://reviews.llvm.org/D62485

llvm-svn: 363799
2019-06-19 13:23:28 +00:00
Gheorghe-Teodor Bercea b48e44a65c [OpenMP] Add task alloc function
Summary: Add the target task allocation function to the interface.

Reviewers: ABataev, AlexEichenberger, caomhin, jlpeyton, AndreyChurbanov, RaviNarayanaswamy, hbae

Reviewed By: AlexEichenberger, hbae

Subscribers: hbae, RaviNarayanaswamy, cfe-commits, Hahnfeld, guansong, jdoerfert, openmp-commits

Tags: #openmp, #clang

Differential Revision: https://reviews.llvm.org/D63010

llvm-svn: 363449
2019-06-14 20:15:15 +00:00
Andrey Churbanov d47f5488cf Added propagation of not big initial stack size of master thread to workers.
Currently implemented only for non-Windows 64-bit platforms.

Differential Revision: https://reviews.llvm.org/D62488

llvm-svn: 362618
2019-06-05 16:14:47 +00:00
Gheorghe-Teodor Bercea c5fe030c16 [OpenMP][libomptarget] Enable usage of unified memory for declare target link variables
Summary: This patch enables the usage of a host variable on the device for declare target link variables when unified memory is available.

Reviewers: ABataev, caomhin, grokos

Reviewed By: grokos

Subscribers: Hahnfeld, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60884

llvm-svn: 362505
2019-06-04 15:05:53 +00:00
Andrey Churbanov 3f786dab0e Fixed build warning with -DLIBOMP_USE_HWLOC=1
Made type of depth of hwloc object to correapond with
change from unsigned in hwloc 1,x to int in hwloc 2.x.
This eliminates the warning on signed-unsigned comparison.

Differential Revision: https://reviews.llvm.org/D62332

llvm-svn: 362401
2019-06-03 14:21:59 +00:00
Hansang Bae ec1b4d1f6f Fix OMP_TARGET_OFFLOAD parsing
Current parsing allows trailing string after the permitted value,
MANDATORY|DISABLED|DEFAULT -- e.g., "mandatorynot" is also recognized
as "MANDATORY". Such cases should be recognized as incorrect/unknown
value.

Differential Revision: https://reviews.llvm.org/D62431

llvm-svn: 362125
2019-05-30 18:35:07 +00:00
Hansang Bae 7c75ac0c60 Add checks before pointer dereferencing
This change adds checks before dereferencing a pointer returned from a
function.

Differential Revision: https://reviews.llvm.org/D62224

llvm-svn: 362111
2019-05-30 16:32:20 +00:00
Michal Gorny a815cbb010 [openmp] [test] Skip kernel-breaking tests on NetBSD
The omp_taskloop_num_tasks and omp_taskwait have deadlooped
on the NetBSD buildbot previously, practically hanging the host running
it.  Disable them until we can find a good solution, or make the kernel
less fragile.

llvm-svn: 361825
2019-05-28 14:10:47 +00:00
Alexey Bataev e1947b84c1 Revert "[OPENMP][NVPTX]Fix barriers and parallel level counters, NFC."
This reverts commit r361421 to split the patch into 3 parts.

llvm-svn: 361638
2019-05-24 14:06:47 +00:00
Alexey Bataev 9d9e406684 [OPENMP][NVPTX]Fix barriers and parallel level counters, NFC.
Summary:
Parallel level counter should be volatile to prevent some dangerous
optimiations by the ptxas. Otherwise, ptxas optimizations lead to
undefined behaviour in some cases.
Also, use __threadfence() for #pragma omp flush and if the barrier
should not be used (we have only one thread in the team), still perform
flush operation since the standard requires implicit flush when
executing barriers.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62199

llvm-svn: 361421
2019-05-22 19:50:32 +00:00
Andrey Churbanov 184ef0a0a6 Fixed third issue reported in https://bugs.llvm.org/show_bug.cgi?id=41584.
Removed wrong debug assertion.

Differential Revision: https://reviews.llvm.org/D62251

llvm-svn: 361408
2019-05-22 16:48:05 +00:00
Jonathan Peyton 3057c3a092 [OpenMP] Add implementation to two OMPT API routines
This change adds implementation to ompt_finalize_tool() and
ompt_get_task_memory().

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D61657

llvm-svn: 361309
2019-05-21 20:51:05 +00:00
Gheorghe-Teodor Bercea 9e9c918259 [OpenMP][libomptarget] Enable requires flags for target libraries.
Summary:
Target link variables are currently implemented by creating a copy of the variables on the device side and unified memory never gets exploited.

When the prgram uses the:

```
#pragma omp requires unified_shared_memory
```

directive in conjunction with a declare target link, the linked variable is no longer allocated on the device and the host version is used instead.

This behavior is overridden by performing an explicit mapping.

A Clang side patch is required.

Reviewers: ABataev, AlexEichenberger, grokos, Hahnfeld

Reviewed By: AlexEichenberger, grokos, Hahnfeld

Subscribers: Hahnfeld, jfb, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60223

llvm-svn: 361294
2019-05-21 19:35:02 +00:00
Joachim Protze 4109d5606e [OpenMP][OMPT] Fix locking testcases for 32 bit architectures
https://reviews.llvm.org/D58454 did not fix the problem for a typical use
case of building LLVM with gcc or icc and then testing with the newly built
clang compiler.
The compilers do not agree on how to extend a 32-bit pointer to uint64, so
make the pointer unsigned first, before adjusting the size.

Patch by Joachim Protze

Differential Revision: https://reviews.llvm.org/D58506

llvm-svn: 361158
2019-05-20 14:21:42 +00:00
Joachim Protze 48b8a4b519 [OMPT] Handling of the events of initial-task-begin and initial-task-end
OpenMP 5.0 says that the callback for the events initial-task-begin and
initial-task-end has to be ompt_callback_implicit_task.

Patch by Tim Cramer

Differential Revision: https://reviews.llvm.org/D58776

llvm-svn: 361157
2019-05-20 14:21:36 +00:00
Andrey Churbanov f8f788b205 Fixed second issue reported in https://bugs.llvm.org/show_bug.cgi?id=41584.
Added synchronization for possible concurrent initialization of mutexes
by multiple threads. The need of synchronization caused by commit r357927
which added the use of mutexes at threads movement to/from common pool
(earlier the mutexes were used only at suspend/resume).

Patch by Johnny Peyton.

Differential Revision: https://reviews.llvm.org/D61995

llvm-svn: 360919
2019-05-16 17:52:53 +00:00
Paul Osmialowski 0732fcc7d5 Fix hwloc topology traversal code unable to handle situation where L2 cache is common for the packages
Currently cores within package that share the same L2 cache are grouped together.
The current logic behind this assumes that the L2 cache is always at deeper
(or the same) level than the package itself. In case when L2 cache is common
for all packages (and the packages are at deeper level than L2 cache) the whole of
the further topology discovery fails to find any computational units resulting in
following assertion:

Assertion failure at kmp_affinity.cpp(715): nActiveThreads == __kmp_avail_proc.
OMP: Error #13: Assertion failure at kmp_affinity.cpp(715).

This patch adds a bit of a logic that prevents such situation from occurring.

Differential Revision: https://reviews.llvm.org/D61796

llvm-svn: 360890
2019-05-16 13:16:24 +00:00
Andrey Churbanov 6ebb785bb1 Fixed https://bugs.llvm.org/show_bug.cgi?id=41584.
Removed unconditional and unsafe decrement of counter 
of active threads in pool at shutdown time.

Differential Revision: https://reviews.llvm.org/D61944

llvm-svn: 360784
2019-05-15 16:53:45 +00:00
Andrey Churbanov 22405f3097 Introduce new OpenMP 5.0 depend object type.
The implementation should be done by compiler, user can only declare
objects of this type and use them in OpenMP directives.

Differential Revision: https://reviews.llvm.org/D61860

llvm-svn: 360774
2019-05-15 13:45:36 +00:00
Eli Friedman 025df3b827 [OpenMP][AArch64] Fix compile with LLVM trunk.
The code is currently using the ambiguous instruction
"sub sp, sp, w9, lsl #4". The ARM reference manual says this isn't
valid, and it's not clear whether it's supposed to mean uxtw or uxtx.

It doesn't matter which instruction we use here, since the high
bits of the operand are zero anyway, so I arbitrarily choose uxtw, to
preserve the register name.

See https://reviews.llvm.org/D60840 for the LLVM patch.

Differential Revision: https://reviews.llvm.org/D61770

llvm-svn: 360711
2019-05-14 21:44:54 +00:00
Andrey Churbanov 1aaf2a3c18 fixed typo made by commit r360595
llvm-svn: 360602
2019-05-13 17:04:32 +00:00
Andrey Churbanov 7f63e8c0a6 Fixed creation of aliases in Windows build.
Changed file extension of the destination of the copy of libomp.lib
(it was mistakely .dll, now it is .lib) in installation on Windows.

Differential Revision: https://reviews.llvm.org/D61673

llvm-svn: 360595
2019-05-13 16:07:37 +00:00
Alexey Bataev f9e00db818 [OPENMP][NVPTX]Simplify handling of thread limit, NFC.
Summary:
Patch improves performance of the full runtime mode by moving
threads limit counter to the shared memory. It also allows to save
global memory.

Reviewers: grokos, kkwli0, gtbercea

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61801

llvm-svn: 360584
2019-05-13 14:21:46 +00:00
Alexey Bataev f62c266de7 [OPENMP][NVPTX]Improve number of threads counter, NFC.
Summary:
Patch improves performance of the full runtime mode by moving
number-of-threads counter to the shared memory. It also allows to save
global memory.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61785

llvm-svn: 360457
2019-05-10 18:56:05 +00:00
Jonathan Peyton c107332583 [OpenMP] Workaround gfortran bugzilla build bug 41755
This patch provides workaround to allow gfortran to compile the
OpenMP Fortran modules.

From the gfortran manual:
https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gfortran/BOZ-literal-constants.html

"Note that initializing an INTEGER variable with a statement such as
DATA i/Z'FFFFFFFF'/ will give an integer overflow error rather than the desired
result of -1 when i is a 32-bit integer on a system that supports 64-bit
integers. The -fno-range-check option can be used as a workaround for legacy
code that initializes integers in this manner."

Bug filed: https://bugs.llvm.org/show_bug.cgi?id=41755

Differential Revision: https://reviews.llvm.org/D61603

llvm-svn: 360299
2019-05-08 23:12:31 +00:00
Dimitry Andric 181aff63fb Add non-SSE wrapper for __kmp_{load,store}_mxcsr
Summary:
To be able to successfully build OpenMP on FreeBSD/i386, which still
uses i486 as its default processor, I had to provide wrappers for the
`__kmp_load_mxcsr` and `__kmp_store_mxcsr` functions.

If the compiler signals that SSE is not available, loading and storing
mxcsr does not make sense anway, so in that case the inline functions
are empty.  This gives the minimum amount of code churn.

See also https://svnweb.freebsd.org/changeset/base/345283

Reviewers: emaste, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: hfinkel, krytarowski, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60916

llvm-svn: 360062
2019-05-06 17:58:03 +00:00
Alexey Bataev a857e31011 [OPENMP][NVPTX]Improve thread limit counter, NFC.
Summary:
Patch improves performance of the full runtime mode by moving
thread-limit counter to the shared memory. It also allows to save
global memory.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61526

llvm-svn: 359922
2019-05-03 20:00:38 +00:00
Alexey Bataev e031e17919 [OPENMP][NVPTX]Improved several standard OpenMP functions, NFC.
Summary:
Used parallelLevel[] counter to simplify and improve implementation of
the existing standard OpenMP functions. Functions are tested already in
several tests, the patch is NFC.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61459

llvm-svn: 359892
2019-05-03 14:47:20 +00:00
Alexey Bataev 8ccb8f8647 [OPENMP][NVPTX]Improve code by using parallel level counter.
Summary:
Previously for the different purposes we need to get the active/common
parallel level and with full runtime we iterated over all the records to
calculate this level. Instead, we can used the warp-based parallel level
counters used in no-runtime mode.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61395

llvm-svn: 359822
2019-05-02 20:05:01 +00:00
Alexey Bataev 4ad6dbc5fd [OPENMP][NVPTX]Improve omp_get_max_threads() function.
Summary:
Function omp_get_max_threads() can always return 1 if current execution
mode is SPMD.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61379

llvm-svn: 359792
2019-05-02 14:52:52 +00:00
Alexey Bataev 8e6bf88cf7 [OPENMP][NVPTX]Improved omp_get_thread_limit() function.
Summary:
Function omp_get_thread_limit() in SPMD mode can return the maximum
available number of threads as a result.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61378

llvm-svn: 359790
2019-05-02 14:46:32 +00:00
Dimitry Andric 147ce2334c Enable OpenMP build for 32-bit FreeBSD
Summary:
To be able to successfully build OpenMP on 32-bit FreeBSD, such as
FreeBSD/i386, I first had to provide a few wrappers (see D60916), and
then add `KMP_OS_FREEBSD` to the list of defines checked for 32-bit
architectures in `kmp_runtime.cpp`.

I have successfully built libomp.so and ran a bunch of test programs on
FreeBSD/i386 with this.

See also https://svnweb.freebsd.org/changeset/base/345283

Reviewers: emaste, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: krytarowski, guansong, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60917

llvm-svn: 359716
2019-05-01 19:32:58 +00:00
Jonathan Peyton a8426ac8c2 [OpenMP] Implement task modifier for reduction clause
Implemented task modifier in two versions - one without taking into account
omp_orig variable (the omp_orig still can be processed by compiler without help
of the library, but each reduction object will need separate initializer with
global access to omp_orig), another with omp_orig variable included into
interface (single initializer can be used for multiple reduction objects of
the same type). Second version can be used when the omp_orig is not globally
accessible, or to optimize code in case of multiple reduction objects
of the same type.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D60976

llvm-svn: 359710
2019-05-01 17:54:01 +00:00
Jonathan Peyton 71abe28e81 [OpenMP] Add OpenMP 5.0 nonmonotonic code
This patch adds:
* New omp_sched_monotonic flag to omp_sched_t which is handled within the runtime
* Parsing of monotonic/nonmonotonic in OMP_SCHEDULE
* Tests for the monotonic flag and envirable parsing
* Logic to force monotonic when hierarchical scheduling is used

Differential Revision: https://reviews.llvm.org/D60979

llvm-svn: 359601
2019-04-30 19:20:35 +00:00
Jonathan Peyton 1ca746170b [OpenMP] Eliminate some compiler warnings
* Remove accidental == for =
* Assign values to variables to appease compiler
* Surround debug code with KMP_DEBUG
* Remove unused local typedefs

Differential Revision: https://reviews.llvm.org/D60983

llvm-svn: 359599
2019-04-30 19:13:37 +00:00
Alexey Bataev c03fe73176 [OPENMP][NVPTX]Correctly handle L2 parallelism in SPMD mode.
Summary:
The parallelLevel counter must be on per-thread basis to fully support
L2+ parallelism, otherwise we may end up with undefined behavior.
Introduce the parallelLevel on per-warp basis using shared memory. It
allows to avoid the problems with the synchronization and allows fully
support L2+ parallelism in SPMD mode with no runtime.

Reviewers: gtbercea, grokos

Subscribers: guansong, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60918

llvm-svn: 359341
2019-04-26 19:30:34 +00:00
Dimitry Andric 87e7f895bb Use correct way to test for MIPS arch after rOMP355687
Summary:
I ran into some issues after rOMP355687, where __atomic_fetch_add was
being used incorrectly on x86, and this turns out to be caused by the
following added conditionals:

```
#if defined(KMP_ARCH_MIPS)
```

The problem is, these macros are always defined, and are either 0 or 1
depending on the architecture.  E.g. the correct way to test for MIPS
is:

```
#if KMP_ARCH_MIPS
```

Reviewers: petarj, jlpeyton, Hahnfeld, AndreyChurbanov

Reviewed By: petarj, AndreyChurbanov

Subscribers: AndreyChurbanov, sdardis, arichardson, atanasyan, jfb, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60938

llvm-svn: 358911
2019-04-22 19:20:46 +00:00
Alexey Bataev 5de5d74c8d [OPENMP][NVPTX] Fix the test, NFC.
Fix the test to run it really in SPMD mode without runtime. Previously
it was run in SPMD + full runtime mode and does not allow to cehck the
functionality correctly.

llvm-svn: 358902
2019-04-22 17:25:31 +00:00
Andrey Churbanov cf5bdb83b0 Fixed memory leak reported in Bugzilla:
https://bugs.llvm.org/show_bug.cgi?id=41494

Freed th_cg_roots structure at exit from uber thread.

Differential Revision: https://reviews.llvm.org/D60729

llvm-svn: 358572
2019-04-17 10:44:28 +00:00
Alexey Bataev 13532ea623 [OPENMP][NVPTX]Fix dynamic scheduling in L2+ SPMD parallel regions.
Summary:
If the kernel is executed in SPMD mode and the L2+ parallel for region
with the dynamic scheduling is executed, dynamic scheduling functions
are called. They expect full runtime support, but SPMD kernels may be
executed without the full runtime. It leads to the runtime crash of the
compiled program. Patch fixes this problem + fixes handling of the
parallelism level in SPMD mode, which is required as part of this patch.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60578

llvm-svn: 358442
2019-04-15 20:15:20 +00:00
Jonathan Peyton 4f21f5f5ce [OpenMP] Exchange code in asm file for inline assembly
This change replaces some of the assembly functions in z_Linux_asm.S
for inline asm in kmp.h. This allows better interaction with compiler
tools and sanitizers.

Differential Revision: https://reviews.llvm.org/D60423

llvm-svn: 358438
2019-04-15 19:19:57 +00:00
Andrey Churbanov 705384be97 Fixed possible out of bound array access.
The check of index value moved to before the write to the array.

Differential Revision: https://reviews.llvm.org/D60471

llvm-svn: 358181
2019-04-11 15:03:44 +00:00
Jonathan Peyton ebf1830bb1 [OpenMP] Implement 5.0 memory management
* Replace HBWMALLOC API with more general MEMKIND API, new functions
  and variables added.
* Have libmemkind.so loaded when accessible.
* Redirect memspaces to default one except for high bandwidth which
  is processed separately.
* Ignore some allocator traits e.g., sync_hint, access, pinned, while
  others are processed normally e.g., alignment, pool_size, fallback,
  fb_data, partition.
* Add tests for memory management

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D59783

llvm-svn: 357929
2019-04-08 17:59:28 +00:00
Jonathan Peyton feac33ebb0 [OpenMP] Clean up load balancing dynamic mode
This patch cleans up the bookkeeping code for the load balancing dynamic mode.

When a thread is moved to or from the thread pool, the th_active_in_pool flag
and the __kmp_thread_pool_active_nth global counter are both updated. This
removes the need for the corrective code in the main wait loop. Another global
counter, __kmp_thread_pool_nth, was removed completely, as it was only used for
debugging, but was not under KMP_DEBUG.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D59508

llvm-svn: 357927
2019-04-08 17:50:02 +00:00
Dimitry Andric 8f2d1eb9e8 After rL357618, quote ${CMAKE_THREAD_LIBS_INIT} so CMake does not
complain when the variable is empty.  Fixes PR 41401.

llvm-svn: 357828
2019-04-05 22:19:40 +00:00
Jonathan Peyton b727d384a3 [OpenMP] Fix hang on Windows
Debug dump on large machine shows when many OpenMP threads (401 in total)
sleep on a barrier, one of the innermost nesting levels sleeps
on a child's b_arrived flag whose value is equal to 4 and is equal to
checker value. i.e., (1) sleep bit is 0, and (2) done_check() would
return true if called.

It is unclear how this might happen. It could be Windows Server 2016's
error of EnterCriticalSection / LeaveCriticalSection, or
error of WaitForSingleObject / SetEvent / ResetEvent, or
error in the library which is very difficult to find.

As a workaround, change INFINITE wait to timed wait, so that each
thread awakens each 5 seconds (the timeout was chosen arbitrary to not
disturb other threads much), check flag condition under the lock, and
either go to sleep again or stop sleeping as a result of the check.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D59793

llvm-svn: 357722
2019-04-04 20:35:29 +00:00
Jonathan Peyton d2b53cad18 [OpenMP][Stats] Fix stats gathering for distribute and team clause
The distribute clause needs an explicit push of a timer. The teams
clause needs a timer added and also, similarly to parallel, exchanged
with the serial timer when encountered so that serial regions are
counted properly.

Differential Revision: https://reviews.llvm.org/D59801

llvm-svn: 357621
2019-04-03 18:53:26 +00:00
Dimitry Andric 956168c802 Ensure correct pthread flags and libraries are used
On most platforms, certain compiler and linker flags have to be passed
when using pthreads, otherwise linking against libomp.so might fail with
undefined references to several pthread functions.

Use CMake's `find_package(Threads)` to determine these for standalone
builds, or take them (and optionally modify them) from the top-level
LLVM cmake files.

Also, On FreeBSD, ensure that libomp.so is linked against libm.so,
similar to NetBSD.

Adjust test cases with hardcoded `-lpthread` flag to use the common
build flags, which should now have the required pthread flags.

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim, Hahnfeld

Reviewed By: Hahnfeld

Subscribers: AndreyChurbanov, tra, EricWF, Hahnfeld, jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59451

llvm-svn: 357618
2019-04-03 18:11:36 +00:00
Michael Kruse d97d5ebcfa [libomptarget] Introduce LIBOMPTARGET_ENABLE_DEBUG cmake option.
At the moment, support for runtime debug output using the
OMPTARGET_DEBUG=1 environment variable is only available with
CMAKE_BUILD_TYPE=Debug builds. The patch allows setting it independently
using the LIBOMPTARGET_ENABLE_DEBUG option, which is enabled by default
depending on CMAKE_BUILD_TYPE. That is, unless this option is set
explicitly, nothing changes. This is the same mechanism used by LLVM for
LLVM_ENABLE_ASSERTIONS.

This patch also removes adding -g -O0 in debug builds, it should be
handled by cmake's CMAKE_{C|CXX}_FLAGS_DEBUG configuration option.

Idea by Hal Finkel

Differential Revision: https://reviews.llvm.org/D55952

llvm-svn: 356998
2019-03-26 15:19:15 +00:00
Jonathan Peyton 3bc703d538 [OpenMP] Add LLVM license header to file
This file was missing the LLVM license header

llvm-svn: 356962
2019-03-25 22:36:31 +00:00
Jonathan Peyton 7ca09056c7 [OpenMP] Add Intel 19.0 to list of compilers in kmp_version.cpp
llvm-svn: 356961
2019-03-25 22:31:00 +00:00
Dimitry Andric a70da7f29f Fix interoperability test compilation on FreeBSD
Summary:
While building the 8.0 releases on FreeBSD, I encountered the following
error in the regression tests, where ompt/misc/interoperability.cpp
failed to compile, with:

```
projects/openmp/runtime/test/ompt/misc/interoperability.cpp:7:10: fatal error: 'alloca.h' file not found
#include <alloca.h>
         ^~~~~~~~~~
```

Like on NetBSD, alloca(3) is defined in <stdlib.h> instead.

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim

Reviewed By: jlpeyton

Subscribers: jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59736

llvm-svn: 356936
2019-03-25 18:37:49 +00:00
Dimitry Andric dab9ed87c6 Fix gettid warnings on FreeBSD
Summary:
[Split off from D59451 to get this fix in separately]

While building the 8.0 releases on FreeBSD, I encountered the following
warnings in openmp quite a few times:

```
In file included from projects/openmp/runtime/src/kmp_settings.cpp:27:
projects/openmp/runtime/src/kmp_wrapper_getpid.h:35:2: warning: #warning is a language extension [-Wpedantic]
#warning No gettid found, use getpid instead
 ^
projects/openmp/runtime/src/kmp_wrapper_getpid.h:35:2: warning: No gettid found, use getpid instead [-W#warnings]
2 warnings generated.
```

I added a gettid wrapper that uses FreeBSD's pthread_getthreadid_np(3)
function for this.

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim

Reviewed By: jlpeyton

Subscribers: jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59735

llvm-svn: 356934
2019-03-25 18:37:14 +00:00
Jonathan Peyton 61708b1e94 [OpenMP] Fix pause check with version info
Add 5.0 guard to pause code for now.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D59428

llvm-svn: 356933
2019-03-25 18:17:55 +00:00
Jonathan Peyton 6622732d9a [OpenMP] Fix OMPT cancellation test for GOMP
The GOMP sections interface uses schedule(dynamic) dispatch so it cannot
be assumed which thread executes the cancel and which thread executes
the cancellation point.  This patch allows either thread to execute either
section.

llvm-svn: 356302
2019-03-15 21:24:45 +00:00
Jonathan Peyton 5af1c22d0b [OpenMP] Add missing parenthesis in Perl module
llvm-svn: 356289
2019-03-15 18:27:14 +00:00
Jonathan Peyton 44b476c141 [OpenMP] Remove deprecated taskq
Remove very old, unused, and deprecated taskq code.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58989

llvm-svn: 356288
2019-03-15 18:24:59 +00:00
Jonathan Peyton 529e0d2ea4 [OpenMP][stats] Update stats gathering macros
llvm-svn: 355739
2019-03-08 21:23:34 +00:00
Petar Jovanovic bc3cda1526 [mips] Use libatomic instead of GCC intrinsics for 64bit
The following GCC intrinsics are not available on MIPS32:

__sync_fetch_and_add_8
__sync_fetch_and_and_8
__sync_fetch_and_or_8
__sync_val_compare_and_swap_8

Replace these with appropriate libatomic implementation.

Patch by Miodrag Dinic.

Differential Revision: https://reviews.llvm.org/D45691

llvm-svn: 355687
2019-03-08 10:53:19 +00:00
Shoaib Meenai 5be71faf4b [build] Rename clang-headers to clang-resource-headers
Summary:
The current install-clang-headers target installs clang's resource
directory headers. This is different from the install-llvm-headers
target, which installs LLVM's API headers. We want to introduce the
corresponding target to clang, and the natural name for that new target
would be install-clang-headers. Rename the existing target to
install-clang-resource-headers to free up the install-clang-headers name
for the new target, following the discussion on cfe-dev [1].

I didn't find any bots on zorg referencing install-clang-headers. I'll
send out another PSA to cfe-dev to accompany this rename.

[1] http://lists.llvm.org/pipermail/cfe-dev/2019-February/061365.html

Reviewers: beanz, phosek, tstellar, rnk, dim, serge-sans-paille

Subscribers: mgorny, javed.absar, jdoerfert, #sanitizers, openmp-commits, lldb-commits, cfe-commits, llvm-commits

Tags: #clang, #sanitizers, #lldb, #openmp, #llvm

Differential Revision: https://reviews.llvm.org/D58791

llvm-svn: 355340
2019-03-04 21:19:53 +00:00
Stefan Pintilie a908829bf5 [OPENMP] Deal with additional store inserted by Clang under -fno-PIC for PowerPC.
Changing the default from -fPIC to -fno-PIC on PowerPC exposed an issue in
OpenMP for PowerPC.
The issue is reported here:
https://bugs.llvm.org/show_bug.cgi?id=40082

This is a fix for that issue.
Also removed the XFAIL from the two tests that were failing under -fno-PIC.

Differential Revision: https://reviews.llvm.org/D56286

llvm-svn: 355229
2019-03-01 21:16:45 +00:00
Jonathan Peyton ad1ad7ae8b [OpenMP][OMPT] Distinguish different barrier kinds
This change makes the runtime decide the intended use of each barrier
invocation, for the OMPT synchronization tool callbacks.  The OpenMP 5.0
specification defines four possible barrier kinds -- implicit, explicit,
implementation, and just normal barrier.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D58247

llvm-svn: 355140
2019-02-28 20:55:39 +00:00
Jonathan Peyton 76b45e874d [OpenMP 5.0] Deprecate nest-var and associated features
Nest-var, OMP_NESTED, omp_set_nested()., and omp_get_nested() have been
deprecated in the 5.0 spec. Initial nesting info is now derived from
OMP_MAX_ACTIVE_LEVELS, OMP_NUM_THREADS, and OMP_PROC_BIND.

This patch deprecates the internal ICV that corresponds to nest-var, and
replaces it with the max-active-levels-var ICV to determine nesting. The
change still allows for use of OMP_NESTED (according to 5.0 changes),
omp_get_nested, and omp_set_nested, which have had deprecation messages
added to them. The change allows certain settings of OMP_NUM_THREADS,
OMP_PROC_BIND, and OMP_MAX_ACTIVE_LEVELS to turn on nesting, but
OMP_NESTED=0 will still force nesting to be off.

The runtime now prints informative messages about deprecation of
OMP_NESTED, omp_set_nested(), and omp_get_nested(), when those
environment variables or routines are used. It also prints deprecated
message in output for KMP_SETTINGS and OMP_DISPLAY_ENV for OMP_NESTED.
This patch also fixes OMP_DISPLAY_ENV output for OMP_TARGET_OFFLOAD.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58408

llvm-svn: 355138
2019-02-28 20:47:21 +00:00
Jonathan Peyton e47d32f165 [OpenMP] Make use of sched_yield optional in runtime
This patch cleans up the yielding code and makes it optional. An
environment variable, KMP_USE_YIELD, was added. Yielding is still
on by default (KMP_USE_YIELD=1), but can be turned off completely
(KMP_USE_YIELD=0), or turned on only when oversubscription is detected
(KMP_USE_YIELD=2). Note that oversubscription cannot always be detected
by the runtime (for example, when the runtime is initialized and the
process forks, oversubscription cannot be detected currently over
multiple instances of the runtime).

Because yielding can be controlled by user now, the library mode
settings (from KMP_LIBRARY) for throughput and turnaround have been
adjusted by altering blocktime, unless that was also explicitly set.

In the original code, there were a number of places where a double yield
might have been done under oversubscription. This version checks
oversubscription and if that's not going to yield, then it does
the spin check.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58148

llvm-svn: 355120
2019-02-28 19:11:29 +00:00
Jonas Hahnfeld db3025ad57 [OpenMP] Fix check-openmp after r354553
Calling add_openmp_testsuite will add the tests to check-openmp unless
EXCLUDE_FROM_ALL is set. This is problematic because the tests for OMPT
will be included twice which doesn't work if the same test is executed
concurrently by multiple threads.

See:
http://lab.llvm.org:8011/builders/openmp-gcc-x86_64-linux-debian/builds/163
http://lab.llvm.org:8011/builders/openmp-clang-x86_64-linux-debian/builds/184

http://lab.llvm.org:8011/builders/openmp-clang-ppc64le-linux-rhel/builds/133
(On PPC some failures are unrelated to r354553, the bot has been red before
and this commit is not expected to fix that. For a proper patch please see
https://reviews.llvm.org/D56286.)

llvm-svn: 354572
2019-02-21 12:00:57 +00:00
Joachim Protze 8b96fad85c [OpenMP][OMPT] Fix locking testcases for 32 bit architectures
Fix for the bug reported in:
https://bugs.llvm.org/show_bug.cgi?id=40531

The address is now casted the same way as in the runtime code.

Differential Revision: https://reviews.llvm.org/D58454

llvm-svn: 354553
2019-02-21 08:50:49 +00:00
Gheorghe-Teodor Bercea 06e08f0b0a [OpenMP][libomptarget] New reduction scheme for team reductions
Summary:
This patch adds a more sophisticated team reduction scheme to the OpenMP libomptarget-nvptx runtime.

The scheme uses a fixed size global memory buffer whose length can be adjusted via compiler flag:
```
-fopenmp-cuda-teams-reduction-recs-num=1024
```
The global buffer is a structure of arrays (with default size of 1024 each and controlled by the above flag), one array for each reduction variable.

Values in the buffer are processed by the last team to finish executing the body of the target region.

In addition to adding support for the new flag, the compiler also emits special functions used for the reduction of the intermediate reduction values. These changes will be added in a separate compiler patch following this one.




Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D58409

llvm-svn: 354471
2019-02-20 14:55:55 +00:00
Jonathan Peyton 7d2cfa1fd5 [OpenMP] Remove XFAIL for cancellation tests using gcc
llvm-svn: 354370
2019-02-19 19:00:29 +00:00
Jonathan Peyton 154ac075cd [OpenMP 5.0] Add omp_get_supported_active_levels()
This patch adds the new 5.0 API function omp_get_supported_active_levels().

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58211

llvm-svn: 354368
2019-02-19 18:51:11 +00:00
Jonathan Peyton 4fe5271fa0 [OpenMP] Adding GOMP compatible cancellation
Remove fatal error messages from the cancellation API for GOMP
Add __kmp_barrier_gomp_cancel() to implement cancellation of parallel regions.
This new function uses the linear barrier algorithm with a cancellable
nonsleepable wait loop.

Differential Revision: https://reviews.llvm.org/D57969

llvm-svn: 354367
2019-02-19 18:47:57 +00:00
Jonathan Peyton 511092cab0 [OpenMP] Fix broken link to browse sources
llvm-svn: 353858
2019-02-12 17:00:57 +00:00
Jonathan Peyton 2f744592a0 [OpenMP] Remove accidental commit to config-ix.cmake in r353747
llvm-svn: 353748
2019-02-11 21:09:15 +00:00
Jonathan Peyton 65ebfeecf8 [OpenMP] Fix thread_limits to work properly for teams construct
The thread-limit-var and omp_get_thread_limit API was not perfectly handled for
teams construct. Now, when modified by thread_limit clause, omp_get_thread_limit
reports the correct value. In addition, the value is restored when leaving the
teams construct to what it was in the encountering context.

This is done partly by creating the notion of a Contention Group root (CG root)
that keeps track of the thread at the root of each separate CG, the
thread-limit-var associated with the CG, and associated counter of active
threads within the contention group.

thread-limits are passed from master to worker threads via an entry in the ICV
data structure. When a "contention group switch" occurs, a new CG root record is
made and passed from master to worker. A thread could potentially have several
CG root records if it encounters multiple nested teams constructs (but at the
moment the spec doesn't allow for nested teams, so the most one could have
currently is 2). The master of the teams masters gets the thread-limit clause
value stored to its local ICV structure, and the other teams masters copy it
from the master. The thread-limit is set from that ICV copy and restored to the
ICV copy when entering and leaving the teams construct.

This change also fixes a bug when the top-level teams construct team gets
reused, and OMP_DYNAMIC was true, which can cause the expected size of this team
to be smaller than what was actually allocated. The fix updates the size of the
team after its threads were reserved.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D56804

llvm-svn: 353747
2019-02-11 21:04:23 +00:00
Jonas Hahnfeld f26d3e7185 [OMPT] Remove test output from source tree
%s refers to the test file in the source tree. This was accidentally added in
r351197 / 2b46d30 ("[OMPT] Second chunk of final OMPT 5.0 interface updates").

Differential Revision: https://reviews.llvm.org/D58002

llvm-svn: 353715
2019-02-11 16:14:51 +00:00
Taewook Oh 91c32fd8c8 Guard a feature that unsupported by old GCC
Summary:
As @david2050 commented, changes introduced by https://reviews.llvm.org/D56397 break builds for older compilers
which don't support `__has(_cpp)_attribute`. This is a fix for the break.

Reviewers: protze.joachim, jlpeyton, AndreyChurbanov, Hahnfeld, david2050

Subscribers: openmp-commits, david2050

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D57851

llvm-svn: 353538
2019-02-08 17:15:50 +00:00
Joachim Protze 0c599c388d [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
The three switch fallthrough generate a warning with -Wimplicit-fallthrough.
Two are documented as fallthrough, one is not, but I think the intention is to also fallthrough in kmp_tasking.cpp.

Not sure whether kmp.h is the best place to define the macro.

Reviewers: jlpeyton, AndreyChurbanov, Hahnfeld

Reviewed By: jlpeyton

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D56397

llvm-svn: 353052
2019-02-04 15:59:42 +00:00
Joachim Protze 32959e683a [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
Redo after revert by hans. The wrong include in one test is fixed.

Make sure that OMPT is enabled in runtime entry points that access internals
of the runtime. Else, return an appropiate value indicating an error or that
the data is not available.

Patch provided by @sconvent

Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze

Reviewed By: joachim.protze

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 352611
2019-01-30 08:41:06 +00:00
James Y Knight 5d71fc5d7b Adjust documentation for git migration.
This fixes most references to the paths:
 llvm.org/svn/
 llvm.org/git/
 llvm.org/viewvc/
 github.com/llvm-mirror/
 github.com/llvm-project/
 reviews.llvm.org/diffusion/

to instead point to https://github.com/llvm/llvm-project.

This is *not* a trivial substitution, because additionally, all the
checkout instructions had to be migrated to instruct users on how to
use the monorepo layout, setting LLVM_ENABLE_PROJECTS instead of
checking out various projects into various subdirectories.

I've attempted to not change any scripts here, only documentation. The
scripts will have to be addressed separately.

Additionally, I've deleted one document which appeared to be outdated
and unneeded:
  lldb/docs/building-with-debug-llvm.txt

Differential Revision: https://reviews.llvm.org/D57330

llvm-svn: 352514
2019-01-29 16:37:27 +00:00
Arnaud A. de Grandmaison f185823668 Remove no longer needed Arm specific words in the LICENSE.txt file.
As the codebase is now under the Apache 2.0 license with LLVM
Exceptions, and all Arm's contributions, past or future, are under that
new license, this Arm specific words in LICENSE.txt are no longer
needed.

llvm-svn: 352377
2019-01-28 15:42:58 +00:00
Andrey Churbanov efa6b826b4 NFC: fixed formatting to be consistent across the file
llvm-svn: 351748
2019-01-21 16:11:43 +00:00
Andrey Churbanov b8e3643506 Fixed https://reviews.llvm.org/D55078 broken Fortran fixed form.
Long lines split in order to obey Fortran fixed form compilation.

Differential Revision: https://reviews.llvm.org/D57017

llvm-svn: 351745
2019-01-21 15:30:31 +00:00
Chandler Carruth 4a1b95bda0 Fix typos throughout the license files that somehow I and my reviewers
all missed!

Thanks to Alex Bradbury for pointing this out, and the fact that I never
added the intended `legacy` anchor to the developer policy. Add that
anchor too. With hope, this will cause the links to all resolve
successfully.

llvm-svn: 351731
2019-01-21 09:52:34 +00:00
Chandler Carruth 57b08b0944 Update more file headers across all of the LLVM projects in the monorepo
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351648
2019-01-19 10:56:40 +00:00
Chandler Carruth 469bdefd44 Install new LLVM license structure and new developer policy.
This installs the new developer policy and moves all of the license
files across all LLVM projects in the monorepo to the new license
structure. The remaining projects will be moved independently.

Note that I've left odd formatting and other idiosyncracies of the
legacy license structure text alone to make the diff easier to read.
Critically, note that we do not in any case *remove* the old license
notice or terms, as that remains necessary until we finish the
relicensing process.

I've updated a few license files that refer to the LLVM license to
instead simply refer generically to whatever license the LLVM project is
under, basically trying to minimize confusion.

This is really the culmination of so many people. Chris led the
community discussions, drafted the policy update and organized the
multi-year string of meeting between lawyers across the community to
figure out the strategy. Numerous lawyers at companies in the community
spent their time figuring out initial answers, and then the Foundation's
lawyer Heather Meeker has done *so* much to help refine and get us ready
here. I could keep going on, but I just want to make sure everyone
realizes what a huge community effort this has been from the begining.

Differential Revision: https://reviews.llvm.org/D56897

llvm-svn: 351631
2019-01-19 06:14:24 +00:00
Hans Wennborg 799b5dcbda Revert r351311 "[OMPT] Make sure that OMPT is enabled when accessing internals of the runtime"
and also the follow-up r351315.

The new test is failing on the buildbots.

> Make sure that OMPT is enabled in runtime entry points that access internals
> of the runtime. Else, return an appropiate value indicating an error or that
> the data is not available.
>
> Patch provided by @sconvent
>
> Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze
>
> Reviewed By: joachim.protze
>
> Tags: #openmp, #ompt
>
> Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 351431
2019-01-17 11:31:03 +00:00
Jonathan Peyton 9b8bb323c9 [OpenMP] Add omp_pause_resource* API
Add omp_pause_resource and omp_pause_resource_all API and enum, plus stub for
internal implementation. Implemented callable helper function to do local pause,
and added basic functionality for hard and soft pause.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D55078

llvm-svn: 351372
2019-01-16 20:07:39 +00:00
Joachim Protze c46bd682ac [OpenMP] Output written by tests should go to build directory
llvm-svn: 351332
2019-01-16 13:06:10 +00:00
Joachim Protze 6b840ccea9 [OpenMP] Remove compiler warning about unused value
The compiler warns about an unused variable/statement:

    runtime/src/kmp_affinity.cpp:4958:18: warning: statement has no effect [-Wunused-value]
       KA_TRACE(1000, ; {
                      ^
    runtime/src/kmp_debug.h:84:24: note: in definition of macro 'KA_TRACE'
         __kmp_debug_printf x;                                                      \
                            ^

Instead of the unused reference to this function, this patch now calls the function
with an empty string. The call to this function should have no effect.

Patch provided by joachim.protze

Reviewers: jlpeyton, hbae, AndreyChurbanov

Reviewed By: AndreyChurbanov

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D56775

llvm-svn: 351323
2019-01-16 11:35:11 +00:00
Joachim Protze c3716617df Fix compiler error in r351311
llvm-svn: 351315
2019-01-16 09:39:42 +00:00
Joachim Protze 582b183dda [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
Make sure that OMPT is enabled in runtime entry points that access internals
of the runtime. Else, return an appropiate value indicating an error or that
the data is not available.

Patch provided by @sconvent

Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze

Reviewed By: joachim.protze

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 351311
2019-01-16 08:58:17 +00:00
Jonathan Peyton 9355d0dc13 [OpenMP] Fix for nested proc_bind affinity bug
Using proc_bind clause on a nested #pragma omp parallel region
with KMP_AFFINITY set causes an assertion error. This assertion occurs because
the place-partition-var is not properly initialized in the nested master threads.
Trying to get an intuitive result with KMP_AFFINITY + proc_bind is difficult
because of how the KMP_AFFINITY gtid-to-place mapping occurs. This
patch creates an initial place list no matter what affinity mechanism is used.
For KMP_AFFINITY, the place-partition-var is initialized to all the places.

Differential Revision: https://reviews.llvm.org/D55795

llvm-svn: 351227
2019-01-15 19:39:32 +00:00
Jonathan Peyton fce3972553 [OpenMP] Add lock function definitions to fix Bug 40042
This change fixes the sanity issue reported in Bug 40042.
Lock function definitions for the three lock kinds were added
to disambiguate calls to the lock functions done directly and indirectly.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40042
Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D56103

llvm-svn: 351224
2019-01-15 19:14:00 +00:00
Jonathan Peyton 1c268554ba [OpenMP][Cmake] Allowed OpenMP testing detect test compiler with same generator
Fix ninja build detect test compiler failed under windows.

Patch by Peiyuan Song

Differential Revision: https://reviews.llvm.org/D53479

llvm-svn: 351223
2019-01-15 19:08:26 +00:00
Jonathan Peyton dc375486b0 [OpenMP] Fix performance regression in SPEC kdtree test
Make __ompt_implicit_task_end a static function and remove the inline part.  Remove
pId variable that is unused.  This fixes small regression in SPEC kdtree benchmark.
Also reformat some of __ompt_implicit_task_end.

Differential Revision: https://reviews.llvm.org/D55788

llvm-svn: 351221
2019-01-15 18:57:24 +00:00
Joachim Protze 2b46d30fc7 [OMPT] Second chunk of final OMPT 5.0 interface updates
The omp-tools.h file is generated from the OpenMP spec to ensure that the interface
is implemented as specified.
The other changes are necessary to update the interface implementation to the
final version as published in 5.0.
The omp-tools.h header was previously called ompt.h, currently a copy under this name
is installed for legacy tools.

Patch partially perpared by @sconvent

Reviewers: AndreyChurbanov, hbae, Hahnfeld

Reviewed By: hbae

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D55579

llvm-svn: 351197
2019-01-15 15:36:53 +00:00
Hans Wennborg eb60fbfdb4 Update year in license files
In last year's update (D48219) it was suggested that the release manager
might want to do this, so here we go.

llvm-svn: 351194
2019-01-15 15:10:32 +00:00
Roman Lebedev 06e3950561 [OpenMP] Fix LIBOMP_USE_DEBUGGER=ON build (PR38612)
Summary:
Two things:
1. Those two variables had the wrong sigdness, which was resulting in "sign mismatch in comparison" warning.
2. The whole `kmp_debugger.cpp` wasn't being built, or rather, it was being built as-if `USE_DEBUGGER` was off,
   thus, nothing provided the definition of `__kmp_omp_debug_struct_info`, `__kmp_debugging`.
   Makes sense, because `USE_DEBUGGER` is set in `kmp_config.h`, which is not included explicitly.
   It is included by `kmp.h`, but that one is only included inside of the `#if USE_DEBUGGER` block..
   I *think* this is the only source file with this issue,
   everything else seem to `#include` either `kmp.h` or `kmp_config.h`.
   The alternative solution would be to add `add_compile_options(-include kmp_config.h)` in CMake.

I did verify that `__kmp_omp_debug_struct_info` becomes available with this patch.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=38612 | PR38612 ]].

Reviewers: AndreyChurbanov, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: guansong, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55783

llvm-svn: 351019
2019-01-13 12:54:34 +00:00
Gheorghe-Teodor Bercea 1653633a1c [OpenMP][libomptarget] Use shared memory variable for tracking parallel level
Summary: Replace existing infrastructure for tracking parallel level using global memory with a per-team shared memory variable. This minimizes the impact of the overhead of tracking the parallel level for non-nested cases.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D55773

llvm-svn: 350747
2019-01-09 18:30:14 +00:00
Andrey Churbanov b7a8ab3417 Doc: fixed description of a parameter of the __kmpc_taskloop
Patch by sergi.mateo.bellido@gmail.com

Differential Revision: https://reviews.llvm.org/D56432

llvm-svn: 350713
2019-01-09 13:06:23 +00:00
Alexey Bataev 26e6c86b79 [OPENMP][NVPTX]Fix dynamic scheduling.
Summary:
Previous implementation may cause the runtime crash when the number of
teams is > 1024. Patch fixes this problem + reduces number of the atomic
operations by 32 times.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, openmp-commits, caomhin

Differential Revision: https://reviews.llvm.org/D56332

llvm-svn: 350524
2019-01-07 14:25:25 +00:00
Alexey Bataev 6b3153ada0 [OPENMP][NVPTX]General formatting/code improvement, NFC.
Summary: Formatting.

Reviewers: gtbercea, grokos, kkwli0

Subscribers: guansong, openmp-commits, caomhin

Differential Revision: https://reviews.llvm.org/D56290

llvm-svn: 350431
2019-01-04 20:16:54 +00:00
Alexey Bataev dcf2edcdf5 [OPENMP][NVPTX]Improve performance + reduce number of used registers.
Summary:
Reduced number of the used register + improved performance propagating
the information about current execution/data sharing mode directly from
the compiler, where it is possible.
In some cases, it requires new/reworked interfaces of the runtime
external functions. Old functions are marked as deprecated.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, openmp-commits, caomhin

Differential Revision: https://reviews.llvm.org/D56278

llvm-svn: 350405
2019-01-04 17:09:12 +00:00
Joel E. Denny f17f7a5d4d [OpenMP] Fix nvidia-cuda-toolkit detection on Debian/Ubuntu
The OpenMP runtime's cmake scripts do not correctly locate the
libdevice that the Debian/Ubuntu package nvidia-cuda-toolkit currently
includes, at least on my Ubuntu 18.04.1 installation.  This patch
fixes that for me.

This problem was discussed at length in D55269.  D40453 added a
similar adjustment in clang, but reviewers of D55269 concluded that,
for the OpenMP runtime, the right place to address this problem is in
cmake's CUDA support.  However, it was also suggested we could add a
workaround to OpenMP's cmake scripts now.  This patch contains such a
workaround, which I've tried to design so that it will have no harmful
effect if cmake improves in the future.

nvidia-cuda-toolkit also needs improvements because its intended
monolithic CUDA tree shim, /usr/lib/cuda, has many empty directories,
such as bin.  I reported that at:

<https://bugs.launchpad.net/ubuntu/+source/nvidia-cuda-toolkit/+bug/1808999>

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D55588

llvm-svn: 350377
2019-01-04 02:07:13 +00:00
Jonathan Peyton 76f3980a20 [OpenMP] Add omp_get_device_num() and update several other device API functions
Add omp_get_device_num() function for 5.0 which returns the number of the
device the current thread is running on. Currently, we are leaving it to the
compiler to handle this properly if it is called inside target.

Also, did some cleanup and updating of duplicate device API functions (in both
libomp and libomptarget) to make them into weak functions that check for the
symbol from libomptarget, and will call the version in libomptarget if it is
present. If any additional device API functions are implemented also in
libomptarget in the future, we should add the dlsym calls to the host functions.
Also, if the omp_target_* functions are to be implemented for the host (this has
been requested), they should attempt to call the libomptarget versions as well.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D55578

llvm-svn: 350352
2019-01-03 21:14:19 +00:00
Alexey Bataev 3c74be8049 [OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.
Summary:
One of the LLVM optimizations, split critical edges, also clones tail
instructions. This is a dangerous operation for __syncthreads()
functions and this transformation leads to undefined behavior or
incorrect results. Patch fixes this problem by replacing __syncthreads()
function with the assembler instruction, which cost is too high and
wich cannot be copied.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, openmp-commits, caomhin

Differential Revision: https://reviews.llvm.org/D56274

llvm-svn: 350333
2019-01-03 17:43:46 +00:00
Vyacheslav Zakharin e889ac7e6b [libomptarget] Added install component for libomptarget
Differential Revision: https://reviews.llvm.org/D56108

llvm-svn: 350254
2019-01-02 19:39:49 +00:00
Alexey Bataev d1cd005ec5 [OPENMP][NVPTX]Added/fixed debugging messages, NFC.
Summary: Added or fixed new/old debugging messages for the better diagnostics.

Reviewers: gtbercea, kkwli0, grokos

Reviewed By: grokos

Subscribers: caomhin, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D56102

llvm-svn: 350137
2018-12-28 21:36:09 +00:00
Alexey Bataev 28eccf5ba0 [OPENMP][NVPTX]Fixed initialization of the data-sharing interface.
Summary:
Avoid using of the atomic loop to wait for the completion of the
data-sharing interface initialization, use __shfl_sync instead for the
communication within the warp to signal other threads in the warp about
completion of the initialization.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, jfb, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D56100

llvm-svn: 350129
2018-12-28 17:31:06 +00:00
Alexey Bataev 1708858dbd [OPENMP][NVPTX]Outline assert into noinline function, NFC.
Summary:
At high optimization level asserts lead to some unexpected results
because of auto-inserted unreachable instructions. This outlining
prevents some of such dangerous optimizations and leads to better
stability.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D56101

llvm-svn: 350128
2018-12-28 17:29:47 +00:00
Michal Gorny a70184ba92 [runtime] [test] Fix using %python path
Fix the newly-added tests to use %python substitution in order to use
the correct path to Python interpreter.  Otherwise, they fail on NetBSD
where there is no 'python', just 'pythonX.Y'.

Differential Revision: https://reviews.llvm.org/D56048

llvm-svn: 350001
2018-12-22 10:51:53 +00:00
Stefan Pintilie 4230f91aa2 [Tests] [OpenMP] XFAIL also for ppc64le.
Two tests were XFAILed for powerpc64le in r349512.
They should have also been XFAILed for ppc64le.

llvm-svn: 349521
2018-12-18 19:05:07 +00:00
Stefan Pintilie ea79468b41 XFAIL Pair of OpenMP Tests for PowerPC LE Linux
XFAIL two tests that fail on PowerPC LE Linux due
to the change of default from PIC to no-PIC on that
platform.

A Bug has been opened for this:
https://bugs.llvm.org/show_bug.cgi?id=40082

The tests are:
runtime/test/ompt/misc/control_tool.c
runtime/test/ompt/synchronization/taskwait.c

llvm-svn: 349512
2018-12-18 17:39:22 +00:00
Joachim Protze cf80e72e30 [Tests] fix non-determinism failure in testcase
llvm-svn: 349460
2018-12-18 08:57:23 +00:00
Joachim Protze 0e0d6cdd58 [OMPT] First chunk of final OMPT 5.0 interface updates
This patch updates the implementation of the ompt_frame_t, ompt_wait_id_t
and ompt_state_t. The final version of the OpenMP 5.0 spec added the "t"
for these types.
Furthermore the structure for ompt_frame_t changed and allows to specify
that the reenter frame belongs to the runtime.

Patch partially prepared by Simon Convent

Reviewers: hbae
llvm-svn: 349458
2018-12-18 08:52:30 +00:00
Joachim Protze 1f7d4aca8d [OMPT] Add testcase for thread_num provided by implicit task events
llvm-svn: 349457
2018-12-18 08:52:12 +00:00
Jonathan Peyton fca3ac543e [OpenMP] version the affinity format tests and fix one test
llvm-svn: 349412
2018-12-17 22:53:47 +00:00
Jonathan Peyton 5640556b55 [OpenMP] Add affinity format tests
llvm-svn: 349411
2018-12-17 22:33:21 +00:00
Roman Lebedev 781a0896b0 [OpenMP] Fixes for LIBOMP_OMP_VERSION=45/40
Summary:
I have discovered this because i wanted to experiment with
building static libomp (with openmp-4.0 support only)
for debugging purposes.

There are three kinds of problems here:
1. `__kmp_compare_and_store_acq()` simply does not exist.
   It was added in D47903 by @jlpeyton.
   I'm guessing `__kmp_atomic_compare_store_acq()` was meant.
2. In `__kmp_is_ticket_lock_initialized()`,
   `lck->lk.initialized` is `std::atomic<bool>`,
   while `lck` is `kmp_ticket_lock_t *`.
   Naturally, they can't be equality-compared.
   Either, it should return the value read from `lck->lk.initialized`,
   or do what `__kmp_is_queuing_lock_initialized()` does,
   compare the passed pointer with the field in the struct
   pointed by the pointer. I think the latter is correct-er choice here.
3. Tests were not versioned.
   They assume that `LIBOMP_OMP_VERSION` is at the latest version.

This does not touch LIBOMP_OMP_VERSION=30. That is still broken.

Reviewers: jlpeyton, Hahnfeld, AndreyChurbanov

Reviewed By: AndreyChurbanov

Subscribers: guansong, jfb, openmp-commits, jlpeyton

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55496

llvm-svn: 349260
2018-12-15 09:23:39 +00:00
Jonathan Peyton bdb0a2ffaa [OpenMP] Fix transient divide by zero bug in 32-bit code
The value returned by __kmp_now_nsec() can overflow 32-bit values causing
incorrect values to be returned. The overflow can end up causing a divide
by zero error because in __kmp_initialize_system_tick(), the value
(__kmp_now_nsec() - nsec) can end up being much larger than the numerator:
1e6 * (delay + (now - goal))
during a pathological timing where the current time calculated is much larger
than nsec. When this happens, the value of __kmp_ticks_per_msec is set to zero
which is then used as the denominator in the KMP_NOW_MSEC() macro leading to
the divide by zero error.

Differential Revision: https://reviews.llvm.org/D55300

llvm-svn: 349090
2018-12-13 23:18:55 +00:00
Jonathan Peyton 6d88e049dc [OpenMP] Implement OpenMP 5.0 affinity format functionality
This patch adds the affinity format functionality introduced in OpenMP 5.0.
This patch adds: Two new environment variables:

OMP_DISPLAY_AFFINITY=TRUE|FALSE
OMP_AFFINITY_FORMAT=<string>
and Four new API:
1) omp_set_affinity_format()
2) omp_get_affinity_format()
3) omp_display_affinity()
4) omp_capture_affinity()
The affinity format functionality has two ICV's associated with it:
affinity-display-var (bool) and affinity-format-var (string).
The affinity-display-var enables/disables the functionality through the
envirable OMP_DISPLAY_AFFINITY. The affinity-format-var is a formatted
string with the special field types beginning with a '%' character
similar to printf
For example, the affinity-format-var could be:
"OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}"

The affinity-format-var is displayed by every thread implicitly at the beginning
of a parallel region when any thread's affinity has changed (including a brand
new thread being spawned), or explicitly using the omp_display_affinity() API.
The omp_capture_affinity() function can capture the affinity-format-var in a
char buffer. And omp_set|get_affinity_format() allow the user to set|get the
affinity-format-var explicitly at runtime. omp_capture_affinity() and
omp_get_affinity_format() both return the number of characters needed to hold
the entire string it tried to make (not including NULL character). If not
enough buffer space is available,
both these functions truncate their output.

Differential Revision: https://reviews.llvm.org/D55148

llvm-svn: 349089
2018-12-13 23:14:24 +00:00
Andrey Churbanov 74f98554f9 Fix for bugzilla https://bugs.llvm.org/show_bug.cgi?id=39970
Broken tests fixed

Differential Revision: https://reviews.llvm.org/D55598

llvm-svn: 349017
2018-12-13 10:04:10 +00:00
Michal Gorny 8876dac50a [runtime] Disable KMP_HAVE_QUAD on NetBSD gcc
Disable KMP_HAVE_QUAD when building via gcc on NetBSD system,
as the build fails due to unimplemented builtins:

  .../kmp_atomic.cpp.o: In function `__kmpc_atomic_cmplx16_mul':
  .../kmp_atomic.cpp:1332: undefined reference to `__multc3'
  .../kmp_atomic.cpp.o: In function `__kmpc_atomic_cmplx16_div':
  .../kmp_atomic.cpp:1334: undefined reference to `__divtc3'
  ...

Differential Revision: https://reviews.llvm.org/D55478

llvm-svn: 348886
2018-12-11 19:02:14 +00:00
Michal Gorny 70cdd83cd6 [runtime] Use getloadavg() on NetBSD as well
Switch NetBSD from reading /proc (which is broken) to getloadavg()
(which is already used by Darwin).  NetBSD discourages using procfs
in favor of system API calls.

Differential Revision: https://reviews.llvm.org/D55486

llvm-svn: 348885
2018-12-11 19:02:09 +00:00
Kamil Rytarowski 316f423876 Implement __kmp_is_address_mapped() for NetBSD
Summary:
Use the sysctl(3) function to check whether an address is mapped
into the address space.

Reviewers: mgorny, joerg, #openmp

Reviewed By: mgorny

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55549

llvm-svn: 348874
2018-12-11 18:35:07 +00:00
Kamil Rytarowski 98bdf1f21d Implement __kmp_gettid() for NetBSD
Summary: _lwp_self() returns current Thread Id in a numeric version on NetBSD.

Reviewers: joerg, mgorny, #openmp

Reviewed By: mgorny

Subscribers: llvm-commits, openmp-commits, #openmp

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55497

llvm-svn: 348873
2018-12-11 18:34:33 +00:00
Michal Gorny 276df88154 [test] [runtime] Permit omp_get_wtick() to return 0.01
Increase the range for omp_get_wtick() test to allow for 0.01
(from <0.01).  This is needed for NetBSD where it returns exactly that
value due to CLOCKS_PER_SEC being 100.  This should not cause
a significant difference from e.g. FreeBSD where it is 128,
and especially from Linux where CLOCKS_PER_SEC is apparently meaningless
and sysconf(_SC_CLK_TCK) gives 100 as well.

Differential Revision: https://reviews.llvm.org/D55493

llvm-svn: 348857
2018-12-11 15:39:34 +00:00
Michal Gorny 3815b9f5f9 [test] [runtime] Do not include alloca.h on NetBSD
On NetBSD, alloca() is in stdlib.h and there is no alloca.h.  Adjust
the includes appopriately.

Differential Revision: https://reviews.llvm.org/D55487

llvm-svn: 348856
2018-12-11 15:39:30 +00:00
Michal Gorny 7bbc1a782f [runtime] [test] Use more portable short options to sort(1)
Pass `-n -s` instead of `--numeric --stable` to sort(1), as long options
are not supported by NetBSD sort implementation.  `-n` is defined
by POSIX, so it should be fully portable.  `-s` is used consistently
at least in GNU sort and FreeBSD sort, and I honestly doubt it would
cause issues with any other implementation supporting `--stable`.

Differential Revision: https://reviews.llvm.org/D55479

llvm-svn: 348855
2018-12-11 15:39:26 +00:00
Michal Gorny e9d4267277 [cmake] Use -std=gnu++11 to fix alloca() on NetBSD
Prefer using '-std=gnu++11' over '-std=c++11' when available, as NetBSD
exposes the correct alloca() implementation only with gnu* C/C++
standards.

Differential Revision: https://reviews.llvm.org/D55477

llvm-svn: 348854
2018-12-11 15:39:22 +00:00
Jonathan Peyton 17e53b9299 [OpenMP] Fix a few build issues
Fix two build issues:

1) Recent commit 348756 accidentally included Unix clang compilers
   to use immintrin.h when only clang-cl should be using it leading
   to the following error:

openmp-llvm/runtime/src/kmp_lock.cpp:2035:25: error: always_
inline function '_xbegin' requires target feature 'rtm', but would be inlined into function
      '__kmp_test_adaptive_lock_only' that is compiled without support for 'rtm'
          kmp_uint32 status = _xbegin();
This patch changes the guard to use immintrin.h to only use clang-cl instead of all clang

2) gcc-8 gives a warning about multiline comment in kmp_runtime.cpp:
This patch just changes it to a two line comment
openmp-llvm/runtime/src/kmp_runtime.cpp:7697:8: warning: multi-line comment [-Wcomment]
 #endif // KMP_OS_LINUX || KMP_OS_DRAGONFLY || KMP_OS_FREEBSD || KMP_OS_NETBSD  \

llvm-svn: 348783
2018-12-10 18:26:50 +00:00
Alexey Bataev 9056f1116d [OPENMP][NVPTX]Revert __kmpc_shuffle_int64 to its original form.
Summary:
Use the original shuffle implementation for __kmpc_shuffle_int64 since
default implementation uses the same implementation.

Reviewers: gtbercea

Subscribers: guansong, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D55514

llvm-svn: 348772
2018-12-10 16:50:36 +00:00
Alexey Bataev cc6cf64c38 [OPENMP][NVPTX]Enable fast shuffles on 64bit values only if CUDA >= 9.
Summary:
Shuffle on 64bit data is allowed only for CUDA >= 9.0. Also, fixed the
constant for the mask, need one extra L in the end.

Reviewers: gtbercea, kkwli0

Subscribers: guansong, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D55440

llvm-svn: 348758
2018-12-10 14:29:05 +00:00
Andrey Churbanov f700e9ed8c Support clang compiling under windows-gnu and windows-msvc
Patch by Peiyuan Song <squallatf@gmail.com>

Differential Revision: https://reviews.llvm.org/D53422

llvm-svn: 348756
2018-12-10 13:45:00 +00:00
Kamil Rytarowski 7e1ea993e0 Add OpenBSD support to OpenMP
Summary: This patch permits OpenMP to build and work (with both gcc and clang) on OpenBSD. It mostly follows what was done for FreeBSD and NetBSD, except OpenBSD does not have pthread_getattr_np support, so it follows OS X in that one instance.

Reviewers: #openmp, krytarowski

Reviewed By: krytarowski

Subscribers: guansong, jfb, emaste, mgorny, krytarowski, #openmp

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D34280

llvm-svn: 348726
2018-12-09 16:46:48 +00:00
Kamil Rytarowski a56ac949ec Add DragonFlyBSD support to OpenMP
Summary:
Additions mostly follow FreeBSD and NetBSD and are not intrusive.
There is similar patch for OpenBSD: https://reviews.llvm.org/D34280

The -lm was being omitted due to -Wl,--as-needed in cmake rule, similar patch is in freebsd-ports/devel/llvm-devel port.

Simple OpenMP programs compile and work as expected:
$ clang-devel ~/omp_hello.c -fopenmp -I/usr/local/llvm-devel/include
$ LD_LIBRARY_PATH=/usr/local/llvm-devel/lib OMP_NUM_THREADS=100 ./a.out

The assertion in LLVMgold.so when -fopenmp was used together with -flto in 20170524 snapshot is no longer triggered on current svn-trunk and works fine as in llvm-4.0 with our local patches.

Reviewers: #openmp, krytarowski

Reviewed By: krytarowski

Subscribers: dexonsmith, jfb, krytarowski, guansong, gregrodgers, emaste, mgorny, mehdi_amini

Differential Revision: https://reviews.llvm.org/D35129

llvm-svn: 348725
2018-12-09 16:40:33 +00:00
Alexey Bataev 8acafff404 [OPENMP][NVPTX]Save registers for optimized builds with enabled logging.
Summary:
Introduced special noinline function log that allows to save some
registers for optimized builds but with enabled logging. Also, it
increases the stability of the optimized builds with inlined runtime.

Reviewers: gtbercea, kkwli0

Reviewed By: gtbercea

Subscribers: caomhin, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D55436

llvm-svn: 348606
2018-12-07 16:08:29 +00:00
Alexey Bataev 653e8ba79a [OPENMP][NVPTX]Correct type casting for printf args + simplified shfl64 function.
Summary:
Explicitly casted printf's args to the required types + simplified
shfl64 function.

Reviewers: gtbercea, kkwli0

Subscribers: guansong, jfb, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D55379

llvm-svn: 348521
2018-12-06 19:45:48 +00:00
Alexey Bataev 5442f3e549 [OPENMP][NVPTX]Fix __kmpc_flush to flush the memory per system, not per block.
Summary:
According to the standard, after memory flushing the changes in the
memory must be visible to all the threads in all teams. Patch fixes
this.

Reviewers: gtbercea, kkwli0

Subscribers: guansong, jfb, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D55370

llvm-svn: 348491
2018-12-06 15:27:58 +00:00
Gheorghe-Teodor Bercea 10b2e60b7e [OpenMP][libomptarget] Flush intermediate values during team reduction
Summary: Ensure intermediate values of a team reduction are flushed to memory.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, jfb, openmp-commits

Differential Revision: https://reviews.llvm.org/D55219

llvm-svn: 348148
2018-12-03 15:21:49 +00:00
Alexey Bataev 0f221f53d8 [OPENMP][NVPTX]Make runtime compatible with the original runtime.
Summary:
Reworked runtime to make it compatible with the requirements of the
original runtime library. Also, simplified some code to reduce number of
function calls.

Reviewers: gtbercea, kkwli0

Subscribers: guansong, jfb, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D55130

llvm-svn: 348003
2018-11-30 16:52:38 +00:00
Jonathan Peyton bfe427bf41 Revert r347799: Add omp_get_device_num() and update other device API
There is a conflict between libomptarget and libomp concerning some of the
standard OpenMP device API which needs further intestigation.

llvm-svn: 347932
2018-11-29 23:56:14 +00:00
Jonathan Peyton b04f7d681a [OpenMP] Add stubs for Task affinity API
This patch adds __kmpc_omp_reg_task_with_affinity to register affinity
information for tasks. For now, the affinity information is not used,
and the function always succeeds. This also adds the kmp_task_affinity_info_t
structure to store the task affinity information.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D55026

llvm-svn: 347907
2018-11-29 20:04:29 +00:00
Jonathan Peyton 1742eced55 [OpenMP] Rename ompt_mutex_impl_unknown to ompt_mutex_impl_none
This change renames ompt_mutex_impl_unknown to ompt_mutex_impl_none,
following the name change in the specification.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D54347

llvm-svn: 347802
2018-11-28 20:19:53 +00:00
Jonathan Peyton 1ce776ea2f [OpenMP] Minor cleanup of debug code
* Fix calculation of string length.
* Remove NULL-check of pointer which has been dereferenced.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D54948

llvm-svn: 347801
2018-11-28 20:18:06 +00:00
Jonathan Peyton f4c0720ad0 [OpenMP] Fixed possible array out of bound access
There is low probability that array th_hot_teams can be
accessed out of bound (when many nested levels are requested
to keep hot teams via KMP_HOT_TEAMS_MAX_LEVEL). The patch
adds the check of index that fixes the problem.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D54950

llvm-svn: 347800
2018-11-28 20:15:11 +00:00
Jonathan Peyton a17318b89b [OpenMP] Add omp_get_device_num() and update several other device API functions
Add omp_get_device_num() function for 5.0 which returns the number of the device
the current thread is running on. Also, did some cleanup and updating of device
API functions to make them into weak functions that should be replaced with
libomptarget functions when libomptarget is present.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D54342

llvm-svn: 347799
2018-11-28 20:10:26 +00:00
Gheorghe-Teodor Bercea 31c1589ab0 [OpenMP][libomptarget] Add new version of SPMD deinit kernel function with argument
Summary: To enable the compiler to optimize parts of the function that are not needed when runtime can be omitted, a new version of the SPMD deinit kernel function is needed. This function takes the runtime required flag as an argument.

Reviewers: ABataev, kkwli0, caomhin

Reviewed By: ABataev

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D54969

llvm-svn: 347714
2018-11-27 21:23:40 +00:00
Alexey Bataev d4de439cf4 [OPENMP][NVPTX]Basic support for reductions across the teams.
Summary:
Added functions __kmpc_nvptx_teams_reduce_nowait_simple and
__kmpc_nvptx_teams_end_reduce_nowait_simple to implement basic support
for reductions across the teams.

Reviewers: gtbercea, kkwli0

Subscribers: guansong, jfb, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D54967

llvm-svn: 347710
2018-11-27 21:06:09 +00:00
Gheorghe-Teodor Bercea ad8632a9ba [OpenMP][libomptarget] Refactor SPMD and runtime requirement checking
Summary: Refactor the checking for SPMD mode and whether the runtime is initialized or not. This uses constant flags which enables the runtime to optimize out unused sections of code that depend on these flags.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, jfb, openmp-commits

Differential Revision: https://reviews.llvm.org/D54960

llvm-svn: 347698
2018-11-27 19:45:10 +00:00
Alexey Bataev 8ab0924ab4 [OPENMP][NVPTX]Improved lock/critical constructs.
Summary: Improved support for critical constructs + omp_..._lock... constructs.

Reviewers: gtbercea, kkwli0, caomhin

Subscribers: guansong, jfb, openmp-commits

Differential Revision: https://reviews.llvm.org/D54766

llvm-svn: 347342
2018-11-20 20:19:36 +00:00
Andrey Churbanov 82318c6f14 Fix for bugzilla https://bugs.llvm.org/show_bug.cgi?id=39137.
Do not write to internal structure if it keeps same value.

Differential Revision: https://reviews.llvm.org/D54305

llvm-svn: 346862
2018-11-14 13:49:41 +00:00
Alexey Bataev 15ab891e68 [OPENMP]Make lambda mapping follow reqs for PTR_AND_OBJ mapping.
Summary:
The base pointer for the lambda mapping must point to the lambda capture
placement and pointer must point to the captured variable itself. Patch
fixes this problem.

Reviewers: gtbercea

Subscribers: guansong, openmp-commits, kkwli0, caomhin

Differential Revision: https://reviews.llvm.org/D54260

llvm-svn: 346407
2018-11-08 15:47:30 +00:00
Andrey Churbanov 855d09855d Add Hurd support.
Patch by samuel.thibault@ens-lyon.org

Differential Revision: https://reviews.llvm.org/D54079

llvm-svn: 346310
2018-11-07 12:27:38 +00:00
Andrey Churbanov c334434550 Implementation of OpenMP 5.0 mutexinoutset task dependency type.
Differential Revision: https://reviews.llvm.org/D53380

llvm-svn: 346307
2018-11-07 12:19:57 +00:00
Alexey Bataev 9476ca7db9 [OPENMP][OFFLOADING]Change the lambda capturing flags.
Summary:
The previously used combination `PTR_AND_OBJ | PRIVATE` could be used
for mapping of some data in Fortran. Changed it to `PTR_AND_OBJ |
  LITERAL`.

Reviewers: gtbercea

Subscribers: guansong, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D54035

llvm-svn: 345981
2018-11-02 15:24:47 +00:00
Alexey Bataev 463e9f3224 [OPENMP][NVPTX]Fixed/improved support for globalization in team contexts.
Summary:
Current globalization scheme works correctly only for SPMD+lightweight
runtime mode and does not work for full runtime. Patch improves support
for the globalization scheme + reduces global memory consumption in
  lightweight runtime mode.
Patch adds runtime functions to work with the statically allocated
global memory. It allows to improve performance and memory consumption.
This global memory must be allocated by the compiler.

Reviewers: grokos, kkwli0, gtbercea, caomhin

Subscribers: guansong, jfb, openmp-commits

Differential Revision: https://reviews.llvm.org/D53943

llvm-svn: 345976
2018-11-02 14:43:23 +00:00
Gheorghe-Teodor Bercea b10bacf122 [OpenMP][libomptarget] Add runtime function for pushing coalesced global records
Summary: In the case of coalesced global records, we need to push the exact data size passed in. This patch fixes this by outlining the common functionality of the previous push function and by adding a separate entry point for coalesced pushes. The pop function remains unchanged.

Reviewers: ABataev, grokos, caomhin

Reviewed By: ABataev, grokos

Subscribers: jholewinski, cfe-commits, Hahnfeld, guansong, jfb, openmp-commits

Differential Revision: https://reviews.llvm.org/D53141

llvm-svn: 345867
2018-11-01 18:08:12 +00:00
Alexey Bataev e5369885dd [LIBOMPTARGET] Add support for mapping of lambda captures.
Summary:
Added support for correct mapping of variables captured by reference in
lambdas. That kind of mapping may appear only in target-executable
regions and must follow the original lambda or another lambda capture
for the same lambda.
The expected data: base address - the address of the lambda, begin
pointer - pointer to the address of the lambda capture, size - size of
the captured variable.
When OMP_TGT_MAPTYPE_PTR_AND_OBJ mapping type is seen in
target-executable region, the target address of the last processed item
is taken as the address of the original lambda `tgt_lambda_ptr`. Then,
the pointer to capture on the device is calculated like `tgt_lambda_ptr
+ (host_begin_pointer - host_begin_base)` and the target-based address
of the original variable (which host address is
`*(void**)begin_pointer`) is written to that pointer.

Reviewers: kkwli0, gtbercea, grokos

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D51107

llvm-svn: 345608
2018-10-30 15:42:12 +00:00
Andrey Churbanov 6ca3609418 remove duplicate omp_control_tool export to fix windows build
Patch by squallatf@gmail.com

Differential Revision: https://reviews.llvm.org/D53480

llvm-svn: 345255
2018-10-25 11:04:01 +00:00
Jonathan Peyton 8b3842fc99 [OpenMP] Convert KMP_DYNAMIC_LIB to a 0 or 1 guard everywhere
llvm-svn: 343869
2018-10-05 17:59:39 +00:00
Jonathan Peyton f194033316 [OpenMP] Fix KMP_DYNAMIC_LIB to be dependent on LIBOMP_ENABLE_SHARED
The KMP_DYNAMIC_LIB guard was hard set to 1. This patch has the guard depend
on CMake variable LIBOMP_ENABLE_SHARED.

llvm-svn: 343866
2018-10-05 17:47:58 +00:00
Jonathan Peyton 3574f28709 [OpenMP][OMPT] Fix unsafe initialization of ompt_data_t objects
Initializing an ompt_data_t object using the pointer union member is potentially
unsafe in 32-bit programs.  This change fixes the issue
by using the constant, ompt_data_none.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D52046

llvm-svn: 343785
2018-10-04 14:57:04 +00:00
Jonathan Peyton 8bb8a92de9 [OpenMP] Shutdown library on Windows if possible for better OMPT behavior
On Windows, child workers are terminated by the parent during the normal
program exit process (ExitProcess()) and they are not able to finish generating
their OpenMP events. We can force manual library shut down in __kmpc_end() to
fix this at least for the cases where __kmpc_end() is properly inserted.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D52628

llvm-svn: 343619
2018-10-02 19:15:04 +00:00
Jonas Hahnfeld a762bfc03a [libomptarget-nvptx] Enable asserts in bclib
If the user requested LIBOMPTARGET_NVPTX_DEBUG, include asserts in
the bitcode library. Everything else will have very unpleasent
effects because asserts will appear when falling back to the static
library libomptarget-nvptx.a.

Differential Revision: https://reviews.llvm.org/D52701

llvm-svn: 343477
2018-10-01 14:16:55 +00:00
Jonas Hahnfeld a1100e6b9a [libomptarget-nvptx] reduction: Determine if runtime uninitialized
Pass in the correct value of isRuntimeUninitialized() which solves
parallel reductions as reported on the mailing list.
For reference: r333285 did the same for loop scheduling.

Differential Revision: https://reviews.llvm.org/D52725

llvm-svn: 343476
2018-10-01 14:14:26 +00:00
Andrey Churbanov df60b37226 Fixed workaround made in https://reviews.llvm.org/D51694.
Patch suggested by Kelvin Li: removed optional "kind=" part of kind-selector
for variables with long names and kind names.

Differential Revision: https://reviews.llvm.org/D52712

llvm-svn: 343475
2018-10-01 14:08:50 +00:00
Jonas Hahnfeld 1bf767fb8e [libomptarget-nvptx] Align data sharing stack
NVPTX requires addresses of pointer locations to be 8-byte aligned
or there will be an exception during runtime.
This could happen without this patch as shown in the added test:
getId() requires 4 byte of stack and putValueInParallel() uses 16
bytes to store the addresses of the captured variables.

Differential Revision: https://reviews.llvm.org/D52655

llvm-svn: 343402
2018-09-30 09:23:21 +00:00
Jonas Hahnfeld 067235f227 [libomptarget-nvptx] Fix ancestor_thread_num and team_size (non-SPMD)
According to OpenMP 4.5, p250:12-14:

    If the requested nest level is outside the range of 0 and the
    nest level of the current thread, as returned by the omp_get_level
    routine, the routine returns -1.

The SPMD code path will need a similar fix.

Differential Revision: https://reviews.llvm.org/D51787

llvm-svn: 343401
2018-09-30 09:23:14 +00:00
Jonas Hahnfeld fb1b80191e [libomptarget-nvptx] Add tests for nested parallelism
Clang trunk will serialize nested parallel regions. Check that this
is correctly reflected in various API methods.

Differential Revision: https://reviews.llvm.org/D51786

llvm-svn: 343382
2018-09-29 16:02:32 +00:00
Jonas Hahnfeld c89a14f5d2 [libomptarget-nvptx] Ignore calls to dynamic API
There is no support and according to the OpenMP 4.5, p238:7-9:

    For implementations that do not support dynamic adjustment
    of the number of threads this routine has no effect: the
    value of dyn-var remains false.

Add a test that cancellation and nested parallelism aren't
supported either.

Differential Revision: https://reviews.llvm.org/D51785

llvm-svn: 343381
2018-09-29 16:02:25 +00:00
Jonas Hahnfeld a743c04412 [libomptarget-nvptx] Fix number of threads in parallel
If there is no num_threads() clause we must consider the
nthreads-var ICV. Its value is set by omp_set_num_threads()
and can be queried using omp_get_max_num_threads().
The rewritten code now closely resembles the algorithm given
in the OpenMP standard.

Differential Revision: https://reviews.llvm.org/D51783

llvm-svn: 343380
2018-09-29 16:02:17 +00:00
Alexey Bataev 418af6f6cf [OPENMP] Add the test to check that the libomptarget does not cause
infinite loop on removing non-mapped pointer-with-object.

Added test to check that libomptarget does not cause infinite loop when
trying to unmap the pointer-with-object data that was not previously
mapped.

llvm-svn: 343344
2018-09-28 17:13:11 +00:00
Jonas Hahnfeld 122dbb5dce [libomptarget-nvptx] Add testing infrastructure
This patch also introduces testing for libomptarget-nvptx
which has been missing until now. I propose to add tests for
all bugs that are fixed in the future.
The target check-libomptarget-nvptx is not run by default because
 - we can't determine if there is a GPU plugged into the system.
 - it will require the latest Clang compiler. Keeping compatibility
   with older releases would prevent testing newer code generation
   developed in trunk.

Differential Revision: https://reviews.llvm.org/D51687

llvm-svn: 343324
2018-09-28 15:05:43 +00:00
Jonathan Peyton 83e360a427 [OpenMP] Add missing __kmpc_critical_with_hint to dllexports
This patch puts the __kmpc_critical_with_hint function in dllexports
and also replaces some OMP_45_ENABLED to OMP_50_ENABLED

Differential Revision: https://reviews.llvm.org/D52380

llvm-svn: 343143
2018-09-26 20:47:25 +00:00
Jonathan Peyton e525f0d4e2 [OpenMP] Fix balanced affinity so thread's private affinity mask is updated
Balanced affinity only updated the thread's affinity with the operating system.
This change also has the thread's private mask reflect that change as well so
that any API that probes the thread's affinity mask will report the correct
mask value.

Differential Revision: https://reviews.llvm.org/D52379

llvm-svn: 343142
2018-09-26 20:43:23 +00:00
Jonathan Peyton 985f152f25 [OpenMP] Update ittnotify sources
This patch updates the ittnotify sources to the latest
corresponding with Intel(R) VTune(TM) Amplifier 2018

Differential Revision: https://reviews.llvm.org/D52378

llvm-svn: 343139
2018-09-26 20:30:00 +00:00
Jonathan Peyton cf27e31bdd [OpenMP] Fix performance issue from 376.kdtree
This change improves the performance of 376.kdtree by giving the compiler an
opportunity to do inlining and other optimizations for the call path,
__kmpc_omp_task_complete_if0()->__kmp_task_finish(), which is one of the hot
paths in the program; some functions in kmp_taskdeps.cpp were moved to the new
header file, kmp_taskdeps.h to achieve this.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D51889

llvm-svn: 343138
2018-09-26 20:24:39 +00:00
Jonathan Peyton 60eec6fecb [OpenMP][OMPT] A few improvements
This change includes miscellaneous improvements as follows:
1) Added ompt_get_proc_id() implementation for Windows
2) Added parser and print tool for omp-tool-var, just in case it needs
   to be printed (OMP_DISPLAY_ENV)
3) omp_control_tool is exported on Windows

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D50538

llvm-svn: 343137
2018-09-26 20:19:44 +00:00
Gheorghe-Teodor Bercea f7256a593f [OpenMP][libomptarget] Set the frame pointer then test empty slot condition
Summary: NFC - just fixing a bug: the empty slot test was before the re-setting of the Stack pointer. 

Reviewers: ABataev, caomhin, Hahnfeld

Reviewed By: ABataev

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D52122

llvm-svn: 343006
2018-09-25 18:48:14 +00:00
Gheorghe-Teodor Bercea 9bc3bfffb4 [OpenMP][libomptarget] Simplify warp master selection for data sharing
Summary:
There is currently no supported situation where the warp master is not the first thread in the warp.

This also avoids the device execution from hanging on Volta GPUs when ballot_sync is called by a number of threads that is less that the size of a warp.


Reviewers: ABataev, caomhin, grokos

Reviewed By: grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D50188

llvm-svn: 342972
2018-09-25 13:23:32 +00:00
Alexey Bataev 022bf16b41 [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime.
Summary:
We need the support for per-team shared variables to support codegen for
lastprivates/reductions. Patch adds this support by using shared memory
if the total size of the reductions/lastprivates is <= 128 bytes,
then  pre-allocated buffer in global memory if size is <= 4K bytes,or
uses malloc/free, otherwise.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D51875

llvm-svn: 342737
2018-09-21 14:11:41 +00:00
Alexey Bataev 06b6e0f406 [OPENMP]Increment iterator when the loop is continued.
Summary:
Missed operation of the incrementing iterator when required just to
continue execution.

Reviewers: kkwli0, gtbercea, grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D51937

llvm-svn: 341964
2018-09-11 17:16:26 +00:00
Joachim Protze 489cdb783a [OMPT] Update types according to TR7
Some types and callback signatures have changed from TR6 to TR7.
Major changes (only adding signatures and stubs):
(-remove idle callback) done by D48362
-add reduction and dispatch callback
-add get_task_memory and finalize_tool runtime entry points
-ompt_invoker_t  becomes ompt_parallel_flag_t
-more types of sync_regions

Patch provided by Simon Convent

Reviewers: hbae, protze.joachim

Differential Revision: https://reviews.llvm.org/D50774

llvm-svn: 341834
2018-09-10 14:34:54 +00:00
Jonas Hahnfeld dc79c7187c [libomptarget-nvptx] Remove last mentions of __kmpc_print_*
Their implementation was removed during review, delete their
prototype declarations.

llvm-svn: 341748
2018-09-08 12:10:19 +00:00
Jonathan Peyton 08f0180ba9 [OpenMP] Update copyright to 2018
Better late than never

llvm-svn: 341703
2018-09-07 20:33:35 +00:00
Jonathan Peyton a2f6eff488 [OpenMP] Change hint parameter type for critical to uint32_t
Add atomic hint flags to the enum.
The hint parameter type was changed to uint32_t in __kmpc_critical_with_hint()

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D51235

llvm-svn: 341694
2018-09-07 18:46:40 +00:00
Jonathan Peyton 2ff302d5d7 [OpenMP] Synchronization hint constants added to headers
ident flags reserved for atomic hints.
This patch adds omp_sync_hint_t to omp.h and omp_sync_hint_kind to omp_lib.h.
For better maintainability the list of macros for ident flags was replaced with
a enum. The new KMP_IDENT_ATOMIC_HINT_MASK was added to the enum to
support possible future atomic hints.

Also fix omp_lib.h.var to be under 72 chars again after 5.0 OpenMP Memory commit

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D51233

llvm-svn: 341693
2018-09-07 18:45:13 +00:00
Jonathan Peyton 92ca61884b [OpenMP] Initial implementation of OMP 5.0 Memory Management routines
Implemented omp_alloc, omp_free, omp_{set,get}_default_allocator entries,
and OMP_ALLOCATOR environment variable.

Added support for HBW memory on Linux if libmemkind.so library is accessible
(dynamic library only, no support for static libraries).
Only used stable API (hbwmalloc) of the memkind library
though we may consider using experimental API in future.

The ICV def-allocator-var is implemented per implicit task similar to
place-partition-var.  In the absence of a requested allocator, the uses the
default allocator.

Predefined allocators (the only ones currently available) are made similar
for C and Fortran, - pointers (long integers) with values 1 to 8.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D51232

llvm-svn: 341687
2018-09-07 18:25:49 +00:00
Andrey Churbanov d946778b9f Fix for https://bugs.llvm.org/show_bug.cgi?id=38839:
Changed style of declarations to be less than 72 char each.

Differential Revision: https://reviews.llvm.org/D51694

llvm-svn: 341653
2018-09-07 12:22:04 +00:00
Jonas Hahnfeld 21e3ee0afe [libomptarget] Remove two unneeded includes, NFCI.
Follow-up to r340542 and r340767.

llvm-svn: 341563
2018-09-06 17:00:57 +00:00
Jonas Hahnfeld f27dcf01d2 [libomptaret][test] Announce compiler features
This is a follow-up to r341371: The new test for PR38704 doesn't
work with Clang 6.0. It uses an UNSUPPORTED: clang-6, but that
hasn't worked because the compiler features weren't known to lit.

llvm-svn: 341448
2018-09-05 07:26:00 +00:00
Sergey Dmitriev b4dc69ff80 [libomptarget] Remove `Devices` from `RTLInfoTy`
This patch removes unused field `Devices` from `RTLInfoTy`.

Differential Revision: https://reviews.llvm.org/D51653

llvm-svn: 341399
2018-09-04 20:23:09 +00:00
Jonas Hahnfeld bb51d39871 [libomptarget][CUDA] Use cuDeviceGetAttribute, NFCI.
cuDeviceGetProperties has apparently been deprecated since CUDA 5.0.
Nvidia started using annotations only in CUDA 9.2, so nobody noticed
nor cared before.
The new function returns the same values, tested with a P100.

Differential Revision: https://reviews.llvm.org/D51624

llvm-svn: 341372
2018-09-04 15:13:28 +00:00
Jonas Hahnfeld f7f86971e6 [libomptarget] PR38704: Fix erase of ShadowPtrMap
erase() invalidates the iterator and returns a new one pointing
to the following element. The code now follows the example at
https://en.cppreference.com/w/cpp/container/map/erase.
(The added testcase crashes without this patch.)

Reported by David Binderman (https://llvm.org/PR38704)!

Differential Revision: https://reviews.llvm.org/D51623

llvm-svn: 341371
2018-09-04 15:13:23 +00:00
Jonas Hahnfeld 82d20201d0 [libomptarget][NVPTX] Drop dead code and data structures, NFCI.
* cg and HasCancel in WorkDescr were never read and can be removed.
 * This eliminates the last use of priv in ThreadPrivateContext.
 * CounterGroup is unused afterwards.
 * Remove duplicate external declares in omptarget-nvptx.cu that are
   already in the header omptarget-nvptx.h.

Differential Revision: https://reviews.llvm.org/D51622

llvm-svn: 341370
2018-09-04 15:13:17 +00:00
Jonas Hahnfeld 96c13488ab [libomptarget][NVPTX] Fix __kmpc_spmd_kernel_deinit
If the runtime is uninitialized the master thread must Enqueue the
state object, and ALL threads must return immediately.
Found post-commit of https://reviews.llvm.org/D51222.

llvm-svn: 341328
2018-09-03 17:24:23 +00:00
Alexey Bataev 39a4724095 [OPENMP][NVPTX] Replace assert() by ASSERT0() macro, NFC.
Required to fix the buildbots.

llvm-svn: 340956
2018-08-29 19:22:06 +00:00
Alexey Bataev b7a5d38cf5 [OPENMP][NVPTX] Lightweight runtime support for SPMD mode.
Summary:
Implemented simple and lightweight runtime support for SPMD mode-based
constructs. It adds support for L2 sequential parallelism wihtout full
runtime support. Also, patch fixes some use cases for
uninitialized|lightweight runtime.

Reviewers: grokos, kkwli0, Hahnfeld, gtbercea

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D51222

llvm-svn: 340944
2018-08-29 17:35:09 +00:00
Gheorghe-Teodor Bercea 15f5407d92 [OpenMP][Fix] Conditional compilation leaves variables unused
Summary: Prevent variables from being left unused by conditional compilation.

Reviewers: ABataev, grokos, Hahnfeld, caomhin, protze.joachim

Reviewed By: Hahnfeld

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D51303

llvm-svn: 340771
2018-08-27 19:54:26 +00:00
Alexandre Eichenberger e9b7d8dcd6 [OpenMP][libomptarget] rework of fatal error reporting
Summary:
Removed the function that used a lock and varargs
Used the same mechanism as for debug messages

Reviewers: ABataev, gtbercea, grokos, Hahnfeld

Reviewed By: gtbercea, Hahnfeld

Subscribers: mikerice, ABataev, RaviNarayanaswamy, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D51226

llvm-svn: 340767
2018-08-27 18:20:15 +00:00
Gheorghe-Teodor Bercea 353adf437d [OpenMP][Fix] Ensure comparison between unsigned values.
Summary: Ensure the values being compared are both unsigned.

Reviewers: ABataev, Hahnfeld, caomhin, grokos, AndreyChurbanov

Reviewed By: AndreyChurbanov

Subscribers: AndreyChurbanov, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D51301

llvm-svn: 340745
2018-08-27 14:52:20 +00:00
Jonathan Peyton 2a966e84ce [OpenMP] Remove deprecated/obsolete MIC attributes from headers
llvm-svn: 340656
2018-08-24 21:34:10 +00:00
Jonathan Peyton 2c3e5d82b4 [OpenMP] Fixed affinity verbose double printing for balanced type.
llvm-svn: 340647
2018-08-24 20:35:42 +00:00
Jonathan Peyton a4a9c48c78 [OpenMP] Fix tasking bug for decreasing hot team nthreads
The __kmp_execute_tasks_template() function reads the task_team and
current_task from the thread structure. There appears to be a pathological
timing where the number of threads in the hot team decreases and so a
thread is put in the pool via __kmp_free_thread(). It could be the case that:
1) A thread reads th_task_team into task_team local variables
       and is then interrupted by the OS
2) Master frees the thread and sets current task and task team to NULL
3) The thread reads current_task as NULL

When this happens, current_task is dereferenced and a segfault occurs.
This patch just checks for current_task to not be NULL as well.

Differential Revision: https://reviews.llvm.org/D50651

llvm-svn: 340632
2018-08-24 18:07:35 +00:00
Jonathan Peyton ca10a76f08 [OpenMP] Add check for hot_teams array
If hot teams are not being used, this code could seg fault without the added
check, and does so when composability is used in conjunction with nesting.
The fix prevents the segfault.

Differential Revision: https://reviews.llvm.org/D50649

llvm-svn: 340629
2018-08-24 18:05:00 +00:00
Jonathan Peyton b1b221c82c [OpenMP] Fix incorrect barrier imbalance reporting in ITTNOTIFY
Exclude nested explicit tasks from timing, only outer level explicit task
counted and its time added to barrier arrive time for the thread.

Differential Revision: https://reviews.llvm.org/D50584

llvm-svn: 340628
2018-08-24 18:03:27 +00:00
Alexandre Eichenberger 1b4a666ba5 [OpenMP][libomptarget] Bringing up to spec with respect to OMP_TARGET_OFFLOAD env var
Summary:
Right now, only the OMP_TARGET_OFFLOAD=DISABLED was implemented. Added support for the other MANDATORY and DEFAULT values.


Reviewers: gtbercea, ABataev, grokos, caomhin, Hahnfeld

Reviewed By: Hahnfeld

Subscribers: protze.joachim, gtbercea, AlexEichenberger, RaviNarayanaswamy, Hahnfeld, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D50522

llvm-svn: 340542
2018-08-23 16:22:42 +00:00
Joachim Protze e1a04b4659 [OMPT] Remove OMPT idle callback
The idle callback was removed from the spec as of TR7.
This removes it from the implementation.

Patch provided by Simon Convent

Reviewers: hbae, protze.joachim

Differential Revision: https://reviews.llvm.org/D48362

llvm-svn: 339771
2018-08-15 13:54:28 +00:00
Jonathan Peyton a3f6d4c5b8 [OMPT] Make omp_control_tool() compliant when called from Fortran programs
This change fixes an incorrect behavior of the omp_control_tool function when
called from Fortran applications.  A tool callback function for this event is
supposed to get NULL for the third argument according to the specification, but
the current implementation just passes a garbage value. A possible fix is to use
the OPTIONAL attribute for the third argument.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D50565

llvm-svn: 339585
2018-08-13 17:26:18 +00:00
Jonathan Peyton baad3f6016 [OpenMP] Cleanup code
This patch cleans up unused functions, variables, sign compare issues, and
addresses some -Warning flags which are now enabled including -Wcast-qual.
Not all the warning flags in LibompHandleFlags.cmake are enabled, but some
are with this patch.

Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions
which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and
KMP_ASSERT() macros which used the comma operator. This had to be done for the
innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT()

Differential Revision: https://reviews.llvm.org/D49105

llvm-svn: 339393
2018-08-09 22:04:30 +00:00
Jonathan Peyton 821649229e [OpenMP] Fix doacross testing for gcc
This patch adds a test using the doacross clauses in OpenMP and removes gcc from
testing kmp_doacross_check.c which is only testing the kmp rather than the
gomp interface.

Differential Revision: https://reviews.llvm.org/D50014

llvm-svn: 338757
2018-08-02 19:13:07 +00:00
Jonas Hahnfeld ef8f737288 [OMPT] Disable by default on Windows
This is broken per PR36561 and PR36574, so disable it for now until
somebody interested can take a look. OMPT can still be activated manually
by passing -DLIBOMP_OMPT_SUPPORT=ON during configuration.

Differential Revision: https://reviews.llvm.org/D50086

llvm-svn: 338721
2018-08-02 14:34:08 +00:00
Jonas Hahnfeld 5b57eb4b09 [tests] Add annotations for taskloop features
Only supported since GCC 6 and Intel 17.0. However GCC 6.3.0 is
crashing on two of the tests, so disable them as well...

Differential Revision: https://reviews.llvm.org/D50085

llvm-svn: 338720
2018-08-02 14:34:03 +00:00
Joachim Protze 935399d254 [OMPT,tests] Fix taskloop testcase scheduling effects
The taskloop testcase had scheduling effects. Tasks of the taskloop would
sometimes be scheduled before all task were created. The testing is now
split into two phases. First, the task creation on the master is tested,
than the scheduling events of the tasks are tested. Thus, the order of
creation and scheduling events is irrelavant.

Patch by Simon Convent

Reviewed by: protze.joachim, Hahnfeld

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D50140

llvm-svn: 338580
2018-08-01 16:15:18 +00:00
Jonas Hahnfeld 51fc3cc628 [test] Convert test for PR36720 to c89
GCC 4.8.5 defaults to this old C standard. I think we should make the
tests pass a newer -std=c99|c11 but that's too intrusive for now...

Differential Revision: https://reviews.llvm.org/D50084

llvm-svn: 338490
2018-08-01 06:26:55 +00:00
Jonathan Peyton 28226e7d64 [OpenMP] Fix tasking + parallel bug
From the bug report, the runtime needs to initialize the nproc variables
(inside middle init) for each root when the task is encountered, otherwise,
a segfault can occur.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36720

Differential Revision: https://reviews.llvm.org/D49996

llvm-svn: 338313
2018-07-30 21:47:56 +00:00
Gheorghe-Teodor Bercea f729df821a [OpenMP] Fix new task creation
Summary:
When OMPT is not supported the __kmp_omp_task() function is passed the parameters in the wrong order. This is a fix related to patch D47709.


Reviewers: Hahnfeld, sconvent, caomhin, jlpeyton

Reviewed By: Hahnfeld

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D50001

llvm-svn: 338295
2018-07-30 19:51:51 +00:00
Jonas Hahnfeld f985f98128 [CMake] Disable -Wstringop-overflow
GCC 8 produces false-positives with this:
In file included from <openmp>/src/runtime/src/kmp_os.h:950,
                 from <openmp>/src/runtime/src/kmp.h:78,
                 from <openmp>/src/runtime/src/kmp_environment.cpp:54:
<openmp>/src/runtime/src/kmp_environment.cpp: In function ‘char* __kmp_env_get(const char*)’:
<openmp>/src/runtime/src/kmp_safe_c_api.h:52:50: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
 #define KMP_STRNCPY_S(dst, bsz, src, cnt) strncpy(dst, src, cnt)
                                           ~~~~~~~^~~~~~~~~~~~~~~
<openmp>/src/runtime/src/kmp_environment.cpp:97:5: note: in expansion of macro ‘KMP_STRNCPY_S’
     KMP_STRNCPY_S(result, len, value, len);
     ^~~~~~~~~~~~~
<openmp>/src/runtime/src/kmp_environment.cpp:92:28: note: length computed here
     size_t len = KMP_STRLEN(value) + 1;

This is stupid because result is allocated with KMP_INTERNAL_MALLOC(len),
so the arguments are correct.

Differential Revision: https://reviews.llvm.org/D49904

llvm-svn: 338283
2018-07-30 18:16:22 +00:00
Jonathan Peyton 284fab195a [OpenMP] Add GOMP version symbols for OMP_4.5 API
This patch adds the appropriate version symbols to the relevant API functions

Differential Revision: https://reviews.llvm.org/D49859

llvm-svn: 338281
2018-07-30 17:50:35 +00:00
Jonathan Peyton 369d72db11 [OpenMP] Implement GOMP doacross compatibility
This change introduces GOMP doacross compatibility. There are 12 new interface
functions 6 for long type and 6 for unsigned long long type:
GOMP_doacross_post, GOMP_doacross_wait, GOMP_loop_doacross_[schedule]_start
where schedule can be static, dynamic, guided, or runtime.

These functions just translate the parameters if necessary and send them
to the corresponding kmp function.
E.g., GOMP_doacross_post() -> __kmpc_doacross_post()

For the GOMP_doacross_post function, there is template specialization to
account for when long is a four byte vs an eight byte type. If it is a
four byte type, then a temporary array has to be created to convert the
four byte integers into eight byte integers and then sending that into
__kmpc_doacross_post(). Because GOMP_doacross_wait uses varargs, it
always needs a temporary array and does not need template specialization.

Differential Revision: https://reviews.llvm.org/D49857

llvm-svn: 338280
2018-07-30 17:48:33 +00:00
Jonathan Peyton 8692e142b3 [OpenMP] Fix build errors when building with KMP_DEBUG_ADAPTIVE_LOCKS=1
This change fixes build errors when building a runtime with adaptive lock stats
enabled. Most of the errors were due to the recent changes in the runtime, but
it seems that we have not tried to build this debug runtime on Windows for a
long time.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D49823

llvm-svn: 338277
2018-07-30 17:45:23 +00:00
Jonathan Peyton f0682ac498 [OpenMP][Stats] Cleanup stats gathering code
1) Remove unnecessary data from list node structure
2) Remove timerPair in favor of pushing/popping explicitTimers.
   This way, nested timers will work properly.
3) Fix #pragma omp critical timers
4) Add histogram capability
5) Add KMP_STATS_FILE formatting capability
6) Have time partitioned into serial & parallel by introducing
   partitionedTimers::exchange(). This also counts the number of serial regions
   in the executable.
7) Fix up the timers around OMP loops so that scheduling overhead and work are
   both counted correctly.
8) Fix up the iterations statistics so they count the number of iterations the
   thread receives at each loop scheduling event
9) Change timers so there is only one RDTSC read per event change
10) Fix up the outdated comments for the timers

Differential Revision: https://reviews.llvm.org/D49699

llvm-svn: 338276
2018-07-30 17:41:08 +00:00
Joachim Protze cdaefac5bd [OMPT] Fix OMPT callbacks for the taskloop construct and add testcase
Fix the order of callbacks related to the taskloop construct.
Add the iteration_count to work callbacks (according to the spec).
Use kmpc_omp_task() instead of kmp_omp_task() to include OMPT callbacks.
Add a testcase.

Patch by Simon Convent

Reviewed by: protze.joachim, hbae

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D47709

llvm-svn: 338146
2018-07-27 18:13:24 +00:00
Joachim Protze 86ed6aa668 [OMPT] Adapt OMPT callbacks for tasks to handle untied tasks correctly
The ompt/tasks/task_types.c testcase did not test untied tasks properly. Now,
frame addresses are tested and two scheduling points are added at which the
task can switch to another thread. Due to scheduling effects, the frame address
could be NULL.

This needed a restructure of the way OMPT callbacks are called.
__ompt_task_finish() now as an extra parameter, whether a task is completed.
Its invocation has been moved into __kmp_task_finish(). Thus, the order of the
writes to the frame addresses is not subject to scheduling effects anymore.

Patch by Simon Convent

Reviewed by: protze.joachim, hbae

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D49181

llvm-svn: 338145
2018-07-27 18:13:20 +00:00
Joachim Protze f203109edb [OMPT] Print two more addresses in print_fuzzy_address_block()
The two more outputs are needed to match the return addresses when using the
Intel Compiler, as it generates more instructions between the fuzzy-printing
of the address and the runtime call.

Patch by Simon Convent

Reviewed By: protze.joachim, hbae

Differential Revision: https://reviews.llvm.org/D49373

llvm-svn: 338144
2018-07-27 18:13:15 +00:00
Jonas Hahnfeld 3a0e9b37f3 PR30734: Remove __kmp_ft_page_allocate()
This function was not enabled by default and not exported when manually
tweaking the build flags. Additionally it was hard to use since there
is no corresponding __kmp_ft_page_free().
The code itself is questionable because the returned memory address
is padded by an extra pointer which stores the unpadded start of the
allocated region (this would need to be freed).

Differential Revision: https://reviews.llvm.org/D49802

llvm-svn: 338052
2018-07-26 18:15:02 +00:00
Jonas Hahnfeld 6fbbf27d98 [test] Remove XFAIL of omp_for_bigbounds.c for Intel Compiler
The initial commit said that the test passes with Intel Compiler,
so change XFAIL to only list clang and gcc.

Differential Revision: https://reviews.llvm.org/D49801

llvm-svn: 338051
2018-07-26 18:14:57 +00:00
Jonas Hahnfeld ba5ec9c684 [OMPT] Fix typo in test parallel/nested_thread_num.c
This caused test failures with GCC since its initial commit in
r336085 (https://reviews.llvm.org/D46533).

llvm-svn: 337911
2018-07-25 12:34:31 +00:00
Alexey Bataev 37d4156b11 [OPNEMP, NVPTX] Fixed sychronization construct + code cleanup.
Summary:
1. Fixed internal problem in `__kmpc_barrier` function: SPMD mode
synchronization function should be called only in L1 parallel level.
2. Removed some extra code for synchronization inside of the code, used
`__kmpc_barrier` instead.
3. Some code cleanup.

Reviewers: gtbercea, grokos

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D49564

llvm-svn: 337691
2018-07-23 13:52:12 +00:00
Jonathan Peyton a764af68be Block library shutdown until unreaped threads finish spin-waiting
This change fixes possibly invalid access to the internal data structure during
library shutdown.  In a heavily oversubscribed situation, the library shutdown
sequence can reach the point where resources are deallocated while there still
exist threads in their final spinning loop.  The added loop in
__kmp_internal_end() checks if there are such busy-waiting threads and blocks
the shutdown sequence if that is the case. Two versions of kmp_wait_template()
are now used to minimize performance impact.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D49452

llvm-svn: 337486
2018-07-19 19:17:00 +00:00
George Rokos a0da24683b [OpenMP][libomptarget] New map interface: remove translation code and ensure proper alignment of struct members
This patch removes the translation code since this functionality is now implemented in the compiler.
target_data_begin and target_data_end are also patched to handle some special cases that used to be
handled by the obsolete translation function, namely ensure proper alignment of struct members when
we have partially mapped structs. Mapping a struct from a higher address (i.e. not from its beginning)
can result in distortion of the alignment for some of its member fields. Padding restores the original
(proper) alignment.

Differential revision: https://reviews.llvm.org/D44186

llvm-svn: 337455
2018-07-19 13:41:03 +00:00
Joachim Protze bb869f42b7 [libomptarget] Also support several images for elf
In revision r336569 (D49036) libomptarget support for multiple nvidia images
has been fixed in case a target region resides inside one or multiple
libraries and in the compiled application. But the issues is still present
for elf images.
This fix will also support multiple images for elf.

Patch by Jannis Klinkenberg

Reviewers: protze.joachim, ABataev, grokos

Reviewed By: protze.joachim, ABataev, grokos

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D49418

llvm-svn: 337355
2018-07-18 07:23:46 +00:00
Azharuddin Mohammed 6712b8675b [cmake] Fix libomptarget/test/CMakeLists.txt
Summary:
Should be variable name instead of variable reference. If the variable is
somehow unset, it messes up the if condition expression and causes a CMake
error.

Reviewers: jlpeyton, AndreyChurbanov, Hahnfeld

Reviewed By: Hahnfeld

Subscribers: mgorny, llvm-commits, openmp-commits

Differential Revision: https://reviews.llvm.org/D47221

llvm-svn: 337133
2018-07-15 17:29:43 +00:00
Gheorghe-Teodor Bercea 9e94326185 [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode
Summary: This patch fixes the data sharing infrastructure to work for the SPMD and non-SPMD cases.

Reviewers: ABataev, grokos, carlo.bertolli, caomhin

Reviewed By: ABataev, grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D49204

llvm-svn: 337013
2018-07-13 16:14:22 +00:00
Alexey Bataev c2c0138a04 [OPENMP, NVPTX] Fix loop boundaries calculation for dynamic loops.
Summary:
Patch fixes the next problems.
1. Removes unused functions from omptarget_nvptx_ThreadPrivateContext
class + simplified data members.
2. Fixed calculation of loop boundaries for dynamic loops with static
scheduling.
3. Introduced saving/restoring of the dynamic loop boundaries to support
several nested parallel dynamic loops.

Reviewers: grokos

Subscribers: guansong, kkwli0, openmp-commits

Differential Revision: https://reviews.llvm.org/D49241

llvm-svn: 336915
2018-07-12 15:18:28 +00:00
Jonathan Peyton dc73f512ae Fix const cast problem introduced in r336563
336563 eliminated CCAST() macros caused build failures

llvm-svn: 336586
2018-07-09 19:09:31 +00:00
Jonathan Peyton 61d44f188a [OpenMP] Fix a few formatting issues
llvm-svn: 336575
2018-07-09 18:09:25 +00:00
Jonathan Peyton f639936748 [OpenMP] Introduce hierarchical scheduling
This patch introduces the logic implementing hierarchical scheduling.
First and foremost, hierarchical scheduling is off by default
To enable, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage.
This work is based off if the IWOMP paper:
"Workstealing and Nested Parallelism in SMP Systems"

Hierarchical scheduling is the layering of OpenMP schedules for different layers
of the memory hierarchy. One can have multiple layers between the threads and
the global iterations space. The threads will go up the hierarchy to grab
iterations, using possibly a different schedule & chunk for each layer.

[ Global iteration space (0-999) ]

(use static)
[ L1 | L1 | L1 | L1 ]

(use dynamic,1)
[ T0 T1 | T2 T3 | T4 T5 | T6 T7 ]

In the example shown above, there are 8 threads and 4 L1 caches begin targeted.
If the topology indicates that there are two threads per core, then two
consecutive threads will share the data of one L1 cache unit. This example
would have the iteration space (0-999) split statically across the four L1
caches (so the first L1 would get (0-249), the second would get (250-499), etc).
Then the threads will use a dynamic,1 schedule to grab iterations from the L1
cache units. There are currently four supported layers: L1, L2, L3, NUMA

OMP_SCHEDULE can now read a hierarchical schedule with this syntax:
OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK
And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before

I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h
to try to keep it separate from the rest of the code.

Differential Revision: https://reviews.llvm.org/D47962

llvm-svn: 336571
2018-07-09 17:51:13 +00:00
Alexey Bataev 2622e9e5b3 [OPENMP, NVPTX] Support several images in the executable.
Summary:
Currently Cuda plugin supports loading of the single image, though we
may have the executable with the several images, if it has target
regions inside of the dynamically loaded library. Patch allows to load
multiple images.

Reviewers: grokos

Subscribers: guansong, openmp-commits, kkwli0

Differential Revision: https://reviews.llvm.org/D49036

llvm-svn: 336569
2018-07-09 17:46:55 +00:00
Jonathan Peyton 39ada85446 [OpenMP] Restructure loop code for hierarchical scheduling
This patch reorganizes the loop scheduling code in order to allow hierarchical
scheduling to use it more effectively. In particular, the goal of this patch
is to separate the algorithmic parts of the scheduling from the thread
logistics code.

Moves declarations & structures to kmp_dispatch.h for easier access in
other files.  Extracts the algorithmic part of __kmp_dispatch_init() and
__kmp_dispatch_next() into __kmp_dispatch_init_algorithm() and
__kmp_dispatch_next_algorithm(). The thread bookkeeping logic is still kept in
__kmp_dispatch_init() and __kmp_dispatch_next(). This is done because the
hierarchical scheduler needs to access the scheduling logic without the
bookkeeping logic.  To prepare for new pointer in dispatch_private_info_t, a
new flags variable is created which stores the ordered and nomerge flags instead
of them being in two separate variables. This will keep the
dispatch_private_info_t structure the same size.

Differential Revision: https://reviews.llvm.org/D47961

llvm-svn: 336568
2018-07-09 17:45:33 +00:00
Jonathan Peyton 37e2ef5434 [OpenMP] Use C++11 Atomics - barrier, tasking, and lock code
These are preliminary changes that attempt to use C++11 Atomics in the runtime.
We are expecting better portability with this change across architectures/OSes.
Here is the summary of the changes.

Most variables that need synchronization operation were converted to generic
atomic variables (std::atomic<T>). Variables that are updated with combined CAS
are packed into a single atomic variable, and partial read/write is done
through unpacking/packing

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D47903

llvm-svn: 336563
2018-07-09 17:36:22 +00:00
Kelvin Li b1711b28f7 Define the __STDC_FORMAT_MACROS to avoid test failure on some platforms.
ompt/misc/api_calls_from_other_thread.cpp
ompt/misc/interoperability.cpp

Differential Revision: https://reviews.llvm.org/D48984

llvm-svn: 336438
2018-07-06 14:15:59 +00:00
Joachim Protze b41c61eed4 Dropped non-supoorted "--no-as-needed" flag from OMPT tests for macOS
The flag "--no-as-needed" is not recognized by the linker on macOS making the following tests fail:

ompt/loadtool/tool_available/tool_available.c
ompt/loadtool/tool_not_available/tool_not_available.c
This patch removes this flag for macOS and adds it only for Linux and Windows.
I tested it on Ubuntu 16.04 and macOS HighSierra, with Clang/LLVM 6.0.1 and OpenMP trunk.

This solution was also discussed in the OpenMP-dev mailing list.

Patch provided by Simone Atzeni

Differential Revision: https://reviews.llvm.org/D48888

llvm-svn: 336327
2018-07-05 09:14:06 +00:00
Joachim Protze 00505b85a3 [OMPT] Add synchronization to threads_nested.c testcase
The testcase potentially fails when a thread is reused.
The added synchronization makes sure this does not happen.

Patch provided by Simon Convent

Differential Revision: https://reviews.llvm.org/D48932

llvm-svn: 336326
2018-07-05 09:14:01 +00:00
Joachim Protze 04a00fc18c [OMPT] Use alloca() to force availability of frame pointer
When compiling with icc, there is a problem with reenter frame addresses in
parallel_begin callbacks in the interoperability.c testcase. (The address is
not available. thus NULL)
Using alloca() forces availability of the frame pointer.

Patch provided by Simon Convent

Differential Revision: https://reviews.llvm.org/D48282

llvm-svn: 336088
2018-07-02 09:13:38 +00:00
Joachim Protze e2eec57a4f [OMPT] Add tests for runtime entry points from non-OpenMP threads
Several runtime entry points have not been tested from non-OpenMP threads. This
adds tests to an existing testcase. While at it, the testcase was reformatted

Patch provided by Simon Convent

Differential Revision: https://reviews.llvm.org/D48124

llvm-svn: 336087
2018-07-02 09:13:34 +00:00
Joachim Protze 28d2d708d4 [OMPT] Add testcases for thread_begin and thread_end callbacks
Especially the thread_end callback has not been tested before.
This adds a testcase for nested and non-nested threads.

Patch provided by Simon Convent

Differential Revision: https://reviews.llvm.org/D47824

llvm-svn: 336086
2018-07-02 09:13:30 +00:00
Joachim Protze 4a73ae167e [OMPT] Provide the right thread_num for ancestor levels
The current implementation always provides the thread-num for the current
parallel region. This patch fixes the behavior for ancestor levels >0.

Differential Revision: https://reviews.llvm.org/D46533

llvm-svn: 336085
2018-07-02 09:13:24 +00:00
Alexey Bataev 3994bafbc7 [OPENMP, NVPTX] Sync threads before start ordered loops.
Summary: Threads must be synchronized before starting ordered construct.

Reviewers: grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D48732

llvm-svn: 335987
2018-06-29 16:16:00 +00:00
Alexey Bataev 0ac29350b5 [OPENMP, NVPTX] Fixes for NVPTX RTL
Summary:
Patch fixes several problems in the implementation of NVPTX RTL.
1. Detection of the last iteration for loops with static scheduling, no chunks.
2. Fixes reductions for the serialized parallel constructs.
3. Fixes handling of the barriers.

Reviewers: grokos

Reviewed By: grokos

Subscribers: Hahnfeld, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D48480

llvm-svn: 335469
2018-06-25 13:43:35 +00:00
Andrey Churbanov a7fa3f009a minor: fixed typo in debug print
llvm-svn: 335138
2018-06-20 15:54:11 +00:00
Jonas Hahnfeld d03cbf2cfe Remove liboffload from repository
See the mailing list for the proposal and discussion:
http://lists.llvm.org/pipermail/openmp-dev/2018-June/002041.html

llvm-svn: 335069
2018-06-19 19:08:17 +00:00
Guansong Zhang f9e56e5982 [OpenMP] [CUDA] Expose teamid to the debug path
Summary: Small bug fix for debug build. A previous fix causing trouble for debug build.

Reviewers: grokos

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D48286

llvm-svn: 335046
2018-06-19 14:05:38 +00:00
Jonathan Peyton e92ae43be8 [OpenMP] Fix formatting issues in kmp_stats.h
llvm-svn: 334335
2018-06-08 22:27:53 +00:00
Joachim Protze 406361330b [OMPT] Rename ompt_wait_id to omp_wait_id
Rename ompt_wait_id to omp_wait_id, as defined in the spec.

Differential Revision: https://reviews.llvm.org/D46530

llvm-svn: 333368
2018-05-28 08:16:08 +00:00
Joachim Protze c5836064bb [OMPT] Rename ompt_frame_t to omp_frame_t
Rename ompt_frame_t to omp_frame_t, as defined in the spec.

Differential Revision: https://reviews.llvm.org/D43568

llvm-svn: 333367
2018-05-28 08:14:58 +00:00
Jonas Hahnfeld 3c6595d65d [OMPT] Fix test parallel/not_enough_threads.c
Upcoming changes to FileCheck will modify CHECK-DAG to not match
overlapping regions of the input. This test was found to be affected
because it expects to find four threads to invoke events of type
ompt_event_implicit_task_begin. It turns out this is wrong because
OMP_THREAD_LIMIT is set to 2, so there are only two threads. The
rest of the test got it right so it went unnoticed until now.

(Rewrite test and apply clang-format to it as discussed in the past.)

Differential Revision: https://reviews.llvm.org/D47119

llvm-svn: 333361
2018-05-27 17:07:38 +00:00
Jonas Hahnfeld 17aabf83e9 [libomptarget-nvptx] loop: Determine if runtime uninitialized
The generic entry points for static loop scheduling previously
hardcoded that the runtime was initialized. This can be wrong if
the compiler analyzes that the runtime is not needed and calls
the init functions accordingly.

This didn't affect clang-ykt because they have entry points for
different combinations of SPMD x Runtime not needed. I didn't do
measurements yet but with inlining we might get away with always
calling the generic interface and letting compiler and runtime
figure out the rest.
In any case, a correct runtime is always better than having
functions that may only be called if previous calls passed in
a specific set of arguments!

Differential Revision: https://reviews.llvm.org/D47131

llvm-svn: 333285
2018-05-25 15:56:48 +00:00
Jonas Hahnfeld 65e0b8784c [CMake] Unify install path for libraries
Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands.
This also fixes installation of libomptarget-nvptx that previously
didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX.

Differential Revision: https://reviews.llvm.org/D47130

llvm-svn: 333284
2018-05-25 15:56:41 +00:00
George Rokos 6da6f433a0 [CUDA]Fix dynamic|guided scheduling.
The existing implementation of the dynamic scheduling
breaks the contract introduced by the original openmp
runtime and, thus, is incorrect. Patch fixes it and
introduces correct dynamic scheduling model.

Thanks to Alexey Bataev for submitting this patch.

Differential Revision: https://reviews.llvm.org/D47333

llvm-svn: 333225
2018-05-24 21:12:41 +00:00
Jonas Hahnfeld 9228f9718c [libomptarget-nvptx-bc] Pass found CUDA installations
We already know where the CUDA SDK is, so there is no point in
letting Clang search for it again and possibly finding no or
a different installation.

--cuda-path is supported since the beginning of CUDA support in
Clang, so making this required doesn't impose additional restrictions.

Differential Revision: https://reviews.llvm.org/D46930

llvm-svn: 332495
2018-05-16 17:20:27 +00:00
Jonas Hahnfeld 37bbe1a698 [libomptarget-nvptx] Test bitcode compiler flags and enable by default
Move all logic related to selecting the bitcode compiler and linker
into a new file and dynamically test required compiler flags. This
also adds -fcuda-rdc for Clang trunk as previously attempted in D44992
which fixes the build.

As a result this change also enables building the library by default
if all prerequisites are met.

Differential Revision: https://reviews.llvm.org/D46901

llvm-svn: 332494
2018-05-16 17:20:21 +00:00
Gheorghe-Teodor Bercea 787a350021 [OpenMP][libomptarget] Add function for checking SPMD mode
Summary: Add function to the NVPTX libomptarget library that will return true if the current target region is being executed in SPMD mode.

Reviewers: ABataev, grokos, carlo.bertolli, caomhin

Reviewed By: grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D46840

llvm-svn: 332360
2018-05-15 15:16:43 +00:00
Joachim Protze 9be9cf20bf [OMPT] Fix thread_num for implicit_task_end callbacks in nested parallel regions
implicit_task_end callbacks in nested parallel regions did not always give the
correct thread_num, since the inner parallel region may have already been
finalized.
Now, the thread_num is stored at the beginning of the implicit task and
retrieved at the end, whenever necessary.

A testcase was added as well.

Differential Revision: https://reviews.llvm.org/D46260

llvm-svn: 331632
2018-05-07 12:42:21 +00:00
Joachim Protze 8fc39f6b19 [OMPT] Add api_calls_misc.c testcase and rename api_calls.c testcase
The api_calls_misc.c testcase tests the following api calls:

ompt_get_callback()
ompt_get_state()
ompt_enumerate_states()
ompt_enumerate_mutex_impls()
These have not been tested previously.

The api_calls.c testcase has been renamed to api_calls_places.c because it only tests api calls that are related to places.

Differential Revision: https://reviews.llvm.org/D42523

llvm-svn: 331631
2018-05-07 12:42:15 +00:00
Guansong Zhang e1c7a46d5b [OpenMP] Use LIBOMPTARGET_DEVICE_RTL_DEBUG env var to control debug messages on the device side
Summary:
Enable the device side debug messages at compile time, use env var to control at runtime.

To achieve this, an environment data block is passed to the device lib when it is loaded.

By default, the message is off, to enable it, a user need to set LIBOMPDEVICE_DEBUG=1.

Reviewers: grokos

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D46210

llvm-svn: 331550
2018-05-04 19:29:28 +00:00
Jonathan Peyton d47df260ba [OpenMP][OMPT] Fix api_calls_from_other_thread.cpp
Removed environment setting in RUN: line that was being ignored anyways.
Changed a few specific checks to "any number"

llvm-svn: 331212
2018-04-30 18:46:31 +00:00
Guansong Zhang ad6c26516b [OpenMP] Remove compilation warning when using clang to compile bc files.
Summary: Minor printf format correction. NVCC ignore those. Clang will give warning on these if debug is enabled.

Reviewers: grokos

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D45528

llvm-svn: 330944
2018-04-26 14:06:53 +00:00
Guansong Zhang 334c379e32 [OpenMP] Make bc file compilation sensitive to LIBOMPTARGET_NVPTX_DEBUG flag
Summary: The LIBOMPTARGET_NVPTX_DEBUG flag is inconsistent between using nvcc to generate .a file and clang to generate .bc file. Sync the two setting so we can get debug messages from the bc file path as well.

Reviewers: grokos

Subscribers: Hahnfeld, openmp-commits, mgorny

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D45530

llvm-svn: 330477
2018-04-20 20:41:00 +00:00
Heejin Ahn f78a493528 [OpenMP] Compilation error fix on const char*
Summary:
This line
(0ed912c7a7/runtime/src/kmp_gsupport.cpp (L1459))
added in D45327 (rL330282) causes a compilation failure.

Reviewers: jlpeyton

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D45786

llvm-svn: 330299
2018-04-18 22:23:31 +00:00
Jonathan Peyton 1482db9e03 [OpenMP] Fix affinity API for KMP_AFFINITY=none|compact|scatter
Currently, the affinity API reports garbage for the initial place list and any
thread's place lists when using KMP_AFFINITY=none|compact|scatter.
This patch does two things:

for KMP_AFFINITY=none, Creates a one entry table for the places, this way, the
initial place list is just a single place with all the proc ids in it. We also
set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the
thread reports that single place (place 0) instead of garbage (-1) when using
the affinity API.

When non-OMP_PROC_BIND affinity is used
(including KMP_AFFINITY=compact|scatter), a thread's place list is populated
correctly. We assume that each thread is assigned to a single place. This is
implemented in two of the affinity API functions

Differential Revision: https://reviews.llvm.org/D45527

llvm-svn: 330283
2018-04-18 19:25:48 +00:00
Jonathan Peyton 27a677fc95 Introduce GOMP_taskloop API
This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our
version symbols. Being a wrapper around __kmpc_taskloop, the function
creates a task with the loop bounds properly nested in the shareds so that
the GOMP task thunk will work properly. Also, the firstprivate copy constructors
are properly handled using the __kmp_gomp_task_dup() auxiliary function.

Currently, only linear spawning of tasks is supported
for the GOMP_taskloop interface.

Differential Revision: https://reviews.llvm.org/D45327

llvm-svn: 330282
2018-04-18 19:23:54 +00:00
Joachim Protze 3865c69b84 Set the license header for all OMPT files
llvm-svn: 329928
2018-04-12 17:23:26 +00:00
Guansong Zhang f679431f91 [OpenMP] Remove extra warning when we build
Summary:
This one line change is to remove this warning message

"warning: integer conversion resulted in a change of sign"

Reviewers: grokos

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D45415

llvm-svn: 329713
2018-04-10 15:28:31 +00:00
Guansong Zhang f0029a7738 Revert "[OpenMP] enable bc file compilation using the latest clang"
This reverts commit 6849e31c36d712d97433bca9af39b7a09c8c1207.

llvm-svn: 329576
2018-04-09 14:45:41 +00:00
Guansong Zhang e47fbc9da8 [OpenMP] enable bc file compilation using the latest clang
Summary: adding cuda-rdc flag to allow extern global data

Reviewers: grokos

Reviewed By: grokos

Subscribers: gregrodgers, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D44992

llvm-svn: 329072
2018-04-03 15:01:34 +00:00
Jonathan Peyton 1e6bb8d5de Minor cleanup in __kmp_atfork_child()
This change removes the unnecessary lock operation on __kmp_initz_lock inside
the __kmp_atfork_child() function for Linux; the lock variable is initialized
in the same function later.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D44949

llvm-svn: 328900
2018-03-30 19:55:11 +00:00
Jonathan Peyton ea82c769f4 Move blocktime_str variable right before its first use
llvm-svn: 328575
2018-03-26 19:20:50 +00:00
Jonathan Peyton b6b79ac95b Add summarizeStats.py to tools directory
The summarizeStats.py script processes raw data provided by the
instrumented (stats-gathering) OpenMP* runtime library. It provides:

1) A radar chart which plots counters as frequency (per GigaTick) of use within
   the program. The frequencies are plotted as log10, however values less than
   one are kept as it is and represented in red color. This was done to help
   visualize the differences better.
2) Pie charts separating total time as compute and non-compute. The compute and
   non-compute times have their own pie charts showing the constructs that
   contributed to them. The percentages listed are with respect to the total
   time.
3) '.csv' file with percentage of time spent within the different constructs.

The script can be used as:
$ python $PATH_TO_SCRIPT/summarizeStats.py instrumented1.csv instrumented2.csv

Patch by Taru Doodi

Differential Revision: https://reviews.llvm.org/D41838

llvm-svn: 328568
2018-03-26 18:44:48 +00:00
Andrey Churbanov 2d91a8a3ba Fixed __kmpc_get_target_offload() to call library initialization.
Differential Revision: https://reviews.llvm.org/D44793

llvm-svn: 328228
2018-03-22 18:51:51 +00:00
Gheorghe-Teodor Bercea 4bc36a06e2 [OpenMP][libomptarget] Initialize global memory stack only once.
Summary: The global stack initialization function may be called multiple times. The initialization of the shared memory slots should only happen when the function is called for the first time for a given warp master thread.

Reviewers: grokos, carlo.bertolli, ABataev, caomhin

Reviewed By: grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D44754

llvm-svn: 328148
2018-03-21 21:02:55 +00:00
Gheorghe-Teodor Bercea b4332ca3da [OpenMP][libomptarget] Fix master warp check
Summary: The check for the master warp must take into consideration the actual number of warps: the master warp is equal to the last active warp not necessarily WARPSIZE - 1.

Reviewers: grokos, carlo.bertolli, ABataev, caomhin

Reviewed By: grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D44537

llvm-svn: 328146
2018-03-21 20:51:16 +00:00
Gheorghe-Teodor Bercea c8d395a168 [OpenMP][libomptarget] Enable globalization for workers
Summary:
This patch allows worker to have a global memory stack managed by the runtime. This patch is needed for completeness and consistency with the globalization policy: if a worker-side variable escapes the current context it then needs to be globalized.
Until now, only the master thread was allowed to have such a stack. These global values can now potentially be shared amongst workers if the semantics of the OpenMP program require it.

Reviewers: ABataev, grokos, carlo.bertolli, caomhin

Reviewed By: grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D44487

llvm-svn: 328144
2018-03-21 20:34:19 +00:00
Jonathan Peyton 78f977fcd1 Read OMP_TARGET_OFFLOAD and provide API to access ICV
Added settings code to read OMP_TARGET_OFFLOAD environment variable. Added
target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD,
if available, otherwise defaulting to DEFAULT. Valid values for the ICV are
specified as enum values {0,1,2} for disabled, default, and mandatory. An
internal API access function __kmpc_get_target_offload is provided.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D44577

llvm-svn: 328046
2018-03-20 21:18:17 +00:00
Andrey Churbanov 3336aa0d07 Fix for Fix for https://bugs.llvm.org/show_bug.cgi?id=36705.
Differential Revision: https://reviews.llvm.org/D44637

llvm-svn: 327875
2018-03-19 18:05:15 +00:00
George Rokos 6b9bb5e1c2 Bugfix, extern declarations for libomp functions are `extern "C"` declarations
llvm-svn: 327763
2018-03-17 02:07:42 +00:00
George Rokos 2878c3957b Moved extern declarations to private header file, they are only used from within libomptarget, they don't need to be in omptarget.h.
llvm-svn: 327740
2018-03-16 20:40:09 +00:00
Gheorghe-Teodor Bercea 876c1ed2e5 [OpenMP][libomptarget] Enable usage of shared memory slots
Summary:
Allow the runtime to use the existing shared memory statically allocated slots.

When a variable is globalized, the underlying memory can be either shared or global memory (both have block-wide visibility). In this case, we allow that the storage to use a limited amount of shared memory that has been statically allocated already. Only if shared memory doesn't prove to be enough do we then invoke malloc() to create a new global memory slot.

Reviewers: ABataev, carlo.bertolli, grokos, caomhin

Reviewed By: grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D44486

llvm-svn: 327639
2018-03-15 16:05:34 +00:00
Gheorghe-Teodor Bercea f3de222b0d [OpenMP][libomptarget] Enable multiple frames per global memory slot
Summary: To save on calls to malloc, this patch enables the re-use of pre-allocated global memory slots.

Reviewers: ABataev, grokos, carlo.bertolli, caomhin

Reviewed By: grokos

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D44470

llvm-svn: 327637
2018-03-15 15:56:04 +00:00
George Rokos 59be4b434f [libomptarget][nvptx] Bug fix: Correctly identify the warp master active thread.
llvm-svn: 327556
2018-03-14 19:11:36 +00:00
Gheorghe-Teodor Bercea 49b62649cf [OpenMP][libomptarget] Add global memory data sharing support for master-worker sharing.
Summary:
This patch adds support for the sharing of variables from the master thread of a team to the worker threads of the team.
The runtime uses a stack structure implemented as a doubly-linked list of slots with each slot having the exact same size as the size requested. This implementation leverages existing data structures. The runtime functions are added as separate functions to avoid interfering with the current interface. 

Limitations to be addressed in future patches:
- This current patch only employs global memory. In a future patch we will enable to usage for shared memory as an optimization.
- Allow the allocation of several requested sizes in the same slot.

Reviewers: ABataev, grokos, caomhin, carlo.bertolli

Reviewed By: grokos

Subscribers: Hahnfeld, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D44260

llvm-svn: 327440
2018-03-13 19:44:53 +00:00
Sylvestre Ledru 1c861c582f fix a typo on the website
llvm-svn: 327237
2018-03-11 10:53:40 +00:00
Gheorghe-Teodor Bercea d5e5992f9a [OpenMP][libomptarget] Fix union.
Summary: To make the two parts of the union have the same size, the size of vect needs to be increased by 16 bits.

Reviewers: grokos, carlo.bertolli, caomhin, ABataev

Reviewed By: grokos, ABataev

Subscribers: fedor.sergeev, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D44254

llvm-svn: 327040
2018-03-08 18:44:02 +00:00
Gheorghe-Teodor Bercea 7a5fa21ae2 [OpenMP] Remove implicit data sharing using device shared memory from libomptarget
Summary:
This patch reverts the changes to libomptarget that were coupled with the changes to Clang code gen for data sharing using shared memory. A similar patch exists for Clang: D43625

Shared memory is meant to be used as an optimization on top of a more general scheme. So far we didn't have a global memory implementation ready so shared memory was a solution which applied to the current level of OpenMP complexity supported by trunk on GPU devices (due to the missing NVPTX backend patch this functionality has never been exercised). Now that we have a global memory solution this patch is "in the way" and needs to be removed (for now). This patch (or an equivalent version of it) will be put out for review once the global memory scheme is in place.


Reviewers: ABataev, grokos, carlo.bertolli, caomhin

Reviewed By: grokos

Subscribers: Hahnfeld, guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D43626

llvm-svn: 326950
2018-03-07 22:10:10 +00:00
Andrey Churbanov 9e9333aa8a Improve OpenMP threadprivate implementation.
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D41914

llvm-svn: 326733
2018-03-05 18:42:01 +00:00
Andrey Churbanov 75bc70fb56 Fixed build of the OpenMP stubs library.
Differential Revision: https://reviews.llvm.org/D44019

llvm-svn: 326728
2018-03-05 18:01:47 +00:00
Jonas Hahnfeld b0f051ae63 [OMPT] Fix interoperability test with GCC
We have to ensure that the runtime is initialized _before_ waiting
for the two started threads to guarantee that the master threads
post their ompt_event_thread_begin before the worker threads. This
is not guaranteed in the parallel region where one worker thread
could start before the other master thread has invoked the callback.

The problem did not happen with Clang becauses the generated code
calls __kmpc_global_thread_num() and cashes its result for functions
that contain OpenMP pragmas.

Differential Revision: https://reviews.llvm.org/D43882

llvm-svn: 326435
2018-03-01 14:03:18 +00:00
Joachim Protze f5aebc27ad [OMPT] Fix task-type test with GCC
This is similar to D43882. The runtime needs to be initialized before calling print_ids(0)

http://lab.llvm.org:8011/builders/openmp-gcc-x86_64-linux-debian/builds/60

Differential Revision: https://reviews.llvm.org/D43897

llvm-svn: 326428
2018-03-01 11:26:15 +00:00
Joachim Protze aa2022e74f [OMPT] Fix ompt_get_task_info() and add tests for it
The thread_num parameter of ompt_get_task_info() was not being used previously,
but need to be set.

The print_task_type() function (form the task-types.c testcase) was merged into
the print_ids() function (in callback.h). Testing of ompt_get_task_info() was
added to the task-types.c testcase. It was not tested extensively previously.

Differential Revision: https://reviews.llvm.org/D42472

llvm-svn: 326338
2018-02-28 17:36:18 +00:00
Joachim Protze 4df80bda40 [OMPT] Fix inconsistent testcases
The main change of this patch is to insert {{.*}} in current_address=[[RETURN_ADDRESS_END]].
This is needed to match any of the alternatively printed addresses.

Additionally, clang-format is applied to the two tests.

Differential Revision: https://reviews.llvm.org/D43115

llvm-svn: 326312
2018-02-28 09:28:51 +00:00
Jonas Hahnfeld 82768d0ba1 [OMPT] Fix parallel_data in implicit barrier-end
This is required to be NULL for implicit barriers at the end of a
parallel region. Noticed in review of D43191.

Differential Revision: https://reviews.llvm.org/D43308

llvm-svn: 325922
2018-02-23 16:46:25 +00:00
Jonas Hahnfeld 5e44069857 [OMPT] Fix test tasks/serialized.c with optimization
The compiler inlines the user code in the task. Check for that case at
runtime by comparing the frame addresses and print the expected exit
address.

Also showcase how I think the OMPT tests could be reformatted to match
LLVM's code style. In my opinion it would be great to that kind of change
to all tests that need to be touched for whatever reason...

Differential Revision: https://reviews.llvm.org/D43191

llvm-svn: 325921
2018-02-23 16:46:11 +00:00
Joachim Protze b0e4f87fb0 [OMPT] Omissionin in OMPT Formatting
Applying clang-format to the /runtime/src/ folder

Differential Revision: https://reviews.llvm.org/D42169

llvm-svn: 325424
2018-02-17 09:54:10 +00:00
Joachim Protze 33db70d2d7 [OMPT] Add interoperability testcase
Test whether OMPT-callbacks for two threads that initiate a parallel region are correct.

Differential Revision: https://reviews.llvm.org/D41942

llvm-svn: 325423
2018-02-17 09:40:08 +00:00
Joachim Protze 76899b84fe [OMPT] Update api_calls testcase
Only use ompt_ functions when testing OMPT in api_calls testcase.
Add size parameter to print_list.
Fix small bug in implementation of ompt_get_partition_place_nums(): return correct length.

Differential Revision: https://reviews.llvm.org/D42162

llvm-svn: 325422
2018-02-17 09:40:02 +00:00
Jonas Hahnfeld 6f9e25d382 [CMake] Add -fno-experimental-isel for testing
GlobalISel doesn't yet implement blockaddress and falls back to
SelectionDAG. This results in additional branch instruction to
the next basic block which breaks the OMPT tests.
Disable GlobalISel for now when compiling the tests because fixing
them is not easily possible. See http://llvm.org/PR36313 for full
discussion history.

Differential Revision: https://reviews.llvm.org/D43195

llvm-svn: 325218
2018-02-15 08:10:22 +00:00
Jonas Hahnfeld cc6d29d72c [OMPT][test] Correct warning about added wrapper functions
This affects all outlined functions, not just tasks! Only show warning
when using Clang 5.0 or later.

Differential Revision: https://reviews.llvm.org/D43190

llvm-svn: 325131
2018-02-14 15:15:24 +00:00
Gheorghe-Teodor Bercea d5ae4e6501 [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining
Summary:
Different NVIDIA GPUs support different compute capabilities. To enable the inlining of runtime functions and the best performance on different generations of NVIDIA GPUs, a bc library for each compute capability needs to be compiled. The same compiler build will then be usable in conjunction with multiple generations of NVIDIA GPUs.
To differentiate between versions of the same bc lib, the output file name will contain the compute capability ID.
Depends on D14254

Reviewers: Hahnfeld, hfinkel, carlo.bertolli, caomhin, ABataev, grokos

Reviewed By: Hahnfeld, grokos

Subscribers: guansong, mgorny, openmp-commits

Differential Revision: https://reviews.llvm.org/D41724

llvm-svn: 324904
2018-02-12 16:45:20 +00:00
Jonas Hahnfeld 3cfaf3dd0d [libomptarget] Fix detection of CUDA stubs library
CUDA_LIBRARIES contains additional linker arguments since CMake 3.3
which breakes the current way of finding the stubs library.

llvm-svn: 324879
2018-02-12 11:01:56 +00:00
Joachim Protze cfc98c2493 [OMPT] Add tool_available_search testcase
Tests the search for tools as defined in the spec. The OMP_TOOL_LIBRARIES
environment variable contains paths to the following files(in that order)

-to a nonexisting file
-to a shared library that does not have a ompt_start_tool function
-to a shared library that has an ompt_start_tool implementation returning NULL
-to a shared library that has an ompt_start_tool implementation returning a
    pointer to a valid instance of ompt_start_tool_result_t

The expected result is that the last tool gets active and can print in the
thread-begin callback.

Differential Revision: https://reviews.llvm.org/D42166

llvm-svn: 324588
2018-02-08 10:04:33 +00:00
Joachim Protze 9440c0ee3c [OMPT] Add tool_not_available testcase
Add a testcase that checks wheter the runtime can handle an ompt_start_tool
method that returns NULL indicating that no tool shall be loaded.

All tool_available testcases need a separate folder to avoid file conflicts for
the generated tools.

Differential Revision: https://reviews.llvm.org/D41904

llvm-svn: 324587
2018-02-08 10:04:28 +00:00
Gheorghe-Teodor Bercea aaeab8d4ef [OpenMP][libomptarget] Add data sharing support in libomptarget
Summary: This patch extends the libomptarget functionality in patch D14254 with support for the data sharing scheme for supporting implicitly shared variables. The runtime therefore maintains a list of references to shared variables.

Reviewers: carlo.bertolli, ABataev, Hahnfeld, grokos, caomhin, hfinkel

Reviewed By: Hahnfeld, grokos

Subscribers: guansong, llvm-commits, openmp-commits

Differential Revision: https://reviews.llvm.org/D41485

llvm-svn: 324495
2018-02-07 18:21:55 +00:00
Joachim Protze 2a20299f91 [OMPT] Fix tool initialization returning 0
If tool initialization returns 0, OMPT should not be active. The current
implementation provided some callback invocations in this case.

Differential Revision: https://reviews.llvm.org/D42709

llvm-svn: 324320
2018-02-06 08:41:27 +00:00
Carlo Bertolli 57e9f44a8c [OpenMP-RT] Fix debug string for NVPTX runtime library
https://reviews.llvm.org/D42757

The method ThreadsInTeam is used to determine the number of threads to be used in a parallel region under SPMD mode (see line 127 of supporti.h in libomptarget/deviceRTLs/nvptx/src/). This patch fixes the corresponding debug print upon initialization of the kernel in SPMD mode.

llvm-svn: 323978
2018-02-01 16:12:16 +00:00
Jonas Hahnfeld a349d4820c [libomptarget] Check for library with CUDA Driver API
That's what we really need to link the CUDA plugin against,
not the CUDA runtime API in CUDA_LIBRARIES! While the latter
comes with the CUDA SDK, the Driver API is installed with
the kernel driver and there is at most one per system. As
fallback we can use the stubs library distributed with the
CUDA SDK for linking.

Differential Revision: https://reviews.llvm.org/D42643

llvm-svn: 323787
2018-01-30 16:49:13 +00:00
Jonas Hahnfeld c189523529 [libomptarget] Only use CUDA Driver API
Use equivalents for the last calls to the Runtime API. Remove
stray assert in case of an error found during review, we should
only return OFFLOAD_FAIL.

Differential Revision: https://reviews.llvm.org/D42686

llvm-svn: 323786
2018-01-30 16:49:06 +00:00
George Rokos 0dd6ed74fd [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs.
This patch implements the device runtime library whose interface is used in the code generation for OpenMP offloading devices.
Currently there is a single device RTL written in CUDA meant to CUDA enabled GPUs.
The interface is a variation of the kmpc interface that includes some extra calls to do thread and storage management that only make sense for a GPU target.

Differential revision: https://reviews.llvm.org/D14254

llvm-svn: 323649
2018-01-29 13:59:35 +00:00
Jonas Hahnfeld 723560d123 [OMPT] Use fuzzy return addresses in lock testcases
Use fuzzy return addresses in lock testcases so that these
testcases can also be run using the Intel Compiler.

Patch by Simon Convent!

Differential Revision: https://reviews.llvm.org/D41896

llvm-svn: 323529
2018-01-26 14:19:02 +00:00
Jonas Hahnfeld e57620308e Fix name of 'macOS' and add asteriks to brands, NFC.
llvm-svn: 323180
2018-01-23 07:54:10 +00:00
Dimitry Andric 9f49676a8a Sprinkle a few <cstdlib> includes, for libomptarget sources using
malloc, free, alloca and getenv.  NFCI.

llvm-svn: 322869
2018-01-18 18:24:22 +00:00
Jonas Hahnfeld e5499111b9 Add missing headers for Debug builds
llvm-svn: 322830
2018-01-18 10:58:43 +00:00
Joachim Protze e6269e3509 Partial revert of [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl
The previous commit did not revert all replaced ompt_mutex_impl_unknown.

llvm-svn: 322631
2018-01-17 11:13:11 +00:00
Joachim Protze 0c9516b36c [OMPT] Add Workaround for Intel Compiler Bug
Add Workaround for Intel Compiler Bug with Case#: 03138964

A critical region within a nested task causes a segfault in icc 14-18:

int main()
{
  #pragma omp parallel num_threads(2)
  #pragma omp master
    #pragma omp task
      #pragma omp task
        #pragma omp critical
          printf("test\n");
}
When the critical region is in a separate function, the segault does not occur.
So we add noinline to make sure that the function call stays there.

Differential Revision: https://reviews.llvm.org/D41182

llvm-svn: 322622
2018-01-17 10:06:06 +00:00
Joachim Protze 1b2bd2680b [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl
The defintion is not part of the spec and thus should not have the prefix
"ompt_" but rather a prefix that indicates that this is implementation
specific.

Differential Revision: https://reviews.llvm.org/D41166

llvm-svn: 322621
2018-01-17 10:06:01 +00:00
Joachim Protze 1dc2afdcaf [OMPT] Return appropiate values for ompt runtime entry points for non-OpenMP threads
When the current thread is not an (initialized) OpenMP thread, the runtime
entry points return values that correspond to "not available" or similar

Differential Revision: https://reviews.llvm.org/D41167

llvm-svn: 322620
2018-01-17 10:05:55 +00:00
Andrey Churbanov 5388acd3de Fixed libomp static build broken by the commit rL322202.
Patch by simone <simone@cs.utah.edu>.

Differential Revision: https://reviews.llvm.org/D41945

llvm-svn: 322282
2018-01-11 15:09:49 +00:00
Jonathan Peyton 79390ad709 Force HWLOC topology method for NUMA-specific topology
If user requested affinity with granularity=tile we need to either use HWLOC
or ignore the request. The change allows user to not specify
KMP_TOPOLOGY_METHOD=hwloc and choose it automatically instead.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D40905

llvm-svn: 322205
2018-01-10 18:31:49 +00:00
Jonathan Peyton 1800ecec70 Simplify __kmp_expand_threads
This change simplifies __kmp_expand_threads to take a single argument.
Previously, it allowed two arguments and had logic to decide on different
potential expansion sizes. However, no calls to __kmp_expand_threads in the
runtime make use of this extra logic. Thus the extra argument and logic is
removed here.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D41836

llvm-svn: 322204
2018-01-10 18:27:01 +00:00
Jonathan Peyton bff8ded906 Minor code cleanup
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D41831

llvm-svn: 322203
2018-01-10 18:24:09 +00:00
Jonathan Peyton eaa9e40c9a Improve stability of the runtime in parent/child processes
This change improves stability of the runtime when the application forks child
processes.  Acquiring/releasing __kmp_initz_lock and __kmp_forkjoin_lock in the
atfork handlers insures that the actual fork does not occur while those two
locks are held, and __kmp_itt_reset() reverts the itt's global state to the
initial state which also initializes the mutex stored in the global state.
Some missing initialization code was also inserted in the child's atfork handler.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D41462

llvm-svn: 322202
2018-01-10 18:21:48 +00:00
Joachim Protze 1014a6b6c6 Missed to add new test case in previous commit
llvm-svn: 322179
2018-01-10 12:52:34 +00:00
Joachim Protze 14b512e20c [OMPT] Fix ompt_task_data handling in implicit barriers
Changes to task_data in barrier-begin were not visible at barrier-end

Differential Revision: https://reviews.llvm.org/D41176

llvm-svn: 322178
2018-01-10 12:51:27 +00:00
Jonas Hahnfeld f34d65a164 [OMPT] Fix cast and printf of wait_id in lock test
This didn't work on 32 bit platforms.

Differential Revision: https://reviews.llvm.org/D41853

llvm-svn: 322160
2018-01-10 08:10:23 +00:00
Paul Osmialowski 6db41e608f Fix type mismatch in omp_control_tool() implementation that makes it run incorrectly on 32-bit machines.
Differential Revision: https://reviews.llvm.org/D41854

llvm-svn: 322068
2018-01-09 10:54:06 +00:00
Jonas Hahnfeld 3ffca790f6 Correct types of pointers to doacross_num_done
This field is defined as kmp_int32, so we should use neither
pointers to kmp_int64 nor 64 bit atomic instructions.
(Found while testing on a Raspberry Pi, 32 bit ARM)

Differential Revision: https://reviews.llvm.org/D41656

llvm-svn: 321964
2018-01-07 16:54:36 +00:00
Jonathan Peyton 97f4320086 Fix some comments and formatting in kmp_dispatch.cpp
llvm-svn: 321831
2018-01-04 23:05:26 +00:00
Jonathan Peyton 8c432f2d5e Fix trademarks found by scanner
llvm-svn: 321827
2018-01-04 22:56:47 +00:00
Joachim Protze e5e4afd6db [OMPT] Build runtime with OMPT support by default
This patch enables OMPT by default if version 50 or later is built and the config says, that OMPT will be supported.

Differential Revision: https://reviews.llvm.org/D41508

llvm-svn: 321675
2018-01-02 21:09:00 +00:00
Jonas Hahnfeld 2e809acd0b Unify build documentation and convert to reStructuredText
We now have several options that apply for both libraries and they
shouldn't be documented in multiple files. When already merging
the two Build_With_CMake.txt documents, convert them to
reStructuredText which is used for all of LLVM's documentation.

Differential Revision: https://reviews.llvm.org/D40920

llvm-svn: 321481
2017-12-27 09:15:10 +00:00
Joachim Protze 265fb584a5 [OMPT] Set and reset frame address when creating a task with dependences
As for normal task creation, the task frame addresses need to be stored
for the encountering task.

Differential Revision: https://reviews.llvm.org/D41165

llvm-svn: 321421
2017-12-24 07:30:23 +00:00
Paul Osmialowski 6b8141acdd [OMPT] Add missing initialization in nested_lwt.c test case
Without this initialization this test case tend to fail.

Differential Revision: https://reviews.llvm.org/D41542

llvm-svn: 321379
2017-12-22 19:24:06 +00:00
Joachim Protze 9c9b61df7e [OMPT] Fix failing test cases for gcc on Ubuntu
The compiler warns that _BSD_SOURCE is deprecated and _DEFAULT_SOURCE should
be used instead. We keep _BSD_SOURCE for older compilers, that don't know
about _DEFAULT_SOURCE.

The linker drops the tool when linking, since there is no visible need for
the library. So we need to tell the linker, that the tool should be linked
anyway.

Differential Revision: https://reviews.llvm.org/D41499

llvm-svn: 321362
2017-12-22 16:40:32 +00:00
Joachim Protze 25aa3ec1c5 Remove unused positional argument for printf
The format string for hints only prints the second argument (string) and drops
the first argument (hint id). Depending on how you read the POSIX text for
printf, this could be valid. But for practical reason, i.e., unpacking the
va_list passed to printf based on the formating information, it makes sense
to fix the implementation and not pass the id for hint.

Failing testcases were:

misc_bugs/teams-reduction.c
ompt/parallel/not_enough_threads.c

Differential Revision: https://reviews.llvm.org/D41504

llvm-svn: 321361
2017-12-22 16:40:26 +00:00
Joachim Protze e8d84a67c2 Add missing test case from D41171 commit
llvm-svn: 321270
2017-12-21 14:36:36 +00:00
Joachim Protze f375f4b49a [OMPT] Add missing ompt_get_num_procs function
This function is defined in OpenMP-TR6 section 4.1.5.1.6
The functions was not implemented yet.

Since ompt-functions can only be called after the runtime was initialized and
has loaded a tool, it can assume the runtime to be initialized. In contrast
to omp_get_num_procs which needs to check whether the runtime is initialized.

Differential Revision: https://reviews.llvm.org/D40949

llvm-svn: 321269
2017-12-21 14:36:30 +00:00
Joachim Protze f8d22f9db8 [OMPT] Fix return address handling in a few GOMP interface methods
This revision fixes failing testcases with parallel for loops and the gomp
interface. The return address needs to be stored at entry to runtime.
The storage is cleared on usage, so we need to update the storage before
calling again internal functions, that will trigger event callbacks.

Differential Revision: https://reviews.llvm.org/D41181

llvm-svn: 321265
2017-12-21 13:55:39 +00:00
Joachim Protze 4fe83593eb [OMPT] Handle null pointer in set_callback to improve performance
We use the bitmap ompt_enabled thoughout the runtime, to avoid loading the
vector of callback functions when testing if specific code should be executed.
Before invoking an event callback function, the pointer is tested for NULL.

This revision resets the corresponding bit in ompt_enabled to 0 if
NULL is passed in set_callback.

Differential Revision: https://reviews.llvm.org/D41171

llvm-svn: 321264
2017-12-21 13:55:34 +00:00
Joachim Protze 0e2a2571ca [OMPT] Use frames at different level when using clang version 5 or higher with debug flag
Clang 5 or higher adds an intermediate function call in certain cases when
compiling with debug flag. This revision updates the testcases to work
correctly.

Differential Revision: https://reviews.llvm.org/D40595

llvm-svn: 321263
2017-12-21 13:55:29 +00:00
Joachim Protze 633bc4ca99 [OMPT] Add annotations to testcases that are expected to fail when using certain compilers
Reasons for expected failures are mainly bugs when using lables in OpenMP regions
or missing support of some OpenMP features.
For some worksharing clauses, support to distinguish the kind of workshare was
added just recently.

If an issue was fixed in a minor release version of a compiler, we flag the
test as unsupported for this compiler version to avoid false positives.
Same for fixes that where backported to older compiler versions.

Differential Revision: https://reviews.llvm.org/D40384

llvm-svn: 321262
2017-12-21 13:55:16 +00:00
Paul Osmialowski 17fb580c12 [AArch64] add required arch specific code for running OMPT test cases
Differential Revision: https://reviews.llvm.org/D41482

llvm-svn: 321258
2017-12-21 12:33:31 +00:00
Dimitry Andric e4f5d01033 Fix more inconsistent line endings. NFC.
llvm-svn: 321016
2017-12-18 19:46:56 +00:00
Paul Osmialowski 7634f7093a [AArch64] fix an issue with older /proc/cpuinfo layout
There are two /proc/cpuinfo layots in use for AArch64: old and new.
The old one has all 'processor : n' lines in one section, hence
checking for duplications does not make sense.

Differential Revision: https://reviews.llvm.org/D41000

llvm-svn: 320593
2017-12-13 16:12:24 +00:00
Jonas Hahnfeld 2fcce313ad [CMake] Remove legacy LIBOMP_LIT_ARGS
The bots have been updated, this option isn't needed anymore.

llvm-svn: 320153
2017-12-08 15:07:08 +00:00
Jonas Hahnfeld e628ab4c65 Use hyperbarrier by default on all architectures
All architectures except x86_64 used the linear barrier implementation
by default which doesn't give good performance for a larger number
of threads.

Improvements for PARALLEL overhead (EPCC) with this patch on a Power8
system (2 sockets x 10 cores x 8 threads, OMP_PLACES=cores)

 20 threads:  4.55us -> 3.49us
 40 threads:  8.84us -> 4.06us
 80 threads: 19.18us -> 4.74us
160 threads: 54.22us -> 6.73us

Differential Revision: https://reviews.llvm.org/D40358

llvm-svn: 320152
2017-12-08 15:07:07 +00:00
Jonas Hahnfeld ce528acf0d Fix thread affinity on non-x86 Linux
To make thread affinity work according to the OpenMP spec, the
runtime needs information about the hardware topology. On Linux
the default way is to parse /proc/cpuinfo which contains this
information for x86 machines but (at least) not for AArch64 and
Power architectures.

Fortunately, there is a different code path which is able to get
that data from sysfs. The needed patch has landed in 2006 for
Linux 2.6.16 which is safe to assume nowadays (even RHEL 5 had
a kernel version derived from 2.6.18, and we are now at RHEL 7!).

Differential Revision: https://reviews.llvm.org/D40357

llvm-svn: 320151
2017-12-08 15:07:05 +00:00
Jonas Hahnfeld 86c307821c Add missing memory barrier for queuing locks
Otherwise I see hangs in the omp_single_copyprivate test when
compiling in release mode. With the debug assertions, I get a
failure `head > 0 && tail > 0`.

Differential Revision: https://reviews.llvm.org/D40722

llvm-svn: 320150
2017-12-08 15:07:02 +00:00
Jonas Hahnfeld a7c4f3202b [libomptarget] Split implementation of interface functions
This last of four patches adds a new file for the interface
functions that Clang uses during code generation. The only
change except simply moving the current code is renaming the
function CheckDeviceAndCtors() and using the correct type for
64bit device ids.

Differential Revision: https://reviews.llvm.org/D40801

llvm-svn: 319972
2017-12-06 21:59:15 +00:00
Jonas Hahnfeld a3b147ab45 [libomptarget] Split implementation of API functions
This third patch moves the implementation of the user-facing
OpenMP API functions into its own file. For now, the code is
only moved, no cleanups applied yet.

Differential Revision: https://reviews.llvm.org/D40800

llvm-svn: 319971
2017-12-06 21:59:12 +00:00
Jonas Hahnfeld 3029616dfc [libomptarget] Split device functionality
This is the second patch to split the current monolithic
implementation into separate files. Note that this change
doesn't cleanup the code yet.

Differential Revision: https://reviews.llvm.org/D40799

llvm-svn: 319970
2017-12-06 21:59:09 +00:00
Jonas Hahnfeld 433228048a [libomptarget] Split RTL plugin functionality
This is the first of four patches to split the target agnostic
library into multiple (smaller) files. It only moves the code
to separate implementation files and does no cleanup (yet) except
removing unneeded headers.

Differential Revision: https://reviews.llvm.org/D40798

llvm-svn: 319969
2017-12-06 21:59:07 +00:00
Jonas Hahnfeld 2559fbdf08 [libomptarget] Move header files and CMake library definition
Future patches will add (private) header files in src/ that should
not be visible to plugins, so move the "public" ones to a new
include/ directory. This is still internal in a sense that the
contained files won't be installed for the user.
Similarly, the target agnostic offloading library should be built
directly in src/. The parent directory is responsible for finding
dependencies and including all subdirectories.

Differential Revision: https://reviews.llvm.org/D40797

llvm-svn: 319968
2017-12-06 21:59:04 +00:00
Jonathan Peyton ebbcb43976 [OpenMP] Add entry for Intel Compiler 18
Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D40386

llvm-svn: 319961
2017-12-06 21:15:28 +00:00
Jonathan Peyton 125203e003 Eliminate double printing of verbose affinity settings
Redundant extra verbose output of binding to full mask in case
affinity=balanced or OMP_PLACES=<any> or OMP_PROC_BIND=<any>

Differential Revision: https://reviews.llvm.org/D40624

llvm-svn: 319960
2017-12-06 21:07:41 +00:00
Jonathan Peyton ec5b87188d Trivial enum fix
This change is a trivial fix for enums that removes specification of "last" or
"upper" values, or other boundary values. This simplifies the code in places,
and results in never needing to update the "upper" values again.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D40804

llvm-svn: 319957
2017-12-06 21:02:15 +00:00
Jonas Hahnfeld 241d1d9e17 Fix alignment in teams-reduction.c test
The runtime will use the global kmp_critical_name as a lock and
tries to atomically store a pointer in there. This will fail
if the global is only aligned by 4 bytes, the size of one int32_t
element. Use a union to ensure the global is aligned to the size
of a pointer on the current platform.

llvm-svn: 319811
2017-12-05 18:45:21 +00:00
Jonas Hahnfeld a4ca525c1b Fix PR30890: Reduction across teams hangs
__kmpc_reduce_nowait() correctly swapped the teams for reductions
in a teams construct. Apply the same logic to __kmpc_reduce() and
__kmpc_reduce_end().

Differential Revision: https://reviews.llvm.org/D40753

llvm-svn: 319788
2017-12-05 16:51:24 +00:00
Jonas Hahnfeld fc473dee98 [CMake] Detect information about test compiler
Perform a nested CMake invocation to avoid writing our own parser
for compiler versions when we are not testing the in-tree compiler.
Use the extracted information to mark a test as unsupported that
hangs with Clang prior to version 4.0.1 and restrict tests for
libomptarget to Clang version 6.0.0 and later.

Differential Revision: https://reviews.llvm.org/D40083

llvm-svn: 319448
2017-11-30 17:08:31 +00:00
Andrey Churbanov a5868215b4 Extension of HWLOC topology discovery with NUMA nodes and tiles
Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40309

llvm-svn: 319422
2017-11-30 11:51:47 +00:00
Jonathan Peyton ba55a7b958 Make kmp_r_sched_t into a union
This change makes kmp_r_sched_t type into a union for simpler
comparisons and assignments

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D40374

llvm-svn: 319379
2017-11-29 22:47:52 +00:00
Jonathan Peyton 62da55020b Fix aligned memory allocation in the stub library
kmp_aligned_malloc() always returned NULL on Windows (stub library only)
that may cause Fortran application crash.  With this change all memory
allocation functions were fixed to use aligned{m,re,rec}alloc() to
allocate/reallocate memory. To deallocate that memory _aligned_free() is
used in kmp_free().

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40296

llvm-svn: 319375
2017-11-29 22:29:38 +00:00
Jonathan Peyton 64249504b5 Warning is emitted when tiles are requested but cannot be used
Added two warnings:
1) Before building the topology map check if tiles are requested but the
   topo method is not hwloc;
2) After building the topology map check if tiles are requested but not
   detected by the library.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40340

llvm-svn: 319374
2017-11-29 22:27:18 +00:00
Jonathan Peyton 92ce4bcfd8 Fix types of Fortran array elements
Fortran array elements made default integer in OMP_GET_PLACE_PROC_IDS and
OMP_GET_PARTITION_PLACE_NUMS subroutines, otherwise call to them produces
incorrect result.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40356

llvm-svn: 319372
2017-11-29 22:23:44 +00:00
Jonas Hahnfeld 18bec60bc2 [CMake] Refactor testing infrastructure
The code for the two OpenMP runtime libraries was very similar.
Move to common CMake file that is included and provides a simple
interface for adding testsuites. Also add a common check-openmp
target that runs all testsuites that have been registered.

Note that this renames all test options to the common OPENMP
namespace, for example OPENMP_TEST_C_COMPILER instead of
LIBOMP_TEST_COMPILER and so on.

Differential Revision: https://reviews.llvm.org/D40082

llvm-svn: 319343
2017-11-29 19:31:52 +00:00
Jonas Hahnfeld 5af381acad [CMake] Refactor common settings and flags
These are needed by both libraries, so we can do that in a
common namespace and unify configuration parameters.
Also make sure that the user isn't requesting libomptarget
if the library cannot be built on the system. Issue an error
in that case.

Differential Revision: https://reviews.llvm.org/D40081

llvm-svn: 319342
2017-11-29 19:31:48 +00:00
Jonas Hahnfeld 3e921d3c52 [CMake] Disallow direct configuration
As a first step, this allows us to generalize the detection of
standalone builds and make it fully compatible when building in
llvm/runtimes/ which automatically sets OPENMP_STANDLONE_BUILD.

Differential Revision: https://reviews.llvm.org/D40080

llvm-svn: 319341
2017-11-29 19:31:43 +00:00
Ben Hamilton 35c736671a [openmp] Set up .arcconfig to point to new Diffusion OMP repository
Summary:
We want to automatically copy the appropriate mailing list
for review requests to the openmp repository.

For context, see the proposal and discussion here:

http://lists.llvm.org/pipermail/cfe-dev/2017-November/056032.html

Similar to D40179, I set up a new Diffusion repository with callsign
"OMP" for OpenMP:

https://reviews.llvm.org/source/openmp/

This explicitly updates openmp's .arcconfig to point to the new
OMP repository in Diffusion, which will let us use Herald rule H272
to automatically subscribe openmp-commits to review requests.

Reviewers: hans, grokos, Hahnfeld

Reviewed By: grokos

Subscribers: sammccall, klimek, openmp-commits

Differential Revision: https://reviews.llvm.org/D40499

llvm-svn: 319254
2017-11-28 23:28:45 +00:00
Sylvestre Ledru 67e60434c3 doxygen: disable the html timestamp: this is breaking the reproducible build of openmp
llvm-svn: 318978
2017-11-25 14:12:33 +00:00
Jonas Hahnfeld 221e7bb1fc Fix for OMP doacross implementation on Power
Power has a weak consistency model so we need memory barriers to
make writes (both from runtime and from user code) available for
all threads.

Differential Revision: https://reviews.llvm.org/D40175

llvm-svn: 318848
2017-11-22 17:15:20 +00:00
Jonas Hahnfeld 318dd729cc [CMake] Re-enable libomptarget and restrict tests to Clang 6.0.0
We have just fixed the codegen of omp_is_initial_device() to reliably work
when offloading to the same device, see commit r316001. This fixes the
failing tests that were the reason why we disabled the library for 5.0.

Differential Revision: https://reviews.llvm.org/D39052

llvm-svn: 318847
2017-11-22 17:15:18 +00:00
George Rokos b92dbb4b0a [Clang][OpenMP] New clang/libomptarget map interface: new function signatures, libomptarget-side
This is the libomptarget-side patch which changes the __tgt_* API function signatures in preparation for the new map interface.
Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits

Differential revision: https://reviews.llvm.org/D40313

llvm-svn: 318790
2017-11-21 18:26:41 +00:00
Andrey Churbanov 58acafc424 Fixed OMP doacross implementation on 32-bit platforms.
Differential Revision: https://reviews.llvm.org/D40171

llvm-svn: 318658
2017-11-20 16:00:42 +00:00
Jonas Hahnfeld 0924094e34 [OMPT] Fix inaccuracies in worksharing tests
These tests were failing rarely on my MacBook when there was some
activity in the background. Read: one of a thousand executions?

 * sections.c missed the sorting based on thread ids. This worked
   as long as the master thread finished its section before the
   worker thread started the second one but failed if the master
   thread was put to sleep by the OS.
 * The checks in single.c assumed that the master thread executes
   the single region which works most of the time because it is
   usually faster than the newly spawned worker thread.

Differential Revision: https://reviews.llvm.org/D39853

llvm-svn: 318527
2017-11-17 15:26:44 +00:00
Andrey Churbanov a756cb240a Exclude untied tasks from checking of task scheduling constraint (TSC).
This can improve performance of tests with untied tasks.

Differential Revision: https://reviews.llvm.org/D39613

llvm-svn: 318388
2017-11-16 10:45:07 +00:00
Jonathan Peyton f902e467b7 [OpenMP] Remove the unused testsuite/ directory
The testsuite directory is not used or updated and confuses new users to the
OpenMP project. These tests were rewritten using the lit format and put under
the runtime/test directory. This patch removes the entire testsuite/ directory.

Differential Revision: https://reviews.llvm.org/D39767

llvm-svn: 318056
2017-11-13 17:44:48 +00:00
Jonas Hahnfeld d0ef19ef9b [OMPT] Provide initialization for Mac OS X
Traditionally, the library had a weak symbol for ompt_start_tool()
that served as fallback and disabled OMPT if called. Tools could
provide their own version and replace the default implementation
to register callbacks and lookup functions. This mechanism has
worked reasonably well on Linux systems where this interface was
initially developed.

On Darwin / Mac OS X the situation is a bit more complicated and
the weak symbol doesn't work out-of-the-box. In my tests, the
library with the tool needed to link against the OpenMP runtime
to make the process work. This would effectively mean that a tool
needed to choose a runtime library whereas one design goal of the
interface was to allow tools that are agnostic of the runtime.

The solution is to use dlsym() with the argument RTLD_DEFAULT so
that static implementations of ompt_start_tool() are found in the
main executable. This works because the linker on Mac OS X includes
all symbols of an executable in the global symbol table by default.
To use the same code path on Linux, the application would need to
be built with -Wl,--export-dynamic. To avoid this restriction, we
continue to use weak symbols on Linux systems as before.

Finally this patch extends the existing test to cover all possible
ways of initializing the tool as described by the standard. It
also fixes ompt_finalize() to not call omp_get_thread_num() when
the library is shut down which resulted in hangs on Darwin.
The changes have been tested on Linux to make sure that it passes
the current tests as well as the newly extended one.

Differential Revision: https://reviews.llvm.org/D39801

llvm-svn: 317980
2017-11-11 13:59:48 +00:00
Jonas Hahnfeld 3acaa701ad [libomptarget] Build all libraries in libomptarget/
In standalone build, plugins where previously built in their
subdirectory in plugins/ and tests couldn't find them.

Differential Revision: https://reviews.llvm.org/D39920

llvm-svn: 317979
2017-11-11 13:59:45 +00:00
Joachim Protze 91732475a6 [OMPT] Fix assertion for OpenMP code generated with outdated compilers
For up-to-date compilers, this assertion is reasonable, but it breaks
compatibility with the typical compiler installed on most systems.
This patch changes the default value to what we had when there was no
compiler support. A warning about the outdated compiler is printed during
runtime, when this point is reached.

Differential Revision: https://reviews.llvm.org/D39890

llvm-svn: 317928
2017-11-10 21:07:01 +00:00
Jonas Hahnfeld d30cb27a17 [OMPT] Purge OMPT_BLAME and OMPT_TRACE
This was replace by OMPT_OPTIONAL.

llvm-svn: 317890
2017-11-10 15:17:57 +00:00
Jonas Hahnfeld e9b7c0a392 Add const to some variables to avoid const_casts
In these places the const attribute seems correct and doesn't
need any other change, so let's do it.

Differential Revision: https://reviews.llvm.org/D39756

llvm-svn: 317798
2017-11-09 15:52:29 +00:00
Jonas Hahnfeld aeb40adabf Remove const from variables with dynamic memory
Allocated memory is typically not 'const' if it needs to be freed.
This patch removes around 50 wrong const attributes, modifies the
corresponding functions and finally gets rid of some const_casts.
These have especially been strange for __kmp_str_fname_free() that
added a 'const' to call __kmp_str_free() which removed it again.

Two minor cleanups that I performed in this process:
 * __kmp_tool_libraries now lives in kmp_settings.cpp as it is
   used nowhere else.
 * __kmp_msg_empty was removed as it was never used and Clang
   now complained that it was assigned a string literal that
   is 'const char *'.

Differential Revision: https://reviews.llvm.org/D39755

llvm-svn: 317797
2017-11-09 15:52:25 +00:00
Jonas Hahnfeld c60300333e [OMPT] Fix test cancel_parallel.c
If a parallel region is cancelled, execution resumes at the end
of the structured block. That is why this test cannot use the
"normal" macros that print right after inserting the label.
Instead it previously printed the addresses before the pragma
and swapped the checks compared to the other tests.

However, this does not work because FileChecks '*' is greedy
so that RETURN_ADDRESS always matched the second address. This
makes the test fail when an "overflow" occurrs and the first
address matches the value of codeptr_ra.

I discovered this on my MacBook but I'm unable to reproduce the
failure with the current version. Nevertheless we should fix this
problem to avoid that this test fails later after an unrelated change.

Differential Revision: https://reviews.llvm.org/D39708

llvm-svn: 317787
2017-11-09 14:26:14 +00:00
Jonas Hahnfeld 380346fce1 [OMPT] Add support for testing return addresses on POWER
Return addresses are determined based on the address of a label
that is inserted directly after a pragma / API call. In some cases
the tests can assume a known number of instructions between the
addresses. However, the instructions and their encoded lengths
depend on the target that the test is compiled on.

Firstly, this patch refactors the macro print_current_address() to
allow such target dependent modifications and adds information for
the observed instructions on POWER. Secondly, it adapts the related
macro print_fuzzy_address() to reuse much of "hacky" code and fixes
the used formatting strings in the printf() call. Finally, it also
adds documentation about how these macros are intended to work.

Differential Revision: https://reviews.llvm.org/D39699

llvm-svn: 317786
2017-11-09 14:26:12 +00:00
Jonathan Peyton 40039ac98c Cleanup version symbol macros and attributes/declspecs
1) Get rid of xaliasify, xexpand and xversionify for KMP_EXPAND_NAME and
KMP_VERSION_SYMBOL. KMP_VERSION_SYMBOL is a combination of xaliasify and
xversionify.

2) Put all attribute and __declspec definitions in kmp_os.h

Differential Revision: https://reviews.llvm.org/D39516

llvm-svn: 317636
2017-11-07 23:32:13 +00:00
Jonas Hahnfeld ba84ca9efb [OMPT] Fix null pointer in parallel/no_thread_num_clause.c
Looks like the implementation of printf on Darwin uses "0x0"
instead of "(nil)" like glibc does.

llvm-svn: 317515
2017-11-06 22:06:14 +00:00
Jonas Hahnfeld dc5d849e2b [OMPT] Fix callback.h for tests for changes in TR6
This was also lost in the last commit.

llvm-svn: 317484
2017-11-06 15:13:06 +00:00
Jonas Hahnfeld 13dc13ef09 [OMPT] Improve cast that was lost on commit, NFC.
llvm-svn: 317480
2017-11-06 14:33:09 +00:00
Joachim Protze cab9cdc2ad Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)
The TR6 document is expected to be publically released around November 15.
This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D39182

llvm-svn: 317436
2017-11-05 14:11:19 +00:00
Joachim Protze c255ca70ce Rename fields of ompt_frame_t
This is part of the renaming of data types from OpenMP TR4 to TR6

Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D39326

llvm-svn: 317435
2017-11-05 14:11:10 +00:00
Jonas Hahnfeld b71424fda5 Revert "Rename fields of ompt_frame_t"
This reverts commit r317338 which discarded some recent commits.

llvm-svn: 317347
2017-11-03 18:28:25 +00:00
Jonas Hahnfeld f0a1c65fb0 Revert "Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)"
This reverts commit r317339 which discarded some recent commits.

llvm-svn: 317346
2017-11-03 18:28:19 +00:00
Joachim Protze 924cff0a39 Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)
The TR6 document is expected to be publically released around November 15.
This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D39182

llvm-svn: 317339
2017-11-03 17:09:00 +00:00
Joachim Protze 741572593f Rename fields of ompt_frame_t
This is part of the renaming of data types from OpenMP TR4 to TR6

Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D39326

llvm-svn: 317338
2017-11-03 17:08:40 +00:00
Jonas Hahnfeld 238a6b7f09 [libomptarget] Remove stale omp handle
This was never used in the upstream compiler and was responsible
for some problems with reductions in the clang-ykt fork.

Differential Revision: https://reviews.llvm.org/D39553

llvm-svn: 317214
2017-11-02 15:59:51 +00:00
Jonathan Peyton 3d18a37ca9 [OpenMP] Fix race condition in omp_init_lock
This is a partial fix for bug 34050.

This prevents callers of omp_set_lock (which does not hold __kmp_global_lock)
from ever seeing an uninitialized version of __kmp_i_lock_table.table.

It does not solve a use-after-free race condition if omp_set_lock obtains a
pointer to __kmp_i_lock_table.table before it is updated and then attempts to
dereference afterwards. That race is far less likely and can be handled in a
separate patch.

The unit test usually segfaults on the current trunk revision. It passes with
the patch.

Patch by Adam Azarchs

Differential Revision: https://reviews.llvm.org/D39439

llvm-svn: 317115
2017-11-01 19:44:42 +00:00
Joachim Protze 82e94a5934 Update implementation of OMPT to the specification OpenMP 5.0 Preview 1 (TR4).
The code is tested to work with latest clang, GNU and Intel compiler. The implementation
is optimized for low overhead when no tool is attached shifting the cost to execution with
tool attached.

This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D38185

llvm-svn: 317085
2017-11-01 10:08:30 +00:00
Joachim Protze 0f4c5712b4 Test commit: sort names in CREDITS.txt
llvm-svn: 316922
2017-10-30 16:44:00 +00:00
Jonathan Peyton 5e6cb9022c Fix fatal error message displaying
Replacing call to __kmp_msg(kmp_ms_fatal,...) with __kmp_fatal(...) caused an
issue when incomplete message is displayed in case an error message is followed
by another message (e.g. by a hint messa)ge. This is because __kmp_fatal()
passes incomplete list of arguments to __kmp_msg().

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D39248

llvm-svn: 316623
2017-10-25 22:05:02 +00:00
Jonathan Peyton dff0ee2f4e Disable threadprivate data cleanup if runtime is terminating
The problem is due to the runtime's threadprivate cleanup code which tries to
access data that was already destroyed by one of the root threads.
__kmp_init_gtid is used as a checker here since it is set to false before actual
resource cleanup is done in __kmp_cleanup().

Patch by Hansang Bae

llvm-svn: 316452
2017-10-24 16:10:09 +00:00
Jonathan Peyton 7b16ae201f Restrict OMPT to OpenMP version 5.0 and remove old header files
Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D38876

llvm-svn: 316234
2017-10-20 20:14:46 +00:00
Jonathan Peyton 48db80cc6c Add license envirable for testing Intel compilers
Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D38881

llvm-svn: 316232
2017-10-20 19:45:43 +00:00
Jonathan Peyton 16a05bca9c Add C++ support for testcases
Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D38878

llvm-svn: 316230
2017-10-20 19:42:32 +00:00
Jonathan Peyton 94a114fc39 Apply formatting changes
.clang-format's comments are removed and a (hopefully) final
set of formatting changes are applied.

Differential Revision: https://reviews.llvm.org/D38837
Differential Revision: https://reviews.llvm.org/D38920

llvm-svn: 316227
2017-10-20 19:30:57 +00:00
Jonathan Peyton 3f850bfcf0 KMP_HW_SUBSET vs KMP_PLACE_THREADS rival envirables fix
If both KMP_HW_SUBSET and KMP_PLACE_THREADS are set and KMP_PLACE_THREADS gets
parsed first, then the current environment variable parser rejects both and
neither get used. This patch uses the rivals mechanism that is used for other
environment variable groups (e.g., KMP_STACKSIZE, GOMP_STACKSIZE, OMP_STACKSIZE).
If both are set, then it tells the user that it is ignoring KMP_PLACE_THREADS in
favor of KMP_HW_SUBSET. The message about deprecating KMP_PLACE_THREADS when it
is set is still printed regardless.

Differential Revision: https://reviews.llvm.org/D38292

llvm-svn: 315091
2017-10-06 19:23:19 +00:00
Jonas Hahnfeld 5872f1e97f [test] Fix uninitialized memory in omp_taskloop_grainsize.c
result was never initialized to zero which sometimes failed the test.

llvm-svn: 314513
2017-09-29 13:53:03 +00:00
Jonathan Peyton bd3a7633f1 Remove unnecessary semicolons
Removes semicolons after if {} blocks, function definitions, etc.
I was able to apply the large OMPT patch cleanly on top of this one
with no conflicts.

llvm-svn: 314340
2017-09-27 20:36:27 +00:00
Jonathan Peyton 8f3d7448b9 Allow printing of KMP_TOPOLOGY_METHOD when KMP_SETTINGS=true
llvm-svn: 314243
2017-09-26 20:33:53 +00:00
Jonathan Peyton 6de85b1565 Remove unused t_single_lock
Add padding inside team structure to keep same structure size.

llvm-svn: 314242
2017-09-26 20:12:16 +00:00
Jonathan Peyton 52527cd2c1 Read blocktime value set by kmp_set_blocktime() before reading from KMP_BLOCKTIME
Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D37403

llvm-svn: 312539
2017-09-05 15:45:48 +00:00
Jonathan Peyton 6a393f75f4 Minor code cleanup of Klocwork issues
Minor code cleanup of Klocwork issues. Fatal messages are given no return
attribute. Define and use KMP_NORETURN to work for multiple C++ versions.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D37275

llvm-svn: 312538
2017-09-05 15:43:58 +00:00
Jonathan Peyton 0447708f8d Use va_copy instead of __va_copy to fix building libomp against musl libc
Fixes https://bugs.llvm.org/show_bug.cgi?id=34040

Patch by Peter Levine

Differential Revision: https://reviews.llvm.org/D36343

llvm-svn: 311269
2017-08-19 23:53:36 +00:00
Jonathan Peyton d4daf4540a Remove BUILD_TV
Cleanup code to remove BUILD_TV and unused code bracketed by it.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D36011

llvm-svn: 311114
2017-08-17 19:09:28 +00:00
Sergey Dmitriev b305d26b57 [OpenMP] libomptarget: move debugging dumps under control of env var LIBOMPTARGET_DEBUG
Disable default debugging dumps for libomptarget and plugins and move dumps
under control of environment variable LIBOMPTARGET_DEBUG=<integer>. Dumps
are enabled when LIBOMPTARGET_DEBUG is set to a positive integer value.

Debugging dumps are available only in debug build; release build does not
support it.

Differential Revision: https://reviews.llvm.org/D33227

llvm-svn: 310841
2017-08-14 15:09:59 +00:00
Paul Osmialowski a016279422 OMP_PROC_BIND: better spread
This change improves the way threads are spread across cores
when OMP_PROC_BIND=spread is set and no unusual affinity masks are in use.

Differential Revision: https://reviews.llvm.org/D36510

llvm-svn: 310670
2017-08-10 23:04:11 +00:00
Jonathan Peyton 038855ade8 Exclude version symbols for static libomp
We use symbol versioning for GNU-compatibility but libgomp has versioned symbols
only in the shared library but not in the static.
Moreover, version symbols in the static library can cause an error at link time.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D36225

llvm-svn: 309877
2017-08-02 20:10:00 +00:00
Jonathan Peyton 1b536724d9 Move lock acquire/release functions in task deque cleanup code
The original locations can be reached without initializing the lock variable
(td_deque_lock), so it is potentially unsafe.  It is guaranteed that the lock
is initialized if the deque (td_deque) is not NULL, and lock functions can be
safely called.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D36017

llvm-svn: 309875
2017-08-02 20:06:32 +00:00
Jonathan Peyton 4f90c82aec Add new envirable KMP_TEAMS_THREAD_LIMIT
This change adds a new environment variable, KMP_TEAMS_THREAD_LIMIT, which is
used to set a new global variable, __kmp_teams_max_nth, which is checked when
determining the size and quantity of teams that will be created in the teams
construct. Specifically, it is a limit on the total number of threads in a given
teams construct. It differentiates the limits for the teams construct from the
limits for regular parallel regions (KMP_DEVICE_THREAD_LIMIT/__kmp_max_nth and
OMP_THREAD_LIMIT/__kmp_cg_max_nth). When each individual team is formed, it is
still subject to those limits. After the clauses to the teams construct are
parsed and calculated, we check to make sure we are within this limit, and if
not, reduce num_threads per team and/or number of teams, accordingly. The
default value is set to the number of available processors on the system.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D36009

llvm-svn: 309874
2017-08-02 20:04:45 +00:00
Jonathan Peyton 644f4e3d11 Fix comments and build messages concerning TSX
llvm-svn: 309418
2017-07-28 19:05:17 +00:00
Jonathan Peyton f439246328 Fix implementation of OMP_THREAD_LIMIT
This change fixes the implementation of OMP_THREAD_LIMIT. The implementation of
this previously was not restricted to a contention group (but it should be,
according to the spec), and this is fixed here. A field is added to root thread
to store a counter of the threads in the contention group. An extra check is
added when reserving threads for a parallel region that checks this variable and
compares to threadlimit-var, which is implemented as a new global variable,
kmp_cg_max_nth. Associated settings changes were also made, and clean up of
comments that referred to OMP_THREAD_LIMIT, but should refer to the new
KMP_DEVICE_THREAD_LIMIT (added in an earlier patch).

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D35912

llvm-svn: 309319
2017-07-27 20:58:41 +00:00
Jonathan Peyton 09244f39dd Introduce KMP_DEVICE_THREAD_LIMIT
This change drops in KMP_DEVICE_THREAD_LIMIT to replace KMP_MAX_THREADS. It's
possible there will eventually be a OMP_DEVICE_THREAD_LIMIT, and we need
something to distinguish from OMP_THREAD_LIMIT, which is currently implemented
incorrectly (the fix for that will be added soon in a separate patch).
KMP_ALL_THREADS is deprecated here, but we can keep the "all" option on
KMP_DEVICE_THREAD_LIMIT to support that functionality. KMP_DEVICE_THREAD_LIMIT
now has priority over its deprecated rival KMP_ALL_THREADS. I also cleaned up
some comments that incorrectly referred to non-existent kmp_max_threads variable
instead of kmp_max_nth.

I've left the name of where this setting eventually ends up as
__kmp_max_nth, for now.

This change does not change much in the way of functionality. It does NOT change
OMP_THREAD_LIMIT. It's just cleaning up and setting up for that.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D35860

llvm-svn: 309168
2017-07-26 20:07:58 +00:00
Jonas Hahnfeld 203c730719 [CMake] Disable building libomptarget and add CMake switch
Introduce OPENMP_ENABLE_LIBOMPTARGET which defaults to OFF at the moment.

libomptarget is not yet ready for prime time:
 - Offloading to NVIDIA GPUs is not completed yet (compiler, device RTL)
 - The generic ELF plugin for offloading to the host (meant for testing)
   uses a single instance of the OpenMP runtime (libomp). That is why
   omp_is_initial_device() returns 1 which makes the tests fail.
Because of these reasons, we want to disable building (and testing!)
for release 5.0.

See https://bugs.llvm.org/show_bug.cgi?id=33859

Differential Revision: https://reviews.llvm.org/D35719

llvm-svn: 309115
2017-07-26 13:55:00 +00:00
Jonathan Peyton d74d890247 Cleanup: __kmp_env_* variables
Removed unused __kmp_env_* variables. Also clangified other people's code.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D35808

llvm-svn: 309000
2017-07-25 18:20:16 +00:00
NAKAMURA Takumi 0c7d6ef459 Whitespace.
llvm-svn: 308693
2017-07-20 23:12:39 +00:00
Andrey Churbanov c7476ed0be OpenMP RTL cleanup: two PAUSEs per spin loop iteration replaced with single one
Differential Revision: https://reviews.llvm.org/D35490

llvm-svn: 308423
2017-07-19 09:26:13 +00:00
Dimitry Andric 0c7238b21c For KMP_PAGE_SIZE, use getpagesize() on Unix, GetSystemInfo() on Windows
Summary:
The kmp_os.h header is defining the `PAGE_SIZE` macro unconditionally,
even while it is only used directly after its definition, for the
Windows implementation of the `KMP_GET_PAGE_SIZE()` macro.

On at least FreeBSD, but likely all other BSDs too, this macro conflicts
with the one defined in system headers, so remove it, since nothing else
uses it.  Make all Unixes use `getpagesize()` instead, and use
`GetSystemInfo()` for the Windows case.

Reviewers: jlpeyton, jcownie, emaste, AndreyChurbanov

Reviewed By: AndreyChurbanov

Subscribers: AndreyChurbanov, hfinkel, zturner

Differential Revision: https://reviews.llvm.org/D35072

llvm-svn: 308355
2017-07-18 20:31:19 +00:00
Jonathan Peyton 1c50ee64a2 Fix failing taskloop tests by omitting gcc
We do not have GOMP interface support for taskloop yet.

llvm-svn: 308351
2017-07-18 20:16:25 +00:00
Jonathan Peyton 93e17cfe6c Add recursive task scheduling strategy to taskloop implementation
Summary:
Taskloop implementation is extended by using recursive task scheduling.
Envirable KMP_TASKLOOP_MIN_TASKS added as a manual threshold for the user
to switch from recursive to linear tasks scheduling.

Details:
* The calculations for the loop parameters are moved from __kmp_taskloop_linear
  upper level
* Initial calculation is done in the __kmpc_taskloop, further range splitting
  is done in the __kmp_taskloop_recur.
* Added threshold to switch from recursive to linear tasks scheduling;
* One half of split range is scheduled as an internal task which just moves
  sub-range parameters to the stealing thread that continues recursive
  scheduling (if number of tasks still enough), the other half is processed
  recursively;
* Internal task duplication routine fixed to assign parent task, that was not
  needed when all tasks were scheduled by same thread, but is needed now.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D35273

llvm-svn: 308338
2017-07-18 18:50:13 +00:00
Andrey Churbanov 71483f2dda Fix sporadic segfaults in tasking tests.
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D35535

llvm-svn: 308298
2017-07-18 11:56:16 +00:00
Andrey Churbanov ddc38722a4 OpenMP RTL cleanup: nullify pointer after memory freeing
Differential Revision: https://reviews.llvm.org/D35497

llvm-svn: 308274
2017-07-18 08:30:03 +00:00
Jonathan Peyton f6f2c6e47f Removed "duplicates" from verbose affinity output
The internal details of this setting are not meant to be user visible and only create confusion.

Differential Revision: https://reviews.llvm.org/D35269

llvm-svn: 308189
2017-07-17 17:06:43 +00:00
Andrey Churbanov 5ba90c7979 OpenMP RTL cleanup: eliminated warnings with -Wcast-qual, patch 2.
Changes are: got all atomics to accept volatile pointers that allowed
to simplify many type conversions. Windows specific code fixed correspondingly.

Differential Revision: https://reviews.llvm.org/D35417

llvm-svn: 308164
2017-07-17 09:03:14 +00:00
Jonas Hahnfeld 266ddafc68 [GOMP] Fix (un)tied tasks with the GCC
The first bit is actually the "untied" flag. That is why the condition was
wrong and has to be inverted to set the flag correctly.

Found and initial patch by Simon Convent!

llvm-svn: 307899
2017-07-13 10:38:11 +00:00
Dimitry Andric b9fb12291a Rename z_Linux_asm.s to z_Linux_asm.S
Summary:
On Unix, a .S file is normally an assembly source which must be
preprocessed with a C preprocessor, while a .s file is "plain" assembly.
The former is handled by the compiler driver (cc), the latter is
directly passed to the assembler binary (as).

Because z_Linux_asm.s is supposed to be preprocessed, rename it to .S,
so it can be automatically picked up correctly by build systems.

Reviewers: AndreyChurbanov, emaste, jlpeyton

Reviewed By: AndreyChurbanov

Subscribers: mgorny, openmp-commits

Differential Revision: https://reviews.llvm.org/D35171

llvm-svn: 307680
2017-07-11 18:04:56 +00:00
Dimitry Andric 79bf29ccb7 Add a .arcconfig file for openmp.
llvm-svn: 307474
2017-07-08 16:09:47 +00:00
Ed Maste 414544c9aa remove deprecated register storage class specifier
While importing libomp into the FreeBSD base system we encountered
Clang warnings that "'register' storage class specifier is deprecated
and incompatible with C++1z [-Wdeprecated-register]".

Differential Revision:	https://reviews.llvm.org/D35124

llvm-svn: 307441
2017-07-07 21:06:05 +00:00
Ed Maste 78b0f075f7 remove duplicate symbol version script entries
GNU ld ignores duplicates, but lld produces a warning.

Differential Revision:	https://reviews.llvm.org/D35121

llvm-svn: 307399
2017-07-07 13:45:41 +00:00
Jonathan Peyton d0494046c7 Fix wrong website in messages
Address user message bug where the messages were sending users to Intel's
website instead of the LLVM OpenMP runtime websites.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=32892

Differential Revision: https://reviews.llvm.org/D35018

llvm-svn: 307206
2017-07-05 22:01:05 +00:00
Andrey Churbanov c47afcd9bb OpenMP RTL cleanup: eliminated warnings with -Wcast-qual.
Changes are: replaced C-style casts with cons_cast and reinterpret_cast;
type of several counters changed to signed; type of parameters of 32-bit and
64-bit AND and OR intrinsics changes to unsigned; changed files formatted
using clang-format version 3.8.1.

Differential Revision: https://reviews.llvm.org/D34759

llvm-svn: 307020
2017-07-03 11:24:08 +00:00
Hal Finkel 2bc3449d22 Make test/parallel/omp_nested.c not use so many threads
I've found it very difficult to get test/parallel/omp_nested.c to pass
consistently across my build environments. The problem is that it creates N^2
threads (it is testing nested parallel regions), and that often exceeds the
thread limits on systems with many cores. We do raise the process limits in
lit, and that often helps, but if running lit with a smaller number of threads
or on a system where we're otherwise resource constrained, this particular test
tends to fail (because the runtime cannot create a sufficient number of
threads).

This seems to work: if the maximum number of threads is more than some small
number, then cap the number of threads used for the parallel region. The choice
of 4 here is somewhat arbitrary.

Differential Revision: https://reviews.llvm.org/D32033

llvm-svn: 306357
2017-06-27 03:04:25 +00:00
Dimitry Andric 695c69316b Only use libdl when it is available
Summary: On BSDs, there is no `libdl.so`, and functions like `dlopen`
are implemented in the main C library instead.  Use the `CMAKE_DL_LIBS`
variable instead of hardcoding a dependency on the `dl` library.

Reviewers: grokos, joerg, emaste

Reviewed By: emaste

Subscribers: jlpeyton, mgorny, openmp-commits

Differential Revision: https://reviews.llvm.org/D34632

llvm-svn: 306319
2017-06-26 19:16:49 +00:00
Jonathan Peyton 072ccb7239 Set affinity to none/false in child processes
Reset affinity to none (false for proc-bind-var) so that threads in the child
processes are not bound tightly, unless the user explicitly sets this in
KMP_AFFINITY/OMP_PROC_BIND, in child processes. This can improve
performance for scripting languages which fork for parallelism like Python's
multiprocessing module.

Differential Revision: https://reviews.llvm.org/D34154

llvm-svn: 305513
2017-06-15 21:51:07 +00:00
Jonathan Peyton 492e0a33cb Replace platform macro with KMP_MIC_SUPPORTED
Differential Revision: https://reviews.llvm.org/D34119

llvm-svn: 305307
2017-06-13 17:17:26 +00:00
Jonathan Peyton d330e630db Reset initial affinity in children processes
If OpenMP is initialized before fork()-ing occurs and affinity is set to
something like compact, then the master thread will be pinned to a single HW
thread/core after initialization. If the master (or any other thread) then
forks N processes, all N processes will then be pinned to that same single HW
thread/core. To reset the affinity for the new child process, the atfork
handler for the child process can call kmp_set_thread_affinity_mask_initial()
to reset its affinity to the initial affinity of the application before it
re-initializes libomp. The parent process will not be affected and still
keeps its affinity setting.

Differential Revision: https://reviews.llvm.org/D34118

llvm-svn: 305306
2017-06-13 17:16:12 +00:00
Samuel Antao 8933ffbb12 [OpenMP] Prevent unused-variable warning in libomptarget when compiling in Release mode.
llvm-svn: 305090
2017-06-09 16:46:07 +00:00
Jonathan Peyton ccfed2edb6 Fix static initializers for locks.
Fix static initializers to use the proper unlocked value for the poll
field of the tas and futex locks.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D33794

llvm-svn: 304828
2017-06-06 20:24:41 +00:00
Andrey Churbanov d454c73cc3 OpenMP 4.5: implemented support of schedule(simd:guided) and
schedule(simd:runtime) - library part. Compiler generation should use newly
introduced scheduling kinds kmp_sch_guided_simd = 46, kmp_sch_runtime_simd = 47,
as parameters to __kmpc_dispatch_init_* entries.

Differential Revision: https://reviews.llvm.org/D31602

llvm-svn: 304724
2017-06-05 17:17:33 +00:00
George Rokos 0e86bfb5bb [OpenMP] libomptarget: eliminate compiler warnings at build
Thanks to Sergey Dmitriev for submitting the patch.

Differential Revision: https://reviews.llvm.org/D33851

llvm-svn: 304601
2017-06-02 22:41:35 +00:00
Andrey Churbanov b3b10c2fa5 Re-enable assertion after the problem that caused it to be hit had been fixed
Differential Revision: https://reviews.llvm.org/D31421

llvm-svn: 304443
2017-06-01 18:10:45 +00:00
Jonathan Peyton 642688b632 Fix minor formatting issues
Some code was restructured to move it under KMP_DEBUG.  The rest is
formatting changes to fix some things broken by clang-format

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D33744

llvm-svn: 304438
2017-06-01 16:46:36 +00:00
Jonathan Peyton e3e2aaf68d Fix for KMP_AFFINITY=disabled and KMP_TOPOLOGY_METHOD=hwloc
With these settings, the create_hwloc_map() method was being called causing an
assert(). After some consideration, it was determined that disabling affinity
explicitly should just disable hwloc as well. i.e., KMP_AFFINITY overrides
KMP_TOPOLOGY_METHOD. This lets the user know that the Hwloc mechanism is being
ignored when KMP_AFFINITY=disabled.

Differential Revision: https://reviews.llvm.org/D33208

llvm-svn: 304344
2017-05-31 20:35:22 +00:00
Jonathan Peyton 9f5df8b02e Address default pinning OpenMP process with multiple processor groups
This change checks if the initial affinity mask is equal to exactly one
Windows processor group's affinity mask. If it is, then the code does not
respect the initial affinity mask and uses the entire machine instead.
The reasoning behind this is that, by default, Windows assigns exactly one
processor group as the initial affinity mask even when there are multiple
Windows processor groups available. User's typically want to use the whole
machine, so we ignore this special case and use the entire machine.

If the initial affinity mask is a proper subset of one group, or spans multiple
groups, then the initial affinity mask is respected since we can assume that the
operating system did not assign this initial affinity mask. This change only
affects machines with multiple processor groups

Differential Revision: https://reviews.llvm.org/D33210

llvm-svn: 304343
2017-05-31 20:33:56 +00:00
Jonathan Peyton 586849918b Fix for KMP_AFFINITY=respect with multiple processor groups
An assert() was being tripped when KMP_AFFINITY=respect + Multiple Processor
Groups. Let __kmp_affinity_create_proc_group_map() function be able to create
address2os object which contains a single group by deleting restriction that
process affinity mask must span multiple groups.

llvm-svn: 303101
2017-05-15 19:05:59 +00:00
Jonathan Peyton 6da813336c Remove some outdated comments
llvm-svn: 303086
2017-05-15 17:39:16 +00:00
Jonathan Peyton 9e704efaa6 Add the .clang-format file which the formatting was based on
llvm-svn: 303079
2017-05-15 16:39:42 +00:00
Jonathan Peyton 3041982dd1 Clang-format and whitespace cleanup of source code
This patch contains the clang-format and cleanup of the entire code base. Some
of clang-formats changes made the code look worse in places. A best effort was
made to resolve the bulk of these problems, but many remain. Most of the
problems were mangling line-breaks and tabbing of comments.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D32659

llvm-svn: 302929
2017-05-12 18:01:32 +00:00
George Rokos 1546d31924 [OpenMP] Changes in the plugin interface
This patch chagnes the plugin interface so that:
1) future plugins can take advantage of systems with shared CPU/device storage
2) instead of using base addresses, target regions are launched by providing target addresseds and base offsets explicitly.

Differential revision: https://reviews.llvm.org/D33028
 

llvm-svn: 302663
2017-05-10 14:12:36 +00:00
George Rokos 7ec263672c [OpenMP] libomptarget: test correction for use with OpenMP 4.5
Differential Revision: https://reviews.llvm.org/D32562

Thanks to Sergey Dmitriev for submitting the patch.

llvm-svn: 301577
2017-04-27 18:54:00 +00:00
Jonathan Peyton 20e13d4a38 Fix Hwloc API Incompatibility
Older Hwloc libraries (< 1.10.0) don't offer the HWLOC_OBJ_NUMANODE nor
HWLOC_OBJ_PACKAGE types. Instead they are named HWLOC_OBJ_NODE and
HWLOC_OBJ_SOCKET instead. This patch just defines the newer names based on
the older names when using an older Hwloc.

Differential Revision: https://reviews.llvm.org/D32496

llvm-svn: 301349
2017-04-25 19:04:07 +00:00
George Rokos c13df8e5e0 [OpenMP] Optimized default kernel launch parameters in CUDA plugin
Differential Revision: https://reviews.llvm.org/D32321

llvm-svn: 301321
2017-04-25 16:34:13 +00:00
George Rokos 4800fc4363 [OpenMP] Add missing parenthesis which triggers a compile error
Differential Revision: https://reviews.llvm.org/D32490

llvm-svn: 301318
2017-04-25 15:55:39 +00:00
George Rokos d57681b703 [OpenMP] libomptarget: Set ref count for global objects to positive infinity
Differential Revision: https://reviews.llvm.org/D32326

llvm-svn: 301076
2017-04-22 11:45:03 +00:00
George Rokos f9cb9c18a0 [OpenMP] libomptarget: Remove obsolete negative device IDs -2/-3
Differential Revision: https://reviews.llvm.org/D32325

llvm-svn: 301075
2017-04-22 11:21:54 +00:00
George Rokos 6c79cc2198 [OpenMP] Run libomptarget regression tests using all available system threads.
Differential Revision: https://reviews.llvm.org/D32327

llvm-svn: 301074
2017-04-22 11:20:20 +00:00
Andrey Churbanov 44fea6b864 Fix crash in invoking microtask on ios arm64.
Patch by Ni Hui.

Differential Revision: https://reviews.llvm.org/D31923

llvm-svn: 300448
2017-04-17 11:58:20 +00:00
Andrey Churbanov 4a9a89241b KMP_HW_SUBSET extended with NUMA support when HWLOC enabled
Differential Revision: https://reviews.llvm.org/D31600

llvm-svn: 300220
2017-04-13 17:15:07 +00:00
Olga Malysheva 80af9c081a Test cancellation_for_sections.c expectedly fails on GCC
llvm-svn: 299437
2017-04-04 14:39:52 +00:00
Olga Malysheva dbdcfa127f Reset cancellation status for 'parallel', 'sections' and 'for' constracts.
Without this fix cancellation status for parallel, sections and for persists 
across construct boundaries.

Differential Revision: https://reviews.llvm.org/D31419

llvm-svn: 299434
2017-04-04 13:56:50 +00:00
Olga Malysheva b7784ebdf7 Test check-in, comment changed
llvm-svn: 299428
2017-04-04 12:56:55 +00:00
Andrey Churbanov 31d39bfc5f Fix for bug https://llvm.org/bugs/show_bug.cgi?id=32456
ITT Notify disabled for static build of OpenMP RTL.

Differential Revision: https://reviews.llvm.org/D31466

llvm-svn: 299230
2017-03-31 16:20:07 +00:00
Andrey Churbanov cece72aa04 Fix for bug https://llvm.org/bugs/show_bug.cgi?id=30889
Condition adjusted for Debug assertion.

Differential Revision: https://reviews.llvm.org/D29638

llvm-svn: 298915
2017-03-28 13:35:42 +00:00
Paul Osmialowski 0788515cb1 GOMP compatibility: add missing OpenMP4.0 task deps handling code
Differential Revision: https://reviews.llvm.org/D31071

llvm-svn: 298605
2017-03-23 15:03:17 +00:00
George Rokos 01954092d0 [OpenMP] CUDA plugin: More descriptive error messages
Differential Revision: https://reviews.llvm.org/D31206

llvm-svn: 298527
2017-03-22 17:36:22 +00:00
George Rokos ba7380bf6c [OpenMP] Allow multiple weak symbols to be loaded from the fat binary
For compatibility with Fortran.

Differential Revision: https://reviews.llvm.org/D31205

llvm-svn: 298516
2017-03-22 16:43:40 +00:00
George Rokos f3fe2dd235 [OpenMP] CUDA plugin: add include directory for libelf
Allow the user to manually specify where libelf is installed.

Differential Revision: https://reviews.llvm.org/D31207

llvm-svn: 298515
2017-03-22 16:41:46 +00:00
George Rokos 00749e0fd4 [OpenMP] libomptarget: Disable on MacOS X
Disable compilation of libomptarget on MacOS X.

Differential Revision: https://reviews.llvm.org/D31055

llvm-svn: 298411
2017-03-21 18:19:09 +00:00
Andrey Churbanov 435b419d26 Fixed intermittent hang on tests with "target teams if(0)" construct with no parallel inside.
Differential Revision: https://reviews.llvm.org/D29597

llvm-svn: 298373
2017-03-21 13:48:52 +00:00
Andrey Churbanov 3b939d070c Stride in distribute parallel for loops with no chunk size.
Patch by George Rokos.

Differential Revision: https://reviews.llvm.org/D24486

llvm-svn: 298362
2017-03-21 12:17:22 +00:00
Jonathan Peyton 35d75aeda2 Minor improvement of KMP_YIELD_NOW() macro.
This change slightly improves performance of KMP_YIELD_NOW() macro, by using
_rdtsc() intrinsic function if possible.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D31008

llvm-svn: 298314
2017-03-20 22:11:31 +00:00
Jonathan Peyton 16fd8fec76 Fix incorrect initial value of __kmp_affinity_type.
Affinity initialization code expects __kmp_affinity_type has the value
affinity_default by default, but the cleanup code does not properly set the
value back to affinity_default.  This may introduce some issues when multiple
roots are trying to initialize/uninitialize the runtime successively.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D31012

llvm-svn: 298313
2017-03-20 22:04:02 +00:00
Andrey Churbanov a193ae19e1 Create a git ignore file for openmp runtime.
Patch by Guansong Zhang.

Differential Revision: https://reviews.llvm.org/D30784

llvm-svn: 297562
2017-03-11 13:05:08 +00:00
Jonathan Peyton de8d65914b Fix assertion failure when 'proclist' is used without 'explicit' in KMP_AFFINITY
This change fixes an assertion failure the in case KMP_AFFINITY is set with
'proclist' specified but without 'explicit'
e.g., KMP_AFFINITY=verbose,proclist=[0-31]

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D30404

llvm-svn: 297480
2017-03-10 17:22:47 +00:00
Dan Albert 1dc735bf64 Fix GNU strerror_r check for Android.
Summary:
Bionic didn't get a GNU style strerror_r until Android M. Until then
we unconditionally exposed the POSIX one. Expand the check to account
for this.

Reviewers: pirama, AndreyChurbanov, jlpeyton

Reviewed By: jlpeyton

Subscribers: openmp-commits, srhines

Differential Revision: https://reviews.llvm.org/D30056

llvm-svn: 297235
2017-03-07 22:18:05 +00:00
Jonathan Peyton e844a54a85 OpenMP version 5.0 added
Add build option LIBOMP_OMP_VERSION=50, 5.0 headers, and add the year/month
associated with OpenMP 5.0 in relevant source locations. Also, remove the
deprecated LIBOMP_OMP_VERSION=41 option.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D30450

llvm-svn: 297083
2017-03-06 22:07:40 +00:00
Jonathan Peyton 41d3800d71 Mixed type atomic routines added to Windows DLL
Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D30408

llvm-svn: 297082
2017-03-06 21:46:36 +00:00
Paul Osmialowski 1e254c5295 Add AArch64 support.
This adds AArch64 support to recently added part of the runtime responsible for offloading to target. This piece of code allows offloading-to-self on AArch64 machines.

Differential Revision: https://reviews.llvm.org/D30644

llvm-svn: 297070
2017-03-06 21:00:07 +00:00
Jonathan Peyton 928b8ea203 Removing couple unnecessary architecture guards.
This section of code (__kmp_test_then_* functions) is guarded by
(KMP_ARCH_X86 || KMP_ARCH_X86_64) so it does not make sense to have other
architecture guards inside this section.  Non-x86 architectures always
use intrinsics (__sync_*)

llvm-svn: 296525
2017-02-28 21:43:28 +00:00
Michal Gorny 018d13597a [test] Try to link -latomic to provide atomics when available
When using -rtlib=libgcc, the fallback implementation of __atomic_*
builtins is provided via libatomic (included in GCC). However, neither
GCC itself nor clang link libatomic implicitly, and it seems that GCC
upstream expects projects to link it explicitly as necessary.

Since compiler-rt provides __atomic_* builtins directly in the main
library, check if they are provided by the default libraries first.
If they are not, check if -latomic is available to provide them
and add explicit -latomic for tests in this case.

This fixes unresolved __atomic_load() references when running openmp
tests on i386 with libgcc backend.

Differential Revision: https://reviews.llvm.org/D30083

llvm-svn: 296183
2017-02-24 22:15:24 +00:00
George Rokos 63efdd9e1e [OpenMP] Missing virtual destructor in KMPAffinity
Added virtual destructor in a class containing virtual functions.

Differential Revision: https://reviews.llvm.org/D30271

llvm-svn: 295896
2017-02-22 22:50:28 +00:00
Jonathan Peyton 12ecbb35eb [stats] add stats-gathering for static_steal scheduling method
Add counter to count number of static_steal for loops
Add counter for number of chunks executed per static_steal for loop
Add counter for number of chunks stolen per static_steal for loop

llvm-svn: 295461
2017-02-17 17:06:16 +00:00
Andrey Churbanov 72ba210916 Run-time library part of OpenMP 5.0 task reduction implementation.
Added test kmp_task_reduction_nest.cpp which has an example of
possible compiler codegen.

Differential Revision: https://reviews.llvm.org/D29600

llvm-svn: 295343
2017-02-16 17:49:49 +00:00
Andrey Churbanov ad3f63986d Added an option to bind initial thread at the start of application
via setting envirable KMP_INITIAL_THREAD_BIND=1.

Differential Revision: https://reviews.llvm.org/D29665

llvm-svn: 295339
2017-02-16 17:08:40 +00:00
George Rokos 15a6e7daab [OpenMP] libomptarget: Protect parent struct from being deallocated
Fixed bug due to which a parent struct was deallocated when one of the struct's pointers was being unmapped.

Differential Revision: https://reviews.llvm.org/D29914

llvm-svn: 295231
2017-02-15 20:45:37 +00:00
Jonathan Peyton 581fdbaad4 Enable yield cycle on Linux
This change allows the runtime to turn __kmp_yield() on/off repeatedly on Linux.
This feature was removed when disabling monitor thread, but there are
applications that perform better with this feature on.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D29227

llvm-svn: 295203
2017-02-15 17:19:21 +00:00
Jonas Hahnfeld 35801a2470 [OpenMP] New Tsan annotations to remove false positive on reduction and barriers
Added new ThreadSanitizer annotations to remove false positives with OpenMP reduction.
Cleaned up Tsan annotations header file from unused annotations.

Patch by Simone Atzeni!

Differential Revision: https://reviews.llvm.org/D29202

llvm-svn: 295158
2017-02-15 08:14:22 +00:00
Hans Wennborg 1155ad1bc8 libomptarget: Disable on Win32
It's not supported, and currently breaks the weekly LLVM snapshot
builds.

Differential Revision: https://reviews.llvm.org/D29801

llvm-svn: 294758
2017-02-10 17:13:28 +00:00
Jonas Hahnfeld a620a0745f [libomptarget] Align test code with runtime/
This change allows setting LIBOMPTARGET_LLVM_LIT_EXECUTABLE and
LIBOMPTARGET_FILECHECK_EXECUTABLE as full path. It also honors
OPENMP_LLVM_TOOLS_DIR which is meant as a common configuration
for both libomp and libomptarget.

Maybe this should be done in a common CMake module, but I'm no expert here.

Differential Revision: https://reviews.llvm.org/D29172

llvm-svn: 294284
2017-02-07 06:58:15 +00:00
Andrey Churbanov 581490e713 Fix a race in shutdown when tasking is used.
Patch by Terry Wilmarth.

Differential Revision: https://reviews.llvm.org/D28377

llvm-svn: 294214
2017-02-06 18:53:32 +00:00
George Rokos efd412f8fe [OpenMP] Redefined macro warning in libomptarget
Fixed compilation warning in libomptarget.

Differential Revision: https://reviews.llvm.org/D29353

llvm-svn: 293747
2017-02-01 08:33:38 +00:00
George Rokos 3de4cd1281 [OpenMP] Initial implementation of OpenMP offloading library - libomptarget plugins.
This is the patch upstreaming the plugins part of libomptarget (CUDA, generic-elf-64).

Differential Revision: https://reviews.llvm.org/D14253

llvm-svn: 293724
2017-02-01 00:14:41 +00:00
Jonas Hahnfeld 479088eefa Correct wrong comment in bug_nested_proxy_task.c
The nested proxy task does not have dependencies.

llvm-svn: 293472
2017-01-30 09:51:02 +00:00
Jonas Hahnfeld 49739f0e32 [libomptarget] Fix Debug build with glibc < 2.18
glibc < 2.18 is C99 compliant and only provides the format macros in C++ if
__STDC_FORMAT_MACROS is defined. This change fixes the debug build for
GCC 4.8, GCC 6.2 and Clang 3.9.1 that were previously broken on my machine.

It shows no regression for libc++ >= 4.0.0 which has a fix since September:
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20160926/171659.html

llvm-svn: 293468
2017-01-30 08:11:20 +00:00
Jonathan Peyton 12313d44cf Cleanup: put i_maxmin members and ___kmp_size_type into traits_t
Put the duplicated i_maxmin into traits_t by adding new members max_value and
min_value. Put ___kmp_size_type into traits_t by adding member type_size.

Differential Revision: https://reviews.llvm.org/D28847

llvm-svn: 293316
2017-01-27 18:09:22 +00:00
Jonathan Peyton 3061e3e454 Printing OS thread id, when KMP_AFFINITY is set.
Patch by Vishakha Agrawal

Differential Revision: https://reviews.llvm.org/D28873

llvm-svn: 293315
2017-01-27 18:04:33 +00:00
Jonathan Peyton 2208a85101 Fix performance issue incurred by removing monitor thread.
When the monitor thread is used, most threads in the team directly go to
sleep if the copy of bt_intervals/bt_set is not available in the cache,
and this happens at least once per thread in the wait function, making the
overall performance slightly better.
This change tries to mimic this behavior by using the bt_intervals cache,
which simply keeps the blocktime interval in terms of the platform-dependent
ticks or nanoseconds.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D28906

llvm-svn: 293312
2017-01-27 17:54:31 +00:00
Jonas Hahnfeld cfe5ef589a [libomptarget] Fix compilation with libc++
iterator is only guaranteed to be default-constructible, without any argument.

Differential Revision: https://reviews.llvm.org/D29171

llvm-svn: 293277
2017-01-27 11:03:33 +00:00
George Rokos 2467df6e4f [OpenMP] Initial implementation of OpenMP offloading library - libomptarget.
This is the patch upstreaming the device-agnostic part of libomptarget.

Differential Revision: https://reviews.llvm.org/D14031

llvm-svn: 293094
2017-01-25 21:27:24 +00:00
Jonathan Peyton 3692fcf665 Use C++11 static_assert() for build asserts.
llvm-svn: 292350
2017-01-18 07:49:30 +00:00
Jonathan Peyton 7f976d556a Fix memory error in case of reinit using kmp_set_defaults() for lock code.
The lock tables were being reallocated if kmp_set_defaults() was called.
In the env_init code it says that the user should be able to switch between
different KMP_CONSISTENCY_CHECK values which is what this change enables.

llvm-svn: 292349
2017-01-18 07:02:21 +00:00
Jonathan Peyton d0365a228c Fix small memory leak regarding __kmp_nested_proc_bind
There is no corresponding free() for this expandable array.  The logic is
added in __kmp_cleanup() next to the freeing of __kmp_nested_nth.

llvm-svn: 292348
2017-01-18 06:40:19 +00:00
Jonas Hahnfeld c9a8a6c030 kmp_affinity: Fix check if specific bit is set
Clang 4.0 trunk warns:
warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses]

This points to a potential bug if the code really wants to check if the single
bit is not set: If for example (buf.edx >> 9) = 2 (has any bit set except the
least significant one), 'logical not' will return 0 which stays 0 after the
'bitwise and'.
To do this correctly we first need to evaluate the 'bitwise and'. In that case
it returns 2 & 1 = 0 which after the 'logical not' evaluates to 1.

Differential Revision: https://reviews.llvm.org/D28599

llvm-svn: 291764
2017-01-12 11:39:04 +00:00
Jonas Hahnfeld 49152b3f06 [CMake] Make openmp build under runtimes/
runtimes/CMakeLists.txt in LLVM passes OPENMP_STANDALONE_BUILD.

Differential Revision: https://reviews.llvm.org/D28280

llvm-svn: 290978
2017-01-04 18:11:37 +00:00
Andrey Churbanov 76d4285460 Fix for the __kmpc_global_num_threads function to return the value of the __kmp_all_nth global var.
Patch by Yonghong Yan.

Differential Revision: https://reviews.llvm.org/D27975

llvm-svn: 290272
2016-12-21 21:20:20 +00:00
Oren Ben Simhon c11addb506 Reverting last change.
llvm-svn: 290245
2016-12-21 09:04:08 +00:00
Oren Ben Simhon 016f2af3c7 [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support
Fixing build issues.

llvm-svn: 290242
2016-12-21 08:58:19 +00:00
Jonathan Peyton de4749b748 Follow up to r289732: Update comments in source files to reference .cpp files
Patch by Hansang Bae

llvm-svn: 289739
2016-12-14 23:01:24 +00:00
Jonathan Peyton 7cc577a4ef Change source files from .c to .cpp
Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D26688

llvm-svn: 289732
2016-12-14 22:39:11 +00:00
Andrey Churbanov 5dee8c43da Cleanup: debug print fixed and moved inside critical section.
Patch by Victor Campos.

Differential Revision: https://reviews.llvm.org/D27647

llvm-svn: 289640
2016-12-14 08:29:00 +00:00
Sylvestre Ledru cd9d374337 Support of mips & mips64 for openmprtl
Summary:
Implemented by Dejan Latinovic
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=790735 for more more information

Reviewers: AndreyChurbanov, jlpeyton

Subscribers: openmp-commits, mgorny

Differential Revision: https://reviews.llvm.org/D26576

llvm-svn: 289032
2016-12-08 09:22:24 +00:00
Andrey Churbanov e0a2c3e99a fixed type in Windows-specific code
llvm-svn: 288368
2016-12-01 16:08:52 +00:00
Jonathan Peyton a88e8358af Fixed typo in kmp_process_deps trace output
Patch by Victor Campos

Differential Revision: https://reviews.llvm.org/D27172

llvm-svn: 288056
2016-11-28 20:10:32 +00:00
Andrey Churbanov bcadbd6302 Cleanup: memory leaks on warnings printing fixed; some memory freeing cleaned; poor indents and one typo fixed.
Patch by Victor Campos.

Differential Revision: https://reviews.llvm.org/D26786

llvm-svn: 288054
2016-11-28 19:23:09 +00:00
Jonathan Peyton 96fe1aa380 Set task->td_dephash to NULL after free
llvm-svn: 287552
2016-11-21 16:24:59 +00:00
Jonathan Peyton 7ca7ef0478 Fix for D25504 - segfault because of double free()-ing in shutdown code.
Paul Osmialowski pointed out a double free bug in shutdown code.  This patch
Moves the freeing of the implicit task to above the freeing of all fast memory
to prevent the double-free issue.

Differential Revision: https://reviews.llvm.org/D26860

llvm-svn: 287551
2016-11-21 16:18:57 +00:00
Jonathan Peyton 5375fe820c Update stats-gathering code
Have developer timers use partitioning scheme which also required that some
redundant developer timers be removed in favor of the already existing normal
timers. Move per thread stats initialization to just after global thread id
assignment which is as early as possible. Also put all global stats
initialization code in __kmp_stats_init() and all global stats destruction code
in __kmp_stats_fini().

Differential Revision: https://reviews.llvm.org/D26361

llvm-svn: 286892
2016-11-14 21:13:44 +00:00
Jonathan Peyton 1cdd87adfd Introduce dynamic affinity dispatch capabilities
This set of changes enables the affinity interface (Either the preexisting
native operating system or HWLOC) to be dynamically set at runtime
initialization. The point of this change is that we were seeing performance
degradations when using HWLOC. This allows the user to use the old affinity
mechanisms which on large machines (>64 cores) makes a large difference in
initialization time.

These changes mostly move affinity code under a small class hierarchy:

KMPAffinity
  class Mask {}
KMPNativeAffinity : public KMPAffinity
  class Mask : public KMPAffinity::Mask
KMPHwlocAffinity
  class Mask : public KMPAffinity::Mask

Since all interface functions (for both affinity and the mask implementation)
are virtual, the implementation can be chosen at runtime initialization.

Differential Revision: https://reviews.llvm.org/D26356

llvm-svn: 286890
2016-11-14 21:08:35 +00:00
Andrey Churbanov 1fbb482928 Added check for malloc return.
Patch by Victor Campos.

Differential Revision: https://reviews.llvm.org/D26318

llvm-svn: 286441
2016-11-10 09:08:03 +00:00
Jonas Hahnfeld 50fed0475f [OpenMP] Enable ThreadSanitizer to check OpenMP programs
This patch allows ThreadSanitizer (Tsan) to verify OpenMP programs.
It means that no false positive will be reported by Tsan when
verifying an OpenMP programs.
This patch introduces annotations within the OpenMP runtime module to
provide information about thread synchronization to the Tsan runtime.

In order to enable the Tsan support when building the runtime, you must
enable the TSAN_SUPPORT option with the following environment variable:

-DLIBOMP_TSAN_SUPPORT=TRUE

The annotations will be enabled in the main shared library
(same mechanism of OMPT).

Patch by Simone Atzeni and Joachim Protze!

Differential Revision: https://reviews.llvm.org/D13072

llvm-svn: 286115
2016-11-07 15:58:36 +00:00
Andrey Churbanov 4d49312cad fixed typo in comment
llvm-svn: 285947
2016-11-03 17:48:46 +00:00
Andrey Churbanov 753fa0468c Change task stealing to always get task from head of victim's deque.
Differential Revision: https://reviews.llvm.org/D26187

llvm-svn: 285833
2016-11-02 16:45:25 +00:00
Andrey Churbanov 51107e0abc Fixed problem introduced by part of https://reviews.llvm.org/D21196.
Check Task Scheduling Constraint (TSC) on stealing of untied task.
This is needed because the untied task can produce tied children
those can break TSC if untied is not a descendant of current task.
This can cause live lock on complex tyasking tests
(e.g. kastors/strassen-task-dep).

Differential Revision: https://reviews.llvm.org/D26182

llvm-svn: 285703
2016-11-01 16:19:04 +00:00
Andrey Churbanov dd313b0673 Add more conditions to check whether task waiting is necessary in kmp_omp_taskwait.
Differential Revision: https://reviews.llvm.org/D26058

Patch by Victor Campos

llvm-svn: 285678
2016-11-01 08:33:36 +00:00
Andrey Churbanov df0d75edf6 Fixed a memory leak related to task dependencies.
Differential Revision: http://reviews.llvm.org/D25504

Patch by Alex Duran.

llvm-svn: 285283
2016-10-27 11:43:07 +00:00
Jonathan Peyton 3c4050d698 Fixing typos in __kmp_release_deps trace outputs
Patch by Victor Campos

Differential Revision: https://reviews.llvm.org/D25972

llvm-svn: 285244
2016-10-26 21:46:43 +00:00
Jonathan Peyton 762bc46224 Use getpagesize() instead of PAGE_SIZE macro when KMP_OS_LINUX is true
Patch by Victor Campos

Differential Revision: https://reviews.llvm.org/D26001

llvm-svn: 285243
2016-10-26 21:42:48 +00:00
Andrey Churbanov 2e68768d1e Fixed memory leak mistakenly introduced by https://reviews.llvm.org/D23115
Differential Revision: http://reviews.llvm.org/D25510

llvm-svn: 284747
2016-10-20 17:14:17 +00:00
Samuel Antao 335151914a [OpenMP] Fix issue with directives used in a macro.
Summary:
If directives are used in a macro, clang complains with:
```
src/projects/openmp/runtime/src/kmp_runtime.c:7486:2: error: embedding a directive within macro arguments has undefined behavior [-Werror,-Wembedded-directive]
#if KMP_USE_MONITOR
```

This patch fixes two occurrences of the issue in `kmp_runtime.cpp`.

Reviewers: tlwilmar, jlpeyton, AndreyChurbanov, Hahnfeld

Subscribers: Hahnfeld, openmp-commits

Differential Revision: https://reviews.llvm.org/D25823

llvm-svn: 284728
2016-10-20 13:20:17 +00:00
Jonathan Peyton 0ac7b75f7b Fix OpenMP 4.0 library build
Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D25505

llvm-svn: 284499
2016-10-18 17:39:06 +00:00
Michal Gorny efc536ee9d Fix a compile error on musl-libc due to strerror_r() prototype
Function strerror_r() has different signatures in different
implementations of libc: glibc's version returns a char*, while BSDs
and musl return a int. libomp unconditionally assumes glibc on Linux
and thus fails to compile against musl-libc. This patch addresses this
issue.

Differential Revision: https://reviews.llvm.org/D25071

llvm-svn: 284492
2016-10-18 16:38:44 +00:00
Jonathan Peyton 55466e9106 Mixed type atomic routines added for capture and update/capture reverse.
New mixed type atomic routines added for regular capture operations as well as
reverse update/capture operations.  LHS - all integer and float types (no
complex so far), RHS - float16.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D25275

llvm-svn: 284489
2016-10-18 16:20:55 +00:00
Jonathan Peyton e1c7c13c3d Code cleanup for the runtime without monitor thread
This change removes/disables unnecessary code when monitor thread is not used.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D25102

llvm-svn: 283577
2016-10-07 18:12:19 +00:00
Jonathan Peyton a1234cf280 Enable omp_get_schedule() to return static steal type.
As the code is now, calling omp_get_schedule() when OMP_SCHEDULE=static_steal
will cause an assert.

llvm-svn: 283576
2016-10-07 18:01:35 +00:00
Paul Osmialowski 7a9c29e4b8 [cmake] Fix for a bug https://llvm.org/bugs/show_bug.cgi?id=30489 "Cannot build with -DLIBOMP_FORTRAN_MODULES=True"
Differential Revision: https://reviews.llvm.org/D24959

llvm-svn: 282965
2016-09-30 22:05:45 +00:00
Jonathan Peyton 66e212ce2b Insert missing checks for KMP_AFFINITY_CAPABLE() in affinity API.
If affinity is not capable, then these API functions will perform the stubs
version.

llvm-svn: 282947
2016-09-30 20:56:44 +00:00
Michal Gorny 3ccf825e22 [test] Support 'lit' executable name
Support finding lit as plain 'lit', which is the name used by setup.py
in LLVM's utils/lit.

Differential Revision: https://reviews.llvm.org/D25072

llvm-svn: 282876
2016-09-30 16:56:16 +00:00
Jonathan Peyton 74f3ffce24 Fix incorrect OpenMP version in Fortran module.
Add check for "45" version to use "201511" string for OpenMP 4.5,
otherwise "200505" is used in Fortran module. Also, fix kmp_openmp_version
variable (used for the debugger, e.g.) and kmp_version_omp_api that is used
in KMP_VERSION=1 output.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D24761

llvm-svn: 282868
2016-09-30 15:50:14 +00:00
Jonathan Peyton be31337e9d Mixed type atomic routines for unsigned integers.
New routines should be used for atomics like "<int>OP=<float>" when <int> is
unsigned. Using functions __kmpc_atomic_fixed<bits>_<op>_fp) produces incorrect
results

Differential Revision: https://reviews.llvm.org/D24756

llvm-svn: 282509
2016-09-27 17:38:48 +00:00
Jonathan Peyton b66d1aab25 Disable monitor thread creation by default.
This change set disables creation of the monitor thread by default.  The global
counter maintained by the monitor thread was replaced by logic that uses system
time directly, and cyclic yielding on Linux target was also removed since there
was no clear benefit of using it. Turning on KMP_USE_MONITOR variable (=1)
enables creation of monitor thread again if it is really necessary for some
reasons.

Differential Revision: https://reviews.llvm.org/D24739

llvm-svn: 282507
2016-09-27 17:11:17 +00:00
Michal Gorny cd2bfb1e7c Fix respecting LIBOMP_LLVM_LIT_EXECUTABLE as full path
Fix lit search to correctly respect LIBOMP_LLVM_LIT_EXECUTABLE as full
program path.

The variable passed to find_program() is created by CMake as a cache
variable, and therefore can be directly overriden by the user. Since
this was the design of LIBOMP_LLVM_LIT_EXECUTABLE (as can be deduced
from the error messages) and there is no other use of LIT_EXECUTABLE,
remove the redundant variable and pass LIBOMP_LLVM_LIT_EXECUTABLE
directly to find_program().

Furthermore, the previous code did not work since the HINTS argument
specifies more search directories rather than expected full path.
Quoting the CMake documentation:

> 3. Search the paths specified by the HINTS option. These should be
> paths computed by system introspection, such as a hint provided by
> the location of another item already found. Hard-coded guesses should
> be specified with the PATHS option.

Differential Revision: https://reviews.llvm.org/D24710

llvm-svn: 281887
2016-09-19 06:55:56 +00:00
Michal Gorny 23132ebb0e [cmake] Make libgomp & libiomp5 alias install optional
Introduce a new LIBOMP_INSTALL_VARIABLES cache variable that can be used
to disable creating libgomp and libiomp5 aliases on 'make install'.
Those aliases are undesired e.g. on Gentoo systems where libomp is used
purely by clang.

Differential Revision: https://reviews.llvm.org/D24563

llvm-svn: 281512
2016-09-14 17:46:27 +00:00
Jonas Hahnfeld 848d690697 [OMPT] fix task frame information for gomp interface
Previous differencials D23305-D23310 changed task frame information management only for the kmp interface, but not for the whole gomp interface. This broke some testcases when building with gcc.
This patch fixes the broken task frame information for the gomp interface.

Patch by Joachim Protze!

Differential Revision: https://reviews.llvm.org/D24502

llvm-svn: 281468
2016-09-14 13:59:39 +00:00
Jonas Hahnfeld dd9a05d5d8 [OMPT] save exit address to lwt if available
In case, the current team is a serialized team (lwt), the frame information should be written to this data structure.
Before, nested serialized teams would overwrite the same task information.

Patch by Joachim Protze!

Differential Revision: https://reviews.llvm.org/D23310

llvm-svn: 281467
2016-09-14 13:59:31 +00:00
Jonas Hahnfeld 28ea24bba7 [OMPT] fix __ompt_get_teaminfo to consult lwt entries of parent teams
The comment already states, that this function should work similarly as __ompt_get_taskinfo.

The function only looked for lwt entries of the current team, but not when unrolling the parents. This fix aligns the implementation to __ompt_get_taskinfo.

The new test case creates a single theaded team (->lwt) and then a nested active team.
Before the innermost print_id(1) would deliver a different team then the outer print_id(0).

Patch by Joachim Protze!

Differential Revision: https://reviews.llvm.org/D23309

llvm-svn: 281466
2016-09-14 13:59:24 +00:00
Jonas Hahnfeld 8a27064e05 [OMPT] Reset task exit frame when execution is finished
The exit address is set when execution of a task is started and should be reset as soon as the execution is finished.
Especially for the asm implementation of __kmp_invoke_microtask, resetting in this call would be painfull, so reset just after the invokation.

The testcase shows the effect of this patch:
Before, the implicit barriers at the end of an implicit task would see an exit address for the implicit task.

This barrier is a task scheduling point. Thus, any explicit task scheduled there would see an exit, but no reenter address for the implicit task.

Patch by Joachim Protze!

Differential Revision: https://reviews.llvm.org/D23307

llvm-svn: 281465
2016-09-14 13:59:19 +00:00
Jonas Hahnfeld fd0614d830 [OMPT] Align implementation of reenter frame address to latest (frozen) version of OMPT spec
The latest OMPT spec changed the semantic of a tasks reenter frame to be the application frame, that will be entered, when the runtime frame drops.
Before it was the last frame in the runtime. This doesn't work for some gcc execution pathes or even clang generated code for :
Since there is no runtime frame between the executed task and the encountering task.

The test case compares exit and reenter addresses against addresses captured in application code

Patch by Joachim Protze!

Differential Revision: https://reviews.llvm.org/D23305

llvm-svn: 281464
2016-09-14 13:59:13 +00:00
Jonas Hahnfeld 464cdca9d3 [OMPT] extend ompt tests by checks for frame pointers
OMPT tests can check for right frame information of tasks:
 * parent_task_frame was directly printed as a pointer, but actually points to a struct ompt_frame {void*, void*}
 * NULL is printed in the beginning of execution and loaded to FileChecker variable [[NULL]]
 * implicit tasks now also print their frame information
 * macro to print frame address from application
 * print task info for barrier begin

Patch by Joachim Protze!

Differential Revision: https://reviews.llvm.org/D23304

llvm-svn: 281463
2016-09-14 13:59:05 +00:00
Jonathan Peyton 7c465a5f41 Fix bitmask upper bounds check
Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we
use the get_max_proc() function which can vary based on the operating system.
For example on Windows with multiple processor groups, it might be the case that
the highest bit possible in the bitmask is not equal to the number of hardware
threads on the machine but something higher than that.

Differential Revision: https://reviews.llvm.org/D24206

llvm-svn: 281245
2016-09-12 19:02:53 +00:00
George Rokos 118de30b44 [OPENMP] ppc64le recognized as big-endian
There is a bug in CMakeLists which causes powerpc64le systems to be recognized as big-endian. This patch fixes the issue.

Differential Revision: https://reviews.llvm.org/D23626

llvm-svn: 281068
2016-09-09 18:04:23 +00:00
George Rokos 28f31b405e [OPENMP] Implementation of omp_get_default_device and omp_set_default_device
Implementation of missing OpenMP 4.0 API functions omp_get_default_device and omp_set_default_device.
Also, added support for the environment variable OMP_DEFAULT_DEVICE.

Differential Revision: https://reviews.llvm.org/D23587

llvm-svn: 281065
2016-09-09 17:55:26 +00:00
Jonathan Peyton e6abe52905 Move function into cpp file under KMP_AFFINITY_SUPPORTED guard.
When affinity isn't supported, __kmp_affinity_compact doesn't exist.  The
problem is that in kmp_affinity.h there is a function which uses it without the
proper KMP_AFFINITY_SUPPORTED guard around it.  The compiler was smart enough to
ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on
it, but I think it is cleaner to have it under the proper guard.  Since the
function is only used in the kmp_affinity.cpp file and there aren't any plans to
have it elsewhere.  I have moved it there.

llvm-svn: 280542
2016-09-02 20:54:58 +00:00
Jonathan Peyton 9e69696f5a Decouple the kmp_affin_mask_t type from determining if affinity is capable
the __kmp_affinity_determine_capable() functions are highly operating system
specific.  This change has the functions use the type they expect explicitly.

llvm-svn: 280538
2016-09-02 20:35:47 +00:00
Jonathan Peyton 788c5d65e8 Replace a bad instance of __kmp_free() with KMP_CPU_FREE_ARRAY() macro.
llvm-svn: 280530
2016-09-02 19:37:12 +00:00
Jonathan Peyton 5c32d5ef0d Use 'critical' reduction method when 'atomic' is not available but requested.
In case atomic reduction method is not available (the compiler can't generate
it) the assertion failure occurred if KMP_FORCE_REDUCTION=atomic was specified.
This change replaces the assertion with a warning and sets the reduction method
to the default one - 'critical'.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D23990

llvm-svn: 280519
2016-09-02 18:29:45 +00:00
Jonathan Peyton 0af717970c Appease older gcc compilers for the many-microtask-args.c test
Older gcc compilers error out with the C99 syntax of: for (int i =...)
so this change just moves the int i; declaration up above.

llvm-svn: 280138
2016-08-30 19:28:58 +00:00
Andrey Churbanov b35be69ff5 cleanup: fixed names of dummy arguments of Fortran interfaces declarations, no functional changes done
llvm-svn: 278951
2016-08-17 18:18:21 +00:00
Andrey Churbanov d6e1d7e521 Fixes for hierarchical barrier (possible hang if team size changed).
Differential Revision: http://reviews.llvm.org/D23175

llvm-svn: 278332
2016-08-11 13:04:00 +00:00
Dimitry Andric 70ba8c506c Fix linking of omp_foreign_thread_team_reuse test on FreeBSD
Summary:
On FreeBSD, linking the misc_bugs/omp_foreign_thread_team_reuse.c test
case fails with:

   /usr/local/bin/ld: /tmp/omp_foreign_thread_team_reuse-c5e71b.o: undefined reference to symbol 'pthread_create@@FBSD_1.0'

This is because the program is linked without `-lpthread`.  Since the
%libomp-compile-and-run macro does not allow that option to be added to
the compile command line, split it up and add the required `-lpthread`
between %libomp-compile and %libomp-run.

Reviewers: jlpeyton, hfinkel, Hahnfeld

Subscribers: Hahnfeld, emaste, openmp-commits

Differential Revision: https://reviews.llvm.org/D23084

llvm-svn: 278036
2016-08-08 18:34:05 +00:00
Jonas Hahnfeld ad0c42e3a9 kmp_gsupport: Fix library initialization with taskgroup
Differential Revision: https://reviews.llvm.org/D23259

llvm-svn: 278003
2016-08-08 13:23:08 +00:00
Jonas Hahnfeld ca32babfa7 Mark tests with task dependencies as unsupported with GCC
llvm-svn: 277996
2016-08-08 11:52:49 +00:00
Jonas Hahnfeld bedc371c9d Do not block on explicit task depending on proxy task
Consider the following code:

    int dep;
    #pragma omp target nowait depend(out: dep)
    {
        sleep(1);
    }
    #pragma omp task depend(in: dep)
    {
        printf("Task with dependency\n");
    }
    printf("Doing some work...\n");

In its current state the runtime will block on the second task and not
continue execution.

Differential Revision: https://reviews.llvm.org/D23116

llvm-svn: 277992
2016-08-08 10:08:14 +00:00
Jonas Hahnfeld 69f8511f8f __kmp_free_task: Fix for serial explicit tasks producing proxy tasks
Consider the following code which may be executed by a serial team:

    int dep;
    #pragma omp target nowait depend(out: dep)
    {
        sleep(1);
    }
    #pragma omp task depend(in: dep)
    {
        #pragma omp target nowait
        {
            sleep(1);
        }
    }

Here the explicit task may not be freed until the nested proxy task has
finished. The current code hasn't considered this and called __kmp_free_task
anyway which triggered an assert because of remaining incomplete children:

    KMP_DEBUG_ASSERT( TCR_4(taskdata->td_incomplete_child_tasks) == 0 );

Differential Revision: https://reviews.llvm.org/D23115

llvm-svn: 277991
2016-08-08 10:08:07 +00:00
Andrey Churbanov 5bf494e73d Fixed x2APIC discovery for 256-processor architectures.
Mask for value read from ebx register returned by CPUID expanded to 0xFFFF.

Differential Revision: https://reviews.llvm.org/D23203

llvm-svn: 277825
2016-08-05 15:59:11 +00:00
Jonas Hahnfeld d1f4b8f6e8 Add test case for nested creation of tasks
For discussion in D23115

llvm-svn: 277730
2016-08-04 14:55:56 +00:00
Jonas Hahnfeld 20236611d4 kmp_taskdeps.cpp: Fix debugging output
node->dn.task is only filled after the dependencies are already processed.
This currently leads to unhelpful output from KA_TRACE or even a crash
if one enables KMP_SUPPORT_GRAPH_OUTPUT.

llvm-svn: 277717
2016-08-04 11:03:47 +00:00
Pirama Arumuga Nainar 0554d25eb3 Disable KMP_CANCEL_THREADS on Android
Summary:
Android does not have pthread_cancel.  Disable KMP_CANCEL_THREADS if
__ANDROID__ is defined.

Subscribers: tberghammer, srhines, openmp-commits, danalbert

Differential Revision: https://reviews.llvm.org/D23029

llvm-svn: 277618
2016-08-03 18:08:57 +00:00
Paul Osmialowski ecbe2ea002 Make balanced affinity work on AArch64.
This patch enables balanced affinity on machines that do not have
hardware threads and have cores clustered into packages. In facts,
balacing algorithm could be generalized for any arrangement with
at least two levels of hierarchy (depth > 1).

Differential Revision: https://reviews.llvm.org/D22365

llvm-svn: 277212
2016-07-29 20:55:03 +00:00
Samuel Antao 71fef77dcb Replace enum types in variadic functions by build-in types.
Summary:
When compiling the runtime library with clang we get warnings like:
```
error: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Werror,-Wvarargs]
    va_start( args, id );
                    ^
note: parameter of type 'kmp_i18n_id_t' (aka 'kmp_i18n_id') is declared here
    kmp_i18n_id_t id,
```
My understanding is that the va_start macro only gets the promoted type so it won't know what was the exact type of the argument, which can potentially not work for some targets given that the implementation of the the calling convention could not be done properly.

This patch fixes that by using a built-in type in the function signature.

Reviewers: tlwilmar, jlpeyton, AndreyChurbanov

Subscribers: arpith-jacob, carlo.bertolli, caomhin, openmp-commits

Differential Revision: https://reviews.llvm.org/D22427

llvm-svn: 276428
2016-07-22 16:05:35 +00:00
Andrey Churbanov 429dbc2ad2 http://reviews.llvm.org/D22134: Implementation of OpenMP 4.5 nonmonotonic schedule modifier
llvm-svn: 275052
2016-07-11 10:44:57 +00:00
Jonathan Peyton 4d3c21307c Improving EPCC performance when linking with hwloc
When linking with libhwloc, the ORDERED EPCC test slows down on big
machines (> 48 cores). Performance analysis showed that a cache thrash
was occurring and this padding helps alleviate the problem.

Also, inside the main spin-wait loop in kmp_wait_release.h, we can eliminate
the references to the global shared variables by instead creating a local
variable, oversubscribed and instead checking that.

Differential Revision: http://reviews.llvm.org/D22093

llvm-svn: 274894
2016-07-08 17:43:21 +00:00
Andrey Churbanov 50ecf5de01 D22138: Added more Intel compiler versions as allowed build compilers
llvm-svn: 274854
2016-07-08 15:23:35 +00:00
Andrey Churbanov 2eca95c9a9 D22137: Memory leak fixed by adding missed cleanup of single level array of hot teams info
llvm-svn: 274851
2016-07-08 14:53:24 +00:00
Andrey Churbanov cb28d6e3a0 D22136: Memory leaks fixed by adding missed __kmp_free() calls
llvm-svn: 274850
2016-07-08 14:40:20 +00:00
Andrey Churbanov 42211eb125 D22135: formatting change
llvm-svn: 274849
2016-07-08 14:35:41 +00:00
Jonathan Peyton 741b70926f Fix the nowait tests for omp for and omp single
These tests are now modeled after the sections nowait test where threads wait
to be released in the first construct (either for or single) and the last thread
skips the last for/single construct and releases those threads.  If the test
fails, then it hangs because an unnecessary barrier is executed in between the
constructs.

llvm-svn: 274641
2016-07-06 17:26:12 +00:00
Jonas Hahnfeld 170fcc8772 __kmp_partition_places: Update assertion for new parameter update_master_only
If update_master_only is set the place list is not completely traversed
and therefore this assertion failed. Make it only trigger if
update_master_only is false.

(was introduced by D20539)

Differential Revision: http://reviews.llvm.org/D21925

llvm-svn: 274482
2016-07-04 05:58:10 +00:00
Jonathan Peyton 6b560f0dd9 Fix checks on schedule struct
This change fixes an error in comparing the existing schedule on the team to
the new schedule, in the chunk field. Also added additional checks and used
KMP_CHECK_UPDATE where appropriate.

Patch by Terry Wilmarth.

Differential Revision: http://reviews.llvm.org/D21897

llvm-svn: 274371
2016-07-01 17:54:32 +00:00
Jonathan Peyton c1666960f9 Improve performance of #pragma omp single
EPCC Performance of single is considerably worse than plain barrier.
Adding a read-only check to the code before the atomic compare-and-store
helps considerably.

Patch by Terry Wilmarth.

Differential Revision: http://reviews.llvm.org/D21893

llvm-svn: 274369
2016-07-01 17:37:49 +00:00
Jonathan Peyton fdcca8cd55 Fix omp_sections_nowait.c test to address Bugzilla Bug 28336
This rewrite of the omp_sections_nowait.c test file causes it to hang if the
nowait is not respected. If the nowait isn't respected, the lone thread which
can escape the first sections construct will just sleep at a barrier which
shouldn't exist. All reliance on timers is taken out. For good measure, the test
makes sure that all eight sections are executed as well. The test should take no
longer than a few seconds on any modern machine.

Differential Revision: http://reviews.llvm.org/D21842

llvm-svn: 274151
2016-06-29 19:46:52 +00:00
Jonathan Peyton ac7ba406ed Fix bugs in TAS and futex lock
* Incorrect lock value written in __kmp_test_futex_lock
* Incorrect lock value check in tas/futex lock with USE_LOCK_PROFILE on

Patch by Hansang Bae

llvm-svn: 274053
2016-06-28 19:37:24 +00:00
Jonathan Peyton cceebeef17 Revert r273898's UNICODE quick fix in favor of CMake's remove_definitions()
UNICODE and _UNICODE defintions were added in the LLVM CMake build system.
While on Unices, the UNICODE/_UNICODE macros don't cause problems, on Windows
only ittnotify_static.c should be compiled using -DUNICODE.  We are still
looking at a proper fix, but this change sets the build back to exactly what it
was doing before.  Also, a comment and TODO were added in the src/CMakeLists.txt
file to help explain.

llvm-svn: 274052
2016-06-28 19:25:13 +00:00
Hans Wennborg 8065c51875 Fix the Windows build after r273599
That patch made all LLVM projects build with -DUNICODE. However, this doesn't
work for the OpenMP runtime.

But just overriding the flag with -UUNICODE breaks compiling ittnotify_static.c,
which for some reason needs to be compiled with -DUNICIODE. Note that compiling
ittnotify.h with -DUNICODE does not work though.

This seems like a mess. This commit fixes it for now, but it would be great
if someone who works on the OpenMP runtime could fix it properly.

llvm-svn: 273898
2016-06-27 18:03:45 +00:00