In preparation for OMPT target changes, create separate categories of events that will be used by OMPT target support.
Split up existing macro FOREACH_OMPT_EVENT into new ones. There is no change to the original macro. Created new macros FOREACH_OMPT_HOST_EVENT, FOREACH_OMPT_DEVICE_EVENT, FOREACH_OMPT_NOEMI_EVENT, FOREACH_OMPT_EMI_EVENT, and a few other sub-categories that can be used as required. One such use is in D123974 which uses events selectively.
Patch from John Mellor-Crummey <johnmc@rice.edu>
Reviewed By: dreachem
Differential Revision: https://reviews.llvm.org/D123429
GCC, glibc, binutils, and LLVM have added support for LoongArch64.
This patch adds support for LLVM OpenMP following D59880 for RISCV64.
Reviewed By: MaskRay, SixWeining
Differential Revision: https://reviews.llvm.org/D132925
Previous support for device memory allocators used a single free
routine and did not provide the original kind of the allocation. This is
problematic as some of these memory types required different handling.
Previously this was worked around using a map in runtime to record the
original kind of each pointer. Instead, this patch introduces new free
routines similar to the existing allocation routines. This allows us to
avoid a map traversal every time we free a device pointer.
The only interfaces defined by the standard are `omp_target_alloc` and
`omp_target_free`, these do not take a kind as `omp_alloc` does. The
standard dictates the following:
"The omp_target_alloc routine returns a device pointer that references
the device address of a storage location of size bytes. The storage
location is dynamically allocated in the device data environment of the
device specified by device_num."
Which suggests that these routines only allocate the default device
memory for the kind. So this has been changed to reflect this. This
change is somewhat breaking if users were using `omp_target_free` as
previously shown in the tests.
Reviewed By: JonChesterfield, tianshilei1992
Differential Revision: https://reviews.llvm.org/D133053
Have it be simple KMP_MFENCE() which incorporates x86-specific logic and
reduces to KMP_MB() for other architectures.
Differential Revision: https://reviews.llvm.org/D130928
Serialized parallels allocate lightweight task teams on the heap
but never free them in the corresponding join. This patch adds a wrapper
around the allocation (if ompt enabled) and also adds the corresponding
free in the join call.
Differential Revision: https://reviews.llvm.org/D131690
This fixes warnings like these:
../runtime/src/kmp_dispatch.cpp:2159:24: warning: left operand of comma operator has no effect [-Wunused-value]
OMPT_LOOP_DISPATCH(*p_lb, *p_ub, pr->u.p.st, status);
^~~~~
../runtime/src/kmp_dispatch.cpp:2159:31: warning: left operand of comma operator has no effect [-Wunused-value]
OMPT_LOOP_DISPATCH(*p_lb, *p_ub, pr->u.p.st, status);
^~~~~
../runtime/src/kmp_dispatch.cpp:2159:46: warning: left operand of comma operator has no effect [-Wunused-value]
OMPT_LOOP_DISPATCH(*p_lb, *p_ub, pr->u.p.st, status);
~~~~~~~ ^~
../runtime/src/kmp_dispatch.cpp:2159:50: warning: expression result unused [-Wunused-value]
OMPT_LOOP_DISPATCH(*p_lb, *p_ub, pr->u.p.st, status);
^~~~~~
CMAKE_DL_LIBS is documented as "Name of library containing dlopen
and dlclose".
On Windows platforms, there's no system provided dlopen/dlclose, but
it can be argued that if you really intend to call dlopen/dlclose,
you're going to be using a third party compat library like
https://github.com/dlfcn-win32/dlfcn-win32, and CMAKE_DL_LIBS should
expand to its name.
This has been argued upstream in CMake in
https://gitlab.kitware.com/cmake/cmake/-/issues/17600 and
https://gitlab.kitware.com/cmake/cmake/-/merge_requests/1642, that
CMAKE_DL_LIBS should expand to "dl" on mingw platforms.
The merge request wasn't merged though, as it caused some amount of
breakage, but in practice, Fedora still carries a custom CMake patch
with the same effect.
Thus, this patch fixes cross compiling OpenMP for mingw targets
on Fedora with their custom-patched CMake.
Differential Revision: https://reviews.llvm.org/D130892
Fix the LD_LIBRARY_PATH prepending order to make sure that
config.library_path ends up before any potentially-system directories
(e.g. config.hwloc_library_dir). This makes sure that we are testing
against the just-built openmp libraries rather than the version that is
already installed.
Also rename the function to `prepend_*` to make it clearer what it
actually does.
https://github.com/llvm/llvm-project/issues/56821
Differential Revision: https://reviews.llvm.org/D130825
Warnings that occur during affinity initialization are supposed
to be guarded by KMP_AFFINITY=nowarnings,noverbose, but some had been
missed by this logic. Create one macro for affinity warnings that takes
these settings into account.
Differential Revision: https://reviews.llvm.org/D125991
Added control to reset affinity of primary thread after outermost parallel
region to initial affinity encountered before OpenMP runtime was initialized.
KMP_AFFINITY environment variable reset/noreset modifier introduced.
Default behavior is unchanged.
Differential Revision: https://reviews.llvm.org/D125993
icc does not properly detect lack of fallthrough attribute since it
defines __GNU__ > 7 and also icc's __has_cpp_attribute/__has_attribute
feature detectors do not properly detect the lack of fallthrough attribute.
Differential Revision: https://reviews.llvm.org/D126001
Made library registration conditional and skip it in the __kmp_atfork_child
handler, postponed it till middle initialization in the child.
This fixes the problem of applications those use e.g. popen/pclose
which terminate the forked child process.
Differential Revision: https://reviews.llvm.org/D125996
This check-in adds 4 APIs to support MSVC, specifically:
* 3 APIs (__kmpc_sections_init, __kmpc_next_section,
__kmpc_end_sections) to support the dynamic scheduling of OMP sections.
* 1 API (__kmpc_copyprivate_light, a light-weight version of
__kmpc_copyrprivate) to support the OMP single copyprivate clause.
Differential Revision: https://reviews.llvm.org/D128403
This patch implements omp_get_device_num() in the host and the device.
It uses the already existing getDeviceNum in the device config for the device.
And in the host it uses the omp_get_num_devices().
Two simple tests added
Differential Revision: https://reviews.llvm.org/D128347
When many nested teams are formed, __kmp_threads may be reallocated
to accommodate new threads. This reallocation causes a data
race when another existing team's thread simultaneously references
__kmp_threads. This patch keeps the old thread arrays around until library
shutdown so these lingering references can complete without issue and
access to __kmp_threads remains a simple array reference.
Fixes: https://github.com/llvm/llvm-project/issues/54708
Differential Revision: https://reviews.llvm.org/D125013
The code expanded from kmp_barrier.h uses some `KMP_INTERNAL_*`s,
so the definitions have to be placed before it.
Fixes#55815
Differential Revision: https://reviews.llvm.org/D126873
MSVC may not supply source location information to kmpc_reduce passing
NULL for the value. The patch adds a check for the loc value being NULL
in kmp_determine_reduction_method.
Differential Revision: https://reviews.llvm.org/D126564
The memkind library is only available for linux. Calling dlopen here
can also be problematic in a client app that fork'ed.
Differential Revision: https://reviews.llvm.org/D126579
When configuring llvm with the openmp subproject, the build for the omp
target fails if LIBOMP_CONFIGURED_LIBFLAGS contains more than one item.
LIBOMP_CONFIGURED_LIBFLAGS should be a semicolon-separated list instead
of a string with items separated by spaces.
Differential Revision: https://reviews.llvm.org/D125370
Without this patch, arguments to the
`llvm::OpenMPIRBuilder::AtomicOpValue` initializer are reversed.
Reviewed By: ABataev, tianshilei1992
Differential Revision: https://reviews.llvm.org/D126619
OpenMP 5.2, sec. 10.2 "teams Construct", p. 232, L9-12 restricts what
regions can be strictly nested within a `teams` construct. This patch
relaxes Clang's enforcement of this restriction in the case of nested
`atomic` constructs unless `-fno-openmp-extensions` is specified.
Cases like the following then seem to work fine with no additional
implementation changes:
```
#pragma omp target teams map(tofrom:x)
#pragma omp atomic update
x++;
```
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D126323
Currently the library ignores requested wait policy in the presence
of tasking. Threads always actively spin. The patch fixes this problem
making the wait policy passive if this explicitly requested by user.
Differential Revision: https://reviews.llvm.org/D123044
Intel Inspector uses itt notifications to analyze code execution, and it
reports race conditions in dependent tasks.
This patch fixes the issue notifying Inspector on tasks dependency
synchronizations.
Differential Revision: https://reviews.llvm.org/D123042
When hwloc is used and is installed outside of the default paths, the omp CMake target
needs to provide the needed include path thru the CMake target by adding it with
target_include_directories to it, so libompd gets it as well when it defines it's cmake
target using target_link_libraries.
As suggested in D122667
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D123888
This patch adds the `llvm_omp_target_dynamic_shared_alloc` function to
the `omp.h` header file so users can access it by default. Also changed
the name to keep it consistent with the other target allocators. Added
some documentation so users know how to use it. Didn't add the interface
for Fortran since there's no way to test it right now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D123246
The target allocators have been supported for NVPTX offloading for
awhile. The tests should use the allocators instead of calling the
functions manually. Also the comments indicating these being a preview
should be removed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D123242
This change adds support for ompt_callback_dispatch with the new
dispatch chunk type introduced in 5.2. Definitions of the new
ompt_work_loop types were also added in the header file.
Differential Revision: https://reviews.llvm.org/D122107
Register constraint switched to "=q" which means very specifically (from
https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints)
> Any register accessible as rl. In 32-bit mode, a, b, c, and d; in 64-bit
mode, any integer register.
Older gcc versions (8.x and below) were trying to use esi or edi for the
8 bit flag variable, but it wound up displaying this error in the end:
kmp_lock.cpp: In function ‘void __kmp_spin_backoff(kmp_backoff_t*)’:
kmp_lock.cpp:2684:1: error: unsupported size for integer register
Hence the correct restriction is "=q" instead of "=r".
Fixes: https://github.com/llvm/llvm-project/issues/53309
Differential Revision: https://reviews.llvm.org/D120519
Before this patch task priorities were ignored, that was a valid implementation
as the task priority is a hint according to OpenMP specification.
Implemented shared list of sorted (high -> low) task deques one per task
priority value. Tasks execution changed to first check if priority tasks ready
for execution exist, and these tasks executed before others;
otherwise usual tasks execution mechanics work.
Differential Revision: https://reviews.llvm.org/D119676
Essentially removed the "use omp_lib_kinds" statement and replaced it
with import to maintain consistency (and avoid compilation error
in case the omp_lib_kinds.mod file is not accessible) in header file.
The import is required to access entities in host scoping unit.
Differential Revision: https://reviews.llvm.org/D120707