This fixes compilation in the Clang-cl configuration on aarch64;
Clang doesn't implement all the aarch64 MSVC atomic intrinsics yet.
Differential Revision: https://reviews.llvm.org/D138737
This does things in the same way as
D137168 / a356782426 and
D101173 / 4fb0aaf033 did for aarch64.
This adds a C implementation of __kmp_invoke_microtask in the same
way as the fallback C implementation in z_Linux_util.cpp.
Both the existing C fallback used on arm linux, and this one added here,
fail test/misc_bugs/many-microtask-args.c similarly (which could be
considered as an XFAIL).
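A switch-based C fallback of this kind can be sketched roughly as follows. This is a minimal illustration, not the runtime's code: the names `generic_fn`, `invoke_microtask_sketch`, `add_task` and `run_add_task_example` are hypothetical, and only up to three extra arguments are handled to keep it short.

```cpp
#include <cassert>

// Hypothetical type-erased microtask pointer; it is cast back to the exact
// signature of the callee before the call.
using generic_fn = void (*)();

// Dispatch on argc and pass every argument explicitly, so the compiler lays
// them out according to the platform calling convention.
bool invoke_microtask_sketch(generic_fn fn, int gtid, int tid, int argc,
                             void **argv) {
  assert(fn != nullptr);
  switch (argc) {
  case 0:
    reinterpret_cast<void (*)(int *, int *)>(fn)(&gtid, &tid);
    return true;
  case 1:
    reinterpret_cast<void (*)(int *, int *, void *)>(fn)(&gtid, &tid, argv[0]);
    return true;
  case 2:
    reinterpret_cast<void (*)(int *, int *, void *, void *)>(fn)(
        &gtid, &tid, argv[0], argv[1]);
    return true;
  case 3:
    reinterpret_cast<void (*)(int *, int *, void *, void *, void *)>(fn)(
        &gtid, &tid, argv[0], argv[1], argv[2]);
    return true;
  default:
    return false; // more arguments than this sketch supports
  }
}

// Example microtask with two shared arguments.
void add_task(int *gtid, int *tid, void *in, void *out) {
  *static_cast<int *>(out) = *static_cast<int *>(in) + *gtid + *tid;
}

int run_add_task_example() {
  int in = 40, out = 0;
  void *args[] = {&in, &out};
  bool ok = invoke_microtask_sketch(reinterpret_cast<generic_fn>(add_task),
                                    1, 1, 2, args);
  return ok ? out : -1;
}
```

A fixed switch like this can only cover a bounded number of arguments, which is consistent with a test that passes many microtask arguments failing on such a fallback.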
Differential Revision: https://reviews.llvm.org/D138689
fb947c3586 introduced the gas
macro COMMON, but it was only defined within ifdefs of the form:
#if (KMP_OS_LINUX || KMP_OS_DARWIN || KMP_OS_WINDOWS) && KMP_ARCH_AARCH64
It was used, however, within other conditions:
#if KMP_ARCH_ARM || KMP_ARCH_MIPS
and:
#if KMP_ARCH_PPC64 || KMP_ARCH_AARCH64 || KMP_ARCH_MIPS64 || KMP_ARCH_RISCV64 || KMP_ARCH_LOONGARCH64
Move the definition of the COMMON macro out from the current ifdef,
so that it always gets defined (as it's only dependent on the target
platform).
This fixes building on ARM (and presumably all the other mentioned
architectures except aarch64).
Differential Revision: https://reviews.llvm.org/D138703
When building for Windows aarch64, and not using the actual MSVC,
we can assemble gnu assembly files just fine, and the existing
correct implementation of __kmp_invoke_microtask is fully usable.
The C implementation of __kmp_invoke_microtask in
z_Windows_NT-586_util.cpp relies on unguaranteed assumptions about
the compiler behaviour - it does work currently on MSVC, but doesn't
necessarily on other compilers. That function uses an alloca to pass
parameters on the stack to the called functions.
There's no guarantee that the buffer allocated by alloca is exactly
at the bottom of the stack when doing the call; the compiler might
have left space for extra things to save on the stack there.
Additionally, when compiled with Clang with optimization, Clang
optimizes out the alloca and memcpy entirely. On the C language
level, they don't have any visible effect outside of the function
and thus can be omitted entirely.
This fixes calling microtasks with more than 6 parameters, in
builds for Windows/aarch64 with Clang.
Differential Revision: https://reviews.llvm.org/D137827
D135552 added an #else branch, which causes a build error when
building OpenMP on RISCV64. This patch fixes the error:
"Unknown or unsupported architecture"
Reviewed By: pirama
Differential Revision: https://reviews.llvm.org/D138241
D135552 added an #else branch, which causes a build error when
building OpenMP on LoongArch. This patch fixes the error:
"Unknown or unsupported architecture"
Reviewed By: SixWeining, MaskRay
Differential Revision: https://reviews.llvm.org/D137604
This is part of a set of patches implementing OMPT target callback support and has been split out of the originally submitted https://reviews.llvm.org/D113728. The overall design can be found in https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc
The purpose of this patch is to provide a way to register tool-provided callbacks into libomp when libomptarget is loaded.
Introduced a cmake variable LIBOMPTARGET_OMPT_SUPPORT that can be used to control OMPT target support. It follows host OMPT support, controlled by LIBOMP_HAVE_OMPT_SUPPORT.
Added a connector that can be used to communicate between OMPT implementations in libomp and libomptarget or libomptarget and a plugin.
Added a global constructor in libomptarget that uses the connector to force registration of tool-provided callbacks in libomp. A pair of init and fini functions are provided to libomp as part of the connect process which will be used to register the tool-provided callbacks in libomptarget.
Patch from John Mellor-Crummey <johnmc@rice.edu>
(With contributions from Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>)
Reviewed By: dreachem, jhuber6
Differential Revision: https://reviews.llvm.org/D123572
This patch changes AArch64 + `__GNUC__` to use `__sync` builtins to
implement internal atomic macros just like for Unix, because mingw-w64
is missing some of the intrinsics which the MSVC codepath is using.
Then some remaining intel-only functions are removed from dllexport to
fix linking.
This should fix https://github.com/llvm/llvm-project/issues/56349.
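As a rough illustration of the approach, atomic helper macros can be layered on the GCC/Clang `__sync` builtins as shown below. The macro and function names are hypothetical and do not match the runtime's internal macros; the sketch assumes a compiler that provides the `__sync` builtins.

```cpp
#include <cstdint>

// Hypothetical macro names, built on GCC/Clang __sync builtins.
#define SKETCH_FETCH_ADD32(p, v)                                               \
  __sync_fetch_and_add((volatile int32_t *)(p), (int32_t)(v))
#define SKETCH_CAS32(p, expected, desired)                                     \
  __sync_bool_compare_and_swap((volatile int32_t *)(p), (int32_t)(expected),   \
                               (int32_t)(desired))

// Small demonstration: returns the final counter value, or -1 if any step
// behaved unexpectedly.
int32_t sketch_counter_demo() {
  int32_t counter = 5;
  int32_t old = SKETCH_FETCH_ADD32(&counter, 3);  // returns 5, counter is 8
  bool swapped = SKETCH_CAS32(&counter, 8, 100);  // matches, counter is 100
  bool swapped_again = SKETCH_CAS32(&counter, 8, 200); // no match, unchanged
  return (old == 5 && swapped && !swapped_again) ? counter : -1;
}
```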
Reviewed By: natgla
Differential Revision: https://reviews.llvm.org/D137168
Fix setting the affinity type and topology method when affinity is disabled,
and fix places that did not take into account that affinity can be
explicitly disabled, by adding the proper KMP_AFFINITY_CAPABLE() checks.
Differential Revision: https://reviews.llvm.org/D137176
This is part of a set of patches implementing OMPT target callback support and has been split out of the originally submitted https://reviews.llvm.org/D113728. The overall design can be found in https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc
The purpose of this patch is to provide a way to register tool-provided callbacks into libomp when libomptarget is loaded.
Introduced a cmake variable LIBOMPTARGET_OMPT_SUPPORT that can be used to control OMPT target support. It follows host OMPT support, controlled by LIBOMP_HAVE_OMPT_SUPPORT.
Added a connector that can be used to communicate between OMPT implementations in libomp and libomptarget or libomptarget and a plugin.
Added a global constructor in libomptarget that uses the connector to force registration of tool-provided callbacks in libomp. A pair of init and fini functions are provided to libomp as part of the connect process which will be used to register the tool-provided callbacks in libomptarget.
Depends on D123429
Patch from John Mellor-Crummey <johnmc@rice.edu>
(With contributions from Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>)
Reviewed By: dreachem
Differential Revision: https://reviews.llvm.org/D123572
Add new hidden helper affinity via the environment variable,
KMP_HIDDEN_HELPER_AFFINITY, which allows users to assign thread
affinity to hidden helper threads using the same syntax as
KMP_AFFINITY. OMP_PLACES/OMP_PROC_BIND have no interaction with
KMP_HIDDEN_HELPER_AFFINITY.
Differential Revision: https://reviews.llvm.org/D135113
Separate change to make the warnings depend on the verbose and warnings
flags of the relevant affinity settings.
Differential Revision: https://reviews.llvm.org/D135112
This patch parameterizes the affinity initialization code to allow multiple
affinity settings. Almost all global affinity settings are consolidated
and put into a structure kmp_affinity_t. This is in anticipation of the
addition of hidden helper affinity which will have the same syntax and
semantics as KMP_AFFINITY only for the hidden helper team.
Differential Revision: https://reviews.llvm.org/D135109
When detecting __NR_sched_getaffinity, the last #else is missing,
which makes the last platform, MIPS64, fail to build with an error:
"Unknown or unsupported architecture"
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D135552
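The effect of the missing #else described in the D135552 message can be sketched with a simplified detection chain. The macro name `SKETCH_NR_SCHED_GETAFFINITY` and the selection by compiler-predefined macros are assumptions of this sketch, and the numbers are the Linux syscall numbers for the listed architectures as far as this example assumes them.

```cpp
// Simplified architecture dispatch with a terminating #else.
#if defined(__x86_64__)
#define SKETCH_NR_SCHED_GETAFFINITY 204
#elif defined(__i386__)
#define SKETCH_NR_SCHED_GETAFFINITY 242
#elif defined(__aarch64__) || defined(__riscv)
#define SKETCH_NR_SCHED_GETAFFINITY 123
#else
// Without the #else line above, this #error would belong to the last #elif
// branch and would fire for that supported architecture instead.
#error "Unknown or unsupported architecture"
#endif

constexpr int sketch_sched_getaffinity_nr() {
  return SKETCH_NR_SCHED_GETAFFINITY;
}
```

Once the #else is present, every architecture without its own branch reaches the #error, so each supported architecture needs an explicit branch.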
This patch is a partial fix for https://github.com/llvm/llvm-project/issues/56349, due to functions affected by D117473.
Implementation details:
The patch essentially creates a new macro that is set if the architecture is
either intel32 or intel64, since generate-def.pl cannot evaluate boolean
expressions on macros.
Reviewed By: jlpeyton
Differential Revision: https://reviews.llvm.org/D135795
The modifier bits in the schedule type are not used or supported by the
static scheduler, so they should be ignored.
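Ignoring the modifier bits amounts to masking them off before comparing the schedule kind. The sketch below uses hypothetical names, and the numeric values (schedule kinds 33 and 34, modifier bits 29 and 30) are assumptions for illustration rather than a statement of the runtime's definitions.

```cpp
#include <cstdint>

// Assumed, illustrative encoding of schedule kinds and modifier bits.
constexpr int32_t sketch_sch_static_chunked = 33;
constexpr int32_t sketch_sch_static = 34;
constexpr int32_t sketch_modifier_monotonic = 1 << 29;
constexpr int32_t sketch_modifier_nonmonotonic = 1 << 30;

// Strip the modifier bits so only the base schedule kind remains.
constexpr int32_t sketch_without_modifiers(int32_t sched) {
  return sched & ~(sketch_modifier_monotonic | sketch_modifier_nonmonotonic);
}

// A static schedule is recognized regardless of any modifier bits.
constexpr bool sketch_is_static(int32_t sched) {
  return sketch_without_modifiers(sched) == sketch_sch_static ||
         sketch_without_modifiers(sched) == sketch_sch_static_chunked;
}
```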
Differential Revision: https://reviews.llvm.org/D134983
In preparation for OMPT target changes, create separate categories of events that will be used by OMPT target support.
Split up existing macro FOREACH_OMPT_EVENT into new ones. There is no change to the original macro. Created new macros FOREACH_OMPT_HOST_EVENT, FOREACH_OMPT_DEVICE_EVENT, FOREACH_OMPT_NOEMI_EVENT, FOREACH_OMPT_EMI_EVENT, and a few other sub-categories that can be used as required. One such use is in D123974 which uses events selectively.
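The category macros follow the usual X-macro pattern: each FOREACH macro applies a caller-supplied macro to every event in its category. A minimal sketch, with hypothetical event names and illustrative ids, where the all-events list is left untouched and the subsets are defined next to it:

```cpp
// All events (stands in for the unchanged original list).
#define SKETCH_FOREACH_EVENT(macro)                                            \
  macro(sketch_thread_begin, 1)                                                \
  macro(sketch_thread_end, 2)                                                  \
  macro(sketch_device_initialize, 12)                                          \
  macro(sketch_device_finalize, 13)

// New category subsets that can be used selectively.
#define SKETCH_FOREACH_HOST_EVENT(macro)                                       \
  macro(sketch_thread_begin, 1)                                                \
  macro(sketch_thread_end, 2)

#define SKETCH_FOREACH_DEVICE_EVENT(macro)                                     \
  macro(sketch_device_initialize, 12)                                          \
  macro(sketch_device_finalize, 13)

// Generate an enum from the full list.
#define SKETCH_ENUM_ENTRY(name, id) name = id,
enum sketch_event_t { SKETCH_FOREACH_EVENT(SKETCH_ENUM_ENTRY) };
#undef SKETCH_ENUM_ENTRY

// Generate a predicate from the device subset only.
#define SKETCH_CASE(name, id)                                                  \
  case name:                                                                   \
    return true;
inline bool sketch_is_device_event(sketch_event_t e) {
  switch (e) {
    SKETCH_FOREACH_DEVICE_EVENT(SKETCH_CASE)
  default:
    return false;
  }
}
#undef SKETCH_CASE
```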
Patch from John Mellor-Crummey <johnmc@rice.edu>
Reviewed By: dreachem
Differential Revision: https://reviews.llvm.org/D123429
GCC, glibc, binutils, and LLVM have added support for LoongArch64.
This patch adds support for LLVM OpenMP following D59880 for RISCV64.
Reviewed By: MaskRay, SixWeining
Differential Revision: https://reviews.llvm.org/D132925
Previous support for device memory allocators used a single free
routine and did not provide the original kind of the allocation. This is
problematic as some of these memory types required different handling.
Previously this was worked around using a map in the runtime to record the
original kind of each pointer. Instead, this patch introduces new free
routines similar to the existing allocation routines. This allows us to
avoid a map traversal every time we free a device pointer.
The only interfaces defined by the standard are `omp_target_alloc` and
`omp_target_free`; unlike `omp_alloc`, these do not take a kind. The
standard dictates the following:
"The omp_target_alloc routine returns a device pointer that references
the device address of a storage location of size bytes. The storage
location is dynamically allocated in the device data environment of the
device specified by device_num."
This suggests that these routines only allocate the default device
memory for the kind, so the implementation has been changed to reflect
this. The change is somewhat breaking for users who were calling
`omp_target_free` as previously shown in the tests.
Reviewed By: JonChesterfield, tianshilei1992
Differential Revision: https://reviews.llvm.org/D133053
Have it be simple KMP_MFENCE() which incorporates x86-specific logic and
reduces to KMP_MB() for other architectures.
Differential Revision: https://reviews.llvm.org/D130928
Serialized parallels allocate lightweight task teams on the heap
but never free them in the corresponding join. This patch adds a wrapper
around the allocation (if ompt enabled) and also adds the corresponding
free in the join call.
Differential Revision: https://reviews.llvm.org/D131690
This fixes warnings like these:
../runtime/src/kmp_dispatch.cpp:2159:24: warning: left operand of comma operator has no effect [-Wunused-value]
OMPT_LOOP_DISPATCH(*p_lb, *p_ub, pr->u.p.st, status);
^~~~~
../runtime/src/kmp_dispatch.cpp:2159:31: warning: left operand of comma operator has no effect [-Wunused-value]
OMPT_LOOP_DISPATCH(*p_lb, *p_ub, pr->u.p.st, status);
^~~~~
../runtime/src/kmp_dispatch.cpp:2159:46: warning: left operand of comma operator has no effect [-Wunused-value]
OMPT_LOOP_DISPATCH(*p_lb, *p_ub, pr->u.p.st, status);
~~~~~~~ ^~
../runtime/src/kmp_dispatch.cpp:2159:50: warning: expression result unused [-Wunused-value]
OMPT_LOOP_DISPATCH(*p_lb, *p_ub, pr->u.p.st, status);
^~~~~~
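One common way to get warnings of this shape is a no-op macro defined without a parameter list: the call-site arguments then survive preprocessing as a parenthesized comma expression. The sketch below assumes that cause and uses hypothetical names; it shows the function-like no-op form, which swallows the arguments instead.

```cpp
// Object-like no-op (assumed problematic form, shown as a comment only):
//   #define SKETCH_DISPATCH
//   SKETCH_DISPATCH(lb, ub, st, status);  // becomes: (lb, ub, st, status);
//
// Function-like no-op: the preprocessor consumes the arguments, leaving
// only an empty statement at the call site.
#define SKETCH_DISPATCH(lb, ub, st, status)

int sketch_next_chunk(int *p_lb, int *p_ub, int st) {
  int status = (*p_lb <= *p_ub) ? 1 : 0;
  if (status)
    *p_lb += st; // advance the lower bound by the stride
  SKETCH_DISPATCH(*p_lb, *p_ub, st, status); // expands to nothing
  return status;
}
```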
CMAKE_DL_LIBS is documented as "Name of library containing dlopen
and dlclose".
On Windows platforms, there's no system provided dlopen/dlclose, but
it can be argued that if you really intend to call dlopen/dlclose,
you're going to be using a third party compat library like
https://github.com/dlfcn-win32/dlfcn-win32, and CMAKE_DL_LIBS should
expand to its name.
This has been argued upstream in CMake in
https://gitlab.kitware.com/cmake/cmake/-/issues/17600 and
https://gitlab.kitware.com/cmake/cmake/-/merge_requests/1642, that
CMAKE_DL_LIBS should expand to "dl" on mingw platforms.
The merge request wasn't merged though, as it caused some amount of
breakage, but in practice, Fedora still carries a custom CMake patch
with the same effect.
Thus, this patch fixes cross compiling OpenMP for mingw targets
on Fedora with their custom-patched CMake.
Differential Revision: https://reviews.llvm.org/D130892
Warnings that occur during affinity initialization are supposed
to be guarded by KMP_AFFINITY=nowarnings,noverbose, but some had been
missed by this logic. Create one macro for affinity warnings that takes
these settings into account.
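A single guarded warning macro of this kind might look like the sketch below. The struct, the counter, and the macro name are hypothetical, and the exact condition used by the runtime may differ; the sketch only assumes that a warning is suppressed when both the verbose and warnings flags are off.

```cpp
#include <cstdio>

// Hypothetical flags corresponding to verbose/noverbose and
// warnings/nowarnings.
struct sketch_affinity_flags {
  bool verbose;
  bool warnings;
};

static int sketch_warnings_emitted = 0;

inline void sketch_emit_warning(const char *msg) {
  ++sketch_warnings_emitted;
  std::fprintf(stderr, "OMP: Warning: %s\n", msg);
}

// One macro for all affinity warnings: the guard lives in a single place.
#define SKETCH_AFF_WARNING(flags, msg)                                         \
  do {                                                                         \
    if ((flags).verbose || (flags).warnings)                                   \
      sketch_emit_warning(msg);                                                \
  } while (0)
```

Routing every call site through one macro means a newly added warning cannot bypass the settings by accident.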
Differential Revision: https://reviews.llvm.org/D125991
Added a control to reset the affinity of the primary thread after the
outermost parallel region to the initial affinity encountered before the
OpenMP runtime was initialized. This introduces the reset/noreset modifier
for the KMP_AFFINITY environment variable.
Default behavior is unchanged.
Differential Revision: https://reviews.llvm.org/D125993
icc does not properly detect the lack of the fallthrough attribute, since it
defines __GNUC__ > 7 and icc's __has_cpp_attribute/__has_attribute
feature detectors also fail to report that the attribute is missing.
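A fallthrough macro that distrusts feature detection on icc could be sketched as follows. The macro name `SKETCH_FALLTHROUGH` is hypothetical and the branch order is an assumption, not the runtime's definition.

```cpp
#ifndef __has_cpp_attribute
#define __has_cpp_attribute(x) 0
#endif

#if defined(__INTEL_COMPILER)
// Assumed workaround: skip detection entirely on icc and use a no-op.
#define SKETCH_FALLTHROUGH() ((void)0)
#elif __has_cpp_attribute(fallthrough) && __cplusplus >= 201703L
#define SKETCH_FALLTHROUGH() [[fallthrough]]
#elif defined(__GNUC__) && __GNUC__ >= 7
#define SKETCH_FALLTHROUGH() __attribute__((fallthrough))
#else
#define SKETCH_FALLTHROUGH() ((void)0)
#endif

// Example use: case 2 intentionally falls through into case 1.
int sketch_classify(int n) {
  int score = 0;
  switch (n) {
  case 2:
    score += 10;
    SKETCH_FALLTHROUGH();
  case 1:
    score += 1;
    break;
  default:
    break;
  }
  return score;
}
```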
Differential Revision: https://reviews.llvm.org/D126001
Make library registration conditional: skip it in the __kmp_atfork_child
handler and postpone it until middle initialization in the child.
This fixes a problem for applications that use, e.g., popen/pclose,
which terminate the forked child process.
Differential Revision: https://reviews.llvm.org/D125996
This check-in adds 4 APIs to support MSVC, specifically:
* 3 APIs (__kmpc_sections_init, __kmpc_next_section,
__kmpc_end_sections) to support the dynamic scheduling of OMP sections.
* 1 API (__kmpc_copyprivate_light, a light-weight version of
__kmpc_copyprivate) to support the OMP single copyprivate clause.
Differential Revision: https://reviews.llvm.org/D128403
When many nested teams are formed, __kmp_threads may be reallocated
to accommodate new threads. This reallocation causes a data
race when another existing team's thread simultaneously references
__kmp_threads. This patch keeps the old thread arrays around until library
shutdown so these lingering references can complete without issue and
access to __kmp_threads remains a simple array reference.
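The idea of retiring old arrays instead of freeing them can be sketched as below. All names are hypothetical, and the sketch is single-threaded; a real runtime would additionally have to publish the new array pointer safely to other threads.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical growable table of per-thread pointers.
struct sketch_thread_table {
  void **slots = nullptr;
  std::size_t capacity = 0;
  std::vector<void **> retired; // old arrays kept alive until shutdown
};

// Grow the table. The previous array is retired, not freed, so a reader
// still holding the old pointer keeps seeing valid memory.
void sketch_expand(sketch_thread_table &t, std::size_t new_capacity) {
  if (new_capacity <= t.capacity)
    return;
  void **fresh = new void *[new_capacity]();
  if (t.slots != nullptr) {
    std::memcpy(fresh, t.slots, t.capacity * sizeof(void *));
    t.retired.push_back(t.slots);
  }
  t.slots = fresh;
  t.capacity = new_capacity;
}

// Release everything, including retired arrays, at library shutdown.
void sketch_shutdown(sketch_thread_table &t) {
  for (void **old : t.retired)
    delete[] old;
  t.retired.clear();
  delete[] t.slots;
  t.slots = nullptr;
  t.capacity = 0;
}
```

The cost is a bounded amount of memory held until shutdown; in exchange, lookups stay a plain array index with no locking on the read path.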
Fixes: https://github.com/llvm/llvm-project/issues/54708
Differential Revision: https://reviews.llvm.org/D125013
The code expanded from kmp_barrier.h uses some `KMP_INTERNAL_*`s,
so the definitions have to be placed before it.
Fixes #55815
Differential Revision: https://reviews.llvm.org/D126873
MSVC may not supply source location information to kmpc_reduce, passing
NULL for the loc value instead. The patch adds a check for the loc value
being NULL in kmp_determine_reduction_method.
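The shape of such a check is simple: test the pointer before reading any flag from it. The struct, flag value, and function name below are hypothetical stand-ins, not the runtime's definitions.

```cpp
// Hypothetical stand-in for the source location descriptor.
struct sketch_ident {
  int flags;
};

// Assumed flag bit for "atomic reduction requested".
constexpr int SKETCH_ATOMIC_REDUCE_FLAG = 0x10;

// Treat a missing location (nullptr) as "no flag set" instead of
// dereferencing it.
bool sketch_atomic_reduce_requested(const sketch_ident *loc) {
  return loc != nullptr && (loc->flags & SKETCH_ATOMIC_REDUCE_FLAG) != 0;
}
```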
Differential Revision: https://reviews.llvm.org/D126564
The memkind library is only available for Linux. Calling dlopen here
can also be problematic in a client app that has forked.
Differential Revision: https://reviews.llvm.org/D126579
Currently the library ignores the requested wait policy in the presence
of tasking; threads always spin actively. The patch fixes this problem by
making the wait policy passive if this is explicitly requested by the user.
Differential Revision: https://reviews.llvm.org/D123044
Intel Inspector uses itt notifications to analyze code execution, and it
reports race conditions in dependent tasks.
This patch fixes the issue by notifying Inspector of task dependency
synchronizations.
Differential Revision: https://reviews.llvm.org/D123042
When hwloc is used and is installed outside of the default paths, the omp CMake target
needs to provide the required include path through the CMake target by adding it with
target_include_directories, so that libompd picks it up as well when it defines its CMake
target using target_link_libraries.
As suggested in D122667
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D123888