llvm-project

Commit Graph

Author	SHA1	Message	Date
Terry Wilmarth	2e02579a76	[OpenMP] Add use of TPAUSE Add use of TPAUSE (from WAITPKG) to the runtime for Intel hardware, with an envirable to turn it on in a particular C-state. Always uses TPAUSE if it is selected and enabled by Intel hardware and presence of WAITPKG, and if not, falls back to old way of checking __kmp_use_yield, etc. Differential Revision: https://reviews.llvm.org/D115758	2022-01-18 10:14:32 -06:00
Jonathan Peyton	6a556ecaf4	[OpenMP][libomp] Add use-all syntax to KMP_HW_SUBSET This patch allows the user to request all resources of a particular layer (or core-attribute). The syntax of KMP_HW_SUBSET is modified so the number of units requested is optional or can be replaced with an '*' character. e.g., KMP_HW_SUBSET=c:intel_atom@3 will use all the cores after offset 3 e.g., KMP_HW_SUBSET=c:intel_core will use all the big cores e.g., KMP_HW_SUBSET=s,c,1t will use all the sockets, all cores per each socket and 1 thread per core. Differential Revision: https://reviews.llvm.org/D115826	2021-12-20 13:45:21 -06:00
Jonathan Peyton	9769340905	[OpenMP][libomp] Fix compile errors with new KMP_HW_SUBSET changes Add missing guards around x86-specific code. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D115664	2021-12-14 08:33:05 +01:00
Jonathan Peyton	df20599597	[OpenMP][libomp] Add core attributes to KMP_HW_SUBSET Allow filtering of resources based on core attributes. There are two new attributes added: 1) Core Type (intel_atom, intel_core) 2) Core Efficiency (integer) where the higher the efficiency, the more performant the core On hybrid architectures, e.g., Alder Lake, users can specify KMP_HW_SUBSET=4c:intel_atom,4c:intel_core to select the first four Atom and first four Big cores. They can also use the efficiency syntax. e.g., KMP_HW_SUBSET=2c:eff0,2c:eff1 Differential Revision: https://reviews.llvm.org/D114901	2021-12-10 14:34:33 -06:00
Peyton, Jonathan L	286094af9b	[OpenMP][libomp] Improve Windows Processor Group handling within topology The current implementation of Windows Processor Groups has a separate topology method to handle them. This patch deprecates that specific method and uses the regular CPUID topology method by default and inserts the Windows Processor Group objects in the topology manually. Notes: * The preference for processor groups is lowered to a value less than socket so that the user will see sockets in the KMP_AFFINITY=verbose output instead of processor groups when sockets=processor groups. * The topology's capacity is modified to handle additional topology layers without the need for reallocation. * If a user asks for a granularity setting that is "above" the processor group layer, then the granularity is adjusted "down" to the processor group since this is the coarsest layer available for threads. Differential Revision: https://reviews.llvm.org/D112273	2021-11-17 16:29:01 -06:00
@t-msn	0808d956c4	[OpenMP] libomp: Fix handling of barrier pattern environment variables It is better to set all barrier patterns to use "dist" when at least one environment variable specifies "dist". Otherwise if only one environment variable is set to "dist" and others left blank inadvertently, it would result in mixing dist barrier with default hyper barrier pattern. Differential Revision: https://reviews.llvm.org/D112597	2021-11-08 15:01:26 +03:00
Peyton, Jonathan L	50b68a3d03	[OpenMP][host runtime] Add support for teams affinity This patch implements teams affinity on the host. The default is spread. A user can specify either spread, close, or primary using KMP_TEAMS_PROC_BIND environment variable. Unlike OMP_PROC_BIND, KMP_TEAMS_PROC_BIND is only a single value and is not a list of values. The values follow the same semantics under the OpenMP specification for parallel regions except T is the number of teams in a league instead of the number of threads in a parallel region. Differential Revision: https://reviews.llvm.org/D109921	2021-10-14 16:30:28 -05:00
AndreyChurbanov	5e58b63b28	[OpenMP] libomp: fix warning on comparison of integer expressions of different signedness Replaced macro with global variable of correspondent type. Differential Revision: https://reviews.llvm.org/D111562	2021-10-13 20:11:47 +03:00
Peyton, Jonathan L	343b9e8590	[OpenMP][host runtime] Introduce kmp_cpuinfo_flags_t to replace integer flags Store CPUID support flags as bits instead of using entire integers. Differential Revision: https://reviews.llvm.org/D110091	2021-10-01 11:08:39 -05:00
Martin Storsjö	f5616a981c	[OpenMP] Fix the usage of sscanf on MinGW KMP_SSCANF only evaluates to sscanf_s within #if KMP_OS_WINDOWS && KMP_MSVC_COMPAT so we need to pass the sscanf_s specific parameters within a similar condition. Differential Revision: https://reviews.llvm.org/D108196	2021-08-17 21:36:09 +03:00
Peyton, Jonathan L	b4a1f441d9	[OpenMP] Add a few small fixes * Add comment to help ensure new construct data are added in two places * Check for division by zero in the loop worksharing code * Check for syntax errors in parrange parsing Differential Revision: https://reviews.llvm.org/D105929	2021-08-16 10:02:49 -05:00
Peyton, Jonathan L	6eeb4c1f32	[OpenMP] Fix incorrect parameters to sscanf_s call On Windows, the documentation states that when using sscanf_s, each %c and %s specifier must also have additional size parameter. This patch adds the size parameter in the one place where %c is used. Differential Revision: https://reviews.llvm.org/D105931	2021-08-16 09:59:21 -05:00
Terry Wilmarth	d8e4cb9121	[OpenMP] libomp: Add new experimental barrier: two-level distributed barrier Two-level distributed barrier is a new experimental barrier designed for Intel hardware that has better performance in some cases than the default hyper barrier. This barrier is designed to handle fine granularity parallelism where barriers are used frequently with little compute and memory access between barriers. There is no need to use it for codes with few barriers and large granularity compute, or memory intensive applications, as little difference will be seen between this barrier and the default hyper barrier. This barrier is designed to work optimally with a fixed number of threads, and has a significant setup time, so should NOT be used in situations where the number of threads in a team is varied frequently. The two-level distributed barrier is off by default -- hyper barrier is used by default. To use this barrier, you must set all barrier patterns to use this type, because it will not work with other barrier patterns. Thus, to turn it on, the following settings are required: KMP_FORKJOIN_BARRIER_PATTERN=dist,dist KMP_PLAIN_BARRIER_PATTERN=dist,dist KMP_REDUCTION_BARRIER_PATTERN=dist,dist Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER, and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed barrier. Patch fixed for ITTNotify disabled builds and non-x86 builds Co-authored-by: Jonathan Peyton <jonathan.l.peyton@intel.com> Co-authored-by: Vladislav Vinogradov <vlad.vinogradov@intel.com> Differential Revision: https://reviews.llvm.org/D103121	2021-07-29 14:09:26 -05:00
Johannes Doerfert	4eb90e893f	Revert "[OpenMP] Add Two-level Distributed Barrier" This reverts commit `25073a4ecf`. This breaks non-x86 OpenMP builds for a while now. Until a solution is ready to be upstreamed we revert the feature and unblock those builds. See: https://reviews.llvm.org/rG25073a4ecfc9b2e3cb76776185e63bfdb094cd98#1005821 and https://reviews.llvm.org/rG25073a4ecfc9b2e3cb76776185e63bfdb094cd98#1005821 The currently proposed fix (D104788) seems not to be ready yet: https://reviews.llvm.org/D104788#2841928	2021-06-29 09:38:27 -05:00
AndreyChurbanov	5dd4d0d46f	[OpenMP] libomp: fix dynamic loop dispatcher Restructured dynamic loop dispatcher code. Fixed use of dispatch buffers for nonmonotonic dynamic (static_steal) schedule: - eliminated possibility of stealing iterations of the wrong loop when victim thread changed its buffer to work on another loop; - fixed race when victim thread changed its buffer to work in nested parallel; - eliminated "static" property of the schedule, that is, a single thread can now execute the whole loop. Differential Revision: https://reviews.llvm.org/D103648	2021-06-22 16:29:01 +03:00
Terry Wilmarth	25073a4ecf	[OpenMP] Add Two-level Distributed Barrier Two-level distributed barrier is a new experimental barrier designed for Intel hardware that has better performance in some cases than the default hyper barrier. This barrier is designed to handle fine granularity parallelism where barriers are used frequently with little compute and memory access between barriers. There is no need to use it for codes with few barriers and large granularity compute, or memory intensive applications, as little difference will be seen between this barrier and the default hyper barrier. This barrier is designed to work optimally with a fixed number of threads, and has a significant setup time, so should NOT be used in situations where the number of threads in a team is varied frequently. The two-level distributed barrier is off by default -- hyper barrier is used by default. To use this barrier, you must set all barrier patterns to use this type, because it will not work with other barrier patterns. Thus, to turn it on, the following settings are required: KMP_FORKJOIN_BARRIER_PATTERN=dist,dist KMP_PLAIN_BARRIER_PATTERN=dist,dist KMP_REDUCTION_BARRIER_PATTERN=dist,dist Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER, and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed barrier. Differential Revision: https://reviews.llvm.org/D103121	2021-06-16 15:34:55 -05:00
Vignesh Balasubramanian	f61602b0d3	[OpenMP][OMPD] Implementation of OMPD debugging library - libompd. This is the first of seven patches that implements OMPD, a debugging interface to support debugging of OpenMP programs. It contains support code required in "openmp/runtime" for OMPD implementation. Reviewed By: @hbae Differential Revision: https://reviews.llvm.org/D100181	2021-06-08 16:44:22 +05:30
Terry Wilmarth	8ec9aa236e	[OpenMP] Add experimental nesting mode feature Nesting mode is a new experimental feature in the OpenMP runtime. It allows a user to set up nesting for an application in a way that corresponds to the hardware topology levels on the machine an application is being run on. For example, if a machine has 2 sockets, each with 12 cores, then use of nesting mode could set up an outer level of nesting that uses 2 threads per parallel region, and an inner level of nesting that uses 12 threads per parallel region. Nesting mode is controlled with the KMP_NESTING_MODE environment variable as follows: 1) KMP_NESTING_MODE = 0: Nesting mode is off (default); max-active-levels-var is set to 1 (the default -- nesting is off, nested parallel regions are serialized). 2) KMP_NESTING_MODE = 1: Nesting mode is on, and a number of threads will be assigned for each level discovered in the machine topology; max-active-levels-var is set to the number of levels discovered. 3) KMP_NESTING_MODE = n, n>1: [Note: this option is experimental and may change or be removed in the future.] Nesting mode is on, and a number of threads will be assigned for each topology level discovered on the machine, up to k<=n levels (since there may be fewer than n levels discovered in the topology), and beyond the kth level, nested parallel regions will be serialized; NOTE: max-active-levels-var is 1 (the default -- nesting is off, and nested parallel regions are serialized) until the user changes max-active-levels-var. If the user sets OMP_NUM_THREADS or OMP_MAX_ACTIVE_LEVELS, they will override KMP_NESTING_MODE settings for the associated environment variables. The detected topology may be limited by an affinity mask setting on the initial thread, or if the user sets KMP_HW_SUBSET. See also: KMP_HOT_TEAMS_MAX_LEVEL for controlling use of hot teams for nested parallel regions. Note that this feature only sets numbers of threads used at nesting levels. 
The user should make use of OMP_PLACES and OMP_PROC_BIND or KMP_AFFINITY for affinitizing those threads, if desired. Differential Revision: https://reviews.llvm.org/D102188	2021-06-04 16:01:11 -05:00
Peyton, Jonathan L	9982f33e2c	[OpenMP] Refactor/Rework topology discovery code This patch does the following: 1) Introduce kmp_topology_t as the runtime-friendly structure (the corresponding global variable is __kmp_topology) to determine the exact machine topology which can vary widely among current and future architectures. The current design is not easy to expand beyond the assumed three layer topology: sockets, cores, and threads so a rework capable of using the existing KMP_AFFINITY mechanisms is required. This new topology structure has: * The depth and types of the topology * Ratio count for each consecutive level (e.g., number of cores per socket, number of threads per core) * Absolute count for each level (e.g., 2 sockets, 16 cores, 32 threads) * Equivalent topology layer map (e.g., Numa domain is equivalent to socket, L1/L2 cache equivalent to core) * Whether it is uniform or not The hardware threads are represented with the kmp_hw_thread_t structure. This structure contains the ids (e.g., socket 0, core 1, thread 0) and other information grabbed from the previous Address structure. The kmp_topology_t structure contains an array of these. 2) Generalize the KMP_HW_SUBSET envirable for the new kmp_topology_t structure. The algorithm doesn't assume any order with tiles,numa domains,sockets,cores,threads. Instead it just parses the envirable, makes sure it is consistent with the detected topology (including taking into account equivalent layers) and then trims away the unneeded subset of hardware threads. To enable this, a new kmp_hw_subset_t structure is introduced which contains a vector of items (hardware type, number user wants, offset). Any keyword within __kmp_hw_get_keyword() can be used as a name and can be shortened as well. e.g., KMP_HW_SUBSET=1s,2numa,4tile,2c,3t can be used on the KNL SNC-4 machine. 3) Simplify topology detection functions so they only do the singular task of detecting the machine's topology. 
Printing, and all canonicalizing functionality is now done afterwards. So many lines of duplicated code are eliminated. 4) Add new ll_caches and numa_domains to OMP_PLACES, and consequently, KMP_AFFINITY's granularity setting. All the names within __kmp_hw_get_keyword() are available for use in OMP_PLACES or KMP_AFFINITY's granularity setting. 5) Simplify and future-proof code where explicit lists of allowed affinity settings keywords inside if() conditions. 6) Add x86 CPUID leaf 4 cache detection to existing x2apic id method so equivalent caches could be detected (in particular for the ll_caches place). Differential Revision: https://reviews.llvm.org/D100997	2021-05-03 18:00:24 -05:00
Hansang Bae	77dc7b4653	[OpenMP] Fix printing routine for OMP_TOOL_VERBOSE_INIT Also fixed typo in the verbose message. Differential Revision: https://reviews.llvm.org/D100414	2021-04-14 07:55:26 -05:00
Hansang Bae	467f39249d	[OpenMP] Misc. changes that add or remove pointer/bound checks -- Added or moved checks to appropriate places. -- Removed ineffective null check where the pointer is already being dereferenced around the code. -- Initialized variables that can be used without definitions. -- Added call to dlclose/FreeLibrary in OMPT tool activation. -- Added a new build compiler definition. Differential Revision: https://reviews.llvm.org/D98584	2021-03-23 18:55:08 -05:00
Shilei Tian	2df65f87c1	[OpenMP] Fixed a crash in hidden helper thread It is reported that after enabling hidden helper thread, the program can hit the assertion `new_gtid < __kmp_threads_capacity` sometimes. The root cause is explained as follows. Let's say the default `__kmp_threads_capacity` is `N`. If hidden helper thread is enabled, `__kmp_threads_capacity` will be offset to `N+8` by default. If the number of threads we need exceeds `N+8`, e.g. via `num_threads` clause, we need to expand `__kmp_threads`. In `__kmp_expand_threads`, the expansion starts from `__kmp_threads_capacity`, and repeatedly doubling it until the new capacity meets the requirement. Let's assume the new requirement is `Y`. If `Y` happens to meet the constraint `(N+8)*2^X=Y` where `X` is the number of iterations, the new capacity is not enough because we have 8 slots for hidden helper threads. Here is an example. ``` #include <vector> int main(int argc, char *argv[]) { constexpr const size_t N = 1344; std::vector<int> data(N); #pragma omp parallel for for (unsigned i = 0; i < N; ++i) { data[i] = i; } #pragma omp parallel for num_threads(N) for (unsigned i = 0; i < N; ++i) { data[i] += i; } return 0; } ``` My CPU is 20C40T, then `__kmp_threads_capacity` is 160. After offset, `__kmp_threads_capacity` becomes 168. `1344 = (160+8)*2^3`, then the assertions hit. Reviewed By: protze.joachim Differential Revision: https://reviews.llvm.org/D98838	2021-03-18 18:25:36 -04:00
tlwilmar	97d000cfc6	Added API for "masked" construct via two entrypoints: __kmpc_masked, and __kmpc_end_masked. The "master" construct is deprecated. Changed proc-bind keyword from "master" to "primary". Use of both master construct and master as proc-bind keyword is still allowed, but deprecated. Remove references to "master" in comments and strings, and replace with "primary" or "primary thread". Function names and variables were not touched, nor were references to deprecated master construct. These can be updated over time. No new code should refer to master.	2021-03-05 09:29:57 -06:00
Peyton, Jonathan L	8c73be9d86	[OpenMP] Limit number of dispatch buffers This patch limits the number of dispatch buffers (used for loop worksharing construct) to between 1 and 4096. Differential Revision: https://reviews.llvm.org/D96749	2021-02-22 13:14:28 -06:00
Shilei Tian	309b00a42e	[OpenMP][NFC] clang-format the whole openmp project Same script as D95318. Test files are excluded. Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D97088	2021-02-20 12:46:32 -05:00
AndreyChurbanov	d7b12004bd	[OpenMP] libomp: implement nteams-var and teams-thread-limit-var ICVs The change includes OMP_NUM_TEAMS, OMP_TEAMS_THREAD_LIMIT env variables, omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit, omp_get_teams_thread_limit routines. Differential Revision: https://reviews.llvm.org/D95003	2021-02-01 22:54:11 +03:00
Jonathan Peyton	67773681c0	[OpenMP] Add environment variable to force monotonic dynamic scheduling This patch introduces a new environment variable to force monotonic behavior for users that absolutely need it. This is in anticipation of 5.0 change that uses non-monotonic behavior for dynamic scheduling by default. Fixes for that and the actual switch are coming soon. Differential Revision: https://reviews.llvm.org/D95263	2021-01-29 12:23:27 -06:00
AndreyChurbanov	7f5ad0e071	[OpenMP] libomp: fix build by cl with vs2019 Replace VLA with dynamic allocation using alloca(). This fixes https://bugs.llvm.org/show_bug.cgi?id=48919. Differential Revision: https://reviews.llvm.org/D95627	2021-01-29 13:16:41 +03:00
Peyton, Jonathan L	8e67134364	[OpenMP] Fix misleading warning for OMP_PLACES When OMP_PLACES contains an invalid value, the warning informs the user that the fallback is OMP_PLACES=threads, but the actual internal setting is OMP_PLACES=cores and is detected as such with KMP_SETTINGS=1. This patch informs the user that OMP_PLACES=cores is being used instead of OMP_PLACES=threads. Differential Revision: https://reviews.llvm.org/D95170	2021-01-27 14:27:24 -06:00
Peyton, Jonathan L	598c590b3c	[OpenMP] Add cpuid leaf 1f topology discovery This patch adds the new algorithm for topology discovery using cpuid leaf 1f. Only the new die level is detected and integrated into the current affinity mechanisms including KMP_AFFINITY (granularity level and compact/scatter algorithm), OMP_PLACES=dies, and KMP_HW_SUBSET. Differential Revision: https://reviews.llvm.org/D95157	2021-01-27 14:27:23 -06:00
Nawrin Sultana	927af4b3c5	[OpenMP] Modify OMP_ALLOCATOR environment variable This patch sets the def-allocator-var ICV based on the environment variables provided in OMP_ALLOCATOR. Previously, only allowed value for OMP_ALLOCATOR was a predefined memory allocator. OpenMP 5.1 specification allows predefined memory allocator, predefined mem space, or predefined mem space with traits in OMP_ALLOCATOR. If an allocator can not be created using the provided environment variables, the def-allocator-var is set to omp_default_mem_alloc. Differential Revision: https://reviews.llvm.org/D94985	2021-01-26 18:27:39 -06:00
Shilei Tian	9d64275ae0	[OpenMP] Added the support for hidden helper task in RTL The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks. We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. The wrapped function for this team is, for main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. For other work threads, they just do nothing. The reason that main thread needs to wait there is, in current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team which is not what we want. Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8. Here are some open issues to be discussed: 1. The main thread goes to sleeping when the initialization is finished. As Andrey mentioned, we might need it to be awaken from time to time to do some stuffs. What kind of update/check should be put here? Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D77609	2021-01-25 22:16:17 -05:00
AndreyChurbanov	a60bc55c69	[OpenMP] libomp: cleanup parsing of OMP_ALLOCATOR env variable. Differential Revision: https://reviews.llvm.org/D94932	2021-01-19 16:21:22 +03:00
Shilei Tian	9bf843bdc8	Revert "[OpenMP] Added the support for hidden helper task in RTL" This reverts commit `ed939f853d`.	2021-01-18 06:57:52 -05:00
Shilei Tian	ed939f853d	[OpenMP] Added the support for hidden helper task in RTL The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks. We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. The wrapped function for this team is, for main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. For other work threads, they just do nothing. The reason that main thread needs to wait there is, in current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team which is not what we want. Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8. Here are some open issues to be discussed: 1. The main thread goes to sleeping when the initialization is finished. As Andrey mentioned, we might need it to be awaken from time to time to do some stuffs. What kind of update/check should be put here? Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D77609	2021-01-16 14:13:35 -05:00
Terry Wilmarth	6b316febb4	[OpenMP] libomp: Handle implicit conversion warnings This patch partially prepares the runtime source code to be built with -Wconversion, which should trigger warnings if any implicit conversions can possibly change a value. For builds done with icc or gcc, all such warnings are handled in this patch. clang gives a much longer list of warnings, particularly for sign conversions, which the other compilers don't report. The -Wconversion flag is commented into cmake files, but I'm not going to turn it on. If someone thinks it is important, and wants to fix all the clang warnings, they are welcome to. Types of changes made here involve either improving the consistency of types used so that no conversion is needed, or else performing careful explicit conversions, when we're sure a problem won't arise. Patch is a combination of changes by Terry Wilmarth and Johnny Peyton. Differential Revision: https://reviews.llvm.org/D92942	2020-12-31 00:39:57 +03:00
Hansang Bae	c3b5009aa7	[OpenMP] Use RTM lock for OMP lock with synchronization hint This patch introduces a new RTM lock type based on spin lock which is used for OMP lock with speculative hint on supported architecture. Differential Revision: https://reviews.llvm.org/D92615	2020-12-09 19:14:53 -06:00
Terry Wilmarth	e0665a9050	[OpenMP] Add support for Intel's umonitor/umwait These changes add support for Intel's umonitor/umwait usage in wait code, for architectures that support those intrinsic functions. Usage of umonitor/umwait is off by default, but can be turned on by setting the KMP_USER_LEVEL_MWAIT environment variable. Differential Revision: https://reviews.llvm.org/D91189	2020-12-01 14:07:46 -06:00
Isabel Thärigen	b281a05dac	[OpenMP][OMPT] Implement verbose tool loading OpenMP 5.1 introduces the new env variable OMP_TOOL_VERBOSE_INIT=(disabled|stdout|stderr|<filename>) to enable verbose loading and initialization of OMPT tools. This env variable helps to understand the cause when loading of a tool fails (e.g., undefined symbols or dependency not in LD_LIBRARY_PATH) Output of OMP_TOOL_VERBOSE_INIT is added for OMP_DISPLAY_ENV Tests for this patch are integrated into the different existing tool loading tests, making these tests more verbose. An Archer specific verbose test is integrated into an existing Archer test. Patch prepared by: Isabel Thärigen Differential Revision: https://reviews.llvm.org/D91464	2020-11-25 18:17:44 +01:00
AndreyChurbanov	5644f734d6	Revert "[OpenMP] Add support for Intel's umonitor/umwait" This reverts commit `9cfad5f9c5`.	2020-11-20 12:16:34 +03:00
AndreyChurbanov	9cfad5f9c5	[OpenMP] Add support for Intel's umonitor/umwait Patch by tlwilmar (Terry Wilmarth) Differential Revision: https://reviews.llvm.org/D91189	2020-11-19 22:04:21 +03:00
Kazuaki Ishizaki	4201679110	[OpenMP] NFC: Fix trivial typo Differential Revision: https://reviews.llvm.org/D77430	2020-04-04 12:06:54 +09:00
AndreyChurbanov	95df6747cf	[openmp] OpenMP 5.1 omp_display_env function implementation. Patch by Michael Klemm. Differential Revision: https://reviews.llvm.org/D74956	2020-03-04 18:15:05 +03:00
Kazuaki Ishizaki	4c6a098ad5	[OpenMP] NFC: Fix trivial typos in comments Reviewers: jdoerfert, Jim Reviewed By: Jim Subscribers: Jim, mgorny, guansong, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72285	2020-01-07 14:05:03 +08:00
Jonathan Peyton	e4b4f994d2	[OpenMP] Remove OMP spec versioning Remove all older OMP spec versioning from the runtime and build system. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D64534 llvm-svn: 365963	2019-07-12 21:45:36 +00:00
Andrey Churbanov	a23806e67a	Create a runtime option to disable task throttling. Patch by viroulep (Philippe Virouleau) Differential Revision: https://reviews.llvm.org/D63196 llvm-svn: 364934	2019-07-02 15:10:20 +00:00
Andrey Churbanov	d47f5488cf	Added propagation of not big initial stack size of master thread to workers. Currently implemented only for non-Windows 64-bit platforms. Differential Revision: https://reviews.llvm.org/D62488 llvm-svn: 362618	2019-06-05 16:14:47 +00:00
Hansang Bae	ec1b4d1f6f	Fix OMP_TARGET_OFFLOAD parsing Current parsing allows trailing string after the permitted value, MANDATORY|DISABLED|DEFAULT -- e.g., "mandatorynot" is also recognized as "MANDATORY". Such cases should be recognized as incorrect/unknown value. Differential Revision: https://reviews.llvm.org/D62431 llvm-svn: 362125	2019-05-30 18:35:07 +00:00
Jonathan Peyton	71abe28e81	[OpenMP] Add OpenMP 5.0 nonmonotonic code This patch adds: * New omp_sched_monotonic flag to omp_sched_t which is handled within the runtime * Parsing of monotonic/nonmonotonic in OMP_SCHEDULE * Tests for the monotonic flag and envirable parsing * Logic to force monotonic when hierarchical scheduling is used Differential Revision: https://reviews.llvm.org/D60979 llvm-svn: 359601	2019-04-30 19:20:35 +00:00
Andrey Churbanov	705384be97	Fixed possible out of bound array access. The check of index value moved to before the write to the array. Differential Revision: https://reviews.llvm.org/D60471 llvm-svn: 358181	2019-04-11 15:03:44 +00:00

93 Commits