Commit Graph

603 Commits

Author SHA1 Message Date
Jonathan Peyton 284fab195a [OpenMP] Add GOMP version symbols for OMP_4.5 API
This patch adds the appropriate version symbols to the relevant API functions

Differential Revision: https://reviews.llvm.org/D49859

llvm-svn: 338281
2018-07-30 17:50:35 +00:00
Jonathan Peyton 369d72db11 [OpenMP] Implement GOMP doacross compatibility
This change introduces GOMP doacross compatibility. There are 12 new interface
functions 6 for long type and 6 for unsigned long long type:
GOMP_doacross_post, GOMP_doacross_wait, GOMP_loop_doacross_[schedule]_start
where schedule can be static, dynamic, guided, or runtime.

These functions just translate the parameters if necessary and send them
to the corresponding kmp function.
E.g., GOMP_doacross_post() -> __kmpc_doacross_post()

For the GOMP_doacross_post function, there is template specialization to
account for when long is a four byte vs an eight byte type. If it is a
four byte type, then a temporary array has to be created to convert the
four byte integers into eight byte integers and then sending that into
__kmpc_doacross_post(). Because GOMP_doacross_wait uses varargs, it
always needs a temporary array and does not need template specialization.

Differential Revision: https://reviews.llvm.org/D49857

llvm-svn: 338280
2018-07-30 17:48:33 +00:00
Jonathan Peyton 8692e142b3 [OpenMP] Fix build errors when building with KMP_DEBUG_ADAPTIVE_LOCKS=1
This change fixes build errors when building a runtime with adaptive lock stats
enabled. Most of the errors were due to the recent changes in the runtime, but
it seems that we have not tried to build this debug runtime on Windows for a
long time.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D49823

llvm-svn: 338277
2018-07-30 17:45:23 +00:00
Jonathan Peyton f0682ac498 [OpenMP][Stats] Cleanup stats gathering code
1) Remove unnecessary data from list node structure
2) Remove timerPair in favor of pushing/popping explicitTimers.
   This way, nested timers will work properly.
3) Fix #pragma omp critical timers
4) Add histogram capability
5) Add KMP_STATS_FILE formatting capability
6) Have time partitioned into serial & parallel by introducing
   partitionedTimers::exchange(). This also counts the number of serial regions
   in the executable.
7) Fix up the timers around OMP loops so that scheduling overhead and work are
   both counted correctly.
8) Fix up the iterations statistics so they count the number of iterations the
   thread receives at each loop scheduling event
9) Change timers so there is only one RDTSC read per event change
10) Fix up the outdated comments for the timers

Differential Revision: https://reviews.llvm.org/D49699

llvm-svn: 338276
2018-07-30 17:41:08 +00:00
Joachim Protze cdaefac5bd [OMPT] Fix OMPT callbacks for the taskloop construct and add testcase
Fix the order of callbacks related to the taskloop construct.
Add the iteration_count to work callbacks (according to the spec).
Use kmpc_omp_task() instead of kmp_omp_task() to include OMPT callbacks.
Add a testcase.

Patch by Simon Convent

Reviewed by: protze.joachim, hbae

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D47709

llvm-svn: 338146
2018-07-27 18:13:24 +00:00
Joachim Protze 86ed6aa668 [OMPT] Adapt OMPT callbacks for tasks to handle untied tasks correctly
The ompt/tasks/task_types.c testcase did not test untied tasks properly. Now,
frame addresses are tested and two scheduling points are added at which the
task can switch to another thread. Due to scheduling effects, the frame address
could be NULL.

This needed a restructure of the way OMPT callbacks are called.
__ompt_task_finish() now as an extra parameter, whether a task is completed.
Its invocation has been moved into __kmp_task_finish(). Thus, the order of the
writes to the frame addresses is not subject to scheduling effects anymore.

Patch by Simon Convent

Reviewed by: protze.joachim, hbae

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D49181

llvm-svn: 338145
2018-07-27 18:13:20 +00:00
Jonas Hahnfeld 3a0e9b37f3 PR30734: Remove __kmp_ft_page_allocate()
This function was not enabled by default and not exported when manually
tweaking the build flags. Additionally it was hard to use since there
is no corresponding __kmp_ft_page_free().
The code itself is questionable because the returned memory address
is padded by an extra pointer which stores the unpadded start of the
allocated region (this would need to be freed).

Differential Revision: https://reviews.llvm.org/D49802

llvm-svn: 338052
2018-07-26 18:15:02 +00:00
Jonathan Peyton a764af68be Block library shutdown until unreaped threads finish spin-waiting
This change fixes possibly invalid access to the internal data structure during
library shutdown.  In a heavily oversubscribed situation, the library shutdown
sequence can reach the point where resources are deallocated while there still
exist threads in their final spinning loop.  The added loop in
__kmp_internal_end() checks if there are such busy-waiting threads and blocks
the shutdown sequence if that is the case. Two versions of kmp_wait_template()
are now used to minimize performance impact.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D49452

llvm-svn: 337486
2018-07-19 19:17:00 +00:00
Jonathan Peyton dc73f512ae Fix const cast problem introduced in r336563
336563 eliminated CCAST() macros caused build failures

llvm-svn: 336586
2018-07-09 19:09:31 +00:00
Jonathan Peyton 61d44f188a [OpenMP] Fix a few formatting issues
llvm-svn: 336575
2018-07-09 18:09:25 +00:00
Jonathan Peyton f639936748 [OpenMP] Introduce hierarchical scheduling
This patch introduces the logic implementing hierarchical scheduling.
First and foremost, hierarchical scheduling is off by default
To enable, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage.
This work is based off if the IWOMP paper:
"Workstealing and Nested Parallelism in SMP Systems"

Hierarchical scheduling is the layering of OpenMP schedules for different layers
of the memory hierarchy. One can have multiple layers between the threads and
the global iterations space. The threads will go up the hierarchy to grab
iterations, using possibly a different schedule & chunk for each layer.

[ Global iteration space (0-999) ]

(use static)
[ L1 | L1 | L1 | L1 ]

(use dynamic,1)
[ T0 T1 | T2 T3 | T4 T5 | T6 T7 ]

In the example shown above, there are 8 threads and 4 L1 caches begin targeted.
If the topology indicates that there are two threads per core, then two
consecutive threads will share the data of one L1 cache unit. This example
would have the iteration space (0-999) split statically across the four L1
caches (so the first L1 would get (0-249), the second would get (250-499), etc).
Then the threads will use a dynamic,1 schedule to grab iterations from the L1
cache units. There are currently four supported layers: L1, L2, L3, NUMA

OMP_SCHEDULE can now read a hierarchical schedule with this syntax:
OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK
And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before

I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h
to try to keep it separate from the rest of the code.

Differential Revision: https://reviews.llvm.org/D47962

llvm-svn: 336571
2018-07-09 17:51:13 +00:00
Jonathan Peyton 39ada85446 [OpenMP] Restructure loop code for hierarchical scheduling
This patch reorganizes the loop scheduling code in order to allow hierarchical
scheduling to use it more effectively. In particular, the goal of this patch
is to separate the algorithmic parts of the scheduling from the thread
logistics code.

Moves declarations & structures to kmp_dispatch.h for easier access in
other files.  Extracts the algorithmic part of __kmp_dispatch_init() and
__kmp_dispatch_next() into __kmp_dispatch_init_algorithm() and
__kmp_dispatch_next_algorithm(). The thread bookkeeping logic is still kept in
__kmp_dispatch_init() and __kmp_dispatch_next(). This is done because the
hierarchical scheduler needs to access the scheduling logic without the
bookkeeping logic.  To prepare for new pointer in dispatch_private_info_t, a
new flags variable is created which stores the ordered and nomerge flags instead
of them being in two separate variables. This will keep the
dispatch_private_info_t structure the same size.

Differential Revision: https://reviews.llvm.org/D47961

llvm-svn: 336568
2018-07-09 17:45:33 +00:00
Jonathan Peyton 37e2ef5434 [OpenMP] Use C++11 Atomics - barrier, tasking, and lock code
These are preliminary changes that attempt to use C++11 Atomics in the runtime.
We are expecting better portability with this change across architectures/OSes.
Here is the summary of the changes.

Most variables that need synchronization operation were converted to generic
atomic variables (std::atomic<T>). Variables that are updated with combined CAS
are packed into a single atomic variable, and partial read/write is done
through unpacking/packing

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D47903

llvm-svn: 336563
2018-07-09 17:36:22 +00:00
Joachim Protze 4a73ae167e [OMPT] Provide the right thread_num for ancestor levels
The current implementation always provides the thread-num for the current
parallel region. This patch fixes the behavior for ancestor levels >0.

Differential Revision: https://reviews.llvm.org/D46533

llvm-svn: 336085
2018-07-02 09:13:24 +00:00
Andrey Churbanov a7fa3f009a minor: fixed typo in debug print
llvm-svn: 335138
2018-06-20 15:54:11 +00:00
Jonathan Peyton e92ae43be8 [OpenMP] Fix formatting issues in kmp_stats.h
llvm-svn: 334335
2018-06-08 22:27:53 +00:00
Joachim Protze 406361330b [OMPT] Rename ompt_wait_id to omp_wait_id
Rename ompt_wait_id to omp_wait_id, as defined in the spec.

Differential Revision: https://reviews.llvm.org/D46530

llvm-svn: 333368
2018-05-28 08:16:08 +00:00
Joachim Protze c5836064bb [OMPT] Rename ompt_frame_t to omp_frame_t
Rename ompt_frame_t to omp_frame_t, as defined in the spec.

Differential Revision: https://reviews.llvm.org/D43568

llvm-svn: 333367
2018-05-28 08:14:58 +00:00
Jonas Hahnfeld 65e0b8784c [CMake] Unify install path for libraries
Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands.
This also fixes installation of libomptarget-nvptx that previously
didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX.

Differential Revision: https://reviews.llvm.org/D47130

llvm-svn: 333284
2018-05-25 15:56:41 +00:00
Joachim Protze 9be9cf20bf [OMPT] Fix thread_num for implicit_task_end callbacks in nested parallel regions
implicit_task_end callbacks in nested parallel regions did not always give the
correct thread_num, since the inner parallel region may have already been
finalized.
Now, the thread_num is stored at the beginning of the implicit task and
retrieved at the end, whenever necessary.

A testcase was added as well.

Differential Revision: https://reviews.llvm.org/D46260

llvm-svn: 331632
2018-05-07 12:42:21 +00:00
Heejin Ahn f78a493528 [OpenMP] Compilation error fix on const char*
Summary:
This line
(0ed912c7a7/runtime/src/kmp_gsupport.cpp (L1459))
added in D45327 (rL330282) causes a compilation failure.

Reviewers: jlpeyton

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D45786

llvm-svn: 330299
2018-04-18 22:23:31 +00:00
Jonathan Peyton 1482db9e03 [OpenMP] Fix affinity API for KMP_AFFINITY=none|compact|scatter
Currently, the affinity API reports garbage for the initial place list and any
thread's place lists when using KMP_AFFINITY=none|compact|scatter.
This patch does two things:

for KMP_AFFINITY=none, Creates a one entry table for the places, this way, the
initial place list is just a single place with all the proc ids in it. We also
set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the
thread reports that single place (place 0) instead of garbage (-1) when using
the affinity API.

When non-OMP_PROC_BIND affinity is used
(including KMP_AFFINITY=compact|scatter), a thread's place list is populated
correctly. We assume that each thread is assigned to a single place. This is
implemented in two of the affinity API functions

Differential Revision: https://reviews.llvm.org/D45527

llvm-svn: 330283
2018-04-18 19:25:48 +00:00
Jonathan Peyton 27a677fc95 Introduce GOMP_taskloop API
This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our
version symbols. Being a wrapper around __kmpc_taskloop, the function
creates a task with the loop bounds properly nested in the shareds so that
the GOMP task thunk will work properly. Also, the firstprivate copy constructors
are properly handled using the __kmp_gomp_task_dup() auxiliary function.

Currently, only linear spawning of tasks is supported
for the GOMP_taskloop interface.

Differential Revision: https://reviews.llvm.org/D45327

llvm-svn: 330282
2018-04-18 19:23:54 +00:00
Joachim Protze 3865c69b84 Set the license header for all OMPT files
llvm-svn: 329928
2018-04-12 17:23:26 +00:00
Jonathan Peyton 1e6bb8d5de Minor cleanup in __kmp_atfork_child()
This change removes the unnecessary lock operation on __kmp_initz_lock inside
the __kmp_atfork_child() function for Linux; the lock variable is initialized
in the same function later.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D44949

llvm-svn: 328900
2018-03-30 19:55:11 +00:00
Jonathan Peyton ea82c769f4 Move blocktime_str variable right before its first use
llvm-svn: 328575
2018-03-26 19:20:50 +00:00
Andrey Churbanov 2d91a8a3ba Fixed __kmpc_get_target_offload() to call library initialization.
Differential Revision: https://reviews.llvm.org/D44793

llvm-svn: 328228
2018-03-22 18:51:51 +00:00
Jonathan Peyton 78f977fcd1 Read OMP_TARGET_OFFLOAD and provide API to access ICV
Added settings code to read OMP_TARGET_OFFLOAD environment variable. Added
target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD,
if available, otherwise defaulting to DEFAULT. Valid values for the ICV are
specified as enum values {0,1,2} for disabled, default, and mandatory. An
internal API access function __kmpc_get_target_offload is provided.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D44577

llvm-svn: 328046
2018-03-20 21:18:17 +00:00
Andrey Churbanov 3336aa0d07 Fix for Fix for https://bugs.llvm.org/show_bug.cgi?id=36705.
Differential Revision: https://reviews.llvm.org/D44637

llvm-svn: 327875
2018-03-19 18:05:15 +00:00
Andrey Churbanov 9e9333aa8a Improve OpenMP threadprivate implementation.
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D41914

llvm-svn: 326733
2018-03-05 18:42:01 +00:00
Andrey Churbanov 75bc70fb56 Fixed build of the OpenMP stubs library.
Differential Revision: https://reviews.llvm.org/D44019

llvm-svn: 326728
2018-03-05 18:01:47 +00:00
Joachim Protze aa2022e74f [OMPT] Fix ompt_get_task_info() and add tests for it
The thread_num parameter of ompt_get_task_info() was not being used previously,
but need to be set.

The print_task_type() function (form the task-types.c testcase) was merged into
the print_ids() function (in callback.h). Testing of ompt_get_task_info() was
added to the task-types.c testcase. It was not tested extensively previously.

Differential Revision: https://reviews.llvm.org/D42472

llvm-svn: 326338
2018-02-28 17:36:18 +00:00
Jonas Hahnfeld 82768d0ba1 [OMPT] Fix parallel_data in implicit barrier-end
This is required to be NULL for implicit barriers at the end of a
parallel region. Noticed in review of D43191.

Differential Revision: https://reviews.llvm.org/D43308

llvm-svn: 325922
2018-02-23 16:46:25 +00:00
Joachim Protze b0e4f87fb0 [OMPT] Omissionin in OMPT Formatting
Applying clang-format to the /runtime/src/ folder

Differential Revision: https://reviews.llvm.org/D42169

llvm-svn: 325424
2018-02-17 09:54:10 +00:00
Joachim Protze 76899b84fe [OMPT] Update api_calls testcase
Only use ompt_ functions when testing OMPT in api_calls testcase.
Add size parameter to print_list.
Fix small bug in implementation of ompt_get_partition_place_nums(): return correct length.

Differential Revision: https://reviews.llvm.org/D42162

llvm-svn: 325422
2018-02-17 09:40:02 +00:00
Joachim Protze 2a20299f91 [OMPT] Fix tool initialization returning 0
If tool initialization returns 0, OMPT should not be active. The current
implementation provided some callback invocations in this case.

Differential Revision: https://reviews.llvm.org/D42709

llvm-svn: 324320
2018-02-06 08:41:27 +00:00
Joachim Protze e6269e3509 Partial revert of [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl
The previous commit did not revert all replaced ompt_mutex_impl_unknown.

llvm-svn: 322631
2018-01-17 11:13:11 +00:00
Joachim Protze 1b2bd2680b [OMPT] Rename ompt_mutex_impl_t to kmp_mutex_impl
The defintion is not part of the spec and thus should not have the prefix
"ompt_" but rather a prefix that indicates that this is implementation
specific.

Differential Revision: https://reviews.llvm.org/D41166

llvm-svn: 322621
2018-01-17 10:06:01 +00:00
Joachim Protze 1dc2afdcaf [OMPT] Return appropiate values for ompt runtime entry points for non-OpenMP threads
When the current thread is not an (initialized) OpenMP thread, the runtime
entry points return values that correspond to "not available" or similar

Differential Revision: https://reviews.llvm.org/D41167

llvm-svn: 322620
2018-01-17 10:05:55 +00:00
Andrey Churbanov 5388acd3de Fixed libomp static build broken by the commit rL322202.
Patch by simone <simone@cs.utah.edu>.

Differential Revision: https://reviews.llvm.org/D41945

llvm-svn: 322282
2018-01-11 15:09:49 +00:00
Jonathan Peyton 79390ad709 Force HWLOC topology method for NUMA-specific topology
If user requested affinity with granularity=tile we need to either use HWLOC
or ignore the request. The change allows user to not specify
KMP_TOPOLOGY_METHOD=hwloc and choose it automatically instead.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D40905

llvm-svn: 322205
2018-01-10 18:31:49 +00:00
Jonathan Peyton 1800ecec70 Simplify __kmp_expand_threads
This change simplifies __kmp_expand_threads to take a single argument.
Previously, it allowed two arguments and had logic to decide on different
potential expansion sizes. However, no calls to __kmp_expand_threads in the
runtime make use of this extra logic. Thus the extra argument and logic is
removed here.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D41836

llvm-svn: 322204
2018-01-10 18:27:01 +00:00
Jonathan Peyton bff8ded906 Minor code cleanup
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D41831

llvm-svn: 322203
2018-01-10 18:24:09 +00:00
Jonathan Peyton eaa9e40c9a Improve stability of the runtime in parent/child processes
This change improves stability of the runtime when the application forks child
processes.  Acquiring/releasing __kmp_initz_lock and __kmp_forkjoin_lock in the
atfork handlers insures that the actual fork does not occur while those two
locks are held, and __kmp_itt_reset() reverts the itt's global state to the
initial state which also initializes the mutex stored in the global state.
Some missing initialization code was also inserted in the child's atfork handler.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D41462

llvm-svn: 322202
2018-01-10 18:21:48 +00:00
Joachim Protze 14b512e20c [OMPT] Fix ompt_task_data handling in implicit barriers
Changes to task_data in barrier-begin were not visible at barrier-end

Differential Revision: https://reviews.llvm.org/D41176

llvm-svn: 322178
2018-01-10 12:51:27 +00:00
Paul Osmialowski 6db41e608f Fix type mismatch in omp_control_tool() implementation that makes it run incorrectly on 32-bit machines.
Differential Revision: https://reviews.llvm.org/D41854

llvm-svn: 322068
2018-01-09 10:54:06 +00:00
Jonas Hahnfeld 3ffca790f6 Correct types of pointers to doacross_num_done
This field is defined as kmp_int32, so we should use neither
pointers to kmp_int64 nor 64 bit atomic instructions.
(Found while testing on a Raspberry Pi, 32 bit ARM)

Differential Revision: https://reviews.llvm.org/D41656

llvm-svn: 321964
2018-01-07 16:54:36 +00:00
Jonathan Peyton 97f4320086 Fix some comments and formatting in kmp_dispatch.cpp
llvm-svn: 321831
2018-01-04 23:05:26 +00:00
Jonathan Peyton 8c432f2d5e Fix trademarks found by scanner
llvm-svn: 321827
2018-01-04 22:56:47 +00:00
Joachim Protze 265fb584a5 [OMPT] Set and reset frame address when creating a task with dependences
As for normal task creation, the task frame addresses need to be stored
for the encountering task.

Differential Revision: https://reviews.llvm.org/D41165

llvm-svn: 321421
2017-12-24 07:30:23 +00:00
Joachim Protze 25aa3ec1c5 Remove unused positional argument for printf
The format string for hints only prints the second argument (string) and drops
the first argument (hint id). Depending on how you read the POSIX text for
printf, this could be valid. But for practical reason, i.e., unpacking the
va_list passed to printf based on the formating information, it makes sense
to fix the implementation and not pass the id for hint.

Failing testcases were:

misc_bugs/teams-reduction.c
ompt/parallel/not_enough_threads.c

Differential Revision: https://reviews.llvm.org/D41504

llvm-svn: 321361
2017-12-22 16:40:26 +00:00
Joachim Protze f375f4b49a [OMPT] Add missing ompt_get_num_procs function
This function is defined in OpenMP-TR6 section 4.1.5.1.6
The functions was not implemented yet.

Since ompt-functions can only be called after the runtime was initialized and
has loaded a tool, it can assume the runtime to be initialized. In contrast
to omp_get_num_procs which needs to check whether the runtime is initialized.

Differential Revision: https://reviews.llvm.org/D40949

llvm-svn: 321269
2017-12-21 14:36:30 +00:00
Joachim Protze f8d22f9db8 [OMPT] Fix return address handling in a few GOMP interface methods
This revision fixes failing testcases with parallel for loops and the gomp
interface. The return address needs to be stored at entry to runtime.
The storage is cleared on usage, so we need to update the storage before
calling again internal functions, that will trigger event callbacks.

Differential Revision: https://reviews.llvm.org/D41181

llvm-svn: 321265
2017-12-21 13:55:39 +00:00
Joachim Protze 4fe83593eb [OMPT] Handle null pointer in set_callback to improve performance
We use the bitmap ompt_enabled thoughout the runtime, to avoid loading the
vector of callback functions when testing if specific code should be executed.
Before invoking an event callback function, the pointer is tested for NULL.

This revision resets the corresponding bit in ompt_enabled to 0 if
NULL is passed in set_callback.

Differential Revision: https://reviews.llvm.org/D41171

llvm-svn: 321264
2017-12-21 13:55:34 +00:00
Paul Osmialowski 7634f7093a [AArch64] fix an issue with older /proc/cpuinfo layout
There are two /proc/cpuinfo layots in use for AArch64: old and new.
The old one has all 'processor : n' lines in one section, hence
checking for duplications does not make sense.

Differential Revision: https://reviews.llvm.org/D41000

llvm-svn: 320593
2017-12-13 16:12:24 +00:00
Jonas Hahnfeld e628ab4c65 Use hyperbarrier by default on all architectures
All architectures except x86_64 used the linear barrier implementation
by default which doesn't give good performance for a larger number
of threads.

Improvements for PARALLEL overhead (EPCC) with this patch on a Power8
system (2 sockets x 10 cores x 8 threads, OMP_PLACES=cores)

 20 threads:  4.55us -> 3.49us
 40 threads:  8.84us -> 4.06us
 80 threads: 19.18us -> 4.74us
160 threads: 54.22us -> 6.73us

Differential Revision: https://reviews.llvm.org/D40358

llvm-svn: 320152
2017-12-08 15:07:07 +00:00
Jonas Hahnfeld ce528acf0d Fix thread affinity on non-x86 Linux
To make thread affinity work according to the OpenMP spec, the
runtime needs information about the hardware topology. On Linux
the default way is to parse /proc/cpuinfo which contains this
information for x86 machines but (at least) not for AArch64 and
Power architectures.

Fortunately, there is a different code path which is able to get
that data from sysfs. The needed patch has landed in 2006 for
Linux 2.6.16 which is safe to assume nowadays (even RHEL 5 had
a kernel version derived from 2.6.18, and we are now at RHEL 7!).

Differential Revision: https://reviews.llvm.org/D40357

llvm-svn: 320151
2017-12-08 15:07:05 +00:00
Jonas Hahnfeld 86c307821c Add missing memory barrier for queuing locks
Otherwise I see hangs in the omp_single_copyprivate test when
compiling in release mode. With the debug assertions, I get a
failure `head > 0 && tail > 0`.

Differential Revision: https://reviews.llvm.org/D40722

llvm-svn: 320150
2017-12-08 15:07:02 +00:00
Jonathan Peyton ebbcb43976 [OpenMP] Add entry for Intel Compiler 18
Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D40386

llvm-svn: 319961
2017-12-06 21:15:28 +00:00
Jonathan Peyton 125203e003 Eliminate double printing of verbose affinity settings
Redundant extra verbose output of binding to full mask in case
affinity=balanced or OMP_PLACES=<any> or OMP_PROC_BIND=<any>

Differential Revision: https://reviews.llvm.org/D40624

llvm-svn: 319960
2017-12-06 21:07:41 +00:00
Jonathan Peyton ec5b87188d Trivial enum fix
This change is a trivial fix for enums that removes specification of "last" or
"upper" values, or other boundary values. This simplifies the code in places,
and results in never needing to update the "upper" values again.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D40804

llvm-svn: 319957
2017-12-06 21:02:15 +00:00
Jonas Hahnfeld a4ca525c1b Fix PR30890: Reduction across teams hangs
__kmpc_reduce_nowait() correctly swapped the teams for reductions
in a teams construct. Apply the same logic to __kmpc_reduce() and
__kmpc_reduce_end().

Differential Revision: https://reviews.llvm.org/D40753

llvm-svn: 319788
2017-12-05 16:51:24 +00:00
Andrey Churbanov a5868215b4 Extension of HWLOC topology discovery with NUMA nodes and tiles
Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40309

llvm-svn: 319422
2017-11-30 11:51:47 +00:00
Jonathan Peyton ba55a7b958 Make kmp_r_sched_t into a union
This change makes kmp_r_sched_t type into a union for simpler
comparisons and assignments

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D40374

llvm-svn: 319379
2017-11-29 22:47:52 +00:00
Jonathan Peyton 62da55020b Fix aligned memory allocation in the stub library
kmp_aligned_malloc() always returned NULL on Windows (stub library only)
that may cause Fortran application crash.  With this change all memory
allocation functions were fixed to use aligned{m,re,rec}alloc() to
allocate/reallocate memory. To deallocate that memory _aligned_free() is
used in kmp_free().

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40296

llvm-svn: 319375
2017-11-29 22:29:38 +00:00
Jonathan Peyton 64249504b5 Warning is emitted when tiles are requested but cannot be used
Added two warnings:
1) Before building the topology map check if tiles are requested but the
   topo method is not hwloc;
2) After building the topology map check if tiles are requested but not
   detected by the library.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40340

llvm-svn: 319374
2017-11-29 22:27:18 +00:00
Jonathan Peyton 92ce4bcfd8 Fix types of Fortran array elements
Fortran array elements made default integer in OMP_GET_PLACE_PROC_IDS and
OMP_GET_PARTITION_PLACE_NUMS subroutines, otherwise call to them produces
incorrect result.

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D40356

llvm-svn: 319372
2017-11-29 22:23:44 +00:00
Jonas Hahnfeld 5af381acad [CMake] Refactor common settings and flags
These are needed by both libraries, so we can do that in a
common namespace and unify configuration parameters.
Also make sure that the user isn't requesting libomptarget
if the library cannot be built on the system. Issue an error
in that case.

Differential Revision: https://reviews.llvm.org/D40081

llvm-svn: 319342
2017-11-29 19:31:48 +00:00
Jonas Hahnfeld 3e921d3c52 [CMake] Disallow direct configuration
As a first step, this allows us to generalize the detection of
standalone builds and make it fully compatible when building in
llvm/runtimes/ which automatically sets OPENMP_STANDLONE_BUILD.

Differential Revision: https://reviews.llvm.org/D40080

llvm-svn: 319341
2017-11-29 19:31:43 +00:00
Jonas Hahnfeld 221e7bb1fc Fix for OMP doacross implementation on Power
Power has a weak consistency model so we need memory barriers to
make writes (both from runtime and from user code) available for
all threads.

Differential Revision: https://reviews.llvm.org/D40175

llvm-svn: 318848
2017-11-22 17:15:20 +00:00
Andrey Churbanov 58acafc424 Fixed OMP doacross implementation on 32-bit platforms.
Differential Revision: https://reviews.llvm.org/D40171

llvm-svn: 318658
2017-11-20 16:00:42 +00:00
Andrey Churbanov a756cb240a Exclude untied tasks from checking of task scheduling constraint (TSC).
This can improve performance of tests with untied tasks.

Differential Revision: https://reviews.llvm.org/D39613

llvm-svn: 318388
2017-11-16 10:45:07 +00:00
Jonas Hahnfeld d0ef19ef9b [OMPT] Provide initialization for Mac OS X
Traditionally, the library had a weak symbol for ompt_start_tool()
that served as fallback and disabled OMPT if called. Tools could
provide their own version and replace the default implementation
to register callbacks and lookup functions. This mechanism has
worked reasonably well on Linux systems where this interface was
initially developed.

On Darwin / Mac OS X the situation is a bit more complicated and
the weak symbol doesn't work out-of-the-box. In my tests, the
library with the tool needed to link against the OpenMP runtime
to make the process work. This would effectively mean that a tool
needed to choose a runtime library whereas one design goal of the
interface was to allow tools that are agnostic of the runtime.

The solution is to use dlsym() with the argument RTLD_DEFAULT so
that static implementations of ompt_start_tool() are found in the
main executable. This works because the linker on Mac OS X includes
all symbols of an executable in the global symbol table by default.
To use the same code path on Linux, the application would need to
be built with -Wl,--export-dynamic. To avoid this restriction, we
continue to use weak symbols on Linux systems as before.

Finally this patch extends the existing test to cover all possible
ways of initializing the tool as described by the standard. It
also fixes ompt_finalize() to not call omp_get_thread_num() when
the library is shut down which resulted in hangs on Darwin.
The changes have been tested on Linux to make sure that it passes
the current tests as well as the newly extended one.

Differential Revision: https://reviews.llvm.org/D39801

llvm-svn: 317980
2017-11-11 13:59:48 +00:00
Joachim Protze 91732475a6 [OMPT] Fix assertion for OpenMP code generated with outdated compilers
For up-to-date compilers, this assertion is reasonable, but it breaks
compatibility with the typical compiler installed on most systems.
This patch changes the default value to what we had when there was no
compiler support. A warning about the outdated compiler is printed during
runtime, when this point is reached.

Differential Revision: https://reviews.llvm.org/D39890

llvm-svn: 317928
2017-11-10 21:07:01 +00:00
Jonas Hahnfeld e9b7c0a392 Add const to some variables to avoid const_casts
In these places the const attribute seems correct and doesn't
need any other change, so let's do it.

Differential Revision: https://reviews.llvm.org/D39756

llvm-svn: 317798
2017-11-09 15:52:29 +00:00
Jonas Hahnfeld aeb40adabf Remove const from variables with dynamic memory
Allocated memory is typically not 'const' if it needs to be freed.
This patch removes around 50 wrong const attributes, modifies the
corresponding functions and finally gets rid of some const_casts.
These have especially been strange for __kmp_str_fname_free() that
added a 'const' to call __kmp_str_free() which removed it again.

Two minor cleanups that I performed in this process:
 * __kmp_tool_libraries now lives in kmp_settings.cpp as it is
   used nowhere else.
 * __kmp_msg_empty was removed as it was never used and Clang
   now complained that it was assigned a string literal that
   is 'const char *'.

Differential Revision: https://reviews.llvm.org/D39755

llvm-svn: 317797
2017-11-09 15:52:25 +00:00
Jonathan Peyton 40039ac98c Cleanup version symbol macros and attributes/declspecs
1) Get rid of xaliasify, xexpand and xversionify for KMP_EXPAND_NAME and
KMP_VERSION_SYMBOL. KMP_VERSION_SYMBOL is a combination of xaliasify and
xversionify.

2) Put all attribute and __declspec definitions in kmp_os.h

Differential Revision: https://reviews.llvm.org/D39516

llvm-svn: 317636
2017-11-07 23:32:13 +00:00
Jonas Hahnfeld 13dc13ef09 [OMPT] Improve cast that was lost on commit, NFC.
llvm-svn: 317480
2017-11-06 14:33:09 +00:00
Joachim Protze cab9cdc2ad Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)
The TR6 document is expected to be publically released around November 15.
This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D39182

llvm-svn: 317436
2017-11-05 14:11:19 +00:00
Joachim Protze c255ca70ce Rename fields of ompt_frame_t
This is part of the renaming of data types from OpenMP TR4 to TR6

Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D39326

llvm-svn: 317435
2017-11-05 14:11:10 +00:00
Jonas Hahnfeld b71424fda5 Revert "Rename fields of ompt_frame_t"
This reverts commit r317338 which discarded some recent commits.

llvm-svn: 317347
2017-11-03 18:28:25 +00:00
Jonas Hahnfeld f0a1c65fb0 Revert "Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)"
This reverts commit r317339 which discarded some recent commits.

llvm-svn: 317346
2017-11-03 18:28:19 +00:00
Joachim Protze 924cff0a39 Updating implementation of OMPT as specified in OpenMP 5.0 Preview 2 (TR6)
The TR6 document is expected to be publically released around November 15.
This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D39182

llvm-svn: 317339
2017-11-03 17:09:00 +00:00
Joachim Protze 741572593f Rename fields of ompt_frame_t
This is part of the renaming of data types from OpenMP TR4 to TR6

Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D39326

llvm-svn: 317338
2017-11-03 17:08:40 +00:00
Jonathan Peyton 3d18a37ca9 [OpenMP] Fix race condition in omp_init_lock
This is a partial fix for bug 34050.

This prevents callers of omp_set_lock (which does not hold __kmp_global_lock)
from ever seeing an uninitialized version of __kmp_i_lock_table.table.

It does not solve a use-after-free race condition if omp_set_lock obtains a
pointer to __kmp_i_lock_table.table before it is updated and then attempts to
dereference afterwards. That race is far less likely and can be handled in a
separate patch.

The unit test usually segfaults on the current trunk revision. It passes with
the patch.

Patch by Adam Azarchs

Differential Revision: https://reviews.llvm.org/D39439

llvm-svn: 317115
2017-11-01 19:44:42 +00:00
Joachim Protze 82e94a5934 Update implementation of OMPT to the specification OpenMP 5.0 Preview 1 (TR4).
The code is tested to work with latest clang, GNU and Intel compiler. The implementation
is optimized for low overhead when no tool is attached shifting the cost to execution with
tool attached.

This patch does not implement OMPT for libomptarget.

Patch by Simon Convent and Joachim Protze

Differential Revision: https://reviews.llvm.org/D38185

llvm-svn: 317085
2017-11-01 10:08:30 +00:00
Jonathan Peyton 5e6cb9022c Fix fatal error message displaying
Replacing call to __kmp_msg(kmp_ms_fatal,...) with __kmp_fatal(...) caused an
issue when incomplete message is displayed in case an error message is followed
by another message (e.g. by a hint messa)ge. This is because __kmp_fatal()
passes incomplete list of arguments to __kmp_msg().

Patch by Olga Malysheva

Differential Revision: https://reviews.llvm.org/D39248

llvm-svn: 316623
2017-10-25 22:05:02 +00:00
Jonathan Peyton dff0ee2f4e Disable threadprivate data cleanup if runtime is terminating
The problem is due to the runtime's threadprivate cleanup code which tries to
access data that was already destroyed by one of the root threads.
__kmp_init_gtid is used as a checker here since it is set to false before actual
resource cleanup is done in __kmp_cleanup().

Patch by Hansang Bae

llvm-svn: 316452
2017-10-24 16:10:09 +00:00
Jonathan Peyton 7b16ae201f Restrict OMPT to OpenMP version 5.0 and remove old header files
Patch by Simon Convent

Differential Revision: https://reviews.llvm.org/D38876

llvm-svn: 316234
2017-10-20 20:14:46 +00:00
Jonathan Peyton 94a114fc39 Apply formatting changes
.clang-format's comments are removed and a (hopefully) final
set of formatting changes are applied.

Differential Revision: https://reviews.llvm.org/D38837
Differential Revision: https://reviews.llvm.org/D38920

llvm-svn: 316227
2017-10-20 19:30:57 +00:00
Jonathan Peyton 3f850bfcf0 KMP_HW_SUBSET vs KMP_PLACE_THREADS rival envirables fix
If both KMP_HW_SUBSET and KMP_PLACE_THREADS are set and KMP_PLACE_THREADS gets
parsed first, then the current environment variable parser rejects both and
neither get used. This patch uses the rivals mechanism that is used for other
environment variable groups (e.g., KMP_STACKSIZE, GOMP_STACKSIZE, OMP_STACKSIZE).
If both are set, then it tells the user that it is ignoring KMP_PLACE_THREADS in
favor of KMP_HW_SUBSET. The message about deprecating KMP_PLACE_THREADS when it
is set is still printed regardless.

Differential Revision: https://reviews.llvm.org/D38292

llvm-svn: 315091
2017-10-06 19:23:19 +00:00
Jonathan Peyton bd3a7633f1 Remove unnecessary semicolons
Removes semicolons after if {} blocks, function definitions, etc.
I was able to apply the large OMPT patch cleanly on top of this one
with no conflicts.

llvm-svn: 314340
2017-09-27 20:36:27 +00:00
Jonathan Peyton 8f3d7448b9 Allow printing of KMP_TOPOLOGY_METHOD when KMP_SETTINGS=true
llvm-svn: 314243
2017-09-26 20:33:53 +00:00
Jonathan Peyton 6de85b1565 Remove unused t_single_lock
Add padding inside team structure to keep same structure size.

llvm-svn: 314242
2017-09-26 20:12:16 +00:00
Jonathan Peyton 52527cd2c1 Read blocktime value set by kmp_set_blocktime() before reading from KMP_BLOCKTIME
Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D37403

llvm-svn: 312539
2017-09-05 15:45:48 +00:00
Jonathan Peyton 6a393f75f4 Minor code cleanup of Klocwork issues
Minor code cleanup of Klocwork issues. Fatal messages are given no return
attribute. Define and use KMP_NORETURN to work for multiple C++ versions.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D37275

llvm-svn: 312538
2017-09-05 15:43:58 +00:00
Jonathan Peyton 0447708f8d Use va_copy instead of __va_copy to fix building libomp against musl libc
Fixes https://bugs.llvm.org/show_bug.cgi?id=34040

Patch by Peter Levine

Differential Revision: https://reviews.llvm.org/D36343

llvm-svn: 311269
2017-08-19 23:53:36 +00:00
Jonathan Peyton d4daf4540a Remove BUILD_TV
Cleanup code to remove BUILD_TV and unused code bracketed by it.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D36011

llvm-svn: 311114
2017-08-17 19:09:28 +00:00
Paul Osmialowski a016279422 OMP_PROC_BIND: better spread
This change improves the way threads are spread across cores
when OMP_PROC_BIND=spread is set and no unusual affinity masks are in use.

Differential Revision: https://reviews.llvm.org/D36510

llvm-svn: 310670
2017-08-10 23:04:11 +00:00
Jonathan Peyton 1b536724d9 Move lock acquire/release functions in task deque cleanup code
The original locations can be reached without initializing the lock variable
(td_deque_lock), so it is potentially unsafe.  It is guaranteed that the lock
is initialized if the deque (td_deque) is not NULL, and lock functions can be
safely called.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D36017

llvm-svn: 309875
2017-08-02 20:06:32 +00:00