counters.
According to the OpenMP 5.0, For any associated loop where the b or lb
expression is not loop invariant with respect to the outermost loop, the
var-outer that appears in the expression may not have a random access
iterator type.
llvm-svn: 359340
loop nests.
Added a checks that the initializer/condition expressions depend only
only of the single previous loop iteration variable.
llvm-svn: 359200
Without this patch, APSInt inherits APInt::isNegative, which merely
checks the sign bit without regard to whether the type is actually
signed. isNonNegative and isStrictlyPositive call isNegative and so
are also affected.
This patch adjusts APSInt to override isNegative, isNonNegative, and
isStrictlyPositive with implementations that consider whether the type
is signed.
A large set of Clang OpenMP tests are affected. Without this patch,
these tests assume that `true` is not a valid argument for clauses
like `collapse`. Indeed, `true` fails APInt::isStrictlyPositive but
not APSInt::isStrictlyPositive. This patch adjusts those tests to
assume `true` should be accepted.
This patch also adds tests revealing various other similar fixes due
to APSInt::isNegative calls in Clang's ExprConstant.cpp and
SemaExpr.cpp: `++` and `--` overflow in `constexpr`, evaluated object
size based on `alloc_size`, `<<` and `>>` shift count validation, and
OpenMP array section validation.
Reviewed By: lebedev.ri, ABataev, hfinkel
Differential Revision: https://reviews.llvm.org/D59712
llvm-svn: 359012
runtime.
target [teams distribute] simd costructs do not require full runtime for
the correct execution, we can run them without full runtime.
llvm-svn: 358766
Summary: The requires directive containing target related clauses must appear before any target region in the compilation unit.
Reviewers: ABataev, AlexEichenberger, caomhin
Reviewed By: ABataev
Subscribers: guansong, jfb, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60875
llvm-svn: 358709
All target-parallel-based constructs can be run in SPMD mode from now
on. Even if num_threads clauses or if clauses are used, such constructs
can be executed in SPMD mode.
llvm-svn: 358595
Combined constructs with parallel and if clauses without modifiers may
be executed in SPMD mode since if the condition is true for the target
region, it is also true for parallel region and the threads must be run
in parallel.
llvm-svn: 358503
mode.
After the previous patch with the more correct handling of the number of
threads in parallel regions, the parallel regions with num_threads
clauses can be executed in SPMD mode.
llvm-svn: 358445
regions.
Added more complex analysis for number of teams and number of threads in
the target regions, also merged related common code between CGOpenMPRuntime
and CGOpenMPRuntimeNVPTX classes.
llvm-svn: 358126
Added special processing of the memory management directives/clauses for
NVPTX target. For private locals, omp_default_mem_alloc and
omp_thread_mem_alloc result in allocation in local memory.
omp_const_mem_alloc allocates const memory, omp_teams_mem_alloc
allocates shared memory, and omp_cgroup_mem_alloc and
omp_large_cap_mem_alloc allocate global memory.
llvm-svn: 357923
If the pointer is captured by reference, it must be mapped as
_PTR_AND_OBJ kind of mapping to correctly translate the pointer address
on the device.
llvm-svn: 357488
Before this patch, CGLoop would dump all transformations for a loop into
a single LoopID without encoding any order in which to apply them.
rL348944 added the possibility to encode a transformation order using
followup-attributes.
When a loop has more than one transformation, use the follow-up
attribute define the order in which they are applied. The emitted order
is the defacto order as defined by the current LLVM pass pipeline,
which is:
LoopFullUnrollPass
LoopDistributePass
LoopVectorizePass
LoopUnrollAndJamPass
LoopUnrollPass
MachinePipeliner
This patch should therefore not change the assembly output, assuming
that all explicit transformations can be applied, and no implicit
transformations in-between. In the former case,
WarnMissedTransformationsPass should emit a warning (except for
MachinePipeliner which is not implemented yet). The latter could be
avoided by adding 'llvm.loop.disable_nonforced' attributes.
Because LoopUnrollAndJamPass processes a loop nest, generation of the
MDNode is delayed to after the inner loop metadata have been processed.
A temporary LoopID is therefore used to annotate instructions and
RAUW'ed by the actual LoopID later.
Differential Revision: https://reviews.llvm.org/D57978
llvm-svn: 357415
According to OpenMP 5.0, 2.11.4 allocate Clause, Restrictions, allocate
clauses that appear on a target construct or on constructs in a target
region must specify an allocator expression unless a requires directive
with the dynamic_allocators clause is present in the same compilation
unit. Patch adds a check for this restriction.
llvm-svn: 357412
According to OpenMP 5.0 standard, 2.11.4 allocate Clause, Restrictions,
For any list item that is specified in the allocate clause on a
directive, a data-sharing attribute clause that may create a private
copy of that list item must be specified on the same directive. Patch
adds the checks for this restriction.
llvm-svn: 357390
target and task-based directives.
According to OpenMP 5.0, 2.11.4 allocate Clause, Restrictions, For task,
taskloop or target directives, allocation requests to memory allocators
with the trait access set to thread result in unspecified behavior.
Patch introduces a check for omp_thread_mem_alloc predefined allocator
on target- and trask-based directives.
llvm-svn: 357205
dynamic_allocators.
According to the OpenMP 5.0, 2.11.3 allocate Directive, Restrictions,
allocate directives that appear in a target region must specify an
allocator clause unless a requires directive with the dynamic_allocators
clause is present in the same compilation unit. Patch adds a check for a
presence of the requires directive with the dynamic_allocators clause.
llvm-svn: 356758
clause in target region.
According to the OpenMP 5.0, 2.11.3 allocate Directive, Restrictions,
allocate directives that appear in a target region must specify an
allocator clause unless a requires directive with the dynamic_allocators
clause is present in the same compilation unit.
llvm-svn: 356752
For the global variables the allocate directive must specify only the
predefined allocator. This allocator must be translated into the correct
form of the address space for the targets that support different address
spaces.
llvm-svn: 356702
allocators.
It is better to deduce omp_allocator_handle_t type from the predefined
allocators, because omp.h header might not define it explicitly. Plus,
it allows to identify the predefined allocators correctly when trying to
build the allcoator for the global variables.
llvm-svn: 356607
If the allocator was specified for the variable and next one is found
with the different allocator, the warning is emitted, and the allocator
is ignored.
llvm-svn: 356513
According to OpenMP, 2.11.3 allocate Directive, Restrictions, C / C++,
if a list item has a static storage type, the allocator expression in
the allocator clause must be a constant expression that evaluates to
one of the predefined memory allocator values. Added check for this
restriction.
llvm-svn: 356496
Added initial codegen for the local variables with the #pragma omp
allocate directive. Instead of allocating the variables on the stack,
__kmpc_alloc|__kmpc_free functions are used for memory (de-)allocation.
llvm-svn: 356472
If the doacross lop construct is used and the loop counter is declare
outside of the loop, the compiler might crash trying to get the address
of the loop counter. Patch fixes this problem.
llvm-svn: 356198
array.
If the firstprivate variable is a reference, we may incorrectly classify
the kind of the private copy. Use the type of the private copy instead
of the original shared variable.
llvm-svn: 356098
'_openmp_teams_reductions_buffer_$_.
nvlink does not handle weak linkage correctly, same symbols with the
different sizes are reported as erroneous though the largest size must
be chosen instead. Patch fixes this problem by using Internal linkage
instead of the Common.
llvm-svn: 356072
If the variable was declared and marked as declare target, a new offload
entry with size 0 is created. But if later a definition is created and
marked as declare target, this definition is not added to the entry set
and the definition remains not mapped to the target. Patch fixes this
problem allowing to redefine the size and linkage for
previously registered declaration.
llvm-svn: 355960
If the declare target link global is used in the target region
indirectly (used in the inner parallel, teams, etc. regions), we may
miss this variable and it leads to incorrect codegen.
llvm-svn: 355858