This patch is child of D89671, contains the clang
implementation to use the OpenMP IRBuilder's section
construct.
Co-author: @anchu-rajendran
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D91054
For combined worksharing directives need to emit the temp arrays outside
of the parallel region and update them in the master thread only.
Differential Revision: https://reviews.llvm.org/D100187
Need to bitcast the function pointer passed as a parameter to the real
type to avoid possible problem with calling conventions.
Differential Revision: https://reviews.llvm.org/D99521
Need to cast the argument for the debug wrapper function call to the
corresponding parameter type to avoid crash.
Differential Revision: https://reviews.llvm.org/D99617
Need to insert a basic block during generation of the target region to
avoid crash for the GPU to be able always calling a cleanup action.
This cleanup action is required for the correct emission of the target
region for the GPU.
Differential Revision: https://reviews.llvm.org/D99445
In order to prevent further building issues related to the usage of SmallVector
in other compilation unit, this patch adjusts the llvm.h header as a workaround
instead.
Besides, this patch reverts previous workarounds:
1. Revert "[NFC] Use llvm::SmallVector to workaround XL compiler problem on AIX"
This reverts commit 561fb7f60a.
2.Revert "[clang][cli] Fix build failure in CompilerInvocation"
This reverts commit 8dc70bdcd0.
Differential Revision: https://reviews.llvm.org/D98552
LLVM is recommending to use SmallVector (that is, omitting the N), in the
absence of a well-motivated choice for the number of inlined elements N.
However, this doesn't work well with XL compiler on AIX since some header(s)
aren't properly picked up with it. We need to take a further look into the real
issue underneath and fix it in a later patch.
But currently we'd like to use this patch to unblock the build compiler issue
first.
Differential Revision: https://reviews.llvm.org/D98265
Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to:
* Only the worksharing-loop directive.
* Recognizes only the nowait clause.
* No loop nests with more than one loop.
* Untested with templates, exceptions.
* Semantic checking left to the existing infrastructure.
This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics:
* The distance function: How many loop iterations there will be before entering the loop nest.
* The loop variable function: Conversion from a logical iteration number to the loop variable.
These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang.
The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does.
For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting.
Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D94973
The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard.
This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult.
A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once.
I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest).
Differential Revision: https://reviews.llvm.org/D76342
OpenMPIRBuilder::createParallel outlines the body region of the parallel
construct into a new function that accepts any value previously defined outside
the region as a function argument. This function is called back by OpenMP
runtime function __kmpc_fork_call, which expects trailing arguments to be
pointers. If the region uses a value that is not of a pointer type, e.g. a
struct, the produced code would be invalid. In such cases, make createParallel
emit IR that stores the value on stack and pass the pointer to the outlined
function instead. The outlined function then loads the value back and uses as
normal.
Reviewed By: jdoerfert, llitchev
Differential Revision: https://reviews.llvm.org/D92189
For consistency with the IRBuilder, OpenMPIRBuilder has method names starting with 'Create'. However, the LLVM coding style has methods names starting with lower case letters, as all other OpenMPIRBuilder already methods do. The clang-tidy configuration used by Phabricator also warns about the naming violation, adding noise to the reviews.
This patch renames all `OpenMPIRBuilder::CreateXYZ` methods to `OpenMPIRBuilder::createXYZ`, and updates all in-tree callers.
I tested check-llvm, check-clang, check-mlir and check-flang to ensure that I did not miss a caller.
Reviewed By: mehdi_amini, fghanim
Differential Revision: https://reviews.llvm.org/D91109
In current implementation, if it requires an outer task, the mapper array will be privatized no matter whether it has mapper. In fact, when there is no mapper, the mapper array only contains number of nullptr. In the libomptarget, the use of mapper array is `if (mappers_array && mappers_array[i])`, which means we can directly set mapper array to nullptr if there is no mapper. This can avoid unnecessary data copy.
In this patch, the data privatization will not be emitted if the mapper array is nullptr. When it comes to the emit of task body, the nullptr will be used directly.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90101
Local vars, marked with pragma allocate, mustbe allocate by the call of
the runtime function and cannot be allocated as other local variables.
Instead, we allocate a space for the pointer in private record and store
the address, returned by kmpc_alloc call in this pointer.
So, for untied tasks
```
#pragma omp task untied
{
S s;
#pragma omp allocate(s) allocator(allocator)
s = x;
}
```
compiler generates something like this:
```
struct task_with_privates {
S *ptr;
};
void entry(task_with_privates *p) {
S *s = p->s;
switch(partid) {
case 1:
p->s = (S*)kmpc_alloc();
kmpc_omp_task();
br exit;
case 2:
*s = x;
kmpc_omp_task();
br exit;
case 2:
~S(s);
kmpc_free((void*)s);
br exit;
}
exit:
}
```
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86558
This is recommit of 6c8041aa0f, reverted in de044f7562 because of some
fails. Original commit message is below.
This change allow a CastExpr to have optional FPOptionsOverride object,
stored in trailing storage. Of all cast nodes only ImplicitCastExpr,
CStyleCastExpr, CXXFunctionalCastExpr and CXXStaticCastExpr are allowed
to have FPOptions.
Differential Revision: https://reviews.llvm.org/D85960
This change allow a CastExpr to have optional FPOptionsOverride object,
stored in trailing storage. Of all cast nodes only ImplicitCastExpr,
CStyleCastExpr, CXXFunctionalCastExpr and CXXStaticCastExpr are allowed
to have FPOptions.
Differential Revision: https://reviews.llvm.org/D85960
Fixes issue noticed by static analysis where we have a copy+paste typo, testing ScheduleKind.M1 twice instead of ScheduleKind.M2.
Differential Revision: https://reviews.llvm.org/D87250
In untied tasks, need to allocate the space for local variales, declared
in task region, when the memory for task data is allocated. THe function
can be interrupted and we can exit from the function in untied task
switch. Need to keep the state of the local variables in this case.
Also, the compiler should not call cleanup when exiting in untied task
switch until the real exit out of the declaration scope is met during
execution.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D84457
Summary:
In untied tasks, need to allocate the space for local variales, declared
in task region, when the memory for task data is allocated. THe function
can be interrupted and we can exit from the function in untied task
switch. Need to keep the state of the local variables in this case.
Also, the compiler should not call cleanup when exiting in untied task
switch until the real exit out of the declaration scope is met during
execution.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D84457
Summary:
Introduced OMPChildren class to handle all associated clauses, statement
and child expressions/statements. It allows to represent some directives
more correctly (like flush, depobj etc. with pseudo clauses, ordered
depend directives, which are standalone, and target data directives).
Also, it will make easier to avoid using of CapturedStmt in directives,
if required (atomic, tile etc. directives).
Also, it simplifies serialization/deserialization of the
executable/declarative directives.
Reduces number of allocation operations for mapper declarations.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83261
Without this patch, the following example fails but shouldn't
according to OpenMP TR8:
```
#pragma omp target enter data map(alloc:i)
#pragma omp target data map(present, alloc: i)
{
#pragma omp target exit data map(delete:i)
} // fails presence check here
```
OpenMP TR8 sec. 2.22.7.1 "map Clause", p. 321, L23-26 states:
> If the map clause appears on a target, target data, target enter
> data or target exit data construct with a present map-type-modifier
> then on entry to the region if the corresponding list item does not
> appear in the device data environment an error occurs and the
> program terminates.
There is no corresponding statement about the exit from a region.
Thus, the `present` modifier should:
1. Check for presence upon entry into any region, including a `target
exit data` region. This behavior is already implemented correctly.
2. Should not check for presence upon exit from any region, including
a `target` or `target data` region. Without this patch, this
behavior is not implemented correctly, breaking the above example.
In the case of `target data`, this patch fixes the latter behavior by
removing the `present` modifier from the map types Clang generates for
the runtime call at the end of the region.
In the case of `target`, we have not found a valid OpenMP program for
which such a fix would matter. It appears that, if a program can
guarantee that data is present at the beginning of a `target` region
so that there's no error there, that data is also guaranteed to be
present at the end. This patch adds a comment to the runtime to
document this case.
Reviewed By: grokos, RaviNarayanaswamy, ABataev
Differential Revision: https://reviews.llvm.org/D84422
We need to keep track of the alloca insertion point (which we already
communicate via the callback to the user) as we place allocas as well.
Reviewed By: fghanim, SouraVX
Differential Revision: https://reviews.llvm.org/D82470
This patch implements the code generation to use OpenMP 5.0 declare mapper (a.k.a. user-defined mapper) constructs.
Patch written by Lingda Li.
Differential Revision: https://reviews.llvm.org/D67833
Summary:
Need to privatize addresses of the captured variables when trying to
emit the body of the target data directive in no target codegen mode.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83478
Summary:
D82193 exposed a problem with global type definitions in
`OMPConstants.h`. This causes a race when running in thinLTO mode.
Types now live inside of OpenMPIRBuilder to prevent this from happening.
Reviewers: jdoerfert
Subscribers: yaxunl, hiraditya, guansong, dexonsmith, aaron.ballman, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D83176
Summary:
This patch is removing the custom enumeration for OpenMP Directives and Clauses and replace them
with the newly tablegen generated one from llvm/Frontend. This is a first patch and some will follow to share the same
infrastructure where possible. The next patch should use the clauses allowance defined in the tablegen file.
Reviewers: jdoerfert, DavidTruby, sscalpone, kiranchandramohan, ichoyjx
Reviewed By: DavidTruby, ichoyjx
Subscribers: jholewinski, cfe-commits, dblaikie, MaskRay, ymandel, ichoyjx, mgorny, yaxunl, guansong, jfb, sstefan1, aaron.ballman, llvm-commits
Tags: #llvm, #flang, #clang
Differential Revision: https://reviews.llvm.org/D82906
Summary:
Modified the OMPBuilderCBHelpers in the following ways:
- Moved location of class definition and deleted all constructors
- Moved OpenMP-specific address allocation of local variables
- Moved threadprivate variable creation for the current thread
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79676
This reverts commit defd43a5b3.
with correction to solve msan report
To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the
floating point settings in PCH files aren't compatible, rewrite
FPFeatures to use a delta in the settings rather than absolute settings.
With this patch, these floating point options can be benign.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D81869
This reverts commit b55d723ed6.
Reapply Modify FPFeatures to use delta not absolute settings
To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the
floating point settings in PCH files aren't compatible, rewrite
FPFeatures to use a delta in the settings rather than absolute settings.
With this patch, these floating point options can be benign.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D81869
Summary:
According to OpenMP 5.0, nonmonotonic modifier can be used with all
schedule kinds, not only dynamic and guided as in OpenMP 4.5.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82026
Rather than pushing inactive cleanups for the block captures at the
entry of a full expression and activating them during the creation of
the block literal, just call pushLifetimeExtendedDestroy to ensure the
cleanups are popped at the end of the scope enclosing the block
expression.
rdar://problem/63996471
Differential Revision: https://reviews.llvm.org/D81624
Summary:
Added codegen for use_device_addr clause. The components of the list
items are mapped as a kind of RETURN components and then the returned
base address is used instead of the real address of the base declaration
used in the use_device_addr expressions.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80730
Fixes PR45753
When a program that contains a loop to which both `omp parallel for`
pragma and `clang loop` pragma are associated is compiled with the
-fopenmp option, `clang loop` pragma did not take effect. The example
below should not be vectorized by the `clang loop` pragma but it was
actually vectorized. The cause is that `llvm.loop.vectorize.width`
was not output to the IR when -fopenmp is specified.
The fix attaches attributes if they exist for the loop.
[example.c]
```
int a[100], b[100];
void foo() {
#pragma omp parallel for
#pragma clang loop vectorize(disable)
for (int i = 0; i < 100; i++)
a[i] += b[i] * i;
}
```
[compile]
```
$ clang -O2 -fopenmp example.c -c -Rpass=vect
example.c:3:11: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
#pragma omp parallel for
^
```
[IR with -fopenmp]
```
$ clang -O2 exmaple.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fopenmp | grep 'vectorize\.width'
```
[IR with -fno-openmp]
```
$ clang -O2 example.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fno-openmp | grep 'vectorize\.width'
!7 = !{!"llvm.loop.vectorize.width", i32 1}
```
Differential Revision: https://reviews.llvm.org/D79921
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.
See also D80072.
Differential Revision: https://reviews.llvm.org/D80166
test cases
Add support for #pragma float_control
Reviewers: rjmccall, erichkeane, sepavloff
Differential Revision: https://reviews.llvm.org/D72841
This reverts commit 85dc033cac, and makes
corrections to the test cases that failed on buildbots.
Summary:
Patch forces codegen to use the new runtime functions for task reductions where
the issue with passing the address of the original variables to the UDR
initializers is fixed. Also, this patch is required for upcoming
support of task modifier inreduction clause.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78733
Implemented codegen for the iterator expression in the depend clauses.
Iterator construct is emitted the following way:
iterator(cnt1, cnt2, ...), in : <dep>
<TotalNumDeps> = <cnt1_size> * <cnt2_size> * ...;
kmp_depend_t deps[<TotalNumDeps>];
deps_counter = 0;
for (cnt1) {
for (cnt2) {
...
deps[deps_counter].base_addr = &<dep>;
deps[deps_counter].size = sizeof(<dep>);
deps[deps_counter].flags = in;
deps_counter += 1;
...
}
}
For depobj construct the codegen is very similar, but the memory is
allocated dynamically and added extra first item reserved for internal use.
If the ancestor device modifier is used and the value of the device
clause is evaluated to 1, the ancestor device shall be used for the
execution.
Since the reverse offloading is not supported yet, the target construct
execution is always initiated from the host, not from the device. So, if
the ancestor modifier is specified, just execute target region on the
host.
Avoid copying of the orignal variable if it is going to be marked as
firstprivate in task regions. For taskloops, still need to copy the
non-trvially copyable variables to correctly construct them upon task
creation.
Added codegen for update clause in depobj. Reads the number of the
elements from the first element and updates flags for each element in
the loop.
```
omp_depend_t x;
kmp_depend_info *base = (kmp_depend_info *)x;
intptr_t num = x[-1].base_addr;
kmp_depend_info *end = x + num;
kmp_depend_info *el = base;
do {
el.flags = new_flag;
el = &el[1];
} while (el != end);
```
Added codegen for 'depend' clause in depobj directive. The depend clause
is emitted as kmp_depend_info <deps>[<number_of_items_in_clause> + 1]. The
first element in this array is reserved for storing the number of
elements in this array: <deps>[0].base_addr =
<number_of_items_in_clause>;
This extra element is required to implement 'update' and 'destroy'
clauses. It is required to know the size of array to destroy it
correctly and to update depency kind.
This patch introduces a new helper class `OMPBuilderCBHelpers`,
which will contain all reusable C/C++ language specific function-
alities required by the `OMPIRBuilder`.
Initially, this helper class contains the body and finalization
codegen functionalities implemented using callbacks which were
moved here for reusability among the different directives
implemented in the `OMPIRBuilder`, along with RAIIs for preserving
state prior to emitting outlined and/or inlined OpenMP regions.
In the future this helper class will also contain all the different
call backs required by OpenMP clauses/variable privatization.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D74562
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
According to OpenMP 5.0, cancel and cancellation point constructs are
supported in taskloop directive. Added support for cancellation in
taskloop, master taskloop and parallel master taskloop.
directive.
According to OpenMP 5.0, The atomic_default_mem_order clause specifies the default memory ordering behavior for atomic constructs that must be provided by an implementation. If the default memory ordering is specified as seq_cst, all atomic constructs on which memory-order-clause is not specified behave as if the seq_cst clause appears. If the default memory ordering is specified as relaxed, all atomic constructs on which memory-order-clause is not specified behave as if the relaxed clause appears.
If the default memory ordering is specified as acq_rel, atomic constructs on which memory-order-clause is not specified behave as if the release clause appears if the atomic write or atomic update operation is specified, as if the acquire clause appears if the atomic read operation is specified, and as if the acq_rel clause appears if the atomic captured update operation is specified.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
regions.
If the lastprivate conditional is passed as shared in inner region, we
shall check if it was ever changed and use this updated value after exit
from the inner region as an update value.
regions with reductions, lastprivates or linears clauses.
If the lastprivate conditional variable is updated in inner parallel
region with reduction, lastprivate or linear clause, the value must be
considred as a candidate for lastprivate conditional. Also, tracking in
inner parallel regions is not required.
The option will limit debug info by only emitting complete class
type information when its constructor is emitted.
This patch changes comparisons with LimitedDebugInfo to use the new
level instead.
Differential Revision: https://reviews.llvm.org/D72427
Added codegen support for lastprivate conditional. According to the
standard, if when the conditional modifier appears on the clause, if an
assignment to a list item is encountered in the construct then the
original list item is assigned the value that is assigned to the new
list item in the sequentially last iteration or lexically last section
in which such an assignment is encountered.
We look for the assignment operations and check if the left side
references lastprivate conditional variable. Then the next code is
emitted:
if (last_iv_a <= iv) {
last_iv_a = iv;
last_a = lp_a;
}
At the end the implicit barrier is generated to wait for the end of all
threads and then in the check for the last iteration the private copy is
assigned the last value.
if (last_iter) {
lp_a = last_a; // <--- new code
a = lp_a; // <--- store of private value to the original variable.
}
This allows to use the OpenMPIRBuilder for parallel regions. Code was
extracted from D61953 and adapted to work with the new version (D70109).
All but one feature should be supported. An update of this patch will
provide test coverage and privatization other than shared.
Reviewed By: fghanim
Differential Revision: https://reviews.llvm.org/D70290
Summary:
Basic codegen for the declarations marked as nontemporal. Also, if the
base declaration in the member expression is marked as nontemporal,
lvalue for member decl access inherits nonteporal flag from the base
lvalue.
Reviewers: rjmccall, hfinkel, jdoerfert
Subscribers: guansong, arphaman, caomhin, kkwli0, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71708
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
Summary:
The new OpenMPConstants.h is a location for all OpenMP related constants
(and helpers) to live.
This patch moves the directives there (the enum OpenMPDirectiveKind) and
rewires Clang to use the new location.
Initially part of D69785.
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim
Subscribers: jholewinski, ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69853
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
AggValueSlot
This reapplies 8a5b7c3570 after a null
dereference bug in CGOpenMPRuntime::emitUserDefinedMapper.
Original commit message:
This is needed for the pointer authentication work we plan to do in the
near future.
a63a81bd99/clang/docs/PointerAuthentication.rst
According to OpenMP 5.0, if clause can be used in for simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
According to OpenMP 5.0, if clause can be used in simd directive. If
condition in the if clause if false, the non-vectorized version of the
loop must be executed.
Added parsing/sema/codegen support for 'parallel master taskloop'
constructs. Some of the clauses, like 'grainsize', 'num_tasks', 'final'
and 'priority' are not supported in full, only constant expressions can
be used currently in these clauses.
llvm-svn: 374791
The behavior from the original patch has changed, since we're no longer
allowing LLVM to just ignore the alignment. Instead, we're just
assuming the maximum possible alignment.
Differential Revision: https://reviews.llvm.org/D68824
llvm-svn: 374562
The test fails on Windows, with
error: 'warning' diagnostics expected but not seen:
File builtin-assume-aligned.c Line 62: requested alignment
must be 268435456 bytes or smaller; assumption ignored
error: 'warning' diagnostics seen but not expected:
File builtin-assume-aligned.c Line 62: requested alignment
must be 8192 bytes or smaller; assumption ignored
llvm-svn: 374456
Code to handle __builtin_assume_aligned was allowing larger values, but
would convert this to unsigned along the way. This patch removes the
EmitAssumeAligned overloads that take unsigned to do away with this
problem.
Additionally, it adds a warning that values greater than 1 <<29 are
ignored by LLVM.
Differential Revision: https://reviews.llvm.org/D68824
llvm-svn: 374450
We previously failed to treat an array with an instantiation-dependent
but not value-dependent bound as being an instantiation-dependent type.
We now track the array bound expression as part of a constant array type
if it's an instantiation-dependent expression.
llvm-svn: 373685
Runtime function __kmpc_push_tripcount better to call inside of the task
context for target regions. Otherwise, the libomptarget is unable to
link the provided tripcount value for nowait target regions and
completely looses this information.
llvm-svn: 372609
If the variable, used in the loop boundaries, is not captured in the
construct, this variable must be considered as undefined if it was
privatized.
llvm-svn: 372252
construct.
OpenMP 5.0 introduced new clause for declare target directive, device_type clause, which may accept values host, nohost, and any. Host means
that the function must be emitted only for the host, nohost - only for
the device, and any - for both, device and the host.
llvm-svn: 369775
Added basic support for non-rectangular loops. It requires an additional
analysis of min/max boundaries for non-rectangular loops. Since only
linear dependency is allowed, we can do this analysis.
llvm-svn: 368903
Target-based runtime functions use int64_t type for sizes, while the
compiler uses size_t type. It leads to miscompilation in 32 bit mode.
llvm-svn: 364327
If the variable is a firstprivate variable and it was not emitted beause
this a constant variable with the constant initializer, we can use the
initial value instead of the variable itself. It also fixes the problem
with the compiler crash in this case.
llvm-svn: 361564
If the doacross lop construct is used and the loop counter is declare
outside of the loop, the compiler might crash trying to get the address
of the loop counter. Patch fixes this problem.
llvm-svn: 356198