The `assumes` directive is an OpenMP 5.1 feature that allows the user to
provide assumptions to the optimizer. Assumptions can refer to
directives (`absent` and `contains` clauses), expressions (`holds`
clause), or generic properties (`no_openmp_routines`, `ext_ABCD`, ...).
The `assumes` spelling is used for assumptions in the global scope while
`assume` is used for executable contexts with an associated structured
block.
This patch only implements the global spellings. While clauses with
arguments are "accepted" by the parser, they will simply be ignored for
now. The implementation lowers the assumptions directly to the
`AssumptionAttr`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D91980
The `assumes` directive is an OpenMP 5.1 feature that allows the user to
provide assumptions to the optimizer. Assumptions can refer to
directives (`absent` and `contains` clauses), expressions (`holds`
clause), or generic properties (`no_openmp_routines`, `ext_ABCD`, ...).
The `assumes` spelling is used for assumptions in the global scope while
`assume` is used for executable contexts with an associated structured
block.
This patch only implements the global spellings. While clauses with
arguments are "accepted" by the parser, they will simply be ignored for
now. The implementation lowers the assumptions directly to the
`AssumptionAttr`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D91980
Support present modifier in defaultmap by adding an extra dimension
for `ImplicitMap`. Therefore, we now create OMPMapClause in `ActOnOpenMPExecutableDirective`
based on both `maptype` and `maptype-modifier`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D92427
The variables used in atomic construct should be captured in outer
task-based regions implicitly. Otherwise, the compiler will crash trying
to find the address of the local variable.
Differential Revision: https://reviews.llvm.org/D92682
Compiler needs to convert some of the loop iteration
variables/conditions to different types for better codegen and it may
lead to spurious warning messages about implicit signed/unsigned
conversions.
Differential Revision: https://reviews.llvm.org/D92655
Reviewed by aaron.ballman, rsmith, wchilders
Highlights of review:
- avoid specifying an underlying type (unless such an enum is stored (or part of an abi?))
- avoid using enums as bit-fields, preferring unsigned bit-fields that we static_cast enumerators to. (MS's abi laysout enum bit-fields differently).
- clang-format, clang-format, clang-format.
https://reviews.llvm.org/D91035
Thank you!
If the variable is implicitly firstprivatized in the inner task-based
region, it also must be firstprivatized in outer task-based regions.
Previously firstprivates were captured in tasks but later it was
optimized to reduce the memory usage. But still need to mark such
variables as implicit firstprivate in outer tasks.
Differential Revision: https://reviews.llvm.org/D91627
In order not to modify the `tgt_target_data_update` information but still be
able to pass the extra information for non-contiguous map item (offset,
count, and stride for each dimension), this patch overload `arg` when
the maptype is set as `OMP_MAP_DESCRIPTOR`. The origin `arg` is for
passing the pointer information, however, the overloaded `arg` is an
array of descriptor_dim:
struct descriptor_dim {
int64_t offset;
int64_t count;
int64_t stride
};
and the array size is the same as dimension size. In addition, since we
have count and stride information in descriptor_dim, we can replace/overload the
`arg_size` parameter by using dimension size.
For supporting `stride` in array section, we use a dummy dimension in
descriptor to store the unit size. The formula for counting the stride
in dimension D_n: `unit size * (D_0 * D_1 ... * D_n-1) * D_n.stride`.
Demonstrate how it works:
```
double arr[3][4][5];
D0: { offset = 0, count = 1, stride = 8 } // offset, count, dimension size always be 0, 1, 1 for this extra dimension, stride is the unit size
D1: { offset = 0, count = 2, stride = 8 * 1 * 2 = 16 } // stride = unit size * (product of dimension size of D0) * D1.stride = 4 * 1 * 2 = 8
D2: { offset = 2, count = 2, stride = 8 * (1 * 5) * 1 = 40 } // stride = unit size * (product of dimension size of D0, D1) * D2.stride = 4 * 5 * 1 = 20
D3: { offset = 0, count = 2, stride = 8 * (1 * 5 * 4) * 2 = 320 } // stride = unit size * (product of dimension size of D0, D1, D2) * D3.stride = 4 * 25 * 2 = 200
// X here means we need to offload this data, therefore, runtime will transfer
// data from offset 80, 96, 120, 136, 400, 416, 440, 456
// Runtime patch: https://reviews.llvm.org/D82245
// OOOOO OOOOO OOOOO
// OOOOO OOOOO OOOOO
// XOXOO OOOOO XOXOO
// XOXOO OOOOO XOXOO
```
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D84192
folding to not constant folding.
Constant folding of ICEs is done as a GCC compatibility measure, but new
code was picking it up, presumably by accident, due to the bad default.
While here, also switch the flag from a bool to an enum to make it more
obvious what it means at call sites. This highlighted a couple of places
where our behavior is different between C++11 and C++14 due to switching
from checking for an ICE to checking for a converted constant
expression (where there is no 'fold' codepath).
This reapplies D88384 with the minor modification that an assertion was
changed to a regular conditional and graceful exit from
ASTContext::mergeTypes.
Especially for templates we need to check at some point if the base
function matches the specialization we might call instead. Before this
lead to the replacement of `std::sqrt(int(2))` calls with one that
converts the argument to a `std::complex<int>`, clearly not the desired
behavior.
Reported as PR47655
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D88384
Need to fix a check for the variable if it is declared in the inner
OpenMP region to be able to firstprivatize it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D88240
Need to fix a check for the variable if it is declared in the inner
OpenMP region to be able to firstprivatize it.
Differential Revision: https://reviews.llvm.org/D88240
No need to make final copy from the firsptrivate/lastprivate copy to the original item if the item is a data memeber.
Firstprivate copy creates a copy by reference and the original item gets
updated correctly when updating the lastprivate shared variable.
Differential Revision: https://reviews.llvm.org/D88179
In CUDA/HIP a function may become implicit host device function by
pragma or constexpr. A host device function is checked in both
host and device compilation. However it may be emitted only
on host or device side, therefore the diagnostics should be
deferred until it is known to be emitted.
Currently clang is only able to defer certain diagnostics. This causes
false alarms and limits the usefulness of host device functions.
This patch lets clang defer all overloading resolution diagnostics for host device functions.
An option -fgpu-defer-diag is added to control this behavior. By default
it is off.
It is NFC for other languages.
Differential Revision: https://reviews.llvm.org/D84364
With this extension the effects of `omp begin declare variant` will be
applied to template function declarations. The behavior is opt-in and
controlled by the `extension(allow_templates)` trait. While generally
useful, this will enable us to implement complex math function calls by
overloading the templates of the standard library with the ones in
libc++.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D85735
This extension allows to declare variants in between `omp begin/end
declare variant` that do not match the type of the existing function
with that name. Without this extension we would not find a base function
(with a compatible type), therefore create a new one, which would
cause conflicting declarations. With this extension we will not create
"missing" base functions, which basically renders these specializations
harmless. They will be generated but never called.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D85878
Due to `omp begin/end declare variant`, OpenMP context selectors can be
nested. This patch adds initial support for this so we can use it for
target math variants. We should improve the detection of "equivalent"
scores and user conditions, we should also revisit the data structures
of the OMPTraitInfo object, however, both are not pressing issues right
now.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D85877
The base must be shared between the threads, threadprivates are not
allowed to be bases for array-like reductions.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85762
This is recommit of 6c8041aa0f, reverted in de044f7562 because of some
fails. Original commit message is below.
This change allow a CastExpr to have optional FPOptionsOverride object,
stored in trailing storage. Of all cast nodes only ImplicitCastExpr,
CStyleCastExpr, CXXFunctionalCastExpr and CXXStaticCastExpr are allowed
to have FPOptions.
Differential Revision: https://reviews.llvm.org/D85960
This change allow a CastExpr to have optional FPOptionsOverride object,
stored in trailing storage. Of all cast nodes only ImplicitCastExpr,
CStyleCastExpr, CXXFunctionalCastExpr and CXXStaticCastExpr are allowed
to have FPOptions.
Differential Revision: https://reviews.llvm.org/D85960
Function Sema::isOpenMPGlobalCapturedDecl() has a parameter `unsigned Level`,
but use `Level >= 0` as the condition of `while`, thus cause an infinite loop.
Fix by changing the loop condition to `Level > 0`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D86858
OpenMP 5.0 supports nested declare target regions. So, in general,it is
allow to mark a declarationas declare target with different device_type
or link type. Patch adds support for such kind of nesting.
Differential Revision: https://reviews.llvm.org/D86239
If the declaration is used in the reduction clause, it is captured by
reference by default. But if the declaration is a pointer and it is a
base for array-like reduction, this declaration can be captured by
value, since the pointee is reduced but not the original declaration.
Differential Revision: https://reviews.llvm.org/D85321
Summary:
Introduced OMPChildren class to handle all associated clauses, statement
and child expressions/statements. It allows to represent some directives
more correctly (like flush, depobj etc. with pseudo clauses, ordered
depend directives, which are standalone, and target data directives).
Also, it will make easier to avoid using of CapturedStmt in directives,
if required (atomic, tile etc. directives).
Also, it simplifies serialization/deserialization of the
executable/declarative directives.
Reduces number of allocation operations for mapper declarations.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83261
This patch implements Clang front end support for the OpenMP TR8
`present` motion modifier for `omp target update` directives. The
next patch in this series implements OpenMP runtime support.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D84711
It was unclear what `isa` was supposed to mean so we did not provide any
traits for this context selector. With this patch we will allow *any*
string or identifier. We use the target attribute and target info to
determine if the trait matches. In other words, we will check if the
provided value is a target feature that is available (at the call site).
Fixes PR46338
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D83281
This patch implements Clang front end support for the OpenMP TR8
`present` motion modifier for `omp target update` directives. The
next patch in this series implements OpenMP runtime support.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D84711
Summary:
If the variable is constrcutible, its copy is created by calling a
constructor. Such variables are duplicated and thus, must be captured.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83909
This change allow a CallExpr to have optional FPOptionsOverride object,
stored in trailing storage. The implementaion is made similar to the way
used in BinaryOperator.
Differential Revision: https://reviews.llvm.org/D84343
This patch implements Clang front end support for the OpenMP TR8
`present` map type modifier. The next patch in this series implements
OpenMP runtime support.
This patch does not attempt to implement TR8 sec. 2.22.7.1 "map
Clause", p. 319, L14-16:
> If a map clause with a present map-type-modifier is present in a map
> clause, then the effect of the clause is ordered before all other
> map clauses that do not have the present modifier.
Compare to L10-11, which Clang does not appear to implement yet:
> For a given construct, the effect of a map clause with the to, from,
> or tofrom map-type is ordered before the effect of a map clause with
> the alloc, release, or delete map-type.
This patch also does not implement the `present` implicit-behavior for
`defaultmap` or the `present` motion-modifier for `target update`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D83061
Reapply 49e5f603d4
which had been reverted in c94332919b.
Originally reverted because I hadn't updated it in quite a while when I
got around to committing it, so there were a bunch of missing changes to
new code since I'd written the patch.
Reviewers: aaron.ballman
Differential Revision: https://reviews.llvm.org/D76646
Summary:
According to OpenMP 5.0, the restrictions for mapping of overlapped data
apply only for explicitly mapped data, there is no restriction for
implicitly mapped data just like in OpenMP 4.5.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83398