The symbol may be used by use-association for multiple times such
as one in module specification part and one in module procedure.
Then in module procedure, the variable instantiation will be called
for multiple times. But we only need to threadprivatize it once and
use the threadprivatized value for the second time.
Fix#58379.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D136035
nullptr matches both against ::mlir::UnitAttr and ::mlir::TypeRange,
so the following two candidates fit:
static void mlir::omp::OrderedRegionOp::build(::mlir::OpBuilder &odsBuilder,
::mlir::OperationState &odsState,
/*optional*/::mlir::UnitAttr simd)
static void mlir::omp::OrderedRegionOp::build(::mlir::OpBuilder &odsBuilder,
::mlir::OperationState &odsState,
::mlir::TypeRange resultTypes,
/*optional*/bool simd = false)
This supports the lowering of private and firstprivate clauses in single
construct. The alloca ops are emitted in the entry block according to
https://llvm.org/docs/Frontend/PerformanceTips.html#use-of-allocas, and
the load/store ops are emitted in the single region. The data race
problem is handled in OMPIRBuilder. That is, the barrier is emitted in
OMPIRBuilder.
Co-authored-by: Nimish Mishra <neelam.nimish@gmail.com>
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128596
This flips all of the remaining dialects to prefixed except for linalg, which
will be done in a followup.
Differential Revision: https://reviews.llvm.org/D134995
Adds support for .and. reductions with logical types.
Because arith.addi doesn'to work with fir.logical<4> types
logical<4> must be converted to i1 prior to the operation.
This means that the pattern matched by integer reductions
(load -> op -> store) will not match logical reductions.
Instead, the pattern being searched for here is
load -> convert(logical<4> to i1) -> op -> convert(i1 to logical<4>) -> store
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D132228
This patch adds private/firstprivate support for sections construct. For
a source like the following:
```
!$omp sections private(x) firstprivate(y)
!$omp section
<block of code>
!$omp section
<block of code>
!$omp end sections
```
...privatization proceeds to privatize `x` and `y` accordingly
inside each of the generated `omp.section` operations.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D131463
Allows addition/multiplication reductions to be used with
real types by adding getReductionOperation() to OpenMP.cpp,
which can select either integer or floating-point instruction.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D132459
Non-global variable which can be in threadprivate directive must be one
variable in main program, and it has implicit SAVE attribute. Take it as
with SAVE attribute, so to create GlobalOp for it to simplify the
translation to LLVM IR.
Reviewed By: NimishMishra
Differential Revision: https://reviews.llvm.org/D127047
This supports lowering from parse-tree to MLIR for OpenMP safelen clause
in SIMD construct.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D132574
This supports translation from MLIR to LLVM IR using OMPIRBuilder for
OpenMP safelen clause in SIMD construct.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D132245
Remove barriers for firstprivate except if the variable is also
lastprivatized. Re-arrange code and put all last-privates inside a
single scf.if.
As OpenMP Spec 5.0, to avoid the data races, concurrent updates of the
original list item must be synchronized with the read of the original
list item that occurs as a result of the firstprivate clause. Adding
barrier(s) before and/or after the worksharing region would remove the
data races, and it is the application(user)'s job. However, when
one list item is in both firstprivate and lastprivate clauses, the
standard (https://www.openmp.org/spec-html/5.0/openmpsu105.html) states
the following:
```
If a list item appears in both firstprivate and lastprivate clauses, the
update required for the lastprivate clause occurs after all
initializations for the firstprivate clause.
```
So, the synchronization should be ensured by compiler such as emiting
one barrier since the lastprivate clause follows the reads of the
original list item performed for the initialization of each of the
firstprivate list item.
Add FIXME for two special cases, sections construct and linear clause.
The data race problem for single construct will be handled later.
This implementation is based on the discussion with OpenMP committee and
clang code (clang/lib/CodeGen/CGStmtOpenMP.cpp).
Reviewed By: kiranchandramohan, NimishMishra
Differential Revision: https://reviews.llvm.org/D131832
This patch adds lowering support for default clause.
1. During symbol resolution in semantics, should the enclosing context
have a default data sharing clause defined and a `parser::Name` is not
attached to an explicit data sharing clause, the
`semantics::Symbol::Flag::OmpPrivate` flag (in case of
default(private)) and `semantics::Symbol::Flag::OmpFirstprivate` flag
(in case of default(firstprivate)) is added to the symbol.
2. During lowering, all symbols having either
`semantics::Symbol::Flag::OmpPrivate` or
`semantics::Symbol::Flag::OmpFirstprivate` flag are collected and
privatised appropriately.
Co-authored-by: Peixin Qiao <qiaopeixin@huawei.com>
Reviewed by: peixin
Differential Revision: https://reviews.llvm.org/D123930
This patch adds lowering support for default clause.
1. During symbol resolution in semantics, should the enclosing context have
a default data sharing clause defined and a `parser::Name` is not attached
to an explicit data sharing clause, the
`semantics::Symbol::Flag::OmpPrivate` flag (in case of default(private))
and `semantics::Symbol::Flag::OmpFirstprivate` flag (in case of
default(firstprivate)) is added to the symbol.
2. During lowering, all symbols having either
`semantics::Symbol::Flag::OmpPrivate` or
`semantics::Symbol::Flag::OmpFirstprivate` flag are collected and
privatised appropriately.
Co-authored-by: Peixin Qiao <qiaopeixin@huawei.com>
Reviewed by: peixin
Differential Revision: https://reviews.llvm.org/D123930
Adds support for reduction of multiplcation
by extending OpenMP.cpp::genOpenMPReduction()
and altering the identity constant emitted in
OpenMP.cpp::createReductionDelc()
This patch builds D130077 and as such,
only supports reductions for interger types in
worksharping loops.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D130767
This patch serves two main purposes:
Firstly, to split some of the logic into a seperate method
to try and improve readability
On top of this, it aims to make creating the reductions more generic.
That way, subsequent patches adding reductions shouldn't need
to add a significant amount of extra logic checks, such as checking
for specific operators.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D131161
This supports lowering from parse-tree to MLIR and translation from
MLIR to LLVM IR using OMPIRBuilder for OpenMP simdlen clause in SIMD
construct.
Reviewed By: shraiysh, peixin, arnamoy10
Differential Revision: https://reviews.llvm.org/D130195
This patch adds lowering support for default clause.
1. During symbol resolution in semantics, should the enclosing context have
a default data sharing clause defined and a `parser::Name` is not attached
to an explicit data sharing clause, the
`semantics::Symbol::Flag::OmpPrivate` flag (in case of `default(private)`)
and `semantics::Symbol::Flag::OmpFirstprivate` flag (in case of
`default(firstprivate)`) is added to the symbol.
2. During lowering, all symbols having either
`semantics::Symbol::Flag::OmpPrivate` or
`semantics::Symbol::Flag::OmpFirstprivate` flag are collected and
privatised appropriately.
Co-authored-by: Peixin Qiao <qiaopeixin@huawei.com>
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D123930
This patch adds an initial support to the lastprivate clause for worksharing loop. The patch creates necessary control flow to guarantee the store of the value from the logical last iteration of the workshare loop.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D130027
Lower the Flang parse-tree containing OpenMP reductions to the OpenMP
dialect. The OpenMP dialect models reductions with,
1) A reduction declaration operation that specifies how to initialize, combine,
and atomically combine private reduction variables.
2) The OpenMP operation (like wsloop) that supports reductions has an array of
reduction accumulator variables (operands) and an array attribute of the same
size that points to the reduction declaration to be used for the reduction
accumulation.
3) The OpenMP reduction operation that takes a value and an accumulator.
This operation replaces the original reduction operation in the source.
(1) is implemented by the `createReductionDecl` in OpenMP.cpp,
(2) is implemented while creating the OpenMP operation,
(3) is implemented by the `genOpenMPReduction` function in OpenMP.cpp, and
called from Bridge.cpp. The implementation of (3) is not very robust.
NOTE 1: The patch currently supports only reductions for integer type addition.
NOTE 2: Only supports reduction in the worksharing loop.
NOTE 3: Does not generate atomic combination region.
NOTE 4: Other options for creating the reduction operation include
a) having the reduction operation as a construct containing an assignment
and then handling it appropriately in the Bridge.
b) we can modify `genAssignment` or `genFIR(AssignmentStmt)` in the Bridge to
handle OpenMP reduction but so far we have tried not to mix OpenMP
and non-OpenMP code and this will break that.
I will try (b) in a separate patch.
NOTE 5: OpenMP dialect gained support for reduction with the patches:
D105358, D107343. See https://discourse.llvm.org/t/rfc-openmp-reduction-support/3367
for more details.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D130077
Co-authored-by: Peixin-Qiao <qiaopeixin@huawei.com>
This patch adds lowering support for atomic update construct. A region
is associated with every `omp.atomic.update` operation wherein resides:
(1) the evaluation of the expression on the RHS of the atomic assignment
statement, and (2) a `omp.yield` operation that yields the extended value
of expression evaluated in (1).
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D125668
Shared is the default behaviour in the IR, so no handling is required.
Default clause with shared or none do not require any handling since
Shared is the default behaviour in the IR and None is only required
for semantic checks.
This patch is carved out from D123930 to remove couple of false TODOs.
Reviewed By: peixin, shraiysh
Differential Revision: https://reviews.llvm.org/D128797
Co-authored-by: Nimish Mishra <neelam.nimish@gmail.com>
1. Remove the redundant collapse clause in MLIR OpenMP worksharing-loop
operation.
2. Fix several typos.
3. Refactor the chunk size type conversion since CreateSExtOrTrunc has
both type check and type conversion.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128338
This supports the lowering of copyin clause initially. The pointer,
allocatable, common block, polymorphic varaibles will be supported
later.
This also includes the following changes:
1. Resolve the COPYIN clause and make the entity as host associated.
2. Fix collectSymbolSet by adding one option to control collecting the
symbol itself or ultimate symbol of it so that it can be used
explicitly differentiate the host and associated variables in
host-association.
3. Add one helper function `lookupOneLevelUpSymbol` to differentiate the
usage of host and associated variables explicitly. The previous
lowering of firstprivate depends on the order of
`createHostAssociateVarClone` and `lookupSymbol` of host symbol. With
this fix, this dependence is removed.
4. Reuse `copyHostAssociateVar` for copying operation of COPYIN clause.
Reviewed By: kiranchandramohan, NimishMishra
Differential Revision: https://reviews.llvm.org/D127468
Loop variables of a worksharing loop and sequential loops in parallel
region are privatised by default. These variables are marked with
OmpPreDetermined. Skip explicit privatisation of these variables.
Note: This is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.
Reviewed By: Leporacanthicus
Differential Revision: https://reviews.llvm.org/D127249
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Mats Petersson <mats.petersson@arm.com>
This patch adds code so that using bbc we are able to see an end-to-end lowering of simd construct in action.
Reviewed By: kiranchandramohan, peixin, shraiysh
Differential Revision: https://reviews.llvm.org/D125282
Remove a backwards dependence from Optimizer -> Lower by moving Todo.h
to the optimizer and out of lowering.
This patch is part of the upstreaming effort from fir-dev branch.
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D127292
Add support for lowering the schedule modifiers (simd, monotonic,
non-monotonic) in worksharing loops.
Note: This is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D127311
Co-authored-by: Mats Petersson <mats.petersson@arm.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Getting ompobject symbol is needed in multiple places and will be
needed later for the lowering of other constructs/clauses such as
copyin clause. Extract them into one function.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D127280
This supports lowering parse-tree to MLIR for threadprivate directive
following the OpenMP 5.1 [2.21.2] standard. Take the following as an
example:
```
program m
integer, save :: i
!$omp threadprivate(i)
call sub(i)
!$omp parallel
call sub(i)
!$omp end parallel
end
```
```
func.func @_QQmain() {
%0 = fir.address_of(@_QFEi) : !fir.ref<i32>
%1 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
fir.call @_QPsub(%1) : (!fir.ref<i32>) -> ()
omp.parallel {
%2 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
fir.call @_QPsub(%2) : (!fir.ref<i32>) -> ()
omp.terminator
}
return
}
```
A threadprivate operation (omp.threadprivate) is created for all
references to a threadprivate variable. The runtime will appropriately
return a threadprivate var (%1 as above) or its copy (%2 as above)
depending on whether it is outside or inside a parallel region. For
threadprivate access outside the parallel region, the threadprivate
operation is created in instantiateVar. Inside the parallel region, it
is created in createBodyOfOp.
One new utility function collectSymbolSet is created for collecting
all the variables with a property within a evaluation, which may be one
Fortran, or OpenMP, or OpenACC construct.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D124226
As per issue #1196, the loop induction variable, which is an argument
in the omp.wsloop operation, does not have a memory location, so when
passed to a function or subroutine, the reference to the value is not
a memory location, but the value of the induction variable. The callee
function/subroutine is then trying to dereference memory at address 1
or some other "not a good memory location".
This is fixed by creating a temporary memory location and storing the
value of the induction variable in that.
Test fixes as a consequence of the changed code generated.
Add checking for some of the omp-unstructured.f90 to check for alloca,
store and load operations, to ensure the correct flow. Add a test
for CYCLE inside a omp-do loop.
Also convert to use -emit-fir in the omp-unstructrued, and make
the symbol matching consistent in the omp-wsloop-variable test.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D126711
The following changes are made for OpenMP operations with unstructured region,
1. For combined constructs the outer operation is considered a structured
region and the inner one as the unstructured.
2. Added a condition to ensure that we create new blocks only once for nested
unstructured OpenMP constructs.
Tests are added for checking the structure of the CFG.
Note: This is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project. Code originally reviewed
at https://github.com/flang-compiler/f18-llvm-project/pull/1394.
Reviewed By: vdonaldson, shraiysh, peixin
Differential Revision: https://reviews.llvm.org/D126375
Test that chunk size is passed to the static init function.
Using three different variations:
1. Single constant.
2. Expression with constants.
3. Variable value.
Reviewed By: peixin, shraiysh
Differential Revision: https://reviews.llvm.org/D126383
For pointer variables, using getSymbolAddress cannot get the coorect
address for atomic read/write operands. Use genExprAddr to fix it.
Reviewed By: shraiysh, NimishMishra
Differential Revision: https://reviews.llvm.org/D125793
Since the FIR operations are mostly structured, it is only the functions
that could contain multiple blocks inside an operation. This changes
with OpenMP since OpenMP regions can contain multiple blocks. For
unstructured code, the blocks are created in advance and belong to the
top-level function. This caused code in OpenMP region to be placed under
the function level.
In this fix, if the OpenMP region is unstructured then new blocks are
created inside it.
Note1: This is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project. The code in this patch is a
subset of the changes in https://github.com/flang-compiler/f18-llvm-project/pull/1178.
Reviewed By: vdonaldson
Differential Revision: https://reviews.llvm.org/D126293
Co-authored-by: Val Donaldson <vdonaldson@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
The types of lower bound, upper bound, and step are converted into the
type of the loop variable if necessary. OpenMP runtime requires 32-bit
or 64-bit loop variables. OpenMP loop iteration variable cannot have
more than 64 bits size and will be narrowed.
This patch is part of upstreaming code from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project. (#1256)
Co-authored-by: kiranchandramohan <kiranchandramohan@gmail.com>
Reviewed By: kiranchandramohan, shraiysh
Differential Revision: https://reviews.llvm.org/D125740
When parallel is used in a combined construct, then use a separate
function to create the parallel operation. It handles the parallel
specific clauses and leaves the rest for handling at the inner
operations.
Reviewed By: peixin, shraiysh
Differential Revision: https://reviews.llvm.org/D125465
Co-authored-by: Sourabh Singh Tomar <SourabhSingh.Tomar@amd.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Nimish Mishra <neelam.nimish@gmail.com>