Commit Graph

114 Commits

Author SHA1 Message Date
Michael Kruse b738ffa845 Heap allocation for new arrays.
This patch aims to implement the option of allocating new arrays created
by polly on heap instead of stack. To enable this option, a key named
'allocation' must be written in the imported json file with the value
'heap'.

We need such a feature because in a next iteration, we will implement a
mechanism of maximal static expansion which will need a way to allocate
arrays on heap. Indeed, the expansion is very costly in terms of memory
and doing the allocation on stack is not worth considering.

The malloc and the free are added respectively at polly.start and
polly.exiting such that there is no use-after-free (for instance in case
of Scop in a loop) and such that all memory cells allocated with a
malloc are free'd when we don't need them anymore.

We also add :

- In the class ScopArrayInfo, we add a boolean as member called IsOnHeap
  which represents the fact that the array in allocated on heap or not.
- A new branch in the method allocateNewArrays in the ISLNodeBuilder for
  the case of heap allocation. allocateNewArrays now takes a BBPair
  containing polly.start and polly.exiting. allocateNewArrays takes this
  two blocks and add the malloc and free calls respectively to
  polly.start and polly.exiting.
- As IntPtrTy for the malloc call, we use the DataLayout one.

To do that, we have modified :

- createScopArrayInfo and getOrCreateScopArrayInfo such that it returns
  a non-const SAI, in order to be able to call setIsOnHeap in the
  JSONImporter.
- executeScopConditionnaly such that it return both start block and end
  block of the scop, because we need this two blocs to be able to add
  the malloc and the free calls at the right position.

Differential Revision: https://reviews.llvm.org/D33688

llvm-svn: 306540
2017-06-28 13:02:43 +00:00
Michael Kruse a6d48f59a1 Fix a lot of typos. NFC.
llvm-svn: 304974
2017-06-08 12:06:15 +00:00
Michael Kruse 706f79ab14 [CodeGen] Support partial write accesses.
Allow the BlockGenerator to generate memory writes that are not defined
over the complete statement domain, but only over a subset of it. It
generates a condition that evaluates to 1 if executing the subdomain,
and only then execute the access.

Only write accesses are supported. Read accesses would require a PHINode
which has a value if the access is not executed.

Partial write makes DeLICM able to apply mappings that are not defined
over the entire domain (for instance, a branch that leaves a loop with
a PHINode in its header; a MemoryKind::PHI write when leaving is never
read by its PHI read).

Differential Revision: https://reviews.llvm.org/D33255

llvm-svn: 303517
2017-05-21 22:46:57 +00:00
Siddharth Bhat b7f68b8c9e [Fortran Support] Materialize outermost dimension for Fortran array.
- We use the outermost dimension of arrays since we need this
information to generate GPU transfers.

- In general, if we do not know the outermost dimension of the array
(because the indexing expression is non-affine, for example) then we
simply cannot generate transfer code.

- However, for Fortran arrays, we can use the Fortran array
representation which stores the dimensions of all arrays.

- This patch uses the Fortran array representation to generate code that
computes the outermost dimension size.

Differential Revision: https://reviews.llvm.org/D32967

llvm-svn: 303429
2017-05-19 15:07:45 +00:00
Tobias Grosser f3adab4c20 [Polly] Canonicalize arrays according to base-ptr equivalence class
Summary:
    In case two arrays share base pointers in the same invariant load equivalence
    class, we canonicalize all memory accesses to the first of these arrays
    (according to their order in the equivalence class).

    This enables us to optimize kernels such as boost::ublas by ensuring that
    different references to the C array are interpreted as accesses to the same
    array. Before this change the runtime alias check for ublas would fail, as it
    would assume models of the C array with differing (but identically valued) base
    pointers would reference distinct regions of memory whereas the referenced
    memory regions were indeed identical.

    As part of this change we remove most of the MemoryAccess::get*BaseAddr
    interface. We removed already all references to get*BaseAddr in previous
    commits to ensure that no code relies on matching base pointers between
    memory accesses and scop arrays -- except for three remaining uses where we
    need the original base pointer. We document for these situations that
    MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct
    to the base pointer of the scop array referenced by this memory access.

Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert

Reviewed By: Meinersbur

Subscribers: etherzhhb

Tags: #polly

Differential Revision: https://reviews.llvm.org/D28518

llvm-svn: 302636
2017-05-10 10:59:58 +00:00
Hongbin Zheng 0f8f177682 [Polly] Do not introduce address space cast
Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally.

Differential Revision: https://reviews.llvm.org/D32581

llvm-svn: 301519
2017-04-27 06:42:14 +00:00
Matt Arsenault b3e30c32ce Update for alloca construction changes
llvm-svn: 299905
2017-04-11 00:12:58 +00:00
Philip Pfaffe 2d950f36ee [Polly][NewPM] Pull references to the legacy PM interface from utilities and helpers
Summary:
A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly.

This shouldn't introduce any function changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however.

Reviewers: grosser, Meinersbur

Reviewed By: grosser

Subscribers: nemanjai, pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D31653

llvm-svn: 299423
2017-04-04 10:01:53 +00:00
Roman Gareev cdfb57dc46 Introduce another level of metadata to distinguish non-aliasing accesses
Introduce another level of alias metadata to distinguish the individual
non-aliasing accesses that have inter iteration alias-free base pointers
marked with "Inter iteration alias-free" mark nodes. It can be used to,
for example, distinguish different stores (loads) produced by unrolling of
the innermost loops and, subsequently, sink (hoist) them by LICM.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D30606

llvm-svn: 298510
2017-03-22 14:25:24 +00:00
Roman Gareev 23df27682a Map the new load to the base pointer of the invariant load hoisted load
Map the new load to the base pointer of the invariant load hoisted load
to be able to find the alias information for it.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D30605

llvm-svn: 298507
2017-03-22 13:57:53 +00:00
Tobias Grosser b28f86e9e6 [CodeGen] Remove need for all parameters to be in scop context for load hoisting.
When not adding constraints on parameters using -polly-ignore-parameter-bounds,
the context may not necessarily list all parameter dimensions. To support code
generation in this situation, we now always iterate over the actual parameter
list, rather than relying on the context to list all parameter dimensions.

llvm-svn: 298197
2017-03-18 23:12:49 +00:00
Michael Kruse 52ab4943b4 Remove all references to PostDominators. NFC.
Marking a pass as preserved is necessary if any Polly pass uses it, even
if it is not preserved within the generated code. Not marking it would
cause the the Polly pass chain to be interrupted. It is not used by any
Polly pass anymore, hence we can remove all references to it.

llvm-svn: 295983
2017-02-23 15:16:22 +00:00
Tobias Grosser ff40087a6a Update to recent formatting changes
llvm-svn: 293756
2017-02-01 10:12:09 +00:00
Tobias Grosser 587f1f57ad [Polly] [BlockGenerator] Unify ScalarMap and PhiOpsMap
Instead of keeping two separate maps from Value to Allocas, one for
MemoryType::Value and the other for MemoryType::PHI, we introduce a single map
from ScopArrayInfo to the corresponding Alloca. This change is intended, both as
a general simplification and cleanup, but also to reduce our use of
MemoryAccess::getBaseAddr(). Moving away from using getBaseAddr() makes sure
we have only a single place where the array (and its base pointer) for which we
generate code for is specified, which means we can more easily introduce new
access functions that use a different ScopArrayInfo as base. We already today
experiment with modifiable access functions, so this change does not address
a specific bug, but it just reduces the scope one needs to reason about.

Another motivation for this patch is https://reviews.llvm.org/D28518, where
memory accesses with different base pointers could possibly be mapped to a
single ScopArrayInfo object. Such a mapping is currently not possible, as we
currently generate alloca instructions according to the base addresses of the
memory accesses, not according to the ScopArrayInfo object they belong to.  By
making allocas ScopArrayInfo specific, a mapping to a single ScopArrayInfo
object will automatically mean that the same stack slot is used for these
arrays. For D28518 this is not a problem, as only MemoryType::Array objects are
mapping, but resolving this inconsistency will hopefully avoid confusion.

llvm-svn: 293374
2017-01-28 07:42:10 +00:00
Tobias Grosser e1ff0cf2eb Relax assert when setting access functions with invariant base pointers
Summary:
Instead of forbidding such access functions completely, we verify that their
base pointer has been hoisted and only assert in case the base pointer was
not hoisted.

I was trying for a little while to get a test case that ensures the assert is
correctly fired in case of invariant load hoisting being disabled, but I could
not find a good way to do so, as llvm-lit immediately aborts if a command
yields a non-zero return value. As we do not generally test our asserts,
not having a test case here seems OK.

This resolves http://llvm.org/PR31494

Suggested-by: Michael Kruse <llvm@meinersbur.de>

Reviewers: efriedma, jdoerfert, Meinersbur, gareevroman, sebpop, zinob, huihuiz, pollydev

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D28798

llvm-svn: 292213
2017-01-17 12:00:42 +00:00
Tobias Grosser 21a059af09 Adjust formatting to commit r292110 [NFC]
llvm-svn: 292123
2017-01-16 14:08:10 +00:00
Roman Gareev bd5c6039c6 Align newly created arrays to the first level cache line boundary
Aligning data to cache lines boundaries helps to avoid overheads related to
an access to it ([1]). This patch aligns newly created arrays and adds an
option to specify the first level cache line size. By default we use 64 bytes,
which is a typical cache-line size ([2]).

In case of Intel Core i7-3820 SandyBridge and the following options,

clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME
-march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true
-DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8
-mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm
-polly-target-latency-vector-fma=8

it helps to improve the performance from 11.303 GFlops/sec (39,247% of
theoretical peak) to 12.63 GFlops/sec (43,8542% of theoretical peak).

Refs.:

[1] - http://www.alexonlinux.com/aligned-vs-unaligned-memory-access
[2] - http://igoro.com/archive/gallery-of-processor-cache-effects/

Differential Revision: https://reviews.llvm.org/D28020

Reviewed-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 290253
2016-12-21 12:37:36 +00:00
Tobias Grosser dc6b87c56e Add newline at end of debug print
In '[DBG] Allow to emit the RTC value at runtime' the diagnostics were printed
without a newline at the end of each diagnostic. We add such a newline to
improve readability.

llvm-svn: 288323
2016-12-01 08:08:47 +00:00
Michael Kruse 11c5e07925 canSynthesize: Remove unused argument LI. NFC.
The helper function polly::canSynthesize() does not directly use the LoopInfo
analysis, hence remove it from its argument list.

llvm-svn: 288144
2016-11-29 15:11:04 +00:00
Tobias Grosser df8f35b7b8 Update for clang-format change in r288119
llvm-svn: 288134
2016-11-29 12:52:08 +00:00
Tobias Grosser b3c3d149b9 [CodeGen] Add flag to code-generate most memory access expressions
Introduce the new flag -polly-codegen-generate-expressions which forces Polly
to code generate AST expressions instead of using our SCEV based access
expression generation even for cases where the original memory access relation
was not changed and the SCEV based access expression could be code generated
without any issue.

This is an experimental option for better testing the isl ast expression
generation. The default behavior of Polly remains unchanged. We also exclude
a couple of cases for which the AST expression is not yet working.

llvm-svn: 287694
2016-11-22 20:21:16 +00:00
Johannes Doerfert 81aa6e882f [NFC] Adjust naming scheme of statistic variables
Suggested-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 287347
2016-11-18 14:37:08 +00:00
Johannes Doerfert dae2e9287d [DBG] Collect statistics about actually versioned SCoPs
llvm-svn: 287267
2016-11-17 21:55:43 +00:00
Johannes Doerfert 8c5464a715 [DBG] Allow to emit the RTC value at runtime
The new command line flag "polly-codegen-emit-rtc-print" can be used to
place a "printf" in the generated code that will print the RTC value and
the overflow state.

llvm-svn: 287265
2016-11-17 21:49:19 +00:00
Tobias Grosser 16480186f8 IslNodeBuilder: Ensure newly generated memory accesses are well-defined
Add some additional asserts that ensure newly code-generated memory accesses
are defined on all domain and schedule domain instances.

llvm-svn: 286050
2016-11-05 21:46:01 +00:00
Eli Friedman acf8006471 [Polly CodeGen] Break critical edge from RTC to original loop.
This makes polly generate a CFG which is closer to what we want
in LLVM IR, with a loop preheader for the original loop. This is
just a cleanup, but it exposes some fragile assumptions.

I'm not completely happy with the changes related to expandCodeFor;
RTCBB->getTerminator() is basically a random insertion point which
happens to work due to the way we generate runtime checks. I'm not
sure what the right answer looks like, though.

Differential Revision: https://reviews.llvm.org/D26053

llvm-svn: 285864
2016-11-02 22:32:23 +00:00
Eli Friedman 3c1a75bf9c Handle multi-dimensional invariant load.
If the address of a load depends on another load, make sure to emit
the loads in the right order.

llvm-svn: 284426
2016-10-17 21:04:26 +00:00
Michael Kruse 4b0c5aea78 [CodeGen] Add assertion for indirect array index expression generation. NFC.
Currently Polly cannot generate code for index expressions if the base pointer
is computed within the scop. The base pointer must be generated as well, but
there is no code that triggers that.

Add an assertion to detect when this would occur and miscompile. The IR verifier
should catch it as well.

llvm-svn: 282893
2016-09-30 18:29:37 +00:00
Roman Gareev b3224adfb6 Perform copying to created arrays according to the packing transformation
This is the fourth patch to apply the BLIS matmul optimization pattern on matmul
kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf).
BLIS implements gemm as three nested loops around a macro-kernel, plus two
packing routines. The macro-kernel is implemented in terms of two additional
loops around a micro-kernel. The micro-kernel is a loop around a rank-1
(i.e., outer product) update. In this change we perform copying to created
arrays, which is the last step to implement the packing transformation.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D23260

llvm-svn: 281441
2016-09-14 06:26:09 +00:00
Roman Gareev f5aff70405 Store the size of the outermost dimension in case of newly created arrays that require memory allocation.
We do not need the size of the outermost dimension in most cases, but if we
allocate memory for newly created arrays, that size is needed.

Reviewed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D23991

llvm-svn: 281234
2016-09-12 17:08:31 +00:00
Tobias Grosser a3afe44d6c IslNodeBuilder: Add missing __isl_take annotation
llvm-svn: 281034
2016-09-09 11:16:50 +00:00
Tobias Grosser f3600dfa2d IslNodeBuilder: Add missing __isl_take annotations
llvm-svn: 280936
2016-09-08 13:48:55 +00:00
Tobias Grosser c80d6979bd Drop '@brief' from doxygen comments
LLVM's coding guideline suggests to not use @brief for one-sentence doxygen
comments to improve readability. Switch this once and for all to ensure people
do not copy @brief comments from other parts of Polly, when writing new code.

llvm-svn: 280468
2016-09-02 06:33:33 +00:00
Tobias Grosser fa9abd1f03 Fix compilation in 'asserts' mode
llvm-svn: 278025
2016-08-08 17:35:52 +00:00
Tobias Grosser 0aa29532b7 [IslNodeBuilder] Move run-time check generation to NodeBuilder [NFC]
This improves the structure of the code and allows us to reuse the runtime
code generation in the PPCGCodeGeneration.

llvm-svn: 278017
2016-08-08 15:41:52 +00:00
Tobias Grosser 000db70754 [IslNodeBuilder] Directly use the insert location of our Builder
... instead of adding instructions at the end of the basic block the builder
is currently at. This makes it easier to reason about where IR is generated,
as with the IRBuilder there is just a single location that specificies where
IR is generated.

llvm-svn: 278013
2016-08-08 15:25:46 +00:00
Tobias Grosser 00bb5a99f5 GPGPU: Handle scalar array references
Pass the content of scalar array references to the alloca on the kernel side
and do not pass them additional as normal LLVM scalar value.

llvm-svn: 277699
2016-08-04 06:55:59 +00:00
Tobias Grosser 2219d15748 Fix a couple of spelling mistakes
llvm-svn: 277569
2016-08-03 05:28:09 +00:00
Roman Gareev d7754a1245 Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions
Extend the jscop interface to allow the user to export arrays. It is required
that already existing arrays of the list of arrays correspond to arrays
of the SCoP. Each array that is appended to the list will be newly created.
Furthermore, we allow the user to modify access expressions to reference
any array in case it has the same element type.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D22828

llvm-svn: 277263
2016-07-30 09:25:51 +00:00
Tobias Grosser 86083da0ec IslNodeBuilder: expose addReferencesFromStmt [NFC]
This will be used by Polly GPGPU to determine the values that need to be
passed to GPU kernels.

llvm-svn: 276269
2016-07-21 13:15:55 +00:00
Tobias Grosser faef9a7667 Fix gcc compile failure
Commit r275056 introduced a gcc compile failure due to us using two
types named 'Type', the first being the newly introduced member variable
'Type' the second being llvm::Type. We resolve this issue by renaming
the newly introduced member variable to AccessType.

llvm-svn: 275057
2016-07-11 12:27:04 +00:00
Tobias Grosser 4e2d9c45b9 InvariantEquivClassTy: Use struct instead of 4-tuple to increase readability
Summary:
With a struct we can use named accessors instead of generic std::get<3>()
calls. This increases readability of the source code.

Reviewers: jdoerfert

Subscribers: pollydev, llvm-commits

Differential Revision: http://reviews.llvm.org/D21955

llvm-svn: 275056
2016-07-11 12:15:10 +00:00
Tobias Grosser 3717aa5ddb This reverts recent expression type changes
The recent expression type changes still need more discussion, which will happen
on phabricator or on the mailing list. The precise list of commits reverted are:

- "Refactor division generation code"
- "[NFC] Generate runtime checks after the SCoP"
- "[FIX] Determine insertion point during SCEV expansion"
- "Look through IntToPtr & PtrToInt instructions"
- "Use minimal types for generated expressions"
- "Temporarily promote values to i64 again"
- "[NFC] Avoid unnecessary comparison for min/max expressions"
- "[Polly] Fix -Wunused-variable warnings (NFC)"
- "[NFC] Simplify min/max expression generation"
- "Simplify the type adjustment in the IslExprBuilder"

Some of them are just reverted as we would otherwise get conflicts. I will try
to re-commit them if possible.

llvm-svn: 272483
2016-06-11 19:17:15 +00:00
Johannes Doerfert 0767a511ba Use minimal types for generated expressions
We now use the minimal necessary bit width for the generated code. If
  operations might overflow (add/sub/mul) we will try to adjust the types in
  order to ensure a non-wrapping computation. If the type adjustment is not
  possible, thus the necessary type is bigger than the type value of
  --polly-max-expr-bit-width, we will use assumptions to verify the computation
  will not wrap. However, for run-time checks we cannot build assumptions but
  instead utilize overflow tracking intrinsics.

llvm-svn: 271878
2016-06-06 09:57:41 +00:00
Matthew Simpson acae9e3b30 [Polly] Fix -Wunused-variable warnings (NFC)
llvm-svn: 271518
2016-06-02 14:26:38 +00:00
Johannes Doerfert d36553753e Simplify the type adjustment in the IslExprBuilder
We now have a simple function to adjust/unify the types of two (or three)
  operands before an operation that requieres the same type for all operands.
  Due to this change we will not promote parameters that are added to i64
  anymore if that is not needed.

llvm-svn: 271513
2016-06-02 11:15:57 +00:00
Johannes Doerfert 0f0d209bec Use the SCoP directly for canSynthesize [NFC]
llvm-svn: 270429
2016-05-23 12:47:09 +00:00
Johannes Doerfert ef74443c97 Duplicate part of the Region interface in the Scop class [NFC]
This allows to use the SCoP directly for various queries,
  thus to hide the underlying region more often.

llvm-svn: 270426
2016-05-23 12:42:38 +00:00
Johannes Doerfert 952b5304bc Add and use Scop::contains(Loop/BasicBlock/Instruction) [NFC]
llvm-svn: 270424
2016-05-23 12:40:48 +00:00
Johannes Doerfert a61eda7698 [FIX] Let ScalarEvolution forget hoisted values
We have to rethink the handling of escaping values in order to make
  this kind of "fixes" go away.

llvm-svn: 270409
2016-05-23 09:02:54 +00:00