Value merging is only necessary for scalars when they are used outside
of the scop. While an array's base pointer can be used after the scop,
it gets an extra ScopArrayInfo of type MK_Value. We used to generate
phi's for both of them, where one was assuming the reault of the other
phi would be the original value, because it has already been replaced by
the previous phi. This resulted in IR that the current IR verifier
allows, but is probably illegal.
This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-process-unprofitable
from 16 to 10.
Also see llvm.org/PR26718.
llvm-svn: 262629
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.
llvm-svn: 262404
The generated dedicated subregion exit block was assumed to have the same
dominance relation as the original exit block. This is incorrect if the exit
block receives other edges than only from the subregion, which results in that
e.g. the subregion's entry block does not dominate the exit block.
llvm-svn: 261865
This is also be caught by the function verifier, but disconnected from
the place that produced it. Catch it already at creation to be able to
reason more directly about the cause.
llvm-svn: 261790
This allows code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.
We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.
We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.
Finally, we add this as a new item to the 2.9 release notes
Reviewers: jdoerfert, Meinersbur
Differential Revision: http://reviews.llvm.org/D16878
llvm-svn: 259784
We support now code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.
llvm-svn: 259587
MemAccInst wraps the common members of LoadInst and StoreInst. Also use
of this class in:
- ScopInfo::buildMemoryAccess
- BlockGenerator::generateLocationAccessed
- ScopInfo::addArrayAccess
- Scop::buildAliasGroups
- Replace every use of polly::getPointerOperand
Reviewers: jdoerfert, grosser
Differential Revision: http://reviews.llvm.org/D16530
llvm-svn: 258947
Ensure that there is at most one phi write access per PHINode and
ScopStmt. In particular, this would be possible for non-affine
subregions with multiple exiting blocks. We replace multiple MAY_WRITE
accesses by one MUST_WRITE access. The written value is constructed
using a PHINode of all exiting blocks. The interpretation of the PHI
WRITE's "accessed value" changed from the incoming value to the PHI like
for PHI READs since there is no unique incoming value.
Because region simplification shuffles around PHI nodes -- particularly
with exit node PHIs -- the PHINodes at analysis time does not always
exist anymore in the code generation pass. We instead remember the
incoming block/value pair in the MemoryAccess.
Differential Revision: http://reviews.llvm.org/D15681
llvm-svn: 258809
Both functions implement the same functionality, with the difference that
getNewScalarValue assumes that globals and out-of-scop scalars can be directly
reused without loading them from their corresponding stack slot. This is correct
for sequential code generation, but causes issues with outlining code e.g. for
OpenMP code generation. getNewValue handles such cases correctly.
Hence, we can replace getNewScalarValue with getNewValue. This is not only more
future proof, but also eliminates a bunch of code.
The only functionality that was available in getNewScalarValue that is lost
is the on-demand creation of scalar values. However, this is not necessary any
more as scalars are always loaded at the beginning of each basic block and will
consequently always be available when scalar stores are generated. As this was
not the case in older versions of Polly, it seems the on-demand loading is just
some older code that has not yet been removed.
Finally, generateScalarLoads also generated loads for values that are loop
invariant, available in GlobalMap and which are preferred over the ones loaded
in generateScalarLoads. Hence, we can just skip the code generation of such
scalar values, avoiding the generation of dead code.
Differential Revision: http://reviews.llvm.org/D16522
llvm-svn: 258799
getAccessFor does not guarantee a certain access to be returned in case an
instruction is related to multiple accesses. However, in the vector code
generation we want to know the stride of the array access of a store
instruction. By using getArrayAccessFor we ensure we always get the correct
memory access.
This patch fixes a potential bug, but I was unable to produce a failing test
case. Several existing test cases cover this code, but all of them already
passed out of luck (or the specific but not-guaranteed order in which we build
memory accesses).
llvm-svn: 255715
When generating scalar loads/stores separately the vector code has not been
updated. This commit adds code to generate scalar loads for vector code as well
as code to assert in case scalar stores are encountered within a vector loop.
llvm-svn: 255714
When rewriting the access functions of load/store statements, we are only
interested in the actual array memory location. The current code just took
the very first memory access, which could be a scalar or an array access. As
a result, we failed to update access functions even though this was requested
via .jscop.
llvm-svn: 255713
This change should not change the behavior of Polly today, but it allows
external constants to be remapped e.g. when targetting multiple LLVM modules.
llvm-svn: 255506
Over time different vocabulary has been introduced to describe the different
memory objects in Polly, resulting in different - often inconsistent - naming
schemes in different parts of Polly. We now standartize this to the following
scheme:
KindArray, KindValue, KindPHI, KindExitPHI
| ------- isScalar -----------|
In most cases this naming scheme has already been used previously (this
minimizes changes and ensures we remain consistent with previous publications).
The main change is that we remove KindScalar to clearify the difference between
a scalar as a memory object of kind Value, PHI or ExitPHI and a value (former
KindScalar) which is a memory object modeling a llvm::Value.
We also move all documentation to the Kind* enum in the ScopArrayInfo class,
remove the second enum in the MemoryAccess class and update documentation to be
formulated from the perspective of the memory object, rather than the memory
access. The terms "Implicit"/"Explicit", formerly used to describe memory
accesses, have been dropped. From the perspective of memory accesses they
described the different memory kinds well - especially from the perspective of
code generation - but just from the perspective of a memory object it seems more
straightforward to talk about scalars and arrays, rather than explicit and
implicit arrays. The last comment is clearly subjective, though. A less
subjective reason to go for these terms is the historic use both in mailing list
discussions and publications.
llvm-svn: 255467
Previously, accesses that originate from PHI nodes in the exit block
were registered as SCALAR. In some context they are treated as scalars,
but it makes a difference in others. We used to check whether the
AccessInstruction is a terminator to differentiate the cases.
This patch introduces an MemoryAccess origin EXIT_PHI and a
ScopArrayInfo kind KIND_EXIT_PHI to make this case more explicit. No
behavioural change intended.
Differential Revision: http://reviews.llvm.org/D14688
llvm-svn: 254149
IVs of loops for which the loop header is in the subregion, but not the entire
loop may be incremented outside of the subregion and can consequently not be
kept private to the subregion. Instead, they need to and are modeled as virtual
loops in the iteration domains. As this is the case, generating new subregion
induction variables for such loops is not needed and indeed wrong as they would
hide the virtual induction variables modeled in the scop.
This fixes a miscompile in MultiSource/Benchmarks/Ptrdist/bc and
MultiSource/Benchmarks/nbench/. Thanks Michael and Johannes for their
investiagations and helpful observations regarding this bug.
llvm-svn: 252860
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.
As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.
This reverts 252449 and reapplies r252445. Its test was failing
indeterministically due to r252375 which was reverted in r252522.
llvm-svn: 252540
The dominance of the generated non-affine subregion block was based on
the scop's merge block, therefore resulted in an invalid DominanceTree.
It resulted in some values as assumed to be unusable in the actual
generated exit block.
We detect the case that the exit block has been moved and decide
dominance using the BB at the original exit. If we create another exit
node, that exit nodes is dominated by the one generated from where the
original exit resides. This fixes llvm.org/PR25438 and part of
llvm.org/PR25439.
llvm-svn: 252526
It introduced indeterminism as it was iterating over an address-indexed
hashtable. The corresponding bug PR25438 will be fixed in a successive
commit.
llvm-svn: 252522
This reverts commit 9775824b265e574fc541e975d64d3e270243b59d due to a
failing unit test.
Please check and correct the unit test and commit again.
llvm-svn: 252449
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.
As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.
llvm-svn: 252445
After loop versioning, a dominance check of a non-affine subregion's
exit node causes the dominance check to always fail on any block in the
subregion if it shares the same exit block with the scop. The
subregion's exit block has become polly_merge_new_and_old, which also
receives the control flow of the generated code. This would cause that
any value for implicit stores is assumed to be not from the scop.
We check dominance with the generated exit node instead.
This fixes llvm.org/PR25438
llvm-svn: 252375
Remove all the implicit ilist iterator conversions from polly, in
preparation for making them illegal in ADT. There was one oddity I came
across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a
post-increment `Builder.GetInsertPoint()++`.
Since it was a no-op, I removed it, but I admit I wonder if it might be
a bug (both before and after this change)? Perhaps it should be a
pre-increment?
llvm-svn: 252357
We were adding all generated values in non-affine subregions to be used
for the subregions generated exit block. The thought was that only
values that are dominating the original exit block can be used there.
But it is possible for synthesizable values to be expanded in any
block. If the same values is also used for implicit writes, it would
try to reuse already synthesized values even if not dominating the exit
block.
The fix is to only add values to the list of values usable in the exit
block only if it is dominating the exit block. This fixes
llvm.org/PR25412.
llvm-svn: 252301
For generating scalar writes of non-affine subregions, all except phi
writes are generated in the exit block. The phi writes are generated in
the incoming block for which we errornously used the same BBMap. This
can conflict if a value for one block is synthesized, and then reused
for another block which is not dominated by the first block. This is
fixed by using block-specific BBMaps for phi writes.
llvm-svn: 252172
These maps are only needed during the construction of a single region statement.
Clearing them is important, as we otherwise get an assert in case some of the
referenced values are erased before the RegionGenerator is deleted.
llvm-svn: 251341
Such PHI nodes can not only appear in the ExitBlock of the Scop, but indeed
any scalar PHI node above the scop and used in the scop is modeled as scalar
read access.
llvm-svn: 251198
This change adds code to directly code-generate multi-exit PHI nodes, instead
of trying to reuse the EscapeMap infrastructure for this. Using escape maps
adds a level of indirection that is hard to understand and - more importantly -
breaks in certain cases.
Specifically, the original code relied on simplifyRegion() to split the original
PHI node in two PHI nodes, one merging the values coming from within the scop
and a second that merges the first PHI node with the values that come from
outside the scop. To generate code the first PHI node is then just handled like
any other in-scop value that is used somewhere outside the scop. This fails for
the case where all values from inside the scop are identical, as the first PHI
node is in such cases automatically simplified and eliminated by LLVM right at
construction. As a result, there is no instruction that can be pass to the
EscapeMap handling, which means the references in the second PHI node are not
updated and may still reference values from within the original scop that do not
dominate it.
Our new code iterates directly over all modeled ScopArrayInfo objects that
represent multi-exit PHI nodes and generates code for them without relying on
the EscapeMap infrastructure. Hence, it works also for the case where the first
PHI node is eliminated.
llvm-svn: 251191
New values were always synthesized in the block of the instruction
that needed them. This is incorrect for PHI node whose' value must be
defined in the respective incoming block. This patch temporarily moves
the builder's insert point to the incoming block while synthesizing phi
node arguments.
This fixes PR25241 (http://llvm.org/bugs/show_bug.cgi?id=25241)
llvm-svn: 250693
Expressing this in terms of BlockGenerator::getOrCreateAlloca(const
ScopArrayInfo *Array) does not work as the MemoryAccess BasePtr is in case of
invariant load hoisting different to the ScopArrayInfo BasePtr. Until this is
investigated and fixed, we move back to code that just uses the baseptr of
MemoryAccess.
llvm-svn: 250637
This allows the caller to get the alloca locations of an array without the
need to thank if Array is a PHI or a non-PHI Array. We directly make use of this
in BlockGenerator::getOrCreateAlloca(MemoryAccess &Access).
llvm-svn: 250628
Instead of generating implicit loads within basic blocks, put them
before the instructions of the statment itself, including non-affine
subregions. The region's entry node is dominating all blocks in the
region and therefore the loaded value will be available there.
Implicit writes in block-stmts were already stored back at the end of
the block. Now, also generate the stores of non-affine subregions when
leaving the statement, i.e. in the exiting block.
This change is required for array-mapped implicits ("De-LICM") to
ensure that there are no dependencies of demoted scalars within
statments. Statement load all required values, operator on copied in
registers, and then write back the changed value to the demoted memory.
Lifetimes analysis within statements becomes unecessary.
Differential Revision: http://reviews.llvm.org/D13487
llvm-svn: 250625
Instead of checking at code generation time for each ScopStmt if a scalar has
external uses, we just iterate over the ScopArrayInfo descriptions we have and
check each of these for possible external uses.
Besides being somehow clearer, this approach has the benefit that we will always
create valid LLVM-IR even in case we disable the code generation of ScopStmt
bodies e.g. for testing purposes.
llvm-svn: 250608
Helper functions in the BlockGenerators.h/cpp introduce dependences
from the frontend to the backend of Polly. As they are used in
ScopDetection, ScopInfo, etc. we move them to the ScopHelper file.
llvm-svn: 249919
This patch allows invariant loads to be used in the SCoP description,
e.g., as loop bounds, conditions or in memory access functions.
First we collect "required invariant loads" during SCoP detection that
would otherwise make an expression we care about non-affine. To this
end a new level of abstraction was introduced before
SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
if we want a load inside the region to be optimistically assumed
invariant or not. If we do, it will be marked as required and in the
SCoP generation we bail if it is actually not invariant. If we don't
it will be a non-affine expression as before. At the moment we
optimistically assume all "hoistable" (namely non-loop-carried) loads
to be invariant. This causes us to expand some SCoPs and dismiss them
later but it also allows us to detect a lot we would dismiss directly
if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
allow potential aliases between optimistically assumed invariant loads
and other pointers as our runtime alias checks are sound in case the
loads are actually invariant. Together with the invariant checks this
combination allows to handle a lot more than LICM can.
The code generation of the invariant loads had to be extended as we
can now have dependences between parameters and invariant (hoisted)
loads as well as the other way around, e.g.,
test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
First, it is important to note that we cannot have real cycles but
only dependences from a hoisted load to a parameter and from another
parameter to that hoisted load (and so on). To handle such cases we
materialize llvm::Values for parameters that are referred by a hoisted
load on demand and then materialize the remaining parameters. Second,
there are new kinds of dependences between hoisted loads caused by the
constraints on their execution. If a hoisted load is conditionally
executed it might depend on the value of another hoisted load. To deal
with such situations we sort them already in the ScopInfo such that
they can be generated in the order they are listed in the
Scop::InvariantAccesses list (see compareInvariantAccesses). The
dependences between hoisted loads caused by indirect accesses are
handled the same way as before.
llvm-svn: 249607
The use of const qualified Value pointers prevents the use of AssertingVH. We
could probably think of adding const support to AssertingVH, but as const
correctness seems to currently provide limited benefit in Polly, we do not do
this yet.
llvm-svn: 249266
There have been various places where llvm::DenseMap<const llvm::Value *,
llvm::Value *> types have been defined, but all types have been expected to be
identical. We make this more clear by consolidating the different types and use
BlockGenerator::ValueMapT wherever there is a need for types to match
BlockGenerator::ValueMapT.
llvm-svn: 249264
By using asserting value handles, we will get assertions when we forget to clear
any of the Value maps instead of difficult to debug undefined behavior.
llvm-svn: 249237
Because we handle more than SCEV does it is not possible to rewrite an
expression on the top-level using the SCEVParameterRewriter only. With
this patch we will do the rewriting on demand only and also
recursively, thus not only on the top-level.
llvm-svn: 248916
Instructions which we can synthesis from a SCEV expression are not
generated directly, but only when they are used as an operand of
another instruction. This avoids generating unnecessary instructions
and works more reliably than first inserting them and then deleting
them later on.
This commit was reverted in r248860 due to a remaining miscompile, where
we forgot to synthesis the operand values that were referenced from scalar
writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we do this
now correctly.
llvm-svn: 248900
As a first step in the direction of assumed invariant loads (loads
that are not written in some context) we now detect and hoist
definitively invariant loads. These invariant loads will be preloaded
in the code generation and used in the optimized version of the SCoP.
If the load is only conditionally executed the preloaded version will
also only be executed under the same condition, hence we will never
access memory that wouldn't have been accessed otherwise. This is also
the most distinguishing feature to licm.
As hoisting can make statements empty we will simplify the SCoP and
remove empty statements that would otherwise cause artifacts in the
code generation.
Differential Revision: http://reviews.llvm.org/D13194
llvm-svn: 248861
This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207.
The commit broke at least one test in lnt,
MultiSource/Benchmarks/Ptrdist/bc/number.c
was miss compiled and the test produced a wrong result.
One Polly test case that was added later was adjusted too.
llvm-svn: 248860
Every once in a while we see code that accesses memory with different types,
e.g. to perform operations on a piece of memory using type 'float', but to copy
data to this memory using type 'int'. Modeled in C, such codes look like:
void foo(float A[], float B[]) {
for (long i = 0; i < 100; i++)
*(int *)(&A[i]) = *(int *)(&B[i]);
for (long i = 0; i < 100; i++)
A[i] += 10;
}
We already used the correct types during normal operations, but fall back to our
detected type as soon as we import changed memory access functions. For these
memory accesses we may generate invalid IR due to a mismatch between the element
type of the array we detect and the actual type used in the memory access. To
address this issue, we always cast the newly created address of a memory access
back to the type of the memory access where the address will be used.
llvm-svn: 248781
Instructions which we can synthesis from a SCEV expression are not generated
directly, but only when they are used as an operand of another instruction. This
avoids generating unnecessary instruction and works more reliably than first
inserting them and then deleting them later on.
Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
Differential Revision: http://reviews.llvm.org/D13208
llvm-svn: 248712
This patch allows switch instructions with affine conditions in the
SCoP. Also switch instructions in non-affine subregions are allowed.
Both did not require much changes to the code, though there was some
refactoring needed to integrate them without code duplication.
In the llvm-test suite the number of profitable SCoPs increased from
135 to 139 but more importantly we can handle more benchmarks and user
inputs without preprocessing.
Differential Revision: http://reviews.llvm.org/D13200
llvm-svn: 248701
We now only delete trivially dead instructions in the BB we copy (copyBB), but
not in any other BB. Only for copyBB we know that there will _never_ be any
future uses of instructions that have no use after copyBB has been generated.
Other instructions in the AST that have been generated by IslNodeBuilder may
look dead at the moment, but may possibly still be referenced by GlobalMaps. If
we delete them now, later uses would break surprisingly.
We do not have a test case that breaks due to us deleting too many instructions.
This issue was found by inspection.
llvm-svn: 248688
After having generated a new user statement a couple of inefficient or
trivially dead instructions may remain. This commit runs instruction
simplification over the newly generated blocks to ensure unneeded
instructions are removed right away.
This commit does adds simplification for non-affine subregions which was not
yet part of 248681.
llvm-svn: 248683
After having generated a new user statement a couple of inefficient or trivially
dead instructions may remain. This commit runs instruction simplification over
the newly generated blocks to ensure unneeded instructions are removed right
away.
This commit does not yet add simplification for non-affine subregions.
llvm-svn: 248681
This commit basically reverts r246427 but still solves the issue
tackled by that commit. Instead of emitting initialization code in the
beginning of the start block we now generate parallel code in its own
block and thereby guarantee separation. This is necessary as we cannot
generate code for hoisted loads prior to the start block but it still
needs to be placed prior to everything else.
llvm-svn: 248674
There are three possible reasons to add a memory memory access: For explicit load and stores, for llvm::Value defs/uses, and to emulate PHI nodes (the latter two called implicit accesses). Previously MemoryAccess only stored IsPHI. Register accesses could be identified through the isScalar() method if it was no IsPHI. isScalar() determined the number of dimensions of the underlaying array, scalars represented by zero dimensions.
For the work on de-LICM, implicit accesses can have more than zero dimensions, making the distinction of isScalars() useless, hence now stored explicitly in the MemoryAccess. Instead, we replace it by isImplicit() and avoid the term scalar for zero-dimensional arrays as it might be confused with llvm::Value which are also often referred to as scalars (or alternatively, as registers).
No behavioral change intended, under the condition that it was impossible to create explicit accesses to zero-dimensional "arrays".
llvm-svn: 248616
All MemoryAccess objects will be owned by ScopInfo::AccFuncMap which
previously stored the IRAccess objects. Instead of creating new
MemoryAccess objects, the already created ones are reused, but their
order might be different now. Some fields of IRAccess and MemoryAccess
had the same meaning and are merged.
This is the last step of fusioning TempScopInfo.{h|cpp} and
ScopInfo.{h.cpp}. Some refactoring might still make sense.
Differential Revision: http://reviews.llvm.org/D12843
llvm-svn: 248024
While we do not need to model PHI nodes in the region exit (as it is not part
of the SCoP), we need to prepare for the case that the exit block is split in
code generation to create a single exiting block. If this will happen, hence
if the region did not have a single exiting block before, we will model the
operands of the PHI nodes as escaping scalars in the SCoP.
Differential Revision: http://reviews.llvm.org/D12051
llvm-svn: 247078
When this option is enabled, Polly will emit printf calls for each scalar
load/and store which dump the scalar value loaded/stored at run time.
This patch also refactors the RuntimeDebugBuilder to use variadic templates
when generating CPU printfs. As result, it now becomes easier to print
strings that consist of a set of arguments. Also, as a single printf
call is emitted, it is more likely for such strings to be emitted atomically
if executed multi-threaded.
llvm-svn: 246941
By inspection the update of the GlobalMaps in the RegionGenerator seems unneed,
and is removed as also no test cases fail when dropping this. Johannes Doerfert
confirmed that this is indeed save:
"I think that code was needed when we did not use the scalar codegen by default.
Now everything defined in a non-affine region should be communicated via memory
and reloaded in the user block. Hence, we should be good removing this code."
llvm-svn: 246926
The GlobalMap variable used in BlockGenerator should always reference the same
list througout the entire code generation, hence we can make it a member
variable to avoid passing it around through every function call.
History: Before we switched to the SCEV based code generation the GlobalMap
also contained a mapping form old to new induction variables, hence it was
different for each ScopStmt, which is why we passed it as function argument
to copyStmt. The new SCEV based code generation now uses a separate mapping
called LTS -> LoopToSCEV that maps each original loop to a new loop iteration
variable provided as a SCEVExpr. The GlobalMap is currently mostly used for
OpenMP code generation, where references to parameters in the original function
need to be rewritten to the locations of these variables after they have been
passed to the subfunction.
Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 246920
Our OpenMP code generation generated part of its launching code directly into
the start basic block and without this change the scalar initialization was
run _after_ the OpenMP threads have been launched. This resulted in
uninitialized scalar values to be used.
llvm-svn: 246427
Scalar dependences between scop statements have caused troubles during parallel
code generation as we did not pass on the new stack allocation created for such
scalars to the parallel subfunctions. This change now detects all scalar
reads/writes in parallel subfunctions, creates the allocas for these scalar
objects, passes the resulting memory locations to the subfunctions and ensures
that within the subfunction requests for these memory locations will return the
rewritten values.
Johannes suggested as a future optimization to privatizing some of the scalars
in the subfunction.
llvm-svn: 246414
We already modeled read-only dependences to scalar values defined outside the
scop as memory reads and also generated read accesses from the corresponding
alloca instructions that have been used to pass these scalar values around
during code generation. However, besides for PHI nodes that have already been
handled, we failed to store the orignal read-only scalar values into these
alloc. This commit extends the initialization of scalar values to all read-only
scalar values used within the scop.
llvm-svn: 246394
The current code really tries hard to use getNewScalarValue(), which checks if
not the original value, but a possible copy or demoted value needs to be stored.
In this calling context it seems, that we _always_ use the ScalarValue that
comes from the incoming PHI node, but never any other value. As also no test
cases fail, it seems right to just drop this call to getNewScalarValue and
remove the parameters that are not needed any more.
Johannes suggested that code like this might be needed for parallel code
generation with offloading, but it was still unclear if/what exactly would
be needed. As the parallel code generation does currently not support scalars
at all, we will remove this code for now and add relevant code back when
complitng the support of scalars in the parallel code generation.
Reviewers: jdoerfert
Subscribers: pollydev, llvm-commits
Differential Revision: http://reviews.llvm.org/D12470
llvm-svn: 246389
Our code generation currently does not support scalar references to metadata
values. Hence, it would crash if we try to model scalar dependences to metadata
values. Fortunately, for one of the common uses, debug information, we can
for now just ignore the relevant intrinsics and consequently the issue of how
to model scalar dependences to metadata.
llvm-svn: 246388
This commit drops some dead code. Specifically, there is no need to initialize
the virtual memory locations of scalars in BlockGenerator::handleOutsideUsers,
the function that initalizes the escape map that keeps track of out-of-scope
uses of scalar values. We already model instructions inside the scop that
are used outside the scope (escaping instructions) as scalar memory writes at
the position of the instruction. As a result, the virtual memory location of
this instructions is already initialized when code-generating the corresponding
virtual scalar write and consequently does not need to be initialized later on
when generating the set of escaping values.
Code references:
In TempScopInfo::buildScalarDependences we detect scalar cross-statement
dependences for all instructions (including PHIs) that have uses outside of the
scop's region:
// Check whether or not the use is in the SCoP.
if (!R->contains(UseParent)) {
AnyCrossStmtUse = true;
continue;
}
We use this information in TempScopInfo::buildAccessFunctions were we build
scalar write memory accesses for all these instructions:
if (!isa<StoreInst>(Inst) &&
buildScalarDependences(Inst, &R, NonAffineSubRegion)) {
// If the Instruction is used outside the statement, we need to build the
// write access.
IRAccess ScalarAccess(IRAccess::MUST_WRITE, Inst, ZeroOffset, 1, true,
Inst);
Functions.push_back(std::make_pair(ScalarAccess, Inst));
}
Reviewers: jdoerfert
Subscribers: pollydev, llvm-commits
Differential Revision: http://reviews.llvm.org/D12472
llvm-svn: 246383
For external users, the memory locations into which we generate scalar values
may be of interest. This change introduces two functions that allow to obtain
(or create) the AllocInsts for a given BasePointer.
We use this change to simplify the code in BlockGenerators.
llvm-svn: 246285
This change allows the BlockGenerator to be reused in contexts where we want to
provide different/modified isl_ast_expressions, which are not only changed to
a different access relation than the original statement, but which may indeed
be different for each code-generated instance of the statement.
We ensure testing of this feature by moving Polly's support to import changed
access functions through a jscop file to use the BlockGenerators support for
generating arbitary access functions if provided.
This commit should not change the behavior of Polly for now. The diff is rather
large, but most changes are due to us passing the NewAccesses hash table through
functions. This style, even though rather verbose, matches what is done
throughout the BlockGenerator with other per-statement properties.
llvm-svn: 246144
The SCEVExpander cannot deal with all SCEVs Polly allows in all kinds
of expressions. To this end we introduce a ScopExpander that handles
the additional expressions separatly and falls back to the
SCEVExpander for everything else.
Reviewers: grosser, Meinersbur
Subscribers: #polly
Differential Revision: http://reviews.llvm.org/D12066
llvm-svn: 245288
The new field in the MemoryAccess allows us to track a value related
to that access:
- For real memory accesses the value is the loaded result or the
stored value.
- For straigt line scalar accesses it is the access instruction
itself.
- For PHI operand accesses it is the operand value.
We use this value to simplify code which deduced information about the value
later in the Polly pipeline and was known to be error prone.
Reviewers: grosser, Meinsersbur
Subscribers: #polly
Differential Revision: http://reviews.llvm.org/D12062
llvm-svn: 245213
This allows the code generation to continue working even if a needed
value (that is reloaded anyway) was not yet demoted. Instead of
failing it will now create the location for future demotion to memory
and load from that location. The stores will use the same location and
by construction execute before the load even if the textual order in
the generated AST is otherwise.
Reviewers: grosser, Meinersbur
Subscribers: #polly
Differential Revision: http://reviews.llvm.org/D12072
llvm-svn: 245203
This change extends the BlockGenerator to not only allow Instructions as
base elements of scalar dependences, but any llvm::Value. This allows
us to code-generate scalar dependences which reference function arguments, as
they arise when moddeling read-only scalar dependences.
llvm-svn: 244874
Even though read-only accesses to scalars outside of a scop do not need to be
modeled to derive valid transformations or to generate valid sequential code,
but information about them is useful when we considering memory footprint
analysis and/or kernel offloading.
llvm-svn: 243981