llvm-project

Author	SHA1	Message	Date
Tobias Grosser	132860afe5	[ScopInfo] Move ScopStmt::setAstBuild/getAstBuild to isl++ llvm-svn: 310216	2017-08-06 17:53:04 +00:00
Tobias Grosser	dcf8d696ff	Move ScopInfo::getDomain(), getDomainSpace(), getDomainId() to isl++ llvm-svn: 310209	2017-08-06 16:39:52 +00:00
Siddharth Bhat	e53c924b0f	[Polly] [PPCGCodeGeneration] Deal with loops outside the Scop correctly in PPCGCodeGeneration. A Scop with a loop outside it is not handled currently by PPCGCodeGeneration. The test case is such that the Scop has only one inner loop that is detected. This currently breaks codegen. The fix is to reuse the existing mechanism in `IslNodeBuilder` within `GPUNodeBuilder`. Differential Revision: https://reviews.llvm.org/D36290 llvm-svn: 310193	2017-08-06 02:39:05 +00:00
Siddharth Bhat	0caed1fbe6	[IslNodeBuilder] [NFC] Refactor creation of loop induction variables of loops outside scops. This logic is duplicated, so we refactor it into a separate function. This will be used in a later patch to teach PPCGCodeGen code generation for loops that are outside the scop. Differential Revision: https://reviews.llvm.org/D36310 llvm-svn: 310192	2017-08-06 02:07:11 +00:00
Siddharth Bhat	f2cfd2a4db	[NFC] [IslNodeBuilder, GPUNodeBuilder] Unify mechanism for looking up replacement Values. We populate `IslNodeBuilder::ValueMap` which contains replacements for `llvm::Value`s. There was no simple method to pick up a replacement if it exists, otherwise fall back to the original. Create a method `IslNodeBuilder::getLatestValue` which provides this functionality. This will be used in a later patch to fix bugs in `PPCGCodeGeneration` where the latest value is not being used. Differential Revision: https://reviews.llvm.org/D36000 llvm-svn: 309674	2017-08-01 12:15:51 +00:00
Tobias Grosser	7639db8ed9	[IslNodeBuilder] Remove unused instruction Suggested-by: Maximilian Falkenstein <falkensm@student.ethz.ch> llvm-svn: 309533	2017-07-31 01:59:23 +00:00
Tobias Grosser	3b196131b5	Move applyScheduleToAccessRelation to isl++ llvm-svn: 308842	2017-07-23 04:08:52 +00:00
Tobias Grosser	6a87036e0f	Move MemoryAccess::getAddressFunction to isl++ llvm-svn: 308841	2017-07-23 04:08:45 +00:00
Tobias Grosser	1515f6b937	Move MemoryAccess::NewAccessRelation to isl++ We also move related accessor functions llvm-svn: 308840	2017-07-23 04:08:38 +00:00
Tobias Grosser	fe46c3ff3a	Move MemoryAccess::id to isl++ llvm-svn: 308836	2017-07-23 04:08:11 +00:00
Tobias Grosser	77eef90f50	Move ScopArrayInfo to isl++ This moves the full ScopArrayInfo class to isl++ llvm-svn: 308801	2017-07-21 23:07:56 +00:00
Tobias Grosser	1eeedf4829	[IslNodeBuilder] Relax complexity check in invariant loads and run it early When performing invariant load hoisting we check that invariant load expressions are not too complex. Up to this commit, we performed this check by counting the sum of dimensions in the access range as a very simple heuristic. This heuristic is a little too conservative, as it prevents hoisting for any scops with a very large number of parameters. Hence, we update the heuristic to only count existentially quantified dimensions and set dimensions. We expect this to still detect the problematic expressions in h264 because of which this check was originally introduced. For some unknown reason, this complexity check was originally committed in IslNodeBuilder. It really belongs in ScopInfo, as there is no point in optimizing a program which we could have known earlier cannot be code generated. The benefit of running the check early is that we can avoid to even hoist checks that are expensive to code generate as invariant loads. This can be seen in the changed tests, where we now indeed detect the scop, but just not invariant load hoist the complicated access. We also improve the formatting of the code, document it, and use isl++ to simplify expressions. llvm-svn: 308659	2017-07-20 19:55:19 +00:00
Siddharth Bhat	a1b2086a33	[Invariant Loads] Do not consider invariant loads to have dependences. We need to relax constraints on invariant loads so that they do not create fake RAW dependences. So, we do not consider invariant loads as scalar dependences in a region. During these changes, it turned out that we do not consider `llvm::Value` replacements correctly within `PPCGCodeGeneration` and `ISLNodeBuilder`. The replacements dictated by `ValueMap` were not being followed in all places. This was fixed in this commit. There is no clean way to decouple this change because this bug only seems to arise when the relaxed version of invariant load hoisting was enabled. Differential Revision: https://reviews.llvm.org/D35120 llvm-svn: 307907	2017-07-13 12:18:56 +00:00
Michael Kruse	b738ffa845	Heap allocation for new arrays. This patch aims to implement the option of allocating new arrays created by polly on heap instead of stack. To enable this option, a key named 'allocation' must be written in the imported json file with the value 'heap'. We need such a feature because in a next iteration, we will implement a mechanism of maximal static expansion which will need a way to allocate arrays on heap. Indeed, the expansion is very costly in terms of memory and doing the allocation on stack is not worth considering. The malloc and the free are added respectively at polly.start and polly.exiting such that there is no use-after-free (for instance in case of Scop in a loop) and such that all memory cells allocated with a malloc are free'd when we don't need them anymore. We also add: - In the class ScopArrayInfo, we add a boolean member called IsOnHeap which represents whether the array is allocated on heap or not. - A new branch in the method allocateNewArrays in the ISLNodeBuilder for the case of heap allocation. allocateNewArrays now takes a BBPair containing polly.start and polly.exiting. allocateNewArrays takes these two blocks and adds the malloc and free calls respectively to polly.start and polly.exiting. - As IntPtrTy for the malloc call, we use the DataLayout one. To do that, we have modified: - createScopArrayInfo and getOrCreateScopArrayInfo such that they return a non-const SAI, in order to be able to call setIsOnHeap in the JSONImporter. - executeScopConditionally such that it returns both the start block and the end block of the scop, because we need these two blocks to be able to add the malloc and the free calls at the right position. Differential Revision: https://reviews.llvm.org/D33688 llvm-svn: 306540	2017-06-28 13:02:43 +00:00
Michael Kruse	a6d48f59a1	Fix a lot of typos. NFC. llvm-svn: 304974	2017-06-08 12:06:15 +00:00
Michael Kruse	706f79ab14	[CodeGen] Support partial write accesses. Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then execute the access. Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed. Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read). Differential Revision: https://reviews.llvm.org/D33255 llvm-svn: 303517	2017-05-21 22:46:57 +00:00
Siddharth Bhat	b7f68b8c9e	[Fortran Support] Materialize outermost dimension for Fortran array. - We use the outermost dimension of arrays since we need this information to generate GPU transfers. - In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code. - However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays. - This patch uses the Fortran array representation to generate code that computes the outermost dimension size. Differential Revision: https://reviews.llvm.org/D32967 llvm-svn: 303429	2017-05-19 15:07:45 +00:00
Tobias Grosser	f3adab4c20	[Polly] Canonicalize arrays according to base-ptr equivalence class Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We removed already all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct to the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636	2017-05-10 10:59:58 +00:00
Hongbin Zheng	0f8f177682	[Polly] Do not introduce address space cast Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally. Differential Revision: https://reviews.llvm.org/D32581 llvm-svn: 301519	2017-04-27 06:42:14 +00:00
Matt Arsenault	b3e30c32ce	Update for alloca construction changes llvm-svn: 299905	2017-04-11 00:12:58 +00:00
Philip Pfaffe	2d950f36ee	[Polly][NewPM] Pull references to the legacy PM interface from utilities and helpers Summary: A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly. This shouldn't introduce any function changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: nemanjai, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31653 llvm-svn: 299423	2017-04-04 10:01:53 +00:00
Roman Gareev	cdfb57dc46	Introduce another level of metadata to distinguish non-aliasing accesses Introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. It can be used to, for example, distinguish different stores (loads) produced by unrolling of the innermost loops and, subsequently, sink (hoist) them by LICM. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30606 llvm-svn: 298510	2017-03-22 14:25:24 +00:00
Roman Gareev	23df27682a	Map the new load to the base pointer of the invariant load hoisted load Map the new load to the base pointer of the invariant load hoisted load to be able to find the alias information for it. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30605 llvm-svn: 298507	2017-03-22 13:57:53 +00:00
Tobias Grosser	b28f86e9e6	[CodeGen] Remove need for all parameters to be in scop context for load hoisting. When not adding constraints on parameters using -polly-ignore-parameter-bounds, the context may not necessarily list all parameter dimensions. To support code generation in this situation, we now always iterate over the actual parameter list, rather than relying on the context to list all parameter dimensions. llvm-svn: 298197	2017-03-18 23:12:49 +00:00
Michael Kruse	52ab4943b4	Remove all references to PostDominators. NFC. Marking a pass as preserved is necessary if any Polly pass uses it, even if it is not preserved within the generated code. Not marking it would cause the Polly pass chain to be interrupted. It is not used by any Polly pass anymore, hence we can remove all references to it. llvm-svn: 295983	2017-02-23 15:16:22 +00:00
Tobias Grosser	ff40087a6a	Update to recent formatting changes llvm-svn: 293756	2017-02-01 10:12:09 +00:00
Tobias Grosser	587f1f57ad	[Polly] [BlockGenerator] Unify ScalarMap and PhiOpsMap Instead of keeping two separate maps from Value to Allocas, one for MemoryType::Value and the other for MemoryType::PHI, we introduce a single map from ScopArrayInfo to the corresponding Alloca. This change is intended, both as a general simplification and cleanup, but also to reduce our use of MemoryAccess::getBaseAddr(). Moving away from using getBaseAddr() makes sure we have only a single place where the array (and its base pointer) for which we generate code for is specified, which means we can more easily introduce new access functions that use a different ScopArrayInfo as base. We already today experiment with modifiable access functions, so this change does not address a specific bug, but it just reduces the scope one needs to reason about. Another motivation for this patch is https://reviews.llvm.org/D28518, where memory accesses with different base pointers could possibly be mapped to a single ScopArrayInfo object. Such a mapping is currently not possible, as we currently generate alloca instructions according to the base addresses of the memory accesses, not according to the ScopArrayInfo object they belong to. By making allocas ScopArrayInfo specific, a mapping to a single ScopArrayInfo object will automatically mean that the same stack slot is used for these arrays. For D28518 this is not a problem, as only MemoryType::Array objects are mapping, but resolving this inconsistency will hopefully avoid confusion. llvm-svn: 293374	2017-01-28 07:42:10 +00:00
Tobias Grosser	e1ff0cf2eb	Relax assert when setting access functions with invariant base pointers Summary: Instead of forbidding such access functions completely, we verify that their base pointer has been hoisted and only assert in case the base pointer was not hoisted. I was trying for a little while to get a test case that ensures the assert is correctly fired in case of invariant load hoisting being disabled, but I could not find a good way to do so, as llvm-lit immediately aborts if a command yields a non-zero return value. As we do not generally test our asserts, not having a test case here seems OK. This resolves http://llvm.org/PR31494 Suggested-by: Michael Kruse <llvm@meinersbur.de> Reviewers: efriedma, jdoerfert, Meinersbur, gareevroman, sebpop, zinob, huihuiz, pollydev Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D28798 llvm-svn: 292213	2017-01-17 12:00:42 +00:00
Tobias Grosser	21a059af09	Adjust formatting to commit r292110 [NFC] llvm-svn: 292123	2017-01-16 14:08:10 +00:00
Roman Gareev	bd5c6039c6	Align newly created arrays to the first level cache line boundary Aligning data to cache lines boundaries helps to avoid overheads related to an access to it ([1]). This patch aligns newly created arrays and adds an option to specify the first level cache line size. By default we use 64 bytes, which is a typical cache-line size ([2]). In case of Intel Core i7-3820 SandyBridge and the following options, clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8 -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm -polly-target-latency-vector-fma=8 it helps to improve the performance from 11.303 GFlops/sec (39,247% of theoretical peak) to 12.63 GFlops/sec (43,8542% of theoretical peak). Refs.: [1] - http://www.alexonlinux.com/aligned-vs-unaligned-memory-access [2] - http://igoro.com/archive/gallery-of-processor-cache-effects/ Differential Revision: https://reviews.llvm.org/D28020 Reviewed-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 290253	2016-12-21 12:37:36 +00:00
Tobias Grosser	dc6b87c56e	Add newline at end of debug print In '[DBG] Allow to emit the RTC value at runtime' the diagnostics were printed without a newline at the end of each diagnostic. We add such a newline to improve readability. llvm-svn: 288323	2016-12-01 08:08:47 +00:00
Michael Kruse	11c5e07925	canSynthesize: Remove unused argument LI. NFC. The helper function polly::canSynthesize() does not directly use the LoopInfo analysis, hence remove it from its argument list. llvm-svn: 288144	2016-11-29 15:11:04 +00:00
Tobias Grosser	df8f35b7b8	Update for clang-format change in r288119 llvm-svn: 288134	2016-11-29 12:52:08 +00:00
Tobias Grosser	b3c3d149b9	[CodeGen] Add flag to code-generate most memory access expressions Introduce the new flag -polly-codegen-generate-expressions which forces Polly to code generate AST expressions instead of using our SCEV based access expression generation even for cases where the original memory access relation was not changed and the SCEV based access expression could be code generated without any issue. This is an experimental option for better testing the isl ast expression generation. The default behavior of Polly remains unchanged. We also exclude a couple of cases for which the AST expression is not yet working. llvm-svn: 287694	2016-11-22 20:21:16 +00:00
Johannes Doerfert	81aa6e882f	[NFC] Adjust naming scheme of statistic variables Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 287347	2016-11-18 14:37:08 +00:00
Johannes Doerfert	dae2e9287d	[DBG] Collect statistics about actually versioned SCoPs llvm-svn: 287267	2016-11-17 21:55:43 +00:00
Johannes Doerfert	8c5464a715	[DBG] Allow to emit the RTC value at runtime The new command line flag "polly-codegen-emit-rtc-print" can be used to place a "printf" in the generated code that will print the RTC value and the overflow state. llvm-svn: 287265	2016-11-17 21:49:19 +00:00
Tobias Grosser	16480186f8	IslNodeBuilder: Ensure newly generated memory accesses are well-defined Add some additional asserts that ensure newly code-generated memory accesses are defined on all domain and schedule domain instances. llvm-svn: 286050	2016-11-05 21:46:01 +00:00
Eli Friedman	acf8006471	[Polly CodeGen] Break critical edge from RTC to original loop. This makes polly generate a CFG which is closer to what we want in LLVM IR, with a loop preheader for the original loop. This is just a cleanup, but it exposes some fragile assumptions. I'm not completely happy with the changes related to expandCodeFor; RTCBB->getTerminator() is basically a random insertion point which happens to work due to the way we generate runtime checks. I'm not sure what the right answer looks like, though. Differential Revision: https://reviews.llvm.org/D26053 llvm-svn: 285864	2016-11-02 22:32:23 +00:00
Eli Friedman	3c1a75bf9c	Handle multi-dimensional invariant load. If the address of a load depends on another load, make sure to emit the loads in the right order. llvm-svn: 284426	2016-10-17 21:04:26 +00:00
Michael Kruse	4b0c5aea78	[CodeGen] Add assertion for indirect array index expression generation. NFC. Currently Polly cannot generate code for index expressions if the base pointer is computed within the scop. The base pointer must be generated as well, but there is no code that triggers that. Add an assertion to detect when this would occur and miscompile. The IR verifier should catch it as well. llvm-svn: 282893	2016-09-30 18:29:37 +00:00
Roman Gareev	b3224adfb6	Perform copying to created arrays according to the packing transformation This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23260 llvm-svn: 281441	2016-09-14 06:26:09 +00:00
Roman Gareev	f5aff70405	Store the size of the outermost dimension in case of newly created arrays that require memory allocation. We do not need the size of the outermost dimension in most cases, but if we allocate memory for newly created arrays, that size is needed. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D23991 llvm-svn: 281234	2016-09-12 17:08:31 +00:00
Tobias Grosser	a3afe44d6c	IslNodeBuilder: Add missing __isl_take annotation llvm-svn: 281034	2016-09-09 11:16:50 +00:00
Tobias Grosser	f3600dfa2d	IslNodeBuilder: Add missing __isl_take annotations llvm-svn: 280936	2016-09-08 13:48:55 +00:00
Tobias Grosser	c80d6979bd	Drop '@brief' from doxygen comments LLVM's coding guideline suggests to not use @brief for one-sentence doxygen comments to improve readability. Switch this once and for all to ensure people do not copy @brief comments from other parts of Polly, when writing new code. llvm-svn: 280468	2016-09-02 06:33:33 +00:00
Tobias Grosser	fa9abd1f03	Fix compilation in 'asserts' mode llvm-svn: 278025	2016-08-08 17:35:52 +00:00
Tobias Grosser	0aa29532b7	[IslNodeBuilder] Move run-time check generation to NodeBuilder [NFC] This improves the structure of the code and allows us to reuse the runtime code generation in the PPCGCodeGeneration. llvm-svn: 278017	2016-08-08 15:41:52 +00:00
Tobias Grosser	000db70754	[IslNodeBuilder] Directly use the insert location of our Builder ... instead of adding instructions at the end of the basic block the builder is currently at. This makes it easier to reason about where IR is generated, as with the IRBuilder there is just a single location that specifies where IR is generated. llvm-svn: 278013	2016-08-08 15:25:46 +00:00
Tobias Grosser	00bb5a99f5	GPGPU: Handle scalar array references Pass the content of scalar array references to the alloca on the kernel side and do not pass them additionally as normal LLVM scalar values. llvm-svn: 277699	2016-08-04 06:55:59 +00:00
Tobias Grosser	2219d15748	Fix a couple of spelling mistakes llvm-svn: 277569	2016-08-03 05:28:09 +00:00
Roman Gareev	d7754a1245	Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions Extend the jscop interface to allow the user to export arrays. It is required that already existing arrays of the list of arrays correspond to arrays of the SCoP. Each array that is appended to the list will be newly created. Furthermore, we allow the user to modify access expressions to reference any array in case it has the same element type. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D22828 llvm-svn: 277263	2016-07-30 09:25:51 +00:00
Tobias Grosser	86083da0ec	IslNodeBuilder: expose addReferencesFromStmt [NFC] This will be used by Polly GPGPU to determine the values that need to be passed to GPU kernels. llvm-svn: 276269	2016-07-21 13:15:55 +00:00
Tobias Grosser	faef9a7667	Fix gcc compile failure Commit r275056 introduced a gcc compile failure due to us using two types named 'Type', the first being the newly introduced member variable 'Type' the second being llvm::Type. We resolve this issue by renaming the newly introduced member variable to AccessType. llvm-svn: 275057	2016-07-11 12:27:04 +00:00
Tobias Grosser	4e2d9c45b9	InvariantEquivClassTy: Use struct instead of 4-tuple to increase readability Summary: With a struct we can use named accessors instead of generic std::get<3>() calls. This increases readability of the source code. Reviewers: jdoerfert Subscribers: pollydev, llvm-commits Differential Revision: http://reviews.llvm.org/D21955 llvm-svn: 275056	2016-07-11 12:15:10 +00:00
Tobias Grosser	3717aa5ddb	This reverts recent expression type changes The recent expression type changes still need more discussion, which will happen on phabricator or on the mailing list. The precise list of commits reverted are: - "Refactor division generation code" - "[NFC] Generate runtime checks after the SCoP" - "[FIX] Determine insertion point during SCEV expansion" - "Look through IntToPtr & PtrToInt instructions" - "Use minimal types for generated expressions" - "Temporarily promote values to i64 again" - "[NFC] Avoid unnecessary comparison for min/max expressions" - "[Polly] Fix -Wunused-variable warnings (NFC)" - "[NFC] Simplify min/max expression generation" - "Simplify the type adjustment in the IslExprBuilder" Some of them are just reverted as we would otherwise get conflicts. I will try to re-commit them if possible. llvm-svn: 272483	2016-06-11 19:17:15 +00:00
Johannes Doerfert	0767a511ba	Use minimal types for generated expressions We now use the minimal necessary bit width for the generated code. If operations might overflow (add/sub/mul) we will try to adjust the types in order to ensure a non-wrapping computation. If the type adjustment is not possible, thus the necessary type is bigger than the type value of --polly-max-expr-bit-width, we will use assumptions to verify the computation will not wrap. However, for run-time checks we cannot build assumptions but instead utilize overflow tracking intrinsics. llvm-svn: 271878	2016-06-06 09:57:41 +00:00
Matthew Simpson	acae9e3b30	[Polly] Fix -Wunused-variable warnings (NFC) llvm-svn: 271518	2016-06-02 14:26:38 +00:00
Johannes Doerfert	d36553753e	Simplify the type adjustment in the IslExprBuilder We now have a simple function to adjust/unify the types of two (or three) operands before an operation that requires the same type for all operands. Due to this change we will not promote parameters that are added to i64 anymore if that is not needed. llvm-svn: 271513	2016-06-02 11:15:57 +00:00
Johannes Doerfert	0f0d209bec	Use the SCoP directly for canSynthesize [NFC] llvm-svn: 270429	2016-05-23 12:47:09 +00:00
Johannes Doerfert	ef74443c97	Duplicate part of the Region interface in the Scop class [NFC] This allows to use the SCoP directly for various queries, thus to hide the underlying region more often. llvm-svn: 270426	2016-05-23 12:42:38 +00:00
Johannes Doerfert	952b5304bc	Add and use Scop::contains(Loop/BasicBlock/Instruction) [NFC] llvm-svn: 270424	2016-05-23 12:40:48 +00:00
Johannes Doerfert	a61eda7698	[FIX] Let ScalarEvolution forget hoisted values We have to rethink the handling of escaping values in order to make this kind of "fixes" go away. llvm-svn: 270409	2016-05-23 09:02:54 +00:00
Johannes Doerfert	404a0f81ea	Check overflows in RTCs and bail accordingly We utilize assumptions on the input to model IR in polyhedral world. To verify these assumptions we version the code and guard it with a runtime-check (RTC). However, since the RTCs are themselves generated from the polyhedral representation we generate them under the same assumptions that they should verify. In other words, the guarantees that we try to provide with the RTCs do not hold for the RTCs themselves. To this end it is necessary to employ a different check for the RTCs that will verify the assumptions did hold for them too. Differential Revision: http://reviews.llvm.org/D20165 llvm-svn: 269299	2016-05-12 15:12:43 +00:00
Johannes Doerfert	e243753a4d	Simplify access relation for invariant loads early [NFC] llvm-svn: 269046	2016-05-10 11:59:59 +00:00
Johannes Doerfert	5f173d414e	Prevent complex access ranges with low number of pieces. Previously we checked the number of pieces to decide whether or not an invariant load was too complex to be generated. However, there are cases when e.g., divisions cause the complexity to spike regardless of the number of pieces. To this end we now check the number of totally involved dimensions which will increase with the number of pieces but also the number of divisions. llvm-svn: 269045	2016-05-10 11:46:57 +00:00
Michael Kruse	bc150127ae	Rename Conjuncts -> Disjunctions. NFC. The check for complexity compares the number of polyhedra in a set, which are combined by disjunctions (union, "OR"), not conjunctions (intersection, "AND"). llvm-svn: 268223	2016-05-02 12:25:18 +00:00
Johannes Doerfert	8ab2803b63	[FIX] Propagate execution domain of invariant loads If the base pointer of an invariant load is loaded conditionally, that condition needs to hold for the invariant load too. The structure of the program will imply this for domain constraints but not for imprecisions in the modeling. To this end we will propagate the execution context of base pointers during code generation and thus ensure the derived pointer does not access an invalid base pointer. llvm-svn: 267707	2016-04-27 12:49:11 +00:00
Johannes Doerfert	a9dc529442	Collect and verify generated parallel subfunctions We verify the optimized function now for a long time and it helped to track down bugs early. This will now also happen for all parallel subfunctions we generate. llvm-svn: 265823	2016-04-08 18:16:02 +00:00
Johannes Doerfert	7b81103589	[FIX] Look through div & srem instructions in SCEVs The findValues() function did not look through div & srem instructions that were part of the argument SCEV. However, in different other places we already look through it. This mismatch caused us to preload values in the wrong order. llvm-svn: 265775	2016-04-08 10:25:58 +00:00
Michael Kruse	c7e0d9c216	Fix non-synthesizable loop exit values. Polly recognizes affine loops that ScalarEvolution does not, in particular those with loop conditions that depend on hoisted invariant loads. Check for SCEVAddRec dependencies on such loops and do not consider their exit values as synthesizable because SCEVExpander would generate them as expressions that depend on the original induction variables. These are not available in generated code. llvm-svn: 262404	2016-03-01 21:44:06 +00:00
Johannes Doerfert	abadd71da1	[FIX] Prevent compile time problems due to complex invariant loads This cures the symptoms we see in h264 of SPEC2006 but not the cause. llvm-svn: 262327	2016-03-01 13:05:14 +00:00
Michael Kruse	6f7721f02b	Introduce Scop::getStmtFor. NFC. Replace Scop::getStmtForBasicBlock and Scop::getStmtForRegionNode, and add overloads for llvm::Instruction and llvm::RegionNode. getStmtFor and overloads become the common interface to get the Stmt that contains something. Named after LoopInfo::getLoopFor and RegionInfo::getRegionFor. llvm-svn: 261791	2016-02-24 22:08:19 +00:00
Roman Gareev	11001e1534	Annotation of SIMD loops Use 'mark' nodes to annotate a SIMD loop during ScheduleTransformation and skip parallelism checks. The buildbot shows the following compile/execution time changes: Compile time: Improvements Δ Previous Current σ …/gesummv -6.06% 0.2640 0.2480 0.0055 …/gemver -4.46% 0.4480 0.4280 0.0044 …/covariance -4.31% 0.8360 0.8000 0.0065 …/adi -3.23% 0.9920 0.9600 0.0065 …/doitgen -2.53% 0.9480 0.9240 0.0090 …/3mm -2.33% 1.0320 1.0080 0.0087 Execution time: Regressions Δ Previous Current σ …/viterbi 1.70% 5.1840 5.2720 0.0074 …/smallpt 1.06% 12.4920 12.6240 0.0040 Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D14491 llvm-svn: 261620	2016-02-23 09:00:13 +00:00
Johannes Doerfert	6a7c3e4bac	Set AST Build for all statements [NFC] llvm-svn: 260956	2016-02-16 12:11:03 +00:00
Johannes Doerfert	96e5471139	Separate invariant equivalence classes by type We now distinguish invariant loads to the same memory location if they have different types. This will cause us to pre-load an invariant location once for each type that is used to access it. However, we can thereby avoid invalid casting, especially if an array is accessed through different typed/sized invariant loads. This basically reverts the changes in r260023 but keeps the test cases. llvm-svn: 260045	2016-02-07 17:30:13 +00:00
Johannes Doerfert	adeab372ca	Simplify code [NFC] llvm-svn: 260030	2016-02-07 13:57:32 +00:00
Tobias Grosser	107cd5f5f6	IslNodeBuilder: Invariant load hoisting of elements with differing sizes Always use access-instruction pointer type to load the invariant values. Otherwise mismatches between ScopArrayInfo element type and memory access element type will result in invalid casts. Since r259784 these type mismatches are a lot more common and also arise with types of different size, which have not been handled before. Interestingly, this change actually simplifies the code, as we now have only one code path that is always taken, rather than a standard code path for the common case and a "fixup" code path that replaces the standard code path in case of mismatching types. llvm-svn: 260009	2016-02-06 21:23:39 +00:00
Tobias Grosser	d840fc7277	Support accesses with differently sized types to the same array This allows code such as: void multiple_types(char *Short, char *Float, char *Double) { for (long i = 0; i < 100; i++) { Short[i] = *(short *)&Short[2 * i]; Float[i] = *(float *)&Float[4 * i]; Double[i] = *(double *)&Double[8 * i]; } } To model such code we use as canonical element type of the modeled array the smallest element type of all original array accesses, if type allocation sizes are multiples of each other. Otherwise, we use a newly created iN type, where N is the gcd of the allocation size of the types used in the accesses to this array. Accesses with types larger than the canonical element type are modeled as multiple accesses with the smaller type. For example the second load access is modeled as: { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 } To support code-generating these memory accesses, we introduce a new method getAccessAddressFunction that assigns each statement instance a single memory location, the address we load from/store to. Currently we obtain this address by taking the lexmin of the access function. We may consider keeping track of the memory location more explicitly in the future. We currently do _not_ handle multi-dimensional arrays and also keep the restriction of not supporting accesses where the offset expression is not a multiple of the access element type size. This patch adds tests that ensure we correctly invalidate a scop in case these accesses are found. Both types of accesses can be handled using the very same model, but are left to be added in the future. We also move the initialization of the scop-context into the constructor to ensure it is already available when invalidating the scop. Finally, we add this as a new item to the 2.9 release notes Reviewers: jdoerfert, Meinersbur Differential Revision: http://reviews.llvm.org/D16878 llvm-svn: 259784	2016-02-04 13:18:42 +00:00
Michael Kruse	70131d3416	Introduce MemAccInst helper class; NFC MemAccInst wraps the common members of LoadInst and StoreInst. Also use of this class in: - ScopInfo::buildMemoryAccess - BlockGenerator::generateLocationAccessed - ScopInfo::addArrayAccess - Scop::buildAliasGroups - Replace every use of polly::getPointerOperand Reviewers: jdoerfert, grosser Differential Revision: http://reviews.llvm.org/D16530 llvm-svn: 258947	2016-01-27 17:09:17 +00:00
Johannes Doerfert	370cf00c9f	Make sure we preserve alignment information after hoisting invariant load In Polly, after hoisting loop-invariant loads outside the loop, the alignment information for the hoisted loads is missing; this patch restores it. Contributed-by: Lawrence Hu <lawrence@codeaurora.org> Differential Revision: http://reviews.llvm.org/D16160 llvm-svn: 258105	2016-01-19 00:17:21 +00:00
Tobias Grosser	a535dff471	ScopInfo: Harmonize the different array kinds Over time different vocabulary has been introduced to describe the different memory objects in Polly, resulting in different - often inconsistent - naming schemes in different parts of Polly. We now standardize this to the following scheme: KindArray, KindValue, KindPHI, KindExitPHI \| ------- isScalar -----------\| In most cases this naming scheme has already been used previously (this minimizes changes and ensures we remain consistent with previous publications). The main change is that we remove KindScalar to clarify the difference between a scalar as a memory object of kind Value, PHI or ExitPHI and a value (former KindScalar) which is a memory object modeling a llvm::Value. We also move all documentation to the Kind* enum in the ScopArrayInfo class, remove the second enum in the MemoryAccess class and update documentation to be formulated from the perspective of the memory object, rather than the memory access. The terms "Implicit"/"Explicit", formerly used to describe memory accesses, have been dropped. From the perspective of memory accesses they described the different memory kinds well - especially from the perspective of code generation - but just from the perspective of a memory object it seems more straightforward to talk about scalars and arrays, rather than explicit and implicit arrays. The last comment is clearly subjective, though. A less subjective reason to go for these terms is the historic use both in mailing list discussions and publications. llvm-svn: 255467	2015-12-13 19:59:01 +00:00
Tobias Grosser	2fd89da90d	Remove non-debug printing of domain set Contributed-by: Chris Jenneisch <chrisj@codeaurora.org> Differential Revision: http://reviews.llvm.org/D15094 llvm-svn: 254343	2015-11-30 22:59:41 +00:00
Johannes Doerfert	fdbf201fc9	[FIX] Do not generate code for parameters referencing dead values Check if a value that is referenced by a parameter is dead and do not generate code for the parameter in such a case. llvm-svn: 252813	2015-11-11 22:40:51 +00:00
Johannes Doerfert	dcfedf3505	[FIX] Cast pre-loaded values correctly or reload them with adjusted type. Especially for structs, the SAI object of a base pointer does not describe all the types that the user might expect when he loads from that base pointer. While we will still cast integers and pointers we will now reload the value with the correct type if floating point and non-floating point values are involved. However, there are now TODOs where we use bitcasts instead of a proper conversion or reloading. This fixes bug 25479. llvm-svn: 252706	2015-11-11 06:20:25 +00:00
Johannes Doerfert	fc4bfc465a	[FIX] Create empty invariant equivalence classes We now create all invariant equivalence classes for required invariant loads instead of creating them on-demand. This way we can check if a parameter references an invariant load that is actually not executed and was therefore not materialized. If that happens the parameter is not materialized either. This fixes bug 25469. llvm-svn: 252701	2015-11-11 04:30:07 +00:00
Tobias Grosser	6abc75af4c	ScopInfo: Introduce ArrayKind Since 252422 we do not only distinguish two ScopArrayInfo kinds, PHI nodes and others, but work with three kinds of ScopArrayInfo objects: SCALAR, PHI and ARRAY objects. Instead of keeping two boolean flags isPHI and isScalar and wondering what a ScopArrayInfo object of kind (!isScalar && isPHI) is, we now explicitly list the three different possible types of memory objects. This change also allows us to remove the confusing nested pairs that have been used in ArrayInfoMapTy. llvm-svn: 252620	2015-11-10 17:31:31 +00:00
Johannes Doerfert	7a6e292d86	[FIX] Use same alloca for invariant loads and the scalar users llvm-svn: 252451	2015-11-09 06:28:45 +00:00
Johannes Doerfert	a768624f14	[FIX] Introduce different SAI objects for scalar and memory accesses Even if a scalar and memory access have the same base pointer, we cannot use one SAI object for both, as the type and also the number of dimensions would be wrong. For the attached test case this caused a crash in the invariant load hoisting, though it could cause various other problems too. This fixes bug 25428 and an execution time bug in MallocBench/cfrac. Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> llvm-svn: 252422	2015-11-08 19:12:05 +00:00
Johannes Doerfert	c4898504ea	[FIX] Bail out if there is a dependence cycle between invariant loads While the program cannot cause a dependence cycle between invariant loads, additional constraints (e.g., to ensure finite loops) can introduce them. It is hard to detect them in the SCoP description, thus we will only check for them at code generation time. If such a recursion is detected we will bail out the code generation and place a "false" runtime check to guarantee the original code is used. This fixes bug 25443. llvm-svn: 252412	2015-11-07 19:46:04 +00:00
Duncan P. N. Exon Smith	b8f58b53dd	polly/ADT: Remove implicit ilist iterator conversions, NFC Remove all the implicit ilist iterator conversions from polly, in preparation for making them illegal in ADT. There was one oddity I came across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a post-increment `Builder.GetInsertPoint()++`. Since it was a no-op, I removed it, but I admit I wonder if it might be a bug (both before and after this change)? Perhaps it should be a pre-increment? llvm-svn: 252357	2015-11-06 22:56:54 +00:00
Johannes Doerfert	22892687f7	[FIX] Simplify and correct preloading of base pointer origin To simplify and correct the preloading of a base pointer origin, e.g., the base pointer for the current indirect invariant load, we now just check if there is an invariant access class that involves the base pointer of the current class. llvm-svn: 251962	2015-11-03 19:15:33 +00:00
Johannes Doerfert	475d8e3f42	[FIX] Ensure base pointer origin was preloaded already If a base pointer of a preloaded value has a base pointer origin, thus it is an indirect invariant load, we have to make sure the base pointer origin is preloaded first. llvm-svn: 251946	2015-11-03 16:49:02 +00:00
Johannes Doerfert	3181c2ef72	[FIX] Correctly update SAI base pointer If a base pointer load is preloaded, we have to change the base pointer of the derived SAI. However, as the derived SAI relationship is coarse grained, we need to check if we actually preloaded the base pointer or a different element of the base pointer SAI array. llvm-svn: 251881	2015-11-03 01:42:59 +00:00
Johannes Doerfert	af3e301a67	[FIX] Restructure invariant load equivalence classes Sorting is replaced by a demand driven code generation that will pre-load a value when it is needed or, if it was not needed before, at some point determined by the order of invariant accesses in the program. Only in very little cases this demand driven pre-loading will kick in, though it will prevent us from generating faulty code. An example where it is needed is shown in: test/ScopInfo/invariant_loads_complicated_dependences.ll Invariant loads that appear in parameters but are not on the top-level (e.g., the parameter is not a SCEVUnknown) will now be treated correctly. Differential Revision: http://reviews.llvm.org/D13831 llvm-svn: 250655	2015-10-18 12:39:19 +00:00
Johannes Doerfert	d8b6ad255f	[FIX] Cast preloaded values Preloaded values have to match the type of their counterpart in the original code and not the type of the base array. llvm-svn: 250654	2015-10-18 12:36:42 +00:00
Johannes Doerfert	697fdf891c	Consolidate invariant loads If a (assumed) invariant location is loaded multiple times we generated a parameter for each location. However, this caused compile time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and BT in the NAS benchmarks). Additionally, the code we generate is suboptimal as we preload the same location multiple times and perform the same checks on all the parameters that refer to the same value. With this patch we consolidate the invariant loads in three steps: 1) During SCoP initialization required invariant loads are put in equivalence classes based on their pointer operand. One representing load is used to generate a parameter for the whole class, thus we never generate multiple parameters for the same location. 2) During the SCoP simplification we remove invariant memory accesses that are in the same equivalence class. While doing so we build the union of all execution domains as it is only important that the location is at least accessed once. 3) During code generation we only preload one element of each equivalence class with the unified execution domain. All others are mapped to that preloaded value. Differential Revision: http://reviews.llvm.org/D13338 llvm-svn: 249853	2015-10-09 17:12:26 +00:00
Johannes Doerfert	09e3697f44	Allow invariant loads in the SCoP description This patch allows invariant loads to be used in the SCoP description, e.g., as loop bounds, conditions or in memory access functions. First we collect "required invariant loads" during SCoP detection that would otherwise make an expression we care about non-affine. To this end a new level of abstraction was introduced before SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide if we want a load inside the region to be optimistically assumed invariant or not. If we do, it will be marked as required and in the SCoP generation we bail if it is actually not invariant. If we don't it will be a non-affine expression as before. At the moment we optimistically assume all "hoistable" (namely non-loop-carried) loads to be invariant. This causes us to expand some SCoPs and dismiss them later but it also allows us to detect a lot we would dismiss directly if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also allow potential aliases between optimistically assumed invariant loads and other pointers as our runtime alias checks are sound in case the loads are actually invariant. Together with the invariant checks this combination allows to handle a lot more than LICM can. The code generation of the invariant loads had to be extended as we can now have dependences between parameters and invariant (hoisted) loads as well as the other way around, e.g., test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll First, it is important to note that we cannot have real cycles but only dependences from a hoisted load to a parameter and from another parameter to that hoisted load (and so on). To handle such cases we materialize llvm::Values for parameters that are referred by a hoisted load on demand and then materialize the remaining parameters. 
Second, there are new kinds of dependences between hoisted loads caused by the constraints on their execution. If a hoisted load is conditionally executed it might depend on the value of another hoisted load. To deal with such situations we sort them already in the ScopInfo such that they can be generated in the order they are listed in the Scop::InvariantAccesses list (see compareInvariantAccesses). The dependences between hoisted loads caused by indirect accesses are handled the same way as before. llvm-svn: 249607	2015-10-07 20:17:36 +00:00
Johannes Doerfert	521dd5842f	Move the ValueMapT declaration out of BlockGenerator Value maps are created and used in many places and it is not always possible to include CodeGen/Blockgenerators.h. To this end, ValueMapT now lives in the ScopHelper.h which does not have any dependences itself. This patch also replaces uses of different other value map types with the ValueMapT. llvm-svn: 249606	2015-10-07 20:15:56 +00:00
Tobias Grosser	f4bb7a6a4d	Consolidate the different ValueMapTypes we are using There have been various places where llvm::DenseMap<const llvm::Value *, llvm::Value *> types have been defined, but all types have been expected to be identical. We make this more clear by consolidating the different types and use BlockGenerator::ValueMapT wherever there is a need for types to match BlockGenerator::ValueMapT. llvm-svn: 249264	2015-10-04 10:18:32 +00:00


177 Commits