llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Grosser	f51decb5fe	[BlockGenerator] Take context into account when identifying partial writes A partial write is a write where the domain of the values written is a subset of the execution domain of the parent statement containing the write. Originally, we directly checked this subset relation whereas it is indeed only important that the subset relation holds for the parameter values that are known to be valid in the execution context of the scop. We update our check to avoid the unnecessary introduction of partial writes in situations where the write appears to be partial without context information, but where context information allows us to understand that a full write can be generated. This change fixes (hides) a recent regression introduced in r303517, which broke our AOSP builds. The part that is correctly fixed in this change is that we no longer unnecessarily generate a partial write. This is good performance-wise and, as we currently do not yet explicitly introduce partial writes in the default configuration, this also hides possible bugs in the partial writes implementation. The crashes that we have originally seen were caused by such a bug, where partial writes were incorrectly generated in region statements. An additional patch in a subsequent commit is needed to address this problem. Reported-by: Eli Friedman <efriedma@codeaurora.org> Differential Revision: https://reviews.llvm.org/D33759 llvm-svn: 304398	2017-06-01 09:34:20 +00:00
Tobias Grosser	6b6ac90098	[BlockGenerator] Translate buildContainsCondition to idiomatic isl C++ llvm-svn: 304354	2017-05-31 21:49:51 +00:00
Michael Kruse	1aad76c18f	[CodeGen] Add invalidation of the loop SCEVs after merge block generation. The SCEVs of loops surrounding the escape users of a merge block are forgotten, so that loop trip counts based on old values can be revoked. This fixes llvm.org/PR32536 Contributed-by: Baranidharan Mohan <mbdharan@gmail.com> Differential Revision: https://reviews.llvm.org/D33195 llvm-svn: 303561	2017-05-22 15:36:53 +00:00
Michael Kruse	706f79ab14	[CodeGen] Support partial write accesses. Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then executes the access. Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed. Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read). Differential Revision: https://reviews.llvm.org/D33255 llvm-svn: 303517	2017-05-21 22:46:57 +00:00
Reid Kleckner	96ab8726a3	[IR] De-virtualize ~Value to save a vptr Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362	2017-05-18 17:24:10 +00:00
Michael Kruse	eedae7630a	Introduce VirtualUse. NFC. If a ScopStmt references a (scalar) value, there are multiple possibilities where this value can come from. The decision about what kind of use it is must be handled consistently at different places, which can be error-prone. VirtualUse is meant to centralize the handling of the different types of value uses. This patch makes ScopBuilder and CodeGeneration use VirtualUse. This already helps to show inconsistencies with the value handling. In order to keep this patch NFC, exceptions to the general rules are added. These might be fixed later if they turn out to be problems. Overall, this should result in fewer post-codegen IR-verification errors, but instead assertion failures in `getNewValue` that are closer to the actual error. Differential Revision: https://reviews.llvm.org/D32667 llvm-svn: 302157	2017-05-04 15:22:57 +00:00
Tobias Grosser	7b5a4dfd46	Exploit BasicBlock::getModule to shorten code Suggested-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 299914	2017-04-11 04:59:13 +00:00
Tobias Grosser	67726b3260	Adjust to recent change in constructor definition of AllocaInst llvm-svn: 299913	2017-04-11 04:23:38 +00:00
Matt Arsenault	b3e30c32ce	Update for alloca construction changes llvm-svn: 299905	2017-04-11 00:12:58 +00:00
Michael Kruse	c3e9c1442d	[ScopInfo] Introduce ScopStmt::contains(BB*). NFC. Provide a common way for testing if a statement contains something for region and block statements. First user is RegionGenerator::addOperandToPHI. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298617	2017-03-23 16:12:21 +00:00
Tobias Grosser	1be726a40d	[IslExprBuilder] Print accessed memory locations with RuntimeDebugBuilder After this change, enabling -polly-codegen-add-debug-printing in combination with -polly-codegen-generate-expressions allows us to instrument the compiled binaries to not only print the values stored and loaded to a given memory access, but also to print the accessed location with array name and per-dimension offset: MemRef_A[3][2] Store to 6299784: 5.000000 MemRef_A[3][3] Load from 6299788: 0.000000 MemRef_A[3][3] Store to 6299788: 6.000000 This can be very helpful for debugging. llvm-svn: 298194	2017-03-18 20:54:43 +00:00
Michael Kruse	0446d81e2d	[Simplify] Add -polly-simplify pass. This new pass removes unnecessary accesses and writes. It currently supports 2 simplifications, but more are planned. It removes write accesses that write a loaded value back to the location it was loaded from. It is a typical artifact from DeLICM. Removing it will get rid of bogus dependencies later in dependency analysis. It also removes statements without side-effects. ScopInfo already removes these, but the removal of unnecessary writes can result in more side-effect free statements. Differential Revision: https://reviews.llvm.org/D30820 llvm-svn: 297473	2017-03-10 16:05:24 +00:00
Tobias Grosser	583be06fb2	[BlockGenerator] Use MemoryAccess::getAccessValue to get load instruction When generating code in the BlockGenerator we copy all (interesting) instructions and keep track of the new values in a basic block map. To obtain the original llvm::Value that belongs to a load memory access, we use getAccessValue() instead of getOriginalBaseAddr(). The former always references the instruction we use to load values from. The latter, on the other hand, is obtained from the corresponding ScopArrayInfo and would not be unique in case ScopArrayInfo objects at some point allow memory accesses with different base addresses. This change is an update on r294566, which only clarified that we need the original memory access, but where we still remained dependent on having one base pointer per scop. This change removes unnecessary uses of MemoryAccess::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294669	2017-02-09 23:54:23 +00:00
Tobias Grosser	02400a0e0c	[BlockGenerator] BBMap uses original BaseAddress for scalar loads [NFC] When regenerating code in the BlockGenerator we copy instructions that may reference scalar values, for which the new value of a given scalar is looked up in BBMap using the original scalar llvm::Value as index. It is consequently necessary that (re)loaded scalar values are made available in BBMap using the original llvm::Value as key, independently of whether the llvm::Value was (re)loaded from the original scalar or a new access function has been specified that caused the value to be reloaded from an array with a different base address. We make this clear by using MemoryAccess::getOriginalBaseAddr() instead of MemoryAccess::getBaseAddr() as index to BBMap. This change removes unnecessary uses of MemoryAccess::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294566	2017-02-09 08:05:50 +00:00
Tobias Grosser	682c51143d	[BlockGenerator] Comment corrections for r293374 [NFC] This addresses some additional comments from Michael Kruse for commit r293374 as expressed in https://reviews.llvm.org/D28901. llvm-svn: 293378	2017-01-28 11:39:02 +00:00
Tobias Grosser	587f1f57ad	[Polly] [BlockGenerator] Unify ScalarMap and PhiOpsMap Instead of keeping two separate maps from Value to Allocas, one for MemoryType::Value and the other for MemoryType::PHI, we introduce a single map from ScopArrayInfo to the corresponding Alloca. This change is intended both as a general simplification and cleanup, and to reduce our use of MemoryAccess::getBaseAddr(). Moving away from using getBaseAddr() makes sure we have only a single place where the array (and its base pointer) for which we generate code is specified, which means we can more easily introduce new access functions that use a different ScopArrayInfo as base. We already today experiment with modifiable access functions, so this change does not address a specific bug, but it just reduces the scope one needs to reason about. Another motivation for this patch is https://reviews.llvm.org/D28518, where memory accesses with different base pointers could possibly be mapped to a single ScopArrayInfo object. Such a mapping is currently not possible, as we currently generate alloca instructions according to the base addresses of the memory accesses, not according to the ScopArrayInfo object they belong to. By making allocas ScopArrayInfo specific, a mapping to a single ScopArrayInfo object will automatically mean that the same stack slot is used for these arrays. For D28518 this is not a problem, as only MemoryType::Array objects are mapped, but resolving this inconsistency will hopefully avoid confusion. llvm-svn: 293374	2017-01-28 07:42:10 +00:00
Tobias Grosser	75dfaa1dbe	BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts Before this change we created an additional reload in the copy of the incoming block of a PHI node to reload the incoming value, even though the necessary value has already been made available by the normally generated scalar loads. In this change, we drop the code that generates this redundant reload and instead just reuse the scalar value already available. Besides making the generated code slightly cleaner, this change also makes sure that scalar loads go through the normal logic, which means they can be remapped (e.g. to array slots) and corresponding code is generated to load from the remapped location. Without this change, the original scalar load at the beginning of the non-affine region would have been remapped, but the redundant scalar load would continue to load from the old PHI slot location. It might be possible to further simplify the code in addOperandToPHI, but this would not only mean to pull out getNewValue, but to also change the insertion point update logic. As this did not work when trying it the first time, this change is likely not trivial. To not introduce bugs last minute, we postpone further simplifications to a subsequent commit. We also document the current behavior a little bit better. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D28892 llvm-svn: 292486	2017-01-19 14:12:45 +00:00
Tobias Grosser	943c369c60	BlockGenerator: remove obfuscating const and const casts Making certain values 'const' to just cast it away a little later mainly obfuscates the code. Hence, we just drop the 'const' parts. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 292480	2017-01-19 13:25:52 +00:00
Tobias Grosser	97b8490982	Use range-based for loop [NFC] llvm-svn: 292471	2017-01-19 05:09:23 +00:00
Tobias Grosser	21a059af09	Adjust formatting to commit r292110 [NFC] llvm-svn: 292123	2017-01-16 14:08:10 +00:00
Tobias Grosser	4d5a917287	Use typed enums to model MemoryKind and move MemoryKind out of ScopArrayInfo To benefit from the type safety guarantees of C++11 typed enums, which would have caught the type mismatch fixed in r291960, we make MemoryKind a typed enum. This change also allows us to drop the 'MK_' prefix and to instead use the more descriptive full name of the enum as prefix. To reduce the amount of typing needed, we use this opportunity to move MemoryKind from ScopArrayInfo to a global scope, which means the ScopArrayInfo:: prefix is not needed. This move also makes sense historically. In the beginning of Polly we had different MemoryKind enums in both MemoryAccess and ScopArrayInfo, which were later canonicalized to one. During this canonicalization we just chose the enum in ScopArrayInfo, but did not consider moving this shared enum to global scope. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D28090 llvm-svn: 292030	2017-01-14 20:25:44 +00:00
Michael Kruse	11c5e07925	canSynthesize: Remove unused argument LI. NFC. The helper function polly::canSynthesize() does not directly use the LoopInfo analysis, hence remove it from its argument list. llvm-svn: 288144	2016-11-29 15:11:04 +00:00
Eli Friedman	acf8006471	[Polly CodeGen] Break critical edge from RTC to original loop. This makes polly generate a CFG which is closer to what we want in LLVM IR, with a loop preheader for the original loop. This is just a cleanup, but it exposes some fragile assumptions. I'm not completely happy with the changes related to expandCodeFor; RTCBB->getTerminator() is basically a random insertion point which happens to work due to the way we generate runtime checks. I'm not sure what the right answer looks like, though. Differential Revision: https://reviews.llvm.org/D26053 llvm-svn: 285864	2016-11-02 22:32:23 +00:00
Mandeep Singh Grang	5b1abfc88e	[polly] Fix non-determinism in polly BlockGenerators Summary: Iterating over SeenBlocks which is a SmallPtrSet results in non-determinism in codegen Reviewers: jdoerfert, zinob, grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D25778 llvm-svn: 284622	2016-10-19 17:56:49 +00:00
Michael Kruse	fa53c86dc1	[ScopInfo/CodeGen] ExitPHI reads are implicit. Under some conditions MK_Value read accesses were converted to MK_ExitPHI read accesses. This is unexpected because MK_ExitPHI read accesses are implicit after the scop execution. This behaviour was introduced in r265261, which fixed a failed assertion/crash in CodeGen. Instead, we fix this failure in CodeGen itself. createExitPHINodeMerges(), despite its name, also handles accesses of kind MK_Value, only to skip them because they access values that are usually not PHI nodes in the SCoP region's exit block. Except in the situation observed in r265261. Do not convert value accesses to ExitPHI accesses and do not handle value accesses like ExitPHI accesses in CodeGen anymore. llvm-svn: 284023	2016-10-12 16:31:09 +00:00
Michael Kruse	888ab55140	[CodeGen] Change 'Scalar' to 'Array' in method names. NFC. generateScalarLoad() and generateScalarStore() are used for explicit (MK_Array) memory accesses, therefore the method names were misleading. The names also were similar to generateScalarLoads() and generateScalarStores() (plural forms) which indeed handle scalar accesses. Presumably, they were originally named to contrast VectorBlockGenerator::generateLoad(). Rename the two methods to generateArrayLoad(), respectively generateArrayStore(). llvm-svn: 282861	2016-09-30 14:34:05 +00:00
Michael Kruse	77394f1394	[CodeGen] Add assertion for partial scalar accesses. NFC. The code generator always adds unconditional LoadInst and StoreInst, hence the MemoryAccess must be defined over all statement instances. llvm-svn: 282853	2016-09-30 14:01:46 +00:00
Roman Gareev	b3224adfb6	Perform copying to created arrays according to the packing transformation This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23260 llvm-svn: 281441	2016-09-14 06:26:09 +00:00
Michael Kruse	2fa3519463	Allow mapping scalar MemoryAccesses to array elements. Change the code around setNewAccessRelation to allow the use of an existing array element for memory instead of an ad-hoc alloca. This facility will be used for DeLICM/DeGVN to convert scalar dependencies into regular ones. The changes necessary include: - Make the code generator use the implicit locations instead of the alloca ones. - A test case - Make the JScop importer accept changes of scalar accesses for that test case. - Adapt the MemoryAccess interface to the fact that the MemoryKind can change. They are named (get|is)OriginalXXX() to get the status of the memory access before any change by setNewAccessRelation() (some properties such as getIncoming() do not change even if the kind is changed and are still required). To get the modified properties, there is (get|is)LatestXXX(). The old accessors without Original|Latest become synonyms of the (get|is)OriginalXXX() to not make functional changes in unrelated code. Differential Revision: https://reviews.llvm.org/D23962 llvm-svn: 280408	2016-09-01 19:53:31 +00:00
Tobias Grosser	1c18440958	[BlockGenerator] Invalidate SCEV values for instructions in scop We already invalidated a couple of critical values earlier on, but we now invalidate all instructions contained in a scop after the scop has been code generated. This is necessary as later scops may otherwise obtain SCEV expressions that reference values in the earlier scop that before dominated the later scop, but which had been moved into the conditional branch and consequently do not dominate the later scop any more. If these very values are then used during code generation of the later scop, we generate uses that are not dominated by the values they use. This fixes: http://llvm.org/PR28984 llvm-svn: 279047	2016-08-18 10:45:57 +00:00
Tobias Grosser	776700d0b7	[BlockGenerator] Insert initializations at beginning of start block In case some code -- not guarded by control flow -- would be emitted directly in the start block, it may happen that this code would use uninitialized scalar values if the scalar initialization is only emitted at the end of the start block. This is not a problem today in normal Polly, as all statements are emitted in their own basic blocks, but Polly-ACC emits host-to-device copy statements into the start block. Additional Polly-ACC test coverage will be added in subsequent changes that improve the handling of PHI nodes in Polly-ACC. llvm-svn: 278124	2016-08-09 15:34:59 +00:00
Tobias Grosser	c59b3ce044	[BlockGenerator] Also eliminate dead code not originating from BB After having generated the code for a ScopStmt, we run a simple dead-code elimination that drops all instructions that are known to be and remain unused. Until this change, we only considered instructions for dead-code elimination if they have a corresponding instruction in the original BB that belongs to ScopStmt. However, when generating code we do not only copy code from the BB belonging to a ScopStmt, but also generate code for operands referenced from BB. After this change, we now also consider code for dead-code elimination that does not have a corresponding instruction in BB. This fixes a bug in Polly-ACC where such dead-code referenced CPU code from within a GPU kernel, which is possible as we do not guarantee that all variables that are used in known-dead-code are moved to the GPU. llvm-svn: 278103	2016-08-09 08:59:05 +00:00
Michael Kruse	fbde435517	[CodeGen] Use MapVector instead of DenseMap. The map is iterated over when generating the values escaping the SCoP. The indeterministic iteration order of DenseMap causes the output IR to change at every compilation, adding noise to comparisons. Replace DenseMap by a MapVector to ensure the same iteration order at every compilation. llvm-svn: 277832	2016-08-05 16:45:51 +00:00
Tobias Grosser	3216f8546c	BlockGenerator: Assert that we do not get alloca of array access llvm-svn: 277698	2016-08-04 06:55:53 +00:00
Tobias Grosser	2219d15748	Fix a couple of spelling mistakes llvm-svn: 277569	2016-08-03 05:28:09 +00:00
Roman Gareev	d7754a1245	Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions Extend the jscop interface to allow the user to export arrays. It is required that already existing arrays of the list of arrays correspond to arrays of the SCoP. Each array that is appended to the list will be newly created. Furthermore, we allow the user to modify access expressions to reference any array in case it has the same element type. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D22828 llvm-svn: 277263	2016-07-30 09:25:51 +00:00
Tobias Grosser	9d12d8ade3	BlockGenerator: remove dead instructions in normal statements This ensures that no trivially dead code is generated. This is not only cleaner, but also avoids troubles in case code is generated in a separate function and some of this dead code contains references to values that are not available. This issue may happen, in case the memory access functions have been updated and old getelementptr instructions remain in the code. With normal Polly, a test case is difficult to draft, but the upcoming GPU code generation can possibly trigger such problems. We will later extend this dead-code elimination to region and vector statements. llvm-svn: 276263	2016-07-21 11:48:36 +00:00
Sanjoy Das	03bcb910de	[Polly] Remove usage of the `apply` function Summary: API-wise `apply` is a somewhat unidiomatic one-off function, and removing the only(?) use in polly will let me remove it from SCEV's exposed interface. Reviewers: jdoerfert, Meinersbur, grosser Subscribers: grosser, mcrosier, pollydev Differential Revision: http://reviews.llvm.org/D20779 llvm-svn: 271177	2016-05-29 07:33:16 +00:00
Michael Kruse	996fb611b3	Remove some unused local variables. NFC. Found by clang static analyzer (http://llvm.org/reports/scan-build/) and Visual Studio. llvm-svn: 270432	2016-05-23 13:00:41 +00:00
Johannes Doerfert	0f0d209bec	Use the SCoP directly for canSynthesize [NFC] llvm-svn: 270429	2016-05-23 12:47:09 +00:00
Johannes Doerfert	ef74443c97	Duplicate part of the Region interface in the Scop class [NFC] This allows to use the SCoP directly for various queries, thus to hide the underlying region more often. llvm-svn: 270426	2016-05-23 12:42:38 +00:00
Johannes Doerfert	952b5304bc	Add and use Scop::contains(Loop/BasicBlock/Instruction) [NFC] llvm-svn: 270424	2016-05-23 12:40:48 +00:00
Johannes Doerfert	3f52e35471	Directly access information through the Scop class [NFC] llvm-svn: 270421	2016-05-23 12:38:05 +00:00
Johannes Doerfert	38a012c46b	Simplify BlockGenerator::handleOutsideUsers interface [NFC] llvm-svn: 270411	2016-05-23 09:14:07 +00:00
Tobias Grosser	2e27a0f5fd	BlockGenerator: Drop leftover debug statement llvm-svn: 267874	2016-04-28 12:31:05 +00:00
Johannes Doerfert	517d8d2f94	Check only loop control of loops that are part of the region This also removes a duplicated line of code in the region generator that caused a SPEC benchmark to fail with the new SCoPs. llvm-svn: 267404	2016-04-25 13:37:24 +00:00
Johannes Doerfert	6ba927148d	[FIX] Adjust the insert point for non-affine region PHIs If a non-affine region PHI is generated we should not move the insert point prior to the synthesized value in the same block as we might split that block at the insert point later on. Only if the incoming value should be placed in a different block we should change the insertion point. llvm-svn: 265132	2016-04-01 11:25:47 +00:00
Michael Kruse	faedfcbf6d	[BlockGenerator] Fix PHI merges for MK_Arrays. Value merging is only necessary for scalars when they are used outside of the scop. While an array's base pointer can be used after the scop, it gets an extra ScopArrayInfo of type MK_Value. We used to generate phi's for both of them, where one was assuming the result of the other phi would be the original value, because it has already been replaced by the previous phi. This resulted in IR that the current IR verifier allows, but is probably illegal. This reduces the number of LNT test-suite fails with -polly-position=before-vectorizer -polly-process-unprofitable from 16 to 10. Also see llvm.org/PR26718. llvm-svn: 262629	2016-03-03 17:20:43 +00:00
Michael Kruse	c7e0d9c216	Fix non-synthesizable loop exit values. Polly recognizes affine loops that ScalarEvolution does not, in particular those with loop conditions that depend on hoisted invariant loads. Check for SCEVAddRec dependencies on such loops and do not consider their exit values as synthesizable because SCEVExpander would generate them as expressions that depend on the original induction variables. These are not available in generated code. llvm-svn: 262404	2016-03-01 21:44:06 +00:00
Michael Kruse	8f25b0cb4d	Use inline local variable declaration. NFC. llvm-svn: 261876	2016-02-25 15:52:43 +00:00
Michael Kruse	f33c125dd2	Fix DomTree preservation for generated subregions. The generated dedicated subregion exit block was assumed to have the same dominance relation as the original exit block. This is incorrect if the exit block receives other edges than only from the subregion, which results in that e.g. the subregion's entry block does not dominate the exit block. llvm-svn: 261865	2016-02-25 14:08:48 +00:00
Michael Kruse	375cb5fe0a	Introduce ScopStmt::getEntryBlock(). NFC. This replaces an ugly inline ternary operator pattern. llvm-svn: 261792	2016-02-24 22:08:24 +00:00
Michael Kruse	eac9726e8c	Add assertions checking def dominates use. NFC. This is also caught by the function verifier, but disconnected from the place that produced it. Catch it already at creation to be able to reason more directly about the cause. llvm-svn: 261790	2016-02-24 22:08:14 +00:00
Tobias Grosser	2b809d1390	BlockGenerator: Drop unnecessary return value llvm-svn: 261473	2016-02-21 15:44:34 +00:00
Johannes Doerfert	2c3ffc04f3	Replace getLoopForInst by getLoopForStmt This patch was extracted from http://reviews.llvm.org/D13611. llvm-svn: 260958	2016-02-16 12:36:14 +00:00
Tobias Grosser	d840fc7277	Support accesses with differently sized types to the same array This allows code such as: void multiple_types(char *Short, char *Float, char *Double) { for (long i = 0; i < 100; i++) { Short[i] = *(short *)&Short[2 * i]; Float[i] = *(float *)&Float[4 * i]; Double[i] = *(double *)&Double[8 * i]; } } To model such code we use as canonical element type of the modeled array the smallest element type of all original array accesses, if type allocation sizes are multiples of each other. Otherwise, we use a newly created iN type, where N is the gcd of the allocation size of the types used in the accesses to this array. Accesses with types larger than the canonical element type are modeled as multiple accesses with the smaller type. For example the second load access is modeled as: { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 } To support code-generating these memory accesses, we introduce a new method getAccessAddressFunction that assigns each statement instance a single memory location, the address we load from/store to. Currently we obtain this address by taking the lexmin of the access function. We may consider keeping track of the memory location more explicitly in the future. We currently do _not_ handle multi-dimensional arrays and also keep the restriction of not supporting accesses where the offset expression is not a multiple of the access element type size. This patch adds tests that ensure we correctly invalidate a scop in case these accesses are found. Both types of accesses can be handled using the very same model, but are left to be added in the future. We also move the initialization of the scop-context into the constructor to ensure it is already available when invalidating the scop. Finally, we add this as a new item to the 2.9 release notes Reviewers: jdoerfert, Meinersbur Differential Revision: http://reviews.llvm.org/D16878 llvm-svn: 259784	2016-02-04 13:18:42 +00:00
Tobias Grosser	e2c31210b2	Revert "Support loads with differently sized types from a single array" This reverts commit (@259587). It needs some further discussions. llvm-svn: 259629	2016-02-03 05:53:27 +00:00
Tobias Grosser	5d3fc1ea43	Support loads with differently sized types from a single array We support now code such as: void multiple_types(char *Short, char *Float, char *Double) { for (long i = 0; i < 100; i++) { Short[i] = *(short *)&Short[2 * i]; Float[i] = *(float *)&Float[4 * i]; Double[i] = *(double *)&Double[8 * i]; } } To support such code we use as element type of the modeled array the smallest element type of all original array accesses. Accesses with larger types are modeled as multiple accesses with the smaller type. For example the second load access is modeled as: { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 } To support jscop-rewritable memory accesses we need each statement instance to only be assigned a single memory location, which will be the address at which we load the value. Currently we obtain this address by taking the lexmin of the access function. We may consider keeping track of the memory location more explicitly in the future. llvm-svn: 259587	2016-02-02 22:05:29 +00:00
Johannes Doerfert	800e17a75c	Add const keyword to MemoryAccess argument [NFC] llvm-svn: 259504	2016-02-02 14:16:01 +00:00
Michael Kruse	70131d3416	Introduce MemAccInst helper class; NFC MemAccInst wraps the common members of LoadInst and StoreInst. Also use of this class in: - ScopInfo::buildMemoryAccess - BlockGenerator::generateLocationAccessed - ScopInfo::addArrayAccess - Scop::buildAliasGroups - Replace every use of polly::getPointerOperand Reviewers: jdoerfert, grosser Differential Revision: http://reviews.llvm.org/D16530 llvm-svn: 258947	2016-01-27 17:09:17 +00:00
Michael Kruse	ee6a4fc680	Unique phi write accesses Ensure that there is at most one phi write access per PHINode and ScopStmt. In particular, this would be possible for non-affine subregions with multiple exiting blocks. We replace multiple MAY_WRITE accesses by one MUST_WRITE access. The written value is constructed using a PHINode of all exiting blocks. The interpretation of the PHI WRITE's "accessed value" changed from the incoming value to the PHI like for PHI READs since there is no unique incoming value. Because region simplification shuffles around PHI nodes -- particularly with exit node PHIs -- the PHINodes at analysis time do not always exist anymore in the code generation pass. We instead remember the incoming block/value pair in the MemoryAccess. Differential Revision: http://reviews.llvm.org/D15681 llvm-svn: 258809	2016-01-26 13:33:27 +00:00
Tobias Grosser	f2cdd144e5	BlockGenerators: Replace getNewScalarValue with getNewValue Both functions implement the same functionality, with the difference that getNewScalarValue assumes that globals and out-of-scop scalars can be directly reused without loading them from their corresponding stack slot. This is correct for sequential code generation, but causes issues with outlining code e.g. for OpenMP code generation. getNewValue handles such cases correctly. Hence, we can replace getNewScalarValue with getNewValue. This is not only more future proof, but also eliminates a bunch of code. The only functionality that was available in getNewScalarValue that is lost is the on-demand creation of scalar values. However, this is not necessary any more as scalars are always loaded at the beginning of each basic block and will consequently always be available when scalar stores are generated. As this was not the case in older versions of Polly, it seems the on-demand loading is just some older code that has not yet been removed. Finally, generateScalarLoads also generated loads for values that are loop invariant, available in GlobalMap and which are preferred over the ones loaded in generateScalarLoads. Hence, we can just skip the code generation of such scalar values, avoiding the generation of dead code. Differential Revision: http://reviews.llvm.org/D16522 llvm-svn: 258799	2016-01-26 10:01:35 +00:00
Tobias Grosser	5c7f16be6b	BlockGenerators: Avoid redundant map lookup [NFC] llvm-svn: 258660	2016-01-24 14:16:59 +00:00
Johannes Doerfert	5dced2693e	Refactor canSynthesize in the BlockGenerators [NFC] llvm-svn: 256269	2015-12-22 19:08:49 +00:00
Johannes Doerfert	28f8ac1db2	Treat inline assembly as a constant in the code generation. llvm-svn: 256267	2015-12-22 19:08:24 +00:00
Johannes Doerfert	42df8d1db6	Reduce indention in BlockGenerator::trySynthesizeNewValue [NFC] llvm-svn: 256266	2015-12-22 19:08:01 +00:00
Tobias Grosser	fcabb155c1	BlockGenerators: Remove unnecessary const_cast llvm-svn: 256227	2015-12-22 01:41:25 +00:00
Tobias Grosser	5624d3c978	Adjust formatting to clang-format changes in 256149 llvm-svn: 256151	2015-12-21 12:38:56 +00:00
Tobias Grosser	184a4926b3	BlockGenerator: Use getArrayAccessFor for vector code generation getAccessFor does not guarantee a certain access to be returned in case an instruction is related to multiple accesses. However, in the vector code generation we want to know the stride of the array access of a store instruction. By using getArrayAccessFor we ensure we always get the correct memory access. This patch fixes a potential bug, but I was unable to produce a failing test case. Several existing test cases cover this code, but all of them already passed by luck (or because of the specific but not-guaranteed order in which we build memory accesses). llvm-svn: 255715	2015-12-15 23:50:01 +00:00
Tobias Grosser	a69d4f0d83	VectorBlockGenerator: Generate scalar loads for vector statements When generating scalar loads/stores separately the vector code has not been updated. This commit adds code to generate scalar loads for vector code as well as code to assert in case scalar stores are encountered within a vector loop. llvm-svn: 255714	2015-12-15 23:49:58 +00:00
Tobias Grosser	0921477248	ScopInfo: Look up first (and only) array access When rewriting the access functions of load/store statements, we are only interested in the actual array memory location. The current code just took the very first memory access, which could be a scalar or an array access. As a result, we failed to update access functions even though this was requested via .jscop. llvm-svn: 255713	2015-12-15 23:49:53 +00:00
Michael Kruse	5bbc0e1888	Fix typos; NFC llvm-svn: 255580	2015-12-14 23:41:32 +00:00
Tobias Grosser	9bd0dad926	BlockGenerator: Do not use fast-path for external constants This change should not change the behavior of Polly today, but it allows external constants to be remapped e.g. when targeting multiple LLVM modules. llvm-svn: 255506	2015-12-14 16:19:59 +00:00
Tobias Grosser	6f764bbd9c	BlockGenerator: Drop unneeded const_casts llvm-svn: 255505	2015-12-14 16:19:54 +00:00
Tobias Grosser	a535dff471	ScopInfo: Harmonize the different array kinds Over time different vocabulary has been introduced to describe the different memory objects in Polly, resulting in different - often inconsistent - naming schemes in different parts of Polly. We now standardize this to the following scheme: KindArray, KindValue, KindPHI, KindExitPHI (where isScalar covers KindValue, KindPHI and KindExitPHI). In most cases this naming scheme has already been used previously (this minimizes changes and ensures we remain consistent with previous publications). The main change is that we remove KindScalar to clarify the difference between a scalar as a memory object of kind Value, PHI or ExitPHI and a value (former KindScalar) which is a memory object modeling a llvm::Value. We also move all documentation to the Kind* enum in the ScopArrayInfo class, remove the second enum in the MemoryAccess class and update documentation to be formulated from the perspective of the memory object, rather than the memory access. The terms "Implicit"/"Explicit", formerly used to describe memory accesses, have been dropped. From the perspective of memory accesses they described the different memory kinds well - especially from the perspective of code generation - but just from the perspective of a memory object it seems more straightforward to talk about scalars and arrays, rather than explicit and implicit arrays. The last comment is clearly subjective, though. A less subjective reason to go for these terms is the historic use both in mailing list discussions and publications. llvm-svn: 255467	2015-12-13 19:59:01 +00:00
Michael Kruse	cba170e4d0	Introduce origin/kind for exit PHI node accesses Previously, accesses that originate from PHI nodes in the exit block were registered as SCALAR. In some context they are treated as scalars, but it makes a difference in others. We used to check whether the AccessInstruction is a terminator to differentiate the cases. This patch introduces an MemoryAccess origin EXIT_PHI and a ScopArrayInfo kind KIND_EXIT_PHI to make this case more explicit. No behavioural change intended. Differential Revision: http://reviews.llvm.org/D14688 llvm-svn: 254149	2015-11-26 12:26:06 +00:00
Tobias Grosser	bc29e0b27c	RegionGenerator: Only introduce subregion.ivs for loops fully within a subregion IVs of loops for which the loop header is in the subregion, but not the entire loop, may be incremented outside of the subregion and can consequently not be kept private to the subregion. Instead, they need to be, and are, modeled as virtual loops in the iteration domains. As this is the case, generating new subregion induction variables for such loops is not needed and indeed wrong, as they would hide the virtual induction variables modeled in the scop. This fixes a miscompile in MultiSource/Benchmarks/Ptrdist/bc and MultiSource/Benchmarks/nbench/. Thanks Michael and Johannes for their investigations and helpful observations regarding this bug. llvm-svn: 252860	2015-11-12 07:34:09 +00:00
Michael Kruse	c993739e0d	Fix non-affine generated entering node not being recognized as dominating Scalar reloads in the generated entering block were not recognized as dominating the subregion's blocks when there were multiple entering nodes. This resulted in values defined there not being copied. As a fix, we unconditionally add the BBMap of the generated entering node to the generated entry. This fixes part of llvm.org/PR25439. This reverts r252449 and reapplies r252445. Its test was failing nondeterministically due to r252375, which was reverted in r252522. llvm-svn: 252540	2015-11-09 23:33:40 +00:00
Michael Kruse	d6fb6f1b0c	Fix dominance when subregion exit is outside scop The dominance of the generated non-affine subregion block was based on the scop's merge block, which resulted in an invalid DominanceTree. As a consequence, some values were assumed to be unusable in the actual generated exit block. We detect the case that the exit block has been moved and decide dominance using the BB at the original exit. If we create another exit node, that exit node is dominated by the one generated from where the original exit resides. This fixes llvm.org/PR25438 and part of llvm.org/PR25439. llvm-svn: 252526	2015-11-09 23:07:38 +00:00
Michael Kruse	ebffcbeefa	Revert r252375 "Fix non-affine region dominance of implicitely stored values" It introduced indeterminism as it was iterating over an address-indexed hashtable. The corresponding bug PR25438 will be fixed in a successive commit. llvm-svn: 252522	2015-11-09 22:37:29 +00:00
Johannes Doerfert	544b23a1ef	Revert "Fix non-affine generated entering node not being recognized as dominating" This reverts commit 9775824b265e574fc541e975d64d3e270243b59d due to a failing unit test. Please check and correct the unit test and commit again. llvm-svn: 252449	2015-11-09 06:04:05 +00:00
Michael Kruse	fd9c89e84b	Fix non-affine generated entering node not being recognized as dominating Scalar reloads in the generated entering block were not recognized as dominating the subregion's blocks when there were multiple entering nodes. This resulted in values defined there not being copied. As a fix, we unconditionally add the BBMap of the generated entering node to the generated entry. This fixes part of llvm.org/PR25439. llvm-svn: 252445	2015-11-09 05:00:30 +00:00
Johannes Doerfert	188542fda9	[FIX] Initialize incoming scalar memory locations for PHIs llvm-svn: 252437	2015-11-09 00:21:21 +00:00
Johannes Doerfert	80c716df2b	[NFC] Remove unused variable. llvm-svn: 252436	2015-11-09 00:21:04 +00:00
Michael Kruse	0651480b97	Fix non-affine region dominance of implicitely stored values After loop versioning, a dominance check of a non-affine subregion's exit node always fails on any block in the subregion if it shares the same exit block with the scop. The subregion's exit block has become polly_merge_new_and_old, which also receives the control flow of the generated code. As a result, any value for implicit stores would be assumed not to come from the scop. We check dominance with the generated exit node instead. This fixes llvm.org/PR25438 llvm-svn: 252375	2015-11-07 00:36:50 +00:00
Duncan P. N. Exon Smith	b8f58b53dd	polly/ADT: Remove implicit ilist iterator conversions, NFC Remove all the implicit ilist iterator conversions from polly, in preparation for making them illegal in ADT. There was one oddity I came across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a post-increment `Builder.GetInsertPoint()++`. Since it was a no-op, I removed it, but I admit I wonder if it might be a bug (both before and after this change)? Perhaps it should be a pre-increment? llvm-svn: 252357	2015-11-06 22:56:54 +00:00
Michael Kruse	ddb6528ba6	Fix reuse of non-dominating synthesized value in subregion exit We were adding all generated values in non-affine subregions to be used for the subregion's generated exit block. The thought was that only values that are dominating the original exit block can be used there. But it is possible for synthesizable values to be expanded in any block. If the same value is also used for implicit writes, it would try to reuse already synthesized values even if they do not dominate the exit block. The fix is to add a value to the list of values usable in the exit block only if it dominates the exit block. This fixes llvm.org/PR25412. llvm-svn: 252301	2015-11-06 13:51:24 +00:00
Michael Kruse	27149cf32d	Use per-BB value maps for non-exit BBs For generating scalar writes of non-affine subregions, all except phi writes are generated in the exit block. The phi writes are generated in the incoming block, for which we erroneously used the same BBMap. This can conflict if a value for one block is synthesized, and then reused for another block which is not dominated by the first block. This is fixed by using block-specific BBMaps for phi writes. llvm-svn: 252172	2015-11-05 16:17:17 +00:00
Tobias Grosser	3e9560200f	RegionGenerator: Clear local maps after statement construction These maps are only needed during the construction of a single region statement. Clearing them is important, as we otherwise get an assert in case some of the referenced values are erased before the RegionGenerator is deleted. llvm-svn: 251341	2015-10-26 20:41:53 +00:00
Tobias Grosser	a3f6edaee1	BlockGenerator: Do not assert when finding model PHI nodes defined outside the scop Such PHI nodes can not only appear in the ExitBlock of the Scop, but indeed any scalar PHI node above the scop and used in the scop is modeled as scalar read access. llvm-svn: 251198	2015-10-24 19:01:09 +00:00
Tobias Grosser	27d742da59	BlockGenerator: Directly handle multi-exit PHI nodes This change adds code to directly code-generate multi-exit PHI nodes, instead of trying to reuse the EscapeMap infrastructure for this. Using escape maps adds a level of indirection that is hard to understand and - more importantly - breaks in certain cases. Specifically, the original code relied on simplifyRegion() to split the original PHI node in two PHI nodes, one merging the values coming from within the scop and a second that merges the first PHI node with the values that come from outside the scop. To generate code the first PHI node is then just handled like any other in-scop value that is used somewhere outside the scop. This fails for the case where all values from inside the scop are identical, as the first PHI node is in such cases automatically simplified and eliminated by LLVM right at construction. As a result, there is no instruction that can be passed to the EscapeMap handling, which means the references in the second PHI node are not updated and may still reference values from within the original scop that do not dominate it. Our new code iterates directly over all modeled ScopArrayInfo objects that represent multi-exit PHI nodes and generates code for them without relying on the EscapeMap infrastructure. Hence, it works also for the case where the first PHI node is eliminated. llvm-svn: 251191	2015-10-24 17:41:29 +00:00
Michael Kruse	dc12222287	Synthesize phi arguments in incoming block New values were always synthesized in the block of the instruction that needed them. This is incorrect for PHI nodes, whose values must be defined in the respective incoming block. This patch temporarily moves the builder's insert point to the incoming block while synthesizing phi node arguments. This fixes PR25241 (http://llvm.org/bugs/show_bug.cgi?id=25241) llvm-svn: 250693	2015-10-19 09:19:25 +00:00
Johannes Doerfert	d8b6ad255f	[FIX] Cast preloaded values Preloaded values have to match the type of their counterpart in the original code and not the type of the base array. llvm-svn: 250654	2015-10-18 12:36:42 +00:00
Tobias Grosser	b8d27aab7d	Revert to original BlockGenerator::getOrCreateAlloca(MemoryAccess &Access) Expressing this in terms of BlockGenerator::getOrCreateAlloca(const ScopArrayInfo *Array) does not work as the MemoryAccess BasePtr is in case of invariant load hoisting different to the ScopArrayInfo BasePtr. Until this is investigated and fixed, we move back to code that just uses the baseptr of MemoryAccess. llvm-svn: 250637	2015-10-18 00:51:13 +00:00
Tobias Grosser	a4f0988df5	BlockGenerator: Add getOrCreateAlloca(const ScopArrayInfo *Array) This allows the caller to get the alloca locations of an array without the need to think about whether Array is a PHI or a non-PHI Array. We directly make use of this in BlockGenerator::getOrCreateAlloca(MemoryAccess &Access). llvm-svn: 250628	2015-10-17 22:16:00 +00:00
Michael Kruse	225f0d1ee2	Load/Store scalar accesses before/after the statement itself Instead of generating implicit loads within basic blocks, put them before the instructions of the statement itself, including non-affine subregions. The region's entry node is dominating all blocks in the region and therefore the loaded value will be available there. Implicit writes in block-stmts were already stored back at the end of the block. Now, also generate the stores of non-affine subregions when leaving the statement, i.e. in the exiting block. This change is required for array-mapped implicits ("De-LICM") to ensure that there are no dependencies of demoted scalars within statements. Statements load all required values, operate on copies in registers, and then write back the changed values to the demoted memory. Lifetime analysis within statements becomes unnecessary. Differential Revision: http://reviews.llvm.org/D13487 llvm-svn: 250625	2015-10-17 21:36:00 +00:00
Tobias Grosser	ffa2446f28	BlockGenerator: Register outside users of scalars directly Instead of checking at code generation time for each ScopStmt if a scalar has external uses, we just iterate over the ScopArrayInfo descriptions we have and check each of these for possible external uses. Besides being somewhat clearer, this approach has the benefit that we will always create valid LLVM-IR even in case we disable the code generation of ScopStmt bodies e.g. for testing purposes. llvm-svn: 250608	2015-10-17 08:54:13 +00:00
Tobias Grosser	26e59ee746	Drop unused parameter from handleOutsideUsers llvm-svn: 250606	2015-10-17 08:25:54 +00:00
Johannes Doerfert	f363ed9804	[NFC] Move helper functions to ScopHelper Helper functions in the BlockGenerators.h/cpp introduce dependences from the frontend to the backend of Polly. As they are used in ScopDetection, ScopInfo, etc. we move them to the ScopHelper file. llvm-svn: 249919	2015-10-09 23:40:24 +00:00
Johannes Doerfert	09e3697f44	Allow invariant loads in the SCoP description This patch allows invariant loads to be used in the SCoP description, e.g., as loop bounds, conditions or in memory access functions. First we collect "required invariant loads" during SCoP detection that would otherwise make an expression we care about non-affine. To this end a new level of abstraction was introduced before SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide if we want a load inside the region to be optimistically assumed invariant or not. If we do, it will be marked as required and in the SCoP generation we bail if it is actually not invariant. If we don't it will be a non-affine expression as before. At the moment we optimistically assume all "hoistable" (namely non-loop-carried) loads to be invariant. This causes us to expand some SCoPs and dismiss them later but it also allows us to detect a lot we would dismiss directly if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also allow potential aliases between optimistically assumed invariant loads and other pointers as our runtime alias checks are sound in case the loads are actually invariant. Together with the invariant checks this combination allows to handle a lot more than LICM can. The code generation of the invariant loads had to be extended as we can now have dependences between parameters and invariant (hoisted) loads as well as the other way around, e.g., test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll First, it is important to note that we cannot have real cycles but only dependences from a hoisted load to a parameter and from another parameter to that hoisted load (and so on). To handle such cases we materialize llvm::Values for parameters that are referred by a hoisted load on demand and then materialize the remaining parameters. 
Second, there are new kinds of dependences between hoisted loads caused by the constraints on their execution. If a hoisted load is conditionally executed it might depend on the value of another hoisted load. To deal with such situations we sort them already in the ScopInfo such that they can be generated in the order they are listed in the Scop::InvariantAccesses list (see compareInvariantAccesses). The dependences between hoisted loads caused by indirect accesses are handled the same way as before. llvm-svn: 249607	2015-10-07 20:17:36 +00:00


268 Commits