llvm-project

Commit Graph

Author	SHA1	Message	Date
Wei Mi	deee61e434	Create a wrapper pass for BlockFrequencyInfo. This is useful when we want to do block frequency analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence. Patch by congh. Approved by dexonsmith. Differential Revision: http://reviews.llvm.org/D11196 llvm-svn: 242248	2015-07-14 23:40:50 +00:00
Adam Nemet	7cdebac0c8	[LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC I am planning to add more nested classes inside RuntimePointerCheck so all these triple-nesting would be hard to follow. Also rename it to RuntimePointerChecking (i.e. append 'ing'). llvm-svn: 242218	2015-07-14 22:32:44 +00:00
Benjamin Kramer	e448b5be05	Avoid using Loop::getSubLoopsVector. Passes should never modify it, just use the const version. While there reduce copying in LoopInterchange. No functional change intended. llvm-svn: 242041	2015-07-13 17:21:14 +00:00
Hal Finkel	9cf58c4095	Move getStrideFromPointer and friends from LoopVectorize to VectorUtils The following functions are moved from the LoopVectorizer to VectorUtils: - getGEPInductionOperand - stripGetElementPtr - getUniqueCastUse - getStrideFromPointer These used to be static functions in LoopVectorize, but will also be used by the upcoming loop versioning LICM transformation. Patch by Ashutosh Nema! llvm-svn: 241980	2015-07-11 10:52:42 +00:00
Tyler Nowicki	3960d85262	Renamed some uses of unroll to interleave in the vectorizer. llvm-svn: 241971	2015-07-11 00:31:11 +00:00
Jingyue Wu	a277561922	[TTI] BasicTTIImpl assumes no vector registers Summary: Following the discussion on r241884, it's more reasonable to assume that a target has no vector registers by default instead of letting every such target overrides getNumberOfRegisters. Therefore, this patch modifies BasicTTIImpl::getNumberOfRegisters to return 0 when Vector is true, and partially reverts r241884 which modifies NVPTXTTIImpl::getNumberOfRegisters. It also fixes a performance bug in LoopVectorizer. Even if a target has no vector registers, vectorization may still help ILP. So, we need both checks to be false before disabling loop vectorization all together. Reviewers: hfinkel Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11108 llvm-svn: 241942	2015-07-10 21:14:54 +00:00
Sanjay Patel	1319446195	[SLPVectorizer] Try different vectorization factors for store chains ...and set max vector register size based on target This patch is based on discussion on the llvmdev mailing list: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html and also solves: https://llvm.org/bugs/show_bug.cgi?id=17170 Several FIXME/TODO items are noted in comments as potential improvements. Differential Revision: http://reviews.llvm.org/D10950 llvm-svn: 241760	2015-07-08 23:40:55 +00:00
Michael Zolotukhin	97295ea7dd	[LoopVectorizer] Rename BypassBlock to VectorPH, and CheckBlock to NewVectorPH. NFCI. llvm-svn: 241742	2015-07-08 21:48:03 +00:00
Michael Zolotukhin	8c874bb2f1	[LoopVectorizer] Restructurize code for emitting RT checks. NFCI. Place all code corresponding to a run-time check in one place. Previously we generated some code, then proceeded to a next check, then finished the code for the first check (like splitting blocks and generating branches). Now the code for generating a check is self-contained. llvm-svn: 241741	2015-07-08 21:47:59 +00:00
Michael Zolotukhin	66f5591f9b	[LoopVectorizer] Remove redundant variables PastOverflowCheck and OverflowCheckAnchor. NFCI. llvm-svn: 241740	2015-07-08 21:47:56 +00:00
Michael Zolotukhin	00345cadd5	[LoopVectorizer] Move some code around to ease further refactoring. NFCI. llvm-svn: 241739	2015-07-08 21:47:53 +00:00
Michael Zolotukhin	7db3063f87	[LoopVectorizer] Remove redundant variable LastBypassBlock. NFC. llvm-svn: 241738	2015-07-08 21:47:47 +00:00
Sanjay Patel	cc9fad0bf7	remove unnecessary temp variable; NFCI llvm-svn: 241415	2015-07-05 21:21:47 +00:00
Sanjay Patel	a4860f3af2	use range-based for loops; NFCI llvm-svn: 241412	2015-07-05 20:15:21 +00:00
Sanjay Patel	f73f8919ed	use range-based for loops; NFCI llvm-svn: 241395	2015-07-04 19:38:52 +00:00
Alexey Samsonov	958dab71b3	[LoopVectorize] Use ReplaceInstWithInst() helper where appropriate. This is mostly an NFC, which increases code readability (instead of saving old terminator, generating new one in front of old, and deleting old, we just call a function). However, it would additionaly copy the debug location from old instruction to replacement, which would help PR23837. llvm-svn: 241197	2015-07-01 22:18:30 +00:00
David Majnemer	9f3979fd78	[LoopVectorize] Pointer indicies may be wider than the pointer If we are dealing with a pointer induction variable, isInductionPHI gives back a step value of Stride / size of pointer. However, we might be indexing with a legal type wider than the pointer width. Handle this by inserting casts where appropriate instead of crashing. This fixes PR23954. llvm-svn: 240877	2015-06-27 08:38:17 +00:00
David Blaikie	b447ac6435	Move VectorUtils from Transforms to Analysis to correct layering violation llvm-svn: 240804	2015-06-26 18:02:52 +00:00
Michael Zolotukhin	79ff564ef3	[LoopVectorizer] Fix bailing-out condition for OptForSize case. With option OptForSize enabled, the Loop Vectorizer is not supposed to create tail loop. The condition checking that was invalid and was not matching to the comment above. Patch by Marianne Mailhot-Sarrasin. llvm-svn: 240556	2015-06-24 17:26:24 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Michael Zolotukhin	4d8ffa082c	[SLP] Vectorize for all-constant entries. Differential Revision: http://reviews.llvm.org/D10531 llvm-svn: 240144	2015-06-19 17:40:15 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Chandler Carruth	ac80dc7532	[PM/AA] Remove the Location typedef from the AliasAnalysis class now that it is its own entity in the form of MemoryLocation, and update all the callers. This is an entirely mechanical change. References to "Location" within AA subclases become "MemoryLocation", and elsewhere "AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps out-of-tree folks update. llvm-svn: 239885	2015-06-17 07:18:54 +00:00
Tyler Nowicki	27b2c39eb3	Refactor RecurrenceInstDesc Moved RecurrenceInstDesc into RecurrenceDescriptor to simplify the namespaces. llvm-svn: 239862	2015-06-16 22:59:45 +00:00
Tyler Nowicki	0a91310c7f	Rename Reduction variables/structures to Recurrence. A reduction is a special kind of recurrence. In the loop vectorizer we currently identify basic reductions. Future patches will extend this to identifying basic recurrences. llvm-svn: 239835	2015-06-16 18:07:34 +00:00
Hao Liu	405f1d1651	[LoopVectorize] Revert the enabling of interleaved memory access in Loop Vectorizor, which was wrongly committed in r239514. llvm-svn: 239515	2015-06-11 09:18:07 +00:00
Hao Liu	4566d18e89	[AArch64] Match interleaved memory accesses into ldN/stN instructions. Add a pass AArch64InterleavedAccess to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsic. As Loop Vectorizor disables optimization on interleaved accesses by default, this optimization is also disabled by default. To enable it by "-aarch64-interleaved-access-opt=true" E.g. Transform an interleaved load (Factor = 2): %wide.vec = load <8 x i32>, <8 x i32>* %ptr %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6> ; Extract even elements %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7> ; Extract odd elements Into: %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr) %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0 %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1 E.g. Transform an interleaved store (Factor = 2): %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7> ; Interleaved vec store <8 x i32> %i.vec, <8 x i32>* %ptr Into: %v0 = shuffle %i.vec, undef, <0, 1, 2, 3> %v1 = shuffle %i.vec, undef, <4, 5, 6, 7> call void aarch64.neon.st2(%v0, %v1, %ptr) llvm-svn: 239514	2015-06-11 09:05:02 +00:00
Hao Liu	32c0539691	[LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses. Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g. for (i = 0; i < N; i+=2) { a = A[i]; // load of even element b = A[i+1]; // load of odd element ... // operations on a, b, c, d A[i] = c; // store of even element A[i+1] = d; // store of odd element } The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like: %wide.vec = load <8 x i32>, <8 x i32>* %ptr %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6> %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7> The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like: %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> store <8 x i32> %interleaved.vec, <8 x i32>* %ptr This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'. llvm-svn: 239291	2015-06-08 06:39:56 +00:00
Chandler Carruth	70c61c1a8a	[PM/AA] Start refactoring AliasAnalysis to remove the analysis group and port it to the new pass manager. All this does is extract the inner "location" class used by AA into its own full fledged type. This seems much cleaner as MemoryDependence and soon MemorySSA also use this heavily, and it doesn't make much sense being inside the AA infrastructure. This will also make it much easier to break apart the AA infrastructure into something that stands on its own rather than using the analysis group design. There are a few places where this makes APIs not make sense -- they were taking an AliasAnalysis pointer just to build locations. I'll try to clean those up in follow-up commits. Differential Revision: http://reviews.llvm.org/D10228 llvm-svn: 239003	2015-06-04 02:03:15 +00:00
Benjamin Kramer	f5e2fc474d	Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types If the type isn't trivially moveable emplace can skip a potentially expensive move. It also saves a couple of characters. Call sites were found with the ASTMatcher + some semi-automated cleanup. memberCallExpr( argumentCountIs(1), callee(methodDecl(hasName("push_back"))), on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))), hasArgument(0, bindTemporaryExpr( hasType(recordDecl(hasNonTrivialDestructor())), has(constructExpr()))), unless(isInTemplateInstantiation())) No functional change intended. llvm-svn: 238602	2015-05-29 19:43:39 +00:00
Pete Cooper	9e1d335697	Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC. Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method. This updates all users which were either using an unsigned to store it, or had a now unnecessary cast. llvm-svn: 237810	2015-05-20 17:16:39 +00:00
Wei Mi	062c74484d	[X86] Disable loop unrolling in loop vectorization pass when VF is 1. The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture, by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce the cost of overflow check, memory boundary check and extra prologue/epilogue code when regular unroller will unroll the loop another time. Disable it when VF==1 remove the unnecessary cost on x86. The same can be done for other platforms after verifying interleaving/memory bound checking to be not perf critical on those platforms. Differential Revision: http://reviews.llvm.org/D9515 llvm-svn: 236613	2015-05-06 17:12:25 +00:00
Michael Zolotukhin	d98330c424	Fix a couple of typos in comments. llvm-svn: 235674	2015-04-24 00:10:27 +00:00
David Blaikie	348de69a30	Recommit r235458: [opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst (reverted in r235533) Original commit message: "Calls to llvm::Value::mutateType are becoming extra-sensitive now that instructions have extra type information that will not be derived from operands or result type (alloca, gep, load, call/invoke, etc... ). The special-handling for mutateType will get more complicated as this work continues - it might be worth making mutateType virtual & pushing the complexity down into the classes that need special handling. But with only two significant uses of mutateType (vectorization and linking) this seems OK for now. Totally open to ideas/suggestions/improvements, of course. With this, and a bunch of exceptions, we can roundtrip an indirect call site through bitcode and IR. (a direct call site is actually trickier... I haven't figured out how to deal with the IR deserializer's lazy construction of Function/GlobalVariable decl's based on the type of the entity which means looking through the "pointer to T" type referring to the global)" The remapping done in ValueMapper for LTO was insufficient as the types weren't correctly mapped (though I was using the post-mapped operands, some of those operands might not have been mapped yet so the type wouldn't be post-mapped yet). Instead use the pre-mapped type and explicitly map all the types. llvm-svn: 235651	2015-04-23 21:36:23 +00:00
Karthik Bhat	24e6cc2de4	Move common loop utility function isInductionPHI into LoopUtils.cpp This patch refactors the definition of common utility function "isInductionPHI" to LoopUtils.cpp. This fixes compilation error when configured with -DBUILD_SHARED_LIBS=ON llvm-svn: 235577	2015-04-23 08:29:20 +00:00
David Blaikie	d2db881e85	Revert "[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst" This reverts commit r235458. It looks like this might be breaking something LTO-ish. Looking into it & will recommit with a fix/test case/etc once I've got more to go on. llvm-svn: 235533	2015-04-22 18:16:49 +00:00
David Blaikie	506993636e	[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst Calls to llvm::Value::mutateType are becoming extra-sensitive now that instructions have extra type information that will not be derived from operands or result type (alloca, gep, load, call/invoke, etc... ). The special-handling for mutateType will get more complicated as this work continues - it might be worth making mutateType virtual & pushing the complexity down into the classes that need special handling. But with only two significant uses of mutateType (vectorization and linking) this seems OK for now. Totally open to ideas/suggestions/improvements, of course. With this, and a bunch of exceptions, we can roundtrip an indirect call site through bitcode and IR. (a direct call site is actually trickier... I haven't figured out how to deal with the IR deserializer's lazy construction of Function/GlobalVariable decl's based on the type of the entity which means looking through the "pointer to T" type referring to the global) llvm-svn: 235458	2015-04-21 23:26:57 +00:00
Karthik Bhat	76aa662cf0	[NFC] Refactor identification of reductions as common utility function. This patch refactors reduction identification code out of LoopVectorizer and exposes them as common utilities. No functional change. Review: http://reviews.llvm.org/D9046 llvm-svn: 235284	2015-04-20 04:38:33 +00:00
Daniel Berlin	25db4f4141	Add range iterators for post order and inverse post order. Use them llvm-svn: 235026	2015-04-15 17:41:42 +00:00
Benjamin Kramer	619c4e57ba	Reduce dyn_cast<> to isa<> or cast<> where possible. No functional change intended. llvm-svn: 234586	2015-04-10 11:24:51 +00:00
Adam Nemet	ce48250f11	[LoopAccesses] Allow analysis to complete in the presence of uniform stores (Re-apply r234361 with a fix and a testcase for PR23157) Both run-time pointer checking and the dependence analysis are capable of dealing with uniform addresses. I.e. it's really just an orthogonal property of the loop that the analysis computes. Run-time pointer checking will only try to reason about SCEVAddRec pointers or else gives up. If the uniform pointer turns out the be a SCEVAddRec in an outer loop, the run-time checks generated will be correct (start and end bounds would be equal). In case of the dependence analysis, we work again with SCEVs. When compared against a loop-dependent address of the same underlying object, the difference of the two SCEVs won't be constant. This will result in returning an Unknown dependence for the pair. When compared against another uniform access, the difference would be constant and we should return the right type of dependence (forward/backward/etc). The changes also adds support to query this property of the loop and modify the vectorizer to use this. Patch by Ashutosh Nema! llvm-svn: 234424	2015-04-08 17:48:40 +00:00
Adam Nemet	e09a928c80	Revert "[LoopAccesses] Allow analysis to complete in the presence of uniform stores" This reverts commit r234361. It caused PR23157. llvm-svn: 234387	2015-04-08 04:16:55 +00:00
Adam Nemet	0515c33b70	[LoopAccesses] Allow analysis to complete in the presence of uniform stores Both run-time pointer checking and the dependence analysis are capable of dealing with uniform addresses. I.e. it's really just an orthogonal property of the loop that the analysis computes. Run-time pointer checking will only try to reason about SCEVAddRec pointers or else gives up. If the uniform pointer turns out the be a SCEVAddRec in an outer loop, the run-time checks generated will be correct (start and end bounds would be equal). In case of the dependence analysis, we work again with SCEVs. When compared against a loop-dependent address of the same underlying object, the difference of the two SCEVs won't be constant. This will result in returning an Unknown dependence for the pair. When compared against another uniform access, the difference would be constant and we should return the right type of dependence (forward/backward/etc). The changes also adds support to query this property of the loop and modify the vectorizer to use this. Patch by Ashutosh Nema! llvm-svn: 234361	2015-04-07 21:46:16 +00:00
David Blaikie	93c5444fe0	[opaque pointer type] More GEP API migrations in IRBuilder uses The plan here is to push the API changes out from the common components (like Constant::getGetElementPtr and IRBuilder::CreateGEP related functions) and just update callers to either pass the type if it's obvious, or pass null. Do this with LoadInst as well and anything else that comes up, then to start porting specific uses to not pass null anymore - this may require some refactoring in each case. llvm-svn: 234042	2015-04-03 19:41:44 +00:00
Duncan P. N. Exon Smith	ec819c096b	Transforms: Use the new DebugLoc API, NFC Update lib/Analysis and lib/Transforms to use the new `DebugLoc` API. llvm-svn: 233587	2015-03-30 19:49:49 +00:00
Karthik Bhat	0f8c908934	Refactor Code inside LoopVectorizer's function isInductionVariable. This patch exposes LoopVectorizer's isInductionVariable function as common a functionality. http://reviews.llvm.org/D8608 llvm-svn: 233352	2015-03-27 03:44:15 +00:00
David Blaikie	68d535c45f	Opaque Pointer Types: GEP API migrations to specify the gep type explicitly The changes to InstCombine do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointers element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine) llvm-svn: 233126	2015-03-24 22:38:16 +00:00
Benjamin Kramer	799003bf8c	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
Michael Zolotukhin	9ef5671d36	Try to fix a test broken by one of my previous commits. llvm-svn: 232536	2015-03-17 20:31:56 +00:00
Michael Zolotukhin	9b3cf604ce	LoopVectorize: teach loop vectorizer to vectorize calls. The tests would be committed in a commit for http://reviews.llvm.org/D8131 Review: http://reviews.llvm.org/D8095 llvm-svn: 232530	2015-03-17 19:46:50 +00:00

799 Commits