llvm-project

Commit Graph

Author	SHA1	Message	Date
Ayal Zaks	25e2800e20	[LV] Minor savings to Sink casts to unravel first order recurrence Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it is not needed. Differential Revision: https://reviews.llvm.org/D36408 llvm-svn: 310910	2017-08-15 08:32:59 +00:00
Dinar Temirbulatov	9e43d6e7b2	[SLPVectorizer] Replace VL[0] to VL0 with assert, add propagateIRFlags extra parameter VL0, replace E->Scalars[0] to VL0, NFCI. llvm-svn: 310904	2017-08-15 00:31:49 +00:00
Dehao Chen	45847d3612	Add missing dependency in ICP. (NFC) llvm-svn: 310896	2017-08-14 23:25:21 +00:00
Reid Kleckner	18728822d2	Remove checks for debug info intrinsics in use lists, NFC These haven't done anything since debug info intrinsics stopped appearing in Value use lists in 2014. llvm-svn: 310892	2017-08-14 22:10:54 +00:00
Craig Topper	0aa3a19512	Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" This recommits r310869, with the moved files and no extra changes. Original commit message: This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310889	2017-08-14 21:39:51 +00:00
Andrew Kaylor	53a5fbb45f	Add strictfp attribute to prevent unwanted optimizations of libm calls Differential Revision: https://reviews.llvm.org/D34163 llvm-svn: 310885	2017-08-14 21:15:13 +00:00
Craig Topper	69fa8e0d99	Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything. llvm-svn: 310873	2017-08-14 19:09:32 +00:00
Craig Topper	2f0b450666	[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310869	2017-08-14 18:49:42 +00:00
Dinar Temirbulatov	7b78f5e52d	[SLPVectorizer] Schedule bundle with different opcodes. This change let us schedule a bundle with different opcodes in it, for example : [ load, add, add, add ] Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36518 llvm-svn: 310847	2017-08-14 15:40:16 +00:00
Sanjay Patel	a1067d9bae	[BDCE] reduce scope of an assert (PR34179) The assert was added with r310779 and is usually correct, but as the test shows, not always. The 'volatile' on the load is needed to expose the faulty path because without it, DemandedBits would return that the load is just dead rather than not demanded, and so we wouldn't hit the bogus assert. Also, since the lambda is just a single-line now, get rid of it and inline the DB.isAllOnesValue() calls. This should fix (prevent execution of a faulty assert): https://bugs.llvm.org/show_bug.cgi?id=34179 llvm-svn: 310842	2017-08-14 15:13:46 +00:00
Sam Parker	718c8a6a2a	[LoopUnroll] Enable option to peel remainder loop On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count, keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance loses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 llvm-svn: 310824	2017-08-14 09:25:26 +00:00
Craig Topper	f720099007	[InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstants Summary: These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code. Next step is to use m_APInt instead of ConstantInt. Reviewers: spatel, efriedma, davide, majnemer Reviewed By: spatel Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D36439 llvm-svn: 310806	2017-08-14 00:04:21 +00:00
Sanjay Patel	fe346f9f5b	[BDCE] clear poison generators after turning a value into zero (PR33695, PR34037) nsw, nuw, and exact carry implicit assumptions about their operands, so we need to clear those after trivializing a value. We decided there was no danger for llvm.assume or metadata, so there's just a comment about that. This fixes miscompiles as shown in: https://bugs.llvm.org/show_bug.cgi?id=33695 https://bugs.llvm.org/show_bug.cgi?id=34037 Differential Revision: https://reviews.llvm.org/D36592 llvm-svn: 310779	2017-08-12 16:41:08 +00:00
Eli Friedman	51cf2604b6	[OptDiag] Updating Remarks in SampleProfile Updating remark API to newer OptimizationDiagnosticInfo API. This allows remarks to show up in diagnostic yaml file, and enables use of opt-viewer tool. Hotness information for remarks (L505 and L751) do not display hotness information, most likely due to profile information not being propagated yet. Unsure if this is the desired outcome. Patch by Tarun Rajendran. Differential Revision: https://reviews.llvm.org/D36127 llvm-svn: 310763	2017-08-11 21:12:04 +00:00
Xinliang David Li	24524f314c	Fix typo /NFC llvm-svn: 310737	2017-08-11 17:49:20 +00:00
Craig Topper	9a6110b2d3	[InstCombine] Make (X\|C1)^C2 -> X^(C1^C2) iff X&~C1 == 0 work for splat vectors This also corrects the description to match what was actually implemented. The old comment said X^(C1\|C2), but it implemented X^((C1\|C2)&~(C1&C2)). I believe ((C1\|C2)&~(C1&C2)) is equivalent to (C1^C2). Differential Revision: https://reviews.llvm.org/D36505 llvm-svn: 310658	2017-08-10 20:35:34 +00:00
Craig Topper	57b4d8646b	[InstCombine] Fix a crash in getSelectCondition if we happen to have two inverse vectors of i1 constants. We used to try to truncate the constant vector to vXi1, but if it's already i1 this would fail. Instead we now use IRBuilder::getZExtOrTrunc which should check the type and only create a trunc if needed. I believe this should trigger constant folding in the IRBuilder and ultimately do the same thing just with the additional type check. llvm-svn: 310639	2017-08-10 17:48:14 +00:00
Craig Topper	cd13ebca5f	[InstCombine] Add a DEBUG_COUNTER to InstCombine to limit how many instructions are visited for debug Sometimes it would be nice to stop InstCombine mid way through its combining to see the current IR. By using a debug counter we can place an upper limit on how many instructions to process. This will also allow skipping the first X combines, but that has the potential to change later combines since earlier canonicalizations might have been skipped. Differential Revision: https://reviews.llvm.org/D36553 llvm-svn: 310638	2017-08-10 17:48:12 +00:00
Craig Topper	9cd976d041	[DebugCounter] Move the semicolon out of the DEBUG_COUNTER macro and require it to be placed at the end of each use. This make it consistent with STATISTIC which it will often appears near. While there move one DEBUG_COUNTER instance out of an anonymous namespace. It's already declaring a static variable so the namespace is unnecessary. llvm-svn: 310637	2017-08-10 17:48:11 +00:00
Alexander Potapenko	5241081532	[sanitizer-coverage] Change cmp instrumentation to distinguish const operands This implementation of SanitizerCoverage instrumentation inserts different callbacks depending on constantness of operands: 1. If both operands are non-const, then a usual __sanitizer_cov_trace_cmp[1248] call is inserted. 2. If exactly one operand is const, then a __sanitizer_cov_trace_const_cmp[1248] call is inserted. The first argument of the call is always the constant one. 3. If both operands are const, then no callback is inserted. This separation comes useful in fuzzing when tasks like "find one operand of the comparison in input arguments and replace it with the other one" have to be done. The new instrumentation allows us to not waste time on searching the constant operands in the input. Patch by Victor Chibotaru. llvm-svn: 310600	2017-08-10 15:00:13 +00:00
Chad Rosier	a5508e3119	[NewGVN] Add CL option to control the generation of phi-of-ops (disable by default). Differential Revision: https://reviews.llvm.org/D36478539 llvm-svn: 310594	2017-08-10 14:12:57 +00:00
Sanjay Patel	c50e55d0e6	[InstCombine] narrow rotate left/right patterns to eliminate zext/trunc (PR34046) I couldn't find any smaller folds to help the cases in: https://bugs.llvm.org/show_bug.cgi?id=34046 after: rL310141 The truncated rotate-by-variable patterns elude all of the existing transforms because of multiple uses and knowledge about demanded bits and knownbits that doesn't exist without the whole pattern. So we need an unfortunately large pattern match. But by simplifying this pattern in IR, the backend is already able to generate rolb/rolw/rorb/rorw for x86 using its existing rotate matching logic (although there is a likely extraneous 'and' of the rotate amount). Note that rotate-by-constant doesn't have this problem - smaller folds should already produce the narrow IR ops. Differential Revision: https://reviews.llvm.org/D36395 llvm-svn: 310509	2017-08-09 18:37:41 +00:00
Matt Morehouse	49e5acab33	[asan] Fix instruction emission ordering with dynamic shadow. Summary: Instrumentation to copy byval arguments is now correctly inserted after the dynamic shadow base is loaded. Reviewers: vitalybuka, eugenis Reviewed By: vitalybuka Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36533 llvm-svn: 310503	2017-08-09 17:59:43 +00:00
Jonas Paulsson	6228aeda65	[LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess() isLegalAddressingMode() has recently gained the extra optional Instruction* parameter, and therefore it can now do the job that previously only isFoldableMemAccess() could do. The SystemZ implementation of isLegalAddressingMode() has gained the functionality of checking for offsets, which used to be done with isFoldableMemAccess(). The isFoldableMemAccess() hook has been removed everywhere. Review: Quentin Colombet, Ulrich Weigand https://reviews.llvm.org/D35933 llvm-svn: 310463	2017-08-09 11:28:01 +00:00
Jonas Paulsson	5052771af3	[LoopStrengthReduce] Don't neglect the Fixup.Offset in isAMCompletelyFolded(). In the recursive call to isAMCompletelyFolded(), the passed offset should be the sum of F.BaseOffset and Fixup.Offset. Review: Quentin Colombet. llvm-svn: 310462	2017-08-09 11:27:46 +00:00
Davide Italiano	c163fac184	[GlobalOpt] Switch an explicit loop to llvm::all_of(). NFCI. llvm-svn: 310453	2017-08-09 09:23:29 +00:00
Craig Topper	5706c01c0b	[InstCombine] Use regular dyn_cast instead of a matcher for a simple case. NFC llvm-svn: 310446	2017-08-09 06:17:48 +00:00
Wei Mi	bb9106ac4b	[GVN] Remove stale entries in phitranslate cache when new phi is generated for PRE When a new phi is generated for scalarpre of an expression, the phiTranslate cache will become stale: Before PRE, the candidate expression must not be available in a predecessor block, and phitranslate will cache the information. After PRE, the expression will become available in all predecessor blocks, so the related entries in phiTranslate cache becomes stale. The patch will simply remove the stale entries so phiTranslate can be recomputed next time. The stale entries in phitranslate cache will not affect correctness but will cause missing PRE opportunity for later instructions. Differential Revision: https://reviews.llvm.org/D36124 llvm-svn: 310421	2017-08-08 21:40:14 +00:00
Dehao Chen	34cfcb29aa	Make ICP uses PSI to check for hotness. Summary: Currently, ICP checks the count against a fixed value to see if it is hot enough to be promoted. This does not work for SamplePGO because sampled count may be much smaller. This patch uses PSI to check if the count is hot enough to be promoted. Reviewers: davidxl, tejohnson, eraman Reviewed By: davidxl Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36341 llvm-svn: 310416	2017-08-08 20:57:33 +00:00
Craig Topper	364359e4fc	[InstCombine] Support pulling left shifts through a subtract with constant LHS We already support pulling through an add with constant RHS. We can do the same for subtract. Differential Revision: https://reviews.llvm.org/D36443 llvm-svn: 310407	2017-08-08 20:14:11 +00:00
Chad Rosier	4d852597f8	[NewGVN] Use a cast instead of a dyn_cast. Differential Revision: https://reviews.llvm.org/D36478 llvm-svn: 310397	2017-08-08 18:41:49 +00:00
Anna Thomas	9b6e12f3dc	[LoopVectorize] Fix assertion failure in Fcmp vectorization Summary: When vectorizing fcmps we can trip on incorrect cast assertion when setting the FastMathFlags after generating the vectorized FCmp. This can happen if the FCmp can be folded to true or false directly. The fix here is to set the FastMathFlag using the FastMathFlagBuilder before creating the FCmp Instruction. This is what's done by other optimizations such as InstCombine. Added a test case which trips on cast assertion without this patch. Reviewers: Ayal, mssimpso, mkuper, gilr Reviewed by: Ayal, mssimpso Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D36244 llvm-svn: 310389	2017-08-08 18:07:44 +00:00
Craig Topper	8e351e9018	[InstCombine] Cast to BinaryOperator earlier in foldSelectIntoOp to simplify the code. We no longer need the explicit operand count check or the later dynamic cast. llvm-svn: 310339	2017-08-08 06:19:24 +00:00
Chandler Carruth	7c888dca46	[PM] Fix new LoopUnroll function pass by invalidating loop analysis results when a loop is completely removed. This is very hard to manifest as a visible bug. You need to arrange for there to be a subsequent allocation of a 'Loop' object which gets the exact same address as the one which the unroll deleted, and you need the LoopAccessAnalysis results to be significant in the way that they're stale. And you need a million other things to align. But when it does, you get a deeply mysterious crash due to actually finding a stale analysis result. This fixes the issue and tests for it by directly checking we successfully invalidate things. I have not been able to get any test case to reliably trigger this. Changes to LLVM itself caused the only test case I ever had to cease to crash. I've looked pretty extensively at less brittle ways of fixing this and they are actually very, very hard to do. This is a somewhat strange and unusual case as we have a pass which is deleting an IR unit, but is not running within that IR unit's pass framework (which is what handles this cleanly for the normal loop unroll). And where there isn't a definitive way to clear all of the stale cache entries. And where the pass is updating the core analysis that provides the IR units! For example, we don't have any of these problems with Function analyses because it is easy to clear out function analyses when the functions themselves may have been deleted -- we clear an entire module's worth! But that is too heavy of a hammer down here in the LoopAnalysisManager layer. A better long-term solution IMO is to require that AnalysisManager's make their keys durable to this kind of thing. Specifically, when caching an analysis for one IR unit that is conceptually "owned" by a higher level IR unit, the AnalysisManager should incorporate this into its data structures so that we can reliably clear these results without having to teach each and every pass to do so manually as we do here. But that is a change for another day as it will be a fairly invasive change to the AnalysisManager infrastructure. Until then, this fortunately seems to be quite rare. llvm-svn: 310333	2017-08-08 02:24:20 +00:00
Evgeny Stupachenko	c675290680	Reapply fix PR23384 (part 3 of 3) r304824 (was reverted in r305720). The root cause of reverting was fixed - PR33514. Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 310289	2017-08-07 19:56:34 +00:00
Aaron Ballman	428f0fe910	Removing an unused variable that was missed with the refactoring in r310272; NFC. llvm-svn: 310285	2017-08-07 19:26:17 +00:00
Craig Topper	7091a743b4	[InstCombine] Support (X \| C1) & C2 --> (X & C2^(C1&C2)) \| (C1&C2) for vector splats Note the original code I deleted incorrectly listed this as (X \| C1) & C2 --> (X & C2^(C1&C2)) \| C1 Which is only valid if C1 is a subset of C2. This relied on SimplifyDemandedBits to remove any extra bits from C1 before we got to that code. My new implementation avoids relying on that behavior so that it can be naively verified with alive. Differential Revision: https://reviews.llvm.org/D36384 llvm-svn: 310272	2017-08-07 18:10:39 +00:00
Alexey Bataev	9581b42589	[SLP] General improvements of SLP vectorization process. Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does: 1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the array. This array is processed only after the vectorization of the first-after-these instructions key node is finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310260	2017-08-07 15:25:49 +00:00
Alexey Bataev	53d523c9eb	Revert "[SLP] General improvements of SLP vectorization process." This reverts commit r310255. llvm-svn: 310257	2017-08-07 14:51:52 +00:00
Alexey Bataev	faace8f1f1	[SLP] General improvements of SLP vectorization process. Summary: Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does: 1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the array. This array is processed only after the vectorization of the first-after-these instructions key node is finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310255	2017-08-07 14:03:17 +00:00
Vitaly Buka	5d432ec929	[asan] Fix asan dynamic shadow check before copyArgsPassedByValToAllocas llvm-svn: 310242	2017-08-07 07:35:33 +00:00
Vitaly Buka	629047de8e	[asan] Disable checking of arguments passed by value for --asan-force-dynamic-shadow Fails with "Instruction does not dominate all uses!" llvm-svn: 310241	2017-08-07 07:12:34 +00:00
Davide Italiano	b53b075bb1	[Reassociate] Use a range loop for clarity. NFCI. While here, rename `i` to `Rank` as the latter is more self-explanatory (and this code also uses `I` two lines below to identify an Instruction). llvm-svn: 310238	2017-08-07 01:57:21 +00:00
Davide Italiano	a5cdc22e70	[Reassociate] Try to bail out early when canonicalizing. This commit rearranges the checks to avoid calls to getRank() when not needed (e.g. when RHS == LHS). llvm-svn: 310237	2017-08-07 01:49:09 +00:00
Craig Topper	576fb91aef	[InstCombine] Remove shift handling from OptAndOp. Summary: This is all handled by SimplifyDemandedBits. Reviewers: spatel, davide Reviewed By: davide Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D36382 llvm-svn: 310234	2017-08-06 23:30:49 +00:00
Craig Topper	a1693a2ed3	[InstCombine] Support (X ^ C1) & C2 --> (X & C2) ^ (C1&C2) for vector splats. llvm-svn: 310233	2017-08-06 23:11:49 +00:00
Craig Topper	9cbdbefd0f	[InstCombine] Support '(C - X) ^ signmask -> (C + signmask - X)' and '(X + C) ^ signmask -> (X + C + signmask)' for vector splats. llvm-svn: 310232	2017-08-06 22:17:21 +00:00
Craig Topper	b5bf016015	[InstCombine] Support ~(c-X) --> X+(-c-1) and ~(X-c) --> (-c-1)-X for splat vectors. llvm-svn: 310195	2017-08-06 06:28:41 +00:00
Craig Topper	9ffda5ab86	[InstCombine] Fold (C - X) ^ signmask -> (C + signmask - X). llvm-svn: 310186	2017-08-05 20:00:44 +00:00
Craig Topper	65dd32afbc	[InstCombine] Teach the code that pulls logical operators through constant shifts to handle vector splats too. llvm-svn: 310185	2017-08-05 20:00:42 +00:00

1 2 3 4 5 ...

18520 Commits